Jupyter, Pixiedust & Maps: Simplifying spatial visualization in Jupyter Notebooks

video: https://www.youtube.com/watch?v=Ezh7Xb67lkI&t=107s&list=PLGVZCDnMOq0rxoq9Nx0B4tqtr891vaCn7&index=47

The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. The Jupyter stack is built from the ground up to be extensible and hackable. The Developer Advocacy team at IBM Analytics has developed an open-source library of time-saving, anxiety-reducing tools we call "Pixiedust". It was designed to ease the pain of charting, saving data to the cloud, and exposing Python data structures to Scala code. I'll talk about how I built mapping into Pixiedust, putting data from Spark-based analytics on maps using Mapbox GL.

Raj Singh

August 24, 2017

Jupyter, Pixiedust & Maps
Simplifying spatial visualization in Jupyter Notebooks
Raj Singh, Developer Advocate, IBM
August 2017

"Good Programmers are Lazy and Dumb" -- Philipp Lenssen
• Only lazy programmers will want to write the kind of tools that replace them
• Only a lazy programmer will avoid writing monotonous, repetitive code, thus avoiding redundancy, the enemy of software maintenance and flexible refactoring
• The tools and processes that come out of this laziness will speed up production

Good, Bad, Ugly!
• Machine learning
• Spark
• Visualization
• Notebooks
• Data lakes
• Updates
• Data cleansing
• Sharing & Collaboration
• Automated error correction
• Table joins
• Database interoperability
• Schema mapping
• ETL
• Model fitting
• Moving between platforms
• Linear regression
• Security

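Is this really mapping in 2017?

    import matplotlib.pyplot as plt
    from mpl_toolkits.basemap import Basemap

    # temps and cities are assumed to be defined earlier in the notebook:
    # temps is a list of temperatures; each city is a (name, lat, lon) sequence
    plt.style.use('bmh')
    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 12))

    # background maps
    m1 = Basemap(projection='mill', resolution=None,
                 llcrnrlon=-7.5, llcrnrlat=49.84, urcrnrlon=2.5, urcrnrlat=59,
                 ax=axes[0])
    m1.drawlsmask(land_color='dimgrey', ocean_color='dodgerblue', lakes=True)
    # m2 is not defined on the slide; assumed here to be a second map on axes[1]
    m2 = Basemap(projection='mill', resolution=None,
                 llcrnrlon=-7.5, llcrnrlat=49.84, urcrnrlon=2.5, urcrnrlat=59,
                 ax=axes[1])

    # temperature map
    for temp, city in zip(temps, cities):
        lat = city[1]
        lon = city[2]
        # the slide tests temp>8 first, which hides the >10 branch;
        # a descending first threshold of 12 is assumed here
        if temp > 12:
            col = 'indigo'
        elif temp > 10:
            col = 'darkmagenta'
        elif temp > 8:
            col = 'red'
        elif temp > 6:
            col = 'tomato'
        elif temp > 4:
            col = 'turquoise'
        else:
            col = 'lightskyblue'  # fallback colour not on the slide (assumption)
        x1, y1 = m2(lon, lat)
        bbox_props = dict(boxstyle="round,pad=0.3", fc=col, ec=col, lw=2)
        axes[1].text(x1, y1, temp, ha="center", va="center", size=11,
                     bbox=bbox_props)

    plt.tight_layout()
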
Jupyter + Pixiedust =
1. Package Manager
2. Visualizations
3. Cloud Integration
4. Scala Bridge
5. Extensibility
6. Embedded Apps

1. Package Manager
• Install Spark packages or plain jars in your Notebook's Python kernel without modifying a configuration file
• Example: install the GraphFrames Spark package
• Then use the GraphFrames Python APIs

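As a rough illustration of the package-manager slide, the sketch below wraps PixieDust's `installPackage` call in a small helper. The helper name `install_spark_package` is hypothetical, and the coordinate string `graphframes:graphframes:0` is an assumption about how the GraphFrames package is addressed; check the PixieDust documentation before relying on it. The import is guarded so the cell degrades gracefully outside a PixieDust-enabled notebook.

```python
def install_spark_package(coordinates="graphframes:graphframes:0"):
    """Ask PixieDust to install a Spark package.

    Returns False when PixieDust is not importable in this environment.
    The default coordinates are an assumption, not confirmed by the slides.
    """
    try:
        import pixiedust  # only available where PixieDust is installed
    except ImportError:
        print("PixieDust is not installed in this environment")
        return False
    # Coordinates follow the Maven-style group:artifact:version pattern.
    pixiedust.installPackage(coordinates)
    return True


installed = install_spark_package()
```

Once the package is installed, the slide's second callout (using the GraphFrames Python APIs) would follow in a later cell.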
2. Visualizations
• One simple API: display()
• Call the Options dialog
• Performance statistics
• Panning/Zooming options

3. Cloud Integration
• Easily export your data to CSV, JSON, HTML, etc., locally on your laptop or into a cloud-based service like Cloudant or Object Storage

4. Scala Bridge
• Execute Scala code directly from your Python Notebook

    %%scala
    val demo = com.ibm.cds.spark.samples.StreamingTwitter
    demo.setConfig("twitter4j.oauth.consumerKey","XXXXX")
    demo.setConfig("twitter4j.oauth.consumerSecret","XXXXX")
    demo.setConfig("twitter4j.oauth.accessToken","XXXXX")
    demo.setConfig("twitter4j.oauth.accessTokenSecret","XXXXX")
    demo.setConfig("watson.tone.url","https://watsonplatform.net/tone-analyzer/api")
    demo.setConfig("watson.tone.password","XXXXX")
    demo.setConfig("watson.tone.username","XXXX")
    import org.apache.spark.streaming._
    demo.startTwitterStreaming(sc, Seconds(10))

• Share variables between Python and Scala

    # Python cell: define Python variable
    pythonVar = "pixiedust"

    %%scala
    // Scala cell: use the Python variable in Scala
    println(pythonVar)
    // Scala cell: define Scala variable
    val __fromScalaVar = "Hello from Scala"

    # Python cell: use the Scala variable in Python
    print(__fromScalaVar)

5. Extensibility
• Easily extend PixieDust to create your own visualizations using HTML/CSS/JavaScript
• Example: customized visualization for GraphFrame graphs

6. Embed Apps in Notebooks
• PixieApps encapsulate analytics into lightweight HTML UIs for code-phobic end users

Demo
• https://apsportal.ibm.com/analytics/notebooks/f2bfaebf-94ec-48a5-aed4-f2bd01226ae3/view?access_token=0b7840132d8634f682b19a74d57064f75b39c6dfdbee83c28c00cb0fe69d6326

How it all works
• Spark DataFrame -> GeoJSON
  • /display/chart/renderers/mapbox/mapBoxMapDisplay.py
• Get bin cutoff points for quantiles
  • /display/chart/renderers/mapbox/mapBoxMapDisplay.py
• Create choropleth styling JSON
  • /display/chart/renderers/mapbox/mapBoxMapDisplay.py
• GeoJSON data and styling JSON -> Jinja2 template
  • /display/chart/renderers/mapbox/templates/mapView.html
• Render template inside an <iframe> inside the cell
  • /display/chart/renderers/mapbox/templates/iframesrcdoc.html
• Call Mapbox base mapping service for streets underlay
  • /display/chart/renderers/mapbox/templates/mapView.html

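The pipeline on this slide can be sketched in plain Python. This is not PixieDust's actual code: the sample rows, the function names, the nearest-rank quantile rule, and the use of a Mapbox GL `step` expression on a circle layer are all assumptions made for illustration, and `string.Template` stands in for Jinja2 so the sketch needs only the standard library. The final step, where Mapbox GL JS draws the data over a streets base map, is left out.

```python
import html
import json
from string import Template

# Hypothetical rows standing in for a Spark DataFrame collected to the driver.
rows = [
    {"name": "Boston", "lon": -71.06, "lat": 42.36, "value": 4},
    {"name": "Cambridge", "lon": -71.11, "lat": 42.37, "value": 9},
    {"name": "Somerville", "lon": -71.10, "lat": 42.39, "value": 15},
    {"name": "Brookline", "lon": -71.12, "lat": 42.33, "value": 22},
]


def rows_to_geojson(rows, lon_key="lon", lat_key="lat"):
    """Step 1: turn tabular rows into a GeoJSON FeatureCollection of points."""
    features = []
    for row in rows:
        props = {k: v for k, v in row.items() if k not in (lon_key, lat_key)}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude]
            "geometry": {"type": "Point",
                         "coordinates": [row[lon_key], row[lat_key]]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


def quantile_cutoffs(values, bins):
    """Step 2: nearest-rank cut points that split the values into `bins` groups."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    n = len(ordered)
    return [ordered[min(n - 1, (i * n) // bins)] for i in range(1, bins)]


def choropleth_paint(prop, cutoffs, colors):
    """Step 3: build styling JSON as a Mapbox GL 'step' expression.

    Mapbox expects strictly ascending stops, so duplicate cutoffs would
    have to be removed first; that handling is omitted here.
    """
    if len(colors) != len(cutoffs) + 1:
        raise ValueError("need exactly one more color than cutoffs")
    expr = ["step", ["get", prop], colors[0]]
    for cut, color in zip(cutoffs, colors[1:]):
        expr += [cut, color]
    return {"circle-color": expr}


# Step 4: a stand-in for the Jinja2 map template (string.Template, not Jinja2).
MAP_PAGE = Template("""<!DOCTYPE html>
<html><body>
<div id="map"></div>
<script>
  var geojson = $geojson;
  var paint = $paint;
</script>
</body></html>""")


def render_iframe(geojson, paint):
    """Steps 4-5: fill the template, then wrap the page in an iframe srcdoc."""
    page = MAP_PAGE.substitute(geojson=json.dumps(geojson),
                               paint=json.dumps(paint))
    # srcdoc is an HTML attribute, so the whole page must be escaped.
    return '<iframe srcdoc="{}" width="100%" height="400"></iframe>'.format(
        html.escape(page, quote=True))


geojson = rows_to_geojson(rows)
cutoffs = quantile_cutoffs([r["value"] for r in rows], bins=3)
paint = choropleth_paint("value", cutoffs, ["#ffffcc", "#fd8d3c", "#800026"])
iframe_html = render_iframe(geojson, paint)
```

In a notebook, a string like `iframe_html` could be shown with `IPython.display.HTML(iframe_html)`. Rendering inside an iframe keeps the map's scripts and styles isolated from the notebook page.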
Whither Pixiedust mapping?
• More cartographic options
• Animated temporal visualization
• Partner integration: switch to official Mapbox Jupyter lib
• Partner integration: Esri, CARTO providers

References
• IBM Data Science Experience: http://datascience.ibm.com (free 30-day trial)
• Pixiedust: https://github.com/ibm-watson-data-lab/pixiedust
• Project Jupyter: http://jupyter.org/
• Me: Raj Singh