Slide 1

Slide 1 text

© 2017 IBM Corp. IBM Cloud & Watson @rajrsingh
Jupyter, Pixiedust & Maps
Simplifying spatial visualization in Jupyter Notebooks
Raj Singh
Developer Advocate, IBM
August, 2017

Slide 2

Slide 2 text

“Good Programmers are Lazy and Dumb” -- Philipp Lenssen
• only lazy programmers will want to write the kind of tools that replace them
• only a lazy programmer will avoid writing monotonous, repetitive code – thus avoiding redundancy, the enemy of software maintenance and flexible refactoring
• tools and processes that come out will speed up production

Slide 3

Slide 3 text

Machine learning Spark Visualization Notebooks Data lakes Updates Data cleansing Sharing & Collaboration Automated error correction table joins Database Interoperability Schema mapping ETL Model fitting Moving between platforms linear regression Security Good Bad Ugly!

Slide 4

Slide 4 text

Is this really charting in 2017?

Slide 5

Slide 5 text

    import matplotlib
    import matplotlib.pyplot as plt
    from mpl_toolkits.basemap import Basemap
    from matplotlib.offsetbox import AnnotationBbox
    from matplotlib._png import read_png
    from itertools import izip  # Python 2

    matplotlib.style.use('bmh')
    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 12))

    # background maps
    m1 = Basemap(projection='mill', resolution=None,
                 llcrnrlon=-7.5, llcrnrlat=49.84, urcrnrlon=2.5, urcrnrlat=59,
                 ax=axes[0])
    m1.drawlsmask(land_color='dimgrey', ocean_color='dodgerBlue', lakes=True)

    # temperature map (m2, temps and cities are set up elsewhere; not shown on the slide)
    for [temp, city] in izip(temps, cities):
        lat = city[1]
        lon = city[2]
        # first threshold read as 12; the slide shows temp>8 twice,
        # which would leave the next two branches unreachable
        if temp > 12:
            col = 'indigo'
        elif temp > 10:
            col = 'darkmagenta'
        elif temp > 8:
            col = 'red'
        elif temp > 6:
            col = 'tomato'
        elif temp > 4:
            col = 'turquoise'
        x1, y1 = m2(lon, lat)
        bbox_props = dict(boxstyle="round,pad=0.3", fc=col, ec=col, lw=2)
        axes[1].text(x1, y1, temp, ha="center", va="center", size=11, bbox=bbox_props)

    plt.tight_layout()

Is this really mapping in 2017?

Slide 6

Slide 6 text

Enter Pixiedust with Mapbox…

Slide 7

Slide 7 text


Slide 8

Slide 8 text

Jupyter + Pixiedust =
1. PackageManager
2. Visualizations
3. Cloud Integration
4. Scala Bridge
5. Extensibility
6. Embedded Apps

Slide 9

Slide 9 text

1. Package Manager
Install Spark packages or plain jars in your Notebook Python kernel without modifying a configuration file
• Install GraphFrames Spark Package
• Uses the GraphFrame Python APIs

Slide 10

Slide 10 text

2. Visualizations
One simple API: display()
• Call the Options dialog
• Performance statistics
• Panning/Zooming options

Slide 11

Slide 11 text

3. Cloud Integration
Easily export your data to CSV, JSON, HTML, etc., either locally on your laptop or into a cloud-based service like Cloudant or Object Storage

Slide 12

Slide 12 text

4. Scala Bridge
Execute Scala code directly from your Python Notebook

    %%scala
    val demo = com.ibm.cds.spark.samples.StreamingTwitter
    demo.setConfig("twitter4j.oauth.consumerKey","XXXXX")
    demo.setConfig("twitter4j.oauth.consumerSecret","XXXXX")
    demo.setConfig("twitter4j.oauth.accessToken","XXXXX")
    demo.setConfig("twitter4j.oauth.accessTokenSecret","XXXXX")
    demo.setConfig("watson.tone.url","https://watsonplatform.net/tone-analyzer/api")
    demo.setConfig("watson.tone.password","XXXXX")
    demo.setConfig("watson.tone.username","XXXX")
    import org.apache.spark.streaming._
    demo.startTwitterStreaming(sc, Seconds(10))

Define Python variable:
    pythonVar = "pixiedust"
Use the Python var in Scala:
    println(pythonVar)
Define Scala variable:
    val __fromScalaVar = "Hello from Scala"
Use the Scala var in Python:
    print(__fromScalaVar)

Slide 13

Slide 13 text

5. Extensibility
Easily extend PixieDust to create your own visualizations using HTML/CSS/JavaScript
• Customized Visualization for GraphFrame Graphs

Slide 14

Slide 14 text

6. Embed Apps in Notebooks
PixieApps encapsulate analytics into lightweight HTML UIs for code-phobic end users

Slide 15

Slide 15 text

Demo
• https://apsportal.ibm.com/analytics/notebooks/f2bfaebf-94ec-48a5-aed4-f2bd01226ae3/view?access_token=0b7840132d8634f682b19a74d57064f75b39c6dfdbee83c28c00cb0fe69d6326

Slide 16

Slide 16 text

How it all works
• Spark DataFrame -> GeoJSON
    • /display/chart/renderers/mapbox/mapBoxMapDisplay.py
• Get bin cutoff points for quantiles
    • /display/chart/renderers/mapbox/mapBoxMapDisplay.py
• Create choropleth styling JSON
    • /display/chart/renderers/mapbox/mapBoxMapDisplay.py
• GeoJSON data and styling JSON => Jinja2 template
    • /display/chart/renderers/mapbox/templates/mapView.html
• Render template inside an iframe inside the cell
    • /display/chart/renderers/mapbox/templates/iframesrcdoc.html
• Call Mapbox base mapping service for streets underlay
    • /display/chart/renderers/mapbox/templates/mapView.html
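
The steps on the "How it all works" slide can be sketched in plain Python. This is a rough illustration under stated assumptions, not the code in mapBoxMapDisplay.py: the function names are hypothetical, a list of dicts stands in for the Spark DataFrame, the standard library's string.Template stands in for Jinja2, the paint dictionary only imitates the shape of a Mapbox GL "stops" style function, and the sample values are made up. The call to the Mapbox base map service is left out because it needs an access token.

```python
import json
from html import escape
from string import Template


def rows_to_geojson(rows, lon_key="lon", lat_key="lat"):
    """Turn dict rows (a stand-in for DataFrame rows) into a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        # Everything except the coordinates becomes a feature property.
        props = {k: v for k, v in row.items() if k not in (lon_key, lat_key)}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [row[lon_key], row[lat_key]]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


def quantile_cutoffs(values, bins=5):
    """Return bins-1 cutoff values that split the sorted data into roughly equal groups."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[min(n - 1, (i * n) // bins)] for i in range(1, bins)]


def interval_paint(field, cutoffs, colors, floor):
    """Pair each bin's lower bound with a color (hypothetical styling JSON shape)."""
    if len(colors) != len(cutoffs) + 1:
        raise ValueError("need exactly one more color than cutoffs")
    stops = [[bound, color] for bound, color in zip([floor] + list(cutoffs), colors)]
    return {"circle-color": {"property": field, "type": "interval", "stops": stops}}


# Stand-in for a Jinja2 template such as mapView.html.
MAP_VIEW = Template("""<div id="map"></div>
<script>
var data = $geojson;
var paint = $paint;
</script>""")


def render_cell(geojson, paint):
    """Fill the template, then wrap it in an iframe via the srcdoc attribute."""
    inner = MAP_VIEW.substitute(geojson=json.dumps(geojson), paint=json.dumps(paint))
    # srcdoc is an HTML attribute, so the inner document must be escaped.
    return '<iframe srcdoc="{}" style="width:100%;height:400px"></iframe>'.format(
        escape(inner, quote=True))


if __name__ == "__main__":
    # Illustrative rows only; not real measurements.
    rows = [
        {"city": "A", "lon": -0.1, "lat": 51.5, "temp": 12},
        {"city": "B", "lon": -3.2, "lat": 55.9, "temp": 6},
        {"city": "C", "lon": -2.2, "lat": 53.5, "temp": 9},
        {"city": "D", "lon": -1.9, "lat": 52.5, "temp": 15},
    ]
    temps = [r["temp"] for r in rows]
    paint = interval_paint("temp", quantile_cutoffs(temps, bins=3),
                           ["turquoise", "tomato", "indigo"], floor=min(temps))
    print(render_cell(rows_to_geojson(rows), paint)[:80])
```

Putting the rendered page in srcdoc keeps the map's JavaScript and CSS isolated from the notebook page, which is one plausible reason for the iframe step on the slide.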

Slide 17

Slide 17 text

Whither Pixiedust mapping?
• More cartographic options
• Animated temporal visualization
• Partner integration: switch to official Mapbox Jupyter lib
• Partner integration: Esri, CARTO providers

Slide 18

Slide 18 text

References
• IBM Data Science Experience
    • http://datascience.ibm.com
    • free 30-day trial
• Pixiedust
    • https://github.com/ibm-watson-data-lab/pixiedust
• Project Jupyter
    • http://jupyter.org/
• Me
    • rrsingh@us.ibm.com

Slide 19

Slide 19 text
