
Sprinkle some Pixiedust on your Jupyter Notebooks

Raj Singh
November 19, 2016

The Jupyter stack is built from the ground up to be extensible and hackable, but Jupyter is a big planet with many moons. The Developer Advocacy team at IBM Analytics has developed an open source library of time-saving, anxiety-reducing tools we call "Pixiedust". It eases the pain of charting and graphing, saving data to the cloud, and exposing Python data structures to Scala code. I'll introduce Pixiedust's features and discuss how the community can contribute.


Transcript

  1. IBM Cloud Data Services: Sprinkle some pixiedust on your Jupyter Notebooks
    Raj Singh, PhD, Developer Advocate: Geo | Open Data
    rrsingh@us.ibm.com, http://ibm.biz/rajrsingh, twitter: @rajrsingh
  2. Old-school analytics https://writescience.wordpress.com/tag/scientific-method/
  3. A browser-based notebook with support for code, text, mathematical expressions, inline plots and other media
  4. Jupyter features
    • Edit code in the browser, with automatic syntax highlighting, indentation, and tab completion/introspection.
    • Run code from the browser, with the results of computations attached to the code which generated them.
    • See the results of computations with rich media representations, such as HTML, LaTeX, PNG, SVG, PDF, etc.
    • Author narrative text using the Markdown markup language.
    • JavaScript widgets, binding interactive UI controls and visualizations to reactive kernel-side computations.
  5. IBM Data Science Experience. IBM Cloud Data Services makes data simple. ©2016 IBM Corporation
  6. PixieDust: an Open Source Library that simplifies and improves Jupyter Python Notebooks
  7. Jupyter + PixieDust
    1. PackageManager
    2. Visualizations
    3. Cloud Integration
    4. Scala Bridge
    5. Extensibility
    6. Embedded Apps
    https://github.com/ibm-cds-labs/pixiedust
  8. 1. Pixiedust Package Manager
    • Install Spark packages or plain jars in your Notebook Python kernel without the need to modify a configuration file
    Install GraphFrames Spark Package
    Uses the GraphFrame Python APIs
  9. 2. Visualizations
    Call the Options dialog
    Performance statistics
    Panning/Zooming options
    One simple API: display()
  10. 3. Cloud Integration
    Easily export your data to CSV, JSON, HTML, etc. locally on your laptop or into a cloud-based service like Cloudant or Object Storage
  11. 4. Scala Bridge
    • Execute Scala code directly from your python Notebook

        %%scala
        val demo = com.ibm.cds.spark.samples.StreamingTwitter
        demo.setConfig("twitter4j.oauth.consumerKey","XXXXX")
        demo.setConfig("twitter4j.oauth.consumerSecret","XXXXX")
        demo.setConfig("twitter4j.oauth.accessToken","XXXXX")
        demo.setConfig("twitter4j.oauth.accessTokenSecret","XXXXX")
        demo.setConfig("watson.tone.url","https://watsonplatform.net/tone-analyzer/api")
        demo.setConfig("watson.tone.password","XXXXX")
        demo.setConfig("watson.tone.username","XXXX")
        import org.apache.spark.streaming._
        demo.startTwitterStreaming(sc, Seconds(10))

    Define Python variable:
        pythonVar = "pixiedust"
    Use the python var in Scala:
        println(pythonVar)
    Define scala variable:
        val __fromScalaVar = "Hello from Scala"
    Use the scala var in Python:
        print(__fromScalaVar)
  12. 5. Extensibility
    • Easily extend PixieDust to create your own visualizations using HTML/CSS/JavaScript
    Customized Visualization for GraphFrame Graphs
  13. 6. Embed Apps in Notebooks
    • Encapsulate your analytics into compelling User Interfaces better suited for Line of Business Users

        from pixiedust_twitterdemo import *
        twitterDemo()
  14. Graphs in matplotlib
  15. demo
  16. Spark: display DataFrame
  17. Pixiedust: display DataFrame
  18. Thanks
    • https://github.com/ibm-cds-labs/pixiedust
    • Data Science Experience (DSX)
      • http://datascience.ibm.com/
    • IBM Cloud Data Services on Bluemix
      • http://www.ibm.com/cloud-computing/bluemix/solutions/data-analytics/
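
The package manager step on slide 8 can be sketched roughly as follows. This is a minimal sketch, not code from the deck. It assumes PixieDust exposes `pixiedust.installPackage()` taking a `group:artifact:version` string, and that `graphframes:graphframes:0` is an acceptable identifier for the GraphFrames package; both should be checked against the PixieDust documentation. The helpers `parse_package_id` and `install_spark_package` are hypothetical names introduced only for this illustration.

```python
def parse_package_id(package_id):
    """Split a 'group:artifact:version' string into its three parts.

    Hypothetical helper for this sketch; it is not part of PixieDust.
    """
    parts = package_id.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "expected 'group:artifact:version', got %r" % (package_id,)
        )
    return tuple(parts)


def install_spark_package(package_id):
    """Ask PixieDust to install a Spark package into the notebook kernel.

    Returns False when PixieDust is not importable (for example, outside
    a notebook environment) and True after the install call was made.
    """
    # Reject malformed identifiers before touching PixieDust at all.
    parse_package_id(package_id)
    try:
        import pixiedust  # assumed to be installed in the notebook kernel
    except ImportError:
        return False
    # Assumed API: installPackage() accepts the identifier as one string.
    pixiedust.installPackage(package_id)
    return True
```

In a notebook cell this would be called as `install_spark_package("graphframes:graphframes:0")`. A kernel restart is probably needed before the GraphFrame Python APIs can be imported.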