Slide 1

Slide 1 text

© 2016 Continuum Analytics - Confidential & Proprietary
Leveraging Jupyter to build an Excel-Python bridge
JupyterCon 2017
Christine Doig, Senior Product Manager & Data Scientist
Fabio Pliger, Tech Lead
August 24th, 2017

Slide 2

Slide 2 text

Overview & Examples

Slide 3

Slide 3 text

How can you increase the impact of data science in your organization by 100x?
Data Scientists: Jupyter users, est. 3-6 million
Business Analysts: Excel users, est. 750 million

Slide 4

Slide 4 text

Business analysts vs Data scientists

              Business analysts                    Data scientists
Works with:   Excel, Tableau, SQL                  Python, Hadoop, Spark
Data:         Spreadsheets, tables                 Dataframes, arrays
Delivers:     Reports, dashboards, spreadsheets    Notebooks, code, interactive visualizations

Slide 5

Slide 5 text

Business analysts are being left out of the data science revolution
• Big Data & ETL
• Interactive Data Visualizations
• Machine Learning
• Statistics and Advanced Analytics

Slide 6

Slide 6 text

Anaconda Fusion is a bridge between Excel & Python
• Big Data & ETL
• Interactive Data Visualizations
• Machine Learning
• Statistics and Advanced Analytics

Slide 7

Slide 7 text

Analysts and Data Scientists can keep using their preferred tools

Slide 8

Slide 8 text

Self-service Big Data analytics
Diagram labels: Head node; Compute nodes; Jupyter notebook; Interactive Data Visualizations; Machine Learning Predictions; Extract, transform and query data

Slide 9

Slide 9 text

“No Code” Data Science Example
1. Select Anaconda Fusion Notebook and click “Upload”
2. Select function you wish to run
3. Click “Run”
4. Data is loaded into spreadsheet

Slide 10

Slide 10 text

Just change one line of code in your notebook
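The slide text does not preserve the changed line itself, so the following is only a sketch of the general idea: a single decorator line marks an existing notebook function as callable from a spreadsheet front end. The names `REGISTRY`, `expose`, and `total_revenue` are hypothetical and are not the Anaconda Fusion API.

```python
# Hypothetical stand-in for a bridge library; NOT the Anaconda Fusion API.
REGISTRY = {}


def expose(func):
    """Record a notebook function so a spreadsheet front end could list and call it."""
    REGISTRY[func.__name__] = func
    return func


@expose  # the single added line; the function body is unchanged
def total_revenue(rows):
    """Sum price * quantity over (price, quantity) rows."""
    return sum(price * qty for price, qty in rows)


if __name__ == "__main__":
    print(sorted(REGISTRY))                                 # names a front end could offer
    print(REGISTRY["total_revenue"]([(2.0, 3), (1.5, 4)]))  # call by name
```

The function still works as ordinary Python inside the notebook; the decorator only adds it to a lookup table.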

Slide 11

Slide 11 text

Anaconda Fusion use cases
• Extract data – pull data directly into Excel to perform analysis
• Machine Learning – use trained models created by Data Scientists and plug them into your spreadsheet data
• Interactive Visualizations – create custom advanced interactive graphs, charts and plots from Excel data
• Big Data – analyze, transform, model and query data stored in Hadoop and Spark
Figure: Anaconda Fusion on Mac
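As an illustration of the Machine Learning use case, here is a minimal sketch of applying an already-trained model to rows read from a sheet. It assumes the model is a simple linear one; the coefficients, the column names, and the helpers `predict` and `score_sheet` are all made up for the example.

```python
# Illustrative coefficients standing in for a model a data scientist trained elsewhere.
MODEL = {"intercept": 10.0, "weights": {"ad_spend": 2.0, "visits": 0.5}}


def predict(row, model=MODEL):
    """Linear prediction for one spreadsheet row given as {column: value}."""
    return model["intercept"] + sum(
        weight * row[name] for name, weight in model["weights"].items()
    )


def score_sheet(header, rows):
    """Return the table with a 'prediction' column appended to every row."""
    scored = [header + ["prediction"]]
    for values in rows:
        scored.append(list(values) + [predict(dict(zip(header, values)))])
    return scored


if __name__ == "__main__":
    header = ["ad_spend", "visits"]
    rows = [[5.0, 20.0], [0.0, 100.0]]
    for line in score_sheet(header, rows):
        print(line)
```

The analyst only sees a new column in the sheet; the model definition stays on the Python side.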

Slide 12

Slide 12 text

Examples
• Run Queries
• Run Predictive Models
• Run Big Data Text Analytics

Slide 13

Slide 13 text

Slide 14

Slide 14 text

Slide 15

Slide 15 text

Slide 16

Slide 16 text

Features & architecture

Slide 17

Slide 17 text

Features – Formula bar

Slide 18

Slide 18 text

Features – Write back to Excel
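Writing results back comes down to mapping a table returned by Python onto a cell range. A minimal sketch, assuming the result is a list of rows and the target is given by its top-left cell; `column_letter` and `place_table` are hypothetical helpers, not part of Anaconda Fusion.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def place_table(table, top_row=1, left_col=1):
    """Map a 2D list to {A1-style address: value}, anchored at the given cell."""
    cells = {}
    for r, row in enumerate(table):
        for c, value in enumerate(row):
            cells[f"{column_letter(left_col + c)}{top_row + r}"] = value
    return cells


if __name__ == "__main__":
    result = [["region", "forecast"], ["EMEA", 120.5], ["APAC", 98.0]]
    # Anchor the table at cell B2.
    for address, value in place_table(result, top_row=2, left_col=2).items():
        print(address, value)
```

A real add-in would hand this mapping to the spreadsheet's own API; the sketch stops at computing the addresses.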

Slide 19

Slide 19 text

Features – Interactive visualizations

Slide 20

Slide 20 text

Features – Save custom advanced visualizations

Slide 21

Slide 21 text

Jupyter as a Platform
• OSS – base of most successful modern software
• Maturity – long history
• Diversity
  • 100s of projects
  • 1000s of contributors
• Vision
  • JupyterLab
• Community & Support
• Popularity

Slide 22

Slide 22 text

Jupyter as a Tech Choice
• The Jupyter ecosystem
  • https://github.com/jupyter
  • https://github.com/jupyterlab
  • https://github.com/phosphorjs
• Great community/support
• Very pluggable*
• Perfect for our use case
  • e.g.: why can’t Excel have ML?
  • e.g.: why can’t Excel do things that NumPy/pandas do?
  • e.g.: we need better graphics (à la Bokeh ;) ) for a dashboard of our metrics in Excel

Slide 23

Slide 23 text

Use case
Diagram labels: Excel fusion; Fusion Server; Notebook kernels
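The diagram labels suggest a spreadsheet add-in talking to a server, which in turn runs code in notebook kernels. The slide does not describe the wire format, so the sketch below simply assumes a JSON request naming a function and its arguments; `handle_request`, `KERNEL_FUNCTIONS`, and `moving_average` are hypothetical, and a plain Python function stands in for a kernel.

```python
import json


def moving_average(values, window):
    """Stand-in for code living in a notebook kernel."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Functions the (hypothetical) server can dispatch to.
KERNEL_FUNCTIONS = {"moving_average": moving_average}


def handle_request(raw):
    """Server side: decode a request, run the named function, encode the reply."""
    request = json.loads(raw)
    func = KERNEL_FUNCTIONS.get(request["function"])
    if func is None:
        return json.dumps({"status": "error", "message": "unknown function"})
    return json.dumps({"status": "ok", "result": func(**request["args"])})


if __name__ == "__main__":
    # Client side: what a spreadsheet add-in might send.
    request = json.dumps(
        {"function": "moving_average", "args": {"values": [1, 2, 3, 4], "window": 2}}
    )
    print(json.loads(handle_request(request)))
```

Keeping the exchange to plain JSON means the spreadsheet side needs no Python at all, which fits the slide's split between the Excel client and the kernels.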

Slide 24

Slide 24 text

Use case
Diagram labels: Excel fusion; Anaconda Platform; Notebooks, Apps, …; kernels; API; Apps

Slide 25

Slide 25 text

A marketplace for Anaconda Fusion notebooks
https://anaconda.org

Slide 26

Slide 26 text

THANK YOU!
Christine Doig, [email protected], @ch_doig
Fabio Pliger, [email protected], @b_smoke

Slide 27

Slide 27 text

QUESTIONS?