Slide 1

Slide 1 text

IPython & Project Jupyter
A language-independent architecture for open computing and data science
Fernando Pérez (@fperez_org & [email protected]), LBL & UC Berkeley

Slide 2

Slide 2 text

“The purpose of computing is insight, not numbers”
–Hamming'62

Slide 3

Slide 3 text

The Lifecycle of a Scientific Idea (schematically)
1. Individual exploratory work
2. Collaborative development
3. Parallel production runs (HPC, cloud, ...)
4. Publication & communication (reproducibly!)
5. Education
6. Goto 1.

Slide 4

Slide 4 text

IPython: CU Boulder, 2001 or how to best procrastinate on a Physics dissertation

Slide 5

Slide 5 text

November 2001: "Just an afternoon hack"
❖ 259-line Python script.
❖ sys.ps1 -> In [N].
❖ sys.displayhook -> Out[N], caches results.
❖ Plotting, Numeric, etc.

Now (Openhub stats)
❖ 19,279 commits
❖ 442 contributors
❖ Total lines: 187,326
❖ Number of languages: 7 (JS, CSS, HTML, ...)
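The two hooks named on this slide can be sketched in a few lines. This is a minimal illustration, not the original 259-line script: `NumberedPrompt`, `caching_displayhook`, `run_cell`, and `user_ns` are hypothetical names chosen for the example, while `sys.ps1` and `sys.displayhook` are the standard interpreter hooks.

```python
import sys


class NumberedPrompt:
    """Object assigned to sys.ps1; the interpreter calls str() on it
    each time it shows a prompt, so the number can change."""

    def __init__(self):
        self.count = 1

    def __str__(self):
        return f"In [{self.count}]: "


prompt = NumberedPrompt()
Out = {}                 # cache of results, keyed by prompt number
user_ns = {"Out": Out}   # namespace shared by all inputs (illustrative)


def caching_displayhook(value):
    """Replacement for sys.displayhook: print Out[N] and cache the value."""
    if value is None:
        return
    Out[prompt.count] = value
    print(f"Out[{prompt.count}]: {value!r}")


def run_cell(source):
    """Evaluate one input line the way the interactive prompt would.
    Code compiled in 'single' mode routes expression results through
    sys.displayhook."""
    exec(compile(source, "<input>", "single"), user_ns)
    prompt.count += 1


sys.ps1 = prompt
sys.displayhook = caching_displayhook
```

Calling `run_cell("1 + 1")` and then `run_cell("Out[1] * 21")` shows how cached results can be reused by number in later inputs.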

Slide 6

Slide 6 text

Real credit goes to the whole team, plus ~500 more open source contributors!

Slide 7

Slide 7 text

Current funding

Slide 8

Slide 8 text

Beyond the Terminal…
❖ The REPL as a network protocol
❖ Kernels
    ❖ Execute code
❖ Clients
    ❖ Read input
    ❖ Present output
Simple abstractions enable rich, sophisticated clients
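The kernel/client split above can be sketched as a toy exchange of JSON messages. This is an illustration under stated assumptions, not the Jupyter message specification: the message fields and the names `ToyKernel` and `ToyClient` are invented for the example, and the transport is a plain function call where the real system uses a network connection.

```python
import io
import json
from contextlib import redirect_stdout


class ToyKernel:
    """Executes code received as a JSON string and replies with JSON."""

    def __init__(self):
        self.namespace = {}        # state persists across requests
        self.execution_count = 0

    def handle(self, raw_request):
        request = json.loads(raw_request)
        self.execution_count += 1
        captured = io.StringIO()
        try:
            # Capture anything the code prints so it can be sent back.
            with redirect_stdout(captured):
                exec(request["code"], self.namespace)
            status = "ok"
        except Exception as exc:
            status = "error"
            captured.write(f"{type(exc).__name__}: {exc}")
        return json.dumps({
            "status": status,
            "execution_count": self.execution_count,
            "output": captured.getvalue(),
        })


class ToyClient:
    """Reads input and presents output; knows nothing about execution."""

    def __init__(self, send):
        self.send = send  # any callable that carries a string to a kernel

    def run(self, code):
        return json.loads(self.send(json.dumps({"code": code})))
```

Because the client only sees strings going out and coming back, the same kernel could sit behind a terminal, a GUI console, or a web page, and a kernel in another language could answer the same messages.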

Slide 9

Slide 9 text

2011: The IPython Notebook
❖ Rich web client
❖ Text & math
❖ Code
❖ Results
❖ Share, reproduce.
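A notebook that mixes text, math, code, and results is stored as a JSON document. The sketch below builds one by hand, assuming the version 4 notebook layout; the cell contents and the `cell_types` helper are made up for illustration, and in practice a notebook file would be created through the notebook tools rather than written manually.

```python
import json

# A two-cell notebook: narrative text with math, then code with its result.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "# A tiny narrative\n\nWe check that $2 + 2 = 4$.",
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": "2 + 2",
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "metadata": {},
                    "data": {"text/plain": "4"},
                }
            ],
        },
    ],
}


def cell_types(nb):
    """Return the type of each cell, in document order (illustrative helper)."""
    return [cell["cell_type"] for cell in nb["cells"]]


# Serialize as it would be stored in an .ipynb file, then read it back.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
```

Since the results travel inside the file alongside the code and text, a reader can see the outputs without running anything.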

Slide 10

Slide 10 text

The Notebook: “Literate Computing”
Computational Narratives
❖ Computers deal with code and data.
❖ Humans deal with narratives that communicate.
Literate Computing (not Literate Programming): narratives anchored in a live computation, that communicate a story based on data and results.
Cf: Mathematica, Maple, MuPad, Sage…

Slide 11

Slide 11 text

Demo: IPython Notebook

Slide 12

Slide 12 text

From IPython to Project Jupyter

Slide 13

Slide 13 text

A simple and generic architecture

Slide 14

Slide 14 text

Not just about Python: Kernels in any language
❖ IPython: "Official", we ship it.
❖ IJulia
❖ IRKernel
❖ IHaskell
❖ IFSharp
❖ Ruby
❖ IScala
❖ IErlang
❖ Lots more! ~37 and counting
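One way a kernel for another language is made known to the clients is a small `kernel.json` file describing how to launch it. The sketch below writes such a file, assuming the kernel spec fields `argv`, `display_name`, and `language`; the `mylang-kernel` executable and the `write_kernel_spec` helper are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_kernel_spec(directory, argv, display_name, language):
    """Write a kernel.json file into `directory` and return its path."""
    spec = {
        "argv": argv,                  # command line used to start the kernel
        "display_name": display_name,  # name shown to users in clients
        "language": language,          # language the kernel executes
    }
    path = Path(directory) / "kernel.json"
    path.write_text(json.dumps(spec, indent=2))
    return path


def example():
    """Create a spec for a hypothetical kernel in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = write_kernel_spec(
            tmp,
            # {connection_file} is filled in by the launcher with the file
            # that tells the kernel which ports to listen on.
            argv=["mylang-kernel", "--connection-file", "{connection_file}"],
            display_name="MyLang (hypothetical)",
            language="mylang",
        )
        return json.loads(path.read_text())
```

The spec says nothing about Python: any program that can be started from a command line and speak the protocol can serve as a kernel.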

Slide 15

Slide 15 text

“Why is it called IPython, if it can do Julia, R, Haskell, Ruby, … ?”

Slide 16

Slide 16 text

IPython
❖ Interactive Python shell at the terminal
❖ Kernel for this protocol in Python
❖ Tools for Interactive Parallel computing
❖ Network protocol for interactive computing
❖ Clients for protocol
    ❖ Console
    ❖ Qt Console
    ❖ Notebook
❖ Notebook file format & tools (nbconvert...)
❖ Nbviewer

Slide 17

Slide 17 text

IPython … Jupyter
❖ Interactive Python shell at the terminal
❖ Kernel for this protocol in Python
❖ Tools for Interactive Parallel computing
❖ Network protocol for interactive computing
❖ Clients for protocol
    ❖ Console
    ❖ Qt Console
    ❖ Notebook
❖ Notebook file format & tools (nbconvert...)
❖ Nbviewer
Language Agnostic

Slide 18

Slide 18 text

What’s in a name?
❖ Inspired by the open languages of science:
    ❖ Julia, Python & R
    ❖ Not an acronym: all languages are equal-class citizens.
❖ Astronomy and Scientific Python:
    ❖ A long and fruitful collaboration
❖ Galileo's notebooks:
    ❖ The original open-science, data-and-narrative papers
❖ Authorea: “Science was Always meant to be Open”

Slide 19

Slide 19 text

Demo: Jupyter Notebooks

Slide 20

Slide 20 text

The Jupyter Notebook Ecosystem

Slide 21

Slide 21 text

nbviewer: seamless notebook sharing
❖ Zero-install reading of notebooks
❖ Just share a URL
❖ nbviewer.ipython.org
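As a sketch of what "just share a URL" can look like for a notebook hosted on GitHub, the helper below assembles a viewer link. It assumes the `/github/<user>/<repo>/blob/<ref>/<path>` address pattern; the function name, the default branch, and the example repository are hypothetical.

```python
from urllib.parse import quote


def nbviewer_url(user, repo, path, ref="master", host="nbviewer.ipython.org"):
    """Build a viewer URL for a notebook stored in a GitHub repository.

    The path layout is an assumption of this sketch; spaces and other
    special characters in the notebook path are percent-encoded.
    """
    return f"http://{host}/github/{user}/{repo}/blob/{ref}/{quote(path)}"


# Hypothetical repository and notebook, for illustration only.
link = nbviewer_url("someuser", "somerepo", "notebooks/My Analysis.ipynb")
```

The recipient needs only a browser: the link renders the stored notebook, outputs included, without installing anything.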

Slide 22

Slide 22 text

Reproducible Research

Slide 23

Slide 23 text

Paper, Notebooks and Virtual Machine

Slide 24

Slide 24 text

Scientific Blogging
Jake VanderPlas @ UW
http://blogs.scientificamerican.com/sa-visual/2014/09/16/visualizing-4-dimensional-asteroids

Slide 25

Slide 25 text

Executable books
❖ Springer hardcover book
❖ Chapters: IPython Notebooks
❖ Posted as a blog entry
❖ All available as a GitHub repo
Python for Signal Processing, by José Unpingco

Slide 26

Slide 26 text

More authors creating books this way By Cameron Davidson-Pilon By Matthew Russell

Slide 27

Slide 27 text

University Courses These are just some we are aware of!

Slide 28

Slide 28 text

A collaborative MOOC on OpenEdX
http://lorenabarba.com/news/announcing-practical-numerical-methods-with-python-mooc
❖ Lorena Barba at George Washington University, USA.
❖ Ian Hawke at Southampton, UK.
❖ Carlos Jerez at Pontifical Catholic University of Chile.
❖ All materials on GitHub.

Slide 29

Slide 29 text

Books about IPython
❖ IPython Interactive Computing and Visualization Cookbook
❖ Learning IPython for Interactive Computing and Data Visualization
Cyrille Rossant, cyrille.rossant.net

Slide 30

Slide 30 text

Changing the scientific culture http://www.nature.com/news/interactive-notebooks-sharing-the-code-1.16261

Slide 31

Slide 31 text

Executable papers: the future? http://www.nature.com/news/ipython-interactive-demo-7.21492?article=1.16261

Slide 32

Slide 32 text

Notebook Workflows: The Big Picture Image credit: Joshua Barratt

Slide 33

Slide 33 text

Lots more! The IPython Gallery
https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks

Slide 34

Slide 34 text

Jupyter as Infrastructure OSS and commercial adoption

Slide 35

Slide 35 text

Microsoft: Python Tools for Visual Studio Shahrokh Mortazavi, Dino Viehland, Wenming Ye, Dennis Gannon.

Slide 36

Slide 36 text

Microsoft Azure: Notebooks in the Cloud

Slide 37

Slide 37 text

Google CoLaboratory
Kayur Patel, Kester Tong, Mark Sanders, Corinna Cortes @ Google
Matt Turk @ NCSA/UIUC

Slide 38

Slide 38 text

IBM Watson

Slide 39

Slide 39 text

Quantopian: algorithmic trading Karen Rubin Dir. Product Management at Quantopian Quantopian Research Post Fortune.com

Slide 40

Slide 40 text

Authorea: notebooks in papers https://www.authorea.com/users/3/articles/3904/_show_article

Slide 41

Slide 41 text

New directions

Slide 42

Slide 42 text

Full-page text editor

Slide 43

Slide 43 text

In-browser terminal (real-time sync)

Slide 44

Slide 44 text

Google CoLab: next steps
❖ Google Research funding a postdoc @ Berkeley. Thanks!
❖ Integrate real-time collaboration into Jupyter architecture.
    ❖ First, supported on Google Drive.
    ❖ Then, generalize, support other real-time backends.
Matthias Bussonnier @ Berkeley
Kester Tong @ Google

Slide 45

Slide 45 text

JupyterHub: multiuser support
❖ Out of the box
    ❖ Unix accounts
    ❖ Local single-user notebooks
❖ Customizable
    ❖ Authentication: OAuth, LDAP, etc.
    ❖ Subprocess control: Docker, VMs, etc.

Slide 46

Slide 46 text

JupyterHub in Education @ Berkeley
https://developer.rackspace.com/blog/deploying-jupyterhub-for-education
❖ Computationally intensive course, ~220 students
❖ Fully hosted environment, zero-install
❖ Homework management and grading (with B. Granger)
Jess Hamrick @ Cal, K. Kelley @ Rackspace, M. Ragan-Kelley @ Cal, B. Granger @ Cal Poly

Slide 47

Slide 47 text

Thank You
@fperez_org, [email protected]
@ProjectJupyter, @IPythonDev
Try it out at try.jupyter.org