Slide 1

Slide 1 text

For Interactive Data Science Collaboration CineGrid December 10, 2015

Slide 2

Slide 2 text

HELLO

Slide 3

Slide 3 text

CAROL WILLING ➤ Python Software Foundation, Director ➤ Project Jupyter, Contributor ➤ Fab Lab San Diego, Geek in Residence

Slide 4

Slide 4 text

WRITER

Slide 5

Slide 5 text

MANAGER AND ANALYST

Slide 6

Slide 6 text

ENGINEER

Slide 7

Slide 7 text

ARTIST

Slide 8

Slide 8 text

TEACHER

Slide 9

Slide 9 text

WONDER AND CURIOSITY

Slide 10

Slide 10 text

PROJECT JUPYTER Just the Facts

Slide 11

Slide 11 text

JUPYTER NOTEBOOK

Slide 12

Slide 12 text

The Notebook: “Literate Computing” Computational Narratives ❖ Computers deal with code and data. ❖ Humans deal with narratives that communicate. Literate Computing (not Literate Programming) narratives anchored in a live computation, that communicate a story based on data and results. Cf: Mathematica, Maple, MuPad, Sage…
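A minimal sketch of what a "computational narrative" looks like on disk, assuming the version 4 notebook JSON layout (a list of markdown and code cells). The helper names `markdown_cell`, `code_cell` and `build_notebook` and the cell contents are hypothetical, chosen for illustration only; in practice the `nbformat` library would normally be used instead of hand-built dictionaries.

```python
import json


def markdown_cell(text):
    """Narrative cell: prose written for human readers."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Computation cell: code that a kernel can execute (not yet run here)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_notebook(cells):
    """Wrap cells in the top-level structure assumed for a v4 notebook."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 0}


# Narrative and computation interleaved in one document (illustrative content).
notebook = build_notebook([
    markdown_cell("# Question\nHow large is the average measurement?"),
    code_cell("data = [2, 4, 6]\nsum(data) / len(data)"),
    markdown_cell("The result above anchors the story in a live computation."),
])

# Serialized form, as it would be saved to a .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
```

The point of the sketch is only that prose and code live side by side in one shareable file; outputs would be filled in once a kernel runs the code cell.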

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

“Project Jupyter serves not only the academic and scientific communities but also a much broader constituency of data scientists in research, education, industry and journalism…” - Fernando Pérez, UC Berkeley

Slide 15

Slide 15 text

“…we see uses of our tools that range from high school education in programming to the nation’s supercomputing facilities and the leaders of the tech industry.” - Fernando Pérez, UC Berkeley

Slide 16

Slide 16 text

“More than a million people are currently using Jupyter for everything from…” - Prof. Brian Granger, Cal Poly

Slide 17

Slide 17 text

“…analyzing massive gene sequencing datasets to processing images from the Hubble Space Telescope and developing models of financial markets.” - Prof. Brian Granger, Cal Poly

Slide 18

Slide 18 text

“We are excited by the potential of Project Jupyter to reach even wider audiences and to contribute to increased cross-disciplinary collaboration in the sciences.” - Betsy Fader, Helmsley Charitable Trust

Slide 19

Slide 19 text

“Jupyter Notebook… will enable data exploration, visualization, and analysis in a way that encourages sound science and speeds progress.” - Chris Mentzel, The Gordon and Betty Moore Foundation

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

DATA CHALLENGES Constraints or Opportunities?

Slide 22

Slide 22 text

SCALE

Slide 23

Slide 23 text

SPEED

Slide 24

Slide 24 text

CHOICES

Slide 25

Slide 25 text

CONNECTIONS

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

OPPORTUNITIES Use our strengths

Slide 28

Slide 28 text

–Hamming'62 “The purpose of computing is insight, not numbers”

Slide 29

Slide 29 text

The Lifecycle of a Scientific Idea (schematically) 1. Individual exploratory work 2. Collaborative development 3. Parallel production runs (HPC, cloud, ...) 4. Publication & communication (reproducibly!) 5. Education 6. Goto 1.

Slide 30

Slide 30 text

JUPYTERHUB and Project Jupyter ecosystem

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

EDUCATION

Slide 33

Slide 33 text

nbviewer: seamless notebook sharing ❖ Zero-install reading of notebooks ❖ Just share a URL ❖ nbviewer.ipython.org
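"Just share a URL" can be sketched as follows for a notebook hosted in a GitHub repository. This assumes the `/github/<owner>/<repo>/blob/<ref>/<path>` address pattern that nbviewer uses for GitHub-hosted notebooks; the function name `nbviewer_url`, the default branch, and the example owner, repository and path are hypothetical.

```python
from urllib.parse import quote


def nbviewer_url(owner, repo, path, ref="master", host="nbviewer.ipython.org"):
    """Build a shareable nbviewer link for a notebook stored on GitHub.

    The host default is the one named on the slide; the URL layout is an
    assumption about how nbviewer addresses GitHub content.
    """
    # Percent-encode spaces and other unsafe characters, keeping "/" intact.
    quoted_path = quote(path.lstrip("/"))
    return f"http://{host}/github/{owner}/{repo}/blob/{ref}/{quoted_path}"


# Hypothetical repository and notebook, for illustration only.
link = nbviewer_url("someuser", "somerepo", "notebooks/My Analysis.ipynb")
```

Anyone who receives the resulting link can read the rendered notebook in a browser without installing anything.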

Slide 34

Slide 34 text

Executable books ❖ Springer hardcover book ❖ Chapters: IPython Notebooks ❖ Posted as a blog entry ❖ All available as a GitHub repo Python for Signal Processing, by José Unpingco

Slide 35

Slide 35 text

University Courses These are just some we are aware of!

Slide 36

Slide 36 text

A collaborative MOOC on OpenEdX http://lorenabarba.com/news/announcing-practical-numerical-methods-with-python-mooc ❖ Lorena Barba at George Washington University, USA. ❖ Ian Hawke at Southampton, UK. ❖ Carlos Jerez at Pontifical Catholic University of Chile. ❖ All materials on GitHub.

Slide 37

Slide 37 text

Changing the scientific culture http://www.nature.com/news/interactive-notebooks-sharing-the-code-1.16261

Slide 38

Slide 38 text

Executable papers: the future? http://www.nature.com/news/ipython-interactive-demo-7.21492?article=1.16261

Slide 39

Slide 39 text

Notebook Workflows: The Big Picture Image credit: Joshua Barratt

Slide 40

Slide 40 text

Lots more! The IPython Gallery https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks

Slide 41

Slide 41 text

GOVERNMENT

Slide 42

Slide 42 text

Shreyas Cholia & Oliver Ruebel, NERSC Data & Analytics Services Group, Jupyterhub Day, July 17, 2015. Jupyterhub at NERSC and OpenMSI

Slide 43

Slide 43 text

NERSC is the Production HPC & Data Facility for DOE Office of Science Research ❖ Bio Energy, Environment ❖ Computing ❖ Materials, Chemistry, Geophysics ❖ Particle Physics, Astrophysics ❖ Nuclear Physics ❖ Fusion Energy, Plasma Physics. Largest funder of physical science research in U.S.

Slide 44

Slide 44 text

ART

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

BUSINESS

Slide 48

Slide 48 text

Quantopian: algorithmic trading Karen Rubin Dir. Product Management at Quantopian Quantopian Research Post Fortune.com

Slide 49

Slide 49 text

Microsoft: Python Tools for Visual Studio Shahrokh Mortazavi, Dino Viehland, Wenming Ye, Dennis Gannon.

Slide 50

Slide 50 text

Microsoft Azure: Notebooks in the Cloud

Slide 51

Slide 51 text

Google CoLaboratory Kayur Patel, Kester Tong, Mark Sanders, Corinna Cortes @ Google Matt Turk @ NCSA/UIUC

Slide 52

Slide 52 text

IBM Watson

Slide 53

Slide 53 text

SCIENCE

Slide 54

Slide 54 text

JupyterHub: multiuser support ❖ Out of the box ❖ Unix accounts ❖ Local single-user notebooks ❖ Customizable ❖ Authentication: OAuth, LDAP, etc. ❖ Subprocess control: Docker, VMs, etc.

Slide 55

Slide 55 text

JupyterHub in Education @ Berkeley https://developer.rackspace.com/blog/deploying-jupyterhub-for-education ❖ Computationally intensive course, ~220 students ❖ Fully hosted environment, zero-install ❖ Homework management and grading (w B. Granger) Jess Hamrick @ Cal K. Kelley Rackspace M. Ragan-Kelley Cal B. Granger Cal Poly

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

COLLABORATION Why?

Slide 63

Slide 63 text

A ten year journey. Optimism and hope for the future.

Slide 64

Slide 64 text

IMAGINE THE POSSIBILITIES

Slide 65

Slide 65 text

TRY.JUPYTER.ORG

Slide 66

Slide 66 text

WE’RE OPEN FOR YOU.

Slide 67

Slide 67 text

THANK YOU try.jupyter.org www.jupyter.org numfocus.org ipython.org

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

CREDITS AND ATTRIBUTION
➤ Sources
➤ Jupyter website www.jupyter.org [11, 31, 65, 66, 69]
➤ Fernando Pérez [12, 28, 29, 33-40, 48-52, 53-55] http://fperez.org/ BIDS http://bids.berkeley.edu/
➤ Cal Poly and UC Berkeley Press Releases http://calpolynews.calpoly.edu/news_releases/2015/July/jupyter.html, http://bids.berkeley.edu/news/project-jupyter-gets-6m-expand-collaborative-data-science-software [14-19]
➤ Jupyterhub at NERSC and OpenMSI, S. Cholia and O. Ruebel, Jupyterhub Day presentation, July 17, 2015 [42, 43]
➤ music21 website http://web.mit.edu/music21/ [45]
➤ Jeremy Freeman http://jeremyfreeman.net/ PyData Talk NYC Winter 2015 https://github.com/freeman-lab/talk-nyc-winter-2015 [56, 57, 58]
➤ CodeNeuro website http://codeneuro.org/ [59-60]
➤ Binder website http://mybinder.org/ [61]
➤ Images
➤ [2, 10, 21, 27, 30, 62, 64] Galaxy
➤ [23] Hummingbird https://flic.kr/p/mo5pa1
➤ [25] Netflix Prize, Christopher Hefele https://flic.kr/p/6LWT6K
➤ [3-7, 8 (artwork FabLab interns), 9, 20, 22, 24, 26, 42, 43, 46, 57, 63] Carol Willing
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
➤ For additional information
➤ Jupyter www.jupyter.org
➤ Python Software Foundation www.python.org
➤ Carol Willing, [email protected], @willingcarol, GitHub: willingc