Slide 1

Slide 1 text

Scaling Reproducible Research. Workshop: Building reproducible workflows for earth sciences, ECMWF, October 15, 2019. Carol Willing, @WillingCarol

Slide 2

Slide 2 text

San Diego, CA

Slide 3

Slide 3 text


Slide 4

Slide 4 text


Slide 5

Slide 5 text


Slide 6

Slide 6 text

Tokyo

Slide 7

Slide 7 text

"The typhoon came out of the sea first as a deep hollow roar." (Pearl S. Buck)

Slide 8

Slide 8 text


Slide 9

Slide 9 text


Slide 10

Slide 10 text


Slide 11

Slide 11 text

"I was surrounded by the madness, the unreason, of uncontrolled, undisciplined energy." (Pearl S. Buck)

Slide 12

Slide 12 text

Super Typhoon Hagibis. View of Super Typhoon Hagibis south-west of Japan, as captured by the Copernicus Sentinel-3 satellite on 08 October at 00:16 UTC. Copyright: 2019 European Union, contains modified Copernicus Sentinel data 2019, processed by EUMETSAT.

Slide 13

Slide 13 text

Title: Typhoon Hagibis. Released: 10/10/2019 4:45 pm. Copyright: contains modified Copernicus Sentinel data (2019), processed by ESA, CC BY-SA 3.0 IGO.

Slide 14

Slide 14 text

Source: Twitter

Slide 15

Slide 15 text

A sign is partially submerged as the Tama River floods during Typhoon Hagibis. Source: Getty Images

Slide 16

Slide 16 text


Slide 17

Slide 17 text


Slide 18

Slide 18 text


Slide 19

Slide 19 text

Lives depend on

Slide 20

Slide 20 text

scaling reproducible research

Slide 21

Slide 21 text

Tools, Processes, Communication

Slide 22

Slide 22 text


Slide 23

Slide 23 text

Jupyter Notebook: a Jupyter Notebook document with a visualization of measles data.

Slide 24

Slide 24 text

Research: Jupyter citations. [Chart: number of citations per year, 2015 to 2019, with a projected value marked; vertical axis from 0 to 4000.]

Slide 25

Slide 25 text


Millions of Notebooks: over 5 million on GitHub. https://github.com/trending/jupyter-notebook

Slide 26

Slide 26 text

‣ Growth
‣ ACM Award
‣ Industry adoption
‣ Creative uses
‣ Open Source Book

Slide 27

Slide 27 text

JupyterLab

Slide 28

Slide 28 text

jupyter.org demo

Slide 29

Slide 29 text

jupyter.org demo

Slide 30

Slide 30 text

yt and Jupyter widgets
https://github.com/data-exp-lab/rust-yt-tools/ (npm package @data-exp-lab/yt-tools)
https://github.com/munkm/widgyts
Irber Junior LC. Oxidizing Python: writing extensions in Rust [version 1; not peer reviewed]. F1000Research 2018, 7(ISCB Comm J):955 (poster) (https://doi.org/10.7490/f1000research.1115726.1)

Slide 31

Slide 31 text

ipyvolume: https://towardsdatascience.com/multivolume-rendering-in-jupyter-with-ipyvolume-cross-language-3d-visualization-64389047634a

Slide 32

Slide 32 text

Healthy Best Practices

Slide 33

Slide 33 text

Ten Simple Rules for Reproducible Research in Jupyter Notebooks, Adam Rule et al.
https://github.com/jupyter-guide/ten-rules-jupyter
https://github.com/jupyter-guide/jupyter-guide

Slide 34

Slide 34 text

Keep up with changes

Slide 35

Slide 35 text

Proceed cautiously with pseudo-open projects

Slide 36

Slide 36 text

Ask why

Slide 37

Slide 37 text

Tools, Processes, Communication

Slide 38

Slide 38 text

zero-to-jupyterhub.readthedocs.io

Slide 39

Slide 39 text

Papermill: Parameterize and Run
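
The parameterize-and-run idea named on this slide can be illustrated with a rough sketch. The code below is not Papermill's implementation; it only mimics the concept with the Python standard library, treating a notebook as a plain dictionary in the notebook JSON layout. It inserts an override cell after the cell tagged "parameters" and then executes the code cells in order. In Papermill itself the equivalent step is a call such as `papermill.execute_notebook(input_path, output_path, parameters={...})`. The helper names `code_cell`, `inject_parameters`, and `run_code_cells`, and the notebook contents, are hypothetical.

```python
import copy


def code_cell(source, tags=()):
    """Build a minimal code cell in the notebook JSON layout (hypothetical helper)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": list(tags)},
        "outputs": [],
        "source": source,
    }


def inject_parameters(notebook, parameters):
    """Return a copy of the notebook with a cell that overrides the defaults."""
    result = copy.deepcopy(notebook)
    source = "\n".join(f"{name} = {value!r}" for name, value in parameters.items())
    injected = code_cell(source, tags=["injected-parameters"])
    # Place the overrides right after the first cell tagged "parameters",
    # or at the top of the notebook when no cell carries that tag.
    insert_at = 0
    for index, cell in enumerate(result["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            insert_at = index + 1
            break
    result["cells"].insert(insert_at, injected)
    return result


def run_code_cells(notebook):
    """Execute code cells in order in one shared namespace (toy stand-in for a kernel)."""
    namespace = {}
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            exec(cell["source"], namespace)
    return namespace


# Hypothetical template notebook: defaults first, then the analysis.
template = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        code_cell("region = 'global'\nyear = 2018", tags=["parameters"]),
        code_cell("label = f'{region}-{year}'"),
    ],
}

run = inject_parameters(template, {"region": "japan", "year": 2019})
result = run_code_cells(run)
print(result["label"])  # prints japan-2019
```

The template is left untouched, so the same notebook can be run again with a different set of parameters.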

Slide 40

Slide 40 text

Data at scale - Netflix: nteract, Papermill, Scrapbook, Bookstore, Commuter
https://medium.com/netflix-techblog/notebook-innovation-591ee3221233

Slide 41

Slide 41 text

Pipelines: https://medium.com/dagster-io/dagster-0-6-0-impossible-princess-898b459375e0

Slide 42

Slide 42 text

Create a Reproducibility Pipeline

Slide 43

Slide 43 text

Decouple steps for flexibility
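
One way to read "decouple steps" is sketched below: each step is a small function that takes data and returns new data, and a pipeline is just an ordered list of steps. This is only an illustration under assumptions. The step names, the `run_pipeline` helper, and the rainfall readings are hypothetical and do not come from the talk or from any particular pipeline tool.

```python
from itertools import accumulate


def drop_missing(values):
    """Remove readings that were not recorded."""
    return [value for value in values if value is not None]


def running_total(values):
    """Turn per-hour readings into a cumulative total."""
    return list(accumulate(values))


def run_pipeline(data, steps):
    """Apply each step in order; steps know nothing about each other."""
    for step in steps:
        data = step(data)
    return data


# Hypothetical hourly rainfall readings in millimetres.
hourly_rain_mm = [12, None, 30, 18]

cleaned = run_pipeline(hourly_rain_mm, [drop_missing])
totals = run_pipeline(hourly_rain_mm, [drop_missing, running_total])

print(cleaned)  # prints [12, 30, 18]
print(totals)   # prints [12, 42, 60]
```

Because the steps share only their input and output, a step can be reused, reordered, or replaced without touching the others.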

Slide 44

Slide 44 text

Plan, Execute, Change
https://jupyterhub-team-compass.readthedocs.io
https://github.com/jupyterhub/team-compass

Slide 45

Slide 45 text

Tools, Processes, Communication

Slide 46

Slide 46 text

Notebooks to web: https://blog.jupyter.org/and-voil%C3%A0-f6a2c08a4a93

Slide 47

Slide 47 text

Binder: mybinder.org. Binder 2.0 blog post; elifesciences: Share your interactive research environment; Nature article about Binder.

Slide 48

Slide 48 text

Juliette Taka

Slide 49

Slide 49 text

Juliette Taka

Slide 50

Slide 50 text

Juliette Taka

Slide 51

Slide 51 text

Juliette Taka

Slide 52

Slide 52 text

Juliette Taka

Slide 53

Slide 53 text

Juliette Taka

Slide 54

Slide 54 text

Binder

Slide 55

Slide 55 text

Binder: mybinder.org
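
Sharing a Binder usually comes down to sharing a link. As a small sketch, the helper below builds a mybinder.org launch link for a GitHub repository using the `/v2/gh/<org>/<repo>/<ref>` URL layout. The function name `binder_url` is hypothetical, and the repository shown (taken from the Ten Simple Rules slide) is used only as an example; whether it is configured to build on Binder is an assumption.

```python
from urllib.parse import quote


def binder_url(org, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a GitHub repository."""
    # Percent-encode each part so that a ref containing "/" stays one path segment.
    parts = (quote(part, safe="") for part in (org, repo, ref))
    return "https://mybinder.org/v2/gh/" + "/".join(parts)


link = binder_url("jupyter-guide", "ten-rules-jupyter")
print(link)  # prints https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/HEAD
```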

Slide 56

Slide 56 text

From a phone in the park!

Slide 57

Slide 57 text

Pangeo: https://pangeo.io

Slide 58

Slide 58 text


Slide 59

Slide 59 text

Canadian Open Neuroscience Platform: https://simexp.github.io/vcog_hps_ad_book/intro.html
Jupyter Book, Binder, Jupyter, pandas, scipy, scikit-learn, matplotlib, numpy, seaborn

Slide 60

Slide 60 text

Build Communities

Slide 61

Slide 61 text

jupyter.org

Slide 62

Slide 62 text

Leverage solutions across disciplines

Slide 63

Slide 63 text

Share binders. Foster scientific research.

Slide 64

Slide 64 text

Tools, Processes, Communication

Slide 65

Slide 65 text

Why strive for reproducible research?

Slide 66

Slide 66 text

Reproducible research improves prediction

Slide 67

Slide 67 text

prediction = impact

Slide 68

Slide 68 text

Scaling reproducible research improves science and our world

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

Thank you: ECMWF Workshop Organizers, Claudia Vitolo, Project Jupyter Team, Min Ragan-Kelley

Slide 71

Slide 71 text

Attributions

References to published research, projects, and drawings (and marked on slides)
[2] Statistics: https://fivethirtyeight.com/features/which-city-has-the-most-unpredictable-weather/
[7, 11] A Bridge for Passing, Pearl S. Buck
[8, 9, 18] ECMWF
[12] Copyright: 2019 European Union, contains modified Copernicus Sentinel data 2019, processed by EUMETSAT
[13] Copyright contains modified Copernicus Sentinel data (2019), processed by ESA, CC BY-SA 3.0 IGO
[30] Madicken Munk
[31] Maarten Breddels
[33] Adam Rule et al.
[46] QuantStack - Voila
[48-53] Juliette Taka
[57] Pangeo
[58] Lindsey Heagy
[59] Canadian Open Neuroscience Platform

Photos
[2-6, 10, 16-17, 69, 70] Source: Carol Willing and Linnea Willing
[14] Twitter
[15] Getty Images
[55, 56] Kirstie Whitaker
[23-29, 38, 44, 47, 54, 61] Project Jupyter
[39-40] nteract and Netflix
[41] Nick Schrock, Dagster

Slide 72

Slide 72 text
