Scaling Reproducible Research with Jupyter

Keynote delivered on December 9, 2019, at the 4th Workshop on Open Science in Big Data (OSBD), 2019 IEEE Big Data Conference.

Jupyter Notebooks have taken the scientific and open data world by storm over the past five years. The ability to tell a computational narrative that combines prose, code, media, and rich visualizations has increased researchers’ ability to collaborate, share research reproducibly, and educate others in their scientific discipline and beyond.

A suite of tools, processes that scale, and modern ways to communicate openly about scientific research have grown rapidly within Project Jupyter’s open source community. Beyond the Jupyter Notebook, open source projects including JupyterLab, JupyterHub, Binder, and nteract’s Papermill offer new pipelines and services that let open research scale and reach others worldwide.

Carol Willing

December 09, 2019

Transcript

  1. 1.

    Scaling Reproducible Research with Jupyter. 4th Workshop on Open Science in Big Data (OSBD), IEEE Big Data, Los Angeles, December 9, 2019. Carol Willing, @WillingCarol. DOI: 10.5281/zenodo.3567219
  2. 2.

    Reproducible Research: using data responsibly to solve real-world issues and improve human lives
  3. 6.

    Super Typhoon Hagibis. View of Super Typhoon Hagibis south-west of Japan, as captured by the Copernicus Sentinel-3 satellite on 08 October at 00:16 UTC. Copyright: 2019 European Union, contains modified Copernicus Sentinel data 2019, processed by EUMETSAT.
  4. 7.

    Title: Typhoon Hagibis. Released: 10/10/2019 4:45 pm. Copyright: contains modified Copernicus Sentinel data (2019), processed by ESA, CC BY-SA 3.0 IGO.
  5. 11.

    A sign is partially submerged as the Tama River floods during Typhoon Hagibis. Source: Getty Images. Source: Japan Times.
  6. 17.
  7. 19.
  8. 23.

    Ten Simple Rules for Reproducible Research in Jupyter Notebooks, Adam Rule et al. https://github.com/jupyter-guide/ten-rules-jupyter https://github.com/jupyter-guide/jupyter-guide
  9. 28.

    A pictorial representation of the different tools constituting BinderHub. This image was created by Scriberia for The Turing Way community and is used under a CC-BY licence. Zenodo record. https://blog.jupyter.org/diving-into-leadership-to-build-push-button-code-df2a075c9914 zero-to-jupyterhub.readthedocs.io
  10. 29.

    nteract: Papermill, Scrapbook, Bookstore, Commuter. Production data at scale. https://medium.com/netflix-techblog/notebook-innovation-591ee3221233
  11. 30.

    Production data at scale. Papermill: parameterize / run. Scrapbook: recording / reading. Bookstore: store notebooks. Commuter: share notebooks.
  12. 37.

    Deploy your own BinderHub. mybinder.org. Binder 2.0 blog post. elifesciences: Share your interactive research environment. Nature article about Binder.
  13. 58.

    Attributions. References to published research, projects, and drawings (and marked on slides):

    [3] Statistics: https://fivethirtyeight.com/features/which-city-has-the-most-unpredictable-weather/
    [5, 9] ECMWF
    [6] Copyright: 2019 European Union, contains modified Copernicus Sentinel data 2019, processed by EUMETSAT
    [7] Copyright contains modified Copernicus Sentinel data (2019), processed by ESA, CC BY-SA 3.0 IGO
    [23] Adam Rule et al.
    [38-43] Juliette Taka
    [45] Pangeo
    [46] Lindsey Heagy
    [47] Canadian Open Neuroscience Platform

    Photos:

    [3, 4, 57] Source: Carol Willing and Linnea Willing
    [8] Twitter
    [10] Getty Images
    [29-31] nteract and Netflix
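
Slide 30 summarizes Papermill as "parameterize / run". As a rough illustration of the parameterize step only, the sketch below uses just the Python standard library to mimic the idea: it finds a cell tagged `parameters` in a notebook's JSON and inserts a new cell with overriding values right after it. The helper names (`make_notebook`, `inject_parameters`), the variable names, and the sample values are hypothetical and do not come from the talk. Papermill itself exposes this through `papermill.execute_notebook(input_path, output_path, parameters=...)`, which also executes the notebook.

```python
import copy
import json


def code_cell(source, tags=()):
    """Return a minimal nbformat-4 style code cell as a plain dict."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": list(tags)},
        "outputs": [],
        "source": source,
    }


def make_notebook():
    """Build a tiny template notebook whose first cell holds default parameters."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            code_cell("region = 'default'\nyear = 2018", tags=["parameters"]),
            code_cell("print(region, year)"),
        ],
    }


def inject_parameters(notebook, parameters):
    """Return a copy with an override cell placed after the 'parameters' cell.

    This imitates the parameterization idea only; nothing is executed here.
    """
    nb = copy.deepcopy(notebook)
    source = "\n".join(f"{name} = {value!r}" for name, value in parameters.items())
    injected = code_cell(source, tags=["injected-parameters"])
    for index, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            nb["cells"].insert(index + 1, injected)
            break
    else:
        # No tagged cell found: put the overrides at the top instead.
        nb["cells"].insert(0, injected)
    return nb


template = make_notebook()
parameterized = inject_parameters(template, {"region": "Kanto", "year": 2019})

# Show the injected cell; the template itself is left unchanged.
print(json.dumps(parameterized["cells"][1], indent=2))
```

Because the overrides sit in a cell after the defaults, running the notebook top to bottom picks up the new values while the template stays reusable for other parameter sets.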