
Start Small and Scale: Big Data and Jupyter's Ecosystem

Keynote presentation at the PyData LA conference, held at Cal State Los Angeles.

Jupyter notebooks have become the de facto standard scientific and data science tool for producing computational narratives. Over five million Jupyter notebooks exist on GitHub today. Beyond the classic Jupyter notebook, Project Jupyter's tools have evolved to provide end-to-end research workflows that enable scientists to prototype, collaborate, and scale with ease. JupyterLab, a web-based, extensible, next-generation interactive development environment, enables researchers to combine Jupyter notebooks, code, and data to form computational narratives. JupyterHub brings the power of notebooks to groups of users. It gives users access to computational environments and resources without burdening them with installation and maintenance tasks. Binder builds upon JupyterHub and provides free, sharable, interactive computing environments to people all around the world.
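
As a small illustration of what "sharable" means for Binder: a public GitHub repository can be launched on mybinder.org through a link of the form https://mybinder.org/v2/gh/<owner>/<repo>/<ref>. The sketch below builds such a link. The helper name `binder_launch_url`, the example repository, and the use of the optional `urlpath` query parameter are illustrative assumptions and are not taken from the talk.

```python
from urllib.parse import quote, urlencode

# Public mybinder.org endpoint for GitHub-hosted repositories.
BINDER_BASE = "https://mybinder.org/v2/gh"


def binder_launch_url(owner, repo, ref="HEAD", urlpath=None):
    """Build a mybinder.org launch link for a GitHub repository.

    `binder_launch_url` is a hypothetical helper written for this example.
    `ref` is a branch, tag, or commit; `urlpath` optionally points the
    launched session at a specific interface or file.
    """
    parts = [quote(part, safe="") for part in (owner, repo, ref)]
    url = f"{BINDER_BASE}/{'/'.join(parts)}"
    if urlpath:
        # urlencode percent-escapes the slashes inside the path value.
        url += "?" + urlencode({"urlpath": urlpath})
    return url


# Example repository chosen for illustration only.
print(binder_launch_url("binder-examples", "requirements"))
# https://mybinder.org/v2/gh/binder-examples/requirements/HEAD
```

Anyone who opens the resulting link gets a temporary environment built from the repository, with nothing to install locally.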

Carol Willing

December 05, 2019



  1. @WillingCarol Start Small and Scale: Big Data and Jupyter's Ecosystem. Carol Willing, PyData LA, December 5, 2019. https://speakerdeck.com/willingc
  2. Hi! I'm Carol. • Python Steering Council • Jupyter Steering Council • Core Developer, Python, Jupyter, nteract • PSF Fellow and Former Director • Frank Willison Award 2019 • Open Source Directions Podcast Co-host
  3. Core maintainer: Papermill, Scrapbook, Bookstore, Commuter. Steering Council, Core Developer: JupyterHub, BinderHub, mybinder.org. I love creating tools which educate and empower people.
  4. Super Typhoon Hagibis. View of Super Typhoon Hagibis south-west of Japan, as captured by the Copernicus Sentinel-3 satellite on 08 October at 00:16 UTC. Copyright: 2019 European Union, contains modified Copernicus Sentinel data 2019, processed by EUMETSAT.
  5. Title: Typhoon Hagibis. Released: 10/10/2019 4:45 pm. Copyright: contains modified Copernicus Sentinel data (2019), processed by ESA, CC BY-SA 3.0 IGO.
  6. A sign is partially submerged as the Tama River floods during Typhoon Hagibis. Source: Getty Images. Source: Japan Times.
  7. "This whole integration of health care data is really going to be the next frontier." –Kevin Sayer, DexCom CEO. https://www.cnbc.com/2019/11/13/big-data-is-the-next-frontier-for-medicine-says-dexcom-ceo.html https://www.businesswire.com/news/home/20191106005764/en/Dexcom-Reports-Quarter-2019-Financial-Results
  8. Outage. Midnight Friday: mysterious outage. Dexcom did not announce there was an outage until about 8 a.m. Pacific time Saturday, which is 11 a.m. on the East Coast, when it posted a brief notice on its Facebook page. Monday morning: Dexcom Follow partly restored. https://www.nytimes.com/2019/12/02/well/live/Dexcom-G6-diabetes-monitor-outage.html Source: https://www.dexcom.com/
  9. ‣ Growth ‣ ACM Award ‣ Industry adoption ‣ Creative uses ‣ Open Source Book https://www.youtube.com/watch?v=qbtDVdEr8SY
  10. Binder 2.0 blog post. elifesciences: Share your interactive research environment. Nature article about Binder. mybinder.org: Try it. No install needed.
  11. Step 1: Start. Try it in the browser • Install Libraries • Choose your tools • Avoid reinventing the wheel
  12. Ten Simple Rules for Reproducible Research in Jupyter Notebooks, Adam Rule et al. https://github.com/jupyter-guide/ten-rules-jupyter https://github.com/jupyter-guide/jupyter-guide
  13. A pictorial representation of the different tools constituting BinderHub. This image was created by Scriberia for The Turing Way community and is used under a CC-BY licence. Zenodo record. https://blog.jupyter.org/diving-into-leadership-to-build-push-button-code-df2a075c9914
  14. What's new: Talk Python to Me • Tracking Jupyter Newsletter https://tinyletter.com/TrackingJupyter/archive • Open Source Directions • GitHub Trending • Follow projects on Social Media
  15. Step 2: Explore. Use the ecosystem to learn • Best practices • Infrastructure/Analysis • What's new
  16. Production data at scale: nteract, Papermill, Scrapbook, Bookstore, Commuter. https://medium.com/netflix-techblog/notebook-innovation-591ee3221233
  17. Production data at scale: Papermill - parameterize / run • Scrapbook - recording / reading • Bookstore - store notebooks • Commuter - share notebooks
  18. Justine Dupont surfs the greatest wave of her life in Nazaré, Portugal. © Rafael G. Riancho / Red Bull Content Pool
  19. Attributions: Attributions on slides. Photos: [7-8] Carol Willing and Linnea Willing; [14] The Carpentries, Tracy Teal, Bérénice Batut; [14] Godzilla By Toho Company Ltd. (東宝株式会社, Tōhō Kabushiki-kaisha) © 1954 - movie poster made by Toho Company Ltd. (東宝株式会社, Tōhō Kabushiki-kaisha), Public Domain, https://commons.wikimedia.org/w/index.php?curid=3648684