How a Notebook changed the world of Science!

Tech talk on Semut.io

Notebooks are disrupting science, finance and many other sectors. Learn more about Jupyter Notebooks in this tech talk with Shreyas Bapat (B.Tech., IIT Mandi), who is spearheading efficiency initiatives at Semut.io.
Watch this talk here:

Follow us on our social media channels to stay updated.
Website: https://semut.io
Twitter: https://bit.ly/semut_twitter
LinkedIn: https://bit.ly/semut_linkedin
YouTube: https://bit.ly/semut_youtube
Twitch: https://bit.ly/semut_twitch

Shreyas Bapat

March 12, 2021

Transcript

  3. Who am I?
    -> Software Engineer @ Semut.io
    -> Electrical Engineer from IIT Mandi
    -> Lead Developer @ EinsteinPy
    -> Managing Member @ PSF
    -> Loves being in the mountains of
    Himachal Pradesh (India)

  4. Let’s start with what went wrong

  5. What is it?
    -> Published in a non-peer-reviewed issue of the American
    Economic Review.
    -> Cited by politicians worldwide in debates as proof that
    austerity is an effective fiscal policy for debt-burdened
    economies.
    -> Claimed that when "gross external debt reaches 60 percent
    of GDP", a country's annual growth declined by two percent,
    and "for levels of external debt in excess of 90 percent" GDP
    growth was "roughly cut in half."
    -> Proven wrong!

  7. Catch?
    Science needs explanation. Papers can be hard to
    understand. Excel is not made for this: you can’t
    document code, logic and results in Excel!

  8. More Issues with using Excel for Science

  10. Is there a better way?
    -> Is there a better way to have the text, analysis,
    results, code, plots, comments in one place?
    -> Is it possible to write it sequentially such that
    anyone going through the code/paper/book could
    not just understand it, but run it, reproduce results,
    find flaws and suggest enhancements 1000X faster?
    -> Is it possible to distribute the results properly?

  11. What would be a better way?
    -> Excel users are people avoiding hardcore
    programming languages.
    -> Easy interface to write logic
    -> Low Cognitive Complexity
    -> Easy Syntax (Preferably like English) [Think BQL
    (Bloomberg Query Language)]

  12. Python stands out!
    -> Easy to pick up!
    -> Easy interface to write logic
    -> Low cognitive complexity
    -> Reads mostly like English
    -> Out-of-the-box support and an extremely friendly
    community
    -> Democratic

  13. Results?

  14. Rise of IPython

  15. IPython?
    -> First version in 2001 (started with just 259 lines)
    -> Built-in support for parallelization came in 2004
    -> Support for running code on a remote cluster from the
    shell
    -> Inspired by Mathematica
    -> A boon for scientific computing

  17. Rise to fame!

  18. What are notebooks?
    -> The notebook is a rather old concept.
    -> Logic is written sequentially
    -> Interactive
    -> Saves state (like a paper notebook does)

  20. Benefits of using IPython
    -> Introspection: obj.<Tab> (completion), obj? (help), obj?? (source)
    -> Shell access: files = !ls , !wget $url
    -> IPython magics:
    - %run script.py (-p -> profile, -t -> time)
    - %debug (jump in after an exception)
    - %lsmagic (list the rest of the magics)
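
The shortcuts on this slide only work inside an IPython session. As a rough illustration of what they do, here is a plain-Python sketch using only the standard library. The helper names (`completions`, `summary`, `source`, `shell_lines`) are made up for this example and are not part of IPython; the real features are richer than this.

```python
import inspect
import subprocess
import sys


def completions(obj, prefix=""):
    """Roughly what obj.<Tab> offers: public attribute names matching a prefix."""
    return [
        name
        for name in dir(obj)
        if name.startswith(prefix) and not name.startswith("_")
    ]


def summary(obj):
    """Roughly what obj? shows: the type name and the docstring."""
    return f"Type: {type(obj).__name__}\nDocstring: {inspect.getdoc(obj)}"


def source(obj):
    """Roughly what obj?? adds: the source code, when Python can find it."""
    return inspect.getsource(obj)


def shell_lines(args):
    """Roughly what `files = !cmd` does: run a command, return its output lines."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(completions("abc", "up"))
    print(summary(len))
    # Use the current interpreter as a portable stand-in for `!ls`.
    print(shell_lines([sys.executable, "-c", "print('a'); print('b')"]))
```

In IPython itself none of these helpers are needed; the point is only that the shortcuts wrap ordinary introspection and subprocess calls.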

  24. Some myths
    -> There’s no support
    Enthought, Continuum Analytics (now Anaconda)
    -> Won’t be free forever
    Free software belongs to the community
    -> Free software has bugs
    Naturally!

  25. And it happened

  26. Jupyter or Jupiter?

  27. What is Jupyter?

  28. Jupyter : An Ecosystem
    -> JupyterLab
    -> Slides/Documents
    -> Write books (O’Reilly) with Jupyter Book
    -> JupyterHub
    -> Colab notebooks, Binder

  29. Jupyter : A way to teach in classrooms!
    -> Easy to create interactive tutorials that let
    students play around with the code.
    -> Perform live coding, share lecture notes and
    materials
    -> Grade homework
    -> No need to run every script supplied by
    students

  30. Project Jupyter
    -> Separation of the language-agnostic
    components
    - Jupyter: protocol, format, multi-user server
    - IPython: Jupyter kernel, interactive Python
    -> Jupyter kernels: languages that can be
    used in a notebook, ~100 programming
    languages.
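
To make the "language-agnostic protocol" point concrete, here is a minimal sketch of the kind of message a frontend sends to a kernel when a cell is run. The field names follow the Jupyter messaging protocol as I understand it; the helper `execute_request`, the username `demo` and the session value are placeholders for illustration, and a real client (for example `jupyter_client`) also signs and serialises the message over ZeroMQ.

```python
import json
import uuid
from datetime import datetime, timezone


def execute_request(code, session):
    """Build an execute_request-style message as a plain dictionary."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": "demo",  # placeholder user name
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",
    }
    content = {
        "code": code,  # opaque text: the frontend never interprets it
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    return {
        "header": header,
        "parent_header": {},
        "metadata": {},
        "content": content,
    }


if __name__ == "__main__":
    # The same envelope carries Python, R or Julia code; only the kernel differs.
    for snippet in ("print('hello')", "cat('hello')", "println(\"hello\")"):
        print(json.dumps(execute_request(snippet, "session-1"))[:80], "...")
```

Because the code travels as an uninterpreted string, the notebook frontend does not need to know which language the kernel speaks.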

  31. Notebook Extensions
    -> Add-ons that extend functionality, much like VS
    Code extensions.
    -> VS Code supports Jupyter
    -> Written in JavaScript; they can send browser
    notifications, autoformat code and more. The
    possibilities are immense.

  32. Widgets
    -> Interact with the code output!
    -> Sliders, text boxes, inputs
    -> Like mini-GUIs
    -> Very helpful when tuning
    hyperparameters.
    -> You can write one yourself!
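
A minimal sketch of the hyperparameter use case, assuming the `ipywidgets` package is installed and the code runs inside a notebook. The function `simulate_loss` is a toy stand-in invented for this example (gradient descent on f(w) = w^2 starting from w = 1); it is not from the talk.

```python
def simulate_loss(learning_rate: float, epochs: int) -> float:
    """Toy training run: gradient descent on f(w) = w**2, starting at w = 1."""
    w = 1.0
    for _ in range(epochs):
        w -= learning_rate * 2 * w  # the gradient of w**2 is 2*w
    return w ** 2


def show_controls():
    """Attach sliders to simulate_loss (requires ipywidgets and a notebook)."""
    # Imported here so the toy function above works without ipywidgets.
    from ipywidgets import FloatSlider, IntSlider, interact

    return interact(
        simulate_loss,
        learning_rate=FloatSlider(min=0.01, max=0.5, step=0.01, value=0.1),
        epochs=IntSlider(min=1, max=50, value=10),
    )
```

Calling `show_controls()` in a notebook cell should render two sliders and re-run `simulate_loss` whenever one of them moves, which is the "mini-GUI" idea from the slide.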

  35. Version Control in Notebooks
    -> Notebooks have the extension .ipynb but are
    plain-text files containing JSON.
    -> Diffs used to be large even when nothing
    meaningful changed. Stored output is another
    issue.
    -> It’s getting better. With nbdime, it’s much easier
    to merge and compare notebooks now.
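
Because a notebook is just JSON, the stored-output problem can be illustrated with the standard library alone. The sketch below builds a tiny notebook-shaped dictionary by hand (an assumption for illustration, trimmed to the fields needed here) and clears outputs and execution counts before committing. The function name `strip_outputs` is hypothetical; dedicated tools do this more thoroughly.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of the notebook with code-cell outputs and counters cleared."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy via JSON
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A hand-written, minimal notebook structure (illustrative only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

if __name__ == "__main__":
    print(json.dumps(strip_outputs(notebook), indent=1))
```

Re-running a cell changes `execution_count` and `outputs` even when the source is untouched, which is why unstripped notebooks produce noisy diffs.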

  36. Using Notebooks in Production!

  37. Issues with Jupyter Notebooks
    -> Hidden state in notebooks

  39. Issues with Jupyter Notebooks
    Notebooks are great for iterative development
    BUT
    Notebooks are *very* dangerous unless you run
    each cell only ONCE and in the CORRECT ORDER.
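
The hidden-state problem can be reproduced without a notebook. The sketch below treats three strings as "cells" and runs them in a shared namespace, which is a simplified model of what a kernel does; the cell names and contents are invented for this example.

```python
# Each entry stands in for one notebook cell.
cells = {
    "define": "total = 0",
    "add": "total += 10",
    "report": "result = total",
}


def run_cells(order):
    """Execute the named cells in the given order in one shared namespace."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["result"]


if __name__ == "__main__":
    # Top to bottom, each cell once: the result the author expects.
    print(run_cells(["define", "add", "report"]))
    # The "add" cell was run twice: same notebook text, different result.
    print(run_cells(["define", "add", "add", "report"]))
```

The notebook on screen looks identical in both cases; only the execution history differs, and that history is not visible in the cells themselves.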

  40. A way to solve this...
    The %history magic!
    But you shouldn’t have to run a magic to find the
    state!

  41. Now take this!

  42. Problem number 2
    The ability to run code in an arbitrary, non-sequential
    order is counterintuitive to most programmers.
    It can be daunting for beginners!

  43. Notebooks in Cloud
    -> Notebooks as a service are a cool new thing!
    -> Azure Notebooks, Colab notebooks and more.
    -> You can create your own notebook service!

  44. How does a Jupyter Notebook Work?
    Credits: Carol Willing

  45. What does JupyterHub do?
    -> Manages authentication
    -> Spawns single-user notebook servers on demand
    -> Gives each user their own complete notebook server!
    -> The Hub and the servers are separate entities.

  46. JupyterHub

  47. Parts of JupyterHub
    -> The Hub: user database, auth, and spawner
    -> Users and their individual notebook servers
    -> Configurable HTTP proxy
    Authentication supports OAuth, PAM, etc.
    Deploy: https://github.com/jupyterhub/jupyterhub-deploy-docker
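
How the three parts fit together can be sketched as a toy model. This is not JupyterHub code: `ToyHub`, `ToyProxy`, the password dictionary and the port numbers are all invented for illustration, and nothing is actually started. It only mirrors the flow of authenticate, spawn one server per user, then register a proxy route.

```python
from dataclasses import dataclass, field


@dataclass
class ToyProxy:
    """Maps URL prefixes to backends, like a configurable HTTP proxy."""
    routes: dict = field(default_factory=dict)

    def add_route(self, prefix: str, target: str) -> None:
        self.routes[prefix] = target

    def resolve(self, path: str) -> str:
        # The longest matching prefix wins; everything else goes to the Hub.
        matches = [prefix for prefix in self.routes if path.startswith(prefix)]
        return self.routes[max(matches, key=len)] if matches else "hub"


@dataclass
class ToyHub:
    """Holds the user database, checks logins and 'spawns' servers."""
    passwords: dict
    proxy: ToyProxy
    servers: dict = field(default_factory=dict)
    next_port: int = 9000

    def login(self, user: str, password: str) -> str:
        if self.passwords.get(user) != password:
            raise PermissionError(f"authentication failed for {user}")
        if user not in self.servers:
            # "Spawn" a single-user server: here just an address, no process.
            target = f"http://127.0.0.1:{self.next_port}"
            self.next_port += 1
            self.servers[user] = target
            self.proxy.add_route(f"/user/{user}/", target)
        return self.servers[user]


if __name__ == "__main__":
    hub = ToyHub(passwords={"ada": "pw"}, proxy=ToyProxy())
    print(hub.login("ada", "pw"))
    print(hub.proxy.resolve("/user/ada/lab"))
    print(hub.proxy.resolve("/hub/home"))
```

Keeping the proxy separate from the Hub is what lets each user talk to their own server directly once the route exists.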

  48. Real Time Collaboration
    Jupyter RTC: https://jupyter-rtc.readthedocs.io/
    This works directly with JupyterLab

  49. Summary
    -> Jupyter / IPython is a useful tool, not only for
    coding but also for teaching, sharing, documenting and
    publishing!
    -> We don’t have to throw away previous work in
    different languages; now we can integrate it.
    -> Jupyter is gaining relevance in open science,
    finance, music and teaching. We must go further!

  50. Questions?
    Semut Twitter: @semut_io
    GitHub: @shreyasbapat
    Twitter: @shreyasb94
