Jupyter Frontends: From the classic Jupyter Notebook, to JupyterLab, nteract and beyond

Kyle Kelley
August 24, 2017

The Jupyter architecture (message specification, kernels, notebook documents) allows for multiple end-user applications or Jupyter frontends. The traditional application for Jupyter is the classic Jupyter Notebook, which began as the IPython Notebook in 2011. Since then, the Jupyter Notebook frontend has become a critical tool for millions of users doing interactive computing in scientific research, education, and commercial data science, machine learning, and AI. In recent years, a number of more modern end-user applications built on top of the Jupyter architecture have emerged, including Rodeo, CoCalc, Stencila, nteract, and JupyterLab. Project Jupyter is embracing the flowering of end-user applications and taking steps to document and formalize the abstractions across all Jupyter frontends.

Kyle Kelley and Brian Granger offer a broad look at Jupyter frontends, describing their common aspects and explaining how their differences help Jupyter reach a broader set of users. They also share ongoing challenges in building these frontends (real-time collaboration, security, rich output, different Markdown formats, etc.) as well as their ongoing work to address these questions.

Transcript

  1. Jupyter Frontends: From the classic Jupyter Notebook, to JupyterLab, nteract and beyond
    Brian Granger, Cal Poly
    Kyle Kelley, Netflix
    Project Jupyter, JupyterCon 2017
  2. Outline
    • User Testing @ JupyterCon
    • What is interactive computing?
    • User interfaces for interactive computing
    • The terminal as a UI for interactive computing
    • Jupyter Notebook: successes and challenges
    • Foundations of Jupyter frontends:
    • Jupyter Message Specification
    • Survey of different Jupyter frontends/UIs
  3. User Testing @ JupyterCon
    • We are doing extensive, in-person user experience testing here at JupyterCon
    • We need your help to improve our software!
    • Thursday and Friday, all day
    • Location: Gramercy room on level 2
    • Drop in or sign up: http://bit.ly/jupytercon-usertesting
  4. Interactive Computing
    An interactive computation is a persistent computer program that runs with a "human in the loop," where the primary mode of steering the program is through the human iteratively writing/running blocks of code and looking at the result.
    Human Centered Computing
  5. Interactive Computing: Breakdown
    Compose: A UI to compose blocks of code
    Submit: A UI to submit code to the running program
    Run: A method to run code in a persistent namespace
    Output: A method to return output to the user
    View: A UI to view the output
    Repeat: A method to repeat the process
    Frontend = User Interface = UI
  6. IPython: The Terminal as a UI
    IPython was released in 2001 and provided a UI for interactive computing
    Tab completion, extended syntax (!, %, ?), plotting integration
  7. IPython: The Terminal as a UI
    IPython was released in 2001 and provided a UI for interactive computing:
    Compose: Type at the Terminal
    Submit:
    Run: eval(code, globals, local)
    Output: print(output, file=sys.stdout)
    View: Look at the Terminal
    Repeat: while True:
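The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not IPython's actual implementation: `run_block` and `repl` are hypothetical names, and `exec` stands in for the slide's `eval(code, globals, local)` so that statements such as assignments also work.

```python
import contextlib
import io


def run_block(code, namespace):
    """Run one block of code in a persistent namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # Run: state persists in `namespace` between calls
    return buffer.getvalue()   # Output: captured text handed back to the UI


def repl():
    """Compose, submit, run, output, view, repeat, all in the terminal."""
    namespace = {}
    while True:                                    # Repeat
        try:
            code = input(">>> ")                   # Compose and Submit
        except EOFError:
            break
        print(run_block(code, namespace), end="")  # Output and View


# Two submissions share one namespace, which is what makes the session interactive.
ns = {}
run_block("x = 20", ns)
result = run_block("print(x + 1)", ns)
```

Calling `repl()` starts the interactive loop. Everything typed into it is lost on exit, which is the "no memory" limitation of a terminal session.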
  8. Usability Challenges of the Terminal
    • No narrative: code without narrative is empty of meaning
        • Narrative text/documentation
        • Images, Visualizations, Equations
    • No memory: wasteful and frustrating
        • Everything you type is lost when you exit IPython
        • The only memory was hitting the up arrow after restarting to cycle through previous commands
        • Python’s runtime effectively requires you to restart often
        • code, exit, ipython, up, up, code, exit, ipython, up, up, up, code, …
    • No reproducibility and communication: your own private dead end
        • No way to record/reproduce an interactive computation and its results
        • No way to share a session with others and communicate results
    In spite of this, Python, IPython, NumPy, Matplotlib, etc. were revolutionary
  9. The IPython/Jupyter Notebook maintains the full interactive computing workflow while adding narrative, memory, reproducibility, communication
  10. IPython Notebook 2011: “Computational Narrative”
    [Annotated figure. Labels: compose, submit/run, output, repeat; narrative, equations, images; memory; File = reproducible, communication, memory]
    Narrative, Memory, Reproducible, Communication
  11. Jupyter Notebook: Challenges
    • Built with web technology of 2011 (jQuery, Bootstrap, require.js)
        • In contrast to core server/architecture in Python, which has been relatively stable
    • Important features became difficult to implement:
        • Real-time collaboration: models and views are welded together
        • Collapsing input/output: DOM/CSS had become pseudo-public APIs
        • No distinction between public/private APIs - anything breaks everything
    • Difficult to extend
        • Code base is large enough that you miss static typing
        • No dependency injection for runtime dependency resolution
  12. Jupyter Notebook: Challenges
    • Leaky foundations:
        • Message spec and notebook format had become “polluted” with implementation-specific details
        • Libraries such as ipywidgets were welded to the classic notebook implementation details
    • Other non-notebook workflows:
        • QTConsole: interactive computing with rich output
        • Hydrogen/RStudio: Python/R scripts with interactive computing and rich output
    • Emergent behaviors:
        • Exciting, but unexpected use cases
        • An ecosystem that has become an abundant, fertile, rich garden
        • With a codebase that is difficult to scale or maintain to span those use cases
  13. Exploring the Protocol
    { "msg_type": "status", … }
    { "msg_type": "stream", "content": { "text": "hey\n", "name": "stdout" } }
  14. Exploring the Protocol
    { "msg_type": "status", … }
    { "msg_type": "stream", … }
    { "msg_type": "status", "content": { "execution_state": "idle" } }
  15. Exploring the Protocol
    Two types of messages so far
    • Execution status - busy or idle
    • A stream of “stdout”
  16. Exploring the Protocol
    { "msg_type": "status", "content": { "execution_state": "busy" } }
    { "msg_type": "stream", "content": { "text": "hey\n", "name": "stdout" } }
    { "msg_type": "stream", "content": { "text": "sup\n", "name": "stdout" } }
    { "msg_type": "status", "content": { "execution_state": "idle" } }
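As a rough sketch of how a frontend might consume this sequence, the snippet below collects stdout text until the kernel reports `idle`. The messages use the slides' simplified shape (real Jupyter messages carry additional fields such as headers and metadata), and `collect_stdout` is a hypothetical helper, not part of any Jupyter library.

```python
# The four messages from the slide, in the order a frontend would receive them.
messages = [
    {"msg_type": "status", "content": {"execution_state": "busy"}},
    {"msg_type": "stream", "content": {"text": "hey\n", "name": "stdout"}},
    {"msg_type": "stream", "content": {"text": "sup\n", "name": "stdout"}},
    {"msg_type": "status", "content": {"execution_state": "idle"}},
]


def collect_stdout(messages):
    """Concatenate stdout text until the kernel reports it is idle again."""
    chunks = []
    for msg in messages:
        content = msg["content"]
        if msg["msg_type"] == "stream" and content["name"] == "stdout":
            chunks.append(content["text"])
        elif msg["msg_type"] == "status" and content["execution_state"] == "idle":
            break  # idle marks the end of this execution
    return "".join(chunks)


stdout_text = collect_stdout(messages)
```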
  17. Exploring the Protocol - Rich Media
    {
      "msg_type": "execute_result",
      "content": {
        "data": {
          // Full payloads below
          "text/plain": ...,
          "text/html": ...,
        }
      }
    }
  18. Exploring the Protocol - Rich Media
    text/plain
          temp                 date
    4851  56.4  2010/07/22 04:00:00
    6232  65.9  2010/09/17 17:00:00
    8497  47.0  2010/12/21 02:00:00
    5875  60.9  2010/09/02 20:00:00
    5625  65.7  2010/08/23 10:00:00
  19. Exploring the Protocol - Rich Media
    text/html
    <table> <thead> <tr style="text-align: right;"> <th></th><th>temp</th><th>date</th></tr> </thead> <tbody> <tr> <th>4851</th> <td>56.4</td> <td>2010/07/22 04:00:00</td> </tr> …
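Because the same result arrives in several formats keyed by MIME type, each frontend can choose the richest one it knows how to render. The sketch below shows one way that choice might look. `DISPLAY_ORDER` and `pick_representation` are hypothetical names, and the payloads are shortened stand-ins for the full ones on the slides.

```python
# Hypothetical preference order: richest format first.
DISPLAY_ORDER = ["text/html", "text/plain"]


def pick_representation(data, display_order=DISPLAY_ORDER):
    """Return (mimetype, payload) for the richest format this frontend supports."""
    for mimetype in display_order:
        if mimetype in data:
            return mimetype, data[mimetype]
    raise ValueError("no supported representation in bundle")


# Abbreviated version of the execute_result message from the slides.
execute_result = {
    "msg_type": "execute_result",
    "content": {
        "data": {
            "text/plain": "      temp                 date\n4851  56.4  2010/07/22 04:00:00",
            "text/html": "<table><tr><th>4851</th><td>56.4</td></tr></table>",
        }
    },
}

bundle = execute_result["content"]["data"]
rich = pick_representation(bundle)                    # a web frontend
plain_only = pick_representation(bundle, ["text/plain"])  # a terminal frontend
```

A terminal-based frontend would pass a shorter preference list and fall back to `text/plain`, while a browser-based one can render the HTML table.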
  20. Building a Notebook Document
    We’ve witnessed
    • How code is sent to the runtime
    • What we receive as a frontend
  21. Building a Notebook Document
    How do we form a notebook? How do we associate messages to the cells they originated from?
  22. Building a Notebook Document
    Message IDs
    • Send message with a msg_id (message id)
    • Responses refer to originating message as the parent
  23. Building a Notebook Document
    We send the execute_request as message 0001
    { "msg_type": "execute_request", "msg_id": "0001", "content": { "code": "print('hey')" } }
    And initialize our state
    code: "print('hey')"
  24. Building a Notebook Document
    { "msg_type": "status", "msg_id": "0002", "parent_id": "0001", "content": { "execution_state": "busy" } }
    Responses show originating msg_id is parent_id 0001
  25. Building a Notebook Document
    Message
    { "msg_type": "status", "msg_id": "0002", "parent_id": "0001", "content": { "execution_state": "busy" } }
    Cell State
    code: "print('hey')"
    status: "busy"
  26. Building a Notebook Document
    Message
    { "msg_type": "stream", "msg_id": "0003", "parent_id": "0001", "content": { "text": "hey\n", "name": "stdout" } }
    Cell State
    code: "print('hey')"
    status: "busy"
    outputs:
      - type: "stream"
        text: "hey\n"
        name: "stdout"
  27. Building a Notebook Document
    Message
    { "msg_type": "status", "msg_id": "0004", "parent_id": "0001", "content": { "execution_state": "idle" } }
    Cell State
    code: "print('hey')"
    status: "idle"
    outputs:
      - type: "stream"
        text: "hey\n"
        name: "stdout"
  28. Building a Notebook Document
    Final State
    code: "print('hey')"
    status: "idle"
    outputs:
      - type: "stream"
        text: "hey\n"
        name: "stdout"
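The walk from request to final state can be sketched as a small reducer that routes each response to its cell. This follows the slides' flat `msg_id` and `parent_id` fields; in the full Jupyter message specification the ids live inside `header` and `parent_header`. `apply_message` and the `cells` mapping are hypothetical names used for illustration only.

```python
def apply_message(cells, msg):
    """Update the cell that sent the originating request, matched via parent_id."""
    cell = cells.get(msg.get("parent_id"))
    if cell is None:
        return  # response to a request this frontend did not send
    if msg["msg_type"] == "status":
        cell["status"] = msg["content"]["execution_state"]
    elif msg["msg_type"] == "stream":
        cell["outputs"].append({"type": "stream", **msg["content"]})


# Sending execute_request 0001 initializes the state for its cell.
request = {"msg_type": "execute_request", "msg_id": "0001",
           "content": {"code": "print('hey')"}}
cells = {request["msg_id"]: {"code": request["content"]["code"], "outputs": []}}

# Responses arrive in order, each pointing back at message 0001.
responses = [
    {"msg_type": "status", "msg_id": "0002", "parent_id": "0001",
     "content": {"execution_state": "busy"}},
    {"msg_type": "stream", "msg_id": "0003", "parent_id": "0001",
     "content": {"text": "hey\n", "name": "stdout"}},
    {"msg_type": "status", "msg_id": "0004", "parent_id": "0001",
     "content": {"execution_state": "idle"}},
]
for msg in responses:
    apply_message(cells, msg)

final_state = cells["0001"]
```

Keying the state by the request's `msg_id` is what lets several cells run at once without their outputs getting mixed up.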
  30. Building a Notebook Document
    That’s just one cell though. What would an entire notebook structure look like?
    What’s a notebook?
    • A rolling work log of computations
    • A linear list of cells
  31. cells:
      - text: "# Now with markdown!"
      - code: "from IPython.display import HTML"
      - code: "print('hey')"
        outputs:
          - type: "stream"
            text: "hey\n"
            name: "stdout"
      - code: "HTML('<b>WHOA</b>')"
        outputs:
          - type: "execute_result"
            data:
              - "text/html": "<b>WHOA</b>"
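The same simplified structure can be written as plain Python data, which makes the "linear list of cells" idea concrete. This is a sketch of the slide's simplified shape only; actual notebook files are JSON documents with more fields per cell (cell types, metadata and so on). Here `data` is written as a mapping keyed by MIME type, matching the `execute_result` message shown earlier, and `code_cells` is a hypothetical helper.

```python
import json

# The notebook from the slide: an ordered list of cells, some with outputs.
notebook = {
    "cells": [
        {"text": "# Now with markdown!"},
        {"code": "from IPython.display import HTML"},
        {"code": "print('hey')",
         "outputs": [{"type": "stream", "text": "hey\n", "name": "stdout"}]},
        {"code": "HTML('<b>WHOA</b>')",
         "outputs": [{"type": "execute_result",
                      "data": {"text/html": "<b>WHOA</b>"}}]},
    ]
}


def code_cells(nb):
    """Return the source of every code cell, in document order."""
    return [cell["code"] for cell in nb["cells"] if "code" in cell]


# The whole document is plain data, so it round-trips through JSON unchanged.
serialized = json.dumps(notebook, indent=2)
restored = json.loads(serialized)
```

Because the document is just data, any frontend can load it, re-run `code_cells(restored)` against a kernel, and rebuild the outputs.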
  33. Callouts
    For more in-depth coverage of protocols and formats, check out these talks:
    • Jupyter: Kernels, protocols, and the IPython reference implementation
    • The Jupyter Notebook as document: From structure to application
  34. Callouts
    For more in-depth coverage of alternative frontends, check out these talks:
    • Lessons learned from tens of thousands of Kaggle notebooks
    • JupyterLab: The next-generation Jupyter frontend