Jupyter Frontends: From the classic Jupyter Notebook, to JupyterLab, nteract and beyond

Kyle Kelley
August 24, 2017

The Jupyter architecture (message specification, kernels, notebook documents) allows for multiple end-user applications or Jupyter frontends. The traditional application for Jupyter is the classic Jupyter Notebook, which began as the IPython Notebook in 2011. Since then, the Jupyter Notebook frontend has become a critical tool for millions of users doing interactive computing in scientific research, education, and commercial data science, machine learning, and AI. In recent years, a number of more modern end-user applications built on top of the Jupyter architecture have emerged, including Rodeo, CoCalc, Stencila, nteract, and JupyterLab. Project Jupyter is embracing the flowering of end-user applications and taking steps to document and formalize the abstractions across all Jupyter frontends.

Kyle Kelley and Brian Granger offer a broad look at Jupyter frontends, describing their common aspects and explaining how their differences help Jupyter reach a broader set of users. They also share ongoing challenges in building these frontends (real-time collaboration, security, rich output, different Markdown formats, etc.) as well as their ongoing work to address these questions.

Transcript


  1. Jupyter Frontends: From the classic Jupyter Notebook, to JupyterLab, nteract and beyond
    Brian Granger, Cal Poly
    Kyle Kelley, Netflix
    Project Jupyter
    JupyterCon 2017

  2. Outline
    • User Testing @ JupyterCon
    • What is interactive computing?
    • User interfaces for interactive computing
    • The terminal as a UI for interactive computing
    • Jupyter Notebook: successes and challenges
    • Foundations of Jupyter frontends:
      • Jupyter Message Specification
    • Survey of different Jupyter frontends/UIs

  3. User Testing @ JupyterCon
    • We are doing extensive, in-person user experience testing here at JupyterCon
    • We need your help to improve our software!
    • Thursday+Friday all day
    • Location: Gramercy room on level 2
    • Drop in or sign up:
      • http://bit.ly/jupytercon-usertesting

  4. Interactive Computing
    An interactive computation is a persistent computer program that runs with a "human in the loop," where the primary mode of steering the program is through the human iteratively writing/running blocks of code and looking at the result.
    Human Centered Computing

  5. Interactive Computing: Breakdown
    Compose: A UI to compose blocks of code
    Submit: A UI to submit code to the running program
    Run: A method to run code in a persistent namespace
    Output: A method to return output to the user
    View: A UI to view the output
    Repeat: A method to repeat the process
    Frontend = User Interface = UI

  6. IPython: The Terminal as a UI
    IPython was released in 2001 and provided a UI for interactive computing
    Tab completion, extended syntax (!, %, ?), plotting integration

  7. IPython: The Terminal as a UI
    IPython was released in 2001 and provided a UI for interactive computing:
    Compose: Type at the Terminal
    Submit:
    Run: eval(code, globals, locals)
    Output: print(output, file=sys.stdout)
    View: Look at the Terminal
    Repeat: while True:
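
    The compose/submit/run/output/view/repeat steps on this slide can be sketched as a small Python loop. This is only an illustration under stated assumptions: the names run_block and repl are hypothetical, and IPython itself is implemented very differently. The point is the persistent namespace that survives from one block to the next.

```python
import contextlib
import io


def run_block(code, namespace):
    """Run one block of code in a persistent namespace; return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        try:
            # Expressions are compiled in "eval" mode so their value can be shown.
            compiled = compile(code, "<input>", "eval")
        except SyntaxError:
            # Statements (assignments, imports, ...) are run with exec instead.
            exec(compile(code, "<input>", "exec"), namespace)
        else:
            value = eval(compiled, namespace)
            if value is not None:
                print(repr(value))
    return buffer.getvalue()


def repl(read=input):
    """Hypothetical terminal loop: compose, submit, run, output, view, repeat."""
    namespace = {}  # persists across iterations, like a session's globals
    while True:  # Repeat
        try:
            code = read(">>> ")  # Compose and Submit (the user presses Enter)
        except EOFError:
            break
        print(run_block(code, namespace), end="")  # Output and View
```

    Calling repl() in a terminal gives a bare-bones prompt; it has none of the tab completion or extended syntax that IPython added on top of this loop.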

  8. Usability Challenges of the Terminal
    • No narrative: code without narrative is empty of meaning
      • Narrative text/documentation
      • Images, Visualizations, Equations
    • No memory: wasteful and frustrating
      • Everything you type is lost when you exit IPython
      • The only memory was hitting the up arrow after restarting to cycle through previous commands
      • Python’s runtime effectively requires you to restart often
      • code, exit, ipython, up, up, code, exit, ipython, up, up, up, code, …
    • No reproducibility and communication: your own private dead end
      • No way to record/reproduce an interactive computation and its results
      • No way to share a session with others and communicate results
    In spite of this, Python, IPython, NumPy, Matplotlib, etc. were revolutionary

  9. Why is the classic IPython/Jupyter Notebook so useful?

  10. The IPython/Jupyter Notebook maintains the full interactive computing workflow while adding narrative, memory, reproducibility, communication

  11. IPython Notebook 2011: “Computational Narrative”
    Narrative, Memory, Reproducible, Communication
    File = reproducible, communication, memory
    [Annotated notebook screenshot; labels: equations, narrative, images, memory, compose, submit/run, output, repeat]

  12. Jupyter Notebook 2017

  13. Jupyter Notebook: Successes

  14. >6-8M??? Users
    https://github.com/jupyter/design/blob/master/surveys/2015-notebook-ux/analysis/report_dashboard.ipynb

  15. ~100 Languages
    https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

  16. Highly International Community
    Google Analytics for jupyter.org, July 2017

  17. Over 1M Notebooks on GitHub

  18. An Amazing Community: Emergent Behaviors

  19. Jupyter Notebook: Challenges

  20. Jupyter Notebook: Challenges
    • Built with web technology of 2011 (jQuery, Bootstrap, require.js)
      • In contrast to core server/architecture in Python, which has been relatively stable
    • Important features became difficult to implement:
      • Real-time collaboration: models and views are welded together
      • Collapsing input/output: DOM/CSS had become pseudo-public APIs
    • No distinction between public/private APIs - anything breaks everything
    • Difficult to extend
      • Code base is large enough that you miss static typing
      • No dependency injection for runtime dependency resolution

  21. Jupyter Notebook: Challenges
    • Leaky foundations:
      • Message spec and notebook format had become “polluted” with implementation-specific details
      • Libraries such as ipywidgets were welded to the classic notebook implementation details
    • Other non-notebook workflows:
      • QTConsole: interactive computing with rich output
      • Hydrogen/RStudio: Python/R scripts with interactive computing and rich output
    • Emergent behaviors:
      • Exciting, but unexpected use cases
      • An ecosystem that has become an abundant, fertile, rich garden
      • With a codebase that is difficult to scale or maintain to span those usage

  22. Frontend Foundations

  23. Exploring the Protocol

  24. Exploring the Protocol
    {
      "msg_type": "execute_request",
      "content": {
        "code": "print('hey')"
      }
    }

  25. Exploring the Protocol
    {
      "msg_type": "status",
      "content": {
        "execution_state": "busy"
      }
    }

  26. Exploring the Protocol
    {
      "msg_type": "status", …
    }
    {
      "msg_type": "stream",
      "content": {
        "text": "hey\n",
        "name": "stdout"
      }
    }

  27. Exploring the Protocol
    {
      "msg_type": "status", …
    }
    {
      "msg_type": "stream", …
    }
    {
      "msg_type": "status",
      "content": {
        "execution_state": "idle"
      }
    }

  28. Exploring the Protocol
    Two types of messages so far
    • Execution status - busy or idle
    • A stream of “stdout”
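
    The exchange seen so far (busy, stdout stream, idle) can be imitated with a toy stand-in for a kernel. This is a sketch under stated assumptions, not the real implementation: toy_kernel is a hypothetical name, the messages carry only the simplified fields shown on the slides, and the toy sends all stdout in a single stream message after the code finishes, where a real kernel streams output as it is produced.

```python
import contextlib
import io


def toy_kernel(request, namespace):
    """Yield simplified reply messages for one execute_request."""
    def message(msg_type, content):
        return {"msg_type": msg_type, "content": content}

    # 1. Tell the frontend that work has started.
    yield message("status", {"execution_state": "busy"})

    # 2. Run the submitted code in the persistent namespace, capturing stdout.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(request["content"]["code"], namespace)
    if buffer.getvalue():
        yield message("stream", {"text": buffer.getvalue(), "name": "stdout"})

    # 3. Tell the frontend that the kernel is free again.
    yield message("status", {"execution_state": "idle"})


request = {"msg_type": "execute_request", "content": {"code": "print('hey')"}}
for reply in toy_kernel(request, {}):
    print(reply)
```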

  29. Exploring the Protocol
    How about longer computations?

  30. Exploring the Protocol
    { "msg_type": "status",
      "content": { "execution_state": "busy" } }
    { "msg_type": "stream",
      "content": { "text": "hey\n", "name": "stdout" } }
    { "msg_type": "stream",
      "content": { "text": "sup\n", "name": "stdout" } }
    { "msg_type": "status",
      "content": { "execution_state": "idle" } }
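
    With a longer computation the frontend receives several stream messages between busy and idle. One plausible way for a frontend to handle them, sketched here as an assumption rather than as what any particular frontend does, is to append each chunk to the previous stream output when both share the same name. The function merge_streams is hypothetical and ignores every non-stream message.

```python
def merge_streams(messages):
    """Collapse successive stream messages with the same name into one output."""
    outputs = []
    for msg in messages:
        if msg["msg_type"] != "stream":
            continue  # status messages carry no output text
        content = msg["content"]
        if outputs and outputs[-1]["name"] == content["name"]:
            # Same stream as the previous chunk: extend its text.
            outputs[-1]["text"] += content["text"]
        else:
            outputs.append(
                {"type": "stream", "name": content["name"], "text": content["text"]}
            )
    return outputs


messages = [
    {"msg_type": "status", "content": {"execution_state": "busy"}},
    {"msg_type": "stream", "content": {"text": "hey\n", "name": "stdout"}},
    {"msg_type": "stream", "content": {"text": "sup\n", "name": "stdout"}},
    {"msg_type": "status", "content": {"execution_state": "idle"}},
]
print(merge_streams(messages))
```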

  31. Exploring the Protocol - Rich Media
    How are tables, plots, and rich media shown?

  32. Exploring the Protocol - Rich Media
    "content": {
      "code": "pd.read_csv('temps.csv').sample(n=5)"
    }

  33. Exploring the Protocol - Rich Media
    {
      "msg_type": "execute_result",
      "content": {
        "data": {
          // Full payloads below
          "text/plain": ...,
          "text/html": ...,
        }
      }
    }
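
    Because the data field carries the same result in several formats keyed by MIME type, each frontend can pick the richest one it is able to render. A minimal sketch of that choice follows; the function name pick_representation and the preference order are assumptions made for illustration, not part of the protocol.

```python
# Display preference, richest first. This ordering is an assumption for the sketch.
PREFERRED = ("text/html", "image/png", "text/plain")


def pick_representation(data, supported=PREFERRED):
    """Return (mimetype, payload) for the first supported type present in data."""
    for mimetype in supported:
        if mimetype in data:
            return mimetype, data[mimetype]
    raise ValueError("no supported representation in bundle")


bundle = {
    "text/plain": "      temp                 date\n4851  56.4  2010/07/22 04:00:00",
    "text/html": "<table>...</table>",
}
# A notebook-style frontend can render HTML; a terminal frontend falls back to text.
print(pick_representation(bundle)[0])
print(pick_representation(bundle, supported=("text/plain",))[0])
```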

  34. Exploring the Protocol - Rich Media
    text/plain
          temp                 date
    4851  56.4  2010/07/22 04:00:00
    6232  65.9  2010/09/17 17:00:00
    8497  47.0  2010/12/21 02:00:00
    5875  60.9  2010/09/02 20:00:00
    5625  65.7  2010/08/23 10:00:00

  35. Exploring the Protocol - Rich Media
    text/html
          temp                 date
    4851  56.4  2010/07/22 04:00:00

  36. Exploring the Protocol - Rich Media
    HTML Payload

  39. Building a Notebook Document

  40. Building a Notebook Document
    We’ve witnessed
    • How code is sent to the runtime
    • What we receive as a frontend

  41. Building a Notebook Document
    How do we form a notebook?
    How do we associate messages with the cells they originated from?

  42. Building a Notebook Document
    Message IDs
    • Send message with a msg_id (message id)
    • Responses refer to originating message as the parent

  43. Building a Notebook Document
    We send the execute_request as message 0001
    {
      "msg_type": "execute_request",
      "msg_id": "0001",
      "content": {
        "code": "print('hey')"
      }
    }
    And initialize our state
    code: "print('hey')"

  44. Building a Notebook Document
    Responses carry the originating msg_id as their parent_id (here 0001)
    {
      "msg_type": "status",
      "msg_id": "0002",
      "parent_id": "0001",
      "content": {
        "execution_state": "busy"
      }
    }

  45. Building a Notebook Document
    Message
    {
      "msg_type": "status",
      "msg_id": "0002",
      "parent_id": "0001",
      "content": {
        "execution_state": "busy"
      }
    }
    Cell State
    code: "print('hey')"
    status: "busy"

  46. Building a Notebook Document
    Message
    {
      "msg_type": "stream",
      "msg_id": "0003",
      "parent_id": "0001",
      "content": {
        "text": "hey\n",
        "name": "stdout"
      }
    }
    Cell State
    code: "print('hey')"
    status: "busy"
    outputs:
      - type: "stream"
        text: "hey\n"
        name: "stdout"

  47. Building a Notebook Document
    Message
    {
      "msg_type": "status",
      "msg_id": "0004",
      "parent_id": "0001",
      "content": {
        "execution_state": "idle"
      }
    }
    Cell State
    code: "print('hey')"
    status: "idle"
    outputs:
      - type: "stream"
        text: "hey\n"
        name: "stdout"

  48. Building a Notebook Document
    Final State
    code: "print('hey')"
    status: "idle"
    outputs:
      - type: "stream"
        text: "hey\n"
        name: "stdout"
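
    The walk from request to final state can be written as a small reducer that folds each incoming message into the state of the cell whose request it answers. This is a sketch using the simplified fields from the slides; send_execute_request and apply_message are hypothetical names. In the actual Jupyter message specification the ids live in nested header and parent_header objects rather than in top-level msg_id and parent_id fields.

```python
def send_execute_request(cells, msg_id, code):
    """Record a new cell under its msg_id and build the request to send."""
    cells[msg_id] = {"code": code, "outputs": []}
    return {"msg_type": "execute_request", "msg_id": msg_id, "content": {"code": code}}


def apply_message(cells, msg):
    """Fold one kernel message into the cell that originated it."""
    cell = cells.get(msg.get("parent_id"))
    if cell is None:
        return cells  # not a reply to any request we sent
    content = msg["content"]
    if msg["msg_type"] == "status":
        cell["status"] = content["execution_state"]
    elif msg["msg_type"] == "stream":
        cell["outputs"].append(
            {"type": "stream", "text": content["text"], "name": content["name"]}
        )
    return cells


cells = {}
send_execute_request(cells, "0001", "print('hey')")
replies = [
    {"msg_type": "status", "msg_id": "0002", "parent_id": "0001",
     "content": {"execution_state": "busy"}},
    {"msg_type": "stream", "msg_id": "0003", "parent_id": "0001",
     "content": {"text": "hey\n", "name": "stdout"}},
    {"msg_type": "status", "msg_id": "0004", "parent_id": "0001",
     "content": {"execution_state": "idle"}},
]
for reply in replies:
    apply_message(cells, reply)
print(cells["0001"])
```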

  50. Building a Notebook Document
    That’s just one cell though. What would an entire notebook structure look like?
    What’s a notebook?
    • A rolling work log of computations
    • A linear list of cells

  51. cells:
    - text: "# Now with markdown!"
    - code: "from IPython.display import HTML"
    - code: "print('hey')"
      outputs:
        - type: "stream"
          text: "hey\n"
          name: "stdout"
    - code: "HTML('WHOA')"
      outputs:
        - type: "execute_result"
          data:
            - "text/html": "WHOA"
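
    The structure on this slide is a simplification of the on-disk notebook format. As a rough sketch, the mapping to nbformat-4-style field names (cell_type, source, outputs, output_type) might look like the following. The function to_nbformat_like is hypothetical, it treats data as a mapping from MIME type to payload, and its result is not validated against the official schema.

```python
import json


def to_nbformat_like(simple):
    """Map the slides' simplified cells onto nbformat-4-style field names."""
    cells = []
    for cell in simple["cells"]:
        if "text" in cell:
            cells.append(
                {"cell_type": "markdown", "metadata": {}, "source": cell["text"]}
            )
            continue
        outputs = []
        for out in cell.get("outputs", []):
            converted = {"output_type": out["type"]}
            if out["type"] == "stream":
                converted.update(name=out["name"], text=out["text"])
            else:
                # execute_result: a MIME bundle plus bookkeeping fields.
                converted.update(data=out["data"], metadata={}, execution_count=None)
            outputs.append(converted)
        cells.append(
            {"cell_type": "code", "metadata": {}, "source": cell["code"],
             "execution_count": None, "outputs": outputs}
        )
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


simple = {
    "cells": [
        {"text": "# Now with markdown!"},
        {"code": "print('hey')",
         "outputs": [{"type": "stream", "text": "hey\n", "name": "stdout"}]},
        {"code": "HTML('WHOA')",
         "outputs": [{"type": "execute_result", "data": {"text/html": "WHOA"}}]},
    ]
}
print(json.dumps(to_nbformat_like(simple), indent=2))
```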

  53. Summarizing the Notebook
    send code →
    … run, run …
    → get result(s)

  54. Callouts
    For a more in-depth look at protocols and formats, check out these talks:
    • Jupyter: Kernels, protocols, and the IPython reference implementation
    • The Jupyter Notebook as document: From structure to application

  55. Survey of Frontends

  57. hydrogen

  58. hydrogen

  59. nteract

  60. Spyder

  61. Callouts
    For a more in-depth look at alternative frontends, check out these talks:
    • Lessons learned from tens of thousands of Kaggle notebooks
    • JupyterLab: The next-generation Jupyter frontend

  62. Questions
