Anaconda Project and JupyterLab

Data Science encapsulation and deployment, JupyterCon 2017

Christine Doig

August 25, 2017

Transcript

  1. © 2016 Continuum Analytics - Confidential & Proprietary
    © 2017 Continuum Analytics - Confidential & Proprietary
    Data Science encapsulation and deployment with Anaconda Project and JupyterLab
    Christine Doig, Senior Product Manager and Data Scientist
    Continuum Analytics

  2. Agenda
    • Challenges in data science reproducibility and deployment
    • Encapsulating your data science with Anaconda Project
    • Using Anaconda Project with JupyterLab
    • Anaconda Project & JupyterLab powering Anaconda Enterprise v5

  3. Challenges in Data Science reproducibility and deployment

  4. Reproducibility
    [Diagram: Data Science Development on a laptop. Languages: Python, R. Libraries: scikit-learn, Bokeh, TensorFlow, Jupyter, pandas, matplotlib, seaborn, dask, numba. Project files: script 1, script 2, script 3, notebook A, dataset Z.]

  5. Deployment
    [Diagram: Workflows. Stages: Data, Query, Visualize, Clean & Tidy, Predict, Simulate, & Optimize. Artifacts shown: interactive data visualizations and dashboards, Jupyter notebooks, scripts, predictive models, processed data.]

  6. Challenges in Data Science reproducibility and deployment
    • Data scientists work on different platforms: Windows, macOS, Linux
    • Data science development environments differ from deployment environments
    • Data science dependencies are more than just software packages: data, variables, commands, services
    • Managing software packages: versions, build, channel
    • Data scientists are not necessarily software developers. Current deployment tools are heavily focused on serving developers
    • There is a variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…

  7. Encapsulating your data science with Anaconda Project

  8. Anaconda Distribution & conda make data science reproducibility and development easier
    [Diagram, two panels. Left, "Data Science Development": a laptop or desktop (Windows, macOS, Linux) with Anaconda Distribution and Analysis 1, 2, and 3, each paired with its own conda env (1, 2, 3). Right, "Data Science Reproducibility & Deployment": a laptop, desktop, or server (Windows, macOS, Linux) with Anaconda Distribution and the same three analyses and conda envs; this panel also carries the label "Docker container".]

  9. With Anaconda and conda, Data Scientists can:
    • Manage software packages across platforms: Windows, macOS, Linux
    • Isolate and recreate environments

  10. Introducing Anaconda Project, now available in Anaconda Distribution

  11. Anaconda Project
    Data science portable encapsulation
    anaconda-project.yml
    • Define and manage:
      • deployment commands
      • downloads and data
      • project package dependencies
      • multiple environments
      • environment variables (with encryption)
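
As a rough illustration of the sections listed on this slide, the sketch below embeds a small, hypothetical `anaconda-project.yml` as a Python string and lists its top-level sections. The section names (`commands`, `downloads`, `packages`, `env_specs`, `variables`) are assumed from the Anaconda Project file format and should be checked against the official documentation; the project name, file names, URL, and variable names are invented for illustration.

```python
# Hypothetical anaconda-project.yml, embedded as text for illustration only.
# Section names are assumed from the Anaconda Project file format; the
# project name, scripts, URL, and variable names are made up.
PROJECT_YML = """\
name: iris-demo

commands:
  notebook:
    notebook: analysis.ipynb
  api:
    unix: python serve.py

downloads:
  IRIS_CSV:
    url: https://example.com/iris.csv

packages:
  - python=3.6
  - pandas

env_specs:
  default:
    packages:
      - bokeh
  test:
    packages:
      - pytest

variables:
  - DB_PASSWORD
"""


def top_level_sections(yml_text):
    """Return the top-level keys of a simple YAML document, in file order.

    This is a deliberately naive scan (unindented `key:` lines), not a
    YAML parser; it is only meant to show the file's overall shape.
    """
    sections = []
    for line in yml_text.splitlines():
        is_top_level = line and not line[0].isspace() and not line.startswith("#")
        if is_top_level and ":" in line:
            sections.append(line.split(":", 1)[0])
    return sections


if __name__ == "__main__":
    print(top_level_sections(PROJECT_YML))
```

Each top-level section corresponds to one of the bullets above: commands to run, data to download, packages to install, named environments, and variables the project needs.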

  12. Anaconda Project
    Data science portable encapsulation
    • Lock your environments:
      • package versions, down to the build numbers
      • platforms
      • packages by platform
    Note: This file is automatically generated for you by Anaconda Project
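
To make "down to the build numbers" and "packages by platform" concrete, here is a minimal sketch that splits conda-style `name=version=build` specs and selects the pinned list for one platform. The `LOCKED` dictionary is a hypothetical in-memory stand-in, not the layout of the file Anaconda Project generates, and the versions and build strings are invented for illustration.

```python
from typing import Dict, List, NamedTuple


class PinnedPackage(NamedTuple):
    name: str
    version: str
    build: str


def parse_pin(spec: str) -> PinnedPackage:
    """Split a fully pinned conda-style spec, `name=version=build`."""
    parts = spec.split("=")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a fully pinned spec: {spec!r}")
    return PinnedPackage(*parts)


# Hypothetical lock data: pins shared by every platform, plus
# platform-specific pins. Versions and build strings are invented.
LOCKED: Dict[str, List[str]] = {
    "all": ["pandas=0.20.3=py36_0"],
    "linux-64": ["libgcc=5.2.0=0"],
    "osx-64": [],
    "win-64": ["vs2015_runtime=14.0.25420=0"],
}


def pins_for(platform: str) -> List[PinnedPackage]:
    """Return the shared pins followed by the pins specific to `platform`."""
    if platform == "all" or platform not in LOCKED:
        raise KeyError(f"platform not locked: {platform}")
    return [parse_pin(spec) for spec in LOCKED["all"] + LOCKED[platform]]


if __name__ == "__main__":
    for pkg in pins_for("linux-64"):
        print(pkg.name, pkg.version, pkg.build)
```

Keeping shared and per-platform pins separate is one way to express that the same project may need different packages on Windows, macOS, and Linux while still resolving to exact builds on each.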

  13. Anaconda Project brings additional capabilities for data science reproducibility and development
    [Diagram, two panels. Left, "Data Science Development": Project 1, Project 2, and Project 3 on a laptop or desktop (Windows, macOS, Linux). Right, "Data Science Reproducibility & Deployment": Project 1, Project 2, and Project 3 on a laptop, desktop, or server (Windows, macOS, Linux).]

  14. With Anaconda Project, Data Scientists can:
    • Export environments (with pinned versions) cross-platform
    • Manage other dependencies: data, variables, commands, services
    • Get the right abstraction for data scientists (e.g. Docker)
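
One way a project script might consume these non-package dependencies is through environment variables. The sketch below assumes the project tool exports each declared variable, and the local path of each downloaded file, into the process environment before running a command. That behavior and the names `IRIS_CSV` and `DB_PASSWORD` are assumptions for illustration, not details taken from the slides.

```python
import os
from typing import Dict, Mapping, Sequence


def load_settings(required: Sequence[str],
                  env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect required settings from the environment, failing early.

    Raising on missing names gives a clear error when the script is run
    outside the project tool, where the variables would not be set.
    """
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}


if __name__ == "__main__":
    # Hypothetical values standing in for what the project tool would set.
    demo_env = {"IRIS_CSV": "data/iris.csv", "DB_PASSWORD": "not-a-real-secret"}
    settings = load_settings(["IRIS_CSV", "DB_PASSWORD"], env=demo_env)
    print("data file:", settings["IRIS_CSV"])
```

Passing the environment in as a parameter keeps the function easy to test and makes the script's external dependencies explicit in one place.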

  15. Anaconda Enterprise makes project collaboration and deployment secure and scalable
    [Diagram, two panels. Left, "Data Science Development": Project 1, Project 2, and Project 3 on a laptop or desktop. Right, "Data Science Reproducibility, Development and Deployment": Project 1, Project 2, and Project 3 in Anaconda Enterprise, shown with Container 1, Container 2, Container 3, and Container 4.]

  16. Deploy
    [Diagram: Project 1 and Project 2, with deployment types: Notebooks, Models - REST APIs, Dashboards, Applications.]

  17. [No slide text]

  18. [No slide text]

  19. With Anaconda Enterprise, Data Scientists can:
    • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
    • Collaborate with other data scientists
    • Securely share deployments with other applications and users

  20. Anaconda Project & JupyterLab
    powering Anaconda Enterprise v5

  21. JupyterLab is the default experience in Anaconda Enterprise

  22. Anaconda Project Lab extension
    • Manage your Anaconda Project dependencies from a graphical interface inside JupyterLab
    Nicholas Bollweg
    GitHub: bollwyvl

  23. [No slide text]

  24. Anaconda helps Data Scientists reproduce and deploy their projects
    • Anaconda Distribution and conda:
      • Manage software packages across platforms: Windows, macOS, Linux
      • Isolate and recreate environments
    • Anaconda Project:
      • Export environments (with pinned versions) cross-platform
      • Manage other dependencies: data, variables, commands, services
      • Get the right abstraction for data scientists (e.g. Docker)
    • Anaconda Enterprise:
      • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
      • Collaborate with other data scientists
      • Securely share deployments with other applications and users

  25. https://speakerdeck.com/chdoig
    @ch_doig

  26. Questions?
