Anaconda Project and JupyterLab

Data Science encapsulation and deployment, JupyterCON 2017

Christine Doig

August 25, 2017

Transcript

  1. Data Science encapsulation and deployment with Anaconda Project and JupyterLab

    Christine Doig, Senior Product Manager and Data Scientist, Continuum Analytics
    © 2017 Continuum Analytics - Confidential & Proprietary
  2. Agenda

    • Challenges in data science reproducibility and deployment
    • Encapsulating your data science with Anaconda Project
    • Using Anaconda Project with JupyterLab
    • Anaconda Project & JupyterLab powering Anaconda Enterprise v5
  3. Challenges in Data Science reproducibility and deployment

  4. Reproducibility

    [Diagram: data science development on a laptop]
    • Languages: Python, R
    • Libraries: scikit-learn, Bokeh, TensorFlow, Jupyter, pandas, matplotlib, seaborn, dask, numba
    • Project files: script 1, script 2, script 3, notebook A, dataset Z
  5. Deployment

    [Diagram: data science workflows and what they produce]
    • Workflows: Data; Query; Visualize; Clean & Tidy; Predict, Simulate, & Optimize
    • Outputs to deploy: interactive data visualizations and dashboards, Jupyter notebooks, scripts, predictive models, processed data
  6. Challenges in Data Science reproducibility and deployment

    • Data scientists work on different platforms: Windows, macOS, Linux
    • Data science development environments differ from deployment environments
    • Data science dependencies are more than just software packages: data, variables, commands, services
    • Managing software packages means managing versions, builds, and channels
    • Data scientists are not necessarily software developers, yet current deployment tools are very focused on serving developers
    • There is a variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
  7. Encapsulating your data science with Anaconda Project

  8. Anaconda Distribution & conda make data science reproducibility and development easier

    [Diagram]
    • Data Science Development (Laptop / Desktop; Windows, macOS, Linux): Anaconda Distribution; conda env 1 / Analysis 1, conda env 2 / Analysis 2, conda env 3 / Analysis 3
    • Data Science Reproducibility & Deployment (Laptop / Desktop / Server; Windows, macOS, Linux): Anaconda Distribution, Docker container; conda env 1 / Analysis 1, conda env 2 / Analysis 2, conda env 3 / Analysis 3
  9. With Anaconda and conda, Data Scientists can:

    • Manage software packages across platforms: Windows, macOS, Linux
    • Isolate and recreate environments
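
    To make "isolate and recreate environments" concrete, a conda environment can be described in an `environment.yml` file. The sketch below is a minimal, hypothetical example: the environment name and the package list are assumptions (borrowed from libraries named earlier in the deck), not content from the slides.

    ```yaml
    # environment.yml (hypothetical example, not from the slides)
    name: analysis-1        # assumed name, echoing "Analysis 1" in the diagram
    channels:
      - defaults
    dependencies:
      - python=3.6          # pin the interpreter version
      - pandas
      - scikit-learn
      - bokeh
    ```

    A file like this can be used to recreate the environment on another machine with `conda env create -f environment.yml`.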
  10. Introducing Anaconda Project, now available in Anaconda Distribution

  11. Anaconda Project: Data science portable encapsulation

    anaconda-project.yml
    • Define and manage:
      • deployment commands
      • downloads and data
      • project package dependencies
      • multiple environments
      • environment variables (with encryption)
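
    The items listed above could map onto an `anaconda-project.yml` roughly as sketched below. This is an illustrative file, not the one shown in the talk: the project name, file names, URL, and variable name are all hypothetical, and the field names reflect the Anaconda Project file format as commonly documented, so they should be checked against the current documentation. How variable encryption is configured is not shown on the slide and is not illustrated here.

    ```yaml
    # anaconda-project.yml (hypothetical example; names and URLs are placeholders)
    name: iris-dashboard

    # project package dependencies
    packages:
      - python=3.6
      - pandas
      - bokeh
    channels:
      - defaults

    # deployment commands
    commands:
      notebook:
        notebook: analysis.ipynb     # run a Jupyter notebook
      dashboard:
        bokeh_app: dashboard         # serve a Bokeh app directory
      train:
        unix: python train.py        # shell command on Linux/macOS
        windows: python train.py     # shell command on Windows

    # downloads and data
    downloads:
      IRIS_CSV:
        url: https://example.com/iris.csv
        filename: data/iris.csv

    # environment variables the project needs at run time
    variables:
      DB_PASSWORD:
        description: Password for a hypothetical results database

    # multiple environments
    env_specs:
      default:
        packages: []
      test:
        packages:
          - pytest
    ```

    Keeping commands, data, variables, and packages in one file is what lets the project directory travel as a single unit between machines.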
  12. Anaconda Project: Data science portable encapsulation

    • Lock your environments:
      • package versions, down to the build numbers
      • platforms
      • packages by platform
    Note: This file is automatically generated for you by Anaconda Project
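
    The slide's file is not reproduced in the transcript. Assuming it refers to the project lock file (`anaconda-project-lock.yml`), its general shape might look like the sketch below. The package pins are illustrative placeholders rather than values from the talk, and the exact keys are an assumption to verify against the Anaconda Project documentation.

    ```yaml
    # anaconda-project-lock.yml (illustrative shape only; normally generated, not hand-written)
    locking_enabled: true
    env_specs:
      default:
        locked: true
        platforms:                   # platforms the lock covers
          - linux-64
          - osx-64
          - win-64
        packages:
          all:                       # pins shared by every platform: name=version=build
            - pandas=0.20.3=py36_0   # hypothetical pin
          linux-64:                  # pins that apply to one platform only
            - libgcc=5.2.0=0         # hypothetical pin
    ```

    Pinning down to the build string is what distinguishes this from a plain version list: two builds of the same version can be compiled against different libraries.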
  13. Anaconda Project brings additional capabilities for data science reproducibility and development

    [Diagram]
    • Data Science Development (Laptop / Desktop; Windows, macOS, Linux): Project 1, Project 2, Project 3
    • Data Science Reproducibility & Deployment (Laptop / Desktop / Server; Windows, macOS, Linux): Project 1, Project 2, Project 3
  14. With Anaconda Project, Data Scientists can:

    • Export environments (with pinned versions) cross-platform
    • Manage other dependencies: data, variables, commands, services
    • Get the right abstraction for data scientists (e.g. Docker)
  15. Anaconda Enterprise makes project collaboration and deployment secure and scalable

    [Diagram]
    • Data Science Development (Laptop / Desktop): Project 1, Project 2, Project 3
    • Data Science Reproducibility, Development and Deployment (Anaconda Enterprise): Project 1, Project 2, Project 3, with Container 1, Container 2, Container 3, Container 4
  16. Deploy

    [Diagram: Project 1 and Project 2 deployed as]
    • Notebooks
    • Models - REST APIs
    • Dashboards
    • Applications
  19. With Anaconda Enterprise, Data Scientists can:

    • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
    • Collaborate with other data scientists
    • Securely share deployments with other applications and users
  20. Anaconda Project & JupyterLab powering Anaconda Enterprise v5

  21. JupyterLab is the default experience in Anaconda Enterprise

  22. Anaconda Project Lab extension

    • Manage your Anaconda Project dependencies from a graphical interface inside JupyterLab
    Nicholas Bollweg, GitHub: bollwyvl
  24. Anaconda helps Data Scientists reproduce and deploy their projects

    • Anaconda Distribution and conda:
      • Manage software packages across platforms: Windows, macOS, Linux
      • Isolate and recreate environments
    • Anaconda Project:
      • Export environments (with pinned versions) cross-platform
      • Manage other dependencies: data, variables, commands, services
      • Get the right abstraction for data scientists (e.g. Docker)
    • Anaconda Enterprise:
      • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
      • Collaborate with other data scientists
      • Securely share deployments with other applications and users
  25. https://speakerdeck.com/chdoig @ch_doig

  26. Questions?