Data Science encapsulation and deployment, JupyterCON 2017
© 2017 Continuum Analytics - Confidential & Proprietary
Data Science encapsulation and deployment with Anaconda Project and JupyterLab
Christine Doig, Senior Product Manager and Data Scientist
Continuum Analytics
Agenda
• Challenges in data science reproducibility and deployment
• Encapsulating your data science with Anaconda Project
• Using Anaconda Project with JupyterLab
• Anaconda Project & JupyterLab powering Anaconda Enterprise v5
Challenges in Data Science reproducibility and deployment
[Diagram: Reproducibility. Data science development on a laptop with Python and R: scikit-learn, Bokeh, Tensorflow, Jupyter, pandas, matplotlib, seaborn, dask, numba; script 1, script 2, script 3, notebook A, dataset Z]
[Diagram: Deployment. Workflows: Data, Query, Clean & Tidy, Visualize, Predict, Simulate, & Optimize. Outputs: interactive data visualizations and dashboards, Jupyter notebooks, scripts, predictive models, processed data]
Challenges in Data Science reproducibility and deployment
• Data scientists work on different platforms: Windows, macOS, Linux
• Data science development environments differ from deployment environments
• Data science dependencies are more than just software packages: data, variables, commands, services
• Managing software packages involves versions, builds, and channels
• Data scientists are not necessarily software developers, and current deployment tools focus heavily on serving developers
• There is a variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
Encapsulating your data science with Anaconda Project
[Diagram: Anaconda Distribution & conda make data science reproducibility and development easier. Data Science Development (Laptop / Desktop; Windows, macOS, Linux): Analysis 1, Analysis 2, Analysis 3, each in its own conda env (1, 2, 3) on Anaconda Distribution. Data Science Reproducibility & Deployment (Laptop / Desktop / Server; Windows, macOS, Linux): the same analyses and conda envs on Anaconda Distribution, with a Docker container shown]
With Anaconda and conda, Data Scientists can:
• Manage software packages across platforms: Windows, macOS, Linux
• Isolate and recreate environments
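The isolate-and-recreate workflow named on this slide can be sketched as three conda invocations. The helper functions, the environment name `analysis1`, the package list, and the file name `environment.yml` are illustrative assumptions and do not come from the talk. The helpers only build the command lines, so nothing is installed when the sketch runs.

```python
from typing import List


def create_env_cmd(name: str, packages: List[str]) -> List[str]:
    """Build the argv for creating an isolated conda environment."""
    return ["conda", "create", "--yes", "--name", name, *packages]


def export_env_cmd(name: str) -> List[str]:
    """Build the argv that prints an environment's spec (YAML) to stdout."""
    return ["conda", "env", "export", "--name", name]


def recreate_env_cmd(spec_file: str) -> List[str]:
    """Build the argv that recreates an environment from an exported spec."""
    return ["conda", "env", "create", "--file", spec_file]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no
    # side effects; pass a list to subprocess.run(...) to execute it.
    for cmd in (
        create_env_cmd("analysis1", ["python=3.6", "pandas", "bokeh"]),
        export_env_cmd("analysis1"),
        recreate_env_cmd("environment.yml"),
    ):
        print(" ".join(cmd))
```

Redirecting the output of the export command to `environment.yml` on one machine and running the recreate command on another is one way to move an environment between laptops or onto a server.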
Introducing Anaconda Project, now available in Anaconda Distribution
Anaconda Project
Data science portable encapsulation
anaconda-project.yml
• Define and manage:
  • deployment commands
  • downloads and data
  • project package dependencies
  • multiple environments
  • environment variables (with encryption)
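As a rough illustration of the items listed on this slide, the sketch below writes a small anaconda-project.yml. The project name, packages, file names, URL, and variable name are placeholders. The key names reflect the Anaconda Project file format as understood here and should be checked against its documentation before use.

```python
from pathlib import Path
import tempfile

# Hypothetical project definition; every name, package, and URL is a placeholder.
PROJECT_YML = """\
name: example-analysis

packages:        # project package dependencies
  - python=3.6
  - pandas
  - bokeh

commands:        # deployment commands
  notebook:
    notebook: analysis.ipynb
  dashboard:
    bokeh_app: dashboard

downloads:       # downloads and data
  DATASET:
    url: https://example.com/dataset.csv

variables:       # environment variables the project needs
  - DB_PASSWORD

env_specs:       # multiple environments
  default:
    packages:
      - scikit-learn
  test:
    packages:
      - pytest
"""


def write_project_file(project_dir: Path) -> Path:
    """Write the sketch to <project_dir>/anaconda-project.yml and return its path."""
    project_dir.mkdir(parents=True, exist_ok=True)
    target = project_dir / "anaconda-project.yml"
    target.write_text(PROJECT_YML)
    return target


if __name__ == "__main__":
    # Write into a throwaway directory and echo the file back.
    with tempfile.TemporaryDirectory() as tmp:
        print(write_project_file(Path(tmp) / "example-analysis").read_text())
```

Keeping commands, data downloads, variables, and packages in one file is what lets a collaborator run the project from a fresh checkout, for example with `anaconda-project run`.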
Anaconda Project
Data science portable encapsulation
• Lock your environments:
  • package versions, down to the build numbers
  • platforms
  • packages by platform
Note: This file is automatically generated for you by Anaconda Project
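To make "down to the build numbers" and "packages by platform" concrete, here is a minimal sketch of how fully pinned specs of the form name=version=build could be grouped per platform and resolved for one machine. The `LOCKED` dictionary is a hypothetical stand-in and not the actual lock file layout; its package versions and builds are illustrative only.

```python
from typing import Dict, List, NamedTuple


class Pin(NamedTuple):
    name: str
    version: str
    build: str


def parse_pin(spec: str) -> Pin:
    """Split a fully pinned conda spec of the form name=version=build."""
    parts = spec.split("=")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a fully pinned spec: {spec!r}")
    return Pin(*parts)


# Hypothetical locked package sets: "all" applies on every platform, the
# other keys only on the named platform. Versions and builds are illustrative.
LOCKED: Dict[str, List[str]] = {
    "all": ["pandas=0.20.3=py36_0", "bokeh=0.12.6=py36_0"],
    "linux-64": ["libgcc=5.2.0=0"],
    "win-64": ["vs2015_runtime=14.0.25420=0"],
}


def pins_for(platform: str, locked: Dict[str, List[str]] = LOCKED) -> List[Pin]:
    """Return the shared pins followed by the pins specific to one platform."""
    specs = locked.get("all", []) + locked.get(platform, [])
    return [parse_pin(s) for s in specs]


if __name__ == "__main__":
    for pin in pins_for("linux-64"):
        print(pin.name, pin.version, pin.build)
```

In practice the lock file is generated rather than written by hand, as the note on the slide says; running `anaconda-project lock` in a project directory is the usual way to produce it.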
[Diagram: Anaconda Project brings additional capabilities for data science reproducibility and development. Data Science Development (Laptop / Desktop; Windows, macOS, Linux): Project 1, Project 2, Project 3. Data Science Reproducibility & Deployment (Laptop / Desktop / Server; Windows, macOS, Linux): Project 1, Project 2, Project 3]
With Anaconda Project, Data Scientists can:
• Export environments (with pinned versions) cross-platform
• Manage other dependencies: data, variables, commands, services
• Get the right abstraction for data scientists (e.g. Docker)
[Diagram: Anaconda Enterprise makes project collaboration and deployment secure and scalable. Data Science Development (Laptop / Desktop): Project 1, Project 2, Project 3. Data Science Reproducibility, Development and Deployment (Anaconda Enterprise): Project 1, Project 2, Project 3 running in Container 1, Container 2, Container 3, Container 4]
[Diagram: Deploy. Project 1 and Project 2 deployed as notebooks, models (REST APIs), dashboards, applications]
With Anaconda Enterprise, Data Scientists can:
• Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
• Collaborate with other data scientists
• Securely share deployments with other applications and users
Anaconda Project & JupyterLab powering Anaconda Enterprise v5
JupyterLab is the default experience in Anaconda Enterprise
Anaconda Project Lab extension
• Manage your Anaconda Project dependencies from a graphical interface inside JupyterLab
Nicholas Bollweg
GitHub: bollwyvl
Anaconda helps Data Scientists reproduce and deploy their projects
• Anaconda Distribution and conda:
  • Manage software packages across platforms: Windows, macOS, Linux
  • Isolate and recreate environments
• Anaconda Project:
  • Export environments (with pinned versions) cross-platform
  • Manage other dependencies: data, variables, commands, services
  • Get the right abstraction for data scientists (e.g. Docker)
• Anaconda Enterprise:
  • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
  • Collaborate with other data scientists
  • Securely share deployments with other applications and users
https://speakerdeck.com/chdoig
@ch_doig
Questions?