Experimentation with Jupyter, Papermill, and MLFlow

Shadab Hussain
September 17, 2020

Presented at "One Week Workshop on the Internet of Things (IoT)" under the ATAL Program Sponsored by AICTE and organized by "The Department of Computer Science & Technology (Central University of Jharkhand, Ranchi)"

Transcript

  1. A tool for data science at scale
    • Rich web client (HTML, images, videos, LaTeX)
    • Code (40+ programming languages)
    • Results
    • Share (email, Dropbox, GitHub)
    • Reproduce
    https://jupyter.org/
    https://towardsdatascience.com/optimizing-jupyter-notebook-tips-tricks-and-nbextensions-26d75d502663
  2. Running Papermill
    [Diagram: notebook sources (database, file, services) and parameters (index values SNP, GOLD, SSE, HANGSENG, NIKKEI) are the input to the runtime manager. The runtime manager passes the source notebook and parameter values to the runtime process, which executes cells and exchanges kernel messages and streamed I/O messages. The output notebook is stored in notebook sinks (database, file, services).]
  3. It adds notebook isolation.
    • Immutable inputs
    • Immutable outputs
    • Parameterization of notebook runs
    • Configurable sourcing/sinking
    Solves our problems for automated execution! How does this change the notebook experience?
  4. Components of MLflow: an open source platform to manage the ML lifecycle
    • Tracking: record and query experiments (code, data, config, results)
    • Projects: packaging data science code for reproducible runs on any platform
    • Models: general format for sending models to diverse deploy tools
    • Registry: store, annotate and manage models in a repository
    https://medium.com/faun/mlflow-on-google-cloud-platform-cd8c9b04a2d8
    https://mlflow.org/docs/latest/index.html
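  5. Model Flavors
    [Diagram: run sources produce a model in the model format, which packages one or more flavors (Flavor1, Flavor2). The model is consumed by inference code, batch & stream processing, and serving tools.]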
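
The slides describe parameterized Papermill runs and MLflow tracking without showing code, so the following is a minimal sketch of how the two might be combined, not the presenter's implementation. The index names come from the "Running Papermill" slide. The notebook name `forecast.ipynb`, the parameter name `index`, the experiment name, and both helper functions are assumptions made for illustration. The sketch assumes the source notebook has a cell tagged `parameters`, which is where Papermill injects the values.

```python
from pathlib import Path

# Index names taken from the "Running Papermill" slide.
INDEXES = ["SNP", "GOLD", "SSE", "HANGSENG", "NIKKEI"]


def plan_runs(source_notebook, output_dir, indexes=INDEXES):
    """Build one run plan per index (hypothetical helper).

    The source notebook is treated as an immutable input: every run reads
    the same file and writes to its own output path.
    """
    source = Path(source_notebook)
    out_dir = Path(output_dir)
    plans = []
    for index in indexes:
        plans.append({
            "input_path": str(source),
            "output_path": str(out_dir / f"{source.stem}_{index}.ipynb"),
            "parameters": {"index": index},  # "index" is an assumed parameter name
        })
    return plans


def execute_and_track(plans, experiment_name="index-forecast"):
    """Execute each planned run with Papermill and record it in MLflow."""
    # Imported lazily so that plan_runs works without these packages installed.
    import mlflow
    import papermill as pm

    mlflow.set_experiment(experiment_name)
    for plan in plans:
        # Make sure the output directory exists before Papermill writes to it.
        Path(plan["output_path"]).parent.mkdir(parents=True, exist_ok=True)
        with mlflow.start_run(run_name=plan["parameters"]["index"]):
            # Record the parameter values used for this run.
            mlflow.log_params(plan["parameters"])
            # Run the notebook with the injected parameters; the executed
            # copy is written to output_path and the source is left untouched.
            pm.execute_notebook(
                plan["input_path"],
                plan["output_path"],
                parameters=plan["parameters"],
            )
            # Attach the executed notebook to the MLflow run as an artifact.
            mlflow.log_artifact(plan["output_path"])
```

With Papermill, MLflow, and a Jupyter kernel installed, `execute_and_track(plan_runs("forecast.ipynb", "runs"))` would produce one executed notebook per index under `runs/` and one MLflow run for each. Keeping the planning step separate from execution makes the inputs, outputs, and parameters of every run easy to inspect before anything is executed.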