DataTalk#40 1/3 -- Polyaxon for managing data science experiments

Speaker: Victor Fomin, Magellium

Polyaxon is an open-source solution for managing the life cycle of machine/deep learning applications. The platform abstracts away the infrastructure, handles dependency management, and makes machine/deep learning experiments reproducible. It supports most popular frameworks (TF, Keras, PyTorch, MXNet, scikit-learn, etc.) and integrates into existing code (e.g. Python) in a few lines. Polyaxon can be deployed on any kind of Kubernetes cluster, local or cloud. User management and a browser-based graphical interface make it easy to work as a team and to follow projects and experiments. The platform is available in CE and EE versions.

Toulouse Data Science

November 19, 2019

Transcript

  1. Victor FOMIN, DL engineer @ Magellium
    Reproducing and managing machine learning and deep learning applications
    #40 Multi-topic: Experiment management in data science - 19 Nov 2019
  2. What we will cover
    Polyaxon platform
    • What is it? Why is it important?
    • Demo
    • Outlook
  3. Imagine a world without VCS
    • Pain to share the code with the team
    • Pain to restore a previous version
    • Pain to review changes
    The machine learning world can resemble this ...
    • without code versioning
    • without dataset versioning
    • without experiment tracking and reproducibility
  4. Imagine a world without tests and CI/CD
    • committing new code is nerve-racking
    • deployment is not joyful
      ◦ we are happy to announce the new release of ...
    The machine learning world can resemble this ...
    • without code testing
    • without dataset checking
    • without easy input/output management
  5. Polyaxon - What is it? Why is it important?
    • is an open-source tool (one among others) to bring order to the chaos of ML/DL applications:
      ◦ scalable infrastructure
      ◦ automatic experiment tracking
      ◦ framework/language agnostic
      ◦ possibility to reproduce results
      ◦ team collaboration
    • it addresses some of the technical debt in ML systems (1):
      ◦ unstable data dependencies
      ◦ unstable software dependencies
      ◦ configuration debt
      ◦ reproducibility
    (1) https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
    https://polyaxon.com/
  6. Polyaxon - What is it?
    Platform on a Kubernetes cluster
    • Orchestration: jobs and experiments
    • Tracking: logs, metrics, artifacts, params, ...
    • Visualization: dashboard, TensorBoard
    • Collaboration: user management
    • Additional features:
      ◦ HP tuning
      ◦ Pipelines
      ◦ Distributed computations
      ◦ ...
  7. Demo

  8. Outlook - CE vs EE

  9. Outlook
    • Unified input/output data management
    • Infrastructure abstraction
    • Experiment tracking & reproducibility
    • Framework/language agnostic
    • Dashboard UI
      ◦ one click to see results
      ◦ easily compare experiments
    • Platform under active development
    • Code debugging is not integrated out of the box
    • Kubernetes problems are not well propagated to Polyaxon
    • Community Edition is single-user per project
  10. Thank you for your attention! Any questions, remarks, suggestions?
    https://polyaxon.com/
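
The slides mention tracking of params and metrics, reproducibility, and comparing experiments. The sketch below illustrates those three ideas in plain Python using only the standard library. It is not Polyaxon's API: the names `ExperimentRun`, `log_metrics`, `config_id` and `best_run` are hypothetical and chosen only for illustration, and the choice of one JSON file per run is an assumption made to keep the example small.

```python
import hashlib
import json
import time
from pathlib import Path


class ExperimentRun:
    """Hypothetical minimal tracker (illustration only, not Polyaxon's API)."""

    def __init__(self, root, name, params):
        self.name = name
        self.params = dict(params)
        # Fingerprint of the parameters: the same configuration always
        # yields the same id, regardless of key order.
        blob = json.dumps(self.params, sort_keys=True).encode("utf-8")
        self.config_id = hashlib.sha256(blob).hexdigest()[:12]
        self.run_dir = Path(root) / name
        self.run_dir.mkdir(parents=True, exist_ok=True)
        self.metrics = []
        self.started_at = time.time()

    def log_metrics(self, step, **values):
        """Record metric values for one training step."""
        self.metrics.append({"step": step, **values})

    def save(self):
        """Write params, metrics and the config fingerprint to run.json."""
        record = {
            "name": self.name,
            "config_id": self.config_id,
            "params": self.params,
            "metrics": self.metrics,
            "started_at": self.started_at,
        }
        path = self.run_dir / "run.json"
        path.write_text(json.dumps(record, indent=2))
        return path


def best_run(root, metric):
    """Return (name, value) of the saved run with the highest last value of `metric`."""
    best = None
    for path in sorted(Path(root).glob("*/run.json")):
        record = json.loads(path.read_text())
        if not record["metrics"]:
            continue
        value = record["metrics"][-1].get(metric)
        if value is None:
            continue
        if best is None or value > best[1]:
            best = (record["name"], value)
    return best


if __name__ == "__main__":
    import tempfile

    root = tempfile.mkdtemp()
    run = ExperimentRun(root, "baseline", {"lr": 0.1, "epochs": 2})
    run.log_metrics(step=1, accuracy=0.70)
    run.log_metrics(step=2, accuracy=0.80)
    run.save()
    print(run.config_id, best_run(root, "accuracy"))
```

A platform such as Polyaxon does this kind of bookkeeping for the user and adds the dashboard, orchestration and user management on top; the sketch only shows why recording parameters and metrics together makes runs comparable and repeatable.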