
DataTalk#40 1/3 -- Polyaxon for experiment management in data science


Speaker Victor Fomin, Magellium

Polyaxon is an open-source solution dedicated to managing the life cycle of machine/deep learning applications. The platform abstracts away the infrastructure, handles dependency management, and makes machine/deep learning experiments reproducible. It supports most popular frameworks (TF, Keras, PyTorch, MXNet, scikit-learn, etc.) and integrates into existing code (e.g. Python) in a few lines. Polyaxon can be deployed on any kind of Kubernetes cluster, local or cloud. User management and a browser-based graphical interface make it easy to work as a team and to follow projects and experiments. The platform is available in CE and EE versions.

Toulouse Data Science

November 19, 2019



Transcript

  1. Victor FOMIN
    DL engineer @ Magellium
    Reproducing and managing machine learning
    and deep learning applications
    #40 Multi-topic: Experiment management in data science - 19 Nov 2019


  2. What we will cover
    Polyaxon platform
    ● What is it? Why is it important?
    ● Demo
    ● Outlook


  3. Imagine a world without VCS
    ● Pain to share the code with the team
    ● Pain to restore a previous version
    ● Pain to review changes
    The machine learning world can resemble this ...
    ● without code versioning
    ● without dataset versioning
    ● without experiment tracking and reproducibility
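
To make the three missing pieces above concrete, here is a minimal sketch of how a run's inputs (code version, dataset, parameters) could be fingerprinted so that two runs can be checked for identical inputs. It uses only the Python standard library and is not Polyaxon's API; the function name `fingerprint` and the sample values are hypothetical.

```python
import hashlib
import json


def fingerprint(code_version: str, dataset_bytes: bytes, params: dict) -> str:
    """Return a short hash identifying the inputs of one experiment (hypothetical helper)."""
    h = hashlib.sha256()
    # Code version, e.g. a VCS commit id supplied by the caller
    h.update(code_version.encode("utf-8"))
    # Dataset version: hash of the raw dataset content
    h.update(hashlib.sha256(dataset_bytes).digest())
    # sort_keys makes the hash independent of dict insertion order
    h.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    return h.hexdigest()[:12]


run_a = fingerprint("abc123", b"x,y\n1,2\n", {"lr": 0.01, "epochs": 5})
run_b = fingerprint("abc123", b"x,y\n1,2\n", {"epochs": 5, "lr": 0.01})
run_c = fingerprint("abc123", b"x,y\n1,3\n", {"lr": 0.01, "epochs": 5})

print(run_a == run_b)  # True: same inputs, only the key order differs
print(run_a == run_c)  # False: the dataset content changed
```

Storing such a fingerprint next to each result is one simple way to tell whether two results are comparable at all.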


  4. Imagine a world without tests and CI/CD
    ● committing new code is nerve-racking
    ● deployment is not joyful
    ○ we are happy to announce the new release of ...
    The machine learning world can resemble this ...
    ● without code testing
    ● without dataset checking
    ● without easy input/output management
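
As an illustration of what "dataset checking" might look like in practice, the sketch below validates rows before training, in the same spirit as a unit test run in CI. It is plain Python with hypothetical names (`check_dataset`, the column names, the labels) and does not come from the talk or from Polyaxon.

```python
def check_dataset(rows, expected_columns, label_column, allowed_labels):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    for i, row in enumerate(rows):
        # Every row must contain all expected columns
        missing = [c for c in expected_columns if c not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        # Labels must belong to the known set of classes
        if row[label_column] not in allowed_labels:
            problems.append(f"row {i}: unexpected label {row[label_column]!r}")
    return problems


rows = [
    {"image": "a.png", "label": "cat"},
    {"image": "b.png", "label": "bird"},  # label outside the allowed set
    {"image": "c.png"},                   # label column missing
]
problems = check_dataset(rows, ["image", "label"], "label", {"cat", "dog"})
for p in problems:
    print(p)
```

Running such a check as the first step of every experiment fails fast on bad data instead of producing a silently wrong model.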


  5. Polyaxon
    ● is an open-source tool (one among others) that brings
    order to the chaos of ML/DL applications:
    ○ scalable infrastructure
    ○ automatic experiment tracking
    ○ framework/language agnostic
    ○ possibility to reproduce results
    ○ team collaboration
    ● it addresses some of the technical debt in ML systems (1)
    ○ unstable data dependencies
    ○ unstable software dependencies
    ○ configuration debt
    ○ reproducibility
    1 - https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
    What is it? Why is it important?
    https://polyaxon.com/


  6. Polyaxon - What is it?
    A platform running on a Kubernetes cluster
    ● Orchestration: jobs and experiments
    ● Tracking: logs, metrics, artifacts, params, ...
    ● Visualization: dashboard, Tensorboard
    ● Collaboration: user management
    ● Additional features:
    ○ HP tuning
    ○ Pipelines
    ○ Distributed computations
    ○ ...
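
The tracking and HP tuning items above can be illustrated with a toy example: each run records its parameters and metrics, and a grid search launches one run per parameter combination. This is a self-contained sketch in plain Python, not the Polyaxon client; `Run`, `toy_objective`, and `grid_search` are hypothetical names, and the objective is a stand-in for a real training job.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Run:
    """One tracked experiment: its parameters and the metrics it reported."""
    params: dict
    metrics: dict = field(default_factory=dict)


def toy_objective(lr, batch_size):
    # Stand-in for a training job; a real job would train and evaluate a model.
    return (lr - 0.01) ** 2 + (batch_size - 32) ** 2 * 1e-6


def grid_search(grid):
    """Run the objective once per combination of values in the grid."""
    runs = []
    names = sorted(grid)
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        run = Run(params=params)
        run.metrics["loss"] = toy_objective(**params)  # track the metric with its params
        runs.append(run)
    return runs


runs = grid_search({"lr": [0.001, 0.01, 0.1], "batch_size": [16, 32]})
best = min(runs, key=lambda r: r.metrics["loss"])
print(len(runs), best.params, best.metrics)
```

On a platform such as the one described here, each combination would typically be scheduled as a separate job on the cluster rather than run in a loop, and the comparison would be done in the dashboard.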


  7. Demo


  8. Outlook - CE vs EE


  9. Outlook

    ● Unified input/output data management
    ● Infrastructure abstraction
    ● Experiment tracking & reproducibility
    ● Framework/language agnostic
    ● Dashboard UI
    ○ one click to see results
    ○ easily compare experiments
    ● Platform under active development

    ● Code debugging is not integrated out of the box
    ● Kubernetes problems are not propagated well to Polyaxon
    ● Community Edition allows a single user per project


  10. Thank you for your attention!
    Any questions, remarks, suggestions?
    https://polyaxon.com/
