
End-to-End AutoML with Ludwig on Ray (Travis Addair)

Ludwig is an open source AutoML framework that allows you to train and deploy state-of-the-art deep learning models with no code required. With a single parameter on the command line, the same Ludwig configuration used to train models on your local machine can be scaled to train on massive datasets across hundreds of machines in parallel using Ray. In this talk, we'll show you how Ludwig combines Dask on Ray for distributed out-of-memory data preprocessing, Horovod on Ray for distributed training, and Ray Tune for hyperparameter optimization into a single end-to-end solution you can run on your existing Ray cluster.

Anyscale

July 13, 2021

Transcript

  1. Travis Addair // [email protected]
     Making ML at scale simple & flexible, declaratively
     End-to-end AutoML with Ludwig on Ray
  2. What are the problems today?
     - ML projects take 6-18 months for most organizations
     - Data scientists are a precious & limited resource
     - ML today is limited and slow
     - Solutions tend to be a black box and a dead end
     - AutoML has become a bad word in DS orgs
     - Lacks flexibility & introspection for expert users
     - Dilemma: we need solutions that are simpler, but also more flexible
  3. Ludwig on Ray: Declarative Framework for DL at Scale
     - General Models: stochastic gradient descent; transformers for text, vision, and tabular data
     - Structured Data: data lakes to cloud data warehouses, ETL to ELT, unified schemas / storage
     - General Models + Structured Data = Declarative ML
  4. What is Ludwig?
     - Ludwig is a low-code declarative framework to build deep neural networks
     - Supports multi-task learning and mixed-modality data types
     - Developed and used in production at Uber; now a Linux Foundation project
     - 2400+ downloads/month, 7700+ stars on GitHub, 60+ contributors
  5. Ludwig Architecture
     Every input column (category, numerical, binary, text, image, audio, ...) flows through
     its own Preproc and Encode steps; the per-feature encodings are merged by a single
     Combine step; each output column (same set of types) then flows through its own
     Decode and Postproc steps.
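The per-column flow above can be sketched in plain Python. Everything here is an illustrative toy (the functions are made up, not Ludwig APIs); it only shows the shape of the pipeline: per-feature encoders feeding one combiner, whose output feeds per-feature decoders.

```python
# Toy sketch of Ludwig's encoder-combiner-decoder data flow.
# All functions below are illustrative stand-ins, not Ludwig's APIs.

def preprocess(column):
    # e.g., tokenize text or normalize numbers; here a pass-through
    return list(column)

def encode(column):
    # each input feature gets its own encoder producing a representation;
    # here we fake an "embedding" as a deterministic float per value
    return [sum(ord(c) for c in str(v)) % 100 / 100 for v in column]

def combine(encodings):
    # the combiner merges per-feature encodings (here: element-wise sum)
    return [sum(vals) for vals in zip(*encodings)]

def decode(combined):
    # each output feature gets its own decoder; here a simple threshold
    return ["yes" if v > 1.0 else "no" for v in combined]

inputs = {"age": [25, 40], "text": ["hi", "bye"]}
encoded = [encode(preprocess(col)) for col in inputs.values()]
predictions = decode(combine(encoded))
print(predictions)  # one prediction per input row
```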
  6. Ludwig Task Flexibility
     - Category + Numerical + Binary encoders → Numerical decoder: Regression
     - Text encoder → Category decoder: Text Classification
     - Image encoder → Text decoder: Image Captioning
     - Audio encoder → Binary decoder: Speech Verification
     - Audio + Time series encoders → Numerical decoder: Forecasting
     - Any encoder → Binary decoder: Binary Classification
  7. Why declarative?
     - Abstract away the complexity of scale, optimization, and productionization into a
       "query planner" (i.e., AutoML / hyperopt)
     - Retain full flexibility over model properties: specify what you want, and Ludwig
       will determine how to optimally deliver it.

     Models as config:

     ```yaml
     input_features:
       - name: utterance
         type: text
         encoder: rnn
         cell_type: lstm
         num_layers: 2
     output_features:
       - name: class
         type: category
     training:
       learning_rate: 0.001
       optimizer:
         type: adam
     ```

     Easy to install: `pip install ludwig`

     Programmatic API (one line to create, train, and use a model):

     ```python
     from ludwig.api import LudwigModel

     model = LudwigModel(config)
     model.train(train_data)
     predictions = model.predict(other_data)
     ```

     Model serving: `ludwig serve --model_path <model_path>`
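The YAML configuration on this slide maps directly onto the Python dict you pass to `LudwigModel`. As a minimal sketch (built as a plain dict so it runs without Ludwig installed):

```python
# The declarative config from the slide, expressed as the Python dict
# equivalent of the YAML. Built as a plain dict so this sketch runs
# without Ludwig installed.
config = {
    "input_features": [
        {
            "name": "utterance",
            "type": "text",
            "encoder": "rnn",
            "cell_type": "lstm",
            "num_layers": 2,
        }
    ],
    "output_features": [{"name": "class", "type": "category"}],
    "training": {"learning_rate": 0.001, "optimizer": {"type": "adam"}},
}

# With Ludwig installed, you would then run (not executed here):
#   from ludwig.api import LudwigModel
#   model = LudwigModel(config)
#   model.train(train_data)
print(config["input_features"][0]["encoder"])
```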
  8. Ludwig is SOTA on NLP & Tabular Tasks

     Model benchmarks:

     | Model         | Task                           | Accuracy |
     |---------------|--------------------------------|----------|
     | RoBERTa ('19) | Yelp Review Classification     | 97.3%    |
     | Tabnet ('19)  | Higgs-Boson Particle Detection | 78.5%    |
     | T5 ('20)      | Multi-dimensional Gender Bias  | 89.4%    |

     Tabular comparison:

     | Dataset           | XGBoost Accuracy | Tabnet Paper Accuracy | Ludwig Accuracy |
     |-------------------|------------------|-----------------------|-----------------|
     | Forest Tree Cover | 0.8934           | 0.9699                | 0.9508          |
     | Higgs Boson       | -                | 0.7884                | 0.7846          |
     | Poker Hands       | 0.711            | 0.992                 | 0.9914          |
  9. Ludwig before Ray
     - Only runs locally
     - Data must fit in memory
     - No distributed training
     - No parallel evaluation
  10. Ludwig on Ray
      - Dask: distributed data
      - Horovod: distributed training
      - Ray: remote execution
      - Dask: distributed evaluation
  11. Configuring Ray for Ludwig

      ```yaml
      cluster_name: ludwig-ray-gpu-nightly
      min_workers: 4
      max_workers: 4
      docker:
        image: "ludwigai/ludwig-ray-gpu:nightly"
        container_name: "ray_container"
      head_node:
        InstanceType: c5.2xlarge
        ImageId: latest_dlami
      worker_nodes:
        InstanceType: g4dn.xlarge
        ImageId: latest_dlami
      ```
  12. Running Ludwig

      ```shell
      $ ludwig train --config config.yaml --dataset s3://mybucket/dataset.parquet
      ```
  13. Running Ludwig on Ray

      ```shell
      $ ray up cluster.yaml
      $ ray submit cluster.yaml ludwig train --config config.yaml --dataset s3://mybucket/dataset.parquet
      ```
  14. Ludwig Hyperopt with Ray Tune

      ```yaml
      hyperopt:
        parameters:
          training.learning_rate:
            space: loguniform
            lower: 0.01
            upper: 0.1
          combiner.num_fc_layers:
            space: randint
            lower: 2
            upper: 6
        goal: minimize
        executor:
          type: ray
        sampler:
          type: ray
      ```
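To make the search spaces concrete: `loguniform` samples uniformly in log space (so small learning rates are explored as thoroughly as large ones) and `randint` samples integers in a range. A stand-alone sketch of the sampling, not Ray Tune's implementation (the upper bound is treated as exclusive for `randint`, as in Ray Tune):

```python
import math
import random

random.seed(0)  # deterministic for the sake of the sketch

def sample_loguniform(lower, upper):
    # uniform in log space: exp(U(log lower, log upper))
    return math.exp(random.uniform(math.log(lower), math.log(upper)))

def sample_randint(lower, upper):
    # integer in [lower, upper); upper treated as exclusive here
    return random.randrange(lower, upper)

# Draw a few candidate configs like the hyperopt section describes
trials = [
    {
        "training.learning_rate": sample_loguniform(0.01, 0.1),
        "combiner.num_fc_layers": sample_randint(2, 6),
    }
    for _ in range(5)
]
for trial in trials:
    print(trial)
```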
  15. Ludwig Hyperopt with Ray Tune (advanced)

      ```yaml
      hyperopt:
        parameters:
          training.learning_rate:
            space: loguniform
            lower: 0.01
            upper: 0.1
          combiner.num_fc_layers:
            space: randint
            lower: 2
            upper: 6
        goal: minimize
        executor:
          type: ray
        sampler:
          type: ray
          search_algo:
            type: bohb
          scheduler:
            type: hb_bohb
            time_attr: training_iteration
            reduction_factor: 4
          num_samples: 100
      ```
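The `hb_bohb` scheduler prunes unpromising trials between rungs rather than training all 100 samples to completion. A toy successive-halving sketch with `reduction_factor: 4`, using random scores as stand-ins for validation metrics (illustrative only, not Ray Tune's scheduler):

```python
import random

random.seed(1)

def successive_halving(num_trials, reduction_factor):
    # Start all trials; after each rung keep only the top
    # 1/reduction_factor, giving survivors a larger training budget.
    # Scores are random here; a real scheduler would use validation
    # metrics reported to Ray Tune at each training_iteration.
    trials = [{"id": i, "score": random.random()} for i in range(num_trials)]
    rung = 0
    while len(trials) > 1:
        trials.sort(key=lambda t: t["score"])  # goal: minimize
        keep = max(1, len(trials) // reduction_factor)
        trials = trials[:keep]
        rung += 1
        # surviving trials would now train reduction_factor x longer
    return trials[0], rung

best, rungs = successive_halving(num_trials=100, reduction_factor=4)
print(best["id"], rungs)  # 100 -> 25 -> 6 -> 1 trials over 3 rungs
```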
  16. Ludwig AutoML: no config needed (experimental)

      ```python
      import ludwig

      dataset = "s3://mybucket/dataset.train.parquet"
      model = ludwig.auto_train(target='label', dataset=dataset)
      model.predict("s3://mybucket/dataset.test.parquet")
      ```
  17. A sneak peek into Predibase: Predictive Database
      - Simplicity & flexibility of declarative ML
      - Rise of data warehousing & analytics
      - Scalability of Horovod & Ray
  18. A sneak peek into Predibase
      Our mission: use Ludwig and Horovod to make machine learning as simple and fast
      as writing a query.

      ```sql
      PREDICT churn
      GIVEN * FROM Customers WHERE start > '03-01-20'
      ```

      | ID       | date     | amount  | churn    |
      |----------|----------|---------|----------|
      | Claim #7 | 10-31-20 | $11,000 | Retained |
      | Claim #8 | 11-15-20 | $18,000 | Retained |
      | Claim #9 | 12-20-20 | $36,000 | Churned  |

      We are hiring! Reach out to [email protected] if you're interested in hearing more.
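A predictive query like the one on this slide can be thought of as a (target, columns, table, filter) tuple that a system would translate into a Ludwig config and training job. A toy parser for that idea; the query syntax here is illustrative and may not match Predibase's actual query language:

```python
import re

# Toy parser for the predictive-query idea on the slide.
# The syntax and grammar below are illustrative assumptions.
QUERY = "PREDICT churn GIVEN * FROM Customers WHERE start > '03-01-20'"

pattern = re.compile(
    r"PREDICT\s+(?P<target>\w+)\s+GIVEN\s+(?P<columns>[\w*]+)\s+"
    r"FROM\s+(?P<table>\w+)(?:\s+WHERE\s+(?P<filter>.+))?"
)
match = pattern.match(QUERY)
parsed = match.groupdict()
# parsed now separates what to predict (the output feature) from
# where the training data comes from (table + row filter)
print(parsed)
```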