
End-to-End AutoML with Ludwig on Ray (Travis Addair)


Ludwig is an open source AutoML framework that allows you to train and deploy state-of-the-art deep learning models with no code required. With a single parameter on the command line, the same Ludwig configuration used to train models on your local machine can be scaled to train on massive datasets across hundreds of machines in parallel using Ray. In this talk, we'll show you how Ludwig combines Dask on Ray for distributed out-of-memory data preprocessing, Horovod on Ray for distributed training, and Ray Tune for hyperparameter optimization together into a single end-to-end solution you can run on your existing Ray cluster.


Anyscale

July 13, 2021

Transcript

  1. End-to-end AutoML with Ludwig on Ray
     Travis Addair // travis@predibase.com
     Making ML at scale simple & flexible, declaratively
  2. What are the problems today? ML projects take 6-18 months for most
     organizations, and data scientists are a precious and limited resource.
     ML today is limited and slow: solutions tend to be a black box and a
     dead end. AutoML has become a bad word in DS orgs because it lacks
     flexibility and introspection for expert users. The dilemma: we need
     solutions that are simpler, but also more flexible.
  3. Trends in Modern Data Infrastructure: high-level tools are simpler to
     use; low-level tools are more flexible.
  4. Trends in Machine Learning Frameworks: high-level tools are simpler to
     use; low-level tools are more flexible.
  5. Ludwig on Ray: Declarative Framework for DL at Scale. General models
     (stochastic gradient descent; transformers for text, vision, and
     tabular) + structured data (data lakes to cloud data warehouses, ETL
     to ELT, unified schemas/storage) = declarative ML.
  6. What is Ludwig? Ludwig is a low-code declarative framework to build
     deep neural networks. It supports multi-task learning and
     mixed-modality data types. Developed and used in production at Uber,
     it is now a Linux Foundation project: 2,400+ downloads/month, 7,700+
     stars on GitHub, 60+ contributors.
  7. Ludwig Architecture: every input column (category, numerical, binary,
     text, image, audio, ...) passes through Preproc → Encode; the encoded
     inputs are merged by a Combine step; each output column (category,
     numerical, binary, text, image, audio, ...) is produced by
     Decode → Postproc.
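The encoder-combiner-decoder flow described above can be sketched as a toy pipeline. This is a pure-Python illustration with hypothetical stand-in functions; in Ludwig the encoders, combiner, and decoders are trainable neural modules, not hand-written rules.

```python
# Toy sketch of Ludwig's encoder -> combiner -> decoder data flow.
# All functions here are illustrative stand-ins, not Ludwig's API.

def encode_category(value, vocab={"basic": 0, "pro": 1}):
    # Real encoder: index lookup followed by a learned embedding.
    return [float(vocab[value])]

def encode_numerical(value):
    # Real encoder: normalization plus optional fully-connected layers.
    return [value / 100.0]

def combine(encodings):
    # Simplest combiner: concatenate all encoded inputs.
    return [x for enc in encodings for x in enc]

def decode_binary(hidden):
    # Real decoder: a learned head with a sigmoid; here, a fixed threshold.
    return sum(hidden) > 0.5

row = {"plan": "pro", "amount": 36.0}
hidden = combine([encode_category(row["plan"]), encode_numerical(row["amount"])])
prediction = {"churn": decode_binary(hidden)}
```

Because every input type shares the same encode/combine step and every output type shares the same decode step, new tasks fall out of recombining columns rather than writing new model code.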
  8. Ludwig Task Flexibility (input encoders → combine → output decoder):
     category + numerical + binary → numerical: regression;
     text → category: text classification;
     image → text: image captioning;
     audio (×2) → binary: speech verification;
     time series → numerical: forecasting;
     any input → binary: binary classification; ...
  9. Why declarative solves our earlier dilemma

  10. Why declarative? Models as config. Abstract away the complexity of
      scale, optimization, and productionization into a "query planner"
      (i.e., AutoML / hyperopt). Retain full flexibility over model
      properties: specify what you want, and Ludwig will determine how to
      optimally deliver it.

      input_features:
        - name: utterance
          type: text
          encoder: rnn
          cell_type: lstm
          num_layers: 2
      output_features:
        - name: class
          type: category
      training:
        learning_rate: 0.001
        optimizer:
          type: adam

      Easy to install: pip install ludwig
      Programmatic API (one line to create, train, and use a model):
        from ludwig.api import LudwigModel
        model = LudwigModel(config)
        model.train(train_data)
        predictions = model.predict(other_data)
      Model serving: ludwig serve --model_path <model_path>
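The YAML configuration shown on this slide maps directly onto a plain Python dict with the same structure, which is what the programmatic API consumes. A minimal sketch, with the Ludwig calls guarded since they require `pip install ludwig` and a real dataset (the file names below are placeholders):

```python
# The slide's YAML config expressed as the equivalent Python dict.
config = {
    "input_features": [
        {"name": "utterance", "type": "text", "encoder": "rnn",
         "cell_type": "lstm", "num_layers": 2},
    ],
    "output_features": [
        {"name": "class", "type": "category"},
    ],
    "training": {
        "learning_rate": 0.001,
        "optimizer": {"type": "adam"},
    },
}

# Guarded: only runs where Ludwig is installed and data is available.
try:
    from ludwig.api import LudwigModel
    model = LudwigModel(config)
    # model.train(dataset="train.csv")          # placeholder path
    # predictions = model.predict(dataset="other.csv")
except ImportError:
    pass
```

The same dict (or YAML file) drives the CLI, the Python API, and the scaled-out Ray execution paths described later in the talk.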
  11. Ludwig is SOTA on NLP & Tabular Tasks.

      Model benchmarks:
      Model           Task benchmark                   Accuracy
      RoBERTa ('19)   Yelp Review Classification       97.3%
      TabNet ('19)    Higgs-Boson Particle Detection   78.5%
      T5 ('20)        Multi-dimensional Gender Bias    89.4%

      Tabular comparison:
      Dataset             XGBoost   TabNet paper   Ludwig
      Forest Tree Cover   0.8934    0.9699         0.9508
      Higgs Boson         -         0.7884         0.7846
      Poker Hands         0.711     0.992          0.9914
  12. Ludwig before Ray

  13. Ludwig before Ray: only runs locally; data must fit in memory; no
      distributed training; no parallel evaluation.
  14. Ludwig on Ray

  15. Ludwig on Ray: Dask for distributed data, Horovod for distributed
      training, Ray for remote execution, Dask for distributed evaluation.
  16. Configuring Ray for Ludwig:

      cluster_name: ludwig-ray-gpu-nightly
      min_workers: 4
      max_workers: 4
      docker:
        image: "ludwigai/ludwig-ray-gpu:nightly"
        container_name: "ray_container"
      head_node:
        InstanceType: c5.2xlarge
        ImageId: latest_dlami
      worker_nodes:
        InstanceType: g4dn.xlarge
        ImageId: latest_dlami
  17. Running Ludwig:
      $ ludwig train --config config.yaml --dataset s3://mybucket/dataset.parquet

      Running Ludwig on Ray:
      $ ray up cluster.yaml
      $ ray submit cluster.yaml ludwig train --config config.yaml --dataset s3://mybucket/dataset.parquet
  18. Running Ludwig:
      $ ludwig train --config config.yaml --dataset s3://mybucket/dataset.parquet

      Running Ludwig on Ray:
      $ ray up cluster.yaml
      $ ray submit cluster.yaml ludwig train --config config.yaml --dataset s3://mybucket/dataset.parquet
  19. Ludwig with Ray Tune

  20. Ludwig Hyperopt with Ray Tune:

      hyperopt:
        parameters:
          training.learning_rate:
            space: loguniform
            lower: 0.01
            upper: 0.1
          combiner.num_fc_layers:
            space: randint
            lower: 2
            upper: 6
        goal: minimize
        executor:
          type: ray
        sampler:
          type: ray
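To make the space definitions concrete, here is a rough pure-Python sketch of what sampling from them means. This is illustrative only: Ludwig delegates the actual sampling to Ray Tune, and `sample_space` is a hypothetical helper, not Ludwig's API. The `randint` upper bound is treated as exclusive, matching Ray Tune's convention.

```python
import math
import random

def sample_space(spec, rng=random):
    """Draw one value from a Ludwig-style hyperopt space spec (sketch)."""
    if spec["space"] == "loguniform":
        # Uniform in log space, so each order of magnitude is equally likely.
        lo, hi = math.log(spec["lower"]), math.log(spec["upper"])
        return math.exp(rng.uniform(lo, hi))
    if spec["space"] == "randint":
        # Integer in [lower, upper), as in Ray Tune's randint.
        return rng.randrange(spec["lower"], spec["upper"])
    raise ValueError(f"unknown space: {spec['space']}")

# The two parameters from the config above, as dicts.
params = {
    "training.learning_rate": {"space": "loguniform", "lower": 0.01, "upper": 0.1},
    "combiner.num_fc_layers": {"space": "randint", "lower": 2, "upper": 6},
}
trial = {name: sample_space(spec) for name, spec in params.items()}
```

Each Ray Tune trial receives one such sampled dict, and the dotted names (`training.learning_rate`) are written back into the nested Ludwig config before training.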
  21. Ludwig Hyperopt with Ray Tune (advanced):

      hyperopt:
        parameters:
          training.learning_rate:
            space: loguniform
            lower: 0.01
            upper: 0.1
          combiner.num_fc_layers:
            space: randint
            lower: 2
            upper: 6
        goal: minimize
        executor:
          type: ray
        sampler:
          type: ray
          search_algo:
            type: bohb
          scheduler:
            type: hb_bohb
            time_attr: training_iteration
            reduction_factor: 4
          num_samples: 100
  22. Ludwig AutoML: no config needed (experimental):

      import ludwig

      dataset = "s3://mybucket/dataset.train.parquet"
      model = ludwig.auto_train(target='label', dataset=dataset)
      model.predict("s3://mybucket/dataset.test.parquet")
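Before it can generate a config, auto_train has to infer a Ludwig feature type for each column of the dataset. A simplified, hypothetical sketch of that kind of heuristic (the real Ludwig AutoML type inference is considerably more involved; `infer_feature_type` and the sample data are illustrative):

```python
def infer_feature_type(values):
    """Guess a Ludwig feature type from a column's values (sketch only)."""
    distinct = set(values)
    if distinct <= {0, 1, True, False}:
        return "binary"
    if all(isinstance(v, (int, float)) for v in distinct):
        return "numerical"
    if len(distinct) <= max(10, len(values) // 10):
        # Few distinct values relative to row count -> treat as categorical.
        return "category"
    return "text"

# Tiny illustrative dataset; the target column is "churn".
rows = {
    "churn": [0, 1, 0],
    "amount": [11000.0, 18000.0, 36000.0],
    "plan": ["basic", "pro", "basic"],
}
config = {
    "input_features": [
        {"name": n, "type": infer_feature_type(v)}
        for n, v in rows.items() if n != "churn"
    ],
    "output_features": [
        {"name": "churn", "type": infer_feature_type(rows["churn"])}
    ],
}
```

The generated config then flows into the same hyperopt-on-Ray machinery shown on the previous slides, which is what makes the one-call auto_train interface possible.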
  23. A sneak peek into Predibase: the Predictive Database. Simplicity &
      flexibility of declarative ML + rise of data warehousing & analytics
      + scalability of Horovod & Ray.
  24. A sneak peek into Predibase. Our mission: use Ludwig and Horovod to
      make machine learning as simple and fast as writing a query.

      GIVEN * FROM Customers WHERE start > '03-01-20' PREDICT churn

      ID         Date       Amount    Churn
      Claim #7   10-31-20   $11,000   Retained
      Claim #8   11-15-20   $18,000   Retained
      Claim #9   12-20-20   $36,000   Churned

      Reach out to travis@predibase.com if you're interested in hearing
      more. We are hiring!