
Faster time series forecasting using Ray and Anyscale

Forecasting is an important part of running any business. The "embarrassingly parallel" pattern of distributed computing applies naturally to per-series statistical algorithms such as ARIMA, Prophet, and Neural Prophet, which train one model per time series. For global deep learning algorithms, Ray makes both data and model parallelism simple.

In this webinar, we will demo both patterns. For the deep learning algorithm, we will show how to take Google's Temporal Fusion Transformer, available in PyTorch Forecasting, and turn training and inference into distributed applications using Ray. Ray code runs in parallel across the cores of a local laptop, and the exact same code can run in parallel across a cluster in any cloud.

Anyscale

January 20, 2022

Transcript

  1. Faster forecasting using Ray and Anyscale
     Christy Bergman (@cbergman), Developer Advocate, Anyscale
     Amog Kamsetty, Software Engineer, Ray Train, Anyscale
  2. Overview
     • Forecasting: introduction and challenges
     • Why Ray?
     • Ray overview and integration with PyTorch Lightning
     • Forecasting with Ray
       ◦ Demo on laptop (Ray) and cloud (Anyscale)
     • Q & A
  3. What is a forecast?
     Typically a forecast is a time series that predicts both the quantity and timing of something.
     Example: a demand forecast predicts the quantity and timing of sales for a particular product.
  4. Forecasting: existing tools
     Statistical algorithms: individually train a different model for each time series.
     - ARIMA, Prophet, Neural Prophet
     Deep learning algorithms: train a global model for all time series at once.
     - RNN, CNN, Temporal Fusion Transformer
  5. Forecasting: existing tools
     Statistical algorithms: individually train a different model for each time series.
     - ARIMA, Prophet, Neural Prophet
     Deep learning algorithms: train a global model for all time series at once.
     - RNN, CNN, Temporal Fusion Transformer
  6. Accelerate with distributed computing
     Statistical algorithms: individually train a different model for each time series.
     - Different data / different functions (e.g. individually-trained time series)
     Deep learning algorithms: train a global model for all time series at once.
     - Same data / same functions, with fully sharded, fully parallel data and gradients (e.g. training a deep neural network global model)
  7. Accelerate with distributed computing
     Statistical algorithms: individually train a different model for each time series.
     - Different data / different functions (e.g. individually-trained time series)
     Deep learning algorithms: train a global model for all time series at once.
     - Same data / same functions, with fully sharded, fully parallel data and gradients (e.g. training a deep neural network global model)
  8. Why Ray?
     - Machine learning is pervasive in every domain
     - Distributed machine learning is becoming a necessity
     - Distributed systems are notoriously hard
     Ray's vision: make distributed computing accessible to every developer
  9. Compute demand growing faster than supply
     [Chart from https://openai.com/blog/ai-and-compute/: AI compute demand (GPT-3 marked at 2020) grows ~35x every 18 months, far outpacing Moore's Law (2x every 18 months); CPU, GPU, and TPU supply curves shown for comparison]
  10. Specialized hardware is also not enough: no way out but to distribute!
      [Same chart as the previous slide: compute demand ~35x every 18 months vs. Moore's Law 2x every 18 months]
  11. The Ray layered cake and ecosystem
      [Diagram: a universal framework for distributed computing that runs anywhere, with a library + app ecosystem including Datasets and Workflows]
  12. [Ecosystem diagram: native libraries, 3rd party libraries, and "your app here!" on top of Ray, the universal framework for distributed computing that runs anywhere]
  13. [Same ecosystem diagram as the previous slide]
  14. [Ecosystem diagram, highlighting Ray Lightning: https://github.com/ray-project/ray_lightning]
  15. [Ray Lightning within the Ray ecosystem: https://github.com/ray-project/ray_lightning]
  16. Example parallelism patterns: forecasting algorithms on Ray
      - Train statistical models
      - Train global DL models: same data / same functions, with fully sharded, fully parallel data and gradients (e.g. training a deep neural network global model)
  17. Accelerating statistical algorithms with Ray Core: an example with Prophet
     • Convert already-existing Python functions & classes to remote Ray Tasks and Actors in 6 steps:
       1. pip install and import ray and anyscale
       2. ray.init()
       3. decorate with @ray.remote
       4. ray.put(data)
       5. call distributed remote tasks with .remote()
       6. ray.get() results
     • Same concepts can be used for ARIMA …
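A minimal, self-contained sketch of the six steps, using a toy task instead of Prophet (the function name sum_of_squares and the sample data are illustrative placeholders, not from the deck):

    # Step 1: pip install ray (plus anyscale for the cloud demo), then import it
    import ray

    # Step 2: start Ray; on a laptop this launches a local cluster using all cores
    ray.init()

    # Step 3: decorate an ordinary Python function to turn it into a remote Ray task
    @ray.remote
    def sum_of_squares(values):
        return sum(v * v for v in values)

    # Step 4: put shared input data into Ray's object store once
    data_ref = ray.put(list(range(8)))

    # Step 5: launch the task with .remote(); this returns an object reference immediately
    result_ref = sum_of_squares.remote(data_ref)

    # Step 6: block and fetch the actual result
    print(ray.get(result_ref))  # 140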
  18. Prophet train function / Prophet inference function (serial baseline)
     # Prophet train function
     def train_prophet(df):
         # split df into train/test
         # fit prophet on train data
         return train_df, test_df, model

     # Prophet inference function
     def inf_prophet(model, test_df):
         # get n_periods from test_df
         # create forecast
         forecast = model.predict(...)
         return forecast

     # read data into pandas
     df_raw = read(file1.parquet)
     # train on train data
     df_train, df_test, model = train_prophet(df_raw)
     # inference on test data
     forecast = inf_prophet(model, df_test)
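The slide leaves the function bodies as comments; one possible concrete, serial implementation with the prophet package might look like this (the parquet file name and the "ds"/"y" column names are assumptions):

    import pandas as pd
    from prophet import Prophet   # pip install prophet

    def train_prophet(df):
        # split df into train/test (last 20% of rows held out); df has Prophet's "ds" and "y" columns
        df = df.sort_values("ds")
        split = int(len(df) * 0.8)
        train_df, test_df = df.iloc[:split], df.iloc[split:]
        model = Prophet()
        model.fit(train_df)            # fit prophet on train data
        return train_df, test_df, model

    def inf_prophet(model, test_df):
        # forecast over the dates held out in test_df
        return model.predict(test_df[["ds"]])

    df_raw = pd.read_parquet("file1.parquet")          # read data into pandas
    df_train, df_test, model = train_prophet(df_raw)   # train on train data
    forecast = inf_prophet(model, df_test)             # inference on test data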
  19. Step 3: decorate with @ray.remote
     # Prophet train function
     @ray.remote(num_returns=3)        # step 3
     def train_prophet(df):
         # split df into train/test
         # fit prophet on train data
         return train_df, test_df, model

     # Prophet inference function
     @ray.remote                       # step 3
     def inf_prophet(model, test_df):
         # get n_periods from test_df
         # create forecast
         forecast = model.predict(...)
         return forecast

     # read data into pandas
     df_raw = read(file1.parquet)
     # train on train data
     df_train, df_test, model = train_prophet(df_raw)
     # inference on test data
     forecast = inf_prophet(model, df_test)
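As a side note, num_returns=3 makes a remote call hand back three separate object references, one per returned value; a tiny generic illustration (not the deck's code):

    import ray

    ray.init(ignore_reinit_error=True)

    @ray.remote(num_returns=3)
    def split_three(x):
        # returning a 3-tuple yields three independent ObjectRefs to the caller
        return x, x * 2, x * 3

    a_ref, b_ref, c_ref = split_three.remote(10)
    print(ray.get([a_ref, b_ref, c_ref]))  # [10, 20, 30]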
  20. Step 4: ray.put(data) into the object store so it is available on each compute node
     # (train_prophet and inf_prophet decorated as on the previous slide)

     # read data into pandas
     df_raw = read(file1.parquet)
     # distribute data
     input_data_ref = ray.put(df_raw)                                  # step 4
     # train on train data
     df_train, df_test, model = train_prophet.remote(input_data_ref)
     # inference on test data
     forecast = inf_prophet.remote(model, df_test)
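A generic illustration of why step 4 matters: data placed in the object store once can be shared by many tasks without being re-serialized per call (toy data, not the deck's code):

    import ray

    ray.init(ignore_reinit_error=True)

    @ray.remote
    def mean(values):
        return sum(values) / len(values)

    data_ref = ray.put(list(range(1_000_000)))   # stored once in the object store
    # all four tasks read the same stored copy instead of receiving a fresh pickle each
    print(ray.get([mean.remote(data_ref) for _ in range(4)]))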
  21. Step 5: call distributed remote tasks with .remote()
     # (train_prophet and inf_prophet decorated as on slide 19)

     # read data into pandas
     df_raw = read(file1.parquet)
     # distribute data
     input_data_ref = ray.put(df_raw)
     # train on train data
     df_train, df_test, model = train_prophet.remote(input_data_ref)   # step 5
     # inference on test data
     forecast = inf_prophet.remote(model, df_test)                     # step 5
  22. Step 6: ray.get() results
     # (train_prophet and inf_prophet decorated as on slide 19)

     # read data into pandas
     df_raw = read(file1.parquet)
     # distribute data
     input_data_ref = ray.put(df_raw)
     # train on train data
     df_train, df_test, model = train_prophet.remote(input_data_ref)
     # inference on test data
     forecast_ray_obj_ref = inf_prophet.remote(model, df_test)
     forecast = ray.get(forecast_ray_obj_ref)                          # step 6
  23. Prophet main function: the .remote() tasks are scheduled in parallel, asynchronously
     [Diagram: file1 is read on Node 1 and Node 2; train_prophet and inf_prophet tasks run on both nodes; the blue variables (id1, id2) are Object IDs, similar to futures]

     import ray
     # initialize ray                                                  # steps 1 and 2
     ray.init()
     # train on train data
     df_train, df_test, model = train_prophet.remote(df_raw)           # step 5: call distributed remote tasks
     # inference on test data
     forecast_ray_obj_ref = inf_prophet.remote(model, df_test)
     forecast = ray.get(forecast_ray_obj_ref)
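This per-series pattern scales out naturally. One hypothetical way to train one Prophet model per product in parallel, reusing the train_prophet and inf_prophet tasks from the earlier slides (the product_id column and the grouping are assumptions, not from the deck):

    import ray

    ray.init(ignore_reinit_error=True)

    # one group of rows per product; "product_id" is an assumed column name
    per_product = {pid: g for pid, g in df_raw.groupby("product_id")}

    # launch one training task per product; each call returns three object refs
    # (train_df, test_df, model) immediately, without blocking
    ref_triples = {pid: train_prophet.remote(ray.put(g)) for pid, g in per_product.items()}

    # chain inference tasks, passing object refs straight into the next task
    forecast_refs = {pid: inf_prophet.remote(refs[2], refs[1]) for pid, refs in ref_triples.items()}

    # gather all forecasts; Ray spreads the work across all available cores and nodes
    forecasts = {pid: ray.get(ref) for pid, ref in forecast_refs.items()}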
  24. Prophet main function: ray.get() blocks until all remote task results are available
     [Same diagram as the previous slide]

     import ray
     # initialize ray
     ray.init()
     # train on train data
     df_train, df_test, model = train_prophet.remote(df_raw)
     # inference on test data
     forecast_ray_obj_ref = inf_prophet.remote(model, df_test)
     forecast = ray.get(forecast_ray_obj_ref)                          # step 6: ray.get() results
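The deck stops at ray.get(); when many tasks are in flight, Ray's ray.wait() can hand back results as each task finishes instead of blocking on all of them. A self-contained toy sketch (not from the deck):

    import random
    import time

    import ray

    ray.init(ignore_reinit_error=True)

    @ray.remote
    def slow_task(i):
        time.sleep(random.random())   # simulate work of varying length
        return i

    pending = [slow_task.remote(i) for i in range(8)]
    while pending:
        # returns as soon as one task is done; the rest keep running
        ready, pending = ray.wait(pending, num_returns=1)
        print("finished:", ray.get(ready[0]))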
  25. ARIMA train function / ARIMA inference function
     @ray.remote(num_returns=3)
     def train_arima(df):
         # split df into train/test
         # fit ARIMA on train data
         # pickle the ARIMA model
         return train_df, test_df, model

     @ray.remote
     def inf_arima(model, test_df):
         # get n_periods from test_df
         # unpickle the ARIMA model
         # create forecast
         forecast = model.predict(...)
         return forecast

     # read data into pandas
     df_raw = read(file1.parquet)
     # train on train data
     df_train, df_test, model = train_arima.remote(df_raw)
     # inference on test data
     forecast_ray_obj_ref = inf_arima.remote(model, df_test)
     forecast = ray.get(forecast_ray_obj_ref)
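The deck keeps the ARIMA bodies as comments; one possible concrete implementation, using statsmodels and pickling the fitted model as the slide suggests (the library choice, the "y" column name, and the (1, 1, 1) order are assumptions):

    import pickle

    import pandas as pd
    import ray
    from statsmodels.tsa.arima.model import ARIMA

    ray.init(ignore_reinit_error=True)

    @ray.remote(num_returns=3)
    def train_arima(df):
        # split df into train/test (last 20% of rows held out)
        split = int(len(df) * 0.8)
        train_df, test_df = df.iloc[:split], df.iloc[split:]
        fitted = ARIMA(train_df["y"], order=(1, 1, 1)).fit()   # fit ARIMA on train data
        return train_df, test_df, pickle.dumps(fitted)         # pickle the ARIMA model

    @ray.remote
    def inf_arima(model_bytes, test_df):
        fitted = pickle.loads(model_bytes)       # unpickle the ARIMA model
        n_periods = len(test_df)                 # get n_periods from test_df
        return fitted.forecast(steps=n_periods)  # create the forecast

    df_raw = pd.read_parquet("file1.parquet")
    df_train_ref, df_test_ref, model_ref = train_arima.remote(ray.put(df_raw))
    forecast = ray.get(inf_arima.remote(model_ref, df_test_ref))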
  26. Example parallelism patterns: forecasting algorithms on Ray
      - Train statistical models
      - Train global DL models: same data / same functions, with fully sharded, fully parallel data and gradients (e.g. training a deep neural network global model)
  27. Accelerating deep learning forecasting with Ray: example with PyTorch Lightning / Ray Lightning
      • Convert already-existing PyTorch Lightning training code to a distributed application in 4 easy steps:
        1. pip install and import Ray, Ray Lightning, and Anyscale
        2. ray.init()
        3. create a RayPlugin object
        4. modify the PyTorch Lightning Trainer to use the Ray plugin
      • For regular distributed PyTorch or TensorFlow 2, use Ray Train instead. Those integrations use the same concepts shown here, but they are built into Ray Train's Trainer.
  28. PyTorch main function
      [Diagram: Node 1 and Node 2 each train on a data shard (shard1, shard2) with the same train_model function and shared gradients; same data, parallel sharded, same functions]

      import ray
      from ray_lightning import RayPlugin

      # initialize ray and the plugin                              # steps 1-3
      ray.init()
      plugin = RayPlugin()

      # convert data to PyTorch tensors
      # modify the PyTorch Lightning trainer to use the plugin     # step 4
      model, trainer = train_model(ptf_train_data, plugin)

      # fit the PyTorch Forecasting model
      trainer.fit(model, pt_train_loader, pt_val_loader)

      # get best model from the trainer
      # inference on validation data
      predictions = best_model.predict(pt_val_loader)
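Fleshing out steps 3 and 4 a bit: one possible way to wire RayPlugin into a standard PyTorch Lightning Trainer (MyLightningModule, the dataloaders, and the worker counts are placeholders, not from the deck):

    import ray
    import pytorch_lightning as pl
    from ray_lightning import RayPlugin

    ray.init()

    # step 3: the plugin controls how many Ray workers run the distributed training
    plugin = RayPlugin(num_workers=4, num_cpus_per_worker=1, use_gpu=False)

    # step 4: pass the plugin to an otherwise ordinary PyTorch Lightning Trainer;
    # the LightningModule itself needs no changes
    model = MyLightningModule()                             # placeholder LightningModule
    trainer = pl.Trainer(max_epochs=10, plugins=[plugin])
    trainer.fit(model, train_dataloader, val_dataloader)    # placeholder DataLoaders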
  29. Takeaways
      • Distributed computing is a necessity and the norm
      • Ray's vision: make distributed programming simple, using core @ray.remote or the Ray Libraries
      • Ray Core: easily convert already-existing Python functions & classes to remote Ray Tasks and Actors in 6 steps: install, .init(), decorate, ray.put(data), .remote(), .get()
      • Ray Libraries: convert code that uses existing ML or DL frameworks, for example with Ray Lightning in 4 steps: install, .init(), create the plugin, modify the trainer to use the plugin
  30. Upcoming Anyscale events and resources
      Events (anyscale.com)
      - Meetup online tomorrow, Jan 19th, 6pm PST
      - Weekly intro demo / office hours, Thursdays, alternating 9am and 2pm PST
      - Approximately monthly webinars on different AI/ML workloads, like this one
      Resources
      - Ray doc pages (docs.ray.io)
      - Forums to learn and share with the broader Ray community (discuss.ray.io)
      - Ray Slack (ray.io/community)
      - Follow us on Twitter and LinkedIn (@raydistributed, @anyscalecompute)
      - GitHub: check out the source, file an issue, become a contributor, give us a star :) (github.com/ray-project/ray)