
Scaling ML + Python Workloads On-Demand 10.12

Anyscale
October 12, 2022

Transcript

  1. Simplify Scaling ML and Any Python Workload
    Phi Nguyen, GTM Tech Lead
    Antoni Baum, ML Software Engineer


  2. 2
    Agenda
    • Scaling Machine Learning and Python
    Workloads
    • Instant Scaling From a Laptop to the Cloud
    • Use Cases
    • Demo
    • Questions


  3. Fundamental Challenges
    • AI compute demands have increased exponentially.
    • AI has lacked a universal framework to ease scaling and enable
    distributed computing for machine learning & Python workloads.
    • MLOps is complex: developer iteration and scaling are hard, hampering
    developer productivity, consuming excessive compute, and slowing
    time-to-market.


  4. Python is Growing Everywhere
    Python is the most popular programming language, with a solid footprint
    across: data science, analytics, HPC / scientific computing, IoT,
    web development, and general applications.


  5. Exponential Growth of Unstructured Data
    • Deep learning's breakthrough innovation is based on unstructured data
    • Data science libraries are primarily based in Python but do not scale well
    • How do you scale text, images, logs, geospatial, video and sensor data?


  6. The Rise of Operational ML
    “Applying AI-driven forecasting to supply chain management, for example,
    can reduce errors by between 20 and 50 percent.”
    Demand forecasting:
    • 100k SKUs, 50 categories
    • 210 DMAs (Designated Market Areas)
    • Time granularity (monthly, weekly, daily, etc.)


  7. Ray & Anyscale


  8. Ray | The Fastest-Growing Framework to Scale AI Apps
    • 1,000+ organizations using Ray
    • 21,000+ GitHub stars
    • 4,000+ repositories depend on Ray
    • 700+ community contributors
    • 600+


  9. Scaling ML & Python – Proven Use Cases
    CPG / Retail
    • Supply chain
    • Anomaly detection
    • Forecasting factory
    • Demand forecasting
    • Recsys / personalization
    Financial Services
    • Algorithmic trading
    • Financial modelling
    • Market simulation
    • Backtesting
    Pharma / Biotech
    • Drug screening
    • DNA sequence processing
    • Protein folding
    Digital Native Businesses
    • Recsys / personalization
    • ETAs
    • Dynamic pricing
    • Product classification
    • NLP / CV
    Gaming
    • Game testing
    • In-game personalization
    AI Core Businesses
    • RL / DL / ML at scale
    • Complex data processing
    • Foundation models


  10. Scale Any Python or AI Workload
    • Reinforcement learning
    • Computer vision / NLP
    • Recommendation systems
    • Complex / unstructured data processing
    • Large-scale inference
    • Simulation
    • Embarrassingly parallel workloads
    • Batch training
    • Tuning / AutoML
    • Time series forecast factory
    • Backtesting


  11. 11
    Batch Training & Tuning
    Before Ray, we used 10 containers with celery
    running on AWS Batch and it used to take 2-3 days
    to train ~8000 models weekly for our marketplace
    use case. After doing a quick POC with Ray, we are
    now able to train 1000 models in 20 min.
    – ML Engineer, Instacart
    We did an internal benchmark for our
    forecast factory use cases and we found a
    10x better performance compared to
    SageMaker.
    – Product Manager, Manufacturing
    Conglomerate
    With Ray we were able to create a self-service
    marketing attribution model. The service would
    train and tune many models & combinations and
    provide the best model based on user inputs.
    This has allowed us to scale and provide an
    on-demand service for our customers.
    – Chief Technology Officer, AI Powered Marketing
    Co.
    We used Ray to solve our demand prediction use
    case and we obtained some astonishing results.
    Specifically, compared to our AWS Batch
    implementation, our Ray implementation is 9x
    faster and has reduced the cost by 87%.
    – Chief Technology Officer, anastasia.ai


  12. 12
    Batch Inference & Data processing
    In order to generate MSA data from sequence data
    for ~100,000+ proteins, one would need 10+ years
    on a laptop. However, with Ray and a simple
    ~100-line python script, I was able to perform this
    task on 8,192 cores in one day.
    – Laksh Aithani, CEO at CHARM Therapeutics
    Using Ray and spacy.io, we can now process 15M
    documents in hours instead of weeks. This has
    allowed us to accelerate our NLP pipelines and
    provide faster value to our clients.
    – Chief Technology Officer, AI Startup
    With Ray, we can now scale the time series
    backtesting part of our algorithmic trading
    workbench. This has allowed us to test and provide
    more robust models in a shorter period of time.
    – Thomas Kutschera, CEO @ Axovision
    Ray is 11x faster and 3x cheaper than
    traditional methods for our ultra-high
    resolution drone imagery — allowing us to
    significantly accelerate our restoration
    efforts in a cost-effective way.
    – Richard Decal, SWE at Dendra Systems


  13. Ray | A Unified Framework for Scaling AI & Python
    Ray AI Runtime: a unified framework for scalable computing, with
    libraries for data loading, training, tuning, reinforcement learning
    and model serving, all built on Ray Core.
    One framework to scale all workloads!


  14. Anyscale | A Unified Compute Platform for Scaling and Fast Time-to-Market
    Managed Ray Platform: eases development, deployment, scaling and management.
    • Fully-managed scalable compute platform: managed service, observability,
    access control; workspaces, jobs, services
    • Ray AI Runtime, a unified framework for scalable computing: data loading,
    training, tuning, reinforcement learning and model serving, on Ray Core
    • Ecosystem integrations: data / features, orchestration, explainability /
    observability, experiment management, hyperparameter tuning,
    serving / applications, training


  15. Instant Scaling to the Cloud
    Develop on your laptop, seamlessly scale to the cloud with no code changes!
    Develop on Laptop (develop, debug)
    → no code changes →
    Develop on Cluster (develop, test, debug)
    → no code changes →
    Production Deployment (run, monitor)
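    A minimal sketch of this laptop-to-cloud workflow: locally, ray.init()
    starts Ray on the machine; to target a cluster, only the init call gains
    an address argument while the task code is untouched. The endpoint string
    in the comment is a placeholder, not from the slides.

    ```python
    import ray

    # Local development: starts a Ray instance on the laptop.
    # On a cluster, only this line changes, e.g.
    # ray.init(address="ray://<head-node>:10001")  # placeholder endpoint
    ray.init()

    @ray.remote
    def square(x):
        return x * x

    # Fan out four tasks and gather the results in order.
    print(ray.get([square.remote(i) for i in range(4)]))  # [0, 1, 4, 9]
    ```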


  16. • Effortlessly scale your Python and AI workloads
    No code changes to go from laptop to the cloud
    • Speed time-to-market
    Train in hours or days rather than weeks or months
    • Unified distributed framework
    Parallelize any ML and Python code
    • Simplify your MLOps
    Single script for data preprocessing, training,
    tuning and serving


  17. Ray Fundamentals


  18. Scaling Design Patterns
    • Batch training / inference: different data, same function
    • Batch tuning: different data, same function, different hyperparameters per job
    • AutoML: same data, different function
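    The "different data / same function" pattern can be sketched with Ray
    tasks: one remote call per data partition, fanned out in parallel. The
    partitions and the fit_partition function here are illustrative
    placeholders, not from the deck.

    ```python
    import ray

    ray.init(ignore_reinit_error=True)

    @ray.remote
    def fit_partition(partition):
        # Placeholder "model fit": a real job would train one model per
        # SKU / region / series on its partition.
        return sum(partition) / len(partition)

    partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # one batch per job
    # Launch all jobs at once; ray.get gathers the results in order.
    results = ray.get([fit_partition.remote(p) for p in partitions])
    print(results)  # [2.0, 5.0, 8.0]
    ```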


  19. Anatomy of a Ray Cluster
    • Decentralized scheduler
    • Cluster auto-scales based on Python calls!
    • Low task / actor overhead (ms)
    Head node: driver and worker processes, the Global Control Store (GCS),
    and a raylet (scheduler + object store).
    Worker nodes #1 … #N: worker processes and a raylet (scheduler +
    object store) each.


  20. Ray Core API


  21. Python → Ray: Basic Patterns
    Function → Task (stateless)
    Class → Actor (stateful process)
    Object → Distributed object (immutable)
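    This mapping can be sketched in a few lines; the names double and
    Accumulator are illustrative, not from the deck.

    ```python
    import ray

    ray.init(ignore_reinit_error=True)

    # Function -> Task: stateless, scheduled anywhere in the cluster.
    @ray.remote
    def double(x):
        return 2 * x

    # Class -> Actor: a stateful worker process; method calls run in order.
    @ray.remote
    class Accumulator:
        def __init__(self):
            self.total = 0

        def add(self, x):
            self.total += x
            return self.total

    # Object -> Distributed object: ray.put stores an immutable value
    # in the object store and returns a reference usable as a task argument.
    ref = ray.put(21)
    print(ray.get(double.remote(ref)))  # 42

    acc = Accumulator.remote()
    acc.add.remote(40)
    print(ray.get(acc.add.remote(2)))  # 42
    ```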


  22. Function → Task | Class → Actor

    # Function (plain Python)
    def read_array(path):
        # ... read ndarray "a" from path
        return a

    def add(a, b):
        return np.add(a, b)

    a = read_array(path1)
    b = read_array(path2)
    sum = add(a, b)

    # Class (plain Python)
    class Counter(object):
        def __init__(self):
            self.value = 0

        def inc(self):
            self.value += 1
            return self.value

    c = Counter()
    c.inc()
    c.inc()


  23. Function → Task | Class → Actor

    # Function → Task
    @ray.remote
    def read_array(path):
        # ... read ndarray "a" from path
        return a

    @ray.remote
    def add(a, b):
        return np.add(a, b)

    id1 = read_array.remote(path1)
    id2 = read_array.remote(path2)
    id = add.remote(id1, id2)
    sum = ray.get(id)

    # Class → Actor
    @ray.remote
    class Counter(object):
        def __init__(self):
            self.value = 0

        def inc(self):
            self.value += 1
            return self.value

    c = Counter.remote()
    id4 = c.inc.remote()
    id5 = c.inc.remote()



  25. Function → Task | Class → Actor, with resource requests

    # Task with resource requests
    @ray.remote
    def read_array(path):
        # ... read ndarray "a" from path
        return a

    @ray.remote(num_gpus=1, accelerator_type=TESLA_V100)
    def add(a, b):
        return np.add(a, b)

    id1 = read_array.remote(path1)
    id2 = read_array.remote(path2)
    id = add.remote(id1, id2)
    sum = ray.get(id)

    # Actor with resource requests
    @ray.remote(num_gpus=1, num_cpus=4)
    class Counter(object):
        def __init__(self):
            self.value = 0

        def inc(self):
            self.value += 1
            return self.value

    c = Counter.remote()
    id4 = c.inc.remote()
    id5 = c.inc.remote()


  26. Ray Demos
    • Simple composable AutoML for time series (M5 dataset): fit ETS and
    AutoArima per series, keep the best model.
    • Batch forecasting (NYC Taxi dataset): one forecast per pickup
    location (PU Loc 1 … PU Loc N).
    • Complex data processing (LightShot images): OCR and language
    detection over Img 1 … Img N.


  27. Questions?
    Join the community:
    • discuss.ray.io
    • github.com/ray-project/ray
    • @raydistributed
    • @anyscalecompute
    Fill out our feedback survey: https://bit.ly/3CoqLX3
    Request a demo of the Anyscale Platform: go to www.anyscale.com and
    select ‘Try It Now’.


  28. Thank You
    October 2022
