Slide 25
Approach 2: Online Inference Solutions
BentoML, Ray Serve, SageMaker Batch Transform
● Abstracts away infra complexities
● Abstractions for model packaging
● Framework integrations
Unnecessary complexities for offline inference
Starting an HTTP server, sending requests over the network…
Hard to saturate GPUs
BentoML integrates with Spark for offline inference
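
To make the overhead point concrete, here is a minimal sketch using only the Python standard library. It is not how BentoML, Ray Serve, or SageMaker work internally; predict_batch, PredictHandler, infer_over_http, and infer_in_process are hypothetical names, and the "model" is a stand-in that doubles its inputs. The first path starts an HTTP server and sends one request per record; the second calls the model in-process on whole batches, which is the shape an offline job usually wants.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict_batch(rows):
    """Hypothetical stand-in for a model: doubles each input value."""
    return [2 * x for x in rows]


class PredictHandler(BaseHTTPRequestHandler):
    """Toy online-inference endpoint: JSON list in, JSON list out."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        rows = json.loads(self.rfile.read(length))
        body = json.dumps(predict_batch(rows)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def infer_over_http(rows):
    """Online-style path: start a server, then send one request per record."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/predict"
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        results = []
        for x in rows:
            req = urllib.request.Request(
                url,
                data=json.dumps([x]).encode(),
                headers={"Content-Type": "application/json"},
            )
            with opener.open(req) as resp:
                results.extend(json.loads(resp.read()))
        return results
    finally:
        server.shutdown()
        server.server_close()


def infer_in_process(rows, batch_size=4):
    """Offline-style path: no server, no network, model sees full batches."""
    results = []
    for i in range(0, len(rows), batch_size):
        results.extend(predict_batch(rows[i:i + batch_size]))
    return results


if __name__ == "__main__":
    data = list(range(10))
    assert infer_over_http(data) == infer_in_process(data)
```

Both paths return the same predictions. The HTTP path adds server startup, serialization, and a network round trip per record, and the model only ever sees batches of one, which is one reason an accelerator behind such an endpoint can be hard to keep busy.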