Ray in 2023: Ray in Reflection

December 07, 2023

A quick recap of how Ray progressed in 2023, and a reflection on its pivotal role in the LLM stack and the generative AI domain.

Anyscale

Transcript

Ray in 2023 Robert Nishihara
Why Ray? Figures, in slide order: 12x, 50%, 40%, 10x, 5x, 30%. Labels, in slide order: faster, cheaper, cheaper, cheaper, faster, cheaper.
As AI capabilities have grown, so have the challenges: scale, cost, and future readiness. These are the challenges Ray was built for.
Anyscale Endpoints - fine-tuning. Chart labels: Llama-2-7B, GPT-4, fine-tuned. Chart values: 86%, 3%, 78%.
Superior task-specific performance at 1/300th the cost of GPT-4!
Batch inference - costs. Cost to process 1M images. Chart labels: Spark, AWS SageMaker. Chart values: $2.5, $3.5, $7.3, $57.
Anyscale Endpoints: cost-efficient LLM inference. Stack: single-GPU optimizations, multi-GPU modeling, inference server, autoscaling, multi-region and multi-cloud. $1 / million tokens (Llama-2 70B).
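To make the quoted price concrete, here is a minimal sketch of the arithmetic behind a flat per-token rate. Only the $1 per million tokens figure (Llama-2 70B) comes from the slide. The function name `inference_cost_usd`, the assumption that every token is billed at one flat rate, and the 250M-token workload are hypothetical and used only for illustration.

```python
def inference_cost_usd(num_tokens: int, price_per_million_usd: float = 1.0) -> float:
    """Return the cost in USD of processing num_tokens at a flat per-million-token price.

    The default price of 1.0 mirrors the slide's "$1 / million tokens" figure.
    Flat pricing across all tokens is an assumption of this sketch.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens / 1_000_000 * price_per_million_usd


# Hypothetical workload: 250 million tokens in a month.
monthly_tokens = 250_000_000
print(f"${inference_cost_usd(monthly_tokens):,.2f}")
```

Passing a different `price_per_million_usd` lets the same function compare another rate against the slide's figure.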