Ray Serve @ 機械学習の社会実装勉強会 (Machine Learning Social Implementation Study Group), 11th session

Ray Serve 2022-05-28 Naka Masato

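Self-introduction
Name: Naka Masato (那珂将人)
Background:
• Developed recommendation engines as an algorithm engineer
• Infrastructure platform development
GitHub: https://github.com/nakamasato
Twitter: https://twitter.com/gymnstcs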

Last time: introducing Ray
An open-source project developed at UC Berkeley's RISELab.
"As a general-purpose and universal distributed compute framework, you can flexibly run any compute-intensive Python workload, from (1) distributed training or (2) hyperparameter tuning to (3) deep reinforcement learning and (4) production model serving."
Developers can easily scale everything from deep learning to model serving.
https://www.ray.io/

Last time: Ray components
Ray consists of several packages:
1. Core: the core library ← last time
2. Tune: Scalable hyperparameter tuning
3. RLlib: Reinforcement learning
4. Train: Distributed deep learning (PyTorch, TensorFlow, Horovod)
5. Datasets: Distributed data loading and compute
6. Serve: Scalable and programmable serving ← this time
7. Workflows: Fast, durable application flows

Ray Serve
A model-serving library that makes it easy to deploy ML models scalably.
1. Framework-agnostic: works with any framework, such as PyTorch, TensorFlow, Keras, and scikit-learn
2. Python-first: configuration is written in Python

Why Ray Serve?
There are two main approaches to serving:
1. Traditional web server
   a. Easy to get started
   b. Hard to scale
2. Cloud-hosted
   a. Easy to scale
   b. Vendor lock-in
   c. Framework-specific tooling
   d. Lacks general extensibility
→ Ray Serve: simple, scalable, and supports a variety of frameworks

Ray Serve
1. Start Ray Serve on a Ray cluster
2. Deploy one or more Deployments to Ray Serve to serve ML models
[Diagram: Ray Cluster (local or real) > Ray Serve > Deployment1, Deployment2; Python code defining deployment1 and deployment2]
Even the deployment configuration can be written in Python!

Simplicity of Ray Serve
Implementation:
1. ray.init(): connect to the cluster
2. serve.start(): start Serve
3. Define what to deploy with @serve.deployment
   a. A Deployment is a callable: put the request handling in __call__
4. xxx.deploy()
Running (local):
1. ray start --head: set up a local Ray cluster
2. python quickstart.py: start Ray Serve
[Code screenshot: quickstart.py]
The deployment is then reachable at http://127.0.0.1:8000/router
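
The four implementation steps above can be sketched roughly as follows. This is not the quickstart.py shown on the slide (its contents are not in the transcript); it is a minimal sketch that assumes the Ray 1.x Serve API current at the time of the talk, where `Deployment.deploy()` was available. The `Router` class, its return string, and the `deploy_router` helper are hypothetical names chosen for illustration. Ray is imported lazily so that the plain callable can be read and exercised without a cluster.

```python
# Sketch of a quickstart-style script, assuming the Ray 1.x Serve API.
# `Router` and `deploy_router` are hypothetical names, not from the slides.


class Router:
    """A plain callable; Ray Serve invokes __call__ for each request."""

    def __call__(self, request=None):
        # Over HTTP, `request` is the incoming request object; it is unused here.
        return "Hello from Ray Serve"


def deploy_router():
    """Steps 1-4 from the slide. Needs ray[serve] (1.x) and `ray start --head`."""
    import ray
    from ray import serve

    # 1. Connect to the already running local cluster.
    ray.init(address="auto", namespace="serve")
    # 2. Start Serve on that cluster; detached so it outlives this script.
    serve.start(detached=True)
    # 3. Equivalent to decorating the class with @serve.deployment(name="router").
    deployment = serve.deployment(name="router")(Router)
    # 4. Deploy; the default route is then /router on port 8000.
    deployment.deploy()
```

Calling `deploy_router()` after `ray start --head` should expose the callable at http://127.0.0.1:8000/router, matching the URL on the slide. Later Ray releases replaced `.deploy()` with a different deployment flow, so this sketch applies to 1.x only.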

Demo1: Quickstart

Accessing Ray Serve
1. HTTP
   a. curl localhost:8000/<deployment_name>
   b. From Python, use requests
2. ServeHandle
   a. For sending requests to the served model from another Python script
   b. handle = serve.get_deployment("<deployment_name>").get_handle()
   c. ray.get(handle.<method>.remote())
      i. Calls a method of the Deployment
[Code screenshots: Deployment definition; calling over HTTP; calling with a ServeHandle]
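
The two access paths above might look like this from a separate Python script. This is a sketch under the same Ray 1.x assumption; the helper names `deployment_url`, `call_over_http`, and `call_with_handle` are hypothetical, and the host and port defaults simply mirror the `localhost:8000` address on the slide. `requests` and `ray` are imported lazily, so only the URL helper runs without them.

```python
# Sketch of both access paths, assuming the Ray 1.x Serve API.
# All helper names below are hypothetical.


def deployment_url(name, host="localhost", port=8000):
    """Build the default HTTP route for a deployment: /<deployment_name>."""
    return f"http://{host}:{port}/{name}"


def call_over_http(name):
    """1. HTTP: the Python equivalent of `curl localhost:8000/<deployment_name>`."""
    import requests  # third-party; pip install requests

    return requests.get(deployment_url(name)).text


def call_with_handle(name):
    """2. ServeHandle: call the deployment from Python without going through HTTP."""
    import ray
    from ray import serve

    # Connect to the cluster where Serve is already running.
    ray.init(address="auto", namespace="serve", ignore_reinit_error=True)
    handle = serve.get_deployment(name).get_handle()
    # handle.remote() invokes __call__; handle.<method>.remote() invokes a named method.
    return ray.get(handle.remote())
```

The handle path skips HTTP parsing and passes Python objects directly, which is why the slide suggests it for calls from other Python scripts.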

Demo2: Text summarization example

Other features
Beyond simple deployments, Ray Serve also offers:
1. Model Composition: deploy multiple models in combination
2. Request Batching: process multiple requests together
3. Resource Management: CPU and GPU can be configured per replica (ray_actor_options)
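
Request batching and per-replica resources could be combined as in the sketch below. It assumes the Ray 1.x API (`serve.batch` on an async method, `ray_actor_options` and `num_replicas` on `serve.deployment`). The stand-in model `predict_batch`, the class `BatchedModel`, the query parameter `x`, and the specific numbers are all hypothetical and not taken from the talk.

```python
# Sketch of batching plus resource settings, assuming the Ray 1.x Serve API.
# The model and all names here are hypothetical stand-ins.


def predict_batch(inputs):
    """Stand-in model: takes a whole batch and returns one result per input."""
    return [x * 2 for x in inputs]


def build_batched_deployment():
    """Return a Deployment that batches requests. Needs ray[serve] (1.x)."""
    from ray import serve

    # Resource management: each of the 2 replicas reserves 1 CPU.
    # A GPU model would use e.g. {"num_gpus": 1} instead.
    @serve.deployment(num_replicas=2, ray_actor_options={"num_cpus": 1})
    class BatchedModel:
        # Request batching: Serve collects up to 4 concurrent calls into one list.
        @serve.batch(max_batch_size=4)
        async def handle_batch(self, inputs):
            return predict_batch(inputs)

        async def __call__(self, request):
            # Each caller passes a single value and awaits its own single result.
            return await self.handle_batch(int(request.query_params["x"]))

    return BatchedModel
```

The batched method receives a list and must return a list of the same length, so that each caller gets back the element matching its own input.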

Model Composition
Serve multiple models in combination:
1. Define each Model as its own Deployment → scalable
2. In ComposedModel, reference each Model through a ServeHandle
Alpha: Deployment Graph
[Diagram: ComposedModel calling model_one and model_two]
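
A composed model along these lines could be wired up as sketched below, again assuming the Ray 1.x API with ServeHandles (not the alpha Deployment Graph API). The bodies of `model_one` and `model_two` are hypothetical stand-ins, as is the way the two outputs are combined; the slides only state that ComposedModel reaches the other models through ServeHandles. `combine_locally` shows the same composition without Ray, for comparison.

```python
# Sketch of model composition via ServeHandles, assuming the Ray 1.x Serve API.
# Model bodies and the combination rule are hypothetical.


def model_one(data):
    """Stand-in model 1: length of the input text."""
    return len(data)


def model_two(data):
    """Stand-in model 2: upper-cased input text."""
    return data.upper()


def combine_locally(data):
    """The composition without Ray, useful as a reference for the served version."""
    return {"model_one": model_one(data), "model_two": model_two(data)}


def deploy_composed():
    """Deploy both models and the composed model. Needs ray[serve] (1.x) and a running Serve."""
    from ray import serve

    # 1. Each model is its own Deployment, so each can be scaled separately.
    serve.deployment(name="model_one")(model_one).deploy()
    serve.deployment(name="model_two")(model_two).deploy()

    @serve.deployment
    class ComposedModel:
        def __init__(self):
            # 2. Reach the other deployments through ServeHandles.
            self.one = serve.get_deployment("model_one").get_handle()
            self.two = serve.get_deployment("model_two").get_handle()

        async def __call__(self, request):
            data = (await request.body()).decode()
            return {
                "model_one": await self.one.remote(data),
                "model_two": await self.two.remote(data),
            }

    ComposedModel.deploy()
```

Because the composed model only holds handles, the replica counts of `model_one` and `model_two` can be changed without touching `ComposedModel`.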

Summary
1. Basic usage of Ray Serve
   a. Cluster + Ray Serve (serve.start() + xxx.deploy())
   b. Requests (HTTP or ServeHandle)
2. Strengths of Ray Serve
   a. Not tied to any framework
   b. Everything from configuration to deployment can be managed in Python
   c. Scalable systems are easy to build
3. Comparison with other serving options
   a. TFServing, TorchServe, ONNXServe: framework-specific
   b. AWS SageMaker, AzureML, Google AI Platform:
      i. Ray Serve can be set up on Kubernetes, on-premises, or on any cloud provider
A Ray cluster comes up quickly even locally, so development efficiency looks good (in contrast to Kubernetes or cloud ML platforms).
