[November 2023 edition] The Generative AI Approach with Databricks

©2023 Databricks Inc. — All rights reserved Generative AI on
Databricks: a technical walkthrough of the data-centric approach

Outline
• Approaches to GenAI
• The GenAI journey
• Prompt engineering
• Retrieval augmented generation (RAG)
• Fine-tuning
• Pre-training
• MLOps for GenAI
• What changes with GenAI?
• Reference architecture
• Resources and next steps

Databricks AI: GenAI fully integrated into the Lakehouse
[Diagram] Platform overview, by label:
• Stages: Prepare Data, Develop & Evaluate AI, Serve AI, Serve Data
• Components: SQL, Workflows, DLT, Spark, Notebooks, Runtimes, AutoML, Features, Feature Store, Vector Search, MLflow (Track/Evaluate), Model Registry, Model Serving (CPU/GPU), AI Functions, MLflow AI Gateway, Lakehouse Monitoring, Marketplace
• AI models & tools and external services: OpenAI, MosaicML, Hugging Face, …
• Storage: Cloud Storage, Delta, Files (Volumes)
• Data & AI governance: Unity Catalog
• MLOps + LLMOps: MLflow, CI/CD support
• Legend: Databricks AI specific capabilities; Lakehouse common capabilities

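A (generated) GenAI story: ML team builds an automated customer assistant in one week!
• Simple UX: customers can place orders by chatting with the assistant
• Rapid deployment:
◦ Built quickly using AutoGPT
◦ Can access customer data and place orders on the customer's behalf
• Early testing:
◦ Internal tests suggest that 75% of requests can be handled automatically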

Is this a good or bad story?
Annotations on the story:
• Potential customer value
• Every request may incur a cost ⇒ low or negative ROI
• Makes only partial use of proprietary data
• Sends customer data to a third-party API
• A good thing, maybe? Only if there is a fallback.
• A bad thing? False negatives and false positives. How would you productionize and monitor this?
• Is this a prototype? Then it quickly proves feasibility.
• Is this a product? Then it has no competitive differentiation and is built on experimental technology.

Underlying needs
The story points to three underlying needs:
• Production use
• Control / customization
• Cost management

Key questions
Themes: Lower Costs, Production Quality, Complete Control
• Do you own and control your models and data?
• What is your advantage over competitors?
• What are the security and privacy risks?
• How do you serve models and pipelines efficiently?
• How do you fine-tune and pretrain models efficiently?
• How do you move from R&D to production?
• How does GenAI fit into your MLOps platform and processes?

Key advice
Themes: Lower Costs, Production Quality, Complete Control
• Data is your competitive advantage. Control your data and use it for GenAI applications and custom models.
• How do you serve models and pipelines efficiently? Use an optimized serving system.
• How do you fine-tune and pretrain models efficiently? Plan to fine-tune and pretrain models on a leading ML/AI platform.
• How do you move from R&D to production? GenAI is best deployed on a data-centric platform.
• How does GenAI fit into your MLOps platform and processes? Most of traditional MLOps carries over to GenAI, but watch for a few new GenAI-specific requirements.

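GenAI journey
Plan an iterative path from basic to advanced GenAI while leveraging your data.
• Prompt Engineering: crafting special prompts and pipelines to guide GenAI behavior
• Retrieval Augmented Generation (RAG): combining an LLM with custom enterprise data
• Fine-tuning: adapting a pre-trained GenAI model to specific datasets or domains
• Pre-training: training a GenAI model from scratch
Each step down the list offers more control and customization, at the price of more compute and complexity.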

GenAI journey
Plan an iterative path from basic to advanced GenAI while leveraging your data.
Stages: Prompt Engineering, Retrieval Augmented Generation (RAG), Fine-tuning, Pre-training
• These are not mutually exclusive options.
• Start simple, build a baseline, and iterate.
• Choose techniques based on:
◦ Data quantity and quality
◦ Compute resources
◦ Latency requirements
◦ The specific domain or application

GenAI journey: comparison of methods
Prompt engineering
• Definition: crafting specialized prompts to guide model behavior
• Primary use case: quick, on-the-fly model guidance
• Data requirements: none
• Training time: none
• Advantages: fast, cost-effective, no training required
• Considerations: less control than fine-tuning
Retrieval augmented generation (RAG)
• Definition: combining an LLM with external knowledge retrieval
• Primary use case: dynamic datasets and external knowledge
• Data requirements: external knowledge base or vector database
• Training time: moderate (e.g. computing embeddings)
• Advantages: dynamically updated context, enhanced accuracy
• Considerations: increases prompt length and inference computation
Fine-tuning
• Definition: adapting a pre-trained LLM to specific datasets or domains
• Primary use case: domain or task specialization
• Data requirements: thousands of domain-specific or instruction examples
• Training time: moderate to long (depending on data size)
• Advantages: granular control, high specialization
• Considerations: requires labeled data, computational cost
Pre-training
• Definition: training an LLM from scratch
• Primary use case: unique tasks or domain-specific corpora
• Data requirements: large datasets (billions to trillions of tokens)
• Training time: long (days to many weeks)
• Advantages: maximum control, tailored for specific needs
• Considerations: extremely resource-intensive

GenAI journey: Prompt engineering

Application architecture: prompt engineering
[Diagram] Flow between users, the AI app, and the platform:
1. Choose and serve an LLM from the Model Hub (Models in Unity Catalog, Hugging Face Hub, …)
2. Create prompts (templates, instructions, examples)
3. Send prompts: query Model Serving (AI Gateway, Model Serving (CPU/GPU), OpenAI, Mosaic Inference, …)
4. Return the response to users
5. Log query, response, and metrics to Monitoring (Lakehouse Monitoring, Inference Tables)
AI App front ends: Web App, Notebooks (e.g. Python), SQL (AI functions)

Prompt engineering: tune the text prompts given to a GenAI model to elicit better responses
• Prompt engineering refines how you interact with a model to get more accurate, context-aware outputs, without changing the underlying model.
• The Evaluation UI in MLflow simplifies testing prompts and context across models, so you can find the best model for your application.
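The prompt ingredients named in this part of the deck (templates, instructions, examples) can be illustrated with a small sketch. This is only an illustration in plain Python: the template layout, the `build_prompt` helper, and the sample texts are assumptions made here, not anything taken from the deck or from a Databricks API.

```python
from string import Template

# Hypothetical template: instructions first, then few-shot examples,
# then the user's question. The layout is an assumption for illustration.
PROMPT = Template(
    "$instructions\n\n"
    "$examples\n\n"
    "Question: $question\n"
    "Answer:"
)


def build_prompt(instructions, examples, question):
    """Render a prompt from instructions, (question, answer) pairs, and a new question."""
    shots = "\n\n".join(f"Question: {q}\nAnswer: {a}" for q, a in examples)
    return PROMPT.substitute(instructions=instructions, examples=shots, question=question)


prompt = build_prompt(
    instructions="You are an order assistant. Answer in one short sentence.",
    examples=[("Where is my order?", "It ships tomorrow.")],
    question="Can I change my delivery address?",
)
print(prompt)
```

Because the model itself is untouched, iterating on a prompt means editing only the template, the instructions, or the examples, which is what makes side-by-side comparison across models cheap.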


Databricks Model Serving: unified management of all the models you need to serve
• Custom Models: LangChain, sklearn, MLflow pyfunc, …
• Foundational Models: Llama2 70B, MPT 7B, BGE-Large, …
• External Models: OpenAI GPT-*, AWS Bedrock, Anthropic, …

Databricks Model Serving: unified management of all models you need to serve
• Custom Models: deploy any MLflow model from UC / Marketplace / Workspace as an API on Serverless Compute. CPU and GPU. Integrates with Feature Serving and Vector Search. (Available now as Model Serving)
• Foundational Models: call top foundation models as an API. Pay-per-token billing for rapid experimentation; throughput-based DBU pricing for dedicated compute. (Available via MosaicML Inference)
• External Models: manage external models and APIs. This adds the governance of MLflow AI Gateway to the monitoring and payload logging of traditional Databricks Model Serving. (Available via AI Gateway preview)

OSS Model Marketplace: curated models backed by optimized Model Serving
• Governed: manage Marketplace models in Unity Catalog to ensure compliance and governance.
• Simple: get the latest open source models in just a few clicks, without writing boilerplate code.
• Optimized: automatic optimization for popular model architectures lets you deploy to Model Serving and scale in a cost-efficient, performant way.
Curated models:
• Instruction following: MPT-7B-Instruct, MPT-30B-Instruct, Falcon-7B-Instruct
• Text embeddings: instructor-xl, e5-base-v2, all-mpnet-base-v2
• Code generation: StarCoderBase, replit-code-v1-3b
• Transcription: whisper-large-v2, whisper-medium
• Image generation: stable-diffusion-2-1
OSS model guidance: research team findings are also published here.

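MLflow AI Gateway: easily manage, govern, evaluate, and switch models
Central Management and Governance
• Route users can query the specified models using credentials managed by the organization.
• Organizations can limit routes to control costs.
• Routes enable request logging and caching for better tracking and observability.
Unified APIs for AI Models and Providers
• A common API across diverse models and providers means users do not need to learn vendor-specific APIs and documentation.
• Upgrade to the latest and best LLMs easily, without restructuring the rest of your code.
• Centralized model access and management lets developers focus on the end product and spend less time on infrastructure updates.
Enable Multiple Gen AI Use Cases
• Simple integration keeps you future-proof: you stay free to use the best model as new ones are released.
• Cost controls and operational monitoring let you scale out AI applications responsibly.
All of Databricks Model Serving will provide these benefits.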

MLflow Deployments Server: easily manage, govern, evaluate, and switch models
Formerly called "AI Gateway". The slide repeats the same three benefit groups as the MLflow AI Gateway slide: Central Management and Governance, Unified APIs for AI Models and Providers, and Enable Multiple Gen AI Use Cases.
All of Databricks Model Serving will provide these benefits.
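The route idea described above (one named route per model, a shared query interface, request limits, and logging) can be sketched in a few lines. This is a toy model of the concept and not the MLflow AI Gateway or Deployments Server API: the `Route` and `Gateway` classes, the `max_requests` limit, and the stub providers are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Route:
    """A named route to one model provider, with a simple request budget."""
    name: str
    provider: Callable[[str], str]  # stand-in for a vendor-specific client
    max_requests: int               # hypothetical per-route limit for cost control
    log: List[dict] = field(default_factory=list)

    def query(self, prompt: str) -> str:
        if len(self.log) >= self.max_requests:
            raise RuntimeError(f"route '{self.name}' exceeded its request limit")
        response = self.provider(prompt)
        # Log every request and response for tracking and observability.
        self.log.append({"prompt": prompt, "response": response})
        return response


class Gateway:
    """Callers address routes by name and never touch vendor-specific clients."""

    def __init__(self) -> None:
        self.routes: Dict[str, Route] = {}

    def add_route(self, route: Route) -> None:
        self.routes[route.name] = route

    def query(self, route_name: str, prompt: str) -> str:
        return self.routes[route_name].query(prompt)


def provider_a(prompt: str) -> str:  # stub standing in for one vendor
    return f"[provider-a] {prompt}"


def provider_b(prompt: str) -> str:  # stub standing in for another vendor
    return f"[provider-b] {prompt}"


gateway = Gateway()
gateway.add_route(Route("chat", provider_a, max_requests=2))
print(gateway.query("chat", "Hello"))

# Switching the model behind the route does not change the calling code.
gateway.add_route(Route("chat", provider_b, max_requests=2))
print(gateway.query("chat", "Hello"))
```

The calling code only ever names the route, so swapping the provider behind `"chat"` is a configuration change rather than a code change.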

Lakehouse Monitoring: unified monitoring for reliable, insightful, and simple data-to-AI pipelines
• Integrated: track end-to-end lineage of training data, feature tables, models, and inference logs in Unity Catalog for simpler governance.
• Simple: automatically log inference tables and generate metric tables and SQL dashboards.
• Proactive: automate alerts on table quality and custom metrics, and diagnose data and model issues.

Application architecture: prompt engineering
[Diagram] The same prompt engineering architecture as shown earlier, with the platform benefits highlighted:
• Flexible deployment
• Governance and cost controls
• Choice of models
• Scalability
• MLOps integration

Reusable infrastructure
Your first GenAI use case helps you assemble the key parts of your eventual GenAI + data platform.
• AI App: Web App, Notebooks (e.g. Python), SQL (AI functions)
• Model Serving: AI Gateway, Model Serving (CPU/GPU), OpenAI, Mosaic Inference, …
• Model Hub: Models in Unity Catalog, Hugging Face Hub, …
• Monitoring: Inference Tables, Lakehouse Monitoring
Benefits: flexible deployment, governance and cost controls, choice of models, scalability, MLOps integration

GenAI journey: Retrieval augmented generation (RAG)

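Retrieval Augmented Generation (RAG): benefits
• Up-to-date, custom knowledge: model responses can be based on updated custom documents and data, not only on the training data.
• Reduced risk of hallucination: RAG grounds the model's inputs in external knowledge and can cite its sources.
• Domain-specific contextualization: by using proprietary or domain-specific data, RAG can handle specialized, domain-specific queries.
• Efficiency and cost-effectiveness: RAG enables customization with data without the overhead of fine-tuning, reducing development time and cost.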

Retrieval Augmented Generation (RAG): your data + an LLM "brain"
RAG uses the LLM as a reasoning engine rather than as a static model.
1. Users send a query to the RAG chain: "What is Spark Connect?"
2. Retrieve relevant info/data (context) from a vector database or feature store: "The Spark Connect client translates DataFrame…"
3. Augment the prompt with context: "Respond to Q based on D", where D is the relevant docs and Q is the question
4. An instruction-following LLM generates the answer from the context: "Spark Connect allows a decoupled client-server…"
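The retrieve, augment, generate flow above can be sketched without any external service. The following is a minimal sketch under loud assumptions: the retriever ranks by plain word overlap instead of vector embeddings, the three documents are toy text written for this example, and `fake_llm` is a stub standing in for an instruction-following model.

```python
import re

# Toy knowledge base written for this illustration (not taken from the deck).
DOCS = [
    "Spark Connect separates client applications from the Spark cluster.",
    "Delta Lake stores table data as Parquet files with a transaction log.",
    "MLflow tracks experiments, parameters, and models.",
]


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, docs, k=1):
    """Rank documents by word overlap with the question (a stand-in for vector search)."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def augment(question, context_docs):
    """Build a prompt that asks the model to answer from the retrieved context only."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def fake_llm(prompt):
    """Stub model: a real system would call an instruction-following LLM here."""
    return f"[stub answer generated from a {len(prompt)}-character prompt]"


question = "What is Spark Connect?"
context = retrieve(question, DOCS)
prompt = augment(question, context)
print(prompt)
print(fake_llm(prompt))
```

In a real deployment the word-overlap ranking would be replaced by an embedding similarity search, but the shape of the chain stays the same: the model only sees what the retriever hands it.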

Vector Search: create auto-updating vector indexes governed by Unity Catalog
• Ingestion pipeline managed for you
• Indexes governed by Unity Catalog
• APIs are also available for:
◦ Self-managed embeddings
◦ Upsert / delete through a CRUD API
• Integrates with LangChain, LlamaIndex, and others
• Scale out endpoints as needed
Choose a source table, choose any embedding model (through MLflow AI Gateway), and call a simple API to create a semantic search index:
    # Parameter names are reproduced as shown on the slide.
    from databricks.vector_search.client import VectorSearchClient

    client = VectorSearchClient()
    index = client.create_delta_sync_index(
        endpoint_name="storage_endpoint",
        index_name=f"{catalog}.{schema}.{index}",
        source_table_name="ml.llm.spark_docs",
        primary_key="id",
        embedding_source_column="text",
        ai_gateway_route_name="openai-embedding",
        pipeline_type="CONTINUOUS",
    )
Call the endpoint for real-time search:
    result = index.similarity_search(
        query_text="What is Spark Connect?",
        columns=["id", "text", "link"],
        filters={"doctype": "wiki"},
    )

Chains (and agents): building pipelines that involve context and complex reasoning
Development
• Chains and agents let you connect modular LLM capabilities in a structured way.
• Example chain: Vector DB lookup, then Prompt template, then LLM summarizer
• Popular frameworks include: LangChain, LlamaIndex, Hugging Face
Deployment and Tracking
• MLflow supports tracking and logging of chains, agents, and models: mlflow.langchain.log_model(lc_model=llm_chain, …)
• Models can be registered in Unity Catalog for governance and lineage tracking.
• Built-in MLflow flavors include: LangChain, OpenAI, Transformers, Sentence Transformers, PyFunc (for any custom framework)

RAG Architecture: Unstructured data processing (Databricks managed embeddings)
[Diagram] Components, by label:
• External Sources, then Ingestion into Storage (Tables, Volumes, Delta Tables): files & metadata
• Document processing (Workflows, Delta Live Tables, Notebooks): 1. parsing 2. cleaning 3. chunking 4. featurization, producing chunks & features
• Vector DB (Vector Search): automatic sync from Delta Tables; chunks in, vectors out
• Model Hub (Hugging Face Hub, Models in Marketplace, …): deploy model(s)
• Model Serving: External Models (OpenAI), Custom Models, Foundational Models

RAG Architecture: Unstructured data processing (customer managed embeddings)
[Diagram] Components, by label:
• External Sources, then Ingestion into Storage (Tables, Volumes, Delta Tables): files & metadata
• Document processing (Workflows, Delta Live Tables, Notebooks): 1. parsing 2. cleaning 3. chunking 4. featurization, producing chunks & features
• Embedding (batch): chunks, vectors & features written back to Delta Tables
• Vector DB (Vector Search): automatic sync from Delta Tables
• Model Hub (Hugging Face Hub, Models in Marketplace, …): deploy model(s)
• Unified Serving Endpoints: Remote Models (OpenAI), Custom Models, Foundational Models
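Two of the document processing steps listed in these diagrams, cleaning and chunking, are easy to show concretely. The sketch below is an assumption about one common way to do it (whitespace normalization, then fixed-size word windows with overlap); the deck does not specify a chunking strategy, and the `size` and `overlap` values are arbitrary.

```python
def clean(text):
    """Collapse runs of whitespace (newlines, tabs, double spaces) into single spaces."""
    return " ".join(text.split())


def chunk(text, size=50, overlap=10):
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop once the remaining words are already covered by the previous window.
    return [
        " ".join(words[start:start + size])
        for start in range(0, max(len(words) - overlap, 1), step)
    ]


raw = "  w0 w1\tw2 w3\n\nw4 w5 w6   w7 w8 w9 w10 w11 "
chunks = chunk(clean(raw), size=5, overlap=2)
for c in chunks:
    print(c)
```

Overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing and embedding some words twice.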

RAG Architecture: Structured data processing
[Diagram] Components, by label:
• External Sources, then Ingestion into Storage (Delta Tables): Delta Table rows
• Data Prep (Workflows, Delta Live Tables, Notebooks): 1. parsing 2. cleaning 3. chunking 4. featurization, producing features
• Serving (Feature Serving): automatic sync from Delta Tables

RAG Architecture: Chain
[Diagram] Components, by label:
• Chain logic (LangChain): Question, Query Processing, Query Expansion, Retriever(s), Prompt Engineering, Generation, Post Processing, Response
• AI Serving, Models: External Models, Custom Models, Foundational Models (deployed)
• AI Serving, Functions: Function Serving
• AI Serving, Data: Vector Search Index, Feature Serving
• Governance and monitoring: Unity Catalog, Inference Tables (logs), Lakehouse Monitoring

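Problem: Data prep. Scalable pipelines to prepare structured and unstructured data for RAG.
Benefit columns: faster time to market; fewer engineering resources required; more accurate, higher-quality bots
Vector Search
• How it solves this problem: automatically syncs any Delta table into a scalable, low-latency vector database and makes it available to RAG apps.
• Benefits: ✅ 1 minute to deploy vs. days ✅ No data ETL pipeline needed ☑ Your data is more easily used by the bot
Feature Serving
• How it solves this problem: makes Delta tables available to RAG apps with scalable, low-latency serving.
• Benefits: ✅ 1 minute to deploy vs. days ✅ No data ETL pipeline needed ☑ Your data is more easily used by the bot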

Application architecture: RAG
[Diagram] Flow between users, the AI application (RAG chain), and the platform:
1. Users query the RAG model
2. Search for related content in Data Serving (Vector Search), which returns related docs
3. Construct prompts (templates, prompts)
4. Send prompts to the LLM (instruction-following model) to generate a response
5. Return the response to users
Supporting components:
• ETL: ingest docs (Files, Tables, Volumes) and prepare docs (cleanse, chunk, …) with Delta Live Tables
• Embedding model on Model Serving (GPU), OpenAI, Mosaic Inference, …: compute embeddings; the index automatically syncs with the Delta table
• Model Hub (Hugging Face Hub, Models in Unity Catalog, …): choose and load model(s)
• AI Gateway, Model Serving (CPU/GPU), Model Serving (CPU) for the RAG chain (LangChain, …)
• Monitoring
See implemented in https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot

Application architecture: RAG
[Diagram] The same RAG architecture as shown on the previous slide, with the platform benefits highlighted:
• Automatic ingestion
• Unity Catalog governance
• Choice of models
• MLOps integration
• Choice of frameworks
See implemented in https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot

GenAI journey: Fine-tuning

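Fine-tuning: benefits
• Domain-specific customization: general LLMs are trained on large, general datasets, so they have broad knowledge but lack depth in niche areas. Fine-tuning lets an organization specialize a model for a specific domain or application.
• More control over model behavior: fine-tuning gives fine-grained control over model outputs. This lets an organization address specific biases, enforce correctness, and refine model behavior based on feedback.
• Lower inference cost: fine-tuning can teach the LLM how to respond, so those instructions no longer need to be included in every prompt.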

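Fine-tuning: what is it?
Fine-tuning adapts an existing GenAI model to a specific domain or task through additional training on a small dataset (millions of tokens rather than on the order of 100 billion).
Two common forms:
• Supervised instruction fine-tuning:
◦ Continue training on a dataset of thousands of (instruction, context, response) examples
◦ Example: question-answering applications
• Continued pre-training:
◦ Continue training on domain-specific unstructured text, such as text containing new vocabulary or a new language
◦ Example: code completion for an esoteric programming language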
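To make the (instruction, context, response) dataset concrete, here is a minimal sketch of turning such records into training text. The "### Instruction / ### Context / ### Response" layout and the JSONL output with a `text` field are assumptions chosen for illustration; the deck does not prescribe a format, and a given training tool may expect a different one.

```python
import json


def format_example(record):
    """Render one (instruction, context, response) record as a single training string."""
    parts = [f"### Instruction:\n{record['instruction']}"]
    if record.get("context"):  # context is optional and skipped when empty
        parts.append(f"### Context:\n{record['context']}")
    parts.append(f"### Response:\n{record['response']}")
    return "\n\n".join(parts)


# Toy records written for this illustration.
records = [
    {
        "instruction": "Answer the customer's question.",
        "context": "Orders ship within two business days.",
        "response": "Your order will ship within two business days.",
    },
    {
        "instruction": "Greet the customer.",
        "context": "",
        "response": "Hello! How can I help you today?",
    },
]

# One JSON object per line (JSONL), a common interchange format for training data.
jsonl = "\n".join(json.dumps({"text": format_example(r)}) for r in records)
print(jsonl)
```

Keeping the formatting in one function matters because the same layout has to be reproduced in the prompt at inference time for the fine-tuned behavior to carry over.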

Fine-tuning with Databricks
Custom code in Databricks (Generally available)
• Fine-tune any GenAI model with the latest tools: Hugging Face, DeepSpeed, PyTorch, TensorFlow, etc.
• Use standard Databricks workflows: GPU clusters, MLflow, notebooks/jobs, ...
• Examples:
◦ Hugging Face and MLflow (docs)
◦ Hugging Face and DeepSpeed (blog)
◦ Parameter Efficient Fine-Tuning with LoRA (blog)
Databricks Fine-Tuning API (Available in AutoML Private Preview)
• A simple tool for fine-tuning common embedding and instruction-following architectures
• Build custom models using your data via a simple API and configs
• Bring custom training data, choose the model architecture, and set configs: training cost, quality target, serving cost, serving latency
MosaicML
• More in the next section!

Application workflow: fine-tuning
[Diagram] Steps:
1. Ingest training docs (Files, Tables, Volumes)
2. Prepare data (Data Preparation: Notebooks, Spark, Delta Live Tables, …)
3. Load base model from the Model Hub (Models in Unity Catalog, Hugging Face Hub, …)
4. Fine-tune model on Compute (GPU cluster)
5. Register customized model (myModel) in Unity Catalog
AI Orchestrator & Tools: Notebooks, MLflow, Trainer, PEFT, Transformers, DeepSpeed, MosaicML, PyTorch, TensorFlow, Spark, …

Application workflow: fine-tuning
[Diagram] The same fine-tuning workflow as shown on the previous slide, with the platform benefits highlighted:
• Seamless transition from Data Engineering to Data Science
• Unity Catalog governance
• Simple infrastructure for your custom code
• Scalability

GenAI journey: Pre-training

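Pre-training: benefits
• Full customization for a specific domain: training a model from scratch tailors its foundational knowledge to a specific domain (medical, legal, code, etc.).
• Proprietary data sources: your own data can give the model knowledge and insight beyond what other pre-trained GenAI models have access to.
• Full control of training data: pre-training gives you transparency into, and control over, the data the model is trained on. This helps ensure that data security, privacy, and legal requirements are met.
• Avoiding third-party biases: third-party pre-trained models carry biases and limitations from their training data. Custom pre-training gives you more control over such biases and limitations in your GenAI applications.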

Pre-training: what is it?
Training a new GenAI model from scratch.
• This contrasts with fine-tuning, which further adapts an existing pre-trained model.
• A pre-trained model can be used as-is or fine-tuned further.
For example:
• A model trained on all PubMed articles from 1970 to 2022
• BloombergGPT knows about all Bloomberg articles and about finance

MosaicML: up to 7X faster and cheaper training of large AI models
• Train large AI models in a simple, scalable, and cost-efficient way
• Train or fine-tune your own generative AI models on your data, in your secure environment
• Full control of the model and data privacy
Your data, your model, built in your secure environment.

MosaicML example use case
"Using the MosaicML platform, we were able to train and deploy an LLM on our own data within a week." Amjad Masad, CEO


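Databricks, powered by AI
Example: Databricks Assistant, an AI assistant that understands the context of your data inside notebooks, the SQL editor, and the file editor
• Generates and autocompletes code and queries
• Explains and fixes issues
• Integrates with Unity Catalog to give results grounded in context about your data assets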

Lakehouse IQ: an AI-powered knowledge engine that uniquely understands your business
Inputs: Org Chart, Unity Catalog, Docs, Popularity, Dashboards, Lineage, Notebooks, Queries
Where it is used:
• Search: understands your organization's structure, jargon, and data
• Databricks Assistant: knows Unity Catalog, table semantics, and permissions
• Administration: for Unity Catalog, such as auto-generating table metadata
• Custom apps: Lakehouse IQ API
Listen to the DAIS 2023 keynote announcing Lakehouse IQ

MLOps: what changes with LLMs?
Each property of LLMs is paired with its implications for MLOps.
• LLMs come in many forms: general proprietary and OSS models behind paid APIs, off-the-shelf open source models, custom models fine-tuned for specific applications, and custom pre-trained models.
◦ Development process: develop incrementally. Start with APIs and move toward custom models.
• LLMs take natural language prompts as input. Prompts can be engineered to elicit the desired response.
◦ Development process: design prompt templates for LLM queries.
◦ Packaging artifacts: package ML logic as a pipeline, not just as a model.
• LLMs can be given prompts that include examples and context.
◦ Serving infrastructure: use vector databases and other tools to find relevant context.
• Third-party APIs provide both proprietary and open source models.
◦ API governance: use centralized API governance to keep options and flexibility across API providers.
• LLMs are very large deep learning models, often ranging from gigabytes to hundreds of gigabytes.
◦ Serving infrastructure: use GPUs to serve LLMs. Use fast storage if models must be loaded dynamically.
• LLMs are hard to evaluate with traditional ML metrics, because there is often no single "right" answer.
◦ Human feedback: evaluate and test LLMs with user feedback. Feed user feedback directly into the MLOps process, including testing, monitoring, and future fine-tuning.
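The "gigabytes to hundreds of gigabytes" range follows from simple arithmetic: the size of the weights is roughly the parameter count times the bytes used per parameter. The sketch below works this out for 7B and 70B parameter models, the sizes named in the Model Serving slide; the helper name is hypothetical, and the estimate deliberately ignores activations, KV cache, and any runtime overhead.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_size_gb(num_params, dtype="fp16"):
    """Approximate size of the model weights in gigabytes (10^9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, params in [("7B", 7_000_000_000), ("70B", 70_000_000_000)]:
    sizes = ", ".join(
        f"{dtype}: {weight_size_gb(params, dtype):.0f} GB" for dtype in BYTES_PER_PARAM
    )
    print(f"{name} parameters -> {sizes}")
```

At 16-bit precision a 70B parameter model needs about 140 GB for its weights alone, which is why serving calls for GPUs with large memory and why loading models dynamically calls for fast storage.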

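MLOps: what changes with LLMs? Many existing tools and processes stay the same
• The separation of development, staging, and production stays the same.
• Git and Models in Unity Catalog remain the primary conduits for promoting ML logic toward production.
• The Lakehouse architecture remains essential for efficiency and collaboration.
• Existing CI/CD infrastructure can be reused.
• The structure of MLOps stays modular, with separate pipelines for model training, model inference, and so on.
[Diagram] DevOps, DataOps, and ModelOps on the Lakehouse Platform across development, staging, and production, governed by Unity Catalog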

Resources
Learn about Gen AI
• edX LLM Courses (also on Databricks Academy)
• Generative AI Fundamentals (Databricks Academy)
Learn about Gen AI on Databricks
• RAG End-to-End Example (code)
• Vector Search + Lakehouse Monitoring (talk)
• LLM Eval best practices and LLM Eval in MLflow (blogs)
Learn about MLOps
• Big Book of MLOps (includes a Gen AI section)
• LLMOps Data+AI Summit 2023 talk

Customer Stories
• JetBlue: blog post and Data+AI Summit 2023 talk on Databricks AI and LLMs for many use cases
• easyJet: blog post on LLMs for digital customer service, personalization and operations
• Comcast: Data+AI Summit 2023 talk on the Databricks AI platform
• Texas Rangers: YouTube short on Databricks AI for powering player performance and fan experience
• Barracuda: blog post on Databricks AI for preventing email phishing attacks at scale

Databricks Generative Data Platform
[Diagram] Layers, by label:
• Workloads: Data Warehousing (Databricks SQL), ETL & Orchestration (Delta Live Tables, Workflows), Data Science & Gen AI (Databricks AI), Real-time Analytics
• Intelligence Engine: LakehouseIQ
• Unified security and governance: Unity Catalog
• Unified data storage, management, and sharing: Delta Lake

Model Serving: real-time inference of models with up to 10X lower latency and reduced costs
• Serving with high availability, low latency, and autoscaling
• Automatic feature lookup, monitoring, and unified governance
• Optimized for OSS GenAI models: up to 10X lower latency and cost

Databricks AI: a data-centric approach
[Diagram] Workflow over a shared foundation:
• Prepare data & features with native tools (Datasets)
• Use a pre-trained model or build a custom model (Models)
• Serve models into real-time apps and monitor (Applications)
• Foundation: Data Platform (Delta Lake) and Governance (Unity Catalog)

Generative AI with Databricks
Foundation model + your data
• Data requirements: small (10s of thousands of words)
• Objective: securely host an open source model and connect it to your enterprise data
• Models: MPT family, Llama 2, Falcon
Custom model fine-tuned or trained on your data
• Data requirements: medium to large (millions to trillions of words)
• Objective: customize models on your data for your specific use cases
• Models: MPT family, Llama 2, your specific model

Foundation Model + Your Data
The customer has small amounts of text (~100k words max), such as:
▪ HR handbook
▪ Instruction manual
▪ Support tickets
Customer data is organized using Vector Search and queried using an open source model (for example Falcon or Llama 2) hosted in Databricks Model Serving.

Custom Model fine-tuned or trained with your data
The customer has medium to large amounts of text (1M to 1T words).
• Enables models to have domain knowledge
• Enables new modalities, such as code, images and proteomics
Customer data is used to securely build a custom model in the customer's private environment using MosaicML. This model is unique to their data and becomes their IP.

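What's needed to be successful
Three pillars: own your data and models; cost-effective LLM training; faster time to production
• Improved privacy: own GenAI models that are trained securely on your enterprise data
• Get GenAI models into production quickly and reliably
• Training and deploying LLMs is cost-effective at scale
• Standardized operations scale to production across use cases
• Governance and monitoring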
