Azure AI Search Overview_Startups_11292023

Azure AI Search Overview https://aka.ms/mfs_discord https://aka.ms/daka_linkedin https://aka.ms/daka_x https://aka.ms/daka_qiita

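Common uses of Azure AI Search
Workplace Search: Help internal teams explore databases and files
• Improve efficiency and productivity
• Strengthen data access
• Improve decision-making
SaaS Search: Build market-ready, customer-facing applications
• Improve the user experience
• Shorten development time
eCommerce: Help customers find and purchase products and services
• Provide personalized recommendations
• Improve the user experience
• Enhance product discovery
• Increase conversion rates
Website Search: Help visitors find information quickly and easily
• Improve findability
• Better understand user behavior and needs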

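Azure AI Search
• Platform as a service
• Semantic search
• No management required
• Keyword search
• Faceting
• Linguistic analysis
• Geospatial support
• Suggestions/autocomplete
• Customizable scoring
• Proximity search
• Synonyms
• Cognitive skills, and more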

Spelling mistakes, geospatial queries, filters and facets, snippets and highlights, suggestions and autocomplete, ranking, paging

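Azure AI Search
• A feature-rich vector database
• Ingest any data type, from any source
• Seamless data and platform integration
• State-of-the-art search ranking
• Enterprise-ready foundation
Status labels shown on the slide: Generally available, Public preview, Generally available
Features shown on the slide: Vector search, Azure AI Search in Azure AI Studio, Semantic ranker, Integrated vectorization
Uses in generative AI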

Vector search in Azure AI Search: feature-rich and enterprise-ready

Vector search in Azure AI Search
• A comprehensive vector search solution
• Enterprise-ready → scalability, security, compliance
• Already integrated with Semantic Kernel, LangChain, LlamaIndex, Azure OpenAI Service, Azure AI Studio, and more
Generally available

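Vector search strategies
ANN search
• Fast vector search at scale
• Uses HNSW, a graph method with an excellent performance-recall profile
• Fine-grained control over index parameters
    # search_client and search_vector are assumed to be defined elsewhere
    r = search_client.search(
        None,
        top=5,
        vector_queries=[RawVectorQuery(
            vector=search_vector, k=5, fields="embedding")])
Exhaustive KNN search
• Per query, or built into the schema
• Useful for creating a recall baseline
• Scenarios that use highly selective filters
• Example: dense multi-tenant applications
    # same query, but forcing an exhaustive (exact) scan for this request
    r = search_client.search(
        None,
        top=5,
        vector_queries=[RawVectorQuery(
            vector=search_vector, k=5, fields="embedding",
            exhaustive=True)])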

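Rich vector search query capabilities
Filtered vector search
• Supports date ranges, categories, geographic distance, and more
• Rich filter expressions
Pre-/post-filtering
• Pre-filter: well suited to selective filters; recall is not disrupted
• Post-filter: well suited to less selective filters, but take care that the results do not come back empty
    # query_vector is assumed to be an embedding computed elsewhere
    r = search_client.search(
        None,
        top=5,
        vector_queries=[RawVectorQuery(
            vector=query_vector, k=5, fields="embedding")],
        vector_filter_mode=VectorFilterMode.PRE_FILTER,
        filter="category eq 'perks' and created gt 2023-11-15T00:00:00Z")
Multi-vector scenarios
• Multiple vector fields per document
• Multi-vector queries
• Can be combined as needed
    # query1 and query2 are two separate query embeddings
    r = search_client.search(
        None,
        top=5,
        vector_queries=[
            RawVectorQuery(vector=query1, k=5, fields="embedding"),
            RawVectorQuery(vector=query2, k=5, fields="embedding")
        ])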
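The difference between pre-filtering and post-filtering can be sketched in a few lines of plain Python. This is a toy illustration with an exact (exhaustive) scan, not how the service implements filtering; the documents, the two-dimensional embeddings, and the helper names are all made up for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(docs, query, k):
    """Return the k documents most similar to the query (exact scan)."""
    return sorted(docs, key=lambda d: cosine(d["embedding"], query), reverse=True)[:k]

def pre_filter_search(docs, query, k, predicate):
    """Apply the filter first, then take the top k of what remains."""
    return top_k([d for d in docs if predicate(d)], query, k)

def post_filter_search(docs, query, k, predicate):
    """Take the top k first, then drop results that fail the filter."""
    return [d for d in top_k(docs, query, k) if predicate(d)]

# Hypothetical toy data: two "news" documents sit close to the query.
docs = [
    {"id": "a", "category": "news", "embedding": [1.0, 0.0]},
    {"id": "b", "category": "news", "embedding": [0.9, 0.1]},
    {"id": "c", "category": "perks", "embedding": [0.0, 1.0]},
]
query = [1.0, 0.0]

def is_perk(doc):
    return doc["category"] == "perks"

pre_ids = [d["id"] for d in pre_filter_search(docs, query, 2, is_perk)]
post_ids = [d["id"] for d in post_filter_search(docs, query, 2, is_perk)]
print(pre_ids, post_ids)  # ['c'] []
```

With a selective filter, the post-filter variant returns nothing because both of the nearest neighbors fail the filter, which is the "empty results" risk the slide warns about.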

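Enterprise-ready vector database
• Data encryption: including the option of customer-managed encryption keys
• Secure authentication: support for managed identities and RBAC
• Network isolation: private endpoints, virtual networks
• Compliance certifications: extensive certifications across a wide range of sectors such as finance, healthcare, and government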

Not just text
• Images, audio, graphs, and more
• Multimodal embeddings, e.g. images + text with Azure AI Vision
• Existing vectors → vector search applies
• RAG with images using GPT-4 Turbo with Vision

Azure AI Search: Seamless data and platform integration

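Data preparation for RAG applications
Chunking
• Split long text into shorter passages
• LLM context length limits
• Focused subsets of the content
• Multiple independent passages
Basics
• About 200 to 500 tokens per passage
• Preserve lexical boundaries
• Introduce overlap
Layout
• Layout information is valuable, for example for tables
Vectorization
• At indexing time: convert passages to vectors
• At query time: convert the query to a vector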
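The chunking basics above (bounded passage size, whole-word boundaries, overlap) can be sketched as follows. This is a minimal illustration, not the chunker used by Azure AI Search: it counts whitespace-separated words as a rough stand-in for tokens, and the function name and defaults are assumptions made for the example.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into passages of up to chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()  # splitting on whitespace keeps whole words intact
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
# ['w0 w1 w2 w3', 'w3 w4 w5 w6', 'w6 w7 w8 w9']
```

A production pipeline would normally count tokens with the embedding model's tokenizer and prefer sentence or paragraph boundaries over raw word windows.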

Azure AI Studio & Azure AI SDK
• First-class integration
• Build an index from data in Blob storage, Microsoft Fabric, and other sources.
• Attach to an existing Azure AI Search index.

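Integrated vectorization
End-to-end data processing tailored to RAG
Data source access
• Blob Storage
• ADLSv2
• SQL DB
• CosmosDB
• …
+ Incremental change tracking
File format parsing
• PDFs
• Office documents
• JSON files
• …
+ Image and text extraction, with OCR as needed
Chunking
• Split text into passages
• Propagate document metadata
Vectorization
• Convert chunks to vectors
• OpenAI embeddings or your custom model
Indexing
• Document index
• Chunk index
• Both
In preview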

Azure AI Search: State-of-the-art retrieval system

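Semantic ranker*
• SOTA re-ranking model
• Highest-performing retrieval mode
• New pay-as-you-go pricing: 1,000 requests per month free, then $1 per additional 1,000 requests
• Multilingual
• Includes extractive answers, captions, and ranking
Generally available
*Formerly semantic search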

Relevance
• Relevance matters for RAG apps.
• Many passages in the prompt → lower quality → you cannot focus on recall alone
• Incorrect passages in the prompt → answers that may be well grounded but wrong → it helps to set a threshold for "good enough" grounding data
[Figure: accuracy (y-axis) versus number of documents in the input context (x-axis).] Source: Lost in the Middle: How Language Models Use Long Contexts, Liu et al. arXiv:2307.03172

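Improving relevance
All the information retrieval tricks apply. A complete search stack produces better results:
• Hybrid search (keyword + vector) > pure vector or keyword search
• Hybrid + re-ranking > hybrid
Identify good and bad candidates
• Normalized scores from the semantic ranker
• Exclude documents below a threshold
Pipeline shown on the slide: Vector and Keywords → Fusion (RRF) → Reranking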
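Excluding documents below a score threshold can be sketched as a simple post-processing step on the result list. This is an illustrative sketch only: the results are plain dictionaries, the key name `@search.reranker_score` is an assumption about how the re-ranker score is exposed to the client, and the threshold value is arbitrary and would need tuning on your own data.

```python
def keep_good_candidates(results, threshold, score_key="@search.reranker_score"):
    """Keep only results whose re-ranker score is present and at least `threshold`."""
    return [
        r for r in results
        if r.get(score_key) is not None and r[score_key] >= threshold
    ]

# Hypothetical search results; "c" has no re-ranker score at all.
results = [
    {"id": "a", "@search.reranker_score": 3.1},
    {"id": "b", "@search.reranker_score": 1.2},
    {"id": "c"},
]
grounding = keep_good_candidates(results, threshold=2.0)
print([r["id"] for r in grounding])  # ['a']
```

Only the passages that survive the threshold would then be placed in the prompt as grounding data.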

[Figure: Information retrieval relevance by method. Bar chart comparing Keyword, Vector (ada-002), Hybrid (Keyword + Vector), and Hybrid + Semantic ranker on Customer datasets [NDCG@3], Beir [NDCG@10], and Miracl [NDCG@10].]
Retrieval comparison using Azure AI Search in various retrieval modes on customer and academic benchmarks. Source: Outperforming vector search with hybrid + reranking

Impact of query type on relevance (all values are NDCG@3)
Query type                      Keyword   Vector   Hybrid   Hybrid + Semantic ranker
Concept seeking queries         39        45.8     46.3     59.6
Fact seeking queries            37.8      49       49.1     63.4
Exact snippet search            51.1      41.5     51       60.8
Web search-like queries         41.8      46.3     50       58.9
Keyword queries                 79.2      11.7     61       66.9
Low query/doc term overlap      23        36.1     35.9     49.1
Queries with misspellings       28.8      39.1     40.6     54.6
Long queries                    42.7      41.6     48.1     59.4
Medium queries                  38.1      44.7     46.7     59.9
Short queries                   53.1      38.8     53       63.9
Source: Outperforming vector search with hybrid + reranking
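For readers unfamiliar with the NDCG@k metric used in the comparison above, here is a minimal sketch of how it can be computed. It uses the linear-gain variant (other variants use 2^rel - 1 as the gain) and takes the ideal ordering from the supplied relevance list only; how the benchmark numbers in this deck were actually computed and scaled is not stated in the source.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the best possible ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Graded relevance labels of the returned results, in ranked order.
perfect = ndcg_at_k([3, 2, 1], k=3)   # already in the ideal order
swapped = ndcg_at_k([1, 2, 3], k=3)   # best result ranked last
print(perfect, swapped < perfect)
```

A ranking that places the most relevant documents first scores 1.0; pushing them down the list lowers the score.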

Retrieval-augmented generation (RAG)

1. Find Teams messages about "Falcon Climate Finance" that are relevant to this week
2. Build the prompt: instructions, context, retrieved content
3. Show the results and propagate references

Large Language Model; Retrieval system; Your custom Copilot; Data sources (files, databases, etc.)
RAG – Retrieval Augmented Generation

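Power RAG with advanced retrieval: invest in state-of-the-art retrieval technology to improve results
The quality of the retrieval system (the "R" in RAG, the retriever) matters. Azure AI Search is committed to providing the best retrieval solution through:
- Vector search capabilities
- Hybrid search
- Advanced filtering
- Document security
- L2 re-ranking/optimization
- Built-in chunking
- Automatic vectorization
- And many more features!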

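Example: Robust retrieval for RAG applications
• Responses are only as good as the retrieved data
• Keyword search has recall challenges
  • The "vocabulary gap"
  • Accuracy drops further with natural-language questions
• Vector-based retrieval finds documents by semantic similarity
  • Robust to variation in how a concept is expressed (word choice, morphology, specificity, etc.)
Question: "Looking for lessons on underwater activities"
Won't match: "Scuba diving classes", "Snorkeling group sessions"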

Vectors and vector databases

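Vectors
Learned vector representations
• A model encodes items into vectors
• Similar items map to nearby vectors
• Text, images, graphs, and more
Vector search
• Given a "query" vector, find the K nearest vectors
• Search exhaustively, or through approximation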
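The "find the K nearest vectors" step can be sketched with an exhaustive scan. This is a toy illustration in plain Python: the two-dimensional vectors are invented for the example (real embeddings have hundreds or thousands of dimensions), and the function names are hypothetical.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def exhaustive_knn(query, vectors, k):
    """Score every stored vector against the query and return the k best (score, key) pairs."""
    scored = ((cosine_similarity(query, vec), key) for key, vec in vectors.items())
    return heapq.nlargest(k, scored)

# Made-up vectors: the two water-related items point in a similar direction.
vectors = {
    "scuba diving class": [0.9, 0.1],
    "snorkeling session": [0.8, 0.3],
    "tax filing seminar": [0.0, 1.0],
}
query = [1.0, 0.0]  # stands in for "lessons on underwater activities"
nearest = [key for _, key in exhaustive_knn(query, vectors, k=2)]
print(nearest)  # ['scuba diving class', 'snorkeling session']
```

An approximate index such as HNSW avoids scoring every vector, trading a small amount of recall for much lower latency on large collections.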

Vector databases
• Durably store and index vectors and metadata at scale
• A variety of indexing and search strategies
• Combine vector queries with metadata filters
• Enable access control

Vector databases in Azure
Vectors in Azure databases
• Keep your data where it is: native vector search capabilities
• Built into the Azure Cosmos DB MongoDB vCore and Azure Cosmos DB for PostgreSQL services
Azure AI Search
• Best relevance: highest-quality results
• Automatically index data from Azure data sources: SQL DB, Cosmos DB, Blob storage, ADLSv2, and more

RAG at scale: Use Azure AI Search to power massive, mission-critical RAG workloads

Presentation resources
Repos
• Azure Cognitive Search as a vector database for OpenAI embeddings | OpenAI Cookbook
• Azure Cognitive Search - 🦜🔗 LangChain 0.0.198
• Azure Cognitive Search - LlamaIndex 🦙 0.8.46 (gpt-index.readthedocs.io)
• Azure-Samples/azure-search-openai-demo (github.com)
• Azure-Samples/chat-with-your-data-solution-accelerator: A Solution Accelerator for the RAG pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. This includes most common requirements and best practices. (github.com)
• Azure-Samples/azure-search-comparison-tool: A demo app showcasing Vector Search using Azure Cognitive Search, Azure OpenAI for text embeddings, and Azure AI Vision for image embeddings. (github.com)
Docs
• Vector search - Azure Cognitive Search | Microsoft Learn
• Azure/cognitive-search-vector-pr: Private repository for the Vector search feature in Azure Cognitive Search. (github.com)
• Azure OpenAI Service - Documentation, quickstarts, API reference - Azure Cognitive Services | Microsoft Learn
• Image Retrieval concepts - Image Analysis 4.0 - Azure Cognitive Services | Microsoft Learn
• Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities - Microsoft Community Hub

The following pages are reference material / English Only
・Understanding Vector Search
・Vector Search in Azure AI Search

How might we take our enterprise search or RAG scenarios
to the next level?

Understanding Vector Search

Let’s review the basics! Traditional information retrieval
• Query: Formal statement representing your information need (e.g., search string)
• Object: Entity within your content collection (e.g., document, image, audio)
• Relevance: Quantitative measure of how well an object satisfies the intent of the query
• Ranking: Ordered list of relevant results based on their desirability or relevance score

Let’s review the basics! Search via Inverted Indexes
Document 1: “apple orange banana”
Document 2: “orange apple grape”
Document 3: “banana grape apple”
Dictionary          Postings Lists
Term      Freq      Documents
apple     3         1, 2, 3
orange    2         1, 2
banana    2         1, 3
grape     2         2, 3
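The dictionary and postings lists above can be reproduced with a short sketch. It is a minimal illustration of the data structure using the three example documents from the slide; the function names are made up, and a real engine would add analysis (tokenization, lowercasing, stemming) and term positions.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id].split())):
            postings[term].append(doc_id)
    return dict(postings)

def boolean_and(index, *terms):
    """Return ids of documents that contain every one of the given terms."""
    sets = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []

docs = {
    1: "apple orange banana",
    2: "orange apple grape",
    3: "banana grape apple",
}
index = build_inverted_index(docs)
for term in ("apple", "orange", "banana", "grape"):
    print(term, len(index[term]), index[term])  # term, document frequency, postings
print(boolean_and(index, "apple", "orange"))  # [1, 2]
```

The printed frequencies and postings match the table on the slide, and `boolean_and` shows how a Boolean query is answered by intersecting postings lists.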

Let’s review the basics! Ranking & Relevance in Traditional Search
• Relevance via Boolean Search: Retrieve documents containing specific terms (e.g., "apples" AND "oranges")
• Ranking via BM25: A ranking algorithm influenced by 3 key factors:
  • Term Frequency (TF): More occurrences of the search term indicate higher relevance
  • Inverse Document Frequency (IDF): The rarer a term across documents, the more important it is
  • Field Length: Terms found in shorter fields (fewer words) are more likely to be relevant than terms in longer fields (more words)
[Figure: BM25 formula]
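Since the formula image did not survive extraction, here is a sketch of the commonly cited Okapi BM25 scoring function, showing how the three factors interact. It is a generic textbook formulation with a Lucene-style IDF, not necessarily the exact variant the service uses; the defaults k1=1.2 and b=0.75 are conventional values, and whitespace tokenization is a simplification.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` (a non-empty list of strings) against `query`."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.split():
            if term not in tf:
                continue
            # IDF: rarer terms get a larger weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer fields dampen the term-frequency contribution.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

short_doc, long_doc = bm25_scores("grape", ["grape", "grape apple banana orange"])
print(short_doc > long_doc)  # True: the same term counts for more in the shorter field
```

Raising `b` strengthens the field-length effect, while raising `k1` lets repeated occurrences of a term keep adding to the score for longer before saturating.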

Combining Lexical and Semantic Representations for Optimal Recall
Leveraging the Strengths of Lexical and Semantic Approaches in Retrieval
• Discrete (Lexical) Representations
  • Advantages: exact matching; precise control and easy explainability
  • Limitations: struggle to capture nuances in language; limited understanding of conceptual similarity
• Dense (Vector/Semantic) Representations
  • Advantages: capture conceptual similarity; better understanding of language nuances
  • Limitations: not built to match exact terms; reduced explainability compared to discrete representations
Bottom Line: Achieve optimal recall by leveraging the strengths of both discrete and semantic representations for a comprehensive understanding of language

Search AI Better information in the context Better representations of
data (embeddings) Search + AI Better Together

Vector Search at a High Level
Scenario: A diverse collection of books, each containing unique insights and knowledge
The Challenge: Finding a book on a specific topic or theme can be time-consuming and overwhelming, especially when the content is scattered.
The Solution: A skilled librarian can quickly connect you to books with similar topics or themes

Vector Search: Deeper Dive
Scenario: Organizations need efficient methods to retrieve semantically similar items from large-scale data sources.
The Challenge: Sifting through large-scale databases to find related items can be resource-intensive and time-consuming.
The Solution: By calculating similarity metrics like cosine similarity, vector search organizes data and retrieves semantically similar items within the high-dimensional space.

How can I create vector representations of my data?

Embeddings: Convert Data into Vector Representation
Simplifying complex data structures for efficient analysis and processing in various applications.
Definition
• Abstract, dense, compact, learned numerical representations of data
• Map complex structures into simpler, fixed-size vectors
• Applicable to diverse data types (text, images, audio, etc.)
Purpose
• Facilitate analysis and processing of diverse data types
• Enable similarity measurement, clustering, and classification
• Power applications like Vector search and recommendation systems
Benefits
• Efficient search and organization of vast datasets
• Improved accuracy and relevance of search results
• Scalable and adaptable to various industries and use cases

Why do I need embeddings? Vectors are a universal representation of data
[Figure: files of different sizes (3.4MB, 4.1MB, 1.1GB) each mapped to a vector of numbers such as 0.1912, 0.4123, ..., 0.9128.]

©Microsoft Corporation Azure
Find relevant objects with embeddings
Convert data into vectors (embeddings) and find the most similar objects according to a metric
[Diagram: data sources (images, audio, video, text, ...and more) are transformed using an embedding model into vector representations and stored in a vector index; a query from the App/UX is transformed using an embedding model, and the most similar vectors are returned as results. Azure Cognitive Search]

Choosing an Embedding Model
Key Factors for Selecting the Optimal Model for Your Use Case
Model Characteristics
• Task Specificity
• Performance
• Context Awareness
• Model Size and Inference Speed
• Language Support
• Customizability (ability to fine-tune)
Implementation Considerations
• Training Time and Complexity
• Pre-Trained Models
• Integration
• Community Support and Updates
• Cost
We recommend Azure OpenAI service “text-embedding-ada-002” for text embeddings
We recommend Azure AI Vision Image Retrieval API for image embeddings

Approximate Nearest Neighbor Search
Efficient and Scalable Similarity Search with AI
Definition
• A fast search method for finding approximate nearest neighbors in high-dimensional spaces
Purpose
• Applicable in image search, NLP, recommendation systems
Benefits
• Provide faster search results in high-dimensional data by trading off a small degree of recall for significant performance gains compared to exhaustive vector search
Vector indexes are data structures that let us perform approximate nearest neighbor search

Similarity Metrics
• Cosine similarity
• Dot Product
• Euclidean distance
• Angular
• Jaccard
• Many more!
https://platform.openai.com/docs/guides/embeddings/which-distance-function-should-i-use

Common Approximate Nearest Neighbor Algorithms
Exploring AI Techniques for Efficient Similarity Search
HNSW (Hierarchical Navigable Small World)
• Hierarchical graph structure
• Fast search performance
FLAT (Brute Force)
• Exhaustive search in high-dimensional data
• Slower but highly accurate
LSH (Locality-Sensitive Hashing)
• Hash-based similarity search
• Trade-off between speed and accuracy
IVF (Inverted File Index)
• Reduces search space using quantization
• Scalable and memory-efficient
https://github.com/erikbern/ann-benchmarks
When productionizing, what matters is less which algorithm you chose than the information retrieval system it is part of.

Limitations of common OSS ANN indexes
ANN indexes, on their own, are simply a data structure.
• Limited Scalability: Struggle with vertical and horizontal scaling in large-scale data sets
• Memory Constraints: High memory consumption affecting search performance and resource efficiency
• Persistence and Durability: Lack of built-in mechanisms for data storage, recovery, and metadata management
• Simplistic Query Support: Limited capabilities for combining sparse and dense retrieval methods
• Hosting Challenges: Complex setup and hosting requirements
• No Built-in Security Features: Open-source solutions often lack advanced security features
• Embedding Management: Limited support for managing the embedding functions themselves

The Dream Scenario for Vector Search
Effortless Data Management and Relevant Search Results
Dump a Bunch of Data → Run a Query → Get the Most Relevant Data Back
Common Challenges
- Scalability
- Preprocessing
- Splitting/Chunking
- Embedding management
- Query understanding
- Query flexibility
- Ranking accuracy
- Result Diversity
- Search algorithm

Scalability in Vector Search
Key Questions and Considerations for Efficient Scaling
• Data Volume: Can the system handle increasing amounts of data?
  • Storage capacity and management
  • Indexing and search performance
• Query Load: How well does the system respond to growing query demands?
  • Query execution speed and response times
  • Handling concurrent queries and user connections
• Distributed Infrastructure: Does the system support distributed and parallel processing?
  • Horizontal scaling across multiple nodes
  • Load balancing and fault tolerance
• Cost Efficiency: How does the system optimize resource usage and cost management?
  • Balancing performance and cost requirements
  • Efficient use of hardware and cloud resources

Preprocessing and Document Chunking
Optimizing Data Preparation for Efficient Vector Search
• Text Preprocessing: Ensuring clean and structured data for the embedding model
  • Tokenization (or segmentation): Breaking text into words, phrases, or symbols
  • Lowercasing and normalization: Standardizing text representation
  • Stopword removal: Eliminating common words with little semantic value
  • Stemming and lemmatization: Reducing words to their root forms
• Document Splitting: Adapting documents to fit within embedding model limits
  • Chunking: Dividing long documents into smaller, manageable sections
  • Passage extraction: Identifying and retaining meaningful segments
  • Overlap management: Ensuring continuity and context preservation
• Model Compatibility: Preparing data to align with the chosen embedding model
  • Input requirements: Adhering to model-specific formatting and length constraints
  • Vocabulary coverage: Maximizing the overlap between document vocabulary and model vocabulary
• Evaluation and Iteration: Continuously improving preprocessing and splitting strategies
  • Performance monitoring: Assessing the impact of preprocessing and splitting on search quality
  • Strategy refinement: Adjusting techniques based on observed results and user feedback

Challenge of Embedding Management
Overcoming Embedding Management in Vector Search
• Embedding Quality: Ensuring high-quality and accurate vector representations
  • Selecting appropriate embedding models (e.g., OpenAI, BERT)
  • Fine-tuning models for domain-specific vocabulary and context
• Dimensionality: Balancing embedding size and search performance
  • Reducing dimensions while retaining semantic information
  • Implementing dimensionality reduction techniques (e.g., PCA, t-SNE)
• Indexing and Storage: Efficiently managing and storing embeddings
  • Using optimized data structures for quick look-up and retrieval (e.g., Approximate Nearest Neighbors)
• Embedding Updates: Keeping vector representations up-to-date with evolving data
  • Incremental updates to embeddings based on new or updated documents
  • Periodic model retraining for continuous improvement and/or model version updating
• Evaluation and Iteration: Continuously assessing and refining embedding management strategies
  • Monitoring performance metrics (e.g., search relevance, recall, precision)
  • Adjusting techniques based on observed results and user feedback

Addressing the Query Language Challenge
Enhancing Vector Search Through Improved Query Understanding
• Beyond Similarity: Addressing complex search scenarios beyond "most similar documents"
  • Understanding user intent: Identifying specific search goals and requirements
• Query Flexibility: Supporting various search parameters and filters
  • Boolean operators: Handling AND, OR, and NOT conditions
  • Filtering and Faceting: Allowing users to filter results based on specific attributes
• Query Transformation: Converting user queries into vector representations
  • Text-to-vector conversion: Transforming query text into compatible embeddings
  • Query expansion: Incorporating additional keywords or phrases to improve search relevance
• Evaluation and Iteration: Continuously refining query language understanding
  • Monitoring query performance metrics (e.g., query success rate, user satisfaction)
  • Adjusting techniques based on observed results and user feedback

Enhancing Search Relevance in Vector Search
Achieving Accurate Ranking, Result Diversity, and Adaptability
• Ranking Accuracy: Ensuring highly relevant results are ranked at the top
  • Hyperparameter tuning: Adjust hyperparameters as needed to trade off recall against latency
  • Rank fusion (hybrid, re-ranker, HyDE): Combining multiple ranking signals for improved accuracy
• Result Diversity: Balancing the variety and relevance of search results
  • Diversification strategies: Introducing variety while maintaining relevance
  • Document-level vs. chunk-level search: Considering the impact of chunking long documents
    • More focused and relevant results from individual chunks (good or bad? -> depends on task)
    • Top results may all belong to the same document, reducing result diversity (good or bad? -> depends on task)
• Search Algorithm Adaptability: Customizing search behavior based on the task at hand
  • Task-oriented search: Adjusting search algorithms for specific tasks or user requirements
• Evaluation and Iteration: Continuously refining search relevance strategies
  • Monitoring search performance metrics (e.g., precision, recall, user satisfaction)
  • Adjusting techniques based on observed results and user feedback

How do I deal with these common challenges?

Vector Search in Azure AI Search

Introducing vector search in Azure AI Search
Revolutionize indexing and retrieval augmented generation for LLM Apps
Data types: Images, Audio, Video, Graphs, Text
• Leverage data from any data store
• Improve relevancy
• Query across multiple types of data
• Quickly search through large data sets
• Deploy with enterprise-grade security
• Easily scale with changing workloads
• Build retrieval plugins for OpenAI's ChatGPT using Azure OpenAI service

What is vector search?
Convert data into vector representations where distances represent similarity
[Diagram: data sources (images, audio, video, text, ...and more) are transformed into embeddings and stored as vector representations; a query from the App/UX is transformed into an embedding, and approximate nearest neighbor search returns the closest vectors as results. Azure AI Search]

Retrieval Modes
Full-text search (BM25)
• Exact keyword match
• Proximity search
• Term weighting
Pure Vector search (ANN)
• Semantic similarity search
• Multi-modal search
• Multi-lingual search
Hybrid search (BM25 + ANN)
Vector search is good, but Hybrid search is even better!

Why is Hybrid Search important? Hybrid Queries with BM25 and
ANN Search Integration • Hybrid search allows you to take advantage of multiple scoring algorithms such as BM25 and ANN vector similarity so you can get the benefits of both keyword search and semantic search

Reciprocal Rank Fusion in Azure AI Search
Hybrid Queries with BM25 and ANN Search Integration
• Reciprocal Rank Fusion (RRF): A technique for combining the results of multiple search strategies, resulting in improved search relevance and ranking
• Azure Cognitive Search incorporates RRF by merging the ranked results of BM25 and ANN search, allowing the best features of both methods to contribute to the final search relevance
https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf
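The RRF idea can be sketched in a few lines: each document receives 1 / (k + rank) from every ranked list it appears in, and the sums decide the fused order. This follows the formulation in the paper linked above, with k = 60 as used there; the constant the service uses internally is not stated in this deck, and the document ids below are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

keyword_results = ["d1", "d2", "d3"]   # e.g. a BM25 ranking
vector_results = ["d3", "d1", "d4"]    # e.g. an ANN ranking
fused = reciprocal_rank_fusion([keyword_results, vector_results])
print(fused)  # ['d1', 'd3', 'd2', 'd4']
```

Because RRF uses only ranks, it needs no calibration between BM25 scores and vector similarity scores, which live on different scales.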

Cost and Pricing
Harness the Power of Vector Search at No Additional Cost
No additional cost for using Vector search! You only pay for the storage of the vectors.
Note: Cognitive Search does not generate embeddings out of the box; therefore, you are responsible for the cost of generating your embeddings.

Scale: OpenAI Text-Embedding-Ada-002 Example
Choose the Right Tier to Meet Your Vector Index Size Needs
Tier     Vector index size (GB/partition)   Max index size for a service (GB)
Basic    1                                  1
S1       3                                  36
S2       12                                 144
S3       36                                 432
L1       12                                 144
L2       36                                 432
For more information on how to calculate Vector index size, please visit Vector index size limit - Azure Cognitive Search | Microsoft Learn
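A rough back-of-envelope estimate for the table above can be sketched as follows. It assumes each vector component is stored as a 4-byte float (consistent with the Collection(Edm.Single) field type shown later in this deck), treats 1 GB as 1024^3 bytes, and takes the index overhead as a parameter you supply; the overhead value and the function names are assumptions, so use the linked documentation for real sizing.

```python
import math

BYTES_PER_FLOAT32 = 4
BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes

def raw_vector_bytes(num_vectors, dimensions):
    """Bytes needed for the raw float32 vector data alone."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

def vectors_that_fit(limit_gb, dimensions, overhead=0.0):
    """How many vectors fit in limit_gb, given a fractional index overhead."""
    per_vector = dimensions * BYTES_PER_FLOAT32 * (1 + overhead)
    return int(limit_gb * BYTES_PER_GB / per_vector)

def partitions_needed(required_gb, per_partition_gb):
    """Number of partitions required to hold required_gb of vector index."""
    return math.ceil(required_gb / per_partition_gb)

# text-embedding-ada-002 produces 1536-dimensional vectors.
print(raw_vector_bytes(1, 1536))      # 6144 bytes per vector
print(vectors_that_fit(1, 1536))      # 174762 vectors in 1 GB with no overhead
print(partitions_needed(10, 3))       # 4 partitions at 3 GB per partition
```

For example, 10 GB of vector index on a tier with 3 GB per partition would need 4 partitions under these assumptions.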

How do I get started with vector search?
1. Ingest data sources: Gather your data sources
2. Generate document embeddings: Generate embeddings for your data using your own model
3. Create your vector configuration: Configure your algorithm, similarity function, and parameters
4. Add to your index: Insert your vectors into your search index as a collection of floats via the Push API or the Indexer via a Custom Embedding Skill
5. Generate query embedding: Generate an embedding for your query using the same model as your docs
6. Search using vectors: Search your index using a vector representation of your data

Estimating the Right SKU for Your Needs
Utilize a PoC Index to Calculate Your Production Requirements
✓ Perform a PoC: Index a representative sample of your production workload using the desired schema
✓ Sample to Production Ratio: Calculate the ratio between the sample index size and raw data source size to estimate the corresponding production index size and data source size
✓ Analyze and Adjust: Consider the estimated number of documents, index size, and data source size for your production workload, and add a buffer for future growth
✓ Choose the Right SKU: Use the estimated production index size and required number of partitions and replicas to determine the appropriate Azure Cognitive Search SKU
✓ Estimate Monthly Cost: Use the pricing calculator to estimate the SKU cost
Note: This estimation is for Azure Cognitive Search service costs only and does NOT include the cost of generating embeddings or other AI Enrichment features. For a more back-of-envelope calculation, see Vector index size limit - Azure Cognitive Search | Microsoft Learn
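The sample-to-production ratio step amounts to simple proportional scaling, sketched below. The function name, the example sizes, and the 20% growth buffer are illustrative assumptions, and the calculation presumes that index size scales roughly linearly with source size.

```python
def estimate_production_index_gb(sample_index_gb, sample_source_gb,
                                 production_source_gb, growth_buffer=0.2):
    """Scale the PoC index-to-source ratio up to production and add a growth buffer."""
    if sample_source_gb <= 0:
        raise ValueError("sample_source_gb must be positive")
    ratio = sample_index_gb / sample_source_gb  # index GB per GB of raw source data
    return production_source_gb * ratio * (1 + growth_buffer)

# Hypothetical PoC: 2 GB of raw data produced a 0.5 GB index;
# production holds 100 GB of raw data.
estimate = estimate_production_index_gb(0.5, 2.0, 100.0)
print(round(estimate, 2))  # 30.0
```

The resulting figure can then be compared against the per-partition and per-service limits of each tier to pick a SKU and partition count.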

Getting Data into Your Search Index
Comparing Push and Pull Approaches for Indexing
“Pull”
• Automated data ingestion using our Indexer
• Utilize custom skills to generate embeddings and process data during indexing
• Streamlined Indexing
“Push”
• Manual Data Ingestion giving you full control over the indexing process
• Quick and easy to get started
• High Flexibility

Deep Dive into HNSW Algorithm
"Performance-Optimized" Search with Hierarchical Navigable Small World
• Performance-Optimized: HNSW is designed to offer a high-performance, memory-efficient solution for approximate nearest neighbor search in high-dimensional spaces
• HNSW creates a multi-layer graph structure that enables fast search for nearest neighbors in high-dimensional data
• Customize HNSW's behavior by adjusting key parameters for optimal performance and accuracy
• Key Parameters:
  • "m": Controls the degree of the graph, affecting search speed and accuracy
  • "ef_construction": Influences the index construction time and quality
  • "ef_search": Determines the search time and accuracy trade-off
  • "metric": Specifies the distance function used, such as "cosine"
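As a sketch of where these parameters would live, the snippet below builds an HNSW algorithm entry as a Python dictionary and serializes it to JSON. The algorithm name matches the configuration slides later in this deck, but the "hnswParameters" block, the camelCase key names ("efConstruction", "efSearch"), and the numeric values are assumptions made for illustration; check the service's REST reference for the exact schema and allowed ranges.

```python
import json

# Hypothetical HNSW algorithm entry for the "algorithms" list of a
# vectorSearch configuration. Key names and values are illustrative.
hnsw_algorithm = {
    "name": "myHnsw",
    "kind": "hnsw",
    "hnswParameters": {
        "m": 4,                 # graph degree: more links, better recall, more memory
        "efConstruction": 400,  # candidate list size while building the graph
        "efSearch": 500,        # candidate list size at query time
        "metric": "cosine",     # distance function used to compare vectors
    },
}

config_json = json.dumps({"algorithms": [hnsw_algorithm]}, indent=2)
print(config_json)
```

Of these, the query-time candidate list size is the parameter the next slide suggests adjusting first, since it does not require rebuilding the index.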

Tuning HNSW Parameters for Optimal Performance
Striking the Right Balance between Recall, Latency, and Indexing
1. Increase 'ef_search' to improve recall without reindexing; monitor for potential latency increases.
2. If increasing 'ef_search' isn't effective or causes high latency, consider reindexing with higher values of 'm' and/or 'ef_construction'.
3. Enhance the quality of the HNSW graph by increasing 'ef_construction', keeping in mind it may result in longer indexing latency.
4. Carefully increase the 'm' value only if other parameters don't sufficiently improve recall after trying previous steps.

Introducing semantic ranker
Outperform vector search with hybrid search + Semantic re-ranking
Search Configuration         Customer datasets [NDCG@3]   Beir [NDCG@10]   Multilingual Academic (MIRACL) [NDCG@10]
Keyword                      40.6                         40.6             49.6
Vector (Ada-002)             43.8                         45.0             58.3
Hybrid (Keyword + Vector)    48.4                         48.4             58.8
Hybrid + Semantic ranker     60.1                         50.0             72.0
• SOTA re-ranking encoder model
• Highest performing retrieval mode
• Free 1000 queries/month
• Multilingual capabilities
• Includes extractive answers, captions, and highlights just like Bing.com

Classified as Microsoft Confidential
Vector Configuration w/Azure OpenAI service Vectorizer
I want to create a vector search configuration so that I can set the appropriate parameters for my search experience.
    "vectorSearch": {
        "algorithms": [
            { "name": "myHnsw", "kind": "hnsw" },
            { "name": "myExhaustiveKNN", "kind": "exhaustiveKnn" }
        ],
        "vectorizers": [
            {
                "name": "myAzureOpenAIVectorizer",
                "kind": "azureOpenAI",
                "azureOpenAIParameters": {
                    "resourceUri": "https://my-openai.openai.azure.com",
                    "apiKey": "xxx",
                    "deploymentId": "text-embedding-ada-002"
                }
            }
        ],
        "profiles": [
            {
                "name": "myHnswProfile",
                "algorithm": "myHnsw",
                "vectorizer": "myAzureOpenAIVectorizer"
            }
        ]
    }

Vector Configuration w/Custom Vectorizer
I want to create a vector search configuration so that I can set the appropriate parameters for my search experience.
    "vectorSearch": {
        "algorithms": [
            { "name": "myHnsw", "kind": "hnsw" },
            { "name": "myExhaustiveKNN", "kind": "exhaustiveKnn" }
        ],
        "vectorizers": [
            {
                "name": "myCustomVectorizer",
                "kind": "customWebApi",
                "customVectorizerParameters": {
                    "authIdentity": "user-assigned",
                    "httpHeaders": "application/json",
                    "httpMethod": "POST",
                    "uri": "https://my-custom-embedding-model.azure.com"
                }
            }
        ],
        "profiles": [
            {
                "name": "myHnswProfile",
                "algorithm": "myHnsw",
                "vectorizer": "myCustomVectorizer"
            }
        ]
    }

Configure Vector fields in your Index Definition
I want to create vector field types that will be supported in my nearest neighbor search. Set "retrievable" to true or false as needed.
    {
        "name": "contentVector",
        "type": "Collection(Edm.Single)",
        "dimensions": 1536,
        "vectorSearchProfile": "my-vector-profile",
        "searchable": true,
        "retrievable": true,
        "filterable": false,
        "sortable": false,
        "facetable": false
    }

Pure vector search (Exhaustive)
I want to exhaustively search all the vectors in my index to find the ground-truth values. Alternatively, you can use this with smaller index sizes.
    {
        "vectorQueries": [
            {
                "kind": "text",
                "text": "healthy foods",
                "fields": "vector",
                "exhaustive": true
            }
        ]
    }

Vector search with Filters
I want to use vectors with pre-filtering, so that I can limit the number of matched documents. I also want to use query vectorization.
    {
        "vectorQueries": [
            {
                "kind": "text",
                "text": "healthy foods",
                "fields": "vector"
            }
        ],
        "vectorFilterMode": "preFilter",
        "filter": "category eq 'fruits'"
    }

Pure Vector search w/Vectorizer
I want to use only the query vectorizer to run a vector search and rank my results by cosine similarity score, so that the results reflect the full intent of the user's query.
    {
        "vectorQueries": [
            {
                "kind": "text",
                "text": "healthy foods",
                "fields": "vector"
            }
        ]
    }

Pure Vector search w/Raw Vector
I want to pass in a raw query vector that I generated myself, run a vector search, and rank my results by cosine similarity score, so that the results reflect the full intent of the user's query.
    {
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": [1, 2, 3],
                "fields": "vector"
            }
        ]
    }

Cross-Field Vector Query
I want to use vectors and search over multiple fields so that I can leverage multiple vector fields in my similarity function.
    {
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": [1, 2, 3],
                "fields": "titleVector, contentVector"
            }
        ]
    }

Hybrid Search
I want to use hybrid search (text + vectors) so that I can leverage both vectors and keywords for my search relevance.
    {
        "vectorQueries": [
            {
                "kind": "text",
                "text": "healthy foods",
                "fields": "vector"
            }
        ],
        "search": "healthy foods"
    }

Multi-Vector Query
I want to use multi-vector queries to pass in two different query embeddings for my multi-modal search use case using CLIP.
    {
        "vectorQueries": [
            {
                "kind": "text",
                "text": "yummy vanilla ice cream",
                "fields": "textVector"
            },
            {
                "kind": "text",
                "text": "vanilla.png",
                "fields": "imageVector"
            }
        ]
    }

Hybrid Search with Semantic reranking
I want to use hybrid search (text + vectors) so that I can take advantage of vectors, keywords, and Semantic search capabilities such as captions and answers.
    {
        "vectorQueries": [
            {
                "kind": "text",
                "text": "healthy foods",
                "fields": "vector"
            }
        ],
        "search": "healthy foods",
        "semanticConfiguration": "config",
        "queryType": "semantic",
        "answers": "extractive",
        "captions": "extractive"
    }

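Generate document embeddings
docsEmbeddings.py
    import openai  # pre-1.0 openai package, which exposes openai.Embedding
    # generate document embeddings
    def generate_embeddings(text):
        # "engine" is the name of the embedding model deployment
        response = openai.Embedding.create(
            input=text, engine="text-embedding-ada-002")
        embeddings = response['data'][0]['embedding']
        return embeddings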

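Generate query embedding
queryEmbedding.py
    import openai  # pre-1.0 openai package, which exposes openai.Embedding
    # generate query embedding with the same model used for the documents
    response = openai.Embedding.create(
        input="healthy foods", engine="text-embedding-ada-002")
    embeddings = response['data'][0]['embedding']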

Thank you


More Decks by Daiki Kanemitsu
