
Azure AI Search Overview Deck_Startups_11292023

Daiki Kanemitsu
November 29, 2023


[Azure AI Search overview]
https://azure.microsoft.com/ja-jp/products/ai-services/ai-search
[Community for program participants, prospective participants, and other interested parties]
https://aka.ms/mfs_discord
[Program application page]
https://www.microsoft.com/ja-jp/startups


Transcript

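  1. Common uses of Azure AI Search

    Workplace Search: Help internal teams explore databases and files
    • Improve efficiency and productivity
    • Strengthen data access
    • Improve decision-making
    SaaS Search: Build market-ready, customer-facing applications
    • Improve the user experience
    • Shorten development time
    eCommerce: Help customers find and purchase products and services
    • Provide personalized recommendations
    • Improve the user experience
    • Enhance product discovery
    • Increase conversion rates
    Website Search: Help visitors find information quickly and easily
    • Improve findability
    • Better understand user behavior and needs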
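  2. Azure AI Search

    • Platform as a service
    • Semantic search
    • No management required
    • Keyword search
    • Faceting
    • Language analysis
    • Geospatial support
    • Suggestions/autocomplete
    • Customizable scoring
    • Proximity search
    • Synonyms
    • Cognitive skills
    • and more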
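  3. Azure AI Search

    • Feature-rich vector database
    • Ingest any data type, from any source
    • Seamless data and platform integration
    • State-of-the-art search ranking
    • Enterprise-ready foundation
    Release status labels on the slide, in order: Generally available, Public preview, Generally available
    Features: Vector search, Azure AI Search in Azure AI Studio, Semantic ranker, Integrated vectorization
    Uses in Generative AI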
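  4. Vector search strategies

    ANN search
    • Fast vector search at scale
    • Uses HNSW, a graph method with an excellent performance-recall profile
    • Fine-grained control over index parameters
    Exhaustive KNN search
    • Per query, or built into the schema
    • Useful for creating a recall baseline
    • Scenarios that use highly selective filters
    • Example: dense multi-tenant applications

        # RawVectorQuery is the query type in the preview azure-search-documents SDK used on this slide
        from azure.search.documents.models import RawVectorQuery

        # ANN (HNSW) query
        r = search_client.search(
            None,
            top=5,
            vector_queries=[RawVectorQuery(
                vector=search_vector, k=5, fields="embedding")])

        # Exhaustive KNN query
        r = search_client.search(
            None,
            top=5,
            vector_queries=[RawVectorQuery(
                vector=search_vector, k=5, fields="embedding",
                exhaustive=True)])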
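As a rough illustration of what an exhaustive KNN search computes (not how Azure AI Search implements it), the sketch below scores every stored vector against the query with cosine similarity and keeps the top k. The function names and the toy vectors are assumptions made for this example.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exhaustive_knn(query, vectors, k):
    """Return the ids of the k vectors most similar to the query.

    Every vector is scored, so the result is exact; the cost grows
    linearly with the number of stored vectors.
    """
    scored = ((cosine_similarity(query, vec), doc_id)
              for doc_id, vec in vectors.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


# Toy data (hypothetical): three 2-dimensional embeddings.
docs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
top2 = exhaustive_knn([1.0, 0.0], docs, k=2)
```

Because nothing is skipped, the output of a scan like this can serve as the ground truth when measuring the recall of an approximate index.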
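  5. Rich vector search query capabilities

    Filtered vector search
    • Supports date ranges, categories, geographic distance, and more
    • Rich filter expressions
    • Pre-/post-filtering
        - Pre-filter: well suited to selective filters; recall is not disrupted
        - Post-filter: suited to less selective filters, but take care that the results do not come back empty

        # RawVectorQuery and VectorFilterMode come from the preview azure-search-documents SDK used on this slide
        from azure.search.documents.models import RawVectorQuery, VectorFilterMode

        r = search_client.search(
            None,
            top=5,
            vector_queries=[RawVectorQuery(
                vector=query_vector, k=5, fields="embedding")],
            vector_filter_mode=VectorFilterMode.PRE_FILTER,
            filter="category eq 'perks' and created gt 2023-11-15T00:00:00Z")

    Multi-vector scenarios
    • Multiple vector fields per document
    • Multi-vector queries
    • Can be combined as needed

        r = search_client.search(
            None,
            top=5,
            vector_queries=[
                RawVectorQuery(vector=query1, k=5, fields="embedding"),
                RawVectorQuery(vector=query2, k=5, fields="embedding")
            ])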
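The difference between the two filter modes on this slide can be sketched in a few lines of plain Python. This is a conceptual model only, with hypothetical function names and toy documents; it ranks by dot product and does not reflect the service's internals.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, docs, k):
    """Rank documents by dot-product similarity and keep the best k."""
    return sorted(docs, key=lambda d: dot(query, d["embedding"]), reverse=True)[:k]


def pre_filter_search(query, docs, predicate, k):
    # Filter first, then rank: returns up to k documents that match the filter.
    return top_k(query, [d for d in docs if predicate(d)], k)


def post_filter_search(query, docs, predicate, k):
    # Rank first, then filter: matches that fall outside the top k are lost.
    return [d for d in top_k(query, docs, k) if predicate(d)]


def is_perk(doc):
    return doc["category"] == "perks"


docs = [
    {"id": 1, "category": "news", "embedding": [1.0, 0.0]},
    {"id": 2, "category": "news", "embedding": [0.9, 0.1]},
    {"id": 3, "category": "perks", "embedding": [0.1, 0.9]},
]
pre = pre_filter_search([1.0, 0.0], docs, is_perk, k=2)
post = post_filter_search([1.0, 0.0], docs, is_perk, k=2)
```

Here `pre` contains document 3, while `post` is empty because both top-2 hits are removed by the filter. That is the empty-result risk of post-filtering.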
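  6. Data preparation for RAG applications

    Chunking
    • Split long text into short passages
        - LLM context length limits
        - Focused subsets of the content
        - Multiple independent passages
    • Basics
        - Roughly 200 to 500 tokens per passage
        - Preserve lexical boundaries
        - Introduce overlap
    • Layout
        - Layout information is valuable, for example tables
    Vectorization
    • At indexing time: convert passages to vectors
    • At query time: convert the query to a vector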
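A minimal sketch of fixed-size chunking with overlap follows. It assumes whitespace-separated words as a stand-in for real tokens and ignores sentence and layout boundaries, which a production chunker would respect; `chunk_tokens` is a name chosen for this example.

```python
def chunk_tokens(tokens, chunk_size, overlap):
    """Split a token list into passages of at most chunk_size tokens.

    Consecutive passages share `overlap` tokens, so context that spans
    a boundary appears in both passages.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last passage already reaches the end of the text
    return chunks


# Whitespace splitting stands in for a real tokenizer here.
words = "one two three four five six seven eight nine ten".split()
passages = chunk_tokens(words, chunk_size=4, overlap=1)
```

With passages of 200 to 500 tokens, the same function applies unchanged; only `chunk_size` and `overlap` differ.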
  7. Azure AI Studio & Azure AI SDK

    • First-class integration
    • Build indexes from data in Blob storage, Microsoft Fabric, and other sources.
    • Attach to existing Azure AI Search indexes.
  8. Integrated vectorization: end-to-end data processing tailored to RAG

    Data source access
    • Blob Storage
    • ADLSv2
    • SQL DB
    • CosmosDB
    • …
    + Incremental change tracking
    File format parsing
    • PDFs
    • Office documents
    • JSON files
    • …
    + Image and text extraction, with OCR as needed
    Chunking
    • Split text into passages
    • Propagate document metadata
    Vectorization
    • Convert chunks to vectors
    • OpenAI embeddings or your custom model
    Indexing
    • Document index
    • Chunk index
    • Both
    In preview
  9. Relevance

    • Relevance matters for RAG apps.
    • Many passages in the prompt
        → quality degrades
        → you cannot focus on recall alone
    • Inaccurate passages in the prompt
        → the answer may be well grounded yet wrong
        → it helps to set a threshold for "good enough" grounding data
    [Figure: accuracy (50 to 75) versus number of documents in the input context (5 to 30)]
    Source: Lost in the Middle: How Language Models Use Long Contexts, Liu et al. arXiv:2307.03172
  10. Information retrieval relevance by method

    [Bar chart: Keyword, Vector (ada-002), Hybrid (Keyword + Vector), and Hybrid + Semantic ranker compared on Customer datasets [NDCG@3], Beir [NDCG@10], and Miracl [NDCG@10]; vertical axis from 0 to 80]
    Retrieval comparison using Azure AI Search in various retrieval modes on customer and academic benchmarks
    Source: Outperforming vector search with hybrid + reranking
  11. Impact of query type on relevance

    All values are NDCG@3.

    Query type                    Keyword   Vector   Hybrid   Hybrid + Semantic ranker
    Concept seeking queries       39        45.8     46.3     59.6
    Fact seeking queries          37.8      49       49.1     63.4
    Exact snippet search          51.1      41.5     51       60.8
    Web search-like queries       41.8      46.3     50       58.9
    Keyword queries               79.2      11.7     61       66.9
    Low query/doc term overlap    23        36.1     35.9     49.1
    Queries with misspellings     28.8      39.1     40.6     54.6
    Long queries                  42.7      41.6     48.1     59.4
    Medium queries                38.1      44.7     46.7     59.9
    Short queries                 53.1      38.8     53       63.9

    Source: Outperforming vector search with hybrid + reranking
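For readers unfamiliar with the metric, the sketch below computes NDCG@k from a list of graded relevance judgments given in ranked order. It uses the linear-gain form of DCG and takes the ideal ranking from the same list; the cited benchmark may use a different variant, so treat this as an assumption. The slides report the metric scaled to 0-100.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevances."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(relevances, k):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical judgments: higher numbers mean more relevant results.
perfect = ndcg_at_k([3, 2, 1, 0], k=3)   # already in the best order
swapped = ndcg_at_k([0, 2, 3, 1], k=3)   # the best result is ranked third
```

A ranking that already lists results from most to least relevant scores 1.0; pushing the best result down lowers the score.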
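  12. Enhancing RAG with advanced search capabilities: investing in state-of-the-art search technology to improve results

    R A G
    The quality of the retrieval system (the retriever) matters.
    Azure AI Search is committed to providing the best search solution through:
    - Vector search
    - Hybrid search
    - Advanced filtering
    - Document security
    - L2 re-ranking/optimization
    - Built-in chunking
    - Automatic vectorization
    - And many more features!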
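  13. Example: robust retrieval for RAG applications

    • The quality of the retrieved data determines the quality of the response
    • Keyword search has recall challenges
        - The "vocabulary gap"
        - Accuracy drops further with natural-language questions
    • Vector-based search finds documents by semantic similarity
        - Robust to variation in how a concept is expressed (word choice, morphology, specificity, and so on)
    Question: "I'm looking for lessons on underwater activities"
    Won’t match: "Scuba diving classes", "Snorkeling group sessions"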
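  14. Vectors

    Learned vector representations
    • A model encodes items into vectors
    • Similar items map to nearby vectors
    • Sentences, images, graphs, and so on
    Vector search
    • Given a "query" vector, find the K nearest vectors
    • Search exhaustively or through approximation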
  15. Vectors in Azure databases: vector databases in Azure

    Keep your data where it is: native vector search capabilities
    • Built into the Azure Cosmos DB MongoDB vCore and Azure Cosmos DB for PostgreSQL services
    Azure AI Search
    • Best relevance: highest-quality results
    • Automatically indexes data from Azure data sources: SQL DB, Cosmos DB, Blob storage, ADLSv2, and more
  16. Presentation resources

    Repos
    • Azure Cognitive Search as a vector database for OpenAI embeddings | OpenAI Cookbook
    • Azure Cognitive Search - 🦜🔗 LangChain 0.0.198
    • Azure Cognitive Search - LlamaIndex 🦙 0.8.46 (gpt-index.readthedocs.io)
    • Azure-Samples/azure-search-openai-demo (github.com)
    • Azure-Samples/chat-with-your-data-solution-accelerator: A Solution Accelerator for the RAG pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. This includes most common requirements and best practices. (github.com)
    • Azure-Samples/azure-search-comparison-tool: A demo app showcasing Vector Search using Azure Cognitive Search, Azure OpenAI for text embeddings, and Azure AI Vision for image embeddings. (github.com)
    Docs
    • Vector search - Azure Cognitive Search | Microsoft Learn
    • Azure/cognitive-search-vector-pr: Private repository for the Vector search feature in Azure Cognitive Search. (github.com)
    • Azure OpenAI Service - Documentation, quickstarts, API reference - Azure Cognitive Services | Microsoft Learn
    • Image Retrieval concepts - Image Analysis 4.0 - Azure Cognitive Services | Microsoft Learn
    • Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities - Microsoft Community Hub
  17. Let’s review the basics! Traditional information retrieval

    • Query: Formal statement representing your information need (e.g., search string)
    • Object: Entity within your content collection (e.g., document, image, audio)
    • Relevance: Quantitative measure of how well an object satisfies the intent of the query
    • Ranking: Ordered list of relevant results based on their desirability or relevance score
  18. Let’s review the basics! Search via Inverted Indexes

    Document 1: “apple orange banana”
    Document 2: “orange apple grape”
    Document 3: “banana grape apple”

    Dictionary          Postings Lists
    Term      Freq      Documents
    apple     3         1, 2, 3
    orange    2         1, 2
    banana    2         1, 3
    grape     2         2, 3
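The dictionary and postings lists on this slide can be reproduced with a short sketch. The function names are chosen for this example, and tokenization is plain whitespace splitting, which is far simpler than a real language analyzer.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_all_terms(index, query):
    """Boolean AND: ids of the documents that contain every query term."""
    postings = [set(index.get(term, [])) for term in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []


documents = {
    1: "apple orange banana",
    2: "orange apple grape",
    3: "banana grape apple",
}
index = build_inverted_index(documents)
both = search_all_terms(index, "orange grape")
```

Looking up a term is a dictionary access, and a multi-term query reduces to intersecting postings lists, which is why the structure scales well for keyword search.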
  19. Let’s review the basics! Ranking & Relevance in Traditional Search

    • Relevance via Boolean Search: Retrieve documents containing specific terms (e.g., "apples" AND "oranges")
    • Ranking via BM25: A ranking algorithm influenced by 3 key factors:
        - Term Frequency (TF): More occurrences of the search term indicate higher relevance
        - Inverse Document Frequency (IDF): The rarer a term across documents, the more important it is
        - Field Length: Terms found in shorter fields (fewer words) are more likely to be relevant than terms in longer fields (more words)
    [Figure: BM25 formula]
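The three factors can be seen at work in a small scoring sketch. It uses one common formulation of Okapi BM25 (the smoothed IDF found in Lucene-style engines, with the widely used defaults k1 = 1.2 and b = 0.75); the formula shown on the original slide was not captured in the transcript, so this variant is an assumption.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: the number of documents each term occurs in.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rarer terms get a larger IDF weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized.
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = ["apple orange banana", "orange apple grape", "banana grape apple kiwi kiwi kiwi"]
orange_scores = bm25_scores("orange", docs)
apple_scores = bm25_scores("apple", docs)
```

In this toy corpus, "apple" appears once in every document, so the only difference in `apple_scores` comes from field length: the six-word document scores lower than the three-word ones.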
  20. Combining Lexical and Semantic Representations for Optimal Recall: Leveraging the Strengths of Lexical and Semantic Approaches in Retrieval

    • Discrete (Lexical) Representations
        - Advantages: exact matching; precise control and easy explainability
        - Limitations: struggle to capture nuances in language; limited understanding of conceptual similarity
    • Dense (Vector/Semantic) Representations
        - Advantages: capture conceptual similarity; better understanding of language nuances
        - Limitations: not built to match exact terms; reduced explainability compared to discrete
    Bottom Line: Achieve optimal recall by leveraging the strengths of both discrete and semantic representations for a comprehensive understanding of language
  21. Search + AI: Better Together

    [Diagram: Search and AI]
    • Better information in the context
    • Better representations of data (embeddings)
  22. Vector Search at a High Level

    • Scenario: A diverse collection of books, each containing unique insights and knowledge
    • The Challenge: Finding a book on a specific topic or theme can be time-consuming and overwhelming, especially when the content is scattered.
    • The Solution: A skilled librarian can quickly connect you to books with similar topics or themes
  23. Vector Search: Deeper Dive

    • Scenario: Organizations need efficient methods to retrieve semantically similar items from large-scale data sources.
    • The Challenge: Sifting through large-scale databases to find related items can be resource-intensive and time-consuming.
    • The Solution: By calculating similarity metrics like cosine similarity, vector search organizes data and retrieves semantically similar items within the high-dimensional space.
  24. Embeddings: Convert Data into Vector Representation

    Simplifying complex data structures for efficient analysis and processing in various applications.
    Definition
    • Abstract, dense, compact, learned numerical representations of data
    • Map complex structures into simpler, fixed-size vectors
    • Applicable to diverse data types (text, images, audio, etc.)
    Purpose
    • Facilitate analysis and processing of diverse data types
    • Enable similarity measurement, clustering, and classification
    • Power applications like vector search and recommendation systems
    Benefits
    • Efficient search and organization of vast datasets
    • Improved accuracy and relevance of search results
    • Scalable and adaptable to various industries and use cases
  25. Why do I need embeddings? Vectors are a universal representation of data

    [Figure: items of 3.4MB, 4.1MB, and 1.1GB shown alongside a vector: 0.1912, 0.4123, ..., 0.9128]
  26. ©Microsoft Corporation Azure
      Find relevant objects with embeddings

    Convert data into vectors (embeddings) and find the most similar objects according to a metric
    [Diagram: data sources (images, audio, video, text, ...and more) and the App/UX query are each transformed using an embedding model into vector representations; the vector index in Azure Cognitive Search returns results. Example vectors: (-2, -1, 0, 1), (2, 3, 4, 5), (6, 7, 8, 9); result: (2, 3, 4, 5)]
  27. Choosing an Embedding Model: Key Factors for Selecting the Optimal Model for Your Use Case

    Model Characteristics
    • Task Specificity
    • Performance
    • Context Awareness
    • Model Size and Inference Speed
    • Language Support
    • Customizability (ability to fine-tune)
    Implementation Considerations
    • Training Time and Complexity
    • Pre-Trained Models
    • Integration
    • Community Support and Updates
    • Cost
    We recommend Azure OpenAI service “text-embedding-ada-002” for text embeddings
    We recommend Azure AI Vision Image Retrieval API for image embeddings
  28. Approximate Nearest Neighbor Search: Efficient and Scalable Similarity Search with AI

    Definition
    • A fast search method for finding approximate nearest neighbors in high-dimensional spaces
    Purpose
    • Applicable in image search, NLP, recommendation systems
    Benefits
    • Provide faster search results in high-dimensional data by trading off a small degree of recall for significant performance gains compared to exhaustive vector search
    Vector indexes are data structures that let us perform approximate nearest neighbor search
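The recall given up by an approximate search can be measured by comparing its results with those of an exhaustive search over the same data. The helper below is a minimal sketch of that comparison; the document ids are made up for illustration.

```python
def recall_at_k(approximate_ids, exact_ids, k):
    """Fraction of the true top-k neighbours that the approximate search returned."""
    truth = set(exact_ids[:k])
    found = truth.intersection(approximate_ids[:k])
    return len(found) / len(truth)


# Hypothetical results for one query.
exact = ["d7", "d2", "d9", "d4", "d1"]    # from an exhaustive scan
approx = ["d7", "d9", "d2", "d5", "d1"]   # from an ANN index
recall = recall_at_k(approx, exact, k=5)
```

Averaging this value over a representative set of queries gives a recall figure to weigh against the latency gain of the approximate index.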
  29. Similarity Metrics

    • Cosine similarity
    • Dot Product
    • Euclidean distance
    • Angular
    • Jaccard
    • Many more!
    https://platform.openai.com/docs/guides/embeddings/which-distance-function-should-i-use
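The first three metrics in this list are short enough to write out directly. The sketch below uses plain Python lists and a toy pair of vectors chosen so the results are easy to check by hand.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


u, v = [3.0, 4.0], [4.0, 3.0]
dot_uv = dot_product(u, v)          # 3*4 + 4*3
cos_uv = cosine_similarity(u, v)    # 24 / (5 * 5)
dist_uv = euclidean_distance(u, v)  # sqrt(1 + 1)
```

For unit-length vectors, cosine similarity equals the dot product, so the two metrics rank results identically when embeddings are normalized.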
  30. Common Approximate Nearest Neighbor Algorithms: Exploring AI Techniques for Efficient Similarity Search

    HNSW (Hierarchical Navigable Small World)
    • Hierarchical graph structure
    • Fast search performance
    FLAT (Brute Force)
    • Exhaustive search in high-dimensional data
    • Slower but highly accurate
    LSH (Locality-Sensitive Hashing)
    • Hash-based similarity search
    • Trade-off between speed and accuracy
    IVF (Inverted File Index)
    • Reduces search space using quantization
    • Scalable and memory-efficient
    https://github.com/erikbern/ann-benchmarks
    When productionizing, what matters is not which algorithm you chose but the information retrieval system it is part of.
  31. Limitations of common OSS ANN indexes

    ANN indexes on their own are simply data structures.
    • Limited Scalability: Struggle with vertical and horizontal scaling in large-scale data sets
    • Memory Constraints: High memory consumption affecting search performance and resource efficiency
    • Persistence and Durability: Lack of built-in mechanisms for data storage, recovery, and metadata management
    • Simplistic Query Support: Limited capabilities for combining sparse and dense retrieval methods
    • Hosting Challenges: Complex setup and hosting requirements
    • No Built-in Security Features: Open-source solutions often lack advanced security features
    • Embedding Management: Limited support for managing the embedding functions themselves
  32. The Dream Scenario for Vector Search: Effortless Data Management and Relevant Search Results

    Dump a Bunch of Data → Run a Query → Get the Most Relevant Data Back
    Common Challenges
    - Scalability
    - Preprocessing
    - Splitting/Chunking
    - Embedding management
    - Query understanding
    - Query flexibility
    - Ranking accuracy
    - Result Diversity
    - Search algorithm
  33. Scalability in Vector Search: Key Questions and Considerations for Efficient Scaling

    • Data Volume: Can the system handle increasing amounts of data?
        - Storage capacity and management
        - Indexing and search performance
    • Query Load: How well does the system respond to growing query demands?
        - Query execution speed and response times
        - Handling concurrent queries and user connections
    • Distributed Infrastructure: Does the system support distributed and parallel processing?
        - Horizontal scaling across multiple nodes
        - Load balancing and fault tolerance
    • Cost Efficiency: How does the system optimize resource usage and cost management?
        - Balancing performance and cost requirements
        - Efficient use of hardware and cloud resources
  34. Preprocessing and Document Chunking: Optimizing Data Preparation for Efficient Vector Search

    • Text Preprocessing: Ensuring clean and structured data for the embedding model
        - Tokenization (or segmentation): Breaking text into words, phrases, or symbols
        - Lowercasing and normalization: Standardizing text representation
        - Stopword removal: Eliminating common words with little semantic value
        - Stemming and lemmatization: Reducing words to their root forms
    • Document Splitting: Adapting documents to fit within embedding model limits
        - Chunking: Dividing long documents into smaller, manageable sections
        - Passage extraction: Identifying and retaining meaningful segments
        - Overlap management: Ensuring continuity and context preservation
    • Model Compatibility: Preparing data to align with the chosen embedding model
        - Input requirements: Adhering to model-specific formatting and length constraints
        - Vocabulary coverage: Maximizing the overlap between document vocabulary and model vocabulary
    • Evaluation and Iteration: Continuously improving preprocessing and splitting strategies
        - Performance monitoring: Assessing the impact of preprocessing and splitting on search quality
        - Strategy refinement: Adjusting techniques based on observed results and user feedback
  35. Challenge of Embedding Management: Overcoming Embedding Management in Vector Search

    • Embedding Quality: Ensuring high-quality and accurate vector representations
        - Selecting appropriate embedding models (e.g., OpenAI, BERT)
        - Fine-tuning models for domain-specific vocabulary and context
    • Dimensionality: Balancing embedding size and search performance
        - Reducing dimensions while retaining semantic information
        - Implementing dimensionality reduction techniques (e.g., PCA, t-SNE)
    • Indexing and Storage: Efficiently managing and storing embeddings
        - Using optimized data structures for quick look-up and retrieval (e.g., Approximate Nearest Neighbors)
    • Embedding Updates: Keeping vector representations up-to-date with evolving data
        - Incremental updates to embeddings based on new or updated documents
        - Periodic model retraining for continuous improvement and/or model version updating
    • Evaluation and Iteration: Continuously assessing and refining embedding management strategies
        - Monitoring performance metrics (e.g., search relevance, recall, precision)
        - Adjusting techniques based on observed results and user feedback
  36. Addressing the Query Language Challenge: Enhancing Vector Search Through Improved Query Understanding

    • Beyond Similarity: Addressing complex search scenarios beyond "most similar documents"
        - Understanding user intent: Identifying specific search goals and requirements
    • Query Flexibility: Supporting various search parameters and filters
        - Boolean operators: Handling AND, OR, and NOT conditions
        - Filtering and Faceting: Allowing users to filter results based on specific attributes
    • Query Transformation: Converting user queries into vector representations
        - Text-to-vector conversion: Transforming query text into compatible embeddings
        - Query expansion: Incorporating additional keywords or phrases to improve search relevance
    • Evaluation and Iteration: Continuously refining query language understanding
        - Monitoring query performance metrics (e.g., query success rate, user satisfaction)
        - Adjusting techniques based on observed results and user feedback
  37. Enhancing Search Relevance in Vector Search: Achieving Accurate Ranking, Result Diversity, and Adaptability

    • Ranking Accuracy: Ensuring highly relevant results are ranked at the top
        - Hyperparameter tuning: Adjust hyperparameters as needed to trade off recall against latency
        - Rank fusion (hybrid, re-ranker, HyDE): Combining multiple ranking signals for improved accuracy
    • Result Diversity: Balancing the variety and relevance of search results
        - Diversification strategies: Introducing variety while maintaining relevance
        - Document-level vs. chunk-level search: Considering the impact of chunking long documents
            - More focused and relevant results from individual chunks (good or bad? depends on the task)
            - Top results may all belong to the same document, reducing result diversity (good or bad? depends on the task)
    • Search Algorithm Adaptability: Customizing search behavior based on the task at hand
        - Task-oriented search: Adjusting search algorithms for specific tasks or user requirements
    • Evaluation and Iteration: Continuously refining search relevance strategies
        - Monitoring search performance metrics (e.g., precision, recall, user satisfaction)
        - Adjusting techniques based on observed results and user feedback
  38. Introducing vector search in Azure AI Search: Revolutionize indexing and retrieval augmented generation for LLM Apps

    Data types: Images, Audio, Video, Graphs, Text
    • Leverage data from any data store
    • Improve relevancy
    • Query across multiple types of data
    • Quickly search through large data sets
    • Deploy with enterprise-grade security
    • Easily scale with changing workloads
    • Build retrieval plugins for OpenAI's ChatGPT using Azure OpenAI service
  39. What is vector search?

    Convert data into vector representations where distances represent similarity
    [Diagram: data sources (images, audio, video, text, ...and more) and the App/UX query are each transformed into embeddings; an approximate nearest neighbor search in Azure AI Search over the vector representations returns results. Example vectors: (-2, -1, 0, 1), (2, 3, 4, 5), (6, 7, 8, 9); result: (2, 3, 4, 5)]
  40. Retrieval Modes

    Modes compared: Full-text search (BM25), Pure Vector search (ANN), Hybrid search (BM25 + ANN)
    Capabilities compared:
    • Exact keyword match
    • Proximity search
    • Term weighting
    • Semantic similarity search
    • Multi-modal search
    • Multi-lingual search
    Vector search is good, but Hybrid search is even better!
  41. Why is Hybrid Search important? Hybrid Queries with BM25 and ANN Search Integration

    • Hybrid search lets you combine multiple scoring algorithms, such as BM25 and ANN vector similarity, so you get the benefits of both keyword search and semantic search
  42. Reciprocal Rank Fusion in Azure AI Search: Hybrid Queries with BM25 and ANN Search Integration

    • Reciprocal Rank Fusion (RRF): A technique for combining the results of multiple search strategies, resulting in improved search relevance and ranking
    • Azure Cognitive Search incorporates RRF by merging the ranked results of BM25 and ANN search, allowing the best features of both methods to contribute to the final search relevance
    https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf
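The fusion step itself is small. The sketch below follows the formula from the linked paper, where each document earns 1 / (k + rank) from every ranked list it appears in and k = 60 is the constant the paper uses. How Azure's service applies RRF internally is not shown here; the document ids and function name are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in
    (ranks start at 1); documents are returned by descending total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


keyword_ranking = ["d1", "d2", "d3"]   # e.g. from BM25
vector_ranking = ["d3", "d1", "d4"]    # e.g. from ANN search
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Because only ranks are used, the BM25 and vector similarity scores never need to be put on a common scale. Documents that rank well in both lists (`d1`, `d3`) rise to the top.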
  43. Cost and Pricing: Harness the Power of Vector Search at No Additional Cost

    No additional cost for using Vector search! You only pay for the storage of the vectors.
    Note: Cognitive Search does not generate embeddings out of the box, so you are responsible for the cost of generating your embeddings.
  44. Scale: OpenAI Text-Embedding-Ada-002 Example. Choose the Right Tier to Meet Your Vector Index Size Needs

    Tier     Vector index size (GB/partition)    Max index size for a service (GB)
    Basic    1                                   1
    S1       3                                   36
    S2       12                                  144
    S3       36                                  432
    L1       12                                  144
    L2       36                                  432

    For more information on how to calculate Vector index size, please visit Vector index size limit - Azure Cognitive Search | Microsoft Learn
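A back-of-envelope reading of the table: text-embedding-ada-002 produces 1536-dimensional vectors, and stored as 32-bit floats each vector takes 1536 × 4 = 6144 bytes. The sketch below turns the per-partition quota into an upper bound on vector count. It assumes 1 GB = 1024³ bytes and counts raw vector data only, so index overhead (for example the HNSW graph) will lower the real figure; see the linked documentation for the service's own calculation.

```python
BYTES_PER_FLOAT32 = 4
ADA_002_DIMENSIONS = 1536  # output size of text-embedding-ada-002

# Vector index size per partition (GB), copied from the tier table.
VECTOR_GB_PER_PARTITION = {"Basic": 1, "S1": 3, "S2": 12, "S3": 36, "L1": 12, "L2": 36}


def max_vectors(tier, partitions=1, dimensions=ADA_002_DIMENSIONS):
    """Upper bound on stored vectors, counting raw float32 data only."""
    quota_bytes = VECTOR_GB_PER_PARTITION[tier] * partitions * 1024 ** 3
    return quota_bytes // (dimensions * BYTES_PER_FLOAT32)


basic_capacity = max_vectors("Basic")                 # 1 GB quota
s1_full_capacity = max_vectors("S1", partitions=12)   # 3 GB x 12 = 36 GB quota
```

Under these assumptions, a Basic service holds at most about 175 thousand ada-002 vectors, and a fully scaled S1 service about 6.3 million.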
  45. How do I get started with vector search?

    • Ingest data sources: Gather your data sources
    • Generate document embeddings: Generate embeddings for your data using your own model
    • Add to your index: Insert your vectors into your search index as a collection of floats via the Push API or the Indexer via a Custom Embedding Skill
    • Create your vector configuration: Configure your algorithm, similarity function, and parameters
    • Generate query embedding: Generate an embedding for your query using the same model as your docs
    • Search using vectors: Search your index using a vector representation of your data
  46. Estimating the Right SKU for Your Needs: Utilize a PoC Index to Calculate Your Production Requirements

    ✓ Perform a PoC
        - Index a representative sample of your production workload using the desired schema
    ✓ Sample to Production Ratio
        - Calculate the ratio between the sample index size and raw data source size to estimate the corresponding production index size and data source size
    ✓ Analyze and Adjust
        - Consider the estimated number of documents, index size, and data source size for your production workload, and add a buffer for future growth
    ✓ Choose the Right SKU
        - Use the estimated production index size and required number of partitions and replicas to determine the appropriate Azure Cognitive Search SKU
    ✓ Estimate Monthly Cost
        - Use the pricing calculator to estimate the SKU cost
    Note: This estimation is for Azure Cognitive Search service costs only and does NOT include the cost of generating embeddings or other AI Enrichment features. For a more back-of-envelope calculation, see Vector index size limit - Azure Cognitive Search | Microsoft Learn
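The ratio-based estimate described on this slide is simple arithmetic. The sketch below works through it with made-up numbers: the PoC sizes, the 20% growth buffer, and the 12 GB per-partition quota are all hypothetical and should be replaced with measured values and the real quota of the tier under consideration.

```python
import math


def estimate_production_index_gb(sample_index_gb, sample_source_gb,
                                 production_source_gb, growth_buffer=0.2):
    """Scale the PoC index-to-source ratio up to the production data size."""
    ratio = sample_index_gb / sample_source_gb
    return production_source_gb * ratio * (1 + growth_buffer)


def partitions_needed(index_gb, gb_per_partition):
    """Smallest whole number of partitions that can hold the index."""
    return math.ceil(index_gb / gb_per_partition)


# Hypothetical PoC: 2 GB of source data produced a 0.5 GB index.
estimated_gb = estimate_production_index_gb(0.5, 2.0, production_source_gb=100.0)
partitions = partitions_needed(estimated_gb, gb_per_partition=12)
```

With these inputs the index-to-source ratio is 0.25, so 100 GB of production data maps to roughly 30 GB of index after the buffer, which needs three partitions at 12 GB each.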
  47. Getting Data into Your Search Index: Comparing Push and Pull Approaches for Indexing

    “Pull”
    • Automated data ingestion using our Indexer
    • Utilize custom skills to generate embeddings and process data during indexing
    • Streamlined Indexing
    “Push”
    • Manual Data Ingestion giving you full control over the indexing process
    • Quick and easy to get started
    • High Flexibility
  48. Deep Dive into HNSW Algorithm: "Performance-Optimized" Search with Hierarchical Navigable Small World

    • Performance-Optimized: HNSW is designed to offer a high-performance, memory-efficient solution for approximate nearest neighbor search in high-dimensional spaces
    • HNSW creates a multi-layer graph structure that enables fast search for nearest neighbors in high-dimensional data
    • Customize HNSW's behavior by adjusting key parameters for optimal performance and accuracy
    • Key Parameters:
        - "m": Controls the degree of the graph, affecting search speed and accuracy
        - "ef_construction": Influences the index construction time and quality
        - "ef_search": Determines the search time and accuracy trade-off
        - "metric": Specifies the distance function used, such as "cosine"
  49. Tuning HNSW Parameters for Optimal Performance: Striking the Right Balance between Recall, Latency, and Indexing

    1. Increase 'ef_search' to improve recall without reindexing; monitor for potential latency increases.
    2. If increasing 'ef_search' isn't effective or causes high latency, consider reindexing with higher values of 'm' and/or 'ef_construction'.
    3. Enhance the quality of the HNSW graph by increasing 'ef_construction', keeping in mind it may result in longer indexing latency.
    4. Carefully increase the 'm' value only if other parameters don't sufficiently improve recall after trying previous steps.
  50. Introducing semantic ranker: Outperform vector search with hybrid search + Semantic re-ranking

    Search Configuration          Customer datasets [NDCG@3]   Beir [NDCG@10]   Multilingual Academic (MIRACL) [NDCG@10]
    Keyword                       40.6                         40.6             49.6
    Vector (Ada-002)              43.8                         45.0             58.3
    Hybrid (Keyword + Vector)     48.4                         48.4             58.8
    Hybrid + Semantic ranker      60.1                         50.0             72.0

    • SOTA re-ranking encoder model
    • Highest performing retrieval mode
    • Free 1000 queries/month
    • Multilingual capabilities
    • Includes extractive answers, captions, and highlights just like Bing.com
  51. Classified as Microsoft Confidential
      Vector Configuration w/Azure OpenAI service Vectorizer

    I want to create a vector search configuration so that I can set the appropriate parameters for my search experience.

        "vectorSearch": {
            "algorithms": [
                { "name": "myHnsw", "kind": "hnsw" },
                { "name": "myExhaustiveKNN", "kind": "exhaustiveKnn" }
            ],
            "vectorizers": [
                {
                    "name": "myAzureOpenAIVectorizer",
                    "kind": "azureOpenAI",
                    "azureOpenAIParameters": {
                        "resourceUri": "https://my-openai.openai.azure.com",
                        "apiKey": "xxx",
                        "deploymentId": "text-embedding-ada-002"
                    }
                }
            ],
            "profiles": [
                {
                    "name": "myHnswProfile",
                    "algorithm": "myHnsw",
                    "vectorizer": "myAzureOpenAIVectorizer"
                }
            ]
        },
  52. Vector Configuration w/Custom Vectorizer

    I want to create a vector search configuration so that I can set the appropriate parameters for my search experience.

        "vectorSearch": {
            "algorithms": [
                { "name": "myHnsw", "kind": "hnsw" },
                { "name": "myExhaustiveKNN", "kind": "exhaustiveKnn" }
            ],
            "vectorizers": [
                {
                    "name": "myCustomVectorizer",
                    "kind": "customWebApi",
                    "customVectorizerParameters": {
                        "authIdentity": "user-assigned",
                        "httpHeaders": "application/json",
                        "httpMethod": "POST",
                        "uri": "https://my-custom-embedding-model.azure.com"
                    }
                }
            ],
            "profiles": [
                {
                    "name": "myHnswProfile",
                    "algorithm": "myHnsw",
                    "vectorizer": "myCustomVectorizer"
                }
            ]
        },
  53. Configure Vector fields in your Index Definition

    I want to create vector field types that will be supported in my nearest neighbor search.

        {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "dimensions": 1536,
            "vectorSearchProfile": "my-vector-profile",
            "searchable": true,
            "retrievable": true | false,
            "filterable": false,
            "sortable": false,
            "facetable": false
        }
  54. Pure vector search (Exhaustive)

    I want to exhaustively search all the vectors in my index to find the ground-truth values. Alternatively, you can use this with smaller index sizes.

        {
            "vectorQueries": [
                {
                    "kind": "text",
                    "text": "healthy foods",
                    "fields": "vector",
                    "exhaustive": true
                }
            ]
        }
  55. Vector search with Filters

    I want to use vectors with pre-filtering, so that I can limit the number of matched documents. I also want to use query vectorization.

        {
            "vectorQueries": [
                {
                    "kind": "text",
                    "text": "healthy foods",
                    "fields": "vector"
                }
            ],
            "vectorFilterMode": "preFilter",
            "filter": "category eq 'fruits'"
        }
  56. Pure Vector search w/Vectorizer

    I want to use only the query vectorizer, run a vector search, and rank my results by cosine similarity score, so that the results reflect the full intent of the user's query.

        {
            "vectorQueries": [
                {
                    "kind": "text",
                    "text": "healthy foods",
                    "fields": "vector"
                }
            ]
        }
  57. Pure Vector search w/Raw Vector

    I want to supply my own query vector, run a vector search, and rank my results by cosine similarity score, so that the results reflect the full intent of the user's query.

        {
            "vectorQueries": [
                {
                    "kind": "vector",
                    "vector": [1, 2, 3],
                    "fields": "vector"
                }
            ]
        }
  58. Cross-Field Vector Query

    I want to use vectors and search over multiple fields so that my similarity function can draw on multiple vector fields.

        {
            "vectorQueries": [
                {
                    "kind": "vector",
                    "vector": [1, 2, 3],
                    "fields": "titlevector, contentVector"
                }
            ]
        }
  59. Hybrid Search

    I want to use hybrid search (text + vectors) so that I can leverage both vectors and keywords for my search relevance.

        {
            "vectorQueries": [
                {
                    "kind": "text",
                    "text": "healthy foods",
                    "fields": "vector"
                }
            ],
            "search": "healthy foods"
        }
  60. Multi-Vector Query

    I want to use multi-vector queries to pass in two different query embeddings for my multi-modal search use case using CLIP.

        {
            "vectorQueries": [
                {
                    "kind": "text",
                    "text": "yummy vanilla ice cream",
                    "fields": "textVector"
                },
                {
                    "kind": "text",
                    "text": "vanilla.png",
                    "fields": "imageVector"
                }
            ]
        }
  61. Hybrid Search with Semantic reranking

    I want to use hybrid search (text + vectors) so that I can take advantage of vectors, keywords, and Semantic search capabilities such as captions and answers.

        {
            "vectorQueries": [
                {
                    "kind": "text",
                    "text": "healthy foods",
                    "fields": "vector"
                }
            ],
            "search": "healthy foods",
            "semanticConfiguration": "config",
            "queryType": "semantic",
            "answers": "extractive",
            "captions": "extractive"
        }
  62. Generate document embeddings

    docsEmbeddings.py

        import openai  # openai.Embedding is the interface of openai package versions before 1.0

        # generate document embeddings
        def generate_embeddings(text):
            response = openai.Embedding.create(
                input=text, engine="text-embedding-ada-002")
            embeddings = response['data'][0]['embedding']
            return embeddings
  63. Generate query embedding

    queryEmbedding.py

        import openai  # openai.Embedding is the interface of openai package versions before 1.0

        # generate query embedding
        response = openai.Embedding.create(
            input="healthy foods", engine="text-embedding-ada-002")
        embeddings = response['data'][0]['embedding']