
The Theory behind Vector DB

Yusuke Matsui
December 26, 2023

Retrieval Augmented Generation (RAG) is a popular method for adding information to Large Language Models (LLMs), and Vector DB is the core technology used for the retrieval part of RAG. In this presentation, I will focus on the mathematical background of the approximate nearest neighbor search algorithms that are central components of Vector DB. Specifically, I will provide details on graph-based search, the most widely used family of algorithms.

The 1st Japan-Korea Workshop on Artificial Intelligence, Dec. 27, 2023 @UTokyo

Yusuke Matsui (The University of Tokyo) https://yusukematsui.me/


Transcript

  1. The Theory behind Vector DB. Yusuke Matsui, The University of Tokyo. The 1st Japan-Korea Workshop on Artificial Intelligence, Dec. 27, 2023 @UTokyo. Slides: bit.ly/41BTrrS
  2. Yusuke Matsui, Lecturer (Assistant Professor), the University of Tokyo, Japan. http://yusukematsui.me @utokyo_bunny @matsui528

    ✓ Computer vision ✓ Large-scale indexing ✓ Data structure + Machine learning

    ARM impl. of Faiss [Matsui+, ICASSP 22]; Nearest neighbor search [Ono & Matsui, ACMMM 23]; ML-enhanced Bloom Filter [Sato & Matsui, NeurIPS 23]
  3. ➢ Introduction ✓ Demo: Embedding + Search + LLM ➢ The theory behind Vector DB ➢ Discussion
  5. Ask ChatGPT 3.5 (trained in 2021): "Who won curling gold at the 2022 Winter Olympics?" Answer: “I'm sorry, but as an AI language model, I don't have information about the future events.” We want to add knowledge to the LLM.

    Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT
  6. Embedding + Search + LLM (RAG; Retrieval Augmented Generation). The same question is posed to ChatGPT 3.5 (trained in 2021).
  7. Prepare knowledge texts: “Damir Sharipzyanov\n\n=Career…”, “Lviv bid for the 2022 Winter…”, …, “Chinami Yoshida\n\n==Personal…”
  8. A text encoder maps the first text to a vector $\mathbf{x}_1$.
  9. The second text is encoded to $\mathbf{x}_2$.
  10. All $N$ texts are encoded: $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N$.
  11. The question is passed through the text encoder as well (example vector on the slide: 0.23 3.15 0.65 1.43).
  12. Search: $\operatorname{argmin}_{n \in \{1, \dots, N\}} \|\mathbf{q} - \mathbf{x}_n\|_2^2$, where $\mathbf{q}$ is the encoded question (example vectors on the slide: 0.23 3.15 0.65 1.43 and 0.20 3.25 0.72 1.68). Result: “List of 2022 Winter Olympics medal winners…”
  13. Update the question with the result: “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…”. Pass the updated question to ChatGPT 3.5 (trained in 2021).
  14. Answer: “Niklas Edin, Oskar Eriksson, …” ☺
  15. RAG is the simplest way to supplement an LLM's knowledge.
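The argmin search step in the slides above can be sketched as an exact linear scan over all database vectors. This is only a minimal illustration under stated assumptions, not the code behind the demo: the names `squared_l2` and `nearest_neighbor` are hypothetical, the first database row and the query reuse the two example vectors printed on the slides (their roles are assigned arbitrarily here), and the other rows are made up.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_neighbor(query, database):
    """Exact search: return the index n that minimizes ||q - x_n||^2."""
    return min(range(len(database)), key=lambda n: squared_l2(query, database[n]))


# First row: example vector from the slides. Other rows: made-up values.
database = [
    [0.23, 3.15, 0.65, 1.43],
    [1.00, 0.00, 2.00, 0.50],
    [3.00, 3.00, 0.10, 0.20],
]
query = [0.20, 3.25, 0.72, 1.68]  # second example vector from the slides

best = nearest_neighbor(query, database)
print(best, squared_l2(query, database[best]))
```

The scan touches every vector once, so its cost grows linearly with the number of vectors; approximate methods trade some accuracy to avoid that.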
  16. DEMO1: OpenAI Playground (fully managed service). Turn on “Retrieval” and upload Olympic information. ☺ The answer refers to the uploaded information.
  18. DEMO2: Run on your PC (pseudo code) https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb

        # Embed knowledge
        texts = [
            "Chinami Yoshida\n\n==Persona...",
            "Lviv bid for the 2022 Winter...",
            ...
        ]
        features = encode(texts)

        # Embed a question
        question = "Who won curling gold at the 2022 Winter Olympics?"
        query_feature = encode(question)

        # Search
        ids = search(query_feature, features)

        # Add results to the question
        question += "Use the below articles"
        for id in ids:
            question += texts[id]

        # Ask LLM
        result = LLM.ask(question)
  19. DEMO2: Run on your PC (pseudo code), same code with annotations. How to prepare these?

    Encoder: ➢ Any encoder is fine ➢ Classic methods (e.g., BM25) are also fine ➢ Modern methods are also fine (but you need GPUs) ➢ OpenAI API is also fine (but you need to pay)

    Search: on either CPUs or GPUs (Accuracy vs Runtime vs Memory vs Money)

    LLM: ➢ You need GPUs to run it on your local machine. You need to prepare an executable model such as LLaMa ➢ Or, you can use OpenAI API. No need to prepare GPUs, but you need to pay
  20. DEMO2: Run on your PC (pseudo code), same code and annotations, plus:

    ➢ “Embedding + Search” is traditional software engineering-ish ➢ API call or not ✓ Easy, but costly ✓ Reproducibility for research?
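The pseudo code above can be turned into a small runnable sketch. Everything below is an assumption made for illustration: `encode` is a toy bag-of-words counter rather than a neural text encoder, `search` is an exact scan with cosine similarity, the three texts are stand-ins rather than the cookbook data, and the final LLM call is left out because it depends on which model or API is used.

```python
import math
import re
from collections import Counter


def encode(text):
    """Toy encoder (an assumption): bag-of-words counts, not a neural embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_feature, features, topk=1):
    """Exact search: rank every text by similarity to the query."""
    ranked = sorted(range(len(features)),
                    key=lambda i: cosine(query_feature, features[i]),
                    reverse=True)
    return ranked[:topk]


def build_prompt(question, texts, ids):
    """Append the retrieved texts to the question."""
    prompt = question + "\nUse the articles below:\n"
    for i in ids:
        prompt += texts[i] + "\n"
    return prompt


# Stand-in knowledge texts (hypothetical, not the cookbook data).
texts = [
    "Stand-in article about a curler",
    "Stand-in article about a bid for the 2022 Winter Olympics",
    "Stand-in list of 2022 Winter Olympics medal winners with curling gold",
]
features = [encode(t) for t in texts]

question = "Who won curling gold at the 2022 Winter Olympics?"
ids = search(encode(question), features)
prompt = build_prompt(question, texts, ids)
print(prompt)  # this string would then be sent to an LLM
```

Swapping `encode` for a real embedding model and `search` for an approximate nearest neighbor index changes the quality and speed, but not the shape of the pipeline.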
  21. ➢ Introduction ✓ Demo: Embedding + Search + LLM ➢ The theory behind Vector DB ➢ Discussion
  22. Recap of Embedding + Search + LLM (RAG; Retrieval Augmented Generation): encode the texts into $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N$ and the question into $\mathbf{q}$, search by $\operatorname{argmin}_{n \in \{1, \dots, N\}} \|\mathbf{q} - \mathbf{x}_n\|_2^2$, update the question with the result, and ask ChatGPT 3.5 (trained in 2021). RAG is the simplest way to supplement an LLM's knowledge.
  23. The search step: ➢ Approx. nearest neighbor search! ➢ Accuracy vs. runtime vs. memory ➢ Vector DB??
  24. Three levels of technology

    Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers. Examples: Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011]; Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019]; ScaNN (4-bit PQ) [Guo+, ICML 2020]

    Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc. Examples: faiss, NMSLIB, hnswlib, ScaNN

    Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc.) ➢ Usually, by companies. Examples: Milvus, Pinecone, Qdrant, Weaviate, Vertex AI Matching Engine, Vald, jina
  25. Three levels of technology: one library may implement multiple algorithms (highlighted: PQ/IVFPQ and HNSW, both in faiss). ☹ “I benchmarked faiss” ☺ “I benchmarked PQ in faiss”
  26. Three levels of technology: one algorithm may be implemented in multiple libraries (highlighted: HNSW in faiss, NMSLIB, hnswlib).
  27. Three levels of technology: often, one library = one algorithm (highlighted: ScaNN (4-bit PQ) [Guo+, ICML 2020] and the ScaNN library).
  28. Three levels of technology: one service may use some libraries … or re-implement algorithms from scratch (e.g., in Go). Highlighted: Weaviate, Milvus, faiss, hnswlib, HNSW.
  29. Three levels of technology: this talk mainly focuses on algorithms.
  30. Map of methods by database size $N$, from million-scale ($10^6$) to billion-scale ($10^9$):

    ➢ Locality Sensitive Hashing (LSH) ➢ Tree / Space Partitioning ➢ Graph traversal ➢ Inverted index + data compression

    Space partition: ➢ k-means ➢ PQ/OPQ ➢ Graph traversal ➢ etc…

    Data compression: ➢ Raw data ➢ Scalar quantization ➢ PQ/OPQ ➢ etc…

    Look-up-based (e.g., 0.34 0.22 0.68 0.71 → ID: 2, ID: 123): linear scan by asymmetric distance. Hamming-based (e.g., 0.34 0.22 0.68 0.71 → 0 1 0 0): linear scan by Hamming distance.

    For raw data: Acc. ☺, Memory: ☹. For compressed data: Acc. ☹, Memory: ☺
  31. Same map, with graph traversal marked: De facto standard! Will explain
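To make the "Hamming-based" branch above concrete, here is a minimal sketch of compressing vectors to binary codes and scanning them linearly by Hamming distance. The thresholding rule in `binarize`, the threshold value, and all data other than the first vector are assumptions for illustration; the slide does not say how its bits are produced.

```python
def binarize(vec, threshold=0.5):
    """Compress a vector to one bit per coordinate (1 if above the threshold).

    The thresholding rule is an assumption; real systems learn the mapping.
    """
    code = 0
    for value in vec:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def linear_scan(query_code, codes):
    """Return the index of the code closest to the query in Hamming distance."""
    return min(range(len(codes)), key=lambda i: hamming(query_code, codes[i]))


database = [
    [0.34, 0.22, 0.68, 0.71],  # example vector from the slide
    [0.90, 0.10, 0.20, 0.80],  # made-up
    [0.10, 0.90, 0.90, 0.10],  # made-up
]
codes = [binarize(v) for v in database]  # 4 bits per vector instead of 4 floats
query = [0.30, 0.20, 0.70, 0.60]         # made-up query

nearest = linear_scan(binarize(query), codes)
print(codes, nearest)
```

Each vector shrinks from four floats to four bits, which is the memory gain; distinct vectors can collapse to the same code, which is the accuracy loss.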
  32. Graph search ➢ De facto standard if all data can be loaded into memory ➢ Fast and accurate for real-world data ➢ Traverse the graph towards the query ➢ Seems intuitive, but not so easy to understand ➢ Review the algorithm carefully. Images are from [Malkov+, Information Systems, 2013]
  33. Search. Images are from [Malkov+, Information Systems, 2013]. Figure: a graph whose nodes are named A to N for explanation, and a candidate list (size = 3) with the entries closest to the query first.

    ➢ Given a query vector
  34. Search ➢ Given a query vector ➢ Start from an entry point (e.g., M)
  35. Search ➢ Given a query vector ➢ Start from an entry point (e.g., M). Record the distance to q. Candidates: M 23.1
  36. Search. Candidates: M 23.1
  37. Search, 1st iteration. Candidates: M 23.1
  38. ➢ Pick up the unchecked best candidate (M)
  39. ➢ Pick up the unchecked best candidate (M). Check it.
  40. ➢ Find the connected points.
  41. ➢ Record the distances to q.
  42. Recorded distances: J 11.1, N 15.3, K 19.4. Candidates: J 11.1, N 15.3, K 19.4, M 23.1
  43. Candidates: J 11.1, N 15.3, K 19.4, M 23.1
  44. ➢ Maintain the candidates (size = 3)
  45. M 23.1 is dropped. Candidates: J 11.1, N 15.3, K 19.4
  46. Candidates: J 11.1, N 15.3, K 19.4
  47. Search, 2nd iteration. Candidates: J 11.1, N 15.3, K 19.4
  48. ➢ Pick up the unchecked best candidate (J)
  49. ➢ Pick up the unchecked best candidate (J). Check it.
  50. ➢ Find the connected points.
  51. ➢ Record the distances to q (e.g., 13.2, 9.7). One connected point is already visited.
  52. Recorded distances: B 2.3, G 3.5, I 9.7, F 10.2, L 13.2. List on the slide: J 11.1, N 15.3, B 2.3, G 3.5, I 9.7, F 10.2, L 13.2
  53. List on the slide: J 11.1, N 15.3, B 2.3, G 3.5, I 9.7, F 10.2, L 13.2
  54. ➢ Maintain the candidates (size = 3)
  55. Candidates: B 2.3, G 3.5, I 9.7
  56. Candidates: B 2.3, G 3.5, I 9.7
  57. Search, 3rd iteration. Candidates: B 2.3, G 3.5, I 9.7
  58. ➢ Pick up the unchecked best candidate (B)
  59. ➢ Pick up the unchecked best candidate (B). Check it.
  60. ➢ Find the connected points.
  61. ➢ Record the distances to q (e.g., 0.5, 2.1).
  62. Recorded distances: C 0.5, D 2.1, A 3.6. Existing candidates: B 2.3, G 3.5, I 9.7
  63. Recorded distances: C 0.5, D 2.1, A 3.6. Existing candidates: B 2.3, G 3.5, I 9.7
  64. ➢ Maintain the candidates (size = 3)
  65. Candidates: C 0.5, D 2.1, B 2.3
  66. Candidates: C 0.5, D 2.1, B 2.3
  67. Search, 4th iteration. Candidates: C 0.5, D 2.1, B 2.3
  68. ➢ Pick up the unchecked best candidate (C).
  69. ➢ Pick up the unchecked best candidate (C). Check it.
  70. ➢ Find the connected points.
  71. ➢ Record the distances to q. All four connected points are already visited.
  72. ➢ Maintain the candidates (size = 3). Candidates: C 0.5, D 2.1, B 2.3
  73. Candidates: C 0.5, D 2.1, B 2.3
  74. Slides: bit.ly/41BTrrS 75 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 5th iteration
  75. Slides: bit.ly/41BTrrS 76 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). D Best Best
  76. Slides: bit.ly/41BTrrS 77 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. D Best Best check!
  77. Slides: bit.ly/41BTrrS 78 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. D Best Best check!
  78. Slides: bit.ly/41BTrrS 79 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best check! Already visited Already visited
  79. Slides: bit.ly/41BTrrS 80 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best check! Already visited Already visited H 3.9
  80. Slides: bit.ly/41BTrrS 81 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best H 3.9
  81. Slides: bit.ly/41BTrrS 82 Search Images are from [Malkov+, Information Systems,

    2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) D Best H 3.9
  82. Slides: bit.ly/41BTrrS 83 Search Images are from [Malkov+, Information Systems,

    2013] [Figure: graph with nodes A to N; D is marked “Best”] Candidates (size = 3), closest to the query first: C 0.5, D 2.1, B 2.3 ➢ Pick up the unchecked best candidate (D). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size = 3): H 3.9 is farther than all three current candidates, so it is dropped.
  83. Slides: bit.ly/41BTrrS 84 Search Images are from [Malkov+, Information Systems,

    2013] [Figure: graph with nodes A to N] Candidates (size = 3), closest to the query first: C 0.5, D 2.1, B 2.3 ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query.
  84. Slides: bit.ly/41BTrrS 85 Search Images are from [Malkov+, Information Systems,

    2013] [Figure: graph with nodes A to N] Candidates (size = 3), closest to the query first: C 0.5, D 2.1, B 2.3 ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query. Final output 1: Candidates ➢ You can pick up the top-k results
  85. Slides: bit.ly/41BTrrS 86 Search Images are from [Malkov+, Information Systems,

    2013] [Figure: graph with nodes A to N] Candidates (size = 3), closest to the query first: C 0.5, D 2.1, B 2.3 ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query. Final output 1: Candidates ➢ You can pick up the top-k results Final output 2: Checked items ➢ i.e., the search path
  86. Slides: bit.ly/41BTrrS 87 Search Images are from [Malkov+, Information Systems,

    2013] [Figure: graph with nodes A to N] Candidates (size = 3), closest to the query first: C 0.5, D 2.1, B 2.3 ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query. Final output 1: Candidates ➢ You can pick up the top-k results Final output 2: Checked items ➢ i.e., the search path Final output 3: Visit flag ➢ For each item, visited or not
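The search steps walked through on the slides above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular library: the function name `beam_search`, the dictionary-based graph, and the toy data (five points on a line) are all assumptions made for the example. It returns the three outputs named on the last slide: the candidates, the checked items (search path), and the visit flags.

```python
import math

def beam_search(graph, vectors, query, start, ef=3):
    """Beam search on a graph, following the steps on the slides.

    graph:   dict mapping a node to the list of its connected nodes
    vectors: dict mapping a node to its coordinates
    ef:      candidate size (the slides use 3)
    """
    def dist(node):
        return math.dist(vectors[node], query)

    candidates = [(dist(start), start)]  # output 1: kept sorted, closest first
    checked = []                         # output 2: search path
    visited = {start}                    # output 3: visit flags

    while True:
        # Pick up the unchecked best candidate.
        unchecked = [n for _, n in candidates if n not in checked]
        if not unchecked:
            break                        # all candidates are checked: finish
        best = unchecked[0]
        checked.append(best)             # check it
        # Find the connected points and record their distances to the query.
        for neighbor in graph[best]:
            if neighbor in visited:
                continue                 # already visited
            visited.add(neighbor)
            candidates.append((dist(neighbor), neighbor))
        # Maintain the candidates (size = ef).
        candidates.sort()
        del candidates[ef:]

    return candidates, checked, visited


# Hypothetical toy data: five points on a line, connected as a chain.
vectors = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (3, 0), "E": (4, 0)}
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
candidates, path, visited = beam_search(graph, vectors, query=(3.2, 0), start="A", ef=2)
print([node for _, node in candidates])  # top-k results, closest first
print(path)                              # search path
```

The list-based candidate set keeps the sketch close to the slides; a real implementation would typically use priority queues and an array of visit flags.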
  87. Slides: bit.ly/41BTrrS 88 Observation: runtime ➢ Item comparison takes time;

    𝑂(𝐷) (e.g., comparing 𝒒 ∈ ℝ^𝐷 with 𝒙₁₃ ∈ ℝ^𝐷) ➢ The overall runtime ∼ #item_comparison ∼ length_of_search_path * average_outdegree [Figure: a search path of three steps from start toward the query, with outdegree = 1, 2, and 2; distance labels 2.1, 1.9, 2.4] Example: #item_comparison = 3 * (1 + 2 + 2)/3 = 5
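The arithmetic on this slide can be reproduced directly. The snippet below is only a back-of-the-envelope sketch: the helper name `estimate_comparisons` and the dimensionality `D = 1000` are assumptions for illustration, while the outdegrees (1, 2, 2) are the ones shown on the slide.

```python
def estimate_comparisons(outdegrees):
    """#item_comparison ~ length_of_search_path * average_outdegree."""
    path_length = len(outdegrees)
    average_outdegree = sum(outdegrees) / path_length
    return path_length * average_outdegree

outdegrees = [1, 2, 2]            # outdegree of each node on the search path
n_comparisons = estimate_comparisons(outdegrees)

D = 1000                          # hypothetical dimensionality
# Each item comparison costs O(D), so the total work scales with this product.
work = n_comparisons * D
print(n_comparisons, work)
```

The product is simply the sum of the outdegrees along the path, which is why both a shorter path and a sparser graph reduce the runtime.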
  88. Slides: bit.ly/41BTrrS 89 A D C B query Observation: candidate

    size E start Candidates (size = 1) C A D C B query E start Candidates (size = 3) C D E size = 1: Greedy search size > 1: Beam search ➢ Larger candidate size, better but slower results ➢ Online parameter to control the trade-off ➢ Called “ef” in HNSW Fast. But stuck in a local minimum Slow. But find a better solution
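The trade-off between greedy search and beam search can be seen on a tiny example. The sketch below is an assumption-laden toy, not the graph from the slide: the four nodes S, X, Y, Z, their coordinates, and the function name `graph_search` are all hypothetical. Node X is a dead end that looks good locally, so a candidate size of 1 stops there, while a candidate size of 3 keeps Y alive and reaches Z at the cost of one more distance computation.

```python
import math

def graph_search(graph, vectors, query, start, ef):
    """Return (best node, number of distance computations) for candidate size ef."""
    n_dist = 0

    def dist(node):
        nonlocal n_dist
        n_dist += 1
        return math.dist(vectors[node], query)

    candidates = [(dist(start), start)]
    checked, visited = set(), {start}
    while True:
        unchecked = [n for _, n in candidates if n not in checked]
        if not unchecked:
            break
        best = unchecked[0]
        checked.add(best)
        for neighbor in graph[best]:
            if neighbor not in visited:
                visited.add(neighbor)
                candidates.append((dist(neighbor), neighbor))
        candidates.sort()
        del candidates[ef:]               # keep only the ef closest candidates
    return candidates[0][1], n_dist


# Hypothetical toy graph: X is a local minimum, Z is the true nearest neighbor.
toy_vectors = {"S": (5, 0), "X": (2, 0), "Y": (3, 3), "Z": (0, 1)}
toy_graph = {"S": ["X", "Y"], "X": ["S"], "Y": ["S", "Z"], "Z": ["Y"]}
origin = (0, 0)

greedy_best, greedy_cost = graph_search(toy_graph, toy_vectors, origin, "S", ef=1)
beam_best, beam_cost = graph_search(toy_graph, toy_vectors, origin, "S", ef=3)
print(greedy_best, greedy_cost)   # greedy search (size = 1)
print(beam_best, beam_cost)       # beam search (size = 3)
```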
  89. Slides: bit.ly/41BTrrS 90 Graph search algorithms Images are from an

    excellent survey paper [Wang+, VLDB 2021] ➢ Lots of algorithms ➢ Ours also! Ono & Matsui, “Relative NN-Descent”, ACMMM 23 ➢ The basic structure is the same: (1) designing a good graph + (2) beam search
  90. Slides: bit.ly/41BTrrS 91 Just NN? Vector DB? ➢ Vector DB

    companies say “Vector DB is cool” ➢ My own idea: (1) Try the simplest numpy-only search. (2) Slow? Try a fast algorithm such as HNSW in faiss. (3) Try Vector DB. If speed is the only concern, just use libraries. ➢ Which vector DB? ➡ No conclusions! ✓ https://weaviate.io/blog/vector-library-vs-vector-database ✓ https://codelabs.milvus.io/vector-database-101-what-is-a-vector-database/index#2 ✓ https://zilliz.com/learn/what-is-vector-database
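As a reference point for the first step of this advice, a numpy-only exact search fits in a few lines. This is a minimal sketch: the function name `brute_force_search`, the random data, and the sizes (1000 vectors, 64 dimensions) are assumptions chosen for illustration.

```python
import numpy as np

def brute_force_search(X, q, k):
    """Exact k-nearest-neighbor search by scanning all database vectors.

    X: (N, D) database vectors, q: (D,) query, k: number of results.
    Returns the ids of the k closest vectors and their squared L2 distances.
    """
    d2 = ((X - q) ** 2).sum(axis=1)   # squared L2 distance to every vector
    ids = np.argsort(d2)[:k]          # k smallest, closest first
    return ids, d2[ids]

# Hypothetical data: 1000 random 64-dimensional vectors.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 64)).astype(np.float32)
q = X[42]                             # query with a database vector itself
ids, d2 = brute_force_search(X, q, k=5)
print(ids, d2)
```

The cost is one pass over all N vectors per query, so this only stops being practical when N or the query rate grows large; that is the point at which an approximate index becomes worth trying.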
  91. Slides: bit.ly/41BTrrS 92 ➢ Introduction ✓ Demo: Embedding + Search

    + LLM ➢ The theory behind Vector DB ➢ Discussion
  92. Slides: bit.ly/41BTrrS 93 Recent trends ➢ “Embedding + Search” is

    too naïve? ✓ “A similar sentence to the question” might not be useful at all. ➢ Smarter search? ✓ Classic technologies have been revisited! ✓ E.g., Query expansion. Recall [Chum+, ICCV 2007]: they applied query expansion, which was a classic IR technique, to image retrieval. Revisit repeats! ➢ Directions ✓ Solve in a machine-learning way? E.g., training a special embedding space. ✓ Solve by data structures? E.g., a data structure with special operations.
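To make the idea of query expansion concrete for embedding search, here is a simplified sketch in the style of average query expansion: search once, average the query with its top results, and search again with the averaged vector. It is an illustration on plain vectors only and not the full method of [Chum+, ICCV 2007]; the function names, the random data, and the parameters `k_expand` and `k_final` are assumptions.

```python
import numpy as np

def search(X, q, k):
    """Exact search: ids of the k database vectors closest to q."""
    d2 = ((X - q) ** 2).sum(axis=1)
    return np.argsort(d2)[:k]

def average_query_expansion(X, q, k_expand=5, k_final=10):
    """Search, average the query with its top results, then search again."""
    first = search(X, q, k_expand)
    q_expanded = (q + X[first].sum(axis=0)) / (1 + k_expand)
    return search(X, q_expanded, k_final), q_expanded

# Hypothetical data: 200 random 16-dimensional embeddings and a random query.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))
q = rng.standard_normal(16)
ids, q_expanded = average_query_expansion(X, q)
print(ids)
```

Whether the expanded query helps depends on the quality of the first-round results; if they are off-topic, averaging pulls the query in the wrong direction.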
  95. Slides: bit.ly/41BTrrS 96 From the perspective of Nearest Neighbor Search

    ➢ CPU? GPU? Managed services? ✓ Trade-off! ✓ NVIDIA is pushing GPU search (RAFT & CAGRA [Ootomo+, arXiv 23]) ✓ Can we spend money on search (rather than on training a model)? ➢ Next paradigm ✓ Graph-based search is mature. ✓ Do we need a breakthrough idea? ➢ Not trivial: Dimensionality (𝐷) is now 1000 (not 100) ✓ If 𝐷 > 100, it’s often said that we must reduce it to 𝐷 = 100 with PCA. ✓ Because of LLM APIs, we need to handle 𝐷 = 1000 anyway.
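The PCA step mentioned above can be sketched with numpy alone. This is a minimal illustration under assumed sizes (300 random vectors, 𝐷 = 1000 reduced to 100); the helper names `fit_pca` and `apply_pca` are hypothetical, and the same projection has to be applied to both the database vectors and the queries.

```python
import numpy as np

def fit_pca(X, d_out):
    """Learn a PCA projection from the rows of X (N, D) down to d_out dimensions."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:d_out]

def apply_pca(X, mean, components):
    """Project vectors (one per row, or a single vector) onto the components."""
    return (X - mean) @ components.T

# Hypothetical data: 300 vectors with D = 1000, reduced to D = 100.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 1000))
mean, components = fit_pca(X, d_out=100)
X_reduced = apply_pca(X, mean, components)
q_reduced = apply_pca(rng.standard_normal(1000), mean, components)
print(X_reduced.shape, q_reduced.shape)
```

Note that `d_out` cannot exceed min(N, D) here, and that distances in the reduced space only approximate the original ones.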
  96. Slides: bit.ly/41BTrrS 97 Use LLM as a building block ➢

    The era of using LLMs as functions in pseudo code has already arrived. ➢ Should we run something locally or call an API? This is a choice we have never had to make before. ➢ How to think about “Reproducibility” in the LLM and API era ➢ We need to be able to roughly estimate the possible solutions to a problem, their cost, and what will be obtained. Options: On-premises (own server): ☺ Free to use; downsides: initial cost, maintenance. Cloud (e.g., AWS EC2): ☺ Good balance; downside: depends on the cloud. API call (new paradigm): ☺ Super cheap; downsides: not good at large-scale processing, communication cost?
  97. Slides: bit.ly/41BTrrS 98 Reference ➢ Survey for NN search: CVPR

    20 Tutorial ➢ SOTA graph-based search: CVPR 23 Tutorial ➢ Good tutorial for RAG [Asai+, ACL 2023 Tutorial]
  98. Slides: bit.ly/41BTrrS 99 Reference

    ◼ [Jégou+, TPAMI 2011] H. Jégou+, “Product Quantization for Nearest Neighbor Search”, IEEE TPAMI 2011
    ◼ [Guo+, ICML 2020] R. Guo+, “Accelerating Large-Scale Inference with Anisotropic Vector Quantization”, ICML 2020
    ◼ [Malkov+, TPAMI 2019] Y. Malkov+, “Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs”, IEEE TPAMI 2019
    ◼ [Malkov+, IS 13] Y. Malkov+, “Approximate Nearest Neighbor Algorithm based on Navigable Small World Graphs”, Information Systems 2013
    ◼ [Fu+, VLDB 19] C. Fu+, “Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graphs”, VLDB 2019
    ◼ [Wang+, VLDB 21] M. Wang+, “A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search”, VLDB 2021
    ◼ [Iwasaki+, arXiv 18] M. Iwasaki and D. Miyazaki, “Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data”, arXiv 2018
    ◼ [Ootomo+, arXiv 23] H. Ootomo+, “CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs”, arXiv 2023
    ◼ [Chum+, ICCV 07] O. Chum+, “Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval”, ICCV 2007
    ◼ [Pinecone] https://www.pinecone.io/
    ◼ [Milvus] https://milvus.io/
    ◼ [Qdrant] https://qdrant.tech/
    ◼ [Weaviate] https://weaviate.io/
    ◼ [Vertex AI Matching Engine] https://cloud.google.com/vertex-ai/docs/matching-engine
    ◼ [Vald] https://vald.vdaas.org/
    ◼ [Modal] https://modal.com/