Slide 1

Slide 1 text

Slides: bit.ly/41BTrrS 1 The Theory behind Vector DB Yusuke Matsui The University of Tokyo The 1st Japan-Korea Workshop on Artificial Intelligence, Dec. 27, 2023 @UTokyo

Slide 2

Slide 2 text

Slides: bit.ly/41BTrrS 2 Yusuke Matsui ✓ Computer vision ✓ Large-scale indexing ✓ Data structure + Machine learning http://yusukematsui.me Lecturer (Assistant Professor), the University of Tokyo, Japan @utokyo_bunny ARM impl. of Faiss [Matsui+, ICASSP 22] Nearest neighbor search [Ono & Matsui, ACMMM 23] @matsui528 ML-enhanced Bloom Filter [Sato & Matsui, NeurIPS 23]

Slide 3

Slide 3 text

Slides: bit.ly/41BTrrS 3 ➢ Introduction ✓ Demo: Embedding + Search + LLM ➢ The theory behind Vector DB ➢ Discussion

Slide 4

Slide 4 text

Slides: bit.ly/41BTrrS 4 ➢ Introduction ✓ Demo: Embedding + Search + LLM ➢ The theory behind Vector DB ➢ Discussion

Slide 5

Slide 5 text

Slides: bit.ly/41BTrrS 5 Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “I'm sorry, but as an AI language model, I don't have information about the future events.” Ask  We want to add knowledge to the LLM

Slide 6

Slide 6 text

Slides: bit.ly/41BTrrS 6 Embedding + Search + LLM (RAG: Retrieval Augmented Generation) Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021)

Slide 7

Slide 7 text

Slides: bit.ly/41BTrrS 7 Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…” Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 8

Slide 8 text

Slides: bit.ly/41BTrrS 8 𝒙_1, Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…” Text Encoder Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 9

Slide 9 text

Slides: bit.ly/41BTrrS 9 𝒙_1, 𝒙_2, Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…” Text Encoder Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 10

Slide 10 text

Slides: bit.ly/41BTrrS 10 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…” Text Encoder Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 11

Slide 11 text

Slides: bit.ly/41BTrrS 11 [0.23, 3.15, 0.65, 1.43] 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 12

Slide 12 text

Slides: bit.ly/41BTrrS 12 [0.23, 3.15, 0.65, 1.43] [0.20, 3.25, 0.72, 1.68] argmin_n ‖𝒒 − 𝒙_n‖₂² Result 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) Search “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “List of 2022 Winter Olympics medal winners…” Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 13

Slide 13 text

Slides: bit.ly/41BTrrS 13 [0.23, 3.15, 0.65, 1.43] [0.20, 3.25, 0.72, 1.68] argmin_n ‖𝒒 − 𝒙_n‖₂² Result 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" Search Update “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021) Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 14

Slide 14 text

Slides: bit.ly/41BTrrS 14 [0.23, 3.15, 0.65, 1.43] [0.20, 3.25, 0.72, 1.68] argmin_n ‖𝒒 − 𝒙_n‖₂² Result 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" “Niklas Edin, Oskar Eriksson, …” Search Update ☺ “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021) Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 15

Slide 15 text

Slides: bit.ly/41BTrrS 15 [0.23, 3.15, 0.65, 1.43] [0.20, 3.25, 0.72, 1.68] argmin_n ‖𝒒 − 𝒙_n‖₂² Result 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" “Niklas Edin, Oskar Eriksson, …” Search Update ☺ “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021) RAG is the simplest way to supplement an LLM with knowledge Embedding + Search + LLM (RAG: Retrieval Augmented Generation)
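The Search box above is plain exact nearest neighbor search: return the database vector that minimizes the squared Euclidean distance to the query, argmin_n ‖𝒒 − 𝒙_n‖₂². A minimal numpy sketch of exactly this step, reusing the toy 4-dimensional vectors drawn on the slide (the second and third database rows are made-up filler):

import numpy as np

# Database vectors x_1, ..., x_N (the first row is the one drawn on the slide)
X = np.array([[0.23, 3.15, 0.65, 1.43],
              [1.10, 0.02, 2.50, 0.33],
              [4.20, 1.75, 0.90, 2.10]], dtype=np.float32)

# Query vector q (the one drawn on the slide)
q = np.array([0.20, 3.25, 0.72, 1.68], dtype=np.float32)

# argmin_n ||q - x_n||_2^2 by brute-force linear scan
n_best = int(np.argmin(np.sum((X - q) ** 2, axis=1)))
print(n_best)  # -> 0: the article whose embedding is closest to the query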

Slide 16

Slide 16 text

Slides: bit.ly/41BTrrS 16  DEMO1: OpenAI Playground (fully managed service)

Slide 17

Slide 17 text

Slides: bit.ly/41BTrrS 17 DEMO1: OpenAI Playground (fully managed service) Turn on “Retrieval” Upload Olympic information ☺ Referring to the uploaded information

Slide 18

Slide 18 text

Slides: bit.ly/41BTrrS 18 DEMO1: OpenAI Playground (fully managed service) Turn on “Retrieval” Upload Olympic information ☺ Referring to the uploaded information

Slide 19

Slide 19 text

Slides: bit.ly/41BTrrS 19 DEMO2: Run on your PC (pseudo code) https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb # Embed knowledge texts = [ "Chinami Yoshida\n\n==Persona...", "Lviv bid for the 2022 Winter...", ... ] features = encode(texts) # Embed a question question = "Who won curling gold at the 2022 Winter Olympics?" query_feature = encode(question) # Search ids = search(query_feature, features) # Add results to the question question += "Use the below articles" for id in ids: question += texts[id] # Ask LLM result = LLM.ask(question)

Slide 20

Slide 20 text

Slides: bit.ly/41BTrrS 20 DEMO2: Run on your PC (pseudo code) # Embed knowledge texts = [ "Chinami Yoshida\n\n==Persona...", "Lviv bid for the 2022 Winter...", ... ] features = encode(texts) # Embed a question question = "Who won curling gold at the 2022 Winter Olympics?" query_feature = encode(question) # Search ids = search(query_feature, features) # Add results to the question question += "Use the below articles" for id in ids: question += texts[id] # Ask LLM result = LLM.ask(question) ➢ Any encoder is fine ➢ Classic methods (e.g., BM25) are also fine ➢ Modern methods are also fine (but you need GPUs) ➢ The OpenAI API is also fine (no GPUs needed, but you need to pay) How to prepare these? Search on either CPUs or GPUs (Accuracy vs Runtime vs Memory vs Money) ➢ You need GPUs to run it on your local machine, and you need to prepare an executable model such as LLaMA ➢ Or, you can use the OpenAI API. No need to prepare GPUs, but you need to pay https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb

Slide 21

Slide 21 text

Slides: bit.ly/41BTrrS 21 DEMO2: Run on your PC (pseudo code) # Embed knowledge texts = [ "Chinami Yoshida\n\n==Persona...", "Lviv bid for the 2022 Winter...", ... ] features = encode(texts) # Embed a question question = "Who won curling gold at the 2022 Winter Olympics?" query_feature = encode(question) # Search ids = search(query_feature, features) # Add results to the question question += "Use the below articles" for id in ids: question += texts[id] # Ask LLM result = LLM.ask(question) ➢ Any encoder is fine ➢ Classic methods (e.g., BM25) are also fine ➢ Modern methods are also fine (but you need GPUs) ➢ The OpenAI API is also fine (no GPUs needed, but you need to pay) How to prepare these? Search on either CPUs or GPUs (Accuracy vs Runtime vs Memory vs Money) ➢ You need GPUs to run it on your local machine, and you need to prepare an executable model such as LLaMA ➢ Or, you can use the OpenAI API. No need to prepare GPUs, but you need to pay ➢ “Embedding + Search” is traditional software engineering-ish ➢ API call or not? ✓ Easy, but costly ✓ Reproducibility for research? https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb
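A minimal runnable version of the pseudo code above, assuming the sentence-transformers package as the encoder (the slide stresses that any encoder works) and a hypothetical ask_llm() helper standing in for the final LLM call; the search step is the plain numpy linear scan from the earlier slides.

import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any text encoder would do here

# Embed knowledge
texts = [
    "Chinami Yoshida\n\n==Persona...",
    "Lviv bid for the 2022 Winter...",
    "List of 2022 Winter Olympics medal winners...",
]
features = encoder.encode(texts)             # (N, D) matrix of embeddings

# Embed a question
question = "Who won curling gold at the 2022 Winter Olympics?"
query_feature = encoder.encode(question)     # (D,) vector

# Search: argmin_n ||q - x_n||_2^2, keeping the top-2 ids
dists = np.sum((features - query_feature) ** 2, axis=1)
ids = np.argsort(dists)[:2]

# Add results to the question
question += "\nUse the below articles:\n"
for i in ids:
    question += texts[i] + "\n"

# Ask LLM (hypothetical helper: wrap whichever local model or API you use)
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in LLaMA, the OpenAI API, etc.")

# result = ask_llm(question)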

Slide 22

Slide 22 text

Slides: bit.ly/41BTrrS 22 ➢ Introduction ✓ Demo: Embedding + Search + LLM ➢ The theory behind Vector DB ➢ Discussion

Slide 23

Slide 23 text

Slides: bit.ly/41BTrrS 23 [0.23, 3.15, 0.65, 1.43] [0.20, 3.25, 0.72, 1.68] argmin_n ‖𝒒 − 𝒙_n‖₂² Result 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" “Niklas Edin, Oskar Eriksson, …” Search Update ☺ “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021) RAG is the simplest way to supplement an LLM with knowledge Embedding + Search + LLM (RAG: Retrieval Augmented Generation)

Slide 24

Slide 24 text

Slides: bit.ly/41BTrrS 24 [0.23, 3.15, 0.65, 1.43] [0.20, 3.25, 0.72, 1.68] argmin_n ‖𝒒 − 𝒙_n‖₂² Result 𝒙_1, 𝒙_2, …, 𝒙_N Texts are from: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" “Niklas Edin, Oskar Eriksson, …” Search Update ☺ “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021) RAG is the simplest way to supplement an LLM with knowledge Embedding + Search + LLM (RAG: Retrieval Augmented Generation) ➢ Approx. nearest neighbor search! ➢ Accuracy vs. runtime vs. memory ➢ Vector DB??

Slide 25

Slide 25 text

Slides: bit.ly/41BTrrS 25 Three levels of technology Milvus Pinecone Qdrant ScaNN (4-bit PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] Weaviate Vertex AI Matching Engine faiss NMSLIB hnswlib Vald ScaNN jina

Slide 26

Slide 26 text

Slides: bit.ly/41BTrrS Three levels of technology 26 Milvus Pinecone Qdrant ScaNN (4-bit PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Weaviate Vertex AI Matching Engine NMSLIB hnswlib Vald ScaNN jina Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] faiss One library may implement multiple algorithms  “I benchmarked faiss” ☺ “I benchmarked PQ in faiss”
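A sketch of the point above, assuming the faiss Python package: one library exposes several algorithms as separate index classes, so a benchmark should name the index, not just the library. The dataset and parameters below are illustrative.

import numpy as np
import faiss  # one library, several algorithms

d = 128
xb = np.random.rand(10000, d).astype("float32")  # database vectors
xq = np.random.rand(5, d).astype("float32")      # query vectors

# Algorithm 1: HNSW (graph traversal), 32 links per node
hnsw = faiss.IndexHNSWFlat(d, 32)
hnsw.add(xb)

# Algorithm 2: PQ (data compression), 16 sub-vectors of 8 bits each
pq = faiss.IndexPQ(d, 16, 8)
pq.train(xb)
pq.add(xb)

# Same search call, very different accuracy/runtime/memory trade-offs
D1, I1 = hnsw.search(xq, 10)
D2, I2 = pq.search(xq, 10)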

Slide 27

Slide 27 text

Slides: bit.ly/41BTrrS Three levels of technology 27 Milvus Pinecone Qdrant ScaNN (4-bit PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Weaviate Vertex AI Matching Engine Vald ScaNN jina Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] faiss NMSLIB hnswlib One algorithm may be implemented in multiple libraries

Slide 28

Slide 28 text

Slides: bit.ly/41BTrrS Three levels of technology 28 Milvus Pinecone Qdrant Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] Weaviate Vertex AI Matching Engine faiss NMSLIB hnswlib Vald jina ScaNN (4-bit PQ) [Guo+, ICML 2020] ScaNN Often, one library = one algorithm

Slide 29

Slide 29 text

Slides: bit.ly/41BTrrS Three levels of technology 29 Pinecone Qdrant ScaNN (4-bit PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Vertex AI Matching Engine NMSLIB Vald ScaNN jina Weaviate Milvus faiss hnswlib Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] One service may use some libraries … or re-implement algorithms from scratch (e.g., by Go)

Slide 30

Slide 30 text

Slides: bit.ly/41BTrrS 30 Three levels of technology Milvus Pinecone Qdrant ScaNN (4-bit PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] Weaviate Vertex AI Matching Engine faiss NMSLIB hnswlib Vald ScaNN jina This talk mainly focuses on algorithms

Slide 31

Slide 31 text

Slides: bit.ly/41BTrrS 31 𝑁: 10⁶ (million-scale) to 10⁹ (billion-scale) Locality Sensitive Hashing (LSH) Tree / Space Partitioning Graph traversal [diagram: an example raw vector (0.34, 0.22, 0.68, 0.71), a binary code (0 1 0 0), and a compressed code (ID: 2, ID: 123)] Space partition Data compression ➢ k-means ➢ PQ/OPQ ➢ Graph traversal ➢ etc… ➢ Raw data ➢ Scalar quantization ➢ PQ/OPQ ➢ etc… Look-up-based Hamming-based Linear scan by Asymmetric Distance … Linear scan by Hamming distance Inverted index + data compression For raw data: Acc. ☺, Memory:  For compressed data: Acc. , Memory: ☺

Slide 32

Slide 32 text

Slides: bit.ly/41BTrrS 32 𝑁: 10⁶ (million-scale) to 10⁹ (billion-scale) Locality Sensitive Hashing (LSH) Tree / Space Partitioning Graph traversal [diagram: an example raw vector (0.34, 0.22, 0.68, 0.71), a binary code (0 1 0 0), and a compressed code (ID: 2, ID: 123)] Space partition Data compression ➢ k-means ➢ PQ/OPQ ➢ Graph traversal ➢ etc… ➢ Raw data ➢ Scalar quantization ➢ PQ/OPQ ➢ etc… Look-up-based Hamming-based Linear scan by Asymmetric Distance … Linear scan by Hamming distance Inverted index + data compression For raw data: Acc. ☺, Memory:  For compressed data: Acc. , Memory: ☺ De facto standard! Will explain
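A minimal sketch of the billion-scale recipe above (inverted index + data compression), assuming the faiss Python package; nlist, m, and nprobe are illustrative values, not recommendations.

import numpy as np
import faiss

d = 128
xb = np.random.rand(100000, d).astype("float32")
xq = np.random.rand(5, d).astype("float32")

nlist, m = 1024, 8                                   # number of cells; PQ sub-vectors per code
quantizer = faiss.IndexFlatL2(d)                     # coarse quantizer: assigns vectors to cells
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, 8)  # inverted index + 8-bit PQ codes
index.train(xb)                                      # learn the k-means centroids and PQ codebooks
index.add(xb)

index.nprobe = 16                                    # number of cells visited per query
D, I = index.search(xq, 10)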

Slide 33

Slide 33 text

Slides: bit.ly/41BTrrS 33 Graph search ➢ De facto standard if all data can be loaded in memory ➢ Fast and accurate for real-world data Images are from [Malkov+, Information Systems, 2013] ➢ Traverse the graph towards the query ➢ Seems intuitive, but not so easy to understand ➢ Let's review the algorithm carefully

Slide 34

Slide 34 text

Slides: bit.ly/41BTrrS 34 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M ➢ Given a query vector Candidates (size = 3) Close to the query Name each node for explanation

Slide 35

Slide 35 text

Slides: bit.ly/41BTrrS 35 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M ➢ Given a query vector ➢ Start from an entry point (e.g., ) Candidates (size = 3) Close to the query M

Slide 36

Slide 36 text

Slides: bit.ly/41BTrrS 36 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M ➢ Given a query vector ➢ Start from an entry point (e.g., ). Record the distance to q. Candidates (size = 3) Close to the query M M 23.1

Slide 37

Slide 37 text

Slides: bit.ly/41BTrrS 37 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query 23.1 M

Slide 38

Slide 38 text

Slides: bit.ly/41BTrrS 38 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query M 23.1 1st iteration

Slide 39

Slide 39 text

Slides: bit.ly/41BTrrS 39 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M ➢ Pick up the unchecked best candidate ( ) Candidates (size = 3) Close to the query M M 23.1 Best Best

Slide 40

Slide 40 text

Slides: bit.ly/41BTrrS 40 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M ➢ Pick up the unchecked best candidate ( ). Check it. Candidates (size = 3) Close to the query M M 23.1 Best Best check!

Slide 41

Slide 41 text

Slides: bit.ly/41BTrrS 41 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query 23.1 Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. Best M M check!

Slide 42

Slide 42 text

Slides: bit.ly/41BTrrS 42 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query 23.1 Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. N M M check!

Slide 43

Slide 43 text

Slides: bit.ly/41BTrrS 43 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. N M J 11.1 N 15.3 K 19.4 M 23.1 check!

Slide 44

Slide 44 text

Slides: bit.ly/41BTrrS 44 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. N M J 11.1 N 15.3 K 19.4 M 23.1

Slide 45

Slide 45 text

Slides: bit.ly/41BTrrS 45 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) N M J 11.1 N 15.3 K 19.4 M 23.1

Slide 46

Slide 46 text

Slides: bit.ly/41BTrrS 46 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) N M J 11.1 N 15.3 K 19.4

Slide 47

Slide 47 text

Slides: bit.ly/41BTrrS 47 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4

Slide 48

Slide 48 text

Slides: bit.ly/41BTrrS 48 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 2nd iteration

Slide 49

Slide 49 text

Slides: bit.ly/41BTrrS 49 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ) J Best Best

Slide 50

Slide 50 text

Slides: bit.ly/41BTrrS 50 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ). Check it. J Best Best check!

Slide 51

Slide 51 text

Slides: bit.ly/41BTrrS 51 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. J Best Best check!

Slide 52

Slide 52 text

Slides: bit.ly/41BTrrS 52 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. J Best 13.2 9.7 check! Already visited

Slide 53

Slide 53 text

Slides: bit.ly/41BTrrS 53 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. J Best 13.2 9.7 J 11.1 N 15.3 B 2.3 G 3.5 I 9.7 F 10.2 L 13.2 check! Already visited

Slide 54

Slide 54 text

Slides: bit.ly/41BTrrS 54 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. J Best J 11.1 N 15.3 B 2.3 G 3.5 I 9.7 F 10.2 L 13.2

Slide 55

Slide 55 text

Slides: bit.ly/41BTrrS 55 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) J Best J 11.1 N 15.3 B 2.3 G 3.5 I 9.7 F 10.2 L 13.2

Slide 56

Slide 56 text

Slides: bit.ly/41BTrrS 56 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) J Best B 2.3 G 3.5 I 9.7

Slide 57

Slide 57 text

Slides: bit.ly/41BTrrS 57 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7

Slide 58

Slide 58 text

Slides: bit.ly/41BTrrS 58 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 3rd iteration

Slide 59

Slide 59 text

Slides: bit.ly/41BTrrS 59 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best Best ➢ Pick up the unchecked best candidate ( ) B

Slide 60

Slide 60 text

Slides: bit.ly/41BTrrS 60 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best Best ➢ Pick up the unchecked best candidate ( ). Check it. B check!

Slide 61

Slide 61 text

Slides: bit.ly/41BTrrS 61 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. B check!

Slide 62

Slide 62 text

Slides: bit.ly/41BTrrS 62 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. B 0.5 2.1 check!

Slide 63

Slide 63 text

Slides: bit.ly/41BTrrS 63 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. B 0.5 2.1 C 0.5 D 2.1 A 3.6 B 2.3 G 3.5 I 9.7 check!

Slide 64

Slide 64 text

Slides: bit.ly/41BTrrS 64 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. B C 0.5 D 2.1 A 3.6 B 2.3 G 3.5 I 9.7 Best

Slide 65

Slide 65 text

Slides: bit.ly/41BTrrS 65 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) B C 0.5 D 2.1 A 3.6 B 2.3 G 3.5 I 9.7 Best

Slide 66

Slide 66 text

Slides: bit.ly/41BTrrS 66 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) B C 0.5 D 2.1 B 2.3 Best

Slide 67

Slide 67 text

Slides: bit.ly/41BTrrS 67 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3

Slide 68

Slide 68 text

Slides: bit.ly/41BTrrS 68 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 4th iteration

Slide 69

Slide 69 text

Slides: bit.ly/41BTrrS 69 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). C Best Best

Slide 70

Slide 70 text

Slides: bit.ly/41BTrrS 70 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. C Best Best check!

Slide 71

Slide 71 text

Slides: bit.ly/41BTrrS 71 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. C Best Best check!

Slide 72

Slide 72 text

Slides: bit.ly/41BTrrS 72 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. C Best check! Already visited Already visited Already visited Already visited

Slide 73

Slide 73 text

Slides: bit.ly/41BTrrS 73 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) C Best

Slide 74

Slide 74 text

Slides: bit.ly/41BTrrS 74 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3

Slide 75

Slide 75 text

Slides: bit.ly/41BTrrS 75 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 5th iteration

Slide 76

Slide 76 text

Slides: bit.ly/41BTrrS 76 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). D Best Best

Slide 77

Slide 77 text

Slides: bit.ly/41BTrrS 77 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. D Best Best check!

Slide 78

Slide 78 text

Slides: bit.ly/41BTrrS 78 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. D Best Best check!

Slide 79

Slide 79 text

Slides: bit.ly/41BTrrS 79 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best check! Already visited Already visited

Slide 80

Slide 80 text

Slides: bit.ly/41BTrrS 80 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best check! Already visited Already visited H 3.9

Slide 81

Slide 81 text

Slides: bit.ly/41BTrrS 81 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best H 3.9

Slide 82

Slide 82 text

Slides: bit.ly/41BTrrS 82 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) D Best H 3.9

Slide 83

Slide 83 text

Slides: bit.ly/41BTrrS 83 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) D Best

Slide 84

Slide 84 text

Slides: bit.ly/41BTrrS 84 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query

Slide 85

Slide 85 text

Slides: bit.ly/41BTrrS 85 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 Final output 1: Candidates ➢ You can pick up the top-k results ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query

Slide 86

Slide 86 text

Slides: bit.ly/41BTrrS 86 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 Final output 1: Candidates ➢ You can pick up the top-k results ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query Final output 2: Checked items ➢ i.e., the search path

Slide 87

Slide 87 text

Slides: bit.ly/41BTrrS 87 Search Images are from [Malkov+, Information Systems, 2013] A B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ All candidates are checked. Finish. ➢ Here, C is the closest to the query Final output 1: Candidates ➢ You can pick up the top-k results Final output 2: Checked items ➢ i.e., the search path Final output 3: Visit flags ➢ For each item, visited or not

Slide 88

Slide 88 text

Slides: bit.ly/41BTrrS 88 Observation: runtime ➢ Each item comparison takes 𝑂(𝐷) time (𝒒 ∈ ℝ^D, 𝒙_13 ∈ ℝ^D) ➢ The overall runtime ~ #item_comparisons ~ length_of_search_path * average_outdegree [diagram: a search path of three steps from the start node towards the query, with outdegrees 1, 2, 2 and example distances 2.1, 1.9, 2.4] #item_comparisons = 3 * (1 + 2 + 2)/3 = 5

Slide 89

Slide 89 text

Slides: bit.ly/41BTrrS 89 Observation: candidate size [diagram: the same toy graph (A, B, C, D, E, start, query) searched twice; with Candidates (size = 1) the result is C, with Candidates (size = 3) the results are C, D, E] size = 1: Greedy search (fast, but can get stuck in a local minimum) size > 1: Beam search (slower, but finds a better solution) ➢ A larger candidate size gives better but slower results ➢ An online parameter to control the trade-off ➢ Called “ef” in HNSW
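A minimal Python sketch of the beam search walked through on slides 34 to 87, assuming a toy graph given as an adjacency dict and vectors given as a dict of numpy arrays (both hypothetical structures for illustration); the candidate-list size plays the role of “ef”.

import numpy as np

def beam_search(graph, vectors, q, entry, ef=3):
    """Beam search over a proximity graph, following the walkthrough above.
    graph: dict node -> list of connected nodes; vectors: dict node -> np.ndarray;
    q: query vector; entry: entry point; ef: candidate-list size."""
    dist = lambda n: float(np.sum((vectors[n] - q) ** 2))
    visited = {entry}                          # final output 3: visit flags
    candidates = [(dist(entry), entry)]        # candidate list, kept sorted, size <= ef
    checked = []                               # final output 2: search path
    while True:
        unchecked = [(d, n) for d, n in candidates if n not in checked]
        if not unchecked:                      # all candidates are checked -> finish
            break
        _, best = min(unchecked)               # pick up the unchecked best candidate
        checked.append(best)                   # check it
        for nb in graph[best]:                 # find the connected points
            if nb not in visited:
                visited.add(nb)
                candidates.append((dist(nb), nb))  # record the distance to q
        candidates = sorted(candidates)[:ef]   # maintain the candidates (size = ef)
    return candidates, checked, visited        # final output 1: candidates (top-k)

With ef = 1 this reduces to the greedy search on the left of the slide; a larger ef gives the beam search on the right.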

Slide 90

Slide 90 text

Slides: bit.ly/41BTrrS 90 Graph search algorithms Images are from an excellent survey paper [Wang+, VLDB 2021] ➢ Lots of algorithms ➢ Ours too! Ono & Matsui, “Relative NN-Descent”, ACMMM 23 ➢ The basic structure is the same: (1) design a good graph + (2) beam search

Slide 91

Slide 91 text

Slides: bit.ly/41BTrrS 91 Just NN? Vector DB? ➢ Vector DB companies say “Vector DB is cool” ➢ My own idea: try the simplest numpy-only search; slow? then try a fast algorithm such as HNSW in faiss; then try a Vector DB ➢ Which vector DB? ➡ No conclusions! ✓ https://weaviate.io/blog/vector-library-vs-vector-database ✓ https://codelabs.milvus.io/vector-database-101-what-is-a-vector-database/index#2 ✓ https://zilliz.com/learn/what-is-vector-database If speed is the only concern, just use libraries
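If the numpy-only linear scan becomes too slow, a graph index is only a few lines. A sketch assuming the hnswlib package (faiss.IndexHNSWFlat offers the same algorithm behind a different API); it exposes the “ef” knob from the previous slides directly, and the data and parameters below are illustrative.

import numpy as np
import hnswlib

d, N = 128, 100000
data = np.random.rand(N, d).astype("float32")

index = hnswlib.Index(space="l2", dim=d)        # squared L2, matching argmin_n ||q - x_n||^2
index.init_index(max_elements=N, ef_construction=200, M=16)
index.add_items(data, np.arange(N))

index.set_ef(50)                                # candidate-list size: larger = more accurate, slower
labels, distances = index.knn_query(data[:5], k=10)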

Slide 92

Slide 92 text

Slides: bit.ly/41BTrrS 92 ➢ Introduction ✓ Demo: Embedding + Search + LLM ➢ The theory behind Vector DB ➢ Discussion

Slide 93

Slide 93 text

Slides: bit.ly/41BTrrS 93 Recent trends ➢ “Embedding + Search” is too naïve? ✓ “A similar sentence to the question” might not be useful at all. ➢ Smarter search? ✓ Classic technologies have been revisited! ✓ E.g., Query expansion  Recall [Chum+, ICCV 2007]. They applied query expansion, which was a classic IR technique, to image retrieval.  The revisiting repeats! ➢ Directions ✓ Solve in a machine-learning way? E.g., training a special embedding space. ✓ Solve with data structures? E.g., a data structure with special operations.

Slide 94

Slide 94 text

Slides: bit.ly/41BTrrS 94 Recent trends ➢ “Embedding + Search” is too naïve? ✓ “A similar sentence to the question” might not be useful at all. ➢ Smarter search? ✓ Classic technologies have been revisited! ✓ E.g., Query expansion  Recall [Chum+, ICCV 2007]. They applied query expansion, which was a classic IR technique, to image retrieval.  The revisiting repeats! ➢ Directions ✓ Solve in a machine-learning way? E.g., training a special embedding space. ✓ Solve with data structures? E.g., a data structure with special operations.

Slide 95

Slide 95 text

Slides: bit.ly/41BTrrS 95 Recent trends ➢ “Embedding + Search” is too naïve? ✓ “A similar sentence to the question” might not be useful at all. ➢ Smarter search? ✓ Classic technologies have been revisited! ✓ E.g., Query expansion  Recall [Chum+, ICCV 2007]. They applied query expansion, which was a classic IR technique, to image retrieval.  The revisiting repeats! ➢ Directions ✓ Solve in a machine-learning way? E.g., training a special embedding space. ✓ Solve with data structures? E.g., a data structure with special operations.
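One classic form of query expansion, in the spirit of [Chum+, ICCV 2007]: retrieve the top-k neighbors, average their vectors with the query, and search again. A minimal numpy sketch (brute-force search for brevity; in practice the second search would hit the same index as the first):

import numpy as np

def search(q, X, k):
    """Brute-force top-k by squared Euclidean distance."""
    return np.argsort(np.sum((X - q) ** 2, axis=1))[:k]

def average_query_expansion(q, X, k=5):
    """Re-query with the mean of the query and its top-k neighbors."""
    ids = search(q, X, k)
    q_expanded = np.mean(np.vstack([q[None, :], X[ids]]), axis=0)
    return search(q_expanded, X, k)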

Slide 96

Slide 96 text

Slides: bit.ly/41BTrrS 96 From the perspective of Nearest Neighbor Search ➢ CPU? GPU? Managed services? ✓ Trade-off! ✓ NVIDIA is pushing GPU search (RAFT & CAGRA [Ootomo+, arXiv 23]) ✓ Can we spend money on search (not on training a model)? ➢ Next paradigm ✓ Graph-based search is mature. ✓ Do we need a breakthrough idea? ➢ Not trivial: dimensionality (𝐷) is now ~1000 (not ~100) ✓ If 𝐷 > 100, it is often said that we should reduce it by PCA to 𝐷 = 100. ✓ Because of LLM APIs, we need to handle 𝐷 ≈ 1000 anyway.
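For reference, the PCA reduction mentioned above is a one-liner, e.g. with scikit-learn; the dimensionalities below are illustrative (roughly "embedding-API-sized" in, 100 out).

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(10000, 1536).astype("float32")  # e.g., embeddings with D around 1500
X100 = PCA(n_components=100).fit_transform(X)      # reduce to D = 100 before indexing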

Slide 97

Slide 97 text

Slides: bit.ly/41BTrrS 97 Use LLMs as a building block ➢ The era of using LLMs as functions in pseudo code has already arrived. ➢ Should we run things locally or call an API? A choice we have not had to make before. ➢ How to think about “reproducibility” in the LLM-and-API era ➢ We need to be able to roughly estimate which solutions are possible for a problem, at what cost, and what we would obtain. On-premises own server: ☺ Free to use,  Initial cost and maintenance. Cloud (e.g., AWS EC2): ☺ Good balance,  Depends on the cloud. API call (new paradigm): ☺ Super cheap,  Not good for large-scale processing, communication cost?

Slide 98

Slide 98 text

Slides: bit.ly/41BTrrS 98 Reference ➢ Survey for NN search: CVPR 20 Tutorial ➢ SOTA graph-based search: CVPR 23 Tutorial ➢ Good tutorial for RAG [Asai+, ACL 2023 Tutorial]

Slide 99

Slide 99 text

Slides: bit.ly/41BTrrS 99 ◼ [Jégou+, TPAMI 2011] H. Jégou+, “Product Quantization for Nearest Neighbor Search”, IEEE TPAMI 2011 ◼ [Guo+, ICML 2020] R. Guo+, “Accelerating Large-Scale Inference with Anisotropic Vector Quantization”, ICML 2020 ◼ [Malkov+, TPAMI 2019] Y. Malkov+, “Efficient and Robust Approximate Nearest Neighbor Search using Hierarchical Navigable Small World Graphs”, IEEE TPAMI 2019 ◼ [Malkov+, IS 13] Y. Malkov+, “Approximate Nearest Neighbor Algorithm based on Navigable Small World Graphs”, Information Systems 2013 ◼ [Fu+, VLDB 19] C. Fu+, “Fast Approximate Nearest Neighbor Search with the Navigating Spreading-out Graph”, VLDB 2019 ◼ [Wang+, VLDB 21] M. Wang+, “A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search”, VLDB 2021 ◼ [Iwasaki+, arXiv 18] M. Iwasaki and D. Miyazaki, “Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data”, arXiv 2018 ◼ [Ootomo+, arXiv 23] H. Ootomo+, “CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs”, arXiv 2023 ◼ [Chum+, ICCV 07] O. Chum+, “Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval”, ICCV 2007 ◼ [Pinecone] https://www.pinecone.io/ ◼ [Milvus] https://milvus.io/ ◼ [Qdrant] https://qdrant.tech/ ◼ [Weaviate] https://weaviate.io/ ◼ [Vertex AI Matching Engine] https://cloud.google.com/vertex-ai/docs/matching-engine ◼ [Vald] https://vald.vdaas.org/ ◼ [Modal] https://modal.com/ Reference