A History of Approximate Nearest Neighbor Search from an Applications Perspective

Invited talk at Workshop of Approximate Nearest Neighbor Search: Bridging Theoretical Foundations and Industrial Frontiers at FOCS 2025 https://rajeshjayaram.com/focs-25-anns-workshop.html
Dec 14, 2025 @Sydney, Australia

Yusuke Matsui (The University of Tokyo)
https://yusukematsui.me/index.html

Transcript

  1. 1 A History of Approximate Nearest Neighbor Search from an

    Applications Perspective Yusuke Matsui The University of Tokyo Workshop of Approximate Nearest Neighbor Search: Bridging Theoretical Foundations and Industrial Frontiers at FOCS 2025 Dec 14, 2025 @Sydney, Australia
  2. 2 Yusuke Matsui ✓ Computer vision ✓ Data structure +

    Machine Learning http://yusukematsui.me Lecturer (Assistant Professor), the University of Tokyo, Japan @utokyo_bunny @matsui528 Diverse nearest neighbor search [Matsui, CVPR 25] ML-enhanced Sorting [Sato & Matsui, TMLR 25] ML-enhanced Bloom Filter [Sato & Matsui, NeurIPS 23]
  3. 3 ➢ Organized the 1st workshop on Vector Databases at

    ICML 2025 ➢ Planning the 2nd edition in 2026 ➢ Forum for NN researchers from various fields ➢ Welcome your submissions! The best paper was a theory paper! H. Xu, P. Indyk, S. Silwal, “Bi-metric Framework for Efficient Nearest Neighbor Search” Faiss CAGRA DiskANN
  4. 4 Nearest Neighbor Search (NN): Search over 𝒙1, 𝒙2, …, 𝒙𝑁, where 𝒙𝑛 ∈ ℝ^𝐷

    ➢ 𝑁 𝐷-dim database vectors: {𝒙𝑛}_{𝑛=1}^{𝑁}
  5. 5 Nearest Neighbor Search (NN): given a query 𝒒 ∈ ℝ^𝐷, search 𝒙1, 𝒙2, …, 𝒙𝑁 (𝒙𝑛 ∈ ℝ^𝐷) and return the result,

    e.g., 𝒙74 = argmin_{𝑛∈{1,2,…,𝑁}} ‖𝒒 − 𝒙𝑛‖₂² ➢ 𝑁 𝐷-dim database vectors: {𝒙𝑛}_{𝑛=1}^{𝑁} ➢ Given a query 𝒒, find the closest vector in the database ➢ One of the fundamental problems in computer science ➢ Solution: linear scan, 𝑂(𝑁𝐷), slow
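As a concrete reference point, the linear scan above can be sketched in a few lines of NumPy (a toy setup with made-up data, not code from the talk):

```python
import numpy as np

def nn_linear_scan(q, X):
    """Exact nearest neighbor by brute force: O(N*D) distance computations."""
    dists = np.sum((X - q) ** 2, axis=1)  # squared L2 distance to every x_n
    n_star = int(np.argmin(dists))
    return n_star, float(dists[n_star])

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 4))  # N = 1000 database vectors, D = 4
q = X[74].copy()                    # query identical to x_74
idx, dist = nn_linear_scan(q, X)    # -> idx == 74, dist == 0.0
```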
  6. 6 Approximate Nearest Neighbor Search (ANN): given a query 𝒒 ∈ ℝ^𝐷, search 𝒙1, 𝒙2, …, 𝒙𝑁 (𝒙𝑛 ∈ ℝ^𝐷) and return

    an approximate argmin_{𝑛∈{1,2,…,𝑁}} ‖𝒒 − 𝒙𝑛‖₂² ➢ Faster search ➢ Results don't necessarily have to be exact neighbors ➢ Trade-off among runtime, accuracy, and memory consumption
  7. 7 Why NN/ANN? ➢ NN/ANN is an interesting research area

    because: ✓ it is pure theory (yes, this is a FOCS workshop!) ✓ at the same time, it is directly used in applications, e.g., vector DBs ➢ It spans several research areas (CV, NLP, DB, ...) ➢ Because of RAG and vector DBs, ANN has become more and more popular
  8. 8 This talk ➢ Reorganized and expanded version of my

    previous tutorials at CVPR 2020 and CVPR 2023 ➢ See the original tutorials for more detailed content ➢ CVPR 2020 Tutorial on Image Retrieval in the Wild ➢ Y. Matsui, “Billion-scale Approximate Nearest Neighbor Search” ➢ https://speakerdeck.com/matsui_528/cvpr20-tutorial-billion-scale-approximate-nearest-neighbor-search ➢ CVPR 2023 Tutorial on Neural Search in Action ➢ Y. Matsui, “Theory and Applications of Graph-based Search” ➢ https://speakerdeck.com/matsui_528/cvpr23-tutorial-theory-and-applications-of-graph-based-search
  9. 9 1. History from an applications perspective 2. Importance of

    implementation: nearest neighbor search in faiss 3. Basics of modern baseline: graph-based search Outline
  10. 10 1. History from an applications perspective 2. Importance of

    implementation: nearest neighbor search in faiss 3. Basics of modern baseline: graph-based search Outline
  11. 11 What tasks has ANN been used for? 2000 2010

    2020 Computer Vision (CV): SIFT + BoF CV & NLP: CLIP multimodal search CV: Image search Large Language Models: Retrieval Augmented Generation (RAG) Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB
  12. 12 What tasks has ANN been used for? 2000 2010

    2020 Computer Vision (CV): SIFT + BoF CV & NLP: CLIP multimodal search CV: Image search Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB Large Language Models: Retrieval Augmented Generation (RAG)
  13. 13 SIFT (local feature) + BoF (Bag-of-Features) + SVM https://jp.mathworks.com/help/vision/ug/image-classification-with-bag-of-visual-words.html

    𝒙 ∈ ℝ^128, 𝑘* = argmin_{𝑘∈{1,…,5}} ‖𝒙 − 𝒄𝑘‖₂² ➢ To compute BoF fast, several practical ANN techniques were invented in the CV area in the 2000s–2010s, e.g., Product Quantization ➢ Good old days… 𝒄1 𝒄2 Extract a local patch → given codewords 𝒄𝑘, find the closest one → create a histogram, run an SVM, recognize the image… ➢ This is nearest neighbor search! ➢ 𝐾 is 10³ to 10⁴ ➢ Must be in memory
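The codeword-assignment step can be sketched as follows (a toy example with K = 2 made-up codewords; real BoF uses K ≈ 10³–10⁴ codewords and 128-dim SIFT descriptors):

```python
import numpy as np

def bof_histogram(features, codewords):
    """Assign each local feature to its nearest codeword and count assignments."""
    hist = np.zeros(len(codewords), dtype=int)
    for x in features:
        # k* = argmin_k ||x - c_k||_2^2 : this inner step is nearest neighbor search
        k_star = int(np.argmin(np.sum((codewords - x) ** 2, axis=1)))
        hist[k_star] += 1
    return hist

codewords = np.array([[0.0, 0.0], [10.0, 10.0]])             # c_1, c_2 (toy)
features = np.array([[0.1, 0.2], [9.8, 10.1], [0.0, -0.1]])  # local patch features
hist = bof_histogram(features, codewords)                    # -> [2, 1]
```

The resulting histogram is what gets fed to the SVM in the pipeline above.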
  14. 14 What tasks has ANN been used for? 2000 2010

    2020 Computer Vision (CV): SIFT + BoF CV & NLP: CLIP multimodal search CV: Image search Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB Large Language Models: Retrieval Augmented Generation (RAG)
  15. 15 Image Search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos by

    Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash
  16. 16 Image Search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos by

    Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 𝒙1 ResNet
  17. 17 Image Search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos by

    Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 𝒙1 , 𝒙2 , ResNet
  18. 18 Image Search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos by

    Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 𝒙1 , 𝒙2 , … , 𝒙𝑁 … ResNet
  19. 19 Image Search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos by

    Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 0.23 3.15 0.65 1.43 Search 𝒙1 , 𝒙2 , … , 𝒙𝑁 ResNet … ResNet
  20. 20 Image Search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos by

    Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 0.23 3.15 0.65 1.43 Search 0.20 3.25 0.72 1.68 argmin 𝒒 − 𝒙𝑛 2 2 Result 𝒙1 , 𝒙2 , … , 𝒙𝑁 … ResNet ResNet
  21. 21 Image Search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos by

    Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash Search: argmin_{𝑛} ‖𝒒 − 𝒙𝑛‖₂² → Result 𝒙1, 𝒙2, …, 𝒙𝑁 … ResNet ➢ Use a pre-trained CNN model (e.g., ResNet) as a feature extractor ➢ Represent an image as a high-dimensional vector ➢ Image retrieval by nearest neighbor search
  22. 22 What tasks has ANN been used for? 2000 2010

    2020 CV & NLP: CLIP multimodal search CV: Image search Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB Large Language Models: Retrieval Augmented Generation (RAG) Computer Vision (CV): SIFT + BoF
  23. 23 Text Search (Dense and/or Sparse) Doc1: “The cat sleeps

    on chairs” Doc2: “A quick blue fox runs” Doc3: “The cat sits near window” … Query: “fox” Dense search: encode the query as 𝒒_fox and search 𝒙1, 𝒙2, …, 𝒙𝑁: argmin_{𝑛} ‖𝒒_fox − 𝒙𝑛‖₂² → 𝒙2 Sparse search: inverted index (Term → ID: the → 1, 3; fox → 2; cat → 1, 3; blue → 2; …): “fox” → Doc2 ➢ How to design the embedding? BERT and its successors… ➢ Classical “matching” and its extensions: TF-IDF, BM25, SPLADE… ➢ Dense search: accurate ➢ Sparse search: fast ➢ Combining the two approaches is a hot topic in information retrieval
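The sparse side of the slide can be sketched as a toy inverted index (document text taken from the slide; the index maps each term to the IDs of the documents containing it):

```python
from collections import defaultdict

docs = {
    1: "the cat sleeps on chairs",
    2: "a quick blue fox runs",
    3: "the cat sits near window",
}

# Build the inverted index: term -> set of doc IDs.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

hits = sorted(index["fox"])  # query "fox" -> [2], i.e., Doc2
```

Because only the terms of the query are touched, lookups are fast; real systems add scoring such as TF-IDF or BM25 on top of this structure.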
  24. 24 What tasks has ANN been used for? 2000 2010

    2020 CV & NLP: CLIP multimodal search CV: Image search Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB Large Language Models: Retrieval Augmented Generation (RAG) Computer Vision (CV): SIFT + BoF
  25. 25 https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm kNN classification Label: dog Label: cat ➢ The

    most straightforward approach to classification ➢ Given a query, find the closest samples in the training data and report their label ➢ Although super simple, it is actually effective if the embedding is good and the number of samples is large ➢ We can just run ANN search ➢ Complex ML → simple but large-scale search
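The kNN rule above can be sketched with toy 2-D embeddings (made-up data; in practice the `argsort` step is replaced by ANN search over millions of high-dimensional samples):

```python
import numpy as np

def knn_classify(q, X, labels, k=3):
    """Classify q by majority vote among its k nearest training samples."""
    dists = np.sum((X - q) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]     # exact top-k; ANN would approximate this
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

X = np.array([[0.0, 0.1], [0.1, 0.0], [-0.1, 0.1],   # "cat" cluster
              [5.0, 5.1], [5.1, 5.0], [4.9, 5.1]])   # "dog" cluster
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
pred = knn_classify(np.array([0.2, 0.2]), X, labels, k=3)  # -> "cat"
```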
  26. 26 What tasks has ANN been used for? 2000 2010

    2020 CV & NLP: CLIP multimodal search CV: Image search Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB Large Language Models: Retrieval Augmented Generation (RAG) Computer Vision (CV): SIFT + BoF
  27. 27 CLIP multimodal search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos

    by Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash
  28. 28 CLIP multimodal search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos

    by Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 𝒙1 CLIP Image Encoder
  29. 29 CLIP multimodal search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos

    by Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 𝒙1 , 𝒙2 , CLIP Image Encoder
  30. 30 CLIP multimodal search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos

    by Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 𝒙1 , 𝒙2 , … , 𝒙𝑁 … CLIP Image Encoder
  31. 31 CLIP multimodal search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos

    by Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash 0.23 3.15 0.65 1.43 Search 𝒙1 , 𝒙2 , … , 𝒙𝑁 CLIP Text Encoder … CLIP Image Encoder “Two dogs playing in the snow”
  32. 32 CLIP multimodal search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos

    by Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash “Two dogs playing in the snow” 0.23 3.15 0.65 1.43 Search 0.20 3.25 0.72 1.68 argmin 𝒒 − 𝒙𝑛 2 2 Result 𝒙1 , 𝒙2 , … , 𝒙𝑁 CLIP Text Encoder … CLIP Image Encoder
  33. 33 CLIP multimodal search Images are from: https://github.com/haltakov/natural-language-image-search Credit: Photos

    by Genton Damian, bruce mars, Dalal Nizam, and Richard Burlton on Unsplash “Two dogs playing in the snow” CLIP Text Encoder → Search: argmin_{𝑛} ‖𝒒 − 𝒙𝑛‖₂² → Result 𝒙1, 𝒙2, …, 𝒙𝑁 … CLIP Image Encoder ➢ CLIP enables us to compare images and texts ➢ The encoder determines the upper bound of the system's accuracy ➢ ANN determines the trade-off between accuracy, runtime, and memory
  34. 34 What tasks has ANN been used for? 2000 2010

    2020 CV & NLP: CLIP multimodal search CV: Image search Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB Large Language Models: Retrieval Augmented Generation (RAG) Computer Vision (CV): SIFT + BoF
  35. 35 RAG: LLM + embedding Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon

    credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021)
  36. 36 RAG: LLM + embedding Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon

    credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “I'm sorry, but as an AI language model, I don't have information about the future events.” Ask ☹
  37. 37 RAG: LLM + embedding Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon

    credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021)
  38. 38 RAG: LLM + embedding Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon

    credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…”
  39. 39 RAG: LLM + embedding 𝒙1 , Texts are from:

    https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…” Text Encoder
  40. 40 RAG: LLM + embedding 𝒙1 , 𝒙2 , Texts

    are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…” Text Encoder
  41. 41 RAG: LLM + embedding 𝒙1 , 𝒙2 , …

    , 𝒙𝑁 Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … “Chinami Yoshida\n\n==Personal…” Text Encoder
  42. 42 RAG: LLM + embedding 0.23 3.15 0.65 1.43 𝒙1

    , 𝒙2 , … , 𝒙𝑁 Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder
  43. 43 RAG: LLM + embedding 0.23 3.15 0.65 1.43 0.20

    3.25 0.72 1.68 argmin 𝒒 − 𝒙𝑛 2 2 Result 𝒙1 , 𝒙2 , … , 𝒙𝑁 Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" ChatGPT 3.5 (trained in 2021) Search “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “List of 2022 Winter Olympics medal winners…”
  44. 44 RAG: LLM + embedding Search: argmin_{𝑛} ‖𝒒 − 𝒙𝑛‖₂² → Result

    𝒙1, 𝒙2, …, 𝒙𝑁 Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" Search Update “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021)
  45. 45 RAG: LLM + embedding Search: argmin_{𝑛} ‖𝒒 − 𝒙𝑛‖₂² → Result

    𝒙1, 𝒙2, …, 𝒙𝑁 Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" “Niklas Edin, Oskar Eriksson, …” Search Update ☺ “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021)
  46. 46 RAG: LLM + embedding Search: argmin_{𝑛} ‖𝒒 − 𝒙𝑛‖₂² → Result

    𝒙1, 𝒙2, …, 𝒙𝑁 Texts are from: https://github.com/openai/openaicookbook/blob/main/examples/Question_answering_using_embeddings.ipynb Icon credit: https://ja.wikipedia.org/wiki/ChatGPT "Who won curling gold at the 2022 Winter Olympics?" “Niklas Edin, Oskar Eriksson, …” Search Update ☺ “Damir Sharipzyanov\n\n=Career…” “Lviv bid for the 2022 Winter…” … Text Encoder “Chinami Yoshida\n\n==Personal…” Text Encoder “Who won curling gold at the 2022 Winter Olympics? Use the below articles: List of 2022 Winter Olympics medal winners…” “List of 2022 Winter Olympics medal winners…” ChatGPT 3.5 (trained in 2021) Embedding + ANN is currently the easiest way to provide knowledge to an LLM
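The retrieve-then-prompt loop can be sketched end to end; the 2-D embeddings below are hypothetical stand-ins for what a real text encoder would produce, so only the control flow (search, then prompt update) mirrors the slides:

```python
import numpy as np

articles = [
    "List of 2022 Winter Olympics medal winners ...",
    "Lviv bid for the 2022 Winter ...",
    "Chinami Yoshida ==Personal ...",
]
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # hypothetical article embeddings
q = np.array([0.9, 0.1])                            # hypothetical query embedding

# Retrieve the nearest article, then splice it into the prompt for the LLM.
best = int(np.argmin(np.sum((X - q) ** 2, axis=1)))
prompt = ("Who won curling gold at the 2022 Winter Olympics? "
          "Use the below articles: " + articles[best])
```

At scale, the `argmin` over millions of chunk embeddings is exactly where ANN replaces the exact scan.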
  47. 47 What tasks has ANN been used for? 2000 2010

    2020 CV & NLP: CLIP multimodal search CV: Image search Natural Language Processing (NLP) & Information Retrieval: Text search (Dense? Sparse?) Machine Learning: kNN classification, metric learning Database: VectorDB Large Language Models: Retrieval Augmented Generation (RAG) Computer Vision (CV): SIFT + BoF
  48. 48 What is Vector DB? Milvus Pinecone Qdrant ScaNN (4-bit

    PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] Weaviate Vertex AI Matching Engine faiss NMSLIB hnswlib Vald ScaNN jina
  49. What is Vector DB? 49 Milvus Pinecone Qdrant ScaNN (4-bit

    PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc. Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc.) ➢ Usually, by companies Weaviate Vertex AI Matching Engine NMSLIB hnswlib Vald ScaNN jina Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] faiss One library may implement multiple algorithms ☹ “I benchmarked faiss” ☺ “I benchmarked PQ in faiss”
  50. What is Vector DB? 50 Milvus Pinecone Qdrant ScaNN (4-bit

    PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Weaviate Vertex AI Matching Engine Vald ScaNN jina Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] faiss NMSLIB hnswlib One algorithm may be implemented in multiple libraries
  51. What is Vector DB? 51 Milvus Pinecone Qdrant Algorithm ➢

    Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] Weaviate Vertex AI Matching Engine faiss NMSLIB hnswlib Vald jina ScaNN (4-bit PQ) [Guo+, ICML 2020] ScaNN Often, one library = one algorithm
  52. What is Vector DB? 52 Pinecone Qdrant ScaNN (4-bit PQ)

    [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc. Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc.) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Vertex AI Matching Engine NMSLIB Vald ScaNN jina Weaviate Milvus faiss hnswlib Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] One service may use several libraries … or re-implement algorithms from scratch (e.g., in Go)
  53. 53 What is Vector DB? Milvus Pinecone Qdrant ScaNN (4-bit

    PQ) [Guo+, ICML 2020] Algorithm ➢ Scientific paper ➢ Math ➢ Often, by researchers Library ➢ Implementations of algorithms ➢ Usually, a search function only ➢ By researchers, developers, etc Service (e.g., vector DB) ➢ Library + (handling metadata, serving, scaling, IO, CRUD, etc) ➢ Usually, by companies Product Quantization + Inverted Index (PQ, IVFPQ) [Jégou+, TPAMI 2011] Hierarchical Navigable Small World (HNSW) [Malkov+, TPAMI 2019] Weaviate Vertex AI Matching Engine faiss NMSLIB hnswlib Vald ScaNN jina ➢ Recently, the DB community has been the most active ➢ The CV community has settled down a bit
  54. 54 1. History from an applications perspective 2. Importance of

    implementation: nearest neighbor search in faiss 3. Basics of modern baseline: graph-based search Outline
  55. 55 Nearest Neighbor Search: given a query 𝒒 ∈ ℝ^𝐷, search 𝒙1, 𝒙2, …, 𝒙𝑁 (𝒙𝑛 ∈ ℝ^𝐷) and return the result,

    e.g., 𝒙74 = argmin_{𝑛∈{1,2,…,𝑁}} ‖𝒒 − 𝒙𝑛‖₂² ➢ Before trying ANN, we should try NN ➢ Introduce a naïve implementation ➢ Introduce a fast implementation: the Faiss library ➢ Experience the drastic difference between the two implementations
  56. 56 Task: Given 𝒒 ∈ 𝒬 and 𝒙 ∈ 𝒳, compute ‖𝒒 − 𝒙‖₂²

    𝑀 𝐷-dim query vectors 𝒬 = {𝒒1, 𝒒2, …, 𝒒𝑀}; 𝑁 𝐷-dim database vectors 𝒳 = {𝒙1, 𝒙2, …, 𝒙𝑁}; 𝑀 ≪ 𝑁
  57. 57 𝑀 𝐷-dim query vectors 𝒬 = {𝒒1, 𝒒2, …, 𝒒𝑀}; 𝑁 𝐷-dim database vectors 𝒳 = {𝒙1, 𝒙2, …, 𝒙𝑁}; 𝑀 ≪ 𝑁

    Task: Given 𝒒 ∈ 𝒬 and 𝒙 ∈ 𝒳, compute ‖𝒒 − 𝒙‖₂² Naïve impl. (parallelize the query side; the min-selection by heap is omitted here): parfor q in Q: for x in X: l2sqr(q, x) def l2sqr(q, x): diff = 0.0 for (d = 0; d < D; ++d): diff += (q[d] - x[d])**2 return diff
  58. 58 𝑀 𝐷-dim query vectors 𝒬 = {𝒒1, 𝒒2, …, 𝒒𝑀}; 𝑁 𝐷-dim database vectors 𝒳 = {𝒙1, 𝒙2, …, 𝒙𝑁}; 𝑀 ≪ 𝑁

    Task: Given 𝒒 ∈ 𝒬 and 𝒙 ∈ 𝒳, compute ‖𝒒 − 𝒙‖₂² Naïve impl. (parallelize the query side; the min-selection by heap is omitted here): parfor q in Q: for x in X: l2sqr(q, x) def l2sqr(q, x): diff = 0.0 for (d = 0; d < D; ++d): diff += (q[d] - x[d])**2 return diff faiss impl.: if 𝑀 < 20: compute ‖𝒒 − 𝒙‖₂² by SIMD else: compute ‖𝒒 − 𝒙‖₂² = ‖𝒒‖₂² − 2𝒒⊤𝒙 + ‖𝒙‖₂² by BLAS Disclaimer: the code used in the explanation is from several years ago and is not up to date.
  59. 59 𝑀 𝐷-dim query vectors 𝒬 = {𝒒1, 𝒒2, …, 𝒒𝑀}; 𝑁 𝐷-dim database vectors 𝒳 = {𝒙1, 𝒙2, …, 𝒙𝑁}; 𝑀 ≪ 𝑁

    Task: Given 𝒒 ∈ 𝒬 and 𝒙 ∈ 𝒳, compute ‖𝒒 − 𝒙‖₂² Naïve impl. (parallelize the query side; the min-selection by heap is omitted here): parfor q in Q: for x in X: l2sqr(q, x) def l2sqr(q, x): diff = 0.0 for (d = 0; d < D; ++d): diff += (q[d] - x[d])**2 return diff faiss impl.: if 𝑀 < 20: compute ‖𝒒 − 𝒙‖₂² by SIMD else: compute ‖𝒒 − 𝒙‖₂² = ‖𝒒‖₂² − 2𝒒⊤𝒙 + ‖𝒙‖₂² by BLAS Disclaimer: the code used in the explanation is from several years ago and is not up to date.
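The expansion used in the else-branch can be verified directly: one matrix multiply (a GEMM, which BLAS makes very fast) replaces the per-pair inner loops. This is a NumPy sketch of the identity, not faiss's actual code:

```python
import numpy as np

def l2sqr_pairwise(Q, X):
    """Direct (M, N) table of squared distances: one inner loop per entry."""
    return np.array([[np.sum((q - x) ** 2) for x in X] for q in Q])

def l2sqr_blas(Q, X):
    """||q - x||^2 = ||q||^2 - 2 q^T x + ||x||^2; the cross term is one GEMM."""
    q_sq = np.sum(Q ** 2, axis=1, keepdims=True)  # (M, 1)
    x_sq = np.sum(X ** 2, axis=1)                 # (N,)
    return q_sq - 2.0 * (Q @ X.T) + x_sq          # broadcasts to (M, N)

rng = np.random.default_rng(0)
Q = rng.standard_normal((30, 16))   # M = 30 queries
X = rng.standard_normal((100, 16))  # N = 100 database vectors
```

Both routes give the same table up to floating-point error; the BLAS route wins once M is large enough to amortize the GEMM, which is exactly the M ≥ 20 branch above.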
  60. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } 𝒙 − 𝒚 2 2 by SIMD Ref. Rename variables for the sake of explanation x y D=31 float: 32bit 60 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff
  61. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my ➢ 256bit SIMD Register ➢ Process eight floats at once float: 32bit 61 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31
  62. float: 32bit float fvec_L2sqr (const float * x, const float

    * y, size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my 62 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31 ➢ 256bit SIMD Register ➢ Process eight floats at once
  63. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my a_m_b1 ⊖⊖⊖⊖ ⊖ ⊖ ⊖⊖ float: 32bit 63 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31 ➢ 256bit SIMD Register ➢ Process eight floats at once
  64. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my a_m_b1 msum1 a_m_b1 ⊖⊖⊖⊖ ⊖ ⊖ ⊖⊖ ⊗⊗⊗⊗⊗⊗⊗⊗ += float: 32bit 64 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31 ➢ 256bit SIMD Register ➢ Process eight floats at once
  65. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my msum1 += float: 32bit 65 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31 ➢ 256bit SIMD Register ➢ Process eight floats at once
  66. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my a_m_b1 ⊖⊖⊖⊖ ⊖ ⊖ ⊖⊖ msum1 += float: 32bit 66 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31 ➢ 256bit SIMD Register ➢ Process eight floats at once
  67. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my a_m_b1 msum1 a_m_b1 ⊖⊖⊖⊖ ⊖ ⊖ ⊖⊖ ⊗⊗⊗⊗⊗⊗⊗⊗ += float: 32bit 67 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31 ➢ 256bit SIMD Register ➢ Process eight floats at once
  68. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } x y mx my a_m_b1 msum1 a_m_b1 msum2 ⊖⊖⊖⊖ ⊖ ⊖ ⊖⊖ ⊗⊗⊗⊗⊗⊗⊗⊗ ⊕⊕⊕⊕ ➢ 128bit SIMD Register += float: 32bit 68 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] – y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD Rename variables for the sake of explanation Ref. D=31 ➢ 256bit SIMD Register ➢ Process eight floats at once
  69. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } [Diagram: the next four floats of x and y are loaded into 128-bit registers mx and my; the squared differences are accumulated into msum2] 69 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] - y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD. Variables renamed for the sake of explanation. Ref. D=31 ➢ 128-bit SIMD register
  70. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } [Diagram: the rest, i.e., the last d < 4 floats, are loaded with masked_read (zero-padded) and their squared differences accumulated into msum2] 70 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] - y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD. Variables renamed for the sake of explanation. Ref. D=31 ➢ 128-bit SIMD register
  71. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } [Diagram: msum2 is reduced with two horizontal adds (_mm_hadd_ps); the first lane holds the final result] 71 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] - y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD. Variables renamed for the sake of explanation. Ref. D=31 ➢ 128-bit SIMD register
  72. float fvec_L2sqr (const float * x, const float * y,

    size_t d) { __m256 msum1 = _mm256_setzero_ps(); while (d >= 8) { __m256 mx = _mm256_loadu_ps (x); x += 8; __m256 my = _mm256_loadu_ps (y); y += 8; const __m256 a_m_b1 = mx - my; msum1 += a_m_b1 * a_m_b1; d -= 8; } __m128 msum2 = _mm256_extractf128_ps(msum1, 1); msum2 += _mm256_extractf128_ps(msum1, 0); if (d >= 4) { __m128 mx = _mm_loadu_ps (x); x += 4; __m128 my = _mm_loadu_ps (y); y += 4; const __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; d -= 4; } if (d > 0) { __m128 mx = masked_read (d, x); __m128 my = masked_read (d, y); __m128 a_m_b1 = mx - my; msum2 += a_m_b1 * a_m_b1; } msum2 = _mm_hadd_ps (msum2, msum2); msum2 = _mm_hadd_ps (msum2, msum2); return _mm_cvtss_f32 (msum2); } [Diagram: msum2 is reduced with two horizontal adds (_mm_hadd_ps); the first lane holds the final result] 72 def l2sqr(x, y): diff = 0.0 for (d = 0; d < D; ++d): diff += (x[d] - y[d])**2 return diff 𝒙 − 𝒚 2 2 by SIMD. Variables renamed for the sake of explanation. Ref. D=31 ➢ 128-bit SIMD register ➢ The SIMD code in faiss is simple and easy to read ➢ Being able to read SIMD code comes in handy; it explains why this implementation is so fast ➢ Another example of a SIMD L2sqr, from HNSW: https://github.com/nmslib/hnswlib/blob/master/hnswlib/space_l2.h
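The blocked accumulation above can be mirrored in plain Python to see why it computes the same value as the scalar loop. This is an illustrative sketch, not the faiss code: the three stages correspond to the 8-wide (256-bit), 4-wide (128-bit), and masked-remainder steps, and the function names are invented for this example.

```python
# Sketch of fvec_L2sqr's blocking strategy in plain Python (illustration only).

def l2sqr_blocked(x, y):
    d, i = len(x), 0
    msum = 0.0
    # "256-bit" step: consume eight floats at a time, like msum1
    while d - i >= 8:
        msum += sum((x[i + j] - y[i + j]) ** 2 for j in range(8))
        i += 8
    # "128-bit" step: one chunk of four floats, like msum2
    if d - i >= 4:
        msum += sum((x[i + j] - y[i + j]) ** 2 for j in range(4))
        i += 4
    # masked remainder: the last d mod 4 floats (zero-padded in the SIMD code)
    msum += sum((x[j] - y[j]) ** 2 for j in range(i, d))
    return msum

def l2sqr_naive(x, y):
    # the scalar reference loop from the slide
    return sum((a - b) ** 2 for a, b in zip(x, y))
```

For the slide's D=31 case the while loop runs three times (24 floats), the 4-wide step runs once (4 floats), and the remainder handles the last 3; the result matches the naive loop exactly.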
  73. 𝑀 𝐷-dim query vectors 𝒬 = 𝒒1 , 𝒒2 ,

    … , 𝒒𝑀 𝑁 𝐷-dim database vectors 𝒳 = 𝒙1 , 𝒙2 , … , 𝒙𝑁 Task: Given 𝒒 ∈ 𝒬 and 𝒙 ∈ 𝒳, compute 𝒒 − 𝒙 2 2 parfor q in Q: for x in X: l2sqr(q, x) def l2sqr(q, x): diff = 0.0 for (d = 0; d < D; ++d): diff += (q[d] - x[d])**2 return diff Naïve impl. Parallelize the query side. Select the min by a heap (omitted here). faiss impl. if 𝑀 < 20: compute 𝒒 − 𝒙 2 2 by SIMD else: compute 𝒒 − 𝒙 2 2 = 𝒒 2 2 − 2𝒒⊤𝒙 + 𝒙 2 2 by BLAS 73
  74. Compute 𝒒 − 𝒙 2 2 = 𝒒 2 2

    − 2𝒒⊤𝒙 + 𝒙 2 2 with BLAS # Compute tables q_norms = norms(Q) # 𝒒1 2 2, 𝒒2 2 2, … , 𝒒𝑀 2 2 x_norms = norms(X) # 𝒙1 2 2, 𝒙2 2 2, … , 𝒙𝑁 2 2 ip = sgemm_(Q, X, …) # 𝑄⊤𝑋 # Scan and sum parfor (m = 0; m < M; ++m): for (n = 0; n < N; ++n): dist = q_norms[m] + x_norms[n] – 2 * ip[m][n] Stack 𝑀 𝐷-dim query vectors to a 𝐷 × 𝑀 matrix: 𝑄 = 𝒒1 , 𝒒2 , … , 𝒒𝑀 ∈ ℝ𝐷×𝑀 Stack 𝑁 𝐷-dim database vectors to a 𝐷 × 𝑁 matrix: 𝑋 = 𝒙1 , 𝒙2 , … , 𝒙𝑁 ∈ ℝ𝐷×𝑁 SIMD-accelerated function ➢ Matrix multiplication by BLAS ➢ Dominant if 𝑄 and 𝑋 are large ➢ The choice of BLAS backend matters: ✓ Intel MKL is 30% faster than OpenBLAS 74 𝒒𝑚 2 2 𝒙𝑛 2 2 𝑄⊤𝑋 𝑚𝑛 𝒒𝑚 − 𝒙𝑛 2 2
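The norm-table-plus-GEMM scheme above can be sketched with NumPy, where `np.dot` stands in for the BLAS `sgemm_` call. This is an illustration, not the faiss implementation; `pairwise_l2sqr` is a made-up name, and for convenience it takes row-major (M, D) and (N, D) matrices rather than the slide's column layout.

```python
import numpy as np

def pairwise_l2sqr(Q, X):
    """Q: (M, D) queries, X: (N, D) database. Returns the (M, N) matrix of
    squared L2 distances via ||q||^2 - 2 q.T x + ||x||^2."""
    q_norms = (Q ** 2).sum(axis=1)   # ||q_m||^2 for each query, shape (M,)
    x_norms = (X ** 2).sum(axis=1)   # ||x_n||^2 for each database vector, shape (N,)
    ip = Q @ X.T                     # inner products; the BLAS-dominant part
    # broadcast the two norm tables against the inner-product matrix
    return q_norms[:, None] + x_norms[None, :] - 2.0 * ip
```

A quick check against the brute-force definition confirms the expansion (up to floating-point error, which is one practical caveat of this trick).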
  75. NN on GPU (faiss-gpu) is 10x faster than NN on

    CPU (faiss-cpu) ➢ NN-GPU always computes 𝒒 2 2 − 2𝒒⊤𝒙 + 𝒙 2 2 ➢ k-means for 1M vectors (D=256, K=20000) ✓ 11 min on CPU ✓ 55 sec on 1 Pascal-class P100 GPU (float32 math) ✓ 34 sec on 1 Pascal-class P100 GPU (float16 math) ✓ 21 sec on 4 Pascal-class P100 GPUs (float32 math) ✓ 16 sec on 4 Pascal-class P100 GPUs (float16 math) ➢ If a GPU is available and its memory is sufficient, try GPU-NN ➢ The behavior is a little different (e.g., a restriction on top-k) Benchmark: https://github.com/facebookresearch/faiss/wiki/Low-level-benchmarks x10 faster 75
  76. 76 1. History from an applications perspective 2. Importance of

    implementation: nearest neighbor search in faiss 3. Basics of modern baseline: graph-based search Outline
  77. 77 Graph search ➢ De facto standard if all data

    can be loaded in memory ➢ Fast and accurate for real-world data ➢ Important at billion scale as well ✓ Graph search is a building block for billion-scale systems Images are from [Malkov+, Information Systems, 2013] ➢ Traverse the graph towards the query ➢ Seems intuitive, but not so easy to understand ➢ Review the algorithm carefully
  78. 78 Construction Images are from [Malkov+, Information Systems, 2013] and

    [Subramanya+, NeurIPS 2019] Incremental approach Refinement approach ➢ Add a new item to the current graph incrementally ➢ Iteratively refine an initial graph
  79. 79 Construction Images are from [Malkov+, Information Systems, 2013] and

    [Subramanya+, NeurIPS 2019] Incremental approach Refinement approach ➢ Add a new item to the current graph incrementally ➢ Iteratively refine an initial graph
  80. 80 Images are from [Malkov+, Information Systems, 2013] ➢Each node

    is a database vector 𝒙13 Graph of 𝒙1 , … , 𝒙90 Construction: incremental approach
  81. 81 ➢Each node is a database vector ➢Given a new

    database vector, 𝒙13 𝒙91 Graph of 𝒙1 , … , 𝒙90 Images are from [Malkov+, Information Systems, 2013] Construction: incremental approach
  82. 82 ➢Each node is a database vector ➢Given a new

    database vector, create new edges to neighbors 𝒙13 𝒙91 Graph of 𝒙1 , … , 𝒙90 Images are from [Malkov+, Information Systems, 2013] Construction: incremental approach
  83. 83 ➢Each node is a database vector ➢Given a new

    database vector, create new edges to neighbors 𝒙13 𝒙91 Graph of 𝒙1 , … , 𝒙90 Images are from [Malkov+, Information Systems, 2013] Construction: incremental approach
  84. 84 ➢Each node is a database vector ➢Given a new

    database vector, create new edges to neighbors 𝒙13 𝒙91 Graph of 𝒙1 , … , 𝒙90 Images are from [Malkov+, Information Systems, 2013] Construction: incremental approach ➢ Prune edges if some nodes have too many edges ➢ Several strategies (e.g., RNG-pruning)
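The incremental construction can be sketched as a short toy program. This is an illustration under simplifying assumptions, not NSW/HNSW itself: the neighbors of a new vector are found by exact k-NN over existing nodes (real systems use the graph search for this), and degree control is a naive drop-the-farthest-edge rule rather than RNG-pruning. All names here are invented for the example.

```python
# Toy incremental graph construction: insert a vector, connect it to its
# k nearest existing nodes, and prune edges of nodes that grow too dense.

def l2sqr(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def insert(vectors, edges, new_vec, k=2, max_degree=4):
    new_id = len(vectors)
    # exact k-NN over existing nodes (stand-in for a graph search)
    neighbors = sorted(range(new_id), key=lambda i: l2sqr(vectors[i], new_vec))[:k]
    vectors.append(new_vec)
    edges.append(set(neighbors))
    for n in neighbors:
        edges[n].add(new_id)
        if len(edges[n]) > max_degree:
            # naive pruning: drop the farthest edge of the over-full node
            far = max(edges[n], key=lambda i: l2sqr(vectors[i], vectors[n]))
            edges[n].discard(far)
            edges[far].discard(n)
```

Inserting the corners of a unit square one by one, the fourth point (1, 1) gets connected to its two nearest predecessors (1, 0) and (0, 1), matching the picture of a new node wiring itself into the existing graph.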
  85. 85 Construction Images are from [Malkov+, Information Systems, 2013] and

    [Subramanya+, NeurIPS 2019] Incremental approach Refinement approach ➢ Add a new item to the current graph incrementally ➢ Iteratively refine an initial graph
  86. 86 Construction: refinement approach Images are from [Subramanya+, NeurIPS 2019]

    ➢ Create an initial graph (e.g., random graph or approx. kNN graph) ➢ Refine it iteratively (pruning/adding edges)
  87. 87 Construction: refinement approach Images are from [Subramanya+, NeurIPS 2019]

    ➢ Create an initial graph (e.g., random graph or approx. kNN graph) ➢ Refine it iteratively (pruning/adding edges) ➢ Needs to be moderately sparse (otherwise graph traversal is slow) ➢ Some “long” edges are required as shortcuts
  88. 88 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M ➢ Given a query vector Candidates (size = 3) Close to the query Name each node for explanation
  89. 89 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M ➢ Given a query vector ➢ Start from an entry point (e.g., ) Candidates (size = 3) Close to the query M
  90. 90 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M ➢ Given a query vector ➢ Start from an entry point (e.g., ). Record the distance to q. Candidates (size = 3) Close to the query M M 23.1
  91. 91 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query 23.1 M
  92. 92 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query M 23.1 1st iteration
  93. 93 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M ➢ Pick up the unchecked best candidate ( ) Candidates (size = 3) Close to the query M M 23.1 Best Best
  94. 94 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M ➢ Pick up the unchecked best candidate ( ). Check it. Candidates (size = 3) Close to the query M M 23.1 Best Best check!
  95. 95 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query 23.1 Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. Best M M check!
  96. 96 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query 23.1 Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. N M M check!
  97. 97 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. N M J 11.1 N 15.3 K 19.4 M 23.1 check!
  98. 98 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. N M J 11.1 N 15.3 K 19.4 M 23.1
  99. 99 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) N M J 11.1 N 15.3 K 19.4 M 23.1
  100. 100 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) N M J 11.1 N 15.3 K 19.4
  101. 101 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4
  102. 102 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 2nd iteration
  103. 103 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ) J Best Best
  104. 104 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ). Check it. J Best Best check!
  105. 105 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. J Best Best check!
  106. 106 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N J 11.1 N 15.3 K 19.4 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. J Best 13.2 9.7 check! Already visited
  107. 107 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. J Best 13.2 9.7 J 11.1 N 15.3 B 2.3 G 3.5 I 9.7 F 10.2 L 13.2 check! Already visited
  108. 108 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. J Best J 11.1 N 15.3 B 2.3 G 3.5 I 9.7 F 10.2 L 13.2
  109. 109 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) J Best J 11.1 N 15.3 B 2.3 G 3.5 I 9.7 F 10.2 L 13.2
  110. 110 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) J Best B 2.3 G 3.5 I 9.7
  111. 111 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7
  112. 112 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 3rd iteration
  113. 113 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best Best ➢ Pick up the unchecked best candidate ( ) B
  114. 114 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best Best ➢ Pick up the unchecked best candidate ( ). Check it. B check!
  115. 115 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. B check!
  116. 116 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N B 2.3 G 3.5 I 9.7 Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. B 0.5 2.1 check!
  117. 117 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N Best ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. B 0.5 2.1 C 0.5 D 2.1 A 3.6 B 2.3 G 3.5 I 9.7 check!
  118. 118 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. B C 0.5 D 2.1 A 3.6 B 2.3 G 3.5 I 9.7 Best
  119. 119 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) B C 0.5 D 2.1 A 3.6 B 2.3 G 3.5 I 9.7 Best
  120. 120 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) B C 0.5 D 2.1 B 2.3 Best
  121. 121 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3
  122. 122 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 4th iteration
  123. 123 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). C Best Best
  124. 124 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. C Best Best check!
  125. 125 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. C Best Best check!
  126. 126 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. C Best check! Already visited Already visited Already visited Already visited
  127. 127 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) C Best
  128. 128 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3
  129. 129 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 5th iteration
  130. 130 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). D Best Best
  131. 131 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. D Best Best check!
  132. 132 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. D Best Best check!
  133. 133 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best check! Already visited Already visited
  134. 134 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best check! Already visited Already visited H 3.9
  135. 135 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. D Best H 3.9
  136. 136 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) D Best H 3.9
  137. 137 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ Pick up the unchecked best candidate ( ). Check it. ➢ Find the connected points. ➢ Record the distances to q. ➢ Maintain the candidates (size=3) D Best
  138. 138 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ All candidates are checked. Finish. ➢ Here, is the closest to the query ( ) C
  139. 139 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 C Final output 1: Candidates ➢ You can pick up the top-k results ➢ All candidates are checked. Finish. ➢ Here, is the closest to the query ( )
  140. 140 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 C Final output 1: Candidates ➢ You can pick up the top-k results ➢ All candidates are checked. Finish. ➢ Here, is the closest to the query ( ) Final output 2: Checked items ➢ i.e., the search path
  141. 141 Search Images are from [Malkov+, Information Systems, 2013] A

    B C D E F G H I J K L N M Candidates (size = 3) Close to the query N C 0.5 D 2.1 B 2.3 ➢ All candidates are checked. Finish. ➢ Here, is the closest to the query ( ) C Final output 1: Candidates ➢ You can pick up the top-k results Final output 2: Checked items ➢ i.e., the search path Final output 3: Visit flag ➢ For each item, visited or not
  142. 142 Observation: runtime ➢ Item comparison takes time; 𝑂 𝐷

    ➢ The overall runtime ~ #item_comparison ∼ length_of_search_path * average_outdegree 𝒒 ∈ ℝ𝐷 𝒙13 ∈ ℝ𝐷 start query start query start query 1st path 2nd path 3rd path 2.1 1.9 outdegree = 1 outdegree = 2 outdegree = 2 #item_comparison = 3 * (1 + 2 + 2)/3 = 5 2.4
  143. 143 Observation: runtime ➢ Item comparison takes time; 𝑂 𝐷

    ➢ The overall runtime ~ #item_comparison ∼ length_of_search_path * average_outdegree 𝒒 ∈ ℝ𝐷 𝒙13 ∈ ℝ𝐷 start query start query start query 1st path 2nd path 3rd path 2.1 1.9 outdegree = 1 outdegree = 2 outdegree = 2 #item_comparison = 3 * (1 + 2 + 2)/3 = 5 2.4 To accelerate the search, (1) How to shorten the search path? ➢ E.g., long edge (shortcut), hierarchical structure (2) How to sparsify the graph? ➢ E.g., deleting redundant edges
  144. 144 A D C B query Observation: candidate size E

    start Candidates (size = 1) C A D C B query E start Candidates (size = 3) C D E size = 1: Greedy search size > 1: Beam search ➢ A larger candidate size gives better but slower results ➢ An online parameter to control the trade-off ➢ Called “ef” in HNSW Fast, but may get stuck in a local minimum Slow, but finds a better solution
  145. 145 Pseudo code ➢ All papers have totally different pseudo

    code ➢ Principles are the same. But small details are different. ➢ Hint: Explicitly state the data structure or not NSG [Cong+, VLDB 19] DiskANN [Subramanya+, NeurIPS 19] Learning to route [Baranchuk+, ICML 19]
  146. 146 Pseudo code ➢ All papers have totally different pseudo

    code ➢ Principles are the same. But small details are different. ➢ Hint: Explicitly state the data structure or not NSG [Cong+, VLDB 19] DiskANN [Subramanya+, NeurIPS 19] Learning to route [Baranchuk+, ICML 19] Sort the array explicitly Candidates are stored in a set Candidates are stored in a heap; automatically sorted Candidates are stored in an array When sorting is needed, it says “closest L points”
  147. 147 Pseudo code ➢ All papers have totally different pseudo

    code ➢ Principles are the same. But small details are different. ➢ Hint: Explicitly state the data structure or not NSG [Cong+, VLDB 19] DiskANN [Subramanya+, NeurIPS 19] Learning to route [Baranchuk+, ICML 19] Just “check” Checked items are stored in a set (“visit” in this code means “check” in our notation)
  148. 148 Pseudo code ➢ All papers have totally different pseudo

    code ➢ Principles are the same. But small details are different. ➢ Hint: Explicitly state the data structure or not NSG [Cong+, VLDB 19] DiskANN [Subramanya+, NeurIPS 19] Learning to route [Baranchuk+, ICML 19] Visited items are simply said to be “visited”, implying an additional hidden data structure (array) Visited items are stored in a set
  149. 149 Pseudo code ➢ All papers have totally different pseudo

    code ➢ Principles are the same. But small details are different. ➢ Hint: Explicitly state the data structure or not NSG [Cong+, VLDB 19] DiskANN [Subramanya+, NeurIPS 19] Learning to route [Baranchuk+, ICML 19] Termination condition??
  150. 150 Pseudo code ➢ All papers have totally different pseudo

    code ➢ Principles are the same. But small details are different. ➢ Hint: Explicitly state the data structure or not NSG [Cong+, VLDB 19] DiskANN [Subramanya+, NeurIPS 19] Learning to route [Baranchuk+, ICML 19] My explanation was based on NSG, but with slight modifications for simplicity: ➢ Candidates are stored in an automatically-sorted array ➢ Termination condition is “all candidates are checked”
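Following that NSG-style description, the whole search loop can be written as a short runnable sketch. This is an illustration, not any paper's exact pseudocode: identifiers such as `graph_search`, `edges`, and `size` are invented here. Setting `size=1` gives greedy search; larger values give beam search (the “ef” parameter of HNSW).

```python
# Sketch of the graph search described above: keep a sorted candidate list
# of fixed size, repeatedly check the best unchecked candidate, and stop
# when every remaining candidate has been checked.

def l2sqr(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def graph_search(vectors, edges, query, entry, size=3):
    """vectors: list of points; edges: adjacency lists; entry: start node id.
    Returns the final candidates as a sorted list of (distance, node_id)."""
    candidates = [(l2sqr(vectors[entry], query), entry)]
    visited = {entry}   # visit flag: distance already recorded
    checked = set()     # checked items, i.e., the search path
    while True:
        # pick up the unchecked best candidate (list is kept sorted)
        best = next(((d, n) for d, n in candidates if n not in checked), None)
        if best is None:
            return candidates          # all candidates checked: finish
        _, node = best
        checked.add(node)
        # find the connected points and record their distances to the query
        for nb in edges[node]:
            if nb not in visited:
                visited.add(nb)
                candidates.append((l2sqr(vectors[nb], query), nb))
        # maintain the candidates: keep only the `size` closest
        candidates = sorted(candidates)[:size]
```

On a small chain graph the search walks node by node toward the query and returns the nearest nodes at the head of the candidate list.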
  151. 151 Pseudo code ➢ All papers have totally different pseudo

    code ➢ Principles are the same. But small details are different. ➢ Hint: Explicitly state the data structure or not NSG [Cong+, VLDB 19] DiskANN [Subramanya+, NeurIPS 19] Learning to route [Baranchuk+, ICML 19] A formal (?) definition would be helpful for everyone
  152. 152 1. History from an applications perspective 2. Importance of

    implementation 3. Basics of modern baseline Outline