What on earth are ESRE, ELSER, and RRF!? Let's understand them all and clear things up

ESRE, ELSER, RRF??
Koji Kawamura (@ijokarumawak), Education Architect, Elastic
June 28, 2023

Elastic and GAI

What is Generative AI (GAI)?
(Image: Lots of articles, books, audio and videos are transforming into a human shaped cloud floating in the sky)

Are you using Generative AI?

Weaknesses of LLMs

Add context (copy and paste the 8.8 release blog) (rest omitted)

Japanese summary of the 8.8 Security release

Elastic + Generative AI increases relevance and scalability at a lower cost

Original question: "What are the key aspects of the company's 401k policy for an employee in my location and how do I enroll?"

- The question is sent to Elasticsearch as a search query over domain specific, private content
- The most relevant data to the query goes into the context window together with the original question
- ✓ GAI response based on the most relevant data
- ✓ Additional user personalization
- ✓ Reduces required compute resources
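The flow in this diagram (question as search query, most relevant data into the context window) can be sketched in a few lines of Python. This is only an illustration: the `retrieve` function is a toy word-overlap ranker standing in for Elasticsearch, and the documents, prompt wording, and function names are all hypothetical, not taken from the slides.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, documents, top_k=2):
    """Toy retriever (a stand-in for Elasticsearch): rank by shared-token count."""
    question_tokens = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, context_docs):
    """Put the retrieved documents and the original question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

# Hypothetical private documents, invented for this sketch.
documents = [
    "401k enrollment opens every January through the HR portal.",
    "The cafeteria menu changes weekly.",
    "Employees in Japan follow the local 401k matching policy.",
]
question = "How do I enroll in the 401k policy?"

top_docs = retrieve(question, documents, top_k=2)
prompt = build_prompt(question, top_docs)
print(prompt)
```

Only the few retrieved documents reach the LLM, which is where the "reduces required compute resources" point on the slide comes from.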

ESRE …!?
- Elasticsearch Relevance Engine
- Sounds like "ez-ray" (えずれぃ)
- A brand name that collectively refers to Elastic's search toolset

Elasticsearch Relevance Engine™
Gives developers a powerful set of machine learning tools to build AI-powered search applications that integrate with large language models & Generative AI. Reflects two years of R&D.
- Vector database
- Ability to host your own transformer model
- Ability to integrate with 3rd party transformer models (OpenAI)
- RRF - hybrid scoring model (vector & textual search)
- Elastic's proprietary ML model
- Integration with 3rd party tooling like LangChain

ELSER …!?
- Elastic Learned Sparse EncodER
- Pronounced "el-ser" (えるさー)
- A pre-trained language model, developed by Elastic, that converts text into sparse vectors

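ELSER - what's good about it?
- It is built into Elastic, so there is no need to import a model separately
- A general-purpose model that is effective zero-shot, without fine-tuning
- Text expansion:
  - Addresses the vocabulary mismatch problem
  - Searches related terms, near-synonyms, and synonyms together, which reduces missed results
  - Can reduce the burden of managing synonyms by hand (domain-specific synonyms still need individual handling)
- Difference from dense models:
  - A sparse model discards rare, low-frequency tokens, stop words that carry little meaning, and tokens that depend heavily on context (e.g. bank). Unlike a dense model, which keeps the relations between all words, it keeps only the relations between highly relevant tokens, in a sparse matrix
- Weaknesses of dense models:
  - Covered in the blog series introduced on the following slides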
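As a rough illustration of what using text expansion looks like from the client side, the sketch below builds a search request body for the `text_expansion` query as it existed in the 8.8 release. The helper function name and the field name `ml.tokens` are assumptions for this example; the field must be whatever field holds the ELSER-generated tokens in your own index, and the body would be sent to the `_search` endpoint of that index.

```python
import json

def elser_text_expansion_query(field, model_text, model_id=".elser_model_1"):
    """Build a search body for the 8.8-era text_expansion query.

    `field` is the field that stores the ELSER tokens (an assumption here);
    `model_text` is the raw query text that ELSER expands at search time.
    """
    return {
        "query": {
            "text_expansion": {
                field: {
                    "model_id": model_id,
                    "model_text": model_text,
                }
            }
        }
    }

# "ml.tokens" is a hypothetical field name used only for illustration.
body = elser_text_expansion_query("ml.tokens", "the latest pokemon game")
print(json.dumps(body, indent=2))
```

No synonym list appears anywhere in the request: the expansion of the query text into weighted related tokens happens inside the model.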

Dense? Sparse?
Check out the other posts in this Information Retrieval series:
- Part 1: Steps to improve search relevance
  - Explains dense models: pre-training, task-specific training, and domain-specific training (fine-tuning)
  - MSMARCO (Bing queries and results) is often used for Q&A task-specific training
- Part 2: Benchmarking passage retrieval
  - How to compare the IR performance of BM25 and dense models; a comparison on data other than MSMARCO is needed
  - BEIR paper: zero-shot IR performance comparison across 18 datasets
  - Dense models show worse IR performance than BM25 on datasets from a different domain
  - Combining BM25 and kNN zero-shot tends to lower relevance
  - Fine-tuning is needed, but it is costly for ordinary users
- Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model
  - Why ELSER; explained on the following slides

BM25 vs Dense models
Excerpt from Part 2: Benchmarking passage retrieval
- Dense models score lower than BM25 on every dataset other than MSMARCO, the dataset they were trained on

BM25 vs ELSER
Excerpt from Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model

Sparse vs Dense?

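Dense Vector Search (Semantic)
- An embedding model turns each text into a dense vector: Text 1 → [1.23, 0.21, …], Text 2 → [3.11, 5.04, …]
- The query is embedded the same way: Query → [1.11, 2.16, …]
- T1, Q, and T2 are points in the dense vector space
- Text is vectorized based on its meaning, and documents close to the query are retrieved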
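"Retrieve the nearby documents" can be sketched as a brute-force nearest-neighbor search. The example below assumes cosine similarity as the distance measure and uses made-up 3-dimensional vectors (the slide shows only truncated vectors, so these are not the slide's values); real embeddings have hundreds of dimensions and real engines use approximate search rather than scanning every document.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn(query_vec, doc_vecs, k=1):
    """Brute-force kNN: score every document, return the k most similar."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings, invented for this sketch.
doc_vecs = {
    "T1": [0.9, 0.1, 0.0],
    "T2": [0.1, 0.8, 0.3],
}
query_vec = [0.2, 0.9, 0.2]

nearest = knn(query_vec, doc_vecs, k=2)
for doc_id, score in nearest:
    print(doc_id, round(score, 3))
```

With these toy values T2 points in nearly the same direction as the query, so it ranks first even though no words are compared at all.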

Sparse Vector Search (Lexical)
Doc 1: The newest chapter in the Pokémon series
Doc 2: my favorite pokemon
Query: the latest pokemon game

The analyzer tokenizes each document and the tokens go into the inverted index:

Token     Docs
chapter   1
favorit   2
my        2
newest    1
pokemon   2
pokémon   1
seri      1

Query tokens: latest, pokemon, game
- Search with the inverted index
- Score with BM25
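The inverted-index lookup on this slide can be reproduced with a small sketch. The analyzer below is a deliberately simplified assumption (lowercasing, word splitting, a two-word stop list, no stemming), so it yields `favorite` and `series` where the slide shows the stemmed `favorit` and `seri`. BM25 scoring is left out; only matching is shown.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "in"}  # hypothetical minimal stop list for this sketch

def analyze(text):
    """Simplified analyzer: lowercase, split into words, drop stop words."""
    return [t for t in re.findall(r"\w+", text.lower()) if t not in STOP_WORDS]

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents sharing at least one token with the query."""
    hits = set()
    for token in analyze(query):
        hits |= index.get(token, set())
    return hits

docs = {
    1: "The newest chapter in the Pok\u00e9mon series",  # \u00e9 is the accented e
    2: "my favorite pokemon",
}
index = build_inverted_index(docs)
matches = search(index, "the latest pokemon game")
print(sorted(matches))
```

Doc 1 is about the newest Pokémon release, yet it is missed: `pokémon` and `pokemon` are different tokens, and `newest` never matches `latest`. This is the vocabulary mismatch problem in miniature.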

Sparse Vector Search (ELSER)
Doc 1: The newest chapter in the Pokémon series
Doc 2: my favorite pokemon
Query: the latest pokemon game

ELSER generates weighted tokens for each document and they go into the inverted index:

Token     Docs (weight)
chapter   1 (2.11)
pokemon   1 (2.33), 2 (2.42)
…         etc.

* Only some of the generated tokens are shown

Search Result
Doc   Score
1     16.2088
2     10.0872

Sparse Vector Search (ELSER scoring)
Doc 1: The newest chapter in the Pokémon series
Query: the latest pokemon game

token     query w   doc w    qw * dw
pokemon   2.6080    2.3329   6.0842
latest    1.9030    1.1554   2.1987
anime     1.0525    1.1673   1.2285
release   0.9056    0.9164   0.8298
…         …

The document score is sum() over the qw * dw column.

Search Result
Doc   Score
1     16.2088
2     10.0872
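The scoring table amounts to a dot product over the tokens that the query and the document share. The sketch below recomputes it from the four rows visible on the slide. Because the remaining rows are elided, the result is only a partial sum and will not reach Doc 1's full score of 16.2088; the individual products agree with the slide's `qw * dw` column only up to rounding of the displayed weights.

```python
def sparse_dot(query_weights, doc_weights):
    """Sum of query_weight * doc_weight over tokens present on both sides."""
    return sum(
        weight * doc_weights[token]
        for token, weight in query_weights.items()
        if token in doc_weights
    )

# Only the four token rows shown on the slide; the elided rows are not guessed.
query_w = {"pokemon": 2.6080, "latest": 1.9030, "anime": 1.0525, "release": 0.9056}
doc1_w = {"pokemon": 2.3329, "latest": 1.1554, "anime": 1.1673, "release": 0.9164}

partial_score = sparse_dot(query_w, doc1_w)
print(f"partial score from the visible rows: {partial_score:.4f}")
```

Note that `latest`, `anime`, and `release` do not appear in the text of Doc 1 at all. They contribute to the score only because the model expanded the document with them.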

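Sparse Vector Search with ELSER (TODO)
- The number of distinct tokens that can be generated equals the model's vocabulary size
- Lucene's posting lists hold the TF or rank_features data
- The score is the sum, over matched terms, of the product of the term(i) score in the query and the matched rank term(i) score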

Based on SPLADE, developed by NAVER LABS
https://europe.naverlabs.com/blog/splade-a-sparse-bi-encoder-bert-based-model-achieves-effective-and-efficient-first-stage-ranking/

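Key points of blog Part 3 (1)
- Sparse vectors are already built into Elasticsearch (Lucene) and are mature
- Better storage and memory efficiency than dense vectors
- Matching parts of the search results can be highlighted, which makes results easy to understand
- How many of the related terms extracted from the language model are adopted can be tuned, trading off search accuracy against performance
- Uses a model made lighter through distillation, whose effectiveness was demonstrated in SPLADE v2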

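Key points of blog Part 3 (2)
- So that the model learns the relevance ranking between queries and documents, distillation uses triples of a query, a relevant document, and a non-relevant document, and preserves the score difference between the relevant and the non-relevant document
- MiniLM L-6 and monot5 3b used as teachers?
- FLOPS regularizer? I do not fully understand this part...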
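One common way to express "preserve the score difference between the relevant and the non-relevant document" is a margin-MSE loss, sketched below. Whether ELSER is trained with exactly this loss is not stated on the slide, so treat it as an assumption; the function name and the example scores are hypothetical.

```python
def margin_mse(student_scores, teacher_scores):
    """Mean squared error between student and teacher score margins.

    Each element is a (score_relevant, score_non_relevant) pair for one
    (query, relevant doc, non-relevant doc) triple.
    """
    if len(student_scores) != len(teacher_scores) or not student_scores:
        raise ValueError("need the same non-zero number of triples on both sides")
    total = 0.0
    for (s_pos, s_neg), (t_pos, t_neg) in zip(student_scores, teacher_scores):
        student_margin = s_pos - s_neg
        teacher_margin = t_pos - t_neg
        total += (student_margin - teacher_margin) ** 2
    return total / len(student_scores)

# Hypothetical scores: the student matches the teacher's margin of 2.0 exactly,
# even though the absolute scores differ, so the loss is zero.
loss_same_margin = margin_mse([(5.0, 3.0)], [(9.0, 7.0)])
# Here the teacher's margin is 3.0 and the student's is 2.0.
loss_off_by_one = margin_mse([(5.0, 3.0)], [(9.0, 6.0)])
print(loss_same_margin, loss_off_by_one)
```

The point of comparing margins rather than raw scores is that the student only has to reproduce how much better the relevant document is, not the teacher's score scale.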

Sparse? Dense? Do we have to pick one?

RRF …!?
- Reciprocal Rank Fusion
- Pronounced "R-R-F" (あーるあーるえふ)
- A mechanism that merges multiple result lists based on each document's position within those lists

Reciprocal Rank Fusion (RRF)

Ranking Algorithm 1
Doc   Score   r(d)
A     1       1
B     0.7     2
C     0.5     3
D     0.2     4
E     0.01    5

Ranking Algorithm 2
Doc   Score   r(d)
C     1,341   1
A     739     2
F     732     3
G     192     4
H     183     5

Fused ranking (Doc): A, C, B, F, D

D - set of docs
R - set of rankings as permutation on 1..|D|
K - typically set to 60 by default

RRF pseudocode
https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html
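The pseudocode itself did not survive in this transcript, so here is a minimal Python sketch of the standard RRF formula, score(d) = sum over rankings of 1 / (k + r(d)), applied to the two example rankings from the previous slide. The function and variable names are illustrative, and ties are broken by first appearance, which is a choice made for this sketch.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Fuse ranked lists: each doc gets sum(1 / (k + rank)) over the lists it is in."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # sorted() is stable, so docs with equal scores keep first-seen order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Document order from the slide's two example rankings (rank 1 first).
ranking_1 = ["A", "B", "C", "D", "E"]
ranking_2 = ["C", "A", "F", "G", "H"]

fused = rrf([ranking_1, ranking_2], k=60)
top5 = [doc for doc, _ in fused[:5]]
print(top5)
```

Only ranks enter the formula. The raw scores of the two algorithms (0.01 to 1 versus 183 to 1,341) live on completely different scales, and RRF never has to normalize them. D and G tie at 1/64 each, so which of them takes fifth place depends on the tie-break.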

Summary

Elastic + Generative AI increases relevance and scalability at a lower cost

Original question: "What are the key aspects of the company's 401k policy for an employee in my location and how do I enroll?"

- The question is sent to Elasticsearch as a search query over domain specific, private content
- The most relevant data to the query goes into the context window together with the original question
- ✓ GAI response based on the most relevant data
- ✓ Additional user personalization
- ✓ Reduces required compute resources

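Summary
- Attach highly relevant information to the GAI prompt (grounding)
- Retrieve a handful of highly relevant items quickly (IR)
- Sparse vector vs Dense vector
- ESRE is a brand name that collectively refers to Elastic's search toolset
- ELSER is a pre-trained language model that generates sparse vectors for text expansion
  - Usable zero-shot as a general-purpose model
- RRF is a mechanism that merges multiple result lists based on each document's position within those lists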

Bonus: How about GPT4All?

Bonus: How about GPT4All?
Model / Me
(Image: An irritated male adult leaning forward and waiting for an answer to his question from a young child who is speaking very slowly. In digital art style.)

Tomorrow: a webinar on Japanese vector search!
https://www.elastic.co/jp/virtual-events/introduction-to-nlp-models-and-vector-search
