Slide 173
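RAG implementation example (indexing: enriching chunks with additional information before making them searchable)
An example in which each chunk is augmented with context-aware information, and the augmented chunk is what gets searched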
{"search_text": "purpose: ~~~~~~~, keywords: ~~~,~~~,~~~,
main_text: # 1. Machine Learning ~~~~~~~~~~~",
"search_vector": []}
{"search_text": "purpose: ~~~~~~~, keywords: ~~~,~~~,~~~,
main_text: ## 1.1 Supervised Learning ~~~~~~~~~~~",
"search_vector": []}
{"search_text": "purpose: ~~~~~~~, keywords: ~~~,~~~,~~~,
main_text: ~~~~~~~~~~~",
"search_vector": []}
{"search_text": "purpose: ~~~~~~~, keywords: ~~~,~~~,~~~,
main_text: ## 1.2 Unsupervised Learning ~~~~~~~~~~~",
"search_vector": []}
{"search_text": "purpose: ~~~~~~~, keywords: ~~~,~~~,~~~,
main_text: # 1. Machine Learning ~~~~~~~~~~~",
"search_vector": []}
{"chunk": "# 1. Machine Learning ~~~~~~~~~~",
"keywords": ["~~~", "~~~", …],
"purpose": "~~~~~~~~~~~~~~",
"questions": ["~~~~~", "~~~~~", …] }
{"chunk": "## 1.1 Supervised Learning ~~~~~~~~",
"keywords": ["~~~", "~~~", …],
"purpose": "~~~~~~~~~~~~~~",
"questions": ["~~~~~", "~~~~~", …] }
{"chunk": "~~~~~~~~~~~~~~~~~~~~~",
"keywords": ["~~~", "~~~", …],
"purpose": "~~~~~~~~~~~~~~",
"questions": ["~~~~~", "~~~~~", …] }
{"chunk": "## 1.2 Unsupervised Learning ~~~~~~~",
"keywords": ["~~~", "~~~", …],
"purpose": "~~~~~~~~~~~~~~",
"questions": ["~~~~~", "~~~~~", …] }
{"chunk": "~~~~~~~~~~~~~~~~~~~~~",
"keywords": ["~~~", "~~~", …],
"purpose": "~~~~~~~~~~~~~~",
"questions": ["~~~~~", "~~~~~", …] }
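The step from an enriched chunk record to its searchable text could look roughly like the sketch below. It follows the "purpose: ..., keywords: ..., main_text: ..." layout shown on the slide. The function name build_search_text and the sample values are hypothetical, and the questions field is left out of the combined string only because the slide's search text does not show it.

```python
def build_search_text(record: dict) -> str:
    """Combine the enrichment fields and the chunk body into one string."""
    # Keywords are joined with commas, matching the layout on the slide.
    keywords = ",".join(record["keywords"])
    return (
        f"purpose: {record['purpose']}, "
        f"keywords: {keywords}, "
        f"main_text: {record['chunk']}"
    )


# Hypothetical sample record; the slide elides the real values with "~~~".
enriched = {
    "chunk": "## 1.1 Supervised Learning ...",
    "keywords": ["labels", "training data"],
    "purpose": "Introduce supervised learning",
    "questions": ["What is supervised learning?"],
}

search_text = build_search_text(enriched)
print(search_text)
```

Keeping the combination in one small function makes it easy to change which fields are searched (for example, adding the generated questions) without touching the rest of the indexing code.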
Embedding
Combine the information into search_text
At registration time, also add metadata such as the URL
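As a minimal sketch of the registration step, the snippet below embeds the combined search_text and stores it together with a URL. Everything here is illustrative: toy_embed is a placeholder that derives numbers from a hash and is not a real embedding model, and the "url" field name and the example URL are assumptions, since the slide only says that metadata such as the URL is added.

```python
import hashlib


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Placeholder embedding (assumption): deterministic, not semantic.

    In practice this would be replaced by a call to an embedding model.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Scale the first `dim` bytes into the range [0.0, 1.0].
    return [b / 255.0 for b in digest[:dim]]


def make_index_entry(search_text: str, url: str) -> dict:
    """Build one index record: searchable text, its vector, and metadata."""
    return {
        "search_text": search_text,
        "search_vector": toy_embed(search_text),
        "url": url,  # hypothetical metadata field name
    }


entry = make_index_entry(
    "purpose: ..., keywords: ..., main_text: # 1. Machine Learning ...",
    "https://example.com/docs/machine-learning",  # hypothetical URL
)
print(entry["url"], len(entry["search_vector"]))
```

Storing the metadata next to the vector means a search hit can be traced back to its source without a second lookup.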