Building and Extending Large Language Models

Slide 1

Slide 1 text

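Masafumi Oyamada (小山田昌史), Feb 22, 2024
Building and Extending Large Language Models: LLM, RAG, Agents
Note: The views expressed in this material are the author's own and do not represent those of the author's organization.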

Slide 25

Slide 25 text

Post-training is the design of a domain DSL and its implementation (through data) (2/2)

(2) Post-Training
- Define the model's "functions" (requirements definition)
- Prepare instruction data that expresses those functions, and train on it
- Collect feedback on the trained model
- Incorporate the feedback (alignment)

Example: WebGPT (OpenAI) turns browser operations into commands, which enables autonomous RAG.
- The string output by the LLM is parsed into a sequence of commands, which the browser executes in order.
- The Web Browsing feature of ChatGPT uses a similar DSL [Schulman23].

Figure: The DSL of WebGPT [Nakano+23] (browser operation commands)

Figure: Actual output from the WebGPT model. Given the (nonsensical) question "How can I train the crows in my neighborhood to bring me gifts?", the model searches the web for "how to train crows to bring you gifts" and works to quote the information it needs for the answer from the results.

    Question
    How can I train the crows in my neighborhood to bring me gifts?

    Quotes
    From Gifts From Crows | Outside My Window (www.birdsoutsidemywindow.org)
    > Many animals give gifts to members of their own species but crows and other corvids are the only ones known to give gifts to humans.

    Past actions
    Search how to train crows to bring you gifts
    Click Gifts From Crows | Outside My Window www.birdsoutsidemywindow.org
    Quote
    Back

    Title
    Search results for: how to train crows to bring you gifts

    Scrollbar: 0 - 11

    Text
    [0] How to Make Friends With Crows - PetHelpful (pethelpful.com)
    If you did this a few times, your crows would learn your new place, but as I said, I'm not sure if they will follow or visit you there since it's probably not in their territory. The other option is simply to make new crow friends with the crows that live in your new neighborhood.

References
[Nakano+23] WebGPT: Browser-assisted question-answering with human feedback
[Schulman23] Reinforcement Learning from Human Feedback: Progress and Challenges
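The slide notes that the string output by the LLM is parsed into a sequence of commands that the browser then executes in order. The following is a minimal sketch of that parse-then-execute loop, not WebGPT's actual implementation. The four command names (Search, Click, Quote, Back) come from the "Past actions" shown on the slide; the line-based grammar, the `Command` type, `parse_commands`, `run`, and the `FakeBrowser` class are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List

# Command names as they appear in the slide's "Past actions" list.
# Assumption: one command per line, written as "<Name> <argument>".
KNOWN_COMMANDS = {"Search", "Click", "Quote", "Back"}


@dataclass(frozen=True)
class Command:
    name: str
    arg: str = ""


def parse_commands(llm_output: str) -> List[Command]:
    """Parse raw LLM text into commands, one per non-empty line."""
    commands = []
    for line in llm_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first word is the command name; the rest is its argument.
        name, _, arg = line.partition(" ")
        if name not in KNOWN_COMMANDS:
            raise ValueError(f"unknown command: {line!r}")
        commands.append(Command(name, arg.strip()))
    return commands


class FakeBrowser:
    """Hypothetical stand-in for a real browser; it only records each action."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def execute(self, cmd: Command) -> None:
        self.log.append(f"{cmd.name.lower()}:{cmd.arg}")


def run(commands: List[Command], browser: FakeBrowser) -> None:
    """Execute the parsed commands in order."""
    for cmd in commands:
        browser.execute(cmd)


if __name__ == "__main__":
    output = (
        "Search how to train crows to bring you gifts\n"
        "Click Gifts From Crows | Outside My Window\n"
        "Quote\n"
        "Back\n"
    )
    browser = FakeBrowser()
    run(parse_commands(output), browser)
    for entry in browser.log:
        print(entry)
```

Rejecting unknown command names at parse time is one possible design choice: it keeps malformed model output from reaching the browser, at the cost of aborting the whole sequence on the first bad line.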
