202309 Kaggle silver medal: LLM Science Exam summary

Slide 4

Slide 4 text

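Platform Technology Division Copyright 2020 Sony Semiconductor Solutions Corporation DATE 4/xx

Why did Kaggle hold this competition?

To help researchers better understand and analyze (1) the ability of LLMs to test LLMs themselves, and (2) the potential of LLMs in resource-constrained environments.

★ Background: the current landscape
As the capabilities of large language models expand, a growing body of research aims to characterize LLMs themselves. Because many existing NLP benchmarks have been shown to be easy for state-of-the-art models, there is also interesting work that uses LLMs to create harder tasks for testing even more powerful models. At the same time, methods such as quantization and knowledge distillation are being used to shrink language models effectively and run them on more modest hardware. The Kaggle environment is a good place to study this from a distinctive angle, because submissions must respect both GPU and time limits.

★ Motivation for running this on Kaggle
The dataset for this challenge was generated by giving gpt3.5 snippets of text on a range of scientific topics taken from Wikipedia and asking it to write multiple-choice questions (with known answers). Easy questions were then filtered out. The largest models currently run on Kaggle are estimated to have around 10 billion parameters, whereas gpt3.5 has 175 billion parameters. If a question-answering model can ace a test written by a question-writing model more than 10 times its size, that would be a genuinely interesting result. Conversely, if a larger model can effectively stump a smaller one, that has compelling implications for the ability of LLMs to benchmark and test themselves.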

Slide 33

Slide 33 text

Supplement: dataset generation with gpt-3.5

• Spent $70 to create a 70k-row dataset for improving document retrieval accuracy
  – Kaggle - LLM Science Exam | Kaggle
  – I could not make good use of it myself, so I published it on Kaggle
• The 3rd-place finisher put it to good use. #happy
  – The prompt is shown below
• The prompt generates the QA for a document together with the sentence used to create the QA

# delimiter is defined elsewhere; its value is not shown on this slide
system_message = f"""
You will be provided with TEXT from wikipedia. \
The TEXT will be delimited with {delimiter} characters.
Output a python list of 3 dict objects, where each object is \
a multiple choice question whose answers should be in \
the given TEXT and that has 5 choices each.
Each object should have the following format:
    'question':
    'option_1':
    'option_2':
    'option_3':
    'option_4':
    'option_5':
    'answer':
    'reference_sentence':
You should tell me which one of your proposed options is right \
by assigning the corresponding option's key label in the 'answer' field.
Also, provide the original sentence \
from the TEXT that supports the answer in the 'reference_sentence' field.
The question, the answer, and question answer options should be broad, \
challenging, long, detailed, and based on the TEXT provided.
Additionally, ensure the token distribution of question follows these statistics:
- Mean: 14.22 tokens
- Std Deviation: 7.223939 tokens
- Min: 4 token
- 25th Percentile: 9 tokens
- Median: 13 tokens
- 75th Percentile: 17.25 tokens
- Max: 49 tokens
Additionally, ensure the token distribution of each answer follows these statistics:
- Mean: 30.840 tokens
- Std Deviation: 19.883692 tokens
- Min: 1 token
- 25th Percentile: 16 tokens
- Median: 27.5 tokens
- 75th Percentile: 43.25 tokens
- Max: 100 tokens
Only output the list of objects, with nothing else.
"""
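The prompt above asks the model to reply with a Python list of dicts, so each reply has to be parsed and checked before it can become dataset rows. The slide does not show how this was done; the following is only a minimal sketch of one way to do it. The function name `parse_generated_questions`, the `reference_found` flag, and the choice to silently drop malformed items are all assumptions made for illustration, not the author's pipeline.

```python
import ast

# Keys the prompt asks for in every generated object.
OPTION_KEYS = {f"option_{i}" for i in range(1, 6)}
REQUIRED_KEYS = OPTION_KEYS | {"question", "answer", "reference_sentence"}


def parse_generated_questions(raw_output, source_text):
    """Parse a model reply and keep only well-formed question dicts.

    Hypothetical helper: returns a list of dicts, each with an extra
    'reference_found' flag telling whether the quoted sentence really
    appears verbatim in the source text.
    """
    try:
        # literal_eval only accepts Python literals, so it is safer than eval().
        items = ast.literal_eval(raw_output.strip())
    except (ValueError, SyntaxError, TypeError):
        return []  # the reply was not a valid Python literal
    if not isinstance(items, list):
        return []

    valid = []
    for item in items:
        if not isinstance(item, dict) or set(item) != REQUIRED_KEYS:
            continue  # missing or unexpected keys
        answer = item["answer"]
        if not isinstance(answer, str) or answer not in OPTION_KEYS:
            continue  # 'answer' must be an option key label such as 'option_2'
        checked = dict(item)
        reference = str(item["reference_sentence"]).strip()
        checked["reference_found"] = bool(reference) and reference in source_text
        valid.append(checked)
    return valid
```

Keeping `reference_found` as a flag rather than filtering on it is a deliberate choice in this sketch: models often paraphrase the supporting sentence slightly, so an exact-match failure may not mean the question is unusable.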
