
Principles of Large Language Models and Their Use (大規模言語モデルの原理と使いこなしの原則)


Slides used in Session 4 of the Spring 2024 on-demand materials for "Prompt Engineering: Applications of Generative AI" at the Waseda University Graduate School of Business and Finance.

Kenji Saito

April 27, 2024
Transcript

  1. [Cover] Image generated by Stable Diffusion XL v1.0 — Generative AI, April 2024

     Waseda Business School (WBS), April 2024 — 2024-04 – p.1/17
  2. [Course schedule] Sessions 1–14, including: Discord setup, an RPG exercise,

     the game "September 12th", the Linux command line (with Windows and Mac
     options), Open Interpreter, and AGI (Artificial General Intelligence)
  3. A language model assigns a probability P(w_1, ..., w_m) to a sequence of m words

     Trained on large text corpora such as Wikipedia; Generative Pre-training
     learns to predict the next word from the preceding context.
     Reference: Stephen Wolfram, "What Is ChatGPT Doing … and Why Does It Work?"
     https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
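The probability P(w_1, ..., w_m) factorizes by the chain rule into a product of next-word probabilities. A toy bigram model makes this concrete — a minimal sketch for illustration only (the corpus, marker tokens, and function names are invented for the example, not from the lecture):

```python
from collections import defaultdict

def train_bigram(corpus):
    # Count bigram transitions over sentences padded with start/end markers,
    # then normalize counts into conditional probabilities P(w_i | w_{i-1}).
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return {prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
            for prev, nxt in counts.items()}

def sequence_probability(model, sentence):
    # Chain rule under the bigram assumption:
    # P(w_1, ..., w_m) = prod_i P(w_i | w_{i-1})
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        p *= model.get(prev, {}).get(cur, 0.0)
    return p
```

A large language model replaces the bigram table with a neural network conditioned on the whole preceding context, but the factorization into next-word probabilities is the same.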
  4. ChatGPT and GPT: ChatGPT is OpenAI's chat service built on its GPT models

     (GPT-3.5, GPT-4). GPT stands for Generative Pre-trained Transformer.
     GPT-3.5 and GPT-4 are further aligned with RLHF (Reinforcement Learning
     from Human Feedback).
  5. Attention: each token attends to (weighs its relationships with) the other

     tokens in the context; GPT stacks such attention layers in a Transformer
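A minimal sketch of scaled dot-product attention, the core of the mechanism named on this slide — single head, no masking, NumPy only; the shapes and names here are illustrative, not the lecture's own code:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                            # weighted mix of value vectors
```

Each output row is a convex combination of the value vectors, with weights determined by how strongly that query matches each key; GPT applies this (with causal masking and many heads) at every layer.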
  6. The GPT-series papers

     GPT: Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018.
     "Improving Language Understanding by Generative Pre-Training". Available at:
     https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf.
     GPT-2: Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and
     Ilya Sutskever. 2019. "Language Models are Unsupervised Multitask Learners".
     Available at: https://paperswithcode.com/paper/language-models-are-unsupervised-multitask.
     GPT-3: Tom B. Brown et al. 2020. "Language Models are Few-Shot Learners".
     Available at: https://doi.org/10.48550/arXiv.2005.14165.
     GPT-4: OpenAI. 2023. "GPT-4 Technical Report". Available at:
     https://doi.org/10.48550/arXiv.2303.08774.
  8. GPT-3 (2020), "Language Models are Few-Shot Learners": about 175 billion

     parameters; given only a few in-context examples (few-shot), GPT-3 performs
     new tasks without any parameter updates
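Few-shot prompting as described above can look like the translation demo in the GPT-3 paper; the exact prompt string below is a reconstruction for illustration:

```python
# Few-shot prompt: the task is conveyed purely through in-context examples;
# the model's parameters are never updated.
few_shot_prompt = (
    "Translate English to French:\n\n"
    "sea otter => loutre de mer\n"
    "cheese =>"
)
```

The model is expected to continue the pattern and emit the French translation after the final `=>`, inferring the task from the examples alone.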
  9. GPT-4 (2023), "GPT-4 Technical Report": scores around the top 10% of test

     takers on a simulated bar exam; a Transformer-based model whose size and
     architectural details are not disclosed in the report
  10. Generative Pre-Training / Language Models (GPT, GPT-2, GPT-3): parameter

     counts across the papers — "Improving Language Understanding" (GPT: about
     117 million parameters), "Unsupervised Multitask Learners" (GPT-2: about
     1.5 billion), "Few-Shot Learners" (GPT-3: about 175 billion) → roughly a
     1,000× increase in scale across the GPT series
  11. [Exercise with ChatGPT / GPT-4: producing a BibTeX entry from an ACM

     paper page (HTML abstract); generative AI]
  12. [Summary: three principles for using GPT, items 1–3]