Slide 1

Slide 1 text

Chat on GPT – 18 April 2023 Rishabh Anand @rishabh16_ Human in the Loop

Slide 2

Slide 2 text

The GPT Series
GPT → Generative Pretrained Transformer
● 2018: GPT, “Improving Language Understanding by Generative Pre-Training”
● 2019: GPT-2, “Language Models are Unsupervised Multitask Learners”
● 2020: GPT-3, “Language Models are Few-Shot Learners”
● 2022: ChatGPT (GPT-3.5), “Training language models to follow instructions with human feedback” *
● 2023: GPT-4 (technical report)
* GPT-3.5 is built on top of InstructGPT with a different data collection setup
Rapid growth …

Slide 3

Slide 3 text

Reinforcement Learning from Human Feedback

Slide 4

Slide 4 text

(Large) Language Models
● Language Models (like GPT-X)
  ○ are chaotic
  ○ model a “giant mass of people” ~ Minqi Jiang, Meta AI
● For different prompts, you can get wildly different outputs
● We must “snip out” the ugly, less-preferred parts
[Figure labels: “stuff that’s learned” and “stuff we care about”]

Slide 5

Slide 5 text

RL from Human Feedback
● Provides a friendlier interface for interacting with LMs
● Biases the underlying model to generate human-aligned content
● Improves reliability, honesty, and safety of LLMs
“How do we get LLMs to sound more human?”

Slide 6

Slide 6 text

RL from Human Feedback

Slide 7

Slide 7 text

RL from Human Feedback
1. Pretrain an LLM on a body of text [GPT-X, for instance]

Slide 8

Slide 8 text

RL from Human Feedback
1. Pretrain an LLM on a body of text [GPT-X, for instance]
2. Train a Reward Model (RM) → “how would a human feel?”

Slide 9

Slide 9 text

RL from Human Feedback
1. Pretrain an LLM on a body of text [GPT-X, for instance]
2. Train a Reward Model (RM) → “how would a human feel?”
3. Fine-tune using RL [the LLM agent predicts words and is scored]
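To make step 2 more concrete, here is a minimal sketch of training a reward model from human preference pairs. It assumes a pairwise comparison loss, -log sigmoid(r_chosen - r_rejected), which is the form used in the InstructGPT paper; the slides do not specify the loss. The linear model, the two hand-made features, and all function names are hypothetical stand-ins for a real neural reward model.

```python
import math

def reward(weights, features):
    """Score a response; a linear stand-in for a neural reward model."""
    return sum(w * f for w, f in zip(weights, features))

def pairwise_loss(weights, chosen, rejected):
    """-log(sigmoid(r_chosen - r_rejected)), written as log(1 + exp(-margin))."""
    margin = reward(weights, chosen) - reward(weights, rejected)
    return math.log1p(math.exp(-margin))

def train_reward_model(pairs, dim, lr=0.5, epochs=200):
    """Plain gradient descent on the pairwise loss, one pair at a time."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # Gradient of the loss w.r.t. weights is
            # -(1 - sigmoid(margin)) * (chosen - rejected).
            coeff = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                weights[i] += lr * coeff * (chosen[i] - rejected[i])
    return weights

# Hypothetical features per response: [helpfulness, rudeness].
# Each pair is (response the human preferred, response the human rejected).
pairs = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.0], [0.6, 0.9]),
    ([0.8, 0.2], [0.1, 0.3]),
]

weights = train_reward_model(pairs, dim=2)
for chosen, rejected in pairs:
    print(reward(weights, chosen) > reward(weights, rejected))
```

In step 3, a score like `reward(...)` would be the signal the RL algorithm tries to maximise while the LLM generates text; that part is not shown here.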

Slide 10

Slide 10 text

LLMs + RLHF [source]

Slide 11

Slide 11 text

ChatGPT for Students

Slide 12

Slide 12 text

Ask Away!
● Treat ChatGPT as you would a friend
● Want something? Just ask for it!
● The art of “Prompt Engineering” with ChatGPT
Use ChatGPT as a personal tutor!
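One way to picture “prompt engineering” for the personal-tutor use case is a small template that states a role, a task, and an output format. The sketch below only builds the prompt text to paste into ChatGPT; the function name, parameters, and wording are illustrative assumptions, not a recipe from the talk.

```python
def build_tutor_prompt(topic: str, level: str,
                       fmt: str = "a short explanation followed by one worked example") -> str:
    """Assemble a tutoring prompt from a role, a task, and an output format."""
    parts = [
        "You are a patient personal tutor.",           # role
        f"Explain {topic} to a {level} student.",      # task
        f"Format your answer as {fmt}.",               # output format
        "End with one question that checks my understanding.",
    ]
    return "\n".join(parts)

prompt = build_tutor_prompt(
    "reinforcement learning from human feedback",
    "first-year undergraduate",
)
print(prompt)
```

Changing `level` or `fmt` and asking again is a cheap way to see how much the phrasing of a request shapes the answer.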

Slide 13

Slide 13 text

The Possibilities
● Digestible explanations
● Summarising long-form content
● Peer review + feedback

Slide 14

Slide 14 text

Generate Digestible Explanations

Slide 15

Slide 15 text

Summarising Content
Given some long-form content that contains a lot to go through …

Slide 16

Slide 16 text

Summarising Content

Slide 17

Slide 17 text

Peer Review + Feedback

Slide 18

Slide 18 text

Peer Review + Feedback

Slide 19

Slide 19 text

Peer Review + Feedback

Slide 20

Slide 20 text

ChatGPT for Students
● LLM technology will only get better from here on
● Students should learn how to operate these tools
● While LLMs can improve productivity, they are not the be-all and end-all
AI tools lower the activation energy to get started!

Slide 21

Slide 21 text

But … shortcomings? Stay for our panels!