try! Swift Tokyo 2024 - Duolingo Max Roleplay

Slide 1

Slide 1 text

Transforming Language Learning with Generative AI
A Deep Dive into Duolingo’s AI Tutor Feature - Roleplay
Xingyu Wang
try! Swift Tokyo
March 22, 2024

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

88 million MAUs (Monthly Active Users)

Slide 5

Slide 5 text

About me
● Full-stack (iOS) engineer at Duolingo since August 2021. First job out of school!
● I enjoy meditation, going to the gym, and reading
● Learning Japanese and French
● Speak Chinese and English
● First time speaking at a conference!

Slide 6

Slide 6 text

https://openai.com/customer-stories/duolingo

Slide 7

Slide 7 text

Conversation
Helpful Phrases
Speech recognition
Feedback

Slide 8

Slide 8 text

Duolingo Max - Roleplay

Slide 9

Slide 9 text

Architecture

Slide 10

Slide 10 text

[Diagram] iOS | Roleplay backend | AI Features backend | OpenAI Completions
Roleplay State = chat history
Model parameters; 100+ prompts; requests teaching content from other services

Slide 11

Slide 11 text

Stateless protocol
[Diagram] iOS | Roleplay backend | Roleplay State
Chat history: Character: … User: … Character: … User: …
Chat history: Character: … User: … Character: … User: … Character: …

Slide 12

Slide 12 text

Stateless protocol: Why?
● A stateless protocol keeps the Roleplay backend simple
● We don’t allow users to resume a conversation once they have left it
● The payload object transmitted between iOS and the backend, Roleplay State, has a reasonable size
○ Caveat: media content
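The round trip described above can be sketched in a few lines of Swift. This is a minimal illustration, not Duolingo's actual schema: `RoleplayMessage`, the `chatHistory` field, and `backendRespond(to:)` are hypothetical names, and the "backend" is a local stand-in function. The point is only that the full chat history travels with every request, so the server needs no per-conversation storage.

```swift
import Foundation

/// Hypothetical message type; field names are illustrative only.
struct RoleplayMessage: Codable, Equatable {
    enum Speaker: String, Codable { case character, user }
    let speaker: Speaker
    let text: String
}

/// Hypothetical payload: the whole conversation is carried in the state.
struct RoleplayState: Codable, Equatable {
    var chatHistory: [RoleplayMessage]
}

/// Local stand-in for the backend: a pure function of the incoming state.
/// It stores nothing between calls, which is what makes the protocol stateless.
func backendRespond(to state: RoleplayState) -> RoleplayState {
    var next = state
    next.chatHistory.append(
        RoleplayMessage(speaker: .character, text: "Reply #\(state.chatHistory.count)")
    )
    return next
}

// Client side: append the user's message, send everything, adopt what comes back.
var state = RoleplayState(chatHistory: [
    RoleplayMessage(speaker: .character, text: "What do you want to drink?")
])
state.chatHistory.append(RoleplayMessage(speaker: .user, text: "I'd like a coffee."))

let payload = try JSONEncoder().encode(state)   // what would go over the wire
let received = try JSONDecoder().decode(RoleplayState.self, from: payload)
state = backendRespond(to: received)
```

Because the payload grows with every turn, this design only stays cheap while messages are small, which is presumably why the slide flags media content as a caveat.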

Slide 13

Slide 13 text

Chat interface: MVVM

Slide 14

Slide 14 text

Challenge #1: Building a Chat Interface

Slide 15

Slide 15 text

Chat that can…
➜ Custom message animations
➜ Working with stateless API: Translate Roleplay State updates to chat updates

Slide 16

Slide 16 text

Chat Interface: UI
➜ Chat history: UICollectionView
➜ Cell types: top narration, character message, user message, loading
➜ Animations: fade-in, message insertion
◆ Subclass UICollectionViewFlowLayout
◆ Custom UICollectionViewLayoutAttributes
◆ Override initialLayoutAttributesForAppearingItem
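A minimal sketch of the layout approach listed above, assuming a simple fade-and-slide effect. `ChatFlowLayout` and the 20-point offset are hypothetical, and for brevity this version adjusts the standard `alpha` and `transform` attributes rather than defining a custom `UICollectionViewLayoutAttributes` subclass as the slide mentions.

```swift
import UIKit

/// Hypothetical layout that fades and slides newly inserted chat messages into place.
final class ChatFlowLayout: UICollectionViewFlowLayout {
    /// Index paths inserted during the current batch update.
    private(set) var insertedIndexPaths: Set<IndexPath> = []

    override func prepare(forCollectionViewUpdates updateItems: [UICollectionViewUpdateItem]) {
        super.prepare(forCollectionViewUpdates: updateItems)
        // Remember which items are real insertions in this update.
        for item in updateItems where item.updateAction == .insert {
            if let indexPath = item.indexPathAfterUpdate {
                insertedIndexPaths.insert(indexPath)
            }
        }
    }

    override func initialLayoutAttributesForAppearingItem(
        at itemIndexPath: IndexPath
    ) -> UICollectionViewLayoutAttributes? {
        let base = super.initialLayoutAttributesForAppearingItem(at: itemIndexPath)
        // UIKit also asks about items that merely shift position, so only
        // customize the ones recorded as insertions. Copy before mutating.
        guard insertedIndexPaths.contains(itemIndexPath),
              let attributes = base?.copy() as? UICollectionViewLayoutAttributes
        else { return base }
        attributes.alpha = 0                                             // start invisible
        attributes.transform = CGAffineTransform(translationX: 0, y: 20) // start slightly lower
        return attributes
    }

    override func finalizeCollectionViewUpdates() {
        super.finalizeCollectionViewUpdates()
        insertedIndexPaths.removeAll()
    }
}
```

UIKit animates each inserted cell from these initial attributes to its final ones, so no per-cell animation code is needed in the view controller.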

Slide 17

Slide 17 text

Backend → Roleplay State → UICollectionView updates

Slide 18

Slide 18 text

[Diagram] UICollectionView (in Roleplay View) | Roleplay VM | Roleplay Message Processor | Roleplay Backend

UICollectionView (in Roleplay View):
Character: What do you want to drink?
User: I’d like a coffee.
Character: Do you want to pay in cash or card?
User: I want to pay by cash.
··· (loading)

Roleplay State:
Character: What do you want to drink?
User: I’d like a coffee.
Character: Do you want to pay in cash or card?
User: I want to pay by cash.
Character: Great! Enjoy your coffee!

Roleplay Message Processor:
Delete: loading
Append: new message (Character: Great! Enjoy your coffee!)
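The translation step in this diagram can be sketched as a pure function. This is an illustration under assumptions, not the actual Roleplay Message Processor: `ChatItem`, `ChatUpdate`, and `chatUpdates(displayed:newHistory:)` are hypothetical names, and the sketch assumes the chat history only ever grows at the end.

```swift
/// Hypothetical cell model for the chat list.
enum ChatItem: Equatable {
    case character(String)
    case user(String)
    case loading
}

/// Hypothetical update instruction for the collection view.
enum ChatUpdate: Equatable {
    case delete(index: Int)
    case append(ChatItem)
}

/// Computes the updates that turn what is on screen into the new history.
/// Assumes the history is append-only between two consecutive states.
func chatUpdates(displayed: [ChatItem], newHistory: [ChatItem]) -> [ChatUpdate] {
    var updates: [ChatUpdate] = []
    var shown = displayed

    // Remove the loading placeholder, if one is on screen.
    if let loadingIndex = shown.lastIndex(of: .loading) {
        updates.append(.delete(index: loadingIndex))
        shown.remove(at: loadingIndex)
    }

    // Append every message the screen has not shown yet.
    for item in newHistory.dropFirst(shown.count) {
        updates.append(.append(item))
    }
    return updates
}

// The coffee-shop example from the slide.
let displayed: [ChatItem] = [
    .character("What do you want to drink?"),
    .user("I'd like a coffee."),
    .character("Do you want to pay in cash or card?"),
    .user("I want to pay by cash."),
    .loading,
]
let newHistory: [ChatItem] = [
    .character("What do you want to drink?"),
    .user("I'd like a coffee."),
    .character("Do you want to pay in cash or card?"),
    .user("I want to pay by cash."),
    .character("Great! Enjoy your coffee!"),
]
let updates = chatUpdates(displayed: displayed, newHistory: newHistory)
```

Keeping this logic free of UIKit makes it easy to unit test and leaves the view with nothing to do except apply the resulting deletes and appends.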

Slide 19

Slide 19 text

Takeaway: Separation of Concerns
● Big application: 25K+ lines for Roleplay
● Stateless API puts message handling logic on the client
○ VM + state manager and message processor
● Views/view updates + Complex business logic + Networking
● Keeping separation of concerns in mind will help you iterate faster and reuse components

Slide 20

Slide 20 text

Challenge #2: Latency Optimization on Helpful Phrases

Slide 21

Slide 21 text

Helpful Phrases

Slide 22

Slide 22 text

Generate character’s response
Generate helpful phrases

Slide 23

Slide 23 text

Solution: Prompt and model optimization
➜ Use the right GPT model: GPT-3.5? GPT-4? Fine-tuning GPT-3.5?
➜ Decrease the number of output tokens
➜ Utilize cached input: front-load the repeated part of the prompt
◆ Put conversation history at the end
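A small sketch of the prompt ordering suggested above: the text that repeats across requests goes first and the growing conversation history goes last, so consecutive prompts share the longest possible prefix. `PromptBuilder` and its fields are hypothetical, and the instruction strings are placeholders rather than real Roleplay prompts.

```swift
/// Hypothetical prompt assembler; names and strings are illustrative only.
struct PromptBuilder {
    /// Text that is identical for every request.
    let staticInstructions: String
    /// Text that is identical for every turn of one conversation.
    let scenario: String

    /// Repeated parts first, changing history last.
    func prompt(history: [String]) -> String {
        ([staticInstructions, scenario] + history).joined(separator: "\n")
    }
}

let builder = PromptBuilder(
    staticInstructions: "You are a language tutor playing a character.",
    scenario: "Scenario: ordering at a cafe."
)
let firstTurn = builder.prompt(history: ["Character: What do you want to drink?"])
let secondTurn = builder.prompt(history: [
    "Character: What do you want to drink?",
    "User: I'd like a coffee.",
])
```

With this ordering, each new prompt is the previous one plus a suffix, so any input caching keyed on a shared prefix has the most to reuse.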

Slide 24

Slide 24 text

iOS Solution! Async generation
➜ Send character’s response to the user
➜ Client kicks off fetch-helpful-phrases request
➜ Text-to-speech of the character plays
➜ User thinking how to respond
➜ User taps on input bar (helpful phrases about to surface)
➜ Display phrases to user
◆ Ready to show: “I want to eat”, “the bouillabaisse”, “please”, and “the”
◆ Not ready: cancels request right before displaying. Show default phrases: “I want”, “I have”, “I can”, etc.
Timing labels: 1-2 seconds; a few seconds
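The timeline above could be driven by a small main-actor object like the one below. This is a sketch under assumptions, not the app's implementation: `HelpfulPhrasesPresenter`, its method names, and the injected `fetch` closure are hypothetical. It starts the fetch as soon as the character responds and, when the input bar is tapped, shows the fetched phrases if they have arrived or cancels and falls back to defaults.

```swift
/// Hypothetical coordinator for the async-generation flow.
@MainActor
final class HelpfulPhrasesPresenter {
    /// Default phrases taken from the slide.
    static let defaultPhrases = ["I want", "I have", "I can"]

    private var task: Task<Void, Never>?
    private var fetched: [String]?

    /// Call when the character's response arrives: start fetching in the background.
    func characterDidRespond(fetch: @escaping @Sendable () async -> [String]?) {
        task?.cancel()
        fetched = nil
        task = Task { [weak self] in
            let phrases = await fetch()
            // Drop the result if the request was cancelled in the meantime.
            guard !Task.isCancelled else { return }
            self?.fetched = phrases
        }
    }

    /// Call when the user taps the input bar: never wait, use whatever is available.
    func phrasesForInputBar() -> [String] {
        task?.cancel()
        task = nil
        return fetched ?? Self.defaultPhrases
    }
}
```

The user-facing method is synchronous on purpose: tapping the input bar never blocks on the network, at the cost of sometimes showing the generic defaults.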

Slide 25

Slide 25 text

iOS Solution in code

final class RoleplayRepository {
    /// Variable to store the fetch task
    private var fetchHelpfulPhrasesTask: Task<[RoleplayHelpfulPhrase]?, Never>?

    /// Cancel the fetch request
    func cancelAsyncHelpfulPhrasesFetch() {
        guard fetchHelpfulPhrasesTask != nil else { return }
        // Cancelling the fetch of Helpful Phrases
        fetchHelpfulPhrasesTask?.cancel()
        fetchHelpfulPhrasesTask = nil
    }
}

Slide 26

Slide 26 text

iOS Solution in code

extension RoleplayRepository {
    /// Fetch the helpful phrases, while playing the audio of the character’s message
    func fetchAsyncHelpfulPhrases(roleplayState: RoleplayState) async throws -> [RoleplayHelpfulPhrase]? {
        fetchHelpfulPhrasesTask = Task { @MainActor in
            let helpfulPhrases = try? await dataSource.getHelpfulPhrases(roleplayState: roleplayState)
            guard fetchHelpfulPhrasesTask?.isCancelled == false else {
                fetchHelpfulPhrasesTask = nil
                return nil
            }
            fetchHelpfulPhrasesTask = nil
            return helpfulPhrases
        }
        return await fetchHelpfulPhrasesTask?.value
    }
}

Slide 27

Slide 27 text

Learnings
➜ GPT-4 optimization techniques are essential
➜ Think of creative UX/iOS solutions!
◆ Parallelize requests as much as possible
◆ Cancel a request if it is taking too long + provide default options
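One way to read "parallelize requests" in Swift is `async let`, which starts a child task immediately and lets other work proceed until the result is needed. The sketch below is illustrative: `runCharacterTurn` and both closure parameters are hypothetical stand-ins for text-to-speech playback and the helpful-phrases request.

```swift
/// Hypothetical turn handler: the phrase fetch overlaps with speech playback.
func runCharacterTurn(
    playSpeech: @escaping @Sendable () async -> Void,
    fetchPhrases: @escaping @Sendable () async -> [String]?
) async -> [String]? {
    async let phrases = fetchPhrases()   // starts right away as a child task
    await playSpeech()                   // runs while the fetch is in flight
    return await phrases                 // usually ready by the time speech ends
}
```

Total latency becomes roughly the longer of the two operations instead of their sum. Unlike the cancel-and-default approach, this version always waits for the fetch to finish, so it suits cases where the result is required.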

Slide 28

Slide 28 text

Developing AI Applications on iOS: TOP TAKEAWAYS

Slide 29

Slide 29 text

1. Backend Expertise
➜ Handle OpenAI outages
➜ Reduce latency
➜ Update to the latest models

Slide 30

Slide 30 text

2. Prompt Engineering
➜ Make sure GPT will follow the prompt
➜ Monitor questionable and problematic content
➜ Tie the GPT feature to the existing features
◆ Feed existing content to the prompt
◆ Make the new feature coherent

Slide 31

Slide 31 text

3. Fast iteration
➜ Latest GPT models
◆ Reduce cost
◆ Improve the quality of completions
➜ Product iterations
◆ Clear and flexible design patterns in code