Slide 1

Slide 1 text

CUSTOM CHATBOTS With GPT and Symfony PHP Framework Workshop by Christopher Hertel @ Web Summer Camp 2024

Slide 2

Slide 2 text

TODAY'S AGENDA
1. Basic Application Setup
2. Large Language Model
3. Prompts & Context
4. Vectors & Similarity
5. Retrieval Augmented Generation
6. Tools
Mixing theory input & practical coding challenges.

Slide 3

Slide 3 text

WHO'S THAT GUY?

Slide 4

Slide 4 text

WORKSHOP APP

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

REQUIREMENTS
What you need to follow this workshop:
1. Laptop
2. Internet Connection
3. Terminal & Browser
4. Git & GitHub Account
5. Docker with Docker Compose Plugin
6. Your Favorite IDE or Editor

Slide 7

Slide 7 text

TECHNOLOGY
This small demo sits on top of the following technologies:
1. PHP >= 8.3
2. Symfony 7.1 incl. Twig, Asset Mapper & UX
3. Bootstrap 5
4. OpenAI's GPT & Embeddings
5. ChromaDB Vector Store

Slide 8

Slide 8 text

BASIC APPLICATION SETUP
Let's start with a git clone. See README for setup instructions.

    # ssh
    git clone git@github.com:chr-hertel/wsc-symfony-chatbot.git
    # https
    git clone https://github.com/chr-hertel/wsc-symfony-chatbot.git

Slide 9

Slide 9 text

SETTING UP OPENAI SECRETS
1. Go to bit.ly/wsc-chatbot
2. Copy file into config/secrets/dev/
3. Run secrets command to test, see README

Slide 10

Slide 10 text

HELPER SCRIPT
For convenience, we have a helper script:

    bin/check

[ASCII-art banner: "All Checks Passed"]

Slide 11

Slide 11 text

WHAT YOU SHOULD HAVE NOW
1. Symfony app running in your browser
2. Script bin/check reporting "All Checks Passed"
3. Project set up in your IDE or editor

Slide 12

Slide 12 text

LLM & GPT ChatGPT and beyond

Slide 13

Slide 13 text

LLM & GPT
- Model: result of training an AI algorithm (e.g. a neural network) with data
- LLM == Large Language Model
- A model that understands and generates language
- Trained on a large amount of data, e.g. the World Wide Web
- Basic idea: completion based on probabilities

Slide 14

Slide 14 text

LLM & GPT
- GPT == Generative Pre-Trained Transformer
- First released by OpenAI
- Generates language word by word (more precisely, token by token)
- Pre-trained on large data sets
- Transformer is the neural network architecture the model is built on

Slide 15

Slide 15 text

WHAT ARE WE USING?
- OpenAI's GPT-4o, released May 2024
- GPT !== ChatGPT: ChatGPT is a product built on top of GPT
- Available via OpenAI's API
- But there are other models: Meta's LLaMA, Google's Gemini, Anthropic's Claude, Mistral, Falcon, Stanford Alpaca, GPT-J, GPT4ALL, ...

Slide 16

Slide 16 text

CHALLENGE 1 Integrate basic GPT model with your chatbot

Slide 17

Slide 17 text

BASIC GPT USAGE
- Merge branch 1-gpt into your working branch
- In the end bin/check should be green again
- Implement App\OpenAI\GptClient
- Use App\OpenAI\GptClientInterface in App\Chat
- See CHALLENGES.md for guidance

    git fetch origin
    git merge origin/1-gpt
    bin/check # will fail

Slide 18

Slide 18 text

EXAMPLE API CALL

    curl https://api.openai.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
        "model": "gpt-4o",
        "temperature": 1.0,
        "messages": [
          {"role": "user", "content": "Hello!"}
        ]
      }'
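The same request body can be assembled in PHP before it is handed to whatever HTTP client the application uses. This is a minimal sketch: `buildChatPayload` is a hypothetical helper name, not something taken from the workshop repository, and it only builds the JSON without sending anything.

```php
<?php

declare(strict_types=1);

/**
 * Builds the JSON body for POST /v1/chat/completions.
 * Hypothetical helper for illustration; not part of the workshop repository.
 *
 * @param list<array{role: string, content: string}> $messages
 */
function buildChatPayload(array $messages, string $model = 'gpt-4o', float $temperature = 1.0): string
{
    return json_encode([
        'model' => $model,
        'temperature' => $temperature,
        'messages' => $messages,
    ], JSON_THROW_ON_ERROR);
}

// Same content as the curl example: one user message.
$payload = buildChatPayload([
    ['role' => 'user', 'content' => 'Hello!'],
]);
```

The resulting string would go into the request body together with the `Content-Type` and `Authorization` headers from the curl call.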

Slide 19

Slide 19 text

ARE YOU READY?

    bin/check

[ASCII-art banner: "All Checks Passed"]

Slide 20

Slide 20 text

ARE YOU READY?

Slide 21

Slide 21 text

GPT FUNDAMENTALS Prompts, Temperature, and Context

Slide 22

Slide 22 text

PROMPTS
- Messages that interact with the model
- The style of a prompt really matters => Prompt Engineering
- A lot of "trial & error"
Roles
- User: messages created by a user
- Assistant: messages generated by the model
- System: instructions for the conversation

Slide 23

Slide 23 text

TEMPERATURE
- Controls how GPT turns token probabilities into a choice
- Often referred to as "randomness"
- High temperature: more creative
- Low temperature: more deterministic
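One common way to picture temperature is as a divisor applied to the model's raw scores before they are turned into probabilities. The sketch below assumes that textbook softmax formulation and uses made-up scores for three candidate tokens. It is an illustration of the idea, not OpenAI's actual implementation.

```php
<?php

declare(strict_types=1);

/**
 * Turns raw scores (logits) into probabilities, scaled by a temperature.
 * Simplified illustration; a real model scores a vocabulary of many thousands of tokens.
 *
 * @param array<string, float> $logits
 * @return array<string, float>
 */
function softmaxWithTemperature(array $logits, float $temperature): array
{
    // This sketch rejects 0 to avoid a division by zero.
    if ($temperature <= 0.0) {
        throw new InvalidArgumentException('Temperature must be greater than zero.');
    }

    // Subtract the maximum before exponentiating for numerical stability.
    $max = max($logits);
    $exp = array_map(
        static fn (float $logit): float => exp(($logit - $max) / $temperature),
        $logits,
    );
    $sum = array_sum($exp);

    return array_map(static fn (float $value): float => $value / $sum, $exp);
}

// Made-up scores for three candidate next tokens.
$logits = ['camp' => 2.0, 'school' => 1.0, 'banana' => 0.1];

$low = softmaxWithTemperature($logits, 0.2);  // sharp: the top token dominates
$high = softmaxWithTemperature($logits, 2.0); // flat: unlikely tokens get a real chance
```

With the low temperature almost all probability lands on the top-scoring token, which is why the output feels deterministic. With the high temperature the distribution flattens and less likely tokens are sampled more often.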

Slide 24

Slide 24 text

CONTEXT
- Collection of prompts & messages
- Basis for the model to generate new text
- A powerful way to control the model
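In API terms the context is the list of role-tagged messages sent with each request, so extending it can be as simple as prepending a system message. The sketch below assumes the message shape of the chat completions API. The helper name `withSystemContext` and the fixed timestamp are invented for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Prepends a system message so the request carries extra knowledge.
 * Hypothetical helper; not part of the workshop repository.
 *
 * @param list<array{role: string, content: string}> $messages
 * @return list<array{role: string, content: string}>
 */
function withSystemContext(array $messages, string $knowledge): array
{
    // $messages is a copy (arrays are passed by value), so the caller's list stays untouched.
    array_unshift($messages, ['role' => 'system', 'content' => $knowledge]);

    return $messages;
}

$conversation = [
    ['role' => 'user', 'content' => 'What day is it today?'],
];

// Arbitrary fixed timestamp so the example is reproducible.
$now = new DateTimeImmutable('2024-07-04 09:00:00');

$context = withSystemContext(
    $conversation,
    sprintf('The current date and time is %s.', $now->format('Y-m-d H:i')),
);
```

The model has no clock of its own, so a question about today's date can only be answered from what the context provides.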

Slide 25

Slide 25 text

EXAMPLE

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

CHALLENGE 2 Extend the chat's context to bring in more knowledge

Slide 29

Slide 29 text

EXTENDED CONTEXT
- Merge branch 2-context into your working branch
- In the end bin/check should be green again
- Inject knowledge about date, time and the Web Summer Camp program into the context
- Implement two decorators for GptClient
- See CHALLENGES.md for more guidance

    git fetch origin
    git merge origin/2-context
    bin/check # will fail

Slide 30

Slide 30 text

ARE YOU READY?

    bin/check

[ASCII-art banner: "All Checks Passed"]

Slide 31

Slide 31 text

ARE YOU READY?

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

RETRIEVAL AUGMENTED GENERATION Semantic Vectors & Similarity Search

Slide 34

Slide 34 text

TOKEN
- Sequences of characters found in texts
- Not necessarily the same as words
- Foundation for GPT to understand semantics

Slide 36

Slide 36 text

SEMANTICS

Slide 37

Slide 37 text

SEMANTICS

Slide 38

Slide 38 text

SEMANTICS

Slide 39

Slide 39 text

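VECTOR
- Expresses tokens as numerical data
- Enables arithmetic operations
- Distance == similarity: the closer two vectors, the more similar their meaning
- OpenAI's text-embedding-ada-002 embeddings have 1536 dimensions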
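One widely used way to turn "distance" into a number is cosine similarity, which compares the direction of two vectors. The sketch below assumes that measure and uses invented three-dimensional toy vectors. Which metric the vector store actually applies depends on its configuration.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity of two vectors with the same number of dimensions.
 * Returns a value between -1.0 and 1.0; closer to 1.0 means more similar.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA === 0.0 || $normB === 0.0) {
        throw new InvalidArgumentException('Cosine similarity is undefined for zero vectors.');
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy three-dimensional vectors; real embeddings have far more dimensions.
$symfony = [0.9, 0.1, 0.0];
$laravel = [0.8, 0.3, 0.1];
$lunch = [0.0, 0.2, 0.9];

$close = cosineSimilarity($symfony, $laravel); // two related topics
$far = cosineSimilarity($symfony, $lunch);     // two unrelated topics
```

Vectors pointing in nearly the same direction score close to 1.0, while unrelated ones score near 0.0.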

Slide 40

Slide 40 text

EMBEDDINGS
- Convert words/texts into a single vector
- The bigger the text, the "blurrier" the vector
- OpenAI also provides embedding models, e.g. text-embedding-ada-002

Slide 41

Slide 41 text

EMBEDDINGS

Slide 42

Slide 42 text

RETRIEVAL AUGMENTED GENERATION
- Combines GPT with embeddings
- Embeddings are used for similarity search
- Allows injecting knowledge into GPT's context
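The flow can be sketched end to end without any API: rank stored texts by similarity to the query vector, then place the best matches into the system message. In the sketch below all vectors and program entries are invented, and an in-memory array stands in for both the embedding model and the vector store.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity; assumes non-zero vectors of equal length.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function similarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Returns the texts of the $limit documents most similar to the query vector.
 *
 * @param list<float> $queryVector
 * @param list<array{text: string, vector: list<float>}> $documents
 * @return list<string>
 */
function retrieve(array $queryVector, array $documents, int $limit): array
{
    // Sort descending by similarity to the query.
    usort(
        $documents,
        static fn (array $x, array $y): int =>
            similarity($queryVector, $y['vector']) <=> similarity($queryVector, $x['vector']),
    );

    return array_column(array_slice($documents, 0, $limit), 'text');
}

// Invented sample entries with toy vectors standing in for real embeddings.
$documents = [
    ['text' => 'Workshop: Custom Chatbots with GPT and Symfony', 'vector' => [0.9, 0.1, 0.0]],
    ['text' => 'Lunch break', 'vector' => [0.0, 0.2, 0.9]],
    ['text' => 'Talk: Symfony UX in practice', 'vector' => [0.7, 0.6, 0.1]],
];

$question = 'Which sessions cover Symfony?';
$queryVector = [1.0, 0.2, 0.0]; // pretend embedding of $question

$matches = retrieve($queryVector, $documents, 2);

$messages = [
    ['role' => 'system', 'content' => "Answer using only this program information:\n- " . implode("\n- ", $matches)],
    ['role' => 'user', 'content' => $question],
];
```

Only the retrieved entries reach the model, which keeps the prompt small compared to pasting the whole program into a static context.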

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

CHALLENGE 3 Convert Program into Embeddings in ChromaDB

Slide 45

Slide 45 text

TURN PROGRAM INTO EMBEDDINGS
- Merge branch 3-vectors into your working branch
- Test with app:test:chroma and app:program:embed
- Implement App\OpenAI\EmbeddingClient for API
- Implement App\WscProgram\Embedder
- See CHALLENGES.md for more guidance

    git fetch origin
    git merge origin/3-vectors
    bin/check # will fail

Slide 46

Slide 46 text

ARE YOU READY?

    docker compose exec app bin/console app:test:chroma

    Testing Chroma DB Connection
    ============================

     // Connecting to Chroma DB ...

     ------------------ --------------------------------------
      Key                Value
     ------------------ --------------------------------------
      ChromaDB Version   0.5.3
      Collection Name    wsc-program
      Collection ID      18bacf09-3512-40bd-8077-895c6e1c98ff
      Total Documents    34
     ------------------ --------------------------------------

     // Searching for Symfony content ...

Slide 47

Slide 47 text

CHALLENGE 4 Switch to RAG instead of static context

Slide 48

Slide 48 text

RETRIEVAL AUGMENTED GENERATION
- Merge branch 4-retrieval into your working branch

    git fetch origin
    git merge origin/4-retrieval
    bin/check # will fail

Slide 49

Slide 49 text

DECORATORS
Use Symfony's decoration feature to implement decorators:

    #[AsDecorator(decorates: GptClientInterface::class, priority: 10)]
    final class RetrievalClient implements GptClientInterface
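The pattern behind the attribute can be sketched in plain PHP without the framework: a decorator implements the same interface as the client it wraps, adjusts the input, and delegates. In the sketch below `ChatClientInterface`, its `generate()` method and both classes are hypothetical stand-ins. The workshop's real `GptClientInterface` may look different.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the workshop's GptClientInterface; the real signature may differ.
interface ChatClientInterface
{
    /** @param list<array{role: string, content: string}> $messages */
    public function generate(array $messages): string;
}

// Stub inner client: reports the roles it received instead of calling an API.
final class EchoClient implements ChatClientInterface
{
    public function generate(array $messages): string
    {
        return implode(',', array_column($messages, 'role'));
    }
}

// Decorator: same interface, adds a system message, then delegates to the wrapped client.
final class SystemPromptClient implements ChatClientInterface
{
    public function __construct(
        private readonly ChatClientInterface $inner,
        private readonly string $systemPrompt,
    ) {
    }

    public function generate(array $messages): string
    {
        array_unshift($messages, ['role' => 'system', 'content' => $this->systemPrompt]);

        return $this->inner->generate($messages);
    }
}

// Manual wiring; in the Symfony app the service container does this wrapping.
$client = new SystemPromptClient(new EchoClient(), 'You are a helpful conference assistant.');
$result = $client->generate([['role' => 'user', 'content' => 'Hello!']]);
```

Because callers only depend on the interface, several decorators can be stacked without the chat code knowing about any of them.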

Slide 50

Slide 50 text

ARE YOU READY?

    bin/check

[ASCII-art banner: "All Checks Passed"]

Slide 51

Slide 51 text

TOOLS Equipping GPT with more Power

Slide 52

Slide 52 text

TOOLS
- GPT delegates calls back to the app
- The app comes back to GPT with the result
- GPT generates the final response
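The app-side half of that round trip can be sketched as a small dispatcher: look up the requested tool, decode its JSON arguments, run it, and wrap the result in a `tool` message for the follow-up request. The array shapes follow OpenAI's chat completions format. The registry of callables, the function name and all values are assumptions made for this sketch.

```php
<?php

declare(strict_types=1);

/**
 * Executes a tool call requested by the model and returns the message to send back.
 * The registry of callables is a hypothetical structure for this sketch.
 *
 * @param array<string, callable(array<string, mixed>): string> $tools
 * @param array{id: string, type: string, function: array{name: string, arguments: string}} $toolCall
 * @return array{role: string, tool_call_id: string, content: string}
 */
function executeToolCall(array $tools, array $toolCall): array
{
    $name = $toolCall['function']['name'];
    if (!isset($tools[$name])) {
        throw new InvalidArgumentException(sprintf('Unknown tool "%s".', $name));
    }

    // The API delivers the arguments as a JSON-encoded string.
    $arguments = json_decode($toolCall['function']['arguments'], true, 512, JSON_THROW_ON_ERROR);

    return [
        'role' => 'tool',
        'tool_call_id' => $toolCall['id'],
        'content' => $tools[$name]($arguments),
    ];
}

$tools = [
    // Fixed timestamp so the example is reproducible; a real clock would use "now".
    'clock' => static fn (array $arguments): string =>
        (new DateTimeImmutable('2024-07-04 09:00:00'))->format('Y-m-d H:i'),
];

// Shape of a tool call as it appears in an assistant message (values invented).
$toolCall = [
    'id' => 'call_123',
    'type' => 'function',
    'function' => ['name' => 'clock', 'arguments' => '{}'],
];

$toolMessage = executeToolCall($tools, $toolCall);
```

The returned message is appended to the conversation and sent back, and the model then writes its final answer from the tool's result.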

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

API PAYLOAD

    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and country, eg. San Francisco, USA"
              },
              "format": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"]
              }
            }

Slide 56

Slide 56 text

LLM CHAIN
Easier: use the ToolChain of php-llm/llm-chain

    use PhpLlm\LlmChain\ToolBox\AsTool;

    #[AsTool('clock', 'Provides the current date and time', '__invoke')]
    final class Clock

Slide 57

Slide 57 text

CHALLENGE 5 Shift time and program search to tools

Slide 58

Slide 58 text

TIME & PROGRAM AS TOOLS
- Merge branch 5-tools into your working branch
- Use ToolChain instead of GptClientInterface
- Create App\Tool\Clock & register it with AsTool
- Create App\Tool\Retriever & register it with AsTool

    git fetch origin
    git merge origin/5-tools
    bin/check # will fail

Slide 59

Slide 59 text

ARE YOU READY?

    bin/check

[ASCII-art banner: "All Checks Passed"]

Slide 60

Slide 60 text

CHALLENGE 6 Chat on top of YouTube Transcript

Slide 61

Slide 61 text

YOUTUBE TRANSCRIPT BOT
- Merge branch 6-youtube into your working branch
- New bot with YouTubeComponent
- Implement App\YouTube
- Use App\YouTube\TranscriptFetcher

    git fetch origin
    git merge origin/6-youtube

Slide 62

Slide 62 text

ARE YOU READY?

    bin/check

[ASCII-art banner: "All Checks Passed"]

Slide 63

Slide 63 text

CHALLENGE 7 Tool Chain on top of Wikipedia

Slide 64

Slide 64 text

WIKIPEDIA TOOL CHAIN
- Merge branch 7-wikipedia into your working branch
- New Wikipedia bot for research
- Implement App\Wikipedia
- Use App\Wikipedia\Client

    git fetch origin
    git merge origin/7-wikipedia

Slide 65

Slide 65 text

ARE YOU READY?

    bin/check

[ASCII-art banner: "All Checks Passed"]

Slide 66

Slide 66 text

THANKS FOR JOINING! I'll be around for feedback, questions & discussions. Get in contact via christopher-hertel.de