Slide 1

Slide 1 text

AI AGENTS with some JavaScript: How they work and how to build them

Slide 2

Slide 2 text

Have you heard about AI agents?

Slide 3

Slide 3 text

Yeah, the intelligent agents who will take your job in a few years!

Slide 4

Slide 4 text

I don't want to scare you, but…

Slide 5

Slide 5 text

"By 2034, AI "agents" will: • Replace 70% of office work (McKinsey) • Add $7 trillion to the global economy (Goldman). Most jobs will become obsolete. Here's what you need to know (& how you should prepare)!" Somebody on Twitter (+ McKinsey & Goldman)

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

So, welcome to my woodworking course!

Slide 9

Slide 9 text

AI AGENTS with some JavaScript

Slide 10

Slide 10 text

AI AGENTS with some JavaScript: How they work and how to build them

Slide 11

Slide 11 text

When people talk about AI agents, we often imagine:

Slide 12

Slide 12 text

But in reality, AI agents are more like:

Slide 13

Slide 13 text

Don't get me wrong. These agents can be extremely helpful!

Slide 14

Slide 14 text

But before we continue, let's see how LLMs work

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

Source: https://platform.openai.com/tokenizer

Slide 20

Slide 20 text

Prompts: - Prompts are instructions - You tell an LLM what you want, and it tries to reply based on its training and your instructions - Clearer instructions = better replies - LLMs always answer, but not always based on truth

Slide 21

Slide 21 text

Prompts are just instructions

Slide 22

Slide 22 text

So, how do they work?

Slide 23

Slide 23 text

You give your instructions

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

You get some unexpected wisdom or hallucination

Slide 26

Slide 26 text

But how do LLMs know how to reply?

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

Tokens and vectors

Slide 29

Slide 29 text

@slobodan_ Source: https://cthiriet.com/blog/infinite-memory-llm

Slide 30

Slide 30 text

Anatomy of a prompt

Slide 31

Slide 31 text

The prompt is a set of textual instructions that fits the LLM's context and other limitations

Slide 32

Slide 32 text

"Who is faster: Godzilla or T-Rex?" A valid prompt

Slide 33

Slide 33 text

"Write a 500-word article about the bad influence of Amazon's RTO policy on Lambda cold starts" Also a valid prompt

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

But LLMs are products that, like most other products, evolve with user needs and requests

Slide 36

Slide 36 text

System prompts

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

Some parts of your instructions might be more important than others, or you might want to make them repeatable

Slide 40

Slide 40 text

Slobodan Stojanovic cofounder/CTO @ Vacation Tracker @slobodan_

Slide 41

Slide 41 text

Before we talk about AI agents, let me show you one more thing

Slide 42

Slide 42 text

Meet my "friend" Claude. I ask it many weird things all the time. Way weirder than this one, trust me

Slide 43

Slide 43 text

This specific question is interesting because Claude cannot answer it

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

Some other LLMs can answer this question! But I like Claude. Can I help it to answer?

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

By the way, we just created an AI agent!

Slide 50

Slide 50 text

I know ChatGPT can search the internet. But that's also an agent. It's just built into the ChatGPT product.

Slide 51

Slide 51 text

How AI agents work

Slide 52

Slide 52 text

LLMs can do and are good at this:

Slide 53

Slide 53 text

LLMs will always reply. Some replies might not be meaningful (hallucinations).

Slide 54

Slide 54 text

Agents are LLMs + something that provides missing information & capabilities

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

A "While" loop

Slide 57

Slide 57 text

An AI Agent is like a "while" loop that keeps asking available tools to provide additional information or capability until it has all it needs to complete the task or answer the question
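In JavaScript, that idea fits in a dozen lines. This is a minimal sketch: `callLLM` and the `currentDate` tool are stand-ins for a real LLM API call and real tools.

```javascript
// Minimal agent loop: keep asking the LLM until it no longer requests a tool.
// `callLLM` is a stand-in for a real LLM API call; here it is any async function
// that takes the conversation so far and returns { tool } or { answer }.
const tools = {
  // Hypothetical tool: returns today's date, something the LLM cannot know
  currentDate: () => new Date().toISOString().slice(0, 10),
};

async function runAgent(question, callLLM, maxIterations = 5) {
  const messages = [{ role: "user", content: question }];
  for (let i = 0; i < maxIterations; i++) {
    const reply = await callLLM(messages);
    if (reply.tool) {
      // The LLM asked for a tool: run it and feed the result back
      const result = tools[reply.tool]();
      messages.push({ role: "assistant", content: `TOOL_CALL:${reply.tool}` });
      messages.push({ role: "user", content: `TOOL_RESULT:${result}` });
      continue;
    }
    return reply.answer; // the LLM had everything it needed
  }
  throw new Error("Agent did not finish within the iteration limit");
}
```

Note the `maxIterations` cap: the loop must always terminate, even when the LLM keeps asking for tools.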

Slide 58

Slide 58 text

The tool can be anything that provides the missing information or capabilities

Slide 59

Slide 59 text

I am the "tool" here!

Slide 60

Slide 60 text

But, while loops can be expensive!

Slide 61

Slide 61 text

Not because of the code complexity of a while loop, but because you invoke an LLM at least once in each iteration!

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

What does it mean to be "expensive" in this context? It depends on your use case! But be careful.

Slide 64

Slide 64 text

How to be careful: - Define spending limits - Make sure you do not iterate indefinitely (i.e., stop after N retries) - Use cheaper models for simple evaluations - Add monitoring and alarms
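The first two of those measures can be as small as a guard object checked before every LLM call. This is a sketch with made-up limit values; the cost-per-call figure is a placeholder, since real costs depend on the model and token usage.

```javascript
// A tiny guard that enforces an iteration cap and a spending limit.
// All numbers are illustrative defaults, not real prices.
class LoopGuard {
  constructor({ maxIterations = 10, maxSpendUsd = 1.0, costPerCallUsd = 0.01 } = {}) {
    this.iterations = 0;
    this.spentUsd = 0;
    this.maxIterations = maxIterations;
    this.maxSpendUsd = maxSpendUsd;
    this.costPerCallUsd = costPerCallUsd;
  }
  // Call before each LLM invocation; throws when a limit is hit.
  check() {
    this.iterations += 1;
    this.spentUsd += this.costPerCallUsd;
    if (this.iterations > this.maxIterations) throw new Error("Too many iterations");
    if (this.spentUsd > this.maxSpendUsd) throw new Error("Spending limit reached");
  }
}
```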

Slide 65

Slide 65 text

Where do we write these while loops?

Slide 66

Slide 66 text

Anywhere you want!

Slide 67

Slide 67 text

If you really really want, you can be that "while" loop

Slide 68

Slide 68 text

I am the "while loop" here!

Slide 69

Slide 69 text

But you can write your "while loop" anywhere you need it. This while loop can be in an app, in a terminal, on a server, in a browser, etc.

Slide 70

Slide 70 text

Just be careful not to expose your LLM secret keys and limit their usage because, remember, these while loops can be expensive.

Slide 71

Slide 71 text

How to write a while loop? - Define a system prompt with a clear explanation of all the tools you want to support (i.e., when and how to invoke them) - Ask the LLM to reply in strict JSON format - Make sure you parse and validate the reply correctly - Handle errors and have a limit on the number of iterations
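The parse-and-validate step might look like the sketch below. The reply shape ({ tool, input } versus { answer }) is an assumed convention for this example, not a standard.

```javascript
// Parse and validate an LLM reply that is supposed to be strict JSON.
// The expected shape ({ answer } or { tool, input }) is an assumption of this sketch.
function parseAgentReply(text) {
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch {
    return { error: "Reply was not valid JSON" };
  }
  if (typeof parsed.answer === "string") return { answer: parsed.answer };
  if (typeof parsed.tool === "string") return { tool: parsed.tool, input: parsed.input ?? {} };
  return { error: "JSON did not match the expected shape" };
}
```

On an error result, the loop can retry (up to its iteration limit) instead of crashing.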

Slide 72

Slide 72 text

LLMs are good at talking to humans, but these replies are not easy to parse in the code

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

You can ask an LLM to reply in the JSON format

Slide 75

Slide 75 text

User: // Some long instructions But always reply with valid JSON and nothing else! Assistant: Here's your JSON: ```json { "some": "JSON",

Slide 76

Slide 76 text

I SAID JSON ONLY!!! Works, sometimes

Slide 77

Slide 77 text

But there's something else you can do!

Slide 78

Slide 78 text

Write the beginning of the reply in the API request!

Slide 79

Slide 79 text

System: // Your system prompt User: // Your instructions Answer with valid JSON and nothing else. Assistant: { "

Slide 80

Slide 80 text

System: // Your system prompt User: // Your instructions Answer with valid JSON and nothing else. Assistant: { "some": "valid", "JSON": true }
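In code, the prefill trick amounts to seeding the assistant turn yourself and stitching the model's continuation back on before parsing. The message structure mirrors Anthropic-style chat APIs; `callLLM` is a stand-in for a real chat-completion call.

```javascript
// Prefill the assistant's reply so the model is forced to continue valid JSON.
async function askForJson(systemPrompt, userPrompt, callLLM) {
  const prefill = '{ "';
  const messages = [
    { role: "user", content: userPrompt + "\nAnswer with valid JSON and nothing else." },
    // The trick: start the assistant turn ourselves
    { role: "assistant", content: prefill },
  ];
  const completion = await callLLM(systemPrompt, messages);
  // The model only returns the continuation, so stitch the prefill back on
  return JSON.parse(prefill + completion);
}
```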

Slide 81

Slide 81 text

It's good to understand how these "while loops" work. But, you don't really need to write your own while loop!

Slide 82

Slide 82 text

It's good to understand how these "while loops" work. But, you don't really need to write your own while loop!

Slide 83

Slide 83 text

Popular AI agent tools & frameworks: - LlamaIndex - LangChain - AutoGen - Amazon Bedrock Agents - Many other alternatives…

Slide 84

Slide 84 text

https://ts.llamaindex.ai

Slide 85

Slide 85 text

LlamaIndex ≠ Meta Llama LLM

Slide 86

Slide 86 text

LlamaIndex supported LLMs: - OpenAI LLMs - Anthropic LLMs - Groq LLMs - Llama2, Llama3, Llama3.1 LLMs - MistralAI LLMs - Fireworks LLMs - DeepSeek LLMs - ReplicateAI LLMs - TogetherAI LLMs - HuggingFace LLMs - DeepInfra LLMs - Gemini LLMs

Slide 87

Slide 87 text

LlamaIndex recognized the AI assistant memory problem as an important thing to focus on

Slide 88

Slide 88 text

LLMs are limited by their context size

Slide 89

Slide 89 text

No content

Slide 90

Slide 90 text

No content

Slide 91

Slide 91 text

Luckily, the context of most LLMs is increasing fast

Slide 92

Slide 92 text

But we have more things we want to add to the context (such as Knowledge bases, documents, etc.)

Slide 93

Slide 93 text

RAG (Retrieval-Augmented Generation) Sounds complicated

Slide 94

Slide 94 text

No content

Slide 95

Slide 95 text

No content

Slide 96

Slide 96 text

No content

Slide 97

Slide 97 text

No content

Slide 98

Slide 98 text

Things to know about RAG - RAG is often explained in complicated terms, but it's a simple (and powerful) concept - You can use a vector database for RAG, but it's not necessary - You can store data almost anywhere, in vector DB, PostgreSQL, S3…

Slide 99

Slide 99 text

How does RAG work?

Slide 100

Slide 100 text

@slobodan_ Source: https://cthiriet.com/blog/infinite-memory-llm

Slide 101

Slide 101 text

How to use RAG: 1. Split your knowledge base into pieces and create vectors for each piece 2. Create vectors from user input 3. Do a vector search for the closest knowledge base matches to the user input 4. Add the knowledge base pieces to the prompt with an explanation/instructions 5. PROFIT
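Those five steps can be sketched end-to-end in plain JavaScript. The `embed` function below is a toy letter-count histogram standing in for a real embedding model, and a plain array stands in for a vector DB.

```javascript
// Toy "embedding": a 26-bin letter histogram. A real system would call an
// embedding model here; only the pipeline shape matters in this sketch.
function embed(text) {
  const v = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) v[i] += 1;
  }
  return v;
}

function dot(a, b) { return a.reduce((s, x, i) => s + x * b[i], 0); }

function buildRagPrompt(knowledgeBase, userInput, topK = 2) {
  // 1. Split the knowledge base and embed each piece (here: one vector per entry)
  const index = knowledgeBase.map((text) => ({ text, vector: embed(text) }));
  // 2. Embed the user input
  const queryVector = embed(userInput);
  // 3. Find the closest pieces
  const matches = index
    .map((e) => ({ ...e, score: dot(e.vector, queryVector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
  // 4. Add them to the prompt with instructions
  return `Use the following context to answer.\n\nContext:\n${matches
    .map((m) => `- ${m.text}`)
    .join("\n")}\n\nQuestion: ${userInput}`;
}
```

Step 5 is sending that prompt to the LLM and collecting the PROFIT.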

Slide 102

Slide 102 text

But comparing vectors must be complicated!
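It really isn't. Cosine similarity, the usual way to compare embedding vectors, is just a normalized dot product, a few lines in any language:

```javascript
// Cosine similarity: 1 means same direction, 0 means unrelated, -1 means opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```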

Slide 103

Slide 103 text

No content

Slide 104

Slide 104 text

However, RAG and vector search are topics for other presentations. Let's get back to agents…

Slide 105

Slide 105 text

Remember, LLMs are products that, like most other products, evolve with user needs and requests

Slide 106

Slide 106 text

So, now all major LLMs have built-in agent capabilities (or simply, they can use tools)

Slide 107

Slide 107 text

https://platform.openai.com/docs/guides/function-calling

Slide 108

Slide 108 text

https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview

Slide 109

Slide 109 text

Built-in tools pros: - Fewer errors (no need to force an LLM to return JSON) - No third-party tools - Well-defined format

Slide 110

Slide 110 text

Built-in tools cons: - A bit harder to switch models (you need to write a small adapter/wrapper)

Slide 111

Slide 111 text

Let's build an AI agent!

Slide 112

Slide 112 text

No content

Slide 113

Slide 113 text

No content

Slide 114

Slide 114 text

We could build this in many different ways in production. For example, it can look similar to the following diagram

Slide 115

Slide 115 text

No content

Slide 116

Slide 116 text

In production you need to think about: - Web application firewall (WAF) with rate limiting - Error handling - Rate limits (for your app + LLMs + other services) - Monitoring - Conversation storage (i.e., DynamoDB) - And many other things

Slide 117

Slide 117 text

I'll show only the most important parts of the code

Slide 118

Slide 118 text

Define available tools
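The slide's code isn't captured in this transcript, but a tool definition in Anthropic's tool-use format looks roughly like this; the weather tool itself is a made-up example, and OpenAI function calling uses the same idea with a slightly different shape.

```javascript
// Tools are described with a name, a description the LLM reads, and a JSON Schema
// for the input. This follows Anthropic's tool-use format; the tool is hypothetical.
const tools = [
  {
    name: "get_weather",
    description: "Get the current weather for a given city.",
    input_schema: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name, e.g. 'Belgrade'" },
      },
      required: ["city"],
    },
  },
];
```

The description fields matter: they are the only thing the LLM sees when deciding whether and how to call the tool.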

Slide 119

Slide 119 text

No content

Slide 120

Slide 120 text

Then you can invoke the LLM with the list of tools
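The invocation is a single API call with the tool list attached. This sketch only builds a request body in the shape of Anthropic's Messages API instead of sending it; the model name is a placeholder.

```javascript
// Build the body for a Messages API call that offers the LLM a list of tools.
// Nothing is sent here; pass this object to fetch() or an SDK client in real code.
function buildToolUseRequest(userMessage, tools) {
  return {
    model: "claude-3-5-sonnet-latest", // placeholder model name
    max_tokens: 1024,
    tools, // the tool definitions the LLM may choose to call
    messages: [{ role: "user", content: userMessage }],
  };
}
```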

Slide 121

Slide 121 text

No content

Slide 122

Slide 122 text

The response

Slide 123

Slide 123 text

No content

Slide 124

Slide 124 text

No content

Slide 125

Slide 125 text

Then you send your API request, wait for the reply and do the following:
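A sketch of that step: check whether the model stopped to call a tool, run it, and prepare the result message to send back. The reply shape follows Anthropic's tool-use responses, with a plain object standing in for the real API response.

```javascript
// Handle one LLM reply: if it requested a tool, run it and produce the
// tool_result message to send back; otherwise return the final text.
function handleReply(reply, toolImplementations) {
  if (reply.stop_reason === "tool_use") {
    const call = reply.content.find((block) => block.type === "tool_use");
    const result = toolImplementations[call.name](call.input);
    // This message goes back to the LLM as the next user turn
    return {
      done: false,
      message: {
        role: "user",
        content: [{ type: "tool_result", tool_use_id: call.id, content: String(result) }],
      },
    };
  }
  const text = reply.content.find((block) => block.type === "text");
  return { done: true, answer: text ? text.text : "" };
}
```

While `done` is false, you append the message and call the LLM again: this is the "while loop" from earlier, just with the LLM picking the tools.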

Slide 126

Slide 126 text

No content

Slide 127

Slide 127 text

No content

Slide 128

Slide 128 text

Agent response

Slide 129

Slide 129 text

No content

Slide 130

Slide 130 text

That's it!

Slide 131

Slide 131 text

The complete code example is more complicated. I'll publish an article with a detailed code example and step-by-step guide soon™

Slide 132

Slide 132 text

A few other things to check

Slide 133

Slide 133 text

https://openai.com/index/new-tools-for-building-agents/

Slide 134

Slide 134 text

https://docs.anthropic.com/en/docs/agents-and-tools/mcp

Slide 135

Slide 135 text

https://slobodan.me/posts/5-prompt-engineering-tips-for-developers/

Slide 136

Slide 136 text

A quick summary

Slide 137

Slide 137 text

@slobodan_ • An AI Agent is like a "while" loop with tools • You have all the skills you need to build tools • AI agents aren't scary, and they can be useful • Go, build agents, and have fun

Slide 138

Slide 138 text

https://slobodan.me @slobodan_