
Ruby On RAG - Building AI Use Cases for Fun and Profit

This talk will cover what RAG is, how it works, and why we should be building RAG applications in Ruby and Rails. I'll share some code examples of what a toy RAG pipeline looks like in native Ruby and do a live demo of a simple RAG application. I'll also share a perspective as to why Ruby and Rails are great tools for building LLM applications and that the future languages for building such applications are whatever languages are most natural to you.

Landon Gray

August 03, 2024

Transcript

  1. Overview
     ◉ Fun
       ◦ What is RAG?
       ◦ What problem does it solve?
       ◦ How does it work?
         ▪ Indexing
         ▪ Retrieval & Generation
     ◉ Profit - Practical
       ◦ Demo
  2. “Retrieval Augmented Generation (RAG) is a way to augment the LLM's knowledge with additional information.”
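The RAG idea above can be sketched in a few lines of plain Ruby. This is a toy illustration, not code from the talk: keyword overlap stands in for real vector search, and the document strings are invented examples.

```ruby
# Toy document store (illustrative strings, not from the talk)
DOCUMENTS = [
  "RubyConf 2024 takes place in Chicago.",
  "RAG augments LLM prompts with retrieved documents.",
  "Rails ships with built-in job queues."
].freeze

# Retrieval: score each document by word overlap with the query.
# A real pipeline would use vector embeddings and similarity search.
def retrieve(query, k: 1)
  words = query.downcase.scan(/\w+/)
  DOCUMENTS.max_by(k) { |doc| (doc.downcase.scan(/\w+/) & words).size }
end

# Generation step: augment the prompt with the retrieved context
# before handing it to the LLM.
def build_prompt(query)
  context = retrieve(query).join("\n")
  "Answer using this context:\n#{context}\n\nQuestion: #{query}"
end

puts build_prompt("What is RAG?")
```

The augmented prompt, rather than the bare question, is what would be sent to the model, which is the whole trick behind RAG.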
  3. Context Window & Tokens
     Token: A token can be thought of as a piece of a word.
     Context Window: The maximum number of tokens that can be used in a single request, inclusive of both input and output tokens.
     https://platform.openai.com/docs/models
     https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
  4. Context Window & Tokens: Rule of Thumb
     1 token ~= 4 chars in English
     100 tokens ~= 75 words
     Example: 6 tokens = “Ruby is a fine programming language”
     GPT-4o context window = 128,000 tokens ~= 96,000 words
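The rule of thumb above is easy to turn into a back-of-the-envelope estimator. A minimal sketch in Ruby (the method names are illustrative; for exact counts use OpenAI's tokenizer, e.g. via the tiktoken_ruby gem):

```ruby
# ~1 token per 4 characters of English text (rough heuristic only)
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# ~100 tokens per 75 words
def estimate_words(tokens)
  (tokens * 75 / 100.0).round
end

estimate_tokens("Ruby is a fine programming language") # heuristic estimate
estimate_words(128_000) # => 96000, matching the GPT-4o figure above
```

Note that the heuristic overestimates the example sentence (the slide's actual tokenizer count is 6), which is exactly why a real tokenizer matters near the context-window limit.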
  5. OpenAI - Tokenizer: OpenAI has a tool to help us understand how many tokens a piece of text might contain.
  6. “Vector embeddings are a way to convert words, sentences, and other data into numbers that capture their meaning and relationships.”
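Because embeddings are just arrays of numbers, "capturing meaning" reduces to comparing vectors, most commonly with cosine similarity. A minimal sketch in plain Ruby, assuming embeddings arrive as arrays of floats (the 3-dimensional vectors below are invented for illustration; real models return hundreds or thousands of dimensions):

```ruby
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Cosine similarity: 1.0 means identical direction (similar meaning),
# values near 0 mean unrelated.
def cosine_similarity(a, b)
  dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
end

cat = [0.9, 0.1, 0.3] # illustrative embeddings only
dog = [0.8, 0.2, 0.3]
car = [0.1, 0.9, 0.2]

cosine_similarity(cat, dog) # high: related concepts
cosine_similarity(cat, car) # lower: less related
```

This comparison is the heart of the retrieval step: the query is embedded, then the documents whose embeddings score highest against it are pulled into the prompt.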

  18. Clarification: Query vs Prompt
      Prompt: The final text input ingested by the LLM, which often contains the query.
      Query: Input text generated by some human or system.
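The distinction is easiest to see in code. A hedged sketch (names and strings are illustrative, not from the talk): the query is what the human typed; the prompt is the larger text the LLM actually ingests, wrapping the query with instructions and retrieved context.

```ruby
# The prompt wraps the query with instructions and context.
def build_prompt(query, context)
  <<~PROMPT
    You are a helpful assistant. Answer using only the context below.

    Context: #{context}

    Question: #{query}
  PROMPT
end

query  = "When is the next RubyConf?" # written by a human or system
prompt = build_prompt(query, "RubyConf 2024 takes place in Chicago.")
# `prompt`, not the bare query, is what gets sent to the LLM
```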