Poor Man’s BingChat – Building an Internet-connected Search Assistant from scratch*

vladiliescu.net Poor Man’s BingChat vladiliescu.net

vladiliescu.net what do we want? to ask GPT-4 questions about events that have occurred after September 2021

vladiliescu.net how do we get it? well, uhhh….

vladiliescu.net think step by step

vladiliescu.net what are large language models

vladiliescu.net AI in a nutshell

    def do_ai(input):
        output = ''
        for i in range(0, max_words):
            context = input + output
            next_word = predict_next_word(context)
            output += next_word
        return output

    do_ai('Give me the list of speakers for Codecamp Iasi 2023')
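The slide's `do_ai` leaves `max_words` and `predict_next_word` undefined, so it cannot run as written. Here is a minimal runnable sketch of the same loop. The `NEXT_WORDS` lookup table, the `<end>` marker, and the `seed` parameter are all made up for illustration. A real language model predicts tokens with a neural network, not words from a dictionary.

```python
import random

# Hypothetical stand-in for a real language model: a tiny table of
# which word may follow which. Purely illustrative.
NEXT_WORDS = {
    "the": ["speakers", "list"],
    "speakers": ["are"],
    "list": ["of"],
    "of": ["speakers"],
    "are": ["<end>"],
}


def predict_next_word(context, rng):
    """Pick a plausible next word based on the last word of the context."""
    words = context.split()
    last_word = words[-1].lower() if words else ""
    # Unknown words end the generation.
    return rng.choice(NEXT_WORDS.get(last_word, ["<end>"]))


def do_ai(prompt, max_words=10, seed=0):
    """Generate text one word at a time, feeding the output back in."""
    rng = random.Random(seed)
    output = []
    for _ in range(max_words):
        # The context is everything so far: the prompt plus generated words.
        context = " ".join([prompt] + output)
        next_word = predict_next_word(context, rng)
        if next_word == "<end>":
            break
        output.append(next_word)
    return " ".join(output)


print(do_ai("Give me the"))
```

The key point is the same as on the slide: each predicted word is appended to the context before the next prediction, so the model only ever sees the prompt plus what it has already written.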

vladiliescu.net Large Language Models are Zero-Shot Reasoners (Kojima et al., 2022)

vladiliescu.net why don’t we just retrain it

vladiliescu.net well…

vladiliescu.net a helpful visualization:

vladiliescu.net Base models
Q: How does Hamlet’s famous monologue start?
A: What is the opening line of Romeo’s soliloquy in Act 2, Scene 2 of “Romeo and Juliet”?
– What are the first words of the “Tears in rain” monologue from the film “Blade Runner”?
– What are the first words of Mark Antony’s “Friends, Romans, countrymen” speech in “Julius Caesar”?
–

vladiliescu.net Assistant models
Q: How does Hamlet’s famous monologue start?
A: Hamlet’s famous monologue, often referred to as the “To be or not to be” soliloquy, is from William Shakespeare’s play “Hamlet.” This monologue appears in Act 3, Scene 1 and delves into the protagonist’s contemplation of life, death, and the nature of existence. The opening line is as follows:
To be, or not to be, that is the question

vladiliescu.net so, it doesn’t really make sense to do a full retrain

vladiliescu.net maybe just fine-tune the thing?

vladiliescu.net what a fine-tuning dataset looks like

vladiliescu.net Open Orca
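To make the idea of a fine-tuning dataset concrete, here is a hedged sketch of a couple of instruction-style records. The rows are invented for illustration and are not copied from Open Orca. The field names (`system_prompt`, `question`, `response`) are an assumption modelled on that kind of dataset, so check the dataset card before relying on them.

```python
import json

# Illustrative records only, not taken from any real dataset.
# Field names are an assumption modelled on instruction-tuning datasets.
records = [
    {
        "system_prompt": "You are a helpful assistant.",
        "question": "How does Hamlet's famous monologue start?",
        "response": "To be, or not to be, that is the question",
    },
    {
        "system_prompt": "You are a helpful assistant.",
        "question": "What is a context window?",
        "response": (
            "The maximum amount of text, measured in tokens, "
            "that a model can take into account at once."
        ),
    },
]


def to_jsonl(rows):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


print(to_jsonl(records))
```

Each record pairs an input (system prompt plus question) with the answer we would like the model to produce. This is what turns a base model that just continues text into an assistant that answers.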

vladiliescu.net issue #1 gathering a dataset of all text on the Internet is non-trivial, plus it takes, like, a lot of time

vladiliescu.net issue #2 fine-tuning still takes some time to do

vladiliescu.net basically as soon as we’d fine-tune something it’s already obsolete (no real-time info)

vladiliescu.net back to the drawing board

vladiliescu.net what if we include the info in the prompt?

vladiliescu.net say hello to context window sizes

vladiliescu.net Remember this?

    def do_ai(input):
        output = ''
        for i in range(0, max_words):
            context = input + output
            next_word = predict_next_word(context)
            output += next_word
        return output

    do_ai('Give me the list of speakers for Codecamp Iasi 2023')

vladiliescu.net some popular models

Model              Context window  Company
Llama2             4k tokens       Meta
GPT-3.5-turbo      4k tokens       OpenAI
GPT-4              8k tokens       OpenAI
GPT-3.5-turbo-16k  16k tokens      OpenAI
GPT-4-32k          32k tokens      OpenAI
Code Llama         100k tokens     Meta
Claude 2           100k tokens     Anthropic
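A rough sketch of what these limits mean in practice: given some text, which of the models above could take it in one go? The window sizes are the rounded figures from the slide (real limits differ slightly). The "about 4 characters per token" rule is only a crude heuristic for English text, and the `reserved_for_answer` budget is an assumption. A real implementation would count tokens with the model's own tokenizer.

```python
# Context window sizes, rounded as on the slide (in tokens).
CONTEXT_WINDOWS = {
    "Llama2": 4_000,
    "GPT-3.5-turbo": 4_000,
    "GPT-4": 8_000,
    "GPT-3.5-turbo-16k": 16_000,
    "GPT-4-32k": 32_000,
    "Code Llama": 100_000,
    "Claude 2": 100_000,
}

CHARS_PER_TOKEN = 4  # crude heuristic, an assumption, not a real tokenizer


def estimate_tokens(text):
    """Estimate the token count of a text, rounding up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def models_that_fit(text, reserved_for_answer=500):
    """List the models whose window holds the text plus room for an answer."""
    needed = estimate_tokens(text) + reserved_for_answer
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


# A 20,000-character page is roughly 5,000 tokens by this estimate.
print(models_that_fit("x" * 20_000))
```

Since the prompt and the generated answer share the same window, a single long web page can already rule out the smaller models.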

vladiliescu.net but they’re so small

vladiliescu.net what if we don’t send it EVERYTHING?

vladiliescu.net how do we know what to include in the prompt?

vladiliescu.net we include the most relevant info duh

vladiliescu.net pattern’s called r a g

vladiliescu.net pattern’s called retrieval augmented generation
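A minimal sketch of the "retrieval" half of the pattern, assuming the crudest possible relevance measure: count the words a document shares with the question and keep the best few. The `retrieve` function, its `top_k` parameter, and the sample documents are all made up for illustration. A real system would use a search engine or an embedding index instead.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Return up to top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap at all, then sort best first.
    relevant = [(score, doc) for score, doc in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


# Placeholder documents, not real content.
documents = [
    "Codecamp Iasi 2023 speakers page, speaker list goes here",
    "Sourdough bread recipe",
    "Codecamp Iasi 2023 schedule and venue details",
]

question = "Give me the list of speakers for Codecamp Iasi 2023"
print(retrieve(question, documents))
```

Only the retrieved documents go into the prompt, which is how the pattern keeps the request within the context window.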

vladiliescu.net You are a search assistant that helps users find information from a series of curated documents. The user just asked you this question: {QUESTION}. You need to think carefully and find the best answer to the query. Do not make stuff up, and answer using only information from the following documents: {RELEVANT_INFO}
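A small sketch of how the two placeholders might be filled in. The template text is the one from the slide. The `build_prompt` helper and the choice to number the documents and separate them with blank lines are assumptions made for illustration.

```python
PROMPT_TEMPLATE = (
    "You are a search assistant that helps users find information from a "
    "series of curated documents. The user just asked you this question: "
    "{QUESTION}. You need to think carefully and find the best answer to the "
    "query. Do not make stuff up, and answer using only information from the "
    "following documents: {RELEVANT_INFO}"
)


def build_prompt(question, documents):
    """Fill the template with the question and the retrieved documents."""
    # Numbering the documents is an illustrative choice, not from the slide.
    relevant_info = "\n\n".join(
        f"[{index}] {doc}" for index, doc in enumerate(documents, start=1)
    )
    return PROMPT_TEMPLATE.format(QUESTION=question, RELEVANT_INFO=relevant_info)


prompt = build_prompt(
    "Give me the list of speakers for Codecamp Iasi 2023",
    ["first retrieved document", "second retrieved document"],
)
print(prompt)
```

The resulting string is what gets sent to the model as a single prompt. The model never searches anything itself, it only reads what was pasted in.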

vladiliescu.net The thing about RAG is that it’s only as good as the quality of your search results. – Stack Overflow Blog

vladiliescu.net semantic vs lexical search
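A toy sketch of the difference, under loud assumptions: lexical search is approximated by word overlap (Jaccard similarity), and semantic search by cosine similarity between embedding vectors. The three-dimensional vectors in `EMBEDDINGS` are hand-made for illustration. Real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def lexical_score(query, doc):
    """Jaccard similarity between the word sets of query and document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    union = query_words | doc_words
    return len(query_words & doc_words) / len(union) if union else 0.0


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hand-made toy "embeddings", purely illustrative.
EMBEDDINGS = {
    "conference speakers": [0.9, 0.1, 0.0],
    "people giving talks at the event": [0.8, 0.2, 0.1],
    "bread recipe": [0.0, 0.1, 0.9],
}

query = "conference speakers"
for doc in ("people giving talks at the event", "bread recipe"):
    lexical = lexical_score(query, doc)
    semantic = cosine_similarity(EMBEDDINGS[query], EMBEDDINGS[doc])
    print(f"{doc!r}: lexical={lexical:.2f}, semantic={semantic:.2f}")
```

The query and "people giving talks at the event" share no words, so the lexical score is zero, while their toy vectors point in nearly the same direction. Lexical search, on the other hand, is hard to beat when the exact term matters, such as a name or an error code.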

vladiliescu.net come to think of it, we don’t need all of that (we do, actually)

vladiliescu.net we can just use

vladiliescu.net let’s see how this works in practice

vladiliescu.net thank you! vladiliescu.net vlad.substack.com

More Decks by Vlad Iliescu
