
Building LLM Powered Features

Radoslav Stankov

November 16, 2025

Transcript

  1. I'm not going to talk about... 1. AGI 2. AI taking developers' jobs 3. Vibe coding 4. How fast AI is moving 5. Agents / MCP
  2. My thoughts: LLM-based features will become a staple of every application, just like databases. Knowing how to work with LLMs will be an essential skill for every developer.
  3. LLMs are already incredible, and there are years of work to be done to fully productize the capabilities that exist today.
  4. Chat mode

     await fetch("https://api.openai.com/v1/responses", {
       method: "POST",
       headers: {
         "Content-Type": "application/json",
         "Authorization": "Bearer YOUR_OPENAI_API_KEY"
       },
       body: JSON.stringify({
         model: "gpt-5.1",
         input: [
           { role: "system", content: "You are a helpful assistant." },
           { role: "user", content: "Explain what LLM is" }
         ],
         temperature: 0.7
       })
     });
  5. Instructions + Input mode

     await fetch("https://api.openai.com/v1/responses", {
       method: "POST",
       headers: {
         "Content-Type": "application/json",
         "Authorization": "Bearer YOUR_OPENAI_API_KEY"
       },
       body: JSON.stringify({
         model: "gpt-5.1",
         instructions: "Explain concepts clearly and concisely.",
         input: "What is an LLM?",
         temperature: 0.7
       })
     });
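    The two calls above differ only in their payload, so in an application they would typically sit behind a small helper. A minimal sketch, with hypothetical names (only the pure request builder is shown so it can be checked without a network call; the real code may look different):

    ```javascript
    // Hypothetical wrapper around the Responses API call shown above.
    // buildResponsesRequest is pure, so it can be tested without the network.
    function buildResponsesRequest({ model = "gpt-5.1", instructions, input, temperature = 0.7 }) {
      return {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
        },
        body: JSON.stringify({ model, instructions, input, temperature }),
      };
    }

    // Thin fetch wrapper; callers get the parsed JSON response or an error.
    async function openAiResponse(options) {
      const request = buildResponsesRequest(options);
      const response = await fetch("https://api.openai.com/v1/responses", request);
      if (!response.ok) throw new Error(`OpenAI API error: ${response.status}`);
      return response.json();
    }
    ```

    Keeping the request construction separate from the transport makes the payload easy to assert on in tests.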
  6. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Output
  7. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Output
  8. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → Output
  9. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → EOS token? → Output
  10. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → EOS token? (false: append token to context, loop back) → Output
  11. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → EOS token? (true: Detokenizer → Output; false: append token to context, loop back)
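    The decode loop in the diagram above can be sketched in code. This is a toy illustration: `nextTokenLogits` is a fake deterministic stand-in for the transformer, and every name here is invented for the sketch, not taken from any real library.

    ```javascript
    // Toy sketch of the decode loop: the model returns next-token probabilities,
    // sampling picks one token, and the loop appends it to the context until EOS.
    const EOS = "<eos>";

    // Temperature scales the logits: lower -> sharper, higher -> flatter.
    function softmax(logits, temperature) {
      const scaled = logits.map((l) => l / temperature);
      const max = Math.max(...scaled);
      const exps = scaled.map((l) => Math.exp(l - max));
      const sum = exps.reduce((a, b) => a + b, 0);
      return exps.map((e) => e / sum);
    }

    // Draw one token from the probability distribution.
    function sample(tokens, probs) {
      let r = Math.random();
      for (let i = 0; i < tokens.length; i++) {
        r -= probs[i];
        if (r <= 0) return tokens[i];
      }
      return tokens[tokens.length - 1];
    }

    // Fake "transformer": boosts one token per context length, for illustration.
    function nextTokenLogits(context) {
      const vocab = ["LLMs", "predict", "tokens", EOS];
      const logits = vocab.map((_, i) => (i === Math.min(context.length, 3) ? 5 : 0));
      return { vocab, logits };
    }

    function generate(promptTokens, { temperature = 0.7, maxTokens = 16 } = {}) {
      const context = [...promptTokens]; // all tokens so far
      for (let step = 0; step < maxTokens; step++) {
        const { vocab, logits } = nextTokenLogits(context);
        const probs = softmax(logits, temperature);
        const token = sample(vocab, probs);
        if (token === EOS) break;  // EOS token? true -> stop
        context.push(token);       // false -> append token to context, loop back
      }
      return context.join(" ");    // "detokenizer" (trivial here)
    }
    ```

    At a low temperature the boosted token dominates, so the toy loop emits an essentially deterministic sequence; raising the temperature flattens the distribution and lets other tokens through.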
  13. ! Retrieval-Augmented Generation (RAG) Injecting relevant data from external sources

    like databases or files into the LLM’s context so it can produce more accurate outputs.
  14. ! Retrieval-Augmented Generation (RAG) 1/ user query 2/ retrieve data 3/ receive data 4/ query + data 5/ response 6/ answer
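    The six-step flow above can be sketched with a naive keyword retriever standing in for a real vector store. All names and documents here are hypothetical illustrations, not code from the talk, and `buildAugmentedPrompt` produces the "query + data" payload that would go to the LLM.

    ```javascript
    // Step 1: user query arrives. Steps 2-3: retrieve relevant data.
    const documents = [
      { id: 1, text: "Refunds are processed within 5 business days." },
      { id: 2, text: "Invoices are emailed on the first day of each month." },
    ];

    // Naive retriever: score documents by words shared with the query.
    // A real system would use embeddings and a vector database instead.
    function retrieve(query, docs, limit = 1) {
      const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
      return docs
        .map((doc) => ({
          doc,
          score: doc.text
            .toLowerCase()
            .split(/\W+/)
            .filter(Boolean)
            .filter((w) => words.has(w)).length,
        }))
        .sort((a, b) => b.score - a.score)
        .slice(0, limit)
        .map((entry) => entry.doc);
    }

    // Step 4: inject query + retrieved data into the LLM's context.
    function buildAugmentedPrompt(query, docs) {
      const context = docs.map((d) => `- ${d.text}`).join("\n");
      return `Answer using only this context:\n${context}\n\nQuestion: ${query}`;
    }
    ```

    Steps 5-6 (response and answer) are just the LLM call from the earlier slides, with the augmented prompt as input.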
  21. const PROMPT = `
        You are a search assistant for a {form_description} form.
        Convert the user's natural language query into a JSON object
        using only the fields defined in the HTML form.

        {form}

        # Context
        The current date is "{current_date}"
        The user name is "{user_name}"
        The user ID is "{user_id}"

        # Instructions
        - ...
      `;

      export default {
        async transactionsSearchParams({ query, formCode, currentUser }) {
          const prompt = PROMPT
            .replace('{form_description}', 'financial transactions')
            .replace('{form}', formCode)
            .replace('{current_date}', new Date().toISOString().slice(0, 10))
            .replace('{user_id}', String(currentUser.id))
            .replace('{user_name}', currentUser.name);

          return await OpenAiApi.response({
            label: `TransactionsSearch: ${query}`,
            userId: currentUser.id,
            instructions: prompt,
            input: query,
          });
        },
      };
  22. formCode = `
        <form>
          <select name="user_id">
            <option value="">Cashier</option>
            <option value="43917">Demo User</option>
          </select>
          <select name="source">
            <option value="">Source</option>
            <option value="payment">Payment</option>
            <option value="epay">Paid via ePay</option>
            <option value="easy_pay">Paid via EasyPay</option>
            <option value="icard">Paid via iCard</option>
            <option value="bank_transfer">Paid via bank transfer</option>
          </select>
          <label>Equals <input type="date" name="date[eq]" /></label>
          <label>From <input type="date" name="date[gteq]" /></label>
          <label>To <input type="date" name="date[lteq]" /></label>
          <select name="kind">
            <option value="">Type</option>
            <option value="income">Income</option>
            <option value="expense">Expense</option>
          </select>
          <input type="search" name="query" placeholder="Title" />
          <label>Equals <input type="text" name="amount[eq]" /></label>
          <label>From <input type="text" name="amount[gteq]" /></label>
          <label>To <input type="text" name="amount[lteq]" /></label>
          <label>Equals <input type="date" name="created_at[eq]" /></label>
          <label>From <input type="date" name="created_at[gteq]" /></label>
          <label>To <input type="date" name="created_at[lteq]" /></label>
        </form>
      `;

      const params = await transactionsSearchParams({ currentUser, formCode, query });
      redirectTo(paths.transactionsSearch(params));
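    Slide 21 fills placeholders with chained `.replace` calls. The same idea can be generalized into one helper that fills every `{name}` placeholder from a map and fails loudly on an unknown name, so a typo in the prompt template is caught early. A hypothetical sketch, not the talk's code:

    ```javascript
    // Fill every {name} placeholder in a template from a values map.
    // Throws if the template references a name the map does not provide.
    function fillTemplate(template, values) {
      return template.replace(/\{(\w+)\}/g, (match, name) => {
        if (!(name in values)) throw new Error(`Missing template value: ${name}`);
        return String(values[name]);
      });
    }
    ```

    Unlike chained string `.replace` calls, which only substitute the first occurrence of each placeholder, the global regex here fills repeated placeholders too.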
  23. ! Evals LLM evals are automated tests that measure how

    reliably a model behaves across real scenarios so you can ship LLM features with confidence.
  24. ! My current process 1. Generate a lot of outputs with different inputs and analyze them. 2. Build a couple of tests with something like the VCR gem. 3. Record LLM interactions and review them manually with a UI. 4. Adjust the prompt and go through steps 1-2. 5. The process depends on the specific feature. ...this takes time, tries, and tokens ($$$)
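    Step 1 of the process above, generating lots of outputs and checking them, can be automated with a tiny harness that runs input cases through a feature and scores each output. A hedged sketch with invented names; `runFeature` stands in for the real LLM-backed call:

    ```javascript
    // Run each eval case through the feature and score it with its checker.
    // A case is { input, check }, where check(output) returns true on pass.
    async function runEvals(cases, runFeature) {
      const results = [];
      for (const { input, check } of cases) {
        const output = await runFeature(input);
        results.push({ input, output, passed: check(output) });
      }
      const passed = results.filter((r) => r.passed).length;
      return { results, passRate: passed / results.length };
    }
    ```

    In practice `runFeature` would call the LLM (or replay a recorded interaction, VCR-style), and the pass rate is tracked across prompt revisions.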
  26. ! Recap 0/ How LLMs work 1/ RAG 2/ Levels of LLM integrations: single feature, workflow, agent 3/ Context 4/ Evals