
Building LLM Powered Features

Radoslav Stankov

November 16, 2025

Transcript

  1. I'm not going to talk about... 1. AGI 2. AI taking developers' jobs 3. Vibe coding 4. How fast AI is moving 5. Agents / MCP
  2. My thoughts: LLM-based features will become a staple of every application, just like databases. Knowing how to work with LLMs will be an essential skill for every developer.
  3. LLMs are already incredible, and there are years of work to be done to fully productize the capabilities that exist today.
  4. Chat mode

     await fetch("https://api.openai.com/v1/responses", {
       method: "POST",
       headers: {
         "Content-Type": "application/json",
         "Authorization": "Bearer YOUR_OPENAI_API_KEY"
       },
       body: JSON.stringify({
         model: "gpt-5.1",
         input: [
           { role: "system", content: "You are a helpful assistant." },
           { role: "user", content: "Explain what LLM is" }
         ],
         temperature: 0.7
       })
     });
  5. Instructions + Input mode

     await fetch("https://api.openai.com/v1/responses", {
       method: "POST",
       headers: {
         "Content-Type": "application/json",
         "Authorization": "Bearer YOUR_OPENAI_API_KEY"
       },
       body: JSON.stringify({
         model: "gpt-5.1",
         instructions: "Explain concepts clearly and concisely.",
         input: "What is an LLM?",
         temperature: 0.7
       })
     });
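    The two calls above differ only in their payload, so in an application they would typically sit behind a small helper. A minimal sketch, with hypothetical names (only the pure request builder is shown so it can be checked without a network call; the real code may look different):

    ```javascript
    // Hypothetical wrapper around the Responses API call shown above.
    // buildResponsesRequest is pure, so it can be tested without the network.
    function buildResponsesRequest({ model = "gpt-5.1", instructions, input, temperature = 0.7 }) {
      return {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
        },
        body: JSON.stringify({ model, instructions, input, temperature }),
      };
    }

    // Thin fetch wrapper; callers get the parsed JSON response or an error.
    async function openAiResponse(options) {
      const request = buildResponsesRequest(options);
      const response = await fetch("https://api.openai.com/v1/responses", request);
      if (!response.ok) throw new Error(`OpenAI API error: ${response.status}`);
      return response.json();
    }
    ```

    Keeping the request construction separate from the transport makes the payload easy to assert on in tests.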
  6. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Output
  7. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Output
  8. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → Output
  9. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → EOS token? → Output
  10. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → EOS token? (false: append token to context, loop back) → Output
  11. Instructions + Input → Tokenizer → All tokens so far (context) → Embeddings (vectors) → LLM (transformer) → Next-token probabilities → Sampling (temperature, top-k, etc.) → Select one token (next token) → EOS token? (true: Detokenizer → Output; false: append token to context, loop back)
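    The decode loop in the diagram above can be sketched in code. This is a toy illustration: `nextTokenLogits` is a fake deterministic stand-in for the transformer, and every name here is invented for the sketch, not taken from any real library.

    ```javascript
    // Toy sketch of the decode loop: the model returns next-token probabilities,
    // sampling picks one token, and the loop appends it to the context until EOS.
    const EOS = "<eos>";

    // Temperature scales the logits: lower -> sharper, higher -> flatter.
    function softmax(logits, temperature) {
      const scaled = logits.map((l) => l / temperature);
      const max = Math.max(...scaled);
      const exps = scaled.map((l) => Math.exp(l - max));
      const sum = exps.reduce((a, b) => a + b, 0);
      return exps.map((e) => e / sum);
    }

    // Draw one token from the probability distribution.
    function sample(tokens, probs) {
      let r = Math.random();
      for (let i = 0; i < tokens.length; i++) {
        r -= probs[i];
        if (r <= 0) return tokens[i];
      }
      return tokens[tokens.length - 1];
    }

    // Fake "transformer": boosts one token per context length, for illustration.
    function nextTokenLogits(context) {
      const vocab = ["LLMs", "predict", "tokens", EOS];
      const logits = vocab.map((_, i) => (i === Math.min(context.length, 3) ? 5 : 0));
      return { vocab, logits };
    }

    function generate(promptTokens, { temperature = 0.7, maxTokens = 16 } = {}) {
      const context = [...promptTokens]; // all tokens so far
      for (let step = 0; step < maxTokens; step++) {
        const { vocab, logits } = nextTokenLogits(context);
        const probs = softmax(logits, temperature);
        const token = sample(vocab, probs);
        if (token === EOS) break;  // EOS token? true -> stop
        context.push(token);       // false -> append token to context, loop back
      }
      return context.join(" ");    // "detokenizer" (trivial here)
    }
    ```

    At a low temperature the boosted token dominates, so the toy loop emits an essentially deterministic sequence; raising the temperature flattens the distribution and lets other tokens through.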
  13. ! Retrieval-Augmented Generation (RAG) Injecting relevant data from external sources

    like databases or files into the LLM’s context so it can produce more accurate outputs.
  14. ! Retrieval-Augmented Generation (RAG) 1/ user query 2/ retrieve data 3/ receive data 4/ query + data 5/ response 6/ answer
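    The six-step flow above can be sketched with a naive keyword retriever standing in for a real vector store. All names and documents here are hypothetical illustrations, not code from the talk, and `buildAugmentedPrompt` produces the "query + data" payload that would go to the LLM.

    ```javascript
    // Step 1: user query arrives. Steps 2-3: retrieve relevant data.
    const documents = [
      { id: 1, text: "Refunds are processed within 5 business days." },
      { id: 2, text: "Invoices are emailed on the first day of each month." },
    ];

    // Naive retriever: score documents by words shared with the query.
    // A real system would use embeddings and a vector database instead.
    function retrieve(query, docs, limit = 1) {
      const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
      return docs
        .map((doc) => ({
          doc,
          score: doc.text
            .toLowerCase()
            .split(/\W+/)
            .filter(Boolean)
            .filter((w) => words.has(w)).length,
        }))
        .sort((a, b) => b.score - a.score)
        .slice(0, limit)
        .map((entry) => entry.doc);
    }

    // Step 4: inject query + retrieved data into the LLM's context.
    function buildAugmentedPrompt(query, docs) {
      const context = docs.map((d) => `- ${d.text}`).join("\n");
      return `Answer using only this context:\n${context}\n\nQuestion: ${query}`;
    }
    ```

    Steps 5-6 (response and answer) are just the LLM call from the earlier slides, with the augmented prompt as input.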
  21. const PROMPT = `
        You are a search assistant for a {form_description} form.
        Convert the user's natural language query into a JSON object
        using only the fields defined in the HTML form.

        {form}

        # Context
        The current date is "{current_date}"
        The user name is "{user_name}"
        The user ID is "{user_id}"

        # Instructions
        - ...
      `;

      export default {
        async transactionsSearchParams({ query, formCode, currentUser }) {
          const prompt = PROMPT
            .replace('{form_description}', 'financial transactions')
            .replace('{form}', formCode)
            .replace('{current_date}', new Date().toISOString().slice(0, 10))
            .replace('{user_id}', String(currentUser.id))
            .replace('{user_name}', currentUser.name);

          return await OpenAiApi.response({
            label: `TransactionsSearch: ${query}`,
            userId: currentUser.id,
            instructions: prompt,
            input: query,
          });
        },
      };
  22. formCode = `
        <form>
          <select name="user_id">
            <option value="">Cashier</option>
            <option value="43917">Demo User</option>
          </select>
          <select name="source">
            <option value="">Source</option>
            <option value="payment">Payment</option>
            <option value="epay">Paid via ePay</option>
            <option value="easy_pay">Paid via EasyPay</option>
            <option value="icard">Paid via iCard</option>
            <option value="bank_transfer">Paid via bank transfer</option>
          </select>
          <label>Equals <input type="date" name="date[eq]" /></label>
          <label>From <input type="date" name="date[gteq]" /></label>
          <label>To <input type="date" name="date[lteq]" /></label>
          <select name="kind">
            <option value="">Type</option>
            <option value="income">Income</option>
            <option value="expense">Expense</option>
          </select>
          <input type="search" name="query" placeholder="Title" />
          <label>Equals <input type="text" name="amount[eq]" /></label>
          <label>From <input type="text" name="amount[gteq]" /></label>
          <label>To <input type="text" name="amount[lteq]" /></label>
          <label>Equals <input type="date" name="created_at[eq]" /></label>
          <label>From <input type="date" name="created_at[gteq]" /></label>
          <label>To <input type="date" name="created_at[lteq]" /></label>
        </form>
      `;

      const params = await transactionsSearchParams({ currentUser, formCode, query });
      redirectTo(paths.transactionsSearch(params));
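    Slide 21 fills placeholders with chained `.replace` calls. The same idea can be generalized into one helper that fills every `{name}` placeholder from a map and fails loudly on an unknown name, so a typo in the prompt template is caught early. A hypothetical sketch, not the talk's code:

    ```javascript
    // Fill every {name} placeholder in a template from a values map.
    // Throws if the template references a name the map does not provide.
    function fillTemplate(template, values) {
      return template.replace(/\{(\w+)\}/g, (match, name) => {
        if (!(name in values)) throw new Error(`Missing template value: ${name}`);
        return String(values[name]);
      });
    }
    ```

    Unlike chained string `.replace` calls, which only substitute the first occurrence of each placeholder, the global regex here fills repeated placeholders too.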
  23. ! Evals LLM evals are automated tests that measure how

    reliably a model behaves across real scenarios so you can ship LLM features with confidence.
  24. ! My current process 1. Generate a lot of outputs with different inputs and analyze them. 2. Build a couple of tests with something like the VCR gem. 3. Record LLM interactions and review them manually with a UI. 4. Adjust the prompt and go through steps 1-2. 5. The process depends on the specific feature. ...this takes time, tries, and tokens ($$$)
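    Step 1 of the process above, generating lots of outputs and checking them, can be automated with a tiny harness that runs input cases through a feature and scores each output. A hedged sketch with invented names; `runFeature` stands in for the real LLM-backed call:

    ```javascript
    // Run each eval case through the feature and score it with its checker.
    // A case is { input, check }, where check(output) returns true on pass.
    async function runEvals(cases, runFeature) {
      const results = [];
      for (const { input, check } of cases) {
        const output = await runFeature(input);
        results.push({ input, output, passed: check(output) });
      }
      const passed = results.filter((r) => r.passed).length;
      return { results, passRate: passed / results.length };
    }
    ```

    In practice `runFeature` would call the LLM (or replay a recorded interaction, VCR-style), and the pass rate is tracked across prompt revisions.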
  26. ! Recap 0/ How LLMs work 1/ RAG 2/ Levels of LLM integrations: single feature, workflow, agent 3/ Context 4/ Evals