Summaraizer - Lessons learned along the way

Given at AI Dev Day with AI Hub FFM 2024

"How can AI help me in my day-to-day job?" I asked myself.

The question was answered quite quickly the next day when I had to read through a 22-comment-long discussion on GitHub, our developer platform of choice at ioki. I thought about adding a comment like "AI, please summarize!" to that discussion to get an AI-generated summary of it.

That idea became a reality, and we built a small helper tool called summaraizer that does exactly that.

Along the way, we discovered a few interesting things about Large Language Models (LLMs) such as:
* What is a token limit, why does it matter, and how can it be addressed?
* Why does an LLM stream data?
* How to instruct the model to summarize a series of comments?
* And why doesn't the model always follow the instructions given?
* Why do various model types exist, and how do they differ from one another?

In this talk, I want to provide a brief overview of summaraizer and address some of the questions that arose during its development.

Stefan M.

June 13, 2024

Transcript

  1. StefMa.guru Stefan May Android Developer since 2014 Principal Android Developer

    @ioki since 2020 github.com/@StefMa StefMa.medium.com x.com/StefMa91
  2. StefMa.guru Stefan May Android Developer since 2014 Principal Android Developer

    @ioki since 2020 github.com/@StefMa StefMa.medium.com x.com/StefMa91 ki = künstliche intelligenz = artificial intelligence
  3. Summaraizer 👉 https://github.com/ioki-mobility/summaraizer

     Go, CLI and Module
     Supports Multiple Sources (GitHub, Reddit, GitLab, more to come)
     Supports Multiple Providers (Ollama, OpenAI, Mistral, more to come)
     👉 https://github.com/ioki-mobility/summaraizer-action
     JavaScript, Supports Multiple Providers
  4. Tokens

     Model    | Token (context) window
     gpt4o    | 128,000
     Llama3   | 8,000
     Claude 3 | 200,000
     Gemini   | 1,000,000 ("soon" 2,000,000)
  5. Tokens Example: Token window: 5 tokens (5 * 4 ~= 20 chars)

     Input: Why is the sky blue? (20 chars, "5 tokens")
     Output: (0 chars left, "0 tokens")
  6. Tokens Example: Token window: 5 tokens (5 * 4 ~= 20 chars)

     Input: The sky is (10 chars, "2.5 tokens")
     Output: blue! (5 chars, "1.25 tokens")
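The ~4-characters-per-token rule of thumb used in the two examples above can be sketched in Go. Note this is only an approximation for illustration; real BPE tokenizers produce different counts per model.

```go
package main

import "fmt"

// estimateTokens approximates the token count of a text using the
// common ~4 characters per token rule of thumb from the slides.
func estimateTokens(text string) float64 {
	return float64(len(text)) / 4.0
}

func main() {
	fmt.Println(estimateTokens("Why is the sky blue?")) // 20 chars -> 5 tokens
	fmt.Println(estimateTokens("The sky is"))           // 10 chars -> 2.5 tokens
}
```

A check like this is useful before sending a prompt, to decide whether the input still fits into the model's token window or needs to be chunked first.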
  7. Tokens

     Stuffing: Just put all the data in (and hope for the best)
     MapReduce: Summarize chunks of the data and put all the summaries into a final prompt
  8. Tokens

     Stuffing: Just put all the data in (and hope for the best)
     MapReduce: Summarize chunks of the data and put all the summaries into a final prompt
     Refine: Summarize a chunk of data, then put that summary plus the next chunk into the prompt, until your data ends
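The MapReduce and Refine strategies above can be sketched as follows. This is a minimal illustration, not summaraizer's actual code: the `summarize` function is a stand-in for a real LLM call (e.g. to Ollama or OpenAI) that here just truncates, so the chunking logic is runnable on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// summarize stands in for a real LLM call; it truncates to 20 bytes
// to mimic "returns something shorter than its input".
func summarize(text string) string {
	if len(text) > 20 {
		return text[:20]
	}
	return text
}

// mapReduce summarizes each chunk independently, then summarizes
// the concatenated partial summaries in one final prompt.
func mapReduce(chunks []string) string {
	var partials []string
	for _, c := range chunks {
		partials = append(partials, summarize(c))
	}
	return summarize(strings.Join(partials, "\n"))
}

// refine folds each new chunk into the running summary, so every
// prompt holds only the previous summary plus one chunk.
func refine(chunks []string) string {
	summary := ""
	for _, c := range chunks {
		if summary == "" {
			summary = summarize(c)
			continue
		}
		summary = summarize(summary + "\n" + c)
	}
	return summary
}

func main() {
	comments := []string{
		"Why is the sky blue?",
		"I actually don't know.",
		"Because of Rayleigh scattering.",
	}
	fmt.Println(mapReduce(comments))
	fmt.Println(refine(comments))
}
```

Both strategies keep every individual prompt under the token window; MapReduce can run its chunk summaries in parallel, while Refine is sequential but preserves more cross-chunk context.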
  10. Streaming Example: Input: The sky is → Tokenizer → Neural Network

      (next) Token | Probability
      blue         | 0.9
      nice         | 0.4
      dog          | 0.1
  11. Streaming Example: Input: The sky is → Tokenizer → Neural Network

      (next) Token | Probability
      blue         | 0.9
      nice         | 0.4
      dog          | 0.1
      → Greedy decoding
  12. Streaming Example: Input: The sky is blue → Tokenizer → Neural Network

      (next) Token | Probability
      because      | 0.7
      AI           | 0.1
      frankfurt    | 0.2
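The greedy decoding step shown on the streaming slides, picking the candidate next token with the highest probability, can be sketched like this (probabilities hardcoded from the example):

```go
package main

import "fmt"

// nextToken implements greedy decoding: of all candidate next
// tokens, pick the one with the highest probability.
func nextToken(probs map[string]float64) string {
	best, bestP := "", -1.0
	for tok, p := range probs {
		if p > bestP {
			best, bestP = tok, p
		}
	}
	return best
}

func main() {
	// Candidates after "The sky is", as on the slide.
	fmt.Println(nextToken(map[string]float64{"blue": 0.9, "nice": 0.4, "dog": 0.1})) // prints "blue"
	// Candidates after "The sky is blue".
	fmt.Println(nextToken(map[string]float64{"because": 0.7, "AI": 0.1, "frankfurt": 0.2})) // prints "because"
}
```

Because each token is produced by one pass of this loop, the model can stream tokens to the client as soon as they are chosen, instead of waiting for the full answer.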
  14. Prompting How to separate comments? Good old <HTML> Solution: Separate comments using enclosing tags

      Example:
      <comment>Why is the sky blue?</comment>
      <comment>I actually don't know. Maybe ask @john</comment>
      <comment>The sky is blue because…</comment>
  15. Model variants

      llama3:latest llama3:70b llama3:8b
      mistral:7b mistral:instruct
      gemini:pro gemini:flash
      gemma:2b gemma:7b gemma:2b-instruct gemma:text
      codellama:[7b|13b|34b|70b]
      codegemma:[2b|instruct|code]
  17. Model variants [model]:[x]b

      More parameters: are "better" at a variety of tasks, use more resources, are slower, tend to have a bias on (a) topic(s)
      Fewer parameters: might be "optimized" for a specific task, use fewer resources, are faster, might not have a bias on (a) topic(s)
  18. Model variants [model]:[text|instruct|...]

      Text: Is optimized for general text processing like translations, text summarization, or text generation
      Instruct: Is optimized for responding with completions for a specific instruction