Upgrade to Pro — share decks privately, control downloads, hide ads and more …

From Zero to Production: Build Your Own GenAI S...

From Zero to Production: Build Your Own GenAI Solution.

Thanks to powerful frameworks and libraries, the first Generative AI applications can be realized at Hello World level with just a few lines of code. However, these first attempts also reach their limits just as quickly. Why? Because reality presents challenges that cannot be solved easily with this trivial approach. But what is needed for a Generative AI application at enterprise level? A well designed Generative AI architecture!
During this full day hands-on workshop, we will build a complex Generative AI application step by step. Starting with a minimalist RAG system, we will look at various challenges and discuss suitable solutions. In the end, we will have an architecture that can withstand the challenges of reality.

Lars Roewekamp

November 22, 2024
Tweet

More Decks by Lars Roewekamp

Other Decks in Technology

Transcript

  1. Generative AI From Zero to Production: Build your own GenAI

    solution #WISSENTEILEN powered by Lars Röwekamp | Tim Wüllner | open knowledge GmbH Hands-on Workshop
  2. This is not a cat fit Neural Network Cat Cat

    Cat Cat Dog Dog Neural Networks aka traditional AI This is a cat
  3. infer trained Neural Network with pattern recognition for pictures of

    cats Neural Networks aka traditional AI Is this a cat?
  4. Generative AI Model Prompting What is a cat? A cat

    is* … and looks like this. *based on my world knowledge Transformer Models aka generative AI
  5. Generative AI vs traditional Neural Networks Predictive ML Model Generative

    Model learns / recognises relationships between data and label learns / understands connections in unstructured data Label new Content Label Data unstructured Data
  6. Generative AI under the Hood Large Language Model Mary had

    a … Large Language Model Mary had a little … Next Token red little delicious … Probability 0.2 0.4 0.1 … Next Token mouse lamb kettle … Probability 0.2 0.3 0.2 …
  7. 1. queen – woman + man = king 2. doctor

    – man + woman = nurse Male-Female Verb Tense Country-Capital Generative AI under the Hood
  8. Transformer (LLM Layer) John wants his bank to cash the

    … ? word2vec style Vectors Generative AI under the Hood
  9. Transformer (LLM Layer) John wants his bank to cash the

    … ? John wants his bank to cash the … ? (verb) (verb) word2vec style Vectors context aka hidden state Generative AI under the Hood
  10. Transformer (LLM Layer) John wants his bank to cash the

    … ? John wants his bank to cash the … ? (verb) (verb) word2vec style Vectors context aka hidden state Key-Vector: btw I‘am a noun describing a male person. Query-Vector: btw I‘am seeking for a noun describing a male person. Generative AI under the Hood
  11. Transformer (LLM Layer) John wants his bank to cash the

    … ? word2vec style Vectors Transformer (LLM Layer) context aka hidden state John wants his bank to cash the … ? (verb) (verb) (John‘s) (finance) John wants his bank to cash the … ? (verb) (verb) (male) ( ... ) Generative AI under the Hood
  12. Transformer (LLM Layer) John wants his bank to cash the

    … ? word2vec style Vectors John (main character, male, married to Cheryl, cousin of Donald, from Minnesota, currently in Boise, … ) wants his bank to cash the … ? Generative AI under the Hood
  13. Transformer (LLM Layer) John wants his bank to cash the

    … ? word2vec style Vectors Transformer (LLM Layer) context aka hidden state John wants his bank to cash the … (verb) (verb) (John‘s) (finance) John wants his bank to cash the … ? (verb) (verb) (male) Enough context information to be able to „guess“ the next word. ( ... ) Generative AI under the Hood check
  14. Generative AI under the Hood Large Language Model Mary had

    a … Large Language Model Mary had a little … Next Token red little delicious … Probability 0.2 0.4 0.1 … Next Token mouse lamb kettle … Probability 0.2 0.3 0.2 …
  15. Generative AI by Example Health Care Enhancing medical images Discovering

    new drugs* Simplifying medical tasks Persionalized treatment *via Generative Design
  16. Generative AI by Example Advertising & Marketing Generate marketing content

    Personalized recommendations Create product descriptions Enhance SEO
  17. Generative AI by Example Media & Entertainment Create audio &

    video content Generate highlights for events Manage tags for better CM
  18. Generative AI by Example Financial Services Create investment strategies Communicate

    & educate clients Quickly draft information Monitor regulatory activities
  19. Generative AI by Example Software Development Generate code Automate testing

    Translate prog. Language Explain software soultions
  20. GenAI- Model Prompt GenAI Basics Model Selection Prompt Engineering 2

    1 „What is the most beautiful holiday destination?“
  21. GenAI Hands-on What are you going to learn during this

    lesson: • Create your first Jupyter Notebook in Google Colab cloud environment. • Access a GenAI model with customised prompts via provider-specific API.
  22. GenAI- Model Prompt GenAI Basics Model Selection Prompt Engineering 2

    Model Selection 1 Which MODEL? Which PARAMETERS? Which INFRASTRUCTURE? Which LICENCE? Which COST MODEL? Which GOVERNANCE? „What is the most beautiful holiday destination?“
  23. GenAI Basics Model Selection „ What does an AI benchmark

    measure?“ • General Kowledge • Programming • Mathematics • ... Official benchmarks ars often part of the training data! • Creativity • Factual Knowledge • Logic Thinking
  24. GenAI Basics Model Selection aka „How do I find the

    right model for me?“ • for my PROBLEM • for my DATA • for my CONTEXT • for my BUDGET
  25. GenAI Basics Model Selection aka „How do I find the

    right model for me?“ Public, Private, Sizes, Purpose Availability
  26. GenAI Basics Model Selection aka „How do I find the

    right model for me?“ Public, Private, Sizes, Purpose Data Privacy, Security, Explainability Availability Governance
  27. GenAI Basics Model Selection aka „How do I find the

    right model for me?“ Public, Private, Sizes, Purpose Data Privacy, Security, Explainability Complexity, Business Area Availability Governance Use Case
  28. GenAI Basics Model Selection aka „How do I find the

    right model for me?“ Public, Private, Sizes, Purpose Data Privacy, Security, Explainability Complexity, Business Area Latency, Cost, Customizeability Availability Governance Use Case Performance
  29. GenAI Basics Model Selection aka „How do I find the

    right model for me?“ Public, Private, Sizes, Purpose Data Privacy, Security, Explainability Complexity, Business Area Latency, Cost, Customizeability Traininig, Runtime, Availability Team Skills, Computational Power Availability Governance Use Case Performance Data Resources „Know Your Use-Case!“ *if you want to be successful.
  30. GenAI- Model Prompt GenAI Basics Model Integration Prompt Engineering 2

    Model Selection 1 Suitable model found: But how do I address it from my application? „What is the most beautiful holiday destination?“
  31. Model A-small Model Provider model-name model-parameter provider-parameter prompt Model A-big

    Model B e.g. openAI, Google, … Provider Client API GenAI Basics Model Integration „What is the most beautiful holiday destination?“ Reminder: 00-hello-genai.ipynb
  32. Model X mini Model Provider II model-name model-parameter provider-parameter prompt

    Model Y Model Y ++ e.g. openAI, Google, … Provider Client API GenAI Basics Model Integration „What is the most beautiful holiday destination?“
  33. Model X mini Model Provider II Provider II Client API

    model-name model-parameter provider-parameter prompt Model Y Model Y ++ e.g. openAI, Google, … GenAI Basics Model Integration „What is the most beautiful holiday destination?“
  34. Hub-Provider Client API model-name model-parameter hub-provider-parameter prompt e.g. Hugging-Face Provider-II

    hosted Models Provider-I hosted Models Hub-hosted Models* Hub-Provider Dispatcher *Provider X, fine-tuned or adapted, own GenAI Basics Model Integration „What is the most beautiful holiday destination?“ Cloud based Resources
  35. Hub-Provider Client API model-name model-parameter hub-provider-parameter prompt e.g. Hugging-Face Provider-II

    hosted Models Provider-I hosted Models Hub-hosted Models* Hub-Provider Dispatcher *Provider X, fine-tuned or adapted, own GenAI Basics Model Integration „What is the most beautiful holiday destination?“ cloud-not-allowed exception
  36. Local-Hub Provider Client API model-name model-parameter hub-provider-parameter prompt e.g. GPT4all,

    ollama Local hosted Models (Provider III) Local hosted Models (Provider II) Local hosted Models (own or adaptedI) Local-Hub Provider Dispatcher Local Machine GenAI Basics Model Integration „What is the most beautiful holiday destination?“
  37. GenAI Basics Model Adaption „ What additional levers do I

    have?“ • Temperature degree of ‘fantasy‘ • Max Tokens length of answer • Top K selection of hits from the top K hits • Top P selection of hits from the top P per cent • Presence Penalty avoid repetition • Frequence Penalty avoid repetition (weighted edition)
  38. GenAI- Model Prompt Model Selection Prompt Engineering 2 1 GenAI

    Basics Prompt Engineering „What is the most beautiful holiday destination?“
  39. GenAI- Model Prompt Model Selection Prompt Engineering 2 1 Parts

    of a Prompt System vs User Prompt Principles of Prompting Chain-of-Thought Prompts Few-Shot Learning GenAI Basics Prompt Engineering „What is the most beautiful holiday destination?“
  40. GenAI Basics Prompt Engineering Role Instruction Example 1 Context Example

    n Question Who am I? What is my intention? What are helpful examples? Are there any additional information? BTW: what is the task I ask for? What should a good prompt should look like?
  41. GenAI Basics Prompt Engineering Role Instruction Example 1 Context Example

    n Question What is your role as an assistant? What is my intention? What are helpful examples? Are there any additional information? BTW: what is the task I ask for? What should a good prompt should look like?
  42. GenAI Basics Good Prompt vs. Bad Prompt You I want

    to cook something. LR Topic: Recipe Recommendations
  43. GenAI Basics Good Prompt vs. Bad Prompt You I want

    to cook something. LR You Acting as an expert home cook, for someone who enjoys vegetarian Italian food and has only 30 minutes to prepare dinner, could you recommend a recipe including a list of ingredients and step-by- step instructions? LR Output Example Context Question Role Acting as an expert home cook, for someone who enjoys vegetarian Italian food and has only 30 minutes to prepare dinner, could you recommend a recipe including a list of ingredients and step-by-step instructions? You could point at recipes you like from the BBC’s Good Food guide, providing URL’s to recipes you love. Topic: Recipe Recommendations
  44. GenAI Basics Good Prompt vs. Bad Prompt You I want

    to learn something new. LR You Acting as a coding instructor, for a beginner with a goal to learn Python within 4 weeks, please provide a learning plan including resources and a weekly schedule for 10 hours per week. LR Output Example Context Question Role Acting as a coding instructor, for a beginner with a goal to learn Python within 4 weeks, Please provide a learning plan, including resources and a weekly schedule for 10 hours per week Point it at courses you’ve done in the past that you’ve liked. Tell it you like to learn by reading books, or websites, or by watching videos, or a mixture of both. Topic: Learning new skills
  45. GenAI Basics System Prompt vs. User Prompt System Assistant User

    „You“ Chat-Model Behaviour of Assistant
  46. GenAI Basics System Prompt vs. User Prompt System Assistant User

    ‘You're a friendly assistant.’ „You“ Chat-Model Behaviour of Assistant
  47. GenAI Basics System Prompt vs. User Prompt System Assistant User

    ‘You're a friendly assistant and always keep things brief.’ ‚Which is the most beautiful city in the world?‘ ‚Which is the most beautiful city in the world?‘ ‘You are a philosopher and have a tendency to be verbose.’ ??? ??? „You“ Chat-Model Behaviour of Assistant
  48. GenAI Basics System Prompt vs. User Prompt System-Prompt: ‘You're a

    friendly assistant and always keep things brief.’ Assistant: 65 Tokens
  49. GenAI Basics System Prompt vs. User Prompt System-Prompt: ‘You are

    a philosopher and have a tendency to be verbose.’ Assistant: 434 Tokens
  50. GenAI Basics Prompt Engineering You Brainstorm a list of product

    names for a shoe that fits any foot size, in the style of Steve Jobs. Return the result as a comma separated list, in this format: Product description: A shoe that fits any foot size Product name: [list of 3 names] ## Examples Product description: A refrigerator that dispenses beer Product names: iBarFridge, iFridgeBeer, iDrinkBeerFridge Product description: A watch that can tell accurate time in space Product names: iNaut, iSpace, iTime ChatGPT Product description: A shoe that fits any foot size Product names: iFitFoot, iPerfectFit, iShoeSize <10/10> LR Prompt Engineering Principles Source: Prompt Engineering for Generative AI, James Phoenix & Mike Taylor, O‘Reilly 2024
  51. GenAI Basics Prompt Engineering You Brainstorm a list of product

    names for a shoe that fits any foot size, in the style of Steve Jobs. Return the result as a comma separated list, in this format: Product description: A shoe that fits any foot size Product name: [list of 3 names] ## Examples Product description: A refrigerator that dispenses beer Product names: iBarFridge, iFridgeBeer, iDrinkBeerFridge Product description: A watch that can tell accurate time in space Product names: iNaut, iSpace, iTime ChatGPT Product description: A shoe that fits any foot size Product names: iFitFoot, iPerfectFit, iShoeSize <10/10> LR 1. Give Direction Describe the desired style in detail, or reference a relevant persona.
  52. GenAI Basics Prompt Engineering You Brainstorm a list of product

    names for a shoe that fits any foot size, in the style of Steve Jobs. Return the result as a comma separated list, in this format: Product description: A shoe that fits any foot size Product name: [list of 3 names] ## Examples Product description: A refrigerator that dispenses beer Product names: iBarFridge, iFridgeBeer, iDrinkBeerFridge Product description: A watch that can tell accurate time in space Product names: iNaut, iSpace, iTime ChatGPT Product description: A shoe that fits any foot size Product names: iFitFoot, iPerfectFit, iShoeSize <10/10> LR 1. Give Direction Describe the desired style in detail, or reference a relevant persona. 2. Specify Format Define the rules to follow, and the required structure to respond.
  53. GenAI Basics Prompt Engineering You Brainstorm a list of product

    names for a shoe that fits any foot size, in the style of Steve Jobs. Return the result as a comma separated list, in this format: Product description: A shoe that fits any foot size Product name: [list of 3 names] ## Examples Product description: A refrigerator that dispenses beer Product names: iBarFridge, iFridgeBeer, iDrinkBeerFridge Product description: A watch that can tell accurate time in space Product names: iNaut, iSpace, iTime ChatGPT Product description: A shoe that fits any foot size Product names: iFitFoot, iPerfectFit, iShoeSize <10/10> LR 1. Give Direction Describe the desired style in detail, or reference a relevant persona. 2. Specify Format Define the rules to follow, and the required structure to respond. 3. Provide Examples Insert a deverse set if test cases where the task was done correctly.
  54. GenAI Basics Prompt Engineering You Brainstorm a list of product

    names for a shoe that fits any foot size, in the style of Steve Jobs. Return the result as a comma separated list, in this format: Product description: A shoe that fits any foot size Product name: [list of 3 names] ## Examples Product description: A refrigerator that dispenses beer Product names: iBarFridge, iFridgeBeer, iDrinkBeerFridge Product description: A watch that can tell accurate time in space Product names: iNaut, iSpace, iTime ChatGPT Product description: A shoe that fits any foot size Product names: iFitFoot, iPerfectFit, iShoeSize <10/10> LR 1. Give Direction Describe the desired style in detail, or reference a relevant persona. 2. Specify Format Define the rules to follow, and the required structure to respond. 3. Provide Examples Insert a deverse set if test cases where the task was done correctly. 4. Evaluate Quality Identify errors and rate responses, Testing what drives performance.
  55. GenAI Basics Prompt Engineering You Brainstorm a list of product

    names for a shoe that fits any foot size, in the style of Steve Jobs. Return the result as a comma separated list, in this format: Product description: A shoe that fits any foot size Product name: [list of 3 names] ## Examples Product description: A refrigerator that dispenses beer Product names: iBarFridge, iFridgeBeer, iDrinkBeerFridge Product description: A watch that can tell accurate time in space Product names: iNaut, iSpace, iTime ChatGPT Product description: A shoe that fits any foot size Product names: iFitFoot, iPerfectFit, iShoeSize <10/10> LR 1. Give Direction Describe the desired style in detail, or reference a relevant persona. 2. Specify Format Define the rules to follow, and the required structure to respond. 3. Provide Examples Insert a deverse set if test cases where the task was done correctly. 4. Evaluate Quality Identify errors and rate responses, Testing what drives performance. You Please rate the product names based on their catchines, uniqueness, simplicity. Rate them on a scale from 1 to 5, with being 5 the highest score. Respond only with a table containing the results. LR ChatGPT Product Name Catchiness Uniqueness Simplicity iFitFoot 4 4 4 iPerfectFit 5 4 3 iShoeSize 3 3 5 5. Devide Labor Split the task into multiple steps, chained together for complex goals.
  56. GenAI Basics Prompt Engineering You Question: Roger has 5 tennis

    balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have? Answer: The answer is 11. Question: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have? ChatGPT The answer is 27. [NOT CORRECT] LR Chain of Thoughts Pattern
  57. GenAI Basics Prompt Engineering You Question: Roger has 5 tennis

    balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have? Answer: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11. Question: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have? ChatGPT A cafeteria had 23 apples originally. They used 20 to make lunch. So they had 23 - 20 = 3. They bought 6 more apples, so they have 3 + 6 = 9. The answer is 9. [CORRECT] LR ‘Instead of directly asking the language model to solve the problem, we use a chain of thought as a prompt to guide the model through intermediate steps.’
  58. GenAI Basics Prompt Engineering You Please choose the best explanation

    to the question: Question: How is snow formed? Explanation1: Snow is formed when water vapor in the air freezes into ice crystals in the atmosphere, which can combine and grow into snowflakes as they fall through the atmosphere and accumulate on the ground. Explanation2: Water vapor freezes into ice crystals forming snow. Answer: ChatGPT Explanation 1 is the better explanation. It provides a more detailed and complete description of the process. It explains that water vapor freezes into ice crystals in the atmosphere and how these crystals can combine to form snowflakes as they fall through the air before accumulating on the ground. This explanation gives a clearer understanding of the formation process of snow compared to the shorter Explanation 2. LR Few-Shot Learning Pattern ???
  59. GenAI Basics Prompt Engineering You Please choose the best explanation

    to the question: Question: Why is sky blue? Explanation1: The sky appears blue because of Rayleigh scattering, … making the sky look blue. Explanation2: Due to Rayleigh scattering effect. Answer: Explanation2 Question: What is the cause of earthquakes? Explanation1: Sudden release of energy in the Earth's crust. Explanation2: Earthquakes happen when tectonic plates suddenly slip or break apart, …. cause damage. Answer: Explanation1 Question: How is snow formed? Explanation1: Snow is formed when water vapor in the air freezes into ice crystals in the atmosphere, … on the ground. Explanation2: Water vapor freezes into ice crystals forming snow. Answer: ChatGPT Explanation 2. LR ‘Instead of feeding a model with lots and lots of data, we give it a few examples for orientation.’ Beispiel 2 Beispiel 1 ???
  60. GenAI Hands-on What are you going to learn during this

    lesson: • Promp with a dedicated system prompt. • Apply prompting patterns and best practives. • control genAI completion via model parameters.
  61. GenAI- Model Prompt GenAI Basics myDomain + „Leave regulation at

    open knowledge“ „How to apply for vacation at open knowledge?“
  62. GenAI- Model Prompt GenAI Basics myDomain + „Leave regulation at

    open knowledge“ „How to apply for vacation at open knowledge?“
  63. GenAI- Model Prompt GenAI Basics myDomain + „Leave regulation at

    open knowledge“ „How to activate the alarm system at open knowledge?“
  64. GenAI- Model Prompt GenAI Basics myDomain + + „How to

    activate the alarm system at open knowledge?“ „Leave regulation + Alarm system usage at open knowledge“
  65. GenAI- Model Prompt GenAI Basics myDomain + „All internal knowledge

    (Wiki, DB, …) at open knowledge“ „How to activate the alarm system at open knowledge?“
  66. GenAI- Model Prompt GenAI Basics myDomain + ERROR: Token Limit

    exceeded Output Size Context Window Size „How to activate the alarm system at open knowledge?“ „All internal knowledge at open knowledge“
  67. GenAI- Model Prompt GenAI Basics myDomain + ERROR: Out-of-Budget Exception

    Context Window Size „All internal knowledge at open knowledge“ „How to activate the alarm system at open knowledge?“
  68. GenAI- Model Prompt GenAI Basics myDomain + ERROR: Lost-in-the-Middle Context

    Window Size „All internal knowledge at open knowledge“ „How to activate the alarm system at open knowledge?“
  69. GenAI Basics myDomain „What are Token Limits?“ • Context Window

    Size • Output Size • long Questions are expensive • long Answers are even more expensive
  70. Source: GenAI Pricing on Medium.com, Ragesh Gohel, March 2024 128.000

    Tokens? Sounds good! But usually is too little for myDomain! 128.000 tokens correspond to approx. 350 book pages Harry Potter has > 1 Mio. words on over 6.000 Pages!
  71. GenAI- Model Prompt GenAI Advanced myDomain + ERROR: Token Limit

    exceeded! Output Size Context Window Size „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  72. myGenAI Model Prompt GenAI Advanced myDomain „All internal knowledge at

    open knowledge“ „How to apply for vacation at open knowledge?“
  73. myGenAI Model Prompt GenAI Advanced myDomain Option 1: Build own

    Model Option 2: Fine-tune existing Model ERROR: Way to expensive ERROR: Way to complex WARNING: Out-of-Sync „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  74. Prompt GenAI Advanced myDomain GenAI- Model SOME MAGIC „MAGIC Enrichment“

    „How to apply for vacation at open knowledge?“
  75. Prompt GenAI- Model GenAI Advanced myDomain SOME MAGIC „How to

    apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  76. Prompt Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever GenAI- Model

    GenAI Advanced myDomain „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  77. „How to apply for vacation at open knowledge?“ Prompt Ingesting-

    Pipeline Knowledge Database Retriever GenAI Advanced myDomain R etrieval Augmented Generation „All internal knowledge at open knowledge“
  78. „How to apply for vacation at open knowledge?“ Prompt Ingesting-

    Pipeline Knowledge Database Retriever GenAI Advanced RAG Systems R etrieval Augmented Generation „All internal knowledge at open knowledge“
  79. Prompt Ingesting- Pipeline Knowledge Database Retriever GenAI Advanced RAG Systems

    R etrieval Augmented Generation „All internal knowledge at open knowledge“ „How to apply for vacation at open knowledge?“
  80. GenAI Advanced RAG Systems GenAI Model Ingesting-Pipeline (Async.) Retriever Chunking

    Embedding Retrieval-Pipeline (Sync.) Query- Embedding Top-k Fetching Augmenting UI/UX „How to apply for vacation at open knowledge?“
  81. „It will soon be the second anniversary of the release

    of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“ Document (Doc. Id:1) Chunking Embedding Ingesting Pipeline by Example
  82. [„ It will soon be the second anniversary of the

    release of ChatGPT. There is still no end in sight to the hype.“] [„ The added value of this technology is too great and its use in the form of a chat is too intuitive.“] Document (Doc. Id:1) Chunking Embedding Ingesting Pipeline by Example „It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“ [ … ]
  83. Chunking Embedding [-0.24, 0.15, …, 0.52] [-0.13, 0.95, …, 0.31]

    […] Ingesting Pipeline by Example [„ It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype.“] [„ The added value of this technology is too great and its use in the form of a chat is too intuitive.“] Document (Doc. Id:1) „It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“ [ … ]
  84. Id Doc. Id Embedding 1 1 [-0.24, 0.15, …, 0.52]

    2 1 [-0.13, 0.95, …, 0.31] […] […] […] Chunking Embedding [-0.24, 0.15, …, 0.52] [-0.13, 0.95, …, 0.31] […] Ingesting Pipeline by Example [„ It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype.“] [„ The added value of this technology is too great and its use in the form of a chat is too intuitive.“] Document (Doc. Id:1) „It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“ [ … ]
  85. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting ‘What is

    the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  86. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting [0.35, 0.25,

    …, 0.62] ‘What is the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  87. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting Embeddings from

    knowledge base Embeddings Dimension 1 Embed. Dim. 2 Top-k=1 document(s) [0.35, 0.25, …, 0.62] ‘What is the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  88. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting Embeddings from

    knowledge base Embeddings Dimension 1 Embed. Dim. 2 Top-k=1 document(s) [0.35, 0.25, …, 0.62] & „ What is the lecture […] based on the following context: ‘An upstream retrieval process to pull in relevant information from our own dynamic database can provide well-founded answers.’ ‘What is the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  89. GenAI Hands-on What are you going to learn during this

    lesson: • Implementing a RAG system based on a ingestion and a retrival pipeline. • Apply the RAG system to answer domain specific questions.
  90. Retrieval Augmented Knowledge Database Retriever Generation Ingesting- Pipeline GenAI Advanced

    RAG Systems „How to apply for vacation at open knowledge?“
  91. Retrieval Augmented Knowledge Database Retriever Generation Ingesting- Pipeline GenAI Advanced

    RAG Systems „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  92. Retrieval Augmented Knowledge Database Retriever Generation Ingesting- Pipeline GenAI Professional

    Guardrails „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  93. Retrieval Augmented Knowledge Database Retriever Generation Guarding Output Ingesting- Pipeline

    GenAI Professional Guardrails Output Profanity Check „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  94. Retrieval Augmented Knowledge Database Retriever Generation Guarding Output Ingesting- Pipeline

    GenAI Professional Guardrails „Unfortunately, I can't answer that!“ Output Profanity Check „…. Lazy piece! …“ „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  95. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guardrails Guarding Topic Check „Which Netflix series do you recommend?“
  96. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guardrails Guarding Topic Check „Which Netflix series do you recommend?“ „Sorry, but this does not fall within my area of responsibility.“
  97. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guardrails Output-Governance: Toxic Speech, Fact-Checking, Ethical-Guidelines, Cross-Validation, … Input-Evaluation: Prompt-Injection, Jailbreak-Attemp, Topic-Filter, PII-Cleansing, Rate-Limiting,… Guarding
  98. GenAI Hands-on What are you going to learn during this

    lesson: • Implement an input and output guardrail to detect and avoid toxic speech. • Implement fact checker guradrails to detect and avoid hallucination.
  99. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guarding What else is missing for a professional and productive system?
  100. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    RAG-Monitoring Cost- Monitoring Model- Monitoring Response-Quality- Monitoring GenAI Professional Monitoring Guarding
  101. Retrieval Augmented Generation GenAI Professional Response Quality Monitoring https://docs.confident-ai.com/docs/metrics-llm-evals G-Eval

    Summerization Answer Relevancy Faithfulness Summerization Hallucination Contextual Recall Contextual Relevancy Contextual Precision Relevant Context Generated Answer User Input
  102. Retrieval Augmented Generation User Input GenAI Professional Response Quality Monitoring

    Faithfulness Is the supported by the ? Relevant Context Generated Answer
  103. Retrieval Augmented Generation User Input GenAI Professional Response Quality Monitoring

    Does the match the ? Relevant Context Generated Answer Answer Relevancy
  104. GenAI Professional Response Quality Monitoring Step 1: Generate possible questions

    ( ) from Embeddings Dimension 1 Embed. Dim. 2 Answer Relevancy Does the match the ?
  105. GenAI Professional Response Quality Monitoring Step 1: Generate possible questions

    ( ) from Step 2: Determine the average distance from Embeddings Dimension 1 Embed. Dim. 2 Answer Relevancy Does the match the ?
  106. Best Practices What are the typical pitfalls that I will

    run into - and how do I deal with them?
  107. Retrieval Augmented Knowledge Database Retriever Generation Guarding Input Output Ingesting-

    Pipeline RAG-Monitoring Cost- Monitoring Model- Monitoring Response-Quality- Monitoring GenAI Best Practices Real Life Survival Guide
  108. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Hallucination (via Faithfulness / Answer Relevancy) GenAI Best Practices Real Life Survival Guide
  109. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Missing Data GenAI Best Practices Real Life Survival Guide
  110. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Missing Data GenAI Best Practices Real Life Survival Guide Chunk not found
  111. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Missing Data GenAI Best Practices Real Life Survival Guide Chunk not found Chunk not relevant
  112. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: RAG does not know the required data.
  113. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: RAG does not know the required data. Solution : Implement a suitable ingestor.
  114. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: Snapshot of the data is not up-to-date.
  115. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: Snapshot of the data is not up-to-date. Solution: Interface for ‘continuous’ data sync.
  116. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found.
  117. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #1: Adapt chunk-sizes and overlapping.
  118. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #2: Choose a different splitter (e.g. Semantic-Splitter).
  119. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #3: Optimize the distance threshold or k from top-k.
  120. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #4: Selection of a more suitable embedding model.
  121. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #5: Hypothetical document embeddings (HyDE) Input HyDE
  122. Guardrails Input Output Gao et al. (2023) HyDE = Hypothetical

    Document Embeddings Real Life RAG #2 Chunk-not-found Challenge HyDE
  123. Real Life RAG #3 Chunk-not-relevant Challenge Chunk not relevant GenAI

    Model Query- Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.)
  124. Real Life RAG #3 Chunk-not-relevant Challenge Problem: Chunk is not

    considered to be relevant by the model. Chunk not relevant GenAI Model Query- Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.)
  125. Real Life RAG #3 Chunk-not-relevant Challenge Chunk not relevant Query-

    Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.) Problem: Chunk is not considered to be relevant by the model. Solution #1: Select a more suitable model for the use case. GenAI Model
  126. Real Life RAG #3 Chunk-not-relevant Challenge Chunk not relevant Query-

    Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.) GenAI Model Problem: Chunk is not considered to be relevant by the model. Solution #2: Add additional context to the chunk (small-to-big). Chunk
  127. GenAI Model UI/UX Guardrails Input Output Ingesting-Pipeline (Async.) Chunking Embedding

    Retrieval-Pipeline (Sync.) Query- Embedding Top-k Fetching Augmenting
  128. GenAI Hands-on What are you going to learn during this

    lesson: • Surviving the RAG jungle: Applying patterns and best practices for content ingestion and retrieval.
  129. Prompt Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever GenAI- Model

    GenAI Outlook What else is there to consider? „All internal knowledge at open knowledge“ „How to apply for vacation at open knowledge?“
  130. UI/UX Enterprise Integration „Comment demander des vacances chez ok?“ Retrieve

    Augment Ingesting- Pipeline Retriever GenAI- Model GenAI Outlook Multi-Language RAG Knowledge Database „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  131. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Language- Detection Translation Translation Translation GenAI Outlook Multi-Language RAG Knowledge Database „All internal knowledge at open knowledge“ „Comment demander des vacances chez ok?“
  132. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever

    GenAI- Model Translation Translation Translation Translation FR_fr language not supported domain specific wording GenAI Outlook Multi-Language RAG Language- Detection „Comment demander des vacances chez ok?“ „All internal knowledge at open knowledge in multiple languages.“
  133. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Knowledge Database GenAI Outlook Access Control „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  134. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Knowledge Database GenAI Outlook Access Control „What are the top secret KPIs of open knowledge?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  135. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Token Knowledge Database Input Guardrail Some kind of Attack? Prompt-Injection? Prompt-Abuse? Rate-Limit? GenAI Outlook Access Control „What are the top secret KPIs of open knowledge?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  136. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database with

    AC Meta Data Retriever GenAI- Model RBAC via PBF Token Token Input Guardrail GenAI Outlook Access Control RBAC = Role based Access Control PBF = Permission based Filtering „What are the top secret KPIs of open knowledge?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  137. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database with

    AC Meta Data Retriever GenAI- Model RBAC via PBF Token Token Input Guardrail Output Guardrail PII: Personally identifiable information? GenAI Outlook Access Control „What does the personal data of X look like at ok?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  138. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Knowledge Database GenAI Outlook Integration „All internal knowledge at open knowledge“ „How to apply for vacation at open knowledge?“
  139. GenAI Outlook Integration GenAI based Service A P I ML

    as a Service Input Output UI/UX Enterprise Integration „All internal knowledge at open knowledge“ „How to apply for vacation at open knowledge?“
  140. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever

    GenAI- Model GenAI Outlook Multi-Agenten Systeme „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  141. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever

    GenAI- Model GenAI Outlook Multi-Agenten Systeme „All internal knowledge at open knowledge“ „Apply for my vacation from […] to [… ], please.“
  142. GenAI Outlook Multi-Agenten Systeme Retriever Memory Tools Planning Database Prompting

    Models API-Call UI/UX „Apply for my vacation from […] to [… ], please.“ „I applied for … .“
  143. GenAI from prototype to production Your personal „takeaways“: • GenAI

    is powerful. But also expensive. • Professional prompting is your super power. • Every model has its own character. • RAG for your own domain knowledge. • GenAI is also just software. Your use case determines the right path!
  144. „When deploying Generative AI technology, it is essential to adhere

    to best practices including employing models through validation, understanding the model’s limitations, ensuring ethical use, and monitoring for biases continuously.“ (Source: https://www.tredence.com/generative-ai-101)
  145. #WISSENTEILEN #WISSENTEILEN IMAGE REFERENCE Folie 21: © photoplotnikov - istockphoto.com

    Folie 23: © Mix und Match Studios - shutterstock.com Folie 23: © Mix und Match Studios - shutterstock.com All other pictures, drawings and icons originate from • pexels.com, • pixabay.com, • unsplash.com, • flaticon.com or are created by my own.