Slide 1

Avoid common LLM pitfalls
Mete Atamel, Developer Advocate @ Google
@meteatamel · atamel.dev · speakerdeck.com/meteatamel
github.com/meteatamel/genai-beyond-basics

Slide 2

Agenda
01 Introduction
02 Pitfalls
03 Solutions
04 Summary

Slide 3

01 Introduction

Slide 4

AI Landscape
● Artificial Intelligence — incl. NLP, Data Science
● Machine Learning — unsupervised, supervised, reinforcement learning
● Deep Learning — artificial, convolutional, recurrent neural networks
● Generative AI — GAN, VAE, Transformers
  ● LLMs — Transformers
  ● Image Gen — GAN, VAE

Slide 5

LLM Landscape

Slide 7

Google AI Landscape
Gemini (brand):
● Gemini App (previously Bard)
● Gemini Cloud Assist (previously Duet AI)
● Gemini Code Assist (previously Duet AI for Developers)
● …
Vertex AI:
● Google AI Studio (previously MakerSuite)
● Model Garden: Gemini (model), Codey, Imagen, Gemma, Llama 3, Claude 3, Falcon, Vicuna, Stable Diffusion, …
● Search & Conversation, Vector Search, Notebooks, Pipelines, AutoML
● Vision, Video, TTS / STT, NL APIs

Slide 8

● Natively multimodal
● Large context window
● Sophisticated reasoning

Slide 10

Gemma: an open model derived from Gemini

Slide 11

| | Gemini | Gemma |
| --- | --- | --- |
| Type | Closed, proprietary | Open |
| Size | Very large | Smaller (2B & 7B versions) |
| Modality | Text, image, video, speech | Text only |
| Languages | 39 languages | English only |
| Function calling | ✅ | ❌ |
| Context window | 32K for 1.0 Pro (8K out max); 1M+ for 1.5 Pro | 8K tokens (in + out) |
| Performance | State-of-the-art in large models, high quality out of the box | State-of-the-art in its class, but can require fine-tuning |
| Use cases | Enterprise, scale, SLOs, model updates, etc. | Experimentation, research, education; can run locally, privacy |
| Pricing & management | Fully managed API; pay per character | Manage yourself; pay for your own hardware & hosting |
| Customization | Through managed tuning: supervised, RLHF, distillation | Programmatically modify underlying weights |

Slide 12

02 Pitfalls

Slide 13

⚠ LLMs require pre- and post-processing

Slide 14

⚠ LLMs hallucinate

Slide 15

⚠ LLMs rely on outdated public data

Slide 16

⚠ LLM outputs can be chaotic

Slide 17

⚠ LLM inputs can get expensive

Slide 18

⚠ LLM outputs are hard to measure

Slide 19

⚠ LLM outputs can contain PII, harmful content, etc.

Slide 20

03 Solutions

Slide 21

⚠ LLMs require pre- and post-processing → 💡 LLM frameworks
LangChain is the most popular; Firebase Genkit, Semantic Kernel, AutoGen, and others are alternatives.
github.com/meteatamel/genai-beyond-basics/tree/main/samples/frameworks/langchain
github.com/meteatamel/genai-beyond-basics/tree/main/samples/frameworks/semantic-kernel
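The pattern these frameworks generalize is roughly: prompt templating (pre-processing) → model call → output parsing (post-processing), composed as a chain. A minimal pure-Python sketch of that idea (all names here are illustrative; `fake_llm` stands in for a real model call, not any framework's actual API):

```python
def prompt_template(template):
    """Pre-processing: fill a template with user-supplied variables."""
    return lambda **kwargs: template.format(**kwargs)

def output_parser(text):
    """Post-processing: split a comma-separated model answer into a list."""
    return [item.strip() for item in text.split(",")]

def fake_llm(prompt):
    # Stand-in for a real model call (e.g. Gemini via an SDK).
    return "Paris, Lyon, Marseille"

def chain(variables, template, llm, parser):
    """Compose template -> llm -> parser, as LLM frameworks do."""
    return parser(llm(prompt_template(template)(**variables)))

result = chain(
    {"n": 3, "country": "France"},
    "List {n} cities in {country}, comma-separated.",
    fake_llm,
    output_parser,
)
print(result)  # ['Paris', 'Lyon', 'Marseille']
```

A real framework adds retries, streaming, tool integration, and many ready-made parsers on top of this same shape.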

Slide 22

⚠ LLMs hallucinate → 💡 Grounding (easy way)
Grounding with Google Search for public data; grounding with Vertex AI Search for private data.
github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/google-search
github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/vertexai-search

Slide 23

⚠ LLMs hallucinate → 💡 Grounding (RAG)
At some point, you'll need Retrieval-Augmented Generation (RAG) to ground on your own private data and get more control.

Slide 24

RAG — ❶ Ingestion: split DOCS into chunks → calculate vector embeddings → store vector + chunk in the Vector DB

Slide 25

RAG — ❶ Ingestion: split DOCS into chunks → calculate vector embeddings → store vector + chunk in the Vector DB
❷ Querying: chatbot app receives a prompt → calculate the prompt's vector embedding → find similar chunks in the Vector DB → send prompt + chunks as context to the LLM → return the answer

Slide 26

RAG gets complicated:
● How to parse & chunk docs?
● What embedding model to use?
● What vector database to use?
● How to retrieve similar docs and add them to the prompt?
● What about images?
github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/rag-pdf-langchain-firestore
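To make the two RAG stages concrete, here is a toy end-to-end sketch. It is illustrative only: the bag-of-words "embedding" and linear scan below stand in for a real embedding model and a vector database, and the document text is made up.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': bag-of-words counts (a real model returns dense vectors)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# ❶ Ingestion: split docs into chunks, embed each, store (vector, chunk) pairs.
docs = ["Cymbal Bank was founded in 1990.",
        "Cymbal Bank offers savings and checking accounts."]
store = [(embed(chunk), chunk) for chunk in docs]

# ❷ Querying: embed the prompt, retrieve the most similar chunk,
# then send prompt + chunk to the LLM as context (the model call is omitted).
prompt = "When was Cymbal Bank founded?"
best = max(store, key=lambda pair: cosine(embed(prompt), pair[0]))[1]
augmented = f"Context: {best}\n\nQuestion: {prompt}"
print(augmented)
```

Each bullet above replaces one of these toy pieces with a real decision: the splitter, the embedding model, the vector store, and the retrieval-plus-prompting strategy.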

Slide 27

⚠ LLMs rely on outdated public data → 💡 Function calling
Function calling: augment LLMs with external APIs for real-time data.

Slide 28

Function calling
1. Chatbot app sends the user prompt ("What's the weather like in Antwerp?") + the getWeather(String) function contract to Gemini.
2. Gemini replies: call getWeather("Antwerp") for me, please.
3. App calls the external API or service: getWeather("Antwerp") → {"forecast":"sunny"}
4. App sends the function response {"forecast":"sunny"} back to Gemini.
5. Gemini answers: "It's sunny in Antwerp!"
github.com/meteatamel/genai-beyond-basics/tree/main/samples/function-calling/weather
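The app-side loop in that diagram can be sketched in plain Python. The model's side is simulated here; real SDKs exchange structured function-call/function-response messages, and all names in this sketch are illustrative:

```python
# 1. Function contract sent alongside the user prompt.
get_weather_contract = {
    "name": "getWeather",
    "description": "Get the current weather for a city",
    "parameters": {"city": "string"},
}

def external_weather_api(city):
    # Stand-in for the real external API or service.
    return {"forecast": "sunny"}

def simulated_model(messages):
    """Stand-in for Gemini: first asks for the function, then uses its result."""
    last = messages[-1]
    if last["role"] == "user":
        # 2. The model decides it needs the function called.
        return {"function_call": {"name": "getWeather", "args": {"city": "Antwerp"}}}
    # 5. The model turns the function response into a final answer.
    forecast = last["function_response"]["forecast"]
    return {"text": f"It's {forecast} in Antwerp!"}

# The loop: prompt -> function call -> execute locally -> respond -> answer.
messages = [{"role": "user", "text": "What's the weather like in Antwerp?"}]
reply = simulated_model(messages)
call = reply["function_call"]                                       # step 2
result = external_weather_api(**call["args"])                       # step 3
messages.append({"role": "function", "function_response": result})  # step 4
answer = simulated_model(messages)["text"]                          # step 5
print(answer)  # It's sunny in Antwerp!
```

Note that the model never calls the API itself: the app executes the function and reports the result back.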

Slide 29

⚠ LLM outputs can be chaotic → 💡 Response type and schema
LLMs now support a response type (JSON) and response schemas for better control of the output format.
github.com/meteatamel/genai-beyond-basics/tree/main/samples/controlled-generation
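Even when you request a JSON response type, it is prudent to parse and validate the output against the schema you asked for before using it. A simplified sketch (the schema format and checker below are illustrative, not a full JSON Schema validator):

```python
import json

# The shape we asked the model for: required keys and their Python types.
response_schema = {"name": str, "ingredients": list}

def validate(raw_text, schema):
    """Parse model output and check required keys and types."""
    data = json.loads(raw_text)
    for key, expected_type in schema.items():
        if not isinstance(data.get(key), expected_type):
            raise ValueError(f"field {key!r} missing or not {expected_type.__name__}")
    return data

# Stand-in for a model response requested with a JSON response type.
model_output = '{"name": "pancakes", "ingredients": ["flour", "milk", "eggs"]}'
recipe = validate(model_output, response_schema)
print(recipe["ingredients"])  # ['flour', 'milk', 'eggs']
```

With a schema-constrained response the parse step rarely fails, but the validation guard keeps downstream code safe when it does.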

Slide 30

⚠ LLM inputs can get expensive → 💡 Context caching
Reduces costs (though not necessarily latency) when a large context is referenced repeatedly by shorter requests.
github.com/meteatamel/genai-beyond-basics/tree/main/samples/context-caching
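A back-of-envelope illustration of when caching pays off. The prices below are hypothetical placeholders; check your provider's actual rates, including any per-hour storage charge for keeping the cached context alive:

```python
PRICE_PER_1K_INPUT = 1.0    # hypothetical cost per 1K uncached input tokens
PRICE_PER_1K_CACHED = 0.25  # hypothetical discounted rate for cached tokens

context_tokens = 100_000    # large shared context (e.g. a long PDF)
question_tokens = 200       # each short follow-up request
requests = 50

# Without caching, every request re-sends the full context at full price.
without_cache = requests * (context_tokens + question_tokens) / 1000 * PRICE_PER_1K_INPUT

# With caching, the context is billed at the discounted cached rate.
with_cache = requests * (context_tokens / 1000 * PRICE_PER_1K_CACHED
                         + question_tokens / 1000 * PRICE_PER_1K_INPUT)

print(f"without cache: {without_cache:.0f}, with cache: {with_cache:.0f}")
```

The saving grows with the ratio of shared context to per-request tokens; for small contexts or one-off requests, caching overhead can outweigh the discount.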

Slide 31

⚠ LLM inputs can get expensive → 💡 Batch generation
Send multiple prompts at once and get the results asynchronously, at a discounted price, when latency is not important.
github.com/meteatamel/genai-beyond-basics/tree/main/samples/batch-generation

Slide 32

⚠ LLM outputs are hard to measure → 💡 Evaluation frameworks
DeepEval and Promptfoo are open-source evaluation frameworks; Vertex AI has rapid evaluation and AutoSxS evaluation.
github.com/meteatamel/genai-beyond-basics/tree/main/samples/evaluation/deepeval
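At their core, these frameworks score model answers against expectations with a metric and a pass threshold. A toy illustration of that shape in pure Python (the metric and test data are made up; real frameworks add LLM-as-judge metrics, datasets, and CI integration):

```python
def keyword_recall(answer, expected_keywords):
    """Toy metric: fraction of expected keywords present in the answer."""
    answer_lower = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in answer_lower)
    return hits / len(expected_keywords)

# Each test case: (model answer, keywords it should contain).
test_cases = [
    ("The capital of France is Paris.", ["Paris"]),
    ("Gemini 1.5 Pro supports a 1M-token context window.", ["1M", "context"]),
]

scores = [keyword_recall(answer, keywords) for answer, keywords in test_cases]
passed = all(score >= 0.8 for score in scores)
print(scores, passed)  # [1.0, 1.0] True
```

Simple string metrics like this catch regressions cheaply; the frameworks above layer semantic and judge-based metrics on top for the cases where substring matching is not enough.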

Slide 33

⚠ LLM outputs can contain PII, harmful content, etc. → 💡 Testing/security frameworks
Rely on the library's safety settings for basic safety measures. Promptfoo and LLM Guard are open-source testing/security frameworks.
github.com/meteatamel/genai-beyond-basics/tree/main/samples/evaluation/promptfoo
github.com/meteatamel/genai-beyond-basics/tree/main/samples/evaluation/llmguard
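One thing these frameworks automate is scanning model output for PII before it reaches the user. A toy post-processing filter for emails and phone-like numbers (two illustrative regexes; real scanners ship many more detectors plus harmful-content checks):

```python
import re

# Illustrative detectors; production scanners cover far more PII types.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"(?:\+?\d[\d\s-]{7,}\d)\b"),
}

def redact(text):
    """Replace detected PII spans with typed placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

output = "Contact John at john.doe@example.com or +32 470 12 34 56."
print(redact(output))  # Contact John at [EMAIL] or [PHONE].
```

The same filter can run on inputs too, to keep user-submitted PII out of prompts and logs.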

Slide 34

04 Summary

Slide 35

📋 Summary
● LLM frameworks to orchestrate LLM calls
● Grounding and function calling for private and real-time data
● Response types and schemas to structure outputs
● Context caching and batch generation to optimize costs
● Evaluation, testing & security frameworks to evaluate, test, and secure LLM inputs/outputs

Slide 36

Thank you!
Mete Atamel, Developer Advocate @ Google
@meteatamel · atamel.dev · speakerdeck.com/meteatamel
github.com/meteatamel/genai-beyond-basics