Gemini, Google's Large Language Model

Gemini is the large multimodal model that powers the Gemini app; you can also call its API through Google Cloud and integrate it into your own applications. It comes in several sizes, from Nano through Pro to Ultra. What sets it apart is multimodality: you can give it text, images, or videos, which opens up new use cases.

In this presentation, we will explore the Gemini model and its smaller open-weights sibling, Gemma. With our Java hats on, we will learn how to use its API, especially through the LangChain4j library.

How do you get the most out of Gemini? We will see how to extract structured data from unstructured text, how to classify text, how to extend the model's knowledge with the RAG (Retrieval Augmented Generation) approach, and how to use function calling to invoke external services when generating text.

Hold on tight! The Gemini capsule is about to take off!

Guillaume Laforge

April 25, 2024

Transcript

  1. Google Cloud: Gemma at a glance

    • SOTA: excellent benchmark results
    • Base & instruction-tuned models
    • 2B & 7B parameters
    • Run it on Vertex AI, GKE, your laptop!

    Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create Gemini.
  2. Gemini vs. Gemma

    | | Gemini | Gemma |
    |---|---|---|
    | Type | Closed, proprietary | Open |
    | Size | Very large | Smaller (2B & 7B versions) |
    | Modality | Text, image, video, speech | Only text |
    | Languages | 39 languages | English-only |
    | Function calling | ✅ | ❌ |
    | Context window | 32K for 1.0 Pro (8K out max); 1M+ for 1.5 Pro | 8K tokens (in + out) |
    | Performance | State-of-the-art in large models, high quality out-of-the-box | State-of-the-art in its class, but can require fine-tuning |
    | Use cases | Enterprise, scale, SLOs, model updates, etc. | Experimentation, research, education; can run locally, privacy |
    | Pricing & management | Fully managed API; pay per character/token | Manage yourself; pay for your own hardware & hosting |
    | Customization | Through managed tuning: supervised, RLHF, distillation | Programmatically modify underlying weights |
  3. Python is all the rage in AI…

    What’s in it for us, Java developers?
    https://pixabay.com/photos/snake-reptile-python-boa-anaconda-7386684/
  4. Option 1 → Gemini SDK

    • https://github.com/googleapis/google-cloud-java/tree/main/java-vertexai
    • https://github.com/GoogleCloudPlatform/java-docs-samples/tree/main/vertexai/snippets/src/main/java/vertexai/gemini
    • https://cloud.google.com/java/docs/reference/google-cloud-vertexai/
  5. More advanced use cases! What we’ll see

    • Simple question / answer (streaming and non-streaming)
    • Analyzing images with text prompts (multimodality)
    • Maintain chat conversations
    • Text classification with few-shot prompting
    • Extract structured data from unstructured text
    • Chat with your docs with Retrieval Augmented Generation
    • Extend with Function Calling to access external APIs
    • Gemma via Ollama, and TestContainers
  6. Searching the Apache Groovy documentation

    Apply the RAG pattern: Retrieval Augmented Generation
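  7. RAG (diagram: chatbot app, LLM, vector DB)

    ❶ INGESTION: split the docs into chunks, calculate vector embeddings for the chunks, store each vector + chunk in the vector DB.
    ❷ QUERYING: calculate the vector embedding of the prompt, find similar chunks in the vector DB, send context + prompt + chunks to the LLM, return the answer.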
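
The two phases on the RAG slide (ingestion, then querying) can be sketched in plain Java. This is only a toy illustration under stated assumptions, not the code from the talk and not the LangChain4j API: the class `RagSketch` and its methods are hypothetical names, word counts stand in for a real embedding model, an in-memory list stands in for the vector DB, and the sample sentences are made up rather than taken from the Groovy documentation. It assumes Java 17.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy RAG pipeline: word-count "embeddings" stand in for a real embedding model. */
class RagSketch {

    /** A chunk and its vector, as kept in the in-memory stand-in for a vector DB. */
    record Entry(String chunk, Map<String, Integer> vector) {}

    private final List<Entry> store = new ArrayList<>();

    /** Stand-in embedding: lower-cased word counts (a real system would call an embedding model). */
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> vector = new HashMap<>();
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!word.isEmpty()) vector.merge(word, 1, Integer::sum);
        }
        return vector;
    }

    /** Cosine similarity between two sparse word-count vectors. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (var e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /** Ingestion: split the document into chunks (one per paragraph), embed and store each. */
    void ingest(String document) {
        for (String chunk : document.split("\\n\\s*\\n")) {
            if (!chunk.isBlank()) store.add(new Entry(chunk.strip(), embed(chunk)));
        }
    }

    /** Querying, step 1: embed the prompt and return the most similar chunks. */
    List<String> retrieve(String prompt, int maxResults) {
        Map<String, Integer> query = embed(prompt);
        return store.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> -cosine(query, e.vector())))
                .limit(maxResults)
                .map(Entry::chunk)
                .toList();
    }

    /** Querying, step 2: build the augmented prompt (context + chunks + question) for the LLM. */
    String augmentedPrompt(String prompt, int maxResults) {
        return "Answer using only the following context:\n"
                + String.join("\n---\n", retrieve(prompt, maxResults))
                + "\n\nQuestion: " + prompt;
    }

    public static void main(String[] args) {
        RagSketch rag = new RagSketch();
        // Made-up sample text standing in for real documentation pages.
        rag.ingest("Groovy closures are anonymous blocks of code.\n\n"
                + "Groovy supports optional typing.");
        System.out.println(rag.augmentedPrompt("How do closures work?", 1));
    }
}
```

In a real application the augmented prompt would then be sent to the LLM; the sketch stops at building it so that it stays runnable without any model or network access.
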
  8. Function calling (diagram: chatbot app, Gemini, external API or service)

    • User: “What’s the weather like in Paris?”
    • Chatbot app → Gemini: user prompt + getWeather(String) function contract
    • Gemini → chatbot app: “call getWeather("Paris") for me please”
    • Chatbot app → external API or service: getWeather("Paris"), which returns {"forecast":"sunny"}
    • Chatbot app → Gemini: function response is {"forecast":"sunny"}
    • Gemini → chatbot app: Answer: “It’s sunny in Paris!”
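
The round trip on the function calling slide can be sketched as a small loop. Everything here is an illustrative assumption rather than the Gemini SDK or LangChain4j API: `ModelTurn`, `FunctionCall`, `Answer`, `Model`, and `FakeModel` are hypothetical names, and `FakeModel` is a scripted stand-in that replays the exchange from the slide instead of calling a real model. Only `getWeather` and the `{"forecast":"sunny"}` payload come from the slide. It assumes Java 17.

```java
import java.util.Map;
import java.util.function.Function;

/** Toy function-calling loop; FakeModel is a scripted stand-in for Gemini, not a real client. */
class FunctionCallingSketch {

    /** A model turn is either a request to call a function or a final answer. */
    interface ModelTurn {}
    record FunctionCall(String name, String argument) implements ModelTurn {}
    record Answer(String text) implements ModelTurn {}

    /** Hypothetical model interface: functionResponse is null until a function has been called. */
    interface Model {
        ModelTurn next(String userPrompt, String functionResponse);
    }

    /** Scripted stand-in: first asks for getWeather("Paris"), then answers from the response. */
    static class FakeModel implements Model {
        @Override
        public ModelTurn next(String userPrompt, String functionResponse) {
            if (functionResponse == null) {
                return new FunctionCall("getWeather", "Paris");
            }
            // Pull the forecast value out of the tiny JSON payload.
            String forecast = functionResponse
                    .replaceAll(".*\"forecast\"\\s*:\\s*\"([^\"]+)\".*", "$1");
            return new Answer("It's " + forecast + " in Paris!");
        }
    }

    /** Stand-in for the external API or service; always reports sunny weather. */
    static String getWeather(String city) {
        return "{\"forecast\":\"sunny\"}";
    }

    /** The app-side loop: run requested functions until the model returns a final answer. */
    static String chat(Model model, Map<String, Function<String, String>> functions,
                       String userPrompt) {
        ModelTurn turn = model.next(userPrompt, null);
        while (turn instanceof FunctionCall call) {
            Function<String, String> function = functions.get(call.name());
            if (function == null) {
                throw new IllegalStateException("Unknown function: " + call.name());
            }
            turn = model.next(userPrompt, function.apply(call.argument()));
        }
        return ((Answer) turn).text();
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> functions =
                Map.of("getWeather", FunctionCallingSketch::getWeather);
        // prints: It's sunny in Paris!
        System.out.println(chat(new FakeModel(), functions, "What's the weather like in Paris?"));
    }
}
```

The point of the sketch is the division of labor: the model only asks for a call, and the application is the one that actually invokes the external service and hands the result back.
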
  9. Gemma via Ollama in TestContainers

    Diagram: the chatbot app asks “Why is the sky blue?” of Gemma running in an Ollama container, which answers “Rayleigh scattering”.
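
As a rough illustration of the last slide, the sketch below builds the HTTP request a chatbot app could send to an Ollama server hosting Gemma, using only the JDK's `java.net.http` client. It makes several assumptions that are not in the slides: the class name `OllamaSketch` is hypothetical, the model tag `gemma:2b` is assumed to have been pulled already, and the base URL defaults to Ollama's usual local port 11434. The TestContainers part is left out; in a container-based test the base URL would come from the started container instead.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Minimal sketch of calling an Ollama-served model over its REST API. */
class OllamaSketch {

    /** Minimal JSON string escaping for backslashes, quotes, and newlines. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    /** Builds the JSON body for a non-streaming call to Ollama's /api/generate endpoint. */
    static String requestBody(String model, String prompt) {
        return "{\"model\":\"" + escape(model) + "\","
                + "\"prompt\":\"" + escape(prompt) + "\","
                + "\"stream\":false}";
    }

    /** Builds (but does not send) the POST request for the given Ollama base URL. */
    static HttpRequest generateRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: an Ollama server is reachable here and has the gemma:2b model available.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:11434";
        HttpRequest request = generateRequest(baseUrl, "gemma:2b", "Why is the sky blue?");
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        // The raw JSON reply; the generated text is in its "response" field.
        System.out.println(response.body());
    }
}
```

Running `main` requires a live Ollama server; the request-building methods can be exercised without one.
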