
Building LLM Apps with Google Vertex AI and PaLM


I gave this high-level session, with a demo, on how to quickly get started with Google's PaLM API, part of the Vertex AI suite of tools, and use LLMs to add AI capabilities to new and existing websites and applications.

Event: Google IO Extended Meetup 2023
Location: Tintash Inc, Lahore

Sheharyar Naseer

July 06, 2023


Transcript

  1. Building LLM Apps with Vertex AI & PaLM 2
    Google IO Extended Meetup 2023 • Tintash Inc, Lahore
  2. Sheharyar Naseer
    Technology advisor & consultant for startups, and Manager at Google Developers Group
    Find me anywhere @sheharyarn
  3. Background
    ‣ Indie Nomad Software Architect
    ‣ 13+ years of polyglot experience, focus on Web & Cloud
    ‣ StackOverflow: 70,000+ score (Top 5 in Pakistan)
    ‣ Author / Contributor of multiple famous libraries & tools
    ‣ Featured on popular developer communities
  4. Outline
    ‣ Intro to Vertex AI
    ‣ Generative AI and PaLM
    ‣ Parameters and Tuning
    ‣ API and SDK
    ‣ Live Demo
    ‣ Learning Resources and Q/A
  5. Vertex AI
    ‣ Set of ML/AI tools on Google Cloud Platform
    ‣ Build, manage & deploy models
    ‣ Quickstart library of foundational models
    ‣ Low-code and No-code tooling
    ‣ Native integrations with BigQuery, DataProc, etc.
  6. Vertex AI
    Model Garden • Generative AI Studio • AutoML • Deep Learning VM Images • AI Workbench • Matching Engine • Data Labeling • Deep Learning Containers • Explainable AI • AI Feature Store • ML Metadata • Model Monitoring • AI Vizier • AI Pipelines • AI Prediction • AI Tensorboard
  7. Generative AI Studio
    ‣ Low-Code Generative AI platform
    ‣ Easily access, tune & deploy
    ‣ Uses Google's foundation models
      ‣ PaLM: Text & Chat
      ‣ Imagen: Text-to-Image
      ‣ Chirp: Speech
      ‣ Codey: Code generation, completion and chat
  8. PaLM 2
    ‣ Google's transformer-based LLM
    ‣ Reportedly 340B parameters, trained on 3.6T tokens (figures not officially confirmed by Google)
    ‣ Used by Google's Bard AI & other services
    ‣ Models:
      ‣ Bison
      ‣ Gecko
  9. PaLM 2
    ‣ Capabilities
      ‣ Multilingual: Trained on 100+ languages
      ‣ Reasoning: Improved logic & common sense
      ‣ Coding: Popular & specialized languages
  10. Parameters & Tuning
    Prompt:
    ‣ Text input that the model generates a response to
    Token Limit:
    ‣ Maximum length of the response, measured in tokens
    ‣ 1 Token ≈ 4 Characters
    ‣ 100 Tokens ≈ 60-80 words
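The rule of thumb on this slide (1 token ≈ 4 characters) can be turned into a rough budget check before sending a prompt. This is only a sketch: the function names are hypothetical, and the real tokenizer will give different counts depending on language and content.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the slide's heuristic of ~4 characters per token."""
    if not text:
        return 0
    # Round up so the estimate errs on the side of using more of the budget.
    return math.ceil(len(text) / chars_per_token)


def fits_token_limit(text: str, token_limit: int) -> bool:
    """Check whether the estimated token count stays within a given limit."""
    return estimate_tokens(text) <= token_limit


if __name__ == "__main__":
    sample = "a" * 400  # 400 characters, so roughly 100 tokens by the heuristic
    print(estimate_tokens(sample))
    print(fits_token_limit(sample, 256))
```

An estimate like this is useful for quick sanity checks only; it should not replace the token counts reported by the service itself.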
  11. Parameters & Tuning
    Temperature:
    ‣ "Creativity" of the response
    ‣ Value between [0.0, 1.0]
    ‣ Value of 0: Deterministic (always picks the most probable token)
    ‣ Value of 1: Most random
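One common way temperature is applied is to divide the model's raw scores (logits) by the temperature before the softmax. Whether PaLM does exactly this internally is an assumption; the function name and the example logits below are hypothetical and only illustrate the effect.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability goes to the highest score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
    for t in (0.0, 0.2, 1.0):
        probs = apply_temperature(example_logits, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token, while a value of 1 leaves the model's distribution unchanged, which is the most varied setting in the slide's [0.0, 1.0] range.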
  12. Parameters & Tuning
    Top-K:
    ‣ Modify token selection at each step
    ‣ Integer value between [1, 40]
    ‣ Value of K means the next token is selected from the K most probable tokens
    ‣ Higher K = More random
  13. Parameters & Tuning
    Top-P:
    ‣ Modify token selection at each step
    ‣ Probability value between [0.0, 1.0]
    ‣ Sets a threshold for token probability sum
    ‣ Shortlists samples returned by Top-K
    ‣ Higher P = More random
  14. Statement: The park near my house has...
    Next probable tokens: flowers (0.35), trees (0.22), grass (0.17), ..., lake (0.02), bugs (0.01)
  15. Statement: The park near my house has...
    Next probable tokens: flowers (0.35), trees (0.22), grass (0.17), ..., lake (0.02), bugs (0.01)
    Token sampling:
    ‣ Top-K = 3 → [flowers, trees, grass]
    ‣ Top-P = 0.6 → [flowers, trees]
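The two filtering steps in this example can be sketched in a few lines of Python. The function names are hypothetical, and the candidate list contains only the five tokens named on the slide (the elided ones are left out). The Top-P step follows the convention the slide's result implies: tokens are kept while the running probability sum stays within P. Many nucleus sampling implementations also keep the token that crosses the threshold, which here would retain "grass" as well.

```python
def top_k_filter(candidates, k):
    """Keep the k most probable (token, probability) pairs."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def top_p_filter(candidates, p):
    """Keep the most probable tokens while their running sum stays within p."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    kept = []
    total = 0.0
    for token, prob in ranked:
        # Always keep at least one token; stop once the sum would exceed p.
        if kept and total + prob > p:
            break
        kept.append((token, prob))
        total += prob
    return kept


if __name__ == "__main__":
    # Only the tokens shown on the slide; the elided entries are omitted.
    candidates = [
        ("flowers", 0.35),
        ("trees", 0.22),
        ("grass", 0.17),
        ("lake", 0.02),
        ("bugs", 0.01),
    ]
    after_k = top_k_filter(candidates, 3)
    after_p = top_p_filter(after_k, 0.6)
    print([token for token, _ in after_k])  # ['flowers', 'trees', 'grass']
    print([token for token, _ in after_p])  # ['flowers', 'trees']
```

The next token would then be sampled from the remaining shortlist, with temperature controlling how evenly the choice is spread across it.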
  16. API & SDKs
    ‣ Use PaLM 2 in your own apps
    ‣ All LLM capabilities available and more
    ‣ Easy integrations with a simple API
    ‣ Python SDK also available
    ‣ Elixir SDK is in the works
  17. API Usage
    curl "https://${API_REGION}.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/text-bison@001:predict" \
      -X POST \
      -H "Authorization: Bearer auth-token" \
      -H "Content-Type: application/json" \
      -d $'{
        "instances": [{"content": "Explain what is going on in this horror story below: There was a picture on my phone of me sleeping. I live alone."}],
        "parameters": {"temperature": 0.5, "maxOutputTokens": 256, "topP": 0.8, "topK": 40}
      }'
    Story Credit: /u/guztaluz
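The same request can be assembled from Python using only the standard library. This is a sketch that mirrors the curl call on the slide: `build_predict_request` is a hypothetical helper, `api_region`, `project_id` and `auth_token` stand in for the slide's `${API_REGION}`, `${PROJECT_ID}` and `auth-token` placeholders, and nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_predict_request(api_region, project_id, auth_token, prompt,
                          model="text-bison@001", location="us-central1"):
    """Build (but do not send) a POST request matching the slide's curl example."""
    url = (
        f"https://{api_region}.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model}:predict"
    )
    body = {
        "instances": [{"content": prompt}],
        # Same sampling parameters as the curl example on the slide.
        "parameters": {
            "temperature": 0.5,
            "maxOutputTokens": 256,
            "topP": 0.8,
            "topK": 40,
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All three values below are placeholders, not working credentials.
    request = build_predict_request("API_REGION", "PROJECT_ID", "auth-token", "Hello")
    print(request.get_method(), request.full_url)
```

With real values filled in, `urllib.request.urlopen(request)` would send it; the response format is not shown on the slides, so parsing it is left out here.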
  18. Python SDK Usage
    import vertexai
    from vertexai.language_models import TextGenerationModel
    # project_id must hold your Google Cloud project ID
    vertexai.init(project=project_id, location="us-central1")
    model = TextGenerationModel.from_pretrained("text-bison@001")
    prompt = """
    Explain what is going on in this horror story below:
    There was a picture on my phone of me sleeping. I live alone.
    """
    # Same sampling parameters as the API example
    response = model.predict(
        prompt,
        temperature=0.5,
        max_output_tokens=256,
        top_p=0.8,
        top_k=40,
    )
    print(f"Response from Model: {response.text}")
    Story Credit: /u/guztaluz
  19. Questions?
    ‣ These Slides → shyr.io/t/gcp-vertex-palm
    ‣ More Talks → shyr.io/talks
    ‣ Official Docs → cloud.google.com/vertex-ai/docs
    ‣ PaLM 2 → ai.google/discover/palm2
    ‣ Code Lab → to.shyr.io/vertex-ai-codelab
    🌎 shyr.io • [email protected] • @sheharyarn