Slide 1

Building LLM Apps with Vertex AI & PaLM 2
Google I/O Extended Meetup 2023 • Tintash Inc, Lahore

Slide 2

Sheharyar Naseer
Technology advisor & consultant for startups, and Manager at Google Developer Groups
Find me anywhere: @sheharyarn

Slide 3

Background
‣ Indie Nomad Software Architect
‣ 13+ years of polyglot experience, focused on Web & Cloud
‣ StackOverflow: 70,000+ score (Top 5 in Pakistan)
‣ Author / Contributor of several well-known libraries & tools
‣ Featured on popular developer communities

Slide 4

Outline
‣ Intro to Vertex AI
‣ Generative AI and PaLM
‣ Parameters and Tuning
‣ API and SDK
‣ Live Demo
‣ Learning Resources and Q/A

Slide 5

Vertex AI
‣ Set of ML/AI tools on Google Cloud Platform
‣ Build, manage & deploy models
‣ Quickstart library of foundation models
‣ Low-code and No-code tooling
‣ Native integrations with BigQuery, Dataproc, etc.

Slide 6

Vertex AI
‣ Model Garden
‣ Generative AI Studio
‣ AutoML
‣ Deep Learning VM Images
‣ AI Workbench
‣ Matching Engine
‣ Data Labeling
‣ Deep Learning Containers
‣ Explainable AI
‣ AI Feature Store
‣ ML Metadata
‣ Model Monitoring
‣ AI Vizier
‣ AI Pipelines
‣ AI Prediction
‣ AI Tensorboard

Slide 7

Generative AI Studio
‣ Low-Code Generative AI platform
‣ Easily access, tune & deploy
‣ Uses Google's foundation models:
‣ PaLM: Text & Chat
‣ Imagen: Text-to-Image
‣ Chirp: Speech
‣ Codey: Code generation, completion and chat

Slide 8

PaLM 2
‣ Google's transformer-based LLM
‣ 340B parameters, trained on 3.9T tokens
‣ Used by Google's Bard AI & other services
‣ Models:
‣ Bison
‣ Gecko

Slide 9

PaLM 2 Capabilities
‣ Multilingual: Trained on 100+ languages
‣ Reasoning: Improved logic & common sense
‣ Coding: Popular & specialized languages

Slide 10

DEMO

Slide 11

Parameters & Tuning
Prompt:
‣ Text input to generate model response
Token Limit:
‣ Maximum length of response, measured in tokens
‣ 1 Token ≈ 4 Characters
‣ 100 Tokens ≈ 60-80 words
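
The rule of thumb above can be turned into a quick length check before sending a prompt. This is a rough sketch based only on the slide's heuristic (1 token ≈ 4 characters), not the model's actual tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the heuristic: 1 token ~ 4 characters."""
    return max(1, round(len(text) / 4))

# A 400-character prompt is roughly 100 tokens, i.e. about 60-80 words.
print(estimate_tokens("x" * 400))  # → 100
```

Real token counts depend on the model's tokenizer, so treat this only as a budgeting aid.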

Slide 12

Parameters & Tuning
Temperature:
‣ "Creativity" of the response
‣ Value between [0.0, 1.0]
‣ Value of 0: Deterministic
‣ Value of 1: Fully random
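
Conceptually, temperature rescales the model's raw scores before they are turned into probabilities. The sketch below is a generic illustration of that mechanism, not Vertex AI's actual implementation; temperature 0 is special-cased here as greedy selection:

```python
import math

def apply_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature.

    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more random). Temperature 0 is
    treated as greedy argmax selection.
    """
    if temperature == 0:
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
print(apply_temperature(logits, 0.0))  # deterministic: [1.0, 0.0, 0.0]
print(apply_temperature(logits, 1.0))  # softer spread across all tokens
```

At low temperature the top token dominates; at temperature 1 the probabilities follow the raw scores directly.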

Slide 13

Parameters & Tuning
Top-K:
‣ Modifies token selection at each step
‣ Integer value between [1, 40]
‣ A value of K means the next token is selected from the K most probable tokens
‣ Higher K = More random

Slide 14

Parameters & Tuning
Top-P:
‣ Modifies token selection at each step
‣ Probability value between [0.0, 1.0]
‣ Sets a threshold on the cumulative probability of candidate tokens
‣ Further shortlists the tokens returned by Top-K
‣ Higher P = More random

Slide 15

STATEMENT: The park near my house has...

Slide 16

STATEMENT: The park near my house has...
NEXT PROBABLE TOKENS: [flowers, trees, grass, . . . , lake, bugs]
PROBABILITIES: [0.35, 0.22, 0.17, . . . , 0.02, 0.01]

Slide 17

STATEMENT: The park near my house has...
NEXT PROBABLE TOKENS: [flowers, trees, grass, . . . , lake, bugs]
PROBABILITIES: [0.35, 0.22, 0.17, . . . , 0.02, 0.01]
TOKEN SAMPLING:
Top-K = 3 → [flowers, trees, grass]
Top-P = 0.6 → [flowers, trees]
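
The shortlisting on this slide can be sketched in a few lines. This follows the selection rule as described in the talk (Top-P keeps the most probable tokens while their cumulative probability stays within P); real decoder implementations vary in how they handle the threshold:

```python
def top_k_filter(tokens, probs, k):
    """Keep the k most probable tokens."""
    ranked = sorted(zip(tokens, probs), key=lambda tp: tp[1], reverse=True)
    return [t for t, _ in ranked[:k]]

def top_p_filter(tokens, probs, p):
    """Keep most-probable tokens while their cumulative probability <= p."""
    ranked = sorted(zip(tokens, probs), key=lambda tp: tp[1], reverse=True)
    shortlist, cumulative = [], 0.0
    for token, prob in ranked:
        # Stop once adding the next token would exceed the threshold
        # (but always keep at least one token).
        if cumulative + prob > p and shortlist:
            break
        shortlist.append(token)
        cumulative += prob
    return shortlist

tokens = ["flowers", "trees", "grass", "lake", "bugs"]
probs = [0.35, 0.22, 0.17, 0.02, 0.01]

print(top_k_filter(tokens, probs, 3))    # → ['flowers', 'trees', 'grass']
print(top_p_filter(tokens, probs, 0.6))  # → ['flowers', 'trees']
```

With the slide's numbers, Top-K = 3 keeps the three likeliest tokens, and Top-P = 0.6 then cuts the list at `flowers` + `trees` (0.35 + 0.22 = 0.57), since adding `grass` would push the sum to 0.74.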

Slide 18

DEMO

Slide 19

API & SDKs
‣ Use PaLM 2 in your own apps
‣ All LLM capabilities available and more
‣ Easy integrations with a simple API
‣ Python SDK also available
‣ Elixir SDK is in the works

Slide 20

API Usage

curl "https://${API_REGION}.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/text-bison@001:predict" \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d $'{
    "instances": [{
      "content": "Explain what is going on in this horror story below: There was a picture on my phone of me sleeping. I live alone."
    }],
    "parameters": {
      "temperature": 0.5,
      "maxOutputTokens": 256,
      "topP": 0.8,
      "topK": 40
    }
  }'

Story Credit: /u/guztaluz

Slide 21

Python SDK Usage

import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project=project_id, location="us-central1")
model = TextGenerationModel.from_pretrained("text-bison@001")

prompt = """
Explain what is going on in this horror story below:
There was a picture on my phone of me sleeping. I live alone.
"""

response = model.predict(
    prompt,
    temperature=0.5,
    max_output_tokens=256,
    top_p=0.8,
    top_k=40,
)
print(f"Response from Model: {response.text}")

Story Credit: /u/guztaluz

Slide 22

DEMO

Slide 23

Questions?
These Slides → shyr.io/t/gcp-vertex-palm
More Talks → shyr.io/talks
Official Docs → cloud.google.com/vertex-ai/docs
PaLM 2 → ai.google/discover/palm2
Code Lab → to.shyr.io/vertex-ai-codelab
🌎 shyr.io • [email protected] • @sheharyarn