Building LLM Apps with Google Vertex AI and PaLM

Building LLM Apps with Vertex AI & PaLM 2
Google IO Extended Meetup 2023 • Tintash Inc, Lahore

Sheharyar Naseer
Technology advisor & consultant for startups, and Manager at Google Developers Group
Find me anywhere @sheharyarn

Background
‣ Indie Nomad Software Architect
‣ 13+ years of polyglot experience, focus on Web & Cloud
‣ StackOverflow: 70,000+ score (Top 5 in Pakistan)
‣ Author / Contributor of multiple famous libraries & tools
‣ Featured on popular developer communities

Outline
‣ Intro to Vertex AI
‣ Generative AI and PaLM
‣ Parameters and Tuning
‣ API and SDK
‣ Live Demo
‣ Learning Resources and Q/A

Vertex AI
‣ Set of ML/AI tools on Google Cloud Platform
‣ Build, manage & deploy models
‣ Quickstart library of foundational models
‣ Low-code and No-code tooling
‣ Native integrations with BigQuery, DataProc, etc.

Vertex AI
‣ Model Garden
‣ Generative AI Studio
‣ AutoML
‣ Deep Learning VM Images
‣ AI Workbench
‣ Matching Engine
‣ Data Labeling
‣ Deep Learning Containers
‣ Explainable AI
‣ AI Feature Store
‣ ML Metadata
‣ Model Monitoring
‣ AI Vizier
‣ AI Pipelines
‣ AI Prediction
‣ AI Tensorboard

Generative AI Studio
‣ Low-Code Generative AI platform
‣ Easily access, tune & deploy
‣ Uses Google's foundation models
  ‣ PaLM: Text & Chat
  ‣ Imagen: Text-to-Image
  ‣ Chirp: Speech
  ‣ Codey: Code generation, completion and chat

PaLM 2
‣ Google's transformer-based LLM
‣ 340B parameters, trained on 3.9T tokens
‣ Used by Google's Bard AI & other services
‣ Models:
  ‣ Bison
  ‣ Gecko

PaLM 2
‣ Capabilities
  ‣ Multilingual: Trained on 100+ languages
  ‣ Reasoning: Improved logic & common sense
  ‣ Coding: Popular & specialized languages

Parameters & Tuning
Prompt:
‣ Text input to generate model response
Token Limit:
‣ Maximum length of response, measured in tokens
‣ 1 Token ≈ 4 Characters
‣ 100 Tokens ≈ 60-80 words
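
As a rough illustration of the "1 Token ≈ 4 Characters" rule of thumb, the sketch below estimates how many tokens a piece of text might use. The helper name `estimate_tokens` is hypothetical and not part of any Google SDK, and a real tokenizer will give different counts.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb from the slide: 1 token ≈ 4 characters


def estimate_tokens(text: str) -> int:
    """Rough token estimate (hypothetical helper); real tokenizers will differ."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "There was a picture on my phone of me sleeping. I live alone."
print(estimate_tokens(prompt))
```

An estimate like this is only useful for quick budgeting against the token limit, not for exact accounting.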

Parameters & Tuning
Temperature:
‣ "Creativity" of the response
‣ Value between [0.0, 1.0]
‣ Value of 0: Deterministic (the most probable token is always picked)
‣ Value of 1: Most random
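
The slides do not describe how the model applies temperature internally. A common formulation, assumed here, divides the raw token scores by the temperature before a softmax, and treats a temperature of 0 as always picking the top-scoring token. The function name and the scores below are hypothetical.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened by lower temperatures."""
    if temperature == 0:
        # Greedy case: all probability mass goes to the highest-scoring token
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for temperature in (0.0, 0.5, 1.0):
    probs = apply_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token, which is why the output becomes more predictable.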

Parameters & Tuning
Top-K:
‣ Modify token selection at each step
‣ Integer value between [1, 40]
‣ Value of K means the next token is selected from the K most probable tokens
‣ Higher K = More random

Parameters & Tuning
Top-P:
‣ Modify token selection at each step
‣ Probability value between [0.0, 1.0]
‣ Sets a threshold for token probability sum
‣ Shortlists samples returned by Top-K
‣ Higher P = More random

STATEMENT
The park near my house has...

NEXT PROBABLE TOKENS
[flowers, trees, grass, ..., lake, bugs]
 0.35     0.22   0.17        0.02  0.01

TOKEN SAMPLING
Top-K = 3   → [flowers, trees, grass]
Top-P = 0.6 → [flowers, trees]
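
The sampling example above can be reproduced with a small sketch. It assumes the reading used on the slide: Top-K keeps the K most probable tokens, then Top-P keeps tokens while their cumulative probability stays within P. Some implementations also include the token that crosses the threshold, so treat this as one interpretation. The function names are hypothetical, and only the five tokens shown on the slide are listed (the elided ones are left out).

```python
def top_k_filter(candidates, k):
    """Keep the k most probable (token, probability) pairs."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def top_p_filter(candidates, p):
    """Keep the most probable tokens while their cumulative probability stays within p."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        # Always keep at least one token; stop once the next one would exceed p
        if kept and cumulative + prob > p:
            break
        kept.append((token, prob))
        cumulative += prob
    return kept


candidates = [("flowers", 0.35), ("trees", 0.22), ("grass", 0.17),
              ("lake", 0.02), ("bugs", 0.01)]

after_k = top_k_filter(candidates, k=3)
after_p = top_p_filter(after_k, p=0.6)

print([token for token, _ in after_k])  # ['flowers', 'trees', 'grass']
print([token for token, _ in after_p])  # ['flowers', 'trees']
```

The model would then sample the next token from the final shortlist, with temperature controlling how evenly it chooses among the survivors.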

API & SDKs
‣ Use PaLM 2 in your own apps
‣ All LLM capabilities available and more
‣ Easy integrations with a simple API
‣ Python SDK also available
‣ Elixir SDK is in the works

API Usage

    curl "https://${API_REGION}.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/text-bison@001:predict" \
      -X POST \
      -H "Authorization: Bearer auth-token" \
      -H "Content-Type: application/json" \
      -d $'{
        "instances": [{
          "content": "Explain what is going on in this horror story below: There was a picture on my phone of me sleeping. I live alone."
        }],
        "parameters": {
          "temperature": 0.5,
          "maxOutputTokens": 256,
          "topP": 0.8,
          "topK": 40
        }
      }'

Story Credit: /u/guztaluz
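
For apps that cannot use an SDK, the same request could be assembled with Python's standard library. This is a minimal sketch that only builds the request and does not send it. The function name `build_predict_request` is hypothetical, and the `API_REGION`, `PROJECT_ID` and `auth-token` values are the same placeholders used in the curl command.

```python
import json
import urllib.request


def build_predict_request(api_region, project_id, auth_token, prompt):
    """Build (but do not send) the same POST request as the curl command."""
    url = (
        f"https://{api_region}.googleapis.com/v1/projects/{project_id}"
        "/locations/us-central1/publishers/google/models/text-bison@001:predict"
    )
    body = {
        "instances": [{"content": prompt}],
        "parameters": {
            "temperature": 0.5,
            "maxOutputTokens": 256,
            "topP": 0.8,
            "topK": 40,
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute real ones before sending
request = build_predict_request("API_REGION", "PROJECT_ID", "auth-token", "Hello")
print(request.get_method(), request.full_url)
```

With real credentials, passing the request to `urllib.request.urlopen` would send it. Parsing the response is left out here because the slides do not show the response format.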

Python SDK Usage

    import vertexai
    from vertexai.language_models import TextGenerationModel

    # project_id is assumed to hold your Google Cloud project ID
    vertexai.init(project=project_id, location="us-central1")
    model = TextGenerationModel.from_pretrained("text-bison@001")

    prompt = """
    Explain what is going on in this horror story below:
    There was a picture on my phone of me sleeping. I live alone.
    """

    # Sampling parameters match the ones used in the API example
    response = model.predict(
        prompt,
        temperature=0.5,
        max_output_tokens=256,
        top_p=0.8,
        top_k=40,
    )
    print(f"Response from Model: {response.text}")

Story Credit: /u/guztaluz
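
When the sampling parameters come from user input, it may help to check them against the ranges quoted in the Parameters & Tuning slides before calling the model. The helper below is a hypothetical sketch, not part of the Vertex AI SDK. Its keyword names simply mirror the arguments passed to `model.predict` above.

```python
def generation_params(temperature=0.5, max_output_tokens=256, top_p=0.8, top_k=40):
    """Validate sampling parameters against the ranges quoted in the slides."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be in [0.0, 1.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if not (isinstance(top_k, int) and 1 <= top_k <= 40):
        raise ValueError("top_k must be an integer in [1, 40]")
    if max_output_tokens < 1:
        raise ValueError("max_output_tokens must be positive")
    return {
        "temperature": temperature,
        "max_output_tokens": max_output_tokens,
        "top_p": top_p,
        "top_k": top_k,
    }


params = generation_params(temperature=0.2)
print(params)
# With the SDK set up as above: response = model.predict(prompt, **params)
```

Failing early with a clear message is usually friendlier than waiting for the API to reject an out-of-range value.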

Questions?
‣ These Slides → shyr.io/t/gcp-vertex-palm
‣ More Talks → shyr.io/talks
‣ Official Docs → cloud.google.com/vertex-ai/docs
‣ PaLM 2 → ai.google/discover/palm2
‣ Code Lab → to.shyr.io/vertex-ai-codelab
🌎 shyr.io • [email protected] • @sheharyarn
