Gemini 2.0 for developers (Milan, 20 Mar 2025)

For developers. Milan, 20 March 2025. Riccardo Carlesso, Developer Advocate

Imagine an AI that can read, hear, see, and respond in real time

Gemini is an umbrella brand used in Google products that utilize the Gemini model

Hello to

Features across models

Feature           | Gemini 2.0 Flash    | Gemini 2.0 Flash-Lite | Gemini 2.0 Pro Experimental
Release status    | Generally available | Generally available   | Available today
Multimodal inputs | ✅                  | ✅                    | ✅
Text output       | ✅                  | ✅                    | ✅
Image output      | Coming soon…        | ❌                    | Coming soon

Features across models

Feature           | Gemini 2.0 Flash    | Gemini 2.0 Flash-Lite | Gemini 2.0 Pro Experimental
Release status    | Generally available | Generally available   | Available today
Multimodal inputs | ✅                  | ✅                    | ✅
Text output       | ✅                  | ✅                    | ✅
Image output      | ✅ !!!              | ❌                    | Coming soon

gemini-2.0-flash
A GA model on:
• Gemini Developer API, on Google AI Studio
• Gemini API, on Vertex AI Studio

gemini-2.0-flash-thinking-exp-01-21
An experimental model trained to generate the thinking process for stronger reasoning across math and science

• Performance
• Multimodal Live API
• Unified SDK
• Native Tool Use
• Native Image & Audio Output (preview)
• Spatial Understanding

• Enterprise: Vertex AI Studio, Google Cloud APIs (cloud.google.com/vertex-ai)
• Consumers: Gemini (gemini.google.com)
• Developers: Google AI Studio, Google APIs (aistudio.google.com, ai.google.dev/gemini-api)

Demo1: Rio de Janeiro - chapterize

Performance

• Gemini 2.0 Flash offers 2x the speed of Gemini 1.5 Pro
• Stronger performance on multimodal, text, code, video, spatial understanding and reasoning

Demo2: In AI Studio, show Gemini 2.0 Flash Thinking Mode with the pool image and this prompt:
How do I use three of the pool balls to sum up to 30?

6 + 11 + 13 = 30
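
The arithmetic behind this answer can be checked by brute force. The sketch below rests on two assumptions that are not stated on the slide: that the image shows only the odd-numbered balls 1 to 15, and that the answer relies on reading the 9 ball upside down as a 6. The helper name `triples_summing_to` is made up for illustration.

```python
from itertools import combinations


def triples_summing_to(balls, target):
    """Return every 3-ball combination (sorted) whose numbers add up to target."""
    return [tuple(sorted(c)) for c in combinations(balls, 3) if sum(c) == target]


# Assumption: the picture shows only odd-numbered balls.
odd_balls = [1, 3, 5, 7, 9, 11, 13, 15]
# Three odd numbers always add up to an odd number, so 30 is unreachable.
print(triples_summing_to(odd_balls, 30))

# Assumption: the 9 ball is turned upside down and read as a 6.
flipped = [6 if b == 9 else b for b in odd_balls]
print(triples_summing_to(flipped, 30))
```

Under these assumptions the first search finds nothing, and the second finds the single combination given on the slide.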

Unified SDK

The new Google Gen AI SDK
Unified interface to Gemini 2.0 (and 1.5):
• Gemini Developer API on Google AI Studio
• Gemini API on Vertex AI

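Gemini Developer API on Google AI Studio

    from google import genai

    # Replace the placeholder with your own Gemini API key.
    client = genai.Client(api_key="your-gemini-api-key")
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents="Why is the sky blue?",
    )
    print(response.text)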

Gemini API on Vertex AI

    from google import genai

    # Replace the placeholder with your own Google Cloud project ID.
    client = genai.Client(
        vertexai=True,
        project="your-google-cloud-project",
        location="us-central1",
    )
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents="Why is the sky blue?",
    )
    print(response.text)

Using environment variables

Gemini Developer API on Google AI Studio:

    export GOOGLE_API_KEY='your-api-key'

Gemini API on Vertex AI:

    export GOOGLE_GENAI_USE_VERTEXAI=true
    export GOOGLE_CLOUD_PROJECT='your-project-id'
    export GOOGLE_CLOUD_LOCATION='us-central1'

Common client initialization:

    client = genai.Client()
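
To make the relationship between the environment variables and the explicit constructor arguments concrete, here is a minimal sketch. `client_kwargs_from_env` is a hypothetical helper, not part of the SDK, and it does not claim to reproduce the SDK's own resolution logic; the `us-central1` fallback is also an assumption.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for genai.Client from environment variables.

    Hypothetical helper: it only mirrors the two explicit constructors
    shown on the slides.
    """
    env = os.environ if env is None else env
    if env.get("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true":
        return {
            "vertexai": True,
            "project": env["GOOGLE_CLOUD_PROJECT"],
            # Assumption: fall back to us-central1 when no location is set.
            "location": env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        }
    return {"api_key": env["GOOGLE_API_KEY"]}


# Gemini Developer API configuration.
print(client_kwargs_from_env({"GOOGLE_API_KEY": "your-api-key"}))

# Vertex AI configuration.
print(client_kwargs_from_env({
    "GOOGLE_GENAI_USE_VERTEXAI": "true",
    "GOOGLE_CLOUD_PROJECT": "your-project-id",
    "GOOGLE_CLOUD_LOCATION": "us-central1",
}))
```

The resulting dict could be passed on as `genai.Client(**client_kwargs_from_env())`, which makes explicit what the bare `genai.Client()` call is expected to pick up on its own.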

Native Tool Use

Google Search Tool
Ground model responses in Google Search results, for more accurate, up-to-date, and relevant responses

Why grounding? Ask an LLM:
• ⚛ Explain Einstein’s Relativity
• 🪐 How many moons does Saturn have?
• ⌚ What time is it today?
• 🌤 What is the weather like in Milan?

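Google Search Tool

    from google.genai.types import GenerateContentConfig, GoogleSearch, Tool

    # Let the model ground its answer in Google Search results.
    google_search_tool = Tool(google_search=GoogleSearch())
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents="What's the weather like today in Milan?",
        config=GenerateContentConfig(tools=[google_search_tool]),
    )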

Demo3: In Vertex AI Studio, show this prompt without and with Google Search grounding, and also show the code:
What's the weather like in Milan today? [Specify in JSON format]

Code Execution Tool
• Model generates and runs Python code
• Useful for applications that benefit from code-based reasoning (e.g. solving equations)

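Code Execution Tool

    from google.genai.types import GenerateContentConfig, Tool, ToolCodeExecution

    # Let the model write and run Python code to compute the answer.
    code_execution_tool = Tool(code_execution=ToolCodeExecution())
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents="What is the sum of the first 50 prime numbers?",
        config=GenerateContentConfig(
            tools=[code_execution_tool],
            temperature=0,
        ),
    )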

Demo4: In Google AI Studio, show this prompt without and with Code Execution:
What is the sum of the first 50 prime numbers? (should be 5117)
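
The expected value can be verified locally. The following is a small independent check, not the code the model generates during the demo; the `primes` generator uses plain trial division.

```python
from itertools import count, islice


def primes():
    """Yield primes in increasing order, using trial division by earlier primes."""
    found = []
    for n in count(2):
        # n is prime if no earlier prime up to sqrt(n) divides it.
        if all(n % p for p in found if p * p <= n):
            found.append(n)
            yield n


first_50 = list(islice(primes(), 50))
print(first_50[-1], sum(first_50))  # prints: 229 5117
```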

Automatic Function Calling
Submit a Python function for automatic function calling (instead of submitting an OpenAPI specification of the function)

Automatic Function Calling

    def get_current_weather(location: str) -> str:
        """Example method. Returns the current weather.

        Args:
            location: The city and state, e.g. San Francisco, CA
        """
        weather_map: dict[str, str] = {
            "Barcelona": "sunny",
            "Paris": "foggy",
            "Milan": "raining",
            "Rome": "hot",
            "London, UK": "rainy and dark",
        }
        return weather_map.get(location, "unknown")
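
A typed, documented Python function carries enough information to derive a function declaration, which is what makes a hand-written specification unnecessary. The sketch below illustrates that idea with the standard `inspect` module. It is not the SDK's implementation: `describe_function`, `TYPE_NAMES`, and the shape of the resulting dict are assumptions, and the weather map is shortened.

```python
import inspect


def get_current_weather(location: str) -> str:
    """Returns the current weather.

    Args:
        location: The city and state, e.g. San Francisco, CA
    """
    weather_map = {"Milan": "raining", "Rome": "hot"}
    return weather_map.get(location, "unknown")


# Hypothetical mapping from Python annotations to schema type names.
TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(func):
    """Derive a declaration-like dict from a function's signature and docstring."""
    params = inspect.signature(func).parameters
    return {
        "name": func.__name__,
        # Use the first docstring line as the short description.
        "description": inspect.getdoc(func).splitlines()[0],
        "parameters": {
            "type": "object",
            "properties": {
                name: {"type": TYPE_NAMES.get(p.annotation, "string")}
                for name, p in params.items()
            },
            # Parameters without a default value are treated as required.
            "required": [
                name for name, p in params.items()
                if p.default is inspect.Parameter.empty
            ],
        },
    }


declaration = describe_function(get_current_weather)
print(declaration)
```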

Manual Function Calling (before)
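
With manual function calling, the application has to route the model's function call to the right Python function itself. The sketch below shows only that dispatch step. The `function_call` dict is a hand-written stand-in with an assumed shape (`name` plus `args`), not a real model response, and `dispatch` and `AVAILABLE_FUNCTIONS` are hypothetical names.

```python
def get_current_weather(location: str) -> str:
    """Returns the current weather for a few hard-coded cities."""
    weather_map = {"Milan": "raining", "Rome": "hot"}
    return weather_map.get(location, "unknown")


# Registry of functions the application is willing to run for the model.
AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}


def dispatch(function_call):
    """Look up the requested function by name and call it with the given args."""
    func = AVAILABLE_FUNCTIONS.get(function_call["name"])
    if func is None:
        raise ValueError(f"Unknown function: {function_call['name']}")
    return func(**function_call["args"])


# Stand-in for the function call a model might return (assumed shape).
function_call = {"name": "get_current_weather", "args": {"location": "Milan"}}
result = dispatch(function_call)
print(result)  # prints: raining
```

In a full manual flow the result would then be sent back to the model in a follow-up request so that it can phrase the final answer.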

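Automatic Function Calling

    from google.genai.types import GenerateContentConfig

    # Pass the Python function itself as a tool; the SDK calls it when the model asks.
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents="What is the weather like in Milan?",
        config=GenerateContentConfig(
            tools=[get_current_weather],
            temperature=0,
        ),
    )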

Maps Function Calling within Gemini (on gemini.google.com)
• What pizzerias are there in Milan? Let's stay in the Isola area (within 1 km of via Confalonieri). We are on foot.
• Now find the phone numbers of those restaurants. Now add the distance from the Google office (in via Confalonieri).
• OK, then for these restaurants please give me phone number, address, and a short description in tabular form.
Isola pizzeria data for noisy people

Spatial Understanding

Improved accuracy on 2D and 3D spatial understanding
https://aistudio.google.com/starter-apps/spatial
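
Spatial results are typically consumed as bounding boxes drawn over the image. The sketch below assumes the convention described in the Gemini API documentation, where a 2D box is `[ymin, xmin, ymax, xmax]` normalized to a 0-1000 range. That convention is not stated on the slide, and the sample box, the image size, and the helper name `to_pixel_box` are made up for illustration.

```python
def to_pixel_box(box_2d, image_width, image_height):
    """Convert [ymin, xmin, ymax, xmax] on a 0-1000 scale to pixel (left, top, right, bottom)."""
    ymin, xmin, ymax, xmax = box_2d
    return (
        round(xmin / 1000 * image_width),
        round(ymin / 1000 * image_height),
        round(xmax / 1000 * image_width),
        round(ymax / 1000 * image_height),
    )


# Hypothetical box on a 1920x1080 image.
sample_box = [100, 250, 500, 750]
print(to_pixel_box(sample_box, image_width=1920, image_height=1080))
# prints: (480, 108, 1440, 540)
```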

Demo5: Open https://aistudio.google.com/starter-apps/spatial and go through some scenarios:
1. Animal shade // animal name in Italian with emoji
2. Fish // specific food name in Italian

Multimodal Live API

Enables low-latency, two-way interactions
→ Input: text, audio, and video
← Output: audio and text

Multimodal Live API: key capabilities
• Multimodality: model can see, hear, speak
• Low-latency: for realtime interaction
• Memory: model remembers the session
• Tools: function calling, code execution, and Google Search

Demo6: First, show that Live API is available in Google AI Studio. Then, show 1-2 of these starter apps:

    cd ~/git/pvt-gemini20

Ricc Demo

Native Image & Audio Output (preview)

• Gemini 2.0 introduces native image generation and text-to-speech capabilities
• Enables image generation / editing and expressive storytelling

Demo7: Image output -> AI Studio -> select Gemini Video > Visual Story

Gemini2.0: text+image (story) Shrek in Milan: A Digital Story

Veo: text to video
https://www.youtube.com/shorts/sAAh4aI7ZA8
“Shrek and Fiona are doing an Ironman in Japan: Shrek is cycling in a trisuite while Fiona is getting out of the water right now. The camera fades out, then the view goes to above. On a distance, donkey is taking pictures of the green couple, on top of his big red dragon.”

Veo: text+image to video
“Person becomes Shrek in front of Milan Duomo square, then eats a panettone”

Veo: text+image to video
“This code enters a rocket ship and flies to the colorful clouds of Google Cloud!”

Veo app “mosaic” with Streamlit

    $ cd ~/git/genai-googlecloud-scripts/22-gemini20/ && make app
    => http://localhost:8501/

Keep exploring
• goo.gle/gemini2
• goo.gle/multimodal-live-api

Thank you!
Riccardo Carlesso, Developer Advocate at Google
@palladius
Ricc.rocks
http://linkedin.com/in/riccardo-carlesso
https://speakerdeck.com/palladius
