Slide 1

Gemini 2.0 for developers
Mete Atamel, Developer Advocate @ Google
@meteatamel · atamel.dev · speakerdeck.com/meteatamel

Slide 2

Hello to

Slide 3

No content

Slide 4

Features across models

Slide 5

gemini-2.0-flash
A GA model on:
● Gemini Developer API, on Google AI Studio
● Gemini API, on Vertex AI Studio

Slide 6

gemini-2.0-flash-thinking-exp-01-21
An experimental model trained to generate the thinking process for stronger reasoning across math and science

Slide 7

● Performance
● Multimodal Live API
● Unified SDK
● Native Tool Use
● Native Image & Audio Output (preview)
● Spatial Understanding

Slide 8

Consumers: Gemini – gemini.google.com
Developers: Google AI Studio, Google APIs – aistudio.google.com, ai.google.dev/gemini-api
Enterprise: Vertex AI Studio, Google Cloud APIs – cloud.google.com/vertex-ai

Slide 9

Performance

Slide 10

Gemini 2.0 Flash offers 2x the speed of Gemini 1.5 Pro
Stronger performance on multimodal, text, code, video, spatial understanding, and reasoning

Slide 11

No content

Slide 12

No content

Slide 13

No content

Slide 14

Unified SDK

Slide 15

No content

Slide 16

The new Google Gen AI SDK
Unified interface to Gemini 2.0 (and 1.5):
● Gemini Developer API on Google AI Studio
● Gemini API on Vertex AI

Slide 17

Gemini Developer API on Google AI Studio

from google import genai

client = genai.Client(api_key="your-gemini-api-key")
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="Why is the sky blue?")

Slide 18

Gemini API on Vertex AI

client = genai.Client(
    vertexai=True,
    project="your-google-cloud-project",
    location="us-central1")
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="Why is the sky blue?")

Slide 19

Native Tool Use

Slide 20

Google Search Tool
Ground model responses in Google Search results for more accurate, up-to-date, and relevant responses

Slide 21

Google Search Tool

google_search_tool = Tool(google_search=GoogleSearch())
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="What's the weather like today in London?",
    config=GenerateContentConfig(tools=[google_search_tool]))

Slide 22

Code Execution Tool
Model generates and runs Python code. Useful for applications that benefit from code-based reasoning (e.g. solving equations).

Slide 23

Code Execution Tool

code_execution_tool = Tool(code_execution=ToolCodeExecution())
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="What is the sum of the first 50 prime numbers?",
    config=GenerateContentConfig(
        tools=[code_execution_tool],
        temperature=0))
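For reference, here is a minimal local sketch of the kind of Python the code-execution tool might generate and run for this prompt. The actual code is produced by the model at request time and may differ; this is only an illustration of the computation.

```python
# Illustrative sketch: trial division by previously found primes,
# the sort of code the model might generate for this prompt.
def first_n_primes(n: int) -> list[int]:
    primes: list[int] = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime iff no smaller prime divides it
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

print(sum(first_n_primes(50)))  # 5117
```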

Slide 24

Automatic Function Calling
Submit a Python function for automatic function calling (instead of submitting an OpenAPI specification of the function)

Slide 25

Function Calling–before
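The slide's image is not in this export. As a hedged illustration of the "before" approach it refers to: instead of passing the Python function itself, you would describe it to the model with an OpenAPI-style schema and then dispatch the model's function call yourself. The declaration below is illustrative, not the slide's original content.

```python
# Illustrative "before" sketch: an OpenAPI-style declaration of
# get_current_weather, written out by hand instead of passing the
# Python function directly to the SDK.
get_current_weather_declaration = {
    "name": "get_current_weather",
    "description": "Returns the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
            },
        },
        "required": ["location"],
    },
}
```

With automatic function calling (next slides), the SDK derives this schema from the function's signature and docstring, and also invokes the function for you.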

Slide 26

Automatic Function Calling

def get_current_weather(location: str) -> str:
    """Example method. Returns the current weather.

    Args:
        location: The city and state, e.g. San Francisco, CA
    """
    weather_map: dict[str, str] = {
        "Boston, MA": "snowing",
        "San Francisco, CA": "foggy",
        "Seattle, WA": "raining",
        "Austin, TX": "hot",
        "London, UK": "rainy and dark",
    }
    return weather_map.get(location, "unknown")

Slide 27

Automatic Function Calling

response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="What is the weather like in Austin?",
    config=GenerateContentConfig(
        tools=[get_current_weather],
        temperature=0))

Slide 28

Spatial Understanding

Slide 29

Spatial Understanding
Improved accuracy on 2D and 3D spatial understanding

Slide 30

Multimodal Live API

Slide 31

Multimodal Live API
Enables low-latency, two-way interactions
→ Input: text, audio, and video
← Output: audio and text

Slide 32

Multimodal Live API – key capabilities
● Multimodality – model can see, hear, speak
● Low-latency – for realtime interaction
● Memory – model remembers the session
● Tools – function calling, code execution, and Google Search
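The capabilities above can be sketched with the Google Gen AI SDK's Live API. This is a minimal text-in/text-out sketch assuming the SDK's `client.aio.live.connect` interface and a Live-API-capable model name (`gemini-2.0-flash-exp` at the time of writing); method names vary between SDK versions, the API key is a placeholder, and a network connection is required, so treat it as a sketch rather than a definitive implementation.

```python
# Minimal Live API sketch (assumptions noted in the text above):
# open a bidirectional session, send one turn, stream the reply.
import asyncio

from google import genai

client = genai.Client(api_key="your-gemini-api-key")  # placeholder key


async def main() -> None:
    # Text keeps the sketch simple; audio/video use the same session
    # with different response modalities.
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
            model="gemini-2.0-flash-exp", config=config) as session:
        await session.send(input="Hello, Gemini!", end_of_turn=True)
        # The model streams its answer back in chunks.
        async for response in session.receive():
            if response.text:
                print(response.text, end="")


if __name__ == "__main__":
    asyncio.run(main())
```

Because the session stays open, follow-up turns in the same `async with` block benefit from the session memory listed above.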

Slide 33

Native Image & Audio Output (preview)

Slide 34

Native Image & Audio Output (preview)
Gemini 2.0 introduces native image generation and text-to-speech capabilities, enabling image generation/editing and expressive storytelling.

Slide 35

No content

Slide 36

No content

Slide 37

Thank you!
Mete Atamel, Developer Advocate at Google
@meteatamel · atamel.dev · speakerdeck.com/meteatamel