Slide 1

Analyzing Images with Google’s Cloud Vision API
Sara Robinson, Developer Advocate
@SRobTweets / @googlecloud

Slide 2

What we’ll cover
01 Image detection in the wild
02 What is the Cloud Vision API?
03 Let’s make an API request!
04 Live demo

Slide 3

01 Image detection in the wild

Slide 4

Image Search

Slide 5

Google Photos

Slide 6

Google Cloud Platform
But how do they work?

Slide 7

Neural networks

Slide 8

TensorFlow
● Open source ML library for researchers and developers to build and train their own deep learning models
● Python and C++ APIs
● Used in many areas across Google
tensorflow.org / github.com/tensorflow

Slide 9

Powering many Google Services
● Speech Recognition
● Google Photos
● Gmail

Slide 10

What if I’m not a machine learning expert?

Slide 11

02 The Cloud Vision API
Complex image detection with a simple REST request

Slide 12

Types of Detection
● Label
● Landmark
● Logo
● Face
● Text
● Safe search

Slide 13

Types of Detection
Face Detection
○ Find multiple faces
○ Location of eyes, nose, mouth
○ Detect emotions: joy, anger, surprise, sorrow
Entity Detection
○ Find common objects and landmarks, and their location in the image
○ Detect explicit content

Slide 14

03 Making an API request

Slide 15

Making a request

{
  "requests": [
    {
      "image": { "content": "base64ImageString" },
      "features": [
        { "type": "LABEL_DETECTION", "maxResults": 10 },
        { "type": "FACE_DETECTION", "maxResults": 10 }
        // More feature detection types...
      ]
    }
  ]
}
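A request body like the one above can be assembled in a few lines. The sketch below is a minimal illustration, not the official client library; the `images:annotate` endpoint URL and API-key query parameter are assumptions based on the public Cloud Vision REST API, and `YOUR_API_KEY` is a placeholder:

```python
import base64
import json

# Assumed endpoint for the Cloud Vision REST API; replace YOUR_API_KEY
# with a real key (or use OAuth instead of key-based auth).
VISION_URL = "https://vision.googleapis.com/v1/images:annotate?key=YOUR_API_KEY"

def build_request(image_bytes, features):
    """Build an annotate request body.

    `features` is a list of (type, maxResults) pairs, e.g.
    [("LABEL_DETECTION", 10), ("FACE_DETECTION", 10)].
    """
    return {
        "requests": [{
            # Images are sent inline as a base64-encoded string.
            "image": {"content": base64.b64encode(image_bytes).decode("utf-8")},
            "features": [{"type": t, "maxResults": n} for t, n in features],
        }]
    }

body = build_request(b"not-a-real-image",
                     [("LABEL_DETECTION", 10), ("FACE_DETECTION", 10)])
print(json.dumps(body, indent=2))
# To send it: POST `body` as JSON to VISION_URL,
# e.g. requests.post(VISION_URL, json=body)
```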

Slide 16

Let’s see it in action!

Slide 17

Label Detection

{
  "labelAnnotations": [
    { "mid": "/m/01wydv", "score": 0.92442685, "description": "beignet" },
    { "mid": "/m/0270h", "score": 0.90845567, "description": "dessert" },
    { "mid": "/m/033nb2", "score": 0.74553984, "description": "profiterole" },
    { "mid": "/m/01dk8s", "score": 0.71415579, "description": "powdered sugar" }
  ]
}
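In application code you usually keep only the confident labels. A small sketch over a response shaped like the one above (the 0.8 cutoff is an arbitrary illustrative threshold, not an API default):

```python
# Response shaped like the label-detection example above.
response = {
    "labelAnnotations": [
        {"mid": "/m/01wydv", "score": 0.92442685, "description": "beignet"},
        {"mid": "/m/0270h", "score": 0.90845567, "description": "dessert"},
        {"mid": "/m/033nb2", "score": 0.74553984, "description": "profiterole"},
        {"mid": "/m/01dk8s", "score": 0.71415579, "description": "powdered sugar"},
    ]
}

def top_labels(resp, threshold=0.8):
    """Return descriptions of labels whose score meets the threshold."""
    return [a["description"] for a in resp.get("labelAnnotations", [])
            if a["score"] >= threshold]

print(top_labels(response))  # ['beignet', 'dessert']
```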

Slide 18

Landmark Detection

{
  "landmarkAnnotations": [
    {
      "boundingPoly": { "vertices": [ { "x": 52, "y": 25 }, ... ] },
      "mid": "/m/0b__kbm",
      "score": 0.4231607,
      "description": "The Wizarding World of Harry Potter",
      "locations": [
        { "latLng": { "longitude": -81.471261, "latitude": 28.473 } }
      ]
    }
  ]
}

Photo attributions: Eiffel Tower (Creative Commons via Sathish J), Lens (Creative Commons via Mark Hunter)

Slide 19

Logo Detection

"logoAnnotations": [
  {
    "boundingPoly": { "vertices": [ { "x": 130, "y": 157 }, ... ] },
    "mid": "/m/018c_r",
    "score": 0.811352,
    "description": "Google"
  }
]

Slide 20

Face Detection

"faceAnnotations": [
  {
    "headwearLikelihood": "VERY_UNLIKELY",
    "surpriseLikelihood": "VERY_UNLIKELY",
    "rollAngle": 8.5484314,
    "angerLikelihood": "VERY_UNLIKELY",
    "detectionConfidence": 0.9996134,
    "joyLikelihood": "VERY_LIKELY",
    "panAngle": 18.178885,
    "sorrowLikelihood": "VERY_UNLIKELY",
    "tiltAngle": -12.244568,
    "underExposedLikelihood": "VERY_UNLIKELY",
    "blurredLikelihood": "VERY_UNLIKELY",
    "landmarks": [
      { "type": "LEFT_EYE", "position": { "x": 268.25815, "y": 491.55255, "z": -0.0022390306 } },
      { "type": "RIGHT_EYE", "position": { "x": 418.42868, "y": 508.22632, "z": 49.302765 } },
      { "type": "MIDPOINT_BETWEEN_EYES", "position": { "x": 359.86551, "y": 500.2868, "z": -7.9241152 } },
      { "type": "NOSE_TIP", "position": { "x": 358.51404, "y": 611.80286, "z": -31.350466 } },
      ...
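The emotion fields come back as likelihood enum strings rather than numbers, so comparing them requires a ranking. A minimal sketch, assuming the standard likelihood values shown in the response above (the numeric ranks and the `is_joyful` helper are illustrative, not part of the API):

```python
# Hypothetical numeric ranking over the API's likelihood enum strings.
LIKELIHOOD_RANK = {
    "VERY_UNLIKELY": 0,
    "UNLIKELY": 1,
    "POSSIBLE": 2,
    "LIKELY": 3,
    "VERY_LIKELY": 4,
    "UNKNOWN": -1,
}

def is_joyful(face, min_rank=3):
    """True if joyLikelihood is at least LIKELY (by default)."""
    likelihood = face.get("joyLikelihood", "UNKNOWN")
    return LIKELIHOOD_RANK.get(likelihood, -1) >= min_rank

face = {"joyLikelihood": "VERY_LIKELY", "angerLikelihood": "VERY_UNLIKELY"}
print(is_joyful(face))  # True
```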

Slide 21

Text Detection

{
  "locale": "en",
  "boundingPoly": {
    "vertices": [
      { "x": 99, "y": 220 },
      { "x": 551, "y": 220 },
      { "x": 551, "y": 345 },
      { "x": 99, "y": 345 }
    ]
  },
  "description": "WINCHESTER\nCOTTAGE S\nCOPPERFIELD STREET\nLONDON BOROUGH OF SOUTHWARK\n"
}

Slide 22

04 Live Demo

Slide 23

We’ll build
● An iOS app in Swift that...
● Runs label detection on an image
● Stores label data in Firebase

Slide 24

We’ll build

Slide 25

Resources
● Sign up for the alpha: cloud.google.com/vision
● Read the announcement blog post: bit.ly/vision-api-blog
● Get started with TensorFlow: tensorflow.org
● Jeff Dean’s presentation on neural networks: bit.ly/google-brain-ml

Photo Attributions
● Eiffel Tower: Creative Commons via Sathish J
● Lens: Creative Commons via Mark Hunter
● Harry Potter World: Creative Commons via daihung
● London building: Creative Commons via DncnH
● iPhone camera: Creative Commons via David Goehring

Slide 26

One more thing... GCP NEXT 2016

Slide 27

Confidential & Proprietary

ABOUT GCP NEXT 2016

WHAT
NEXT 2016 is a user conference and learning opportunity for developers, IT professionals, and technologists who want to understand what’s next for cloud technology. The two-day conference will feature:
- product updates and demonstrations
- perspectives from industry leaders
- hands-on code labs
- 30+ breakout sessions
- technical training
- opportunities to connect with Google engineers and the GCP community

WHERE/WHEN
Pier 48, San Francisco, CA
March 23-24, 2016

HOW MUCH
$499/attendee ($349 early bird rate ends 5 Feb)

Slide 28

Top 5 reasons why developers need to go...
Don’t miss out. Reserve your seat: goo.gl/lNPpwr

Slide 29

Thank You
Sara Robinson / @SRobTweets