Analyzing Images with Google's Cloud Vision API

sararob
February 06, 2016

Ever wondered about the technology behind Google Photos? Or wanted to build an app that performs complex image analysis, like detecting objects, faces, emotions, and landmarks? The new Google Cloud Vision API (currently in alpha) exposes the machine learning models that power Google Photos and Google Image Search. Developers can now access these features with just a simple REST API call. We’ll learn how to make a request to the Vision API, and then we’ll see it classify images, extract text, and even identify landmarks like Harry Potter World. We’ll end the talk by live coding an iOS app that implements image detection with the Vision API.

Transcript

  1. Analyzing Images with Google’s Cloud Vision API. Sara Robinson, Developer Advocate, @SRobTweets / @googlecloud
  2. What we’ll cover
    01 Image detection in the wild
    02 What is the Cloud Vision API?
    03 Let’s make an API request!
    04 Live demo
  3. 01 Image detection in the wild

  4. Image Search
  5. Google Photos
  6. But how do they work?
  7. Neural networks
  8. TensorFlow
    • Open source ML library for researchers and developers to build and train their own deep learning models
    • Python and C++ APIs
    • Used in many areas across Google
    tensorflow.org, github.com/tensorflow
  9. Powering many Google Services
    • Speech Recognition
    • Google Photos
    • Gmail
  10. What if I’m not a machine learning expert?

  11. 02 The Cloud Vision API: Complex image detection with a simple REST request
  12. Types of Detection
    • Label
    • Landmark
    • Logo
    • Face
    • Text
    • Safe search
  13. Types of Detection
    Face Detection
    ◦ Find multiple faces
    ◦ Location of eyes, nose, mouth
    ◦ Detect emotions: joy, anger, surprise, sorrow
    Entity Detection
    ◦ Find common objects and landmarks, and their location in the image
    ◦ Detect explicit content
  14. 03 Making an API request

  15. Making a request
    {
      "requests": [
        {
          "image": {
            "content": "base64ImageString"
          },
          "features": [
            { "type": "LABEL_DETECTION", "maxResults": 10 },
            { "type": "FACE_DETECTION", "maxResults": 10 },
            // More feature detection types...
          ]
        }
      ]
    }
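The request body on this slide can be assembled programmatically. The following is a minimal Python sketch, not code from the talk: the helper names `build_annotate_request` and `send_request` are hypothetical, and the endpoint URL and `key` query parameter are assumptions based on the generally available v1 REST API (the alpha described here may have used a different path). `send_request` is defined only for illustration and is not called.

```python
import base64
import json
import urllib.request

# Assumption: v1 endpoint of the released API; the alpha path may have differed.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types, max_results=10):
    """Build the JSON body from the slide for a single image."""
    # The API expects the raw image bytes as a base64-encoded string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    features = [{"type": t, "maxResults": max_results} for t in feature_types]
    return {"requests": [{"image": {"content": encoded}, "features": features}]}


def send_request(body, api_key):
    """POST the body to the API (hypothetical helper; needs a valid API key)."""
    request = urllib.request.Request(
        VISION_ENDPOINT + "?key=" + api_key,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder bytes stand in for a real image file read in binary mode.
body = build_annotate_request(b"not a real image", ["LABEL_DETECTION", "FACE_DETECTION"])
payload = json.dumps(body)
```

Each entry in `feature_types` becomes one object in the `features` array, so adding another detection type is a matter of appending its name to the list.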
  16. Let’s see it in action!

  17. Label Detection
    {
      "labelAnnotations" : [
        { "mid" : "\/m\/01wydv", "score" : 0.92442685, "description" : "beignet" },
        { "mid" : "\/m\/0270h", "score" : 0.90845567, "description" : "dessert" },
        { "mid" : "\/m\/033nb2", "score" : 0.74553984, "description" : "profiterole" },
        { "mid" : "\/m\/01dk8s", "score" : 0.71415579, "description" : "powdered sugar" }
      ]
    }
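One common way to consume a label response like this is to keep only the labels above a confidence threshold. The sketch below parses the JSON from the slide; the function name `labels_above` and the 0.9 cutoff are illustrative choices, not part of the API.

```python
import json

# The label detection response from the slide, verbatim.
response_text = r'''
{
  "labelAnnotations" : [
    { "mid" : "\/m\/01wydv", "score" : 0.92442685, "description" : "beignet" },
    { "mid" : "\/m\/0270h", "score" : 0.90845567, "description" : "dessert" },
    { "mid" : "\/m\/033nb2", "score" : 0.74553984, "description" : "profiterole" },
    { "mid" : "\/m\/01dk8s", "score" : 0.71415579, "description" : "powdered sugar" }
  ]
}
'''


def labels_above(response, min_score):
    """Return (description, score) pairs whose score is at least min_score."""
    return [
        (label["description"], label["score"])
        for label in response.get("labelAnnotations", [])
        if label["score"] >= min_score
    ]


response = json.loads(response_text)
confident_labels = labels_above(response, 0.9)
print(confident_labels)
```

Note that `\/` is a valid JSON escape for `/`, so after parsing the first `mid` value is the plain string `/m/01wydv`.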
  18. Landmark Detection
    {
      "landmarkAnnotations" : [
        {
          "boundingPoly" : {
            "vertices" : [ { "x" : 52, "y" : 25 }, ... ]
          },
          "mid" : "\/m\/0b__kbm",
          "score" : 0.4231607,
          "description" : "The Wizarding World of Harry Potter",
          "locations" : [
            { "latLng" : { "longitude" : -81.471261, "latitude" : 28.473 } }
          ]
        }
      ]
    }
    Photo attributions: Eiffel Tower (Creative Commons via Sathish J), Lens (Creative Commons via Mark Hunter)
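A landmark annotation carries both a name and geographic coordinates. As a rough sketch, the fields can be pulled out as follows; the sample dictionary copies the slide's values but omits `boundingPoly`, because most of its vertices are elided on the slide, and `landmark_locations` is a hypothetical helper name.

```python
# Landmark response from the slide, without the partially elided boundingPoly.
response = {
    "landmarkAnnotations": [
        {
            "mid": "/m/0b__kbm",
            "score": 0.4231607,
            "description": "The Wizarding World of Harry Potter",
            "locations": [
                {"latLng": {"longitude": -81.471261, "latitude": 28.473}}
            ],
        }
    ]
}


def landmark_locations(response):
    """Return (description, score, latitude, longitude) for each location."""
    results = []
    for landmark in response.get("landmarkAnnotations", []):
        for location in landmark.get("locations", []):
            lat_lng = location["latLng"]
            results.append(
                (
                    landmark["description"],
                    landmark["score"],
                    lat_lng["latitude"],
                    lat_lng["longitude"],
                )
            )
    return results


landmarks = landmark_locations(response)
for name, score, latitude, longitude in landmarks:
    print(f"{name} ({score:.2f}) at {latitude}, {longitude}")
```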
  19. Logo Detection
    "logoAnnotations" : [
      {
        "boundingPoly" : {
          "vertices" : [ { "x" : 130, "y" : 157 }, ... ]
        },
        "mid" : "\/m\/018c_r",
        "score" : 0.811352,
        "description" : "Google"
      }
    ]
  20. Face Detection
    "faceAnnotations" : [
      {
        "headwearLikelihood" : "VERY_UNLIKELY",
        "surpriseLikelihood" : "VERY_UNLIKELY",
        "rollAngle" : 8.5484314,
        "angerLikelihood" : "VERY_UNLIKELY",
        "detectionConfidence" : 0.9996134,
        "joyLikelihood" : "VERY_LIKELY",
        "panAngle" : 18.178885,
        "sorrowLikelihood" : "VERY_UNLIKELY",
        "tiltAngle" : -12.244568,
        "underExposedLikelihood" : "VERY_UNLIKELY",
        "blurredLikelihood" : "VERY_UNLIKELY",
        "landmarks" : [
          { "type" : "LEFT_EYE", "position" : { "x" : 268.25815, "y" : 491.55255, "z" : -0.0022390306 } },
          ...
          { "type" : "RIGHT_EYE", "position" : { "x" : 418.42868, "y" : 508.22632, "z" : 49.302765 } },
          { "type" : "MIDPOINT_BETWEEN_EYES", "position" : { "x" : 359.86551, "y" : 500.2868, "z" : -7.9241152 } },
          { "type" : "NOSE_TIP", "position" : { "x" : 358.51404, "y" : 611.80286, "z" : -31.350466 } },
          ...
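The emotion fields in a face annotation are likelihood buckets rather than numeric scores, so comparing them requires an ordering. The sketch below ranks the buckets and reports which emotions meet a threshold; the `LIKELIHOOD_ORDER` list reflects the API's likelihood enum as I understand it, and `likely_emotions` is a hypothetical helper. The sample uses only the emotion fields shown on the slide.

```python
# Likelihood buckets ordered from least to most likely (assumed enum order).
LIKELIHOOD_ORDER = [
    "UNKNOWN",
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]

# Emotion fields copied from the face annotation on the slide.
face = {
    "surpriseLikelihood": "VERY_UNLIKELY",
    "angerLikelihood": "VERY_UNLIKELY",
    "joyLikelihood": "VERY_LIKELY",
    "sorrowLikelihood": "VERY_UNLIKELY",
}


def likely_emotions(face, threshold="LIKELY"):
    """Return the emotions whose likelihood is at or above the threshold."""
    minimum = LIKELIHOOD_ORDER.index(threshold)
    emotions = []
    for emotion in ("joy", "anger", "surprise", "sorrow"):
        # A missing field is treated as UNKNOWN, the lowest bucket.
        likelihood = face.get(emotion + "Likelihood", "UNKNOWN")
        if LIKELIHOOD_ORDER.index(likelihood) >= minimum:
            emotions.append(emotion)
    return emotions


detected = likely_emotions(face)
print(detected)
```

Lowering the threshold to `"VERY_UNLIKELY"` returns all four emotions for this face, which is why a cutoff of `"LIKELY"` or `"POSSIBLE"` is usually the more useful choice.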
  21. Text Detection
    {
      "locale" : "en",
      "boundingPoly" : {
        "vertices" : [
          { "x" : 99, "y" : 220 },
          { "x" : 551, "y" : 220 },
          { "x" : 551, "y" : 345 },
          { "x" : 99, "y" : 345 }
        ]
      },
      "description" : "WINCHESTER\nCOTTAGE S\nCOPPERFIELD STREET\nLONDON BOROUGH OF SOUTHWARK\n"
    }
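The text annotation returns the recognized text as one newline-separated string, plus a polygon around it. A small sketch of how a client might split the text into lines and measure the bounding box follows; `text_lines` and `bounding_box_size` are hypothetical helper names, and the data is copied from the slide.

```python
# Text annotation from the slide.
annotation = {
    "locale": "en",
    "boundingPoly": {
        "vertices": [
            {"x": 99, "y": 220},
            {"x": 551, "y": 220},
            {"x": 551, "y": 345},
            {"x": 99, "y": 345},
        ]
    },
    "description": "WINCHESTER\nCOTTAGE S\nCOPPERFIELD STREET\nLONDON BOROUGH OF SOUTHWARK\n",
}


def text_lines(annotation):
    """Split the detected text into non-empty lines."""
    return [line for line in annotation["description"].split("\n") if line]


def bounding_box_size(annotation):
    """Return (width, height) in pixels of the box enclosing all vertices."""
    vertices = annotation["boundingPoly"]["vertices"]
    xs = [vertex["x"] for vertex in vertices]
    ys = [vertex["y"] for vertex in vertices]
    return max(xs) - min(xs), max(ys) - min(ys)


lines = text_lines(annotation)
size = bounding_box_size(annotation)
print(lines, size)
```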
  22. 04 Live Demo

  23. We’ll build
    • An iOS app in Swift that...
    • Runs label detection on an image
    • Stores label data in Firebase
  24. We’ll build
  25. Resources
    • Sign up for the alpha: cloud.google.com/vision
    • Read the announcement blog post: bit.ly/vision-api-blog
    • Get started with TensorFlow: tensorflow.org
    • Jeff Dean’s presentation on neural networks: bit.ly/google-brain-ml
    Photo Attributions
    • Eiffel Tower: Creative Commons via Sathish J
    • Lens: Creative Commons via Mark Hunter
    • Harry Potter World: Creative Commons via daihung
    • London building: Creative Commons via DncnH
    • iPhone camera: Creative Commons via David Goehring
  26. One more thing... GCP NEXT 2016
  27. ABOUT GCP NEXT 2016
    WHAT: NEXT 2016 is a user conference and learning opportunity for developers, IT professionals and technologists who want to understand what’s next for cloud technology. The two-day conference will feature:
    - product updates and demonstrations
    - perspectives from industry leaders
    - hands-on code labs
    - 30+ breakout sessions
    - technical training
    - opportunities to connect with Google engineers and the GCP community
    WHERE/WHEN: Pier 48, San Francisco, CA, March 23-24, 2016
    HOW MUCH: $499/attendee ($349 early bird rate ends 5 Feb)
  28. Top 5 reasons why developers need to go... Don’t miss out, reserve your seat: goo.gl/lNPpwr
  29. Thank You Sara Robinson / @SRobTweets