Exploring the World Using Cloud Vision & Twilio

juliaferraioli
September 20, 2016

We take photos for many reasons: to capture moments, because something's funny or pretty, or because we're curious about something and want to know more. Luckily, there's a whole area of machine learning dedicated to understanding images.

We'll explore how we can use machine learning (without the math) with Twilio and the pictures that we take to gain a greater understanding of the world around us. We'll dig into various ways that this fusion can be used to make people's lives richer and better.

Accompanying blog post: http://www.blog.juliaferraioli.com/2016/02/exploring-world-using-vision-twilio.html

Transcript

  1. Exploring the World Using Cloud Vision & Twilio
    Julia Ferraioli
    @juliaferraioli
    Software Engineer
    Google Open Source Programs Office

  2. ★ All code is licensed under the
    Apache License, Version 2.0
    bit.ly/apache-v2

  3. @juliaferraioli
    Other stuff I like to do

  4. People think of machine
    learning like this

  5. CC image courtesy of M.Kemal: https://flic.kr/p/bYAsCs

  6. But really, machine
    learning is this

  7. data → algorithm → insight

  8. But that means we
    can do cool stuff

  9. @juliaferraioli
    Like this

  10. @juliaferraioli
    Or this

  11. But what about the
    data?

  12. You need a bunch of it

  13. @juliaferraioli
    Machine learning models as a Service

    Models pre-trained on extensive
    amounts of curated data

  14. Google Cloud Vision
    ● Detect faces, landmarks, logos, text, and more
    ● Perform sentiment analysis
    ● Straightforward REST API
    ● Works on a base64-encoded image
    ● Connects to Google Cloud Storage
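
To make the "straightforward REST API" and "base64-encoded image" bullets concrete, here is a minimal sketch that builds (but does not send) a label-detection request using only the Python standard library. The `images:annotate` endpoint and the request fields follow the public Cloud Vision v1 REST interface; the helper name `build_annotate_request` and the placeholder API key are assumptions made for this illustration, and the talk's own code goes through the Google API client library instead.

```python
import base64
import json
import urllib.request

VISION_ENDPOINT = 'https://vision.googleapis.com/v1/images:annotate'


def build_annotate_request(image_bytes, api_key, max_results=3):
    """Build an unsent POST request asking for labels (hypothetical helper)."""
    body = {
        'requests': [{
            # The API expects the image as base64 text inside the JSON body.
            'image': {'content': base64.b64encode(image_bytes).decode('utf-8')},
            'features': [{'type': 'LABEL_DETECTION', 'maxResults': max_results}],
        }]
    }
    return urllib.request.Request(
        VISION_ENDPOINT + '?key=' + api_key,
        data=json.dumps(body).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


# Placeholder bytes and key: nothing is sent over the network here.
request = build_annotate_request(b'raw image bytes', api_key='YOUR_API_KEY')
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it, given a valid key and a real image.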

  15. The primary place I
    take pictures is on
    my phone

  16. ...and I want it to be
    similar to existing
    Q&A methods...

  17. Oh, I know! Twilio!

  18. @juliaferraioli
    Lifecycle of a picture

  19. @juliaferraioli
    Receiving the MMS
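    @app.route("/", methods=['GET', 'POST'])
    def receive_message():
        # Twilio posts NumMedia plus MediaUrl0, MediaUrl1, ... with each MMS.
        # If several images arrive, the labels of the last one are sent back.
        labels = []
        for i in range(int(request.values.get('NumMedia', 0))):
            media_url = request.values.get('MediaUrl%i' % i, None)
            image = requests.get(media_url).content
            labels = get_labels(image)
        resp = construct_message(labels)
        return str(resp)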

  20. @juliaferraioli
    Receiving the MMS
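    @app.route("/", methods=['GET', 'POST'])
    def receive_message():
        # Twilio posts NumMedia plus MediaUrl0, MediaUrl1, ... with each MMS.
        # If several images arrive, the labels of the last one are sent back.
        labels = []
        for i in range(int(request.values.get('NumMedia', 0))):
            media_url = request.values.get('MediaUrl%i' % i, None)
            image = requests.get(media_url).content
            labels = get_labels(image)
        resp = construct_message(labels)
        return str(resp)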
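
The handler above relies on two fields that Twilio includes in its webhook request for an incoming message: `NumMedia` (how many attachments there are) and `MediaUrl0`, `MediaUrl1`, and so on. As a small sketch of just that parsing step, the helper below pulls the URLs out of a plain dictionary; the name `media_urls` and the example URLs are made up for illustration and are not part of the talk's code.

```python
def media_urls(form_values):
    """Return the MediaUrl0..MediaUrlN-1 values from webhook form data."""
    # NumMedia arrives as a string; treat a missing or empty value as zero.
    count = int(form_values.get('NumMedia') or 0)
    urls = []
    for i in range(count):
        url = form_values.get('MediaUrl%i' % i)
        if url:
            urls.append(url)
    return urls


# Hypothetical payload shaped like an incoming MMS with two pictures.
example = {
    'NumMedia': '2',
    'MediaUrl0': 'https://example.com/media/0',
    'MediaUrl1': 'https://example.com/media/1',
}
print(media_urls(example))
```

Keeping this step separate from the Flask view makes it easy to check how the handler behaves for a plain text message, where `NumMedia` is `0`.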

  21. @juliaferraioli
    Processing the image
    def get_labels(image, num_retries=3, max_results=3):
        # Set up the service that can access the API
        # Prepare the image for the API
        # Construct the request
        # Send it off to the API
        # Return the labels

  22. @juliaferraioli
    Authenticating with the API
    import httplib2
    from googleapiclient import discovery
    from oauth2client.client import GoogleCredentials

    def get_labels(image, num_retries=3, max_results=3):
        # Set up the service that can access the API
        # (DISCOVERY_URL is defined elsewhere in the sample)
        http = httplib2.Http()
        credentials = GoogleCredentials.get_application_default().create_scoped(
            ['https://www.googleapis.com/auth/cloud-platform'])
        credentials.authorize(http)
        service = discovery.build('vision', 'v1', http=http,
                                  discoveryServiceUrl=DISCOVERY_URL)

  23. @juliaferraioli
    Encoding the image
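    def get_labels(image, num_retries=3, max_results=3):
        # Prepare the image for the API: base64-encode the bytes, then
        # decode to text so the value can go into the JSON request body
        image_content = base64.b64encode(image).decode('utf-8')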
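
To show why the encoding step matters, here is a short standard-library sketch: raw image bytes cannot be placed in a JSON document, while their base64 text form can, and decoding it recovers the bytes unchanged. The helper name `encode_image` and the stand-in bytes are assumptions for this example only.

```python
import base64
import json


def encode_image(image_bytes):
    """Base64-encode raw bytes and return text that JSON can carry."""
    return base64.b64encode(image_bytes).decode('utf-8')


raw = b'\x89PNG\r\n'  # stand-in bytes, not a complete image file
content = encode_image(raw)

# The encoded text serializes cleanly as part of a request body.
payload = json.dumps({'image': {'content': content}})

# Decoding the field restores the original bytes exactly.
restored = base64.b64decode(json.loads(payload)['image']['content'])
print(restored == raw)
```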

  24. @juliaferraioli
    Crafting the request
    def get_labels(image, num_retries=3, max_results=3):
        # Construct the request
        service_request = service.images().annotate(
            body={'requests': [{
                'image': {'content': image_content},
                'features': [{
                    'type': 'LABEL_DETECTION',
                    'maxResults': max_results,
                }],
            }]})

  25. @juliaferraioli
    Executing the request
    def get_labels(image, num_retries=3, max_results=3):
        # Send it off to the API
        response = service_request.execute(num_retries=num_retries)
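
The `num_retries` argument asks the Google API client to retry a failed call, which its documentation describes as randomized exponential backoff. The sketch below illustrates that general idea with the standard library only. It is not the client library's implementation: `call_with_retries`, `base_delay`, and the flaky stand-in function are hypothetical, and retrying on every exception is a simplification.

```python
import random
import time


def call_with_retries(func, num_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying failures with randomized exponential backoff."""
    for attempt in range(num_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == num_retries:
                raise  # out of retries: surface the last error
            # Wait a random time of at most base_delay * 2**attempt seconds.
            sleep(random.random() * base_delay * 2 ** attempt)


attempts = []


def flaky_call():
    """Stand-in for an API call that fails twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError('temporary failure')
    return {'responses': []}


# Skip the real sleeping so the example finishes immediately.
result = call_with_retries(flaky_call, sleep=lambda seconds: None)
print(len(attempts), result)
```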

  26. @juliaferraioli
    Returning the labels
    def get_labels(image, num_retries=3, max_results=3):
        # Return the labels
        if ('responses' in response and
                'labelAnnotations' in response['responses'][0]):
            # Hey, the API found something!
            return response['responses'][0]['labelAnnotations']
        else:
            # Sigh, no dice this time :-(
            return []
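
Putting the preceding steps together, the sketch below assembles one version of `get_labels` from the fragments on these slides. To keep it runnable without credentials, the service object is passed in as a parameter (the slides build it inside the function), and `FakeVisionService` is a hypothetical stand-in that mimics only the `images().annotate(...).execute(...)` call chain used here.

```python
import base64


def get_labels(service, image, num_retries=3, max_results=3):
    """Ask the given Vision service for labels describing the image bytes."""
    # Prepare the image for the API
    image_content = base64.b64encode(image).decode('utf-8')
    # Construct the request
    service_request = service.images().annotate(
        body={'requests': [{
            'image': {'content': image_content},
            'features': [{
                'type': 'LABEL_DETECTION',
                'maxResults': max_results,
            }],
        }]})
    # Send it off to the API
    response = service_request.execute(num_retries=num_retries)
    # Return the labels, or an empty list if nothing was found
    responses = response.get('responses', [])
    if responses and 'labelAnnotations' in responses[0]:
        return responses[0]['labelAnnotations']
    return []


class FakeVisionService:
    """Hypothetical stub that records the request and returns canned data."""

    def __init__(self, canned_response):
        self.canned_response = canned_response
        self.body = None

    def images(self):
        return self

    def annotate(self, body):
        self.body = body
        return self

    def execute(self, num_retries=0):
        return self.canned_response


fake = FakeVisionService({'responses': [{'labelAnnotations': [
    {'description': 'octopus', 'score': 0.78696847}]}]})
print(get_labels(fake, b'abc'))
```

Swapping the stub for the real service object built in the authentication step should leave the rest of the function unchanged.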

  27. CC image courtesy of John Morgan:
    https://flic.kr/p/5XbG1m

  28. @juliaferraioli
    Receiving the MMS
    @app.route("/", methods=['GET', 'POST'])
    def receive_message():
        # Twilio posts NumMedia plus MediaUrl0, MediaUrl1, ... with each MMS.
        # If several images arrive, the labels of the last one are sent back.
        labels = []
        for i in range(int(request.values.get('NumMedia', 0))):
            media_url = request.values.get('MediaUrl%i' % i, None)
            image = requests.get(media_url).content
            labels = get_labels(image)
        response = construct_message(labels)
        return str(response)

  29. @juliaferraioli
    Sending back the results
    def construct_message(labels):
        label_desc = ""
        # Go through labels and turn them into text of the response
        # Turn the string into a twiml response

  30. @juliaferraioli
    Sending back the results
    def construct_message(labels):
        # Go through labels and turn them into text of the response
        for label in labels:
            # We've got an answer! Let's tell them about it
            label_desc += 'Score is %s for %s\n' % (label['score'],
                                                    label['description'])

  31. @juliaferraioli
    Sending back the results
    def construct_message(labels):
        # Turn the string into a twiml response
        resp = twilio.twiml.Response()
        resp.message(label_desc)
        return resp
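
The TwiML that goes back to Twilio is a small XML document: a `<Response>` element wrapping a `<Message>` element that holds the reply text. As a sketch of what the helper library produces, the code below builds that document with Python's standard XML module rather than the `twilio` package. The function names `describe_labels` and `construct_twiml` are assumptions for this example.

```python
import xml.etree.ElementTree as ET


def describe_labels(labels):
    """Turn label annotations into one line of text per label."""
    return ''.join(
        'Score is %s for %s\n' % (label['score'], label['description'])
        for label in labels)


def construct_twiml(labels):
    """Wrap the label text in a <Response><Message> TwiML document."""
    root = ET.Element('Response')
    ET.SubElement(root, 'Message').text = describe_labels(labels)
    return ET.tostring(root, encoding='unicode')


print(construct_twiml([{'score': 0.78696847, 'description': 'octopus'}]))
```

Recent versions of the Twilio Python helper library build the same kind of document with a `MessagingResponse` class instead of the `twilio.twiml.Response` shown on the slide.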

  32. @juliaferraioli
    Responding to the MMS
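    @app.route("/", methods=['GET', 'POST'])
    def receive_message():
        # Twilio posts NumMedia plus MediaUrl0, MediaUrl1, ... with each MMS.
        # If several images arrive, the labels of the last one are sent back.
        labels = []
        for i in range(int(request.values.get('NumMedia', 0))):
            media_url = request.values.get('MediaUrl%i' % i, None)
            image = requests.get(media_url).content
            labels = get_labels(image)
        resp = construct_message(labels)
        return str(resp)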

  33. @juliaferraioli

  34. @juliaferraioli
    What’s that?
    We found some labels for your image:
    Score is 0.90518606 for cephalopod
    Score is 0.79291707 for marine invertebrates
    Score is 0.78696847 for octopus

  35. @juliaferraioli
    CC image courtesy of John Morgan:
    https://flic.kr/p/5XbG1m

  36. @juliaferraioli
    What’s that?
    We found some labels for your image:
    Score is 0.59691668 for brand
    Score is 0.58166391 for writing
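
The scores in the two replies differ noticeably: the octopus labels all sit above 0.78, while these two are below 0.6. One possible way to act on that, sketched here as an assumption rather than something the talk's code does, is to drop labels under a chosen cutoff before replying. The function name `confident_labels` and the 0.7 threshold are arbitrary choices for illustration.

```python
def confident_labels(labels, min_score=0.7):
    """Keep only the labels whose score is at least min_score."""
    return [label for label in labels if label['score'] >= min_score]


# Label data copied from the two replies shown on the slides.
octopus_reply = [
    {'description': 'cephalopod', 'score': 0.90518606},
    {'description': 'marine invertebrates', 'score': 0.79291707},
    {'description': 'octopus', 'score': 0.78696847},
]
second_reply = [
    {'description': 'brand', 'score': 0.59691668},
    {'description': 'writing', 'score': 0.58166391},
]

print(len(confident_labels(octopus_reply)), len(confident_labels(second_reply)))
```

With this cutoff the first picture keeps all three labels and the second keeps none, so the bot could answer that it is not sure instead of guessing.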

  37. What? NO DICE?!?

  38. This is cool, but...

  39. @juliaferraioli
    Why? ¯\_(ツ)_/¯
    ● Aid tourists around the world with landmark detection
    ● Build a bot to sort your mail and text you the senders with OCR
    ● Run a simple Q&A service for people with low vision
    ● Help folks analyze interactions using sentiment analysis
    ● Boost conference scanners to recognize logos on business cards
    ● … and lots more
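
Most of the ideas above come down to asking Cloud Vision for a different feature type and reading a different field of the response. The sketch below shows that mapping using feature and field names from the Vision v1 API; the use-case keys and the helper names `build_body` and `extract_annotations` are invented for this example, and pairing "sentiment" with face detection is an assumption about what the slide has in mind.

```python
# use case -> (feature type to request, response field holding the results)
FEATURES = {
    'landmark': ('LANDMARK_DETECTION', 'landmarkAnnotations'),
    'ocr': ('TEXT_DETECTION', 'textAnnotations'),
    'sentiment': ('FACE_DETECTION', 'faceAnnotations'),
    'logo': ('LOGO_DETECTION', 'logoAnnotations'),
    'label': ('LABEL_DETECTION', 'labelAnnotations'),
}


def build_body(image_content, use_case, max_results=3):
    """Build an annotate request body for the chosen use case."""
    feature_type, _ = FEATURES[use_case]
    return {'requests': [{
        'image': {'content': image_content},
        'features': [{'type': feature_type, 'maxResults': max_results}],
    }]}


def extract_annotations(response, use_case):
    """Pull the matching annotations out of an annotate response."""
    _, field = FEATURES[use_case]
    responses = response.get('responses', [])
    return responses[0].get(field, []) if responses else []


body = build_body('YWJj', 'landmark')
print(body['requests'][0]['features'][0]['type'])
```

Only the `type` value and the response field change between use cases; the rest of the MMS round trip stays the same.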

  40. Learn more
    ● Google Cloud Vision developer documentation: http://bit.ly/gcp-vision
    ● Code from today: http://bit.ly/twilio-vision
    ● Code from today running on Kubernetes (!!!): http://bit.ly/twilio-vision-k8s
