Machine Learning for Developers

Codemotion, Rome, March 25th, 2017

Have you always wanted to add predictive capabilities or voice recognition to your application, but haven’t been able to find the time or the right technology to get started? Everybody wants to build smart apps, but only a few are data scientists. This session will help you understand machine learning terminology and challenges, implement a machine learning model, add predictive capabilities to your app, and provide your customers with a voice UX.

Danilo Poccia

March 25, 2017

Transcript

  1. Machine Learning
    For Developers
    Danilo Poccia
    @danilop danilop
    AWS Technical Evangelist

  2. Credit: Gerry Cranham/Fox Photos/Getty Images
    http://www.telegraph.co.uk/travel/destinations/europe/united-kingdom/england/london/galleries/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground/1939-ticket-examin/

  3. 1939 London Underground
    Credit: Gerry Cranham/Fox Photos/Getty Images
    http://www.telegraph.co.uk/travel/destinations/europe/united-kingdom/england/london/galleries/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground/1939-ticket-examin/

  4. Data Predictions

  5. Data Model Predictions

  6. Model

  7. http://www.thehudsonvalley.com/articles/60-years-ago-today-local-technology-demonstrated-artificial-intelligence-for-the-first-time
    1959 Arthur Samuel

  8. Machine Learning

  9. Machine Learning
    Supervised Learning: inferring a model from labeled training data

  10. Machine Learning
    Supervised Learning: inferring a model from labeled training data
    Unsupervised Learning: inferring a model to describe hidden structure from unlabeled data

  11. Machine Learning
    Supervised Learning
    Unsupervised Learning
    Reinforcement Learning: achieving a certain goal in a dynamic environment

  12. Driving a vehicle
    Playing a game against an opponent

  13. Clustering

  14. Clustering

  15. Topic Modeling
    Discovering abstract “topics” that occur in a collection of documents
    For example, looking for “infrequent” words that are used more often in a document
    Tip: Try topic modeling with your own emails ;-)
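
The idea of "infrequent words that are used more often in a document" can be sketched with TF-IDF weighting. This is a simple word-scoring heuristic, not a full topic model, and the three sample documents are made up for illustration.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Score each word in each document by term frequency x inverse document frequency."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores

# Hypothetical documents, invented for this sketch.
docs = [
    "the train to rome leaves at noon",
    "the model learns from the data",
    "the data is split for training the model",
]
scores = tf_idf(docs)
```

Words that appear in every document (here "the") score zero, while words concentrated in one document score highest.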

  16. Regression: “How many bikes will be rented tomorrow?” → 1, 0, 100K
    Binary Classification: “Is this email spam?” → Yes / No, True / False, %
    Multi-Class Classification: “What is the sentiment of this tweet, or of this social media comment?” → Happy, Sad, Angry, Confused, Disgusted, Surprised, Calm, Unknown

  17. Training the Model
    Minimizing the Error
    of using the Model on the Labeled Data
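
One way to picture "minimizing the error" is gradient descent on a linear model. The sketch below assumes a mean-squared-error loss and a tiny made-up dataset; the slide does not say which model or loss the talk used.

```python
def train_linear_model(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def mean_squared_error(w, b, xs, ys):
    """Average squared difference between predictions and labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical labeled data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear_model(xs, ys)
error = mean_squared_error(w, b, xs, ys)
```

Each pass nudges the parameters in the direction that lowers the error on the labeled data, so `w` and `b` approach 2 and 1.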

  18. Validation
    How well is this Model working on New Data?

  19. Be Careful of Overfitting

  20. Be Careful of Overfitting

  21. Be Careful of Overfitting

  22. Better Fitting

  23. Better Fitting

  24. Different Models ⇒ Different Predictions

  25. Labeled Data

  26. Labeled Data
    70% Training
    30% Validation
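
A 70/30 split of labeled data could look like the sketch below. The function name, the fixed seed, and the sample data are assumptions made for illustration.

```python
import random

def split_labeled_data(examples, train_fraction=0.7, seed=42):
    """Shuffle labeled examples and split them into training and validation sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical labeled data: (feature, label) pairs.
labeled = [(x, 2 * x + 1) for x in range(10)]
training, validation = split_labeled_data(labeled)
```

The model is fitted on `training` only; its error on `validation` then estimates how well it works on data it has not seen.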

  27. Neural Networks

  28. 1943 Warren McCulloch, Walter Pitts
    Threshold Logic Units

  29. 1962 Frank Rosenblatt
    Perceptron

  30. [Diagram] input → weights (parameters) w0, w1, w2, w3, …, wn → activation function → output

  31. [Diagram] input → weights (parameters) w0, w1, w2, w3, …, wn → activation function f(∑) → output
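
A single f(∑) unit can be sketched in a few lines. The step activation and the specific weights below are assumptions chosen for illustration; the slide does not specify them.

```python
def step(total):
    """Threshold activation: output 1 when the weighted sum is non-negative."""
    return 1 if total >= 0 else 0

def perceptron(inputs, weights, bias, activation=step):
    """Compute f(w0 + sum(w_i * x_i)) for a single unit, where bias plays the role of w0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(total)

# Hypothetical weights that make the unit behave like a logical AND.
and_weights, and_bias = [1.0, 1.0], -1.5
outputs = [perceptron([a, b], and_weights, and_bias) for a in (0, 1) for b in (0, 1)]
```

Here `outputs` covers the inputs (0,0), (0,1), (1,0), (1,1) in that order, and only the last one fires.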

  32. [Diagram] input → f(∑) → output

  33. 1969 Marvin Minsky, Seymour Papert
    Perceptrons: An Introduction to Computational Geometry
    A perceptron can only solve linearly separable functions (e.g. no XOR)
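
The XOR limitation can be observed with the classic perceptron learning rule. This is a minimal sketch with assumed hyperparameters (100 epochs, learning rate 1.0): the same training loop separates AND but cannot reach full accuracy on XOR, because no single line separates its classes.

```python
def train_perceptron(samples, epochs=100, learning_rate=1.0):
    """Perceptron learning rule on ((x1, x2), label) samples; returns (weights, bias)."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), label in samples:
            predicted = 1 if bias + weights[0] * x1 + weights[1] * x2 > 0 else 0
            error = label - predicted
            # Move the decision boundary only when the prediction was wrong.
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias

def accuracy(samples, weights, bias):
    """Fraction of samples the trained unit classifies correctly."""
    correct = sum(
        (1 if bias + weights[0] * x1 + weights[1] * x2 > 0 else 0) == label
        for (x1, x2), label in samples
    )
    return correct / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_accuracy = accuracy(AND, *train_perceptron(AND))
xor_accuracy = accuracy(XOR, *train_perceptron(XOR))
```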

  34. [Diagram] input → input layer → hidden layer → output layer → output, with each node computing f(∑)
    Multiple Layers
    Lots of Parameters
    Backpropagation
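
Adding a hidden layer is enough to represent XOR. In the sketch below the weights are picked by hand purely for illustration; in practice they would be learned, for example by backpropagation with a differentiable activation, which this sketch does not implement.

```python
def unit(inputs, weights, bias):
    """One f(∑) node with a step activation."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0

def xor_network(x1, x2):
    # Hidden layer: one unit acts like OR, the other like NAND.
    h_or = unit([x1, x2], [1.0, 1.0], -0.5)
    h_nand = unit([x1, x2], [-1.0, -1.0], 1.5)
    # Output layer: AND of the two hidden units yields XOR.
    return unit([h_or, h_nand], [1.0, 1.0], -1.5)

xor_outputs = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
```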

  35. Microprocessor Transistor Counts 1971-2011
    Intel E7 CPU: 4-24 cores
    NVIDIA K80 GPU: 2,496 cores
    https://en.wikipedia.org/wiki/Moore's_law

  36. Advances in Research 1998-2009
    LeCun, Gradient-Based Learning Applied to Document Recognition, 1998
    Hinton, A Fast Learning Algorithm for Deep Belief Nets, 2006
    Bengio, Learning Deep Architectures for AI, 2009

  37. Image Processing

  38. How to give images as input to a Neural Network?
    [Diagram] photo → neural network of f(∑) nodes → output
    Photo by David Iliff. License: CC-BY-SA 3.0
    https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

  39. Convolution Matrix
    0 0 0
    0 1 0
    0 0 0
    Identity
    Photo by David Iliff. License: CC-BY-SA 3.0
    https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

  40. Convolution Matrix
    1 0 -1
    2 0 -2
    1 0 -1
    Left Edges
    Photo by David Iliff. License: CC-BY-SA 3.0
    https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

  41. Convolution Matrix
    -1 0 1
    -2 0 2
    -1 0 1
    Right Edges
    Photo by David Iliff. License: CC-BY-SA 3.0
    https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

  42. Convolution Matrix
    1 2 1
    0 0 0
    -1 -2 -1
    Top Edges
    Photo by David Iliff. License: CC-BY-SA 3.0
    https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

  43. Convolution Matrix
    -1 -2 -1
    0 0 0
    1 2 1
    Bottom Edges
    Photo by David Iliff. License: CC-BY-SA 3.0
    https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

  44. Convolution Matrix
    0.6 -0.6 1.2
    -1.4 1.2 -1.6
    0.8 -1.4 1.6
    Random Values
    Photo by David Iliff. License: CC-BY-SA 3.0
    https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg
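
Applying a 3x3 convolution matrix to an image can be sketched in plain Python. This version slides the kernel without flipping it (strictly a cross-correlation, the convention most CNN libraries use) and skips the border pixels; the tiny image is made up for illustration.

```python
def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over a 2D list of pixel values (no padding, no kernel flip)."""
    height, width = len(image), len(image[0])
    output = []
    for i in range(height - 2):
        row = []
        for j in range(width - 2):
            # Weighted sum of the 3x3 neighbourhood whose top-left corner is (i, j).
            row.append(sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(3) for b in range(3)
            ))
        output.append(row)
    return output

IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
LEFT_EDGES = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

# Hypothetical 4x5 grayscale image: dark on the left, bright on the right.
image = [[0, 0, 10, 10, 10] for _ in range(4)]
unchanged = apply_kernel(image, IDENTITY)
edges = apply_kernel(image, LEFT_EDGES)
```

The identity kernel returns the interior pixels untouched, while the edge kernel responds only where brightness changes from left to right.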

  45. Convolutional Neural Networks (CNNs)
    https://en.wikipedia.org/wiki/Convolutional_neural_network

  46. ImageNet Classification Error Over Time
    [Chart: Classification Error (axis 0 to 30) by year, 2010 to 2016; annotation: CNNs]

  47. 2012 ImageNet Classification with Deep Convolutional Neural Networks

  48. SuperVision: 8 layers, 60M parameters

  49. 2013 Visualizing and Understanding Convolutional Networks

  52. http://www.asimovinstitute.org/neural-network-zoo/
    Lots of Parameters
    Network Architectures
    defined by Hyperparameters
    Dropout Layers
    for Regularization
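
A dropout layer can be sketched as follows. This uses the common "inverted dropout" variant, and the rate, seed, and sample activations are assumptions for illustration.

```python
import random

def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero each activation with probability `rate` during training."""
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    # Surviving activations are scaled by 1 / keep so their expected value is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

layer_output = [0.2, 1.5, 0.7, 3.0]
train_output = dropout(layer_output, rate=0.5, rng=random.Random(0))
eval_output = dropout(layer_output, training=False)
```

Randomly silencing units during training discourages the network from relying on any single one; at prediction time the layer passes values through unchanged.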

  53. Generative Adversarial Networks (GANs)
    Generator Neural Network → Generated Picture
    Real Picture or Generated Picture → Discriminator Neural Network → Real or Generated?

  54. 2016
    Generative Adversarial Networks (GANs)

  55. Artificial Intelligence & Deep Learning At Amazon
    Thousands Of Employees Across The Company Focused on AI
    Discovery & Search
    Fulfilment & Logistics
    Add ML-powered features to existing products
    Echo & Alexa

  56. Artificial Intelligence on AWS
    P2, F1 & Elastic GPUs
    Deep Learning AMI and template
    Investment in Apache MXNet

  57. Apache MXNet

  58. Deep Learning AMI
    Deep Learning Frameworks: MXNet, Caffe, TensorFlow, Theano, Torch, CNTK and Keras
    Pre-installed components to speed productivity, such as Nvidia drivers, CUDA, cuDNN, Intel MKL-DNN with MXNet, Anaconda, Python 2 and 3
    AWS Integration

  59. Amazon AI
    Bringing Powerful Artificial Intelligence To All Developers

  60. Amazon Rekognition
    Image Recognition And Analysis
    Powered By Deep Learning
    1

  61. Amazon Rekognition
    Deep learning-based image recognition service
    Search, verify, and organize millions of images
    Object and Scene Detection
    Facial Analysis
    Face Comparison
    Facial Recognition

  62. Amazon Rekognition: Images In, Categories and Facial Analysis Out
    Objects & Scenes (DetectLabels): Car, Outside, Daytime, Driving
    Faces (DetectFaces): Female, Smiling, Sunglasses
    Face ID: CompareFaces, IndexFaces, SearchFacesByImage
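
Calling DetectLabels from Python might look like the sketch below. The `detect_labels` call and its `Image`, `MaxLabels`, and `MinConfidence` parameters follow the boto3 Rekognition client; `top_labels`, the fake client, and its confidence values are made up so the example runs without AWS credentials.

```python
def top_labels(rekognition, image_bytes, min_confidence=80.0, max_labels=10):
    """Return (name, confidence) pairs from a Rekognition DetectLabels response."""
    response = rekognition.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


class FakeRekognition:
    """Stand-in client with invented values, used here instead of a real AWS call."""

    def detect_labels(self, Image, MaxLabels, MinConfidence):
        labels = [
            {"Name": "Car", "Confidence": 98.5},
            {"Name": "Outside", "Confidence": 91.2},
            {"Name": "Driving", "Confidence": 62.0},
        ]
        kept = [label for label in labels if label["Confidence"] >= MinConfidence]
        return {"Labels": kept[:MaxLabels]}


labels = top_labels(FakeRekognition(), b"not-a-real-image")
```

With boto3 installed and credentials configured, passing `boto3.client("rekognition")` and the bytes of a real image in place of the fake client would call the service itself.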

  64. Deep Learning Process
    Conv 1 → Conv 2 → … → Conv n (Feature Maps) → Fully Connected Layer → Softmax Probability
    Labels: Labrador, Dog, Beach, Outdoors
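
The softmax step turns the raw scores of the last layer into probabilities. In this sketch the scores are invented, and treating the four labels as one mutually exclusive set is a simplification for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["Labrador", "Dog", "Beach", "Outdoors"]
raw_scores = [2.0, 1.0, 0.5, 0.1]  # hypothetical outputs of the fully connected layer
probabilities = dict(zip(labels, softmax(raw_scores)))
```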

  65. With Rekognition, Bynder revolutionizes marketing admin tasks with AI capabilities
    Bynder allows you to easily create, find and use content for branding automation and marketing solutions.
    “With our new AI capabilities, Bynder’s software… now allows users to save hours of admin labor when uploading and organizing their files, adding exponentially more value.”
    Chris Hall, CEO, Bynder

  66. Amazon Polly
    Text To Speech Powered By Deep Learning
    2

  67. Amazon Polly: Text In, Life-like Speech Out
    “The temperature in WA is 75°F” → Amazon Polly → “The temperature in Washington is 75 degrees Fahrenheit”

  68. TEXT: Market grew by > 20%.
    WORDS: Market grew by more than twenty percent
    PHONEMES: ˈmɑɹ.kət ˈgɹu baɪ ˈmoʊɹ ˈðæn ˈtwɛn.ti pɚ.ˈsɛnt
    PROSODY CONTOUR
    Stages shown: UNIT SELECTION AND ADAPTATION (Speech units inventory), TEXT PROCESSING, PROSODY MODIFICATION, STREAMING

  69. aws polly synthesize-speech \
    --text "It was nice to live such a wonderful live show." \
    --output-format mp3 \
    --voice-id Joanna \
    --text-type text \
    output.mp3
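
The same request can be made from Python. This sketch assumes the boto3 Polly client, whose `synthesize_speech` call takes `Text`, `OutputFormat`, `VoiceId`, and `TextType` and returns an `AudioStream`; the helper name and the fake client are invented so the example runs offline.

```python
import io
import os
import tempfile

def synthesize_to_file(polly, text, path, voice_id="Joanna"):
    """Call SynthesizeSpeech and write the returned MP3 audio stream to `path`."""
    response = polly.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        TextType="text",
    )
    audio = response["AudioStream"].read()
    with open(path, "wb") as output_file:
        output_file.write(audio)
    return len(audio)


class FakePolly:
    """Stand-in client returning placeholder bytes instead of real audio."""

    def synthesize_speech(self, Text, OutputFormat, VoiceId, TextType):
        return {"AudioStream": io.BytesIO(b"fake-mp3-bytes")}


output_path = os.path.join(tempfile.mkdtemp(), "output.mp3")
bytes_written = synthesize_to_file(
    FakePolly(), "It was nice to live such a wonderful live show.", output_path
)
```

With boto3 installed and credentials configured, `boto3.client("polly")` can be passed in place of `FakePolly()`.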

  70. “Nel mezzo del cammin di nostra vita
    mi ritrovai per una selva oscura
    ché la diritta via era smarrita.”
    https://commons.wikimedia.org/wiki/File:Portrait_de_Dante.jpg

  71. Duolingo voices its language learning service using Polly
    Duolingo is a free language learning service where users help translate the web and rate translations.
    “With Amazon Polly our users benefit from the most lifelike Text-to-Speech voices available on the market.”
    Severin Hacker, CTO, Duolingo
    • Spoken language crucial for language learning
    • Accurate pronunciation matters
    • Faster iteration thanks to TTS
    • As good as natural human speech

  72. With Polly, GoAnimate gives voice to the characters in their animations
    GoAnimate is a cloud-based, animated video creation platform.
    “Amazon Polly gives GoAnimate users the ability to immediately give voice to the characters they animate using our platform.”
    Alvin Hung, CEO, GoAnimate
    • Multi-language communication
      • Training or HR professionals who have to create content in many languages
    • Video preproduction
      • Video makers who need to iterate and fine-tune before the text-to-speech is eventually replaced by a professional voiceover
    • K–12 education
      • Students who make videos and don’t have access to professional voices or time for or knowledge of voiceover

  73. RNIB provides the largest library in the UK for people with sight loss
    Royal National Institute of Blind People creates and distributes accessible information in the form of synthesized content
    “Amazon Polly delivers incredibly lifelike voices which captivate and engage our readers.”
    John Worsfold, Solutions Implementation Manager, RNIB
    • RNIB delivers the largest library of audiobooks in the UK for nearly 2 million people with sight loss
    • Naturalness of generated speech is critical to captivate and engage readers
    • No restrictions on speech redistribution enable RNIB to create and distribute accessible information in the form of synthesized content

  74. Amazon ALEXA
    (It’s what’s inside Alexa)
    3
    Natural Language Understanding (NLU) &
    Automatic Speech Recognition (ASR) Powered By Deep Learning

  75. Amazon Lex: Speech Recognition & Natural Language Understanding
    “What’s the weather forecast?” → Amazon Lex (Automatic Speech Recognition, Natural Language Understanding) → Weather Forecast

  76. Amazon Lex: Speech Recognition & Natural Language Understanding
    “What’s the weather forecast?” → Amazon Lex (Automatic Speech Recognition, Natural Language Understanding) → Weather Forecast
    Response: “It will be sunny and 25°C”

  77. Lex Bot Structure
    Utterances: Spoken or typed phrases that invoke your intent
    Intents: An Intent performs an action in response to natural language user input
    Example intent: BookHotel
    Slots: Slots are input data required to fulfill the intent
    Fulfillment: Fulfillment mechanism for your intent
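
How utterances, slots, and fulfillment relate to an intent can be modeled with a small data structure. This is only a conceptual sketch: the class, field names, slot names, and `fulfill` function are hypothetical and are not the Amazon Lex API.

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    name: str
    utterances: list   # phrases that invoke the intent
    slots: list        # names of the input data the intent requires
    values: dict = field(default_factory=dict)

    def next_missing_slot(self):
        """Return the first slot that still needs a value, or None when ready to fulfill."""
        for slot in self.slots:
            if slot not in self.values:
                return slot
        return None

def fulfill(intent):
    """Hypothetical fulfillment step, run once every slot is filled."""
    return f"Your hotel in {intent.values['City']} is booked for {intent.values['CheckIn']}"

book_hotel = Intent(
    name="BookHotel",
    utterances=["Book a hotel", "Book a hotel in {City}"],
    slots=["City", "CheckIn", "CheckOut"],
)
book_hotel.values["City"] = "New York City"
first_prompt = book_hotel.next_missing_slot()
book_hotel.values.update({"CheckIn": "Nov 30th", "CheckOut": "Dec 2nd"})
confirmation = fulfill(book_hotel) if book_hotel.next_missing_slot() is None else None
```

The bot keeps asking for whichever slot is still missing and hands the completed intent to the fulfillment step only when none remain.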

  78. “Book a Hotel in NYC”
    Automatic Speech Recognition → “Book a Hotel in NYC”
    Natural Language Understanding (Intent/Slot Model, Utterances) → Hotel Booking, New York City
    Hotel Booking: City New York City, Check In (empty), Check Out (empty)
    Hotel Booking: City New York City, Check In Nov 30th, Check Out Dec 2nd
    “Can I go ahead with the booking?”
    Confirmation: “Your hotel is booked for Nov 30th” → Polly → “Your hotel is booked for Nov 30th”

  79. Motorola Solutions is using Amazon AI to help find missing persons
    Motorola Solutions keeps utility workers connected and visible to each other with real-time voice and data communication across the smart grid.
    Finding missing persons: ~100,000 active missing persons cases in the U.S. at any given time; ~60% are adults, ~40% are children
    • Motorola Solutions applies Amazon Rekognition, Amazon Polly and Amazon Lex
    • Image analytics and facial recognition can continually monitor for missing persons
    • Tools that understand natural language can enable officers to keep eyes up and hands free

  80. I See

  81. I see…
    Camera → Raspberry Pi
    Raspberry Pi → Amazon Rekognition: Detect Labels, Detect Faces
    Raspberry Pi → Amazon Polly: Synthesize Speech → Voice
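
The glue between the two services is a sentence built from what was detected. The function below is a hypothetical sketch of that step; the wording of the spoken sentence in the original demo is not given in the slides.

```python
def describe(label_names, face_count):
    """Build the sentence to speak from detected label names and a face count."""
    if not label_names and face_count == 0:
        return "I see nothing I can recognize."
    parts = list(label_names)
    if face_count == 1:
        parts.append("one face")
    elif face_count > 1:
        parts.append(f"{face_count} faces")
    if len(parts) == 1:
        return f"I see {parts[0]}."
    return "I see " + ", ".join(parts[:-1]) + " and " + parts[-1] + "."

sentence = describe(["Car", "Outside"], face_count=1)
```

The resulting text would then be sent to the speech synthesis step to be read aloud.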

  83. Nikola Tesla, 1926
    “When wireless is perfectly applied, the whole earth will be converted into a huge brain…”

  84. Machine Learning
    For Developers
    Danilo Poccia
    @danilop danilop
    AWS Technical Evangelist
