Machine Learning for Developers

Voxxed Days Zurich, February 23rd, 2017

Have you always wanted to add predictive capabilities or voice recognition to your application, but haven’t been able to find the time or the right technology to get started? Everybody wants to build smart apps, but only a few are Data Scientists. This session will help you understand machine learning terminology & challenges, implement a machine learning model, add predictive capabilities to your app, and provide your customer with voice UX.

Danilo Poccia

February 23, 2017

Transcript

  1. Machine Learning for Developers Danilo Poccia @danilop danilop AWS Technical Evangelist

  2. Credit: Gerry Cranham/Fox Photos/Getty Images http://www.telegraph.co.uk/travel/destinations/europe/united-kingdom/england/london/galleries/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground/1939-ticket-examin/

  3. 1939 London Underground Credit: Gerry Cranham/Fox Photos/Getty Images http://www.telegraph.co.uk/travel/destinations/europe/united-kingdom/england/london/galleries/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground/1939-ticket-examin/

  4. Batch Reports Real-Time Alerts Prediction Forecasts

  5. Predictions

  6. Data Predictions

  7. Data Model Predictions

  8. Model

  9. http://www.thehudsonvalley.com/articles/60-years-ago-today-local-technology-demonstrated-artificial-intelligence-for-the-first-time 1959 Arthur Samuel

  10. Machine Learning

  11. Machine Learning. Supervised Learning: inferring a model from labeled training data

  12. Machine Learning. Supervised Learning: inferring a model from labeled training data. Unsupervised Learning: inferring a model to describe hidden structure from unlabeled data

  13. Reinforcement Learning: perform a certain goal in a dynamic environment

  14. Driving a vehicle. Playing a game against an opponent

  15. Clustering

  16. Clustering

  17. Topic Modeling: discovering abstract “topics” that occur in a collection of documents. Try that with your own emails ;-)

  18. Regression: “How many bikes will be rented tomorrow?” (1, 0, 100K). Binary Classification: “Is this email spam?” (Yes, No; True, False; %). Multi-Class Classification: “What is the sentiment of this tweet, or of this social media comment?” (Happy, Sad, Angry, Confused, Disgusted, Surprised, Calm, Unknown)

  19. Training the Model Minimizing the Error
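
As a rough illustration of "minimizing the error", the sketch below fits a one-feature linear model with plain gradient descent on the mean squared error. The function name, the sample data, the learning rate, and the epoch count are all assumptions made for this example; the slide does not specify an algorithm.

```python
def train_linear_model(samples, learning_rate=0.05, epochs=2000):
    """Fit y ~ weight * x + bias by gradient descent on the mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to weight and bias.
        grad_w = sum(2 * (weight * x + bias - y) * x for x, y in samples) / n
        grad_b = sum(2 * (weight * x + bias - y) for x, y in samples) / n
        # Step against the gradient to reduce the error.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


# Hypothetical data generated from y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train_linear_model(samples)
print(round(weight, 3), round(bias, 3))
```

On this toy data the learned parameters should approach the values used to generate it.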

  20. Validation

  21. Labeled Data

  22. Labeled Data: 70% Training, 30% Validation
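
A minimal sketch of the 70/30 split, assuming the labeled data fits in a Python list. The helper name `split_labeled_data` and the fixed seed are illustrative choices, not something the deck prescribes.

```python
import random


def split_labeled_data(records, train_fraction=0.7, seed=42):
    """Shuffle labeled records, then split them into training and validation sets."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labeled data: (feature, label) pairs.
labeled = [(x, 2 * x + 1) for x in range(10)]
training, validation = split_labeled_data(labeled)
print(len(training), len(validation))
```

Shuffling before cutting matters when the records are stored in some order, such as by date, that would otherwise leak into the split.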

  23. Be Careful of Overfitting

  24. Be Careful of Overfitting

  25. Be Careful of Overfitting

  26. Better Fitting

  27. Better Fitting

  28. Different Models ⇒ Different Predictions

  29. Regularization: adding a “cost” for using larger parameters in the model. L1, L2, Neural Network Dropouts

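
One way to picture the "cost" for larger parameters is to add an L1 or L2 penalty to the training error. The sketch below assumes a linear model and a mean squared error; the function names and the penalty strengths `l1` and `l2` are hypothetical.

```python
def mean_squared_error(weights, bias, samples):
    """Average squared difference between predictions and targets."""
    total = 0.0
    for features, target in samples:
        prediction = bias + sum(w * x for w, x in zip(weights, features))
        total += (prediction - target) ** 2
    return total / len(samples)


def regularized_cost(weights, bias, samples, l1=0.0, l2=0.0):
    """Training error plus a penalty that grows with the size of the weights."""
    l1_penalty = sum(abs(w) for w in weights)   # L1: sum of absolute values
    l2_penalty = sum(w * w for w in weights)    # L2: sum of squares
    return mean_squared_error(weights, bias, samples) + l1 * l1_penalty + l2 * l2_penalty


# Hypothetical sample that the weights below fit exactly (error is zero).
samples = [((1.0, 2.0), 5.0)]
weights = [1.0, 2.0]
print(regularized_cost(weights, 0.0, samples, l2=0.1))
```

With a perfect fit the whole cost comes from the penalty, which shows why the optimizer is pushed toward smaller weights.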
  30. Feature Scaling: large numbers can have a bigger impact in a mathematical model. Normalization: so they are all in the same range, for example between 0 and 1

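
A small sketch of scaling a feature into the range between 0 and 1, using min-max scaling as one common choice. The function name and the sample values are assumptions for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant feature carries no information; map everything to 0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical feature with large raw values.
scaled = min_max_scale([10, 20, 30])
print(scaled)
```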
  31. Model Parameters “Hyperparameters”

  32. Labeled Data: 70% Training, 15% Validation, 15% Test. Try Different Hyperparameters

  33. Labeled Data Cross Validation
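
Cross validation can be sketched as k-fold splitting: each fold takes a turn as the validation set while the rest is used for training. The generator below is a minimal version under that assumption; the name `k_fold_splits` and the choice of k are illustrative.

```python
def k_fold_splits(records, k=5):
    """Yield (training, validation) pairs, using each fold once for validation."""
    # Deal records into k folds round-robin so fold sizes differ by at most one.
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield training, validation


# Hypothetical labeled records.
records = list(range(10))
splits = list(k_fold_splits(records, k=5))
print(len(splits))
```

In practice the records would be shuffled first, and a model would be trained and scored once per split before averaging the scores.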

  34. Neural Networks

  35. 1943 Warren McCulloch, Walter Pitts Threshold Logic Units

  36. 1962 Frank Rosenblatt Perceptron

  37. https://github.com/cdipaolo/goml/tree/master/perceptron weights activation function input output
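
The perceptron's pieces (input, weights, activation function, output) can be sketched in a few lines. This is a generic textbook version, not the code from the linked repository; the step activation, the learning rate, and the AND-gate data are assumptions.

```python
def step(value):
    """Activation function: fire (1) when the weighted sum is non-negative."""
    return 1 if value >= 0 else 0


def predict(weights, bias, inputs):
    """Output = activation(bias + sum of weight * input)."""
    return step(bias + sum(w * x for w, x in zip(weights, inputs)))


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Perceptron learning rule: nudge the weights by the error on each sample."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data: the logical AND of two inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, inputs) for inputs, _ in and_gate])
```

A single perceptron can only separate classes with a straight line, which is one motivation for stacking multiple layers.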

  38. https://en.wikipedia.org/wiki/Artificial_neural_network Multiple Layers Backpropagation

  39. Microprocessor Transistor Counts 1971-2011. Intel E7 CPU: 4-24 cores. NVIDIA K80 GPU: 2,496 cores. https://en.wikipedia.org/wiki/Moore's_law

  40. Advances in Research 1998-2009. LeCun, Gradient-Based Learning Applied to Document Recognition, 1998. Hinton, A Fast Learning Algorithm for Deep Belief Nets, 2006. Bengio, Learning Deep Architectures for AI, 2009

  41. Image Processing

  42. Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg Convolution Matrix

  43. Convolution Matrix

  44. Convolution Matrix, Identity: [0 0 0; 0 1 0; 0 0 0]

  45. Convolution Matrix, Left Edges: [1 0 -1; 2 0 -2; 1 0 -1]

  46. Convolution Matrix, Right Edges: [-1 0 1; -2 0 2; -1 0 1]

  47. Convolution Matrix, Top Edges: [1 2 1; 0 0 0; -1 -2 -1]

  48. Convolution Matrix, Bottom Edges: [-1 -2 -1; 0 0 0; 1 2 1]

  49. Convolution Matrix, Random Values: [0.6 -0.6 1.2; -1.4 1.2 -1.6; 0.8 -1.4 1.6]

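
To make the convolution matrices concrete, the sketch below slides a 3x3 matrix over a tiny grayscale image and sums the element-wise products at each position. As is usual in CNN code, the matrix is applied without flipping (strictly a cross-correlation), and border pixels are skipped. The function name and the 4x4 image are assumptions; the two matrices are the Identity and Left Edges values from the slides.

```python
def apply_kernel(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    size = len(kernel)
    out_height = len(image) - size + 1
    out_width = len(image[0]) - size + 1
    output = []
    for r in range(out_height):
        row = []
        for c in range(out_width):
            total = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(size)
                for j in range(size)
            )
            row.append(total)
        output.append(row)
    return output


IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
LEFT_EDGES = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

# Hypothetical image: bright on the left, dark on the right.
image = [
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [9, 9, 0, 0],
]
print(apply_kernel(image, IDENTITY))
print(apply_kernel(image, LEFT_EDGES))
```

The identity matrix returns the inner pixels unchanged, while the edge matrix produces large values wherever brightness changes from left to right.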
  50. Convolutional Neural Networks (CNNs) https://en.wikipedia.org/wiki/Convolutional_neural_network

  51. ImageNet Classification Error Over Time [chart: classification error, 0 to 30, by year from 2010 to 2016, with CNNs marked]

  52. 2012 ImageNet Classification with Deep Convolutional Neural Networks

  53. SuperVision: 8 layers, 60M parameters

  54. 2013 Visualizing and Understanding Convolutional Networks

  55. None
  56. None
  57. How Do Neural Networks Learn? More generic and can be reused as feature extractor for other visual tasks. Specific to task: Cat, Dog

  58. What About Memory? Feedforward Neural Networks (no cycle): input, state, output. Recurrent Neural Networks (directed cycle): input, state(t), memory, output

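
The difference between the two kinds of network can be sketched with single-number "networks": a feedforward function whose output depends only on the current input, and a recurrent step that also feeds back its previous state. The weights and the tanh activation are arbitrary assumptions for illustration.

```python
import math


def feedforward(x, weight=0.5):
    """No cycle: the output depends only on the current input."""
    return math.tanh(weight * x)


def recurrent_step(x, state, input_weight=0.5, state_weight=0.8):
    """Directed cycle: the new state mixes the input with the previous state."""
    return math.tanh(input_weight * x + state_weight * state)


def run_recurrent(sequence):
    """Feed a sequence through the recurrent step, carrying the state along."""
    state = 0.0
    outputs = []
    for x in sequence:
        state = recurrent_step(x, state)
        outputs.append(state)
    return outputs


# The same input twice: feedforward repeats itself, recurrent does not.
print([feedforward(x) for x in (1.0, 1.0)])
print(run_recurrent([1.0, 1.0]))
```

The second recurrent output differs from the first because the state left over from the first step acts as memory.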
  59. Long Short-Term Memory (LSTM): how much goes into memory, how much remains in memory, how much is used in computing the output. https://en.wikipedia.org/wiki/Long_short-term_memory

  60. Network Architectures defined by Hyperparameters. Lots of Parameters. Dropout Layers for Regularization. http://www.asimovinstitute.org/neural-network-zoo/

  61. Generative Adversarial Networks (GANs): a Generator Neural Network produces a Generated Picture; a Discriminator Neural Network is shown a Real Picture or a Generated Picture and decides: Real or Generated?

  62. 2014 Generative Adversarial Networks (GANs)

  63. 2016 Generative Adversarial Networks (GANs)

  64. Building a Model. How much can you improve? Bayes Limit vs Human Performance. Training better than Validation? Learning Bias. Data Leakage. Do you have all the Data?

  65. Artificial Intelligence & Deep Learning At Amazon: Thousands Of Employees Across The Company Focused on AI. Discovery & Search. Fulfilment & Logistics. Add ML-powered features to existing products. Echo & Alexa

  66. None
  67. THE ALEXA ECOSYSTEM, supported by two powerful frameworks. ALEXA VOICE SERVICE: Unparalleled Distribution: AVS allows your content to be everywhere. Lives In The Cloud; Automated Speech Recognition (ASR); Natural Language Understanding (NLU); Always Learning. ALEXA SKILLS KIT: Create Great Content: ASK is how you connect to your consumer

  68. UNDER THE HOOD OF ASK: a closer look at how the Alexa Skills Kit processes a request and returns an appropriate response. User Makes a Request. Audio Stream is sent up to Alexa. Alexa Identifies Skill & Recognizes Intent Through ASR & NLU. Alexa sends Customer Intent to Your Service. Your Service processes Request. You Pass Back a Textual or Audio Response. You Pass Back a Graphical Response. Alexa Converts Text-to-Speech (TTS) & Renders Graphical Component. Respond to Intent through Text & Visual

  69. Artificial Intelligence on AWS: P2, Deep Learning AMI and template, Investment in Apache MXNet

  70. Instance Families: P2 (General Purpose GPU), M4 (General Purpose), D2 (Dense storage), X1 (Large memory), G2 (Graphics intensive), T2 (Burstable), R4 (Memory intensive), I3 (High I/O), C5 (Compute intensive), F1 (FPGAs), Lightsail (Simple VPS), Elastic GPUs On EC2

  71. Amazon EC2 P2 Instances: up to 40 thousand parallel processing cores, 70 teraflops (single precision), over 23 teraflops (double precision)

      Instance Size   GPUs   GPU Peer to Peer   vCPUs   Memory (GiB)   Network Bandwidth*
      p2.xlarge       1      -                  4       61             1.25Gbps
      p2.8xlarge      8      Y                  32      488            10Gbps
      p2.16xlarge     16     Y                  64      732            20Gbps
      *In a placement group

  72. Elastic GPUs For EC2: GPU Acceleration For Graphics Workloads. GPU Memory: 1 GiB, 2 GiB, 4 GiB, 8 GiB. Current Generation EC2 Instance

  73. F1 Instances: Bringing Hardware Acceleration To All. Develop, simulate, debug & compile your code. Package as FPGA Images. FPGA Images Available In AWS Marketplace. F1 Instance With your custom logic running on an FPGA

  74. Apache MXNet

  75. Deep Learning AMI. Deep Learning Frameworks: MXNet, Caffe, TensorFlow, Theano, Torch, CNTK and Keras. Pre-installed components to speed productivity, such as NVIDIA drivers, CUDA, cuDNN, Intel MKL-DNN with MXNet, Anaconda, Python 2 and 3. AWS Integration

  76. Apache Spark MLlib

  77. Amazon AI Bringing Powerful Artificial Intelligence To All Developers

  78. 1: Amazon Rekognition. Image Recognition And Analysis Powered By Deep Learning

  79. Amazon Rekognition: Images In, Categories and Facial Analysis Out. Objects & Scenes: Car, Outside, Daytime, Driving. Faces: Female, Smiling, Sunglasses

  80. Ground Truth Generation Training

  81. Deep Learning Process: Conv 1, Conv 2, …, Conv n; Feature Maps; Fully Connected Layer; Softmax Probability; labels: Labrador, Dog, Beach, Outdoors

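
The softmax step at the end of that pipeline turns raw scores from the fully connected layer into probabilities that sum to 1. The sketch below is a generic softmax; the label names come from the slide, while the scores are made-up numbers.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["Labrador", "Dog", "Beach", "Outdoors"]
scores = [2.0, 3.0, 0.5, 1.0]  # hypothetical outputs of the last layer
probabilities = softmax(scores)
best = labels[probabilities.index(max(probabilities))]
print(best)
```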
  82. <demo> Amazon Rekognition </demo>

  83. Amazon Polly Text To Speech Powered By Deep Learning 2

  84. Amazon Polly: Text In, Life-like Speech Out. “The temperature in WA is 75°F” becomes “The temperature in Washington is 75 degrees Fahrenheit”

  85. TEXT: Market grew by > 20%. WORDS: Market grew by more than twenty percent. PHONEMES: ˈmɑɹ.kət ˈgɹu baɪ ˈmoʊɹ ˈðæn ˈtwɛn.ti pɚ.ˈsɛnt. PROSODY CONTOUR. Stages: TEXT PROCESSING, PROSODY MODIFICATION, UNIT SELECTION AND ADAPTATION (Speech units inventory), STREAMING

  86. <demo> Amazon Polly </demo>

  87. 3: Amazon ALEXA (It’s what’s inside Alexa). Natural Language Understanding (NLU) & Automatic Speech Recognition (ASR) Powered By Deep Learning

  88. Amazon Lex: Speech Recognition & Natural Language Understanding. “What’s the weather forecast?” Automatic Speech Recognition, Natural Language Understanding, Weather Forecast

  89. Amazon Lex: Speech Recognition & Natural Language Understanding. “What’s the weather forecast?” Automatic Speech Recognition, Natural Language Understanding, Weather Forecast. “It will be sunny and 25°C”

  90. Lex Bot Structure. Intents: an Intent performs an action in response to natural language user input (BookHotel). Utterances: spoken or typed phrases that invoke your intent. Slots: input data required to fulfill the intent. Fulfillment: fulfillment mechanism for your intent

  91. “Book a Hotel in NYC”. Automatic Speech Recognition: Book Hotel NYC. Natural Language Understanding (Intent/Slot Model, Utterances): Hotel Booking, New York City. Hotel Booking: City: New York City, Check In: (empty), Check Out: (empty). Hotel Booking: City: New York City, Check In: Nov 30th, Check Out: Dec 2nd. “Can I go ahead with the booking?” Confirmation: “Your hotel is booked for Nov 30th”. Polly: “Your hotel is booked for Nov 30th”

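
The intent, utterance, slot, and fulfillment vocabulary can be modeled as a small data structure. The sketch below is purely illustrative and does not use or mirror the Amazon Lex API; the class, its methods, the slot names, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Intent:
    """Hypothetical model of a bot intent: not the Amazon Lex API."""
    name: str
    utterances: List[str]                         # phrases that invoke the intent
    slots: List[str]                              # input data required to fulfill it
    fulfillment: Callable[[Dict[str, str]], str]  # runs once all slots are filled

    def next_missing_slot(self, values: Dict[str, str]) -> Optional[str]:
        for slot in self.slots:
            if slot not in values:
                return slot
        return None

    def handle(self, values: Dict[str, str]) -> str:
        missing = self.next_missing_slot(values)
        if missing is not None:
            # Keep asking until every slot has a value.
            return f"Please provide: {missing}"
        return self.fulfillment(values)


def confirm_booking(values: Dict[str, str]) -> str:
    return f"Your hotel is booked for {values['CheckIn']}"


book_hotel = Intent(
    name="BookHotel",
    utterances=["Book a Hotel", "Book a Hotel in NYC"],
    slots=["City", "CheckIn", "CheckOut"],
    fulfillment=confirm_booking,
)

print(book_hotel.handle({"City": "New York City"}))
print(book_hotel.handle(
    {"City": "New York City", "CheckIn": "Nov 30th", "CheckOut": "Dec 2nd"}
))
```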
  92. None
  93. <demo> Amazon Lex </demo>

  94. 4: Amazon Machine Learning. Create ML models without having to learn complex algorithms and technology

  95. Building smart applications with Amazon ML: (1) Train model, (2) Evaluate and optimize, (3) Retrieve predictions

  96. Building smart applications with Amazon ML: (1) Train model, (2) Evaluate and optimize, (3) Retrieve predictions. Train model: create a datasource object pointing to your data; explore and understand your data; transform data and train your model

  97. Create a datasource object

      >>> import boto
      >>> ml = boto.connect_machinelearning()
      >>> # Point Amazon ML at the data and its schema in S3.
      >>> ds = ml.create_data_source_from_s3(
      ...     data_source_id='my_datasource',
      ...     data_spec={
      ...         'DataLocationS3': 's3://bucket/input/',
      ...         'DataSchemaLocationS3': 's3://bucket/input/.schema'},
      ...     compute_statistics=True)

  98. Explore and understand your data

  99. Train your model

      >>> import boto
      >>> ml = boto.connect_machinelearning()
      >>> # Train a regression model on the datasource created earlier.
      >>> model = ml.create_ml_model(
      ...     ml_model_id='my_model',
      ...     ml_model_type='REGRESSION',
      ...     training_data_source_id='my_datasource')

  100. Building smart applications with Amazon ML: (1) Train model, (2) Evaluate and optimize, (3) Retrieve predictions. Evaluate and optimize: understand model quality; adjust model interpretation

  101. Explore model quality

  102. Fine-tune model interpretation

  103. Fine-tune model interpretation

  104. Building smart applications with Amazon ML: (1) Train model, (2) Evaluate and optimize, (3) Retrieve predictions. Retrieve predictions: batch predictions; real-time predictions

  105. Batch predictions. Asynchronous, large-volume prediction generation. Request through service console or API. Best for applications that deal with batches of data records

      >>> import boto
      >>> ml = boto.connect_machinelearning()
      >>> # Results are written asynchronously to the S3 output location.
      >>> model = ml.create_batch_prediction(
      ...     batch_prediction_id='my_batch_prediction',
      ...     batch_prediction_data_source_id='my_datasource',
      ...     ml_model_id='my_model',
      ...     output_uri='s3://examplebucket/output/')

  106. Real-time predictions. Synchronous, low-latency, high-throughput prediction generation. Request through service API or server or mobile SDKs. Best for interactive applications that deal with individual data records

      >>> import boto
      >>> ml = boto.connect_machinelearning()
      >>> # One record in, one prediction out.
      >>> ml.predict(
      ...     ml_model_id='my_model',
      ...     predict_endpoint='example_endpoint',
      ...     record={'key1': 'value1', 'key2': 'value2'})
      {
        'Prediction': {
          'predictedValue': 13.284348,
          'details': {
            'Algorithm': 'SGD',
            'PredictiveModelType': 'REGRESSION'
          }
        }
      }

  107. <demo> Bike Sharing </demo>

  108. None
  109. All Users Casual Users Registered Users

  110. Your Skill (Lambda function): invoke; get forecast (Weather Forecast); get real-time predictions (Amazon Machine Learning). Historical Data: build & train model

  111. None
  112. Nikola Tesla, 1926: “When wireless is perfectly applied, the whole earth will be converted into a huge brain…”

  113. Let’s Build Smarter Apps Using Services Platforms Engines

  114. Machine Learning for Developers Danilo Poccia @danilop danilop AWS Technical Evangelist