Slide 1

Slide 1 text

Machine Learning for Developers Danilo Poccia @danilop danilop AWS Technical Evangelist

Slide 2

Slide 2 text

Credit: Gerry Cranham/Fox Photos/Getty Images http://www.telegraph.co.uk/travel/destinations/europe/united-kingdom/england/london/galleries/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground/1939-ticket-examin/

Slide 3

Slide 3 text

1939 London Underground Credit: Gerry Cranham/Fox Photos/Getty Images http://www.telegraph.co.uk/travel/destinations/europe/united-kingdom/england/london/galleries/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground/1939-ticket-examin/

Slide 4

Slide 4 text

Batch Reports Real-Time Alerts Prediction Forecasts

Slide 5

Slide 5 text

Predictions

Slide 6

Slide 6 text

Data Predictions

Slide 7

Slide 7 text

Data Model Predictions

Slide 8

Slide 8 text

Model

Slide 9

Slide 9 text

http://www.thehudsonvalley.com/articles/60-years-ago-today-local-technology-demonstrated-artificial-intelligence-for-the-first-time 1959 Arthur Samuel

Slide 10

Slide 10 text

Machine Learning

Slide 11

Slide 11 text

Machine Learning Supervised Learning Inferring a model from labeled training data

Slide 12

Slide 12 text

Machine Learning
Supervised Learning: Inferring a model from labeled training data
Unsupervised Learning: Inferring a model to describe hidden structure from unlabeled data

Slide 13

Slide 13 text

Reinforcement Learning Perform a certain goal in a dynamic environment

Slide 14

Slide 14 text

Driving a vehicle Playing a game against an opponent

Slide 15

Slide 15 text

Clustering

Slide 16

Slide 16 text

Clustering

Slide 17

Slide 17 text

Topic Modeling Discovering abstract “topics” that occur in a collection of documents Try that with your own emails ;-)

Slide 18

Slide 18 text

Regression: “How many bikes will be rented tomorrow?” (1, 0, 100K)
Binary Classification: “Is this email spam?” (Yes, No; True, False; %)
Multi-Class Classification: “What is the sentiment of this tweet, or of this social media comment?” (Happy, Sad, Angry, Confused, Disgusted, Surprised, Calm, Unknown)

Slide 19

Slide 19 text

Training the Model Minimizing the Error
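
One way to picture "minimizing the error" is gradient descent on a mean squared error. The sketch below is only an illustration: the straight-line model, the toy data, the learning rate, and the step count are all assumptions, not something taken from the talk.

```python
# Minimal sketch: fit y = w * x + b by gradient descent on mean squared error.
# The data, learning rate, and step count are illustrative assumptions.

def mean_squared_error(w, b, xs, ys):
    """Average squared difference between predictions and labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly nudge w and b in the direction that lowers the error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # toy labels generated from y = 2x + 1
w, b = train(xs, ys)
```

After training, `w` and `b` should sit very close to the slope and intercept used to generate the toy labels.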

Slide 20

Slide 20 text

Validation

Slide 21

Slide 21 text

Labeled Data

Slide 22

Slide 22 text

Labeled Data 70% 30% Training Validation
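
A 70% / 30% split like this one could be sketched as follows. The function name, the fixed seed, and the toy records are assumptions made for illustration.

```python
import random

def split_labeled_data(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the labeled records, then cut it into two parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy labeled data: (feature, label) pairs.
records = [(x, 2 * x) for x in range(10)]
training, validation = split_labeled_data(records)
```

Shuffling before cutting matters when the records are stored in some meaningful order, for example by date.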

Slide 23

Slide 23 text

Be Careful of Overfitting

Slide 24

Slide 24 text

Be Careful of Overfitting

Slide 25

Slide 25 text

Be Careful of Overfitting

Slide 26

Slide 26 text

Better Fitting

Slide 27

Slide 27 text

Better Fitting

Slide 28

Slide 28 text

Different Models ⇒ Different Predictions

Slide 29

Slide 29 text

Regularization: Adding a “cost” for using larger parameters in the model. L1, L2, Neural Network Dropouts
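
As a rough sketch of what a "cost" for larger parameters can look like, L1 and L2 regularization add a penalty term to the training loss. The function names and the `strength` parameter below are illustrative assumptions; dropout works differently and is not shown.

```python
def l1_penalty(weights, strength):
    """L1: cost grows with the absolute size of each weight."""
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    """L2: cost grows with the square of each weight."""
    return strength * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, strength=0.01, kind="l2"):
    """Total loss = error on the data + penalty for large weights."""
    if kind == "l2":
        penalty = l2_penalty(weights, strength)
    else:
        penalty = l1_penalty(weights, strength)
    return data_loss + penalty
```

With the same error on the data, a model with larger weights gets a higher total loss, so training is pushed toward simpler models.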

Slide 30

Slide 30 text

Feature Scaling: Large numbers can have a bigger impact in a mathematical model. Normalization: So they are all in the same range, for example between 0 and 1.
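
Bringing a feature into the range between 0 and 1 can be sketched with min-max scaling. This is one common choice among several, and the function name is an assumption.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `min_max_scale([10, 20, 30])` returns `[0.0, 0.5, 1.0]`.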

Slide 31

Slide 31 text

Model Parameters “Hyperparameters”

Slide 32

Slide 32 text

Labeled Data 70% 15% Training Validation 15% Test Try Different Hyperparameters

Slide 33

Slide 33 text

Labeled Data Cross Validation
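
Cross validation reuses the same labeled data several times, each time holding out a different part for validation. Below is a minimal k-fold sketch; the helper name and the way folds are cut (every k-th record) are assumptions for illustration.

```python
def k_fold_splits(records, k=5):
    """Yield (training, validation) pairs, one per fold."""
    # Fold i takes every k-th record starting at position i.
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield training, validation

splits = list(k_fold_splits(list(range(10)), k=5))
```

Each record lands in the validation part exactly once, and the model would be trained and evaluated once per fold.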

Slide 34

Slide 34 text

Neural Networks

Slide 35

Slide 35 text

1943 Warren McCulloch, Walter Pitts Threshold Logic Units

Slide 36

Slide 36 text

1962 Frank Rosenblatt Perceptron

Slide 37

Slide 37 text

https://github.com/cdipaolo/goml/tree/master/perceptron weights activation function input output
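
The four labels on this slide (input, weights, activation function, output) can be sketched in a few lines. The step activation and the hand-picked weights that implement a logical AND are assumptions chosen for illustration; this is not code from the linked repository.

```python
def step(value):
    """Activation function: fire (1) when the weighted sum reaches zero."""
    return 1 if value >= 0 else 0

def perceptron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(weighted_sum)

# Hand-picked parameters that make the perceptron behave like a logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5
```

Training a perceptron means finding such weights automatically from labeled examples instead of picking them by hand.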

Slide 38

Slide 38 text

https://en.wikipedia.org/wiki/Artificial_neural_network Multiple Layers Backpropagation

Slide 39

Slide 39 text

Microprocessor Transistor Counts 1971-2011 Intel E7 CPU 4-24 cores NVIDIA K80 GPU 2,496 cores https://en.wikipedia.org/wiki/Moore's_law

Slide 40

Slide 40 text

LeCun, Gradient-Based Learning Applied to Document Recognition,1998 Hinton, A Fast Learning Algorithm for Deep Belief Nets, 2006 Bengio, Learning Deep Architectures for AI, 2009 Advances in Research 1998-2009

Slide 41

Slide 41 text

Image Processing

Slide 42

Slide 42 text

Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg Convolution Matrix

Slide 43

Slide 43 text

Convolution Matrix Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

Slide 44

Slide 44 text

Convolution Matrix 0 0 0 0 1 0 0 0 0 Identity Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

Slide 45

Slide 45 text

Convolution Matrix 1 0 -1 2 0 -2 1 0 -1 Left Edges Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

Slide 46

Slide 46 text

Convolution Matrix -1 0 1 -2 0 2 -1 0 1 Right Edges Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

Slide 47

Slide 47 text

Convolution Matrix 1 2 1 0 0 0 -1 -2 -1 Top Edges Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

Slide 48

Slide 48 text

Convolution Matrix -1 -2 -1 0 0 0 1 2 1 Bottom Edges Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg
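
Applying a convolution matrix such as the ones on these slides can be sketched as sliding the 3x3 grid over the image and summing the weighted pixels. The sketch assumes a grayscale image stored as a list of rows, skips the border pixels, and applies the kernel without flipping it (a common convention in image tools and CNN libraries); the tiny test image is made up.

```python
def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over a 2D grayscale image (border pixels skipped)."""
    height, width = len(image), len(image[0])
    output = []
    for row in range(1, height - 1):
        out_row = []
        for col in range(1, width - 1):
            total = 0
            # Weighted sum of the pixel and its eight neighbours.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    total += image[row + dr][col + dc] * kernel[dr + 1][dc + 1]
            out_row.append(total)
        output.append(out_row)
    return output

IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
LEFT_EDGES = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

# Made-up 4x4 image: bright on the left, dark on the right.
image = [
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [9, 9, 0, 0],
]
```

The identity kernel returns the inner pixels unchanged, while the "Left Edges" kernel gives a strong response where brightness drops from left to right.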

Slide 49

Slide 49 text

Convolution Matrix 0.6 -0.6 1.2 -1.4 1.2 -1.6 0.8 -1.4 1.6 Random Values Photo by David Iliff. License: CC-BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Colosseum_in_Rome,_Italy_-_April_2007.jpg

Slide 50

Slide 50 text

Convolutional Neural Networks (CNNs) https://en.wikipedia.org/wiki/Convolutional_neural_network

Slide 51

Slide 51 text

ImageNet Classification Error Over Time (chart: Classification Error, axis from 0 to 30, by year from 2010 to 2016; CNNs marked)

Slide 52

Slide 52 text

2012 ImageNet Classification with Deep Convolutional Neural Networks

Slide 53

Slide 53 text

SuperVision: 8 layers, 60M parameters

Slide 54

Slide 54 text

2013 Visualizing and Understanding Convolutional Networks

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

How Do Neural Networks Learn? ? More generic and can be reused as feature extractor for other visual tasks Specific to task Cat Dog

Slide 58

Slide 58 text

output input state output input state(t) memory Feedforward Neural Networks (no cycle) Recurrent Neural Networks (directed cycle) What About Memory?

Slide 59

Slide 59 text

https://en.wikipedia.org/wiki/Long_short-term_memory Long Short-Term Memory (LSTM) How much goes into memory How much is used in computing the output How much remains in memory
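
The three questions on this slide correspond to the three gates of a standard LSTM cell. The sketch below is a single step with scalar values to keep it readable; real cells use vectors and matrices, and the parameter layout (a dict of `(w, u, b)` tuples per gate) is an assumption made for illustration.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0..1, so it can act as a 'how much' dial."""
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step for scalar input x, previous output h_prev, memory c_prev."""
    def pre(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    input_gate = sigmoid(pre("input"))     # how much goes into memory
    forget_gate = sigmoid(pre("forget"))   # how much remains in memory
    output_gate = sigmoid(pre("output"))   # how much is used in computing the output
    candidate = math.tanh(pre("candidate"))  # new content proposed for memory

    c = forget_gate * c_prev + input_gate * candidate
    h = output_gate * math.tanh(c)
    return h, c
```

The memory `c` is carried from one step to the next, which is the directed cycle that distinguishes recurrent networks from feedforward ones.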

Slide 60

Slide 60 text

http://www.asimovinstitute.org/neural-network-zoo/ Lots of Parameters Network Architectures defined by Hyperparameters Dropout Layers for Regularization

Slide 61

Slide 61 text

Generative Adversarial Networks (GANs) Generator Neural Network Discriminator Neural Network Real or Generated? Real Picture Generated Picture

Slide 62

Slide 62 text

2014 Generative Adversarial Networks (GANs)

Slide 63

Slide 63 text

2016 Generative Adversarial Networks (GANs)

Slide 64

Slide 64 text

How much can you improve? Bayes Limit Vs Human Performance Training better than Validation? Learning Bias Data Leakage Do you have all the Data? Building a Model

Slide 65

Slide 65 text

Artificial Intelligence & Deep Learning At Amazon Thousands Of Employees Across The Company Focused on AI Discovery & Search Fulfilment & Logistics Add ML-powered features to existing products Echo & Alexa

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

THE ALEXA ECOSYSTEM: Supported by two powerful frameworks
ALEXA VOICE SERVICE: Unparalleled Distribution: AVS allows your content to be everywhere
ALEXA SKILLS KIT: Create Great Content: ASK is how you connect to your consumer
Lives In The Cloud, Automated Speech Recognition (ASR), Natural Language Understanding (NLU), Always Learning

Slide 68

Slide 68 text

UNDER THE HOOD OF ASK: A closer look at how the Alexa Skills Kit processes a request and returns an appropriate response
1. User Makes a Request
2. Audio Stream is sent up to Alexa
3. Alexa Identifies Skill & Recognizes Intent Through ASR & NLU
4. Alexa sends Customer Intent to Your Service
5. Your Service processes Request
6. You Pass Back a Textual or Audio Response; You Pass Back a Graphical Response
7. Alexa Converts Text-to-Speech (TTS) & Renders Graphical Component
8. Respond to Intent through Text & Visual

Slide 69

Slide 69 text

Artificial Intelligence on AWS P2 Deep Learning AMI and template Investment in Apache MXNet

Slide 70

Slide 70 text

Instance Families: P2 (General Purpose GPU), M4 (General Purpose), D2 (Dense storage), X1 (Large memory), G2 (Graphics intensive), T2 (Burstable), R4 (Memory intensive), I3 (High I/O), C5 (Compute intensive), F1 (FPGAs), Lightsail (Simple VPS), Elastic GPUs On EC2

Slide 71

Slide 71 text

Amazon EC2 P2 Instances
Up to 40 thousand parallel processing cores, 70 teraflops (single precision), over 23 teraflops (double precision)
Instance Size   GPUs   GPU Peer to Peer   vCPUs   Memory (GiB)   Network Bandwidth*
p2.xlarge       1      -                  4       61             1.25Gbps
p2.8xlarge      8      Y                  32      488            10Gbps
p2.16xlarge     16     Y                  64      732            20Gbps
*In a placement group

Slide 72

Slide 72 text

Elastic GPUs For EC2: GPU Acceleration For Graphics Workloads 1GiB GPU Memory 2 GiB 4 GiB 8 GiB Current Generation EC2 Instance

Slide 73

Slide 73 text

F1 Instances: Bringing Hardware Acceleration To All FPGA Images Available In AWS Marketplace F1 Instance With your custom logic running on an FPGA Develop, simulate, debug & compile your code Package as FPGA Images

Slide 74

Slide 74 text

Apache MXNet

Slide 75

Slide 75 text

Deep Learning AMI
Deep Learning Frameworks: MXNet, Caffe, TensorFlow, Theano, Torch, CNTK and Keras
Pre-installed components to speed productivity, such as NVIDIA drivers, CUDA, cuDNN, Intel MKL-DNN with MXNet, Anaconda, Python 2 and 3
AWS Integration

Slide 76

Slide 76 text

Apache Spark MLlib

Slide 77

Slide 77 text

Amazon AI Bringing Powerful Artificial Intelligence To All Developers

Slide 78

Slide 78 text

Amazon Rekognition Image Recognition And Analysis Powered By Deep Learning 1

Slide 79

Slide 79 text

Amazon Rekognition: Images In, Categories and Facial Analysis Out Amazon Rekognition Car Outside Daytime Driving Objects & Scenes Female Smiling Sunglasses Faces

Slide 80

Slide 80 text

Ground Truth Generation Training

Slide 81

Slide 81 text

Deep Learning Process Conv 1 Conv 2 Conv n … … Feature Maps Labrador Dog Beach Outdoors Softmax Probability Fully Connected Layer
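
The "Softmax Probability" step at the end of this pipeline turns the raw scores of the last layer into values that sum to 1. A minimal sketch follows; the labels are the ones on the slide, and the raw scores are made up for illustration.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["Labrador", "Dog", "Beach", "Outdoors"]
raw_scores = [2.0, 1.5, 0.3, 0.1]  # made-up outputs of the fully connected layer
probabilities = dict(zip(labels, softmax(raw_scores)))
```

The label with the highest raw score also gets the highest probability; softmax changes the scale, not the ranking.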

Slide 82

Slide 82 text

Amazon Rekognition

Slide 83

Slide 83 text

Amazon Polly Text To Speech Powered By Deep Learning 2

Slide 84

Slide 84 text

Amazon Polly: Text In, Life-like Speech Out Amazon Polly “The temperature in WA is 75°F” “The temperature in Washington is 75 degrees Fahrenheit”

Slide 85

Slide 85 text

TEXT: Market grew by > 20%.
WORDS: Market grew by more than twenty percent
PHONEMES: ˈmɑɹ.kət ˈgɹu baɪ ˈmoʊɹ ˈðæn ˈtwɛn.ti pɚ.ˈsɛnt
PROSODY CONTOUR
TEXT PROCESSING, PROSODY MODIFICATION, UNIT SELECTION AND ADAPTATION, STREAMING; Speech units inventory

Slide 86

Slide 86 text

Amazon Polly

Slide 87

Slide 87 text

Amazon ALEXA (It’s what’s inside Alexa) 3 Natural Language Understanding (NLU) & Automatic Speech Recognition (ASR) Powered By Deep Learning

Slide 88

Slide 88 text

Amazon Lex: Speech Recognition & Natural Language Understanding Amazon Lex Automatic Speech Recognition Natural Language Understanding “What’s the weather forecast?” Weather Forecast

Slide 89

Slide 89 text

Amazon Lex: Speech Recognition & Natural Language Understanding Amazon Lex Automatic Speech Recognition Natural Language Understanding “What’s the weather forecast?” “It will be sunny and 25°C” Weather Forecast

Slide 90

Slide 90 text

Lex Bot Structure Utterances Spoken or typed phrases that invoke your intent BookHotel Intents An Intent performs an action in response to natural language user input Slots Slots are input data required to fulfill the intent Fulfillment Fulfillment mechanism for your intent
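
To make the four building blocks concrete, here is a hypothetical in-memory model of the BookHotel intent. It is not the Amazon Lex API: the class, its fields, and the slot names are assumptions used only to show how utterances, slots, and fulfillment relate to an intent.

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    """Hypothetical model of an intent; not the Amazon Lex API."""
    name: str
    utterances: list  # spoken or typed phrases that invoke the intent
    slots: dict = field(default_factory=dict)  # slot name -> value, None if unfilled

    def missing_slots(self):
        """Slots the bot still has to ask the user for."""
        return [slot for slot, value in self.slots.items() if value is None]

    def fulfill(self):
        """Stand-in fulfillment step: runs only once every slot is filled."""
        missing = self.missing_slots()
        if missing:
            raise ValueError("cannot fulfill, missing slots: {}".format(missing))
        return "Your hotel is booked for {}".format(self.slots["CheckIn"])

book_hotel = Intent(
    name="BookHotel",
    utterances=["Book a Hotel", "Book a Hotel in NYC"],
    slots={"City": "New York City", "CheckIn": None, "CheckOut": None},
)
```

A bot would keep prompting the user until `missing_slots()` is empty and only then call the fulfillment step.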

Slide 91

Slide 91 text

“Book a Hotel in NYC” -> Automatic Speech Recognition -> Book Hotel NYC -> Natural Language Understanding (Intent/Slot Model, Utterances: “Book a Hotel”) -> Hotel Booking, New York City
Hotel Booking slots after the request: City: New York City, Check In: (empty), Check Out: (empty)
Hotel Booking slots once filled: City: New York City, Check In: Nov 30th, Check Out: Dec 2nd
Confirmation: “Can I go ahead with the booking?”
Polly: “Your hotel is booked for Nov 30th”

Slide 92

Slide 92 text

No content

Slide 93

Slide 93 text

Amazon Lex

Slide 94

Slide 94 text

Amazon Machine Learning Create ML models without having to learn complex algorithms and technology 4

Slide 95

Slide 95 text

Train model Evaluate and optimize Retrieve predictions Building smart applications with Amazon ML 1 2 3

Slide 96

Slide 96 text

Train model Evaluate and optimize Retrieve predictions Building smart applications with Amazon ML Create a datasource object pointing to your data Explore and understand your data Transform data and train your model 1 2 3

Slide 97

Slide 97 text

Create a datasource object
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> # Point the datasource at the input data and its schema in S3
>>> ds = ml.create_data_source_from_s3(
...     data_source_id='my_datasource',
...     data_spec={
...         'DataLocationS3': 's3://bucket/input/',
...         'DataSchemaLocationS3': 's3://bucket/input/.schema'},
...     compute_statistics=True)

Slide 98

Slide 98 text

Explore and understand your data

Slide 99

Slide 99 text

Train your model
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> # Train a regression model on the datasource created earlier
>>> model = ml.create_ml_model(
...     ml_model_id='my_model',
...     ml_model_type='REGRESSION',
...     training_data_source_id='my_datasource')

Slide 100

Slide 100 text

Train model Evaluate and optimize Retrieve predictions Building smart applications with Amazon ML Understand model quality Adjust model interpretation 1 2 3

Slide 101

Slide 101 text

Explore model quality

Slide 102

Slide 102 text

Fine-tune model interpretation

Slide 103

Slide 103 text

Fine-tune model interpretation

Slide 104

Slide 104 text

Train model Evaluate and optimize Retrieve predictions Building smart applications with Amazon ML Batch predictions Real-time predictions 1 2 3

Slide 105

Slide 105 text

Batch predictions
Asynchronous, large-volume prediction generation
Request through service console or API
Best for applications that deal with batches of data records
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> # Results are written to the S3 location given in output_uri
>>> model = ml.create_batch_prediction(
...     batch_prediction_id='my_batch_prediction',
...     batch_prediction_data_source_id='my_datasource',
...     ml_model_id='my_model',
...     output_uri='s3://examplebucket/output/')

Slide 106

Slide 106 text

Real-time predictions
Synchronous, low-latency, high-throughput prediction generation
Request through service API or server or mobile SDKs
Best for interactive applications that deal with individual data records
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> # Send a single record to the model's real-time endpoint
>>> ml.predict(
...     ml_model_id='my_model',
...     predict_endpoint='example_endpoint',
...     record={'key1': 'value1', 'key2': 'value2'})
{
    'Prediction': {
        'predictedValue': 13.284348,
        'details': {
            'Algorithm': 'SGD',
            'PredictiveModelType': 'REGRESSION'
        }
    }
}

Slide 107

Slide 107 text

Bike Sharing

Slide 108

Slide 108 text

No content

Slide 109

Slide 109 text

All Users Casual Users Registered Users

Slide 110

Slide 110 text

Your Skill (Lambda function) Amazon Machine Learning get real-time predictions invoke Weather Forecast Historical Data get forecast build & train model

Slide 111

Slide 111 text

No content

Slide 112

Slide 112 text

Nikola Tesla, 1926 “When wireless is perfectly applied, the whole earth will be converted into a huge brain…”

Slide 113

Slide 113 text

Let’s Build Smarter Apps Using Services Platforms Engines

Slide 114

Slide 114 text

Machine Learning for Developers Danilo Poccia @danilop danilop AWS Technical Evangelist