Slide 1

Slide 1 text

ML: concepts January 2017

Slide 2

Slide 2 text

Format
Slack channel: #ml-courses
● Today: concepts
● 08 Feb: understanding an ML project: what are the good questions?
● 20 Feb and following:
  ○ more technical sessions
  ○ optional “homework” between sessions (small projects)
  ○ current plan: 7 technical sessions
  ○ goal: be able to work on ML projects autonomously

Slide 3

Slide 3 text

Machine Learning

Slide 4

Slide 4 text

Hierarchy

Slide 5

Slide 5 text

Explaining Machine Learning
Machine learning is the idea that there are generic algorithms that can tell you something interesting about a set of data without you having to write any custom code specific to the problem. Instead of writing code, you feed data to the generic algorithm and it builds its own logic based on the data.
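To make "generic algorithm, problem-specific data" concrete, here is a minimal sketch. It uses a 1-nearest-neighbour classifier, chosen only for illustration (the slides do not name an algorithm here), and both toy datasets are made up. The same function is reused unchanged on two unrelated problems.

```python
def nearest_neighbour(examples, query):
    """Return the label of the training example closest to `query`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(examples, key=lambda ex: squared_distance(ex[0], query))
    return label


# Hypothetical dataset 1: (height in cm, weight in kg) -> animal
animals = [((25, 4), "cat"), ((60, 30), "dog"), ((23, 5), "cat"), ((70, 35), "dog")]

# Hypothetical dataset 2: (number of links, number of exclamation marks) -> mail type
mails = [((0, 0), "ham"), ((7, 9), "spam"), ((1, 1), "ham"), ((5, 12), "spam")]

# No problem-specific code: only the data changes between the two calls.
animal_guess = nearest_neighbour(animals, (28, 6))
mail_guess = nearest_neighbour(mails, (6, 10))
print(animal_guess, mail_guess)  # cat spam
```

The "logic" for each problem lives entirely in the examples passed in, which is the point the slide makes.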

Slide 6

Slide 6 text

Machine Learning vs Statistics?
They are both concerned with the same question: how do we learn from data?

Statistics       Machine Learning
Estimation       Learning
Classifier       Hypothesis
Data point       Example / Instance
Regression       Supervised learning
Classification   Supervised learning
Covariate        Feature
Response         Label

Slide 7

Slide 7 text

Inputs
Raw data
Multimedia (sound, pictures, videos)
Language

Slide 8

Slide 8 text

ML vs “traditional” methods

Slide 9

Slide 9 text

Minsky’s Multiplicity (1960)
Crucial parts for problem solving:
● Induction
● Planning
● Search, knowledge representation
● Pattern recognition
● Learning
Components needed to get to human-level AI

Slide 11

Slide 11 text

Topics for Machine Learning
● Self-driving cars
● Human interaction:
  ○ Handwriting
  ○ Speech
  ○ Natural language
● OCR
● Image recognition
● Information retrieval
● Artificial personal assistants
● Recommendation systems
● Drones
● Game playing
● ...

Slide 12

Slide 12 text

Subdomains

Slide 13

Slide 13 text

Subdomains of Machine Learning Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning

Slide 14

Slide 14 text

Supervised learning

Slide 15

Slide 15 text

Supervised learning Labelled data Classification Regression

Slide 16

Slide 16 text

Demo: Intro to Machine Learning
http://www.r2d3.us/visual-intro-to-machine-learning-part-1/

Slide 17

Slide 17 text

Unsupervised learning

Slide 18

Slide 18 text

Unsupervised learning

Slide 19

Slide 19 text

Reinforcement learning

Slide 20

Slide 20 text

Reinforcement learning

Slide 21

Slide 21 text

Reinforcement Learning: OpenAI

Slide 22

Slide 22 text

First intuitions

Slide 23

Slide 23 text

Linear regression

Slide 24

Slide 24 text

Intuitions from linear regression
● the algorithm is generic; the results depend on the data
● the system is both the algorithm and the data
● it is only as good as your data
● it starts with a hypothesis about how we can represent the data (for linear regression: a straight line)
● it can deal poorly with outliers
● learning takes a lot of calculation, but applying the result is very fast (it can run on mobile)
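The intuitions above can be sketched with ordinary least squares on a straight line. This is a minimal illustration and not the demo used in the session; the function names and the toy data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the hypothesis y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Applying the learned model is one multiplication and one addition."""
    slope, intercept = model
    return slope * x + intercept


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
clean_ys = [2.0, 4.0, 6.0, 8.0, 10.0]     # made-up data lying exactly on y = 2x
outlier_ys = [2.0, 4.0, 6.0, 8.0, 30.0]   # same data, but the last point is an outlier

clean_model = fit_line(xs, clean_ys)       # slope 2, intercept 0
outlier_model = fit_line(xs, outlier_ys)   # slope 6, intercept -8

print(clean_model, outlier_model)
print(predict(clean_model, 6.0), predict(outlier_model, 6.0))
```

The fitting code is identical in both runs; a single bad data point triples the slope, which is what "only as good as your data" and "can deal poorly with outliers" mean in practice.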

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Tools

Slide 27

Slide 27 text

Artificial Neural Networks (ANN)
● Each node computes a linear combination (weighted sum) of the nodes in the previous layer, usually followed by a non-linear activation
● More nodes can handle more complexity
● Training happens in multiple steps:
  ○ left to right to evaluate the training set
  ○ right to left to propagate errors
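The two training directions can be sketched with a very small network in plain Python. Everything specific here is an assumption made for illustration: the 2-2-1 layout, the sigmoid activation, the starting weights, the learning rate, and the toy task (the logical OR function).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Toy training set (illustrative assumption): the logical OR function.
inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 1.0]

# Fixed, arbitrary starting weights so that the run is reproducible.
hidden_w = [[0.5, -0.4], [-0.3, 0.2]]  # hidden_w[j][i]: input i -> hidden node j
hidden_b = [0.0, 0.0]
output_w = [0.3, -0.1]                 # hidden node j -> output node
output_b = [0.0]                       # one-element list so it can be updated in place


def forward(x):
    """Left to right: each node is a weighted sum passed through a sigmoid."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(hidden_w[j], x)) + hidden_b[j])
              for j in range(2)]
    output = sigmoid(sum(w * h for w, h in zip(output_w, hidden)) + output_b[0])
    return hidden, output


def mean_squared_error():
    return sum((forward(x)[1] - t) ** 2 for x, t in zip(inputs, targets)) / len(inputs)


def train_step(learning_rate):
    """One pass over the training set, updating the weights after each example."""
    for x, t in zip(inputs, targets):
        hidden, output = forward(x)
        # Right to left: error signal at the output node, then at each hidden node.
        # (Gradient of half the squared error, so there is no factor of 2.)
        delta_out = (output - t) * output * (1.0 - output)
        delta_hidden = [delta_out * output_w[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(2)]
        for j in range(2):
            output_w[j] -= learning_rate * delta_out * hidden[j]
            hidden_b[j] -= learning_rate * delta_hidden[j]
            for i in range(2):
                hidden_w[j][i] -= learning_rate * delta_hidden[j] * x[i]
        output_b[0] -= learning_rate * delta_out


loss_before = mean_squared_error()
for _ in range(2000):
    train_step(learning_rate=0.5)
loss_after = mean_squared_error()
print(loss_before, loss_after)
```

Real projects would use a library such as TensorFlow rather than hand-written loops; the sketch only shows that "evaluate, then propagate errors back" is a short, mechanical procedure.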

Slide 28

Slide 28 text

Demo: TensorFlow Playground

Slide 29

Slide 29 text

Deep learning

Slide 30

Slide 30 text

Deep learning

Slide 31

Slide 31 text

Convolutional neural networks

Slide 32

Slide 32 text

Demo: MNIST

Slide 33

Slide 33 text

Recurrent neural networks

Slide 34

Slide 34 text

Generative Adversarial Nets

Slide 35

Slide 35 text

State of the art

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

IBM Watson
● Healthcare:
  ○ Diagnostics
  ○ Test suggestions
  ○ Prescription recommendations
● Legal:
  ○ Hired as a lawyer (“Ross”)
● Teaching:
  ○ Used as a TA (“Jill Watson”)
● Cooking:
  ○ published a recipe book
  ○ new combinations
  ○ able to avoid allergies

Slide 38

Slide 38 text

Google auto-captioning (2014)
“Two pizzas sitting on top of a stove top oven”

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

Cloud Vision API
Available on Google Cloud
● Automatic labelling
● Sentiment analysis
● Text extraction
● Landmark detection
● Logo detection
● Explicit content detection

Slide 41

Slide 41 text

SyntaxNet and Parsey McParseface

Slide 42

Slide 42 text

SyntaxNet and Parsey McParseface

Slide 43

Slide 43 text

SyntaxNet and Parsey McParseface
Parsey McParseface can correctly read:
● The old man the boat.
● While the man hunted the deer ran into the woods.
● While Anna dressed the baby played in the crib.
● Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
It makes mistakes on:
● I convinced her children are noisy.
● The coach smiled at the player tossed the frisbee.
● The cotton clothes are made up of grows in Mississippi.
● James while John had had had had had had had had had had had a better effect on the teacher

Slide 44

Slide 44 text

Demo: Pride and Prejudice

Slide 45

Slide 45 text

Google Translate

Slide 46

Slide 46 text

Google Translate

Slide 47

Slide 47 text

Google Translate

Slide 48

Slide 48 text

NeuralArt (TensorFlow)
Content, Style, Output

Slide 49

Slide 49 text

Facial manipulation (Feb 2015)

Slide 50

Slide 50 text

Face recognition (April 2014)

Slide 51

Slide 51 text

Finding similarities in art (Aug 2014)

Slide 52

Slide 52 text

Audio prediction on video (April 2016)

Slide 53

Slide 53 text

Predicting scene from sound

Slide 54

Slide 54 text

Extract instructions from YouTube videos (Nov 2015)

Slide 55

Slide 55 text

Reading old journals

Slide 56

Slide 56 text

Creating videos of the future

Slide 57

Slide 57 text

Applications of NLP at Quora
- automatic grammar correction
- question quality
- duplicate question detection
- related question suggestion
- topic biography quality (= qualifications of writer)
- topic labeler (from “science” to narrow topics like “tennis courts in Mountain View”)
- search
- answer summaries
- automatic answers wiki
- hate speech/harassment detection
- spam detection
- question edit quality

Slide 58

Slide 58 text

Questions?