ML Session n°1

ML: concepts January 2017

Format
• Slack channel: #ml-courses
• Today: concepts
• 08 Feb: understanding an ML project: what are the good questions?
• 20 Feb and following:
  ◦ more technical sessions
  ◦ optional “homework” between sessions (small projects)
  ◦ current plan: 7 technical sessions
  ◦ goal: be able to work on ML projects autonomously

Machine Learning

Hierarchy

Explaining Machine Learning
Machine learning is the idea that there are generic algorithms that can tell you something interesting about a set of data without you having to write any custom code specific to the problem. Instead of writing code, you feed data to the generic algorithm and it builds its own logic based on the data.
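To make "generic algorithm, logic built from data" concrete, here is a minimal sketch using a 1-nearest-neighbour classifier. The choice of algorithm, the function name `nearest_neighbour` and both toy datasets are assumptions made for illustration; they do not come from the slides. The same function is reused unchanged on two unrelated problems.

```python
import math

def nearest_neighbour(examples, query):
    """Return the label of the training example closest to `query`."""
    closest = min(examples, key=lambda ex: math.dist(ex[0], query))
    return closest[1]

# Problem 1 (made-up data): fruit from (weight in g, diameter in cm).
fruit = [((150, 7.0), "apple"), ((170, 7.5), "apple"),
         ((10, 2.0), "cherry"), ((8, 1.8), "cherry")]

# Problem 2 (made-up data): message type from (number of links, number of "!").
messages = [((0, 0), "ham"), ((1, 0), "ham"),
            ((6, 4), "spam"), ((9, 7), "spam")]

# No problem-specific code: only the data changes between the two calls.
fruit_guess = nearest_neighbour(fruit, (12, 2.1))
message_guess = nearest_neighbour(messages, (7, 5))
print(fruit_guess, message_guess)
```

Nothing in `nearest_neighbour` knows about fruit or spam; swapping the dataset is enough to change what the system "knows".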

Machine Learning vs Statistics?
They are both concerned with the same question: how do we learn from data?

Statistics       Machine Learning
Estimation       Learning
Classifier       Hypothesis
Data Point       Example / Instance
Regression       Supervised Learning
Classification   Supervised Learning
Covariate        Feature
Response         Label

Inputs
• Raw data
• Multimedia (sound, pictures, videos)
• Language

ML vs “traditional” methods

Minsky’s Multiplicity (1960)
Crucial parts for problem solving:
• Induction
• Planning
• Search, knowledge representation
• Pattern recognition
• Learning
Components needed to get to human-level AI

Topics for Machine Learning
• Self-driving cars
• Human interaction:
  ◦ Handwriting
  ◦ Speech
  ◦ Natural language
• OCR
• Image recognition
• Information retrieval
• Artificial personal assistants
• Recommendation systems
• Drones
• Game playing
• ...

Subdomains

Subdomains of Machine Learning
Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning

Supervised learning

Supervised learning
• Labelled data
• Classification
• Regression
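The difference between classification and regression can be sketched on labelled data with a one-threshold "decision stump". This is an illustrative technique chosen here, not one named on the slide, and the helper names (`fit_stump`, `majority`) and all numbers are made up. The only thing that changes between the two tasks is the type of label (a category or a number) and how an error is counted.

```python
from statistics import mean

def fit_stump(xs, ys, predict_side, loss):
    """Try every midpoint between sorted feature values; keep the best threshold."""
    best = None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_pred, right_pred = predict_side(left), predict_side(right)
        error = (sum(loss(y, left_pred) for y in left)
                 + sum(loss(y, right_pred) for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_pred, right_pred)
    _, threshold, left_pred, right_pred = best
    return lambda x: left_pred if x <= threshold else right_pred

def majority(labels):
    """Most frequent label (ties broken alphabetically)."""
    return max(sorted(set(labels)), key=labels.count)

# Classification: the label is a category (made-up elevations in metres).
elevation = [2, 5, 8, 12, 40, 55, 70, 90]
city = ["New York"] * 4 + ["San Francisco"] * 4
classify = fit_stump(elevation, city, majority, lambda y, p: 0 if y == p else 1)

# Regression: the label is a number (made-up surfaces in m2, prices in thousands).
surface = [30, 40, 50, 60, 100, 120, 140, 160]
price = [200, 210, 190, 200, 500, 520, 480, 500]
estimate = fit_stump(surface, price, mean, lambda y, p: (y - p) ** 2)

print(classify(60), estimate(45))
```

In both cases the labels in the training data are what drives the learned threshold; without them the stump has nothing to optimise.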

Demo: Intro to Machine Learning
http://www.r2d3.us/visual-intro-to-machine-learning-part-1/

Unsupervised learning

Reinforcement learning

Reinforcement Learning: OpenAI

First intuitions

Linear regression

Intuitions from linear regression
• the algorithm is generic, the results depend on the data
• the system is both the algorithm and the data
• it is only as good as your data
• it starts with a hypothesis about how we can represent the data (for linear regression: a straight line)
• it can deal poorly with outliers
• learning takes a lot of calculation, but applying the result is very fast (can run on mobile)
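These intuitions can be checked on a small ordinary least squares fit. The sketch below uses the standard closed-form solution for a straight line; the function names and the data points are made up for illustration. It fits the same algorithm on clean data and on data with a single outlier, and shows that applying the learned model is one multiplication and one addition.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Applying the model is cheap: one multiply and one add."""
    slope, intercept = model
    return slope * x + intercept

xs = [1, 2, 3, 4, 5]
ys_clean = [2, 4, 6, 8, 10]      # exactly y = 2x
ys_outlier = [2, 4, 6, 8, 40]    # same data, last point replaced by an outlier

clean = fit_line(xs, ys_clean)
skewed = fit_line(xs, ys_outlier)

print("clean fit:  ", clean)     # slope 2, intercept 0
print("with outlier:", skewed)   # slope 8, intercept -12
print("prediction at x=6:", predict(clean, 6), "vs", predict(skewed, 6))
```

One bad point out of five is enough to multiply the slope by four, which is the "only as good as your data" point in miniature.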

Artificial Neural Networks (ANN)
• Each node does a linear combination of previous nodes
• More nodes can handle more complexity
• Training in multiple steps
  - left to right to evaluate the training set
  - right to left to propagate errors
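The two training passes can be sketched with a tiny network in plain Python. Everything specific here is an assumption made for illustration: the sigmoid applied after each linear combination, the squared-error loss, the 2-3-1 layout, the fixed starting weights, the XOR toy data and the learning rate. `forward` is the left-to-right pass, `backward` is the right-to-left pass that turns the output error into one gradient per weight.

```python
import copy
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(weights, x):
    """Left to right: each node combines the previous layer linearly, then squashes."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row[:-1], x)) + row[-1])
              for row in weights["hidden"]]
    out_row = weights["output"]
    output = sigmoid(sum(w * h for w, h in zip(out_row[:-1], hidden)) + out_row[-1])
    return hidden, output

def backward(weights, x, target):
    """Right to left: propagate the output error back to every weight."""
    hidden, output = forward(weights, x)
    # Loss is 0.5 * (output - target)^2 and sigmoid'(z) = s * (1 - s).
    delta_out = (output - target) * output * (1.0 - output)
    grad_output = [delta_out * h for h in hidden] + [delta_out]
    grad_hidden = []
    for h, w_out in zip(hidden, weights["output"][:-1]):
        delta_h = delta_out * w_out * h * (1.0 - h)
        grad_hidden.append([delta_h * xi for xi in x] + [delta_h])
    return {"hidden": grad_hidden, "output": grad_output}

def loss(weights, data):
    return sum(0.5 * (forward(weights, x)[1] - t) ** 2 for x, t in data)

def train(weights, data, learning_rate=0.5, epochs=2000):
    """Repeat forward + backward passes, nudging each weight against its gradient."""
    for _ in range(epochs):
        for x, target in data:
            grads = backward(weights, x, target)
            weights["output"] = [w - learning_rate * g
                                 for w, g in zip(weights["output"], grads["output"])]
            weights["hidden"] = [[w - learning_rate * g for w, g in zip(row, g_row)]
                                 for row, g_row in zip(weights["hidden"], grads["hidden"])]
    return weights

# XOR toy data; the last value in each weight row is the bias.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
initial = {"hidden": [[0.5, -0.4, 0.1], [-0.3, 0.6, -0.2], [0.8, 0.7, -0.5]],
           "output": [0.3, -0.6, 0.9, 0.05]}

trained = train(copy.deepcopy(initial), data)
print("loss before:", loss(initial, data))
print("loss after: ", loss(trained, data))
```

The printed losses let you inspect what training did on this toy problem; changing the number of rows in `"hidden"` (and the matching entries in `"output"`) is a quick way to experiment with "more nodes, more complexity".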

Demo: Tensorflow Playground

Deep learning

Convolutional neural networks

Demo: MNIST

Recurrent neural networks

Generative Adversarial Nets

State of the art

IBM Watson
• Healthcare:
  ◦ Diagnostics
  ◦ Tests suggestion
  ◦ Prescription recommendations
• Legal:
  ◦ Hired as a lawyer (“Ross”)
• Teaching:
  ◦ Used as a TA (“Jill Watson”)
• Cooking:
  ◦ published a recipe book
  ◦ new combinations
  ◦ able to avoid allergies

Google Auto captioning (2014)
“Two pizzas sitting on top of a stove top oven”

Cloud Vision API
Available on Google Cloud
• Automatic labelling
• Sentiment analysis
• Text extraction
• Landmark detection
• Logo detection
• Explicit content detection

SyntaxNet and Parsey McParseface

SyntaxNet and Parsey McParseface
Parsey McParseface can correctly read:
• The old man the boat.
• While the man hunted the deer ran into the woods.
• While Anna dressed the baby played in the crib.
• Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
It makes mistakes on:
• I convinced her children are noisy.
• The coach smiled at the player tossed the frisbee.
• The cotton clothes are made up of grows in Mississippi.
• James while John had had had had had had had had had had had a better effect on the teacher

Demo: Pride and Prejudice

Google Translate

NeuralArt (Tensorflow)
[Figure: content image, style image, output image]

Facial manipulation (Feb 2015)

Face recognition (April 2014)

Finding similarities in art (Aug 2014)

Audio prediction on video (April 2016)

Predicting scene from sound

Extract instructions from Youtube videos (Nov 2015)

Reading old journals

Creating videos of the future

Applications of NLP at Quora
- automatic grammar correction
- question quality
- duplicate question detection
- related question suggestion
- topic biography quality (= qualifications of writer)
- topic labeler (from “science” to narrow topics like “tennis courts in Mountain View”)
- search
- answer summaries
- automatic answers wiki
- hate speech/harassment detection
- spam detection
- question edit quality

Questions? January 2017


More Decks by Adrien Couque
