Deep Learning: An Introduction [Devoxx Belgium 2016]

Neural networks have seen renewed interest from data scientists and machine learning experts for their ability to accurately classify high-dimensional data. In this session we will discuss the fundamental algorithms behind neural networks and develop an intuition for how to train a deep neural network on large data sets. We will then use these algorithms to train a simple handwritten digit recognizer, and show how the technique generalizes to other images. In the second and final part, we will show you how to apply the same algorithms using DL4J, a Java deep learning library that integrates with Apache Spark. You will learn how to implement a neural network, monitor its training progress, and test its accuracy over time. Prior experience with Java and some basic algebra is a prerequisite.

Breandan Considine

November 09, 2016

Transcript

  1-2. Who am I?
    • Background in Computer Science, Machine Learning
    • Worked for a small ad-tech startup out of university
    • Spent two years as a Developer Advocate @JetBrains
    • Interested in machine learning and speech recognition
    • Enjoy writing code, traveling to conferences, reading
    • Say hello! @breandan | breandan.net | [email protected]
  3. Traditional ASR (automatic speech recognition)
    • Requires lots of hand-crafted feature engineering
    • Poor results: >25% word error rate (WER) for HMM-based architectures
  4. Why machine learning?
    • Prediction
    • Categorization
    • Anomaly detection
    • Personalization
    • Adaptive control
    • Playing games
  5. Traditional childhood education
    • One-size-fits-all curriculum
    • Teaching process is repetitive
    • Students are not fully engaged
    • Memorization over understanding
    • Encouragement can be inconsistent
    • Teaches to the test (not the real world)
  6-11. How can we disrupt early education?
    • Personalized learning
    • Teaching assistance
    • Adaptive feedback
    • Active engagement
    • Spaced repetition
    • Assistive technology
  12. Machine learning, for humans
    • Self-improvement
    • Language learning
    • Computer training
    • Special education
    • Reading comprehension
    • Content generation
  13. So what's a Tensor?
    • A "tensor" is just an n-dimensional array
    • Useful for working with complex data
    • We use (tiny) tensors every day!
  14-17. So what's a Tensor anyway?
    • A "tensor" is just an n-dimensional array
    • Useful for working with complex data
    • We use (tiny) tensors every day!
  18. So what's a Tensor anyway?
    • A "tensor" is just an n-dimensional array
    • Useful for working with complex data
    • We use tiny (and large) tensors every day!
  19. What are they good for?
    • Modeling complex systems and data sets (see the NumPy sketch below)
    • Capturing higher-order correlations
    • Representing dynamic relationships
    • Doing machine learning!
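    To make the "n-dimensional array" idea concrete, here is a small NumPy sketch; the shapes and variable names below are illustrative choices, not taken from the slides:

    import numpy as np

    scalar = np.array(5.0)              # rank-0 tensor: a single number
    vector = np.array([1.0, 2.0, 3.0])  # rank-1 tensor: e.g. a feature vector
    image  = np.zeros((28, 28))         # rank-2 tensor: a grayscale image
    rgb    = np.zeros((28, 28, 3))      # rank-3 tensor: an RGB image
    batch  = np.zeros((64, 28, 28, 3))  # rank-4 tensor: a mini-batch of RGB images

    for t in (scalar, vector, image, rgb, batch):
        print(t.ndim, t.shape)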
  20. 0 1

  21-22. Cool learning algorithm
    def classify(datapoint, weights):
        # Weighted sum of the inputs, with a constant 1 prepended so that
        # weights[0] acts as the bias term
        prediction = sum(x * y for x, y in zip([1] + datapoint, weights))
        if prediction < 0:
            return 0
        else:
            return 1
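    A quick sanity check of classify with made-up numbers (these weights are not from the deck); the prepended 1 pairs with weights[0], which therefore acts as the bias:

    print(classify([2, 3], [0.5, -1.0, 1.0]))   # 0.5*1 - 1.0*2 + 1.0*3 = 1.5 >= 0 -> 1
    print(classify([2, 3], [-4.0, -1.0, 1.0]))  # -4.0 - 2.0 + 3.0 = -3.0 < 0     -> 0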
  23. Cool learning algorithm
    def train(data_set):  # data_set is a list of Datum objects
        ...

    class Datum:
        def __init__(self, features, label):
            self.features = [1] + features  # prepend the constant bias input
            self.label = label
  24-26. Cool learning algorithm
    def train(data_set):
        weights = [0] * len(data_set[0].features)
        total_error = threshold + 1   # threshold: the total error at which to stop
        while total_error > threshold:
            total_error = 0
            for item in data_set:
                # classify prepends the bias itself, so pass the raw features
                error = item.label - classify(item.features[1:], weights)
                # RATE is the learning rate
                weights = [w + RATE * error * i
                           for w, i in zip(weights, item.features)]
                total_error += abs(error)
        return weights
  27. Cool learning algorithm
    weights = [w + RATE * error * i for w, i in zip(weights, item.features)]
    [Diagram: a perceptron forming the weighted sum Σ of inputs 1, i1, i2, ..., in with weights w0, w1, w2, ..., wn]
  28-29. Cool learning algorithm
    def train(data_set):
        weights = [0] * len(data_set[0].features)
        total_error = threshold + 1
        while total_error > threshold:
            total_error = 0
            for item in data_set:
                error = item.label - classify(item.features[1:], weights)
                weights = [w + RATE * error * i
                           for w, i in zip(weights, item.features)]
                total_error += abs(error)
        return weights
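    The training loop above leaves RATE and threshold as module-level constants. Here is a hedged usage sketch on a toy problem; the constants and the data set below are assumptions for illustration, not part of the deck:

    RATE = 0.1       # learning rate
    threshold = 0    # stop once an epoch makes no classification errors

    # Logical OR is linearly separable, so the perceptron is guaranteed to converge.
    data_set = [Datum([0, 0], 0), Datum([0, 1], 1), Datum([1, 0], 1), Datum([1, 1], 1)]
    weights = train(data_set)
    print([classify(d.features[1:], weights) for d in data_set])  # [0, 1, 1, 1]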
  30. Even Cooler Algorithm! (Backprop)
    train(trainingSet):
        initialize network weights randomly
        until average error stops decreasing (or you get tired):
            for each sample in trainingSet:
                prediction = network.output(sample)
                compute error (prediction - sample.output)
                compute error of (hidden -> output) layer weights
                compute error of (input -> hidden) layer weights
                update weights across the network
        save the weights
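    A minimal NumPy sketch of that loop, assuming one hidden layer, sigmoid activations, and squared error; every name and hyper-parameter here is an illustrative assumption rather than the deck's own code:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Tiny 2-4-1 network trained on XOR (made-up example data).
    np.random.seed(0)
    W1, b1 = np.random.randn(2, 4), np.zeros(4)   # input -> hidden weights
    W2, b2 = np.random.randn(4, 1), np.zeros(1)   # hidden -> output weights
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    rate = 1.0

    for epoch in range(5000):
        # Forward pass: the network's output for the whole batch at once
        hidden = sigmoid(X @ W1 + b1)
        prediction = sigmoid(hidden @ W2 + b2)
        # Backward pass: error at the output, then propagated back to the hidden layer
        d_out = (prediction - y) * prediction * (1 - prediction)
        d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
        # Update weights across the network
        W2 -= rate * hidden.T @ d_out
        b2 -= rate * d_out.sum(axis=0)
        W1 -= rate * X.T @ d_hidden
        b1 -= rate * d_hidden.sum(axis=0)

    print(prediction.round(2).ravel())  # should approach [0, 1, 1, 0]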
  31. A brief look at unsupervised learning
    • Where did my labels go?
    • Mostly clustering, separation, association
    • Many different methods:
      • Self-organizing maps
      • Expectation-maximization
      • Association rule learning
      • Recommender systems
  32-36. Cool clustering algorithm
    def cluster(data, k, max_it=1000):
        # Append a label column (initially 0) to every data point
        labeled = np.append(data, np.zeros((len(data), 1)), axis=1)
        # Pick k distinct points at random as the initial cluster centers
        random_pts = np.random.choice(len(labeled), k, replace=False)
        centers = labeled[random_pts]
        centers[:, -1] = range(1, k + 1)  # Assign labels 1..k to the centers
        it = 0
        old_centers = None
        # Iterate until the centers stop moving (or we hit max_it)
        while it < max_it and not np.array_equal(old_centers, centers):
            it += 1
            old_centers = np.copy(centers)
            update_labels(labeled, centers)
  37-41. Cool clustering algorithm
    def update_labels(data, centers):
        # Assign every point the label of its nearest center
        for datum in data:
            datum[-1] = centers[0, -1]
            min_dist = distance.euclidean(datum[:-1], centers[0, :-1])
            for center in centers:
                dist = distance.euclidean(datum[:-1], center[:-1])
                if dist < min_dist:
                    min_dist = dist
                    datum[-1] = center[-1]
  42. Cool clustering algorithm
    def cluster(data, k, max_it=1000):
        labeled = np.append(data, np.zeros((len(data), 1)), axis=1)
        random_pts = np.random.choice(len(labeled), k, replace=False)
        centers = labeled[random_pts]
        centers[:, -1] = range(1, k + 1)  # Assign labels
        it = 0
        old_centers = None
        while it < max_it and not np.array_equal(old_centers, centers):
            it += 1
            old_centers = np.copy(centers)
            update_labels(labeled, centers)
            update_centers(labeled, centers)  # new step: recompute the centers each pass
  43-47. Cool clustering algorithm
    def update_centers(data, centers):
        # Move each center to the mean of the points currently assigned to it
        k = len(centers)
        for i in range(1, k + 1):
            cluster = data[data[:, -1] == i, :-1]
            centers[i - 1, :-1] = np.mean(cluster, axis=0)
  48-49. Cool clustering algorithm
    def cluster(data, k, max_it=1000):
        labeled = np.append(data, np.zeros((len(data), 1)), axis=1)
        random_pts = np.random.choice(len(labeled), k, replace=False)
        centers = labeled[random_pts]
        centers[:, -1] = range(1, k + 1)  # Assign labels
        it = 0
        old_centers = None
        while it < max_it and not np.array_equal(old_centers, centers):
            it += 1
            old_centers = np.copy(centers)
            update_labels(labeled, centers)
            update_centers(labeled, centers)
        return labeled  # each row: the original point plus its cluster label
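    A hedged usage sketch showing cluster, update_labels, and update_centers working together; the imports, the synthetic blobs, and k=2 below are assumptions for illustration, not part of the deck:

    import numpy as np
    from scipy.spatial import distance  # provides distance.euclidean used above

    np.random.seed(42)
    blob_a = np.random.randn(50, 2)             # 50 points around (0, 0)
    blob_b = np.random.randn(50, 2) + [10, 10]  # 50 points around (10, 10)
    data = np.vstack([blob_a, blob_b])

    labeled = cluster(data, k=2)
    print(labeled.shape)              # (100, 3): x, y, and the assigned cluster label
    print(np.unique(labeled[:, -1]))  # the k cluster labels, here [1. 2.]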
  50. Data pre-processing
    • Data selection
    • Data processing
    • Formatting & cleaning
    • Sampling
    • Data transformation
    • Feature scaling & normalization (see the sketch below)
    • Decomposition & aggregation
    • Dimensionality reduction
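    As a small illustration of the feature scaling & normalization step (the values below are made up):

    import numpy as np

    X = np.array([[180.0, 75.0], [160.0, 55.0], [170.0, 65.0]])  # e.g. height (cm), weight (kg)

    # Min-max scaling squashes every column into [0, 1].
    X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

    # Standardization (z-scores) gives each column zero mean and unit variance.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)

    print(X_minmax)
    print(X_std)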
  51. Reinforcement Learning
    • Agent has a context and a set of choices
    • Each choice has an (unknown) reward
    • Goal: maximize cumulative reward (see the bandit sketch below)
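    That setting can be sketched as a multi-armed bandit with an epsilon-greedy agent; the reward probabilities and constants below are invented for illustration and are not from the deck:

    import random

    true_reward = [0.2, 0.5, 0.8]         # unknown to the agent: P(reward) per choice
    estimates = [0.0] * len(true_reward)  # the agent's running reward estimate per choice
    counts = [0] * len(true_reward)
    epsilon, total = 0.1, 0

    for step in range(10000):
        # Explore with probability epsilon, otherwise exploit the best estimate so far
        if random.random() < epsilon:
            arm = random.randrange(len(true_reward))
        else:
            arm = max(range(len(true_reward)), key=lambda a: estimates[a])
        reward = 1 if random.random() < true_reward[arm] else 0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
        total += reward

    print(counts, round(total / 10000, 2))  # most pulls should go to the 0.8 arm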
  52. Further resources
    • CS231n Course Notes
    • TensorFlow Models
    • Visualizing MNIST
    • Neural Networks and Deep Learning
    • Andrew Ng's Machine Learning class
    • Awesome Public Datasets
    • Amy Unruh & Eli Bixby's TensorFlow Workshop