Supervised learning:
• Predict Y from given X ➜ f(X) ≈ Y
• Examples:
  ◦ Regression
  ◦ Classification

Unsupervised learning:
• Data without labels
• Find hidden patterns or structure in the data
• Examples:
  ◦ Clustering
  ◦ Dimensionality reduction
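A minimal Python sketch contrasting the two settings (scikit-learn and the toy data are illustrative assumptions, not part of the slides):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Toy data: 200 samples with 5 features; y holds the labels.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Supervised: learn f(X) ≈ Y from labeled pairs (X, y).
clf = LogisticRegression(max_iter=1000).fit(X, y)
y_pred = clf.predict(X)

# Unsupervised: no labels, look for structure in X alone.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_
```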
Semi-supervised learning combines the two settings above.
• Both labeled and unlabeled examples are present in the dataset, but the data is mostly unlabeled.
• In many real-world applications getting labels is expensive; on the other hand, there is often a lot of unlabeled data.
• Semi-supervised methods help put the surplus of unlabeled data to use.
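A minimal sketch of this idea, assuming scikit-learn's LabelSpreading as one such method; marking unlabeled samples with -1 is the library's convention:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelSpreading

# Toy data: 300 points, but pretend labels are expensive and we
# only kept them for 10 points; the rest are marked -1 (unlabeled).
X, y = make_moons(n_samples=300, noise=0.1, random_state=0)
y_partial = np.full_like(y, -1)
labeled_idx = np.random.RandomState(0).choice(len(y), size=10, replace=False)
y_partial[labeled_idx] = y[labeled_idx]

# LabelSpreading propagates the few known labels through the graph
# of nearby points, putting the unlabeled surplus to use.
model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_partial)
print((model.transduction_ == y).mean())  # accuracy over all points
```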
That is, given an input vector X, we want to find Y. In other words, we want the neural net to find a mapping f(X) ≈ Y. But with unsupervised learning we don't have labels. Now, what happens if we use the same data as the codomain of the function? That is, we want to find a mapping f(X) ≈ X. Well, the neural net will now learn an identity mapping of X.
[Diagrams: a neural net mapping X → Y, and a neural net mapping X → X]
To prevent the net from simply learning the identity function, the hidden layer must be (much) smaller than the input and output layers. The NN is thus forced to learn a compressed representation of the input and to reconstruct the output from it. After training we are mostly interested in the compressed representation, i.e. the values produced by the hidden layer (the bottleneck).
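A minimal PyTorch sketch of such an autoencoder (the layer sizes and the choice of PyTorch are illustrative assumptions); note that the target of the loss is the input itself:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_in=784, n_bottleneck=32):
        super().__init__()
        # Encoder compresses the input down to the small bottleneck...
        self.encoder = nn.Sequential(nn.Linear(n_in, 128), nn.ReLU(),
                                     nn.Linear(128, n_bottleneck))
        # ...and the decoder tries to reconstruct the input from it.
        self.decoder = nn.Sequential(nn.Linear(n_bottleneck, 128), nn.ReLU(),
                                     nn.Linear(128, n_in))

    def forward(self, x):
        z = self.encoder(x)          # compressed representation
        return self.decoder(z), z

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)              # stand-in batch of flattened 28x28 images

opt.zero_grad()
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)  # f(X) ≈ X: compare output to input
loss.backward()
opt.step()
```

After training, only `model.encoder` is kept: its output `z` is the compressed representation we are interested in.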
Here we have a huge (256^(28×28)) input space of all possible 28×28 grayscale images, but the images that are meaningful to us form only a tiny subset of it. Therefore there is hope that if we feed in only the pictures we are interested in, the NN will be able to extract more abstract features than raw pixel intensities.
Dimensionality reduction takes high-dimensional data and reduces it to lower dimensions while retaining much of the original information. It finds a way to project the data into a low-dimensional space such that the cluster structure of the high-dimensional space is preserved.
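A short sketch using scikit-learn's t-SNE as one such method (an assumption; the slide does not name a specific algorithm):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 64-dimensional digit images projected down to 2D.
X, y = load_digits(return_X_y=True)
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
# Points from the same digit class tend to land in the same 2D cluster,
# i.e. the high-dimensional cluster structure is preserved.
```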
A GMM uses the probability of a sample to determine the feasibility of it belonging to a cluster.
• K-means assumes clusters to be spherical; GMM does not.
• K-means produces hard assignments, while GMM incorporates the degree of uncertainty into its predictions.
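A minimal sketch of the hard vs soft assignment contrast, assuming scikit-learn's KMeans and GaussianMixture:

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# K-means: hard assignment, exactly one cluster id per sample.
hard = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# GMM: soft assignment, one probability per cluster per sample.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
soft = gmm.predict_proba(X)   # shape (300, 3), each row sums to 1
```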