Introduction to Deep Learning

INTRODUCTION TO DEEP LEARNING TORCH TUTORIAL

WORKSHOP OVERVIEW • WHY DEEP LEARNING • HOW IT WORKS
• CONVNETS • WORKSHOP DEMO • FUTURE DIRECTION

WHY DEEP LEARNING? WHAT’S THE HYPE?

MACHINE LEARNING VS DEEP LEARNING WHAT’S THE DIFFERENCE?

MACHINE LEARNING VS DEEP LEARNING (1) MACHINE LEARNING IS ABOUT EXAMPLES. DEEP LEARNING IS ABOUT DATA.
▸ Machine learning lets you customize examples, features, and models.
▸ Deep learning is about one model to rule them all. As long as you have data, you can learn the parameters. But can you interpret them?

MACHINE LEARNING VS DEEP LEARNING (2) DEEP LEARNING IS ON THE RISE.
▸ Deep learning is growing more and more popular.
▸ It has proven its worth in industry.

APPLICATIONS IN INDUSTRY WHAT HAS DEEP LEARNING DONE?

NEURAL NETWORKS HOW DOES IT WORK?

NEURAL UNIT THE BASIC BUILDING BLOCK.

NEURAL UNIT (1) BIOLOGICAL ORIGINS.
▸ The computational neural unit is inspired by biology! (kind of)
▸ The brain has roughly 86 billion neurons and 10^15 synapses.
▸ Each neuron receives input signals through its dendrites and produces output signals along its axon, which connects to the dendrites of other neurons.
▸ A neuron can fire, and whether it fires depends on an activation threshold.
▸ Inhibitory and excitatory neurons!

NEURAL UNIT (2) BIOLOGY TO COMPUTATION.
▸ Super simplified version of a biological neuron. If even…
▸ Like a neural cell, a neural unit takes inputs and produces outputs.
▸ The output a unit produces is based on an activation function. It represents the “frequency” of firing (probability).
▸ The input signals are scaled multiplicatively to represent inhibition and excitation. These multipliers are the weights!

NEURAL UNIT (3) HOW DO YOU MODEL THE ACTIVATION FUNCTION?
▸ Inputs and outputs are easy to model.
▸ Sigmoid: maps any real number to a value between 0 and 1. Easy to see how this can serve as the activation function.
▸ Goes from not firing (0) to fully saturated firing at maximum frequency (1).
▸ In practice, people often use ReLU or the hyperbolic tangent because training tends to be more stable.

▸ A single neural unit is a linear classifier!
▸ Round the output to 0 or 1.
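
The neural unit described above can be sketched in a few lines of plain Python: a weighted sum of the inputs goes through a sigmoid, and the result is rounded to 0 or 1. The weights, bias, and function names here are illustrative assumptions, not values from the workshop.

```python
import math


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(inputs, weights, bias):
    """Weighted sum of the inputs followed by a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def classify(inputs, weights, bias):
    """Round the activation to a hard 0/1 decision (a linear classifier)."""
    return 1 if neural_unit(inputs, weights, bias) >= 0.5 else 0


# Hypothetical weights: positive ones act as excitatory, negative as inhibitory.
weights = [2.0, -1.0]
bias = -0.5

print(classify([1.0, 0.0], weights, bias))  # weighted sum is 1.5, prints 1
print(classify([0.0, 1.0], weights, bias))  # weighted sum is -1.5, prints 0
```

Because the sigmoid crosses 0.5 exactly where the weighted sum is zero, the decision boundary is a straight line (or hyperplane) in input space, which is why a single unit behaves as a linear classifier.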

NEURAL NETWORK BUILDING A COMPLEX STRUCTURE.

NEURAL NETWORK (1) COMBINING NEURONS.
▸ If a single neuron is already so powerful, what can you do with a bunch of them?
▸ You can combine neurons together into a network, creating “hidden layers”.

NEURAL NETWORK (3) MODULARITY IS SO IMPORTANT.
▸ Being able to think of a neural network as layers allows us to build it like Lego blocks.
▸ As long as we ensure that each layer is designed well, we can stack layers in any way we want! Super powerful.
▸ To change things, you can swap out, add, or remove layers.

NEURAL NETWORK (2) LAYER-WISE DESIGN.
▸ A single layer needs 3 things:
▸ A function to process input to output.
▸ A function to turn the derivative with respect to the output into the derivative with respect to the input.
▸ A function to get the gradients of the parameters.
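
As a rough illustration of those three functions, here is a fully connected layer written with plain Python lists. The class and method names (`Linear`, `forward`, `backward`, `param_gradients`) are assumptions chosen for this sketch and are not Torch's API.

```python
class Linear:
    """A fully connected layer: output = W x + b, written with plain lists."""

    def __init__(self, weights, bias):
        self.weights = weights  # one row of weights per output unit
        self.bias = bias        # one bias per output unit

    def forward(self, x):
        """Process input to output; cache the input for the backward pass."""
        self.x = x
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weights, self.bias)]

    def backward(self, grad_output):
        """Turn the derivative wrt the output into the derivative wrt the input."""
        return [sum(self.weights[j][i] * grad_output[j]
                    for j in range(len(grad_output)))
                for i in range(len(self.x))]

    def param_gradients(self, grad_output):
        """Gradients of the cost wrt the weights and the bias."""
        grad_w = [[g * xi for xi in self.x] for g in grad_output]
        grad_b = list(grad_output)
        return grad_w, grad_b


layer = Linear(weights=[[1.0, 2.0], [0.0, -1.0]], bias=[0.5, 0.0])
out = layer.forward([3.0, 4.0])                      # [11.5, -4.0]
grad_in = layer.backward([1.0, 1.0])                 # [1.0, 1.0]
grad_w, grad_b = layer.param_gradients([1.0, 1.0])   # [[3.0, 4.0], [3.0, 4.0]], [1.0, 1.0]
```

Any object that offers these three operations can be dropped into a stack of layers, whatever it computes internally.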

BACKPROPAGATION CALCULATING COST IN A SMART WAY.

BACKPROPAGATION (1) OPTIMIZATION.
▸ Inputs to outputs is intuitive. But why do we need derivatives?
▸ The goal of machine learning in general is to find the optimal parameters for some model.
▸ Think of a best-fit line! The slope is our parameter, and we minimize the distance of the points to the line.

BACKPROPAGATION (2) COST FUNCTION.
▸ Need a way to measure how far we are from the optimum!
▸ This is what we want to take the derivative of.
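
One common choice of cost, used here purely as an assumption since the slides do not name one, is the mean squared error. The sketch below applies it to a best-fit line through the origin, `y = slope * x`, with made-up data, and also writes out its derivative with respect to the slope.

```python
def mean_squared_error(slope, xs, ys):
    """Average squared vertical distance between the points and y = slope * x."""
    return sum((slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cost_derivative(slope, xs, ys):
    """Derivative of the mean squared error with respect to the slope."""
    return sum(2 * (slope * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(mean_squared_error(2.0, xs, ys))  # 0.0, a perfect fit
print(mean_squared_error(1.0, xs, ys))  # positive, a worse fit
print(cost_derivative(1.0, xs, ys))     # negative: raising the slope lowers the cost
```

The sign of the derivative says which way to move the parameter, and its magnitude says how steep the cost is at that point.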

BACKPROPAGATION (3) GRADIENT DESCENT.
▸ In general, we can optimize by gradient descent to find minima (or, with the sign flipped, maxima). These are only guaranteed to be local, not global.
▸ We can think of 3D data as a topographic map. There are valleys and mountains. To find an optimum, we want to take small steps in the right direction.
▸ A gradient/derivative gives the direction and size of the step!
▸ In practice, use variants such as Adagrad, momentum, or Newton-CG.
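
A minimal sketch of plain gradient descent, assuming a mean squared error cost for the line `y = slope * x` and made-up data. The learning rate and step count are illustrative choices, not recommendations.

```python
def gradient(slope, xs, ys):
    """Derivative of the mean squared error of y = slope * x wrt the slope."""
    return sum(2 * (slope * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, start=0.0, learning_rate=0.05, steps=200):
    """Repeatedly take a small step against the gradient (downhill)."""
    slope = start
    for _ in range(steps):
        slope -= learning_rate * gradient(slope, xs, ys)
    return slope


# Hypothetical data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

fitted = gradient_descent(xs, ys)
print(fitted)  # very close to 2.0
```

If the learning rate is too large the steps overshoot the valley and the slope diverges, which is part of the motivation for adaptive methods.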

BACKPROPAGATION (4) MOTIVATION FOR BACKPROPAGATION.
▸ Go back to that model.
▸ Computing every derivative from scratch is inefficient. We are calculating the same values over and over, and this is a model of only 3 layers. Imagine 60 layers.

BACKPROPAGATION (5) THIS WORKS FOR ARBITRARY MODELS.
▸ Super robust. Even for a complex model like a convolutional neural net, just do a forward pass to get outputs, then a backward pass to get derivatives with respect to each layer's outputs, use the layers to get derivatives with respect to the parameters, and optimize.
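
The forward-then-backward recipe can be sketched on a tiny scalar model: linear, sigmoid, linear, squared-error cost. The model, the numbers, and the function names are assumptions made for illustration. The forward pass stores its intermediate values so the backward pass can reuse them instead of recomputing them.

```python
import math


def forward(x, w1, w2, target):
    """Forward pass: keep every intermediate value for the backward pass."""
    z = w1 * x                      # first linear layer
    h = 1.0 / (1.0 + math.exp(-z))  # sigmoid activation
    y = w2 * h                      # second linear layer
    cost = (y - target) ** 2        # squared-error cost
    return {"x": x, "h": h, "y": y, "cost": cost}


def backward(cache, w2, target):
    """Backward pass: apply the chain rule once, from the cost back to the input."""
    d_y = 2.0 * (cache["y"] - target)            # d cost / d y
    d_w2 = d_y * cache["h"]                      # parameter gradient, layer 2
    d_h = d_y * w2                               # passed back to the sigmoid
    d_z = d_h * cache["h"] * (1.0 - cache["h"])  # through the sigmoid derivative
    d_w1 = d_z * cache["x"]                      # parameter gradient, layer 1
    return d_w1, d_w2


x, w1, w2, target = 0.5, 0.8, -1.2, 1.0
cache = forward(x, w1, w2, target)
d_w1, d_w2 = backward(cache, w2, target)

# Sanity check against a finite-difference estimate of d cost / d w1.
eps = 1e-6
numeric_d_w1 = (forward(x, w1 + eps, w2, target)["cost"]
                - forward(x, w1 - eps, w2, target)["cost"]) / (2 * eps)
print(abs(d_w1 - numeric_d_w1) < 1e-6)  # True
```

Each intermediate derivative (`d_y`, `d_h`, `d_z`) is computed once and shared by everything upstream of it, which is what keeps the cost of the backward pass comparable to the forward pass even for deep models.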

CONVOLUTIONAL NEURAL NETWORKS STATE-OF-THE-ART OBJECT DETECTION

CONVOLUTIONAL NETWORK (1)
▸ Revolutionized object and speech recognition.
▸ Can contain 1000s of layers and millions of parameters.
▸ Contains two new types of layers: convolution and pooling.

CONVOLUTION AND POOLING. IMAGE CALCULATIONS.

CONVOLUTIONAL NETWORK (2)
▸ Convolution is like sliding a filter over an image and computing a linear combination at each position.
▸ Do this over an entire image and you get an image back.
▸ The parameters here are the filters themselves, and the point of the model is to learn these filters.
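
The sliding-filter idea can be sketched with nested loops over plain lists. This version assumes no padding and a stride of 1, and the image and filter values are made up. In a real network the filter entries would be learned rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output pixel is a linear combination of the image patch under the
    kernel. The kernel is applied as-is, without flipping.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # responds where brightness increases from left to right

edges = convolve2d(image, kernel)
print(edges)  # [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
```

The output is itself a small image that lights up only where the edge is, which is the sense in which a filter detects a feature.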

CONVOLUTIONAL NETWORK (3) POOLING.
▸ Pooling is even easier. If I use a lot of filters, then after a convolution layer I end up with a lot of images.
▸ To be more memory efficient, we can keep just the largest value in each small region of a filter's output.
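
A minimal sketch of max pooling, assuming non-overlapping 2x2 windows (a common choice, though the slides do not specify one) and a made-up feature map.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 output of one filter.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

print(max_pool2d(feature_map))  # [[4, 2], [2, 8]]
```

The 4x4 map shrinks to 2x2, so each pooling layer with this window size cuts the number of stored values by a factor of four.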

TORCH TUTORIAL QUICK OVERVIEW OF TORCH.

TORCH TUTORIAL (1) THE TENSOR.
▸ A tensor is an N-dimensional array.
▸ Just as vectorized calculations in Python are faster than loops, everything in Torch is a tensor calculation.
▸ This is really useful for big data!

TORCH TUTORIAL (2) NN IS TO TORCH AS NUMPY IS TO PYTHON.
▸ NN is a library that makes designing neural networks really easy. You can define models, define a cost, and optimize with a few calls.
▸ It’s built with layers too. Like Legos.
[Diagram labels: INPUT, OUTPUT, LINEAR, LINEAR, NONLINEAR]
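
The Lego-style stacking can be sketched in plain Python. This is not Torch's nn package: the `Sequential`, `Linear`, and `Tanh` classes below are small stand-ins written for this example, with made-up weights, to show how a container simply passes each layer's output to the next.

```python
import math


class Linear:
    """Fully connected layer on plain lists: output = W x + b."""

    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weights, self.bias)]


class Tanh:
    """Elementwise nonlinearity."""

    def forward(self, x):
        return [math.tanh(v) for v in x]


class Sequential:
    """Container that feeds each layer's output into the next layer."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)
        return self  # allow chained calls

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x


# Hypothetical weights; the stack is linear -> nonlinear -> linear.
model = Sequential()
model.add(Linear([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]))
model.add(Tanh())
model.add(Linear([[2.0, 1.0]], [0.1]))

print(model.forward([1.0, 1.0]))  # a one-element list
```

Swapping `Tanh` for another nonlinearity, or adding another `Linear`, changes one line and leaves the rest of the model untouched.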

TORCH TUTORIAL (3) LSTM IN 33 LINES.

WORKSHOP DEMO HTTP://WWW.MIKEWUIS.ME/DEEP.HTML

FUTURE DIRECTION WHAT COMES NEXT?

FUTURE DIRECTION (1) DEEP LEARNING IS NOT THE END. IT’S A STEP.
▸ As great as deep learning is, there are big and noticeable drawbacks. What can we do?
▸ Combining with Bayesian learning.
▸ Automating hyperparameter tuning.
▸ Biological models?
▸ More industry applications.

BIG THANKS! NANDO DE FREITAS, ANDREJ KARPATHY, SUSAN WEST, JASON BROOKS

ASK ME QUESTIONS! EMAIL? [email protected]
