DCGAN - How does it work?

Google confidential | Do not distribute
DCGAN - How does it work?
Etsuji Nakai, Cloud Solutions Architect at Google
2016/09/26 ver1.1
GIF Animation: https://goo.gl/zXL1bV

$ who am i
▪ Etsuji Nakai
• Cloud Solutions Architect at Google
• Twitter @enakai00

What is DCGAN?

What is DCGAN?
▪ DCGAN: Deep Convolutional Generative Adversarial Networks
• It works in the opposite direction of an image classifier (CNN).
• A CNN transforms an image into a class label (a list of probabilities).
• DCGAN generates an image from random parameters.
(Figure: a CNN maps an image to the probabilities of each entry (deer, dog, cat, human, ...), e.g. (0.01, 0.05, 0.91, 0.02, ...). DCGAN maps random parameters such as (0.01, 0.05, 0.91, 0.02, ...) to an image. What do these numbers mean?)

Examples of Convolutional Filters
▪ Convolutional filters are just the image filters you sometimes apply in Photoshop!
(Figure: a filter to blur images; a filter to extract vertical edges.)
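The two filters named on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the slide: the 3x3 filter values and the toy 6x6 image are assumptions chosen for readability, and `apply_filter` is a hypothetical helper that slides the filter over the image without padding (the cross-correlation form commonly used in CNNs).

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the filtered image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch by the filter values and sum the results.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Assumed filter values, for illustration only.
blur = np.full((3, 3), 1.0 / 9.0)             # average of the 3x3 neighborhood
vertical_edge = np.array([[1.0, 0.0, -1.0],
                          [1.0, 0.0, -1.0],
                          [1.0, 0.0, -1.0]])  # responds to left/right contrast

# Toy image: dark left half, bright right half (a vertical edge in the middle).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

blurred = apply_filter(image, blur)
edges = apply_filter(image, vertical_edge)
print(edges[0])    # nonzero only around the edge
print(blurred[0])  # the sharp step becomes a gradual ramp
```

The edge filter outputs zero over flat regions and a nonzero value where the brightness changes from left to right, while the blur filter smooths the step into a ramp.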

Convolutional Filters in CNN
▪ A CNN applies a lot of filters to extract various features from a single image.
▪ A CNN applies multi-layered filters to a single image (to extract features of features?).
▪ A filtered image is made smaller to drop unnecessary details.
(Figure: extracting vertical and horizontal edges using two filters.)

Convolutional Filters in CNN
▪ This shows how filters are applied to a multi-layered image.
• Apply independent filters to each layer.
• Sum up the resulting images from each layer.
(Figure: an input image is passed through Filter A and Filter B to produce Output image A and Output image B.)
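The "filter each layer independently, then sum" rule can be written out as follows. This is a sketch under assumptions: the array layout (height, width, layers), the random values, and the helper name `conv_multi` are all hypothetical, and no padding or stride is used.

```python
import numpy as np

def conv_multi(image, filters):
    """image: (H, W, C_in); filters: (k, k, C_in, C_out).

    Returns an array of shape (H-k+1, W-k+1, C_out).
    """
    k = filters.shape[0]
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    out = np.zeros((out_h, out_w, filters.shape[3]))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + k, j:j + k, :]  # shape (k, k, C_in)
            # Multiply each layer by its own filter, then sum over
            # pixels and over layers to get one value per output image.
            out[i, j, :] = np.tensordot(patch, filters,
                                        axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8, 3))       # e.g. an RGB image with 3 layers
filters = rng.normal(size=(3, 3, 3, 2))  # two filter sets: A and B
output = conv_multi(image, filters)      # two output images: A and B
print(output.shape)
```

Each output image has its own filter for every input layer, so the number of output images is set by the number of filter sets, not by the number of input layers.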

Typical CNN Filtering Layers
▪ Starting from a single RGB image on the right, multiple filtering layers are applied to produce smaller (and more) images.
(Figure, from http://arxiv.org/abs/1511.06434: RGB layers of a single 64x64 image → 128 layers of 32x32 images → 256 layers of 16x16 images → ・・・ → a list of probabilities.)

Image Generation Flow of DCGAN
▪ Basically, it's just flipping the direction. No magic!
(Figure, from http://arxiv.org/abs/1511.06434: a list of random numbers → 1024 layers of 4x4 images → 512 layers of 8x8 images → ・・・ → RGB layers of a single 64x64 image.)
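The halving and doubling of image sizes in these two flows follows from standard convolution size arithmetic. The sketch below assumes a 4x4 filter, stride 2, and padding 1. These hyperparameters are not stated on the slides and are picked only because they reproduce the 64 → 32 → 16 → 8 → 4 sequence.

```python
def conv_out_size(n, kernel=4, stride=2, padding=1):
    """Output width of a strided convolution on an n-pixel-wide image."""
    return (n + 2 * padding - kernel) // stride + 1

def transposed_conv_out_size(n, kernel=4, stride=2, padding=1):
    """Output width of the matching transposed convolution."""
    return (n - 1) * stride - 2 * padding + kernel

# CNN direction: 64x64 image shrinks layer by layer.
cnn_sizes = [64]
while cnn_sizes[-1] > 4:
    cnn_sizes.append(conv_out_size(cnn_sizes[-1]))

# DCGAN direction: 4x4 images grow layer by layer.
dcgan_sizes = [4]
while dcgan_sizes[-1] < 64:
    dcgan_sizes.append(transposed_conv_out_size(dcgan_sizes[-1]))

print(cnn_sizes)    # [64, 32, 16, 8, 4]
print(dcgan_sizes)  # [4, 8, 16, 32, 64]
```

With the same kernel, stride, and padding, the transposed convolution exactly undoes the size change of the convolution, which is the sense in which the generator "flips the direction".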

Illustration of Convolution Operations
▪ Convolutional filters in CNN and transposed-convolutional filters in DCGAN work in opposite directions. Here's a good illustration of how they work: http://deeplearning.net/software/theano_versions/dev/tutorial/conv_arithmetic.html
• Convolution: (Up to) 3x3 blue pixels contribute to generate a single green pixel. Each of the 3x3 blue pixels is multiplied by the corresponding filter value, and the results from different blue pixels are summed up to be a single green pixel.
• Transposed-convolution: A single green pixel contributes to generate (up to) 3x3 blue pixels. Each green pixel is multiplied by each of the 3x3 filter values, and the results from different green pixels are summed up to be a single blue pixel.
GIF Animation: https://goo.gl/tAY4BL
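The gather/scatter description above can be checked with a small NumPy sketch. It assumes the simplest case (square images, stride 1, no padding), and the function names are hypothetical. The last lines verify that the two operations are transposes of each other in the linear-algebra sense, which is where the name "transposed convolution" comes from.

```python
import numpy as np

def conv2d(blue, kernel):
    """Gather: each green pixel is a weighted sum of a 3x3 patch of blue pixels."""
    k = kernel.shape[0]
    n = blue.shape[0] - k + 1
    green = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            green[i, j] = np.sum(blue[i:i + k, j:j + k] * kernel)
    return green

def transposed_conv2d(green, kernel):
    """Scatter: each green pixel adds a scaled copy of the filter to 3x3 blue pixels."""
    k = kernel.shape[0]
    n = green.shape[0] + k - 1
    blue = np.zeros((n, n))
    for i in range(green.shape[0]):
        for j in range(green.shape[1]):
            blue[i:i + k, j:j + k] += green[i, j] * kernel
    return blue

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))
blue = rng.normal(size=(5, 5))
green = rng.normal(size=(3, 3))

# Transpose check: <conv(blue), green> equals <blue, conv_T(green)>.
lhs = np.sum(conv2d(blue, kernel) * green)
rhs = np.sum(blue * transposed_conv2d(green, kernel))
print(np.isclose(lhs, rhs))
```

Convolution maps a 5x5 image to 3x3, and the transposed convolution maps 3x3 back to 5x5. It restores the shape, not the original pixel values.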

Training Strategy of DCGAN
▪ We train two models simultaneously.
• CNN: Classifies images as authentic or fake.
• "Authentic" images are provided to the CNN as training data.
• DCGAN: Trained to generate images that the CNN classifies as authentic.
• By trying to fool the CNN, DCGAN learns to generate images similar to the training data.
(Figure: the CNN receives training data and DCGAN's generated images, and says "It's a fake!")

Training Loop of DCGAN
▪ By repeating this loop, the CNN becomes more accurate and DCGAN becomes more crafty.
• DCGAN generates image A from random numbers; image B comes from the training data.
• P(A): Probability that A is authentic. P(B): Probability that B is authentic.
• DCGAN: Modify parameters such that P(A) becomes large.
• CNN: Modify parameters such that P(A) becomes small and P(B) becomes large.
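The two "modify parameters" rules can be expressed as loss functions to minimize. The slides do not state which loss is used, so the cross-entropy form below is an assumption (it is one common choice for GANs), and the function names are hypothetical.

```python
import numpy as np

def cnn_loss(p_a, p_b):
    """Small when P(A) is small and P(B) is large (the CNN's goal)."""
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    return float(np.mean(-np.log(p_b) - np.log(1.0 - p_a)))

def dcgan_loss(p_a):
    """Small when P(A) is large (DCGAN's goal: fool the CNN)."""
    p_a = np.asarray(p_a, dtype=float)
    return float(np.mean(-np.log(p_a)))

# A CNN that is fooled by fakes versus one that is not.
print(cnn_loss(p_a=[0.9], p_b=[0.5]), ">", cnn_loss(p_a=[0.1], p_b=[0.9]))
# A generator whose images are rejected versus accepted.
print(dcgan_loss([0.1]), ">", dcgan_loss([0.9]))
```

In each iteration of the loop, the CNN's parameters would be nudged to lower `cnn_loss` and DCGAN's parameters to lower `dcgan_loss`, so the two models pull P(A) in opposite directions.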

Demo https://goo.gl/D8RBGm

Model
▪ Training data: MNIST (28x28 pixels, grayscale images)
▪ DCGAN: Generates a single 28x28 image from 64 parameters.
• 64 parameters → 128 x (7x7) → 64 x (14x14) → 1 x (28x28)
▪ CNN: Calculates the probability that a single 28x28 image is authentic.
• 1 x (28x28) → 64 x (14x14) → 128 x (7x7) → Probability of authentic image
▪ Batch size: 32
• Modify filter parameters using 32 generated images and 32 MNIST images at a time.
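To make the sizes concrete, the sketch below counts the values held at each layer and assembles one training batch for the CNN. Random arrays stand in for real DCGAN output and MNIST images, and the uniform distribution for the 64 input parameters is an assumption, since the slides only say "random".

```python
import numpy as np

batch_size = 32
latent_dim = 64

# Layer shapes from the slide, written as (layers, height, width).
dcgan_shapes = [(128, 7, 7), (64, 14, 14), (1, 28, 28)]
cnn_shapes = dcgan_shapes[::-1]  # the CNN runs through the same shapes in reverse

for shape in dcgan_shapes:
    print(shape, "->", int(np.prod(shape)), "values per image")

rng = np.random.default_rng(0)
# One row of 64 input parameters per generated image (assumed distribution).
z = rng.uniform(-1.0, 1.0, size=(batch_size, latent_dim))
generated = rng.uniform(size=(batch_size, 1, 28, 28))  # stand-in for DCGAN output
authentic = rng.uniform(size=(batch_size, 1, 28, 28))  # stand-in for MNIST images

# The CNN sees 32 generated images (label 0) and 32 MNIST images (label 1).
images = np.concatenate([generated, authentic])
labels = np.concatenate([np.zeros(batch_size), np.ones(batch_size)])
print(images.shape, labels.shape)
```

Note that the first DCGAN step expands 64 numbers into 128 x 7 x 7 = 6272 values, so most of the image detail has to come from the learned filters rather than from the input.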

Learning Process
▪ This shows the evolution of images generated from the same input parameters during the training loop. (DCGAN's filters are initialized with random values.)

Playing with Input Parameters
▪ If we change the input parameters, the shape of the generated image changes too. By making small, continuous changes to the input, we can achieve a morphing effect.
▪ Since the input parameters form a point in a 64-dimensional space, we can draw a straight line between two points. The end points represent the images before and after morphing.
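The straight line between two input points can be computed as below. This is a sketch only: `morph_path` is a hypothetical helper, the end points are random stand-ins, and feeding each row to a trained DCGAN (not shown) would produce one frame of the morphing animation.

```python
import numpy as np

def morph_path(z_start, z_end, steps):
    """Return `steps` evenly spaced points on the line from z_start to z_end."""
    t = np.linspace(0.0, 1.0, steps)[:, None]  # column of mixing weights
    return (1.0 - t) * z_start + t * z_end

rng = np.random.default_rng(0)
z_start = rng.uniform(-1.0, 1.0, size=64)  # input for the "before" image
z_end = rng.uniform(-1.0, 1.0, size=64)    # input for the "after" image

path = morph_path(z_start, z_end, steps=10)
print(path.shape)  # one 64-dimensional input per animation frame
```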

Playing with Input Parameters
▪ Using a more complicated closed loop in the parameter space, we can even make a dancing image :)
▪ The sample image on this page is generated from a trajectory over a sphere (embedded in the 64-dimensional space).
GIF Animation: https://goo.gl/zXL1bV
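One simple closed loop on a sphere is a great circle. The slides do not say which trajectory was used for the sample animation, so the construction below is an assumption: two random orthonormal directions span a plane, and the loop is the circle of the given radius in that plane.

```python
import numpy as np

def circle_on_sphere(dim=64, radius=1.0, steps=60, seed=0):
    """Points on a great circle of a sphere embedded in `dim` dimensions."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=dim)
    u /= np.linalg.norm(u)
    v = rng.normal(size=dim)
    v -= np.dot(v, u) * u  # make v orthogonal to u (Gram-Schmidt step)
    v /= np.linalg.norm(v)
    # endpoint=False: the frame after the last one is the first one again.
    theta = np.linspace(0.0, 2.0 * np.pi, steps, endpoint=False)
    return radius * (np.outer(np.cos(theta), u) + np.outer(np.sin(theta), v))

loop = circle_on_sphere()
print(loop.shape)                         # (60, 64)
print(np.linalg.norm(loop, axis=1)[:3])   # every point lies on the sphere
```

Because the loop closes on itself, playing the generated frames repeatedly gives a seamless animation.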

Interpretation of Input Parameters
▪ In the DCGAN paper (http://arxiv.org/abs/1511.06434), it is suggested that the input parameters could have a semantic structure, as in the following example.
(Figure: a diagram with Neutral/Smile and Man/Woman axes, placing Neutral Woman, Smiling Woman, Neutral Man, and Smiling Man.)
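The paper's example is vector arithmetic on the inputs: smiling woman minus neutral woman plus neutral man gives an input that generates a smiling man. The toy sketch below shows why this would work if attributes corresponded to directions in the input space. That additive structure is an idealized assumption, and the direction vectors are random stand-ins, not learned values.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64
# Hypothetical attribute directions in the 64-dimensional input space.
smile, woman, man = rng.normal(size=(3, dim))

# Idealized inputs: each image's input is a sum of its attribute directions.
smiling_woman = woman + smile
neutral_woman = woman
neutral_man = man

# Removing "woman" and adding "man" leaves the smile attached to "man".
z = smiling_woman - neutral_woman + neutral_man
expected_smiling_man = man + smile
print(np.allclose(z, expected_smiling_man))
```

In a real model the structure is only approximate, so the result would be expected to land near, not exactly on, an input that produces a smiling man.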

Thank you!
