DCGAN - How does it work?

Etsuji Nakai

April 06, 2022

Transcript

  1. Google confidential | Do not distribute
    DCGAN How does it work?
    Etsuji Nakai
    Cloud Solutions Architect at Google
    2016/09/26 ver1.1
    GIF Animation
    https://goo.gl/zXL1bV

  2. $ who am i
    ▪Etsuji Nakai
    Cloud Solutions Architect at Google
    Twitter @enakai00
    Now on Sale!

  3. What is DCGAN?

  4. What is DCGAN?
    ▪ DCGAN: Deep Convolutional Generative Adversarial Networks
    ● It works in the opposite direction of the image classifier (CNN).
    ● CNN transforms an image to a class label (list of probabilities).
    ● DCGAN generates an image from random parameters.
    [Figure: CNN maps an image to a list of probabilities, one per entry:
    (0.01, 0.05, 0.91, 0.02, ...) for (deer, dog, cat, human, ...).
    DCGAN maps random parameters (0.01, 0.05, 0.91, 0.02, ...) to an image.
    What do these numbers mean?]

  5. Examples of Convolutional Filters
    ▪ Convolutional filters are just the kind of image filters you sometimes apply in Photoshop!
    Examples: a filter to blur images, and a filter to extract vertical edges.
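
To make the two filters on this slide concrete, here is a minimal pure-Python sketch. The function name `apply_filter`, the kernel values, and the toy image are assumptions chosen for illustration; the slide itself does not give any numbers.

```python
def apply_filter(image, kernel):
    """Slide a square kernel over a 2D image (list of rows), without padding."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Assumed 3x3 box blur: each output pixel is the mean of its neighborhood.
BLUR = [[1 / 9] * 3 for _ in range(3)]

# Assumed vertical-edge filter: responds where brightness changes left to right.
VERTICAL_EDGE = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# Toy 5x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

edges = apply_filter(image, VERTICAL_EDGE)    # nonzero only around the boundary
blurred = apply_filter(image, BLUR)           # boundary becomes a gradual ramp
```

The edge filter returns zero in the flat regions and a positive value only where the dark and bright halves meet, while the blur filter turns the sharp boundary into intermediate values.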

  6. Convolutional Filters in CNN
    ▪ CNN applies a lot of filters to extract various features from a single image.
    ▪ CNN applies multi-layered filters to a single image (to extract features of
    features?)
    ▪ Each filtered image is shrunk to drop unnecessary details.
    Extracting vertical and horizontal edges using two filters.

  7. Convolutional Filters in CNN
    ▪ This shows how filters are applied to a multi-layered image.
    [Figure: Filter A turns the input image into output image A; Filter B turns
    the same input image into output image B.]
    ● Apply independent filters to each layer.
    ● Sum up the resulting images from each layer.
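
The two steps on this slide (filter each layer independently, then sum) can be sketched as follows. The helper names, kernel values, and the constant-valued input layers are assumptions for illustration only.

```python
def conv2d_single(layer, kernel):
    """Filter one 2D layer with one square kernel (no padding, stride 1)."""
    k = len(kernel)
    h, w = len(layer) - k + 1, len(layer[0]) - k + 1
    return [
        [sum(layer[r + i][c + j] * kernel[i][j]
             for i in range(k) for j in range(k))
         for c in range(w)]
        for r in range(h)
    ]

def conv2d_multi(layers, kernels):
    """Produce ONE output image from a multi-layered input.

    Each input layer is filtered with its own kernel, then the
    resulting images are summed pixel by pixel.
    """
    filtered = [conv2d_single(l, k) for l, k in zip(layers, kernels)]
    h, w = len(filtered[0]), len(filtered[0][0])
    return [[sum(f[r][c] for f in filtered) for c in range(w)]
            for r in range(h)]

# Hypothetical input: three 4x4 layers filled with 1, 2 and 3.
layers = [[[v] * 4 for _ in range(4)] for v in (1, 2, 3)]

# "Filter A" and "Filter B" each hold one 2x2 kernel per input layer.
filter_a = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [0, 1]]]
filter_b = [[[1, 1], [1, 1]]] * 3

output_a = conv2d_multi(layers, filter_a)   # every pixel is 1 + 2 + 3
output_b = conv2d_multi(layers, filter_b)   # every pixel is 4*1 + 4*2 + 4*3
```

Each filter therefore yields a single output image no matter how many layers the input has, which is why the number of output layers equals the number of filters.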

  8. Typical CNN Filtering Layers
    http://arxiv.org/abs/1511.06434
    [Figure: RGB layers of a single 64x64 image → 128 layers of 32x32 images →
    256 layers of 16x16 images → ・・・ → a list of probabilities.]
    ▪ Starting from a single RGB image on the right, multiple filtering layers are applied
    to produce smaller (and more) images.

  9. Image Generation Flow of DCGAN
    http://arxiv.org/abs/1511.06434
    [Figure: a list of random numbers → 1024 layers of 4x4 images →
    512 layers of 8x8 images → ・・・ → RGB layers of a single 64x64 image.]
    ▪ Basically, it's just flipping the direction. No magic!
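
The "flipping" can be checked with simple size arithmetic. The sketch below assumes a 4x4 kernel, stride 2, and padding 1 for every layer; these values are not stated on the slides and were picked only because they halve or double the image size exactly.

```python
def conv_size(n, kernel=4, stride=2, padding=1):
    """Output width/height of a strided convolution (assumed hyperparameters)."""
    return (n + 2 * padding - kernel) // stride + 1

def transposed_conv_size(n, kernel=4, stride=2, padding=1):
    """Output width/height of the matching transposed convolution."""
    return (n - 1) * stride - 2 * padding + kernel

def trace(size, step, stages):
    """Apply `step` repeatedly and record the image size after each stage."""
    sizes = [size]
    for _ in range(stages):
        sizes.append(step(sizes[-1]))
    return sizes

cnn_sizes = trace(64, conv_size, 4)              # 64 -> 32 -> 16 -> 8 -> 4
dcgan_sizes = trace(4, transposed_conv_size, 4)  # 4 -> 8 -> 16 -> 32 -> 64
```

Under these assumptions the generator's size sequence is exactly the classifier's sequence read backwards.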

  10. Illustration of Convolution Operations
    ▪ Convolutional filters in CNN and transposed-convolutional filters in DCGAN work
    in opposite directions. Here's a good illustration of how they work.
    http://deeplearning.net/software/theano_versions/dev/tutorial/conv_arithmetic.html
    Convolution:
    (Up to) 3x3 blue pixels contribute to generating a single green pixel.
    Each of the 3x3 blue pixels is multiplied by the corresponding filter value,
    and the results from the different blue pixels are summed up into a single
    green pixel.
    Transposed-convolution:
    A single green pixel contributes to generating (up to) 3x3 blue pixels.
    Each green pixel is multiplied by each of the 3x3 filter values, and the
    results from different green pixels are summed up into a single blue pixel.
    GIF Animation
    https://goo.gl/tAY4BL
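
The two descriptions above can be written as a "gather" loop and a "scatter" loop. This is a minimal sketch assuming stride 1 and no padding; the function names and the kernel values are made up for illustration.

```python
def convolve(blue, kernel):
    """Gather: each green pixel sums a k x k window of blue pixels."""
    k = len(kernel)
    h, w = len(blue) - k + 1, len(blue[0]) - k + 1
    green = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for i in range(k):
                for j in range(k):
                    green[r][c] += blue[r + i][c + j] * kernel[i][j]
    return green

def transposed_convolve(green, kernel):
    """Scatter: each green pixel adds a scaled copy of the kernel to blue."""
    k = len(kernel)
    h, w = len(green) + k - 1, len(green[0]) + k - 1
    blue = [[0] * w for _ in range(h)]
    for r in range(len(green)):
        for c in range(len(green[0])):
            for i in range(k):
                for j in range(k):
                    blue[r + i][c + j] += green[r][c] * kernel[i][j]
    return blue

kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]       # hypothetical filter values
blue = [[1] * 4 for _ in range(4)]

green = convolve(blue, kernel)                   # 4x4 -> 2x2
upsampled = transposed_convolve(green, kernel)   # 2x2 -> 4x4
stamp = transposed_convolve([[2]], kernel)       # one pixel -> scaled 3x3 kernel
```

Note that the transposed convolution restores the shape of the input, not its values: corner pixels of `upsampled` receive a contribution from one green pixel only, while interior pixels receive contributions from several.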

  11. Training Strategy of DCGAN
    ▪ We train two models simultaneously.
    ● CNN: Classifies images as authentic or fake.
    ● "Authentic" images are provided as training data to CNN.
    ● DCGAN: Trained to generate images classified as authentic by CNN.
    ● By trying to fool CNN, DCGAN learns to generate images similar to the training data.
    [Figure: CNN looks at the training data and at an image from DCGAN, and says
    "It's a fake!"]

  12. Training Loop of DCGAN
    ▪ By repeating this loop, CNN becomes more accurate and DCGAN becomes more crafty.
    ● DCGAN generates image A from random numbers; image B is taken from the training data.
    ● CNN calculates P(A), the probability that A is authentic, and P(B), the
    probability that B is authentic.
    ● DCGAN: Modify parameters such that P(A) becomes large.
    ● CNN: Modify parameters such that P(A) becomes small and P(B) becomes large.
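
One common way to turn these two goals into quantities to minimize is a cross-entropy loss. The slide only states the direction in which P(A) and P(B) should move, so the specific formulas and function names below are an assumption, not necessarily what the demo uses.

```python
import math

def cnn_loss(p_a, p_b):
    """Small when P(A) is small and P(B) is large (CNN's goal)."""
    return -math.log(p_b) - math.log(1.0 - p_a)

def dcgan_loss(p_a):
    """Small when P(A) is large (DCGAN's goal)."""
    return -math.log(p_a)

# Hypothetical probabilities before and after CNN improves.
confused_cnn = cnn_loss(p_a=0.5, p_b=0.5)
accurate_cnn = cnn_loss(p_a=0.1, p_b=0.9)

# Hypothetical probabilities before and after DCGAN improves.
clumsy_dcgan = dcgan_loss(p_a=0.1)
crafty_dcgan = dcgan_loss(p_a=0.9)
```

In each training step, CNN's parameters would be nudged to reduce `cnn_loss` and DCGAN's parameters to reduce `dcgan_loss`; the two objectives pull P(A) in opposite directions, which is what makes the training adversarial.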

  13. Demo
    https://goo.gl/D8RBGm

  14. Model
    ▪ Training data : MNIST (28x28 pixels, grayscale images)
    ▪ DCGAN : Generate a single 28x28 image from 64 parameters.
    ● → 128 x (7x7) → 64 x (14x14) → 1 x (28x28)
    ▪ CNN : Calculate a probability that a single 28x28 image is authentic.
    ● 1 x (28x28) → 64 x (14x14) → 128 x (7x7) → Probability of authentic image
    ▪ Batch size : 32
    ● Modify filter parameters using 32 generated images and 32 MNIST images at a
    time.
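
As a quick sanity check of the shapes listed above, the following sketch counts the values held at each stage. Only the stage shapes and the batch size come from the slide; the variable and function names are illustrative.

```python
# (layers, height, width) at each stage, as listed on the slide.
dcgan_stages = [(128, 7, 7), (64, 14, 14), (1, 28, 28)]
cnn_stages = [(1, 28, 28), (64, 14, 14), (128, 7, 7)]

def value_count(stage):
    """Number of pixel values held in one stage."""
    layers, height, width = stage
    return layers * height * width

dcgan_counts = [value_count(s) for s in dcgan_stages]

INPUT_PARAMETERS = 64
BATCH_SIZE = 32
images_per_update = BATCH_SIZE * 2   # 32 generated + 32 MNIST images
```

The 64 input parameters are first expanded to 6272 values, grow to 12544, and are then reduced to the 784 pixels of the final image; CNN walks through the same shapes in reverse order.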

  15. Learning Process
    ▪ This shows the evolution of images
    generated from the same input parameters
    during the training loop. (DCGAN's filters are
    initialized with random values.)

  16. Playing with Input Parameters
    ▪ If we change the input parameters, the shape of the generated image changes too. By
    making small, continuous changes to the input, we can achieve a morphing effect.
    ▪ Since the input is a point in the 64-dimensional space, we can draw a
    straight line between two points. The end points represent the images before and
    after morphing.
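
The straight line between two inputs can be sketched as a linear interpolation. The sampling range of the random end points and the number of steps are assumptions; feeding each point to the trained DCGAN (not shown here) would yield one frame of the morphing sequence.

```python
import random

DIM = 64  # number of input parameters, as on the "Model" slide

def interpolate(start, end, steps):
    """Return steps + 1 evenly spaced points from start to end, inclusive."""
    return [
        [s + (e - s) * t / steps for s, e in zip(start, end)]
        for t in range(steps + 1)
    ]

rng = random.Random(0)
# Assumed input distribution: uniform in [-1, 1].
z_start = [rng.uniform(-1, 1) for _ in range(DIM)]
z_end = [rng.uniform(-1, 1) for _ in range(DIM)]

path = interpolate(z_start, z_end, steps=10)   # 11 input points, one per frame
```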

  17. Playing with Input Parameters
    ▪ Using a more complicated closed loop in the parameter space, we can even make a
    dancing image :)
    ▪ The sample image on this page is generated from a trajectory over a sphere
    (embedded in the 64-dimensional space).
    GIF Animation
    https://goo.gl/zXL1bV
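
The simplest closed loop on a sphere is a great circle, sketched below. The actual trajectory used for the animation is not described on the slide, so the great circle, the radius, and the frame count are all assumptions.

```python
import math
import random

DIM = 64

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def great_circle(u, v, radius, frames):
    """Points on the circle of the given radius in the plane spanned by u and v."""
    u = normalize(u)
    # Gram-Schmidt step: remove the component of v along u.
    dot = sum(a * b for a, b in zip(u, v))
    v = normalize([b - dot * a for a, b in zip(u, v)])
    points = []
    for f in range(frames):
        theta = 2 * math.pi * f / frames
        points.append([radius * (math.cos(theta) * a + math.sin(theta) * b)
                       for a, b in zip(u, v)])
    return points

rng = random.Random(0)
u = [rng.gauss(0, 1) for _ in range(DIM)]
v = [rng.gauss(0, 1) for _ in range(DIM)]

loop = great_circle(u, v, radius=1.0, frames=60)
```

Because the angle after the last frame wraps around to the first one, playing the generated frames repeatedly gives a seamless animation.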

  18. Interpretation of Input Parameters
    ▪ In the DCGAN paper, it is suggested that the input parameters could have a
    semantic structure, as in the following example.
    [Figure: a plane with a Neutral → Smile axis and a Woman → Man axis. Its
    corners are Neutral Woman, Smiling Woman, Smiling Man, and Neutral Man.]
    http://arxiv.org/abs/1511.06434
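
If the inputs do have such a structure, moving along the "smile" direction is plain vector arithmetic. The 4-dimensional toy vectors below are hypothetical values chosen so the result can be checked by hand; real inputs would have 64 or more dimensions and would not be this tidy.

```python
def add_smile(smiling_woman, neutral_woman, neutral_man):
    """Smiling Woman - Neutral Woman + Neutral Man, element by element."""
    return [sw - nw + nm
            for sw, nw, nm in zip(smiling_woman, neutral_woman, neutral_man)]

# Hypothetical inputs: component 0 encodes "smile", component 1 encodes "man".
neutral_woman = [0, 0, 2, 1]
smiling_woman = [1, 0, 2, 1]
neutral_man = [0, 1, 2, 1]

smiling_man = add_smile(smiling_woman, neutral_woman, neutral_man)
```

The difference `smiling_woman - neutral_woman` isolates the "smile" direction, and adding it to `neutral_man` gives an input that, under this assumption, would generate a smiling man.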

  19. Thank you!
