
Mastering Sketching: Adversarial Augmentation for Structured Prediction


We present an integral framework for training sketch simplification networks that convert challenging rough sketches into clean line drawings. Our approach augments a simplification network with a discriminator network, training both networks jointly: the discriminator learns to discern whether a line drawing is real training data or the output of the simplification network, which in turn tries to fool it. This approach not only encourages the output sketches to be more similar in appearance to the training sketches, but also allows training with additional unsupervised data. By training with additional rough sketches and line drawings that do not correspond to each other, we can improve the quality of the sketch simplification. Our models significantly outperform the state of the art in the sketch simplification task, and we show that we can also optimize for a single image, which improves accuracy at the cost of additional computation time. Using the same framework, it is possible to train the network to perform pencil drawing generation, which is not possible using the standard mean squared error loss. We validate our framework with two user tests, where our approach is preferred to the state of the art in sketch simplification 88.9% of the time.

Edgar Simo-Serra

August 15, 2018



Transcript

  1. Mastering Sketching: Adversarial Augmentation for Structured Prediction. Edgar Simo-Serra*, Satoshi Iizuka*, Hiroshi Ishikawa (*equal contribution). Wednesday, August 15, 2018, Waseda University
  2. Contributions • Semi-supervised framework for sketch simplification • Pencil generation results • Single-image optimization [Figure: Input / Simo-Serra+ 2016 / Ours. ©Eisaku Kubonouchi]
  3. Related Work 1. Sketch Simplification: 1.1 Progressive online modification 1.2 Stroke reduction 1.3 Stroke grouping [Image: Liu et al. 2015]
  4. Related Work 1. Sketch Simplification: 1.1 Progressive online modification 1.2 Stroke reduction 1.3 Stroke grouping 1.4 Vector input [Image: Liu et al. 2015]
  5. Related Work 1. Sketch Simplification: 1.1 Progressive online modification 1.2 Stroke reduction 1.3 Stroke grouping 1.4 Vector input 2. Vectorization: 2.1 Model fitting (Bezier, …) 2.2 Gradient-based approaches [Image: Noris et al. 2013]
  6. Related Work 1. Sketch Simplification: 1.1 Progressive online modification 1.2 Stroke reduction 1.3 Stroke grouping 1.4 Vector input 2. Vectorization: 2.1 Model fitting (Bezier, …) 2.2 Gradient-based approaches 2.3 Both require fairly clean input sketches [Image: Noris et al. 2013]
  7. Related work [Simo-Serra+ 2016] • 23-layer fully convolutional neural network • Encoder-decoder shape [Architecture diagram: down-convolutions, flat-convolutions, and up-convolutions with spatial reductions 2×2, 4×4, 8×8, then upsampling 4×4, 2×2]
  8. Related work [Simo-Serra+ 2016] • 23-layer fully convolutional neural network • Encoder-decoder shape • Dataset construction is critical • Expert knowledge is important [Figure: standard dataset creation vs. inverse dataset creation]
  9. Dataset Bias • Supervised dataset (rough sketch and line drawing pairs): ρ_{x,y} • Rough sketch dataset: ρ_x • Line drawing dataset: ρ_y [Figure: training pairs vs. rough sketches "in the wild"]
  10. Generative Adversarial Network (GAN) • D(·): maximize classification prediction max_D E_{y*∼ρ_y}[ log D(y*) ] + E_{z∼N(0,1)}[ log(1 − D(G(z))) ] (first term: real data; second term: random noise)
  11. Generative Adversarial Network (GAN) • D(·): maximize classification prediction • G(·): minimize to fool D(·) min_G E_{z∼N(0,1)}[ log(1 − D(G(z))) ]
  12. Generative Adversarial Network (GAN) • D(·): maximize classification prediction • G(·): minimize to fool D(·) • Alternate optimization min_G max_D E_{y*∼ρ_y}[ log D(y*) ] + E_{z∼N(0,1)}[ log(1 − D(G(z))) ]
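The minimax objective above can be illustrated numerically. The sketch below is a toy stand-in, not the paper's code: the discriminator outputs are given directly as probabilities rather than produced by a real network, and the function names are hypothetical.

```python
import numpy as np

def discriminator_objective(d_real, d_fake):
    """Batch estimate of  E[log D(y*)] + E[log(1 - D(G(z)))], which D maximizes."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

def generator_objective(d_fake):
    """Batch estimate of  E[log(1 - D(G(z)))], which G minimizes (fooling D)."""
    return np.mean(np.log(1.0 - d_fake))

# A confident discriminator (high on real data, low on fakes) scores higher
# than an uncertain one, which is what the max_D step rewards:
good_D = discriminator_objective(np.array([0.9, 0.8]), np.array([0.1, 0.2]))
bad_D = discriminator_objective(np.array([0.6, 0.5]), np.array([0.5, 0.4]))
assert good_D > bad_D
```

In the alternating optimization, one gradient step increases the first objective with respect to D's parameters, then one step decreases the second with respect to G's parameters.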
  13. Model • S(·): sketch simplification model, a 23-layer fully convolutional neural network [Simo-Serra+ 2016] • D(·): discriminator model, a 6-layer convolutional neural network [Architecture diagram: down-convolution, flat-convolution, up-convolution]
  14. Proposed framework min_S max_D E_{(x,y*)∼ρ_{x,y}}[ ‖S(x) − y*‖² + α log D(y*) + α log(1 − D(S(x))) ] where ‖S(x) − y*‖² is the standard (MSE) loss and the α-weighted terms are the adversarial loss [Figure: Input / Standard Loss / +Adversarial]
  15. Proposed framework min_S max_D E_{(x,y*)∼ρ_{x,y}}[ ‖S(x) − y*‖² + α log D(y*) + α log(1 − D(S(x))) ] + β E_{y∼ρ_y}[ log D(y) ] + β E_{x∼ρ_x}[ log(1 − D(S(x))) ] where the β-weighted terms are the unsupervised adversarial loss over unpaired line drawings (ρ_y) and unpaired rough sketches (ρ_x) [Figure: Input / Standard Loss / +Adversarial / +Unsupervised]
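The full objective combines a supervised part (MSE plus adversarial terms on paired data) with an unsupervised part (adversarial terms on unpaired data). A minimal batch-estimate sketch, assuming the discriminator outputs are already available as probabilities (the function and argument names are hypothetical, not from the paper):

```python
import numpy as np

def simplification_objective(s_out, y_star, d_real, d_fake_sup,
                             d_line, d_fake_unsup, alpha=1.0, beta=1.0):
    """Batch estimate of the combined objective that S minimizes and D maximizes.

    s_out:        S(x) on supervised rough sketches
    y_star:       paired ground-truth line drawings
    d_real:       D(y*) on paired line drawings
    d_fake_sup:   D(S(x)) on supervised outputs
    d_line:       D(y) on unpaired line drawings (rho_y)
    d_fake_unsup: D(S(x)) on unpaired rough sketches (rho_x)
    """
    supervised = (np.mean((s_out - y_star) ** 2)           # standard (MSE) loss
                  + alpha * np.mean(np.log(d_real))        # adversarial, real
                  + alpha * np.mean(np.log(1.0 - d_fake_sup)))
    unsupervised = (beta * np.mean(np.log(d_line))         # unpaired line drawings
                    + beta * np.mean(np.log(1.0 - d_fake_unsup)))
    return supervised + unsupervised
```

Setting alpha = beta = 0 recovers the plain MSE loss, which is the baseline the adversarial augmentation improves on.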
  16. Training • Supervised data: standard loss + adversarial loss • Unsupervised data: adversarial loss [Figure: supervised data, MSE + adversarial losses]
  17. Training • Supervised data: standard loss + adversarial loss • Unsupervised data: adversarial loss [Figure: unpaired line drawings and rough sketches, adversarial loss only]
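The split described on these slides, different loss terms depending on whether a batch item is paired or unpaired, can be sketched as a loop skeleton. S and D below are trivial placeholders standing in for the real networks, so this only illustrates which term each data source contributes:

```python
import numpy as np

rng = np.random.default_rng(0)

def S(x):
    # placeholder "simplification" network: identity, not the real model
    return x

def D(y):
    # placeholder "discriminator": a probability from the mean intensity
    return 1.0 / (1.0 + np.exp(-y.mean()))

paired = [(rng.random(4), rng.random(4))]  # supervised (rough, clean) pairs
line_only = [rng.random(4)]                # unpaired line drawings (rho_y)
rough_only = [rng.random(4)]               # unpaired rough sketches (rho_x)

terms = []
for x, y_star in paired:                   # supervised: MSE + adversarial
    terms.append(np.mean((S(x) - y_star) ** 2)
                 + np.log(D(y_star)) + np.log(1.0 - D(S(x))))
for y in line_only:                        # unsupervised: adversarial only
    terms.append(np.log(D(y)))
for x in rough_only:                       # unsupervised: adversarial only
    terms.append(np.log(1.0 - D(S(x))))
```

In actual training each group of terms would drive gradient updates, alternating between the discriminator and the simplification network.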
  18. (Lack of) Post-processing • The MSE loss requires post-processing to avoid blurring • The adversarial loss avoids blurring without post-processing [Figure: Input / LtS (no post-processing) / LtS (post-processed) / Ours (no post-processing)]
  19. Perceptual User Study • Comparison against LtS [Simo-Serra et al. 2016] • 99 images, 15 users • 94 images from artists not in the training set • 60 images come from Twitter. Mean results for all users, absolute rating on a 1-5 scale: LtS 2.77, Ours 3.60. Head-to-head preference: Ours preferred over LtS 88.9% of the time (LtS preferred 11.1%).
  20. Extensions: Single-Image Optimization • Inductive machine learning: learning a parametric function/model from training data and applying it to new data • Transductive machine learning: using training data to predict the test data directly (a parametric model is not necessary) [Diagram: parametric model, training examples, testing examples; induction / deduction / transduction]
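The transductive idea, keep optimizing on the one test image instead of freezing the trained model, can be shown with a toy gradient-descent loop. The "model" and loss below are placeholders (a linear map fit with MSE), not the paper's network; they only illustrate specializing parameters to a single input at the cost of extra computation:

```python
import numpy as np

x = np.array([0.2, 0.8, 0.5])       # the single test "image" (toy values)
target = np.array([0.0, 1.0, 1.0])  # desired output for this image (toy values)
w = np.zeros(3)                     # model parameters, refined per-image

for _ in range(500):
    pred = w * x                    # toy "simplification" of x
    grad = 2 * (pred - target) * x  # gradient of MSE on this one image only
    w -= 0.5 * grad                 # gradient step

# After optimization the model fits this specific image closely.
assert np.allclose(w * x, target, atol=1e-3)
```

The trade-off named on the slide is visible here: accuracy on the one image improves, but every new image pays the optimization cost again.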
  21. Limitations • Still a strong dependency on labelled data • Results are not perfect and require manual fixing [Figure: Input / Only Unsupervised / Ours]
  22. Limitations • Still a strong dependency on labelled data • Results are not perfect and require manual fixing [Figure: Input / Output]