Slide 1

Slide 1 text

Semi-Supervised Learning of Sketch Simplification
Edgar Simo-Serra
March 22nd, 2018
Waseda University

Slide 2

Slide 2 text

Today’s talk

Slide 5

Slide 5 text

Illustration Stages: Rough Sketch → Line Art → Colorization → Completion (David Revoy, www.davidrevoy.com)

Slide 9

Slide 9 text

Illustration Stages
David Revoy, www.davidrevoy.com

Slide 10

Slide 10 text

Sketch Simplification

Slide 11

Slide 11 text

Sketch Simplification
Input: Rough Sketch → Output: Line Art

Slide 12

Slide 12 text

Sketch Simplification
[Figure: rough sketch and target line-art pairs]

Slide 13

Slide 13 text

Characteristics of Sketch Simplification
• Inputs and outputs are sparse
• Data is hard to obtain
• Very large diversity in input/output
[Figure: Input, [Simo-Serra et al. 2016], Ours, [Favreau et al. 2016]]

Slide 14

Slide 14 text

Characteristics of Sketch Simplification
• Not possible to use automatic evaluation
[Figure: Ground Truth (MSE: 0), Displaced 0.4% (4 px) (MSE: 0.0276), Displaced 0.8% (8 px) (MSE: 0.0302, +9%)]

Slide 15

Slide 15 text

Characteristics of Sketch Simplification
• Not possible to use automatic evaluation
[Figure: Ground Truth (MSE: 0), Displaced 0.4% (4 px) (MSE: 0.0276), White Image (MSE: 0.0190, -31%)]
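This failure mode is easy to reproduce: because line drawings are sparse, a blank page can score a lower MSE than a slightly displaced but otherwise perfect copy. A minimal synthetic sketch (the image and displacement here are illustrative, not the slide's actual data):

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Synthetic 100x100 "line drawing": white canvas (1.0) with one black stroke (0.0).
img = np.ones((100, 100))
img[:, 50] = 0.0

shifted = np.roll(img, 4, axis=1)  # the same drawing, displaced by 4 pixels
white = np.ones_like(img)          # a completely blank page

print(mse(img, img))      # identical copy: 0
print(mse(img, shifted))  # displaced copy: strokes no longer overlap
print(mse(img, white))    # the blank page scores *lower* than the displaced copy
```

A per-pixel metric cannot tell "correct but shifted" from "missing entirely", which is why the work relies on user studies instead.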

Slide 18

Slide 18 text

Related Work
1. Sketch Simplification
   1.1 Progressive Online Modification
   1.2 Stroke Reduction
   1.3 Stroke Grouping
   1.4 Vector input
2. Vectorization
   2.1 Model Fitting (Bezier, …)
   2.2 Gradient-based approaches
   2.3 Require fairly clean input sketches
3. Deep Learning
   3.1 Fully Convolutional Network
   3.2 Generative Adversarial Network
   3.3 …
[Figures: Liu et al. 2015; Noris et al. 2013; Long et al. 2015]

Slide 21

Slide 21 text

Semi-Supervised Sketch Simplification
• Sketch Simplification Model: S(·)
• Discriminator Model: D(·)
• Supervised Training Data: ρ_{x,y} (rough sketch x, line drawing y*)
• Unsupervised data: ρ_y, ρ_x
• Adversarial weighting hyperparameter: α
• Unsupervised weighting hyperparameter: β

\min_S \max_D \; \mathbb{E}_{(x,y^*)\sim\rho_{x,y}}\big[\,\|S(x)-y^*\|^2 + \alpha\log D(y^*) + \alpha\log(1-D(S(x)))\,\big] + \beta\,\mathbb{E}_{y\sim\rho_y}\big[\log D(y)\big] + \beta\,\mathbb{E}_{x\sim\rho_x}\big[\log(1-D(S(x)))\big]
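As a concrete reading of the objective, the sketch below evaluates the combined loss for one batch with NumPy. It is a hypothetical helper, not the authors' code: in practice S and D are networks, the expectations are minibatch averages, and S minimizes while D maximizes this value.

```python
import numpy as np

def semi_supervised_objective(S_x, y_star, D_y_star, D_S_x,
                              D_y, D_S_xu, alpha, beta):
    """Batch estimate of the semi-supervised objective.

    S_x, y_star     -- model output and ground truth for supervised pairs
    D_y_star, D_S_x -- discriminator outputs (in (0,1)) on targets / outputs
    D_y, D_S_xu     -- discriminator outputs on unsupervised line drawings /
                       simplified unsupervised rough sketches
    """
    mse = np.mean((S_x - y_star) ** 2)
    adv = alpha * (np.mean(np.log(D_y_star)) + np.mean(np.log(1.0 - D_S_x)))
    unsup = beta * (np.mean(np.log(D_y)) + np.mean(np.log(1.0 - D_S_xu)))
    return float(mse + adv + unsup)

# A perfect simplification with a maximally confused discriminator (all 0.5):
ones = np.ones(4)
val = semi_supervised_objective(ones, ones, 0.5 * ones, 0.5 * ones,
                                0.5 * ones, 0.5 * ones, alpha=1.0, beta=1.0)
```

Note that the β terms only involve D and S(x) on unlabeled data, which is what lets sketches "in the wild" contribute to training.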

Slide 22

Slide 22 text

Optimization Overview

Slide 23

Slide 23 text

Semi-Supervised Motivation
[Figure: annotated images vs sketches “in the wild”]

Slide 26

Slide 26 text

Semi-Supervised Motivation
[Figure: the Simplification Network maps rough sketches to line drawings; the Discriminator Network labels supervised targets and unsupervised line drawings as real, and network outputs as fake; supervised pairs additionally use the MSE loss against the target label]

Slide 27

Slide 27 text

Model
• 23 convolutional layers
• Output has the same resolution as the input
• Encoder-decoder architecture
  • Reduces memory usage
  • Increases the effective receptive field
  • Resolution is lowered to 1/8 of the original size
[Figure: down-convolutions (2 × 2, 4 × 4, 8 × 8), flat convolutions, and up-convolutions (4 × 4, 2 × 2)]
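The resolution bookkeeping can be sketched as follows; the three stride-2 stages are an assumption consistent with the 1/8 bottleneck, not the paper's exact layer list:

```python
def encoder_decoder_sizes(size, down=(2, 2, 2), up=(2, 2, 2)):
    """Trace spatial resolution through stride-2 down- and up-convolutions."""
    trace = [size]
    for s in down:          # encoder: each down-convolution halves the size
        size //= s
        trace.append(size)
    for s in up:            # decoder: each up-convolution doubles it back
        size *= s
        trace.append(size)
    return trace

sizes = encoder_decoder_sizes(424)
# The bottleneck sits at 1/8 of the input; the output matches the input resolution.
```

Working at 1/8 resolution in the middle of the network is what keeps memory manageable while still letting each output pixel see a large neighborhood of the rough sketch.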

Slide 28

Slide 28 text

Training
• Trained from scratch
• First trained with MSE only
• Afterwards trained with the full loss
• Using 424 × 424 px or 384 × 384 px patches
• Batch Normalization [Ioffe and Szegedy 2015]
• Optimized with ADADELTA [Zeiler 2012]
[Figure: Input, Output, Target]

Slide 30

Slide 30 text

Vectorization and Simplification
• Vectorization with potrace
  • Open-source software
  • High-pass filter and binarization
• Scaling the input changes the simplification degree
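A rough sketch of the pre-vectorization step, assuming a simple 3 × 3 box blur as the high-pass filter (the exact filter and threshold used by the authors are not given on the slide):

```python
import numpy as np

def binarize_highpass(img, thresh=0.9):
    """High-pass filter and binarize a grayscale image (1.0 = white paper,
    0.0 = black ink) so it can be fed to potrace as a bitmap."""
    # 3x3 box blur via shifted copies (edges wrap; fine for an illustration)
    blur = sum(np.roll(np.roll(img, dy, 0), dx, 1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    highpass = img - blur + 1.0   # remove low-frequency shading, re-center at white
    return (highpass < thresh).astype(np.uint8)  # 1 = ink, 0 = paper

page = np.ones((16, 16))
page[:, 8] = 0.0                  # a single vertical stroke
bitmap = binarize_highpass(page)
```

The resulting bitmap can then be traced with potrace, e.g. `potrace -s out.pbm -o out.svg` for SVG output.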

Slide 31

Slide 31 text

Sketch Dataset

Slide 32

Slide 32 text

Supervised Sketch Dataset
• 68 pairs of rough and target sketches ρ_{x,y}
• 5 illustrators
[Figure: patches extracted from the sketch dataset]

Slide 33

Slide 33 text

Inverse Dataset Creation
• Data quality is critical
• Creating target sketches from rough sketches causes misalignments
• Creating rough sketches from target sketches keeps them properly aligned
[Figure: Standard vs Inverse Creation]

Slide 35

Slide 35 text

Data Augmentation
• 68 pairs is insufficient
• Scaling training data
• Random cropping, flipping, and rotation
• Additional augmentations: tone, slur, and noise
[Figure: input, tone, slur, noise]
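The geometric part of the augmentation can be sketched as below. The crop size follows the training patch sizes; restricting rotation to multiples of 90 degrees is a simplification of ours, and in training the identical transform must be applied to the rough sketch and its target so the pair stays aligned:

```python
import numpy as np

def augment(patch, rng, size=384):
    """Random crop, horizontal flip, and 90-degree rotation of one patch."""
    h, w = patch.shape
    y = int(rng.integers(0, h - size + 1))   # random crop origin
    x = int(rng.integers(0, w - size + 1))
    out = patch[y:y + size, x:x + size]
    if rng.random() < 0.5:                   # random horizontal flip
        out = out[:, ::-1]
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # random 90-degree rotation
    return out

rng = np.random.default_rng(0)
crop = augment(np.ones((424, 424)), rng)
```

Tone, slur, and noise would be applied on top of this, to the rough sketch only, to imitate scanning and drawing artifacts.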

Slide 36

Slide 36 text

Unsupervised Data
• Obtained from a diversity of sources
  • Known illustrators
  • Web search
• All manually verified
• All from different authors than the training data
• 109 unsupervised clean sketches ρ_y
• 85 unsupervised rough sketches ρ_x

Slide 37

Slide 37 text

Results

Slide 38

Slide 38 text

Computation Time
• Intel Core i7-5960X CPU (3.00 GHz)
• NVIDIA GeForce TITAN X GPU
• 3 weeks training time

Image Size     Pixels     CPU (s)   GPU (s)   Speedup
320 × 320      102,400     2.014     0.047     42.9×
640 × 640      409,600     7.533     0.159     47.4×
1024 × 1024    1,048,576  19.463     0.397     49.0×

Slide 39

Slide 39 text

Comparison with Standard Tools Input Potrace Adobe Live Trace Ours (MSE) 20

Slide 41

Slide 41 text

User Study (MSE)
• Comparison with 15 images
• 19 users participated (10 with illustration experience)
• Absolute rating (1 to 5 scale)
• Relative evaluation (best of two)

                 Ours (MSE)   Live Trace   Potrace
Score               4.53         2.94        2.80
vs Ours (MSE)        -           2.5%        2.8%
vs Live Trace      97.5%          -         30.3%
vs Potrace         97.2%        69.7%         -

Slide 42

Slide 42 text

Effect of Full Loss
• MSE-only training blurs the output
• The high-pass filter loses details
• The adversarial loss removes the need for post-processing
[Figure: Input; MSE; MSE (no post-processing); Full (no post-processing)] ©Eisaku Kubonouchi

Slide 43

Slide 43 text

Benefits of Unsupervised Data
• Improves generalization to different rough sketch styles
[Figure: Input, Supervised-only, Full] David Revoy, www.davidrevoy.com

Slide 44

Slide 44 text

Results
[Figure: Input, MSE (PP), Full]

Slide 45

Slide 45 text

Results
[Figure: Input, MSE (PP), Full] David Revoy, www.davidrevoy.com

Slide 46

Slide 46 text

Results
[Figure: Input, MSE (PP), Full] ©Eisaku Kubonouchi

Slide 47

Slide 47 text

In-depth User Study
• Comparison with 99 images (60 from Twitter)
• 15 users participated
• Absolute rating (1 to 5 scale)
• Relative evaluation (best of two)

           MSE     Full
absolute   2.77    3.60
vs MSE      -      88.9%
vs Full    11.1%    -

Slide 48

Slide 48 text

From Inductive to Transductive Paradigm
• Use the test data as unsupervised training data
• Fine-tuning is done from the full model
• Inference time increases (roughly 100 iterations are sufficient)
[Figure: Input, Output, Optimized] David Revoy, www.davidrevoy.com
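In skeleton form, the transductive step just keeps running the unsupervised updates with the test image itself as the unlabeled rough sketch; `update_step` below is a hypothetical stand-in for the model's usual adversarial update, not the authors' API:

```python
def transductive_finetune(update_step, test_sketch, iters=100):
    """Fine-tune on the test input itself: no labels are needed because
    only the unsupervised adversarial terms are optimized."""
    for _ in range(iters):
        update_step(test_sketch)

# Stub demonstrating the call pattern (a real update_step would apply one
# gradient step of the unsupervised losses to the simplification network):
calls = []
transductive_finetune(lambda x: calls.append(x), test_sketch="rough.png")
```

The trade-off stated on the slide follows directly: each test image pays for roughly 100 extra optimization steps at inference time in exchange for output adapted to that specific sketch.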

Slide 51

Slide 51 text

Pencil Drawing Generation
[Figure: Input, MSE Loss, Artist 1, Artist 2]

Slide 52

Slide 52 text

Line Drawing Inpainting and Model Optimization
[Figure: masked input]

Slide 53

Slide 53 text

Line Drawing Inpainting and Model Optimization
[Figure: network architecture with 64/96/128 and 128/256/512 channel blocks; features tapped at layers 3, 7, and 11]

Slide 56

Slide 56 text

Conclusions
• Sketch simplification is a hard problem
• Adversarial learning is beneficial
  • Eliminates post-processing
• Semi-supervised training
• Transductive learning
• Pencil drawing generation
• Limitations
  • Training data
  • Adversarial (in)stability
• Future directions
  • Graphical model post-processing
  • Colorization
  • Improving transductive learning

Slide 57

Slide 57 text

New Results
[Figure: Input, Adversarial, New]

Slide 58

Slide 58 text

Thanks for listening
• Edgar Simo-Serra: http://hi.cs.waseda.ac.jp/~esimo/
• Try Sketch Simplification: http://hi.cs.waseda.ac.jp:8081/
• Code: https://github.com/bobbens/sketch_simplification
©Edgar Simo-Serra