Learning to Generate Synthetic Data
via Compositing
画像処理&機械学習 論文LT会 #5 (Image Processing & Machine Learning Paper LT Meetup #5)
@phalanx
22 July, 2019
Self Introduction
• Name: phalanx
• Data Scientist at DeNA
• Machine Learning: 1.5 years
• Kaggle: 1 year
• Kaggle Master
• TGS 1st place
• iMet 7th place
• Petfinder 17th place
• HPA 36th place
@ZFPhalanx
Outline
• Task-aware approach to synthetic data generation
• Our pipeline consists of three components
• Synthesizer Network: generates composite images
• Target Network: classifies/detects the foreground object in the composite image
• Discriminator: identifies whether the composite image is real or not
Outline
• Limitation of prior approaches
• Generating synthetic data is decoupled from training the target classifier
→ the synthetic data has little value for improving the target network's performance
• Our approach
• Synthesizer and target networks are trained in an adversarial manner
→ the synthesizer produces meaningful training samples
Synthesizer Network
• Inputs: background image b, foreground object mask f
• Output: transformation A
• A is restricted to the set of 2D affine transformations in this paper
• Composite synthetic image: c = b ⊕ A(f)
• ⊕: alpha blending
• A(f) is produced by a spatial transformer network from f and A
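The compositing step above can be sketched numerically. Below is a minimal NumPy version, assuming inverse-mapped nearest-neighbor warping as a non-differentiable stand-in for the spatial transformer; the function names and sampling scheme are assumptions of this sketch, not the paper's implementation.

```python
import numpy as np

def affine_warp(img, A):
    """Warp a (H, W) image by a 2x3 affine matrix A via inverse mapping
    with nearest-neighbor sampling (zero padding outside the source)."""
    H, W = img.shape
    # Append [0, 0, 1] to make A invertible, invert, drop the last row.
    M = np.linalg.inv(np.vstack([A, [0.0, 0.0, 1.0]]))[:2]
    ys, xs = np.mgrid[0:H, 0:W]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])
    src = M @ coords                        # source coords per output pixel
    sx = np.round(src[0]).astype(int)
    sy = np.round(src[1]).astype(int)
    valid = (sx >= 0) & (sx < W) & (sy >= 0) & (sy < H)
    out = np.zeros_like(img)
    out.ravel()[valid] = img[sy[valid], sx[valid]]
    return out

def composite(background, foreground, alpha, A):
    """c = b (+) A(f): warp foreground and its alpha mask, then alpha-blend."""
    f_w = affine_warp(foreground, A)
    a_w = affine_warp(alpha, A)
    return a_w * f_w + (1.0 - a_w) * background
```

With the identity transform this reduces to plain alpha blending of foreground over background.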
Synthesizer Network: architecture
• Shared Feature Network
• Identical feature extraction on and
• Foreground/Background branch
• Identical mid-level feature extraction on and
• FC Regression Network:
• Concatnate mid-level feature of and
• Outputs affine transformation parameter
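A rough structural sketch of these three components is below. The dimensions, the ReLU linear layers standing in for conv blocks, and the use of separate foreground/background branch weights are all assumptions of this sketch, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, W, b):
    # One ReLU-activated affine layer (toy stand-in for a conv block).
    return np.maximum(x @ W + b, 0.0)

# Illustrative dimensions (assumptions).
D_in, D_feat, D_mid = 64, 32, 16
W_shared, b_shared = rng.normal(size=(D_in, D_feat)) * 0.1, np.zeros(D_feat)
W_bg, b_bg = rng.normal(size=(D_feat, D_mid)) * 0.1, np.zeros(D_mid)
W_fg, b_fg = rng.normal(size=(D_feat, D_mid)) * 0.1, np.zeros(D_mid)
W_reg = rng.normal(size=(2 * D_mid, 6)) * 0.1   # 6 affine parameters

def synthesizer(b_img, f_mask):
    # Shared feature network: identical weights applied to both inputs.
    hb = linear(b_img, W_shared, b_shared)
    hf = linear(f_mask, W_shared, b_shared)
    # Background / foreground branches for mid-level features.
    mb = linear(hb, W_bg, b_bg)
    mf = linear(hf, W_fg, b_fg)
    # FC regression: concatenate and predict the 2x3 affine matrix A.
    theta = np.concatenate([mb, mf]) @ W_reg
    return theta.reshape(2, 3)
```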
Target Network
• Neural network trained for a specific task (classification, detection, etc.)
• The target network is fine-tuned with composite images
• Loss function:
• Image classification: cross-entropy loss
• Object detection:
• classification: cross-entropy, localization: smooth L1
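Both losses are standard; minimal NumPy versions are below (the function names and the `beta` threshold are illustrative choices of this sketch).

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss for box regression: quadratic for |d| < beta,
    linear beyond, so large localization errors do not dominate."""
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d**2 / beta, d - 0.5 * beta).mean()

def cross_entropy(logits, label):
    """Cross-entropy for a single example, from raw logits."""
    z = logits - logits.max()                 # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]
```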
Discriminator
• Motivation
• Realistic data can help the target network learn more efficiently
• The synthesizer needs to produce realistic composite images
• Binary classification
• Inputs: composite images and real images
• Loss function: L_D = E_x~real[log D(x)] + E_c~composite[log(1 − D(c))]
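A NumPy sketch of this binary cross-entropy objective, written as a loss to minimize (the negation of the expectation above); the `eps` term is a numerical-stability assumption of this sketch.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Negated GAN discriminator objective:
    -( E[log D(x_real)] + E[log(1 - D(c_composite))] ).
    d_real / d_fake are the discriminator's probabilities in (0, 1)."""
    return -(np.log(d_real + eps).mean() + np.log(1.0 - d_fake + eps).mean())
```

A maximally confused discriminator (0.5 on everything) gives loss 2·log 2.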
Training
• Train the three models S, T, and D alternately:
• Update the parameters of S while keeping the parameters of T and D fixed
• Update the parameters of T and D while keeping the parameters of S fixed
S: Synthesizer network, T: Target network, D: Discriminator
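The alternating schedule can be sketched as a loop. The dummy `step` update and scalar "parameters" are pure placeholders for real optimizer steps on real networks; only the update ordering reflects the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
params = {"S": 0.0, "T": 0.0, "D": 0.0}   # toy stand-ins for network weights

def step(name, lr=0.1):
    # Placeholder for one optimizer update on the named model's parameters.
    params[name] += lr * rng.normal()

for it in range(10):
    # Phase 1: update S; T and D stay fixed (only S is stepped).
    step("S")
    # Phase 2: update T and D; S stays fixed.
    step("T")
    step("D")
```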
Performance with AffNIST
• AffNIST: MNIST transformed by randomly sampled affine transforms
• Red line: train the model on MNIST, then fine-tune on AffNIST data
• Green line: train the model on MNIST, then fine-tune on synthetic data
• Synthetic data
• Foreground: MNIST digits
• Background: black background
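AffNIST-style random affine transforms can be sampled as below; the parameter ranges (rotation, scale, shift) are illustrative assumptions, not the actual AffNIST settings.

```python
import numpy as np

def sample_affine(rng, max_rot=np.pi / 8, max_scale=0.2, max_shift=5.0):
    """Sample a random 2x3 affine matrix (scaled rotation + translation),
    similar in spirit to the transforms AffNIST applies to MNIST digits."""
    theta = rng.uniform(-max_rot, max_rot)
    s = 1.0 + rng.uniform(-max_scale, max_scale)
    tx, ty = rng.uniform(-max_shift, max_shift, size=2)
    c, si = np.cos(theta), np.sin(theta)
    return np.array([[s * c, -s * si, tx],
                     [s * si,  s * c, ty]])
```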
Performance with Pascal VOC
• Comparison of our approach with prior approaches
• Cut-Paste-Learn [7]
• Context-Data-Augmentation [6]
• Synthetic data
• Foreground: instance masks from VOC 2007/2012
• Background: COCO images
Results on VOC 2007
Performance with Pascal VOC
• Quality of Synthetic Data
• Our approach generates harder examples than the prior approaches
Synthesizer Network Output
• Top: composite images without the discriminator
• Bottom: composite images with the discriminator