Slide 1

Image Data Augmentation for Deep Learning
Reference: Shorten C, Khoshgoftaar TM. A survey on Image Data Augmentation for Deep Learning. Journal of Big Data. 2019.

Slide 2

Motivation
- Deep learning has been successful, but most of the time we need a ton of data…
- … which we don't always have
- … and even when we do, the data is sometimes unlabeled
- Manually collecting and labeling data requires significant human effort

Slide 3

Motivation
- Data augmentation lets us increase the diversity of our data (in hopes of avoiding overfitting) without actually collecting new data
- Useful not just for limited datasets but also for imbalanced ones

Slide 4

Image Manipulations
- Geometric transformations:
  - Flipping
  - Cropping
  - Rotation
  - Translation
- Good for countering positional biases present in the training data
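A minimal sketch of how these could be wired together, assuming a PyTorch/torchvision pipeline (which the slides do not specify); all parameter values below are illustrative:

```python
from torchvision import transforms

# Illustrative geometric augmentation pipeline; sizes and ranges are assumptions.
geometric_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                     # flipping
    transforms.RandomCrop(size=32, padding=4),                  # cropping (pad, then re-crop)
    transforms.RandomRotation(degrees=15),                      # rotation
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),   # translation up to 10%
])
# usage: augmented = geometric_augment(pil_image), where pil_image is a PIL.Image
```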

Slide 5

Image Manipulations
- Color space transformations
- Useful for overcoming lighting challenges
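One common way to realize this, sketched with torchvision's ColorJitter (the jitter ranges are assumptions, not values from the slides):

```python
from torchvision import transforms

# Randomly perturb brightness, contrast, saturation, and hue; ranges are illustrative.
color_augment = transforms.ColorJitter(
    brightness=0.4,
    contrast=0.4,
    saturation=0.4,
    hue=0.1,
)
# usage: jittered = color_augment(pil_image), where pil_image is a PIL.Image
```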

Slide 6

Image Manipulations
- Color space transformations: https://www.youtube.com/watch?v=d-rCZCHXNBo

Slide 7

Image Manipulations
- Noise injection
- Depending on the transformation and the dataset, it might not be a label-preserving transformation
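A minimal NumPy sketch of Gaussian noise injection (the noise level sigma is an assumed value):

```python
import numpy as np

def inject_gaussian_noise(img, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise to an image scaled to [0, 1]."""
    rng = rng or np.random.default_rng()
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep pixel values in the valid range
```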

Slide 8

Image Manipulations
+: Simple & easy
-: Sometimes we must manually check whether transformations are label-preserving -> requires domain knowledge

Slide 9

Kernel Filters
- Sharpening
- Blurring
- Interactive demo: http://setosa.io/ev/image-kernels/
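To make the idea concrete, a small sketch of applying sharpening and blurring kernels with SciPy (the specific 3×3 kernels are standard textbook choices, not taken from the slides):

```python
import numpy as np
from scipy.ndimage import convolve

sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=float)
blur_kernel = np.full((3, 3), 1.0 / 9.0)  # simple box blur

def apply_kernel(gray_img, kernel):
    """Convolve a 2D grayscale image with a kernel filter."""
    return convolve(gray_img, kernel, mode="reflect")
```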

Slide 10

Kernel Filters: PatchShuffle
- 5.66% error rate on CIFAR-10 vs 6.33% achieved without PatchShuffle
- Hyperparameters used:
  - 2 × 2 filters
  - 0.05 probability of swapping
- Can alternatively be implemented as a layer in a CNN
Kang G, Dong X, Zheng L, Yang Y. PatchShuffle regularization. ArXiv preprint. 2017.
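A minimal NumPy sketch of the idea (assuming an H×W×C image array; the shuffling details are a plausible reading of the method, not the authors' reference code): with probability p, pixels are permuted within each non-overlapping n×n patch.

```python
import numpy as np

def patch_shuffle(img, n=2, p=0.05, rng=None):
    """With probability p, randomly permute the pixels inside each n x n patch.

    img: H x W x C array; n=2 and p=0.05 follow the slide's hyperparameters.
    """
    rng = rng or np.random.default_rng()
    if rng.random() > p:
        return img  # most images pass through untouched
    out = img.copy()
    h, w, c = img.shape
    for y in range(0, h - h % n, n):
        for x in range(0, w - w % n, n):
            patch = out[y:y + n, x:x + n].reshape(n * n, c)
            out[y:y + n, x:x + n] = rng.permutation(patch).reshape(n, n, c)
    return out
```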

Slide 11

Kernel Filters: PatchShuffle
Kang G, Dong X, Zheng L, Yang Y. PatchShuffle regularization. ArXiv preprint. 2017.

Slide 12

Mixing Images: SamplePairing
- Average the pixel values of two images (simple averaging only!)
- Probably prevents the model from overfitting by feeding it "wrong" information
- Don't leave SamplePairing on for the entire training period
Inoue H. Data augmentation by pairing samples for images classification. ArXiv e-prints. 2018.
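A minimal sketch of the pairing step (a hypothetical helper; per the paper, the label of the second image is simply discarded):

```python
import numpy as np

def sample_pairing(img_a, label_a, img_b):
    """Blend two images by simple pixel averaging; keep only the first label."""
    mixed = (img_a.astype(np.float32) + img_b.astype(np.float32)) / 2.0
    return mixed, label_a  # label of img_b is intentionally ignored
```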

Slide 13

Mixing Images: SamplePairing
- Example pairings (same class and different class): https://jsideas.net/samplepairing/

Slide 14

Mixing Images: SamplePairing
- Good results on limited datasets
Inoue H. Data augmentation by pairing samples for images classification. ArXiv e-prints. 2018.

Slide 15

Mixing Images: MixUp
- Take 2 images and form a linear combination of them: x_mix = λ·x_i + (1 − λ)·x_j, y_mix = λ·y_i + (1 − λ)·y_j
- λ is randomly sampled from a beta distribution
- Example: 30% cat, 70% alpaca
Zhang H, Cisse M, Dauphin YN, Lopez-Paz D. mixup: Beyond empirical risk minimization. ArXiv preprint. 2017.
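A minimal NumPy sketch (alpha is the paper's hyperparameter; the value 0.2 here is an assumption, and labels are assumed to be one-hot vectors):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Linearly combine two examples and their one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)        # lambda ~ Beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2     # blended image
    y = lam * y1 + (1.0 - lam) * y2     # blended (soft) label
    return x, y
```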

Slide 16

Mixing Images: MixUp
https://www.dlology.com/blog/how-to-do-mixup-training-from-image-files-in-keras/

Slide 17

Mixing Images: More Variations
- Best result: 5.4% -> 3.8% error on CIFAR-10 and 23.6% -> 19.7% on CIFAR-100
Summers C, Dinneen MJ. Improved mixed-example data augmentation. ArXiv preprint. 2018.
https://github.com/ceciliaresearch/MixedExample

Slide 18

Mixing Images: More Variations (cont.)

Slide 19

Mixing Images
+: Does not require significant domain knowledge
-: Doesn't really make sense from a human perspective -> difficult to explain

Slide 20

Random Erasing
- Much like dropout, except applied to the input data
- Designed to handle occlusion, where some parts of the object are obscured
- Forces the model to learn more descriptive features
- Also forces the model to pay attention to the entire image
Zhong Z, Zheng L, Kang G, Li S, Yang Y. Random erasing data augmentation. ArXiv e-prints. 2017.

Slide 21

Random Erasing
- The best patch fill method was found to be random values
- Parameters: the fill method and the size of the masks
Zhong Z, Zheng L, Kang G, Li S, Yang Y. Random erasing data augmentation. ArXiv e-prints. 2017.
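A minimal NumPy sketch using the random-value fill (the erase probability and the area/aspect ranges below are assumptions; the paper tunes such parameters per dataset):

```python
import numpy as np

def random_erase(img, p=0.5, area_frac=(0.02, 0.4), rng=None):
    """With probability p, overwrite a random rectangle with random values.

    img: H x W x C array scaled to [0, 1].
    """
    rng = rng or np.random.default_rng()
    if rng.random() > p:
        return img
    h, w, c = img.shape
    area = rng.uniform(*area_frac) * h * w             # target erase area
    aspect = rng.uniform(0.3, 1.0 / 0.3)               # assumed aspect-ratio range
    eh = min(h, max(1, int(round(np.sqrt(area * aspect)))))
    ew = min(w, max(1, int(round(np.sqrt(area / aspect)))))
    y = rng.integers(0, h - eh + 1)
    x = rng.integers(0, w - ew + 1)
    out = img.copy()
    out[y:y + eh, x:x + ew] = rng.random((eh, ew, c))  # random-value fill
    return out
```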

Slide 22

Random Erasing
-: Might lose important information

Slide 23

Feature Space Augmentation
- Main challenge of augmenting in input space: domain expertise is required to ensure that newly generated data respects valid transformations
- It is more likely to encounter realistic samples in feature space
- Techniques: noise, interpolation, extrapolation
DeVries T, Taylor GW. Dataset augmentation in feature space. In: Proceedings of the International Conference on Machine Learning (ICML), Workshop Track. 2017.
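A minimal sketch of the three operations on learned feature vectors of two same-class examples, following the interpolation/extrapolation forms in DeVries & Taylor (the lam and sigma values are assumptions):

```python
import numpy as np

def augment_feature(f_i, f_j, mode="extrapolate", lam=0.5, sigma=0.1, rng=None):
    """Create a new feature vector from two same-class feature vectors f_i, f_j."""
    rng = rng or np.random.default_rng()
    if mode == "noise":
        return f_i + rng.normal(0.0, sigma, size=f_i.shape)
    if mode == "interpolate":
        return (f_j - f_i) * lam + f_i   # move toward the neighbor
    if mode == "extrapolate":
        return (f_i - f_j) * lam + f_i   # push past f_i, away from the neighbor
    raise ValueError(f"unknown mode: {mode}")
```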

Slide 24

Feature Space Augmentation

Slide 25

Feature Space Augmentation

Slide 26

Feature Space Augmentation
+: Domain-independent, requiring no specialized knowledge
+: Can be applied to many different types of problems
-: Difficult to interpret the vector data

Slide 27

GAN-based Data Augmentation
- Create artificial images that retain characteristics similar to the original dataset
- Demo: ThisCatDoesNotExist.com

Slide 28

GAN-based Data Augmentation
- Built upon FineGAN
- Requires bounding boxes for training
Li Y, Singh KK, Ojha U, Lee YJ. MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation. ArXiv preprint. 2019.

Slide 29

GAN-based Data Augmentation
-: GANs themselves need a substantial amount of data to train, so this might not be very practical

Slide 30

AutoAugment
- Learn the best augmentation policies for a given dataset with reinforcement learning
- The search space of all possible transformations is huge:
  - A policy consists of 5 sub-policies
  - Each sub-policy applies 2 image operations in sequence
  - Each image operation has two parameters: the probability of applying it and the magnitude of the operation (e.g. rotate 30 degrees in 70% of cases)
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.
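A minimal sketch of what applying one sub-policy looks like (the two operations, probabilities, and magnitudes below are invented placeholders, not policies learned by AutoAugment):

```python
import random
from PIL import Image, ImageOps

# One sub-policy: two (operation, probability, magnitude) steps applied in sequence.
SUB_POLICY = [
    (lambda img, m: img.rotate(m), 0.7, 30),              # rotate 30 degrees, 70% of cases
    (lambda img, m: ImageOps.posterize(img, m), 0.4, 4),  # posterize to 4 bits, 40% of cases
]

def apply_sub_policy(img: Image.Image, sub_policy=SUB_POLICY) -> Image.Image:
    """Apply each operation of a sub-policy with its own probability."""
    for op, prob, magnitude in sub_policy:
        if random.random() < prob:
            img = op(img, magnitude)
    return img
```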

Slide 31

AutoAugment
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.

Slide 32

AutoAugment: ImageNet
- 5 policies, 25 sub-policies per dataset
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.

Slide 33

AutoAugment: Street View House Numbers (SVHN)
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.

Slide 34

AutoAugment
- 16 operations:
  - 14 from the Python image library PIL: rotating, color inverting, posterizing (reducing pixel bits), solarizing (inverting colors above a threshold), etc.
  - Plus Cutout and SamplePairing
- 10 magnitudes, 11 probabilities
- The controller is a recurrent neural network trained with Proximal Policy Optimization
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.

Slide 35

AutoAugment
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.

Slide 36

AutoAugment
- Policies learned on ImageNet transferred successfully to the Stanford Cars and FGVC Aircraft image recognition tasks
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.

Slide 37

AutoAugment
-: Relatively new, hasn't been heavily tested
-: Difficult & time-consuming to run (it takes an estimated $37,500 of compute to discover the best policies for ImageNet) :/
- Follow-up: Population Based Augmentation (2019): https://bair.berkeley.edu/blog/2019/06/07/data_aug/
Cubuk ED, Zoph B, Mané D, Vasudevan V, Le QV. AutoAugment: learning augmentation policies from data. ArXiv preprint. 2018.

Slide 38

Takeaways
- Be careful of non-label-preserving transformations

Slide 39

Takeaways
- Be careful of non-label-preserving transformations
- There are lots of data augmentation alternatives beyond cropping & rotating

Slide 40

Takeaways
- Be careful of non-label-preserving transformations
- There are lots of data augmentation alternatives beyond cropping & rotating
- Data augmentation is not a silver bullet
  - e.g. if you're trying to classify dogs but you only have bulldogs and no instances of golden retrievers, no method is going to automagically create golden retrievers for you
