
Image Data Augmentation for Deep Learning

Galuh Sahid
January 15, 2020

Reference: A survey on Image Data Augmentation for Deep Learning (https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0)

Transcript

  1. Image Data Augmentation for
    Deep Learning
    Reference:
    A survey on Image Data Augmentation for Deep Learning

  2. Motivation
    - Deep learning has been successful, but most of the time we need a ton of data…
    - … which we don’t always have
    - … and even if we do, it is sometimes unlabeled
    - Manually collecting & labeling data requires significant human effort

  3. Motivation
    - Data augmentation allows us to increase the diversity of our data (in hopes of
    avoiding overfitting) without actually collecting new data
    - It is useful not just for limited datasets but also for imbalanced ones

  4. Image Manipulations
    - Geometric transformations:
    - Flipping
    - Cropping
    - Rotation
    - Translation
    - Good for addressing positional biases
    present in the training data (see the sketch below)
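
    A minimal sketch of these geometric transformations using torchvision (assumed to be available); the crop size, padding, and rotation range below are illustrative values, not taken from the slides:

    from torchvision import transforms

    # Randomly flip, crop, rotate, and translate PIL images (inputs of at least
    # 224 x 224 pixels are assumed for the crop size used here).
    geometric_aug = transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),                    # flipping
        transforms.RandomCrop(224, padding=8),                     # cropping
        transforms.RandomRotation(degrees=15),                     # rotation
        transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # translation
    ])

    # augmented = geometric_aug(pil_image)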

  5. Image Manipulations
    - Color space transformations
    - Useful to overcome lighting
    challenges
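
    A minimal sketch of a color space transformation with Pillow's ImageEnhance module; the enhancement factor ranges are illustrative, not from the slides:

    import random
    from PIL import Image, ImageEnhance

    def random_color_jitter(img: Image.Image) -> Image.Image:
        """Randomly perturb brightness, contrast, and saturation to simulate lighting changes."""
        img = ImageEnhance.Brightness(img).enhance(random.uniform(0.7, 1.3))
        img = ImageEnhance.Contrast(img).enhance(random.uniform(0.7, 1.3))
        img = ImageEnhance.Color(img).enhance(random.uniform(0.7, 1.3))  # saturation
        return img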

  6. Image Manipulations
    - Color space transformations
    https://www.youtube.com/watch?v=d-rCZCHXNBo

  7. Image Manipulations
    - Noise injection
    - Depending on the transformations &
    datasets, it might not be a
    label-preserving transformation
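
    A minimal NumPy sketch of noise injection on a uint8 image; the standard deviation is illustrative, and too much noise may no longer be label-preserving:

    import numpy as np

    def add_gaussian_noise(img, std=10.0, rng=np.random.default_rng()):
        """Add zero-mean Gaussian noise to a uint8 H x W (x C) image and clip back to [0, 255]."""
        noise = rng.normal(0.0, std, size=img.shape)
        return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)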

  8. Image Manipulations
    +:
    - Simple & easy
    -:
    - Sometimes we must manually check
    whether transformations are
    label-preserving or not -> requires
    domain knowledge

  9. Kernel Filters
    - Sharpening
    - Blurring
    http://setosa.io/ev/image-kernels/
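
    A minimal sketch of kernel filtering with NumPy and SciPy on a grayscale image; the sharpening and box-blur kernels are standard examples, not values from the slides:

    import numpy as np
    from scipy.ndimage import convolve

    # A common 3x3 sharpening kernel and a simple 3x3 box-blur kernel.
    SHARPEN = np.array([[ 0, -1,  0],
                        [-1,  5, -1],
                        [ 0, -1,  0]], dtype=np.float32)
    BLUR = np.full((3, 3), 1.0 / 9.0, dtype=np.float32)

    def apply_kernel(img, kernel):
        """Convolve a grayscale (2-D) uint8 image with the given kernel and clip to [0, 255]."""
        return np.clip(convolve(img.astype(np.float32), kernel), 0, 255).astype(np.uint8)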

  10. Kernel Filters: PatchShuffle
    Guoliang K, Xuanyi D, Liang Z, Yi Y. PatchShuffle regularization. arXiv preprint. 2017.
    5.66% error rate on CIFAR-10
    vs 6.33% achieved without
    PatchShuffle
    Using hyperparams:
    - 2 × 2 filters
    - 0.05 probability of swapping
    Can also be implemented as a layer
    inside a CNN instead of only on the input
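
    A minimal NumPy sketch of the input-space version of PatchShuffle, using the hyperparameters from the slide (2 × 2 blocks, 0.05 swap probability); the original implementation may differ in details:

    import numpy as np

    def patch_shuffle(img, patch=2, p=0.05, rng=np.random.default_rng()):
        """With probability p per block, randomly permute the pixels inside each
        non-overlapping patch x patch block of an H x W (x C) image."""
        out = img.copy()
        h, w = out.shape[:2]
        for i in range(0, h - patch + 1, patch):
            for j in range(0, w - patch + 1, patch):
                if rng.random() < p:
                    block = out[i:i + patch, j:j + patch]
                    pixels = block.reshape(patch * patch, -1)
                    rng.shuffle(pixels)                     # permute pixels within the block
                    out[i:i + patch, j:j + patch] = pixels.reshape(block.shape)
        return out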

  11. Kernel Filters: PatchShuffle
    Guoliang K, Xuanyi D, Liang Z, Yi Y. PatchShuffle regularization. arXiv preprint. 2017.

  12. Mixing Images: SamplePairing
    - Averaging pixel values
    (only with simple averaging!)
    - Probably prevents overfitting by
    deliberately feeding “wrong” (mixed)
    information to the model
    - Don’t turn SamplePairing on
    for the entire training period
    Hiroshi I. Data augmentation by pairing samples for images classification. ArXiv e-prints. 2018.
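
    A minimal NumPy sketch of SamplePairing, assuming two uint8 images of the same shape; only the label of the first image is kept:

    import numpy as np

    def sample_pairing(img_a, img_b):
        """SamplePairing: average two images pixel-wise with equal weights.
        Only the label of img_a is kept; as the slide notes, the augmentation
        should not be enabled for the entire training period."""
        mixed = (img_a.astype(np.float32) + img_b.astype(np.float32)) / 2.0
        return mixed.astype(np.uint8)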

  13. Mixing Images: SamplePairing
    https://jsideas.net/samplepairing/
    Example pairs: same class / different class / different class

  14. Mixing Images: SamplePairing
    - Gives good results on limited
    datasets
    Hiroshi I. Data augmentation by pairing samples for images classification. ArXiv e-prints. 2018.

  15. Mixing Images: MixUp
    - Take 2 images & do a linear combination of them
    - e.g. 30% cat, 70% alpaca
    - λ is randomly sampled from a beta distribution
    H Zhang, M Cisse, YN Dauphin and D Lopez-Paz (2017) mixup: Beyond Empirical Risk Minimization
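
    A minimal NumPy sketch of MixUp on a single pair of examples, with float image arrays, one-hot label vectors, and an illustrative alpha = 0.2:

    import numpy as np

    def mixup(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng()):
        """MixUp: linearly combine two images and their one-hot labels.
        lam is drawn from a Beta(alpha, alpha) distribution."""
        lam = rng.beta(alpha, alpha)
        x = lam * x1 + (1 - lam) * x2
        y = lam * y1 + (1 - lam) * y2   # e.g. 0.3 * cat + 0.7 * alpaca
        return x, y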

  16. Mixing Images: MixUp
    https://www.dlology.com/blog/how-to-do-mixup-training-from-image-files-in-keras/

  17. Mixing Images: More Variations
    - Best result: error reduced from 5.4%
    to 3.8% on CIFAR-10 and from 23.6%
    to 19.7% on CIFAR-100
    Cecilia S, Michael JD. Improved mixed-example data augmentation. ArXiv preprint. 2018.
    https://github.com/ceciliaresearch/MixedExample

  18. Mixing Images: More Variations

  19. Mixing Images
    +:
    - Does not require significant domain
    knowledge
    -:
    - Doesn’t really make sense from a human
    perspective -> difficult to explain

  20. Random Erasing
    - Pretty much like dropout, except it’s
    done on the input data
    - Designed to handle occlusion ->
    where some parts of the object are
    unclear
    - Forces the model to learn more
    descriptive features
    - Also forces the model to pay
    attention to the entire image
    Zhun Z, Liang Z, Guoliang K, Shaozi L, Yi Y. Random erasing data augmentation. ArXiv e-prints. 2017.

  21. Random Erasing
    - The best patch fill method was found
    to be random values.
    - Params: the fill method and size of
    the masks
    Zhun Z, Liang Z, Guoliang K, Shaozi L, Yi Y. Random erasing data augmentation. ArXiv e-prints. 2017.
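
    A minimal NumPy sketch of Random Erasing on a uint8 image, using a square mask and the random-value fill the paper found to work best; the area range is an illustrative choice:

    import numpy as np

    def random_erase(img, area_frac=(0.02, 0.2), rng=np.random.default_rng()):
        """Overwrite a random square region of an H x W (x C) uint8 image with random values."""
        out = img.copy()
        h, w = out.shape[:2]
        # Pick a square mask whose area is a random fraction of the image area.
        side = int(np.sqrt(rng.uniform(*area_frac) * h * w))
        side = max(1, min(side, h, w))
        top = rng.integers(0, h - side + 1)
        left = rng.integers(0, w - side + 1)
        region = out[top:top + side, left:left + side]
        out[top:top + side, left:left + side] = rng.integers(0, 256, size=region.shape)
        return out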

  22. Random Erasing
    -:
    - Might lose important info

  23. Feature Space Augmentation
    Terrance V, Graham WT. Dataset augmentation in feature space. In: Proceedings of the international conference on machine
    learning (ICML), workshop track, 2017.
    - Main challenge of input-space
    augmentation: domain expertise is
    required to ensure that the newly
    generated data respects valid
    transformations
    - It is more likely to encounter
    realistic samples in feature space
    - Approaches: noise, interpolation,
    extrapolation
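
    A minimal NumPy sketch of the three feature-space operations, applied to feature vectors taken from an encoder's hidden layer (typically same-class neighbours); the function name, the coefficient lam, and the noise scale are illustrative:

    import numpy as np

    def augment_in_feature_space(f_i, f_j, mode="extrapolate", lam=0.5,
                                 rng=np.random.default_rng()):
        """Create a new feature vector from two feature vectors f_i and f_j."""
        if mode == "interpolate":
            return f_i + lam * (f_j - f_i)   # move toward the neighbour
        if mode == "extrapolate":
            return f_i + lam * (f_i - f_j)   # move away from the neighbour
        # mode == "noise": perturb the feature vector with Gaussian noise
        return f_i + rng.normal(0.0, 0.1, size=f_i.shape)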

  24. Feature Space Augmentation

  25. Feature Space Augmentation

  26. Feature Space Augmentation
    +:
    - Domain-independent, requiring no specialized knowledge
    - Can be applied to many different types of problems
    -:
    - Difficult to interpret the vector data

  27. GAN-based Data Augmentation
    - Creating artificial images from a
    dataset that retain similar
    characteristics to the original set
    ThisCatDoesNotExist.com

  28. GAN-based Data Augmentation
    Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee. MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation. ArXiv
    preprint. 2019.
    - Built upon
    FineGAN
    - Requires bounding
    boxes for training

  29. GAN-based Data Augmentation
    - :
    - GANs themselves need a substantial
    amount of data to train, so they might
    not always be practical

  30. AutoAugment
    - Learn the best augmentation policies
    for a given dataset with
    Reinforcement Learning
    - The search space of all possible
    transformations is huge:
    - A policy consists of 5 sub-policies
    - Each sub-policy applies 2 image
    operations in sequence
    - Each of those image operations has
    two parameters: the probability of
    applying it and the magnitude of the
    operation (e.g. rotate 30 degrees in
    70% of cases)
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.
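
    A minimal sketch of what a single sub-policy looks like, using PIL operations; the operations, probabilities, magnitudes, and magnitude-to-parameter mappings below are illustrative, not a policy learned by AutoAugment:

    import random
    from PIL import ImageOps

    def rotate(img, magnitude):
        # Illustrative mapping from magnitude to degrees
        return img.rotate(magnitude * 3)

    def posterize(img, magnitude):
        # Illustrative mapping from magnitude to the number of bits kept per channel
        return ImageOps.posterize(img, 8 - int(magnitude // 2))

    # One sub-policy: two (operation, probability, magnitude) triples applied in
    # sequence, e.g. "rotate ~30 degrees in 70% of cases, then posterize in 40% of cases".
    SUB_POLICY = [(rotate, 0.7, 10),
                  (posterize, 0.4, 6)]

    def apply_sub_policy(img, sub_policy=SUB_POLICY):
        for op, prob, magnitude in sub_policy:
            if random.random() < prob:
                img = op(img, magnitude)
        return img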

  31. AutoAugment
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.

  32. AutoAugment
    ImageNet
    - 5 policies, 25 sub-policies per dataset
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.

  33. AutoAugment
    Street View House Number (SVHN)
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.

  34. AutoAugment
    - 16 operations:
    - 14 from the Python Imaging Library
    (PIL): rotating, color inverting,
    posterizing (reducing pixel bits),
    solarizing (inverting colors above a
    threshold), etc.
    - Plus Cutout and SamplePairing
    - 10 magnitudes, 11 probabilities
    - The search uses a Recurrent Neural
    Network controller trained with Proximal
    Policy Optimization
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.

  35. AutoAugment
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.

  36. AutoAugment
    - Policies learned on the ImageNet
    dataset were successful when
    transferred to the Stanford Cars and
    FGVC Aircraft image recognition
    tasks.
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.

  37. AutoAugment
    -:
    - Relatively new, hasn’t been heavily
    tested
    - Difficult & time-consuming to run the
    search (it takes $37,500 to discover
    the best policies for ImageNet) :/
    - Follow-up: Population Based Augmentation (2019)
    https://bair.berkeley.edu/blog/2019/06/07/data_aug/
    Ekin DC, Barret Z, Dandelion M, Vijay V, Quoc VL. AutoAugment: learning augmentation policies from data. ArXiv preprint.
    2018.

  38. Takeaways
    - Be careful of non label-preserving transformations

  39. Takeaways
    - Be careful of non label-preserving transformations
    - Lots of data augmentation alternatives beyond cropping & rotating

  40. Takeaways
    - Be careful of non label-preserving transformations
    - Lots of data augmentation alternatives beyond cropping & rotating
    - Data augmentation is not a silver bullet - e.g. if you’re trying to
    classify dogs but you only have bulldogs and no instances of golden
    retrievers, no method is going to automagically create golden
    retrievers for you
