SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

Scatter Lab Inc.
September 11, 2020

Transcript

  1. SimCLR: A Simple Framework for Contrastive Learning of Visual Representation

    Joohong Lee (ML Research Scientist @ Pingpong)
  2. A Simple Framework for Contrastive Learning of Visual Representation Overview

    • “A Simple Framework for Contrastive Learning of Visual Representations” • Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton • Google Research & Brain • ICML 2020 • Proposes a framework for contrastive learning and analyzes the meaning and contribution of each of its components • Achieves state-of-the-art performance on ImageNet (linear evaluation)
  3. 1. Introduction A Simple Framework for Contrastive Learning of Visual Representation (Chen et al., 2020)
  4. Self-Supervised Learning (SSL) 1. Introduction

    • One of the hottest keywords these days (especially in computer vision) • Related keywords: Unsupervised Learning, Representation (Embedding) Learning, Contrastive Learning, Augmentation • Trains on a pretext task with an objective function, just like supervised learning • Pretext task: a predictive task using inputs and labels constructed from unlabeled data
  5. Pretext Task 1. Introduction

    (a) Relative Patch Prediction (Doersch et al., 2015) (b) Jigsaw Puzzle (Noroozi et al., 2016) (c) Colorization (Larsson et al., 2017) (d) Rotation Prediction (Gidaris et al., 2018)
  6. Pretext Task 1. Introduction

    (a) Masked Language Modeling (b) Next Sentence Prediction (c) Language Modeling (auto-regressive)
  7. Supervised / Unsupervised / Self-supervised 1. Introduction

    • Supervised Learning: a human builds labels for each input according to the target task, and the model learns from them (e.g. text classification) • Unsupervised Learning: no labels exist on the data and none are used for training (e.g. clustering, auto-encoders, GANs) • Self-supervised Learning: inputs and labels are generated automatically from unlabeled data, and the model is then trained as in supervised learning
  8. Contrastive Learning 1. Introduction

    • The problem of telling whether an example pair is similar or not • Learns representations so that the distance between a pair in latent space is small for similar examples and large for dissimilar ones • A kind of metric learning: the focus is on learning representations that capture the characteristics of examples and the relations between them
  9. Key Points of Contrastive Learning 1. Introduction

    1. Example of similar and dissimilar images • How do we construct similar and dissimilar example pairs?
  12. Key Points of Contrastive Learning 1. Introduction

    1. Example of similar and dissimilar images • How do we construct similar and dissimilar example pairs? 2. Ability to know what an image represents • How do we build good representations? 3. Ability to quantify if two images are similar • How do we measure the degree of similarity effectively?
  13. 2. SimCLR A Simple Framework for Contrastive Learning of Visual Representation (Chen et al., 2020)
  14. 2. SimCLR SimCLR

  15. 2. SimCLR SimCLR - 1) Data Augmentation

    • The following three augmentation techniques are applied randomly: • Random Crop (with flip and resize) • Color Distortion • Gaussian Blur
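The two-view augmentation pipeline (random crop with flip and resize, color distortion, Gaussian blur) can be sketched with simplified NumPy stand-ins; the paper uses proper image transforms, so the crop/jitter/blur below are toy approximations for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop_resize(img, out_size):
    """Random crop (with horizontal flip), then nearest-neighbor resize."""
    h, w, _ = img.shape
    ch = rng.integers(h // 2, h + 1)          # crop height
    cw = rng.integers(w // 2, w + 1)          # crop width
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    if rng.random() < 0.5:                    # random horizontal flip
        crop = crop[:, ::-1]
    ys = np.arange(out_size) * ch // out_size # nearest-neighbor index maps
    xs = np.arange(out_size) * cw // out_size
    return crop[ys][:, xs]

def color_distortion(img, strength=0.5):
    """Crude color jitter: random per-channel scaling plus a brightness shift."""
    scale = 1.0 + strength * rng.uniform(-1, 1, size=3)
    shift = strength * rng.uniform(-0.2, 0.2)
    return np.clip(img * scale + shift, 0.0, 1.0)

def gaussian_blur(img, k=3):
    """Box blur via separable 1D averaging (a stand-in for a Gaussian kernel)."""
    kernel = np.ones(k) / k
    for axis in (0, 1):
        img = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, img)
    return img

def augment(img, out_size=32):
    x = random_crop_resize(img, out_size)
    x = color_distortion(x)
    if rng.random() < 0.5:                    # blur applied stochastically
        x = gaussian_blur(x)
    return x

img = rng.uniform(size=(64, 64, 3))           # toy image with values in [0, 1]
view1, view2 = augment(img), augment(img)     # two correlated views of one image
```

Applying `augment` twice to the same image produces the correlated pair that the contrastive loss later treats as a positive.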
  16. 2. SimCLR SimCLR - 2) Encoder

    • f( ⋅ ) is the encoder network that produces the representation h • A ResNet is used as the encoder (other architectures are also possible) • h = f(x̃) = ResNet(x̃), where h ∈ ℝ^d
  17. 2. SimCLR SimCLR - 3) Projection

    • g( ⋅ ) is the projection network that maps the representation h into the latent space • A 2-layer nonlinear MLP (fully connected) is used • z = g(h) = W^(2) σ(W^(1) h)
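The projection head is small enough to write out directly. A minimal NumPy forward pass of the 2-layer MLP z = W^(2) σ(W^(1) h), assuming randomly initialized (untrained) weights and the ResNet-50 dimensions commonly used with SimCLR (d = 2048, projected to 128):

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_proj = 2048, 128                    # feature dim / projection dim (assumed)

# Hypothetical randomly-initialized weights; in practice these are learned.
W1 = rng.normal(scale=0.01, size=(d, d))
W2 = rng.normal(scale=0.01, size=(d_proj, d))

def projection_head(h):
    """z = g(h) = W2 @ ReLU(W1 @ h): the 2-layer nonlinear MLP."""
    return W2 @ np.maximum(W1 @ h, 0.0)  # sigma is a ReLU nonlinearity

h = rng.normal(size=d)                   # a representation from the encoder f(.)
z = projection_head(h)                   # point in the contrastive latent space
```

The contrastive loss is computed on z, while (as discussed later) the pre-projection h is what gets reused downstream.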
  18. 2. SimCLR SimCLR - 4) Loss

    • Trains with a cross-entropy objective over in-batch negatives • Precisely, the Normalized Temperature-scaled Cross Entropy (NT-Xent) • Augmenting each of the N images in a batch in 2 different ways yields 2N images (the original images themselves are not used) • A pair (zi, zj) of images coming from the same original image is the positive; pairs (zi, zk) with images from different origins are the negatives • That is, the model learns to pick out the positive from the 2N − 1 candidate pairs
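The loss described above can be sketched as a NumPy function. The pairing convention (views 2k and 2k+1 belong to image k) and the toy embeddings are assumptions for illustration:

```python
import numpy as np

def nt_xent(z, tau=0.1):
    """NT-Xent over 2N embeddings, where rows 2k and 2k+1 of z are the
    two augmented views of image k."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize -> cosine
    sim = z @ z.T / tau                               # similarities / temperature
    np.fill_diagonal(sim, -np.inf)                    # a view never matches itself
    pos = np.arange(len(z)) ^ 1                       # partner index: (0,1),(2,3),...
    # cross entropy of picking the positive among the 2N - 1 candidates
    logits = sim - sim.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), pos].mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))   # N = 4 images -> 2N = 8 embeddings (toy data)
loss = nt_xent(z)              # non-negative scalar
```

Each row of `sim` is a 2N-way softmax with the self-similarity masked out, which is exactly the "find the positive among 2N − 1 pairs" formulation.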
  19. 2. SimCLR SimCLR - Summary (figure: views of the same image attract; views of different images repel)

  20. 2. SimCLR SimCLR - Algorithm

  21. 2. SimCLR Augmentation SimCLR - Algorithm

  22. 2. SimCLR Encoding Representation SimCLR - Algorithm

  23. 2. SimCLR Nonlinear Projection SimCLR - Algorithm

  24. 2. SimCLR Similarity & Loss SimCLR - Algorithm

  25. 2. SimCLR Training Details

    1. Batch Size • Experimented with batch sizes ranging from 256 to 8192 • With a batch size of N, each positive pair has 2(N − 1) in-batch negatives — up to 16,382 at N = 8192 2. LARS Optimizer • A larger batch size requires a larger learning rate, which destabilizes training (Goyal et al., 2017) • The LARS optimizer is used for stable training at large batch sizes
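The negative count follows directly from the batch construction and can be checked with one line of arithmetic:

```python
# Each of the N images yields 2 views; for a given positive pair, every view
# of the other N - 1 images serves as a negative -> 2(N - 1) negatives.
def negatives_per_pair(batch_size):
    return 2 * (batch_size - 1)

print(negatives_per_pair(8192))  # -> 16382
```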
  26. 2. SimCLR Training Details

    3. Global Batch Normalization • In distributed training, the BN mean and variance are aggregated locally per device • Since a positive pair is always computed on the same device, the model can exploit these local statistics to improve its predictions without learning better representations (a local information leak)
    → Aggregate the mean and variance globally across all devices
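The fix can be illustrated in NumPy: instead of each shard normalizing with its own statistics, one shared mean and variance is derived from all shards (a simplified two-pass version; real implementations synchronize running statistics across devices):

```python
import numpy as np

rng = np.random.default_rng(0)
# Activations for the same layer, split across 4 hypothetical devices.
# Each shard has a different mean, mimicking device-local statistics.
shards = [rng.normal(loc=i, size=(64, 8)) for i in range(4)]

# Local BN would use these per-device means (the leaky variant):
local_means = [s.mean(axis=0) for s in shards]

# Global BN: derive one shared mean and variance from all devices.
n = sum(s.shape[0] for s in shards)
mean = sum(s.sum(axis=0) for s in shards) / n
var = sum(((s - mean) ** 2).sum(axis=0) for s in shards) / n

# Every device normalizes with the same global statistics.
normalized = [(s - mean) / np.sqrt(var + 1e-5) for s in shards]
```

With the shared statistics, positives on one device can no longer be distinguished from negatives on another simply by their device-local activation scale.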
  27. 2. SimCLR SimCLR for Downstream tasks

  28. 3. Experiments A Simple Framework for Contrastive Learning of Visual Representation (Chen et al., 2020)
  29. 3. Experiments Dataset & Evaluation Protocols

    • Dataset: ImageNet ILSVRC-2012 (Russakovsky et al., 2015) • Evaluation Protocols 1. Linear Evaluation • A linear classifier trained on the learned features (frozen encoder) 2. Semi-supervised Learning • Fine-tune the model on a small fraction of the labels 3. Transfer Learning • Fine-tune on other datasets
  30. 3. Experiments 1. Linear Evaluation

  31. 3. Experiments 2. Semi-supervised Learning

    • Sample 1% or 10% of the labeled ILSVRC-2012 training data (in a class-balanced way) and fine-tune on it • The 1% setting corresponds to roughly 12.8 labeled images per class • Improves about 10% over the previous state of the art • Notably, fine-tuning on 100% of the labels outperforms training from scratch (ResNet (4x): 78.4% / 94.2% → 80.4% / 95.4%)
  32. 3. Experiments 3. Transfer Learning

  33. 4. Discussion A Simple Framework for Contrastive Learning of Visual Representation (Chen et al., 2020)
  34. 4. Discussion Large Models

    • Unsupervised contrastive learning benefits (more) from bigger models • Larger models yield better performance • As model size grows, unsupervised learning improves faster than supervised learning • That is, unsupervised learning gains relatively more from scale
  35. 4. Discussion Nonlinear Projection

    • Which projection head g( ⋅ ) works best? • g( ⋅ ) was tested in three variants: • Identity mapping • Linear projection • Nonlinear projection with one additional hidden layer • The nonlinear projection is about 3% better than the linear one and more than 10% better than no projection (see Figure 8)
  36. 4. Discussion Nonlinear Projection

    • As the representation used for downstream tasks, which is better: h (before the projection) or z = g(h) (after it)? • h is far better than z = g(h) (see Table 3 & Figure B.4) • Why? • z = g(h) is trained to be invariant to data transformation • Under the contrastive loss, transformed images (color changes, rotations, etc.) are trained to look like the same image • But such discarded information can still be useful for downstream tasks
  37. 4. Discussion Normalized Temperature-scaled Cross Entropy (NT-Xent)

    • ℓ2 normalization (i.e. cosine similarity) along with temperature effectively weights different examples, and an appropriate temperature can help the model learn from hard negatives • Compared with cross entropy, other objectives cannot express the relative hardness among negatives
  38. 4. Discussion Normalized Temperature-scaled Cross Entropy (NT-Xent)

    • How should the similarity function and the temperature term used for scaling be set? • Cosine similarity vs. dot product • Cosine similarity with τ = 0.1 works best • Contrastive accuracy is highest with the dot product, but linear evaluation is actually better with cosine similarity
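The difference between the two similarity choices is easy to make concrete with toy vectors (illustrative only):

```python
import numpy as np

def dot_sim(u, v):
    return u @ v

def cosine_sim(u, v, tau=0.1):
    # L2-normalize both vectors, then scale by the temperature tau.
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return (u @ v) / tau

u = np.array([3.0, 0.0])
v = np.array([300.0, 0.0])   # same direction, much larger magnitude
w = np.array([0.0, 3.0])     # orthogonal direction

# The dot product conflates direction with magnitude; cosine does not.
print(dot_sim(u, v), cosine_sim(u, v))  # 900.0 vs 10.0
print(dot_sim(u, w), cosine_sim(u, w))  # 0.0 vs 0.0
```

With cosine similarity, every pair lands in [−1/τ, 1/τ] regardless of embedding norms, so the temperature alone controls how sharply the softmax concentrates on hard negatives.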
  39. 4. Discussion Batch Size and Training Time

    • When training for only a few epochs, performance improves sharply as the batch size grows • Unlike supervised learning, contrastive learning converges more effectively with larger batches because they provide more negative examples
  40. 5. Conclusion A Simple Framework for Contrastive Learning of Visual Representation (Chen et al., 2020)
  41. Conclusion 5. Conclusion

    • Proposes SimCLR, a simple yet effective framework for contrastive learning • Substantially improves performance over the previous state of the art • Provides many insights by experimentally analyzing the key ingredients of representation learning • Presenter's thoughts: • The overall structure, the nonlinear projection, NT-Xent, etc. could be applied to RRM and SSM training • It would be nice to have an evaluation method for representations like linear evaluation (see Table 5) • (Somewhat speculative) There might be a way to build a similarity model with self-supervision, with a little thought
  42. Thank you! ✌ If you have any further questions or anything you are curious about,

    feel free to reach out anytime! Joohong Lee (ML Research Scientist @ Pingpong) Email. joohong@scatterlab.co.kr Facebook. @roomylee LinkedIn. @roomylee
  43. Reference A Simple Framework for Contrastive Learning of Visual Representation (Chen et al., 2020)

    • Papers
    • (SimCLR) A Simple Framework for Contrastive Learning of Visual Representations: https://arxiv.org/abs/2002.05709
    • Official Repository: https://github.com/google-research/simclr
    • (SimCLR v2) Big Self-Supervised Models are Strong Semi-Supervised Learners: https://arxiv.org/abs/2006.10029
    • (MoCo) Momentum Contrast for Unsupervised Visual Representation Learning: https://arxiv.org/abs/1911.05722
    • (MoCo v2) Improved Baselines with Momentum Contrastive Learning: https://arxiv.org/abs/2003.04297
    • Other Materials
    • SimCLR Slide by Google Brain: https://docs.google.com/presentation/d/1ccddJFD_j3p3h0TCqSV9ajSi2y1yOfh0-lJoK29ircs/edit?usp=sharing
    • The Illustrated SimCLR Framework: https://amitness.com/2020/03/illustrated-simclr/
    • PR-231 Video: https://www.youtube.com/watch?v=FWhM3juUM6s