Divergence Optimization for Noisy UniDA

YUI
June 10, 2021

Qing Yu, Atsushi Hashimoto and Yoshitaka Ushiku, "Divergence Optimization for Noisy Universal Domain Adaptation", CVPR'21
paper: https://arxiv.org/abs/2104.00246

Transcript

  1. Qing Yu (1,2), Atsushi Hashimoto (2), Yoshitaka Ushiku (2) ((1) The University
     of Tokyo, (2) OMRON SINIC X). Divergence Optimization for Noisy Universal
     Domain Adaptation
  2. Unsupervised Domain Adaptation (UDA)

  3. Partial UDA

  4. Open-set UDA

  5. Universal UDA (UniUDA)

  6. Our Setting for Noisy UniUDA

  7. Related Works

     Method                      | Noisy labels | Partial UDA | Open-set UDA
     DANN [Ganin+, ICML 15]      | ✖            | ✖           | ✖
     TCL [Shu+, AAAI 19]         | ✔            | ✖           | ✖
     ETN [Cao+, CVPR 19]         | ✖            | ✔           | ✖
     STA [Liu+, CVPR 19]         | ✖            | ✖           | ✔
     UAN [You+, CVPR 19]         | ✖            | ✔           | ✔
     DANCE [Saito+, NeurIPS 20]  | ✖            | ✔           | ✔
     Ours                        | ✔            | ✔           | ✔
  8. Overall Concept: two networks, CNN-1 and CNN-2, are trained on the noisy
     dataset with different parameters, so they give distinct views of each sample.
  9. Overall Concept: when the two classifiers confidently agree (e.g., both
     predict Dog/Cat as 1.0/0.0), the divergence between their outputs is zero;
     each output's entropy and the cross entropy between them are also measured.

  10. Overall Concept: when the classifiers disagree (e.g., Dog/Cat as 0.8/0.2
     vs. 0.5/0.5), the divergence between their outputs is large (a sketch of
     this divergence follows below).
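
A minimal sketch of the divergence idea on slides 9-10, assuming it is measured as the L1 distance between the two classifiers' softmax outputs (as in classifier-discrepancy methods); the paper's exact formulation may differ:

```python
import torch
import torch.nn.functional as F

def divergence(logits1: torch.Tensor, logits2: torch.Tensor) -> torch.Tensor:
    """Per-sample L1 distance between the two predicted distributions."""
    p1 = F.softmax(logits1, dim=1)
    p2 = F.softmax(logits2, dim=1)
    return (p1 - p2).abs().mean(dim=1)

# Slide 9: both classifiers confidently predict Dog -> divergence ~ 0.
# Slide 10: one classifier is undecided -> large divergence.
agree = torch.log(torch.tensor([[0.99, 0.01]]))  # ~Dog with prob 1.0
soft  = torch.log(torch.tensor([[0.50, 0.50]]))  # undecided classifier
print(divergence(agree, agree))                  # ~0 -> classifiers agree
print(divergence(agree, soft))                   # large -> disagreement
```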
  11. Our Setting for Noisy UniUDA: the labeled source data is noisy, and the
     unlabeled target data is also noisy (it contains private classes unseen
     in the source).
  12. Proposed Method. Step-1: train on noisy labeled source samples. The
     feature extractor G feeds two classifiers F1 and F2; each output incurs a
     supervised loss, and together with the divergence they form the total loss.
  13. Proposed Method. Step-1 (continued): samples having a small supervised
     loss are taken to be the clean source samples.
  14. Proposed Method. Step-1 (continued): train the classifiers with
     supervision from the clean samples only, excluding the detected noisy
     labels (a small-loss selection sketch follows below).
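
One plausible reading of Step-1's small-loss selection (slides 13-14) is a co-teaching-style filter: samples whose combined supervised loss under both classifiers is small are treated as clean. A minimal sketch; `keep_ratio` is an illustrative knob, not a value from the paper:

```python
import torch
import torch.nn.functional as F

def select_clean(logits1, logits2, labels, keep_ratio=0.8):
    """Indices of the samples with the smallest combined supervised loss."""
    loss = (F.cross_entropy(logits1, labels, reduction="none")
            + F.cross_entropy(logits2, labels, reduction="none"))
    num_keep = int(keep_ratio * labels.numel())
    return torch.argsort(loss)[:num_keep]   # small loss -> likely clean label
```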
  15. Proposed Method. Step-2: train on unlabeled target samples. The
     divergence between F1's and F2's probabilities separates the target set:
     large divergence -> target-private samples, small divergence -> common samples.
  16. Proposed Method. Step-2 (continued): align the distribution of target
     common/private samples according to the divergence: target common samples
     toward the small-divergence area, target private samples toward the
     large-divergence area (a sketch of one such separation loss follows below).
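
A hedged sketch of the Step-2 separation: target samples whose current divergence is small are pushed further toward zero (common), the rest toward a margin (private). The cutoff `cut` and `margin` are illustrative assumptions; the paper's actual objective may be formulated differently:

```python
import torch

def separation_loss(div, cut=0.5, margin=1.0):
    """div: per-sample divergence of unlabeled target samples."""
    is_common = div.detach() < cut           # small divergence -> common
    zero = div.new_zeros(())
    loss_common = div[is_common].mean() if is_common.any() else zero
    loss_private = (torch.relu(margin - div[~is_common]).mean()
                    if (~is_common).any() else zero)
    return loss_common + loss_private        # pull the two groups apart
```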
  17. Proposed Method. Step-3: train on unlabeled target samples by maximizing
     the divergence between F1's and F2's probabilities.
  18. Proposed Method. Step-3 (continued): the classifiers are updated by
     maximizing divergence so that target-private samples are detected; source
     samples are also used in this step to keep the classifiers discriminative
     (see the sketch below).
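
A hedged sketch of one Step-3 update, in the spirit of classifier-discrepancy training: G is held fixed while F1 and F2 are updated to maximize divergence on target samples, with a supervised source loss keeping them discriminative. `divergence` is the L1 sketch above; the loss weighting and optimizer setup are assumptions:

```python
import torch.nn.functional as F

def step3_update(G, F1, F2, opt_cls, x_s, y_s, x_t):
    feat_s = G(x_s).detach()     # G is not updated in this step
    feat_t = G(x_t).detach()
    loss_sup = (F.cross_entropy(F1(feat_s), y_s)
                + F.cross_entropy(F2(feat_s), y_s))
    div = divergence(F1(feat_t), F2(feat_t)).mean()
    loss = loss_sup - div        # maximize target divergence
    opt_cls.zero_grad()          # opt_cls holds only F1/F2 parameters
    loss.backward()
    opt_cls.step()
```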
  19. Proposed Method. Step-4: train on unlabeled target samples by minimizing
     the divergence for common samples, i.e., those having small divergence.
  20. Proposed Method. Step-4 (continued): align target common samples into
     the cluster of source samples (see the sketch below).
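
A hedged sketch of one Step-4 update: only the feature extractor G is updated, minimizing the divergence of target samples currently flagged as common so that they move toward the source clusters. The cutoff `cut` is an illustrative assumption:

```python
def step4_update(G, F1, F2, opt_gen, x_t, cut=0.5):
    feat_t = G(x_t)
    div = divergence(F1(feat_t), F2(feat_t))
    is_common = div.detach() < cut        # small divergence -> common sample
    if is_common.any():
        loss = div[is_common].mean()
        opt_gen.zero_grad()               # opt_gen holds only G's parameters
        loss.backward()
        opt_gen.step()
```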
  21. Inference: for each test target sample, G's features are fed to F1 and
     F2 and the divergence is compared with a threshold. Large divergence ->
     target-private sample; small divergence -> common sample, for which the
     probabilities give the class prediction (see the sketch below).
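
A minimal sketch of the inference rule on slide 21; the threshold value and the averaging of F1/F2 for the class prediction are assumptions:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def predict(G, F1, F2, x, threshold=0.5, unknown=-1):
    feat = G(x)
    logits1, logits2 = F1(feat), F2(feat)
    div = divergence(logits1, logits2)
    probs = (F.softmax(logits1, dim=1) + F.softmax(logits2, dim=1)) / 2
    pred = probs.argmax(dim=1)
    pred[div > threshold] = unknown   # large divergence -> target-private
    return pred
```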
  22. Experiments on Toy Data. (Figure: source and target samples with the
     decision areas of Classifier-1 and Classifier-2; the region where the two
     classifiers disagree has large divergence.)
  23. Experiments on Benchmarks
     • Datasets (source private / common / target private classes): Office
       (10/10/11), Office-Home (5/10/50), VisDA (3/6/3)
     • Source training labels: corrupted with pairflip (P) or symmetric (S)
       noise at levels 20% and 45% -> P20, P45, S20, S45
     (Figure: pairflip noise transition matrix, true label vs. corrupted label.)
  24. Experiments on Benchmarks (continued). (Figure: symmetric noise
     transition matrix, true label vs. corrupted label; a sketch of both noise
     models follows below.)
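
A minimal sketch of the two noise models on slides 23-24: pairflip moves a fraction of each class's labels to the next class, while symmetric noise spreads it uniformly over all other classes. Corrupting source labels at rates 0.2 and 0.45 yields the P20/P45/S20/S45 settings; the sampling helper is illustrative:

```python
import numpy as np

def noise_transition(num_classes, rate, kind):
    """Row-stochastic matrix T: T[i, j] = P(corrupted = j | true = i)."""
    T = (1.0 - rate) * np.eye(num_classes)
    if kind == "pairflip":                    # class i -> class (i + 1) mod C
        for i in range(num_classes):
            T[i, (i + 1) % num_classes] = rate
    elif kind == "symmetry":                  # class i -> any other class
        T += rate / (num_classes - 1) * (1.0 - np.eye(num_classes))
    return T

def corrupt(labels, T, rng=np.random):
    """Sample a corrupted label for each true label according to T."""
    return np.array([rng.choice(len(T), p=T[y]) for y in labels])
```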
  25. Results: test accuracy over the common classes plus the target-private
     class -> state-of-the-art performance in most tasks.
  26. Results: the probability density function of the divergence for common
     and target-private samples shows that the two groups are well separated.
     (Figure: density vs. divergence.)
  27. Conclusion
     • We proposed divergence optimization for Noisy UniUDA.
     • We used two classifiers to find clean source samples, reject target
       private classes, and find the important target samples that contribute
       most to the model's adaptation.
     • Our method achieved high performance on a diverse set of benchmarks.