Divergence Optimization for Noisy UniDA

YUI
June 10, 2021

Qing Yu, Atsushi Hashimoto and Yoshitaka Ushiku, "Divergence Optimization for Noisy Universal Domain Adaptation", CVPR'21
paper: https://arxiv.org/abs/2104.00246

Transcript

  1. Qing Yu (1,2), Atsushi Hashimoto (2), Yoshitaka Ushiku (2) ((1) The University
     of Tokyo, (2) OMRON SINIC X). Divergence Optimization for Noisy Universal
     Domain Adaptation
  2. Unsupervised Domain Adaptation (UDA)

  3. Partial UDA

  4. Open-set UDA

  5. Universal UDA (UniUDA)

  6. Our Setting for Noisy UniUDA

  7. Related Works

     Method                      | Noisy labels | Partial UDA | Open-set UDA
     DANN [Ganin+, ICML 15]      | ✖            | ✖           | ✖
     TCL [Shu+, AAAI 19]         | ✔            | ✖           | ✖
     ETN [Cao+, CVPR 19]         | ✖            | ✔           | ✖
     STA [Liu+, CVPR 19]         | ✖            | ✖           | ✔
     UAN [You+, CVPR 19]         | ✖            | ✔           | ✔
     DANCE [Saito+, NeurIPS 20]  | ✖            | ✔           | ✔
     Ours                        | ✔            | ✔           | ✔
  8. Overall Concept: two networks, CNN-1 and CNN-2, are trained on the noisy
     dataset with different parameters, so they give distinct views of each sample.
  9. Overall Concept: when the two classifiers confidently agree (e.g., both
     predict Dog/Cat as 1.0/0.0), the divergence between their outputs is zero;
     each output's entropy and the cross entropy between them are also measured.

  10. Overall Concept: when the classifiers disagree (e.g., Dog/Cat as 0.8/0.2
     vs. 0.5/0.5), the divergence between their outputs is large (a sketch of
     this divergence follows below).
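
A minimal sketch of the divergence idea on slides 9-10, assuming it is measured as the L1 distance between the two classifiers' softmax outputs (as in classifier-discrepancy methods); the paper's exact formulation may differ:

```python
import torch
import torch.nn.functional as F

def divergence(logits1: torch.Tensor, logits2: torch.Tensor) -> torch.Tensor:
    """Per-sample L1 distance between the two predicted distributions."""
    p1 = F.softmax(logits1, dim=1)
    p2 = F.softmax(logits2, dim=1)
    return (p1 - p2).abs().mean(dim=1)

# Slide 9: both classifiers confidently predict Dog -> divergence ~ 0.
# Slide 10: one classifier is undecided -> large divergence.
agree = torch.log(torch.tensor([[0.99, 0.01]]))  # ~Dog with prob 1.0
soft  = torch.log(torch.tensor([[0.50, 0.50]]))  # undecided classifier
print(divergence(agree, agree))                  # ~0 -> classifiers agree
print(divergence(agree, soft))                   # large -> disagreement
```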
  11. Our Setting for Noisy UniUDA: the labeled source data is noisy, and the
     unlabeled target data is also noisy (it contains private classes unseen
     in the source).
  12. Proposed Method. Step-1: train on noisy labeled source samples. The
     feature extractor G feeds two classifiers F1 and F2; each output incurs a
     supervised loss, and together with the divergence they form the total loss.
  13. Proposed Method. Step-1 (continued): samples having a small supervised
     loss are taken to be the clean source samples.
  14. Proposed Method. Step-1 (continued): train the classifiers with
     supervision from the clean samples only, excluding the detected noisy
     labels (a small-loss selection sketch follows below).
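
One plausible reading of Step-1's small-loss selection (slides 13-14) is a co-teaching-style filter: samples whose combined supervised loss under both classifiers is small are treated as clean. A minimal sketch; `keep_ratio` is an illustrative knob, not a value from the paper:

```python
import torch
import torch.nn.functional as F

def select_clean(logits1, logits2, labels, keep_ratio=0.8):
    """Indices of the samples with the smallest combined supervised loss."""
    loss = (F.cross_entropy(logits1, labels, reduction="none")
            + F.cross_entropy(logits2, labels, reduction="none"))
    num_keep = int(keep_ratio * labels.numel())
    return torch.argsort(loss)[:num_keep]   # small loss -> likely clean label
```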
  15. Proposed Method. Step-2: train on unlabeled target samples. The
     divergence between F1's and F2's probabilities separates the target set:
     large divergence -> target-private samples, small divergence -> common samples.
  16. Proposed Method. Step-2 (continued): align the distribution of target
     common/private samples according to the divergence: target common samples
     toward the small-divergence area, target private samples toward the
     large-divergence area (a sketch of one such separation loss follows below).
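
A hedged sketch of the Step-2 separation: target samples whose current divergence is small are pushed further toward zero (common), the rest toward a margin (private). The cutoff `cut` and `margin` are illustrative assumptions; the paper's actual objective may be formulated differently:

```python
import torch

def separation_loss(div, cut=0.5, margin=1.0):
    """div: per-sample divergence of unlabeled target samples."""
    is_common = div.detach() < cut           # small divergence -> common
    zero = div.new_zeros(())
    loss_common = div[is_common].mean() if is_common.any() else zero
    loss_private = (torch.relu(margin - div[~is_common]).mean()
                    if (~is_common).any() else zero)
    return loss_common + loss_private        # pull the two groups apart
```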
  17. Proposed Method. Step-3: train on unlabeled target samples by maximizing
     the divergence between F1's and F2's probabilities.
  18. Proposed Method. Step-3 (continued): the classifiers are updated by
     maximizing divergence so that target-private samples are detected; source
     samples are also used in this step to keep the classifiers discriminative
     (see the sketch below).
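
A hedged sketch of one Step-3 update, in the spirit of classifier-discrepancy training: G is held fixed while F1 and F2 are updated to maximize divergence on target samples, with a supervised source loss keeping them discriminative. `divergence` is the L1 sketch above; the loss weighting and optimizer setup are assumptions:

```python
import torch.nn.functional as F

def step3_update(G, F1, F2, opt_cls, x_s, y_s, x_t):
    feat_s = G(x_s).detach()     # G is not updated in this step
    feat_t = G(x_t).detach()
    loss_sup = (F.cross_entropy(F1(feat_s), y_s)
                + F.cross_entropy(F2(feat_s), y_s))
    div = divergence(F1(feat_t), F2(feat_t)).mean()
    loss = loss_sup - div        # maximize target divergence
    opt_cls.zero_grad()          # opt_cls holds only F1/F2 parameters
    loss.backward()
    opt_cls.step()
```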
  19. Proposed Method. Step-4: train on unlabeled target samples by minimizing
     the divergence for common samples, i.e., those having small divergence.
  20. Proposed Method. Step-4 (continued): align target common samples into
     the cluster of source samples (see the sketch below).
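
A hedged sketch of one Step-4 update: only the feature extractor G is updated, minimizing the divergence of target samples currently flagged as common so that they move toward the source clusters. The cutoff `cut` is an illustrative assumption:

```python
def step4_update(G, F1, F2, opt_gen, x_t, cut=0.5):
    feat_t = G(x_t)
    div = divergence(F1(feat_t), F2(feat_t))
    is_common = div.detach() < cut        # small divergence -> common sample
    if is_common.any():
        loss = div[is_common].mean()
        opt_gen.zero_grad()               # opt_gen holds only G's parameters
        loss.backward()
        opt_gen.step()
```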
  21. Inference: for each test target sample, G's features are fed to F1 and
     F2 and the divergence is compared with a threshold. Large divergence ->
     target-private sample; small divergence -> common sample, for which the
     probabilities give the class prediction (see the sketch below).
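
A minimal sketch of the inference rule on slide 21; the threshold value and the averaging of F1/F2 for the class prediction are assumptions:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def predict(G, F1, F2, x, threshold=0.5, unknown=-1):
    feat = G(x)
    logits1, logits2 = F1(feat), F2(feat)
    div = divergence(logits1, logits2)
    probs = (F.softmax(logits1, dim=1) + F.softmax(logits2, dim=1)) / 2
    pred = probs.argmax(dim=1)
    pred[div > threshold] = unknown   # large divergence -> target-private
    return pred
```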
  22. Experiments on Toy Data. (Figure: source and target samples with the
     decision areas of Classifier-1 and Classifier-2; the region where the two
     classifiers disagree has large divergence.)
  23. Experiments on Benchmarks
     • Datasets (source private / common / target private classes): Office
       (10/10/11), Office-Home (5/10/50), VisDA (3/6/3)
     • Source training labels: corrupted with pairflip (P) or symmetric (S)
       noise at levels 20% and 45% -> P20, P45, S20, S45
     (Figure: pairflip noise transition matrix, true label vs. corrupted label.)
  24. Experiments on Benchmarks (continued). (Figure: symmetric noise
     transition matrix, true label vs. corrupted label; a sketch of both noise
     models follows below.)
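
A minimal sketch of the two noise models on slides 23-24: pairflip moves a fraction of each class's labels to the next class, while symmetric noise spreads it uniformly over all other classes. Corrupting source labels at rates 0.2 and 0.45 yields the P20/P45/S20/S45 settings; the sampling helper is illustrative:

```python
import numpy as np

def noise_transition(num_classes, rate, kind):
    """Row-stochastic matrix T: T[i, j] = P(corrupted = j | true = i)."""
    T = (1.0 - rate) * np.eye(num_classes)
    if kind == "pairflip":                    # class i -> class (i + 1) mod C
        for i in range(num_classes):
            T[i, (i + 1) % num_classes] = rate
    elif kind == "symmetry":                  # class i -> any other class
        T += rate / (num_classes - 1) * (1.0 - np.eye(num_classes))
    return T

def corrupt(labels, T, rng=np.random):
    """Sample a corrupted label for each true label according to T."""
    return np.array([rng.choice(len(T), p=T[y]) for y in labels])
```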
  25. Results: test accuracy over the common classes plus the target-private
     class -> state-of-the-art performance in most tasks.
  26. Results: the probability density function of the divergence for common
     and target-private samples shows that the two groups are well separated.
     (Figure: density vs. divergence.)
  27. Conclusion
     • We proposed divergence optimization for Noisy UniUDA.
     • We used two classifiers to find clean source samples, reject target
       private classes, and find the important target samples that contribute
       most to the model's adaptation.
     • Our method achieved high performance on a diverse set of benchmarks.