Introduction to the DeepCluster Paper

Yuki Ishikawa
August 08, 2018

An explanatory deck on the Facebook AI Research paper "Deep Clustering for Unsupervised Learning of Visual Features"
https://arxiv.org/abs/1807.05520

Transcript

  1. Deep Clustering for Unsupervised Learning of Visual Features
    2018.08.08
    (Okinawa) AI-Related Paper Reading Group #4

  2. @hoto17296
    • Churadata Inc. (ちゅらデータ株式会社)
    • Web engineer / infrastructure engineer / data analyst

  3. This talk's paper:
    Deep Clustering for Unsupervised Learning of Visual Features
    • https://arxiv.org/abs/1807.05520
    • Facebook AI Research
    • Accepted at ECCV 2018

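  4. Overview
    • A method for clustering images with a CNN
    • The CNN's outputs are clustered with k-means, the resulting cluster assignments are treated as "pseudo-labels", and the network weights are updated against them
    • It achieved very good performance
      • Outperforms other algorithms in the Pascal VOC evaluation
    • It is robust
      • Works when the dataset is changed (ImageNet → YFCC100M)
      • Works when the network architecture is changed (AlexNet → VGG16)
      • Works with clustering algorithms other than k-means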

  5. 1. Background

  6. ImageNet
    • A (very famous) image dataset
    • More than 1 million images
    • Labeled with 1,000 classes
    • Widely used for purposes such as evaluating image classification algorithms

  7. Problems with ImageNet
    • It has "only" 1 million images
    • Some of the human-assigned labels are wrong

    The recent plateau in image classification model performance is thought to be caused by the dataset

  8. Solving ImageNet's problems
    • An internet-scale image dataset
    • No human labeling

    The goal is to achieve this with unsupervised learning

  9. 2. Prerequisites

  10. (Supervised | Unsupervised) Learning
    • Supervised Learning
      • The training data comes with ground-truth labels
      • Classification, regression, etc.
    • Unsupervised Learning
      • The training data has no ground-truth labels
      • Clustering
      • AutoEncoder

  11. Self-Supervised Learning
    • A kind of unsupervised learning (citation needed)
    • Prepare "pseudo-labels" by some means and train as if they were ground-truth labels

  12. 3. Method

  13. Flow of one epoch
    1. Forward the inputs through the CNN
    2. Compress the CNN outputs with PCA
    3. Cluster them with k-means
    4. Compute the loss using the clustering result as "pseudo-labels"
    5. Update the network weights
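
The five steps of one epoch can be sketched in NumPy as follows. This is only a toy illustration, not the authors' implementation: the "CNN" is a single linear layer with ReLU, the data is random, and the helper names (`forward`, `pca`, `kmeans`, `run_epoch`), the PCA dimension, the learning rate and the freshly initialized classifier head are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)


def forward(x, w):
    """Stand-in for the CNN (assumption): one linear layer followed by ReLU."""
    return np.maximum(x @ w, 0.0)


def pca(feats, dim):
    """Project centered features onto their top `dim` principal axes."""
    centered = feats - feats.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T


def kmeans(feats, k, iters=10):
    """Plain Lloyd's k-means; returns one cluster index per row."""
    centroids = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        dists = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # empty clusters keep their old centroid
                centroids[j] = feats[labels == j].mean(axis=0)
    return labels


def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def run_epoch(x, w, k, lr=0.5, steps=20):
    # Steps 1-3: forward, PCA, k-means -> pseudo-labels for this epoch.
    pseudo = kmeans(pca(forward(x, w), dim=4), k)
    # Cluster indices are arbitrary, so this sketch starts from a fresh head.
    head = rng.normal(scale=0.01, size=(w.shape[1], k))
    rows = np.arange(len(x))
    losses = []
    for _ in range(steps):
        feats = forward(x, w)
        probs = softmax(feats @ head)
        # Step 4: cross-entropy against the pseudo-labels.
        losses.append(-np.log(probs[rows, pseudo] + 1e-12).mean())
        # Step 5: gradient step on both the head and the "CNN" weights.
        grad_logits = probs
        grad_logits[rows, pseudo] -= 1.0
        grad_logits /= len(x)
        grad_feats = (grad_logits @ head.T) * (feats > 0)
        head -= lr * (feats.T @ grad_logits)
        w = w - lr * (x.T @ grad_feats)
    return w, pseudo, losses


x = rng.normal(size=(200, 16))            # toy "images"
w0 = rng.normal(scale=0.1, size=(16, 8))  # untrained weights
w1, pseudo, losses = run_epoch(x, w0, k=5)
print(losses[0], losses[-1])
```

Calling `run_epoch` repeatedly with the returned weights would recompute the pseudo-labels from the updated features each time, which is the alternation the slide describes.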

  14. Question
    The network has not been trained at all in its initial state, so how could it possibly cluster well???

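  15. Apparently it can
    "The good performance of random convnets is intimately tied to their convolutional structure which gives a strong prior on the input signal"
    In other words: even a random (untrained) CNN can perform well, and this is closely tied to its convolutional structure, which imposes a strong prior on the input signal.
    ❓❓❓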

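  16. Another study [26]
    • An AlexNet with random parameters was used to classify the ImageNet dataset
    • If its outputs were random too, the expected accuracy would be 0.1% (ImageNet is a 1,000-class classification task)
    • In practice it reached 12% accuracy, far above that expectation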
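
The arithmetic behind these numbers, as a quick check. The 12% figure is the one quoted on the slide from [26]; the variable names are just for illustration.

```python
num_classes = 1000
chance_accuracy = 1 / num_classes   # uniform random guessing over 1,000 classes
random_alexnet_accuracy = 0.12      # value quoted on the slide from [26]
ratio = random_alexnet_accuracy / chance_accuracy

print(f"chance accuracy: {chance_accuracy:.1%}")
print(f"random AlexNet is about {ratio:.0f}x better than chance")
```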

  17. Probably what this means
    Even with random parameters, the structure of a CNN itself has the power to "output values that look kind of plausible"

  18. 4. Miscellaneous

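  19. Avoiding a few problems
    • The problem of everything ending up in a single cluster
      • Solved by moving the centroid of any empty cluster and recomputing the clusters
    • The problem of imbalanced pseudo-label counts
      • The same problem that occurs in supervised learning when the label counts are imbalanced
      • Solved by training on samples drawn uniformly over the pseudo-labels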
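
Both workarounds can be sketched in plain Python. The function names, the `noise` size and the toy data are assumptions for illustration. Copying a perturbed centroid from a randomly chosen non-empty cluster is one possible reading of "moving the centroid"; the slide does not spell out the exact rule.

```python
import random
from collections import defaultdict


def sample_uniform_over_clusters(pseudo_labels, num_samples, seed=0):
    """Draw a cluster uniformly at random, then one image inside that cluster."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for index, label in enumerate(pseudo_labels):
        members[label].append(index)
    clusters = sorted(members)
    return [rng.choice(members[rng.choice(clusters)]) for _ in range(num_samples)]


def refill_empty_clusters(centroids, assignments, noise=1e-3, seed=0):
    """Give each empty cluster a slightly perturbed copy of a non-empty centroid."""
    rng = random.Random(seed)
    counts = [0] * len(centroids)
    for cluster in assignments:
        counts[cluster] += 1
    non_empty = [j for j, count in enumerate(counts) if count > 0]
    fixed = [list(c) for c in centroids]
    for j, count in enumerate(counts):
        if count == 0:
            donor = rng.choice(non_empty)
            fixed[j] = [v + rng.uniform(-noise, noise) for v in centroids[donor]]
    return fixed


# 95 images in cluster 0 and 5 in cluster 1: a heavily imbalanced pseudo-labeling.
labels = [0] * 95 + [1] * 5
batch = sample_uniform_over_clusters(labels, num_samples=10_000)
share_of_cluster_1 = sum(labels[i] == 1 for i in batch) / len(batch)
print(round(share_of_cluster_1, 2))  # close to 0.5 instead of 0.05

centroids = [[0.0, 0.0], [5.0, 5.0], [9.0, 9.0]]
fixed = refill_empty_clusters(centroids, assignments=[0, 0, 1])  # cluster 2 is empty
```

After `refill_empty_clusters`, the points would still need to be reassigned to their nearest centroid, which is the "recompute the clusters" part of the slide.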

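  20. Implementation details (1/2)
    • A standard AlexNet was used as the CNN
      • The Local Response Normalization layers were replaced with Batch Normalization layers
    • Raw color information is hard to handle
      • A linear transformation based on Sobel filters (note: edge extraction) removes color and enhances contrast
    • ImageNet images were fed in after data augmentation
    • The mini-batch size was 256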
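
A rough sketch of what a Sobel-based preprocessing step could look like. The slide only says "a linear transformation based on Sobel filters", so the details here are assumptions: averaging the RGB channels to drop color, and the helper names `filter2d` and `sobel_preprocess`. The two 3x3 kernels are the standard Sobel kernels.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)  # responds to vertical edges
SOBEL_Y = SOBEL_X.T                            # responds to horizontal edges


def filter2d(image, kernel):
    """Valid-mode 2-D cross-correlation, written with explicit loops for clarity."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out


def sobel_preprocess(rgb):
    """Drop color by averaging channels (assumption), then take x/y gradients."""
    gray = rgb.mean(axis=2)
    return np.stack([filter2d(gray, SOBEL_X), filter2d(gray, SOBEL_Y)])


# Toy image: black on the left, white on the right (one vertical edge).
rgb = np.zeros((5, 6, 3))
rgb[:, 3:, :] = 1.0
edges = sobel_preprocess(rgb)
print(edges[0])  # x-gradient: non-zero only around the edge
print(edges[1])  # y-gradient: all zeros, there is no horizontal edge
```

Both steps are linear in the pixel values, so the whole preprocessing stays a fixed linear transformation of the input, as the slide describes.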

  21. Implementation details (2/2)
    • Training for 500 epochs took 12 days on a P100 GPU
    • One third of the total run time is spent in k-means
      • All the data must be forwarded before clustering, so this inevitably takes time

  22. 5. Experiments and Discussion

  23. Supplement: NMI
    • Normalized Mutual Information
    • Expresses how similar one clustering result A is to another clustering result B
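
A minimal pure-Python sketch of NMI, assuming the geometric-mean normalization NMI(A; B) = I(A; B) / sqrt(H(A) H(B)). Other normalizations exist, and the function names are just for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(A) of a labeling, in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information I(A; B) between two labelings of the same items."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """I(A; B) / sqrt(H(A) * H(B)); defined as 0.0 if either labeling is constant."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        return 0.0
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


same = nmi([0, 0, 1, 1, 2, 2], [5, 5, 3, 3, 9, 9])  # same grouping, renamed clusters
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])         # statistically independent
print(same, unrelated)
```

Because NMI only looks at how items are grouped, renaming the cluster indices does not change the score. That matters here, since k-means cluster indices carry no meaning from one epoch to the next.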

  24. Comparison with ImageNet labels
    • How the NMI between DeepCluster's clustering result and the ImageNet labels evolves
    • They become more similar as the epochs progress

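  25. Cluster stability
    • The fraction of images assigned to the same cluster again in the next epoch (= stability)
    • Stability increases as the epochs progress
      • It saturates below 0.8
      • Training beyond that point is pointless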

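  26. Performance differences by number of clusters
    • Classification performance was measured with a metric called mAP (?)
    • Performance was best at k = 10,000
    • One would expect k = 1,000 to be best for ImageNet, but over-segmentation gave better results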
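
For reference, mAP (mean Average Precision) can be sketched like this. It is the plain, non-interpolated definition; benchmark toolkits such as Pascal VOC's may use an interpolated variant, so treat it as an illustration of the idea rather than the exact evaluation protocol. The function names and toy scores are made up for the example.

```python
def average_precision(scores, relevant):
    """AP for one class: mean of precision@rank taken at each relevant item."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_class):
    """mAP: the average of the per-class AP values."""
    return sum(average_precision(s, r) for s, r in per_class) / len(per_class)


# Two toy classes: (classifier scores, ground-truth relevance) per image.
perfect = ([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0])  # both positives ranked first
mixed = ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])    # positives at ranks 1 and 3
print(average_precision(*perfect))              # 1.0
print(mean_average_precision([perfect, mixed]))
```

In the `mixed` case the precisions at the two hits are 1/1 and 2/3, so its AP is their mean, about 0.83.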

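  27. How removing color changes what the network discriminates
    • The images on this slide visualize the first layer of the CNN
    • When color information is fed in as-is (left), the filters respond only to color
    • When color information is transformed with Sobel filters (right), the filters respond to edges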

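  28. Observations on each CNN layer
    • The images on this slide are the top 9 images that activate each layer most strongly
    • Deeper layers recognize larger patterns (as expected)
    • The last layer (conv5) also appears to re-recognize what the earlier layers already recognized
      • This supports another study's finding [43] that (in AlexNet) the last layer (conv5) has different characteristics from the other layers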

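  29. Classification performance of each layer (1/3)
    • Build a linear classifier on the outputs of the network up to layer n
    • Evaluate classification performance on the ImageNet and Places datasets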

  30. Classification performance of each layer (2/3)

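  31. Classification performance of each layer (3/3)
    • DeepCluster performs well at the higher layers
      • conv3 performs very well
      • Remarkably, it is even better than conv5
    • On the other hand, conv1 does not perform well at all
    • Perhaps DeepCluster recognizes something equivalent to the ImageNet labels at conv3-conv4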

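  32. Evaluation on Pascal VOC (1/3)
    • Pascal VOC: a competition covering classification, object detection, and labeling
    • Performance is evaluated by solving these tasks with DeepCluster
    • Fast R-CNN was used to implement object detection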

  33. Evaluation on Pascal VOC (2/3)

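  34. Evaluation on Pascal VOC (3/3)
    • Performance is good on all of classification, object detection, and labeling
    • Interestingly, a fine-tuned (?) random network reaches reasonable accuracy, but its performance drops considerably when only the fully connected layers 6-8 are trained
    • These tasks come closer to real-world applications when fine-tuning is not possible
      • In that case, the gap to the latest methods would be even larger (up to 9% for classification)
    ( ˘ω˘) .oO ( I didn't really understand what this part was saying )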

  35. 6. Summary

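  36. Overview (recap)
    • A method for clustering images with a CNN
    • The CNN's outputs are clustered with k-means, the resulting cluster assignments are treated as "pseudo-labels", and the network weights are updated against them
    • It achieved very good performance
      • Outperforms other algorithms in the Pascal VOC evaluation
    • It is robust
      • Works when the dataset is changed (ImageNet → YFCC100M)
      • Works when the network architecture is changed (AlexNet → VGG16)
      • Works with clustering algorithms other than k-means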