Slide 1

Slide 1 text

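ICML 2016 Paper Introduction
Unsupervised Deep Embedding for Clustering Analysis
Junyuan Xie (University of Washington), Ross Girshick (Facebook AI Research), Ali Farhadi (University of Washington)
Presenter: Hayato Maki (真木勇人), Doctoral Program, Augmented Human Communication Laboratory, Graduate School of Information Science, Nara Institute of Science and Technology
2016/07/16 @NAIST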

Slide 2

Slide 2 text

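Summary in Three Lines
• Target: the classical clustering problem
• Method: joint optimization of the parameters of a deep-learning-based dimensionality reduction and of the clustering
• Result: higher accuracy and shorter computation time than conventional methods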

Slide 3

Slide 3 text

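Related Work on Clustering
• k-means and Gaussian mixture models (GMM)
  • Tend to fail when the input dimensionality is high
• Methods that perform dimensionality reduction and clustering jointly
  • Map the data to a low-dimensional space and cluster in that space
  • Conventional methods use linear maps only
• Spectral clustering
  • Exploits the graph structure of the data
  • Often gives better results than k-means
  • Computational cost is proportional to the square or the fourth power of the number of samples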

Slide 4

Slide 4 text

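Notation
• Number of data points: $n$
• Data: $\{x_i \in X\}_{i=1}^{n}$
• Number of clusters (fixed in advance): $k$
• Mapping: $z_i = f_\theta(x_i)$
  • Dimension of $Z$ $\ll$ dimension of $X$
  • The mapping parameters $\theta$ are learned with a DNN
• Mapped data: $\{z_i \in Z\}_{i=1}^{n}$
• Centroids (points that represent the clusters): $\{\mu_j \in Z\}_{j=1}^{k}$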

Slide 5

Slide 5 text

Flow of the Proposed Method
Initialization → Dimensionality reduction → Cluster assignment → KL divergence computation → Parameter update

Slide 6

Slide 6 text

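Dimensionality Reduction
• A nonlinear, deep-learning-based map to a low-dimensional space, $f_\theta : X \to Z$, taking the inputs $\{x_i \in X\}_{i=1}^{n}$ to $z_i = f_\theta(x_i)$
• Because the learning is unsupervised, hyperparameters cannot be tuned by cross-validation
• A commonly used network configuration is therefore adopted [van der Maaten, 09]
  • Layer dimensions: (input)-500-500-2000-10
  • Fully connected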
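As a rough illustration of the (input)-500-500-2000-10 configuration, the following sketch lists the weight-matrix shapes of the fully connected encoder and counts its parameters. The input size of 784 (a flattened 28x28 image) and the helper names are assumptions made for this example; the slide only says "(input)".

```python
def encoder_layer_shapes(input_dim, hidden=(500, 500, 2000), embed_dim=10):
    """Return (fan_in, fan_out) for each fully connected encoder layer."""
    dims = [input_dim, *hidden, embed_dim]
    return list(zip(dims[:-1], dims[1:]))


def parameter_count(shapes):
    """Count the weights plus one bias per output unit."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


# Assumed input size: 784 (not stated on the slide).
shapes = encoder_layer_shapes(784)
print(shapes)
print(parameter_count(shapes))
```

The final layer width of 10 is the dimension of the embedded space $Z$ in which the clustering is carried out.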

Slide 7

Slide 7 text

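Cluster Assignment: Soft Assignment
• Similarity measure between the mapped data $\{z_i \in Z\}_{i=1}^{n}$ and the centroids $\mu_j$ (soft assignment) [van der Maaten & Hinton, 08]:

  $$q_{ij} = \frac{\left(1 + \|z_i - \mu_j\|^2/\alpha\right)^{-(\alpha+1)/2}}{\sum_{j'} \left(1 + \|z_i - \mu_{j'}\|^2/\alpha\right)^{-(\alpha+1)/2}}$$

  $q_{ij}$ can be interpreted as the probability that $z_i$ belongs to the $j$-th cluster.
• Since cross-validation cannot be used in unsupervised learning, $\alpha$ is fixed to $\alpha = 1$.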
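The soft assignment $q_{ij}$ can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation; the function name `soft_assign` and the toy points are made up for the example.

```python
def soft_assign(z, mu, alpha=1.0):
    """Return q with q[i][j] = soft assignment of point z[i] to centroid mu[j]."""
    q = []
    for zi in z:
        # Unnormalized Student's t kernel between zi and every centroid.
        kernel = []
        for mj in mu:
            sq_dist = sum((a - b) ** 2 for a, b in zip(zi, mj))
            kernel.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(kernel)
        q.append([k / total for k in kernel])  # each row sums to 1
    return q


# Toy data (made up for illustration): two 2-D points, two centroids.
z = [[0.0, 0.0], [3.0, 0.0]]
mu = [[0.0, 0.0], [3.0, 0.0]]
q = soft_assign(z, mu)
```

With $\alpha = 1$ the kernel reduces to $1 / (1 + \|z_i - \mu_j\|^2)$, so a point sitting exactly on a centroid still keeps a small, nonzero probability for the other clusters.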

Slide 8

Slide 8 text

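DNN Training with KL Divergence
• Tentative assignment distribution $Q$
• Target distribution $P$ (the distribution assumed to give an ideal clustering)
• Train the DNN so as to minimize the KL divergence between $P$ and $Q$
• The choice of $P$ is the key to this method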
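The formulas for the two distributions did not survive in this transcript. In the DEC paper the target distribution is defined as $p_{ij} = (q_{ij}^2 / f_j) / \sum_{j'} (q_{ij'}^2 / f_{j'})$ with soft cluster frequencies $f_j = \sum_i q_{ij}$, and the loss is $L = \mathrm{KL}(P \| Q) = \sum_i \sum_j p_{ij} \log (p_{ij} / q_{ij})$. The sketch below computes both for a small table of soft assignments; the function names and the toy values are assumptions made for illustration.

```python
import math


def target_distribution(q):
    """Sharpen soft assignments: p_ij is proportional to q_ij^2 / f_j."""
    n_clusters = len(q[0])
    # Soft cluster frequencies f_j = sum_i q_ij.
    f = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / f[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )


# Toy soft assignments (made up): 3 points, 2 clusters.
q = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
p = target_distribution(q)
loss = kl_divergence(p, q)
```

Squaring $q_{ij}$ pushes each row toward its dominant cluster, and dividing by $f_j$ keeps large clusters from absorbing everything.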

Slide 9

Slide 9 text

Parameter Update
• Update the DNN parameters $\theta$ and the centroids $\mu_j$
• Updates are made with SGD ($\theta$ via backpropagation)
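The gradient expressions are not present in this transcript. For reference, the DEC paper gives the gradients of the loss $L = \mathrm{KL}(P \| Q)$ with respect to an embedded point and a centroid as below; the gradient with respect to $z_i$ is what gets backpropagated through the DNN to update $\theta$.

```latex
\frac{\partial L}{\partial z_i}
  = \frac{\alpha + 1}{\alpha} \sum_{j}
    \left(1 + \frac{\|z_i - \mu_j\|^2}{\alpha}\right)^{-1}
    (p_{ij} - q_{ij})\,(z_i - \mu_j)

\frac{\partial L}{\partial \mu_j}
  = -\frac{\alpha + 1}{\alpha} \sum_{i}
    \left(1 + \frac{\|z_i - \mu_j\|^2}{\alpha}\right)^{-1}
    (p_{ij} - q_{ij})\,(z_i - \mu_j)
```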

Slide 10

Slide 10 text

Initialization
• DNN initialization: uses a Stacked Auto Encoder
• Centroid initialization: reduce the dimensionality with the initialized DNN, then run k-means in the mapped space

Slide 11

Slide 11 text

Experiments
• Datasets
• Compared methods
  • k-means
  • LDMGI (spectral clustering)
  • SEC (spectral clustering)
  • Without backpropagation

Slide 12

Slide 12 text

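Implementation
• Initialization of the Stacked Auto Encoder
  • Weights are initialized with random numbers drawn from a normal distribution with mean 0 and standard deviation 0.01
  • 50000 iterations for each layer (20% dropout)
  • 100000 iterations of fine-tuning on the whole Auto Encoder (no dropout)
  • Minibatch size = 256
  • Learning rate = 0.1, divided by 10 every 20000 iterations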
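The learning-rate schedule for the autoencoder (start at 0.1, divide by 10 every 20000 iterations) amounts to a step decay. A minimal sketch, with a hypothetical function name and parameter names chosen for this example:

```python
def step_learning_rate(iteration, base_lr=0.1, drop_every=20000, factor=0.1):
    """Learning rate after `iteration` updates under a step decay."""
    return base_lr * factor ** (iteration // drop_every)


# Rates at a few checkpoints during training.
for it in (0, 19999, 20000, 45000):
    print(it, step_learning_rate(it))
```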

Slide 13

Slide 13 text

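Implementation
• Initialization of the centroids
  • Run 20 times from different initial values and select the best result
• Minimization of the KL divergence
  • Learning rate = 0.01 (fixed)
• Convergence criterion
  • Iterate until the fraction of data points whose cluster assignment changes falls to 0.1% or below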
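The convergence criterion can be sketched as a comparison of hard cluster labels between two consecutive updates. The helper names are hypothetical, and taking the argmax of each row of $q$ as the hard label is an assumption of this sketch.

```python
def hard_labels(q):
    """Assign each point to its most probable cluster (argmax over q[i])."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


def has_converged(prev_labels, new_labels, tol=0.001):
    """True when at most `tol` (0.1%) of the points changed cluster."""
    changed = sum(1 for a, b in zip(prev_labels, new_labels) if a != b)
    return changed / len(new_labels) <= tol


prev = [0] * 1000
new = [0] * 999 + [1]  # exactly one point out of 1000 moved
print(has_converged(prev, new))
```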

Slide 14

Slide 14 text

Evaluation Metric
• Unsupervised Clustering Accuracy (ACC)
  • $p_i$: the true label
  • $q_i$: the label output by the algorithm
  • map(): the optimal mapping from labels to clusters
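The ACC formula itself is missing from this transcript. A common way to write it with the symbols above is $\mathrm{ACC} = \max_{\mathrm{map}} \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{p_i = \mathrm{map}(q_i)\}$, where the maximum runs over one-to-one mappings between cluster ids and labels. The sketch below finds that maximum by brute force over permutations, which is only practical for a handful of clusters; an assignment solver such as the Hungarian algorithm would normally replace the search. The function name and the toy labels are made up for illustration.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Best accuracy over all one-to-one mappings from cluster ids to labels."""
    labels = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(labels):
        raise ValueError("sketch assumes no more clusters than label classes")
    best = 0
    for assigned in permutations(labels, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        hits = sum(1 for p, q in zip(true_labels, cluster_ids) if mapping[q] == p)
        best = max(best, hits)
    return best / len(true_labels)


# Toy example: cluster ids are a relabeling of the classes with one mistake.
p = [0, 0, 1, 1, 2, 2]
q = [1, 1, 0, 0, 2, 0]
acc = clustering_accuracy(p, q)
```

Here the best mapping sends cluster 1 to label 0, cluster 0 to label 1, and cluster 2 to label 2, so five of the six points are counted as correct.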

Slide 15

Slide 15 text

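Results
• The proposed method achieves the best performance
• On REUTERS
  • Training the proposed method takes about 30 minutes
  • LDMGI and SEC would need several months or more of computation time and terabytes of memory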

Slide 16

Slide 16 text

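Robustness to Hyperparameters
• Performance is compared across nine different hyperparameter settings (annealing)
• The proposed method is robust to changes in the hyperparameters and does not depend on the dataset
• This is an important property in unsupervised learning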

Slide 17

Slide 17 text

Properties of the Target Distribution P
• The larger $q_{ij}$ is (the higher the confidence), the larger the gradient becomes
 → P has a desirable property

Slide 18

Slide 18 text

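Effect of Iteratively Optimizing the Dimensionality Reduction
• Visualization using t-SNE [van der Maaten & Hinton, 08]
• The clusters become more clearly separated as the updates proceed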

Slide 19

Slide 19 text

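Effect of Feature Extraction with the Auto Encoder
• Extract features with the Auto Encoder → process them with each algorithm
• The Auto Encoder accounts for a large share of the improvement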

Slide 20

Slide 20 text

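Robustness to Imbalanced Datasets
• The number of samples in the smallest class is set to $r_{min}$ times that of the largest class, and the other classes are increased in steps of 0.1
• The proposed method is robust to imbalance in the number of samples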

Slide 21

Slide 21 text

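Conclusion
• The proposed method improves clustering performance through deep-learning-based dimensionality reduction
  • The parameters of the dimensionality reduction and the clustering result are optimized jointly
  • The KL divergence between the tentative clustering output and the target distribution is used as the loss function for backpropagation
• Compared with conventional methods it is more accurate and faster (computation time is linear in the number of samples), independent of the dataset, independent of the hyperparameters, and robust to imbalance in the number of samples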

Slide 22

Slide 22 text

That is all.