[ICML Reading Group] Unsupervised Deep Embedding for Clustering Analysis
Hayato Maki
July 16, 2016
Transcript
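ICML 2016 Reading Group: Unsupervised Deep Embedding for Clustering Analysis
Junyuan Xie (University of Washington), Ross Girshick (Facebook AI Research), Ali Farhadi (University of Washington)
Presenter: Hayato Maki, doctoral program, Graduate School of Information Science, Nara Institute of Science and Technology, Intelligent Communication Laboratory
2016/07/16 @NAIST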
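Summary in three lines
• Problem: classical clustering
• Method: jointly optimize the parameters of a deep-learning-based dimensionality reduction and the clustering itself
• Result: higher accuracy and shorter computation time than conventional methods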
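Related work on clustering
• k-means and Gaussian mixture models (GMM)
  • Tend to fail when the input dimensionality is high
• Methods that perform dimensionality reduction and clustering simultaneously
  • Map the data to a low-dimensional space and cluster in the mapped space
  • Conventional methods use linear maps only
• Spectral clustering
  • Uses the graph structure of the data
  • Often gives better results than k-means
  • Computational cost grows with the square or a higher power of the number of samples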
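Notation
• Number of data points: $n$
• Data: $\{x_i \in X\}_{i=1}^{n}$
• Number of clusters (fixed in advance): $k$
• Mapping: $z_i = f_\theta(x_i)$
  • Dimension of $Z$ <<< dimension of $X$
  • The mapping parameters $\theta$ are learned with a DNN
• Mapped data: $\{z_i \in Z\}_{i=1}^{n}$
• Centroids (points representing the clusters): $\{\mu_j \in Z\}_{j=1}^{k}$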
Flow of the proposed method
Initialization → Dimensionality reduction → Cluster assignment → KL divergence computation → Parameter update
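Dimensionality reduction
• A nonlinear low-dimensional mapping built with deep learning: $f_\theta : X \to Z$, applied to $\{x_i \in X\}_{i=1}^{n}$ as $z_i = f_\theta(x_i)$
• Because learning is unsupervised, hyperparameters cannot be tuned by cross-validation
• A commonly used network architecture is therefore adopted
  • Layer dimensions: (input)-500-500-2000-10
  • Fully connected [van der Maaten, 09]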
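To make the (input)-500-500-2000-10 architecture concrete, the sketch below lists the shape of each fully connected layer and counts the trainable parameters. The helper names and the input width of 784 (a flattened 28x28 image) are assumptions chosen for illustration; the slide leaves the input width dataset dependent.

```python
# Hidden-layer widths from the slide; the input width depends on the dataset.
HIDDEN_DIMS = [500, 500, 2000, 10]

def encoder_layer_shapes(input_dim):
    """Return the (fan_in, fan_out) shape of each fully connected layer."""
    dims = [input_dim] + HIDDEN_DIMS
    return list(zip(dims[:-1], dims[1:]))

def parameter_count(input_dim):
    """Count weights plus biases over all fully connected layers."""
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in encoder_layer_shapes(input_dim))

# Assumption: 784 inputs, used here only as an example.
shapes = encoder_layer_shapes(784)
total = parameter_count(784)
```

Under that assumption the final layer maps 2000 units to the 10-dimensional embedding space Z in which clustering takes place.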
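Cluster assignment: Soft Assignment
• Similarity measure between a mapped data point $z_i$ and a centroid $\mu_j$ (soft assignment) [van der Maaten & Hinton, 08]:
  $q_{ij} = \frac{(1 + \|z_i - \mu_j\|^2 / \alpha)^{-(\alpha+1)/2}}{\sum_{j'} (1 + \|z_i - \mu_{j'}\|^2 / \alpha)^{-(\alpha+1)/2}}$
• $q_{ij}$ can be interpreted as the probability that $z_i$ belongs to the $j$-th cluster.
• Because cross-validation cannot be used in unsupervised learning, $\alpha$ is fixed to $\alpha = 1$.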
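The soft assignment can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' code: the function name `soft_assign` and the toy values are made up, and the kernel is a Student's t kernel with `alpha` degrees of freedom, normalised over the centroids.

```python
def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of embedded point i to centroid j."""
    q = []
    for zi in z:
        # Unnormalised kernel: (1 + ||z_i - mu_j||^2 / alpha) ** (-(alpha + 1) / 2)
        kernel = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(zi, mu))
            kernel.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(kernel)
        q.append([k / total for k in kernel])  # each row sums to 1
    return q

# Toy 2-D embedding with two centroids (made-up values).
z = [[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]]
centroids = [[0.0, 0.0], [4.0, 4.0]]
q = soft_assign(z, centroids)
```

The first point sits on the first centroid and is assigned to it with probability 33/34, while the third point is equidistant from both centroids and receives 0.5 for each.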
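Training the DNN with the KL divergence
• Tentative assignment: Q
• Target distribution (assumed to produce an ideal clustering): P
• The DNN is trained to minimize the KL divergence between P and Q
• The choice of P is the key to this method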
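The formulas for P and the loss did not survive in this transcript. The sketch below uses the definitions given in the DEC paper: the target is $p_{ij} \propto q_{ij}^2 / f_j$ with soft cluster frequencies $f_j = \sum_i q_{ij}$, and the loss is $\mathrm{KL}(P \| Q) = \sum_i \sum_j p_{ij} \log (p_{ij} / q_{ij})$. Function names and the toy matrix `q` are illustrative assumptions.

```python
import math

def target_distribution(q):
    """p[i][j] = (q_ij^2 / f_j) / sum_j' (q_ij'^2 / f_j'), with f_j = sum_i q_ij."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]  # soft cluster frequencies
    p = []
    for row in q:
        weights = [row[j] ** 2 / f[j] for j in range(k)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

def kl_divergence(p, q):
    """KL(P || Q) = sum_i sum_j p_ij * log(p_ij / q_ij)."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0.0)

# Made-up soft assignments for three points and two clusters.
q = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
p = target_distribution(q)
loss = kl_divergence(p, q)
```

Squaring $q_{ij}$ sharpens every row toward its most likely cluster, so P is a higher-confidence version of Q, and dividing by $f_j$ keeps large clusters from dominating.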
Parameter update
• Update the DNN parameters $\theta$ and the centroids $\mu_j$
• Both are updated with SGD ($\theta$ via backpropagation)
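As a rough illustration of the centroid part of the update, the sketch below takes one gradient-descent step on the centroids with P held fixed and $\alpha = 1$. The gradient expression in the comment follows from differentiating $\mathrm{KL}(P \| Q)$ with respect to $\mu_j$; the function names, toy data, and hand-picked target `p` are assumptions, and the update of $\theta$ through the network is not shown.

```python
def soft_assign(z, centroids):
    """Soft assignment q[i][j] with alpha fixed to 1, as on the slides."""
    q = []
    for zi in z:
        kernel = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, mu)))
                  for mu in centroids]
        total = sum(kernel)
        q.append([k / total for k in kernel])
    return q

def centroid_gradients(z, centroids, p):
    """Gradient of KL(P || Q) w.r.t. each centroid, with P held fixed (alpha = 1)."""
    q = soft_assign(z, centroids)
    grads = [[0.0] * len(mu) for mu in centroids]
    for i, zi in enumerate(z):
        for j, mu in enumerate(centroids):
            sq_dist = sum((a - b) ** 2 for a, b in zip(zi, mu))
            # dL/dmu_j = -2 * sum_i (p_ij - q_ij) * (z_i - mu_j) / (1 + ||z_i - mu_j||^2)
            coeff = -2.0 * (p[i][j] - q[i][j]) / (1.0 + sq_dist)
            for d in range(len(mu)):
                grads[j][d] += coeff * (zi[d] - mu[d])
    return grads

def sgd_step(centroids, grads, lr=0.01):
    """One plain gradient-descent update of the centroids."""
    return [[m - lr * g for m, g in zip(mu, grad)]
            for mu, grad in zip(centroids, grads)]

# Toy embedded points, starting centroids, and a hand-picked target (all made up).
z = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.9, 3.3]]
centroids = [[0.5, 0.5], [2.5, 2.5]]
p = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
new_centroids = sgd_step(centroids, centroid_gradients(z, centroids, p))
```

Each step pulls a centroid toward the points whose target probability exceeds the current soft assignment and pushes it away from the others.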
Initialization
• DNN initialization: uses a Stacked Auto Encoder
• Centroid initialization: reduce the dimensionality with the initialized DNN, then run k-means in the mapped space
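The centroid initialization can be sketched as k-means on the embedded points, repeated from several random starts. The code is a minimal Lloyd's algorithm in plain Python; the names `kmeans` and `init_centroids` are hypothetical, and keeping the run with the lowest within-cluster squared distance is an assumption, since the slides only say that the best run is selected.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed, n_iter=100):
    """Plain Lloyd's k-means; returns (centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia

def init_centroids(embedded, k, n_init=20):
    """Run k-means n_init times and keep the centroids with the lowest inertia."""
    runs = [kmeans(embedded, k, seed=s) for s in range(n_init)]
    return min(runs, key=lambda run: run[1])[0]

# Toy embedded points forming two well-separated groups (made-up values).
embedded = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
            [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids = init_centroids(embedded, 2)
```

In the full method, `embedded` would hold the outputs of the pretrained encoder rather than hand-written points.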
Experiments
• Datasets
• Compared methods
  • k-means
  • LDMGI (spectral clustering)
  • SEC (spectral clustering)
  • The proposed method without backpropagation
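Implementation
• Stacked Auto Encoder initialization
  • Weights are initialized with random numbers drawn from a normal distribution with mean 0 and standard deviation 0.01
  • 50000 iterations per layer (20% dropout)
  • 100000 iterations of fine-tuning on the whole Auto Encoder (no dropout)
  • Minibatch size = 256
  • Learning rate = 0.1, divided by 10 every 20000 iterations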
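The learning-rate rule for pretraining reads as a step decay. A minimal sketch, with a hypothetical function name and the slide's values as defaults:

```python
def step_decay_lr(iteration, base_lr=0.1, drop_every=20000, factor=0.1):
    """Learning rate after `iteration` updates under a step-decay schedule."""
    return base_lr * factor ** (iteration // drop_every)

# Learning rate at a few iteration counts.
schedule = [step_decay_lr(it) for it in (0, 19999, 20000, 40000, 99999)]
```

The rate stays at 0.1 for the first 20000 iterations and then drops by a factor of 10 at each subsequent multiple of 20000.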
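Implementation
• Centroid initialization
  • Run 20 times from different initial values and select the best result
• Minimizing the KL divergence
  • Learning rate = 0.01 (fixed)
• Convergence criterion
  • Iterate until the fraction of data points whose cluster assignment changes falls to 0.1% or less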
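The stopping rule can be expressed as a comparison of hard labels between two successive iterations. A small sketch with hypothetical helper names, taking each point's hard label to be its most probable cluster under the soft assignment:

```python
def hard_labels(q):
    """Index of the most probable cluster for each row of soft assignments."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]

def has_converged(prev_labels, labels, tol=0.001):
    """True when at most `tol` (0.1%) of the points changed cluster."""
    changed = sum(a != b for a, b in zip(prev_labels, labels))
    return changed / len(labels) <= tol

# Made-up example: 1000 points, two of which switch cluster.
prev = [0] * 1000
curr = [0] * 998 + [1, 1]
done = has_converged(prev, curr)
```

With two of 1000 labels changed the fraction is 0.2%, so training would continue for another iteration.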
Evaluation metric
• Unsupervised Clustering Accuracy (ACC)
  • $p_i$: true label
  • $q_i$: label output by the algorithm
  • map(): the optimal mapping between labels and clusters
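ACC is the fraction of points whose true label matches the mapped cluster label, maximised over one-to-one mappings between clusters and labels. The sketch below finds that maximum by brute force over all permutations, which is practical only for a small number of clusters; the function name and toy labels are made up.

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_ids, k):
    """Best accuracy over all one-to-one mappings from cluster ids to labels."""
    n = len(true_labels)
    best = 0
    for mapping in permutations(range(k)):  # mapping[c] = label given to cluster c
        hits = sum(t == mapping[c] for t, c in zip(true_labels, cluster_ids))
        best = max(best, hits)
    return best / n

# Made-up labels: clusters 0 and 1 are swapped relative to the true labels,
# and the last point is placed in the wrong cluster.
true_labels = [0, 0, 0, 1, 1, 2, 2, 2]
cluster_ids = [1, 1, 1, 0, 0, 2, 2, 0]
acc = clustering_accuracy(true_labels, cluster_ids, k=3)  # 7 of 8 correct -> 0.875
```

For larger k the same maximum is usually computed with the Hungarian algorithm, for example `scipy.optimize.linear_sum_assignment` applied to the cluster-versus-label count matrix.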
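Results
• The proposed method achieves the best performance
• On REUTERS
  • Training time of the proposed method is about 30 [time unit lost in the transcript]
  • LDMGI and SEC would need months or more of computation time and terabytes of memory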
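Robustness to hyperparameters
• Performance is compared across nine different hyperparameter settings (annealing)
• The proposed method is robust to changes in the hyperparameters and does not depend on the dataset
• This is an important property in unsupervised learning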
Properties of the target distribution P
• The larger $q_{ij}$ is (the higher the confidence), the larger the gradient → P has a desirable property
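Effect of iteratively optimizing the dimensionality reduction
• Visualization with t-SNE [van der Maaten & Hinton, 08]
• The cluster boundaries become clearer as the updates progress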
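Effect of feature extraction with the Auto Encoder
• Features are extracted with the Auto Encoder and then passed to each algorithm
• The Auto Encoder accounts for a large share of the improvement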
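Robustness to imbalanced datasets
• The number of samples in the smallest class is set to $r_{min}$ times that of the largest class, and the remaining classes are increased in steps of 0.1
• The proposed method is robust to imbalance in the number of samples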
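Conclusion
• The proposed method improves clustering performance through dimensionality reduction based on deep learning
• The dimensionality-reduction parameters and the clustering result are optimized simultaneously
• Training backpropagates a loss defined as the KL divergence between the tentative clustering output and the target distribution
• Compared with conventional methods it is more accurate and faster (computation time scales linearly with the number of samples), independent of the dataset, independent of the hyperparameters, and robust to imbalance in the number of samples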
That is all.