Slide 1

ICA: Independent Component Analysis
Daisuke Yoneoka
March 2, 2015

Slide 2

Notations
- Latent: z_t ∈ R^L
- Observed: x_t ∈ R^D
- t is a time index, but for now we assume no time dependence.
- Also called the cocktail party problem, blind signal separation, or blind source separation.

Slide 3

Introducing ICA
- Assumed structure: x_t = W z_t + ε_t
- W is the D × L mixing matrix (generative weights)
- ε_t ∼ N(0, Ψ); for simplicity we take ∥Ψ∥ = 0
- The goal is to compute p(z_t | x_t, θ)!
- (My own take on) the difference between PCA and ICA:
- PCA focuses on signal strength (the size of the eigenvalues), and the components are uncorrelated (correlation = 0)
- ICA emphasizes separating the signals into independent components
- But I am not sure how this differs from PCA followed by a Varimax rotation...
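
As a concrete picture, here is a minimal NumPy sketch of x_t = W z_t with the noise term dropped as above; the 2 × 2 dimensions and the Laplace choice for the independent sources are illustrative assumptions of mine, not taken from the slides.

    import numpy as np

    rng = np.random.default_rng(0)
    T, L, D = 1000, 2, 2              # samples, latent dim, observed dim
    Z = rng.laplace(size=(L, T))      # independent non-Gaussian sources z_t (columns)
    W = rng.normal(size=(D, L))       # mixing matrix / generative weights
    X = W @ Z                         # observed signals x_t = W z_t (noise-free case)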

Slide 4

A non-Gaussian prior
- PCA and its relatives assume a Gaussian prior on z: p(z_t) = Π_{j=1}^L N(z_{tj} | 0, 1)
- Independence vs. uncorrelatedness: (roughly speaking) independence means cross moments of every order are zero, while uncorrelatedness only means the second-order ones are zero.
- In the Gaussian case assumed by PCA, the higher-order cumulants vanish, so independence = uncorrelatedness.
- As a result, a rotational indeterminacy remains.
- ICA removes the rotational indeterminacy by assuming non-Gaussian sources: p(z_t) = Π_{j=1}^L p_j(z_{tj})
- The indeterminacies in ordering and in power (scale) still remain: W* = P Λ W, where P is a permutation matrix and Λ is a matrix that rescales the strengths.
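
To see the "uncorrelated but not necessarily independent" point numerically, here is a small sketch (mine, with an arbitrary 45° rotation): after mixing independent N(0, 1) sources with any rotation R, the covariance is still the identity, so second moments alone can never recover R, which is exactly the rotational indeterminacy above.

    import numpy as np

    rng = np.random.default_rng(1)
    theta = np.pi / 4
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # an arbitrary rotation

    Z = rng.normal(size=(2, 100_000))                 # independent N(0, 1) sources
    print(np.round(np.cov(Z), 2))                     # ~ identity
    print(np.round(np.cov(R @ Z), 2))                 # still ~ identity after rotating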

Slide 5

Overview of estimation methods
- Basically an information-bottleneck-style idea: minimize the mutual information
- I(z) = Σ_j H(z_j) − H(z)
- I(z) is the mutual information (a Kullback-Leibler divergence), and H(z) = −∫ g(z) log g(z) dz is the entropy.
- Center and whiten x (i.e. E[x x^T] = I) and fix the variance of z to 1; then cov(x) = E[x x^T] = W E[z z^T] W^T, so W can be restricted to be orthogonal.
- Negentropy is sometimes used in place of H(z_t) (Hyvarinen and Oja (2000)): J(Y_j) = H(Z_j) − H(Y_j), where Z_j is a Gaussian random variable with the same variance as Y_j.
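
A sketch of the centering + whitening step mentioned above, done via the eigendecomposition of the sample covariance (one of several equivalent choices); X is assumed to be a D × T data matrix.

    import numpy as np

    def whiten(X):
        """Center the rows of X (D x T) and whiten so the sample covariance becomes the identity."""
        Xc = X - X.mean(axis=1, keepdims=True)          # centering
        cov = Xc @ Xc.T / Xc.shape[1]                   # D x D sample covariance
        eigval, E = np.linalg.eigh(cov)                 # cov = E diag(eigval) E^T
        V = E @ np.diag(1.0 / np.sqrt(eigval)) @ E.T    # whitening matrix cov^{-1/2}
        return V @ Xc                                   # now E[x x^T] ~ I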

Slide 6

Maximum likelihood estimation
- From x = W z, the change of variables gives p_x(x_t) = p_z(z_t) |det(W^{-1})| = p_z(V x_t) |det(V)|, where V = W^{-1}
- Assuming the x_t are iid, the log-likelihood is
- (1/T) log p(D|V) = log |det(V)| + (1/T) Σ_j Σ_t log p_j(v_j^T x_t)
- The first term is constant because V is orthogonal.
- So it suffices to maximize the second term under the constraint that V is orthogonal.
- Gradient descent: slow
- Natural gradient descent: MacKay, 2003
- Newton's method or EM.
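
A sketch of the normalized log-likelihood (1/T) log p(D|V) above, assuming (as an illustrative choice, not from the slide) Laplace source densities p_j(z) = exp(−|z|)/2 and an already-whitened D × T matrix X.

    import numpy as np

    def avg_loglik(V, X):
        """(1/T) log p(D|V) with Laplace source densities; X is D x T, V is the unmixing matrix."""
        T = X.shape[1]
        Z = V @ X                                       # rows are v_j^T x_t
        log_pz = -np.abs(Z) - np.log(2.0)               # log p_j(z) for p_j(z) = exp(-|z|)/2
        return np.log(np.abs(np.linalg.det(V))) + log_pz.sum() / T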

Slide 7

FastICA (Hyvarinen and Oja (2000))
- In short, Newton's method adapted for ICA? For simplicity, assume z is one-dimensional and that its distribution is known.
- Let G(z) = −log p(z), g(z) = dG(z)/dz, and β = −2λ. Then
- f(v) = E[G(v^T x)] + λ(1 − v^T v)
- ∇f(v) = E[x g(v^T x)] − βv
- H(v) = E[x x^T g′(v^T x)] − βI
- Consider the approximation E[x x^T g′(v^T x)] ≈ E[x x^T] E[g′(v^T x)] = E[g′(v^T x)] I
- This makes the Hessian simple, so the Newton step reduces to v* = E[x g(v^T x)] − E[g′(v^T x)] v
- Updating with v_new = v* / ∥v*∥ is all that is needed.
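
A sketch of the resulting one-unit update, using the common choice g(u) = tanh(u), g′(u) = 1 − tanh(u)² (the nonlinearity and the centered, whitened D × T input X are my assumptions; the update itself is the v* and v_new formula above).

    import numpy as np

    def fastica_one_unit(X, n_iter=50, seed=0):
        """One-unit FastICA on a centered, whitened D x T matrix X; returns one unmixing direction v."""
        rng = np.random.default_rng(seed)
        v = rng.normal(size=X.shape[0])
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            y = v @ X                                    # v^T x_t for all t
            t = np.tanh(y)
            g, g_prime = t, 1.0 - t**2                   # g(u) = tanh(u), g'(u) = 1 - tanh(u)^2
            v_star = (X * g).mean(axis=1) - g_prime.mean() * v   # v* = E[x g(v^T x)] - E[g'(v^T x)] v
            v = v_star / np.linalg.norm(v_star)          # v_new = v* / ||v*||
        return v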

Slide 8

Which non-Gaussian distribution should we use?
- Super-Gaussian (leptokurtic): the Laplace distribution is in this class. Peaked at the center with heavy tails.
- Sub-Gaussian (platykurtic): the class with negative excess kurtosis.
- kurt(z) = E[(Z − E[Z])^4] / σ^4 − 3
- Skewed distributions: the Gamma distribution, for example.
- skew(z) = E[(Z − E[Z])^3] / σ^3
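
The two diagnostics on this slide written out directly; the distributions used to exercise them (Laplace, uniform, Gamma) are just examples.

    import numpy as np

    def excess_kurtosis(z):
        zc = z - z.mean()
        return (zc**4).mean() / zc.var()**2 - 3          # kurt(z) = E[(Z - E[Z])^4] / sigma^4 - 3

    def skewness(z):
        zc = z - z.mean()
        return (zc**3).mean() / zc.std()**3              # skew(z) = E[(Z - E[Z])^3] / sigma^3

    rng = np.random.default_rng(2)
    print(excess_kurtosis(rng.laplace(size=100_000)))    # ~ 3    (super-Gaussian / leptokurtic)
    print(excess_kurtosis(rng.uniform(size=100_000)))    # ~ -1.2 (sub-Gaussian / platykurtic)
    print(skewness(rng.gamma(2.0, size=100_000)))        # ~ 1.4  (skewed, like the Gamma distribution)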

Slide 9

EM for ICA
- Instead of fixing p(z), why not use something like a mixture of Gaussians?
- p(q_j = k) = π_k
- p(z_j | q_j = k) = N(µ_jk, σ²_jk)
- p(x | z) = N(W z, Ψ)
- The point is that E[z_t | x_t, θ] can be obtained by summing over all configurations of q_t. When that is too expensive, a variational method also works (Attias 1999).
- Next, estimate E[z_t] with something like a GMM.
- Finally, estimate p_j(z_j) = Σ_{k=1}^K π_jk N(z_j | µ_jk, σ²_jk).
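
A small sketch of the mixture-of-Gaussians source density p_j(z_j) used here; the specific mixture parameters are illustrative only.

    import numpy as np

    def mog_source_density(z, pi, mu, sigma2):
        """Evaluate p_j(z) = sum_k pi_k N(z | mu_k, sigma2_k) at each point of z."""
        z = np.asarray(z, dtype=float)[:, None]          # shape (N, 1) against K components
        comp = np.exp(-(z - mu)**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
        return comp @ pi

    # e.g. a bimodal (hence non-Gaussian) prior on one source:
    pi = np.array([0.5, 0.5])
    mu = np.array([-2.0, 2.0])
    sigma2 = np.array([1.0, 1.0])
    print(mog_source_density([0.0, 2.0], pi, mu, sigma2))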

Slide 10

Other estimation methods
- Maximum likelihood is not the only way! See Hyvarinen and Oja (2000) for details.
- Maximize negentropy instead of maximizing entropy
- Minimize mutual information
- Maximize mutual information (infomax)