Contents: Bayesian neural networks (Section 5.1), speeding up approximate Bayesian inference (Section 5.2)
Bayesian Deep Learning (ベイズ深層学習) dܡɹঘً
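Contents
‣ Approximate inference for Bayesian neural network models
  ‣ Bayesian neural network model
  ‣ Learning with the Laplace approximation
  ‣ Hamiltonian Monte Carlo
‣ Making approximate Bayesian inference more efficient
  ‣ Learning with stochastic gradient Langevin dynamics
  ‣ Learning with stochastic variational inference
  ‣ Monte Carlo approximation of gradients
  ‣ Variational inference with gradient approximation
  ‣ Learning with expectation propagation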
Approximate inference for Bayesian neural network models
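Bayesian neural network model
The approximate inference methods from the earlier chapter can also be applied to deep learning models.
As with the linear regression model, a feedforward neural network (NN) is made Bayesian: placing a prior distribution on the parameters $W$ makes probabilistic learning and prediction possible.
⟹ Learning and prediction in Bayesian inference
Joint distribution: $p(Y, W \mid X) = p(W)\prod_{n=1}^{N} p(y_n \mid W, x_n)$
Learning: evaluate the posterior $p(W \mid X, Y)$.
Prediction: compute the predictive distribution $p(y_* \mid x_*, Y, X)$.
(Figure: graphical model with nodes $x_n$, $y_n$, $W$ and a plate over $n = 1, \ldots, N$.)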
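Bayesian neural network model: setup
The joint distribution of the input data $X = \{x_1, \ldots, x_N\}$, the observed data $Y = \{y_1, \ldots, y_N\}$ and the parameters is written as
$$p(Y, W \mid X) = p(W)\prod_{n=1}^{N} p(y_n \mid W, x_n).$$
The observed data are assumed to be generated from
$$p(y_n \mid x_n, W) = \mathcal{N}(y_n \mid f(x_n; W), \sigma_y^2 I),$$
where $f(x_n; W)$ is the neural network function and $\sigma_y^2$ is a fixed noise parameter.
The parameters are assumed to be generated from
$$p(w) = \mathcal{N}(w \mid 0, \sigma_w^2), \quad w \in W,$$
where $\sigma_w^2$ is a fixed noise parameter.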
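Bayesian neural network model: properties
When the prior of the NN is Gaussian:
  many hidden units ⟶ the function becomes more complex;
  large $\sigma_w$ ⟶ the function changes more sharply.
For Bayesian NNs, the posterior distribution is known to become more complex as the number of hidden units increases.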
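Learning with the Laplace approximation
Laplace approximation:
$$p(Z \mid X) \approx \mathcal{N}(Z \mid Z_{\mathrm{MAP}}, \{\Lambda(Z_{\mathrm{MAP}})\}^{-1}), \qquad \Lambda(Z) = -\nabla_Z^2 \log p(Z \mid X)$$
For simplicity, the NN output is taken to be one-dimensional.
Approximating the posterior
First find the MAP estimate of the posterior, i.e. the parameters $W_{\mathrm{MAP}}$ at which $p(W \mid Y, X)$ is maximal.
Maximizing the posterior is the same as maximizing the log posterior, so the MAP estimate can be obtained with the gradient of the log posterior by the following optimization:
$$W_{\mathrm{new}} = W_{\mathrm{old}} + \alpha \nabla_W \log p(W \mid Y, X)\big|_{W = W_{\mathrm{old}}}$$
where $\alpha$ is the learning rate.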
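Learning with the Laplace approximation: approximating the posterior
The gradient of the posterior is obtained as follows.
$$p(W \mid Y, X) = \frac{p(W)\,p(Y \mid X, W)}{p(Y \mid X)} \propto p(W)\,p(Y \mid X, W)$$
Hence
$$\log p(W \mid Y, X) = \log p(Y \mid X, W) + \log p(W) + c = \sum_{n=1}^{N} \log p(y_n \mid x_n, W) + \sum_{w \in W} \log p(w) + c.$$
Taking the partial derivative with respect to a parameter $w \in W$ gives the derivative of a cost function:
$$\frac{\partial}{\partial w} \log p(W \mid Y, X) = -\left\{ \frac{1}{\sigma_y^2} \frac{\partial}{\partial w} E(W) + \frac{1}{\sigma_w^2} \frac{\partial}{\partial w} \Omega_{L2}(W) \right\}$$
$E(W)$ and $\Omega_{L2}(W)$ are, respectively, the error function of the NN and the regularization term that comes from the prior on each parameter.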
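As a worked check of the sign and the two scale factors, the gradient can be expanded term by term. The explicit forms $E(W) = \frac{1}{2}\sum_n \{y_n - f(x_n; W)\}^2$ and $\Omega_{L2}(W) = \frac{1}{2}\sum_{w \in W} w^2$ are assumed here (the usual squared-error and L2 definitions); they are what the Gaussian likelihood and prior of the setup imply up to additive constants.

```latex
\begin{align}
\log p(y_n \mid x_n, W) &= -\frac{1}{2\sigma_y^2}\{y_n - f(x_n; W)\}^2 + \mathrm{const}, \\
\log p(w) &= -\frac{1}{2\sigma_w^2}\, w^2 + \mathrm{const}, \\
\frac{\partial}{\partial w} \log p(W \mid Y, X)
  &= -\frac{1}{\sigma_y^2}\frac{\partial}{\partial w}
     \underbrace{\frac{1}{2}\sum_{n=1}^{N}\{y_n - f(x_n; W)\}^2}_{E(W)}
   \;-\; \frac{1}{\sigma_w^2}\frac{\partial}{\partial w}
     \underbrace{\frac{1}{2}\sum_{w' \in W} w'^2}_{\Omega_{L2}(W)} .
\end{align}
```

Read this way, MAP estimation is ordinary squared-error training with weight decay, where the ratio $\sigma_y^2 / \sigma_w^2$ plays the role of the regularization strength.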
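Learning with the Laplace approximation: approximating the posterior
Once the MAP estimate has been found, the posterior can be approximated as
$$p(W \mid Y, X) \approx q(W) = \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1}),$$
$$\Lambda(W) = -\nabla_W^2 \log p(W \mid Y, X) = \frac{1}{\sigma_w^2} I + \frac{1}{\sigma_y^2} H,$$
where $H$ is the Hessian matrix of the error function.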
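Learning with the Laplace approximation: approximating the predictive distribution
With the Laplace approximation, the predictive distribution can be approximated as
$$p(y_* \mid x_*, Y, X) = \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \approx \int p(y_* \mid x_*, W)\, q(W)\, dW.$$
However, $p(y_* \mid x_*, W)$ contains the NN, so the integral cannot be computed analytically.
We therefore assume that the posterior density of the parameters is concentrated around the MAP estimate and that, within that small region, $f(x_*; W)$ is well approximated by a linear function of $W$.
Under this assumption, a first-order Taylor expansion of $f(x_*; W)$ as a function of $W$ around $W_{\mathrm{MAP}}$ gives
$$f(x_*; W) \approx f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}}), \qquad g = \nabla_W f(x_*; W)\big|_{W = W_{\mathrm{MAP}}}.$$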
Learning with the Laplace approximation: approximating the predictive distribution
Putting this together yields the following approximation.
$$\begin{aligned}
p(y_* \mid x_*, Y, X) &= \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \\
&\approx \int p(y_* \mid x_*, W)\, q(W)\, dW && \text{(Laplace approximation)} \\
&= \int \mathcal{N}(y_* \mid f(x_*; W), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW \\
&\approx \int \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}}), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW && \text{(first-order Taylor approximation)} \\
&= \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*))
\end{aligned}$$
$$\sigma^2(x_*) = \sigma_y^2 + g^{\top} \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g$$
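The whole Laplace pipeline (MAP by gradient ascent, precision $\Lambda$, predictive variance) can be sketched on a toy problem. The one-parameter "network" `f(x; w) = tanh(w * x)`, the function names, the data and the finite-difference Hessian are all assumptions made for illustration; with a single weight, $g$ and $\Lambda$ are scalars.

```python
import math

def f(x, w):
    """Toy one-parameter 'network' (an assumption): f(x; w) = tanh(w * x)."""
    return math.tanh(w * x)

def df_dw(x, w):
    """g = d f(x; w) / d w."""
    return x * (1.0 - math.tanh(w * x) ** 2)

def log_post_grad(w, xs, ys, s2y, s2w):
    """d/dw of the unnormalized log posterior (Gaussian likelihood and prior)."""
    lik = sum((y - f(x, w)) * df_dw(x, w) for x, y in zip(xs, ys)) / s2y
    return lik - w / s2w

def laplace_predict(xs, ys, x_star, s2y=0.01, s2w=1.0, lr=1e-3, steps=5000):
    # MAP estimate: W_new = W_old + alpha * grad log p(W | Y, X).
    w = 0.1
    for _ in range(steps):
        w += lr * log_post_grad(w, xs, ys, s2y, s2w)
    # Lambda = -d^2/dw^2 log p(w | Y, X), via central differences of the gradient.
    h = 1e-5
    lam = -(log_post_grad(w + h, xs, ys, s2y, s2w)
            - log_post_grad(w - h, xs, ys, s2y, s2w)) / (2.0 * h)
    g = df_dw(x_star, w)
    mean = f(x_star, w)
    var = s2y + g * g / lam  # sigma^2(x*) = sigma_y^2 + g^T Lambda^{-1} g
    return w, lam, mean, var

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [math.tanh(0.8 * x) for x in xs]
w_map, lam, mean, var = laplace_predict(xs, ys, x_star=2.0)
```

The predictive variance is the observation noise plus a term that grows where the output is sensitive to the weight, which is how the approximation expresses uncertainty away from the data.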
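Learning with Hamiltonian Monte Carlo (HMC)
HMC can be applied whenever the log posterior (the potential energy in the Hamiltonian) is differentiable with respect to the variables to be sampled.
Given enough computation time, samples from the true posterior are obtained in theory (a property of MCMC).
As a result, uncertainty can be represented by a set of samples.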
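Learning with Hamiltonian Monte Carlo (HMC): inference of the weight parameters
Using the unnormalized posterior, the corresponding potential energy is
$$\mathcal{U}(W) = -\{\log p(Y \mid X, W) + \log p(W)\}.$$
Differentiating it shows that it is equivalent to the derivative of the cost function introduced earlier.
⟹ Gradient computation by backpropagation can be used.
【Issues with approximate inference based on MCMC】
• There is no way to know whether the number of samples is sufficient.
• Tuning the MCMC parameters is difficult (e.g. the step size and number of steps in HMC).
• Learning is slow.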
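Learning with Hamiltonian Monte Carlo (HMC): inference of the hyperparameters
By placing priors on the hyperparameters $\sigma_w$ and $\sigma_y$, they can be inferred jointly with $W$.
Introduce the precision parameter $\gamma_w = \sigma_w^{-2}$ and define its prior as a gamma distribution:
$$p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid a_w, b_w) \quad (a_w, b_w \text{ are fixed positive values})$$
Likewise, for $\sigma_y$, define $\gamma_y = \sigma_y^{-2}$ and
$$p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid a_y, b_y) \quad (a_y, b_y \text{ are fixed positive values})$$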
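Learning with Hamiltonian Monte Carlo (HMC): inference of the hyperparameters
Rewriting the model (the joint distribution of the parameters) gives
$$p(Y, W, \gamma_w, \gamma_y \mid X) = p(\gamma_w)\, p(\gamma_y)\, p(W \mid \gamma_w) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y).$$
The posterior to be inferred is $p(W, \gamma_w, \gamma_y \mid X, Y)$.
(Figure: graphical model with nodes $x_n$, $y_n$, $W$, $\gamma_y$, $\gamma_w$, a plate over $n = 1, \ldots, N$, and hyperparameter labels $\alpha_y$, $\beta_y$, $\alpha_w$, $\beta_w$.)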
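Learning with Hamiltonian Monte Carlo (HMC): inference of the hyperparameters
$W$, $\gamma_w$ and $\gamma_y$ are sampled with Gibbs sampling.
• Sampling $W$
  As before, sample with HMC: $W \sim p(W \mid Y, X, \gamma_w, \gamma_y)$.
• Sampling $\gamma_w$
  $$p(\gamma_w \mid Y, X, W, \gamma_y) \propto p(W \mid \gamma_w)\, p(\gamma_w)$$
  $p(W \mid \gamma_w)$ is Gaussian and $p(\gamma_w)$ is a gamma distribution (the conjugate prior for the Gaussian precision), so $p(\gamma_w \mid Y, X, W, \gamma_y)$ is a gamma distribution. Hence
  $$\gamma_w \sim \mathrm{Gam}(\hat{a}_w, \hat{b}_w), \qquad \hat{a}_w = a_w + \frac{K_w}{2}, \qquad \hat{b}_w = b_w + \frac{1}{2} \sum_{w \in W} w^2,$$
  where $K_w$ is the total number of weight parameters.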
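Learning with Hamiltonian Monte Carlo (HMC): inference of the hyperparameters
• Sampling $\gamma_y$
  $$p(\gamma_y \mid Y, X, W, \gamma_w) \propto p(\gamma_y) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$$
  $\prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$ is a product of Gaussians and therefore Gaussian, and $p(\gamma_y)$ is a gamma distribution, so $p(\gamma_y \mid Y, X, W, \gamma_w)$ is a gamma distribution. Hence
  $$\gamma_y \sim \mathrm{Gam}(\hat{a}_y, \hat{b}_y), \qquad \hat{a}_y = a_y + \frac{N}{2}, \qquad \hat{b}_y = b_y + \frac{1}{2} \sum_{n=1}^{N} \{y_n - f(x_n; W)\}^2.$$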
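The two conjugate Gibbs updates can be sketched in a few lines. The function names are hypothetical, and the HMC step for $W$ is not shown; `preds` stands for the network outputs $f(x_n; W)$ at the current weights. One detail that is easy to get wrong: the gamma distributions here are parameterized by shape and rate, while Python's `random.gammavariate` takes shape and scale, so the rate must be inverted.

```python
import random

def sample_gamma_w(weights, a_w, b_w, rng):
    """Draw gamma_w ~ Gam(a_hat, b_hat) given the current weight values."""
    a_hat = a_w + len(weights) / 2.0                 # a_w + K_w / 2
    b_hat = b_w + 0.5 * sum(w * w for w in weights)  # b_w + (1/2) sum w^2
    # gammavariate(shape, scale): scale = 1 / rate.
    return a_hat, b_hat, rng.gammavariate(a_hat, 1.0 / b_hat)

def sample_gamma_y(ys, preds, a_y, b_y, rng):
    """Draw gamma_y ~ Gam(a_hat, b_hat) given targets and current predictions."""
    a_hat = a_y + len(ys) / 2.0                      # a_y + N / 2
    b_hat = b_y + 0.5 * sum((y - p) ** 2 for y, p in zip(ys, preds))
    return a_hat, b_hat, rng.gammavariate(a_hat, 1.0 / b_hat)

rng = random.Random(0)
a_hat_w, b_hat_w, gamma_w = sample_gamma_w([1.0, 2.0, 2.0], 1.0, 1.0, rng)
a_hat_y, b_hat_y, gamma_y = sample_gamma_y([1.0, 0.0], [0.5, 0.5], 1.0, 1.0, rng)
```

In a full sampler these two draws would alternate with an HMC update of $W$ conditioned on the freshly sampled precisions.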
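Learning with Hamiltonian Monte Carlo (HMC): inference of the hyperparameters
The gamma distribution $\mathrm{Gam}(a, b)$ has mean $a/b$ and variance $a/b^2$. A larger $\hat{b}_y$ therefore means that $f(x_n; W)$ estimates $y_n$ poorly, and the variance of the observations is learned to be larger.
Here a single precision parameter $\gamma_w$ was shared by all weight parameters, but it is also possible to use a separate precision parameter for each layer of the NN, $(\gamma_w^{(1)}, \ldots, \gamma_w^{(L)})$.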
Speeding up approximate Bayesian inference
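Speeding up approximate Bayesian inference
【Drawbacks of Bayesian neural networks】
Marginalizing over the parameters is computationally expensive ⟹ they were rarely used as prediction tools.
Deep learning also needs large amounts of training data ⟹ methods that assume batch learning are computationally inefficient.
【How can these drawbacks be addressed?】
• Improve computational efficiency by handling the marginalization with approximate inference.
• Introduce mini-batch learning.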
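Learning with stochastic gradient Langevin dynamics
【Issue】
Learning with MCMC is computationally inefficient on large-scale data.
【Remedy】
Combine a computationally efficient mini-batch learning method (e.g. stochastic gradient descent) with MCMC, which can estimate uncertainty (e.g. the Metropolis-Hastings (MH) method, HMC).
⟹ Stochastic Markov chain Monte Carlo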
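Learning with stochastic gradient Langevin dynamics
【Learning】
Consider learning with stochastic gradient Langevin dynamics, which combines stochastic gradient descent with Langevin dynamics.
Write the parameter update as $W_{\mathrm{new}} = W_{\mathrm{old}} + \Delta W$.
In stochastic gradient descent the update is
$$\Delta W = \frac{\alpha_t}{2} \nabla_W \log p(W \mid X_S, Y_S) = \frac{\alpha_t}{2} \left\{ \frac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\},$$
where $M$ is the size of the subsample. In addition, to fit the framework of the Robbins-Monro algorithm, the learning rate $\alpha_t$ at step $t$ is set to satisfy
$$\sum_{t=1}^{\infty} \alpha_t = \infty, \qquad \sum_{t=1}^{\infty} \alpha_t^2 < \infty.$$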
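Learning with stochastic gradient Langevin dynamics
【Learning】
Langevin dynamics, on the other hand, is a batch learning algorithm. With a single step per sample, potential energy $\mathcal{U} = -\log p(W \mid X, Y)$, step size $\epsilon = \sqrt{\alpha_t}$ and momentum vector $p$, the parameter update is
$$\begin{aligned}
\Delta W &= -\frac{\epsilon^2}{2} \nabla_W \mathcal{U} + \epsilon p \\
&= \frac{\alpha_t}{2} \nabla_W \log p(W \mid X, Y) + \sqrt{\alpha_t}\, p \\
&= \frac{\alpha_t}{2} \left\{ \sum_{n=1}^{N} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p, \qquad p \sim \mathcal{N}(0, I).
\end{aligned}$$
Making $\alpha_t$ small brings the acceptance rate of the MH step arbitrarily close to 1.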
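Learning with stochastic gradient Langevin dynamics
【Learning】
Combining the two (stochastic gradient descent and Langevin dynamics) gives the update
$$\Delta W = \frac{\alpha_t}{2} \left\{ \frac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p, \qquad p \sim \mathcal{N}(0, I).$$
The learning rate satisfies the same conditions as before.
〈When $t$ is small (early stage of learning)〉 ⇒ the space of the posterior is explored efficiently, exploiting the advantages of SGD.
〈As $t$ grows〉 ⇒ approximate samples from the true posterior are obtained through Langevin dynamics.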
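The combined update can be tried on a problem whose exact posterior is known. The sketch below assumes a deliberately simple model, $y_n \sim \mathcal{N}(w, 1)$ with prior $w \sim \mathcal{N}(0, \sigma_w^2)$, rather than a neural network; the function name, the step-size schedule $\alpha_t = 0.01\,(10 + t)^{-0.55}$ and all constants are illustrative choices. The exponent lies in $(0.5, 1]$, so the schedule satisfies both Robbins-Monro conditions.

```python
import math
import random

def sgld_gaussian_mean(ys, s2w=10.0, M=10, steps=5000, seed=0):
    """SGLD samples of w for y_n ~ N(w, 1), w ~ N(0, s2w) (toy model)."""
    rng = random.Random(seed)
    N = len(ys)
    w = 0.0
    samples = []
    for t in range(1, steps + 1):
        alpha = 0.01 * (10.0 + t) ** (-0.55)    # decaying step size alpha_t
        batch = rng.sample(ys, M)               # mini-batch S of size M
        grad_lik = sum(y - w for y in batch)    # sum of d/dw log N(y | w, 1)
        grad_prior = -w / s2w                   # d/dw log N(w | 0, s2w)
        noise = rng.gauss(0.0, 1.0)             # p ~ N(0, 1)
        w += 0.5 * alpha * ((N / M) * grad_lik + grad_prior) \
             + math.sqrt(alpha) * noise
        samples.append(w)
    return samples

data_rng = random.Random(1)
ys = [2.0 + data_rng.gauss(0.0, 1.0) for _ in range(100)]
samples = sgld_gaussian_mean(ys)
kept = samples[1000:]                           # discard early iterates
sgld_mean = sum(kept) / len(kept)
exact_mean = sum(ys) / (len(ys) + 1.0 / 10.0)   # conjugate posterior mean
```

Comparing `sgld_mean` with `exact_mean` gives a quick sanity check; no Metropolis-Hastings correction is applied, which is the usual simplification once the step size is small.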
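Stochastic variational inference
The previous slides combined stochastic gradient methods with MCMC.
Next, variational inference is combined with stochastic gradient descent.
⟹ Stochastic variational inference
With $\xi$ denoting the set of variational parameters, the goal is to find an approximate distribution $q(W; \xi)$ such that
$$q(W; \xi) \approx p(W \mid X, Y).$$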
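Stochastic variational inference
Mini-batches are introduced for efficiency.
$$\mathcal{L}(\xi) = \sum_{n=1}^{N} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$$
⟹ (mini-batch version)
$$\mathcal{L}_S(\xi) = \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$$
$\mathcal{L}_S$ computed on a mini-batch is an unbiased estimator of $\mathcal{L}$:
$$\mathbb{E}_S[\mathcal{L}_S(\xi)] = \mathcal{L}(\xi)$$
Therefore, instead of maximizing $\mathcal{L}(\xi)$, the posterior of the parameters can be approximated efficiently by maximizing $\mathcal{L}_S(\xi)$.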
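The unbiasedness claim can be checked exactly on a small example by averaging the rescaled mini-batch sum over every possible mini-batch. In the sketch below, `terms` is a hypothetical list standing in for the per-example expected log-likelihoods $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$; the KL term is identical in $\mathcal{L}$ and $\mathcal{L}_S$ and is therefore left out.

```python
from itertools import combinations

def minibatch_estimate(terms, subset):
    """(N / M) * sum over the mini-batch of the per-example terms."""
    N, M = len(terms), len(subset)
    return (N / M) * sum(terms[n] for n in subset)

def average_over_all_minibatches(terms, M):
    """Expectation of the estimate under uniform sampling of size-M subsets."""
    subsets = list(combinations(range(len(terms)), M))
    return sum(minibatch_estimate(terms, s) for s in subsets) / len(subsets)

terms = [-1.2, -0.3, -2.5, -0.7, -1.9]   # illustrative values only
full = sum(terms)
expected = average_over_all_minibatches(terms, M=2)
```

Each index appears in the same fraction $M/N$ of the subsets, which is exactly what the factor $N/M$ compensates for.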
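Stochastic variational inference
In the following slides the approximate distribution is assumed to be a product of independent Gaussians, and the ELBO is maximized with gradient descent.
$$q(W; \xi) = \prod_{i,j,l} \mathcal{N}\!\left(w_{i,j}^{(l)} \,\middle|\, \mu_{i,j}^{(l)}, \sigma_{i,j}^{(l)\,2}\right)$$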
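Monte Carlo approximation of gradients
When maximizing the ELBO of a neural network, the parameters $W$ in the ELBO cannot be integrated out analytically.
⟹ Maximize $\mathcal{L}_S(\xi)$ by gradient descent.
To use gradient descent, the gradient of $\mathcal{L}_S(\xi)$ with respect to the variational parameters $\xi$ is needed.
In $D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$ both distributions are Gaussian, so its gradient can be computed analytically.
The log-likelihood term $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$, on the other hand, cannot be integrated analytically.
⟹ Approximate the integral (the log-likelihood term) with Monte Carlo and obtain an estimate of the gradient.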
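Monte Carlo approximation of gradients
【Goal】
For a parameter $w \in \mathbb{R}$, a function $f(w)$ and a distribution $q(w; \xi)$, estimate the gradient
$$I(\xi) = \nabla_\xi \int f(w)\, q(w; \xi)\, dw.$$
【Methods】
Score function estimator, reparameterization gradient, generalized reparameterization gradient, implicit differentiation, and others.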
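Monte Carlo approximation of gradients: score function estimator
Rewrite $I(\xi)$ as follows.
$$\begin{aligned}
I(\xi) &= \nabla_\xi \int f(w)\, q(w; \xi)\, dw \\
&= \int f(w)\, \nabla_\xi q(w; \xi)\, dw \\
&= \int f(w)\, q(w; \xi)\, \nabla_\xi \log q(w; \xi)\, dw \\
&= \mathbb{E}_{q(w; \xi)}[f(w)\, \nabla_\xi \log q(w; \xi)]
\end{aligned}$$
Therefore, drawing several samples of $w$ from $q(w; \xi)$ and then evaluating the derivative gives an unbiased estimator of $I(\xi)$.
【Condition for use】 The derivative of $\log q(w; \xi)$ can be computed.
【Issue】 In practice the estimator has very high variance.
【Remedy】 Combine it with a variance reduction technique such as control variates.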
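A minimal sketch of the score function estimator for a Gaussian $q(w; \xi) = \mathcal{N}(w \mid \mu, \sigma^2)$, differentiating with respect to $\mu$ only. The test function $f(w) = w^2$, the sample size and the function name are assumptions for illustration; for this choice $\mathbb{E}_q[f] = \mu^2 + \sigma^2$, so the exact gradient is $2\mu$.

```python
import random

def score_function_grad_mu(f, mu, sigma, n, rng):
    """Estimate d/dmu E_{N(w | mu, sigma^2)}[f(w)]; also return sample variance."""
    vals = []
    for _ in range(n):
        w = rng.gauss(mu, sigma)
        score = (w - mu) / sigma ** 2   # d/dmu log N(w | mu, sigma^2)
        vals.append(f(w) * score)       # f(w) * grad log q(w; xi)
    mean = sum(vals) / n
    var = sum((v - mean) ** 2 for v in vals) / (n - 1)
    return mean, var

rng = random.Random(0)
grad_est, per_sample_var = score_function_grad_mu(
    lambda w: w * w, mu=1.0, sigma=0.5, n=20000, rng=rng)
```

Only evaluations of $f$ are needed, never its derivative, which is what makes the estimator so general; the price is the large per-sample variance returned alongside the estimate.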
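Monte Carlo approximation of gradients: reparameterization gradient
Instead of sampling $w$ from $q(w; \xi)$, sample $\epsilon$ from a distribution $q(\epsilon)$ that does not depend on $\xi$ and apply a transformation $w = g(\xi; \epsilon)$, so that $w$ is sampled indirectly.
This gives an unbiased estimator of the gradient:
$$\mathbb{E}_{q(\epsilon)}[f'(g(\xi; \epsilon))\, \nabla_\xi g(\xi; \epsilon)] = I(\xi)$$
【Example】 $\xi = \{\hat{\mu}, \hat{\sigma}^2\}$, $q(w; \xi) = \mathcal{N}(w \mid \hat{\mu}, \hat{\sigma}^2)$
Setting $\tilde{\epsilon} \sim \mathcal{N}(0, 1) = q(\epsilon)$ and $\tilde{w} = g(\xi; \tilde{\epsilon}) = \hat{\mu} + \hat{\sigma} \tilde{\epsilon}$, the value $\tilde{w}$ is a sample from $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$.
The derivatives with respect to the variational parameters are as follows, giving an unbiased gradient estimator for each parameter.
$$\frac{\partial}{\partial \hat{\mu}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\mu}) = \mathbb{E}_{q(w; \xi)}[f'(w)]$$
$$\frac{\partial}{\partial \hat{\sigma}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}}\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\sigma}) = \mathbb{E}_{q(w; \xi)}\!\left[ f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}} \right]$$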
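The Gaussian example translates directly into code. The sketch below assumes the test function $f(w) = w^2$ (so `df` is $f'(w) = 2w$), for which the exact gradients are $2\hat{\mu}$ and $2\hat{\sigma}$; the function name and sample size are illustrative.

```python
import random

def reparam_grads(df, mu, sigma, n, rng):
    """Reparameterization-gradient estimates for q(w) = N(mu, sigma^2).

    df is f'(w). Returns ((mean, var) for d/dmu, (mean, var) for d/dsigma).
    """
    g_mu, g_sigma = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)     # eps ~ q(eps) = N(0, 1)
        w = mu + sigma * eps          # w = g(xi; eps)
        g_mu.append(df(w))            # f'(w) * dw/dmu, with dw/dmu = 1
        g_sigma.append(df(w) * eps)   # f'(w) * dw/dsigma, eps = (w - mu) / sigma

    def mean_var(values):
        m = sum(values) / len(values)
        return m, sum((x - m) ** 2 for x in values) / (len(values) - 1)

    return mean_var(g_mu), mean_var(g_sigma)

rng = random.Random(0)
(grad_mu, var_mu), (grad_sigma, var_sigma) = reparam_grads(
    lambda w: 2.0 * w, mu=1.0, sigma=0.5, n=20000, rng=rng)
```

For this particular $f$ the per-sample variance of the $\hat{\mu}$ gradient is $4\hat{\sigma}^2 = 1$, far smaller than what the score function estimator gives on the same problem, at the cost of requiring $f'$.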
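Monte Carlo approximation of gradients: generalizing the reparameterization gradient
【Advantage of the reparameterization gradient】
The variance of the gradient is kept smaller than with the score function estimator.
【Issue with the reparameterization gradient】
A change of variables $g$ is required. (It cannot be applied to every distribution.)
【Remedy: generalized reparameterization gradient】
Relaxes the constraints on $g$ so that the method applies to many more kinds of distributions.
The noise distribution is allowed to keep a dependence on the variational parameters, as in $q(\epsilon; \xi)$.
【Remedy: implicit differentiation】
【Conditions for use】
• $g$ is hard to obtain, but the inverse transformation $g^{-1}$ is easy to obtain.
• The distribution is continuous.
The gradient of the expectation is obtained by differentiating $\epsilon = g^{-1}(w; \xi)$ with respect to $\xi$.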
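Monte Carlo approximation of gradients: generalizing the reparameterization gradient
【Remedy: continuous relaxation】
A way of applying the reparameterization gradient to discrete probability distributions.
【Example】 The categorical distribution (discrete) coincides with the Gumbel-softmax distribution (continuous) in the limit where the temperature parameter goes to 0.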
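Variational inference with gradient approximation
The ELBO of a Bayesian neural network is now maximized in practice with the reparameterization gradient.
① Randomly draw a mini-batch $S$ from the dataset $\mathcal{D}$.
② Draw $M$ noise samples (one per sample in the mini-batch): $\tilde{\epsilon}_i \sim \mathcal{N}(0, I)$.
$$\begin{aligned}
\mathcal{L}_S(\xi) &= \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)] \\
&= \frac{N}{M} \sum_{n \in S} \int p(\epsilon) \log p(y_n \mid f(x_n; g(\xi; \epsilon)))\, d\epsilon - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)] \\
&\approx \mathcal{L}_{S,\epsilon}(\xi) \qquad (\because\ \mathbb{E}_{S,\epsilon}[\mathcal{L}_{S,\epsilon}(\xi)] = \mathcal{L}(\xi)) \\
&= \frac{N}{M} \sum_{n \in S} \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]
\end{aligned}$$
③ Compute the gradient with respect to the variational parameters.
$$\nabla_\xi \mathcal{L}_S(\xi) \approx \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi) = \frac{N}{M} \sum_{n \in S} \nabla_\xi \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - \nabla_\xi D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$$
④ Update the variational parameters in the direction that increases the ELBO.
$$\xi \leftarrow \xi + \alpha \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi)$$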
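Steps ① to ④ can be sketched end to end on a model small enough to check by hand. The sketch assumes a one-weight linear "network" $f(x; w) = w x$ with known noise variance `s2y`, a standard normal prior, and $q(w) = \mathcal{N}(\mu, \sigma^2)$ with $\sigma = e^{\rho}$ to keep it positive; the function name, the $\rho$ parameterization and every constant are illustrative assumptions. For this model the exact posterior is Gaussian, so the fitted $\mu$ can be compared with the closed-form posterior mean.

```python
import math
import random

def fit_mean_field_gaussian(xs, ys, s2y=0.25, M=10, steps=4000, lr=5e-4, seed=0):
    """Maximize the mini-batch ELBO for y = w * x + noise, prior w ~ N(0, 1)."""
    rng = random.Random(seed)
    N = len(xs)
    mu, rho = 0.0, math.log(0.5)             # q(w) = N(mu, sigma^2), sigma = exp(rho)
    for _ in range(steps):
        sigma = math.exp(rho)
        batch = rng.sample(range(N), M)      # step 1: mini-batch S
        g_mu, g_rho = 0.0, 0.0
        for n in batch:
            eps = rng.gauss(0.0, 1.0)        # step 2: one noise draw per example
            w = mu + sigma * eps             # w = g(xi; eps)
            dlogp_dw = (ys[n] - w * xs[n]) * xs[n] / s2y
            g_mu += dlogp_dw                 # dw/dmu = 1
            g_rho += dlogp_dw * sigma * eps  # dw/drho = sigma * eps
        # step 3: gradient of L_{S,eps}; KL[N(mu, sigma^2) || N(0, 1)] is analytic,
        # with dKL/dmu = mu and dKL/drho = sigma^2 - 1.
        g_mu = (N / M) * g_mu - mu
        g_rho = (N / M) * g_rho - (sigma ** 2 - 1.0)
        mu += lr * g_mu                      # step 4: move up the ELBO
        rho += lr * g_rho
    return mu, math.exp(rho)

data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(50)]
ys = [1.5 * x + data_rng.gauss(0.0, 0.5) for x in xs]
mu_hat, sigma_hat = fit_mean_field_gaussian(xs, ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
exact_mean = (sxy / 0.25) / (sxx / 0.25 + 1.0)   # closed-form posterior mean
```

With a constant learning rate the iterates keep fluctuating around the optimum, so `mu_hat` should land near `exact_mean` rather than exactly on it.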
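Learning with expectation propagation
In the forward pass, probabilities are propagated through the neural network to evaluate the marginal likelihood; in the backward pass, expectation propagation is used to compute the gradient of the marginal likelihood in order to learn the parameters.
⟹ Probabilistic backpropagation
Probabilistic backpropagation processes the data sequentially, so it scales to learning with large amounts of data. The precision parameter of the observed data and the precision parameter governing the prior of the weights can also be inferred approximately.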
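Learning with expectation propagation
【Learning with expectation propagation】
‣ Model
‣ Approximate distribution
‣ Initialization and introduction of the prior factors
‣ Introduction of the likelihood factors
‣ Distribution of the activations
‣ Gradient-based learning
‣ Summary of probabilistic backpropagation
‣ Related methods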
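Learning with expectation propagation: model
【Setup】
Let $y_n \in \mathbb{R}$ and define the likelihood as
$$p(Y \mid X, W, \gamma_y) = \prod_{n=1}^{N} \mathcal{N}(y_n \mid f(x_n; W), \gamma_y^{-1}), \qquad p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid \alpha_{\gamma_y 0}, \beta_{\gamma_y 0}).$$
The activation function of $f(x_n; W)$ is the rectified linear unit (ReLU).
The parameters $W$ follow independent Gaussians:
$$p(W \mid \gamma_w) = \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid 0, \gamma_w^{-1}), \qquad p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid \alpha_{\gamma_w 0}, \beta_{\gamma_w 0}).$$
【Goal】
Approximately infer the posterior
$$p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w).$$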
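Learning with expectation propagation: approximate distribution
Probabilistic backpropagation is based on assumed density filtering.
The approximate distribution of the parameters is set to
$$\begin{aligned}
q(W, \gamma_y, \gamma_w) &= \mathrm{Gam}(\gamma_y \mid \alpha_{\gamma_y}, \beta_{\gamma_y})\, \mathrm{Gam}(\gamma_w \mid \alpha_{\gamma_w}, \beta_{\gamma_w}) \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid m_{i,j}^{(l)}, v_{i,j}^{(l)}) \\
&= q(\gamma_y)\, q(\gamma_w)\, q(W).
\end{aligned}$$
This distribution is updated sequentially by the moment matching of assumed density filtering.
Assumed density filtering:
$$q_{i+1}(\theta) \approx r_{i+1} = \frac{1}{Z_{i+1}} f_{i+1}(\theta)\, q_i(\theta), \qquad f_i(\theta):\ \text{factor}$$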
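Learning with expectation propagation: initialization and prior factors
【Initialization】
So that the approximate distribution is non-informative, initialize with $m_{i,j}^{(l)} = 0$, $v_{i,j}^{(l)} = \infty$, $\alpha_{\gamma_y} = 1$, $\beta_{\gamma_y} = 0$, $\alpha_{\gamma_w} = 1$, $\beta_{\gamma_w} = 0$.
【Introducing the prior factors】
The approximate distribution is updated by adding the factors of the target posterior one at a time.
Posterior: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
Approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$
The prior factors in this model are
$$p(\gamma_y), \quad p(\gamma_w), \quad \{p(w_{i,j}^{(l)} \mid \gamma_w)\}_{i,j,l}.$$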
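Learning with expectation propagation: initialization and prior factors
【Introducing the prior factors】
• Adding the factors $p(\gamma_w)$ and $p(\gamma_y)$
Assumed density filtering:
$$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx r = \frac{1}{Z} f^{\mathrm{new}}(\gamma_y, \gamma_w, W)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
The approximate distributions $q(\gamma_y)$, $q(\gamma_w)$ are of the same family as the priors $p(\gamma_y)$, $p(\gamma_w)$, so the update for these factors is
$$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx p(\gamma_y)\, p(\gamma_w)\, q(W),$$
$$\alpha_{\gamma_y}^{\mathrm{new}} = \alpha_{\gamma_y 0}, \quad \beta_{\gamma_y}^{\mathrm{new}} = \beta_{\gamma_y 0}, \quad \alpha_{\gamma_w}^{\mathrm{new}} = \alpha_{\gamma_w 0}, \quad \beta_{\gamma_w}^{\mathrm{new}} = \beta_{\gamma_w 0}.$$
That is, $q(\gamma_y) \leftarrow p(\gamma_y)$ and $q(\gamma_w) \leftarrow p(\gamma_w)$.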
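Learning with expectation propagation: initialization and prior factors
【Introducing the prior factors】
• Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$
$$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
$$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_w)\, q(W)$$
From here on the indices $i, j, l$ are omitted.
The distributions that are updated are $q(W)$ and $q(\gamma_w)$, each as follows:
$$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(\gamma_w)\, q(W)$$
$$q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)$$
In each line, everything that multiplies the distribution being updated is regarded as the factor. Note that neither update uses the newly updated version of the other distribution, so the order of the two updates does not matter.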
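Learning with expectation propagation: initialization and prior factors
【Introducing the prior factors】
• Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(W)$
$$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(\gamma_w)\, q(W)$$
Since $q(W)$ is Gaussian, the approximate distribution is updated by moment matching, in the same way as in the earlier Gaussian example:
$$m^{\mathrm{new}} = m + v \frac{\partial}{\partial m} \log Z_0$$
$$v^{\mathrm{new}} = v - v^2 \left\{ \left( \frac{\partial}{\partial m} \log Z_0 \right)^2 - 2 \frac{\partial}{\partial v} \log Z_0 \right\}$$
$$\begin{aligned}
Z_0 = Z(\alpha_{\gamma_w}, \beta_{\gamma_w}) &= \int p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)\, dw\, d\gamma_w \\
&= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha_{\gamma_w}, \beta_{\gamma_w})\, dw\, d\gamma_w
\end{aligned}$$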
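Learning with expectation propagation: initialization and prior factors
【Introducing the prior factors】
• Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(\gamma_w)$
$$q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)$$
Since $q(\gamma_w)$ is a gamma distribution, the approximate distribution is updated by moment matching, in the same way as in the earlier gamma example:
$$\alpha_{\gamma_w}^{\mathrm{new}} = \left\{ Z_0 Z_2 Z_1^{-2}\, \frac{\alpha_{\gamma_w} + 1}{\alpha_{\gamma_w}} - 1 \right\}^{-1}$$
$$\beta_{\gamma_w}^{\mathrm{new}} = \left\{ Z_2 Z_1^{-1}\, \frac{\alpha_{\gamma_w} + 1}{\beta_{\gamma_w}} - Z_1 Z_0^{-1}\, \frac{\alpha_{\gamma_w}}{\beta_{\gamma_w}} \right\}^{-1}$$
where $Z_1 = Z(\alpha_{\gamma_w} + 1, \beta_{\gamma_w})$ and $Z_2 = Z(\alpha_{\gamma_w} + 2, \beta_{\gamma_w})$.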
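Learning with expectation propagation: initialization and prior factors
【Introducing the prior factors】
The normalization constant $Z(\alpha_{\gamma_w}, \beta_{\gamma_w})$ cannot be computed exactly, so the Student's t distribution that appears partway through the calculation is approximated by a Gaussian with the same mean and variance.
$$\begin{aligned}
Z(\alpha_{\gamma_w}, \beta_{\gamma_w}) &= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w \\
&= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha_{\gamma_w}, \beta_{\gamma_w})\, dw\, d\gamma_w \\
&= \int \mathrm{St}(w \mid 0, \alpha_{\gamma_w}/\beta_{\gamma_w}, 2\alpha_{\gamma_w})\, \mathcal{N}(w \mid m, v)\, dw \\
&\approx \int \mathcal{N}(w \mid 0, \beta_{\gamma_w}/(\alpha_{\gamma_w} - 1))\, \mathcal{N}(w \mid m, v)\, dw && \text{(t distribution} \to \text{Gaussian with equal mean and variance)} \\
&= \mathcal{N}(m \mid 0, \beta_{\gamma_w}/(\alpha_{\gamma_w} - 1) + v)
\end{aligned}$$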
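The variance used in the Gaussian approximation can be checked in two lines. The check assumes the convention in which $\mathrm{St}(w \mid \mu, \lambda, \nu)$ has precision-like scale $\lambda$ and $\nu$ degrees of freedom, so that its variance is $\nu / \{\lambda(\nu - 2)\}$ for $\nu > 2$, together with the standard Gaussian convolution identity.

```latex
\begin{align}
\mathrm{Var}[w]
  &= \frac{1}{\lambda}\cdot\frac{\nu}{\nu - 2}
   = \frac{\beta_{\gamma_w}}{\alpha_{\gamma_w}}\cdot
     \frac{2\alpha_{\gamma_w}}{2\alpha_{\gamma_w} - 2}
   = \frac{\beta_{\gamma_w}}{\alpha_{\gamma_w} - 1}
   \qquad (\alpha_{\gamma_w} > 1), \\
\int \mathcal{N}(w \mid 0, s)\,\mathcal{N}(w \mid m, v)\,dw
  &= \mathcal{N}(m \mid 0, s + v),
   \qquad s = \frac{\beta_{\gamma_w}}{\alpha_{\gamma_w} - 1}.
\end{align}
```

The requirement $\alpha_{\gamma_w} > 1$ is the condition for the t distribution to have a finite variance at all, so the matched Gaussian is only defined once the gamma shape parameter exceeds 1.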
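Learning with expectation propagation: likelihood factors
After every prior factor has been added, the factors of the likelihood $p(Y \mid X, W, \gamma_y)$ are added one at a time.
$$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
$$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W)$$
$q(W)$ is Gaussian and $q(\gamma_y)$ is a gamma distribution, so the updates proceed in the same way as before:
$$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W)$$
$$q^{\mathrm{new}}(\gamma_y) \approx \frac{1}{Z_0} p(y_i \mid x_i, W, \gamma_y)\, q(W)\, q(\gamma_y)$$
⟹ The goal is to compute the normalization constant for the newly added likelihood factor $p(y_i \mid x_i, W, \gamma_y)$, which is the part of the update that differs from the case of adding $p(w_{i,j}^{(l)} \mid \gamma_w)$.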
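Learning with expectation propagation: likelihood factors
The normalization constant when the $i$-th likelihood factor is added is obtained approximately as follows.
$$\begin{aligned}
Z(\alpha_{\gamma_y}, \beta_{\gamma_y}) &= \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w \\
&= \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y)\, dW\, d\gamma_y \\
&\approx \int \mathcal{N}(y_i \mid z^{(L)}, \gamma_y^{-1})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, \mathrm{Gam}(\gamma_y \mid \alpha_{\gamma_y}, \beta_{\gamma_y})\, dz^{(L)}\, d\gamma_y \\
&= \int \mathrm{St}(y_i \mid z^{(L)}, \alpha_{\gamma_y}/\beta_{\gamma_y}, 2\alpha_{\gamma_y})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)} \\
&\approx \int \mathcal{N}(y_i \mid z^{(L)}, \beta_{\gamma_y}/(\alpha_{\gamma_y} - 1))\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)} \\
&= \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta_{\gamma_y}/(\alpha_{\gamma_y} - 1) + v_{z^{(L)}})
\end{aligned}$$
Third line: the hidden units of layer $l$, $z^{(l)} \in \mathbb{R}^{H_l}$, are assumed to follow a Gaussian with mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ (details on the next slide).
Fifth line: the t distribution is approximated by a Gaussian with the same mean and variance.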
Learning with expectation propagation: likelihood factors
The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are obtained approximately by a recursive computation.
【Computation】
Assume that the hidden units of layer $l$, $z^{(l)} \in \mathbb{R}^{H_l}$, have mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$. Let the vector obtained after multiplying by the weight matrix $W^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ of layer $l$ (the activation) be
$$a^{(l)} = W^{(l)} z^{(l-1)} / \sqrt{H_{l-1}}.$$
The mean and variance of $a^{(l)}$ are
$$m_{a^{(l)}} = M^{(l)} m_{z^{(l-1)}} / \sqrt{H_{l-1}}$$
$$v_{a^{(l)}} = \left\{ (M^{(l)} \odot M^{(l)})\, v_{z^{(l-1)}} + V^{(l)} (m_{z^{(l-1)}} \odot m_{z^{(l-1)}}) + V^{(l)} v_{z^{(l-1)}} \right\} / H_{l-1}$$
where $M^{(l)}, V^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ hold the means $m_{i,j}^{(l)}$ and variances $v_{i,j}^{(l)}$ of the parameters, and $\odot$ is the Hadamard product.
The mean $m_{z^{(l-1)}}$ and variance $v_{z^{(l-1)}}$ of the hidden units of layer $l - 1$ thus give the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activations of layer $l$.
If the mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ of the hidden units of layer $l$ can in turn be obtained from $m_{a^{(l)}}$ and $v_{a^{(l)}}$, the whole computation can be carried out recursively.
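The two moment formulas for the linear part of a layer can be written out element by element. The sketch below uses plain nested lists instead of a matrix library; the function name and the example numbers are illustrative assumptions.

```python
import math

def linear_layer_moments(M, V, m_z, v_z):
    """Mean and variance of a = W z / sqrt(H_prev) for independent W and z.

    M, V: H x H_prev lists of weight means and variances.
    m_z, v_z: length-H_prev lists of input means and variances.
    """
    H_prev = len(m_z)
    m_a, v_a = [], []
    for M_row, V_row in zip(M, V):
        mean = sum(mij * mz for mij, mz in zip(M_row, m_z)) / math.sqrt(H_prev)
        # (M . M) v_z + V (m_z . m_z) + V v_z, summed over the inputs
        var = sum(mij * mij * vz + vij * mz * mz + vij * vz
                  for mij, vij, mz, vz in zip(M_row, V_row, m_z, v_z)) / H_prev
        m_a.append(mean)
        v_a.append(var)
    return m_a, v_a

# First layer: the input is deterministic, so its variance is zero.
m_a, v_a = linear_layer_moments(
    M=[[1.0, 2.0]], V=[[0.5, 0.5]], m_z=[1.0, 1.0], v_z=[0.0, 0.0])
```

The three terms in the variance are exactly the variance of a product of two independent variables, $\mathrm{Var}[wz] = m_w^2 v_z + v_w m_z^2 + v_w v_z$, summed over the inputs of each unit.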
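Learning with expectation propagation: distribution of the activations
Compute the distribution $p(a^{(l)} \mid W^{(l)}, z^{(l-1)})$ of the activation $a^{(l)}$. By the central limit theorem, when the number of hidden units $H_{l-1}$ is large, $a^{(l)}$ approximately follows a Gaussian:
$$p(a^{(l)} \mid W^{(l)}, z^{(l-1)}) \approx q(a^{(l)}) = \mathcal{N}(a^{(l)} \mid m_{a^{(l)}}, v_{a^{(l)}})$$
When a Gaussian variable passes through a ReLU, the result is a mixture of two distributions, as in the right-hand panel of the figure.
① Samples that came from negative inputs form a point-mass distribution with mean $\mu_p = 0$ and variance $\sigma_p = 0$.
② Samples that came from non-negative inputs follow a truncated Gaussian with the part below 0 cut off.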
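Learning with expectation propagation: distribution of the activations
【General formulas for the mean and variance of a mixture】
For a mixture with $K$ components and mixing proportions $\pi_k > 0$, $\sum_{k=1}^{K} \pi_k = 1$, the mean and variance are in general
$$\mathbb{E}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k \mu_k$$
$$\mathbb{V}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k (\mu_k^2 + \sigma_k^2) - \mathbb{E}[x_{\mathrm{mix}}]^2$$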
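Learning with expectation propagation: distribution of the activations
【Application to the mixture for the activations】
Let $\pi_p$ and $\pi_t$ be the mixing proportions of the point mass and the truncated Gaussian, so that $\pi_p + \pi_t = 1$.
With $\bar{\mu} = -\mu/\sigma$,
$$\pi_p = \int_{-\infty}^{0} \mathcal{N}(x \mid \mu, \sigma^2)\, dx = \Phi(-\mu/\sigma) = \Phi(\bar{\mu}).$$
The proportion of the truncated Gaussian is therefore
$$\pi_t = 1 - \pi_p = \Phi(-\bar{\mu}).$$
From [S. Kotz], the mean $\mu_t$ and variance $\sigma_t^2$ of the truncated Gaussian are
$$\mu_t = \mu + \sigma \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})}$$
$$\sigma_t^2 = \sigma^2 \left\{ 1 + \bar{\mu} \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} - \left( \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} \right)^2 \right\}$$
Substituting these into $\mathbb{E}[x_{\mathrm{mix}}]$ and $\mathbb{V}[x_{\mathrm{mix}}]$ of the general formulas gives the mean and variance of $z$.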
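Plugging the point mass and the truncated Gaussian into the mixture formulas gives a small, checkable routine. The sketch below is scalar and uses only `math.erf`; the function names are hypothetical, and no safeguard is included for strongly negative means, where $\Phi(-\bar{\mu})$ underflows and the ratio would need a numerically stable form.

```python
import math

def norm_pdf(x):
    """Standard normal density N(x | 0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal CDF Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def relu_moments(mu, sigma):
    """Mean and variance of max(0, a) for a ~ N(mu, sigma^2)."""
    mu_bar = -mu / sigma
    pi_p = norm_cdf(mu_bar)                      # weight of the point mass at 0
    pi_t = norm_cdf(-mu_bar)                     # weight of the truncated Gaussian
    ratio = norm_pdf(mu_bar) / norm_cdf(-mu_bar)
    mu_t = mu + sigma * ratio                    # truncated-Gaussian mean
    var_t = sigma ** 2 * (1.0 + mu_bar * ratio - ratio ** 2)
    mean = pi_p * 0.0 + pi_t * mu_t              # E[x_mix]
    second = pi_p * 0.0 + pi_t * (mu_t ** 2 + var_t)
    return mean, second - mean ** 2              # V[x_mix]

m_z, v_z = relu_moments(0.0, 1.0)
```

For a standard normal input the closed forms are $1/\sqrt{2\pi}$ for the mean and $1/2 - 1/(2\pi)$ for the variance, which is a convenient reference value.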
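Learning with expectation propagation: distribution of the activations
In other words, the mean and variance of the hidden units of layer $l$ can be computed from the mean and variance of the activations of layer $l$.
The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are therefore obtained approximately by a recursive computation.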
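Learning with expectation propagation: gradient-based learning
$z^{(0)}$ is treated as having mean $x_i$ and variance $0$ (the initial values of the recursion are $m_{z^{(0)}} = x_i$, $v_{z^{(0)}} = 0$).
The preceding slides described the sequence that takes the output $z^{(l-1)}$ of layer $l - 1$ through the activation $a^{(l)}$ and gives the mean and variance of the output $z^{(l)}$ of layer $l$ (approximated as Gaussian by the central limit theorem). Applying this approximation recursively, the distribution of the final layer $z^{(L)}$ can be approximated by the Gaussian $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$.
This gives an approximate expression for the normalization constant:
$$Z(\alpha_{\gamma_y}, \beta_{\gamma_y}) \approx \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta_{\gamma_y}/(\alpha_{\gamma_y} - 1) + v_{z^{(L)}})$$
Once the normalization constant is available, the gradients are computed by differentiating it with respect to the parameters.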
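Learning with expectation propagation: summary of probabilistic backpropagation
Model definition: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
Approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$
Initialization of the approximate distribution: $q_0(\gamma_y)$, $q_0(\gamma_w)$, $q_0(W)$
Introduction of the prior factors (part 1):
    add factor $p(\gamma_y)$: $q(\gamma_y) \leftarrow p(\gamma_y)$
    add factor $p(\gamma_w)$: $q(\gamma_w) \leftarrow p(\gamma_w)$
Introduction of the prior factors (part 2):
    for $l = 1$ to $L$ do
        for $j = 1$ to $H_{l-1}$ do
            for $i = 1$ to $H_l$ do
                add factor $p(w_{i,j}^{(l)} \mid \gamma_w)$:
                    ⋅ update $q(W)$
                    ⋅ update $q(\gamma_w)$
Introduction of the likelihood factors, for each $p(y_i \mid x_i, W, \gamma_y)$ where $i \in S$:
    forward pass: recursively compute the means and variances of the hidden units and activations
    introduce factor $p(y_i \mid x_i, W, \gamma_y)$: update $q(W)$ and $q(\gamma_y)$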
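Learning with expectation propagation: related methods
A method similar to probabilistic backpropagation is deterministic variational inference.
【Drawback of variational inference】
Evaluating the ELBO requires the expectation of the log-likelihood, which is approximated with Monte Carlo.
⟹ Stability is low.
【Deterministic variational inference】
Stability is improved by computing the approximation of the expectation deterministically.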