Bayesian Deep Learning (5.1~5.2)
catla
February 28, 2020
Science
Contents: Bayesian neural networks (Section 5.1) and speeding up approximate Bayesian inference (Section 5.2)
Transcript
Bayesian Deep Learning 5.1~5.2
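Today's contents
‣ Approximate inference for Bayesian neural network models
  ‣ Bayesian neural network model
  ‣ Learning with the Laplace approximation
  ‣ Hamiltonian Monte Carlo
‣ Making approximate Bayesian inference efficient
  ‣ Learning with stochastic gradient Langevin dynamics
  ‣ Learning with stochastic variational inference
  ‣ Monte Carlo approximation of gradients
  ‣ Variational inference with gradient approximation
  ‣ Learning with expectation propagation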
Approximate inference for Bayesian neural network models
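Bayesian neural network model
The approximate inference methods from the earlier chapter can also be applied to deep learning models.
As with the linear regression model, we make a feedforward neural network (NN) Bayesian.
⟹ Place a prior distribution on the parameters $W$, which makes probabilistic learning and prediction possible.
Learning and prediction in Bayesian inference
The joint distribution including the parameters can be written as
$p(Y, W \mid X) = p(W) \prod_{n=1}^{N} p(y_n \mid x_n, W)$.
Learning: evaluate the posterior $p(W \mid X, Y)$.
Prediction: compute the predictive distribution $p(y_* \mid x_*, Y, X)$.
(Graphical model: $W \to y_n \leftarrow x_n$, with a plate over $n = 1, \dots, N$.)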
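Bayesian neural network model
Setup
Let the input data be $X = \{x_1, \dots, x_N\}$ and the observed data be $Y = \{y_1, \dots, y_N\}$, and write the joint distribution including the parameters as
$p(Y, W \mid X) = p(W) \prod_{n=1}^{N} p(y_n \mid x_n, W)$.
The observations are assumed to be drawn from
$p(y_n \mid x_n, W) = \mathcal{N}(y_n \mid f(x_n; W), \sigma_y^2 I)$,
where $f(x_n; W)$ is the function computed by the neural network and $\sigma_y^2$ is a fixed noise parameter.
The parameters are assumed to be drawn from
$p(w) = \mathcal{N}(w \mid 0, \sigma_w^2)$ for each $w \in W$,
where $\sigma_w^2$ is a fixed noise parameter.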
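Bayesian neural network model
Characteristics
For functions drawn from the NN prior:
More hidden units ⟶ the function becomes more complex.
Larger $\sigma_w$ ⟶ the function changes more abruptly.
It is known that, in a Bayesian NN, the posterior distribution becomes increasingly complex as hidden units are added.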
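Learning with the Laplace approximation
Laplace approximation:
$p(Z \mid X) \approx \mathcal{N}(Z \mid Z_{\mathrm{MAP}}, \{\Lambda(Z_{\mathrm{MAP}})\}^{-1})$, $\Lambda(Z) = -\nabla_Z^2 \log p(Z \mid X)$.
For simplicity, the NN output is taken to be one-dimensional.
Approximating the posterior
First find the MAP estimate of the posterior, i.e. the parameters $W_{\mathrm{MAP}}$ that maximize $p(W \mid Y, X)$.
Maximizing the posterior is equivalent to maximizing the log posterior, so the MAP estimate can be found by gradient ascent on the log posterior, where $\alpha$ is the learning rate:
$W_{\mathrm{new}} = W_{\mathrm{old}} + \alpha \nabla_W \log p(W \mid Y, X) \big|_{W = W_{\mathrm{old}}}$.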
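Learning with the Laplace approximation
Approximating the posterior
The gradient of the log posterior is obtained as follows.
$p(W \mid Y, X) = \dfrac{p(W)\, p(Y \mid X, W)}{p(Y \mid X)} \propto p(W)\, p(Y \mid X, W)$
Therefore,
$\log p(W \mid Y, X) = \log p(Y \mid X, W) + \log p(W) + c = \sum_{n=1}^{N} \log p(y_n \mid x_n, W) + \sum_{w \in W} \log p(w) + c$.
Taking the partial derivative with respect to a parameter $w \in W$ gives the derivative of a cost function:
$\dfrac{\partial}{\partial w} \log p(W \mid Y, X) = -\left\{ \dfrac{1}{\sigma_y^2} \dfrac{\partial}{\partial w} E(W) + \dfrac{1}{\sigma_w^2} \dfrac{\partial}{\partial w} \Omega_{L2}(W) \right\}$.
$E(W)$ and $\Omega_{L2}(W)$ are, respectively, the error function of the NN and the regularization term that comes from the prior on each parameter.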
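Learning with the Laplace approximation
Approximating the posterior
Once the MAP estimate has been found, the posterior can be approximated as
$p(W \mid Y, X) \approx q(W) = \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})$,
$\Lambda(W) = -\nabla_W^2 \log p(W \mid Y, X) = \dfrac{1}{\sigma_w^2} I + \dfrac{1}{\sigma_y^2} H$,
where $H$ is the Hessian of the error function.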
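Learning with the Laplace approximation
Approximating the predictive distribution
With the Laplace approximation, the predictive distribution can be approximated as
$p(y_* \mid x_*, Y, X) = \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \approx \int p(y_* \mid x_*, W)\, q(W)\, dW$.
However, $p(y_* \mid x_*, W)$ contains the NN, so the integral cannot be computed analytically.
We therefore assume that the posterior density of the parameters is concentrated around the MAP estimate and that, within that small region, $f(x_*; W)$ is well approximated by a linear function of $W$.
Under this assumption, a first-order Taylor expansion of $f(x_*; W)$ as a function of $W$ around $W_{\mathrm{MAP}}$ gives
$f(x_*; W) \approx f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}})$, $g = \nabla_W f(x_*; W) \big|_{W = W_{\mathrm{MAP}}}$.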
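Learning with the Laplace approximation
Approximating the predictive distribution
Putting this together gives the following approximation.
$p(y_* \mid x_*, Y, X) = \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW$
$\approx \int p(y_* \mid x_*, W)\, q(W)\, dW$ (Laplace approximation)
$= \int \mathcal{N}(y_* \mid f(x_*; W), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW$
$\approx \int \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}}), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW$ (first-order Taylor approximation)
$= \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*))$,
$\sigma^2(x_*) = \sigma_y^2 + g^{\top} \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g$.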
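The predictive variance formula $\sigma^2(x_*) = \sigma_y^2 + g^{\top}\Lambda^{-1} g$ can be sketched in a few lines of NumPy. This is only an illustration: the one-hidden-unit network `f`, the MAP estimate `w_map` and the precision matrix `Lambda` are made-up values, not something taken from the slides.

```python
import numpy as np

def f(x, w):
    # Hypothetical tiny network: one tanh hidden unit, scalar output.
    return w[1] * np.tanh(w[0] * x)

def grad_f(x, w):
    # Analytic gradient of f with respect to w = (w0, w1).
    t = np.tanh(w[0] * x)
    return np.array([w[1] * x * (1.0 - t**2), t])

def laplace_predictive(x_star, w_map, Lambda, sigma_y2):
    """Mean and variance of the linearized Laplace predictive distribution."""
    g = grad_f(x_star, w_map)
    mean = f(x_star, w_map)
    # sigma^2(x*) = sigma_y^2 + g^T Lambda^{-1} g, using a solve instead of an inverse.
    var = sigma_y2 + g @ np.linalg.solve(Lambda, g)
    return mean, var

w_map = np.array([0.8, 1.5])      # assumed MAP estimate
Lambda = np.array([[4.0, 0.5],    # assumed posterior precision at w_map
                   [0.5, 2.0]])
mean, var = laplace_predictive(0.3, w_map, Lambda, sigma_y2=0.1)
print(mean, var)
```

The second term is non-negative whenever `Lambda` is positive definite, so the predictive variance never drops below the observation noise $\sigma_y^2$.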
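Learning with Hamiltonian Monte Carlo (HMC)
HMC can be applied whenever the log posterior (the potential energy in the Hamiltonian) is differentiable with respect to the variables to be sampled. Given enough computation time, samples from the true posterior are obtained in theory (a property of MCMC). As a result, uncertainty can be represented by multiple samples.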
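Learning with Hamiltonian Monte Carlo (HMC)
Inference of the weight parameters
Using the unnormalized posterior, the corresponding potential energy is
$\mathcal{U}(W) = -\{\log p(Y \mid X, W) + \log p(W)\}$.
Differentiating this shows that it is equivalent to the derivative of the cost function introduced earlier.
⟹ Gradients can be computed with backpropagation.
[Problems with MCMC-based approximate inference]
• There is no way of knowing whether the number of samples is sufficient.
• Tuning the MCMC parameters is difficult (e.g. the step size and number of steps in HMC).
• Learning is slow.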
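Learning with Hamiltonian Monte Carlo (HMC)
Inference of the hyperparameters
By placing priors on the hyperparameters $\sigma_w$ and $\sigma_y$, they can be inferred jointly with $W$.
Introduce the precision parameter $\gamma_w = \sigma_w^{-2}$ and define its prior as a gamma distribution:
$p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid a_w, b_w)$ ($a_w, b_w$ are fixed positive values).
Similarly, for $\gamma_y = \sigma_y^{-2}$:
$p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid a_y, b_y)$ ($a_y, b_y$ are fixed positive values).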
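Learning with Hamiltonian Monte Carlo (HMC)
Inference of the hyperparameters
Rewriting the model (the joint distribution including the parameters) gives
$p(Y, W, \gamma_w, \gamma_y \mid X) = p(\gamma_w)\, p(\gamma_y)\, p(W \mid \gamma_w) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$.
The posterior to be inferred is $p(W, \gamma_w, \gamma_y \mid X, Y)$.
(Graphical model: $\gamma_w \to W \to y_n \leftarrow x_n$ and $\gamma_y \to y_n$, with a plate over $n = 1, \dots, N$ and the gamma hyperparameters feeding $\gamma_w$ and $\gamma_y$.)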
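Learning with Hamiltonian Monte Carlo (HMC)
Inference of the hyperparameters
$W$, $\gamma_w$ and $\gamma_y$ are sampled with Gibbs sampling.
• Sampling $W$
As before, sample with HMC:
$W \sim p(W \mid Y, X, \gamma_w, \gamma_y)$.
• Sampling $\gamma_w$
$p(\gamma_w \mid Y, X, W, \gamma_y) \propto p(W \mid \gamma_w)\, p(\gamma_w)$
$p(W \mid \gamma_w)$ is Gaussian and $p(\gamma_w)$ is a gamma distribution (the conjugate prior for the Gaussian precision), so $p(\gamma_w \mid Y, X, W, \gamma_y)$ is also a gamma distribution. Therefore,
$\gamma_w \sim \mathrm{Gam}(\hat{a}_w, \hat{b}_w)$, $\hat{a}_w = a_w + \dfrac{K_w}{2}$, $\hat{b}_w = b_w + \dfrac{1}{2} \sum_{w \in W} w^2$,
where $K_w$ is the total number of weight parameters.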
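Learning with Hamiltonian Monte Carlo (HMC)
Inference of the hyperparameters
• Sampling $\gamma_y$
$p(\gamma_y \mid Y, X, W, \gamma_w) \propto p(\gamma_y) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$
$\prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$ is a product of Gaussians and hence Gaussian, and $p(\gamma_y)$ is a gamma distribution, so $p(\gamma_y \mid Y, X, W, \gamma_w)$ is also a gamma distribution. Therefore,
$\gamma_y \sim \mathrm{Gam}(\hat{a}_y, \hat{b}_y)$, $\hat{a}_y = a_y + \dfrac{N}{2}$, $\hat{b}_y = b_y + \dfrac{1}{2} \sum_{n=1}^{N} \{y_n - f(x_n; W)\}^2$.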
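Learning with Hamiltonian Monte Carlo (HMC)
Inference of the hyperparameters
The gamma distribution $\mathrm{Gam}(a, b)$ has mean $a/b$ and variance $a/b^2$. A large $\hat{b}_y$ means that $f(x_n; W)$ estimates $y_n$ poorly; the expected precision $\hat{a}_y / \hat{b}_y$ is then small, so the model learns a larger variance for the observations.
Here a single precision parameter $\gamma_w$ was shared by all weight parameters, but it is also possible to use a separate precision parameter for each layer of the NN, $(\gamma_w^{(1)}, \dots, \gamma_w^{(L)})$.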
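The two conjugate gamma updates used inside the Gibbs sweep can be written down directly. The sketch below assumes NumPy's `Generator.gamma`, which is parameterized by shape and scale, so the rate $\hat{b}$ is passed as `scale = 1 / b_hat`; the function names and the returned tuples are choices made for this illustration only.

```python
import numpy as np

def sample_gamma_w(W, a_w, b_w, rng):
    """Draw gamma_w ~ Gam(a_hat, b_hat) given all weights W (any array shape)."""
    a_hat = a_w + W.size / 2.0              # K_w = total number of weights
    b_hat = b_w + 0.5 * np.sum(W**2)
    return rng.gamma(shape=a_hat, scale=1.0 / b_hat), a_hat, b_hat

def sample_gamma_y(y, f_x, a_y, b_y, rng):
    """Draw gamma_y ~ Gam(a_hat, b_hat) given targets y and network outputs f_x."""
    a_hat = a_y + y.size / 2.0              # N = number of observations
    b_hat = b_y + 0.5 * np.sum((y - f_x)**2)
    return rng.gamma(shape=a_hat, scale=1.0 / b_hat), a_hat, b_hat

rng = np.random.default_rng(0)
W = np.array([1.0, -1.0, 2.0])
gamma_w, a_hat, b_hat = sample_gamma_w(W, a_w=1.0, b_w=1.0, rng=rng)
print(gamma_w, a_hat, b_hat)
```

In a full sampler these two draws would alternate with an HMC update of $W$ conditioned on the current $\gamma_w$ and $\gamma_y$.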
Speeding up approximate Bayesian inference
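Speeding up approximate Bayesian inference
[Drawbacks of Bayesian neural networks]
Marginalizing over the parameters is computationally expensive.
⟹ They have not been widely used as prediction tools.
In addition, deep learning needs a large amount of training data.
⟹ Methods that assume batch learning are computationally inefficient.
[How can these drawbacks be addressed?]
• Handle the marginalization with approximate inference to make the computation more efficient.
• Introduce minibatch learning.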
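Learning with stochastic gradient Langevin dynamics
[Problem]
Learning with MCMC is computationally inefficient on large-scale data.
[Solution]
Combine a computationally efficient minibatch-based learning method (e.g. stochastic gradient descent) with an MCMC method that can estimate uncertainty (e.g. the MH algorithm, HMC).
⟹ Stochastic Markov chain Monte Carlo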
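Learning with stochastic gradient Langevin dynamics
[Learning]
Consider learning with stochastic gradient Langevin dynamics, which combines stochastic gradient descent and Langevin dynamics.
Write the parameter update as $W_{\mathrm{new}} = W_{\mathrm{old}} + \Delta W$.
In stochastic gradient descent the update step can be written as
$\Delta W = \dfrac{\alpha_t}{2} \nabla_W \log p(W \mid X_S, Y_S) = \dfrac{\alpha_t}{2} \left\{ \dfrac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\}$,
where $M$ is the size of the subsample $S$. In addition, to fit the framework of the Robbins-Monro algorithm, the learning rate $\alpha_t$ at step $t$ is chosen to satisfy
$\sum_{t=1}^{\infty} \alpha_t = \infty$, $\sum_{t=1}^{\infty} \alpha_t^2 < \infty$.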
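Learning with stochastic gradient Langevin dynamics
[Learning]
On the other hand, take Langevin dynamics as a batch learning algorithm, with the number of steps needed to obtain a sample set to one, potential energy $\mathcal{U} = -\log p(W \mid X, Y)$, step size $\epsilon = \sqrt{\alpha_t}$ and momentum vector $p$. The parameter update step is then
$\Delta W = -\dfrac{\epsilon^2}{2} \nabla_W \mathcal{U} + \epsilon p = \dfrac{\alpha_t}{2} \nabla_W \log p(W \mid X, Y) + \sqrt{\alpha_t}\, p$
$= \dfrac{\alpha_t}{2} \left\{ \sum_{n=1}^{N} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p$, $p \sim \mathcal{N}(0, I)$.
By making $\alpha_t$ small, the acceptance rate of the MH algorithm can be brought arbitrarily close to 1.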
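Learning with stochastic gradient Langevin dynamics
[Learning]
Combining the two methods above (stochastic gradient descent and Langevin dynamics) gives the update step
$\Delta W = \dfrac{\alpha_t}{2} \left\{ \dfrac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p$, $p \sim \mathcal{N}(0, I)$.
The learning rate satisfies the same conditions as before.
<When $t$ is small (early stage of learning)>
The posterior space is explored efficiently, exploiting the advantages of SGD.
<As $t$ grows>
Approximate samples from the true posterior are obtained through Langevin dynamics.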
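As a rough illustration of the combined update, the sketch below runs stochastic gradient Langevin dynamics on a deliberately simple model: a single parameter $w$ with likelihood $\mathcal{N}(y_n \mid w, \sigma_y^2)$ and prior $\mathcal{N}(w \mid 0, \sigma_w^2)$. The model, the schedule $\alpha_t = a (t+1)^{-\gamma}$ and all constants are assumptions chosen so that the exact posterior is available for comparison; none of them come from the slides.

```python
import numpy as np

def sgld_drift(w, y_batch, N, alpha, sigma_y2=1.0, sigma_w2=10.0):
    """Deterministic part: (alpha/2) * {(N/M) sum grad log-lik + grad log-prior}."""
    M = y_batch.size
    grad = (N / M) * np.sum((y_batch - w) / sigma_y2) - w / sigma_w2
    return 0.5 * alpha * grad

def run_sgld(y, n_steps=3000, M=10, a=0.01, gamma=0.55, seed=0):
    rng = np.random.default_rng(seed)
    N = y.size
    w = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        alpha = a * (t + 1) ** (-gamma)      # sum alpha = inf, sum alpha^2 < inf
        batch = y[rng.choice(N, size=M, replace=False)]
        noise = np.sqrt(alpha) * rng.normal()   # sqrt(alpha_t) * p, p ~ N(0, 1)
        w = w + sgld_drift(w, batch, N, alpha) + noise
        samples[t] = w
    return samples

data_rng = np.random.default_rng(1)
y = 2.0 + data_rng.normal(size=100)          # synthetic data, true mean 2.0
samples = run_sgld(y)
# Exact posterior mean for this conjugate toy model, for comparison.
post_mean = np.sum(y) / (y.size + 1.0 / 10.0)
print(samples[1000:].mean(), post_mean)
```

Setting the noise term to zero recovers plain minibatch SGD on the log posterior, which is the sense in which the early iterations behave like SGD.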
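Stochastic variational inference
The previous part combined stochastic gradient methods with MCMC.
Next, variational inference is combined with stochastic gradient descent.
⟹ Stochastic variational inference
With $\xi$ denoting the set of variational parameters, the goal is to find an approximate distribution $q(W; \xi)$ such that
$q(W; \xi) \approx p(W \mid X, Y)$.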
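Stochastic variational inference
When minibatches are introduced for efficiency, the quantity $\mathcal{L}_S$ computed on a minibatch is an unbiased estimator of $\mathcal{L}$.
$\mathcal{L}(\xi) = \sum_{n=1}^{N} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$
Minibatch version:
$\mathcal{L}_S(\xi) = \dfrac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$
$\mathbb{E}_S[\mathcal{L}_S(\xi)] = \mathcal{L}(\xi)$
Therefore, instead of maximizing $\mathcal{L}(\xi)$, the posterior of the parameters can be approximated efficiently by maximizing $\mathcal{L}_S(\xi)$.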
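Stochastic variational inference
In the following slides, the approximate distribution is assumed to be a product of independent Gaussians, and the ELBO is maximized with gradient-based optimization:
$q(W; \xi) = \prod_{i,j,l} \mathcal{N}\!\left(w_{i,j}^{(l)} \,\middle|\, \mu_{i,j}^{(l)}, {\sigma_{i,j}^{(l)}}^{2}\right)$.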
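Monte Carlo approximation of gradients
When maximizing the ELBO of a neural network, the parameters $W$ in the ELBO cannot be integrated out analytically.
⟹ Maximize $\mathcal{L}_S(\xi)$ with gradient-based optimization.
To do so, the gradient of $\mathcal{L}_S(\xi)$ with respect to the variational parameters $\xi$ is needed.
In $D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$ both distributions are Gaussian, so its gradient can be computed analytically. The log-likelihood term $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$, on the other hand, cannot be integrated analytically.
Approximate the integral (the log-likelihood term) by Monte Carlo and obtain an estimate of the gradient.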
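Monte Carlo approximation of gradients
[Goal]
For a parameter $w \in \mathbb{R}$, consider a function $f(w)$ and a distribution $q(w; \xi)$, and estimate the gradient
$I(\xi) = \nabla_\xi \int f(w)\, q(w; \xi)\, dw$.
[Methods]
Score function estimation, reparameterization gradients, generalized reparameterization gradients, implicit differentiation, and others.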
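Monte Carlo approximation of gradients
Score function estimation
Rewrite $I(\xi)$ as follows.
$I(\xi) = \nabla_\xi \int f(w)\, q(w; \xi)\, dw = \int f(w)\, \nabla_\xi q(w; \xi)\, dw = \int f(w)\, q(w; \xi)\, \nabla_\xi \log q(w; \xi)\, dw = \mathbb{E}_{q(w; \xi)}[f(w)\, \nabla_\xi \log q(w; \xi)]$
Therefore, drawing several samples of $w$ from $q(w; \xi)$ and then evaluating the derivative gives an unbiased estimator of $I(\xi)$.
[Condition for applicability] The derivative of $\log q(w; \xi)$ can be computed.
[Problem] In practice the estimator has very high variance.
[Solution] Combine it with variance reduction techniques such as control variates.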
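Monte Carlo approximation of gradients
Reparameterization gradient
Instead of sampling $w$ from $q(w; \xi)$, sample $\epsilon$ from a distribution $q(\epsilon)$ that does not depend on $\xi$ and apply a transformation $w = g(\xi; \epsilon)$, so that $w$ is sampled indirectly.
This gives the following unbiased estimator of the gradient:
$\mathbb{E}_{q(\epsilon)}[f'(g(\xi; \epsilon))\, \nabla_\xi g(\xi; \epsilon)] = I(\xi)$.
[Example] The case $\xi = \{\hat{\mu}, \hat{\sigma}^2\}$, $q(w; \xi) = \mathcal{N}(w \mid \hat{\mu}, \hat{\sigma}^2)$
With $\tilde{\epsilon} \sim \mathcal{N}(0, 1) = q(\epsilon)$ and $\tilde{w} = g(\xi; \tilde{\epsilon}) = \hat{\mu} + \hat{\sigma} \tilde{\epsilon}$, $\tilde{w}$ is a sample from $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$. The derivatives with respect to the variational parameters are as follows, which gives an unbiased gradient estimator for each variational parameter.
$\dfrac{\partial}{\partial \hat{\mu}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\mu}) = \mathbb{E}_{q(w; \xi)}[f'(w)]$
$\dfrac{\partial}{\partial \hat{\sigma}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, \dfrac{w - \hat{\mu}}{\hat{\sigma}}\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\sigma}) = \mathbb{E}_{q(w; \xi)}\!\left[f'(w)\, \dfrac{w - \hat{\mu}}{\hat{\sigma}}\right]$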
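The difference in variance between the two estimators is easy to see numerically. The sketch below uses the test function $f(w) = w^2$ (an arbitrary choice for this illustration) with $q(w; \xi) = \mathcal{N}(w \mid \hat{\mu}, \hat{\sigma}^2)$, for which the exact gradient with respect to $\hat{\mu}$ is $2\hat{\mu}$.

```python
import numpy as np

def gradient_estimates(mu, sigma, n_samples, rng):
    """Per-sample estimates of d/dmu E_q[w^2] for q = N(mu, sigma^2)."""
    eps = rng.normal(size=n_samples)
    w = mu + sigma * eps                 # w = g(xi; eps)
    f = w**2
    # Score function estimator: f(w) * d/dmu log q(w) = f(w) * (w - mu) / sigma^2.
    score = f * (w - mu) / sigma**2
    # Reparameterization estimator: f'(w) * dg/dmu = 2w * 1.
    reparam = 2.0 * w
    return score, reparam

rng = np.random.default_rng(0)
score, reparam = gradient_estimates(mu=1.5, sigma=1.0, n_samples=100_000, rng=rng)
print("exact:", 3.0)
print("score   mean/var:", score.mean(), score.var())
print("reparam mean/var:", reparam.mean(), reparam.var())
```

Both sample means approach the exact value, while the per-sample variance of the score function estimator is considerably larger for this choice of $f$.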
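Monte Carlo approximation of gradients
Generalizing the reparameterization gradient
[Advantage of the reparameterization gradient]
The variance of the gradient is smaller than with score function estimation.
[Problem with the reparameterization gradient]
A change of variables $g$ is required. (It cannot be applied to every distribution.)
[Solution, example:] Generalized reparameterization gradient
The constraints on $g$ are relaxed so that the method applies to many more kinds of distributions.
The noise distribution is allowed to retain a dependence on the variational parameters, as in $q(\epsilon; \xi)$.
[Solution, example:] Implicit differentiation
[Conditions for use]
• Finding $g$ is difficult, but the inverse transformation $g^{-1}$ is easy to obtain.
• Continuous distributions only.
The gradient of the expectation is obtained by differentiating $\epsilon = g^{-1}(w; \xi)$ with respect to $\xi$.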
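Monte Carlo approximation of gradients
Generalizing the reparameterization gradient
[Solution, example:] Continuous relaxation
A way of applying the reparameterization gradient to discrete probability distributions.
[Example] The categorical distribution (discrete) coincides with the Gumbel-softmax distribution (continuous) in the limit where the temperature parameter goes to 0.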
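A minimal sketch of a Gumbel-softmax sample follows, assuming the usual construction: add standard Gumbel noise to the log-probabilities, divide by the temperature $\tau$, and apply a softmax. The function name and the small lower bound on the uniform draw are choices made for this illustration.

```python
import numpy as np

def gumbel_softmax_sample(log_pi, tau, rng):
    """One relaxed one-hot sample; approaches a one-hot vector as tau -> 0."""
    u = rng.uniform(low=1e-12, high=1.0, size=log_pi.shape)
    g = -np.log(-np.log(u))              # standard Gumbel noise
    z = (log_pi + g) / tau
    z = z - z.max()                      # stabilize the softmax numerically
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
pi = np.array([0.7, 0.2, 0.1])
log_pi = np.log(pi)
for tau in (5.0, 1.0, 0.1):
    print(tau, np.round(gumbel_softmax_sample(log_pi, tau, rng), 3))

# The argmax of the relaxed sample is distributed according to pi for any tau.
draws = [np.argmax(gumbel_softmax_sample(log_pi, 0.1, rng)) for _ in range(5000)]
freq = np.bincount(draws, minlength=3) / 5000.0
print(freq)
```

Because the sample is a differentiable function of `log_pi` for any $\tau > 0$, the reparameterization gradient can be applied to it, at the cost of a bias that shrinks as $\tau$ decreases.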
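Variational inference with gradient approximation
The reparameterization gradient is now used to maximize the ELBO of a Bayesian neural network.
① Draw a minibatch $S$ at random from the dataset $\mathcal{D}$.
② Draw $M$ noise samples ($M$ is the number of samples in the minibatch): $\tilde{\epsilon}_i \sim \mathcal{N}(0, I)$.
$\mathcal{L}_S(\xi) = \dfrac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$
$= \dfrac{N}{M} \sum_{n \in S} \int p(\epsilon) \log p(y_n \mid f(x_n; g(\xi; \epsilon)))\, d\epsilon - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$
$\approx \mathcal{L}_{S,\epsilon}(\xi)$ $\left(\because\ \mathbb{E}_{S,\epsilon}[\mathcal{L}_{S,\epsilon}(\xi)] = \mathcal{L}(\xi)\right)$
$= \dfrac{N}{M} \sum_{n \in S} \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$
③ Compute the gradient with respect to the variational parameters.
$\nabla_\xi \mathcal{L}_S(\xi) \approx \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi) = \dfrac{N}{M} \sum_{n \in S} \nabla_\xi \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - \nabla_\xi D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$
④ Update the variational parameters in the direction that increases the ELBO.
$\xi \leftarrow \xi + \alpha \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi)$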
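Steps ① to ④ can be sketched on a model small enough to differentiate by hand. The example below assumes a one-weight "network" $f(x; w) = w x$ with Gaussian likelihood and prior, a Gaussian $q(w) = \mathcal{N}(\mu, \sigma^2)$ with $\sigma = e^{\rho}$, and a decaying step size; all of these are illustrative choices rather than the setup used in the slides. The KL term between the two Gaussians is differentiated in closed form.

```python
import numpy as np

def fit_elbo(x, y, sigma_y=0.5, sigma_w=1.0, M=20, n_steps=3000, seed=0):
    rng = np.random.default_rng(seed)
    N = x.size
    mu, rho = 0.0, np.log(0.1)          # variational parameters, sigma = exp(rho)
    for t in range(n_steps):
        lr = 5e-4 / (1.0 + t / 200.0)
        idx = rng.choice(N, size=M, replace=False)   # (1) minibatch S
        eps = rng.normal(size=M)                     # (2) one noise sample per datum
        sigma = np.exp(rho)
        w = mu + sigma * eps                         # w = g(xi; eps)
        # (3) d/dw of log N(y_n | w x_n, sigma_y^2), then chain rule through g.
        dlogp_dw = (y[idx] - w * x[idx]) * x[idx] / sigma_y**2
        grad_mu = (N / M) * np.sum(dlogp_dw)
        grad_sigma = (N / M) * np.sum(dlogp_dw * eps)
        # Analytic gradient of KL[N(mu, sigma^2) || N(0, sigma_w^2)].
        grad_mu -= mu / sigma_w**2
        grad_sigma -= -1.0 / sigma + sigma / sigma_w**2
        # (4) gradient ascent on the ELBO estimate.
        mu += lr * grad_mu
        rho += lr * grad_sigma * sigma               # d sigma / d rho = sigma
    return mu, np.exp(rho)

data_rng = np.random.default_rng(1)
x = data_rng.normal(size=200)
y = 1.3 * x + 0.5 * data_rng.normal(size=200)        # synthetic data, true w = 1.3
mu, sigma = fit_elbo(x, y)
# Exact posterior mean of this conjugate toy model, for comparison.
prec = 1.0 + np.sum(x**2) / 0.5**2
post_mean = (np.sum(x * y) / 0.5**2) / prec
print(mu, post_mean, sigma)
```

With this schedule the mean settles near the exact posterior mean quickly, while $\sigma$ moves more slowly; a real implementation would normally rely on automatic differentiation and an adaptive optimizer instead of hand-written gradients.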
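Learning with expectation propagation
In the forward pass, probabilities are propagated through the neural network to evaluate the marginal likelihood; in the backward pass, expectation propagation is used to compute the gradient of the marginal likelihood in order to learn the parameters.
⟹ Probabilistic backpropagation
Probabilistic backpropagation can process data sequentially, so it scales to learning with large amounts of data. The precision parameter of the observations and the precision parameter that governs the prior on the weights can also be inferred approximately.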
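Learning with expectation propagation
[Learning with expectation propagation]
‣ Model
‣ Approximate distribution
‣ Initialization and introduction of the prior factors
‣ Introduction of the likelihood factors
‣ Distribution of the activations
‣ Gradient-based learning
‣ Summary of probabilistic backpropagation
‣ Related methods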
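Learning with expectation propagation
Model
[Setup]
Let $y_n \in \mathbb{R}$ and define the likelihood as
$p(Y \mid X, W, \gamma_y) = \prod_{n=1}^{N} \mathcal{N}(y_n \mid f(x_n; W), \gamma_y^{-1})$, $p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid \alpha_0^{\gamma_y}, \beta_0^{\gamma_y})$.
The activation function of $f(x_n; W)$ is the rectified linear unit (ReLU).
The parameters $W$ follow independent Gaussians:
$p(W \mid \gamma_w) = \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid 0, \gamma_w^{-1})$, $p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid \alpha_0^{\gamma_w}, \beta_0^{\gamma_w})$.
[Goal]
Approximately infer the posterior
$p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$.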
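Learning with expectation propagation
Approximate distribution
Probabilistic backpropagation is based on assumed density filtering.
The approximate distribution of the parameters is chosen as
$q(W, \gamma_y, \gamma_w) = \mathrm{Gam}(\gamma_y \mid \alpha_{\gamma_y}, \beta_{\gamma_y})\, \mathrm{Gam}(\gamma_w \mid \alpha_{\gamma_w}, \beta_{\gamma_w}) \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid m_{i,j}^{(l)}, v_{i,j}^{(l)}) = q(\gamma_y)\, q(\gamma_w)\, q(W)$.
This distribution is updated sequentially by moment matching, as in assumed density filtering.
Assumed density filtering:
$q_{i+1}(\theta) \approx r_{i+1}(\theta) = \dfrac{1}{Z_{i+1}} f_{i+1}(\theta)\, q_i(\theta)$, where $f_i(\theta)$ is a factor.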
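Learning with expectation propagation
Initialization and introduction of the prior factors
[Initialization]
The approximate distribution is initialized to be uninformative:
$m_{i,j}^{(l)} = 0$, $v_{i,j}^{(l)} = \infty$, $\alpha_{\gamma_y} = 1$, $\beta_{\gamma_y} = 0$, $\alpha_{\gamma_w} = 1$, $\beta_{\gamma_w} = 0$.
[Introducing the prior factors]
The approximate distribution is updated by adding the factors of the target posterior one at a time.
The prior factors in this model are
$p(\gamma_y)$, $p(\gamma_w)$, $\{p(w_{i,j}^{(l)} \mid \gamma_w)\}_{i,j,l}$.
Posterior: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
Approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$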
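Learning with expectation propagation
Initialization and introduction of the prior factors
[Introducing the prior factors]
• Adding the factors $p(\gamma_w)$ and $p(\gamma_y)$.
Assumed density filtering: $q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx r = \dfrac{1}{Z} f^{\mathrm{new}}(\gamma_y, \gamma_w, W)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$
The approximate distributions $q(\gamma_y)$, $q(\gamma_w)$ belong to the same family as the priors $p(\gamma_y)$, $p(\gamma_w)$, so the update for these factors is
$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx p(\gamma_y)\, p(\gamma_w)\, q(W)$,
$\alpha_{\gamma_y}^{\mathrm{new}} = \alpha_0^{\gamma_y}$, $\beta_{\gamma_y}^{\mathrm{new}} = \beta_0^{\gamma_y}$, $\alpha_{\gamma_w}^{\mathrm{new}} = \alpha_0^{\gamma_w}$, $\beta_{\gamma_w}^{\mathrm{new}} = \beta_0^{\gamma_w}$.
In other words, $q(\gamma_y) \leftarrow p(\gamma_y)$ and $q(\gamma_w) \leftarrow p(\gamma_w)$.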
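Learning with expectation propagation
Initialization and introduction of the prior factors
[Introducing the prior factors]
• Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$
$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \dfrac{1}{Z} p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$
$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \dfrac{1}{Z} p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_w)\, q(W)$
The indices $i, j, l$ are omitted from here on.
The distributions that get updated are $q(W)$ and $q(\gamma_w)$, each as follows.
$q^{\mathrm{new}}(W) \approx \dfrac{1}{Z_0} p(w \mid \gamma_w)\, q(\gamma_w)\, q(W)$
$q^{\mathrm{new}}(\gamma_w) \approx \dfrac{1}{Z_0} p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)$
The underlined part (underlining in the original slide) is regarded as the factor. Note that the update of one distribution does not use the newly updated version of the other, so the order of the updates does not matter.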
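Learning with expectation propagation
Initialization and introduction of the prior factors
[Introducing the prior factors]
• Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(W)$
$q^{\mathrm{new}}(W) \approx \dfrac{1}{Z_0} p(w \mid \gamma_w)\, q(\gamma_w)\, q(W)$
Since $q(W)$ is Gaussian, the approximate distribution is updated by moment matching in the same way as in the earlier Gaussian example:
$m^{\mathrm{new}} = m + v \dfrac{\partial}{\partial m} \log Z_0$,
$v^{\mathrm{new}} = v - v^2 \left\{ \left( \dfrac{\partial}{\partial m} \log Z_0 \right)^2 - 2 \dfrac{\partial}{\partial v} \log Z_0 \right\}$,
$Z_0 = Z(\alpha_{\gamma_w}, \beta_{\gamma_w}) = \int p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)\, dw\, d\gamma_w = \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha_{\gamma_w}, \beta_{\gamma_w})\, dw\, d\gamma_w$.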
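Learning with expectation propagation
Initialization and introduction of the prior factors
[Introducing the prior factors]
• Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(\gamma_w)$
$q^{\mathrm{new}}(\gamma_w) \approx \dfrac{1}{Z_0} p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)$
Since $q(\gamma_w)$ is a gamma distribution, the approximate distribution is updated by moment matching in the same way as in the earlier gamma example:
$\alpha_{\gamma_w}^{\mathrm{new}} = \left\{ Z_0 Z_2 Z_1^{-2}\, \dfrac{\alpha_{\gamma_w} + 1}{\alpha_{\gamma_w}} - 1 \right\}^{-1}$,
$\beta_{\gamma_w}^{\mathrm{new}} = \left\{ Z_2 Z_1^{-1}\, \dfrac{\alpha_{\gamma_w} + 1}{\beta_{\gamma_w}} - Z_1 Z_0^{-1}\, \dfrac{\alpha_{\gamma_w}}{\beta_{\gamma_w}} \right\}^{-1}$,
where $Z_1 = Z(\alpha_{\gamma_w} + 1, \beta_{\gamma_w})$ and $Z_2 = Z(\alpha_{\gamma_w} + 2, \beta_{\gamma_w})$.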
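Learning with expectation propagation
Initialization and introduction of the prior factors
[Introducing the prior factors]
The normalization constant $Z(\alpha_{\gamma_w}, \beta_{\gamma_w})$ cannot be computed exactly, so the Student's t distribution that appears during the calculation is approximated by a Gaussian with the same mean and variance.
$Z(\alpha_{\gamma_w}, \beta_{\gamma_w}) = \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w$
$= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha_{\gamma_w}, \beta_{\gamma_w})\, dw\, d\gamma_w$
$= \int \mathrm{St}(w \mid 0, \alpha_{\gamma_w} / \beta_{\gamma_w}, 2\alpha_{\gamma_w})\, \mathcal{N}(w \mid m, v)\, dw$
$\approx \int \mathcal{N}(w \mid 0, \beta_{\gamma_w} / (\alpha_{\gamma_w} - 1))\, \mathcal{N}(w \mid m, v)\, dw$ (the t distribution is approximated by a Gaussian with equal mean and variance)
$= \mathcal{N}(m \mid 0, \beta_{\gamma_w} / (\alpha_{\gamma_w} - 1) + v)$
Here $\beta_{\gamma_w} / (\alpha_{\gamma_w} - 1)$ is the variance $\nu / (\lambda(\nu - 2))$ of $\mathrm{St}(w \mid 0, \lambda, \nu)$ with $\lambda = \alpha_{\gamma_w} / \beta_{\gamma_w}$ and $\nu = 2\alpha_{\gamma_w}$, which is finite for $\alpha_{\gamma_w} > 1$.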
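With a Gaussian surrogate for $Z_0$, the moment-matching update of a single weight has a closed form. The sketch below takes the surrogate variance to be $\beta_{\gamma_w} / (\alpha_{\gamma_w} - 1)$, the variance of $\mathrm{St}(w \mid 0, \alpha_{\gamma_w}/\beta_{\gamma_w}, 2\alpha_{\gamma_w})$, so that $\log Z_0 = \log \mathcal{N}(m \mid 0, s)$ with $s = \beta_{\gamma_w} / (\alpha_{\gamma_w} - 1) + v$; the function names are hypothetical.

```python
import math

def log_Z0(m, v, alpha, beta):
    """log N(m | 0, s) with s = beta / (alpha - 1) + v (requires alpha > 1)."""
    s = beta / (alpha - 1.0) + v
    return -0.5 * math.log(2.0 * math.pi * s) - m * m / (2.0 * s)

def update_weight(m, v, alpha, beta):
    """Moment-matching update of q(w) = N(m, v) after adding p(w | gamma_w)."""
    s = beta / (alpha - 1.0) + v
    dlogZ_dm = -m / s
    dlogZ_dv = -0.5 / s + m * m / (2.0 * s * s)
    m_new = m + v * dlogZ_dm
    v_new = v - v * v * (dlogZ_dm**2 - 2.0 * dlogZ_dv)
    return m_new, v_new

m_new, v_new = update_weight(m=1.0, v=1.0, alpha=2.0, beta=1.0)
print(m_new, v_new)
```

Substituting the derivatives gives $m^{\mathrm{new}} = m - vm/s$ and $v^{\mathrm{new}} = v - v^2/s$, which is the familiar product of the Gaussians $\mathcal{N}(w \mid m, v)$ and $\mathcal{N}(w \mid 0, s - v)$.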
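Learning with expectation propagation
Introduction of the likelihood factors
After all the prior factors have been added, the factors of the likelihood $p(Y \mid X, W, \gamma_y)$ are added one at a time.
$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \dfrac{1}{Z} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$
$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(W) \approx \dfrac{1}{Z} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W)$
$q(W)$ is Gaussian and $q(\gamma_y)$ is a gamma distribution, so the updates are carried out in the same way as before.
$q^{\mathrm{new}}(W) \approx \dfrac{1}{Z_0} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W)$
$q^{\mathrm{new}}(\gamma_y) \approx \dfrac{1}{Z_0} p(y_i \mid x_i, W, \gamma_y)\, q(W)\, q(\gamma_y)$
⟹ The goal is to compute the normalization constant for the newly introduced likelihood factor $p(y_i \mid x_i, W, \gamma_y)$, which is the part of the update that differs from the case of adding $p(w_{i,j}^{(l)} \mid \gamma_w)$.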
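Learning with expectation propagation
Introduction of the likelihood factors
The normalization constant after adding the $i$-th likelihood factor is obtained approximately as follows.
$Z(\alpha_{\gamma_y}, \beta_{\gamma_y}) = \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w$
$= \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y)\, dW\, d\gamma_y$
$\approx \int \mathcal{N}(y_i \mid z^{(L)}, \gamma_y^{-1})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, \mathrm{Gam}(\gamma_y \mid \alpha_{\gamma_y}, \beta_{\gamma_y})\, dz^{(L)}\, d\gamma_y$ (the hidden units $z^{(l)} \in \mathbb{R}^{H_l}$ of layer $l$ are assumed to have mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$; details on the next slide)
$= \int \mathrm{St}(y_i \mid z^{(L)}, \alpha_{\gamma_y} / \beta_{\gamma_y}, 2\alpha_{\gamma_y})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)}$
$\approx \int \mathcal{N}(y_i \mid z^{(L)}, \beta_{\gamma_y} / (\alpha_{\gamma_y} - 1))\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)}$ (the t distribution is approximated by a Gaussian with equal mean and variance)
$= \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta_{\gamma_y} / (\alpha_{\gamma_y} - 1) + v_{z^{(L)}})$
Here $\beta_{\gamma_y} / (\alpha_{\gamma_y} - 1)$ is the variance of $\mathrm{St}(\cdot \mid z^{(L)}, \alpha_{\gamma_y} / \beta_{\gamma_y}, 2\alpha_{\gamma_y})$, which is finite for $\alpha_{\gamma_y} > 1$.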
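Learning with expectation propagation
Introduction of the likelihood factors
The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are obtained approximately by a recursive computation.
[Computation]
Assume that the hidden units $z^{(l)} \in \mathbb{R}^{H_l}$ of layer $l$ have mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$. Let the vector obtained after multiplying by the weight matrix $W^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ of layer $l$ (the activation) be $a^{(l)} = W^{(l)} z^{(l-1)} / \sqrt{H_{l-1}}$.
The mean and variance of $a^{(l)}$ are
$m_{a^{(l)}} = M^{(l)} m_{z^{(l-1)}} / \sqrt{H_{l-1}}$,
$v_{a^{(l)}} = \{ (M^{(l)} \odot M^{(l)})\, v_{z^{(l-1)}} + V^{(l)} (m_{z^{(l-1)}} \odot m_{z^{(l-1)}}) + V^{(l)} v_{z^{(l-1)}} \} / H_{l-1}$,
where $M^{(l)}, V^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ hold the means $m_{i,j}^{(l)}$ and variances $v_{i,j}^{(l)}$ of the parameters of layer $l$, and $\odot$ is the Hadamard product.
The mean $m_{z^{(l-1)}}$ and variance $v_{z^{(l-1)}}$ of the hidden units of layer $l-1$ give the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activations of layer $l$.
If the mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ of the hidden units of layer $l$ can be obtained from $m_{a^{(l)}}$ and $v_{a^{(l)}}$, the computation can proceed recursively.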
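Learning with expectation propagation
Distribution of the activations
Compute the distribution $p(a^{(l)} \mid W^{(l)}, z^{(l-1)})$ of the activation $a^{(l)}$. By the central limit theorem, when the number of hidden units $H_{l-1}$ is large, it is approximately Gaussian:
$p(a^{(l)} \mid W^{(l)}, z^{(l-1)}) \approx q(a^{(l)}) = \mathcal{N}(a^{(l)} \mid m_{a^{(l)}}, v_{a^{(l)}})$.
When a Gaussian variable passes through a ReLU, the result is a mixture of two distributions, as in the right panel of the figure.
① Samples that come from negative inputs form a point mass with mean $\mu_p = 0$ and variance $\sigma_p = 0$.
② Samples that come from non-negative inputs follow a truncated Gaussian with the part below 0 cut off.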
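Learning with expectation propagation
Distribution of the activations
[General formulas for the mean and variance of a mixture]
For a mixture with $K$ components and mixing proportions $\pi_k > 0$, $\sum_{k=1}^{K} \pi_k = 1$, the mean and variance are in general
$\mathbb{E}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k \mu_k$,
$\mathbb{V}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k (\mu_k^2 + \sigma_k^2) - \mathbb{E}[x_{\mathrm{mix}}]^2$.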
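Learning with expectation propagation
Distribution of the activations
[Applying this to the mixture for the activations]
Let the mixing proportions of the point mass and the truncated Gaussian be $\pi_p$ and $\pi_t$, so that $\pi_p + \pi_t = 1$.
Writing $\mu$, $\sigma^2$ for the mean and variance of the activation and $\bar{\mu} = -\mu / \sigma$,
$\pi_p = \int_{-\infty}^{0} \mathcal{N}(x \mid \mu, \sigma^2)\, dx = \Phi(-\mu / \sigma) = \Phi(\bar{\mu})$.
Hence the proportion of the truncated Gaussian is
$\pi_t = 1 - \pi_p = \Phi(-\bar{\mu})$.
According to [S. Kotz], the mean $\mu_t$ and variance $\sigma_t^2$ of the truncated Gaussian are
$\mu_t = \mu + \sigma \dfrac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})}$,
$\sigma_t^2 = \sigma^2 \left\{ 1 + \bar{\mu} \dfrac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} - \left( \dfrac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} \right)^{2} \right\}$.
Substituting these into $\mathbb{E}[x_{\mathrm{mix}}]$ and $\mathbb{V}[x_{\mathrm{mix}}]$ of the general formulas gives the mean and variance of $z$.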
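Learning with expectation propagation
Distribution of the activations
In other words, the mean and variance of the hidden units of layer $l$ can be computed from the mean and variance of the activations of layer $l$.
The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are therefore obtained approximately by a recursive computation.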
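One step of the recursion, from the moments of $z^{(l-1)}$ to those of $a^{(l)}$ and then through the ReLU to $z^{(l)}$, can be sketched as follows. The function names are hypothetical, $\Phi$ is built from `math.erf`, and no safeguards are included for zero variance or for extremely negative means, where $\Phi(-\bar{\mu})$ underflows.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def std_normal_pdf(x):
    return np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))

def linear_moments(M, V, m_z, v_z):
    """Mean and variance of a = W z / sqrt(H_prev) for independent Gaussian weights."""
    H_prev = M.shape[1]
    m_a = M @ m_z / math.sqrt(H_prev)
    v_a = ((M * M) @ v_z + V @ (m_z * m_z) + V @ v_z) / H_prev
    return m_a, v_a

def relu_moments(m_a, v_a):
    """Mean and variance of max(a, 0) for a ~ N(m_a, v_a), via the two-component mixture."""
    m_a = np.asarray(m_a, dtype=float)
    v_a = np.asarray(v_a, dtype=float)
    sigma = np.sqrt(v_a)
    mu_bar = -m_a / sigma
    pi_t = std_normal_cdf(-mu_bar)               # weight of the truncated Gaussian
    ratio = std_normal_pdf(mu_bar) / pi_t
    mu_t = m_a + sigma * ratio
    var_t = v_a * (1.0 + mu_bar * ratio - ratio**2)
    # The point mass at 0 contributes nothing to the first two raw moments.
    m_z = pi_t * mu_t
    v_z = pi_t * (mu_t**2 + var_t) - m_z**2
    return m_z, v_z

M = np.array([[1.0, 2.0]])
V = np.array([[0.1, 0.2]])
m_a, v_a = linear_moments(M, V, m_z=np.array([1.0, 1.0]), v_z=np.array([0.0, 0.0]))
m_z, v_z = relu_moments(m_a, v_a)
print(m_a, v_a, m_z, v_z)
```

Applying `linear_moments` and `relu_moments` alternately, layer by layer, yields the output moments $m_{z^{(L)}}$ and $v_{z^{(L)}}$ that enter the approximate normalization constant.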
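Learning with expectation propagation
Gradient-based learning
$z^{(0)}$ is treated as having mean $x_i$ and variance $0$ (the initial values $m_{z^{(0)}}$, $v_{z^{(0)}}$ of the recursion).
The preceding slides described how to go from the output $z^{(l-1)}$ of layer $l-1$ through the activation $a^{(l)}$ to the mean and variance of the output $z^{(l)}$ of layer $l$ (which can be approximated as Gaussian by the central limit theorem). Applying this approximation recursively, the distribution of the final layer $z^{(L)}$ can be approximated by the Gaussian $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$.
This gives an approximate expression for the normalization constant:
$Z(\alpha_{\gamma_y}, \beta_{\gamma_y}) \approx \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta_{\gamma_y} / (\alpha_{\gamma_y} - 1) + v_{z^{(L)}})$,
where $\beta_{\gamma_y} / (\alpha_{\gamma_y} - 1)$ is the variance of the Student's t distribution $\mathrm{St}(\cdot \mid z^{(L)}, \alpha_{\gamma_y} / \beta_{\gamma_y}, 2\alpha_{\gamma_y})$.
Once the normalization constant is available, the gradients are obtained by differentiating it with respect to the parameters.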
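Learning with expectation propagation
Summary of probabilistic backpropagation
Model definition: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
Introduce the approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$
Initialize the approximate distribution: $q_0(\gamma_y)$, $q_0(\gamma_w)$, $q_0(W)$
Introduce the prior factors (part 1):
  Add the factor $p(\gamma_y)$: $q(\gamma_y) \leftarrow p(\gamma_y)$
  Add the factor $p(\gamma_w)$: $q(\gamma_w) \leftarrow p(\gamma_w)$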
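Learning with expectation propagation
Summary of probabilistic backpropagation
Introduce the prior factors (part 2):
for l = 1 to L do
  for j = 1 to H_{l-1} do
    for i = 1 to H_l do
      add the factor p(w_{i,j}^{(l)} | γ_w):
        · update q(W)
        · update q(γ_w)
Introduce the likelihood factors p(y_i | x_i, W, γ_y), where i ∈ S:
  forward pass: recursively compute the means and variances of the hidden units and activations
  introduce the factor p(y_i | x_i, W, γ_y): update q(W), q(γ_y)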
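Learning with expectation propagation
Related methods
Deterministic variational inference is a method similar to probabilistic backpropagation.
[Drawback of variational inference]
Evaluating the ELBO requires the expectation of the log likelihood, which is approximated with Monte Carlo methods.
⟹ Low stability.
[Deterministic variational inference]
Stability can be improved by computing the approximation of the expectation deterministically.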