Bayesian Deep Learning (5.1-5.2)

catla
February 28, 2020

Contents: Bayesian neural networks (Section 5.1) and speeding up approximate Bayesian inference (Section 5.2)

Transcript

  1. Bayesian Deep Learning (5.1-5.2), 桂 尚輝

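  2. Today's contents
    ‣ Approximate inference for Bayesian neural network models
      ‣ Bayesian neural network model
      ‣ Learning with the Laplace approximation
      ‣ Hamiltonian Monte Carlo
    ‣ Making approximate Bayesian inference efficient
      ‣ Learning with stochastic gradient Langevin dynamics
      ‣ Learning with stochastic variational inference
      ‣ Monte Carlo approximation of gradients
      ‣ Variational inference with gradient approximation
      ‣ Learning with expectation propagation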
  4. Approximate inference for Bayesian neural network models

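  6. Bayesian neural network model
    The approximate inference methods introduced in the earlier chapter apply directly to deep learning models.
    As with the linear regression model, a feed-forward neural network (NN) is made Bayesian.
    ⟹ Placing a prior distribution on the parameters $W$ enables probabilistic learning and prediction.
    Learning and prediction in Bayesian inference
    The joint distribution including the parameters can be written as
    $$p(Y, W \mid X) = p(W) \prod_{n=1}^{N} p(y_n \mid x_n, W).$$
    Learning: evaluate the posterior $p(W \mid X, Y)$.
    Prediction: compute $p(y_* \mid x_*, Y, X)$.
    (Figure: graphical model with nodes $x_n$, $y_n$ for $n = 1, \dots, N$ and $W$.)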
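  7. Bayesian neural network model: setup
    Let the input data be $X = \{x_1, \dots, x_N\}$ and the observations be $Y = \{y_1, \dots, y_N\}$. The joint distribution of the observations and the parameters is
    $$p(Y, W \mid X) = p(W) \prod_{n=1}^{N} p(y_n \mid x_n, W).$$
    The observations are assumed to be drawn from
    $$p(y_n \mid x_n, W) = \mathcal{N}(y_n \mid f(x_n; W), \sigma_y^2 I),$$
    where $f(x_n; W)$ is the output of the neural network and $\sigma_y^2$ is a fixed noise parameter.
    Each parameter is assumed to be drawn from
    $$p(w) = \mathcal{N}(w \mid 0, \sigma_w^2), \quad w \in W,$$
    where $\sigma_w^2$ is a fixed noise parameter.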
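  8. Bayesian neural network model: characteristics
    For an NN with a fixed number of layers:
      Many hidden units ⟶ the function becomes more complex.
      Large $\sigma_w$ ⟶ the function changes more steeply.
    For Bayesian NNs, the posterior distribution is known to become increasingly complex as the number of hidden units or layers grows.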
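  10. Learning with the Laplace approximation
    Laplace approximation:
    $$p(Z \mid X) \approx \mathcal{N}(Z \mid Z_{\mathrm{MAP}}, \{\Lambda(Z_{\mathrm{MAP}})\}^{-1}), \qquad \Lambda(Z) = -\nabla_Z^2 \log p(Z \mid X).$$
    For simplicity, the NN output is taken to be one-dimensional.
    Approximating the posterior
    First find the MAP estimate of the posterior.
    ⟹ Find the parameters $W_{\mathrm{MAP}}$ that maximize $p(W \mid Y, X)$.
    Maximizing the posterior is equivalent to maximizing the log posterior, so the MAP estimate can be found by gradient ascent on the log posterior:
    $$W_{\mathrm{new}} = W_{\mathrm{old}} + \alpha \nabla_W \log p(W \mid Y, X) \big|_{W = W_{\mathrm{old}}},$$
    where $\alpha$ is the learning rate.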
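  11. Learning with the Laplace approximation: approximating the posterior
    The gradient of the log posterior is obtained as follows. By Bayes' theorem,
    $$p(W \mid Y, X) = \frac{p(W)\, p(Y \mid X, W)}{p(Y \mid X)} \propto p(W)\, p(Y \mid X, W),$$
    hence
    $$\log p(W \mid Y, X) = \log p(Y \mid X, W) + \log p(W) + c = \sum_{n=1}^{N} \log p(y_n \mid x_n, W) + \sum_{w \in W} \log p(w) + c.$$
    Taking the partial derivative with respect to a parameter $w \in W$ gives the derivative of a cost function:
    $$\frac{\partial}{\partial w} \log p(W \mid Y, X) = -\left\{ \frac{1}{\sigma_y^2} \frac{\partial}{\partial w} E(W) + \frac{1}{\sigma_w^2} \frac{\partial}{\partial w} \Omega_{L2}(W) \right\}.$$
    Here $E(W)$ is the error function of the NN and $\Omega_{L2}(W)$ is the regularization term arising from the prior on each parameter.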
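  12. Learning with the Laplace approximation: approximating the posterior
    Once the MAP estimate is found, the posterior can be approximated as
    $$p(W \mid Y, X) \approx q(W) = \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1}),$$
    $$\Lambda(W) = -\nabla_W^2 \log p(W \mid Y, X) = \frac{1}{\sigma_w^2} I + \frac{1}{\sigma_y^2} H,$$
    where $H$ is the Hessian of the error function.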
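  13. Learning with the Laplace approximation: approximating the predictive distribution
    With the Laplace approximation, the predictive distribution (abbreviated $p(y_* \mid x_*)$) is approximated as
    $$p(y_* \mid x_*, Y, X) = \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \approx \int p(y_* \mid x_*, W)\, q(W)\, dW.$$
    However, $p(y_* \mid x_*, W)$ contains the NN, so the integral cannot be computed analytically.
    We therefore assume that the posterior density is concentrated around the MAP estimate and that, within that small region, $f(x_*; W)$ is well approximated by a linear function of $W$. Under this assumption, a first-order Taylor expansion of $f(x_*; W)$ as a function of $W$ around $W_{\mathrm{MAP}}$ gives
    $$f(x_*; W) \approx f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}}), \qquad g = \nabla_W f(x_*; W) \big|_{W = W_{\mathrm{MAP}}}.$$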
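  14. Learning with the Laplace approximation: approximating the predictive distribution
    Putting this together yields the following approximation:
    $$\begin{aligned}
    p(y_* \mid x_*, Y, X) &= \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \\
    &\approx \int p(y_* \mid x_*, W)\, q(W)\, dW \\
    &= \int \mathcal{N}(y_* \mid f(x_*; W), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW \\
    &\approx \int \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}}), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW \\
    &= \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*)),
    \end{aligned}$$
    $$\sigma^2(x_*) = \sigma_y^2 + g^{\top} \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g.$$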
  15. Learning with the Laplace approximation: approximating the predictive distribution
    Two approximations enter the derivation of $\mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*))$ with $\sigma^2(x_*) = \sigma_y^2 + g^{\top} \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g$:
      Laplace approximation: the posterior $p(W \mid X, Y)$ is replaced by $q(W)$.
      First-order Taylor expansion: $f(x_*; W)$ is replaced by $f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}})$.
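The predictive variance $\sigma^2(x_*) = \sigma_y^2 + g^{\top}\Lambda^{-1}g$ can be sketched numerically as follows. This is a minimal illustration and not the presenter's code: the function names `numerical_grad` and `laplace_predictive` are hypothetical, $W_{\mathrm{MAP}}$ and the precision matrix $\Lambda(W_{\mathrm{MAP}})$ are assumed to be given, and the gradient $g$ is obtained by central differences rather than backpropagation.

```python
import numpy as np

def numerical_grad(f, w, eps=1e-6):
    """Central-difference gradient of a scalar function f at w."""
    g = np.zeros_like(w, dtype=float)
    for k in range(w.size):
        step = np.zeros_like(w, dtype=float)
        step[k] = eps
        g[k] = (f(w + step) - f(w - step)) / (2.0 * eps)
    return g

def laplace_predictive(f, w_map, precision, x_star, sigma_y2):
    """Mean and variance of the linearized Laplace predictive distribution.

    f(x, w) -> scalar network output (assumed one-dimensional).
    precision: Lambda(W_MAP), assumed already computed.
    """
    g = numerical_grad(lambda w: f(x_star, w), w_map)
    mean = f(x_star, w_map)
    # sigma^2(x*) = sigma_y^2 + g^T Lambda^{-1} g
    var = sigma_y2 + g @ np.linalg.solve(precision, g)
    return mean, var
```

For a model that is linear in $W$, such as `f = lambda x, w: w @ x`, the linearization is exact and $g = x_*$, which makes the function easy to check by hand.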
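  17. Learning with Hamiltonian Monte Carlo (HMC)
    HMC is applicable whenever the log posterior (the potential energy of the Hamiltonian) is differentiable with respect to the variables to be sampled. Given enough computation time, samples are in theory obtained from the true posterior (a property of MCMC). As a result, uncertainty can be represented through multiple samples.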

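  18. Learning with Hamiltonian Monte Carlo (HMC): inferring the weight parameters
    Using the unnormalized posterior, the corresponding potential energy is
    $$\mathcal{U}(W) = -\{\log p(Y \mid X, W) + \log p(W)\}.$$
    Differentiating this shows that it is equivalent to the derivative of the cost function introduced earlier.
    ⟹ Gradients can be computed by backpropagation.
    [Problems with MCMC-based approximate inference]
      • There is no way to know whether the number of samples is sufficient.
      • Tuning the MCMC parameters is difficult (e.g., the step size and number of steps in HMC).
      • Training is slow.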
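The HMC procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: the function name `hmc_sample`, its arguments, and the identity mass matrix are choices made here for illustration. The potential energy is $\mathcal{U}(W) = -\log p(W \mid \cdot)$ up to a constant, so the leapfrog integrator uses the gradient of the log posterior as the force.

```python
import numpy as np

def hmc_sample(log_post, grad_log_post, w0, n_samples, step_size, n_leapfrog, rng):
    """Draw samples with HMC (identity mass matrix, leapfrog integrator)."""
    w = np.array(w0, dtype=float)
    samples = []
    for _ in range(n_samples):
        p = rng.standard_normal(w.shape)          # resample momentum
        w_new, p_new = w.copy(), p.copy()
        # Leapfrog: half step for momentum, alternating full steps, half step.
        p_new += 0.5 * step_size * grad_log_post(w_new)
        for i in range(n_leapfrog):
            w_new += step_size * p_new
            if i < n_leapfrog - 1:
                p_new += step_size * grad_log_post(w_new)
        p_new += 0.5 * step_size * grad_log_post(w_new)
        # Metropolis-Hastings correction with H = U + kinetic energy.
        h_old = -log_post(w) + 0.5 * p @ p
        h_new = -log_post(w_new) + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:
            w = w_new
        samples.append(w.copy())
    return np.array(samples)
```

For a Bayesian NN, `log_post` would be the unnormalized log posterior and `grad_log_post` would come from backpropagation. The step size and the number of leapfrog steps are exactly the tuning parameters the slide flags as hard to choose.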
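  19. Learning with Hamiltonian Monte Carlo (HMC): inferring the hyperparameters
    By also placing priors on the hyperparameters $\sigma_w$ and $\sigma_y$, they can be inferred jointly with $W$.
    Introduce the precision parameter $\gamma_w = \sigma_w^{-2}$ and define its prior as a gamma distribution:
    $$p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid a_w, b_w) \quad (a_w, b_w \text{ are fixed positive values}).$$
    Likewise for $\sigma_y$, define $\gamma_y = \sigma_y^{-2}$ with
    $$p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid a_y, b_y) \quad (a_y, b_y \text{ are fixed positive values}).$$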
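  20. Learning with Hamiltonian Monte Carlo (HMC): inferring the hyperparameters
    Rewriting the model (the joint distribution including the parameters) gives
    $$p(Y, W, \gamma_w, \gamma_y \mid X) = p(\gamma_w)\, p(\gamma_y)\, p(W \mid \gamma_w) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y).$$
    The posterior to be inferred is $p(W, \gamma_w, \gamma_y \mid X, Y)$.
    (Figure: graphical model with nodes $x_n$, $y_n$ for $n = 1, \dots, N$, $W$, $\gamma_y$, $\gamma_w$, and the hyperparameters $\alpha_y, \beta_y, \alpha_w, \beta_w$.)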
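  21. Learning with Hamiltonian Monte Carlo (HMC): inferring the hyperparameters
    Use Gibbs sampling to sample $W$, $\gamma_w$, and $\gamma_y$.
    • Sampling $W$
      As before, sample with HMC:
      $$W \sim p(W \mid Y, X, \gamma_w, \gamma_y).$$
    • Sampling $\gamma_w$
      $$p(\gamma_w \mid Y, X, W, \gamma_y) \propto p(W \mid \gamma_w)\, p(\gamma_w).$$
      Since $p(W \mid \gamma_w)$ is Gaussian and $p(\gamma_w)$ is a gamma distribution (the conjugate prior for the Gaussian precision), $p(\gamma_w \mid Y, X, W, \gamma_y)$ is a gamma distribution. Therefore
      $$\gamma_w \sim \mathrm{Gam}(\hat{a}_w, \hat{b}_w), \qquad \hat{a}_w = a_w + \frac{K_w}{2}, \qquad \hat{b}_w = b_w + \frac{1}{2} \sum_{w \in W} w^2,$$
      where $K_w$ is the total number of weight parameters.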
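  22. Learning with Hamiltonian Monte Carlo (HMC): inferring the hyperparameters
    • Sampling $\gamma_y$
      $$p(\gamma_y \mid Y, X, W, \gamma_w) \propto p(\gamma_y) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y).$$
      Since $\prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$ is a product of Gaussians and hence Gaussian, and $p(\gamma_y)$ is a gamma distribution, $p(\gamma_y \mid Y, X, W, \gamma_w)$ is a gamma distribution. Therefore
      $$\gamma_y \sim \mathrm{Gam}(\hat{a}_y, \hat{b}_y), \qquad \hat{a}_y = a_y + \frac{N}{2}, \qquad \hat{b}_y = b_y + \frac{1}{2} \sum_{n=1}^{N} \{y_n - f(x_n; W)\}^2.$$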
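The two conjugate Gibbs updates for the precisions can be sketched as follows. This is a minimal illustration, assuming the shape/rate parameterization $\mathrm{Gam}(a, b)$ used on the slides; the function names are hypothetical. Note that NumPy's gamma sampler takes a scale, which is the reciprocal of the rate $b$.

```python
import numpy as np

def gamma_w_posterior(weights, a_w, b_w):
    """Parameters (a_hat, b_hat) of p(gamma_w | W): shape and rate."""
    a_hat = a_w + weights.size / 2.0
    b_hat = b_w + 0.5 * np.sum(weights ** 2)
    return a_hat, b_hat

def gamma_y_posterior(y, f_pred, a_y, b_y):
    """Parameters (a_hat, b_hat) of p(gamma_y | Y, X, W): shape and rate."""
    a_hat = a_y + y.size / 2.0
    b_hat = b_y + 0.5 * np.sum((y - f_pred) ** 2)
    return a_hat, b_hat

def sample_gamma(a_hat, b_hat, rng):
    """Draw from Gam(a_hat, b_hat); numpy expects scale = 1 / rate."""
    return rng.gamma(shape=a_hat, scale=1.0 / b_hat)
```

In a full Gibbs sweep these draws would alternate with an HMC update of $W$ conditioned on the current $\gamma_w$ and $\gamma_y$; `f_pred` stands for the network outputs $f(x_n; W)$ at the current weights.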
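  23. Learning with Hamiltonian Monte Carlo (HMC): inferring the hyperparameters
    The gamma distribution $\mathrm{Gam}(a, b)$ has mean $a/b$ and variance $a/b^2$. Hence a larger $\hat{b}_y$ means that $f(x_n; W)$ estimates $y_n$ poorly, and the model learns a larger variance for the observations.
    Here a single precision parameter $\gamma_w$ was shared by all weight parameters, but it is also possible to use a separate precision parameter for each layer of the NN, $(\gamma_w^{(1)}, \dots, \gamma_w^{(L)})$.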
  25. Speeding up approximate Bayesian inference

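  26. Speeding up approximate Bayesian inference
    [Drawbacks of Bayesian neural networks]
      Marginalizing over the parameters is computationally very expensive.
      ⟹ They were rarely used as prediction tools.
      In addition, deep learning requires very large training sets.
      ⟹ Methods that assume batch learning are computationally inefficient.
    [How can these drawbacks be addressed?]
      • Improve efficiency by performing the marginalization through approximate inference.
      • Introduce mini-batch learning.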
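  28. Learning with stochastic gradient Langevin dynamics
    [Problem]
      MCMC-based learning is computationally inefficient on large-scale data.
    [Solution]
      Combine an efficient mini-batch learning method (e.g., stochastic gradient descent) with MCMC, which can estimate uncertainty (e.g., the MH method, HMC).
      ⟹ Stochastic Markov chain Monte Carlo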

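  29. Learning with stochastic gradient Langevin dynamics
    [Learning]
    Consider learning with stochastic gradient Langevin dynamics, which combines stochastic gradient descent with Langevin dynamics.
    Write the parameter update as $W_{\mathrm{new}} = W_{\mathrm{old}} + \Delta W$.
    In stochastic gradient descent, the update can be written as
    $$\Delta W = \frac{\alpha_t}{2} \nabla_W \log p(W \mid X_S, Y_S) = \frac{\alpha_t}{2} \left\{ \frac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\},$$
    where $M$ is the size of the subsample $S$. In addition, to fit the framework of the Robbins-Monro algorithm, the learning rate $\alpha_t$ at step $t$ is set to satisfy
    $$\sum_{t=1}^{\infty} \alpha_t = \infty, \qquad \sum_{t=1}^{\infty} \alpha_t^2 < \infty.$$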
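  30. Learning with stochastic gradient Langevin dynamics
    [Learning]
    In Langevin dynamics, a batch learning algorithm, the step needed to obtain a sample is as follows. With potential energy $\mathcal{U}(W) = -\log p(W \mid X, Y)$, step size $\epsilon = \sqrt{\alpha_t}$, and momentum vector $p$, the parameter update is
    $$\begin{aligned}
    \Delta W &= -\frac{\epsilon^2}{2} \nabla_W \mathcal{U}(W) + \epsilon p \\
    &= \frac{\alpha_t}{2} \nabla_W \log p(W \mid X, Y) + \sqrt{\alpha_t}\, p \\
    &= \frac{\alpha_t}{2} \left\{ \sum_{n=1}^{N} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p, \qquad p \sim \mathcal{N}(0, I).
    \end{aligned}$$
    Making $\alpha_t$ small brings the acceptance rate of the MH step arbitrarily close to 1.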
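  31. Learning with stochastic gradient Langevin dynamics
    [Learning]
    Combining the two methods (stochastic gradient descent and Langevin dynamics) gives the update
    $$\Delta W = \frac{\alpha_t}{2} \left\{ \frac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p, \qquad p \sim \mathcal{N}(0, I).$$
    The learning rate satisfies the same conditions as before.
    When $t$ is small (early in training): the posterior space is explored efficiently, exploiting the advantages of SGD.
    As $t$ grows: the Langevin dynamics yields approximate samples from the true posterior.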
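The combined update can be sketched on a toy problem as follows. This is a minimal illustration and not the presenter's code: it assumes a one-dimensional Bayesian linear regression $y_n = w x_n + \text{noise}$ in place of a neural network so that the gradients are available in closed form, and the schedule $\alpha_t = a(b + t)^{-\gamma}$ with $0.5 < \gamma \le 1$ is one common choice that satisfies the two step-size conditions. All names and default values are hypothetical.

```python
import numpy as np

def sgld_linear_regression(x, y, sigma_y2, sigma_w2, batch_size, n_steps, rng,
                           a=1e-3, b=1.0, gamma=0.55):
    """SGLD samples of the weight w for y_n ~ N(w * x_n, sigma_y2), w ~ N(0, sigma_w2)."""
    n = x.shape[0]
    w = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        alpha_t = a * (b + t) ** (-gamma)                 # decaying step size
        idx = rng.choice(n, size=batch_size, replace=False)
        # Mini-batch gradient of the log-likelihood and gradient of the log-prior.
        grad_lik = np.sum(x[idx] * (y[idx] - w * x[idx])) / sigma_y2
        grad_prior = -w / sigma_w2
        drift = 0.5 * alpha_t * (n / batch_size * grad_lik + grad_prior)
        noise = np.sqrt(alpha_t) * rng.standard_normal()  # injected Langevin noise
        w = w + drift + noise
        samples[t] = w
    return samples
```

Because the posterior of this toy model is Gaussian and known exactly, the average of the later samples can be compared with the exact posterior mean as a sanity check.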
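  33. Stochastic variational inference
    The previous slides combined stochastic gradient methods with MCMC.
    Next, variational inference is combined with stochastic gradient descent.
    ⟹ Stochastic variational inference
    With $\xi$ denoting the set of variational parameters, the goal is to find an approximate distribution $q(W; \xi)$ such that
    $$q(W; \xi) \approx p(W \mid X, Y).$$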
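  34. Stochastic variational inference
    Mini-batches are introduced for efficiency. The ELBO on the full data set is
    $$\mathcal{L}(\xi) = \sum_{n=1}^{N} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)],$$
    and its mini-batch version is
    $$\mathcal{L}_S(\xi) = \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)].$$
    The mini-batch quantity $\mathcal{L}_S$ is an unbiased estimator of $\mathcal{L}$:
    $$\mathbb{E}_S[\mathcal{L}_S(\xi)] = \mathcal{L}(\xi).$$
    Therefore, instead of maximizing $\mathcal{L}(\xi)$ directly, maximizing $\mathcal{L}_S(\xi)$ approximates the posterior over the parameters efficiently.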
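  35. Stochastic variational inference
    In the following slides the approximate distribution is assumed to be a product of independent Gaussians, and the ELBO is maximized by gradient-based optimization:
    $$q(W; \xi) = \prod_{i,j,l} \mathcal{N}\big(w_{i,j}^{(l)} \mid \mu_{i,j}^{(l)}, {\sigma_{i,j}^{(l)}}^2\big).$$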
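  37. Monte Carlo approximation of gradients
    When maximizing the ELBO of a neural network, the parameters $W$ in the ELBO cannot be integrated out analytically.
    ⟹ Maximize $\mathcal{L}_S(\xi)$ by gradient-based optimization.
    To do so, the gradient of $\mathcal{L}_S(\xi)$ with respect to the variational parameters $\xi$ is needed.
    In $D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$ both distributions are Gaussian, so its gradient can be computed analytically. In contrast, the expected log-likelihood $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$ cannot be integrated analytically.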
  38. Monte Carlo approximation of gradients
    Since the KL term of $\mathcal{L}_S(\xi)$ has an analytic gradient but the expected log-likelihood term $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$ does not, approximate that integral by Monte Carlo to obtain an estimate of the gradient.
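  39. Monte Carlo approximation of gradients
    [Goal]
      For a parameter $w \in \mathbb{R}$, consider a function $f(w)$ and a distribution $q(w; \xi)$, and estimate the gradient
      $$I(\xi) = \nabla_\xi \int f(w)\, q(w; \xi)\, dw.$$
    [Methods]
      Score function estimation, reparameterization gradients, generalized reparameterization gradients, implicit differentiation, and others.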
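  40. Monte Carlo approximation of gradients: score function estimation
    Rewrite $I(\xi)$ as follows:
    $$\begin{aligned}
    I(\xi) &= \nabla_\xi \int f(w)\, q(w; \xi)\, dw = \int f(w)\, \nabla_\xi q(w; \xi)\, dw \\
    &= \int f(w)\, q(w; \xi)\, \nabla_\xi \log q(w; \xi)\, dw = \mathbb{E}_{q(w; \xi)}\big[ f(w)\, \nabla_\xi \log q(w; \xi) \big].
    \end{aligned}$$
    Hence, drawing several samples of $w$ from $q(w; \xi)$ and then evaluating the derivative gives an unbiased estimator of $I(\xi)$.
    [Condition for applicability] The derivative of $\log q(w; \xi)$ can be computed.
    [Problem] In practice the estimator has very high variance.
    [Remedy] Combine it with variance reduction techniques such as control variates.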
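  41. Monte Carlo approximation of gradients: reparameterization gradients
    Instead of sampling $w$ directly from $q(w; \xi)$, sample $\epsilon$ from a distribution $q(\epsilon)$ that does not depend on $\xi$ and apply a transformation $w = g(\xi; \epsilon)$, which samples $w$ indirectly.
    This gives the following unbiased estimator of the gradient:
    $$\mathbb{E}_{q(\epsilon)}\big[ f'(g(\xi; \epsilon))\, \nabla_\xi g(\xi; \epsilon) \big] = I(\xi).$$
    [Example] $\xi = \{\hat{\mu}, \hat{\sigma}^2\}$, $q(w; \xi) = \mathcal{N}(w \mid \hat{\mu}, \hat{\sigma}^2)$
    With $\tilde{\epsilon} \sim \mathcal{N}(0, 1) = q(\epsilon)$ and $\tilde{w} = g(\xi; \tilde{\epsilon}) = \hat{\mu} + \hat{\sigma} \tilde{\epsilon}$, the value $\tilde{w}$ is a sample from $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$. The derivatives with respect to the variational parameters are as follows, giving an unbiased gradient estimator for each variational parameter:
    $$\frac{\partial}{\partial \hat{\mu}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\mu}) = \mathbb{E}_{q(w; \xi)}[f'(w)],$$
    $$\frac{\partial}{\partial \hat{\sigma}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}}\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\sigma}) = \mathbb{E}_{q(w; \xi)}\left[ f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}} \right].$$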
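The score function estimator and the reparameterization gradient can be compared on a case with a known answer. The sketch below is illustrative only: it assumes $f(w) = w^2$ and $q(w; \xi) = \mathcal{N}(w \mid \hat{\mu}, \hat{\sigma}^2)$, for which $\mathbb{E}_q[f(w)] = \hat{\mu}^2 + \hat{\sigma}^2$ and the exact gradient with respect to $\hat{\mu}$ is $2\hat{\mu}$. The function names are hypothetical.

```python
import numpy as np

def score_function_grad_mu(mu, sigma, n, rng):
    """Per-sample score-function estimates of d/dmu E_q[w^2]."""
    w = mu + sigma * rng.standard_normal(n)
    f = w ** 2
    score = (w - mu) / sigma ** 2      # d/dmu log N(w | mu, sigma^2)
    return f * score

def reparam_grad_mu(mu, sigma, n, rng):
    """Per-sample reparameterization estimates of d/dmu E_q[w^2]."""
    eps = rng.standard_normal(n)
    w = mu + sigma * eps               # w = g(xi; eps), dw/dmu = 1
    return 2.0 * w                     # f'(w) * dw/dmu
```

Both estimators average to the same gradient, but the per-sample variance of the reparameterization estimator is much smaller in this example, which matches the advantage claimed for it on the next slide.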
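  42. Monte Carlo approximation of gradients: generalizing reparameterization gradients
    [Advantage of reparameterization gradients]
      The variance of the gradient is kept smaller than with score function estimation.
    [Problem with reparameterization gradients]
      A change of variables $g$ is required (it is not available for every distribution).
    [Remedy, example 1] Generalized reparameterization gradients
      The constraints on $g$ are relaxed so that the method applies to many kinds of distributions.
      The noise distribution is allowed to retain a dependence on the variational parameters, as in $q(\epsilon; \xi)$.
    [Remedy, example 2] Implicit differentiation
      [Conditions for use]
        • Finding $g$ is difficult, but the inverse transformation $g^{-1}$ is easy to obtain.
        • The distribution is continuous.
      The gradient of the expectation is obtained by differentiating $\epsilon = g^{-1}(w; \xi)$ with respect to $\xi$.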
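  43. Monte Carlo approximation of gradients: generalizing reparameterization gradients
    [Remedy, example 3] Continuous relaxation
      A way to apply reparameterization gradients to discrete probability distributions.
      [Example] The categorical distribution (discrete) coincides with the Gumbel-softmax distribution (continuous) in the limit where the temperature parameter goes to 0.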

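  45. Variational inference with gradient approximation
    Maximize the ELBO of a Bayesian neural network using reparameterization gradients.
    ① Draw a mini-batch $S$ at random from the data set $\mathcal{D}$.
    ② Draw $M$ noise samples ($M$ is the mini-batch size):
      $$\tilde{\epsilon}_n \sim p(\epsilon) = \mathcal{N}(0, I).$$
    ③ Compute the gradient with respect to the variational parameters:
      $$\begin{aligned}
      \mathcal{L}_S(\xi) &= \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)] \\
      &= \frac{N}{M} \sum_{n \in S} \int p(\epsilon) \log p(y_n \mid f(x_n; g(\xi; \epsilon)))\, d\epsilon - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)] \\
      &\approx \mathcal{L}_{S,\epsilon}(\xi) \qquad (\because\ \mathbb{E}_{S,\epsilon}[\mathcal{L}_{S,\epsilon}(\xi)] = \mathcal{L}(\xi)) \\
      &= \frac{N}{M} \sum_{n \in S} \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)],
      \end{aligned}$$
      $$\nabla_\xi \mathcal{L}_S(\xi) \approx \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi) = \frac{N}{M} \sum_{n \in S} \nabla_\xi \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - \nabla_\xi D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)].$$
    ④ Update the variational parameters in the direction that increases the ELBO:
      $$\xi \leftarrow \xi + \alpha \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi).$$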
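Steps ① to ④ can be sketched on a toy model as follows. This is a minimal illustration under stated assumptions, not the presenter's code: a one-dimensional Bayesian linear regression $y_n = w x_n + \text{noise}$ stands in for the neural network so that all gradients can be written by hand, the approximate posterior is $q(w) = \mathcal{N}(\mu, \sigma^2)$ with $\sigma$ parameterized through $\log \sigma$, and the prior is $\mathcal{N}(0, \sigma_w^2)$. The function name and the initial value of $\sigma$ are hypothetical.

```python
import numpy as np

def fit_mean_field(x, y, sigma_y2, sigma_w2, batch_size, n_steps, lr, rng):
    """Maximize the mini-batch ELBO with reparameterization gradients."""
    n = x.shape[0]
    mu, log_sigma = 0.0, np.log(0.1)
    scale = n / batch_size
    for _ in range(n_steps):
        idx = rng.choice(n, size=batch_size, replace=False)  # (1) mini-batch
        eps = rng.standard_normal(batch_size)                # (2) one noise per point
        sigma = np.exp(log_sigma)
        w = mu + sigma * eps                                 # w_n = g(xi; eps_n)
        # (3) d log p(y_n | x_n, w_n) / dw_n for a Gaussian likelihood.
        dlogp_dw = x[idx] * (y[idx] - w * x[idx]) / sigma_y2
        # Chain rule: dw/dmu = 1, dw/dlog_sigma = sigma * eps.
        # KL[N(mu, sigma^2) || N(0, sigma_w2)] has gradients
        # mu / sigma_w2 and sigma^2 / sigma_w2 - 1.
        grad_mu = scale * np.sum(dlogp_dw) - mu / sigma_w2
        grad_log_sigma = (scale * np.sum(dlogp_dw * sigma * eps)
                          - (sigma ** 2 / sigma_w2 - 1.0))
        mu += lr * grad_mu                                   # (4) gradient ascent
        log_sigma += lr * grad_log_sigma
    return mu, np.exp(log_sigma)
```

For this conjugate toy model the exact posterior is Gaussian, so the fitted $\mu$ can be checked against the closed-form posterior mean. With a neural network, `dlogp_dw` would instead come from backpropagation through $f(x_n; W)$.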
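  47. Learning with expectation propagation
    In the forward pass, the marginal likelihood is evaluated by propagating probabilities through the neural network. In the backward pass, expectation propagation is used to compute the gradient of the marginal likelihood in order to learn the parameters.
    ⟹ Probabilistic backpropagation
    Probabilistic backpropagation processes data sequentially, so it scales to training on large data sets. The precision parameter of the observations and the precision parameter governing the prior on the weights can also be inferred approximately.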

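  48. Learning with expectation propagation
    [Learning with expectation propagation]
    ‣ Model
    ‣ Approximate distribution
    ‣ Initialization and introducing the prior factors
    ‣ Introducing the likelihood factors
    ‣ Distribution of the activations
    ‣ Gradient-based learning
    ‣ Summary of probabilistic backpropagation
    ‣ Related methods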

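  49. Learning with expectation propagation: model
    [Setup]
      Let $y_n \in \mathbb{R}$ and define the likelihood as
      $$p(Y \mid X, W, \gamma_y) = \prod_{n=1}^{N} \mathcal{N}(y_n \mid f(x_n; W), \gamma_y^{-1}), \qquad p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid \alpha_0^{\gamma_y}, \beta_0^{\gamma_y}).$$
      The rectified linear unit (ReLU) is used as the activation function of $f(x_n; W)$.
      The parameters $W$ follow independent Gaussian distributions:
      $$p(W \mid \gamma_w) = \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid 0, \gamma_w^{-1}), \qquad p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid \alpha_0^{\gamma_w}, \beta_0^{\gamma_w}).$$
    [Goal]
      Approximately infer the posterior
      $$p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w).$$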
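  51. Learning with expectation propagation: approximate distribution
    Probabilistic backpropagation is based on assumed density filtering.
    The approximate distribution over the parameters is
    $$q(W, \gamma_y, \gamma_w) = \mathrm{Gam}(\gamma_y \mid \alpha^{\gamma_y}, \beta^{\gamma_y})\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w}) \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid m_{i,j}^{(l)}, v_{i,j}^{(l)}) = q(\gamma_y)\, q(\gamma_w)\, q(W).$$
    This distribution is updated sequentially by the moment matching of assumed density filtering.
    Assumed density filtering:
    $$q_{i+1}(\theta) \approx r_{i+1}(\theta) = \frac{1}{Z_{i+1}} f_{i+1}(\theta)\, q_i(\theta), \qquad f_i(\theta):\ \text{factor}.$$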
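  53. Learning with expectation propagation: initialization and introducing the prior factors
    [Initialization]
      Initialize so that the approximate distribution is uninformative: $m_{i,j}^{(l)} = 0$, $v_{i,j}^{(l)} = \infty$, $\alpha^{\gamma_y} = 1$, $\beta^{\gamma_y} = 0$, $\alpha^{\gamma_w} = 1$, $\beta^{\gamma_w} = 0$.
    [Introducing the prior factors]
      The approximate distribution is updated by adding the factors of the target posterior one at a time.
      The prior factors in this model are $p(\gamma_y)$, $p(\gamma_w)$, and $\{p(w_{i,j}^{(l)} \mid \gamma_w)\}_{i,j,l}$.
      Posterior: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
      Approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$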
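  54. Learning with expectation propagation: initialization and introducing the prior factors
    [Introducing the prior factors]
    • Adding the factors $p(\gamma_w)$ and $p(\gamma_y)$
      Assumed density filtering: $q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx r = \frac{1}{Z} f^{\mathrm{new}}(\gamma_y, \gamma_w, W)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$.
      Since the approximate distributions $q(\gamma_y)$, $q(\gamma_w)$ have the same form as the priors $p(\gamma_y)$, $p(\gamma_w)$, the update is
      $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx p(\gamma_y)\, p(\gamma_w)\, q(W),$$
      $$\alpha_{\mathrm{new}}^{\gamma_y} = \alpha_0^{\gamma_y}, \quad \beta_{\mathrm{new}}^{\gamma_y} = \beta_0^{\gamma_y}, \quad \alpha_{\mathrm{new}}^{\gamma_w} = \alpha_0^{\gamma_w}, \quad \beta_{\mathrm{new}}^{\gamma_w} = \beta_0^{\gamma_w}.$$
      That is, $q(\gamma_y) \leftarrow p(\gamma_y)$ and $q(\gamma_w) \leftarrow p(\gamma_w)$.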
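  55. Learning with expectation propagation: initialization and introducing the prior factors
    [Introducing the prior factors]
    • Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$
      $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
      $$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_w)\, q(W).$$
      From here on the indices $i, j, l$ are omitted.
      The distributions that get updated are $q(W)$ and $q(\gamma_w)$, each as follows:
      $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(\gamma_w)\, q(W), \qquad q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(W)\, q(\gamma_w).$$
      In each update, the terms multiplying the distribution being updated ($p(w \mid \gamma_w)\, q(\gamma_w)$ and $p(w \mid \gamma_w)\, q(W)$, respectively) are regarded as the factor. Note that the second update does not use the newly updated distribution from the first, so the order of the two updates does not matter.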
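  56. Learning with expectation propagation: initialization and introducing the prior factors
    [Introducing the prior factors]
    • Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: updating $q(W)$
      $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(\gamma_w)\, q(W).$$
      Since $q(W)$ is Gaussian, moment matching updates the approximate distribution as in the Gaussian example treated earlier:
      $$m^{\mathrm{new}} = m + v \frac{\partial}{\partial m} \log Z_0, \qquad v^{\mathrm{new}} = v - v^2 \left\{ \left( \frac{\partial}{\partial m} \log Z_0 \right)^2 - 2 \frac{\partial}{\partial v} \log Z_0 \right\},$$
      $$Z_0 = Z(\alpha^{\gamma_w}, \beta^{\gamma_w}) = \iint p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)\, dw\, d\gamma_w = \iint \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w})\, dw\, d\gamma_w.$$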
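The moment-matching update for $m$ and $v$ can be checked on a simplified case. The sketch below is illustrative only and makes an assumption that differs from the slides: the prior precision is held fixed, so the factor is $\mathcal{N}(w \mid 0, s)$ with a known variance $s$ (a hypothetical parameter), and the normalizer is $Z_0 = \mathcal{N}(m \mid 0, s + v)$ in closed form. In that case the product of the two Gaussians is itself Gaussian, so the update should reproduce the exact result $m^{\mathrm{new}} = m s / (s + v)$, $v^{\mathrm{new}} = v s / (s + v)$.

```python
import numpy as np

def log_z(m, v, s):
    """log of Z0 = N(m | 0, s + v), the normalizer for a fixed-variance Gaussian factor."""
    return -0.5 * np.log(2.0 * np.pi * (s + v)) - 0.5 * m ** 2 / (s + v)

def adf_gaussian_update(m, v, s, eps=1e-6):
    """Moment-matching update of q(w) = N(m, v) after adding the factor N(w | 0, s)."""
    # Derivatives of log Z0 by central differences (a closed form also exists).
    dm = (log_z(m + eps, v, s) - log_z(m - eps, v, s)) / (2.0 * eps)
    dv = (log_z(m, v + eps, s) - log_z(m, v - eps, s)) / (2.0 * eps)
    m_new = m + v * dm
    v_new = v - v ** 2 * (dm ** 2 - 2.0 * dv)
    return m_new, v_new
```

The same two update formulas apply when $Z_0$ is replaced by the approximate normalizer obtained after integrating over $\gamma_w$; only `log_z` would change.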
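  57. Learning with expectation propagation: initialization and introducing the prior factors
    [Introducing the prior factors]
    • Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: updating $q(\gamma_w)$
      $$q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0} p(w \mid \gamma_w)\, q(W)\, q(\gamma_w).$$
      Since $q(\gamma_w)$ is a gamma distribution, moment matching updates the approximate distribution as in the gamma example treated earlier:
      $$\alpha_{\mathrm{new}}^{\gamma_w} = \left\{ Z_0 Z_2 Z_1^{-2}\, \frac{\alpha^{\gamma_w} + 1}{\alpha^{\gamma_w}} - 1 \right\}^{-1}, \qquad \beta_{\mathrm{new}}^{\gamma_w} = \left\{ Z_2 Z_1^{-1}\, \frac{\alpha^{\gamma_w} + 1}{\beta^{\gamma_w}} - Z_1 Z_0^{-1}\, \frac{\alpha^{\gamma_w}}{\beta^{\gamma_w}} \right\}^{-1},$$
      where $Z_1 = Z(\alpha^{\gamma_w} + 1, \beta^{\gamma_w})$ and $Z_2 = Z(\alpha^{\gamma_w} + 2, \beta^{\gamma_w})$.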
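  58. Learning with expectation propagation: initialization and introducing the prior factors
    [Introducing the prior factors]
    The normalization constant $Z(\alpha^{\gamma_w}, \beta^{\gamma_w})$ cannot be computed exactly, so the Student's t distribution that appears during the computation is approximated by a Gaussian with the same mean and variance:
    $$\begin{aligned}
    Z(\alpha^{\gamma_w}, \beta^{\gamma_w}) &= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w \\
    &= \iint \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w})\, dw\, d\gamma_w \\
    &= \int \mathrm{St}(w \mid 0, \alpha^{\gamma_w} / \beta^{\gamma_w}, 2\alpha^{\gamma_w})\, \mathcal{N}(w \mid m, v)\, dw \\
    &\approx \int \mathcal{N}\big(w \mid 0, \beta^{\gamma_w} / (\alpha^{\gamma_w} - 1)\big)\, \mathcal{N}(w \mid m, v)\, dw \\
    &= \mathcal{N}\big(m \mid 0, \beta^{\gamma_w} / (\alpha^{\gamma_w} - 1) + v\big).
    \end{aligned}$$
    The fourth line replaces the t distribution, which has precision $\alpha^{\gamma_w} / \beta^{\gamma_w}$ and $2\alpha^{\gamma_w}$ degrees of freedom and therefore variance $\beta^{\gamma_w} / (\alpha^{\gamma_w} - 1)$, by a Gaussian with the same mean and variance.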
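  60. Learning with expectation propagation: introducing the likelihood factors
    After all the prior factors have been added, the factors of the likelihood $p(Y \mid X, W, \gamma_y)$ are added one at a time:
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
    $$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W).$$
    Since $q(W)$ is Gaussian and $q(\gamma_y)$ is a gamma distribution, the updates proceed as before:
    $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0} p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W), \qquad q^{\mathrm{new}}(\gamma_y) \approx \frac{1}{Z_0} p(y_i \mid x_i, W, \gamma_y)\, q(W)\, q(\gamma_y).$$
    ⟹ The goal is to compute the normalization constant for the newly added likelihood factor $p(y_i \mid x_i, W, \gamma_y)$, which is the part of the update that differs from adding $p(w_{i,j}^{(l)} \mid \gamma_w)$.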
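  61. Learning with expectation propagation: introducing the likelihood factors
    The normalization constant when the $i$-th likelihood factor is added is obtained approximately as follows:
    $$\begin{aligned}
    Z(\alpha^{\gamma_y}, \beta^{\gamma_y}) &= \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w \\
    &= \iint \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y)\, dW\, d\gamma_y \\
    &\approx \iint \mathcal{N}(y_i \mid z^{(L)}, \gamma_y^{-1})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, \mathrm{Gam}(\gamma_y \mid \alpha^{\gamma_y}, \beta^{\gamma_y})\, dz^{(L)}\, d\gamma_y \\
    &= \int \mathrm{St}(y_i \mid z^{(L)}, \alpha^{\gamma_y} / \beta^{\gamma_y}, 2\alpha^{\gamma_y})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)} \\
    &\approx \int \mathcal{N}\big(y_i \mid z^{(L)}, \beta^{\gamma_y} / (\alpha^{\gamma_y} - 1)\big)\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)} \\
    &= \mathcal{N}\big(y_i \mid m_{z^{(L)}}, \beta^{\gamma_y} / (\alpha^{\gamma_y} - 1) + v_{z^{(L)}}\big).
    \end{aligned}$$
    Third line: the hidden units $z^{(l)} \in \mathbb{R}^{H_l}$ of layer $l$ are assumed to follow a distribution with mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ (detailed on the next slide).
    Fifth line: the t distribution is approximated by a Gaussian with the same mean and variance.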
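  62. Learning with expectation propagation: introducing the likelihood factors
    The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are obtained approximately by a recursive computation.
    [Computation]
    Assume that the hidden unit values $z^{(l)} \in \mathbb{R}^{H_l}$ of layer $l$ have mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$. Let the vector obtained after multiplying by the weight matrix $W^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ of layer $l$ (the activation) be
    $$a^{(l)} = W^{(l)} z^{(l-1)} / \sqrt{H_{l-1}}.$$
    The mean and variance of $a^{(l)}$ are
    $$m_{a^{(l)}} = M^{(l)} m_{z^{(l-1)}} / \sqrt{H_{l-1}},$$
    $$v_{a^{(l)}} = \big\{ (M^{(l)} \odot M^{(l)})\, v_{z^{(l-1)}} + V^{(l)} (m_{z^{(l-1)}} \odot m_{z^{(l-1)}}) + V^{(l)} v_{z^{(l-1)}} \big\} / H_{l-1},$$
    where the entries of $M^{(l)}, V^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ are the means $m_{i,j}^{(l)}$ and variances $v_{i,j}^{(l)}$ of the individual parameters, and $\odot$ is the Hadamard product.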
  63. Learning with expectation propagation: introducing the likelihood factors
    The formulas for $m_{a^{(l)}}$ and $v_{a^{(l)}}$ show that the mean $m_{z^{(l-1)}}$ and variance $v_{z^{(l-1)}}$ of the hidden units in layer $l - 1$ determine the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activations in layer $l$.
  64. Learning with expectation propagation: introducing the likelihood factors
    If the mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ of the hidden units in layer $l$ can in turn be obtained from the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activations in layer $l$, the whole computation can be carried out recursively.
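The propagation of means and variances through one linear layer can be sketched as follows. This is a minimal illustration, assuming the weights and the incoming hidden units are mutually independent with the stated means and variances; the function name is hypothetical.

```python
import numpy as np

def linear_layer_moments(M, V, m_z, v_z):
    """Mean and variance of a = W z / sqrt(H_prev).

    M, V: elementwise means and variances of W, shape (H_l, H_prev).
    m_z, v_z: means and variances of the previous layer's hidden units, shape (H_prev,).
    """
    h_prev = m_z.shape[0]
    m_a = M @ m_z / np.sqrt(h_prev)
    # Var(w z) = E[w]^2 Var(z) + Var(w) E[z]^2 + Var(w) Var(z) for independent w, z,
    # summed over the inputs of each unit.
    v_a = ((M * M) @ v_z + V @ (m_z * m_z) + V @ v_z) / h_prev
    return m_a, v_a
```

For the first layer the input is deterministic, so `m_z` is the data point and `v_z` is a vector of zeros.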
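  66. Learning with expectation propagation: distribution of the activations
    Compute the distribution $p(a^{(l)} \mid W^{(l)}, z^{(l-1)})$ of the activation $a^{(l)}$. By the central limit theorem, when the number of hidden units $H_{l-1}$ is large, $a^{(l)}$ is approximately Gaussian:
    $$p(a^{(l)} \mid W^{(l)}, z^{(l-1)}) \approx q(a^{(l)}) = \mathcal{N}(a^{(l)} \mid m_{a^{(l)}}, v_{a^{(l)}}).$$
    When a Gaussian variable passes through a ReLU, the result is a mixture of two distributions (shown in the right panel of the figure on the slide):
    ① Samples from negative inputs become a point mass with mean $\mu_p = 0$ and variance $\sigma_p^2 = 0$.
    ② Samples from non-negative inputs follow a truncated Gaussian with the part below 0 cut off.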
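  67. Learning with expectation propagation: distribution of the activations
    [General formulas for the mean and variance of a mixture]
    For a mixture with $K$ components and mixing coefficients $\pi_k > 0$, $\sum_{k=1}^{K} \pi_k = 1$, the mean and variance are in general
    $$\mathbb{E}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k \mu_k, \qquad \mathbb{V}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k (\mu_k^2 + \sigma_k^2) - \mathbb{E}[x_{\mathrm{mix}}]^2.$$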
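  68. Learning with expectation propagation: distribution of the activations
    [Applying this to the mixture for the activations]
    Let the mixing coefficients of the point mass and the truncated Gaussian be $\pi_p$ and $\pi_t$, so that $\pi_p + \pi_t = 1$.
    Writing $\bar{\mu} = -\mu / \sigma$, the coefficient $\pi_p$ is
    $$\pi_p = \int_{-\infty}^{0} \mathcal{N}(x \mid \mu, \sigma^2)\, dx = \Phi(-\mu / \sigma) = \Phi(\bar{\mu}).$$
    Hence the coefficient of the truncated Gaussian is
    $$\pi_t = 1 - \pi_p = \Phi(-\bar{\mu}).$$
    From [S. Kotz], the mean $\mu_t$ and variance $\sigma_t^2$ of the truncated Gaussian are
    $$\mu_t = \mu + \sigma \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})}, \qquad \sigma_t^2 = \sigma^2 \left\{ 1 + \bar{\mu} \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} - \left( \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} \right)^2 \right\}.$$
    Substituting these into the general formulas for $\mathbb{E}[x_{\mathrm{mix}}]$ and $\mathbb{V}[x_{\mathrm{mix}}]$ gives the mean and variance of $z$.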
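The mean and variance of a ReLU output can be assembled from the point mass and the truncated Gaussian as described above. The sketch below is a minimal scalar illustration with hypothetical function names; it uses only the standard library.

```python
from math import erf, exp, pi, sqrt

def std_normal_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def relu_moments(mu, sigma):
    """Mean and variance of max(0, a) for a ~ N(mu, sigma^2)."""
    mu_bar = -mu / sigma
    pi_p = std_normal_cdf(mu_bar)        # weight of the point mass at 0
    pi_t = std_normal_cdf(-mu_bar)       # weight of the truncated Gaussian
    ratio = std_normal_pdf(mu_bar) / pi_t
    mu_t = mu + sigma * ratio
    var_t = sigma ** 2 * (1.0 + mu_bar * ratio - ratio ** 2)
    # Mixture formulas; the point mass has mean 0 and variance 0.
    mean = pi_p * 0.0 + pi_t * mu_t
    second_moment = pi_p * 0.0 + pi_t * (mu_t ** 2 + var_t)
    return mean, second_moment - mean ** 2
```

For a standard normal input the known values are $\mathbb{E}[z] = 1/\sqrt{2\pi}$ and $\mathbb{V}[z] = 1/2 - 1/(2\pi)$, which gives a quick check. For very negative `mu`, `pi_t` underflows toward zero, so a production version would need a numerically stable form of `ratio`.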
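  69. Learning with expectation propagation: distribution of the activations
    In other words, the mean and variance of the hidden units in layer $l$ can be computed from the mean and variance of the activations in layer $l$.
    Therefore the mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are obtained approximately by a recursive computation.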
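  71. Learning with expectation propagation: gradient-based learning
    The input $z^{(0)}$ is treated as having mean $x_i$ and variance 0 (the initial values $m_{z^{(0)}}$, $v_{z^{(0)}}$ of the recursion).
    The preceding slides described the procedure that takes the output $z^{(l-1)}$ of layer $l - 1$ through the activation $a^{(l)}$ and finds the mean and variance of the output $z^{(l)}$ of layer $l$ (a Gaussian approximation is justified by the central limit theorem). Applying this approximation recursively, the distribution of the final layer $z^{(L)}$ can be approximated by the Gaussian $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$.
    This yields an approximate expression for the normalization constant:
    $$Z(\alpha^{\gamma_y}, \beta^{\gamma_y}) \approx \mathcal{N}\big(y_i \mid m_{z^{(L)}}, \beta^{\gamma_y} / (\alpha^{\gamma_y} - 1) + v_{z^{(L)}}\big).$$
    Once the normalization constant is available, the gradients are obtained by differentiating it with respect to the parameters.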
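  73. Learning with expectation propagation: summary of probabilistic backpropagation
    Model definition: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
    Introducing the approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$
    Initializing the approximate distribution: $q_0(\gamma_y)$, $q_0(\gamma_w)$, $q_0(W)$
    Introducing the prior factors (part 1):
      Adding the factor $p(\gamma_y)$: $q(\gamma_y) \leftarrow p(\gamma_y)$
      Adding the factor $p(\gamma_w)$: $q(\gamma_w) \leftarrow p(\gamma_w)$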
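  74. Learning with expectation propagation: summary of probabilistic backpropagation
    Introducing the prior factors (part 2):
      for l = 1 to L do
        for j = 1 to H_{l-1} do
          for i = 1 to H_l do
            add the factor p(w_{i,j}^{(l)} | γ_w):
              · update q(W)
              · update q(γ_w)
    Forward propagation, for p(y_i | x_i, W, γ_y) where i ∈ S:
      recursively compute the means and variances of the hidden units and activations
    Introducing the likelihood factor p(y_i | x_i, W, γ_y):
      update q(W), q(γ_y)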
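  76. Learning with expectation propagation: related methods
    Deterministic variational inference is a method similar to probabilistic backpropagation.
    [Drawback of variational inference]
      Evaluating the ELBO requires the expectation of the log-likelihood, which is approximated by Monte Carlo.
      ⟹ Low stability
    [Deterministic variational inference]
      Stability is improved by computing the approximation of the expectation deterministically.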