Bayesian Deep Learning (ベイズ深層学習), Sections 5.1~5.2

catla
February 28, 2020

Contents: Bayesian neural networks (Section 5.1) and speeding up approximate Bayesian inference (Section 5.2)

Transcript

  1. Bayesian Deep Learning

    Sections 5.1~5.2
    ܡɹঘً


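  2. Today's Contents
    ‣ Approximate inference for Bayesian neural network models
    ‣ Bayesian neural network models
    ‣ Learning with the Laplace approximation
    ‣ Hamiltonian Monte Carlo
    ‣ Making approximate Bayesian inference efficient
    ‣ Learning with stochastic gradient Langevin dynamics
    ‣ Learning with stochastic variational inference
    ‣ Monte Carlo approximation of gradients
    ‣ Variational inference with approximate gradients
    ‣ Learning with expectation propagation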

  4. Approximate Inference for Bayesian Neural Network Models

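  6. Bayesian Neural Network Models
    The approximate inference methods introduced in the earlier chapter can be applied directly to deep learning models.
    As with the linear regression model, we make a feedforward neural network (NN) Bayesian.
    ⟹ Placing a prior distribution on the parameters $W$ enables probabilistic learning and prediction.
    Learning and prediction in Bayesian inference
    Joint distribution with the parameters: $p(Y, W \mid X) = p(W) \prod_{n=1}^{N} p(y_n \mid x_n, W)$.
    Learning: evaluate the posterior $p(W \mid X, Y)$.
    Prediction: compute the predictive distribution $p(y_* \mid x_*, Y, X)$.
    [Figure: graphical model with nodes $x_n$ and $y_n$ inside a plate $n = 1, \dots, N$, and the parameter node $W$]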

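  7. Bayesian Neural Network Models
    Setup
    Let the input data be $X = \{x_1, \dots, x_N\}$ and the observed data be $Y = \{y_1, \dots, y_N\}$, and define the joint distribution with the parameters as
    $$p(Y, W \mid X) = p(W) \prod_{n=1}^{N} p(y_n \mid x_n, W).$$
    Each observation is assumed to be drawn from
    $$p(y_n \mid x_n, W) = \mathcal{N}(y_n \mid f(x_n; W), \sigma_y^2 I),$$
    where $f(x_n; W)$ is the output of the neural network and $\sigma_y^2$ is a fixed noise parameter.
    Each parameter is assumed to be drawn from
    $$p(w) = \mathcal{N}(w \mid 0, \sigma_w^2), \quad w \in W,$$
    where $\sigma_w^2$ is a fixed prior variance.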
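The model above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the presenter's code: the network is a bias-free MLP with tanh hidden units and scalar output, and the helper names (`init_weights`, `f`, `log_gauss`, `log_joint`) are hypothetical.

```python
import numpy as np

def init_weights(rng, sizes, sigma_w):
    """Draw every weight from the prior N(0, sigma_w^2)."""
    return [sigma_w * rng.standard_normal((n_out, n_in))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def f(x, W):
    """Forward pass f(x; W) of a bias-free MLP with tanh hidden units (an assumption)."""
    h = x
    for Wl in W[:-1]:
        h = np.tanh(Wl @ h)
    return W[-1] @ h

def log_gauss(y, mean, var):
    """Sum of elementwise log N(y | mean, var)."""
    return np.sum(-0.5 * np.log(2 * np.pi * var) - 0.5 * (y - mean) ** 2 / var)

def log_joint(X, Y, W, sigma_y, sigma_w):
    """log p(Y, W | X) = log p(W) + sum_n log p(y_n | x_n, W)."""
    log_prior = sum(log_gauss(Wl, 0.0, sigma_w ** 2) for Wl in W)
    log_lik = sum(log_gauss(y, f(x, W), sigma_y ** 2) for x, y in zip(X, Y))
    return log_prior + log_lik

rng = np.random.default_rng(0)
W = init_weights(rng, sizes=[1, 8, 1], sigma_w=1.0)
X = np.linspace(-1, 1, 5).reshape(-1, 1)
Y = np.array([f(x, W) for x in X])  # noise-free targets generated by W itself
print(log_joint(X, Y, W, sigma_y=0.1, sigma_w=1.0))
```

Because the log joint equals the unnormalized log posterior, the same function can serve as the objective for MAP estimation or as the negative potential energy in MCMC.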

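  8. Bayesian Neural Network Models
    Characteristics
    When the number of NN layers is fixed:
    More hidden units ⟹ more complex functions.
    Larger $\sigma_w$ ⟹ more abrupt changes in the function.
    It is known that the posterior of a Bayesian NN becomes increasingly complex as the number of hidden units or layers grows.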

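  10. Learning with the Laplace Approximation
    Laplace approximation
    $$p(Z \mid X) \approx \mathcal{N}(Z \mid Z_{\mathrm{MAP}}, \{\Lambda(Z_{\mathrm{MAP}})\}^{-1}), \qquad \Lambda(Z) = -\nabla_Z^2 \log p(Z \mid X)$$
    For simplicity, the output dimension of the NN is taken to be 1.
    Approximating the posterior
    First find the MAP estimate of the posterior.
    ⟹ Find the parameters $W_{\mathrm{MAP}}$ that maximize $p(W \mid Y, X)$.
    Maximizing the posterior is equivalent to maximizing the log posterior, so the MAP estimate can be found by the following gradient-based optimization of the log posterior:
    $$W_{\mathrm{new}} = W_{\mathrm{old}} + \alpha \nabla_W \log p(W \mid Y, X) \big|_{W = W_{\mathrm{old}}},$$
    where $\alpha$ is the learning rate.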

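  11. Learning with the Laplace Approximation
    Approximating the posterior
    The gradient of the log posterior is obtained as follows. By Bayes' theorem,
    $$p(W \mid Y, X) = \frac{p(W)\, p(Y \mid X, W)}{p(Y \mid X)} \propto p(W)\, p(Y \mid X, W).$$
    Therefore
    $$\log p(W \mid Y, X) = \log p(Y \mid X, W) + \log p(W) + c = \sum_{n=1}^{N} \log p(y_n \mid x_n, W) + \sum_{w \in W} \log p(w) + c.$$
    Taking the partial derivative with respect to a parameter $w \in W$ gives the derivative of a cost function:
    $$\frac{\partial}{\partial w} \log p(W \mid Y, X) = -\left\{ \frac{1}{\sigma_y^2} \frac{\partial}{\partial w} E(W) + \frac{1}{\sigma_w^2} \frac{\partial}{\partial w} \Omega_{L2}(W) \right\}.$$
    $E(W)$ is the error function of the NN and $\Omega_{L2}(W)$ is the regularization term arising from the prior on each parameter.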

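  12. Learning with the Laplace Approximation
    Approximating the posterior
    Once the MAP estimate has been found, the posterior can be approximated as
    $$p(W \mid Y, X) \approx q(W) = \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1}),$$
    $$\Lambda(W) = -\nabla_W^2 \log p(W \mid Y, X) = \frac{1}{\sigma_w^2} I + \frac{1}{\sigma_y^2} H,$$
    where $H$ is the Hessian of the error function $E(W)$.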

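  13. Learning with the Laplace Approximation
    Approximating the predictive distribution
    With the Laplace approximation, the predictive distribution can be approximated as
    $$p(y_* \mid x_*, Y, X) = p(y_* \mid x_*, \mathcal{D}) = \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \approx \int p(y_* \mid x_*, W)\, q(W)\, dW.$$
    However, $p(y_* \mid x_*, W)$ contains the NN, so the integral cannot be computed analytically.
    We therefore assume that the posterior density of the parameters is concentrated around the MAP estimate and that, within this small region, $f(x_*; W)$ is well approximated by a linear function of $W$. Under this assumption, a first-order Taylor expansion of $f(x_*; W)$ as a function of $W$ around $W_{\mathrm{MAP}}$ gives
    $$f(x_*; W) \approx f(x_*; W_{\mathrm{MAP}}) + g^\top (W - W_{\mathrm{MAP}}), \qquad g = \nabla_W f(x_*; W) \big|_{W = W_{\mathrm{MAP}}}.$$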

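  14. Learning with the Laplace Approximation
    Approximating the predictive distribution
    Putting everything together yields the following approximation:
    $$p(y_* \mid x_*, Y, X) = p(y_* \mid x_*, \mathcal{D}) = \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW$$
    $$\approx \int p(y_* \mid x_*, W)\, q(W)\, dW$$
    $$= \int \mathcal{N}(y_* \mid f(x_*; W), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW$$
    $$\approx \int \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}) + g^\top (W - W_{\mathrm{MAP}}), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW$$
    $$= \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*)),$$
    $$\sigma^2(x_*) = \sigma_y^2 + g^\top \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g.$$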
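To make the predictive variance $\sigma^2(x_*) = \sigma_y^2 + g^\top \Lambda^{-1} g$ concrete, here is a small sketch. It assumes a linear-in-parameters model $f(x; w) = w^\top \phi(x)$ instead of a neural network, so that $H = \Phi^\top \Phi$, $g = \phi(x_*)$ and the MAP estimate has a closed form; the function name `laplace_predictive` and the toy data are hypothetical.

```python
import numpy as np

def laplace_predictive(Phi, y, phi_star, sigma_y, sigma_w):
    """Laplace predictive mean and variance for f(x; w) = w^T phi(x).

    For this linear-in-parameters model the error-function Hessian is
    H = Phi^T Phi and the gradient g equals phi(x*), so the Laplace
    approximation happens to be exact.
    """
    D = Phi.shape[1]
    H = Phi.T @ Phi                                    # Hessian of E(w)
    Lam = np.eye(D) / sigma_w ** 2 + H / sigma_y ** 2  # precision Lambda
    w_map = np.linalg.solve(Lam, Phi.T @ y / sigma_y ** 2)
    g = phi_star                                       # gradient of f at w_map
    mean = phi_star @ w_map
    var = sigma_y ** 2 + g @ np.linalg.solve(Lam, g)   # sigma^2(x*)
    return mean, var

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=20)
Phi = np.stack([np.ones_like(x), x], axis=1)           # features phi(x) = (1, x)
y = 0.5 + 2.0 * x + 0.1 * rng.standard_normal(20)
mean, var = laplace_predictive(Phi, y, np.array([1.0, 0.3]), 0.1, 1.0)
print(mean, var)
```

For an actual NN, `H` and `g` would instead come from backpropagation at $W_{\mathrm{MAP}}$; the final two lines of the function stay the same.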

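  15. Learning with the Laplace Approximation
    Approximating the predictive distribution
    The derivation of $p(y_* \mid x_*, Y, X) \approx \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*))$ with $\sigma^2(x_*) = \sigma_y^2 + g^\top \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g$ uses two approximations:
    ‣ Laplace approximation: the posterior $p(W \mid X, Y)$ is replaced by $q(W) = \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})$.
    ‣ First-order Taylor expansion: $f(x_*; W)$ is replaced by $f(x_*; W_{\mathrm{MAP}}) + g^\top (W - W_{\mathrm{MAP}})$.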

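  17. Learning with Hamiltonian Monte Carlo (HMC)
    HMC can be applied whenever the log posterior (the potential energy in the Hamiltonian) is differentiable with respect to the variables to be sampled. Given enough computation time, it theoretically yields samples from the true posterior (a property of MCMC). As a result, uncertainty can be represented by a set of samples.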

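  18. Learning with Hamiltonian Monte Carlo (HMC)
    Inferring the weight parameters
    Using the unnormalized posterior, the corresponding potential energy is
    $$\mathcal{U}(W) = -\{\log p(Y \mid X, W) + \log p(W)\}.$$
    Differentiating this shows that it is equivalent to the derivative of the cost function seen earlier.
    ⟹ The gradient can be computed by backpropagation.
    [Problems with MCMC-based approximate inference]
    • There is no way to know whether the number of samples is sufficient.
    • Tuning the MCMC parameters is difficult (e.g. the step size and number of steps in HMC).
    • Learning is slow.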
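A single HMC transition can be sketched as below. This is an illustrative implementation under simplifying assumptions: unit mass matrix, a fixed step size, and a one-dimensional Gaussian toy target in place of a BNN posterior. The names `hmc_step`, `U` and `grad_U` are hypothetical; for a BNN, `grad_U` would be supplied by backpropagation.

```python
import numpy as np

def hmc_step(w, U, grad_U, eps, n_leapfrog, rng):
    """One HMC transition for potential energy U(w) = -log p~(w)."""
    p = rng.standard_normal(w.shape)                  # resample momentum
    w_new, p_new = w.copy(), p.copy()
    p_new -= 0.5 * eps * grad_U(w_new)                # half step for momentum
    for i in range(n_leapfrog):
        w_new += eps * p_new                          # full step for position
        if i < n_leapfrog - 1:
            p_new -= eps * grad_U(w_new)              # full step for momentum
    p_new -= 0.5 * eps * grad_U(w_new)                # final half step
    h_old = U(w) + 0.5 * p @ p
    h_new = U(w_new) + 0.5 * p_new @ p_new
    if np.log(rng.uniform()) < h_old - h_new:         # Metropolis correction
        return w_new
    return w

# Toy target: unnormalized posterior N(w | 1, 0.5^2) on a single weight.
U = lambda w: 0.5 * np.sum((w - 1.0) ** 2) / 0.25
grad_U = lambda w: (w - 1.0) / 0.25

rng = np.random.default_rng(0)
w = np.zeros(1)
samples = []
for _ in range(3000):
    w = hmc_step(w, U, grad_U, eps=0.2, n_leapfrog=10, rng=rng)
    samples.append(w[0])
samples = np.array(samples[500:])                     # discard burn-in
print(samples.mean(), samples.std())
```

The step size `eps` and the number of leapfrog steps are exactly the tuning parameters that the slide lists as hard to choose.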

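  19. Learning with Hamiltonian Monte Carlo (HMC)
    Inferring the hyperparameters
    By also placing prior distributions on the hyperparameters $\sigma_w$ and $\sigma_y$, they can be inferred jointly with $W$.
    Introduce the precision parameter $\gamma_w = \sigma_w^{-2}$ and define its prior as a Gamma distribution:
    $$p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid a_w, b_w) \quad (a_w, b_w \text{ are fixed positive values}).$$
    Similarly, for $\gamma_y = \sigma_y^{-2}$:
    $$p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid a_y, b_y) \quad (a_y, b_y \text{ are fixed positive values}).$$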

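  20. Learning with Hamiltonian Monte Carlo (HMC)
    Inferring the hyperparameters
    Rewriting the model (the joint distribution with the parameters) gives
    $$p(Y, W, \gamma_w, \gamma_y \mid X) = p(\gamma_w)\, p(\gamma_y)\, p(W \mid \gamma_w) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y).$$
    [Figure: graphical model with nodes $x_n$ and $y_n$ inside a plate $n = 1, \dots, N$, the parameters $W$, the precisions $\gamma_y$ and $\gamma_w$, and their Gamma hyperparameters]
    The posterior to be inferred is
    $$p(W, \gamma_w, \gamma_y \mid X, Y).$$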

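  21. Learning with Hamiltonian Monte Carlo (HMC)
    Inferring the hyperparameters
    Use Gibbs sampling to sample $W$, $\gamma_w$ and $\gamma_y$ in turn.
    • Sampling $W$
    As before, sample with HMC:
    $$W \sim p(W \mid Y, X, \gamma_w, \gamma_y).$$
    • Sampling $\gamma_w$
    $$p(\gamma_w \mid Y, X, W, \gamma_y) \propto p(W \mid \gamma_w)\, p(\gamma_w)$$
    Since $p(W \mid \gamma_w)$ is Gaussian and $p(\gamma_w)$ is a Gamma distribution (the conjugate prior for the precision of a Gaussian), $p(\gamma_w \mid Y, X, W, \gamma_y)$ is also a Gamma distribution. Hence
    $$\gamma_w \sim \mathrm{Gam}(\hat{a}_w, \hat{b}_w), \qquad \hat{a}_w = a_w + \frac{K_w}{2}, \qquad \hat{b}_w = b_w + \frac{1}{2} \sum_{w \in W} w^2,$$
    where $K_w$ is the total number of weight parameters.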

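  22. Learning with Hamiltonian Monte Carlo (HMC)
    Inferring the hyperparameters
    • Sampling $\gamma_y$
    $$p(\gamma_y \mid Y, X, W, \gamma_w) \propto p(\gamma_y) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$$
    Since $\prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$ is a product of Gaussians (and hence Gaussian) and $p(\gamma_y)$ is a Gamma distribution, $p(\gamma_y \mid Y, X, W, \gamma_w)$ is also a Gamma distribution. Hence
    $$\gamma_y \sim \mathrm{Gam}(\hat{a}_y, \hat{b}_y), \qquad \hat{a}_y = a_y + \frac{N}{2}, \qquad \hat{b}_y = b_y + \frac{1}{2} \sum_{n=1}^{N} \{y_n - f(x_n; W)\}^2.$$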
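The two conjugate Gibbs updates for the precisions can be sketched as follows. The function names and the toy inputs are hypothetical; the only library-specific point is that NumPy's Gamma sampler is parameterized by shape and scale, so the rate $\hat{b}$ has to be inverted.

```python
import numpy as np

def sample_gamma_w(weights, a_w, b_w, rng):
    """Draw gamma_w ~ Gam(a_w + K_w / 2, b_w + 0.5 * sum w^2)."""
    a_hat = a_w + 0.5 * weights.size
    b_hat = b_w + 0.5 * np.sum(weights ** 2)
    # NumPy's gamma sampler takes shape and *scale*, so scale = 1 / rate.
    return rng.gamma(shape=a_hat, scale=1.0 / b_hat), a_hat, b_hat

def sample_gamma_y(residuals, a_y, b_y, rng):
    """Draw gamma_y ~ Gam(a_y + N / 2, b_y + 0.5 * sum (y_n - f(x_n; W))^2)."""
    a_hat = a_y + 0.5 * residuals.size
    b_hat = b_y + 0.5 * np.sum(residuals ** 2)
    return rng.gamma(shape=a_hat, scale=1.0 / b_hat), a_hat, b_hat

rng = np.random.default_rng(0)
weights = 2.0 * rng.standard_normal(10_000)   # stand-in weights with sigma_w = 2
gamma_w, a_hat, b_hat = sample_gamma_w(weights, a_w=1.0, b_w=1.0, rng=rng)
print(gamma_w, 1.0 / np.sqrt(gamma_w))        # precision and implied std. deviation
```

In a full sampler these draws would alternate with an HMC update of $W$ conditioned on the current $\gamma_w$ and $\gamma_y$.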

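  23. Learning with Hamiltonian Monte Carlo (HMC)
    Inferring the hyperparameters
    The Gamma distribution $\mathrm{Gam}(a, b)$ has mean $a/b$ and variance $a/b^2$. Therefore, the larger $\hat{b}_y$ is, that is, the worse $f(x_n; W)$ predicts $y_n$, the larger the learned variance of the observations becomes.
    Here a single precision parameter $\gamma_w$ was shared by all weight parameters, but it is also possible to use a separate precision parameter for each layer of the NN, $(\gamma_w^{(1)}, \dots, \gamma_w^{(L)})$.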

  25. Speeding Up Approximate Bayesian Inference

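  26. Speeding Up Approximate Bayesian Inference
    [Drawbacks of Bayesian neural networks]
    Marginalizing over the parameters is computationally very expensive.
    ⟹ They were rarely used as prediction tools.
    In addition, deep learning requires a very large amount of training data.
    ⟹ Methods that assume batch learning are computationally inefficient.
    [How can these drawbacks be addressed?]
    • Make the computation efficient by using approximate inference for the marginalization.
    • Introduce minibatch learning.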

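  28. Learning with Stochastic Gradient Langevin Dynamics
    [Problem]
    Learning with MCMC is computationally inefficient on large datasets.
    [Solution]
    Combine a computationally efficient minibatch-based learning method (e.g. stochastic gradient descent) with MCMC, which can estimate uncertainty (e.g. the Metropolis-Hastings (MH) method or HMC).
    ⟹ Stochastic Markov chain Monte Carlo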

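  29. Learning with Stochastic Gradient Langevin Dynamics
    [Learning]
    We consider learning with stochastic gradient Langevin dynamics (SGLD), which combines stochastic gradient descent (SGD) with Langevin dynamics.
    Write the parameter update as $W_{\mathrm{new}} = W_{\mathrm{old}} + \Delta W$.
    In SGD, the update step can be written as
    $$\Delta W = \frac{\alpha_t}{2} \nabla_W \log p(W \mid X_S, Y_S) = \frac{\alpha_t}{2} \left\{ \frac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\},$$
    where $M$ is the size of the subsample $S$. In addition, to fit the framework of the Robbins-Monro algorithm, the learning rate $\alpha_t$ at step $t$ is chosen to satisfy
    $$\sum_{t=1}^{\infty} \alpha_t = \infty, \qquad \sum_{t=1}^{\infty} \alpha_t^2 < \infty.$$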

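  30. Learning with Stochastic Gradient Langevin Dynamics
    [Learning]
    Langevin dynamics, on the other hand, is a batch algorithm. With potential energy $\mathcal{U}(W) = -\log p(W \mid X, Y)$, step size $\epsilon = \sqrt{\alpha_t}$ and momentum vector $p$, the update step needed to obtain a sample is
    $$\Delta W = -\frac{\epsilon^2}{2} \nabla_W \mathcal{U}(W) + \epsilon p = \frac{\alpha_t}{2} \nabla_W \log p(W \mid X, Y) + \sqrt{\alpha_t}\, p$$
    $$= \frac{\alpha_t}{2} \left\{ \sum_{n=1}^{N} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p, \qquad p \sim \mathcal{N}(0, I).$$
    By making $\alpha_t$ small, the acceptance rate of the MH step can be brought arbitrarily close to 1.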

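  31. Learning with Stochastic Gradient Langevin Dynamics
    [Learning]
    Combining the two (SGD and Langevin dynamics) gives the update step
    $$\Delta W = \frac{\alpha_t}{2} \left\{ \frac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p, \qquad p \sim \mathcal{N}(0, I).$$
    The learning rate satisfies the same conditions as before.
    <When $t$ is small (early stage of learning)>
    The advantages of SGD are exploited to explore the posterior space efficiently.
    <As $t$ grows>
    Langevin dynamics yields approximate samples from the true posterior.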
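The SGLD update can be tried on a toy problem as sketched below. Everything specific here is an assumption made for illustration: the model $y_n \sim \mathcal{N}(w, 1)$ with prior $w \sim \mathcal{N}(0, \sigma_w^2)$, the polynomially decaying schedule $\alpha_t = a (b + t)^{-\gamma}$ with $0.5 < \gamma \le 1$ (one common way to satisfy the Robbins-Monro conditions), and the function name `sgld`.

```python
import numpy as np

def sgld(y, M, n_steps, sigma_w, rng, a=1e-3, b=1.0, gamma=0.55):
    """SGLD for the toy model y_n ~ N(w, 1) with prior w ~ N(0, sigma_w^2)."""
    N = len(y)
    w, trace = 0.0, []
    for t in range(n_steps):
        alpha_t = a * (b + t) ** (-gamma)             # Robbins-Monro schedule
        batch = rng.choice(y, size=M, replace=False)  # minibatch S
        grad_lik = (N / M) * np.sum(batch - w)        # rescaled likelihood gradient
        grad_prior = -w / sigma_w ** 2
        noise = np.sqrt(alpha_t) * rng.standard_normal()
        w = w + 0.5 * alpha_t * (grad_lik + grad_prior) + noise
        trace.append(w)
    return np.array(trace)

rng = np.random.default_rng(0)
y = 2.0 + rng.standard_normal(1000)
trace = sgld(y, M=50, n_steps=2000, sigma_w=10.0, rng=rng)
print(trace[500:].mean(), trace[500:].std())          # compare with y.mean() and 1/sqrt(N)
```

Early iterations behave like noisy SGD, while later iterations, where the injected noise dominates the minibatch noise, behave like approximate posterior samples.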

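  33. Stochastic Variational Inference
    The previous part introduced the combination of stochastic gradient methods and MCMC.
    Next, we combine variational inference with stochastic gradient descent.
    ⟹ Stochastic variational inference
    With $\xi$ denoting the set of variational parameters, the goal is to find an approximate distribution $q(W; \xi)$ such that
    $$q(W; \xi) \approx p(W \mid X, Y).$$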

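  34. Stochastic Variational Inference
    Minibatches are introduced for efficiency. The ELBO and its minibatch version are
    $$\mathcal{L}(\xi) = \sum_{n=1}^{N} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)],$$
    $$\mathcal{L}_S(\xi) = \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)].$$
    The minibatch quantity $\mathcal{L}_S$ is an unbiased estimator of $\mathcal{L}$:
    $$\mathbb{E}_S[\mathcal{L}_S(\xi)] = \mathcal{L}(\xi).$$
    Therefore, instead of maximizing $\mathcal{L}(\xi)$ directly, we can maximize $\mathcal{L}_S(\xi)$ and approximate the posterior of the parameters efficiently.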
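The unbiasedness claim $\mathbb{E}_S[\mathcal{L}_S(\xi)] = \mathcal{L}(\xi)$ can be checked by brute force on a tiny example. The per-datum values in `ell` and the constant `kl` are made-up stand-ins for the expected log-likelihood terms and the KL term; the check enumerates every minibatch of size $M$ drawn without replacement.

```python
import itertools
import numpy as np

# Stand-ins for E_q[log p(y_n | f(x_n; W))], one value per data point.
ell = np.array([-1.2, -0.3, -2.5, -0.7, -1.9, -0.4])
kl = 0.8                                    # stand-in for the KL term
N, M = len(ell), 2

full = ell.sum() - kl                       # full-data ELBO
estimates = [(N / M) * ell[list(S)].sum() - kl
             for S in itertools.combinations(range(N), M)]
print(full, np.mean(estimates))             # the two numbers coincide
```

The factor $N/M$ is what makes the average over minibatches match the full sum; without it the likelihood term would be underweighted relative to the KL term.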

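  35. Stochastic Variational Inference
    In the following slides, the approximate distribution is assumed to be a product of independent Gaussians, and the ELBO is maximized with a gradient-based method:
    $$q(W; \xi) = \prod_{i,j,l} \mathcal{N}\big(w_{i,j}^{(l)} \mid \mu_{i,j}^{(l)}, \sigma_{i,j}^{(l)\,2}\big).$$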

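  37. Monte Carlo Approximation of Gradients
    When maximizing the ELBO of a neural network, the parameters $W$ in the ELBO cannot be integrated out analytically.
    ⟹ Maximize $\mathcal{L}_S(\xi)$ with a gradient-based method.
    To do so, the gradient of $\mathcal{L}_S(\xi)$ with respect to the variational parameters $\xi$ is required.
    In $D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$ both distributions are Gaussian, so its gradient can be computed analytically. In contrast, the expected log-likelihood $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$ cannot be integrated analytically.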

  38. Monte Carlo Approximation of Gradients
    Since the expected log-likelihood $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$ in $\mathcal{L}_S(\xi)$ cannot be integrated analytically:
    ⟹ Approximate the integral (the log-likelihood term) with the Monte Carlo method and obtain an estimate of the gradient.

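  39. Monte Carlo Approximation of Gradients
    [Goal]
    For a parameter $w \in \mathbb{R}$, a function $f(w)$ and a distribution $q(w; \xi)$, estimate the gradient
    $$I(\xi) = \nabla_\xi \int f(w)\, q(w; \xi)\, dw.$$
    [Methods]
    Score function estimation, reparameterization gradients, generalized reparameterization gradients, implicit differentiation, and others.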

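  40. Monte Carlo Approximation of Gradients
    Score function estimation
    Rewrite $I(\xi)$ as follows:
    $$I(\xi) = \nabla_\xi \int f(w)\, q(w; \xi)\, dw = \int f(w)\, \nabla_\xi q(w; \xi)\, dw = \int f(w)\, q(w; \xi)\, \nabla_\xi \log q(w; \xi)\, dw = \mathbb{E}_{q(w; \xi)}[f(w)\, \nabla_\xi \log q(w; \xi)].$$
    Hence, drawing several samples of $w$ from $q(w; \xi)$ and evaluating the derivative gives an unbiased estimator of $I(\xi)$.
    [Condition for applicability] The derivative of $\log q(w; \xi)$ can be computed.
    [Problem] In practice the estimator has very high variance.
    [Remedy] Combine it with a variance reduction technique such as control variates.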

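  41. Monte Carlo Approximation of Gradients
    Reparameterization gradient
    Instead of sampling $w$ directly from $q(w; \xi)$, sample $\epsilon$ from a distribution $q(\epsilon)$ that does not depend on $\xi$ and apply a transformation $w = g(\xi; \epsilon)$, which samples $w$ indirectly.
    This gives the following unbiased estimator of the gradient:
    $$\mathbb{E}_{q(\epsilon)}[f'(g(\xi; \epsilon))\, \nabla_\xi g(\xi; \epsilon)] = I(\xi).$$
    [Example] $\xi = \{\hat{\mu}, \hat{\sigma}^2\}$ and $q(w; \xi) = \mathcal{N}(w \mid \hat{\mu}, \hat{\sigma}^2)$
    With $\tilde{\epsilon} \sim \mathcal{N}(0, 1) = q(\epsilon)$ and $\tilde{w} = g(\xi; \tilde{\epsilon}) = \hat{\mu} + \hat{\sigma} \tilde{\epsilon}$, the sample $\tilde{w}$ is distributed as $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$. The derivatives with respect to the variational parameters are as follows, which gives an unbiased estimator of the gradient for each variational parameter:
    $$\frac{\partial}{\partial \hat{\mu}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\mu}) = \mathbb{E}_{q(w; \xi)}[f'(w)],$$
    $$\frac{\partial}{\partial \hat{\sigma}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}}\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\sigma}) = \mathbb{E}_{q(w; \xi)}\left[ f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}} \right].$$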
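The score function and reparameterization estimators can be compared on a case where the exact gradient is known. The sketch below assumes $f(w) = w^2$ and $q(w; \xi) = \mathcal{N}(w \mid \mu, \sigma^2)$, for which $\mathbb{E}[w^2] = \mu^2 + \sigma^2$ and therefore $I(\mu) = 2\mu$ and $I(\sigma) = 2\sigma$; the function names are hypothetical.

```python
import numpy as np

def f(w):
    return w ** 2

def f_prime(w):
    return 2.0 * w

def score_function_grads(mu, sigma, eps):
    """Per-sample estimates f(w) * grad log q(w; xi)."""
    w = mu + sigma * eps                               # samples from q(w; xi)
    dlogq_dmu = (w - mu) / sigma ** 2
    dlogq_dsigma = -1.0 / sigma + (w - mu) ** 2 / sigma ** 3
    return f(w) * dlogq_dmu, f(w) * dlogq_dsigma

def reparam_grads(mu, sigma, eps):
    """Per-sample estimates f'(g(xi; eps)) * grad g(xi; eps)."""
    w = mu + sigma * eps                               # w = g(xi; eps)
    return f_prime(w) * 1.0, f_prime(w) * eps          # dg/dmu = 1, dg/dsigma = eps

rng = np.random.default_rng(0)
mu, sigma = 1.5, 0.8
eps = rng.standard_normal(200_000)
sf_mu, sf_sigma = score_function_grads(mu, sigma, eps)
rp_mu, rp_sigma = reparam_grads(mu, sigma, eps)
print("exact       :", 2 * mu, 2 * sigma)
print("score func. :", sf_mu.mean(), sf_sigma.mean(), "variance", sf_mu.var())
print("reparam.    :", rp_mu.mean(), rp_sigma.mean(), "variance", rp_mu.var())
```

Both estimators agree with the exact gradient on average, but the per-sample variance of the score function estimator is much larger in this example, which illustrates why variance reduction is usually needed for it.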

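  42. Monte Carlo Approximation of Gradients
    Generalizing the reparameterization gradient
    [Advantage of the reparameterization gradient]
    The variance of the gradient is smaller than with score function estimation.
    [Problem with the reparameterization gradient]
    A change of variables $g$ is required (it cannot be applied to every distribution).
    [Remedy (example): generalized reparameterization gradient]
    The constraints on $g$ are relaxed so that the method applies to many more kinds of distributions.
    The noise distribution is allowed to retain a dependence on the variational parameters, as in $q(\epsilon; \xi)$.
    [Remedy (example): implicit differentiation]
    [Conditions for use]
    • Finding $g$ is difficult, but the inverse transformation $g^{-1}$ is easy to obtain.
    • The distribution is continuous.
    The gradient of the expectation is obtained by differentiating $\epsilon = g^{-1}(w; \xi)$ with respect to $\xi$.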

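  43. Monte Carlo Approximation of Gradients
    Generalizing the reparameterization gradient
    [Remedy (example): continuous relaxation]
    A way to apply the reparameterization gradient to discrete probability distributions.
    [Example]
    The categorical distribution (discrete) coincides with the Gumbel-softmax distribution (continuous) with its temperature parameter set to 0.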

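  45. Variational Inference with Approximate Gradients
    We now actually maximize the ELBO of a Bayesian neural network using the reparameterization gradient.
    ① Draw a minibatch $\mathcal{D}_S$ at random from the dataset $\mathcal{D}$.
    ② Draw $M$ noise samples ($M$ is the number of samples in the minibatch):
    $$\tilde{\epsilon}_i \sim \mathcal{N}(0, I).$$
    ③ Compute the gradient with respect to the variational parameters:
    $$\mathcal{L}_S(\xi) = \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$$
    $$= \frac{N}{M} \sum_{n \in S} \int p(\epsilon) \log p(y_n \mid f(x_n; g(\xi; \epsilon)))\, d\epsilon - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$$
    $$\approx \mathcal{L}_{S,\epsilon}(\xi) \quad (\because\ \mathbb{E}_{S,\epsilon}[\mathcal{L}_{S,\epsilon}(\xi)] = \mathcal{L}(\xi)),$$
    $$\mathcal{L}_{S,\epsilon}(\xi) = \frac{N}{M} \sum_{n \in S} \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)],$$
    $$\nabla_\xi \mathcal{L}_S(\xi) \approx \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi) = \frac{N}{M} \sum_{n \in S} \nabla_\xi \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - \nabla_\xi D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)].$$
    ④ Update the variational parameters in the direction that increases the ELBO:
    $$\xi \leftarrow \xi + \alpha \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi).$$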
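Steps ① to ④ can be sketched end to end on a model whose gradients are easy to write by hand. The example below is an assumption-laden illustration, not the slide's network: $f(x; w) = w^\top x$ (Bayesian linear regression), a diagonal Gaussian $q$ with $\sigma = \exp(\rho)$ to keep the standard deviations positive, a fixed learning rate, and synthetic data. With an NN, `dlogp_dw` would come from backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, D = 200, 50, 2
sigma_y, sigma_w, lr = 0.5, 1.0, 2e-4
w_true = np.array([1.0, -2.0])
X = rng.standard_normal((N, D))
Y = X @ w_true + sigma_y * rng.standard_normal(N)

mu = np.zeros(D)                 # variational means
rho = np.full(D, np.log(0.1))    # log standard deviations, so sigma = exp(rho) > 0

for step in range(4000):
    idx = rng.choice(N, size=M, replace=False)          # (1) draw minibatch S
    eps = rng.standard_normal((M, D))                   # (2) one noise draw per data point
    sigma = np.exp(rho)
    W = mu + sigma * eps                                # w_n = g(xi; eps_n)
    resid = Y[idx] - np.sum(X[idx] * W, axis=1)
    dlogp_dw = X[idx] * resid[:, None] / sigma_y ** 2   # grad_w log p(y_n | f(x_n; w_n))
    # (3) reparameterization gradients of L_{S,eps} minus the analytic KL gradients
    grad_mu = (N / M) * dlogp_dw.sum(axis=0) - mu / sigma_w ** 2
    grad_sigma = (N / M) * (dlogp_dw * eps).sum(axis=0) - (sigma / sigma_w ** 2 - 1.0 / sigma)
    # (4) gradient ascent on the ELBO (chain rule: d/d rho = sigma * d/d sigma)
    mu = mu + lr * grad_mu
    rho = rho + lr * grad_sigma * sigma

print(mu, np.exp(rho))
```

The KL gradients use the closed form for two Gaussians, $D_{\mathrm{KL}} = \log(\sigma_w/\sigma) + (\sigma^2 + \mu^2)/(2\sigma_w^2) - 1/2$ per weight, so only the likelihood term is estimated by sampling.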

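  47. Learning with Expectation Propagation
    In the forward pass, probabilities are propagated through the neural network to evaluate the marginal likelihood; in the backward pass, expectation propagation is used to compute the gradient of the marginal likelihood in order to learn the parameters.
    Probabilistic backpropagation
    Probabilistic backpropagation processes data sequentially, so it scales to learning with large datasets. The precision parameter of the observations and the precision parameter governing the prior on the weights can also be inferred approximately.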

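  48. Learning with Expectation Propagation
    [Learning with expectation propagation]
    ‣ Model
    ‣ Approximate distribution
    ‣ Initialization and introduction of the prior factors
    ‣ Introduction of the likelihood factors
    ‣ Distribution of the activations
    ‣ Gradient-based learning
    ‣ Summary of probabilistic backpropagation
    ‣ Related methods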

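  49. Learning with Expectation Propagation
    Model
    [Setup]
    Let $y_n \in \mathbb{R}$ and define the likelihood as
    $$p(Y \mid X, W, \gamma_y) = \prod_{n=1}^{N} \mathcal{N}(y_n \mid f(x_n; W), \gamma_y^{-1}), \qquad p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid \alpha_0^{\gamma_y}, \beta_0^{\gamma_y}).$$
    The activation function of $f(x_n; W)$ is the rectified linear unit (ReLU).
    The parameters $W$ follow independent Gaussian distributions:
    $$p(W \mid \gamma_w) = \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid 0, \gamma_w^{-1}), \qquad p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid \alpha_0^{\gamma_w}, \beta_0^{\gamma_w}).$$
    [Goal]
    Approximately infer the posterior
    $$p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w).$$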

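  51. Learning with Expectation Propagation
    Approximate distribution
    Probabilistic backpropagation is based on assumed density filtering (ADF).
    The approximate distribution of the parameters is taken to be
    $$q(W, \gamma_y, \gamma_w) = \mathrm{Gam}(\gamma_y \mid \alpha^{\gamma_y}, \beta^{\gamma_y})\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w}) \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid m_{i,j}^{(l)}, v_{i,j}^{(l)}) = q(\gamma_y)\, q(\gamma_w)\, q(W).$$
    This distribution is updated sequentially by moment matching, as in assumed density filtering.
    Assumed density filtering:
    $$q_{i+1}(\theta) \approx r_{i+1}(\theta) = \frac{1}{Z_{i+1}} f_{i+1}(\theta)\, q_i(\theta), \qquad f_i(\theta): \text{factor}.$$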

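  53. Learning with Expectation Propagation
    Initialization and introduction of the prior factors
    [Initialization]
    So that the approximate distribution is uninformative, initialize with $m_{i,j}^{(l)} = 0$, $v_{i,j}^{(l)} = \infty$, $\alpha^{\gamma_y} = 1$, $\beta^{\gamma_y} = 0$, $\alpha^{\gamma_w} = 1$, $\beta^{\gamma_w} = 0$.
    [Introducing the prior factors]
    The approximate distribution is updated by adding the factors of the target posterior one at a time.
    The prior factors of this model are
    $$p(\gamma_y), \quad p(\gamma_w), \quad \{p(w_{i,j}^{(l)} \mid \gamma_w)\}_{i,j,l}.$$
    Posterior: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
    Approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$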

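  54. Learning with Expectation Propagation
    Initialization and introduction of the prior factors
    [Introducing the prior factors]
    • Adding the factors $p(\gamma_w)$ and $p(\gamma_y)$.
    The approximate distributions $q(\gamma_y)$, $q(\gamma_w)$ have the same form as the priors $p(\gamma_y)$, $p(\gamma_w)$, so the update for these factors is
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx p(\gamma_y)\, p(\gamma_w)\, q(W),$$
    $$\alpha_{\mathrm{new}}^{\gamma_y} = \alpha_0^{\gamma_y}, \quad \beta_{\mathrm{new}}^{\gamma_y} = \beta_0^{\gamma_y}, \quad \alpha_{\mathrm{new}}^{\gamma_w} = \alpha_0^{\gamma_w}, \quad \beta_{\mathrm{new}}^{\gamma_w} = \beta_0^{\gamma_w}.$$
    In other words,
    $$q(\gamma_y) \leftarrow p(\gamma_y), \qquad q(\gamma_w) \leftarrow p(\gamma_w).$$
    Assumed density filtering:
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx r = \frac{1}{Z} f^{\mathrm{new}}(\gamma_y, \gamma_w, W)\, q(\gamma_y)\, q(\gamma_w)\, q(W).$$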

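  55. Learning with Expectation Propagation
    Initialization and introduction of the prior factors
    [Introducing the prior factors]
    • Adding a factor $p(w_{i,j}^{(l)} \mid \gamma_w)$
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
    $$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_w)\, q(W)$$
    From here on, the indices $i, j, l$ are omitted.
    The distributions that are updated are $q(W)$ and $q(\gamma_w)$. Each is updated as follows:
    $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0}\, \underline{p(w \mid \gamma_w)\, q(\gamma_w)}\, q(W),$$
    $$q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0}\, \underline{p(w \mid \gamma_w)\, q(W)}\, q(\gamma_w).$$
    The underlined part is regarded as the factor. Note that the update of the second distribution does not use the newly updated first distribution, so the order of the two updates does not matter.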

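  56. Learning with Expectation Propagation
    Initialization and introduction of the prior factors
    [Introducing the prior factors]
    • Adding a factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(W)$
    $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0}\, p(w \mid \gamma_w)\, q(\gamma_w)\, q(W)$$
    Since $q(W)$ is Gaussian, moment matching updates the approximate distribution as follows, in the same way as in the earlier Gaussian example:
    $$m^{\mathrm{new}} = m + v \frac{\partial}{\partial m} \log Z_0,$$
    $$v^{\mathrm{new}} = v - v^2 \left\{ \left( \frac{\partial}{\partial m} \log Z_0 \right)^2 - 2 \frac{\partial}{\partial v} \log Z_0 \right\},$$
    $$Z_0 = Z(\alpha^{\gamma_w}, \beta^{\gamma_w}) = \int p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)\, dw\, d\gamma_w = \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w})\, dw\, d\gamma_w.$$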

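  57. Learning with Expectation Propagation
    Initialization and introduction of the prior factors
    [Introducing the prior factors]
    • Adding a factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(\gamma_w)$
    $$q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0}\, p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)$$
    Since $q(\gamma_w)$ is a Gamma distribution, moment matching updates the approximate distribution as follows, in the same way as in the earlier Gamma example:
    $$\alpha_{\mathrm{new}}^{\gamma_w} = \left\{ Z_0 Z_2 Z_1^{-2}\, \frac{\alpha^{\gamma_w} + 1}{\alpha^{\gamma_w}} - 1 \right\}^{-1},$$
    $$\beta_{\mathrm{new}}^{\gamma_w} = \left\{ Z_2 Z_1^{-1}\, \frac{\alpha^{\gamma_w} + 1}{\beta^{\gamma_w}} - Z_1 Z_0^{-1}\, \frac{\alpha^{\gamma_w}}{\beta^{\gamma_w}} \right\}^{-1},$$
    where $Z_1 = Z(\alpha^{\gamma_w} + 1, \beta^{\gamma_w})$ and $Z_2 = Z(\alpha^{\gamma_w} + 2, \beta^{\gamma_w})$.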

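  58. Learning with Expectation Propagation
    Initialization and introduction of the prior factors
    [Introducing the prior factors]
    The normalization constant $Z(\alpha^{\gamma_w}, \beta^{\gamma_w})$ cannot be computed exactly, so the Student's t-distribution that appears during the calculation is approximated by a Gaussian with the same mean and variance.
    $$Z(\alpha^{\gamma_w}, \beta^{\gamma_w}) = \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w$$
    $$= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w})\, dw\, d\gamma_w$$
    $$= \int \mathrm{St}(w \mid 0, \alpha^{\gamma_w}/\beta^{\gamma_w}, 2\alpha^{\gamma_w})\, \mathcal{N}(w \mid m, v)\, dw$$
    $$\approx \int \mathcal{N}(w \mid 0, \beta^{\gamma_w}/(\alpha^{\gamma_w} - 1))\, \mathcal{N}(w \mid m, v)\, dw \quad \text{(t-distribution approximated by a Gaussian with equal mean and variance)}$$
    $$= \mathcal{N}(m \mid 0, \beta^{\gamma_w}/(\alpha^{\gamma_w} - 1) + v).$$
    Here $\mathrm{St}(w \mid 0, \lambda, \nu)$ is parameterized by the precision $\lambda = \alpha^{\gamma_w}/\beta^{\gamma_w}$ and the degrees of freedom $\nu = 2\alpha^{\gamma_w}$, so its variance is $\nu / \{\lambda(\nu - 2)\} = \beta^{\gamma_w}/(\alpha^{\gamma_w} - 1)$.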
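The moment-matching update of $q(w) = \mathcal{N}(w \mid m, v)$ can be sketched with the Gaussian approximation of the normalization constant, $Z \approx \mathcal{N}(m \mid 0, \beta/(\alpha - 1) + v)$. In this illustration the derivatives of $\log Z$ are taken by central finite differences purely for brevity (analytic derivatives would normally be used), and the function names are hypothetical.

```python
import numpy as np

def log_Z(m, v, alpha, beta):
    """log of the approximate normalizer N(m | 0, beta / (alpha - 1) + v)."""
    s = beta / (alpha - 1.0) + v
    return -0.5 * np.log(2.0 * np.pi * s) - 0.5 * m ** 2 / s

def update_q_w(m, v, alpha, beta, h=1e-5):
    """Moment-matching update of q(w) = N(w | m, v) using derivatives of log Z."""
    # Central finite differences stand in for the analytic derivatives.
    dm = (log_Z(m + h, v, alpha, beta) - log_Z(m - h, v, alpha, beta)) / (2 * h)
    dv = (log_Z(m, v + h, alpha, beta) - log_Z(m, v - h, alpha, beta)) / (2 * h)
    m_new = m + v * dm
    v_new = v - v ** 2 * (dm ** 2 - 2.0 * dv)
    return m_new, v_new

m_new, v_new = update_q_w(m=1.0, v=2.0, alpha=3.0, beta=4.0)
print(m_new, v_new)
```

As a sanity check, with these inputs the Gaussian stand-in for the prior factor has variance $\beta/(\alpha - 1) = 2$, and the update should reproduce the mean and variance of the product $\mathcal{N}(w \mid 0, 2)\, \mathcal{N}(w \mid 1, 2)$.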

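  60. Learning with Expectation Propagation
    Introducing the likelihood factors
    After all prior factors have been added, the factors of the likelihood $p(Y \mid X, W, \gamma_y)$ are added one at a time.
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
    $$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W)$$
    Since $q(W)$ is Gaussian and $q(\gamma_y)$ is a Gamma distribution, the updates are carried out in the same way as before:
    $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0}\, p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W),$$
    $$q^{\mathrm{new}}(\gamma_y) \approx \frac{1}{Z_0}\, p(y_i \mid x_i, W, \gamma_y)\, q(W)\, q(\gamma_y).$$
    ⟹ The goal is to compute the normalization constant for the newly added likelihood factor $p(y_i \mid x_i, W, \gamma_y)$, which is the part of the update that differs from adding $p(w_{i,j}^{(l)} \mid \gamma_w)$.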

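  61. Learning with Expectation Propagation
    Introducing the likelihood factors
    The normalization constant obtained when the $i$-th likelihood factor is added is computed approximately as follows.
    $$Z(\alpha^{\gamma_y}, \beta^{\gamma_y}) = \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w$$
    $$= \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y)\, dW\, d\gamma_y$$
    $$\approx \int \mathcal{N}(y_i \mid z^{(L)}, \gamma_y^{-1})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, \mathrm{Gam}(\gamma_y \mid \alpha^{\gamma_y}, \beta^{\gamma_y})\, dz^{(L)}\, d\gamma_y$$
    $$= \int \mathrm{St}(y_i \mid z^{(L)}, \alpha^{\gamma_y}/\beta^{\gamma_y}, 2\alpha^{\gamma_y})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)}$$
    $$\approx \int \mathcal{N}(y_i \mid z^{(L)}, \beta^{\gamma_y}/(\alpha^{\gamma_y} - 1))\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)}$$
    $$= \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta^{\gamma_y}/(\alpha^{\gamma_y} - 1) + v_{z^{(L)}}).$$
    First approximation: the hidden units $z^{(l)} \in \mathbb{R}^{H_l}$ of layer $l$ are assumed to follow a distribution with mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ (details on the next slide).
    Second approximation: the t-distribution, which has precision $\alpha^{\gamma_y}/\beta^{\gamma_y}$ and $2\alpha^{\gamma_y}$ degrees of freedom and hence variance $\beta^{\gamma_y}/(\alpha^{\gamma_y} - 1)$, is replaced by a Gaussian with the same mean and variance.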

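  62. Learning with Expectation Propagation
    Introducing the likelihood factors
    The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are obtained approximately by a recursive computation.
    [Computation]
    Assume that the values of the hidden units in layer $l$, $z^{(l)} \in \mathbb{R}^{H_l}$, have mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$. Let the vector obtained after multiplying by the weight matrix of layer $l$, $W^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$, be the activation
    $$a^{(l)} = W^{(l)} z^{(l-1)} / \sqrt{H_{l-1}}.$$
    The mean and variance of $a^{(l)}$ are
    $$m_{a^{(l)}} = M^{(l)} m_{z^{(l-1)}} / \sqrt{H_{l-1}},$$
    $$v_{a^{(l)}} = \{ (M^{(l)} \odot M^{(l)})\, v_{z^{(l-1)}} + V^{(l)} (m_{z^{(l-1)}} \odot m_{z^{(l-1)}}) + V^{(l)} v_{z^{(l-1)}} \} / H_{l-1},$$
    where the entries of $M^{(l)}, V^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ are the means $m_{i,j}^{(l)}$ and variances $v_{i,j}^{(l)}$ of the individual parameters, and $\odot$ denotes the Hadamard product.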

  63. Learning with Expectation Propagation
    Introducing the likelihood factors
    The formulas $m_{a^{(l)}} = M^{(l)} m_{z^{(l-1)}} / \sqrt{H_{l-1}}$ and $v_{a^{(l)}} = \{ (M^{(l)} \odot M^{(l)})\, v_{z^{(l-1)}} + V^{(l)} (m_{z^{(l-1)}} \odot m_{z^{(l-1)}}) + V^{(l)} v_{z^{(l-1)}} \} / H_{l-1}$ show that the mean $m_{z^{(l-1)}}$ and variance $v_{z^{(l-1)}}$ of the hidden units in layer $l-1$ determine the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activations in layer $l$.

  64. Learning with Expectation Propagation
    Introducing the likelihood factors
    The mean $m_{z^{(l-1)}}$ and variance $v_{z^{(l-1)}}$ of the hidden units in layer $l-1$ give the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activations in layer $l$.
    If the mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ of the hidden units in layer $l$ can in turn be obtained from $m_{a^{(l)}}$ and $v_{a^{(l)}}$, the whole computation can be carried out recursively.
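The propagation of means and variances through one linear layer can be written in a few lines. The sketch assumes, as the formulas do, that all weights and hidden units are mutually independent; the function name `linear_moments` and the toy inputs are hypothetical.

```python
import numpy as np

def linear_moments(M, V, m_z, v_z):
    """Mean and variance of a = W z / sqrt(H) for independent W and z.

    M, V : (H_l, H_{l-1}) arrays of weight means and variances.
    m_z, v_z : (H_{l-1},) arrays of hidden-unit means and variances.
    """
    H = M.shape[1]                                     # H_{l-1}
    m_a = M @ m_z / np.sqrt(H)
    v_a = ((M * M) @ v_z + V @ (m_z * m_z) + V @ v_z) / H
    return m_a, v_a

# Input layer: z^(0) is the observed input, so its variance is zero.
x = np.array([0.5, -1.0, 2.0])
M1 = np.ones((2, 3))
V1 = 0.1 * np.ones((2, 3))
m_a, v_a = linear_moments(M1, V1, m_z=x, v_z=np.zeros(3))
print(m_a, v_a)
```

Each of the three variance terms corresponds to one term of $\mathrm{Var}(wz) = m_w^2 v_z + v_w m_z^2 + v_w v_z$ for independent $w$ and $z$, summed over the incoming units.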

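  66. Learning with Expectation Propagation
    Distribution of the activations
    We compute the distribution $p(a^{(l)} \mid W^{(l)}, z^{(l-1)})$ of the activation $a^{(l)}$. By the central limit theorem, when the number of hidden units $H_{l-1}$ is large, $a^{(l)}$ approximately follows a Gaussian distribution:
    $$p(a^{(l)} \mid W^{(l)}, z^{(l-1)}) \approx q(a^{(l)}) = \mathcal{N}(a^{(l)} \mid m_{a^{(l)}}, v_{a^{(l)}}).$$
    When a Gaussian variable passes through a ReLU, its distribution becomes a mixture of two distributions (see the right panel of the corresponding figure).
    ① Samples that arrive with negative input become a point mass with mean $\mu_p = 0$ and variance $\sigma_p^2 = 0$.
    ② Samples that arrive with non-negative input follow a truncated Gaussian with the part below 0 cut off.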

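  67. Learning with Expectation Propagation
    Distribution of the activations
    [General formulas for the mean and variance of a mixture distribution]
    For a mixture distribution with $K$ components, with mixing coefficients $\pi_k > 0$ and $\sum_{k=1}^{K} \pi_k = 1$, and component means $\mu_k$ and variances $\sigma_k^2$, the mean and variance are in general
    $$\mathbb{E}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k \mu_k,$$
    $$\mathbb{V}[x_{\mathrm{mix}}] = \sum_{k=1}^{K} \pi_k (\mu_k^2 + \sigma_k^2) - \mathbb{E}[x_{\mathrm{mix}}]^2.$$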

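  68. Learning with Expectation Propagation
    Distribution of the activations
    [Application to the mixture distribution of the activations]
    Let $\pi_p$ and $\pi_t$ be the mixing coefficients of the point mass and of the truncated Gaussian, so that $\pi_p + \pi_t = 1$.
    Writing $\bar{\mu} = -\mu/\sigma$, $\pi_p$ is
    $$\pi_p = \int_{-\infty}^{0} \mathcal{N}(x \mid \mu, \sigma^2)\, dx = \Phi(-\mu/\sigma) = \Phi(\bar{\mu}).$$
    Hence the coefficient of the truncated Gaussian is
    $$\pi_t = 1 - \pi_p = \Phi(-\bar{\mu}).$$
    From [S. Kotz], the mean $\mu_t$ and variance $\sigma_t^2$ of the truncated Gaussian are
    $$\mu_t = \mu + \sigma\, \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})},$$
    $$\sigma_t^2 = \sigma^2 \left\{ 1 + \bar{\mu}\, \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} - \left( \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} \right)^2 \right\}.$$
    Substituting these into the general formulas for $\mathbb{E}[x_{\mathrm{mix}}]$ and $\mathbb{V}[x_{\mathrm{mix}}]$ gives the mean and variance of $z$.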
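The formulas above combine into a short routine for the mean and variance of a ReLU output. This is a sketch using only the standard library; the helper names are hypothetical, and no care is taken for numerical stability when $\mu/\sigma$ is very negative (where $\Phi(-\bar{\mu})$ underflows).

```python
import math

def norm_pdf(x):
    """Standard normal density N(x | 0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal cumulative distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def relu_moments(mu, sigma):
    """Mean and variance of z = max(0, a) for a ~ N(mu, sigma^2)."""
    mu_bar = -mu / sigma
    pi_p = norm_cdf(mu_bar)                 # weight of the point mass at 0
    pi_t = 1.0 - pi_p                       # weight of the truncated Gaussian
    lam = norm_pdf(mu_bar) / norm_cdf(-mu_bar)
    mu_t = mu + sigma * lam                 # truncated-Gaussian mean
    var_t = sigma ** 2 * (1.0 + mu_bar * lam - lam ** 2)
    mean = pi_t * mu_t                      # the point mass contributes 0
    var = pi_t * (mu_t ** 2 + var_t) - mean ** 2
    return mean, var

print(relu_moments(0.0, 1.0))
```

Applied elementwise to $m_{a^{(l)}}$ and $\sqrt{v_{a^{(l)}}}$, this yields $m_{z^{(l)}}$ and $v_{z^{(l)}}$, which closes the recursion from one layer to the next.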

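  69. Learning with Expectation Propagation
    Distribution of the activations
    In other words, the mean and variance of the hidden units in layer $l$ can be computed from the mean and variance of the activations in layer $l$.
    ⟹ The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are obtained approximately by a recursive computation.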

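  71. Learning with Expectation Propagation
    Gradient-based learning
    $z^{(0)}$ is treated as having mean $x_i$ and variance $0$ (these are the initial values $m_{z^{(0)}}$, $v_{z^{(0)}}$ of the recursion).
    The preceding slides described the sequence of steps that takes the output $z^{(l-1)}$ of layer $l-1$ through the activation $a^{(l)}$ to obtain the mean and variance of the output $z^{(l)}$ of layer $l$ (by the central limit theorem, the activation can be approximated by a Gaussian). Applying this approximation recursively, the distribution of the final layer $z^{(L)}$ can be approximated by the Gaussian $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$.
    This gives an approximate expression for the normalization constant:
    $$Z(\alpha^{\gamma_y}, \beta^{\gamma_y}) \approx \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta^{\gamma_y}/(\alpha^{\gamma_y} - 1) + v_{z^{(L)}}),$$
    where $\beta^{\gamma_y}/(\alpha^{\gamma_y} - 1)$ is the variance of the Student's t-distribution with precision $\alpha^{\gamma_y}/\beta^{\gamma_y}$ and $2\alpha^{\gamma_y}$ degrees of freedom.
    Once the normalization constant is available, its gradient is obtained by differentiating with respect to the parameters.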

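  73. Learning with Expectation Propagation
    Summary of probabilistic backpropagation
    Model definition:
    $$p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$$
    Introduction of the approximate distribution:
    $$q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$$
    Initialization of the approximate distribution:
    $$q_0(\gamma_y), \quad q_0(\gamma_w), \quad q_0(W)$$
    Introduction of the prior factors (part 1):
    Add the factor $p(\gamma_y)$: $q(\gamma_y) \leftarrow p(\gamma_y)$
    Add the factor $p(\gamma_w)$: $q(\gamma_w) \leftarrow p(\gamma_w)$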

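  74. Learning with Expectation Propagation
    Summary of probabilistic backpropagation
    Introduction of the prior factors (part 2):
    for $l = 1$ to $L$ do
      for $j = 1$ to $H_{l-1}$ do
        for $i = 1$ to $H_l$ do
          Add the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$:
          ・ update $q(W)$
          ・ update $q(\gamma_w)$
    Forward pass, for each $p(y_i \mid x_i, W, \gamma_y)$ with $i \in \mathcal{D}_S$:
    Recursively compute the means and variances of the hidden units and activations.
    Introduction of the likelihood factor $p(y_i \mid x_i, W, \gamma_y)$: update $q(W)$ and $q(\gamma_y)$.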

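  76. Learning with Expectation Propagation
    Related methods
    Deterministic variational inference is a method similar to probabilistic backpropagation.
    [Drawback of variational inference]
    Evaluating the ELBO requires the expectation of the log-likelihood, which is approximated with the Monte Carlo method.
    ⟹ Low stability.
    [Deterministic variational inference]
    Computing the approximation of the expectation deterministically improves stability.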