Bayesian Deep Learning (6.2)

catla
March 11, 2020

Contents: variational models

Transcript

  1. Bayesian Deep Learning

    Variational Models
    ܡɹঘً


  2. Today's Contents
    ‣ Variational models
    ‣ Normalizing flows
    ‣ Hierarchical variational models
    ‣ Implicit models and likelihood-free variational inference

  4. Variational Models

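  5. Variational Models
    When the posterior is approximated by variational inference, the way the approximate distribution $q$ is designed largely determines how well the algorithm performs.
    Key requirements when designing the approximate distribution:
    ① Expectations under $q$ and sampling from $q$ are easy to carry out.
    ② $q$ is easy to optimize under a criterion such as the KL divergence.
    ③ $q$ is flexible enough to approximate a complex true posterior accurately.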

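  9. Variational Models
    How does the mean-field approximation fare on the three design requirements for the approximate distribution $q$?
    ① Tractable expectations and sampling: satisfied.
    (Reason) It uses well-understood distributions, such as exponential-family distributions, as the approximation.
    ② Ease of optimization: satisfied.
    (Reason) Choosing the approximate distribution in the same form as the prior makes the ELBO computable analytically.
    ③ Flexibility to approximate a complex true posterior: not satisfied.
    (Reason) It assumes simple independence among the individual random variables being approximated.
    Consequently, a VAE that relies on the mean-field approximation is quite limited in how well it can approximate distributions.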
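To make the limitation in point ③ concrete, here is a minimal sketch. It relies on a standard textbook result taken as an assumption here: when a factorized Gaussian is fitted to a correlated Gaussian target by minimizing KL(q‖p), each factor's variance is the reciprocal of the corresponding diagonal entry of the target's precision matrix. The two-dimensional target and the correlation values are illustrative choices, not taken from the slides.

```python
def mean_field_variance(rho):
    """Per-factor variance of the KL(q||p)-optimal factorized Gaussian
    for a 2-D Gaussian target with unit variances and correlation rho."""
    # The precision matrix of [[1, rho], [rho, 1]] has diagonal 1 / (1 - rho^2).
    precision_diag = 1.0 / (1.0 - rho ** 2)
    return 1.0 / precision_diag


for rho in (0.0, 0.5, 0.9):
    # The true marginal variance is 1; the mean-field fit shrinks as rho grows.
    print(rho, mean_field_variance(rho))
```

The stronger the correlation in the target, the more the factorized approximation underestimates the marginal variance, which is one concrete way the independence assumption limits accuracy.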

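  10. Variational Models
    Variational model
    The family of probability distributions used for the approximate distribution $q$ in variational inference.
    Compare: a generative model is a probability distribution that represents the process generating the observed data.

    The mean-field approximation and inference networks (encoders) can also be regarded as kinds of variational model.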

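  13. Normalizing Flows
    [Problem with training a VAE under the mean-field approximation]
    The approximate distribution $q(z_n; x_n, \psi)$ is assumed to be a simple distribution such as a diagonal Gaussian, whereas the true posterior over the latent variables of a complex model (e.g., a deep generative model) is generally complex.
    → Consider approximate distributions with greater expressive power.

    Normalizing flow
    A method that takes a sample $z_0$ from a simple probability distribution and applies a sequence of invertible, differentiable functions $f_1, \dots, f_K$ to obtain a sample $z_K$ from a more complex distribution.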

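  14. Normalizing Flows
    [Transformation by an invertible function]
    Consider an invertible, continuous function $f : \mathbb{R}^D \to \mathbb{R}^D$. Under the transformation $\hat{z} = f(z)$, a probability density $q(z)$ is mapped to the density $q(\hat{z})$ given below (see the section on transformations of probability density functions).
    $$q(\hat{z}) = q(z) \left| \det\left( \frac{\partial f^{-1}}{\partial \hat{z}} \right) \right| = q(z) \left| \det\left( \frac{\partial f}{\partial z} \right) \right|^{-1}$$
    Here $\partial f^{-1} / \partial \hat{z}$ and $\partial f / \partial z$ are Jacobian matrices and $\det(\cdot)$ is the determinant.
    Now apply such a transformation $K$ times, starting from $z_0$:
    $$z_K = f_K \circ \cdots \circ f_1(z_0)$$
    The density of the final random variable $z_K$ is therefore
    $$q_K(z_K) = q_0(z_0) \prod_{k=1}^{K} \left| \det\left( \frac{\partial f_k}{\partial z_{k-1}} \right) \right|^{-1}$$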
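The composed-density formula can be checked numerically in one dimension. The following is a minimal sketch, not the slides' implementation: it assumes $q_0 = \mathcal{N}(0, 1)$ and two hypothetical affine maps whose composition is $z_K = 6 z_0 + 2$, so the tracked log density can be compared with the closed-form $\mathcal{N}(2, 6^2)$ density.

```python
import math


def log_normal_pdf(z, mean=0.0, std=1.0):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((z - mean) / std) ** 2


def flow_log_density(z0, flows):
    """Push z0 through the flows and track ln q_K(z_K).

    flows is a list of (f, df) pairs: an invertible 1-D map and its derivative.
    """
    z = z0
    log_q = log_normal_pdf(z0)            # ln q_0(z_0) with q_0 = N(0, 1)
    for f, df in flows:
        log_q -= math.log(abs(df(z)))     # subtract ln|det(df_k / dz_{k-1})|
        z = f(z)
    return z, log_q


# Two hypothetical affine maps; their composition is z_K = 6 * z_0 + 2.
affine_flows = [
    (lambda z: 2.0 * z + 1.0, lambda z: 2.0),
    (lambda z: 3.0 * z - 1.0, lambda z: 3.0),
]

z_K, log_q_K = flow_log_density(0.3, affine_flows)
# For affine maps the tracked value should match the closed-form N(2, 6^2) log density.
print(z_K, log_q_K, log_normal_pdf(z_K, mean=2.0, std=6.0))
```

Affine maps are used only because the answer is known in closed form; the same bookkeeping applies to any invertible map with a computable derivative.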

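  15. Normalizing Flows
    [Examples of transformations]
    Planar flow: the function $f$ is defined as
    $$f(z) = z + u\,h(w^\top z + b)$$
    Here $h$ is a differentiable nonlinear function and $\lambda = \{w \in \mathbb{R}^D, u \in \mathbb{R}^D, b \in \mathbb{R}\}$ are the parameters that determine the transformation. In variational inference, $\lambda$ plays the role of variational parameters. The Jacobian determinant needed to evaluate the density of the distribution obtained by a planar flow can be computed in $O(D)$:
    $$\left| \det\left( \frac{\partial f}{\partial z} \right) \right| = \left| 1 + u^\top \psi(z) \right|$$
    where $\psi(z) = h'(w^\top z + b)\,w$ is defined through the derivative of $h$.
    Applying $f$ repeatedly to $z_0$ contracts and expands the density along the direction perpendicular to the hyperplane $w^\top z + b = 0$, so the final $z_K$ follows a complex distribution.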
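As a sanity check on the $O(D)$ determinant formula, the sketch below implements a planar flow for plain Python lists and compares the closed-form value with a finite-difference Jacobian in two dimensions. The choice $h = \tanh$ and all parameter values are assumptions made for illustration.

```python
import math


def planar_flow(z, u, w, b):
    """f(z) = z + u * tanh(w^T z + b) for vectors given as Python lists."""
    a = sum(wi * zi for wi, zi in zip(w, z)) + b
    return [zi + ui * math.tanh(a) for zi, ui in zip(z, u)]


def planar_abs_det(z, u, w, b):
    """|1 + u^T psi(z)| with psi(z) = h'(w^T z + b) w and h = tanh; O(D) cost."""
    a = sum(wi * zi for wi, zi in zip(w, z)) + b
    h_prime = 1.0 - math.tanh(a) ** 2
    return abs(1.0 + h_prime * sum(ui * wi for ui, wi in zip(u, w)))


def numeric_abs_det_2d(z, u, w, b, eps=1e-6):
    """Central-difference |det(df/dz)| for D = 2, used only as a check."""
    cols = []
    for j in range(2):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[j] += eps
        z_minus[j] -= eps
        f_plus = planar_flow(z_plus, u, w, b)
        f_minus = planar_flow(z_minus, u, w, b)
        # cols[j][i] approximates d f_i / d z_j
        cols.append([(p - m) / (2.0 * eps) for p, m in zip(f_plus, f_minus)])
    return abs(cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1])


z, u, w, b = [0.5, -1.0], [0.3, 0.8], [1.2, -0.4], 0.1
print(planar_abs_det(z, u, w, b), numeric_abs_det_2d(z, u, w, b))
```

The closed form needs only two inner products, whereas a general determinant costs $O(D^3)$; this is what makes stacking many planar flows affordable.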

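  16. Normalizing Flows
    [Examples of transformations]
    Radial flow: the function $f$ transforms the probability density around a reference point $\hat{z}$:
    $$f(z) = z + \beta\,h(\alpha, r)\,(z - \hat{z})$$
    where $r = |z - \hat{z}|$ and $h(\alpha, r) = \frac{1}{\alpha + r}$. The parameters are $\lambda = \{\hat{z} \in \mathbb{R}^D, \alpha \in \mathbb{R}, \beta \in \mathbb{R}\}$.
    The Jacobian determinant of the radial flow is also easy to compute:
    $$\det\left( \frac{\partial f}{\partial z} \right) = \{1 + \beta h(\alpha, r)\}^{D-1} \{1 + \beta h(\alpha, r) + \beta h'(\alpha, r)\,r\}$$
    (Open question from the presenter: how is this Jacobian determinant derived?)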
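One way to obtain the radial-flow determinant stated on this slide is via the matrix determinant lemma. The derivation below is a sketch under the slide's definitions; the shorthand $v = z - \hat{z}$, $h = h(\alpha, r)$ and $h' = \partial h / \partial r$ is introduced here for brevity.

```latex
\begin{aligned}
\frac{\partial r}{\partial z} &= \frac{v}{r}, \qquad v = z - \hat{z}, \quad r = |v|, \\
\frac{\partial f}{\partial z}
  &= (1 + \beta h)\, I + \beta h'\, v \left(\frac{\partial r}{\partial z}\right)^{\top}
   = (1 + \beta h)\, I + \frac{\beta h'}{r}\, v v^{\top}, \\
\det\!\left(a I + c\, v v^{\top}\right) &= a^{D-1}\left(a + c\, v^{\top} v\right)
  \quad \text{(matrix determinant lemma)}, \\
\det\!\left(\frac{\partial f}{\partial z}\right)
  &= (1 + \beta h)^{D-1}\left(1 + \beta h + \frac{\beta h'}{r}\, r^{2}\right)
   = (1 + \beta h)^{D-1}\left(1 + \beta h + \beta h' r\right).
\end{aligned}
```

The Jacobian is the identity scaled by $1 + \beta h$ plus a rank-one term, so only the single direction $v$ contributes a different eigenvalue, which is why the cost stays linear in $D$.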

  17. Normalizing Flows
    [Examples of transformations]
    Source: "Variational Inference with Normalizing Flows", Danilo J. Rezende and Shakir Mohamed, ICML, 2015

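  19. Normalizing Flows
    [Application to variational inference]
    Variational inference based on the mean-field approximation
    → Because of its simple assumptions, it approximates complex distributions poorly.
    Variational inference combined with normalizing flows
    → The posterior can be approximated far more accurately than with mean-field inference.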

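  20. Normalizing Flows
    [Application to variational inference]
    Let $Z$ denote the set of latent variables and consider the generative model
    $$p(X, Z) = \prod_{n=1}^{N} p(x_n \mid z_n)\,p(z_n)$$
    When a normalizing flow is applied, the variational energy for a data point $x$ becomes
    $$\begin{aligned}
    \mathcal{F}[q] &= \mathbb{E}_{q(z)}\left[ \ln q(z) - \ln p(x, z) \right] \\
    &= \mathbb{E}_{q_0(z_0)}\left[ \ln q_K(z_K) - \ln p(x, z_K) \right] \\
    &= \mathbb{E}_{q_0(z_0)}\left[ \ln q_0(z_0) \right] - \mathbb{E}_{q_0(z_0)}\left[ \ln p(x, z_K) \right] - \mathbb{E}_{q_0(z_0)}\left[ \sum_{k=1}^{K} \ln \left| \det\left( \frac{\partial f_k}{\partial z_{k-1}} \right) \right| \right]
    \end{aligned}$$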
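The last line of the variational energy suggests a simple Monte Carlo estimator: sample $z_0$, push it through the flow, and average the three terms. The sketch below assumes a single hypothetical affine flow $z_K = \text{scale} \cdot z_0 + \text{shift}$ and a toy target in which $x$ is held fixed and $p(x, z)$ is a normalized Gaussian in $z$, so the minimum of the energy is exactly zero.

```python
import math
import random


def log_normal_pdf(z, mean, std):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((z - mean) / std) ** 2


def free_energy_estimate(log_joint, scale, shift, num_samples=1000, seed=0):
    """Monte Carlo estimate of F[q] for the affine flow z_K = scale * z_0 + shift."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        z0 = rng.gauss(0.0, 1.0)                 # z_0 ~ q_0 = N(0, 1)
        zK = scale * z0 + shift
        log_q0 = log_normal_pdf(z0, 0.0, 1.0)
        log_det = math.log(abs(scale))           # ln|det(df / dz_0)| of the affine map
        total += log_q0 - log_joint(zK) - log_det
    return total / num_samples


def toy_log_joint(z):
    """Toy target: ln p(x, z) with x fixed, chosen as the N(2, 6^2) density in z."""
    return log_normal_pdf(z, 2.0, 6.0)


print(free_energy_estimate(toy_log_joint, 6.0, 2.0))   # flow matches the target
print(free_energy_estimate(toy_log_joint, 1.0, 0.0))   # identity flow, poorer fit
```

With the matching flow every sample contributes zero, while the identity flow leaves a positive energy equal (up to sampling noise) to the KL divergence between $\mathcal{N}(0, 1)$ and the target.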

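  21. Normalizing Flows
    [Application to variational inference]
    When this is applied to a VAE, the initial distribution is set to
    $$q_0(z_0) = \mathcal{N}\left( z_0 \mid m(x; \psi), \mathrm{diagm}(v(x; \psi)) \right)$$
    The outputs of a neural network can also be used for the normalizing-flow parameters $\lambda$. The encoder efficiently learns an approximate distribution over all the latent variables, and the normalizing flow then transforms it into a complex distribution, which makes a highly accurate approximation possible.
    [Figure: inference network with a normalizing flow]
    Source: "Variational Inference with Normalizing Flows", Danilo J. Rezende and Shakir Mohamed, ICML, 2015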

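  22. Normalizing Flows
    [Erratum] Concerning equation ( )
    The book presents equation ( ) as the computation of the ELBO, but it is actually the variational energy.
    $$\mathcal{L}[q] \to \mathcal{F}[q] = -\mathcal{L}[q]$$
    (Excerpt from the original paper)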

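  23. Normalizing Flows
    [Application to variational inference] Derivation of the variational energy
    Taking the logarithm of the flow density gives
    $$q_K(z_K) = q_0(z_0) \prod_{k=1}^{K} \left| \det\left( \frac{\partial f_k}{\partial z_{k-1}} \right) \right|^{-1}$$
    $$\ln q_K(z_K) = \ln q_0(z_0) + \sum_{k=1}^{K} \ln \left| \det\left( \frac{\partial f_k}{\partial z_{k-1}} \right) \right|^{-1} = \ln q_0(z_0) - \sum_{k=1}^{K} \ln \left| \det\left( \frac{\partial f_k}{\partial z_{k-1}} \right) \right|$$
    Substituting this into the variational energy:
    $$\begin{aligned}
    \mathcal{F}[q] = -\mathcal{L}[q] &= -\int q(z) \ln \frac{p(x, z)}{q(z)}\,dz \\
    &= \int q(z) \ln \frac{q(z)}{p(x, z)}\,dz = \mathbb{E}_{q(z)}\left[ \ln q(z) - \ln p(x, z) \right] \\
    &= \int q_0(z_0) \ln \frac{q_K(z_K)}{p(x, z_K)}\,dz_0 = \mathbb{E}_{q_0(z_0)}\left[ \ln q_K(z_K) - \ln p(x, z_K) \right] \quad (\because \text{normalizing flow}) \\
    &= \mathbb{E}_{q_0(z_0)}\left[ \ln q_0(z_0) - \sum_{k=1}^{K} \ln \left| \det\left( \frac{\partial f_k}{\partial z_{k-1}} \right) \right| - \ln p(x, z_K) \right] \\
    &= \mathbb{E}_{q_0(z_0)}\left[ \ln q_0(z_0) \right] - \mathbb{E}_{q_0(z_0)}\left[ \ln p(x, z_K) \right] - \mathbb{E}_{q_0(z_0)}\left[ \sum_{k=1}^{K} \ln \left| \det\left( \frac{\partial f_k}{\partial z_{k-1}} \right) \right| \right]
    \end{aligned}$$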

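  24. Normalizing Flows
    [Stein variational gradient descent]
    Stein variational gradient descent
    A variational inference method that uses a sequence of changes of variables. It minimizes the KL divergence to the true posterior by gradient descent based on functional derivatives in a reproducing kernel Hilbert space.
    The approximate posterior is represented by a finite set of samples drawn from an initial distribution, and optimization transforms them into samples from the true posterior.
    [Advantage] No determinants or matrix inverses need to be computed.
    (Presenter's note: I lack the background in reproducing kernel Hilbert spaces...)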
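The particle view of Stein variational gradient descent can be sketched in a few lines. This is an illustrative one-dimensional version, not the slides' or the original authors' code: it assumes an RBF kernel with a fixed bandwidth, a toy target $\mathcal{N}(3, 1)$, and hand-picked step size and particle count.

```python
import math
import random


def svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """One SVGD update for 1-D particles with an RBF kernel of fixed bandwidth."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            # Attractive term: kernel-weighted gradient of the log target.
            phi += k * grad_log_p(xj)
            # Repulsive term: gradient of the kernel with respect to xj.
            phi += -(xj - xi) / bandwidth ** 2 * k
        updated.append(xi + step_size * phi / n)
    return updated


def run_svgd(num_particles=30, num_steps=300, seed=0):
    """Move particles drawn from N(0, 1) toward the toy target N(3, 1)."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(num_particles)]
    for _ in range(num_steps):
        particles = svgd_step(particles, lambda x: -(x - 3.0))
    return particles


final_particles = run_svgd()
print(sum(final_particles) / len(final_particles))
```

Only the gradient of the log target and kernel evaluations appear in the update, which is consistent with the stated advantage that no determinants or matrix inverses are required.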

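  26. Hierarchical Variational Models
    [Models of the approximate distribution]
    Let $q_{\mathrm{MF}}$ denote the mean-field approximate distribution over the latent variables $Z = \{z_1, \dots, z_M\}$:
    $$q_{\mathrm{MF}}(Z; \lambda) = \prod_{m=1}^{M} q(z_m; \lambda_m)$$
    where $\lambda$ is the set of variational parameters.
    The mean-field approximation assumes that the $M$ latent variables are independent.
    How does a hierarchical variational model differ?
    Hierarchical variational model
    Also known as the auxiliary latent variable method. A kind of variational model that makes the approximate distribution hierarchical so that more complex approximate distributions can be represented.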

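  27. Hierarchical Variational Models
    [Models of the approximate distribution]
    The approximate distribution $q_{\mathrm{HVM}}$ of a hierarchical variational model is made hierarchical by taking the following form:
    $$q_{\mathrm{HVM}}(Z; \xi) = \int q(\lambda; \xi) \prod_{m=1}^{M} q(z_m \mid \lambda_m)\,d\lambda$$
    $q(\lambda; \xi)$ is called the variational prior and $q(z_m \mid \lambda_m)$ the variational likelihood. Marginalizing over $\lambda$ makes the approximate distribution a kind of mixture distribution.
    Assuming a non-independent distribution for generating the variational parameters makes it possible to capture correlations between latent variables. A hierarchical variational model can be trained within the variational inference framework by maximizing the ELBO with respect to the variational parameters $\xi$.
    Reference: https://arxiv.org/abs/1511.02386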
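The claim that a non-independent variational prior induces correlation between latent variables can be illustrated by ancestral sampling. The sketch below uses a deliberately simple, hypothetical variational prior (one shared scalar sets $\lambda_1 = \lambda_2$) and Gaussian variational likelihoods; none of these choices come from the slides.

```python
import random


def sample_hvm(num_samples=20000, noise_std=0.5, seed=0):
    """Draw (z_1, z_2) from a toy hierarchical variational model.

    Variational prior: a shared scalar l ~ N(0, 1) sets lambda_1 = lambda_2 = l.
    Variational likelihood: z_m | lambda_m ~ N(lambda_m, noise_std^2).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        lam = rng.gauss(0.0, 1.0)
        z1 = rng.gauss(lam, noise_std)
        z2 = rng.gauss(lam, noise_std)
        samples.append((z1, z2))
    return samples


def correlation(samples):
    """Pearson correlation between the two coordinates."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    cov = sum((s[0] - m1) * (s[1] - m2) for s in samples) / n
    v1 = sum((s[0] - m1) ** 2 for s in samples) / n
    v2 = sum((s[1] - m2) ** 2 for s in samples) / n
    return cov / (v1 * v2) ** 0.5


# Analytically the correlation is 1 / (1 + noise_std^2) = 0.8 for noise_std = 0.5.
print(correlation(sample_hvm()))
```

Each conditional $q(z_m \mid \lambda_m)$ is still a simple factorized distribution; the correlation appears only after $\lambda$ is marginalized out, which is the mixture effect described on the slide.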

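  28. Hierarchical Variational Models
    [Models of the approximate distribution]
    Hierarchical variational model
    Also known as the auxiliary latent variable method. A kind of variational model that makes the approximate distribution hierarchical so that more complex approximate distributions can be represented.
    [Figure: graphical models compared side by side. Mean-field approximation: $\lambda_m$ and $z_m$ inside a plate over $m = 1, \dots, M$. Hierarchical variational model: the same plate with an additional node $\xi$.]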

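  29. Hierarchical Variational Models
    [Examples of variational priors]
    Let $K$ be the number of mixture components, $\pi$ the parameter of a $K$-dimensional categorical distribution, and $\xi = \{\mu_k, \Sigma_k\}_{k=1}^{K}$ the set of parameters of $M$-dimensional Gaussians. A mixture-model variational prior can then be defined as
    $$q(\lambda; \xi) = \sum_{k=1}^{K} \pi_k\,\mathcal{N}(\lambda \mid \mu_k, \Sigma_k)$$
    As a result, detailed correlations between latent variables can be captured.
    A normalizing flow can also be applied to the variational prior:
    $$q(\lambda; \xi) = q(\lambda_0) \prod_{k=1}^{K} \left| \det\left( \frac{\partial f_k}{\partial \lambda_{k-1}} \right) \right|^{-1}$$
    For reference, the approximate distribution of a hierarchical variational model is
    $$q_{\mathrm{HVM}} = \int \underbrace{q(\lambda; \xi)}_{\text{variational prior}} \prod_{m=1}^{M} \underbrace{q(z_m \mid \lambda_m)}_{\text{variational likelihood}}\,d\lambda$$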

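  30. Hierarchical Variational Models
    [Examples of variational priors]
    A Gaussian process can also be used as the variational model. → Variational Gaussian process
    $$q_{\mathrm{VGP}}(Z; \theta, V) = \iint \left[ \prod_{m=1}^{M} q(z_m \mid F_m(\xi))\,\mathcal{GP}(F_m; O, K_{\xi,\xi}) \right] \mathcal{N}(\xi; 0, I)\,dF\,d\xi$$
    $V$ is the variational data: pseudo input-output data for the variational Gaussian process. The variational data and the covariance-function parameters $\theta$ are the variational parameters of the variational Gaussian process and are optimized with respect to the ELBO. A function $F_m$ drawn from the Gaussian process maps the latent input $\xi$ to $F_m(\xi)$, and the variational likelihood $q(z_m \mid F_m(\xi))$ determines the distribution of the latent variable $z_m$ to be inferred.
    These variational models can be used for ELBO maximization with, for example, black-box variational inference, a method that approximates the gradient of the ELBO using score-function estimation.
    Reference: https://arxiv.org/abs/1511.06499
    (Presenter's note: probably worth rereading after finishing chapter ( ).)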

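  32. Implicit Models and Likelihood-Free Variational Inference
    The probabilistic generative models considered so far were built by combining probability distributions whose densities can be computed.
    Data generation has also been studied for the case where the density cannot be computed.
    Implicit model
    A model whose density cannot be computed but from which data can still be generated.
    The treatment of implicit models has long been studied under the name approximate Bayesian computation.
    Likelihood-free variational inference
    An inference algorithm designed for settings where the generative model or the approximate distribution is constructed as an implicit model.
    The way likelihood-free variational inference trains a generative model is also used in generative adversarial networks (GANs).

    From here on, "implicit" means that the density cannot be computed.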

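  33. Implicit Models and Likelihood-Free Variational Inference
    [Implicit models]
    Consider the following hierarchical generative model of the observed data $X$:
    $$p(X, Z, \theta) = p(\theta) \prod_{n=1}^{N} p(x_n \mid z_n, \theta)\,p(z_n \mid \theta)$$
    $Z$: latent variables
    $\theta$: the set of parameters shared across all data points
    $p(x_n \mid z_n, \theta)$ is defined to be an implicit distribution, that is,
    $$\epsilon_n \sim p(\epsilon), \qquad x_n = g(\epsilon_n \mid z_n, \theta)$$
    If the data point $x_n$ is generated by a function $g$ and noise $\epsilon_n$ as above, the likelihood can be written as
    $$p(x_n \in A \mid z_n, \theta) = \int_{x_n \in A} p(\epsilon_n)\,d\epsilon_n$$
    This integral is assumed to be analytically intractable, so the likelihood cannot be evaluated efficiently. The prior $p(\theta)$ over the parameters is assumed to be easy both to sample from and to evaluate.
    Reference: https://arxiv.org/abs/1702.08896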
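The distinction between "easy to sample" and "hard to evaluate" can be seen in a small simulator. In the sketch below the function `g` is a made-up nonlinear map chosen only for illustration; the probability of a set is estimated by counting samples, since no density of $x_n$ is available in closed form.

```python
import math
import random


def g(eps, z, theta):
    """Hypothetical deterministic simulator x = g(eps | z, theta)."""
    return theta * math.tanh(z + eps) + 0.1 * eps ** 3


def sample_x(z, theta, rng):
    """Draw one observation: eps_n ~ p(eps) = N(0, 1), then x_n = g(eps_n | z_n, theta)."""
    eps = rng.gauss(0.0, 1.0)
    return g(eps, z, theta)


def estimate_prob_in_set(z, theta, lower, upper, num_samples=20000, seed=0):
    """Monte Carlo estimate of p(x in [lower, upper] | z, theta)."""
    rng = random.Random(seed)
    hits = sum(lower <= sample_x(z, theta, rng) <= upper for _ in range(num_samples))
    return hits / num_samples


print(estimate_prob_in_set(z=0.5, theta=2.0, lower=0.0, upper=1.0))
```

Sampling costs one noise draw and one function call, whereas evaluating the likelihood of a single point would require integrating the noise density over the preimage of that point under `g`.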

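  34. Implicit Models and Likelihood-Free Variational Inference
    [Implicit models]
    [Figure: graphical models compared side by side. Hierarchical model: $\theta$, with $z_n$ and $x_n$ inside a plate over $n = 1, \dots, N$. Implicit hierarchical model: the same with an additional noise node $\epsilon_n$. Squares denote deterministic functions.]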

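  35. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    The posterior of an implicit model is
    $$p(Z, \theta \mid X) = \frac{p(X, Z, \theta)}{p(X)}$$
    but it cannot be computed analytically.
    → Approximate the posterior with variational inference.
    Because the posterior of an implicit model is generally complex, the assumed approximate distribution should also be highly expressive.
    → Weaken the constraints placed on the approximate distribution and choose it from a broader class.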

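  36. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    An implicit distribution with variational parameters $\psi$ is also assumed for the approximate distribution of the latent variables:
    $$z_n \sim q_\psi(z_n \mid x_n, \theta)$$
    The latent variable $z_n$ is easy to sample, while the value of the variational likelihood $q_\psi(z_n \mid x_n, \theta)$ itself need not be computable.
    Goal of likelihood-free variational inference
    To carry out variational inference using only the ability to draw samples $z_n$, without an explicit density function.
    [Figure: graphical model with $\theta$, and $z_n$, $x_n$, $\epsilon_n$ inside a plate over $n = 1, \dots, N$, plus the variational parameter $\psi$.]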

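  37. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    Using the implicit variational likelihood and an approximate distribution $q_\xi(\theta)$ over $\theta$ with variational parameters $\xi$, the full approximate posterior is defined as
    $$q_{\psi,\xi}(Z, \theta \mid X) = q_\xi(\theta) \prod_{n=1}^{N} q_\psi(z_n \mid x_n, \theta)$$
    $q_\xi(\theta)$ is chosen to be a probability density (e.g., a Gaussian) from which $\theta$ is easy to sample and whose density is easy to evaluate.
    [Figure: graphical model with $\theta$, and $z_n$, $x_n$, $\epsilon_n$ inside a plate over $n = 1, \dots, N$, plus the variational parameters $\psi$ and $\xi$.]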

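  38. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    $$q_{\psi,\xi}(Z, \theta \mid X) = q_\xi(\theta) \prod_{n=1}^{N} q_\psi(z_n \mid x_n, \theta)$$
    With the approximate posterior above, the ELBO on the log marginal likelihood can be written as
    $$\begin{aligned}
    \mathcal{L}(\psi, \xi) &= \mathbb{E}_{q_{\psi,\xi}(Z, \theta \mid X)}\left[ \ln p(X, Z, \theta) - \ln q_{\psi,\xi}(Z, \theta \mid X) \right] \\
    &= \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ \ln p(x_n, z_n \mid \theta) - \ln q_\psi(z_n \mid x_n, \theta) \right]
    \end{aligned}$$
    Both $p(x_n, z_n \mid \theta)$ and $q_\psi(z_n \mid x_n, \theta)$ are implicit distributions.
    → The ELBO cannot be maximized directly with methods such as gradient descent.
    [Solution]
    Use the empirical distribution $q(x_n)$ of the data.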

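  39. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    How is $q(x_n)$ used? ⟹ Subtract $\ln q(x_n)$ inside the expectation in $\mathcal{L}(\psi, \xi)$ and add it back as a constant $c$:
    $$\mathcal{L}(\psi, \xi) = \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ \ln \frac{p(x_n, z_n \mid \theta)}{q_\psi(x_n, z_n \mid \theta)} \right] + c$$
    where $q_\psi(x_n, z_n \mid \theta) = q_\psi(z_n \mid x_n, \theta)\,q(x_n)$ and $c = \sum_{n=1}^{N} \ln q(x_n)$ does not depend on the variational parameters.
    Instead of handling the implicit distributions directly, likelihood-free variational inference computes the lower bound by directly estimating the log density ratio:
    $$r_{\mathrm{opt}}(x_n, z_n, \theta \mid \eta) = \ln \frac{p(x_n, z_n \mid \theta)}{q_\psi(x_n, z_n \mid \theta)}$$
    For the density-ratio estimator $r(x_n, z_n, \theta \mid \eta)$, a differentiable regression model with parameters $\eta$, such as a neural network, is chosen.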

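  40. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    One way to train the density-ratio estimator $r$ is to use a loss function based on a proper scoring rule:
    $$J(\eta) = \mathbb{E}_{p(x_n, z_n \mid \theta)}\left[ -\ln \mathrm{Sig}(r(x_n, z_n, \theta \mid \eta)) \right] + \mathbb{E}_{q_\psi(x_n, z_n \mid \theta)}\left[ -\ln\{1 - \mathrm{Sig}(r(x_n, z_n, \theta \mid \eta))\} \right]$$
    where $\mathrm{Sig}$ is the sigmoid function. $J(\eta) = 0$ is attained when $\mathrm{Sig}(r(x_n, z_n, \theta \mid \eta))$ returns 1 for samples from $p$ and 0 for samples from $q$.
    In other words, the density-ratio estimator $r(x_n, z_n, \theta \mid \eta)$ can be trained using only samples of $x_n$ and $z_n$, by obtaining unbiased estimates of the gradient of $J(\eta)$ with respect to $\eta$.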
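The loss $J(\eta)$ is a logistic classification loss between samples from the two distributions, and its minimizer is the log density ratio. The sketch below demonstrates this on a toy one-dimensional problem that is not from the slides: samples from $p = \mathcal{N}(1, 1)$ and $q = \mathcal{N}(0, 1)$, and a hypothetical linear estimator $r(x \mid \eta) = \eta_0 + \eta_1 x$, for which the true log ratio is $x - 0.5$.

```python
import math
import random


def sigmoid(r):
    return 1.0 / (1.0 + math.exp(-r))


def fit_log_ratio(p_samples, q_samples, num_steps=300, lr=0.5):
    """Fit r(x | eta) = eta0 + eta1 * x by gradient descent on J(eta)."""
    eta0, eta1 = 0.0, 0.0
    for _ in range(num_steps):
        g0 = g1 = 0.0
        for x in p_samples:            # gradient of E_p[-ln Sig(r)]
            d = -(1.0 - sigmoid(eta0 + eta1 * x)) / len(p_samples)
            g0 += d
            g1 += d * x
        for x in q_samples:            # gradient of E_q[-ln(1 - Sig(r))]
            d = sigmoid(eta0 + eta1 * x) / len(q_samples)
            g0 += d
            g1 += d * x
        eta0 -= lr * g0
        eta1 -= lr * g1
    return eta0, eta1


def demo(num_samples=1000, seed=0):
    """Estimate ln p(x)/q(x) for p = N(1, 1), q = N(0, 1); the truth is x - 0.5."""
    rng = random.Random(seed)
    p_samples = [rng.gauss(1.0, 1.0) for _ in range(num_samples)]
    q_samples = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    return fit_log_ratio(p_samples, q_samples)


print(demo())
```

Note that the fit never evaluates either density; it only needs samples, which is exactly the property that makes the approach usable with implicit distributions.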

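  41. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    The objective to maximize therefore becomes
    $$\mathcal{L}_r(\psi, \xi) = \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ r(x_n, z_n, \theta \mid \eta) \right]$$
    Using reparameterization gradients, $z_n$ and $\theta$ are sampled to obtain approximate gradients with respect to the variational parameters $\psi$ and $\xi$.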
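The reparameterization gradient can be illustrated on a single Gaussian variable. In this sketch the integrand $\theta^2$ and the parameter values are toy assumptions chosen so that the exact answer is known: for $\theta \sim \mathcal{N}(m, s^2)$, $\mathbb{E}[\theta^2] = m^2 + s^2$, so the gradients are $2m$ and $2s$.

```python
import random


def reparam_gradient(m, s, num_samples=20000, seed=0):
    """Estimate d/dm and d/ds of E_{theta ~ N(m, s^2)}[theta^2] via theta = m + s * eps."""
    rng = random.Random(seed)
    grad_m = grad_s = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)        # noise that does not depend on (m, s)
        theta = m + s * eps              # reparameterized sample
        df_dtheta = 2.0 * theta          # derivative of the toy integrand theta^2
        grad_m += df_dtheta * 1.0        # d theta / d m = 1
        grad_s += df_dtheta * eps        # d theta / d s = eps
    return grad_m / num_samples, grad_s / num_samples


# Exact gradients for m = 1.5, s = 0.5 are 2m = 3 and 2s = 1.
print(reparam_gradient(1.5, 0.5))
```

In likelihood-free variational inference the same idea is applied with the integrand replaced by the differentiable estimator $r$, so gradients flow through the sampled $z_n$ and $\theta$ to $\psi$ and $\xi$.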

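  42. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference]
    Summary of likelihood-free variational inference
    • Input: implicit model $p(x_n, z_n \mid \theta)$, prior $p(\theta)$, implicit variational likelihood $q_\psi(z_n \mid x_n, \theta)$, variational prior $q_\xi(\theta)$, density-ratio estimator $r(x_n, z_n, \theta \mid \eta)$
    • Output: variational parameters $\psi$, $\xi$
    • Initialize the parameters $\psi$, $\xi$, $\eta$
    • Repeat the following until convergence:
        Compute unbiased estimates of the gradients $\nabla_\eta J(\eta)$, $\nabla_\psi \mathcal{L}(\psi, \xi)$, $\nabla_\xi \mathcal{L}(\psi, \xi)$.
        Update $\eta$, $\psi$, $\xi$.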

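  43. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference] Derivation of the objective
    The log joint and the log approximate posterior factorize as
    $$\ln p(X, Z, \theta) = \ln p(X, Z \mid \theta)\,p(\theta) = \ln p(X, Z \mid \theta) + \ln p(\theta) = \sum_{n=1}^{N} \ln p(x_n, z_n \mid \theta) + \ln p(\theta),$$
    $$\ln q_{\psi,\xi}(Z, \theta \mid X) = \ln q_\xi(\theta) + \sum_{n=1}^{N} \ln q_\psi(z_n \mid x_n, \theta),$$
    so the ELBO becomes
    $$\begin{aligned}
    \mathcal{L}(\psi, \xi) &= \iint q_{\psi,\xi}(Z, \theta \mid X) \ln \frac{p(X, Z, \theta)}{q_{\psi,\xi}(Z, \theta \mid X)}\,dZ\,d\theta \\
    &= \mathbb{E}_{q_{\psi,\xi}(Z, \theta \mid X)}\left[ \ln p(X, Z, \theta) - \ln q_{\psi,\xi}(Z, \theta \mid X) \right] \\
    &= \mathbb{E}_{q_\xi(\theta) q_\psi(Z \mid X, \theta)}\left[ \ln p(X, Z \mid \theta) + \ln p(\theta) - \ln q_\xi(\theta) - \ln q_\psi(Z \mid X, \theta) \right] \\
    &= \mathbb{E}_{q_\xi(\theta) q_\psi(Z \mid X, \theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \mathbb{E}_{q_\xi(\theta) q_\psi(Z \mid X, \theta)}\left[ \ln p(X, Z \mid \theta) - \ln q_\psi(Z \mid X, \theta) \right] \\
    &= \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ \ln p(x_n, z_n \mid \theta) - \ln q_\psi(z_n \mid x_n, \theta) \right]
    \end{aligned}$$
    Subtracting and adding $\ln q(x_n)$, the log of the empirical data distribution, inside the expectation gives
    $$\begin{aligned}
    \mathcal{L}(\psi, \xi) &= \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ \ln p(x_n, z_n \mid \theta) - \ln q_\psi(z_n \mid x_n, \theta) - \ln q(x_n) + \ln q(x_n) \right] \\
    &= \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ \ln \frac{p(x_n, z_n \mid \theta)}{q_\psi(z_n \mid x_n, \theta)\,q(x_n)} \right] + c \\
    &= \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ \ln \frac{p(x_n, z_n \mid \theta)}{q_\psi(x_n, z_n \mid \theta)} \right] + c
    \end{aligned}$$
    where
    $$c = \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ \ln q(x_n) \right] = \sum_{n=1}^{N} \ln q(x_n)$$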

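  44. Implicit Models and Likelihood-Free Variational Inference
    [Likelihood-free variational inference] Derivation of the objective
    Setting
    $$r(x_n, z_n, \theta \mid \eta) = \ln \frac{p(x_n, z_n \mid \theta)}{q_\psi(x_n, z_n \mid \theta)}$$
    and defining
    $$\mathcal{L}_r(\psi, \xi) = \mathbb{E}_{q_\xi(\theta)}\left[ \ln p(\theta) - \ln q_\xi(\theta) \right] + \sum_{n=1}^{N} \mathbb{E}_{q_\xi(\theta) q_\psi(z_n \mid x_n, \theta)}\left[ r(x_n, z_n, \theta \mid \eta) \right]$$
    we obtain
    $$\nabla_\psi \mathcal{L}(\psi, \xi) = \nabla_\psi \mathcal{L}_r(\psi, \xi), \qquad \nabla_\xi \mathcal{L}(\psi, \xi) = \nabla_\xi \mathcal{L}_r(\psi, \xi)$$
    because the two objectives differ only by the constant $c$. Maximizing $\mathcal{L}(\psi, \xi)$ is therefore equivalent to maximizing $\mathcal{L}_r(\psi, \xi)$.
    Hence the objective to maximize is $\mathcal{L}_r(\psi, \xi)$.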