Bayesian Statistical Modeling 10 // Doing Bayesian Data Analysis Chapter 10

todesking

August 24, 2018

Transcript

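  1. Abstract representation of a model
    • Even when a model is built from several variables, it can be abstracted to θ, P(θ), and P(D|θ) (this is my own interpretation).
    • Example with parameters φ1, φ2, φ3 and observed variables X, Y:
      $\theta = (\phi_1, \phi_2, \phi_3)$
      $D = (X, Y)$
      $P(\theta) = P(\phi_1, \phi_2, \phi_3)$
      $P(D \mid \theta) = P(X, Y \mid \phi_1, \phi_2, \phi_3)$
    [Diagram: nodes θ and D, next to a graph over the nodes φ1, φ2, φ3, X, Y]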
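  2. A model for model comparison
    • Let M be the number of models and Θ = {θ_1, ..., θ_M} the set of parameters.
    • The joint posterior of this model's parameters is then:
      $P(\Theta, m \mid D) = \frac{P(D \mid \Theta, m)\,P(\Theta, m)}{\sum_m \int d\theta_m\, P(D \mid \Theta, m)\,P(\Theta, m)}$
      $= \frac{\prod_m P_m(D \mid \theta_m, m)\,P_m(\theta_m)\,P(m)}{\sum_m \int d\theta_m \prod_m P_m(D \mid \theta_m, m)\,P_m(\theta_m)\,P(m)}$
    • The important point seems to be that $P(D \mid \Theta, m)\,P(\Theta, m)$ becomes $\prod_m P_m(D \mid \theta_m)\,P_m(\theta_m \mid m)\,P(m)$.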
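  3. Posterior odds and the Bayes factor
    • Taking the ratio of P(m|D) between models tells us which model is more plausible. This ratio is the posterior odds:
      $\frac{P(m=1 \mid D)}{P(m=2 \mid D)} = \frac{P(D \mid m=1)}{P(D \mid m=2)} \cdot \frac{P(m=1)}{P(m=2)}$
    • The ratio of the marginal likelihoods P(D|m) is called the Bayes factor (BF).
    • Posterior odds = BF × prior odds.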
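  4. Finding the ratio of P(m|D): analytic solution
    • In summary, P(D|m) is:
      $P(D \mid m) = P(z, N \mid m) = \frac{B(z + a_m,\; N - z + b_m)}{B(a_m, b_m)}$
      $a_m = \omega_m(\kappa - 2) + 1$
      $b_m = (1 - \omega_m)(\kappa - 2) + 1$
      $\omega_1 = 0.25, \quad \omega_2 = 0.75$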
  5. Finding the ratio of P(m|D): analytic solution
    • P(D|m=1) ≈ 0.000499
    • P(D|m=2) ≈ 0.002339
    • BF = P(D|m=1)/P(D|m=2) ≈ 0.213
    • When P(m=1) = P(m=2) = 0.5:
      $\frac{P(m=1 \mid D)}{P(m=2 \mid D)} = \frac{P(D \mid m=1)}{P(D \mid m=2)} = 0.213$
      Since $P(m=2 \mid D) = 1 - P(m=1 \mid D)$:
      $\frac{P(m=1 \mid D)}{1 - P(m=1 \mid D)} = 0.213$
      $P(m=1 \mid D) = 0.176, \quad P(m=2 \mid D) = 0.824$
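The numbers on this slide can be reproduced with a few lines of Python. This is only a sketch: the slides do not state the data or κ, so z = 6 heads in N = 9 flips and κ = 12 are assumptions (they are the values that reproduce the figures above). The function and variable names are made up for illustration.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def marginal_likelihood(z, n, omega, kappa):
    """P(D|m) for a Bernoulli likelihood with a Beta prior given by mode omega and concentration kappa."""
    a = omega * (kappa - 2) + 1
    b = (1 - omega) * (kappa - 2) + 1
    return exp(log_beta(z + a, n - z + b) - log_beta(a, b))


# Assumed data and concentration (not stated in the slides).
Z, N, KAPPA = 6, 9, 12

p_d_m1 = marginal_likelihood(Z, N, omega=0.25, kappa=KAPPA)
p_d_m2 = marginal_likelihood(Z, N, omega=0.75, kappa=KAPPA)
bayes_factor = p_d_m1 / p_d_m2

# With equal prior probabilities, posterior odds equal the Bayes factor.
post_m1 = bayes_factor / (1 + bayes_factor)
post_m2 = 1 - post_m1

print(p_d_m1, p_d_m2, bayes_factor, post_m1, post_m2)
```

Working in log space with `lgamma` avoids overflow in the Beta functions when the counts get large.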
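  6. Computing the marginal likelihood of each model
      $P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}$
      $\frac{1}{P(D)} = \frac{P(\theta \mid D)}{P(D \mid \theta)\,P(\theta)}$
    Introduce an arbitrary probability distribution h(θ), which integrates to 1:
      $= \frac{P(\theta \mid D)}{P(D \mid \theta)\,P(\theta)} \int d\theta'\, h(\theta')$
      $= \int d\theta'\, \frac{P(\theta \mid D)}{P(D \mid \theta)\,P(\theta)}\, h(\theta')$
    Since the value of $\frac{P(\theta \mid D)}{P(D \mid \theta)\,P(\theta)}$ is the same for any θ:
      $= \int d\theta'\, \frac{P(\theta' \mid D)}{P(D \mid \theta')\,P(\theta')}\, h(\theta')$
      $\approx \frac{1}{N} \sum_{\theta_i \sim P(\theta \mid D)} \frac{h(\theta_i)}{P(D \mid \theta_i)\,P(\theta_i)}$
    Here N is the number of posterior samples θ_i.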
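A minimal sketch of this estimator for a Beta-Bernoulli model follows. Several things here are assumptions, not part of the slides: posterior samples are drawn directly from the known Beta posterior as a stand-in for an MCMC run, h(θ) is a Beta distribution moment-matched to those samples, and the data (z = 6, N = 9) and prior Beta(3.5, 8.5) are chosen to match the earlier analytic example.

```python
import random
from math import exp, lgamma, log


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_beta_pdf(x, a, b):
    return (a - 1) * log(x) + (b - 1) * log(1 - x) - log_beta(a, b)


def estimate_marginal_likelihood(z, n, a, b, n_samples=20000, seed=0):
    """Estimate P(D) from posterior samples via 1/P(D) ≈ mean of h(θ) / (P(D|θ) P(θ))."""
    rng = random.Random(seed)
    # Stand-in for MCMC output: exact draws from the Beta posterior.
    samples = [rng.betavariate(a + z, b + n - z) for _ in range(n_samples)]

    # Choose h as a Beta distribution matching the sample mean and variance.
    mean = sum(samples) / n_samples
    var = sum((t - mean) ** 2 for t in samples) / n_samples
    common = mean * (1 - mean) / var - 1
    h_a, h_b = mean * common, (1 - mean) * common

    total = 0.0
    for t in samples:
        log_lik = z * log(t) + (n - z) * log(1 - t)
        log_prior = log_beta_pdf(t, a, b)
        total += exp(log_beta_pdf(t, h_a, h_b) - log_lik - log_prior)
    return n_samples / total  # reciprocal of the averaged 1/P(D) terms


# Assumed example: prior Beta(3.5, 8.5), data z = 6 heads in N = 9 flips.
estimate = estimate_marginal_likelihood(6, 9, 3.5, 8.5)
exact = exp(log_beta(6 + 3.5, 3 + 8.5) - log_beta(3.5, 8.5))
print(estimate, exact)
```

Picking h close to the posterior keeps each term of the sum nearly constant, which is why the estimate is stable; a poorly matched h would still be valid but much noisier.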
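  7. 10.3.2.1 A more general method
    • In this example, the prior happens to have the same functional form in every model.
    • In general, we want to use different functions.
    • Writing the models out separately therefore gives the specification below.
    • For ease of explanation, the prior parameters differ from the previous example.
      $\omega_1 = 0.10, \quad \omega_2 = 0.90$
      $m \sim \mathrm{Categorical}(0.5, 0.5)$
      $\theta_1 \sim \mathrm{Beta}(\omega = \omega_1, \kappa = 20)$
      $\theta_2 \sim \mathrm{Beta}(\omega = \omega_2, \kappa = 20)$
      $y_i \sim \mathrm{Bern}(\theta_m)$
    [Diagram: graphical model over the nodes ω, m, θ, and y, with a plate of size N around y]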
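  8. How θ changes with m
    • JAGS uses Gibbs sampling, so it samples the parameters one at a time:
      $\theta_1^{(1)} \sim P(\theta_1 \mid \theta_2^{(0)}, m^{(0)}, D)$
      $\theta_2^{(1)} \sim P(\theta_2 \mid \theta_1^{(1)}, m^{(0)}, D)$
      $m^{(1)} \sim P(m \mid \theta_1^{(1)}, \theta_2^{(1)}, D)$
      $\theta_1^{(2)} \sim P(\theta_1 \mid \theta_2^{(1)}, m^{(1)}, D)$
      $\theta_2^{(2)} \sim P(\theta_2 \mid \theta_1^{(2)}, m^{(1)}, D)$
      $m^{(2)} \sim P(m \mid \theta_1^{(2)}, \theta_2^{(2)}, D)$
      $\cdots$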
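  9. Sampling process
    The joint posterior, up to the normalizing constant 1/P(D):
      $P(\theta_1, \theta_2, m \mid D) \propto \begin{cases} P_1(D \mid \theta_1)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m=1) & \text{if } m = 1 \\ P_2(D \mid \theta_2)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m=2) & \text{if } m = 2 \end{cases}$
    Suppose the current model index is m = 1. Then:
      $\theta_1^{(1)} \sim P(\theta_1 \mid \theta_2^{(0)}, m=1, D) = \frac{P(\theta_1, \theta_2^{(0)}, m=1 \mid D)}{P(\theta_2^{(0)}, m=1 \mid D)}$
    Since $P(\theta_2^{(0)}, m=1 \mid D) \propto P_2(\theta_2^{(0)})\,P(m=1) \int d\theta_1\, P_1(D \mid \theta_1)\,P_1(\theta_1)$:
      $= \frac{P_1(D \mid \theta_1)\,P_1(\theta_1)}{\int d\theta_1\, P_1(D \mid \theta_1)\,P_1(\theta_1)}$
    That is, when m = 1, θ1 is drawn from its ordinary posterior under model 1.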
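  10. Sampling process
      $\theta_2^{(1)} \sim P(\theta_2 \mid \theta_1^{(1)}, m=1, D) = \frac{P(\theta_1^{(1)}, \theta_2, m=1 \mid D)}{P(\theta_1^{(1)}, m=1 \mid D)}$
    Since $P(\theta_1^{(1)}, m=1 \mid D) \propto P_1(D \mid \theta_1^{(1)})\,P_1(\theta_1^{(1)})\,P(m=1) \int d\theta_2\, P_2(\theta_2)$:
      $= P_2(\theta_2)$
    That is, when m = 1, θ2 is drawn from its prior.
    For reference, the joint posterior (up to normalization):
      $P(\theta_1, \theta_2, m \mid D) \propto \begin{cases} P_1(D \mid \theta_1)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m=1) & \text{if } m = 1 \\ P_2(D \mid \theta_2)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m=2) & \text{if } m = 2 \end{cases}$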
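  11. Sampling process
      $m^{(1)} \sim P(m \mid \theta_1^{(1)}, \theta_2^{(1)}, D) = \frac{P(\theta_1^{(1)}, \theta_2^{(1)}, m \mid D)}{P(\theta_1^{(1)}, \theta_2^{(1)} \mid D)}$
    where, up to normalization:
      $P(\theta_1, \theta_2, m \mid D) \propto P(D \mid \theta_1, \theta_2, m)\,P(\theta_1, \theta_2, m) = \begin{cases} P_1(D \mid \theta_1)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m=1) & \text{if } m = 1 \\ P_2(D \mid \theta_2)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m=2) & \text{if } m = 2 \end{cases}$
      $P(\theta_1, \theta_2, m) = P_1(\theta_1)\,P_2(\theta_2)\,P(m)$
      $P(D \mid \theta_1, \theta_2, m) = \begin{cases} P_1(D \mid \theta_1) & \text{if } m = 1 \\ P_2(D \mid \theta_2) & \text{if } m = 2 \end{cases}$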
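The three conditional draws derived on the sampling-process slides can be sketched as a hand-written Gibbs sampler. This is not JAGS itself, only an illustration of the same update order. The data (z = 17 heads in N = 30 flips) are an assumption, since the slides do not state them; the priors are the ω = 0.10 / 0.90, κ = 20 priors from the model specification.

```python
import random
from math import exp, lgamma, log


def shape_params(omega, kappa):
    """Convert mode/concentration to Beta shape parameters."""
    return omega * (kappa - 2) + 1, (1 - omega) * (kappa - 2) + 1


def log_lik(theta, z, n):
    theta = min(max(theta, 1e-12), 1 - 1e-12)  # guard against log(0)
    return z * log(theta) + (n - z) * log(1 - theta)


def gibbs_model_index(z, n, priors, p_m1=0.5, n_iter=100000, seed=1):
    """Return the fraction of iterations with m = 1, an estimate of P(m=1|D)."""
    rng = random.Random(seed)
    (a1, b1), (a2, b2) = priors
    m, count_m1 = 1, 0
    for _ in range(n_iter):
        if m == 1:
            theta1 = rng.betavariate(a1 + z, b1 + n - z)  # posterior under model 1
            theta2 = rng.betavariate(a2, b2)              # prior, since model 2 is unused
        else:
            theta1 = rng.betavariate(a1, b1)              # prior, since model 1 is unused
            theta2 = rng.betavariate(a2 + z, b2 + n - z)  # posterior under model 2
        # m | θ1, θ2, D is proportional to P_m(D|θ_m) P(m); the priors on θ cancel.
        w1 = p_m1 * exp(log_lik(theta1, z, n))
        w2 = (1 - p_m1) * exp(log_lik(theta2, z, n))
        m = 1 if rng.random() < w1 / (w1 + w2) else 2
        count_m1 += m == 1
    return count_m1 / n_iter


def exact_posterior_m1(z, n, priors, p_m1=0.5):
    """Closed-form P(m=1|D) for comparison."""
    log_ml = [lgamma(a + z) + lgamma(b + n - z) - lgamma(a + b + n)
              - (lgamma(a) + lgamma(b) - lgamma(a + b)) for a, b in priors]
    w1, w2 = p_m1 * exp(log_ml[0]), (1 - p_m1) * exp(log_ml[1])
    return w1 / (w1 + w2)


PRIORS = [shape_params(0.10, 20), shape_params(0.90, 20)]
Z, N = 17, 30  # assumed data, not stated in the slides

gibbs_estimate = gibbs_model_index(Z, N, PRIORS)
exact = exact_posterior_m1(Z, N, PRIORS)
print(gibbs_estimate, exact)
```

Because the unused model's θ is drawn from its prior, that draw often lands far from the data, so the chain rarely switches models and mixes slowly.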
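  12. Using pseudo-priors
    • Run the model without pseudo-priors beforehand to obtain the parameters of the pseudo-priors.
    • θ of the selected model is sampled as usual, but θ of the model that was not selected is sampled from its pseudo-prior.
      $\omega_{i,j},\ \kappa_{i,j} = \begin{cases} \text{true prior} & \text{if } i = j \\ \text{pseudo-prior} & \text{if } i \neq j \end{cases}$
      $m \sim \mathrm{Categorical}(0.5, 0.5)$
      $\theta_1 \sim \mathrm{Beta}(\omega = \omega_{1,m}, \kappa = \kappa_{1,m})$
      $\theta_2 \sim \mathrm{Beta}(\omega = \omega_{2,m}, \kappa = \kappa_{2,m})$
      $y_i \sim \mathrm{Bern}(\theta_m)$
    [Diagram: graphical model over the nodes ω, m, θ, and y, with plates of size 2 and N]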
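A sketch of the same kind of sampler with pseudo-priors follows. Assumptions: the data are z = 17 heads in N = 30 flips, the true priors are the ω = 0.10 / 0.90, κ = 20 Beta priors, and each pseudo-prior is set to the closed-form single-model posterior as a stand-in for the result of a preliminary run. Function names are illustrative only.

```python
import random
from math import exp, lgamma, log


def log_beta_pdf(x, a, b):
    return ((a - 1) * log(x) + (b - 1) * log(1 - x)
            - (lgamma(a) + lgamma(b) - lgamma(a + b)))


def gibbs_with_pseudo_priors(z, n, true_priors, pseudo_priors, n_iter=20000, seed=2):
    """Estimate P(m=1|D) with equal P(m), sampling the unused θ from its pseudo-prior."""
    rng = random.Random(seed)
    m, count_m1 = 1, 0
    for _ in range(n_iter):
        thetas = []
        for i in (1, 2):
            if i == m:
                a, b = true_priors[i - 1]
                thetas.append(rng.betavariate(a + z, b + n - z))  # usual posterior draw
            else:
                a, b = pseudo_priors[i - 1]
                thetas.append(rng.betavariate(a, b))              # pseudo-prior draw
        # Log joint for each candidate m: likelihood and true prior for the
        # selected θ, pseudo-prior for the other θ (equal P(m) cancels).
        log_w = []
        for cand in (1, 2):
            other = 3 - cand
            t = thetas[cand - 1]
            log_w.append(z * log(t) + (n - z) * log(1 - t)
                         + log_beta_pdf(t, *true_priors[cand - 1])
                         + log_beta_pdf(thetas[other - 1], *pseudo_priors[other - 1]))
        p_m1 = 1.0 / (1.0 + exp(log_w[1] - log_w[0]))
        m = 1 if rng.random() < p_m1 else 2
        count_m1 += m == 1
    return count_m1 / n_iter


Z, N = 17, 30                               # assumed data
TRUE_PRIORS = [(2.8, 17.2), (17.2, 2.8)]    # ω = 0.10 / 0.90 with κ = 20
PSEUDO_PRIORS = [(a + Z, b + N - Z) for a, b in TRUE_PRIORS]  # stand-in for a pilot run

log_ml = [lgamma(a + Z) + lgamma(b + N - Z) - lgamma(a + b + N)
          - (lgamma(a) + lgamma(b) - lgamma(a + b)) for a, b in TRUE_PRIORS]
exact = 1.0 / (1.0 + exp(log_ml[1] - log_ml[0]))

estimate = gibbs_with_pseudo_priors(Z, N, TRUE_PRIORS, PSEUDO_PRIORS)
print(estimate, exact)
```

The pseudo-prior integrates to 1, so it does not change the marginal posterior of m; it only keeps the unused θ in a plausible region so that model switches are accepted more often.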
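  13. The problem of too few samples for the unsupported model
    • In the posterior distribution of m, model 1 is selected only 8% of the time.
    • So θ1, the parameter of model 1, gets only 8% of all samples.
    • To get more samples we can lengthen the chain (at the cost of computation time), or adjust P(m) so that the models are selected more evenly (bias the prior toward m = 1).
    • Even when the simulation is run with a modified P(m), we can still recover the posterior odds under equal priors:
      $\mathrm{BF} = \frac{P(m=1 \mid D)}{P(m=2 \mid D)} \cdot \frac{P(m=2)}{P(m=1)}$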
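As a small worked example of this correction, the sketch below divides the posterior odds from a biased run by the prior odds used in that run. The numbers are hypothetical: a true Bayes factor of 0.0882 and a prior of P(m=1) = 0.9 are made up to show the round trip.

```python
def bayes_factor_from_biased_run(post_m1, prior_m1):
    """BF = posterior odds / prior odds, using the P(m) actually used in the simulation."""
    posterior_odds = post_m1 / (1 - post_m1)
    prior_odds = prior_m1 / (1 - prior_m1)
    return posterior_odds / prior_odds


def posterior_m1_under_equal_priors(bf):
    """With P(m=1) = P(m=2), the posterior odds equal the BF."""
    return bf / (1 + bf)


# Hypothetical numbers: simulate with P(m=1) = 0.9 when the true BF is 0.0882.
true_bf = 0.0882
biased_prior_m1 = 0.9
biased_odds = true_bf * biased_prior_m1 / (1 - biased_prior_m1)
biased_post_m1 = biased_odds / (1 + biased_odds)  # what the biased chain would report

recovered_bf = bayes_factor_from_biased_run(biased_post_m1, biased_prior_m1)
print(biased_post_m1, recovered_bf, posterior_m1_under_equal_priors(recovered_bf))
```

In this example the biased run visits model 1 in roughly 44% of iterations instead of about 8%, while the recovered Bayes factor is unchanged.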
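  14. 10.3.3 Using a different likelihood function for each model
    • P(D|θ) is apparently also called the noise distribution.
    • To use a different P(D|θ) for each model, the technique introduced in 8.6.1 can be used:
      • spy = (if m = 1 then PDF(D|θ1) else PDF(D|θ2)) / C
      • 1 ~ Bern(spy)
    • This multiplies the model's joint probability by spy.
    • Dividing by C (a fairly large constant) keeps spy from exceeding 1.
    • Only relative values matter, so the specific value of spy is irrelevant.
    • I think this could be written more intuitively in Stan (the increment_log_prob function adds to the model's log probability).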
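The claim that the constant C does not matter can be checked numerically. The sketch below is not the JAGS trick itself; it only normalizes the per-model weights by hand to show that C cancels. The likelihood values are the P(D|m) figures quoted earlier in the deck, and `posterior_over_m` is a made-up helper name.

```python
def posterior_over_m(likelihoods, priors, c=1.0):
    """Normalize (likelihood / C) * prior over the models; C cancels in the ratio."""
    weights = [lik / c * p for lik, p in zip(likelihoods, priors)]
    total = sum(weights)
    return [w / total for w in weights]


likelihoods = [0.000499, 0.002339]  # P(D|m=1), P(D|m=2) from the analytic example
priors = [0.5, 0.5]

unscaled = posterior_over_m(likelihoods, priors, c=1.0)
scaled = posterior_over_m(likelihoods, priors, c=1000.0)
print(unscaled, scaled)
```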
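  15. 10.4: Model averaging
    • We want to predict P(ŷ).
    • If model b wins the model comparison, we can predict with that model alone:
      $P(\hat{y} \mid D, m=b) = \int d\theta_b\, P_b(\hat{y} \mid \theta_b, m=b)\,P_b(\theta_b \mid D, m=b)$
    • Each model has a credibility assigned to it, so we can instead average over all models using those weights:
      $P(\hat{y} \mid D) = \sum_m \int d\theta_m\, P_m(\hat{y} \mid \theta_m, m)\,P_m(\theta_m \mid D, m)\,P(m \mid D)$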
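For a Beta-Bernoulli model the inner integral has a closed form, so the model-averaged prediction is easy to sketch. Assumptions: the two priors are the ω = 0.25 / 0.75 Beta priors from the analytic example, with κ = 12 and data z = 6 heads in N = 9 flips (neither κ nor the data are stated in the slides). The predictive probability of heads under each model is the posterior mean (a + z) / (a + b + N).

```python
from math import exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def shape_params(omega, kappa):
    return omega * (kappa - 2) + 1, (1 - omega) * (kappa - 2) + 1


Z, N, KAPPA = 6, 9, 12          # assumed data and concentration
OMEGAS = (0.25, 0.75)
PRIOR_M = (0.5, 0.5)

priors = [shape_params(w, KAPPA) for w in OMEGAS]

# P(D|m) for each model, then P(m|D) by Bayes' rule.
marginals = [exp(log_beta(Z + a, N - Z + b) - log_beta(a, b)) for a, b in priors]
evidence = sum(ml * pm for ml, pm in zip(marginals, PRIOR_M))
post_m = [ml * pm / evidence for ml, pm in zip(marginals, PRIOR_M)]

# P(ŷ = heads | D, m): posterior mean of θ under each model.
pred_heads = [(a + Z) / (a + b + N) for a, b in priors]

# Model-averaged prediction: weight each model's prediction by P(m|D).
averaged = sum(p * w for p, w in zip(pred_heads, post_m))
print(post_m, pred_heads, averaged)
```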
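  16. Model comparison and complexity
    • Consider coin-flip models with θ ~ Beta(a, b):
      • 1. The "probably fair" model: (a, b) = (500, 500)
      • 2. The "anything can happen" model: (a, b) = (1, 1)
    • When 15 of 20 flips come up heads, model 2 wins.
    • When 11 of 20 flips come up heads, model 1 wins.
    • The deciding factor is whether the model fits the data in the region where its prior is concentrated.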
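The two outcomes above can be checked with the closed-form Beta-Bernoulli marginal likelihood. A minimal sketch, with illustrative function names:

```python
from math import exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal_likelihood(z, n, a, b):
    """log P(D|m) for z heads in n flips under a Beta(a, b) prior."""
    return log_beta(z + a, n - z + b) - log_beta(a, b)


def bayes_factor(z, n, prior_1, prior_2):
    """BF of model 1 over model 2; values above 1 favor model 1."""
    return exp(log_marginal_likelihood(z, n, *prior_1)
               - log_marginal_likelihood(z, n, *prior_2))


FAIR = (500, 500)     # model 1: "probably fair"
ANYTHING = (1, 1)     # model 2: "anything can happen"

bf_15_of_20 = bayes_factor(15, 20, FAIR, ANYTHING)
bf_11_of_20 = bayes_factor(11, 20, FAIR, ANYTHING)
print(bf_15_of_20, bf_11_of_20)
```

A Bayes factor below 1 for 15 heads means the flexible model wins, and a value above 1 for 11 heads means the simpler fair-coin model wins even though the flexible model could also fit those data.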
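  17. 10.5.1 Caveats for model comparison
    • Given some model (call it the full model), we can consider a model that adds constraints on the range of its parameters.
      • For example, "the value of parameter a equals b".
    • The full model is more complex, so if the restricted model describes the data about equally well, Bayesian model comparison will tend to select the restricted model.
    • For the baseball player model of Chapter 9, one could consider a restricted model in which all infielders have the same ability.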
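  18. 10.6 Sensitivity to the prior
    • The Bayes factor uses ∫dθ P(D|θ)P(θ), so it is sensitive to the prior.
    • Example: changing the prior of the numerator model from Beta(1,1) to Beta(0.01,0.01) moves the BF from 0.12 to 5.72.
    • The model's 95% HDI is barely affected by the prior.
    • Given enough data, Bayesian estimation, unlike model comparison, is not very sensitive to the prior.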
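  19. 10.6.1 The priors of the different models should be equally informed
    • Differences in the priors affect the BF. What should we do?
    • Determine the priors from data.
      • Use the same data for every model.
    • Example: a coin flipped 100 times with 65 heads.
      • Use 10% of the data (6 heads in 10 flips) to update the priors.
      • Beta(1, 1) → Beta(1+6, 1+4)
      • The BF becomes stable.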
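The stabilizing effect can be sketched numerically. Assumptions: the comparison model is the "probably fair" Beta(500, 500) prior from the earlier slide, the two vague alternatives are Beta(1, 1) and Beta(0.01, 0.01), and after informing each prior with 6 heads in 10 flips the Bayes factor is computed on the remaining 59 heads in 90 flips. The slides do not spell out these details.

```python
from math import exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bayes_factor(z, n, prior_1, prior_2):
    """BF of the prior_1 model over the prior_2 model for z heads in n flips."""
    def log_ml(a, b):
        return log_beta(z + a, n - z + b) - log_beta(a, b)
    return exp(log_ml(*prior_1) - log_ml(*prior_2))


def inform(prior, z0, n0):
    """Update a Beta prior with z0 heads in n0 flips."""
    a, b = prior
    return a + z0, b + n0 - z0


FAIR = (500, 500)          # assumed comparison model
UNIFORM = (1, 1)
HALDANE = (0.01, 0.01)

# Uninformed priors on all 100 flips: the BF swings widely with the vague prior.
bf_uniform = bayes_factor(65, 100, FAIR, UNIFORM)
bf_haldane = bayes_factor(65, 100, FAIR, HALDANE)

# Inform every prior with the same 10% of the data, then use the other 90%.
bf_uniform_informed = bayes_factor(59, 90, inform(FAIR, 6, 10), inform(UNIFORM, 6, 10))
bf_haldane_informed = bayes_factor(59, 90, inform(FAIR, 6, 10), inform(HALDANE, 6, 10))

print(bf_uniform, bf_haldane)
print(bf_uniform_informed, bf_haldane_informed)
```

After the update, the two vague priors become Beta(7, 5) and Beta(6.01, 4.01), which are nearly the same distribution, so the resulting Bayes factors are close to each other.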