todesking
August 24, 2018

# Bayesian Statistical Modeling 10 // Doing Bayesian Data Analysis Chapter 10


## Transcript

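3. ### 10.1 General formula and the Bayes factor
• Consider two models that explain the observed data (D)
• Each model consists of a prior distribution P(θ), a likelihood P(D|θ), and parameters θ

$$\theta_1 \sim P_1(\theta_1), \qquad D \sim P_1(D \mid \theta_1)$$
$$\theta_2 \sim P_2(\theta_2), \qquad D \sim P_2(D \mid \theta_2)$$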
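4. ### Abstract representation of a model
• Even when a model is built from several variables, it can be abstracted and expressed as θ, P(θ), P(D|θ) (my own interpretation)

$$\theta = (\phi_1, \phi_2, \phi_3), \qquad D = (X, Y)$$
$$P(\theta) = P(\phi_1, \phi_2, \phi_3), \qquad P(D \mid \theta) = P(X, Y \mid \phi_1, \phi_2, \phi_3)$$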
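5. ### A model for model comparison
• Introduce a model-selection variable m to combine the two models into one
• The result is a model for comparing models
• fig 10.1 shows the case where the likelihood functions are separate (center), the case where they are shared (right), and what seems to be their generalization (left)
• The diagram below corresponds to the center case

$$\theta_1 \sim P_1(\theta_1), \qquad \theta_2 \sim P_2(\theta_2), \qquad m \sim P(m)$$
$$P(D \mid \theta_1, m = 1) = P_1(D \mid \theta_1), \qquad P(D \mid \theta_2, m = 2) = P_2(D \mid \theta_2)$$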
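6. ### A model for model comparison
• The joint posterior distribution of this model's parameters is as follows
• Let M be the number of models and Θ the parameter set {θ_1, ..., θ_M}

$$P(\Theta, m \mid D) = \frac{P(D \mid \Theta, m)\,P(\Theta, m)}{\sum_m \int d\theta_m\, P(D \mid \Theta, m)\,P(\Theta, m)}$$
$$= \frac{\prod_m P_m(D \mid \theta_m, m)\,P_m(\theta_m \mid m)\,P(m)}{\sum_m \int d\theta_m \prod_m P_m(D \mid \theta_m, m)\,P_m(\theta_m \mid m)\,P(m)}$$

• The important point seems to be that P(D|Θ, m)P(Θ, m) factorizes into ∏_m P_m(D|θ_m, m) P_m(θ_m|m) P(m)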
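7. ### P(m|D)
• With this model we can obtain P(m|D), the probability that model m is in use given the data D

$$P(m \mid D) = \frac{P(D \mid m)\,P(m)}{\sum_m P(D \mid m)\,P(m)}$$
$$P(D \mid m) = \int d\theta_m\, P_m(D \mid \theta_m)\,P_m(\theta_m)$$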
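8. ### Posterior odds and the Bayes factor
• Taking the ratio of P(m|D) between models tells us which model is more plausible; this ratio is the posterior odds

$$\frac{P(m = 1 \mid D)}{P(m = 2 \mid D)} = \frac{P(D \mid m = 1)}{P(D \mid m = 2)} \cdot \frac{P(m = 1)}{P(m = 2)}$$

• The ratio of the marginal likelihoods P(D|m) is called the Bayes factor (BF)
• Posterior odds = BF × ratio of the prior probabilities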
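9. ### 10.2 Example: two coin factories
• A coin was flipped N times and came up heads z times. Which of the two factories did it come from?
• Treat each factory as a model and compare the models
• This corresponds to the right panel of fig 10.1, the case where the likelihood function is the same

$$\theta_1 \sim \mathrm{Beta}(\omega = 0.25, \kappa = 12), \qquad \theta_2 \sim \mathrm{Beta}(\omega = 0.75, \kappa = 12)$$
$$m \sim \mathrm{Categorical}(0.5, 0.5), \qquad z \sim \mathrm{Binomial}(\theta_m, N)$$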
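10. ### Finding the ratio of P(m|D): analytical solution
• When the prior is Beta(a, b), P(z, N) is given by the formula below (explained in Chapter 6)
• When dividing, it is a good idea to take logs to prevent underflow
• R has the lbeta function, which computes the log of the beta function

$$P(z, N) = \frac{B(z + a,\, N - z + b)}{B(a, b)} = \exp\bigl(\log B(z + a,\, N - z + b) - \log B(a, b)\bigr)$$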
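11. ### Finding the ratio of P(m|D): analytical solution
• Putting it together, P(D|m) is

$$P(D \mid m) = P(z, N \mid m) = \frac{B(z + a_m,\, N - z + b_m)}{B(a_m, b_m)}$$
$$a_m = \omega_m(\kappa - 2) + 1, \qquad b_m = (1 - \omega_m)(\kappa - 2) + 1$$
$$\omega_1 = 0.25, \qquad \omega_2 = 0.75$$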
12. ### Finding the ratio of P(m|D): analytical solution
• P(D|m=1) ≈ 0.000499
• P(D|m=2) ≈ 0.002339
• So BF = P(D|m=1)/P(D|m=2) ≈ 0.213
• When P(m = 1) = P(m = 2) = 0.5,

$$\frac{P(m = 1 \mid D)}{P(m = 2 \mid D)} = \frac{P(D \mid m = 1)}{P(D \mid m = 2)} = 0.213$$

Since P(m = 2|D) = 1 − P(m = 1|D),

$$\frac{P(m = 1 \mid D)}{1 - P(m = 1 \mid D)} = 0.213$$
$$P(m = 1 \mid D) = 0.176, \qquad P(m = 2 \mid D) = 0.824$$
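The analytical calculation can be reproduced in a few lines. The sketch below is only an illustration: it assumes the data are z = 6 heads in N = 9 flips (the transcript does not state the data, so this is an assumption), and it uses `math.lgamma` as a stand-in for R's `lbeta`. The function names are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """log B(a, b); plays the role of R's lbeta."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def marginal_likelihood(z, n, omega, kappa):
    """P(D|m) for a Beta prior given by mode omega and concentration kappa."""
    a = omega * (kappa - 2) + 1
    b = (1 - omega) * (kappa - 2) + 1
    # Work in log space and exponentiate once to avoid underflow.
    return exp(log_beta(z + a, n - z + b) - log_beta(a, b))


z, n = 6, 9  # ASSUMED data: 6 heads in 9 flips
p_d_m1 = marginal_likelihood(z, n, omega=0.25, kappa=12)
p_d_m2 = marginal_likelihood(z, n, omega=0.75, kappa=12)

bf = p_d_m1 / p_d_m2
# With P(m=1) = P(m=2) = 0.5 the posterior odds equal the BF.
p_m1 = bf / (1 + bf)
p_m2 = 1 - p_m1

print(f"P(D|m=1)={p_d_m1:.6f}  P(D|m=2)={p_d_m2:.6f}")
print(f"BF={bf:.3f}  P(m=1|D)={p_m1:.3f}  P(m=2|D)={p_m2:.3f}")
```

Under the assumed data, the printed values should agree with the figures on the slide up to rounding.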


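18. ### Computing the marginal likelihood for each model
• P(D) can be approximated as Σ P(D|θ_n) / N using θ_n sampled from P(θ), but this is not practical
• P(θ) is diffuse
• For most of the samples, P(D|θ) is very small
• We would like to derive P(D) using θ sampled from the posterior P(θ|D)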
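19. ### Computing the marginal likelihood for each model

$$P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}$$
$$\frac{1}{P(D)} = \frac{P(\theta \mid D)}{P(D \mid \theta)\,P(\theta)}$$

Introduce an arbitrary probability distribution h(θ), whose integral is 1:

$$\frac{1}{P(D)} = \frac{P(\theta \mid D)}{P(D \mid \theta)\,P(\theta)} \int d\theta'\, h(\theta') = \int d\theta'\, \frac{P(\theta \mid D)}{P(D \mid \theta)\,P(\theta)}\, h(\theta')$$

Since the value of P(θ|D) / (P(D|θ)P(θ)) is the same for every θ:

$$\frac{1}{P(D)} = \int d\theta'\, \frac{P(\theta' \mid D)}{P(D \mid \theta')\,P(\theta')}\, h(\theta') \approx \frac{1}{N} \sum_{\theta_i \sim P(\theta \mid D)} \frac{h(\theta_i)}{P(D \mid \theta_i)\,P(\theta_i)}$$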
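20. ### Computing the marginal likelihood for each model
• Any probability distribution can be used as h(θ), but for numerical reasons it should have a shape similar to the denominator P(D|θ)P(θ)
• For a complex model, finding such an h is difficult
• In 10.3.1.1, the shape of h is determined from the sampled θ

$$\frac{1}{P(D)} \approx \frac{1}{N} \sum_{\theta_i \sim P(\theta \mid D)} \frac{h(\theta_i)}{P(D \mid \theta_i)\,P(\theta_i)}$$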
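A minimal sketch of this estimator for the coin model follows. Several things here are assumptions made for illustration: the data (z = 6, N = 9), the choice of model 2's prior Beta(8.5, 3.5), drawing posterior samples directly from the conjugate Beta posterior instead of running MCMC, and fitting h as a Beta distribution by matching the sample mean and variance.

```python
import random
from math import exp, lgamma, log
from statistics import mean, pvariance


def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x."""
    log_norm = lgamma(a) + lgamma(b) - lgamma(a + b)
    return (a - 1) * log(x) + (b - 1) * log(1 - x) - log_norm


random.seed(0)
z, n = 6, 9        # ASSUMED data
a, b = 8.5, 3.5    # prior of model 2 (omega = 0.75, kappa = 12)

# Stand-in for MCMC output: the conjugate posterior is sampled directly.
samples = [random.betavariate(z + a, n - z + b) for _ in range(20000)]

# h(theta): a Beta distribution matched to the samples' mean and variance.
mu, var = mean(samples), pvariance(samples)
scale = mu * (1 - mu) / var - 1
h_a, h_b = mu * scale, (1 - mu) * scale


def log_ratio(theta):
    """log of h(theta) / (P(D|theta) P(theta))."""
    log_lik = z * log(theta) + (n - z) * log(1 - theta)
    return beta_logpdf(theta, h_a, h_b) - log_lik - beta_logpdf(theta, a, b)


inv_p_d = mean(exp(log_ratio(t)) for t in samples)  # estimates 1 / P(D)
p_d = 1 / inv_p_d
print(f"estimated P(D|m=2) = {p_d:.6f}")
```

Because h is close to the posterior here, the summand is nearly constant and the estimate is stable; with a poorly matched h the same average can be dominated by a few samples.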
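21. ### 10.3.2 MCMC: hierarchical model
• In this case, the prior and the likelihood of each model can be expressed with the same functions
• When sampling θ, it is enough to switch the value of ω according to m

$$\omega_1 = 0.25, \qquad \omega_2 = 0.75$$
$$m \sim \mathrm{Categorical}(0.5, 0.5)$$
$$\theta \sim \mathrm{Beta}(\omega = \omega_m, \kappa = 12)$$
$$y_i \sim \mathrm{Bern}(\theta)$$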
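22. ### MCMC results
• Top: prior; bottom: posterior
• The posterior distribution of m agrees with the results of the other methods
• Note that the posterior of θ for m=1 uses only 18% of all samples
• The posterior of θ for m=2 uses the remaining 82%
• The less-supported model ends up with fewer samples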
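23. ### 10.3.2.1 A more general method
• In this example, the prior happens to have the same functional form in every model
• In general, we want to use different functions
• So, written separately, the model looks like this
• For the sake of explanation, the prior parameters differ from the previous example

$$\omega_1 = 0.10, \qquad \omega_2 = 0.90$$
$$m \sim \mathrm{Categorical}(0.5, 0.5)$$
$$\theta_1 \sim \mathrm{Beta}(\omega = \omega_1, \kappa = 20), \qquad \theta_2 \sim \mathrm{Beta}(\omega = \omega_2, \kappa = 20)$$
$$y_i \sim \mathrm{Bern}(\theta_m)$$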

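26. ### How θ changes with m
• JAGS uses Gibbs sampling, so it samples the parameters one at a time

$$\theta_1^{(1)} \sim P(\theta_1 \mid \theta_2^{(0)}, m^{(0)}, D)$$
$$\theta_2^{(1)} \sim P(\theta_2 \mid \theta_1^{(1)}, m^{(0)}, D)$$
$$m^{(1)} \sim P(m \mid \theta_1^{(1)}, \theta_2^{(1)}, D)$$
$$\theta_1^{(2)} \sim P(\theta_1 \mid \theta_2^{(1)}, m^{(1)}, D)$$
$$\theta_2^{(2)} \sim P(\theta_2 \mid \theta_1^{(2)}, m^{(1)}, D)$$
$$m^{(2)} \sim P(m \mid \theta_1^{(2)}, \theta_2^{(2)}, D)$$
$$\cdots$$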
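27. ### Sampling process

$$P(\theta_1, \theta_2, m \mid D) \propto \begin{cases} P_1(D \mid \theta_1)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m = 1) & \text{if } m = 1 \\ P_2(D \mid \theta_2)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m = 2) & \text{if } m = 2 \end{cases}$$

With $m^{(1)} = 1$:

$$\theta_1^{(1)} \sim P(\theta_1 \mid \theta_2^{(0)}, m = 1, D) = \frac{P(\theta_1, \theta_2^{(0)}, m = 1 \mid D)}{P(\theta_2^{(0)}, m = 1 \mid D)}$$

Since $P(\theta_2^{(0)}, m = 1 \mid D) \propto P_2(\theta_2^{(0)})\,P(m = 1) \int d\theta_1\, P_1(D \mid \theta_1)\,P_1(\theta_1)$, this becomes

$$P(\theta_1 \mid \theta_2^{(0)}, m = 1, D) = \frac{P_1(D \mid \theta_1)\,P_1(\theta_1)}{\int d\theta_1\, P_1(D \mid \theta_1)\,P_1(\theta_1)}$$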
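28. ### Sampling process

$$\theta_2^{(1)} \sim P(\theta_2 \mid \theta_1^{(1)}, m = 1, D) = \frac{P(\theta_1^{(1)}, \theta_2, m = 1 \mid D)}{P(\theta_1^{(1)}, m = 1 \mid D)}$$

Since $P(\theta_1^{(1)}, m = 1 \mid D) \propto P_1(D \mid \theta_1^{(1)})\,P_1(\theta_1^{(1)})\,P(m = 1) \int d\theta_2\, P_2(\theta_2)$, this becomes

$$P(\theta_2 \mid \theta_1^{(1)}, m = 1, D) = P_2(\theta_2)$$

using

$$P(\theta_1, \theta_2, m \mid D) \propto \begin{cases} P_1(D \mid \theta_1)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m = 1) & \text{if } m = 1 \\ P_2(D \mid \theta_2)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m = 2) & \text{if } m = 2 \end{cases}$$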
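29. ### Sampling process

$$m^{(2)} \sim P(m \mid \theta_1^{(1)}, \theta_2^{(1)}, D) = \frac{P(\theta_1^{(1)}, \theta_2^{(1)}, m \mid D)}{P(\theta_1^{(1)}, \theta_2^{(1)} \mid D)}$$

The joint posterior is

$$P(\theta_1, \theta_2, m \mid D) \propto P(D \mid \theta_1, \theta_2, m)\,P(\theta_1, \theta_2, m) = \begin{cases} P_1(D \mid \theta_1)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m = 1) & \text{if } m = 1 \\ P_2(D \mid \theta_2)\,P_1(\theta_1)\,P_2(\theta_2)\,P(m = 2) & \text{if } m = 2 \end{cases}$$

with $P(\theta_1, \theta_2, m) = P_1(\theta_1)\,P_2(\theta_2)\,P(m)$. The prior factors are common to both cases and P(m = 1) = P(m = 2), so

$$P(m \mid \theta_1^{(1)}, \theta_2^{(1)}, D) \propto \begin{cases} P(D \mid \theta_1) & \text{if } m = 1 \\ P(D \mid \theta_2) & \text{if } m = 2 \end{cases}$$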
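30. ### Sampling process
• Whether the next m becomes 1 or 2 is determined by the ratio P(D|θ1)/P(D|θ2)
• θ1 is generated from P(D|θ1)P(θ1) → P(D|θ1) is likely to be large
• θ2 is generated from P(θ2) → P(D|θ2) will probably be small
• As a result, m is likely to stay at 1

$$m^{(1)} = 1$$
$$\theta_1^{(1)} \sim P(\theta_1 \mid m = 1, D) \propto P_1(D \mid \theta_1)\,P_1(\theta_1)$$
$$\theta_2^{(1)} \sim P_2(\theta_2)$$
$$m^{(2)} \sim P(m \mid \theta_1^{(1)}, \theta_2^{(1)}, D) \propto \begin{cases} P(D \mid \theta_1) & \text{if } m = 1 \\ P(D \mid \theta_2) & \text{if } m = 2 \end{cases}$$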
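The sampling scheme described above can be sketched directly, without JAGS. This is an illustrative hand-written Gibbs loop, not the deck's code. It assumes the two-factory priors (ω = 0.25 and 0.75, κ = 12, i.e. Beta(3.5, 8.5) and Beta(8.5, 3.5)), equal P(m), and assumed data z = 6, N = 9; conjugacy is used so that each conditional can be drawn exactly.

```python
import random
from math import exp, log

random.seed(1)
z, n = 6, 9                                 # ASSUMED data
priors = {1: (3.5, 8.5), 2: (8.5, 3.5)}     # Beta(a, b) prior of each model


def log_lik(theta):
    """log P(D|theta) for z heads in n Bernoulli flips."""
    return z * log(theta) + (n - z) * log(1 - theta)


m = 1
m_trace = []
for _ in range(20000):
    theta = {}
    for k, (a, b) in priors.items():
        if k == m:
            # Selected model: conditional is likelihood x prior (Beta posterior).
            theta[k] = random.betavariate(a + z, b + n - z)
        else:
            # Unselected model: the data do not enter, so draw from the prior.
            theta[k] = random.betavariate(a, b)
    # m | theta1, theta2, D is proportional to P(D|theta_m) when P(m) is equal.
    w1, w2 = exp(log_lik(theta[1])), exp(log_lik(theta[2]))
    m = 1 if random.random() < w1 / (w1 + w2) else 2
    m_trace.append(m)

p_m1 = m_trace.count(1) / len(m_trace)
print(f"fraction of steps with m=1: {p_m1:.3f}")
```

In this small example the chain still switches models often enough for the fraction of m = 1 steps to approximate P(m = 1|D). With sharper posteriors and broad priors, the draw for the unselected model rarely fits the data and m tends to get stuck, which is the problem pseudo-priors address.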

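32. ### Using pseudo-priors
• Run the model without pseudo-priors beforehand to obtain the parameters of the pseudo-priors
• θ of the selected model is sampled as usual, while θ of the unselected model is sampled from its pseudo-prior

$$\omega_{i,j},\ \kappa_{i,j} = \begin{cases} \text{true prior} & \text{if } i = j \\ \text{pseudo prior} & \text{if } i \neq j \end{cases}$$
$$m \sim \mathrm{Categorical}(0.5, 0.5)$$
$$\theta_1 \sim \mathrm{Beta}(\omega = \omega_{1,m}, \kappa = \kappa_{1,m}), \qquad \theta_2 \sim \mathrm{Beta}(\omega = \omega_{2,m}, \kappa = \kappa_{2,m})$$
$$y_i \sim \mathrm{Bern}(\theta_m)$$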

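34. ### The problem of too few samples for the unsupported model
• In the posterior of m, model 1 is selected only 8% of the time
• That is, the samples of θ1, the parameter of model 1, make up only 8% of the total
• To get more samples, besides lengthening the chain (which costs computation time), we can adjust P(m) so that the models are selected more evenly (bias it toward m=1)
• Even when the simulation is run with a modified P(m), the BF, and hence the posterior odds under equal priors, can still be recovered

$$BF = \frac{P(m = 1 \mid D)}{P(m = 2 \mid D)} \cdot \frac{P(m = 2)}{P(m = 1)}$$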
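The correction is a one-line calculation, sketched below. The numbers are hypothetical and chosen only for illustration: a run with P(m=1) raised to 0.9 in which model 1 was visited in 45% of the steps. The helper names are not from the deck.

```python
def bayes_factor(post_m1, prior_m1):
    """BF = posterior odds / prior odds for a two-model comparison."""
    posterior_odds = post_m1 / (1 - post_m1)
    prior_odds = prior_m1 / (1 - prior_m1)
    return posterior_odds / prior_odds


def posterior_m1(bf, prior_m1=0.5):
    """P(m=1|D) implied by a BF under the given prior P(m=1)."""
    odds = bf * prior_m1 / (1 - prior_m1)
    return odds / (1 + odds)


# HYPOTHETICAL biased run: P(m=1) = 0.9, model 1 visited in 45% of steps.
bf = bayes_factor(post_m1=0.45, prior_m1=0.9)
p_equal = posterior_m1(bf, prior_m1=0.5)
print(f"BF = {bf:.4f}, P(m=1|D) under equal priors = {p_equal:.4f}")
```

In this made-up case the biased run spends 45% of its steps in model 1, yet the implied posterior probability under equal priors is about 8%, so θ1 receives far more samples than it would in an unbiased run.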
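35. ### 10.3.3 Using a different likelihood function for each model
• P(D|θ) is apparently also called the noise distribution
• To use a different P(D|θ) for each model, the technique introduced in 8.6.1 can be used
• spy = (if m = 1 then PDF(D|θ1) else PDF(D|θ2)) / C
• 1 ~ Bern(spy)
• This multiplies the joint probability of the model by spy
• The division by C (a fairly large constant) keeps spy from exceeding 1
• Only relative values matter, so the specific value of spy is irrelevant
• I have the feeling this could be written more intuitively in Stan (increment_log_prob can add to the model's log probability)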
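36. ### 10.4: Model averaging
• We want to predict P(ŷ)
• If model b won the model comparison, we can predict with that model

$$P(\hat{y} \mid D, m = b) = \int d\theta_b\, P_b(\hat{y} \mid \theta_b, m = b)\,P_b(\theta_b \mid D, m = b)$$

• Each model has been assigned a degree of belief, so we can use those weights to average over all models

$$P(\hat{y} \mid D) = \sum_m \int d\theta_m\, P_m(\hat{y} \mid \theta_m, m)\,P_m(\theta_m \mid D, m)\,P(m \mid D)$$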
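For the two-factory coin model the integrals have closed forms, so model averaging can be sketched in a few lines. The data (z = 6, N = 9) and the use of the Beta(3.5, 8.5) and Beta(8.5, 3.5) priors with equal P(m) are assumptions for illustration. The predicted quantity is the probability that the next flip is heads, which under a Beta posterior is its mean.

```python
from math import exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


z, n = 6, 9                                # ASSUMED data
priors = {1: (3.5, 8.5), 2: (8.5, 3.5)}    # Beta(a, b) prior of each model
prior_m = {1: 0.5, 2: 0.5}

# P(D|m) for each model, then P(m|D) by Bayes' rule.
evidence = {m: exp(log_beta(z + a, n - z + b) - log_beta(a, b))
            for m, (a, b) in priors.items()}
total = sum(evidence[m] * prior_m[m] for m in priors)
post_m = {m: evidence[m] * prior_m[m] / total for m in priors}

# P(next flip = heads | D, m) is the posterior mean of theta under model m.
pred_m = {m: (z + a) / (n + a + b) for m, (a, b) in priors.items()}

# Model-averaged prediction: weight each model's prediction by P(m|D).
pred_avg = sum(post_m[m] * pred_m[m] for m in priors)
print(f"per-model: {pred_m}, averaged: {pred_avg:.4f}")
```

The averaged prediction lies between the two per-model predictions and sits closer to the one from the model with the larger P(m|D).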
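37. ### 10.5: Model complexity
• It seems that a model whose parameters can take a wide range of values under the prior is what is called a "complex" model
• It does not simply mean a model with many parameters (in the example below, the number of parameters is the same)
• In general, a more complex model has an advantage in fitting the data
• A model with a wider parameter range is more likely to contain a parameter combination that fits the data
• However, we want to avoid overfitting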

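39. ### Model comparison and complexity
• Consider a coin-flip model: θ ~ Beta(a, b)
• 1. "Must be fair" model: (a, b) = (500, 500)
• 2. "Anything can happen" model: (a, b) = (1, 1)
• When 15 of 20 flips come up heads, model 2 wins
• When 11 of 20 flips come up heads, model 1 wins
• The deciding factor is whether the model fits the data where its prior is thick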
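Both outcomes can be checked with the analytical marginal likelihood from section 10.2. The sketch below assumes the Bayes factor is defined as model 1 ("must be fair") over model 2 ("anything can happen"), so values above 1 favor model 1; the function names are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_evidence(z, n, a, b):
    """log P(D|m) for z heads in n flips under a Beta(a, b) prior."""
    return log_beta(z + a, n - z + b) - log_beta(a, b)


def bf_fair_vs_anything(z, n):
    """BF of Beta(500, 500) over Beta(1, 1); > 1 favors the fair model."""
    return exp(log_evidence(z, n, 500, 500) - log_evidence(z, n, 1, 1))


bf_15 = bf_fair_vs_anything(15, 20)   # 15 heads in 20 flips
bf_11 = bf_fair_vs_anything(11, 20)   # 11 heads in 20 flips
print(f"z=15: BF = {bf_15:.3f}   z=11: BF = {bf_11:.3f}")
```

The first value falls below 1 and the second above 1: the broad prior pays a penalty for spreading its mass, and it only wins when the data land where the narrow prior has almost none.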
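40. ### 10.5.1 Caveats on model comparison
• Given some model (called the full model), we can consider a model that adds constraints on the range of its parameters
• For example, the value of parameter a must equal that of b
• The full model is more complex, so if the restricted model describes the data about equally well, Bayesian model comparison will probably select the restricted model
• For the baseball player model of Chapter 9, one could consider a restricted model in which all infielders have the same ability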
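41. ### Caveats on model comparison
• It is better not to try every conceivable constraint
• Applying equality constraints to 9 parameters gives 21147 combinations
• Imposing a constraint means setting the prior to 0 for particular combinations of parameter values
• Even if the restricted model wins the comparison, that may not be desirable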
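The count 21147 is the number of ways to partition 9 parameters into groups of equal value, i.e. the Bell number B(9). A short sketch using the Bell triangle confirms it; the function name is hypothetical.

```python
def bell_number(n):
    """Number of ways to partition n items into groups (Bell triangle)."""
    row = [1]
    for _ in range(n - 1):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the neighbor from the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]


print(bell_number(9))  # number of equality-constraint patterns for 9 parameters
```

The count grows faster than exponentially in the number of parameters, which is one practical reason not to enumerate every restricted model.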
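42. ### 10.6 Sensitivity to the prior
• The Bayes factor uses ∫dθ P(D|θ)P(θ), so it is sensitive to the prior
• Example: changing the prior of the model in the denominator from Beta(1,1) to Beta(0.01,0.01) moves the BF from 0.12 to 5.72
• The 95% HDI of the model is barely affected by the prior
• With a sufficient amount of data, Bayesian estimation, unlike model comparison, is not very sensitive to the prior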
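43. ### 10.6.1 The priors of the models should be equally informed
• Differences between priors affect the BF. What should we do?
• Determine the priors based on data
• For each model, base the prior on the same data
• Example: a coin that came up heads 65 times in 100 flips
• Use 10% of the data (6 heads in 10 flips) to update the prior
• Beta(1, 1) → Beta(1+6, 1+4)
• The BF becomes stable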
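The stabilizing effect can be illustrated numerically. The sketch below makes several assumptions that the transcript does not spell out: the comparison is the "must be fair" Beta(500, 500) model (numerator) against a vague alternative (denominator), the vague alternative is either Beta(1, 1) or Beta(0.01, 0.01), and after informing every prior with the first 10 flips the BF is computed on the remaining 90 flips (59 heads).

```python
from math import exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_evidence(z, n, prior):
    a, b = prior
    return log_beta(z + a, n - z + b) - log_beta(a, b)


def bayes_factor(z, n, prior_num, prior_den):
    return exp(log_evidence(z, n, prior_num) - log_evidence(z, n, prior_den))


z, n = 65, 100
fair, uniform, haldane = (500, 500), (1, 1), (0.01, 0.01)

# Uninformed priors: the BF depends strongly on which vague prior is used.
bf_uniform = bayes_factor(z, n, fair, uniform)
bf_haldane = bayes_factor(z, n, fair, haldane)

# Inform every prior with the same 10% of the data (6 heads in 10 flips),
# then score the remaining 90% (ASSUMED procedure).
z0, n0 = 6, 10


def informed(prior):
    return (prior[0] + z0, prior[1] + n0 - z0)


bf_uniform_inf = bayes_factor(z - z0, n - n0, informed(fair), informed(uniform))
bf_haldane_inf = bayes_factor(z - z0, n - n0, informed(fair), informed(haldane))

print(f"uninformed: {bf_uniform:.3f} vs {bf_haldane:.3f}")
print(f"informed:   {bf_uniform_inf:.4f} vs {bf_haldane_inf:.4f}")
```

With uninformed priors the two BFs fall on opposite sides of 1; once both vague priors have absorbed the same small amount of data, they become nearly identical and so do the resulting BFs.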