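Foundations of Probability for Epidemiology (疫学のための確率の基礎)

Materials for session 0 of the "Causal Inference: What If" study group.
A summary of the probability basics needed to read epidemiological theory.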


Shuntaro Sato

April 27, 2020

Transcript

  1. Shuntaro Sato (Sato@生物統計家, "biostatistician"). Foundations of Probability for Epidemiology. "Causal Inference: What If" study group.

  2. Questions?
     • During the session: post to the question-box channel on Slack.
     • No particular format is required.
     • Outside the session: post to the "questions for everyone" channel on Slack.

  3. Roadmap (1)
     Population of interest: Treated vs. Untreated
     Causation: E[Y^{a=1}] vs. E[Y^{a=0}]; average causal effect: E[Y^{a=1}] − E[Y^{a=0}]
     Association: E[Y|A = 1] vs. E[Y|A = 0]; association measure: E[Y|A = 1] − E[Y|A = 0]
     The two difference measures are marked "Goal" on the slide.
     Source: Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC. Figure 1.1
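  4. Roadmap (2)
     Goal: understand the building blocks of the average causal effect and the association measure.
     Average causal effect: E[Y^{a=1}] − E[Y^{a=0}] = E[Y^{a=1} − Y^{a=0}]
     Association measure: E[Y|A = 1] − E[Y|A = 0]
     Concept map (items as listed on the slide):
     • Marginal side: expected value, linearity of expectation, mean, probability, random variable
     • Conditional side: conditional expectation, conditional probability
     • Supporting tools: joint probability, independence, chain rule, law of total probability
     • Links between the two sides: marginalization, regression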
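  5. Notes
     • The goal of this session is to understand the basics of probability.
     • Simple notation is used.
     • Notation specific to causal inference, such as potential outcomes, is avoided as far as possible.
     For example, E[Y^{a=1}] − E[Y^{a=0}] = E[Y^{a=1} − Y^{a=0}] is written as
       E[X] − E[Y] = E[X − Y]   or   E[Y_1] − E[Y_2] = E[Y_1 − Y_2]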
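  6. Summary (remember at least this!)
     1. The expected value of the variable of interest = the mean of that variable.
        For binary data, expected value = mean = proportion.
     2. "The expectation of a sum is the sum of the expectations" (from 数学ガール 乱択アルゴリズム): E[X + Y] = E[X] + E[Y]
     3. A conditional expectation is the expectation within a subgroup: E[Y|A = 1], E[Y|A = 0]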
  7. Key terms (Japanese / English)
     事象 / Event
     排反 / Disjoint
     離散型 / Discrete type
     連続型 / Continuous type
     確率 / Probability
     確率変数 / Random variable
     確率分布 / Probability distribution
     確率密度関数 / Probability density function (PDF)
     期待値 / Expected value
     期待値の線形性 / Linearity of expectation
     条件つき確率 / Conditional probability
     同時確率 / Joint probability
     標準化 / Standardization
     独立性 / Independence
     条件つき独立 / Conditional independence
     全確率の公式 / Law of total probability
     周辺確率 / Marginal probability
     連鎖公式 / Chain rule
     Notation examples shown on the slide: A; Pr(X), Pr(X = x); X; Pr; f; E[X]; Pr[Y|A = a]; Pr[Y|A], Pr(Y = y, A = a); Y ⊥⊥ A|L; Y ⊥⊥ A
  8. Decomposing the expected value (1)
     (Section divider: the roadmap concept map from slide 4 is shown again.)
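  9. Decomposing the expected value (2)
     Definition of the expected value:
       E[X] = Σ_{k=1}^{∞} x_k Pr(X = x_k)
     The expected value E[X] of a random variable X is defined by the formula above, where
     • x_k is a value that the random variable X takes, and
     • Pr(X = x_k) is the probability that the random variable X equals x_k.
     The slide labels the parts of the formula as random variable, observed data, and probability.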
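  10. Three meanings of probability
     • Classical probability: when all cases are equally likely, the ratio of the number of cases in which an event occurs to the number of all cases.
     • Statistical probability: the ratio of the number of times an event occurs to the total number.
     • Axiomatic probability: probability defined by the axioms of probability. This is the definition of probability in modern mathematics.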
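  11. Axioms of probability
     Let Ω be a set and let A and B be subsets of Ω.
     Let Pr be a function from the subsets of Ω to the real numbers, and suppose Pr satisfies the following three axioms.
       Axiom 1: 0 ≤ Pr(A) ≤ 1
       Axiom 2: Pr(Ω) = 1
       Axiom 3: if A ∩ B = ∅, then Pr(A ∪ B) = Pr(A) + Pr(B)
     Then
     • the set Ω is called the sample space,
     • a subset of Ω is called an event,
     • the function Pr is called the probability distribution, and
     • the real number Pr(A) is called the probability that A occurs.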
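  12. Sample space and probability distribution
     Sample space for one throw of a die: Ω = {⚀, ⚁, ⚂, ⚃, ⚄, ⚅}. Each face is an elementary event.
     • The sample space is the set of elementary events.
     • The sample space is exhaustive and has no overlaps.
     • A subset A of the sample space Ω (A ⊂ Ω) is called an event.
     • Ω is also a subset of Ω: the whole event.
       Event ω:   {⚀}   {⚁}   {⚂}   {⚃}   {⚄}   {⚅}
       Pr(ω):     1/6   1/6   1/6   1/6   1/6   1/6
     The probability distribution Pr is a function from the subsets of Ω to the real numbers.
     It can be written as a formula, for example Pr({⚀}) = 1/6.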
  13. Axiom of probability 1
     The probability of an event A is at least 0 and at most 1: 0 ≤ Pr(A) ≤ 1

  14. Axiom of probability 2
     The probability of the whole event equals 1: Pr(Ω) = 1
     For the die: Pr({⚀, ⚁, ⚂, ⚃, ⚄, ⚅}) = 1
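  15. Axiom of probability 3
     If the intersection of A and B is the empty set, the probability of the union of A and B equals the sum of the probability of A and the probability of B:
       if A ∩ B = ∅, then Pr(A ∪ B) = Pr(A) + Pr(B)
     Die examples:
     • A four-face event A and a three-face event B that share two faces: A ∩ B is a two-face event, so the intersection is not empty.
     • Two three-face events A and B with no face in common: A ∩ B = {} = ∅. The events are disjoint, that is, mutually exclusive.
       When the two events together contain all six faces, Pr(Ω) = Pr(A) + Pr(B).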
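  16. Random variables
     A random variable is a function from the sample space to the real numbers.
       Event ω:                      {⚀}   {⚁}   {⚂}   {⚃}   {⚄}   {⚅}
       Value of the variable X(ω):   1     2     3     4     5     6
       Probability Pr:               1/6   1/6   1/6   1/6   1/6   1/6
     Examples:
     • Let the random variable X be the face that comes up when a die is thrown once.
     • Let the random variable X be 100 times the face that came up.
     • Let the random variable X be the height of one person sampled from a population.
     • Let the random variable Y be the presence or absence of a disease in one person sampled from a population.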
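  17. Random variables and probability distributions
     Presence or absence of a disease:
       Observation ω:                         {present}   {absent}
       Value x that X can take:               1           0
       Pr(X = x), probability that X = x:     0.2         0.8
     The values x are the observations coded as numbers.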
  18. Understanding the expected value
     (Section divider: the roadmap concept map from slide 4 is shown again.)
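  19. Definition of the expected value
       E[X] = Σ_{k=1}^{∞} x_k Pr(X = x_k)
            = x_1 Pr(X = x_1) + x_2 Pr(X = x_2) + ⋯ + x_n Pr(X = x_n)   (when X takes n values)
     The expected value E[X] of a random variable X is defined by this formula, where
     • x_k is a value that the random variable X takes, and
     • Pr(X = x_k) is the probability that the random variable X equals x_k.
     Each value of the random variable is weighted by its probability.
     "Examples are the touchstone of understanding" (from 数学ガール).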
  20. Computing the expected value
     Multiply each value x_k that the random variable X takes by its probability, then add over all events.
     Continuous data (height):
       Height    x_k   n    Proportion   Pr(X = x_k)   x_k Pr(X = x_k)
       150 cm    150   10   10/40        0.25          150 * 0.25 = 37.5
       160 cm    160   20   20/40        0.5           160 * 0.5 = 80
       170 cm    170   10   10/40        0.25          170 * 0.25 = 42.5
       E[X] = Σ_k x_k Pr(X = x_k) = 37.5 + 80 + 42.5 = 160
     Discrete (binary) data (outcome):
       Outcome   x_k   n    Proportion   Pr(X = x_k)   x_k Pr(X = x_k)
       Present   1     20   0.2          0.2           1 * 0.2 = 0.2
       Absent    0     80   0.8          0.8           0 * 0.8 = 0
       E[X] = Σ_k x_k Pr(X = x_k) = 0.2 + 0 = 0.2
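The two tables above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the helper name `expectation` and the dictionary layout are assumptions made for illustration, and exact fractions are used so that the results match the slide's arithmetic without rounding.

```python
from fractions import Fraction


def expectation(dist):
    """Return E[X] = sum over k of x_k * Pr(X = x_k).

    `dist` maps each value x_k to its probability Pr(X = x_k).
    """
    return sum(x * p for x, p in dist.items())


# Heights in cm: 10, 20 and 10 people out of 40 (values from the slide).
height_dist = {150: Fraction(10, 40), 160: Fraction(20, 40), 170: Fraction(10, 40)}
# Disease indicator: 1 = present (20 of 100 people), 0 = absent (80 of 100).
disease_dist = {1: Fraction(20, 100), 0: Fraction(80, 100)}

e_height = expectation(height_dist)
e_disease = expectation(disease_dist)
print(e_height, e_disease)  # exact fractions
```

The same function works for any discrete distribution whose probabilities sum to 1.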
  21. The expected value is the mean (continuous data)
       Height    x_k   n    Proportion   Pr(X = x_k)   x_k Pr(X = x_k)
       150 cm    150   10   10/40        0.25          150 * 0.25 = 37.5
       160 cm    160   20   20/40        0.5           160 * 0.5 = 80
       170 cm    170   10   10/40        0.25          170 * 0.25 = 42.5
       E[X] = Σ_k x_k Pr(X = x_k) = 150 ⋅ 10/40 + 160 ⋅ 20/40 + 170 ⋅ 10/40 = 160
       μ = (150 ⋅ 10 + 160 ⋅ 20 + 170 ⋅ 10) / 40 = 160
     Strictly, continuous data should be handled with a probability density function (PDF) f(x) (e.g. Technical Point 1.1):
       E[X] = ∫ x f(x) dx
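  22. The expected value is the mean (discrete data)
     Discrete (binary) data:
       Outcome   x_k   n    Proportion   Pr(X = x_k)   x_k Pr(X = x_k)
       Present   1     20   0.2          0.2           1 * 0.2 = 0.2
       Absent    0     80   0.8          0.8           0 * 0.8 = 0
       E[X] = Σ_k x_k Pr(X = x_k) = 1 ⋅ 20/100 + 0 ⋅ 80/100 = 0.2
       μ = (1 ⋅ 20 + 0 ⋅ 80) / 100 = 0.2
     • For binary data in particular, expected value = mean = proportion of cases in which the random variable equals 1.
     • When disease onset is coded as 1 and 0, the expected value is the proportion who develop the disease.
     A random variable that represents binary data as 1 and 0 is called an indicator.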
  23. Linearity of expectation (1)
     For the expected value E[X + Y] of the random variable X + Y, the following holds:
       E[X + Y] = E[X] + E[Y]
     The expectation of a sum is the sum of the expectations.
       ID   X   Y   X + Y
       1    1   1   2
       2    1   0   1
       3    1   1   2
       4    1   1   2
       5    1   0   1
       6    1   0   1
       7    1   0   1
       8    1   0   1
       9    0   1   1
       10   0   0   0
       X + Y   Probability   (x + y)_k ⋅ Pr(X + Y = (x + y)_k)
       2       3/10          2 * 0.3 = 0.6
       1       6/10          1 * 0.6 = 0.6
       0       1/10          0 * 0.1 = 0
     E[X] = 0.8, E[Y] = 0.4, so E[X] + E[Y] = 1.2, and E[X + Y] = 0.6 + 0.6 + 0 = 1.2.
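The check on this slide can be repeated directly on the ten rows of the table. A minimal sketch, assuming the X and Y columns as transcribed from the slide; the helper `mean` is defined here for illustration and returns exact fractions.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean as an exact fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


# X and Y for IDs 1 to 10, copied from the slide's table.
x = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

e_x = mean(x)
e_y = mean(y)
e_sum = mean(xi + yi for xi, yi in zip(x, y))  # E[X + Y], row by row
print(e_x, e_y, e_sum, e_x + e_y == e_sum)
```

Note that the equality holds here even though X and Y are not independent; linearity of expectation does not require independence.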
  24. Linearity of expectation (2)
     In addition, for any constant k, the following holds:
       E[k ⋅ X] = k ⋅ E[X]
     The expectation of a constant multiple is the constant multiple of the expectation.
       X (Height [cm])   k       kX (Height [m])   Probability   x_k Pr(X = x_k)     k x_k Pr(X = x_k)
       150               1/100   1.5               0.25          150 * 0.25 = 37.5   0.375
       160               1/100   1.6               0.5           160 * 0.5 = 80      0.80
       170               1/100   1.7               0.25          170 * 0.25 = 42.5   0.425
     E[X] = 160, so k E[X] = 1/100 ⋅ 160 = 1.6, and E[kX] = 0.375 + 0.80 + 0.425 = 1.6.
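The unit conversion on this slide can be sketched in Python as follows. The variable names are assumptions for illustration; the distribution and the constant k = 1/100 are the ones in the table.

```python
from fractions import Fraction

# Height distribution in cm, from the slide's table.
height_cm = {150: Fraction(1, 4), 160: Fraction(1, 2), 170: Fraction(1, 4)}
k = Fraction(1, 100)  # converts centimetres to metres

e_cm = sum(x * p for x, p in height_cm.items())     # E[X]
e_m = sum(k * x * p for x, p in height_cm.items())  # E[kX]
print(e_cm, e_m, e_m == k * e_cm)
```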
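  25. Reading the average causal effect
       E[Y^{a=1}] − E[Y^{a=0}] = E[Y^{a=1} − Y^{a=0}], written here as E[Y_1] − E[Y_2] = E[Y_1 − Y_2]
       ID   Y_1   Y_2   Y_1 − Y_2
       1    1     1     0
       2    1     0     1
       3    1     1     0
       4    1     1     0
       5    1     0     1
       6    1     0     1
       7    1     0     1
       8    1     0     1
       9    0     1     −1
       10   0     0     0
       Y_1 − Y_2   Probability   (y_1 − y_2)_k ⋅ Pr(Y_1 − Y_2 = (y_1 − y_2)_k)
       −1          1/10          −0.1
       0           4/10          0
       1           5/10          0.5
     E[Y_1] = 0.8, E[Y_2] = 0.4, so E[Y_1] − E[Y_2] = 0.4, and E[Y_1 − Y_2] = −0.1 + 0 + 0.5 = 0.4.
     • By linearity of expectation, the two sides take the same value.
     • The left-hand side is the difference between the expected outcomes under two states.
     • The right-hand side is the expected value of the within-individual difference in outcome between the two states, which is easier to interpret.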
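Both sides of the identity can be computed from the ten rows of the table. A minimal sketch with illustrative variable names (`lhs`, `rhs`, `diff_dist` are not from the slides); the data are the Y_1 and Y_2 columns as transcribed.

```python
from collections import Counter
from fractions import Fraction

# Y_1 and Y_2 for IDs 1 to 10, copied from the slide's table.
y1 = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y2 = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
n = len(y1)

# Left-hand side: difference of the two expectations.
lhs = Fraction(sum(y1), n) - Fraction(sum(y2), n)

# Right-hand side: expectation of the within-individual difference.
diff = [a - b for a, b in zip(y1, y2)]
rhs = Fraction(sum(diff), n)

# Distribution of the individual differences, as in the slide's second table.
diff_dist = {d: Fraction(c, n) for d, c in Counter(diff).items()}
print(lhs, rhs, diff_dist)
```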
  26. Decomposing the conditional expectation
     (Section divider: the roadmap concept map from slide 4 is shown again.)
  27. Conditional probability and joint probability
     Conditional probability Pr(Y = y|A = a): the probability of Y = y after filtering on A = a.
     Joint probability Pr(Y, A): the probability that several random variables take given values at the same time.
       Outcome Y   Sex A   n    Joint probability Pr(Y, A)
       1           1 (M)   20   20/200 = 0.1
       1           0 (F)   50   50/200 = 0.25
       0           1 (M)   80   80/200 = 0.4
       0           0 (F)   50   50/200 = 0.25
     Within the subgroup A = 1:
       Outcome Y   Sex A   n    Conditional probability Pr(Y|A = 1)
       1           1       20   20/100 = 0.2
       0           1       80   80/100 = 0.8
     Within the subgroup A = 0:
       Outcome Y   Sex A   n    Conditional probability Pr(Y|A = 0)
       1           0       50   50/100 = 0.5
       0           0       50   50/100 = 0.5
  28. Conditional expectation
       E[Y|A = a] = Σ_{k=1}^{∞} y_k Pr(Y = y_k|A = a)
                  = y_1 Pr(Y = y_1|A = a) + y_2 Pr(Y = y_2|A = a) + ⋯ + y_n Pr(Y = y_n|A = a)
     Split the data into subgroups by sex, then compute the expected value within each subgroup.
       Outcome Y   Sex A   n    Pr(Y|A = 1)      y_k Pr(Y = y_k|A = 1)
       1           1       20   20/100 = 0.2     1 * 0.2 = 0.2
       0           1       80   80/100 = 0.8     0 * 0.8 = 0
       E[Y|A = 1] = 0.2
       Outcome Y   Sex A   n    Pr(Y|A = 0)      y_k Pr(Y = y_k|A = 0)
       1           0       50   50/100 = 0.5     1 * 0.5 = 0.5
       0           0       50   50/100 = 0.5     0 * 0.5 = 0
       E[Y|A = 0] = 0.5
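The subgroup calculation can be sketched in Python from the cell counts. The function name `conditional_expectation` and the `(y, a)` key layout are assumptions for illustration; the counts are the ones in the slide's tables.

```python
from fractions import Fraction

# Counts n for each (outcome y, sex a) cell; a = 1 is male, a = 0 is female.
counts = {(1, 1): 20, (0, 1): 80, (1, 0): 50, (0, 0): 50}


def conditional_expectation(a):
    """E[Y | A = a]: the expectation of Y within the subgroup A = a."""
    subgroup = {y: n for (y, a_), n in counts.items() if a_ == a}
    n_a = sum(subgroup.values())
    # Weight each outcome value by its conditional probability n / n_a.
    return sum(y * Fraction(n, n_a) for y, n in subgroup.items())


e_y_given_a1 = conditional_expectation(1)
e_y_given_a0 = conditional_expectation(0)
print(e_y_given_a1, e_y_given_a0)
```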
  29. Reading the association measure
       ID   Outcome Y   Exposure A
       1    1           1
       2    0           1
       3    0           1
       4    0           1
       5    1           1
       6    0           0
       7    1           0
       8    1           0
       9    1           0
       10   0           0
     E[Y|A = 1] − E[Y|A = 0] is the difference between the expected disease occurrence in the exposed group and the expected disease occurrence in the unexposed group.
       Outcome Y   Exposure A   n   Pr(Y|A = 1)    y_k Pr(Y = y_k|A = 1)
       1           1            2   2/5 = 0.4      0.4
       0           1            3   3/5 = 0.6      0
       E[Y|A = 1] = 0.4
       Outcome Y   Exposure A   n   Pr(Y|A = 0)    y_k Pr(Y = y_k|A = 0)
       1           0            3   3/5 = 0.6      0.6
       0           0            2   2/5 = 0.4      0
       E[Y|A = 0] = 0.6
     E[Y|A = 1] − E[Y|A = 0] = 0.4 − 0.6 = −0.2
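The association measure can be computed from the individual-level rows. A minimal sketch with an illustrative helper `mean_outcome`; the Y and A columns are as transcribed from the slide.

```python
from fractions import Fraction

# Outcome Y and exposure A for IDs 1 to 10, copied from the slide's table.
y = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]


def mean_outcome(level):
    """E[Y | A = level]: mean outcome among individuals with A == level."""
    group = [yi for yi, ai in zip(y, a) if ai == level]
    return Fraction(sum(group), len(group))


association = mean_outcome(1) - mean_outcome(0)
print(mean_outcome(1), mean_outcome(0), association)
```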
  30. Conditional expectation and regression (1)
     (Section divider: the roadmap concept map from slide 4 is shown again.)
  31. Conditional expectation and regression (2)
     Regression means finding a conditional expectation; a conditional expectation E[Y|A = a] can be obtained by regression.
     [Figure: scatter plots of Height [cm] (160 to 180) against Age [yr] (12 to 16), with fitted curves labelled E[Height|Age], E[Height|Age, sex = f] and E[Height|Age, sex = m]; points are marked by sex (f, m).]
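The link between regression and conditional expectation can be illustrated with a binary exposure. The height and age data behind the slide's figure are not in the transcript, so this sketch instead reuses the ten-person outcome and exposure data from slide 29, which is an assumption made for illustration. It fits ordinary least squares by hand (slope = covariance / variance) and shows that the fitted values are the two subgroup means.

```python
from fractions import Fraction

# Outcome Y and exposure A from slide 29 (reused here for illustration).
y = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
n = len(y)

a_bar = Fraction(sum(a), n)
y_bar = Fraction(sum(y), n)

# Ordinary least squares for the model E[Y | A] = intercept + slope * A.
slope = (sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y))
         / sum((ai - a_bar) ** 2 for ai in a))
intercept = y_bar - slope * a_bar

# Fitted values at A = 0 and A = 1 are the conditional expectations.
fitted = {level: intercept + slope * level for level in (0, 1)}
print(intercept, slope, fitted)
```

With a single binary regressor the model is saturated, so the intercept equals E[Y|A = 0] and the slope equals E[Y|A = 1] − E[Y|A = 0].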
  32. Techniques worth knowing (1)
     (Section divider: the roadmap concept map from slide 4 is shown again.)
  33. Conditional probability
       Pr(Y|A) = Pr(Y, A) / Pr(A)
       Outcome Y   Sex A   n    Joint probability Pr(Y, A)   Conditional probability Pr(Y|A = a)
       1           1       20   20/200 = 0.1                 0.1/0.5 = 0.2
       1           0       50   50/200 = 0.25                0.25/0.5 = 0.5
       0           1       80   80/200 = 0.4                 0.4/0.5 = 0.8
       0           0       50   50/200 = 0.25                0.25/0.5 = 0.5
       Pr(A = 1) = (20 + 80)/200 = 0.5
       Pr(A = 0) = (50 + 50)/200 = 0.5
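The division Pr(Y|A) = Pr(Y, A) / Pr(A) can be sketched in Python starting from the cell counts. The function names `pr_a` and `pr_y_given_a` are assumptions for illustration; the counts are the ones in the table.

```python
from fractions import Fraction

# Counts n for each (outcome y, sex a) cell, from the slide's table.
counts = {(1, 1): 20, (1, 0): 50, (0, 1): 80, (0, 0): 50}
total = sum(counts.values())

# Joint probability Pr(Y = y, A = a).
joint = {cell: Fraction(n, total) for cell, n in counts.items()}


def pr_a(a):
    """Marginal probability Pr(A = a), summing the joint over y."""
    return sum(p for (_, a_), p in joint.items() if a_ == a)


def pr_y_given_a(y, a):
    """Conditional probability Pr(Y = y | A = a) = Pr(Y = y, A = a) / Pr(A = a)."""
    return joint[(y, a)] / pr_a(a)


print(pr_a(1), pr_y_given_a(1, 1), pr_y_given_a(1, 0))
```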
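  34. Chain rule (1)
     Multiplication theorem:
       Pr(Y, A) = Pr(Y|A) ⋅ Pr(A)
     This is the definition Pr(Y|A) = Pr(Y, A) / Pr(A) rearranged: the conditioned variable is split off from the joint probability.
     The theorem extends to n events (chain rule):
       Pr(A_1, A_2, ⋯, A_n) = Pr(A_n|A_1, A_2, ⋯, A_{n−1}) ⋯ Pr(A_2|A_1) Pr(A_1)
     Here the conditioned variables are split off from the joint probability one after another.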
  35. Chain rule (2)
       Pr(A_1, A_2, ⋯, A_n) = Pr(A_n|A_1, A_2, ⋯, A_{n−1}) ⋯ Pr(A_2|A_1) Pr(A_1)
     "Examples are the touchstone of understanding" (from 数学ガール).
     Three events:
       Pr(A_1, A_2, A_3) = Pr(A_1|A_2, A_3) Pr(A_2, A_3)
                         = Pr(A_1|A_2, A_3) Pr(A_2|A_3) Pr(A_3)
     Four events:
       Pr(A_1, A_2, A_3, A_4) = Pr(A_1|A_2, A_3, A_4) Pr(A_2, A_3, A_4)
                              = Pr(A_1|A_2, A_3, A_4) Pr(A_2|A_3, A_4) Pr(A_3, A_4)
                              = Pr(A_1|A_2, A_3, A_4) Pr(A_2|A_3, A_4) Pr(A_3|A_4) Pr(A_4)
     Any event can be split off first:
       Pr(A_1, A_2, A_3) = Pr(A_2|A_1, A_3) Pr(A_1, A_3)
                         = Pr(A_2|A_1, A_3) Pr(A_1|A_3) Pr(A_3)
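The three-event factorization can be checked numerically. The joint distribution below is hypothetical (the slides give no three-variable table); the weights 1 to 8 are chosen only so that every cell has positive probability. The helper `pr` and the variable names are also assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over three binary variables (a1, a2, a3).
outcomes = list(product([0, 1], repeat=3))
weights = [1, 2, 3, 4, 5, 6, 7, 8]
joint = {o: Fraction(w, sum(weights)) for o, w in zip(outcomes, weights)}

INDEX = {"a1": 0, "a2": 1, "a3": 2}


def pr(**fixed):
    """Probability that the named variables take the given values."""
    return sum(p for o, p in joint.items()
               if all(o[INDEX[name]] == v for name, v in fixed.items()))


def chain_rule_product(a1, a2, a3):
    """Pr(A1 | A2, A3) * Pr(A2 | A3) * Pr(A3), each factor built from the joint."""
    pr_a1_given_a2_a3 = pr(a1=a1, a2=a2, a3=a3) / pr(a2=a2, a3=a3)
    pr_a2_given_a3 = pr(a2=a2, a3=a3) / pr(a3=a3)
    return pr_a1_given_a2_a3 * pr_a2_given_a3 * pr(a3=a3)


chain_rule_holds = all(chain_rule_product(*o) == joint[o] for o in outcomes)
print(chain_rule_holds)
```

Because the chain rule is an identity, the check passes for any joint distribution with positive cell probabilities, not only this one.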
  36. From the multiplication theorem to conditional expectation
       Pr(Y, A) = Pr(Y|A) ⋅ Pr(A)
       E[Y, A] = E[Y|A] ⋅ Pr(A)
     The conditional expectation is weighted by Pr(A); this is used in standardization. Here E[Y, A = a] can be read as E[Y ⋅ 1{A = a}], the expectation of Y counted only where A = a.
       Outcome Y   Exposure A   n   Pr(Y|A = 1)    y_k Pr(Y = y_k|A = 1)
       1           1            2   2/5 = 0.4      0.4
       0           1            3   3/5 = 0.6      0
       E[Y|A = 1] = 0.4
       Outcome Y   Exposure A   n   Pr(Y|A = 0)    y_k Pr(Y = y_k|A = 0)
       1           0            3   3/5 = 0.6      0.6
       0           0            2   2/5 = 0.4      0
       E[Y|A = 0] = 0.6
     Pr(A = 1) = 0.5, Pr(A = 0) = 0.5
       E[Y, A = 1] = 0.4 ⋅ 0.5 = 0.2
       E[Y, A = 0] = 0.6 ⋅ 0.5 = 0.3
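  37. Independence
     Two events Y and A are independent when the following holds:
       Pr(Y|A) = Pr(Y),   Pr(A|Y) = Pr(A)
     When they are independent, we obtain
       Pr(Y, A) = Pr(Y) ⋅ Pr(A)
     This is written Y ⊥⊥ A. (Author's aside: "I managed to write this with Keynote's TeX!!")
     Used for exchangeability: even when intervention A is assigned at random, it does not affect the potential outcomes.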
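Whether a joint table satisfies Pr(Y, A) = Pr(Y) ⋅ Pr(A) can be checked cell by cell. A minimal sketch: `is_independent` is an illustrative helper, `slide_joint` is the joint table used on slides 27 and 41, and `independent_joint` is a hypothetical table built from the same marginals to show a case where the product rule holds.

```python
from fractions import Fraction


def is_independent(joint):
    """True if Pr(Y = y, A = a) == Pr(Y = y) * Pr(A = a) in every cell."""
    pr_y = {y: sum(p for (y_, _), p in joint.items() if y_ == y) for y in (0, 1)}
    pr_a = {a: sum(p for (_, a_), p in joint.items() if a_ == a) for a in (0, 1)}
    return all(joint[(y, a)] == pr_y[y] * pr_a[a] for y in (0, 1) for a in (0, 1))


# Joint probabilities Pr(Y = y, A = a) keyed by (y, a), from slides 27 and 41.
slide_joint = {(1, 1): Fraction(1, 10), (1, 0): Fraction(1, 4),
               (0, 1): Fraction(2, 5), (0, 0): Fraction(1, 4)}

# Hypothetical table with the same marginals but independent Y and A.
marginal_y = {1: Fraction(7, 20), 0: Fraction(13, 20)}
marginal_a = {1: Fraction(1, 2), 0: Fraction(1, 2)}
independent_joint = {(y, a): py * pa
                     for y, py in marginal_y.items()
                     for a, pa in marginal_a.items()}

print(is_independent(slide_joint), is_independent(independent_joint))
```

In the slide's table Pr(Y = 1, A = 1) = 0.1 while Pr(Y = 1) ⋅ Pr(A = 1) = 0.35 ⋅ 0.5 = 0.175, so outcome and sex are not independent there.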
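  38. Conditional independence
     Three events Y, A and L: Y and A are independent conditional on L when the following holds:
       Pr(Y|A, L) = Pr(Y|L)
     When they are conditionally independent, we obtain
       Pr(Y, A|L) = Pr(Y|L) ⋅ Pr(A|L)
     This is written Y ⊥⊥ A|L.
     Used for conditional exchangeability: within each subgroup, intervention A is regarded as randomly assigned, so A does not affect the potential outcomes.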
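  39. Expressing a DAG as formulas
     • A DAG can be expressed as formulas by combining the chain rule with independence.
     • Constraints from the graphical model (the rule of product decomposition) are required.
     • The dependence relations are imposed on the chain rule.
     Factorizations shown with each graph:
       Nodes Y, A, L (factorization of a chain A → L → Y):      Pr(Y, A, L) = Pr(Y|L) Pr(L|A) Pr(A)
       Nodes A, Y (A → Y):                                       Pr(Y, A) = Pr(Y|A) Pr(A)
       Nodes A, Y (no arrow between A and Y):                    Pr(Y, A) = Pr(Y) Pr(A)
       Nodes Y, A, L (no independence imposed):                  Pr(Y, A, L) = Pr(Y|A, L) Pr(A|L) Pr(L)
                                                                 Pr(Y, A, L) = Pr(Y|A, L) Pr(L, A) = Pr(Y|A, L) Pr(L|A) Pr(A)
       Nodes L, Y, A (Y and A independent, both pointing to L):  Pr(Y, A, L) = Pr(L|Y, A) Pr(Y) Pr(A)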
  40. Techniques worth knowing (2)
     (Section divider: the roadmap concept map from slide 4 is shown again.)
  41. Law of total probability and marginalization
     Take events A and B. If the elements B_1, ⋯, B_n of B are mutually exclusive (and together cover the sample space), the following holds:
       Pr(A) = Pr(A, B_1) + Pr(A, B_2) + ⋯ + Pr(A, B_n)
     Joint probability Pr(Y, A) with its marginal probabilities:
                  A = 1   A = 0   Pr(Y)
       Y = 1      0.1     0.25    0.35
       Y = 0      0.4     0.25    0.65
       Pr(A)      0.5     0.5     1
       Pr(A = 1) = Pr(A = 1, Y = 1) + Pr(A = 1, Y = 0) = 0.1 + 0.4 = 0.5
       Pr(A = 0) = Pr(A = 0, Y = 1) + Pr(A = 0, Y = 0) = 0.25 + 0.25 = 0.5
       Pr(Y = 1) = Pr(Y = 1, A = 1) + Pr(Y = 1, A = 0) = 0.1 + 0.25 = 0.35
       Pr(Y = 0) = Pr(Y = 0, A = 1) + Pr(Y = 0, A = 0) = 0.4 + 0.25 = 0.65
     Computing marginal probabilities in this way is called marginalization.
  42. Conditional probability and marginalization
     The law of total probability can be written with conditional probabilities:
       Pr(A) = Pr(A, B_1) + Pr(A, B_2) + ⋯ + Pr(A, B_n)
             = Pr(A|B_1) Pr(B_1) + Pr(A|B_2) Pr(B_2) + ⋯ + Pr(A|B_n) Pr(B_n)
     Conditioning on exposure A, each cell is Pr(Y|A) Pr(A):
                  Pr(Y|A = 1) Pr(A = 1)   Pr(Y|A = 0) Pr(A = 0)   Pr(Y)
       Y = 1      0.2 * 0.5 = 0.1         0.5 * 0.5 = 0.25        0.35
       Y = 0      0.8 * 0.5 = 0.4         0.5 * 0.5 = 0.25        0.65
       Pr(A)      0.5                     0.5                     1
     The same idea is used in standardization:
       E[Y] = E[Y|A_1] ⋅ Pr(A_1) + E[Y|A_2] ⋅ Pr(A_2) + ⋯ + E[Y|A_n] ⋅ Pr(A_n) = Σ_{k=1}^{n} E[Y|A_k] ⋅ Pr(A_k)
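The weighted sum Σ_k E[Y|A_k] ⋅ Pr(A_k) can be sketched in Python with the numbers from the table. The dictionary names are assumptions for illustration. Because Y is binary here, E[Y|A = a] equals Pr(Y = 1|A = a).

```python
from fractions import Fraction

# Marginal distribution of the exposure A, from the slide's table.
pr_a = {1: Fraction(1, 2), 0: Fraction(1, 2)}
# E[Y | A = a] for binary Y, i.e. Pr(Y = 1 | A = a): 0.2 and 0.5 on the slide.
e_y_given_a = {1: Fraction(1, 5), 0: Fraction(1, 2)}

# Law of total probability: E[Y] = sum over a of E[Y | A = a] * Pr(A = a).
e_y = sum(e_y_given_a[a] * pr_a[a] for a in pr_a)
print(e_y)
```

The result matches the marginal Pr(Y = 1) = 0.35 in the last column of the table.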
  43. Roadmap
     Goal: understand the building blocks of the average causal effect and the association measure.
     (The roadmap concept map from slide 4 is shown again.)
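  44. Summary (remember at least this!)
     1. The expected value of the variable of interest = the mean of that variable.
        For binary data, expected value = mean = proportion.
     2. "The expectation of a sum is the sum of the expectations" (from 数学ガール 乱択アルゴリズム): E[X + Y] = E[X] + E[Y]
     3. A conditional expectation is the expectation within a subgroup: E[Y|A = 1], E[Y|A = 0]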
  45. References
     • Hernán MA, Robins JM. (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.
     • Pearl J, (落海浩, 訳). (2019). 統計的因果推論. 朝倉書店.
     • 藤澤洋徳. (2006). 確率と統計. 朝倉書店.
     • A. コルモゴロフ, I. ジュルベンコ, A. プロホロフ, 丸山哲郎+馬場良和. (2003). コルモゴロフの確率論入門. 森北出版.
     • 舟木直久. (2004). 確率論. 朝倉書店.
     • 結城浩. (2011). 数学ガール 乱択アルゴリズム. SB Creative.
     • 結城浩. (2016). 数学ガールの秘密ノート やさしい統計. SB Creative.
     • 黒木学. (2020). 数理統計学 統計的推論の基礎. 共立出版.
     • Porta M. (2014). A Dictionary of Epidemiology, sixth edition. Oxford.