Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Statistical Rethinking 2023 - Lecture 04

Statistical Rethinking 2023 - Lecture 04

Richard McElreath

January 11, 2023
Tweet

More Decks by Richard McElreath

Other Decks in Education

Transcript

  1. Statistical Rethinking
    4. Categories & Curves
    2023

    View Slide

  2. View Slide

  3. www.3blue1brown.com

    View Slide

  4. Drawing Inferences
    How to use statistical models to get at
    scientific estimands?
    Need to incorporate causal thinking into
    (1) How we draw the statistical models

    (2) How we process the results

    View Slide

  5. Drawing Inferences
    (1) How we draw the statistical models
    Generative model + multiple estimands 

    => multiple estimators
    (2) How we process the results
    Only very simple causal estimates show up
    in summary tables

    View Slide

  6. Categories & Curves
    Linear models can do extra-linear things
    Categories (indicator & index variables)
    Splines and other additive structures
    We require these tools to build causal
    estimators

    View Slide

  7. Categories
    How to cope with causes that are
    not continuous?
    Categories: discrete, unordered
    types
    Want to stratify by category: 

    Fit a separate line for each

    View Slide

  8. 140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    women
    men
    Adult height, weight, and sex

    View Slide

  9. Think scientifically first
    How are height, weight, and sex causally related?
    How are height, weight, and sex statistically related?
    W
    H
    S

    View Slide

  10. The causes aren’t in the data
    W
    H
    W
    H
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)

    View Slide

  11. The causes aren’t in the data
    W
    H
    W
    H
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)

    View Slide

  12. The causes aren’t in the data
    S
    H
    S
    H
    women
    men
    140 150 160 170 180
    0.00 0.04 0.08
    height (cm)
    Density

    View Slide

  13. The causes aren’t in the data
    S
    H
    S
    H
    women
    men
    30 40 50 60
    0.00 0.04 0.08
    weight (kg)
    Density
    140 150 160 170 180
    0.00 0.04 0.08
    height (cm)
    Density
    women
    men
    S
    W
    S
    W

    View Slide

  14. The causes aren’t in the data
    S
    H
    S
    H
    women
    men
    30 40 50 60
    0.00 0.04 0.08
    weight (kg)
    Density
    140 150 160 170 180
    0.00 0.04 0.08
    height (cm)
    Density
    women
    men
    S
    W
    S
    W

    View Slide

  15. W
    H
    S
    height
    influences
    weight
    sex influences both
    height & weight
    weight is
    influenced
    by both
    height & sex

    View Slide

  16. W
    H
    S
    H = fH
    (S)
    W = fW
    (H, S)

    View Slide

  17. height
    sex
    weight

    View Slide

  18. W
    H
    S
    H = fH
    (S, U)
    W = fW
    (H, S, V)
    U
    V
    W
    unobserved
    causes
    ignorable unless shared
    S = fS
    (W)

    View Slide

  19. W
    H
    S T
    V
    temperature
    U
    Figure: Lockley & Eizaguirre 2021

    View Slide

  20. W
    H
    S
    H = fH
    (S, U)
    W = fW
    (H, S, V)
    U
    V
    W
    S = fS
    (W)

    View Slide

  21. W
    H
    S
    U
    V
    W
    # S=1 female; S=2 male
    sim_HW N H W data.frame(S,H,W)
    }
    Synthetic people

    View Slide

  22. W
    H
    S
    U
    V
    W
    # S=1 female; S=2 male
    sim_HW N H W data.frame(S,H,W)
    }
    Synthetic people
    for each S, different height-weight line

    View Slide

  23. W
    H
    S
    U
    V
    W
    # S=1 female; S=2 male
    sim_HW N H W data.frame(S,H,W)
    }
    S dat head(dat)
    S H W
    1 1 149.2428 80.07163
    2 2 164.8629 94.79647
    3 2 158.1800 90.59944
    4 1 148.2153 70.68285
    5 2 155.5794 91.43798
    6 2 154.0733 92.18237

    View Slide

  24. Think scientifically first
    Different causal questions may need
    different statistical models
    Q: Causal effect of H on W?
    Q: Causal effect of S on W?
    Q: Direct causal effect of S on W?
    W
    H
    S

    View Slide

  25. Think scientifically first
    Different causal questions need
    different statistical models
    Q: Causal effect of H on W?
    Q: Causal effect of S on W?
    Q: Direct causal effect of S on W?
    W
    H
    S

    View Slide

  26. Think scientifically first
    Different causal questions need
    different statistical models
    Q: Causal effect of H on W?
    Q: Causal effect of S on W?
    Q: Direct causal effect of S on W?
    W
    H
    S

    View Slide

  27. Think scientifically first
    Different causal questions need
    different statistical models
    Q: Causal effect of H on W?
    Q: Causal effect of S on W?
    Q: Direct causal effect of S on W?
    W
    H
    S

    View Slide

  28. From estimand to estimate
    Q: Causal effect of S on W?
    W
    H
    S
    W
    H
    S
    Q: Direct causal effect of S on W?
    Stratify by S: different estimate for each value S can take

    View Slide

  29. Drawing Categorical Owls
    Several ways to code categorical variables
    (1) indicator (0/1) variables
    (2) index variables: 1,2,3,4,…
    We will use index variables: 

    Extend to many categories with no change in code

    Better for specifying priors

    Extend effortlessly to multi-level models

    View Slide

  30. Drawing Categorical Owls
    Example:
    Category Cyan Magenta Yellow Black
    Index Value 1 2 3 4
    α = [α1
    , α2
    , α3
    , α4
    ]
    Influence of color coded by:

    View Slide

  31. Drawing Categorical Owls
    α = [α1
    , α2
    , α3
    , α4
    ]
    AAACNHicbVBNS8NAEN34bf2qevSyWBQFKYmIehEEL4IgClaFJoTJdtsu3U3C7kQsoT/Kiz/EiwgeFPHqb3Abc/DrwcKb92aY2RelUhh03SdnZHRsfGJyaroyMzs3v1BdXLo0SaYZb7BEJvo6AsOliHkDBUp+nWoOKpL8KuodDf2rG66NSOIL7Kc8UNCJRVswQCuF1ZN+KOi6b4SiPvJbzE8TrUAONnyVhWLLGh0Fm75fKWq6fkB9kGkXwrxoNywvbhg0RTAIqzW37hagf4lXkhopcRZWH/xWwjLFY2QSjGl6bopBDhoFk3xQ8TPDU2A96PCmpTEoboJyIV2zSou2E21fjLRQv0/koIzpq8h2KsCu+e0Nxf+8Zobt/SAXcZohj9nXonYmKSZ0mCBtCc0Zyr4lwLSwt1LWBQ0Mbc4VG4L3+8t/yeV23dut75zv1A6PyzimyApZJRvEI3vkkByTM9IgjNyRR/JCXp1759l5c96/WkeccmaZ/IDz8Qn0zKu5
    yi
    ⇠ Normal(µi, )
    µi = ↵color[i]

    View Slide

  32. Using Index Variables
    AAACNXicbZBNTxsxEIa90BYa+hHg2ItFBApqG+1WCLggIXrJoUJUaghSHK1mnUliYe+u7FlEtMqf4sL/4AQHDq2qXvkLOCGH8jGSpcfvOyN73iTXylEY3gRz869ev1lYfFtZevf+w8fq8sqxyworsSUzndmTBBxqlWKLFGk8yS2CSTS2k9PvE799htapLP1Foxy7Bgap6isJ5KW4+qMdK74hnDJcEJ5TeZhZA3pcF6aI1RdvDAxsClGZ3vnGHheg8yHwz1wkSFBvxuqrR7C8uRlXa2EjnBZ/DtEMamxWR3H1SvQyWRhMSWpwrhOFOXVLsKSkxnFFFA5zkKcwwI7HFAy6bjndeszXvdLj/cz6kxKfqv9PlGCcG5nEdxqgoXvqTcSXvE5B/d1uqdK8IEzlw0P9QnPK+CRC3lMWJemRB5BW+b9yOQQLknzQFR9C9HTl53D8rRFtN7Z+btX2D2ZxLLJPbI3VWcR22D5rsiPWYpJdsGv2m/0JLoPb4G/w76F1LpjNrLJHFdzdAxUtqR8=
    Wi
    ⇠ Normal(µi, )
    µi = ↵ + (Hi
    ¯
    H)
    intercept

    View Slide

  33. Using Index Variables
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    sex of i-th

    person
    AAACDHicbZDLSsNAFIYn9VbrrerSzWARXEhJSlE3QtGNywr2Ak0oJ9NJO3QyCTMToYQ+gBtfxY0LRdz6AO58GydtFtr6w8DHf85hzvn9mDOlbfvbKqysrq1vFDdLW9s7u3vl/YO2ihJJaItEPJJdHxTlTNCWZprTbiwphD6nHX98k9U7D1QqFol7PYmpF8JQsIAR0MbqlytuCHrkBy7weAT4Cvfm1HfOcqh5psuu2jPhZXByqKBczX75yx1EJAmp0ISDUj3HjrWXgtSMcDotuYmiMZAxDGnPoICQKi+dHTPFJ8YZ4CCS5gmNZ+7viRRCpSahbzqz1dViLTP/q/USHVx6KRNxoqkg84+ChGMd4SwZPGCSEs0nBoBIZnbFZAQSiDb5lUwIzuLJy9CuVZ3zav2uXmlc53EU0RE6RqfIQReogW5RE7UQQY/oGb2iN+vJerHerY95a8HKZw7RH1mfP1Wnmog=
    ↵ = [↵1, ↵2]
    S = 1 indicates female
    S = 2 indicates male
    Two intercepts, one
    for each value in S

    View Slide

  34. Using Index Variables
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    AAACDHicbZDLSsNAFIYn9VbrrerSzWARXEhJSlE3QtGNywr2Ak0oJ9NJO3QyCTMToYQ+gBtfxY0LRdz6AO58GydtFtr6w8DHf85hzvn9mDOlbfvbKqysrq1vFDdLW9s7u3vl/YO2ihJJaItEPJJdHxTlTNCWZprTbiwphD6nHX98k9U7D1QqFol7PYmpF8JQsIAR0MbqlytuCHrkBy7weAT4Cvfm1HfOcqh5psuu2jPhZXByqKBczX75yx1EJAmp0ISDUj3HjrWXgtSMcDotuYmiMZAxDGnPoICQKi+dHTPFJ8YZ4CCS5gmNZ+7viRRCpSahbzqz1dViLTP/q/USHVx6KRNxoqkg84+ChGMd4SwZPGCSEs0nBoBIZnbFZAQSiDb5lUwIzuLJy9CuVZ3zav2uXmlc53EU0RE6RqfIQReogW5RE7UQQY/oGb2iN+vJerHerY95a8HKZw7RH1mfP1Wnmog=
    ↵ = [↵1, ↵2]
    i = 1
    S[1]=2

    View Slide

  35. Using Index Variables
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    AAACDHicbZDLSsNAFIYn9VbrrerSzWARXEhJSlE3QtGNywr2Ak0oJ9NJO3QyCTMToYQ+gBtfxY0LRdz6AO58GydtFtr6w8DHf85hzvn9mDOlbfvbKqysrq1vFDdLW9s7u3vl/YO2ihJJaItEPJJdHxTlTNCWZprTbiwphD6nHX98k9U7D1QqFol7PYmpF8JQsIAR0MbqlytuCHrkBy7weAT4Cvfm1HfOcqh5psuu2jPhZXByqKBczX75yx1EJAmp0ISDUj3HjrWXgtSMcDotuYmiMZAxDGnPoICQKi+dHTPFJ8YZ4CCS5gmNZ+7viRRCpSahbzqz1dViLTP/q/USHVx6KRNxoqkg84+ChGMd4SwZPGCSEs0nBoBIZnbFZAQSiDb5lUwIzuLJy9CuVZ3zav2uXmlc53EU0RE6RqfIQReogW5RE7UQQY/oGb2iN+vJerHerY95a8HKZw7RH1mfP1Wnmog=
    ↵ = [↵1, ↵2]
    i = 2
    S[2]=1

    View Slide

  36. Using Index Variables
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    Priors
    AAACDHicbZDLSsNAFIYn9VbrrerSzWARXEhJSlE3QtGNywr2Ak0oJ9NJO3QyCTMToYQ+gBtfxY0LRdz6AO58GydtFtr6w8DHf85hzvn9mDOlbfvbKqysrq1vFDdLW9s7u3vl/YO2ihJJaItEPJJdHxTlTNCWZprTbiwphD6nHX98k9U7D1QqFol7PYmpF8JQsIAR0MbqlytuCHrkBy7weAT4Cvfm1HfOcqh5psuu2jPhZXByqKBczX75yx1EJAmp0ISDUj3HjrWXgtSMcDotuYmiMZAxDGnPoICQKi+dHTPFJ8YZ4CCS5gmNZ+7viRRCpSahbzqz1dViLTP/q/USHVx6KRNxoqkg84+ChGMd4SwZPGCSEs0nBoBIZnbFZAQSiDb5lUwIzuLJy9CuVZ3zav2uXmlc53EU0RE6RqfIQReogW5RE7UQQY/oGb2iN+vJerHerY95a8HKZw7RH1mfP1Wnmog=
    ↵ = [↵1, ↵2] AAACC3icbVA9SwNBEN3zM8avqKXNEhUiSLgTUcugjZUoGCPkQpjbbMya3btjd04MR3ob/4qNoCJ24h+w84do7Sax0OiDgcd7M8zMC2IpDLruuzMyOjY+MZmZyk7PzM7N5xYWT02UaMbLLJKRPgvAcClCXkaBkp/FmoMKJK8E7f2eX7nk2ogoPMFOzGsKzkPRFAzQSvVc3gcZt6B+QX0jFPWRX2F6GGkFslvYdjc8d72eW3GLbh/0L/G+yUpp9eP+5XL686iee/MbEUsUD5FJMKbquTHWUtAomOTdrJ8YHgNrwzmvWhqC4qaW9n/p0jWrNGgz0rZCpH3150QKypiOCmynAmyZYa8n/udVE2zu1lIRxgnykA0WNRNJMaK9YGhDaM5QdiwBpoW9lbIWaGBo48vaELzhl/+S082it13cOrZp7JEBMmSZ5EmBeGSHlMgBOSJlwsg1uSUP5NG5ce6cJ+d50DrifM8skV9wXr8A+UieIw==
    ↵j
    ⇠ Normal(60, 10)

    View Slide

  37. Testing
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    AAACCnicbVDLSgMxFM3UV62vUZduokWpIGVGirosunFZwWkLnaFk0rQNTWaG5I5Yhq7d+CtuXCji1i9w59+YPhZavRByOOdc7r0nTATX4DhfVm5hcWl5Jb9aWFvf2Nyyt3fqOk4VZR6NRayaIdFM8Ih5wEGwZqIYkaFgjXBwNdYbd0xpHke3MExYIEkv4l1OCRiqbe/7mvckwUfml9gHdg+ZZwyxkqOSc+I6x2276JSdSeG/wJ2BIppVrW1/+p2YppJFQAXRuuU6CQQZUcCpYKOCn2qWEDogPdYyMCKS6SCbnDLCh4bpYDPevAjwhP3ZkRGp9VCGxikJ9PW8Nib/01opdC+CjEdJCiyi00HdVGCI8TgX3OGKURBDAwhV3OyKaZ8oQsGkVzAhuPMn/wX107J7Vq7cVIrVy1kcebSHDlAJuegcVdE1qiEPUfSAntALerUerWfrzXqfWnPWrGcX/Srr4xt9+5l9
    ⇠ Uniform(0, 10)
    AAACC3icbVA9SwNBEN3zM8avqKXNEhUiSLgTUcugjZUoGCPkQpjbbMya3btjd04MR3ob/4qNoCJ24h+w84do7Sax0OiDgcd7M8zMC2IpDLruuzMyOjY+MZmZyk7PzM7N5xYWT02UaMbLLJKRPgvAcClCXkaBkp/FmoMKJK8E7f2eX7nk2ogoPMFOzGsKzkPRFAzQSvVc3gcZt6B+QX0jFPWRX2F6GGkFslvYdjc8d72eW3GLbh/0L/G+yUpp9eP+5XL686iee/MbEUsUD5FJMKbquTHWUtAomOTdrJ8YHgNrwzmvWhqC4qaW9n/p0jWrNGgz0rZCpH3150QKypiOCmynAmyZYa8n/udVE2zu1lIRxgnykA0WNRNJMaK9YGhDaM5QdiwBpoW9lbIWaGBo48vaELzhl/+S082it13cOrZp7JEBMmSZ5EmBeGSHlMgBOSJlwsg1uSUP5NG5ce6cJ+d50DrifM8skV9wXr8A+UieIw==
    ↵j
    ⇠ Normal(60, 10)
    # female sample
    S simF # male sample
    S simM # effect of sex (male-female)
    mean(simM$W - simF$W)
    What is the total causal effect of sex? Causal
    effect of sex is difference made intervening:
    [1] 21.13852

    View Slide

  38. # observe sample
    S dat # estimate posterior
    m_SW alist(
    W ~ dnorm(mu,sigma),
    mu a[S] ~ dnorm(60,10),
    sigma ~ dunif(0,10)
    ), data=dat )
    precis(m_SW,depth=2)
    Now run the estimating model and synthetic sample
    a[1] 75.47 0.93 73.98 76.95
    a[2] 97.39 0.84 96.05 98.74
    sigma 6.26 0.44 5.55 6.96
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    AAACCnicbVDLSgMxFM3UV62vUZduokWpIGVGirosunFZwWkLnaFk0rQNTWaG5I5Yhq7d+CtuXCji1i9w59+YPhZavRByOOdc7r0nTATX4DhfVm5hcWl5Jb9aWFvf2Nyyt3fqOk4VZR6NRayaIdFM8Ih5wEGwZqIYkaFgjXBwNdYbd0xpHke3MExYIEkv4l1OCRiqbe/7mvckwUfml9gHdg+ZZwyxkqOSc+I6x2276JSdSeG/wJ2BIppVrW1/+p2YppJFQAXRuuU6CQQZUcCpYKOCn2qWEDogPdYyMCKS6SCbnDLCh4bpYDPevAjwhP3ZkRGp9VCGxikJ9PW8Nib/01opdC+CjEdJCiyi00HdVGCI8TgX3OGKURBDAwhV3OyKaZ8oQsGkVzAhuPMn/wX107J7Vq7cVIrVy1kcebSHDlAJuegcVdE1qiEPUfSAntALerUerWfrzXqfWnPWrGcX/Srr4xt9+5l9
    ⇠ Uniform(0, 10)
    AAACC3icbVA9SwNBEN3zM8avqKXNEhUiSLgTUcugjZUoGCPkQpjbbMya3btjd04MR3ob/4qNoCJ24h+w84do7Sax0OiDgcd7M8zMC2IpDLruuzMyOjY+MZmZyk7PzM7N5xYWT02UaMbLLJKRPgvAcClCXkaBkp/FmoMKJK8E7f2eX7nk2ogoPMFOzGsKzkPRFAzQSvVc3gcZt6B+QX0jFPWRX2F6GGkFslvYdjc8d72eW3GLbh/0L/G+yUpp9eP+5XL686iee/MbEUsUD5FJMKbquTHWUtAomOTdrJ8YHgNrwzmvWhqC4qaW9n/p0jWrNGgz0rZCpH3150QKypiOCmynAmyZYa8n/udVE2zu1lIRxgnykA0WNRNJMaK9YGhDaM5QdiwBpoW9lbIWaGBo48vaELzhl/+S082it13cOrZp7JEBMmSZ5EmBeGSHlMgBOSJlwsg1uSUP5NG5ce6cJ+d50DrifM8skV9wXr8A+UieIw==
    ↵j
    ⇠ Normal(60, 10)
    97 – 75 = 22

    View Slide

  39. Analyze the real sample
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    data(Howell1)
    d d =18 , ]
    dat W = d$weight,
    S = d$male + 1 ) # S=1 female, S=2 male
    m_SW alist(
    W ~ dnorm(mu,sigma),
    mu a[S] ~ dnorm(60,10),
    sigma ~ dunif(0,10)
    ), data=dat ) AAACCnicbVDLSgMxFM3UV62vUZduokWpIGVGirosunFZwWkLnaFk0rQNTWaG5I5Yhq7d+CtuXCji1i9w59+YPhZavRByOOdc7r0nTATX4DhfVm5hcWl5Jb9aWFvf2Nyyt3fqOk4VZR6NRayaIdFM8Ih5wEGwZqIYkaFgjXBwNdYbd0xpHke3MExYIEkv4l1OCRiqbe/7mvckwUfml9gHdg+ZZwyxkqOSc+I6x2276JSdSeG/wJ2BIppVrW1/+p2YppJFQAXRuuU6CQQZUcCpYKOCn2qWEDogPdYyMCKS6SCbnDLCh4bpYDPevAjwhP3ZkRGp9VCGxikJ9PW8Nib/01opdC+CjEdJCiyi00HdVGCI8TgX3OGKURBDAwhV3OyKaZ8oQsGkVzAhuPMn/wX107J7Vq7cVIrVy1kcebSHDlAJuegcVdE1qiEPUfSAntALerUerWfrzXqfWnPWrGcX/Srr4xt9+5l9
    ⇠ Uniform(0, 10)
    AAACC3icbVA9SwNBEN3zM8avqKXNEhUiSLgTUcugjZUoGCPkQpjbbMya3btjd04MR3ob/4qNoCJ24h+w84do7Sax0OiDgcd7M8zMC2IpDLruuzMyOjY+MZmZyk7PzM7N5xYWT02UaMbLLJKRPgvAcClCXkaBkp/FmoMKJK8E7f2eX7nk2ogoPMFOzGsKzkPRFAzQSvVc3gcZt6B+QX0jFPWRX2F6GGkFslvYdjc8d72eW3GLbh/0L/G+yUpp9eP+5XL686iee/MbEUsUD5FJMKbquTHWUtAomOTdrJ8YHgNrwzmvWhqC4qaW9n/p0jWrNGgz0rZCpH3150QKypiOCmynAmyZYa8n/udVE2zu1lIRxgnykA0WNRNJMaK9YGhDaM5QdiwBpoW9lbIWaGBo48vaELzhl/+S082it13cOrZp7JEBMmSZ5EmBeGSHlMgBOSJlwsg1uSUP5NG5ce6cJ+d50DrifM8skV9wXr8A+UieIw==
    ↵j
    ⇠ Normal(60, 10)

    View Slide

  40. Posterior means & predictions
    # posterior mean W
    post dens( post$a[,1] , xlim=c(39,50) , lwd=3 ,
    col=2 , xlab="posterior mean weight (kg)" )
    dens( post$a[,2] , lwd=3 , col=4 , add=TRUE )
    # posterior W distributions
    W1 W2 dens( W1 , xlim=c(20,70) , ylim=c(0,0.085) ,
    lwd=3 , col=2 )
    dens( W2 , lwd=3 , col=4 , add=TRUE )
    40 42 44 46 48 50
    0.0 0.4 0.8
    posterior mean weight (kg)
    Density
    women men

    View Slide

  41. Posterior means & predictions
    # posterior mean W
    post dens( post$a[,1] , xlim=c(39,50) , lwd=3 ,
    col=2 , xlab="posterior mean weight (kg)" )
    dens( post$a[,2] , lwd=3 , col=4 , add=TRUE )
    # posterior W distributions
    W1 W2 dens( W1 , xlim=c(20,70) , ylim=c(0,0.085) ,
    lwd=3 , col=2 )
    dens( W2 , lwd=3 , col=4 , add=TRUE )
    40 42 44 46 48 50
    0.0 0.4 0.8
    posterior mean weight (kg)
    Density
    20 30 40 50 60 70
    0.00 0.04 0.08
    posterior predicted weight (kg)
    Density
    women men
    men
    women

    View Slide

  42. Always Be Contrasting
    Need to compute contrast, the
    difference between the categories
    It is never legitimate to compare
    overlap in distributions
    Must compute contrast
    distribution
    20 30 40 50 60 70
    0.00 0.04 0.08
    posterior predicted weight (kg)
    Density

    View Slide

  43. -1.5 -0.5 0.0 0.5 1.0 1.5
    -0.5 0.5 1.0 1.5 2.0 2.5
    parameter 1
    parameter 2
    It is never legitimate to compare overlap in parameters or lines

    View Slide

  44. -3 -2 -1 0 1 2 3
    0.0 0.5 1.0 1.5 2.0
    value
    Density
    -1.5 -0.5 0.0 0.5 1.0 1.5
    -0.5 0.5 1.0 1.5 2.0 2.5
    parameter 1
    parameter 2
    parameter 1
    parameter 2
    difference
    It is never legitimate to compare overlap in parameters or lines
    This means no comparing confidence intervals or p-values either!

    View Slide

  45. Causal contrast
    # causal contrast (in means)
    mu_contrast dens( mu_contrast , xlim=c(3,10) , lwd=3 ,
    col=1 , xlab="posterior mean weight contrast
    (kg)" )
    40 42 44 46 48 50
    0.0 0.4 0.8
    posterior mean weight (kg)
    Density
    women men
    3 4 5 6 7 8 9 10
    0.0 0.3 0.6
    posterior mean weight contrast (kg)
    Density

    View Slide

  46. Weight contrast
    20 30 40 50 60 70
    0.00 0.04 0.08
    posterior predicted weight (kg)
    Density
    men
    women

    View Slide

  47. Weight contrast
    # posterior W distributions
    W1 W2 # contrast
    W_contrast dens( W_contrast , xlim=c(-25,35) , lwd=3 ,
    col=1 , xlab="posterior weight contrast (kg)" )
    # proportion above zero
    sum( W_contrast > 0 ) / 1000
    # proportion below zero
    sum( W_contrast < 0 ) / 1000
    20 30 40 50 60 70
    0.00 0.04 0.08
    posterior predicted weight (kg)
    Density
    men
    women
    -20 -10 0 10 20 30
    0.00 0.02 0.04
    posterior weight contrast (kg)
    Density
    18% 82%

    View Slide

  48. From estimand to estimate
    Q: Causal effect of S on W?
    W
    H
    S
    W
    H
    S
    Q: Direct causal effect of S on W?
    3 4 5 6 7 8 9 10
    0.0 0.3 0.6
    posterior mean weight contrast (kg)
    Density
    -20 -10 0 10 20 30
    0.00 0.02 0.04
    posterior weight contrast (kg)
    Density

    View Slide

  49. From estimand to estimate
    W
    H
    S
    Q: Direct causal effect of S on W?
    Now we need to “block”
    association through H
    This means stratify by H

    View Slide

  50. S dat 140 145 150 155 160 165 170
    70 80 90 100
    H
    W
    direct
    indirect
    women
    men

    View Slide

  51. 140 145 150 155 160 165 170
    70 80 90 100
    H
    W
    S dat direct
    indirect
    Indirect: H influences W
    differently (slopes)
    Direct: W differs net any
    slope differences

    View Slide

  52. Index Variables & Lines
    AAACNXicbZBNTxsxEIa90BYa+hHg2ItFBApqG+1WCLggIXrJoUJUaghSHK1mnUliYe+u7FlEtMqf4sL/4AQHDq2qXvkLOCGH8jGSpcfvOyN73iTXylEY3gRz869ev1lYfFtZevf+w8fq8sqxyworsSUzndmTBBxqlWKLFGk8yS2CSTS2k9PvE799htapLP1Foxy7Bgap6isJ5KW4+qMdK74hnDJcEJ5TeZhZA3pcF6aI1RdvDAxsClGZ3vnGHheg8yHwz1wkSFBvxuqrR7C8uRlXa2EjnBZ/DtEMamxWR3H1SvQyWRhMSWpwrhOFOXVLsKSkxnFFFA5zkKcwwI7HFAy6bjndeszXvdLj/cz6kxKfqv9PlGCcG5nEdxqgoXvqTcSXvE5B/d1uqdK8IEzlw0P9QnPK+CRC3lMWJemRB5BW+b9yOQQLknzQFR9C9HTl53D8rRFtN7Z+btX2D2ZxLLJPbI3VWcR22D5rsiPWYpJdsGv2m/0JLoPb4G/w76F1LpjNrLJHFdzdAxUtqR8=
    Wi
    ⇠ Normal(µi, )
    µi = ↵ + (Hi
    ¯
    H)
    intercept slope

    View Slide

  53. Centering variables
    AAACNXicbZBNTxsxEIa90BYa+hHg2ItFBApqG+1WCLggIXrJoUJUaghSHK1mnUliYe+u7FlEtMqf4sL/4AQHDq2qXvkLOCGH8jGSpcfvOyN73iTXylEY3gRz869ev1lYfFtZevf+w8fq8sqxyworsSUzndmTBBxqlWKLFGk8yS2CSTS2k9PvE799htapLP1Foxy7Bgap6isJ5KW4+qMdK74hnDJcEJ5TeZhZA3pcF6aI1RdvDAxsClGZ3vnGHheg8yHwz1wkSFBvxuqrR7C8uRlXa2EjnBZ/DtEMamxWR3H1SvQyWRhMSWpwrhOFOXVLsKSkxnFFFA5zkKcwwI7HFAy6bjndeszXvdLj/cz6kxKfqv9PlGCcG5nEdxqgoXvqTcSXvE5B/d1uqdK8IEzlw0P9QnPK+CRC3lMWJemRB5BW+b9yOQQLknzQFR9C9HTl53D8rRFtN7Z+btX2D2ZxLLJPbI3VWcR22D5rsiPWYpJdsGv2m/0JLoPb4G/w76F1LpjNrLJHFdzdAxUtqR8=
    Wi
    ⇠ Normal(µi, )
    µi = ↵ + (Hi
    ¯
    H)
    “centered” height
    Centering H makes it so that
    alpha is the average weight of
    a person with average height
    average height
    Centering makes it easier to
    define scientific priors for alpha

    View Slide

  54. 140 145 150 155 160 165 170
    70 80 90 100
    H
    W
    Centering H makes it so
    that alpha is the average
    weight of a person with
    average height
    average W
    average H
    AAACNXicbZBNTxsxEIa90BYa+hHg2ItFBApqG+1WCLggIXrJoUJUaghSHK1mnUliYe+u7FlEtMqf4sL/4AQHDq2qXvkLOCGH8jGSpcfvOyN73iTXylEY3gRz869ev1lYfFtZevf+w8fq8sqxyworsSUzndmTBBxqlWKLFGk8yS2CSTS2k9PvE799htapLP1Foxy7Bgap6isJ5KW4+qMdK74hnDJcEJ5TeZhZA3pcF6aI1RdvDAxsClGZ3vnGHheg8yHwz1wkSFBvxuqrR7C8uRlXa2EjnBZ/DtEMamxWR3H1SvQyWRhMSWpwrhOFOXVLsKSkxnFFFA5zkKcwwI7HFAy6bjndeszXvdLj/cz6kxKfqv9PlGCcG5nEdxqgoXvqTcSXvE5B/d1uqdK8IEzlw0P9QnPK+CRC3lMWJemRB5BW+b9yOQQLknzQFR9C9HTl53D8rRFtN7Z+btX2D2ZxLLJPbI3VWcR22D5rsiPWYpJdsGv2m/0JLoPb4G/w76F1LpjNrLJHFdzdAxUtqR8=
    Wi
    ⇠ Normal(µi, )
    µi = ↵ + (Hi
    ¯
    H)

    View Slide

  55. 140 145 150 155 160 165 170
    70 80 90 100
    H
    W
    Centering H makes it so
    that alpha is the average
    weight of a person with
    average height
    average W
    average H
    AAACNXicbZBNTxsxEIa90BYa+hHg2ItFBApqG+1WCLggIXrJoUJUaghSHK1mnUliYe+u7FlEtMqf4sL/4AQHDq2qXvkLOCGH8jGSpcfvOyN73iTXylEY3gRz869ev1lYfFtZevf+w8fq8sqxyworsSUzndmTBBxqlWKLFGk8yS2CSTS2k9PvE799htapLP1Foxy7Bgap6isJ5KW4+qMdK74hnDJcEJ5TeZhZA3pcF6aI1RdvDAxsClGZ3vnGHheg8yHwz1wkSFBvxuqrR7C8uRlXa2EjnBZ/DtEMamxWR3H1SvQyWRhMSWpwrhOFOXVLsKSkxnFFFA5zkKcwwI7HFAy6bjndeszXvdLj/cz6kxKfqv9PlGCcG5nEdxqgoXvqTcSXvE5B/d1uqdK8IEzlw0P9QnPK+CRC3lMWJemRB5BW+b9yOQQLknzQFR9C9HTl53D8rRFtN7Z+btX2D2ZxLLJPbI3VWcR22D5rsiPWYpJdsGv2m/0JLoPb4G/w76F1LpjNrLJHFdzdAxUtqR8=
    Wi
    ⇠ Normal(µi, )
    µi = ↵ + (Hi
    ¯
    H)

    View Slide

  56. Index Variables & Lines
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    sex of i-th

    person
    AAACCXicbVDLSsNAFJ3UV62vqEs3g0VwISUpRd0IRTcuK9gHJCFMppN26GQSZiZCCd268VfcuFDErX/gzr9xkmahrQeGOZxzL/feEySMSmVZ30ZlZXVtfaO6Wdva3tndM/cPejJOBSZdHLNYDAIkCaOcdBVVjAwSQVAUMNIPJje5338gQtKY36tpQrwIjTgNKUZKS74J3QipcRC6AVEIXkGnIL59Nv+bnm/WrYZVAC4TuyR1UKLjm1/uMMZpRLjCDEnp2FaivAwJRTEjs5qbSpIgPEEj4mjKUUSklxWXzOCJVoYwjIV+XMFC/d2RoUjKaRToynxvuejl4n+ek6rw0ssoT1JFOJ4PClMGVQzzWOCQCoIVm2qCsKB6V4jHSCCsdHg1HYK9ePIy6TUb9nmjddeqt6/LOKrgCByDU2CDC9AGt6ADugCDR/AMXsGb8WS8GO/Gx7y0YpQ9h+APjM8f3ZGZLA==
    = [ 1, 2]
    AAACDHicbZDLSsNAFIYn9VbrrerSzWARXEhJSlE3QtGNywr2Ak0oJ9NJO3QyCTMToYQ+gBtfxY0LRdz6AO58GydtFtr6w8DHf85hzvn9mDOlbfvbKqysrq1vFDdLW9s7u3vl/YO2ihJJaItEPJJdHxTlTNCWZprTbiwphD6nHX98k9U7D1QqFol7PYmpF8JQsIAR0MbqlytuCHrkBy7weAT4Cvfm1HfOcqh5psuu2jPhZXByqKBczX75yx1EJAmp0ISDUj3HjrWXgtSMcDotuYmiMZAxDGnPoICQKi+dHTPFJ8YZ4CCS5gmNZ+7viRRCpSahbzqz1dViLTP/q/USHVx6KRNxoqkg84+ChGMd4SwZPGCSEs0nBoBIZnbFZAQSiDb5lUwIzuLJy9CuVZ3zav2uXmlc53EU0RE6RqfIQReogW5RE7UQQY/oGb2iN+vJerHerY95a8HKZw7RH1mfP1Wnmog=
    ↵ = [↵1, ↵2]
    S = 1 indicates female
    S = 2 indicates male
    Two intercepts and two slopes, one for each value in S

    View Slide

  57. Index Variables & Lines
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    AAACCXicbVDLSsNAFJ3UV62vqEs3g0VwISUpRd0IRTcuK9gHJCFMppN26GQSZiZCCd268VfcuFDErX/gzr9xkmahrQeGOZxzL/feEySMSmVZ30ZlZXVtfaO6Wdva3tndM/cPejJOBSZdHLNYDAIkCaOcdBVVjAwSQVAUMNIPJje5338gQtKY36tpQrwIjTgNKUZKS74J3QipcRC6AVEIXkGnIL59Nv+bnm/WrYZVAC4TuyR1UKLjm1/uMMZpRLjCDEnp2FaivAwJRTEjs5qbSpIgPEEj4mjKUUSklxWXzOCJVoYwjIV+XMFC/d2RoUjKaRToynxvuejl4n+ek6rw0ssoT1JFOJ4PClMGVQzzWOCQCoIVm2qCsKB6V4jHSCCsdHg1HYK9ePIy6TUb9nmjddeqt6/LOKrgCByDU2CDC9AGt6ADugCDR/AMXsGb8WS8GO/Gx7y0YpQ9h+APjM8f3ZGZLA==
    = [ 1, 2]
    AAACDHicbZDLSsNAFIYn9VbrrerSzWARXEhJSlE3QtGNywr2Ak0oJ9NJO3QyCTMToYQ+gBtfxY0LRdz6AO58GydtFtr6w8DHf85hzvn9mDOlbfvbKqysrq1vFDdLW9s7u3vl/YO2ihJJaItEPJJdHxTlTNCWZprTbiwphD6nHX98k9U7D1QqFol7PYmpF8JQsIAR0MbqlytuCHrkBy7weAT4Cvfm1HfOcqh5psuu2jPhZXByqKBczX75yx1EJAmp0ISDUj3HjrWXgtSMcDotuYmiMZAxDGnPoICQKi+dHTPFJ8YZ4CCS5gmNZ+7viRRCpSahbzqz1dViLTP/q/USHVx6KRNxoqkg84+ChGMd4SwZPGCSEs0nBoBIZnbFZAQSiDb5lUwIzuLJy9CuVZ3zav2uXmlc53EU0RE6RqfIQReogW5RE7UQQY/oGb2iN+vJerHerY95a8HKZw7RH1mfP1Wnmog=
    ↵ = [↵1, ↵2]
    i = 1
    S[1]=2

    View Slide

  58. Index Variables & Lines
    AAACQ3icbVBNaxsxENUmbZM4/XCbYy6iJsGhrdktIe0lENKLTyWldRxqLcusPLZFpN1Fmg0xi/9bLvkDueUP9JJDSum1UNlxIR99MPDmvRmkeWmhlaMwvAwWFh89frK0vFJbffrs+Yv6y1eHLi+txI7MdW6PUnCoVYYdUqTxqLAIJtXYTY8/Tf3uCVqn8uwbjQuMDQwzNVASyEtJ/Xs3UXxTOGW4IDyl6nNuDehJU5gyUW+9MTSwJURt1vPNXS5AFyNIqq89FU/4Gy5SpH9ts52od14By9tbSb0RtsIZ+EMSzUmDzXGQ1C9EP5elwYykBud6UVhQXIElJTVOaqJ0WIA8hiH2PM3AoIurWQYTvuGVPh/k1ldGfKbe3qjAODc2qZ80QCN335uK//N6JQ0+xpXKipIwkzcPDUrNKefTQHlfWZSkx56AtMr/lcsRWJDkY6/5EKL7Jz8kh+9b0U5r+8t2Y29/HscyW2evWZNF7APbY212wDpMsjP2g12zn8F5cBX8Cn7fjC4E8501dgfBn78/na9B
    Wi
    ⇠ Normal(µi, )
    µi = ↵S[i]
    + S[i]
    (Hi
    ¯
    H)
    AAACCXicbVDLSsNAFJ3UV62vqEs3g0VwISUpRd0IRTcuK9gHJCFMppN26GQSZiZCCd268VfcuFDErX/gzr9xkmahrQeGOZxzL/feEySMSmVZ30ZlZXVtfaO6Wdva3tndM/cPejJOBSZdHLNYDAIkCaOcdBVVjAwSQVAUMNIPJje5338gQtKY36tpQrwIjTgNKUZKS74J3QipcRC6AVEIXkGnIL59Nv+bnm/WrYZVAC4TuyR1UKLjm1/uMMZpRLjCDEnp2FaivAwJRTEjs5qbSpIgPEEj4mjKUUSklxWXzOCJVoYwjIV+XMFC/d2RoUjKaRToynxvuejl4n+ek6rw0ssoT1JFOJ4PClMGVQzzWOCQCoIVm2qCsKB6V4jHSCCsdHg1HYK9ePIy6TUb9nmjddeqt6/LOKrgCByDU2CDC9AGt6ADugCDR/AMXsGb8WS8GO/Gx7y0YpQ9h+APjM8f3ZGZLA==
    = [ 1, 2]
    AAACDHicbZDLSsNAFIYn9VbrrerSzWARXEhJSlE3QtGNywr2Ak0oJ9NJO3QyCTMToYQ+gBtfxY0LRdz6AO58GydtFtr6w8DHf85hzvn9mDOlbfvbKqysrq1vFDdLW9s7u3vl/YO2ihJJaItEPJJdHxTlTNCWZprTbiwphD6nHX98k9U7D1QqFol7PYmpF8JQsIAR0MbqlytuCHrkBy7weAT4Cvfm1HfOcqh5psuu2jPhZXByqKBczX75yx1EJAmp0ISDUj3HjrWXgtSMcDotuYmiMZAxDGnPoICQKi+dHTPFJ8YZ4CCS5gmNZ+7viRRCpSahbzqz1dViLTP/q/USHVx6KRNxoqkg84+ChGMd4SwZPGCSEs0nBoBIZnbFZAQSiDb5lUwIzuLJy9CuVZ3zav2uXmlc53EU0RE6RqfIQReogW5RE7UQQY/oGb2iN+vJerHerY95a8HKZw7RH1mfP1Wnmog=
    ↵ = [↵1, ↵2]
    i = 2
    S[2]=1

    View Slide

  59. Analyze the sample
    data(Howell1)
    d d =18 , ]
    dat W = d$weight,
    H = d$height,
    Hbar = mean(d$height),
    S = d$male + 1 ) # S=1 female, S=2 male
    m_SHW alist(
    W ~ dnorm(mu,sigma),
    mu a[S] ~ dnorm(60,10),
    b[S] ~ dunif(0,1),
    sigma ~ dunif(0,10)
    ), data=dat )
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)

    View Slide

  60. Contrasts at each height
    (1) Compute posterior predictive
    for women
    (2) Compute posterior predictive
    for men
    (3) Subtract (2) from (1)
    (4) Plot distribution at each height
    (on right)
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)

    View Slide

  61. Contrasts at each height
    130 140 150 160 170 180 190
    -6 -4 -2 0 2 4 6 8
    height (cm)
    weight contrast (F–M)
    xseq muF link(m_adults2,data=list(S=rep(1,50),H=xseq,Hbar=mean(d$height)))
    lines( xseq , apply(muF,2,mean) , lwd=3 , col=2 )
    muM link(m_adults2,data=list(S=rep(2,50),H=xseq,Hbar=mean(d$height)))
    lines( xseq , apply(muM,2,mean) , lwd=3 , col=4 )
    mu_contrast plot( NULL , xlim=range(xseq) , ylim=c(-6,8) , xlab="height (cm)"
    , ylab="weight contrast (F–M)" )
    for ( p in c(0.5,0.6,0.7,0.8,0.9,0.99) )
    shade( apply(mu_contrast,2,PI,prob=p) , xseq )
    abline(h=0,lty=2)
    women heavier
    men heavier
    Nearly all of the causal effect of S acts through H

    View Slide

  62. From estimand to estimate
    Q: Causal effect of S on W?
    W
    H
    S
    W
    H
    S
    Q: Direct causal effect of S on W?
    3 4 5 6 7 8 9 10
    0.0 0.3 0.6
    posterior mean weight contrast (kg)
    Density
    -20 -10 0 10 20 30
    0.00 0.02 0.04
    posterior weight contrast (kg)
    Density
    130 140 150 160 170 180 190
    -6 -4 -2 0 2 4 6 8
    height (cm)
    weight contrast (F–M)

    View Slide

  63. Categorical variables
    Easy to use with index coding
    Use samples to compute relevant
    contrasts
    Always summarize (mean, interval)
    as the last step
    Want mean difference and not
    difference of means
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)

    View Slide

  64. PAUSE

    View Slide

  65. Curves from lines
    H –> W obviously not linear
    Linear models can easily fit
    curves
    But this is not mechanistic
    library(rethinking)
    data(Howell1)
    60 80 100 140 180
    10 20 30 40 50 60
    height (cm)
    weight (kg)

    View Slide

  66. Curves from lines
    Linear models can easily fit
    curves
    Two popular strategies
    (1) polynomials — awful
    (2) splines and generalized
    additive models — less awful
    library(rethinking)
    data(Howell1)
    60 80 100 140 180
    10 20 30 40 50 60
    height (cm)
    weight (kg)

    View Slide

  67. Polynomial linear models
    Strategy: polynomial functions
    Problems: strange symmetries,
    explosive uncertainty at edges
    No local smoothing; only global
    DO NOT USE
    μi
    = α + β1
    xi
    + β2
    x2
    i

    View Slide

  68. View Slide

  69. 50 100 150 200
    0 20 40 60
    height (cm)
    weight (kg)
    μi
    = α + β1
    Hi
    + β2
    H2
    i

    View Slide

  70. 50 100 150 200
    0 20 40 60
    height (cm)
    weight (kg)
    50 100 150 200
    0 20 40 60
    height (cm)
    weight (kg)
    μi
    = α + β1
    Hi
    + β2
    H2
    i
    μi
    = α + β1
    Hi
    + β2
    H2
    i
    = + b3
    H3 + b4
    H4

    View Slide

  71. (&0.&53*$ 1&01-&
    h
    r
    V = πr2h
    'ĶĴłĿIJ ƉƎƉ ćF i
    IVNBO XFJHIU BT B G
    SVWJBO .BO XFSF B D
    IJT XFJHIU CZ DBMDV
    GVODUJPO PG IJT IFJHI
    Q PG IFJHIU ćJT NFBOT S = QI 4VCTUJUVUJOH UIJT JOUP UIF GPSNVMB
    7 = π(QI)I = πQI
    'JOBMMZ XFJHIU JT TPNF QSPQPSUJPO PG WPMVNF‰IPX NBOZ LJMPHSBN
    UJNFUFS 4P XF OFFE B QBSBNFUFS L UP FYQSFTT UIJT USBOTMBUJPO CFUX
    8 = L7 = LπQI
    "OE UIJT JT PVS GPSNVMB GPS FYQFDUFE XFJHIU HJWFO BO JOEJWJEVBMT
    WJPVTMZ BO PSEJOBSZ HFOFSBMJ[FE MJOFBS NPEFM #VU UIBUT PLBZ *U
    Thinking vs Fitting
    Linear models can fit anything
    (Geocentrism)
    Better to think
    60 80 100 140 180
    10 20 30 40 50 60
    height (cm)
    weight (kg)
    log Wi
    ∼ Normal(μi
    , σ)
    μi
    = α + β(Hi
    − ¯
    H)
    Will revisit in lecture 19

    View Slide

  72. Splines
    Basis-Splines: Wiggly function
    built from many local
    functions
    Basis function: A local
    function
    Local functions make splines
    better than polynomials, but
    equally geocentric
    Richard McElreath
    McElreath
    with Examples in R and Stan
    SECOND EDITION
    ical Rethinking
    o advanced multilevel mod-
    d Gaussian process models
    h (DAG) approach to causal
    w edition also contains new
    ered categorical predictors,
    ampling, instrumental vari-
    rely new chapter that goes
    n-specific scientific models
    mples
    umptions are reflected in
    in optional sections
    analyze causal graphs
    bsite and on GitHub
    nd is a Director at the Max
    Germany. He has published
    analysis of social behavior,
    Models of Social Evolution.

    View Slide

  73. View Slide

  74. View Slide

  75. View Slide

  76. View Slide

  77. Going Local — B-Splines
    B-Splines are just linear models, but with some weird
    synthetic variables:
    Weights w are like slopes
    Basis functions B are synthetic variables
    B values turn on weights in different regions
    Detailed example starting on page 114
    CZ HFOFSBUJOH OFX QSFEJDUPS WBSJBCMFT BOE VTJOH UIPTF JO UIF MJOFBS NPEFM µJ
    6O
    PNJBM SFHSFTTJPO #TQMJOFT EP OPU EJSFDUMZ USBOTGPSN UIF QSFEJDUPS CZ TRVBSJOH PS
    *OTUFBE UIFZ JOWFOU B TFSJFT PG FOUJSFMZ OFX TZOUIFUJD QSFEJDUPS WBSJBCMFT &BDI
    BSJBCMFT TFSWFT UP HSBEVBMMZ UVSO B TQFDJĕD QBSBNFUFS PO BOE PČ XJUIJO B TQFDJĕD
    IF QSFEJDUPS WBSJBCMFT &BDI PG UIFTF WBSJBCMFT JT DBMMFE B įĮŀĶŀ ijłĻİŁĶļĻ ćF
    EFM FOET VQ MPPLJOH WFSZ GBNJMJBS
    µJ = α + X
    #J, + X
    #J, + X
    #J, + ...
    JT UIF OUI CBTJT GVODUJPOT WBMVF PO SPX J BOE UIF X QBSBNFUFST BSF DPSSFTQPOE
    T GPS FBDI ćF QBSBNFUFST BDU MJLF TMPQFT BEKVTUJOH UIF JOĘVFODF PG FBDI CBTJT
    O UIF NFBO µJ
    4P SFBMMZ UIJT JT KVTU BOPUIFS MJOFBS SFHSFTTJPO CVU XJUI TPNF GBODZ
    QSFEJDUPS WBSJBCMFT ćFTF TZOUIFUJD WBSJBCMFT EP TPNF SFBMMZ FMFHBOU EFTDSJQUJWF‰
    ‰XPSL GPS VT
    EP XF DPOTUSVDU UIFTF CBTJT WBSJBCMFT # * EJTQMBZ UIF TJNQMFTU DBTF JO 'ĶĴłĿIJ ƌƉƊ
    BQQSPYJNBUF UIF UFNQFSBUVSF EBUB XJUI GPVS EJČFSFOU MJOFBS BQQSPYJNBUJPOT 'JSTU

    View Slide

  78. Going Local — B-Splines
    B-Splines are just linear models, but with some weird
    synthetic variables:
    Weights w are like slopes
    Basis functions B are synthetic variables
    B values turn on weights in different regions
    Detailed example starting on page 114
    CZ HFOFSBUJOH OFX QSFEJDUPS WBSJBCMFT BOE VTJOH UIPTF JO UIF MJOFBS NPEFM µJ
    6O
    PNJBM SFHSFTTJPO #TQMJOFT EP OPU EJSFDUMZ USBOTGPSN UIF QSFEJDUPS CZ TRVBSJOH PS
    *OTUFBE UIFZ JOWFOU B TFSJFT PG FOUJSFMZ OFX TZOUIFUJD QSFEJDUPS WBSJBCMFT &BDI
    BSJBCMFT TFSWFT UP HSBEVBMMZ UVSO B TQFDJĕD QBSBNFUFS PO BOE PČ XJUIJO B TQFDJĕD
    IF QSFEJDUPS WBSJBCMFT &BDI PG UIFTF WBSJBCMFT JT DBMMFE B įĮŀĶŀ ijłĻİŁĶļĻ ćF
    EFM FOET VQ MPPLJOH WFSZ GBNJMJBS
    µJ = α + X
    #J, + X
    #J, + X
    #J, + ...
    JT UIF OUI CBTJT GVODUJPOT WBMVF PO SPX J BOE UIF X QBSBNFUFST BSF DPSSFTQPOE
    T GPS FBDI ćF QBSBNFUFST BDU MJLF TMPQFT BEKVTUJOH UIF JOĘVFODF PG FBDI CBTJT
    O UIF NFBO µJ
    4P SFBMMZ UIJT JT KVTU BOPUIFS MJOFBS SFHSFTTJPO CVU XJUI TPNF GBODZ
    QSFEJDUPS WBSJBCMFT ćFTF TZOUIFUJD WBSJBCMFT EP TPNF SFBMMZ FMFHBOU EFTDSJQUJWF‰
    ‰XPSL GPS VT
    EP XF DPOTUSVDU UIFTF CBTJT WBSJBCMFT # * EJTQMBZ UIF TJNQMFTU DBTF JO 'ĶĴłĿIJ ƌƉƊ
    BQQSPYJNBUF UIF UFNQFSBUVSF EBUB XJUI GPVS EJČFSFOU MJOFBS BQQSPYJNBUJPOT 'JSTU

    View Slide

  79. Going Local — B-Splines
    B-Splines are just linear models, but with some weird
    synthetic variables:
    Weights w are like slopes
    Basis functions B are synthetic variables
    B values turn on weights in different regions
    Detailed example starting on page 114
    CZ HFOFSBUJOH OFX QSFEJDUPS WBSJBCMFT BOE VTJOH UIPTF JO UIF MJOFBS NPEFM µJ
    6O
    PNJBM SFHSFTTJPO #TQMJOFT EP OPU EJSFDUMZ USBOTGPSN UIF QSFEJDUPS CZ TRVBSJOH PS
    *OTUFBE UIFZ JOWFOU B TFSJFT PG FOUJSFMZ OFX TZOUIFUJD QSFEJDUPS WBSJBCMFT &BDI
    BSJBCMFT TFSWFT UP HSBEVBMMZ UVSO B TQFDJĕD QBSBNFUFS PO BOE PČ XJUIJO B TQFDJĕD
    IF QSFEJDUPS WBSJBCMFT &BDI PG UIFTF WBSJBCMFT JT DBMMFE B įĮŀĶŀ ijłĻİŁĶļĻ ćF
    EFM FOET VQ MPPLJOH WFSZ GBNJMJBS
    µJ = α + X
    #J, + X
    #J, + X
    #J, + ...
    JT UIF OUI CBTJT GVODUJPOT WBMVF PO SPX J BOE UIF X QBSBNFUFST BSF DPSSFTQPOE
    T GPS FBDI ćF QBSBNFUFST BDU MJLF TMPQFT BEKVTUJOH UIF JOĘVFODF PG FBDI CBTJT
    O UIF NFBO µJ
    4P SFBMMZ UIJT JT KVTU BOPUIFS MJOFBS SFHSFTTJPO CVU XJUI TPNF GBODZ
    QSFEJDUPS WBSJBCMFT ćFTF TZOUIFUJD WBSJBCMFT EP TPNF SFBMMZ FMFHBOU EFTDSJQUJWF‰
    ‰XPSL GPS VT
    EP XF DPOTUSVDU UIFTF CBTJT WBSJBCMFT # * EJTQMBZ UIF TJNQMFTU DBTF JO 'ĶĴłĿIJ ƌƉƊ
    BQQSPYJNBUF UIF UFNQFSBUVSF EBUB XJUI GPVS EJČFSFOU MJOFBS BQQSPYJNBUJPOT 'JSTU

    View Slide

  80. Spline
    Basis2
    Basis1 Basis3
    Basis4
    w = [1,–1,1,–1]

    View Slide

  81. w = [–1,–1,1,–1]

    View Slide

  82. w = [–1,1,1,–1]

    View Slide

  83. w = [–1,1,–1,–1]
    w = [–1,1,–1,1]

    View Slide

  84. Height as a function of age
    Obviously not linear
    Want function
    approximately right
    Fit spline as example
    But biological model
    would do a lot better 0 20 40 60 80
    60 80 100 140 180
    age (years)
    height (cm)

    View Slide

  85. Height as a function of age
    W
    H
    S
    A
    age
    weight
    height
    sex

    View Slide

  86. Height as a function of age
    W
    H
    S
    A
    age
    How can age cause anything?
    No definable intervention?
    Association between age and
    height due to accumulated
    growth

    View Slide

  87. View Slide

  88. Weighted basis functions
    Spline

    View Slide

  89. Curves and Splines
    Can build very non-linear functions from linear pieces
    Splines are powerful geocentric devices
    Adding scientific information helps
    e.g. Average weight only increases with height

    e.g. Height increases with age, then levels off (or declines)
    Ideally statistical model has some form as scientific model

    View Slide

  90. 0 20 40 60 80
    60 80 100 140 180
    age (years)
    height (cm)

    View Slide

  91. 0 20 40 60 80
    60 80 100 140 180
    age (years)
    height (cm)
    infancy
    childhood
    puberty
    adulthood

    View Slide

  92. 0 20 40 60 80
    60 80 100 140 180
    age (years)
    height (cm)
    infancy
    childhood
    adulthood
    puberty

    View Slide

  93. Course Schedule
    Week 1 Bayesian inference Chapters 1, 2, 3
    Week 2 Linear models & Causal Inference Chapter 4
    Week 3 Causes, Confounds & Colliders Chapters 5 & 6
    Week 4 Overfitting / Interactions Chapters 7 & 8
    Week 5 MCMC & Generalized Linear Models Chapters 9, 10, 11
    Week 6 Integers & Other Monsters Chapters 11 & 12
    Week 7 Multilevel models I Chapter 13
    Week 8 Multilevel models II Chapter 14
    Week 9 Measurement & Missingness Chapter 15
    Week 10 Generalized Linear Madness Chapter 16
    https://github.com/rmcelreath/stat_rethinking_2023

    View Slide

  94. View Slide

  95. BONUS

    View Slide

  96. Full Luxury Bayes
    We used two models for two
    estimands
    But alternative and equivalent
    approach is to use one model of
    entire causal system
    Then use joint posterior to
    compute each estimand
    W
    H
    S

    View Slide

  97. Full Luxury Bayes
    m_SHW_full alist(
    # weight
    W ~ dnorm(mu,sigma),
    mu a[S] ~ dnorm(60,10),
    b[S] ~ dunif(0,1),
    sigma ~ dunif(0,10),
    # height
    H ~ dnorm(nu,tau),
    nu h[S] ~ dnorm(160,10),
    tau ~ dunif(0,10)
    ), data=dat )
    W
    H
    S

    View Slide

  98. Full Luxury Bayes
    m_SHW_full alist(
    # weight
    W ~ dnorm(mu,sigma),
    mu a[S] ~ dnorm(60,10),
    b[S] ~ dunif(0,1),
    sigma ~ dunif(0,10),
    # height
    H ~ dnorm(nu,tau),
    nu h[S] ~ dnorm(160,10),
    tau ~ dunif(0,10)
    ), data=dat )
    W
    H
    S
    H
    S
    W
    H
    S

    View Slide

  99. Full Luxury Bayes
    m_SHW_full alist(
    # weight
    W ~ dnorm(mu,sigma),
    mu a[S] ~ dnorm(60,10),
    b[S] ~ dunif(0,1),
    sigma ~ dunif(0,10),
    # height
    H ~ dnorm(nu,tau),
    nu h[S] ~ dnorm(160,10),
    tau ~ dunif(0,10)
    ), data=dat )
    Causal effect is consequence
    of intervention
    Now simulate each
    intervention

    View Slide

  100. Simulating Interventions
    post Hbar n with( post , {
    # simulate W for S=1
    H_S1 W_S1 b[,1]*(H_S1-Hbar) , sigma)
    # simulate W for S=2
    H_S2 W_S2 b[,2]*(H_S2-Hbar) , sigma)
    # compute contrast
    W_do_S <})
    Total causal effect of S on W:

    Consequence of changing S at birth
    3 4 5 6 7 8 9 10
    0.0 0.3 0.6
    posterior mean weight contrast (kg)
    Density
    -20 -10 0 10 20 30
    0.00 0.02 0.04
    posterior weight contrast (kg)
    Density
    “p(W|do(S))”

    View Slide

  101. Simulating Interventions
    post Hbar n with( post , {
    # simulate W for S=1
    H_S1 W_S1 b[,1]*(H_S1-Hbar) , sigma)
    # simulate W for S=2
    H_S2 W_S2 b[,2]*(H_S2-Hbar) , sigma)
    # compute contrast
    W_do_S <})
    Total causal effect of S on W:

    Consequence of changing S at birth
    3 4 5 6 7 8 9 10
    0.0 0.3 0.6
    posterior mean weight contrast (kg)
    Density
    -20 -10 0 10 20 30
    0.00 0.02 0.04
    posterior weight contrast (kg)
    Density
    “p(W|do(S))”
    # automated way
    HWsim data=list(S=c(1,2)),
    vars=c("H","W"))
    W_do_S_auto

    View Slide

  102. Inference With Linear Models
    With more than two variables, scientific (causal)
    model and statistical model not always same
    (1) State each estimand

    (2) Design unique statistical model for each

    (3) Compute each estimand
    Or
    (1) State each estimand

    (2) Compute joint posterior for causal system

    (3) Simulate each estimand as an intervention
    ONE STAT MODEL

    FOR EACH ESTIMAND
    ONE SIMULATION

    FOR EACH ESTIMAND

    View Slide

  103. View Slide