Statistical Rethinking Fall 2017 Lecture 13

Week 7, Lecture 13, Statistical Rethinking: A Bayesian Course with Examples in R and Stan. This lecture covers Chapters 10 and 11 of the book.

Richard McElreath

December 08, 2017

Transcript

  1. Simpson’s Paradox

    • Trend reverses when additional predictor added
    • Can indicate confound => win!
    • Can also indicate collider => lose!
    https://paulvanderlaken.com/2017/09/27/simpsons-paradox-two-hr-examples-with-r-code/
  2. Gender contributes to personal research funding success in The Netherlands

    Romy van der Lee and Naomi Ellemers. Department of Social and Organizational Psychology, Institute of Psychology, Leiden University, 2300 RB Leiden, The Netherlands. Edited by Susan T. Fiske, Princeton University, Princeton, NJ, and approved August 19, 2015 (received for review May 26, 2015).

    We examined the application and review materials of three calls (n = 2,823) of a prestigious grant for personal research funding in a national full population of early career scientists awarded by the Netherlands Organization for Scientific Research (NWO). Results showed evidence of gender bias in application evaluations and success rates, as well as in language use in instructions and evaluation sheets. Male applicants received significantly more competitive “quality of researcher” evaluations (but not “quality of proposal” evaluations) and had significantly higher application success rates than female applicants. Gender disparities were most prevalent in scientific disciplines with the highest number of applications and with equal gender distribution among the applicants (i.e., life sciences and social sciences). Moreover, content analyses of the instructional and evaluation materials revealed the use of gendered language favoring male applicants. Overall, our data reveal a 4% “loss” of women during the grant review procedure, and illustrate the perpetuation of the funding gap, which contributes to the underrepresentation of women in academia.

    gender bias | research funding | success rates | academia | STEM

    Women are still underrepresented in academia today. Despite various attempts to promote gender equality (e.g., affirmative action initiatives, quotas), female scientists are less likely to get offered tenure, are judged to be less competent, receive less payment and research facilities, and are less likely to be awarded research grants compared with male scientists (1–3). Over time, this type of bias accumulates and contributes to the attrition of women from academia (4); the academic pipeline leaks. Here we report evidence of gender bias in personal research funding for early career scientists.

    […] associated with male traits and considered necessary for academic career success (20). Moreover, women still earn on average 18% less than their male colleagues for the same work with similar responsibilities (3). Although the salary gap seems to narrow for early career researchers, women in top academic positions are still substantially underpaid compared with men. Finally, across different career phases, success rates for female scientists applying for research funding tend to be lower than for male scientists (3, 21, 22). Even when overall success rates for men and women are equal, women receive less research funding than men, and are less often listed as principal investigators (23–25). Closing the funding gap is of particular importance, because this may help retain women in academia and foster the closing of other gaps by facilitating negotiations about salaries, research facilities, and promotion opportunities.

    Current Study. To investigate the possibility of a funding gap, we examined a national full population of early career researchers who applied for a prestigious personal grant between 2010 and 2012 (Innovational Research Incentives Scheme Veni; n = 2,823, with 42.1% female applicants) awarded by the Netherlands Organization for Scientific Research (NWO). The NWO made available anonymized data from their archives for the purpose of this study, and approved publication of this research. Our focus in this study was twofold. First, we tested for applicant gender differences in success rates and application evaluations. In doing so, we analyzed applicant gender as a statistical predictor of final success rates and also the success rates at each step in the review procedure (application, preselection, external reviewing,* interviews,
  3. Table: applications, awards, and rejects by discipline and gender

         discipline           gender  applications  awards  rejects
     1   Chemical sciences    m                 83      22       61
     2   Chemical sciences    f                 39      10       29
     3   Physical sciences    m                135      26      109
     4   Physical sciences    f                 39       9       30
     5   Physics              m                 67      18       49
     6   Physics              f                  9       2        7
     7   Humanities           m                230      33      197
     8   Humanities           f                166      32      134
     9   Technical sciences   m                189      30      159
     10  Technical sciences   f                 62      13       49
     11  Interdisciplinary    m                105      12       93
     12  Interdisciplinary    f                 78      17       61
     13  Earth/life sciences  m                156      38      118
     14  Earth/life sciences  f                126      18      108
     15  Social sciences      m                425      65      360
     16  Social sciences      f                409      47      362
     17  Medical sciences     m                245      46      199
     18  Medical sciences     f                260      29      231
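To see why the table invites a Simpson's-paradox reading, it may help to compare the pooled award rates with the rates inside each discipline. The sketch below uses only the counts from the table; the function names are illustrative, and this is plain arithmetic rather than any analysis from the papers themselves.

```python
# Grant data transcribed from the table: (discipline, gender, applications, awards).
ROWS = [
    ("Chemical sciences", "m", 83, 22), ("Chemical sciences", "f", 39, 10),
    ("Physical sciences", "m", 135, 26), ("Physical sciences", "f", 39, 9),
    ("Physics", "m", 67, 18), ("Physics", "f", 9, 2),
    ("Humanities", "m", 230, 33), ("Humanities", "f", 166, 32),
    ("Technical sciences", "m", 189, 30), ("Technical sciences", "f", 62, 13),
    ("Interdisciplinary", "m", 105, 12), ("Interdisciplinary", "f", 78, 17),
    ("Earth/life sciences", "m", 156, 38), ("Earth/life sciences", "f", 126, 18),
    ("Social sciences", "m", 425, 65), ("Social sciences", "f", 409, 47),
    ("Medical sciences", "m", 245, 46), ("Medical sciences", "f", 260, 29),
]


def pooled_rate(gender):
    """Award rate for one gender, ignoring discipline."""
    apps = sum(a for _, g, a, _ in ROWS if g == gender)
    awards = sum(w for _, g, _, w in ROWS if g == gender)
    return awards / apps


def rates_by_discipline():
    """Map each discipline to {gender: award rate}."""
    out = {}
    for discipline, gender, apps, awards in ROWS:
        out.setdefault(discipline, {})[gender] = awards / apps
    return out


print("pooled:", round(pooled_rate("m"), 3), round(pooled_rate("f"), 3))
for discipline, rates in rates_by_discipline().items():
    print(discipline, round(rates["m"], 3), round(rates["f"], 3))
```

Pooled over disciplines, men have the higher award rate, yet within four of the nine disciplines the rate is higher for women. Which comparison is the right one depends on the causal story, which is the point of the confound/collider distinction on the first slide.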
  4. No evidence that gender contributes to personal research funding success in The Netherlands: A reaction to van der Lee and Ellemers

    A recent PNAS article (1) argues that success rates for attaining research grants are gender-biased. However, the overall gender effect borders on statistical significance, despite the large sample. Moreover, their conclusion could be a prime example of Simpson’s paradox (2, 3); if a higher percentage of women apply for grants in more competitive scientific disciplines (i.e., with low application success rates for both men and women), then an analysis across all disciplines could incorrectly show “evidence” of gender inequality. Indeed, the social sciences and medical sciences are the two fields with a high proportion of female applicants as well as a low application success rate (table S1 in ref. 1). Moreover, multiple comparisons (across disciplines) are conducted without correcting for alpha inflation. Furthermore, it cannot be ruled out that the findings are artifacts due to unmeasured conditions, because no control variables were included. Finally, possible composition effects are ignored.

    We analyzed data from the field of the social sciences in the Netherlands Organization of Scientific Research (NWO) consisting of 8,687 individual applications to all grants announced in the period between 2006 and 2013 (not just the Veni grant). Taking nesting within institutions and years into account (intraclass correlation coefficient = 14.5% in the empty model), bivariate analyses of the Veni grant application show no or just borderline significance (P = 0.062), whereas bivariate analyses of all applications show a highly significant result, which seems to support the conclusion of van der Lee and Ellemers (1). However, when type of grant and social scientific field are included, separately or together, the results show no evidence to reject the null hypothesis of gender equality. Also, no interaction is found between gender and these conditions.

    In short, we find no convincing evidence for gender inequality. However, based on our findings, we also may not conclude that there is no gender inequality in NWO grant application success. Rather, it is too soon to spend public money on changing the evaluation procedures and gender balancing programs within the Science Foundation in The Netherlands. More in-depth analyses with statistical techniques that overcome the above-mentioned issues are needed before jumping to conclusions about gender inequality in grant awards. Our analyses are summarized in Table 1 and more detailed analyses are available on request.

    Beate Volker (a, 1) and Wouter Steenbeek (b)
    (a) University of Amsterdam, 1018 WV Amsterdam, The Netherlands; (b) Netherlands Institute for the Study of Crime and Law Enforcement, 1081 HV Amsterdam, The Netherlands

    1 van der Lee R, Ellemers N (2015) Gender contributes to personal research funding success in The Netherlands. Proc Natl Acad Sci USA 112(40):12349–12353.
    2 Albers C (2015) NWO, gender bias and Simpson’s paradox. Casper Albers’ Blog. Available at blog.casperalbers.nl/science/nwo-gender-bias-and-simpsons-paradox/. Accessed November 5, 2015.
    3 Simpson EH (1951) The interpretation of interaction in contingency tables. J R Stat Soc, B 13(2):238–241.

    Author contributions: B.V. designed research; B.V. performed research; W.S. contributed new reagents/analytic tools; B.V. analyzed data; and B.V. and W.S. wrote the paper. The authors declare no conflict of interest. (1) To whom correspondence should be addressed. Email: b.volker@uva.nl.
  5. GLMs need taming

    (tail of the model definition produced by glimmer)
            b_prosoc_left ~ dnorm(0,10),
            b_prosoc_left_X_condition ~ dnorm(0,10)
        )

    The parameter names are inelegant, but you can edit the above to your liking. Notice that glimmer above inserts weakly regularizing priors by default. See ?glimmer for options.

    Sometimes the implicit flat priors of glm lead to nonsense estimates. For example, consider the following simple data and model context:

    R code:
        # outcome and predictor almost perfectly associated
        y <- c( rep(0,10) , rep(1,10) )
        x <- c( rep(-1,9) , rep(1,11) )
        # fit binomial GLM
        m.bad <- glm( y ~ x , data=list(y=y,x=x) , family=binomial )
        precis(m.bad)

                      Mean   StdDev      2.5%    97.5%
        (Intercept)  -9.13  2955.06  -5800.95  5782.68
        x            11.43  2955.06  -5780.38  5803.25

    Data shown on the slide:

        rows    y   x
        1–9     0  -1
        10      0   1
        11–20   1   1
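The nonsense estimates above come from near-perfect separation: the likelihood keeps improving as the slope grows, so there is no finite maximum. A minimal sketch of that behaviour, and of how a Normal(0, 10) prior tames it, follows. It assumes plain gradient ascent as a stand-in optimizer; it is not how glm or map work internally, and the function names are illustrative.

```python
import math

# Data from the slide, collapsed to cells: (x, number of cases, number with y = 1).
CELLS = [(-1.0, 9, 0), (1.0, 11, 10)]


def log_sigmoid(z):
    """Numerically stable log of the logistic function."""
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))


def log_lik(a, b):
    """Bernoulli log-likelihood of the model logit(p) = a + b*x."""
    total = 0.0
    for x, n, k in CELLS:
        eta = a + b * x
        total += k * log_sigmoid(eta) + (n - k) * log_sigmoid(-eta)
    return total


def map_estimate(prior_sd=10.0, step=0.1, iters=20000):
    """Gradient ascent on the log-likelihood plus Normal(0, prior_sd) log-priors."""
    a = b = 0.0
    for _ in range(iters):
        grad_a, grad_b = -a / prior_sd**2, -b / prior_sd**2
        for x, n, k in CELLS:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += k - n * p
            grad_b += (k - n * p) * x
        a, b = a + step * grad_a, b + step * grad_b
    return a, b


# Without a prior: hold a + b at log(10) and let b grow; the fit never stops improving.
ridge = [log_lik(math.log(10) - b, b) for b in (2, 5, 10, 20)]
print("log-likelihood along the ridge:", ridge)

# With the weakly regularizing prior the maximum is finite.
a_map, b_map = map_estimate()
print("MAP estimate:", round(a_map, 2), round(b_map, 2))
```

The enormous standard deviations in the precis output are the flat-prior symptom of the same ridge.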
  6. Binomial GLMs

    • Predict counts with a fixed maximum
    • Use logit link
    • Distrust MAP estimation & QA (quadratic approximation)
      • may work, but routinely does not
      • regularization even more important now
    • Convert back to probability/count scale to plot predictions
    • Focus on predictions, not parameters
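Converting back to the probability or count scale just means applying the inverse of the logit link. A small sketch, with illustrative function names and arbitrary log-odds values:

```python
import math


def inv_logit(x):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_count(log_odds, n_trials):
    """Expected number of successes out of n_trials at the given log-odds."""
    return n_trials * inv_logit(log_odds)


# Equal steps on the log-odds scale are unequal steps on the probability scale.
for log_odds in (-4, -2, 0, 2, 4):
    print(log_odds, round(inv_logit(log_odds), 3), round(expected_count(log_odds, 10), 2))
```

This is why a coefficient alone says little about the change in predicted probability: the same shift in log-odds matters a lot near 0 and very little far from it.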
  7. [Histogram: Frequency vs. Count] p = 0.014, n = 200

    > mean(y)
    [1] 2.84
    > var(y)
    [1] 2.83

  8. [Adds a second histogram] p = 0.014, n = 500

    > mean(y)
    [1] 7.07
    > var(y)
    [1] 7.02

  9. [Adds a third histogram] p = 0.014, n = 900

    > mean(y)
    [1] 12.76
    > var(y)
    [1] 12.65

  10. [Six histograms of Frequency vs. Count]

    p       n     mean(y)  var(y)
    0.014   200      2.84    2.83
    0.014   500      7.07    7.02
    0.014   900     12.76   12.65
    0.005   2000     9.96    9.79
    0.012   2000    24.80   24.72
    2e-04   6000     1.20    1.20
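The pattern in these panels, mean and variance nearly equal, is what a binomial count looks like when p is small and n is large. A quick check under that reading, using the exact binomial moments plus one seeded simulation (the seed and sample size are arbitrary choices, not values from the lecture):

```python
import random
import statistics


def binomial_moments(n, p):
    """Exact mean and variance of a Binomial(n, p) count."""
    return n * p, n * p * (1 - p)


def simulate_binomial(n, p, size, rng):
    """Draw `size` binomial counts by summing n Bernoulli trials each."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(size)]


rng = random.Random(13)  # arbitrary seed, for reproducibility only
y = simulate_binomial(200, 0.014, 2000, rng)
sample_mean, sample_var = statistics.mean(y), statistics.variance(y)
print("simulated:", round(sample_mean, 2), round(sample_var, 2))

# Settings taken from the slides; variance / mean = 1 - p, close to 1 when p is small.
for n, p in [(200, 0.014), (500, 0.014), (900, 0.014),
             (2000, 0.005), (2000, 0.012), (6000, 2e-4)]:
    mean, var = binomial_moments(n, p)
    print(n, p, round(mean, 2), round(var, 2))
```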
  11. Poisson GLMs

    • Counts without upper limit, constant expected value
    • Single parameter: events per unit time/distance
    • Variance equal to mean

    y ∼ Poisson(λ),  E(y) = λ,  var(y) = λ
    y_i ∼ Binomial(p_i, n)
    y_i ∼ Normal(µ_i, σ),  µ_i = α + βx_i,  E(y_i | x_i) = α + βx_i,  ∂E(y_i | x_i)/∂x_i = β

    [Two histograms: Frequency vs. Count]
  12. Poisson GLMs

    • Examples: Soccer goals, fission events, photons striking a detector, DNA mutations, soldiers killed by horses

    Siméon Denis Poisson (1781–1840)
    Abraham de Moivre (1667–1754)
  13. Oceanic tool complexity

    culture      population  contact  total_tools  mean_TU
    Malekula           1100  low               13      3.2
    Tikopia            1500  low               22      4.7
    Santa Cruz         3600  low               24      4.0
    Yap                4791  high              43      5.0
    Lau Fiji           7400  high              33      5.0
    Trobriand          8000  high              19      4.0
    Chuuk              9200  high              40      3.8
    Manus             13000  low               28      6.6
    Tonga             17500  high              55      5.4
    Hawaii           275000  low               71      6.6

    Dr. Michelle Kline (Simon Fraser U)

    (1) Complexity of toolkit proportional to magnitude of population?
    (2) Contact with other islands moderates impact?
  14. Anatomy of Poisson GLM

    Let’s build now. First, we make some new columns with the log of population and a dummy variable for high contact:

    R code:
        d$log_pop <- log(d$population)
        d$contact_high <- ifelse( d$contact=="high" , 1 , 0 )

    The model that conforms to the research hypothesis includes an interaction between log-population and contact rate. In math form, it is:

        T_i ∼ Poisson(λ_i)
        log λ_i = α + β_P log P_i + β_C C_i + β_PC C_i log P_i
        α ∼ Normal(0, 100)
        β_P ∼ Normal(0, 1)
        β_C ∼ Normal(0, 1)
        β_PC ∼ Normal(0, 1)

    where P is population and C is contact_high. I’ve used more strongly regularizing priors on the β parameters, because the sample is small, so we should fear overfitting more. Indeed, those Normal(0, 1) priors are probably not conservative enough. And since the predictors are not centered (more on that a little later), there’s no telling where α should end up, so I’ve assigned an essentially flat prior to it. And now to fit the model to the data. We can use map as usual:
  15. Anatomy of Poisson GLM

    Same page as slide 14. Annotation added: total_tools (outcome)

  16. Anatomy of Poisson GLM

    Same page as slide 14. Annotations: total_tools (outcome); expected tools for case i

  17. Anatomy of Poisson GLM

    Same page as slide 14. Annotations: total_tools (outcome); expected tools for case i; log link
  18. Log link

    • Goal: Map linear model to positive reals

    [Figure 9.7, two panels: log measurement vs. x (left); original measurement vs. x (right)]

  19. Figure 9.7. The log link transforms a linear model (left) into a strictly positive measurement (right). This transform results in an exponential scaling of the linear model, with a unit change on the linear scale mapping onto increasingly larger changes on the outcome scale.

    Recall that we defined interaction (Chapter 7) as a situation in which the effect of a predictor depends upon the value of another predictor. Well, now every predictor essentially interacts with itself, because the impact of a change in a predictor depends upon the value of the predictor before the change. More generally, every predictor variable effectively interacts with every other predictor variable, whether you explicitly model them as interactions or not. This fact makes the visualization of counterfactual predictions even more important for understanding what the model is telling you.

    The second very common link function is the log link. This link function maps a parameter that is defined over only positive real values onto a linear model. For example, suppose we want to model the standard deviation σ of a Gaussian distribution so it is a function of a predictor variable x. The parameter σ must be positive, because a standard deviation cannot be negative nor can it be zero. The model might look like:

        y_i ∼ Normal(µ, σ_i)
        log(σ_i) = α + βx_i

    In this model, the mean µ is constant, but the standard deviation scales with the value x_i. A log link is both conventional and useful in this situation. It prevents σ from taking on a negative value.

    What the log link effectively assumes is that the parameter’s value is the exponentiation of the linear model. Solving log(σ_i) = α + βx_i for σ_i yields the inverse link:

        Solve for sigma:  σ_i = exp(α + βx_i)

    The impact of this assumption can be seen in Figure 9.7. Using a log link for a linear model (left) implies an exponential scaling of the outcome with the predictor variable (right).
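The exponential scaling implied by the log link can be checked numerically. In this sketch alpha and beta are hypothetical values chosen only for illustration; the grid of x values mirrors the axis range of the figure.

```python
import math


def expected_value(x, alpha, beta):
    """Inverse log link: exponentiate the linear model."""
    return math.exp(alpha + beta * x)


alpha, beta = 1.0, 1.5  # hypothetical values, for illustration only
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
values = [expected_value(x, alpha, beta) for x in xs]

# Each equal step in x multiplies the outcome by the same factor, exp(beta * step) ...
ratios = [later / earlier for earlier, later in zip(values, values[1:])]
# ... so the absolute change grows as x grows.
diffs = [later - earlier for earlier, later in zip(values, values[1:])]
print([round(v, 2) for v in values])
print([round(r, 3) for r in ratios])
print([round(d, 2) for d in diffs])
```

The constant ratio with growing differences is the sense in which a predictor "interacts with itself" under this link.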
  20. Anatomy of Poisson GLM

    Same page as slide 14. Annotations: total_tools (outcome); expected tools for case i; log link; log population; contact (0/1); interaction

  21. Anatomy of Poisson GLM

    Same page as slide 14. Annotations: total_tools (outcome); expected tools for case i; log link
  22. Fitting

    The impact of population on tool counts is increased by high contact. This is to say that the association between total_tools and log population depends upon contact. So we will look for a positive interaction between log population and contact.

        α ∼ Normal(0, 100)
        β_P ∼ Normal(0, 1)
        β_C ∼ Normal(0, 1)
        β_PC ∼ Normal(0, 1)

    where P is population and C is contact_high. I’ve used more strongly regularizing priors on the β parameters, because the sample is small, so we should fear overfitting more. Indeed, those Normal(0, 1) priors are probably not conservative enough. And since the predictors are not centered (more on that a little later), there’s no telling where α should end up, so I’ve assigned an essentially flat prior to it. And now to fit the model to the data. We can use map as usual:

    R code:
        m10.10 <- map(
            alist(
                total_tools ~ dpois( lambda ),
                log(lambda) <- a + bp*log_pop +
                    bc*contact_high + bpc*contact_high*log_pop,
                a ~ dnorm(0,100),
                c(bp,bc,bpc) ~ dnorm(0,1)
            ),
            data=d )
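What a fitting routine maximizes for this model is the log-posterior: the Poisson log-likelihood of the tool counts plus the log-densities of the four priors. A hedged sketch of that quantity, written out by hand from the model definition, is below. The island data come from the table earlier in the deck; the function names are illustrative, and this is not the rethinking package's code.

```python
import math

# (population, contact_high, total_tools) from the islands table.
ISLANDS = [
    (1100, 0, 13), (1500, 0, 22), (3600, 0, 24), (4791, 1, 43), (7400, 1, 33),
    (8000, 1, 19), (9200, 1, 40), (13000, 0, 28), (17500, 1, 55), (275000, 0, 71),
]


def log_normal_pdf(x, mean, sd):
    """Log-density of Normal(mean, sd) at x."""
    return -0.5 * math.log(2 * math.pi * sd**2) - (x - mean) ** 2 / (2 * sd**2)


def log_poisson_pmf(k, lam):
    """Log-probability of count k under Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def log_posterior(a, bp, bc, bpc):
    """Unnormalized log-posterior of the interaction model."""
    lp = log_normal_pdf(a, 0, 100)
    lp += sum(log_normal_pdf(b, 0, 1) for b in (bp, bc, bpc))
    for population, contact_high, total_tools in ISLANDS:
        log_pop = math.log(population)
        log_lam = a + bp * log_pop + bc * contact_high + bpc * contact_high * log_pop
        lp += log_poisson_pmf(total_tools, math.exp(log_lam))
    return lp


# Two arbitrary parameter settings: the second allows tool counts to grow with log-population.
print(log_posterior(0.0, 0.0, 0.0, 0.0))
print(log_posterior(1.0, 0.25, 0.0, 0.0))
```

An optimizer or sampler only ever needs this function (and possibly its gradient), which is why the model formula and the code listing line up one-to-one.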
  23. Beware marginal estimates

    Let’s glance at the estimates, just to remind ourselves that when the model includes an interaction, and especially when the predictors are not centered, we can’t tell from the table of estimates alone what is going on. I’ll show the dotchart for the estimates as well:

    R code:
        precis(m10.10,corr=TRUE)
        plot(precis(m10.10))

              Mean  StdDev   5.5%  94.5%      a     bp     bc    bpc
        a     0.94    0.36   0.37   1.52   1.00  -0.98  -0.13   0.07
        bp    0.26    0.03   0.21   0.32  -0.98   1.00   0.12  -0.08
        bc   -0.09    0.84  -1.43   1.25  -0.13   0.12   1.00  -0.99
        bpc   0.04    0.09  -0.10   0.19   0.07  -0.08  -0.99   1.00

    [Dotchart of the estimates for bpc, bc, bp, a; horizontal axis: Value, from -1.5 to 1.5]

    I’ve used the corr=TRUE option to include the correlations among the parameters. But first notice that the main effect of log-population, bp, is positive and precise, but that both bc and bpc overlap zero substantially. So you might think it’s safe to say that log-population is reliably associated with the total tools, but that contact rate has no impact on prediction in …
  24. Pairs plot (Stan samples)

    [Pairs plot of posterior samples for a, bp, bc, bcp. Correlations shown: a–bp -0.98; a–bc -0.14; bp–bc 0.14; a–bcp 0.08; bp–bcp -0.1; bc–bcp -0.99.]
    [A second, cropped pairs plot; visible correlations: 0.1, -0.46, -0.76, 0.09.]
  25. [Cropped figure: joint posterior of bc (horizontal axis, -3 to 2) and bpc (vertical axis, -0.2 to 0.3). Annotations: “bpc+, bc–” and “bpc–, bc+”. Caption fragment: “distribution of plausible difference in average tool …”]
  26. Focus on predictions

    R code:
        lambda_high <- exp( post$a + post$bc + (post$bp + post$bpc)*8 )
        lambda_low <- exp( post$a + post$bp*8 )

    Since the posterior is a distribution, the contents of lambda_high and lambda_low are also distributions. Now let’s compute the difference between these two distributions, to get the distribution of plausible differences in tools between an island with high contact and one with low contact:

    R code:
        diff <- lambda_high - lambda_low
        sum(diff > 0)/length(diff)
        [1] 0.9527

    Figure 10.8. Left: The distribution of plausible difference in average tool …
    [Panels: Density vs. lambda_high - lambda_low; bpc vs. bc]
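The same calculation can be sketched for any collection of posterior draws. Real posterior samples are not available in this transcript, so the example below feeds in a single hypothetical "draw" placed at the rounded posterior means from the precis table (a = 0.94, bp = 0.26, bc = -0.09, bpc = 0.04); in practice the list would hold thousands of samples and the result would be a proportion like the 0.9527 above.

```python
import math


def lambda_diff(a, bp, bc, bpc, log_pop=8.0):
    """Expected tools at high contact minus low contact, at the given log-population."""
    lambda_high = math.exp(a + bc + (bp + bpc) * log_pop)
    lambda_low = math.exp(a + bp * log_pop)
    return lambda_high - lambda_low


def prob_positive(draws):
    """Share of posterior draws (a, bp, bc, bpc) in which high contact predicts more tools."""
    diffs = [lambda_diff(*draw) for draw in draws]
    return sum(d > 0 for d in diffs) / len(diffs)


# Hypothetical stand-in for posterior samples: one draw at the rounded posterior means.
draws = [(0.94, 0.26, -0.09, 0.04)]
print(round(lambda_diff(*draws[0]), 2), prob_positive(draws))
```

Even at the posterior means the predicted difference is several tools, despite bc and bpc each overlapping zero in the table. Because they are strongly negatively correlated, their joint effect is much better determined than either marginal interval suggests.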
  27. Model comparison

    R code:
        # using n=1e4 for more stable WAIC estimates
        # will also plot the comparison
        ( islands.compare <- compare(m10.10,m10.11,m10.12,m10.13,m10.14,n=1e4) )
        plot(islands.compare)

                 WAIC  pWAIC  dWAIC  weight     SE    dSE   predictors
        m10.11   79.0    4.2    0.0    0.62  11.19     NA   log pop, contact
        m10.10   80.1    4.9    1.2    0.35  11.42   1.28   interaction
        m10.12   84.6    3.8    5.6    0.04   8.91   8.47   log pop only
        m10.14  141.5    8.2   62.5    0.00  31.53  34.42   null (intercept only)
        m10.13  149.8   16.7   70.8    0.00  43.96  46.01   contact only

    [Plot: deviance and WAIC for each model; from top to bottom m10.13, m10.14, m10.12, m10.10, m10.11]

    The top two models include both predictors, but the top model, m10.11, excludes the interaction between them. There’s a lot of model weight assigned to both, however. This suggests …

    Since WAIC is constructed over predictions, it automatically accounts for posterior correlations.
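The weight column can be approximately reproduced from the WAIC column alone, using the usual Akaike-weight formula (each model's weight is proportional to exp(-dWAIC/2)). The sketch below assumes that formula and uses the WAIC values as printed, so small rounding differences from the table are expected.

```python
import math

# WAIC values as printed in the comparison table.
WAIC = {"m10.11": 79.0, "m10.10": 80.1, "m10.12": 84.6, "m10.14": 141.5, "m10.13": 149.8}


def akaike_weights(waic):
    """Convert WAIC values to relative weights: exp(-dWAIC / 2), normalized to sum to 1."""
    best = min(waic.values())
    raw = {name: math.exp(-0.5 * (value - best)) for name, value in waic.items()}
    total = sum(raw.values())
    return {name: r / total for name, r in raw.items()}


weights = akaike_weights(WAIC)
for name, w in weights.items():
    print(name, round(w, 2))
```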
  28. Prediction ensemble

    [Figure 10.9: total tools (vertical axis, 20 to 70) against log-population (horizontal axis, 7 to 12), with ensemble predictions labeled “high contact” and “low contact”]
  29. Poisson GLMs

    • For counts without obvious upper bound
    • log link is customary; linear model of magnitude
    • Beware exploding exponential predictions
    • Focus on predictions, not parameters
    • Convert back to count scale to interpret/plot
    • Use offset to adjust exposure duration/distance
    • Predictions tend to be under-dispersed relative to data
    • Common problem for both binomial and Poisson GLMs => un-modeled heterogeneity
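The offset bullet can be made concrete: if a count was accumulated over an exposure (days, kilometres), the log of that exposure is added to the linear model with its coefficient fixed at one. A minimal sketch with hypothetical coefficients:

```python
import math


def expected_count(alpha, beta, x, exposure):
    """Poisson mean with an exposure offset: log(mu) = log(exposure) + alpha + beta * x."""
    return math.exp(math.log(exposure) + alpha + beta * x)


alpha, beta = 0.5, 0.1  # hypothetical coefficients, for illustration only
daily = expected_count(alpha, beta, x=1.0, exposure=1.0)   # counts recorded per day
weekly = expected_count(alpha, beta, x=1.0, exposure=7.0)  # same rate, recorded per week
print(round(daily, 3), round(weekly, 3), round(weekly / daily, 3))
```

With the offset in place, alpha and beta describe a rate per unit of exposure, so observations recorded over different durations can share one model.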
  30. Additional count distributions

    • Multinomial: generalized binomial, more than 2 un-ordered outcomes
      • Tricky to use and understand
    • Geometric: number of trials until specific event
      • Common event-history (survival) distribution
    • Mixtures, coping with heterogeneity:
      • Beta-binomial: varying probabilities
      • gamma-Poisson: varying rates
      • many others (e.g. Dirichlet-multinomial)
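A simulation may make the "varying rates" idea behind the gamma-Poisson mixture more tangible: when each unit has its own Poisson rate, the counts are more spread out than a single-rate Poisson with the same mean. The sketch assumes a gamma distribution with shape 2 and scale 2.5 for the rates (arbitrary illustrative values), and uses Knuth's simple multiplication method to draw Poisson counts because the Python standard library has no Poisson sampler.

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for the modest rates used here."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def gamma_poisson_draw(shape, scale, rng):
    """Each unit gets its own rate from a gamma distribution, then a Poisson count."""
    return poisson_draw(rng.gammavariate(shape, scale), rng)


rng = random.Random(11)  # arbitrary seed, for reproducibility only
pure = [poisson_draw(5.0, rng) for _ in range(5000)]
mixed = [gamma_poisson_draw(2.0, 2.5, rng) for _ in range(5000)]  # mean rate 2.0 * 2.5 = 5

print("Poisson:       mean", round(statistics.mean(pure), 2), "var", round(statistics.variance(pure), 2))
print("gamma-Poisson: mean", round(statistics.mean(mixed), 2), "var", round(statistics.variance(mixed), 2))
```

The extra variance in the mixture is the over-dispersion that a plain Poisson GLM cannot reproduce when heterogeneity is left un-modeled.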
  31. Monsters & mixtures

    • More complicated GLMs:
    • Monsters: Specialized, complex distributions
      • ordered categories, ranks
    • Mixtures: Blends of stochastic processes
      • Varying means, probabilities, rates
      • Varying process: zero-inflation, hurdles
  32. Ordered categories

    • How much do you like this class? (1–7)
    • How important is income of a potential spouse? (1–7)
    • How often do you see bats around Leipzig? (never, sometimes, frequently)
  33. • Discrete outcomes
    • Defined minimum and maximum
    • Defined order
    • “Distances” between categories unknown
  34. Ordered categories

    • Hard to model
      • Not continuous
      • Not counts
    • Solution: ordered logistic regression
      • categorical model with a fancy link function
    • Good example of making a monster
  35. Three principles

    • Action: Harm caused by action is morally worse than same harm caused by inaction.
    • Intention: Harm intended as means to goal worse than same harm foreseen as a side effect of goal.
    • Contact: Harm caused by physical contact worse than same harm without physical contact.
  36. Moral intuitions

    • Cushman et al. experiments
    • 331 individuals, 30 scenarios, 9930 responses
    • How do responses vary with action, intention, contact?
    • Age, gender, individual?

    [Histogram: Frequency vs. How permissible (1–7)]
  37. [Four histograms of Frequency vs. How permissible (1–7): one unlabeled panel, plus panels labeled contact, action, and intention]
  38. Ordered logit

    • A log-cumulative-odds link probability model

    Figure 11.1. Redescribing a discrete distribution using log-cumulative-odds. Left: Histogram of discrete response in the sample. Middle: Cumu…
    [Panels, each against response (1–7): Frequency; cumulative proportion; log-cumulative-odds]

  39. Ordered logit

    Same content as slide 38.

  40. Ordered logit

    Same content as slide 38.
  41. Ordered logit

    • A log-cumulative-odds link probability model

    … naturally constrains itself to never exceeding a total probability of one. And because this is an ordered density, we know that the cumulative log-odds of the largest observable value must be +∞, which is the same as cumulative probability of one. This anchors the distribution and standardizes it at the same time. If you start instead with individual probabilities of each outcome, then you’d have to standardize these probabilities to ensure they sum to exactly one. It turns out to be easier to just start with the cumulative probability and work backwards to the individual probabilities. I know this seems …, but I’ll walk you through it.

    What we want is for the cumulative log-odds of an observed value y_i being equal-to-or-less-than some possible value k to be:

        log( Pr(y_i ≤ k) / (1 − Pr(y_i ≤ k)) ) = φ_k

    where φ_k is a continuous value different for each observable value k. We’ll make this value into a linear model in a bit. For now, it’s just a placeholder. The above function is just a direct embodiment of the log-odds and cumulative density objectives we’ve stated so far. It actually isn’t anything else at all. Now we solve for the cumulative probability …

    [Plot: cumulative proportion vs. response (1–7)]
  45. Ordered logit • A log-cumulative-odds link probability model

    $$\log \frac{\Pr(y_i \le k)}{1 - \Pr(y_i \le k)} = \phi_k$$

    Slide annotations on the equation: cumulative log-odds (the left-hand side), response ($y_i$), category ($k$), linear model ($\phi_k$).
  46. Ordered logit • A log-cumulative-odds link probability model

    Now we solve for the cumulative probability itself, $\Pr(y_i \le k)$. After a bit of algebra, you get:

    $$\Pr(y_i \le k) = \frac{\exp(\phi_k)}{1 + \exp(\phi_k)}$$

    You might recognize this probability as the logistic, same as in the last chapter. It arose in the same way, establishing the logistic function as the inverse link for the binomial model. But now we have a cumulative logistic, since the probability $\Pr(y_i \le k)$ is cumulative. We still need likelihoods, which are not cumulative.

    [Figure: cumulative proportion (0 to 1) by response (1 to 7).]
  49. Ordered logit • A log-cumulative-odds link probability model

    The cumulative logistic is a probability density, so you can use it to compute the likelihood of any observation $y_i$. By definition, in a discrete density, the likelihood of any observation $y_i = k$ must be:

    $$\Pr(y_i = k) = \Pr(y_i \le k) - \Pr(y_i \le k - 1)$$

    This says that since the logistic is cumulative, we can compute the discrete probability of exactly $y_i = k$ by subtracting the cumulative probability of the next observable value lower than $k$.

    [Figure: cumulative proportion (0 to 1) by response (1 to 7).]
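The two steps on these slides (invert the link, then difference adjacent cumulative probabilities) can be sketched in base R. The helper names `inv_logit` and `ordered_probs`, and the cutpoint values, are assumptions chosen for illustration; the rethinking package's own `dordlogit` is not used here.

```r
# Inverse link: cumulative probability from a cumulative log-odds value.
inv_logit <- function(x) 1 / (1 + exp(-x))

# Probability of each response level given K-1 ordered cutpoints (phi_k).
# The K-th cumulative probability is 1, because its log-odds is +Inf.
ordered_probs <- function(cutpoints) {
  cum_p <- c(inv_logit(cutpoints), 1)   # Pr(y <= k), k = 1..K
  cum_p - c(0, cum_p[-length(cum_p)])   # Pr(y = k) = Pr(y <= k) - Pr(y <= k-1)
}

# Hypothetical cutpoints for a 7-level outcome.
phi <- c(-2, -1, 0, 1, 2, 2.5)
p <- ordered_probs(phi)
print(round(p, 3))
```

Because the cumulative probabilities end at exactly one, the individual probabilities sum to one without any separate normalization step.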
  50. Ordered logit • Simplest model just uses an intercept for each category:

    The outcome we'll be interested in is response, which is an integer from 1 to 7 indicating how morally permissible the participant found the action taken (or not). Since this type of rating is categorical and ordered, it's exactly the type of outcome for our ordered logit model. The predictor variables of interest are going to be action, intention, and contact, each a dummy variable corresponding to each principle outlined above.

    The basic model. We'll begin with a basic model that includes an intercept for each level of the outcome. Conventions for writing mathematical forms of the ordered logit vary a lot. We'll use this:

    $$R_i \sim \mathrm{Ordered}(\mathbf{p})$$
    $$\log \frac{p_k}{1 - p_k} = \alpha_k \qquad \text{[cumulative link]}$$
    $$\alpha_k \sim \mathrm{Normal}(0, 10) \qquad \text{[common prior]}$$

    Slide annotations: $\alpha_k$ is the intercept unique to category $k$; $\mathbf{p}$ holds the cumulative probabilities of each response.

    In code form, for map and map2stan, the link function will be embedded in the likelihood function already. The code listing for the basic model, m11.1, is cut off on the slide.
  51. Ordered logit in Stan

    And of course those are the same as the values in cum_pr_k that we computed earlier. But now we also have a posterior distribution around these values, and we're ready to add predictor variables in the next section. To fit the same model using Stan's HMC engine, it is better to use an explicit vector of intercept parameters:

        # note that data with name 'case' not allowed in Stan
        # so will pass pruned data list
        m11.1stan <- map2stan(
            alist(
                response ~ dordlogit( phi , cutpoints ),
                phi <- 0,
                cutpoints ~ dnorm(0,10)
            ) ,
            data=list(response=d$response),
            start=list(cutpoints=c(-2,-1,0,1,2,2.5)) ,
            chains=2 , cores=2 )

        # need depth=2 to show vector of parameters
        precis(m11.1stan,depth=2)

                      Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
        cutpoints[1] -1.92   0.03      -1.97      -1.87  1012    1
        cutpoints[2] -1.27   0.02      -1.31      -1.23  1461    1
        cutpoints[3] -0.72   0.02      -0.75      -0.68  1845    1
        cutpoints[4]  0.25   0.02       0.22       0.28  2000    1
        cutpoints[5]  0.89   0.02       0.85       0.92  2000    1
        cutpoints[6]  1.77   0.03       1.72       1.81  1851    1

    The individual cutpoints parameters correspond to each $\alpha_k$ from earlier.
  52. 7th cutpoint missing, because known to be infinity (on logit scale)

        logistic( coef( m11.1stan ) )
        cutpoints[1] cutpoints[2] cutpoints[3] cutpoints[4] cutpoints[5] cutpoints[6]
           0.1281697    0.2198018    0.3276686    0.5617471    0.7091352    0.8546406
  53. [Figure: cumulative proportion (0 to 1) by response (1 to 7), with the six cutpoints marked [1] to [6]]

        logistic( coef( m11.1stan ) )
        cutpoints[1] cutpoints[2] cutpoints[3] cutpoints[4] cutpoints[5] cutpoints[6]
           0.1281697    0.2198018    0.3276686    0.5617471    0.7091352    0.8546406
  54. Adding predictor variables

    To include predictor variables, we define the log-cumulative-odds of each response $k$ as a sum of its intercept $\alpha_k$ and a typical linear model. Suppose, for example, we want to add a predictor $x$ to the model. We'll do this by defining a linear model $\phi_i = \beta x_i$. Then each cumulative logit becomes:

    In general:

    $$\log \frac{\Pr(y_i \le k)}{1 - \Pr(y_i \le k)} = \alpha_k - \phi_i$$
    $$\phi_i = \beta x_i$$

    This form automatically ensures the correct ordering of the outcome values, while still morphing the likelihood of each individual value as the predictor $x_i$ changes value. Why is the linear model $\phi$ subtracted from each intercept? Because if we decrease the log-cumulative-odds [...] diminished, while the values on the right have increased. The expected value is now:

        sum( pk*(1:7) )
        [1] 4.72974

    And that's why we subtract $\phi$, the linear model $\beta x_i$, from each intercept, rather than add it. This way, a positive $\beta$ value indicates that an increase in the predictor variable $x$ results in an increase in the average response.

    Now we can turn back to our "trolley" data and include predictor variables to help explain variation in responses. The predictor variables of interest are going to be action, intention, and contact, each a dummy variable corresponding to each principle outlined earlier. The log-cumulative-odds of each response $k$ will now be:

    Trolley data:

    $$\log \frac{\Pr(y_i \le k)}{1 - \Pr(y_i \le k)} = \alpha_k - \phi_i$$
    $$\phi_i = \beta_A A_i + \beta_I I_i + \beta_C C_i$$

    where $A_i$ indicates the value of action on row $i$, $I_i$ indicates the value of intention on row $i$, and $C_i$ indicates the value of contact on row $i$. What we've done here is define the log-odds of each possible response to be an additive model of the features of the story corresponding to each response.

    NO INTERCEPT in phi!
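The effect of subtracting $\phi$ can be checked numerically. The sketch below is an illustration in base R, not the book's code: `response_probs` is a hypothetical helper, the cutpoints are the rounded posterior means from the "Ordered logit in Stan" slide, and the shift of 0.5 is an assumed value for $\phi$.

```r
inv_logit <- function(x) 1 / (1 + exp(-x))

# Response probabilities when the log-cumulative-odds of response k
# is alpha_k - phi.
response_probs <- function(alpha, phi = 0) {
  cum_p <- c(inv_logit(alpha - phi), 1)
  cum_p - c(0, cum_p[-length(cum_p)])
}

# Rounded posterior means of the six cutpoints from the Stan fit.
alpha <- c(-1.92, -1.27, -0.72, 0.25, 0.89, 1.77)

# Expected response with phi = 0 and with phi = 0.5 (assumed shift).
ev_base  <- sum(response_probs(alpha, phi = 0)   * 1:7)
ev_shift <- sum(response_probs(alpha, phi = 0.5) * 1:7)
print(c(ev_base = ev_base, ev_shift = ev_shift))
```

A positive $\phi$ lowers every cumulative log-odds, which moves probability mass toward higher responses, so the expected response rises.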
  55. Adding predictor variables

    You fit this model just as you'd expect, by adding the slopes and predictor variables to the phi parameter inside dordlogit. Here's a working model:

        m11.2 <- map(
            alist(
                response ~ dordlogit( phi , c(a1,a2,a3,a4,a5,a6) ) ,
                phi <- bA*action + bI*intention + bC*contact,
                c(bA,bI,bC) ~ dnorm(0,10),
                c(a1,a2,a3,a4,a5,a6) ~ dnorm(0,10)
            ) ,
            data=d ,
            start=list(a1=-1.9,a2=-1.2,a3=-0.7,a4=0.2,a5=0.9,a6=1.8) )

    The parameter phi now contains the additive function with slope parameters and predictor variables. Notice that there is no link function around phi, because the link is really inside dordlogit already (hence "logit" in its name). Notice also that I've adopted the approximate MAP estimates from the previous model, m11.1, as starting values for the intercepts. This helps map find the new MAP estimates more quickly.
  56. Plotting ordered logits

    • Oh, bother: the posterior prediction is a vector of probabilities, one for each level of the outcome
    • How to plot this?
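One way to approach this, sketched here in base R, is to plot the cumulative probabilities against a predictor, so that the gaps between adjacent lines show the probability of each response level. The cutpoints and the slope `b_intention` below are hypothetical values for illustration, not estimates from the lecture.

```r
inv_logit <- function(x) 1 / (1 + exp(-x))

# Hypothetical values: six cutpoints and a slope on intention.
alpha <- c(-1.9, -1.2, -0.7, 0.2, 0.9, 1.8)
b_intention <- -0.7

intention <- c(0, 1)

# Rows: cutpoints k = 1..6; columns: intention = 0 and 1.
# Each entry is Pr(response <= k) at that predictor value.
cum_p <- sapply(intention, function(x) inv_logit(alpha - b_intention * x))

# One line per cutpoint; the vertical gaps between lines are the
# probabilities of the individual response levels.
plot(NULL, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "intention", ylab = "probability", xaxt = "n")
axis(1, at = intention)
for (k in seq_along(alpha)) lines(intention, cum_p[k, ])
```

With a negative slope, every line rises from intention = 0 to intention = 1, which means more probability sits on the low response levels.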
  57. [Figure 11.3, partially shown: probability (0 to 1) against intention (0, 1) for response levels 1 to 7; panels action=0, contact=0; action=1; action=0, contact=1]
  58. [Figure 11.3: three panels of probability (0 to 1) against intention (0, 1), with regions for response levels 1 to 7; panels action=0, contact=0; action=1, contact=0; action=0, contact=1]

    Figure 11.3. Posterior predictions of the ordered categorical model with interactions, m11.3. Each plot shows how the distribution of predicted responses varies by intention. Left: Effect of intention when action and contact are both zero. The other two plots each change either action or contact to one.
  59. [Figure: frequency (0 to 350) by response (1 to 7) for intention=1, contact=1; model: ordered logit; legend: data, post predict]
  60. [Figure: two panels of frequency (0 to 350) by response (1 to 7) for intention=1, contact=1; left: ordered logit; right: binomial; legend: data, post predict]
  61. Ordered logit

    • MAP estimation can be hard; choose good starting values. See notes for details.
    • Stan handles these models fine. Will be slower than other outcome types.
    • Also ordered probit; uses cumulative normal link
  62. Mixtures

    • Some outcomes mix different processes
    • Replace parameter of likelihood with distribution of its own
    • Over-dispersion: counts often more variable than expected, because probabilities/rates are variable
    • beta-binomial, gamma-Poisson (negative-binomial)
    • Zero-inflated mixtures
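The over-dispersion point can be illustrated with a small simulation in base R. This is a sketch under assumed parameter values (mean rate 3, gamma shape 2), chosen only to make the contrast visible: when each count gets its own rate from a gamma distribution, the variance of the counts exceeds their mean.

```r
set.seed(1)
n <- 10000

# Pure Poisson counts with a fixed rate of 3: variance is about equal to the mean.
y_pois <- rpois(n, lambda = 3)

# Gamma-Poisson: each observation gets its own rate, drawn from a
# gamma distribution with mean 3 (shape 2, rate 2/3).
rates <- rgamma(n, shape = 2, rate = 2 / 3)
y_mix <- rpois(n, lambda = rates)

print(c(mean = mean(y_pois), var = var(y_pois)))
print(c(mean = mean(y_mix),  var = var(y_mix)))
```

Both samples have roughly the same mean, but the mixture's variance is substantially larger, which is the signature a plain Poisson model cannot reproduce.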
  63. Monastery Mystery

    • Monks copy manuscripts
    • They also get drunk
    • Data: number of manuscripts completed each day
    • Can we infer the number of days they got drunk?
  64. Analyze?

    • Zero-inflated Poisson observations
    • Hidden state: drunk or sober
    • Can estimate probability of drinking and rate of production when sober
    • Must build a new likelihood, a mixture of stochastic processes

    [Figure 11.4, left panel: the monks drink with probability p and then observe y = 0, or work with probability 1 – p and then observe y = 0 or y > 0.]
  65. Analyze?

    [Figure 11.4: left, the drink/work tree; right, frequency (0 to 150) by manuscripts completed (0 to 5), with the "drunk zeros" marked.]

    Figure 11.4. Left: Structure of the zero-inflated likelihood calculation. Beginning at the top, the monks drink $p$ of the time or instead work $1 - p$ of the time. Drinking monks always produce an observation $y = 0$. Working monks may produce either $y = 0$ or $y > 0$. Right: Frequency distribution of zero-inflated observations. The blue line segment over zero shows the $y = 0$ observations that arose from drinking. In real data, we typically cannot see which zeros come from which process.
  69. [Diagram: a binomial process sends each day down one of two branches]

    • With probability $p$: observe zero.
    • With probability $1 - p$: a Poisson process, which yields a zero with probability $\exp(-\lambda)$ or a count $n$ with probability $\frac{\lambda^n \exp(-\lambda)}{n!}$.

    $$\Pr(0 \mid p, \lambda) = p + (1 - p)\exp(-\lambda)$$
    $$\Pr(n \mid p, \lambda) = (1 - p)\frac{\lambda^n \exp(-\lambda)}{n!}$$
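The two branch formulas translate directly into a probability mass function. The sketch below uses base R only; the name `dzip` is hypothetical (chosen to avoid confusion with the rethinking package's `dzipois`), and the example values p = 0.2 and lambda = 1 match the simulation used later in the lecture.

```r
# Zero-inflated Poisson probability mass, following the two formulas:
#   Pr(0 | p, lambda) = p + (1 - p) * exp(-lambda)
#   Pr(n | p, lambda) = (1 - p) * lambda^n * exp(-lambda) / n!   for n > 0
dzip <- function(y, p, lambda) {
  ifelse(y == 0,
         p + (1 - p) * dpois(0, lambda),
         (1 - p) * dpois(y, lambda))
}

# Drink on 20% of days, one manuscript per working day on average.
probs <- dzip(0:50, p = 0.2, lambda = 1)
print(round(probs[1:4], 4))
```

Summing `probs` over a wide enough range of counts gives a value very close to one, a quick sanity check that the mixture is a proper distribution.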
  70. Zero-inflated Poisson model

    Since drinking monks never produce $y > 0$, the expression above is just the chance the monks both work, $1 - p$, and finish $y$ manuscripts. Define ZIPoisson as the distribution above, with parameters $p$ (probability of a zero) and $\lambda$ (mean of Poisson) to describe its shape. Then a zero-inflated Poisson regression takes the form:

    $$y_i \sim \mathrm{ZIPoisson}(p_i, \lambda_i)$$
    $$\mathrm{logit}(p_i) = \alpha_p + \beta_p x_i$$
    $$\log(\lambda_i) = \alpha_\lambda + \beta_\lambda x_i$$

    Notice that there are two linear models and two link functions, one for each process in the ZIPoisson. The parameters of the linear models differ, because any predictor such as $x$ may be associated differently with each part of the mixture. In fact, you don't even have to use the same predictors in both models; you can construct the two linear models however you wish, depending upon your hypothesis.

    Slide annotation: Linear models are independent

    [Figure 11.4, left panel: the drink/work tree.]
  71. Simulate, validate, cromulate

    • As models get more complicated, no guarantees you can
        • specify model correctly
        • estimate actual process reliably
    • Bayes not magic, just logic
    • Simulate “dummy data”
        • recover estimates
        • understand the model
    • Try parameter combinations hostile to estimation, so you know limits of the golem
  72. Simulated manuscripts

    We have everything we need now, except for some actual data. So let's simulate the monks' drinking and working. Then you'll see the code used to recover the parameter values used in the simulation.

        # define parameters
        prob_drink <- 0.2 # 20% of days
        rate_work <- 1    # average 1 manuscript per day

        # sample one year of production
        N <- 365

        # simulate days monks drink
        drink <- rbinom( N , 1 , prob_drink )

        # simulate manuscripts completed
        y <- (1-drink)*rpois( N , rate_work )

    The outcome variable we get to observe is y, which is just a list of counts of completed manuscripts, one count for each day of the year. Take a look at the outcome variable:

        simplehist( y , xlab="manuscripts completed" , lwd=4 )
        zeros_drink <- sum(drink)
        zeros_work <- sum(y==0 & drink==0)
        zeros_total <- sum(y==0)
        lines( c(0,0) , c(zeros_work,zeros_total) , lwd=4 , col=rangi2 )

    This plot is shown on the right-hand side of Figure 11.4.

    [Figure 11.4: left, the drink/work tree; right, frequency (0 to 150) by manuscripts completed (0 to 5), with the "drunk zeros" marked.]
  73. Fit model to dummy data

    The zeros produced by drinking are shown in blue. Those from work are shown in black. The total number of zeros is inflated, relative to a typical Poisson distribution.

    And to fit the model, the rethinking package provides the zero-inflated Poisson likelihood as dzipois. For more detail on how it relates to the mathematics above, see the Overthinking box at the end of this section. Using dzipois is straightforward:

        m11.4 <- map(
            alist(
                y ~ dzipois( p , lambda ),
                logit(p) <- ap,
                log(lambda) <- al,
                ap ~ dnorm(0,1),
                al ~ dnorm(0,10)
            ) ,
            data=list(y=y) )
        precis(m11.4)

            Mean StdDev 2.5% 97.5%
        ap -1.39   0.31 -2.0 -0.78
        al  0.05   0.08 -0.1  0.21
  74. On the natural scale, those MAP estimates are:

        logistic(-1.39) # probability drink
        exp(0.05)       # rate finish manuscripts, when not drinking

        [1] 0.1994078
        [1] 1.051271

    Notice that we can get an accurate estimate of the proportion of days the monks drink, even though we can't say for any particular day whether or not they drank.

    This example is the simplest possible. In real problems, you might have predictor variables that are associated with one or both processes inside the zero-inflated Poisson mixture.
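The same simulate-and-recover check can be run without the rethinking package. The sketch below swaps in plain maximum likelihood via base R's `optim` in place of `map`, so there are no priors; it also simulates ten years rather than one, an assumption made only so the recovered values are stable. The function name `nll` is hypothetical.

```r
set.seed(365)

# Simulate as on the slides, but for ten years of days.
prob_drink <- 0.2
rate_work  <- 1
N <- 3650
drink <- rbinom(N, 1, prob_drink)
y <- (1 - drink) * rpois(N, rate_work)

# Negative log-likelihood of the zero-inflated Poisson.
# par[1] is logit(p), par[2] is log(lambda).
nll <- function(par, y) {
  p <- 1 / (1 + exp(-par[1]))
  lambda <- exp(par[2])
  ll <- ifelse(y == 0,
               log(p + (1 - p) * exp(-lambda)),
               log(1 - p) + dpois(y, lambda, log = TRUE))
  -sum(ll)
}

# Minimize the negative log-likelihood, then convert to the natural scale.
fit <- optim(c(0, 0), nll, y = y)
p_hat <- 1 / (1 + exp(-fit$par[1]))
lambda_hat <- exp(fit$par[2])
print(c(p_hat = p_hat, lambda_hat = lambda_hat))
```

If the recovered values land near 0.2 and 1, the likelihood is specified correctly; trying harsher settings (for example a very small `rate_work`, where working zeros swamp drinking zeros) shows where estimation starts to struggle.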
  75. Other mixtures

    • Can ZIBinomial, too
    • Also “hurdle” models, aka zero-augmented
    • Continuous mixtures for overdispersed counts
    • beta-binomial
    • gamma-Poisson
    • We’ll focus on multilevel models instead