ignore base rate
• Predictions on absolute effect scale, account for base rate
• Using relative effects may exaggerate importance of predictor
• Good for scaring people, getting published
• Not so good for public health, scientific progress
[Figure labels: relative shark, absolute penguin]
risk
• Example:
• 1/1000 women develop blood clots
• 3/1000 women on birth control develop blood clots
• => 200% increase in blood clots!
• Change in probability is only 0.002
• Pregnancy much more dangerous than blood clots
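The arithmetic in the blood clot example can be checked in a few lines. This is only an illustrative sketch; the variable names are not from the slides.

```python
# Blood clot example: relative vs. absolute effect.
# Variable names are illustrative, not from the original slides.
p_baseline = 1 / 1000   # probability of blood clots without birth control
p_treated = 3 / 1000    # probability of blood clots on birth control

relative_increase = (p_treated - p_baseline) / p_baseline  # proportional change
absolute_increase = p_treated - p_baseline                 # change in probability

print(f"relative increase: {relative_increase:.0%}")
print(f"absolute increase: {absolute_increase:.3f}")
```

The relative figure sounds alarming because the base rate is tiny; the absolute change is the quantity that matters for an individual decision.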
these data:
• Most chimps right handed
• A few look like lefties
• Let’s estimate unique intercepts
[Figure: proportion pulled left (0.0 to 1.0) by prosoc.left/condition (0/0, 1/0, 0/1, 1/1)]
#2 always pulls left, so it is not clear how strong the handedness preference is
• Non-Bayesian estimate would be unidentified, because the likelihood is flat for high values. This is a common problem with logistic regression fit with glm() in R.
• Lesson: GLMs sometimes need priors/regularization just to make sense.
[Figure: posterior density of a[2], x-axis from 0 to 30]
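The flat-likelihood problem can be seen with a small grid search. The sketch below is an illustration under stated assumptions: the number of pulls (`n_pulls = 72`) and the Normal(0, 10) prior on the intercept are hypothetical choices, not values taken from these slides.

```python
import math

# Hypothetical setup: one actor who pulled left on every one of n_pulls trials.
# n_pulls and the Normal(0, 10) prior are assumptions made for illustration.
n_pulls = 72
prior_sd = 10.0

def log_likelihood(a):
    # Binomial log-likelihood with all successes: n * log(inv_logit(a))
    return -n_pulls * math.log1p(math.exp(-a))

def log_posterior(a):
    # Add an unnormalized Normal(0, prior_sd) log-prior on the intercept
    return log_likelihood(a) - a**2 / (2 * prior_sd**2)

grid = [i / 10 for i in range(0, 301)]  # intercepts from 0 to 30 on the logit scale
best_ml = max(grid, key=log_likelihood)   # likelihood alone keeps rising
best_map = max(grid, key=log_posterior)   # the prior pulls the maximum inward

print("likelihood-only maximum on grid:", best_ml)
print("maximum with prior on grid:", best_map)
```

Without the prior, the best value is always the largest intercept on the grid, however far the grid extends. Even a weak prior gives a finite answer.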
Posterior validation check
[Figure: posterior validation plot, departments A to F, points labeled m and f. Caption: Posterior validation for the model. Blue points are observed proportions admitted for each row in the data, with points from the same department connected by a blue line. Open points, the tiny vertical black lines within them, and the crosses are expected proportions in (caption truncated)]
• Females admitted more in all but 2 departments
Use logit link
• Distrust MAP estimation & quadratic approximation (QA)
  • may work, but routinely does not
  • regularization even more important now
• Convert back to probability/count scale to plot predictions
• Focus on predictions, not parameters
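Converting back to the probability scale can be sketched as below. The intercepts and the effect size are hypothetical values chosen to show that the same change on the logit scale implies different changes in probability depending on the base rate.

```python
import math

def inv_logit(x):
    # Inverse of the logit link: maps log-odds to a probability
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical values: the same +1 change in log-odds at two different base rates
effect = 1.0
change_rare = inv_logit(-4.0 + effect) - inv_logit(-4.0)    # low base rate
change_common = inv_logit(0.0 + effect) - inv_logit(0.0)    # base rate of 0.5

print(f"change in probability at low base rate: {change_rare:.3f}")
print(f"change in probability at base rate 0.5: {change_common:.3f}")
```

This is why a parameter on the logit scale cannot be read as an effect size without also knowing where on the probability scale the prediction sits.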
• Fissions per unit time in uranium
• Photons striking a detector
• Soldiers killed by horse kicks, per year
• DNA mutations per strand per generation
[Portrait: Siméon Denis Poisson (1781–1840)]
log link is customary; linear model of magnitude
• Beware exploding exponential predictions
• Use offset to adjust exposure duration/distance
• Focus on predictions, not parameters
• Convert back to count scale to interpret/plot
• Predictions tend to be under-dispersed relative to data
• Common problem for both binomial and Poisson GLMs => un-modeled heterogeneity
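A minimal sketch of the log link with an exposure offset follows. The coefficients `a` and `b` and the helper `expected_count` are hypothetical, introduced only for illustration.

```python
import math

def expected_count(a, b, x, exposure):
    # Log link with an offset: log(mu) = log(exposure) + a + b * x
    return math.exp(math.log(exposure) + a + b * x)

# Hypothetical coefficients, chosen only for illustration
a, b = 0.5, 0.3
rate_per_unit = math.exp(a + b * 1.0)             # expected count per unit exposure at x = 1
mu_week = expected_count(a, b, 1.0, exposure=7)   # same rate observed over 7 units

# The offset scales the expected count by the exposure
print(mu_week, 7 * rate_per_unit)

# Exponential growth: modest changes in x produce very large predicted counts
for x in (1, 10, 20):
    print(x, expected_count(a, b, x, exposure=1))
```

Because the offset enters with a fixed coefficient of 1, counts collected over different durations or distances can share one rate model.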
un-ordered outcomes
• Tricky to use and understand
• Geometric: number of trials until specific event
• Common event-history (survival) distribution
• Mixtures, coping with heterogeneity:
  • Beta-binomial: varying probabilities
  • Gamma-Poisson: varying rates
  • many others (e.g. Dirichlet-multinomial)
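How a mixture absorbs heterogeneity can be illustrated by simulation. In the sketch below, the gamma parameters, the sample size, and the helper `rpoisson` are all hypothetical choices; the Poisson sampler uses Knuth's multiplication method because the Python standard library has no Poisson generator.

```python
import math
import random
import statistics

random.seed(1)

def rpoisson(lam):
    # Knuth's multiplication method; adequate for the small rates used here
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

n_draws = 5000
shape, scale = 2.0, 2.5  # hypothetical gamma parameters; mean rate = shape * scale = 5

# Plain Poisson: every unit shares the same rate
plain = [rpoisson(shape * scale) for _ in range(n_draws)]
# Gamma-Poisson: each unit draws its own rate first, then a count
mixed = [rpoisson(random.gammavariate(shape, scale)) for _ in range(n_draws)]

for name, counts in (("Poisson", plain), ("gamma-Poisson", mixed)):
    print(name, "mean:", statistics.mean(counts),
          "variance:", statistics.variance(counts))
```

Both samples have roughly the same mean, but the mixture has a much larger variance, which is the over-dispersion a plain Poisson model cannot reproduce.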