Lecture 07 of the Dec 2018 through March 2019 edition of Statistical Rethinking. Covers the back-door criterion and an introduction to overfitting, cross-validation, and information criteria.
• The back-door criterion: Confounding is caused by the existence of open back-door paths from X to Y • If you know your elements, you know how to open/close each of them • Confounding is any context in which the association between an outcome Y and a predictor of interest X is not the same as it would be if we had experimentally determined the values of X.
• The Fork (X ← Z → Y): Open unless you condition on Z • The Pipe (X → Z → Y): Open unless you condition on Z • The Collider (X → Z ← Y): Closed until you condition on Z • The Descendant (X → Z ← Y, with Z → A): Conditioning on A is like conditioning on Z
Two paths from E to W: (1) E → W (2) E ← U → W • A "path" here just means any series of variables you could walk through to get from one variable to another, ignoring the direction of the arrows. • Experimentally manipulating E removes the influence of U on E, which stops information flowing between E and W through U and blocks the second path. • The same result can be achieved statistically, without manipulating E: close the 2nd path by conditioning on U (adding U to the model), which blocks the flow of information through the fork at U.
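A minimal simulation sketch of this point, assuming linear Gaussian relationships. The coefficient values, sample size, and the `ols` helper are illustrative assumptions and do not come from the lecture. It compares the slope of W on E with and without conditioning on U.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
true_effect = 0.5  # assumed causal effect of E on W (illustrative value)

# Simulate the DAG: E <- U -> W and E -> W
U = rng.normal(size=n)
E = U + rng.normal(size=n)
W = true_effect * E + U + rng.normal(size=n)

def ols(y, *predictors):
    """Least-squares coefficients: intercept first, then one per predictor."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

naive_slope = ols(W, E)[1]        # back-door path through U is open
adjusted_slope = ols(W, E, U)[1]  # conditioning on U closes it

print(f"slope of W on E, ignoring U:       {naive_slope:.2f}")
print(f"slope of W on E, conditioning on U: {adjusted_slope:.2f}")
```

Under these assumed coefficients, the unadjusted slope mixes the causal path with the back-door path, while the adjusted slope should land near the assumed effect.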
3 paths from G to C: (1) G → C (2) G → P → C (3) G → P ← U → C • G = grandparents, P = parents, C = children, U = unobserved • Condition on P: Closes (2) but opens (3), because P is a collider on (3) • If we never get to measure U, conditioning on P will bias inference about G → C
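A quantitative sketch of the grandparent example, assuming linear Gaussian relationships. All coefficients (including a direct effect of G on C set to zero), the sample size, and the `ols` helper are illustrative assumptions. It shows how conditioning on P alone can produce a spurious negative coefficient for G.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Assumed effects (illustrative values only)
b_GP = 1.0  # G -> P
b_GC = 0.0  # G -> C (no direct effect in this sketch)
b_PC = 1.0  # P -> C
b_U = 2.0   # U -> P and U -> C

G = rng.normal(size=n)
U = rng.normal(size=n)  # unobserved in practice
P = b_GP * G + b_U * U + rng.normal(size=n)
C = b_PC * P + b_GC * G + b_U * U + rng.normal(size=n)

def ols(y, *predictors):
    """Least-squares coefficients: intercept first, then one per predictor."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Condition on P only: closes G -> P -> C but opens G -> P <- U -> C
g_given_p = ols(C, G, P)[1]
# Condition on P and U: the collider path is closed again
g_given_p_u = ols(C, G, P, U)[1]

print(f"coefficient on G given P:       {g_given_p:.2f}")
print(f"coefficient on G given P and U: {g_given_p_u:.2f}")
```

The second regression is only possible in a simulation, since U is unobserved in the real problem.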
condition on to infer X → Y? • Procedure: (1) Find all paths. (2) Open/close as necessary. • No matter how complicated a causal DAG appears, it is always built out of these four relations. • The DAG contains an exposure of interest X, an outcome of interest Y, an unobserved variable U, and three observed covariates A, B, and C. [Figure: DAG with edges A → U, A → C, U → X, U → B, C → B, C → Y, X → Y]
condition on to infer X → Y? • We are interested in the causal effect of X on Y. Aside from the direct path, there are two back-door paths: (1) X ← U ← A → C → Y: This path is open. (2) X ← U → B ← C → Y: This path is closed, because B is a collider. • Condition on A or C. Do not condition on B.
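The two-step procedure (find all paths, then open/close as necessary) can be sketched in plain Python. This is an illustrative implementation, not the dagitty algorithm; the function names are hypothetical, and the edge list is read off the A, B, C, U, X, Y example above.

```python
# Directed edges (parent, child) for the example DAG
edges = [("A", "U"), ("A", "C"), ("U", "X"), ("U", "B"),
         ("C", "B"), ("C", "Y"), ("X", "Y")]
edge_set = set(edges)

def simple_paths(start, end):
    """All paths from start to end, ignoring arrow direction."""
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    found = []
    def walk(path):
        if path[-1] == end:
            found.append(path)
            return
        for nxt in sorted(nbrs[path[-1]]):
            if nxt not in path:
                walk(path + [nxt])
    walk([start])
    return found

def descendants(node):
    """All nodes reachable from node by following arrows."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def is_open(path, conditioned):
    """Open if no non-collider is conditioned on and every collider
    (or one of its descendants) is conditioned on."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in edge_set and (nxt, node) in edge_set
        if collider:
            if node not in conditioned and not (descendants(node) & conditioned):
                return False
        elif node in conditioned:
            return False
    return True

# Back-door paths are those that start with an arrow pointing into X
backdoor = [p for p in simple_paths("X", "Y") if (p[1], "X") in edge_set]

for cond in [set(), {"A"}, {"C"}, {"B"}]:
    status = [" ".join(p) + (": open" if is_open(p, cond) else ": closed")
              for p in backdoor]
    print(sorted(cond), status)
```

Looping over candidate conditioning sets like this is a brute-force way to check which adjustments shut every back door. U is included in the graph but never in a conditioning set, since it is unobserved.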
to infer W → D? • Nodes: A, D, M, S, W • S = South (whether or not a State is in the southern United States) • W = Waffle Houses • M = Marriage • D = Divorce • A = Age at marriage
Conditional independencies are pairs of variables that are not associated once we condition on some set of other variables. By listing these implied conditional independencies and assessing each, we can at least test some of the features of a graph. You can find conditional independencies using the same path logic you learned for finding and closing back doors. You just have to focus on a pair of variables, find all paths connecting them, and figure out if there is any set of variables you could condition on to close them all. In a large graph, this is quite a chore, because there are many pairs of variables and possibly many paths. But your computer is good at such chores. In this case, there are three implied conditional independencies. R code: impliedConditionalIndependencies( dag_6.2 ) • (1) A and W independent, conditioning on S • (2) D and S independent, conditioning on A, M, & W • (3) M and W independent, conditioning on S
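One way to assess implied conditional independencies against data is with partial correlations. The sketch below simulates from a DAG with edges S → A, S → M, S → W, A → M, A → D, M → D, W → D. That edge list is an assumption, chosen because it implies the three independencies listed above. The coefficients, the Gaussian stand-in for S, and the helper names are also illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000

# Simulate from the assumed DAG (linear Gaussian, illustrative coefficients)
S = rng.normal(size=n)
A = -1.0 * S + rng.normal(size=n)
M = 0.5 * S - 0.5 * A + rng.normal(size=n)
W = 1.0 * S + rng.normal(size=n)
D = -0.5 * A + 0.3 * M + 0.2 * W + rng.normal(size=n)

def residuals(y, *given):
    """Residuals of y after a least-squares regression on the given variables."""
    X = np.column_stack([np.ones(len(y)), *given])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def partial_corr(x, y, *given):
    """Correlation between x and y after conditioning on the given variables."""
    return np.corrcoef(residuals(x, *given), residuals(y, *given))[0, 1]

checks = {
    "A _||_ W | S": partial_corr(A, W, S),
    "D _||_ S | A, M, W": partial_corr(D, S, A, M, W),
    "M _||_ W | S": partial_corr(M, W, S),
}
marginal_AW = np.corrcoef(A, W)[0, 1]

for label, value in checks.items():
    print(f"{label}: partial correlation {value:+.3f}")
print(f"A and W without conditioning: correlation {marginal_AW:+.3f}")
```

Zero partial correlation corresponds to conditional independence only under the linear Gaussian assumption used here; with real data a model-based check would be more appropriate.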
cause • Experiments not required! • Experiments not always practical or ethical • Disease, evolution, development, dynamics of popular music, global climate, war • Experiments must choose an intervention • Interventions influence many variables at once • Experimentally manipulate obesity? David Hume (1711–1776) rates your DAG 12/10
small world constructs • Residual confounding: • Misclassification • Measurement error • Missingness • DAGs can accommodate these problems, but may tell us there are no solutions • We will see some solutions in later weeks • Eventually need *real* models of the system
Cross-validation & information criteria: • estimate predictive accuracy • estimate overfitting risk • understand how overfitting relates to complexity • identify influential observations • See that prediction and causal inference are different objectives AIC LOO WAIC
from a sample? • Underfitting: Learning too little from the data. Models that are too simple both fit and predict poorly. • Overfitting: Learning too much from the data. Complex models tend to fit better but predict worse. • Want to find a model that navigates between underfitting and overfitting • Problem: Fit to sample always* improves as we add parameters (*not true of multilevel models & other types)
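A small sketch of the problem, using synthetic data and polynomial regression as a stand-in for "adding parameters". The data, the degrees, and the function names are assumptions for illustration. It compares in-sample error with leave-one-out cross-validated error.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 12
x = np.linspace(-1, 1, n)
# Synthetic data: the true relationship is a straight line plus noise
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

def mse_train(degree):
    """Mean squared error on the same sample used for fitting."""
    coef = np.polyfit(x, y, degree)
    return float(np.mean((y - np.polyval(coef, x)) ** 2))

def mse_loo(degree):
    """Leave-one-out cross-validation: refit without each point, then predict it."""
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        coef = np.polyfit(x[keep], y[keep], degree)
        errors.append((y[i] - np.polyval(coef, x[i])) ** 2)
    return float(np.mean(errors))

degrees = [1, 2, 3, 4, 5]
train = [mse_train(d) for d in degrees]
loo = [mse_loo(d) for d in degrees]

for d, t, l in zip(degrees, train, loo):
    print(f"degree {d}: in-sample MSE {t:.3f}, leave-one-out MSE {l:.3f}")
```

In-sample error cannot get worse as the degree grows, because each polynomial nests the previous one. The leave-one-out error carries no such guarantee, which is why it is a more honest guide to predictive accuracy.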
the sample • Strategies • Regularizing priors (penalized likelihood) • Cross-validation • Information criteria • Science! • Proper approach depends upon purpose • Answers are never only in the data, but they do usually require data
an outcome. • How to quantify uncertainty? Should be: 1. Continuous 2. Increasing with number of possible events 3. Additive • These criteria are intuitive, but effectiveness is why we keep using them
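Information entropy, H(p) = −Σ p_i log(p_i), is the standard measure with these three properties. The sketch below checks each property numerically; the example distributions and variable names are illustrative assumptions.

```python
import numpy as np

def entropy(p):
    """Information entropy in nats; zero-probability events contribute nothing."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# 1. Continuous: a small change in probabilities gives a small change in H
h_base = entropy([0.3, 0.7])
h_nudged = entropy([0.3001, 0.6999])

# 2. Increasing with the number of (equally likely) possible events
h2 = entropy([1 / 2] * 2)
h3 = entropy([1 / 3] * 3)
h4 = entropy([1 / 4] * 4)

# 3. Additive: for independent variables, joint uncertainty is the sum
rain = np.array([0.3, 0.7])
wind = np.array([0.6, 0.4])
joint = np.outer(rain, wind).ravel()  # probabilities of the four combinations
h_joint = entropy(joint)

print(f"continuity:  {h_base:.4f} vs {h_nudged:.4f}")
print(f"more events: {h2:.3f} < {h3:.3f} < {h4:.3f}")
print(f"additivity:  {h_joint:.4f} = {entropy(rain) + entropy(wind):.4f}")
```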
p is true, q is model • How accurate is q, for describing p? • Distance from q to p: Divergence • $D_{\mathrm{KL}}(p, q) = \sum_i p_i \left( \log(p_i) - \log(q_i) \right)$ • Distance from q to p is the average difference in log-probability between the target (p) and the model (q). • This divergence is just the difference between two entropies: the entropy of the target distribution p and the cross entropy arising from using q to predict p. • When p = q, the divergence is zero.
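A short sketch of the divergence for a two-event example. The probabilities used for p and q are illustrative values chosen for this sketch, and the function names are hypothetical.

```python
import numpy as np

def entropy(p):
    """Entropy of the target distribution p (nats)."""
    return float(-np.sum(p * np.log(p)))

def cross_entropy(p, q):
    """Uncertainty when events arise from p but are predicted with q."""
    return float(-np.sum(p * np.log(q)))

def kl_divergence(p, q):
    """Average difference in log-probability between target p and model q."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

p = np.array([0.3, 0.7])    # assumed "true" distribution (illustrative)
q = np.array([0.25, 0.75])  # assumed model (illustrative)

d_pq = kl_divergence(p, q)
d_qp = kl_divergence(q, p)
d_pp = kl_divergence(p, p)

print(f"D_KL(p, q) = {d_pq:.5f}")
print(f"cross entropy minus entropy = {cross_entropy(p, q) - entropy(p):.5f}")
print(f"D_KL(q, p) = {d_qp:.5f}  (divergence is not symmetric)")
print(f"D_KL(p, p) = {d_pp:.5f}")
```

The asymmetry means the direction matters: the divergence measures the extra uncertainty from using q to describe p, not the other way around.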