limited. YELLOW: Limited menu – no power or only power from a generator, or food supplies may be low. RED: Restaurant is closed – indicating severe damage.
• Reveal spurious correlation
• Uncover masked association
• The bad:
• Cause spurious correlation
• Hide real associations
[Figure: divorce rate against Waffle Houses per million, with states labeled (AL, AR, GA, ME, NJ, OK, SC).]
does not imply correlation
• Causation implies association, perhaps complex
• Need models
• Q: Does marriage cause divorce?
[Figure: divorce rate against standardized marriage rate (Marriage.s). The surviving caption fragment notes that divorce rate is also associated with median age at marriage.]
a predictor, once we know the other predictors?
• What is the value of knowing marriage rate, once we already know median age at marriage?
• What is the value of knowing median age at marriage, once we know marriage rate?
Multivariate linear models: the model that predicts divorce rate $D_i$ using both marriage rate $R_i$ and median age at marriage $A_i$:
$$D_i \sim \mathrm{Normal}(\mu_i, \sigma) \quad \text{[likelihood]}$$
$$\mu_i = \alpha + \beta_R R_i + \beta_A A_i \quad \text{[linear model]}$$
$$\alpha \sim \mathrm{Normal}(\ldots, \ldots) \quad \text{[prior for } \alpha \text{]}$$
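A minimal sketch of what the two questions above ask, on simulated data rather than the divorce data. Everything here is an assumption made for illustration: the causal structure (age at marriage drives both marriage rate and divorce rate), the effect sizes, the sample size, and the use of ordinary least squares as a stand-in for a MAP fit with flat priors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # hypothetical sample size, chosen large so estimates are stable

# Assumed structure: A influences both R and D; R has no direct effect on D.
A = rng.normal(size=n)                        # standardized median age at marriage
R = -0.7 * A + rng.normal(scale=0.7, size=n)  # standardized marriage rate
D = -0.6 * A + rng.normal(scale=0.8, size=n)  # divorce rate (arbitrary units)

def ols(y, *predictors):
    """Least-squares coefficients [intercept, slopes...] for y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Bivariate model: D on R alone shows a clear positive slope.
b_R_alone = ols(D, R)[1]

# Multivariate model: mu = alpha + b_R * R + b_A * A.
alpha, b_R, b_A = ols(D, R, A)

print(f"slope of R alone:       {b_R_alone:.2f}")
print(f"slope of R, knowing A:  {b_R:.2f}")
print(f"slope of A, knowing R:  {b_A:.2f}")
```

In this simulation the slope on R is clearly positive when R is the only predictor and close to zero once A is in the model, while the slope on A stays clearly negative. That is the pattern a spurious association produces.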
additional value in knowing marriage rate.
• Once we know marriage rate, still value in knowing median age at marriage.
• If we don’t know median age at marriage, still useful to know marriage rate.
[Figure: output of plot(precis(m5.3)), showing the estimates for sigma, bA, bR, and a on an axis from -2 to 2.]
with outcome, “controlling” for other predictors
• Useful intuition
• Never analyze residuals!
• Recipe:
1. Regress predictor on other predictors
2. Compute predictor residuals
3. Regress outcome on residuals
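The three-step recipe can be sketched as follows, again on simulated data. The data-generating values and the helper name `ols` are hypothetical, and least squares stands in for the model fitting. The sketch only checks the intuition: the slope on the predictor residuals equals the slope the predictor gets in the full multivariate regression.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200  # hypothetical number of units

# Simulated stand-ins for the standardized predictors and the outcome.
A = rng.normal(size=n)                        # median age at marriage
R = -0.7 * A + rng.normal(scale=0.7, size=n)  # marriage rate
D = -0.6 * A + rng.normal(scale=0.8, size=n)  # divorce rate

def ols(y, *predictors):
    """Least-squares coefficients [intercept, slopes...] for y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Step 1: regress the predictor of interest (R) on the other predictor (A).
a0, a1 = ols(R, A)

# Step 2: compute predictor residuals, the part of R not explained by A.
R_resid = R - (a0 + a1 * A)

# Step 3: regress the outcome on those residuals.
slope_resid = ols(D, R_resid)[1]

# Comparison: the coefficient on R from the full regression of D on R and A.
slope_multi = ols(D, R, A)[1]

print(f"slope on residuals:     {slope_resid:.6f}")
print(f"multivariate slope b_R: {slope_multi:.6f}")
```

The two slopes agree up to floating-point error. The recipe is a way to see what the multivariate model does, not a replacement for it, which is the point of "Never analyze residuals!".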
residual marriage rate? States with fast/slow rates of marriage (for age at marriage) do not (on average) have fast/slow divorce rates.
[Figure: predictor residual plots. Panels: Marriage.s against MedianAgeMarriage.s; divorce rate against marriage rate residuals, from slower to faster.]
• Check model fit: golems do make mistakes
• Find model failures, stimulate new ideas
• Always average over the posterior distribution
• Using only MAP leads to overconfidence
• Embrace the uncertainty
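A small sketch of why using only the MAP leads to overconfidence. It assumes a deliberately simple model that is not the divorce model: a Normal mean with known sigma and a flat prior, so the posterior for the mean is Normal(ybar, sigma / sqrt(n)) and the MAP is the sample mean. The sample size and true mean are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0                                      # assumed known noise scale
y = rng.normal(loc=10.0, scale=sigma, size=5)    # small hypothetical sample
n = len(y)
draws = 100_000

# MAP plug-in: treat the single best estimate of the mean as if it were certain.
mu_map = y.mean()
pred_map = rng.normal(loc=mu_map, scale=sigma, size=draws)

# Averaging over the posterior: draw the mean from its posterior first,
# then draw one prediction for each sampled mean.
mu_post = rng.normal(loc=y.mean(), scale=sigma / np.sqrt(n), size=draws)
pred_avg = rng.normal(loc=mu_post, scale=sigma)

sd_map = pred_map.std()
sd_avg = pred_avg.std()

print(f"predictive sd, MAP only:           {sd_map:.3f}")
print(f"predictive sd, posterior averaged: {sd_avg:.3f}")
```

Under these assumptions the averaged predictions have standard deviation sigma * sqrt(1 + 1/n), wider than the plug-in sigma. The MAP-only predictions ignore uncertainty about the mean and are too narrow.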
[Figure: predicted divorce against observed divorce, with states labeled.]