Slide 1

Slide 1 text

BAYESIAN INFERENCE IS JUST COUNTING Richard McElreath MPI-EVA p(x|y)p(y)/p(x)

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

The Golem of Prague go•lem |gōlǝm| noun • (in Jewish legend) a clay figure brought to life by magic. 
 • an automaton or robot. ORIGIN late 19th cent.: from Yiddish goylem, from Hebrew gōlem ‘shapeless mass.’

Slide 7

Slide 7 text

The Golem of Prague “Even the most perfect of Golem, risen to life to protect us, can easily change into a destructive force. Therefore let us treat carefully that which is strong, just as we bow kindly and patiently to that which is weak.” Rabbi Judah Loew ben Bezalel (1512–1609) From Breath of Bones: A Tale of the Golem

Slide 8

Slide 8 text

The Golems of Science Golem • Made of clay • Animated by “truth” • Powerful • Blind to creator’s intent • Easy to misuse • Fictional Model • Made of...silicon? • Animated by “truth” • Hopefully powerful • Blind to creator’s intent • Easy to misuse • Not even false

Slide 9

Slide 9 text

Bayesian data analysis • Use probability to describe uncertainty • Extends ordinary logic (true/false) to continuous plausibility • Computationally difficult • Markov chain Monte Carlo (MCMC) to the rescue • Used to be controversial • Ronald Fisher: Bayesian analysis “must be wholly rejected.” Pierre-Simon Laplace (1749–1827) Sir Harold Jeffreys (1891–1989) with Bertha Swirles, aka Lady Jeffreys (1903–1999)

Slide 10

Slide 10 text

Bayesian data analysis Count all the ways data can happen, according to assumptions. Assumptions with more ways that are consistent with data are more plausible.

Slide 11

Slide 11 text

Bayesian data analysis • Contrast with frequentist view • Probability is just limiting frequency • Uncertainty arises from sampling variation • Bayesian probability much more general • Probability is in the golem, not in the world • Coins are not random, but our ignorance makes them so Saturn as Galileo saw it

Slide 12

Slide 12 text

Garden of Forking Data • The future: • Full of branching paths • Each choice closes some • The data: • Many possible events • Each observation eliminates some

Slide 13

Slide 13 text

Garden of Forking Data (1) (2) (3) (4) (5) Contains 4 marbles ? Possible contents: Observe:

Slide 14

Slide 14 text

Conjecture: Data:

Slide 15

Slide 15 text

Conjecture: Data:

Slide 16

Slide 16 text

Conjecture: Data:

Slide 17

Slide 17 text

Conjecture: Data: 3 paths consistent with data

Slide 18

Slide 18 text

Garden of Forking Data (1) (2) (3) (4) (5) Possible contents: Ways to produce ? 3 ? ? ?

Slide 19

Slide 19 text

Garden of Forking Data (1) (2) (3) (4) (5) Possible contents: Ways to produce 0 3 ? ? 0

Slide 20

Slide 20 text

3 ways 9 ways 8 ways

Slide 21

Slide 21 text

3 ways 9 ways 8 ways

Slide 22

Slide 22 text

3 ways 9 ways 8 ways

Slide 23

Slide 23 text

Garden of Forking Data: counting the paths that survive. For each of the five conjectured bag compositions, from zero to four blue marbles, the number of ways to produce the data is found by counting the paths in each "ring" of the garden and then multiplying: (1) 0 × 4 × 0 = 0, (2) 1 × 3 × 1 = 3, (3) 2 × 2 × 2 = 8, (4) 3 × 1 × 3 = 9, (5) 4 × 0 × 4 = 0. Multiplying is just a computational device; it is still counting of logically possible paths.
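The same counts can be reproduced with a few lines of R. This is a minimal sketch, not code from the slides; the variable names are mine. It multiplies the number of paths in each ring of the garden for the data sequence blue, white, blue.

blue  <- 0:4          # paths producing a blue draw under each conjecture
white <- 4 - blue     # paths producing a white draw under each conjecture
ways  <- blue * white * blue
ways
# [1] 0 3 8 9 0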

Slide 24

Slide 24 text

Updating. Another draw from the bag: count the ways each conjecture could produce the new observation, then multiply by the previous counts. Recounting all paths and updating the previous counts are mathematically identical, as long as the new observation is logically independent of the previous observations. In table form:

Conjecture   Ways to produce new draw   Previous count   New count
(1)          0                          0                0 × 0 = 0
(2)          1                          3                3 × 1 = 3
(3)          2                          8                8 × 2 = 16
(4)          3                          9                9 × 3 = 27
(5)          4                          0                0 × 4 = 0

The new counts in the right-hand column summarize all the evidence for each conjecture; as new, independent data arrive, the number of logically possible ways for a conjecture to produce all the data is just the new count multiplied by the old count.
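A short R sketch of this update, using the counts from the table above (the vector names are mine, not from the slides):

prior_counts <- c(0, 3, 8, 9, 0)  # ways each conjecture produced the first three draws
ways_new     <- 0:4               # ways each conjecture produces one more blue draw
prior_counts * ways_new
# [1]  0  3 16 27  0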

Slide 25

Slide 25 text

Using other information. Blue marbles rare, but every bag contains at least one of each color. Factory says: for every bag of type (4) they made two bags of type (3) and three bags of type (2). The prior data and new data need not be of the same type, so we can update the counts again:

Conjecture   Prior ways   Factory count   New count
(1)          0            0               0 × 0 = 0
(2)          3            3               3 × 3 = 9
(3)          16           2               16 × 2 = 32
(4)          27           1               27 × 1 = 27
(5)          0            0               0 × 0 = 0

Conjecture (3) is now most plausible, but barely better than (4). Is there a difference in counts at which we can safely decide one conjecture is correct? That question is explored in the next chapter.

Slide 26

Slide 26 text

Using other information. Blue marbles rare. Factory says: for every bag of type (4), two bags of type (3) and three bags of type (2). Multiply the prior ways (0, 3, 16, 27, 0) by the factory counts (0, 3, 2, 1, 0) to get the new counts (0, 9, 32, 27, 0).
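A minimal R sketch of this second update, using the counts above (variable names are mine):

prior_ways    <- c(0, 3, 16, 27, 0)  # counts carried over from the previous update
factory_count <- c(0, 3, 2, 1, 0)    # relative number of bags of each type
prior_ways * factory_count
# [1]  0  9 32 27  0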

Slide 27

Slide 27 text

Counts to plausibility. Unglamorous basis of applied probability: things that can happen more ways are more plausible. To standardize, add up all of the products, one for each value p can take (ways p can produce D_new × prior plausibility of p), then divide each product by that sum:

plausibility of p after D_new = (ways p can produce D_new × prior plausibility of p) / (sum of products)

There is nothing special about standardizing to one; any value will do, but one makes the mathematics more convenient. The table from before, restated in terms of p:

Possible composition   p      Ways to produce data   Plausibility
(1)                    0      0                      0
(2)                    0.25   3                      0.15
(3)                    0.5    8                      0.40
(4)                    0.75   9                      0.45
(5)                    1      0                      0

These plausibilities can be computed quickly in R:
ways <- c(3, 8, 9)
ways/sum(ways)

Slide 28

Slide 28 text

Counts to plausibility. Standardize: plausibility of p after D_new = (ways p can produce D_new × prior plausibility of p) / (sum of products). In R:
ways <- c(3, 8, 9)
ways/sum(ways)
# [1] 0.15 0.40 0.45
These plausibilities are also probabilities: non-negative real numbers that sum to one. Everything you can do mathematically with probabilities you can also do with these values; each piece of the calculation has a direct, stereotypically named partner in applied probability theory.

Slide 29

Slide 29 text

Counts to plausibility. Plausibility is probability: a set of non-negative real numbers that sum to one. Probability theory is just a set of shortcuts for counting possibilities.
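Putting the marble example together in R: a sketch (not code from the slides) that normalizes the final counts from the factory update into posterior plausibilities.

counts <- c(0, 9, 32, 27, 0)          # final counts after the extra draw and factory information
plausibility <- counts / sum(counts)  # standardize so the values sum to one
round(plausibility, 2)
# [1] 0.00 0.13 0.47 0.40 0.00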

Slide 30

Slide 30 text

Building a model • How to use probability to do typical statistical modeling? 1. Design the model (data story) 2. Condition on the data (update) 3. Evaluate the model (critique)

Slide 31

Slide 31 text

Nine tosses of the globe: W L W W W L W L W

Slide 32

Slide 32 text

Design > Condition > Evaluate • Data story motivates the model • How do the data arise? • For W L W W W L W L W: • Some true proportion of water, p • Toss globe, probability p of observing W, 1–p of L • Each toss therefore independent of other tosses • Translate data story into probability statements
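The probability statement implied by this data story is the binomial likelihood (made explicit later on the joint-model slide as W ∼ Binomial(N, p)). For W water observations and L land observations:

\Pr(W, L \mid p) = \frac{(W + L)!}{W!\,L!}\; p^{W} (1 - p)^{L}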

Slide 33

Slide 33 text

Design > Condition > Evaluate • Bayesian updating defines optimal learning in small world, converts prior into posterior • Give your golem an information state, before the data: Here, an initial confidence in each possible value of p between zero and one • Condition on data to update information state: New confidence in each value of p, conditional on data

Slide 34

Slide 34 text

[Figure: confidence (plausibility) over the probability of water, updated after n = 1, 2, 4, 5 tosses of the sequence W L W W W L W L W; each panel shows the prior for that step.]

Slide 35

Slide 35 text

[Figure: confidence (plausibility) over the probability of water after n = 1, 2, 4, 5 tosses of W L W W W L W L W; prior and posterior shown in each panel.]

Slide 36

Slide 36 text

[Figure: Bayesian updating across all nine tosses (n = 1 to 9) of W L W W W L W L W; plausibility over the proportion of water, starting from the flat n = 0 prior.]

Slide 37

Slide 37 text

[Figure repeated from Slide 36: Bayesian updating across all nine tosses.]

Slide 38

Slide 38 text

[Figure repeated from Slide 36: Bayesian updating across all nine tosses.]

Slide 39

Slide 39 text

[Figure repeated from Slide 36: Bayesian updating across all nine tosses.]

Slide 40

Slide 40 text

[Figure repeated from Slide 36: Bayesian updating across all nine tosses.]

Slide 41

Slide 41 text

Design > Condition > Evaluate • Data order irrelevant, because golem assumes order irrelevant • All-at-once, one-at-a-time, shuffled order all give same posterior • Every posterior is a prior for next observation • Every prior is posterior of some other inference • Sample size automatically embodied in posterior [Figure (How a Bayesian model learns): each toss of the globe produces an observation of water (W) or land (L); the model's estimate of the proportion of water is a plausibility for every possible value, and in each plot the previous plausibilities (dashed curve) are updated in light of the latest observation.]

Slide 42

Slide 42 text

Design > Condition > Evaluate • Bayesian inference: logical answer to a question in the form of a model: "How plausible is each proportion of water, given these data?" • Golem must be supervised • Did the golem malfunction? • Does the golem's answer make sense? • Does the question make sense? • Check sensitivity of answer to changes in assumptions

Slide 43

Slide 43 text

Construction perspective • Build joint model: (1) List variables (2) Define generative relations (3) ??? (4) Profit • Input: Joint prior • Deduce: Joint posterior

Slide 44

Slide 44 text

The Joint Model W ∼ Binomial(N, p), p ∼ Uniform(0, 1) • Bayesian models are generative • Can be run forward to generate predictions or simulate data • Can be run in reverse to infer process from data

Slide 45

Slide 45 text

The Joint Model W ∼ Binomial(N, p), p ∼ Uniform(0, 1) • Run forward:
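A minimal sketch of running this joint model forward in R (not code from the slides; N = 9 and the simulation size are my choices): sample p from its prior, then simulate globe tosses.

n_sim <- 1e4
p_sim <- runif(n_sim, min = 0, max = 1)         # p ~ Uniform(0, 1)
W_sim <- rbinom(n_sim, size = 9, prob = p_sim)  # W ~ Binomial(N = 9, p)
table(W_sim) / n_sim                            # prior predictive distribution of W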

Slide 46

Slide 46 text

Run in Reverse: 
 Computing the posterior 1. Analytical approach (often impossible) 2. Grid approximation (very intensive) 3. Quadratic approximation (limited) 4. Markov chain Monte Carlo (intensive)
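A grid-approximation sketch for the nine globe tosses (6 W, 3 L), illustrating approach 2; the flat prior matches p ∼ Uniform(0, 1), and the grid size is my choice:

p_grid     <- seq(from = 0, to = 1, length.out = 100)  # candidate values of p
prior      <- rep(1, 100)                              # flat prior
likelihood <- dbinom(6, size = 9, prob = p_grid)       # probability of 6 W in 9 tosses
posterior  <- likelihood * prior
posterior  <- posterior / sum(posterior)               # standardize to sum to one
plot(p_grid, posterior, type = "l")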

Slide 47

Slide 47 text

Predictive checks • Something like a significance test, but not • No universally best way to evaluate adequacy of model-based predictions • No way to justify always using a threshold like 5% • Good predictive checks always depend upon purpose and imagination “It would be very nice to have a formal apparatus that gives us some ‘optimal’ way of recognizing unusual phenomena and inventing new classes of hypotheses [...]; but this remains an art for the creative human mind.” 
 —E.T. Jaynes (1922–1998)

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

Triumph of Geocentrism • Claudius Ptolemy (90–168) • Egyptian mathematician • Accurate model of planetary motion • Epicycles: orbits on orbits • Fourier series [Diagram: Earth, equant, planet, epicycle, deferent]

Slide 50

Slide 50 text

Geocentrism • Descriptively accurate • Mechanistically wrong • General method of approximation • Known to be wrong Regression • Descriptively accurate • Mechanistically wrong • General method of approximation • Taken too seriously

Slide 51

Slide 51 text

Linear regression • Simple statistical golems • Model of mean and variance of normally (Gaussian) distributed measure • Mean as additive combination of weighted variables • Constant variance

Slide 52

Slide 52 text

Carl Friedrich Gauss's 1809 Bayesian argument for normal error and least-squares estimation

Slide 53

Slide 53 text

Why normal? • Why are normal (Gaussian) distributions so common in statistics? 1. Easy to calculate with 2. Common in nature 3. Very conservative assumption [Plot: Gaussian density over x; about 95% of the mass lies within ±2σ of the mean]
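The 95% annotation in the plot can be checked directly in R (a one-line sketch, not from the slides):

pnorm(2) - pnorm(-2)   # probability mass within 2 standard deviations of the mean
# [1] 0.9544997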

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

[Figure 4.2: Random walks on the soccer field converge to a normal distribution; densities of position after 4, 8, and 16 steps. The more steps are taken, the closer the match to the normal distribution.]

Slide 60

Slide 60 text

[Figure 4.2 repeated: random walks on the soccer field converge to a normal distribution.]

Slide 61

Slide 61 text

[Figure 4.2 repeated: random walks on the soccer field converge to a normal distribution.]
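A sketch of the kind of simulation behind this figure (step distribution and sample size are my assumptions): sum many small random steps and look at the distribution of final positions.

pos16 <- replicate(1e4, sum(runif(16, min = -1, max = 1)))  # 16 random steps per walker
plot(density(pos16))                                        # approximately Gaussian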

Slide 62

Slide 62 text

Why normal? • Processes that produce normal distributions • Addition • Products of small deviations • Logarithms of products Francis Galton’s 1894 “bean machine” for simulating normal distributions
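A sketch illustrating the three processes in this list (the parameter choices are mine, for illustration only): addition of fluctuations is Gaussian, products of small deviations are approximately Gaussian, and logarithms of products are Gaussian even when the products themselves are not.

n      <- 1e4
added  <- replicate(n, sum(runif(12, -1, 1)))          # addition: Gaussian
small  <- replicate(n, prod(1 + runif(12, 0, 0.01)))   # products of small deviations: near-Gaussian
big    <- replicate(n, prod(1 + runif(12, 0, 0.5)))    # products of large deviations: skewed
logbig <- log(big)                                     # logarithm of products: Gaussian again
par(mfrow = c(2, 2))
for (x in list(added, small, big, logbig)) plot(density(x))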

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

Why normal? • Ontological perspective • Processes which add fluctuations result in dampening • Damped fluctuations end up Gaussian • No information left, except mean and variance • Can’t infer process from distribution! • Epistemological perspective • Know only mean and variance • Then least surprising and most conservative (maximum entropy) distribution is Gaussian • Nature likes maximum entropy distributions

Slide 65

Slide 65 text

Why normal? • Ontological perspective • Processes which add fluctuations result in dampening • Damped fluctuations end up Gaussian • No information left, except mean and variance • Can’t infer process from distribution! • Epistemological perspective • Know only mean and variance • Then least surprising and most conservative (maximum entropy) distribution is Gaussian • Nature likes maximum entropy distributions

Slide 66

Slide 66 text

Linear models • Models of normally distributed data common • “General Linear Model”: t-test, single regression, multiple regression, ANOVA, ANCOVA, MANOVA, MANCOVA, yadda yadda yadda • All the same thing • Learn strategy, not procedure Willard Boepple

Slide 67

Slide 67 text

[Figure: scatterplots of height (140–180) against weight (30–60) for increasing sample sizes, N = 10, 20, 50, 100, 200, 350.]

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

Regression as a wicked oracle • Regression automatically focuses on the most informative cases • Cases that don’t help are automatically ignored • But not kind — ask carefully

Slide 70

Slide 70 text

Why not just add everything? • Could just add all available predictors to model • “We controlled for...” • Almost always a bad idea • Adding variables creates confounds • Residual confounding • Overfitting

Slide 71

Slide 71 text

[DAG: AGE → HEIGHT, AGE → MATH; does HEIGHT → MATH?] MATH independent of HEIGHT, conditional on AGE

Slide 72

Slide 72 text

[Pairs plot of M (math score), A (age), and H (height); all pairwise associations visible.]

Slide 73

Slide 73 text

MATH independent of HEIGHT, conditional on AGE [Scatterplots of M against H within each age group, A = 7, 8, 9, 10: no association remains within groups.]
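A simulation sketch of this fork (all parameter values are mine, chosen only for illustration): age drives both height and math score, and height has no direct effect on math.

N <- 1000
A <- sample(7:10, N, replace = TRUE)      # age
H <- rnorm(N, mean = 15 * A, sd = 4)      # height caused by age only
M <- rnorm(N, mean = 5 * A + 85, sd = 2)  # math score caused by age only
coef(lm(M ~ H))       # marginal: height appears to predict math score
coef(lm(M ~ H + A))   # conditional on age: height coefficient near zero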

Slide 74

Slide 74 text

[DAG: SWITCH → LIGHT ← POWER] SWITCH independent of POWER. SWITCH dependent on POWER, conditional on LIGHT.

Slide 75

Slide 75 text

LIGHT   POWER   SWITCH
ON      ON      ?
OFF     ON      ?
SWITCH dependent on POWER, conditional on LIGHT. This effect is known as "collider bias".
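A simulation sketch of collider bias (my encoding of the example: switch and power are assumed to be independent coin flips, and the light is on only when both are on):

N     <- 1e4
sw    <- rbinom(N, 1, 0.5)   # switch position, independent of power
power <- rbinom(N, 1, 0.5)   # power supply, independent of switch
light <- sw * power          # light is on only when both are on
cor(sw, power)                           # approximately zero
cor(sw[light == 0], power[light == 0])   # conditioning on the collider induces association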

Slide 76

Slide 76 text

[DAG: HAPPY → MARRIED ← AGE] HAPPY independent of AGE. HAPPY dependent on AGE, conditional on MARRIED.

Slide 77

Slide 77 text

Why not just add everything? • Matters for experiments as well • Conditioning on post-treatment variables can be very bad • Conditioning on pre-treatment can also be bad (colliders) • Good news! • Causal inference possible in observational settings • But requires good theory

Slide 78

Slide 78 text

Texts in Statistical Science: Richard McElreath, Statistical Rethinking: A Bayesian Course with Examples in R and Stan, Second Edition. JUST COUNTING. IMPLICATIONS OF ASSUMPTIONS.