Chris Fonnesbeck
December 05, 2011
Bayesian Statistical Analysis: A Gentle Introduction
Get to know the Reverend Bayes.
Transcript
Bayesian Statistical Analysis: A Gentle Introduction
Center for Quantitative Sciences Workshop, 18 November 2011
Christopher J. Fonnesbeck
Monday, December 5, 11
What is Bayesian Inference?
Practical methods for making inferences from data using probability models for quantities we observe and about which we wish to learn. (Gelman et al., 2004)
Rev. Thomas Bayes
Simon Laplace
Conclusions in terms of probability statements: $p(\theta \mid y)$, where $\theta$ denotes the unknowns and $y$ the observations.
Classical inference conditions on the unknown parameter: $p(y \mid \theta)$.
Classical vs Bayesian Statistics
Frequentist: observations are random; the model and parameters are fixed.
Frequentist Inference
Choose an estimator based on frequentist (asymptotic) criteria: $\hat{\mu} = \frac{\sum x_i}{n}$
Choose a test statistic based on frequentist (asymptotic) criteria: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
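The estimator and test statistic on these two slides can be sketched in a few lines. This is a minimal illustration, not code from the talk; the sample values and the hypothesized mean `mu0` are made up for the example.

```python
import math
import statistics


def sample_mean(xs):
    """Estimator from the slide: the sum of the observations divided by n."""
    return sum(xs) / len(xs)


def t_statistic(xs, mu0):
    """Test statistic from the slide: (x_bar - mu) / (s / sqrt(n)).

    s is the sample standard deviation (n - 1 denominator).
    """
    n = len(xs)
    x_bar = sample_mean(xs)
    s = statistics.stdev(xs)
    return (x_bar - mu0) / (s / math.sqrt(n))


# Hypothetical measurements, invented for illustration only.
data = [4.9, 5.1, 5.3, 4.7, 5.0]
print(sample_mean(data))
print(t_statistic(data, mu0=5.0))
```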
Bayesian: observations are fixed; the model and parameters are “random”.
Components of Bayesian Statistics
1. Specify a full probability model: $\Pr(y \mid \theta)\,\Pr(\theta \mid \Theta)\,\Pr(\Theta)$
   Model components: data $y$, covariates $X$, parameters $\theta$, missing data $\tilde{y}$
2. Calculate the posterior distribution: $\Pr(\theta \mid y)$
3. Check the model for lack of fit
Why Bayes?
“... the Bayesian approach is attractive because it is useful. Its usefulness derives in large measure from its simplicity. Its simplicity allows the investigation of far more complex models than can be handled by the tools in the classical toolbox.” Link and Barker (2010)
Coherence: $X$, $\tilde{y}$, $y$, $\theta$
Interpretation
Confidence Interval: $\Pr\left(\bar{Y} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{Y} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$, or in general $\Pr(a(Y) < \theta < b(Y) \mid \theta) = 0.95$
Credible Interval: $\Pr(a(y) < \theta < b(y) \mid Y = y) = 0.95$
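The confidence-interval statement conditions on a fixed parameter and treats the data as random, so its 0.95 describes the procedure over repeated samples. The sketch below illustrates that reading by simulation. It assumes normal data with a known standard deviation, and the values of `mu`, `sigma`, `n`, and `trials` are arbitrary choices for the example.

```python
import math
import random


def coverage(mu, sigma, n, trials, seed=0):
    """Fraction of repeated samples whose interval
    y_bar +/- 1.96 * sigma / sqrt(n) contains the fixed mean mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        y_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if y_bar - half_width < mu < y_bar + half_width:
            hits += 1
    return hits / trials


# Arbitrary illustrative settings; the result should be close to 0.95.
rate = coverage(mu=10.0, sigma=2.0, n=25, trials=5000)
print(rate)
```

A credible interval answers a different question: it is a probability statement about the parameter given the one data set actually observed.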
Uncertainty
Complex Models
[Figure: directed graphs of models with nodes including alpha, beta, mu, pi, psi, a_psi, b_psi, z, N, Ntotal, Ndist, and occupied]
Probability
(1) classical: $\Pr(A) = \frac{m}{n}$, where $A$ = an event of interest, $m$ = no. of favourable outcomes, $n$ = total no. of possible outcomes. All elementary events are equally likely.
(2) frequentist: $\Pr(A) = \lim_{n \to \infty} \frac{m}{n}$, where $n$ = no. of identical and independent trials, $m$ = no. of times $A$ has occurred.
Between 1745 and 1770 there were 241,945 girls and 251,527 boys born in Paris.
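Under the frequentist definition, the Paris birth counts give a relative frequency for the event "a birth is a girl". A small sketch of that arithmetic follows; the variable names are chosen for this example only.

```python
# Birth counts quoted on the slide (Paris, 1745 to 1770).
girls = 241_945
boys = 251_527

total = girls + boys
freq_girl = girls / total  # m / n for the event "birth is a girl"

print(total)                # 493472
print(round(freq_girl, 4))  # 0.4903
```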
A = “Chris has Type A blood”
A = “Titans will win Superbowl XLVI”
A = “The prevalence of diabetes in Nashville is > 0.15”
(3) subjective: $\Pr(A)$ is a measure of one’s uncertainty regarding the occurrence of $A$.
$\Pr(A \mid H)$
A = “It is raining in Atlanta”
$\Pr(A \mid H) = 0.5$
$\Pr(A \mid H) = \begin{cases} 0.4 & \text{if raining in Nashville} \\ 0.25 & \text{otherwise} \end{cases}$
$\Pr(A \mid H) = \begin{cases} 1 & \text{if raining} \\ 0 & \text{otherwise} \end{cases}$
[Venn diagram: event A inside sample space S] $\Pr(A) = \frac{\text{area of } A}{\text{area of } S}$
[Venn diagram: events A and B in S, overlapping in A ∩ B] $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$
Conditional probability: $\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)}$
Independence: $\Pr(B \mid A) = \Pr(B)$
$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)}$ and $\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)}$
$\Pr(A \cap B) = \Pr(A \mid B)\Pr(B) = \Pr(B \mid A)\Pr(A)$
Bayes Theorem: $\Pr(B \mid A) = \frac{\Pr(A \mid B)\Pr(B)}{\Pr(A)}$
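The set identities leading up to Bayes' theorem can be checked on a small finite sample space. The sketch below uses a fair six-sided die with equally likely outcomes, so probability is a ratio of counts in place of a ratio of areas. The events `A` and `B` are invented for the illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair die, all outcomes equally likely.
S = set(range(1, 7))
A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3


def pr(event):
    """Pr(event) as the share of the sample space it covers."""
    return Fraction(len(event & S), len(S))


def pr_given(event, given):
    """Conditional probability Pr(event | given) = Pr(event and given) / Pr(given)."""
    return pr(event & given) / pr(given)


# Addition rule: Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B)
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)

# Product rule: Pr(A and B) = Pr(A|B) Pr(B) = Pr(B|A) Pr(A)
assert pr(A & B) == pr_given(A, B) * pr(B) == pr_given(B, A) * pr(A)

# Bayes' theorem: Pr(B|A) = Pr(A|B) Pr(B) / Pr(A)
bayes = pr_given(A, B) * pr(B) / pr(A)
assert bayes == pr_given(B, A)
print(bayes)  # 2/3
```

Exact fractions are used so the identities hold with equality, with no floating-point tolerance needed.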
Bayes Theorem: $\Pr(\theta \mid y) = \frac{\Pr(y \mid \theta)\Pr(\theta)}{\Pr(y)}$
Here $\Pr(\theta \mid y)$ is the posterior probability, $\Pr(\theta)$ the prior probability, $\Pr(y \mid \theta)$ the likelihood of the observations, and $\Pr(y)$ the normalizing constant.
Bayes Theorem: $\Pr(\theta \mid y) = \frac{\Pr(y \mid \theta)\Pr(\theta)}{\int \Pr(y \mid \theta)\Pr(\theta)\,d\theta}$
“proportional to”: $\Pr(\theta \mid y) \propto \Pr(y \mid \theta)\Pr(\theta)$ (posterior ∝ likelihood × prior)
Information: $p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta)$
“Following observation of $y$, the likelihood $L(\theta \mid y)$ contains all experimental information from $y$ about the unknown $\theta$.”
Binomial model (data $X$, parameter $\theta$)
Sampling distribution of $X$: $p(X = x \mid \theta) = \binom{N}{x}\theta^{x}(1-\theta)^{N-x}$
Likelihood function for $\theta$: $L(\theta \mid X = x) = \binom{N}{x}\theta^{x}(1-\theta)^{N-x}$
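The binomial likelihood can be evaluated directly, and the "posterior is proportional to likelihood times prior" relation can be approximated on a grid of parameter values. The sketch below is an illustration under stated assumptions: a flat prior on [0, 1], a hypothetical data set of 7 successes in 10 trials, and a 1001-point grid. None of these come from the slides.

```python
import math


def binomial_likelihood(theta, x, N):
    """L(theta | x): probability of x successes in N trials given theta."""
    return math.comb(N, x) * theta**x * (1 - theta) ** (N - x)


def grid_posterior(x, N, points=1001):
    """Posterior weights on an evenly spaced grid, assuming a flat prior.

    Dividing by the sum stands in for the normalizing constant in
    Bayes' theorem.
    """
    grid = [i / (points - 1) for i in range(points)]
    prior = 1.0  # flat prior: the same weight for every grid value
    unnormalized = [binomial_likelihood(t, x, N) * prior for t in grid]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]


# Hypothetical data: 7 successes out of 10 trials.
grid, posterior = grid_posterior(x=7, N=10)
mode = grid[posterior.index(max(posterior))]
print(mode)  # 0.7
```

With a flat prior the posterior mode coincides with the maximum-likelihood value x / N; an informative prior would pull it away from that value.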
Prior distribution: $p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta)$
Prior as population distribution
Prior as information state
All plausible values
Between 1745 and 1770 there were 241,945 girls and 251,527 boys born in Paris.
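As a rough sketch of how such counts can be analysed, suppose the prior on the probability of a female birth is uniform on [0, 1]. The slide does not state the prior, so this is an assumption of the example. With a binomial likelihood the posterior is then Beta(girls + 1, boys + 1). The tail probability below uses a normal approximation to that Beta distribution, chosen to keep the example within the standard library.

```python
import math

girls, boys = 241_945, 251_527

# Uniform prior (assumed) + binomial likelihood -> Beta(a, b) posterior.
a, b = girls + 1, boys + 1

post_mean = a / (a + b)
post_sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

# Normal approximation to Pr(theta >= 0.5 | y); the exact value would
# require the Beta distribution function.
z = (0.5 - post_mean) / post_sd
tail = 0.5 * math.erfc(z / math.sqrt(2))

print(round(post_mean, 4))
print(post_sd)
print(tail)
```

The posterior is concentrated so tightly below 0.5 that the approximate tail probability is vanishingly small.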
Bayesian analysis is subjective
Statistical analysis is subjective
“... all forms of statistical inference make assumptions, assumptions which can only be tested very crudely and can almost never be verified.” - Robert E. Kass
3. Model checking
[Figure: separation; p(x) plotted against x] source: Gelman et al. 2008
[Figure: weakly-informative prior; Pr(x) plotted against xrange] source: Gelman et al. 2008
Example: genetic probabilities
X-linked recessive
[Pedigree: Woman, Husband, Brother, Mother; legend: affected, carrier, no gene, unknown] Is the woman a carrier?
Prior: $\Pr(\theta = 1) = \Pr(\theta = 0) = \frac{1}{2}$
Prior odds: $\frac{\Pr(\theta = 1)}{\Pr(\theta = 0)} = 1$
[Pedigree: Woman, Husband, Brother, Mother, and two Sons; legend: affected, carrier, no gene, unknown]
$\Pr(y_1 = 0, y_2 = 0 \mid \theta = 1) = (0.5)(0.5) = 0.25$
$\Pr(y_1 = 0, y_2 = 0 \mid \theta = 0) = 1$
“Likelihood ratio”: $\frac{p(y_1 = 0, y_2 = 0 \mid \theta = 1)}{p(y_1 = 0, y_2 = 0 \mid \theta = 0)} = \frac{0.25}{1} = 1/4$
What about Mom? With $y = \{y_1 = 0, y_2 = 0\}$:
$\Pr(\theta = 1 \mid y) = \frac{\Pr(y \mid \theta = 1)\Pr(\theta = 1)}{\Pr(y)} = \frac{\Pr(y \mid \theta = 1)\Pr(\theta = 1)}{\sum_{\theta} \Pr(y \mid \theta)\Pr(\theta)}$
$\Pr(\theta = 1 \mid y) = \frac{p(y \mid \theta = 1)\Pr(\theta = 1)}{p(y \mid \theta = 1)\Pr(\theta = 1) + p(y \mid \theta = 0)\Pr(\theta = 0)} = \frac{(0.25)(0.5)}{(0.25)(0.5) + (1.0)(0.5)} = \frac{0.125}{0.625} = 0.2$
3rd unaffected son? Using the posterior from the previous step as the prior: $\Pr(\theta = 1 \mid y_3) = \frac{(0.5)(0.2)}{(0.5)(0.2) + (1)(0.8)} = 0.111$
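The carrier calculation is a two-hypothesis Bayes update, and the third-son step reuses the previous posterior as the new prior. A minimal sketch with exact fractions follows; the function name `update` is chosen for this illustration.

```python
from fractions import Fraction


def update(prior_carrier, lik_carrier, lik_noncarrier):
    """Posterior Pr(theta = 1 | data) for a two-valued parameter.

    prior_carrier  : Pr(theta = 1) before seeing the data
    lik_carrier    : Pr(data | theta = 1)
    lik_noncarrier : Pr(data | theta = 0)
    """
    numerator = lik_carrier * prior_carrier
    denominator = numerator + lik_noncarrier * (1 - prior_carrier)
    return numerator / denominator


prior = Fraction(1, 2)

# Two unaffected sons: likelihood (1/2)(1/2) if carrier, 1 if not.
after_two_sons = update(prior, Fraction(1, 4), Fraction(1))
print(after_two_sons)  # 1/5

# Third unaffected son: the previous posterior becomes the prior.
after_three_sons = update(after_two_sons, Fraction(1, 2), Fraction(1))
print(after_three_sons)  # 1/9

# Processing all three sons at once gives the same answer.
assert update(prior, Fraction(1, 8), Fraction(1)) == after_three_sons
```

The values 1/5 and 1/9 match the 0.2 and 0.111 on the slides, and the final check shows that sequential and batch updating agree.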
Hierarchical Models
Example: effectiveness of cardiac surgery
Hospital  Operations  Deaths
A         47          0
B         148         18
C         119         8
D         810         46
E         211         8
F         196         13
G         148         9
H         215         31
I         207         14
J         97          8
K         256         29
L         360         24
Clustering induces dependence between observations
Parameters sampled from a common distribution: $\theta_j$ = hospital $j$ survival rate
Population distribution: $\theta_j \sim f(\Theta)$, where $\Theta$ denotes the hyperparameters
[Diagram: hyperpriors $\phi_\mu$, $\phi_\sigma$ → hyperparameters $\mu$, $\sigma^2$ → parameters $\theta_1, \theta_2, \ldots, \theta_k$ → deaths $y_1, y_2, \ldots, y_k$]
Non-hierarchical models of hierarchical data can easily be underfit or overfit
“Experiments” $j = 1, \ldots, J$
Likelihood: $\text{deaths}_j \sim \text{Binomial}(\text{operations}_j, \theta_j)$
Population model: $\text{logit}(\theta_j) \sim N(\mu, \sigma^2)$
Priors: $\mu \sim P_\mu$, $\sigma^2 \sim P_\sigma$
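One way to read this model is as a recipe for generating data: draw each hospital's rate on the logit scale from the population distribution, then draw deaths from a binomial. The sketch below simulates forward from the model and performs no inference. The hyperparameter values `mu=-2.5` and `sigma=0.4` are arbitrary assumptions for illustration and are not estimates from the talk; the operation counts are those in the hospital table.

```python
import math
import random


def inv_logit(x):
    """Inverse of the logit transform: maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_deaths(operations, mu, sigma, seed=0):
    """Forward-simulate the hierarchical model once.

    logit(theta_j) ~ N(mu, sigma^2)
    deaths_j       ~ Binomial(operations_j, theta_j)
    """
    rng = random.Random(seed)
    thetas, deaths = [], []
    for n_ops in operations:
        theta = inv_logit(rng.gauss(mu, sigma))
        # A binomial draw as the number of successes in n_ops Bernoulli trials.
        d = sum(1 for _ in range(n_ops) if rng.random() < theta)
        thetas.append(theta)
        deaths.append(d)
    return thetas, deaths


# Operation counts for hospitals A to L from the table.
operations = [47, 148, 119, 810, 211, 196, 148, 215, 207, 97, 256, 360]

# Hypothetical hyperparameter values, chosen only to make the sketch run.
thetas, deaths = simulate_deaths(operations, mu=-2.5, sigma=0.4)
for n_ops, theta, d in zip(operations, thetas, deaths):
    print(n_ops, round(theta, 3), d)
```

Fitting the model means running this logic in reverse: finding the distribution of `mu`, `sigma`, and the hospital rates given the observed deaths, typically with MCMC software.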
0/47 = 0, 18/148 = 0.12, 8/119 = 0.07, 46/810 = 0.06
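The raw proportions for every hospital in the table, and the single pooled proportion, can be computed as below. These are the two extremes a hierarchical model sits between: a separate rate per hospital with no sharing, or one rate for all hospitals. The sketch only reproduces the arithmetic and does not fit the hierarchical model.

```python
hospitals = list("ABCDEFGHIJKL")
operations = [47, 148, 119, 810, 211, 196, 148, 215, 207, 97, 256, 360]
deaths = [0, 18, 8, 46, 8, 13, 9, 31, 14, 8, 29, 24]

# No pooling: each hospital's rate estimated from its own data alone.
raw_rates = {h: d / n for h, d, n in zip(hospitals, deaths, operations)}

# Complete pooling: one rate shared by every hospital.
pooled_rate = sum(deaths) / sum(operations)

for h in hospitals:
    print(h, round(raw_rates[h], 2))
print("pooled", round(pooled_rate, 3))
```

Hospital A's raw rate of exactly 0 from only 47 operations is the kind of estimate that partial pooling would pull toward the overall rate.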