Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Bayesian Statistical Analysis: A Gentle Introdu...
Search
Chris Fonnesbeck
December 05, 2011
Research
4
610
Bayesian Statistical Analysis: A Gentle Introduction
Get to know the Reverend Bayes.Reverend
Chris Fonnesbeck
December 05, 2011
Tweet
Share
More Decks by Chris Fonnesbeck
See All by Chris Fonnesbeck
Statistical Thinking for Data Science
fonnesbeck
5
1k
Structured Decision-making and Adaptive Management For The Control Of Infectious Disease
fonnesbeck
3
100
Estimating Microbial Diversity
fonnesbeck
0
110
Other Decks in Research
See All in Research
研究を支える拡張性の高い ワークフローツールの提案 / Proposal of highly expandable workflow tools to support research
linyows
0
250
論文読み会 KDD2024 | Relevance meets Diversity: A User-Centric Framework for Knowledge Exploration through Recommendations
cocomoff
0
140
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
sansan_randd
1
420
文書画像のデータ化における VLM活用 / Use of VLM in document image data conversion
sansan_randd
2
420
コミュニティドライブプロジェクト
smartfukushilab1
0
120
渋谷Well-beingアンケート調査結果
shibuyasmartcityassociation
0
380
アプリケーションから知るモデルマージ
maguro27
0
230
marukotenant01/tenant-20240916
marketing2024
0
650
Optimal and Diffusion Transports in Machine Learning
gpeyre
0
810
医療支援AI開発における臨床と情報学の連携を円滑に進めるために
moda0
0
140
第79回 産総研人工知能セミナー 発表資料
agiats
3
190
文化が形作る音楽推薦の消費と、その逆
kuri8ive
0
220
Featured
See All Featured
How GitHub (no longer) Works
holman
312
140k
Building Applications with DynamoDB
mza
93
6.2k
Site-Speed That Sticks
csswizardry
3
270
Building a Modern Day E-commerce SEO Strategy
aleyda
38
7k
Optimizing for Happiness
mojombo
376
70k
A Philosophy of Restraint
colly
203
16k
Practical Orchestrator
shlominoach
186
10k
A Tale of Four Properties
chriscoyier
157
23k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
26
1.9k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
365
25k
Building Your Own Lightsaber
phodgson
104
6.2k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
230
52k
Transcript
Bayesian Statistical Analysis A Gentle Introduction Center for Quantitative Sciences
Workshop 18 November 2011 Christopher J. Fonnesbeck Monday, December 5, 11
What is Bayesian Inference? Monday, December 5, 11
Practical methods for making inferences from data using probability models
for quantities we observe and about which we wish to learn. Gelman et al., 2004 Monday, December 5, 11
Rev. Thomas Bayes Monday, December 5, 11
Rev. Thomas Bayes Simon Laplace Monday, December 5, 11
Conclusions in terms of probability statements p( |y) unknowns observations
Monday, December 5, 11
Classical inference conditions on unknown parameter p(y| ) unknowns observations
Monday, December 5, 11
Classical vs Bayesian Statistics Monday, December 5, 11
Frequentist Monday, December 5, 11
Frequentist observations random Monday, December 5, 11
Frequentist model, parameters fixed Monday, December 5, 11
Frequentist Inference Monday, December 5, 11
Choose an estimator ˆ µ = P xi n based
on frequentist (asymptotic) criteria Monday, December 5, 11
Choose a test statistic based on frequentist (asymptotic) criteria t
= ¯ x µ s/ p n Monday, December 5, 11
Bayesian Monday, December 5, 11
Bayesian observations fixed Monday, December 5, 11
Bayesian model, parameters “random” Monday, December 5, 11
Components of Bayesian Statistics Monday, December 5, 11
Specify full probability model 1 Pr(y| )Pr( |⇥)Pr(⇥) Monday, December
5, 11
data y Monday, December 5, 11
data y covariates X Monday, December 5, 11
data y covariates X parameters ✓ Monday, December 5, 11
data y covariates X parameters ✓ missing data ˜ y
Monday, December 5, 11
2 Calculate posterior distribution Pr( |y) Monday, December 5, 11
3Check model for lack of fit Monday, December 5, 11
Why Bayes? ? Monday, December 5, 11
“... the Bayesian approach is attractive because it is useful.
Its usefulness derives in large measure from its simplicity. Its simplicity allows the investigation of far more complex models than can be handled by the tools in the classical toolbox.” Link and Barker (2010) Monday, December 5, 11
coherence X ˜ y y ✓ Monday, December 5, 11
Interpretation Monday, December 5, 11
Pr( ¯ Y 1.96 ⇥ ⇥ n < µ <
¯ Y + 1.96 ⇥ ⇥ n ) = 0.95 Confidence Interval Pr(a(Y ) < ✓ < b(Y )|✓) = 0.95 Monday, December 5, 11
Credible Interval Pr(a(y) < ✓ < b(y)|Y = y) =
0.95 Monday, December 5, 11
Uncertainty Monday, December 5, 11
C alpha N z b_psi beta a_psi pi mu psi
Ntotal occupied a b Ndist psi z alpha pi N beta mu occupied N alpha beta N alpha beta Complex Models Monday, December 5, 11
Probability Monday, December 5, 11
Pr(A) = m n A = an event of interest
m = no. of favourable outcomes n = total no. of possible outcomes (1) classical Monday, December 5, 11
all elementary events are equally likely Monday, December 5, 11
Pr(A) = lim n→∞ m n n = no. of
identical and independent trials m = no. of times A has occurred (2) frequentist Monday, December 5, 11
Between 1745 and 1770 there were 241,945 girls and 251,527
boys born in Paris Monday, December 5, 11
A = “Chris has Type A blood” Monday, December 5,
11
A = “Titans will win Superbowl XLVI” Monday, December 5,
11
A = “The prevalence of diabetes in Nashville is >
0.15” Monday, December 5, 11
(3) subjective Pr(A) Monday, December 5, 11
Measure of one’s uncertainty regarding the occurrence of A Pr(A)
Monday, December 5, 11
Pr(A|H) Monday, December 5, 11
A = “It is raining in Atlanta” Monday, December 5,
11
Pr(A|H) = 0.5 Monday, December 5, 11
Pr( A|H ) = ⇢ 0 . 4 if raining
in Nashville 0 . 25 otherwise Monday, December 5, 11
Pr(A|H) = 1, if raining 0, otherwise Monday, December 5,
11
S A Pr(A) = area of A area of S
Monday, December 5, 11
S A B A ∩ B Pr(A ⇥ B) =
Pr(A) + Pr(B) Pr(A ⇤ B) Monday, December 5, 11
A A ∩ B Pr(B|A) = Pr(A B) Pr(A) Monday,
December 5, 11
A A ∩ B conditional probability Pr(B|A) = Pr(A B)
Pr(A) Monday, December 5, 11
Independence Pr(B|A) = Pr(B) Monday, December 5, 11
S A B A ∩ B Pr(B|A) = Pr(A B)
Pr(A) Monday, December 5, 11
S A B A ∩ B Pr(A|B) = Pr(A B)
Pr(B) Pr(B|A) = Pr(A B) Pr(A) Monday, December 5, 11
Pr(A B) = Pr(A|B)Pr(B) = Pr(B|A)Pr(A) Monday, December 5, 11
Bayes Theorem Pr(B|A) = Pr(A|B)Pr(B) Pr(A) Monday, December 5, 11
Bayes Theorem Pr( |y) = Pr(y| )Pr( ) Pr(y) Posterior
Probability Prior Probability Likelihood of Observations Normalizing Constant Monday, December 5, 11
Bayes Theorem Pr( |y) = Pr(y| )Pr( ) R Pr(y|
)Pr( )d Monday, December 5, 11
“proportional to” Pr( |y) Pr(y| )Pr( ) Monday, December 5,
11
Pr( |y) Pr(y| )Pr( ) Posterior Prior Likelihood Monday, December
5, 11
information p( |y) p(y| )p( ) Monday, December 5, 11
“Following observation of , the likelihood contains all experimental information
from about the unknown .” θ y y L(✓|y) Monday, December 5, 11
binomial model data parameter sampling distribution of X p(X|✓) =
✓ N n ◆ ✓x (1 ✓)N x Monday, December 5, 11
binomial model likelihood function for θ L(✓|X) = ✓ N
n ◆ ✓x (1 ✓)N x Monday, December 5, 11
prior distribution p(θ|y) ∝ p(y|θ)p(θ) Monday, December 5, 11
Prior as population distribution Monday, December 5, 11
Monday, December 5, 11
Prior as information state Monday, December 5, 11
Monday, December 5, 11
All plausible values Monday, December 5, 11
Between 1745 and 1770 there were 241,945 girls and 251,527
boys born in Paris Monday, December 5, 11
Bayesian analysis is subjective Monday, December 5, 11
Statistical analysis is subjective Monday, December 5, 11
“... all forms of statistical inference make assumptions, assumptions which
can only be tested very crudely and can almost never be verified.” - Robert E. Kass Monday, December 5, 11
3 Model checking Monday, December 5, 11
1.5 2.0 2.5 0.0 0.2 0.4 0.6 0.8 1.0 x
p(x) separation Monday, December 5, 11
source: Gelman et al. 2008 Monday, December 5, 11
weakly-informative prior -4 -2 0 2 4 0.0 0.1 0.2
0.3 0.4 xrange Pr(x) Monday, December 5, 11
source: Gelman et al. 2008 Monday, December 5, 11
example: genetic probabilities Monday, December 5, 11
X-linked recessive Monday, December 5, 11
Monday, December 5, 11
affected carrier no gene unknown Woman Husband Brother Mother is
the woman a carrier? Monday, December 5, 11
Pr(θ = 1) = Pr(θ = 0) = 1 2
Pr(θ = 1) Pr(θ = 0) = 1 prior odds Monday, December 5, 11
affected carrier no gene unknown Woman Husband Brother Son Son
Mother Monday, December 5, 11
Pr(y1 = 0, y2 = 0|θ = 1) = (0.5)(0.5)
= 0.25 Monday, December 5, 11
Pr(y1 = 0, y2 = 0|θ = 1) = (0.5)(0.5)
= 0.25 Pr(y1 = 0, y2 = 0|θ = 0) = 1 Monday, December 5, 11
Pr(y1 = 0, y2 = 0|θ = 1) = (0.5)(0.5)
= 0.25 Pr(y1 = 0, y2 = 0|θ = 0) = 1 “likelihood ratio” p(y1 = 0, y2 = 0|θ = 1) p(y1 = 0, y2 = 0|θ = 0) = 0.25 1 = 1/4 Monday, December 5, 11
what about Mom? Monday, December 5, 11
what about Mom? y = {y1 = 0, y2 =
0} Pr( = 1|y) = Pr(y| = 1)Pr( = 1) Pr(y) = Pr(y| = 1)Pr( = 1) P ✓ Pr(y| )Pr( ) Monday, December 5, 11
y = {y1 = 0, y2 = 0} Monday, December
5, 11
Pr( = 1|y) = p(y| = 1)Pr( = 1) p(y|
= 1)Pr( = 1) + p(y| = 0)Pr( = 0) y = {y1 = 0, y2 = 0} Monday, December 5, 11
Pr( = 1|y) = p(y| = 1)Pr( = 1) p(y|
= 1)Pr( = 1) + p(y| = 0)Pr( = 0) = (0.25)(0.5) (0.25)(0.5) + (1.0)(0.5) = 0.125 0.625 = 0.2 y = {y1 = 0, y2 = 0} Monday, December 5, 11
3rd unaffected son? Pr( = 1|y3 ) = (0.5)(0.2) (0.5)(0.2)
+ (1)(0.8) = 0.111 posterior from previous Monday, December 5, 11
Hierarchical Models Monday, December 5, 11
effectiveness of cardiac surgery example Monday, December 5, 11
Hospital Operations Deaths A 47 0 B 148 18 C
119 8 D 810 46 E 211 8 F 196 13 G 148 9 H 215 31 I 207 14 J 97 8 K 256 29 L 360 24 Monday, December 5, 11
clustering induces dependence between observations Monday, December 5, 11
parameters sampled from common distribution j hospital j survival rate
Monday, December 5, 11
population distribution j f(⇥) hyperparameters Monday, December 5, 11
θ1 θ2 θk y1 y2 yk ... ... deaths parameters
Monday, December 5, 11
θ1 θ2 θk y1 y2 yk ... ... deaths parameters
µ, σ2 hyperparameters Monday, December 5, 11
, ϕµ ϕσ θ1 θ2 θk y1 y2 yk ...
... deaths parameters µ, σ2 hyperparameters Monday, December 5, 11
non-hierarchical models of hierarchical data can easily be underfit or
overfit Monday, December 5, 11
“experiments” j = 1, . . . , J likelihood
∼ Binomial( , ) deaths j operations j θj logit( ) ∼ N(µ, ) θi σ2 population model µ ∼ , ∼ Pµ σ2 Pσ priors Monday, December 5, 11
0/47 = 0 18/148 = 0.12 8/119 = 0.07 46/810
= 0.06 Monday, December 5, 11
Monday, December 5, 11
Monday, December 5, 11