BeginneR Advanced Hoxo_m
If I have seen further it is by
standing on the sholders of Giants.
-- Sir Isaac Newton, 1676
Slide 9
Slide 9 text
BeginneR Session 1
-- Bayesian Modeling --
Slide 10
Slide 10 text
Agenda
What is modeling?
Welcome to Bayesian statistics
Slide 11
Slide 11 text
What is modeling?
Slide 12
Slide 12 text
What is modeling?
[Figure: a model f(X) relating Truth and Knowledge]
Slide 13
Slide 13 text
What is modeling?
[Figure: a model f(X) relating Truth and Knowledge; modeling in the narrow sense vs. the broad sense]
Slide 14
Slide 14 text
What is modeling?
Hypothesis Driven: “Strong” hypothesis, data
Data Driven: “Weak” hypothesis, data
Slide 15
Slide 15 text
What is modeling?
[Figure: a model f(X) for each approach: Hypothesis Driven and Data Driven]
Slide 16
Slide 16 text
What is modeling?
A/B test: Hypothesis Driven
(I've never actually run one, though!)
A or B
HA: A is better
HB: B is better
H0: We have to choose the better one of the two
Strong hypothesis * Simple data (A, B)
Slide 17
Slide 17 text
What is modeling?
Meta Analysis: Data Driven
(What does everyone call this?)
H0: There is a best/better way
Weak hypothesis * Complex data
Slide 18
Slide 18 text
What is modeling?
Data Driven Analysis
Hypothesis Driven Analysis
How to do?
What to do?
Decision Making
Weak Hypothesis
Strong Hypothesis
Simple Data
Complex Data
Slide 19
Slide 19 text
What is modeling?
Data Driven
Hypothesis Driven
How to do?
What to do?
Decision Making
Weak Hypothesis
Strong Hypothesis
Simple Data
Complex Data
Simple
Model
Complex
Model
Slide 20
Slide 20 text
What is modeling?
Data Driven
Hypothesis Driven
How to do?
What to do?
Decision Making
Weak Hypothesis
Strong Hypothesis
Simple Data
Complex Data
Simple
Model
Complex
Model
Narrow
sense
Broad sense
Slide 21
Slide 21 text
What is modeling?
[Figure: a model f(X) relating Truth and Knowledge; modeling in the narrow sense vs. the broad sense]
Slide 22
Slide 22 text
A or B
HA: A is better
HB: B is better
H0: We have to choose the better one of the two
A B
*
A is better
Slide 23
Slide 23 text
There is only one difference
between a madman and me.
The madman thinks he is sane.
I know I am mad.
-- Salvador Dalí
Dalí is a dilly (1956). The American Magazine, 162(1), 28–9, 107–9.
Slide 24
Slide 24 text
A or B
HA: A is better
HB: B is better
H0: We have to choose the better one of the two.
H1: There is a difference d between A and B
A > B: A is better
Slide 25
Slide 25 text
Welcome to
Bayesian statistics
Slide 26
Slide 26 text
Dice with α faces
(regular polyhedron)
Truth -> Knowledge: Hypothesis (α = ?), Observation (= 5)
friend: Could you find α?
you: Yes. α is estimated at 6!!
friend: Why do you think so?
you: Because $\arg\max_{\alpha} \{P(D \mid \alpha)\} = 6$!!
friend: Hmmmm…, well.., how much is $P(\alpha = 6)$?
you: Oh, it is $\frac{1}{6^{10}}$!! ….nnNNNO!!! WHAT!!????
Slide 30
Slide 30 text
Dice with α faces
$D = \{5, 4, 3, 4, 2, 1, 2, 3, 1, 4\}$
you (before): $P(D \mid \alpha = 6) = \frac{1}{6^{10}}$, the maximum likelihood
friend: Hmmmm… Well.., how much is $P(\alpha = 6)$?
you (after): $P(\alpha = 6 \mid D)$!!??
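The maximum-likelihood argument on this slide can be checked with a few lines of R. This is only a sketch: it assumes a fair die, and it assumes the candidate values of α are the face counts of the five regular polyhedra (4, 6, 8, 12, 20). The names `alpha_candidates` and `dice_likelihood` are made up for this illustration.

```r
# Observed rolls from the slide
D <- c(5, 4, 3, 4, 2, 1, 2, 3, 1, 4)

# Assumed candidates: face counts of the five regular polyhedra
alpha_candidates <- c(4, 6, 8, 12, 20)

# Likelihood of the rolls for a fair die with `alpha` faces:
# zero if any roll exceeds alpha, otherwise (1 / alpha)^n
dice_likelihood <- function(alpha, rolls) {
  if (any(rolls > alpha)) return(0)
  (1 / alpha)^length(rolls)
}

lik <- sapply(alpha_candidates, dice_likelihood, rolls = D)
names(lik) <- alpha_candidates

# The candidate with the largest likelihood is the maximum-likelihood estimate
alpha_hat <- alpha_candidates[which.max(lik)]
print(lik)
print(alpha_hat)
```

A 4-faced die is ruled out because a 5 was observed, and every larger die spreads its probability over more faces, so the smallest die consistent with the data wins. Note that `lik` holds P(D | α), not P(α | D), which is exactly the gap the friend points out.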
Slide 31
Slide 31 text
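[Diagram: sample space, realization, stochastic variable, probability distribution]
sample space (can NEVER get): infinitely many outcomes, each a natural number
realization: one value drawn from the sample space Ω, x <- sample(Ω, 1)
stochastic variable and its probability distribution, estimated from observed rolls:
rolls <- c(1, 1, 1, 1, 1, 2, 2, 3, 4, 5, 5)
hist(rolls, freq = FALSE, labels = TRUE)
t → number of trials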
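The role of the number of trials can be illustrated by simulation. The sketch below assumes a fair 6-faced die and an arbitrary set of trial counts; `trial_counts`, `rel_freq`, and `max_error` are names chosen for this example only. It shows the observed relative frequencies moving toward 1/6 as the number of trials grows.

```r
set.seed(1)

alpha <- 6                               # assumed number of faces
trial_counts <- c(10, 100, 1000, 100000) # assumed numbers of trials

# For each number of trials, roll the die and compute relative frequencies
rel_freq <- sapply(trial_counts, function(n_trials) {
  rolls <- sample(1:alpha, size = n_trials, replace = TRUE)
  tabulate(rolls, nbins = alpha) / n_trials
})
rownames(rel_freq) <- 1:alpha
colnames(rel_freq) <- trial_counts

# Largest deviation from the true probability 1 / alpha, per trial count
max_error <- apply(abs(rel_freq - 1 / alpha), 2, max)
print(round(rel_freq, 3))
print(max_error)
```

With only 10 rolls the histogram can look far from uniform; the full sample space is never available, so the distribution is only ever approximated from finite realizations.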
Slide 32
Slide 32 text
[Diagram: statistical modeling of the α-faced dice: sample space (can NEVER get), realization, probability distribution]
outcome function of α-faced dice:
library(purrr)
g <- function(alpha = 6, n_rep = 1000) {
  # n_rep stands in for the slide's idealized 1:∞ repetitions
  map(1:n_rep, ~ sample(1:alpha, size = 10, replace = TRUE))
}
rolls <- g()
X <- density(unlist(rolls))
statistical modeling: x ~ P(x | α)
Slide 33
Slide 33 text
[Diagram: statistical modeling: sample space, probability distribution with its parameter, realization]
X <- map(1:1000, ~ g())  # 1000 stands in for the slide's idealized 1:∞
x <- sample(X, 1)
After that, nobody knew
their whereabouts...
Slide 60
Slide 60 text
$\text{posterior} \propto \text{likelihood} \times \text{prior}$
[Diagram: data, prior distribution, likelihood, posterior distribution, predictive distribution]
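The relation between prior, likelihood, and posterior can be sketched with the dice example from earlier in the deck. This assumes a fair die, the ten rolls shown on the maximum-likelihood slide, and a uniform prior over the face counts of the five regular polyhedra. The uniform prior is a choice made for this illustration, not something the slides state.

```r
# Rolls from the earlier dice slide
D <- c(5, 4, 3, 4, 2, 1, 2, 3, 1, 4)

# Assumed candidates for alpha and an assumed uniform prior over them
alpha_candidates <- c(4, 6, 8, 12, 20)
prior <- rep(1 / length(alpha_candidates), length(alpha_candidates))

# Likelihood P(D | alpha) for a fair die: zero if a roll exceeds alpha
likelihood <- ifelse(alpha_candidates >= max(D),
                     (1 / alpha_candidates)^length(D),
                     0)

# Posterior is proportional to likelihood * prior; normalize to sum to 1
unnormalized <- likelihood * prior
posterior <- unnormalized / sum(unnormalized)
names(posterior) <- alpha_candidates
print(round(posterior, 4))
```

Unlike the raw likelihood, `posterior` is a probability distribution over α, so it can answer the friend's question about how probable α = 6 is, given the data and the assumed prior.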
Slide 61
Slide 61 text
Information Criterion in Bayesian modeling
$\text{posterior} \propto \text{likelihood} \times \text{prior}$
[Diagram: data, prior distribution, likelihood, posterior distribution; the predictive distribution is set against the Truth]
Slide 62
Slide 62 text
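Information Criterion in Bayesian modeling
$\text{posterior} \propto \text{likelihood} \times \text{prior}$
Kullback–Leibler divergence between the Truth $q(x)$ and the predictive distribution $p(x \mid D)$:
$D_{KL}(q \,\|\, p) = E_q[-\log p(x \mid D)] - E_q[-\log q(x)] = E_q\left[\log \frac{q(x)}{p(x \mid D)}\right]$
(each term is an expectation of self-information)
[Diagram: data, prior distribution, likelihood, posterior distribution, predictive distribution]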
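The Kullback–Leibler divergence is straightforward to compute for discrete distributions. The sketch below assumes the truth is a fair 6-faced die and compares it with two hypothetical predictive distributions, one for a 6-faced and one for an 8-faced die. The function `kl_divergence` and all vector names are made up for this example.

```r
# KL divergence of discrete distributions: sum of q * (log q - log p),
# taken over the outcomes where the truth q has positive probability
kl_divergence <- function(q, p) {
  stopifnot(length(q) == length(p))
  support <- q > 0
  sum(q[support] * (log(q[support]) - log(p[support])))
}

# Assumed truth: a fair 6-faced die, written over the faces 1..8
truth <- c(rep(1 / 6, 6), 0, 0)

# Two hypothetical predictive distributions over the same faces
pred_six   <- c(rep(1 / 6, 6), 0, 0)  # predicts a 6-faced die
pred_eight <- rep(1 / 8, 8)           # predicts an 8-faced die

kl_six   <- kl_divergence(truth, pred_six)
kl_eight <- kl_divergence(truth, pred_eight)
print(c(six = kl_six, eight = kl_eight))
```

The divergence is zero when the prediction matches the truth and grows as the prediction wastes probability on faces that never occur. In practice the truth is unknown, which is why information criteria estimate this quantity rather than compute it directly.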
x or y
HA: A is better
HB: B is better
H0: We have to choose the better one of the two.
H1: There is a difference θ between x and y
A > B: A is better
[Diagram: θ = x − y, where x and y are each generated from their own likelihood and prior]
Slide 82
Slide 82 text
x or y
HA: A is better
HB: B is better
H0: We have to choose the better one of the two.
H1: There is a difference θ between x and y
A > B: A is better
[Diagram: x and y are combined as [x, y]; each is generated from its own likelihood and prior]
Slide 83
Slide 83 text
x or y
HA: A is better
HB: B is better
H0: We have to choose the better one of the two.
H1: There is a difference θ between x and y
A > B: A is better
[Diagram: x and y are combined as [x, y]; each is generated from its own parameters]
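One way to make the difference θ between the two options concrete is to draw posterior samples for each option and subtract them. The sketch below is not the model from the slides: it swaps in a simple Beta-Binomial model with flat Beta(1, 1) priors, and the counts are hypothetical numbers invented purely for illustration.

```r
set.seed(42)

# Hypothetical data: successes out of trials for options x and y
n_x <- 1000; success_x <- 120
n_y <- 1000; success_y <- 100

# With a Beta(1, 1) prior and a binomial likelihood (both assumptions),
# the posterior of each success rate is again a Beta distribution
n_draws <- 10000
rate_x <- rbeta(n_draws, 1 + success_x, 1 + n_x - success_x)
rate_y <- rbeta(n_draws, 1 + success_y, 1 + n_y - success_y)

# Posterior samples of the difference between x and y
theta <- rate_x - rate_y

# Posterior probability that x is better, and a 95% interval for theta
prob_x_better <- mean(theta > 0)
interval <- quantile(theta, c(0.025, 0.975))
print(prob_x_better)
print(interval)
```

Instead of a yes/no answer to "A is better", this gives a whole distribution for θ, so the size of the difference and the uncertainty around it can both feed into the decision.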
Slide 84
Slide 84 text
×
Slide 85
Slide 85 text
Summary, again…
Slide 86
Slide 86 text
What is modeling?
[Figure: a model f(X) relating Truth and Knowledge; modeling in the narrow sense vs. the broad sense]
Slide 87
Slide 87 text
What is modeling?
[Figure: a model f(X) for each approach: Hypothesis Driven and Data Driven]
Slide 88
Slide 88 text
[Diagram: statistical modeling: sample space (can NEVER get), realization, probability distribution]
outcome function with parameter α:
library(purrr)
g <- function(alpha = 6, n_rep = 1000) {
  # n_rep stands in for the slide's idealized 1:∞ repetitions
  map(1:n_rep, ~ sample(1:alpha, size = 10, replace = TRUE))
}
rolls <- g()
X <- density(unlist(rolls))
statistical modeling: x ~ P(x | α)
Slide 89
Slide 89 text
Bayesian Modeling
[Diagram: both the data and the parameter follow probability distributions, and the model describes them jointly]
Slide 90
Slide 90 text
v.s.
me
“MUST be wholly
REJECTED!!!”
“p-value
**cking!!!”
Frequentist Bayesian
Old Stereotype
Slide 91
Slide 91 text
[Figure: a model f(X) for each approach: Hypothesis Driven and Data Driven]
Slide 92
Slide 92 text
“Life shrinks or expands
to one’s courage.”
-- Anaïs Nin, 2000
http://theamericanreader.com