1.1k

# Statistical Rethinking 2023 - Lecture 09 January 30, 2023

## Transcript

1. Statistical Rethinking
9. Modeling Events
2023

2. Lecture 8

3. FLOW

4. Flow forward
Everything comes all at once:
Scienti c modeling, research design,
statistical modeling, coding,
interpretation, communication, the
price of eggs
Don’t have to understand it all at once
Nobody ever does

applications for 1973 UC
Berkeley
Strati ed by department and
gender of applicant
Gender discrimination by
1 A 512 313 825 male
2 A 89 19 108 female
3 B 353 207 560 male
4 B 17 8 25 female
5 C 120 205 325 male
6 C 202 391 593 female
7 D 138 279 417 male
8 D 131 244 375 female
9 E 53 138 191 male
10 E 94 299 393 female
11 F 22 351 373 male
12 F 24 317 341 female

6. https://arxiv.org/abs/2207.13665

7. Modeling events
Events: Discrete, unordered outcomes
Observations are counts
Unknowns are probabilities, odds
Everything interacts always
everywhere
A beast known as “log-odds”

(1) Estimand(s)
(2) Scienti c model(s)
(3) Statistical model(s)
(4) Analyze

G A
Was there gender discrimination in

G
D
A
gender
department

11. G
D
A
What can the “causal e ect of gender” mean?
R
Referee
G*
Perceived
gender

12. G
D
A
What can the “causal e ect of gender” mean?
R
Referee
G*
Perceived
gender
blinding

13. G
D
A
What can the “causal e ect of gender” mean?
R
Referee
G*
Perceived
gender

14. Context & discrimination
X
Z
Y
status
context
event

15. Wage discrimination
X
Z
Y
status
occupation
wage

16. G
D
A
gender
department
Which path is “discrimination”?

17. G
D
A
gender
department
Which path is “discrimination”?
Direct discrimination
(status-based or taste-based discrimination)

18. G
D
A
gender
department
Which path is “discrimination”?
Indirect discrimination
(structural discrimination)

19. G
D
A
gender
department
Which path is “discrimination”?
Total discrimination
(experienced discrimination)

20. G
D
A
Which path is “discrimination”?
G
D
A G
D
A
Total Indirect Direct
Requires mild
assumptions
Requires strong
assumptions
Requires strong
assumptions

21. G
D
A
Confounds!
U person feature
Will ignore for now, but confounds will return

(1) Estimand(s)
(2) Scienti c model(s)
(3) Statistical model(s)
(4) Analyze

23. Generative model
How can choice of
department create
structural discrimination?
When departments vary in
# generative model, basic mediator scenario
N <- 1000 # number of applicants
# even gender distribution
G <- sample( 1:2 , size=N , replace=TRUE )
# gender 1 tends to apply to department 1, 2 to 2
D <- rbern( N , ifelse( G==1 , 0.3 , 0.8 ) ) + 1
# matrix of acceptance rates [dept,gender]
accept_rate <- matrix( c(0.1,0.3,0.1,0.3) , nrow=2 )
# simulate acceptance
A <- rbern( N , accept_rate[D,G] )

24. Generative model
# generative model, basic mediator scenario
N <- 1000 # number of applicants
# even gender distribution
G <- sample( 1:2 , size=N , replace=TRUE )
# gender 1 tends to apply to department 1, 2 to 2
D <- rbern( N , ifelse( G==1 , 0.3 , 0.8 ) ) + 1
# matrix of acceptance rates [dept,gender]
accept_rate <- matrix( c(0.1,0.3,0.1,0.3) , nrow=2 )
# simulate acceptance
A <- rbern( N , accept_rate[D,G] )
G
D
A

25. Generative model
# generative model, basic mediator scenario
N <- 1000 # number of applicants
# even gender distribution
G <- sample( 1:2 , size=N , replace=TRUE )
# gender 1 tends to apply to department 1, 2 to 2
D <- rbern( N , ifelse( G==1 , 0.3 , 0.8 ) ) + 1
# matrix of acceptance rates [dept,gender]
accept_rate <- matrix( c(0.1,0.3,0.1,0.3) , nrow=2 )
# simulate acceptance
A <- rbern( N , accept_rate[D,G] )
G
D
A

26. Generative model
# generative model, basic mediator scenario
N <- 1000 # number of applicants
# even gender distribution
G <- sample( 1:2 , size=N , replace=TRUE )
# gender 1 tends to apply to department 1, 2 to 2
D <- rbern( N , ifelse( G==1 , 0.3 , 0.8 ) ) + 1
# matrix of acceptance rates [dept,gender]
accept_rate <- matrix( c(0.1,0.3,0.1,0.3) , nrow=2 )
# simulate acceptance
A <- rbern( N , accept_rate[D,G] )
G
D
A

27. Generative model
# generative model, basic mediator scenario
N <- 1000 # number of applicants
# even gender distribution
G <- sample( 1:2 , size=N , replace=TRUE )
# gender 1 tends to apply to department 1, 2 to 2
D <- rbern( N , ifelse( G==1 , 0.3 , 0.8 ) ) + 1
# matrix of acceptance rates [dept,gender]
accept_rate <- matrix( c(0.1,0.3,0.1,0.3) , nrow=2 )
# simulate acceptance
A <- rbern( N , accept_rate[D,G] )
> table(G,A)
A
G 0 1
1 421 101
2 350 128
Accept rates
Gender 1: 19%
Gender 2: 27%
> table(G,D)
D
G 1 2
1 361 161
2 99 379

28. Generative model
# generative model, basic mediator scenario
N <- 1000 # number of applicants
# even gender distribution
G <- sample( 1:2 , size=N , replace=TRUE )
# gender 1 tends to apply to department 1, 2 to 2
D <- rbern( N , ifelse( G==1 , 0.3 , 0.8 ) ) + 1
# matrix of acceptance rates [dept,gender]
accept_rate <- matrix( c(0.05,0.2,0.1,0.3) , nrow=2 )
# simulate acceptance
A <- rbern( N , accept_rate[D,G] )
> table(G,A)
A
G 0 1
1 473 46
2 404 77
Accept rates
Gender 1: 9%
Gender 2: 16%
> table(G,D)
D
G 1 2
1 355 164
2 95 386
> accept_rate
[,1] [,2]
[1,] 0.05 0.1
[2,] 0.20 0.3
same pattern as absence of
discrimination

29. Generative model
Could do a lot better
of applicant pool, distribution of
quali cations
Should sample applicant pool and
Rates are conditional on structure of
applicant pool
> table(G,A)
A
G 0 1
1 473 46
2 404 77
Accept rates
Gender 1: 9%
Gender 2: 16%
> table(G,D)
D
G 1 2
1 355 164
2 95 386

30. PAUSE

(1) Estimand(s)
(2) Scienti c model(s)
(3) Statistical model(s)
(4) Analyze

32. Modeling events
We observe: Counts of events
We estimate:
Probability (or log-odds) of events
Like the globe tossing model, but
need “proportion of water”
strati ed by other variables

33. Generalized Linear Models
Linear Models: Expected value is
parameters
Generalized Linear Models:
Expected value is some function of
parameters
Y
i
∼ Normal(μ
i
, σ)
μ
i
= α + β
X
X
i
+ β
Z
Z
i

34. -3 -2 -1 0 1 2 3
-0.5 0.0 0.5 1.0 1.5
x
a+b*x
-3 -2 -1 0 1 2 3
-0.5 0.0 0.5 1.0 1.5
x
f^-1(a+b*x)

35. Generalized Linear Models
Linear Models: Expected value is
parameters
Generalized Linear Models:
Expected value is some function of
parameters
Y
i
∼ Normal(μ
i
, σ)
μ
i
= α + β
X
X
i
+ β
Z
Z
i
Y
i
∼ Bernoulli(p
i
)
f(p
i
) = α + β
X
X
i
+ β
Z
Z
i
f(p
i
)

36. Generalized Linear Models
Y
i
∼ Bernoulli(p
i
)
f(p
i
) = α + β
X
X
i
+ β
Z
Z
i
f(p
i
)
0/1 outcome probability of event
can take any real value
f() maps probability scale
to linear model scale

Y
i
∼ Bernoulli(p
i
)
f(p
i
) = α + β
X
X
i
+ β
Z
Z
i
f(p
i
)
linear model
f
f−1 is the inverse link function p
i
= f−1(α + β
X
X
i
+ β
Z
Z
i
)
f−1

38. Example inverse function
f(a) = a2

39. Example inverse function
f(a) = a2
f−1(a2) = a2 = a

Distributions: Relative number of ways to observe data, given
assumptions about rates, probabilities, slopes, etc.
Distributions are matched to constraints on observed variables
Link functions are matched to distributions

41. sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
Figure 9.5

42. sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
Figure 9.5

43. sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
Figure 9.5

44. sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
Figure 9.5

45. sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
Figure 9.5

46. sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
Figure 9.5

47. sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
Figure 9.5
Distributional assumptions
constraints on observations
You cannot “test” if your
data are “normal”

Distributions: Relative number
of ways to observe data, given
assumptions
Distributions are matched to
constraints on observed
variables
distributions
(&/&3"-*;&% -*/&"3 .0%&-4
sum
large mean
count events
low rate
count
events
low probability
many trials
dnorm
dgamma
dpois dbinom
dexp
Z ∼ /PSNBM(µ, σ)
Z ∼ #JOPNJBM(O, Q)
Z ∼ 1PJTTPO(λ)
Z ∼ (BNNB(λ, L)
Z ∼ &YQPOFOUJBM(λ)
'ĶĴłĿĲ Ƒƍ 4PNF PG UIF FYQPOFOUJBM GBNJMZ EJTUSJCVUJPOT UIFJS OPUBUJPO
BOE TPNF PG UIFJS SFMBUJPOTIJQT \$FOUFS FYQPOFOUJBM EJTUSJCVUJPO \$MPDL

49. Overthinking: Binomial maximum entropy. The usual way to derive a maximum entropy distribu-
tion is to state the constraints and then use a mathematical device called the Lagrangian to solve for
the probability assignments that maximize entropy. But instead we’ll extend the strategy used in the
Overthinking box on page 306. As a bonus, this strategy will allow us to derive the constraints that
are necessary for a distribution, in this case the binomial, to be a maximum entropy distribution.
Let p be the binomial distribution, and let pi
be the probability of a sequence of observations i
with number of successes xi
and number of failures n − xi
. Let q be some other discrete distribution
defined over the same set of observable sequences. As before, KL divergence tells us that:
−H(q, p) ≥ H(q) =⇒ −
i
qi
log pi ≥ −
i
qi
log qi
What we’re going to do now is work with H(q, p) and simplify it until we can isolate the constraint
that defines the class of distributions for which p has maximum entropy. Let λ = i
pi
xi
be the
expected value of p. Then from the definition of H(q, p):
−H(q, p) = −
i
qi
log λ
n
xi
1 −
λ
n
n−xi
= −
i
qi
xi
log λ
n + (n − xi) log 1 −
λ
n
After some algebra:
−H(q, p) = −
i
qi
xi
log λ
n − λ
+ n log
n − λ
n = −n log
n − λ
n − log λ
n − λ
i
qi
xi
¯
q
The term on the far right labeled ¯
q is the expected value of the distribution q. If we knew it, we could
complete the calculation, because no other term depends upon qi
. This means that expected value is
the constraint that defines the class of distributions for which the binomial p has maximum entropy.
If we now set the expected value of q equal to λ, then H(q) = H(p). For any other expected value of
q, H(p) > H(q).
Finally, notice the term log[λ/(n − λ)]. This term is the log of the ratio of the expected number
of successes to the expected number of failures. That ratio is the “odds” of a success, and its logarithm
is called “log odds.” This quantity will feature prominently in models we construct from the binomial
distribution, in Chapter 11.
Page 312

50. work with H(q, p) and simplify it until we can isolate the constraint
butions for which p has maximum entropy. Let λ = i
pi
xi
be the
m the definition of H(q, p):
xi
1 −
λ
n
n−xi
= −
i
qi
xi
log λ
n + (n − xi) log 1 −
λ
n
λ
n − λ
+ n log
n − λ
n = −n log
n − λ
n − log λ
n − λ
i
qi
xi
¯
q
d ¯
q is the expected value of the distribution q. If we knew it, we could
use no other term depends upon qi
. This means that expected value is
class of distributions for which the binomial p has maximum entropy.

Y
i
∼ Bernoulli(p
i
)
logit(p
i
) = α + β
X
X
i
+ β
Z
Z
i
logit(p
i
)
logit(p
i
) = log
p
i
1 − pi
p
i
= logit−1(α + β
X
X
i
+ β
Z
Z
i
)
logit−1
Bernoulli/Binomial models
most o en use logit link

logit(p
i
) = log
p
i
1 − pi
Bernoulli/Binomial models
most o en use logit link
“log odds” odds
Y
i
∼ Bernoulli(p
i
)
logit(p
i
) = α + β
X
X
i
+ β
Z
Z
i
logit(p
i
)
p
i
= logit−1(α + β
X
X
i
+ β
Z
Z
i
)
logit−1
“logistic”

logit(p
i
) = log
p
i
1 − pi
= q
i
logit−1(q
i
) = ?

logit(p
i
) = log
p
i
1 − pi
= q
i
logit−1(q
i
) = ? = p
i

logit(p
i
) = log
p
i
1 − pi
= q
i
logit−1(q
i
) = ? = p
i
log
p
i
1 − pi
= q
i
p
i
=
exp(q
i
)
1 + exp(qi
)
logit−1

56. logit link is a harsh transform
“log-odds scale”: e value of
the linear model
logit(p)=0, p=0.5
logit(p)=4, p=nearly always
logit(p)=–4, p=hardly ever
-6 -4 -2 0 2 4 6
logit(p)
p
0 0.5 1

57. logit(p
i
) = α + βx
i

58. logit(p
i
) = α + βx
i

59. Logistic priors
distributions
Anything above +4 = almost always
Anything below –4 = almost never
logit(p
i
) = α
-30 -20 -10 0 10 20 30
0.00 0.02 0.04
alpha
Density
-30 -20 -10 0 10 20 30
0.00 0.10 0.20
alpha
Density
-30 -20 -10 0 10 20 30
0.0 0.2 0.4
alpha
Density
α ∼ Normal(0,10)
α ∼ Normal(0,1.5)
α ∼ Normal(0,1)
α
p
i

60. 0.0 0.2 0.4 0.6 0.8 1.0
0.0 0.4 0.8
probability of event
Density
0.0 0.2 0.4 0.6 0.8 1.0
0 1 2 3 4
probability of event
Density
0.0 0.2 0.4 0.6 0.8 1.0
0.0 0.5 1.0 1.5
probability of event
Density
α ∼ Normal(0,10)
α ∼ Normal(0,1.5)
α ∼ Normal(0,1)
-30 -20 -10 0 10 20 30
0.00 0.02 0.04
alpha
Density
-30 -20 -10 0 10 20 30
0.00 0.10 0.20
alpha
Density
-30 -20 -10 0 10 20 30
0.0 0.2 0.4
alpha
Density
logit(p
i
) = α
α
p
i

61. logit(p
i
) = α + βx
i
β ∼ Normal(0,10)
α ∼ Normal(0,10)
-2 -1 0 1 2
0.0 0.4 0.8
x value
probability
a <- rnorm(1e4,0,10)
b <- rnorm(1e4,0,10)
xseq <- seq(from=-3,to=3,len=100)
p <- sapply( xseq , function(x)
inv_logit(a+b*x) )
plot( NULL , xlim=c(-2.5,2.5) , ylim=c(0,1) ,
xlab="x value" , ylab="probability" )
for ( i in 1:10 ) lines( xseq , p[i,] , lwd=3 ,
col=2 )

62. β ∼ Normal(0,10)
α ∼ Normal(0,10)
-2 -1 0 1 2
0.0 0.4 0.8
x value
probability
-2 -1 0 1 2
0.0 0.4 0.8
x value
probability
β ∼ Normal(0,0.5)
α ∼ Normal(0,1.5)

63. A statistical model
# generative model, basic mediator scenario
N <- 1000 # number of applicants
# even gender distribution
G <- sample( 1:2 , size=N , replace=TRUE )
# gender 1 tends to apply to department 1, 2 to 2
D <- rbern( N , ifelse( G==1 , 0.3 , 0.8 ) ) + 1
# matrix of acceptance rates [dept,gender]
accept_rate <- matrix( c(0.05,0.2,0.1,0.3) , nrow=2 )
# simulate acceptance
A <- rbern( N , accept_rate[D,G] )
G
D
A

64. Estimand: Total e ect of G
A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
]
G
D
A
α = [α
1
, α
2]
Genders
Pr(A
i
= 1) = p
i
p
i
=
exp(α[G
i
])
1 + exp(α[Gi
])

65. Estimand: Direct e ect of G
A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
, D
i
]
G
D
A
α =
[
α
1,1
α
1,2
α
2,1
α
2,2]
Departments
Genders

66. A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
]
A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
, D
i
]
α
j
∼ Normal(0,1) α
j,k
∼ Normal(0,1)
Total e ect Direct e ect(s)

67. A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
]
A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
, D
i
]
α
j
∼ Normal(0,1) α
j,k
∼ Normal(0,1)
Total e ect Direct e ect(s)
dat_sim <- list( A=A , D=D , G=G )
m1 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
dat_sim <- list( A=A , D=D , G=G )
m2 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )

68. Total e ect Direct e ect(s)
dat_sim <- list( A=A , D=D , G=G )
m1 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
dat_sim <- list( A=A , D=D , G=G )
m2 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
precis(m1,depth=2)
mean sd 5.5% 94.5% n_eff Rhat4
a -1.80 0.13 -2.01 -1.60 1549 1
a -1.09 0.10 -1.25 -0.93 1159 1

69. Total e ect Direct e ect(s)
dat_sim <- list( A=A , D=D , G=G )
m1 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
dat_sim <- list( A=A , D=D , G=G )
m2 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
precis(m1,depth=2) precis(m2,depth=3)
mean sd 5.5% 94.5% n_eff Rhat4
a -1.80 0.13 -2.01 -1.60 1549 1
a -1.09 0.10 -1.25 -0.93 1159 1
mean sd 5.5% 94.5% n_eff Rhat4
a[1,1] -2.31 0.18 -2.60 -2.04 2529 1
a[1,2] -0.92 0.19 -1.23 -0.62 2216 1
a[2,1] -1.93 0.31 -2.45 -1.44 2214 1
a[2,2] -0.93 0.11 -1.11 -0.75 2055 1

70. Total e ect Direct e ect(s)
dat_sim <- list( A=A , D=D , G=G )
m1 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
dat_sim <- list( A=A , D=D , G=G )
m2 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
precis(m1,depth=2)
precis(m2,depth=3)
mean sd 5.5% 94.5% n_eff Rhat4
a -1.80 0.13 -2.01 -1.60 1549 1
a -1.09 0.10 -1.25 -0.93 1159 1
mean sd 5.5% 94.5% n_eff Rhat4
a[1,1] -2.31 0.18 -2.60 -2.04 2529 1
a[1,2] -0.92 0.19 -1.23 -0.62 2216 1
a[2,1] -1.93 0.31 -2.45 -1.44 2214 1
a[2,2] -0.93 0.11 -1.11 -0.75 2055 1
> inv_logit(coef(m2))
a[1,1] a[1,2] a[2,1] a[2,2]
0.06296434 0.21109945 0.08253890 0.20003819

71. PAUSE

(1) Estimand(s)
(2) Scienti c model(s)
(3) Statistical model(s)
(4) Analyze

applications for 1973 UC
Berkeley
Strati ed by department and
gender of applicant
Evidence of gender
discrimination?
1 A 512 313 825 male
2 A 89 19 108 female
3 B 353 207 560 male
4 B 17 8 25 female
5 C 120 205 325 male
6 C 202 391 593 female
7 D 138 279 417 male
8 D 131 244 375 female
9 E 53 138 191 male
10 E 94 299 393 female
11 F 22 351 373 male
12 F 24 317 341 female

74. Logistic & Binomial Regression
Logistic regression:
Binary [0,1] outcome and logit link
Binomial regression:
Count [0,N] outcome and logit link
A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
, D
i
]
A
i
∼ Binomial(N
i
, p
i
)
logit(p
i
) = α[G
i
, D
i
]
Completely equivalent for inference

75. A G D
1 0 2 1
2 0 1 1
3 1 2 2
4 1 2 2
5 0 2 2
6 0 1 1
7 0 2 2
8 0 2 2
9 0 2 2
10 0 2 2
11 0 2 2
12 1 2 2
13 0 2 2
14 0 1 1
15 0 2 2
16 0 1 2
17 0 1 1
18 0 1 1
19 0 1 1
20 0 1 1
G D A N
1 1 1 30 355
2 2 1 10 92
3 1 2 38 135
4 2 2 117 418
dat_sim2 <- aggregate( A ~ G + D , dat_sim , sum )
dat_sim2\$N <- aggregate( A ~ G + D , dat_sim , length )\$A
LOOOOOOOONG
Aggregated

76. Logistic & Binomial Regression
A
i
∼ Bernoulli(p
i
)
logit(p
i
) = α[G
i
, D
i
]
A
i
∼ Binomial(N
i
, p
i
)
logit(p
i
) = α[G
i
, D
i
]
Completely equivalent for inference
m2 <- ulam(
alist(
A ~ bernoulli(p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
m2_bin <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim2 , chains=4 , cores=4 )

77. Logistic & Binomial Regression
logit(p
i
) = α[G
i
, D
i
]
A
i
∼ Binomial(N
i
, p
i
)
logit(p
i
) = α[G
i
, D
i
]
Completely equivalent for inference
m2 <- ulam(
alist(
A ~ binomial(1,p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim , chains=4 , cores=4 )
m2_bin <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat_sim2 , chains=4 , cores=4 )
A
i
∼ Binomial(1,p
i
)

dat <- list(
N = d\$applications,
G = ifelse(d\$applicant.gender=="female",1,2),
D = as.integer(d\$dept)
)
# total effect gender
mG <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat , chains=4 , cores=4 )
Total e ect

dat <- list(
N = d\$applications,
G = ifelse(d\$applicant.gender=="female",1,2),
D = as.integer(d\$dept)
)
# total effect gender
mG <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat , chains=4 , cores=4 )
# direct effects
mGD <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat , chains=4 , cores=4 )
Total e ect Direct e ect(s)

80. 0 200 400 600 800 1000
0.0 1.0 2.0 3.0
n_eff = 2548
a[1,1]
0 200 400 600 800 1000
0.2 0.6 1.0
n_eff = 2669
a[2,1]
0 200 400 600 800 1000
-0.5 0.5 1.5
n_eff = 2859
a[1,2]
0 200 400 600 800 1000
0.2 0.5 0.8
n_eff = 2590
a[2,2]
0 200 400 600 800 1000
-1.0 -0.7 -0.4
n_eff = 2612
a[1,3]
0 200 400 600 800 1000
-1.0 0.0
n_eff = 2617
a[2,3]
0 200 400 600 800 1000
-1.2 -0.8 -0.4
n_eff = 2598
a[1,4]
0 200 400 600 800 1000
-1.2 -0.8 -0.4
n_eff = 2433
a[2,4]
0 200 400 600 800 1000
-1.6 -1.2 -0.8
n_eff = 2520
a[1,5]
0 200 400 600 800 1000
-1.5 -0.5
n_eff = 3003
a[2,5]
0 200 400 600 800 1000
-3.0 -2.0 -1.0
n_eff = 2346
a[1,6]
0 200 400 600 800 1000
-3.5 -2.5 -1.5
n_eff = 2384
a[2,6]

81. n_eff = 2548
a[1,1] n_eff = 2669
a[2,1] n_eff = 2859
a[1,2]
n_eff = 2590
a[2,2] n_eff = 2612
a[1,3] n_eff = 2617
a[2,3]
n_eff = 2598
a[1,4] n_eff = 2433
a[2,4] n_eff = 2520
a[1,5]
n_eff = 3003
a[2,5] n_eff = 2346
a[1,6] n_eff = 2384
a[2,6]

82. # total effect gender
mG <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat , chains=4 , cores=4 )
# direct effects
mGD <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat , chains=4 , cores=4 )
mean sd 5.5% 94.5% n_eff Rhat4
a -0.83 0.05 -0.91 -0.75 1487 1
a -0.22 0.04 -0.28 -0.16 1499 1
mean sd 5.5% 94.5% n_eff Rhat4
a[1,1] 1.48 0.24 1.11 1.87 2548 1
a[1,2] 0.66 0.40 0.03 1.32 2859 1
a[1,3] -0.65 0.08 -0.79 -0.52 2612 1
a[1,4] -0.62 0.11 -0.79 -0.45 2598 1
a[1,5] -1.15 0.12 -1.34 -0.96 2520 1
a[1,6] -2.50 0.20 -2.81 -2.18 2346 1
a[2,1] 0.49 0.07 0.38 0.60 2669 1
a[2,2] 0.53 0.08 0.40 0.67 2590 1
a[2,3] -0.53 0.11 -0.72 -0.35 2617 1
a[2,4] -0.70 0.10 -0.87 -0.54 2433 1
a[2,5] -0.94 0.16 -1.20 -0.69 3003 1
a[2,6] -2.67 0.21 -3.00 -2.34 2384 1

83. # total effect gender
mG <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G],
a[G] ~ normal(0,1)
), data=dat , chains=4 , cores=4 )
# direct effects
mGD <- ulam(
alist(
A ~ binomial(N,p),
logit(p) <- a[G,D],
matrix[G,D]:a ~ normal(0,1)
), data=dat , chains=4 , cores=4 )
mean sd 5.5% 94.5% n_eff Rhat4
a -0.83 0.05 -0.91 -0.75 1487 1
a -0.22 0.04 -0.28 -0.16 1499 1
mean sd 5.5% 94.5% n_eff Rhat4
a[1,1] 1.48 0.24 1.11 1.87 2548 1
a[1,2] 0.66 0.40 0.03 1.32 2859 1
a[1,3] -0.65 0.08 -0.79 -0.52 2612 1
a[1,4] -0.62 0.11 -0.79 -0.45 2598 1
a[1,5] -1.15 0.12 -1.34 -0.96 2520 1
a[1,6] -2.50 0.20 -2.81 -2.18 2346 1
a[2,1] 0.49 0.07 0.38 0.60 2669 1
a[2,2] 0.53 0.08 0.40 0.67 2590 1
a[2,3] -0.53 0.11 -0.72 -0.35 2617 1
a[2,4] -0.70 0.10 -0.87 -0.54 2433 1
a[2,5] -0.94 0.16 -1.20 -0.69 3003 1
a[2,6] -2.67 0.21 -3.00 -2.34 2384 1

84. Total e ect
post1 <- extract.samples(mG)
PrA_G1 <- inv_logit( post1\$a[,1] )
PrA_G2 <- inv_logit( post1\$a[,2] )
diff_prob <- PrA_G1 - PrA_G2
dens(diff_prob,lwd=4,col=2,xlab="Gender
contrast (probability)")
-0.18 -0.14 -0.10
0 10 25
Gender contrast (probability)
Density

85. Total e ect Direct e ect(s)
post1 <- extract.samples(mG)
PrA_G1 <- inv_logit( post1\$a[,1] )
PrA_G2 <- inv_logit( post1\$a[,2] )
diff_prob <- PrA_G1 - PrA_G2
dens(diff_prob,lwd=4,col=2,xlab="Gender
contrast (probability)")
post2 <- extract.samples(mGD)
PrA <- inv_logit( post2\$a )
diff_prob_D_ <- sapply( 1:6 , function(i)
PrA[,1,i] - PrA[,2,i] )
plot(NULL,xlim=c(-0.2,0.3),ylim=c(0,25),xlab="G
ender contrast (probability)",ylab="Density")
for ( i in 1:6 ) dens( diff_prob_D_[,i] , lwd=4
-0.18 -0.14 -0.10
0 10 25
Gender contrast (probability)
Density
-0.2 -0.1 0.0 0.1 0.2 0.3
0 5 15 25
Gender contrast (probability)
Density

86. What is the average direct e ect of gender across departments?
Depends upon distribution of applications, probability
woman/man applies to each department
G
D
A
What is the
intervention actually?

87. Depends upon distribution of applications, probability
woman/man applies to each department
G
D
A
gender
G*
perceived
gender
What is the average direct e ect of gender across departments?
What is the
intervention actually?

88. To calculate causal e ect of G*, must average
(marginalize) over departments
Easy to do it as a simulation
NB: We are still assuming no confounds!
What is the
intervention actually?
G
D
A
G*

89. # number of applicatons to simulate
total_apps <- sum(dat\$N)
# number of applications per department
apps_per_dept <- sapply( 1:6 , function(i)
sum(dat\$N[dat\$D==i]) )
# simulate as if all apps from women
D=rep(1:6,times=apps_per_dept),
N=rep(1,total_apps),
G=rep(1,total_apps)))
# simulate as if all apps from men
D=rep(1:6,times=apps_per_dept),
N=rep(1,total_apps),
G=rep(2,total_apps)))
# summarize
dens( p_G1 - p_G2 , lwd=4 , col=2 ,
xlab="effect of gender perception" )
-0.3 -0.2 -0.1 0.0 0.1 0.2 0.3
0 2 4 6 8
effect of gender perception
Density

90. # simulate as if all apps from women
D=rep(1:6,times=apps_per_dept),
N=rep(1,total_apps),
G=rep(1,total_apps)))
# simulate as if all apps from men
D=rep(1:6,times=apps_per_dept),
N=rep(1,total_apps),
G=rep(2,total_apps)))
# show each dept density with weight as in
population
w <- xtabs( dat\$N ~ dat\$D ) / sum(dat\$N)
plot(NULL,xlim=c(-0.2,0.3),ylim=c(0,25),xlab="
Gender contrast (probability)",ylab="Density")
for ( i in 1:6 ) dens( diff_prob_D_[,i] ,
lwd=2+20*w[i] , col=1+i , add=TRUE )
abline(v=0,lty=3)
-0.3 -0.2 -0.1 0.0 0.1 0.2 0.3
0 2 4 6 8
effect of gender perception
Density
-0.2 -0.1 0.0 0.1 0.2 0.3
0 5 10 15 20 25
Gender contrast (probability)
Density

91. Post-strati cation
Description, prediction & causal
inference o en require post-strati cation
Post-strati cation: Re-weighting
estimates for target population
At a di erent university, distribution of
applications will di er => predicted
consequence of intervention di erent
-0.2 -0.1 0.0 0.1 0.2 0.3
0 5 10 15 20 25
Gender contrast (probability)
Density

Evidence for discrimination?
Big structural e ects, but
(1) Distribution of applications can
be a consequence of discrimination
(data do not speak to this)
(2) Confounds likely
-0.3 -0.2 -0.1 0.0 0.1 0.2 0.3
0 2 4 6 8
effect of gender perception
Density

93. BONUS

94. Survival Analysis
Counts o en modeled as time-to-event
Tricky, because cannot ignore censored cases
Le -censored: Don’t know when time started
Right-censored: Observation ended before event
Ignoring censored cases leads to inferential error
Imagine estimating time-to-PhD: Time in
program before dropping out is info about rate

95. Survival Analysis
data(AustinCats)
20-thousand cats
Estimand: Adoption rates of black and non-black cats
Events: (1) adopted or (2) something else
Something else could be: death, escape, censored

96. Austin Cats
Outcome variable: days_to_event
What is a good distribution?
Basic choices: Exponential or gamma
Maximum entropy distributions for
waiting times
Generative story: X things required
before event
\$&/403*/( "/% 4637*7"-
3 DP

ʚǶ Ǐ
3 ʚǶ - +'\$/ ǿ ǎ ǒ Ǣ (\$)ǿ-0)\$!ǿǢǎǢǎǍǍȀȀ Ȁ
*G ZPV QMPU UIF EFOTJUZ PG UIF 3 WBMVFT ZPVMM TFF UIF CMBDL EFOTJUZ JO UIF MFę QMPU CFMPX "T UIF OVNCFS
PG QBSUT JODSFBTFT UIF EFOTJUZ DPOWFSHFT UP BO FYQPOFOUJBM :0V EPOU OFFE WFSZ NBOZ #Z / =
UIF CMVF EFOTJUZ CFMPX JU JT BMNPTU FYBDUMZ FYQPOFOUJBM
0 20 40 60 80 100
0.00 0.02 0.04
day
Density
2 parts
5 parts
Single part failure: Exponential
0 20 40 60 80 100
0.00 0.02 0.04
day
Density
2/10 parts
5/10 parts
Multiple part failure: Gamma
ćF QMPU PO UIF SJHIU BCPWF JT GPS UIF HBNNB EJTUSJCVUJPO XIJDI BSJTFT XIFO UIF NBDIJOF CSFBLT POMZ
UIF CMVF EFOTJUZ CFMPX JU JT BMNPTU FYBDUMZ FYQPOFOUJBM
0 20 40 60 80 100
0.00 0.02 0.04
day
Density
2 parts
5 parts
Single part failure: Exponential
0.00 0.02 0.04
Density
ćF QMPU PO UIF SJHIU BCPWF JT GPS UIF HBNNB EJTUSJCVUJPO X
BęFS . PG JUT QBSUT IBWF CSPLFO ćJT DPEF XJMM MFU ZPV TJN
ʚǶ ǎǍ
ʚǶ Ǐ
3 ʚǶ - +'\$/ ǿ ǎ ǒ Ǣ .*-/ǿ-0)\$!ǿǢǎǢǎǍǍȀȀȁȂ Ȁ
ćJT QSPEVDFT UIF CMBDL EFOTJUZ JO UIF SJHIU IBOE QMPU ć
UP CSFBL /BUVSBMMZ UIF FYQPOFOUJBM JT B KVTU B TQFDJBM DBT
UIJT JT ZFU BOPUIFS XBZ UP HFU BO BQQSPYJNBUF (BVTTJBO EJT
JU MPPLT NPSF (BVTTJBO ćF CMVF EFOTJUZ BCPWF JT BMSFBEZ
IBQQFO UP DBVTF TPNF FWFOU UIFO XBJUJOH UJNFT DBO FOE V

97. Un-censored observations
HFU BO BQQSPYJNBUF (BVTTJBO EJTUSJCVUJPO "T UIF HBNNBT NFBO JODSFBTFT
ćF CMVF EFOTJUZ BCPWF JT BMSFBEZ SBUIFS (BVTTJBO *G TFWFSBM UIJOHT OFFE UP
FOU UIFO XBJUJOH UJNFT DBO FOE VQ MPPLJOH WFSZ (BVTTJBO
Y
UJPOT QSPCBCJMJUZ PG PCTFSWFE XBJUJOH UJNF JT TJNQMZ
%J ∼ &YQPOFOUJBM(λJ)
Q(%J|λJ) = λJ
FYQ(−λJ
%J)
BUT UIBU BSF USJDLZ *G TPNFUIJOH FMTF IBQQFOFE CFGPSF B DBU DPVME CF
IBTOU CFFO BEPQUFE ZFU UIFO XF OFFE UIF QSPCBCJMJUZ PG OPU CFJOH
O UIF PCTFSWBUJPO UJNF TP GBS 0OF XBZ UP NPUJWBUF UIJT JT UP JNBHF
M KPJOJOH UIF TIFMUFS PO UIF TBNF EBZ *G IBMG IBWF CFFO BEPQUFE BęFS
BCJMJUZ PG XBJUJOH EBZT BOE TUJMM OPU CFJOH BEPQUFE JT *G BęFS
0 1 2 3 4 5
0.0 0.2 0.4 0.6 0.8 1.0
x
dexp(x)
probability event happens at time x
λ = 0.5
λ = 1.0
OFOUJBM JT B KVTU B TQFDJBM DBTF PG UIF HBNNB XIFO . = /PUF UIBU
BO BQQSPYJNBUF (BVTTJBO EJTUSJCVUJPO "T UIF HBNNBT NFBO JODSFBTFT
CMVF EFOTJUZ BCPWF JT BMSFBEZ SBUIFS (BVTTJBO *G TFWFSBM UIJOHT OFFE UP
UIFO XBJUJOH UJNFT DBO FOE VQ MPPLJOH WFSZ (BVTTJBO
OT QSPCBCJMJUZ PG PCTFSWFE XBJUJOH UJNF JT TJNQMZ
%J ∼ &YQPOFOUJBM(λJ)
Q(%J|λJ) = λJ
FYQ(−λJ
%J)
IBU BSF USJDLZ *G TPNFUIJOH FMTF IBQQFOFE CFGPSF B DBU DPVME CF
OU CFFO BEPQUFE ZFU UIFO XF OFFE UIF QSPCBCJMJUZ PG OPU CFJOH
IF PCTFSWBUJPO UJNF TP GBS 0OF XBZ UP NPUJWBUF UIJT JT UP JNBHF
OJOH UIF TIFMUFS PO UIF TBNF EBZ *G IBMG IBWF CFFO BEPQUFE BęFS
JUZ PG XBJUJOH EBZT BOE TUJMM OPU CFJOH BEPQUFE JT *G BęFS

98. Censored cats
0 1 2 3 4 5
0.0 0.2 0.4 0.6 0.8 1.0
x
pexp(x)
probability event before-or-at time x
0 1 2 3 4 5
0.0 0.2 0.4 0.6 0.8 1.0
x
exp(-x)
probability not-event before-or-at time x
λ = 0.5
λ = 1.0
CDF CCDF
HJWFT UIF QSPQPSUJPO PG DBUT BEPQUFE CFGPSF PS BU B DFSUBJO OVNCFS PG
IF DVNVMBUJWF EJTUSJCVUJPO HJWFT UIF QSPCBCJMJUZ B DBU JT OPU BEPQUFE
PG EBZT ćBU JT UIF QSPCBCJMJUZ UIBU XF OFFE ćJT EJTUSJCVUJPOPOF
F QSPCBCJMJUZ EJTUSJCVUJPOJT DBMMFE UIF İļĺĽĹĲĺĲĻŁĮĿņ İłĺłĹĮ
ĶŀŁĿĶįłŁĶļĻ *O UIF DBTF PG UIF FYQPOFOUJBM EJTUSJCVUJPO UIF DVNVMB
1S(%J|λJ) = − FYQ(−λJ
%J)
KVTU
1S(%J|λJ) = FYQ(−λJ
%J)
E JO PVS NPEFM TJODF JU JT QSPCBCJMJUZ PG XBJUJOH %J
EBZT XJUIPVU CFJOH
ɶ*0/Ǿ 1 )/ʙʙǫ*+/\$*)ǫ Ǣ ǎ Ǣ Ǎ Ȁ
ɶ*'*-ʙʙǫ'&ǫ Ǣ ǎ Ǣ Ǎ Ȁ
NJOVT UIF DVNVMBUJWF QSPCBCJMJUZ EJTUSJCVUJPOJT DBMMFE UIF İļĺĽĹĲĺĲĻŁĮĿņ İłĺ
ŁĶŃĲ ĽĿļįĮįĶĹĶŁņ ıĶŀŁĿĶįłŁĶļĻ *O UIF DBTF PG UIF FYQPOFOUJBM EJTUSJCVUJPO UIF DV
UJWF JT
1S(%J|λJ) = − FYQ(−λJ
%J)
4P UIF DPNQMFNFOU JT KVTU
1S(%J|λJ) = FYQ(−λJ
%J)
4P UIBUT XIBU XF OFFE JO PVS NPEFM TJODF JU JT QSPCBCJMJUZ PG XBJUJOH %J
EBZT XJUIPVU
BEPQUFE ZFU
3 DPEF
'\$--4ǿ- /#\$)&\$)"Ȁ
/ǿ0./\$)/.Ȁ
ʚǶ 0./\$)/.
ɶ*+/ ʚǶ \$! '. ǿ ɶ*0/Ǿ 1 )/ʙʙǫ*+/\$*)ǫ Ǣ ǎ Ǣ Ǎ Ȁ
ɶ'& ʚǶ \$! '. ǿ ɶ*'*-ʙʙǫ'&ǫ Ǣ ǎ Ǣ Ǎ Ȁ
/ ʚǶ '\$./ǿ
4.Ǿ/*Ǿ 1 )/ ʙ .ǡ)0( -\$ǿ ɶ4.Ǿ/*Ǿ 1 )/ ȀǢ
'& ʙ .ǡ\$)/ " -ǿ ɶ'& Ȁ Ǣ
*+/ ʙ .ǡ\$)/ " -ǿ ɶ*+/ Ȁ
Cumulative distribution
(CDF)
Complementary cumulative
distribution (CCDF)
Event happened Event didn’t happen yet

99. 4P UIF DPNQMFNFOU JT KVTU
1S(%J|λJ) = FYQ(−λJ
%J)
4P UIBUT XIBU XF OFFE JO PVS NPEFM TJODF JU JT UIF QSPCBCJMJUZ PG XBJUJOH %J
EBZT XJUIPVU
CFJOH BEPQUFE ZFU
ćF NPEFM
%J|"J = ∼ &YQPOFOUJBM(λJ)
%J|"J = ∼ &YQPOFOUJBM\$\$%'(λJ)
λJ = /µJ
MPH µJ = αİĶı[J]
3 DPEF
'\$--4ǿ- /#\$)&\$)"Ȁ
/ǿ0./\$)/.Ȁ
ʚǶ 0./\$)/.
ɶ*+/ ʚǶ \$! '. ǿ ɶ*0/Ǿ 1 )/ʙʙǫ*+/\$*)ǫ Ǣ ǎ Ǣ Ǎ Ȁ
/ ʚǶ '\$./ǿ

100. 4P UIF DPNQMFNFOU JT KVTU
1S(%J|λJ) = FYQ(−λJ
%J)
4P UIBUT XIBU XF OFFE JO PVS NPEFM TJODF JU JT UIF QSPCBCJMJUZ PG XBJUJOH %J
EBZT XJUIPVU
CFJOH BEPQUFE ZFU
ćF NPEFM
%J|"J = ∼ &YQPOFOUJBM(λJ)
%J|"J = ∼ &YQPOFOUJBM\$\$%'(λJ)
λJ = /µJ
MPH µJ = αİĶı[J]
3 DPEF
'\$--4ǿ- /#\$)&\$)"Ȁ
/ǿ0./\$)/.Ȁ
ʚǶ 0./\$)/.
ɶ*+/ ʚǶ \$! '. ǿ ɶ*0/Ǿ 1 )/ʙʙǫ*+/\$*)ǫ Ǣ ǎ Ǣ Ǎ Ȁ
/ ʚǶ '\$./ǿ
log average time

101. %J|"J = ∼ &YQPOFOUJBM(λJ)
%J|"J = ∼ &YQPOFOUJBM\$\$%'(λJ)
λJ = /µJ
MPH µJ = αİĶı[J]
3 DPEF
'\$--4ǿ- /#\$)&\$)"Ȁ
/ǿ0./\$)/.Ȁ
ʚǶ 0./\$)/.
ɶ*+/ ʚǶ \$! '. ǿ ɶ*0/Ǿ 1 )/ʙʙǫ*+/\$*)ǫ Ǣ ǎ Ǣ Ǎ Ȁ
/ ʚǶ '\$./ǿ
4.Ǿ/*Ǿ 1 )/ ʙ .ǡ)0( -\$ǿ ɶ4.Ǿ/*Ǿ 1 )/ ȀǢ
*'*-Ǿ\$ ʙ \$! '. ǿ ɶ*'*-ʙʙǫ'&ǫ Ǣ ǎ Ǣ Ǐ Ȁ Ǣ
*+/ ʙ ɶ*+/
Ȁ
(ǎǎǡǎǑ ʚǶ 0'(ǿ
'\$./ǿ
library(rethinking)
data(AustinCats)
d <- AustinCats
dat <- list(
days = d\$days_to_event,
color_id = ifelse( d\$color=="Black" , 1 , 2 ) )
meow <- ulam(
alist(
lambda <- 1.0/mu,
log(mu) <- a[color_id],
a[color_id] ~ normal(0,1)
) , data=dat , chains=4 , cores=4 )

102. Black cats
Other cats
45 50 55 60 65
0.0 0.2 0.4 0.6
waiting time
Density

103. Black cats
Other cats
0 20 40 60 80 100
0.0 0.2 0.4 0.6 0.8 1.0