Slide 15
Positive Pointwise Mutual Information (PPMI) (Bullinaria+ 2007)
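$$
\mathrm{PPMI}(w_i, c_j) = \max\left(0,\ \log \frac{P(w_i, c_j)}{P(w_i)\,P(c_j)}\right)
= \max\bigl(0,\ \log \#(w_i, c_j) + \log \#(*,*) - \log \#(*, c_j) - \log \#(w_i, *)\bigr)
$$

$$
P(w_i, c_j) = \frac{\#(w_i, c_j)}{\#(*,*)}, \qquad
P(w_i) = \frac{\#(w_i, *)}{\#(*,*)}, \qquad
P(c_j) = \frac{\#(*, c_j)}{\#(*,*)}
$$

$$
\#(w_i, *) = \sum_{j} \#(w_i, c_j), \qquad
\#(*, c_j) = \sum_{i} \#(w_i, c_j), \qquad
\#(*,*) = \sum_{i,j} \#(w_i, c_j)
$$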
Discount frequent words and frequent context words
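The PPMI definition can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by Bullinaria and Levy: the function name `ppmi_matrix`, the nested-list layout (`counts[i][j]` is the co-occurrence count of word i with context j), the natural logarithm, and the toy counts are all assumptions made for the example.

```python
import math

def ppmi_matrix(counts):
    """Return the PPMI matrix for a word-by-context count matrix.

    counts[i][j] is the co-occurrence count #(w_i, c_j).
    Natural log is an assumption; the slide does not state the base.
    """
    row_totals = [sum(row) for row in counts]        # #(w_i, *)
    col_totals = [sum(col) for col in zip(*counts)]  # #(*, c_j)
    total = sum(row_totals)                          # #(*, *)

    result = []
    for i, row in enumerate(counts):
        out_row = []
        for j, n in enumerate(row):
            if n == 0:
                # log 0 is -infinity, so max(0, .) clips it to 0
                out_row.append(0.0)
                continue
            pmi = (math.log(n) + math.log(total)
                   - math.log(col_totals[j]) - math.log(row_totals[i]))
            out_row.append(max(0.0, pmi))
        result.append(out_row)
    return result

# Hypothetical toy counts: 2 words x 2 contexts (not from the slide).
toy_counts = [[8, 2],
              [2, 8]]
toy_ppmi = ppmi_matrix(toy_counts)
```

In the toy matrix, the pairs that co-occur more often than chance get a positive score, while the under-represented pairs have negative PMI and are clipped to 0.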
PPMI matrix (rows: word, columns: context)

        have   new    drink  bottle ride   speed  read
beer    0      0      2.04   1.97   0      0      0
wine    0      0      1.78   1.87   0      0      0
car     0.09   0.49   0      0      0.13   0.55   0
train   0.03   0.02   0      0      1.43   1.16   0
book    0.09   0      0      0      0      0      0.85
cos(beer, wine) = 0.99 > 0.941
cos(beer, train) = 0.00 < 0.387
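The two cosine values can be checked with a short Python sketch. The vectors are an assumption: they are the slide's PPMI values read as five word rows (beer, wine, car, train, book) laid out column by column, and the context order used here is also assumed (it does not affect the cosine). The helper name `cosine` is chosen for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# PPMI rows as read from the slide (assumed row/column layout).
# Assumed context order: have, new, drink, bottle, ride, speed, read
beer  = [0.0,  0.0,  2.04, 1.97, 0.0,  0.0,  0.0]
wine  = [0.0,  0.0,  1.78, 1.87, 0.0,  0.0,  0.0]
train = [0.03, 0.02, 0.0,  0.0,  1.43, 1.16, 0.0]

sim_beer_wine = cosine(beer, wine)
sim_beer_train = cosine(beer, train)
```

Under this reading, beer and wine share exactly the same non-zero contexts, so their cosine is close to 1, while beer and train share none, so their cosine is 0. Both results agree with the values on the slide.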
J. Bullinaria and J. Levy. 2007. Extracting semantic representations from word co-occurrence statistics: A computational study. Behavior Research Methods, 39:510–526.