Fine-grained Entity Typing using Graph Convolutional Networks

Slide 1

Slide 1 text

Fine-grained Entity Typing using GCN without a predefined typing structure
2019-05-23 @izuna385

Slide 2

Slide 2 text

Paper (Baseline)
Imposing Label-Relational Inductive Bias for Extremely Fine-Grained Entity Typing (Xiong et al., NAACL ’19)
Task: Entity typing (multi-label prediction)
RQ: Does encoding type correlations via a GCN contribute to type prediction?
Approach: Add a graph propagation layer to the prediction module.

Slide 3

Slide 3 text

Task and Dataset (Choi et al., ACL ’18)
• train/dev/test = 2000 examples each (2000 * 3)

Slide 4

Slide 4 text

Motivation for using GCN
• Recent entity typing models incorporate a pre-defined type structure.
• But many types are unseen in the KB.
• Even without a pre-defined structure, label correlation should still be considered.
(Shikhar et al., 2018) inconsistent

Slide 5

Slide 5 text

GCN (simplest)
W0: learnable parameter
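The slide's equation did not survive extraction. As a minimal sketch, assuming the commonly used form H' = ReLU(A H W0), one propagation step could look like the following. The function name `gcn_layer` and the toy graph are hypothetical, chosen only for illustration.

```python
import numpy as np

def gcn_layer(adj, feats, w0):
    """One propagation step: sum neighbour features, apply W0, then ReLU."""
    return np.maximum(adj @ feats @ w0, 0.0)

# Toy graph with 3 nodes: edges 0-1 and 1-2, no self loops.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
feats = np.eye(3)       # one-hot node features
w0 = np.ones((3, 2))    # stands in for the learnable parameter W0
out = gcn_layer(adj, feats, w0)
```

In this form a node's own features do not enter its update, because the diagonal of the adjacency matrix is zero.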

Slide 6

Slide 6 text

GCN with self loop
W0: learnable parameter
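A self loop is often expressed by adding the identity matrix to the adjacency matrix. The sketch below assumes the form H' = ReLU((A + I) H W0), which may differ from the exact equation on the slide. The name `gcn_layer_self_loop` and the toy data are hypothetical.

```python
import numpy as np

def gcn_layer_self_loop(adj, feats, w0):
    """Propagate with A + I so that each node also keeps its own features."""
    adj_hat = adj + np.eye(adj.shape[0])
    return np.maximum(adj_hat @ feats @ w0, 0.0)

# Node 2 is isolated: without the self loop its output is all zeros.
adj = np.array([[0., 1., 0.],
                [1., 0., 0.],
                [0., 0., 0.]])
feats = np.array([[1., 0.],
                  [0., 1.],
                  [2., 2.]])
w0 = np.eye(2)          # identity stands in for the learnable W0
without_loop = np.maximum(adj @ feats @ w0, 0.0)
with_loop = gcn_layer_self_loop(adj, feats, w0)
```

Comparing `without_loop[2]` and `with_loop[2]` shows the effect: the isolated node loses its features entirely unless the self loop is present.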

Slide 7

Slide 7 text

GCN (summarized)
https://www.experoinc.com/post/node-classification-by-graph-convolutional-network
The adjacency/co-occurrence matrix carries the structure information.
The propagation rule is learned during training.
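One way a type co-occurrence matrix might be built from multi-hot training labels is sketched below. It assumes two types are connected if they appear together on at least one example; the paper's exact construction may differ. The function name and the toy type inventory are hypothetical.

```python
import numpy as np

def cooccurrence_adjacency(labels):
    """labels: (num_examples, num_types) multi-hot matrix.

    Returns a binary (num_types, num_types) matrix with 1 where two
    distinct types are assigned together to at least one example.
    """
    counts = labels.T @ labels      # counts[i, j] = examples tagged with both i and j
    adj = (counts > 0).astype(float)
    np.fill_diagonal(adj, 0.0)      # self loops are added separately if needed
    return adj

# Hypothetical types: 0 = person, 1 = politician, 2 = location
labels = np.array([[1, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1]])
adj = cooccurrence_adjacency(labels)
```

Here only "person" and "politician" co-occur, so they are the only connected pair.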

Slide 8

Slide 8 text

Incorporate structure bias into the prediction layer
Figure labels: linear transformation layer; encoded mention; mention encoding (this module is not explained in this slide)

Slide 9

Slide 9 text

Incorporate structure bias into the prediction layer
Figure labels: linear transformation layer (learned); encoded mention

Slide 10

Slide 10 text

Incorporate structure bias into the prediction layer
Figure labels: linear transformation layer (learned); encoded mention
• The linear transformation layer (prediction layer) can be seen as a type embedding matrix.
• But how can label correlation be incorporated?

Slide 11

Slide 11 text

Incorporate structure bias into the prediction layer
Figure labels: linear transformation layer; encoded mention; typing co-occurrence information (fixed); typing matrix; propagation rule (learned); type-emb matrix with type-type correlation incorporated
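The idea on this slide can be sketched as follows, assuming the propagated type matrix takes the form (A + I) W_O T and that each type's logit is its propagated embedding dotted with the encoded mention. Normalization of the adjacency matrix is omitted, and the names `w_o`, `t`, and `mention` are hypothetical.

```python
import numpy as np

def label_gcn_logits(mention, w_o, adj, t):
    """mention: (d,) encoded mention; w_o: (num_types, d) type embeddings;
    adj: (num_types, num_types) fixed co-occurrence matrix; t: (d, d) learned transform."""
    adj_hat = adj + np.eye(adj.shape[0])  # self loop keeps each type's own embedding
    w_prop = adj_hat @ w_o @ t            # embeddings mixed with co-occurring types
    return w_prop @ mention               # one logit per type

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
num_types, d = 4, 3
adj = np.array([[0., 1., 0., 0.],
                [1., 0., 0., 0.],
                [0., 0., 0., 1.],
                [0., 0., 1., 0.]])
w_o = rng.normal(size=(num_types, d))
t = np.eye(d)                             # identity here; learned in practice
mention = rng.normal(size=d)
probs = sigmoid(label_gcn_logits(mention, w_o, adj, t))
```

Only `w_o` and `t` would be trained in this sketch; `adj` stays fixed, which matches the "fixed" and "learned" labels on the slide.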

Slide 12

Slide 12 text

Evaluation and re-implementation result

model              P(test)  R(test)  F(test)
Choi et al.(2018)  47.1     24.2     32.0
LabelGCN(paper)    50.3     29.2     36.9
LabelGCN(re-imp)   49.2     28.4     36.0

Hyperparameters are the same as the original ones.
The number of epochs is not reported, so I stopped at 100 iterations.
No P-R curve (probability threshold) tuning (threshold = 0.5, same as Choi et al.).
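The F column appears consistent with the harmonic mean of the reported P and R. A small sketch to check this, together with a fixed-threshold prediction rule, is given below. The helper names `f1` and `predict_types` are hypothetical, and the strict `>` comparison at 0.5 is an assumption.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def predict_types(probs, threshold=0.5):
    """Return indices of types whose probability exceeds the threshold."""
    return [i for i, p in enumerate(probs) if p > threshold]

# (P, R) pairs copied from the table above.
rows = {
    "Choi et al.(2018)": (47.1, 24.2),
    "LabelGCN(paper)": (50.3, 29.2),
    "LabelGCN(re-imp)": (49.2, 28.4),
}
scores = {name: f1(p, r) for name, (p, r) in rows.items()}
predicted = predict_types([0.9, 0.4, 0.6])
```

Each computed score lands within 0.1 of the corresponding F value in the table.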

Slide 13

Slide 13 text

Future work
• Ablation study
• Apply this model to other datasets/domains (OntoNotes, etc.)