Slide 1

1/26 Unofficial slides @izuna385

Slide 2

2/26 Summary
• To tackle low-resource domains where datasets are limited,
• they propose a new recommendation scheme for creating annotations,
• which is agnostic to novel domains.

Slide 3

3/26 Entity Linking (EL)
• Link a mention to a specific entity in a Knowledge Base.
Beam, Andrew L., et al. "Clinical Concept Embeddings Learned from Massive Sources of Medical Data." arXiv:1804.01486 (2018).

Slide 4

4/26 Procedure for EL
1. Prepare mention/context vectors based on annotated data
2. Learn/prepare entity (and its feature) representations
3. Candidate generation
4. Linking

Slide 5

5/26 Procedure for EL
1. Prepare mention/context vectors based on annotated data
2. Learn/prepare entity (and its feature) representations
3. Candidate generation
4. Linking
→ Requires a large amount of labeled data

Slide 6

6/26 Defects of previous EL
• Previous EL requires lots of domain-specific annotations (e.g. [Gillick et al., ‘19]: 9 million).
• They can’t cope with new domains and the KBs for them.
• Prior-based candidate generation works only in the general domain. [Gillick et al., ‘19]

Slide 7

7/26 Previous Human-In-The-Loop EL
● Requires a pre-trained sequence tagger and an index for candidates.
● This can’t solve the cold-start problem, where no annotations exist.
● Also, previous studies only link against Wikipedia. (If we link mentions to Wikipedia, a prior can be utilized.)

Slide 8

8/26 Their proposal
● Can interactive annotation, during which a candidate recommendation system is trained, improve the annotation process for humans?

Slide 9

9/26 Their proposal
● Can interactive annotation, during which a candidate recommendation system is trained, improve the annotation process for humans?
Human-In-The-Loop Approach (proposed)
■ Main focus is the candidate ranking step after candidates are generated.
■ All entities in the reference KB are assumed to have a title and a description.

Slide 10

10/26 Procedure

Slide 11

11/26 Procedure
Their focus: Levenshtein-distance-based fuzzy search (next slide)
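
(Not from the paper: a minimal, self-contained sketch of what a Levenshtein-distance-based fuzzy search over KB titles could look like. The distance function and the toy titles are purely illustrative.)

```python
# Toy Levenshtein-based fuzzy candidate search (illustration, not the authors' code).

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def fuzzy_candidates(mention: str, kb_titles: list[str], k: int = 5) -> list[str]:
    """Return the k KB titles closest to the mention's surface form."""
    return sorted(kb_titles, key=lambda t: levenshtein(mention.lower(), t.lower()))[:k]

# A noisy surface form still retrieves the intended entry.
print(fuzzy_candidates("Shakespear", ["William Shakespeare", "Edmund Spenser", "John Milton"], k=2))
```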

Slide 12

12/26 (Supplement) Examples of fuzzy search
● [Murty et al., ACL ‘18]: character N-gram features.
 ・TF-IDF character N-grams (N=1-5) + L2 normalization + cosine similarity.
● In this From-Zero-to-Hero paper, they leverage the WordPiece tokenization of BERT.
● Still searching for other examples ...
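
(Illustrative sketch of the character-n-gram variant mentioned above: TF-IDF over character 1-5-grams, L2-normalized, ranked by cosine similarity. The scikit-learn usage and toy titles are my own assumptions, not code from [Murty et al., ACL ‘18] or this paper.)

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

kb_titles = ["William Shakespeare", "Edmund Spenser", "John Milton"]  # toy KB

# Character 1-5-gram TF-IDF; scikit-learn L2-normalizes rows by default (norm="l2").
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 5))
title_vecs = vectorizer.fit_transform(kb_titles)

def char_ngram_candidates(mention: str, k: int = 2):
    """Rank KB titles by cosine similarity of character-n-gram TF-IDF vectors."""
    sims = cosine_similarity(vectorizer.transform([mention]), title_vecs)[0]
    top = sims.argsort()[::-1][:k]
    return [(kb_titles[i], float(sims[i])) for i in top]

print(char_ngram_candidates("Shakespear"))
```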

Slide 13

13/26 Candidate Ranking
● Handcrafted-feature-based ranking with LightGBM / RankSVM / RankNet.
● To enhance interactivity and avoid slow inference, they avoided DNN models and used gradient-boosted trees.
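
(Toy sketch of ranking candidates with gradient-boosted trees via LightGBM's LGBMRanker, in the spirit of this slide. The features and data below are synthetic; the paper's actual features and setup may differ.)

```python
import numpy as np
from lightgbm import LGBMRanker

rng = np.random.default_rng(0)
n_mentions, n_cands = 200, 5

# One row per (mention, candidate) pair; columns stand in for handcrafted features
# such as string similarity and sentence/description cosine similarity.
X = rng.random((n_mentions * n_cands, 2))
y = np.zeros(n_mentions * n_cands, dtype=int)
for m in range(n_mentions):
    block = slice(m * n_cands, (m + 1) * n_cands)
    y[block.start + X[block].sum(axis=1).argmax()] = 1   # "gold" = best-featured candidate
group = [n_cands] * n_mentions                           # number of candidates per mention

ranker = LGBMRanker(objective="lambdarank", n_estimators=100)
ranker.fit(X, y, group=group)

# At annotation time: score one mention's candidates and show them best-first.
new_cands = rng.random((n_cands, 2))
print(new_cands[ranker.predict(new_cands).argsort()[::-1]])
```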

Slide 14

14/26 Features Used for Candidate Ranking
● See the original paper.
● Note: although they use the cosine similarity between the sentence and the entity label, the BERT encoders were not fine-tuned.
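
(Sketch of one such feature: cosine similarity between the mention's sentence and the candidate entity's description, computed with a frozen, non-fine-tuned BERT encoder. The model choice, mean pooling, and example texts are assumptions for illustration.)

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased").eval()   # frozen, not fine-tuned

@torch.no_grad()
def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden states of a frozen BERT encoder."""
    inputs = tok(text, return_tensors="pt", truncation=True)
    hidden = enc(**inputs).last_hidden_state          # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)              # (768,)

def sentence_entity_cossim(sentence: str, entity_desc: str) -> float:
    return float(torch.cosine_similarity(embed(sentence), embed(entity_desc), dim=0))

print(sentence_entity_cossim(
    "The bard wrote Hamlet around 1600.",
    "William Shakespeare was an English playwright and poet."))
```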

Slide 15

15/26 Dataset
● Because mentions in WWO have extreme variance in surface form, Avg. Amb. is lower.
● For WWO and 1641, fuzzy search is conducted, which results in an increase of Avg. Cand.
(Slide example: two surface forms that both stand for the same name)

Slide 16

16/26 Experiments
Can interactive annotation, during which a candidate recommendation system is trained, improve the annotation process for humans?
● To validate their research question, the evaluation is as follows:
1: Performance of the recommender
 ・vs. non-interactive ranking performance
2: Simulating user annotation
3: Real users’ annotation performance
 ・Speed and the need for search queries

Slide 17

17/26 1: Automatic suggestion performance
● Result
High precision: when the gold entity is included in the candidates, the method can pinpoint it reliably.
High recall: the candidate generation method is likely to include the gold entity among its candidates.

Slide 18

18/26 1: Automatic suggestion performance
● Result
High precision: when the gold entity is included in the candidates, the method can pinpoint it reliably.
High recall: the candidate generation method is likely to include the gold entity among its candidates.

Slide 19

19/26 1: Automatic suggestion performance
● Result
High precision: when the gold entity is included in the candidates, the method can pinpoint it reliably.
High recall: the candidate generation method is likely to include the gold entity among its candidates.
→ For noisy text, use Levenshtein.
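
(To make the recall side concrete: candidate-generation recall@k is the fraction of mentions whose gold entity appears among the top-k candidates. A tiny illustrative helper:)

```python
def recall_at_k(candidate_lists, golds, k=5):
    """Fraction of mentions whose gold entity is among the top-k candidates."""
    hits = sum(gold in cands[:k] for cands, gold in zip(candidate_lists, golds))
    return hits / len(golds)

print(recall_at_k([["A", "B", "C"], ["D", "E"]], golds=["B", "X"], k=2))  # -> 0.5
```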

Slide 20

20/26 2: Candidate ranking performance
Note:
■ Evaluation is different for each dataset.
■ “MFLE” denotes the gold entity’s frequency in the training dataset.

Slide 21

21/26 2: Candidate ranking performance
|C|: average number of available candidates. t: training time.
・AIDA is biased toward its training set.
・LightGBM and RankSVM are fast enough to re-train after each user annotation, they say.

Slide 22

22/26 2: Candidate ranking (feature importance)
● The high importance of Jaccard similarity for AIDA is due to the fact that Wikidata entity descriptions are short.
● Levenshtein ML matters for WWO and 1641 because, to create the Human-in-the-Loop situation, they build the KB from mentions in the documents.
● Sentence cosine similarity is useful across all three datasets.

Slide 23

23/26 3: Simulation
● Simulation covers only annotating the training datasets, because generalizing the model out-of-corpus is out of scope.
● Seed annotations → add them to the training set for the ranker → the user adds new annotations → ...

Slide 24

24/26 3: Simulation
● Simulation covers only annotating the training datasets, because generalizing the model out-of-corpus is out of scope.
● Seed annotations → add them to the training set for the ranker → the user adds new annotations → ...
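
(A toy, self-contained sketch of the simulated loop described above: start from a few seed annotations, retrain the ranker after every annotation, and let the simulated user always pick the gold entity. The frequency-prior "ranker" and synthetic data only illustrate the control flow, not the paper's models.)

```python
import random

random.seed(0)
KB = [f"entity_{i}" for i in range(50)]
mentions = [(f"mention_{i}", random.choice(KB)) for i in range(100)]  # (surface, gold)

def candidates(gold):                        # stand-in candidate generation
    return random.sample([e for e in KB if e != gold], 4) + [gold]

def train_ranker(annotations):               # stand-in ranker: a simple gold-frequency prior
    counts = {}
    for _, gold in annotations:
        counts[gold] = counts.get(gold, 0) + 1
    return counts

annotated, hits_at_1 = mentions[:10], 0      # seed annotations
for surface, gold in mentions[10:]:
    ranker = train_ranker(annotated)         # re-trained after each user annotation
    ranked = sorted(candidates(gold), key=lambda e: ranker.get(e, 0), reverse=True)
    hits_at_1 += ranked[0] == gold
    annotated.append((surface, gold))        # the simulated user picks the gold entity
print(f"hits@1 during the simulation: {hits_at_1}/{len(mentions) - 10}")
```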

Slide 25

25/26 4: Real User Study
● Their annotation recommender
● leads to 35% faster annotation and 15% fewer search queries.

Slide 26

26/26 Conclusion
● A newly proposed annotation recommender for EL.
● Some feature-based analysis for noisy-text EL.
● The human-in-the-loop approach is effective for training the candidate ranker and for recommendation.