Distant Learning for Entity Linking with Automatic Noise Detection

izuna385
August 07, 2019

ACL19 sup. for journal club.

Other EL paper slides and summaries:
https://github.com/izuna385/EntityLinking_RecentTrend

Transcript

  1. 2/20 Entity Linking
    • Link a mention to a specific entity in a knowledge base.
    [Figure labels: entity, Knowledge Base]
    Beam, Andrew L., et al. "Clinical Concept Embeddings Learned from Massive Sources of Medical Data." arXiv:1804.01486 (2018).
  2. 4/20 Procedure
    1. Prepare the mention/context vector
    2. Learn or prepare the entity (in KB) representation
    3. Candidate generation
    4. Linking
    Also noted on the slide: a large amount of labeled data; a Wikipedia-hyperlink-based alias table
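  3. 7/20 Scorer and loss for linking (the symbols themselves were not captured in this transcript)
    Scorer for linking (referred to as g on slide 9/20).
    Loss for the linker: expects a high link score from the bag that possibly contains the gold entity, and a low link score from the negative-sampled bag.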
  4. 8/20 Under “Supervised” settings
    If candidate generation fails to retrieve the gold entity, we can simply add the gold entity to the bag. ← Shikhar et al., ACL’18
  5. 9/20 Under “Distant” settings
    We can’t know whether the candidate bag contains the gold entity. But to train g on valid data points, we want to know (classify) this.
  6. 11/20 Noisy/Valid E+ classifier
    pN: classifies whether the bag for a mention is ‘noisy’ or ‘valid’.
    [Figure labels: E+ bag rep., contextualized mention, pN, 1, 0]
  7. 12/20 Noisy/Valid E+ classifier
    pN: classifies whether the bag for a mention is ‘noisy’ or ‘valid’.
    [Figure labels: E+ bag rep., contextualized mention, pN, 1, 0]
    NOTE: pN does not take the mention-candidate surface similarity as input.
  8. 14/20 Loss for training pN (noisy/valid bag classifier) together with the linker
    [Figure labels: valid (not noisy) prob., link loss, 1, 0]
    For a possibly valid (= gold entity exists) bag, sum up the link loss to train the linker, but…
  9. 15/20 Loss for training pN (noisy/valid bag classifier) together with the linker
    [Figure labels: valid (not noisy) prob., link loss, 1, 0]
    Assigning ‘noisy’ to all bags easily drives the loss to 0, so neither the linker nor the bag classifier can be trained.
  10. 16/20 Loss for training pN (noisy/valid bag classifier) together with the linker
    [Figure labels: valid (not noisy) prob., link loss, 1, 0]
    Hyperparameter (symbol not captured in this transcript): belief about the noisy data points (e.g. 0.9).
    Mean noisiness value over the document.
  11. 17/20 Loss for training pN (noisy/valid bag classifier) together with the linker
    [Figure labels: valid (not noisy) prob., link loss, 1, 0]
    Hyperparameter (symbol not captured in this transcript): belief about the noisy data points (e.g. 0.9).
    Mean noisiness value over the document.
    Adding this loss term is expected to train the linker on data in which the gold entity very likely exists.
  12. 19/20 Table 3: Linker error rate on the dev set
    Blue: denoising succeeded.
    Red: denoising failed, due to a flaw in candidate generation.
    ND: denoising bags for training the linker = succeeded at catching the signal of the gold entity in the bag.
  13. 20/20 Confirming that pN separates valid/noisy data
    (Symbol not captured in this transcript): bag in which the gold entity doesn’t exist.
    [Figure 3]
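
The loss slides (7/20 and 14/20 to 17/20) can be sketched in code. This is a minimal sketch under stated assumptions, not the authors' implementation. The slides only say that the link loss is weighted by the valid (not noisy) probability, and that an extra loss uses a belief hyperparameter (e.g. 0.9) together with the document's mean noisiness. The hinge form of the link loss, the Bernoulli KL penalty, the reading of 0.9 as the expected fraction of noisy bags, and every name below (`link_loss`, `bernoulli_kl`, `document_loss`, `prior_noise`, `reg_weight`, `margin`) are assumptions made for illustration.

```python
import math


def link_loss(pos_scores, neg_scores, margin=0.1):
    """Hinge-style link loss for one mention (assumed form).

    pos_scores: scores g(e, m, c) for candidates in the E+ bag.
    neg_scores: scores for candidates in the negative-sampled bag.
    The best E+ score should beat the best negative score by `margin`.
    """
    return max(0.0, margin + max(neg_scores) - max(pos_scores))


def bernoulli_kl(p, q, eps=1e-8):
    """KL(Bernoulli(p) || Bernoulli(q)), with q clamped away from 0 and 1."""
    q = min(max(q, eps), 1.0 - eps)
    kl = 0.0
    if p > 0.0:
        kl += p * math.log(p / q)
    if p < 1.0:
        kl += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return kl


def document_loss(bags, noise_probs, prior_noise=0.9, reg_weight=1.0, margin=0.1):
    """Noise-weighted link loss plus a penalty on the mean noise probability.

    bags: list of (pos_scores, neg_scores), one pair per mention in a document.
    noise_probs: p_N for each bag, i.e. the probability that the bag is noisy.
    prior_noise: assumed belief about the fraction of noisy bags (hypothetical name).
    """
    # Each link loss is weighted by the valid (not noisy) probability 1 - p_N.
    weighted = sum(
        (1.0 - p_n) * link_loss(pos, neg, margin)
        for (pos, neg), p_n in zip(bags, noise_probs)
    )
    # Mean noisiness over the document, pulled toward the prior belief.
    mean_noise = sum(noise_probs) / len(noise_probs)
    return weighted + reg_weight * bernoulli_kl(prior_noise, mean_noise)


if __name__ == "__main__":
    bags = [([0.5, 0.2], [1.0, 0.3]), ([2.0], [1.0])]
    print(document_loss(bags, [1.0, 1.0], reg_weight=0.0))  # degenerate case
    print(document_loss(bags, [1.0, 1.0]))                   # penalised
```

With `reg_weight=0.0`, setting every noise probability to 1 makes the total loss 0, which is the degenerate solution slide 15/20 warns about. With the penalty enabled, labelling every bag as noisy is no longer free, so the classifier has to keep some bags as valid and the linker still receives a training signal.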