Revealing the Myth of Higher-Order Inference in Coreference Resolution

Emory NLP

July 08, 2021

Transcript

  1. Introduction
     • PyTorch implementation of the end-to-end coreference resolution model
     • Based on [Lee et al.’17] and [Joshi et al.’20]
     • Four higher-order inference (HOI) methods:
       • Two previous methods, from [Lee et al.’18] and [Kantor and Globerson’19]
       • Two new methods
     • Empirical effectiveness of all four HOI methods
  2. Approach: Overview
     • Local-decision coreference model: c2f-coref by [Lee et al.’18]
     • Four HOI methods on top:
       • Span refinement:
         • Attended Antecedent (AA): [Lee et al.’18]
         • Entity Equalization (EE): [Kantor and Globerson’19]
         • Span Clustering (SC): new
       • Cluster Merging (CM): new, inspired by [Wiseman et al.’16]
  3. Approach: End-to-End Coreference Model
     • Mention-linking process
     • Local decisions between two spans
     • Learns a distribution over each span’s antecedents (see the sketch below)
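
The mention-linking step can be illustrated with a minimal PyTorch sketch of a pairwise antecedent ranker with a dummy antecedent for non-anaphoric spans; the class name, scorer, and dimensions below are our assumptions, not the repository's actual API:

```python
import torch
import torch.nn as nn

class AntecedentRanker(nn.Module):
    """Minimal local (first-order) antecedent ranker in the spirit of
    [Lee et al.'17]: each span scores every preceding span as a candidate
    antecedent, plus a dummy antecedent for non-anaphoric spans."""

    def __init__(self, span_dim: int, hidden: int = 150):
        super().__init__()
        # Pairwise scorer over [span_i; span_j; span_i * span_j].
        self.pair_scorer = nn.Sequential(
            nn.Linear(span_dim * 3, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, spans: torch.Tensor) -> torch.Tensor:
        """spans: (n, d) span embeddings in document order.
        Returns an (n, n+1) antecedent distribution; column 0 is the dummy."""
        n, d = spans.shape
        xi = spans.unsqueeze(1).expand(n, n, d)  # span i
        xj = spans.unsqueeze(0).expand(n, n, d)  # candidate antecedent j
        scores = self.pair_scorer(torch.cat([xi, xj, xi * xj], -1)).squeeze(-1)
        # Only strictly preceding spans are valid antecedents.
        valid = torch.tril(torch.ones(n, n, dtype=torch.bool), diagonal=-1)
        scores = scores.masked_fill(~valid, float("-inf"))
        # The dummy antecedent has a fixed score of 0, as in the original model.
        return torch.softmax(torch.cat([torch.zeros(n, 1), scores], 1), dim=1)

# Local decisions only: a distribution over each span's antecedents.
P = AntecedentRanker(span_dim=8)(torch.randn(5, 8))  # P[i, 0] = P(non-anaphoric)
```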
  4. Approach: Span Refinement
     • Enrich the span representation using the predicted antecedent distribution
     • The HOI methods differ in how the enrichment is computed
     • Re-rank antecedents with the refined spans (AA sketch below)
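
As a concrete instance of span refinement, here is a minimal sketch of the Attended Antecedent (AA) update of [Lee et al.'18]: each span attends over its predicted antecedents, and a learned gate mixes the attended representation back into the span. The gate layer and dimensions are ours:

```python
import torch
import torch.nn as nn

def attended_antecedent_update(spans, antecedent_dist, gate: nn.Linear):
    """One round of Attended Antecedent (AA) refinement.
    spans: (n, d); antecedent_dist: (n, n+1), dummy antecedent in column 0."""
    # Expected antecedent embedding; the dummy attends to the span itself,
    # so a likely non-anaphoric span stays close to its old representation.
    expected = antecedent_dist[:, :1] * spans + antecedent_dist[:, 1:] @ spans
    # A learned gate interpolates between the old and attended representations.
    f = torch.sigmoid(gate(torch.cat([spans, expected], dim=-1)))
    return f * spans + (1 - f) * expected

# Refine the spans, then re-rank antecedents with the refined representations.
spans, scores = torch.randn(5, 8), torch.randn(5, 6)
scores[:, 1:].masked_fill_(torch.triu(torch.ones(5, 5, dtype=torch.bool)), float("-inf"))
P = torch.softmax(scores, dim=1)          # stand-in antecedent distribution
refined = attended_antecedent_update(spans, P, nn.Linear(16, 8))
```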
  5. Approach: Entity Equalization (EE)
     • [Kantor and Globerson’19]
     • Builds a “soft” entity from the antecedents
     • Attended entity representation over the entity distribution (sketch below)
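
A simplified sketch of the Entity Equalization idea: a soft cluster-membership matrix Q is built recursively from the antecedent distribution, and each span is refined toward its expected entity representation. This omits the normalization and attention details of [Kantor and Globerson'19]:

```python
import torch

def entity_equalization(spans, antecedent_dist):
    """Q[i, e] = P(span i belongs to the entity opened by span e), built
    recursively from the antecedent distribution (dummy in column 0)."""
    n, _ = spans.shape
    Q = torch.zeros(n, n)
    for i in range(n):
        Q[i, i] = antecedent_dist[i, 0]               # i opens a new entity
        for j in range(i):                            # i joins j's entity
            Q[i] = Q[i] + antecedent_dist[i, j + 1] * Q[j]
    # Each entity is the Q-weighted sum of its spans; each span is then
    # "equalized" to its expected entity representation.
    entity_reps = Q.t() @ spans                       # one rep per entity slot
    return Q @ entity_reps                            # (n, d) refined spans

spans, scores = torch.randn(4, 8), torch.randn(4, 5)
scores[:, 1:].masked_fill_(torch.triu(torch.ones(4, 4, dtype=torch.bool)), float("-inf"))
refined = entity_equalization(spans, torch.softmax(scores, dim=1))
```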
  6. Approach: Span Clustering (SC)
     • New span refinement method
     • Decodes the actual predicted entities from the antecedent distribution
     • Entity representation: attention over the spans in each predicted entity (sketch below)
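
Span Clustering can be sketched as follows (our reading of the slide, with hypothetical names): decode hard clusters by argmax over the antecedent distribution, then attend over each cluster's member spans. Here the refined span simply takes the entity representation, whereas the model gates it with the original span:

```python
import torch

def span_clustering(spans, antecedent_dist, att_weight):
    """Decode the actual predicted entities by argmax over the antecedent
    distribution (dummy in column 0), then build each entity representation
    by attending over its member spans. `att_weight` is a stand-in for the
    learned attention parameters."""
    n, _ = spans.shape
    cluster_of = list(range(n))                # each span starts alone
    best = antecedent_dist.argmax(dim=1)
    for i in range(n):
        if int(best[i]) > 0:                   # link i to its best antecedent
            cluster_of[i] = cluster_of[int(best[i]) - 1]
    logits = spans @ att_weight                # (n,) attention logits
    refined = spans.clone()
    for c in set(cluster_of):
        idx = [i for i in range(n) if cluster_of[i] == c]
        w = torch.softmax(logits[idx], dim=0)  # attend over cluster members
        refined[idx] = (w.unsqueeze(1) * spans[idx]).sum(0)
    return refined

spans, scores = torch.randn(5, 8), torch.randn(5, 6)
scores[:, 1:].masked_fill_(torch.triu(torch.ones(5, 5, dtype=torch.bool)), float("-inf"))
refined = span_clustering(spans, torch.softmax(scores, dim=1), torch.randn(8))
```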
  7. Approach: Cluster Merging (CM)
     • Builds up and maintains entity representations through antecedent ranking
     • Configuration: ranking order (sequential vs. easy-first)
     • Configuration: cluster-merging reduction (max vs. average pooling)
     • Ranking score: antecedent score + cluster matching score (sketch below)
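
A sketch of Cluster Merging under the sequential ranking order, with a dot product standing in for the learned cluster matching score and max vs. average pooling as the merging reduction (all names are ours):

```python
import torch

def cluster_merging(spans, pair_scores, reduce="mean"):
    """Rank spans left to right (sequential order; the deck also mentions
    easy-first). Each span scores every preceding span by
    antecedent score + cluster matching score, then either merges into the
    winning cluster or opens a new entity (dummy score = 0)."""
    n, _ = spans.shape
    members, reps, cluster_of = [], [], {}     # entity members and pooled reps
    for i in range(n):
        best_c, best_s = None, torch.tensor(0.0)
        for j in range(i):
            c = cluster_of[j]
            # Ranking score: pairwise antecedent score + cluster match score
            # (dot product here; the model learns this scorer).
            s = pair_scores[i, j] + spans[i] @ reps[c]
            if s > best_s:
                best_c, best_s = c, s
        if best_c is None:                     # open a new entity
            cluster_of[i] = len(members)
            members.append([i])
            reps.append(spans[i])
        else:                                  # merge i into the winning cluster
            members[best_c].append(i)
            stack = spans[members[best_c]]
            reps[best_c] = stack.max(0).values if reduce == "max" else stack.mean(0)
            cluster_of[i] = best_c
    return members

print(cluster_merging(torch.randn(6, 8), torch.randn(6, 6)))
```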
  8. Results

     Model               Avg. F1   Avg. F1 (mean over runs)
     Joshi et al.’19     76.9      -
     Joshi et al.’20     79.6      -
     BERT (Local)        77.4      77.3 (±0.1)
     SpanBERT (Local)    79.9      79.7 (±0.1)
     + AA                79.7      79.4 (±0.2)
     + EE                79.4      78.9 (±0.4)
     + SC                79.7      79.2 (±0.3)
     + CM                80.2      79.9 (±0.2)
  9. Analysis: Direct Impact
     • Turn off HOI at evaluation time
     • The performance drop on the test set is trivial:

       Method   Avg. F1 drop
       AA       -0.02 (±0.06)
       EE        0.03 (±0.07)
       SC        0.11 (±0.10)
       CM        0.04 (±0.04)
 10. Analysis: Coreferent Links
     • Examine the change in link correctness on the test set (W: wrong; C: correct)
     • HOI effects are two-sided:

       Method   W→C            C→W
       + AA     240.8 (1.3%)   241.2 (1.3%)
       + EE     244.1 (1.3%)   245.3 (1.3%)
       + SC     248.2 (1.3%)   262.0 (1.4%)
       + CM     226.4 (1.2%)   235.0 (1.2%)
 11. Analysis: Pronoun Resolution
     • Examine coreferent links on pronouns w.r.t. plurality (S: singular; P: plural)

       Model              S→P   P→S
       BERT (Local)       2.3   6.5
       SpanBERT (Local)   2.8   6.6
       + AA               1.8   8.8
       + EE               1.8   5.5
       + SC               3.8   7.2
       + CM               3.0   6.6
 12. Analysis: Ambiguous Pronouns
     • Long-standing HOI motivation: contamination from ambiguous pronouns,
       e.g. (he, you) and (you, they) → (you, he, they)
     • Number of clusters containing ambiguous pronouns: trivial difference

       Model              # Clusters
       BERT (Local)       48.8 (3.5%)
       SpanBERT (Local)   43.8 (2.7%)
       + AA               44.8 (2.4%)
       + EE               44.0 (2.5%)
       + SC               45.4 (3.0%)
       + CM               43.8 (2.6%)
 13. Summary
     • Current HOI provides marginal benefits over local decisions.
     • HOI depends on the quality of the first-round antecedent ranking.
     • HOI acts as an implicit regularizer (mutually dependent with the local ranking).