[Reading-group slides] A Frustratingly Easy Approach for Entity and Relation Extraction

tossy
June 03, 2022

Slides explaining the paper "A Frustratingly Easy Approach for Entity and Relation Extraction".

Transcript

  1. A Frustratingly Easy Approach for Entity and Relation Extraction
    Zexuan Zhong, Danqi Chen
    NAACL 2021 (North American Chapter of the Association for Computational Linguistics)
    URL: https://aclanthology.org/2021.naacl-main.5/ (citations: 53)
    What is this paper about?
    ・ Proposes a simple and effective approach for named entity recognition and relation extraction
    Paper contributions / key points
    ・ Learns two independent encoders
    ・ Inserts typed entity markers when training the relation model
    ・ Outperforms all previous joint models on three datasets
    ・ An efficient approximation: 8-16x speedup with a small accuracy drop
    Related work
    ・ Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP '19

  2. Introduction

  3. Introduction: Entity and Relation Extraction
    [01] Sang+: Introduction to the conll-2003 shared task: Language independent named entity recognition, CoNLL ‘03
    [02] Ratinov+: Design challenges and misconceptions in named entity recognition, CoNLL ‘09
    [03] Zelenko+: Kernel methods for relation extraction, EMNLP ‘02
    [04] Bunescu+: A shortest path dependency kernel for relation extraction, EMNLP '05
    Named Entity Recognition [01],[02]
    Relation Extraction [03],[04]
    Input: morpa is a fully implemented parser for a text-to-speech system

  4. Problem definition
    Input
    ・ X = x_1, …, x_n: n tokens, where each token represents a single word
    ・ S = {s_1, …, s_m}: all m spans in X of up to length L = 8, where a span is a contiguous sequence of tokens; START(i) and END(i) denote the start and end indices of s_i
    Output
    ・ Entity: y_e(s_i) ∈ E ∪ {ε} (s_i: span, E: set of entity types)
    ・ Relation: y_r(s_i, s_j) ∈ R ∪ {ε} (s_i, s_j: subject/object spans, R: set of relation types)

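To make the input space concrete, here is a minimal sketch (ours, not the authors' code) that enumerates all candidate spans of up to L = 8 tokens:

```python
# Minimal sketch (ours): enumerate all candidate spans of up to length L = 8.
def enumerate_spans(tokens, max_len=8):
    """Return (START(i), END(i)) pairs, inclusive, for all spans up to max_len tokens."""
    return [(start, end)
            for start in range(len(tokens))
            for end in range(start, min(start + max_len, len(tokens)))]

tokens = "Bill Smith was in the hotel room".split()
spans = enumerate_spans(tokens)
print(len(spans))  # 28 candidate spans for this 7-token sentence
```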

  5. Introduction: Early and Recent Work
    Early work: pipelined approach
    • Train the entity model and the relation model separately [05],[06],[07]
    Recent work: end-to-end approach
    • Model entities and relations jointly [08]~[17]
    • Joint models can better capture the interactions between entities and relations
    [05] Zhou+: Exploring various knowledge in relation extraction, ACL ‘05
    [06] Kambhatla+: Combining lexical, syntactic, and semantic features with maximum entropy models for information extraction, ACL ‘04
    [07] Chan+: Exploiting syntactico-semantic structures for relation extraction, ACL-HLT ‘11
    [08] Li+: Incremental joint extraction of entity mentions and relations, ACL ‘14
    [09] Miwa+: End-to-end relation extraction using LSTMs on sequences and tree structures, ACL ‘16
    [10] Katiyar+: Going out on a limb: Joint extraction of entity mentions and relations without dependency trees, ACL ‘17
    [11] Zhang(a)+: End-to-end neural relation extraction with global optimization, EMNLP ‘17
    [12] Zhang(b)+: Position aware attention and supervised data improve slot filling, EMNLP ‘17
    [13] Li+: Entity-relation extraction as multi-turn question answering, ACL ‘19
    [14] Luan+: Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction, EMNLP ‘18
    [15] Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP ‘19
    [16] Lin+: A joint neural model for information extraction with global features, ACL ‘20
    [17] Wang+: Two are better than one: Joint entity and relation extraction with table sequence encoders, EMNLP '20

  6. Introduction: Joint models
    Multi-task learning [16]: span representations are shared between the two tasks
    Figure: the output vectors of the span representation layer serve as inputs to the relation propagation layer, and then to the entity and relation prediction layers
    [16] Luan+: A general framework for information extraction using dynamic span graphs, NAACL '19

  7. This work approach
    • 「Is it good to share the span representation with the entity and relation
    like the joint model?」
    • → Learns two independent encoders
    • Insert typed entity markers in training relation model
    • An efficient approximation: 8-16x speed up with a small accuracy drop
    7
    [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room
    Insert Markers Insert Markers

    View Slide

  8. Entity model

  9. Entity model
    Figure: the input "Bill Smith was in the hotel room" is fed to an entity encoder (bert-base-uncased / albert-xxlarge-v1 / scibert-scivocab-uncased), which produces contextualized representations; these are combined into span representations (e.g., "Bill Smith", "was", "in", "room", ……)

  10. Entity model
    Figure (continued): each span representation is passed through a 2-layer FFNN with 150 hidden units, then a dense + softmax layer predicts the entity type (e.g., PER for "Bill Smith")

  11. Entity model
    Figure (continued): the same dense + softmax classifier is applied to every candidate span, so each span receives its own entity-type prediction

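A minimal sketch of this classification head, assuming span representations are formed by concatenating the start-token and end-token vectors with a learned span-width embedding (as in the paper); module and parameter names are ours:

```python
import torch
import torch.nn as nn

class EntityClassifier(nn.Module):
    """Sketch: span representation -> 2-layer FFNN (150 hidden units) -> softmax."""
    def __init__(self, hidden=768, width_dim=150, n_labels=8, max_width=8):
        super().__init__()
        self.width_emb = nn.Embedding(max_width, width_dim)  # span width feature
        self.ffnn = nn.Sequential(
            nn.Linear(2 * hidden + width_dim, 150), nn.ReLU(),
            nn.Linear(150, 150), nn.ReLU(),
        )
        self.out = nn.Linear(150, n_labels)  # entity types + "no entity"

    def forward(self, token_reps, start, end):
        # token_reps: (seq_len, hidden) contextualized representations from the encoder
        width = self.width_emb(torch.tensor(end - start))
        span_rep = torch.cat([token_reps[start], token_reps[end], width])
        return self.out(self.ffnn(span_rep)).softmax(dim=-1)
```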

  12. Relation model

  13. Relation model
    Input: Bill Smith was in the hotel room
    (entities: Bill Smith = PER (Person), hotel = FAC (Facility), room = FAC (Facility))
    Output
    ・ (Bill Smith: PER, room: FAC) → PHYS (Physical)
    ・ (hotel: FAC, room: FAC) → PART-WHOLE

  14. Relation model: Inserting Markers
    Input: Bill Smith was in the hotel room (Bill Smith = PER, hotel = FAC, room = FAC)
    Insert typed entity markers for each candidate (subject, object) pair:
    ・ (Bill Smith: PER, hotel: FAC) → [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room
    ・ (Bill Smith: PER, room: FAC) → [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]
    ・ (hotel: FAC, room: FAC) → Bill Smith was in the [S:FAC] hotel [/S:FAC] [O:FAC] room [/O:FAC]
    ・ ……

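A small sketch of this preprocessing step (function and argument names are ours), assuming spans are given as inclusive token indices:

```python
def insert_typed_markers(tokens, subj, obj):
    """Insert typed entity markers around a (subject, object) span pair.
    subj/obj: ((start, end), entity_type) with inclusive token indices."""
    (s_start, s_end), s_type = subj
    (o_start, o_end), o_type = obj
    out = list(tokens)
    # Insert from the rightmost position first so earlier indices stay valid.
    inserts = sorted([
        (s_start, f"[S:{s_type}]"), (s_end + 1, f"[/S:{s_type}]"),
        (o_start, f"[O:{o_type}]"), (o_end + 1, f"[/O:{o_type}]"),
    ], reverse=True)
    for pos, marker in inserts:
        out.insert(pos, marker)
    return out

tokens = "Bill Smith was in the hotel room".split()
print(" ".join(insert_typed_markers(tokens, ((0, 1), "PER"), ((6, 6), "FAC"))))
# [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]
```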

  15. Relation model
    Figure: the modified input "[S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]" is fed to a relation encoder (bert-base-uncased / albert-xxlarge-v1 / scibert-scivocab-uncased); the span-pair representation for Bill Smith → room is the concatenation of the contextualized representations of the [S:PER] and [O:FAC] markers

  16. Relation model
    Figure (continued): a dense + softmax layer on top of the span-pair representation predicts the relation type (here, PHYS)

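A matching sketch of the relation head (ours), assuming the encoder output and the token positions of the two opening markers are known:

```python
import torch
import torch.nn as nn

class RelationClassifier(nn.Module):
    """Sketch: concatenate the [S:*] and [O:*] marker representations,
    then a dense + softmax layer predicts the relation type."""
    def __init__(self, hidden=768, n_relations=7):
        super().__init__()
        self.out = nn.Linear(2 * hidden, n_relations)  # relation types + "no relation"

    def forward(self, token_reps, s_pos, o_pos):
        # s_pos / o_pos: positions of the [S:*] and [O:*] markers in the input
        pair_rep = torch.cat([token_reps[s_pos], token_reps[o_pos]])
        return self.out(pair_rep).softmax(dim=-1)
```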

  17. Cross-sentence context
    • Cross-sentence context helps predict entity types and relations [15],[16]
    • Related work
    • Add a 3-sentence context window [15]
    • This work
    • Given an input sentence with n words, augment the input with (W - n)/2 words each from the left and right context, for a fixed window size W
    → Entity model: W = 300, Relation model: W = 100
    [15] Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP ‘19
    [16] Luan+: A general framework for information extraction using dynamic span graphs, NAACL ‘19

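A sketch of this windowing (helper name is ours), assuming the sentence and its document-level context are given as token lists:

```python
def add_context(left, sentence, right, window=300):
    """Extend an n-word sentence with (window - n) / 2 context words on each side.
    window = 300 for the entity model, 100 for the relation model."""
    extra = max((window - len(sentence)) // 2, 0)
    if extra == 0:
        return list(sentence)
    return left[-extra:] + list(sentence) + right[:extra]
```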

  18. Experiments

  19. Experiments: Dataset
    • ACE04[18], ACE05[19]
    • a variety of domains
    (newswire and online forums)
    • Entity
    PER, ORG, GPE, LOC, FAC, VEH, WEA
    • Relation
    Physical, Part-whole, Personal-Social,
    ORG-Affiliation, Agent-Artifact, Gen-Affiliation
    • SciERC[20]
    • 500 AI paper abstracts, scientific terms
    and relations
    • Entity
    Task, Method, Evaluation Metric, Material, Other Scientific Terms, Generic
    • Relation
    Used-for, Feature-of, Hyponym-of, Part-of, Compare, Conjunction
    [18] https://catalog.ldc.upenn.edu/LDC2005T09
    [19] https://catalog.ldc.upenn.edu/LDC2006T06
    [20] Luan+: Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction, EMNLP ‘18
    http://nlp.cs.washington.edu/sciIE/

  20. Experiments: Evaluation metrics
    • Micro F1 measure
    • Named entity recognition
    • correct if both the span boundaries and the predicted entity type are correct
    • Relation extraction
    • (1) Boundary evaluation (Rel): the boundaries of the two spans and the predicted relation type are correct
    • (2) Strict evaluation (Rel+): in addition to Rel, the predicted entity types must also be correct

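As a toy illustration of the difference between Rel and Rel+ (data structures and names are ours), a single predicted relation could be checked like this:

```python
def relation_correct(pred, pred_types, gold_relations, gold_types, strict=False):
    """pred: (subj_span, obj_span, relation_type), spans as (start, end) tuples.
    gold_relations: {(subj_span, obj_span): relation_type}; *_types: {span: entity_type}."""
    subj, obj, rel = pred
    if gold_relations.get((subj, obj)) != rel:
        return False          # span boundaries or relation type wrong
    if not strict:
        return True           # Rel: boundaries + relation type suffice
    # Rel+: the predicted entity types of both spans must also be correct
    return (pred_types.get(subj) == gold_types.get(subj)
            and pred_types.get(obj) == gold_types.get(obj))
```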

  21. Results

  22. Main results: Entity model
    • Entity model
    • Cross-sentence information is useful, and pre-trained transformer encoders can capture long-range dependencies from a large context
    (Table: entity F1 by encoder. Legend: L = LSTM, L+E = LSTM+ELMo, Bb = BERT-base, Bl = BERT-large, ALB = ALBERT-xxlarge-v1, SciB = SciBERT; table markers denote cross-sentence information and training with additional data, e.g., coreference)

  23. Main results: Relation model
    • Relation model
    • Learning distinct representations for the entities and relations of different entity pairs, together with early fusion of entity information into the relation model, is important
    (Table: relation F1 by encoder; legend as on slide 22)

  24. Main results: compared to the previous SOTA
    • Comparison with the previous SOTA models, without using cross-sentence context
    • This result clearly demonstrates the superiority of the proposed model
    (Table: comparison with the previous state of the art; legend as on slide 22)

  25. Analysis

  26. Analysis: Importance of Typed Text Markers
    Relation F1:
    ・ Typed markers (proposed method): 72.6%
    [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC] → PHYS
    ・ Untyped markers: 70.5%
    [S] Bill Smith [/S] was in the hotel [O] room [/O] → PHYS
    ・ Markers + entity auxiliary loss [15],[16]: 70.7%
    [S] Bill Smith [/S] was in the hotel [O] room [/O] → PHYS, with entity types (PER, FAC) predicted via an auxiliary loss
    ・ No marker: 67.6%
    Bill Smith was in the hotel room → PHYS
    [15] Wadden+: Entity, relation, and event extraction with contextualized span representations, EMNLP ‘19
    [16] Luan+: A general framework for information extraction using dynamic span graphs, NAACL ‘19


  27. Analysis: Modeling Entity-Relation Interactions
    • Does sharing the encoders help?
    • The two tasks have different input formats and require different features for predicting entity types and relations
    → Separate encoders indeed learn better task-specific features

  28. Approximation model with batch computations
    Input: Bill Smith was in the hotel room (Bill Smith = PER, hotel = FAC, room = FAC)
    ・ [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room
    ・ [S:PER] Bill Smith [/S:PER] was in the hotel [O:FAC] room [/O:FAC]
    ・ Bill Smith was in the [S:FAC] hotel [/S:FAC] [O:FAC] room [/O:FAC]
    ・ ……
    One shortcoming of this approach: the relation model has to be run once for every pair of entities

  29. Approximation model with batch computations
    Figure: instead of re-encoding the sentence for each pair, the markers for all pairs ([S:PER] [/S:PER] [O:FAC] [/O:FAC], ……) are appended to the end of the sentence, with their position embeddings tied to the corresponding span boundary tokens; all pairs in the same sentence are processed in one run of the relation model
    → 8-16x speedup with only a 1% accuracy drop

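A sketch of how the batched inputs could be assembled (all names are ours): it builds the tied position ids and a directional attention mask in which text tokens never attend to markers, while each pair's markers attend to the text and to each other; these tensors would then be passed to an encoder that accepts custom position ids and a full attention matrix:

```python
import torch

def build_batched_inputs(n_tokens, pairs):
    """pairs: list of ((s_start, s_end), (o_start, o_end)) inclusive token spans.
    Appends 4 markers per pair after the sentence."""
    total = n_tokens + 4 * len(pairs)
    position_ids = list(range(n_tokens))
    attend = torch.zeros(total, total, dtype=torch.bool)
    attend[:n_tokens, :n_tokens] = True  # text tokens attend only to the text
    for k, ((ss, se), (os_, oe)) in enumerate(pairs):
        slots = list(range(n_tokens + 4 * k, n_tokens + 4 * k + 4))
        position_ids += [ss, se, os_, oe]  # tie markers to the span boundary positions
        for m in slots:
            attend[m, :n_tokens] = True    # markers attend to the text ...
            attend[m, slots] = True        # ... and to the other markers of their pair
    return torch.tensor(position_ids), attend
```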

  30. Conclusion

  31. Conclusion
    • Proposed a simple and effective approach for named entity recognition and relation extraction
    • Learns two independent encoders
    • Inserts typed entity markers when training the relation model
    • An efficient approximation: 8-16x speedup with a small accuracy drop
    [S:PER] Bill Smith [/S:PER] was in the [O:FAC] hotel [/O:FAC] room