
Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based Graph Parsing

wing.nus
November 14, 2022


Talk abstract: In this talk, I would like to introduce this COLING-2022 paper. Besides the details of the work itself, I also want to discuss the stories behind the work: for example, how the idea came about, how we developed it gradually, and how we responded to the negative comments in the previous two submission rounds. I also want to discuss with the audience how our work may be improved, and I look forward to new ideas.

Paper abstract: This paper proposes to cast end-to-end span-based SRL as a word-based graph parsing task. The major challenge is how to represent spans at the word level. Borrowing ideas from research on Chinese word segmentation and named entity recognition, we propose and compare four different schemata of graph representation, i.e., BES, BE, BIES, and BII, among which we find that the BES schema performs the best. We further gain interesting insights through detailed analysis. Moreover, we propose a simple constrained Viterbi procedure to ensure the legality of the output graph according to the constraints of the SRL structure. We conduct experiments on two widely used benchmark datasets, i.e., CoNLL05 and CoNLL12. Results show that our word-based graph parsing approach achieves consistently better performance than previous results, under all settings of end-to-end and predicate-given, without and with pre-trained language models (PLMs). More importantly, our model can parse 669/252 sentences per second, without and with PLMs respectively.

Bio: Zhenghua Li is a full professor at Soochow University, Suzhou, China. Zhenghua received his Bachelor's, Master's, and PhD degrees in computer science from Harbin Institute of Technology, Harbin, China, in 2006, 2008, and 2013, respectively. He joined Soochow University afterward. So far, he has published 40+ top-tier conference papers in the NLP/AI fields, including 9 ACL long papers, as the first author or the corresponding author. His NLPCC-2020 paper won the best paper award, and his COLING-2022 paper won the best long paper award. Besides writing papers, he has actively participated in shared tasks and competitions and has won first place multiple times, including tasks on syntactic parsing (CoNLL-2009), semantic parsing (SemEval-2019 UCCA, CoNLL-2019 EDS), and grammatical error correction (GEC) (CTC-2021, CGED-2021, WAIC-2022). Meanwhile, he has been highly interested in and constantly engaged in constructing high-quality datasets for research on syntactic parsing (CODT), semantic parsing (MuCPAD), text-to-SQL parsing (DuSQL, SeSQL), and GEC (MuGEC). As PI, he has been granted three NSFC projects, and he has collaborated closely with Alibaba, Huawei, and Baidu during the past three years. As a graduate student supervisor, he has supervised 3 students to their PhD degrees and about 15 students to their Master's degrees. Two of them teach and do research at universities, and many work at top-tier IT companies in China.




Transcript

  1. Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based

    Graph Parsing {slzhou.cs, yzhang.cs}@outlook.com; [email protected]; {zhli13,hongyu,minzhang}@suda.edu.cn COLING-2022 Soochow University 2022/11/11 Shilin Zhou, Qingrong Xia, Zhenghua Li, Yu Zhang, Yu Hong, Min Zhang
  2. Rejection experience • 2021.08 AAAI submission • 2022.01 ARR submission;

    committed to NAACL-2022 • 2022.05 COLING submission (substantial improvement)
  3. Two formalisms of semantic role labeling (SRL) Word-based

    SRL (Since CoNLL-2008) Span-based SRL Sense (frame) ID: want.01
  4. Two formalisms of semantic role labeling (SRL) Word-based SRL

    (Since CoNLL-2008) Dependency tree. Picture from (Zhang+, ACL-2020)
  5. Frame (word sense) Word sense (frame) ID: want.01

  6. (1) BIO-based approach • First identify predicates; then recognize the

    arguments of each predicate independently. • Cons: multiple rounds of encoding and decoding
  7. (2) Span-based graph parsing approach (Luheng He et al. ACL-2018)

    • Cons: the search space is large (O(n³)) • Heuristic pruning strategies [Figure: candidate argument spans over "They want to do more …"]
  8. How did we start this work? • A failed submission: EMNLP-2021

    short paper by Qingrong Xia (now Ph.D. at Huawei Cloud) A complex TreeCRF-based syntactic parser can process over 1K sentences per second (Zhang+ ACL-2020)!
  9. So, can we develop a faster SRL approach with satisfactory

    accuracy?
  10. Picture from Hao Peng et al. (ACL-2017) Inspired by works

    on word-based semantic dependency graph parsing (SDGP) • Dozat and Manning (ACL-2018) • First-order, local loss • Wang+ (ACL-2019) • Second-order • MFVI
  11. Our idea: word-based graph • Can we convert span-based graphs

    into word-based? • span-level nodes into word-level nodes
  12. Our idea: word-based graph • Naturally, we think of char-based

    named entity recognition. Example (the Chinese characters are garbled in extraction): a six-character sentence meaning "He is Zhenghua Ouyang", with a four-character PER name, tagged under different schemata:
    BII: O O B-PER I-PER I-PER I-PER (cf. BIS)
    BIE: O O B-PER I-PER I-PER E-PER (cf. BIES)
    BE: O O B-PER O O E-PER (cf. BES)
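The span-to-word conversion behind the BES schema can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's code; the function name and the (start, end, role) span format are assumptions.

```python
def spans_to_bes(n_words, spans):
    """Convert labeled argument spans into word-level BES labels.

    spans: list of (start, end, role) tuples with inclusive word indices.
    A single-word span becomes S-role; a multi-word span marks only its
    boundaries (B-role on the first word, E-role on the last), which is
    what makes the representation word-based and compact.
    """
    labels = ["O"] * n_words
    for start, end, role in spans:
        if start == end:
            labels[start] = "S-" + role
        else:
            labels[start] = "B-" + role
            labels[end] = "E-" + role
    return labels
```

For "Some students want to do more" with A0 = "Some students" and A1 = "to do more", this yields B-A0 E-A0 O B-A1 O E-A1.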
  13. The BII schema for graph representation I: attach all inside/ending

    words. Notice the ending period.
  14. The BE schema for graph repr. • E: attach the

    ending word
  15. BII vs. BE (before 2022.01)

  16. BII vs. BE (before 2022.01)

  17. The BIES schema for graph repr. • E: ending words

    • I: inside words • S: single-word arguments
  18. The BES schema for graph repr.

  19. So far, we are done with the schemata.

  20. The two-stage parsing approach • First predict the edges: p(True|i,j)

    = sigmoid( s(i,j) ) • Then predict the labels.
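The two-stage idea above can be sketched in numpy. A biaffine-style scorer is assumed here for concreteness; the names (h_pred, h_arg, W_edge, W_label) and shapes are illustrative, not the paper's exact model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_edges_and_labels(h_pred, h_arg, W_edge, W_label, threshold=0.5):
    """Two-stage prediction: an edge (i, j) exists iff
    sigmoid(s(i, j)) > threshold; a label is then chosen by a
    separate scorer, independently of edge existence.

    h_pred, h_arg: (n, d) word representations for the predicate
    and argument roles of each word.
    W_edge: (d, d) edge-scoring matrix; W_label: (L, d, d), one
    matrix per label.
    """
    # s(i, j): biaffine edge score between word i and word j
    s_edge = h_pred @ W_edge @ h_arg.T            # (n, n)
    edges = sigmoid(s_edge) > threshold           # boolean adjacency
    # one score per (i, j, label) triple
    s_label = np.einsum("id,ldk,jk->ijl", h_pred, W_label, h_arg)
    labels = s_label.argmax(-1)                   # best label per pair
    return edges, labels
```

Keeping the edge and label stages separate matches the slide's factorization: the binary existence decision uses a sigmoid, while the label is a per-pair classification applied only where an edge was predicted.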
  21. The model (edge prediction)

  22. MFVI (mean-field variational inference) = p(True|i,j)

  23. The training loss • First predict the edges: p(True|i,j) =

    sigmoid( · ) • Then predict the labels.
  24. One remaining issue: label conflict An example under

    the BES schema Another conflict type: (B-A0 E-A1)
  25. One remaining issue: label conflict An example under

    the BES schema Another conflict type: (B-A0 E-A1) Designing rules to handle this?
  26. A more elegant way to handle label conflict

    Example label sequence under BES: B-A0 E-A0 O B-A1 I E-A1 O. Label set: B-*, E-*, S-*, I, O. Note: the I label is indispensable.
  27. Constrained Viterbi decoding Note: the I label is

    indispensable. [Figure: decoding lattice over "Some students want to do …" with candidate labels per word, e.g. B-A0, S-A0, O, E-A0, B-A1]
  28. Constrained Viterbi decoding Note: the I label is

    indispensable. [Same lattice as the previous slide] How do we get the emission probs?
  29. REUSE the probs Note: the I label is

    indispensable. [Same lattice as the previous slides] How do we get the emission probs?
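The constrained decoding can be sketched as a standard Viterbi pass with a hard transition mask. The label inventory and legality matrix in the test below are a toy illustration; the real constraints follow the BES rules in the paper, and the emission probabilities are simply the per-word label probabilities the model already produced (the "reuse" on this slide).

```python
import numpy as np

def constrained_viterbi(emissions, allowed):
    """Viterbi decoding over per-word label probabilities with a
    hard transition mask.

    emissions: (n, L) per-word label probabilities, reused from the
    model's label classifier.
    allowed: (L, L) boolean matrix; allowed[a, b] is True iff label b
    may follow label a. Illegal transitions score -inf, so the best
    path is always a legal label sequence.
    """
    n, L = emissions.shape
    log_e = np.log(emissions + 1e-12)
    trans = np.where(allowed, 0.0, -np.inf)   # hard constraints only
    score = log_e[0].copy()
    back = np.zeros((n, L), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + trans         # (L_prev, L_curr)
        back[t] = cand.argmax(0)
        score = cand.max(0) + log_e[t]
    # backtrack from the best final label
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Because illegal transitions are hard-masked rather than merely penalized, the decoder can override locally confident but conflicting labels, which is exactly the role the constrained Viterbi plays here.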
  30. Experiments • Benchmark datasets (English) • CoNLL05 (Palmer et al.,

    2005) • Larger-scale CoNLL12 (Pradhan et al., 2012)
  31. Comparing schemata (on CoNLL05, Brown as out-of-domain data)

    Clearly, the E label (only considering boundaries) is very useful!
  32. Comparing schemata (on CoNLL05, Brown as out-of-domain data)

    Clearly, the S label is also very useful! (Further improvement!) Seemingly, I is harmful. (Our thinking was not clear then.)
  33. Comparing schemata (on CoNLL05, Brown as out-of-domain data)

    Then, why not try BES??? (a few days before COLING deadline)
  34. Comparing schemata (on CoNLL05) BES > BE on

    single-word arguments. BE & BES > BIES & BII on multi-word arguments.
  35. Final results: better than previous results, and comparable

    with Zhang et al. (COLING-2022). [Look at the P/R values.]
  36. Efficiency comparison (our main motivation)

  37. Conclusions and findings • We propose a third approach for

    span-based SRL: word-based graph parsing • very efficient (fast) • higher accuracy than previous results • We design and compare different graph repr. schemata • with meaningful insights • We propose a simple and elegant way based on constrained Viterbi to handle conflicts in the output graphs.
  38. Thanks! Questions?