
Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based Graph Parsing

wing.nus
November 14, 2022


Talk abstract: In this talk, I would like to introduce this COLING-2022 paper. Besides the details of the work itself, I also want to discuss the stories behind the work: for example, how the idea came about, how we developed it gradually, and how we responded to the negative comments in the previous two submission rounds. I also want to discuss with the audience how our work may be improved, and I look forward to new ideas.

Paper abstract: This paper proposes to cast end-to-end span-based SRL as a word-based graph parsing task. The major challenge is how to represent spans at the word level. Borrowing ideas from research on Chinese word segmentation and named entity recognition, we propose and compare four different schemata of graph representation, i.e., BES, BE, BIES, and BII, among which we find that the BES schema performs the best. We further gain interesting insights through detailed analysis. Moreover, we propose a simple constrained Viterbi procedure to ensure the legality of the output graph according to the constraints of the SRL structure. We conduct experiments on two widely used benchmark datasets, i.e., CoNLL05 and CoNLL12. Results show that our word-based graph parsing approach achieves consistently better performance than previous results, under all settings of end-to-end and predicate-given, without and with pre-trained language models (PLMs). More importantly, our model can parse 669/252 sentences per second, without and with PLMs respectively.

Bio: Zhenghua Li is a full professor at Soochow University, Suzhou, China. Zhenghua received his Bachelor's, Master's, and PhD degrees in computer science from Harbin Institute of Technology, Harbin, China, in 2006, 2008, and 2013, respectively. He joined Soochow University afterward. So far, he has published 40+ top-tier conference papers in the NLP/AI fields, including 9 ACL long papers, as the first author or the corresponding author. His NLPCC-2020 paper won the best paper award, and his COLING-2022 paper won the best long paper award. Besides writing papers, he has actively participated in shared tasks and competitions and has won first place multiple times, including tasks on syntactic parsing (CoNLL-2009), semantic parsing (SemEval-2019 UCCA, CoNLL-2019 EDS), and grammatical error correction (GEC) (CTC-2021, CGED-2021, WAIC-2022). Meanwhile, he has been highly interested in and constantly engaged in constructing high-quality datasets for research on syntactic parsing (CODT), semantic parsing (MuCPAD), text-to-SQL parsing (DuSQL, SeSQL), and GEC (MuGEC). As PI, he has been granted three NSFC projects, and he has collaborated closely with Alibaba, Huawei, and Baidu during the past three years. As a graduate student supervisor, he has supervised 3 students to their PhD degrees and about 15 students to their Master's degrees. Two of them teach and do research at universities, and many work at top-tier IT companies in China.




Transcript

  1. Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based

    Graph Parsing {slzhou.cs, yzhang.cs}@outlook.com; [email protected]; {zhli13,hongyu,minzhang}@suda.edu.cn COLING-2022 Soochow University 2022/11/11 Shilin Zhou, Qingrong Xia, Zhenghua Li, Yu Zhang, Yu Hong, Min Zhang
  2. Rejection experience • 2021.08 AAAI submission • 2022.01 ARR submission;

    committed to NAACL-2022 • 2022.05 COLING submission (substantial improvement)
  3. Two formalisms of semantic role labeling (SRL) Word-based

    SRL (Since CoNLL-2008) Span-based SRL Sense (frame) ID: want.01
  4. Two formalisms of semantic role labeling (SRL) Word-based SRL

    (Since CoNLL-2008) Dependency tree. Picture from (Zhang+, ACL-2020)
  5. Frame (word sense) Word sense (frame) ID: want.01

  6. (1) BIO-based approach • First identify predicates; then recognize the

    arguments of each predicate independently. • Cons: multiple rounds of encoding and decoding
  7. (2) Span-based graph parsing approach (Luheng He et al. ACL-2018)

    • Cons: the search space is large (O(n³)) • Heuristic pruning strategies [Figure: candidate argument spans over "They want to do more …"]
  8. How did we start this work? • A failed submission: EMNLP-2021

    short paper by Qingrong Xia (now Ph.D. at Huawei Cloud) A complex TreeCRF-based syntactic parser can process over 1K sentences per second (Zhang+ ACL-2020)!
  9. So, can we develop a faster SRL approach with satisfactory

    accuracy?
  10. Picture from Hao Peng et al. (ACL-2017) Inspired by works

    on word-based semantic dependency graph parsing (SDGP) • Dozat and Manning (ACL-2018) • First-order, local loss • Wang+ (ACL-2019) • Second-order • MFVI
  11. Our idea: word-based graph • Can we convert span-based graphs

    into word-based? • span-level nodes into word-level nodes
  12. Our idea: word-based graph • Naturally, we think of char-based

    named entity recognition. Example (the Chinese characters are garbled in extraction): a six-character sentence meaning "He is Zhenghua Ouyang", with a four-character PER name, tagged under different schemata:
    BII: O O B-PER I-PER I-PER I-PER (cf. BIS)
    BIE: O O B-PER I-PER I-PER E-PER (cf. BIES)
    BE: O O B-PER O O E-PER (cf. BES)
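The span-to-word conversion behind the BES schema can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's code; the function name and the (start, end, role) span format are assumptions.

```python
def spans_to_bes(n_words, spans):
    """Convert labeled argument spans into word-level BES labels.

    spans: list of (start, end, role) tuples with inclusive word indices.
    A single-word span becomes S-role; a multi-word span marks only its
    boundaries (B-role on the first word, E-role on the last), which is
    what makes the representation word-based and compact.
    """
    labels = ["O"] * n_words
    for start, end, role in spans:
        if start == end:
            labels[start] = "S-" + role
        else:
            labels[start] = "B-" + role
            labels[end] = "E-" + role
    return labels
```

For "Some students want to do more" with A0 = "Some students" and A1 = "to do more", this yields B-A0 E-A0 O B-A1 O E-A1.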
  13. The BII schema for graph representation I: attach all inside/ending

    words. Notice the ending period.
  14. The BE schema for graph repr. • E: attach the

    ending word
  15. BII vs. BE (before 2022.01)

  16. BII vs. BE (before 2022.01)

  17. The BIES schema for graph repr. • E: ending words

    • I: inside words • S: single-word arguments
  18. The BES schema for graph repr.

  19. So far, we are done with the schemata.

  20. The two-stage parsing approach • First predict the edges: p(True|i,j)

    = sigmoid( s(i,j) ) • Then predict the labels.
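The two-stage idea above can be sketched in numpy. A biaffine-style scorer is assumed here for concreteness; the names (h_pred, h_arg, W_edge, W_label) and shapes are illustrative, not the paper's exact model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_edges_and_labels(h_pred, h_arg, W_edge, W_label, threshold=0.5):
    """Two-stage prediction: an edge (i, j) exists iff
    sigmoid(s(i, j)) > threshold; a label is then chosen by a
    separate scorer, independently of edge existence.

    h_pred, h_arg: (n, d) word representations for the predicate
    and argument roles of each word.
    W_edge: (d, d) edge-scoring matrix; W_label: (L, d, d), one
    matrix per label.
    """
    # s(i, j): biaffine edge score between word i and word j
    s_edge = h_pred @ W_edge @ h_arg.T            # (n, n)
    edges = sigmoid(s_edge) > threshold           # boolean adjacency
    # one score per (i, j, label) triple
    s_label = np.einsum("id,ldk,jk->ijl", h_pred, W_label, h_arg)
    labels = s_label.argmax(-1)                   # best label per pair
    return edges, labels
```

Keeping the edge and label stages separate matches the slide's factorization: the binary existence decision uses a sigmoid, while the label is a per-pair classification applied only where an edge was predicted.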
  21. The model (edge prediction)

  22. MFVI (mean-field variational inference) = p(True|i,j)

  23. The training loss • First predict the edges: p(True|i,j) =

    sigmoid( · ) • Then predict the labels.
  24. One remaining issue: label conflict An example under

    the BES schema Another conflict type: (B-A0 E-A1)
  25. One remaining issue: label conflict An example under

    the BES schema Another conflict type: (B-A0 E-A1) Designing rules to handle this?
  26. A more elegant way to handle label conflict

    Example label sequence under BES: B-A0 E-A0 O B-A1 I E-A1 O. Label set: B-*, E-*, S-*, I, O. Note: the I label is indispensable.
  27. Constrained Viterbi decoding Note: the I label is

    indispensable. [Figure: decoding lattice over "Some students want to do …" with candidate labels per word, e.g. B-A0, S-A0, O, E-A0, B-A1]
  28. Constrained Viterbi decoding Note: the I label is

    indispensable. [Same lattice as the previous slide] How do we get the emission probs?
  29. REUSE the probs Note: the I label is

    indispensable. [Same lattice as the previous slides] How do we get the emission probs?
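The constrained decoding can be sketched as a standard Viterbi pass with a hard transition mask. The label inventory and legality matrix in the test below are a toy illustration; the real constraints follow the BES rules in the paper, and the emission probabilities are simply the per-word label probabilities the model already produced (the "reuse" on this slide).

```python
import numpy as np

def constrained_viterbi(emissions, allowed):
    """Viterbi decoding over per-word label probabilities with a
    hard transition mask.

    emissions: (n, L) per-word label probabilities, reused from the
    model's label classifier.
    allowed: (L, L) boolean matrix; allowed[a, b] is True iff label b
    may follow label a. Illegal transitions score -inf, so the best
    path is always a legal label sequence.
    """
    n, L = emissions.shape
    log_e = np.log(emissions + 1e-12)
    trans = np.where(allowed, 0.0, -np.inf)   # hard constraints only
    score = log_e[0].copy()
    back = np.zeros((n, L), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + trans         # (L_prev, L_curr)
        back[t] = cand.argmax(0)
        score = cand.max(0) + log_e[t]
    # backtrack from the best final label
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Because illegal transitions are hard-masked rather than merely penalized, the decoder can override locally confident but conflicting labels, which is exactly the role the constrained Viterbi plays here.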
  30. Experiments • Benchmark datasets (English) • CoNLL05 (Palmer et al.,

    2005) • Larger-scale CoNLL12 (Pradhan et al., 2012)
  31. Comparing schemata (on CoNLL05, Brown as out-of-domain data)

    Clearly, the E label (only considering boundaries) is very useful!
  32. Comparing schemata (on CoNLL05, Brown as out-of-domain data)

    Clearly, the S label is also very useful! (Further improvement!) Seemingly, I is harmful. (Our thinking was not clear then.)
  33. Comparing schemata (on CoNLL05, Brown as out-of-domain data)

    Then, why not try BES??? (a few days before COLING deadline)
  34. Comparing schemata (on CoNLL05) BES > BE on

    single-word arguments. BE & BES > BIES & BII on multi-word arguments.
  35. Final results: better than previous results, and comparable

    with Zhang et al. (COLING-2022). [Look at the P/R values.]
  36. Efficiency comparison (our main motivation)

  37. Conclusions and findings • We propose a third approach for

    span-based SRL: word-based graph parsing • very efficient (fast) • higher accuracy than previous results • We design and compare different graph repr. schemata • with meaningful insights • We propose a simple and elegant way based on constrained Viterbi to handle conflicts in the output graphs.
  38. Thanks! Questions?