Slide 1

Slide 1 text

EXPLOITING DISCOURSE ANALYSIS FOR ARTICLE-WIDE TEMPORAL CLASSIFICATION
Jun-Ping Ng, National University of Singapore
Min-Yen Kan, National University of Singapore
Ziheng Lin, SAP Asia Pte. Ltd.
Vanessa Wei Feng, University of Toronto
Bin Chen, Institute for Infocomm Research
Jian Su, Institute for Infocomm Research
Chew-Lim Tan, National University of Singapore
Wednesday, October 2, 13

Slide 2

Slide 2 text

TEMPORAL CLASSIFICATION
At least 19 people were killed and 114 people were wounded in Tuesday’s southern Philippines airport blast, officials said, but reports said the death toll could climb to 30.

Slide 7

Slide 7 text

TEMPORAL CLASSIFICATION
At least 19 people were killed and 114 people were wounded in Tuesday’s southern Philippines airport blast, officials said, but reports said the death toll could climb to 30.
Relations marked between event pairs: OVERLAP, BEFORE, BEFORE

Slide 8

Slide 8 text

USING TEMPORAL RELATIONS
• Ordering sentences in text summarization (Barzilay et al., 2002)
• Temporal information extraction (Ling and Weld, 2010; Ji et al., 2011)

Slide 9

Slide 9 text

OUTLINE
• Motivation for Discourse Analysis
• Building Features Out of Discourse Analysis
• Experiments and Results
• Discussion

Slide 10

Slide 10 text

CURRENT WORK
• Focus is on event pairs within a sentence, or in adjacent sentences
• Centers around lexico-syntactic features

Slide 11

Slide 11 text

BEYOND SENTENCE BOUNDARIES
[Figure: events A, B, C, D and E spread across sentences s1, s2, s3 and s4]

Slide 12

Slide 12 text

BEYOND SURFACE FEATURES
(1) Max opened the door. The room was pitch dark.
(2) Max switched off the light. The room was pitch dark.
(Example quoted from Lascarides and Asher (1993))

Slide 13

Slide 13 text

WHY DISCOURSE?
• Understand how sentences relate to one another
• Rich variety of studies on discourse

Slide 14

Slide 14 text

FRAMEWORKS
• Rhetorical Structure Theory (RST)
• Penn Discourse Treebank (PDTB) Discourse Relations
• Topical Text Segmentation

Slide 16

Slide 16 text

RST
Max switched off the light. The room was pitch dark.
[RST tree: "Max switched off the light." and "The room was pitch dark." joined by a RESULT relation]

Slide 18

Slide 18 text

PDTB-STYLED DISCOURSE RELATIONS
Max switched off the light. The room was pitch dark.
[PDTB relation: CONTINGENCY :: CAUSE, with the two sentences as arg1 and arg2]

Slide 19

Slide 19 text

TOPICAL TEXT SEGMENTATION
• Group sentences about a similar topic together
• Transitioning across groups of sentences logically represents a shift in the topic being discussed
• Coarse-grained discourse analysis

Slide 20

Slide 20 text

TOPICAL TEXT SEGMENTATION
The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals.

Slide 25

Slide 25 text

OUTLINE
• Motivation for Discourse Analysis
• Building Features Out of Discourse Analysis
• Experiments and Results
• Discussion

Slide 26

Slide 26 text

CLASSIFYING TEMPORAL RELATIONS
[Pipeline diagram: Identify events within article -> RST Parsing, PDTB Parsing, Text Segmentation -> Feature Generation -> Convolution Kernel with SVM -> Temporal Relations]

Slide 34

Slide 34 text

RST FEATURE
[RST tree over the sentence "A powerful …. shed at the …. airport at … (0915 GMT) while … bus terminal at the city."; relation labels in the tree: elaboration, elaboration, elaboration, conjunction; highlighted labels: conjunction, elaboration]
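The slide highlights a pair of relation labels taken from the RST tree. One plausible reading, and it is only an assumption here, is that the feature is built from the relation labels that connect the text spans holding the two events. The sketch below reads those labels off a tree via the lowest common ancestor of two leaves. The tree shape, the leaf names (edu1 to edu4) and the function names are all invented for illustration and are not taken from the slides.

```python
def index_path(tree, leaf, prefix=()):
    """Return the child indices leading from the root to `leaf`, or None."""
    if isinstance(tree, str):
        return prefix if tree == leaf else None
    for i, child in enumerate(tree[1:]):
        found = index_path(child, leaf, prefix + (i,))
        if found is not None:
            return found
    return None


def labels_along(node, path):
    """Relation labels of the internal nodes visited while following `path`."""
    labels = []
    for i in path:
        labels.append(node[0])
        node = node[1 + i]
    return labels


def connecting_labels(tree, leaf_a, leaf_b):
    """Return (label at lowest common ancestor, labels down to a, labels down to b)."""
    if leaf_a == leaf_b:
        raise ValueError("need two different leaves")
    path_a, path_b = index_path(tree, leaf_a), index_path(tree, leaf_b)
    if path_a is None or path_b is None:
        raise ValueError("leaf not found in tree")
    # Length of the shared prefix locates the lowest common ancestor.
    k = 0
    while k < min(len(path_a), len(path_b)) and path_a[k] == path_b[k]:
        k += 1
    lca = tree
    for i in path_a[:k]:
        lca = lca[1 + i]
    down_a = labels_along(lca, path_a[k:])
    down_b = labels_along(lca, path_b[k:])
    return down_a[0], down_a[1:], down_b[1:]


# Hypothetical tree: internal nodes are (relation, child, ...), leaves are strings.
example_tree = (
    "conjunction",
    ("elaboration", ("elaboration", "edu1", "edu2"), "edu3"),
    "edu4",
)
feature = connecting_labels(example_tree, "edu3", "edu4")
```

Here `feature` holds the relation at the common ancestor plus the labels passed on the way down to each span; how the actual system encodes this for the kernel is not stated on the slide.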

Slide 37

Slide 37 text

PDTB FEATURE
[PDTB relation over "A powerful bomb ….… at about 5.15 pm (0915 GMT)" and "another explosion …. at the city"; connective: while; relation: synchrony; highlighted label: synchrony]

Slide 39

Slide 39 text

TEXT SEGMENTATION FEATURE
The Davao Medical Centre, a regional government hospital, recorded 19 deaths with 50 wounded. A powerful bomb tore through a waiting shed at the Davao City international airport at about 5.15 pm (0915 GMT) while another explosion hit a bus terminal at the city. Medical evacuation workers however said the injured list was around 114, spread out at various hospitals.
Segments marked on the text: 1, 2

Slide 41

Slide 41 text

CONVOLUTION KERNELS
• Allow us to capture structural similarities easily
• Tree structure used as feature for support vector machines (SVM)
• No need to flatten structure representation into a set of real number features

Slide 43

Slide 43 text

STRUCTURAL SIMILARITY
Tree 1: (S NP VP PP)
Tree 2: (S NP VP)
Fragments of tree 1: (S NP), (S VP), (S PP)
Fragments of tree 2: (S NP), (S VP)
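The comparison on this slide can be sketched in a few lines. The code below is a deliberately simplified stand-in, not the kernel used in the talk: it breaks each tree into parent-child fragments and takes the dot product of the fragment counts, which is the basic idea behind convolution kernels over trees. The tuple encoding of trees and the function names are assumptions made for this example.

```python
from collections import Counter


def fragments(tree):
    """Count (parent label, child label) fragments in a (label, child, ...) tuple tree."""
    counts = Counter()
    label, *children = tree
    for child in children:
        counts[(label, child[0])] += 1
        counts.update(fragments(child))
    return counts


def fragment_kernel(tree_a, tree_b):
    """Dot product of the two trees' fragment-count vectors."""
    counts_a, counts_b = fragments(tree_a), fragments(tree_b)
    return sum(counts_a[f] * counts_b[f] for f in counts_a)


# The two trees from the slide, with childless nodes as 1-tuples.
tree_1 = ("S", ("NP",), ("VP",), ("PP",))
tree_2 = ("S", ("NP",), ("VP",))

shared = fragment_kernel(tree_1, tree_2)  # fragments (S NP) and (S VP) match
```

Because the value is a dot product of count vectors, it can be handed to an SVM as a kernel without ever writing the trees out as fixed-length feature vectors.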

Slide 44

Slide 44 text

OUTLINE
• Motivation for Discourse Analysis
• Building Features Out of Discourse Analysis
• Experiments and Results
• Discussion

Slide 45

Slide 45 text

DATA SET
• Subset of 20 newswire articles from ACE 2005 corpus
• Annotated with 375 event pairs
• Applied temporal transitivity rules to obtain 7994 event pairs
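The slide does not list the transitivity rules that were applied, so the following is only a minimal sketch of how a small set of annotated pairs can be expanded. It assumes just three things: BEFORE is transitive, AFTER is the inverse of BEFORE, and OVERLAP is symmetric and is not chained with other relations. The event names and the function name are hypothetical.

```python
from itertools import product

# Assumed inverse of each label when the pair is read in the other direction.
INVERSE = {"BEFORE": "AFTER", "AFTER": "BEFORE", "OVERLAP": "OVERLAP"}


def close_under_transitivity(pairs):
    """Expand {(event_a, event_b): label} with inverses and BEFORE chains."""
    relations = dict(pairs)
    # Make every annotated pair readable in both directions.
    for (a, b), label in list(relations.items()):
        relations.setdefault((b, a), INVERSE[label])

    changed = True
    while changed:
        changed = False
        before = [pair for pair, label in relations.items() if label == "BEFORE"]
        # a BEFORE b and b BEFORE d implies a BEFORE d (assumed rule).
        for (a, b), (c, d) in product(before, before):
            if b == c and a != d and (a, d) not in relations:
                relations[(a, d)] = "BEFORE"
                relations[(d, a)] = "AFTER"
                changed = True
    return relations


seed = {
    ("e1", "e2"): "BEFORE",
    ("e2", "e3"): "BEFORE",
    ("e3", "e4"): "OVERLAP",
}
closed = close_under_transitivity(seed)
```

In this toy run the pair (e1, e3) is inferred as BEFORE, while nothing is inferred for (e1, e4) because OVERLAP is not chained under the stated assumptions.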

Slide 47

Slide 47 text

DATA SET BREAKDOWN
Class      Share of E-E Pairs
AFTER      45%
BEFORE     45%
OVERLAP    10%
[Figure: number of E-E pairs (0 to 600) plotted against sentence gap (0 to 40) for AFTER, BEFORE and OVERLAP]

Slide 48

Slide 48 text

RESULTS
#   System                                     P      R      F1
1   Do2012                                     43.86  52.65  47.46
2   Base                                       59.55  38.14  46.50
3   Base + RST + PDTB + TopicSeg               71.89  41.99  53.01
4   Base + RST + PDTB + TopicSeg + Coref       75.23  43.58  55.19
5   Base + ORST + PDTB + OTopicSeg + OCoref    78.35  54.24  64.10
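As a quick worked check of how the columns relate, the sketch below recomputes F1 for systems 2 to 5 as the harmonic mean of the reported P and R. The slide does not say how the scores were averaged, so treating F1 as a plain harmonic mean of the two reported columns is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) as reported on the slide for systems 2 to 5.
reported = {
    "Base": (59.55, 38.14),
    "Base + RST + PDTB + TopicSeg": (71.89, 41.99),
    "Base + RST + PDTB + TopicSeg + Coref": (75.23, 43.58),
    "Base + ORST + PDTB + OTopicSeg + OCoref": (78.35, 54.24),
}

recomputed = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
```

For these four rows the recomputed values agree with the F1 column to two decimal places.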

Slide 49

Slide 49 text

ARE RELATIONS USEFUL?
[Figure: percentage (%) of OVERLAP, BEFORE and AFTER event pairs for each discourse relation: RST Manner-Means, RST Condition, PDTB Conjunction, PDTB Concession, PDTB Condition, PDTB Synchrony]

Slide 51

Slide 51 text

TOP FRAGMENTS - “BEFORE”
1. (Temporal ...
2. (Temporal (Elaboration ...
3. (Condition (Explanation ...
4. (Condition (Attribution ...
5. (Elaboration (Background ...

Slide 52

Slide 52 text

RST “TEMPORAL” RELATION
Milosevic and his wife wielded enormous power in Yugoslavia for more than a decade before he was swept out of power after a popular revolt in October 2000.
[RST tree over the spans "Milosevic … wielded… a decade before.. swept out.. power after a… October 2000.", with two temporal relations]

Slide 53

Slide 53 text

CONFUSION MATRIX
(rows: actual class; columns: predicted class)
Actual    O       B       A       N
O         14.7%   14.1%   12.8%   58.5%
B         0.5%    57.9%   15.5%   26.0%
A         0.5%    15.7%   57.3%   26.5%

Slide 56

Slide 56 text

DOING BETTER
• Investigate how to exploit other aspects of the discourse structures we have
• Integrate features with joint-inference approaches for better overall results

Slide 57

Slide 57 text

CONCLUSION
• Discourse analysis is important for temporal classification
• Rich structural information can be robustly captured with convolution kernels
• Improvements to discourse parses will help improve results further

Slide 58

Slide 58 text

THANK YOU!
Questions?

Slide 60

Slide 60 text

ABLATION TESTS
Ablated Feature    Change in F1
- RST              -9.03
- TopicSeg         -2.98
- Coref            -2.18
- PDTB             -1.42
All differences are statistically significant (p < 0.05 or better).

Slide 61

Slide 61 text

COMPLEMENTARY RST & PDTB
RST                                       PDTB
Potentially more rigid                    More flexible in linking up text
Able to link up all text in an article    Does not build a global discourse tree