F. Liu, J. Flanigan, S. Thomson, N. Sadeh, N. Smith
Presented by: Mahak Gupta
Date: 25th May 2016
* North American Chapter of the Association for Computational Linguistics – Human Language Technologies
• Abstract Meaning Representation (AMR): a semantic representation of a language
• A semantic-representation approach to summarization
• Graphs and Integer Linear Programming
• Some mathematical equations (sorry about that; we won't go too deep into them). Even so, I'll try to make this an interesting session for the next 45-50 minutes :)
• Summarization: shortening a text document with a computer program in order to create a summary that retains the most important points of the original text.
• Summarization applications:
• News summaries
• Reviews
§ Single Document Summarization:
• Given a single document, produce:
• Abstract
• Outline
• Headline
§ Multiple Document Summarization:
• Given a group of documents, produce a gist of the documents
• A series of news stories on the same event
• A set of webpages about a topic
Stanford NLP Course
§ Generic Summarization
• Summarize the content of a document
§ Query-focused Summarization
• Summarize a document with respect to an information need expressed in a user query
Stanford NLP Course
§ Extractive Summarization
• Create the summary from phrases or sentences in the source document(s)
§ Abstractive Summarization
• Express the ideas in the source document using different words
Stanford NLP Course
Where do we use summarization in day-to-day life?
• Movie reviews to a friend
• Minutes of a meeting (sharing info with colleagues)
• Sharing reviews of a book
• Last-night studying before an exam ;)
• A semantic representation is an abstract language in which meanings can be represented.
• Are you aware of any other representations in semantics? One is first-order predicate logic.
• In this paper we'll focus mainly on Abstract Meaning Representation (AMR) as the semantic representation.
• AMRs are rooted, labeled graphs that are easy for people to read, and easy for programs to traverse.
• Example: "The dog was chasing a cat"
• Concepts (from PropBank*): chase-01, dog, cat
• Relations: ARG0, ARG1
• Compare first-order predicate logic: ∀x [dog(x) → ∃y [cat(y) ∧ chase(x, y)]]
* Proposition Bank – http://propbank.github.io
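In PENMAN notation, the standard textual serialization of AMR graphs, this example graph can be written as follows (a sketch; the variable names are arbitrary):

```
(c / chase-01
   :ARG0 (d / dog)
   :ARG1 (c2 / cat))
```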
Why semantic representation?
• Semantic representation in the human brain during listening and reading
• A continuous semantic space describes the representation of thousands of object and action categories across the human brain
Pipeline: AMR parsing → graph transformation → summary graph prediction → text generation
• Parser: JAMR – https://github.com/jflanigan/jamr (Flanigan et al., 2014)
• Graph transformation includes collapsing
Example:
• Sentence A: I saw Joe's dog, which was running in the garden.
• Sentence B: The dog was chasing a cat.
• Summary: Joe's dog was chasing a cat in the garden.
Graph transformation steps: Collapse, Graph Expansion, Concept Merging
• Sentence A: I saw Joe's dog, which was running in the garden.
• Sentence B: The dog was chasing a cat.
• Summary: Joe's dog was chasing a cat in the garden.
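A minimal sketch of the concept-merging step, assuming each sentence AMR is given as a set of (source, relation, target) triples; this triple format is illustrative, not JAMR's actual output:

```python
from collections import defaultdict

def merge_sentence_graphs(sentence_graphs):
    """Merge per-sentence AMR graphs into one source graph by collapsing
    nodes that share the same concept label, so e.g. the 'dog' node in
    sentence A and the 'dog' node in sentence B become a single node."""
    edges = set()
    edge_freq = defaultdict(int)  # how often each edge occurs; used as a feature later
    for graph in sentence_graphs:
        for edge in graph:
            edges.add(edge)
            edge_freq[edge] += 1
    return edges, edge_freq

# The two graphs connect through the shared "dog" concept:
sent_a = {("see-01", "ARG0", "i"), ("see-01", "ARG1", "dog"),
          ("dog", "poss", "Joe"), ("run-02", "ARG0", "dog"),
          ("run-02", "location", "garden")}
sent_b = {("chase-01", "ARG0", "dog"), ("chase-01", "ARG1", "cat")}
source_graph, freq = merge_sentence_graphs([sent_a, sent_b])
```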
Percentage of summary graph edges that can be covered by an automatically constructed source graph:

Summary Edge Coverage (%)
         Labeled   Unlabeled   Expand
Train     64.8      67.0       75.5
Dev.      77.3      78.6       85.4
Test      63.0      64.7       75.0
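A sketch of how such coverage numbers can be computed, assuming both graphs are given as sets of (source, label, target) triples as above:

```python
def edge_coverage(summary_edges, source_edges):
    """Percentage of gold summary-graph edges found in the source graph.
    Labeled coverage requires the relation label to match; unlabeled
    coverage only requires the two endpoint concepts to match."""
    labeled = sum(1 for e in summary_edges if e in source_edges)
    unlabeled_source = {(s, t) for s, _, t in source_edges}
    unlabeled = sum(1 for s, _, t in summary_edges if (s, t) in unlabeled_source)
    n = len(summary_edges)
    return 100.0 * labeled / n, 100.0 * unlabeled / n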
Goals: covering important content, maintaining brevity, and producing fluent language.
• Let G = (V, E) be a source graph.
• We want a subgraph G' = (V', E') that maximizes the objective function below.
• f(v) = feature vector for a vertex/concept; g(e) = feature vector for an edge.
• v_i = 1 if node i is included in the subgraph; e_{i,j} = 1 if the edge from node i to node j is included.
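Reconstructed from the definitions above, the objective is the following (a sketch; the paper's full ILP additionally uses constraints that keep the selected subgraph connected):

```latex
\max_{v,\, e}\; \sum_{i} \theta^{\top} f(v_i)\, v_i
             + \sum_{i,j} \psi^{\top} g(e_{i,j})\, e_{i,j}
\quad \text{s.t.} \quad
e_{i,j} \le v_i, \;\; e_{i,j} \le v_j, \;\;
v_i,\, e_{i,j} \in \{0, 1\}
```

The endpoint constraints simply say an edge can only be selected if both of its nodes are; θ and ψ are the learned weight vectors scoring node and edge features.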
NODE FEATURES
Concept    Concept frequency in the input sentence set
Depth      Average and smallest depth of the node to the root of the sentence graph
Position   Average and foremost position of sentences containing the concept
Span       Average and longest word span of the concept; binarized using 5 length thresholds
Entity     Two binary features indicating whether the concept is a named entity / date entity
Bias       Bias term, 1 for any node
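A sketch of a node feature extractor mirroring this table; the input format and the five span thresholds are assumptions for illustration, not the paper's exact values:

```python
from statistics import mean

def node_feature_vector(mentions, is_entity):
    """One feature dict per concept node. `mentions` has one entry per
    sentence containing the concept, with keys 'depth' (depth of the node
    in that sentence's AMR), 'position' (index of the sentence in the
    document), and 'span_len' (length of the aligned word span)."""
    feats = {
        "concept_freq": len(mentions),
        "depth_avg": mean(m["depth"] for m in mentions),
        "depth_min": min(m["depth"] for m in mentions),
        "position_avg": mean(m["position"] for m in mentions),
        "position_first": min(m["position"] for m in mentions),
        "span_avg": mean(m["span_len"] for m in mentions),
        "span_max": max(m["span_len"] for m in mentions),
        "entity": float(is_entity),
        "bias": 1.0,
    }
    # Binarize the longest span against 5 length thresholds (values assumed).
    for t in (1, 2, 3, 5, 10):
        feats[f"span_max>{t}"] = float(feats["span_max"] > t)
    return feats
```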
EDGE FEATURES
Label       Most frequent edge labels between the concepts
Freq        Edge frequency (without label, non-expanded edges) in the document sentences
Position    Average and foremost position of sentences containing the edge (without label)
Nodes       Node features extracted from the source and target nodes
IsExpanded  A binary feature indicating whether the edge is due to graph expansion
Bias        Bias term, 1 for any edge
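With node and edge scores in hand (the dot products θᵀf(v) and ψᵀg(e)), decoding can be posed as a small integer linear program. A minimal sketch using the PuLP library; it enforces only endpoint consistency and a size budget, not the paper's full connectivity constraints:

```python
# pip install pulp
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

def decode(node_scores, edge_scores, max_edges):
    """node_scores: {concept: score}; edge_scores: {(i, j): score}, where
    i and j are keys of node_scores. Returns selected nodes and edges."""
    prob = LpProblem("summary_subgraph", LpMaximize)
    v = {n: LpVariable(f"v_{k}", cat=LpBinary)
         for k, n in enumerate(node_scores)}
    e = {ij: LpVariable(f"e_{k}", cat=LpBinary)
         for k, ij in enumerate(edge_scores)}

    # Objective: total score of the selected nodes plus selected edges.
    prob += (lpSum(node_scores[n] * v[n] for n in v)
             + lpSum(edge_scores[ij] * e[ij] for ij in e))

    # An edge may be selected only if both of its endpoints are selected.
    for (i, j), var in e.items():
        prob += var <= v[i]
        prob += var <= v[j]

    # Brevity: cap the number of edges in the summary graph.
    prob += lpSum(e.values()) <= max_edges

    prob.solve()
    return ({n for n, var in v.items() if var.value() == 1},
            {ij for ij, var in e.items() if var.value() == 1})
```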
• The output of generation is a bag of words.
• Given a predicted subgraph, a system summary is created by finding the most frequently aligned word span for each concept node (JAMR provides these alignments).
• Words in the resulting spans are generated in no particular order.
• This is not a natural-language summary, but it is suitable for unigram-based evaluation methods such as ROUGE-1 (unigram overlap).
• Overall, generating fluent text from AMR is still an open research problem.
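A sketch of this bag-of-words generation step, assuming the concept-to-span alignments are available as a simple dict (the format is assumed, not JAMR's actual output):

```python
from collections import Counter

def bag_of_words_summary(selected_concepts, alignments):
    """For each selected concept, emit the word span it is most frequently
    aligned to; the resulting words carry no order, which is acceptable
    for unigram metrics such as ROUGE-1."""
    words = []
    for concept in selected_concepts:
        spans = alignments.get(concept, [])
        if spans:
            best_span, _ = Counter(spans).most_common(1)[0]
            words.extend(best_span.split())
    return words

# e.g. bag_of_words_summary({"dog", "chase-01"},
#                           {"dog": ["dog", "dog"], "chase-01": ["chasing"]})
# -> ['dog', 'chasing'] (order not meaningful)
```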
• Exploring a full-fledged pipeline that consists of an automatic AMR parser, a graph-to-graph summarizer, and an AMR-to-text generator
• Devising an evaluation metric better suited to abstractive summarization
• Tense prediction in AMRs
• Semantic representations
• A semantic-representation approach to abstractive summarization
• An interesting approach that takes the meaning representation of a language into account, but it doesn't yet generalize well to unseen data (perhaps because the idea is still fairly new).
• Other graph approaches scale better and are now evaluated with ROUGE-2, -3, and -4 scores, i.e., bigrams, trigrams, etc. (e.g., Opinosis: A Graph-Based Approach to Abstractive Summarization of Highly Redundant Opinions).
1. We've discussed that extractive summarization is a fairly simple approach. What, then, is the motivation for abstractive summarization?
2. What is ROUGE-1? Why is it discussed in this paper?
• A Discriminative Graph-Based Parser for the Abstract Meaning Representation (Flanigan et al., 2014)
• Opinosis: A Graph-Based Approach to Abstractive Summarization of Highly Redundant Opinions
• Stanford NLP Course