
Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization

Overwhelming amounts of dialogue are being recorded in instant messaging apps, Slack channels, customer service interactions, and so on. Wouldn't it be great if automated tools could provide us with short, succinct summaries?
However, challenges such as insufficient training data and low information density impede our ability to train abstractive summarization models. In this work, we propose a novel curriculum-based prompt learning method with self-training to address these problems. Specifically, prompts are learned using a curriculum learning strategy that gradually increases the degree of prompt perturbation, thereby improving the dialogue understanding and modeling capabilities of our model. Unlabeled dialogue is incorporated by means of self-training so as to reduce the dependency on labeled data. We further investigate topic-aware prompts to better plan for the generation of summaries. Experiments confirm that our model substantially outperforms strong baselines and achieves new state-of-the-art results on the AMI and ICSI datasets.
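The self-training component mentioned above can be sketched as a standard teacher-student loop. This is a minimal illustration, not the paper's exact procedure; `teacher` and `retrain` are hypothetical stand-ins for a trained summarizer and a fine-tuning step:

```python
def self_train(summarize, retrain, labeled, unlabeled, rounds=2):
    """Minimal self-training loop (sketch): a teacher model writes
    pseudo-summaries for unlabeled dialogues, and a student is retrained
    on the union of gold and pseudo-labeled pairs."""
    model = summarize
    for _ in range(rounds):
        pseudo = [(dlg, model(dlg)) for dlg in unlabeled]  # pseudo-labels
        model = retrain(labeled + pseudo)                  # new student
    return model

# Toy usage: the "model" is just a function; retrain memorizes pairs.
def teacher(dlg):
    return dlg.split()[0]          # naive placeholder summarizer

def retrain(pairs):
    table = dict(pairs)
    return lambda dlg: table.get(dlg, dlg.split()[0])

student = self_train(teacher, retrain, [("a b", "a")], ["x y"], rounds=1)
```

In practice both roles are played by the same summarization model, with the pseudo-labeled dialogues reducing the dependency on scarce gold summaries.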

Published at EMNLP 2022
Paper: http://gerard.demelo.org/papers/dialogue-summarization.pdf

Gerard de Melo

December 19, 2022

Transcript

  1. Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization. Changqun Li¹, Linlin Wang¹, Xin Lin¹, Gerard de Melo², Liang He¹. ¹ East China Normal University, ² Hasso Plattner Institute / University of Potsdam
  2. Conventional Transformer-based models (e.g., BART). Prior work: incorporating dialogue characteristics (e.g., dialogue acts, discourse, topic segments); hierarchical architectures for long dialogues. Challenge 1: key information scattered across utterances (by different participants). Challenge 2: topic drift. (Bike image adapted from https://www.flickr.com/photos/swambo/14119129185)
  3. Conventional Transformer-based models (e.g., BART). Prior work: incorporating dialogue characteristics (e.g., dialogue acts, discourse, topic segments); hierarchical architectures for long dialogues. Challenge 1: key information scattered across utterances (by different participants). Challenge 2: topic drift. Challenge 3: insufficient training data (e.g., just 137 meetings). (Bike image adapted from https://www.flickr.com/photos/swambo/14119129185)
  4. Key ideas: 1) custom prompt-based learning for better dialogue understanding; 2) exploit unlabeled data. (Recap of the challenges: key information scattered across utterances by different participants; topic drift; insufficient training data, e.g., just 137 meetings. Bike image adapted from https://www.flickr.com/photos/swambo/14119129185)
  5. Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can't push code. Jane: Oh, let me have a look at the log files. ...
  6. Prompts: Text Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can't push code. Jane: Oh, let me have a look at the log files. ... Summary of the dialogue:
  7. Prompts: Soft Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can't push code. Jane: Oh, let me have a look at the log files. ... P1 P2 ... Pn
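Soft prompts replace the hand-written text prompt with n trainable vectors P1 ... Pn that are prepended to the embedded dialogue. A minimal NumPy sketch (the dimensions here are illustrative, not the model's actual sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_prompt, n_tokens = 16, 4, 10

# Trainable soft-prompt vectors P1..Pn (tuned by backprop in practice)
prompt = rng.normal(size=(n_prompt, d_model))

def prepend_prompt(prompt, token_embeddings):
    """Prepend the soft-prompt vectors to the dialogue token embeddings,
    so the encoder sees [P1 ... Pn, token_1 ... token_m]."""
    return np.concatenate([prompt, token_embeddings], axis=0)

tokens = rng.normal(size=(n_tokens, d_model))  # embedded dialogue
x = prepend_prompt(prompt, tokens)             # x.shape == (14, 16)
```

Unlike a text prompt, these vectors are free parameters optimized end-to-end with the summarization objective.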
  8. Prompts: Perturbed Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can't push code. Jane: Oh, let me have a look at the log files. ... P2 P1 ... 0. Prevent overfitting via random swapping and cutoff.
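The two perturbations on this slide, random swapping of prompt vectors and cutoff (zeroing some of them, the "0" in the diagram), can be sketched as follows. The probabilities are placeholders, not the paper's settings:

```python
import numpy as np

def perturb_prompt(prompt, swap_prob=0.2, cutoff_prob=0.2, rng=None):
    """Perturb a soft prompt to discourage overfitting: randomly swap
    adjacent prompt vectors, then zero out ('cut off') a random subset."""
    rng = rng or np.random.default_rng()
    p = prompt.copy()
    for i in range(len(p) - 1):
        if rng.random() < swap_prob:
            p[[i, i + 1]] = p[[i + 1, i]]     # random swapping
    keep = rng.random(len(p)) >= cutoff_prob  # rows that survive cutoff
    return p * keep[:, None]                  # cut-off rows become zero vectors
```

With both probabilities at 0 the prompt passes through unchanged; with `cutoff_prob=1.0` every prompt vector is zeroed, so the knobs span the full range from clean to heavily perturbed.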
  9. Prompts: Interpolated Prompts with Curriculum Schedule. MixUp interpolation between the perturbed prompts (P2 P1 ... 0) and the clean prompts (P1 P2 ... Pn). Curriculum learning: gradually increase the perturbation.
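The MixUp step interpolates between the clean prompt and its perturbed version, and the curriculum gradually shifts weight toward the perturbed one as training progresses. The linear schedule below is an assumption for illustration, not necessarily the paper's schedule:

```python
import numpy as np

def curriculum_mixup(clean, perturbed, step, total_steps):
    """MixUp-style interpolation of clean and perturbed prompt matrices.
    Early in training lam ~ 0 (mostly clean prompts); late in training
    lam ~ 1 (mostly perturbed), so perturbation strength grows gradually."""
    lam = min(1.0, step / total_steps)  # simple linear curriculum (assumption)
    return (1.0 - lam) * clean + lam * perturbed
```

Easy (nearly clean) prompts early and hard (heavily perturbed) prompts late is the easy-to-hard ordering that curriculum learning prescribes.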
  10. Decoder with Topic-based Prompts. David: I heard you've taken over Chris's company? Is that true? Julie: Yes. ... Dialogue embeddings → Decoder → Summary
  11. Decoder with Topic-based Prompts. David: I heard you've taken over Chris's company? Is that true? Julie: Yes. ... DialoGPT → topic segmentation → topic prompts; dialogue embeddings + topic prompts → Decoder → Summary
  12. Experiments: Main Results. In the paper: evaluation on the SAMSum dataset; human evaluation of fluency, informativeness, and relevance.
  13. Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization. Changqun Li, Linlin Wang, Xin Lin, G. de Melo, Liang He. Dialogue summarization benefits from: prompt perturbation with a curriculum schedule; topic prompts for the decoder; prompt optimization with self-training. Contact: [email protected], http://gerard.demelo.org, http://dialoguesystems.org/. Handles: gdemelo, gdm3000, @[email protected]