Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization

Overwhelming amounts of dialogue are being recorded in instant messaging apps, Slack channels, customer service interactions, and so on. Wouldn't it be great if automated tools could provide us with short, succinct summaries?
However, challenges such as insufficient training data and low information density impede our ability to train abstractive summarization models. In this work, we propose a novel curriculum-based prompt learning method with self-training to address these problems. Specifically, prompts are learned using a curriculum learning strategy that gradually increases the degree of prompt perturbation, thereby improving the dialogue understanding and modeling capabilities of our model. Unlabeled dialogue is incorporated by means of self-training so as to reduce the dependency on labeled data. We further investigate topic-aware prompts to better plan for the generation of summaries. Experiments confirm that our model substantially outperforms strong baselines and achieves new state-of-the-art results on the AMI and ICSI datasets.
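To make the encoder-side idea concrete, here is a minimal PyTorch-style sketch of prompt perturbation with a MixUp-style curriculum, assuming the soft prompts are a tensor of shape (num_prompts, hidden_dim). The function names, the linear schedule, and the swap/cutoff probabilities are illustrative assumptions rather than the paper's exact formulation.

    import torch

    def perturb_prompts(prompts: torch.Tensor, swap_prob: float = 0.1,
                        cutoff_prob: float = 0.1) -> torch.Tensor:
        """Illustrative perturbation of soft prompts: randomly swap neighbouring
        prompt vectors and zero out ("cut off") a random subset of them."""
        perturbed = prompts.clone()
        n = perturbed.size(0)
        for i in range(n - 1):                      # random swapping of neighbours
            if torch.rand(1).item() < swap_prob:
                perturbed[[i, i + 1]] = perturbed[[i + 1, i]]
        keep = (torch.rand(n) > cutoff_prob).float().unsqueeze(-1)
        return perturbed * keep                     # cutoff: zero out some vectors

    def curriculum_prompts(prompts: torch.Tensor, step: int, total_steps: int) -> torch.Tensor:
        """MixUp-style interpolation between clean and perturbed prompts; the
        perturbation weight grows linearly over training (curriculum schedule)."""
        lam = min(step / total_steps, 1.0)          # 0 = clean, 1 = fully perturbed
        return (1.0 - lam) * prompts + lam * perturb_prompts(prompts)

The interpolated prompts would then be prepended to the dialogue token embeddings before the encoder, so that early training sees mostly clean prompts and later training sees increasingly perturbed ones.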

Published at EMNLP 2022
Paper: http://gerard.demelo.org/papers/dialogue-summarization.pdf

Gerard de Melo

December 19, 2022

Transcript

  1. Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization. Changqun Li¹, Linlin Wang¹, Xin Lin¹, Gerard de Melo², Liang He¹ (¹ East China Normal University, ² Hasso Plattner Institute / University of Potsdam)
  2. Massive Amounts of Dialogue: Customer Service, Instant Messaging, Slack, Etc. (Image: REVE Chat)
  3. Dialogue Summarization

  4. Dialogue Summarization. Challenge 1: Key information scattered across utterances (by different participants)
  5. Dialogue Summarization. Challenge 1: Key information scattered across utterances (by different participants). Challenge 2: Topic Drift
  6. Prior Work: Conventional Transformer-based models (e.g. BART); incorporating dialogue characteristics, e.g. dialogue acts, discourse, topic segments; hierarchical architectures for long dialogues. Challenge 1: Key information scattered across utterances (by different participants). Challenge 2: Topic Drift. (Bike image adapted from https://www.flickr.com/photos/swambo/14119129185)
  7. Prior Work: Conventional Transformer-based models (e.g. BART); incorporating dialogue characteristics, e.g. dialogue acts, discourse, topic segments; hierarchical architectures for long dialogues. Challenge 1: Key information scattered across utterances (by different participants). Challenge 2: Topic Drift. Challenge 3: Insufficient training data (e.g. just 137 meetings). (Bike image adapted from https://www.flickr.com/photos/swambo/14119129185)
  8. Key Ideas: 1) Custom prompt-based learning for better dialogue understanding; 2) Exploit unlabeled data. Challenge 1: Key information scattered across utterances (by different participants). Challenge 2: Topic Drift. Challenge 3: Insufficient training data (e.g. just 137 meetings). (Bike image adapted from https://www.flickr.com/photos/swambo/14119129185)
  9. Approach: Heterogeneous Prompts, Self-Training

  10. Approach

  11. Curriculum Prompt Learning in Encoder

  12. Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can’t push code. Jane: Oh, let me have a look at the log files. ...
  13. Prompts: Text Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can’t push code. Jane: Oh, let me have a look at the log files. ... Summary of the dialogue:
  14. Prompts: Soft Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can’t push code. Jane: Oh, let me have a look at the log files. ... P1 P2 ... Pn
  15. Prompts: Perturbed Prompts. Jane: Did the problem with your account get resolved? Evelyn: Well, I can sign in but can’t push code. Jane: Oh, let me have a look at the log files. ... P2 P1 ... 0. Prevent overfitting via random swapping and cutoff
  16. Prompts: Interpolated Prompts with Curriculum Schedule. MixUp interpolation between the original prompts (P1 P2 ... Pn) and the perturbed prompts (P2 P1 ... 0). Curriculum learning: gradually increase perturbation
  17. Decoder

  18. Decoder with Topic-based Prompts. David: I heard you’ve taken over Chris’s company? Is that true? Julie: Yes. ... (Diagram: dialogue embeddings → Decoder → summary)
  19. Decoder with Topic-based Prompts. David: I heard you’ve taken over Chris’s company? Is that true? Julie: Yes. ... (Diagram: DialoGPT-based topic segmentation yields topic prompts; dialogue embeddings and topic prompts → Decoder → summary; see the topic-prompt sketch after the transcript)
  20. Prompt Optimization with Self-Training

  21. Prompt Optimization with Self-Training: synthetic data with different difficulty levels (see the self-training sketch after the transcript)

  22. Experiments: Main Results. In paper: evaluation on the SAMSum dataset; human evaluation of fluency, informativeness, and relevance
  23. Experiments: Few-Shot Results on SAMSum. In paper: further analyses and comparisons
  24. Experiments: Ablation Study (ICSI Dev. Set)

  25. Example

  26. Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization. Changqun Li, Linlin Wang, Xin Lin, G. de Melo, Liang He. Dialogue summarization benefits from: prompt perturbation with curriculum schedule; topic prompts for the decoder; prompt optimization with self-training. Contact: [email protected], http://gerard.demelo.org, http://dialoguesystems.org/, gdemelo, gdm3000, @[email protected]
  27. Details: Datasets
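Slide 19 outlines topic-based prompts for the decoder. The sketch below shows one plausible way to turn topic segments into prompt vectors by mean-pooling utterance embeddings per segment; the function name, the pooling choice, and the boundary format are assumptions for illustration, not the paper's implementation.

    import torch
    from typing import List

    def topic_prompts_from_segments(utterance_embs: torch.Tensor,
                                    boundaries: List[int]) -> torch.Tensor:
        """Build one prompt vector per topic segment (illustrative only).

        utterance_embs: (num_utterances, hidden_dim) embeddings of the dialogue.
        boundaries: indices where a new topic starts, e.g. [3, 7] splits the
        dialogue into utterances [0:3), [3:7), [7:end).
        """
        prompts, start = [], 0
        for end in boundaries + [utterance_embs.size(0)]:
            prompts.append(utterance_embs[start:end].mean(dim=0))  # mean-pool the segment
            start = end
        return torch.stack(prompts)                 # (num_topics, hidden_dim)

The resulting topic prompts would be fed to the decoder alongside the dialogue embeddings so that summary generation can be planned topic by topic.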
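Slide 21 refers to prompt optimization with self-training on synthetic data of different difficulty levels. The following is a minimal sketch of such a loop, assuming a seq2seq model with a generate method and externally supplied train and difficulty hooks; all of these names are hypothetical stand-ins, not the paper's actual interfaces.

    from typing import Callable, List, Tuple

    def self_train(model,
                   labeled_data: List[Tuple[str, str]],      # gold (dialogue, summary) pairs
                   unlabeled_dialogues: List[str],
                   train: Callable,                           # hypothetical fine-tuning hook
                   difficulty: Callable[[Tuple[str, str]], float],  # hypothetical difficulty score
                   n_rounds: int = 3):
        """Sketch of self-training: the current model pseudo-labels unlabeled
        dialogues, and the synthetic pairs are mixed back into training,
        ordered from easy to hard as a simple stand-in for difficulty levels."""
        train(model, labeled_data)                            # warm-up on gold data
        for _ in range(n_rounds):
            pseudo = [(d, model.generate(d)) for d in unlabeled_dialogues]
            pseudo.sort(key=difficulty)                       # easiest synthetic pairs first
            train(model, labeled_data + pseudo)
        return model

Ordering the synthetic pairs from easy to hard is one simple way to realize "different difficulty levels"; the actual scheduling in the paper may differ.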