Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization

Overwhelming amounts of dialogue are being recorded in instant messaging apps, Slack channels, customer service interactions, and so on. Wouldn't it be great if automated tools could provide us with short, succinct summaries?
However, challenges such as insufficient training data and low information density impede our ability to train abstractive summarization models. In this work, we propose a novel curriculum-based prompt learning method with self-training to address these problems. Specifically, prompts are learned using a curriculum learning strategy that gradually increases the degree of prompt perturbation, thereby improving the dialogue understanding and modeling capabilities of our model. Unlabeled dialogue is incorporated by means of self-training so as to reduce the dependency on labeled data. We further investigate topic-aware prompts to better plan for the generation of summaries. Experiments confirm that our model substantially outperforms strong baselines and achieves new state-of-the-art results on the AMI and ICSI datasets.

Published at EMNLP 2022
Paper: http://gerard.demelo.org/papers/dialogue-summarization.pdf

Gerard de Melo

December 19, 2022

Transcript

  1. Curriculum Prompt Learning with Self-Training
    for Abstractive Dialogue Summarization
    Changqun Li1, Linlin Wang1, Xin Lin1, Gerard de Melo2, Liang He1
    1 East China Normal University
    2 Hasso Plattner Institute / University of Potsdam


  2. Massive Amounts of Dialogue
    Image: REVE Chat
    Customer Service
    Instant Messaging
    Slack
    Etc.


  3. Dialogue Summarization


  4. Dialogue Summarization
    Challenge 1:
    Key information scattered
    across utterances
    (by different participants)


  5. Dialogue Summarization
    Challenge 1:
    Key information scattered
    across utterances
    (by different participants)
    Challenge 2:
    Topic Drift


  6. Bike Image: Adapted from https://www.flickr.com/photos/swambo/14119129185
    Conventional Transformer-based models
    (e.g. BART)
    Prior Work
    Incorporating dialogue characteristics,
    e.g. dialogue acts, discourse,
    topic segments
    Hierarchical architectures
    for long dialogues
    Challenge 1:
    Key information scattered
    across utterances
    (by different participants)
    Challenge 2:
    Topic Drift


  7. Bike Image: Adapted from https://www.flickr.com/photos/swambo/14119129185
    Conventional Transformer-based models
    (e.g. BART)
    Prior Work
    Incorporating dialogue characteristics,
    e.g. dialogue acts, discourse,
    topic segments
    Hierarchical architectures
    for long dialogues
    Challenge 1:
    Key information scattered
    across utterances
    (by different participants)
    Challenge 2:
    Topic Drift
    Challenge 3:
    Insufficient Training Data
    (e.g. just 137 meetings)


  8. Bike Image: Adapted from https://www.flickr.com/photos/swambo/14119129185
    Key Ideas
    1) Custom Prompt-based Learning
    for Better Dialogue Understanding
    Challenge 1:
    Key information scattered
    across utterances
    (by different participants)
    Challenge 2:
    Topic Drift
    Challenge 3:
    Insufficient Training Data
    (e.g. just 137 meetings)
    2) Exploit Unlabeled Data


  9. Approach
    Heterogeneous
    Prompts
    Self-Training


  10. Approach


  11. Curriculum Prompt Learning
    in Encoder


  12. Prompts
    Jane: Did the problem with your account get resolved?
    Evelyn: Well, I can sign in but can’t push code.
    Jane: Oh, let me have a look at the log files.
    ...


  13. Prompts: Text Prompts
    Jane: Did the problem with your account get resolved?
    Evelyn: Well, I can sign in but can’t push code.
    Jane: Oh, let me have a look at the log files.
    ...
    Summary of the dialogue:
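
    As a concrete illustration, the sketch below appends the text prompt shown on this slide to a dialogue and feeds it to an off-the-shelf BART summarizer from Hugging Face Transformers; the checkpoint and generation settings are illustrative assumptions, not necessarily those used in the paper.

    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

    dialogue = (
        "Jane: Did the problem with your account get resolved? "
        "Evelyn: Well, I can sign in but can't push code. "
        "Jane: Oh, let me have a look at the log files."
    )

    # Text prompt: a fixed natural-language instruction appended to the dialogue.
    prompted_input = dialogue + " Summary of the dialogue:"

    inputs = tokenizer(prompted_input, return_tensors="pt", truncation=True)
    summary_ids = model.generate(**inputs, num_beams=4, max_length=60)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))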


  14. Prompts: Soft Prompts
    Jane: Did the problem with your account get resolved?
    Evelyn: Well, I can sign in but can’t push code.
    Jane: Oh, let me have a look at the log files.
    ...
    P1
    P2
    ... Pn
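
    Soft prompts replace the fixed instruction with continuous, trainable vectors P1 ... Pn prepended to the encoder's input embeddings. A minimal PyTorch sketch, with the number of prompts and hidden size chosen only for illustration:

    import torch
    import torch.nn as nn

    class SoftPromptLayer(nn.Module):
        """Prepends n trainable prompt vectors P1 ... Pn to the token embeddings."""

        def __init__(self, n_prompts: int, d_model: int):
            super().__init__()
            self.prompts = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)

        def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
            # token_embeds: (batch, seq_len, d_model)
            batch_size = token_embeds.size(0)
            prompts = self.prompts.unsqueeze(0).expand(batch_size, -1, -1)
            return torch.cat([prompts, token_embeds], dim=1)

    # Example sizes (illustrative): 20 prompt vectors for a 1024-dim encoder.
    soft_prompts = SoftPromptLayer(n_prompts=20, d_model=1024)
    dialogue_embeds = torch.randn(2, 128, 1024)    # stand-in for embedded dialogue tokens
    encoder_input = soft_prompts(dialogue_embeds)  # shape: (2, 20 + 128, 1024)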


  15. Prompts: Perturbed Prompts
    Jane: Did the problem with your account get resolved?
    Evelyn: Well, I can sign in but can’t push code.
    Jane: Oh, let me have a look at the log files.
    ...
    P2
    P1
    ... 0
    Prevent Overfitting
    via:
    Random Swapping
    and Cutoff
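
    The two perturbations named on this slide can be sketched as follows: random swapping exchanges the positions of two prompt vectors, and cutoff zeroes out a fraction of them. The swap count and cutoff ratio below are assumptions for illustration, not the paper's exact settings.

    import torch

    def random_swap(prompts: torch.Tensor) -> torch.Tensor:
        """Swap the positions of two randomly chosen prompt vectors."""
        p = prompts.clone()
        i, j = torch.randperm(p.size(0))[:2].tolist()
        p[[i, j]] = p[[j, i]]
        return p

    def cutoff(prompts: torch.Tensor, ratio: float = 0.2) -> torch.Tensor:
        """Zero out a random fraction of the prompt vectors."""
        p = prompts.clone()
        n_cut = max(1, int(ratio * p.size(0)))
        p[torch.randperm(p.size(0))[:n_cut]] = 0.0
        return p

    clean_prompts = torch.randn(20, 1024)  # stand-in for the learned soft prompts
    perturbed_prompts = cutoff(random_swap(clean_prompts), ratio=0.2)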


  16. Prompts: Interpolated Prompts
    with Curriculum Schedule
    P2
    P1
    ... 0
    P1
    P2
    ... Pn
    MixUp
    interpolation
    Curriculum Learning:
    Gradually increase perturbation
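
    A sketch of the interpolation step: the clean prompts and their perturbed copies are mixed MixUp-style, with the mixing weight shifted toward the perturbed side as training progresses. The linear schedule below is an assumption standing in for whatever curriculum schedule the paper actually uses.

    import torch

    def curriculum_weight(step: int, total_steps: int) -> float:
        """Degree of perturbation, increased linearly from 0 to 1 over training."""
        return min(1.0, step / total_steps)

    def interpolate_prompts(clean: torch.Tensor, perturbed: torch.Tensor,
                            step: int, total_steps: int) -> torch.Tensor:
        """MixUp-style interpolation between clean and perturbed prompts."""
        lam = curriculum_weight(step, total_steps)
        return (1.0 - lam) * clean + lam * perturbed

    clean = torch.randn(20, 1024)              # soft prompts P1 ... Pn
    perturbed = clean.clone()
    perturbed[torch.randperm(20)[:4]] = 0.0    # e.g. a cutoff perturbation
    mixed = interpolate_prompts(clean, perturbed, step=1000, total_steps=10000)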


  17. Decoder


  18. Decoder with
    Topic-based Prompts
    David: I heard you’ve taken over Chris’s company? Is that true? Julie: Yes. ...
    Decoder
    Dialogue
    embeddings
    Summary


  19. Decoder with
    Topic-based Prompts
    David: I heard you’ve taken over Chris’s company? Is that true? Julie: Yes. ...
    DialoGPT
    Topic segmentation
    Topic Prompts
    Decoder
    Dialogue
    embeddings
    Summary
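
    One way to realize the pipeline on this slide is sketched below: a segmentation step (here a trivial fixed-size placeholder standing in for the DialoGPT-based segmenter) splits the dialogue into topic segments, from which short topic prompts are built and handed to the decoder. The segmentation criterion and prompt format are illustrative assumptions, not the paper's exact procedure.

    from typing import List

    def segment_topics(utterances: List[str], size: int = 4) -> List[List[str]]:
        """Placeholder for DialoGPT-based topic segmentation: fixed-size chunks."""
        return [utterances[i:i + size] for i in range(0, len(utterances), size)]

    def build_topic_prompt(segments: List[List[str]]) -> str:
        """One short topic cue per segment, here simply its first utterance."""
        return " | ".join(f"Topic {i + 1}: {seg[0]}" for i, seg in enumerate(segments))

    utterances = [
        "David: I heard you've taken over Chris's company? Is that true?",
        "Julie: Yes.",
    ]
    topic_prompt = build_topic_prompt(segment_topics(utterances))
    # The topic prompt is passed to the decoder together with the dialogue
    # embeddings, so that generation can be planned topic by topic.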


  20. Prompt Optimization
    with Self-Training


  21. Prompt Optimization
    with Self-Training
    Synthetic Data
    with different
    difficulty levels
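
    A high-level sketch of self-training in this spirit: a model trained on the labeled dialogues pseudo-labels the unlabeled ones, and the synthetic pairs are folded back into training from easier to harder. The helper callables and the difficulty-based ordering are assumptions for illustration, not the paper's exact API.

    from typing import Callable, List, Tuple

    def self_train(
        train_fn: Callable[[List[Tuple[str, str]]], None],   # fits the model on (dialogue, summary) pairs
        summarize_fn: Callable[[str], str],                   # generates a summary with the current model
        difficulty_fn: Callable[[str, str], float],           # scores how hard a synthetic pair is
        labeled: List[Tuple[str, str]],
        unlabeled: List[str],
        n_rounds: int = 3,
    ) -> None:
        """Self-training sketch: pseudo-label unlabeled dialogues, then retrain,
        presenting easier synthetic pairs before harder ones."""
        train_fn(labeled)                                          # start from the gold data
        for _ in range(n_rounds):
            synthetic = [(d, summarize_fn(d)) for d in unlabeled]  # pseudo-summaries
            synthetic.sort(key=lambda pair: difficulty_fn(*pair))  # easy-to-hard ordering
            train_fn(labeled + synthetic)                          # retrain on gold + synthetic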


  22. Experiments:
    Main Results
    In Paper: Evaluation on SamSum dataset.
    Human Evaluation of Fluency, Informativeness, Relevance


  23. Experiments:
    Few-Shot Results on SAMSum
    In Paper:
    Further Analyses and Comparisons


  24. Experiments:
    Ablation Study
    ICSI Dev. Set


  25. Example


  26. Curriculum Prompt Learning with Self-Training
    for Abstractive Dialogue Summarization
    Changqun Li, Linlin Wang, Xin Lin, G. de Melo, Liang He
    Contact:
    [email protected]
    http://gerard.demelo.org
    http://dialoguesystems.org/
    Dialogue Summarization
    benefits from:
    Prompt Perturbation
    with Curriculum Schedule
    Topic Prompts for the
    Decoder
    Prompt Optimization
    with Self-Training
    gdemelo
    gdm3000
    @[email protected]


  27. Details: Datasets
