Structured Latent Spaces for Stylized Response Generation

Structuring Latent Spaces for Stylized Response Generation
장성보 (ML Engineer, Pingpong)

2 Introduction

3 Motivation

• Stylized response generation
  • Generate a response that fits the query within a conversation, in the desired style
  • polite, professional, friendly, …
• Problems
  1. No parallel data
     • The fundamental problem of text style transfer -> could existing unsupervised approaches solve it?
  2. No non-parallel conversational data
     • Only conversational data or non-parallel style data exist (news, novels, blogs, etc.)
➡ Existing methods end up less style-specific or less context-relevant

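4 Previous Work

• S2S + LM (Niu and Bansal, 2018)
  • Train a seq2seq model on conversational data
  • Train an LM on style data
  • Predict the next token from a weighted sum of the two models' probabilities
  ➡ Relevance ↓ because the bias is imposed by force
• Multi-task learning (Luan et al., 2017)
  • Train a seq2seq model on conversational data
  • Train an autoencoder on style data
  • Map both kinds of data into the same latent space
  ➡ Style intensity ↓ because the two still form separate clusters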
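The weighted-sum step of the S2S+LM baseline can be illustrated with a minimal sketch. The function name `mix_next_token_probs`, the `weight` parameter, and the toy distributions are assumptions made for illustration; they are not taken from Niu and Bansal (2018).

```python
def mix_next_token_probs(p_s2s, p_lm, weight):
    """Blend two next-token distributions over the same vocabulary.

    p_s2s, p_lm: dicts mapping token -> probability (hypothetical inputs).
    weight: share given to the style LM, in [0, 1] (assumed parameter name).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    vocab = set(p_s2s) | set(p_lm)
    return {
        tok: (1.0 - weight) * p_s2s.get(tok, 0.0) + weight * p_lm.get(tok, 0.0)
        for tok in vocab
    }


# Toy distributions, made up for illustration only.
p_s2s = {"yes": 0.6, "indeed": 0.1, "nope": 0.3}   # seq2seq prefers "yes"
p_lm = {"yes": 0.1, "indeed": 0.8, "nope": 0.1}    # style LM prefers "indeed"
mixed = mix_next_token_probs(p_s2s, p_lm, weight=0.5)
best = max(mixed, key=mixed.get)
```

With these toy numbers the style LM overrides the token the seq2seq model preferred, which is the kind of forced bias the slide blames for the drop in relevance.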

5 Contributions

• Build a shared latent space between conversation and style
  • So that semantically similar stylized sentences gather in nearby regions
• Extend SPACEFUSION (Gao et al., 2019) to non-parallel data
• Outperforms the baselines in both automatic and human evaluation

6 The Proposed Method STYLEFUSION

7 Recap: SPACEFUSION

• Raises response relevance and diversity through joint optimization
• Forms a shared latent space for relevance and diversity
  • Aligns both in the same space so that the two properties can be controlled
• Works by adding the corresponding regularization terms to the loss

8 STYLEFUSION: Model Architecture

• {seq2seq, autoencoder} encoder + (parameter-shared) decoder

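9 STYLEFUSION: Fusion Objective

• A term that pulls the different latent spaces closer together
• Cross-space distance
  • $d_{\text{conv}} = \sum_{i \in \text{batch}} \frac{d_E(z_{S2S}(x_i),\, z_{AE}(y_i))}{n\sqrt{l}}$
  • $d_{\text{style}} = \frac{1}{2}\, d_{\text{cross-NN}}(\{z_{S2S}(x_i)\}, \{z_{AE}(s_i)\}) + \frac{1}{2}\, d_{\text{cross-NN}}(\{z_{AE}(s_i)\}, \{z_{S2S}(x_i)\})$
  • $d_{\text{cross-NN}}(\{a_i\}, \{b_i\}) = \sum_{i \in \text{batch}} \frac{d_E(a_i,\, b_{\text{NN of } a_i})}{n\sqrt{l}}$
• Same-space distance
  • $d_{\text{spread-out}} = \min\left[ d_{\text{same-NN}}(\{z_{AE}(y_i)\}),\; d_{\text{same-NN}}(\{z_{AE}(s_i)\}),\; d_{\text{same-NN}}(\{z_{S2S}(x_i)\}) \right]$
  • $d_{\text{same-NN}}(\{a_i\}) = \sum_{i \in \text{batch}} \frac{d_E(a_i,\, a_{\text{NN of } a_i})}{n\sqrt{l}}$
• $\mathcal{L}_{\text{fuse},\{\text{conv},\text{style}\}} = d_{\{\text{conv},\text{style}\}} - d_{\text{spread-out}}$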
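The distances in the fusion objective can be written as a minimal pure-Python sketch. It assumes that d_E is Euclidean distance, that n is the batch size and l the latent dimension, and that the "n l" denominator in the transcript is n√l with the radical lost in extraction. All function names and the toy 2-D vectors are hypothetical.

```python
import math


def d_e(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _normalize(total, n, l):
    # Assumed normalization: divide by n * sqrt(l).
    return total / (n * math.sqrt(l))


def d_conv(z_s2s_x, z_ae_y):
    """Paired distance between z_S2S(x_i) and z_AE(y_i)."""
    n, l = len(z_s2s_x), len(z_s2s_x[0])
    return _normalize(sum(d_e(a, b) for a, b in zip(z_s2s_x, z_ae_y)), n, l)


def d_cross_nn(a_set, b_set):
    """Distance from each a_i to its nearest neighbour in b_set."""
    n, l = len(a_set), len(a_set[0])
    return _normalize(sum(min(d_e(a, b) for b in b_set) for a in a_set), n, l)


def d_same_nn(a_set):
    """Distance from each a_i to its nearest other point in a_set (needs >= 2 points)."""
    n, l = len(a_set), len(a_set[0])
    total = sum(
        min(d_e(a, b) for j, b in enumerate(a_set) if j != i)
        for i, a in enumerate(a_set)
    )
    return _normalize(total, n, l)


def fusion_losses(z_s2s_x, z_ae_y, z_ae_s):
    """Return (L_fuse_conv, L_fuse_style) for one batch."""
    d_style = 0.5 * d_cross_nn(z_s2s_x, z_ae_s) + 0.5 * d_cross_nn(z_ae_s, z_s2s_x)
    d_spread = min(d_same_nn(z_ae_y), d_same_nn(z_ae_s), d_same_nn(z_s2s_x))
    return d_conv(z_s2s_x, z_ae_y) - d_spread, d_style - d_spread


# Toy 2-D latent vectors, made up for illustration only.
z_s2s_x = [[0.0, 0.0], [2.0, 0.0]]
z_ae_y = [[0.0, 1.0], [2.0, 1.0]]
z_ae_s = [[0.0, 3.0], [2.0, 3.0]]
loss_conv, loss_style = fusion_losses(z_s2s_x, z_ae_y, z_ae_s)
```

In this toy batch the style vectors sit farther from the context vectors than the responses do, so the style term is positive while the conversation term is already negative. Minimizing either term pulls the spaces together while the shared spread-out term keeps points within a space apart.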

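11 STYLEFUSION: Training Objective

• $\mathcal{L} = -\frac{1}{|y|} \log p(y \mid z_{S2S}) + \mathcal{L}_{\text{conv}} + \mathcal{L}_{\text{style}}$
• In general, $D_{\text{style}} \ll D_{\text{conv}}$
  • Pretrain on $D_{\text{conv}}$ to prevent overfitting
• Data augmentation
  • Randomly mask $s_i \in D_{\text{style}}$ with <unk>
  • $P(\text{mask}) \propto (\text{freq})^{-1}$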
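The frequency-based masking used for data augmentation might look like the sketch below. The slide gives only the proportionality P(mask) ∝ 1/freq, so the scaling constant `max_prob` (the probability assigned to the rarest token), the function names, and the toy corpus are assumptions.

```python
import random
from collections import Counter


def mask_probabilities(corpus, max_prob=0.5):
    """Per-token mask probability proportional to 1 / corpus frequency.

    max_prob is a hypothetical constant: the probability given to the
    rarest token. The slide does not state the actual constant.
    """
    freq = Counter(tok for sent in corpus for tok in sent)
    min_freq = min(freq.values())
    return {tok: max_prob * min_freq / f for tok, f in freq.items()}


def mask_sentence(sentence, probs, rng, unk="<unk>"):
    """Replace each token with <unk> independently with its mask probability."""
    return [unk if rng.random() < probs.get(tok, 0.0) else tok for tok in sentence]


# Toy style corpus, made up for illustration only.
style_corpus = [
    ["the", "game", "is", "afoot"],
    ["the", "case", "is", "closed"],
]
probs = mask_probabilities(style_corpus)
rng = random.Random(0)  # seeded only to make the sketch repeatable
augmented = [mask_sentence(s, probs, rng) for s in style_corpus]
```

Tokens that occur twice in the toy corpus ("the", "is") get half the mask probability of tokens that occur once, so rare words are hidden more often than frequent ones.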

12 STYLEFUSION: Inference

• $z = z_{S2S}(x) + r$
  • Sample around $z_{S2S}(x)$
  • Normalize so that the dimension of $z$ has no effect: $\rho = |r| / (\sigma\sqrt{l})$
• $\text{score}(h_i) = (1 - \lambda)\, P(h_i \mid z_{S2S}(x)) + \lambda\, P_{\text{style}}(h_i)$
  • Weighted sum with the probability from a pretrained style classifier
  • neural: 2-layer GRU
  • ngram: logistic regression using multi-hot features (n=1,2,3,4)
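The two inference steps, sampling near z_S2S(x) and reranking hypotheses, can be sketched as follows. The sketch assumes that the perturbation direction is drawn uniformly at random, that its length is fixed to |r| = ρσ√l (the slide's normalization solved for |r|, with √l assumed as above), and that both probabilities are already available per hypothesis. The function names and toy values are hypothetical.

```python
import math
import random


def sample_latent(z_s2s_x, rho, sigma, rng):
    """Return z = z_S2S(x) + r with |r| = rho * sigma * sqrt(l)."""
    l = len(z_s2s_x)
    # A normalized Gaussian vector gives a uniformly random direction (assumption).
    direction = [rng.gauss(0.0, 1.0) for _ in range(l)]
    norm = math.sqrt(sum(d * d for d in direction))
    radius = rho * sigma * math.sqrt(l)
    return [z + radius * d / norm for z, d in zip(z_s2s_x, direction)]


def rerank(hypotheses, lam):
    """Order (text, P(h | z_S2S(x)), P_style(h)) triples by the mixed score."""
    scored = [
        ((1.0 - lam) * p_gen + lam * p_style, text)
        for text, p_gen, p_style in hypotheses
    ]
    return [text for _, text in sorted(scored, reverse=True)]


# Toy values, made up for illustration only.
rng = random.Random(0)
z_center = [0.0, 0.0, 0.0, 0.0]
z = sample_latent(z_center, rho=0.5, sigma=2.0, rng=rng)
radius = math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_center)))

hypotheses = [("ok", 0.6, 0.1), ("elementary, my dear", 0.4, 0.9)]
ranked_plain = rerank(hypotheses, lam=0.0)
ranked_styled = rerank(hypotheses, lam=0.5)
```

With λ = 0 the ranking follows the generator alone. Raising λ lets the style classifier promote the more stylized hypothesis, and ρ controls how far from z_S2S(x) the sampled point lies regardless of the latent dimension.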

13 Experiments

14 Experimental Setup

• Datasets
  • Reddit: conversations, 10M context-response pairs
  • arXiv, Holmes: 1M and 38K sentences, respectively
  • $D_{\text{test}}$: filtered from Reddit with a pretrained style classifier
• Baselines
  • MTask (Luan et al., 2017)
  • S2S+LM (Niu and Bansal, 2018)
  • Retrieval, Rand, Human

15 Results

• Generation examples

16 Results

• Style intensity and relevance as ρ varies

17 Results

• Style intensity and relevance as ρ varies

18 Results: Automatic Evaluation

• Rand: style intensity ↑, but relevance ↓
• S2S+LM: BLEU ↓
• MTask: style intensity ↓, diversity ↓

19 Results: Human Evaluation & Visualization

• MTask: style intensity ↓
• S2S+LM: relevance ↓
• [Figure: latent space visualization of $z_{S2S}(x)$, $z_{AE}(s)$, $z_{AE}(y)$]

20 Thank You! Any Questions?

• (Slide 10, truncated in the transcript) Makes meaning change smoothly as a vector moves through the latent space …
