Structured Latent Spaces for Stylized Response Generation

Scatter Lab Inc.

October 02, 2019

Transcript

  1. Motivation
     • Stylized response generation
       • Generate a response that suits the query within a conversation, in the desired style
       • polite, professional, friendly, …
     • Problems
       1. Lack of parallel data
          • The fundamental problem of text style transfer -> could existing unsupervised approaches solve it?
       2. Lack of non-parallel conversation data
          • Only conversation data or non-parallel style data exist (news, novels, blogs, etc.)
     ➡ Existing methods are either less style-specific or less context-relevant
  2. Previous Work
     • S2S + LM (Niu and Bansal, 2018)
       • Train a seq2seq model on conversation data
       • Train an LM on style data
       • Predict the next token from a weighted sum of the two models' probabilities
       ➡ The bias is imposed by force, so relevance ↓
     • Multi-task learning (Luan et al., 2017)
       • Train a seq2seq model on conversation data
       • Train an autoencoder on style data
       • Map both datasets into the same latent space
       ➡ The two still form separate clusters, so style intensity ↓
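
The weighted-sum decoding described for S2S + LM can be sketched as follows. This is a minimal illustration, not the implementation of Niu and Bansal (2018): the function name `fuse_next_token`, the `weight` parameter, and the toy distributions are all assumptions made for the example.

```python
def fuse_next_token(p_s2s, p_lm, weight):
    """Weighted sum of two next-token distributions over a shared vocabulary.

    p_s2s:  token -> probability from the conversation seq2seq model (toy values)
    p_lm:   token -> probability from the style language model (toy values)
    weight: hypothetical mixing weight given to the style LM, in [0, 1]
    """
    vocab = set(p_s2s) | set(p_lm)
    return {
        tok: (1 - weight) * p_s2s.get(tok, 0.0) + weight * p_lm.get(tok, 0.0)
        for tok in vocab
    }


# Toy distributions (assumed): the seq2seq model prefers "yes",
# the style LM prefers "indeed".
p_s2s = {"yes": 0.6, "indeed": 0.1, "nope": 0.3}
p_lm = {"yes": 0.1, "indeed": 0.8, "nope": 0.1}

fused = fuse_next_token(p_s2s, p_lm, weight=0.5)
best = max(fused, key=fused.get)
```

With a large enough weight the style LM overrides the seq2seq preference regardless of the context, which is the forced bias the slide blames for the drop in relevance.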
  3. Contributions
     • Build a shared latent space between conversation and style
       • So that semantically similar stylized sentences gather in nearby regions
     • Extend SPACEFUSION (Gao et al., 2019) to non-parallel data
     • Outperforms the baselines in both automatic and human evaluation
  4. Recap: SPACEFUSION
     • Improves response relevance and diversity through joint optimization
     • Builds a shared latent space for relevance and diversity
       • Aligns the two in the same space so that both properties can be controlled
       • Works by adding the corresponding regularization terms to the loss
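  5. STYLEFUSION: Fusion Objective
     • A term that pulls the different latent spaces closer to each other
     • Cross-space distance
       • $d_{\text{conv}} = \sum_{i \in \text{batch}} \frac{d_E(z_{\text{S2S}}(x_i),\, z_{\text{AE}}(y_i))}{n\sqrt{l}}$
       • $d_{\text{style}} = \frac{1}{2}\, d^{\text{cross}}_{\text{NN}}(\{z_{\text{S2S}}(x_i)\}, \{z_{\text{AE}}(s_i)\}) + \frac{1}{2}\, d^{\text{cross}}_{\text{NN}}(\{z_{\text{AE}}(s_i)\}, \{z_{\text{S2S}}(x_i)\})$
       • $d^{\text{cross}}_{\text{NN}}(\{a_i\}, \{b_i\}) = \sum_{i \in \text{batch}} \frac{d_E(a_i,\, b_{\text{NN of } a_i})}{n\sqrt{l}}$
     • Same-space distance
       • $d_{\text{spread-out}} = \min\left[ d^{\text{same}}_{\text{NN}}(\{z_{\text{AE}}(y_i)\}),\, d^{\text{same}}_{\text{NN}}(\{z_{\text{AE}}(s_i)\}),\, d^{\text{same}}_{\text{NN}}(\{z_{\text{S2S}}(x_i)\}) \right]$
       • $d^{\text{same}}_{\text{NN}}(\{a_i\}) = \sum_{i \in \text{batch}} \frac{d_E(a_i,\, a_{\text{NN of } a_i})}{n\sqrt{l}}$
     • $\mathcal{L}_{\text{fuse},\{\text{conv},\text{style}\}} = d_{\{\text{conv},\text{style}\}} - d_{\text{spread-out}}$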
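
The fusion distances on this slide can be sketched in plain Python as below. It is a rough illustration under stated assumptions: latent vectors are plain lists of floats, `d_E` is taken to be Euclidean distance, and the denominator is read as n·√l (batch size times the square root of the latent dimension), which is an interpretation of the slide's flattened "n l". All function names are hypothetical.

```python
import math


def d_e(a, b):
    """Euclidean distance between two latent vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _norm(batch):
    # Assumed normaliser: batch size n times sqrt of latent dimension l.
    return len(batch) * math.sqrt(len(batch[0]))


def d_conv(z_s2s, z_ae_y):
    """Distance between each context and its own (paired) response."""
    return sum(d_e(a, b) for a, b in zip(z_s2s, z_ae_y)) / _norm(z_s2s)


def d_cross_nn(a_set, b_set):
    """Distance from each a_i to its nearest neighbour in the other set."""
    return sum(min(d_e(a, b) for b in b_set) for a in a_set) / _norm(a_set)


def d_same_nn(a_set):
    """Distance from each a_i to its nearest neighbour within the same set."""
    total = 0.0
    for i, a in enumerate(a_set):
        total += min(d_e(a, b) for j, b in enumerate(a_set) if j != i)
    return total / _norm(a_set)


def fusion_losses(z_s2s, z_ae_y, z_ae_s):
    """Return (L_fuse_conv, L_fuse_style) for one batch."""
    d_style = 0.5 * d_cross_nn(z_s2s, z_ae_s) + 0.5 * d_cross_nn(z_ae_s, z_s2s)
    spread_out = min(d_same_nn(z_ae_y), d_same_nn(z_ae_s), d_same_nn(z_s2s))
    return d_conv(z_s2s, z_ae_y) - spread_out, d_style - spread_out


# Toy 2-D batch (assumed values, n = 2, l = 2).
z_s2s = [[0.0, 0.0], [2.0, 0.0]]
z_ae_y = [[0.0, 1.0], [2.0, 1.0]]
z_ae_s = [[1.0, 0.0], [3.0, 0.0]]
loss_conv, loss_style = fusion_losses(z_s2s, z_ae_y, z_ae_s)
```

Minimising these losses pulls the spaces together through the cross-space terms while the subtracted spread-out term keeps points within each space from collapsing onto one another.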
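  6. STYLEFUSION: Smoothness Objective
     • A term that makes the meaning change smoothly as a vector moves through the space
     • Smoothing between prediction and target
       • $\mathcal{L}_{\text{smooth,conv}} = -\frac{1}{|y|} \log p(y \mid z_{\text{conv}})$
       • $z_{\text{conv}} = (1 - u)\, z_{\text{AE}}(y) + u\, z_{\text{S2S}}(x) + \epsilon$
     • Smoothing between non-stylized and random stylized sentence
       • $\mathcal{L}_{\text{smooth,style}} = -(1 - u) \frac{1}{|y|} \log p(y \mid z_{\text{style}}) - u \frac{1}{|s|} \log p(s \mid z_{\text{style}})$
       • $z_{\text{style}} = (1 - u)\, z_{\text{AE}}(y) + u\, z_{\text{AE}}(s) + \epsilon$
     ✓ $u \sim U(0, 1)$, $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$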
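
The interpolation used by the smoothness objective can be sketched as follows. This is only an illustration of the formulas on the slide: latent vectors are plain lists, the decoder log-likelihoods are passed in as numbers rather than computed by a model, and the names `interpolate` and `smooth_style_loss` are hypothetical.

```python
import random


def interpolate(z_a, z_b, u, sigma, rng):
    """Return (1 - u) * z_a + u * z_b + eps, with eps ~ N(0, sigma^2 I)."""
    return [
        (1 - u) * a + u * b + rng.gauss(0.0, sigma)
        for a, b in zip(z_a, z_b)
    ]


def smooth_style_loss(u, logp_y, len_y, logp_s, len_s):
    """Length-normalised negative log-likelihoods, mixed by the same u.

    logp_y and logp_s stand in for log p(y | z_style) and log p(s | z_style);
    in a real system they would come from the decoder (assumed here).
    """
    return -(1 - u) * logp_y / len_y - u * logp_s / len_s


rng = random.Random(0)
u = rng.random()  # u ~ U(0, 1)
sigma = 0.1       # assumed noise scale

z_ae_y = [0.0, 0.0]   # toy latent of the non-stylized response y
z_s2s_x = [2.0, 4.0]  # toy latent of the context x
z_ae_s = [1.0, -1.0]  # toy latent of a random stylized sentence s

z_conv = interpolate(z_ae_y, z_s2s_x, u, sigma, rng)
z_style = interpolate(z_ae_y, z_ae_s, u, sigma, rng)
```

Because the same `u` weights both the interpolated point and the two likelihood terms, a point closer to the stylized sentence is asked to decode more like that sentence.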
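  7. STYLEFUSION: Training Objective
     • $\mathcal{L} = -\frac{1}{|y|} \log p(y \mid z_{\text{S2S}}) + \mathcal{L}_{\text{conv}} + \mathcal{L}_{\text{style}}$
     • In general, $D_{\text{style}} \ll D_{\text{conv}}$
       • Pretrain on $D_{\text{conv}}$ to prevent overfitting
       • Data augmentation
         • Randomly mask $s_i \in D_{\text{style}}$ with <unk>
         • $P(\text{mask}) \propto (\text{freq})^{-1}$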
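
The frequency-based masking used for data augmentation can be sketched as below. The slide only states that the masking probability is inversely proportional to frequency, so the scaling is an assumption: here the rarest token is masked with probability `max_prob` and every other token proportionally less. The function names and the toy corpus are hypothetical.

```python
import random
from collections import Counter


def mask_probabilities(corpus, max_prob):
    """Per-token mask probability proportional to 1 / frequency.

    Assumed scaling: the rarest token gets max_prob.
    """
    freq = Counter(tok for sent in corpus for tok in sent)
    min_freq = min(freq.values())
    return {tok: max_prob * min_freq / count for tok, count in freq.items()}


def mask_sentence(sentence, probs, rng, unk="<unk>"):
    """Independently replace each token with <unk> using its mask probability."""
    return [unk if rng.random() < probs.get(tok, 0.0) else tok for tok in sentence]


# Toy tokenised style corpus (assumed).
corpus = [["the", "game", "is", "afoot"], ["the", "game"], ["the"]]
probs = mask_probabilities(corpus, max_prob=0.4)

rng = random.Random(0)
augmented = [mask_sentence(sent, probs, rng) for sent in corpus]
```

Each pass over the small style corpus then yields slightly different sentences, which is the purpose of the augmentation when the style data is much smaller than the conversation data.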
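  8. STYLEFUSION: Inference
     • $z = z_{\text{S2S}}(x) + r$
       • Sample around $z_{\text{S2S}}(x)$
       • Normalize so that the radius is unaffected by the dimensionality of $z$: $\rho = |r| / (\sigma \sqrt{l})$
     • $\text{score}(h_i) = (1 - \lambda)\, P(h_i \mid z_{\text{S2S}}(x)) + \lambda\, P_{\text{style}}(h_i)$
       • Weighted sum with the probability from a pre-trained style classifier
       • neural: 2-layer GRU
       • ngram: logistic regression using multi-hot features (n=1,2,3,4)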
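
The two inference steps, sampling around the context latent and re-ranking hypotheses, can be sketched as follows. Several details are assumptions rather than facts from the slide: the direction of `r` is drawn uniformly at random, its length is set from a chosen `rho` via |r| = ρ·σ·√l, and the generation and style probabilities are given as toy numbers instead of coming from a decoder and a classifier.

```python
import math
import random


def sample_near(z, rho, sigma, rng):
    """Return z + r, where r has a random direction and length rho * sigma * sqrt(l)."""
    direction = [rng.gauss(0.0, 1.0) for _ in z]
    norm = math.sqrt(sum(d * d for d in direction))
    radius = rho * sigma * math.sqrt(len(z))
    return [zi + radius * d / norm for zi, d in zip(z, direction)]


def score(p_gen, p_style, lam):
    """(1 - lambda) * P(h | z_S2S(x)) + lambda * P_style(h)."""
    return (1 - lam) * p_gen + lam * p_style


def rerank(hypotheses, lam):
    """Sort (text, p_gen, p_style) triples by descending combined score."""
    return sorted(hypotheses, key=lambda h: score(h[1], h[2], lam), reverse=True)


rng = random.Random(0)
z_context = [0.0, 0.0, 0.0, 0.0]  # toy z_S2S(x) with l = 4
z_sampled = sample_near(z_context, rho=2.0, sigma=0.5, rng=rng)

# Toy hypotheses with assumed generation and style-classifier probabilities.
hypotheses = [("ok", 0.6, 0.1), ("pray continue", 0.4, 0.9)]
ranked = rerank(hypotheses, lam=0.5)
```

Setting `lam` to 0 recovers ranking by generation probability alone, so λ controls how much the style classifier influences the final choice.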
  9. Experimental Setup
     • Datasets
       • Reddit: conversations, 10M context-response pairs
       • arXiv, Holmes: 1M and 38K sentences, respectively
       • $D_{\text{test}}$: filtered from Reddit with a pretrained style classifier
     • Baselines
       • MTask (Luan et al., 2017)
       • S2S+LM (Niu and Bansal, 2018)
       • Retrieval, Rand, Human
  10. Results: Automatic Evaluation
     • Rand: style intensity ↑, but relevance ↓
     • S2S+LM: BLEU ↓
     • MTask: style intensity ↓, diversity ↓
  11. Results: Human Evaluation & Visualization
     • MTask: style intensity ↓
     • S2S+LM: relevance ↓
     [Figure: latent-space visualization with points labeled $z_{\text{S2S}}(x)$, $z_{\text{AE}}(s)$, $z_{\text{AE}}(y)$]