Paper Review: ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation

Paper Review
ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation
鈴木脩右, 2019/09/20
Natural Language Processing Laboratory, Nagaoka University of Technology

Paper Information
[1] Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, and Xueqi Cheng. ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3721–3730, Florence, Italy, July 2019. Association for Computational Linguistics.

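Abstract
• Proposes ReCoSa, a multi-turn dialogue generation model
• Identifies the contexts in the dialogue history that are relevant to the response
• Uses self-attention to identify the relevant contexts
• Outperforms the baselines on Chinese and English datasets in both automatic and human evaluation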

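Introduction
• Within a dialogue context, some contexts are relevant to the response
• Existing models such as HRED do not take this into account
• HRAN addresses this with attention
• With RNNs, it is difficult to capture relevant contexts that are far away
• This work mitigates the problem with self-attention
Table 1: The two examples from the customer services dataset, and the red sentence indicates the relevant context to the response. [1]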

Related Work
• HRED: arranges encoder-decoder models in a hierarchical structure (Serban et al., 2016)
• VHRED: introduces a VAE into HRED to improve diversity (Serban et al., 2017)
• WSeq: introduces attention weighted by cosine similarity into HRED (Tian et al., 2017)
• HRAN: introduces attention at each level of HRED to account for relevant contexts (Xing et al., 2018)
• HVMN: introduces a memory network into VHRED (Chen et al., 2018)

Relevant Context Self-Attention Model (ReCoSa)
• Context Representation Encoder
  • Encodes the dialogue context and obtains weights with self-attention
• Response Representation Encoder
  • Applies self-attention to the word embeddings and positional embeddings of the masked response
• Context-Response Attention Decoder
  • Obtains the response representation with attention, then word probabilities with a softmax layer
Figure 1: The architecture of ReCoSa model [1]

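Context Representation Encoder
• Word-level Encoder
  • Uses an LSTM
  • Obtains sentence representations from word embeddings
  • Adds positional embeddings to the sentence representations
• Context Self-Attention
  • Can capture long-distance dependencies well
  • Structured as multi-head attention
  • Obtains context representations from the sentence representations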
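The context self-attention step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sentence representations are random stand-ins for LSTM outputs, the sinusoidal positional embedding follows the Transformer convention (an assumption here), the projection matrices are random, and the usual output projection of multi-head attention is omitted for brevity.

```python
import numpy as np

def sinusoidal_positions(n_positions, d_model):
    """Transformer-style sinusoidal positional embeddings (assumed form)."""
    pos = np.arange(n_positions)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    # Even dimensions use sin, odd dimensions use cos.
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, n_heads):
    """x: (n_sentences, d_model). Returns (output, per-head attention weights)."""
    n, d_model = x.shape
    d_head = d_model // n_heads

    def split(w):
        # Project, then split the feature axis into heads: (heads, n, d_head).
        return (x @ w).reshape(n, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(w_q), split(w_k), split(w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, n, d_head)
    # Concatenate heads back to (n, d_model); output projection omitted.
    return heads.transpose(1, 0, 2).reshape(n, d_model), weights

rng = np.random.default_rng(0)
n_sentences, d_model, n_heads = 4, 8, 2
sentence_reps = rng.normal(size=(n_sentences, d_model))  # stand-in for LSTM outputs
x = sentence_reps + sinusoidal_positions(n_sentences, d_model)
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
context_reps, attn = multi_head_self_attention(x, w_q, w_k, w_v, n_heads)
```

Each row of `attn[h]` is a distribution over all context sentences, so a distant sentence can receive a large weight in a single step, regardless of how far back it appears.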

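Response Representation Encoder
• Feeds the response into self-attention
• Takes word embeddings and positional embeddings to obtain the response representation
• Masks the response during training
• For inference, feeds the generated response back as input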
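The training-time mask can be sketched as follows, assuming the standard Transformer-style causal mask in which each response position attends only to itself and earlier positions. The function names are hypothetical, and the uniform scores are only for illustration.

```python
import numpy as np

def causal_mask(length):
    """True where position i may attend to position j (j <= i)."""
    return np.tril(np.ones((length, length), dtype=bool))

def masked_attention_weights(scores, mask):
    """Row-wise softmax over scores, with disallowed positions set to -inf."""
    masked = np.where(mask, scores, -np.inf)
    # The diagonal is always allowed, so each row has a finite maximum.
    masked = masked - masked.max(axis=-1, keepdims=True)
    e = np.exp(masked)
    return e / e.sum(axis=-1, keepdims=True)

scores = np.zeros((4, 4))  # uniform scores, for illustration only
weights = masked_attention_weights(scores, causal_mask(4))
```

With uniform scores, row `i` of `weights` spreads its mass evenly over positions `0..i` and puts zero on later positions. This keeps training consistent with inference, where only the already generated words are available.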

Context-Response Attention Decoder
• Feeds the representations from the two encoders into multi-head attention
• A softmax layer yields the word generation probabilities used for generation
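A rough sketch of this decoding step, under simplifying assumptions: a single attention head instead of multi-head, response representations as queries and context representations as keys and values, random inputs, and a hypothetical output projection `w_out` onto a toy vocabulary.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def context_response_attention(response_reps, context_reps):
    """Single-head scaled dot-product attention.

    Queries come from the response; keys and values come from the context.
    """
    d = response_reps.shape[-1]
    scores = response_reps @ context_reps.T / np.sqrt(d)  # (n_words, n_sentences)
    weights = softmax(scores)
    return weights @ context_reps, weights

rng = np.random.default_rng(0)
n_words, n_sentences, d_model, vocab_size = 3, 5, 8, 20
response_reps = rng.normal(size=(n_words, d_model))     # stand-in for the response encoder output
context_reps = rng.normal(size=(n_sentences, d_model))  # stand-in for the context encoder output
attended, weights = context_response_attention(response_reps, context_reps)

w_out = rng.normal(size=(d_model, vocab_size))  # hypothetical output projection
word_probs = softmax(attended @ w_out)          # (n_words, vocab_size)
next_word_id = int(word_probs[-1].argmax())     # greedy choice at the last position
```

Each row of `weights` shows how strongly one response position attends to each context sentence, which is the quantity a relevance analysis would inspect.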

Experiments
• Datasets
  • Ubuntu dialogue corpus (English): train 3.98M / valid 10K / test 10K (pairs)
  • JDC Dataset (Chinese): train 500K / valid 7,843 / test 7,843 (pairs)
• Baselines
  • SEQ2SEQ, HRED, VHRED, WSeq, HRAN, HVMN
• Evaluation
  • Automatic evaluation: perplexity, BLEU, distinct
  • Human evaluation: three annotators compare ReCoSa with each baseline and judge win/loss/tie
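As an illustration of the distinct metric, here is a small sketch using one common definition: the number of unique n-grams divided by the total number of n-grams across all generated responses. The exact variant used in [1] may differ, and the example responses are invented.

```python
def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over tokenized responses."""
    total = 0
    unique = set()
    for tokens in responses:
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0

# Toy generated responses (invented for illustration).
responses = [
    ["i", "do", "not", "know"],
    ["i", "do", "not", "know"],
    ["try", "sudo", "apt", "update"],
]
d1 = distinct_n(responses, 1)  # 8 unique unigrams out of 12
d2 = distinct_n(responses, 2)  # 6 unique bigrams out of 9
```

A model that keeps repeating a generic reply such as "i do not know" gets a low distinct score, which is why the metric is used as a proxy for response diversity.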

Results 1
Table 2: The metric-based evaluation results (%). [1]
Table 3: The human evaluation on JDC and Ubuntu. [1]

Results 2
Table 4: The generated response Example1 from different models on JDC. The red contexts indicate the relevant context to the response. [1]

Analysis on Relevant Contexts
• Randomly sampled 500 pairs from the JDC Dataset
• Annotated each context as 1 if it is relevant to the response and 0 otherwise
• Three annotators; kappa value of 0.514
• Evaluated with Precision@n, Recall@n, and F-score@n (n = 1, 3, 5, 10)
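The @n metrics can be sketched as follows, assuming that contexts are ranked by the model's attention weight and that the top n are compared against the binary human labels. The function name and the example values are hypothetical.

```python
def precision_recall_f1_at_n(attention_weights, relevance_labels, n):
    """Score the n most-attended contexts against binary relevance labels."""
    ranked = sorted(
        range(len(attention_weights)),
        key=lambda i: attention_weights[i],
        reverse=True,
    )
    top_n = ranked[:n]
    hits = sum(relevance_labels[i] for i in top_n)
    n_relevant = sum(relevance_labels)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / n_relevant if n_relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# One dialogue with five context sentences (invented values).
weights = [0.05, 0.40, 0.10, 0.30, 0.15]  # attention weight per context
labels = [0, 1, 0, 0, 1]                  # 1 = annotated as relevant
p1, r1, f1_at_1 = precision_recall_f1_at_n(weights, labels, 1)
p3, r3, f1_at_3 = precision_recall_f1_at_n(weights, labels, 3)
```

In this example the most-attended context is relevant, so Precision@1 is 1.0 while Recall@1 is 0.5; at n = 3 both relevant contexts are retrieved, giving Recall@3 of 1.0 and Precision@3 of 2/3.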

Analysis results
• WSeq achieves the best results at @1
  • This is because relevant contexts and responses are often similar
Table 5: The attention analysis results (%). [1]

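Conclusion
• Proposed ReCoSa, a multi-turn dialogue generation model
• It can identify the contexts in the dialogue that are relevant to the response
• Self-attention also captures long-distance dependencies
• It outperformed the baselines
• The analysis shows that its identification of relevant contexts is close to human judgment
• Identifying relevant contexts is effective for improving dialogue generation quality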
