
ryoma yoshimura
December 18, 2019

Courteously Yours: Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network

These are the slides from a presentation at our lab's paper reading group.


Transcript

  1. Courteously Yours:
 Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network
 Hitesh Golchha, Mauajama Firdaus, Asif Ekbal, Pushpak Bhattacharyya
 NAACL2019
 
 2019/12/18 Paper reading group
 Presenter: Yoshimura
 

  2. Introduction
 • Customer care 
 ◦ Essential tool used by companies in building stable customer relations.
 • What’s important
 ◦ Providing customer satisfaction by greeting, empathizing, and apologizing at the right time.
 ◦ This builds a strong relationship with the customer and increases customer retention.
 • They propose an effective framework 
 ◦ Inducing courteous behavior in customer care responses
 ◦ System adds courteous nature and emotional sense to the replies.
 

  3. Example of Courteous Responses
 Main Contributions
 • Creation of a large, high-quality conversational dataset
 ◦ Courteously Yours Customer Care Dataset (CYCCD), prepared from actual conversations on Twitter.
 • Proposal of a strong benchmark model
 ◦ Based on a context-aware and emotion-aware reinforced pointer-generator approach

  4. Method
 Task:
 Given the conversation history and the generic response, generate the courteous response.
 Model:
 
 
 
 
 

  5. Encoder, Decoder and Attention
 
 
 
 
 
 


 h*: context vector
 s_t: decoder state
 c: conversation vector
 W, b: learnable parameters
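The equations on this slide did not survive in the transcript. As a rough illustration of how a context vector h* is typically formed in attention-based encoder-decoder models, the sketch below takes a softmax over attention scores and uses it to weight the encoder states. The function names are hypothetical, the scores are passed in directly, and the way the paper's scoring function uses s_t, c, W, and b is not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(encoder_states, scores):
    """Return (attention weights, h*), where h* = sum_i a_i * h_i.

    encoder_states: list of equal-length float lists (one per source token).
    scores: one unnormalized attention score per encoder state (assumed given).
    """
    attention = softmax(scores)
    dim = len(encoder_states[0])
    h_star = [
        sum(a * h[d] for a, h in zip(attention, encoder_states))
        for d in range(dim)
    ]
    return attention, h_star


attention, h_star = context_vector([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With equal scores the two encoder states are averaged, so h* sits halfway between them.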

  6. Output distribution calculation
 
 
 
 
 
 
 


 h*: context vector
 s_t: decoder state
 c: conversation vector
 W, b: learnable parameters
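The formulas on this slide were also lost. For orientation, a standard pointer-generator network mixes the vocabulary distribution with the attention (copy) distribution using a generation probability p_gen. The sketch below shows only that mixing step, under the assumption that the paper follows this standard formulation. The name `final_distribution` is hypothetical, and p_gen is passed in rather than computed from h*, s_t, and c.

```python
def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities.

    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on source
    positions whose token is w. Source tokens outside the vocabulary get
    probability mass from copying alone.
    """
    final = {word: p_gen * prob for word, prob in vocab_dist.items()}
    for weight, token in zip(attention, source_tokens):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


dist = final_distribution(
    p_gen=0.8,
    vocab_dist={"please": 0.5, "dm": 0.5},
    attention=[0.25, 0.75],
    source_tokens=["dm", "gillingham"],  # second token is out of vocabulary
)
```

Here the out-of-vocabulary source token can still be produced, because the copy term assigns it probability even though the vocabulary distribution does not.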

  7. Model Training
 Joint reinforcement learning
 self-critical policy gradient (Rennie et al., 2017)
 
 
 
 
 
 Reward: BLEU (m1), Emotional accuracy (m2) 
 
 
 Loss function
 
 x_1: generic response
 x_2: conversation history
 y^s: sampled from p(y^s_t | y^s_1, ..., y^s_{t-1}, x)
 y^g: obtained by greedy search
 λ1 = 0.75, λ2 = 0.25
 η = 0.99
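A minimal sketch of how these pieces could fit together, since the loss formulas themselves are missing from the transcript. It assumes the reward is the weighted sum λ1·m1 + λ2·m2, that the self-critical loss uses the greedy output y^g as the baseline for the sampled output y^s (as in Rennie et al., 2017), and that η interpolates between the reinforcement learning loss and a maximum-likelihood loss. The last point in particular is an assumption, and all function names are hypothetical.

```python
def reward(bleu, emotional_accuracy, lam1=0.75, lam2=0.25):
    """Weighted reward r = lam1 * m1 + lam2 * m2 (assumed combination)."""
    return lam1 * bleu + lam2 * emotional_accuracy


def self_critical_loss(sample_logprobs, r_sample, r_greedy):
    """(r(y^g) - r(y^s)) * sum_t log p(y^s_t | y^s_<t, x).

    Minimizing this raises the likelihood of sampled sequences that beat
    the greedy baseline and lowers it for those that do worse.
    """
    return (r_greedy - r_sample) * sum(sample_logprobs)


def joint_loss(l_rl, l_mle, eta=0.99):
    """Assumed mix: eta * L_RL + (1 - eta) * L_MLE."""
    return eta * l_rl + (1.0 - eta) * l_mle


r_s = reward(bleu=0.4, emotional_accuracy=0.8)   # sampled sequence
r_g = reward(bleu=0.3, emotional_accuracy=0.3)   # greedy baseline
l_rl = self_critical_loss([-1.0, -2.0], r_s, r_g)
loss = joint_loss(l_rl, l_mle=2.0)
```

In this example the sampled sequence scores higher than the greedy one, so the RL term is positive times a negative log-likelihood sum, and minimizing it pushes the sampled sequence's probability up.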

  8. Dataset 
 • Customer Support on Twitter
 ◦ Over 3 million tweets and replies from the biggest brands on Twitter
 • Preparing responses of generic style
 ◦ Remove courteous and non-informative sentences
 ▪ Ex: Sorry to hear about the trouble!
 ◦ Retain informative sentences
 ▪ Ex: Simply visit url name to see availability in that area!
 ◦ Transform informative sentences with courteous expressions into informative, generic expressions
 ▪ Ex: We appreciate the feedback, we’ll pass this along to the appropriate team.
 

  9. Process for data creation
 1. Sentence segmentation
 2. Clustering
 ◦ K-Means clustering to cluster these sentences.
 3. Annotations
 ◦ Three annotators proficient in the English language
 ◦ Annotate the sentences into the three categories
 4. Preparing generic responses
 ◦ Create data by the above operation
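The last step can be sketched as a simple filter over annotated sentences. This is an illustration only: the label names, the `make_generic` function, and the `rewrites` lookup are all hypothetical, and the generic rewrite shown for the third category is made up for the example rather than taken from the dataset.

```python
def make_generic(sentences, labels, rewrites):
    """Build a generic response from labeled sentences of a courteous one.

    Labels (hypothetical names for the three annotation categories):
      "courteous"   -> purely courteous, non-informative: dropped
      "informative" -> informative without courtesy: kept as-is
      "mixed"       -> informative with courteous wording: replaced by
                       its generic rewrite from `rewrites`
    """
    generic = []
    for sentence, label in zip(sentences, labels):
        if label == "courteous":
            continue
        if label == "mixed":
            generic.append(rewrites[sentence])
        else:
            generic.append(sentence)
    return " ".join(generic)


sentences = [
    "Sorry to hear about the trouble!",
    "Simply visit url name to see availability in that area!",
    "We appreciate the feedback, we'll pass this along to the appropriate team.",
]
labels = ["courteous", "informative", "mixed"]
# Illustrative rewrite, not from the dataset.
rewrites = {sentences[2]: "We'll pass this along to the appropriate team."}
generic_response = make_generic(sentences, labels, rewrites)
```

The resulting generic response is paired with the original courteous response to form one training example.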

  10. Experiments
 • vocab size: 30k
 • hidden state: 256
 • word embeddings: 128
 • optimizer: AdaGrad with gradient clipping
 • batch size: 16
 
 

  11. Automatic Evaluation
 • BLEU, ROUGE, perplexity
 • Task-specific metrics
 ◦ Content preservation (CP)
 ▪ ROUGE-L recall
 X: original generic response
 Y: generated courteous response
 LCS: longest common subsequence
 ◦ Emotional accuracy (EA)
 ▪ cosine similarity between the MojiTalk emotion distributions Xe and Ye
 Xe: MojiTalk distribution of the original generic response
 Ye: MojiTalk distribution of the generated courteous response
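Both task-specific metrics are easy to sketch. The code below assumes content preservation is LCS(X, Y) divided by the length of X (the usual ROUGE-L recall) and that emotional accuracy is a plain cosine similarity between two emotion distributions. The MojiTalk classifier that would produce those distributions is not included, so the vectors are passed in directly, and the function names are hypothetical.

```python
import math


def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists."""
    table = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xi in enumerate(x, start=1):
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(x)][len(y)]


def content_preservation(generic_tokens, generated_tokens):
    """ROUGE-L recall: LCS(X, Y) / |X|, with X the generic response."""
    if not generic_tokens:
        return 0.0
    return lcs_length(generic_tokens, generated_tokens) / len(generic_tokens)


def emotional_accuracy(xe, ye):
    """Cosine similarity between two emotion distributions."""
    dot = sum(a * b for a, b in zip(xe, ye))
    norm = math.sqrt(sum(a * a for a in xe)) * math.sqrt(sum(b * b for b in ye))
    return dot / norm if norm else 0.0


x = "please send us a dm".split()
y = "we want to help please send us a dm".split()
cp = content_preservation(x, y)
```

Because every token of the generic response appears in order in the generated one, content preservation is 1.0 in this example; dropping a token from the generated response would lower it.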
 

  12. Human evaluation
 • Randomly sample 500 responses
 • Three annotators


    • Fluency (F)
 ◦ 0: incorrect or incomplete
 ◦ 1: moderately correct
 ◦ 2: correct
 • Content Adequacy (CA)
 ◦ same as fluency
 • Courtesy Appropriateness (CoA)
 ◦ -1: inappropriate
 ◦ 0: non-courteous
 ◦ 1: appropriate

  13. Error Analysis
 • Unknown Tokens
 ◦ Model 1 does not have the copying mechanism
 ◦ Incomplete sequences
 • Wrong copying
 ◦ Influenced by the language model
 ◦ Gold: ..which store in gilingham did you visit?
 ◦ Predict: ..which store in belgium did you visit?
 • Mistakes in emotion identification
 ◦ More prominent in Models 1 and 2
 ◦ Gold: you’re very welcome, hope the kids have an amazing halloween !
 ◦ Predict: we apologize for the inconvenience. hope the kids have an amazing halloween !
 

  14. Error Analysis
 • Extra information
 ◦ Models 1, 2, and 3 sometimes generate extra informative sentences.
 ◦ Gold: please send us a dm
 ◦ Predict: please send us a dm please let us know if you did not receive it
 • Contextually wrong courteous phrases
 ◦ Gold: we want to help, reply by dm and ..
 ◦ Predict: im sorry you havent received it. please reply by dm and ..
 
 
 

  15. Conclusions
 • They propose a new research problem
 ◦ Inducing courteous behavior in customer care responses.
 • They create a large benchmark corpus
 • Proposed framework
 ◦ Model the dialogue history and the past emotional states through emotional embeddings.
 • Automatic and Human evaluation
 • Qualitative and Quantitative analysis
 ◦ The model shows correct courteous behavior and content preservation, with minor inaccuracies