Courteously Yours: Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network

Slides for a presentation at our lab's paper reading group.

ryoma yoshimura

December 18, 2019

Transcript

  1. Courteously Yours: Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network
 Hitesh Golchha, Mauajama Firdaus, Asif Ekbal, Pushpak Bhattacharyya
 NAACL 2019

 2019/12/18 Paper reading group
 Presenter: Yoshimura

  2. Introduction
 • Customer care
 ◦ An essential tool used by companies in building stable customer relations
 • What's important
 ◦ Providing customer satisfaction by greeting, empathizing, and apologizing at the right time
 ◦ This builds a strong relationship with the customer and increases customer retention
 • They propose an effective framework
 ◦ Induces courteous behavior in customer care responses
 ◦ The system adds a courteous nature and emotional sense to the replies

  3. Example of Courteous Responses
 [Figure: example of courteous responses]

 Main Contributions
 • Creation of a high-quality and large conversational dataset
 ◦ Courteously Yours Customer Care Dataset (CYCCD), prepared from actual conversations on Twitter
 • Proposal of a strong benchmark model
 ◦ Based on a context-aware and emotionally aware reinforced pointer-generator approach

  4. Method
 Task:
 Given the conversation history and the generic response, generate the courteous response.

 Model: [architecture figure shown on the slide]
  5. Encoder, Decoder and Attention
 [Equations shown as images on the slide]

 h*: context vector
 s_t: decoder state
 c: conversation vector
 W, b: parameters
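The attention equations on this slide were images and were lost in transcription. A plausible reconstruction in the standard pointer-generator notation of See et al. (2017), using the variable list above; adding the conversation vector c through an extra term W_c c is my assumption about how the paper extends it:

```latex
% Attention over encoder states h_i at decoder step t.
% The W_c c term is an assumed extension for the conversation vector.
e_i^t = v^\top \tanh\!\left( W_h h_i + W_s s_t + W_c c + b_{\mathrm{attn}} \right) \\
a^t = \mathrm{softmax}(e^t) \\
h_t^* = \sum\nolimits_i a_i^t \, h_i
```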

  6. Output distribution calculation
 [Equations shown as images on the slide]

 h*: context vector
 s_t: decoder state
 c: conversation vector
 W, b: parameters
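As on the previous slide, the equations were images. A reconstruction following the standard pointer-generator output distribution (See et al., 2017); concatenating the conversation vector c into the vocabulary projection is an assumption based on the variable list:

```latex
% Vocabulary distribution from the decoder state, context vector,
% and (assumed) conversation vector.
P_{\mathrm{vocab}} = \mathrm{softmax}\!\left( V' \left( V [s_t;\, h_t^*;\, c] + b \right) + b' \right) \\
% Generation probability: how much to generate vs. copy.
p_{\mathrm{gen}} = \sigma\!\left( w_{h^*}^\top h_t^* + w_s^\top s_t + w_x^\top x_t + b_{\mathrm{ptr}} \right) \\
% Final mixture of generating from the vocabulary and copying
% source tokens via the attention weights.
P(w) = p_{\mathrm{gen}} P_{\mathrm{vocab}}(w) + (1 - p_{\mathrm{gen}}) \sum\nolimits_{i:\, w_i = w} a_i^t
```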

  7. Model Training
 Joint reinforcement learning with a self-critical policy gradient (Rennie et al., 2017)

 Reward: BLEU (m1) and emotional accuracy (m2)

 Loss function: [equation shown as an image]

 x1: generic response
 x2: conversation history
 y^s: sequence sampled from p(y_t^s | y_1^s, ..., y_{t-1}^s, x)
 y^g: sequence obtained by greedy search
 λ1 = 0.75, λ2 = 0.25
 η = 0.99
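The loss itself was an image. A reconstruction from the variables above, following the usual self-critical formulation (Rennie et al., 2017) with a mixed ML/RL objective; reading η as the mixing weight is my assumption:

```latex
% Reward: weighted combination of BLEU and emotional accuracy.
r(y) = \lambda_1 m_1(y) + \lambda_2 m_2(y) \\
% Self-critical policy gradient: the greedy output y^g serves as baseline.
L_{\mathrm{RL}} = \left( r(y^g) - r(y^s) \right) \sum_t \log p\!\left( y_t^s \mid y_1^s, \ldots, y_{t-1}^s, x \right) \\
% Mixed objective with the maximum-likelihood loss.
L = \eta L_{\mathrm{RL}} + (1 - \eta) L_{\mathrm{ML}}
```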

  8. Dataset
 • Customer Support on Twitter
 ◦ Over 3 million tweets and replies from the biggest brands on Twitter
 • Preparing responses in a generic style
 ◦ Remove courteous and non-informative sentences
 ▪ Ex: Sorry to hear about the trouble!
 ◦ Retain informative sentences
 ▪ Ex: Simply visit url name to see availability in that area!
 ◦ Transform informative sentences with courteous expressions into informative and generic expressions
 ▪ Ex: We appreciate the feedback, we'll pass this along to the appropriate team.

  9. Process for data creation
 1. Sentence segmentation
 2. Clustering
 ◦ K-Means clustering to group the sentences
 3. Annotation
 ◦ Three annotators proficient in English
 ◦ Annotate the sentences into the three categories
 4. Preparing generic responses
 ◦ Create the data by the operations above
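The slide gives no clustering details; below is a minimal sketch of K-Means sentence clustering with scikit-learn. The TF-IDF features, k=2, and the toy sentences are stand-ins, not the paper's actual setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy segmented sentences (illustrative only).
sentences = [
    "sorry to hear about the trouble!",
    "we appreciate the feedback.",
    "simply visit url_name to see availability in that area!",
    "please send us a dm.",
]

# Represent each sentence as a TF-IDF vector; the paper's actual
# sentence representation is not specified on the slide.
vectors = TfidfVectorizer().fit_transform(sentences)

# Cluster similar sentences so annotators can label clusters rather
# than every sentence individually. k=2 is illustrative.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
print(kmeans.labels_)
```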

  10. Experiments
 • vocabulary size: 30k
 • hidden state size: 256
 • word embedding size: 128
 • optimizer: AdaGrad with gradient clipping
 • batch size: 16
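A minimal PyTorch sketch of this optimizer setup; the learning rate, clipping threshold, and the stand-in model are assumptions, since the slide only names AdaGrad and gradient clipping:

```python
import torch
import torch.nn as nn

# Stand-in for the full encoder-decoder (sizes from the slide).
model = nn.LSTM(input_size=128, hidden_size=256)

# AdaGrad as stated on the slide; lr=0.15 is an assumed value.
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.15)

def train_step(loss):
    optimizer.zero_grad()
    loss.backward()
    # Gradient clipping as stated on the slide; max_norm=2.0 is assumed.
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0)
    optimizer.step()

# Dummy forward pass and loss, just to show the update in action.
x = torch.randn(5, 16, 128)   # (seq_len, batch=16, input_size)
out, _ = model(x)
train_step(out.pow(2).mean())
```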
 
 

  11. Automatic Evaluation
 • BLEU, ROUGE, perplexity
 • Task-specific metrics
 ◦ Content preservation (CP)
 ▪ ROUGE-L recall between X and Y (LCS: longest common subsequence)
 ◦ Emotional accuracy (EA)
 ▪ Cosine similarity between the MojiTalk distributions Xe and Ye
 X: original generic response
 Y: generated courteous response
 Xe: MojiTalk distribution of the original generic response
 Ye: MojiTalk distribution of the generated courteous response
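Written out from the definitions above; the exact normalization is my reading (ROUGE-L recall divides the LCS length by the reference length):

```latex
\mathrm{CP} = \frac{\mathrm{LCS}(X, Y)}{|X|}
\qquad
\mathrm{EA} = \frac{X_e \cdot Y_e}{\lVert X_e \rVert \, \lVert Y_e \rVert}
```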
 

  12. Human evaluation
 • Randomly sample 500 responses
 • Three annotators
 • Fluency (F)
 ◦ 0: incorrect or incomplete
 ◦ 1: moderately correct
 ◦ 2: correct
 • Content Adequacy (CA)
 ◦ Same scale as fluency
 • Courtesy Appropriateness (CoA)
 ◦ -1: inappropriate
 ◦ 0: non-courteous
 ◦ 1: appropriate

  13. Error Analysis
 • Unknown tokens
 ◦ Model 1 does not have the copying mechanism to handle them
 ◦ Produces incomplete sequences
 • Wrong copying
 ◦ Caused by the influence of the language model
 ◦ Gold: ..which store in gilingham did you visit?
 ◦ Predicted: ..which store in belgium did you visit?
 • Mistakes in emotion identification
 ◦ More prominent in Models 1 and 2
 ◦ Gold: you're very welcome, hope the kids have an amazing halloween !
 ◦ Predicted: we apologize for the inconvenience. hope the kids have an amazing halloween !

  14. Error Analysis
 • Extra information
 ◦ Models 1, 2, and 3 sometimes generate extra informative sentences
 ◦ Gold: please send us a dm
 ◦ Predicted: please send us a dm please let us know if you did not receive it
 • Contextually wrong courteous phrases
 ◦ Gold: we want to help, reply by dm and ..
 ◦ Predicted: im sorry you havent received it. please reply by dm and ..

  15. Conclusions
 • They propose a new research problem
 ◦ Inducing courteous behavior in customer care responses
 • They create a large benchmark corpus (CYCCD)
 • Proposed framework
 ◦ Models the dialogue history and the past emotional states through emotional embeddings
 • Automatic and human evaluation
 • Qualitative and quantitative analysis
 ◦ Shows correct courteous behavior and content preservation, along with minor inaccuracies