
Always empathic, but never biased

Wolf Paulus
February 09, 2020

Striving for likability (always empathic, but never biased). Over the next decade, the ambient computing era will eclipse the PC era. Information will be available everywhere, accessible without friction or frustration through voice user interfaces. Many of the old recipes for creating delight don't apply to voice-first or voice-only experiences. Recent research shows that when communicating emotions, your voice matters more than your words, and that those emotions are perceived more accurately in a voice-only interaction than in one with additional visual information. This talk explores and demonstrates possibilities for likable and unbiased engagement using affective computing technologies and emotion analytics (e.g., expressive speech synthesis, sentiment analysis, or readability statistics). After briefly revisiting why a young, feminine, overly upbeat voice might not always be the most appropriate choice when synthesizing messages, we will turn to a more controversial but interesting topic: genderless bots, and an approach that uses "Neural Word Embeddings" to detect bias before a bot relays it to its users. A Voice User Interface speaks for you, your products, or your company. It should connect appropriately with users, could even encourage appropriate behavior change, and should reflect your corporate values in the relationship with your customers.




  1. Wolf Paulus - Intuit Inc. Technology Futures

    Striving for likability (always empathic, never biased)
  2. Wolf Paulus - Intuit Inc. Type • Click • Touch • Experience • Talk

    Years on the timeline: 1981, 1985, 1994, 2007, 2014
  3. Wolf Paulus - Intuit Inc. Wolf Paulus - 2017

  4. Wolf Paulus - Intuit Inc. From immersive to frictionless

    • 3D: Virtual Reality (immersive)
    • 2D: WIMP (windows, icons, menus, pointer)
    • 1D: Texting / ChatBOT UI
    • 0D: Voice User Interface (frictionless)
  5. Wolf Paulus - Intuit Inc. Frictionless 0D VUI

    In this new environment of ambient computing, form factor or looks hardly matter. Not only what a virtual assistant says but, equally important, how it says it will determine success. Likability becomes the ultimate differentiator in an otherwise undifferentiable experience.
  6. Wolf Paulus - Intuit Inc. Validate that a response carries

    the intended attitude. Apply techniques and off-the-shelf tools that were originally created to analyze user input, such as customer feedback.
  7. Wolf Paulus - Intuit Inc. Sentiment Analysis

  8. Wolf Paulus - Intuit Inc. Dictionary of Affect in Language

  9. Wolf Paulus - Intuit Inc. Sense Position Positive Negative Objective

    Term        Positive   Objective   Negative
    faithful    0.625      0.375       0
    hateful     0.333      0           0.667
    honorable   0.625      0.125       0.25

    Terms plotted: honorable, hateful, faithful

    Quantified Synonyms
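The scores on this slide can be turned into a single polarity number per term. The following is a minimal sketch, assuming that "positive minus negative" is an acceptable polarity measure; the helper names `polarity`, `most_positive`, and `response_polarity` are hypothetical and not part of any tool named in the talk. Only the three terms from the slide are included.

```python
# Scores copied from the slide: (positive, objective, negative) per term.
SCORES = {
    "faithful": (0.625, 0.375, 0.0),
    "hateful": (0.333, 0.0, 0.667),
    "honorable": (0.625, 0.125, 0.25),
}


def polarity(term):
    """Positive score minus negative score (an assumed, simple measure)."""
    positive, _objective, negative = SCORES[term]
    return positive - negative


def most_positive(terms):
    """Pick the synonym with the highest polarity."""
    return max(terms, key=polarity)


def response_polarity(text):
    """Average polarity over the words we have scores for; 0.0 if none."""
    known = [polarity(w) for w in text.lower().split() if w in SCORES]
    return sum(known) / len(known) if known else 0.0
```

A bot could run `response_polarity` over a drafted reply and choose among candidate synonyms with `most_positive` before speaking; a real lexicon would cover far more words than this toy table.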
  10. Wolf Paulus - Intuit Inc. Quantified Synonyms

  11. Wolf Paulus - Intuit Inc.

  12. Wolf Paulus - Intuit Inc. Voice-Only Communication Enhances Empathic Accuracy

    Michael W. Kraus, Yale University, School of Management

    This research tests the prediction that voice-only communication increases empathic accuracy over communication across senses. We theorized that people often intentionally communicate their feelings and internal states through the voice, and as such, voice-only communication allows perceivers to focus their attention on the channel of communication most active and accurate in conveying emotions to others. We used 5 experiments to test this hypothesis (N = 1,772), finding that voice-only communication elicits higher rates of empathic accuracy relative to vision-only and multisense communication both while engaging in interactions and perceiving emotions in recorded interactions of strangers. Experiments 4 and 5 reveal that voice-only communication is particularly likely to enhance empathic accuracy through increasing focused attention on the linguistic and paralinguistic vocal cues that accompany speech. Overall, the studies question the primary role of the face in communication of emotion, and offer new insights for improving emotion recognition accuracy in social interactions.

    Keywords: emotion, empathic accuracy, person perception, organizational behavior
    Supplemental materials: http://dx.doi.org/10.1037/amp0000147.supp

    Social mammals have a profound capacity to connect with others: Young Rhesus monkeys will cling to a cloth surrogate that provides the simulated tactile warmth of a caregiver, rather than a wire one that provides nutrients (Harlow, 1958), and infants have the ability to mimic simple facial expressions soon after birth (Meltzoff, 1985). Social connections are critical for managing the survival-related threats that individuals experience (Bowlby, 1988).

    One way that individuals develop and maintain social connections is through empathic accuracy: the ability to judge the emotions, thoughts, and feelings of other individuals (Côté & Miners, 2006; Ickes, Stinson, Bissonnette, & Garcia, 1990; Stinson & Ickes, 1992). With enhanced empathic accuracy, individuals can respond more appropriately to conflicts at work (Côté & Miners, 2006) and to support-seeking romantic partners (Richards, Butler, & Gross, 2003). Enhanced empathic accuracy also allows individuals to more easily navigate complex political organizations and social networks (Mayer, Salovey, & Caruso, 2008). In contrast, a dearth of empathic accuracy is a common symptom of many psychological disorders (American Psychiatric Association, 2013).

    Despite powerful motivations to connect with others, many people experience failures in social connection and understanding (Hawkley & Cacioppo, 2010). In the present research, we suggest that one potent barrier to empathic accuracy is the ways in which emotion expressions across modalities divide our attention between more and less relevant channels of communication. Humans have an impressive array of tools for expressing and perceiving the emotions of others (Zaki, Bolger, & Ochsner, 2009). Research on emotion recognition began with studies testing the hypothesis that people can recognize facial expressions of emotion cross-culturally (Ekman, 1989; Russell, 1994). More recent research reveals the power of other senses to accurately communicate emotions: Touches on the body and forearms of a stranger communicate an array of emotions (Hertenstein, Keltner, App, Bulleit, & Jaskolka, 2006) as do nonword vocal bursts played back to strangers (Gendron, Roberson, van der Vyver, & Barrett, 2014; Simon-Thomas, Keltner, Sauter, Sinicropi-Yao, & Abramson, 2009).

    In particular, we contend that the voice, including both speech content and the linguistic and paralinguistic vocal cues (e.g., pitch, cadence, speed, and volume) that accompany it, is a particularly powerful channel for perceiving the emotions of others. This assertion supports the central prediction tested in these studies: that voice-only …

    I thank Noam Segal for his contributions to early versions of this article and Zoë Kraus for inspiration. I also thank Jessica Halten, the members of the Champaign Social Interaction Laboratory, and the Yale University School of Management Behavioral Laboratory for assistance with data collection. Correspondence concerning this article should be addressed to Michael W. Kraus, Organizational Behavior, Yale University School of Management, 165 Whitney Avenue, 5477 Evans Hall, New Haven, CT 06511. E-mail: michael.kraus@yale.edu

    American Psychologist, 2017, Vol. 72, No. 7, 644–654. © 2017 American Psychological Association. 0003-066X/17/$12.00 http://dx.doi.org/10.1037/amp0000147
    https://www.apa.org/pubs/journals/releases/amp-amp0000147.pdf

    Voice-only communication is more accurate than visual alone or voice + visual when it comes to determining a speaker's emotions.
  13. Wolf Paulus - Intuit Inc. Speech Synthesis Markup Language to impact affect

    SSML - Speech Synthesis Markup Language (SSML) Version 1.1, W3C Recommendation, 7 September 2010
    • <emphasis level=".."> the enclosed text is spoken with emphasis
    • <prosody pitch=".."> modifies the baseline pitch, e.g., low / high
    • <prosody rate=".."> changes the speaking rate, e.g., slow / fast
    • <prosody volume=".."> modifies the volume, e.g., soft / loud
    • <prosody range=".."> modifies the pitch range (variability), e.g., low / high
    • <prosody contour=".."> sets the actual pitch contour for the contained text (time position, target)

    “Voice Transformation SSML”
    • <glottal_tension pitch=".."> tense or lax speech quality, e.g., low / high (a low value is perceived as more breathy and generally more pleasant)
    • <breathiness level=".."> perceived level of the aspiration noise (drawing breath), e.g., low / high

    “Expressive SSML”
    • <express-as type="GoodNews"> expresses a positive, upbeat message
    • <express-as type="Apology"> expresses a message of regret
    • <express-as type="Uncertainty"> conveys an uncertain, interrogative message

    “Amazon:Emotion”
    • <amazon:emotion name="excited" intensity="medium">Congrats, …
    • <amazon:emotion name="disappointed" intensity="high">I'm so sorry …
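As an illustration of the W3C `<prosody>` attributes listed on this slide, here is a minimal sketch that wraps a message in SSML from Python. The helper names `prosody` and `speak` and the sample sentence are assumptions made for this example. The bare `<speak>` root is a simplification: a standalone SSML 1.1 document also carries version and namespace attributes, and each speech service documents which subset of SSML it accepts.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, **attrs):
    """Wrap text in an SSML <prosody> element with the given attributes."""
    attr_str = "".join(
        f" {name}={quoteattr(value)}" for name, value in attrs.items()
    )
    # escape() protects '&', '<' and '>' in the spoken text.
    return f"<prosody{attr_str}>{escape(text)}</prosody>"


def speak(*fragments):
    """Join fragments under a simplified <speak> root element."""
    return "<speak>" + "".join(fragments) + "</speak>"


# Hypothetical apology, spoken slowly, low and soft.
ssml = speak(
    prosody("I'm so sorry about the delay.", rate="slow", pitch="low", volume="soft")
)
```

Printing `ssml` gives a single string that can be pasted into a speech service's SSML test console to compare how different `rate`, `pitch`, and `volume` settings change the perceived attitude of the same words.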
  14. Wolf Paulus - Intuit Inc.

  15. Wolf Paulus - Intuit Inc. Speech Synthesis

    Identifying emotion in a speaker's voice
  16. Wolf Paulus - Intuit Inc. Riddle …

  17. Wolf Paulus - Intuit Inc.

  18. Wolf Paulus - Intuit Inc.

  19. Wolf Paulus - Intuit Inc.

  20. Wolf Paulus - Intuit Inc.

  21. Wolf Paulus - Intuit Inc.

  22. Wolf Paulus - Intuit Inc. Who is the doctor?

  23. Wolf Paulus - Intuit Inc. Who is the doctor?

  24. Wolf Paulus - Intuit Inc. Bias is disproportionate weight in

    favor of or against an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair. Biases can be innate or learned. People may develop biases for or against an individual, a group, or a belief.

  26. Wolf Paulus - Intuit Inc. Use gender-neutral language

    https://contentdesign.intuit.com/accessibility-and-inclusion/use-gender-neutral-language/

    Inclusive gender-neutral alternatives
    There are other ways to be gender neutral and inclusive:
    • Instead of “ladies and gentlemen,” use something like “distinguished guests” or be more specific and say “customers” or “developers.”
    • Instead of “men” or “women,” use “everyone.”
    • Instead of “the lady or man in the green shirt,” say “the person in the green shirt.”
    • Instead of “guys,” try “folks” or “friends” or “team.” These are all better choices in communications such as Slack messages.
    • Instead of “boys and girls,” simply say “children.”
    • Instead of “brothers and sisters,” try “siblings.”

    https://contentdesign.intuit.com/
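Guidance like this can be checked mechanically before a bot speaks. The sketch below is a hypothetical phrase lookup, not the content design team's tooling: the `ALTERNATIVES` table holds a few of the pairs from the slide, and `flag_gendered_phrases` is a name invented for this example.

```python
import re

# Phrase -> suggested alternative, taken from the guidance on the slide.
ALTERNATIVES = {
    "ladies and gentlemen": "distinguished guests",
    "boys and girls": "children",
    "brothers and sisters": "siblings",
    "guys": "folks",
}


def flag_gendered_phrases(text):
    """Return (phrase, suggestion) pairs for each listed phrase found in text."""
    findings = []
    for phrase, suggestion in ALTERNATIVES.items():
        # Word boundaries avoid matching inside longer words.
        pattern = rf"\b{re.escape(phrase)}\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            findings.append((phrase, suggestion))
    return findings
```

For example, `flag_gendered_phrases("Hey guys, welcome!")` reports the phrase "guys" together with the suggestion "folks". A fixed list only catches exact phrases, which is one reason the later slides turn to word embeddings.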
  27. Wolf Paulus - Intuit Inc. Hunting for an engineer who

    will tackle our toughest problems and smashing bugs in our huge code base.
  28. Wolf Paulus - Intuit Inc. Word Embeddings

    A word embedding generally maps each word, via a dictionary, to a vector.
  29. Wolf Paulus - Intuit Inc. Word Embeddings

    Task: Given a specific word in the middle of a sentence (the input word), look at the words nearby. A typical window size might be 5, meaning 5 words behind and 5 words ahead (10 in total). The network is going to tell the probability for every word in the vocabulary of being a “nearby word”. The output probabilities relate to how likely it is to find each vocabulary word near the input word. For example, if you gave the trained network the input word “Soviet”, the output probabilities are going to be much higher for words like “Union” and “Russia” than for unrelated words like “watermelon” and “kangaroo”. The neural network is trained by feeding it word pairs found in the training documents (corpus). The example shows some of the training samples (word pairs), taken from a training sentence, using a small window size of 2, with the highlighted word being the input word.
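The word-pair generation described on this slide can be sketched in a few lines. This is an illustration only: the function name `skipgram_pairs` and the sample sentence are assumptions, and real training pipelines typically add steps such as subsampling frequent words.

```python
def skipgram_pairs(tokens, window=2):
    """Return (input_word, nearby_word) pairs for every position in tokens."""
    pairs = []
    for i, center in enumerate(tokens):
        # Look up to `window` words behind and ahead, clipped at the edges.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical training sentence, window size 2 as on the slide.
sentence = "the quick brown fox jumps".split()
pairs = skipgram_pairs(sentence, window=2)
```

With a window of 2, the first input word "the" has no words behind it, so it contributes only the pairs ("the", "quick") and ("the", "brown"); a word in the middle of the sentence contributes four pairs.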
  30. Wolf Paulus - Intuit Inc. Word Embeddings

    A simple neural network with a single hidden layer is trained to perform a certain task. However, the goal is not to use the neural network once it's trained, but to learn the weights of the hidden layer; these weights are the “word vectors”.
  31. Wolf Paulus - Intuit Inc. Neural word embeddings
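Once words are vectors, one common way to look for gender bias is to measure how far a word leans along a "he minus she" direction. The sketch below is not the method used in the talk's demo, which the slides do not describe. The 3-dimensional vectors are made up for illustration (real embeddings are learned from a corpus and have hundreds of dimensions), and `gender_lean` is a hypothetical helper.

```python
import math

# Made-up toy vectors, for illustration only.
VECTORS = {
    "he": (1.0, 0.0, 0.2),
    "she": (-1.0, 0.0, 0.2),
    "engineer": (0.6, 0.8, 0.1),
    "person": (0.0, 0.9, 0.3),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def gender_lean(word):
    """Cosine between the word and the he-minus-she direction.

    Positive values lean towards "he", negative towards "she",
    and values near zero are roughly neutral.
    """
    direction = tuple(h - s for h, s in zip(VECTORS["he"], VECTORS["she"]))
    return cosine(VECTORS[word], direction)
```

In this toy data "engineer" leans towards "he" while "person" is neutral. A bot could apply the same check with real embeddings to flag wording before relaying it, using a threshold chosen by experiment.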

  32. Wolf Paulus - Intuit Inc. Demo https://expo.futures.a.intuit.com/mumbler/

  33. Wolf Paulus - Intuit Inc. Mumbler consists of two Docker

    containers, whose images are stored in AWS ECR and which are deployed as Fargate tasks in AWS ECS
  34. Wolf Paulus - Intuit Inc. Summary & conclusion

    • Most tools & techniques mentioned were created to analyze customer feedback.
    • Re-use them to validate that responses carry only the intended attitude and sentiment, and no unintended bias.
    • Smart speakers may hear "Please" and "Thanks" less often than before. Let’s make sure the skills or bots we are building respond kindly, considerately, and, if warranted, empathically, and thereby deserve a user’s politeness.
  35. Wolf Paulus - Intuit Inc. Thank You!

    Striving for likability (always empathic, never biased)