

2019 - Hearing Voices: Considering the Complexity of Vocal Interface Design

UX Y'all 2019 Session by Gretchen McNeely + Scott McCall


October 04, 2019

Transcript

  1. Voice hype is getting louder
     One in six Americans owns a smart speaker, up 128% since January 2017 (NPR and Edison Research).
     50% of all searches will be voice driven by 2020 (comScore).
     Nearly 100 million smartphone users will be using voice assistants in 2020 (eMarketer).
  2. Zero UI and the dream of frictionless HCI
     Technology has evolved enough to begin meeting humans a little more in the middle, interaction-wise.
  3. “Zero UI minimizes the cost to the user” (Andy Goodman, Fjord)
     By not worrying about acting like a computer, you can theoretically:
     • Save time and money
     • Be more efficient
     • Decrease socially deviant behaviors like staring at your iPhone
  4. How well is voice currently working?
     Voice is growing in popularity, but how well is it working?
     Common uses: weather, jokes, novelty quizzes, timers.
     Interaction that mimics strict human-to-human conversation remains limited.
  5. Voice and audio are good for...
     • Safety in a hands- or eyes-free environment
     • Shared interfaces for IoT or ambient devices without screens nearby (e.g. Nest)
     • Languages that are hard to type
     • Complicated things that people can articulate (e.g. Apple TV: “Give me thriller movies with Nicolas Cage, for free. Only show the ones with four or more stars”); see the sketch after this list
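     A rough aside on why that Apple TV example works: one spoken sentence compresses several filters that would each take separate taps on screen. A minimal sketch of the structured query such an utterance might map to (field names are assumptions for illustration, not a real Apple TV or Siri API):

```python
# Hypothetical parse of "Give me thriller movies with Nicolas Cage, for free.
# Only show the ones with four or more stars." Field names are illustrative only.
query = {
    "genre": "thriller",
    "cast": "Nicolas Cage",
    "price": "free",
    "min_rating": 4.0,
}

# One utterance fills several filters at once; the catalog entry is made-up sample data.
catalog = [
    {"title": "Red Rock West", "genre": "thriller",
     "cast": ["Nicolas Cage"], "price": "free", "rating": 4.1},
]
results = [m for m in catalog
           if m["genre"] == query["genre"]
           and query["cast"] in m["cast"]
           and m["price"] == query["price"]
           and m["rating"] >= query["min_rating"]]
print(results)
```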
  6. AgVoice enables hands-free crop reporting
     AgVoice is freeing up crop inspectors from paperwork, keeping them mobile and efficient.
     • The system prompts the user, then records the inspector’s response (such as the presence of pests, the growth progress of fruit and vegetables, etc.).
     • It tags the data with a timestamp and geo-location, then uploads it to a secure cloud for transcription and automated report creation (a rough sketch of such a record follows below).
     http://www.agvoiceglobal.com/
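     A minimal sketch (not AgVoice’s actual schema or API; the field names and upload step are assumptions) of the kind of timestamped, geo-tagged record such a pipeline might produce before sending it off for transcription:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class InspectionRecord:
    """One voice-captured crop observation (hypothetical schema)."""
    inspector_id: str
    transcript: str          # e.g. "aphids present on rows 3 through 7"
    latitude: float
    longitude: float
    captured_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def upload(record: InspectionRecord) -> None:
    # Stand-in for the "secure cloud" step: a real system would send the record
    # to a transcription and report-generation service.
    print(f"Uploading {record.captured_at.isoformat()}: {record.transcript}")

upload(InspectionRecord(
    inspector_id="inspector-042",
    transcript="early blight on tomato block B, roughly 10 percent of plants",
    latitude=35.78, longitude=-78.64,
))
```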
  7. Voice and audio are bad for...
     • Anything requiring negotiation or a lot of variables
     • Huge amounts of input or output
     • Input that is hard for humans to describe
     • Comparing lists of complicated things
  8. Google’s E Ink display: Voice enhances touch
     • Debuted at CES 2019
     • Shows weather, traffic, events
     • Touch with voice allows for more useful and efficient interaction with the virtual assistant
  9. Voice design raises tough questions
     • Conversational complexity
     • Power dynamics
     • Gender and the uncanny valley
     • Voice selection control
     • Auditory qualities, syntax, semantics
     • Designing for multiple audiences
     Hat tip: WIDD 9/24 panelists and attendees
  10. The human challenge
     There’s baggage in every interaction. Voice tech may be under-equipped to handle:
     - Body language
     - Auditory (syntax and non-syntax) cues
     - Human conversational expectations
     - Power dynamics
     - Speed
     - Capability
     - Flexibility and pivot
  11. Design and technical dilemmas
     A technical system is tasked with understanding non-technical content. Verbal elements create roadblocks:
     - Anger, frustration
     - Tone
     - Semantic confusion
     Losing the visual interface means increased expectations of “humanity.”
     Example: Alexa’s “Sorry, I misunderstood,” based on perceived agitation in the user’s voice.
  12. Empathy for the machine
     What do we owe the computer as designers? What types of relationships are we promoting? How does that influence the way humans interact with voice interfaces?
     - Mirror
     - Authority and subordinate (and vice-versa)
     - Peer to peer
     - Guide and guided
     - Co-captains (Michael Knight and K.I.T.T.)
     What does this mean for the way we manage other verbal interactions?
  13. Gender: a sticking point in voice design
     The “servile companion”:
     - Alexa
     - Google Home’s female voice default
     - Cortana
     - Siri
     What happens when we move away from the binary? What about the “uncanny valley”?
  14. The future is female
     What are users conveying when they retain default female voices? What do designers convey when we default to this?
     Why are we the keepers of that decision? What if you could roll your own?
  15. Auditory voice considerations
     What about turning over control to the user? (A sketch of what that could look like follows this list.)
     • Volume
     • Pitch / Tuning
     • Pace
     • Pause
     • Resonance
     • Intonation / Timbre
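     As one illustration of handing these knobs to the user (not from the deck), SSML already exposes rate, pitch, volume, and pauses; a minimal sketch that turns assumed user preferences into markup before it reaches a text-to-speech engine:

```python
# Minimal sketch: wrap text in SSML <prosody> and <break> tags so user
# preferences (parameter names assumed for illustration) control pace, pitch,
# volume, and pause length. Resonance and timbre have no direct SSML equivalent.
def to_ssml(text: str, rate: str = "medium", pitch: str = "medium",
            volume: str = "medium", pause_ms: int = 0) -> str:
    pause = f'<break time="{pause_ms}ms"/>' if pause_ms else ""
    return (f'<speak><prosody rate="{rate}" pitch="{pitch}" volume="{volume}">'
            f"{text}</prosody>{pause}</speak>")

print(to_ssml("Your timer is done.", rate="slow", pitch="low",
              volume="soft", pause_ms=300))
```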
  16. Linguistic voice considerations
     • Specific nature of tasks / transactions
     • Likely user terms, both volunteered and recognized
     • Semantics (meaning)
     • Syntax (structure)
     • Accent
     • Dialect (accent plus syntax plus semantics)
     Who leads the charge on iterative design and revision?
  17. Specialized audiences
     • Children: how do their interactions with voice assistants shape their interactions with other humans? Is voice a training tool?
     • The elderly: are we considering the role of compassion and companionship for the lonely? What are the ethics behind using family members’ voices in assistants?
     • Neurodiverse users: factual responses and a lack of non-verbal cueing can ease the way for autistic users
  18. The Future of Voice... May Not Be Voice
     Rise of a new syntax: streamlining voice interaction
     • Role of “earcons”
     • “Alexa Brief”: we change how we communicate in order to accommodate the computer’s needs
     • Also streamlining: reduction of “wake” words for Alexa and Google to create a more seamless and efficient interaction
  19. Replacing Voice with Audio and Tactile Cues
     Step away from a human model for HCI interactions. Instead, emphasize haptics and audio cues (“earcons”).
     Could address multiple issues:
     - Shifts us away from human-human conversational expectations
     - Greater accessibility for disabled users
     - Reduction in voice-based bias
     - More flexibility for users placed under temporary limitations
  20. Moving away from voice
     Audio does not have to mean voice… Audio “earcons,” haptics, visual cues, and physical gesture can form a communication ecosystem that reduces bias and allows optimal accessibility.
     As potential voice designers, maybe we won’t be designing for voice.