[Refresh] Principled Product Development for Voice-Based Interfaces


Nara Kasbergen

September 07, 2018


  1. 1.
  2. 3.

    Who am I?
    ▪ Senior full-stack web developer
    ▪ At NPR since March 2014
    ▪ Part of a 5-member "innovation lab"-type team focused 100% on voice UI development
      ▫ Formed in September 2017
  3. 6.

    18% of Americans own voice-activated speakers, more than doubling over the past year
    Source: NPR + Edison Research Smart Audio Report
  4. 7.
  5. 10.

    “These devices are evil. We should not allow big corporations like Amazon to spy on our every move.”
    - my boyfriend (and others)
  6. 11.

    Our mission statement
    Be a visionary while guiding new product and service ideas from conception through launch. Along the way, partner with Member stations to create a more informed public, reaching them wherever they are.
  7. 12.

    3 core challenges on voice platforms
    ▪ Respect user privacy as much as possible
    ▪ Anticipate children interacting with your app unsupervised
    ▪ Consider gender … specifically, gender roles and voice
  8. 14.

    “Aren't these devices basically just for spying? I'd be worried about my girlfriend being able to find out everything I say when she's not home.”
    - an acquaintance
  9. 15.

    The challenge: Expectations vs. Reality
    ▪ Nearly all these devices have a "mute" button
      ▫ You can use Wireshark to verify that nothing said while muted is sent to e.g. Amazon
    ▪ Very little data is stored on the device; virtually everything is processed & stored in the cloud
    ▪ Communication happens over HTTPS
  10. 16.

    The challenge: Expectations vs. Reality
    ▪ The developer platform is heavily sandboxed
      ▫ As a developer, I have no control over what happens if my app is not in active use
      ▫ I get no information about e.g. how many people are in the room, which room, etc.
      ▫ By default, user data is anonymized
  11. 17.

    Example scenario: Station search
    A user is using the "Play NPR" skill on Alexa for the first time. They are asked to choose a member station in their area. They specify the location "Seattle, Washington" and are given 3 stations to choose from. The user selects KUOW, which we save as their default station for return visits.
  12. 18.

    Short-term data
    ▪ Most recent search query (e.g. "Seattle, Washington")
    ▪ The search results
    ▪ The last thing the voice assistant said/asked
    Long-term data
    ▪ The user's default station
    ▪ The number of times the user has accessed the app
  13. 19.

    Our approach
    ▪ Use a 24-hour TTL on short-term user data
    ▪ Don't ask for login where it's not required
    ▪ When using login, don't store account data
      ▫ Use access token to retrieve on demand
    ▪ Remove all remaining user data when a user uninstalls the app
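    The data-retention points on this slide can be illustrated with a minimal sketch. It is not NPR's implementation: the `UserDataStore` class, its method names, and the in-memory dictionaries are all hypothetical, and a real skill would more likely rely on a database with native expiry. It only shows the idea of expiring short-term data after 24 hours, keeping long-term data until uninstall, and deleting everything on uninstall.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

SHORT_TERM_TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL named on the slide


class UserDataStore:
    """Hypothetical in-memory store keyed by the platform's anonymized user ID."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._short: Dict[str, Tuple[float, Dict[str, Any]]] = {}
        self._long: Dict[str, Dict[str, Any]] = {}

    def save_short_term(self, user_id: str, data: Dict[str, Any]) -> None:
        # Store the data together with the moment it stops being valid.
        expires_at = self._clock() + SHORT_TERM_TTL_SECONDS
        self._short[user_id] = (expires_at, dict(data))

    def get_short_term(self, user_id: str) -> Optional[Dict[str, Any]]:
        entry = self._short.get(user_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Expired: delete on read instead of returning stale data.
            del self._short[user_id]
            return None
        return data

    def save_long_term(self, user_id: str, **fields: Any) -> None:
        self._long.setdefault(user_id, {}).update(fields)

    def get_long_term(self, user_id: str) -> Dict[str, Any]:
        return self._long.get(user_id, {})

    def forget_user(self, user_id: str) -> None:
        # Intended to be called when the user uninstalls (disables) the app.
        self._short.pop(user_id, None)
        self._long.pop(user_id, None)
```

    In the station search scenario, the search query and results would go through `save_short_term`, the chosen station through `save_long_term(user_id, default_station="KUOW")`, and `forget_user` would run on uninstall.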
  14. 21.

    “These devices are particularly appealing to parents/families, and that continues to be the case, with adoption growing more quickly among that segment.”
    - NPR + Edison Research Smart Audio Report
  15. 22.

    43% of parents purchased the speaker to reduce screen time
    Source: NPR + Edison Research Smart Audio Report
  16. 23.

    Example scenario
    A parent is listening to NPR One on Alexa while their 5-year-old child is playing in the room. They have to step out for a while to make a phone call. When they come back, Alexa is playing an episode of a podcast featuring curse words.
  17. 24.

    Our approach
    ▪ We play a content warning before the start of a story if it features e.g. strong language or disturbing content
    ▪ Always make it easy to skip or stop playing audio; don't interfere with the user's choice
    ▪ Overall, we don't specifically target this audience; our content is aimed at adults
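    As a rough sketch of the content-warning idea, the function below puts a spoken warning in front of a story when the story carries a content flag. The `Story` type, the flag names, and the warning wording are assumptions made for illustration; the slides do not describe how NPR tags or announces such stories.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Story:
    """Hypothetical story record; `flags` holds editorial content tags."""
    title: str
    audio_url: str
    flags: Tuple[str, ...] = ()


# Assumed flag names mapped to the phrase used in the spoken warning.
WARNINGS = {
    "strong_language": "strong language",
    "disturbing_content": "content that some listeners may find disturbing",
}


def build_playback_items(story: Story) -> List[Tuple[str, str]]:
    """Return (kind, payload) items to play in order, warning first if needed."""
    items: List[Tuple[str, str]] = []
    reasons = [WARNINGS[flag] for flag in story.flags if flag in WARNINGS]
    if reasons:
        warning = "A heads-up: this story contains " + " and ".join(reasons) + "."
        items.append(("speech", warning))
    items.append(("audio", story.audio_url))
    return items
```

    Because the warning is a separate item ahead of the audio, a listener still has a chance to skip or stop before the story itself starts.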
  18. 25.

    If you do want to serve this audience…
    ▪ Do not require login
    ▪ Avoid retaining user data longer than necessary
    ▪ Simplify your voice interactions; kids are not going to remember complex commands
    ▪ Reward good behavior like saying "please" and "thank you"
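    One way to picture the last point is a small check on the transcribed utterance. The phrase list and the praise wording below are assumptions, and a real skill would receive the utterance from the platform's request payload rather than as a plain string.

```python
import re

# Assumed list of polite phrases; matched as whole words, case-insensitively.
_POLITE = re.compile(r"\b(please|thank you|thanks)\b", re.IGNORECASE)


def respond(utterance: str, answer: str) -> str:
    """Prefix the normal answer with brief praise when the request was polite."""
    if _POLITE.search(utterance):
        return "Thanks for asking so nicely! " + answer
    return answer
```

    The answer is the same either way; only the acknowledgement changes, so a child who forgets to say "please" is not blocked from using the app.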
  19. 27.

    What's in a name?
    Gendered
    ▪ Alexa (female)
    ▪ Cortana (female)
    ▪ Siri (female)
    ▪ Bixby (male)
    Neutral
    ▪ Google Assistant
  20. 28.
  21. 29.
  22. 30.

    The status quo: Amazon Alexa
    ▪ Provides only one (female-sounding) voice at both the device/user and app level
      ▫ 8 new app-level voices are part of an opt-in Developer Preview program
    ▪ Can be renamed ("wake word")
      ▫ Amazon, Echo, Computer
  23. 31.

    The status quo: Google Assistant
    ▪ Designed to be gender-neutral
    ▪ Provides 8 voice options at the device/user level (not labelled by gender, offers a range of voices)
      ▫ Defaults to a female-sounding voice
    ▪ Provides 4 voice options at the app level
      ▫ Defaults to male ("Male 1")
  24. 32.
  25. 33.
  26. 34.

    “I use the male voice because I have two daughters. And they should know that voice assistants can present as male or female.”
    - Ha-Hoa Hamano (product manager)
  27. 35.
  28. 36.

    Our approach
    ▪ … We're still working it out
    ▪ Alexa basically locks you into a female voice
    ▪ Should we care about cross-platform consistency?
    ▪ The dilemma: NPR has already received its fair share of criticism for putting too many cis male voices on the air
  29. 37.

    A solution: SSML <audio>
    ▪ W3C standard supported on all major platforms
    ▪ Use <audio> to embed short, pre-recorded audio
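    A minimal sketch of building such a response follows. The helper name `speak_with_audio` and the example URL are hypothetical. The W3C SSML specification allows text inside `<audio>` as alternate content for when the clip cannot be played; whether a given platform honors that fallback, and which audio formats, clip lengths, and hosting requirements it accepts, should be checked in that platform's own documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def speak_with_audio(audio_url: str, fallback_text: str = "") -> str:
    """Wrap a pre-recorded clip in an SSML document.

    The URL and fallback text are XML-escaped so characters such as '&'
    in a query string do not produce malformed SSML.
    """
    return "<speak><audio src={}>{}</audio></speak>".format(
        quoteattr(audio_url), escape(fallback_text)
    )


if __name__ == "__main__":
    # Hypothetical clip URL, used only for illustration.
    print(speak_with_audio("https://example.org/welcome.mp3", "Welcome back"))
```

    The resulting string goes wherever the platform's response format expects SSML output, in place of plain text that would be read by the default TTS voice.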
  30. 38.

    My recommendations
    ▪ Where possible, use <audio> instead of TTS
      ▫ Record real voices representative of the diversity of your user base
    ▪ Otherwise, be conscious of perpetuating gender roles and stereotypes
      ▫ Don't make the assistant overly subservient
  31. 40.

    NPR research, talks & blog posts
    ▪ NPR + Edison Research Smart Audio Report
    ▪ Finding Your Voice: Building Screenless Interfaces with Node.js (my talk at jsDay 2018)
    ▪ Talking Back To Your Radio: How We Approached Voice UI (npr.design)
    ▪ How To Prototype For Audio-Rich Voice Experiences Without Really Trying (npr.design)
  32. 41.

    Other useful resources
    ▪ Ethical OS Toolkit (Institute For The Future)
    ▪ Alexa Skill Blueprints (Amazon)
    ▪ Conversation design (Google)
    ▪ Designing Voice Experiences (Smashing Mag)
    ▪ Intelligent Assistants Have Poor Usability: A User Study of Alexa, Google Assistant, and Siri (Nielsen Norman Group)
  33. 42.

    Takeaways from 1 year of voice UI work
    ▪ It's not just an engineering challenge
    ▪ Think of it as another form of front-end design
    ▪ Test with a diverse set of real users
    ▪ Remember that being in the home is a privilege
    ▪ Empower everyone on the team to speak up and weigh in on design decisions