[SelfConf] Principled Product Development for Voice-Based Interfaces

Principled Product Development for Voice-Based Interfaces
Nara Kasbergen @xiehan | August 17, 2018 | Self.conference #selfconf

Who am I?
▪ Senior full-stack web developer
▪ At NPR since March 2014
▪ Part of a 5-member skunkworks team focused 100% on voice UI development
  ▫ Formed in September 2017

Why voice UI? Why is NPR investing in this space?

This is our "Craigslist moment" Then: Now:

18% of Americans own voice-activated speakers, more than doubling over the past year.
Source: NPR + Edison Research Smart Audio Report

Early Adopters / Early Mainstream
▪ "It's cool tech"
▪ Smart home
▪ Older (over half are 45+)
▪ 58% female
▪ "It's useful"
▪ Day-to-day activities
▪ Audio listening
▪ Increasing usage
▪ More engaged

Why ethics? Why do I care? Why should you?

James Robert Liang (Volkswagen)

“These devices are evil. We should not allow big corporations like Amazon to spy on our every move.”
- my boyfriend (and others)

Our mission statement
Be a visionary while guiding new product and service ideas from conception through launch. Along the way, partner with Member stations to create a more informed public, reaching them wherever they are.

Being in the home is a privilege.

3 core challenges on voice platforms
▪ Respect user privacy as much as possible
▪ Anticipate children interacting with your app unsupervised
▪ Consider gender … specifically, gender roles and voice

1. Respect user privacy as much as possible

“Aren't these devices basically just for spying? I'd be worried about my girlfriend being able to find out everything I say when she's not home.”
- an acquaintance

The challenge: Expectations vs. Reality
▪ Nearly all these devices have a "mute" button
  ▫ You can use Wireshark to verify that nothing said while muted is sent to e.g. Amazon
▪ Very little data is stored on the device; virtually everything is processed & stored in the cloud
▪ Communication happens over HTTPS

The challenge: Expectations vs. Reality
▪ The developer platform is heavily sandboxed
  ▫ As a developer, I have no control over what happens if my app is not in active use
  ▫ I get no information about e.g. how many people are in the room, which room, etc.
  ▫ Platforms use a strict permission system

The status quo: Alexa
▪ Gives users a random alphanumeric ID
  ▫ Per device, per install: i.e. if a user uninstalls the app, then reinstalls it, the ID resets!
▪ Allows login via OAuth 2.0
  ▫ Access token gets added to requests after successful authorization
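As a rough illustration of the two identifiers described above, the sketch below pulls the per-install user ID and the optional OAuth access token out of an incoming request. It assumes the request body has already been parsed into a dictionary and follows the `context.System.user` layout of Alexa's JSON request format. The function name `extract_identity` and all sample values are made up for this example.

```python
from typing import Any, Dict, Optional, Tuple


def extract_identity(envelope: Dict[str, Any]) -> Tuple[Optional[str], Optional[str]]:
    """Return (user_id, access_token) from a parsed Alexa-style request body.

    access_token is None when the request carries no token, e.g. because
    the user has not logged in via OAuth 2.0.
    """
    user = envelope.get("context", {}).get("System", {}).get("user", {})
    return user.get("userId"), user.get("accessToken")


# Hypothetical request fragments; real requests carry many more fields.
anonymous_request = {
    "context": {"System": {"user": {"userId": "amzn1.ask.account.EXAMPLE"}}}
}
linked_request = {
    "context": {
        "System": {
            "user": {
                "userId": "amzn1.ask.account.EXAMPLE",
                "accessToken": "example-oauth-token",
            }
        }
    }
}

print(extract_identity(anonymous_request))
print(extract_identity(linked_request))
```

Because the ID resets on reinstall, anything keyed on it should be treated as tied to one installation rather than to a person.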

The status quo: Google Assistant
▪ Currently assigns a random user ID, but that is deprecated and will be removed June 1, 2019
▪ Strongly encourages login (OAuth 2.0 or Google)
  ▫ Access token gets added to requests after successful authorization
▪ Non-signed-in users are subject to voice match

Example scenario: Station search
A user is using the "Play NPR" skill on Alexa for the first time. They are asked to choose a member station in their area. They specify the location "Seattle, Washington" and are given 3 stations to choose from. The user selects KUOW, which we save as their default station for return visits.

Short-term data
▪ Most recent search query (e.g. "Seattle, Washington")
▪ The search results
▪ The last thing the voice assistant said/asked
Long-term data
▪ The user's default station
▪ The number of times the user has accessed the app

Our approach
▪ Use a 24-hour TTL on short-term user data
▪ Don't ask for login where it's not required
▪ When using login, don't store account data
  ▫ Use access token to retrieve on demand
▪ Remove all remaining user data when a user uninstalls the app
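A minimal sketch of the TTL and uninstall rules, assuming an in-memory store; a real deployment would more likely rely on a database with built-in expiry. The class `UserDataStore`, its method names, and the injectable `clock` parameter are hypothetical and are not taken from NPR's implementation.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

SHORT_TERM_TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL from the slide


class UserDataStore:
    """Keeps short-term data with an expiry and long-term data without one."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._short_term: Dict[str, Tuple[float, Dict[str, Any]]] = {}
        self._long_term: Dict[str, Dict[str, Any]] = {}

    def put_short_term(self, user_id: str, data: Dict[str, Any]) -> None:
        expires_at = self._clock() + SHORT_TERM_TTL_SECONDS
        self._short_term[user_id] = (expires_at, data)

    def get_short_term(self, user_id: str) -> Optional[Dict[str, Any]]:
        entry = self._short_term.get(user_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Expired: delete on read so stale data never leaks back out.
            del self._short_term[user_id]
            return None
        return data

    def put_long_term(self, user_id: str, data: Dict[str, Any]) -> None:
        self._long_term[user_id] = data

    def get_long_term(self, user_id: str) -> Optional[Dict[str, Any]]:
        return self._long_term.get(user_id)

    def forget_user(self, user_id: str) -> None:
        """Remove everything held for a user, e.g. after an uninstall."""
        self._short_term.pop(user_id, None)
        self._long_term.pop(user_id, None)
```

In the station-search scenario, the search query and results would go through `put_short_term`, the default station through `put_long_term`, and `forget_user` would run when the platform reports that the app was uninstalled.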

The challenge: Analytics
▪ One of our biggest business requirements is analytics. We can't avoid it.
▪ We decided to use Google Analytics
  ▫ Analytics providers for voice are just emerging
▪ Recommendation: hash the user's ID before sending data to analytics provider
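One way the hashing recommendation could look in practice is sketched below. It uses a keyed hash (HMAC-SHA-256) rather than a bare hash, which is a choice made for this example and not something the talk specifies; the function name `anonymize_user_id` and the key are placeholders.

```python
import hashlib
import hmac


def anonymize_user_id(user_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for user_id that is safe to send to analytics.

    The same user_id and key always give the same pseudonym, so return
    visits can still be counted, but the raw platform ID never leaves
    your own systems.
    """
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key for illustration; keep the real one out of source control.
EXAMPLE_KEY = b"replace-with-a-real-secret"

pseudonym = anonymize_user_id("amzn1.ask.account.EXAMPLE", EXAMPLE_KEY)
print(len(pseudonym))  # 64 hex characters
```

The key is there so that someone who knows a user's platform ID cannot simply recompute the pseudonym and look that user up in the analytics data.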

GDPR: Sensible privacy guidelines
▪ Be aware of what personal data means
▪ Hold and process data only if it is absolutely necessary for the completion of a task
▪ Have a process in place for erasing a user's data at their request

2. Anticipate children interacting with your app unsupervised

“These devices are particularly appealing to parents/families, and that continues to be the case, with adoption growing more quickly among that segment.”
- NPR + Edison Research Smart Audio Report

43% of Early Mainstream parents purchased the speaker to reduce screen time.
Source: NPR + Edison Research Smart Audio Report

Example scenario
A parent is listening to NPR One on Alexa while their 5-year-old child is playing in the room. They have to step out for a while to make a phone call. When they come back, Alexa is playing an episode of a podcast featuring curse words.

Our approach
▪ We play a content warning before the start of a story if it features e.g. strong language or disturbing content
▪ Always make it easy to skip or stop playing audio; don't interfere with the user's choice
▪ As a side note, users must sign up for an NPR account in order to use this app
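The content-warning rule can be sketched as a small function that decides what to play and in which order. The `Story` fields, the function name `playback_items`, and the warning wording are all assumptions made for this example; the talk does not describe how stories are actually flagged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Story:
    title: str
    audio_url: str
    strong_language: bool = False
    disturbing_content: bool = False


def playback_items(story: Story) -> List[str]:
    """Return what to play, in order: an optional spoken warning, then the audio URL."""
    reasons = []
    if story.strong_language:
        reasons.append("strong language")
    if story.disturbing_content:
        reasons.append("disturbing content")

    items = []
    if reasons:
        # The warning is spoken first so a listener can skip before the story starts.
        items.append("A heads-up: this story contains " + " and ".join(reasons) + ".")
    items.append(story.audio_url)
    return items


flagged = Story("Example episode", "https://example.org/episode.mp3", strong_language=True)
print(playback_items(flagged))
```

Putting the warning ahead of the audio, rather than replacing it, leaves the decision to skip or stop with the listener.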

If you do want to serve this audience…
▪ Do not require login
▪ Avoid retaining user data longer than necessary
▪ Simplify your voice interactions; kids are not going to remember complex commands
▪ Reward good behavior like saying "please" and "thank you"

3. Consider gender … specifically, gender roles and voice

What's in a name?
Gendered
▪ Alexa (female)
▪ Cortana (female)
▪ Siri (female)
▪ Bixby (male)
Neutral
▪ Google Assistant

The status quo: Alexa
▪ Provides only one (female-sounding) voice at both the device/user and app level
  ▫ 8 new app-level voices are part of an opt-in Developer Preview program
▪ Can be renamed ("wake word")
  ▫ Amazon, Echo, Computer

The status quo: Google Assistant
▪ Designed to be gender-neutral
▪ Provides 8 voice options at the device/user level (not labelled by gender, offers a range of voices)
  ▫ Defaults to a female-sounding voice
▪ Provides 4 voice options at the app level
  ▫ Defaults to male ("Male 1")

“I use the male voice because I have two daughters. And they should know that voice assistants can present as male or female.”
- Ha-Hoa Hamano (product manager)

Our approach
▪ … We're still working it out
▪ Alexa basically locks you into a female voice
▪ Should we care about cross-platform consistency?
▪ The dilemma: NPR has already received its fair share of criticism for putting too many cis male voices on the air

A solution: SSML <audio>
▪ W3C standard supported on all major platforms
▪ Use <audio> to embed short, pre-recorded audio
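For readers who have not seen SSML, here is a small sketch of a response that plays a pre-recorded clip instead of synthesized speech. It builds the markup as a string using only the `<speak>` and `<audio>` elements from the W3C standard; the helper name `ssml_audio` and the URL are hypothetical, and each platform documents its own limits on audio format, length, and hosting, which are not covered here.

```python
from xml.sax.saxutils import escape, quoteattr


def ssml_audio(src_url: str, intro_text: str = "") -> str:
    """Build an SSML document that optionally speaks intro_text, then plays a clip."""
    parts = ["<speak>"]
    if intro_text:
        # Escape &, < and > so arbitrary text cannot break the markup.
        parts.append(escape(intro_text))
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    parts.append("<audio src=" + quoteattr(src_url) + "/>")
    parts.append("</speak>")
    return "".join(parts)


print(ssml_audio("https://example.org/welcome.mp3", "Welcome back."))
```

The resulting string is what would be returned as the spoken output in place of plain text.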

My recommendations
▪ Where possible, use <audio> instead of TTS
  ▫ Record real voices representative of the diversity of your user base
▪ Otherwise, be conscious of perpetuating gender roles and stereotypes
  ▫ Don't make the assistant overly subservient

Wrap-up Miscellaneous resources & tips

Links to research, talks & blog posts
▪ NPR + Edison Research Smart Audio Report
▪ Finding Your Voice: Building Screenless Interfaces with Node.js (my talk at jsDay 2018)
▪ Talking Back To Your Radio: How We Approached Voice UI (npr.design)
▪ How To Prototype For Audio-Rich Voice Experiences Without Really Trying (npr.design)

Other useful resources
▪ Alexa Skill Blueprints (Amazon)
▪ Conversation design (Google)
▪ Designing Voice Experiences (Smashing Mag)
▪ Intelligent Assistants Have Poor Usability: A User Study of Alexa, Google Assistant, and Siri (Nielsen Norman Group)

General recommendations
▪ Don't embark on this work without a designer
▪ Think of it as another form of front-end design
▪ Test with real users
▪ Remember being in the home is a privilege
▪ Empower everyone on the team to speak up and weigh in on design decisions

Thank you! Questions? Thoughts?
