Finding Your Voice: Building Screenless Interfaces with Node.js
Nara Kasbergen (@xiehan) | jsDay Italy | May 10, 2018
Slide 2
Who am I?
▷ Senior full-stack web developer
▷ At NPR since March 2014
▷ Part of a 5-member skunkworks team focused 100% on voice UI development
○ Formed in September 2017
Slide 3
What is NPR?
A quick explainer for the Italians in the audience:
Slide 4
Why voice UI development?
Then: Now:
Slide 5
No content
Slide 6
“smart speakers”
Slide 8
“voice assistants”
Slide 9
How Amazon views Alexa
“Alexa, set the thermostat to 25 degrees.”
“Okay.”
“I'd like to reorder paper towels please.”
“Alexa, thank you!”
“No problem.”
Slide 10
No content
Slide 11
“What would you want to know about voice UI development?”
Slide 12
“What can I actually make?”
Slide 13
“Is it possible to build one app for Amazon Echo, Google Home, and Apple HomePod?”
Slide 14
1. What can you actually make?
To understand the present, we must understand the past.
Slide 15
A brief timeline of voice assistants
2014: Amazon Echo launches (November 6)
2015: Amazon Alexa Skills Kit launches (June)
2016: Google Assistant (May) + Google Home (November)
2017: Samsung Bixby (August) + Microsoft Cortana (October)
2018: Apple HomePod (February)
Slide 16
A natural evolution
1. add voice activation to existing custom app ecosystem
2. add content via RSS feeds
3. add support for custom “skills”
Slide 18
Conclusions
▷ Amazon has a 2-year lead
▷ Only Amazon and Google have fully developed ecosystems
▷ A big focus is adding access to news and podcasts via RSS
▷ Home automation is secondary
Slide 19
2. Can you build one “skill” to rule them all?
tl;dr yes … and no
Slide 20
Alexa + Google ecosystems
▷ Heavily leverage their existing cloud infrastructure
○ AWS Lambda + Google Cloud Functions
▷ Can also build a traditional REST API accessed by their services
Slide 22
The request/response flow
[Diagram: request → your code → response]
P.S. all the NLP and ML happens here
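To make the flow concrete, here is a minimal sketch of what "your code" might look like for an Alexa custom skill running on AWS Lambda. It works on the raw request and response JSON rather than the official SDK, and it assumes a hypothetical intent named `HelloIntent`; the helper names `speak` and `handleRequest` are also made up for illustration.

```javascript
'use strict';

// Build a response envelope with plain-text speech.
// The platform turns the text into audio; our code never touches audio.
function speak(text, endSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text },
      shouldEndSession: endSession,
    },
  };
}

// Map an incoming request to a response.
// 'HelloIntent' is a hypothetical intent name, not part of any real skill.
function handleRequest(event) {
  const request = event.request || {};
  if (request.type === 'LaunchRequest') {
    return speak('Welcome. What would you like to hear?', false);
  }
  if (
    request.type === 'IntentRequest' &&
    request.intent &&
    request.intent.name === 'HelloIntent'
  ) {
    return speak('Hello from Node.js.', true);
  }
  return speak('Sorry, I did not understand that.', true);
}

// AWS Lambda entry point: the value an async handler resolves to
// is sent back to the caller as the response.
exports.handler = async (event) => handleRequest(event);
```

By the time the request reaches this function, the spoken words have already been resolved to an intent name, which is why the code itself stays simple.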
Slide 23
The future is “serverless”
▷ Others can speak more eloquently on this subject than me
○ Hopefully you went to Luciano's talk
▷ Let's just assume we want to use Lambda or Cloud Functions…
▷ … Node.js wins!
Slide 24
The official SDKs are not bad
Alexa Node.js SDK:
github.com/alexa/alexa-skills-kit-sdk-for-nodejs
Actions on Google Node.js SDK:
github.com/actions-on-google/actions-on-google-nodejs
SSML: A common language
WFUV is your station.
There is a three second pause here then I continue.
When I wake up, I speak slowly.
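The markup for the three sentences above did not survive in this transcript, so the following is only a guess at how they could be written. It is a small sketch that builds an SSML string in Node.js using the standard `say-as`, `break`, and `prosody` elements. The helper names (`escapeXml`, `spellOut`, `pause`, `slowly`, `speakSsml`) are invented here, and the choice of tag for each sentence is an assumption.

```javascript
'use strict';

// Escape XML special characters so dynamic text cannot break the markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helpers, one per SSML element used below.
const spellOut = (text) =>
  `<say-as interpret-as="spell-out">${escapeXml(text)}</say-as>`;
const pause = (seconds) => `<break time="${Number(seconds)}s"/>`;
const slowly = (text) => `<prosody rate="slow">${escapeXml(text)}</prosody>`;

// Wrap the parts in the <speak> root element.
const speakSsml = (...parts) => `<speak>${parts.join(' ')}</speak>`;

const ssml = speakSsml(
  `${spellOut('WFUV')} is your station.`,
  `There is a three second pause here ${pause(3)} then I continue.`,
  `When I wake up, ${slowly('I speak slowly.')}`
);

console.log(ssml);
```

Because the result is just a string, the same value can be handed to either platform's response object, which is what makes SSML useful as a shared layer.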
Slide 27
Backends-for-Frontends (BFFs)
[Diagram: requests and responses pass through a BFF on AWS Lambda, which sits in front of the "real" API]
Slide 28
Two “skills”, one codebase
[Diagram: two BFFs on AWS Lambda in front of the same "real" API]
>60% shared code with separate view layers
different builds using Gulp
Slide 29
Generic Response Model
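The slide's own code is not in this transcript, so what follows is not NPR's implementation. It is a rough sketch of the general idea: shared code produces one platform-neutral object, and a thin view layer per platform turns it into that platform's JSON. All function and field names on the neutral model (`createResponse`, `speech`, `endConversation`, `toAlexa`, `toGoogle`) are hypothetical. The output shapes follow the Alexa response format and the Dialogflow webhook format for Actions on Google as I understand them, and should be checked against current documentation.

```javascript
'use strict';

// Hypothetical platform-neutral response produced by the shared code.
function createResponse({ speech, reprompt = null, endConversation = false }) {
  return { speech, reprompt, endConversation };
}

// View layer for Alexa: wrap the model in an Alexa response envelope.
function toAlexa(model) {
  const response = {
    outputSpeech: { type: 'PlainText', text: model.speech },
    shouldEndSession: model.endConversation,
  };
  if (model.reprompt) {
    response.reprompt = {
      outputSpeech: { type: 'PlainText', text: model.reprompt },
    };
  }
  return { version: '1.0', response };
}

// View layer for Actions on Google (assumed Dialogflow webhook payload).
function toGoogle(model) {
  return {
    payload: {
      google: {
        expectUserResponse: !model.endConversation,
        richResponse: {
          items: [{ simpleResponse: { textToSpeech: model.speech } }],
        },
      },
    },
  };
}

// The same model rendered for both platforms.
const model = createResponse({
  speech: 'Here is the latest news.',
  reprompt: 'Would you like to hear more?',
});
console.log(JSON.stringify(toAlexa(model), null, 2));
console.log(JSON.stringify(toGoogle(model), null, 2));
```

Keeping the model free of platform fields is what would let most of the business logic be shared, with only the two small rendering functions differing per build.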
Slide 30
Challenges
▷ Text-to-Speech (TTS) is still king
○ Google didn't even add support for their native audio player until February 2018
▷ No access to the user's location
▷ Error handling is interesting!
○ User might not even trigger your skill
Slide 31
Conclusions
▷ The code is not hard
▷ Understanding platform limitations and user expectations is
Slide 32
No content
Slide 34
Open source opportunities
▷ Would it be helpful to have a formalized framework?
○ Not really. The code is not hard.
▷ What we struggle with the most: QA
○ We need something like Selenium or Nightwatch.js for voice UI