[VoiceCon] The Voice Technology Landscape

On the eve of a new decade, we are in the midst of a seismic technological shift: the transition of user interfaces from being primarily visual to being voice-first. Science fiction books, movies and TV shows have predicted this for years: think of the iconic ship computer in Star Trek or, more recently, the virtual assistant J.A.R.V.I.S. in the Iron Man film series and the operating system Samantha in the 2013 film Her. But it’s also clear that we’re not quite there yet; Amazon Alexa, Google Assistant, and all of the other voice assistants are far from being able to handle open-ended queries and respond to every request. The developer platforms still have rigid technical limitations that prevent us from realizing our most visionary voice UX ambitions. So how much further do we have to go? What technological breakthroughs are required to usher in the voice-first revolution?

With 2020 less than a month away, let’s take a look at the technological landscape of the voice space at the end of 2019 and make some predictions for the future: what are the surest bets for new features in the next year? What would we most like to see (but might be less likely to happen)? And what will be the biggest breakthrough, and can we make any predictions about how much longer we have to wait for it?

Nara Kasbergen

December 11, 2019

Transcript

  1. The Voice Technology Landscape
    A Look Ahead to 2020 & Beyond

  2. I'm tech lead for Voice & Emerging Platforms
    at NPR (National Public Radio) in the USA
    You can find & follow me at @xiehan
    I AM NARA KASBERGEN
    Guten Tag!

  3. WHAT IS N.P.R.?
    A nationwide network
    of public radio stations

  4. WHY DOES N.P.R. CARE ABOUT VOICE?
    Then: Now:

  5. The Voice Platforms Team at NPR

  6. MY BIASES
    I'm a software engineer
    Consumer app developer
    Audio (long-form)
    "Smart speakers"

  7. WHERE ARE WE GOING?

  12. BOLD PREDICTION
    By 2030, we'll have built J.A.R.V.I.S.

  13. ELEMENTS NEEDED FOR VOICE UTOPIA (OR, J.A.R.V.I.S.)
    1. Better NLU
    2. Better AI
    3. Ubiquity
    4. Connectivity
    5. Trust

  14. Better NLU
    1

  15. PROBLEM STATEMENT
    We need machines to understand humans
    the way humans naturally speak
    Wake words & invocation names are clunky
    The more complex the query, the harder it is for
    users to remember the proper word order
    Users want to speak to voice assistants and
    devices in their native language

  16. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    In 2020, we will see more
    platform workarounds to
    avoid requiring invocation
    names and wake words

  17. EXAMPLE #1: GOOGLE ALARMS
    You used to have to say "Hey Google, stop"
    to stop an alarm that is going off
    Now, you can just say "stop" (because
    Google Assistant anticipates that you are
    going to say something when the alarm
    starts going off)
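
The alarm behavior described on this slide can be pictured as context-dependent gating of the wake phrase. What follows is a minimal sketch under stated assumptions, not how Google Assistant is actually implemented: the `interpret` function, the `alarm_ringing` context name, and the `CONTEXT_COMMANDS` table are all hypothetical.

```python
from typing import Optional, Set

WAKE_PHRASE = "hey google"

# Hypothetical table: commands accepted WITHOUT the wake phrase
# while a given context is active on the device.
CONTEXT_COMMANDS = {"alarm_ringing": {"stop"}}


def interpret(utterance: str, active_contexts: Set[str]) -> Optional[str]:
    """Return the command to execute, or None if the utterance is ignored."""
    text = utterance.lower().strip().rstrip(".!?")

    # Normal path: the wake phrase is present, so accept whatever follows it.
    if text.startswith(WAKE_PHRASE):
        command = text[len(WAKE_PHRASE):].strip(" ,")
        return command or None

    # Context path: no wake phrase, but an active context anticipates
    # a small set of commands (e.g. "stop" while an alarm is ringing).
    for context in active_contexts:
        if text in CONTEXT_COMMANDS.get(context, set()):
            return text

    # Everything else is ignored, as it would be without the wake phrase.
    return None
```

With no active context, `interpret("stop", set())` is ignored; once `alarm_ringing` is active, the same utterance is accepted without the wake phrase.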

  18. EXAMPLE #2: BIXBY N.L. CATEGORIES
    Example: Rideshare (Uber, Lyft, Taxify, etc.)
    The first time the user says "Hi Bixby, get
    me a ride" they choose their preferred app
    After that, "Hi Bixby, get me a ride" always
    defaults to that provider
    (No more need for invocation names, e.g.
    "Bixby, ask Uber to get me a ride")
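
The flow on this slide (ask once, then default to the chosen provider) can be sketched as a small preference store keyed by category. This is an illustrative sketch only and does not use the Bixby developer platform: `CategoryRouter`, `CATEGORY_PROVIDERS`, and the response strings are hypothetical names chosen for the example.

```python
from typing import Dict, List, Optional

# Hypothetical registry of which apps can serve each natural-language category.
CATEGORY_PROVIDERS: Dict[str, List[str]] = {
    "rideshare": ["Uber", "Lyft", "Taxify"],
}


class CategoryRouter:
    """Remembers the user's preferred provider per category."""

    def __init__(self) -> None:
        self._preferred: Dict[str, str] = {}

    def route(self, category: str, choice: Optional[str] = None) -> str:
        providers = CATEGORY_PROVIDERS[category]

        # A stored preference always wins: no invocation name is needed.
        if category in self._preferred:
            return f"handled by {self._preferred[category]}"

        # First use: ask the user to pick one of the available providers.
        if choice is None:
            return "Which app would you like to use? " + ", ".join(providers)

        if choice not in providers:
            raise ValueError(f"{choice!r} does not serve category {category!r}")

        # Persist the choice so later requests default to it.
        self._preferred[category] = choice
        return f"handled by {choice}"
```

The first `route("rideshare")` call prompts for a provider; after `route("rideshare", "Lyft")`, every later `route("rideshare")` goes straight to that provider.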

  19. Bixby NL Categories

  20. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Other platforms will copy
    Bixby's NL categories
    ASAP

  21. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Bixby could do well in
    Europe if and only if
    Samsung moves fast
    enough to add more
    languages

  22. Better AI
    2

  23. PROBLEM STATEMENT
    So far, much of the AI focus in this space
    has concentrated on the platform side
    The platform side is a black box, and at this
    point that seems unlikely to change
    Consumer applications need AI in order to
    provide richer, more helpful, context-
    dependent responses to user queries

  24. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Free business idea:
    "AI as a service" for voice
    consumer apps

  25. Screenshot of Invocable

  26. A WORD ABOUT VOICE DEV PLATFORMS RIGHT NOW
    The current ecosystem is not great
    There are too many different ways to build
    skills/actions, and it's not easy to switch back & forth
    Platforms have focused on a "lowest
    common denominator" approach,
    optimizing for non-technical people
    This approach is holding back the field

  27. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    In 2020, other companies
    will copy Samsung's best
    ideas from the Bixby
    developer platform

  28. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    In 2020, Google might
    actually figure out its
    voice development
    strategy

  29. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    In 2020, Google might
    actually figure out its
    voice development
    strategy
    We just might not like it

  30. Google Assistant docs

  31. Ubiquity
    3

  32. PROBLEM STATEMENT
    We need to be able to talk to our voice
    assistants anytime, anywhere
    Too much emphasis has been placed on
    smartphones
    The smart home will be the key to moving
    us closer to this utopia
    Context-awareness is the key

  33. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    In 2020, Amazon will
    continue to lead the pack
    by embedding Alexa in
    anything it can

  34. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Samsung has the
    opportunity to quickly
    gain an edge because of
    the large amount of
    hardware already in
    people's homes

  35. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Smart fridge sales are
    not going to soar in 2020

  36. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Google's Nest/Home
    rebranding confusion will
    hurt it in 2020

  37. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    If Apple doesn't move
    faster in this area, it's
    going to fall even further
    behind

  38. Connectivity
    4

  39. PROBLEM STATEMENT
    Currently, voice technology relies heavily on
    an Internet connection because recordings
    are transcribed and processed in the cloud,
    and consumer apps also run in the cloud
    There are two solutions to this:
    1. Improve connectivity (faster
    connections, fewer dead zones)
    2. Provide more offline-first possibilities
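
The second solution, offline-first, can be sketched as "try on the device first, fall back to the cloud only when needed". This is a hedged illustration rather than any platform's real architecture: `ON_DEVICE_INTENTS`, `resolve_intent`, the intent names, and the `cloud_nlu` callable are all assumptions made for the example.

```python
from typing import Callable, Optional

# Hypothetical small set of utterances the device can resolve with no network.
ON_DEVICE_INTENTS = {
    "stop": "alarm.stop",
    "what time is it": "clock.now",
}


def resolve_intent(
    utterance: str,
    cloud_nlu: Callable[[str], str],
    online: bool,
) -> Optional[str]:
    """Resolve an utterance to an intent, preferring on-device handling."""
    text = utterance.lower().strip()

    # Offline-first: anything the device understands never touches the network.
    local_intent = ON_DEVICE_INTENTS.get(text)
    if local_intent is not None:
        return local_intent

    # Otherwise the cloud is needed, which only works with a connection.
    if online:
        return cloud_nlu(text)

    # In a dead zone, an unrecognized request cannot be served.
    return None
```

In this sketch a dead zone only breaks requests outside the on-device set, so moving more NLP onto the device shrinks the set of requests that fail offline.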

  40. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    The rollout of 5G will
    improve connectivity and
    therefore the usability of
    voice assistants

  41. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Dead zones will continue
    to be a problem

  42. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Platforms will begin to
    move more of their NLP
    to devices in order to
    allow for more offline-
    first possibilities

  43. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    The wildcard factor:
    What about offline-first
    consumer apps?

  44. Trust
    5

  45. PROBLEM STATEMENT
    No one is going to want J.A.R.V.I.S. in their
    home and in their life if they don't trust it
    The fact that voice tech has been
    dominated by big tech companies that
    have a poor reputation for respecting
    consumer privacy has hurt this space
    Tech companies need to win back our trust

  46. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    There will be more
    privacy scandals in 2020

  47. Screenshot of 2019 Amazon privacy scandal

  48. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    We could see our first
    GDPR lawsuit related to
    voice technology in 2020

  49. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Facebook may launch a
    voice assistant in 2020

  50. OTHER PREDICTIONS

  51. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    There will be more small
    feature releases to
    improve monetization

  52. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Alexa Presentation
    Language (APL) will be
    deprecated

  53. APL, we hardly knew you...

  54. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Multimodal devices will
    continue to be important
    and not important at the
    same time

  55. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    2020 may determine
    whether Bixby has a
    future or not

  56. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Microsoft will give up on
    Cortana (but may not kill
    it completely just yet)

  57. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Apple could finally do
    something and it could
    change everything

  58. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    Apple could finally do
    something and it could
    change everything
    Apple also may not do anything

  59. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019
    In 2020, voice tech could
    shift back toward native
    mobile app integrations

  60. Danke Schön!
    ANY QUESTIONS?
    @xiehan on Twitter
    [email protected]
