[VoiceCon] The Voice Technology Landscape

On the eve of a new decade, we are in the midst of a seismic technological shift: the transition of user interfaces from being primarily visual to being voice-first. Science fiction books, movies and TV shows have predicted this for years, from the iconic ship computer in Star Trek to, more recently, the virtual assistant J.A.R.V.I.S. in the Iron Man film series and the operating system Samantha in the 2013 film Her. But it’s also clear that we’re not quite there yet: Amazon Alexa, Google Assistant, and all of the other voice assistants are far from being able to handle open-ended queries and respond to every request. The developer platforms still have rigid technical limitations that prevent us from realizing our most visionary voice UX ambitions. So how much further do we have to go? What technological breakthroughs are required to usher in the voice-first revolution?

With 2020 less than a month away, let’s take a look at the technological landscape of the voice space at the end of 2019 and make some predictions for the future: what are the surest bets for new features in the next year? What would we most like to see (but might be less likely to happen)? And what will be the biggest breakthrough, and can we make any predictions about how much longer we have to wait for it?

Nara Kasbergen

December 11, 2019

Transcript

  1. The Voice Technology Landscape: A Look Ahead to 2020 & Beyond
  2. I'm tech lead for Voice & Emerging Platforms at NPR (National Public Radio) in the USA. You can find & follow me at @xiehan. I AM NARA KASBERGEN. Guten Tag!
  3. WHAT IS N.P.R.? A nationwide network of public radio stations

  4. WHY DOES N.P.R. CARE ABOUT VOICE? Then: Now:

  5. The Voice Platforms Team at NPR

  6. MY BIASES: I'm a software engineer. Consumer app developer. Audio (long-form). "Smart speakers".
  7. WHERE ARE WE GOING?

  8. None
  9. None
  10. None
  11. None
  12. BOLD PREDICTION By 2030, we'll have built J.A.R.V.I.S.

  13. ELEMENTS NEEDED FOR VOICE UTOPIA (OR, J.A.R.V.I.S.): 1. Better NLU 2. Better AI 3. Ubiquity 4. Connectivity 5. Trust
  14. Better NLU 1

  15. PROBLEM STATEMENT: We need machines to understand humans the way humans naturally speak. Wake words & invocation names are clunky. The more complex the query, the harder it is for users to remember the proper order. Users want to speak to voice assistants and devices in their native language.
  16. PREDICTION BY NARA KASBERGEN (@XIEHAN) DEC 11, 2019: In 2020, we will see more platform workarounds to avoid requiring invocation names and wake words.
  17. EXAMPLE #1: GOOGLE ALARMS. You used to have to say "Hey Google, stop" to stop an alarm that is going off. Now, you can just say "stop" (because Google Assistant anticipates that you are going to say something when the alarm starts going off).
  18. EXAMPLE #2: BIXBY N.L. CATEGORIES. Example: Rideshare (Uber, Lyft, Taxify, etc.). The first time the user says "Hi Bixby, get me a ride" they choose their preferred app. After that, "Hi Bixby, get me a ride" always defaults to that provider. (No more need for invocation names, e.g. "Bixby, ask Uber to get me a ride".)
  19. Bixby NL Categories

  20. PREDICTION: Other platforms will copy Bixby's NL categories ASAP.
  21. PREDICTION: Bixby could do well in Europe if and only if Samsung moves fast enough to add more languages.
  22. Better AI 2

  23. PROBLEM STATEMENT: So far, much of the AI focus in this space has concentrated on the platform side. The platform side is a black box, and at this point that seems unlikely to change. Consumer applications need AI in order to provide richer, more helpful, context-dependent responses to user queries.
  24. PREDICTION: Free business idea: "AI as a service" for voice consumer apps.
  25. Screenshot of Invocable

  26. A WORD ABOUT VOICE DEV PLATFORMS RIGHT NOW: The current ecosystem is not great. There are too many different ways to build skills/actions, and it is not easy to switch back & forth. Platforms have focused on a "lowest common denominator" approach, optimizing for non-technical people. This approach is holding back the field.
  27. PREDICTION: In 2020, other companies will copy Samsung's best ideas from the Bixby developer platform.
  28. PREDICTION: In 2020, Google might actually figure out its voice development strategy.
  29. PREDICTION: In 2020, Google might actually figure out its voice development strategy. We just might not like it.
  30. Google Assistant docs

  31. Ubiquity 3

  32. PROBLEM STATEMENT: We need to be able to talk to our voice assistants anytime, anywhere. Too much emphasis has been placed on smartphones. The smart home will be the key to moving us closer to this utopia. Context-awareness is the key.
  33. PREDICTION: In 2020, Amazon will continue to lead the pack by embedding Alexa in anything it can.
  34. PREDICTION: Samsung has the opportunity to quickly gain an edge because of the large amount of hardware already in people's homes.
  35. PREDICTION: Smart fridge sales are not going to soar in 2020.
  36. PREDICTION: Google's Nest/Home rebranding confusion will hurt it in 2020.
  37. PREDICTION: If Apple doesn't move faster in this area, it's going to fall even further behind.
  38. Connectivity 4

  39. PROBLEM STATEMENT: Currently, voice technology relies heavily on an Internet connection because recordings are transcribed and processed in the cloud, and consumer apps also run in the cloud. There are two solutions to this: 1. Improve connectivity (faster connections, fewer dead zones). 2. Provide more offline-first possibilities.
  40. PREDICTION: The rollout of 5G will improve connectivity and therefore the usability of voice assistants.
  41. PREDICTION: Dead zones will continue to be a problem.
  42. PREDICTION: Platforms will begin to move more of their NLP to devices in order to allow for more offline-first possibilities.
  43. PREDICTION: The wildcard factor: What about offline-first consumer apps?
  44. Trust 5

  45. PROBLEM STATEMENT: No one is going to want J.A.R.V.I.S. in their home and in their life if they don't trust it. The fact that voice tech has been dominated by big tech companies that have a poor reputation for respecting consumer privacy has hurt this space. Tech companies need to win back our trust.
  46. PREDICTION: There will be more privacy scandals in 2020.
  47. Screenshot of 2019 Amazon privacy scandal

  48. PREDICTION: We could see our first GDPR lawsuit related to voice technology in 2020.
  49. PREDICTION: Facebook may launch a voice assistant in 2020.
  50. OTHER PREDICTIONS

  51. PREDICTION: There will be more small feature releases to improve monetization.
  52. PREDICTION: Alexa Presentation Language (APL) will be deprecated.
  53. APL, we hardly knew you...

  54. PREDICTION: Multimodal devices will continue to be important and not important at the same time.
  55. PREDICTION: 2020 may determine whether Bixby has a future or not.
  56. PREDICTION: Microsoft will give up on Cortana (but may not kill it completely just yet).
  57. PREDICTION: Apple could finally do something and it could change everything.
  58. PREDICTION: Apple could finally do something and it could change everything. Apple also may not do anything.
  59. PREDICTION: In 2020, voice tech could shift back toward native mobile app integrations.
  60. Danke Schön! ANY QUESTIONS? @xiehan on Twitter nara.kasbergen@gmail.com
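
Slides 16 to 18 describe two ways platforms avoid wake words and invocation names: accepting a bare "stop" while an alarm is ringing, and remembering the user's preferred provider for a category such as rideshare. As a rough illustration only, the Python sketch below models both behaviours in a toy router. Nothing here reflects how Google Assistant or Bixby is actually implemented; the class `ToyAssistant`, the wake phrase "hey assistant", and the phrase-to-category table are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

WAKE_WORD = "hey assistant"  # hypothetical wake phrase, not a real product's


@dataclass
class ToyAssistant:
    """Toy model of the two workarounds; every name here is an assumption."""

    alarm_ringing: bool = False
    # category -> provider the user picked the first time they were asked
    category_defaults: Dict[str, str] = field(default_factory=dict)
    # hypothetical table mapping a spoken phrase to a category
    phrase_categories: Dict[str, str] = field(
        default_factory=lambda: {"get me a ride": "rideshare"}
    )

    def hear(self, utterance: str, chosen_provider: Optional[str] = None) -> str:
        text = utterance.lower().strip()
        has_wake_word = text.startswith(WAKE_WORD)
        # Strip the wake phrase and any separating comma or space.
        command = text[len(WAKE_WORD):].lstrip(" ,") if has_wake_word else text

        # Workaround 1: while an alarm rings, "stop" needs no wake word.
        if command == "stop" and self.alarm_ringing:
            self.alarm_ringing = False
            return "alarm stopped"

        # Everything else is ignored unless the wake word was spoken.
        if not has_wake_word:
            return "ignored"

        category = self.phrase_categories.get(command)
        if category is None:
            return "not understood"

        # Workaround 2: ask for a provider once, then reuse it as the default.
        if category not in self.category_defaults:
            if chosen_provider is None:
                return f"which {category} app should I use?"
            self.category_defaults[category] = chosen_provider
        return f"asking {self.category_defaults[category]}"
```

In this sketch, calling `hear("stop")` returns `"ignored"` unless `alarm_ringing` is set, and the first `hear("Hey Assistant, get me a ride", chosen_provider="Uber")` stores the choice so that later requests no longer need a provider name.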