Slide 1

Slide 1 text

OK Computer London Web, 19/03/15

Slide 2

Slide 2 text

Peter Gasston @stopsatgreen broken-links.com

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Input Types
Touch: Manipulate space
Keyboard: Data entry
Voice: Command or query

Slide 5

Slide 5 text

55% of teens, 41% of adults use voice search every day*
*maybe

Slide 6

Slide 6 text

10% of Baidu search queries are by voice. That’s ~500m per day.

Slide 7

Slide 7 text

To Talk: Speak, Listen, Understand

Slide 8

Slide 8 text

Synthesis

Slide 9

Slide 9 text

Chrome/Safari
    var txt = 'Hello world',
        say = new SpeechSynthesisUtterance(txt);
    window.speechSynthesis.speak(say);
The Web Speech A.P.I.

Slide 10

Slide 10 text

SSU Attributes
    var txt = 'Hello world',
        say = new SpeechSynthesisUtterance(txt);
    say.lang = 'en-GB';
    say.pitch = 0.75;
    say.rate = 1.5;
    say.volume = 0.5;
    window.speechSynthesis.speak(say);
The Web Speech A.P.I.
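The attributes on this slide accept limited ranges. As I understand the Web Speech API specification, volume runs from 0 to 1, rate from 0.1 to 10, and pitch from 0 to 2. The sketch below clamps settings to those ranges before speaking. The names `clampUtteranceSettings` and `sayText` are hypothetical helpers, not part of the API, and `sayText` does nothing outside a browser that supports synthesis.

```javascript
// Ranges as given in the Web Speech API specification (an assumption of this sketch).
const LIMITS = { volume: [0, 1], rate: [0.1, 10], pitch: [0, 2] };

// Hypothetical helper: returns a copy of `settings` with numeric values clamped.
function clampUtteranceSettings(settings) {
  const out = Object.assign({}, settings);
  Object.keys(LIMITS).forEach(function (key) {
    if (typeof out[key] === 'number') {
      const min = LIMITS[key][0];
      const max = LIMITS[key][1];
      out[key] = Math.min(max, Math.max(min, out[key]));
    }
  });
  return out;
}

// Hypothetical helper: speaks `text` if synthesis is available.
// Returns true when the utterance was queued, false otherwise.
function sayText(text, settings) {
  if (typeof window === 'undefined' || !('speechSynthesis' in window)) {
    return false;
  }
  const utterance = new SpeechSynthesisUtterance(text);
  Object.assign(utterance, clampUtteranceSettings(settings || {}));
  window.speechSynthesis.speak(utterance);
  return true;
}
```

In a supporting browser, `sayText('Hello world', { lang: 'en-GB', pitch: 0.75, rate: 1.5, volume: 0.5 })` would queue the same utterance as the slide.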

Slide 11

Slide 11 text

Methods, Attrs, Events
Play / Pause / Resume / Cancel / End / Error
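A minimal sketch of how the pause/resume methods and the end/error events might be wired together. It assumes `synth` behaves like `window.speechSynthesis` (with `paused`, `speaking`, `pause()` and `resume()`) and that `utterance` behaves like a `SpeechSynthesisUtterance`. The function names are hypothetical.

```javascript
// Pauses or resumes depending on the current state of `synth`.
// `paused` is checked first because a paused queue still reports `speaking`.
function togglePause(synth) {
  if (synth.paused) {
    synth.resume();
    return 'resumed';
  }
  if (synth.speaking) {
    synth.pause();
    return 'paused';
  }
  return 'idle';
}

// Attaches end/error handlers and reports the outcome through one callback:
// onDone(null) on success, onDone(errorCode) on failure.
function watchUtterance(utterance, onDone) {
  utterance.onend = function () { onDone(null); };
  utterance.onerror = function (event) { onDone((event && event.error) || 'error'); };
  return utterance;
}
```

In a browser, `togglePause(window.speechSynthesis)` could back a single play/pause button, and `window.speechSynthesis.cancel()` would clear the queue.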

Slide 12

Slide 12 text

Synthesis As A Service
1. developer.att.com/apis/speech
2. ws.neospeech.com/
3. cereproc.com/en/products/cloud
4. ivona.com/en/for-business/speech-cloud/

Slide 13

Slide 13 text

Neospeech The Web Speech A.P.I. 1

Slide 14

Slide 14 text

SSML
    <speak version="1.0">
      <p>
        <s>This is <abbr>S.S.M.L.</abbr></s>
        <s>Speak <prosody rate="-20%">slowly</prosody>.</s>
      </p>
    </speak>
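When SSML is assembled from user-supplied text, the text needs XML escaping. The sketch below builds a document with the same `<speak>`, `<p>`, `<s>` and `<prosody>` structure as the slide. `escapeXml`, `fragmentToSsml` and `buildSsml` are hypothetical helpers, and whether a given engine or service accepts this exact markup is not something the slides establish.

```javascript
// Escapes the characters that are special in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// A fragment is either a plain string or { text, rate } for a <prosody> span.
function fragmentToSsml(fragment) {
  if (typeof fragment === 'string') {
    return escapeXml(fragment);
  }
  return '<prosody rate="' + escapeXml(fragment.rate) + '">' +
    escapeXml(fragment.text) + '</prosody>';
}

// `sentences` is an array of sentences; each sentence is an array of fragments.
function buildSsml(sentences) {
  const body = sentences.map(function (fragments) {
    return '<s>' + fragments.map(fragmentToSsml).join('') + '</s>';
  }).join('');
  return '<speak version="1.0"><p>' + body + '</p></speak>';
}
```

For example, `buildSsml([['Speak ', { text: 'slowly', rate: '-20%' }, '.']])` produces the second sentence of the slide wrapped in a single paragraph.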

Slide 15

Slide 15 text

Recognition

Slide 16

Slide 16 text

Challenges
1. Multiple users
2. Multiple languages
3. Accents

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Web Speech API
    var recog = new SpeechRecognition();
x-browser
    var speechRecognition = (window.SpeechRecognition || window.webkitSpeechRecognition);
    var recog = new speechRecognition();
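The cross-browser lookup above throws if neither constructor exists. A small sketch of a guarded version follows. `getSpeechRecognition` and `createRecognizer` are hypothetical names, and `root` stands in for `window` so the lookup can be exercised outside a browser.

```javascript
// Returns the recognition constructor found on `root`, or null if there is none.
function getSpeechRecognition(root) {
  if (!root) {
    return null;
  }
  return root.SpeechRecognition || root.webkitSpeechRecognition || null;
}

// Returns a new recognizer, or null when recognition is unsupported.
function createRecognizer(root) {
  const Recognition = getSpeechRecognition(root);
  return Recognition ? new Recognition() : null;
}
```

In a page this might be called as `createRecognizer(typeof window !== 'undefined' ? window : null)`, with a null result used to hide the voice UI.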

Slide 19

Slide 19 text

SpeechRecognition Methods
    var recog = new SpeechRecognition();
    recog.start();
    recog.stop();
    recog.abort();

Slide 20

Slide 20 text

SpeechRecognition Events
    var recog = new SpeechRecognition();
    recog.onresult = function () {};
    recog.onnomatch = function () {};
    recog.onerror = function () {};

Slide 21

Slide 21 text

MVS
    var recog = new SpeechRecognition();
    recog.onresult = function (result) {
      output.textContent = result.results[0][0].transcript;
    };
    // Assign a handler; writing recog.start() here would start recognition immediately.
    btn.onclick = function () { recog.start(); };
Click header for demo
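The result event carries a list of results, and each result is a list of alternatives with a `transcript` and a `confidence`. A minimal sketch of reading them defensively, using the hypothetical helpers `firstTranscript` and `bestAlternative`:

```javascript
// Returns the top alternative of the first result, or '' if there is none.
function firstTranscript(event) {
  const results = event && event.results;
  if (!results || results.length === 0 || results[0].length === 0) {
    return '';
  }
  return results[0][0].transcript;
}

// Returns the alternative with the highest confidence in one result, or null.
// Only useful when more than one alternative was requested.
function bestAlternative(result) {
  let best = null;
  for (let i = 0; i < result.length; i++) {
    if (best === null || result[i].confidence > best.confidence) {
      best = result[i];
    }
  }
  return best;
}
```

Inside an `onresult` handler, `output.textContent = firstTranscript(event)` would behave like the slide but tolerate an empty result list.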

Slide 22

Slide 22 text

SpeechRecognition Events
start, audiostart, soundstart, speechstart, speechend, soundend, audioend, end
Click header for demo
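One way to watch these events fire is to attach a logging handler to each. The sketch below assumes `recog` accepts `on<event>` handler properties, as a `SpeechRecognition` instance does. `traceLifecycle` is a hypothetical helper.

```javascript
// The lifecycle events listed on the slide, in the order given there.
const LIFECYCLE_EVENTS = [
  'start', 'audiostart', 'soundstart', 'speechstart',
  'speechend', 'soundend', 'audioend', 'end'
];

// Attaches a handler for every lifecycle event that passes the event name to `log`.
function traceLifecycle(recog, log) {
  LIFECYCLE_EVENTS.forEach(function (name) {
    recog['on' + name] = function () { log(name); };
  });
  return recog;
}
```

In a browser, `traceLifecycle(recog, console.log.bind(console))` followed by `recog.start()` would print each event name as it occurs.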

Slide 23

Slide 23 text

Interim Results
    var recog = new SpeechRecognition();
    recog.interimResults = true;
    recog.onresult = function (result) {
      if (result.results[0].isFinal) { … }
    };
    // Assign a handler; writing recog.start() here would start recognition immediately.
    btn.onclick = function () { recog.start(); };
Click header for demo
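With interim results enabled, a handler usually needs to separate text the engine has committed to from text it may still revise. A minimal sketch, assuming each result exposes `isFinal` and at least one alternative. `splitTranscripts` is a hypothetical helper.

```javascript
// Walks every result in the event and concatenates final and interim text separately.
function splitTranscripts(event) {
  let finalText = '';
  let interimText = '';
  for (let i = 0; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.length === 0) {
      continue; // no alternatives to read
    }
    if (result.isFinal) {
      finalText += result[0].transcript;
    } else {
      interimText += result[0].transcript;
    }
  }
  return { finalText: finalText, interimText: interimText };
}
```

A page could render `finalText` normally and `interimText` in a lighter style, replacing both on every `onresult` call.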

Slide 24

Slide 24 text

Continuous
    var recog = new SpeechRecognition();
    var listening = false; // not defined on the slide; tracked here via start/end events
    recog.continuous = true;
    recog.onstart = function () { listening = true; };
    recog.onend = function () { listening = false; };
    recog.onresult = function (result) {
      output.textContent = result.results[0][0].transcript;
    };
    btn.onclick = function () {
      if (listening) { recog.stop(); } else { recog.start(); }
    };

Slide 25

Slide 25 text

SpeechRTC + Web Speech API

Slide 26

Slide 26 text

Tea. Earl Grey. Hot.

Slide 27

Slide 27 text

[Google demo]

Slide 28

Slide 28 text

JuliusJS
    var recog = new Julius();
    recog.onrecognition = function (result) {
      console.log(result);
    };

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

Wit + Node
1. Web Speech API + text
2. Direct speech: GuM (getUserMedia), Web Audio API
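For the first option, the transcript from the Web Speech API is sent to Wit as text. The sketch below only builds the request. The endpoint `https://api.wit.ai/message`, the `q` query parameter and the bearer-token header reflect my understanding of Wit.ai's HTTP API and should be checked against its current documentation. `buildWitMessageRequest` and the token value are hypothetical.

```javascript
// Builds the URL and headers for a text query to Wit.ai.
// Assumption: GET https://api.wit.ai/message?q=<text> with a bearer token.
function buildWitMessageRequest(text, token) {
  return {
    url: 'https://api.wit.ai/message?q=' + encodeURIComponent(text),
    headers: { Authorization: 'Bearer ' + token }
  };
}
```

On the Node side, the returned `url` and `headers` could be passed to whichever HTTP client the server already uses, keeping the token out of the browser.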

Slide 33

Slide 33 text

Wit.ai Microphone.js

Slide 34

Slide 34 text

[Wit dashboard demo]

Slide 35

Slide 35 text

Wit.ai Response

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

Use Cases
No screen
Busy hands

Slide 38

Slide 38 text

Amazon: ‘Alexa’
Apple: Siri
Google: Voice Search
Microsoft: Cortana

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

Closed Systems :(

Slide 43

Slide 43 text

The End Thanks for your patience.

Slide 44

Slide 44 text

Copyright Note
The video clips in this presentation from the films Moon, Star Trek IV: The Voyage Home, Her, Star Trek, and 2001: A Space Odyssey belong to their respective copyright holders and are used here without permission.