In conversation with a browser - at SydJS

Phil Nash
September 18, 2019

Voice assistants have taken off, but can we build our own with web technologies? I've been building bots for other platforms, but I wanted to investigate how well one could work in the browser. Can we talk to a web application and get results?

Let's dive into the Web Speech API, speech synthesis, and conversation design. We'll find out whether browsers can be virtual assistants or virtually useless.




Browser voices: https://glitch.com/~browser-voices

An introduction to the MediaRecorder API: https://www.twilio.com/blog/mediastream-recording-api
Web recorder: https://glitch.com/~web-recorder

Speech to text with Watson in the browser: https://watson-speech.mybluemix.net/microphone-streaming.html

TensorFlow speech model: https://storage.googleapis.com/tfjs-speech-model-test/2019-01-03a/dist/index.html
ml5js: https://ml5js.org/

  1. ELIZA

    ELIZA: Please tell me what's been bothering you.
    YOU: I'm scared of public speaking.
    ELIZA: Do you know anyone else who is scared of public speaking ?
    YOU: Lot's of people
    ELIZA: Do you feel strongly about discussing such things ?
    YOU: Only with you
    ELIZA: You're not really talking about me -- are you ?
    YOU: No, I'm the one who is scared.
    ELIZA: Why 'no' ?

    @philnash
  2. Speech Recognition

    Start Speech Recognition

    start listening to me and probably putting things into that textarea there eventually I can get it going again I can see he
  3. Speech Recognition

    // Chrome exposes the constructor with a webkit prefix.
    const recognition = new webkitSpeechRecognition();
    recognition.addEventListener('result', event => {
      // Take the top alternative of the first result.
      const result = event.results[0][0].transcript;
      console.log(result);
    });
    recognition.start();
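  4. MediaRecorder API

    // getUserMedia needs constraints; request audio only.
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    // The MediaRecorder option for the container format is mimeType.
    const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
    const chunks = [];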
  5. MediaRecorder API

    recorder.addEventListener('dataavailable', event => {
      if (typeof event.data === 'undefined') return;
      if (event.data.size === 0) return;
      chunks.push(event.data);
    });
  6. MediaRecorder API

    recorder.addEventListener('stop', event => {
      const recording = new Blob(chunks, { type: 'audio/webm' });
    });
  7. Then what?

    Send the file to a speech to text service
    • Google Cloud Speech
    • Azure Cognitive Services
    • IBM Watson
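
The MediaRecorder slides and the "Then what?" step can be put together roughly as follows. This is a minimal sketch, not the code from the talk: the helper names (`createChunkCollector`, `buildTranscriptionRequest`, `transcribe`, `recordAndTranscribe`) are made up for illustration, and `/transcribe` is a hypothetical endpoint on your own server that is assumed to forward the audio to one of the services above and reply with JSON shaped like `{ "transcript": "..." }`. It also assumes the browser can record `audio/webm`, which `MediaRecorder.isTypeSupported('audio/webm')` can confirm.

```javascript
// Gather the non-empty chunks from 'dataavailable' events and build one Blob.
function createChunkCollector(mimeType = 'audio/webm') {
  const chunks = [];
  return {
    handleData(event) {
      if (typeof event.data === 'undefined') return;
      if (event.data.size === 0) return;
      chunks.push(event.data);
    },
    toBlob() {
      return new Blob(chunks, { type: mimeType });
    }
  };
}

// Build fetch options that post the recording as the raw request body.
function buildTranscriptionRequest(recording) {
  return {
    method: 'POST',
    headers: { 'Content-Type': recording.type },
    body: recording
  };
}

// ASSUMPTION: '/transcribe' is a hypothetical server route, and the
// { transcript } response shape is invented for this sketch.
async function transcribe(recording, url = '/transcribe') {
  const response = await fetch(url, buildTranscriptionRequest(recording));
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const { transcript } = await response.json();
  return transcript;
}

// Browser only: record the microphone for durationMs, then transcribe.
async function recordAndTranscribe(durationMs = 3000) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
  const collector = createChunkCollector('audio/webm');
  recorder.addEventListener('dataavailable', event => collector.handleData(event));
  const stopped = new Promise(resolve => recorder.addEventListener('stop', resolve));

  recorder.start();
  setTimeout(() => recorder.stop(), durationMs);
  await stopped;

  // Release the microphone once the recording is complete.
  stream.getTracks().forEach(track => track.stop());
  return transcribe(collector.toBlob());
}
```

In a page, something like `recordAndTranscribe(5000).then(console.log)` would be called from a click handler, since browsers generally ask for microphone permission in response to a user action. Keeping the chunk collection and request building separate from the browser-only function means those parts can be checked without a microphone.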