Conversational Interfaces in iOS - Try Swift! Tokyo 2018

Wendy Lu
March 02, 2018

Transcript

  1. Conversational Interfaces in iOS

  3. “Conversation as a Platform”

  4. What are conversational interfaces?

  5. Onus on the software, not the user

  9. iOS Speech Recognition API, Amazon Lex, Google Speech API, OpenEars, Nuance

  10. Speech Recognition API

  11. Server-side recognition

  12. Free, but not unlimited

  13. Pre-recorded or live audio

  14. Over 50 languages and dialects

  15. iOS 10+, Internet connection required

  16. import Speech

      func recognizeRecording() {
          // Load the pre-recorded audio file "hi.m4a" from the app bundle
          guard let url = Bundle.main.url(forResource: "hi", withExtension: "m4a") else { return }
          guard let recognizer = SFSpeechRecognizer() else {
              // Device or locale not supported
              return
          }
          if !recognizer.isAvailable {
              // Internet connection may not be available
              return
          }
          let request = SFSpeechURLRecognitionRequest(url: url)
          recognizer.recognitionTask(with: request) { (result, error) in
              guard let result = result else { return }
              print("result: \(result.bestTranscription.formattedString)")
              if result.isFinal {
                  print("final result: \(result.bestTranscription.formattedString)")
              }
          }
      }

  17. import AVFoundation
      import Speech

      // These declarations live inside a class (hence `self` in the tap closure)
      let audioEngine = AVAudioEngine()
      let speechRecognizer = SFSpeechRecognizer()
      let request = SFSpeechAudioBufferRecognitionRequest()

      func startRecording() throws {
          let node = audioEngine.inputNode
          let recordingFormat = node.outputFormat(forBus: 0)
          // Forward each captured microphone buffer to the recognition request
          node.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { [weak self] (buffer, _) in
              self?.request.append(buffer)
          }
          audioEngine.prepare()
          try audioEngine.start()
          speechRecognizer?.recognitionTask(with: request, resultHandler: { (result, error) in
              guard let result = result else { return }
              print("result: \(result.bestTranscription.formattedString)")
          })
      }

      func stopRecording() {
          audioEngine.stop()
          // Remove the tap so a later startRecording() can install a fresh one
          audioEngine.inputNode.removeTap(onBus: 0)
          request.endAudio()
      }
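
Slides 16 and 17 call `SFSpeechRecognizer` directly. In a shipping app the user also has to grant permission before the first recognition request, and the Info.plist needs an `NSSpeechRecognitionUsageDescription` entry (plus `NSMicrophoneUsageDescription` for the live-audio case). A minimal sketch of that step follows; the function name `requestSpeechAuthorization` and its `Bool` completion are illustrative choices, not something shown in the talk.

```swift
import Speech

/// Asks the user for speech recognition permission and reports whether
/// recognition may proceed. `completion` is always called on the main queue.
/// (Hypothetical helper; only `SFSpeechRecognizer.requestAuthorization` is framework API.)
func requestSpeechAuthorization(completion: @escaping (Bool) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        // The handler is not guaranteed to run on the main queue,
        // so hop over before anything touches the UI.
        DispatchQueue.main.async {
            switch status {
            case .authorized:
                completion(true)
            case .denied, .restricted, .notDetermined:
                completion(false)
            @unknown default:
                completion(false)
            }
        }
    }
}
```

A caller would start recognition only from inside the completion, for example `requestSpeechAuthorization { granted in if granted { recognizeRecording() } }`.
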
  18. Google Speech API, Bing Speech API, OpenEars, Nuance, Amazon Lex

  19. Cost: free tier + paid

  26. func application(_ application: UIApplication,
                       didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool {
          let credentialsProvider = AWSCognitoCredentialsProvider(regionType: AWSRegionType.USEast1,
                                                                  identityPoolId: "your-pool-id")
          let serviceConfiguration = AWSServiceConfiguration(region: AWSRegionType.USEast1,
                                                             credentialsProvider: credentialsProvider)
          AWSServiceManager.default().defaultServiceConfiguration = serviceConfiguration

          let config = AWSLexInteractionKitConfig.defaultInteractionKitConfig(withBotName: "RecipeBot",
                                                                              botAlias: "Prod")
          // 5000 seconds before timeout
          config.noSpeechTimeoutInterval = 5000
          config.maxSpeechTimeoutInterval = 5000

          // We will use this key to retrieve the interaction kit in our view controller
          AWSLexInteractionKit.register(with: serviceConfiguration!,
                                        interactionKitConfiguration: config,
                                        forKey: "USEast1InteractionKit")
          return true
      }
  27. Listen for input

  28. let interactionKit = AWSLexInteractionKit(forKey: "USEast1InteractionKit")
      interactionKit.audioInAudioOut()

  29. interactionKit.audioInTextOut()
      interactionKit.textInTextOut()
      interactionKit.textInAudioOut()

  30. private func interactionKit(_ interactionKit: AWSLexInteractionKit,
                                  onDialogReadyForFulfillmentForIntent intent: String,
                                  slots: Dictionary<String, Any>) {
          print("Intent fulfilled: \(intent)")
      }
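
The callback on slide 30 receives the collected slot values as a dictionary of `String` to `Any`, but only prints the intent name. A small helper can fold the slots into the log line as well. This is a sketch under assumptions: the helper name `summarize`, the intent name `FindRecipe`, and the slot names `Dish` and `Servings` are made up for illustration, and slot values are assumed to arrive as strings.

```swift
/// Builds a readable summary of a fulfilled intent and its slots.
/// Slot values are typed `Any`, so each one is cast to `String` defensively;
/// values that are not strings, or are empty, are skipped.
func summarize(intent: String, slots: [String: Any]) -> String {
    let filled = slots
        .compactMap { (name, value) -> String? in
            guard let text = value as? String, !text.isEmpty else { return nil }
            return "\(name)=\(text)"
        }
        .sorted() // Dictionary order is unspecified, so sort for stable output

    return filled.isEmpty ? intent : "\(intent)(\(filled.joined(separator: ", ")))"
}

// Hypothetical intent and slot names, for illustration only
print(summarize(intent: "FindRecipe", slots: ["Dish": "ramen", "Servings": "2"]))
```

Inside the delegate method, the `print` call could then become `print("Intent fulfilled: \(summarize(intent: intent, slots: slots))")`.
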
  31. internal func interactionKit(onAudioPlaybackStarted _: AWSLexInteractionKit) {
          spinner.startAnimating()
      }

      internal func interactionKit(onAudioPlaybackFinished _: AWSLexInteractionKit) {
          spinner.stopAnimating()
      }
  33. func interactionKit(_ interactionKit: AWSLexInteractionKit, onError error: Error) {
          interactionKit.audioInAudioOut()
      }
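
The error handler on slide 33 starts listening again on every error. If the error is persistent (for example, no network), that could restart indefinitely. One possible guard is a small counter that allows only a few consecutive retries. The type below, `RetryBudget`, is a hypothetical helper written for this note; it is not part of the AWS SDK or the talk.

```swift
/// Counts consecutive failures so a listener is restarted only a limited number of times.
struct RetryBudget {
    let maxAttempts: Int
    private(set) var failures = 0

    init(maxAttempts: Int) {
        self.maxAttempts = maxAttempts
    }

    /// Records one failure and returns true while another retry is still allowed.
    mutating func shouldRetryAfterFailure() -> Bool {
        failures += 1
        return failures <= maxAttempts
    }

    /// Call after a successful exchange so counting starts from zero again.
    mutating func reset() {
        failures = 0
    }
}
```

In the `onError` callback, the restart would then be wrapped in `if retryBudget.shouldRetryAfterFailure() { interactionKit.audioInAudioOut() }`, with the `else` branch telling the user what went wrong, and `reset()` called once an intent is fulfilled.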

  34. Speech Recognition API
      • Completely free, but not unlimited
      • Built right into iOS

  35. Lex
      • Higher-level abstraction (text parsing, error handling)
      • Broad match of phrases to intents

  36. Information
      Lex
      • Cross Platform
      • Not free past first 5000 requests/month

  37. Best Practices

  38. Transparency!

  40. Sensitive information

  41. Be creative!

  42. Thanks! Wendy Lu @wendyluwho