Talking to Swift

Talk given at Swift Summit 2016, in San Francisco.

https://www.skilled.io/u/swiftsummit/talking-to-swift

Alexis Gallagher

November 07, 2016

Transcript

  1. Talking to Swift @alexisgallagher

  2. None
  3. None
  4. "Just use the keyboard." "A keyboard, how quaint..."

  5. None
  6. None
  7. None
  8. Bots are here ?!

  9. Conversational UI
     1. What is it? Why the hype? Is it real?
     2. How to design it?
     3. How to build it?
  10. What is it? Trend 1: Messaging

  11. None
  12. None
  13. None
  14. None
  15. Message Thread: an evolving interaction model

  16. What is it? Trend 2: Chatbots

  17. The DREAM is not new

  18. None
  19. None
  20. None
  21. The REALITY is not new either

  22. $ emacs -q -f doctor

  23. None
  24. None
  25. None
  26. But now, we really use them!
      • Siri (iOS, macOS, watchOS, tvOS)
      • Google (Search, Maps, Google Home)
      • Amazon Alexa
      • XFinity X1
      • Slackbot
  27. But now, we really use them!
      • Voice-based customer service phone tree
      • Text chat with automated customer service agent
      • Automated telephone survey
      • IM messaging spam
      • Email spam
  28. SiriKit?

  29. SiriKit.

  30. chatbots = restricted domain bots

  31. What is it? Trend 3: Voice

  32. None
  33. None
  34. None
  35. None
  36. Progress in speech recognition (within the last month!)

  37. Progress in speech synthesis (within the last month!)

  38. None
  39. What is it? Conversational UI Reality Check(list):
      • Messaging? Evolving new interaction model. No magic AI.
      • Chatbots? Still quite limited. No magic AI. (afaict).
      • Voice? Yes! Significant breakthroughs thanks to neural nets.
  40. How to Design it 1. Develop a character

  41. None
  42. What If? by Anne Bernays & Pamela Painter
      • Exercises for characterization
      • What does the character want?
      • Remember most from childhood?
      • Usually feel two hours after lunch?
      • What are they in denial of?
  43. How to Design it 2. Restrict the domain

  44. Voice UI is opposite of Message Thread UI

      Thread                Voice
      history accessible    no history
      identity visible      no identity
      rich media            text only
      guided input          freeform text
  45. Example character: Rochefoucauld domain: sorrows of life

  46. Rochefoucauld
      • François VI, Duc de La Rochefoucauld, Prince de Marcillac
      • Born 1613, Died 1680
      • Wealthy, noble, good-looking
      • Unhappily married, imprisoned, libeled, exiled, shot in the eye
  47. None
  48. Rochefoucauld
      • Maxims and other works
      • 500+ aphorisms on love, ambition, self-delusion, ennui, envy, esteem, evils, the exchange of secrets, etc.
      • character: cynical, worldly wise, witty
      • domain: life's sorrows
  49. We all have strength enough to bear the misfortunes of others
  50. To say that one never flirts is in itself a form of flirtation
  51. How to Build it

  52. None
  53. import AVFoundation

  54. Speech Synthesis API

      // The delegate protocol is AVSpeechSynthesizerDelegate, and it
      // requires an NSObject subclass.
      class MyDelegate: NSObject, AVSpeechSynthesizerDelegate {
          public func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                                        didFinish utterance: AVSpeechUtterance) {
              print("Just finished saying hello")
          }
      }

      let synth = AVSpeechSynthesizer()
      let delegate = MyDelegate()
      synth.delegate = delegate
      let utterance = AVSpeechUtterance(string: "Hello, world")
      synth.speak(utterance)
  55. import Speech

  56. Once, create and/or configure
      • AVAudioEngine, AVAudioSession, SFSpeechRecognizer
      Per utterance, request stream of live partial results
      • AVAudioNodeTapBlock, SFSpeechAudioBufferRecognitionRequest, SFSpeechRecognitionTask
      During utterances, handle callbacks with results
      • SFSpeechRecognitionResult
      • but SFSpeechRecognitionResult.isFinal is never
  57. Problem: "endpointing"

  58. Solution: Asynchronous Operation

  59. /**
       Operation which recognizes speech and finishes after it has reached an
       "endpoint", an interval over which no speech has been recognized
       (e.g., a period of silence or of noise).
       */
      public class SpeechRecognitionOperation: Operation, SFSpeechRecognitionTaskDelegate {
          open override var isAsynchronous: Bool { return true }
          public var output:String? = nil
          private var endpointTimer:Timer?
          private var request:SFSpeechAudioBufferRecognitionRequest?
          // ...
          public init(engine e:AVAudioEngine, recognizer r:SFSpeechRecognizer) { /* ... */ }
          // ...
      }
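
The endpoint rule described on this slide (finish once no speech has been recognized for some interval) can be sketched separately from the Speech framework. `EndpointDetector` and its method names are hypothetical, not taken from the talk's code; the talk's operation uses a `Timer`, while this sketch takes timestamps explicitly so the rule is easy to check.

```swift
import Foundation

/// Tracks when speech was last heard and reports whether an endpoint
/// (a long enough gap with no new speech) has been reached.
/// Hypothetical helper, not part of the Speech framework or the talk's code.
struct EndpointDetector {
    /// How long a gap without new speech counts as an endpoint.
    let silenceInterval: TimeInterval
    /// Time of the most recent partial result, or nil if none has arrived yet.
    private(set) var lastSpeechTime: TimeInterval? = nil

    init(silenceInterval: TimeInterval) {
        self.silenceInterval = silenceInterval
    }

    /// Call whenever the recognizer delivers a new partial result.
    mutating func noteSpeech(at time: TimeInterval) {
        lastSpeechTime = time
    }

    /// True once speech has started and then paused for at least `silenceInterval`.
    func hasReachedEndpoint(at time: TimeInterval) -> Bool {
        guard let last = lastSpeechTime else { return false }
        return time - last >= silenceInterval
    }
}
```

In an operation like the one above, each partial result would call `noteSpeech(at:)` (or reschedule the timer), and the operation would mark itself finished when the endpoint is reached.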
  60. None
  61. // Represents a bot, e.g., Eliza, Rochefoucauld, etc.
      protocol Interlocutor : class {
          init()
          func respondTo(saying:String) -> String
      }

      // wraps any Interlocutor in a speech rec/synth UI
      public class VoiceChatter : NSObject {
          public init(interlocutor:Interlocutor) { /* ... */ }
      }

      // receives a callback when a line of dialog is recognized/spoken.
      protocol VoiceChatterDelegate {
          func engineDidUpdateDialog(engine:VoiceChatter, dialogLines:[String])
      }
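
As a rough illustration of how a bot might plug into the `Interlocutor` protocol, here is a minimal sketch. `KeywordBot`, its canned replies, and `runDialog` are hypothetical; `runDialog` is only a text stand-in for the voice wrapper, whose implementation the slide elides.

```swift
import Foundation

// Same shape as the protocol on the slide; `AnyObject` is the current
// spelling of the `class` constraint.
protocol Interlocutor: AnyObject {
    init()
    func respondTo(saying: String) -> String
}

/// Hypothetical bot: answers with a canned line keyed on a word in the input.
final class KeywordBot: Interlocutor {
    private let replies: [(keyword: String, reply: String)] = [
        ("hello", "Hello. What troubles you?"),
        ("love", "Tell me more about love."),
    ]
    private let fallback = "Go on."

    init() {}

    func respondTo(saying: String) -> String {
        let lowered = saying.lowercased()
        // The first keyword found in the input wins.
        for (keyword, reply) in replies where lowered.contains(keyword) {
            return reply
        }
        return fallback
    }
}

/// Text-only stand-in for a voice UI: feeds each line to the bot
/// and returns the resulting dialog lines.
func runDialog(with bot: Interlocutor, lines: [String]) -> [String] {
    var dialog: [String] = []
    for line in lines {
        dialog.append("You: \(line)")
        dialog.append("Bot: \(bot.respondTo(saying: line))")
    }
    return dialog
}
```

Keeping the bot behind a string-in, string-out protocol means the same bot can be driven by text while testing and by speech in the app.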
  62. None
  63. Tree, where every node is a Q&A or a set of quotations:

      let rocheTree:ConversationNode = .Question(
          question: "Is your problem with your own feelings, or with other people?",
          answerPatterns: [
              ("feelings", .Question(
                  question: "And do you suffer from love, or from ambition?",
                  answerPatterns:[
                      ("love", .Statement(quotations:feelingsLoveQuotes)),
                      ("ambition", .Statement(quotations:feelingsAmbitionQuotes))
                  ])
              ),
              // ...
          ])
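
The slide does not show the `ConversationNode` type itself. One way it could be declared, inferred only from how `rocheTree` is built, is sketched below. The substring matching in `nextNode` and the `prompt` helper are assumptions for illustration; the talk may match answers differently.

```swift
import Foundation

/// A node is either a question with pattern-matched follow-ups,
/// or a terminal set of quotations. Case names follow the slide.
indirect enum ConversationNode {
    case Question(question: String, answerPatterns: [(String, ConversationNode)])
    case Statement(quotations: [String])
}

/// Returns the child whose pattern occurs in `answer` (case-insensitive),
/// or nil if this node is a statement or no pattern matches.
/// Assumes patterns are stored in lowercase.
func nextNode(after node: ConversationNode, answer: String) -> ConversationNode? {
    guard case let .Question(_, answerPatterns) = node else { return nil }
    let lowered = answer.lowercased()
    return answerPatterns.first(where: { lowered.contains($0.0) })?.1
}

/// What the bot says at this node: the question, or the first quotation
/// (an empty string if the statement has no quotations).
func prompt(for node: ConversationNode) -> String {
    switch node {
    case let .Question(question, _): return question
    case let .Statement(quotations): return quotations.first ?? ""
    }
}
```

A conversation is then a walk from the root: speak `prompt(for:)`, listen, and move to `nextNode(after:answer:)` until a statement is reached.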
  64. None
  65. None
  66. import Alexa !

  67. Amazon Alexa Custom Skills
      • 1000s of them, some quite fun!
      • deployed on the Amazon Echo device
      • most easily hosted on AWS Lambda ...
      • ... which only supports Node.js, Python, Java
  68. !

  69. None
  70. None
  71. None
  72. None
  73. Conclusions:
      • Yes, Swift can go anywhere!
      • ... even into conversational UI
      • Real hardware/software advances in speech
      • Works best in constrained domains
      • An interesting world where design is writing
      • Don't believe the hype (except on neural nets)
  74. Acknowledgements
      • Nick Jackson (@sheriffjackson), docker and terraform magic, in https://github.com/algal/SwiftOnLambda
      • Claus Höfele (@claushoefele), Alexa wrapper
      • Xavier Schott, Objective-C Eliza implementation
      • Swift@IBM, for Kitura and helpful docker images
  75. end @alexisgallagher

  76. Talking to Swift @alexisgallagher #swiftsummit

  77. The Most Human Human by Brian Christian
      • Pop sci book on chat bots
      • How they work, where they fail
      • Turing Test winners pretend to be distractible
  78. Swift Package Manager + Xcode:
      • isolate cross platform in modules
      • use SPM to manage its build
      • use SPM to generate xcodeproj files
      • contain iOS app in Xcode workspace
      • SPM-generated project is one imported module
      • iOS project imports and uses it
  79. None
  80. None
  81. What If? by Anne Bernays & Pamela Painter
      • Exercises for characterization
      • What does the character want?
      • Remember most from childhood?
      • Usually feel two hours after lunch?
      • What are they in denial of?
  82. None
  83. cgi-bin, all over again

      /**
       Reads all of stdin in a String buffer.
       Transforms the string.
       Prints result to stdout.
       */
      func readTransformPrint(transform:(String)->String) {
          var input:String = ""
          for line in lineGenerator(file: stdin) {
              input += line
          }
          let result = transform(input)
          print(result, separator: "", terminator: "")
      }
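
The slide's `lineGenerator(file:)` helper is not shown. A minimal sketch of the same read-transform-print shape using only the standard library's `readLine(strippingNewline:)` might look like the following. The injectable `nextLine` parameter and the `readAll` helper are assumptions added here so the function can be exercised without a real pipe.

```swift
/// Collects every line produced by `nextLine` (newlines included) into one string.
/// By default the lines come from stdin via `readLine`.
func readAll(nextLine: () -> String? = { readLine(strippingNewline: false) }) -> String {
    var input = ""
    while let line = nextLine() {
        input += line
    }
    return input
}

/// Reads all input, transforms it, and prints the result with no trailing newline.
func readTransformPrint(transform: (String) -> String,
                        nextLine: () -> String? = { readLine(strippingNewline: false) }) {
    print(transform(readAll(nextLine: nextLine)), terminator: "")
}
```

A stdin-to-stdout filter like this is enough for a host process to pipe a request in and read the response back, which is the cgi-bin analogy the slide makes.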
  84. None
  85. None
  86. None
  87. None
  88. None
  89. None