iOS Speech Recognition

KMKLabs
March 14, 2018

In this presentation, the speaker talks about speech recognition on iOS: where it is available and how to implement it in our apps. The presentation covers both online and offline speech recognition on iOS, and the speaker demonstrates what each can do.

For offline speech recognition we use OpenEars, and for online speech recognition we use Apple's Speech Recognition API and keyboard Dictation.

Transcript

  1. Speech recognition (SR) is the inter-disciplinary sub-field of computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as "automatic speech recognition" (ASR), "computer speech recognition", or just "speech to text" (STT). Source: Wikipedia :D
  2. iOS provides two ways to recognize speech
    - Keyboard Dictation
    - Speech Recognition API
  3. Keyboard Dictation
    - Available automatically on the keyboard
    - Available since iOS 5
    - Needs an internet connection to recognize speech
    - Needs no additional permission
  4. Cons of Keyboard Dictation
    - The recognition language depends on the active keyboard language
    - Cannot be triggered programmatically
    - Users are sometimes unaware of this feature
    - Needs an internet connection to work well
  5. Speech Recognition API
    - Introduced in 2016 (iOS 10)
    - Uses the same engine as Siri
    - Can recognize over 50 languages and dialects
    - Can be triggered programmatically
    - Free to use
  6. Introduction
    Created by Halle Winkler in Germany, OpenEars® is a shared-source iOS framework for iPhone voice recognition and speech synthesis (TTS). It lets you easily implement local, offline speech recognition in English and five other languages, and English text-to-speech (synthesized speech). Pocketsphinx (the open source voice recognition engine that OpenEars uses) is capable of local recognition of vocabularies with hundreds or even thousands of words depending on the environment and other factors, and performs very well with medium-sized language models (vocabularies). The best part is that it uses no network connectivity because all processing occurs locally on the device.
  7. Pocketsphinx
    - Supports several languages, such as US English, UK English, French, Mandarin, German, Dutch and Russian, with the ability to build models for others
    - Available for Android, iOS, Linux / Unix and Windows
    - Developed by Carnegie Mellon University
  8. Pros of OpenEars
    - Needs no internet connection to recognize words
    - Very lightweight; does not use many resources
    - Free
    - Easy to use
    - Very reliable
    - Supports 6 languages
    - Supports iOS 8.0+
  9. Cons of OpenEars
    - Limited vocabulary
    - Does not work well in noisy environments
    - Good for recognizing single words or short sentences, but not reliable for long sentences
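
The slides say the Speech Recognition API can be triggered programmatically but show no code. The following is a minimal sketch of what that might look like with Apple's Speech framework, transcribing a recorded audio file rather than live microphone input. `FileTranscriber`, `makeFileRequest(for:)` and the default `"en-US"` locale are illustrative names and values chosen here, not taken from the presentation.

```swift
import Foundation
import Speech

/// Builds a recognition request for a recorded audio file.
/// Partial results are disabled so the handler only needs the final transcript.
func makeFileRequest(for audioURL: URL) -> SFSpeechURLRecognitionRequest {
    let request = SFSpeechURLRecognitionRequest(url: audioURL)
    request.shouldReportPartialResults = false
    return request
}

/// Hypothetical helper (not from the slides) that owns the recognizer and its task.
final class FileTranscriber {
    private let recognizer: SFSpeechRecognizer?
    private var task: SFSpeechRecognitionTask?

    /// `recognizer` ends up nil when the locale is not supported.
    init(localeIdentifier: String = "en-US") {
        recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier))
    }

    /// Asks the user for permission, then transcribes the file.
    /// Calls `completion` with the transcript, or nil on any failure.
    func transcribe(_ audioURL: URL, completion: @escaping (String?) -> Void) {
        SFSpeechRecognizer.requestAuthorization { status in
            // This callback may arrive on a background queue.
            guard status == .authorized,
                  let recognizer = self.recognizer,
                  recognizer.isAvailable else {
                completion(nil)
                return
            }
            // Keep a reference to the task so it can be cancelled later if needed.
            self.task = recognizer.recognitionTask(with: makeFileRequest(for: audioURL)) { result, error in
                if let result = result, result.isFinal {
                    completion(result.bestTranscription.formattedString)
                } else if error != nil {
                    completion(nil)
                }
            }
        }
    }
}
```

A caller would create a `FileTranscriber`, keep it alive, and call `transcribe(_:completion:)` with the URL of an audio file. Unlike Keyboard Dictation, this path needs explicit permission: the app's Info.plist must contain an `NSSpeechRecognitionUsageDescription` entry before `requestAuthorization` is called. Live microphone recognition follows the same pattern but uses `SFSpeechAudioBufferRecognitionRequest` fed from an audio engine, which is omitted here for brevity.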