iOS Speech Recognition - Speaker Deck

Slide 1

Slide 1 text

iOS SPEECH RECOGNITION KURNIADI

Slide 2

Slide 2 text

What is Speech Recognition?

Slide 3

Slide 3 text

Speech recognition (SR) is the interdisciplinary sub-field of computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as "automatic speech recognition" (ASR), "computer speech recognition", or just "speech to text" (STT).
Source: Wikipedia

Slide 4

Slide 4 text

Voice or speech recognition?
- Voice -> who is speaking
- Speech -> what is being said

Slide 5

Slide 5 text

iOS provides 2 ways to recognize speech
1. Using keyboard dictation
2. Using the Speech Recognition API

Slide 6

Slide 6 text

Keyboard Dictation
- Available automatically on the keyboard
- Available since iOS 5
- Needs an internet connection to recognize speech
- Needs no additional permission

Slide 7

Slide 7 text

Cons
- Recognition depends on the active keyboard language preference
- Cannot be triggered programmatically
- Users are sometimes unaware of this feature
- Needs an internet connection to work well

Slide 8

Slide 8 text

Speech Recognition API
- Introduced in 2016
- Uses the same engine as Siri
- Can recognize over 50 languages / dialects
- Can be triggered programmatically
- Free to use
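
As a rough illustration of triggering recognition programmatically, the sketch below transcribes a prerecorded audio file with the Speech framework. The names `transcribeFile` and `TranscriptionError` are hypothetical helpers made up for this example; the sketch also assumes the user has already granted speech recognition permission and that the audio file exists at the given URL.

```swift
import Foundation
import Speech

/// Hypothetical error type for this sketch.
enum TranscriptionError: Error {
    case recognizerUnavailable
}

/// Transcribes an audio file and reports the final transcription.
/// `localeIdentifier` selects the language; the failable initializer
/// returns nil when the locale is not supported.
@discardableResult
func transcribeFile(at url: URL,
                    localeIdentifier: String = "en-US",
                    completion: @escaping (Result<String, Error>) -> Void) -> SFSpeechRecognitionTask? {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier)),
          recognizer.isAvailable else {
        // Unsupported locale, or the recognizer is not usable right now.
        completion(.failure(TranscriptionError.recognizerUnavailable))
        return nil
    }

    let request = SFSpeechURLRecognitionRequest(url: url)
    // Only the final result is needed here, not intermediate guesses.
    request.shouldReportPartialResults = false

    return recognizer.recognitionTask(with: request) { result, error in
        if let error = error {
            completion(.failure(error))
        } else if let result = result, result.isFinal {
            completion(.success(result.bestTranscription.formattedString))
        }
    }
}
```

For live microphone input, the same recognizer would typically be fed by an `SFSpeechAudioBufferRecognitionRequest` instead of a file-based request.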

Slide 9

Slide 9 text

Cons
- Requires iOS 10+
- Needs an internet connection
- Needs permission from the user
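
The permission requirement can be sketched roughly as follows. The helper names `describe` and `requestSpeechPermission` and the message wording are assumptions made for illustration. The app's Info.plist must also contain an `NSSpeechRecognitionUsageDescription` entry (plus `NSMicrophoneUsageDescription` when recording live audio), otherwise the request fails at runtime.

```swift
import Foundation
import Speech

/// Maps an authorization status to a short message (wording is illustrative).
func describe(_ status: SFSpeechRecognizerAuthorizationStatus) -> String {
    switch status {
    case .authorized:
        return "Speech recognition is authorized."
    case .denied:
        return "The user denied speech recognition."
    case .restricted:
        return "Speech recognition is restricted on this device."
    case .notDetermined:
        return "The user has not been asked yet."
    @unknown default:
        return "Unknown authorization status."
    }
}

/// Asks the user for permission and reports whether it was granted.
func requestSpeechPermission(completion: @escaping (Bool) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        // The handler may run off the main queue; hop back before touching UI.
        DispatchQueue.main.async {
            print(describe(status))
            completion(status == .authorized)
        }
    }
}
```

A typical call site would enable a record button only when `completion` receives `true`.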

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Is that not enough for you? WHY??

Slide 12

Slide 12 text

Must be connected to Internet

Slide 13

Slide 13 text

Introduction
Created by Halle Winkler in Germany, OpenEars® is a shared-source iOS framework for iPhone voice recognition and speech synthesis (TTS). It lets you easily implement local, offline speech recognition in English and five other languages, and English text-to-speech (synthesized speech). Pocketsphinx (the open source voice recognition engine that OpenEars uses) is capable of local recognition of vocabularies with hundreds or even thousands of words depending on the environment and other factors, and performs very well with medium-sized language models (vocabularies). The best part is that it uses no network connectivity because all processing occurs locally on the device.

Slide 14

Slide 14 text

Pocketsphinx
- Supports several languages, such as US English, UK English, French, Mandarin, German, Dutch, and Russian, with the ability to build models for others
- Available for Android, iOS, Linux / Unix, and Windows
- Developed by Carnegie Mellon University

Slide 15

Slide 15 text

Pros
- No internet connection needed to recognize words
- Very light; does not use many resources
- Free
- Easy to use
- Very reliable
- Supports 6 languages
- Supports iOS 8.0+

Slide 16

Slide 16 text

Cons
- Limited vocabulary
- Does not work well in noisy environments
- Good for recognizing single words or short sentences, but not reliable for long sentences

Slide 17

Slide 17 text

No content