as date/#/quantifier)
abbreviations (Mrs. – “missus”)
acronyms (H.I.V. – “aitch eye vee”)
word segmentation (NATO – “nayto”)
increasingly, may employ POS tagging as well as rules and dictionaries
Wednesday, 6 March, 13
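The normalization cases on this slide can be sketched with a few lookup tables. This is a minimal illustration, not a production front end: the table contents, the respellings, and the function names `normalize_token` and `normalize` are all assumptions made for the example.

```python
# Hypothetical, tiny lookup tables; a real system would use far larger ones.
ABBREVIATIONS = {"Mrs.": "missus", "Mr.": "mister", "Dr.": "doctor"}

# Acronyms that are read as ordinary words rather than spelled out.
WORD_ACRONYMS = {"NATO": "nayto"}

# Informal respellings of English letter names (illustrative only).
LETTER_NAMES = {
    "A": "ay", "B": "bee", "C": "see", "D": "dee", "E": "ee", "F": "eff",
    "G": "jee", "H": "aitch", "I": "eye", "J": "jay", "K": "kay",
    "L": "ell", "M": "em", "N": "en", "O": "oh", "P": "pee", "Q": "cue",
    "R": "ar", "S": "ess", "T": "tee", "U": "you", "V": "vee",
    "W": "double-you", "X": "ex", "Y": "why", "Z": "zee",
}


def normalize_token(token):
    """Expand one whitespace-delimited token into a speakable form."""
    # 1. Dictionary lookup for known abbreviations.
    if token in ABBREVIATIONS:
        return ABBREVIATIONS[token]
    # 2. Strip periods so "H.I.V." and "HIV" are treated alike.
    bare = token.replace(".", "")
    # 3. Acronyms pronounced as words.
    if bare in WORD_ACRONYMS:
        return WORD_ACRONYMS[bare]
    # 4. Remaining all-caps tokens are spelled out letter by letter.
    if len(bare) > 1 and bare.isalpha() and bare.isupper():
        return " ".join(LETTER_NAMES.get(ch, ch) for ch in bare)
    # 5. Everything else passes through unchanged.
    return token


def normalize(text):
    """Normalize a whole string token by token."""
    return " ".join(normalize_token(tok) for tok in text.split())


if __name__ == "__main__":
    print(normalize("Mrs. Smith studies H.I.V. for NATO"))
```

The rule order matters: the dictionary lookups run before the spell-it-out rule so that NATO is not read as four letter names.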
alphabet
controversy = /k o1 n t r ax0 v er2 s iy/
dynamic time warping (DTW)
disambiguation (record: noun /rɛkɝd/ vs. verb /rɪkɔrd/)
often uses dictionary- and rule-based approaches, and probably a lexicon, to determine proper word choice
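Homograph disambiguation of the "record" kind can be sketched as a lexicon lookup keyed by part of speech, with a crude rule to guess the part of speech. The cue-word list, the noun default, and the names `guess_pos` and `pronounce` are assumptions for illustration; a real system would use a proper POS tagger.

```python
# Lexicon of homographs: word -> part of speech -> pronunciation.
# The two pronunciations of "record" are taken from the slide.
HOMOGRAPHS = {
    "record": {"noun": "rɛkɝd", "verb": "rɪkɔrd"},
}

# Hypothetical rule: these preceding words suggest a verb reading.
VERB_CUES = {"to", "will", "can", "i", "we", "you", "they"}


def guess_pos(prev_word):
    """Guess noun or verb from the preceding word (toy rule only)."""
    if prev_word is not None and prev_word.lower() in VERB_CUES:
        return "verb"
    return "noun"  # default when no verb cue is present


def pronounce(words):
    """Replace known homographs with a pronunciation; pass others through."""
    out = []
    for i, word in enumerate(words):
        entry = HOMOGRAPHS.get(word.lower())
        if entry is None:
            out.append(word)
            continue
        prev_word = words[i - 1] if i > 0 else None
        out.append("/" + entry[guess_pos(prev_word)] + "/")
    return out


if __name__ == "__main__":
    print(pronounce("I will record the record".split()))
```

The lexicon supplies the candidate pronunciations and the rule picks between them, which mirrors the dictionary-plus-rules split described on the slide.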
“I am speaking”
accurate prosody modelling essential for natural-sounding systems (instead of flat robot sounds)
“sentence-final” prosody
NLP processes that can identify emotion or sentiment will help increase the accuracy of prosody generation in TTS applications.
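To make "sentence-final" prosody concrete, here is a toy pitch-contour sketch: one pitch target per word, a gradual downward drift across the sentence, and a final fall for statements or rise for questions. The function name `f0_contour`, the 120 Hz base, and the drift and shift sizes are all illustrative assumptions, and the question rule is a simplification, since not every question ends with a rise.

```python
def f0_contour(sentence, base_hz=120.0, declination=0.1, final_shift=0.2):
    """Return one pitch target (in Hz) per word of the sentence.

    Pitch drifts down linearly by `declination` (a fraction of base_hz)
    over the sentence; the last word is then lowered for a statement or
    raised for a question by `final_shift`.
    """
    words = sentence.split()
    if not words:
        return []
    n = len(words)
    contour = []
    for i in range(n):
        # Position in the sentence, from 0.0 (first word) to 1.0 (last).
        frac = i / (n - 1) if n > 1 else 0.0
        contour.append(base_hz * (1.0 - declination * frac))
    # Sentence-final adjustment based on the closing punctuation.
    if sentence.rstrip().endswith("?"):
        contour[-1] *= 1.0 + final_shift
    else:
        contour[-1] *= 1.0 - final_shift
    return contour


if __name__ == "__main__":
    print(f0_contour("I am speaking."))
    print(f0_contour("Are you speaking?"))
```

A flat "robot" voice corresponds to every target being equal to `base_hz`; even this crude contour separates a statement from a question.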
synthesizer to read, there are two main kinds of speech synthesis:
synthesis by rule
  formant-based (Speak & Spell)
  articulation-based (speech “organs”)
concatenative synthesis (w/ real voices)
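The core of concatenative synthesis, joining recorded units end to end, can be sketched as follows. Units are plain lists of samples and the joins are smoothed with a linear crossfade; the function name `crossfade_concat` and the choice of a linear crossfade are assumptions for illustration, and real systems also handle unit selection and pitch matching.

```python
def crossfade_concat(units, overlap):
    """Join sample lists, blending `overlap` samples at each boundary."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never overlap more samples than either side actually has.
        k = min(overlap, len(out), len(unit))
        start = len(out) - k
        for i in range(k):
            # Weight rises across the overlap: old unit fades out,
            # new unit fades in.
            w = (i + 1) / (k + 1)
            out[start + i] = (1.0 - w) * out[start + i] + w * unit[i]
        # Append the part of the new unit that lies past the overlap.
        out.extend(unit[k:])
    return out


if __name__ == "__main__":
    a = [0.0, 0.5, 1.0, 0.5]
    b = [0.5, 0.0, -0.5, -1.0]
    print(crossfade_concat([a, b], overlap=2))
```

With `overlap=0` this reduces to plain concatenation, which tends to produce audible clicks at the joins; the crossfade is the simplest way to soften them.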
Linguistics. Oxford University Press.
Taylor, P. A. (2009). Text-to-speech synthesis. Cambridge, U.K.; New York: Cambridge University Press.
Sasirekha, D., & Chandra, E. (2012). Text to speech: A simple tutorial. International Journal of Soft Computing and Engineering, 2(1), 275–278.