Text-to-speech [synthesis]: Technical Presentation

text-to-speech [synthesis] II: this time, it’s technical
Wednesday, 6 March, 13

a review

except, it turns out...

text analysis and text normalization
text normalization (“verbalizing”):
  numbers (1772 as date/#/quantifier)
  abbreviations (Mrs. – “missus”)
  acronyms (H.I.V. – “aitch eye vee”)
  word segmentation (NATO – “nayto”)
increasingly, may employ POS tagging as well as rules and dictionaries
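The normalization steps on this slide can be sketched in a few lines. This is only a minimal illustration, not how any particular TTS system does it: the lookup tables, the `YEAR_CUES` context words, and all function names are hypothetical, and the rule "a four-digit number after 'in' is a date" is an assumed stand-in for the POS tagging and dictionaries a real front end would use.

```python
import re

# Hypothetical, deliberately tiny lookup tables (assumptions for illustration).
ABBREVIATIONS = {"Mrs.": "missus", "Dr.": "doctor"}
LETTER_NAMES = {"H": "aitch", "I": "eye", "V": "vee"}
YEAR_CUES = {"in", "since"}  # assumed context words that suggest a date
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Verbalize an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def as_year(n):
    """Read a number as a year, in pairs of digits (crude: 2005 comes out oddly)."""
    high, low = divmod(n, 100)
    if low == 0:
        return two_digits(high) + " hundred"
    if low < 10:
        return two_digits(high) + " oh " + ONES[low]
    return two_digits(high) + " " + two_digits(low)


def as_cardinal(n):
    """Read an integer in the range 0..9999 as a quantity."""
    parts = []
    thousands, rest = divmod(n, 1000)
    hundreds, low = divmod(rest, 100)
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if low or not parts:
        parts.append(two_digits(low))
    return " ".join(parts)


def normalize(text):
    """Replace abbreviations, dotted acronyms, and short numbers with words."""
    words = []
    previous = ""
    for token in text.split():
        if token in ABBREVIATIONS:
            spoken = ABBREVIATIONS[token]
        elif re.fullmatch(r"(?:[A-Z]\.){2,}", token):
            # Dotted acronym: spell it out letter by letter.
            letters = token.replace(".", "")
            spoken = " ".join(LETTER_NAMES.get(c, c) for c in letters)
        elif re.fullmatch(r"[0-9]{1,4}", token):
            number = int(token)
            if len(token) == 4 and previous.lower() in YEAR_CUES:
                spoken = as_year(number)
            else:
                spoken = as_cardinal(number)
        else:
            spoken = token
        words.append(spoken)
        previous = token
    return " ".join(words)


print(normalize("Mrs. Smith was born in 1772"))
print(normalize("1772 people have H.I.V."))
```

The same digits "1772" are verbalized differently in the two sentences, which is the ambiguity the slide points at. Tokens with trailing punctuation (for example "1772,") are left untouched by this sketch.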

phonetic analysis
grapheme to phoneme conversion
  something like the phonetic alphabet
  controversy = /k o1 n t r ax0 v er2 s iy/
dynamic time warping (dtw)!
disambiguation (record: /rɛkɝd/ vs. /rɪkɔrd/)
often uses dictionary- and rule-based approaches, and probably a lexicon, to determine proper word choice
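A dictionary-plus-rules approach to grapheme-to-phoneme conversion might look roughly like the sketch below. Everything here is an assumption made for illustration: the `LEXICON` contents beyond the two words on the slide, the `NAIVE_LETTER_SOUNDS` fallback table, and the idea that a POS tag ("noun" or "verb") arrives from the text-analysis stage to pick between the two readings of "record".

```python
# Hypothetical pronunciation lexicon. Keys under each word are POS tags;
# None marks a word with a single reading.
LEXICON = {
    "controversy": {None: "k o1 n t r ax0 v er2 s iy"},
    "record": {"noun": "rɛkɝd", "verb": "rɪkɔrd"},
}

# Assumed, very naive letter-to-sound rules used when a word is not listed.
NAIVE_LETTER_SOUNDS = {"c": "k", "q": "k", "x": "k s"}


def pronounce(word, pos=None):
    """Return a phoneme string for `word`, using `pos` to resolve homographs."""
    entries = LEXICON.get(word.lower())
    if entries is not None:
        if pos in entries:
            return entries[pos]
        # Unknown or missing POS: fall back to the first listed reading.
        return next(iter(entries.values()))
    # Out-of-lexicon word: apply the letter-to-sound rules one letter at a time.
    return " ".join(NAIVE_LETTER_SOUNDS.get(ch, ch) for ch in word.lower())


print(pronounce("record", "noun"))
print(pronounce("record", "verb"))
print(pronounce("controversy"))
print(pronounce("cat"))
```

The lexicon handles known words and homographs, while the rule table only catches words the lexicon misses. A per-letter fallback like this is far too crude for English spelling and is shown only to mark where rules would plug in.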

prosody prediction
pattern, rhythm, and intonation
“I am speaking” / “I am speaking”
accurate prosody modelling essential for natural-sounding systems (instead of flat robot sounds)
“sentence-final” prosody
NLP processes that can identify emotion or sentiment will help increase the accuracy of prosody generation in TTS applications.
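As a rough illustration of rule-based prosody prediction, the sketch below labels each word with a stress mark and a tone. It rests on two assumptions that are not stated on the slide: that the two readings of "I am speaking" differ in which word is emphasized, and that sentence-final prosody can be approximated as a fall for statements and a rise for questions. The function name and the label vocabulary are hypothetical.

```python
def predict_prosody(sentence, emphasis=None):
    """Label each word as (word, stress, tone) using two simple rules.

    stress: "accented" for the emphasized word, otherwise "plain".
    tone:   "level" everywhere except the final word, which gets the
            sentence-final contour ("rise" for questions, "fall" otherwise).
    """
    text = sentence.strip()
    final_contour = "rise" if text.endswith("?") else "fall"
    words = text.rstrip(".?!").split()
    labels = []
    for index, word in enumerate(words):
        is_last = index == len(words) - 1
        tone = final_contour if is_last else "level"
        accented = emphasis is not None and word.lower() == emphasis.lower()
        labels.append((word, "accented" if accented else "plain", tone))
    return labels


print(predict_prosody("I am speaking.", emphasis="am"))
print(predict_prosody("I am speaking.", emphasis="speaking"))
print(predict_prosody("I am speaking?"))
```

A sentiment or emotion classifier, as the slide suggests, could feed this kind of function by choosing the emphasized word or selecting a different contour, instead of relying on punctuation alone.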

acoustic modelling
once we have a phonetic structure for the synthesizer to read, there are two main kinds of speech synthesis:
synthesis by rule
  formant-based (speak & spell)
  articulation-based (speech “organs”)
concatenative synthesis (w/real voices)
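Concatenative synthesis can be pictured as looking up prerecorded units and joining them. The sketch below assumes the units are short lists of audio samples stored in a hypothetical `inventory` dictionary, and it smooths each join with a linear crossfade. The crossfade is an illustrative choice, not something the slide specifies, and the toy sample values stand in for real recordings.

```python
def crossfade_concat(units, overlap):
    """Join sample lists end to end, blending `overlap` samples at each seam."""
    output = list(units[0])
    for unit in units[1:]:
        unit = list(unit)
        n = min(overlap, len(output), len(unit))
        start = len(output) - n
        for i in range(n):
            # The weight moves from the outgoing unit toward the incoming one.
            weight = (i + 1) / (n + 1)
            output[start + i] = (1 - weight) * output[start + i] + weight * unit[i]
        # Append the part of the new unit that was not blended into the seam.
        output.extend(unit[n:])
    return output


def synthesize(unit_names, inventory, overlap=2):
    """Look up each named unit in the inventory and concatenate the results."""
    return crossfade_concat([inventory[name] for name in unit_names], overlap)


# Toy inventory: constant-valued "recordings" so the blend is easy to inspect.
toy_inventory = {
    "a": [1.0, 1.0, 1.0, 1.0],
    "b": [0.0, 0.0, 0.0, 0.0],
}
print(synthesize(["a", "b"], toy_inventory, overlap=2))
```

With an overlap of 2, the last two samples of "a" are blended toward "b", so the joined signal is six samples long and steps down gradually instead of jumping from 1.0 to 0.0. Synthesis by rule would instead generate the samples from parameters such as formant frequencies, with no recorded inventory at all.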

thanks again!
Mitkov, R. (2005). The Oxford Handbook of Computational Linguistics. Oxford University Press.
Taylor, P. A. (2009). Text-to-speech synthesis. Cambridge, U.K.; New York: Cambridge University Press.
Sasirekha, D., & Chandra, E. (2012). Text to Speech: A Simple Tutorial. International Journal of Soft Computing and Engineering, 2(1), 275–278.

AhemNason
