Slide 7
Statistical analysis of the corpus
• We are interested in word length, measured both in characters and in
number of syllables.
• Due to the special word structure in Vietnamese, these values are
computed as follows:
• Word length in characters is counted excluding any blanks within the
word.
• The number of syllables is the number of blanks within the word plus
one.
• The average syllable length is computed per word first (characters divided
by syllables for that word); these per-word values are then averaged.
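The three measures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the corpus: the function names are hypothetical, each word is assumed to be a string whose syllables are separated by single blanks, and the text is normalized to NFC so that an accented letter counts as one character.

```python
import unicodedata


def word_length_chars(word: str) -> int:
    """Word length in characters, excluding blanks inside the word."""
    # Assumption: NFC normalization, so a letter with diacritics is one character.
    normalized = unicodedata.normalize("NFC", word.strip())
    return len(normalized.replace(" ", ""))


def syllable_count(word: str) -> int:
    """Number of syllables: blanks within the word plus one."""
    # Assumption: syllables are separated by exactly one blank.
    return word.strip().count(" ") + 1


def mean_syllable_length(words: list[str]) -> float:
    """Average syllable length: computed per word, then averaged over words."""
    per_word = [word_length_chars(w) / syllable_count(w) for w in words]
    return sum(per_word) / len(per_word)


# Hypothetical sample words, chosen only for illustration.
sample = ["sinh viên", "học", "đại học"]
print([word_length_chars(w) for w in sample])
print([syllable_count(w) for w in sample])
print(mean_syllable_length(sample))
```

Averaging per word gives every word equal weight, which is not the same as dividing the total number of characters by the total number of syllables in the corpus.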