Slide 10
Dictionary format
A dictionary is a data structure that lists the available terms and describes how those
terms may appear next to each other, based on Japanese grammar or probability.
Let's look at the IPADIC dictionary format.
The first four columns are the most important: Surface, Left Context ID, Right Context ID,
and Cost. The remaining columns hold metadata such as the part of speech, reading, and
pronunciation of the term.
原村,1293,1293,8684,名詞,固有名詞,地域,一般,*,*,原村,ハラムラ,ハラムラ
大倉谷地,1293,1293,8676,名詞,固有名詞,地域,一般,*,*,大倉谷地,オオクラヤチ,オークラヤチ
駒ケ崎,1293,1293,8676,名詞,固有名詞,地域,一般,*,*,駒ケ崎,コマガサキ,コマガサキ
里本江,1293,1293,8676,名詞,固有名詞,地域,一般,*,*,里本江,サトホンゴ,サトホンゴ
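
To make the column layout concrete, here is a minimal Python sketch that splits entries like the ones above into the four leading columns and the trailing metadata. The names `Entry`, `parse_entries`, and `SAMPLE` are hypothetical and chosen only for illustration. The sketch assumes each line is plain comma-separated text with the numeric columns in positions 2 to 4, as in the sample rows.

```python
import csv
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One dictionary line (hypothetical helper type, not part of IPADIC)."""
    surface: str         # column 1: the term as it appears in text
    left_id: int         # column 2: Left Context ID
    right_id: int        # column 3: Right Context ID
    cost: int            # column 4: Cost
    features: List[str]  # remaining columns: part of speech, reading, pronunciation, ...


def parse_entries(text: str) -> List[Entry]:
    """Parse comma-separated dictionary lines into Entry records."""
    entries = []
    for row in csv.reader(text.splitlines()):
        if not row:
            continue  # skip blank lines
        surface, left_id, right_id, cost, *features = row
        entries.append(Entry(surface, int(left_id), int(right_id), int(cost), features))
    return entries


# Two of the sample rows shown above.
SAMPLE = """\
原村,1293,1293,8684,名詞,固有名詞,地域,一般,*,*,原村,ハラムラ,ハラムラ
大倉谷地,1293,1293,8676,名詞,固有名詞,地域,一般,*,*,大倉谷地,オオクラヤチ,オークラヤチ
"""

entries = parse_entries(SAMPLE)
for e in entries:
    # In these rows the last two metadata columns are the reading and pronunciation.
    print(e.surface, e.left_id, e.right_id, e.cost, e.features[-2:])
```

Keeping the metadata as an untyped list reflects the point made above: only the first four columns are interpreted here, and everything after them is carried along as descriptive information.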