Slide 5
Slide 5 text
Proposed method 1: PinyinGPT-Concat
Concatenate the pinyin to the context and feed it in as input
• The positional encoding is adjusted
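[Figure, top panel (PinyinGPT-Concat). Input: [CLS] 我 下 周 有 时 间 , 除 了 (context of Chinese characters) [SEP] l b y y d s (abbreviated pinyin) [SEP] 礼 拜 一 有 点 事 (target Chinese characters). Output: 礼 拜 一 有 点 事 [EOS]. Position ids: 0 for [CLS], 1-9 for the context, 10 for each [SEP], and 11-16 for both the pinyin and the target characters. Bottom panel (Pinyin-Embed): 我 下 周 有 时 间 , 除 了 礼 拜 一 有 点 事 [EOS].]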
Figure 1: An illustration of the training process of Pinyin-Concat (top) and Pinyin-Embed (bottom), respectively.
The example is the same as the instance of s2 from Table 2.
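The adjusted positional encoding can be illustrated with a small sketch. This is not the authors' code: the function name `build_concat_input` and the exact id assignment are assumptions, inferred from the position ids visible in the figure, where the second [SEP] and the target characters appear to reuse the ids of the first [SEP] and the pinyin tokens.

```python
def build_concat_input(context, pinyin, target):
    """Build tokens and position ids for a Concat-style training instance.

    Assumption (read off the figure, not confirmed by the slide text):
    the second [SEP] reuses the position id of the first [SEP], and each
    target character reuses the position id of its pinyin token.
    """
    if len(pinyin) != len(target):
        raise ValueError("expected one pinyin token per target character")

    tokens = ["[CLS]"] + context + ["[SEP]"] + pinyin + ["[SEP]"] + target

    n_ctx = 1 + len(context)        # [CLS] plus the context characters
    ctx_pos = list(range(n_ctx))    # 0 .. len(context)
    sep_pos = n_ctx                 # id shared by both [SEP] tokens
    span_pos = list(range(sep_pos + 1, sep_pos + 1 + len(pinyin)))

    position_ids = ctx_pos + [sep_pos] + span_pos + [sep_pos] + span_pos
    return tokens, position_ids


# The example from the figure.
context = "我 下 周 有 时 间 , 除 了".split()
pinyin = "l b y y d s".split()
target = "礼 拜 一 有 点 事".split()

tokens, position_ids = build_concat_input(context, pinyin, target)
for tok, pos in zip(tokens, position_ids):
    print(pos, tok)
```

Under this assumption each target character sits at the same position as the pinyin abbreviation it must match, which may be the point of the adjustment: the model can align "l" with "礼", "b" with "拜", and so on.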
... instance of [w_1, ..., w_n, [SEP], p_{n+1}, ..., p_{n+k},
[SEP], w_{n+1}, ..., w_{n+k}], the model is trained
to minimize the following loss function, where
w ...