Slide 13
Slide 13 text
2.4 Interpreting the Transformer Module
▲Figure 2.13: LSH Attention (Figure 2 in the Reformer paper)
In Reformer, the LSH (Locality Sensitive Hashing) Attention shown in Figure 2.13 restricts the attention computation to a specific subset of words, which improves computational efficiency. Whereas most Transformer-derived research centers on handling sequences of around 1,000 words, Reformer can handle more than ten times that many. The theoretical development of the ideas used in Reformer takes the handling of graphs into considerable account, so studying it together with graph neural networks deepens one's understanding.
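To make the bucketing idea concrete, here is a minimal NumPy sketch of angular LSH attention in the spirit of Reformer: word vectors are hashed into buckets by a random rotation, and attention is computed only within each bucket. The function names (`lsh_bucket`, `lsh_attention`) and the single-hash-round, single-head, tied query/key simplifications are illustrative assumptions, not Reformer's actual implementation.

```python
import numpy as np

def lsh_bucket(vectors, n_buckets=8, seed=0):
    """Assign each vector to a bucket via angular LSH:
    project onto random directions R and take the argmax over
    [xR; -xR], i.e. the nearest of n_buckets angular regions."""
    rng = np.random.default_rng(seed)
    d = vectors.shape[-1]
    R = rng.standard_normal((d, n_buckets // 2))
    proj = vectors @ R
    return np.argmax(np.concatenate([proj, -proj], axis=-1), axis=-1)

def lsh_attention(q, v, n_buckets=8):
    """Attend only within LSH buckets. Queries and keys are tied
    (as in Reformer), so similar vectors share a bucket."""
    buckets = lsh_bucket(q, n_buckets=n_buckets)
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]
        scores = q[idx] @ q[idx].T / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[idx] = weights @ v[idx]  # softmax only over same-bucket words
    return out

# Toy usage: 16 word vectors of dimension 8
x = np.random.default_rng(1).standard_normal((16, 8))
y = lsh_attention(x, x)
print(y.shape)  # (16, 8)
```

The point of the sketch is the cost structure: each softmax runs over a small bucket rather than the full sequence, which is what lets the sequence length grow.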
Taking the analysis a step further, the Transformer can be viewed as a soft structure that performs the attention computation between every pair of words, while Reformer can be viewed as a hard structure that computes attention only between particular, highly related pairs of words. By understanding the Transformer as a graph neural network, the seemingly complex processing of these Transformer derivatives can be interpreted as a graph neural network whose nodes are the words, which makes intuitive understanding and analysis possible.
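As an illustration of this soft-versus-hard view, the hypothetical sketch below treats one attention layer as message passing on a graph whose nodes are words: with no mask the graph is fully connected (standard Transformer), while a boolean adjacency mask prunes edges so only selected pairs exchange messages (Reformer-style hard attention). The tied query/key/value vectors and the random mask are simplifying assumptions for illustration only.

```python
import numpy as np

def attention_as_gnn(x, adj_mask=None):
    """One attention layer viewed as message passing on a word graph:
    nodes are word vectors, edge weights are softmax attention scores.
    adj_mask (bool matrix) restricts which edges exist; None means a
    fully connected graph, i.e. soft attention over all pairs."""
    scores = x @ x.T / np.sqrt(x.shape[-1])           # candidate edge weights
    if adj_mask is not None:
        scores = np.where(adj_mask, scores, -np.inf)  # prune edges (hard)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                # normalized adjacency
    return w @ x                                      # aggregate neighbor messages

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))                # 6 word nodes

dense = attention_as_gnn(x)                    # Transformer: all pairs (soft)
mask = rng.random((6, 6)) < 0.4                # keep only a few "related" pairs
np.fill_diagonal(mask, True)                   # self-loops keep every row valid
sparse = attention_as_gnn(x, adj_mask=mask)    # Reformer-style hard structure
```

Both calls run the same aggregation; only the edge set of the word graph differs, which is exactly the soft/hard distinction drawn above.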