
Introduction to Multilingual Language Models: Past and Future (マルチリンガルな言語モデル入門:これまでとこれから)

Ryokan RI
August 31, 2023


IPSJ SIG-NL, 257th Natural Language Processing Research Meeting (2023), invited talk

https://sites.google.com/sig-nl.ipsj.or.jp/sig-nl/%E7%A0%94%E7%A9%B6%E7%99%BA%E8%A1%A8%E4%BC%9A/NL257


Transcript

  1. Introduction to Multilingual Language Models: Past and Future
     2023/9/1
     IPSJ SIG-NL, 257th meeting
     Ryokan Ri (LINE Corporation)

  2. About me
     ‣ 2018 - 2023: The University of Tokyo, Tsuruoka Lab (Master's - PhD)
     ‣ 2023 -: LINE Corporation (NLP engineer)
     Recently, I wrote a book together with the people who mentored me during my student-days internship.

  3. In my childhood…
     [slide illustration: me, China, arithmetic]

  4. At school
     Arithmetic!

  5. Studying in Chinese and taking tests in Japanese:
     Cross-lingual Transfer Learning

  6. Cross-lingual transfer learning
     Building a model that can also handle data in other languages (e.g. Japanese) in a setting where training data exists only in one particular language (e.g. Chinese).
     Q. Why do this?
     ‣ Because the world has many languages for which labeled data is scarce.

  7. Languages of the world and their data volume
     [figure: number of languages per resource class: 7 languages (e.g. Japanese, English); 2191 languages; 222 languages]
     From Figure 1 of The State and Fate of Linguistic Diversity and Inclusion in the NLP World (Joshi et al., ACL 2020)

  8. Is ChatGPT also doing cross-lingual transfer learning?

  9. Is ChatGPT also doing cross-lingual transfer learning?
     According to the experiments in the InstructGPT paper…
     ‣ 96% of the instruction-tuning training data is English.
     ‣ Yet the fine-tuned GPT-3 generalizes well to other languages.
     Q. Why is this possible?
     ‣ The base language model was trained on multiple languages, so it already has the capacity for cross-lingual transfer learning.

  10. Today's topics
      ‣ How are multilingual language models built?
      ‣ Unsolved mysteries of multilingual language models
      ‣ The future of multilingual language models

  11. Today's topics
      ‣ How are multilingual language models built?
      ‣ Unsolved mysteries of multilingual language models
      ‣ The future of multilingual language models

  12. Monolingual word vectors
      A representation of how close words are in meaning, expressed in a vector space.
      [slide illustration: 犬 "dog", 虎 "tiger", コンピュータ / 計算機 "computer", and the function words と, も, ない]
      Vectors of similar words end up close together.

  13. Multilingual word vectors
      A representation of how close words are in meaning, expressed in a vector space regardless of language.
      [slide illustration: Japanese words 犬, 計算機, コンピュータ, と, も, ない intermixed with English words cat, dog, computer, not, to, and]
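As a toy illustration of the point above (vectors of similar words end up close together), "closeness" is typically measured with cosine similarity. A minimal sketch; the 3-dimensional vectors below are invented for illustration, not real embeddings:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity: 1.0 means identical direction, ~0 means unrelated.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy, hand-made 3-d "embeddings" (illustrative only).
vecs = {
    "dog":      np.array([0.9, 0.1, 0.0]),
    "tiger":    np.array([0.8, 0.2, 0.1]),
    "computer": np.array([0.0, 0.1, 0.9]),
}

# "dog" should be closer to "tiger" (another animal) than to "computer".
assert cosine(vecs["dog"], vecs["tiger"]) > cosine(vecs["dog"], vecs["computer"])
```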

  14. How to build multilingual word vectors
      ① Learn the vectors using parallel data.
      In addition to predicting the words surrounding each word in the text, the model also predicts the word's translation.
      From Figure 1 of Bilingual Word Representations with Monolingual Quality in Mind (Luong et al., LatentVar 2015)

  15. Q. Can we build multilingual word vectors without parallel data?
      ‣ Yes, by exploiting structural commonalities between languages.

  16. The isomorphism assumption for word vector spaces
      Word embedding spaces trained separately for different languages end up with similar structure.
      [slide illustration: English vs. Spanish embedding spaces]
      From Figure 1 of Exploiting Similarities among Languages for Machine Translation (Mikolov et al., arXiv 2013)

  17. Obtaining multilingual word vectors by mapping
      Train monolingual word vectors.

  18. Obtaining multilingual word vectors by mapping
      Train monolingual word vectors, then map them into a shared space with a linear transformation.

  19. Obtaining multilingual word vectors by mapping
      With parallel data:
      ‣ Prepare a bilingual dictionary, and learn a mapping that makes the vectors of each translation pair coincide.
      Without parallel data (alternating dictionary induction ⇆ mapping learning):
      ‣ Use adversarial training to learn a mapping under which source- and target-language vectors become indistinguishable.
      ‣ Induce a bilingual dictionary from the similarity of the words' similarity vectors.
      Exploiting Similarities among Languages for Machine Translation (Mikolov et al., arXiv 2013)
      Word translation without parallel data (Lample et al., ICLR 2018)
      A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings (Artetxe et al., ACL 2018)
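The supervised variant on this slide (align the vectors of dictionary word pairs with a learned linear map) has a well-known closed-form solution when the map is constrained to be orthogonal: orthogonal Procrustes via SVD, which is also the refinement step used in the unsupervised line of work. A minimal numpy sketch, under the toy assumption that the target space really is an exact rotation of the source space:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "embeddings": row i of X is a source-language word vector, row i of Y
# is its translation's vector. Here Y is X under an unknown rotation Q,
# simulating a perfectly isomorphic pair of spaces.
d, n = 5, 40
X = rng.normal(size=(n, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal "true" map
Y = X @ Q

# Orthogonal Procrustes: W = U V^T with U S V^T = SVD(X^T Y)
# minimizes ||X W - Y||_F over orthogonal W.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

assert np.allclose(X @ W, Y, atol=1e-8)  # the dictionary pairs now align
```

In practice X and Y would hold the embeddings of a seed dictionary's word pairs, and real spaces are only approximately isomorphic, so the alignment is inexact.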

  20. Using adversarial training to learn a mapping under which source- and target-language vectors become indistinguishable.
      Because the two vector spaces have the same shape, the result roughly aligns them.
      (This is followed by a step that iteratively refines the mapping.)
      From Figure 1 of Word translation without parallel data (Lample et al., ICLR 2018)

  21. Limits of the isomorphism assumption
      For the shapes of two word vector spaces to match:
      ‣ The domains of the training corpora must match to some extent.
      ‣ The languages must be typologically somewhat close.
      On the Limitations of Unsupervised Bilingual Dictionary Induction (Søgaard et al., ACL 2018)
      So no, obtaining multilingual word vectors without parallel data does not always work out… 😢

  22. By the way… is "same meaning in different languages ➡ same vector" really what we want?
      Even words believed to be translations of each other do not necessarily mean exactly the same thing. Take even the proper noun 富士山 / Mt. Fuji…
      Languages differ in the contexts in which "Mt. Fuji" is mentioned (大和魂 "Japanese spirit", "the highest mountain in Japan", "tourist destination"), and once such cultural connotations are considered, the meanings cannot be called exactly the same.
      How much semantic equivalence to demand is, in the end, task-dependent.

  23. Q. What do we do when building a multilingual model with BERT?
      ‣ It just works, without thinking too hard about any of this.
      From Figure 1 of BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., NAACL 2019)

  24. How to build a multilingual BERT
      Pre-training: train the encoder on data from multiple languages.
      Corpus 🇯🇵🇺🇸🇨🇳🇬🇧🇰🇷🇫🇷🇮🇳🇩🇪… → Encoder
      Collect monolingual corpora in multiple languages and train the encoder with masked language modeling.
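The masked language modeling objective used above can be sketched in a few lines. This is a simplified version (real BERT masking additionally keeps 10% of the selected tokens unchanged and replaces another 10% with random tokens); `MASK_ID`, `IGNORE`, and the token ids are hypothetical values for illustration:

```python
import random

MASK_ID = 103          # hypothetical [MASK] token id
IGNORE = -100          # label value ignored by the loss

def mask_for_mlm(token_ids, mask_prob=0.15, seed=0):
    """Build (inputs, labels) pairs for masked language modeling.

    Each position is independently selected with probability mask_prob;
    selected positions are replaced by [MASK] in the inputs and keep their
    original id as the prediction target in the labels.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tid in token_ids:
        if rng.random() < mask_prob:
            inputs.append(MASK_ID)   # the model must reconstruct this token
            labels.append(tid)
        else:
            inputs.append(tid)
            labels.append(IGNORE)    # position not scored by the loss
    return inputs, labels

inputs, labels = mask_for_mlm(list(range(1000, 1200)))
# Masked positions carry the original id as label; all others are ignored.
assert all(l == IGNORE or i == MASK_ID for i, l in zip(inputs, labels))
```

Because the objective only needs raw text, the same procedure applies unchanged to a mixed multilingual corpus: the batches simply interleave languages.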

  25. The cross-lingual transfer learning pipeline
      Multilingual pre-training: 🇯🇵🇬🇧 → Encoder

  26. The cross-lingual transfer learning pipeline
      Multilingual pre-training: 🇯🇵🇬🇧 → Encoder
      Fine-tuning: 🇬🇧 → Encoder + Task-specific Module → Output

  27. The cross-lingual transfer learning pipeline
      Multilingual pre-training: 🇯🇵🇬🇧 → Encoder
      Fine-tuning: 🇬🇧 → Encoder + Task-specific Module → Output
      Evaluation: 🇯🇵 → Encoder + Task-specific Module → Output

  28. Performance of multilingual BERT
      ‣ mBERT is pre-trained on Wikipedia articles in over 100 languages.
      ‣ Across a wide range of tasks, when evaluated on languages never seen during fine-tuning, it scores far above chance.
      From Table 2 of XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization (Hu et al., ICML 2020)

  29. Internal representations of multilingual language models
      Analyzing the components of the vectors, one can find both:
      ‣ components separated by language
      ‣ components mixed regardless of language
      From Figure 1 of The Geometry of Multilingual Language Model Representations (Chang et al., EMNLP 2022)

  30. Today's topics
      ‣ How are multilingual language models built?
      ‣ Unsolved mysteries of multilingual language models
      ‣ The future of multilingual language models

  31. Q. Why does mBERT acquire multilingual ability without parallel data?

  32. How to build a multilingual BERT
      Pre-training: collect monolingual corpora in multiple languages and train the encoder with masked language modeling.
      Corpus 🇯🇵🇺🇸🇨🇳🇬🇧🇰🇷🇫🇷🇮🇳🇩🇪… → Encoder
      Why does this learn the cross-language correspondences that cross-lingual transfer learning requires?

  33. A rumor whispered in the early days: the shared-string hypothesis
      Thanks to borrowings and the like, many words share the same surface form!
      EN sing / DE singen 🎶, EN banana / FR banane 🍌
      Languages in the same family contain many similar strings.
      Since text is split into subwords, the same subword ends up carrying the same meaning across different languages.

  34. A rumor whispered in the early days: the shared-string hypothesis
      Proper nouns, numerals

  35. Testing the shared-string hypothesis
      Are shared strings essential? ➡ What happens if we train a multilingual BERT in a setting where no strings are shared at all?
      How to construct the experimental setting: it suffices that token IDs from the two languages never overlap.
      ‣ Shift the Unicode codepoints of one language's characters so they become unusual characters.
      ‣ Shift the token IDs.
      Emerging Cross-lingual Structure in Pretrained Language Models (Conneau et al., ACL 2020)
      Cross-Lingual Ability of Multilingual BERT: An Empirical Study (K et al., ICLR 2020)
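The first manipulation above (shifting one language's characters into an unused Unicode range so that the two languages share no strings at all) can be sketched as follows; the offset value is an arbitrary assumption for illustration:

```python
def shift_script(text, offset=0x10000):
    """Map every character into a disjoint Unicode range (a "fake" script).

    This simulates the experimental setting described on the slide: after
    shifting, no character (and hence no subword or token id) of the shifted
    language can collide with the original language's vocabulary, while the
    text's content and structure are left untouched.
    """
    return "".join(chr(ord(c) + offset) for c in text)

en = "sing a song"
fake_en = shift_script(en)

# The two "languages" now share no characters, hence no subwords either.
assert set(en).isdisjoint(set(fake_en))
assert len(fake_en) == len(en)  # same length: only the script changed
```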

  36. Refuting the shared-string hypothesis
      Cross-lingual transfer performance barely changes between:
      ‣ the unmodified setting
      ‣ the setting adjusted so that no strings are shared
      From Table 1 of Cross-Lingual Ability of Multilingual BERT: An Empirical Study (K et al., ICLR 2020)

  37. So where does cross-lingual ability come from?
      We now know that surface-level commonality such as shared strings is not essential for acquiring cross-lingual ability.
      ➡ Is what matters the structural commonality of languages?
      The conditions under which cross-lingual ability emerges have been studied, but no factor has been conclusively identified as "the important one".
      Towards a Common Understanding of Contributing Factors for Cross-Lingual Transfer in Multilingual Language Models: A Review (Philippy et al., ACL 2023)

  38. So where does cross-lingual ability come from?
      What is "structural commonality" between languages? Even across different languages…
      ‣ Sentences are built by combining words from different categories such as nouns and verbs.
      ‣ People talk about similar topics.
      I believe the Transformer is picking up on this kind of commonality, but I would like to find a good way to demonstrate it.

  39. Today's topics
      ‣ How are multilingual language models built?
      ‣ Unsolved mysteries of multilingual language models
      ‣ The future of multilingual language models

  40. Is ChatGPT a general-purpose multilingual model?
      On a four-choice benchmark, GPT-4 achieves over 70% accuracy in many languages.
      From https://openai.com/research/gpt-4

  41. Is ChatGPT a general-purpose multilingual model?
      It appears to understand many languages reasonably well.

  42. Are multilingual models solved now that we have ChatGPT?
      The big practical challenge for multilingual models:
      the trade-off between efficiency and performance.

  43. Are multilingual models solved now that we have ChatGPT?
      - The inefficiency of tokenization -
      Even for text with the same content, Japanese consumes about twice as many tokens as English.
      From Figure 5 of Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models (Ahia et al., NAACL 2023)
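A crude way to see one source of this asymmetry: byte-level tokenizers start from UTF-8 bytes, and Japanese characters cost 3 bytes each while ASCII letters cost 1. The sketch below only measures raw bytes, not any actual commercial tokenizer, so it illustrates the mechanism rather than the exact 2x figure; the two sentences are hand-picked rough equivalents:

```python
# Roughly equivalent content in the two languages (illustrative choice).
en = "The highest mountain in Japan"
ja = "日本で一番高い山"

en_bytes = len(en.encode("utf-8"))
ja_bytes = len(ja.encode("utf-8"))

# The Japanese side has far fewer characters, yet each character costs
# 3 UTF-8 bytes, so the per-character starting cost for a byte-level
# tokenizer is three times that of ASCII English.
assert len(ja) < len(en)
assert ja_bytes == 3 * len(ja)
assert en_bytes == len(en)  # pure ASCII: 1 byte per character
```

BPE merges shrink both counts, but merges are learned from the (English-heavy) training corpus, which is why the gap tends to persist in deployed tokenizers.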

  44. The inefficiency of tokenization in multilingual models
      To shorten the sequence length of text in a given language, that language's share of the vocabulary must be enlarged.
      But enlarging one language's vocabulary makes the other languages bear the inefficiency: computation in those languages becomes unnecessarily heavy, and their performance may suffer as well.

  45. The curse of multilinguality
      A phenomenon known as the curse of multilinguality (多言語の呪い) has been observed:
      ‣ As you add languages, cross-lingual task performance improves…
      until, beyond a certain number of languages, it starts to drop.
      [figure: performance vs. number of languages] 😱

  46. The curse of multilinguality
      In other words, the curse of multilinguality can be understood as multiple languages competing for a shared resource: the model's parameters.
      Increasing the number of model parameters alleviates it.
      From Figure 4 of Unsupervised Cross-lingual Representation Learning at Scale (Conneau et al., ACL 2020)

  47. Lifting the curse of multilinguality
      There is a limit to how far model parameters can be scaled, and inference-time cost also becomes a burden.
      One solution:
      ‣ To reduce competition between languages, give each language its own dedicated module.
      Lifting the Curse of Multilinguality by Pre-training Modular Transformers (Pfeiffer et al., NAACL 2022)
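The per-language-module idea can be sketched as a shared weight matrix plus a small residual "adapter" per language, in the spirit of the modular Transformers cited above. All shapes, the routing-by-language scheme, and the bottleneck size here are toy assumptions for illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (toy)

# Stand-in for a shared Transformer layer: one weight matrix used by
# every language, so cross-lingual knowledge lives here.
W_shared = rng.normal(size=(d, d)) / np.sqrt(d)

# One small bottleneck adapter per language (hypothetical shapes); only
# these parameters are language-specific, so languages no longer compete
# for the same capacity.
adapters = {
    lang: (rng.normal(size=(d, 2)) / np.sqrt(d),  # down-projection
           rng.normal(size=(2, d)) / np.sqrt(2))  # up-projection
    for lang in ["ja", "en", "zh"]
}

def forward(x, lang):
    # Shared computation, then a language-specific residual adapter.
    h = np.tanh(x @ W_shared)
    down, up = adapters[lang]
    return h + (h @ down) @ up  # residual connection around the adapter

x = rng.normal(size=(1, d))
out_ja, out_en = forward(x, "ja"), forward(x, "en")
assert out_ja.shape == (1, d)
assert not np.allclose(out_ja, out_en)  # routing by language changes the output
```

Adding a new language then only means training a new adapter pair, leaving the shared parameters (and every other language) untouched.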

  48. Can monolingual models be leveraged?
      Every country is hard at work developing high-performance monolingual LLMs.
      Could existing monolingual models be combined instead?
      That might save a large fraction of the resources needed for training.
      Ja LLM + En LLM + Zh LLM + … → multilingual model

  49. Directions for future multilingual models
      ‣ Developing architectures that optimally arrange language-shared and language-specific modules.
      ‣ Methods for building a multilingual model by combining existing monolingual models.