$30 off During Our Annual Pro Sale. View Details »

資源として見る実験プログラム

 資源として見る実験プログラム

言語処理学会第29回年次大会 併設ワークショップ JLR2023 『日本語言語資源の構築と利用性の向上』での口頭発表「資源として見る実験プログラム」の資料版スライドです。

URL: https://jedworkshop.github.io/JLR2023/program

Hayato Tsukagoshi

March 20, 2023
Tweet

More Decks by Hayato Tsukagoshi

Other Decks in Research

Transcript

  1. ࢿݯͱͯ͠ݟΔ࣮ݧϓϩάϥϜ
    ໊ݹ԰େֶେֶӃ৘ใֶݚڀՊ म࢜2೥ ෢ా࡫໺ݚڀࣨ
    ௩ӽ ॣ

    View Slide

  2. •ݚڀΛਐΊΔʹ͸࣮ݧ͕ඞཁ

    •ਂ૚ֶश or ࣗવݴޠॲཧ෼໺ʹ͓͍࣮ͯݧ͸ϓϩάϥϜʹΑΓ࣮ࢪ͞ΕΔ

    • ࣮ݧϓϩάϥϜ΋ݚڀ׆ಈͷॏཁͳཁૉͰ͋Δ
    ຊൃදͷझࢫ
    •࣮ݧϓϩάϥϜͷ໾ׂͱॏཁੑɺ࣮ݧϓϩάϥϜ؅ཧͷํࡦɺࣗવݴޠॲཧ
    ͷ޻ֶతൃలʹ͍ͭͯ঺հ

    •ਝ଎͔ͭద੾ͳݚڀ਱ߦͷͨΊʹɺ࣮ݧϓϩάϥϜΛࢿݯͱͯ͠ଊ͑Δ
    • ࣮ମݧɾମײΛަ͑ͯ “ྑ͍࣮ݧϓϩάϥϜ” ͷͨΊͷٞ࿦Λਪਐ

    •࣮ݧϓϩάϥϜͷੵۃతͳެ։ɾվળɾٞ࿦Λଅਐ͍ͨ͠
    ֓ཁ
    2

    View Slide

  3. •໊લ: ௩ӽ ॣ / TSUKAGOSHI, Hayato

    •ॴଐ: ໊େ ෢ా࡫໺ݚ M2

    ݚڀ:
    •ఆٛจΛ༻͍ͨจຒΊࠐΈߏ੒๏

    (NLP 2021, ACL-IJCNLP 2021, ࣗવݴޠॲཧ Vol. 30)

    •ࣗવݴޠਪ࿦ͱ࠶ݱثΛ༻͍ͨSplit and Rephrase

    ʹ͓͚Δੜ੒จͷ඼࣭޲্ (NLP 2022)

    •ҟͳΔڭࢣ৴߸͔Βߏஙͨ͠จϕΫτϧͷൺֱ

    ͱ౷߹ (म࿦, *SEM 2022)

    •(ڞஶ) Ψ΢ε෼෍ʹجͮ͘จදݱੜ੒ (NLP2023)

    •ֶৼಛผݚڀһ(DC1)࠾༻಺ఆɾത࢜՝ఔਐֶ༧ఆ
    ࣗݾ঺հ
    3
    ϓϩϑΟʔϧαΠτ: https://hpprc.dev/
    @γΞτϧ

    View Slide

  4. •ຊࢿྉ͸

    ݴޠॲཧֶձ ୈ29ճ೥࣍େձ ซઃϫʔΫγϣοϓ JLR2023

    ೔ຊޠݴޠࢿݯͷߏஙͱར༻ੑͷ޲্

    Ͱͷචऀͷޱ಄ൃදʮࢿݯͱͯ͠ݟΔ࣮ݧϓϩάϥϜʯͷվగ൛Ͱ͢ɻ

    •ൃදதʹඈ͹ͨ͠εϥΠυɾෆཁͳεϥΠυΛ௥Ճɾ࡟আ͍ͯ͠·͢ɻ

    •ൃදதʹݴٴΛলུͨ͠εϥΠυʹ͸εϥΠυӈ্ʹɹɹɹɹ ͱ͍͏

    ϚʔΫ͕͍͍ͭͯ·͢ɻ

    •ຊࢿྉͷϥΠηϯε͸CC-BY 4.0ʹج͖ͮ·͢ɻ·ͨɺ͢΂ͯͷϖʔδʹ͍ͭ
    ͯݚڀࣨɾاۀ಺ɾSNS্Ͱͷڞ༗ͱྑࣝͷൣғ಺ͰͷվมΛڐՄ͠·͢ɻ
    ຊࢿྉʹ͍ͭͯ
    4
    Skipped

    View Slide

  5. •࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑
    • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ

    • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల

    • େن໛ͳϞσϧͷ
    fi
    ne-tuningςΫχοΫͱ࣮૷ྫ

    •έʔεελσΟ
    • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ

    • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁

    • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங
    ໨࣍
    5

    View Slide

  6. •࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑
    • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ

    • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల

    • େن໛ͳϞσϧͷ
    fi
    ne-tuningςΫχοΫͱ࣮૷ྫ

    •έʔεελσΟ
    • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ

    • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁

    • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங
    ໨࣍
    6

    View Slide

  7. ࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑

    View Slide

  8. •ݚڀ׆ಈʹ͓͚Δ࣮ݧ͸ʮԾઆʯʮ࣮૷ͱධՁʯʮߟ࡯ʯʹ෼ղͰ͖Δ

    • ීஈ͸ʮԾઆʯ΍ʮߟ࡯ʯ͕ॏࢹ͞Ε͕ͪ

    • ͕͜͜ݚڀͷ໘ന͍ͱ͜ΖͰ͸͋Δ

    • Ͱ͸ɺద੾ͳʮ࣮૷ͱධՁʯ͸࣮ݱͰ͖ͯ౰વ͔ʁ
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹ
    8
    ద੾ͳ

    Ծઆ
    ద੾ͳ

    ࣮૷ͱධՁ
    ద੾ͳ

    ߟ࡯

    View Slide

  9. •࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹ
    9
    ద੾ͳ

    Ծઆ
    ؒҧͬͨ

    ࣮૷ͱධՁ
    ؒҧͬͨ

    ߟ࡯

    View Slide

  10. •࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ
    • ʮ࣮૷͕ؒҧֶ͍ͬͯͯश͕Ͱ͖ͳ͔ͬͨʯ
    • ʮධՁํ๏͕ؒҧ͍ͬͯͯҙຯͷͳ͍࣮ݧΛ͍ͯͨ͠ʯ
    • ʮԿ΋͔΋μϝͩͬͨʯ
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹ
    10
    ద੾ͳ

    Ծઆ
    ؒҧͬͨ

    ࣮૷ͱධՁ
    ؒҧͬͨ

    ߟ࡯

    View Slide

  11. •࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ
    • ʮ࣮૷͕ؒҧֶ͍ͬͯͯश͕Ͱ͖ͳ͔ͬͨʯ
    • ʮධՁํ๏͕ؒҧ͍ͬͯͯҙຯͷͳ͍࣮ݧΛ͍ͯͨ͠ʯ
    • ʮԿ΋͔΋μϝͩͬͨʯ
    •ؒҧ࣮ͬͨݧ݁Ռ͔Β͸ؒҧͬͨߟ࡯͔͠ੜ·Εͳ͍
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹ
    11
    ద੾ͳ

    Ծઆ
    ؒҧͬͨ

    ࣮૷ͱධՁ
    ؒҧͬͨ

    ߟ࡯

    View Slide

  12. •࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ

    • ʮ࣮૷͕ؒҧֶ͍ͬͯͯश͕Ͱ͖ͳ͔ͬͨʯ
    • ʮධՁํ๏͕ؒҧ͍ͬͯͯҙຯͷͳ͍࣮ݧΛ͍ͯͨ͠ʯ
    • ʮԿ΋͔΋μϝͩͬͨʯ
    •ؒҧ࣮ͬͨݧ݁Ռ͔Β͸ؒҧͬͨߟ࡯͔͠ੜ·Εͳ͍
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹ
    12
    ద੾ͳ

    Ծઆ
    ؒҧͬͨ

    ࣮૷ͱධՁ
    ؒҧͬͨ

    ߟ࡯
    ਖ਼͍͠ʮ࣮૷ͱධՁʯ͸

    ख໭ΓΛݮΒ͠ݚڀΛਝ଎Խ͢Δ
    ʮ࣮૷ͱධՁʯͷ

    best practice΋

    ࢿݯͱͯ͠ॏཁ

    View Slide

  13. •Best practiceΛֶͿ
    • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ

    •ΞϯνύλʔϯΛ஌Δ
    • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ

    •৽͍ٕ͠ज़Λ஌Δ
    • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋

    • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ
    13

    View Slide

  14. •Best practiceΛֶͿ
    • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ

    •ΞϯνύλʔϯΛ஌Δ
    • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ

    •৽͍ٕ͠ज़Λ஌Δ
    • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋

    • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ
    14

    View Slide

  15. •ม਺໊Λಀ͛ͣʹߟ͑Δ
    •άϩʔόϧม਺Λආ͚Δ

    •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏)

    •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ
    ࣮૷ͱධՁͷޡΓΛ๷͙
    15
    ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ

    View Slide

  16. •ม਺໊Λಀ͛ͣʹߟ͑Δ
    •άϩʔόϧม਺Λආ͚Δ

    •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏)

    •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ
    •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ

    •࠷ڧ train.py (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ)

    •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ)
    ࣮૷ͱධՁͷޡΓΛ๷͙
    16
    ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ

    View Slide

  17. •ม਺໊Λಀ͛ͣʹߟ͑Δ
    •άϩʔόϧม਺Λආ͚Δ

    •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏)

    •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ
    •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ

    •࠷ڧ train.py (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ)

    •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ)

    •ࣗ෼ͷهԱྗΛ৴͡ΔͷΛ΍ΊΔ
    •͋ΒΏΔ࡞ۀΛϓϩάϥϜʹى͜͢ (σʔλͷμ΢ϯϩʔυɾલॲཧɾධՁ)
    ࣮૷ͱධՁͷޡΓΛ๷͙
    17
    ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ

    View Slide

  18. •ม਺໊Λಀ͛ͣʹߟ͑Δ
    •άϩʔόϧม਺Λආ͚Δ

    •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏)

    •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ
    •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ

    •࠷ڧ train.py (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ)

    •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ)

    •ࣗ෼ͷهԱྗΛ৴͡ΔͷΛ΍ΊΔ
    •͋ΒΏΔ࡞ۀΛϓϩάϥϜʹى͜͢ (σʔλͷμ΢ϯϩʔυɾલॲཧɾධՁ)
    ࣮૷ͱධՁͷޡΓΛ๷͙
    18
    ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ
    ʰϦʔμϒϧίʔυʱ

    ʰGoogleͷιϑτ΢ΣΞΤϯδχΞϦϯάʱ

    Λಡ΋͏ʂ

    View Slide

  19. •Best practiceΛֶͿ
    • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ

    •ΞϯνύλʔϯΛ஌Δ
    • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ

    •৽͍ٕ͠ज़Λ஌Δ
    • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋

    • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ
    19

    View Slide

  20. •Best practiceΛֶͿ
    • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ

    •ΞϯνύλʔϯΛ஌Δ
    • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ

    •৽͍ٕ͠ज़Λ஌Δ
    • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋

    • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋

    • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల
    • େن໛ͳϞσϧͷ
    fi
    ne-tuningςΫχοΫͱ࣮૷ྫ
    ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ
    20

    View Slide

  21. ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల

    View Slide

  22. •HuggingFaceͷTransformers͕୆಄🤗

    •Ͱ͖Δ͜ͱ͸େ͖͘෼͚ͯ3ͭ

    • ஶ໊ͳਂ૚ֶशϞσϧɾΞʔΩςΫνϟ࣮૷ͷར༻

    • ࣄલ܇࿅ࡁΈϞσϧύϥϝʔλͷڞ༗ɾμ΢ϯϩʔυ

    • ࣄલఆٛɾࣄલ܇࿅͞ΕͨϞσϧΛ༻ֶ͍ͨशɾਪ࿦ͷ؆ུԽ

    •PyTorchɾTensorFlowɾJax / FlaxʹରԠ

    •NLPઐ໳ϥΠϒϥϦ͕ͩͬͨɺը૾ɾԻ੠ܥͷϞσϧ΋ೖ͖͍ͬͯͯΔ

    • ը૾෼໺ͷಉ༷ͷϥΠϒϥϦtimm΋࠷ۙ HuggingFace ؅ཧԼʹ
    ਂ૚ֶश༻ϥΠϒϥϦ: Transformers
    22

    View Slide

  23. •ެ։Ϟσϧͷར༻͕ۃΊͯ؆୯

    • ࣗલͷϞσϧͷެ։΋ඇৗʹָɺ਺ߦͰ࣮ߦՄೳ

    •Ϟσϧͷμ΢ϯϩʔυɾॏΈͷϩʔυ·Ͱ1ߦͰ࣮૷Մೳ

    • ࠷ۙ͸ AutoModel ΫϥεಋೖͷಋೖͰ͞Βʹศརʹ

    • ઃఆϑΝΠϧ (con
    fi
    g.json) ͸֤ࣗ֬ೝ͠·͠ΐ͏
    ਂ૚ֶश༻ϥΠϒϥϦ: Transformers
    23

    View Slide

  24. •HuggingFaceͷσʔληοτॲཧ༻ϥΠϒϥϦ

    •ࣗ෼Ͱ΍Δʹ͸େมͳػߏ͕࣮૷

    • ެ։σʔληοτͷμ΢ϯϩʔυɾ੔ܗ

    • લॲཧͷࣗಈతͳΩϟογϡॲཧ
    • ಉ͡σʔληοτʹର͢Δಉ͡લॲཧΛࣗಈతʹεΩοϓ

    • Ϛϧνϓϩηεॲཧͷ؆ศԽ

    • PythonͷmultiprocessingͳͲࣗ෼Ͱ΍Δʹ͸໘౗ͳॲཧ͕ࣗಈతʹ
    σʔληοτ༻ϥΠϒϥϦ: datasets
    24
    Skipped

    View Slide

  25. •༷ʑͳॲཧ͕ۃΊͯ؆୯ʹ࣮ݱՄೳ

    • σʔλͷμ΢ϯϩʔυ(with αϒηοτͷࢦఆ): 1ߦ

    • σʔληοτΛϚϧνϓϩηεͰฒྻʹલॲཧ

    •࣮ମݧ: Juman++౳ͷ෼͔ͪॻ͖ॲཧΛࣗಈతʹΩϟογϡͰ͖ඇৗʹศར
    σʔληοτ༻ϥΠϒϥϦ: datasets
    25

    View Slide

  26. •ۙ೥ͷਂ૚ֶशͰ͸΄ͱΜͲͷίʔυ͕PythonͰهड़͞ΕΔ

    •͔͠͠ɺPython͸ͦΕ΄Ͳ଎͍ݴޠͰ͸ͳ͍

    • Global Interpreter Lock (GIL) ͷଘࡏʹΑͬͯฒྻॲཧ͕໘౗

    • ͦ΋ͦ΋શମతʹͳΜͱͳ͘஗͍

    •ʮPyTorch΋C++Ͱهड़͞ΕͯΔ͠C++Λ࢖͑͹ʁʯ

    • ΋ɺ΋͏ͪΐͬͱϞμϯͳݴޠΛ࢖͍͍ͨؾ͕࣋ͪ…

    •࠷ۙʹͳͬͯ Rust ͕༷ʑͳ৔ॴʹಋೖ͞Ε͍ͯΔ

    • PythonͱҟͳΓ੩తܕ෇͚ɾίϯύΠϧ͞ΕͯػցޠΛੜ੒
    ϓϩάϥϛϯάݴޠͷมભ
    26
    Skipped

    View Slide

  27. •ϑϩϯτΤϯυ։ൃ͔ΒγεςϜϓϩάϥϛϯά·Ͱ༷ʑͳ৔ॴͰར༻

    •ण໋ (lifetime)΍ॴ༗ݖͱ͍ͬͨ֓೦ͷಋೖʹΑΓthread safe & null safe
    ࣮ࡍʹRustΛར༻͍ͯ͠ΔϥΠϒϥϦ
    •huggingface/tokenizers

    • transformers ಺Ͱར༻͞Ε͍ͯΔτʔΫφΠβ༻ϥΠϒϥϦ

    •google-research/deduplicate-text-datasets

    • େྔͷςΩετ͔ΒॏෳΛ࡟আ (ֶशޮ཰ͷվળ)

    •Rust+ػցֶशͷ·ͱΊ: vaaaaanquish/Awesome-Rust-MachineLearning
    ϓϩάϥϛϯάݴޠͷมભ: Rustͷಋೖ
    27
    Skipped

    View Slide

  28. •ߏ଄ԽσʔλͷऔΓѻ͍͸ݱ୅Ͱ΋ॏཁ

    •SQLϥΠΫʹςʔϒϧσʔλΛૢ࡞Ͱ͖Δ

    Pandas͕ඇৗʹ༗໊͕ͩ…

    •Rustϕʔεͷςʔϒϧૢ࡞ϥΠϒϥϦ

    Polars͕஫໨͞Ε͖͍ͯͯΔ

    • ϕϯνϚʔΫ্Ͱ͸ۃΊͯߴ଎

    •ϝιουνΣΠϯΛ׆͔ͨ͠ه๏

    ͳͲ࢖͍উखʹ͍ͭͯ΋༗๬
    ςʔϒϧσʔλ༻ϥΠϒϥϦͷมભ: Polarsͷ୆಄
    28
    https://www.pola.rs/benchmarks.html
    Polars
    Pandas ςʔϒϧσʔλॲཧϥΠϒϥϦͷ

    ϕϯνϚʔΫʹ͓͚Δॲཧ࣌ؒͷάϥϑ
    Skipped

    View Slide

  29. •DeepSpeed: Microsoft͕։ൃɺਂ૚ֶशϞσϧΛޮ཰తʹ܇࿅

    • Transformersͱͷ࿈ܞ΋

    •Accelerate: HuggingFace͕։ൃɺෳ਺GPUରԠͳͲΛ؆୯ʹ࣮ݱ

    •FlexGen:
    • ௒େن໛ϞσϧΛखݩͰਪ࿦Ͱ͖ΔΑ͏ʹ޻෉͢ΔϥΠϒϥϦ

    • ϨΠςϯγͰ͸ͳ͘εϧʔϓοτʹϑΥʔΧε
    ͦͷ΄͔ͷιϑτ΢ΣΞɾςΫχοΫ
    29
    Skipped

    View Slide

  30. •ෳ਺ͷ࣮ݧઃఆΛࢼ͢৔߹͸

    ίϚϯυϥΠϯҾ਺ͷར༻͕ศར

    •͔͠͠argparse͸ܕ͕෇͔ͳ͍

    Typed Argument Parser (Tap)
    •PythonͷdataclassͷΑ͏ʹ

    ίϚϯυϥΠϯύʔαΛهड़Մೳ

    • αϒίϚϯυͷఆٛ΍ܧঝ΋

    •ద੾ͳܕ෇͚ʹΑͬͯิ׬ਫ਼౓޲্

    •ଐੑ໊ͷtypo΍ܕͷؒҧ͍͕ܹݮ
    ิ׬ͷޮ͘argparseͷ୅ସ: Tap
    30
    https://github.com/swansonk14/typed-argument-parser

    View Slide

  31. େن໛ͳϞσϧͷ

    fi
    ne-tuningςΫχοΫͱ࣮૷ྫ

    View Slide

  32. •Ұൠʹύϥϝʔλ਺ͷେ͖ͳϞσϧͷํ͕ੑೳ͕ߴ͍

    •Ͱ͖Ε͹େ͖ͳϞσϧΛ܇࿅͍͕ͨ͠Α͘ൃੜ͢Δ໰୊͕͍͔ͭ͘

    • GPUͷϝϞϦෆ଍
    • ܇࿅͕஗͍
    •େ͖ͳϞσϧΛ܇࿅͢ΔͨΊͷςΫχοΫΛ࣮૷ྫͱڞʹ͍͔ͭ͘঺հ

    • खܰ (͕ͩۃΊͯ༗ޮ) ͳ΋ͷʹݫબ
    େن໛ͳϞσϧͷ܇࿅ςΫχοΫͱ࣮૷ྫ
    32

    View Slide

  33. A100: VRAM 80GB
    •T5-3B (30ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ

    •ຊൃදͷ޻෉ΛೖΕΕ͹΋ͬͱେ͖ͳϞσϧ΋܇࿅Ͱ͖Δ͸ͣ

    A6000: VRAM 48GB
    •BERT-large (3.3ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ

    GTX2080 ti: VRAM 11GB
    •BERT-base (1.1ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ
    GPUͱ܇࿅ՄೳͳϞσϧαΠζͷഽײ
    33
    ໔੹ࣄ߲: ೖྗܥྻ௕΍ͦͷଞ͞·͟·ͳཁҼʹӨڹΛड͚Δײ֮஋ͳͷͰ͝ঝ஌͓͖͍ͩ͘͞

    View Slide

  34. •ਂ૚ֶशϞσϧͷύϥϝʔλ͸௨ৗ 32 bit ͷ ුಈখ਺఺਺ Ͱදݱ

    • ࣮͸ͦΜͳʹࡉ͔͘਺஋Λදݱ͠ͳͯ͘΋ྑ͍

    •਺஋දݱͷ bit ਺ΛݮΒͤΔͱলϝϞϦɾ௿ܭࢉίετʹͳ͓ͬͯಘ

    • 16 bitͰ਺஋Λදݱͨ͠ͷ͕൒ਫ਼౓ුಈখ਺఺਺

    •16 bitͷ࢖͍ํʹΑ༷ͬͯʑͳ࢓༷͕ଘࡏ

    • FP16: traditionalͳ൒ਫ਼౓ුಈখ਺఺਺
    • BF16 (b
    fl
    oat16): Google͕ఏҊɺA100ͳͲ࠷ۙͷGPU΍TPUͰར༻Մೳ

    • B͸BrainͷBΒ͍͠
    ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16
    34
    https://cloud.google.com/tpu/docs/b
    fl
    oat16?hl=ja

    View Slide

  35. •FP16͸ਫ਼౓ෆ଍Ͱֶश͕ෆ҆ఆʹͳΔ৔߹͕ଘࡏ

    • BF16ͷํ͕better͔΋…ʁ

    • ࣮ମݧ: T5ͷେ͖ͳϞσϧ͸BF16Λ༻͍ͳ͍ͱ͏·ֶ͘शͰ͖ͳ͍
    ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16
    35
    ը૾͸Wikipedia͔ΒҾ༻
    BF16
    FP16
    FP32

    View Slide

  36. •࣮༻తʹ͸ɺAutomatic Mixed Precision (AMP) ͕༗༻

    • FP16 / BF16 ͩͱϚζ͍෦෼͸ࣗಈతʹFP32ʹͯ͘͠ΕΔ

    •AMP͸PyTorchͳΒࣗಈతʹ࣮ߦͯ͘͠ΕΔΠϯλϑΣʔε͕ଘࡏ

    • AMP & BF16ͷར༻͸ҎԼͷΑ͏ʹ࣮ݱՄೳ

    • ͜ΕͱGradScalerͱ͍͏ػߏΛ࢖͏ඞཁ͕͋Δ
    ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16
    36
    ը૾͸ https://carbon.now.sh/ Ͱੜ੒
    forwardΛwithͷதͰ࣮ߦ

    View Slide

  37. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    37
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3 u4
    u1 u5
    ॱ఻೻
    ٯ఻೻

    View Slide

  38. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    38
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u4
    u1 u5
    ॱ఻೻
    ٯ఻೻
    ௨ৗ
    u4 ͷޯ഑ܭࢉʹͦΕҎલͷ৘ใ͕ඞཁ
    u3
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.

    View Slide

  39. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    39
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3 u4
    u1 u5
    ॱ఻೻
    ٯ఻೻
    ௨ৗ
    u4Ҏલͷܭࢉ݁ՌΛهԱͯ͠ར༻
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.

    View Slide

  40. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    40
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3 u4
    u1 u5
    ॱ఻೻
    ٯ఻೻
    ௨ৗ
    u4Ҏલͷܭࢉ݁ՌΛهԱͯ͠ར༻
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.
    ॱ఻೻ͷܭࢉ݁ՌΛ͢΂ͯهԱ

    View Slide

  41. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    41
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3
    u1 u5
    ॱ఻೻
    ٯ఻೻
    Gradient Checkpointing
    u4
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.

    View Slide

  42. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    42
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3
    u1 u5
    ॱ఻೻
    ٯ఻೻
    Gradient Checkpointing Checkpoint
    u4
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.

    View Slide

  43. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    43
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3
    u1 u5
    ॱ఻೻
    ٯ఻೻
    Gradient Checkpointing Checkpoint
    ܭࢉάϥϑͷҰ෦ͷ݁ՌͷΈอଘ
    u4
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.

    View Slide

  44. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    44
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3
    u1 u5
    ॱ఻೻
    ٯ఻೻
    Gradient Checkpointing Checkpoint
    ඞཁͳ෼Λܭࢉ͠௚͠
    u4

    View Slide

  45. •Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ

    •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ
    Gradient Checkpointing
    45
    https://github.com/cybertronai/gradient-checkpointing

    Fitting larger networks into memory.

    Backprop and systolic arrays.
    ଛࣦ
    u2 u3 u4 u5
    u1
    u2 u3
    u1 u5
    ॱ఻೻
    ٯ఻೻
    Gradient Checkpointing Checkpoint
    هԱ͢Δܭࢉ݁Ռ͕গͳ͍ʂ
    u4

    View Slide

  46. •ܭࢉ݁Ռͷอଘ਺Λ

    ݮΒ͢͜ͱͰ

    ϝϞϦ࢖༻ྔΛ࡟ݮ

    •ඞཁͳΒ࠶౓ॱ఻೻

    ܭࢉΛ͢ΔͷͰ

    ࣮ߦ࣌ؒ͸৳ͼΔ
    • ΍Ε͹͍͍ͱ͍͏

    Θ͚Ͱ͸ͳ͍
    Gradient Checkpointing
    46

    View Slide

  47. •Gradient Checkpointing͸ࣗ෼Ͱ΍Ζ͏ͱ͢Δͱগ͠େม…

    • transformersͳΒ1ߦͰར༻Մೳ
    • ະରԠϞσϧ΋ଘࡏ͢ΔͷͰ౎౓֬ೝΛ

    •࣮ମݧ: όοναΠζ 512ͰͷֶशͷϝϞϦ࢖༻ྔ͕ 80GB→25GB
    Gradient Checkpointing: ࣮૷
    47
    ը૾͸ https://carbon.now.sh/ Ͱੜ੒

    View Slide

  48. •PyTorch 1.9͔Β௥Ճ͞Εͨ৽͍͠ਪ࿦Ϟʔυ

    • طଘͷਪ࿦Ϟʔυͱͯ͠͸ `torch.no_grad` ͕ଘࡏ

    •੍໿͕ଟগ௥Ճ͞Εͨ͜ͱͰɺΑΓϝϞϦফඅΛ཈͑ͨਪ࿦͕Մೳʹ

    • ධՁ࣌͸΍Γಘɺ܇࿅࣌ʹ͸࢖ͬͯ͸͍͚ͳ͍
    torch.inference_mode
    48
    ը૾͸ https://carbon.now.sh/ Ͱੜ੒
    Skipped

    View Slide

  49. •࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑
    • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ

    • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల

    • େن໛ͳϞσϧͷ
    fi
    ne-tuningςΫχοΫͱ࣮૷ྫ

    •έʔεελσΟ
    • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ

    • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁

    • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங
    ໨࣍
    49

    View Slide

  50. •Best practiceΛֶͿ
    • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ

    •ΞϯνύλʔϯΛ஌Δ
    • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ

    •৽͍ٕ͠ज़Λ஌Δ
    • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋

    • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋
    ࠶ܝ: ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ
    50

    View Slide

  51. •͜͜·Ͱͷ࣮૷ํ਑ɾςΫχοΫΛ׆༻ͨ͠ϓϩδΣΫτྫΛ঺հ

    •Work In Progress (WIP) ͳ಺༰͕ଟ͍఺͸͝༰͍ࣻͩ͘͞

    ঺հϓϩδΣΫτ
    •BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ
    •ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁
    •SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEϞσϧͷߏங
    έʔεελσΟ
    51

    View Slide

  52. έʔεελσΟ:

    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ

    View Slide

  53. •ࣄલֶशࡁΈϞσϧΛऔΓר͘؀ڥ͸ۃΊͯٸ଎ʹ੔උ͞Ε͍ͯΔ

    • BERTΛ༻͍ͨߴ඼࣭ͳςϯϓϨʔτ͸΄ͱΜͲଘࡏ͠ͳ͍

    • ಛʹ࠷৽ͷPython, PyTorch, TransformersʹରԠͰ͖͍ͯͳ͍

    •ࣗવݴޠॲཧͷॳֶऀʹͱͬͯ͸͍ۤ͠ঢ়گ

    • ʮݚڀ΍࣮ݧΛͲͷΑ͏ʹ։࢝ͨ͠ΒΑ͍͔Θ͔Βͳ͍ʯ

    • ʮΑ͍ઃܭɺ࣮ݧ؅ཧΛͲͷΑ͏ʹߦ͑͹ྑ͍͔Θ͔Βͳ͍ʯ

    •ʮϞμϯͰߴ඼࣭ͳ࣮ݧϓϩάϥϜʯͷݟຊ͕ඞཁ
    • ࣗ෼ͳΓͷ࣮૷ํ਑ɾࢦ਑Λදݱͨ͠Θ͔Γ΍͍࣮͢૷͕͋Δͱ༗༻
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ
    53
    https://github.com/hppRC/bert-classi
    fi
    cation-tutorial

    View Slide

  54. •ςΩετ෼ྨͷೖ໳ͱͯ͠༗໊ͳʮϥΠϒυΞχϡʔείʔύεʯ͕୊ࡐ

    •BERTΛ
    fi
    ne-tuning͢ΔྲྀΕΛग़དྷΔ͚ͩγϯϓϧʹ࣮૷

    •࣮૷: hppRC/bert-classi
    fi
    cation-tutorial
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ
    54
    https://github.com/hppRC/bert-classi
    fi
    cation-tutorial

    View Slide

  55. •ςΩετ෼ྨͷೖ໳ͱͯ͠༗໊ͳʮϥΠϒυΞχϡʔείʔύεʯ͕୊ࡐ

    •BERTΛ
    fi
    ne-tuning͢ΔྲྀΕΛग़དྷΔ͚ͩγϯϓϧʹ࣮૷

    •࣮૷: hppRC/bert-classi
    fi
    cation-tutorial

    ߩݙ
    •Python 3.10, PyTorch 1.13, Transformers 4.25 Ҏ্ʹରԠ

    •Type Hintsͷ׆༻ͱݟ௨͠ͷྑ͍ઃܭ

    •ʮσʔλ४උʯ → ʮ܇࿅ & ධՁʯ ͱ͍͏୯ํ޲తͳ࣮ݧϓϩηεͷ࣮ྫ

    •࣮ݧςϯϓϨʔτͱͯͦ͠ͷଞͷλεΫ΁ͷస༻͕༰қͳ࣮૷
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ
    55
    https://github.com/hppRC/bert-classi
    fi
    cation-tutorial

    View Slide

  56. •୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍
    56

    View Slide

  57. •୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍
    57
    લॲཧΛඞͣ

    ϓϩάϥϜͱͯ͠࢒͢

    View Slide

  58. •୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍
    58
    JSONLܗࣜͷ

    σʔληοτ
    JSONL / csv / tsv ͷ

    ύʔεॲཧΛ

    ઈରʹࣗ࡞͠ͳ͍

    View Slide

  59. •୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍
    59
    ࣮ݧͱධՁΛ

    ࿈ଓతʹߦ͏

    View Slide

  60. •୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ
    BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍
    60
    Ϟσϧࣗମ͸อଘ͠ͳ͍
    ʮอଘͨ͠ϞσϧΛϩʔυͯ͠ධՁʯ
    ͸͔ͳΓόά͕ൃੜ͠΍͍͢

    View Slide

  61. •࣮ݧ݁Ռͱ࣮ͯ͠ݧઃఆɾධՁࢦඪɾֶशϩάΛอଘ͢Δ
    •࣮ݧ݁Ռ͸࣮ݧઃఆɾ೔෇͝ͱʹσΟϨΫτϦΛ੾Δ
    • ྫ: outputs/[Ϟσϧ໊]/[೥݄೔]/[࣌෼ඵ]
    • ಉ࣮͡ݧઃఆͰ΋݁Ռ্͕ॻ͖͞ΕͨΓ͠ͳ͍
    ࣮ݧ݁Ռͷอଘํ๏: betterͳ࣮ݧ؅ཧΛ໨ࢦͯ͠
    61
    ͱʹ͔͘

    σΟϨΫτϦΛ੾Δ

    View Slide

  62. •࣮ݧ݁Ռͱ࣮ͯ͠ݧઃఆɾධՁࢦඪɾֶशϩάΛอଘ͢Δ
    •࣮ݧ݁Ռ͸࣮ݧઃఆɾ೔෇͝ͱʹσΟϨΫτϦΛ੾Δ
    • ྫ: outputs/[Ϟσϧ໊]/[೥݄೔]/[࣌෼ඵ]
    • ಉ࣮͡ݧઃఆͰ΋݁Ռ্͕ॻ͖͞ΕͨΓ͠ͳ͍

    •࣮ݧ݁Ռͷूܭ͸

    1. ධՁࢦඪϑΝΠϧ (metrics.json) Λ࠶ؼతʹऩू

    2. ࣮ݧઃఆϑΝΠϧ (con
    fi
    g.json) ΛಡΈࠐΈ

    3. PandasͰ࣮ݧઃఆͱධՁࢦඪͷσʔλϑϨʔϜΛ࡞੒

    4. ࣮ݧઃఆ͝ͱʹgroupby͢ΔͳͲ͓޷ΈͰ
    ࣮ݧ݁Ռͷอଘํ๏: betterͳ࣮ݧ؅ཧΛ໨ࢦͯ͠
    62
    ͱʹ͔͘

    σΟϨΫτϦΛ੾Δ

    View Slide

  63. έʔεελσΟ:

    ChatGPTΛ༻͍ͨӳޠσʔληοτ

    ͷࣄྫ͝ͱ຋༁

    View Slide

  64. ࣗવݴޠਪ࿦ (Natural Language Inference; NLI)
    •จϖΞ (લఏจɾԾઆจ) ʹϥϕϧ (ؚҙɾໃ६ɾதཱ) ͕෇༩

    •จϖΞͷҙຯؔ܎Λ༧ଌ͢ΔλεΫ
    NLIσʔληοτ
    64
    લఏจ Ծઆจ ϥϕϧ
    A man playing an electric guitar on stage. A man playing guitar on stage. ؚҙ
    A man playing an electric guitar on stage. A man playing banjo on the
    fl
    oor. ໃ६
    A man playing an electric guitar on stage. A man is performing for cash. தཱ

    View Slide

  65. •ӳޠͷNLIσʔληοτ͸͔ͳΓ੔උ͞Ε͍ͯΔ

    • Stanford NLI (SNLI; Bowman et al., 2015): ໿57ສจϖΞ

    • Multi-Genre NLI (MNLI; Williams et al., 2018): ໿41ສจϖΞ

    •೔ຊޠͷNLIσʔληοτ͸ӳޠͱൺֱͯ͠ݶఆత

    • JSNLI (٢ӽΒ, 2020): Stanford NLIσʔληοτΛ೔ຊޠʹػց຋༁

    • JNLI (܀ݪΒ, 2022): JGLUE ʹಉࠝɺΩϟϓγϣϯΛར༻

    • JaNLI (୩தΒ, 2021): ݴޠֶత஌ݟʹجͮ͘ఢରతσʔληοτ
    ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁
    65

    View Slide

  66. •೔ຊޠͰ࠷΋ར༻͞Ε͍ͯΔNLIσʔληοτ͸JSNLI
    • 2018೥͘Β͍ͷGoogle຋༁ʹΑͬͯ୯จ୯ҐͰ຋༁

    • ୯จ୯Ґͷ຋༁͸จϖΞͷҙຯؔ܎Λյͯ͠͠·͏ݒ೦΋

    •େن໛ݴޠϞσϧ͸຋༁΋্ख͘͜ͳͤΔ͜ͱ͕஌ΒΕ͍ͯΔ

    • ಛʹChatGPT͸ඇৗʹҹ৅తͳೳྗΛൃش

    •େن໛ݴޠϞσϧ͸promptʹΑͬͯଟ༷ͳ৚݅෇͖ੜ੒͕Մೳ

    • ৚݅෇͚࣍ୈͰϥϕϧͷҙຯؔ܎Λյͣ͞ʹࣄྫ͝ͱʹ຋༁Ͱ͖ΔͷͰ͸ʁ
    •ChatGPTΛ༻͍ͯӳޠNLIσʔληοτΛ೔ຊޠʹࣄྫ͝ͱ຋༁
    ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁
    66

    View Slide

  67. 0. OpenAIͷAPI KeyΛൃߦ

    1. pip install openai

    2. promptΛઃܭ

    3. ࣙॻํʹpromptΛೖΕͯJSONͱͯ͠APIʹ౤͛Δ (উखʹ΍ͬͯ͘ΕΔ)
    ChatGPTʹΑΔ຋༁ͷखॱ
    67

    View Slide

  68. ຋༁ର৅
    •Stanford NLI: ໿57ສจϖΞ

    •Multi-Genre NLI: ໿41ສจϖΞ

    ຋༁ख๏
    •ϥϕϧͷҙຯؔ܎Λյ͞ͳ͍Α͏ʹ຋༁͢ΔΑ͏promptͰࢦࣔ

    •OpenAIͷChatGPT API (gpt-3.5-turbo, $0.002/1K tokens) Λར༻

    •໿100ສจϖΞͷ຋༁ʹ5ສԁఔ౓ (DeepLͷAPIͩͱ17ສԁҎ্)

    ੒Ռ෺
    •೔ӳର༁෇͖ͷ೔ຊޠNLIσʔληοτ
    ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁
    68
    ൃදऀ஫: promptΛ؆ૉʹ͢ΔͳͲͰ΋ͬͱ҆ՁʹͰ͖ͦ͏Ͱ͢

    View Slide

  69. •6-shot learningΛ࣮ࢪ

    • NLIσʔληοτͷ೔ӳର༁

    ͱͯ͠JSICK͔ΒࣄྫΛഈआ

    •গ਺ࣄྫͷޙʹ຋༁͍ͨ͠ࣄྫΛ౤ೖ

    •Batch Prompting (Cheng et al., 2023)

    ΋ར༻
    ࣮ࡍͷPrompt
    69

    View Slide

  70. •6-shot learningΛ࣮ࢪ

    • NLIσʔληοτͷ೔ӳର༁

    ͱͯ͠JSICK͔ΒࣄྫΛഈआ

    •গ਺ࣄྫͷޙʹ຋༁͍ͨ͠ࣄྫΛ౤ೖ

    •Batch Prompting (Cheng et al., 2023)

    ΋ར༻

    ମײ
    •zero-shotΑΓfew-shotͷ΄͏͕

    ֨ஈʹ຋༁඼࣭͕ߴ͍

    •promptΤϯδχΞϦϯάͰ͞Βʹ

    ຋༁඼࣭Λ޲্ͤ͞ΒΕͦ͏
    ࣮ࡍͷPrompt
    70

    View Slide

  71. •ۙ೔ެ։༧ఆ

    • σʔληοτ

    • ࣮ݧίʔυ

    • prompt

    •ݱࡏධՁத…
    WIP: NU-NLI
    71

    View Slide

  72. έʔεελσΟ:

    SimCSEͷ࠶ݱ࣮૷ͱ

    ೔ຊޠSimCSEϞσϧͷߏங

    View Slide

  73. •ରরֶश(Contrastive Learning)Λ༻͍ͯࣄલֶशࡁΈϞσϧΛ
    fi
    ne-tuning

    • Unsupervised SimCSE:ʮಉ͡จΛ2ճຒΊࠐΜͰରরֶशʯ
    • Supervised SimCSE: ʮؚҙؔ܎ʹ͋ΔจΛਖ਼ྫͱͯ͠ରরֶशʯ
    Gao+: SimCSE: Simple Contrastive Learning of Sentence Embeddings, EMNLP ’21
    SimCSE: ରরֶशʹجͮ͘จຒΊࠐΈख๏
    73
    ਤ͸࿦จΑΓҾ༻ɻҎલ࣮ࢪͨ͠SimCSEͷྠߨࢿྉ͸ͪ͜Β

    View Slide

  74. •ެ࣮ࣜ૷͸ଟ༷ͳந৅Խ͕ࢪ͞Ε͍ͯͯॳֶऀʹ͸௥͍ͮΒ͍

    • จຒΊࠐΈͷݚڀΛଅਐ͍ͨ͠

    •ग़དྷΔ͚ͩந৅ԽΛݮΒͨ͠γϯϓϧͳ࠶ݱ࣮૷Λެ։

    • hppRC/simple-simcse
    SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE
    74

    View Slide

  75. •ެ࣮ࣜ૷͸ଟ༷ͳந৅Խ͕ࢪ͞Ε͍ͯͯॳֶऀʹ͸௥͍ͮΒ͍

    • จຒΊࠐΈͷݚڀΛଅਐ͍ͨ͠

    •ग़དྷΔ͚ͩந৅ԽΛݮΒͨ͠γϯϓϧͳ࠶ݱ࣮૷Λެ։

    • hppRC/simple-simcse

    •γϯϓϧͳ PyTorch + transformers ͷߏ੒ɾશମͰ250ߦ

    • + ࿦จ΁ͷ֘౰Օॴ΁ͷݴٴɾࢲݟΛؚΉίϝϯτ107ߦ
    • σʔλͷલॲཧΛআ͘
    •࠶ݱ࣮૷Λ༻͍ͨ࠶ݱ࣮ݧ΋࣮ࢪ

    • ࿦จͷϋΠύϥͰ4Ϟσϧɾ50ཚ਺γʔυͰ࣮ݧ (=200ճ)
    SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE
    75

    View Slide

  76. •ίϝϯτ෇͖ͰֶशϧʔϓΛ؆ܿʹهड़
    SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE
    76

    View Slide

  77. •hppRC/simple-simcse

    •৽͘͠จຒΊࠐΈݚڀΛ࢝ΊΔਓͷ

    ଍͕͔Γͱͯ͠ิ଍આ໌΋هࡌ

    •ੑೳ΋΄΅࠶ݱ
    SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE
    77
    50ճ࣮ͣͭݧͨ͠ࡍͷੑೳͷώετάϥϜ
    ϋΠύϥʹର͢Δݴٴɾࢲݟ

    View Slide

  78. •ӳޠͷࣄલֶशࡁΈจຒΊࠐΈϞσϧ͸ଟ਺ଘࡏ

    •ҰํͰ೔ຊޠจຒΊࠐΈϞσϧͷܾఆ൛͸ଘࡏ͠ͳ͍

    • ิ଍: ࠷ۙ PKSHA͔ࣾΒ೔ຊޠSimCSEϞσϧ ͕ެ։

    • ೔ຊޠจຒΊࠐΈք۾΋੝Γ্͕Γͭͭ͋ΔΧϞ…ʂ

    •೔ຊޠจຒΊࠐΈϞσϧͷแׅతͳධՁ͕ଘࡏ͠ͳ͍
    WIP: ೔ຊޠSimCSEϞσϧͷߏங
    78

    View Slide

  79. •ӳޠͷࣄલֶशࡁΈจຒΊࠐΈϞσϧ͸ଟ਺ଘࡏ

    •ҰํͰ೔ຊޠจຒΊࠐΈϞσϧͷܾఆ൛͸ଘࡏ͠ͳ͍

    • ิ଍: ࠷ۙ PKSHA͔ࣾΒ೔ຊޠSimCSEϞσϧ ͕ެ։

    • ೔ຊޠจຒΊࠐΈք۾΋੝Γ্͕Γͭͭ͋ΔΧϞ…ʂ

    •೔ຊޠจຒΊࠐΈϞσϧͷแׅతͳධՁ͕ଘࡏ͠ͳ͍

    •೔ຊޠจຒΊࠐΈϞσϧͷߏஙͱแׅతͳධՁΛ࣮ࢪ
    • ۙ೥ͷจຒΊࠐΈख๏ͱͯ͠୅දతͳSimCSEΛϕʔεʹ

    • ڭࢣ͋Γɾڭࢣͳ͠ͷ྆ํΛ࣮ݧ

    • ෳ਺ͷσʔληοτɾϋΠύϥͰ࣮ݧ
    WIP: ೔ຊޠSimCSEϞσϧͷߏங
    79

    View Slide

  80. ܇࿅σʔλ

    •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ

    •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI, MNLI)
    WIP: ೔ຊޠSimCSEϞσϧͷߏங
    80
    WikipediaܥΛ2ͭ
    WebܥΛ2ͭ

    View Slide

  81. ܇࿅σʔλ

    •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ

    •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI, MNLI)
    ࣮ݧઃఆ:
    •ࣄલֶशࡁΈݴޠϞσϧ21छྨͰ࣮ݧ (base: 14छྨ, large: 7छྨ)
    •όοναΠζ: {64, 128, 256, 512}, ֶश཰: {1e-5, 3e-5, 5e-5}

    •ҟͳΔཚ਺γʔυ஋Ͱ3ճ࣮ͣͭݧͯ͠࠷ྑͷϋΠύϥͰධՁ
    WIP: ೔ຊޠSimCSEϞσϧͷߏங
    81
    WikipediaܥΛ2ͭ
    WebܥΛ2ͭ

    View Slide

  82. ܇࿅σʔλ

    •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ

    •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI, MNLI)
    ࣮ݧઃఆ:
    •ࣄલֶशࡁΈݴޠϞσϧ21छྨͰ࣮ݧ (base: 14छྨ, large: 7छྨ)
    •όοναΠζ: {64, 128, 256, 512}, ֶश཰: {1e-5, 3e-5, 5e-5}

    •ҟͳΔཚ਺γʔυ஋Ͱ3ճ࣮ͣͭݧͯ͠࠷ྑͷϋΠύϥͰධՁ

    ݱঢ়ͷ࣮ݧ݁Ռ
    •ڭࢣ͋Γ/ͳ͠ڞʹૣҴాେRoBERTa-large͕࠷ߴੑೳ

    •Studio Ousia ೔ຊޠLUKE-largeͱXLM-RoBERTa-large͕͍࣍Ͱߴੑೳ
    WIP: ೔ຊޠSimCSEϞσϧͷߏங
    82
    ݱࡏ·Ͱʹ…
    ڭࢣͳ͠: 1559ճ
    ڭࢣ͋Γ: 3172ճ

    View Slide

  83. •ۙ೔ެ։༧ఆ

    • σʔληοτલॲཧ༻ͷϓϩάϥϜ
    • ࣮ݧίʔυ (ֶशɾධՁ)

    • ࣮ݧ݁Ռ (ϋΠύϥ୳ࡧ࣌ͷ݁Ռ΋)

    • ࣄલ܇࿅ࡁΈϞσϧ
    •ݱࡏ࣮ݧɾධՁத…
    WIP: ೔ຊޠSimCSEϞσϧͷߏங
    83
    BCCWJͷXMLΛ

    Ϛϧνϓϩηεʹલॲཧͯ͠
    ςΩετϑΝΠϧʹ

    ม׵͢ΔϓϩάϥϜͳͲ
    ஶ໊ͳ೔ຊޠσʔληοτ

    લॲཧ༻ͷϓϩάϥϜηοτͱͯ͠΋

    View Slide

  84. •ਝ଎͔ͭద੾ͳݚڀ਱ߦͷͨΊͷ࣮ݧϓϩάϥϜʹ͍ͭͯ঺հ

    •έʔεελσΟΛ௨࣮ͯ͠ફతͳςΫχοΫΛ঺հ

    • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ
    • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁
    • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEϞσϧͷߏங
    •࣮ݧϓϩάϥϜ΋ੵۃతʹެ։ɾվળɾٞ࿦͍͖ͯ͠·͠ΐ͏ʂ
    ·ͱΊ: ࢿݯͱͯ͠ΈΔ࣮ݧϓϩάϥϜ
    84

    View Slide