Slide 1

Slide 1 text

ࢿݯͱͯ͠ݟΔ࣮ݧϓϩάϥϜ ໊ݹ԰େֶେֶӃ৘ใֶݚڀՊ म࢜2೥ ෢ా࡫໺ݚڀࣨ ௩ӽ ॣ

Slide 2

Slide 2 text

•ݚڀΛਐΊΔʹ͸࣮ݧ͕ඞཁ •ਂ૚ֶश or ࣗવݴޠॲཧ෼໺ʹ͓͍࣮ͯݧ͸ϓϩάϥϜʹΑΓ࣮ࢪ͞ΕΔ • ࣮ݧϓϩάϥϜ΋ݚڀ׆ಈͷॏཁͳཁૉͰ͋Δ ຊൃදͷझࢫ •࣮ݧϓϩάϥϜͷ໾ׂͱॏཁੑɺ࣮ݧϓϩάϥϜ؅ཧͷํࡦɺࣗવݴޠॲཧ ͷ޻ֶతൃలʹ͍ͭͯ঺հ •ਝ଎͔ͭద੾ͳݚڀ਱ߦͷͨΊʹɺ࣮ݧϓϩάϥϜΛࢿݯͱͯ͠ଊ͑Δ • ࣮ମݧɾମײΛަ͑ͯ “ྑ͍࣮ݧϓϩάϥϜ” ͷͨΊͷٞ࿦Λਪਐ •࣮ݧϓϩάϥϜͷੵۃతͳެ։ɾվળɾٞ࿦Λଅਐ͍ͨ͠ ֓ཁ 2

Slide 3

Slide 3 text

•໊લ: ௩ӽ ॣ / TSUKAGOSHI, Hayato •ॴଐ: ໊େ ෢ా࡫໺ݚ M2 ݚڀ: •ఆٛจΛ༻͍ͨจຒΊࠐΈߏ੒๏
 (NLP 2021, ACL-IJCNLP 2021, ࣗવݴޠॲཧ Vol. 30) •ࣗવݴޠਪ࿦ͱ࠶ݱثΛ༻͍ͨSplit and Rephrase
 ʹ͓͚Δੜ੒จͷ඼࣭޲্ (NLP 2022) •ҟͳΔڭࢣ৴߸͔Βߏஙͨ͠จϕΫτϧͷൺֱ
 ͱ౷߹ (म࿦, *SEM 2022) •(ڞஶ) Ψ΢ε෼෍ʹجͮ͘จදݱੜ੒ (NLP2023) •ֶৼಛผݚڀһ(DC1)࠾༻಺ఆɾത࢜՝ఔਐֶ༧ఆ ࣗݾ঺հ 3 ϓϩϑΟʔϧαΠτ: https://hpprc.dev/ @γΞτϧ

Slide 4

Slide 4 text

•ຊࢿྉ͸
 ݴޠॲཧֶձ ୈ29ճ೥࣍େձ ซઃϫʔΫγϣοϓ JLR2023
 ೔ຊޠݴޠࢿݯͷߏஙͱར༻ੑͷ޲্
 Ͱͷචऀͷޱ಄ൃදʮࢿݯͱͯ͠ݟΔ࣮ݧϓϩάϥϜʯͷվగ൛Ͱ͢ɻ •ൃදதʹඈ͹ͨ͠εϥΠυɾෆཁͳεϥΠυΛ௥Ճɾ࡟আ͍ͯ͠·͢ɻ •ൃදதʹݴٴΛলུͨ͠εϥΠυʹ͸εϥΠυӈ্ʹɹɹɹɹ ͱ͍͏
 ϚʔΫ͕͍͍ͭͯ·͢ɻ •ຊࢿྉͷϥΠηϯε͸CC-BY 4.0ʹج͖ͮ·͢ɻ·ͨɺ͢΂ͯͷϖʔδʹ͍ͭ ͯݚڀࣨɾاۀ಺ɾSNS্Ͱͷڞ༗ͱྑࣝͷൣғ಺ͰͷվมΛڐՄ͠·͢ɻ ຊࢿྉʹ͍ͭͯ 4 Skipped

Slide 5

Slide 5 text

•࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑ • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ •έʔεελσΟ • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங ໨࣍ 5

Slide 6

Slide 6 text

•࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑ • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ •έʔεελσΟ • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங ໨࣍ 6

Slide 7

Slide 7 text

࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑

Slide 8

Slide 8 text

•ݚڀ׆ಈʹ͓͚Δ࣮ݧ͸ʮԾઆʯʮ࣮૷ͱධՁʯʮߟ࡯ʯʹ෼ղͰ͖Δ • ීஈ͸ʮԾઆʯ΍ʮߟ࡯ʯ͕ॏࢹ͞Ε͕ͪ • ͕͜͜ݚڀͷ໘ന͍ͱ͜ΖͰ͸͋Δ • Ͱ͸ɺద੾ͳʮ࣮૷ͱධՁʯ͸࣮ݱͰ͖ͯ౰વ͔ʁ ద੾ͳ࣮ݧΛ͢ΔͨΊʹ 8 ద੾ͳ
 Ծઆ ద੾ͳ
 ࣮૷ͱධՁ ద੾ͳ
 ߟ࡯

Slide 9

Slide 9 text

•࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ ద੾ͳ࣮ݧΛ͢ΔͨΊʹ 9 ద੾ͳ
 Ծઆ ؒҧͬͨ
 ࣮૷ͱධՁ ؒҧͬͨ
 ߟ࡯

Slide 10

Slide 10 text

•࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ • ʮ࣮૷͕ؒҧֶ͍ͬͯͯश͕Ͱ͖ͳ͔ͬͨʯ • ʮධՁํ๏͕ؒҧ͍ͬͯͯҙຯͷͳ͍࣮ݧΛ͍ͯͨ͠ʯ • ʮԿ΋͔΋μϝͩͬͨʯ ద੾ͳ࣮ݧΛ͢ΔͨΊʹ 10 ద੾ͳ
 Ծઆ ؒҧͬͨ
 ࣮૷ͱධՁ ؒҧͬͨ
 ߟ࡯

Slide 11

Slide 11 text

•࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ • ʮ࣮૷͕ؒҧֶ͍ͬͯͯश͕Ͱ͖ͳ͔ͬͨʯ • ʮධՁํ๏͕ؒҧ͍ͬͯͯҙຯͷͳ͍࣮ݧΛ͍ͯͨ͠ʯ • ʮԿ΋͔΋μϝͩͬͨʯ •ؒҧ࣮ͬͨݧ݁Ռ͔Β͸ؒҧͬͨߟ࡯͔͠ੜ·Εͳ͍ ద੾ͳ࣮ݧΛ͢ΔͨΊʹ 11 ద੾ͳ
 Ծઆ ؒҧͬͨ
 ࣮૷ͱධՁ ؒҧͬͨ
 ߟ࡯

Slide 12

Slide 12 text

•࣮ࡍʹ͸ʮ࣮૷ͱධՁʯ΋ۃΊͯॏཁ • ʮ࣮૷͕ؒҧֶ͍ͬͯͯश͕Ͱ͖ͳ͔ͬͨʯ • ʮධՁํ๏͕ؒҧ͍ͬͯͯҙຯͷͳ͍࣮ݧΛ͍ͯͨ͠ʯ • ʮԿ΋͔΋μϝͩͬͨʯ •ؒҧ࣮ͬͨݧ݁Ռ͔Β͸ؒҧͬͨߟ࡯͔͠ੜ·Εͳ͍ ద੾ͳ࣮ݧΛ͢ΔͨΊʹ 12 ద੾ͳ
 Ծઆ ؒҧͬͨ
 ࣮૷ͱධՁ ؒҧͬͨ
 ߟ࡯ ਖ਼͍͠ʮ࣮૷ͱධՁʯ͸
 ख໭ΓΛݮΒ͠ݚڀΛਝ଎Խ͢Δ ʮ࣮૷ͱධՁʯͷ
 best practice΋
 ࢿݯͱͯ͠ॏཁ

Slide 13

Slide 13 text

•Best practiceΛֶͿ • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ •ΞϯνύλʔϯΛ஌Δ • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ •৽͍ٕ͠ज़Λ஌Δ • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋ • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋ ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ 13

Slide 14

Slide 14 text

•Best practiceΛֶͿ • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ •ΞϯνύλʔϯΛ஌Δ • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ •৽͍ٕ͠ज़Λ஌Δ • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋ • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋ ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ 14

Slide 15

Slide 15 text

•ม਺໊Λಀ͛ͣʹߟ͑Δ •άϩʔόϧม਺Λආ͚Δ •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏) •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ ࣮૷ͱධՁͷޡΓΛ๷͙ 15 ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ

Slide 16

Slide 16 text

•ม਺໊Λಀ͛ͣʹߟ͑Δ •άϩʔόϧม਺Λආ͚Δ •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏) •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ •࠷ڧ train.py (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ) •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ) ࣮૷ͱධՁͷޡΓΛ๷͙ 16 ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ

Slide 17

Slide 17 text

•ม਺໊Λಀ͛ͣʹߟ͑Δ •άϩʔόϧม਺Λආ͚Δ •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏) •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ •࠷ڧ train.py (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ) •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ) •ࣗ෼ͷهԱྗΛ৴͡ΔͷΛ΍ΊΔ •͋ΒΏΔ࡞ۀΛϓϩάϥϜʹى͜͢ (σʔλͷμ΢ϯϩʔυɾલॲཧɾධՁ) ࣮૷ͱධՁͷޡΓΛ๷͙ 17 ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ

Slide 18

Slide 18 text

•ม਺໊Λಀ͛ͣʹߟ͑Δ •άϩʔόϧม਺Λආ͚Δ •഑ྻʹҙຯʹ͋ΔσʔλΛԡ͠ࠐΉͷΛ΍ΊΔ (dict΍dataclassΛ࢖͏) •Type HintsΛ࢖ͬͯग़དྷΔ͚ͩܕΛ໌ࣔ͢Δ •Jupyter Notebook͚ͩͰ࣮ݧ͢ΔͷΛ΍ΊΔ •࠷ڧ train.py (ਗ਼໺, 2021) Λආ͚Δ (੹຿Λ෼͚Δ) •ঢ়ଶʹґଘͨ͠ॲཧΛආ͚Δ (ࢀরಁաੑͷ͋Δؔ਺Λઃܭͷத৺ʹ͢Δ) •ࣗ෼ͷهԱྗΛ৴͡ΔͷΛ΍ΊΔ •͋ΒΏΔ࡞ۀΛϓϩάϥϜʹى͜͢ (σʔλͷμ΢ϯϩʔυɾલॲཧɾධՁ) ࣮૷ͱධՁͷޡΓΛ๷͙ 18 ൃදऀ஫: ʰϦʔμϒϧίʔυʱΛ݄1ͰಡΈฦ͢ͱྑ͍Ͱ͠ΐ͏ɻ ʰϦʔμϒϧίʔυʱ
 ʰGoogleͷιϑτ΢ΣΞΤϯδχΞϦϯάʱ
 Λಡ΋͏ʂ

Slide 19

Slide 19 text

•Best practiceΛֶͿ • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ •ΞϯνύλʔϯΛ஌Δ • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ •৽͍ٕ͠ज़Λ஌Δ • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋ • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋ ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ 19

Slide 20

Slide 20 text

•Best practiceΛֶͿ • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ •ΞϯνύλʔϯΛ஌Δ • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ •৽͍ٕ͠ज़Λ஌Δ • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋ • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ 20

Slide 21

Slide 21 text

ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల

Slide 22

Slide 22 text

•HuggingFaceͷTransformers͕୆಄🤗 •Ͱ͖Δ͜ͱ͸େ͖͘෼͚ͯ3ͭ • ஶ໊ͳਂ૚ֶशϞσϧɾΞʔΩςΫνϟ࣮૷ͷར༻ • ࣄલ܇࿅ࡁΈϞσϧύϥϝʔλͷڞ༗ɾμ΢ϯϩʔυ • ࣄલఆٛɾࣄલ܇࿅͞ΕͨϞσϧΛ༻ֶ͍ͨशɾਪ࿦ͷ؆ུԽ •PyTorchɾTensorFlowɾJax / FlaxʹରԠ •NLPઐ໳ϥΠϒϥϦ͕ͩͬͨɺը૾ɾԻ੠ܥͷϞσϧ΋ೖ͖͍ͬͯͯΔ • ը૾෼໺ͷಉ༷ͷϥΠϒϥϦtimm΋࠷ۙ HuggingFace ؅ཧԼʹ ਂ૚ֶश༻ϥΠϒϥϦ: Transformers 22

Slide 23

Slide 23 text

•ެ։Ϟσϧͷར༻͕ۃΊͯ؆୯ • ࣗલͷϞσϧͷެ։΋ඇৗʹָɺ਺ߦͰ࣮ߦՄೳ •Ϟσϧͷμ΢ϯϩʔυɾॏΈͷϩʔυ·Ͱ1ߦͰ࣮૷Մೳ • ࠷ۙ͸ AutoModel ΫϥεಋೖͷಋೖͰ͞Βʹศརʹ • ઃఆϑΝΠϧ (con fi g.json) ͸֤ࣗ֬ೝ͠·͠ΐ͏ ਂ૚ֶश༻ϥΠϒϥϦ: Transformers 23

Slide 24

Slide 24 text

•HuggingFaceͷσʔληοτॲཧ༻ϥΠϒϥϦ •ࣗ෼Ͱ΍Δʹ͸େมͳػߏ͕࣮૷ • ެ։σʔληοτͷμ΢ϯϩʔυɾ੔ܗ • લॲཧͷࣗಈతͳΩϟογϡॲཧ • ಉ͡σʔληοτʹର͢Δಉ͡લॲཧΛࣗಈతʹεΩοϓ • Ϛϧνϓϩηεॲཧͷ؆ศԽ • PythonͷmultiprocessingͳͲࣗ෼Ͱ΍Δʹ͸໘౗ͳॲཧ͕ࣗಈతʹ σʔληοτ༻ϥΠϒϥϦ: datasets 24 Skipped

Slide 25

Slide 25 text

•༷ʑͳॲཧ͕ۃΊͯ؆୯ʹ࣮ݱՄೳ • σʔλͷμ΢ϯϩʔυ(with αϒηοτͷࢦఆ): 1ߦ • σʔληοτΛϚϧνϓϩηεͰฒྻʹલॲཧ •࣮ମݧ: Juman++౳ͷ෼͔ͪॻ͖ॲཧΛࣗಈతʹΩϟογϡͰ͖ඇৗʹศར σʔληοτ༻ϥΠϒϥϦ: datasets 25

Slide 26

Slide 26 text

•ۙ೥ͷਂ૚ֶशͰ͸΄ͱΜͲͷίʔυ͕PythonͰهड़͞ΕΔ •͔͠͠ɺPython͸ͦΕ΄Ͳ଎͍ݴޠͰ͸ͳ͍ • Global Interpreter Lock (GIL) ͷଘࡏʹΑͬͯฒྻॲཧ͕໘౗ • ͦ΋ͦ΋શମతʹͳΜͱͳ͘஗͍ •ʮPyTorch΋C++Ͱهड़͞ΕͯΔ͠C++Λ࢖͑͹ʁʯ • ΋ɺ΋͏ͪΐͬͱϞμϯͳݴޠΛ࢖͍͍ͨؾ͕࣋ͪ… •࠷ۙʹͳͬͯ Rust ͕༷ʑͳ৔ॴʹಋೖ͞Ε͍ͯΔ • PythonͱҟͳΓ੩తܕ෇͚ɾίϯύΠϧ͞ΕͯػցޠΛੜ੒ ϓϩάϥϛϯάݴޠͷมભ 26 Skipped

Slide 27

Slide 27 text

•ϑϩϯτΤϯυ։ൃ͔ΒγεςϜϓϩάϥϛϯά·Ͱ༷ʑͳ৔ॴͰར༻ •ण໋ (lifetime)΍ॴ༗ݖͱ͍ͬͨ֓೦ͷಋೖʹΑΓthread safe & null safe ࣮ࡍʹRustΛར༻͍ͯ͠ΔϥΠϒϥϦ •huggingface/tokenizers • transformers ಺Ͱར༻͞Ε͍ͯΔτʔΫφΠβ༻ϥΠϒϥϦ •google-research/deduplicate-text-datasets • େྔͷςΩετ͔ΒॏෳΛ࡟আ (ֶशޮ཰ͷվળ) •Rust+ػցֶशͷ·ͱΊ: vaaaaanquish/Awesome-Rust-MachineLearning ϓϩάϥϛϯάݴޠͷมભ: Rustͷಋೖ 27 Skipped

Slide 28

Slide 28 text

•ߏ଄ԽσʔλͷऔΓѻ͍͸ݱ୅Ͱ΋ॏཁ •SQLϥΠΫʹςʔϒϧσʔλΛૢ࡞Ͱ͖Δ
 Pandas͕ඇৗʹ༗໊͕ͩ… •Rustϕʔεͷςʔϒϧૢ࡞ϥΠϒϥϦ
 Polars͕஫໨͞Ε͖͍ͯͯΔ • ϕϯνϚʔΫ্Ͱ͸ۃΊͯߴ଎ •ϝιουνΣΠϯΛ׆͔ͨ͠ه๏
 ͳͲ࢖͍উखʹ͍ͭͯ΋༗๬ ςʔϒϧσʔλ༻ϥΠϒϥϦͷมભ: Polarsͷ୆಄ 28 https://www.pola.rs/benchmarks.html Polars Pandas ςʔϒϧσʔλॲཧϥΠϒϥϦͷ
 ϕϯνϚʔΫʹ͓͚Δॲཧ࣌ؒͷάϥϑ Skipped

Slide 29

Slide 29 text

•DeepSpeed: Microsoft͕։ൃɺਂ૚ֶशϞσϧΛޮ཰తʹ܇࿅ • Transformersͱͷ࿈ܞ΋ •Accelerate: HuggingFace͕։ൃɺෳ਺GPUରԠͳͲΛ؆୯ʹ࣮ݱ •FlexGen: • ௒େن໛ϞσϧΛखݩͰਪ࿦Ͱ͖ΔΑ͏ʹ޻෉͢ΔϥΠϒϥϦ • ϨΠςϯγͰ͸ͳ͘εϧʔϓοτʹϑΥʔΧε ͦͷ΄͔ͷιϑτ΢ΣΞɾςΫχοΫ 29 Skipped

Slide 30

Slide 30 text

•ෳ਺ͷ࣮ݧઃఆΛࢼ͢৔߹͸
 ίϚϯυϥΠϯҾ਺ͷར༻͕ศར •͔͠͠argparse͸ܕ͕෇͔ͳ͍ Typed Argument Parser (Tap) •PythonͷdataclassͷΑ͏ʹ
 ίϚϯυϥΠϯύʔαΛهड़Մೳ • αϒίϚϯυͷఆٛ΍ܧঝ΋ •ద੾ͳܕ෇͚ʹΑͬͯิ׬ਫ਼౓޲্ •ଐੑ໊ͷtypo΍ܕͷؒҧ͍͕ܹݮ ิ׬ͷޮ͘argparseͷ୅ସ: Tap 30 https://github.com/swansonk14/typed-argument-parser

Slide 31

Slide 31 text

େن໛ͳϞσϧͷ
 fi ne-tuningςΫχοΫͱ࣮૷ྫ

Slide 32

Slide 32 text

•Ұൠʹύϥϝʔλ਺ͷେ͖ͳϞσϧͷํ͕ੑೳ͕ߴ͍ •Ͱ͖Ε͹େ͖ͳϞσϧΛ܇࿅͍͕ͨ͠Α͘ൃੜ͢Δ໰୊͕͍͔ͭ͘ • GPUͷϝϞϦෆ଍ • ܇࿅͕஗͍ •େ͖ͳϞσϧΛ܇࿅͢ΔͨΊͷςΫχοΫΛ࣮૷ྫͱڞʹ͍͔ͭ͘঺հ • खܰ (͕ͩۃΊͯ༗ޮ) ͳ΋ͷʹݫબ େن໛ͳϞσϧͷ܇࿅ςΫχοΫͱ࣮૷ྫ 32

Slide 33

Slide 33 text

A100: VRAM 80GB •T5-3B (30ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ •ຊൃදͷ޻෉ΛೖΕΕ͹΋ͬͱେ͖ͳϞσϧ΋܇࿅Ͱ͖Δ͸ͣ A6000: VRAM 48GB •BERT-large (3.3ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ GTX2080 ti: VRAM 11GB •BERT-base (1.1ԯύϥϝʔλ) ͕όοναΠζ 16 Ͱී௨ʹ܇࿅Ͱ͖Δ GPUͱ܇࿅ՄೳͳϞσϧαΠζͷഽײ 33 ໔੹ࣄ߲: ೖྗܥྻ௕΍ͦͷଞ͞·͟·ͳཁҼʹӨڹΛड͚Δײ֮஋ͳͷͰ͝ঝ஌͓͖͍ͩ͘͞

Slide 34

Slide 34 text

•ਂ૚ֶशϞσϧͷύϥϝʔλ͸௨ৗ 32 bit ͷ ුಈখ਺఺਺ Ͱදݱ • ࣮͸ͦΜͳʹࡉ͔͘਺஋Λදݱ͠ͳͯ͘΋ྑ͍ •਺஋දݱͷ bit ਺ΛݮΒͤΔͱলϝϞϦɾ௿ܭࢉίετʹͳ͓ͬͯಘ • 16 bitͰ਺஋Λදݱͨ͠ͷ͕൒ਫ਼౓ුಈখ਺఺਺ •16 bitͷ࢖͍ํʹΑ༷ͬͯʑͳ࢓༷͕ଘࡏ • FP16: traditionalͳ൒ਫ਼౓ුಈখ਺఺਺ • BF16 (b fl oat16): Google͕ఏҊɺA100ͳͲ࠷ۙͷGPU΍TPUͰར༻Մೳ • B͸BrainͷBΒ͍͠ ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16 34 https://cloud.google.com/tpu/docs/b fl oat16?hl=ja

Slide 35

Slide 35 text

•FP16͸ਫ਼౓ෆ଍Ͱֶश͕ෆ҆ఆʹͳΔ৔߹͕ଘࡏ • BF16ͷํ͕better͔΋…ʁ • ࣮ମݧ: T5ͷେ͖ͳϞσϧ͸BF16Λ༻͍ͳ͍ͱ͏·ֶ͘शͰ͖ͳ͍ ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16 35 ը૾͸Wikipedia͔ΒҾ༻ BF16 FP16 FP32

Slide 36

Slide 36 text

•࣮༻తʹ͸ɺAutomatic Mixed Precision (AMP) ͕༗༻ • FP16 / BF16 ͩͱϚζ͍෦෼͸ࣗಈతʹFP32ʹͯ͘͠ΕΔ •AMP͸PyTorchͳΒࣗಈతʹ࣮ߦͯ͘͠ΕΔΠϯλϑΣʔε͕ଘࡏ • AMP & BF16ͷར༻͸ҎԼͷΑ͏ʹ࣮ݱՄೳ • ͜ΕͱGradScalerͱ͍͏ػߏΛ࢖͏ඞཁ͕͋Δ ൒ਫ਼౓ුಈখ਺఺਺: FP16, BF16 36 ը૾͸ https://carbon.now.sh/ Ͱੜ੒ forwardΛwithͷதͰ࣮ߦ

Slide 37

Slide 37 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 37 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays. ଛࣦ u2 u3 u4 u5 u1 u2 u3 u4 u1 u5 ॱ఻೻ ٯ఻೻

Slide 38

Slide 38 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 38 ଛࣦ u2 u3 u4 u5 u1 u2 u4 u1 u5 ॱ఻೻ ٯ఻೻ ௨ৗ u4 ͷޯ഑ܭࢉʹͦΕҎલͷ৘ใ͕ඞཁ u3 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.

Slide 39

Slide 39 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 39 ଛࣦ u2 u3 u4 u5 u1 u2 u3 u4 u1 u5 ॱ఻೻ ٯ఻೻ ௨ৗ u4Ҏલͷܭࢉ݁ՌΛهԱͯ͠ར༻ https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.

Slide 40

Slide 40 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 40 ଛࣦ u2 u3 u4 u5 u1 u2 u3 u4 u1 u5 ॱ఻೻ ٯ఻೻ ௨ৗ u4Ҏલͷܭࢉ݁ՌΛهԱͯ͠ར༻ https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays. ॱ఻೻ͷܭࢉ݁ՌΛ͢΂ͯهԱ

Slide 41

Slide 41 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 41 ଛࣦ u2 u3 u4 u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing u4 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.

Slide 42

Slide 42 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 42 ଛࣦ u2 u3 u4 u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint u4 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.

Slide 43

Slide 43 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 43 ଛࣦ u2 u3 u4 u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint ܭࢉάϥϑͷҰ෦ͷ݁ՌͷΈอଘ u4 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays.

Slide 44

Slide 44 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 44 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays. ଛࣦ u2 u3 u4 u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint ඞཁͳ෼Λܭࢉ͠௚͠ u4

Slide 45

Slide 45 text

•Back Propagation༻ͷϝϞϦΛ࡟ݮ͢ΔςΫχοΫ •ܭࢉάϥϑͷҰ෦ͷܭࢉ݁ՌͷΈΛอଘ͢Δ Gradient Checkpointing 45 https://github.com/cybertronai/gradient-checkpointing
 Fitting larger networks into memory. Backprop and systolic arrays. ଛࣦ u2 u3 u4 u5 u1 u2 u3 u1 u5 ॱ఻೻ ٯ఻೻ Gradient Checkpointing Checkpoint هԱ͢Δܭࢉ݁Ռ͕গͳ͍ʂ u4

Slide 46

Slide 46 text

•ܭࢉ݁Ռͷอଘ਺Λ
 ݮΒ͢͜ͱͰ
 ϝϞϦ࢖༻ྔΛ࡟ݮ •ඞཁͳΒ࠶౓ॱ఻೻
 ܭࢉΛ͢ΔͷͰ
 ࣮ߦ࣌ؒ͸৳ͼΔ • ΍Ε͹͍͍ͱ͍͏
 Θ͚Ͱ͸ͳ͍ Gradient Checkpointing 46

Slide 47

Slide 47 text

•Gradient Checkpointing͸ࣗ෼Ͱ΍Ζ͏ͱ͢Δͱগ͠େม… • transformersͳΒ1ߦͰར༻Մೳ • ະରԠϞσϧ΋ଘࡏ͢ΔͷͰ౎౓֬ೝΛ •࣮ମݧ: όοναΠζ 512ͰͷֶशͷϝϞϦ࢖༻ྔ͕ 80GB→25GB Gradient Checkpointing: ࣮૷ 47 ը૾͸ https://carbon.now.sh/ Ͱੜ੒

Slide 48

Slide 48 text

•PyTorch 1.9͔Β௥Ճ͞Εͨ৽͍͠ਪ࿦Ϟʔυ • طଘͷਪ࿦Ϟʔυͱͯ͠͸ `torch.no_grad` ͕ଘࡏ •੍໿͕ଟগ௥Ճ͞Εͨ͜ͱͰɺΑΓϝϞϦফඅΛ཈͑ͨਪ࿦͕Մೳʹ • ධՁ࣌͸΍Γಘɺ܇࿅࣌ʹ͸࢖ͬͯ͸͍͚ͳ͍ torch.inference_mode 48 ը૾͸ https://carbon.now.sh/ Ͱੜ੒ Skipped

Slide 49

Slide 49 text

•࣮ݧϓϩάϥϜͷ؅ཧͱ࣮૷ͷࢦ਑ • ద੾ͳ࣮ݧΛ͢ΔͨΊʹ • ਂ૚ֶशؔ࿈ιϑτ΢ΣΞͷൃల • େن໛ͳϞσϧͷ fi ne-tuningςΫχοΫͱ࣮૷ྫ •έʔεελσΟ • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEͷߏங ໨࣍ 49

Slide 50

Slide 50 text

•Best practiceΛֶͿ • ࣮૷ʹࡍͯ͠ʮྑ͍ʯͱ͞Ε͍ͯΔઃܭ΍ه๏ΛֶͿ •ΞϯνύλʔϯΛ஌Δ • ࣮૷ʹࡍͯ͠ʮѱ͍ʯͱ͞Ε͍ͯΔઃܭ΍ࢥߟ͔Β୤͢Δ •৽͍ٕ͠ज़Λ஌Δ • طଘͷٕज़ΑΓ΋ͬͱྑ͍ํ๏͕ੜ·Ε͍ͯΔ͔΋ • ৽͍ٕ͠ज़ͷࢥ૝Λࠓͷٕज़ʹ΋Ԡ༻Ͱ͖Δ͔΋ ࠶ܝ: ద੾ͳ࣮ݧΛ͢ΔͨΊʹඞཁͳ͜ͱ 50

Slide 51

Slide 51 text

•͜͜·Ͱͷ࣮૷ํ਑ɾςΫχοΫΛ׆༻ͨ͠ϓϩδΣΫτྫΛ঺հ •Work In Progress (WIP) ͳ಺༰͕ଟ͍఺͸͝༰͍ࣻͩ͘͞ ঺հϓϩδΣΫτ •BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ •ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ •SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEϞσϧͷߏங έʔεελσΟ 51

Slide 52

Slide 52 text

έʔεελσΟ:
 BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ

Slide 53

Slide 53 text

•ࣄલֶशࡁΈϞσϧΛऔΓר͘؀ڥ͸ۃΊͯٸ଎ʹ੔උ͞Ε͍ͯΔ • BERTΛ༻͍ͨߴ඼࣭ͳςϯϓϨʔτ͸΄ͱΜͲଘࡏ͠ͳ͍ • ಛʹ࠷৽ͷPython, PyTorch, TransformersʹରԠͰ͖͍ͯͳ͍ •ࣗવݴޠॲཧͷॳֶऀʹͱͬͯ͸͍ۤ͠ঢ়گ • ʮݚڀ΍࣮ݧΛͲͷΑ͏ʹ։࢝ͨ͠ΒΑ͍͔Θ͔Βͳ͍ʯ • ʮΑ͍ઃܭɺ࣮ݧ؅ཧΛͲͷΑ͏ʹߦ͑͹ྑ͍͔Θ͔Βͳ͍ʯ •ʮϞμϯͰߴ඼࣭ͳ࣮ݧϓϩάϥϜʯͷݟຊ͕ඞཁ • ࣗ෼ͳΓͷ࣮૷ํ਑ɾࢦ਑Λදݱͨ͠Θ͔Γ΍͍࣮͢૷͕͋Δͱ༗༻ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ 53 https://github.com/hppRC/bert-classi fi cation-tutorial

Slide 54

Slide 54 text

•ςΩετ෼ྨͷೖ໳ͱͯ͠༗໊ͳʮϥΠϒυΞχϡʔείʔύεʯ͕୊ࡐ •BERTΛ fi ne-tuning͢ΔྲྀΕΛग़དྷΔ͚ͩγϯϓϧʹ࣮૷ •࣮૷: hppRC/bert-classi fi cation-tutorial BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ 54 https://github.com/hppRC/bert-classi fi cation-tutorial

Slide 55

Slide 55 text

•ςΩετ෼ྨͷೖ໳ͱͯ͠༗໊ͳʮϥΠϒυΞχϡʔείʔύεʯ͕୊ࡐ •BERTΛ fi ne-tuning͢ΔྲྀΕΛग़དྷΔ͚ͩγϯϓϧʹ࣮૷ •࣮૷: hppRC/bert-classi fi cation-tutorial ߩݙ •Python 3.10, PyTorch 1.13, Transformers 4.25 Ҏ্ʹରԠ •Type Hintsͷ׆༻ͱݟ௨͠ͷྑ͍ઃܭ •ʮσʔλ४උʯ → ʮ܇࿅ & ධՁʯ ͱ͍͏୯ํ޲తͳ࣮ݧϓϩηεͷ࣮ྫ •࣮ݧςϯϓϨʔτͱͯͦ͠ͷଞͷλεΫ΁ͷస༻͕༰қͳ࣮૷ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ 55 https://github.com/hppRC/bert-classi fi cation-tutorial

Slide 56

Slide 56 text

•୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍ 56

Slide 57

Slide 57 text

•୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍ 57 લॲཧΛඞͣ
 ϓϩάϥϜͱͯ͠࢒͢

Slide 58

Slide 58 text

•୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍ 58 JSONLܗࣜͷ
 σʔληοτ JSONL / csv / tsv ͷ
 ύʔεॲཧΛ
 ઈରʹࣗ࡞͠ͳ͍

Slide 59

Slide 59 text

•୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍ 59 ࣮ݧͱධՁΛ
 ࿈ଓతʹߦ͏

Slide 60

Slide 60 text

•୯ํ޲σʔλϑϩʔΛҙࣝͨ͠ઃܭ BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ: ֓؍ 60 Ϟσϧࣗମ͸อଘ͠ͳ͍ ʮอଘͨ͠ϞσϧΛϩʔυͯ͠ධՁʯ ͸͔ͳΓόά͕ൃੜ͠΍͍͢

Slide 61

Slide 61 text

•࣮ݧ݁Ռͱ࣮ͯ͠ݧઃఆɾධՁࢦඪɾֶशϩάΛอଘ͢Δ •࣮ݧ݁Ռ͸࣮ݧઃఆɾ೔෇͝ͱʹσΟϨΫτϦΛ੾Δ • ྫ: outputs/[Ϟσϧ໊]/[೥݄೔]/[࣌෼ඵ] • ಉ࣮͡ݧઃఆͰ΋݁Ռ্͕ॻ͖͞ΕͨΓ͠ͳ͍ ࣮ݧ݁Ռͷอଘํ๏: betterͳ࣮ݧ؅ཧΛ໨ࢦͯ͠ 61 ͱʹ͔͘
 σΟϨΫτϦΛ੾Δ

Slide 62

Slide 62 text

•࣮ݧ݁Ռͱ࣮ͯ͠ݧઃఆɾධՁࢦඪɾֶशϩάΛอଘ͢Δ •࣮ݧ݁Ռ͸࣮ݧઃఆɾ೔෇͝ͱʹσΟϨΫτϦΛ੾Δ • ྫ: outputs/[Ϟσϧ໊]/[೥݄೔]/[࣌෼ඵ] • ಉ࣮͡ݧઃఆͰ΋݁Ռ্͕ॻ͖͞ΕͨΓ͠ͳ͍ •࣮ݧ݁Ռͷूܭ͸ 1. ධՁࢦඪϑΝΠϧ (metrics.json) Λ࠶ؼతʹऩू 2. ࣮ݧઃఆϑΝΠϧ (con fi g.json) ΛಡΈࠐΈ 3. PandasͰ࣮ݧઃఆͱධՁࢦඪͷσʔλϑϨʔϜΛ࡞੒ 4. ࣮ݧઃఆ͝ͱʹgroupby͢ΔͳͲ͓޷ΈͰ ࣮ݧ݁Ռͷอଘํ๏: betterͳ࣮ݧ؅ཧΛ໨ࢦͯ͠ 62 ͱʹ͔͘
 σΟϨΫτϦΛ੾Δ

Slide 63

Slide 63 text

έʔεελσΟ:
 ChatGPTΛ༻͍ͨӳޠσʔληοτ
 ͷࣄྫ͝ͱ຋༁

Slide 64

Slide 64 text

ࣗવݴޠਪ࿦ (Natural Language Inference; NLI) •จϖΞ (લఏจɾԾઆจ) ʹϥϕϧ (ؚҙɾໃ६ɾதཱ) ͕෇༩ •จϖΞͷҙຯؔ܎Λ༧ଌ͢ΔλεΫ NLIσʔληοτ 64 લఏจ Ծઆจ ϥϕϧ A man playing an electric guitar on stage. A man playing guitar on stage. ؚҙ A man playing an electric guitar on stage. A man playing banjo on the fl oor. ໃ६ A man playing an electric guitar on stage. A man is performing for cash. தཱ

Slide 65

Slide 65 text

•ӳޠͷNLIσʔληοτ͸͔ͳΓ੔උ͞Ε͍ͯΔ • Stanford NLI (SNLI; Bowman et al., 2015): ໿57ສจϖΞ • Multi-Genre NLI (MNLI; Williams et al., 2018): ໿41ສจϖΞ •೔ຊޠͷNLIσʔληοτ͸ӳޠͱൺֱͯ͠ݶఆత • JSNLI (٢ӽΒ, 2020): Stanford NLIσʔληοτΛ೔ຊޠʹػց຋༁ • JNLI (܀ݪΒ, 2022): JGLUE ʹಉࠝɺΩϟϓγϣϯΛར༻ • JaNLI (୩தΒ, 2021): ݴޠֶత஌ݟʹجͮ͘ఢରతσʔληοτ ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁ 65

Slide 66

Slide 66 text

•೔ຊޠͰ࠷΋ར༻͞Ε͍ͯΔNLIσʔληοτ͸JSNLI • 2018೥͘Β͍ͷGoogle຋༁ʹΑͬͯ୯จ୯ҐͰ຋༁ • ୯จ୯Ґͷ຋༁͸จϖΞͷҙຯؔ܎Λյͯ͠͠·͏ݒ೦΋ •େن໛ݴޠϞσϧ͸຋༁΋্ख͘͜ͳͤΔ͜ͱ͕஌ΒΕ͍ͯΔ • ಛʹChatGPT͸ඇৗʹҹ৅తͳೳྗΛൃش •େن໛ݴޠϞσϧ͸promptʹΑͬͯଟ༷ͳ৚݅෇͖ੜ੒͕Մೳ • ৚݅෇͚࣍ୈͰϥϕϧͷҙຯؔ܎Λյͣ͞ʹࣄྫ͝ͱʹ຋༁Ͱ͖ΔͷͰ͸ʁ •ChatGPTΛ༻͍ͯӳޠNLIσʔληοτΛ೔ຊޠʹࣄྫ͝ͱ຋༁ ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁ 66

Slide 67

Slide 67 text

0. OpenAIͷAPI KeyΛൃߦ 1. pip install openai 2. promptΛઃܭ 3. ࣙॻํʹpromptΛೖΕͯJSONͱͯ͠APIʹ౤͛Δ (উखʹ΍ͬͯ͘ΕΔ) ChatGPTʹΑΔ຋༁ͷखॱ 67

Slide 68

Slide 68 text

຋༁ର৅ •Stanford NLI: ໿57ສจϖΞ •Multi-Genre NLI: ໿41ສจϖΞ ຋༁ख๏ •ϥϕϧͷҙຯؔ܎Λյ͞ͳ͍Α͏ʹ຋༁͢ΔΑ͏promptͰࢦࣔ •OpenAIͷChatGPT API (gpt-3.5-turbo, $0.002/1K tokens) Λར༻ •໿100ສจϖΞͷ຋༁ʹ5ສԁఔ౓ (DeepLͷAPIͩͱ17ສԁҎ্) ੒Ռ෺ •೔ӳର༁෇͖ͷ೔ຊޠNLIσʔληοτ ChatGPTʹΑΔࣗવݴޠਪ࿦σʔληοτͷࣄྫ͝ͱ຋༁ 68 ൃදऀ஫: promptΛ؆ૉʹ͢ΔͳͲͰ΋ͬͱ҆ՁʹͰ͖ͦ͏Ͱ͢

Slide 69

Slide 69 text

•6-shot learningΛ࣮ࢪ • NLIσʔληοτͷ೔ӳର༁
 ͱͯ͠JSICK͔ΒࣄྫΛഈआ •গ਺ࣄྫͷޙʹ຋༁͍ͨ͠ࣄྫΛ౤ೖ •Batch Prompting (Cheng et al., 2023)
 ΋ར༻ ࣮ࡍͷPrompt 69

Slide 70

Slide 70 text

•6-shot learningΛ࣮ࢪ • NLIσʔληοτͷ೔ӳର༁
 ͱͯ͠JSICK͔ΒࣄྫΛഈआ •গ਺ࣄྫͷޙʹ຋༁͍ͨ͠ࣄྫΛ౤ೖ •Batch Prompting (Cheng et al., 2023)
 ΋ར༻ ମײ •zero-shotΑΓfew-shotͷ΄͏͕
 ֨ஈʹ຋༁඼࣭͕ߴ͍ •promptΤϯδχΞϦϯάͰ͞Βʹ
 ຋༁඼࣭Λ޲্ͤ͞ΒΕͦ͏ ࣮ࡍͷPrompt 70

Slide 71

Slide 71 text

•ۙ೔ެ։༧ఆ • σʔληοτ • ࣮ݧίʔυ • prompt •ݱࡏධՁத… WIP: NU-NLI 71

Slide 72

Slide 72 text

έʔεελσΟ:
 SimCSEͷ࠶ݱ࣮૷ͱ
 ೔ຊޠSimCSEϞσϧͷߏங

Slide 73

Slide 73 text

•ରরֶश(Contrastive Learning)Λ༻͍ͯࣄલֶशࡁΈϞσϧΛ fi ne-tuning • Unsupervised SimCSE:ʮಉ͡จΛ2ճຒΊࠐΜͰରরֶशʯ • Supervised SimCSE: ʮؚҙؔ܎ʹ͋ΔจΛਖ਼ྫͱͯ͠ରরֶशʯ Gao+: SimCSE: Simple Contrastive Learning of Sentence Embeddings, EMNLP ’21 SimCSE: ରরֶशʹجͮ͘จຒΊࠐΈख๏ 73 ਤ͸࿦จΑΓҾ༻ɻҎલ࣮ࢪͨ͠SimCSEͷྠߨࢿྉ͸ͪ͜Β

Slide 74

Slide 74 text

•ެ࣮ࣜ૷͸ଟ༷ͳந৅Խ͕ࢪ͞Ε͍ͯͯॳֶऀʹ͸௥͍ͮΒ͍ • จຒΊࠐΈͷݚڀΛଅਐ͍ͨ͠ •ग़དྷΔ͚ͩந৅ԽΛݮΒͨ͠γϯϓϧͳ࠶ݱ࣮૷Λެ։ • hppRC/simple-simcse SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE 74

Slide 75

Slide 75 text

•ެ࣮ࣜ૷͸ଟ༷ͳந৅Խ͕ࢪ͞Ε͍ͯͯॳֶऀʹ͸௥͍ͮΒ͍ • จຒΊࠐΈͷݚڀΛଅਐ͍ͨ͠ •ग़དྷΔ͚ͩந৅ԽΛݮΒͨ͠γϯϓϧͳ࠶ݱ࣮૷Λެ։ • hppRC/simple-simcse •γϯϓϧͳ PyTorch + transformers ͷߏ੒ɾશମͰ250ߦ • + ࿦จ΁ͷ֘౰Օॴ΁ͷݴٴɾࢲݟΛؚΉίϝϯτ107ߦ • σʔλͷલॲཧΛআ͘ •࠶ݱ࣮૷Λ༻͍ͨ࠶ݱ࣮ݧ΋࣮ࢪ • ࿦จͷϋΠύϥͰ4Ϟσϧɾ50ཚ਺γʔυͰ࣮ݧ (=200ճ) SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE 75

Slide 76

Slide 76 text

•ίϝϯτ෇͖ͰֶशϧʔϓΛ؆ܿʹهड़ SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE 76

Slide 77

Slide 77 text

•hppRC/simple-simcse •৽͘͠จຒΊࠐΈݚڀΛ࢝ΊΔਓͷ
 ଍͕͔Γͱͯ͠ิ଍આ໌΋هࡌ •ੑೳ΋΄΅࠶ݱ SimCSEͷ࠶ݱ࣮૷: Simple-SimCSE 77 50ճ࣮ͣͭݧͨ͠ࡍͷੑೳͷώετάϥϜ ϋΠύϥʹର͢Δݴٴɾࢲݟ

Slide 78

Slide 78 text

•ӳޠͷࣄલֶशࡁΈจຒΊࠐΈϞσϧ͸ଟ਺ଘࡏ •ҰํͰ೔ຊޠจຒΊࠐΈϞσϧͷܾఆ൛͸ଘࡏ͠ͳ͍ • ิ଍: ࠷ۙ PKSHA͔ࣾΒ೔ຊޠSimCSEϞσϧ ͕ެ։ • ೔ຊޠจຒΊࠐΈք۾΋੝Γ্͕Γͭͭ͋ΔΧϞ…ʂ •೔ຊޠจຒΊࠐΈϞσϧͷแׅతͳධՁ͕ଘࡏ͠ͳ͍ WIP: ೔ຊޠSimCSEϞσϧͷߏங 78

Slide 79

Slide 79 text

•ӳޠͷࣄલֶशࡁΈจຒΊࠐΈϞσϧ͸ଟ਺ଘࡏ •ҰํͰ೔ຊޠจຒΊࠐΈϞσϧͷܾఆ൛͸ଘࡏ͠ͳ͍ • ิ଍: ࠷ۙ PKSHA͔ࣾΒ೔ຊޠSimCSEϞσϧ ͕ެ։ • ೔ຊޠจຒΊࠐΈք۾΋੝Γ্͕Γͭͭ͋ΔΧϞ…ʂ •೔ຊޠจຒΊࠐΈϞσϧͷแׅతͳධՁ͕ଘࡏ͠ͳ͍ •೔ຊޠจຒΊࠐΈϞσϧͷߏஙͱแׅతͳධՁΛ࣮ࢪ • ۙ೥ͷจຒΊࠐΈख๏ͱͯ͠୅දతͳSimCSEΛϕʔεʹ • ڭࢣ͋Γɾڭࢣͳ͠ͷ྆ํΛ࣮ݧ • ෳ਺ͷσʔληοτɾϋΠύϥͰ࣮ݧ WIP: ೔ຊޠSimCSEϞσϧͷߏங 79

Slide 80

Slide 80 text

܇࿅σʔλ •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI, MNLI) WIP: ೔ຊޠSimCSEϞσϧͷߏங 80 WikipediaܥΛ2ͭ WebܥΛ2ͭ

Slide 81

Slide 81 text

܇࿅σʔλ •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI, MNLI) ࣮ݧઃఆ: •ࣄલֶशࡁΈݴޠϞσϧ21छྨͰ࣮ݧ (base: 14छྨ, large: 7छྨ) •όοναΠζ: {64, 128, 256, 512}, ֶश཰: {1e-5, 3e-5, 5e-5} •ҟͳΔཚ਺γʔυ஋Ͱ3ճ࣮ͣͭݧͯ͠࠷ྑͷϋΠύϥͰධՁ WIP: ೔ຊޠSimCSEϞσϧͷߏங 81 WikipediaܥΛ2ͭ WebܥΛ2ͭ

Slide 82

Slide 82 text

܇࿅σʔλ •ڭࢣͳ͠: ೔ຊޠWikipedia, Wiki-40B, CC-100, BCCWJ •ڭࢣ͋Γ: JSNLI, NU-NLI (SNLI, MNLI) ࣮ݧઃఆ: •ࣄલֶशࡁΈݴޠϞσϧ21छྨͰ࣮ݧ (base: 14छྨ, large: 7छྨ) •όοναΠζ: {64, 128, 256, 512}, ֶश཰: {1e-5, 3e-5, 5e-5} •ҟͳΔཚ਺γʔυ஋Ͱ3ճ࣮ͣͭݧͯ͠࠷ྑͷϋΠύϥͰධՁ ݱঢ়ͷ࣮ݧ݁Ռ •ڭࢣ͋Γ/ͳ͠ڞʹૣҴాେRoBERTa-large͕࠷ߴੑೳ •Studio Ousia ೔ຊޠLUKE-largeͱXLM-RoBERTa-large͕͍࣍Ͱߴੑೳ WIP: ೔ຊޠSimCSEϞσϧͷߏங 82 ݱࡏ·Ͱʹ… ڭࢣͳ͠: 1559ճ ڭࢣ͋Γ: 3172ճ

Slide 83

Slide 83 text

•ۙ೔ެ։༧ఆ • σʔληοτલॲཧ༻ͷϓϩάϥϜ • ࣮ݧίʔυ (ֶशɾධՁ) • ࣮ݧ݁Ռ (ϋΠύϥ୳ࡧ࣌ͷ݁Ռ΋) • ࣄલ܇࿅ࡁΈϞσϧ •ݱࡏ࣮ݧɾධՁத… WIP: ೔ຊޠSimCSEϞσϧͷߏங 83 BCCWJͷXMLΛ
 Ϛϧνϓϩηεʹલॲཧͯ͠ ςΩετϑΝΠϧʹ
 ม׵͢ΔϓϩάϥϜͳͲ ஶ໊ͳ೔ຊޠσʔληοτ
 લॲཧ༻ͷϓϩάϥϜηοτͱͯ͠΋

Slide 84

Slide 84 text

•ਝ଎͔ͭద੾ͳݚڀ਱ߦͷͨΊͷ࣮ݧϓϩάϥϜʹ͍ͭͯ঺հ •έʔεελσΟΛ௨࣮ͯ͠ફతͳςΫχοΫΛ঺հ • BERTʹΑΔςΩετ෼ྨνϡʔτϦΞϧ • ChatGPTΛ༻͍ͨӳޠσʔληοτͷࣄྫ͝ͱ຋༁ • SimCSEͷ࠶ݱ࣮૷ͱ೔ຊޠSimCSEϞσϧͷߏங •࣮ݧϓϩάϥϜ΋ੵۃతʹެ։ɾվળɾٞ࿦͍͖ͯ͠·͠ΐ͏ʂ ·ͱΊ: ࢿݯͱͯ͠ΈΔ࣮ݧϓϩάϥϜ 84