Slide 1

Slide 1 text

Paper introduction: Finetuned Language Models Are Zero-Shot Learners
Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le (Google Research), ICLR 2022
Presenter: 磯沼 大
Unless otherwise noted, figures and tables are taken from the paper.

Slide 2

Slide 2 text

Contents
• Background and overview
• Proposed method and datasets
• Evaluation experiments
• Discussion

Slide 3

Slide 3 text

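The arrival of GPT-3
• A language model has appeared that can solve arbitrary tasks when given a task instruction or demonstrations.
• It is drawing attention as a major step toward a general-purpose language model.
[Figure: GPT-3 input/output examples, from https://beta.openai.com/playground]
– Task instruction. Input: "Translate the following text into Japanese. How are you?" Output: "お元気ですか?"
– Task demonstration. Input: "Paraphrase the following text. Examples: I love fruits / I enjoy eating food. I have lunch" Output: "I eat lunch"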

Slide 4

Slide 4 text

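Why can GPT-3 pull off such feats?
• GPT-3 is trained by reading (that is, predicting the next word of) an enormous amount of web text, on the order of hundreds of billions of words.
– That text includes Q&A sites such as the one below, so the model may be implicitly learning a wide variety of questions and their answers.
→ If a model were explicitly trained on task instructions and their answers, could it pull off the same feats as GPT-3?
– Could a model smaller than GPT-3 reach high, stable performance?
Source: https://c4-search.apps.allenai.org/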

Slide 5

Slide 5 text

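Proposing instruction tuning: learning task instructions and the answers to them
• Instruction tuning: multitask learning in which task instructions and demonstrations are added as prompts.
• Evaluation uses tasks and instructions that were not given during training, to test whether the model generalizes to unseen instructions and tasks.
[Figure: ordinary multitask learning compared with instruction tuning. Each input is combined with a prompt before it is fed to the model. Training uses dozens of tasks; evaluation uses a task that was not trained on.]
– Translation: "Translate the following text into Japanese." "How are you" → "お元気ですか?"; "I like sports" → "私は運動が好きです。"
– Sentiment analysis: "What is the sentiment of this movie review?" "This movie is fun" → "positive"; "I don't like this film" → "negative"
– Paraphrase: "Paraphrase the following text." "I love fruits" → "I enjoy eating fruit"; "I go to school" → "I attend school"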
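The data preparation described on this slide can be sketched in a few lines of Python. This is only an illustration: the `Example` class, the helper `to_prompted_pair`, and the choice of paraphrase as the held-out task are assumptions made for the sketch, not details taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Example:
    task: str         # task name (hypothetical label)
    instruction: str  # natural-language task instruction
    text: str         # task input
    target: str       # expected output


def to_prompted_pair(ex: Example) -> tuple[str, str]:
    """Prepend the instruction to the input to form a (prompt, target) pair."""
    return f"{ex.instruction}\n{ex.text}", ex.target


# Training tasks: every example carries its instruction as part of the prompt.
train_examples = [
    Example("translation", "Translate the following text into Japanese.",
            "How are you?", "お元気ですか?"),
    Example("sentiment", "What is the sentiment of this movie review?",
            "This movie is fun.", "positive"),
]

# Assumption for this toy setup: paraphrase is the task held out for evaluation.
eval_example = Example("paraphrase", "Paraphrase the following text.",
                       "I love fruits.", "I enjoy eating fruit.")

train_pairs = [to_prompted_pair(ex) for ex in train_examples]
eval_prompt, eval_target = to_prompted_pair(eval_example)
```

The point of the split is that `eval_example.task` never appears among the training tasks, so the evaluation measures generalization to an unseen instruction rather than recall of a trained one.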

Slide 6

Slide 6 text

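Similar efforts have appeared in quick succession in recent years
• FLAN (the paper introduced here): venue ICLR 2022; prompt content: instruction (* there are also experiments that add demonstrations); model: LaMDA-PT (decoder-only, 137B)
• T0 [1]: lead organization BigScience workshop; venue ICLR 2022; prompt content: instruction; model: T5-LM (encoder-decoder, 3B and 11B)
• Natural Instructions v1 [2], v2 [3]: venue v1 ACL 2022, v2 arXiv 2022 (under submission); prompt content: instruction + demonstrations (* there are also experiments that add negative examples and so on, but instruction + demonstrations performs best); model: v1 BART (encoder-decoder, 140M), v2 T5-LM
• MetaICL [4]: venue NAACL 2022; prompt content: demonstrations (* there are also experiments that add instructions); model: GPT-2 (decoder-only, 770M)
Lead organizations were identified from the GitHub repositories and the domains of the prompt-collection sites.
[1] Sanh et al., Multitask Prompted Training Enables Zero-Shot Task Generalization. ICLR, 2022.
[2] Mishra et al., Cross-Task Generalization via Natural Language Crowdsourcing Instructions. ACL, 2022.
[3] Wang et al., Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks. arXiv, 2022.
[4] Min et al., MetaICL: Learning to Learn In Context. NAACL, 2022.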

Slide 7

Slide 7 text

Platforms for collecting prompts have appeared
• In addition to conventional datasets, a trend of collecting prompts has begun.
[Figure: screenshots of PromptSource (used for training and evaluating T0) [1] and Natural Instructions [2]]
[1] Bach et al., PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts. ACL demo, 2022. https://github.com/bigscience-workshop/promptsource
[2] Wang et al., Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks. arXiv, 2022. https://instructions.apps.allenai.org/

Slide 8

Slide 8 text

Contents
• Background and overview
• Proposed method and datasets
• Evaluation experiments
• Discussion

Slide 9

Slide 9 text

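Collecting and categorizing datasets
• 62 existing datasets are gathered, and each is assigned to one of 12 clusters.
• Training uses some of the clusters and evaluation uses a cluster excluded from training, which gives a strict measure of zero-shot performance.
[Figure: the dataset clusters, grouped into Natural Language Understanding (classification tasks) and Natural Language Generation (generation tasks)]
* In addition, when evaluating on the Read. comp. w/ commonsense cluster, the Reading comp. and Commonsense clusters are not used for training; when evaluating on the Natural language inference cluster, the Paraphrase cluster is not used for training. The reverse also holds.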
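A hedged Python sketch of this cluster-level hold-out follows. The cluster keys, the handful of dataset names, and the helper `split_for` are illustrative assumptions; only a small subset of clusters is listed, and the `ALSO_WITHHELD` table encodes the footnote's extra exclusions in both directions.

```python
# Illustrative subset of clusters; the dataset assignments are assumptions
# made for this sketch, not the full list used in the paper.
CLUSTERS = {
    "natural_language_inference": ["rte", "anli"],
    "paraphrase": ["mrpc", "qqp"],
    "reading_comprehension": ["boolq", "squad"],
    "commonsense": ["copa", "hellaswag"],
    "read_comp_with_commonsense": ["record", "cosmos_qa"],
    "translation": ["wmt16_en_de"],
}

# Clusters that are additionally withheld from training when a given
# cluster is evaluated (the footnote's rule, applied in both directions).
ALSO_WITHHELD = {
    "read_comp_with_commonsense": {"reading_comprehension", "commonsense"},
    "reading_comprehension": {"read_comp_with_commonsense"},
    "commonsense": {"read_comp_with_commonsense"},
    "natural_language_inference": {"paraphrase"},
    "paraphrase": {"natural_language_inference"},
}


def split_for(eval_cluster):
    """Return (training datasets, evaluation datasets) for one held-out cluster."""
    withheld = {eval_cluster} | ALSO_WITHHELD.get(eval_cluster, set())
    train = sorted(
        name
        for cluster, names in CLUSTERS.items()
        if cluster not in withheld
        for name in names
    )
    return train, sorted(CLUSTERS[eval_cluster])


train_sets, eval_sets = split_for("natural_language_inference")
```

With this table, evaluating on natural language inference removes both the NLI and the paraphrase datasets from training, so no closely related task leaks into the zero-shot measurement.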

Slide 10

Slide 10 text

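Creating templates
• Ten templates (prompts) are written for each dataset.
– Example: the case of Natural Language Inference (recognizing textual entailment)
• To secure task diversity, templates that swap a dataset's input and output are also written.
– Example: from a sentiment polarity classification dataset, a template that takes the sentiment polarity as input and generates a product review.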
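As a rough illustration of the template idea, the Python sketch below stores each template as an (input pattern, target pattern) pair so that a swapped template is just another pair. The template wording, the record fields, and the `render` helper are assumptions for the sketch, not the paper's actual templates.

```python
import random

# Each template is an (input pattern, target pattern) pair.
# The wording is illustrative only.
SENTIMENT_TEMPLATES = [
    ("Review: {text}\nIs this review positive or negative?", "{label}"),
    ("{text}\nWhat is the sentiment of this review?", "{label}"),
    # Input and output swapped: generate a review from the polarity.
    ("Write a {label} product review.", "{text}"),
]


def render(template, record):
    """Fill one record into a template, returning (prompt, target)."""
    source_pattern, target_pattern = template
    return source_pattern.format(**record), target_pattern.format(**record)


record = {"text": "The battery lasts all day.", "label": "positive"}

# Pick a template at random for this record.
rng = random.Random(0)
prompt, target = render(rng.choice(SENTIMENT_TEMPLATES), record)

# The swapped template turns the classification dataset into a generation task.
swapped_prompt, swapped_target = render(SENTIMENT_TEMPLATES[2], record)
```

Keeping the target pattern inside the template is what makes the swap cheap: the same record serves both the classification and the generation direction.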

Slide 11

Slide 11 text

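Training and evaluating the model
• Each example in a dataset is filled into a template and fed to the language model, which outputs an answer. The model is trained to maximize the likelihood of the correct answer.
• For both classification and generation tasks, the generated answer is compared with the correct answer.
– In classification tasks, however, the correct answer can be phrased in several ways (in the example on this slide, "entailment" is possible as well as "yes").
– An "Options: yes/no" suffix is therefore added to the template to tell the model that the answer must be either "yes" or "no".
[Figure: the input "Read the following and determine if the hypothesis can be inferred from the premise. Premise: Russian cosmonaut Valery Hypothesis: Russians hold the record Options: yes, no" is fed to the language model, which generates the answer "yes"]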
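A small Python sketch of the options suffix and of answer comparison follows. The exact suffix layout, the case-insensitive exact-match rule, and the premise and hypothesis text are assumptions chosen for illustration.

```python
def with_options(prompt, options):
    """Append an OPTIONS block so the model knows the allowed answers."""
    lines = "\n".join(f"- {option}" for option in options)
    return f"{prompt}\nOPTIONS:\n{lines}"


def is_correct(generated, gold):
    """Compare a generated answer with the gold answer (assumed exact match)."""
    return generated.strip().lower() == gold.strip().lower()


# Hypothetical NLI example, not taken from any dataset.
base_prompt = (
    "Read the following and determine if the hypothesis can be inferred "
    "from the premise.\n"
    "Premise: The cat sat on the mat.\n"
    "Hypothesis: An animal is on the mat."
)
prompt = with_options(base_prompt, ["yes", "no"])
```

Without the suffix, an answer such as "entailment" would be semantically right but would fail the comparison; constraining the answer space is what makes a simple comparison meaningful.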

Slide 12

Slide 12 text

Contents
• Background and overview
• Proposed method and datasets
• Evaluation experiments
• Discussion

Slide 13

Slide 13 text

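Experimental setup
Baselines and proposed method
Baselines:
• GPT-3 175B (Brown et al., 2020)
• GLaM 64B/64E (Du et al., 2021)
• LaMDA-PT 137B (Thoppilan et al., 2022)
– A decoder-only language model pretrained on web text, dialogue data, and Wikipedia.
– The pretraining corpus also contains code and non-English text.
Proposed method:
• FLAN 137B
– The pretrained LaMDA-PT, instruction tuned.
Datasets
Training:
• All datasets outside the cluster used for evaluation are mixed.
• Each dataset is capped at 30,000 examples to reduce the imbalance between datasets.
• A template is chosen at random for each example.
Evaluation:
• The following clusters are evaluated:
– natural language inference, reading comprehension, closed-book QA, translation, commonsense reasoning, coreference resolution, struct-to-text
• The performance obtained with each template is averaged.
– The metric (accuracy, BLEU, and so on) depends on the task.
* Because the baselines perform markedly worse when given FLAN's templates, all baseline models are evaluated with GPT-3's prompts.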
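The training mixture and the per-template averaging can be sketched in Python as below. The cap of 30,000 examples per dataset is the figure reported in the FLAN paper; random subsampling under the cap, the function names, and the data layout are assumptions for this sketch.

```python
import random
from statistics import mean

MAX_EXAMPLES_PER_DATASET = 30_000  # per-dataset cap reported in the FLAN paper


def build_mixture(datasets, templates, cap=MAX_EXAMPLES_PER_DATASET, seed=0):
    """Mix datasets, capping each one and choosing a random template per example."""
    rng = random.Random(seed)
    mixture = []
    for name, examples in datasets.items():
        pool = list(examples)
        rng.shuffle(pool)  # assumption: the cap keeps a random subset
        for example in pool[:cap]:
            template = rng.choice(templates[name])
            mixture.append((name, template.format(**example)))
    rng.shuffle(mixture)  # interleave the datasets
    return mixture


def mean_over_templates(score_by_template):
    """Report one number per dataset: the mean score over its templates."""
    return mean(score_by_template.values())


# Tiny hypothetical datasets to show the effect of the cap.
toy_datasets = {
    "big": [{"text": str(i)} for i in range(10)],
    "small": [{"text": "a"}],
}
toy_templates = {"big": ["A: {text}", "B: {text}"], "small": ["C: {text}"]}
toy_mixture = build_mixture(toy_datasets, toy_templates, cap=3)
```

Capping keeps a very large dataset from dominating the mixture, and averaging over templates keeps the reported score from depending on one lucky prompt wording.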

Slide 14

Slide 14 text

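Tasks where instruction tuning worked
• The instruction-tuned model (FLAN) outperforms the untuned model (LaMDA-PT).
• It surpasses GPT-3 and GLaM on many datasets.
– This suggests that instruction tuning is more effective than learning by reading large amounts of text.
[Figure: results for four task clusters; panel metrics: accuracy, accuracy/F1, accuracy, BLEU]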

Slide 15

Slide 15 text

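Tasks where instruction tuning works less well
• On commonsense reasoning and coreference resolution, by contrast, FLAN improves over LaMDA-PT on only 3 of 7 tasks.
– The presumed reason: the GPT-3 and LaMDA-PT prompts pose the task directly as language modeling (concatenating input and output yields a natural sentence), whereas FLAN's prompts are redundant.
– The prompts used for evaluation should arguably be unified, but the authors state that they did not use FLAN's prompts to evaluate GPT-3 and LaMDA-PT because performance was markedly lower with them. Whether the reverse holds is unknown.
[Figure: input/output example from the COPA dataset, with rows for input, correct answer, and output; prompt text is shown in blue. GPT-3 style: "I packed up my belongings because" followed by "I was hunting for a new apartment". FLAN style: "I packed up my belongings. What is the cause? OPTIONS: - I was hunting for a new apartment. - I was moving out of my apartment." followed by "I was hunting for a new apartment".]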
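The two prompt styles contrasted on this slide can be sketched in Python using the COPA example from the figure. The function names and the exact string layout are assumptions for illustration.

```python
def lm_style_prompt(premise):
    """GPT-3 style: turn the premise into a prefix the model simply continues."""
    return premise.rstrip(".") + " because"


def instruction_style_prompt(premise, choices):
    """FLAN style: state the question and list the options explicitly."""
    options = "\n".join(f"- {choice}" for choice in choices)
    return f"{premise} What is the cause?\nOPTIONS:\n{options}"


premise = "I packed up my belongings."
choices = [
    "I was hunting for a new apartment.",
    "I was moving out of my apartment.",
]

lm_prompt = lm_style_prompt(premise)
flan_prompt = instruction_style_prompt(premise, choices)
```

In the first style the answer is a natural continuation of the sentence, so the instruction adds little; that is the presumed reason instruction tuning helps less on these tasks.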

Slide 16

Slide 16 text

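Examples of instructions that succeeded
• The model can answer in either English or Chinese.
• It can also solve generation tasks.
• It coins a new word for the phenomenon of "bananas raining from the sky" (a task it certainly was not trained on).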

Slide 17

Slide 17 text

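Examples of instructions that failed
• Even the simple task of returning the n-th word of a sentence fails in a number of cases.
• The model did respond in Danish, but what it produced was a translation of the question.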

Slide 18

Slide 18 text

Contents
• Background and overview
• Proposed method and datasets
• Evaluation experiments
• Discussion

Slide 19

Slide 19 text

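Number of tasks, datasets, and templates
• Broadly, performance improves as more tasks are used for training.
– Adding some tasks yields little improvement (although this depends on the order and combination in which tasks are added, so no blanket statement is possible).
• Performance improves as the number of datasets per cluster increases (▲ vs. ●).
• The number of templates per dataset has little effect on performance (horizontal axis).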

Slide 20

Slide 20 text

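Model size
• Performance improves as models get larger; every paper reports the same trend.
• FLAN reports that for models of 8B parameters or fewer, instruction tuning actually lowers performance, whereas other papers report that instruction tuning is effective on smaller models (sizes trainable on A100/V100):
– T0 / Natural Instructions v2: T5-LM (3B)
– MetaICL: GPT-2 large (770M)
– Natural Instructions v1: BART-base (140M)
• Possible causes: differences in prompts (the lower entries add demonstrations) or differences in pretraining (causal LM vs. masked LM)?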

Slide 21

Slide 21 text

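Prompt content
• Of three prompt patterns (blank, dataset name, task instruction), adding the task instruction during training gives the best performance (right figure).
– Simply having some indicator of the task is not enough.
• Feeding demonstrations in addition to the task instruction (few-shot) improves performance overall (bottom figure).
– Randomly chosen demonstrations are added to the prompt at training and evaluation time.
– The gain may be especially large on tasks with complex output formats, such as struct-to-text.
[Figure legend: FLAN without demonstrations; FLAN with demonstrations]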
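The three prompt patterns and the few-shot variant can be sketched together in Python. The pattern names, the "Input:/Output:" layout, and the `build_prompt` signature are assumptions for this sketch rather than the format used in the paper.

```python
import random


def build_prompt(pattern, dataset_name, instruction, query,
                 exemplars=(), k=0, seed=0):
    """Build a prompt with one of three headers and k random demonstrations."""
    if pattern == "blank":
        header = ""
    elif pattern == "dataset_name":
        header = f"[{dataset_name}]"
    elif pattern == "instruction":
        header = instruction
    else:
        raise ValueError(f"unknown pattern: {pattern}")

    rng = random.Random(seed)
    shots = rng.sample(list(exemplars), k) if k else []

    parts = [header] if header else []
    parts += [f"Input: {x}\nOutput: {y}" for x, y in shots]
    parts.append(f"Input: {query}\nOutput:")  # the model completes this line
    return "\n\n".join(parts)


# Hypothetical exemplars for a sentiment task.
exemplars = [("Great film.", "positive"), ("Dull plot.", "negative"),
             ("Loved it.", "positive")]
zero_shot = build_prompt("instruction", "toy_sentiment",
                         "What is the sentiment of this review?", "Not bad.")
few_shot = build_prompt("instruction", "toy_sentiment",
                        "What is the sentiment of this review?", "Not bad.",
                        exemplars=exemplars, k=2)
```

Demonstrations show the model the expected output format directly, which is one plausible reason the gain is larger on tasks whose outputs are structurally complex.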

Slide 22

Slide 22 text

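Are there effective prompt elements beyond task instructions and demonstrations?
• Natural Instructions builds several kinds of prompts from the following elements:
– Def: the task definition (instruction)
– Pos: positive examples of the task (demonstrations)
– Neg: negative examples of the task
– Expl: explanations of the reasoning behind the positive and negative examples
• Adding Neg (negative examples) or Expl (explanations) does not improve performance; Def + Pos (instruction + demonstrations) is sufficient.
[Table: indexed by the prompt used in training and the prompt used in evaluation; values are average performance over multiple tasks (ROUGE-L)]
Source: Wang et al., Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks. arXiv, 2022.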
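How the four elements combine into one prompt can be sketched in Python. The field names in the `task` dictionary, the section labels, and the `compose_prompt` helper are hypothetical; the actual Natural Instructions format may differ.

```python
ELEMENTS = ("def", "pos", "neg", "expl")


def compose_prompt(task, elements, query):
    """Assemble a prompt from the selected elements (Def, Pos, Neg, Expl)."""
    unknown = set(elements) - set(ELEMENTS)
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")

    parts = []
    if "def" in elements:
        parts.append(f"Definition: {task['definition']}")
    # Positive and negative examples share a layout; explanations are
    # attached only to the examples that are actually included.
    for key, label in (("pos", "Positive example"), ("neg", "Negative example")):
        if key not in elements:
            continue
        field = "positive" if key == "pos" else "negative"
        for example in task[field]:
            block = (f"{label}\nInput: {example['input']}"
                     f"\nOutput: {example['output']}")
            if "expl" in elements:
                block += f"\nExplanation: {example['explanation']}"
            parts.append(block)
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# Hypothetical task record.
task = {
    "definition": "Answer yes or no.",
    "positive": [{"input": "Is water wet?", "output": "yes",
                  "explanation": "The answer is one of the allowed words."}],
    "negative": [{"input": "Is fire cold?", "output": "maybe",
                  "explanation": "The answer is not yes or no."}],
}
def_pos = compose_prompt(task, {"def", "pos"}, "Is ice hot?")
everything = compose_prompt(task, set(ELEMENTS), "Is ice hot?")
```

The `def_pos` configuration corresponds to the instruction plus demonstrations setting that the slide reports as sufficient.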

Slide 23

Slide 23 text

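Conclusion
• The paper proposes instruction tuning, which performs multitask learning with prompts added, and confirms that it improves generalization compared with training without prompts.
– An approach different from having a large-scale model read large amounts of text (GPT-3 and the like) has been found.
– Which approach is better has not been evaluated fairly.
• Effects on generalization performance:
– Number of tasks/datasets: more is better, but how much each task contributes varies widely.
– Model size: larger is better; whether a certain minimum size is required remains unclear.
– Prompt content: at present, a task instruction plus demonstrations (positive examples) performs best.
• Open research questions:
– Do models really understand prompts? Look forward to the next paper introduction, by Morishita-san (森下さん)!
> Webson & Pavlick, "Do Prompt-Based Models Really Understand the Meaning of Their Prompts?" NAACL 2022
– Which instructions and tasks are useful for improving generalization?
– How should the ability to understand arbitrary instructions be evaluated?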

Slide 24

Slide 24 text

APPENDIX

Slide 25

Slide 25 text

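Number of datasets and templates: T0 shows a somewhat different trend
Number of datasets:
• As the number of tasks (datasets) grows, median performance rises, but variance can also grow.
Number of templates:
• Increasing the number of templates per dataset raises median performance and reduces variance.
– Performance improves especially when templates whose input and output differ from the original task are added (green).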