Slide 1

Slide 1 text

October 31
Takeda-Sasano Laboratory, M2, Chihiro Yano
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
ICLR 2023
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
NeurIPS 2023

Slide 2

Slide 2 text

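Why these papers
● Both are prompting methods that are also listed in the Prompt Engineering Guide
● Both are reasonably well known and useful
● They are likely to serve as a baseline when reading the many prompting papers now being proposed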

Slide 3

Slide 3 text

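Overview
● ReAct
  ● Generates reasoning and actions alternately
  ● Using external tools in the action steps helps prevent hallucination
● Tree of Thoughts
  ● Backtracking and lookahead let the model deliberate, which enables complex reasoning that was difficult with CoT and similar methods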

Slide 4

Slide 4 text

ReAct: Synergizing Reasoning and Acting in Language Models

Slide 5

Slide 5 text

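ReAct: Synergizing Reasoning and Acting in Language Models
● Proposes "ReAct", a prompting method that has an LLM generate reasoning and actions alternately
  ● Reasoning steps plan actions and process the results of actions
  ● Action steps access a knowledge base or an environment to obtain information
● Using correct information from external sources suppresses hallucination
  ● Hallucination: an LLM outputting plausible-sounding falsehoods
Figure: example trace for a question asking whether Halloween is a festival that has continued for more than a given number of years.
  Reasoning: First, look up the origin of Halloween.
  Action: search "Halloween origin"; the result states how long ago Halloween originated.
  Reasoning: The origin lies further back than the number of years asked about, so the answer is "yes".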

Slide 6

Slide 6 text

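ReAct reasoning example (question taken from JEMHopQA)
Q: Which had more broadcast episodes, "Kamen Rider Den-O" or "Amachan"?
Reasoning: Search for the number of episodes of "Kamen Rider Den-O", then search for the number of episodes of "Amachan", and compare the two.
Action: Search[How many episodes of "Kamen Rider Den-O" were broadcast?]
Observation: the number of episodes of "Kamen Rider Den-O"
Reasoning: The episode count of "Kamen Rider Den-O" is now known. Next, search for the episode count of "Amachan".
Action: Search[How many episodes of "Amachan" were broadcast?]
Observation: the number of episodes of "Amachan"
Reasoning: Both counts are now known. "Amachan" had more episodes, so the final answer is "Amachan".
Action: Finish["Amachan"]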
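
The alternation of reasoning, action, and observation in this example can be sketched as a small loop. This is a minimal sketch rather than the authors' implementation: `scripted_policy` is a hypothetical stand-in for the LLM, `toy_search` is a hypothetical stand-in for the search tool, and the names and counts in `TOY_FACTS` are invented for illustration only.

```python
from typing import Callable, List, Tuple

# Hypothetical toy data: the names and counts are invented for illustration.
TOY_FACTS = {"show A": 40, "show B": 150}


def toy_search(query: str) -> str:
    """Stand-in for an external search tool; returns an observation string."""
    for name, count in TOY_FACTS.items():
        if name in query:
            return f"{name} ran for {count} episodes."
    return "No result."


def scripted_policy(history: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: maps the trace so far to (thought, action)."""
    observations = [h for h in history if h.startswith("Observation:")]
    if len(observations) == 0:
        return "Look up show A first.", "Search[show A]"
    if len(observations) == 1:
        return "Now look up show B.", "Search[show B]"
    # Each toy observation ends with "<count> episodes."
    counts = {
        name: int(obs.split()[-2])
        for name, obs in zip(["show A", "show B"], observations)
    }
    best = max(counts, key=counts.get)
    return f"{best} has more episodes.", f"Finish[{best}]"


def react(
    question: str,
    policy: Callable[[List[str]], Tuple[str, str]],
    tool: Callable[[str], str],
    max_steps: int = 5,
) -> str:
    """Alternate thought and action until Finish[...] or the step budget ends."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action = policy(history)
        history += [f"Thought: {thought}", f"Action: {action}"]
        name, _, arg = action.partition("[")
        arg = arg.rstrip("]")
        if name == "Finish":
            return arg
        # The tool result is appended so the next thought can use it.
        history.append(f"Observation: {tool(arg)}")
    return ""  # no answer within the step budget
```

Calling `react("Which show had more episodes?", scripted_policy, toy_search)` walks through two searches and then finishes. With a real model, the policy would be a prompt containing few-shot traces in the same Thought/Action/Observation format.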

Slide 7

Slide 7 text

Knowledge-intensive reasoning tasks
HotpotQA
● Multi-hop reasoning task
● Requires reasoning over two or more Wikipedia articles
● Example: Which magazine was started first Arthur's Magazine or First for Women?
Fever
● Fact verification task
● Given a claim, consult Wikipedia articles and assign one of the labels [Supported, Refuted, NotEnoughInfo]
● Example: Nikolaj Coster-Waldau worked with the Fox Broadcasting Company. - Supported

Slide 8

Slide 8 text

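Action space for knowledge-intensive reasoning tasks
● Defines an action space for obtaining information from Wikipedia
● Three actions:
  ● search[entity]: returns the first 5 sentences of the Wikipedia page for entity if it exists; otherwise returns 5 similar entities
  ● lookup[string]: returns the sentences containing string (like Ctrl+F)
  ● finish[answer]: ends the current task with answer as the answer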
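
The three actions can be sketched as a tiny environment over an in-memory page store. This is an illustrative sketch under stated assumptions, not the paper's Wikipedia API wrapper: `ToyWikiEnv` and its page contents are hypothetical, sentence splitting is a naive split after periods, and "similar entities" are approximated with the standard library's `difflib.get_close_matches`.

```python
import difflib
import re
from typing import Dict, List, Optional


class ToyWikiEnv:
    """Hypothetical stand-in for a Wikipedia-backed environment."""

    def __init__(self, pages: Dict[str, str]):
        self.pages = pages
        self.current: List[str] = []  # sentences of the last page found
        self.answer: Optional[str] = None

    def search(self, entity: str) -> str:
        if entity in self.pages:
            # Naive sentence split: break on whitespace that follows a period.
            self.current = re.split(r"(?<=\.)\s+", self.pages[entity].strip())
            return " ".join(self.current[:5])
        similar = difflib.get_close_matches(
            entity, list(self.pages), n=5, cutoff=0.0
        )
        return "Similar: " + ", ".join(similar)

    def lookup(self, string: str) -> str:
        hits = [s for s in self.current if string.lower() in s.lower()]
        return " ".join(hits) if hits else "No match."

    def finish(self, answer: str) -> str:
        self.answer = answer
        return "Episode finished."

    def step(self, action: str) -> str:
        """Parse 'name[argument]' and dispatch to the matching action."""
        m = re.fullmatch(
            r"(search|lookup|finish)\[(.*)\]", action.strip(), flags=re.IGNORECASE
        )
        if m is None:
            return "Invalid action."
        return getattr(self, m.group(1).lower())(m.group(2))


# Made-up page text, for illustration only.
env = ToyWikiEnv({
    "Apple Remote": "The Apple Remote is a remote control. "
                    "It was designed to control Front Row. It uses infrared."
})
```

Restricting the model to a small, parseable action grammar is what lets the loop reject malformed output (`"Invalid action."`) instead of guessing at intent.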

Slide 9

Slide 9 text

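ReAct flow on knowledge-intensive reasoning tasks
Q: Besides the Apple Remote, is there a device that can control the program the Apple Remote was originally designed to operate?
Figure labels: Apple Remote, Front Row, operation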

Slide 10

Slide 10 text

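Baselines for comparison
Standard
● A prompt that gives the question and asks only for the answer
Q: Besides the Apple Remote, is there a device that can control the program the Apple Remote was originally designed to operate?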

Slide 11

Slide 11 text

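Baselines for comparison
Chain of Thought (CoT) [1]
● Having the model also output intermediate reasoning steps enables complex reasoning
● Uses no actions, only reasoning (relies only on information inside the model)
Chain of Thought self-consistency (CoT-SC) [2]
● Takes a majority vote over multiple CoT runs
● Runs CoT 21 times with the temperature set to 0.7 and decides the answer by majority vote
[1] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
[2] Self-Consistency Improves Chain of Thought Reasoning in Language Models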
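
The CoT-SC majority vote can be sketched in a few lines. This is a minimal sketch: `sample_answer` is a hypothetical callable standing in for one sampled CoT run (for example at temperature 0.7) that returns only the final answer string.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistency(
    sample_answer: Callable[[], str], n_samples: int = 21
) -> Tuple[str, float]:
    """Sample n answers and return the most common one with its vote share."""
    answers = [sample_answer() for _ in range(n_samples)]
    top, count = Counter(answers).most_common(1)[0]
    return top, count / n_samples


# Usage with a deterministic stand-in for the sampler: 12 votes for A, 9 for B.
_fake_runs = iter(["A"] * 12 + ["B"] * 9)
answer, share = self_consistency(lambda: next(_fake_runs))
```

The vote share is returned alongside the answer because it doubles as a rough confidence signal.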

Slide 12

Slide 12 text

Baselines for comparison
Act-Only
● ReAct with the reasoning removed

Slide 13

Slide 13 text

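Experimental setup for knowledge-intensive reasoning tasks
● Model: PaLM-540B
● HotpotQA
  ● 6-shot, using manually annotated examples from the training set
● Fever
  ● 3-shot, using manually annotated examples from the training set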

Slide 14

Slide 14 text

Experimental results on knowledge-intensive reasoning tasks
● ReAct > Act
  ● With Act, it is hard to derive the final answer from the information gathered so far

Slide 15

Slide 15 text

Experimental results on knowledge-intensive reasoning tasks
● ReAct vs CoT
  ● On HotpotQA, CoT wins

Slide 16

Slide 16 text

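ReAct vs CoT @HotpotQA
● CoT's hallucination is more severe than ReAct's (False positive, Hallucination)
● ReAct's reasoning steps are less flexible (Reasoning error)
  ● It often gets stuck in a loop that repeats the same reasoning and actions
● When a search fails, ReAct's reasoning gets derailed (Search result error)
▲ Success and failure cases on HotpotQA were randomly sampled and their causes examined by hand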

Slide 17

Slide 17 text

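Combining internal and external knowledge
● CoT builds reasoning steps more accurately, but hallucination arises because it relies on internal knowledge
● ReAct is accurate because it uses external knowledge, but it is worse than CoT at building reasoning steps
➡ Switch between CoT-SC and ReAct based on the following heuristics
● ReAct→CoT-SC: when no answer is obtained within the prescribed number of steps (set separately for HotpotQA and Fever)
  ● Making ReAct's reasoning steps too long does not improve performance
● CoT-SC→ReAct: when the most-voted answer does not exceed half of the votes
  ● CoT is not answering with confidence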
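
The two switching heuristics can be written down as a pair of small functions. This is a sketch under assumptions: `run_react` and `sample_cot` are hypothetical callables standing in for a ReAct episode (returning `None` when no answer is reached within the step budget) and for one sampled CoT answer. The step budget and sample count are left to the caller, and falling back to the plurality answer when ReAct also fails is this sketch's own choice.

```python
from collections import Counter
from typing import Callable, Optional


def react_then_cot_sc(
    run_react: Callable[[int], Optional[str]],
    sample_cot: Callable[[], str],
    max_steps: int,
    n_samples: int,
) -> str:
    """ReAct -> CoT-SC: back off to voting if ReAct gives no answer in time."""
    answer = run_react(max_steps)
    if answer is not None:
        return answer
    votes = Counter(sample_cot() for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def cot_sc_then_react(
    run_react: Callable[[int], Optional[str]],
    sample_cot: Callable[[], str],
    max_steps: int,
    n_samples: int,
) -> str:
    """CoT-SC -> ReAct: back off to ReAct if no answer exceeds half the votes."""
    votes = Counter(sample_cot() for _ in range(n_samples))
    top, count = votes.most_common(1)[0]
    if count > n_samples / 2:
        return top  # the vote is confident enough
    answer = run_react(max_steps)
    # Assumption of this sketch: keep the plurality answer if ReAct also fails.
    return answer if answer is not None else top
```

Both directions use cheap signals that are already available (step count, vote share), so the switch itself needs no extra model call.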

Slide 18

Slide 18 text

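Experimental results on knowledge-intensive reasoning tasks
● ReAct+CoT-SC > CoT-SC, ReAct
● CoT-SC (sample=21) < ReAct+CoT-SC (sample=5)
● Combining internal and external knowledge is useful
▲ Number of CoT-SC samples (size of the majority vote) vs. performance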

Slide 19

Slide 19 text

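Fine-tuning experiments on knowledge-intensive reasoning tasks
● The 62B and 8B models are fine-tuned on 3,000 examples that PaLM-540B generated and answered correctly
● Because of its complexity, ReAct is hard for a model to pick up from few-shot examples, and it pairs well with fine-tuning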

Slide 20

Slide 20 text

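Decision-making tasks
ALFWorld
● An agent explores a house through text commands and completes six kinds of complex tasks
● There are more than 50 possible actions, so exhaustive exploration is difficult
● Example: You are in the middle of a room. Looking quickly around you, you see a drawer 2, a shelf 5,…, and a drawer 4. Your task is to: put some vase in safe.
  ● Neither the vase nor the safe appears in the input, so the agent has to explore using common sense (a vase is likely to be on a shelf)
WebShop
Reference: WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents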

Slide 21

Slide 21 text

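Decision-making tasks
ALFWorld
WebShop
● A task of purchasing a product according to a user's instructions in an online shopping environment built from real products
● The agent can use search and buttons such as "back"
● Evaluated by Score (how well the answer satisfies the requirements) and Success rate (the fraction of answers that are products satisfying all requirements)
● Example: get me a sixteen pack of apple cinnamon freeze dried banana chips, and price lower than 50.00 dollars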

Slide 22

Slide 22 text

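Baselines for comparison
ALFWorld
● BUTLER [1]: a model trained by imitation learning on 100,000 demonstrations per task type
● ReAct-IM: ReAct in the Inner Monologue style [2]
  ● The agent generates, as its inner monologue, the environment it has observed and the subgoals it still needs to achieve
WebShop
● IL: a model trained by imitation learning on 1,012 demonstrations
● IL+RL: a model trained by imitation learning plus reinforcement learning on 10,587 demonstrations
[1] ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
[2] Inner Monologue: Embodied Reasoning through Planning with Language Models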

Slide 23

Slide 23 text

Experimental setup for decision-making tasks
● Model: PaLM-540B
● ALFWorld
  ● Three examples per task type are annotated by hand, and a random 2-shot subset is used
● WebShop
  ● 1-shot

Slide 24

Slide 24 text

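Experimental results on decision-making tasks
● ReAct > Act
  ● Act cannot decompose the goal or keep track of the environment
● On average, ReAct > BUTLER
  ● ReAct beats a method that uses a large amount of training data
▲ Success rate per task type on ALFWorld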

Slide 25

Slide 25 text

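Experimental results on decision-making tasks
● ReAct > ReAct-IM
  ● The Inner Monologue style only generates observations of the environment and the subgoals to achieve
  ● It struggles to apply common sense well when building reasoning steps
● Example: task: put a clean knife in countertop.
  ● think: To solve the task, I need to find and take a clean knife, then put it in countertop.
  ● It finds a knife, but assumes the knife is clean and keeps putting it on the countertop
▲ Success rate per task type on ALFWorld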

Slide 26

Slide 26 text

Experimental results on decision-making tasks
● ReAct > Act > IL, IL+RL
  ● ReAct manages to generate a good next action from noisy observations
  ● For ‘space-saving ottoman bench for living room’, the item has options ‘39x18x18inch’ and ‘blue’ and seems good to buy.
▲ Score on WebShop

Slide 27

Slide 27 text

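ReAct: Synergizing Reasoning and Acting in Language Models
● Proposes a prompting method that has an LLM generate reasoning and actions alternately
  ● Reasoning steps plan the next action and process the results of actions
  ● Action steps access a knowledge base or an environment to obtain additional information
● On question answering and fact verification tasks, it interacts with a Wikipedia API and overcomes hallucination
● On two interactive decision-making benchmarks, it substantially outperforms imitation learning and reinforcement learning baselines trained on large amounts of data, using only 1-shot/2-shot prompts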

Slide 28

Slide 28 text

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Slide 29

Slide 29 text

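Tree of Thoughts: Deliberate Problem Solving with Large Language Models
● Language model inference is restricted to left-to-right generation
  ● It fails on tasks that require lookahead or where the initial decisions are critical
➡ Proposes "Tree of Thoughts", a framework that makes the intermediate steps of problem solving (thoughts) searchable
● Considers multiple different reasoning paths and self-evaluates to decide the next move (deliberation)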

Slide 30

Slide 30 text

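Comparison with baselines
● Frames any problem as a search over a tree
● Each node is a state representing a partial solution: the input plus the sequence of thoughts so far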

Slide 31

Slide 31 text

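Tree of Thoughts reasoning example
Figure: from an Input of numbers and the instruction to combine them with the four arithmetic operations to make a target value, candidate thoughts are marked promising (⭕) or unpromising (✖) while the model asks itself "Which line of reasoning looks promising...?"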

Slide 32

Slide 32 text

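Tree of Thoughts
● Decomposes a complex reasoning problem into subproblems and searches while evaluating which reasoning paths look promising
● Four options that can be changed per task:
  A. Size of each thought in the decomposition
  B. How thoughts are generated: with or without the previous thoughts as input
  C. How each node is evaluated: scored independently / compared against the alternatives
    ● Uses lookahead and common sense
  D. How the tree is searched: BFS / DFS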
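
The four options map naturally onto the parameters of a generic search routine. The following is a minimal breadth-first sketch, not the authors' code: `propose` (options A and B) and `value` (option C) are hypothetical callables that would be backed by LLM prompts in practice, and the toy usage replaces them with plain string functions.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")


def tot_bfs(
    root: S,
    propose: Callable[[S], Sequence[S]],  # options A/B: generate next thoughts
    value: Callable[[S], float],          # option C: score a partial solution
    depth: int,
    breadth: int,                         # option D: BFS keeping `breadth` states
) -> List[S]:
    """Keep the `breadth` best states at each of `depth` levels."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        candidates.sort(key=value, reverse=True)
        frontier = candidates[:breadth]
    return frontier


# Toy usage: build the string "aba" one character at a time.
TARGET = "aba"
best = tot_bfs(
    "",
    propose=lambda s: [s + c for c in "ab"],
    value=lambda s: sum(x == y for x, y in zip(s, TARGET)),
    depth=3,
    breadth=2,
)
```

Keeping more than one state per level is what separates this from greedy decoding: a state that looks second-best after one step can still win later.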

Slide 33

Slide 33 text

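Proposed tasks that require lookahead and search
Game of 24
● Apply the four arithmetic operations to four given numbers to make 24
Creative Writing
● Output a four-paragraph text in which each of four given sentences is the last sentence of a paragraph
● Whether the generated text is coherent is evaluated by GPT-4 and by humans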

Slide 34

Slide 34 text

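Proposed tasks that require lookahead and search
Mini Crosswords
● Solve a 5x5 crossword (5 clues each for the vertical and horizontal directions)
● Evaluated by the number of correct letters, correct words, and solved games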

Slide 35

Slide 35 text

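Baselines for Game of 24
IO
● A standard prompt that takes the problem as input and outputs the answer (5-shot)
CoT
● Gives suitable arithmetic equations for making 24 from the four numbers (5-shot)
● 13 - 9 = 4 (left: 4 4 10); 10 - 4 = 6 (left: 4 6); 4 * 6 = 24 (left: 24)
CoT-SC
● Runs CoT inference 100 times and takes a majority vote
IO + refine
● Iterates the inference 10 times
● If the answer was wrong, the prompt gives an instruction to correct it together with the history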
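
A trace in the `a op b = c (left: ...)` format shown for the CoT baseline can be checked mechanically. The sketch below is an illustration, not part of the paper: `check_trace` and its regular expression are assumptions about the format, inferred only from the single example on the slide.

```python
import operator
import re
from collections import Counter
from fractions import Fraction
from typing import List

NUM = r"-?\d+(?:\.\d+)?"
STEP = re.compile(
    rf"\s*({NUM}) ([-+*/]) ({NUM}) = ({NUM}) \(left: ({NUM}(?: {NUM})*)\)\s*"
)
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def check_trace(numbers: List[int], trace: str, target: int = 24) -> bool:
    """Verify every step: operands available, arithmetic right, 'left' right."""
    pool = Counter(Fraction(n) for n in numbers)
    for step in trace.split(";"):
        m = STEP.fullmatch(step)
        if m is None:
            return False
        a, b, c = (Fraction(m.group(i)) for i in (1, 3, 4))
        op = m.group(2)
        used = Counter([a, b])
        if used - pool:  # an operand is not among the remaining numbers
            return False
        if op == "/" and b == 0:
            return False
        if OPS[op](a, b) != c:
            return False
        pool = pool - used + Counter([c])
        if Counter(Fraction(x) for x in m.group(5).split()) != pool:
            return False
    # Exactly one number must remain, and it must be the target.
    return sum(pool.values()) == 1 and pool[Fraction(target)] == 1


ok = check_trace(
    [4, 9, 10, 13],
    "13 - 9 = 4 (left: 4 4 10); 10 - 4 = 6 (left: 4 6); 4 * 6 = 24 (left: 24)",
)
```

Exact `Fraction` arithmetic is used so that intermediate divisions do not introduce floating-point error.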

Slide 36

Slide 36 text

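ToT setup for Game of 24
● The options shown on slide 32 are set as follows:
  A. Size of each thought: one arithmetic operation
  B. How thoughts are generated: previous thoughts are not given as input; only a simple equation is generated
  C. How each node is evaluated: can 24 be reached? [sure/maybe/impossible]
    ● Uses common sense to rule out candidates that are too large or too small relative to 24
  D. How the tree is searched: breadth-first search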
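
To make "one arithmetic operation per thought" concrete, the sketch below enumerates every state reachable in one step and uses exhaustive search as an oracle for "can 24 be reached?". This is an illustration only: in ToT both the proposals and the sure/maybe/impossible judgment come from LLM prompts, whereas `propose_steps` and `can_reach` here are hypothetical programmatic stand-ins.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Tuple

State = Tuple[Fraction, ...]  # the numbers still "left", kept sorted


def propose_steps(state: State) -> List[State]:
    """All states reachable by applying one operation to two remaining numbers."""
    children = set()
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = [x for k, x in enumerate(state) if k not in (i, j)]
        results = {a + b, a - b, b - a, a * b}
        if b != 0:
            results.add(a / b)
        if a != 0:
            results.add(b / a)
        for r in results:
            children.add(tuple(sorted(rest + [r])))
    return sorted(children)


def can_reach(state: State, target: int = 24) -> bool:
    """Exhaustive oracle: is the target reachable from this state?"""
    if len(state) == 1:
        return state[0] == target
    return any(can_reach(child, target) for child in propose_steps(state))


start: State = tuple(Fraction(n) for n in [4, 9, 10, 13])
```

An exact oracle like this is only feasible because the game is tiny. The point of the LLM evaluator is to approximate the same pruning signal on problems where exhaustive search is not available.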

Slide 37

Slide 37 text

ToT setup for Game of 24

Slide 38

Slide 38 text

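Results on Game of 24
● Even with search breadth = 1, ToT performs considerably better than the other prompts
● With search breadth 5, it solves as many as 74% of the problems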

Slide 39

Slide 39 text

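Results on Game of 24
● ToT achieves a high success rate for the number of nodes visited
  ● For IO and CoT, each complete run is counted as one node
  ● (The search is well directed)
● With CoT, 60% of samples have already failed by the time the first equation is generated
  ● Methods that cannot backtrack or look ahead have limits
▲ Nodes visited vs. success rate
▲ Failure rate at each step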

Slide 40

Slide 40 text

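Baselines for Creative Writing
IO
● A standard prompt that takes the sentences as input and outputs the text
CoT
● Has the model write a short plan first and then write the text
IO + refine
● Iterates up to 5 times
● If the text lacks coherence, the prompt gives an instruction to revise it together with the history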

Slide 41

Slide 41 text

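ToT setup for Creative Writing
● The options shown on slide 32 are set as follows:
  A. Size of each thought: a plan for what kind of text to write / the written text
  B. How thoughts are generated: the text is written using the previous thought (the plan)
  C. How each node is evaluated: 5 thoughts are generated at a time and the best one is chosen
  D. How the tree is searched: depth = 2, breadth = 1
    ● Depth is 2 because there is a planning step and a writing step; breadth is 1 because only the best thought is kept after each evaluation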

Slide 42

Slide 42 text

ToT setup for Creative Writing

Slide 43

Slide 43 text

Results on Creative Writing
● In both the GPT-4 evaluation and the human evaluation, ToT was judged to generate more coherent text than CoT
● The +refine setting was effective on this task

Slide 44

Slide 44 text

Baselines for Mini Crosswords
IO
● Takes the clues as input and outputs the crossword (5-shot)
  Solve 5x5 mini crosswords. Given an input of 5 horizontal clues and 5 vertical clues, generate an output of 5 rows, where each row is 5 letter separated by space.
  Input:
  h1. A lunar valley
  h2. A fatty oil
  h3. To entice
  h4. To lower; to reduce
  h5. A solitary person
  v1. According to the roster
  v2. Another name for Port-Francqui
  v3. An illicit lover; a European lake
  v4. To lisp
  v5. To come in
  Output:
  R I L L E
  O L E I N
  T E M P T
  A B A S E
  L O N E R
CoT
● Shows the word corresponding to each clue (5-shot)

Slide 45

Slide 45 text

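ToT setup for Mini Crosswords
● The options shown on slide 32 are set as follows:
  A. Size of each thought: guess a word from a clue
  B. How thoughts are generated: guess a word using the previous thoughts
  C. How each node is evaluated: does the word get in the way of the words for the other clues? [sure/maybe/impossible]
    ● If other words can no longer be filled in (impossible), go back to the previous thought
  D. How the tree is searched: depth-first search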
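
Depth-first search with "impossible" pruning and backtracking can be sketched on a toy 2x2 grid. This is an illustrative sketch, not the authors' implementation: `tot_dfs`, `ROW_CANDIDATES`, and `COLUMN_WORDS` are hypothetical, and the rule-based `judge` stands in for the LLM evaluator that labels states sure/maybe/impossible.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Rows = Tuple[str, ...]

# Hypothetical toy puzzle: candidate words per row, and valid column words.
ROW_CANDIDATES = [["at", "on"], ["to", "no"]]
COLUMN_WORDS = {"as", "to", "on", "no"}


def propose(rows: Rows) -> List[Rows]:
    """Fill the next empty row with each candidate word."""
    return [rows + (w,) for w in ROW_CANDIDATES[len(rows)]]


def judge(rows: Rows) -> str:
    """'impossible' if some column can no longer become a valid word."""
    columns = ["".join(col) for col in zip(*rows)]
    ok = all(any(w.startswith(c) for w in COLUMN_WORDS) for c in columns)
    return "maybe" if ok else "impossible"


def tot_dfs(
    state: Rows,
    propose: Callable[[Rows], Sequence[Rows]],
    judge: Callable[[Rows], str],
    is_goal: Callable[[Rows], bool],
) -> Optional[Rows]:
    if is_goal(state):
        return state
    for child in propose(state):
        if judge(child) == "impossible":
            continue  # prune this thought
        found = tot_dfs(child, propose, judge, is_goal)
        if found is not None:
            return found
    return None  # dead end: the caller backtracks to its next candidate


solution = tot_dfs((), propose, judge, is_goal=lambda rows: len(rows) == 2)
```

Here the first row candidate `"at"` passes the check on its own but admits no valid second row, so the search backtracks and tries `"on"`.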

Slide 46

Slide 46 text

ToT setup for Mini Crosswords

Slide 47

Slide 47 text

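Results on Mini Crosswords
● ToT performs considerably better than IO and CoT
● Performance improves further under the oracle setting for the thought-evaluation step (best state), so the evaluation has room for improvement
● With the current evaluation, combinations of candidate word and clue that are actually solvable are sometimes labeled impossible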

Slide 48

Slide 48 text

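Tree of Thoughts: Deliberate Problem Solving with Large Language Models
● Proposes "Tree of Thoughts", a framework that makes the intermediate steps of problem solving (thoughts) searchable
● Considers multiple different reasoning paths and self-evaluates to decide the next move (deliberation)
● Substantially improves performance on the three proposed tasks that require lookahead and search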