The risk of entering unpublished paper information into a tool

- In the case of OpenAI (ChatGPT):
  - Free plan (UTokyo plan): input data may be used to improve the system.
  - Paid plan: input data is not retained on OpenAI's servers; in other words, it is not used for model development.
  - From the privacy policy (https://openai.com/privacy/): "Communication Information: If you communicate with us, we may collect your name, contact information, and the contents of any messages you send…"
- In the case of GPT-3.5 (the predecessor of ChatGPT), a dataset was built from user-submitted inputs; it was used to train the models, and examples even appear in the paper [Ouyang+, 2022].
  - Example of a risky prompt: "Please polish the English wording of the following paper abstract."

Excerpt from Ouyang et al. (2022), Section 3 "Methods and experimental details":

3.1 High-level methodology
Our methodology follows that of Ziegler et al. (2019) and Stiennon et al. (2020), who applied it in the stylistic continuation and summarization domains. We start with a pretrained language model (Radford et al., 2019; Brown et al., 2020; Fedus et al., 2021; Rae et al., 2021; Thoppilan et al., 2022), a distribution of prompts on which we want our model to produce aligned outputs, and a team of trained human labelers (see Section 3.4 for details). We then apply the following three steps (Figure 2).

Step 1: Collect demonstration data, and train a supervised policy. Our labelers provide demonstrations of the desired behavior on the input prompt distribution (see Section 3.2 for details on this distribution). We then fine-tune a pretrained GPT-3 model on this data using supervised learning.

Step 2: Collect comparison data, and train a reward model. We collect a dataset of comparisons between model outputs, where labelers indicate which output they prefer for a given input. We then train a reward model to predict the human-preferred output.

Step 3: Optimize a policy against the reward model using PPO. We use the output of the RM as a scalar reward. We fine-tune the supervised policy to optimize this reward using the PPO algorithm (Schulman et al., 2017).

Steps 2 and 3 can be iterated continuously; more comparison data is collected on the current best policy, which is used to train a new RM and then a new policy. In practice, most of our comparison data comes from our supervised policies, with some coming from our PPO policies.

3.2 Dataset
Our prompt dataset consists primarily of text prompts submitted to the OpenAI API, specifically those using an earlier version of the InstructGPT models (trained via supervised learning on a subset of our demonstration data) on the Playground interface. Customers using the Playground were informed that their data could be used to train further models via a recurring notification any time InstructGPT models were used. In this paper we do not use data from customers using the API in production. We heuristically deduplicate prompts by checking for prompts that share a long common prefix, and we limit the number of prompts to 200 per user ID. We also create our train, validation, and test splits based on user ID, so that the validation and test sets contain no data from users whose data is in the training set. To avoid the models learning potentially sensitive customer details, we […]

[The screenshot of the paper also contains fragments of its Table 1: part of the distribution of API use-case categories (Other 3.5%, Closed QA 2.6%, Extract 1.9%) and an example prompt ending in '{summary} """ This is the outline of the commercial for that play:'.]
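Step 2 in the excerpt trains a reward model from pairwise comparisons. A minimal sketch of the standard pairwise objective used in this line of work (maximize log-sigmoid of the reward gap between the preferred and the rejected completion) is shown below; the function name, tensor shapes, and the toy data are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(reward_chosen: torch.Tensor,
                         reward_rejected: torch.Tensor) -> torch.Tensor:
    """Comparison loss for a reward model: -log sigmoid(r_chosen - r_rejected).

    Both arguments are scalar rewards (shape [batch]) that the RM assigns to
    the labeler-preferred and the less-preferred completion of the same prompt.
    """
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage: random scores stand in for real reward-model outputs.
r_preferred = torch.randn(8, requires_grad=True)
r_rejected = torch.randn(8, requires_grad=True)
loss = pairwise_reward_loss(r_preferred, r_rejected)
loss.backward()
print(float(loss))
```

In the full paper, all comparisons collected for the same prompt are batched together; the sketch above omits that batching detail.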
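Step 3 then optimizes the supervised policy against the RM's scalar reward with PPO (Schulman et al., 2017). The sketch below shows only the generic PPO clipped surrogate loss, not the paper's full objective (which additionally penalizes KL divergence from the supervised policy and mixes in pretraining gradients); the variable names and toy rollout statistics are assumptions.

```python
import torch

def ppo_clip_loss(logp_new: torch.Tensor,
                  logp_old: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    """Generic PPO clipped surrogate loss (to be minimized).

    logp_new / logp_old are log-probabilities of the sampled completions under
    the current policy and the rollout (old) policy; the advantages would be
    derived from the reward model's scalar reward minus a value baseline.
    """
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Toy usage with random numbers standing in for rollout statistics.
lp_new = torch.randn(16, requires_grad=True)
lp_old = torch.randn(16)
adv = torch.randn(16)
print(float(ppo_clip_loss(lp_new, lp_old, adv)))
```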
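Section 3.2 describes three preprocessing rules: heuristic deduplication of prompts that share a long common prefix, a cap of 200 prompts per user ID, and train/validation/test splits made at the user level. A rough sketch under assumed parameters (the prefix length, split fractions, and the hash-based user assignment are illustrative, not the paper's exact procedure) could look like this:

```python
import hashlib
from collections import defaultdict

PREFIX_LEN = 200     # assumed length for the "long common prefix" check
MAX_PER_USER = 200   # cap of 200 prompts per user ID, as stated in the paper

def dedup_and_cap(prompts_by_user):
    """Drop prompts sharing a long prefix with an earlier prompt, then cap per user."""
    kept = defaultdict(list)
    for user_id, prompts in prompts_by_user.items():
        seen_prefixes = set()
        for prompt in prompts:
            prefix = prompt[:PREFIX_LEN]
            if prefix in seen_prefixes:
                continue  # heuristic duplicate
            seen_prefixes.add(prefix)
            kept[user_id].append(prompt)
            if len(kept[user_id]) >= MAX_PER_USER:
                break
    return kept

def split_by_user(prompts_by_user, val_pct=5, test_pct=5):
    """Assign each user wholly to one split so no user appears in two splits."""
    splits = {"train": [], "valid": [], "test": []}
    for user_id, prompts in prompts_by_user.items():
        bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
        if bucket < test_pct:
            splits["test"].extend(prompts)
        elif bucket < test_pct + val_pct:
            splits["valid"].extend(prompts)
        else:
            splits["train"].extend(prompts)
    return splits

# Toy usage: the second prompt of user_a is dropped as a prefix duplicate.
raw = {"user_a": ["Summarize this abstract: ...", "Summarize this abstract: ..."],
       "user_b": ["Translate to French: bonjour"]}
print(split_by_user(dedup_and_cap(raw)))
```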