Slide 1

Slide 1 text

ACL2020 Category Survey: Sentiment Analysis

Slide 2

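Slide 2 text

About me
内橋 堅志 (uchi_k, @__uchi_k__)
Representative, yuni, inc.
Under the banner of "supporting data-driven product development," we take on contract work in UGC analysis and marketing data analysis, and run xSleep, our own SleepTech brand of personalized mattresses.
Formerly: Kyoto University Graduate School of Informatics, MITOU 2016, freelance, Machine Learning Engineer at FreakOut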

Slide 3

Slide 3 text

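ACL2020
• My impression is that papers on generation and on graphs increased considerably
• Almost every paper mentions pretrained language models such as BERT and RoBERTa
• Metrics are being reconsidered from the standpoint of reproducibility and practical application
• Even the best paper was about defining something like test cases for NLP tasks and checking the pass rate
• A return to knowledge graphs: more work on operations over graphs, graph structure, and learning on graphs
These are my personal impressions.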

Slide 4

Slide 4 text

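Sentiment Analysis
[Chart categories: aspect-based sentiment classification, emotion-cause pair extraction, aspect identification, preprocessing, stance detection, persuasiveness analysis, summarization, comparative preference classification, other]
• Much of the work mines user-generated content (UGC) such as review comments, and the field has a lot of research aimed at industrial application
• UGC analysis usually means recognizing sentiment in the context of a product category or usage situation → many papers on aspect-based sentiment classification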

Slide 5

Slide 5 text

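Sentiment Analysis
• The problems raised most often in sentiment analysis fall into three patterns: too little data, pipelines that stitch together individual tasks without accounting for their interactions, and responses to new problems
• Data scarcity is addressed by using resources from another language, using another domain, using external knowledge graphs, efficient annotation, and so on
• Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning: for low-resource languages, proposes an unsupervised cross-lingual sentiment classification model that uses unsupervised machine translation and a language discriminator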

Slide 6

Slide 6 text

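Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning
#Example of handling data scarcity: using resources from another language
Hongliang Fei (Baidu Research), Ping Li (Baidu Research), ACL
For low-resource languages, proposes an unsupervised cross-lingual sentiment classification model that uses unsupervised machine translation and a language discriminator.
• Learns a language-invariant feature space through language-adversarial training with a translator and a language discriminator
• The novelty is that it needs neither labeled resources in the target language nor cross-lingual resources paired with the source language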

Slide 7

Slide 7 text

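Sentiment analysis
• "Stitching together individual tasks" means, for example, that aspect-based sentiment analysis is split into three stages: aspect term extraction, opinion term extraction, and sentiment analysis
• Chaining tasks like beads on a string means knowledge is not shared between tasks and their interactions are not learned well. Responses included end-to-end methods built on transformers and data structures such as graph attention
• Responses to new problems include extracting opinions of both polarities on a debate, to counter the echo chamber effect on social media that makes opposing views hard to reach, and estimating the target and type of sexist posts (for example, whether a post recounts an experience or is an analogy)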

Slide 8

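Slide 8 text

Papers introduced
※ Includes material I presented at other paper-reading groups
• A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks
◦ Asks whether the preprocessing commonly done for emotion recognition really helps, or whether it can actually do harm
• Heterogeneous Graph Neural Networks for Extractive Document Summarization
◦ Highly extensible extractive document summarization that uses a heterogeneous graph to represent relations among words, sentences, and concepts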

Slide 9

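Slide 9 text

A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks
#abstract
Nastaran Babanejad (Department of Electrical Engineering and Computer Science) et al., ACL
Examines how preprocessing affects word embeddings, particularly for emotion-recognition tasks, and checks whether the commonly used experimental setup is actually sound.
• Pretrained word embeddings are the default choice, but can emotion recognition really be solved when embeddings exist in which the pair "happy" and "sad" is more similar than the pair "happy" and "joy"?
• Isn't how preprocessing such as stopwords, negation, POS, and lemmatization is used what really matters?
• The aim is to measure how strongly preprocessing affects word embeddings and to revisit the conventional experimental setup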

Slide 10

Slide 10 text

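#distributional hypothesis #word embedding
Limits of word embeddings based on the distributional hypothesis
The Distributional Hypothesis is that words that occur in the same contexts tend to have similar meanings [Harris, 1954].
• The hypothesis: words that frequently appear in similar contexts are semantically similar, so they also end up close together in the embedding space
• The distributional hypothesis is one way of determining word meaning. It obtains meaning statistically, through inference-based models such as word2vec or count-based methods that simply apply dimensionality reduction to co-occurrence statistics
• It can produce counterintuitive similarities, such as the pair "happy" and "sad" scoring higher than the pair "happy" and "joy," so word embeddings need to be adjusted per task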
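The counterintuitive similarity mentioned above can be reproduced with a tiny count-based model. The following is only a sketch under stated assumptions: the toy corpus, the window size, and the helper names `cooccurrence_vectors` and `cosine` are made up for illustration and are not from the paper.

```python
from collections import Counter
from math import sqrt

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Hypothetical toy corpus: "happy" and "sad" share contexts, "joy" does not.
TOY_CORPUS = [
    "i feel happy today",
    "i feel sad today",
    "we feel happy now",
    "we feel sad now",
    "pure joy filled the room",
]

vectors = cooccurrence_vectors(TOY_CORPUS)
sim_happy_sad = cosine(vectors["happy"], vectors["sad"])
sim_happy_joy = cosine(vectors["happy"], vectors["joy"])
```

Because the antonyms occur in identical contexts here, `sim_happy_sad` comes out higher than `sim_happy_joy`, which is the failure mode the slide describes.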

Slide 11

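Slide 11 text

A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks
#abstract
Nastaran Babanejad (Department of Electrical Engineering and Computer Science) et al., ACL
Examines how preprocessing affects word embeddings, particularly for emotion-recognition tasks, and checks whether the commonly used experimental setup is actually sound.
• Task-specific fine-tuning and model improvements matter, but prior work shows that the effect of preprocessing and hyperparameters cannot be ignored
• The paper tries preprocessing at each timing, before and after learning word embeddings on the training data, and both with and without matching preprocessing on the test data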

Slide 12

Slide 12 text

Related work (sentiment, emotion)
• Emotion-Cause Pair Extraction: A New Task to Emotion Analysis in Texts
◦ Rui Xia (School of Computer Science and Engineering) et al., ACL
◦ Proposes a new task of extracting emotion-cause pairs, with multi-task learning over the emotion and the cause
• NL-FIIT at IEST-2018: Emotion Recognition utilizing Neural Networks and Multi-level Preprocessing
◦ Samuel Pecar (Slovak University of Technology) et al., EMNLP
◦ Studies the importance of preprocessing when working with user-generated content. Careful recognition of emoticons and emoji in particular raised the score

Slide 13

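Slide 13 text

#recent study #ugc #word2vec
Related work (word2vec, UGC)
• Improving Distributional Similarity with Lessons Learned from Word Embeddings
◦ Omer Levy (Bar-Ilan University) et al., ACL
◦ Shows that for word embeddings, count-based methods can beat inference-based methods such as word2vec, depending on hyperparameter tuning
◦ Raises the point that the method matters, but so does the discussion of hyperparameters
• NL-FIIT at IEST-2018: Emotion Recognition utilizing Neural Networks and Multi-level Preprocessing
◦ Samuel Pecar (Slovak University of Technology) et al., EMNLP
◦ Studies the importance of preprocessing when working with user-generated content. Careful recognition of emoticons and emoji in particular raised the score
◦ The paper shows that doing the steps categorized as preprocessing thoroughly leads to better scores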

Slide 14

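Slide 14 text

#recent study #preprocessing #emotion
Related work (preprocessing)
• On stopwords, filtering and data sparsity for sentiment analysis of Twitter
◦ Hassan Saif (The Open University) et al., LREC
◦ Whether stopword removal helps depends heavily on how the word list is built and on the task. For Twitter sentiment, the paper shows that the standard approach does more harm than good
◦ Shows that naively applying standard preprocessing methods is sometimes not enough
• A comparative evaluation of pre-processing techniques and their interactions for twitter sentiment analysis
◦ Symeon Symeonidis, Expert Systems with Applications
◦ After trying a range of preprocessing techniques, lemmatization, number removal, and contraction replacement contributed most to sentiment analysis scores
◦ A comprehensive score-based evaluation of preprocessing applied to the classification data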

Slide 15

Slide 15 text

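#key points
Why I am introducing this paper
• There are plenty of interesting new methods, but I felt that few of them deliver accuracy in production
• In the end, the choice of preprocessing and differences in how it is done strongly affect the score, yet almost no papers discuss this
• I thought it would be a good chance to organize preprocessing knowledge that is mostly tacit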

Slide 16

Slide 16 text

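#key points
What was examined
• What effect does integrating preprocessing into word embedding training have?
• Which preprocessing steps are effective for affective tasks?
• Do the results improve on pretrained embeddings?
What was done
• On affective tasks using two training corpora and nine test datasets, compared applying preprocessing to the training data, to the classification data, and to both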

Slide 17

Slide 17 text

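#preprocessing #pipeline
The preprocessing flow in NLP
• Cleaning: removal of tags and symbols, punctuation
• Segmentation: morphological analysis, dictionary additions, dependency parsing
• Normalization: number replacement, recognition of emoticons and similar tokens, spell check, spelling variants, lowercasing, replacement with a canonical term, abbreviations
• Compression: lemmatization, stemming, negation, ontology, stopword removal, POS
• Vectorization: CBOW, skip-gram, BERT, coverage checks, bringing the vocabulary closer to the classification data, etc.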
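The stages above can be sketched as a chain of small functions. This is a minimal illustration, not the paper's pipeline: the stopword list, the `<num>` placeholder, and every function name are assumptions chosen for the example.

```python
import re

# Hypothetical toy stopword list; real lists are much longer and task dependent.
STOPWORDS = {"the", "a", "an", "is", "was", "to", "of"}

def clean(text):
    """Cleaning: drop HTML-like tags, then punctuation (apostrophes are kept)."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"[^\w\s']", " ", text)

def tokenize(text):
    """Segmentation: whitespace split (a stand-in for morphological analysis)."""
    return text.split()

def normalize(tokens):
    """Normalization: lowercase and replace digit runs with a placeholder."""
    return [re.sub(r"\d+", "<num>", token.lower()) for token in tokens]

def compress(tokens):
    """Compression: remove stopwords."""
    return [token for token in tokens if token not in STOPWORDS]

def preprocess(text):
    """Run the stages in order and return the remaining tokens."""
    return compress(normalize(tokenize(clean(text))))

result = preprocess("<b>The</b> movie was GREAT, 10 out of 10!")
```

Keeping each stage as a separate function makes it easy to switch one off, which is the kind of comparison the paper's experiments depend on.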

Slide 18

Slide 18 text

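#preprocessing #negation
Negation
• Building an antonym dictionary
◦ The antonym dictionary is built from the WordNet corpus
◦ If no antonym is found, or exactly one, the result is used as is. If there are several, either the antonym with the highest frequency in the ukWaC corpus is chosen or one is picked at random
• Replacing negated words with antonyms
◦ When a negation word is found, the following word is extracted and looked up in the antonym dictionary. If an antonym is found, the negation word and the negated word are replaced by it
◦ For example, in a sentence containing `not happy`, the negation word (`not`) and the word it applies to (`happy`) are identified. The antonym of `happy` (`sad`) is looked up in the antonym dictionary, and `not happy` is replaced with `sad`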
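The replacement step can be sketched as follows. The two small dictionaries are hypothetical stand-ins: the paper builds its antonym dictionary from WordNet and resolves ties with ukWaC frequencies, neither of which is reproduced here.

```python
# Hypothetical stand-ins for a negation lexicon and a WordNet-derived antonym dictionary.
NEGATION_WORDS = {"not", "never", "no"}
ANTONYMS = {"happy": "sad", "good": "bad", "like": "dislike"}

def replace_negations(tokens, antonyms=ANTONYMS, negations=NEGATION_WORDS):
    """Replace 'negation word + word' with the word's antonym when one is known."""
    out = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        if token in negations and following in antonyms:
            out.append(antonyms[following])  # e.g. "not happy" -> "sad"
            i += 2                           # consume both tokens
        else:
            out.append(token)                # no antonym known: leave untouched
            i += 1
    return out

replaced = replace_negations("i am not happy today".split())
unchanged = replace_negations("i am not sure".split())
```

When the negated word has no dictionary entry, as with `sure` above, the tokens pass through unchanged, matching the "use as is" rule on the slide.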

Slide 19

Slide 19 text

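#corpus #training #dataset
Training Corpus
Preprocessing is applied to two corpora that differ in size and character.
• News
◦ … articles from … American publications, … to …
• Wikipedia
◦ Made up of … Wikipedia articles, a corpus about … times larger than News
• Overall, stopword removal and POS barely change the vocab size but greatly reduce the corpus size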

Slide 20

Slide 20 text

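#corpus #evaluation #dataset
Evaluating Corpus
Evaluated on three tasks: sentiment analysis, emotion classification, and sarcasm detection.
Sentiment analysis (positive/negative sentiment)
• IMDB
◦ … movie reviews. Positive/negative ratio …
• SemEval
◦ Approx. … tweets. Positive/negative ratio …
• Airline
◦ Approx. … tweets about … airlines
Emotion Detection (emotion class classification)
• ISEAR
◦ Approx. … personal accounts of emotion-evoking situations
• Alm
◦ Approx. … fairy tales
• SSEC
◦ Approx. … tweets, a re-annotation of SemEval
Sarcasm Detection (detecting sarcasm)
• Onion
◦ Approx. … news headlines collected from media that deal in satire and media that do not
• IAC
◦ Approx. … utterance-response pairs
• Reddit
◦ Approx. … × 10,000 Reddit posts labeled by their authors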

Slide 21

Slide 21 text

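#result
F-scores when word embeddings are built after preprocessing the News corpus
• Negation was the most effective on every dataset
• Even when combinations of the other preprocessing steps are included, negation alone always ranked … in score
• Simply stacking preprocessing steps does not always help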

Slide 22

Slide 22 text

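#result
• (Standard) stopwords and stemming may look useful on their own, but they do not contribute to the score when combined with other preprocessing
• Applying all preprocessing steps gives the same score as negation alone, or slightly lower (Onion, Reddit, SSEC)
• Stopwords and POS both shrink the corpus substantially, but POS does not reduce the score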

Slide 23

Slide 23 text

#result
F-score comparison of CBOW, Skip-gram, and BERT using the Wikipedia corpus
• The same trend appeared with Wikipedia, and more strongly

Slide 24

Slide 24 text

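#result #preprocess #postprocess
Comparison of applying preprocessing to the training corpus (pre) and to the classification dataset (post)
• As intuition suggests, post alone gave the lowest score in every case
• There was no large difference in score between pre and both, which shows that pre matters most
• When word embeddings are given, I think people usually adapt the classification data to them, so this is a surprising result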
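The pre, post, and both settings differ only in which data the preprocessing touches. The sketch below makes that explicit; `apply_setting` and `toy_preprocess` are hypothetical names, and the toy preprocessing merely stands in for the paper's actual steps.

```python
def apply_setting(embedding_corpus, classification_data, preprocess, setting):
    """Return (corpus used to train embeddings, data fed to the classifier)."""
    if setting not in {"none", "pre", "post", "both"}:
        raise ValueError(f"unknown setting: {setting}")
    if setting in {"pre", "both"}:
        # pre: preprocess the corpus the word embeddings are learned from
        embedding_corpus = [preprocess(text) for text in embedding_corpus]
    if setting in {"post", "both"}:
        # post: preprocess the labeled classification dataset
        classification_data = [preprocess(text) for text in classification_data]
    return list(embedding_corpus), list(classification_data)

def toy_preprocess(text):
    """Hypothetical stand-in for negation handling and similar steps."""
    return text.lower().replace("not happy", "sad")

corpus = ["I am NOT happy"]
reviews = ["Not happy at all"]
pre_corpus, pre_reviews = apply_setting(corpus, reviews, toy_preprocess, "pre")
post_corpus, post_reviews = apply_setting(corpus, reviews, toy_preprocess, "post")
```

Under "pre" the classifier still sees raw text while the embeddings were trained on processed text, which is why the slide finds the strong result for pre surprising.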

Slide 25

Slide 25 text

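#result #compare with SoTA
Evaluation of the proposed models against SoTA baselines
• On every task, the group of proposed methods beats SoTA
• BERT being the strongest is a given, so the comparison is slightly unfair, but the proposed methods as a whole win in many cases (IMDB, IAC, Onion, Reddit, SSEC)
• SoTA pretrained models use far larger corpora than the proposed methods, which shows how much pre matters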

Slide 26

Slide 26 text

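#result #relative improvement
Absolute F-scores and relative improvement over basic preprocessing
• The improvement is slightly larger on the multi-class classification task than on the two binary tasks, sentiment analysis and sarcasm detection
• My feeling is that the differences are still marginal unless the comparison covers more datasets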

Slide 27

Slide 27 text

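#key points
Summary
• Preprocessing at the word-embedding stage was shown to be the most effective timing
• On its own, negation is the most effective, while standard stopwords and stemming often lower the score
• Applying task-appropriate preprocessing at the right time can outscore word embeddings pretrained on a huge corpus
• Much of this was already known tacitly, but careful verification turned it into systematic knowledge
• Honestly, I felt the experiments are still lacking in places, but more comprehensive experiments could make this very interesting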

Slide 28

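Slide 28 text

Papers introduced
※ Includes material I presented at other paper-reading groups
• A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks
◦ Asks whether the preprocessing commonly done for emotion recognition really helps, or whether it can actually do harm
• Heterogeneous Graph Neural Networks for Extractive Document Summarization
◦ Highly extensible extractive document summarization that uses a heterogeneous graph to represent relations among words, sentences, and concepts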

Slide 29

Slide 29 text

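Heterogeneous Graph Neural Networks for Extractive Document Summarization
#abstract
Danqing Wang (Shanghai Key Laboratory of Intelligent Information Processing, Fudan University) et al., ACL
Introduces a heterogeneous graph to represent relations between sentences in extractive document summarization, achieves SoTA, and examines properties such as extensibility.
• In document summarization, modeling the relations between sentences is crucial. Earlier work modeled them as a sequence with RNN-based methods
• Recent research indicates that graph structures suit the semantic structure of documents better than sequences, but no good graph structure had been proposed
• Proposes a heterogeneous graph structure with word nodes and sentence nodes and achieves SoTA on both single-document and multi-document summarization. Extensibility is also discussed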

Slide 30

Slide 30 text

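#abstract #extractive document summarization
Extractive document summarization
• The task of pulling relevant sentences out of the source document and reassembling them as a summary
• Summarization comes in extractive form, abstractive form (which abstracts the representation and writes the summary from scratch), and hybrids of the two
Early work: SummaRuNNer
• Each sentence in the document is vectorized with a bidirectional LSTM, producing a vector that captures the sentence's meaning (word layer)
• The relations among these vectors are then learned with another bidirectional LSTM (sentence layer)
• Outputs the probability of extracting each sentence
This paper: defines a heterograph that represents relations between sentences by way of words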

Slide 31

Slide 31 text

Heterogeneous Graph
#abstract #heterogeneous graph
Many real-world graphs are heterogeneous
• Real-world graphs are made up of many types of nodes and edges living in different feature spaces

Slide 32

Slide 32 text

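#model overview
Proposed Graph
Defines a heterograph that represents relations between sentences by way of words
• Rather than building the graph from sentence nodes alone, it adds nodes that act as intermediaries connecting sentences
• Advantages: word nodes can be updated with sentence information, and the graph is extensible, for example by adding other node types
• This paper uses words as the smallest semantic unit. Going more abstract and using word senses or concepts as a node type also looks interesting
• Procedure: initialize the graph → update with GAT → solve a classification problem, from the sentence features, of whether each sentence is added to the summary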

Slide 33

Slide 33 text

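#model overview #learning step
Proposed Graph
Training procedure and model overview
• The graph initializer applies CNNs with different kernel sizes to each sentence to extract n-gram features (local features), then a BiLSTM to extract sentence-level features (global features)
• tf-idf is used as the edge feature, as information about the relation between a word node and a sentence node
• Graph features are updated with a Graph Attention Network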
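A rough sketch of how word-to-sentence edges could carry tf-idf weights follows. The exact tf-idf variant is an assumption: here each sentence is treated as a "document" and a smoothed logarithmic idf is used, which may differ from the paper's definition. The function name `build_word_sentence_edges` is likewise hypothetical.

```python
from collections import Counter
from math import log

def build_word_sentence_edges(sentences):
    """Return {(word, sentence_index): tf-idf weight} for a tokenized document."""
    n_sentences = len(sentences)
    # Number of sentences each word appears in (its degree in the bipartite graph).
    sentence_freq = Counter(word for sent in sentences for word in set(sent))
    edges = {}
    for idx, sent in enumerate(sentences):
        counts = Counter(sent)
        for word, count in counts.items():
            tf = count / len(sent)
            idf = log(n_sentences / sentence_freq[word]) + 1.0  # assumed smoothing
            edges[(word, idx)] = tf * idf
    return edges

document = [
    ["graph", "neural", "network"],
    ["graph", "attention"],
    ["summary", "of", "graph"],
]
edges = build_word_sentence_edges(document)
```

An edge exists only where a word actually occurs in a sentence, so two sentences are connected exactly when they share a word node. A word that appears everywhere, like `graph` here, gets a lower weight than a rare one.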

Slide 34

Slide 34 text

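#model overview #graph attention network
Graph Attention Network
Defines attention on a graph
• Attention is computed from vectors obtained by applying separate weights to the node itself and to its neighbors, and is then used to aggregate from the neighboring nodes
• Roughly: the distance function used for graph aggregation is defined as an attention that does not depend on the graph structure, and is learned
[Equation labels: node features, neighboring nodes, function that computes attention, attention, attention-weighted aggregation]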
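The attention-weighted aggregation described above can be sketched for a single node and a single head. This is a simplified illustration in plain Python, not the paper's implementation: the weight matrices `w_q`, `w_k`, `w_v` and the scoring vector `a` are assumed parameters, and the final nonlinearity, multi-head concatenation, and edge features are omitted.

```python
from math import exp

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_aggregate(h_i, neighbors, w_q, w_k, w_v, a):
    """Single-head graph attention for node i over a non-empty neighbor list."""
    query = matvec(w_q, h_i)
    scores = []
    for h_j in neighbors:
        key = matvec(w_k, h_j)
        # Score = LeakyReLU(a . [query ; key]), with ';' meaning concatenation.
        scores.append(leaky_relu(sum(x * y for x, y in zip(a, query + key))))
    top = max(scores)
    exps = [exp(s - top) for s in scores]      # numerically stable softmax
    total = sum(exps)
    alphas = [e / total for e in exps]         # attention over the neighbors
    values = [matvec(w_v, h_j) for h_j in neighbors]
    aggregated = [
        sum(alpha * value[d] for alpha, value in zip(alphas, values))
        for d in range(len(values[0]))
    ]
    return alphas, aggregated

identity = [[1.0, 0.0], [0.0, 1.0]]
alphas, aggregated = gat_aggregate(
    h_i=[1.0, 0.0],
    neighbors=[[1.0, 0.0], [0.0, 1.0]],
    w_q=identity, w_k=identity, w_v=identity,
    a=[0.0, 0.0, 1.0, 0.0],                    # toy scorer: looks only at key[0]
)
```

The attention weights depend only on node features, not on a fixed adjacency weight, which is the sense in which the slide calls the attention independent of graph structure.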

Slide 35

Slide 35 text

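#dataset #train test split
Dataset
Experiments use two datasets for single-document summarization and one for multi-document summarization.
• CNN/DailyMail (QA data)
◦ The most widely used benchmark dataset for single-document summarization
◦ train / valid / test sizes: …
• NYT50
◦ A single-document summarization dataset collected from the New York Times Annotated Corpus (Sandhaus)
◦ train / valid / test sizes: …
• Multi-News
◦ A multi-document summarization dataset
◦ Each example has a human-written summary for a set of … to … documents
◦ train / valid / test sizes: …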

Slide 36

Slide 36 text

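#experiment #setting #hyper-parameter #preprocessing
Setting / Hyper-parameters
Preprocessing
• Remove stopwords and punctuation
• Set the maximum input document length to … sentences
• Remove the lowest … by tf-idf
Graph
• Limit the vocabulary to … words
• Embed with …-dimensional GloVe
• Sentence vector size initialized to …
• Edge feature dimension initialized to …
• … heads
Experiment
• Batch size …, learning rate …e-…, Adam
• Early stopping if the loss does not decrease for … epochs
• Select the top … sentences for single-document summarization and the top … sentences for multi-document summarization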

Slide 37

Slide 37 text

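#methods #extractor
Methods
• Ext-BiLSTM
◦ CNN layers + BiLSTM
◦ Treats the document as a sequence of sentences and learns the relations between them
• Ext-Transformer
◦ A …-layer Transformer
◦ Learns pairwise interactions among all sentences
◦ Can be viewed as a fully connected graph at the sentence level
• HSG (HeterSumGraph)
◦ The proposed method. Models sentence-word-sentence relations as a graph
◦ HSG selects summary sentences by node classification. A variant was also tested that uses trigram blocking to exclude sentences with similar trigrams and reduce redundancy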
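Trigram blocking, as used in the HSG variant above, can be sketched as a greedy selection loop. This follows the common formulation (skip a sentence if it shares any trigram with the sentences already chosen), which is an assumption about the exact rule used in the paper. The function names are hypothetical.

```python
def trigrams(tokens):
    """Set of consecutive token triples in a sentence."""
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}

def select_with_trigram_blocking(sentences, scores, k):
    """Pick up to k sentences by score, skipping any that repeat a chosen trigram."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, seen = [], set()
    for i in order:
        current = trigrams(sentences[i])
        if current & seen:
            continue            # overlaps an already selected sentence: block it
        chosen.append(i)
        seen |= current
        if len(chosen) == k:
            break
    return sorted(chosen)       # restore document order for the final summary

sentences = [
    "the cat sat on the mat".split(),
    "the cat sat on a chair".split(),
    "dogs bark loudly at night".split(),
]
selected = select_with_trigram_blocking(sentences, scores=[0.9, 0.8, 0.5], k=2)
```

The second sentence has the second-highest score but is blocked because it repeats the trigram "the cat sat", so the lower-scoring third sentence is chosen instead.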

Slide 38

Slide 38 text

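#result #CNN/DailyMail
Result (single-document summarization: CNN/DailyMail)
Results for single-document summarization on CNN/DailyMail. The scores exceed all existing methods.
[Table row groups: label, previous study, proposed method]
• LEAD is the baseline and ORACLE is the upper bound
• For HER, which frames the task as a contextual bandit problem, both the with-policy and without-policy settings were tested, and the proposed method wins against both
• The scores are higher than every existing method that does not use BERT
• Evaluated with ROUGE-1, ROUGE-2, and ROUGE-L, which score similarity by 1-grams, 2-grams, and the longest matching sequence respectively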
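For reference, the three metrics can be sketched in a few lines. This is a simplified, recall-only illustration. Published results normally come from the official ROUGE toolkit, which also handles stemming, multiple references, and F-measures, so numbers from this sketch are not comparable to the table.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    """Fraction of reference n-grams that also appear in the candidate."""
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    total = sum(ref.values())
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total if total else 0.0

def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length."""
    return lcs_length(candidate, reference) / len(reference) if reference else 0.0

reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()
r1 = rouge_n_recall(candidate, reference, 1)
r2 = rouge_n_recall(candidate, reference, 2)
rl = rouge_l_recall(candidate, reference)
```

A single substituted word costs one unigram but two bigrams, which is why ROUGE-2 is the most sensitive of the three to local wording changes.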

Slide 39

Slide 39 text

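#result #CNN/DailyMail
Result (single-document summarization: CNN/DailyMail)
Comparison with methods based on sentence sequences and fully connected graphs shows the usefulness of the heterograph structure.
[Table row groups: Ext method, proposed method]
• Scores are higher than Ext-BiLSTM and Ext-Transformer, which use a sentence sequence and a fully connected graph
• Using a heterograph effectively removes unnecessary connections between sentences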

Slide 40

Slide 40 text

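#result #NYT50
Result (single-document summarization: NYT50)
Results for single-document summarization on NYT50. The trend is essentially the same as on CNN/DailyMail.
[Table row group: proposed method]
• As on CNN/DailyMail, the proposed method beats existing methods
• Why is the version with trigram blocking not ranked first?
→ CNN/DailyMail summaries are concatenated bullet points with little overlap, whereas NYT50 summaries contain repetition, such as key phrases that appear several times. Trigram blocking therefore probably has a hard time scoring well on NYT50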

Slide 41

Slide 41 text

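#ablation #CNN/DailyMail
Ablation
An ablation study on CNN/DailyMail measured the contribution of each module.
• Removing word filtering lowers R-1 and R-L but raises R-2
◦ The benefit of word filtering, being able to focus on especially important word nodes, probably outweighs the drawback of losing bigram information
• Removing the residual connection between GAT layers lowers the score sharply
◦ The residual connection in the GAT layers is theoretically important for aggregating from nodes of another type in a heterograph, so simple concatenation cannot replace it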

Slide 42

Slide 42 text

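#result #multidocument
Result (multi-document summarization)
The proposed method, extended with document nodes, is also tested on multi-document summarization
[Table row group: proposed method]
• Both HSG and HDSG score above existing methods, with an especially large gain for HDSG
• This suggests that adding document nodes is effective for multi-document summarization
• Trigram blocking does not help here, probably for the same reason as before
• The proposed method can be applied to a different task just by adding a node type, so it has strong potential for extension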

Slide 43

Slide 43 text

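#qualitative analysis #degree
Qualitative Analysis
Examined the effect of word node degree
• A high word node degree means the word occurs many times, so it reflects (to some extent) the redundancy of the document
• Word degree and ROUGE are proportional → the more redundant a document, the easier it is to summarize
• With a high degree, information from several sentences can be aggregated, so the model's benefit is presumably felt more strongly
• This suggests that word nodes make it possible to aggregate sentence information and propagate global representations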

Slide 44

Slide 44 text

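#qualitative analysis #source
Qualitative Analysis
Examined the effect of the number of source documents in multi-document summarization
• As the number of documents grows, the baseline improves while the proposed method declines, and the two draw level at …
◦ First can forcibly extract sentences from every document, which secures coverage
◦ As the number of documents grows, it becomes harder to extract a limited number of sentences that cover the gist of all of them
• The performance gap between HETERSUMGRAPH and HETERDOCSUMGRAPH widens as the number of documents grows: the more complex the relations between documents, the greater the benefit of document nodes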

Slide 45

Slide 45 text

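#key points
Summary
• Using a heterograph introduces fine-grained semantic units into document summarization, and its effectiveness for modeling relations between sentences and between documents was confirmed
• The method is highly extensible: it goes from single-document to multi-document summarization just by adding a node type
• It might be interesting to try methods specialized for heterographs (subgraphs defined with metapaths, attention designed for heterographs, and so on)
• The authors say they plan to explore pretrained models such as BERT next
• As the authors also briefly mention, I think the method's advantage would show even more if the word nodes were abstracted up to semantic nodes. Even without that, adding node types looks like something worth trying in many ways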