uchi_k
September 06, 2020
Applying Heterogeneous Graphs to Extractive Document Summarization: Heterogeneous Graph Neural Networks for Extractive Document Summarization
A reading of Heterogeneous Graph Neural Networks for Extractive Document Summarization, a paper accepted at ACL 2020.
Transcript
Heterogeneous Graph Neural Networks for Extractive Document Summarization
About me
uchi_k (@__uchi_k__)
Representative, yuni, inc.
Organizer, nlpaper.challenge
Freelance Machine Learning Engineer / Researcher
Former: graduate school (informatics), MITOU 16, FreakOut Machine Learning Engineer
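nlpaper.challenge
A community of working professionals, students, and researchers who run a range of natural language processing activities (operated mainly by volunteers).
Aiming for full coverage of ACL, we defined survey categories following the ones listed on the official ACL site and split into a team per category to survey the papers. Each team read its share of papers, and we held discussions, LT (lightning talk) sessions, and similar events.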
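ACL2020
Papers on generation and on graphs seem to have increased considerably.
Pretrained language models such as BERT and RoBERTa are mentioned in almost every paper.
Evaluation metrics are being reconsidered, driven by the emphasis on reproducibility and on practical application.
The best paper defined something like test cases for NLP tasks and looked at whether models pass them.
More work is returning to knowledge graphs, performing operations on graphs and learning graph structure.
These are my personal impressions.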
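Heterogeneous Graph Neural Networks for Extractive Document Summarization
#abstract
Danqing Wang (Shanghai Key Laboratory of Intelligent Information Processing, Fudan University) et al., ACL 2020
The paper introduces a heterogeneous graph to represent the relationships between sentences in extractive document summarization, achieves SoTA, and examines properties such as extensibility.
Modeling the relationships between sentences is very important in document summarization. Previously, RNN-based methods modeled them as a sequence.
Recent research has shown that a graph structure suits the semantic structure of a document better than a sequence does, but a good graph structure had not yet been proposed.
The paper proposes a heterogeneous graph structure with word nodes and sentence nodes, achieves SoTA on both single-document and multi-document summarization, and also discusses extensibility.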
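#abstract #extractive document summarization
Extractive document summarization
The task of extracting the relevant sentences from the source document and reassembling them into a summary.
Besides the extractive type, there is the abstractive type, which abstracts the content and writes the summary from scratch, and there are hybrids of the two.
Early work: SummaRuNNer
Each sentence of the document is vectorized with a Bidirectional LSTM, which produces a vector capturing the meaning of the sentence (word layer).
The relationships between these vectors are then learned with another Bidirectional LSTM (sentence layer).
The output is the probability of extracting each sentence.
This paper: defines a hetero graph that represents the relationships between sentences via words.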
#abstract #heterogeneous graph
Heterogeneous Graph
Many real-world graphs are heterogeneous.
Real-world graphs are made up of various types of nodes and edges that live in different feature spaces.
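#model overview
Proposed Graph
Defines a hetero graph that represents the relationships between sentences via words.
Rather than building the graph from sentence nodes alone, the method adds nodes that act as intermediaries connecting the sentences.
Advantages include that word nodes can be updated with sentence information, and that the graph is extensible, for example by adding other node types.
This paper takes the word as the smallest semantic unit. It would also be interesting to abstract further and use, for example, word senses or concepts as a node type.
Procedure: graph initialize → update with GAT → solve a classification problem, from the sentence features, of whether to add each sentence to the summary.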
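#model overview #learning step
Proposed Graph
Training procedure and model overview
In the graph initializer, CNNs with different kernel sizes are applied to each sentence to extract n-gram features (local features), and then a BiLSTM extracts sentence-level features (global features).
TF-IDF is used as the edge feature, as information about the relationship between a word node and a sentence node.
The graph features are updated with a Graph Attention Network.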
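The word-sentence graph with TF-IDF edge features can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `build_word_sentence_graph`, the whitespace tokenizer, and the choice to treat each sentence as one "document" for the idf term are all assumptions made for the example.

```python
import math
from collections import Counter


def build_word_sentence_graph(sentences):
    """Return {(word, sentence_index): weight} edges of a word-sentence bipartite graph.

    Assumption: idf is computed over sentences, and is smoothed with +1 so that
    every edge weight stays positive.
    """
    tokenized = [s.lower().split() for s in sentences]
    n_sents = len(tokenized)
    # Number of sentences in which each word occurs.
    sent_freq = Counter(w for toks in tokenized for w in set(toks))
    edges = {}
    for j, toks in enumerate(tokenized):
        counts = Counter(toks)
        for w, c in counts.items():
            tf = c / len(toks)
            idf = math.log(n_sents / sent_freq[w]) + 1.0
            edges[(w, j)] = tf * idf
    return edges


sentences = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell sharply today",
]
edges = build_word_sentence_graph(sentences)
# Sentences 0 and 1 are connected through the shared word node "cat".
shared = sorted(j for (w, j) in edges if w == "cat")
print(shared)
```

Two sentences never share an edge directly here; they are related only through the word nodes they have in common, which is the intermediary role described on the slide.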
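#model overview #graph attention network
Graph Attention Network
Defines attention on a graph.
Attention is computed from vectors obtained by applying separate weights to the node itself and to its neighbors, and is then used to aggregate from the neighboring nodes.
Roughly speaking, the distance function used in graph aggregation is defined as an attention that does not depend on the graph structure and is learned from data.
Figure labels: node features; neighboring nodes; the function that computes attention; attention-weighted aggregation.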
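The attention and aggregation steps named in the figure labels can be written out as a small single-head example. This is a sketch of a generic graph attention update under stated assumptions, not the paper's exact layer: the names `gat_update`, `W_q`, `W_k`, `W_v`, and `a` are illustrative, and the output nonlinearity, multiple heads, and residual connection are omitted.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_update(h_i, neighbors, W_q, W_k, W_v, a):
    """One single-head attention update of node i from its neighbors' features."""
    q = matvec(W_q, h_i)
    scores = []
    for h_j in neighbors:
        k = matvec(W_k, h_j)
        # Score the concatenation [q; k] with the learned vector a.
        scores.append(leaky_relu(sum(ai * zi for ai, zi in zip(a, q + k))))
    # Softmax over the neighbors only, so attention follows the graph edges.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Attention-weighted sum of the transformed neighbor features.
    values = [matvec(W_v, h_j) for h_j in neighbors]
    agg = [sum(al * v[d] for al, v in zip(alphas, values)) for d in range(len(values[0]))]
    return alphas, agg


identity = [[1.0, 0.0], [0.0, 1.0]]
alphas, agg = gat_update(
    h_i=[1.0, 0.0],
    neighbors=[[1.0, 0.0], [0.0, 1.0]],
    W_q=identity, W_k=identity, W_v=identity,
    a=[0.0, 0.0, 1.0, 0.0],
)
print(alphas, agg)
```

In the word-sentence graph, a sentence node's neighbors are its word nodes and vice versa, so the same update can be applied in both directions.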
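#dataset #train test split
Dataset
Experiments use two datasets for single-document summarization and one for multi-document summarization.
CNN/DailyMail
• The most widely used benchmark dataset for single-document summarization
• Originally question-answering (QA) data
• Split into train, valid, and test data
NYT50
• A single-document summarization dataset collected from the New York Times Annotated Corpus (Sandhaus)
• Split into train, valid, and test data
MultiNews
• A multi-document summarization dataset
• Each example pairs a varying number of source documents with a human-written summary
• Split into train, valid, and test data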
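#experiment #setting #hyper-parameter #preprocessing
Setting & Hyperparameters
Preprocessing
• Remove stop words and punctuation
• Cap the maximum length of an input document at a fixed number of sentences
Graph
• Remove the words with the lowest TF-IDF values
• Limit the vocabulary size
• Embed words with GloVe
• Initialize sentence vectors at a fixed size
• Initialize edge features at a fixed dimensionality
• Multi-head attention
Experimental settings
• Mini-batch training with Adam
• Early stopping when the loss has not decreased for a set number of epochs
• Select the top-scoring sentences, with a different cutoff for single-document and multi-document summarization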
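#methods #extractor
Methods
• Ext-BiLSTM
◦ CNN + BiLSTM
◦ Treats the document as a sequence of sentences and learns the relationships between sentences
• Ext-Transformer
◦ Transformer-based
◦ Learns pairwise interactions between all sentences
◦ Can be regarded as a fully connected sentence-level graph
• HSG (HeterSumGraph)
◦ The proposed method. Models sentence-word-sentence relationships with a graph
◦ HSG selects summary sentences by node classification. A variant that additionally applies trigram blocking, which excludes sentences with overlapping trigrams to reduce redundancy, is also evaluated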
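Trigram blocking is simple enough to show directly. The following is a minimal sketch of the usual greedy form, assuming whitespace tokenization and per-sentence scores that come from the node classifier; `select_with_trigram_blocking` and its parameters are illustrative names, not taken from the paper's code.

```python
def trigrams(sentence):
    """Set of word trigrams in a sentence (empty if it has fewer than 3 tokens)."""
    toks = sentence.lower().split()
    return {tuple(toks[i:i + 3]) for i in range(len(toks) - 2)}


def select_with_trigram_blocking(sentences, scores, k):
    """Pick up to k sentences by descending score, skipping any sentence that
    shares a trigram with an already selected one. Returns indices in document order."""
    seen = set()
    chosen = []
    for idx in sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True):
        tri = trigrams(sentences[idx])
        if tri & seen:
            continue  # redundant with an earlier pick
        chosen.append(idx)
        seen |= tri
        if len(chosen) == k:
            break
    return sorted(chosen)


sentences = [
    "the new policy was announced today",
    "officials said the new policy was overdue",
    "markets reacted calmly",
]
scores = [0.9, 0.8, 0.4]
print(select_with_trigram_blocking(sentences, scores, k=2))  # [0, 2]
```

The second sentence has the second-highest score but repeats the trigram "the new policy", so the lower-scored third sentence is selected instead.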
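#result #CNN/DailyMail
Result (single-document summarization: CNN/DailyMail)
Results of single-document summarization on CNN/DailyMail. The proposed method scores higher than all existing methods that do not use BERT.
Table labels: previous study; proposed method.
LEAD is the baseline and ORACLE is the upper bound.
HER, which formulates summarization as a contextual bandit, was run both with and without its policy, and the proposed method beats both.
Evaluated with ROUGE-1, ROUGE-2, and ROUGE-L, which score 1-gram overlap, 2-gram overlap, and similarity based on the longest matching sequence, respectively.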
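To make the three metrics concrete, here is a small sketch of n-gram overlap and longest-common-subsequence length. It is a simplification under stated assumptions: it computes recall only, uses whitespace tokens, and does no stemming, whereas reported ROUGE scores are normally produced by a standard ROUGE package. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n):
    """Fraction of reference n-grams that also appear in the candidate."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists (basis of ROUGE-L)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
r1 = rouge_n_recall(candidate, reference, 1)
r2 = rouge_n_recall(candidate, reference, 2)
lcs = lcs_length(candidate.split(), reference.split())
print(r1, r2, lcs)
```

Here five of the six reference unigrams and three of the five reference bigrams are matched, and the longest common subsequence is "the cat on the mat".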
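#result #CNN/DailyMail
Result (single-document summarization: CNN/DailyMail)
Comparison with methods that use a sentence sequence or a fully connected graph shows the usefulness of the hetero graph structure.
Table labels: Ext method; proposed method.
The proposed method scores higher than Ext-BiLSTM and Ext-Transformer, which use a sentence sequence and a fully connected graph respectively.
Using a hetero graph effectively removes unnecessary connections between sentences.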
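#result #NYT50
Result (single-document summarization: NYT50)
Experimental results of single-document summarization on NYT50. The trend is basically the same as on CNN/DailyMail: the proposed method outperforms the existing methods.
Table label: proposed method.
Why is the variant with trigram blocking not ranked first?
→ CNN/DailyMail summaries are formed by concatenating bullet points with little overlap, whereas NYT50 summaries contain overlap, for example key phrases that appear multiple times. Trigram blocking therefore probably makes it hard to score well on NYT50.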
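#ablation #CNN/DailyMail
Ablation
An ablation on CNN/DailyMail examines the contribution of each module.
Removing word filtering decreases the R-1 and R-L scores and increases the R-2 score.
The benefit of word filtering, being able to focus on the especially important word nodes, probably outweighs the drawback of losing bigram information.
Removing the residual connection between GAT layers decreases the scores substantially.
The GAT residual connection is logically important for aggregating from nodes of a different type in a hetero graph, so it cannot be replaced by simple concatenation.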
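#result #multidocument
Result (multi-document summarization)
For multi-document summarization, the proposed method is evaluated with document nodes added.
Table label: proposed method.
Both HSG and HDSG score higher than the existing methods, and the gain is especially large for HDSG.
This suggests that adding document nodes is effective for multi-document summarization.
Trigram blocking does not help here, probably for the same reason as on NYT50.
The proposed method can be applied to a different task simply by adding a node type, so it has a lot of room for extension.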
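#qualitative analysis #degree
Qualitative Analysis
Investigates the effect of the degree of word nodes.
A word node with a high degree means that the word occurs often, so the degree reflects (to some extent) the redundancy of the document.
ROUGE rises with the degree of the word nodes → documents with higher redundancy are easier to summarize.
A high degree lets a word node aggregate information from multiple sentences, so such documents are thought to benefit more strongly from the model.
This suggests that the word nodes may be aggregating sentence information and forming a global representation.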
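The degree statistic discussed here can be computed from the sentences alone. This is a rough sketch, assuming that a word node's degree is the number of distinct sentences containing the word and that tokens are split on whitespace; `average_word_degree` is an illustrative name.

```python
def average_word_degree(sentences):
    """Average number of sentences each distinct word is connected to."""
    tokenized = [set(s.lower().split()) for s in sentences]
    vocab = set().union(*tokenized)
    if not vocab:
        return 0.0
    # Degree of a word node = number of sentence nodes it has an edge to.
    degree = {w: sum(w in toks for toks in tokenized) for w in vocab}
    return sum(degree.values()) / len(degree)


redundant = ["a b c", "a b d", "a e f"]
diverse = ["a b", "c d"]
print(average_word_degree(redundant), average_word_degree(diverse))  # 1.5 1.0
```

The first document repeats words across sentences and so has the higher average degree, which is the sense in which degree stands in for redundancy.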
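#qualitative analysis #source
Qualitative Analysis
Investigates the effect of the number of source documents in multi-document summarization.
As the number of documents increases, the baseline improves while the proposed method declines, until the two reach the same level.
The First baseline extracts sentences from every document by construction, which secures coverage.
For the proposed method, as the number of documents grows it becomes harder to extract a limited number of sentences that cover the gist of all the text.
As the number of documents increases, the performance gap between HETERSUMGRAPH and HETERDOCSUMGRAPH widens: the more complex the relationships between documents become, the greater the benefit of document nodes.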
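The coverage argument for the First baseline is easy to see in code. This sketch assumes the baseline takes the leading k sentences of each source document; the function name `first_k_summary` and the parameter `k` are illustrative, not taken from the slides.

```python
def first_k_summary(documents, k):
    """Concatenate the first k sentences of every source document.

    documents: list of documents, each given as a list of sentences.
    """
    summary = []
    for doc in documents:
        summary.extend(doc[:k])
    return summary


docs = [["a1", "a2", "a3"], ["b1", "b2"], ["c1"]]
print(first_k_summary(docs, 1))  # ['a1', 'b1', 'c1']
```

Every document contributes at least one sentence, so the output grows with the number of sources, whereas a model that must pick a fixed small number of sentences has to cover more material with the same budget.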
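#key points
Summary
Using a hetero graph makes it possible to introduce fine-grained semantic units into document summarization, and confirms the effectiveness of modeling relationships between sentences and between passages.
The method is highly extensible: going from single-document to multi-document summarization only requires adding a node type.
It might be interesting to try methods specialized for hetero graphs (defining subgraphs with metapaths, attention designed for hetero graphs).
The authors say they want to explore pretrained models such as BERT in future work.
As the authors briefly mention, I think the method's advantages would stand out even more if the part corresponding to word nodes were abstracted up to semantic nodes. Even without that, there seems to be plenty of room to experiment with adding node types.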