Slide 1

Slide 1 text

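Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning
Pengda Qin, Weiran Xu, William Yang Wang (ACL 2018)
Figures and tables in these slides are taken from the paper.
Presenter: Mamoru Komachi
10th Cutting-Edge NLP Study Group (最先端NLP勉強会) @ RIKEN AIP Nihonbashi Office, 2018/08/03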

Slide 2

Slide 2 text

Relation extraction is a key component of knowledge graph construction
- Barack Obama is married to Michelle Obama. → spouse
(Image: Google's Knowledge Graph)

Slide 3

Slide 3 text

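The main problem in relation extraction is data sparseness
- Acquiring knowledge from a knowledge base through distant supervision is a popular approach
- Labels are attached automatically to a raw corpus based on the knowledge base, a classifier is trained on them with supervised learning, and unseen instances are then extracted
  → Because the labels are indirect, they are noisy (there are many false positives)
Example: Barack Obama is married to Michelle Obama. → spouse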
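The automatic labeling step can be illustrated with a minimal sketch. The toy knowledge base `KB`, the function `distant_label`, and the plain substring matching are all assumptions made for illustration; real pipelines use entity recognition and linking instead. The second sentence shows how a false positive arises: both entities appear, but the sentence does not express the relation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "Michelle Obama"): "spouse",
}


def distant_label(sentence: str,
                  kb: Dict[Tuple[str, str], str]) -> List[Tuple[str, str, str]]:
    """Attach every KB relation whose two entities both occur in the sentence."""
    labels = []
    for (head, tail), relation in kb.items():
        # Naive substring match (an assumption; stands in for NE tagging).
        if head in sentence and tail in sentence:
            labels.append((head, tail, relation))
    return labels


# The sentence really expresses the relation.
correct = distant_label("Barack Obama is married to Michelle Obama.", KB)

# The sentence mentions both entities but says nothing about marriage,
# yet it receives the same "spouse" label: a false positive.
noisy = distant_label("Barack Obama and Michelle Obama visited Chicago.", KB)
```

Both calls return the same `spouse` label, which is exactly the noise the paper sets out to remove.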

Slide 4

Slide 4 text

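The false positive problem in distant supervision
- Distant supervision before deep learning
  - Does not consider multiple relations (Mintz et al., 2009)
  - Learns multiple relations jointly (Hoffmann et al., 2011; Surdeanu et al., 2012)
  → Instances are not explicitly used as negative examples
- Distant supervision after deep learning
  - Aims for robust learning in the hidden layers, but uses only one instance per class as a positive example (Zeng et al., 2014; 2015)
  - Handles noise with attention over instances (Lin et al., 2016; Ji et al., 2017)
  → Negative examples (false positives) are not taken into account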

Slide 5

Slide 5 text

Removing the noise (FP) of distant supervision with reinforcement learning (Figure 1)

Slide 6

Slide 6 text

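This work in three lines
- Proposes a deep reinforcement learning framework for robust distantly supervised relation extraction
- The proposed method is model-independent, so it can be combined with any relation extraction method
- Verifies performance gains for neural relation extraction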

Slide 7

Slide 7 text

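Deep reinforcement learning for distant supervision
- For each sentence annotated by distant supervision, the agent decides whether to keep or remove it, based on the performance of relation classification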

Slide 8

Slide 8 text

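State: the history is modeled as a Markov Decision Process
- Each sentence is converted into word embeddings and position embeddings (Zeng et al., 2014)
- The state representation is the concatenation of the current sentence vector and the average vector of the sentences removed so far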
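The state construction can be sketched as follows. The names `mean_vector` and `build_state` are hypothetical, sentence vectors are plain lists of floats, and using a zero vector when nothing has been removed yet is an assumption, since the slide does not say how the empty history is handled.

```python
from typing import List


def mean_vector(vectors: List[List[float]], dim: int) -> List[float]:
    """Element-wise average; a zero vector if the list is empty (assumption)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def build_state(current: List[float],
                removed: List[List[float]]) -> List[float]:
    """Concatenate the current sentence vector with the mean of removed ones."""
    return list(current) + mean_vector(removed, len(current))


# No sentence removed yet: the history half of the state is all zeros.
first_state = build_state([1.0, 2.0], [])

# Two sentences removed so far: their average is appended.
later_state = build_state([1.0, 2.0], [[0.0, 2.0], [2.0, 4.0]])
```

The state has twice the dimension of a sentence vector, so the policy sees both the candidate sentence and a summary of its past removals.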

Slide 9

Slide 9 text

Action: keep an instance in the training examples or remove it
- One agent per relation
- The agent decides whether to keep or remove each sentence
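A minimal sketch of the keep/remove action, assuming the policy is available as a function that returns a keep probability for a sentence. The name `filter_sentences` and the two toy policies are hypothetical; in the paper the policy is a neural network, and one such agent would exist per relation.

```python
import random
from typing import Callable, List, Tuple


def filter_sentences(sentences: List[str],
                     keep_prob: Callable[[str], float],
                     rng: random.Random) -> Tuple[List[str], List[str]]:
    """Sample a keep/remove action for every sentence of one relation."""
    kept, removed = [], []
    for sentence in sentences:
        # Keep with probability keep_prob(sentence), otherwise remove.
        if rng.random() < keep_prob(sentence):
            kept.append(sentence)
        else:
            removed.append(sentence)
    return kept, removed


rng = random.Random(0)
data = ["sentence a", "sentence b", "sentence c"]

# Degenerate toy policies make the behavior easy to check.
all_kept, none_removed = filter_sentences(data, lambda s: 1.0, rng)
none_kept, all_removed = filter_sentences(data, lambda s: 0.0, rng)
```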

Slide 10

Slide 10 text

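Reward: whether the F-score improves
- The F1 of relation classification is used (accuracy is not used because the task is multi-class with class imbalance)
$R_i = \alpha \left( F_1^{i} - F_1^{i-1} \right)$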
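The reward can be sketched directly from the formula above: it is the scaled change in F1 between two consecutive epochs. The function names and the default `alpha=100.0` are illustrative assumptions; the slide does not give the value of the scaling factor.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def reward(f1_current: float, f1_previous: float, alpha: float = 100.0) -> float:
    """R_i = alpha * (F1^i - F1^(i-1)); alpha is an illustrative assumption."""
    return alpha * (f1_current - f1_previous)


# F1 improved from 0.25 to 0.5: the agent's removals are rewarded.
positive_reward = reward(0.5, 0.25)

# F1 dropped from 0.5 to 0.25: the same removals would be penalized.
negative_reward = reward(0.25, 0.5)
```

Because the reward is a difference, its sign tells the agent whether the sentences it removed in this epoch helped or hurt the classifier.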

Slide 11

Slide 11 text

Policy: relation classification with a binary classifier
- The relation classifier is built with a simple CNN (dos Santos et al., 2015)

Slide 12

Slide 12 text

Training with the policy gradient method (Figure 1, shown again)
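As a rough illustration of a policy gradient update, the sketch below uses a logistic-regression policy instead of the CNN used in the paper; this substitution, the function names, and the learning rate are all assumptions. It applies the REINFORCE rule: each weight moves along the gradient of the log-probability of the sampled action, scaled by the reward.

```python
import math
from typing import List


def keep_probability(weights: List[float], state: List[float]) -> float:
    """Probability of the 'keep' action under a logistic policy."""
    z = sum(w * x for w, x in zip(weights, state))
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_step(weights: List[float],
                   states: List[List[float]],
                   actions: List[int],
                   reward: float,
                   lr: float = 0.1) -> List[float]:
    """One update: w += lr * reward * sum_t grad log pi(a_t | s_t)."""
    new_weights = list(weights)
    for state, action in zip(states, actions):
        p = keep_probability(weights, state)
        # For a Bernoulli logistic policy (1 = keep, 0 = remove),
        # grad_w log pi(a | s) = (a - p) * s.
        for k, x in enumerate(state):
            new_weights[k] += lr * reward * (action - p) * x
    return new_weights


w0 = [0.0, 0.0]
# The agent kept a sentence and F1 rose (positive reward):
w_up = reinforce_step(w0, [[1.0, 0.0]], [1], reward=1.0)
# The agent kept a sentence and F1 fell (negative reward):
w_down = reinforce_step(w0, [[1.0, 0.0]], [1], reward=-1.0)
```

After a positive reward the policy becomes more likely to repeat the action in that state, and less likely after a negative one.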

Slide 13

Slide 13 text

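Policy-based reinforcement learning framework (Figure 2)
- To compute the reward, $P^{ori}$ is split into $P_t^{ori}$ and $P_v^{ori}$, and $N^{ori}$ into $N_t^{ori}$ and $N_v^{ori}$, and the F1 is computed for each
- The relation classifier is pre-trained beforehand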

Slide 14

Slide 14 text

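Relation extraction experiments
- Dataset
  - New York Times corpus tagged with Freebase relations (Riedel et al., 2010) → 52 relation types
  - NE tagging with the Stanford NE recognizer
- Experimental setup
  - The agent is a CNN; pre-trained word embeddings are used
  - The relation classifier is also a CNN; $P_t^{ori}$ and $P_v^{ori}$ are adjusted to a 2:1 ratio ($N_t^{ori}$ and $N_v^{ori}$ are randomly sampled to be twice the size of the corresponding $P_t^{ori}$ and $P_v^{ori}$)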
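The data split described in the setup can be sketched as below. The function `build_sets`, the fixed seed, and the integer stand-ins for sentences are hypothetical; the sketch only reproduces the stated proportions (positives split 2:1 into training and validation, negatives sampled at twice the size of the corresponding positive set) and assumes the negative pool is large enough.

```python
import random
from typing import List, Sequence, Tuple


def build_sets(positives: Sequence[int],
               negative_pool: Sequence[int],
               seed: int = 0) -> Tuple[List[int], List[int], List[int], List[int]]:
    """Return (P_t, P_v, N_t, N_v) with |P_t|:|P_v| = 2:1 and |N| = 2|P|."""
    rng = random.Random(seed)

    # Shuffle the positives and cut at two thirds.
    pos = list(positives)
    rng.shuffle(pos)
    cut = (2 * len(pos)) // 3
    p_train, p_val = pos[:cut], pos[cut:]

    # Sample twice as many negatives as positives, without replacement,
    # then split them so each part is twice its positive counterpart.
    neg = rng.sample(list(negative_pool), 2 * len(pos))
    n_train = neg[:2 * len(p_train)]
    n_val = neg[2 * len(p_train):]
    return p_train, p_val, n_train, n_val


# 30 positive ids and a pool of 200 negative ids (toy data).
p_t, p_v, n_t, n_v = build_sets(range(30), range(1000, 1200))
```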

Slide 15

Slide 15 text

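Deep reinforcement learning improves relation extraction accuracy (Table 1)
- Compared with Original (no training), distant supervision (pretrain) is effective
- Reinforcement learning (RL) improves performance further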

Slide 16

Slide 16 text

The proposed method (+RL) is model-independent (Figure 4)
- PCNN+ONE: selects only one sentence (Zeng et al., 2015)
- PCNN+ATT: attention over all sentences (Lin et al., 2016)

Slide 17

Slide 17 text

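Mismatch between the knowledge base (Freebase) and the data (NYT) (Figure 5)
- Relations with few instances are often labeled incorrectly (relation IDs correspond to the rows of Table 1)
  → Previous methods did not remove these instances, which degraded performance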

Slide 18

Slide 18 text

Reinforcement learning can remove the noise (FP) (Table 4)
- It prevents the model from learning incorrect contexts

Slide 19

Slide 19 text

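Summary
- Proposes a deep reinforcement learning framework for robust distantly supervised relation extraction
  - Focuses on removing false positives
- The proposed method is model-independent, so it can be combined with any relation extraction method
- Verifies performance gains for neural relation extraction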

Slide 20

Slide 20 text

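Impressions
- Reinforcement learning and information extraction go well together (Narasimhan et al., 2016)
  → Noise (also common in bootstrapping-style methods) can be removed
  → It can also be used as preprocessing for any method
- False positives are still a real problem for distant supervision, and using distant labels as a hard constraint is risky (Nagesh et al., 2014)
- It would be interesting to combine this with approaches that treat instances as a set
- It would be good to incorporate relations between classes into the model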

Slide 21

Slide 21 text

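Q&A (1)
- Q1: The goal is to reduce noise, but isn't the data used in each epoch to validate positive and negative examples noisy as well?
  A1: It may indeed contain noise, but to keep training from becoming unstable, techniques such as averaging the results of several epochs are used.
- Q2: Isn't the computational cost a problem if the classifier is retrained every time the training examples change?
  A2: Keeping the model light by using a CNN may be partly motivated by computational cost.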

Slide 22

Slide 22 text

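Q&A (2)
- Q3: The paper claims independence from the relation classification model, but doesn't the method clearly depend on the classifier's model?
  A3: For classification it is probably better to tailor the model to the task, but the paper wants to emphasize that performance improves even with a simple model, so a naive model is used deliberately. Building a classifier suited to the task would likely improve results further.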

Slide 23

Slide 23 text

References
Dataset
- Riedel et al. Modeling Relations and Their Mentions without Labeled Text. ECML PKDD 2010.
Baselines of this work
- Zeng et al. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. EMNLP 2015.
- Lin et al. Neural Relation Extraction with Selective Attention over Instances. ACL 2016.

Slide 24

Slide 24 text

References
Information extraction before neural methods (distant supervision)
- Mintz et al. Distant Supervision for Relation Extraction without Labeled Data. ACL 2009.
- Hoffmann et al. Knowledge-based Weak Supervision for Information Extraction of Overlapping Relations. ACL 2011.
- Surdeanu et al. Multi-instance Multi-label Learning for Relation Extraction. EMNLP 2012.
- Nagesh et al. Noisy Or-based Model for Relation Extraction using Distant Supervision. EMNLP 2014.

Slide 25

Slide 25 text

References
Information extraction after neural methods
- Zeng et al. Relation Classification via Convolutional Deep Neural Network. COLING 2014.
- Narasimhan et al. Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning. EMNLP 2016.
- Ji et al. Distant Supervision for Relation Extraction with Sentence-level Attention and Entity Descriptions. AAAI 2017.