Paper Introduction: How Contextual are Contextualized Word Representations?

- TMU Komachi lab
- paper reading
- How Contextual are Contextualized Word Representations? Kawin Ethayarajh. EMNLP2019
- paper URL: https://arxiv.org/abs/1909.00512

Satoru Katsumata

December 11, 2023

Transcript

  1. Abstract
     - Deep neural language models have successfully created contextualized word representations:
       - ELMo (Peters et al., 2018): 2-layer biLSTM
       - BERT (Devlin et al., 2019): 12-layer Transformer encoder
       - GPT-2 (Radford et al., 2019): 12-layer Transformer decoder
     - These representations remain poorly understood:
       - How contextual are these contextualized word representations?
       - Are there infinitely many context-specific representations that BERT and ELMo can assign to each word?
     - This paper answers these questions by studying the geometry of the representation space for each layer of ELMo, BERT, and GPT-2.
  2. Experiment Setting
     - Contextualizing models: ELMo, BERT, and GPT-2 (input layer = 0th layer)
     - Data: STS
     - Measures of contextuality:
       - Self-similarity
       - Intra-sentence similarity
       - Maximum explainable variance
  3. Self-similarity
     - Let $w$ be a word that appears in sentences $\{s_1, \dots, s_n\}$ at indices $\{i_1, \dots, i_n\}$ respectively, so that $w = s_1[i_1] = \dots = s_n[i_n]$.
     - Let $f_l(s, i)$ be a function that maps $s[i]$ to its representation in layer $l$.
     - Self-similarity:
       $$\mathrm{SelfSim}_l(w) = \frac{1}{n^2 - n} \sum_{j} \sum_{k \neq j} \cos\big(f_l(s_j, i_j),\, f_l(s_k, i_k)\big)$$
     - The self-similarity of $w$ is the average cosine similarity between its contextualized representations across its $n$ unique contexts.
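
The self-similarity formula on this slide can be sketched in a few lines of NumPy. This is only an illustration, not the paper's code: the function name `self_similarity` and the input layout (one row per occurrence of the word, already extracted from layer $l$) are assumptions made here.

```python
import numpy as np


def self_similarity(reps: np.ndarray) -> float:
    """Average pairwise cosine similarity among the rows of `reps`.

    `reps` has shape (n, d): one layer-l vector per occurrence of the word.
    Assumes n >= 2 and that no row is the zero vector.
    """
    n = reps.shape[0]
    if n < 2:
        raise ValueError("need at least two occurrences of the word")
    # Normalise rows so that dot products become cosine similarities.
    unit = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    cos = unit @ unit.T
    # Sum over ordered pairs j != k by removing the diagonal (the j == k terms).
    off_diagonal_sum = cos.sum() - np.trace(cos)
    return float(off_diagonal_sum / (n * n - n))
```

Occurrences that all point in the same direction give a score of 1, and mutually orthogonal occurrences give 0, which matches the reading that a lower score means a more context-specific word.
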
  4. Intra-sentence similarity
     - Let $s$ be a sentence that is a sequence $\langle w_1, \dots, w_n \rangle$ of $n$ words.
     - Let $f_l(s, i)$ be a function that maps $s[i]$ to its representation in layer $l$.
     - Intra-sentence similarity:
       $$\mathrm{IntraSim}_l(s) = \frac{1}{n} \sum_{i} \cos\big(\vec{s}_l,\, f_l(s, i)\big) \quad \text{where} \quad \vec{s}_l = \frac{1}{n} \sum_{i} f_l(s, i)$$
     - The intra-sentence similarity of a sentence is the average cosine similarity between its word representations and the sentence vector.
       → This measure captures how context-specificity manifests in the vector space.
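
As a rough sketch of the intra-sentence similarity defined on this slide: the sentence vector is the mean of the word vectors, and the score is the mean cosine between each word vector and that mean. The name `intra_sentence_similarity` and the (n, d) input layout are assumptions for illustration.

```python
import numpy as np


def intra_sentence_similarity(word_reps: np.ndarray) -> float:
    """Mean cosine similarity between each word vector and the sentence vector.

    `word_reps` has shape (n, d): the layer-l vectors of the n words of one
    sentence. Assumes no word vector and no sentence vector is all zeros.
    """
    # Sentence vector: the mean of the sentence's word representations.
    sentence_vec = word_reps.mean(axis=0)
    dots = word_reps @ sentence_vec
    norms = np.linalg.norm(word_reps, axis=1) * np.linalg.norm(sentence_vec)
    return float(np.mean(dots / norms))
```

For two orthogonal unit vectors the sentence vector lies halfway between them, so each cosine is $1/\sqrt{2}$ and the score is about 0.707.
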
  5. Maximum explainable variance
     - Let $w$ be a word that appears in sentences $\{s_1, \dots, s_n\}$ at indices $\{i_1, \dots, i_n\}$ respectively, so that $w = s_1[i_1] = \dots = s_n[i_n]$.
     - Let $f_l(s, i)$ be a function that maps $s[i]$ to its representation in layer $l$.
     - $[f_l(s_1, i_1) \dots f_l(s_n, i_n)]$ is the occurrence matrix of word $w$, and $\sigma_1, \dots, \sigma_m$ are the first $m$ singular values of this matrix.
     - Maximum explainable variance:
       $$\mathrm{MEV}_l(w) = \frac{\sigma_1^2}{\sum_{i} \sigma_i^2}$$
     - MEV is the proportion of variance in $w$'s contextualized representations for a given layer that can be explained by their first principal component.
     - It gives an upper bound on how well a static embedding could replace a word's contextualized representations.
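
A minimal sketch of the MEV formula, assuming the occurrence matrix is available as a NumPy array with one row per occurrence. The function name `max_explainable_variance` is hypothetical, and the sketch follows the slide's formula literally: it takes singular values of the raw matrix without mean-centering, which may differ from how a PCA library would compute explained variance.

```python
import numpy as np


def max_explainable_variance(occurrences: np.ndarray) -> float:
    """Share of squared singular-value mass carried by the top singular value.

    `occurrences` has shape (n, d): the occurrence matrix of one word in one
    layer. Assumes the matrix is not all zeros.
    """
    # Singular values are returned in descending order.
    sigma = np.linalg.svd(occurrences, compute_uv=False)
    return float(sigma[0] ** 2 / np.sum(sigma ** 2))
```

A rank-one matrix (every occurrence is a multiple of one vector) gives 1.0, meaning a single static vector would capture everything; the 2×2 identity matrix gives 0.5.
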
  8. Adjusting for Anisotropy
     - Example: self-similarity. The paper introduces a baseline.
     - $\mathcal{O}$ is the set of all word occurrences, and $f_l(\cdot)$ maps a word occurrence to its representation in layer $l$.
     - The baseline is the average cosine similarity between the representations of uniformly randomly sampled words from different contexts:
       $$\mathrm{Baseline}(f_l) = \mathbb{E}_{x, y \sim U(\mathcal{O})}\big[\cos\big(f_l(x),\, f_l(y)\big)\big]$$
     - Why is the baseline introduced? Suppose $\mathrm{SelfSim}_l(w) = 0.95$:
       - If $\mathrm{Baseline}(f_l) = 0.00$, word $w$'s representations are poorly contextualized.
       - If $\mathrm{Baseline}(f_l) = 0.99$, word $w$'s representations are well contextualized, because representations of $w$ in different contexts would on average be more dissimilar to each other than two randomly chosen words.
     - When $\mathrm{Baseline}(f_l) = 0.00$, word vectors are fairly uniformly distributed in the space (isotropy); when $\mathrm{Baseline}(f_l) = 0.99$, the space is anisotropic.
     - See Arora et al. (2016, 2017) and Mu and Viswanath (2018) for details.
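
To make the example on this slide concrete, one can subtract the baseline from the raw score, which is the adjustment the next slide introduces. The starred symbol below denotes the adjusted score; the numbers are the ones from the slide.

```latex
\begin{align*}
\mathrm{Baseline}(f_l) = 0.00 &: \quad \mathrm{SelfSim}^*_l(w) = 0.95 - 0.00 = 0.95 \\
\mathrm{Baseline}(f_l) = 0.99 &: \quad \mathrm{SelfSim}^*_l(w) = 0.95 - 0.99 = -0.04
\end{align*}
```

In the second case the adjusted score is slightly negative: occurrences of $w$ are, on average, a little less similar to each other than two randomly chosen words are, so the same raw value of 0.95 indicates strong contextualization.
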
  9. Adjusting for Anisotropy
     - $\mathcal{O}$ is the set of all word occurrences, and $f_l(\cdot)$ maps a word occurrence to its representation in layer $l$.
     - The baseline for self-similarity is the average cosine similarity between the representations of uniformly randomly sampled words from different contexts:
       $$\mathrm{Baseline}(f_l) = \mathbb{E}_{x, y \sim U(\mathcal{O})}\big[\cos\big(f_l(x),\, f_l(y)\big)\big]$$
     - To adjust for the effect of anisotropy, a baseline is computed for each measure:
       - Self-similarity and intra-sentence similarity: the baseline is the average cosine similarity between the representations of uniformly randomly sampled words.
       - Maximum explainable variance: the baseline is the proportion of variance in uniformly randomly sampled word representations that is explained by their first principal component.
     - Each measure's baseline is then subtracted from it to obtain the anisotropy-adjusted contextuality measure, for example:
       $$\mathrm{SelfSim}^*_l(w) = \mathrm{SelfSim}_l(w) - \mathrm{Baseline}(f_l)$$
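
The cosine baseline and the subtraction step can be sketched as follows. This is a Monte Carlo estimate under stated assumptions, not the paper's procedure: the names `estimate_baseline` and `adjust`, the `num_pairs` and `seed` parameters, and sampling pairs independently with replacement (so a pair may occasionally repeat the same occurrence) are all choices made here for illustration.

```python
import numpy as np


def estimate_baseline(all_reps: np.ndarray, num_pairs: int = 1000, seed: int = 0) -> float:
    """Estimate E[cos(f_l(x), f_l(y))] for x, y drawn uniformly from all occurrences.

    `all_reps` has shape (N, d): layer-l vectors for every word occurrence.
    Assumes no row is the zero vector.
    """
    rng = np.random.default_rng(seed)
    n = all_reps.shape[0]
    # Draw index pairs uniformly at random (with replacement).
    x = all_reps[rng.integers(0, n, size=num_pairs)]
    y = all_reps[rng.integers(0, n, size=num_pairs)]
    cos = np.sum(x * y, axis=1) / (np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1))
    return float(cos.mean())


def adjust(measure: float, baseline: float) -> float:
    """Anisotropy-adjusted measure: the raw measure minus its baseline."""
    return measure - baseline
```

If every representation in a layer points the same way, the estimated baseline is 1, and any raw self-similarity is adjusted down by that amount; in an isotropic space the baseline is near 0 and the adjustment changes little.
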
  11. Findings: Isotropy
      $$\mathrm{Baseline}(f_l) = \mathbb{E}_{x, y \sim U(\mathcal{O})}\big[\cos\big(f_l(x),\, f_l(y)\big)\big]$$
      - Contextualized representations are anisotropic in all non-input layers.
      - Contextualized representations are generally more anisotropic in higher layers.
      - Isotropy has both theoretical and empirical benefits for static word embeddings.
        → The extreme degree of anisotropy seen in contextualized word representations is surprising.
  14. Findings: Context-specificity
      $$\mathrm{SelfSim}_l(w) = \frac{1}{n^2 - n} \sum_{j} \sum_{k \neq j} \cos\big(f_l(s_j, i_j),\, f_l(s_k, i_k)\big)$$
      - Contextualized word representations are more context-specific in higher layers.
      - Stopwords (e.g., 'the', 'of', 'to') have among the most context-specific representations.
      - Across all layers, stopwords have among the lowest self-similarity of all words.
        → This is relatively surprising, given that these words are not polysemous.
      - This answers one of the questions the paper posed: ELMo, BERT, and GPT-2 are not simply assigning one of a finite number of word-sense representations to each word.
  15. Findings: How context-specificity manifests
      $$\mathrm{IntraSim}_l(s) = \frac{1}{n} \sum_{i} \cos\big(\vec{s}_l,\, f_l(s, i)\big) \quad \text{where} \quad \vec{s}_l = \frac{1}{n} \sum_{i} f_l(s, i)$$
      - How does this increased context-specificity manifest in the vector space?
        - Do word representations in the same sentence converge to a single point?
        - Do they remain distinct from one another while still being distinct from their representations in other contexts?
  16. Findings: How context-specificity manifests
      $$\mathrm{IntraSim}_l(s) = \frac{1}{n} \sum_{i} \cos\big(\vec{s}_l,\, f_l(s, i)\big) \quad \text{where} \quad \vec{s}_l = \frac{1}{n} \sum_{i} f_l(s, i)$$
      - ELMo: words in the same sentence are more similar to one another in upper layers.
      - BERT: words in the same sentence are more dissimilar to one another in upper layers.
      - GPT-2: word representations in the same sentence are no more similar to each other than randomly sampled words.
      - In all three models, contextualized representations are more context-specific in upper layers.
  18. Findings: Static vs. contextualized
      $$\mathrm{MEV}_l(w) = \frac{\sigma_1^2}{\sum_{i} \sigma_i^2}$$
      - On average, less than 5% of the variance in a word's contextualized representations can be explained by a static embedding.
      - The raw MEV of many words is actually below the anisotropy baseline; that is, a greater proportion of the variance across all words can be explained by a single vector than can the variance across all representations of a single word.
      - This suggests that contextualizing models are not simply assigning one of a finite number of word-sense representations to each word; otherwise, the proportion of variance explained would be much higher.
  19. Conclusion
      - This paper investigated how contextual contextualized word representations truly are.
      - The author found that upper layers of ELMo, BERT, and GPT-2 produce more context-specific representations than lower layers.
      - This increased context-specificity is always accompanied by increased anisotropy.
      - However, context-specificity also manifests differently across the three models; e.g., the anisotropy-adjusted similarity between words in the same sentence is highest in ELMo but almost non-existent in GPT-2.
      - Static word embeddings would be a poor replacement for contextualized ones.
  20. Related works
      - Arora et al. "A Simple but Tough-to-Beat Baseline for Sentence Embeddings." ICLR 2017.
      - Arora et al. "A Latent Variable Model Approach to PMI-based Word Embeddings." TACL 2016.
      - Mu and Viswanath. "All-but-the-Top: Simple and Effective Postprocessing for Word Representations." ICLR 2018.
      - Peters et al. "Deep Contextualized Word Representations." NAACL 2018.
      - Devlin et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." NAACL 2019.
      - Radford et al. "Language Models are Unsupervised Multitask Learners." OpenAI blog post, 2019.