Paper introduction: Exploiting Monolingual Data at Scale for Neural Machine Translation

- TMU Komachi lab
- paper reading (EMNLP2019)
- Exploiting Monolingual Data at Scale for Neural Machine Translation. Wu et al., EMNLP2019
- paper URL: https://www.aclweb.org/anthology/D19-1430.pdf

Satoru Katsumata

December 11, 2023

Transcript

  1. Abstract
     - How to effectively leverage monolingual data is an important research topic for NMT, and plenty of work studies this problem.
     - Back-translation (BT) is one of the most cited approaches; it leverages the target-side monolingual data.
     - On the other hand, the investigation of source-side monolingual data is very limited. Zhang and Zong (2016) and Ueffing et al. (2007) use the source-side data to make synthetic target data.
     - This paper studies how to leverage both source-side and target-side monolingual data to boost the accuracy of NMT.
  2. Improving NMT by Monolingual Data
     - Target-side monolingual data
       - fusion of NMT and LM models
       - BT: make synthetic source-side data from target-side monolingual data
     - Source-side monolingual data
       - self-learning: make synthetic target-side data from source-side monolingual data

     BT requires training an additional target-to-source NMT model on the bilingual dataset. Its translation output is then paired with the target-side monolingual data as a synthetic parallel corpus that augments the original bilingual dataset.

     [Figure: BT for the English→German direction, synthetic data in orange. Genuine bitext (parallel data: En text, De text) trains a De-En BT model; the BT model turns monolingual De text into synthetic En text.]
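
The back-translation step described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: `back_translate`, `toy_de_to_en`, and the word-lookup table are hypothetical stand-ins for a trained De-En model.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(
    target_mono: List[str],
    tgt_to_src: Callable[[str], str],
) -> List[Pair]:
    """Pair each genuine target sentence with a synthetic source sentence."""
    return [(tgt_to_src(y), y) for y in target_mono]


# Hypothetical stand-in for a trained De-En model: a toy word lookup.
TOY_DE_EN = {"hallo": "hello", "welt": "world"}


def toy_de_to_en(sentence: str) -> str:
    return " ".join(TOY_DE_EN.get(w, w) for w in sentence.lower().split())


bitext = [("good morning", "guten morgen")]            # genuine En-De pairs
synthetic = back_translate(["Hallo Welt"], toy_de_to_en)
augmented = bitext + synthetic                         # training data for the En-De model
```

Only the source side of each added pair is synthetic, so the En-De model still learns to produce genuine German text.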
  3. Improving NMT by Monolingual Data
     - Related work: Edunov et al. (2018) first provided an extensive analysis of back-translation at scale.
  4. Improving NMT by Monolingual Data
     The self-learning approach, a semi-supervised method, generates synthetic target-side data for the source-side monolingual data.

     [Figure: self-learning for the English→German direction, synthetic data in orange. Genuine bitext (parallel data: En text, De text) trains an En-De model; the forward model turns monolingual En text into synthetic De text.]
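
Self-learning mirrors back-translation: the forward model labels genuine source sentences. A minimal sketch, assuming a hypothetical `self_learn` helper and a toy word lookup in place of a trained En-De model:

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def self_learn(
    source_mono: List[str],
    src_to_tgt: Callable[[str], str],
) -> List[Pair]:
    """Pair each genuine source sentence with its synthetic translation."""
    return [(x, src_to_tgt(x)) for x in source_mono]


# Hypothetical stand-in for a trained En-De model: a toy word lookup.
TOY_EN_DE = {"hello": "hallo", "world": "welt"}


def toy_en_to_de(sentence: str) -> str:
    return " ".join(TOY_EN_DE.get(w, w) for w in sentence.lower().split())


synthetic = self_learn(["Hello world"], toy_en_to_de)
```

Here the synthetic side is the training target, so translation errors of the forward model become label noise.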
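  5. Training Strategy: Notation
     - $\mathcal{X}, \mathcal{Y}$: the two languages, each the collection of all sentences in that language
     - $\mathcal{B} = \{(x_i, y_i)\}_{i=1}^{N}$ with $x_i \in \mathcal{X}$, $y_i \in \mathcal{Y}$: the bilingual training pairs
     - $\mathcal{M}_x = \{x_j\}_{j=1}^{M_x}$ and $\mathcal{M}_y = \{y_j\}_{j=1}^{M_y}$ with $x_j \in \mathcal{X}$, $y_j \in \mathcal{Y}$: the collections of monolingual sentences
     - $f : \mathcal{X} \mapsto \mathcal{Y}$: the translation model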
  6. Training Strategy: Large-scale noised training
     - They add noise to the source side of both synthetic datasets $\mathcal{B}_s$ and $\mathcal{B}_t$ instead of using them directly to train models.
       → Build the following two noised datasets:
       $\mathcal{B}_s^n = \{(\sigma(x), y) \mid (x, y) \in \mathcal{B}_s\}$, $\mathcal{B}_t^n = \{(\sigma(x), y) \mid (x, y) \in \mathcal{B}_t\}$
     - Noise function $\sigma$:
       - randomly replace a word with a special token with probability […]
       - randomly drop words with probability […]
       - randomly shuffle (swap) words, with the constraint that no word is moved further than three positions
     - They then train an NMT model $f_n$ for X-to-Y translation on $\mathcal{B} \cup \mathcal{B}_s^n \cup \mathcal{B}_t^n$.
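
The noise function on this slide could be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the names `noise` and `add_noise_to_sources`, the `<unk>` token, the order of the three operations, and the probabilities in the usage line are all assumed. The local shuffle sorts words by position plus a uniform jitter in [0, max_shift + 1), a common way to keep every word within `max_shift` positions of where it started.

```python
import random
from typing import List, Optional, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def noise(
    sentence: str,
    p_replace: float,
    p_drop: float,
    max_shift: int = 3,
    rng: Optional[random.Random] = None,
    mask_token: str = "<unk>",  # assumed name for the special token
) -> str:
    rng = rng or random.Random()
    words = sentence.split()
    # 1) Replace each word by the special token with probability p_replace.
    words = [mask_token if rng.random() < p_replace else w for w in words]
    # 2) Drop each word with probability p_drop.
    words = [w for w in words if rng.random() >= p_drop]
    # 3) Local shuffle: sort by index plus jitter in [0, max_shift + 1),
    #    so no word ends up more than max_shift positions away.
    keys = [i + rng.random() * (max_shift + 1) for i in range(len(words))]
    order = sorted(range(len(words)), key=keys.__getitem__)
    return " ".join(words[i] for i in order)


def add_noise_to_sources(pairs: List[Pair], p_replace: float, p_drop: float,
                         rng: Optional[random.Random] = None) -> List[Pair]:
    """Apply the noise to the source side only; targets stay untouched."""
    return [(noise(x, p_replace, p_drop, rng=rng), y) for x, y in pairs]


# Illustrative probabilities only.
noisy = noise("the cat sat on the mat", p_replace=0.1, p_drop=0.1,
              rng=random.Random(0))
```

Passing an explicit `random.Random` keeps the corruption reproducible across runs.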
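  7. Training Strategy: Clean data tuning
     - They further fine-tune the noised-training model $f_n$ on a clean version of the synthetic data, without manually adding noise.
     - Train another model $f_b$ for X-to-Y translation and another model $g_b$ for Y-to-X translation.
       → Use them to build new synthetic data $\mathcal{B}_s$, $\mathcal{B}_t$, then subsample sentences to form $\mathcal{B}_s^s$ and $\mathcal{B}_t^s$.
     - Fine-tune the translation model $f$ on the new data, with $f$ initialized by $f_n$:
       $\min \sum_{(x, y) \in \mathcal{B} \cup \mathcal{B}_s^s \cup \mathcal{B}_t^s} -\log P(y \mid x; f)$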
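
The fine-tuning objective on this slide is a summed negative log-likelihood over the bitext and the two subsampled synthetic sets. The sketch below evaluates it for a toy model. The helper names, the uniform subsampling, and the constant-probability "model" are assumptions for illustration; the slide does not say how the sentences are subsampled.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source x, target y)


def subsample(pairs: Sequence[Pair], k: int, rng: random.Random) -> List[Pair]:
    """Draw at most k pairs uniformly without replacement (assumed scheme)."""
    return rng.sample(list(pairs), min(k, len(pairs)))


def finetune_objective(
    data: Sequence[Pair],
    prob: Callable[[str, str], float],
) -> float:
    """Sum of -log P(y | x; f); prob(x, y) plays the role of P(y | x; f)."""
    return sum(-math.log(prob(x, y)) for x, y in data)


rng = random.Random(0)
bitext = [("a", "A")]                                    # B
new_bs = [("b1", "B1"), ("b2", "B2"), ("b3", "B3")]      # new synthetic B_s
new_bt = [("c1", "C1"), ("c2", "C2"), ("c3", "C3")]      # new synthetic B_t
data = bitext + subsample(new_bs, 2, rng) + subsample(new_bt, 2, rng)

# Toy model that assigns probability 0.5 to every pair.
loss = finetune_objective(data, lambda x, y: 0.5)
```

In practice `prob` would come from the model $f$, initialized from $f_n$, and the sum would be minimized by gradient descent over mini-batches.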
  8. Experiments: setting
     - Task: WMT En-De, De-En, De-Fr, Fr-De
     - Data (En-De, De-En):
       - parallel corpus
         - WMT: […]M sents (Europarl v[…], News Commentary v[…], Common Crawl, document-split Rapid corpus)
         - WMTPC: […]M sents (preprocessed ParaCrawl + WMT)
       - monolingual corpus: En, De News Crawl → […]M sents
       - validation data: newstest[…]
       - test data: newstest[…]
     - Data (De-Fr, Fr-De):
       - parallel corpus: WMT […]M, WMTPC […]M
       - monolingual corpus: News Crawl, […]M sents
       - validation, test: WMT setting
     - Model: […]K BPE, Transformer big, batch of […] tokens per GPU (P[…]), parameter update every […] minibatches
     - For noised training, the models used for translating monolingual data are trained on WMT.
     - For fine-tuning, the models used for translating monolingual data are trained on WMTPC.
  9. Conclusion
     - They exploit monolingual data at scale for neural machine translation.
     - They propose an effective training strategy that boosts NMT performance by leveraging both source-side and target-side monolingual data.
     - For future work, they would like to verify their training strategy on more language pairs and other sequence-to-sequence tasks.
     - They are interested in studying their noised training together with other data augmentation approaches.
  10. Related Work
     - Edunov et al. (2018). Understanding Back-Translation at Scale. EMNLP.
     - Zhang and Zong (2016). Exploiting Source-side Monolingual Data in Neural Machine Translation. EMNLP.
     - Sennrich et al. (2016). Improving Neural Machine Translation Models with Monolingual Data. ACL.
     - Ueffing et al. (2007). Transductive learning for statistical machine translation. ACL.