Deep Collaborative Learning by Knowledge Transfer Graph

In this talk I will present our latest research on the knowledge transfer graph for Deep Collaborative Learning (DCL), a method that incorporates Knowledge Distillation and Deep Mutual Learning. DCL is represented by a directed graph in which each model is a node and the propagation of knowledge from a source node to a target node is an edge. In DCL, hyperparameter search can be used to find an optimal knowledge transfer graph.

Hironobu Fujiyoshi

August 12, 2022

Transcript

  1. Biography: Dr. Hironobu Fujiyoshi
    • PhD from Chubu University
    • Postdoctoral fellow at Carnegie Mellon University: Video Surveillance and Monitoring (VSAM), Humanoid Vision
    • To present: Chubu University
    • Visiting researcher at Carnegie Mellon University: People Image Analysis
    • To present: Professor, Chubu University
    [Figures: Video Surveillance and Monitoring [Proc. of IEEE]; "star" skeletonization [WACV]; humanoid vision project (CMU/Honda)]
  2. Going deeper to improve performance
    • Deeper networks won at the latest competitions
    [Figure: ILSVRC networks with their layer counts and parameter counts in millions: AlexNet (SuperVision) [Krizhevsky NIPS 2012]; VGG, 19 layers [Simonyan arxiv 2014]; GoogLeNet, 22 layers [Szegedy arxiv 2014]; MSRA [He arxiv 2014]; ResNet]
  3. Knowledge Distillation (KD) [G. Hinton]
    • Knowledge transfer from teacher network to student network
      - Teacher: pretrained large network
      - Student: small network
    [Figure: knowledge is transferred from the teacher network (large, pretrained) to the student network (small) by knowledge distillation]
  4. Knowledge Distillation (KD) [G. Hinton]
    • Teacher → Student
      - Teacher: pretrained large network
      - Student: small network
      - Training: the student network is trained on both the hard target and the soft target
    [Figure: the pretrained teacher outputs the probability distribution $p_1$ (soft target, "dark knowledge"); the student outputs $p_2$; the student is trained by backprop with one cross entropy against the soft target and one against the hard target (correct label)]
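
The hard-target and soft-target terms on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the speaker's implementation: the weighting `alpha`, the `temperature`, and all function names are assumptions introduced here.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, pred, eps=1e-12):
    """H(target, pred) = -sum_c target[c] * log(pred[c])."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def kd_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=1.0):
    """Weighted sum of the hard-target and soft-target cross entropies.

    alpha and temperature are illustrative assumptions, not values from the talk.
    """
    num_classes = len(student_logits)
    one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
    # Hard target: cross entropy against the correct label.
    hard = cross_entropy(one_hot, softmax(student_logits))
    # Soft target: cross entropy against the (fixed) teacher distribution.
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return alpha * hard + (1.0 - alpha) * soft
```

Under these assumptions, a student whose logits agree with both the teacher and the label receives a lower loss than one that disagrees with both.
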
  5. Deep Mutual Learning (DML) [Y. Zhang]
    • Student ↔ Student
      - An ensemble of students learn collaboratively and teach each other throughout the training process
      - Kullback-Leibler (KL) divergence is used to match the two networks' predictions ($p_1$ and $p_2$)
    → DML makes the student network better than the one used in KD
    [Figure: two students output $p_1$ and $p_2$; each is trained with cross entropy against the label (hard target) and with KL divergence against the other student (soft target): $KL(p_1 \| p_2)$ and $KL(p_2 \| p_1)$]
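
The two-student objective can be sketched as follows. This assumes the usual DML convention in which student 1 minimizes its own cross entropy plus $KL(p_2 \| p_1)$, and symmetrically for student 2; the function names are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_c p[c] * log(p[c] / q[c])."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy_label(pred, label, eps=1e-12):
    """Cross entropy of a predicted distribution against a class index."""
    return -math.log(pred[label] + eps)


def dml_losses(p1, p2, label):
    """Return (loss for student 1, loss for student 2).

    Each student is pulled toward the label (hard target) and toward
    the other student's prediction (soft target).
    """
    loss1 = cross_entropy_label(p1, label) + kl_divergence(p2, p1)
    loss2 = cross_entropy_label(p2, label) + kl_divergence(p1, p2)
    return loss1, loss2
```

When both students predict the same distribution, the KL terms vanish and each loss reduces to the plain cross entropy.
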
  6. Deep Mutual Learning (DML) [Y. Zhang]
    • Why DML works better
    → DML searches for points where the loss landscape is flat
    [Figure: loss landscapes of independently trained and DML-trained networks, after "Visualizing the Loss Landscape of Neural Nets" [NeurIPS]]
  7. Deep collaborative learning
    • Knowledge transfer between multiple networks
      - Knowledge Distillation (KD) [G. Hinton]
      - Deep Mutual Learning (DML) [Y. Zhang]
    [Figure: KD, Teacher → Student, and DML, Student ↔ Student, with outputs $p_1$ and $p_2$]
  8. Variations of KD and DML
    • Knowledge Distillation
      - Knowledge Distillation [G. Hinton]: large and small models
      - Born Again [Furlanello]: same model
      - Teacher Assistant [Mirzadeh]: stepwise (Teacher, TA, Student)
    • Deep Mutual Learning
      - Deep Mutual Learning [Y. Zhang]: large and small models, or same model
  9. Variations of KD and DML
    • Knowledge Distillation: Knowledge Distillation [G. Hinton], Born Again [Furlanello], Teacher Assistant [Mirzadeh]
    • Deep Mutual Learning: Deep Mutual Learning [Y. Zhang]
    These approaches to deep collaborative learning are designed by researchers (humans).
  10. Our goal
    • Scale up deep collaborative learning to classroom size
      - Deep and diverse collaborative learning
    • Knowledge Transfer Graph [Minami, ACCV]
      - A novel graph representation of knowledge transfer for deep collaborative learning
    [Figure: KD, DML, and a knowledge transfer graph with nodes $m_1$, $m_2$, $m_3$, label edges $L_{\hat{y},1}$, $L_{\hat{y},2}$, $L_{\hat{y},3}$, and node-to-node edges $L_{1,2}$, $L_{1,3}$, $L_{2,1}$, $L_{2,3}$, $L_{3,1}$, $L_{3,2}$]
  11. Graph representation of conventional methods
    • Graph representation
      - Node: deep learning model
      - Edge: loss of knowledge transfer
    [Figure: KD (Teacher → Student, large to small) becomes nodes $m_1$, $m_2$ with a one-directional edge; DML (Student ↔ Student) becomes nodes $m_1$, $m_2$ with bidirectional edges]
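  12. Knowledge transfer graph
    • Auxiliary nodes support the training of the target node
      - Node: deep learning model
      - Edge: loss of knowledge transfer
    [Figure: knowledge transfer graph with nodes $m_1$, $m_2$, $m_3$, class label $\hat{y}$, label edges $L_{\hat{y},1}$, $L_{\hat{y},2}$, $L_{\hat{y},3}$, and node-to-node edges $L_{1,2}$, $L_{1,3}$, $L_{2,1}$, $L_{2,3}$, $L_{3,1}$, $L_{3,2}$. Target node: ResNet. Auxiliary nodes: ResNet, WideResNet, DenseNet, …]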
  13. Knowledge transfer graph
    • Each edge is defined by a different type of loss function
      - This enables diverse knowledge transfer
    Loss function: $L = H(p_{\hat{y}}, p_n)$, $L = KL(p_n \| p_m)$, $L = 0$, …
    [Figure: knowledge transfer graph with nodes $m_1$, $m_2$, $m_3$ and a loss function on every edge]
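
One way to picture this representation is a dictionary that maps each directed edge to a loss type. The sketch below is an assumption-laden illustration: the names `LABEL`, `edge_loss`, and `node_loss` are hypothetical, the KL term is taken as $KL(p_{source} \| p_{destination})$, and summing the incoming edges to obtain a node's training loss is an assumption rather than something stated on the slide.

```python
import math

LABEL = "y_hat"  # hypothetical name for the class-label source


def cross_entropy_label(pred, label, eps=1e-12):
    return -math.log(pred[label] + eps)


def kl_divergence(p, q, eps=1e-12):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def edge_loss(kind, source, dest_pred):
    """Loss carried by one edge; `source` is a label index or a distribution."""
    if kind == "H":       # L = H(p_y_hat, p_n): cross entropy with the label
        return cross_entropy_label(dest_pred, source)
    if kind == "KL":      # L = KL(p_source || p_destination)
        return kl_divergence(source, dest_pred)
    if kind == "zero":    # L = 0: the edge transfers nothing
        return 0.0
    raise ValueError("unknown loss type: %r" % kind)


def node_loss(graph, preds, label, dest):
    """Sum the losses of all edges that point into node `dest` (assumed rule)."""
    total = 0.0
    for (src, dst), kind in graph.items():
        if dst != dest:
            continue
        source = label if src == LABEL else preds[src]
        total += edge_loss(kind, source, preds[dest])
    return total


# A KD-like graph: m2 teaches m1, and nothing flows from m1 back to m2.
kd_graph = {
    (LABEL, "m1"): "H",
    (LABEL, "m2"): "H",
    ("m2", "m1"): "KL",
    ("m1", "m2"): "zero",
}
```

Changing `("m1", "m2")` from `"zero"` to `"KL"` turns the same graph into a DML-like configuration, which is the sense in which one representation covers both methods.
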
  14. KD in knowledge transfer graph
    • Knowledge Distillation (KD)
      - One-directional knowledge transfer
    [Figure: knowledge transfer graph with nodes $m_1$, $m_2$, $m_3$ and target node $m_1$]
  15. DML in knowledge transfer graph
    • Deep Mutual Learning (DML)
      - Bidirectional knowledge transfer
    [Figure: knowledge transfer graph with nodes $m_1$, $m_2$, $m_3$ and target node $m_1$]
  16. Knowledge transfer graph
    • As a result of searching, the best combination of loss functions can represent a novel form of deep collaborative learning
    [Figure: knowledge transfer graph with nodes $m_1$, $m_2$, $m_3$ and target node $m_1$]
  17. Loss function of knowledge transfer graph
    • Knowledge transfer from node $m_2$ (source) to node $m_1$ (destination)
    [Figure: knowledge transfer graph with target node $m_1$; the edge $L_{2,1}$ from $m_2$ to $m_1$ is shown separately]
  18. Loss function of knowledge transfer graph
    • Knowledge transfer from node $m_2$ (source) to node $m_1$ (destination)
    [Figure: forward pass. The source $m_2$ outputs $p_2(c|x)$, the destination $m_1$ outputs $p_1(c|x)$, and the loss function computes $L_{2,1}(p_2, p_1)$]
  19. Loss function of knowledge transfer graph
    • Knowledge transfer from node $m_2$ (source) to node $m_1$ (destination)
    [Figure: backward pass. $L_{2,1}(p_2, p_1)$ is backpropagated into the destination $m_1$; the source $m_2$ is detached]
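
The effect of detaching the source can be checked numerically. The sketch below assumes the edge loss is $KL(p_2 \| p_1)$ with $p_1$ the softmax of the destination logits; in that case the gradient with respect to the destination logits is $p_1 - p_2$, while the detached source receives no gradient. The function names are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def transfer_gradients(source_logits, dest_logits):
    """Gradients of KL(p_src || p_dst) when the source is treated as a constant.

    Returns (gradient for source logits, gradient for destination logits).
    """
    p_src = softmax(source_logits)   # detached: used as a fixed target
    p_dst = softmax(dest_logits)
    grad_dst = [d - s for s, d in zip(p_src, p_dst)]
    grad_src = [0.0] * len(source_logits)  # no gradient flows into the source
    return grad_src, grad_dst
```

Only the destination node moves toward the source; the source is updated solely through its own incoming edges.
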
  20. Loss function of knowledge transfer graph
    • Knowledge transfer from node $m_2$ (source) to node $m_1$ (destination)
    [Figure: forward pass with a gate. The KL divergence between $p_2(c|x)$ and $p_1(c|x)$ passes through a gate to give $L_{2,1}(p_2, p_1)$]
  21. Gate functions for loss function
    • Gate functions control how the knowledge is transferred
    Gates: Through Gate, Cutoff Gate, Linear Gate, Correct Gate
    [Figure: forward pass from source $m_2$ to destination $m_1$; the KL divergence between $p_2(c|x)$ and $p_1(c|x)$ passes through a gate to give $L_{2,1}(p_2, p_1)$]
  22. Through Gate
    • Gate functions control how the knowledge is transferred
    Through Gate: $G(D_{KL}) = D_{KL}$. No change (passing through).
  23. Cutoff Gate
    • Cutoff Gate is a gate that always outputs zero
    Cutoff Gate: $G(D_{KL}) = 0$. Outputs zero (cutting the edge).
  24. Linear Gate
    • Linear Gate is a gate whose output increases gradually over time
    Linear Gate: $G(D_{KL}) = \frac{t}{t_{max}} \cdot D_{KL}$. Outputs gradually over time.
  25. Correct Gate
    • Passes through the input when the prediction of node $m_2$ (source) is correct
    Correct Gate: $G(D_{KL}) = \delta_{\hat{y}, y_{m_2}} \cdot D_{KL}$. Outputs only if the prediction is correct.
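
The four gates translate almost directly into code. The sketch below follows the formulas on the slides; the function signatures, and the use of an argmax over the source probabilities to obtain the source prediction $y_{m_2}$, are assumptions made for illustration.

```python
def through_gate(d_kl):
    """G(D_KL) = D_KL: pass the loss through unchanged."""
    return d_kl


def cutoff_gate(d_kl):
    """G(D_KL) = 0: cut the edge."""
    return 0.0


def linear_gate(d_kl, t, t_max):
    """G(D_KL) = (t / t_max) * D_KL: ramp the loss up over training time."""
    return (t / t_max) * d_kl


def correct_gate(d_kl, source_probs, label):
    """G(D_KL) = delta(label, source prediction) * D_KL.

    The loss passes only when the source node classifies the sample correctly.
    """
    predicted = max(range(len(source_probs)), key=source_probs.__getitem__)
    return d_kl if predicted == label else 0.0
```

For example, with a KL value of 0.8 at step 50 of 200, `linear_gate` passes a quarter of the loss, and `correct_gate` passes the full value only when the source's most probable class equals the label.
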
  26. Gate functions for loss function
    • Gate functions control how the knowledge is transferred
    Gates: Through Gate, Cutoff Gate, Linear Gate, Correct Gate
    [Figure: forward pass from source $m_2$ to destination $m_1$; the KL divergence between $p_2(c|x)$ and $p_1(c|x)$ passes through a gate to give $L_{2,1}(p_2, p_1)$]
  27. Optimizing knowledge transfer graph
    • Apply hyperparameter search to optimize the knowledge transfer graph
      - Optimization: Asynchronous Successive Halving Algorithm (ASHA)
      - Parameters: gate functions, auxiliary nodes
    Gate function: • Through Gate • Cutoff Gate • Linear Gate • Correct Gate
    Target node: • ResNet
    Auxiliary node: • ResNet • ResNet • WideResNet
    No. of combinations (nodes)
    [Figure: knowledge transfer graph with nodes $m_1$, $m_2$, $m_3$]
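
To make the search concrete, the sketch below samples candidate graphs (one gate per edge, one model per auxiliary node) and prunes them with plain synchronous successive halving, the building block that ASHA runs asynchronously. It is not ASHA itself and not the speaker's code: the function names, the choice to gate the label edges as well, and the `evaluate(candidate, budget)` callback are all assumptions, and the model names omit the variant suffixes.

```python
import random

GATES = ["through", "cutoff", "linear", "correct"]
AUX_MODELS = ["ResNet", "WideResNet"]  # architecture families named on the slide


def sample_graph(num_nodes, rng):
    """Draw one candidate: a model per auxiliary node and a gate per edge.

    Node m1 is the target; edges run from the label and from every other node.
    """
    nodes = ["m%d" % i for i in range(1, num_nodes + 1)]
    sources = ["label"] + nodes
    models = {n: rng.choice(AUX_MODELS) for n in nodes[1:]}
    gates = {(s, d): rng.choice(GATES)
             for d in nodes for s in sources if s != d}
    return {"models": models, "gates": gates}


def successive_halving(candidates, evaluate, min_budget=1, eta=2):
    """Keep the top 1/eta of candidates per rung, growing the budget by eta.

    `evaluate(candidate, budget)` is a user-supplied score (higher is better),
    for instance target-node accuracy after `budget` epochs.
    """
    budget = min_budget
    survivors = list(candidates)
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget),
                        reverse=True)
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]
```

In practice `evaluate` would train the whole graph for `budget` epochs and report the target node's validation accuracy, so weak gate combinations are discarded after only a short run.
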
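  28. Best knowledge transfer graph
    • Target node: ResNet
    • Vanilla model
    [Figure: best graph with a pretrained auxiliary node, a second auxiliary node, and the target node. The pretrained auxiliary node acts as a teacher network (Knowledge Distillation). Training proceeds as first KD, then DML.]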
  29. Best knowledge transfer graph
    • Target node: ResNet
    • Vanilla model
    [Figure: best graph with a pretrained auxiliary node (teacher network), a second auxiliary node, and the target node, here with the Deep Mutual Learning part labeled. Training proceeds as first KD, then DML.]
  30. Experimental results
    • Datasets: CIFAR
    • No. of nodes
    • Target node: ResNet
    Method         Top-1 accuracy [%]    Model of auxiliary node
    Independent                          -
    KD                                   ResNet (pretrained)
    DML                                  ResNet, ResNet
    Ours                                 ResNet (pretrained), ResNet
  31. Visualization of knowledge transfer graph (CIFAR)
    [Figure: six optimized graphs, each labeled with its number of nodes, compared against an independently trained ResNet]
  32. Deep ensemble learning using KTG [Okamoto, ECCV]
    • Extending the knowledge transfer graph for deep ensemble learning
    • Maximizing the accuracy of the ensemble node to train better ensemble networks
      - Attention loss: an attention loss is added to each edge
      - Ensemble node: a mechanism to ensemble the output of each node
    Edge loss: $L_{i,j} = KL(\boldsymbol{p}_i, \boldsymbol{p}_j) \pm L_{AT}(\boldsymbol{Q}_i, \boldsymbol{Q}_j)$, where $L_{AT}$ is the attention loss
    [Figure: nodes $m_1$, $m_2$, $m_3$ with label edges and node-to-node edges, plus an ensemble node $ens$ with label edge $L_{\hat{y},ens}$]
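  33. Attention loss
    • Attention loss from node $m_2$ to node $m_1$
      - Brings the attention maps of the two models closer together or pushes them apart
    Attention map: $\boldsymbol{Q} = \sum_{i=1}^{C} |\boldsymbol{A}_i|^p$, with $\boldsymbol{A}_i$: feature maps, $C$: channels, $p$: norm
    Attention loss: $L_{AT}(\boldsymbol{Q}_2, \boldsymbol{Q}_1) = \frac{1}{J} \sum_{j}^{J} \left\| \frac{\boldsymbol{Q}_2^j}{\|\boldsymbol{Q}_2^j\|_2} - \frac{\boldsymbol{Q}_1^j}{\|\boldsymbol{Q}_1^j\|_2} \right\|_2$
    Edge loss: $L_{2,1} = G(KL(\boldsymbol{p}_2, \boldsymbol{p}_1) \pm L_{AT}(\boldsymbol{Q}_2, \boldsymbol{Q}_1))$, with $+$ for bringing closer and $-$ for separating
    [Figure: forward pass from source $m_2$ to destination $m_1$ for input $x$; the KL divergence of $p_2(c|x)$ and $p_1(c|x)$ and the attention loss of the two attention maps are combined and passed through the gate]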
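
The attention map and attention loss can be sketched on flattened feature maps with the standard library alone. The assumptions here are that each feature map is a flat list of spatial activations, that $j$ indexes $J$ pairs of attention maps (for example one pair per layer), and that the names `attention_map`, `attention_loss`, and `edge_loss` are hypothetical; the gate $G$ would be applied to the value returned by `edge_loss`.

```python
import math


def attention_map(feature_maps, p=2):
    """Q = sum over channels of |A_i|^p, one value per spatial position."""
    length = len(feature_maps[0])
    return [sum(abs(channel[k]) ** p for channel in feature_maps)
            for k in range(length)]


def l2_normalize(vec, eps=1e-12):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def attention_loss(q_src_list, q_dst_list):
    """(1/J) * sum_j || Q_src^j / ||Q_src^j||_2 - Q_dst^j / ||Q_dst^j||_2 ||_2."""
    total = 0.0
    for q_src, q_dst in zip(q_src_list, q_dst_list):
        a, b = l2_normalize(q_src), l2_normalize(q_dst)
        total += math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return total / len(q_src_list)


def edge_loss(kl, l_at, bring_closer=True):
    """KL + L_AT pulls attention maps together; KL - L_AT pushes them apart."""
    return kl + l_at if bring_closer else kl - l_at
```

Because each map is normalized before comparison, two maps that differ only in scale give a loss of approximately zero, so the term compares where the models look rather than how strongly they respond.
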
  34. How to transfer knowledge from node $m_2$ to node $m_1$
    • Update node $m_1$ by backpropagation using the KL divergence and the attention loss
    Edge loss: $L_{2,1} = G(KL(\boldsymbol{p}_2, \boldsymbol{p}_1) \pm L_{AT}(\boldsymbol{Q}_2, \boldsymbol{Q}_1))$, with $+$ for bringing closer and $-$ for separating
    [Figure: backward pass. $L_{2,1}$ is backpropagated through the gate, the KL divergence, and the attention loss into the destination $m_1$; the source $m_2$ is detached]
  35. Ensemble node
    • Maximizing the accuracy of the ensemble node as a target node
      - The ensemble mechanism calculates the average output of all nodes
    Ensemble mechanism (as shown on the slide): $p(c|x) = p_1(c|x) + p_2(c|x) + p_3(c|x)$
    [Figure: nodes $m_1$, $m_2$, $m_3$ feed their outputs $p_1(c|x)$, $p_2(c|x)$, $p_3(c|x)$ into the ensemble node $ens$, which outputs $p(c|x)$ and has the label edge $L_{\hat{y},ens}$]
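
A minimal sketch of the ensemble mechanism follows, assuming the node outputs are averaged as the bullet describes (the slide's formula shows the unnormalized sum; dividing by the number of nodes does not change the predicted class). The function name is hypothetical.

```python
def ensemble_output(node_probs):
    """Average the class distributions produced by all nodes.

    node_probs is a list with one probability list per node.
    """
    num_nodes = len(node_probs)
    return [sum(per_class) / num_nodes for per_class in zip(*node_probs)]


def predicted_class(probs):
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)
```

With three nodes that output `[0.6, 0.4]`, `[0.2, 0.8]`, and `[0.1, 0.9]`, the ensemble distribution is approximately `[0.3, 0.7]`, so the ensemble predicts the second class even though the first node disagrees.
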
  36. Best graph for ensemble
    → Acquires different attention maps (diversity suitable for ensembles)
    [Figure: best graph for a given input; the edges are labeled "Bring closer to", "Bring closer to each other", or "Separate from"]
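  37. Knowledge distillation from ensemble models
    • KD from the ensemble model as teacher
      - Teacher network: ensemble model of the optimized graph (ResNet ×)
      - Student network: ResNet
    [Figure: for input $\boldsymbol{x}$, the networks $m_1$ to $m_3$ of the knowledge transfer graph output $\boldsymbol{l}_1(\boldsymbol{x})$ to $\boldsymbol{l}_3(\boldsymbol{x})$, which are combined into $\boldsymbol{l}_{ens}(\boldsymbol{x})$ and passed through a softmax to give the teacher output $\boldsymbol{p}_{ens}(\boldsymbol{x})$. The student network outputs $\boldsymbol{p}_s(\boldsymbol{x})$ and learns from the correct label $\hat{y}$ and from the knowledge transferred from the teacher.]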
  38. Conclusion: deep collaborative learning based on KD and DML
    • Knowledge Transfer Graph
      - A novel graph representation of knowledge transfer for deep collaborative learning
      - Four gate functions control how the knowledge is transferred
      - The best knowledge transfer graph is found by hyperparameter search
    • What we found with the Knowledge Transfer Graph for DCL
      - Fusion of KD and DML improves performance