Slide 1

Slide 1 text

What does CNN learn?
Tomoki Tanimura, B4, Jin Nakazawa Lab, SFC, Keio University

Slide 2

Slide 2 text

Read Papers
Understanding Deep Image Representations by Inverting Them
Aravindh Mahendran, Andrea Vedaldi
ImageNet-Trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix A. Wichmann, Wieland Brendel
Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet
Wieland Brendel, Matthias Bethge
Exploring the Origins and Prevalence of Texture Bias in Convolutional Neural Networks
Katherine L. Hermann, Simon Kornblith

Slide 3

Slide 3 text

Abstract
We tend to think that CNNs learn the global structure of an image, e.g. the global shape of the object or the structure of its edges.
But Geirhos et al. reveal that CNNs trained on ImageNet have a texture bias: they classify an image based on its texture.
So the performance of BagNet, which is restricted to seeing only local features, is competitive with deeper networks like ResNet. This empirically confirms that CNNs learn local features such as texture.
Hermann et al. empirically show that the CNN itself does not have an inductive bias for texture, and they find the origin of texture bias in CNNs.
The observed texture bias is attributed to preprocessing, tasks, and learning rates.

Slide 4

Slide 4 text

Preliminaries

Slide 5

Slide 5 text

What is CNN?
Convolutional Neural Network (CNN)
It was developed for image processing.
Let's try an NN! https://scs.ryerson.ca/~aharley/vis/conv/
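The core operation of a CNN layer can be sketched in a few lines. The following is a minimal pure-Python illustration, not code from the talk: the toy image and the `edge_kernel` filter are hypothetical, and real CNNs learn their filters from data rather than using hand-written ones.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation a convolution layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Slide the kernel over the image and sum the element-wise products.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical hand-written filter that responds to a dark-to-bright step.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(image, edge_kernel)
```

The resulting feature map is non-zero only where the kernel straddles the edge, which is the sense in which a convolutional filter detects a local pattern.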

Slide 6

Slide 6 text

Representative Models based on CNN
VGG, ResNet, NASNet, EfficientNet, AlexNet

Slide 7

Slide 7 text

The forward pass of CNN
Three-class classification task
Input Image
[ … ]
This image is a monkey
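The last step of such a forward pass can be sketched as follows, assuming the bracketed vector on the slide stands for per-class scores. The class names and logit values below are hypothetical, chosen only so that the third class wins.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["cat", "dog", "monkey"]  # hypothetical three classes
logits = [0.5, 1.0, 3.0]                # hypothetical network outputs
probs = softmax(logits)
predicted = class_names[probs.index(max(probs))]
```

The predicted class is simply the one with the highest probability.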

Slide 8

Slide 8 text

ImageNet Classification
…-class classification
SoTA: … (top-…)

Slide 9

Slide 9 text

What does CNN learn? [Understanding Deep Image Representations by Inverting Them]

Slide 10

Slide 10 text

CNNs learn photo-like features
Input image and features encoded by CNN
Features of each layer
Deeper layers: more abstract, capturing the general content of the class
shallow to deep

Slide 11

Slide 11 text

How do we check the learned features?
Original Image, Feature O

Slide 12

Slide 12 text

How do we check the learned features?
Original Image, Noise Image, Feature N, Feature O

Slide 13

Slide 13 text

How do we check the learned features?
Calculate and minimize the loss
Original Image, Noise Image, Feature N, Feature O

Slide 14

Slide 14 text

How do we check the learned features?
Calculate and minimize the loss
Update the noise image to make Feature N similar to Feature O
Original Image, Noise Image, Feature N, Feature O

Slide 15

Slide 15 text

How do we check the learned features?
Calculate and minimize the loss
Update the noise image iteratively
Original Image, Noise Image, Feature N, Feature O

Slide 16

Slide 16 text

How do we check the learned features?
The loss is small enough
The noise image has been updated many times
Original Image, Image representing Feature N, Feature N, Feature O
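The procedure built up over the last few slides can be sketched as a small gradient-descent loop. This is a toy illustration under loud assumptions: the "network" `phi` is a hypothetical fixed linear map standing in for a CNN layer, the "images" are 4-element lists, the step size is arbitrary, and no regularization term is used.

```python
import random

# Hypothetical stand-in for a CNN layer: a fixed linear map from 4 "pixels" to 2 features.
W = [
    [1.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]


def phi(x):
    """Encode an image into features."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def loss_and_grad(x, target):
    """Squared L2 distance between phi(x) and the target features, and its gradient w.r.t. x."""
    residual = [f - t for f, t in zip(phi(x), target)]
    loss = sum(r * r for r in residual)
    grad = [2.0 * sum(W[k][i] * residual[k] for k in range(len(W)))
            for i in range(len(x))]
    return loss, grad


original = [0.9, 0.1, 0.4, 0.6]      # hypothetical original image
feature_o = phi(original)            # Feature O

rng = random.Random(0)
noise = [rng.random() for _ in original]   # noise image
initial_loss, _ = loss_and_grad(noise, feature_o)

step_size = 0.1
for _ in range(200):
    # Compare Feature N with Feature O, then update the noise image itself.
    loss, grad = loss_and_grad(noise, feature_o)
    noise = [xi - step_size * g for xi, g in zip(noise, grad)]

final_loss, _ = loss_and_grad(noise, feature_o)
```

Note that only the image is updated; the feature extractor stays fixed. Because this toy `phi` discards information, the optimized image matches the original in feature space without necessarily matching it pixel by pixel, which is exactly what makes the result a picture of what the features encode.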

Slide 17

Slide 17 text

Formulation of Inverting Representations
This is the formulation of “Inverting Representations”.
x0 is the original input image.
x starts from a noise image, then changes into the image representing the feature through optimization.
ℓ( . , . ) is the loss function, such as the L2 norm.
Φ( . ) is the NN.
ℛ( . ) is the regularization term.
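With the symbols listed on this slide, the objective can be written roughly as below. This is a sketch of the general form rather than a copy of the slide's equation: the weight \lambda on the regularizer, the image dimensions H, W, C, and the shorthand \Phi_0 are notation assumed here.

```latex
x^{*} = \operatorname*{argmin}_{x \in \mathbb{R}^{H \times W \times C}}
        \; \ell\bigl(\Phi(x), \Phi_0\bigr) + \lambda \, \mathcal{R}(x),
\qquad \Phi_0 = \Phi(x_0),
\qquad \ell\bigl(\Phi(x), \Phi_0\bigr) = \lVert \Phi(x) - \Phi_0 \rVert_2^{2}
```

The first term pulls the features of x towards those of the original image, while the regularizer keeps x looking like a natural image.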

Slide 18

Slide 18 text

Deeper features have more information
The reconstruction of the …x… center patch of each feature map
Deep features have rich information at the center of the feature map

Slide 19

Slide 19 text

AlexNet has two types of features within itself
AlexNet has two pipelines in one network
Each pipeline learns different features even though there is no regularization
One for low-frequency content (e.g. color)
The other for high-frequency content (e.g. intensity)
One pipeline, Another pipeline

Slide 20

Slide 20 text

Summary of this paper
CNNs learn photo-like features
This is similar to humans
Learned features differ between layers
Shallow layers learn local features (e.g. color, texture, etc.)
Deep layers learn more global features (e.g. a general representation of the class image)

Slide 21

Slide 21 text

Are CNNs like the human eye? [ImageNet-Trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness]

Slide 22

Slide 22 text

CNNs have a texture bias
ImageNet-trained CNNs classify images based on texture rather than shape

Slide 23

Slide 23 text

CNNs are NOT the human eye
The information that is useful to a CNN is different from what is useful to a human
A CNN looks at texture rather than shape
A human looks at shape rather than texture

Slide 24

Slide 24 text

Shape or Texture Test
The shape and the texture of each image belong to different classes
Humans and CNNs assign an image to a single class
Humans, CNNs
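One way to turn this test into a number is a shape-bias score: among the images where the prediction matches either the shape class or the texture class, the fraction that match the shape. The sketch below assumes that definition; the labels and predictions are hypothetical toy values, not results from the paper.

```python
def shape_bias(predictions, shape_labels, texture_labels):
    """Fraction of shape decisions among all decisions that hit either cue."""
    shape_hits = 0
    texture_hits = 0
    for pred, shape, texture in zip(predictions, shape_labels, texture_labels):
        if pred == shape:
            shape_hits += 1
        elif pred == texture:
            texture_hits += 1
        # Predictions matching neither cue are ignored.
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no prediction matched a shape or texture label")
    return shape_hits / decided


# Hypothetical cue-conflict images: each has a shape class and a different texture class.
shape_labels = ["cat", "car", "bear", "clock"]
texture_labels = ["elephant", "dog", "bottle", "bird"]
predictions = ["elephant", "car", "bottle", "chair"]

bias = shape_bias(predictions, shape_labels, texture_labels)
```

A score near 1 means the classifier follows shape, and a score near 0 means it follows texture.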

Slide 25

Slide 25 text

Can we teach shape to a CNN?
Create Stylized-ImageNet
Shape is retained, but texture is corrupted
AdaIN style transfer is used to generate this dataset
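The statistic-matching step at the heart of AdaIN can be sketched as follows. This covers only the normalization itself, for a single channel given as a flat list; the encoder and decoder networks that a full style-transfer pipeline wraps around it are omitted, and the input values and `eps` are hypothetical.

```python
import statistics


def adain(content, style, eps=1e-5):
    """Adaptive instance normalization for one channel.

    Rescales the content activations so that their mean and standard
    deviation match those of the style activations.
    """
    mu_c = statistics.fmean(content)
    mu_s = statistics.fmean(style)
    sigma_c = statistics.pstdev(content)
    sigma_s = statistics.pstdev(style)
    return [sigma_s * (v - mu_c) / (sigma_c + eps) + mu_s for v in content]


content_channel = [1.0, 2.0, 3.0, 4.0]     # hypothetical content activations
style_channel = [10.0, 20.0, 30.0, 40.0]   # hypothetical style activations
stylized = adain(content_channel, style_channel)
```

The spatial arrangement of the content activations is kept while their statistics are replaced, which fits the slide's point that shape is retained and texture is changed.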

Slide 26

Slide 26 text

The results of training CNNs on Stylized-ImageNet
We can teach CNNs shape
Humans rely heavily on shape
SIN-trained CNNs rely on shape rather than texture
IN-trained CNNs rely on texture

Slide 27

Slide 27 text

Summary of this paper
Humans like shape, but CNNs like texture
Humans rely on shape when classifying images
CNNs rely on texture
We can teach shape to CNNs using a well-designed dataset
The paper proposes Stylized-ImageNet
The shape comes from the original class
The texture is unrelated to the original class (corrupted)

Slide 28

Slide 28 text

CNN: “No! It’s not my fault!” [Exploring the Origins and Prevalence of Texture Bias in Convolutional Neural Networks]

Slide 29

Slide 29 text

Texture bias is not the inductive bias of CNNs
CNNs can easily learn texture
The origin of texture bias
Dataset and objective function
Architecture of the model
Data augmentation
Hyperparameters

Slide 30

Slide 30 text

How to find the origin
Use these datasets
The Geirhos dataset is the one mainly used in the experiments

Slide 31

Slide 31 text

Is it difficult for CNNs to learn shape?
Classification task on those three datasets
CNNs can learn shape at least as readily as texture

Slide 32

Slide 32 text

To what extent are shape and texture represented in the CNN?
Linear classification for shape and texture on the Geirhos dataset
Input: features encoded by an ImageNet-trained CNN
ImageNet-trained CNNs have both shape and texture representations
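A linear classifier trained on frozen features (often called a linear probe) can be sketched as below. Everything here is a toy assumption: the two-dimensional "features" are hypothetical stand-ins for CNN activations, the labels are hypothetical binary shape labels, and plain logistic regression with batch gradient descent is just one simple choice of linear classifier.

```python
import math


def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Fit logistic regression on fixed features with batch gradient descent."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(dim):
                grad_w[i] += (p - y) * x[i] / n
            grad_b += (p - y) / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def probe_accuracy(weights, bias, features, labels):
    """Share of examples whose predicted label matches the true label."""
    correct = 0
    for x, y in zip(features, labels):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((z > 0) == bool(y))
    return correct / len(labels)


# Hypothetical frozen features; the CNN that produced them is never updated.
features = [[2.0, 0.1], [1.5, -0.3], [1.8, 0.4],
            [-1.7, 0.2], [-2.1, -0.1], [-1.4, 0.3]]
shape_labels = [1, 1, 1, 0, 0, 0]  # hypothetical binary shape labels

w, b = train_linear_probe(features, shape_labels)
acc = probe_accuracy(w, b, features, shape_labels)
```

If a simple linear model can read a property such as shape off the features, that property is represented in them. Running the same probe with texture labels tests texture in the same way.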

Slide 33

Slide 33 text

Does the objective affect shape bias?
Linear classification for shape and texture on the Geirhos dataset
Input: features encoded by CNNs trained on various objectives
Objectives: Rotation, Exemplar, BigBiGAN
Results: Rotation affects shape bias more

Slide 34

Slide 34 text

Does architecture affect shape bias?
Measure the shape bias of ImageNet-trained CNNs using the Geirhos dataset
Find the correlation between shape bias and ImageNet accuracy

Slide 35

Slide 35 text

Does the training process contribute to texture bias?
Yes, of course
Random-crop preprocessing biases models towards texture
When center crop is applied during training, CNNs have more shape bias than texture bias
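The difference between the two preprocessing choices can be sketched on a nested-list image. This is a simplified illustration: both crops use a fixed size, whereas random-crop augmentation in practice often also varies scale and aspect ratio, and the 6x6 image is a hypothetical toy input.

```python
import random


def center_crop(image, size):
    """Always cut the same central window, so the whole object tends to stay in view."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def random_crop(image, size, rng):
    """Cut a window at a random position, which may show only part of the object."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]


# Hypothetical 6x6 image whose pixel value encodes its position (row * 6 + column).
image = [[r * 6 + c for c in range(6)] for r in range(6)]

center = center_crop(image, 4)
patch = random_crop(image, 4, random.Random(0))
```

A plausible reading of the slide is that windows which cut off part of the object leave its outline incomplete while the local texture stays visible, so texture becomes the more reliable cue.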

Slide 36

Slide 36 text

How about hyperparameters?
Investigate how learning rate and weight decay relate to shape or texture bias
Larger lr: CNNs have more shape bias, and vice versa
Weight decay (wd) has a similar property to lr
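For reference, this is where the two hyperparameters enter a plain SGD update. It is a minimal sketch assuming weight decay implemented as an L2 penalty added to the gradient; the numbers are hypothetical and the function is illustrative, not the training code used in the paper.

```python
def sgd_step(weights, grads, lr, weight_decay):
    """One SGD update: w <- w - lr * (grad + weight_decay * w)."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


weights = [1.0, -2.0]   # hypothetical parameters
grads = [0.5, 0.0]      # hypothetical loss gradients
updated = sgd_step(weights, grads, lr=0.1, weight_decay=0.01)
```

The learning rate scales the whole step, while weight decay pulls every weight towards zero even when its loss gradient is zero, as with the second parameter here.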

Slide 37

Slide 37 text

Summary of this paper
Texture bias is not the inductive bias of CNNs
These factors influence texture bias, and we can adjust them:
Objective function
Architecture of the model
Data augmentation
Hyperparameters
There is a conundrum
Shape bias correlates with ImageNet accuracy
CNNs trained on ImageNet in the standard way have texture bias

Slide 38

Slide 38 text

As a result, what does CNN learn?

Slide 39

Slide 39 text

It depends on the training environment
Understanding Deep Image Representations by Inverting Them
CNNs learn photo-like features
Learned features differ between layers (shallow: local, deep: global)
ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
Humans like shape, but CNNs like texture
Exploring the Origins and Prevalence of Texture Bias in Convolutional Neural Networks
Texture bias is not the inductive bias of CNNs; training environments affect this bias
But there is a conundrum, and there are only empirical results based on the Geirhos dataset