Slide 1

Slide 1 text

AI Latest Papers Reading Group, November 2021. tech vein Inc., Mitsuhiro Inomata

Slide 2

Slide 2 text

Self-introduction: Mitsuhiro Inomata. Representative director and developer at both tech vein Inc. and DeepRad Inc. twitter: @ino2222

Slide 3

Slide 3 text

Introducing the Facebook group https://www.facebook.com/groups/…

Slide 4

Slide 4 text

Agenda
An introduction to arxiv.org papers from the past month, picked up from Archive Sanity (arxiv-sanity.com).
• The paper that caught my attention most
• Top 10 list of "top recent" papers
• Top 10 list of "top hype" papers

Slide 5

Slide 5 text

Archive Sanity? https://www.arxiv-sanity.com/top

Slide 6

Slide 6 text

Table of contents

Slide 7

Slide 7 text

Top10 Recent
1. ResNet strikes back: An improved training procedure in timm
2. Exploring the Limits of Large Scale Pre-training
3. Deep Neural Networks and Tabular Data: A Survey
4. Learning in High Dimension Always Amounts to Extrapolation
5. ADOP: Approximate Differentiable One-Pixel Point Rendering
6. Well-classified Examples are Underestimated in Classification with Deep Neural Networks
7. ByteTrack: Multi-Object Tracking by Associating Every Detection Box
8. MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer ← PickUp!
9. Fast Model Editing at Scale
10. Self-supervised Learning is More Robust to Dataset Imbalance

Slide 8

Slide 8 text

Top10 Hype
1. ADOP: Approximate Differentiable One-Pixel Point Rendering
2. Real numbers, data science and chaos: How to fit any dataset with a single parameter
3. Delphi: Towards Machine Ethics and Norms
4. Multitask Prompted Training Enables Zero-Shot Task Generalization
5. Nonnegative spatial factorization
6. Learning in High Dimension Always Amounts to Extrapolation
7. StyleAlign: Analysis and Applications of Aligned StyleGAN Models
8. Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit
9. ECQx: Explainability-Driven Quantization for Low-Bit and Sparse DNNs
10. Exploring the Limits of Large Scale Pre-training

Slide 9

Slide 9 text

Top10 Recent (green: CNN, red: Transformer)
1. ResNet strikes back: An improved training procedure in timm
2. Exploring the Limits of Large Scale Pre-training
3. Deep Neural Networks and Tabular Data: A Survey
4. Learning in High Dimension Always Amounts to Extrapolation
5. ADOP: Approximate Differentiable One-Pixel Point Rendering
6. Well-classified Examples are Underestimated in Classification with Deep Neural Networks
7. ByteTrack: Multi-Object Tracking by Associating Every Detection Box
8. MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer ← PickUp!
9. Fast Model Editing at Scale
10. Self-supervised Learning is More Robust to Dataset Imbalance

Slide 10

Slide 10 text

Top10 Hype (green: CNN, red: Transformer)
2. Real numbers, data science and chaos: How to fit any dataset with a single parameter
3. Delphi: Towards Machine Ethics and Norms
4. Multitask Prompted Training Enables Zero-Shot Task Generalization
5. Nonnegative spatial factorization
7. StyleAlign: Analysis and Applications of Aligned StyleGAN Models
8. Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit
9. ECQx: Explainability-Driven Quantization for Low-Bit and Sparse DNNs

Slide 11

Slide 11 text

Pickup paper

Slide 12

Slide 12 text

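8. MobileViT: A lightweight, general-purpose, mobile-friendly vision transformer
(Original title: MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer)
Lightweight convolutional neural networks (CNNs) are the de facto choice for mobile vision tasks. Their spatial inductive biases let them learn representations with fewer parameters across different vision tasks. However, these networks are spatially local. To learn global representations, self-attention-based vision transformers (ViTs) have been adopted. Unlike CNNs, ViTs are heavyweight. This paper asks whether the strengths of CNNs and ViTs can be combined to build a lightweight, low-latency network for mobile vision tasks. To that end it introduces MobileViT, a lightweight, general-purpose vision transformer for mobile devices. MobileViT offers a different perspective on global processing of information with transformers, namely transformers as convolutions. The results show that MobileViT significantly outperforms CNN-based and ViT-based networks across a range of tasks and datasets. On ImageNet-1k, MobileViT reaches 78.4% top-1 accuracy with about 6 million parameters, which is 3.2% and 6.2% more accurate than MobileNetv3 (CNN-based) and DeIT (ViT-based) at a similar parameter count. On the MS-COCO object detection task, MobileViT is 5.7% more accurate than MobileNetv3 at a similar parameter count.
http://arxiv.org/abs/2110.02178v1
• Purpose / result: development of MobileViT, a new model that makes ViT lightweight
• Method: uses the transformer as a convolution to learn global representations
• Names: MobileViT https://github.com/apple/ml-cvnets
• Author affiliation: Apple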

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

Reference: MobileNetV2
MobileViT is MobileNetV2 + Transformer https://www.researchgate.net/figure/The-architecture-of-the-MobileNetv2-network_fig…

Slide 15

Slide 15 text

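CNN + ViT together encode the whole image. The CNN first convolves the neighborhoods around the red and blue dots; the Transformer then lets the red dot attend to the blue dots, so in effect the whole image is seen.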

Slide 16

Slide 16 text

Lightweight, with performance above MobileNetv3 and DeIT

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

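An accurate model that is easy to run even on mobile devices (though slower than MobileNetv2)
• Mobile friendly: lightweight, general-purpose, (relatively) low latency, and looks easy to use.
• However, on a mobile device (= iPhone 12) inference was 8x slower than MobileNetv2.
→ The stated reasons: there are no dedicated CUDA kernels on mobile GPUs, and CNNs benefit from optimizations such as fusing batch normalization into convolutional layers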

Slide 19

Slide 19 text

GitHub https://github.com/apple/ml-cvnets

Slide 20

Slide 20 text

Top recent: Best10

Slide 21

Slide 21 text

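1. ResNet strikes back: an improved training procedure in timm
(Original title: ResNet strikes back: An improved training procedure in timm)
The influential Residual Networks designed by He et al. remain the gold-standard architecture in many scientific papers. They typically serve as the default architecture in studies, or as the baseline when a new architecture is proposed. Yet since the ResNet architecture appeared in 2015, best practices for training neural networks have advanced considerably. New optimization and data-augmentation methods have made training recipes more effective. This paper re-evaluates a vanilla ResNet-50 trained with a procedure that integrates these advances. The authors share competitive training settings and pre-trained models in the timm open-source library, hoping they will serve as better baselines for future work. For example, with their more demanding training setting, a vanilla ResNet-50 reaches 80.4% top-1 accuracy at 224x224 resolution on ImageNet-val without extra data or distillation. They also report the performance of popular models under their training procedure.
• Purpose: provide the best training procedure for ResNet without changing the architecture
• Result: model settings and pre-trained models provided in the timm library for PyTorch
• Method: hyperparameter tuning; moving away from cross-entropy loss
• Names: none
• Author affiliation: Facebook AI Research
http://arxiv.org/abs/2110.00476v1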

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

A1 to A3: differences in training cost

Slide 24

Slide 24 text

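2. Exploring the limits of large-scale pre-training
(Original title: Exploring the Limits of Large Scale Pre-training)
Recent developments in large-scale machine learning suggest that, by scaling up data, model size, and training time appropriately, improvements in pre-training transfer favorably to most downstream tasks. This work studies that phenomenon systematically and establishes that downstream performance saturates as upstream accuracy increases. Specifically, it runs more than 4800 experiments on Vision Transformers, MLP-Mixers, and ResNets with 10 million to 10 billion parameters, trained on the largest available image datasets (JFT, ImageNet21K) and evaluated on more than 20 downstream image recognition tasks. It proposes a model of downstream performance that reflects the saturation and captures the nonlinear relationship between upstream and downstream performance. Digging into why this happens, it shows that the observed saturation is closely tied to how representations evolve through the layers of the model. It also presents a more extreme case in which upstream and downstream performance are at odds: to improve downstream performance, upstream accuracy has to be lowered.
http://arxiv.org/abs/2110.02095v1
• Purpose: a systematic study of few-shot learning for image recognition models
• Result: guidance on how to design general-purpose pre-trained models for few-shot learning
• Method: for ViT, MLP-Mixer, and ResNet, examine how upstream and downstream saturation correlate, broken down by task, parameter setting, and model size
• Names: none
• Author affiliation: Google Research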

Slide 25

Slide 25 text

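Pre-training (upstream) accuracy vs. transfer-learning (downstream) 1-shot/25-shot accuracy
For some tasks, downstream accuracy saturates more readily as upstream accuracy rises (stalling even at acc 0.2 to 0.5)
→ The relationship between the upstream task and the downstream task matters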

Slide 26

Slide 26 text

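Summary
• Presents design guidelines and improvements for building pre-trained models aimed at general-purpose few-shot learning tasks.
• Do not push pre-training accuracy too high; lower the learning rate
• Simply scaling up does not necessarily solve everything
• Rather than specializing for one downstream task, make design choices that improve performance across a broad range of downstream tasks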

Slide 27

Slide 27 text

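3. Deep neural networks and tabular data: a survey
(Original title: Deep Neural Networks and Tabular Data: A Survey)
Heterogeneous tabular data is the most commonly used form of data and is essential to many important and computationally demanding applications. On homogeneous datasets, deep neural networks have repeatedly shown excellent performance and have therefore been widely adopted. Applying them to modeling tabular data (inference or generation), however, remains very difficult. This work gives an overview of state-of-the-art deep learning methods for tabular data. It first sorts them into three groups: data transformations, specialized architectures, and regularization models. It then gives a comprehensive overview of the main approaches in each group. A discussion of deep learning approaches for generating tabular data is complemented by strategies for explaining deep models on tabular data. The main contribution is to cover the main research streams and existing methodologies in this area and to highlight related challenges and open research questions. To the authors' knowledge, this is the first in-depth look at deep learning approaches for tabular data. The paper should be a valuable starting point and guide for researchers and practitioners interested in deep learning with tabular data.
http://arxiv.org/abs/2110.01889v1
• Purpose: a review paper giving an overview of state-of-the-art deep learning methods for tabular data
• Result: summarizes the main research streams and existing methodologies and models, and identifies open research questions
• Method: sorts the methods for tabular datasets into groups and summarizes the approaches in each group
• Names: none
• Author affiliation: University of Tübingen (Germany), SCHUFA Holding AG (Germany)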

Slide 28

Slide 28 text

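4. Learning in high dimension always amounts to extrapolation
(Original title: Learning in High Dimension Always Amounts to Extrapolation)
The notions of interpolation and extrapolation are fundamental in fields ranging from deep learning to function approximation. Interpolation occurs for a sample x when it falls inside or on the boundary of the convex hull of the given dataset. Extrapolation occurs when x falls outside that convex hull. One fundamental (mis)conception is that state-of-the-art algorithms work so well because of their ability to correctly interpolate the training data. A second misconception is that interpolation happens throughout tasks and datasets; in fact, many intuitions and theories rely on that assumption. The authors argue against both points empirically and theoretically, and demonstrate that on any high-dimensional (>100) dataset, interpolation almost surely never happens. These results call into question the validity of the current definition of interpolation/extrapolation as an indicator of generalization performance.
http://arxiv.org/abs/2110.09485v1
• Purpose: show theoretically and empirically that interpolation does not occur on datasets in high-dimensional spaces (100 dimensions or more).
• Result: shows that, to keep new samples in the interpolation regime, the dataset size must grow exponentially with the data dimension, and rejects the existing indicator.
• Method: computes, both in theory and on real datasets, how much data is needed to interpolate as the dimension grows
• Names:
• Author affiliation: Facebook AI Research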

Slide 29

Slide 29 text

Reference: extrapolation and interpolation https://atmarkit.itmedia.co.jp/ait/articles/…/…/news….html

Slide 30

Slide 30 text

As the data dimension grows, the fraction of samples that the prepared dataset can interpolate shrinks exponentially
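The shrinking interpolation fraction can be illustrated with a small Monte Carlo sketch. This is not the paper's procedure: testing convex-hull membership needs a linear program, so the sketch below tests membership in the axis-aligned bounding box of the training set instead. The bounding box contains the convex hull, so the measured fraction is only an upper bound on the true interpolation fraction. The function name, the uniform data, and the sample sizes are all assumptions made for illustration.

```python
import random


def fraction_inside_bounding_box(dim, n_train=50, n_trials=200, seed=0):
    """Estimate how often a new uniform sample in [0, 1]^dim falls inside the
    axis-aligned bounding box of n_train uniform training samples.

    The bounding box contains the convex hull, so this is an upper bound on
    the fraction of samples that would count as "interpolation".
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        train = [[rng.random() for _ in range(dim)] for _ in range(n_train)]
        query = [rng.random() for _ in range(dim)]
        inside = all(
            min(p[k] for p in train) <= query[k] <= max(p[k] for p in train)
            for k in range(dim)
        )
        hits += inside
    return hits / n_trials


# Even this generous upper bound drops quickly as the dimension grows.
for dim in (2, 10, 50):
    print(dim, fraction_inside_bounding_box(dim))
```

With a fixed number of training samples, each extra dimension is one more coordinate in which the new sample can fall outside the observed range, which is why the fraction decays roughly geometrically in the dimension.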

Slide 31

Slide 31 text

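5. ADOP: Approximate differentiable one-pixel point rendering
(Original title: ADOP: Approximate Differentiable One-Pixel Point Rendering)
This work introduces a new point-based, differentiable neural rendering pipeline for scene refinement and novel view synthesis. The input is an initial estimate of the point cloud and the camera parameters. The output is images synthesized from arbitrary camera poses. The point cloud is rendered by a differentiable renderer using multi-resolution one-pixel point rasterization. Spatial gradients of the discrete rasterization are approximated with a new concept called ghost geometry. After rendering, the neural image pyramid is passed to a deep neural network for shading calculation and hole filling. A differentiable, physically based tonemapper then converts the intermediate output into the target image. Because every stage of the pipeline is differentiable, all scene parameters are optimized (camera model, camera pose, point position, point color, environment map, rendering network weights, vignetting, camera response function, per-image exposure, per-image white balance). Because the initial reconstruction is refined during training, the system synthesizes sharper and more consistent novel views than existing approaches. Efficient one-pixel point rasterization also allows arbitrary camera models and real-time display of scenes with more than 100M points.
http://arxiv.org/abs/2110.06635v2
• Purpose: introduce a differentiable video rendering pipeline that takes camera images and a point cloud as input
• Result: released a model and tool that render video in real time from camera images (LiDAR etc.) of an arbitrary subject
• Method: process images and point clouds with 3D position information in 3D, interpolate, and reconstruct to HDR using Exif information
• Names: ADOP https://github.com/darglein/ADOP
• Author affiliation: University of Erlangen-Nuremberg (Germany)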

Slide 32

Slide 32 text

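From images + 3D position information, generate video (novel frames) with adjustable exposure and white balance
Input: several camera images + 3D position information
Output: novel frames (video)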

Slide 33

Slide 33 text

A demo video and application are published on GitHub
https://github.com/darglein/ADOP

Slide 34

Slide 34 text

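1. Mapping, rasterization, and interpolation of the point cloud (LiDAR images etc.)
2. HDR image rendering
3. Tone correction and compositing into natural-looking colors (conversion to LDR)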

Slide 35

Slide 35 text

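6. Well-classified examples are underestimated in classification with deep neural networks
(Original title: Well-classified Examples are Underestimated in Classification with Deep Neural Networks)
The conventional wisdom in training deep classification models is to focus on badly classified examples and ignore well-classified examples that lie far from the decision boundary. For example, when training with cross-entropy loss, examples with higher likelihood (that is, well-classified examples) contribute smaller gradients in backpropagation. The authors show theoretically that this common practice hinders representation learning, energy optimization, and margin growth. To counteract this deficiency, they propose giving well-classified examples an additive bonus to revive their contribution to learning. This counterexample theoretically addresses all three issues. The paper supports the claim empirically, either by directly verifying the theoretical results or through large performance gains from the counterexample, on diverse tasks including image classification, graph classification, and machine translation. Because the idea addresses these three issues, the paper also shows that it can handle complex scenarios such as imbalanced classification, OOD detection, and applications under adversarial attack. Code: https://github.com/lancopku/well-classified-examples-are-underestimated
http://arxiv.org/abs/2110.06537v2
• Purpose / result: proposes a new loss function that improves on cross-entropy loss
• Method: identifies the problems of cross-entropy (CE) loss under backpropagation
• Names: Encouraging Loss (EL)
• Author affiliation: Peking University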

Slide 36

Slide 36 text

EL loss = CE loss + additional bonus

Slide 37

Slide 37 text

The size of the additional bonus is tuned with LE = 0 to 1
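The relation "EL = CE + bonus" with an LE knob between 0 and 1 can be sketched as follows. The slides do not give the bonus formula, so this sketch assumes the bonus is log(1 - p) for predicted probability p up to LE and continues linearly (along the tangent at LE) beyond it. That form, the function names, and the default values are assumptions for illustration; the paper and its official repository are the place to check the exact definition.

```python
import math

EPS = 1e-12  # keeps the logarithms finite at p = 0 or p = 1


def cross_entropy(p):
    """CE loss for the probability p assigned to the correct class."""
    return -math.log(max(p, EPS))


def bonus(p, le=0.75):
    """Assumed additive bonus (a negative term that rewards confident, correct predictions).

    For p <= le it is log(1 - p); above le it follows the tangent line at le,
    so the reward stays bounded as p approaches 1.
    """
    if p <= le:
        return math.log(max(1.0 - p, EPS))
    slope = -1.0 / max(1.0 - le, EPS)
    return math.log(max(1.0 - le, EPS)) + slope * (p - le)


def encouraging_loss(p, le=0.75):
    """EL = CE + bonus, under the assumptions stated above."""
    return cross_entropy(p) + bonus(p, le)


for p in (0.5, 0.9, 0.99):
    print(p, cross_entropy(p), encouraging_loss(p))
```

Under this assumed form, a well-classified example (p close to 1) still produces a sizable gradient, because the bonus term keeps a nonzero slope where the CE term has flattened out.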

Slide 38

Slide 38 text

CrossEntropy(CE) vs Encouraging(EL)

Slide 39

Slide 39 text

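7. ByteTrack: Multi-object tracking by associating every detection box
(Original title: ByteTrack: Multi-Object Tracking by Associating Every Detection Box)
Multi-object tracking (MOT) aims to estimate the bounding boxes and identities of objects in video. Most methods obtain identities by associating detection boxes whose scores exceed a threshold. Objects with low detection scores (for example, occluded objects) are simply discarded, which causes non-negligible misses of true objects and fragmented trajectories. To solve this, the authors present BYTE, a simple, effective, and generic association method. BYTE tracks by associating every detection box, not only the high-score ones. For low-score detection boxes, it uses their similarity to tracklets to recover true objects and filter out background detections. Applied to 9 different state-of-the-art trackers, BYTE consistently improves IDF1 by 1 to 10 points. To push the state of the art in MOT, the authors design a simple and strong tracker named ByteTrack. It achieves MOTA 80.3, IDF1 77.3, and HOTA 63.1 on the MOT17 test set at 30 FPS on a V100 GPU. Source code, pre-trained models, deployment versions, and tutorials for applying it to other trackers are available at https://github.com/ifzhang/ByteTrack
http://arxiv.org/abs/2110.06864v2
• Purpose / result: achieved SoTA with the multi-object tracking algorithm BYTE
• Method: object position estimation with a Kalman filter
• Names: ByteTrack https://github.com/ifzhang/ByteTrack
• Author affiliation: Huazhong University of Science and Technology, The University of Hong Kong, ByteDance (the company behind TikTok)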

Slide 40

Slide 40 text

Published on GitHub https://github.com/ifzhang/ByteTrack

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

a: Object detection with an existing method such as YOLOX (t1 to t3 are consecutive frames of the video)

Slide 43

Slide 43 text

b: For objects (tracklets) tied to high-score boxes, predict the next-frame position with a Kalman filter (score candidates by IoU and pick the closest match)

Slide 44

Slide 44 text

c: From the low-score boxes, infer and match the tracklets that could not be detected in (b)
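Steps (b) and (c) can be sketched as a two-stage association. This is a simplified illustration, not the ByteTrack implementation: it uses greedy IoU matching where a real tracker would typically use an optimal assignment, it takes the Kalman-predicted track boxes as given, and the thresholds and the dictionary layout of detections are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, iou_threshold):
    """Pair track ids with detection indices, highest IoU first."""
    candidates = sorted(
        ((iou(box, det["box"]), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        key=lambda c: c[0],
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for score, tid, j in candidates:
        if score < iou_threshold:
            break
        if tid in used_tracks or j in used_dets:
            continue
        matches[tid] = j
        used_tracks.add(tid)
        used_dets.add(j)
    return matches


def byte_associate(predicted_tracks, detections,
                   high_score=0.6, low_score=0.1, iou_threshold=0.3):
    """Two-stage association: high-score boxes first, then low-score boxes
    for the tracks that are still unmatched."""
    high = [d for d in detections if d["score"] >= high_score]
    low = [d for d in detections if low_score <= d["score"] < high_score]

    first = greedy_match(predicted_tracks, high, iou_threshold)
    remaining = {t: b for t, b in predicted_tracks.items() if t not in first}
    second = greedy_match(remaining, low, iou_threshold)

    assigned = {tid: high[j] for tid, j in first.items()}
    assigned.update({tid: low[j] for tid, j in second.items()})
    matched_high = set(first.values())
    new_candidates = [d for j, d in enumerate(high) if j not in matched_high]
    return assigned, new_candidates


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
dets = [
    {"box": (1, 1, 11, 11), "score": 0.9},    # clear detection
    {"box": (21, 21, 31, 31), "score": 0.3},  # e.g. a partly occluded object
    {"box": (50, 50, 60, 60), "score": 0.05}, # too weak, ignored
]
assigned, new_candidates = byte_associate(tracks, dets)
print(assigned, new_candidates)
```

The low-score box is only ever matched against an existing track, never used to start a new one, which is how background detections get filtered out in this sketch.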

Slide 45

Slide 45 text

Kalman filter: a filter that estimates the true state from multiple noisy sources of information (examples: rocket state estimation, autonomous driving control)
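As a minimal sketch of the idea, here is a one-dimensional Kalman filter for a quantity assumed to stay roughly constant. The noise variances, the initial state, and the constant-state model are assumptions chosen for the example; a tracker such as the one above would use a multi-dimensional state (position, size, velocity) instead.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=0.25,
              x0=0.0, p0=1.0):
    """Filter noisy scalar measurements of a (nearly) constant state.

    x is the state estimate and p its variance. Each step first predicts
    (uncertainty grows by process_var), then blends in the new measurement
    with a gain that depends on how uncertain the estimate currently is.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var              # predict: state unchanged, variance grows
        gain = p / (p + measurement_var) # how much to trust the measurement
        x = x + gain * (z - x)           # update the estimate toward z
        p = (1.0 - gain) * p             # variance shrinks after the update
        estimates.append(x)
    return estimates


# Measurements that jump between 4.5 and 5.5 around a true value of 5.0.
noisy = [5.5 if i % 2 == 0 else 4.5 for i in range(100)]
smoothed = kalman_1d(noisy)
print(noisy[-1], smoothed[-1])
```

The gain starts high, so the first measurements move the estimate quickly, and then settles to a small value, so later measurements only nudge it.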

Slide 46

Slide 46 text

Achieves SoTA on MOTA, IDF1, and FPS

Slide 47

Slide 47 text

8. MobileViT: A lightweight, general-purpose, mobile-friendly vision transformer
(Original title: MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer)
Covered as the Pickup paper
http://arxiv.org/abs/2110.02178v1

Slide 48

Slide 48 text

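9. Fast model editing at scale
(Original title: Fast Model Editing at Scale)
Large pre-trained models achieve impressive results on a variety of downstream tasks, but existing large models still make errors, and even accurate predictions can become outdated over time. Because it is impossible to detect all such failures at training time, it is desirable for both developers and end users of these models to be able to correct inaccurate outputs while leaving the model otherwise intact. However, the representations learned by large neural networks are distributed and black-box in nature, which makes such targeted edits difficult. Given only a single problematic input and a new desired output, fine-tuning approaches tend to overfit. Other editing algorithms are either computationally infeasible or simply ineffective when applied to very large models. To make post-hoc editing of large models easy, the authors propose MEND (Model Editor Networks with Gradient Decomposition). MEND learns to transform the gradient obtained by standard fine-tuning, using a low-rank decomposition of the gradient to make the parameterization of this transformation tractable. MEND can be trained on a single GPU in under a day even for models with more than 10 billion parameters. Once trained, MEND can apply new edits to the pre-trained model. Experiments with T5, GPT, BERT, and BART models show that MEND is the only model editing method that produces effective edits for models with tens of millions to more than 10 billion parameters. Implementation: https://sites.google.com/view/mend-editing
http://arxiv.org/abs/2110.11309v1
• Purpose: address the problems of retraining when applying follow-up corrections to large Transformer models
• Result: proposes MEND, a model that fine-tunes quickly and without overfitting
• Method: treats the model editing problem itself as a learning problem
• Names: MEND https://sites.google.com/view/mend-editing
• Author affiliation: Stanford University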

Slide 49

Slide 49 text

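Answers change as time passes → retraining is needed
• Who is the Prime Minister of the UK?
✗ 76th Prime Minister (Theresa May)
○ 77th Prime Minister (Boris Johnson)
https://sites.google.com/view/mend-editing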

Slide 50

Slide 50 text

Answers change as time passes → partial retraining is needed (= model editing) https://sites.google.com/view/mend-editing

Slide 51

Slide 51 text

Even after retraining, it is important that unrelated questions are not affected (example: which sports team does Messi play for?)

Slide 52

Slide 52 text

Training the model with MEND enables partial edits (retraining) → no overfitting, and fast

Slide 53

Slide 53 text

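10. Self-supervised learning is more robust to dataset imbalance
(Original title: Self-supervised Learning is More Robust to Dataset Imbalance)
Self-supervised learning (SSL) learns without labels, which makes it a scalable way to learn general visual representations. However, large unlabeled datasets often have long-tailed label distributions, and little is known about how SSL behaves there. This work systematically investigates self-supervised learning under dataset imbalance. First, extensive experiments show that off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations. The performance gap between balanced and imbalanced pre-training with SSL is significantly smaller than the gap with supervised learning, regardless of sample size, for in-domain and especially out-of-domain evaluation. Second, to understand this robustness, the authors hypothesize that SSL learns richer features from frequent data: it may learn features that are irrelevant to the labels but transferable, which help classify rare classes and downstream tasks. In contrast, supervised learning has no incentive to learn label-irrelevant features from frequent examples. The hypothesis is validated with semi-synthetic experiments and theoretical analysis in a simplified setting. Third, inspired by the theoretical insights, the authors devise a re-weighted regularization technique. It consistently improves the quality of SSL representations on imbalanced datasets under several evaluation criteria, closing the small gap between balanced and imbalanced datasets with the same number of examples.
http://arxiv.org/abs/2110.05025v1
• Purpose: systematically investigate the representation quality of self-supervised learning (SSL) under class imbalance
• Result: building on the finding that SSL is robust to dataset imbalance, devises rwSAM
• Names: Reweighted SAM (rwSAM)
• Author affiliation: Stanford University, Toyota Research Institute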

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

Top hype: Best10

Slide 57

Slide 57 text

1. ADOP: Approximate Differentiable One-Pixel Point Rendering
Duplicate (see Top recent no. 5)
http://arxiv.org/abs/2110.06635v2

Slide 58

Slide 58 text

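2. Real numbers, data science and chaos: how to fit any dataset with a single parameter
(Original title: Real numbers, data science and chaos: How to fit any dataset with a single parameter)
The paper shows that data of any modality (time series, images, sound, and so on) can be approximated by a well-behaved scalar function (continuous, differentiable, ...) with a single real-valued parameter. Building on elementary concepts from chaos theory, it takes a pedagogical approach, showing how to adjust this parameter to fit every sample of the data to arbitrary precision. Aimed at curious data scientists, it extends earlier similar observations about the expressive power and generalization of machine learning models.
http://arxiv.org/abs/1904.12320v1
• Purpose: show that every sample of an arbitrary dataset X can be reproduced by a simple differentiable equation
• Result: a simple, differentiable formulation that can generate any audio or visual data from a single real-valued parameter
• Method: generalization of audio and image data based on chaos theory
• Names: Single Parameter Fit https://github.com/Ranlot/single-parameter-fit
• Author affiliation: SAP Labs (research arm of the German software company)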
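The core trick, packing a whole dataset into the digits of one real number and reading samples back by shifting those digits, can be sketched with exact rational arithmetic. This is a simplification made for illustration: it shows only the bit-shift (doubling map) decoding and leaves out the smooth trigonometric wrapper the paper builds around it. The function names and the 8-bit precision are assumptions.

```python
from fractions import Fraction


def encode(values, bits=8):
    """Pack values in [0, 1) into a single rational parameter alpha in [0, 1).

    Each value is quantized to `bits` binary digits and the digit blocks are
    concatenated after the binary point of alpha.
    """
    alpha = Fraction(0)
    for k, v in enumerate(values):
        q = min(int(v * 2 ** bits), 2 ** bits - 1)
        alpha += Fraction(q, 2 ** (bits * (k + 1)))
    return alpha


def decode(alpha, k, bits=8):
    """Recover the k-th value by shifting alpha left by k * bits binary digits
    and keeping the first `bits` digits of the fractional part."""
    shifted = alpha * 2 ** (bits * k)
    frac = shifted % 1
    return int(frac * 2 ** bits) / 2 ** bits


data = [0.1, 0.5, 0.9, 0.25]
alpha = encode(data)
recovered = [decode(alpha, k) for k in range(len(data))]
print(alpha, recovered)
```

The single parameter needs as many digits as the dataset has bits, which is the point of the paper's argument: counting parameters says little about model complexity when one real number can carry unlimited information.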

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

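3. Delphi: Towards machine ethics and norms
(Original title: Delphi: Towards Machine Ethics and Norms)
What would it take to teach a machine to behave ethically? Broad ethical rules may be easy to state ("thou shalt not kill"), but applying them to real situations is far more complex. For example, "helping a friend" is generally good, but "helping a friend spread fake news" is not. The authors identify four fundamental challenges for machine ethics and norms: (1) understanding moral precepts and social norms; (2) the ability to perceive real-world situations visually or by reading natural-language descriptions; (3) commonsense reasoning to anticipate the outcomes of alternative actions in different contexts; (4) most importantly, the ability to make ethical judgments given the interplay of competing values and their grounding in different contexts (for example, the right to freedom of expression vs. preventing the spread of fake news). The paper begins to address these questions within the deep learning paradigm. The prototype model, Delphi, shows strong promise for language-based commonsense moral reasoning, with up to 92.1% accuracy as vetted by humans. This contrasts with GPT-3's zero-shot performance of 52.3%, and suggests that massive scale alone does not give pre-trained neural language models human values. The authors therefore present Commonsense Norm Bank, a moral textbook customized for machines, which contains 1.7 million examples of people's ethical judgments across a wide range of everyday situations. Beyond new resources and baseline performance for future research, the study offers new insights leading to several important open research questions, such as distinguishing universal human values from personal values, modeling different moral frameworks, and explainable, consistent approaches to machine ethics.
http://arxiv.org/abs/2110.07574v1
• Purpose: moral and ethical learning and reasoning
• Result: released Commonsense Norm Bank, a moral textbook customized for machines
• Method:
• Names: prototype model Delphi, dataset Commonsense Norm Bank
• Author affiliation: University of Washington, Allen Institute for AI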

Slide 63

Slide 63 text

Delphi: a model that judges whether an action is good or bad, taking ethics into account

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

Demo site: https://delphi.allenai.org/

Slide 66

Slide 66 text

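4. Multitask prompted training enables zero-shot task generalization
(Original title: Multitask Prompted Training Enables Zero-Shot Task Generalization)
Large language models have recently been shown to reach reasonable zero-shot generalization on a diverse set of tasks. It has been hypothesized that this is a consequence of implicit multitask learning during language model training. Can zero-shot generalization instead be induced directly by explicit multitask learning? To test this at scale, the authors develop a system for easily mapping general natural-language tasks into a human-readable prompted form. They convert a large set of supervised datasets, each with multiple prompts using varied natural language. These prompted datasets make it possible to benchmark a model's ability to perform completely unseen tasks specified in natural language. A pre-trained encoder-decoder model is fine-tuned on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models 16x its size. It also outperforms models 6x its size on a subset of tasks from the BIG-Bench benchmark. All prompts and trained models are available at github.com/bigscience-workshop/promptsource/
http://arxiv.org/abs/2110.08207v1
• Purpose: study why large language models can generalize zero-shot
• Result: released tools and trained models that give language models strong zero-shot generalization
• Method: multitask prompted training
• Names: T0 model based on T5, PromptSource github.com/bigscience-workshop/promptsource
• Author affiliation: Hugging Face 🤗, Brown University, and others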

Slide 67

Slide 67 text

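Multitask Prompt Training: by building question and answer texts in many different formats and training on them, the model becomes stronger at zero-shot tasks posed in unseen formats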
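The idea of turning one labeled example into several differently worded question/answer pairs can be sketched like this. The templates, field names, and function name below are hypothetical and written only for illustration; they are not taken from PromptSource and do not use its API.

```python
# Hypothetical templates: each is (input pattern, target pattern).
TEMPLATES = [
    ("Premise: {premise}\nHypothesis: {hypothesis}\n"
     "Does the premise entail the hypothesis?", "{label}"),
    ("{premise}\nQuestion: is it true that \"{hypothesis}\"? Answer yes or no.",
     "{label}"),
    ("Suppose \"{premise}\". Can we conclude \"{hypothesis}\"?", "{label}"),
]


def to_prompted_examples(example, templates=TEMPLATES):
    """Render one labeled example into one (input text, target text) pair per template."""
    return [
        (inp.format(**example), tgt.format(**example))
        for inp, tgt in templates
    ]


example = {
    "premise": "A dog is running in the park.",
    "hypothesis": "An animal is outside.",
    "label": "yes",
}
pairs = to_prompted_examples(example)
for text, target in pairs:
    print(text, "->", target)
```

Training on many datasets rendered this way exposes the model to the same task in varied wordings, which is the mechanism the slide credits for better zero-shot behavior on unseen formats.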

Slide 68

Slide 68 text

T0: based on Google's T5 model https://github.com/google-research/text-to-text-transfer-transformer

Slide 69

Slide 69 text

T0 (10B parameters) reached higher accuracy than GPT3 (175B parameters).

Slide 70

Slide 70 text

PromptSource: a tool for creating prompt-format datasets https://github.com/bigscience-workshop/promptsource/

Slide 71

Slide 71 text

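5. Nonnegative spatial factorization
(Original title: Nonnegative spatial factorization)
Gaussian processes are widely used to analyze spatial data because of their nonparametric flexibility and their ability to quantify uncertainty, and recently developed scalable approximations have made them easy to apply to massive datasets. For multivariate outcomes, linear models of coregionalization combine dimension reduction with spatial correlation. However, unlike nonnegative models, they do not recover a parts-based representation, so their real-valued latent factors and loadings are hard to interpret. This work proposes nonnegative spatial factorization (NSF), a spatially aware probabilistic dimension reduction model. NSF is compared with real-valued spatial factorizations such as MEFISTO and with nonspatial dimension reduction methods, using simulations and high-dimensional spatial transcriptomics data. NSF identifies generalizable spatial patterns of gene expression. Because not all gene expression patterns are spatial, the authors also propose a hybrid extension of NSF that combines spatial and nonspatial components, which makes it possible to quantify spatial importance for both observations and features. A TensorFlow implementation of NSF is available at https://github.com/willtownes/nsf-paper
http://arxiv.org/abs/2110.06122v1
• Purpose: measuring spatial gene expression in the study of biological tissue
• Result: identifies generalizable spatial patterns of gene expression
• Method: devises a spatially aware probabilistic dimension reduction model for data analysis with Gaussian processes
• Names: NSF (Nonnegative Spatial Factorization), NSFH (NSF Hybrid)
• Author affiliation: Princeton University, Gladstone Institutes (San Francisco)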

Slide 72

Slide 72 text

Visium spatial gene expression analysis of the mouse brain

Slide 73

Slide 73 text

6. Learning in high dimension always amounts to extrapolation
(Original title: Learning in High Dimension Always Amounts to Extrapolation)
Duplicate (see Top recent no. 4)
http://arxiv.org/abs/2110.09485v1

Slide 74

Slide 74 text

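7. StyleAlign: Analysis and applications of aligned StyleGAN models
(Original title: StyleAlign: Analysis and Applications of Aligned StyleGAN Models)
This paper studies in depth the properties and applications of aligned generative models. Two models are called aligned when they share the same architecture and one (the child) is obtained from the other (the parent) by fine-tuning to another domain. Several works already use basic properties of aligned StyleGAN models for image-to-image translation. Here the authors carry out the first detailed investigation of model alignment, focusing on StyleGAN. First, they analyze aligned models empirically and answer important questions about their nature. In particular, they find that the child model's latent spaces are semantically aligned with those of the parent, inheriting remarkably rich semantics even for distant data domains such as human faces and churches. Second, building on this understanding, they use aligned models to solve a variety of tasks. In addition to image translation, they demonstrate fully automatic cross-domain image morphing. They further show that zero-shot vision tasks can be performed in the child domain while relying only on supervision in the parent domain. They show qualitatively and quantitatively that the approach yields state-of-the-art results with only simple fine-tuning and inversion.
http://arxiv.org/abs/2110.11323v1
• Purpose: analysis and applications of StyleGAN
• Result: improved the performance of cross-domain transfer learning with StyleGAN
• Method: analyzes the latent space when StyleGAN is transferred to a completely different domain
• Names: StyleAlign
• Author affiliation: Hebrew University, Tel Aviv University, Adobe Research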

Slide 75

Slide 75 text

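A StyleGAN2 trained on the FFHQ dataset is fine-tuned on Mega and Dog, then some of the weights are reset to the FFHQ (parent) weights to see how the results change
• Resetting to the parent weights has little effect = those weights have barely changed from the parent
• Resetting to the parent weights has a large effect = those weights have changed a lot from the parent (especially when training across domains, human → dog)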

Slide 76

Slide 76 text

Even when the domain changes a lot, as between human faces and churches, the mapping and affine layers remain similar

Slide 77

Slide 77 text

Transfer learning from FFHQ to MetFaces or Mega: semantic controls from the parent model carry over to the child model (= the latent spaces W and S have changed little)

Slide 78

Slide 78 text

StyleAlign = StyleGAN2 or StyleGAN2-ADA adapted by fine-tuning

Slide 79

Slide 79 text

StyleAlign vs Others

Slide 80

Slide 80 text

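8. Deep learning tools for Audacity: helping researchers expand the artist's toolkit
(Original title: Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit)
The authors present a software framework that integrates neural networks into Audacity, the popular open-source audio editing software, with minimal developer effort. The paper shows several example use cases for both end users and neural network developers. The authors hope this work fosters a new level of interaction between deep learning practitioners and end users.
http://arxiv.org/abs/2110.13323v1
• Purpose / result: add neural network support to the audio editing software Audacity
• Method: supports models published on the online site HuggingFace
• Names: Audacity Digital Audio Workstation
• Author affiliation: Northwestern University, Audacity Team (the audio editing software's development team)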

Slide 81

Slide 81 text

With the audio editing software Audacity Digital Audio Workstation, models published on HuggingFace can now be run on a local PC to process audio

Slide 82

Slide 82 text

Any model can be used to process audio or to convert it into labels and the like

Slide 83

Slide 83 text

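9. ECQx: Explainability-driven quantization for low-bit and sparse DNNs
(Original title: ECQx: Explainability-Driven Quantization for Low-Bit and Sparse DNNs)
The remarkable success of deep neural networks (DNNs) in various applications has come with a large increase in network parameters and arithmetic operations. This growth in memory and compute makes deep learning prohibitive on resource-constrained hardware platforms such as mobile devices. Recent efforts to reduce these overheads while preserving model performance as far as possible include parameter reduction techniques, parameter quantization, and lossless compression techniques. This chapter develops and describes a new quantization paradigm for DNNs. The method draws on concepts from explainable AI (XAI) and information theory: instead of assigning weight values based only on their distance to the quantization clusters, it additionally considers weight relevances obtained from Layer-wise Relevance Propagation (LRP) and the information content of the clusters (entropy optimization). The ultimate goal is to preserve the most relevant weights in the quantization clusters with the highest information content. Experiments show that this new Entropy-Constrained and XAI-adjusted Quantization (ECQx) method produces ultra-low-precision (2 to 5 bit) and simultaneously sparse neural networks while maintaining or even improving model performance. Thanks to the low parameter precision and the large number of zero elements, the networks are also highly compressible in file size, up to 103x compared with the full-precision unquantized DNN model. The approach was evaluated on different types of models and datasets (including Google Speech Commands and CIFAR-10) and compared with previous work.
http://arxiv.org/abs/2109.04236v1
• Purpose: quantize a model to compress its size while maintaining its performance
• Result: tested on VGG16, achieving up to 103x compression
• Method: uses concepts from explainable AI (XAI) and information theory
• Names: ECQ, ECQx
• Author affiliation: Fraunhofer HHI (Germany), BIFOLD (Germany)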

Slide 84

Slide 84 text

Quantization results for VGG16 and a custom MLP: a VGG16 model quantized to 4 bits with ECQx reached a 102.59x compression ratio with an accuracy change of -0.1

Slide 85

Slide 85 text

10. Exploring the limits of large-scale pre-training
(Original title: Exploring the Limits of Large Scale Pre-training)
Duplicate (see Top recent no. 2)
http://arxiv.org/abs/2110.02095v1

Slide 86

Slide 86 text

DeepL Translator (deepl.com) https://www.deepl.com/en/translator