
AI Latest Papers Reading Group, December 2020

M.Inomata
December 02, 2020


Transcript

  1. Self-introduction: Mitsuhiro Inomata, CEO and developer, tech vein Inc.

    twitter: @ino2222 https://www.techvein.com
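  2. Top recent ⑤ Teaching with Commentaries

    Training deep neural networks effectively is difficult, and many open questions remain about how best to train these models. Recently developed methods for improving neural network training examine teaching: providing learned information during the training process to improve downstream model performance. This paper takes a step toward widening the scope of teaching. It proposes a flexible teaching framework based on commentaries, meta-learned information that helps training on a particular task or dataset. It presents an efficient, scalable, gradient-based method for learning commentaries that builds on recent work on implicit differentiation. It explores a range of applications, from learning weights for individual training examples, to parameterizing label-dependent data augmentation policies, to representing attention masks that highlight salient image regions. In these settings, the authors find that commentaries can improve training speed and performance and provide fundamental insights into the dataset and the training process.
    http://arxiv.org/abs/2011.03037v1 Google Research / MIT / University of Toronto
    → Devised and validated a general-purpose algorithm for building a model (a commentary model) that processes the training data to assist learning.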
  3. Top10 Recent

    1. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
    2. Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth
    3. RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder
    4. Intriguing Properties of Contrastive Losses
    5. Teaching with Commentaries
    6. A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges
    7. Learning Invariances in Neural Networks
    8. Underspecification Presents Challenges for Credibility in Modern Machine Learning
    9. Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
    10. Training Generative Adversarial Networks by Solving Ordinary Differential Equations
  4. Top10 Hype

    1. Fourier Neural Operator for Parametric Partial Differential Equations
    2. Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth
    3. Viewmaker Networks: Learning Views for Unsupervised Representation Learning
    4. Large-scale multilingual audio visual dubbing
    5. Text-to-Image Generation Grounded by Fine-Grained User Attention
    6. Self Normalizing Flows
    7. An Attack on InstaHide: Is Private Learning Possible with Instance Encoding?
    8. Hyperparameter Ensembles for Robustness and Uncertainty Quantification
    9. The geometry of integration in text classification RNNs
    10. Scaling Laws for Autoregressive Generative Modeling
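  5. ① An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    The Transformer architecture has become the de facto standard for natural language processing tasks, but its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks or used to replace certain components of convolutional networks while keeping their overall structure in place. The authors show that this reliance on CNNs is not necessary: a pure Transformer applied directly to sequences of image patches performs very well on image classification tasks. When pre-trained on large amounts of data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), Vision Transformer (ViT) attains excellent results compared with state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
    http://arxiv.org/abs/2010.11929v1 (overlaps with a previous month) Google Research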
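The phrase "sequences of image patches" in the ViT abstract can be made concrete with a small sketch. It assumes a single-channel image stored as a list of rows, and the helper name `split_into_patches` is hypothetical (it is not taken from the paper or any library). The learned linear projection, position embeddings, and class token that ViT applies afterwards are omitted.

```python
def split_into_patches(image, patch_size):
    """Cut a 2D image (list of rows) into flattened, non-overlapping patches.

    Patches are returned in row-major order, so the result can be read as
    the token sequence that a Transformer would consume.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            patches.append(patch)
    return patches


if __name__ == "__main__":
    toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
    for patch in split_into_patches(toy_image, 2):
        print(patch)
```

With 16×16 patches, a 224×224 image would give (224 / 16)² = 196 patches per channel, which is where the "16x16 words" in the title comes from.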
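  6. ② Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

    A key factor in the success of deep neural networks is the ability to scale models to improve performance by varying the depth and width of the architecture. This simple property of neural network design has produced highly effective architectures for a variety of tasks. Nevertheless, understanding of how depth and width affect the learned representations is limited. This paper studies that fundamental question. It begins by investigating how varying depth and width affects a model's hidden representations, and finds a characteristic block structure in the hidden representations of larger-capacity (wider or deeper) models. The authors demonstrate that this block structure arises when model capacity is large relative to the size of the training set, and that it indicates the underlying layers are preserving and propagating the dominant principal component of their representations. This finding has important consequences for the features learned by different models: representations outside the block structure are often similar across architectures of different width and depth, but the block structure is unique to each model. Analyzing the output predictions of different architectures, the authors find that even when overall accuracy is similar, wide and deep models show distinctive error patterns and variations across classes.
    http://arxiv.org/abs/2010.15327v1 Google Research
    → Analyzed models of differing width and depth and examined the characteristics of each.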
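  7. ③ RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder

    Existing object detection frameworks are usually built on a single format of object/part representation: anchor/proposal rectangle boxes in RetinaNet and Faster R-CNN, center points in FCOS and RepPoints, and corner points in CornerNet. These different representations usually help a framework in different respects, such as better classification or finer localization, but because feature extraction differs across representations (heterogeneous or non-grid), it is generally difficult to combine them in one framework and make good use of each one's strengths. This paper presents an attention-based decoder module, similar to the one in the Transformer, that bridges other representations into a typical object detector built on a single representation format, end to end. The other representations act as a set of key instances that strengthen the main query representation features of the vanilla detector. For efficient computation of the decoder module, the paper proposes new techniques including a key sampling approach and a shared location embedding approach. The proposed module is named bridging visual representations (BVR). It can be applied in place, and the authors demonstrate its broad effectiveness in bridging other representations into prevalent object detection frameworks such as RetinaNet, Faster R-CNN, FCOS, and ATSS, with improvements of about 1.5 to 3.0 AP. In particular, it improves a state-of-the-art framework with a strong backbone by about 2.0 AP, reaching 52.7 AP on COCO test-dev. The resulting network is named RelationNet++. The code is to be released at https://github.com/microsoft/RelationNet2.
    http://arxiv.org/abs/2010.15831v1 Microsoft Research
    → Introduced a decoder module called BVR and was able to improve detection models such as RelationNet.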
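  8. ④ Intriguing Properties of Contrastive Losses

    Contrastive loss and its variants have recently become very popular for learning visual representations without supervision. This work first generalizes the standard cross-entropy-based contrastive loss to a broader family of losses that share the abstract form $\mathcal{L}_{\text{alignment}} + \lambda \mathcal{L}_{\text{distribution}}$, where hidden representations are encouraged to (1) be aligned under some transformations/augmentations and (2) match a high-entropy prior distribution. The authors show that various instantiations of the generalized loss behave similarly in the presence of a multi-layer non-linear projection head, and that the temperature scaling ($\tau$) widely used in the standard contrastive loss is inversely related to the weighting ($\lambda$) between the two loss terms. They then study an intriguing phenomenon: feature suppression among competing features shared across augmented views, such as "color distribution" versus "object class". They construct datasets with explicit, controllable competing features and show that, in contrastive learning, a few bits of easy-to-learn shared features can suppress, and even fully prevent, the learning of other competing features. Interestingly, this property is much less harmful in autoencoders based on a reconstruction loss. Existing contrastive learning methods rely critically on data augmentation to favor certain sets of features over others, whereas one might wish for a network to learn all competing features as far as its capacity allows.
    http://arxiv.org/abs/2011.02803v1 Google Research
    → A study of the properties of contrastive learning. Trained on datasets that combine two competing factors and examined how they interfere.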
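As a rough illustration of the "standard cross-entropy-based contrastive loss" and its temperature $\tau$, the sketch below computes the loss for one anchor against one positive and several negatives using cosine similarity. The function names are hypothetical, and this is a generic InfoNCE-style form, not the generalized $\mathcal{L}_{\text{alignment}} + \lambda \mathcal{L}_{\text{distribution}}$ family studied in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Cross-entropy over similarities, with the positive as the target class.

    loss = -log( exp(s_pos / t) / sum_j exp(s_j / t) )
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]

    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    anchor, positive = [1.0, 0.0], [1.0, 0.0]
    negatives = [[0.0, 1.0], [-1.0, 0.0]]
    for t in (1.0, 0.5, 0.1):
        print(t, contrastive_loss(anchor, positive, negatives, temperature=t))
```

When the positive is already well aligned with the anchor, lowering the temperature sharpens the softmax and drives the loss toward zero; when every candidate is equally similar, the loss equals the log of the number of candidates.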
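  9. ⑤ Teaching with Commentaries

    Training deep neural networks effectively is difficult, and many open questions remain about how best to train these models. Recently developed methods for improving neural network training examine teaching: providing learned information during the training process to improve downstream model performance. This paper takes a step toward widening the scope of teaching. It proposes a flexible teaching framework based on commentaries, meta-learned information that helps training on a particular task or dataset. It presents an efficient, scalable, gradient-based method for learning commentaries that builds on recent work on implicit differentiation. It explores a range of applications, from learning weights for individual training examples, to parameterizing label-dependent data augmentation policies, to representing attention masks that highlight salient image regions. In these settings, the authors find that commentaries can improve training speed and performance and provide fundamental insights into the dataset and the training process.
    http://arxiv.org/abs/2011.03037v1 Google Research / MIT / University of Toronto
    → Devised and validated a general-purpose algorithm for building a model (a commentary model) that processes the training data to assist learning.
    Pick-up paper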
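  10. ⑥ A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges

    Uncertainty quantification (UQ) plays a pivotal role in reducing uncertainty in both optimization and decision-making processes. It can be applied to a variety of real-world problems in science and engineering. Bayesian approximation and ensemble learning are the most widely used UQ techniques in the literature. Researchers have proposed many UQ methods and examined their performance in applications such as computer vision (e.g., self-driving cars and object detection), image processing (e.g., image restoration), medical image analysis (e.g., medical image classification and segmentation), natural language processing (e.g., text classification, social media texts, and recidivism risk scoring), and bioinformatics. This study reviews recent advances in UQ methods used in deep learning. It also surveys the application of these methods in reinforcement learning (RL), then outlines several important applications of UQ methods. Finally, it briefly highlights the fundamental research challenges UQ methods face and discusses future research directions in the field.
    http://arxiv.org/abs/2011.06225v3 IEEE
    → A comprehensive review paper from IEEE on UQ methods across deep learning in general.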
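  11. ⑦ Learning Invariances in Neural Networks

    Invariance to translations gives convolutional neural networks powerful generalization properties. However, we often cannot know in advance which invariances are present in the data, or to what extent a model should be invariant to a given symmetry group. The authors show how to learn invariances and equivariances by parameterizing a distribution over augmentations and optimizing the training loss simultaneously with respect to the network parameters and the augmentation parameters. With this simple procedure, and using training data alone, they recover the correct set and extent of invariances for image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
    http://arxiv.org/abs/2010.11882v1 New York University
    → Built a general-purpose framework for deciding the range of augmentations.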
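  12. ⑧ Underspecification Presents Challenges for Credibility in Modern Machine Learning

    ML models often behave unexpectedly badly when deployed in the real world. The authors identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in ML pipelines such as those based on deep learning. Predictors returned by underspecified pipelines are often treated as equivalent on the basis of their training-domain performance, but the authors show that such predictors can behave very differently in deployment domains. This ambiguity can lead to instability and poor model behavior in practice, and it is a failure mode distinct from problems that arise from structural mismatch between training and deployment domains. Using examples from computer vision, medical imaging, natural language processing, clinical risk prediction based on electronic health records, and medical genomics, the authors show that the problem appears in a wide variety of practical ML pipelines. Their results show that underspecification needs to be accounted for explicitly in any modeling pipeline intended for real-world deployment, in any domain.
    http://arxiv.org/abs/2011.03395v1 Google
    → Introduces cases where applying ML models in the real world causes trouble (slightly weighted toward medical examples). The takeaway: for practical use, it is very important to train and use models with an understanding of what they have learned and what they have not.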
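  13. ⑨ Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian

    Over the past 10 years, a single algorithm has changed many aspects of our lives. In an era of ever-decreasing loss functions, SGD and its many descendants have become the go-to optimization tool in machine learning and a key component in the success of deep neural networks (DNNs). SGD is guaranteed (under loose assumptions) to converge to a local optimum, but in some cases it matters which local optimum is found, and this often depends on context. Examples arise frequently in machine learning, from shape-versus-texture features to ensemble methods and zero-shot coordination. In these settings, SGD on "standard" loss functions converges to "easy" solutions, so there are solutions it will not find. This paper proposes a different approach. Rather than following the gradient, which corresponds to a locally greedy direction, it follows the eigenvectors of the Hessian. By iteratively following ridges and branching among them, the method traverses the loss surface effectively and finds qualitatively different solutions. The authors show both theoretically and experimentally that their method, called Ridge Rider (RR), offers a promising direction for a variety of challenging problems.
    http://arxiv.org/abs/2011.06505v1 University of Oxford / Google Research
    → Created a new optimization algorithm, Ridge Rider (RR).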
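  14. ⑩ Training Generative Adversarial Networks by Solving Ordinary Differential Equations

    The instability of Generative Adversarial Network (GAN) training has often been attributed to gradient descent. As a result, recent methods aim to tailor models and training procedures to stabilize the discrete updates. In contrast, the authors study the continuous-time dynamics induced by GAN training. Both theory and experiments suggest that these dynamics are in fact surprisingly stable. From this perspective, the authors hypothesize that instabilities in GAN training arise from the integration error incurred when discretizing the continuous-time dynamics. They verify experimentally that well-known ODE solvers (such as Runge-Kutta) can stabilize training when combined with a regularizer that controls the integration error. The approach departs fundamentally from earlier methods, which typically use adaptive optimization and stabilization techniques that constrain the function space (e.g., spectral normalization). Evaluation on CIFAR-10 and ImageNet shows that the method outperforms several strong baselines, demonstrating its effectiveness.
    http://arxiv.org/abs/2010.15040v1 DeepMind
    → Framing GAN training as solving an ordinary differential equation (ODE) improved convergence and gave better performance overall.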
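The integration-error hypothesis can be illustrated on a toy two-player game rather than a real GAN. The sketch below assumes the bilinear objective f(x, y) = x · y, with one player descending in x and the other ascending in y; the continuous-time dynamics then rotate around the equilibrium (0, 0) at a constant distance. The game, step size, and function names are illustrative assumptions, and the paper's integration-error regularizer is not included.

```python
import math


def vector_field(state):
    """Simultaneous gradient descent on x and ascent on y for f(x, y) = x * y."""
    x, y = state
    return (-y, x)


def euler_step(state, h):
    """One explicit Euler step, the analogue of a plain gradient update."""
    dx, dy = vector_field(state)
    return (state[0] + h * dx, state[1] + h * dy)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step."""
    def nudge(k, scale):
        return (state[0] + scale * k[0], state[1] + scale * k[1])

    k1 = vector_field(state)
    k2 = vector_field(nudge(k1, h / 2))
    k3 = vector_field(nudge(k2, h / 2))
    k4 = vector_field(nudge(k3, h))
    return tuple(
        state[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
        for i in range(2)
    )


def integrate(step, state, h, steps):
    """Apply the given one-step integrator repeatedly."""
    for _ in range(steps):
        state = step(state, h)
    return state


def distance_from_equilibrium(state):
    return math.hypot(state[0], state[1])


if __name__ == "__main__":
    start = (1.0, 0.0)
    print("Euler:", distance_from_equilibrium(integrate(euler_step, start, 0.1, 100)))
    print("RK4:  ", distance_from_equilibrium(integrate(rk4_step, start, 0.1, 100)))
```

On this toy problem the Euler iterate spirals away from the equilibrium because each step multiplies the distance by sqrt(1 + h²), while the RK4 iterate stays very close to the original circle. That is the sense in which the instability comes from the discretization and not from the underlying dynamics.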
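  15. ① Fourier Neural Operator for Parametric Partial Differential Equations

    The classical development of neural networks has focused mainly on learning mappings between finite-dimensional Euclidean spaces. Recently this has been generalized to neural operators, which learn mappings between function spaces. For partial differential equations (PDEs), neural operators directly learn the mapping from any functional parametric dependence to the solution. They therefore learn an entire family of PDEs, in contrast to classical methods that solve one instance of the equation. This work formulates a new neural operator by parameterizing the integral kernel directly in Fourier space, which yields an expressive and efficient architecture. Experiments are performed on Burgers' equation, Darcy flow, and the Navier-Stokes equation (including the turbulent regime). The Fourier neural operator shows state-of-the-art performance compared with existing neural network methods and is up to three orders of magnitude faster than traditional PDE solvers.
    http://arxiv.org/abs/2010.08895v1 (overlaps with a previous month) Caltech / Purdue University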
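A minimal one-dimensional sketch of "parameterizing the integral kernel directly in Fourier space": transform the input signal, multiply a few low-frequency modes by complex weights, zero the rest, and transform back. In the actual model the weights would be learned and the layer would be combined with a pointwise linear path and a nonlinearity; here the weights are fixed by hand, a naive DFT stands in for an FFT, and the name `spectral_conv_1d` is a hypothetical one chosen for this example.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (O(n^2)); fine for a small demo."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_conv_1d(signal, weights):
    """Scale the lowest len(weights) Fourier modes and drop all higher modes.

    weights[k] multiplies mode k; the matching negative frequency gets the
    complex conjugate so that a real input gives a real output.
    """
    n = len(signal)
    if len(weights) > n // 2 + 1:
        raise ValueError("too many modes for this signal length")

    spectrum = dft(signal)
    filtered = [0j] * n
    for k, weight in enumerate(weights):
        filtered[k] = weight * spectrum[k]
        if 0 < k < n - k:
            filtered[n - k] = filtered[k].conjugate()
    return [value.real for value in idft(filtered)]


if __name__ == "__main__":
    n = 8
    signal = [
        math.sin(2 * math.pi * t / n) + math.cos(2 * math.pi * 3 * t / n)
        for t in range(n)
    ]
    # Keep modes 0 and 1 unchanged; the frequency-3 component is removed.
    print([round(v, 6) for v in spectral_conv_1d(signal, [1.0, 1.0])])
```

Because the weights act per frequency mode and not per grid point, the same set of weights can be applied to signals sampled at different resolutions, which is one reason this parameterization suits function-space mappings.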
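  16. ② Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

    A key factor in the success of deep neural networks is the ability to scale models to improve performance by varying the depth and width of the architecture. This simple property of neural network design has produced highly effective architectures for a variety of tasks. Nevertheless, understanding of how depth and width affect the learned representations is limited. This paper studies that fundamental question. It begins by investigating how varying depth and width affects a model's hidden representations, and finds a characteristic block structure in the hidden representations of larger-capacity (wider or deeper) models. The authors demonstrate that this block structure arises when model capacity is large relative to the size of the training set, and that it indicates the underlying layers are preserving and propagating the dominant principal component of their representations. This finding has important consequences for the features learned by different models: representations outside the block structure are often similar across architectures of different width and depth, but the block structure is unique to each model. Analyzing the output predictions of different architectures, the authors find that even when overall accuracy is similar, wide and deep models show distinctive error patterns and variations across classes.
    http://arxiv.org/abs/2010.15327v1 (overlaps with Recent) Google Research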
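  17. ③ Viewmaker Networks: Learning Views for Unsupervised Representation Learning

    Many recent methods for unsupervised representation learning train models to be invariant to different "views" (transformed versions of an input). However, designing these views requires considerable expertise and experimentation, which hinders the wide adoption of unsupervised representation learning across domains and modalities. To address this, the authors propose viewmaker networks. The network is trained jointly with an encoder network and generates adversarial $\ell_p$ perturbations of the input. Applied to CIFAR-10, the learned views achieve transfer accuracy comparable to the well-studied augmentations used in the SimCLR model. The views significantly outperform baseline augmentations in the speech (+9% absolute) and wearable sensor (+17% absolute) domains. The authors also show how combining viewmaker views with handcrafted views improves robustness to common image corruptions. The method demonstrates that learned views are a promising way to reduce the expertise and effort needed for unsupervised learning, and could extend its benefits to a much wider set of domains.
    http://arxiv.org/abs/2010.07432v1 Stanford University
    → Developed a "Viewmaker model" that automatically creates views for contrastive learning without expert knowledge; it proved effective in the speech and sensor domains.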
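  18. ④ Large-scale multilingual audio visual dubbing

    The paper describes a large-scale audiovisual translation and dubbing system that translates videos from one language into another. The speech content in the source language is transcribed to text, translated, and automatically synthesized into target-language speech using the original speaker's voice. The visual content is translated by synthesizing the speaker's lip movements to match the translated audio, creating a seamless audiovisual experience in the target language. The audio and visual translation subsystems each contain a large-scale generic synthesis model trained on thousands of hours of data from the corresponding domain. These generic models are fine-tuned to a specific speaker before translation, either using an auxiliary corpus of data from the target speaker or using the video to be translated itself as input to the fine-tuning process. The report gives an overview of the architecture of the full system and a detailed description of the video dubbing component. The role of the audio and text components within the full system is outlined, but their design is not discussed in detail. Translated and dubbed demo videos produced with the system can be viewed at https://www.youtube.com/playlist?list=PLSi232j2ZA6_1Exhof5vndzyfbxAhhEs5.
    http://arxiv.org/abs/2011.03530v1 DeepMind / Google
    → Built a model that generates videos automatically translated into multiple languages from a video with audio. It translates the spoken audio into another language while also naturally synthesizing the speaker's mouth movements.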
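  19. ⑤ Text-to-Image Generation Grounded by Fine-Grained User Attention

    Localized Narratives is a dataset that pairs detailed natural language descriptions of images with mouse traces, providing sparse, fine-grained visual grounding for phrases. The authors propose TReCS, a sequential model that uses this grounding to generate images. TReCS uses the descriptions to retrieve segmentation masks and predicts object labels aligned with the mouse traces. These alignments are used to select and position masks to produce a fully covered segmentation canvas, and the final image is produced from this canvas by a segmentation-to-image generator. This multi-step, retrieval-based approach outperforms existing direct text-to-image generation models on both automatic metrics and human evaluation: the generated images are overall more photo-realistic and match the descriptions better.
    http://arxiv.org/abs/2011.03775v1 Google Research
    → A model that realizes text-to-image with a new approach. In addition to natural-language text and images, it was trained in combination with the user's mouse movements.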
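  20. ⑥ Self Normalizing Flows

    Efficient gradient computation of the Jacobian determinant term is a core problem of the normalizing flow framework. Most proposed flow models are therefore restricted either to a class of functions whose Jacobian determinant is easy to evaluate or to an efficient estimator of it. These restrictions limit the performance of such density models, which often need considerable depth to reach the desired performance level. This work proposes Self Normalizing Flows, a flexible framework for training normalizing flows that replaces the expensive terms in the gradient with approximate inverses learned at each layer. This reduces the computational complexity of each layer's exact update from $\mathcal{O}(D^3)$ to $\mathcal{O}(D^2)$, which enables training of flow architectures that were otherwise computationally infeasible while still providing efficient sampling. The authors show experimentally that such models are very stable and optimize to data likelihood values similar to those of their exact-gradient counterparts, while outperforming their functionally constrained counterparts.
    http://arxiv.org/abs/2011.07248v1 UvA-Bosch Delta Lab / University of Amsterdam
    → Have not read it. A paper proposing a framework that improves the computational efficiency of gradient computation.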
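  21. ⑦ An Attack on InstaHide: Is Private Learning Possible with Instance Encoding?

    A learning algorithm is considered private if the model it produces does not reveal (too much) about its training set. InstaHide [Huang, Song, Li, Arora, ICML'20] is a recent proposal that claims to preserve privacy through an encoding mechanism that modifies the inputs before they are processed by a normal learner. The authors present a reconstruction attack on InstaHide that uses the encoded images to recover visually recognizable versions of the original images. The attack is effective and efficient, and it empirically breaks InstaHide on CIFAR-10, CIFAR-100, and the recently released InstaHide Challenge. The authors further formalize various notions of privacy for learning through instance encoding and investigate whether these notions can be achieved. They prove barriers against achieving (indistinguishability-based notions of) privacy with learning protocols that use instance encoding.
    http://arxiv.org/abs/2011.05315v1 Google / UC Berkeley and others
    → InstaHide, proposed for privacy protection, has already been broken.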
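  22. ⑧ Hyperparameter Ensembles for Robustness and Uncertainty Quantification

    Ensembles over neural network weights trained from different random initializations, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. This paper designs ensembles not only over weights but also over hyperparameters, improving the state of the art in both settings. For the best performance independent of budget, the authors propose hyper-deep ensembles. Their strong performance highlights the benefit of combining models that are diverse in both weights and hyperparameters. The authors further propose a parameter-efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of the method are notably lower than those of typical ensembles. On image classification with MLP, LeNet, ResNet 20, and Wide ResNet 28-10 architectures, the method improves on both deep ensembles and batch ensembles.
    http://arxiv.org/abs/2006.13570v2 Google Research
    → Improvements to ensembling.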
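  23. ⑨ The geometry of integration in text classification RNNs

    Although recurrent neural networks (RNNs) are widely applied to a variety of tasks, there is no unified understanding of how RNNs solve these tasks. In particular, it is unclear what dynamical patterns arise in trained RNNs and how those patterns depend on the training dataset or task. This work addresses these questions in the context of a specific natural language processing task: text classification. Using tools from dynamical systems analysis, the authors study recurrent networks trained on both natural and synthetic text classification tasks. They find that the dynamics of these trained RNNs are interpretable and low-dimensional. Specifically, regardless of architecture or dataset, RNNs accumulate evidence for each class as they process the text, using a low-dimensional attractor manifold as the underlying mechanism. Moreover, the dimensionality and geometry of the attractor manifold are determined by the structure of the training dataset; in particular, the authors describe how simple word-count statistics computed on the training dataset can be used to predict these properties. The observations span multiple architectures and datasets and reflect a common mechanism that RNNs use to perform text classification. To the degree that integrating evidence toward a decision is a common computational primitive, this work lays a foundation for using dynamical systems techniques to study the inner workings of RNNs.
    http://arxiv.org/abs/2010.15114v1 University of Washington / Google
    → A research paper analyzing how RNNs solve tasks.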
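  24. ⑩ Scaling Laws for Autoregressive Generative Modeling

    The authors identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image models, and mathematical problem solving. In all cases, autoregressive Transformers improve smoothly in performance as model size and compute budget increase, following a power law plus a constant. The optimal model size also depends on the compute budget through a power law, with exponents that are nearly universal across all data domains. The cross-entropy loss has an information-theoretic interpretation as $S(\mathrm{True}) + D_{\mathrm{KL}}(\mathrm{True}\,\|\,\mathrm{Model})$, and the empirical scaling laws suggest a prediction for both the entropy of the true data distribution and the KL divergence between the true and model distributions. Under this interpretation, billion-parameter Transformers are nearly perfect models of the YFCC100M image distribution downsampled to $8 \times 8$ resolution, and for other resolutions one can forecast the model size needed to achieve any given reducible loss (i.e., $D_{\mathrm{KL}}$) in nats/image. The authors also find several additional scaling laws in specific domains: (a) they identify a scaling relation for the mutual information between captions and images in multimodal models and show how to answer the question "Is a picture worth a thousand words?"; (b) for mathematical problem solving, they identify scaling laws for model performance when extrapolating beyond the training distribution. These results show that scaling laws have important implications for neural network performance, including on downstream tasks.
    http://arxiv.org/abs/2010.14701v2 OpenAI
    → To find efficient Transformer model sizes (parameter count, number of layers, depth), measured and analyzed performance by model size across a variety of generative tasks.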
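The information-theoretic reading of the cross-entropy loss, $S(\mathrm{True}) + D_{\mathrm{KL}}(\mathrm{True}\,\|\,\mathrm{Model})$, is a standard identity for discrete distributions and can be checked numerically. The two distributions below are made-up toy values, and all quantities are in nats.

```python
import math


def entropy(p):
    """Shannon entropy S(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """KL divergence D_KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy of model q measured under the true distribution p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    true_dist = [0.5, 0.3, 0.2]   # toy "true" data distribution
    model_dist = [0.4, 0.4, 0.2]  # toy model distribution

    print("cross-entropy    :", cross_entropy(true_dist, model_dist))
    print("entropy + KL     :", entropy(true_dist) + kl_divergence(true_dist, model_dist))
    print("irreducible part :", entropy(true_dist))
    print("reducible part   :", kl_divergence(true_dist, model_dist))
```

The entropy term does not depend on the model, so it is the floor that the loss approaches as the model improves. Only the KL term can be reduced by a better model, which is why the abstract calls $D_{\mathrm{KL}}$ the reducible loss.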