AI Latest Paper Reading Group: 2022 Summary (AI最新論文読み会2022年まとめ)
ai.labo.ocu
December 07, 2022
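Transcript
Osaka Metropolitan University, 植田大樹. AI Latest Paper Reading Group: 2022 Summary
2022 Summary, AI Latest Paper Reading Group. Main topics: ConvNeXt (presented in February): an easy-to-use, recent high-performance model. GLIDE (presented in January): text-to-image generation. Imagic (presented in November): fine-grained image edits. AudioLM (presented in October): a speech generation model. Socratic Models (presented in May): general-purpose AI (AGI). Other topics: Wav2Vec 2 (presented in July): NeuroAI / brain-inspired AI (exploring the relationship between AI and the human brain). Algorithmic Imprint (presented in July): ethics for AI creators.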
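2022 Summary (topic map). Areas: Text, CV, Speech, Diffusion, AGI. Papers: GLIDE and Imagic (diffusion), ConvNeXt (CV), AudioLM (speech), Socratic model (AGI: one of the challenges toward artificial general intelligence built on natural language processing).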
Summary up to 2022 (timeline figure, 2017 to 2021): Self-Attention, BERT, GPT2, GPT3, DETR, ViT, DDPM, wav2vec 2, CLIP, SwinT, ADM, w2v-BERT, BigSSL.
2022 Summary (topic map), next topic: ConvNeXt (CV).
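ConvNeXt: CNN x SwinTransformer. State of the art among image classification models.
ConvNeXt: CNN x SwinTransformer. Summary: the base is ResNet.
ConvNeXt: First of all.
ConvNeXt: Bring the number of block repetitions in each stage closer to SwinT.
ConvNeXt: 4×4 non-overlapping convolution ("patchify" the convolutional stem).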
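The 4×4 non-overlapping convolution on this slide can be illustrated with a little shape arithmetic. The sketch below is only an illustration: the helper names `conv_output_size` and `patchify` are hypothetical, and the 224-pixel input and the ResNet stem settings (7×7 convolution with stride 2 and padding 3, then 3×3 max-pooling with stride 2 and padding 1) are assumptions taken from the standard ResNet design, not from the slide.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def patchify(image, patch=4):
    """Split an H x W nested list into non-overlapping patch x patch tiles."""
    height, width = len(image), len(image[0])
    return [
        [
            [row[left:left + patch] for row in image[top:top + patch]]
            for left in range(0, width - patch + 1, patch)
        ]
        for top in range(0, height - patch + 1, patch)
    ]


# ResNet-style stem: 7x7 conv (stride 2, padding 3), then 3x3 max-pool (stride 2, padding 1).
resnet_stem = conv_output_size(conv_output_size(224, 7, 2, 3), 3, 2, 1)
# Patchify stem: a single 4x4 conv with stride 4 and no padding.
patchify_stem = conv_output_size(224, 4, 4)
```

Under these assumptions both stems reduce a 224-pixel side to 56, so the patchify stem keeps the downsampling factor while reading each pixel exactly once.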
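ConvNeXt: After introducing depthwise convolution, widen the network.
ConvNeXt: Introduce the inverted bottleneck (Narrow→Wide→Narrow) structure. Transformers use an expansion ratio of 4 (note: MobileNet uses 6). Overall computation decreases, although the Conv operations increase. (Figure annotation: the corresponding part in SwinT.)
ConvNeXt: Move the depthwise convolution up. Note: this makes it possible to use a large kernel size in the depthwise convolution. The Conv computation temporarily decreases and performance degrades. (Figure annotation: in SwinT, the MSA block comes before the FFN.)
ConvNeXt: Mimic the SwinTransformer kernel (window) size (7). Increase the depthwise convolution kernel size step by step. Performance saturates at 7 (same as SwinT).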
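To see why a 7×7 kernel is affordable in a depthwise convolution, and how the Narrow→Wide→Narrow widths play out, a rough parameter count helps. This is a sketch under assumptions: the function names are hypothetical, the channel width 96 is chosen only for illustration, and normalization and scaling parameters are left out.

```python
def conv_params(c_in, c_out, kernel, groups=1, bias=True):
    """Number of learnable parameters in a 2-D convolution."""
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


def inverted_bottleneck_params(dim, kernel=7, expansion=4):
    """Parameter counts for depthwise conv, 1x1 expand, 1x1 project."""
    return {
        # Depthwise: one kernel per channel (groups == channels).
        "depthwise": conv_params(dim, dim, kernel, groups=dim),
        # Narrow -> Wide: 1x1 conv that multiplies the width by `expansion`.
        "expand": conv_params(dim, expansion * dim, 1),
        # Wide -> Narrow: 1x1 conv back to the original width.
        "project": conv_params(expansion * dim, dim, 1),
    }


block = inverted_bottleneck_params(96)
dense_7x7 = conv_params(96, 96, 7)  # ordinary (non-depthwise) 7x7 conv
```

With these illustrative numbers the depthwise 7×7 layer is far smaller than a dense 7×7 convolution of the same width, and most of the block's parameters sit in the two 1×1 layers.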
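ConvNeXt: Adopt finer design details from SwinT or ViT: ReLU→GELU, fewer normalization layers, BN→LN, separate downsampling layers.
ConvNeXt: Results.
ConvNeXt: By "SwinTransformer-izing" ResNet, state-of-the-art results were achieved with a CNN alone.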
2022 Summary (topic map), next topic: GLIDE, Imagic (diffusion).
GLIDE: the base model underlying Stable Diffusion.
Diffusion model: a generative model.
History of diffusion models: DDPM → ADM (stabilization, higher resolution) → GLIDE (handles language, using CLIP).
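DDPM: the beginning of diffusion models. A DNN receives the noised image (Image + Noise) together with the time-step information and outputs an estimated Noise; training minimizes the squared error between the estimated Noise and the true Noise.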
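The DDPM training step on this slide (add Noise to an Image, feed the result and the time step to a DNN, minimize the squared error against the Noise) can be sketched in plain Python. This is a minimal sketch, not the speaker's code: the function names are hypothetical, the linear beta schedule with 1000 steps is an assumption taken from the common DDPM setup, and `predict_noise` stands in for the neural network.

```python
import math
import random


def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_t, linearly spaced (assumed schedule)."""
    return [beta_start + (beta_end - beta_start) * t / (steps - 1) for t in range(steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, abar, noise):
    """Noised sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    signal, sigma = math.sqrt(abar[t]), math.sqrt(1.0 - abar[t])
    return [signal * x + sigma * n for x, n in zip(x0, noise)]


def ddpm_loss(predict_noise, x0, t, abar, rng):
    """Mean squared error between the true and the predicted noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = q_sample(x0, t, abar, noise)
    predicted = predict_noise(xt, t)  # stand-in for the DNN
    return sum((p - n) ** 2 for p, n in zip(predicted, noise)) / len(x0)
```

A predictor that recovers the noise exactly drives this loss to zero, which is what training pushes the network toward.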
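ADM: splitting the model into two succeeded in producing high-resolution images. Components: Base and Upsampler (high resolution), with class labels and Classifier guidance (CNN).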
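Classifier guidance, as named on this slide, is usually written as a shift of the reverse-process mean along the gradient of a classifier. The formula below follows the standard formulation from the ADM paper; the symbols (guidance scale $s$, classifier $p_\phi$, class label $y$) are notation introduced here, not taken from the slide.

```latex
\hat{\mu}_\theta(x_t \mid y)
  = \mu_\theta(x_t)
  + s \, \Sigma_\theta(x_t) \, \nabla_{x_t} \log p_\phi(y \mid x_t)
```

A larger $s$ pulls samples more strongly toward the class $y$, typically trading diversity for fidelity.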
GLIDE = CLIP x Diffusion model.
CLIP: a bridge between images and text.
CLIP: a model that transforms images and text into features that can be compared with each other. ViT: Image. Transformer: Text. Cosine similarity.
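The comparison step that CLIP relies on, cosine similarity between an image feature and text features, can be sketched as follows. The embeddings here are hand-written toy vectors, not real CLIP outputs, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_captions(image_embedding, caption_embeddings):
    """Captions sorted from most to least similar to the image."""
    scores = {
        caption: cosine_similarity(image_embedding, embedding)
        for caption, embedding in caption_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings standing in for the image encoder and text encoder outputs.
toy_image = [0.9, 0.1, 0.0]
toy_captions = {
    "a photo of a dog": [1.0, 0.0, 0.0],
    "a photo of a cat": [0.0, 1.0, 0.0],
}
ranking = rank_captions(toy_image, toy_captions)
```

In a real system the two encoders are trained so that matching image and text pairs land close together under this measure.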
ADM (recap): Base and Upsampler (high resolution), with class labels and Classifier guidance (CNN).
GLIDE = ADM-based, with the CNN replaced by CLIP. Base and Upsampler (high resolution), with Classifier guidance (CLIP).
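Replacing the classifier with CLIP, as this slide describes, is commonly written as guiding the reverse-process mean with the gradient of the image-text dot product. The formula follows the CLIP guidance formulation in the GLIDE paper; the symbols (image encoder $f$, text encoder $g$, caption $c$, scale $s$) are notation introduced here.

```latex
\hat{\mu}_\theta(x_t \mid c)
  = \mu_\theta(x_t \mid c)
  + s \, \Sigma_\theta(x_t \mid c) \, \nabla_{x_t} \bigl( f(x_t) \cdot g(c) \bigr)
```

Because $x_t$ is noisy, this form assumes a CLIP model that can score noised images.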
Imagic: an improvement technique for Stable Diffusion.
Imagic Overview
2022 Summary (topic map), next topic: AudioLM (speech).
AudioLM: a generative model for speech.
AudioLM = w2v-BERT x SoundStream. Overview: ・There is a one-to-many relationship between text and audio. ・Audio carries far more data than text.
SoundStream: quantizes audio.
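SoundStream is known for using residual vector quantization, where each stage quantizes what the previous stage left over. The sketch below illustrates that idea only: the function names are hypothetical and the two tiny codebooks are hand-picked toy values, whereas real codebooks are learned.

```python
def nearest(codebook, vector):
    """Index of the codebook entry closest to `vector` (squared distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], vector)),
    )


def rvq_encode(vector, codebooks):
    """Quantize `vector` stage by stage, passing on the residual each time."""
    residual, indices = list(vector), []
    for codebook in codebooks:
        index = nearest(codebook, residual)
        indices.append(index)
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return indices


def rvq_decode(indices, codebooks):
    """Sum the selected entries of every stage to reconstruct the vector."""
    out = [0.0] * len(codebooks[0][0])
    for index, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[index])]
    return out


# Toy codebooks: a coarse stage followed by a fine correction stage.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
codes = rvq_encode([1.2, 0.8], toy_codebooks)
reconstruction = rvq_decode(codes, toy_codebooks)
```

Each added stage refines the reconstruction, so the number of stages controls the bitrate and quality trade-off.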
w2v-BERT: a combination of Contrastive Learning and Masked Language Modeling.
2022 Summary (topic map), next topic: Socratic model (AGI).
Socratic models: a (quasi?) general-purpose AI model built by combining existing pretrained models.
Socratic models Overview Language is an intermediate representation
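Socratic models: Overview. Existing VLMs (Visual Language Models), LMs (Large Language Models), and ALMs (Audio Language Models) hold a structured dialogue with one another. Video search, caption generation, video Q&A (an unseen task), and future action prediction are then treated as new participants in this dialogue space.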
Socratic models: Example 1: basics.
Socratic models: Example 2: applications.
Socratic models: What is a Socratic dialogue?
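The idea that language is the intermediate representation between pretrained models can be sketched as a small composition function. Everything named here is hypothetical: `describe_image`, `describe_audio`, and `complete_text` are placeholders for a VLM, an ALM, and an LM, and the prompt wording is an assumption rather than the prompt used in the original work.

```python
def socratic_answer(describe_image, describe_audio, complete_text, image, audio, question):
    """Chain three models through plain-text messages.

    Each model only reads and writes text, so the models can be swapped
    without retraining any of them.
    """
    visual_summary = describe_image(image)  # VLM: image -> text
    audio_summary = describe_audio(audio)   # ALM: audio -> text
    prompt = (
        f"Visual observations: {visual_summary}\n"
        f"Audio observations: {audio_summary}\n"
        f"Question: {question}\n"
        "Answer:"
    )
    return prompt, complete_text(prompt)    # LM: text -> text


# Stub models used only to show the data flow.
prompt, answer = socratic_answer(
    describe_image=lambda image: "a person holding a kettle",
    describe_audio=lambda audio: "water boiling",
    complete_text=lambda text: "They are probably making tea.",
    image=None,
    audio=None,
    question="What is the person doing?",
)
```

With real models in place of the stubs, new tasks are added by changing the prompt template rather than by fine-tuning.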
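Others: NeuroAI (1): exploring the correspondence between brain function and language models.
Others: NeuroAI (1): Overview: train Wav2Vec 2, build a mapping W that predicts the fMRI BOLD signal from the model's outputs, and validate the results.
Others: NeuroAI (1): representation of averaged brain activity.
Others: NeuroAI (1): the depth of the model's layers corresponded to brain regions.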
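The mapping W from model outputs to the BOLD signal is, in studies of this kind, typically a regularized linear regression scored by correlation. The sketch below reduces that to a single feature and a single voxel for readability; the function names are hypothetical, and both the choice of ridge regression and the Pearson score are assumptions, not details confirmed by the slides.

```python
import math


def fit_ridge_1d(feature, target, lam=1.0):
    """Closed-form ridge weight for one centered feature and one centered target."""
    numerator = sum(x * y for x, y in zip(feature, target))
    return numerator / (sum(x * x for x in feature) + lam)


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Toy data: one model activation and one voxel response over three time points.
activation = [-1.0, 0.0, 1.0]
bold = [-2.0, 0.0, 2.0]
w = fit_ridge_1d(activation, bold, lam=2.0)
predicted = [w * x for x in activation]
score = pearson(predicted, bold)
```

Repeating such a fit for each model layer and each voxel, and comparing the scores, is one way to relate layer depth to brain regions.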
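Others: NeuroAI (2): generating language from the brain.
Others: NeuroAI (2): Model training sessions: 50 sessions over 81 weeks, with an isolated-word task and a sentence task. The target word or sentence was shown to the participant as text on a screen, and the participant attempted to produce it. In the isolated-word task, individual words were produced from a set of 50 English words. In the sentence task, word sequences were produced from English sentences built from the 50-word set.
Others: NeuroAI (2): Model results: 75% accuracy for sentences, 93% accuracy for words.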
Others: AI Ethics
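Algorithmic Imprint: When an algorithm causes harm, a common and reasonable countermeasure is to stop using the algorithm so that its harmful effects do not spread further. Stopping it, however, does not make the problems of fairness, accountability, transparency, and ethics go away.
→ The effects of a harmful algorithm persist long after the algorithm is removed (the algorithmic imprint).
Example: the 2020 incident surrounding algorithmic grading of the GCE exams, a UK-based high school leaving certificate examination.
■ What kind of exam is it?
・An internationally recognized exam administered in more than 160 countries (many of them former British colonies)
・A-level results are essential and play an indispensable role in university admission
■ Background
・Because of the COVID-19 pandemic, Ofqual, the UK-based quasi-governmental body that oversees the GCE exams, cancelled in-person exams
・In place of the exams, grades were produced by an algorithm using students' past performance at school and teachers' assessments
→ As a result, worldwide protests broke out and the algorithm was removed
  Teachers' side: they had not been recording past student assessments in the first place
  Students' side: they had not taken the earlier assessments seriously (because the final exam counts for everything, many students study only in the last 30~60 days)
・The algorithm was removed, but the students were not re-assessed. In other words, the grading method changed, yet the results were still heavily influenced by the algorithm (the algorithmic imprint)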
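Algorithmic Imprint: ■ Designing algorithms with the algorithmic imprint in mind
Design thinking that is aware of the "algorithmic imprint" can make the algorithm development process fairer and more sociotechnically informed.
(1) Impact of the algorithm: an algorithm continues to affect stakeholders after it has been removed. Developers and operators are expected not only to remove the algorithm but also to remedy the harm it caused, and accountability continues to be required.
(2) Explaining the algorithm's design: developers should make the imprint easier to recognize for the people affected by the "algorithmic imprint".
(3) Reinforce with AI ethics governance: technical interventions alone cannot reduce harm. Algorithm design that is aware of the "algorithmic imprint" should be complemented by appropriate AI ethics governance.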
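About 2023: Pursue "what only our lab can do." Continue holding the study sessions. Put more weight on medicine. Tutorials and hands-on sessions on models for medical imaging research.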