
AI Latest Papers Reading Group: 2022 Roundup



Transcript

  1. Osaka Metropolitan University, Daiju Ueda
    AI Latest Papers Reading Group
    2022: the year in review

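  2. 2022 Roundup
    AI Latest Papers Reading Group
    ・Main
    ConvNeXt (presented in February): a recent, easy-to-use, high-performance model

    GLIDE (presented in January): text-to-image generation

    Imagic (presented in November): fine-grained image edits

    AudioLM (presented in October): a speech generation model

    Socratic Models (presented in May): general-purpose AI (AGI)

    ・Others
    Wav2Vec 2 (presented in July): NeuroAI / brain-inspired AI (the field that explores the relationship between AI and the human brain)

    Algorithmic Imprint (presented in July): ethics for people who build AI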

  3. 2022 Roundup
    Text
    GLIDE, Imagic
    AGI
    A year of taking on the challenge of artificial general intelligence (AGI) built on natural language processing.
    CV
    ConvNeXt
    AudioLM
    Speech
    Socratic model
    Diffusion

  4. Roundup of the years up to 2022
    [Timeline figure covering 2017, 2018, 2019, 2020, and 2021. Models shown: Self-Attention, BERT, GPT2, DETR, ViT, GPT3, CLIP, wav2vec 2, w2v-BERT, BigSSL, SwinT, DDPM, ADM]

  5. 2022 Roundup
    Text
    GLIDE, Imagic
    AGI
    A year of taking on the challenge of artificial general intelligence (AGI) built on natural language processing.
    CV
    AudioLM
    Speech
    Socratic model
    Diffusion
    ConvNeXt

  6. ConvNeXt: CNN x SwinTransformer
    State of the art among image classification models

  7. ConvNeXt: CNN x SwinTransformer
    Summary of the design changes: the starting point is ResNet

  8. ConvNeXt
    First of all.

  9. ConvNeXt
    Bring the number of block repetitions in each stage closer to SwinT

  10. ConvNeXt
    4×4 non-overlapping convolution
    "Patchify" the convolution
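
To make the patchify step concrete, here is a minimal pure-Python sketch. It assumes a stride of 4 (so that the 4×4 windows do not overlap) and a 224×224 input; the function names `conv_output_size` and `patchify` are illustrative and not taken from the slides.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def patchify(image, patch=4):
    """Split an H x W grid (list of rows) into non-overlapping patch x patch tiles.

    Each tile is flattened into one list, which is the receptive field that a
    kernel-4, stride-4 convolution sees at a single output position.
    """
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "sketch assumes divisible sizes"
    tiles = []
    for top in range(0, h, patch):
        row_of_tiles = []
        for left in range(0, w, patch):
            tile = [image[top + dy][left + dx]
                    for dy in range(patch) for dx in range(patch)]
            row_of_tiles.append(tile)
        tiles.append(row_of_tiles)
    return tiles


# A 224 x 224 input (assumed size) becomes a 56 x 56 grid of patches.
print(conv_output_size(224, kernel=4, stride=4))  # 56

# Toy 8 x 8 "image" whose pixel value encodes its position.
image = [[y * 8 + x for x in range(8)] for y in range(8)]
tiles = patchify(image)
print(len(tiles), len(tiles[0]), len(tiles[0][0]))  # 2 2 16
```

Because kernel size and stride are equal, every input pixel belongs to exactly one patch, which mirrors how ViT-style models cut an image into patches.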

  11. ConvNeXt
    After introducing depthwise convolution, widen the network

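  12. ConvNeXt
    Introduce the inverted bottleneck structure (Narrow→Wide→Narrow)
    Transformers use an expansion ratio of 4. Note: MobileNet uses an expansion ratio of 6.
    Overall compute goes down, although the Conv operations increase.
    [Figure callout: the corresponding part of SwinT]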
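
The narrow→wide→narrow shape can be illustrated with a few lines of arithmetic. This is only a sketch: the block width of 96 channels is an assumed example value, and the helper names are hypothetical.

```python
def pointwise_params(c_in: int, c_out: int) -> int:
    """Weight count of a 1x1 convolution (bias ignored)."""
    return c_in * c_out


def inverted_bottleneck_widths(dim: int, expansion: int = 4):
    """Channel widths through a narrow -> wide -> narrow block."""
    hidden = dim * expansion
    return [dim, hidden, dim]


dim = 96  # assumed example width, not a value from the slides
widths = inverted_bottleneck_widths(dim, expansion=4)
print(widths)  # [96, 384, 96]

# Weights in the expanding and projecting 1x1 convolutions.
expand = pointwise_params(widths[0], widths[1])
project = pointwise_params(widths[1], widths[2])
print(expand + project)  # 73728

# The same block with the MobileNet-style ratio of 6 mentioned on the slide.
print(inverted_bottleneck_widths(dim, expansion=6))  # [96, 576, 96]
```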

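  13. ConvNeXt
    Move the depthwise convolution
    Note: this is done so that a large kernel size can be used in the depthwise convolution
    Performance drops temporarily because the Conv compute decreases.
    [Figure callout: the corresponding part of SwinT]
    The MSA block is placed before the FFN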

  14. ConvNeXt
    Mimic the kernel size (7) of SwinTransformer
    Progressively enlarge the kernel size of the depthwise convolution.
    Performance saturates at 7 (the same as SwinT)

  15. ConvNeXt
    Adopt the smaller design details of SwinT or ViT
    ReLU→GELU
    Fewer normalization layers
    BN→LN
    Separate downsampling layers
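
As a small illustration of two of these swaps, the sketch below implements ReLU, the exact (erf-based) GELU, and a layer normalization without learnable scale or shift. It is a pure-Python toy under those simplifying assumptions, not the ConvNeXt implementation.

```python
import math


def relu(x: float) -> float:
    """ReLU: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def layer_norm(features, eps: float = 1e-6):
    """Normalize one sample across its own features (no learnable affine here).

    Batch normalization would instead compute the statistics across the
    samples of a mini-batch for each feature.
    """
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return [(f - mean) / math.sqrt(var + eps) for f in features]


# Unlike ReLU, GELU lets small negative values through.
for x in (-1.0, 0.0, 1.0):
    print(x, relu(x), gelu(x))

print(layer_norm([1.0, 2.0, 3.0, 4.0]))
```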

  16. ConvNeXt
    Results

  17. ConvNeXt
    By making ResNet more like SwinTransformer,
    a pure CNN reached state of the art.


  19. 2022 Roundup
    Text AGI
    A year of taking on the challenge of artificial general intelligence (AGI) built on natural language processing.
    CV
    ConvNeXt
    AudioLM
    Speech
    Socratic model
    Diffusion
    GLIDE, Imagic

  20. GLIDE
    The base model behind Stable Diffusion

  21. Diffusion model
    A generative model

  22. History of diffusion models
    DDPM ADM GLIDE
    CLIP

    Stabilization and higher resolution; handling language

  23. History of diffusion models
    DDPM ADM GLIDE
    CLIP

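  24. DDPM: the beginning of diffusion models
    [Diagram: noise is added to an image; a DNN receives the noisy image (Image + Noise) together with the timestep information and outputs a predicted noise; training minimizes the squared error between the predicted noise and the noise that was added]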
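
The training step in this diagram can be sketched in a few lines. This is a toy under stated assumptions: the linear beta schedule (1e-4 to 0.02 over 1000 steps) follows the commonly cited DDPM setup rather than anything on the slide, the "image" is a short list of floats, and `zero_model` is a hypothetical stand-in for the DNN.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, noise, alpha_bar):
    """Noisy sample: sqrt(alpha_bar) * image + sqrt(1 - alpha_bar) * noise."""
    return [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
            for x, n in zip(x0, noise)]


def squared_error(pred, target):
    """Mean squared error between predicted and true noise."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def training_loss(model, x0, alpha_bars, rng):
    """One training step: pick a timestep, add noise, score the noise prediction."""
    t = rng.randrange(len(alpha_bars))
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = add_noise(x0, noise, alpha_bars[t])
    predicted = model(x_t, t)  # the model sees the noisy image and the timestep
    return squared_error(predicted, noise)


def zero_model(x_t, t):
    """Hypothetical placeholder network that always predicts zero noise."""
    return [0.0 for _ in x_t]


rng = random.Random(0)
alpha_bars = make_alpha_bars()
loss = training_loss(zero_model, [0.5, -0.2, 0.1, 0.9], alpha_bars, rng)
print(loss)
```

A real setup would replace `zero_model` with a trainable network and take a gradient step on this loss.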

  25. History of diffusion models
    DDPM ADM GLIDE
    CLIP

  26. ADM: splitting the model in two made high-resolution generation possible.
    Base, Upsampler
    Classification, high resolution
    Classifier guidance (CNN)
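
Classifier guidance is usually described as shifting the mean of each denoising step along the classifier's gradient. The sketch below shows only that shift, assuming a diagonal covariance and made-up numbers; `guided_mean` is an illustrative name, and the gradient would normally come from a classifier evaluated on the noisy image.

```python
def guided_mean(mean, variance, grad_log_prob, scale=1.0):
    """Shift the denoising mean by scale * variance * grad log p(y | x_t).

    All arguments are per-dimension lists; the covariance is assumed diagonal.
    """
    return [m + scale * v * g
            for m, v, g in zip(mean, variance, grad_log_prob)]


# Made-up values for a two-dimensional toy sample.
mean = [0.0, 1.0]
variance = [0.5, 0.5]
grad = [2.0, -2.0]  # direction that makes the target class more likely

print(guided_mean(mean, variance, grad, scale=2.0))  # [2.0, -1.0]
```

A larger `scale` pushes samples harder toward the target class at the cost of diversity.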

  27. GLIDE = CLIP x Diffusion model
    History of diffusion models
    DDPM ADM GLIDE
    CLIP

  28. CLIP: bridging images and text

  29. CLIP: bridging images and text
    A model that converts images and text into features that can be compared directly
    ViT: Image
    Transformer: Text
    Cosine similarity
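
The comparison step can be sketched as follows. The embedding vectors here are made-up three-dimensional values chosen for illustration; real CLIP encoders produce much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: one from the image encoder, two from the text encoder.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
}

# Pick the caption whose embedding points in the most similar direction.
best = max(
    text_embeddings,
    key=lambda caption: cosine_similarity(image_embedding, text_embeddings[caption]),
)
print(best)  # a photo of a dog
```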

  30. ADM
    Base, Upsampler
    Classification, high resolution
    Classifier guidance (CNN)

  31. GLIDE = ADM-based, with the CNN replaced by CLIP
    Base, Upsampler
    Classification, high resolution
    Classifier guidance (CLIP)

  32. Imagic: a technique for improving Stable Diffusion

  33. Imagic
    Overview


  35. 2022 Roundup
    Text
    GLIDE, Imagic
    AGI
    A year of taking on the challenge of artificial general intelligence (AGI) built on natural language processing.
    CV
    ConvNeXt
    Speech
    Socratic model
    Diffusion
    AudioLM

  36. AudioLM
    A generative model for audio

  37. AudioLM = w2v-BERT x SoundStream
    Overview
    ・The relationship between text and audio is one-to-many.

    ・Audio carries far more data than text.

  38. SoundStream
    Quantizes audio
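
As a rough picture of what quantizing audio means here, the sketch below applies residual vector quantization, the scheme SoundStream is known for, to a single two-dimensional frame. The codebooks and the frame are made-up toy values; a real codec learns its codebooks and works on encoder outputs, not raw samples.

```python
def nearest(codebook, vector):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    def dist(code):
        return sum((c - v) ** 2 for c, v in zip(code, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def residual_quantize(vector, codebooks):
    """Quantize in stages: each codebook encodes what the previous ones missed."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def dequantize(indices, codebooks):
    """Rebuild the approximation by summing the chosen entries."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical codebooks: a coarse one followed by a fine one.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]],
]
frame = [1.2, 0.9]

indices = residual_quantize(frame, codebooks)
print(indices)                         # [1, 1]
print(dequantize(indices, codebooks))  # [1.25, 1.0]
```

The frame is now represented by two small integers, which is the kind of discrete token sequence a language model can be trained on.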

  39. w2v-BERT
    A combination of Contrastive Learning and Masked Language Modeling


  41. 2022 Roundup
    Text
    GLIDE, Imagic
    AGI
    A year of taking on the challenge of artificial general intelligence (AGI) built on natural language processing.
    CV
    ConvNeXt
    AudioLM
    Speech
    Diffusion
    Socratic model

  42. Socratic models
    A (quasi-?) general AI model built by combining existing pretrained models

  43. Socratic models
    Overview
    Language is an intermediate representation

  44. Socratic models
    Overview
    Existing VLMs (Visual Language Models), LMs (Large Language Models), and ALMs (Audio Language Models) hold a structured dialogue with one another.

    Video search, caption generation, video Q&A (an unseen task), and future action prediction are then treated as new participants in this dialogue space.
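
The idea of language as the shared intermediate representation can be sketched with stub models. Everything below is hypothetical: `vlm_caption` and `alm_caption` return canned strings in place of real models, and the prompt format is an assumption made for illustration.

```python
def vlm_caption(image_name: str) -> str:
    """Hypothetical stand-in for a visual language model."""
    canned = {"kitchen.jpg": "a person pouring water into a kettle"}
    return canned[image_name]


def alm_caption(audio_name: str) -> str:
    """Hypothetical stand-in for an audio language model."""
    canned = {"kitchen.wav": "water running from a tap"}
    return canned[audio_name]


def build_prompt(visual: str, audio: str, question: str) -> str:
    """Compose the text prompt that a large language model would receive."""
    return (
        f"I see: {visual}.\n"
        f"I hear: {audio}.\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Each model contributes plain text; the language model reasons over the result.
prompt = build_prompt(
    vlm_caption("kitchen.jpg"),
    alm_caption("kitchen.wav"),
    "What is the person likely to do next?",
)
print(prompt)
```

Because every participant speaks text, adding a new task amounts to adding a new prompt template rather than retraining a model.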

  45. Socratic models
    Example 1: basics

  46. Socratic models
    Example 2: applications

  47. Socratic models
    What is a Socratic dialogue?


  49. Others: NeuroAI (1)
    Exploring the correspondence between brain function and language models

  50. Others: NeuroAI (1)
    Overview: train Wav2Vec 2, fit a mapping W that predicts the fMRI BOLD signal from its outputs, and validate the results
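
Fitting such a mapping W is commonly done with a regularized linear regression and scored by correlation. The sketch below reduces this to one model feature and one voxel, with made-up numbers; the choice of ridge regression and Pearson correlation is an assumption here, since the slide does not say how W is fitted or validated.

```python
import math


def fit_ridge_1d(activations, bold, penalty=1.0):
    """Closed-form ridge weight for one feature and one voxel (no intercept)."""
    num = sum(a * b for a, b in zip(activations, bold))
    den = sum(a * a for a in activations) + penalty
    return num / den


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up data: one model activation and one BOLD value per time point.
activations = [0.0, 1.0, 2.0, 3.0]
bold = [0.1, 1.9, 4.1, 5.9]

w = fit_ridge_1d(activations, bold, penalty=1.0)
predicted = [w * a for a in activations]
print(w)
print(pearson(predicted, bold))
```

In practice the fit and the correlation would be computed on separate data, per voxel and per model layer.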

  51. Others: NeuroAI (1)
    A representation of averaged brain activation.

  52. Others: NeuroAI (1)
    The depth of a model layer corresponded to brain regions.

  53. Others: NeuroAI (2)
    Generating language from brain waves

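  54. Others: NeuroAI (2)
    Training sessions for the model
    50 sessions over 81 weeks

    Isolated-word task and sentence task

    The target word or sentence was shown to the participant as text on a screen, and the participant attempted to produce it.

    In the isolated-word task, the participant produced individual words from a set of 50 English words.

    In the sentence task, the participant produced word sequences from English sentences built from the 50-word set.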

  55. Others: NeuroAI (2)
    Model results
    Sentences: 75% accuracy

    Words: 93% accuracy


  57. Others: AI Ethics
    Ethics

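  58. Algorithmic Imprint
    When an algorithm causes harm, a common and reasonable response is to stop using it so that its harmful effects do not spread further. Stopping it, however, does not make the problems of fairness, accountability, transparency, and ethics go away.

    → The effects of a harmful algorithm persist long after the algorithm has been removed (the algorithmic imprint)

    Example: the controversy over algorithmic grading of the GCE exams, a UK-based secondary-school leaving qualification (2020)

    What kind of exam is it?

    ・ An internationally recognized exam administered in more than 160 countries (many of them former British colonies)

    ・ A-level grades are a necessity and play an indispensable role in university admission

    ■ How events unfolded

    ・Because of the COVID-19 pandemic, Ofqual, the UK-based quasi-governmental body that oversees the GCE exams, cancelled in-person exams

    ・In place of the exams, grades were produced by an algorithm that used students' past school results and teachers' assessments

    → As a result, protests broke out around the world and the algorithm was withdrawn

      Teachers: had not been recording past student assessments in the first place

      Students: had not taken their school grades seriously (because the exam is everything, many students cram in the final 30 to 60 days)

    ・The algorithm was withdrawn, but students were not reassessed.

    In other words, the grading method changed, yet the outcome was still heavily shaped by the algorithm (the algorithmic imprint)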

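  59. Algorithmic Imprint
    ■ Designing algorithms with the Algorithmic Imprint in mind

    Design thinking that takes the algorithmic imprint into account can make the algorithm development process fairer and better informed by sociotechnical considerations.

    (1) Impact of the algorithm: an algorithm continues to affect stakeholders even after it has been removed. Developers and operators must do more than remove the algorithm; they must also remedy the harm it caused, and they remain accountable on an ongoing basis.

    (2) Accountability in algorithm design: developers should make it easier for people affected by the algorithmic imprint to recognize the harm.

    (3) Reinforce with AI ethics governance

    Technical interventions alone cannot reduce the harm. Algorithm design that is aware of the algorithmic imprint should be complemented by appropriate AI ethics governance.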


  61. About 2023
    Do "what only our lab can do."
    Keep holding the study sessions.
    Put more weight on medicine.
    Tutorials and hands-on sessions on models for medical imaging research