[Journal club] Scalable Diffusion Models with Transformers
Semantic Machine Intelligence Lab., Keio Univ.
July 22, 2024
Transcript
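Scalable Diffusion Models with Transformers
William Peebles (UC Berkeley), Saining Xie (New York University), ICCV 2023
Presenter: Daichi Yashima, Komei Sugiura Laboratory, Keio University
William Peebles, Saining Xie, “Scalable Diffusion Models with Transformers,” in ICCV, 2023, pp. 4195-4205.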
Background: Advances in video generation with diffusion models (e.g., Sora)
https://www.youtube.com/watch?v=HK6y8DAPN_0
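Related work: U-Net is widely used as the backbone of diffusion models
• U-Net's multi-scale skip connections → unnecessary use of computational resources
Method | Overview
DALL-E 2 [Ramesh+, 22] | Aligns text and images using CLIP
Stable Diffusion [Rombach+, CVPR22] | Latent diffusion model
Figures: U-Net [Ronneberger+, MICCAI15]; Stable Diffusion [Rombach+, CVPR22]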
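Proposed method: Diffusion Transformer (DiT)
• Built on the latent diffusion model (LDM) [Rombach+, CVPR22]
• Introduces the Vision Transformer (ViT) [Dosovitskiy+, ICLR21] mechanism
• Conditional information is supplied through conditioning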
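Proposed method: Training as a latent diffusion model
• Training a diffusion model in high-dimensional pixel space is computationally prohibitive
• Training as an LDM reduces the computational cost
• Can be trained with only a fraction of the Gflops of ADM [Dhariwal+, NeurIPS21], a pixel-space diffusion model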
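Proposed method: Splitting the noised input into patches
• The noised latent (32×32×4) obtained from the autoencoder is converted, as in ViT, into T tokens of dimension d
• Halving the patch size p makes T four times larger, so the transformer's Gflops grow at least fourfold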
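The token-count arithmetic on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming a square latent of side 32 and patch sizes 8, 4 and 2; the function name `num_tokens` is hypothetical and is not taken from the paper's code.

```python
def num_tokens(latent_size: int, patch_size: int) -> int:
    """Number of tokens T when a square latent is cut into p x p patches."""
    if latent_size % patch_size != 0:
        raise ValueError("patch size must divide the latent size")
    return (latent_size // patch_size) ** 2


# 32 is the assumed spatial size of the noised latent; the patch sizes are illustrative.
for p in (8, 4, 2):
    print(f"p={p}: T={num_tokens(32, p)}")
```

Each halving of p multiplies T by four. Every transformer layer processes all T tokens, which is why the Gflops grow by at least the same factor.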
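Proposed method: Handling conditional inputs (conditioning)
• Conditional diffusion models receive additional information along with the noisy image (e.g., timestep, class label, natural language, etc.)
• This work proposes the following four designs for processing these conditional inputs
  • In-context conditioning
  • Cross-attention block
  • Adaptive layer norm (adaLN) block
  • adaLN-Zero block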
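Proposed method: adaLN-Zero block
• Introduces the AdaLN mechanism into ViT's self-attention block
• Adds the AdaLN scale and the scale applied just before the residual connection as parameters → conditional information is reflected more strongly in the image
• The adaLN-Zero block initializes them to zero → the block behaves close to the identity function early in training → stabilizes training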
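The identity-at-initialization behaviour described on this slide can be checked with a small sketch. It is written in plain Python for a single token vector and rests on assumptions: the modulation form `h * (1 + scale) + shift` follows common implementations and is not stated on the slide, and `toy_sublayer` is a hypothetical stand-in for self-attention or the MLP. In a real model the shift, scale and gate vectors would be regressed from the conditioning embedding rather than passed in directly.

```python
import math


def layer_norm(x, eps=1e-6):
    """Normalize one token vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def adaln_zero_block(x, sublayer, shift, scale, gate):
    """Compute x + gate * sublayer(modulate(layer_norm(x))) for one token vector."""
    h = layer_norm(x)
    # Assumed modulation form: condition-dependent scale and shift after the norm.
    h = [hv * (1.0 + s) + b for hv, s, b in zip(h, scale, shift)]
    h = sublayer(h)
    # The gate scales the branch output just before the residual connection.
    return [xv + g * hv for xv, g, hv in zip(x, gate, h)]


def toy_sublayer(h):
    # Hypothetical stand-in for self-attention or the MLP.
    return [2.0 * v + 1.0 for v in h]


x = [0.5, -1.0, 2.0, 0.0]
zeros = [0.0] * len(x)
out = adaln_zero_block(x, toy_sublayer, shift=zeros, scale=zeros, gate=zeros)
print(out == x)
```

With the gate at zero the residual branch contributes nothing, so the block passes its input through unchanged regardless of what the sublayer computes.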
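Experimental setup
• Dataset
  • Class-conditional ImageNet 256×256 and 512×512 [Deng+, CVPR09]
• Architecture
  • Four model sizes (S, B, L, XL), following ViT
  • Patch size p = 2, 4, 8
  • DDPM sampling steps: 250
• Evaluation metrics
  • FID, sFID, IS, Precision, Recall
  • Gflops
• Training
  • TPU v3 pod
  • Batch size: 256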
Quantitative results: Outperformed U-Net-based methods
Qualitative results
• Smaller patch sizes and larger models produce more natural images → in DiT, the larger the Gflops, the higher the quality of the output images
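Trials and error analysis
Qualitative results: Failure cases
• Unnatural images are generated for certain labels
• Example: input label "toy poodle" at a given number of DDPM sampling steps
• For some labels, the generated image is unstable at that number of steps → change the number of inference steps dynamically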
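Impressions
• Strengths
  • Introduces transformers into diffusion models
  • Analyzes the relationship between computational resources and output image quality
• Weaknesses
  • Few equations
  • No error analysis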
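Summary
• Background
  • Advances in video generation with diffusion models (e.g., Sora)
  • Transformers are rarely used in diffusion models
• Proposed method
  • Proposes the Diffusion Transformer (DiT), a transformer-based diffusion model
• Results
  • DiT scales well: the larger the Gflops, the lower the FID → strong correlation between computational resources and output image quality
  • The DiT-XL model outperformed conventional U-Net-based diffusion models on class-conditional ImageNet
Appendix: Denoising Diffusion Probabilistic Model (DDPM) training
Appendix: Classifier-free guidance
• Randomly drop the class label while training a conditional diffusion model → improves sampling accuracy
• Derived from Bayes' theorem
• Interpreting the diffusion model's output as a score yields the guided noise estimate
• s: guidance scale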
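For reference, the two relations this slide points to can be written in the standard classifier-free guidance form. The notation is an assumption here and may differ from the slide: \epsilon_\theta is the noise-prediction network, c the class label, and \varnothing the dropped (null) label.

```latex
% Bayes' theorem, differentiated with respect to x (\log p(c) does not depend on x)
\nabla_x \log p(c \mid x) = \nabla_x \log p(x \mid c) - \nabla_x \log p(x)

% Guided noise estimate with guidance scale s
\hat{\epsilon}_\theta(x_t, c) = \epsilon_\theta(x_t, \varnothing)
  + s \left( \epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \varnothing) \right)
```

With s = 1 this reduces to the ordinary conditional estimate; s > 1 pushes samples further toward the condition.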
"QQFOEJYɿ*ODPOUFYUDPOEJUJPOJOH • $POEJUJPOJOHͰ݅ͱͯ͠ೖྗ͞ΕͨτʔΫϯΛ ը૾τʔΫϯͷઌ಄ʹՃ • ͜ΕΒͷτʔΫϯը૾τʔΫϯͱಉ༷ʹѻΘΕɺ 7J5ʣʹ͓͚ΔDMTτʔΫϯͱࣅׂͨΛ࣋ͭ
"QQFOEJYɿ$SPTT"UUFOUJPOCMPDL • 4FMG"UUFOUJPOϒϩοΫͷޙʹ$SPTT"UUFOUJPOΛ Ճͨ͠ઃܭ • <7BTXBOJ /*14>-%.ͱྨࣅͨ͠ΞʔΩςΫνϟ
"QQFOEJYɿ%J5CMPDLEFTJHO • %J59-Ϟσϧʹ͓͍ͯBEB-/;FSPΛ༻͍ͨ ߹͕࠷গͳ͍ܭࢉࢿݯͰ࠷ྑ͍ '*%,είΞΛୡ
"QQFOEJYɿ7JTJPO5SBOTGPSNFS <%PTPWJUTLJZ *$-3>
"QQFOEJYɿ*ODFQUJPO4DPSF *4 • *NBHF/FUͰࣄલֶशࡁΈͷ*ODFQUJPOOFUXPSLΛ༻͍ͨධՁࢦඪ • *ODFQUJPOOFUXPSL͕ࣝผ͘͢͠ɼࣝผ͞ΕΔϥϕϧͷଟ༷ੑ͕͋Δ΄Ͳ େ͖͘ͳΔࢦඪ <4[FHFEZ $713>
"QQFOEJYɿ'SFDIFU*ODFQUJPO%JTUBODF '*% • *NBHF/FUͰࣄલֶशࡁΈͷ*ODFQUJPOOFUXPSLΛ༻͍ͨධՁࢦඪ • ੜ͞Εͨը૾ͷಛ͕(5ը૾ͷಛͱͲͷఔ ࣅ͍ͯΔ͔ΛධՁ͢Δࢦඪ • '*%͕খ͍͞΄Ͳੜ͞Εͨը૾ͷ࣭͕(5ը૾ʹ͍ۙͱߟ͑ΒΕΔ
"QQFOEJYɿ1SFDJTJPO3FDBMM • *NBHF/FUͰࣄલֶशࡁΈͷ7((<4JNPOZBO *$-3>Λ༻͍ͯ ಛϕΫτϧू߹ΛಘΔ
"QQFOEJYɿ(GMPQT • 'MPQTɿුಈখԋࢉͷճ • (GMPQT 'MPQT • ը૾ੜλεΫͰΞʔΩςΫνϟͷෳࡶ͞ΛධՁ͢ΔࡍύϥϝʔλΛ༻͍Δͷ ͕Ұൠత
• ੑೳʹେ͖͘Өڹ͢Δը૾ղ૾ΛҰߟྀ͍ͯ͠ͳ͍ • Ϟσϧͷෳࡶ͞Λද͢ࢦඪͱͯ͠ෆेͳ߹͕͋Δ
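The gap between parameter count and compute can be seen on a single linear layer. This is a rough sketch under stated assumptions: one multiply and one add are counted per weight per token, the hidden width and token counts are hypothetical values, and the function names are illustrative only.

```python
def linear_params(d_in: int, d_out: int) -> int:
    """Weights plus biases of one linear layer; independent of the number of tokens."""
    return d_in * d_out + d_out


def linear_flops(num_tokens: int, d_in: int, d_out: int) -> int:
    """Approximate Flops to apply the layer to every token (multiply and add per weight)."""
    return 2 * num_tokens * d_in * d_out


d = 1152  # hypothetical hidden width
for tokens in (64, 256, 1024):
    gflops = linear_flops(tokens, d, d) / 1e9
    print(f"tokens={tokens}: params={linear_params(d, d)}, Gflops={gflops:.3f}")
```

The parameter count stays constant while the Flops grow in proportion to the number of tokens, which in turn grows with image resolution.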
"QQFOEJYɿఆྔత݁Ռ
"QQFOEJY(GMPQTͱ'*%ͷ૬ؔ • ΑΓଟ͘ͷ(GMPQTΛͭϞσϧ'*%͕͘ͳΔ
"QQFOEJYɿϞσϧαΠζͱύοναΠζͷݕ౼