[Journal club] Scalable Diffusion Models with Transformers
Semantic Machine Intelligence Lab., Keio Univ.
July 22, 2024
Transcript
Scalable Diffusion Models with Transformers
Keio University, Sugiura Lab
William Peebles (UC Berkeley), Saining Xie (New York University), ICCV 2023
William Peebles, Saining Xie, "Scalable Diffusion Models with Transformers," in ICCV, 2023.
Background: progress in video generation with diffusion models (e.g., Sora)
Source: https://www.youtube.com/watch?v=…
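Related work: U-Net is widely used as the backbone of diffusion models
• U-Net's multi-scale skip connections → unnecessary use of compute resources
Method                                Overview
DALL-E 2 [Ramesh+, 22]                Aligns text and images using CLIP
Stable Diffusion [Rombach+, CVPR22]   Latent diffusion model
[Figures: U-Net [Ronneberger+, MICCAI15]; Stable Diffusion [Rombach+, CVPR22]]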
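Proposed method: Diffusion Transformer (DiT)
• Built on the latent diffusion model (LDM) [Rombach+, CVPR22]
• Introduces the Vision Transformer (ViT) [Dosovitskiy+, ICLR21] mechanism
• Conditional information is supplied through conditioning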
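Proposed method: training as a latent diffusion model
• Training a diffusion model in high-dimensional pixel space is computationally difficult
• Training it as an LDM reduces the computational cost
• Can be trained with a fraction of the Gflops of ADM [Dhariwal+, NeurIPS21], a pixel-space diffusion model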
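Proposed method: splitting the noised input into patches
• The noised latent obtained from the autoencoder is converted, as in ViT, into T tokens of dimension d
• Halving the patch size p makes T four times larger, so the transformer's Gflops grow at least fourfold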
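The patch-splitting step above can be illustrated with a minimal sketch. It assumes a square latent stored as nested Python lists; the function names `num_tokens` and `patchify`, the toy 4 x 4 latent, and the side length 32 used in the usage note are illustrative assumptions, not taken from the slides. The learned linear projection to d dimensions is omitted.

```python
def num_tokens(latent_size: int, patch_size: int) -> int:
    """Number of tokens T for a square latent split into p x p patches."""
    if latent_size % patch_size != 0:
        raise ValueError("patch_size must divide latent_size")
    return (latent_size // patch_size) ** 2


def patchify(latent, patch_size):
    """Split an H x W x C nested-list latent into a list of flattened patches.

    Each patch becomes one vector of length patch_size * patch_size * C; a
    learned linear layer (not shown) would then project it to d dimensions.
    """
    height, width = len(latent), len(latent[0])
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    patch.extend(latent[row][col])
            patches.append(patch)
    return patches


# Toy 4 x 4 latent with 2 channels (illustrative values only).
toy_latent = [[[r, c] for c in range(4)] for r in range(4)]
tokens_p2 = patchify(toy_latent, 2)  # 2 x 2 patches
tokens_p1 = patchify(toy_latent, 1)  # 1 x 1 patches
```

With these definitions, `num_tokens(32, 2)` and `num_tokens(32, 1)` differ by a factor of four, which is the quadrupling of T described in the slide. Since every transformer layer processes all T tokens, the compute grows at least in proportion.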
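Proposed method: handling conditional inputs (conditioning)
• In conditional diffusion models, additional information is supplied together with the noised image, e.g., timestep, class label, natural language, etc.
• To process these conditional inputs, this work proposes the following four block designs
  • In-context conditioning
  • Cross-attention block
  • Adaptive layer norm (adaLN) block
  • adaLN-Zero block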
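Proposed method: adaLN-Zero block
• Introduces the adaLN mechanism into the ViT self-attention block
• Adds the adaLN scale and the scale applied just before each residual connection as parameters → conditioning information is reflected more strongly in the image
• In the adaLN-Zero block these are initialized to zero → the block acts close to the identity function early in training → training is stabilized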
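The effect of zero initialization described above can be checked with a small sketch. It assumes the modulation form x + alpha * f(LN(x) * (1 + gamma) + beta), which is one common convention for adaptive layer norm. The names `adaln_zero_residual` and `toy_sublayer` are hypothetical, and the network that would regress gamma, beta and alpha from the conditioning embedding is left out.

```python
import math


def layer_norm(x, eps=1e-6):
    """Normalize a vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def adaln_zero_residual(x, sublayer, gamma, beta, alpha):
    """Compute x + alpha * sublayer(layer_norm(x) * (1 + gamma) + beta).

    gamma, beta and alpha are per-dimension vectors that, in a real model,
    would be regressed from the conditioning embedding (not shown here).
    """
    modulated = [n * (1.0 + g) + b for n, g, b in zip(layer_norm(x), gamma, beta)]
    out = sublayer(modulated)
    return [xi + a * oi for xi, a, oi in zip(x, alpha, out)]


def toy_sublayer(v):
    """Hypothetical stand-in for self-attention or the MLP."""
    return [2.0 * vi + 1.0 for vi in v]


x = [0.5, -1.0, 2.0, 0.0]
zeros = [0.0] * len(x)

# Zero-initialized modulation: the residual branch is gated off entirely.
at_init = adaln_zero_residual(x, toy_sublayer, zeros, zeros, zeros)

# Non-zero gate: the conditioned branch now contributes to the output.
after_training = adaln_zero_residual(x, toy_sublayer, zeros, zeros, [0.1] * len(x))
```

Here `at_init` equals `x`, so the block starts out as the identity function, while `after_training` differs from `x` once the gate is non-zero.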
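Experimental setup
• Dataset
  • Class-conditional ImageNet 256x256 and 512x512 [Deng+, CVPR09]
• Architecture
  • As with ViT, four model sizes (S, B, L, XL) are prepared
  • Patch size p
  • DDPM sampling steps
• Evaluation metrics
  • FID, sFID, IS, Precision, Recall
  • Gflops
• Training
  • TPU v3 pod, batch size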
Quantitative results: DiT outperformed U-Net-based methods
Qualitative results
• Reducing the patch size and enlarging the model yields more natural images → in DiT, the larger the Gflops, the higher the quality of the output images
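Reproduction and error analysis
Qualitative results: failure cases
• Unnatural images are generated for certain labels
• Example: input label "toy poodle" with a fixed number of DDPM sampling steps
• Depending on the label, the generated image is unstable at that step count → change the number of steps dynamically at inference time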
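Impressions
• Strengths
  • Introduces transformers into diffusion models
  • Analyzes the relationship between compute resources and output image quality
• Weaknesses
  • Few equations
  • No error analysis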
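Summary
• Background
  • Progress in video generation with diffusion models (e.g., Sora)
  • Transformers are rarely used in diffusion models
• Proposed method
  • Diffusion Transformer (DiT), a transformer-based diffusion model
• Results
  • DiT scales well: the larger the Gflops, the lower the FID → strong correlation between compute resources and output image quality
  • The DiT-XL/2 model outperformed prior U-Net-based diffusion models on class-conditional ImageNet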
"QQFOEJYɿ%FOPJTJOH%JGGVTJPO1SPCBCJMJTUJD.PEFM %%1. ֶश
"QQFOEJYɿ$MBTTJGJFSGSFFHVJEBODF • ͖֦݅ࢄϞσϧͰΫϥεϥϕϧΛϥϯμϜʹυϩοϓ ˠ αϯϓϦϯάͷਫ਼Λ্ • #BZFTͷఆཧΑΓ • ֦ࢄϞσϧͷग़ྗΛείΞͱͯ͠ղऍ͢Δͱਪఆ͢ΔϊΠζҎԼͷΑ͏ʹͳΔ
TɿΨΠμϯεεέʔϧ
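The two equations referred to on this slide were lost in extraction. A sketch of the usual formulation follows, assuming the common notation in which c is the class label, \emptyset is the dropped (null) label, and s is the guidance scale named above.

```latex
% Bayes' theorem applied to the conditional score
\nabla_x \log p(c \mid x) \;\propto\; \nabla_x \log p(x \mid c) - \nabla_x \log p(x)

% Guided noise estimate, interpreting the model output as a score
\hat{\epsilon}_\theta(x_t, c) = \epsilon_\theta(x_t, \emptyset)
  + s \cdot \left( \epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \emptyset) \right)
```

Setting s = 1 recovers ordinary conditional sampling, and s > 1 pushes samples further toward the conditioning label.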
"QQFOEJYɿ*ODPOUFYUDPOEJUJPOJOH • $POEJUJPOJOHͰ݅ͱͯ͠ೖྗ͞ΕͨτʔΫϯΛ ը૾τʔΫϯͷઌ಄ʹՃ • ͜ΕΒͷτʔΫϯը૾τʔΫϯͱಉ༷ʹѻΘΕɺ 7J5ʣʹ͓͚ΔDMTτʔΫϯͱࣅׂͨΛ࣋ͭ
"QQFOEJYɿ$SPTT"UUFOUJPOCMPDL • 4FMG"UUFOUJPOϒϩοΫͷޙʹ$SPTT"UUFOUJPOΛ Ճͨ͠ઃܭ • <7BTXBOJ /*14>-%.ͱྨࣅͨ͠ΞʔΩςΫνϟ
"QQFOEJYɿ%J5CMPDLEFTJHO • %J59-Ϟσϧʹ͓͍ͯBEB-/;FSPΛ༻͍ͨ ߹͕࠷গͳ͍ܭࢉࢿݯͰ࠷ྑ͍ '*%,είΞΛୡ
"QQFOEJYɿ7JTJPO5SBOTGPSNFS <%PTPWJUTLJZ *$-3>
"QQFOEJYɿ*ODFQUJPO4DPSF *4 • *NBHF/FUͰࣄલֶशࡁΈͷ*ODFQUJPOOFUXPSLΛ༻͍ͨධՁࢦඪ • *ODFQUJPOOFUXPSL͕ࣝผ͘͢͠ɼࣝผ͞ΕΔϥϕϧͷଟ༷ੑ͕͋Δ΄Ͳ େ͖͘ͳΔࢦඪ <4[FHFEZ $713>
"QQFOEJYɿ'SFDIFU*ODFQUJPO%JTUBODF '*% • *NBHF/FUͰࣄલֶशࡁΈͷ*ODFQUJPOOFUXPSLΛ༻͍ͨධՁࢦඪ • ੜ͞Εͨը૾ͷಛ͕(5ը૾ͷಛͱͲͷఔ ࣅ͍ͯΔ͔ΛධՁ͢Δࢦඪ • '*%͕খ͍͞΄Ͳੜ͞Εͨը૾ͷ࣭͕(5ը૾ʹ͍ۙͱߟ͑ΒΕΔ
"QQFOEJYɿ1SFDJTJPO3FDBMM • *NBHF/FUͰࣄલֶशࡁΈͷ7((<4JNPOZBO *$-3>Λ༻͍ͯ ಛϕΫτϧू߹ΛಘΔ
"QQFOEJYɿ(GMPQT • 'MPQTɿුಈখԋࢉͷճ • (GMPQT 'MPQT • ը૾ੜλεΫͰΞʔΩςΫνϟͷෳࡶ͞ΛධՁ͢ΔࡍύϥϝʔλΛ༻͍Δͷ ͕Ұൠత
• ੑೳʹେ͖͘Өڹ͢Δը૾ղ૾ΛҰߟྀ͍ͯ͠ͳ͍ • Ϟσϧͷෳࡶ͞Λද͢ࢦඪͱͯ͠ෆेͳ߹͕͋Δ
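The gap between parameter count and Gflops can be illustrated with a small sketch for a single dense layer applied to every token. The flop-counting convention (two flops per weight per token), the function names, and the layer and grid sizes are assumptions chosen for illustration.

```python
def linear_layer_cost(num_tokens, d_in, d_out):
    """Return (parameter count, flops) for one dense layer applied per token.

    Assumes one multiply and one add per weight per token (2 flops); some
    papers instead count a multiply-add as a single operation.
    """
    params = d_in * d_out + d_out  # weights plus bias
    flops = 2 * num_tokens * d_in * d_out
    return params, flops


def to_gflops(flops):
    """Convert a raw flop count to Gflops (1 Gflop = 10^9 flops)."""
    return flops / 1e9


# Hypothetical sizes: the same layer at two token counts, as when the image
# resolution doubles and the token grid grows from 16 x 16 to 32 x 32.
params_low, flops_low = linear_layer_cost(16 * 16, 1024, 1024)
params_high, flops_high = linear_layer_cost(32 * 32, 1024, 1024)
```

The parameter count is identical in both cases while the flops differ by a factor of four, so a metric based only on parameters cannot see the cost of higher resolution.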
"QQFOEJYɿఆྔత݁Ռ
"QQFOEJY(GMPQTͱ'*%ͷ૬ؔ • ΑΓଟ͘ͷ(GMPQTΛͭϞσϧ'*%͕͘ͳΔ
"QQFOEJYɿϞσϧαΠζͱύοναΠζͷݕ౼