Semantic Machine Intelligence Lab., Keio Univ.
July 22, 2024
[Journal club] Scalable Diffusion Models with Transformers
Transcript
Scalable Diffusion Models with Transformers
Keio University, Sugiura Laboratory, presenter: Yashima
William Peebles (UC Berkeley), Saining Xie (New York University), ICCV
William Peebles, Saining Xie, "Scalable Diffusion Models with Transformers," in ICCV.
Background: progress in video generation with diffusion models (e.g., Sora)
https://www.youtube.com/watch?v=…
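Related work: U-Net is widely used as the backbone of diffusion models
• U-Net's multi-scale skip connections → unnecessary use of computational resources
Method | Overview
DALL-E 2 [Ramesh+, 22] | Uses CLIP to align text and images
Stable Diffusion [Rombach+, CVPR22] | Latent diffusion model
Figures: U-Net [Ronneberger+, MICCAI], Stable Diffusion [Rombach+, CVPR22]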
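Proposed method: Diffusion Transformer (DiT)
• Built on the latent diffusion model (LDM) [Rombach+, CVPR22]
• Introduces the Vision Transformer (ViT) [Dosovitskiy+, ICLR] mechanism
• Conditional information is supplied through conditioning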
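Proposed method: training as a latent diffusion model
• Training a diffusion model in the high-dimensional pixel space is computationally difficult
• Training as an LDM reduces the computational cost
• Can be trained with a fraction of the Gflops of ADM [Dhariwal+, NeurIPS], a pixel-space diffusion model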
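Proposed method: decomposing the noised input into patches
• The noised latent obtained from the autoencoder is converted into T tokens of dimension d, as in ViT
• Halving the patch size p quadruples T, so the transformer Gflops increase at least fourfold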
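The patchify step above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the 32 x 32 x 4 latent shape and the patch sizes 2, 4, and 8 are assumptions taken from the setup described in the DiT paper, and `num_tokens` and `patchify` are hypothetical helper names.

```python
def num_tokens(latent_size: int, patch_size: int) -> int:
    """Token count T = (I / p)^2 for a square I x I latent and p x p patches."""
    if latent_size % patch_size != 0:
        raise ValueError("latent_size must be divisible by patch_size")
    return (latent_size // patch_size) ** 2


def patchify(latent, patch_size):
    """Split an H x W x C nested-list latent into flattened p*p*C patch vectors.

    A real model would then apply a learned linear projection that maps each
    patch vector to a d-dimensional token; that projection is omitted here.
    """
    height, width = len(latent), len(latent[0])
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    patch.extend(latent[row][col])
            patches.append(patch)
    return patches


# Assumed 32 x 32 x 4 latent filled with placeholder zeros (hypothetical data).
latent = [[[0.0] * 4 for _ in range(32)] for _ in range(32)]
token_counts = {p: len(patchify(latent, p)) for p in (8, 4, 2)}
```

Under these assumptions, each halving of p multiplies the token count by four, which is where the fourfold-or-more growth in transformer Gflops comes from.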
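Proposed method: handling conditional inputs (conditioning)
• In conditional diffusion models, additional information is supplied along with the noised image (e.g., timestep, class label, natural language)
• To process these conditional inputs, this work proposes the following four designs
  • In-context conditioning
  • Cross-attention block
  • Adaptive layer norm (adaLN) block
  • adaLN-Zero block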
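Two of the designs listed above, in-context conditioning and cross-attention, can be illustrated with a short sketch. It uses plain Python lists, a single attention head, and identity query/key/value projections, all of which are simplifying assumptions. `prepend_condition_tokens` and `cross_attention` are hypothetical names, not functions from the DiT code base.

```python
import math


def prepend_condition_tokens(image_tokens, cond_tokens):
    """In-context conditioning: put condition tokens in front of image tokens."""
    return list(cond_tokens) + list(image_tokens)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(image_tokens, cond_tokens):
    """Single-head cross-attention: image tokens query the condition tokens.

    Learned projections are replaced by the identity to keep the sketch short.
    """
    dim = len(cond_tokens[0])
    outputs = []
    for query in image_tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in cond_tokens
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * value[i] for w, value in zip(weights, cond_tokens))
            for i in range(dim)
        ])
    return outputs


image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical image tokens
cond_tokens = [[0.5, -0.5]]                          # e.g. one class embedding

sequence = prepend_condition_tokens(image_tokens, cond_tokens)
attended = cross_attention(image_tokens, cond_tokens)
```

In-context conditioning lengthens the token sequence and leaves the block unchanged, whereas cross-attention keeps the sequence length and adds an extra attention layer to every block.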
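Proposed method: adaLN-Zero block
• Introduces the adaLN mechanism into the ViT self-attention block
• Adds the adaLN scale and shift, plus a scale applied immediately before each residual connection, as parameters → conditional information is reflected more strongly in the image
• The adaLN-Zero block initializes these to zero → the block behaves close to an identity function early in training → stabilizes training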
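The identity-at-initialization behavior described above can be checked with a small sketch. It is an illustration under assumptions, not the authors' code: the names `shift`, `scale`, and `gate` (beta, gamma, and alpha in the DiT paper's notation) and the `(1 + scale)` modulation form are assumed, and `toy_sublayer` is a hypothetical stand-in for self-attention or the MLP.

```python
import math


def layer_norm(x, eps=1e-6):
    """Layer normalization without learned affine parameters."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def adaln_zero_residual(x, sublayer, shift, scale, gate):
    """One residual branch: x + gate * sublayer(norm(x) * (1 + scale) + shift)."""
    modulated = [n * (1.0 + s) + b for n, s, b in zip(layer_norm(x), scale, shift)]
    branch = sublayer(modulated)
    return [xi + g * bi for xi, g, bi in zip(x, gate, branch)]


def toy_sublayer(h):
    """Stand-in for self-attention or the MLP (hypothetical)."""
    return [2.0 * v + 1.0 for v in h]


x = [0.3, -1.2, 0.8, 2.0]
zeros = [0.0] * len(x)

# At initialization the conditioning network is assumed to output all zeros.
at_init = adaln_zero_residual(x, toy_sublayer, shift=zeros, scale=zeros, gate=zeros)

# Later in training the conditioning regresses non-zero parameters.
later = adaln_zero_residual(
    x, toy_sublayer, shift=[0.1] * 4, scale=[0.2] * 4, gate=[0.5] * 4
)
```

With a zero gate the residual branch contributes nothing, so the block returns its input unchanged. That is the sense in which the block starts out close to an identity function.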
Experimental setup
• Dataset
  • Class-conditional ImageNet [Deng+, CVPR]
• Architecture
  • Four model sizes (S, B, L, XL), as in ViT
  • Patch size p
  • DDPM sampling steps
• Evaluation metrics
  • FID, sFID, IS, Precision, Recall
  • Gflops
• Training
  • TPU pod
  • Batch size
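Two of the evaluation metrics listed above, IS and FID, can be sketched as follows. This is a toy illustration on hand-written numbers. Real evaluations compute both from Inception-network outputs over many images, and standard FID uses full covariance matrices. The diagonal-covariance version here is a deliberate simplification, and both function names are hypothetical.

```python
import math


def inception_score(class_probs):
    """IS = exp( mean_x KL( p(y|x) || p(y) ) ) from per-image class probabilities."""
    n = len(class_probs)
    num_classes = len(class_probs[0])
    marginal = [sum(p[k] for p in class_probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for probs in class_probs:
        total_kl += sum(
            p * math.log(p / m) for p, m in zip(probs, marginal) if p > 0.0
        )
    return math.exp(total_kl / n)


def frechet_distance_diag(mean_a, var_a, mean_b, var_b):
    """Frechet distance between Gaussians with DIAGONAL covariances.

    Full FID uses full covariance matrices:
        |mu_a - mu_b|^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)).
    With diagonal covariances the trace term reduces to
        sum_i (sqrt(var_a[i]) - sqrt(var_b[i]))^2.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_a, var_b))
    return mean_term + cov_term


# Hypothetical classifier outputs for two images and two classes.
confident_and_diverse = [[1.0, 0.0], [0.0, 1.0]]
all_the_same = [[0.5, 0.5], [0.5, 0.5]]
is_high = inception_score(confident_and_diverse)
is_low = inception_score(all_the_same)

# Hypothetical feature statistics (mean and per-dimension variance).
fid_same = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
fid_shifted = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
```

IS rewards confident and diverse predictions, so higher is better. The Fréchet distance is zero for identical feature statistics and grows as they drift apart, so lower is better.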
Quantitative results: DiT outperformed U-Net-based methods
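Qualitative results
• Reducing the patch size and enlarging the model produces more natural images → in DiT, the larger the Gflops, the higher the quality of the output images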
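Reproduction and error analysis
Qualitative results: failure cases
• Unnatural images are generated for certain labels
• Example: input label toy poodle, at a fixed number of DDPM sampling steps
• For some labels the generated image is unstable at that number of steps → change the number of steps dynamically at inference time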
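Impressions
• Strengths
  • Introduces transformers into diffusion models
  • Discusses the relationship between computational resources and output image quality
• Weaknesses
  • Few equations
  • No error analysis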
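Summary
• Background
  • Progress in video generation with diffusion models (e.g., Sora)
  • Transformers are rarely used in diffusion models
• Proposed method
  • Proposes the Diffusion Transformer (DiT), a transformer-based diffusion model
• Results
  • DiT is highly scalable: the larger the Gflops, the lower the FID → strong correlation between computational resources and output image quality
  • The DiT-XL model outperformed previous U-Net-based diffusion models on class-conditional ImageNet
Appendix: Denoising Diffusion Probabilistic Model (DDPM)
• Training
Appendix: Classifier-free guidance
• In conditional diffusion models, class labels are randomly dropped → improves sampling accuracy
• Follows from Bayes' theorem
• Interpreting the output of the diffusion model as a score yields the guided noise estimate
s: guidance scale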
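Classifier-free guidance as outlined above can be sketched in a few lines. The guided estimate uses the commonly cited form eps_uncond + s * (eps_cond - eps_uncond). The drop probability of 0.1, the label value, and the noise vectors are hypothetical placeholders, and `maybe_drop_label` and `guided_noise` are illustrative names rather than functions from any library.

```python
import random


def maybe_drop_label(label, drop_prob, rng, null_label=None):
    """Training-time label dropout: replace the label by a null label at random."""
    return null_label if rng.random() < drop_prob else label


def guided_noise(eps_cond, eps_uncond, scale):
    """Classifier-free guidance: eps_uncond + s * (eps_cond - eps_uncond)."""
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


rng = random.Random(0)
labels = [maybe_drop_label(207, drop_prob=0.1, rng=rng) for _ in range(1000)]
dropped_fraction = labels.count(None) / len(labels)

eps_cond = [0.2, -0.4, 1.0]    # hypothetical conditional noise prediction
eps_uncond = [0.0, 0.0, 0.5]   # hypothetical unconditional noise prediction
no_guidance = guided_noise(eps_cond, eps_uncond, scale=1.0)
strong_guidance = guided_noise(eps_cond, eps_uncond, scale=4.0)
```

With s = 1 the estimate reduces to the plain conditional prediction. Larger s pushes the estimate further along the direction that separates the conditional from the unconditional prediction.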
"QQFOEJYɿ*ODPOUFYUDPOEJUJPOJOH • $POEJUJPOJOHͰ݅ͱͯ͠ೖྗ͞ΕͨτʔΫϯΛ ը૾τʔΫϯͷઌ಄ʹՃ • ͜ΕΒͷτʔΫϯը૾τʔΫϯͱಉ༷ʹѻΘΕɺ 7J5ʣʹ͓͚ΔDMTτʔΫϯͱࣅׂͨΛ࣋ͭ
"QQFOEJYɿ$SPTT"UUFOUJPOCMPDL • 4FMG"UUFOUJPOϒϩοΫͷޙʹ$SPTT"UUFOUJPOΛ Ճͨ͠ઃܭ • <7BTXBOJ /*14>-%.ͱྨࣅͨ͠ΞʔΩςΫνϟ
"QQFOEJYɿ%J5CMPDLEFTJHO • %J59-Ϟσϧʹ͓͍ͯBEB-/;FSPΛ༻͍ͨ ߹͕࠷গͳ͍ܭࢉࢿݯͰ࠷ྑ͍ '*%,είΞΛୡ
"QQFOEJYɿ7JTJPO5SBOTGPSNFS <%PTPWJUTLJZ *$-3>
"QQFOEJYɿ*ODFQUJPO4DPSF *4 • *NBHF/FUͰࣄલֶशࡁΈͷ*ODFQUJPOOFUXPSLΛ༻͍ͨධՁࢦඪ • *ODFQUJPOOFUXPSL͕ࣝผ͘͢͠ɼࣝผ͞ΕΔϥϕϧͷଟ༷ੑ͕͋Δ΄Ͳ େ͖͘ͳΔࢦඪ <4[FHFEZ $713>
"QQFOEJYɿ'SFDIFU*ODFQUJPO%JTUBODF '*% • *NBHF/FUͰࣄલֶशࡁΈͷ*ODFQUJPOOFUXPSLΛ༻͍ͨධՁࢦඪ • ੜ͞Εͨը૾ͷಛ͕(5ը૾ͷಛͱͲͷఔ ࣅ͍ͯΔ͔ΛධՁ͢Δࢦඪ • '*%͕খ͍͞΄Ͳੜ͞Εͨը૾ͷ࣭͕(5ը૾ʹ͍ۙͱߟ͑ΒΕΔ
"QQFOEJYɿ1SFDJTJPO3FDBMM • *NBHF/FUͰࣄલֶशࡁΈͷ7((<4JNPOZBO *$-3>Λ༻͍ͯ ಛϕΫτϧू߹ΛಘΔ
"QQFOEJYɿ(GMPQT • 'MPQTɿුಈখԋࢉͷճ • (GMPQT 'MPQT • ը૾ੜλεΫͰΞʔΩςΫνϟͷෳࡶ͞ΛධՁ͢ΔࡍύϥϝʔλΛ༻͍Δͷ ͕Ұൠత
• ੑೳʹେ͖͘Өڹ͢Δը૾ղ૾ΛҰߟྀ͍ͯ͠ͳ͍ • Ϟσϧͷෳࡶ͞Λද͢ࢦඪͱͯ͠ෆेͳ߹͕͋Δ
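The gap between parameter count and Gflops noted above can be made concrete with a rough estimate for one transformer block. The counting convention (a multiply-add counted as two operations; biases, norms, and softmax ignored) and the hidden size of 384 are assumptions chosen for illustration. `block_params` and `block_flops` are hypothetical helpers.

```python
def block_params(dim: int, mlp_ratio: int = 4) -> int:
    """Weights in one transformer block (attention + MLP), biases and norms ignored."""
    attention = 4 * dim * dim              # Q, K, V and output projections
    mlp = 2 * dim * (mlp_ratio * dim)      # two linear layers
    return attention + mlp


def block_flops(num_tokens: int, dim: int, mlp_ratio: int = 4) -> int:
    """Approximate Flops of one block, counting a multiply-add as 2 operations."""
    linear = 2 * num_tokens * block_params(dim, mlp_ratio)
    attention_maps = 2 * 2 * num_tokens * num_tokens * dim  # Q K^T and weights @ V
    return linear + attention_maps


dim = 384                    # hypothetical hidden size
for tokens in (64, 256):     # e.g. a coarser and a finer patch size
    gflops = block_flops(tokens, dim) / 1e9
    print(f"T={tokens}: params={block_params(dim):,}, Gflops={gflops:.3f}")
```

The parameter count is identical for both token counts, while the Flops grow by more than four times when the number of tokens quadruples. This is why Gflops track resolution and patch size where parameter count cannot.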
"QQFOEJYɿఆྔత݁Ռ
"QQFOEJY(GMPQTͱ'*%ͷ૬ؔ • ΑΓଟ͘ͷ(GMPQTΛͭϞσϧ'*%͕͘ͳΔ
"QQFOEJYɿϞσϧαΠζͱύοναΠζͷݕ౼