ICLR 2017 Reading Group @ DeNA / iclr2017atDeNA_VLAE
Masaki Kozuki
June 17, 2017
Research
Transcript
Variational Lossy Autoencoder
ICLR 2017 Reading Group @ DeNA
@crcrpar, 2017/6/17
1 / 25
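Paper
• Variational Lossy Autoencoder
• Xi Chen (UC Berkeley, OpenAI), Diederik P. Kingma (OpenAI), Tim Salimans (OpenAI), et al.
• Making use of the latent variable in representation learning
• An analysis of the VAE latent variable through Bits Back Coding
• Make the latent variable lossy
• Make the distributions $p(z)$ and $q(z|x)$ of the latent variable $z$ flexible
• Use PixelCNN as the decoder
2 / 25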
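Notation
• $x \in \mathbb{R}^d$: data, $x = (x_0, \dots, x_{d-1})^\top$
• $x_{<i}$: all elements of $x$ whose index is less than $i$, $(x_0, \dots, x_{i-1})^\top$
• $z$: latent variable
• $p_{\mathrm{data}}(x)$: the true distribution that generates the data
• $D_{\mathrm{KL}}(p \| q)$: Kullback-Leibler divergence of $p$ relative to $q$
• $\theta$: parameters of the model (NN)
• AR: autoregressive NN such as PixelCNN
• $H$, $\mathcal{H}$: entropy
3 / 25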
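VAE
Objective:
$$\log p(X) = \sum_{i=1}^{N} \log p(x^{(i)})$$
Objective used in practice:
$$\mathcal{L}(x; \theta) = \mathbb{E}_{q(z|x)}[\log p(x|z)] - D_{\mathrm{KL}}(q(z|x) \| p(z))$$
- Can be viewed as a regularized autoencoder.
Issues and weaknesses of VAE
• A decoder that is too expressive ignores the latent variable
• The information carried by the latent variable cannot be controlled
4 / 25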
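As a minimal sketch of the lower bound on this slide, the toy computation below evaluates $\mathcal{L}(x)$ exactly for a model with one binary latent and one binary observation. The probability tables and the helper names (`elbo`, `log_marginal`, `true_posterior`, `kl`) are illustrative assumptions and do not come from the slides or the paper.

```python
import math

# Hypothetical toy model: z in {0, 1}, x in {0, 1}.
P_Z = [0.5, 0.5]                          # prior p(z)
P_X_GIVEN_Z = [[0.9, 0.1], [0.2, 0.8]]    # P_X_GIVEN_Z[z][x] = p(x|z)


def log_marginal(x):
    """Exact log p(x) = log sum_z p(z) p(x|z)."""
    return math.log(sum(P_Z[z] * P_X_GIVEN_Z[z][x] for z in range(2)))


def true_posterior(x):
    """Exact p(z|x) by Bayes' rule."""
    joint = [P_Z[z] * P_X_GIVEN_Z[z][x] for z in range(2)]
    total = sum(joint)
    return [j / total for j in joint]


def kl(q, p):
    """D_KL(q || p) for discrete distributions (0 log 0 treated as 0)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def elbo(x, q):
    """L(x) = E_q[log p(x|z)] - D_KL(q(z|x) || p(z))."""
    recon = sum(q[z] * math.log(P_X_GIVEN_Z[z][x]) for z in range(2))
    return recon - kl(q, P_Z)


x = 1
# q equal to the prior pays no KL but reconstructs poorly;
# q equal to the true posterior makes the bound tight.
print(elbo(x, P_Z), elbo(x, true_posterior(x)), log_marginal(x))
```

In this toy setting, setting $q(z|x) = p(z)$ drives the KL term to zero, which is the "latent variable is ignored" regime described in the weaknesses above.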
1 Why?
  Intuitively
  Theory (Bits Back Coding)
2 VLAE
  Overview
  Autoregressive Flow
  decoder: PixelCNN
3 Experiments and Results
  Lossy Compression
  Density Estimation
5 / 25
1 Why?
2 VLAE
3 Experiments and Results
6 / 25
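Intuitively...
To begin with, an RNN / AR model can approximate an arbitrary distribution.
1 The latent variable carries almost no information (early in training)
2 The decoder tries to reconstruct the data distribution: $p(x|z) \to p_{\mathrm{decoder}}(x)$
3 The posterior and the approximate posterior both become the prior: $p(z|x),\ q(z|x) \to p(z)$
7 / 25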
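A bit more theoretically...
VAE ≈ coding scheme
1 Encode the essence $z$ of the data: $p(z)$
2 Encode the deviation from $z$: $p(x|z)$
Code length? Naively:
$$C_{\mathrm{naive}}(x) = \mathbb{E}_{x \sim \mathrm{data},\, z \sim q(z|x)}[-\log p(z) - \log p(x|z)]$$
Bits Back Coding
Use the encoder's $q(z|x)$ for efficiency.
8 / 25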
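Bits Back Coding
$q(z|x)$ can convey at most $H(q(z|x))$ bits of information.
(Note) Only when the receiver has access to $q(z|x)$.
Code length of Bits Back Coding
$C_{\mathrm{naive}}$ is wasteful by exactly the bits carried by $q(z|x)$, and since
$$\mathcal{L}(x) = \mathbb{E}_{q(z|x)}[\log p(x|z) + \log p(z) - \log q(z|x)]$$
we have
$$C_{\mathrm{BitsBack}}(x) = \mathbb{E}_{x \sim \mathrm{data}}[-\mathcal{L}(x)] \geq H(\mathrm{data}) + \mathbb{E}_{x \sim \mathrm{data}}[D_{\mathrm{KL}}(q(z|x) \| p(z|x))]$$
9 / 25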
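The relation between the naive and the Bits Back code lengths can be checked numerically. The sketch below works per datapoint (the slide's bound follows after averaging over $x \sim \mathrm{data}$) and in nats rather than bits. The toy probability tables and all function names are assumptions made for illustration only.

```python
import math

# Hypothetical toy model: z in {0, 1}, x in {0, 1}.
P_Z = [0.5, 0.5]
P_X_GIVEN_Z = [[0.9, 0.1], [0.2, 0.8]]   # P_X_GIVEN_Z[z][x] = p(x|z)


def naive_code_length(x, q):
    """E_q[-log p(z) - log p(x|z)] in nats for one datapoint x."""
    return sum(q[z] * (-math.log(P_Z[z]) - math.log(P_X_GIVEN_Z[z][x]))
               for z in range(2) if q[z] > 0)


def entropy(q):
    """H(q) in nats: the amount the receiver gets back."""
    return -sum(qi * math.log(qi) for qi in q if qi > 0)


def bits_back_code_length(x, q):
    """C_naive minus the refunded H(q(z|x)); this equals -L(x)."""
    return naive_code_length(x, q) - entropy(q)


def posterior(x):
    """Exact p(z|x) by Bayes' rule."""
    joint = [P_Z[z] * P_X_GIVEN_Z[z][x] for z in range(2)]
    total = sum(joint)
    return [j / total for j in joint]


def kl(q, p):
    """D_KL(q || p) for discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


x, q = 1, [0.3, 0.7]
neg_log_px = -math.log(sum(P_Z[z] * P_X_GIVEN_Z[z][x] for z in range(2)))
gap = kl(q, posterior(x))
# The two printed numbers agree: -L(x) = -log p(x) + KL(q(z|x) || p(z|x)).
print(bits_back_code_length(x, q), neg_log_px + gap)
```

The extra cost over $-\log p(x)$ is exactly the KL divergence between the approximate and the true posterior, which is why an inexact $q(z|x)$ makes coding through $z$ more expensive.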
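Bits Back Coding
• Minimizing the code length = maximizing the variational lower bound
→ $z$ is used only when coding through it is efficient
• A more accurate posterior would make variational inference more precise, but no such posterior exists at present
→ $D_{\mathrm{KL}}$ ($\geq 0$) cannot be ignored
10 / 25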
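Information Preference
$z$ is ignored when...
$p(x|z)$ can model $p_{\mathrm{data}}(x)$ without any information from $z$:
1 the posterior $p(z|x)$ becomes $p(z)$, and
2 the approximate posterior $q(z|x)$ also becomes $p(z)$
∵ this makes the KL term small
Information Preference
• Information that can be modeled locally without $z$ is encoded locally
• All other information is decoded using $z$
Ways to hack the latent variable: free bits, annealing the relative weight of $D_{\mathrm{KL}}$
11 / 25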
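The two workarounds named at the end of this slide can be sketched as follows. This is one common formulation under stated assumptions: a linear warm-up schedule for the KL weight and a per-group floor `max(free_bits, KL)` for free bits. The function names, the default values, and the choice to combine both in a single `objective` are hypothetical and not taken from the slides.

```python
def annealed_kl_weight(step, warmup_steps):
    """Linear KL warm-up: the weight rises from 0 to 1 over warmup_steps."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, step / warmup_steps)


def free_bits_kl(kl_per_group, free_bits):
    """Sum of per-group KL terms, each floored at `free_bits` nats.

    Below the floor the penalty is constant, so the optimizer gains
    nothing by pushing that group's KL further toward zero.
    """
    return sum(max(free_bits, kl) for kl in kl_per_group)


def objective(recon_log_lik, kl_per_group, step, warmup_steps=1000, free_bits=0.0):
    """Surrogate to maximize: reconstruction minus weighted, floored KL."""
    beta = annealed_kl_weight(step, warmup_steps)
    return recon_log_lik - beta * free_bits_kl(kl_per_group, free_bits)


# Halfway through warm-up, with a 0.5-nat floor on two latent groups.
print(objective(-10.0, [0.1, 2.0], step=500, warmup_steps=1000, free_bits=0.5))
```

Both tricks change the training objective rather than the model, which is the sense in which they "hack" the latent variable instead of addressing the information preference itself.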
1 Why?
2 VLAE
3 Experiments and Results
12 / 25
Model overview
1 A flexible prior
2 An expressive decoder
13 / 25
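Improving the prior
• It is questionable whether a spherical Gaussian or a uniform distribution is appropriate
• Essential for making use of the latent variable
• → autoregressive flow
14 / 25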
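Autoregressive Flow
About normalizing flows
• An invertible transformation from a simple distribution to a flexible one
• general normalizing flow
• volume preserving flow
• They differ in how the Jacobian is handled
Feature of AF
Same computational cost as IAF, but the model is deeper
15 / 25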
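Inverse Autoregressive Flow
$$z_t = \mu_t + \sigma_t \odot z_{t-1}$$
$$\log q(z_T|x) = -\sum_{i=1}^{D} \left( \frac{1}{2}\epsilon_i^2 + \frac{1}{2}\log(2\pi) + \sum_{t=0}^{T} \log \sigma_{t,i} \right)$$
Figure 1: Overview of IAF
16 / 25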
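The update rule and the log-density formula above can be traced in a small pure-Python sketch. Everything concrete here is an assumption for illustration: the autoregressive networks are replaced by strictly lower-triangular linear maps, `mu0` and `log_sigma0` stand in for the encoder output that defines $z_0 = \mu_0 + \sigma_0 \odot \epsilon$, and the function names are hypothetical.

```python
import math
import random


def autoregressive_params(z, weights_mu, weights_log_sigma):
    """Toy autoregressive map: mu_i and log sigma_i depend only on z_{<i}."""
    d = len(z)
    mu = [sum(weights_mu[i][j] * z[j] for j in range(i)) for i in range(d)]
    log_sigma = [sum(weights_log_sigma[i][j] * z[j] for j in range(i))
                 for i in range(d)]
    return mu, log_sigma


def iaf_sample(mu0, log_sigma0, steps, rng):
    """Draw z_T and return (z_T, log q(z_T | x))."""
    d = len(mu0)
    eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
    # z_0 = mu_0 + sigma_0 * eps (the t = 0 term of the sum over log sigma).
    z = [mu0[i] + math.exp(log_sigma0[i]) * eps[i] for i in range(d)]
    log_q = -sum(0.5 * e * e + 0.5 * math.log(2 * math.pi) for e in eps)
    log_q -= sum(log_sigma0)
    for w_mu, w_log_sigma in steps:
        mu, log_sigma = autoregressive_params(z, w_mu, w_log_sigma)
        # z_t = mu_t + sigma_t * z_{t-1}; the Jacobian is triangular with
        # diagonal sigma_t, so its log-determinant is sum(log_sigma).
        z = [mu[i] + math.exp(log_sigma[i]) * z[i] for i in range(d)]
        log_q -= sum(log_sigma)
    return z, log_q


# One flow step on a 2-D latent: z_1 is shifted and scaled as a function of z_0.
step = ([[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [0.5, 0.0]])
z_T, log_q = iaf_sample([0.0, 0.0], [0.0, 0.0], [step], random.Random(0))
print(z_T, log_q)
```

Because every $\mu_t$ and $\sigma_t$ is computed from the already known $z_{t-1}$, all dimensions of a step can be updated at once, which is what makes sampling from an IAF posterior cheap.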
IAF posterior
A flexible posterior is obtained!
Figure 2: IAF posterior
17 / 25
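AF prior ≡ IAF posterior
$$\begin{aligned}
\mathcal{L}(x; \theta) &= \mathbb{E}_{z \sim q(z|x)}\left[\log p(x|z) + \log p(z) - \log q(z|x)\right] \\
&= \mathbb{E}_{z \sim q(z|x),\, \epsilon = f^{-1}(z)}\left[\log p(x|f(\epsilon)) + \log u(\epsilon) + \log\left|\det \frac{d\epsilon}{dz}\right| - \log q(z|x)\right] \\
&= \mathbb{E}_{z \sim q(z|x),\, \epsilon = f^{-1}(z)}\Big[\log p(x|f(\epsilon)) + \log u(\epsilon) - \underbrace{\left(\log q(z|x) - \log\left|\det \frac{d\epsilon}{dz}\right|\right)}_{\text{IAF posterior}}\Big]
\end{aligned}$$
18 / 25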
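The change of variables used in the second line, $\log p(z) = \log u(\epsilon) + \log|\det d\epsilon/dz|$ with $\epsilon = f^{-1}(z)$, can be checked on a toy autoregressive flow. The sketch below assumes $u = \mathcal{N}(0, I)$ and replaces the autoregressive network with strictly lower-triangular linear maps. The function names and weight values are hypothetical.

```python
import math


def af_inverse(z, w_mu, w_log_sigma):
    """eps = f^{-1}(z): each eps_i needs only z_{<=i}, so one pass suffices.

    Returns (eps, log|det d eps / d z|).
    """
    eps, log_det = [], 0.0
    for i in range(len(z)):
        mu = sum(w_mu[i][j] * z[j] for j in range(i))
        log_sigma = sum(w_log_sigma[i][j] * z[j] for j in range(i))
        eps.append((z[i] - mu) * math.exp(-log_sigma))
        log_det -= log_sigma   # d eps_i / d z_i = 1 / sigma_i
    return eps, log_det


def af_forward(eps, w_mu, w_log_sigma):
    """z = f(eps): sequential, because z_i needs the already generated z_{<i}."""
    z = []
    for i in range(len(eps)):
        mu = sum(w_mu[i][j] * z[j] for j in range(i))
        log_sigma = sum(w_log_sigma[i][j] * z[j] for j in range(i))
        z.append(mu + math.exp(log_sigma) * eps[i])
    return z


def af_prior_log_prob(z, w_mu, w_log_sigma):
    """log p(z) = log u(eps) + log|det d eps / d z| with u = N(0, I)."""
    eps, log_det = af_inverse(z, w_mu, w_log_sigma)
    log_u = -sum(0.5 * e * e + 0.5 * math.log(2 * math.pi) for e in eps)
    return log_u + log_det


W_MU = [[0.0, 0.0], [2.0, 0.0]]
W_LOG_SIGMA = [[0.0, 0.0], [0.5, 0.0]]
z = [0.3, -1.2]
print(af_prior_log_prob(z, W_MU, W_LOG_SIGMA))
```

In this sketch the prior factorizes as $p(z_0)\,p(z_1|z_0)$ with a Gaussian conditional whose mean and scale depend on $z_0$, so evaluating $\log p(z)$ for a $z$ drawn from the posterior takes a single pass, while generating a fresh $z$ from the prior has to proceed one dimension at a time.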
1 Why?
2 VLAE
3 Experiments and Results
19 / 25
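Experiment overview
• Goals
• Does the latent variable capture global information?
• Is the AF prior better than the IAF posterior?
• Does the AR decoder improve the accuracy of density estimation?
• Model under test: AF prior & PixelCNN decoder
• Datasets: binary 28×28 images
• MNIST, OMNIGLOT, Caltech-101 Silhouettes
• The architecture and the latent dimensionality are kept the same
20 / 25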
Lossy Compression - MNIST
Left: input, right: output
• The digit is still recognizable
• Not a mere reconstruction
Figure 3: original & decompressed MNIST
21 / 25
Lossy Compression - OMNIGLOT
Left: input, right: output
• Semantics are not preserved
• The information to keep has to be specified per task and dataset
Figure 4: original & decompressed OMNIGLOT
22 / 25
Sampling from the latent variable
• Fictitious characters
• Global features
Figure 5: Samples from VLAE
23 / 25
Density Estimation
Unconditional Decoder
A plain PixelCNN
24 / 25
Effect of the AF prior
• Density estimation improves
• With AR, the information carried by the latent variable increases
Figure 6: Effect of the AF prior
25 / 25