ICLR 2017 Reading Group @ DeNA / iclr2017atDeNA_VLAE

Masaki Kozuki

June 17, 2017

Transcript

  1. Variational Lossy Autoencoder
    ICLR 2017 Reading Group @ DeNA
    @crcrpar, 2017/6/17
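  2. Paper
    • Variational Lossy Autoencoder
    • Xi Chen (UC Berkeley, OpenAI), Diederik P. Kingma (OpenAI), Tim Salimans (OpenAI), et al.
    • Makes use of latent variables in representation learning
    • Analyzes the latent variables of a VAE through Bits Back Coding
    • Makes the latent variables lossy
    • Makes the distributions $p(z)$ and $q(z|x)$ of the latent variable $z$ flexible
    • Uses PixelCNN as the decoder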
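  3. Notation
    • $x \in \mathbb{R}^d$: data, $x = (x_0 \dots x_{d-1})^\top$
    • $x_{<i}$: all elements of $x$ with index less than $i$, $(x_0 \dots x_{i-1})^\top$
    • $z$: latent variable
    • $p_{data}(x)$: the true distribution that generates the data
    • $D_{KL}(p \| q)$: Kullback-Leibler divergence of $p$ with respect to $q$
    • $\theta$: parameters of the model (NN)
    • AR: autoregressive NN such as PixelCNN
    • $H$, $\mathcal{H}$: entropy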
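  4. VAE
    Objective
    $\log p(X) = \sum_{i=1}^{N} \log p(x^{(i)})$
    Objective used in practice
    $\mathcal{L}(x; \theta) = \mathbb{E}_{q(z|x)}[\log p(x|z)] - D_{KL}(q(z|x) \| p(z))$
    - Can be viewed as a regularized autoencoder.
    Challenges and weaknesses of VAE
    • A decoder that is too expressive ignores the latent variables
    • The information carried by the latent variables cannot be controlled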
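The practical objective on slide 4 can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it assumes a diagonal Gaussian $q(z|x)$, a standard normal prior $p(z)$, and a hypothetical unit-variance Gaussian decoder (`toy_log_likelihood`), none of which are specified on the slide.

```python
import math

def gaussian_kl_to_standard_normal(mu, log_var):
    """Closed-form D_KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def toy_log_likelihood(x, z):
    """Hypothetical decoder log p(x|z): unit-variance Gaussian centred on z."""
    return sum(-0.5 * (xi - zi) ** 2 - 0.5 * math.log(2 * math.pi)
               for xi, zi in zip(x, z))

def elbo_estimate(x, mu, log_var, eps, log_likelihood):
    """Single-sample estimate of E_q[log p(x|z)] - D_KL(q(z|x) || p(z))."""
    # Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, I).
    z = [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, log_var, eps)]
    return log_likelihood(x, z) - gaussian_kl_to_standard_normal(mu, log_var)
```

When `mu` and `log_var` are all zero, the KL term vanishes: $q(z|x)$ equals the prior and $z$ carries no information about $x$, which is the failure mode the slide warns about.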
  5. 1 Why?
      • Intuitively
      • Theory (Bits Back Coding)
    2 VLAE
      • Overview
      • Autoregressive Flow
      • decoder: PixelCNN
    3 Experiments and results
      • Lossy Compression
      • Density Estimation
  6. 1 Why?  2 VLAE  3 Experiments and results

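  7. Intuitively...
    To begin with, an RNN / AR model can approximate an arbitrary distribution.
    1 The latent variables contain almost no information (early in training)
    2 The decoder tries to reconstruct the data directly: $p(x|z) \to p_{decoder}(x)$
    3 Both the posterior and the approximate posterior become the prior: $p(z|x),\, q(z|x) \to p(z)$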
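  8. A little more theoretically...
    VAE ≈ coding
    1 Encode the essence $z$ of the data: $p(z)$
    2 Encode the deviation from $z$: $p(x|z)$
    How long is the code? Naively,
    $C_{naive}(x) = \mathbb{E}_{x \sim data,\, z \sim q(z|x)}[-\log p(z) - \log p(x|z)]$
    Bits Back Coding
    For efficiency, use the encoder distribution $q(z|x)$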
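  9. Bits Back Coding
    Through $q(z|x)$, at most $H(q(z|x))$ bits of information can be conveyed
    (Note: only when the receiver can also access $q(z|x)$)
    Code length of Bits Back Coding
    $C_{naive}$ is wasteful by exactly the information in $q(z|x)$, and
    $\mathcal{L}(x) = \mathbb{E}_{q(z|x)}[\log p(x|z) + \log p(z) - \log q(z|x)]$, so
    $C_{BitsBack}(x) = \mathbb{E}_{x \sim data}[-\mathcal{L}(x)] \geq \mathcal{H}(data) + \mathbb{E}_{x \sim data}[D_{KL}(q(z|x) \| p(z|x))]$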
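The relation between the naive and Bits Back code lengths can be checked on a toy discrete model. The following is a sketch under assumed numbers: the binary latent, binary observation, and every probability table are hypothetical and chosen only for illustration. For a fixed $x$ it computes the naive code length, subtracts the $H(q(z|x))$ bits that are recovered, and compares the result with $-\log p(x) + D_{KL}(q(z|x) \| p(z|x))$.

```python
import math

# Hypothetical toy model: binary latent z, binary observation x.
p_z = [0.5, 0.5]                # prior p(z)
p_x_given_z = [[0.9, 0.1],      # p(x | z=0)
               [0.2, 0.8]]      # p(x | z=1)
q_z_given_x = [[0.7, 0.3],      # q(z | x=0), deliberately not the exact posterior
               [0.2, 0.8]]      # q(z | x=1)

def code_lengths(x):
    """Return code lengths in bits for one observation x."""
    q = q_z_given_x[x]
    joint = [p_z[z] * p_x_given_z[z][x] for z in range(2)]   # p(z) p(x|z)
    p_x = sum(joint)                                          # marginal p(x)
    posterior = [j / p_x for j in joint]                      # exact p(z|x)
    # Naive cost: send z under p(z), then x under p(x|z), with z ~ q(z|x).
    c_naive = sum(q[z] * -math.log2(joint[z]) for z in range(2))
    # Bits recovered by the receiver: the entropy of q(z|x).
    h_q = -sum(q[z] * math.log2(q[z]) for z in range(2))
    kl = sum(q[z] * math.log2(q[z] / posterior[z]) for z in range(2))
    return {"naive": c_naive, "bits_back": c_naive - h_q,
            "neg_log_px": -math.log2(p_x), "kl": kl}
```

Calling `code_lengths(0)` or `code_lengths(1)` shows that the Bits Back length exceeds the ideal $-\log_2 p(x)$ by exactly the KL gap between the approximate and exact posteriors.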
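  10. Bits Back Coding
    • Minimizing the code length = maximizing the variational lower bound
      → $z$ is used only when coding through it is effective
    • A more accurate posterior would make variational inference more precise, but none exists at present
      → $D_{KL}$ ($\geq 0$) cannot be ignored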
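  11. Information Preference
    $z$ is ignored when $p(x|z)$ can model $p_{data}(x)$ without any information from $z$:
    1 the posterior $p(z|x)$ becomes $p(z)$, and
    2 the approximate posterior $q(z|x)$ also becomes $p(z)$ (∵ this makes the KL term small)
    Information Preference
    • Information that can be modeled locally without $z$ is encoded locally
    • All other information is encoded using $z$
    Ways to hack the latent variables: free bits, annealing the relative weight of $D_{KL}$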
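The two workarounds named on slide 11 can be sketched as follows. The slide gives no formulas, so this is one common formulation and an assumption here: a linear warm-up of the KL weight, and a per-dimension floor on the KL term. The names `kl_weight`, `free_bits_kl`, `warmup_steps`, and `free_bits` are hypothetical.

```python
def kl_weight(step, warmup_steps):
    """Linear annealing of the relative weight of D_KL from 0 up to 1."""
    return min(1.0, step / warmup_steps)

def free_bits_kl(kl_per_dim, free_bits):
    """Floor each latent dimension's KL at `free_bits` before summing.

    Below the floor the penalty is constant, so the optimizer gains nothing
    by pushing that dimension's KL further toward zero.
    """
    return sum(max(free_bits, kl) for kl in kl_per_dim)

def annealed_loss(neg_log_likelihood, kl, step, warmup_steps):
    """Negative ELBO with an annealed KL weight."""
    return neg_log_likelihood + kl_weight(step, warmup_steps) * kl
```

Both tricks encourage the latent variables to be used, but neither gives control over which information they end up carrying.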
  12. 1 Why?  2 VLAE  3 Experiments and results

  13. Model overview
    1 A flexible prior
    2 An expressive decoder

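  14. Improving the prior
    • It is doubtful whether a spherical Gaussian or a uniform distribution is appropriate
    • A better prior is indispensable for making use of the latent variables
    • → autoregressive flow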
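  15. Autoregressive Flow
    About normalizing flows
    • Invertible transformations from a simple distribution to a flexible one
    • general normalizing flow
    • volume preserving flow
    • They differ in how the Jacobian is handled
    Characteristic of AF
    Same computational cost as IAF, but the model is deeper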
  16. Inverse Autoregressive Flow
    $z_t = \mu_t + \sigma_t \odot z_{t-1}$
    $\log q(z_T|x) = -\sum_{i=1}^{D} \left( \frac{1}{2}\epsilon_i^2 + \frac{1}{2}\log(2\pi) + \sum_{t=0}^{T} \log \sigma_{t,i} \right)$
    Figure 1: Overview of IAF
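The IAF update and its log-density on slide 16 can be traced with a small sketch. It assumes the initial step is $z_0 = \mu_0 + \sigma_0 \odot \epsilon$ with $\epsilon \sim N(0, I)$, and it uses a hypothetical hand-written autoregressive function `toy_ar_net` in place of a learned network; neither is taken from the slide.

```python
import math

def toy_ar_net(z):
    """Hypothetical autoregressive map: mu_i and sigma_i depend only on z_{<i}."""
    mu, sigma, prefix = [], [], 0.0
    for zi in z:
        mu.append(0.1 * prefix)
        # Sigmoid keeps sigma strictly positive.
        sigma.append(1.0 / (1.0 + math.exp(-(1.0 + 0.5 * prefix))))
        prefix += zi
    return mu, sigma

def iaf_sample(eps, mu0, sigma0, steps):
    """Apply z_t = mu_t + sigma_t * z_{t-1} and accumulate log q(z_T | x)."""
    z = [m + s * e for m, s, e in zip(mu0, sigma0, eps)]
    log_q = -sum(0.5 * e * e + 0.5 * math.log(2 * math.pi) + math.log(s)
                 for e, s in zip(eps, sigma0))
    for ar_net in steps:
        mu, sigma = ar_net(z)            # computed from z_{t-1} in one pass
        z = [m + s * zi for m, s, zi in zip(mu, sigma, z)]
        # Triangular Jacobian: the log-determinant is the sum of log sigma.
        log_q -= sum(math.log(s) for s in sigma)
    return z, log_q
```

Because each step's Jacobian is triangular, the density update only needs the diagonal terms $\sigma_{t,i}$, which is what makes the formula on the slide cheap to evaluate.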
  17. IAF posterior
    A flexible posterior is obtained!
    Figure 2: Posterior of IAF

  18. AF prior ≡ IAF posterior
    $\mathcal{L}(x; \theta) = \mathbb{E}_{z \sim q(z|x)}[\log p(x|z) + \log p(z) - \log q(z|x)]$
    $= \mathbb{E}_{z \sim q(z|x),\, \epsilon = f^{-1}(z)}\left[\log p(x|f(\epsilon)) + \log u(\epsilon) + \log\left|\det \frac{d\epsilon}{dz}\right| - \log q(z|x)\right]$
    $= \mathbb{E}_{z \sim q(z|x),\, \epsilon = f^{-1}(z)}\left[\log p(x|f(\epsilon)) + \log u(\epsilon) - \underbrace{\left(\log q(z|x) - \log\left|\det \frac{d\epsilon}{dz}\right|\right)}_{\text{IAF posterior}}\right]$
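The change of variables behind slide 18, $\log p(z) = \log u(\epsilon) + \log|\det \frac{d\epsilon}{dz}|$ with $\epsilon = f^{-1}(z)$, can be sketched for an affine autoregressive flow. This is an illustration under assumptions: $u$ is taken to be a standard normal, $\epsilon_i = (z_i - \mu_i(z_{<i})) / \sigma_i(z_{<i})$, and `toy_ar_net` is a hypothetical stand-in for a learned network.

```python
import math

def toy_ar_net(z):
    """Hypothetical autoregressive map: mu_i and sigma_i depend only on z_{<i}."""
    mu, sigma, prefix = [], [], 0.0
    for zi in z:
        mu.append(0.1 * prefix)
        sigma.append(1.0 / (1.0 + math.exp(-(1.0 + 0.5 * prefix))))
        prefix += zi
    return mu, sigma

def af_prior_log_prob(z, ar_net):
    """Density evaluation needs one pass: eps = f^{-1}(z) is computed in parallel."""
    mu, sigma = ar_net(z)
    eps = [(zi - m) / s for zi, m, s in zip(z, mu, sigma)]
    log_u = sum(-0.5 * e * e - 0.5 * math.log(2 * math.pi) for e in eps)
    log_det = -sum(math.log(s) for s in sigma)   # log |det d eps / d z|
    return log_u + log_det, eps

def af_prior_sample(eps, ar_net):
    """Sampling z = f(eps) is sequential, because z_i needs z_{<i}."""
    z = []
    for i, e in enumerate(eps):
        # Padding does not affect entry i, which only reads z_{<i}.
        mu, sigma = ar_net(z + [0.0] * (len(eps) - len(z)))
        z.append(mu[i] + sigma[i] * e)
    return z
```

Training the VAE only requires evaluating $\log p(z)$ on samples from $q(z|x)$, so the slow sequential direction is needed only when generating from the prior.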
  19. 1 Why?  2 VLAE  3 Experiments and results

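  20. Experiment overview
    • Goals
      • Do the latent variables capture global information?
      • Is the AF prior better than the IAF posterior?
      • Does the AR decoder improve the accuracy of density estimation?
    • Model under test: AF prior & PixelCNN decoder
    • Datasets: binary 28×28 images
      • MNIST, OMNIGLOT, Caltech-101 Silhouettes
    • The architecture and the latent dimensionality are kept the same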
  21. Lossy Compression - MNIST
    Left: input, right: output
    • The digit is still recognizable
    • Not a mere reconstruction
    Figure 3: original & decompressed MNIST
  22. Lossy Compression - OMNIGLOT
    Left: input, right: output
    • Semantics are not preserved
    • The information to keep has to be specified for each task and dataset
    Figure 4: original & decompressed OMNIGLOT
  23. Sampling from the latent variables
    • Imaginary digits
    • Global features
    Figure 5: Samples from VLAE
  24. Density Estimation
    The unconditional decoder is a simple PixelCNN

  25. Effect of the AF prior
    • Density estimation improves
    • AR increases the information carried by the latent variables
    Figure 6: Effect of the AF prior