
Convolute all the things

Class presentation for Neural Computation Fall 2016

David Nicholson

October 17, 2016

Transcript

  1. LONG ET AL., INTRO • Q: How do you make a network fully convolutional? • A: by converting its fully connected layers into convolutions (see sketch 1 after the transcript)
  2. LONG ET AL., INTRO • Okay, but how do we obtain “dense” predictions, i.e., predictions for every pixel in the output? 1. Shift-and-stitch, or equivalently “à trous” / dilated convolution 2. Upsampling, AKA backwards convolution or deconvolution (see sketch 2 after the transcript)
  3. LONG ET AL., RESULTS • Adding the “deep jet” with skip layers improved the segmentation detail
  4. VAN DEN OORD ET AL.: INTRO • A generative model for raw audio – “What if we used PixelCNN on audio data?”
  5. VAN DEN OORD ET AL.: INTRO • Even more secret ingredient: dilated causal convolution (see sketch 3 after the transcript)
  6. VAN DEN OORD ET AL.: INTRO • Even more secret ingredient: dilated causal convolution
  7. VAN DEN OORD ET AL.: INTRO • Yet more secret ingredients: – Output is a softmax layer trained on transformed data • a non-linear transformation (µ-law companding) that can be mapped back to the full range of 16-bit audio output – Gated activation units – Residual and skip connections (see sketch 4 after the transcript)
  8. VAN DEN OORD ET AL.: INTRO • Your model needs a conditioner – global conditioning – local conditioning (see sketch 5 after the transcript)
  9. VAN DEN OORD ET AL.: RESULTS • 3.1: We got it to make up speech. • 3.2: It did better than other models on text-to-speech (TTS) – the baselines are concatenative (HMM-driven unit selection) and statistical parametric (LSTM-RNN)
  10. VAN DEN OORD ET AL.: RESULTS • 3.2: It did better than other models on text-to-speech (TTS) (cont.) • 3.3: We got it to make music.
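
Sketch 1 (slide 1): a minimal sketch of what “making a network fully convolutional” means in Long et al. — the fully connected classifier head is recast as a convolution, so the same weights accept inputs of any size and output a spatial map of class scores. PyTorch and all names here are my own choices for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

# A tiny fully connected "classifier head" and its fully convolutional equivalent.
# The FC weights are simply reshaped into a 7x7 convolution kernel, so the same
# parameters now produce a class-score map instead of a single prediction.
fc_head = nn.Linear(512 * 7 * 7, 21)            # expects a fixed 7x7x512 feature map
conv_head = nn.Conv2d(512, 21, kernel_size=7)   # same weights, viewed as a kernel

with torch.no_grad():
    conv_head.weight.copy_(fc_head.weight.view(21, 512, 7, 7))
    conv_head.bias.copy_(fc_head.bias)

features = torch.randn(1, 512, 14, 14)          # a larger input than training size
scores = conv_head(features)                    # -> (1, 21, 8, 8) coarse score map
print(scores.shape)
```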
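
Sketch 2 (slide 2): the two dense-prediction tricks from the slide, again as an illustrative sketch rather than the paper's implementation — a dilated (“à trous”) convolution that enlarges the receptive field without shrinking the feature map, and a transposed (“backwards”) convolution that upsamples the coarse scores back toward input resolution.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 32, 32)

# Dilated / "a trous" convolution: a 3x3 kernel with holes (dilation=2) covers a
# 5x5 window, and with matching padding the spatial resolution is preserved.
dilated = nn.Conv2d(64, 64, kernel_size=3, dilation=2, padding=2)
y = dilated(x)                                   # -> (1, 64, 32, 32)

# Upsampling by transposed ("backwards") convolution, a.k.a. deconvolution:
# stride 2 doubles the spatial resolution of the score map.
upsample = nn.ConvTranspose2d(64, 21, kernel_size=4, stride=2, padding=1)
scores = upsample(y)                             # -> (1, 21, 64, 64)
print(y.shape, scores.shape)
```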
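
Sketch 3 (slides 5–6): dilated causal convolution. The sketch below is illustrative (PyTorch, my own class name): padding only on the left keeps the convolution causal, so the output at time t never depends on future samples, and doubling the dilation at each layer grows the receptive field exponentially with depth.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """Dilated 1-D convolution that only looks at the past (hypothetical helper)."""
    def __init__(self, channels, kernel_size=2, dilation=1):
        super().__init__()
        self.left_pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                        # x: (batch, channels, time)
        x = F.pad(x, (self.left_pad, 0))         # pad the past only, never the future
        return self.conv(x)

x = torch.randn(1, 16, 100)
stack = nn.Sequential(*[CausalConv1d(16, dilation=2 ** i) for i in range(4)])
out = stack(x)                                   # -> (1, 16, 100)
# receptive field = 1 + (2 - 1) * (1 + 2 + 4 + 8) = 16 samples
print(out.shape)
```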
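
Sketch 4 (slide 7): the remaining ingredients, sketched rather than reproduced from the paper. (a) The non-linear transform is µ-law companding: it squashes 16-bit audio into 256 levels so the output can be a 256-way softmax, and it can be inverted back to the full dynamic range. (b) A gated activation unit, z = tanh(W_f * x) ⊙ σ(W_g * x), wrapped with a residual connection and a skip output. All class and function names are hypothetical.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def mu_law_encode(audio, mu=255):
    """Map audio in [-1, 1] to integer classes in [0, mu] (256-way softmax targets)."""
    compressed = torch.sign(audio) * torch.log1p(mu * audio.abs()) / math.log1p(mu)
    return torch.round((compressed + 1) / 2 * mu).long()

def mu_law_decode(labels, mu=255):
    """Invert mu-law companding back to audio in [-1, 1]."""
    compressed = labels.float() / mu * 2 - 1
    return torch.sign(compressed) * ((1 + mu) ** compressed.abs() - 1) / mu

class GatedResidualBlock(nn.Module):
    """Gated activation unit with residual and skip connections (illustrative)."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.filter = nn.Conv1d(channels, channels, 2, dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, 2, dilation=dilation)
        self.residual = nn.Conv1d(channels, channels, 1)
        self.skip = nn.Conv1d(channels, channels, 1)
        self.left_pad = dilation                 # (kernel_size - 1) * dilation, causal

    def forward(self, x):                        # x: (batch, channels, time)
        padded = F.pad(x, (self.left_pad, 0))
        z = torch.tanh(self.filter(padded)) * torch.sigmoid(self.gate(padded))
        return x + self.residual(z), self.skip(z)   # residual output, skip output

audio = torch.rand(1, 16000) * 2 - 1
labels = mu_law_encode(audio)                    # targets for the 256-way softmax
restored = mu_law_decode(labels)                 # approximately recovers the waveform
```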
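
Sketch 5 (slide 8): conditioning, sketched for the global case only. A per-utterance vector h (for example a speaker embedding) is projected and added inside the gate, z = tanh(W_f * x + V_f h) ⊙ σ(W_g * x + V_g h); local conditioning works the same way but with a time-varying signal such as linguistic features. PyTorch and the class name are my own choices for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GloballyConditionedGate(nn.Module):
    """Gated activation unit with a global conditioning vector h (illustrative)."""
    def __init__(self, channels, cond_dim, dilation):
        super().__init__()
        self.filter = nn.Conv1d(channels, channels, 2, dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, 2, dilation=dilation)
        self.cond_filter = nn.Linear(cond_dim, channels)
        self.cond_gate = nn.Linear(cond_dim, channels)
        self.left_pad = dilation                 # causal padding for kernel size 2

    def forward(self, x, h):                     # x: (B, C, T), h: (B, cond_dim)
        padded = F.pad(x, (self.left_pad, 0))
        f = self.filter(padded) + self.cond_filter(h).unsqueeze(-1)  # broadcast over time
        g = self.gate(padded) + self.cond_gate(h).unsqueeze(-1)
        return torch.tanh(f) * torch.sigmoid(g)

x, h = torch.randn(2, 16, 100), torch.randn(2, 8)
print(GloballyConditionedGate(16, cond_dim=8, dilation=2)(x, h).shape)   # -> (2, 16, 100)
```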