
Understanding Neural Network Architectures with Attention and Diffusion

Neural networks have revolutionized AI, enabling machines to learn from data and make intelligent decisions. In this talk, we'll explore two popular architectures: Attention models and Diffusion models.

First up, we'll discuss Attention models and how they've contributed to the success of large language models like ChatGPT. We'll explore how the Attention mechanism helps GPT focus on specific parts of a text sequence and how this mechanism has been applied to different tasks in natural language processing.

Next, we'll dive into Diffusion models, a class of generative models that have shown remarkable performance in image synthesis. We'll explain how they work and their potential applications in the creative industry.

By the end of the talk, you'll have a better understanding of these cutting-edge neural network architectures.

Michał Karzyński

July 20, 2023

Transcript

1. THE TALK: MODELS
   - Transformers (natural language: GPT, BERT, T5), built on attention
   - Diffusion (images: Stable Diffusion, Midjourney, DALL-E), built on convolution & attention
2. THE TALK: OPERATIONS
   - Linear, a.k.a. dense or fully-connected
   - Convolution: filters scan the input to produce feature maps
   - Attention: a key-value store lookup
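To make the first two operations concrete, here is a minimal NumPy sketch of linear and convolution; attention gets its own example under slide 5. All tensors and shapes here are illustrative, not from the talk.

```python
import numpy as np

# Linear (a.k.a. dense / fully-connected): every output is a
# weighted sum of all inputs, plus a bias.
x = np.random.randn(4)                           # input vector
W, b = np.random.randn(3, 4), np.random.randn(3)
y = W @ x + b                                    # shape (3,)

# Convolution: a small filter scans across the input,
# producing a feature map of local responses.
signal = np.random.randn(10)
kernel = np.array([1.0, 0.0, -1.0])              # simple edge-detecting filter
feature_map = np.convolve(signal, kernel, mode="valid")  # shape (8,)
```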
3. [Diagram: ResNet-18, a stack of convolutions and max pooling with residual "add" skip connections, followed by average pooling and a linear layer]
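The "add" arrows in the diagram are residual skip connections. A minimal PyTorch sketch of one such block, simplified by omitting batch normalization and downsampling; the channel count is illustrative:

```python
import torch
from torch import nn

class ResidualBlock(nn.Module):
    """Sketch of a ResNet-style basic block (batch norm omitted for brevity)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # the "add": the input skips past the convolutions

block = ResidualBlock(channels=16)
x = torch.randn(1, 16, 32, 32)     # (batch, channels, height, width)
assert block(x).shape == x.shape   # residual blocks preserve the shape
```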
4. [Diagram: Convolutional U-Net. Convolutions and max pooling on the way down, deconvolutions and unpooling on the way up, with feature maps concatenated ("c") across matching levels]
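A minimal PyTorch sketch of the U-Net idea with a single down/up level; channel sizes are illustrative. The key detail is the "c" in the diagram: the encoder's feature map is concatenated with the upsampled one, so fine spatial detail survives the bottleneck.

```python
import torch
from torch import nn

class TinyUNet(nn.Module):
    """Sketch of a convolutional U-Net with one down/up level."""
    def __init__(self):
        super().__init__()
        self.down = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = nn.Conv2d(16, 16, kernel_size=3, padding=1)
        self.up = nn.ConvTranspose2d(16, 16, kernel_size=2, stride=2)  # "deconvolution"
        self.out = nn.Conv2d(32, 1, kernel_size=3, padding=1)  # 32 = 16 skip + 16 upsampled

    def forward(self, x):
        skip = torch.relu(self.down(x))                  # encoder feature map
        h = torch.relu(self.bottleneck(self.pool(skip)))
        h = self.up(h)                                   # back to the skip's resolution
        h = torch.cat([skip, h], dim=1)                  # the "concatenate" skip connection
        return self.out(h)

net = TinyUNet()
x = torch.randn(1, 1, 32, 32)
assert net(x).shape == x.shape   # U-Nets map images to same-sized images
```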
5. OPERATION: ATTENTION

       store = {
           'key1': 'value1',
           'key2': 'value2',
           'key3': 'value3',
       }
       query = 'key1'
       value = store[query]
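The dict lookup above is exact: the query must equal one key. Neural attention relaxes this into a soft, differentiable lookup, where the query scores every key and the result is a weighted mix of all the values. A minimal NumPy sketch of scaled dot-product attention; shapes are illustrative:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: a soft version of store[query].
    Instead of matching one key exactly, the query scores *all* keys
    and returns a weighted mix of their values."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how well each query matches each key
    weights = softmax(scores)        # rows sum to 1, like probabilities
    return weights @ V

K = np.random.randn(3, 8)   # 3 keys   (cf. 'key1', 'key2', 'key3')
V = np.random.randn(3, 8)   # 3 values (cf. 'value1', 'value2', 'value3')
Q = np.random.randn(1, 8)   # 1 query
out = attention(Q, K, V)    # shape (1, 8): a blend of the three values
```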
6. [Diagram: Transformer. Token embedding plus positional encoding feeds Nx stacked blocks of multi-head attention (K, V, Q) with residual "add" connections and linear layers; each generated word is appended to the output and fed back in as input]
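The "append word to output" loop is autoregressive decoding: the Transformer predicts one token, appends it to its own input, and repeats. A minimal sketch, assuming a hypothetical `model` that maps a token sequence to next-token logits; the token and stop-token IDs are illustrative:

```python
import torch

def generate(model, prompt_ids, max_new_tokens=20, eos_id=0):
    """Greedy autoregressive decoding: the 'append word to output' loop."""
    tokens = list(prompt_ids)
    for _ in range(max_new_tokens):
        inp = torch.tensor([tokens])           # (1, sequence_length)
        logits = model(inp)                    # (1, sequence_length, vocab_size)
        next_id = int(logits[0, -1].argmax())  # most likely next token
        tokens.append(next_id)                 # append word to output...
        if next_id == eos_id:                  # ...until the model says stop
            break
    return tokens
```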
7. [Diagram: the two diffusion processes. FORWARD: repeatedly add generated noise. BACKWARD: repeatedly subtract the noise estimated by the network]
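A minimal sketch of one step of each process, assuming a hypothetical `noise_predictor` network and a single fixed noise scale; real diffusion models use a carefully tuned schedule of scales across many timesteps:

```python
import torch

def forward_step(x, noise_scale=0.1):
    """FORWARD: corrupt the image by adding generated (random Gaussian) noise."""
    noise = torch.randn_like(x)
    return x + noise_scale * noise

def backward_step(x, noise_predictor, noise_scale=0.1):
    """BACKWARD: restore the image by subtracting the noise the network estimates."""
    estimated_noise = noise_predictor(x)
    return x - noise_scale * estimated_noise
```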
8. [Diagram: Latent Diffusion. The prompt "Logo for EuroPython in Prague" is embedded by a BERT encoder; a U-Net of ResBlocks and Spatial Transformers (multi-head attention with K, V, Q) denoises in latent space, guided by a timestep/positional encoding, with convolutions, up/down sampling, deconvolution, and concatenated skip connections]
9. "Social media was the first contact between A.I. and humanity, and humanity lost." (Yuval Harari)