OpenTalks.AI - Sergey Bartunov, Distributed, compressive and implicit memory models

OpenTalks.AI
February 21, 2020


  1. Compressive, distributed and implicit neural memory

    Sergey Bartunov

  2. Private & Confidential

    Why memory?
    ▪ Partial observability
    ▪ Time sequentiality
    ▪ Cognitive

  3. What kind of memory we need

    ▪ Long episodes and generalization require compression
    ▪ Distributed representations for efficient memory access and robustness
    ▪ Implicit operation is often implied by the above
  4. Recurrent Neural Network (diagram: input x, hidden state h, output y)

    Pros
    ▪ General
    ▪ Distributed
    ▪ Compressive
    ▪ Explicit
    Cons
    ▪ Training issues
    ▪ Persistence
    ▪ Scalability
  5. Slot-based memory

    Weston et al, 2014
    Memory is a set of slot vectors: m_1, m_2, ..., m_K
    Reading from a query: [formula not captured in transcript]
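The read formula on the slide above is missing from this transcript. As a minimal sketch, assuming the common content-based form (dot-product scores between the query and each slot, a softmax over the scores, and a weighted sum of slots), a read could look like the following. The function names and the toy slot values are hypothetical and are not taken from the talk.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read_memory(slots, query):
    """Soft read: weights w_k over slots m_k, then a weighted sum of slots.

    Assumes dot-product similarity; the slide's exact formula is not
    available in this transcript.
    """
    scores = [sum(m_i * q_i for m_i, q_i in zip(m, query)) for m in slots]
    weights = softmax(scores)
    dim = len(query)
    readout = [sum(w * m[i] for w, m in zip(weights, slots)) for i in range(dim)]
    return weights, readout

slots = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # K = 3 toy slot vectors
weights, readout = read_memory(slots, query=[4.0, 0.0])
```

Every read touches all K slots, which is one way to see the linear memory cost that the talk lists among the drawbacks of slot-based memory.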
  6. Applications: reasoning

  7. Applications: RL with episodic memory

    Pritzel et al, 2017
  8. Slot-based memory (diagram: input x, hidden state h, query q, slots m_1, m_2, ..., m_K with weights w_1, w_2, ..., w_K, output y)

    Pros
    ▪ Scalable*
    ▪ Explicit
    Cons
    ▪ Linear memory cost
    ▪ Lack of generalization
  9. Writing strategies

    ▪ Add new slots
    ▪ Write to existing slots
    Graves et al, 2016
  10. Writing via posterior inference (Kanerva Machine)

    Wu et al, 2018
    Memory is a distribution
    Bayesian posterior update
    Likelihood model
  11. Benefits of compression

    Rae et al, 2019; Wu et al, 2018
  12. Compressive transformer

    Rae et al, 2020
  13. Implicit memory: Hopfield network (diagram: units x_1, x_2, x_3)
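The slide only shows a diagram, so here is a small sketch of the classical binary Hopfield network for orientation: patterns are written into a symmetric weight matrix with a Hebbian outer-product rule, and recall runs asynchronous sign updates until the state settles. The helper names and the six-unit toy pattern are illustrative choices, not material from the talk.

```python
def store(patterns):
    """Hebbian rule: W = sum over patterns of x x^T, with a zero diagonal."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for x in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += x[i] * x[j]
    return W

def energy(W, x):
    """E(x) = -1/2 * x^T W x; asynchronous updates never increase it."""
    n = len(x)
    return -0.5 * sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def recall(W, x, sweeps=5):
    """Asynchronous sign updates, one unit at a time, for a fixed number of sweeps."""
    x = list(x)
    for _ in range(sweeps):
        for i in range(len(x)):
            field = sum(W[i][j] * x[j] for j in range(len(x)))
            x[i] = 1 if field >= 0 else -1
    return x

pattern = [1, -1, 1, -1, 1, -1]
W = store([pattern])
corrupted = [1, -1, 1, -1, -1, -1]  # one unit flipped
restored = recall(W, corrupted)
```

The memory is implicit in the sense that no slot holds the pattern: it exists only as a low-energy state of the weight matrix, and retrieval is the descent towards it.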
  14. Energy-based models

    Du & Mordatch, 2019

  15. RNN with Fast Weights (diagram: input x, hidden state h, memory M, output y; repeat S times)

    Ba et al, 2016; Miconi et al, 2018
    Soft nearest neighbour search: [formula not captured in transcript]
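The formula for the soft nearest neighbour search is not in the transcript. One way to read the slide, offered here as an assumption, is that a Hebbian fast-weight matrix M accumulates decayed outer products of past hidden states, so multiplying M by a query returns past states weighted by their dot-product similarity to the query. The sketch below checks that equivalence numerically. The decay and rate values, the helper names, and the omission of the inner loop repeated S times are all simplifications, and the weights are left unnormalized.

```python
def outer_update(M, h, decay=0.9, rate=0.5):
    """Fast-weight write: M <- decay * M + rate * h h^T (Hebbian outer product)."""
    n = len(h)
    return [[decay * M[i][j] + rate * h[i] * h[j] for j in range(n)] for i in range(n)]

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def similarity_readout(history, query, decay=0.9, rate=0.5):
    """The same quantity computed explicitly: each past state is weighted by
    its (decayed) dot product with the query."""
    out = [0.0] * len(query)
    steps = len(history)
    for t, h in enumerate(history):
        dot = sum(a * b for a, b in zip(h, query))
        weight = rate * decay ** (steps - 1 - t) * dot
        out = [o + weight * a for o, a in zip(out, h)]
    return out

history = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5], [0.5, 0.5, 1.0]]
M = [[0.0] * 3 for _ in range(3)]
for h in history:
    M = outer_update(M, h)

query = [1.0, 0.0, 0.0]
via_matrix = matvec(M, query)
via_history = similarity_readout(history, query)
```

The two readouts agree, which shows that the matrix form needs no explicit list of past states: storage cost depends on the state dimension rather than on the number of stored items.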
  16. Meta-Learning Deep Energy-Based Memory Models

    Sergey Bartunov, Jack Rae, Simon Osindero, Timothy Lillicrap
  17. Implicit memory and attractor models

    Update dynamics: [formula not captured in transcript]
    Memories are attractors: [formula not captured in transcript]
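Both formulas on this slide are missing from the transcript. A minimal sketch of the general idea, assuming the update dynamics are gradient descent on an energy function and that stored memories sit at its local minima, is shown below with a hand-written one-dimensional double-well energy. The energy function, step size and iteration count are toy choices made for illustration; in the talk's setting the energy would be a learned network.

```python
def energy(x):
    """Toy double-well energy with minima (attractors) at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def energy_grad(x):
    """dE/dx = 4 x (x^2 - 1)."""
    return 4.0 * x * (x * x - 1.0)

def retrieve(x, step=0.05, iterations=200):
    """Iterate x <- x - step * dE/dx, starting from a distorted query."""
    for _ in range(iterations):
        x = x - step * energy_grad(x)
    return x

right = retrieve(0.6)   # starts in the basin of the memory at +1
left = retrieve(-0.3)   # starts in the basin of the memory at -1
```

Each starting point is pulled to the minimum of its own basin, which is the sense in which a distorted query retrieves the nearest stored memory.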
  18. Network weights as distributed storage

    ▪ Fully-connected layers: static weights, memory weights
    ▪ Convolutional layers: memory filters, static filters
  19. Meta-learning the writing rule

    1. For a known family define a writing loss
    2. Write into the memory by minimizing the writing loss (repeat K times)
    3. Meta-learn the initialization and gradient descent schedule
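The first two steps of this recipe can be sketched as follows, under strong simplifying assumptions: the memory is a small linear map, the writing loss is squared error on the key-value pairs to store, and writing is K plain gradient steps. This is not the energy-based writing loss from the paper. The initialization and the learning-rate schedule, which the recipe says would be meta-learned, are fixed by hand here, and all names are hypothetical.

```python
def predict(W, x):
    """Linear memory readout: y = W x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def writing_loss(W, pairs):
    """Assumed writing loss: squared error of the linear memory on the stored pairs."""
    return sum(
        sum((p - y_i) ** 2 for p, y_i in zip(predict(W, x), y))
        for x, y in pairs
    )

def write(W, pairs, schedule):
    """One gradient step on the writing loss per learning rate (K = len(schedule))."""
    for lr in schedule:
        grad = [[0.0] * len(W[0]) for _ in W]
        for x, y in pairs:
            err = [p - y_i for p, y_i in zip(predict(W, x), y)]
            for i, e in enumerate(err):
                for j, x_j in enumerate(x):
                    grad[i][j] += 2.0 * e * x_j
        W = [[w - lr * g for w, g in zip(w_row, g_row)] for w_row, g_row in zip(W, grad)]
    return W

pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.0, 1.0], [1.0, 0.0])]  # key -> value
W_init = [[0.0, 0.0], [0.0, 0.0]]            # would be meta-learned; fixed here
schedule = [0.25, 0.25, 0.25, 0.25, 0.25]    # would be meta-learned; fixed here
W_written = write(W_init, pairs, schedule)
```

The third step would wrap this in an outer loop that differentiates retrieval quality with respect to `W_init` and `schedule`; that requires an automatic differentiation library and is left out of the sketch.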
  20. Associative retrieval

  21. Distortion-rate analysis

  22. Compressing correlated data

  23. Thank you! Questions?

  24. References

    • Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014).
    • Pritzel, Alexander, et al. "Neural episodic control." Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR.org, 2017.
    • Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature 538.7626 (2016): 471-476.
    • Wu, Yan, et al. "Learning attractor dynamics for generative memory." Advances in Neural Information Processing Systems. 2018.
    • Rae, Jack W., Sergey Bartunov, and Timothy P. Lillicrap. "Meta-learning neural bloom filters." arXiv preprint arXiv:1906.04304 (2019).
    • Du, Yilun, and Igor Mordatch. "Implicit generation and generalization in energy-based models." arXiv preprint arXiv:1903.08689 (2019).
    • Munkhdalai, Tsendsuren, et al. "Metalearned neural memory." Advances in Neural Information Processing Systems. 2019.
    • Miconi, Thomas, Jeff Clune, and Kenneth O. Stanley. "Differentiable plasticity: training plastic neural networks with backpropagation." arXiv preprint arXiv:1804.02464 (2018).
    • Rae, Jack W., et al. "Compressive transformers for long-range sequence modelling." arXiv preprint arXiv:1911.05507 (2019).
  25. Picture credits

    • http://adelsepehr.profcms.um.ac.ir/index.php?mclang=fa-IR
    • http://www.9minecraft.net/first-person-model-mod/
    • https://www.planetminecraft.com/project/just-a-sneak-peak/
    • http://www.bristol.ac.uk/synaptic/pathways/
    • http://ce.sharif.edu/courses/92-93/1/ce957-1/resources/root/Lectures/Lecture23.pdf