OpenTalks.AI

Compressive, distributed and implicit neural memory
Sergey Bartunov

Private & Confidential

Why memory?
▪ Partial observability
▪ Time sequentiality
▪ Cognitive importance

What kind of memory we need
▪ Long episodes and generalization require compression
▪ Distributed representations for efficient memory access and robustness
▪ Implicit operation is often implied by the above

Recurrent Neural Network
[Diagram labels: x, h, y]
Pros
▪ General
▪ Distributed
▪ Compressive
▪ Explicit
Cons
▪ Training issues
▪ Persistence
▪ Scalability

Slot-based memory
Memory is a set of slot vectors: m_1, m_2, ..., m_K
Reading from a query:
Weston et al, 2014
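
A minimal sketch of one common way to read from a slot memory, assuming dot-product similarity and softmax weights. Both choices, and the names `read_memory` and `beta`, are illustrative assumptions rather than the formula from the slide.

```python
import math

def read_memory(query, slots, beta=1.0):
    """Soft read: softmax over query-slot dot products, then a weighted sum of slots."""
    scores = [beta * sum(q * m for q, m in zip(query, slot)) for slot in slots]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # one weight w_k per slot m_k
    dim = len(slots[0])
    readout = [sum(w * slot[d] for w, slot in zip(weights, slots)) for d in range(dim)]
    return readout, weights

# Three slots; the query is most similar to the first one.
slots = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
readout, weights = read_memory([2.0, 0.0], slots, beta=5.0)
```

Every read touches all K slots, so the cost of a read grows linearly with memory size.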

Applications: reasoning

Applications: RL with episodic memory
Pritzel et al, 2017

Slot-based memory
[Diagram labels: x, h, q, m_1, m_2, ..., m_K, w_1, w_2, ..., w_K, y]
Pros
▪ Scalable*
▪ Explicit
Cons
▪ Linear memory cost
▪ Lack of generalization

Writing strategies
▪ Add new slots
▪ Write to existing slots
Graves et al, 2016
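
The two strategies can be sketched as follows. This is a simplified illustration: the function names, the scalar `erase` factor and the write weights are assumptions, and the second function only mimics the erase-then-add form of soft writes, not the full addressing scheme of Graves et al.

```python
def write_new_slot(slots, value):
    """Strategy 1: grow the memory by appending the value as a new slot."""
    return slots + [list(value)]

def write_existing_slots(slots, weights, value, erase=1.0):
    """Strategy 2: blend the value into existing slots in proportion to write weights."""
    updated = []
    for w, slot in zip(weights, slots):
        # Erase a fraction (erase * w) of the old content, then add w * value.
        updated.append([(1.0 - erase * w) * s + w * v for s, v in zip(slot, value)])
    return updated

slots = [[1.0, 1.0], [0.0, 0.0]]
grown = write_new_slot(slots, [3.0, 4.0])                        # memory now has 3 slots
overwritten = write_existing_slots(slots, [0.0, 1.0], [3.0, 4.0])  # memory stays at 2 slots
```

Appending keeps every item intact at the price of growing memory; writing to existing slots keeps memory size fixed but can overwrite older content.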

Writing via posterior inference (Kanerva Machine)
Wu et al, 2018
▪ Memory is a distribution
▪ Bayesian posterior update
▪ Likelihood model
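
A hedged sketch of what "writing as a Bayesian posterior update" can look like in the simplest linear-Gaussian case. It assumes a vector-valued memory with a Gaussian prior and a scalar observation z = w · m + noise; the Kanerva Machine itself uses a matrix-valued memory and learned encoders, which are left out here. All names are illustrative.

```python
def posterior_update(mean, cov, w, z, noise_var=1.0):
    """Condition a Gaussian memory N(mean, cov) on one observation z = w . m + noise."""
    k = len(mean)
    pred = sum(wi * mi for wi, mi in zip(w, mean))            # predicted observation
    cov_w = [sum(cov[i][j] * w[j] for j in range(k)) for i in range(k)]
    z_var = sum(wi * ci for wi, ci in zip(w, cov_w)) + noise_var
    gain = [c / z_var for c in cov_w]
    new_mean = [m + g * (z - pred) for m, g in zip(mean, gain)]
    new_cov = [[cov[i][j] - gain[i] * cov_w[j] for j in range(k)] for i in range(k)]
    return new_mean, new_cov

# Prior: zero mean, identity covariance. Observe z = 2 through address w = [1, 0].
mean, cov = posterior_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], 2.0)
```

The write moves the mean toward the observation and shrinks the uncertainty only along the addressed direction; the memory keeps a fixed size no matter how many items are written.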

Benefits of compression
Rae et al, 2019
Wu et al, 2018

Compressive transformer
Rae et al, 2020
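
A rough sketch of the memory bookkeeping behind a compressive memory, assuming mean pooling as the compression function and first-in-first-out eviction. The function names, sizes and rate are illustrative assumptions; attention over the two memories is not shown.

```python
def compress(old, rate):
    """Mean-pool consecutive groups of `rate` vectors into one vector each."""
    out = []
    for start in range(0, len(old), rate):
        group = old[start:start + rate]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

def update_memories(memory, compressed, new, mem_size, comp_size, rate):
    """Append new activations; compress whatever overflows the regular memory."""
    memory = memory + new
    overflow = max(0, len(memory) - mem_size)
    evicted, memory = memory[:overflow], memory[overflow:]
    # Oldest compressed entries are dropped first (comp_size is assumed to be >= 1).
    compressed = (compressed + compress(evicted, rate))[-comp_size:]
    return memory, compressed

memory, compressed = update_memories(
    [[1.0], [2.0], [3.0], [4.0]], [], [[5.0], [6.0]],
    mem_size=4, comp_size=2, rate=2,
)
```

In this example the two oldest vectors are evicted from the regular memory and kept as a single averaged vector instead of being discarded.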

Implicit memory: Hopfield network
[Diagram: units x_1, x_2, x_3]
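
A minimal sketch of a classical binary Hopfield network, assuming Hebbian storage and asynchronous sign updates over states in {-1, +1}. Function names are illustrative. Patterns are not kept in slots: they live only in the weight matrix and are recovered by running the dynamics.

```python
def store(patterns):
    """Hebbian rule: W[i][j] accumulates x_i * x_j over patterns, with a zero diagonal."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights

def retrieve(weights, state, sweeps=5):
    """Asynchronous sign updates; none of them increases the network energy."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            field = sum(wij * sj for wij, sj in zip(weights[i], state))
            state[i] = 1 if field >= 0 else -1
    return state

pattern = [1, -1, 1, -1, 1, -1]
probe = [1, 1, 1, -1, 1, -1]          # the stored pattern with one unit flipped
recalled = retrieve(store([pattern]), probe)
```

Here the corrupted probe settles back onto the stored pattern, which is the associative-retrieval behaviour the slide refers to.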

Energy-based models
Du & Mordatch, 2019

RNN with Fast Weights
[Diagram labels: x, h, y, M; repeat S times]
Ba et al, 2016; Miconi et al, 2018
Soft nearest neighbour search:
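
A simplified sketch of one fast-weights step in the spirit of Ba et al, 2016, assuming a Hebbian outer-product write into M and an inner loop of S refinement steps. Layer normalization is omitted, and the names `decay`, `rate` and `inner_steps` are illustrative assumptions.

```python
import math

def matvec(matrix, vec):
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]

def fast_weights_step(h, x, W, C, M, decay=0.9, rate=0.5, inner_steps=3):
    """One RNN step with slow weights W, C and a fast weight matrix M."""
    n = len(h)
    # Hebbian write: decay the fast weights, then add the outer product of h with itself.
    M = [[decay * M[i][j] + rate * h[i] * h[j] for j in range(n)] for i in range(n)]
    # Slow-weight drive, held fixed during the inner loop.
    drive = [a + b for a, b in zip(matvec(W, h), matvec(C, x))]
    h_new = [math.tanh(d) for d in drive]
    for _ in range(inner_steps):  # "Repeat S times"
        h_new = [math.tanh(d + m) for d, m in zip(drive, matvec(M, h_new))]
    return h_new, M

W = [[0.0, 0.0], [0.0, 0.0]]
C = [[1.0], [0.0]]
M0 = [[0.0, 0.0], [0.0, 0.0]]
h1, M1 = fast_weights_step([0.5, -0.5], [1.0], W, C, M0)
```

Because M is a decayed sum of outer products of past hidden states, multiplying it by the current state weights each past state by its similarity to the current one, which is one way to read the "soft nearest neighbour search" remark.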

Meta-Learning Deep Energy-Based Memory Models
Sergey Bartunov, Jack Rae, Simon Osindero, Timothy Lillicrap

Implicit memory and attractor models
Update dynamics:
Memories are attractors:
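
A toy illustration of retrieval as gradient descent on an energy function, x ← x − γ ∇E(x). It assumes a hand-written energy E(x) = −(1/β) log Σ_k exp(−β ‖x − m_k‖² / 2) whose minima lie close to the stored patterns when they are well separated; a learned deep energy model would replace it in practice. All names and constants are illustrative.

```python
import math

def energy_grad(x, memories, beta=4.0):
    """Gradient of E(x): a softmax-weighted sum of the differences (x - m_k)."""
    sq = [sum((xi - mi) ** 2 for xi, mi in zip(x, m)) for m in memories]
    low = min(sq)  # shift exponents for numerical stability
    exps = [math.exp(-0.5 * beta * (d - low)) for d in sq]
    total = sum(exps)
    probs = [e / total for e in exps]
    return [sum(p * (x[d] - m[d]) for p, m in zip(probs, memories)) for d in range(len(x))]

def descend(x, memories, step=0.5, iters=50):
    """Update dynamics: repeatedly move x against the energy gradient."""
    x = list(x)
    for _ in range(iters):
        g = energy_grad(x, memories)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

memories = [[0.0, 0.0], [4.0, 4.0]]
recalled = descend([3.0, 3.5], memories)   # starts in the basin of [4.0, 4.0]
```

The query is never compared against an explicit list at read time in a learned model; here the list of patterns only defines the toy energy.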

Network weights as distributed storage
▪ Fully-connected layers: static weights, memory weights
▪ Convolutional layers: static filters, memory filters

Meta-learning the writing rule
1. For a known family define a writing loss
2. Write into the memory by minimizing the writing loss (repeat K times)
3. Meta-learn the initialization and gradient descent schedule
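
Steps 1 and 2 can be sketched with a toy example. It assumes a quadratic writing loss 0.5 · ‖x − θ‖² and hand-set values for the initialization and step sizes; in the method these are the quantities that step 3 meta-learns, and that outer loop is not shown. All names are illustrative.

```python
def write(theta_init, x, loss_grad, step_sizes):
    """Write x into memory parameters theta with K = len(step_sizes) gradient steps."""
    theta = list(theta_init)
    for eta in step_sizes:
        grad = loss_grad(theta, x)
        theta = [t - eta * g for t, g in zip(theta, grad)]
    return theta

def quadratic_loss_grad(theta, x):
    """Gradient w.r.t. theta of the toy writing loss 0.5 * ||x - theta||^2."""
    return [t - xi for t, xi in zip(theta, x)]

# Hand-set initialization and schedule stand in for the meta-learned ones.
theta = write([0.0, 0.0], [1.0, -2.0], quadratic_loss_grad, [0.5, 0.5, 0.5])
```

With only K steps available, how close θ gets to a good solution depends on the starting point and the step sizes, which is why those are the natural targets for meta-learning.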

Associative retrieval

Distortion-rate analysis

Compressing correlated data

Thank you! Questions?

References
• Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014).
• Pritzel, Alexander, et al. "Neural episodic control." Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR.org, 2017.
• Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature 538.7626 (2016): 471-476.
• Wu, Yan, et al. "Learning attractor dynamics for generative memory." Advances in Neural Information Processing Systems. 2018.
• Rae, Jack W., Sergey Bartunov, and Timothy P. Lillicrap. "Meta-learning neural bloom filters." arXiv preprint arXiv:1906.04304 (2019).
• Du, Yilun, and Igor Mordatch. "Implicit generation and generalization in energy-based models." arXiv preprint arXiv:1903.08689 (2019).
• Munkhdalai, Tsendsuren, et al. "Metalearned neural memory." Advances in Neural Information Processing Systems. 2019.
• Miconi, Thomas, Jeff Clune, and Kenneth O. Stanley. "Differentiable plasticity: training plastic neural networks with backpropagation." arXiv preprint arXiv:1804.02464 (2018).
• Rae, Jack W., et al. "Compressive transformers for long-range sequence modelling." arXiv preprint arXiv:1911.05507 (2019).

Picture credits
• http://adelsepehr.profcms.um.ac.ir/index.php?mclang=fa-IR
• http://www.9minecraft.net/first-person-model-mod/
• https://www.planetminecraft.com/project/just-a-sneak-peak/
• http://www.bristol.ac.uk/synaptic/pathways/
• http://ce.sharif.edu/courses/92-93/1/ce957-1/resources/root/Lectures/Lecture23.pdf

