OpenTalks.AI - Sergey Bartunov, Distributed, compressive and implicit memory models

OpenTalks.AI
February 21, 2020

Transcript

  1. 3.

    What kind of memory we need
    ▪ Long episodes and generalization require compression
    ▪ Distributed representations for efficient memory access and robustness
    ▪ Implicit operation is often implied by the above
  2. 4.

    Recurrent Neural Network
    [Diagram labels: x, h, y]
    Pros
    ▪ General
    ▪ Distributed
    ▪ Compressive
    ▪ Explicit
    Cons
    ▪ Training issues
    ▪ Persistence
    ▪ Scalability
  3. 5.

    Slot-based memory
    Memory is a set of slot vectors: m_1, m_2, ..., m_K
    Reading from a query:
    Weston et al, 2014
  4. 8.

    Slot-based memory
    [Diagram labels: x, h, q, m_1, m_2, ..., m_K, w_1, w_2, ..., w_K, y]
    Pros
    ▪ Scalable*
    ▪ Explicit
    Cons
    ▪ Linear memory cost
    ▪ Lack of generalization
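The read operation on this slide (a query q, per-slot weights w_1..w_K, and an output y) can be sketched as below. This is a minimal illustration, not the speaker's implementation: the dot-product similarity and the softmax normalization are assumptions, since the transcript does not preserve the slide's formula, and `read_memory` is a hypothetical name.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read_memory(slots, query):
    """Soft read: weight each slot by its similarity to the query.

    slots: list of K slot vectors m_1..m_K (lists of floats).
    query: query vector q with the same dimension as each slot.
    Returns (weights w_1..w_K, read vector y).
    """
    # Assumption: dot-product similarity; the transcript does not specify it.
    scores = [sum(m_d * q_d for m_d, q_d in zip(m, query)) for m in slots]
    weights = softmax(scores)
    dim = len(query)
    # The read vector is the weighted average of all slots.
    read = [sum(w * m[d] for w, m in zip(weights, slots)) for d in range(dim)]
    return weights, read

slots = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, y = read_memory(slots, query=[4.0, 0.0])
```

Every read touches all K slots, and each stored item occupies its own slot, which is one way to see the linear memory cost listed under Cons.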
  5. 10.

    Writing via posterior inference (Kanerva Machine)
    Memory is a distribution
    Bayesian posterior update
    Likelihood model
    Wu et al, 2018
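To make "writing as a Bayesian posterior update" concrete, here is a hedged sketch of standard Gaussian conditioning with a linear-Gaussian likelihood. It is a simplification under stated assumptions, not the Kanerva Machine itself: the memory holds one scalar per slot, the address vector and noise variance are given rather than learned, and `posterior_write` is a hypothetical name.

```python
def posterior_write(mean, cov, address, value, noise_var=1.0):
    """One Bayesian update of a Gaussian memory.

    Prior:      memory ~ N(mean, cov); mean has K entries, cov is K x K.
    Likelihood: value = address . memory + noise, noise ~ N(0, noise_var).
    Returns the posterior (mean, cov).
    """
    k = len(mean)
    # Prediction error of the current memory at this address.
    predicted = sum(a * m for a, m in zip(address, mean))
    delta = value - predicted
    # cov @ address: cross-covariance between the memory and the observation.
    cross = [sum(cov[i][j] * address[j] for j in range(k)) for i in range(k)]
    # Predictive variance of the observation.
    obs_var = sum(a * c for a, c in zip(address, cross)) + noise_var
    new_mean = [m + c * delta / obs_var for m, c in zip(mean, cross)]
    new_cov = [[cov[i][j] - cross[i] * cross[j] / obs_var for j in range(k)]
               for i in range(k)]
    return new_mean, new_cov

mean = [0.0, 0.0]
cov = [[1.0, 0.0], [0.0, 1.0]]
mean, cov = posterior_write(mean, cov, address=[1.0, 0.0], value=2.0)
```

In this toy run the addressed slot moves halfway towards the observed value and its variance halves, while the unaddressed slot is unchanged.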
  6. 15.

    RNN with Fast Weights
    [Diagram labels: x, h, y, M; repeat S times]
    Soft nearest neighbour search:
    Ba et al, 2016; Miconi et al, 2018
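The fast-weights recurrence on this slide can be sketched roughly as follows. This is an illustration under assumptions, not the exact model from the cited papers: layer normalization is omitted, the decay and write rate are hand-picked constants, `A` plays the role of the slide's matrix M, and `fast_weights_step` is a hypothetical name.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def fast_weights_step(h, x, W, C, A, decay=0.9, rate=0.5, inner_steps=3):
    """One time step of an RNN with a fast weight matrix.

    h: previous hidden state, x: current input,
    W, C: slow recurrent and input weights, A: fast weight matrix.
    Returns (new hidden state, updated fast weights).
    """
    n = len(h)
    # Hebbian write: decay the old fast weights, add the outer product h h^T.
    A = [[decay * A[i][j] + rate * h[i] * h[j] for j in range(n)]
         for i in range(n)]
    # Slow-weight preactivation, held fixed during the inner loop.
    boundary = [a + b for a, b in zip(matvec(W, h), matvec(C, x))]
    h_new = [math.tanh(b) for b in boundary]
    # Inner loop ("repeat S times"): the fast weights pull the state
    # towards recently stored hidden states.
    for _ in range(inner_steps):
        pull = matvec(A, h_new)
        h_new = [math.tanh(b + p) for b, p in zip(boundary, pull)]
    return h_new, A

W = [[0.5, 0.0], [0.0, 0.5]]
C = [[1.0], [0.0]]
A = [[0.0, 0.0], [0.0, 0.0]]
h = [0.0, 0.0]
for x in ([1.0], [0.5]):
    h, A = fast_weights_step(h, x, W, C, A)
```

Because `A` is a decayed sum of outer products of past hidden states, the product of `A` with the current state is a sum of those past states weighted by their dot product with it, which is one reading of the "soft nearest neighbour search" label.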
  7. 18.

    Network weights as distributed storage
    Fully-connected layers: static weights, memory weights
    Convolutional layers: memory filters, static filters
  8. 19.

    Meta-learning the writing rule
    1. For a known family define a writing loss
    2. Write into the memory by minimizing the writing loss (repeat K times)
    3. Meta-learn the initialization and gradient descent schedule
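The writing steps on this slide can be sketched for a toy case. Everything specific here is an assumption made for illustration: the memory is a linear map, the writing loss is a squared error, and the initialization and step sizes (which the slide says are meta-learned) are plain hand-picked numbers. The function names are hypothetical.

```python
def write_by_gradient_descent(init_weights, key, value, step_sizes):
    """Write (key -> value) into a linear memory by minimizing a writing loss.

    Writing loss (assumed): 0.5 * (weights . key - value) ** 2.
    init_weights and step_sizes stand in for the meta-learned
    initialization and gradient descent schedule.
    """
    weights = list(init_weights)
    for step_size in step_sizes:  # one entry per writing step
        error = sum(w * k for w, k in zip(weights, key)) - value
        # The gradient of the writing loss w.r.t. the weights is error * key.
        weights = [w - step_size * error * k for w, k in zip(weights, key)]
    return weights

def read(weights, key):
    """Read the value the memory currently associates with a key."""
    return sum(w * k for w, k in zip(weights, key))

memory = write_by_gradient_descent([0.0, 0.0], key=[1.0, 2.0], value=3.0,
                                   step_sizes=[0.1, 0.1, 0.1])
recalled = read(memory, [1.0, 2.0])
```

With these step sizes each write step halves the remaining error, so three steps leave the recalled value short of the target. Choosing the initialization and schedule so that a few steps suffice is the part the slide assigns to meta-learning.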
  9. 24.

    References
    • Weston, J., Chopra, S., & Bordes, A. (2014). Memory networks. arXiv preprint arXiv:1410.3916.
    • Pritzel, A., et al. (2017). Neural episodic control. Proceedings of the 34th International Conference on Machine Learning, Volume 70. JMLR.org.
    • Graves, A., et al. (2016). Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626), 471-476.
    • Wu, Y., et al. (2018). Learning attractor dynamics for generative memory. Advances in Neural Information Processing Systems.
    • Rae, J. W., Bartunov, S., & Lillicrap, T. P. (2019). Meta-learning neural bloom filters. arXiv preprint arXiv:1906.04304.
    • Du, Y., & Mordatch, I. (2019). Implicit generation and generalization in energy-based models. arXiv preprint arXiv:1903.08689.
    • Munkhdalai, T., et al. (2019). Metalearned neural memory. Advances in Neural Information Processing Systems.
    • Miconi, T., Clune, J., & Stanley, K. O. (2018). Differentiable plasticity: training plastic neural networks with backpropagation. arXiv preprint arXiv:1804.02464.
    • Rae, J. W., Potapenko, A., Jayakumar, S. M., & Lillicrap, T. P. (2019). Compressive Transformers for Long-Range Sequence Modelling. arXiv preprint arXiv:1911.05507.
  10. 25.

    Picture credits
    • http://adelsepehr.profcms.um.ac.ir/index.php?mclang=fa-IR
    • http://www.9minecraft.net/first-person-model-mod/
    • https://www.planetminecraft.com/project/just-a-sneak-peak/
    • http://www.bristol.ac.uk/synaptic/pathways/
    • http://ce.sharif.edu/courses/92-93/1/ce957-1/resources/root/Lectures/Lecture23.pdf