Tensor Networks and their applications in Physics and Machine Learning

Siva Swaminathan
December 12, 2019

Tensor networks originated as a very useful tool to model states of quantum systems with many degrees of freedom (effectively equivalent to high-dimensional probability distributions). By exploiting the naturally sparse entanglement structure, well-designed networks provide variational ansatzes conducive to efficiently modelling such states. Of particular importance are 'MERA' networks, in which information is organized hierarchically, in a manner comparable to feed-forward neural networks. In this talk, I briefly explain the motivation behind and usage of tensor networks, and summarize some of their applications, both in physics and, more recently, in machine learning.


Transcript

  1. Tensor Networks and their applications
    in Physics and Machine Learning
    Sivaramakrishnan Swaminathan
    Vicarious AI
    http://sivark.me
    12 December 2019
    Indian Institute of Technology Bombay


  2. Before we begin. . .
    Please feel free to interrupt and ask questions!
    Comments are my own, and do not represent Vicarious AI


  3. Tensor networks and quantum states


  4. History of tensor networks
    Graphical notation (Roger Penrose in the 1970s)
    Representing and formally manipulating computations
    Index gymnastics on multilinear operators i.e. “tensors”
    Einstein summation convention
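
    In code, the same bookkeeping is what einsum does: repeated indices are summed,
    and each contraction is an edge in the diagram. A minimal sketch (arrays and
    index names are illustrative, not from the slides):

    ```python
    import numpy as np

    A = np.arange(6.0).reshape(2, 3)    # tensor with indices (i, j)
    B = np.arange(12.0).reshape(3, 4)   # tensor with indices (j, k)

    # Einstein convention: the repeated index j is summed over,
    # i.e. the edge connecting the two tensor nodes is contracted
    C = np.einsum('ij,jk->ik', A, B)
    assert np.allclose(C, A @ B)

    # Higher-rank tensors work the same way: contract one leg of T with v
    T = np.random.default_rng(0).normal(size=(2, 3, 4))
    v = np.ones(3)
    M = np.einsum('ijk,j->ik', T, v)    # sum over the middle leg
    ```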


  5. Primer on tensors


  6. Quantum states ∼ probability distributions
    States are vectors in a Hilbert space, with ⟨ψ|ψ⟩ = 1
    Alternately, density matrices ρ ≡ |ψ⟩⟨ψ| with Tr ρ = 1
    Compute expectation values of operators:
    ⟨O⟩ ≡ Tr[ρO] = ⟨ψ|O|ψ⟩
    Entanglement ∼ Mutual Information
    Bell’s inequality
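
    A quick numerical check of the expectation-value formula for a single qubit
    (the state and observable below are my own illustrative choices):

    ```python
    import numpy as np

    psi = np.array([1.0, 1.0j]) / np.sqrt(2)   # normalized: ⟨ψ|ψ⟩ = 1
    rho = np.outer(psi, psi.conj())            # density matrix |ψ⟩⟨ψ|
    O = np.array([[1.0, 0.0], [0.0, -1.0]])    # observable (Pauli Z)

    assert np.isclose(np.trace(rho), 1.0)      # Tr ρ = 1
    assert np.isclose(np.trace(rho @ O),       # Tr[ρO] ...
                      psi.conj() @ O @ psi)    # ... equals ⟨ψ|O|ψ⟩
    ```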


  7. Many-body states are high-dimensional
    Joint distribution on N random variables ⇒ dim. ∼ exp(N)
    Would be nice to handle infinite systems
    This is why QM is hard, even though it’s linear
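
    Back-of-envelope, for spin-1/2 DOFs stored at complex double precision:

    ```python
    N = 50                            # a modest number of spins
    amplitudes = 2 ** N               # ~1.1e15 complex amplitudes
    print(amplitudes * 16 / 1e15)     # ~18 petabytes just to store one state
    ```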


  8. Aside: Curse or blessing of dimensionality?
    Exponentially many dimensions
    Every direction corresponds to a (soft) partitioning!
    (high “shattering” capacity)


  9. Statistical physics


  10. Basic problem setup
    Local DOFs on a lattice (eg: Ising model)
    Hamiltonian describing how the states are coupled
    Implicitly defines a distribution, and a “ground state”
    Compute observables to explain behavior
    correlation functions ∼ statistical moments
    Why bother?
    Condensed matter goodies!
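
    To make "local DOFs + Hamiltonian" concrete, a toy 1d Ising energy function
    (the couplings and the open-chain choice are illustrative):

    ```python
    import numpy as np

    def ising_energy(spins, J=1.0, h=0.0):
        """E(s) = -J Σ_i s_i s_{i+1} - h Σ_i s_i, with s_i ∈ {-1, +1}."""
        return -J * np.sum(spins[:-1] * spins[1:]) - h * np.sum(spins)

    print(ising_energy(np.array([1, 1, 1, 1])))    # -3.0: aligned, lowest energy
    print(ising_energy(np.array([1, -1, 1, -1])))  # +3.0: fully anti-aligned
    ```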


  11. Strategy
    Seek representation amenable to
    Efficient storage
    Efficient computations
    Lossy representations that allow controlled approximations
    Start with the simplest cases (most symmetry) and slowly generalize


  12. Exploit physical principles?!
    Model typical states
    eg: Lowest energy state; use “power method”
    Most states in Hilbert space are crazy unphysical!
    Typical states form a vanishing fraction of Hilbert space
    Locality ⇒ “area scaling” of entanglement
    Additional symmetries (translation, scale invariance)


  13. Tensor networks1
    Approximate joint distributions (states) by some variational ansatz;
    allows efficient representation and computation
    Can condition/marginalize over variables efficiently.
    (Number of variational parameters scales favorably)
    Often massively over-parametrized
    1See Orús 2019 for a recent review
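
    To see how favorably the parameters scale, compare a dense joint state with
    an MPS ansatz (the numbers below are a rough illustration):

    ```python
    N, d, D = 100, 2, 50            # sites, local dimension, bond dimension
    dense_params = d ** N           # ~1.3e30 amplitudes: hopeless
    mps_params = N * d * D ** 2     # 500,000 parameters: easy
    print(f"{dense_params:.1e} vs {mps_params:,}")
    ```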


  14. Matrix Product States2
    (Markov models, tensor trains, etc.)
    Modern perspective on DMRG
    Exponentially decaying correlations
    Could passably fake power laws through interesting dynamics, or a suitable
    sum of exponentials! (rich statistics literature)
    x^{-r} = (1/Γ(r)) ∫₀^∞ t^{r-1} e^{-xt} dt
    2See Schollwöck 2011 for a review
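
    The identity above (a power law as a continuous mixture of decaying
    exponentials) is easy to verify numerically, e.g. with scipy:

    ```python
    import numpy as np
    from scipy.integrate import quad
    from scipy.special import gamma

    x, r = 2.0, 1.5
    integral, _ = quad(lambda t: t ** (r - 1) * np.exp(-x * t), 0, np.inf)
    assert np.isclose(integral / gamma(r), x ** (-r))
    ```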


  15. Multiscale Entanglement Renormalization Ansatz3
    Modelling scale-invariant (critical) systems
    3Vidal 2008, Evenbly+Vidal 2009, Pfeifer+Evenbly+Vidal 2009


  16. MERA: Constraints
    (Constraint diagrams: contracting a disentangler u with u† gives the
    identity on its legs i, j, k, l — u is unitary, u†u = 1; contracting an
    isometry w with w† gives the identity on legs i, j — w†w = 1.)
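
    A numerical reading of those constraint diagrams (the shapes are illustrative):

    ```python
    import numpy as np

    d, chi = 2, 3
    rng = np.random.default_rng(0)

    # Disentangler u: a unitary on two sites, viewed as a 4-leg tensor
    q, _ = np.linalg.qr(rng.normal(size=(d * d, d * d)))
    u = q.reshape(d, d, d, d)
    # u is unitary: contracting it with its conjugate gives the identity
    uu = np.einsum('ijkl,mnkl->ijmn', u, u.conj())
    assert np.allclose(uu.reshape(d * d, d * d), np.eye(d * d))

    # Isometry w: maps d*d dims down to chi dims, with w†w = 1 (but ww† ≠ 1)
    w, _ = np.linalg.qr(rng.normal(size=(d * d, chi)))
    assert np.allclose(w.conj().T @ w, np.eye(chi))
    ```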


  17. MERA: Efficient computations
    Causal structure of influence simplifies computations


  18. Computing the variational parameters in MERA
    Non-trivial optimization problem, given unitarity constraints
    For each layer
    Reduce problem to optimizing Tr[t A t† B]
    Approximate by optimizing Tr[t C] ⇒ use SVD!
    Alternating minimization to optimize tensors (t ∈ {u, w})
    (More recent developments demonstrate better learning techniques)
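
    A minimal sketch of the SVD step (the environment C here is a random
    stand-in for what the rest of the network would supply):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    C = rng.normal(size=(8, 8))        # "environment" tensor for t

    # Maximize Tr[t C] over unitary t: with C = U S V†, the optimum is
    # t = V U†, which saturates the bound |Tr[t C]| ≤ Σ singular values
    U, S, Vh = np.linalg.svd(C)
    t = Vh.conj().T @ U.conj().T

    assert np.isclose(np.trace(t @ C), S.sum())
    ```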


  19. Aside: MERA and Wavelets
    Multi-resolution “shape” of MERA reminiscent of wavelets
    Connection established4 more rigorously
    Used to design new wavelets from quantum circuits!
    4Evenbly+White 2016, 2018


  20. Quantum gravity


  21. A hard problem
    Holy grail of fundamental physics for the past half-century
    If we naively combine gravity and quantum mechanics
    “Infinities” from marginalizing over infinitely many DOFs
    (dependence on prior; loss of predictivity)
    Physicists care about answers being finite and unique,
    so they may be compared with experiment.


  22. Holographic quantum gravity
    Quantum Mechanics on "boundary" = Quantum Gravity in "bulk"
    (justifications from string theory)
    (Figure from https://commons.wikimedia.org/wiki/File:AdS3_(new).png)


  23. MERA ←→ Holography ???
    How does “space” emerge from correlated DOFs?
    (deep question in AI/cognition)
    MERA models entanglement structure in quantum states
    Holographic spacetime maps entanglement structure
    Bulk geodesic length ∼ boundary entanglement (Ryu-Takayanagi formula)
    Emergent direction encodes scale-dependence of entanglement
    (renormalization group flow)


  24. Searching for a more direct relationship
    Lots of discussion over the last several years. . .
    I’ll summarize recent understanding, without detailing justifications
    MERA discretizes the integral transform of bulk geometry5
    5Czech+Lamprou+McCandlish+Sully 2015, 2016


  25. Simplest example: hyperbolic space H₂
    Full conformal symmetry
    Start with H₂ and obtain dS₁₊₁
    MERA discretizes dS₁₊₁
    Causal structure and scaling of correlations
    (I’m happy to sketch the calculation if desired)


  26. Minimal Updates Proposal (MUP)
    Modeling scale invariant systems with a local defect
    Originally6 motivated by computational convenience
    6Evenbly+Vidal 2015


  27. Our generalization: defect geometries
    Reduced symmetry: more nuanced duality; harder computations
    Proposed7 a novel generalization of the MUP: Rayed MERA
    principled justification based on symmetry arguments
    (Boundary OPE)
    7Czech+Nguyen+Swaminathan 2017


  28. Summary: Quantum mechanics ↔ Spacetime geometry
    TNs organize many-body systems by structure of correlations
    Sparsity in entanglement ↔ spatial structure


  29. (Figure-only slide)

  30. Machine learning8
    8Hopelessly incomplete selection of things to touch on


  31. TNs for discriminative models
    (Reminiscent of quantum circuit interpretation of tensor network)
    Linear classifier on a suitable encoding of the input
    y = W · Φ(x)
    Represent classifier (W) by a tensor network
    Tensor bond dimensions regularize model capacity;
    can be chosen adaptively


  32. MPS for MNIST9
    Generalize one-hot encoding at each pixel;
    tensor product over locations
    Reshape image to 1d (ugh!),
    and represent linear classifier functional as MPS
    Regularization from approximation
    L2 cost function; network structure gives efficient gradients
    Choose internal bond dimension adaptively while optimizing (SVD step)
    9Stoudenmire+Schwab 2016
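
    A sketch of the encoding step (function names below are mine; the cos/sin
    map is the one from the paper):

    ```python
    import numpy as np

    def local_feature(x):
        """Map one pixel value x ∈ [0, 1] to a normalized 2-vector."""
        return np.array([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)])

    def full_feature(pixels):
        """Φ(x): tensor product of local features, a 2^N-dim vector."""
        phi = np.array([1.0])
        for x in pixels:
            phi = np.kron(phi, local_feature(x))
        return phi

    print(full_feature(np.array([0.0, 0.5, 1.0])).shape)   # (8,) = 2**3
    # In practice the 2^N-dim Φ(x) and W are never materialized densely:
    # W is kept as an MPS, and y = W · Φ(x) is evaluated by contracting
    # the local 2-vectors into the MPS cores one site at a time.
    ```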


  33. (Figures from Stoudenmire+Schwab 2016)


  34. TNs for generative models11
    (Reminiscent of wavefunction interpretation of tensor network)
    Efficient contraction schemes provide inference,
    supporting a variety of “queries” à la graphical models
    Direct sampling schemes10 instead of MCMC
    10Ferris+Vidal 2012
    11Han+Wang+Fan+Wang+Zhang 2018
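
    A minimal sketch of direct (MCMC-free) sampling from an MPS “Born machine”,
    in the spirit of Ferris+Vidal 2012 and Han et al. 2018; the random cores,
    shapes, and variable names are my own illustrative choices:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N, d, D = 6, 2, 4                  # sites, physical dim, bond dim
    dims = [1] + [D] * (N - 1) + [1]
    cores = [rng.normal(size=(dims[k], d, dims[k + 1])) for k in range(N)]

    # Right environments: R[k] marginalizes |ψ|² over sites k..N-1
    R = [None] * (N + 1)
    R[N] = np.ones((1, 1))
    for k in reversed(range(N)):
        A = cores[k]
        R[k] = np.einsum('asb,bc,dsc->ad', A, R[k + 1], A.conj())

    # Sample site by site; L accumulates the already-fixed left part
    sample, L = [], np.ones((1, 1))
    for k in range(N):
        A = cores[k]
        # p(x_k = s | earlier choices), up to normalization
        probs = np.einsum('aA,asb,bB,AsB->s', L, A, R[k + 1], A.conj()).real
        probs = np.clip(probs, 0, None)
        probs /= probs.sum()
        s = int(rng.choice(d, p=probs))
        sample.append(s)
        As = A[:, s, :]
        L = np.einsum('aA,ab,AB->bB', L, As, As.conj())

    print(sample)   # one exact draw from p(x) ∝ |ψ(x)|²
    ```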


  35. TN ↔ more familiar ML models
    MPS and RBMs
    Tree tensor networks and Conv. Arithmetic Circuits
    Coarse graining structure of language models
    etc, etc, etc.
    This slide is just meant to be indicative.
    See Orús 2019 for a more comprehensive listing and references


  36. TensorNetwork12 API on top of TensorFlow (2019)
    Previously had to write efficient bespoke code
    Recently released by Google X, one of the highlights at NeurIPS 2019
    Convenient Python interface
    GPU backend ⇒ massive speedup!
    12Roberts et al. 2019
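
    The basic Node/edge usage (following the library's documented API; the
    arrays here are toy examples):

    ```python
    import numpy as np
    import tensornetwork as tn

    a = tn.Node(np.ones((2, 3)))
    b = tn.Node(np.ones((3, 4)))
    edge = a[1] ^ b[0]        # wire up the shared bond, as in the diagrams
    c = tn.contract(edge)     # contract that edge
    print(c.tensor.shape)     # (2, 4)
    ```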


  37. Themes to explore
    Engineering
    Develop better ansatzes (esp. for higher dimensional space)
    Make sense of these classes
    Better techniques (differentiable programming13)
    Exploit them for ML!
    ML on quantum computers!?
    Physics
    Quantum many-body systems (condensed matter physics)
    Why do these variational models work so well!?
    MERA and renormalization group flow
    Quantum gravity (holography)
    13Liao+Liu+Wang+Xiang 2019


  38. Thank you!
