
Conservation Laws for Gradient Flows

Gabriel Peyré
September 29, 2023

Talk associated to the paper: https://arxiv.org/abs/2307.00144


Transcript

  1. Conservation Laws for Gradient Flows
     Gabriel Peyré, École Normale Supérieure
     Sibylle Marcotte, Rémi Gribonval
  2. Neural network (2 layers): θ = (U, V), g(θ, x) := Uσ(V⊤x) = ∑_k u_k σ(⟨x, v_k⟩)
     Empirical risk minimization: ℰ_X^Y(θ) := (1/N) ∑_{i=1}^N ℓ(g(θ, x_i), y_i)
  3. Gradient flow: θ̇(t) = −∇ℰ_X^Y(θ(t)), driving θ(t) toward argmin(ℰ_X^Y).
     Conservation law: a function h(θ) such that ∀X, Y, t, θ(0): h(θ(t)) = h(θ(0)).
     Applications: understanding the implicit bias of gradient descent; helping to prove convergence.
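The conservation property can be checked numerically. A minimal sketch (not the paper's code; the network sizes, step size, and iteration count are illustrative assumptions): discretizing the gradient flow of a 2-layer linear network with square loss by explicit Euler steps, the matrix U⊤U − V⊤V should stay essentially constant.

```python
# Minimal numerical sketch (not the paper's code): explicit Euler discretization
# of the gradient flow for a 2-layer linear network g(theta, x) = U V^T x with
# square loss. Along the continuous flow, U^T U - V^T V is exactly conserved,
# so with a small step size the drift should be negligible.
import numpy as np

rng = np.random.default_rng(0)
n, m, r, N = 3, 4, 2, 20                      # output dim, input dim, width, samples
U, V = rng.standard_normal((n, r)), rng.standard_normal((m, r))
X, Y = rng.standard_normal((m, N)), rng.standard_normal((n, N))

def grads(U, V):
    R = U @ V.T @ X - Y                       # residuals, shape (n, N)
    return 2 / N * R @ X.T @ V, 2 / N * X @ R.T @ U

h0 = U.T @ U - V.T @ V                        # candidate conserved matrix
tau = 1e-3                                    # step size (illustrative)
for _ in range(5000):
    gU, gV = grads(U, V)
    U, V = U - tau * gU, V - tau * gV
print("drift:", np.abs(U.T @ U - V.T @ V - h0).max())   # ~ 0 up to O(tau)
```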
  4. Example: θ = (U, V), g(θ, x) := Uσ(V⊤x) = ∑_k u_k σ(⟨x, v_k⟩)
     Linear networks (σ(s) = s): independent conservation laws h_{k,k′}(U, V) = ⟨u_k, u_{k′}⟩ − ⟨v_k, v_{k′}⟩.
     ReLU networks (σ(s) = max(s, 0)): h_k(U, V) = ∥u_k∥² − ∥v_k∥².
  5. Neural network 2D → 1D with one hidden neuron: θ = (u, v₁, v₂), g(θ, x) = u v₁ x₁ + u v₂ x₂.
     Conserved function: h(θ) = v₁² + v₂² − u².
     [Figure: level sets h⁻¹(−1), h⁻¹(0), h⁻¹(1) in the (u, v₁, v₂) space.]
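For this one-hidden-neuron example, conservation can be verified symbolically: along the flow, dh/dt = ⟨∇h(θ), θ̇⟩ = −⟨∇h(θ), ∇ℰ(θ)⟩ vanishes identically. A small sympy sketch (the symbol names are mine):

```python
# Symbolic sketch (sympy): for the one-hidden-neuron network, the derivative of
# h along the gradient flow, dh/dt = -<grad h, grad E>, vanishes identically,
# so h(theta) = v1^2 + v2^2 - u^2 is conserved for every sample (x, y).
import sympy as sp

u, v1, v2, x1, x2, y = sp.symbols("u v1 v2 x1 x2 y")
E = (u * v1 * x1 + u * v2 * x2 - y) ** 2      # single-sample square loss
h = v1**2 + v2**2 - u**2
theta = (u, v1, v2)
dh_dt = -sum(sp.diff(h, t) * sp.diff(E, t) for t in theta)
print(sp.simplify(dh_dt))                     # -> 0
```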
  6. How many conservation laws are there, and how can they be determined?
     (h₁, …, h_K) conserved ⟹ Φ(h₁, …, h_K) conserved, so only independent laws should be counted.
     Independence: ∀θ, (∇h₁(θ), …, ∇h_K(θ)) are linearly independent.
  7. Structure of the Flow Fields
     θ̇(t) = w(θ(t)) where w(θ) ∈ W(θ), with W(θ) := Span{∇ℰ_X^Y(θ) : ∀X, Y}.
     [Figure: ∇h(θ) is normal to the level set {θ : h(θ) = h(θ(0))}; W(θ) lies in its tangent space.]
  8. Proposition: h is conserved ⇔ ∀θ, ∇h(θ) ⊥ W(θ).
  9. Question: determining W(θ).
     Chain rule: ∇ℰ_X^Y(θ) = (1/N) ∑_{i=1}^N ∂_θ g(θ, x_i)⊤ α_i where α_i = ∇ℓ(g(θ, x_i), y_i).
     Hypothesis: Span_y ∇ℓ(z, y) = whole space (e.g. ℓ(z, y) = ∥z − y∥²).
     Proposition: W(θ) = Span ⋃_x Im[∂_θ g(θ, x)⊤].
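The proposition can be illustrated numerically: for a linear network, any square-loss gradient lies in the span of the per-sample Jacobian transposes. A sketch with finite-difference Jacobians (the sizes and the helper num_jac are illustrative assumptions):

```python
# Numerical sketch of the proposition: for a linear network, every square-loss
# gradient lies in the span of the columns of the per-sample Jacobian
# transposes d_theta g(theta, x)^T.
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 2, 3, 2
theta = rng.standard_normal((n + m) * r)

def g(theta, x):
    U = theta[: n * r].reshape(n, r)
    V = theta[n * r:].reshape(m, r)
    return U @ (V.T @ x)

def num_jac(theta, x, eps=1e-6):
    """Central finite-difference Jacobian d g / d theta, shape (n, d)."""
    J = np.zeros((n, theta.size))
    for i in range(theta.size):
        e = np.zeros(theta.size); e[i] = eps
        J[:, i] = (g(theta + e, x) - g(theta - e, x)) / (2 * eps)
    return J

xs = rng.standard_normal((m, 8))
B = np.hstack([num_jac(theta, xs[:, i]).T for i in range(8)])   # spans W(theta)

x0, y0 = xs[:, 0], rng.standard_normal(n)                       # one sample (x, y)
grad = 2 * num_jac(theta, x0).T @ (g(theta, x0) - y0)           # grad of ||g - y||^2
coef, *_ = np.linalg.lstsq(B, grad, rcond=None)
print("residual outside the span:", np.linalg.norm(B @ coef - grad))   # ~ 0
```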
  10. Minimal Parameterizations
      Re-parameterization: g(θ, x) = f(φ(θ), x), where φ(θ) = (φ₁(θ), φ₂(θ), …)⊤ should "factor" the invariances.
      Linear networks: g(θ, x) = UV⊤x, φ(U, V) = UV⊤.
      ReLU networks: g(θ, x) = ∑_i u_i ReLU(⟨v_i, x⟩) = ∑_i 1_{⟨v_i, x⟩ ≥ 0} (u_i v_i⊤) x, φ(U, V) = (u_i v_i⊤)_i (valid only locally).
  11. Chain rule: W(θ) = W_g(θ) := Span ⋃_x Im[∂_θ g(θ, x)⊤] = ∂φ(θ)⊤ W_f(θ), where W_f(θ) := Span ⋃_x Im[∂f(φ(θ), x)⊤].
  12. Definition: φ is minimal if W_f(θ) is the whole space
      ⟺ W(θ) = Span(∇φ₁(θ), ∇φ₂(θ), …) → a finite-dimensional set of vector fields.
  13. Theorem: for σ = Id, φ(U, V) = UV⊤ is minimal; for σ = ReLU, φ(U, V) = (u_i v_i⊤)_i is minimal outside a set of measure zero.
  14. Example, 1 hidden neuron: θ = (u, v₁, v₂), g(θ, x) = u v₁ x₁ + u v₂ x₂, ℰ_X^Y(θ) = (u v₁ x₁ + u v₂ x₂ − y)².
      φ₁(u, v₁, v₂) = u v₁, φ₂(u, v₁, v₂) = u v₂
      ∇φ₁(u, v₁, v₂) = (v₁, u, 0), ∇φ₂(u, v₁, v₂) = (v₂, 0, u)
      Conserved function: h(θ) = v₁² + v₂² − u².
  16. Constructing Conservation Laws
      Minimal parameterization φ: W(θ) = Span(∇φ₁(θ), ∇φ₂(θ), …).
      Consequence: h conserved ⇔ ∀i, ⟨∇φ_i(θ), ∇h(θ)⟩ = 0.
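For the one-neuron example above, this criterion is easy to verify symbolically with φ = (u v₁, u v₂) and h = v₁² + v₂² − u². A small sympy sketch:

```python
# Sympy sketch of the criterion on the one-neuron example: with the minimal
# parameterization phi = (u v1, u v2), both inner products <grad phi_i, grad h>
# vanish identically for h = v1^2 + v2^2 - u^2.
import sympy as sp

u, v1, v2 = sp.symbols("u v1 v2")
theta = (u, v1, v2)
h = v1**2 + v2**2 - u**2
for phi in (u * v1, u * v2):
    print(sp.simplify(sum(sp.diff(phi, t) * sp.diff(h, t) for t in theta)))  # 0, 0
```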
  17. Example, single neuron: φ(u, v) = uv⊤, with ∂φ(u, v)⊤ : M ↦ (Mv, M⊤u), so W(u, v) = Span_M {(Mv, M⊤u)}.
      h conserved ⇔ ∀M, ⟨∇_u h(u, v), Mv⟩ + ⟨∇_v h(u, v), M⊤u⟩ = 0 ⇔ ∇_u h(u, v) v⊤ + u ∇_v h(u, v)⊤ = 0.
      Only solutions: h(u, v) = Φ(∥u∥² − ∥v∥²).
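One direction of the last claim is a quick check: plugging h(u, v) = ∥u∥² − ∥v∥² (the case Φ = id) into the matrix equation makes it vanish, since ∇_u h = 2u and ∇_v h = −2v. A sympy sketch in small (illustrative) dimensions:

```python
# Sympy sketch (with the particular choice Phi = id, in small dimensions):
# h(u, v) = ||u||^2 - ||v||^2 has grad_u h = 2u and grad_v h = -2v, so
# grad_u h v^T + u grad_v h^T = 2 u v^T - 2 u v^T = 0.
import sympy as sp

u = sp.Matrix(sp.symbols("u0 u1"))
v = sp.Matrix(sp.symbols("v0 v1 v2"))
h = (u.T @ u - v.T @ v)[0]
grad_u = sp.Matrix([sp.diff(h, ui) for ui in u])
grad_v = sp.Matrix([sp.diff(h, vi) for vi in v])
print(sp.simplify(grad_u @ v.T + u @ grad_v.T))   # zero 2x3 matrix
```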
  18. For a polynomial φ, restricting the search to polynomials h of fixed degree reduces the problem to a finite-dimensional linear kernel.
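A sketch of this reduction for the single neuron in small dimensions (degree-2 part only; the coefficient symbols c_i are mine): the conditions ⟨∇φ_i, ∇h⟩ = 0 become a linear system on the coefficients of h, whose kernel is spanned by ∥u∥² − ∥v∥².

```python
# Sympy sketch of the finite-dimensional reduction for a single neuron with
# u, v in R^2, searching over degree-2 polynomials h: the orthogonality
# conditions become linear equations on the coefficients c_i, and the
# kernel is spanned by u0^2 + u1^2 - v0^2 - v1^2.
import sympy as sp
from itertools import combinations_with_replacement

theta = sp.symbols("u0 u1 v0 v1")
u, v = theta[:2], theta[2:]
phis = [u[i] * v[j] for i in range(2) for j in range(2)]     # phi(u, v) = u v^T

monomials = [sp.Mul(*c) for c in combinations_with_replacement(theta, 2)]
coeffs = sp.symbols(f"c0:{len(monomials)}")
h = sum(c * mnm for c, mnm in zip(coeffs, monomials))

eqs = []
for phi in phis:
    p = sp.expand(sum(sp.diff(phi, t) * sp.diff(h, t) for t in theta))
    eqs += sp.Poly(p, *theta).coeffs()        # every polynomial coefficient = 0
sol = sp.solve(eqs, coeffs)
print(sp.factor(h.subs(sol)))                 # ~ c * (u0**2 + u1**2 - v0**2 - v1**2)
```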
  19. Did we find all the conservation laws?
      Question: find a "minimal" surface Σ tangent to all W(θ), i.e. with dim(Σ) = dim(W(θ)).
      Issue: in general, impossible!
      [Figure: a trajectory θ(t) on a surface Σ tangent to W(θ).]
  20. Definition: Lie bracket [w₁, w₂](θ) := ∂w₁(θ) w₂(θ) − ∂w₂(θ) w₁(θ).
      The flows θ̇ = w₁(θ) and θ̇ = w₂(θ) commute if and only if [w₁, w₂] = 0.
  21. Theorem (Frobenius): if W(θ) = Span(w_i(θ))_i and ∀(i, j), [w_i, w_j](θ) ∈ W(θ), then there exists a surface Σ tangent to W with dim(Σ) = dim(W(θ)).
  22. Linear networks: [w_i, w_j](θ) ∉ W(θ) in general.
      ReLU networks: [w_i, w_j](θ) ∈ W(θ).
  23. Definition: generated Lie algebra W∞ (Sophus Lie): W₀(θ) = W(θ), W_{k+1} = Span([W₀, W_k] + W_k).
  24. Number of Conservation Laws
      Theorem: if dim(W∞(θ)) = K is locally constant, there are exactly d − K independent conservation laws.
      Linear networks: φ(U, V) = UV⊤, with r := rank(U; V).
      ReLU networks: φ(U, V) = (u_i v_i⊤)_i → separability into single-neuron blocks φ(u, v) = uv⊤.
  25. Proposition: given W₀(θ) = Span(∇φ₁(θ), ∇φ₂(θ), …) for φ : (U, V) ↦ UV⊤, one has
      W₀ ⊊ W₁ = Span([W₀, W₀] + W₀), then W₁ = W₂ = Span([W₀, W₁] + W₁) = W₃ = … = W∞.
      Explicit formula: dim(W∞) = (n + m)r − r(r + 1)/2.
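Sanity check of the count in the full-rank case (assuming U ∈ ℝ^{n×r} and V ∈ ℝ^{m×r}, so that d = (n + m)r): the theorem then predicts d − dim(W∞) = (n + m)r − [(n + m)r − r(r + 1)/2] = r(r + 1)/2 independent conservation laws, which is exactly the number of pairs k ≤ k′ in the family on the next slide.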
  26. Proposition: the functions h_{k,k′}(U, V) = ⟨u_k, u_{k′}⟩ − ⟨v_k, v_{k′}⟩ define r(r + 1)/2 independent conservation laws.
      Corollary: for ReLU and linear networks, there are no other conservation laws.
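The independence claim is easy to probe numerically: stacking the gradients of the h_{k,k′} at a random full-rank point and computing the rank should give r(r + 1)/2. A sketch (the shapes are illustrative assumptions):

```python
# Numerical sketch: the gradients of the laws h_{k,k'}(U, V) =
# <u_k, u_k'> - <v_k, v_k'>, stacked at a random full-rank point,
# have rank r(r+1)/2, i.e. the laws are independent.
import numpy as np

rng = np.random.default_rng(2)
n, m, r = 4, 5, 3
U, V = rng.standard_normal((n, r)), rng.standard_normal((m, r))

grads = []
for k in range(r):
    for kp in range(k, r):
        gU, gV = np.zeros((n, r)), np.zeros((m, r))
        gU[:, k] += U[:, kp]; gU[:, kp] += U[:, k]       # d<u_k, u_k'> / dU
        gV[:, k] -= V[:, kp]; gV[:, kp] -= V[:, k]       # -d<v_k, v_k'> / dV
        grads.append(np.concatenate([gU.ravel(), gV.ravel()]))
print(np.linalg.matrix_rank(np.stack(grads)), "==", r * (r + 1) // 2)   # 6 == 6
```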
  27. Conclusion
      Deeper networks: no minimal parameterization valid for almost all θ.
      → For some θ(0), there exist new conservation laws.
      W∞ is infinite-dimensional → SageMath code to compute W∞(θ): https://github.com/sibyllema/Conservation_laws
  28. Extensions: max-pooling, convolution, skip connections, optimization with momentum, discrete gradient descent, …