
Conservation Laws for Gradient Flows

Gabriel Peyré
September 29, 2023


Talk associated with the paper: https://arxiv.org/abs/2307.00144



Transcript

1. Gabriel Peyré (École Normale Supérieure), Sibylle Marcotte, Rémi Gribonval. Conservation Laws for Gradient Flows.
2. Conservation laws. Two-layer neural network: θ = (U, V), g(θ, x) := Uσ(V⊤x) = ∑_k u_k σ(⟨x, v_k⟩). Empirical risk minimization: ℰ_X^Y(θ) := (1/N) ∑_{i=1}^N ℓ(g(θ, x_i), y_i).
3. Gradient flow: θ̇(t) = −∇ℰ_X^Y(θ(t)), driving θ(t) toward argmin(ℰ_X^Y). A conservation law is a function h(θ) such that ∀X, Y, t, θ(0): h(θ(t)) = h(θ(0)). Applications: understanding the implicit bias of gradient descent; helping to prove convergence.
4. Homogeneity matters. 1-D with 1 neuron: g(θ, x) := uσ(vx), so g(θ, ·): ℝ → ℝ. Proposition: there exist conservation laws if and only if σ is positively k-homogeneous, i.e. σ(s) = a|s|^k for s > 0 and σ(s) = b|s|^k for s < 0; the conservation laws are then h(u, v) = Φ(u²/2 − v²/(2k)). [Plots of σ(s) for k = 1 and k = 2.]
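A minimal numerical sketch of this proposition (not from the talk; the data, constants a, b, and step size are illustrative): integrating the gradient flow for g(θ, x) = uσ(vx) with a positively 2-homogeneous σ by forward Euler, the quantity u²/2 − v²/(2k) stays constant up to discretization error.

```python
import numpy as np

k, a, b = 2, 1.0, 0.3
sigma  = lambda s: np.where(s > 0, a * np.abs(s)**k, b * np.abs(s)**k)
dsigma = lambda s: np.where(s > 0, a * k * np.abs(s)**(k - 1),
                            -b * k * np.abs(s)**(k - 1))

rng = np.random.default_rng(0)
x = rng.normal(size=20); y = rng.normal(size=20)   # illustrative data
u, v = 0.7, -0.4
dt = 1e-4

def grads(u, v):
    r = u * sigma(v * x) - y                       # residuals of the empirical risk
    du = np.mean(2 * r * sigma(v * x))             # dE/du
    dv = np.mean(2 * r * u * dsigma(v * x) * x)    # dE/dv
    return du, dv

c0 = u**2 / 2 - v**2 / (2 * k)
for _ in range(20000):                             # forward-Euler gradient flow
    du, dv = grads(u, v)
    u, v = u - dt * du, v - dt * dv

print("drift of u^2/2 - v^2/(2k):", abs(u**2 / 2 - v**2 / (2 * k) - c0))  # ~O(dt)
```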
5. Independent conservation laws. Example: θ = (U, V), g(θ, x) := Uσ(V⊤x) = ∑_k u_k σ(⟨x, v_k⟩). Linear networks (σ(s) = s): h_{k,k'}(U, V) = ⟨u_k, u_{k'}⟩ − ⟨v_k, v_{k'}⟩. ReLU networks (σ(s) = max(s, 0)): h_k(U, V) = ∥u_k∥² − ∥v_k∥².
6. Illustration on a 2D → 1D neural network with 1 hidden neuron: θ = (u, v_1, v_2), g(θ, x) = uv_1 x_1 + uv_2 x_2. Conserved function: h(θ) = v_1² + v_2² − u². [Figure: level sets h⁻¹(−1), h⁻¹(0), h⁻¹(1) in (u, v_1, v_2) space.]
7. How many conservation laws are there, and how can we determine them? Since (h_1, …, h_K) conserved ⟹ Φ(h_1, …, h_K) conserved, one counts independent laws. Independence: ∀θ, (∇h_1(θ), …, ∇h_K(θ)) are linearly independent.
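As a sanity check on these laws, here is a short numerical sketch (illustrative sizes and random data, forward-Euler discretization; not the talk's code): along the gradient flow of a two-layer linear network, the whole matrix U⊤U − V⊤V, whose entries are the h_{k,k'}, stays constant up to discretization error.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d_in, d_out, r = 50, 5, 3, 4
X = rng.normal(size=(N, d_in)); Y = rng.normal(size=(N, d_out))
U = rng.normal(size=(d_out, r)); V = rng.normal(size=(d_in, r))

H0 = U.T @ U - V.T @ V                       # entries are h_{k,k'} at t = 0
dt = 1e-4
for _ in range(20000):                       # forward-Euler gradient flow
    R = X @ V @ U.T - Y                      # residuals, shape (N, d_out)
    gU = (2 / N) * R.T @ X @ V               # d loss / d U
    gV = (2 / N) * X.T @ R @ U               # d loss / d V
    U, V = U - dt * gU, V - dt * gV

print("drift:", np.linalg.norm(U.T @ U - V.T @ V - H0))  # ~O(dt)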
8. Structure of the flow fields. Along the flow, θ̇(t) = w(θ(t)) where w(θ) ∈ W(θ) := Span{∇ℰ_X^Y(θ) : ∀X, Y}. Proposition: h conserved ⇔ ∀θ, ∇h(θ) ⊥ W(θ). [Figure: ∇h(θ) normal to the level set {θ : h(θ) = h(θ(0))}, with W(θ) inside its tangent space.]
9. Question: determining W(θ). Chain rule: ∇ℰ_X^Y(θ) = (1/N) ∑_{i=1}^N ∂_θ g(θ, x_i)⊤ α_i, where α_i = ∇ℓ(g(θ, x_i), y_i). Hypothesis: Span_y ∇ℓ(z, y) is the whole space; example: ℓ(z, y) = ∥z − y∥². Proposition: W(θ) = Span ⋃_x Im[∂_θ g(θ, x)⊤].
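This proposition suggests a direct numerical test (a sketch; the ReLU network sizes and random inputs are assumptions): stack the rows of ∂_θ g(θ, x)⊤ over many samples x to span W(θ), then check that ∇h_k for h_k = ∥u_k∥² − ∥v_k∥² is orthogonal to all of them.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_out, r = 4, 3, 5
U = rng.normal(size=(d_out, r)); V = rng.normal(size=(d_in, r))

def jac_rows(x):
    """Rows of d_theta g(theta, x) for the ReLU network: one row per output j."""
    pre = V.T @ x                        # pre-activations <v_k, x>
    act = np.maximum(pre, 0.0)           # relu(<v_k, x>)
    ind = (pre > 0).astype(float)        # relu'(<v_k, x>)
    rows = []
    for j in range(d_out):
        dU = np.zeros((d_out, r)); dU[j] = act   # dg_j / dU
        dV = ind * U[j] * x[:, None]             # dg_j / dV, shape (d_in, r)
        rows.append(np.concatenate([dU.ravel(), dV.ravel()]))
    return rows

W = np.array([row for _ in range(200) for row in jac_rows(rng.normal(size=d_in))])

k = 0                                    # test the law h_k = ||u_k||^2 - ||v_k||^2
gh = np.concatenate([(2 * U * (np.arange(r) == k)).ravel(),
                     (-2 * V * (np.arange(r) == k)).ravel()])
print("rank of W(theta):", np.linalg.matrix_rank(W))
print("max |<row, grad h_k>|:", np.abs(W @ gh).max())   # ~1e-15: grad h ⟂ W(theta)
```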
10. Minimal parameterizations. Re-parameterization: g(θ, x) = f(φ(θ), x), with φ(θ) = (φ_1(θ), φ_2(θ), …), removes invariances. Linear networks: g(θ, x) = UV⊤x, φ(U, V) = UV⊤. ReLU networks: g(θ, x) = ∑_i u_i ReLU(⟨v_i, x⟩) = ∑_i 1_{⟨v_i, x⟩ ≥ 0}(u_i v_i⊤)x, φ(U, V) = (u_i v_i⊤)_i (valid only locally).
11. By the chain rule, W(θ) = W_g(θ) := Span ⋃_x Im[∂_θ g(θ, x)⊤] = ∂φ(θ)⊤ W_f(φ(θ)), where W_f(ξ) := Span ⋃_x Im[∂_ξ f(ξ, x)⊤].
12. Definition: φ is minimal if W_f(θ) is the whole space ⟺ W(θ) = Span(∇φ_1(θ), ∇φ_2(θ), …) → W is generated by a finite-dimensional set of vector fields.
13. Theorem: for σ = Id, φ(U, V) = UV⊤ is minimal; for σ = ReLU, φ(U, V) = (u_i v_i⊤)_i is minimal outside a set of measure zero.
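A numerical illustration of the linear case (a sketch with illustrative sizes, not the paper's code): the span of the gradients of φ_{ab}(U, V) = (UV⊤)_{ab} and the span ⋃_x Im[∂_θ g(θ, x)⊤] have the same dimension, and stacking them together does not increase the rank, so the two spans coincide.

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, d_out, r = 4, 3, 2
U = rng.normal(size=(d_out, r)); V = rng.normal(size=(d_in, r))

def g_jac_rows(x):                       # rows of d_theta (U V^T x)
    rows = []
    for j in range(d_out):
        dU = np.zeros((d_out, r)); dU[j] = V.T @ x   # dg_j / dU
        dV = np.outer(x, U[j])                       # dg_j / dV
        rows.append(np.concatenate([dU.ravel(), dV.ravel()]))
    return rows

def phi_grads():                         # gradients of phi_{ab}(U, V) = (U V^T)_{ab}
    grads = []
    for a_ in range(d_out):
        for b_ in range(d_in):
            dU = np.zeros((d_out, r)); dU[a_] = V[b_]
            dV = np.zeros((d_in, r));  dV[b_] = U[a_]
            grads.append(np.concatenate([dU.ravel(), dV.ravel()]))
    return grads

Wg = np.array([row for _ in range(100) for row in g_jac_rows(rng.normal(size=d_in))])
Wf = np.array(phi_grads())
print(np.linalg.matrix_rank(Wg), np.linalg.matrix_rank(Wf),
      np.linalg.matrix_rank(np.vstack([Wg, Wf])))   # three equal ranks => same span
```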
14. Example: 1 hidden neuron (2D → 1D network): g(θ, x) = uv_1 x_1 + uv_2 x_2, ℰ_X^Y(θ) = (uv_1 x_1 + uv_2 x_2 − y)². Here φ_1(u, v_1, v_2) = uv_1 with ∇φ_1(u, v_1, v_2) = (v_1, u, 0), and φ_2(u, v_1, v_2) = uv_2 with ∇φ_2(u, v_1, v_2) = (v_2, 0, u). Conserved function: h(θ) = v_1² + v_2² − u². [Figure: the trajectory θ(t) following −∇ℰ_X^Y toward argmin(ℰ_X^Y).]
15. Constructing conservation laws. For a minimal parameterization φ, W(θ) = Span(∇φ_1(θ), ∇φ_2(θ), …). Consequence: h conserved ⇔ ∀i, ⟨∇φ_i(θ), ∇h(θ)⟩ = 0.
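These orthogonality relations can be checked symbolically on the one-hidden-neuron example of slide 14 (a small sympy sketch, not from the talk):

```python
import sympy as sp

u, v1, v2 = sp.symbols("u v1 v2")
h = v1**2 + v2**2 - u**2
for phi in (u * v1, u * v2):             # phi_1 and phi_2 of slide 14
    inner = sum(sp.diff(phi, s) * sp.diff(h, s) for s in (u, v1, v2))
    print(sp.simplify(inner))            # prints 0 twice: h is conserved
```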
16. Example: single neuron, φ(u, v) = uv⊤, so ∂φ(u, v)⊤: M ↦ (Mv, M⊤u) and W(u, v) = Span_M{(Mv, M⊤u)}. Then h conserved ⇔ ∇_u h(u, v)v⊤ + u∇_v h(u, v)⊤ = 0. Only solutions: h(u, v) = Φ(∥u∥² − ∥v∥²).
17. For a polynomial φ, restricting the search to polynomials h of fixed degree turns this condition into a finite-dimensional linear kernel computation.
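Here is one way to carry out this kernel computation (a sketch under assumed small sizes n = m = 2, with sympy doing the linear algebra): parameterize a generic degree-2 polynomial h, impose ∇_u h v⊤ + u∇_v h⊤ = 0 coefficient by coefficient, and solve the resulting linear system. The kernel is spanned by ∥u∥² − ∥v∥², as slide 16 predicts.

```python
import sympy as sp

n = m = 2
u = sp.symbols(f"u0:{n}"); v = sp.symbols(f"v0:{m}")
z = list(u) + list(v)
monos = [z[a] * z[b] for a in range(len(z)) for b in range(a, len(z))]
coeffs = sp.symbols(f"c0:{len(monos)}")
h = sum(c * mo for c, mo in zip(coeffs, monos))     # generic degree-2 polynomial

eqs = []
for i in range(n):                                  # entrywise constraint:
    for j in range(m):                              # dh/du_i * v_j + u_i * dh/dv_j = 0
        residual = sp.expand(sp.diff(h, u[i]) * v[j] + u[i] * sp.diff(h, v[j]))
        eqs += [sp.Eq(co, 0) for co in sp.Poly(residual, *z).coeffs()]

sol = sp.solve(eqs, coeffs, dict=True)[0]           # linear system in the c's
print(sp.factor(h.subs(sol)))   # a scalar multiple of u0^2 + u1^2 - v0^2 - v1^2
```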
18. Did we find all the conservation laws? Question: find a "minimal" surface Σ tangent to all the W(θ), i.e. with dim(Σ) = dim(W(θ)). Issue: in general, impossible! [Figure: a trajectory θ(t) on Σ, with W(θ) tangent to Σ.]
19. Definition (Lie bracket): [w_1, w_2](θ) := ∂w_1(θ)w_2(θ) − ∂w_2(θ)w_1(θ). The flows θ̇ = w_1(θ) and θ̇ = w_2(θ) commute if [w_1, w_2] = 0. (Portrait: Sophus Lie.)
20. Theorem (Frobenius): write W(θ) = Span(w_i(θ))_i. If ∀(i, j), [w_i, w_j](θ) ∈ W(θ), then there exists Σ with dim(Σ) = dim(W(θ)). (Portrait: Ferdinand Georg Frobenius.)
21. Linear networks: [w_i, w_j](θ) ∈ W(θ). ReLU networks: [w_i, w_j](θ) ∉ W(θ).
22. Definition (generated Lie algebra W_∞): W_0(θ) = W(θ), W_{k+1} = Span([W_0, W_k] + W_k).
23. Number of conservation laws. Theorem: if dim(W_∞(θ)) = K is locally constant, then there are exactly d − K independent conservation laws. [Diagram: ReLU networks φ(U, V) = (u_i v_i⊤)_i relate by separability to linear networks φ(U, V) = UV⊤, with φ(u, v) = uv⊤ as the r = 1 sub-case.]
24. Proposition: for φ: (U, V) ↦ UV⊤ with r := rank(U; V), and W_0(θ) = Span(∇φ_1(θ), ∇φ_2(θ), …), one has W_0 ⊊ W_1 = Span([W_0, W_0] + W_0) and W_1 = W_2 = Span([W_0, W_1] + W_1) = W_3 = … = W_∞. Explicit formula: dim(W_∞) = (n + m)r − r(r + 1)/2 (except for r = 1).
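Since the fields w_M(U, V) = (MV, M⊤U) are linear in θ, their brackets have a closed form, [w_M, w_N](U, V) = ((MN⊤ − NM⊤)U, (M⊤N − N⊤M)V), and the dimension jump dim(W_0) ⊊ dim(W_1) = dim(W_∞) can be checked numerically (a sketch with illustrative sizes):

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
n, m, r = 4, 3, 2
U = rng.normal(size=(n, r)); V = rng.normal(size=(m, r))   # generic: rank(U; V) = r

basis = [np.zeros((n, m)) for _ in range(n * m)]            # canonical basis of M's
for idx, (i, j) in enumerate(itertools.product(range(n), range(m))):
    basis[idx][i, j] = 1.0

def vec(dU, dV):                          # flatten a tangent vector at theta = (U, V)
    return np.concatenate([dU.ravel(), dV.ravel()])

W0 = [vec(M @ V, M.T @ U) for M in basis]                   # fields spanning W_0
brackets = [vec((M @ N.T - N @ M.T) @ U, (M.T @ N - N.T @ M) @ V)
            for M, N in itertools.combinations(basis, 2)]   # closed-form Lie brackets
W1 = W0 + brackets                                          # W_1 = W_2 = ... = W_inf

dim0 = np.linalg.matrix_rank(np.array(W0))
dim1 = np.linalg.matrix_rank(np.array(W1))
print(dim0, dim1, (n + m) * r - r * (r + 1) // 2)           # dim0 < dim1 = formula
```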
25. Proposition: the h_{k,k'}(U, V) = ⟨u_k, u_{k'}⟩ − ⟨v_k, v_{k'}⟩ define r(r + 1)/2 independent conservation laws. Corollary: for ReLU and linear networks, there are no other conservation laws. (Indeed, d − dim(W_∞) = (n + m)r − [(n + m)r − r(r + 1)/2] = r(r + 1)/2, so the count is exactly achieved.)
26. Momentum flows: θ̈(t) + τ(t)θ̇(t) = −∇ℰ(θ(t)). Heavy ball: τ(t) = τ. Nesterov: τ(t) = 3/t. Conservation laws h(θ, θ̇, t) now depend on position, speed and time!
28. Lagrangian formulation: the flow is a stationary point of ∫ L(θ(t), θ̇(t), t)dt with L(θ, v, t) := e^{τt}(∥v∥_2² − ℰ(θ)). Finding conservation laws: apply Noether's theorem → one can only leverage the orthogonal invariance! Theorem: for φ(θ) = UV⊤, the conservation laws are h(θ, θ̇, t) := e^{∫_0^t τ}(⟨U̇, UA⟩ + ⟨V̇, VA⟩), ∀A anti-symmetric. (Portrait: Emmy Noether.)
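A numerical sketch of this theorem (assumed setup: heavy-ball flow of the loss ℰ(θ) = ∥UV⊤ − Y∥²_F/2, forward-Euler integration, illustrative sizes): for any anti-symmetric A, the charge e^{τt}(⟨U̇, UA⟩ + ⟨V̇, VA⟩) drifts only at the discretization error.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, r, tau, dt = 4, 3, 2, 0.5, 1e-4
U = rng.normal(size=(n, r)); V = rng.normal(size=(m, r))
P = rng.normal(size=(n, r)); Q = rng.normal(size=(m, r))   # velocities dU/dt, dV/dt
Y = rng.normal(size=(n, m))
A = np.array([[0.0, 1.0], [-1.0, 0.0]])                    # anti-symmetric, r = 2

def charge(t):
    """Noether charge e^{tau t} (<dU/dt, U A> + <dV/dt, V A>)."""
    return np.exp(tau * t) * (np.sum(P * (U @ A)) + np.sum(Q * (V @ A)))

h0 = charge(0.0)
for step in range(50000):                  # Euler scheme for theta'' + tau theta' = -grad E
    R = U @ V.T - Y                        # grad of E = ||U V^T - Y||_F^2 / 2
    P, Q = P + dt * (-tau * P - R @ V), Q + dt * (-tau * Q - R.T @ U)
    U, V = U + dt * P, V + dt * Q

print("charge drift:", abs(charge(50000 * dt) - h0))       # ~O(dt)
```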
29. Independent laws: r(r + 1)/2 for gradient flows, r(r − 1)/2 for momentum flows; for ReLU networks: no conservation law at all! Lie algebra analysis via phase-space lifting: setting ω := (t, θ(t), θ̇(t)), one has ω̇ ∈ Span{(1, θ̇, −τ(t)θ̇), (0, 0, ∇φ_1(θ)), (0, 0, ∇φ_2(θ)), …}.
30. Conclusion. Deeper networks: no minimal parameterization valid for almost all θ. → For some θ(0), there exist new conservation laws. → W_∞ is infinite-dimensional; SageMath code to compute W_∞(θ): https://github.com/sibyllema/Conservation_laws
31. Extensions: max-pooling; convolution; skip connections; variable metric, mirror descent; discrete gradient descent, SGD; …