
# Conservation Laws for Gradient Flows

Talk associated to the paper: https://arxiv.org/abs/2307.00144

## Gabriel Peyré

September 29, 2023

## Transcript

1. ### Conservation Laws for Gradient Flows

Gabriel Peyré, Sibylle Marcotte, Rémi Gribonval (École Normale Supérieure)
2. ### Conservation laws · Finding conservation laws · Have we found them all?

(Portraits: Ferdinand Georg Frobenius, Sophus Lie.)
4. ### Conservation laws

Neural network (2 layers), θ = (U, V): g(θ, x) := Uσ(V⊤x) = ∑_k u_k σ(⟨x, v_k⟩)

Empirical risk minimization: ℰ_X^Y(θ) := (1/N) ∑_{i=1}^N ℓ(g(θ, x_i), y_i)

Gradient flow: θ̇(t) = −∇ℰ_X^Y(θ(t)) (the trajectory θ(t) descends toward argmin ℰ_X^Y).

Conservation law h(θ): ∀X, Y, t, θ(0), h(θ(t)) = h(θ(0)).

Applications: understanding the implicit bias of gradient descent; helping to prove convergence.
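These definitions are easy to check numerically. Below is a minimal sketch (numpy, random toy data, all names hypothetical): for a one-hidden-neuron network g(θ, x) = u⟨v, x⟩ with square loss, the quantity h(θ) = v₁² + v₂² − u² (a conservation law introduced on the next slides) has a gradient orthogonal to every loss gradient, so an Euler discretization of the gradient flow keeps it nearly constant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance of the setting above: one hidden neuron, 2D input, 1D output,
# g(theta, x) = u * <v, x>, square loss on random data (all hypothetical).
X = rng.standard_normal((10, 2))
Y = rng.standard_normal(10)

def grad_E(theta):
    u, v = theta[0], theta[1:]
    r = u * (X @ v) - Y                    # residuals g(theta, x_i) - y_i
    gu = 2 * np.mean(r * (X @ v))          # dE/du
    gv = 2 * np.mean((r * u)[:, None] * X, axis=0)  # dE/dv
    return np.concatenate([[gu], gv])

def h(theta):                              # candidate conserved quantity
    return theta[1]**2 + theta[2]**2 - theta[0]**2

# h is conserved iff grad h is orthogonal to every achievable grad E.
theta0 = rng.standard_normal(3)
grad_h = np.array([-2 * theta0[0], 2 * theta0[1], 2 * theta0[2]])
ortho = abs(grad_h @ grad_E(theta0))       # ~ 0 up to rounding

# Explicit Euler discretization of the gradient flow: h drifts only at O(dt).
theta, dt = theta0.copy(), 1e-4
for _ in range(5000):
    theta = theta - dt * grad_E(theta)
drift = abs(h(theta) - h(theta0))
```

The exact flow conserves h; the small residual drift here is purely the Euler discretization error.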
5. ### Homogeneity matters

1-D with 1 neuron: g(θ, ⋅) : ℝ → ℝ, g(θ, x) := uσ(vx), with σ(s) = a|s|^k for s > 0 and σ(s) = b|s|^k for s < 0.

Proposition: there exist conservation laws if and only if σ is positively k-homogeneous. The conservation laws are then h(u, v) = Φ(u²/2 − v²/(2k)).

(Plots: the graph of σ for k = 1 and for k = 2.)
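A quick numerical check of the proposition (numpy, random toy data, all names hypothetical), using the orthogonality criterion made precise later in the talk and the normalization h(u, v) = u²/2 − v²/(2k): positive k-homogeneity gives σ′(s)s = kσ(s), which makes ∇h orthogonal to the loss gradient.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-D, 1-neuron model g(theta, x) = u * sigma(v*x) with the slide's activation
# sigma(s) = a|s|^k (s > 0), b|s|^k (s < 0); here k = 2, a = 1, b = 0.5.
k, a, b = 2, 1.0, 0.5
sigma  = lambda s: np.where(s > 0, a, b) * np.abs(s)**k
dsigma = lambda s: np.where(s > 0, a, b) * k * np.abs(s)**(k - 1) * np.sign(s)

X = rng.standard_normal(20)
Y = rng.standard_normal(20)
u, v = rng.standard_normal(2)

# Square-loss gradient of the empirical risk at (u, v).
r = u * sigma(v * X) - Y
gu = 2 * np.mean(r * sigma(v * X))
gv = 2 * np.mean(r * u * dsigma(v * X) * X)

# grad of h(u, v) = u^2/2 - v^2/(2k) is (u, -v/k); positive k-homogeneity
# (sigma'(s) s = k sigma(s)) makes it orthogonal to (gu, gv).
inner = u * gu - (v / k) * gv
```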
8. ### Independent conservation laws

Example: g(θ, x) := Uσ(V⊤x) = ∑_k u_k σ(⟨x, v_k⟩), θ = (U, V).

Linear networks, σ(s) = s: h_{k,k′}(U, V) = ⟨u_k, u_{k′}⟩ − ⟨v_k, v_{k′}⟩.
ReLU networks, σ(s) = max(s, 0): h_k(U, V) = ∥u_k∥² − ∥v_k∥².

Neural network 2D → 1D, 1 hidden neuron: g(θ, x) = u v₁ x₁ + u v₂ x₂, θ = (u, v₁, v₂). Conserved function: h(θ) = v₁² + v₂² − u². (Figure: the level sets h⁻¹(−1), h⁻¹(0), h⁻¹(1).)

How many conservation laws are there? How to determine them all?

(h₁, …, h_K) conserved ⟹ Φ(h₁, …, h_K) conserved.
Independence: ∀θ, (∇h₁(θ), …, ∇h_K(θ)) are linearly independent.
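The linear-network laws h_{k,k′} and their independence can be verified directly. A sketch in numpy (random toy sizes and data, all hypothetical): each ⟨∇h_{k,k′}, ∇ℰ⟩ vanishes, and the d(d+1)/2 gradients are linearly independent.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, d, N = 3, 4, 2, 15              # toy sizes, all hypothetical
U = rng.standard_normal((n, d))
V = rng.standard_normal((m, d))
X = rng.standard_normal((N, m))
Y = rng.standard_normal((N, n))

# Square-loss gradient for the linear network g(theta, x) = U V^T x.
R = X @ V @ U.T - Y                   # residuals, one row per sample
G = (2 / N) * R.T @ X                 # gradient w.r.t. the product U V^T
gE = np.concatenate([(G @ V).ravel(), (G.T @ U).ravel()])

def grad_h(k, kp):
    """Gradient of h_{k,k'}(U, V) = <u_k, u_k'> - <v_k, v_k'> w.r.t. (U, V)."""
    dU, dV = np.zeros_like(U), np.zeros_like(V)
    dU[:, k] += U[:, kp]; dU[:, kp] += U[:, k]
    dV[:, k] -= V[:, kp]; dV[:, kp] -= V[:, k]
    return np.concatenate([dU.ravel(), dV.ravel()])

pairs = [(k, kp) for k in range(d) for kp in range(k, d)]
grads = np.array([grad_h(k, kp) for k, kp in pairs])

inner_max = np.abs(grads @ gE).max()    # every h_{k,k'} is conserved
n_indep = np.linalg.matrix_rank(grads)  # d(d+1)/2 independent laws
```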
9. ### Conservation laws · Finding conservation laws · Have we found them all?

(Portraits: Ferdinand Georg Frobenius, Sophus Lie.)
11. ### Structure of the Flow Fields

The gradient flow reads θ̇(t) = w(θ(t)) where w(θ) ∈ W(θ) := Span{∇ℰ_X^Y(θ) : ∀X, Y}.

Proposition: h conserved ⇔ ∀θ, ∇h(θ) ⊥ W(θ). (∇h(θ) is normal to the level set {θ : h(θ) = h(θ(0))}, whose tangent space contains W(θ).)

Chain rule: ∇ℰ_X^Y(θ) = (1/N) ∑_{i=1}^N ∂_θ g(θ, x_i)⊤ α_i where α_i = ∇ℓ(g(θ, x_i), y_i).

Hypothesis: Span_y ∇ℓ(z, y) = whole space. Example: ℓ(z, y) = ∥z − y∥².

Proposition: W(θ) = Span ⋃_x Im[∂_θ g(θ, x)⊤]. Question: determining W(θ).
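The second proposition can be explored numerically on the one-neuron example: sample many inputs x, stack the vectors ∂_θ g(θ, x)⊤, and inspect the resulting span. A sketch in numpy (hypothetical setup):

```python
import numpy as np

rng = np.random.default_rng(3)

# One hidden neuron, 2D -> 1D: g(theta, x) = u*(v1*x1 + v2*x2).
u, v1, v2 = rng.standard_normal(3)

def jac_T(x):
    # d_theta g(theta, x)^T applied to alpha = 1 (the output is scalar).
    return np.array([v1 * x[0] + v2 * x[1], u * x[0], u * x[1]])

# W(theta) = Span over x of Im[d_theta g(theta, x)^T]: sample many inputs.
W = np.array([jac_T(rng.standard_normal(2)) for _ in range(50)])
dim_W = np.linalg.matrix_rank(W)             # 2, not 3: the flow is constrained

grad_h = np.array([-2 * u, 2 * v1, 2 * v2])  # h = v1^2 + v2^2 - u^2
ortho = np.abs(W @ grad_h).max()             # grad h is orthogonal to W(theta)
```

The rank deficit (2 instead of 3) is exactly what leaves room for one conservation law.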
15. ### Minimal Parameterizations

Re-parameterization: g(θ, x) = f(φ(θ), x) removes invariances, with φ(θ) = (φ₁(θ), φ₂(θ), …).

Linear networks: g(θ, x) = UV⊤x, φ(U, V) = UV⊤.
ReLU networks: g(θ, x) = ∑_i u_i ReLU(⟨v_i, x⟩) = ∑_i 1_{⟨v_i, x⟩ ≥ 0} (u_i v_i⊤) x, φ(U, V) = (u_i v_i⊤)_i (valid only locally).

Chain rule: W(θ) = W_g(θ) := Span ⋃_x Im[∂_θ g(θ, x)⊤] = ∂φ(θ)⊤ Span ⋃_x Im[∂_φ f(φ(θ), x)⊤] =: ∂φ(θ)⊤ W_f(φ(θ)).

Definition: φ is minimal if W_f is the whole space ⟺ W(θ) = Span(∇φ₁(θ), ∇φ₂(θ), …) → a finite-dimensional set of vector fields.

Theorem: for σ = Id, φ(U, V) = UV⊤ is minimal; for σ = ReLU, φ(U, V) = (u_i v_i⊤)_i is minimal outside a set of measure zero.
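Minimality can be probed numerically for the linear network: the span of the gradients of the entries of φ(U, V) = UV⊤ should coincide with the span of the vectors ∂_θ g(θ, x)⊤α. A sketch in numpy (random U, V, toy dimensions, all names hypothetical), comparing the two spans via ranks:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, d = 3, 4, 2                     # hypothetical toy sizes
U = rng.standard_normal((n, d))
V = rng.standard_normal((m, d))

def vec(dU, dV):
    return np.concatenate([dU.ravel(), dV.ravel()])

# Gradients of the entries of phi(U, V) = U V^T.
W_phi = np.array([vec(np.outer(np.eye(n)[a], V[b]),
                      np.outer(np.eye(m)[b], U[a]))
                  for a in range(n) for b in range(m)])

# Vectors d_theta g(theta, x)^T alpha for random x and alpha: these span W_g.
def w_g(x, alpha):
    return vec(np.outer(alpha, V.T @ x), np.outer(x, U.T @ alpha))

W_g = np.array([w_g(rng.standard_normal(m), rng.standard_normal(n))
                for _ in range(100)])

r_phi = np.linalg.matrix_rank(W_phi)
r_g = np.linalg.matrix_rank(W_g)
r_union = np.linalg.matrix_rank(np.vstack([W_phi, W_g]))
```

Equality of the three ranks means W_g(θ) = Span(∇φ₁(θ), ∇φ₂(θ), …), i.e. minimality at this θ.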
16. ### Example: 1 hidden neuron

Neural network 2D → 1D: g(θ, x) = u v₁ x₁ + u v₂ x₂, θ = (u, v₁, v₂), with ℰ_X^Y(θ) = (u v₁ x₁ + u v₂ x₂ − y)².

φ₁(u, v₁, v₂) = u v₁, ∇φ₁(u, v₁, v₂) = (v₁, u, 0)
φ₂(u, v₁, v₂) = u v₂, ∇φ₂(u, v₁, v₂) = (v₂, 0, u)

Conserved function: h(θ) = v₁² + v₂² − u². (Figure: trajectory θ(t) following −∇ℰ_X^Y toward argmin(ℰ_X^Y).)
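This example can be verified mechanically: at every θ, ∇h must be orthogonal to ∇φ₁ and ∇φ₂, which span W(θ) here. A tiny numpy check at random points (hypothetical sampling):

```python
import numpy as np

rng = np.random.default_rng(5)

# At every theta = (u, v1, v2), grad h must be orthogonal to grad phi_1 and
# grad phi_2, which span W(theta) for this one-neuron network.
ok = True
for _ in range(100):
    u, v1, v2 = rng.standard_normal(3)
    gphi1 = np.array([v1, u, 0.0])      # grad of phi_1 = u*v1
    gphi2 = np.array([v2, 0.0, u])      # grad of phi_2 = u*v2
    gh = np.array([-2*u, 2*v1, 2*v2])   # grad of h = v1^2 + v2^2 - u^2
    ok = ok and abs(gphi1 @ gh) < 1e-12 and abs(gphi2 @ gh) < 1e-12
```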
19. ### Constructing Conservation Laws

Minimal parameterization φ: W(θ) = Span(∇φ₁(θ), ∇φ₂(θ), …).

Consequence: h conserved ⇔ ∀i, ⟨∇φ_i(θ), ∇h(θ)⟩ = 0.

Example (single neuron): φ(u, v) = uv⊤, ∂φ(u, v)⊤ : M ↦ (Mv, M⊤u), so W(u, v) = Span_M {(Mv, M⊤u)}.
h conserved ⇔ ∇_u h(u, v) v⊤ + u ∇_v h(u, v)⊤ = 0. Only solutions: h(u, v) = Φ(∥u∥² − ∥v∥²).

For a polynomial φ, restricting the search to polynomials h of fixed degree → a finite-dimensional linear kernel.
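The last remark is algorithmic: for a polynomial φ, searching for conservation laws among polynomials h of fixed degree is a finite-dimensional linear problem. A sketch for the single neuron with u, v ∈ ℝ² (numpy, quadratic h; the nullspace should be spanned by ∥u∥² − ∥v∥²):

```python
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(6)
n = m = 2                              # single neuron: theta = (u, v) in R^4
dim = n + m
basis = list(combinations_with_replacement(range(dim), 2))  # quadratic monomials

def grad_monomial(i, j, th):
    g = np.zeros(dim)
    g[i] += th[j]; g[j] += th[i]
    return g

def grad_phi(a, b, th):
    g = np.zeros(dim)
    g[a] = th[n + b]                   # d(u_a v_b)/du_a = v_b
    g[n + b] = th[a]                   # d(u_a v_b)/dv_b = u_a
    return g

# <grad phi_ab(theta), grad h(theta)> = 0 for all theta is linear in the
# coefficients of a quadratic h: sample thetas to build the linear system.
rows = []
for _ in range(60):
    th = rng.standard_normal(dim)
    for a in range(n):
        for b in range(m):
            gp = grad_phi(a, b, th)
            rows.append([gp @ grad_monomial(i, j, th) for i, j in basis])

_, s, Vt = np.linalg.svd(np.array(rows))
nullity = int(np.sum(s < 1e-10 * s[0]))
law = Vt[-1] / Vt[-1][0]               # coefficients of the conserved quadratic
```

With the monomial ordering above, the recovered coefficients are those of u₁² + u₂² − v₁² − v₂².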
20. ### Conservation laws · Finding conservation laws · Have we found them all?

(Portraits: Ferdinand Georg Frobenius, Sophus Lie.)
25. ### Have we found all the conservation laws?

Question: find a "minimal" surface Σ tangent to all W(θ), i.e. with dim(Σ) = dim(W(θ)). Issue: in general, impossible!

Definition (Lie bracket): [w₁, w₂](θ) := ∂w₁(θ) w₂(θ) − ∂w₂(θ) w₁(θ). The flows of θ̇ = w₁(θ) and θ̇ = w₂(θ) commute if [w₁, w₂] = 0.

Theorem (Frobenius): write W(θ) = Span(w_i(θ))_i. If ∀(i, j), [w_i, w_j](θ) ∈ W(θ), then there exists Σ with dim(Σ) = dim(W(θ)).

Linear networks: [w_i, w_j](θ) ∉ W(θ). ReLU networks: [w_i, w_j](θ) ∈ W(θ).

Definition (generated Lie algebra W∞): W₀(θ) = W(θ), W_{k+1} = Span([W₀, W_k] + W_k).

(Portraits: Ferdinand Georg Frobenius, Sophus Lie.)
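The Lie bracket definition is easy to exercise numerically. A sketch in numpy (hypothetical linear fields): for w(θ) = Mθ the bracket is the matrix commutator, which we compare against a finite-difference evaluation of the definition.

```python
import numpy as np

rng = np.random.default_rng(7)

# For linear fields w(theta) = M theta the Lie bracket is the commutator:
# [w1, w2](theta) = (M1 M2 - M2 M1) theta. Check against finite differences.
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
w1 = lambda th: A @ th
w2 = lambda th: B @ th

def bracket(wa, wb, th, eps=1e-6):
    """[wa, wb](th) via central-difference Jacobians of wa and wb."""
    d = len(th)
    Ja = np.array([(wa(th + eps*e) - wa(th - eps*e)) / (2*eps)
                   for e in np.eye(d)]).T
    Jb = np.array([(wb(th + eps*e) - wb(th - eps*e)) / (2*eps)
                   for e in np.eye(d)]).T
    return Ja @ wb(th) - Jb @ wa(th)

th = rng.standard_normal(3)
err = np.linalg.norm(bracket(w1, w2, th) - (A @ B - B @ A) @ th)
```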
28. ### Number of Conservation Laws

Theorem: if dim(W∞(θ)) = K is locally constant, then there are exactly d − K independent conservation laws (d the number of parameters).

Linear networks: φ(U, V) = UV⊤, with the single neuron φ(u, v) = uv⊤ as the sub-case r = 1. ReLU networks: φ(U, V) = (u_i v_i⊤)_i (by separability across neurons).

Proposition: for φ : (U, V) ↦ UV⊤ and r := rank([U; V]), starting from W₀(θ) = Span(∇φ₁(θ), ∇φ₂(θ), …), one has W₀ ⊊ W₁ = Span([W₀, W₀] + W₀), then W₁ = W₂ = Span([W₀, W₁] + W₁) = W₃ = … = W∞.
Explicit formula: dim(W∞) = (n + m)r − r(r + 1)/2 (except for r = 1).

Proposition: the h_{k,k′}(U, V) = ⟨u_k, u_{k′}⟩ − ⟨v_k, v_{k′}⟩ define r(r + 1)/2 independent conservation laws.

Corollary: for ReLU and linear networks, there are no other conservation laws.
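The proposition can be checked numerically on a small linear network (here n = m = 3 and hidden width 2, so generically r = 2): the fields ∇φ_ab are linear in θ, so Lie brackets reduce to matrix commutators. A sketch in numpy, assuming the slide's dimension formula:

```python
import numpy as np

rng = np.random.default_rng(8)
n = m = 3; d = 2                       # hypothetical toy sizes
D = (n + m) * d                        # theta = (U, V) flattened

# Fields w_ab(theta) = grad phi_ab(theta) for phi(U, V) = U V^T are linear in
# theta, w_ab(theta) = M_ab theta, so Lie brackets are matrix commutators.
def M(a, b):
    Mat = np.zeros((D, D))
    for k in range(d):
        Mat[a*d + k, n*d + b*d + k] = 1.0  # U-block row a picks up row b of V
        Mat[n*d + b*d + k, a*d + k] = 1.0  # V-block row b picks up row a of U
    return Mat

Ms = [M(a, b) for a in range(n) for b in range(m)]
theta = rng.standard_normal(D)

def span_dim(mats):
    return np.linalg.matrix_rank(np.array([Mat @ theta for Mat in mats]))

br1 = [Mi @ Mj - Mj @ Mi for Mi in Ms for Mj in Ms]
lvl1 = Ms + br1
br2 = [Mi @ C - C @ Mi for Mi in Ms for C in lvl1]

dim_W0 = span_dim(Ms)                  # strictly smaller than dim W1
dim_W1 = span_dim(lvl1)
dim_W2 = span_dim(lvl1 + br2)          # the algebra has stabilized

Umat = theta[:n*d].reshape(n, d)
Vmat = theta[n*d:].reshape(m, d)
r = np.linalg.matrix_rank(np.vstack([Umat, Vmat]))   # generically r = d = 2
formula = (n + m) * r - r * (r + 1) // 2
```

Here D = 12 and dim W∞ = 9, so there are 12 − 9 = 3 laws, matching the r(r + 1)/2 laws h_{k,k′}.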
32. ### Momentum Flows

Momentum flow: θ̈(t) + τ(t) θ̇(t) = −∇ℰ(θ(t)). Heavy ball: τ(t) = τ. Nesterov: τ(t) = 3/t.

Conservation laws h(θ, θ̇, t) now depend on speed and time!

Lagrangian formulation: trajectories are critical points of ∫ L(θ(t), θ̇(t), t) dt with L(θ, v, t) := e^{τt} (∥v∥²/2 − ℰ(θ)).

Finding conservation laws by applying Noether's theorem → can only leverage orthogonal invariance! (Portrait: Emmy Noether.)

Theorem: for φ(θ) = UV⊤, the conservation laws are h(θ, θ̇, t) := e^{∫₀ᵗ τ} (⟨U̇, UA⟩ + ⟨V̇, VA⟩), for A antisymmetric.

Independent laws: r(r + 1)/2 for gradient flows vs r(r − 1)/2 for momentum flows. ReLU networks: no conservation!

Lie algebra analysis via phase-space lifting: ω := (t, θ(t), θ̇(t)), ω̇ ∈ Span((1, θ̇, −τ(t)θ̇), (0, 0, ∇φ₁(θ)), (0, 0, ∇φ₂(θ)), …).
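The momentum law can be checked algebraically at a single phase-space point, with no ODE integration: plug the heavy-ball dynamics into the time derivative of h and verify it cancels. A sketch in numpy (random state and hypothetical sizes; the pairing is written here as ⟨U̇, UA⟩ + ⟨V̇, VA⟩, the dimensionally consistent reading of the slide):

```python
import numpy as np

rng = np.random.default_rng(9)
n, m, d = 3, 4, 2
U, V = rng.standard_normal((n, d)), rng.standard_normal((m, d))
dU, dV = rng.standard_normal((n, d)), rng.standard_normal((m, d))  # velocities
tau, t = 0.7, 1.3                       # heavy ball: constant damping

# Any antisymmetric A yields one law; G stands for the gradient of the loss
# with respect to the product U V^T (its value is irrelevant for the check).
Araw = rng.standard_normal((d, d)); A = Araw - Araw.T
G = rng.standard_normal((n, m))
ddU = -tau * dU - G @ V                 # U'' = -tau U' - grad_U E
ddV = -tau * dV - G.T @ U               # V'' = -tau V' - grad_V E

ip = lambda Xm, Ym: np.sum(Xm * Ym)
law = ip(dU, U @ A) + ip(dV, V @ A)
# Time derivative of h = e^{tau t} (<U', U A> + <V', V A>) along the dynamics:
dh = np.exp(tau * t) * (tau * law
                        + ip(ddU, U @ A) + ip(dU, dU @ A)
                        + ip(ddV, V @ A) + ip(dV, dV @ A))
```

The cancellation uses that ⟨S, A⟩ = 0 whenever S is symmetric and A antisymmetric, for any G.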
34. ### Conclusion

Deeper networks: no minimal parameterization valid for almost all θ. → For some θ(0), there exist new conservation laws.

W∞ is infinite dimensional → SageMath code to compute W∞(θ): https://github.com/sibyllema/Conservation_laws

Extensions: max-pooling, convolution, skip connections, variable metric, mirror descent, discrete gradient descent, SGD, …