Hamilton Jacobi Bellman approach for state-constrained control problem with maximum cost

Hamilton Jacobi Bellman approach for some applied control problems
Mohamed Assellaou
Supervisor: Hasnaa Zidani
Co-supervisor: Olivier Bokanowski
December 7, 2015 - ENSTA ParisTech

Introduction
Indirect methods: calculus of variations, Pontryagin Maximum Principle (Pontryagin 1961), shooting methods.
- Based on necessary conditions, which are not necessarily sufficient for optimality.
- Accurate when they work, but local.
Direct methods: discretize, then optimize.
- General approach.
- Optimization solvers (IPOPT, WORHP, ...).
- Convergence results require a restrictive framework.
- Accurate when they work, but local.

Introduction
Dynamic Programming Principle (Bellman 1957) and the Hamilton Jacobi Bellman approach:
- Quite general approach.
- Extensive literature on viscosity solutions of HJB equations.
- Lack of studies dedicated to the reconstruction of trajectories, and to HJB equations or value functions coming from applications.

Motivations
Real applications include physical state and control constraints:
- Without any controllability property.
- Time-dependent constraints.
- Constraints of different types: target constraints, probabilistic constraints.
Reconstruction of optimal trajectories and optimal feedback control:
- Analysis of the optimal trajectories using the value function.
- Numerical algorithms for the reconstruction of optimal trajectories.

Main contributions of the PhD thesis
1. Analysis of state-constrained deterministic OCPs with unusual costs:
State-constrained OCPs with maximum cost
- Characterization of the value function.
- Characterization of the epigraph of the value function by a Lipschitz value function of an auxiliary control problem.
- Link with the exit time.
- Reconstruction of optimal trajectories.
State-constrained OCPs with Bolza costs
- Locally Lipschitz continuous cost with polynomial growth.
- Convergence result for optimal trajectories in the state-constrained case.
Abort landing problem in presence of windshear.
2. Stochastic control problems:
Probabilistic constraints; probabilistic backward reachable sets:
- New level-set approach.
- Approximated sets.
New error estimates for a class of stochastic control problems:
- Discontinuous cost.
- Unbounded coefficients and cost.

Publications
- M. Assellaou, O. Bokanowski and H. Zidani, Probabilistic safety reachability analysis, ICCOPT, Lisboa, July 2013.
- M. Assellaou, O. Bokanowski and H. Zidani, Error Estimates for Second Order Hamilton-Jacobi-Bellman Equations. Approximation of Probabilistic Reachable Sets, DCDS - Series A, vol. 35(9), pp. 3933-3964, 2015.
- M. Assellaou, O. Bokanowski, A. Desilles and H. Zidani, Feedback control analysis for state constrained control problem with maximum cost, in preparation, 2015.
- M. Assellaou, O. Bokanowski, A. Desilles and H. Zidani, Optimal feedback control for the abort landing problem in presence of windshear, in preparation, 2015.

Outline
1. Abort landing problem in presence of windshear.
- Physical model.
- Setting of the problem.
- Auxiliary control problem.
- Reconstruction of optimal trajectories.
- Numerical simulations.
2. Backward reachable sets under probability of success.
- Setting of the problem. Basic assumptions.
- Stochastic OCP with unbounded and discontinuous cost.
- Numerical simulations.

Abort landing problem in presence of windshear
Physical model: Abort landing problem under windshear
[Figure: Forces acting on the aircraft in flight in a moving atmosphere. Labels: $x$, $h$, $F_L$, $F_D$, $F_P$, $V$, $x_b$, $\delta$, $F_T$, $\alpha$, $\gamma$.]

Physical model: Controlled differential equation
The equations of motion in a vertical plane over a flat plane are given by the following system:
$$
\begin{cases}
\dot x = V\cos\gamma + w_x,\\
\dot h = V\sin\gamma + w_h,\\
\dot V = \beta\dfrac{F_T}{m}\cos(\alpha+\delta) - \dfrac{F_D}{m} - g\sin\gamma - (\dot w_x\cos\gamma + \dot w_h\sin\gamma),\\
\dot\gamma = \dfrac{1}{V}\Big(\beta\dfrac{F_T}{m}\sin(\alpha+\delta) + \dfrac{F_L}{m} - g\cos\gamma + (\dot w_x\sin\gamma - \dot w_h\cos\gamma)\Big),
\end{cases}
$$
where
$$
\dot w_x = \frac{\partial w_x}{\partial x}(V\cos\gamma + w_x) + \frac{\partial w_x}{\partial h}(V\sin\gamma + w_h),
\qquad
\dot w_h = \frac{\partial w_h}{\partial x}(V\cos\gamma + w_x) + \frac{\partial w_h}{\partial h}(V\sin\gamma + w_h),
$$
and
- $F_T := F_T(V)$ is the thrust force,
- $F_D := F_D(V,\alpha)$ and $F_L := F_L(V,\alpha)$ are the drag and lift forces,
- $w_x := w_x(x)$ and $w_h := w_h(x,h)$ are the wind components,
- $m$, $g$ and $\delta$ are constants.
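As an illustration, the right-hand side of the 4D system above can be sketched in Python. The force and wind models are passed in as callables; their names and signatures (`thrust`, `drag`, `lift`, `wind`) are assumptions made for this sketch and are not taken from the slides.

```python
import math

def rhs_4d(state, alpha, beta, thrust, drag, lift, wind, m, g, delta):
    """Right-hand side (x_dot, h_dot, V_dot, gamma_dot) of the 4D model.

    Hypothetical interfaces: thrust(V), drag(V, alpha), lift(V, alpha) return
    forces; wind(x, h) returns (wx, wh, dwx_dx, dwx_dh, dwh_dx, dwh_dh).
    """
    x, h, V, gamma = state
    wx, wh, dwx_dx, dwx_dh, dwh_dx, dwh_dh = wind(x, h)
    x_dot = V * math.cos(gamma) + wx
    h_dot = V * math.sin(gamma) + wh
    # Total time derivatives of the wind components along the trajectory.
    wx_dot = dwx_dx * x_dot + dwx_dh * h_dot
    wh_dot = dwh_dx * x_dot + dwh_dh * h_dot
    FT, FD, FL = thrust(V), drag(V, alpha), lift(V, alpha)
    V_dot = (beta * FT / m * math.cos(alpha + delta) - FD / m
             - g * math.sin(gamma)
             - (wx_dot * math.cos(gamma) + wh_dot * math.sin(gamma)))
    gamma_dot = (beta * FT / m * math.sin(alpha + delta) + FL / m
                 - g * math.cos(gamma)
                 + (wx_dot * math.sin(gamma) - wh_dot * math.cos(gamma))) / V
    return (x_dot, h_dot, V_dot, gamma_dot)
```

With all forces and the wind set to zero, the sketch reduces to a point mass in free fall, which gives a quick sanity check of the signs.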

Physical model: Different models
4D model: the state variable is $y(\cdot) = (x(\cdot), h(\cdot), V(\cdot), \gamma(\cdot))$ and the control vector is $u(\cdot) = (\alpha(\cdot), \beta(\cdot))$.
Bulirsch et al. model: 5D model, with state variable $y(\cdot) = (x(\cdot), h(\cdot), V(\cdot), \gamma(\cdot), \alpha(\cdot))$. The control $u$ is the angular speed of $\alpha$. The controlled dynamics is:
$$
\begin{cases}
\dot x = V\cos\gamma + w_x,\\
\dot h = V\sin\gamma + w_h,\\
\dot V = \dfrac{F_T}{m}\cos(\alpha+\delta) - \dfrac{F_D}{m} - g\sin\gamma - (\dot w_x\cos\gamma + \dot w_h\sin\gamma),\\
\dot\gamma = \dfrac{1}{V}\Big(\dfrac{F_T}{m}\sin(\alpha+\delta) + \dfrac{F_L}{m} - g\cos\gamma + (\dot w_x\sin\gamma - \dot w_h\cos\gamma)\Big),\\
\dot\alpha = u.
\end{cases}
$$

Physical model: State and control constraints
State constraints:
- Manufacturer constraints: the velocity $V \in [V_{\min}, V_{\max}]$, the angle of attack $\alpha \in [\alpha_{\min}, \alpha_{\max}]$.
- Constraints from the physical setting: the altitude $h \in [h_{\min}, h_{\max}]$, the horizontal variable $x \in [x_{\min}, x_{\max}]$, the inclination angle $\gamma \in [\gamma_{\min}, \gamma_{\max}]$.
Control constraints: the angular speed of the angle of attack satisfies $u \in U = [u_{\min}, u_{\max}]$.
⇒ The state variable takes values in a closed set $K$ and the control variable in a compact set $U$.
Define the set of controls by
$$\mathcal U := \{u : (0,T) \to \mathbb R^k \text{ measurable},\ u(t) \in U \text{ a.e.}\}.$$

Physical model: Numerical data, Boeing B 727
The forces acting on the center of gravity of the plane take the following forms:
$$F_T := F_T(V) = A_0 + A_1 V + A_2 V^2,\qquad F_L := F_L(V,\alpha) = \tfrac12 \rho V^2 S\, c_l(\alpha),\qquad F_D := F_D(V,\alpha) = \tfrac12 \rho V^2 S\, c_d(\alpha),$$
where
$$c_d(\alpha) := B_0 + B_1\alpha + B_2\alpha^2,\qquad
c_l(\alpha) := \begin{cases} C_0 + C_1\alpha, & \alpha \le \alpha_*,\\ C_0 + C_1\alpha + C_2(\alpha-\alpha_*)^2, & \alpha_* \le \alpha \le \alpha_{\max}.\end{cases}$$
The control belongs to the interval $u(\cdot) \in [-3.0\ \mathrm{deg}, 3.0\ \mathrm{deg}]$.
⇒ Locally Lipschitz continuous functions with polynomial growth.

Physical model: Numerical data, Boeing B 727
The wind velocity components:
$$w_x(x) = kA(x),\qquad w_h(x,h) = k\,\frac{h}{h_*}\,B(x), \qquad (1)$$
where $A(x)$ and $B(x)$ are functions depending only on the variable $x$.
[Figure: Horizontal and vertical wind components.]

Physical model: Optimal control problem
Aim: Let $y$ be a given starting point. Maximize the minimal altitude over a time interval $[0,T]$, i.e.,
$$\sup\Big\{ \min_{\theta\in[0,T]} h(\theta) \ :\ u \in \mathcal U \text{ and } y^u(s) \in K,\ \forall s \in [0,T] \Big\}.$$
Instead of taking the lowest altitude as the cost function, one can optimize the peak value, over the time interval, of the difference between a reference altitude $H_r$ and the instantaneous altitude. Define the function $\Phi$ by
$$\Phi(y) = H_r - h.$$
Then the following control problem can be considered:
$$\inf\Big\{ \max_{\theta\in[0,T]} \Phi(y^u(\theta)) \ :\ u \in \mathcal U \text{ and } y^u(s) \in K,\ \forall s \in [0,T] \Big\}.$$

Physical model: Optimal control problem
In the case when $H_r - h(t) \ge 0$ for all $t \in [0,T]$, we have
$$\lim_{k\to\infty} \Big( \int_0^T (H_r - h(t))^k\,dt \Big)^{1/k} = \sup_{\theta\in[0,T]} (H_r - h(\theta)).$$
Then one can also study the following control problem with Bolza cost,
$$\inf\Big\{ \int_0^T r\,\Phi(y^u(s))^q\,ds \ :\ u \in \mathcal U \text{ and } y^u(s) \in K,\ \forall s \in [0,T] \Big\},$$
where $r$ and $q$ are positive constants.
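The limit above can be checked numerically. The following sketch approximates the $L^k$ norm of a nonnegative gap function by a Riemann sum and compares it with the maximum as $k$ grows. The gap function used here is an arbitrary stand-in for $H_r - h(t)$, chosen only for illustration.

```python
import numpy as np

def lk_norm(values, dt, k):
    """Riemann-sum approximation of (integral of values**k)**(1/k).

    The maximum is factored out so that large k does not overflow.
    """
    vmax = values.max()
    if vmax == 0.0:
        return 0.0
    return vmax * (np.sum((values / vmax) ** k) * dt) ** (1.0 / k)

# Illustrative data only: a smooth nonnegative function on [0, 1].
t = np.linspace(0.0, 1.0, 1001)
gap = 1.0 + np.sin(2.0 * np.pi * t) ** 2
dt = t[1] - t[0]

norms = [lk_norm(gap, dt, k) for k in (1, 10, 100, 1000)]
peak = gap.max()
```

On this example the values in `norms` increase with `k` and approach `peak`, which is why a Bolza cost with a large exponent can serve as a surrogate for the maximum cost.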

Physical model: Abort landing problem in presence of windshear, references
1. A. Miele, T. Wang and W. Melvin: Quasi-steady flight to quasi-steady flight transition for abort landing in a windshear: Trajectory optimization and guidance. 58(2):165-207, 1988.
2. A. Miele, T. Wang, C. Y. Tzeng and W. Melvin: Optimal abort landing trajectories in the presence of windshear. 55(2):165-202, 1987.
3. R. Bulirsch, F. Montrone and H. J. Pesch: Abort landing in the presence of windshear as a minimax optimal control problem. I. Necessary conditions. J. Optim. Theory Appl., 70(1):1-23, 1991.
4. R. Bulirsch, F. Montrone and H. J. Pesch: Abort landing in the presence of windshear as a minimax optimal control problem. II. Multiple shooting and homotopy. J. Optim. Theory Appl., 70(2):223-254, 1991.

Setting of the problem
For a given non-empty compact subset $U$ of $\mathbb R^k$ and a finite time $T > 0$, define the set of admissible controls to be
$$\mathcal U := \{u : (0,T) \to \mathbb R^k \text{ measurable},\ u(t) \in U \text{ a.e.}\}.$$
Consider the following control system:
$$\dot y(s) := f(y(s), u(s)) \ \text{ a.e. } s \in [0,T],\qquad y(0) := y, \qquad (2)$$
where $u \in \mathcal U$ and the function $f$ satisfies the following:
(H1): $f$ is continuous on $K \times U$ and locally Lipschitz continuous in the variable $y$.
(H2): For all $y \in K$, $f(y,U) = \{f(y,u),\ u \in U\}$ is a convex set.

Setting of the problem
The problem is embedded in a wide class of state-constrained control problems with maximum cost:
$$\vartheta(t,y) := \inf\Big\{ \max_{\theta\in[0,t]} \Phi(y^u_y(\theta)) \ :\ u \in \mathcal U,\ y^u_y(s) \in K,\ s \in [0,t] \Big\},$$
where the cost function $\Phi(y)$ is assumed to be Lipschitz continuous.

Setting of the problem: Some references
OCPs with maximum cost:
- Barron and Ishii: approximation by a sequence of optimal control problems with $L^p$ cost.
- Quincampoix and Serea: a viability approach for optimal control with infimum cost.
State-constrained OCPs:
- "Controllability conditions" on the coefficients of the ODE and the set of constraints: Soner ('86); Frankowska and Vinter ('00); Hermosilla and Zidani ('15).
- Viability theory: Aubin ('11); Cardaliaguet et al. ('97), ('00).
- Characterization without any controllability assumption: Altarovici et al. ('13).

Auxiliary control problem
Introduce the following augmented dynamics $\hat f$ for $u \in U$ and $\hat y := (y,z) \in \mathbb R^d \times \mathbb R$:
$$\hat f((y,z),u) = \begin{pmatrix} f(y,u) \\ 0 \end{pmatrix}.$$
Let $\hat y(s) := \hat y^u_{y,z}(s) := (y^u_y(s), z^u_{y,z}(s))$ (where $z^u_{y,z}(\cdot) := z$) be the associated augmented solution of:
$$\dot{\hat y}(s) = \hat f(\hat y(s), u(s)),\quad s \in (0,T), \qquad (3a)$$
$$\hat y(0) = (y,z)^T. \qquad (3b)$$
Define the corresponding set of feasible trajectories,
$$S_{[0,T]}(\hat y) := \{\hat y = (y^u_y, z^u_{y,z}) \ :\ \hat y \text{ satisfies (3) for some } u \in \mathcal U\},$$
for $\hat y = (y,z) \in \mathbb R^d \times \mathbb R$.

Auxiliary control problem
Let $g$ be a Lipschitz continuous function such that:
$$\forall y \in \mathbb R^d,\quad g(y) \le 0 \iff y \in K.$$
Consider the following auxiliary control problem (AOCP), defined by its value function:
$$w(t,y,z) := \inf_{\hat y := (y,z) \in S_{[0,t]}(\hat y)}\ \max_{\theta\in[0,t]} \Psi(y(\theta), z),$$
where $\Psi(y,z) := (\Phi(y) - z) \vee g(y)$, with $a \vee b = \max(a,b)$.

Auxiliary control problem
Proposition. Let $(t,y,z) \in [0,T] \times K \times \mathbb R$. The value function $w$ is related to $\vartheta$ by the following relations:
(i) $\vartheta(t,y) - z \le 0 \iff w(t,y,z) \le 0$,
(ii) $\vartheta(t,y) = \inf\{ z \in \mathbb R \ :\ w(t,y,z) \le 0 \}$.
⇒ The epigraph of $\vartheta$ at time $t$ is described by the function $w$:
$$\mathrm{Epi}(\vartheta(t,\cdot)) = \{ (y,z) \in K \times \mathbb R \ :\ w(t,y,z) \le 0 \}.$$
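Relation (ii) suggests a simple numerical recipe: once $w(t,y,\cdot)$ is known on a grid in $z$, the value $\vartheta(t,y)$ is approximated by the smallest grid value where $w \le 0$. The sketch below illustrates this on made-up data. The toy choice $w = \vartheta - z$ corresponds to a problem without state constraints and is an assumption of the example, as are the function and variable names.

```python
import numpy as np

def theta_from_w(z_grid, w_values):
    """Smallest grid value z with w(t, y, z) <= 0, or None if there is none.

    z_grid must be sorted in increasing order.
    """
    feasible = np.flatnonzero(w_values <= 0.0)
    return float(z_grid[feasible[0]]) if feasible.size else None

# Illustrative data only: pretend theta(t, y) = 0.5 and w = theta - z.
z_grid = np.linspace(-2.0, 2.0, 401)
theta_true = 0.5
w_values = theta_true - z_grid
theta_est = theta_from_w(z_grid, w_values)
```

The estimate is accurate up to the spacing of the $z$ grid, so the resolution in $z$ directly limits the accuracy of the recovered value function.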

Auxiliary control problem
Let us define the following Hamiltonian, for all $y, p \in \mathbb R^d$:
$$H(y,p) := \sup_{u\in U} \big( -f(y,u)\cdot p \big).$$
Proposition. The function $w$ is the unique locally Lipschitz continuous viscosity solution of the following HJ equation:
$$\min\big( \partial_t w(t,y,z) + H(y,\nabla_y w),\ w(t,y,z) - \Psi(y,z) \big) = 0 \quad \text{in } ]0,T] \times \mathbb R^d \times \mathbb R,$$
$$w(0,y,z) = \Psi(y,z) \quad \text{in } \mathbb R^d \times \mathbb R.$$
⇒ This HJB equation is posed on the whole space.
⇒ Outside the set $K$, the function $f$ is not known.
⇒ This is inconvenient for numerical purposes.

Auxiliary control problem: Particular choice of the obstacle function g
Let $c > 0$ and define the following extended set:
$$\tilde K := K + cB_1.$$
Let $g(y) := d_K(y) \wedge c$ for all $y \in \mathbb R^d$, where $d_K$ is the signed distance to $K$. Define
$$\Psi(y,z) = \big( (\Phi(y) - z) \vee g(y) \big) \wedge c.$$
Let $w$ be defined by:
$$w(t,y,z) := \inf_{\hat y = (y,z) \in S_{[0,t]}(\hat y)}\ \max_{\theta\in[0,t]} \Psi(y(\theta), z).$$
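For box constraints such as the intervals on $x$, $h$, $V$, $\gamma$ and $\alpha$ listed earlier, the signed distance has a closed form. The sketch below assumes that $K$ is an axis-aligned box and uses the Euclidean norm; the function names are chosen for this illustration only.

```python
import numpy as np

def signed_distance_box(y, lower, upper):
    """Signed Euclidean distance to the box [lower, upper] (negative inside)."""
    y, lower, upper = (np.asarray(v, dtype=float) for v in (y, lower, upper))
    center = (lower + upper) / 2.0
    half = (upper - lower) / 2.0
    q = np.abs(y - center) - half
    outside = np.linalg.norm(np.maximum(q, 0.0))   # distance when y is outside
    inside = min(float(np.max(q)), 0.0)            # minus distance to the boundary
    return float(outside + inside)

def g_truncated(y, lower, upper, c):
    """Obstacle function g = min(d_K, c) for a box K (assumed shape of K)."""
    return min(signed_distance_box(y, lower, upper), c)
```

By construction the sketch satisfies $g(y) \le 0$ exactly when $y$ lies in the box, and $g$ is Lipschitz continuous with constant 1.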

Auxiliary control problem: Particular choice of the obstacle function g
Theorem. The function $w$ is the unique locally Lipschitz continuous viscosity solution of the following Hamilton Jacobi equation:
$$\min\big( \partial_t w(t,y,z) + H(y,\nabla_y w),\ w(t,y,z) - \Psi(y,z) \big) = 0 \quad \text{in } [0,T] \times \mathring{\tilde K} \times \mathbb R,$$
$$w(0,y,z) = \Psi(y,z) \quad \text{in } \tilde K \times \mathbb R,$$
$$w(t,y,z) = c \quad \text{in } [0,T] \times \partial\tilde K \times \mathbb R.$$

Auxiliary control problem: Particular choice of the obstacle function g
When $K$ is compact or $\Phi$ is bounded, we have $\Phi(y) \in [m,M]$ for all $y \in K$.
Theorem. The value function $w$ is the unique locally Lipschitz continuous viscosity solution of the following HJ equation:
$$\min\big( \partial_t w(t,y,z) + H(y,\nabla_y w),\ w(t,y,z) - \Psi(y,z) \big) = 0 \quad \text{in } [0,T] \times \mathring{\tilde K} \times [m,M],$$
$$w(0,y,z) = \Psi(y,z) \quad \text{in } \tilde K \times [m,M],$$
$$w(t,y,z) = c \quad \text{in } [0,T] \times \partial\tilde K \times [m,M].$$
In addition, the value function $w$ describes the epigraph of $\vartheta$:
$$\vartheta(t,y) = \inf\{ z \in [m,M] \ :\ w(t,y,z) \le 0 \}.$$

Auxiliary control problem: Link with a viability problem
- To handle the state constraints, an auxiliary control problem free of constraints is introduced, with an additional variable $z$.
- The value function $w$ describes the epigraph of $\vartheta$.
- Is there a link between the optimal trajectories of the two control problems?
- Let $(y,z)$ be a given initial point and let the time $t$ vary.

Auxiliary control problem: Link with a viability problem
First, let us define the following set:
$$D := \{ \hat y = (y,z) \in \mathbb R^{d+1} \ :\ y \in K \text{ and } \hat y \in \mathrm{Epi}(\Phi) \}.$$
Let us also define the exit time function $\mathcal T : \mathbb R^{d+1} \to [0,T]$ by
$$\mathcal T(y,z) := \sup\{ t \in [0,T] \ :\ \exists u \in \mathcal U \text{ s.t. } \hat y^u_{y,z}(\theta) \in D,\ \forall \theta \in [0,t] \}.$$
Proposition.
(i) $\mathcal T(y,z) = \sup\{ t \in [0,T] \ :\ w(t,y,z) \le 0 \}$,
(ii) $\mathcal T(y,z) = t \Rightarrow w(t,y,z) = 0$,
(iii) $\vartheta(t,y) = \inf\{ z \ :\ \mathcal T(y,z) \ge t \}$.

Reconstruction of optimal trajectories: Link between optimal trajectories
Let $y \in K$ be such that $\vartheta(T,y) < \infty$. Define $z := \vartheta(T,y)$.
- $\hat y^* = (y^*, z^*)$ is optimal for the (AOCP) ⇒ $y^*$ is optimal for the (OCP).
- $\hat y^* = (y^*, z^*)$ is optimal for the exit time problem ⇒ $\hat y^*$ is optimal for the (AOCP).
⇒ Optimal trajectories of the OCP can be obtained from:
- The value function $\vartheta$.
- The auxiliary value function $w$.
- The exit time function $\mathcal T$.

Reconstruction of optimal trajectories: Recall of the DPP for $w$ and $\mathcal T$
For any $t \in [0,T]$ and $h \ge 0$ such that $t + h \le T$,
$$w(t+h,y,z) = \inf_{\alpha\in\mathcal U} \Big\{ w(t, y^\alpha(h), z) \vee \max_{\theta\in[0,h]} \Psi(y^\alpha(\theta), z) \Big\}.$$
For all $(y,z) \in \mathbb R^{d+1}$,
$$\mathcal T(y,z) = \sup_{\alpha\in\mathcal U} \big\{ (\mathcal T(y^\alpha(t), z) + t) \wedge T \big\},\quad \forall t \in [0, \mathcal T(y,z)].$$

Reconstruction of optimal trajectories
Algorithm A. Set $z^n(\cdot) := z$ and $y^n(t_0) = y$ (where $\vartheta(t_n, y) = z$).
[Step 1] Knowing the state $y^n(t_k)$, choose the optimal control at time $t_k$ such that:
$$u^n_k \in \arg\min_{u\in U} \Big\{ w\big(t_{n-k-1},\ y^n(t_k) + h f(y^n(t_k),u),\ z\big) \vee \max_{\theta\in[0,h]} \Psi\big(y^n(t_k) + \theta f(y^n(t_k),u),\ z\big) \Big\}.$$
[Step 2] Define $u^n(t) := u^n_k$ for all $t \in (t_k, t_{k+1}]$, and $y^n(t)$ on $(t_k, t_{k+1}]$ as the solution of
$$\dot y(t) := f(y(t), u^n(t)) \ \text{ a.e. } t \in (t_k, t_{k+1}],$$
with initial condition $y^n(t_k)$ at $t_k$, and $z^n(\cdot) := z$.
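The two steps of Algorithm A can be sketched as follows. This is only an illustration under several assumptions that are not in the slides: the control set is replaced by a finite sample, the maximum over $\theta \in [0,h]$ is taken over a few sample points, and an explicit Euler step stands in for the ODE solution of Step 2. The callables `w`, `f` and `psi` and the toy problem at the end are hypothetical.

```python
import numpy as np

def reconstruct_A(w, f, psi, controls, y0, z, T, n, n_sub=5):
    """Piecewise-constant control reconstruction in the spirit of Algorithm A.

    w(t, y, z), f(y, u), psi(y, z) are user-supplied callables and
    `controls` is a finite sample of the control set U.
    """
    h = T / n
    y = np.asarray(y0, dtype=float)
    thetas = np.linspace(0.0, h, n_sub + 1)      # sample points in [0, h]
    path, chosen = [y.copy()], []
    for k in range(n):
        t_left = (n - k - 1) * h                 # remaining horizon t_{n-k-1}
        best_u, best_val = None, np.inf
        for u in controls:
            step = np.asarray(f(y, u), dtype=float)
            running = max(psi(y + th * step, z) for th in thetas)
            val = max(w(t_left, y + h * step, z), running)
            if val < best_val:
                best_u, best_val = u, val
        y = y + h * np.asarray(f(y, best_u), dtype=float)   # Euler step
        path.append(y.copy())
        chosen.append(best_u)
    return np.array(path), chosen

# Toy problem (made up): dy/dt = -1 + u with u in {-1, 0, 1}, Phi(y) = -y,
# no state constraint, so that w(t, y, z) = -y - z.
f_toy = lambda y, u: np.array([-1.0 + u])
psi_toy = lambda y, z: -y[0] - z
w_toy = lambda t, y, z: -y[0] - z
path, chosen = reconstruct_A(w_toy, f_toy, psi_toy, [-1.0, 0.0, 1.0],
                             y0=[1.0], z=-1.0, T=1.0, n=10)
```

In the toy problem the only way to avoid increasing the cost $-y$ is to cancel the drift, so the sketch selects $u = 1$ at every step and the state stays at its initial value.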

Reconstruction of optimal trajectories
Theorem. Let $\{y^n(\cdot), z^n(\cdot), u^n(\cdot)\}$ be a sequence generated by Algorithm A for $n \ge 1$. Then the sequence of trajectories $\{y^n(\cdot), z^n(\cdot)\}_n$ has cluster points with respect to the uniform convergence topology. For any cluster point $(\bar y(\cdot), \bar z(\cdot))$ there exists a control law $\bar u(\cdot)$ such that $(\bar y(\cdot), \bar z(\cdot), \bar u(\cdot))$ is optimal for the auxiliary control problem.
⇒ Is there a convergence result for Algorithm A if an approximate solution $W$ is used instead of $w$?

Reconstruction of optimal trajectories: Reconstruction of trajectories
Let $W$ be a numerical approximation of $w$ such that
$$|W(t,y,z) - w(t,y,z)| \le E(\Delta t, \Delta y, \Delta z), \qquad (7)$$
where $E(\Delta t, \Delta y, \Delta z) \to 0$ as $\Delta t, \Delta y, \Delta z \to 0$.
Theorem. Let $\{y^n(\cdot), z^n(\cdot), u^n_1(\cdot)\}$ and $\{Y^n(\cdot), Z^n(\cdot), u^n_2(\cdot)\}$ be the sequences generated by Algorithm A for $w$ and $W$ respectively. Then, for all $\varepsilon > 0$, we have the following estimate:
$$\max_{\theta\in[0,T]} \Psi(Y^n(\theta), z) - \max_{\theta\in[0,T]} \Psi(y^n(\theta), z) \le 3T\varepsilon + (2n+3)\,E(\Delta t, \Delta y, \Delta z).$$
Examples: for the semi-Lagrangian or the finite difference method, it suffices to take
$$\Delta t^{1/2} + \Delta y + \Delta z = o\big(\tfrac1n\big) \quad \text{as } n \to \infty.$$

Reconstruction of optimal trajectories: Reconstruction by the exit time function $\mathcal T$
Algorithm B. Set $z^n(\cdot) := z$ and $y^n(t_0) = y$ (where $\vartheta(t_n, y) = z$).
[Step 1] Knowing the state $y^n(t_k)$, choose the optimal control at $t_k$:
$$u^n_k \in \arg\max_{u\in U} \Big\{ \big( [\mathcal T]\big(y^n(t_k) + h f(y^n(t_k),u),\ z\big) + h \big) \wedge T \Big\},$$
where $[A]$ denotes the linear interpolation of $A$.
[Step 2] The next point of the state, corresponding to the maximizing value $u(t_k) = u^n_k$, is:
$$y^n(t_{k+1}) := y^n(t_k) + h f(y^n(t_k), u(t_k)),$$
and $z^n(\cdot) := z$.

Reconstruction of optimal trajectories: Numerical schemes
Finite difference scheme:
$$
\begin{cases}
W^{n+1}_{I,j} = \max\big( W^n_{I,j} + \Delta t\,\mathcal H(y_I, D^+W^n(y_I,z_j), D^-W^n(y_I,z_j)),\ \varphi_{I,j} \big),\\
W^N_{I,j} = \varphi_{I,j},
\end{cases}
$$
where the discrete space gradient of the function $W^n$ at the point $(y_I, z_j)$ is
$$D^\pm W^n(y_I,z_j) = \big( D^\pm_{y_1} W^n(y_I,z_j), \dots, D^\pm_{y_d} W^n(y_I,z_j) \big),$$
and $\mathcal H$ is a numerical Hamiltonian that can be approximated by an ENO method.
Semi-Lagrangian scheme:
$$
\begin{cases}
W^{n+1}_{I,j} = \min_{a\in U} [W^n]\big( y_I + f(y_I,a)\Delta t,\ z_j \big) \vee \varphi_{I,j},\\
W^N_{I,j} = \varphi_{I,j}.
\end{cases}
$$
Same error estimates for both schemes.
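A minimal one-dimensional sketch of the semi-Lagrangian update is given below. It makes assumptions beyond the slide: the iteration is started from $W^0 = \Psi(\cdot, z)$, in line with the initial condition $w(0,y,z) = \Psi(y,z)$ of the HJ equation, the control set is a finite sample, and `numpy.interp` provides the linear interpolation (it clamps values outside the grid, which acts as a crude boundary treatment). Since $z$ is constant along trajectories, the sketch treats one value of $z$ at a time.

```python
import numpy as np

def semi_lagrangian(psi, f, controls, y_grid, z, T, n_steps):
    """Iterate W <- max(min_a [W](y + dt * f(y, a)), psi) on a 1D grid."""
    dt = T / n_steps
    obstacle = psi(y_grid, z)
    W = obstacle.copy()                          # assumed start: W^0 = Psi(., z)
    for _ in range(n_steps):
        # Value at the foot of each characteristic, one array per control.
        candidates = [np.interp(y_grid + dt * f(y_grid, a), y_grid, W)
                      for a in controls]
        W = np.maximum(np.min(candidates, axis=0), obstacle)
    return W

# Toy problem (made up): dy/dt = 1 + 0.5 a with a in {-1, 0, 1}, Phi(y) = y,
# no state constraint. The exact value is w(t, y, z) = y + 0.5 t - z.
f_toy = lambda y, a: 1.0 + 0.5 * a + 0.0 * y
psi_toy = lambda y, z: y - z
y_grid = np.linspace(0.0, 4.0, 401)
W_T = semi_lagrangian(psi_toy, f_toy, [-1.0, 0.0, 1.0], y_grid, z=0.0,
                      T=1.0, n_steps=50)
```

Away from the right end of the grid, where clamping perturbs the values, `W_T` agrees with the exact solution of the toy problem because linear interpolation reproduces affine functions.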


Numerical simulations: Comparison with the Bulirsch et al. paper
[Figure: History of the states for the control problems with Bolza cost and maximum cost.]

Numerical simulations: Reconstruction using the value function
[Figure: History of the state components for the maximum-cost problem, using the reconstruction by the exit time function and by the value function.]

Numerical simulations: Reconstruction using the value function
The control is unstable.
[Figure: History of the control for the maximum-cost problem, using the reconstruction by the exit time function and by the value function.]

Numerical simulations: Reconstruction using the value function and the penalization of the control variation
Algorithm C.
[Step 1] Let $\lambda$ be a positive constant. Knowing the state $y^n(t_k)$, choose the optimal control at $t_k$:
$$u^n_k = \arg\min_{u\in U} \Big\{ w\big(t_{n-k-1},\ y^n(t_k) + h f(y^n(t_k),u),\ z\big) \vee \max_{\theta\in[0,h]} \Psi\big(y^n(t_k) + \theta f(y^n(t_k),u),\ z\big) + \lambda |u - u^n_{k-1}| \Big\}.$$
[Step 2] Define $u(t_k) = u^n_k$. Then the next point is:
$$y^n(t_{k+1}) := y^n(t_k) + h f(y^n(t_k), u(t_k)).$$

Numerical simulations: Reconstruction using the value function and the penalization of the control variation
[Figure: Reconstruction of the optimal feedback control (speed of the angle of attack) for the control problem with maximum cost, with three values of the penalization parameter, λ = 0.0, 1.0 and 2.0.]

Numerical simulations: Reconstruction using the Hamiltonian
Algorithm D.
[Step 1] For $1 \le k \le n$, let $y_k := y^n(t_k)$. Compute the space gradient $D^\pm W(t_k, y_k, z)$ of the function $W$ at the point $(t_k, y_k, z)$, then compute the optimal control at $t_k$:
$$a_k = \arg\min_{u} H_{num}\big(u, y_k, D^+W(t_k,y_k,z), D^-W(t_k,y_k,z)\big), \qquad (8)$$
where $H_{num}$ is the numerical Hamiltonian.
[Step 2] Define $u(t_k) = a_k$. Then the next point is:
$$y^n(t_{k+1}) := y^n(t_k) + h f(y^n(t_k), u(t_k)).$$

Numerical simulations: Reconstruction with different methods
Table: The optimality criterion at time t = T obtained with several algorithms.
Algorithm      Minimal altitude
Algorithm A    5.178e+2
Algorithm B    5.063e+2
Algorithm C    5.171e+2
Algorithm D    5.134e+2
[Figure: History of the trajectory in the (x, h) plane for the control problem with maximum cost, using several methods of reconstruction.]

Numerical simulations: Feedback control analysis
[Figure: History of the control for the control problem with maximum cost, using different methods of reconstruction.]

Thank you

Backward reachable set under probability of success
Outline
1. Abort landing problem in presence of windshear.
- Setting of the physical model.
- State-constrained OCP with maximum cost.
- Reconstruction of optimal trajectories.
2. Backward reachable sets under probability of success.
- Setting of the problem. Basic assumptions.
- Stochastic OCP with unbounded and discontinuous cost.
- Numerical simulations.

Setting of the problem
Let $(\Omega, \mathcal F, \{\mathcal F_t\}_{t\ge0}, \mathbb P)$ be a filtered probability space. Let $T > 0$ be a time horizon and consider the following SDE:
$$dX(s) = b(s,X(s),u(s))\,ds + \sigma(s,X(s),u(s))\,dW(s),\quad s \in (t,T],\qquad X(t) = x,$$
where $u \in \mathcal U := \{\text{progressively measurable processes valued in } U\}$ and
- $W(\cdot)$: $p$-dimensional Brownian motion;
- $U \subset \mathbb R^m$: compact set of control values;
- $b$ and $\sigma$: Lipschitz continuous.
$X^u_{t,x}(\cdot)$: unique strong solution associated with the control $u$.

Setting of the problem
Aim: Let $C$ be a non-empty subset of $\mathbb R^d$. Let $\rho \in [0,1[$ and $t \le T$. Consider the backward reachable set under probability of success $\rho$:
$$\Omega^\rho_t = \{ x \in \mathbb R^d \ :\ \exists u \in \mathcal U,\ \mathbb P[X^u_{t,x}(T) \in C] > \rho \}. \qquad (9)$$
Indeed, it is straightforward to see that $\Omega^\rho_t$ can be written as:
$$\Omega^\rho_t = \{ x \in \mathbb R^d \ :\ \exists u \in \mathcal U,\ \mathbb E[\mathbf 1_C(X^u_{t,x}(T))] > \rho \}.$$
Characterization by the level-set approach:
$$\sup\{ \mathbb E[\mathbf 1_C(X^u_{t,x}(T))] \ :\ u \in \mathcal U \}.$$
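To make the definition concrete, the sketch below estimates the success probability by Monte Carlo in the simplest possible case, which is an assumption of this example and not a case treated in the slides: an uncontrolled scalar process $dX = \sigma\,dW$ and an interval target. In that case the probability is also available in closed form through the normal distribution function, which gives a reference value.

```python
import math
import numpy as np

def success_probability_mc(x, t, T, sigma, c_low, c_high, n_samples, rng):
    """Monte Carlo estimate of P[X(T) in [c_low, c_high]], dX = sigma dW, X(t) = x."""
    x_T = x + sigma * math.sqrt(T - t) * rng.standard_normal(n_samples)
    return float(np.mean((x_T >= c_low) & (x_T <= c_high)))

def success_probability_exact(x, t, T, sigma, c_low, c_high):
    """Closed-form value of the same probability via the normal CDF."""
    s = sigma * math.sqrt(T - t)
    cdf = lambda v: 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
    return cdf((c_high - x) / s) - cdf((c_low - x) / s)

# Illustrative parameters only.
rng = np.random.default_rng(0)
p_mc = success_probability_mc(1.2, 0.0, 1.0, 0.5, -1.0, 1.0, 200_000, rng)
p_exact = success_probability_exact(1.2, 0.0, 1.0, 0.5, -1.0, 1.0)
in_reachable_set = p_mc > 0.3        # membership test for rho = 0.3
```

For a controlled process the supremum over controls makes such pointwise sampling impractical, which is one motivation for the level-set characterization through a value function.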

Some references
Backward reachable sets under probability of success:
- Discrete time problems: Abate et al. ('07), ('08).
- Continuous time problems: Föllmer and Leukert ('99), Bouchard et al. ('08) (finance setting).
Error estimates for second order HJB equations with bounded and Lipschitz cost:
- Barles and Jakobsen ('02), ('05) and ('07).
- Debrabant and Jakobsen ('13).

Setting of the problem
Introduce the following optimal control problem:
$$\vartheta(t,x) := \sup_{u\in\mathcal U} \mathbb E\big[\mathbf 1_C(X^u_{t,x}(T))\big] \equiv \sup_{u\in\mathcal U} \mathbb P\big[X^u_{t,x}(T) \in C\big]. \qquad (10)$$
It is then straightforward to show the following:
Proposition. Let $\vartheta$ be defined by (10). Then, for all $t \in [0,T]$:
$$\Omega^\rho_t = \{ x \in \mathbb R^d \ :\ \vartheta(t,x) > \rho \}.$$
$\vartheta$ is upper semicontinuous.

Stochastic OCP with unbounded and discontinuous cost
The problem is embedded in the following class of stochastic control problems (SOCP):
$$\sup\{ \mathbb E[\Phi(X^u_{t,x}(T))] \ :\ u \in \mathcal U \}.$$
- $\Phi$ is a discontinuous and unbounded function.
- Error estimates for the numerical approximation of the associated value function.

Setting of the problem. Basic assumptions
Let the corresponding value function be defined by:
$$\vartheta(t,x) := \sup\{ \mathbb E[\Phi(X^u_{t,x}(T))] \ :\ u \in \mathcal U \},$$
where $\Phi$ is measurable with linear growth. Define the following regularized value function:
$$\vartheta^\varepsilon(t,x) := \sup\{ \mathbb E[\Phi^\varepsilon(X^u_{t,x}(T))] \ :\ u \in \mathcal U \}.$$
- It is known that $\vartheta^\varepsilon \to \vartheta$ pointwise.
- An error estimate of the form $|\vartheta^\varepsilon(t,x) - \vartheta(t,x)| \le\ ??$ is not classical.

Setting of the problem. Basic assumptions
Assumptions:
- The function $\Phi^\varepsilon$ is $L_\varepsilon$-Lipschitz continuous.
- Let $D_\varepsilon := \{ x \in \mathbb R^d \mid \Phi^\varepsilon(x) \ne \Phi(x) \}$. There exists $M_1 > 0$ such that for any $A > 0$, $\mu(D_\varepsilon \cap B_A) \le M_1 A\,\varepsilon$.
- $\sigma$ depends only on $(t,x)$ and there exists a real number $\Lambda \ge 1$ such that:
$$\forall (t,x) \in (0,T) \times \mathbb R^d,\quad \Lambda I_d \ge \sigma(t,x)\sigma(t,x)^T \ge \Lambda^{-1} I_d,$$
where $I_d$ is the identity matrix.

Error estimate for the regularized procedure
Theorem. There exist a constant $C_0 > 0$ and $\varepsilon_0 \in\ ]0,1]$ such that for every $0 < \varepsilon < \varepsilon_0$ the following estimate holds:
$$|\vartheta(t,x) - \vartheta^\varepsilon(t,x)| \le C_0\,\varepsilon\,\frac{1 + |x|^2 + |\log\varepsilon|}{(T-t)^{d/2}},$$
for every $0 \le t < T$ and $x \in \mathbb R^d$.
Elements of the proof:
- Aronson type estimates: there exist $c_1, c_2, c_3 > 0$ such that for any $t, s \in [0,T]$ with $t < s$, any $x, y \in \mathbb R^d$ and any admissible control $u \in \mathcal U$, the following estimate holds:
$$|p^u(t,x;s,y)| \le \frac{c_1}{(s-t)^{d/2}}\, e^{-c_2 \frac{|x-y|^2}{2(s-t)}}\, e^{c_3|x|^2}.$$
- Some classical arguments ...

Error estimate for the regularized procedure: Some properties of $\vartheta^\varepsilon$
Lemma. There exists a constant $C > 0$ such that for every $\varepsilon > 0$, the value function $\vartheta^\varepsilon$ satisfies:
$$|\vartheta^\varepsilon(t,x) - \vartheta^\varepsilon(t,y)| \le C L_\varepsilon |x - y|,\quad \text{for all } x, y \in \mathbb R^d,\ t \in [0,T].$$
Moreover,
$$|\vartheta^\varepsilon(t,x) - \vartheta^\varepsilon(s,x)| \le C L_\varepsilon (1 + |x|)\,|t - s|^{1/2},\quad \text{for all } x \in \mathbb R^d,\ t, s \in [0,T].$$

Error estimate for the regularized procedure: Some properties of $\vartheta^\varepsilon$
The function $\vartheta^\varepsilon$ is the unique continuous viscosity solution, with linear growth, of the following HJB equation:
$$-\partial_t \vartheta^\varepsilon + H(t,x,D\vartheta^\varepsilon, D^2\vartheta^\varepsilon) = 0 \quad \text{in } (0,T) \times \mathbb R^d, \qquad (11a)$$
$$\vartheta^\varepsilon(T,x) = \Phi^\varepsilon(x) \quad \text{in } \mathbb R^d, \qquad (11b)$$
where $H$ denotes the Hamiltonian function defined by:
$$H(t,x,p,Q) := \inf_{u\in U} \Big\{ -\tfrac12 \mathrm{Tr}\big(\sigma(t,x,u)\sigma^T(t,x,u)\,Q\big) - b(t,x,u)\cdot p \Big\},$$
for every $t \in [0,T]$, $x \in \mathbb R^d$, $p \in \mathbb R^d$ and every symmetric $d \times d$ matrix $Q$.
Error estimates for the numerical approximations of (11)?

Error estimates for numerical approximations by a semi-Lagrangian scheme
Semi-discrete scheme: let $h = dt > 0$ denote a given time step, and consider a semi-discrete scheme defined as (for $x \in \mathbb R^d$):
$$V^N(x) = \phi(x), \qquad (12a)$$
$$V^{n-1}(x) = S^h(t_n, x, V^n),\quad \text{for every } n = N, \dots, 1, \qquad (12b)$$
with, for any $t \in [0,T]$, $x \in \mathbb R^d$, and any function $w : \mathbb R^d \to \mathbb R$,
$$S^h(t,x,w) := \frac{1}{2m} \max_{a\in U} \sum_{k=1}^{2m} w\big( x + h\,b(t,x,a) + \sqrt h\,\bar\sigma_k(t,x,a) \big),$$
where
$$\bar\sigma_k(t,x,a) := (-1)^k \sqrt m\, \sigma_{\lfloor \frac{k-1}{2} \rfloor}(t,x,a)$$
($\lfloor p \rfloor$ denotes the integer part of $p \in \mathbb R$ and $\sigma_k$ are the column vectors of the matrix $\sigma$).
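The operator $S^h$ can be sketched at a single point as follows. The sketch assumes that the control set is replaced by a finite sample, that `sigma(t, x, a)` returns a two-dimensional array whose columns are the $\sigma_k$, and that $m$ is the number of those columns; the function names are hypothetical. With zero-based column indexing, the column paired with the index $k$ is `(k - 1) // 2`.

```python
import numpy as np

def sl_operator(w, x, t, h, b, sigma, controls):
    """One evaluation of S^h(t, x, w) at a point x of R^d.

    b(t, x, a) returns a vector of length d; sigma(t, x, a) returns a
    (d, m) array; `controls` is a finite sample of the control set.
    """
    x = np.asarray(x, dtype=float)
    best = -np.inf
    for a in controls:
        drift = np.asarray(b(t, x, a), dtype=float)
        sig = np.asarray(sigma(t, x, a), dtype=float)
        m = sig.shape[1]
        total = 0.0
        for k in range(1, 2 * m + 1):
            column = sig[:, (k - 1) // 2]            # zero-based column index
            sigma_bar = (-1) ** k * np.sqrt(m) * column
            total += w(x + h * drift + np.sqrt(h) * sigma_bar)
        best = max(best, total / (2 * m))
    return best
```

A quick consistency check: for $w(x) = |x|^2$, zero drift and a constant matrix $\sigma$, the sketch returns $|x|^2 + h\,\mathrm{Tr}(\sigma\sigma^T)$, which matches a first-order expansion of the expected value of $w$ along the diffusion.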

Error estimates for numerical approximations by a semi-Lagrangian scheme
For any regular function $\phi \in C^{2,4}([0,T] \times \mathbb R^d)$, denote $\phi^n(x) = \phi(t_n, x)$ and define $E^n_\phi(x)$ as
$$E^n_\phi(x) := -\partial_t \phi(t_n,x) + H(t_n,x,D\phi,D^2\phi) - \frac{\phi^{n-1}(x) - S^h(t_n,x,\phi^n)}{h}.$$
Then there exists a constant $C \ge 0$ such that:
$$|E^n_\phi(x)| \le C \Big( \|\phi_{tt}\|_0 + \sum_{k=2,3,4} \|D^k\phi\|_0 \Big) (1 + |x|)^4\, h.$$

Error estimates for numerical approximations by a semi-Lagrangian scheme
Theorem. Let $\phi$ be a Lipschitz continuous function with Lipschitz constant $L_\phi$. There exists $C \ge 0$ such that, for all $n \in \{0, \dots, N\}$,
$$|V^n(x) - v(t_n,x)| \le C L_\phi (1 + |x|)^{7/4} h^{1/4}.$$
Elements of the proof: Let $(Q_n, Q_{n+1}, \dots, Q_k, \dots)$ be a sequence of i.i.d. random variables such that $\mathbb P[Q_i = k] = \frac{1}{2m}$ for all $i \ge n$ and $k \in \{1, \dots, 2m\}$. For a given $x \in \mathbb R^d$, a given $k \ge n \ge 0$ and a sequence of controls $a = (a_n, \dots, a_k, \dots)$ with $a_i \in U$:
- If $k = n$, $Z^{n,a}_{n,x} := x$.
- If $k \ge n$, $Z^{k+1,a}_{n,x} := Z^{k,a}_{n,x} + h\,b(t_k, Z^{k,a}_{n,x}, a_k) + \sqrt h\,\bar\sigma_{Q_k}(t_k, Z^{k,a}_{n,x}, a_k)$.

Error estimates for numerical approximations by a semi-Lagrangian scheme
The scheme can be written equivalently in the form
$$V^{n-1}(x) = \max_{a\in U} \mathbb E\big[ V^n(Z^{n+1,a}_{n,x}) \big]. \qquad (13)$$
Upper bound: shaking coefficient techniques combined with standard mollification arguments.
Lemma. For every $\eta > 0$ there exists a $C^\infty$ function $v_\eta$ such that $v_\eta$ is a classical super-solution to (11). Moreover, there exists $C > 0$ such that for every $\eta > 0$ the following estimates hold:
$$|v(t,x) - v_\eta(t,x)| \le C L_\phi (1 + |x|)\,\eta, \qquad (14a)$$
$$\Big| \frac{\partial^k v_\eta}{\partial t^k}(t,x) \Big| \le \frac{C L_\phi}{\eta^{2k-1}} (1 + |x|) \quad \text{and} \quad \Big\| \frac{\partial^k v_\eta}{\partial x^k} \Big\|_0 \le \frac{C L_\phi}{\eta^{k-1}}, \qquad (14b)$$
for any $k \ge 1$ and for every $(t,x) \in [0,T] \times \mathbb R^d$.
Lower bound: reversing the roles of the equation and the scheme. The key point is that the solution $V$ of the semi-discrete scheme is also Hölder continuous.

Error estimates for numerical approximations by a semi-Lagrangian scheme
Fully discrete scheme: for $n = N, \dots, 1$ and for all $x_i \in G$:
$$V^{n-1}_i = V^{n-1}(x_i) = \frac{1}{2m} \max_{a\in U} \sum_{k=1}^{2m} [V^n]\big( x_i + h\,b(t_n,x_i,a) + \sqrt h\,\bar\sigma_k(t_n,x_i,a) \big), \qquad (15)$$
where $[V^n]$ denotes the bilinear interpolation of $(V^n_i)$ on $(x_i)$, and with
$$V^N_i = V^N(x_i) = \phi(x_i),\quad \forall x_i \in G. \qquad (16)$$
Theorem. Let $v$ be the continuous solution, and let $V^\Delta$ be the numerical solution of the fully discrete scheme (15), with $\Delta = (h, \Delta x)$ the time and space steps. There exists $C > 0$ depending only on $T$, $L_0$ such that for every $R > 0$ we have:
$$\| v - V^\Delta \|_{L^\infty(B_R)} \le C L_\phi \Big( R^{7/4} h^{1/4} + \frac{|\Delta x|}{h} \Big).$$

Backward reachable sets under probability of success: Probabilistic reachability problem
Consider the following "regularized" control problem:
$$\vartheta^\varepsilon(t,x) := \sup_{u\in\mathcal U} \mathbb E[\phi^\varepsilon(X^u_{t,x}(T))].$$
[Figure: Regularization $\phi^\varepsilon$ of the indicator function $\mathbf 1_C$ for a given set $C$.]
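The slides do not give a formula for $\phi^\varepsilon$. One common choice, used here purely as an assumption, is a piecewise-linear ramp of width $\varepsilon$ outside $C$: $\phi^\varepsilon(x) = \max(0, 1 - d_C(x)/\varepsilon)$, where $d_C$ is the distance to $C$. The sketch below implements it for an interval target.

```python
import numpy as np

def phi_eps_interval(x, c_low, c_high, eps):
    """Piecewise-linear regularization of the indicator of [c_low, c_high].

    Equals 1 on the interval, 0 at distance >= eps from it, and is
    Lipschitz continuous with constant 1 / eps (assumed form of phi^eps).
    """
    x = np.asarray(x, dtype=float)
    dist = np.maximum(np.maximum(c_low - x, x - c_high), 0.0)  # distance to C
    return np.clip(1.0 - dist / eps, 0.0, 1.0)
```

With this choice, $\phi^\varepsilon$ differs from $\mathbf 1_C$ only on a layer of width $\varepsilon$ around $C$, and its Lipschitz constant grows like $1/\varepsilon$, which is the trade-off that the error estimates balance.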

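Backward reachable sets under probability of success: Probabilistic reachability problem
Theorem. Let $\vartheta^{\varepsilon,\Delta}$ be a numerical approximation of $\vartheta^\varepsilon$ based on the SL scheme. Assume there exists $A > 0$ such that $C^\varepsilon \setminus C \subset B_A$. Then there exists $C > 0$ such that, for all $\delta \in (0,T]$, $t \in [0, T - \delta]$ and $x \in B_R$ with $R > 1$, the following holds:
$$|\vartheta^{\varepsilon,\Delta}(t,x) - \vartheta(t,x)| \le C\,\frac{R^{7/4}}{\delta^{d/4}}\,\Delta x^{1/10}.$$
Therefore, we obtain the following approximation of $\Omega^\rho_t$, for $0 \le t \le T - \delta$:
$$\Big\{ x \ :\ \vartheta^{\varepsilon,\Delta}(x,t) > \rho + C\,\frac{R^{7/4}}{\delta^{d/4}}\,\Delta x^{1/10} \Big\} \subset \Omega^\rho_t \cap B_R \subset \Big\{ x \ :\ \vartheta^{\varepsilon,\Delta}(x,t) > \rho - C\,\frac{R^{7/4}}{\delta^{d/4}}\,\Delta x^{1/10} \Big\}.$$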

Numerical simulations
Example 1: Consider the following SDE:
$$\begin{pmatrix} dX_1 \\ dX_2 \end{pmatrix} = \sigma \begin{pmatrix} c & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} dW^1_t \\ dW^2_t \end{pmatrix},$$
where $c = 0.2$ and $\sigma = 0.2$. The time horizon is $T = 1.0$.
[Figure: Example 1 with σ = 0.2. Target (top left, t = 1); backward reachable set $\Omega^\rho_0$ for ρ = 0.05 (top right, t = 0); backward reachable sets $\Omega^\rho_0$ for different values of ρ (bottom).]

Numerical simulations
Example 2: In this example, we consider a controlled SDE with a drift:
$$dx(t) = \begin{pmatrix} -1 & -4 \\ 4 & -1 \end{pmatrix} x(t)\,dt + u(t)\,dt + \begin{pmatrix} 0.7 & 0 \\ 0 & 0.7 \end{pmatrix} \begin{pmatrix} dW^1_t \\ dW^2_t \end{pmatrix},$$
where $u(t) = (u_1(t), u_2(t))^T$ and $u_i \in [-0.1, 0.1]$ for $i = 1, 2$.
[Figure: Reachable sets at different times t ∈ {0.75, 0.25, 0}, with ρ = 0.4, for a time horizon T = 1.75. The target set is represented by the green square.]

Numerical simulations
[Figure: Behaviour of controlled processes starting from the backward reachable sets at times t ∈ {0.75, 0.25, 0} for a final time horizon T = 1.75.]

Conclusion

Thank you
