$= \sum_{j=i}^{N-1} l(x_j, u_j) + l_f(x_N)$

The value function $V_i$ at step $i$ is defined as follows.

$V_i(x) = \min_{U_i} J_i(x, U_i)$

The resulting Bellman equation:

$V_i(x_i) = \min_{u_i} \left[ l(x_i, u_i) + V_{i+1}(f(x_i, u_i)) \right]$

Hereafter the index $i$ is omitted and $V_{i+1}$ is written as $V$.

$V(x) = \min_{u} \left[ l(x, u) + V(f(x, u)) \right]$ (2)

yuki @ Tokyo Tech SSR, A Survey of Constrained Differential Dynamics Programming
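The backward recursion implied by the Bellman equation can be sketched on a toy problem. The following is a minimal sketch, assuming a scalar linear system with quadratic costs. The constants `A`, `B`, `Q`, `R`, `QF`, `N` and the function names are illustrative assumptions, not taken from the slides. Under these assumptions the value function stays quadratic, $V_i(x) = \tfrac{1}{2} p_i x^2$, so the minimization over $u$ has a closed form.

```python
# Scalar linear-quadratic toy problem (all numbers are illustrative assumptions).
A, B = 1.0, 0.5            # dynamics f(x, u) = A*x + B*u
Q, R, QF = 1.0, 0.1, 2.0   # l(x, u) = 0.5*(Q*x^2 + R*u^2), l_f(x) = 0.5*QF*x^2
N = 20                     # horizon length


def f(x, u):
    """Dynamics x_{i+1} = f(x_i, u_i)."""
    return A * x + B * u


def stage_cost(x, u):
    """Running cost l(x, u)."""
    return 0.5 * (Q * x * x + R * u * u)


def backward_recursion():
    """Return (p, gains) with V_i(x) = 0.5*p[i]*x^2 and u_i* = gains[i]*x."""
    p = [0.0] * (N + 1)
    gains = [0.0] * N
    p[N] = QF  # terminal condition V_N(x) = l_f(x)
    for i in range(N - 1, -1, -1):
        p_next = p[i + 1]
        # Minimise l(x, u) + V_{i+1}(f(x, u)) over u (stationarity in u).
        gains[i] = -(A * B * p_next) / (R + B * B * p_next)
        # Substitute u* = gains[i]*x back in to obtain V_i.
        a_closed = A + B * gains[i]
        p[i] = Q + R * gains[i] ** 2 + p_next * a_closed ** 2
    return p, gains


p, gains = backward_recursion()
```

Each `p[i]` is obtained from `p[i + 1]` only, which is the same structure DDP exploits in its backward pass, there with a local quadratic model instead of an exact one.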
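... is considered.

$Q(\delta x, \delta u) = l(x + \delta x, u + \delta u) + V(f(x + \delta x, u + \delta u))$ (3)

$Q$ is called the pseudo-Hamiltonian. Its partial derivatives up to second order are:

$Q_x = l_x + f_x^T V_x$ (4a)
$Q_u = l_u + f_u^T V_x$ (4b)
$Q_{xx} = l_{xx} + f_x^T V_{xx} f_x + V_x \cdot f_{xx}$ (4c)
$Q_{ux} = l_{ux} + f_u^T V_{xx} f_x + V_x \cdot f_{ux}$ (4d)
$Q_{uu} = l_{uu} + f_u^T V_{xx} f_u + V_x \cdot f_{uu}$ (4e)

The last term in each of (4c) to (4e), the one of the form $V_x \cdot f_{(\cdot)}$, denotes a product with a tensor (a contraction of $V_x$ with the second derivative of $f$).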
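The expansion (4a) to (4e) can be checked numerically. The following is a minimal sketch restricted to scalar $x$ and $u$, so that transposes disappear and the tensor terms reduce to ordinary products. The functions `f`, `l`, `V` and the evaluation point are hypothetical examples chosen for illustration; $V_x$ and $V_{xx}$ are evaluated at the next state $f(x, u)$.

```python
import math


def q_derivatives(l_d, f_d, Vx, Vxx):
    """Apply (4a)-(4e) for scalar x and u.

    l_d, f_d: dicts of partial derivatives with keys 'x', 'u', 'xx', 'ux', 'uu'.
    Vx, Vxx: derivatives of the next-step value function at f(x, u).
    """
    Qx = l_d["x"] + f_d["x"] * Vx                                  # (4a)
    Qu = l_d["u"] + f_d["u"] * Vx                                  # (4b)
    Qxx = l_d["xx"] + f_d["x"] * Vxx * f_d["x"] + Vx * f_d["xx"]   # (4c)
    Qux = l_d["ux"] + f_d["u"] * Vxx * f_d["x"] + Vx * f_d["ux"]   # (4d)
    Quu = l_d["uu"] + f_d["u"] * Vxx * f_d["u"] + Vx * f_d["uu"]   # (4e)
    return Qx, Qu, Qxx, Qux, Quu


# Hypothetical nonlinear dynamics, running cost and quadratic value function.
def f(x, u):
    return x + 0.1 * (math.sin(x) + u * x)


def l(x, u):
    return 0.5 * (x * x + 0.1 * u * u)


def V(y):
    return y * y + 0.3 * y


x, u = 0.4, -0.2          # nominal point (assumed)
y = f(x, u)               # next state, where V is differentiated
l_d = {"x": x, "u": 0.1 * u, "xx": 1.0, "ux": 0.0, "uu": 0.1}
f_d = {
    "x": 1.0 + 0.1 * (math.cos(x) + u),
    "u": 0.1 * x,
    "xx": -0.1 * math.sin(x),
    "ux": 0.1,
    "uu": 0.0,
}
Qx, Qu, Qxx, Qux, Quu = q_derivatives(l_d, f_d, Vx=2.0 * y + 0.3, Vxx=2.0)
```

Dropping the `Vx * f_d[...]` terms gives the Gauss-Newton style approximation used by iLQR/iLQG. In the vector case those terms become contractions of $V_x$ with third-order tensors.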
$\hat{u}_i = u_i + \alpha k_i + K_i (\hat{x}_i - x_i)$ (8b)
$\hat{x}_{i+1} = f(\hat{x}_i, \hat{u}_i)$ (8c)

Here $\alpha$ is the update (step-size) parameter.
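The forward pass (8b) and (8c) can be sketched as a simple rollout. The following is a minimal sketch for scalar states and inputs. The initial condition $\hat{x}_0 = x_0$ is an assumption, since the corresponding equation is not shown in this excerpt, and the function names `rollout` and `forward_pass` are hypothetical.

```python
def rollout(f, x0, us):
    """Simulate the nominal trajectory x_{i+1} = f(x_i, u_i)."""
    xs = [x0]
    for u in us:
        xs.append(f(xs[-1], u))
    return xs


def forward_pass(f, x0, xs, us, ks, Ks, alpha):
    """Roll out the updated trajectory using (8b) and (8c).

    xs, us: nominal states and inputs; ks, Ks: feedforward and feedback
    gains from the backward pass; alpha: update (step-size) parameter.
    """
    x_hat = [x0]  # assumed initial condition: x_hat_0 = x_0
    u_hat = []
    for i in range(len(us)):
        u_new = us[i] + alpha * ks[i] + Ks[i] * (x_hat[i] - xs[i])  # (8b)
        u_hat.append(u_new)
        x_hat.append(f(x_hat[i], u_new))                            # (8c)
    return x_hat, u_hat
```

With `alpha = 0` the rollout reproduces the nominal trajectory, which is why a line search that shrinks `alpha` can always fall back to a step that does not increase the cost.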
et al.: Control-Limited Differential Dynamic Programming, ICRA 2014.

General inequality constraints: $g(x, u) \le 0$
Z. Xie, et al.: Differential Dynamic Programming with Nonlinear Constraints, ICRA 2017.

Holonomic constraints: $\phi(x) = 0$
C. Mastalli, et al.: Crocoddyl: An Efficient and Versatile Framework for Multi-Contact Optimal Control, ICRA 2020.
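※ Note that this is only an expectation; nothing is guaranteed.

Re-solve the subproblem (QP) in the forward pass as well
- Search again for the optimal input $u$ in the forward pass while satisfying the constraints.
- The computational cost increases, but at least the constraints are satisfied.
- This appears to be the most promising approach at present.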
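Re-solving the subproblem in the forward pass can be illustrated in the simplest possible setting. The following is a minimal sketch, assuming a scalar input with box constraints `u_lo <= u <= u_hi` and `Quu > 0`, for which the QP over $\delta u$ has a closed-form solution (clamping the unconstrained minimizer). This is a simplification chosen for illustration and not the general nonlinear-constraint method of the cited papers; all function names are hypothetical.

```python
def clamp(v, lo, hi):
    """Project v onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def solve_scalar_box_qp(Quu, g, du_lo, du_hi):
    """Minimise 0.5*Quu*du^2 + g*du subject to du_lo <= du <= du_hi.

    For a strictly convex scalar QP the constrained minimiser is the
    unconstrained one, -g/Quu, clamped to the feasible interval.
    """
    if Quu <= 0.0:
        raise ValueError("Quu must be positive for a well-posed QP")
    return clamp(-g / Quu, du_lo, du_hi)


def constrained_forward_pass(f, x0, xs, us, Qus, Quxs, Quus, u_lo, u_hi):
    """Forward pass that re-solves the input QP at every step.

    Qus, Quxs, Quus: derivatives of Q stored during the backward pass.
    The linear term uses the actual deviation dx = x_hat_i - x_i.
    """
    x_hat = [x0]
    u_hat = []
    for i in range(len(us)):
        dx = x_hat[i] - xs[i]
        g = Qus[i] + Quxs[i] * dx  # gradient of Q in du at this dx
        du = solve_scalar_box_qp(Quus[i], g, u_lo - us[i], u_hi - us[i])
        u_hat.append(us[i] + du)   # feasible by construction
        x_hat.append(f(x_hat[i], u_hat[i]))
    return x_hat, u_hat
```

Because the QP is solved at the state actually reached, every applied input satisfies the bounds, at the price of one QP solve per step. With vector inputs or general constraints $g(x, u) \le 0$ a numerical QP solver would be needed in place of the clamp.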
and Versatile Framework for Multi-Contact Optimal Control
https://www.youtube.com/watch?v=wHy8YAHwj-M

OSS:
iLQG/DDP trajectory optimization (MATLAB)
https://jp.mathworks.com/matlabcentral/fileexchange/52069-ilqg-ddp-trajectory-optimization
Crocoddyl (C++)
https://github.com/loco-3d/crocoddyl