Optimal control (and learning) vs. reactive control: Foresight

Optimal control
• (Discrete) construction of a “value function”
• Given
• Infinite-horizon cost
• Find the instantaneous “good” direction
• HJB equation

Reactive control
• Find a potential/energy-like function $V(x)$
• “Myopic” energy-like
• “All-knowing” navigation function (NF)
• …
• ~Simple to construct (which lends itself to easy analysis)
• No built-in mechanism to account for the “path cost”

[cf. VV/MIT project] [Zhong, …, Todorov (2013)]

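A minimal sketch of the contrast above, assuming a toy problem that is not from the slides: a ring of 8 cells with the goal at cell 0 and one expensive cell on the short way round. All names (`value_iteration`, `optimal_step`, `myopic_step`) and the cost values are hypothetical. The value function accounts for the path cost; the distance-like potential does not.

```python
def neighbors(s, n):
    """The two cells adjacent to s on a ring of n cells."""
    return ((s - 1) % n, (s + 1) % n)


def value_iteration(cost, goal, tol=1e-9):
    """Undiscounted cost-to-go V[s], where cost[s2] is paid on entering cell s2."""
    n = len(cost)
    V = [0.0] * n
    while True:
        delta = 0.0
        for s in range(n):
            if s == goal:
                continue  # the goal is absorbing with zero cost-to-go
            best = min(cost[s2] + V[s2] for s2 in neighbors(s, n))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def optimal_step(V, cost, s):
    """Instantaneous "good" direction: the neighbor with the least cost-to-go."""
    return min(neighbors(s, len(V)), key=lambda s2: cost[s2] + V[s2])


def myopic_step(s, goal, n):
    """Reactive choice: descend a potential (ring distance) that ignores path cost."""
    def ring_distance(a):
        return min((a - goal) % n, (goal - a) % n)
    return min(neighbors(s, n), key=ring_distance)


# Illustrative costs: entering cell 1 is expensive, every other cell costs 1.
cost = [1, 10, 1, 1, 1, 1, 1, 1]
V = value_iteration(cost, goal=0)
print(optimal_step(V, cost, 3), myopic_step(3, 0, len(cost)))
```

From cell 3, the value-function policy steps to cell 4 (the long but cheap way round), while the myopic potential steps to cell 2, toward the expensive cell.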
Optimal control in practice (SOA)

• Give up hope of globally estimating the value function (VF): intractable
• Local estimates around “frequented” regions of state space
• “Trajectory optimization” -> local estimate around a specific state/input trajectory
• Analogies for RL
• Bad if perturbed to a different region of state space
• Online “re-estimation” of a finite-horizon value function -> MPC
• Ensures the trajectory is “near” the current state

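The receding-horizon idea can be sketched as follows, assuming a scalar linear system x+ = a·x + b·u with quadratic cost (an assumption chosen so that the finite-horizon value function has the closed form V(x) = P·x²). The function names and all numeric values are hypothetical. For a linear time-invariant model the re-plan returns the same gain at every step; the loop structure is the point, since a nonlinear model would re-linearize around the current state.

```python
def finite_horizon_lqr(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for x+ = a*x + b*u.

    Returns (K0, P0): the first-step feedback gain and the coefficient of the
    finite-horizon value function V0(x) = P0 * x**2.
    """
    P = q_final
    K = 0.0
    for _ in range(horizon):
        K = (a * b * P) / (r + b * b * P)
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
    return K, P


def run_mpc(x0, steps, disturbance, a=1.2, b=1.0, q=1.0, r=0.1, horizon=10):
    """Receding-horizon loop: re-plan from the current state, apply the first input."""
    x = x0
    history = [x]
    for k in range(steps):
        K, _ = finite_horizon_lqr(a, b, q, r, q, horizon)  # online re-estimation
        u = -K * x                                         # only the first input is used
        x = a * x + b * u + disturbance(k)                 # true system, incl. perturbation
        history.append(x)
    return history


# A kick at step 5 moves the state away from the plan; re-planning absorbs it.
trajectory = run_mpc(x0=1.0, steps=20, disturbance=lambda k: 0.5 if k == 5 else 0.0)
print(trajectory[-1])
```

Because each plan starts at the measured state, the planned trajectory stays “near” the current state even after the perturbation.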
Analytical controller vs. optimization: Implementation

Optimization
• Write as $\min_u \ldots \ \text{s.t.}\ u \in U$
• Global solution if the objective and feasible space are convex
• In practice:
  • QP of problem size 100 to 1000: < 1 ms on a desktop processor
  • Problem size 10 to 100: < 1 ms on a microcontroller

Analytical
• Inverse dynamics
• Use properties of the mechanical system -> natural control
• Challenges:
  • Underactuation
  • Constraints: $u \in U$? Friction?

Recall (assuming known VF)
• Optimal control wants
• Reactive control wants $\frac{\partial V}{\partial x} f(x, u) < 0$

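A minimal sketch of how the reactive decrease condition and the input constraint can be combined in one small optimization. It assumes a single-input control-affine system f(x, u) = f0(x) + g(x)·u and a box constraint on u, neither of which is stated on the slide. It solves a min-effort problem in the style of a control-Lyapunov-function QP; with one input this reduces to intersecting intervals, so no solver is needed. The function names and the `margin` parameter are hypothetical.

```python
def decrease_rate(grad_V, f0, g, u):
    """dV/dt = grad_V . (f0 + g*u) for a single-input control-affine system."""
    return sum(dv * (fi + gi * u) for dv, fi, gi in zip(grad_V, f0, g))


def min_effort_input(grad_V, f0, g, u_min, u_max, margin=0.0):
    """Solve  min u**2  s.t.  grad_V.(f0 + g*u) <= -margin,  u_min <= u <= u_max.

    Returns the minimizing u, or None if no admissible input decreases V.
    """
    a = sum(dv * fi for dv, fi in zip(grad_V, f0))  # drift contribution
    b = sum(dv * gi for dv, gi in zip(grad_V, g))   # input contribution
    lo, hi = u_min, u_max
    if b > 0:
        hi = min(hi, (-margin - a) / b)   # a + b*u <= -margin bounds u from above
    elif b < 0:
        lo = max(lo, (-margin - a) / b)   # dividing by b < 0 flips the inequality
    elif a > -margin:
        return None                       # the input cannot affect dV/dt here
    if lo > hi:
        return None                       # the decrease condition conflicts with u in U
    return min(max(0.0, lo), hi)          # project u = 0 onto the feasible interval


# Double integrator x = (p, v) with V = p^2 + p*v + v^2 (illustrative choice).
def grad_V(p, v):
    return (2 * p + v, p + 2 * v)

print(min_effort_input(grad_V(1.0, 0.0), (0.0, 0.0), (0.0, 1.0), -1.0, 1.0, margin=0.5))
print(min_effort_input(grad_V(1.0, 1.0), (1.0, 0.0), (0.0, 1.0), -1.0, 1.0, margin=0.5))
```

The second call returns `None`: at that state the input limit makes the decrease condition infeasible, which is the kind of constraint case an analytical controller has to handle separately.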
Design process => design science

• Power consumption
• Gear ratio selection
• Impact mitigation
• Reflected inertia
• Bus voltage utilization
• Current @ motor controller
• …

[Figure: plot of power consumption (W) over swing and stance. Diagram labels: gait kinematics; “simple models”; past data; gait dynamics; forward dynamics; robot kinematic params; robot dynamic params; motor params]

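Several of the quantities listed above can be evaluated together for a candidate gear ratio. The sketch below assumes an ideal (lossless) gearbox and a simple DC motor model in SI units, where the torque constant equals the back-EMF constant. These modeling choices, the function name `evaluate_gear_ratio`, and every number are illustrative assumptions, not values from the slides.

```python
def evaluate_gear_ratio(N, joint_torque, joint_speed, Kt, R, rotor_inertia):
    """Motor-side quantities for gear ratio N over sampled joint torque (N*m)
    and joint speed (rad/s), e.g. one gait cycle."""
    motor_torque = [t / N for t in joint_torque]      # ideal gearbox: torque / N
    motor_speed = [w * N for w in joint_speed]        # ideal gearbox: speed * N
    current = [t / Kt for t in motor_torque]          # i = torque / Kt
    voltage = [i * R + Kt * w for i, w in zip(current, motor_speed)]  # IR drop + back-EMF
    power = [v * i for v, i in zip(voltage, current)]  # electrical power, negative = regen
    return {
        "peak_current": max(abs(i) for i in current),  # compare with motor controller limit
        "peak_voltage": max(abs(v) for v in voltage),  # compare with bus voltage
        "mean_power": sum(power) / len(power),
        "reflected_inertia": N * N * rotor_inertia,    # rotor inertia seen at the joint
    }


# Hypothetical stance/swing samples for one joint.
torque = [12.0, 8.0, -1.0, -0.5]
speed = [0.5, 1.0, 6.0, 8.0]
for N in (6, 10, 20):
    print(N, evaluate_gear_ratio(N, torque, speed, Kt=0.1, R=0.5, rotor_inertia=1e-4))
```

A higher ratio lowers peak current but raises peak voltage and the reflected inertia (which grows with N²), so the trade-off against impact mitigation and bus voltage utilization is visible in one sweep.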
Details of these reflexes make a huge difference in practice: outdoors “details”

• Slope estimation
• Slip detection and handling
• Stubbing detection and handling
• Early/late contact handling
• “Re-swing” reflexes
• …

Can go further: templates are inevitable

• Subject to anchoring posture control,
• With sufficient actuated DOFs
  • (degrades gracefully with fewer),
• The reduced dynamics at least contain an inverted pendulum (IP).

[De, Topping, Kod (in prep)]

Embedding pitch-steady target dynamics on floating torso models

• “Zero manifold” (ZM) = pitch-steady locomotion (e.g. walking)
• Render the ZM attracting and invariant

Limitations:
• Restriction dynamics are affected by the anchoring force
• Form of the restriction dynamics depends on the virtual constraint choice

[Figure label: valid zero dynamics]

[De, Topping, Kod (in prep)] [Westervelt et al. (2007)]

Input-decoupled anchoring

• Can find reduced coordinates $r(q)$ s.t. $u_{\mathrm{anch}}$ does not appear in $\ddot{r}$
• $r(q)$ is ~ the virtual leg position
• SLIP dynamics are exactly embedded!

[Figure: “A new kind of anchoring: input-decoupled anchoring with actuated IP template behavior”. Labels: floating torso model, $x \in SE(2)$; conventional anchoring; input-decoupled anchoring; invariant + attracting pitch-stable manifold (conventional anchoring/ZD)]

[Full & Kod (1999)] [Westervelt et al. (2007)] [De, Topping, Kod (in prep)]

Application to open-loop control of leaping

• Use IP template behavior to design open-loop leaping controllers
• Together with provably correct anchoring
• Application to leaping for mobile manipulation (preliminary)

[De, Topping, Kod (in prep)]

Combining reflexes with anticipatory planning (WIP)

• Receding horizon planning
• With template model
• Hierarchical dimension reduction enables real-time solutions and robustness to model uncertainty

Conclusion

• Axes of confusion
• Modularity is a time investment that saves time and computational effort and improves robustness
• Ghost is hiring -> [email protected]

What do the animals do?