Learning Robotic Contact Juggling (IROS'21)

OMRON SINIC X
October 21, 2021

Kazutoshi Tanaka, Masashi Hamaya, Devwrat Joshi, Felix von Drigalski, Ryo Yonetani, Takamitsu Matsubara, and Yoshihisa Ijiri

Presented at International Conference on Intelligent Robots and Systems (IROS 2021)
September 27, 2021


Transcript

  1. Learning Robotic Contact Juggling (IROS’21)

    Kazutoshi Tanaka¹, Masashi Hamaya¹, Devwrat Joshi¹, Felix von Drigalski¹, Ryo Yonetani¹, Takamitsu Matsubara², and Yoshihisa Ijiri¹ (¹OMRON SINIC X Corporation, ²Nara Institute of Science and Technology). International Conference on Intelligent Robots and Systems (IROS 2021), September 27, 2021. © 2021 OMRON SINIC X Corporation. All Rights Reserved.
  2. Learning Robotic Contact Juggling

    Kazutoshi Tanaka¹, Masashi Hamaya¹, Devwrat Joshi¹, Felix von Drigalski¹, Ryo Yonetani¹, Takamitsu Matsubara², and Yoshihisa Ijiri¹ (¹OMRON SINIC X Corporation, ²Nara Institute of Science and Technology)
  3. Robotic juggling

    Agile manipulation benchmark. Examples: • Devil sticking • Diabolo • Paddle juggling • Toss juggling. Cited work: [Schaal & Atkeson 1994] [Ploeger+ 2020] [Kober+ 2010] [Drigalski+ 2021]
  4. Robotic contact juggling

    Prior work: • 1 joint [Lynch+ 1998] • 3 joints [Woodruff & Lynch 2021]. Figure labels: Butterfly, Ball, Hand, Modeling. Approach: manual modeling and building a controller
  5. Learning robotic contact juggling

    Contact juggling: • Many interactions • High acceleration • High risk of breakdown
    Model-based reinforcement learning: • Sample-efficient • Complex dynamics • Behavior primitives (ours)
  6. Concept: behavior primitives

    (1) Simple behavior: e.g., free flying (2) Simple model: fast learning
    Contact juggling: • Primitive A: contact • Primitive B: no contact • ...
  7. Model-based RL (MBRL)

    • State-transition dynamics: $s_{t+1} = T(s_t, a_t)$ • A model $f(s, a; \theta) \approx T$ ($\theta$: parameters) • Learned using collected samples • Model predictive control
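
The loop on this slide (collect samples, fit a model f(s, a; θ) to the observed transitions, then control with model predictive control) can be sketched on a toy problem. Everything concrete below is an assumption made for illustration and is not taken from the slides: the 1-D linear system standing in for T, the least-squares linear model, and the planner, which is a plain grid search over constant action sequences rather than whatever optimizer the authors used.

```python
import random

def true_dynamics(s, a):
    """Toy stand-in for the unknown transition T (assumption, not the juggling task)."""
    return 0.9 * s + 0.5 * a

def fit_linear_model(samples):
    """Least-squares fit of f(s, a; theta) = theta[0]*s + theta[1]*a via normal equations."""
    sss = sum(s * s for s, a, y in samples)
    saa = sum(a * a for s, a, y in samples)
    ssa = sum(s * a for s, a, y in samples)
    ssy = sum(s * y for s, a, y in samples)
    say = sum(a * y for s, a, y in samples)
    det = sss * saa - ssa * ssa
    return ((ssy * saa - say * ssa) / det, (say * sss - ssy * ssa) / det)

def model(theta, s, a):
    """Learned one-step prediction of the next state."""
    return theta[0] * s + theta[1] * a

# Candidate actions: a coarse grid over the allowed range [-1, 1].
ACTIONS = [round(-1.0 + 0.05 * i, 2) for i in range(41)]

def mpc_action(theta, s, goal, horizon=3):
    """Pick the action whose constant repetition over the horizon gives the
    lowest predicted squared distance to the goal; only that action is applied."""
    def cost(a):
        pred, total = s, 0.0
        for _ in range(horizon):
            pred = model(theta, pred, a)
            total += (pred - goal) ** 2
        return total
    return min(ACTIONS, key=cost)

# 1) Collect samples with random actions.
rng = random.Random(0)
samples, s = [], 0.0
for _ in range(50):
    a = rng.uniform(-1.0, 1.0)
    s_next = true_dynamics(s, a)
    samples.append((s, a, s_next))
    s = s_next

# 2) Learn the model parameters from the samples.
theta = fit_linear_model(samples)

# 3) Control the real system, replanning at every step.
s, goal = 0.0, 1.0
for _ in range(20):
    s = true_dynamics(s, mpc_action(theta, s, goal))
final_state = s
```

Replanning at every step is what makes this model predictive control: only the first action of each plan is executed, so model errors are corrected by the next observation.
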
  8. Concept: behavior primitives

    (1) Simple behavior: e.g., free flying (2) Simple model: fast learning
    Contact juggling: • Primitive A: contact • Primitive B: no contact • ...
  9. Switched multiple model-based reinforcement learning

    • Primitive A: contact • Primitive B: no contact • ...
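
A minimal sketch of the switching idea: one simple model per primitive, and exactly one of them is used for each prediction. The slides do not say how the active primitive is chosen, so the height-threshold contact test below is purely an assumption, as are all function names. The two models are hand-written physics stand-ins for what would be learned models, and the motion is reduced to the vertical axis. The 0.04 s step corresponds to the 25 Hz control rate given on the environment slide.

```python
G = 9.81   # gravitational acceleration [m/s^2]
DT = 0.04  # one control step at 25 Hz [s]

def contact_model(ball_z, ball_v, hand_z, hand_v):
    """Primitive A (contact): the ball rests on the hand and moves with it."""
    return hand_z + hand_v * DT, hand_v

def flight_model(ball_z, ball_v, hand_z, hand_v):
    """Primitive B (no contact): free flight under gravity (explicit Euler step)."""
    return ball_z + ball_v * DT, ball_v - G * DT

MODELS = {"contact": contact_model, "no_contact": flight_model}

def select_primitive(ball_z, hand_z, tol=1e-3):
    """Hypothetical switching rule: contact if the ball's lowest point is
    within `tol` metres of the hand surface."""
    return "contact" if ball_z - hand_z <= tol else "no_contact"

def predict(ball_z, ball_v, hand_z, hand_v):
    """Select one primitive, then predict with that primitive's model only."""
    name = select_primitive(ball_z, hand_z)
    return name, MODELS[name](ball_z, ball_v, hand_z, hand_v)

# Ball resting on a hand that moves upward at 0.5 m/s.
on_hand = predict(0.0, 0.0, 0.0, 0.5)
# Ball 0.3 m above a stationary hand, rising at 1.0 m/s.
in_air = predict(0.3, 1.0, 0.0, 0.0)
```

Because each model only has to cover one regime, it can stay simple; the discontinuity at the contact boundary is handled by the switch rather than by a single model.
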
  10. One goal of our method

    • Primitive A: contact • Primitive B: no contact
  11. Environment for learning robotic contact juggling

    • Ball: radius 0.05 m, mass 0.45 kg • Robot: UR5e + flat hand • Control: 25 Hz • Gazebo: timestep 0.001 s
  12. Definition of learning robotic contact juggling

    • State: hand pose, ball position • Action: joint target velocity • Reward: reference trajectory
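
The problem definition on this slide can be written down as plain data types. The slide only says the reward is based on a reference trajectory; the negative squared distance between the ball and the reference point used below is an assumed form, and the field layouts and names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class State:
    hand_pose: Tuple[float, ...]  # layout (position + orientation) is an assumption
    ball_position: Vec3           # ball centre [m]

# Action: one target velocity per joint (the UR5e arm has six joints).
Action = Tuple[float, float, float, float, float, float]

def tracking_reward(state: State, reference: Sequence[Vec3], step: int) -> float:
    """Assumed reward: negative squared distance between the ball and the
    reference trajectory point for this control step (last point is held)."""
    ref = reference[min(step, len(reference) - 1)]
    return -sum((b - r) ** 2 for b, r in zip(state.ball_position, ref))

# Hypothetical reference: the ball should rise 0.1 m per control step.
reference = [(0.0, 0.0, 0.1 * k) for k in range(5)]
on_track = State(hand_pose=(0.0,) * 6, ball_position=(0.0, 0.0, 0.2))
off_track = State(hand_pose=(0.0,) * 6, ball_position=(0.0, 0.0, 0.3))
r_on = tracking_reward(on_track, reference, step=2)
r_off = tracking_reward(off_track, reference, step=2)
```

A reward of this shape is zero when the ball is exactly on the reference and grows more negative with distance.
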
  13. Curriculum learning

    • Episodes 1-10: 0.4 s, throw • Episodes 11-20: 1.0 s, throw & catch • Trains models continuously • Same reward functions
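
Combined with the 25 Hz control rate and the 0.001 s Gazebo timestep from the environment slide, the curriculum fixes how many control steps each episode contains. The short sketch below works through that arithmetic; the function and constant names are hypothetical.

```python
CONTROL_HZ = 25   # control rate from the environment slide
SIM_DT = 0.001    # Gazebo timestep [s] from the environment slide

def stage_for_episode(episode):
    """Return (task, episode length in seconds) for the two curriculum stages."""
    if 1 <= episode <= 10:
        return ("throw", 0.4)
    if 11 <= episode <= 20:
        return ("throw & catch", 1.0)
    raise ValueError("the slide defines the curriculum for episodes 1-20 only")

def control_steps(duration_s):
    """Number of control decisions in an episode of the given length."""
    return round(duration_s * CONTROL_HZ)

# Simulator steps between two consecutive control decisions.
sim_steps_per_control = round(1.0 / (CONTROL_HZ * SIM_DT))

steps_stage_1 = control_steps(stage_for_episode(1)[1])
steps_stage_2 = control_steps(stage_for_episode(11)[1])
```

Under these numbers a throw-only episode contains 10 control steps, a throw-and-catch episode contains 25, and each control step spans 40 simulator steps.
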
  14. Baseline methods

    • Baseline 1: one complex dynamics model; ours: multiple simple models • Baseline 2: linear combination; ours: selecting a model • Baseline 3: no random extraction; ours: with random extraction
    *Simple/complex: low/high representational power of the models. [Chua+ 2018] [Doya+ 2002]
  15. Baseline methods

    • Baseline 1: one complex dynamics model; ours: multiple simple models • Baseline 2: linear combination; ours: selecting a model • Baseline 3: no random extraction; ours: with random extraction
    *Simple/complex: low/high representational power of the models. [Chua+ 2018] [Doya+ 2002]
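
The difference between "linear combination" and "selecting a model" can be illustrated with two per-primitive predictions of the same quantity. The weighting scheme below (a softmax over squared prediction errors) is an assumption chosen for illustration; the slides do not specify how the baseline forms its weights or how the proposed method picks its model.

```python
import math

def responsibilities(errors, sigma=0.1):
    """Softmax weights from squared prediction errors: smaller error, larger weight."""
    scores = [-(e ** 2) / (2 * sigma ** 2) for e in errors]
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [x / total for x in exps]

def combine_linear(predictions, errors, sigma=0.1):
    """Linear combination: blend all model predictions by their weights."""
    weights = responsibilities(errors, sigma)
    return sum(w * p for w, p in zip(weights, predictions))

def select_one(predictions, errors):
    """Selection: use only the model with the smallest recent error."""
    best = min(range(len(errors)), key=lambda i: abs(errors[i]))
    return predictions[best]

# Two models disagree (contact vs. no contact) while their recent errors are similar.
predictions = [0.30, 0.10]  # predicted ball height [m] from each model
errors = [0.05, 0.06]       # recent one-step prediction errors [m]

blended = combine_linear(predictions, errors)
selected = select_one(predictions, errors)
```

With similar errors the blend lands between the two predictions, a state that neither primitive would produce, while selection commits to one regime.
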
  16. Returns: our method learns contact juggling faster

    • Baseline 1: one complex dynamics model; ours: multiple simple models
  17. Baseline 2 and baseline 3 do not learn the juggling

    • Baseline 2: linear combination; ours: selecting a model • Baseline 3: no random extraction; ours: with random extraction
    Fundamental: • Selection • Random sample extraction
  18. Conclusion

    • S-MMRL (switched multiple model-based reinforcement learning): a novel MBRL method to learn robotic contact juggling. • S-MMRL learns the juggling faster than the baselines in simulation. • Future work: apply S-MMRL to real robots, other robotic contact juggling tasks, and practical tasks, e.g., peg-in-hole.