Learning Robotic Contact Juggling (IROS'21)

© 2021 OMRON SINIC X Corporation. All Rights Reserved.
Learning Robotic Contact Juggling (IROS’21)
Kazutoshi Tanaka¹, Masashi Hamaya¹, Devwrat Joshi¹, Felix von Drigalski¹, Ryo Yonetani¹, Takamitsu Matsubara², and Yoshihisa Ijiri¹
(¹OMRON SINIC X Corporation, ²Nara Institute of Science and Technology)
International Conference on Intelligent Robots and Systems (IROS 2021)
September 27, 2021

Outline
• Motivation
• Method
• Experiment

Robotic juggling
Devil sticking Diabolo Paddle juggling [Schaal & Atkeson 1994] Toss juggling [Ploeger+ 2020] [Kober+ 2010] [Drigalski+ 2021]
Agile manipulation benchmark

Contact juggling
• Move an object with contact
• Nonprehensile manipulation

Robotic contact juggling
• 1 joint [Lynch+ 1998]
• 3 joints [Woodruff & Lynch 2021]
Figure labels: Modeling, Butterfly, Ball, Hand
Manual modeling and building a controller

Learning robotic contact juggling
Contact juggling
• Many interactions
• High acceleration
• High risk of breakdown
Model-based reinforcement learning
• Sample-efficient
• Complex dynamics
• Behavior primitives (ours)

Concept: behavior primitives
1. Simple behavior: e.g., free flying
2. Simple model: fast learning
Contact juggling
Primitive A: contact
Primitive B: no contact
...

Model-based RL (MBRL)
• $f \sim T$, state-transition dynamics: $s_{t+1} = T(s_t, a_t)$
• A model $f(s, a; \theta)$ ($\theta$: parameter)
• Learned using collected samples
• Model predictive control
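The MBRL loop on this slide (collect samples, fit a model f, plan with model predictive control) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the slides: the 1-D toy dynamics `true_dynamics`, the linear model form, the least-squares fit, and the random-shooting planner `mpc_action`.

```python
import random


def true_dynamics(s, a):
    """Hypothetical ground-truth T(s, a); the learner never reads this formula."""
    return s + 0.1 * a


def fit_linear_model(samples):
    """Least-squares fit of s' ~ th_s * s + th_a * a via the 2x2 normal equations."""
    ss = sum(s * s for s, a, n in samples)
    sa = sum(s * a for s, a, n in samples)
    aa = sum(a * a for s, a, n in samples)
    sn = sum(s * n for s, a, n in samples)
    an = sum(a * n for s, a, n in samples)
    det = ss * aa - sa * sa
    th_s = (sn * aa - an * sa) / det
    th_a = (an * ss - sn * sa) / det
    return th_s, th_a


def mpc_action(theta, s, goal, rng, horizon=5, candidates=200):
    """Random-shooting MPC: roll candidate action sequences through the learned
    model and return the first action of the cheapest sequence."""
    th_s, th_a = theta
    best_cost, best_first = float("inf"), 0.0
    for _ in range(candidates):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        pred, cost = s, 0.0
        for a in seq:
            pred = th_s * pred + th_a * a  # model prediction f(s, a; theta)
            cost += (pred - goal) ** 2
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first


rng = random.Random(0)

# 1) Collect samples (s, a, s') with random actions.
samples, s = [], 0.0
for _ in range(50):
    a = rng.uniform(-1.0, 1.0)
    n = true_dynamics(s, a)
    samples.append((s, a, n))
    s = n

# 2) Learn the model parameters theta from the collected samples.
theta = fit_linear_model(samples)

# 3) Control with MPC, replanning at every step.
s = 0.0
for _ in range(40):
    s = true_dynamics(s, mpc_action(theta, s, 1.0, rng))
final_state = s
```

Replanning at every step means only the first action of each candidate sequence is ever executed, which limits how far model error can accumulate.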

Switched multiple model-based reinforcement learning (S-MMRL)
Primitive A: contact
Primitive B: no contact
...

One goal of the model construction

One goal of our method
Primitive A: contact
Primitive B: no contact

Dividing the database into sub-databases
Based on features of samples

Switching model using sub-databases
Based on features of samples
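As a rough illustration of dividing a database by a sample feature and fitting one simple model per sub-database, consider the sketch below. The bouncing-ball toy dynamics, the contact feature `in_contact` (height at or below zero), the restitution factor 0.8, and the one-dimensional linear models are all hypothetical choices for this example; the slides do not specify the features or model class used.

```python
def fit_line(pairs):
    """Least-squares fit of y ~ w * x + b for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    var = sum((x - mx) ** 2 for x, _ in pairs)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    w = cov / var
    return w, my - w * mx


def in_contact(height):
    """Hypothetical feature: the ball touches the hand when height <= 0."""
    return height <= 0.0


def true_next_velocity(height, velocity, dt=0.04, g=9.81):
    """Toy ground truth: bounce on contact, free fall otherwise."""
    if in_contact(height):
        return -0.8 * velocity
    return velocity - g * dt


# Build a small database of (height, velocity, next_velocity) samples.
database = []
for i in range(20):
    height = 0.0 if i % 2 == 0 else 0.1 * i
    velocity = -2.0 + 0.2 * i
    database.append((height, velocity, true_next_velocity(height, velocity)))

# Divide the database into sub-databases based on the feature.
sub_databases = {True: [], False: []}
for height, velocity, next_velocity in database:
    sub_databases[in_contact(height)].append((velocity, next_velocity))

# Fit one simple model per sub-database.
models = {key: fit_line(pairs) for key, pairs in sub_databases.items()}


def predict(height, velocity):
    """Switch to the model whose sub-database matches the sample's feature."""
    w, b = models[in_contact(height)]
    return w * velocity + b
```

Each sub-database contains only one regime, so a straight line fits it exactly, whereas a single line fitted to the whole database could not represent both the bounce and the free fall.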

(3) Add a new model
Maximum error

(4) Remove the maximum-error sample

End: Random sample
Fit sample number

(6) Learn a switching model
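One way to picture a learned switching model is as a classifier that maps a sample's features to the index of the sub-database (and hence the dynamics model) it belongs to. The sketch below uses a nearest-centroid classifier as a stand-in. The classifier choice, the two-dimensional feature (ball-hand distance, relative speed), and the example values are assumptions for illustration, since the slide does not say how the switch is learned.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def learn_switch(labelled_features):
    """Learn one centroid per model from {model_id: [feature tuples]}."""
    return {model_id: centroid(points)
            for model_id, points in labelled_features.items()}


def select_model(switch, features):
    """Return the id of the model whose centroid is closest to `features`."""
    def squared_distance(center):
        return sum((x - c) ** 2 for x, c in zip(features, center))
    return min(switch, key=lambda model_id: squared_distance(switch[model_id]))


# Hypothetical features per sub-database: (ball-hand distance [m], relative speed [m/s]).
labelled_features = {
    "contact": [(0.0, 0.1), (0.01, 0.0), (0.0, 0.2)],
    "free_flying": [(0.3, 1.0), (0.5, 1.5), (0.4, 0.5)],
}

switch = learn_switch(labelled_features)
near_hand = select_model(switch, (0.02, 0.1))
in_air = select_model(switch, (0.45, 1.2))
```

At control time, `select_model` would be queried before each prediction so that only one dynamics model is used per step.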

Environment of learning robotic contact juggling
• Ball: radius 0.05 m, weight 0.45 kg
• Robot: UR5e + flat hand
• Control: 25 Hz
• Gazebo: timestep 0.001 s

Definition of learning robotic contact juggling
• State: hand pose, ball position
• Action: joint target velocity
• Reward: reference trajectory

Curriculum learning
• Episodes 1-10: 0.4 s, throw
• Episodes 11-20: 1.0 s, throw & catch
• Trains models continuously
• Same reward functions
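The curriculum can be turned into step counts with a little arithmetic. The sketch below combines the two curriculum stages with the 25 Hz control rate and the 0.001 s Gazebo timestep given on the environment slide. It assumes that 0.4 s and 1.0 s are episode durations; the names `CURRICULUM`, `stage_for`, and `control_steps` are hypothetical.

```python
CONTROL_HZ = 25          # control frequency from the environment slide
SIM_TIMESTEP_S = 0.001   # Gazebo timestep from the environment slide

# Simulation steps executed per control step: (1 / 25 Hz) / 0.001 s.
sim_steps_per_control = round(1.0 / CONTROL_HZ / SIM_TIMESTEP_S)

# Assumption: the durations on the curriculum slide are episode lengths.
CURRICULUM = [
    {"episodes": range(1, 11), "duration_s": 0.4, "task": "throw"},
    {"episodes": range(11, 21), "duration_s": 1.0, "task": "throw & catch"},
]


def stage_for(episode):
    """Return the curriculum stage that contains the given episode number."""
    for stage in CURRICULUM:
        if episode in stage["episodes"]:
            return stage
    raise ValueError(f"episode {episode} is outside the curriculum")


def control_steps(episode):
    """Number of control steps (actions) taken in the given episode."""
    return round(stage_for(episode)["duration_s"] * CONTROL_HZ)


first_stage_steps = control_steps(1)    # throw only
second_stage_steps = control_steps(20)  # throw & catch
```

Under these assumptions a throw-only episode has 10 control steps and a throw-and-catch episode has 25, with 40 simulation steps per control step.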

Baseline methods
• Baseline 1: one complex dynamics model vs. ours: multiple simple models
• Baseline 2: linear combination vs. ours: selecting a model
• Baseline 3: no random extraction vs. ours: with random extraction
*Simple/complex: low/high representation performance of models
[Chua+ 2018] [Doya+ 2002]
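The difference between combining models linearly and selecting a single model can be seen in a small numerical sketch. The two toy models, the input velocity, and the weights 0.6 and 0.4 below are hypothetical values chosen for illustration and are not taken from the experiments.

```python
def contact_model(velocity):
    """Hypothetical simple model for the contact primitive (bounce)."""
    return -0.8 * velocity


def free_flying_model(velocity):
    """Hypothetical simple model for the free-flying primitive (gravity over 0.04 s)."""
    return velocity - 0.3924


MODELS = [contact_model, free_flying_model]


def predict_by_selection(velocity, weights):
    """Use only the model with the largest weight."""
    best = max(range(len(MODELS)), key=lambda i: weights[i])
    return MODELS[best](velocity)


def predict_by_combination(velocity, weights):
    """Blend all model predictions with normalized weights."""
    total = sum(weights)
    return sum(w / total * m(velocity) for w, m in zip(weights, MODELS))


weights = [0.6, 0.4]  # hypothetical confidence in (contact, free flying)
selected = predict_by_selection(-2.0, weights)
combined = predict_by_combination(-2.0, weights)
```

Here selection returns the bounce prediction of 1.6 m/s, while the blend returns roughly 0.003 m/s, a value that neither primitive would produce. This is one plausible reason a hard switch can suit discontinuous contact dynamics better.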

Learned robotic contact juggling (Realtime)

Returns: Our method learns contact juggling faster
Baseline 1: one complex dynamics model vs. ours: multiple simple models

Class: Behavior primitives were divided automatically
Free flying, Contact

Baseline 2 and Baseline 3 do not learn the juggling
• Baseline 2: linear combination vs. ours: selecting a model
• Baseline 3: no random extraction vs. ours: with random extraction
Fundamental:
• Selection
• Random sample extraction

Conclusion
• S-MMRL: a novel MBRL method to learn robotic contact juggling.
• S-MMRL learns the juggling faster than baselines in simulations.
• Future work: apply S-MMRL to
  • Real robots
  • Other robotic contact juggling
  • Practical tasks, e.g., peg-in-hole
