
Learning Robotic Contact Juggling (IROS'21)

OMRON SINIC X
October 21, 2021

Kazutoshi Tanaka, Masashi Hamaya, Devwrat Joshi, Felix von Drigalski, Ryo Yonetani, Takamitsu Matsubara, and Yoshihisa Ijiri

Presented at International Conference on Intelligent Robots and Systems (IROS 2021)
September 27, 2021

Transcript

  1. © 2021 OMRON SINIC X Corporation. All Rights Reserved.
    Learning Robotic Contact Juggling
    (IROS’21)
    Kazutoshi Tanaka¹, Masashi Hamaya¹, Devwrat Joshi¹, Felix von Drigalski¹, Ryo Yonetani¹,
    Takamitsu Matsubara², and Yoshihisa Ijiri¹ (¹OMRON SINIC X Corporation, ²Nara Institute of
    Science and Technology)
    International Conference on Intelligent Robots and Systems (IROS 2021)
    September 27, 2021

  2. Learning Robotic Contact Juggling
    Kazutoshi Tanaka¹, Masashi Hamaya¹, Devwrat Joshi¹, Felix von Drigalski¹, Ryo Yonetani¹,
    Takamitsu Matsubara², and Yoshihisa Ijiri¹
    ¹OMRON SINIC X Corporation, ²Nara Institute of Science and Technology

  3. Outline
    • Motivation
    • Method
    • Experiment

  4. Outline
    • Motivation
    • Method
    • Experiment

  5. Robotic juggling
    Devil sticking Diabolo
    Paddle juggling
    [Schaal & Atkeson 1994]
    Toss juggling
    [Ploeger+ 2020] [Kober+ 2010] [Drigalski+ 2021]
    Agile manipulation benchmark

  6. Contact juggling
    • Move an object with contact
    • Nonprehensile manipulation

  7. Robotic contact juggling
    1 joint [Lynch+ 1998]
    Modeling
    3 joints [Woodruff & Lynch 2021]
    Butterfly
    Ball
    Hand
    Manual modeling and building a controller

  8. Model-based reinforcement learning
    • Sample-efficient
    • Complex dynamics
    • Behavior primitives (ours)
    Learning robotic contact juggling
    Contact juggling
    • Many interactions
    • High acceleration
    • High risk of breakdown

  9. Concept: behavior primitives
    1. Simple behavior: e.g., free flying
    2. Simple model: fast learning
    Primitive A: contact
    Primitive B: no contact
    ...
    Contact juggling

  10. Outline
    • Motivation
    • Method
    • Experiment

  11. Model-based RL (MBRL)
    • $f \approx T$, where $T$ is the state-transition dynamics $s_{t+1} = T(s_t, a_t)$
    • A model $f(s, a; \theta)$ ($\theta$: parameters)
    • Learned using collected samples
    • Model predictive control
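
The loop described above (collect samples, fit a model $f(s, a; \theta)$, plan with model predictive control) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the slides: the linear least-squares model, the random-shooting planner, the toy 1-D dynamics, and the names `fit_linear_model`, `predict`, and `mpc_random_shooting`.

```python
import numpy as np

def fit_linear_model(S, A, S_next):
    """Least-squares fit of theta so that s_next ~ [s, a, 1] @ theta."""
    X = np.hstack([S, A, np.ones((len(S), 1))])
    theta, *_ = np.linalg.lstsq(X, S_next, rcond=None)
    return theta

def predict(theta, s, a):
    """One-step prediction f(s, a; theta)."""
    return np.concatenate([s, a, [1.0]]) @ theta

def mpc_random_shooting(theta, s0, reward_fn, horizon=5, n_candidates=256,
                        action_dim=1, rng=None):
    """Sample action sequences, roll them out in the model, return the best first action."""
    rng = np.random.default_rng(0) if rng is None else rng
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    best_return, best_action = -np.inf, None
    for seq in candidates:
        s, total = s0, 0.0
        for a in seq:
            s = predict(theta, s, a)   # imagined transition
            total += reward_fn(s)      # accumulate predicted reward
        if total > best_return:
            best_return, best_action = total, seq[0]
    return best_action

# Toy data from hypothetical dynamics s' = s + 0.1 * a (not the juggling task).
rng = np.random.default_rng(0)
S = rng.normal(size=(200, 1))
A = rng.uniform(-1.0, 1.0, size=(200, 1))
S_next = S + 0.1 * A

theta = fit_linear_model(S, A, S_next)
# Reward: stay close to the target state 1.0.
action = mpc_random_shooting(theta, np.array([0.0]),
                             lambda s: -float((s[0] - 1.0) ** 2))
```

In a receding-horizon setting only the returned first action is executed, and planning is repeated from the next observed state.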

  12. Concept: behavior primitives
    1. Simple behavior: e.g., free flying
    2. Simple model: fast learning
    Primitive A: contact
    Primitive B: no contact
    ...
    Contact juggling

  13. Switched multiple model-based reinforcement learning
    Primitive A: contact
    Primitive B: no contact
    ...

  14. One goal of the model construction

  15. One goal of our method
    Primitive A: contact
    Primitive B: no contact

  16. One goal of our method

  17. Dividing database to sub-databases
    Based on features of samples

  18. Switching model using sub-databases
    Based on features of samples

  19. (1) Extract all samples

  20. (2) Fit

  21. (3) Add a new model
    Maximum error

  22. (4) Remove maximum error sample

  23. (5) Random sample

  24. End: Random sample
    Fit sample number

  25. (6) Learn a switching model
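
Steps (1) to (6) can be read as a procedure that peels sub-databases off the full sample database and then learns which sub-model to use. The slides give only the step titles, so the sketch below is one simplified interpretation and not the authors' algorithm. Assumptions: linear least-squares models stand in for the simple models, a nearest-sample rule stands in for the learned switching model, the random-sample step (5) is not reproduced because its details are not given, and the names `split_database`, `select_model`, `tol`, `min_size`, and `max_models` are hypothetical. Greedy trimming like this is not guaranteed to recover clean behavior primitives.

```python
import numpy as np

def fit(X, Y):
    """(2) Fit a simple linear model Y ~ [X, 1] @ W by least squares."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W

def errors(W, X, Y):
    """Per-sample prediction error of model W."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.norm(Xb @ W - Y, axis=1)

def split_database(X, Y, tol=1e-6, min_size=3, max_models=5):
    """Greedily peel off subsets that a single simple model fits within tol."""
    remaining = np.arange(len(X))
    subsets, models = [], []
    while len(remaining) >= min_size and len(models) < max_models:
        idx = remaining.copy()                  # (1) start from all remaining samples
        while True:
            W = fit(X[idx], Y[idx])             # (2) fit
            err = errors(W, X[idx], Y[idx])
            worst = np.argmax(err)              # sample with the maximum error
            if err[worst] <= tol or len(idx) <= min_size:
                break
            idx = np.delete(idx, worst)         # (4) remove the maximum-error sample
        subsets.append(idx)                     # (3) keep the model and its sub-database
        models.append(W)
        remaining = np.setdiff1d(remaining, idx)
    return subsets, models

def select_model(x, X, subsets):
    """(6) Stand-in switching rule: pick the sub-database holding the nearest sample."""
    dists = [np.min(np.linalg.norm(X[idx] - x, axis=1)) for idx in subsets]
    return int(np.argmin(dists))

# Toy data with two hypothetical linear regimes (not the juggling dynamics).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 1))
Y = np.where(X < 0, 2.0 * X, -3.0 * X + 5.0)

subsets, models = split_database(X, Y)
k = select_model(X[subsets[0][0]], X, subsets)  # index of the model to use for this sample
```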

  26. Outline
    • Motivation
    • Method
    • Experiment

  27. Environment of learning robotic contact juggling
    Ball: radius 0.05 m, weight 0.45 kg
    Robot: UR5e + flat hand
    Control: 25 Hz
    Gazebo: timestep 0.001 s
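
The control rate and simulator timestep listed above imply how many physics steps elapse per action. The short calculation below assumes each command is held for one full control period, which the slides do not state; the variable names are illustrative.

```python
CONTROL_HZ = 25   # control frequency from the slide
SIM_DT = 0.001    # Gazebo timestep in seconds, from the slide

control_dt = 1.0 / CONTROL_HZ                       # seconds per control step
sim_steps_per_action = round(control_dt / SIM_DT)   # physics steps per command

print(control_dt, sim_steps_per_action)
```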

  28. Definition of learning robotic contact juggling
    State: hand pose, ball position
    Action: joint target velocity
    Reward: reference trajectory

  29. Curriculum learning
    • Episodes 1-10: 0.4 s, throw
    • Episodes 11-20: 1.0 s, throw & catch
    • Models are trained continuously across both stages
    • The same reward functions are used in both stages
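
The two-stage curriculum can be written as a small schedule function. This is a sketch under assumptions: the 0.4 s and 1.0 s figures are read as episode durations, the 25 Hz control rate is taken from the environment slide, and the name `curriculum_stage` and the task labels are hypothetical.

```python
CONTROL_HZ = 25  # control rate stated on the environment slide

def curriculum_stage(episode):
    """Return (task, duration_seconds) for a 1-indexed episode number."""
    if 1 <= episode <= 10:
        return "throw", 0.4
    if 11 <= episode <= 20:
        return "throw_and_catch", 1.0
    raise ValueError("episode outside the 20-episode curriculum")

def control_steps(episode):
    """Number of control steps in the episode, assuming a fixed control rate."""
    _, duration = curriculum_stage(episode)
    return round(duration * CONTROL_HZ)

for ep in (1, 10, 11, 20):
    print(ep, curriculum_stage(ep)[0], control_steps(ep))
```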

  30. Baseline method
    Baseline 1: one complex dynamics model (Ours: multiple simple models)
    Baseline 2: linear combination (Ours: selecting a model)
    Baseline 3: no random extraction (Ours: with random extraction)
    *Simple/complex: low/high representational capacity of the models
    [Chua+ 2018]
    [Doya+ 2002]
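
The difference between a linear combination of model predictions and selecting a single model can be shown on a toy example. The weights, the two predictions, and the function names `combine_linear` and `select_one` are hypothetical; this is not a reproduction of either baseline or of the proposed method.

```python
import numpy as np

def combine_linear(preds, weights):
    """Linear combination: weighted average of all model predictions."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(preds)

def select_one(preds, weights):
    """Selection: use only the prediction of the highest-weighted model."""
    return np.asarray(preds)[int(np.argmax(weights))]

preds = np.array([[0.0, 1.0],    # hypothetical "contact" model prediction
                  [0.0, -1.0]])  # hypothetical "no contact" model prediction
weights = [0.6, 0.4]             # hypothetical per-model scores

blended = combine_linear(preds, weights)
chosen = select_one(preds, weights)
print(blended, chosen)
```

When the primitives differ sharply, as contact and free flight may, the blended prediction can fall between the two modes and match neither, whereas selection commits to one of them.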

  31. (5) Random sample

  32. Baseline method
    Baseline 1: one complex dynamics model (Ours: multiple simple models)
    Baseline 2: linear combination (Ours: selecting a model)
    Baseline 3: no random extraction (Ours: with random extraction)
    *Simple/complex: low/high representational capacity of the models
    [Chua+ 2018]
    [Doya+ 2002]

  33. Learned robotic contact juggling

  34. Learned robotic contact juggling (Realtime)

  35. Returns: Our method learns contact juggling faster
    Baseline 1: one complex dynamics model (Ours: multiple simple models)

  36. Class: Behavior primitives were divided automatically
    Free flying
    Contact

  37. Baseline 2 and Baseline 3 do not learn the juggling
    Baseline 2: linear combination (Ours: selecting a model)
    Baseline 3: no random extraction (Ours: with random extraction)
    Fundamental components of our method:
    • Selection
    • Random sample extraction

  38. Conclusion
    • S-MMRL: a novel MBRL method to learn robotic contact juggling.
    • S-MMRL learns the juggling faster than baselines in simulations.
    • Future work: apply S-MMRL to
    • Real robots
    • Other robotic contact juggling tasks
    • Practical tasks, e.g., peg-in-hole