ricsue-deepracer-ws-istanbul.pdf

Slide 1

Slide 1 text

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SUMMIT
Putting the “machine” in Machine Learning
Ricardo Sueiras | Principal Evangelist, Amazon Web Services
Istanbul Loft

Slide 2

Slide 2 text

AWS DeepRacer car specifications
Car: 1/18-scale 4WD with monster truck chassis
CPU: Intel Atom processor
Memory: 4 GB RAM
Storage: 32 GB (expandable)
Wi-Fi: 802.11ac
Camera: 4 MP camera with MJPEG
Drive battery: 1000 mAh lithium polymer
Compute battery: 13600 mAh USB-C
Sensors: Integrated accelerometer and gyroscope
Ports: 4x USB-A, 1x USB-C, 1x Micro-USB, 1x HDMI
Software: Ubuntu OS 16.04.3 LTS, Intel OpenVINO toolkit, ROS Kinetic

Slide 3

Slide 3 text

AWS DeepRacer League: Race for prizes and glory
The world’s first global, autonomous racing league
www.deepracerleague.com

Slide 4

Slide 4 text

Slide 5

Slide 5 text

Slide 6

Slide 6 text

RL vs. other approaches for robotic racing

Method: Supervised learning
How it works: An expert driver controls a real-world car that has a camera. The camera images are saved as inputs and the corresponding driving actions (speed and steering angle) as outputs. A model is then trained on these input/output pairs.
Result: Feed the state (image) into the model and receive a driving action.

Method: Reinforcement learning
How it works: A virtual agent repeatedly interacts with a simulated environment and logs experience (image, action, new state, reward). The experience is used to train a model, and the new model is used to gather more experience.
Result: Feed the state (image) into the model and receive a driving action.
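The reinforcement learning loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DeepRacer simulator: `ToyTrack`, `Experience`, and `collect_episode` are hypothetical names, the "image" is replaced by an integer position, and the policy is purely random.

```python
import random
from typing import List, NamedTuple


class Experience(NamedTuple):
    state: int      # stand-in for a camera image
    action: int     # stand-in for a (speed, steering angle) choice
    new_state: int
    reward: float


class ToyTrack:
    """Hypothetical 1-D track: the episode ends when position reaches `length`."""

    def __init__(self, length: int = 5):
        self.length = length
        self.position = 0

    def reset(self) -> int:
        self.position = 0
        return self.position

    def step(self, action: int):
        # action is -1 (back) or +1 (forward); position never drops below 0
        self.position = max(0, self.position + action)
        reward = 1.0 if action > 0 else 0.0
        done = self.position >= self.length
        return self.position, reward, done


def collect_episode(env: ToyTrack, rng: random.Random,
                    max_steps: int = 100) -> List[Experience]:
    """Roll out one episode with a random policy and log every step."""
    log: List[Experience] = []
    state = env.reset()
    for _ in range(max_steps):
        action = rng.choice([-1, 1])
        new_state, reward, done = env.step(action)
        log.append(Experience(state, action, new_state, reward))
        state = new_state
        if done:
            break
    return log


episode = collect_episode(ToyTrack(), random.Random(0))
```

In a real training setup the logged experience would be used to update the model, and the updated model would replace the random policy for the next round of episodes.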

Slide 7

Slide 7 text

Slide 8

Slide 8 text

Slide 9

Slide 9 text

Slide 10

Slide 10 text

Slide 11

Slide 11 text

The reward function
The reward function incentivizes particular behaviors and is at the core of reinforcement learning.

Slide 12

Slide 12 text

Slide 13

Slide 13 text

Slide 14

Slide 14 text

Slide 15

Slide 15 text

Slide 16

Slide 16 text

Slide 17

Slide 17 text

Slide 18

Slide 18 text

Slide 19

Slide 19 text

Slide 20

Slide 20 text

Slide 21

Slide 21 text

RL algorithms: Vanilla policy gradient
[Figure: landscape illustration of the objective J(θ), with successive updates labeled "New weights"]
• Data is only used once
• High variance of rewards
• Magnitude of update could be too large
* Image source: Landscape image is CC0 1.0 public domain
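The slide does not give the update rule itself. As an illustration only, the sketch below uses a standard REINFORCE-style update for a softmax policy over two discrete actions, which is one common form of vanilla policy gradient. The function names and the learning rate are assumptions made for this example.

```python
import math
from typing import List


def softmax(prefs: List[float]) -> List[float]:
    """Turn action preferences into probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(theta: List[float], action: int, reward: float,
                         learning_rate: float) -> List[float]:
    """One gradient-ascent step on a single (action, reward) sample.

    For a softmax policy, d log pi(action) / d theta[k] is
    (1 if k == action else 0) - pi[k].
    """
    probs = softmax(theta)
    return [
        t + learning_rate * reward * ((1.0 if k == action else 0.0) - probs[k])
        for k, t in enumerate(theta)
    ]


theta = [0.0, 0.0]  # equal preference for both actions
new_theta = policy_gradient_step(theta, action=1, reward=1.0, learning_rate=0.5)
```

The sample is consumed by a single update and then discarded, and the size of the step scales directly with `learning_rate * reward`, which is how a noisy reward can turn into an overly large update.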

Slide 22

Slide 22 text

STEP: {State, Action, Reward, New State}
EPISODE: Complete track or crash; a sequence of STEPS or EXPERIENCE
EXPERIENCE BUFFER: Sequence of STEPS over a fixed number of EPISODES
BATCH: Ordered list of experiences
TRAINING: Random selection of BATCHES
[Diagram labels: ITERATION, POLICY NETWORK, Episode 1, Episode 2, Episode x, Episode y]
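The terms on this slide map naturally onto simple data structures. The sketch below is a hypothetical illustration of that mapping (the helper names, the string placeholders, and the batch size are assumptions), not the training code used by AWS DeepRacer.

```python
import random
from typing import List, Tuple

# A step is (state, action, reward, new_state); strings stand in for images.
Step = Tuple[str, str, float, str]


def build_buffer(episodes: List[List[Step]]) -> List[Step]:
    """Experience buffer: all steps from a fixed number of episodes, in order."""
    return [step for episode in episodes for step in episode]


def make_batches(buffer: List[Step], batch_size: int) -> List[List[Step]]:
    """Batch: an ordered slice of the experience buffer."""
    return [buffer[i:i + batch_size] for i in range(0, len(buffer), batch_size)]


def training_order(batches: List[List[Step]], rng: random.Random) -> List[List[Step]]:
    """Training: batches are visited in random order."""
    order = list(batches)
    rng.shuffle(order)
    return order


episode_1 = [("s0", "left", 0.5, "s1"), ("s1", "right", 1.0, "s2"), ("s2", "left", 0.0, "crash")]
episode_2 = [("s0", "right", 1.0, "s1"), ("s1", "right", 1.0, "finish")]

buffer = build_buffer([episode_1, episode_2])
batches = make_batches(buffer, batch_size=2)
shuffled = training_order(batches, random.Random(0))
```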

Slide 23

Slide 23 text

Slide 24

Slide 24 text

Slide 25

Slide 25 text

AWS DeepRacer simulator architecture
[Diagram labels: AWS Cloud, AWS DeepRacer, NAT gateway, VPC, AWS DeepRacer, Models, Simulation video, Metrics]

Slide 26

Slide 26 text

Slide 27

Slide 27 text

Lab 0 – AWS DeepRacer service resource creation
Objective: Set up your account resources to get you to the races!
https://tinyurl.com/y59s4r4c

Slide 28

Slide 28 text

Programming your own reward function
Code editor: Python 3 syntax
Three example reward functions
Code validation via AWS Lambda

Slide 29

Slide 29 text

Slide 30

Slide 30 text

Slide 31

Slide 31 text

Coordinate system and track waypoints
[Diagram labels: X, Y, Outer boundary waypoints, Track center waypoints, Inner boundary waypoints, Track width, Car direction]
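One way to use the track-center waypoints is to compare the direction of the track with the direction of the car. The sketch below is illustrative and rests on two assumptions that the slide does not state: the two closest waypoint indices are ordered as (behind the car, ahead of the car), and the heading is an angle in degrees measured from the X axis. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def track_direction_deg(waypoints: List[Sequence[float]],
                        closest_waypoints: Sequence[int]) -> float:
    """Direction of the center line between the two closest waypoints, in degrees."""
    prev_x, prev_y = waypoints[closest_waypoints[0]]
    next_x, next_y = waypoints[closest_waypoints[1]]
    return math.degrees(math.atan2(next_y - prev_y, next_x - prev_x))


def heading_error_deg(heading: float, direction: float) -> float:
    """Smallest absolute angle between car heading and track direction (0 to 180)."""
    diff = abs(direction - heading) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


waypoints = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
direction = track_direction_deg(waypoints, closest_waypoints=[1, 2])
error = heading_error_deg(heading=80.0, direction=direction)
```

A reward function could then scale its reward down when `error` exceeds some threshold, so that the car is discouraged from pointing away from the track direction.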

Slide 32

Slide 32 text

Reward function parameters
{
    "all_wheels_on_track": Boolean,      # flag to indicate if the vehicle is on the track
    "x": float,                          # vehicle's x-coordinate in meters
    "y": float,                          # vehicle's y-coordinate in meters
    "distance_from_center": float,       # distance in meters from the track center
    "is_left_of_center": Boolean,        # flag to indicate if the vehicle is on the left side of the track center
    "heading": float,                    # vehicle's yaw in degrees
    "progress": float,                   # percentage of track completed
    "steps": int,                        # number of steps completed
    "speed": float,                      # vehicle's speed in meters per second (m/s)
    "steering_angle": float,             # vehicle's steering angle in degrees
    "track_width": float,                # width of the track
    "waypoints": [[float, float], … ],   # list of [x,y] as milestones along the track center
    "closest_waypoints": [int, int]      # indices of the two nearest waypoints
}
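As an illustration of how these parameters can be combined, the following is a minimal sketch of a reward function that encourages the car to stay near the center line. It reads only keys listed on this slide. The linear falloff and the small floor reward of `1e-3` are assumptions chosen for the example, not values taken from the deck.

```python
def reward_function(params: dict) -> float:
    """Reward staying on the track and close to the center line."""
    min_reward = 1e-3  # assumed floor so the reward is never exactly zero

    # Leaving the track earns only the floor reward.
    if not params["all_wheels_on_track"]:
        return min_reward

    half_width = 0.5 * params["track_width"]
    distance = params["distance_from_center"]

    # 1.0 on the center line, falling linearly to 0.0 at the track edge.
    closeness = max(0.0, 1.0 - distance / half_width)
    return float(max(min_reward, closeness))


on_center = reward_function(
    {"all_wheels_on_track": True, "track_width": 0.6, "distance_from_center": 0.0}
)
halfway = reward_function(
    {"all_wheels_on_track": True, "track_width": 0.6, "distance_from_center": 0.15}
)
off_track = reward_function(
    {"all_wheels_on_track": False, "track_width": 0.6, "distance_from_center": 0.4}
)
```

A design note: returning a small positive floor instead of zero is a common choice, but whether it helps depends on the training setup, so treat it as a starting point to experiment with.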

Slide 33

Slide 33 text

Slide 34

Slide 34 text

Slide 35

Slide 35 text

Slide 36

Slide 36 text

Slide 37

Slide 37 text

Slide 38

Slide 38 text

Slide 39

Slide 39 text

Slide 40

Slide 40 text

Slide 41

Slide 41 text

Simulation-to-real domain transfer

Sim-to-real challenge
The model is trained on simulated images, but on the race car it must act on the images that the car experiences in the real world.

Strategies
Environment control
Domain randomization
Modularity and abstraction
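Domain randomization, one of the strategies listed above, generally means varying the simulated conditions between episodes so that the model does not overfit to one particular rendering of the world. The sketch below shows the idea only. The setting names and ranges are hypothetical and do not correspond to actual AWS DeepRacer simulator options.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimSettings:
    """Hypothetical simulator settings that are re-sampled for every episode."""
    light_intensity: float
    track_friction: float
    camera_noise_std: float


# Assumed (low, high) ranges, chosen only for illustration.
RANGES = {
    "light_intensity": (0.5, 1.5),
    "track_friction": (0.7, 1.0),
    "camera_noise_std": (0.0, 0.05),
}


def sample_settings(rng: random.Random) -> SimSettings:
    """Draw one random configuration, uniformly within each range."""
    return SimSettings(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})


rng = random.Random(42)
settings_per_episode = [sample_settings(rng) for _ in range(3)]
```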

Slide 42

Slide 42 text

AWS DeepRacer software architecture
[Diagram legend: ROS msg node, Stored file, ROS nodes]
[Diagram labels: Web server publisher, Model optimizer, Video M-JPEG, Web server video, Inference results, Autonomous drive, Control node, Optimized model, Media engine, Camera, Model, Inference engine, Manual drive, Navigation node, Servo and motor]