CaRE: Finding Root Causes of Configuration Issues in Highly-Configurable Robots

Md Abir Hossen, Bradley Schmerl, Javier Cámara, Jason M. O’Kane, Ellen C. Czaplinski, Pooyan Jamshidi, David Garlan, Katherine A. Dzurilla, Sonam Kharade

Outline: Motivation, Causal Inference, CaRE, Results

Motivation

Configuration Space in Robotic Systems
Figure: the ROS Navigation Stack and the number of configuration options in each component.
• Components: Recovery Behaviors (Clear costmap, Rotate recovery), Global Planner, Global Costmap, Local Planner, Local Costmap
• Algorithms in local planner (only one can be selected at a time): BaseLocalPlanner, DWA planner, Eband planner, TEB planner, MPC planner
• Algorithms in global planner (only one can be selected at a time): BaseGlobalPlanner, Navfn, Carrot planner
• One pair of components in the figure is marked "Both must be selected"
• Option counts shown in the figure: 6, 1, 14, 33, 30, 32, 80, 77, 14, 33, 30, 32
• Possible configurations: 2^382
Complex interactions between options (intra- or inter-component) give rise to a combinatorially large configuration space.
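As a rough check on the size of this space, the twelve per-component option counts on the slide can be summed. The sketch below assumes, purely for illustration, that every option is a binary switch and ignores the selection constraints, so the result is an upper bound rather than an exact count. The variable names are hypothetical.

```python
# Per-component option counts, in the order they appear on the slide.
option_counts = [6, 1, 14, 33, 30, 32, 80, 77, 14, 33, 30, 32]

total_options = sum(option_counts)

# Assumption: each option is treated as binary and the
# "only one can be selected at a time" constraints are ignored,
# so this is an upper bound on the number of configurations.
upper_bound = 2 ** total_options

print(f"total options: {total_options}")
print(f"upper bound has {len(str(upper_bound))} decimal digits")
```

The counts sum to 382, which is consistent with reading the slide's "2382" as 2^382.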

Motivating Scenarios
Figure: misconfiguration examples on three platforms (Turtlebot 3, Husky UGV, and the Ocean World Lander Autonomy Testbed). Options shown: transform tolerance, inflation radius, cost scaling factor, force threshold, arm joint angles.

Challenges
• Trial-and-error requires non-trivial human effort due to the large configuration space.
• Even after an optimal fix is found, it is not guaranteed to work in different environments.
• Performance influence models suffer from several shortcomings:
  • Non-transferable across different robotic platforms and environments
  • Incorrect explanations
  • Data collection from physical hardware is expensive

Incorrect Reasoning About The Robot’s Behavior
Figure: P(Mission success) plotted against Planner failed.
• In the pooled data, increasing Planner failed increases Mission success.
• This is counter-intuitive: more Planner failed should reduce Mission success, not increase it.
• Purely statistical models built on this data will be unreliable.

Performance Influence Models might be Unreliable
Segregating the data on Obstacle cost shows that, within each group, an increase in Planner failed results in a decrease in Mission success.
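The reversal described on these slides can be reproduced with a small synthetic data set. The numbers below are invented for illustration and are not taken from the experiments; `pearson` is a hand-written helper, and the two obstacle cost levels ("low", "high") are assumptions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic runs: (obstacle_cost level, planner_failed count, P(mission success)).
runs = [
    ("low", 1, 0.50), ("low", 2, 0.45), ("low", 3, 0.40), ("low", 4, 0.35),
    ("high", 5, 0.90), ("high", 6, 0.85), ("high", 7, 0.80), ("high", 8, 0.75),
]

# Pooled view: ignore obstacle cost entirely.
pooled = pearson([r[1] for r in runs], [r[2] for r in runs])

# Segregated view: one correlation per obstacle cost level.
per_group = {}
for level in ("low", "high"):
    group = [r for r in runs if r[0] == level]
    per_group[level] = pearson([r[1] for r in group], [r[2] for r in group])

print(f"pooled correlation: {pooled:+.2f}")
for level, corr in per_group.items():
    print(f"obstacle cost {level}: {corr:+.2f}")
```

In this toy data the pooled correlation is positive while the correlation inside each obstacle cost group is negative, which is the pattern a purely statistical model would misread.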

Why Causal Model?
To build reliable models that produce correct explanations.
Causal graph: Obstacle cost → Planner failed → Mission success
• Obstacle cost affects Mission success via Planner failed.
• Causal models recover the correct interactions.

CaRE: End-to-end Pipeline
Stages:
1. Learn causal model from observational data
2. Identify the causal paths
3. Average causal effect estimation
Pipeline figure labels: Observational Data; Learn causal model (edge orientation rules, constraints); Causal Model; Path's Rank; Average causal effect of each option; Find highest perf-affecting config. options; Root causes; Debugging.
Nodes in the example causal model: C1 to C5, E1 to E3, P1 and P2. Examples: C1: goal_distance_bias, E1: position_accuracy, P1: energy_consumption.
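Stages 2 and 3 can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the CaRE implementation: the graph edges and effect values are invented, the effects are taken as given rather than estimated from data, and scoring a path by the mean absolute effect of its edges is an assumed ranking rule. Only the node names come from the slides.

```python
from statistics import mean

# Hypothetical causal graph: parent -> {child: assumed average causal effect}.
# Node names appear on the slides; edges and numbers are invented.
graph = {
    "goal_distance_bias": {"position_accuracy": 0.6},
    "occdist_scale": {"position_accuracy": 0.2, "recovery_executed": 0.7},
    "position_accuracy": {"energy_consumption": 0.5},
    "recovery_executed": {"energy_consumption": 0.9},
}

def causal_paths(graph, node, target, prefix=()):
    """Yield every directed path from node to target (graph assumed acyclic)."""
    path = prefix + (node,)
    if node == target:
        yield path
        return
    for child in graph.get(node, {}):
        yield from causal_paths(graph, child, target, path)

def path_effect(graph, path):
    """Score a path by the mean absolute effect of its edges (assumed rule)."""
    return mean(abs(graph[a][b]) for a, b in zip(path, path[1:]))

config_options = ["goal_distance_bias", "occdist_scale"]
target = "energy_consumption"

# Rank all causal paths from configuration options to the objective.
ranked = sorted(
    ((path_effect(graph, p), p)
     for option in config_options
     for p in causal_paths(graph, option, target)),
    reverse=True,
)

for score, path in ranked:
    print(f"{score:.2f}  {' -> '.join(path)}")

# The option at the start of the top-ranked path is the candidate root cause.
root_cause = ranked[0][1][0]
print("candidate root cause:", root_cause)
```

With these invented numbers the top-ranked path starts at occdist_scale, so that option would be inspected first when debugging an energy fault.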

Research Questions
• Research Question 1 (Accuracy). To what extent are the root causes determined by CaRE the true root causes of the observed functional faults?
• Research Question 2 (Transferability). To what extent can CaRE accurately detect misconfigurations when deployed on a different platform?

Results
Experimental Setup: Husky navigating to targets in a Gazebo environment and in a real environment.
Observational Data Collection (figure labels): Monitor; Husky /move_base; subscribed topics; set_param_values; send_goal; MoveBaseActionFeedback; /gazebo_ros_battery (battery percentage); /targets_reached; Recovery_behaviors API with a recovery tracker (conservative_reset, rotate_recovery, aggressive_reset); RNS; traveled distance; mission time; total time to reach goal; rosbag API (record rosbag, evaluate rosbag); observational data.
Applying CaRE: Figure: a partial causal model discovered in our experiments using Husky in simulation. Navigation Stack sub-systems: Planner, Costmap 2d. Nodes: Path dist bias, Goal dist bias, Occdist scale, Transform tolerance, Publish frequency, Recovery executed, RNS, Traveled distance, Energy, Mission success. Node types in the legend: sub-systems, performance metrics, performance objectives, causal interaction.
Evaluation: table of ranked options (ranks 1, 3, and 4) for the Energy and Mission Success objectives. Option pairs listed: Occdist scale, xy goal tolerance; Occdist scale, Goal distance bias; Transform tolerance, Combination method; Goal distance bias, Transform tolerance; Combination method, yaw goal tolerance; Update frequency, Cost scaling factor.
Takeaway: Higher-ranked configuration options have the strongest influence on the performance objectives.

Results
Experimental Setup: the causal model is learned on the source platform and environment (Husky, Gazebo environment) and reused on the target platform and environment (Turtlebot 3, real environment).
Takeaway: CaRE transfers reasonably well when reusing the causal model learned from the source platform and environment, and achieves higher accuracy than the baseline.

Summary (recap panels: Configuration Space in Robotic Systems; Incorrect Reasoning About The Robot’s Behavior; CaRE; Evaluation Results; The Team)
Challenges
• Robotic systems are highly configurable, with hundreds or even thousands of software and hardware configuration options interacting non-trivially.
• Incorrect configuration (misconfiguration) can cause buggy behavior, resulting in both functional and non-functional faults.
• Performance influence models, such as regression models, suffer from several shortcomings, including:
  • Producing incorrect explanations
  • Non-transferability
  • Expensive training data collection from physical hardware
Key Contributions
• A novel framework for finding root causes of configuration bugs in robotic systems.
• We evaluated CaRE in a comprehensive empirical study in a controlled environment across multiple robotic platforms, including Husky and Turtlebot 3, both in simulation and on physical robots.
• We demonstrated the transferability of the causal models by learning the causal model in the Husky simulator and reusing it on the Turtlebot 3 physical platform.
https://github.com/softsys4ai/care
