Building the Computing System for Autonomous Micromobility Vehicles: Design Constraints and Architectural Optimizations
Official Website: https://perceptin.io
Bo Yu (1), Wei Hu (1), Leimeng Xu (1), Jie Tang (2), Shaoshan Liu (1), Yuhao Zhu (3)
(1) PerceptIn Inc  (2) South China University of Technology  (3) University of Rochester
Introduction
• First thorough study of a commercial computing system for micromobility
  1. complements previous academic research in this area
  2. summarizes our R&D efforts over the past three years in building a suitable computing system for autonomous vehicles
• Objectives:
  1. highlight design constraints unique to autonomous vehicles
  2. identify new architecture and systems problems for autonomous machines
• Contributions:
  1. introduce real autonomous-vehicle workloads and unobscured, unnormalized data from our deployed vehicles that future research can build on
  2. present a detailed performance characterization of our vehicles
  3. highlight that the computing system should not be optimized in isolation
Business Story Behind this Paper
Full story: https://ieeexplore.ieee.org/document/9195123/
• Option 1: optimization of a commercial mobile SoC
• Option 2: procurement of a specialized autonomous driving computing system
• Option 3: development of a proprietary autonomous driving computing system
Design Constraints
• Latency and Throughput
  1. both computing and physical control latencies matter
  2. throughput must reach 10 Hz, i.e., a 100 ms budget per frame
• Energy and Thermal
  1. computing and sensing drop operation time from 10 to 7.7 hrs (derivation below)
  2. thermal design must stay within conventional cooling
• Cost and Safety
  1. cost must stay low to sustain a $1-per-trip price
  2. safety relies on both proactive and reactive paths
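As a back-of-the-envelope check on the energy numbers, assume a fixed battery energy E and a constant base drive power P_d; both symbols are ours for illustration, not measurements from the paper:

```latex
% E: battery energy; P_d: base drive power; P_c: computing + sensing power
T_0 = \frac{E}{P_d} = 10\,\mathrm{h}, \qquad
T_1 = \frac{E}{P_d + P_c} = 7.7\,\mathrm{h}
\;\Longrightarrow\;
P_c = P_d\left(\frac{T_0}{T_1} - 1\right) \approx 0.30\,P_d
```

Under this simple model, computing and sensing add roughly 30% to the vehicle's base power draw.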
Design Constraints: LiDAR vs Camera
• Latency
  1. LiDAR-based localization algorithms take 100 ms to 1 s
  2. our vision-based localization algorithm finishes in about 25 ms on an embedded FPGA
• Power
  1. LiDAR is one order of magnitude more power hungry than cameras
• Cost
  1. LiDAR is one order of magnitude more expensive than cameras
• Depth Quality
  1. LiDARs directly provide depth information at a precision of 2 cm
Algorithm-Hardware Mapping
• Sensing
  1. a Zynq FPGA performs sensor synchronization and feature extraction
• Planning
  1. executes on the CPU, which has good CAN interface support
• Perception
  1. scene understanding
  2. localization
• Partial Reconfiguration
  1. time-shares part of the FPGA resources at runtime (see the sketch below)
  2. swap time < 3 ms
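A minimal sketch of what the runtime time-sharing could look like from software, assuming a hypothetical load_partial_bitstream() wrapper (the actual reconfiguration mechanism on a Zynq is platform specific, e.g., the Linux FPGA Manager); the task names and bitstream files are illustrative, only the < 3 ms swap time is from our measurements:

```cpp
#include <iostream>
#include <string>

// Hypothetical wrapper around the platform's partial-reconfiguration
// interface. Only the region being swapped is reprogrammed; the rest
// of the fabric keeps running.
void load_partial_bitstream(const std::string& bitstream) {
    std::cout << "swapping in " << bitstream << "\n";
    // Platform-specific call goes here; the swap completes in < 3 ms.
}

enum class Task { Localization, SceneUnderstanding };

// Time-share one reconfigurable region between two perception kernels,
// reloading the region only when the requested kernel changes.
void run_on_fpga(Task task) {
    static Task loaded = Task::Localization;  // assume localization boots first
    if (task != loaded) {
        load_partial_bitstream(task == Task::Localization
                                   ? "localization.bit"
                                   : "scene_understanding.bit");
        loaded = task;
    }
    // ... kick off the kernel on the freshly configured region ...
}

int main() {
    // Alternate the two kernels on the same region at runtime.
    run_on_fpga(Task::SceneUnderstanding);
    run_on_fpga(Task::Localization);
    run_on_fpga(Task::Localization);  // no swap: already loaded
}
```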
Performance Characterizations
• Average latency and variation (see the sketch below)
  1. mean end-to-end latency is 164 ms
  2. the 99th percentile is over 300 ms
  3. variation is mostly caused by scene complexity
• Latency distribution
  1. sensing contributes significantly to the latency
  2. perception is the biggest single contributor to latency
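For concreteness, a minimal sketch of how per-frame end-to-end latency traces can be summarized; the toy values below are made up, only the 164 ms mean and >300 ms tail come from the deployed vehicles:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Summarize a trace of per-frame end-to-end latencies (milliseconds).
void summarize(std::vector<double> ms) {
    std::sort(ms.begin(), ms.end());
    double sum = 0;
    for (double v : ms) sum += v;
    // Nearest-rank 99th percentile: smallest value with at least 99%
    // of frames at or below it.
    const size_t rank = static_cast<size_t>(std::ceil(0.99 * ms.size()));
    std::printf("mean = %.1f ms, p99 = %.1f ms\n",
                sum / ms.size(), ms[rank - 1]);
}

int main() {
    // Toy trace for illustration; on the deployed vehicles the mean
    // is 164 ms and the p99 exceeds 300 ms, driven by scene complexity.
    summarize({120, 150, 160, 170, 165, 180, 310, 140, 155, 160});
}
```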
Sensor Synchronization
• Sensing has been overlooked
  1. most research in robotics focuses on perception and planning
• Sensing is crucial to perception
  1. e.g., in sensor fusion
• Sensor synchronization is a challenging problem in the real world!
  1. all sensors must be triggered at the same time
  2. processing latency varies across sensors
Sensing-Computing Co-Design
• Localization
  1. fuses synchronized sensor samples from both the camera and the IMU
  2. the processing pipeline introduces variable latency
• Camera-IMU sync
  1. IMU samples have a short processing time
  2. camera frames have a long processing time (see the sketch below)
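To see why the variable latency matters, here is a sketch contrasting matching by hardware trigger timestamp with matching by software arrival time; the 1 ms IMU and 12 ms camera delays are illustrative assumptions, not measured values:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Each sample carries two times: when the sensor was triggered
// (stamped in hardware) and when the host finally received it.
struct Sample {
    double trigger_ts;  // seconds, from the shared hardware clock
    double arrival_ts;  // seconds, after sensor/driver processing
};

// Find the IMU sample closest to a camera frame on the given field.
const Sample& closest(const std::vector<Sample>& imu, double ts,
                      double Sample::*field) {
    const Sample* best = &imu.front();
    for (const auto& s : imu)
        if (std::fabs(s.*field - ts) < std::fabs(best->*field - ts))
            best = &s;
    return *best;
}

int main() {
    // IMU at 240 Hz with ~1 ms processing delay (illustrative).
    std::vector<Sample> imu;
    for (int i = 0; i < 240; ++i)
        imu.push_back({i / 240.0, i / 240.0 + 0.001});

    // One camera frame triggered at t = 0.5 s, arriving ~12 ms later.
    Sample frame{0.5, 0.512};

    std::printf("by trigger ts: IMU @ %.4f s\n",
                closest(imu, frame.trigger_ts, &Sample::trigger_ts).trigger_ts);
    std::printf("by arrival ts: IMU @ %.4f s\n",
                closest(imu, frame.arrival_ts, &Sample::arrival_ts).trigger_ts);
}
```

Matching on arrival time mis-pairs the frame with an IMU sample roughly the camera's extra processing delay away, while matching on trigger time is exact; this is why timestamps must be taken close to the sensor.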
Sensor Synchronization Architecture
• Software-Hardware Sync Design Principles
  1. a single time source triggers all sensors
  2. obtain the timestamp as close to the sensor as possible
• Our Design
  1. GPS provides satellite atomic time as the universal time source
  2. camera runs at 30 FPS; IMU samples at 240 Hz
  3. the camera trigger signal is down-sampled 8x from the IMU trigger signal (see the sketch below)
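The trigger fan-out reduces to a divide-by-8 counter; a sketch of the logic in C++ for readability (the real design sits in the FPGA fabric):

```cpp
#include <cstdint>
#include <cstdio>

// Per-tick trigger generator, clocked at the IMU rate (240 Hz).
// Returns true when the camera should also be triggered; the counter
// divides the IMU trigger by 8, giving 240 / 8 = 30 FPS.
bool tick(uint32_t& counter) {
    bool camera_trigger = (counter % 8 == 0);
    ++counter;
    return camera_trigger;
}

int main() {
    uint32_t counter = 0;
    int imu_triggers = 0, camera_triggers = 0;
    // Simulate one second of triggering.
    for (int t = 0; t < 240; ++t) {
        ++imu_triggers;
        if (tick(counter)) ++camera_triggers;
    }
    std::printf("IMU: %d Hz, camera: %d FPS\n", imu_triggers, camera_triggers);
}
```

Because the camera trigger is derived from the IMU trigger rather than generated independently, every camera frame lines up exactly with every 8th IMU sample by construction.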
Concluding Remarks
• Holistic SoV Optimizations
  1. move beyond optimizing only one part of the computing platform
  2. understand the constraints and tradeoffs from an SoV perspective
• Horizontal Cross-Accelerator Optimization
  1. most previous studies focus on a single accelerator
  2. exploit the interactions across accelerators
  3. we present the dataflow across different on-vehicle algorithms and their inherent task-level parallelism
• Architecture for Autonomous Machines
  1. the dataflow pattern is static
  2. each pipeline stage imposes additional constraints
• A "TCO" Model for Autonomous Machines
  1. a comprehensive cost model for autonomous machines is needed
  2. balance cloud processing against on-vehicle processing