and Architectural Optimizations
Official Website: https://perceptin.io
Bo Yu¹, Wei Hu¹, Leimeng Xu¹, Jie Tang², Shaoshan Liu¹, Yuhao Zhu³
1. PerceptIn Inc
2. South China University of Technology
3. University of Rochester
for micromobility
  1. complement previous academic research work in this area
  2. summarize our R&D efforts in the past three years in building a suitable computing system for autonomous vehicles
• Objectives:
  1. highlight design constraints unique to autonomous vehicles
  2. identify new architecture and systems problems for autonomous machines
• Contributions:
  1. introduce real autonomous vehicle workloads, and unobscured, unnormalized data from our deployed vehicles that future research can build on
  2. present a detailed performance characterization of our vehicles
  3. highlight that the computing system shouldn’t be optimized alone
• Option 1: Optimization of commercial mobile SoC
• Option 2: Procurement of specialized autonomous driving computing systems
• Option 3: Development of proprietary autonomous driving computing systems
  2. throughput: 10 Hz
2. Energy and Thermal
  1. computing and sensing drop operation time from 10 to 7.7 hrs
  2. conventional cooling for thermal
3. Cost and Safety
  1. low cost to sustain $1 per trip cost
  2. proactive and reactive paths
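The operation-time figures above imply how large the computing and sensing load is relative to the rest of the vehicle. The following is a minimal sketch of that arithmetic, assuming a fixed battery capacity and a constant average power draw (both simplifying assumptions that the slides do not state). The function name `extra_load_fraction` is illustrative only.

```python
def extra_load_fraction(hours_without: float, hours_with: float) -> float:
    """Return the added power draw as a fraction of the baseline draw.

    Assumption (not from the slides): battery energy E is fixed and power
    is constant, so P_base = E / hours_without and P_total = E / hours_with.
    The added load relative to baseline is then P_total / P_base - 1.
    """
    if hours_without <= 0 or hours_with <= 0:
        raise ValueError("operation times must be positive")
    return hours_without / hours_with - 1.0


# Figures from the slide: operation time drops from 10 hrs to 7.7 hrs.
ratio = extra_load_fraction(10.0, 7.7)
print(f"computing + sensing adds about {ratio:.0%} to the baseline power draw")
```

Under these assumptions the added load comes out at roughly 30% of the baseline draw, which is why the energy budget is treated as a first-class design constraint.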
algorithm takes 100 ms to 1 s
  2. our vision-based localization algorithm finishes in about 25 ms on an embedded FPGA
• Power
  1. LiDAR is one order of magnitude more power-hungry than cameras
• Cost
  1. LiDAR is one order of magnitude more expensive than cameras
• Depth Quality
  1. LiDARs directly provide depth information at a precision of 2 cm
sync and feature extraction
• Planning
  1. executes on the CPU, which has strong CAN interface support
• Perception
  1. scene understanding
  2. localization
• Partial Reconfiguration
  1. time-shares part of the FPGA resources at runtime
  2. swap time < 3 ms
164 ms
• 99th percentile is over 300 ms
• Variation mostly caused by scene complexity
• Latency distribution
  • Sensing contributes significantly to the latency
  • Perception is the biggest contributor to latency
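A tail figure like the 99th percentile above is computed over per-frame end-to-end latencies. The sketch below shows one common way to do that, the nearest-rank method. The sample values are hypothetical placeholders chosen for illustration, not measurements from the vehicles, and the helper name `percentile` is an assumption of this example.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Hypothetical per-frame latencies in milliseconds (NOT measured data).
latencies_ms = [120, 135, 150, 160, 164, 170, 180, 210, 260, 320]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)
print(f"mean = {mean_ms:.1f} ms, p99 = {p99_ms} ms")
```

Reporting the tail alongside the average matters here because a few complex scenes can push a frame far past the typical latency even when the average looks acceptable.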
work focus on perception and planning in robotics
• Sensing is crucial to perception
  • e.g. in sensor fusion
• Sensor sync
  • a challenging problem in the real world!
  • trigger at the same time
  • processing latency variation
the camera and the IMU
  2. processing pipeline introduces variable latency
• Camera-IMU sync
  1. IMU with short processing time
  2. Camera with long processing time
time source trigger
  2. Obtain timestamp close to sensor
• Our Design
  1. GPS provides satellite atomic time as the universal time source
  2. Camera runs at 30 FPS, IMU at 240 Hz
  3. Camera trigger signal is down-sampled by a factor of 8 from the IMU trigger signal (240 / 8 = 30)
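The trigger relationship in the design above can be sketched in software as follows. This is only an illustration of the down-sampling idea, not the hardware implementation: the function name `trigger_schedule` and the use of exact fractions for timestamps are assumptions of this example, and the common time base is simply taken to start at zero.

```python
from fractions import Fraction

IMU_RATE_HZ = 240   # IMU trigger rate from the slide
DOWNSAMPLE = 8      # camera fires on every 8th IMU trigger


def trigger_schedule(num_imu_ticks, imu_rate_hz=IMU_RATE_HZ, factor=DOWNSAMPLE):
    """Yield (tick_index, timestamp_s, fire_camera) for each IMU trigger.

    Timestamps are exact fractions of a second on a shared time base, so
    every camera trigger coincides exactly with an IMU trigger.
    """
    period = Fraction(1, imu_rate_hz)
    for i in range(num_imu_ticks):
        yield i, i * period, i % factor == 0


# Simulate one second of triggers.
one_second = list(trigger_schedule(IMU_RATE_HZ))
camera_times = [t for _, t, fire in one_second if fire]
print(f"{len(one_second)} IMU triggers, {len(camera_times)} camera triggers")
```

Because the camera trigger is derived from the IMU trigger rather than from a separate clock, each frame shares its timestamp with an IMU sample, so the two streams do not need to be aligned after the fact.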
only one part of the computing platform
  2. understand the constraints and tradeoffs from an SoV perspective
• Horizontal Cross-Accelerator Optimization
  1. most previous studies focus on one accelerator
  2. exploit interactions across accelerators
  3. present the dataflow across different on-vehicle algorithms and their inherent task-level parallelism
• Architecture for Autonomous Machines
  1. static dataflow pattern
  2. each stage imposes additional constraints
• “TCO” Model for Autonomous Machines
  1. need a comprehensive cost model for autonomous machines
  2. balance between cloud processing and on-vehicle processing