HorizonLab
October 24, 2020

Building the Computing System for Autonomous Micromobility Vehicles: Design Constraints and Architectural Optimizations

MICRO 2020 talk slides by Shaoshan Liu.

Transcript

  1. Building the Computing System for Autonomous Micromobility Vehicles: Design Constraints and Architectural Optimizations
     Official Website: https://perceptin.io
     Bo Yu¹, Wei Hu¹, Leimeng Xu¹, Jie Tang², Shaoshan Liu¹, Yuhao Zhu³
     ¹ PerceptIn Inc; ² South China University of Technology; ³ University of Rochester
  2. Introduction
     • First thorough study of a commercial computing system for micromobility
       1. complement previous academic research work in this area
       2. summarize our R&D efforts in the past three years in building a suitable computing system for autonomous vehicles
     • Objectives:
       1. highlight design constraints unique to autonomous vehicles
       2. identify new architecture and systems problems for autonomous machines
     • Contributions:
       1. introduce real autonomous vehicle workloads, and unobscured, unnormalized data from our deployed vehicles that future research can build on
       2. present a detailed performance characterization of our vehicles
       3. highlight that the computing system shouldn’t be optimized alone
  3. Business Story Behind this Paper
     Business Story Behind: https://ieeexplore.ieee.org/document/9195123/
     • Option 1: Optimization of commercial mobile SoC
     • Option 2: Procurement of specialized autonomous driving computing systems
     • Option 3: Development of proprietary autonomous driving computing systems
  4. Autonomous Driving Infrastructure
     • On-Vehicle Processing
       • Sensing
       • Perception
       • Planning
     • Cloud Services
       • Map generation
       • Simulation
       • ML model training
  5. Design Constraints
     1. Latency and Throughput
        1. computing and physical control latencies
        2. throughput of 10 Hz
     2. Energy and Thermal
        1. computing and sensing reduce operation time from 10 to 7.7 hrs
        2. conventional cooling for thermal management
     3. Cost and Safety
        1. low cost to sustain a cost of $1 per trip
        2. proactive and reactive paths
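The drop from 10 to 7.7 hours can be read as a power ratio. The following is a minimal sketch, assuming a fixed battery energy and a constant average power draw in both cases; neither assumption is stated on the slide, and the function names are hypothetical.

```python
def added_power_fraction(base_hours: float, loaded_hours: float) -> float:
    """Extra power drawn by computing and sensing, relative to the base load.

    Assumes the same battery energy E in both cases, so
    E = P_base * base_hours = P_total * loaded_hours.
    """
    return base_hours / loaded_hours - 1.0


def compute_share_of_total(base_hours: float, loaded_hours: float) -> float:
    """Fraction of total power that goes to computing and sensing."""
    return 1.0 - loaded_hours / base_hours


if __name__ == "__main__":
    # Figures from the slide: 10 hrs without, 7.7 hrs with computing and sensing.
    print(f"added power vs. base: {added_power_fraction(10, 7.7):.1%}")
    print(f"share of total power: {compute_share_of_total(10, 7.7):.1%}")
```

Under these assumptions, computing and sensing add roughly 30% on top of the base power draw, which corresponds to 23% of the total.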
  6. Design Constraints: LiDAR vs Camera
     • Latency
       1. LiDAR-based localization algorithm takes 100 ms to 1 s
       2. our vision-based localization algorithm finishes in about 25 ms on an embedded FPGA
     • Power
       1. LiDAR is one order of magnitude more power hungry than cameras
     • Cost
       1. LiDAR is one order of magnitude more expensive than cameras
     • Depth Quality
       1. LiDARs directly provide depth information at the precision of 2 cm
  7. Software Pipeline
     • Task-Level Parallelism
       • sensing, perception, and planning are serialized
       • processing for different sensors is independent
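The two properties on this slide (serialized stages, independent per-sensor processing) can be sketched as follows. This is only an illustration: the sensor set, the stage functions, and their return values are hypothetical placeholders, not the deployed pipeline.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical per-sensor steps. Each one touches only its own input,
# which is what allows them to run concurrently.
def process_camera(frame):
    return ("camera", len(frame))


def process_imu(samples):
    return ("imu", sum(samples) / len(samples))


def process_gps(fix):
    return ("gps", fix)


def perceive(sensed):
    # Placeholder perception stage: it needs every sensor result.
    return dict(sensed)


def plan(scene):
    # Placeholder planning stage: it needs the perception output.
    return "stop" if scene["camera"] == 0 else "go"


def run_frame(frame, imu_samples, gps_fix):
    # Sensing: independent sensor processing runs in parallel.
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [
            pool.submit(process_camera, frame),
            pool.submit(process_imu, imu_samples),
            pool.submit(process_gps, gps_fix),
        ]
        sensed = [future.result() for future in futures]
    # Perception and planning run one after the other.
    scene = perceive(sensed)
    return plan(scene)


if __name__ == "__main__":
    print(run_frame([1, 2, 3], [0.1, 0.3], (0.0, 0.0)))
```

Threads are used here only to show the dependency structure; on the vehicle the stages are mapped to different hardware, as the next slide describes.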
  8. Algorithm-Hardware Mapping
     • Sensing
       1. Zynq FPGA to perform sensor synchronization and feature extraction
     • Planning
       1. executing on CPU with great CAN interface support
     • Perception
       1. scene understanding
       2. localization
     • Partial Reconfiguration
       1. time-shares part of the FPGA resources at runtime
       2. swap time < 3 ms
  9. Performance Characterizations
     • Avg. latency and variation
       • mean is 164 ms
       • 99th percentile is over 300 ms
       • variation is mostly caused by scene complexity
     • Latency distribution
       • sensing contributes significantly to the latency
       • perception is the biggest contributor to latency
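Mean and tail latency like the figures above can be computed from per-frame measurements. The sketch below uses the nearest-rank percentile definition, which is an assumption since the slide does not say which definition was used, and the sample values in the demo are made up.

```python
import math
import statistics


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Mean, 99th percentile, and maximum of end-to-end latency samples."""
    return {
        "mean": statistics.mean(latencies_ms),
        "p99": percentile_nearest_rank(latencies_ms, 99),
        "max": max(latencies_ms),
    }


if __name__ == "__main__":
    # Made-up samples in milliseconds, not data from the deployed vehicles.
    demo = [150, 160, 155, 170, 165, 158, 162, 310, 149, 161]
    print(summarize(demo))
```

Reporting the 99th percentile alongside the mean matters here because a planner has to budget for the slow frames, not the typical ones.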
  10. Sensor Synchronization
      • Sensing has been overlooked
        • most research work in robotics focuses on perception and planning
      • Sensing is crucial to perception
        • e.g. in sensor fusion
      • Sensor synchronization
        • a challenging problem in the real world!
        • trigger at the same time
        • processing latency variation
  11. Sensing-Computing Co-Design
      • Localization
        1. sync sensor samples from both the camera and the IMU
        2. processing pipeline introduces variable latency
      • Camera-IMU sync
        1. IMU with short processing time
        2. camera with long processing time
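One way to make camera-IMU association robust to the variable processing latency mentioned above is to match on timestamps taken at the sensor rather than on arrival order. This is a minimal sketch with hypothetical function names and integer timestamps in arbitrary units; it is not the authors' implementation.

```python
from bisect import bisect_right


def imu_between(imu_timestamps, t_prev, t_curr):
    """Indices of IMU samples with t_prev < t <= t_curr.

    imu_timestamps must be sorted and taken at the sensor, so the result
    does not depend on when the data reached the host.
    """
    lo = bisect_right(imu_timestamps, t_prev)
    hi = bisect_right(imu_timestamps, t_curr)
    return list(range(lo, hi))


def group_imu_by_frame(imu_timestamps, frame_timestamps):
    """For each pair of consecutive camera frames, list the IMU samples between them."""
    return [
        imu_between(imu_timestamps, t_prev, t_curr)
        for t_prev, t_curr in zip(frame_timestamps, frame_timestamps[1:])
    ]


if __name__ == "__main__":
    imu_ts = [0, 4, 8, 12, 16]   # made-up sensor-side timestamps
    frame_ts = [0, 8, 16]        # made-up camera trigger timestamps
    print(group_imu_by_frame(imu_ts, frame_ts))
```

Because only sensor-side timestamps are compared, a camera frame that arrives late after a long processing step still pairs with the IMU samples captured during its exposure interval.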
  12. Sensor Synchronization Architecture
      • Software-Hardware Sync Design Principles
        1. single time source trigger
        2. obtain timestamp close to sensor
      • Our Design
        1. GPS provides satellite atomic time as the universal time source
        2. camera: 30 FPS; IMU: 240 FPS
        3. camera trigger signal is down-sampled 8 times from the IMU signal
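The 8x down-sampling follows from the two rates: 240 / 30 = 8, so every eighth IMU trigger also fires the camera. The snippet below is a plain-Python model of that divider with hypothetical names; it only illustrates the timing relationship and is not the hardware trigger logic.

```python
IMU_RATE_HZ = 240
CAMERA_RATE_HZ = 30
DIVIDER = IMU_RATE_HZ // CAMERA_RATE_HZ  # 8 IMU triggers per camera trigger


def camera_ticks(num_imu_ticks, divider=DIVIDER):
    """IMU tick indices on which the camera trigger also fires."""
    return [tick for tick in range(num_imu_ticks) if tick % divider == 0]


if __name__ == "__main__":
    # One second of IMU ticks should contain one second of camera triggers.
    fired = camera_ticks(IMU_RATE_HZ)
    print(len(fired), fired[:4])
```

Deriving both triggers from one counter means every camera frame coincides exactly with an IMU sample, so no interpolation between clocks is needed.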
  13. Concluding Remarks
      • Holistic SoV Optimizations
        1. move beyond optimizing only one part of the computing platform
        2. understand the constraints and tradeoffs from an SoV perspective
      • Horizontal Cross-Accelerator Optimization
        1. most previous studies focus on one accelerator
        2. exploit interactions across accelerators
        3. present the dataflow across different on-vehicle algorithms and their inherent task-level parallelisms
      • Architecture for Autonomous Machines
        1. static dataflow pattern
        2. each stage imposes additional constraints
      • “TCO” Model for Autonomous Machines
        1. need a comprehensive cost model for autonomous machines
        2. balance between cloud processing and on-vehicle processing