Low-Latency Proactive Continuous Vision

HorizonLab
October 24, 2020

PACT 2020 talk slides by Yiming Gan

Transcript

  1. Low-Latency Proactive Continuous Vision

    Yiming Gan, Department of Computer Science, University of Rochester, with Yuxian Qiu (Shanghai Jiao Tong University), Lele Chen (University of Rochester), Jingwen Leng (Shanghai Jiao Tong University), and Yuhao Zhu (University of Rochester)
  2. Traditional Pipeline

    [Diagram: Frames 1, 2, and 3 each pass through Sensing, Imaging, and Vision, and each frame carries its own Latency label.]
  3. Proactive Pipeline

    [Diagram labels: Frame 1 (Sensing, Imaging, Vision, Latency); Pred; Frame 2 and Frame 3 (Sensing, Imaging); Vision (twice).]
  4. Proactive Pipeline

    [Same diagram, adding two Check stages.]
  5. Proactive Pipeline

    [Same diagram, adding a Fail Check case labeled with Vision and Latency.]
  6. Proactive Pipeline

    [Same diagram, adding a Pass Check case labeled with Latency.]
  7. Gains

    [Diagram labels: the proactive pipeline with Pred, two Check stages, a Fail Check case (Vision, Latency), and a Pass Check case (Latency).]
  8. Challenges

    [Same proactive pipeline diagram, annotated with Resource Contention.]
  9. Challenges

    [Same proactive pipeline diagram, annotated with Energy Wasting.]
  10. PVF Framework

    [Diagram labels: Static, Dynamic; Vision Apps; SoC (Sensor, BUS, Memory, CPU, NPU, DSP, GPU, ISP); Similarity T, Degree K; Accuracy Target, Similarity Metric, etc.]
  11. PVF Framework

    [Diagram labels: Static, Dynamic; Vision Apps; SoC (Sensor, BUS, Memory, CPU, NPU, DSP, GPU, ISP); Similarity T, Degree K; Predictor.]
  12. PVF Framework

    [Same labels as slide 11, adding Runtime.]
  13. PVF Framework

    [Same labels as slide 12, adding Checking.]
  14. PVF Framework

    [Same labels as slide 13, adding Scheduler.]
  15. PVF Framework

    [Final diagram labels: Static, Dynamic; Vision Apps; Similarity T, Degree K; SoC (Sensor, BUS, Memory, CPU, NPU, DSP, GPU, ISP); Predictor, Scheduler, Control, Checking, Runtime.]
  16. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs

    • Real measurements of latency and energy on different IPs.
  17. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs

    • Real measurements of latency and energy on different IPs. II. RTL implementations for NPU and Predictor • 20x20 systolic array for NPU, 10x10 systolic array for Predictor
  18. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs

    • Real measurements of latency and energy on different IPs. II. RTL implementations for NPU and Predictor • 20x20 systolic array for NPU, 10x10 systolic array for Predictor III. Evaluation on object detection and tracking • KITTI dataset for object detection, VOT Challenge for tracking.
  19. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs

    • Real measurements of latency and energy on different IPs. II. RTL implementations for NPU and Predictor • 20x20 systolic array for NPU, 10x10 systolic array for Predictor III. Evaluation on object detection and tracking • KITTI dataset for object detection, VOT Challenge for tracking. IV. Different input resolutions
  20. Baselines I. Base • Baseline with traditional execution pipeline II.

    BO • Baseline with optimized back-end III. FCFS • Traditional pipeline with multiple hardware IPs
  21. Results

    [Plot: axes Energy Budget (mJ) and Latency Reduction (%); a "Better" marker indicates the preferred direction.]
  22. Results

    [Same plot, adding the PVF series.]
  23. Results

    [Same plot, adding the Base, BO, and FCFS series alongside PVF.]
  24. Conclusion I. Long Latency Bottlenecks Continuous Vision II. Proactive Execution Pipeline

    1) Leveraging Heterogeneities in Mobile SoCs 2) Relaxed Checking III. Non-mission-critical System
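
The check-and-fallback flow on the Proactive Pipeline slides (Pred, speculative Vision, then Pass Check or Fail Check) can be sketched in Python. This is a minimal illustration and not the authors' implementation: the function names, the mean-absolute-difference similarity metric, the toy predictor and vision task, and the reading of T as a pass threshold are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = Sequence[float]   # hypothetical frame: flat pixel values in [0, 1]
Result = TypeVar("Result")


def similarity(a: Frame, b: Frame) -> float:
    """Assumed metric: 1 minus the mean absolute pixel difference."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def proactive_step(
    history: List[Frame],
    actual: Frame,
    predict: Callable[[List[Frame]], Frame],
    vision: Callable[[Frame], Result],
    threshold: float,
) -> Tuple[Result, bool]:
    """Return (vision result, check_passed) for one frame.

    `threshold` plays the role of the similarity threshold T (assumed meaning).
    """
    predicted = predict(history)       # Pred: guess the next frame
    speculative = vision(predicted)    # Vision runs before the real frame exists
    if similarity(predicted, actual) >= threshold:
        return speculative, True       # Pass Check: reuse the early result
    return vision(actual), False       # Fail Check: rerun Vision on the real frame


def repeat_last(history: List[Frame]) -> Frame:
    """Toy predictor (assumption): the next frame equals the last one."""
    return history[-1]


def is_bright(frame: Frame) -> bool:
    """Toy vision task (assumption): is the mean pixel value above 0.5?"""
    return sum(frame) / len(frame) > 0.5


if __name__ == "__main__":
    past = [[0.9, 0.9]]
    print(proactive_step(past, [0.9, 0.8], repeat_last, is_bright, 0.9))
    print(proactive_step(past, [0.1, 0.1], repeat_last, is_bright, 0.9))
```

In this sketch a passing check returns the speculative result without running vision on the real frame, while a failing check runs vision twice for the same frame. That extra work is one way to read the Energy Wasting challenge named on the slides.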