Low-Latency Proactive Continuous Vision

PACT 2020 talk slides by Yiming Gan

HorizonLab

October 24, 2020

Transcript

  1. Low-Latency Proactive Continuous Vision. Yiming Gan, Department of Computer Science, University of Rochester; with Yuxian Qiu, Shanghai Jiao Tong University; Lele Chen, University of Rochester; Jingwen Leng, Shanghai Jiao Tong University; and Yuhao Zhu, University of Rochester.
  2. Continuous Vision: Long Frame Latency

  3. Bottleneck: Serialization. Light → Sensor → Raw Pixels.

  4. Bottleneck: Serialization. Light → Sensor → Raw Pixels → Image Signal Processor → RGB.

  5. Bottleneck: Serialization. Light → Sensor → Raw Pixels → Image Signal Processor → RGB → DNN Accelerator → Results.
  6. Traditional Pipeline (timeline): Frames 1-3 each run Sensing, Imaging, and Vision back to back, so every frame pays the full end-to-end latency.
  7. Proactive Pipeline (timeline): Frame 1 runs Sensing, Imaging, and Vision; its latency is marked.

  8. Proactive Pipeline (timeline): a prediction stage (Pred) is added alongside Frame 1.

  9. Proactive Pipeline (timeline): while Frames 2 and 3 are still being sensed and imaged, their Vision results are produced by the predictor.
  10. Proactive Pipeline (timeline): Check stages are added to validate the predicted results.
  11. Proactive Pipeline (timeline): when a check fails, the full Vision stage re-runs and that frame pays the full latency.
  12. Proactive Pipeline (timeline): when a check passes, the predicted result is delivered and the frame's latency shrinks.
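
To make the control flow of this proactive pipeline concrete, here is a minimal Python sketch. It only illustrates the pass/fail logic described on slides 7-12; the sensing, imaging, vision, prediction, and similarity stages are passed in as plain callables because their interfaces are not part of the talk, and every name here is a hypothetical placeholder, not the authors' implementation.

```python
# Minimal sketch of the proactive execution model (slides 7-12).
# The stage functions are passed in as callables; nothing here is the
# talk's actual interface.

def process_frame(frame, history, threshold_t,
                  sense, image, run_vision, predict_vision, similarity):
    """Return a vision result for one frame, preferring a predicted result."""
    # The prediction is extrapolated from past results while the frame is
    # still being sensed and imaged, so it sits off the critical path.
    predicted = predict_vision(history) if history else None

    raw = sense(frame)    # sensor readout
    rgb = image(raw)      # image signal processing (ISP)

    if predicted is None:
        # Nothing to predict from yet: run the traditional serialized pipeline.
        return run_vision(rgb)

    # Check the prediction against a reference result computed from the real
    # pixels; the reference can be produced off the critical path.
    reference = run_vision(rgb)
    if similarity(predicted, reference) >= threshold_t:
        return predicted    # check passes: the early, predicted result stands
    return reference        # check fails: fall back to the full-latency result
```

The latency gain comes from the passing-check path: the predicted result is ready as soon as the pixels arrive, rather than only after the full vision DNN finishes.
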
  13. Gains (same timeline): frames whose checks pass deliver their results earlier, cutting per-frame latency.
  14. Challenges (same timeline): running prediction, checking, and vision concurrently creates resource contention.
  15. Solutions

  16. Solutions

  17. Challenges (same timeline): the extra prediction and checking work wastes energy.
  18. Solutions • Relaxing Checking Criterion (Threshold T)

  19. Solutions • Relaxing Checking Criterion (Threshold T) • Relaxing Checking Frequency (Degree K)
  20. Predicted frame sequence over time: precise frames, checked-predicted frames, and unchecked-predicted frames.
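
The two relaxation knobs can be illustrated with a small sketch of how such a frame sequence might be labeled. It assumes, purely for illustration, that degree K means one out of every K predicted frames is checked and that a precise frame is re-inserted at a fixed interval; the talk does not spell out this exact policy, and the parameter names are hypothetical.

```python
# Illustrative labeling of a frame sequence under relaxed checking
# (slides 18-20). The "one precise frame every `interval` frames" grouping
# and the reading of degree K are assumptions made only for this sketch.

def label_sequence(num_frames, degree_k, interval=8):
    """Label each frame as precise, checked-predicted, or unchecked-predicted."""
    labels = []
    for i in range(num_frames):
        pos = i % interval
        if pos == 0:
            labels.append("precise")              # full sensing/imaging/vision run
        elif pos % degree_k == 0:
            labels.append("checked-predicted")    # prediction verified against threshold T
        else:
            labels.append("unchecked-predicted")  # prediction used without a check
    return labels

# Example: a larger K leaves more predicted frames unchecked, saving energy at
# some accuracy risk; a smaller K (or a higher T) checks more aggressively.
print(label_sequence(10, degree_k=3))
```
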
  21. PVF Framework (overview): static and dynamic parts sit between the vision apps and the SoC (Sensor, ISP, CPU, GPU, DSP, NPU, bus, memory).
  22. PVF Framework: the static part takes the vision app's accuracy target, similarity metric, etc., and produces the Similarity threshold T and Degree K.
  23. PVF Framework: the dynamic part adds a Predictor.
  24. PVF Framework: the dynamic part adds a Predictor and a Runtime.
  25. PVF Framework: the dynamic part adds a Predictor, a Runtime, and Checking.
  26. PVF Framework: the dynamic part adds a Predictor, a Runtime, Checking, and a Scheduler.
  27. PVF Framework (complete): the static part supplies Similarity T and Degree K; the dynamic Runtime (Predictor, Checking, Scheduler) controls the SoC IPs (Sensor, ISP, CPU, GPU, DSP, NPU).
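
Putting the pieces together, the framework can be pictured roughly as the structure below: a static phase that turns the application's accuracy target and similarity metric into the threshold T and degree K, and a dynamic runtime holding the predictor, the checking logic, and the scheduler that places work on the SoC's IPs. The class and field names are illustrative assumptions, not the tool's actual interface.

```python
from dataclasses import dataclass

# Structural sketch of the PVF framework on slides 21-27.
# Names and fields are illustrative only.

@dataclass
class StaticConfig:
    similarity_threshold_t: float   # checking criterion (slide 18)
    checking_degree_k: int          # checking frequency (slide 19)

@dataclass
class PVFRuntime:
    config: StaticConfig
    predictor: object   # extrapolates vision results for upcoming frames
    checker: object     # compares predictions with reference results using T
    scheduler: object   # maps sensing/imaging/vision/prediction work onto the
                        # SoC IPs (ISP, NPU, DSP, GPU, CPU) to limit contention
```
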
  28. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs • Real measurements of latency and energy on different IPs.
  29. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs • Real measurements of latency and energy on different IPs. II. RTL Implementations for NPU and Predictor • 20x20 Systolic Array for NPU, 10x10 Systolic Array for Predictor.
  30. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs • Real measurements of latency and energy on different IPs. II. RTL Implementations for NPU and Predictor • 20x20 Systolic Array for NPU, 10x10 Systolic Array for Predictor. III. Evaluated on Object Detection and Tracking • KITTI dataset for object detection, VOT challenge for tracking.
  31. Experimental Setup I. In-house simulator modeling state-of-the-art SoCs • Real measurements of latency and energy on different IPs. II. RTL Implementations for NPU and Predictor • 20x20 Systolic Array for NPU, 10x10 Systolic Array for Predictor. III. Evaluated on Object Detection and Tracking • KITTI dataset for object detection, VOT challenge for tracking. IV. Different Input Resolutions.
  32. Baselines I. Base • Baseline with the traditional execution pipeline. II. BO • Baseline with an optimized back-end. III. FCFS • Traditional pipeline with multiple hardware IPs.
  33. Results: plot of Latency Reduction (%) versus Energy Budget (mJ); an arrow marks the better direction.
  34. Results: the same plot with the PVF curve added.
  35. Results: the same plot comparing PVF against the Base, BO, and FCFS baselines.
  36. Conclusion I. Long frame latency is the bottleneck in continuous vision. II. Proactive execution pipeline: 1) leveraging the heterogeneity of mobile SoCs, 2) relaxed checking. III. Targets non-mission-critical systems.
  37. Collaborators: Yuxian Qiu, Jingwen Leng, Lele Chen, Yuhao Zhu

  38. Questions