Low-Latency Proactive Continuous Vision

HorizonLab
October 24, 2020

PACT 2020 talk slides by Yiming Gan

Transcript

  1. Low-Latency Proactive Continuous Vision
    Yiming Gan
    Department of Computer Science,
    University of Rochester
    with
    Yuxian Qiu, Shanghai Jiao Tong University
    Lele Chen, University of Rochester
    Jingwen Leng, Shanghai Jiao Tong University
    Yuhao Zhu, University of Rochester

  2. Continuous Vision: Long Frame Latency

  3. Bottleneck: Serialization
    Light Raw Pixels
    Sensor

  4. Bottleneck: Serialization
    Light Raw Pixels
    Sensor
    RGB
    Image Signal Processor

  5. Bottleneck: Serialization
    Light Raw Pixels
    Sensor
    RGB
    Image Signal Processor
    Results
    DNN Accelerator

  6. Traditional Pipeline
    Frame 1 Sensing Imaging Vision
    Frame 2 Sensing Imaging Vision
    Frame 3 Sensing Imaging Vision
    Latency
    Latency
    Latency

  7. Proactive Pipeline
    Sensing
    Frame 1 Imaging Vision
    Latency

  8. Proactive Pipeline
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred

  9. Proactive Pipeline
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred
    Sensing
    Frame 2 Imaging
    Vision
    Vision
    Sensing
    Frame 3 Imaging

  10. Proactive Pipeline
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred
    Sensing
    Frame 2 Imaging
    Vision
    Vision
    Sensing
    Frame 3 Imaging
    Check
    Check

  11. Proactive Pipeline
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred
    Sensing
    Frame 2 Imaging
    Vision
    Vision
    Sensing
    Frame 3 Imaging
    Check
    Check
    Latency
    Vision
    Fail Check

  12. Proactive Pipeline
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred
    Sensing
    Frame 2 Imaging
    Vision
    Vision
    Sensing
    Frame 3 Imaging
    Check
    Check
    Latency
    Vision
    Fail Check
    Pass Check
    Latency
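
The latency effect of the pass and fail paths above can be sketched with a small model. This is only an illustration under stated assumptions, not the authors' model: the stage names follow the slide, but the `StageLatency` type, the millisecond values, and the assumption that prediction and speculative vision are fully overlapped with sensing and imaging of the real frame are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StageLatency:
    """Per-stage latencies in milliseconds (hypothetical values)."""
    sensing: float
    imaging: float
    vision: float
    check: float


def traditional_latency(s: StageLatency) -> float:
    # Traditional pipeline: sensing, imaging and vision run back to back.
    return s.sensing + s.imaging + s.vision


def proactive_latency(s: StageLatency, check_passed: bool) -> float:
    # Assumption: vision on the predicted frame finished before the real
    # frame was imaged, so only the check sits on the critical path.
    latency = s.sensing + s.imaging + s.check
    if not check_passed:
        # Failed check: vision must be redone on the real frame.
        latency += s.vision
    return latency


def latency_reduction_pct(baseline: float, new: float) -> float:
    # Relative latency reduction in percent.
    return 100.0 * (baseline - new) / baseline


stages = StageLatency(sensing=10.0, imaging=5.0, vision=30.0, check=1.0)
base = traditional_latency(stages)
on_pass = proactive_latency(stages, check_passed=True)
on_fail = proactive_latency(stages, check_passed=False)
print(base, on_pass, on_fail, round(latency_reduction_pct(base, on_pass), 1))
```

Under these assumptions a passed check removes vision from the critical path, while a failed check costs slightly more than the traditional pipeline because the check itself is added.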

  13. Gains
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred
    Sensing
    Frame 2 Imaging
    Vision
    Vision
    Sensing
    Frame 3 Imaging
    Check
    Check
    Latency
    Vision
    Fail Check
    Pass Check
    Latency

  14. Challenges
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred
    Sensing
    Frame 2 Imaging
    Vision
    Vision
    Sensing
    Frame 3 Imaging
    Check
    Check
    Latency
    Vision
    Fail Check
    Pass Check
    Latency
    Resource Contention

  15. Solutions

  16. Solutions

  17. Challenges
    Sensing
    Frame 1 Imaging Vision
    Latency
    Pred
    Sensing
    Frame 2 Imaging
    Vision
    Vision
    Sensing
    Frame 3 Imaging
    Check
    Check
    Latency
    Vision
    Fail Check
    Pass Check
    Latency
    Energy Wasting

  18. Solutions
    • Relaxing Checking Criterion (Threshold T)

  19. Solutions
    • Relaxing Checking Criterion (Threshold T)
    • Relaxing Checking Frequency (Degree K)

  20. Frames Sequence
    Precise Frames
    Unchecked-
    Predicted Frames
    Checked-
    Predicted Frames
    Time
    Predicted Sequence
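
One way to picture how a threshold T and a degree K could produce the three frame kinds on this slide is the sketch below. The slides do not define the similarity metric or the exact meaning of K, so everything here is an assumption: frames are flat lists of 8-bit pixel values, `similarity` is a hypothetical metric (1 minus the normalized mean absolute difference), K is read as "check every K-th predicted frame", and a failed check falls back to a precise frame.

```python
def similarity(pred, real):
    """Hypothetical metric: 1.0 for identical frames, 0.0 for maximally different."""
    total_diff = sum(abs(p - r) for p, r in zip(pred, real))
    return 1.0 - total_diff / (255.0 * len(real))


def label_frames(pred_frames, real_frames, threshold_t, degree_k):
    """Label each frame as precise, checked-predicted, or unchecked-predicted."""
    labels = []
    for i, (pred, real) in enumerate(zip(pred_frames, real_frames)):
        if i % degree_k != 0:
            # Relaxed frequency: this predicted frame is used without a check.
            labels.append("unchecked-predicted")
        elif similarity(pred, real) >= threshold_t:
            # Relaxed criterion: close enough to the real frame to pass.
            labels.append("checked-predicted")
        else:
            # Check failed: fall back to the real (precise) frame.
            labels.append("precise")
    return labels


predicted = [[0, 0], [10, 10], [255, 255], [0, 0]]
captured = [[0, 0], [200, 200], [0, 0], [0, 0]]
labels = label_frames(predicted, captured, threshold_t=0.9, degree_k=2)
print(labels)
```

In this reading, lowering T lets more predicted frames pass the check and raising K skips more checks, which is where the energy saving would come from, at some cost in accuracy.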

  21. PVF Framework
    Static Dynamic
    Vision
    Apps
    SoC
    Sensor
    BUS
    Memory
    CPU NPU
    DSP GPU
    ISP

  22. PVF Framework
    Static Dynamic
    Vision
    Apps
    SoC
    Sensor
    BUS
    Memory
    CPU NPU
    DSP GPU
    ISP
    Similarity
    T
    Degree K
    Accuracy Target
    Similarity Metric
    etc.

  23. PVF Framework
    Static Dynamic
    Vision
    Apps
    SoC
    Sensor
    BUS
    Memory
    CPU NPU
    DSP GPU
    ISP
    Similarity
    T
    Degree K
    Predictor

  24. PVF Framework
    Static Dynamic
    Vision
    Apps
    SoC
    Sensor
    BUS
    Memory
    CPU NPU
    DSP GPU
    ISP
    Similarity
    T
    Degree K
    Predictor
    Runtime

  25. PVF Framework
    Static Dynamic
    Vision
    Apps
    SoC
    Sensor
    BUS
    Memory
    CPU NPU
    DSP GPU
    ISP
    Similarity
    T
    Degree K
    Predictor
    Runtime
    Checking

  26. PVF Framework
    Static Dynamic
    Vision
    Apps
    SoC
    Sensor
    BUS
    Memory
    CPU NPU
    DSP GPU
    ISP
    Similarity
    T
    Degree K
    Predictor
    Runtime
    Checking
    Scheduler

  27. PVF Framework
    Static Dynamic
    Vision
    Apps
    Similarity
    T
    Degree K
    SoC
    Sensor
    BUS
    Memory
    CPU NPU
    DSP GPU
    ISP
    Predictor
    Scheduler
    Control
    Checking
    Runtime
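
The slides name a scheduler in the runtime but do not describe its policy. As a rough illustration of how work could be spread across the SoC's IP blocks to limit resource contention, the sketch below uses a simple greedy earliest-finish-time rule. This is a swapped-in textbook heuristic, not the PVF scheduler; the task names and latency numbers are hypothetical.

```python
def schedule(tasks, ip_names):
    """Greedily place each task on the IP where it would finish earliest.

    tasks: list of (task_name, {ip_name: latency_ms}) in arrival order.
    Returns a list of (task_name, ip_name, start_ms, finish_ms).
    """
    free_at = {name: 0.0 for name in ip_names}  # when each IP becomes idle
    plan = []
    for task_name, cost in tasks:
        candidates = [name for name in ip_names if name in cost]
        # Finish time = time the IP becomes free + task latency on that IP.
        best = min(candidates, key=lambda name: free_at[name] + cost[name])
        start = free_at[best]
        finish = start + cost[best]
        free_at[best] = finish
        plan.append((task_name, best, start, finish))
    return plan


# Hypothetical latencies (ms) of two tasks on two IP blocks.
tasks = [
    ("vision", {"NPU": 30.0, "GPU": 50.0}),
    ("predict", {"NPU": 5.0, "GPU": 8.0}),
]
plan = schedule(tasks, ["NPU", "GPU"])
print(plan)
```

Here the predictor lands on the GPU even though the NPU is faster for it in isolation, because the NPU is already busy with vision.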

  28. Experimental Setup
    I. In-house simulator modeling state-of-the-art SoCs
    • Real measurement of latency and energy on different IPs.

  29. Experimental Setup
    I. In-house simulator modeling state-of-the-art SoCs
    • Real measurement of latency and energy on different IPs.
    II. RTL Implementations for NPU and Predictor
    • 20x20 Systolic Array for NPU, 10x10 Systolic Array for Predictor

  30. Experimental Setup
    I. In-house simulator modeling state-of-the-art SoCs
    • Real measurement of latency and energy on different IPs.
    II. RTL Implementations for NPU and Predictor
    • 20x20 Systolic Array for NPU, 10x10 Systolic Array for Predictor
    III. Evaluate on Object Detection and Tracking
    • KITTI dataset for object detection, VOT-challenge for tracking.

  31. Experimental Setup
    I. In-house simulator modeling state-of-the-art SoCs
    • Real measurement of latency and energy on different IPs.
    II. RTL Implementations for NPU and Predictor
    • 20x20 Systolic Array for NPU, 10x10 Systolic Array for Predictor
    III. Evaluate on Object Detection and Tracking
    • KITTI dataset for object detection, VOT-challenge for tracking.
    IV. Different Input Resolutions
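
To give a feel for the two array sizes listed above, the sketch below compares their processing-element counts and the number of tiles a weight matrix would be split into when mapped onto each array. Only the 20x20 and 10x10 dimensions come from the slide; the 64x64 matrix and the simple ceiling-based tiling are assumptions for illustration and say nothing about the actual dataflow of the RTL designs.

```python
import math


def num_pes(array_dim: int) -> int:
    # A square systolic array has array_dim x array_dim processing elements.
    return array_dim * array_dim


def num_tiles(rows: int, cols: int, array_dim: int) -> int:
    # Tiles needed to cover a rows x cols matrix with array_dim x array_dim blocks.
    return math.ceil(rows / array_dim) * math.ceil(cols / array_dim)


for name, dim in [("NPU", 20), ("Predictor", 10)]:
    # 64x64 is a hypothetical layer size, not a figure from the talk.
    print(name, num_pes(dim), num_tiles(64, 64, dim))
```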

  32. Baselines
    I. Base
    • Baseline with traditional execution pipeline
    II. BO
    • Baseline with optimized back-end
    III. FCFS
    • Traditional pipeline with multiple hardware IPs

  33. Results
    [Plot: axes are Energy Budget (mJ) and Latency Reduction (%)]

  34. Results
    [Plot: axes are Energy Budget (mJ) and Latency Reduction (%); series: PVF]

  35. Results
    [Plot: axes are Energy Budget (mJ) and Latency Reduction (%); series: Base, BO, FCFS, PVF]

  36. Conclusion
    I. Long Latency Bottlenecks Continuous Vision
    II. Proactive Execution Pipeline
    1) Leveraging Heterogeneities in Mobile SoCs
    2) Relaxed Checking
    III. Non-mission-critical System

  37. Collaborators
    Yuxian Qiu Jingwen Leng Lele Chen Yuhao Zhu

  38. Questions
