Challenges: Memory Accesses
▸ Irregular Access Pattern
  ▹ Accesses are not sequential
  ▹ Severely hurts the efficiency of hardware acceleration
▸ Large Footprint
  ▹ A 1080p frame is ~5.9 MB and a 4K frame is ~23.7 MB
  ▹ Cannot be fully captured by typical on-chip memory
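The footprint figures above can be reproduced with a short calculation. This is a minimal sketch under assumptions not stated on the slide: uncompressed frames at 3 bytes per pixel (24-bit color), 1080p taken as 1920x1080, 4K taken as 3840x2160, and "MB" read as binary megabytes (MiB). The helper name `frame_bytes` is hypothetical.

```python
# Assumption: uncompressed 24-bit color, i.e. 3 bytes per pixel.
# Assumption: "MB" on the slide means 1024 * 1024 bytes (MiB).
MIB = 1024 * 1024


def frame_bytes(width, height, bytes_per_pixel=3):
    """Return the size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


# Assumed resolutions for the two formats named on the slide.
resolutions = {"1080p": (1920, 1080), "4K": (3840, 2160)}

for name, (width, height) in resolutions.items():
    size = frame_bytes(width, height)
    print(f"{name}: {size} bytes = {size / MIB:.1f} MiB")
```

Under these assumptions the results round to 5.9 MiB and 23.7 MiB, matching the slide. A single frame at either resolution is therefore several megabytes before any intermediate buffers are counted.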
Setup and Evaluation
▸ Xilinx Zynq UltraScale+ MPSoC ZCU104
▸ Pascal GPU on the Nvidia Jetson TX2
▸ Real User Trace Evaluation
▸ Baseline: Original algorithm implemented on GPU and FPGA
[Figure: bar chart of Energy Savings (%), y-axis 0 to 60; x-axis: RC, Elephant, NYC, Rhino, Paris, Venice; series: Savings over FPGA, Savings over GPU]
Conclusion
▸ 360-degree video rendering consumes excessive power
▸ Our co-design achieves on average 26.4% and 40.0% energy savings over the baselines
▸ Virtual reality popularity is growing rapidly