sequential
  ▹ Severely hurts the efficiency of hardware acceleration
▸ Large Footprint
  ▹ 1080p is ~5.9 MB and 4K is ~23.7 MB
  ▹ Cannot be fully captured by a typical on-chip memory
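The footprint figures above can be reproduced with a quick back-of-the-envelope calculation. The sketch below is only an illustration: it assumes uncompressed 24-bit RGB frames (3 bytes per pixel), resolutions of 1920x1080 and 3840x2160, and "MB" meaning 2^20 bytes. None of these assumptions is stated on the slide, and the helper name `frame_footprint_mib` is hypothetical.

```python
def frame_footprint_mib(width: int, height: int, bytes_per_pixel: int = 3) -> float:
    """Size of one uncompressed frame in MiB (2**20 bytes).

    bytes_per_pixel=3 assumes 24-bit RGB; this is an assumption, not a
    value given in the source.
    """
    return width * height * bytes_per_pixel / 2**20


# Assumed resolutions for "1080p" and "4K" (UHD).
footprint_1080p = frame_footprint_mib(1920, 1080)
footprint_4k = frame_footprint_mib(3840, 2160)

print(f"1080p: ~{footprint_1080p:.1f} MB")  # ~5.9 MB
print(f"4K:    ~{footprint_4k:.1f} MB")     # ~23.7 MB
```

Under these assumptions, a single frame already occupies several megabytes, which is consistent with the point that a frame cannot be held entirely in a typical on-chip memory.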
[Figure: savings over FPGA and savings over GPU, plotted for RC, Elephant, NYC, Rhino, Paris, and Venice]
▸ Xilinx Zynq UltraScale+ MPSoC ZCU104
▸ Pascal GPU on the Nvidia Jetson TX2
▸ Real User Trace Evaluation
▸ Baseline: Original algorithm implemented on GPU and FPGA