Slide 26
10/13/2018 PyConZA_OpenCL_Talk slides
                                  flags=pyopencl.mem_flags.READ_ONLY,
                                  size=b.nbytes)
c_nvidia_buffer = pyopencl.Buffer(nvidia_context,
                                  flags=pyopencl.mem_flags.WRITE_ONLY,
                                  size=c.nbytes)
In [9]: nvidia_queue = pyopencl.CommandQueue(nvidia_context)
        input_tuples = ((a, a_nvidia_buffer), (b, b_nvidia_buffer), )
        output_tuples = ((c, c_nvidia_buffer),)
        run_ocl_kernel(nvidia_queue, nvidia_program.sum, (N,), input_tuples, output_tuples)
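The kernel run above writes its result into c. Assuming the `sum` kernel adds a and b element by element (its source is not shown on this slide), a host-side check might look roughly like the sketch below. `check_sum_sketch` is a hypothetical stand-in, not the talk's `check_sum_results`, and plain lists stand in for the host arrays.

```python
import math


def check_sum_sketch(a, b, c, rel_tol=1e-6):
    """Hypothetical stand-in for check_sum_results (the talk's version is not shown).

    Returns True if every c[i] is close to a[i] + b[i].
    """
    if not (len(a) == len(b) == len(c)):
        return False
    return all(
        math.isclose(x + y, z, rel_tol=rel_tol, abs_tol=1e-9)
        for x, y, z in zip(a, b, c)
    )


# Plain lists stand in for the host arrays (assumption for this sketch).
a = [1.0, 2.0, 3.0]
b = [0.5, 0.25, 0.125]
c = [x + y for x, y in zip(a, b)]  # what a correct device result would contain
ok = check_sum_sketch(a, b, c)
print("result matches!" if ok else "result does NOT match")
```

A tolerance-based comparison is used here because device and host floating-point results are not guaranteed to agree bit for bit.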
In [10]: check_sum_results(a, b, c)
result matches!
In [11]: %timeit run_ocl_kernel(nvidia_queue, nvidia_program.sum, (N,), input_tuples, output_tuples)
2.65 ms ± 507 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
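To read the timing line: `%timeit` repeats the statement in batches and reports the mean and standard deviation of the per-loop time across runs. The sketch below approximates that summary with the standard library only. `summarize_timing` is a hypothetical helper, and the workload is a pure-Python stand-in, not the OpenCL kernel, so its numbers are not comparable to the 2.65 ms above.

```python
import statistics
import timeit


def summarize_timing(stmt, repeat=7, number=100):
    """Time `stmt` (a zero-argument callable) roughly the way %timeit reports it.

    Returns (mean, std_dev) of the per-loop time in seconds across `repeat` runs.
    """
    totals = timeit.repeat(stmt, repeat=repeat, number=number)
    per_loop = [t / number for t in totals]  # each run executes `number` loops
    return statistics.mean(per_loop), statistics.pstdev(per_loop)


# Stand-in workload (NOT the OpenCL kernel): element-wise sum of two lists.
a = list(range(1_000))
b = list(range(1_000))
mean_s, std_s = summarize_timing(lambda: [x + y for x, y in zip(a, b)])
print(f"{mean_s * 1e3:.3f} ms ± {std_s * 1e6:.0f} µs per loop "
      f"(mean ± std. dev. of 7 runs, 100 loops each)")
```

A standard deviation that is large relative to the mean, as in the measurement above (507 µs against 2.65 ms), suggests noticeable run-to-run variation, so single timings should be compared with care.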
How to Manipulate Memory