Java so go through each pixel one at a time. • Even with threading in Java, it still takes seconds, which makes for poor UX. • Start thinking about NDK/JNI/C to get a performance boost • Discover RenderScript
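The per-pixel Java approach mentioned above can be sketched roughly as follows. This is only an illustration, not the presenter's code: the grayscale filter, the class name `PixelLoop`, and the integer luma weights are assumptions, and the pixels are assumed to be packed ARGB ints of the kind `Bitmap.getPixels` fills in.

```java
public final class PixelLoop {
    private PixelLoop() {}

    /** Converts one packed ARGB pixel to gray, keeping its alpha. */
    static int toGray(int argb) {
        int a = (argb >>> 24) & 0xFF;
        int r = (argb >>> 16) & 0xFF;
        int g = (argb >>> 8) & 0xFF;
        int b = argb & 0xFF;
        // Assumed integer luma weights (77 + 150 + 29 = 256), so >> 8 divides by 256.
        int y = (77 * r + 150 * g + 29 * b) >> 8;
        return (a << 24) | (y << 16) | (y << 8) | y;
    }

    /** Visits every pixel one at a time on the calling thread. */
    static int[] grayscale(int[] pixels) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = toGray(pixels[i]);
        }
        return out;
    }
}
```

The loop does a fixed amount of work per pixel on one core, so the running time grows linearly with the pixel count, which is why a multi-megapixel photo can take seconds.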
at the native level that you write in C (C99 standard). RenderScript gives your apps the ability to run operations with automatic parallelization across all available processor cores. It also supports different types of processors such as the CPU, GPU or DSP.
Android team for Live Wallpaper • 3.x – RenderScript Compute and RenderScript Graphics officially available • 4.0 – RenderScript available in Emulator • 4.1 – RenderScript Graphics deprecated – Floating Point pragmas, – __attribute__((kernel)) custom root func
- image processing functions (tuned for device) • ScriptGroups – chain related RenderScript scripts into one call • FilterScript – even more restricted API for more platforms (available in ADT 21.0.x)
does cost 352 KB to 665 KB per architecture – total base cost of around 1.6 MB • Gather read / Scatter write support – rsSetElementAt_<type> • Bounds checking flags • YUV Allocations
for the 50MB apk limit) • Easy parallelism from Java, binding done via reflection • Performance for certain class of problems (Image processing, codecs, audio, physics/modeling) • Optimizations occur on device
level – Compatibility Library reduces impact • Not using existing GPU compute standards (OpenCL and CUDA) • Few, if any, third-party libraries • Debugging limited to log printing (rsDebug), and not always followed [Samsung Note - ICS] • Documentation limited, but getting better [use the Source, Luke! Or Stack Overflow]
there. Focused on Image Processing (Computational Photography) • Honeycomb through Jelly Bean now up to 88%! • Compatibility Library covers the rest • Intel on tablets: Acer, Asus, Toshiba, etc. • Imagination Technologies (PowerVR) owns MIPS (see random Chinese MIPS tablet)
Desktop • Desktop has GPU with 10-15x FLOPs of single CPU and 4-6x bandwidth. • Mobile has GPU with 3-5x FLOPs of CPU and no bandwidth advantage. • Desktop has fewer system architectures • Mobile has many (n CPU / n GPU) with high variability
equal to the number of the CPUs available • java.util.concurrent.Executors • java.util.concurrent.CompletionService • Use cs.take().get() once per submitted task to wait for all threads to complete
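A minimal sketch of the thread-pool pattern listed above, assuming the work is an in-place per-pixel operation on a packed ARGB `int[]`. The class name `StripedFilter`, the `invert` operation, and the strip-splitting scheme are hypothetical choices for illustration; only `Executors`, `CompletionService`, and `cs.take().get()` come from the slide.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class StripedFilter {
    private StripedFilter() {}

    /** Hypothetical per-pixel operation: invert RGB, keep alpha. */
    static int invert(int argb) {
        return (argb & 0xFF000000) | (~argb & 0x00FFFFFF);
    }

    /** Applies invert() in place, one contiguous strip of pixels per task. */
    static void invertParallel(final int[] pixels) {
        // Pool size equal to the number of available CPUs.
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CompletionService<Integer> cs = new ExecutorCompletionService<Integer>(pool);
        try {
            int strip = (pixels.length + threads - 1) / threads; // ceiling division
            int submitted = 0;
            for (int start = 0; start < pixels.length; start += strip) {
                final int from = start;
                final int to = Math.min(pixels.length, start + strip);
                cs.submit(() -> {
                    for (int i = from; i < to; i++) {
                        pixels[i] = invert(pixels[i]);
                    }
                    return to - from; // number of pixels processed by this task
                });
                submitted++;
            }
            // One take().get() per submitted task blocks until all have finished.
            for (int i = 0; i < submitted; i++) {
                cs.take().get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task writes to a disjoint range of the array, so no locking is needed, and `get()` on each completed future makes the workers' writes visible to the caller.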
optional • Qualcomm has their own LLVM for NDK • Intel has their own support tools (was Beacon Mountain) • NDK provides good baseline performance if written well, but options exist when needed (NEON/SSE)
- default is double, use float (3.0f vs 3.0 [double]) • #pragma rs_fp_relaxed – relaxed fp rounding and allows for NEON • Use built-in functions where possible • Use convert_* to pack/unpack pixels when range rescale is not needed
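The distinction between unpacking a pixel with and without range rescaling can be shown with a plain-Java analogy. This is not RenderScript code and none of these names are RenderScript APIs: `PixelUnpack`, `unpackRaw`, `unpackNormalized`, and `packRaw` are hypothetical helpers. `unpackRaw` mirrors a plain type conversion that leaves channels in 0..255, while `unpackNormalized` adds the rescale to 0..1 that can be skipped when the math does not need it.

```java
public final class PixelUnpack {
    private PixelUnpack() {}

    /** Unpacks ARGB into {r, g, b, a} floats, left in the 0..255 range (no rescale). */
    static float[] unpackRaw(int argb) {
        return new float[] {
            (argb >>> 16) & 0xFF,
            (argb >>> 8) & 0xFF,
            argb & 0xFF,
            (argb >>> 24) & 0xFF
        };
    }

    /** Unpacks ARGB into {r, g, b, a} floats rescaled to 0..1 (one extra divide per channel). */
    static float[] unpackNormalized(int argb) {
        float[] c = unpackRaw(argb);
        for (int i = 0; i < c.length; i++) {
            c[i] = c[i] / 255.0f; // float literal keeps the arithmetic in single precision
        }
        return c;
    }

    /** Packs {r, g, b, a} floats in 0..255 back into ARGB, rounding and clamping each channel. */
    static int packRaw(float[] rgba) {
        int r = clamp(Math.round(rgba[0]));
        int g = clamp(Math.round(rgba[1]));
        int b = clamp(Math.round(rgba[2]));
        int a = clamp(Math.round(rgba[3]));
        return (a << 24) | (r << 16) | (g << 8) | b;
    }

    private static int clamp(int v) {
        return Math.max(0, Math.min(255, v));
    }
}
```

A filter such as a channel swap or a fixed offset works directly on the 0..255 values, so the normalized form only pays off when the arithmetic expects unit-range inputs.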
when script is loaded • .destroy() on RenderScript object to release native memory early and reduce memory pressure • Use invoke_* for single-threaded code
Language – Requires OpenGL ES 2.0 – Code is a text file that is compiled by your Graphics Driver • WebView/JavaScript – Currently single-threaded – Chrome/Blink team wants multithreading – Firefox nightly has asm.js, which lets C code be compiled into JavaScript; Chrome team has expressed interest
do the processing • Render to either a GLSurfaceView (API 3) or TextureView (API 14) • Shaders are universal cross platform • Shaders are text files which are compiled by your Graphics Driver • Project: android-gpuimage
(everything is a pixel) • Always uses GPU regardless of CPU/GPU architecture. Perhaps better done on octo-core CPU or specialty DSP for camera functions.
Debugging capabilities • More documentation from Google • 3rd Party Libraries (may depend on API level) • Focused on image processing – looking for other use cases
(video/slides available) http://llvm.org/devmtg/2011-11/ • Jeff Sharkey's talk at Google IO 2012 https://developers.google.com/events/io/sessions/gooio2012/103/
on iOS-GPUImage) https://github.com/CyberAgent/android-gpuimage • Papers on RenderScript Flock/Boids in Renderscript http://sbgames.org/sbgames2012/proceedings/papers/computacao/comp-full_11.pdf Performance comparison http://www.cs.vu.nl/~rkemp/papers/kemp-me2013.pdf
feedback. The Android robot is reproduced or modified from work created and shared by Google and used according to terms described in the Creative Commons 3.0 Attribution License.