
Droidcon NYC 2014: Image Filtering Options on Android

Droidcon NYC 2014 talk on Image Filtering Options on Android

Morrison Chang

September 20, 2014

Transcript

  1. Android Image Filtering Options Droidcon NYC 2014 Sept 20, 2014

    Morrison Chang Twitter @codeledger Github: https://github.com/Codeledger
  2. The Highlights • What is Image Filtering • Basics •

    Performance • RenderScript • ScriptIntrinsics • OpenGL ES based solutions • References
  3. What do I mean by Image Filtering • Transform an

    existing image in its alpha and color channels. • Could use Porter-Duff if mapping image is of same size
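A minimal sketch of what "transforming an image in its alpha and color channels" can look like in plain Java, assuming pixels are packed as 32-bit ARGB integers (the layout of Android's ARGB_8888 bitmaps). The class and method names are hypothetical and not from the talk.

```java
// Hypothetical helper; names are illustrative and not from the talk.
final class ChannelFilter {
    // Inverts the color channels of one packed ARGB pixel and leaves alpha untouched.
    static int invert(int argb) {
        int a = (argb >>> 24) & 0xFF;
        int r = (argb >>> 16) & 0xFF;
        int g = (argb >>> 8) & 0xFF;
        int b = argb & 0xFF;
        return (a << 24) | ((255 - r) << 16) | ((255 - g) << 8) | (255 - b);
    }

    // Applies the per-pixel transform to a whole image stored as a flat array.
    static int[] apply(int[] pixels) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = invert(pixels[i]);
        }
        return out;
    }
}
```

On Android the flat array would typically come from Bitmap.getPixels() and go back through Bitmap.setPixels().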
  4. Make me an Instagram Filter • Use your favorite search

    engine • Realize that there is an android.media.effect package available since API 14 (Ice Cream Sandwich)
  5. android.media.effect • Lots of pre-builts in EffectFactory • Uses OpenGL ES

    2.0 / GPU under the hood • Camera and Video ready • Extendable? Perhaps with GL shader language knowledge
  6. Make me an Instagram Filter • Continue searching with your

    favorite search engine • Find recipes in Java http://www.jhlabs.com/ip/blurring.html
  7. Problems with Performance • Implemented as a single thread in

    Java, so it goes through each pixel one at a time. • Even with threading in Java it still takes seconds, which makes for poor UX. • Start thinking about NDK/JNI/C to get performance boost • Discover RenderScript
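To illustrate why the pixel-at-a-time approach is slow, here is a hedged sketch of a single-threaded 3x3 box blur in plain Java. It is not the jhlabs code; the class name, the edge-clamping choice, and the flat ARGB array layout are assumptions made for illustration. Every output pixel reads nine input pixels, so the work grows as width x height x 9 on one core.

```java
// Hypothetical single-threaded blur; not taken from the talk or from jhlabs.
final class BoxBlur {
    // Blurs a width x height packed-ARGB image with a 3x3 box kernel,
    // clamping coordinates at the image edges.
    static int[] blur(int[] src, int width, int height) {
        int[] dst = new int[src.length];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int a = 0, r = 0, g = 0, b = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    int yy = Math.min(height - 1, Math.max(0, y + dy));
                    for (int dx = -1; dx <= 1; dx++) {
                        int xx = Math.min(width - 1, Math.max(0, x + dx));
                        int p = src[yy * width + xx];
                        a += (p >>> 24) & 0xFF;
                        r += (p >>> 16) & 0xFF;
                        g += (p >>> 8) & 0xFF;
                        b += p & 0xFF;
                    }
                }
                // Average the nine samples per channel and repack.
                dst[y * width + x] = ((a / 9) << 24) | ((r / 9) << 16) | ((g / 9) << 8) | (b / 9);
            }
        }
        return dst;
    }
}
```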
  8. What is RenderScript RenderScript is a high performance computation API

    at the native level that you write in C (C99 standard). RenderScript gives your apps the ability to run operations with automatic parallelization across all available processor cores. It also supports different types of processors such as the CPU, GPU or DSP.
  9. Goals of RenderScript • Portability – Can run on ARM,

    x86, MIPS, GPU (Nexus 10) • Performance – Easy Parallelization across cores • Usability – No handcrafted JNI required, auto-generated Java binding code • Hide the complexity of hardware architectures
  10. RenderScript Components • Build time Compiler (llvm-rs-cc) – converts script

    code to portable bitcode and Java bindings • Runtime JIT Compiler (libbcc) – translates bitcode into machine code (ARM/x86/DSP) • Runtime library support (librs) – manage script from Dalvik, support libraries – code tuned for (ARM/x86/DSP)
  11. Quick History of RenderScript • 2.x – Internal Only for

    Android team for Live Wallpaper • 3.x – RenderScript Compute and RenderScript Graphics officially available • 4.0 – RenderScript available in Emulator • 4.1 – RenderScript Graphics deprecated – Floating Point pragmas – __attribute__((kernel)) custom root func
  12. Quick History of RenderScript 4.2+ (API 17+) • Script Intrinsics

    - image processing functions (tuned for device) • ScriptGroups – chain related RenderScript scripts into one call • FilterScript – even more restricted API for more platforms (available in ADT 21.0.x)
  13. Quick History of RenderScript • Compatibility Library for Gingerbread –

    costs 352 KB to 665 KB per architecture – total base cost of around 1.6 MB • Gather read / Scatter write support – rsSetElementAt_<type> • Bounds checking flags • YUV Allocations
  14. Advantages • No need to recompile for various platforms (good

    for the 50MB apk limit) • Easy parallelism from Java, binding done via reflection • Performance for certain class of problems (Image processing, codecs, audio, physics/modeling) • Optimizations occur on device
  15. Disadvantages • Android only API • Variations based on API

    level – Compatibility Library reduces impact • Not using existing GPU compute standards (OpenCL and CUDA) • Few, if any, third-party libraries • Debugging limited to log printing (rsDebug), and even that is not always honored [Samsung Note - ICS] • Documentation limited, but getting better [use the Source, Luke! Or StackOverflow]
  16. Rationale - Why Bother? • If you need it, it's

    there. Focused on Image Processing (Computational Photography) • Honeycomb through Jelly Bean now up to 88%! • Compatibility Library covers the rest • Intel on tablets: Acer, Asus, Toshiba, etc. • Imagination Technologies (PowerVR) owns MIPS (see random Chinese MIPS tablet)
  17. Why not OpenCL or CUDA? • Mobile is different than

    Desktop • Desktop has GPU with 10-15x FLOPs of single CPU and 4-6x bandwidth. • Mobile has GPU with 3-5x FLOPs of CPU and no bandwidth advantage. • Desktop has fewer system architectures • Mobile has many (n CPU / n GPU) with high variability
  18. What tools are available on Android? • Java / Threads

    • Native / JNI / C/C++ / pthreads • Renderscript • OpenGL ES Shader Language • Webview / Javascript
  19. ScriptIntrinsics • Prebuilt filters for image processing • New ones

    added in API 18, 19, and 20 • Base set available in RenderScript Compatibility Library
  20. RS Compare app • Image processing example • Makes use

    of gather reads so fits into SIMD profile. • Code is on github at: https://github.com/codeledger/RSSampleCompare
  21. Try Concurrent Threads • Split up the task into parts,

    equal to the number of CPUs available • java.util.concurrent.Executors • java.util.concurrent.CompletionService • Use cs.take().get() to wait for all threads to complete
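The threading recipe on this slide might be sketched as follows in plain Java. The image is split into horizontal strips, one per available processor, each strip is submitted to a CompletionService, and cs.take().get() is called once per submitted task to wait for all of them. The class name, the strip layout, and the invert filter are assumptions chosen to keep the example short; they are not from the RS Compare app.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example; names and the per-pixel filter are illustrative only.
final class StripedFilter {
    // Inverts the color channels of a packed-ARGB image using one strip per CPU.
    static int[] invertParallel(int[] src, int width, int height) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int workers = Math.max(1, Math.min(height, cpus));
        int rowsPerStrip = (height + workers - 1) / workers;
        int[] dst = new int[src.length];
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CompletionService<Integer> cs = new ExecutorCompletionService<>(pool);
        try {
            int submitted = 0;
            for (int yStart = 0; yStart < height; yStart += rowsPerStrip) {
                final int from = yStart;
                final int to = Math.min(height, yStart + rowsPerStrip);
                // Each task writes only to its own rows, so no locking is needed.
                cs.submit(() -> {
                    for (int i = from * width; i < to * width; i++) {
                        dst[i] = src[i] ^ 0x00FFFFFF; // flip R, G, B; keep alpha
                    }
                    return to - from;
                });
                submitted++;
            }
            // One take().get() per task blocks until every strip has finished.
            for (int i = 0; i < submitted; i++) {
                cs.take().get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while filtering", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("strip task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return dst;
    }
}
```

Future.get() also provides the memory-visibility guarantee that the calling thread sees the rows written by the worker threads.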
  22. Tuning JNI/C/C++ • NDK has gcc compiler standard and llvm

    optional • Qualcomm has their own llvm for NDK • Intel has their own support tools (was Beacon Mountain) • NDK provides good baseline performance if written well, but options exist when needed (NEON/SSE)
  23. Script • .rs (a C99 file), may have .rsh (header

    file) • Has root method • ScriptC_[name of rs file] in Java
  24. Memory • Shared between Dalvik and RenderScript • android.renderscript.Allocation •

    Declared in Java code • Different Types available – Script – IO on SurfaceTexture (API 17) – Graphics
  25. API • Memory allocation mSourceAllocation = Allocation.createFromBitmap(mRS, mBitmapIn, Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);

    • Root function void root(const uchar4 *v_in, uchar4 *v_out, const void *usrData, uint32_t x, uint32_t y) {..} • ForEach rsForEach(gScript, gIn, gOut,0,0); • Memory Access uchar4 * rgb_ref = (uchar4 *) rsGetElementAt(gIn,ix,iy);
  26. Be aware of • Only use the precision you need

    - unsuffixed floating-point literals are double, so use float (3.0f vs 3.0 [double]) • #pragma rs_fp_relaxed – relaxed fp rounding and allows for NEON • Use built in functions where possible • Use convert_* to pack/unpack pixel when range rescale not needed
  27. Tips • rsDebug() – write to logcat • Debugging properties

    set via adb – debug.rs.max-threads: force CPU worker pool size – debug.rs.default-driver-cpu: use reference implementation
  28. Tips • Use void init() for one time initialization, called

    when script is loaded • .destroy() on RenderScript object to release native memory early and reduce memory pressure • Use invoke_* for single thread code
  29. Other ways to do Image Processing • OpenGL ES Shader

    Language – Requires OpenGL ES 2.0 – Code is a text file that is compiled by your Graphics Driver • WebView/Javascript – Currently Single Threaded – Chrome/Blink team want multithread – Firefox nightly has asm.js which turns C code into Javascript, Chrome team has expressed interest
  30. OpenGL ES 2.0 • Use OpenGL ES Shading Language to

    do the processing • Render to either a GLSurfaceView (API 3) or TextureView (API 14) • Shaders are universal cross platform • Shaders are text files which are compiled by your Graphics Driver • Project: android-gpuimage
  31. OpenGL Shader tradeoffs • Language syntax • Shader pipeline model

    (everything is a pixel) • Always uses GPU regardless of CPU/GPU architecture. Perhaps better done on octa-core CPU or specialty DSP for camera functions.
  32. Future • More devices with Compute on GPU • More

    Debugging capabilities • More documentation from Google • 3rd Party Libraries (may depend on API level) • Focused on image processing – looking for other use cases
  33. Resources • Book: Pro Android Apps Performance Optimization (Professional Apress)

    by Hervé Guihot – has a chapter • Android Developer Blog posts: http://android-developers.blogspot.com/2012/01/levels-in-renderscript.html
  34. Resources • Android Renderscript talk at LLVM 2011-11 Developer Meeting

    (video/slides available) http://llvm.org/devmtg/2011-11/ • Jeff Sharkey's talk at Google IO 2012 https://developers.google.com/events/io/sessions/gooio2012/103/
  35. Resources • Google IO 2013 – High Performance Applications in

    RenderScript http://www.youtube.com/watch?v=uzBw6AWCBpU • GDC 2013 Renderscript presentation http://www.youtube.com/watch?v=gu1jwNuMv1A
  36. Resources • Implementation of various image filters in RenderScript by

    Cesar Aguilar https://github.com/caguilar187/RSImage • AOSP Tests https://android.googlesource.com/platform/frameworks/base/+/master/tests/RenderScriptTests • Older samples http://code.google.com/p/renderscript-examples/
  37. Resources • Android Library Project using OpenGL ES 2.0 (based

    on iOS-GPUImage) https://github.com/CyberAgent/android-gpuimage • Papers on RenderScript – Flock/Boids in Renderscript http://sbgames.org/sbgames2012/proceedings/papers/computacao/comp-full_11.pdf – Performance comparison http://www.cs.vu.nl/~rkemp/papers/kemp-me2013.pdf
  38. Q&A

  39. Thanks For Attending Please vote on the session and provide

    feedback. The Android robot is reproduced or modified from work created and shared by Google and used according to terms described in the Creative Commons 3.0 Attribution License.