Datascience from the browser with WebAssembly (and other web standards)

Magellium
December 11, 2017

A strong trend in data science is the shift of computation to the user's device, driven mainly by cost and confidentiality constraints. Thanks to new standards, the web platform is an important vector of this decentralization. One of these standards is WebAssembly, a kind of assembly language that runs in the browser.

Of course, data scientists, with their heavy computing needs, can benefit from this technology.

We present experiments carried out by Magellium during a study conducted for CNES: in particular, aircraft detection by machine learning inference and cloud detection by deep learning inference.

These slides were presented for TDS #25 https://www.meetup.com/fr-FR/Tlse-Data-Science/events/245536866/

Videos:
- Demo of a WebAssembly tool for viewing and processing remote sensing images: https://youtu.be/s5hpaZIEHV8
- Demo of in-browser JPEG 2000 viewing using WebAssembly: https://youtu.be/5U0kMGI5X00

Blog post: http://deeplearning.magellium.fr/conference/2017/12/11/Datascience_from_the_browser_with_WebAssembly.html

Transcript

  1. PUTTING KNOWLEDGE ON THE MAP. Datascience from the browser with WebAssembly (and other web standards). Nicolas Decoster (@nnodot), Sébastien Bosch (@seeb0h).
  2. Who are we? Sébastien Bosch (@seeb0h): engineer; geomatics/maps, spatial and aerial imagery, robotics, ML, coding, R&D, mgmt...
  3. WHY: Confidentiality, Latency. Powers of 10: 0.1 second: reacting instantly; 1.0 second: user’s flow of thought; 10 seconds: keeping the user’s attention. @anirudhkou [Miller 1968; Card et al. 1991; Jakob Nielsen 1993]
  4. WHY: Confidentiality, Latency, Cost. Voice search: 2 trillion Google searches/year (2e12) [searchengineland.com]; 20-50% of searches without a screen by 2020 [Mediapos] [comscore]. Let’s say 1 s CPU time/search at $0.033/h [n1-standard-1, Google Cloud]: $11 000 000/year.
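The back-of-the-envelope estimate on slide 4 can be sketched as below. This is only an illustration: the function name `yearlyCpuCostUsd` and the `share` parameter are hypothetical, and the transcript does not say which share of the searches or which exact rate produced the $11 000 000 figure, so the sketch simply exposes them as inputs.

```typescript
// Back-of-the-envelope server cost, using the figures quoted on the slide.
// Assumption: one search keeps one vCPU busy for `cpuSecondsPerSearch` seconds.
function yearlyCpuCostUsd(
  searchesPerYear: number,
  cpuSecondsPerSearch: number,
  usdPerCpuHour: number,
  share: number = 1.0, // hypothetical: fraction of searches processed server-side
): number {
  const cpuHours = (searchesPerYear * share * cpuSecondsPerSearch) / 3600;
  return cpuHours * usdPerCpuHour;
}

const allSearches = yearlyCpuCostUsd(2e12, 1, 0.033);
const halfOfSearches = yearlyCpuCostUsd(2e12, 1, 0.033, 0.5);
console.log(Math.round(allSearches), Math.round(halfOfSearches));
```

With these inputs the result is on the order of ten million dollars per year, the same order of magnitude as the figure on the slide.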
  5. Mobile (WHY / WHO / HOW). Main / Others: BNNS (Basic Neural Network Subroutines), MPSCNN (Metal Performance Shaders CNN).
  6. Mobile (WHY / WHO / HOW). Main / Others: BNNS (Basic Neural Network Subroutines), MPSCNN (Metal Performance Shaders CNN). Inference.
  7. Mobile (WHY / WHO / HOW). Main / Others: BNNS (Basic Neural Network Subroutines): CPU (no transfer to GPU memory; may be suitable).
  8. Mobile (WHY / WHO / HOW). Main: Qualcomm Snapdragon Neural Processing Engine (NPE). Others: CNNDroid, DeepLearningKit. NPE: Caffe/TF/ONNX on 50% of devices; GPU/CPU/DSP optimizations.
  9. Browser (WHY / WHO / HOW). Run CNNs: WebDNN (WebGL/WA/WebGPU), Keras-JS (CPU/WebGL), Tensorfire (WebGL), MXNetJS, CaffeJS. Run/Train CNNs: DeepLearn.js (WebGL), ConvNetJS. Run/Train LSTMs: Brain.js, Synaptic.js. Train and Run NNs: Mind.js, DN2A. dmlc tree-lite.io.
  10. WebAssembly for EO Data Valorization: running (legacy) time-consuming processing in the browser. Nicolas Decoster, Julien Gaucher (Magellium); Julien Nosavan (CNES).
  11. WebAssembly for EO Data Valorization: running (legacy) time-consuming processing in the browser. Nicolas Decoster, Julien Gaucher (Magellium); Julien Nosavan (CNES). Big Data from Space 2017 conference, Toulouse.
  12. WebAssembly? WebAssembly, or wasm, is a kind of bytecode for the browser: fast and secure; a compilation target for (legacy) code.
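To make "bytecode for the browser" concrete, here is a minimal sketch that loads a hand-written wasm binary through the standard `WebAssembly` JavaScript API. The 41-byte module and its `add` export are made up for illustration (they are not from the talk); real modules are normally produced by a compiler such as emscripten rather than typed by hand.

```typescript
// A tiny wasm module exporting add(a: i32, b: i32): i32, written out byte by byte.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is fine for a module this small; larger modules
// would typically go through WebAssembly.instantiateStreaming instead.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});
const add = wasmInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 3));
```

From the JavaScript side the exported function is called like any other function; the browser validates the module before running it.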
  13. For earth and space data web platforms: WebAssembly is a new technology that opens new doors for online services that manage data.
  14. For earth and space data web platforms: Earth and space observation produce data with a great variety of natures, usages and processing; they can greatly benefit from WebAssembly.
  15. Origins: then starts the battle for the fast, full-featured language for the web. Flash, Java, ActiveX, NaCl, Silverlight, Dart... all failed: security problems, low adoption, not standard, vendor lock-in, etc.
  16. Origins: 2012, toy project by Alon Zakai: a tool to compile C code into JavaScript.
  17. Origins: asm.js is the technology everybody was waiting for. Mozilla asks other actors to start working together on this basis.
  18. Origins: asm.js is the technology everybody was waiting for. Mozilla asks other actors to start working together on this basis. WebAssembly was born.
  19. Origins: asm.js is the technology everybody was waiting for. Mozilla asks other actors to start working together on this basis. WebAssembly was born (2015).
  20. WebAssembly is a new Web standard, usable now, backed by Microsoft, Google, Mozilla, Apple!
  21. WebAssembly is a new Web standard, usable now, backed by Microsoft, Google, Mozilla, Apple! Credits: @linclark, code-cartoons.com
  22. [Emscripten] manages lots of things for you: compilation from C/C++, JavaScript glue code, standard library (pseudo file system, stdio…), canvas and WebGL bindings, WebAssembly generation.
  23. [Emscripten] manages lots of things for you: compilation from C/C++, JavaScript glue code, standard library (pseudo file system, stdio…), canvas and WebGL bindings, WebAssembly generation, etc.
  24. Our experiments, an image viewer: a proof-of-concept image viewer with (legacy) processing, 100% in the browser.
  25. Our experiments, an image viewer: OpenLayers-based viewer; simple cloud detection in wasm; legacy plane detection with the ICF algorithm from the CCV library in wasm; 100% in the browser.
  26. Our experiments, an image viewer: ICF algorithm speed is OK for use in the browser; a pure JavaScript version of ICF would have been very slow; the native binary of ICF is 2x faster.
  27. Our experiments, an image viewer: in the previous example, we displayed a “small” image (20 Mb) in a format known by the browser (PNG).
  28. Our experiments, an image viewer: in the previous example, we displayed a “small” image (20 Mb) in a format known by the browser (PNG). What if we want to manipulate a very big image (say, 3 Gb) in an exotic format (say, JPEG 2000)?
  29. JPEG 2000 experiments: compile OpenJPEG to wasm; create an OpenLayers layer that calls it; it works on small JP2 files; it is slow but might be usable on some computers, and might be faster with the new OpenJPEG version.
  30. JPEG 2000 experiments: compile Kakadu to wasm; create an OpenLayers layer that calls it; it works on small JP2 files; it is fast!
  31. JPEG 2000 experiments: compile Kakadu to wasm; create an OpenLayers layer that calls it; it works on very big JP2 files (3 Gb); it is relatively slow…
  32. JPEG 2000 experiments: compile Kakadu to wasm; create an OpenLayers layer that calls it; it works on very big JP2 files (3 Gb); it is relatively slow… But, with some tricks at the emscripten level, and with help from some other web standards, it becomes usable.
  33. JPEG 2000 experiments: compile Kakadu to wasm; create an OpenLayers layer that calls it; executed in a web worker; can’t store all the content in memory: use the File API of the browser, which allows chunk access from web workers.
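The chunk access mentioned on slide 33 can be sketched with `Blob.slice`, which the File API provides on `File` objects and which is also available inside web workers. This is a minimal illustration, not the code used in the experiments: `chunkRanges` and `readChunk` are hypothetical helpers, and how the bytes are handed to the wasm decoder is left out.

```typescript
// Byte range [start, end) of one chunk of a file.
interface ByteRange {
  start: number;
  end: number;
}

// Split a file of `fileSize` bytes into consecutive ranges of at most `chunkSize` bytes.
function chunkRanges(fileSize: number, chunkSize: number): ByteRange[] {
  const ranges: ByteRange[] = [];
  for (let start = 0; start < fileSize; start += chunkSize) {
    ranges.push({ start, end: Math.min(start + chunkSize, fileSize) });
  }
  return ranges;
}

// Read a single range without loading the rest of the file in memory.
// `file` would typically be a File picked by the user and posted to the worker.
async function readChunk(file: Blob, range: ByteRange): Promise<Uint8Array> {
  const buffer = await file.slice(range.start, range.end).arrayBuffer();
  return new Uint8Array(buffer);
}

// Example: a 10-byte in-memory blob read in chunks of 4 bytes.
const demoBlob = new Blob([new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])]);
const demoRanges = chunkRanges(demoBlob.size, 4);
readChunk(demoBlob, demoRanges[2]).then((bytes) => console.log(Array.from(bytes)));
```

Because `slice` only describes a range, the browser reads the bytes when `arrayBuffer()` is called, so a multi-gigabyte file never has to sit in memory as a whole.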
  34. At Magellium we conduct some deep learning activities for some EO image use cases: boats, buildings, cars, clouds, land cover, planes…
  35. At Magellium we conduct some deep learning activities for some EO image use cases: boats, buildings, cars, clouds, land cover, planes… Why not try to execute the inference part of one of our existing neural networks in the browser with WebAssembly?
  36. Here are our experiments for in-browser deep learning inference using the WebDNN library with the WebAssembly backend.
  37. In-browser deep learning inference: use an existing U-Net trained with Keras; use all 13 bands of Sentinel 2. Figures: Sentinel 2 tile (223 x 223); thick cloud mask.
  38. In-browser deep learning inference: 17 seconds for one tile on an old laptop. Figures: Sentinel 2 tile (223 x 223); thick cloud mask.
  39. In-browser deep learning inference: our experiments are slow, but… can be faster with: tile parallelization with web workers; a smaller neural network; a fast modern computer; a WebGPU backend for WebDNN (significantly faster).
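The tile parallelization suggested on slide 39 could look like the sketch below: the image is cut into tiles and the tiles are dealt out to a pool of workers. The helper names, the 1000 x 1000 image and the pool of 4 workers are assumptions for illustration; only the 223 x 223 tile size and the 17 seconds per tile come from the slides, and the time estimate assumes ideal scaling, which real browsers will not reach.

```typescript
interface Tile {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Cut an image into tiles of at most tileSize x tileSize pixels (edge tiles may be smaller).
function tileGrid(imageWidth: number, imageHeight: number, tileSize: number): Tile[] {
  const tiles: Tile[] = [];
  for (let y = 0; y < imageHeight; y += tileSize) {
    for (let x = 0; x < imageWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, imageWidth - x),
        height: Math.min(tileSize, imageHeight - y),
      });
    }
  }
  return tiles;
}

// Deal tiles out round-robin; each batch would be posted to one web worker.
function assignToWorkers(tiles: Tile[], workerCount: number): Tile[][] {
  const batches: Tile[][] = Array.from({ length: workerCount }, () => [] as Tile[]);
  tiles.forEach((tile, i) => batches[i % workerCount].push(tile));
  return batches;
}

const tiles = tileGrid(1000, 1000, 223);
const batches = assignToWorkers(tiles, 4);
const secondsPerTile = 17; // figure quoted on slide 38
const longestBatch = Math.max(...batches.map((b) => b.length));
console.log(tiles.length * secondsPerTile, longestBatch * secondsPerTile);
```

The two printed numbers are the sequential estimate and the estimate for the busiest worker. A network with a fixed input size would also need the smaller edge tiles to be padded or overlapped, which this sketch does not handle.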
  40. In-browser deep learning inference: for our experiments we had to modify our U-Net network, because WebDNN does not support some kinds of layers.
  41. In-browser deep learning inference: for our experiments we had to modify our U-Net network, because WebDNN does not support some kinds of layers yet.
  42. But current WebGPU is not WebGPU… Current WebGPU is “WebMetal” (i.e. Apple only); actual WebGPU is a (near?) future web standard.
  43. But current WebGPU is not WebGPU… Current WebGPU is “WebMetal” (i.e. Apple only); actual WebGPU is a (near?) future web standard, but one can bet the performance will be the same.
  44. Browser (WHY / WHO / HOW). Run CNNs: WebDNN (WebGL/WA/WebGPU), Keras-JS (CPU/WebGL), Tensorfire (WebGL), MXNetJS, CaffeJS. Run/Train CNNs: DeepLearn.js (WebGL), ConvNetJS. Run/Train LSTMs: Brain.js, Synaptic.js. Train and Run NNs: Mind.js, DN2A. dmlc tree-lite.io.
  45. Our experiments: WebAssembly is a web technology; the browser is the main target, but it is not tied to the browser: “it is also desirable for it to be able to execute well in other environments, […] on servers in datacenters, […].” (webassembly.org)
  46. Our experiments, wasm on the server: imagine one is testing some algorithm in the browser on some data extract, changing some parameters or adding some preprocessing, and, once everything is fine, asking for exactly the same processing on the full data on the server.
  47. Our experiments, wasm on the server: we have experimented with this using our plane detection algorithm; to ease development we chose a serverless architecture.
  48. Our experiments, wasm on the server: it is a work in progress. So far: the wasm plane detection is deployed as a Google Cloud Function; an HTTP trigger and a PubSub trigger; results on a PubSub; a batch of lots of requests is working.
  49. Our experiments, wasm on the server: it is a work in progress. To do: integrate requests and results into our browser viewer; do the tiling of the image and send the requests; display the results as they come.
  50. HeapViz: data loading: fast; data parsing: fast; rendering: fast; layout: slow (circle packing).
  51. WebAssembly is a standard, usable now, with strong commitment from major web actors. Our experiments show that it is possible to successfully bring relatively complex legacy tools to the browser.
  52. WebAssembly is a standard, usable now, with strong commitment from major web actors. Our experiments show that it is possible to successfully bring relatively complex legacy tools to the browser. Our experiments show that this can be reasonably fast.
  53. WebAssembly is a standard, usable now, with strong commitment from major web actors. Our experiments show that it is possible to successfully bring relatively complex legacy tools to the browser. Our experiments show that this can be reasonably fast (wasm engines are very young, performance will improve).
  54. CNES will experiment with WebAssembly in the SPOT WORLD HERITAGE project to bring new features to users, like visualization of full-resolution images.
  55. Values are relative to clang (clang = 1.00; lower is better). Compiled with emscripten: wasm, asmjs, emjs (i.e. compiled with O1, which produces standard JavaScript). pojs (plain old JavaScript): JavaScript implementation of the algorithms. wasm, asmjs, gcc and clang: compiled with O3.

    algo                                  asmjs  clang  emjs   gcc   pojs   wasm
    bounding_box_10000                     2.83   1.00  1.31  0.49   1.66   1.13
    loop_sum_math_no_array_100             1.06   1.00  0.84  1.30   1.99   1.31
    loop_sum_math_no_array_200             1.10   1.00  0.85  1.31   2.04   1.33
    loop_sum_math_no_array_600             1.05   1.00  0.81  1.28   1.92   1.30
    loop_sum_math_array_10000              0.63   1.00  0.65  1.06   0.98   0.78
    loop_sum_math_array_20000              0.63   1.00  0.63  1.06   0.94   0.78
    loop_sum_math_array_40000              0.62   1.00  0.63  1.04   0.91   0.76
    image_convolution_const_128            1.65   1.00  1.71  0.98   7.86   1.01
    image_convolution_128                  1.74   1.00  1.72  0.97   7.75   0.98
    image_convolution_faster_128           1.39   1.00  1.41  1.21  27.36   0.74
    image_convolution_faster_double_128    0.89   1.00  0.89  0.99  21.70   0.55
    image_convolution_faster_256           1.37   1.00  1.38  1.24  28.06   0.76
    image_convolution_faster_512           1.36   1.00  1.38  1.24  26.90   0.75
    image_convolution_faster_1024          1.35   1.00  1.36  1.24  26.52   0.75
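As a reading aid for the benchmark figures on slide 55, the sketch below divides the pojs value by the wasm value for a few rows copied from the slide. Both values are relative to clang, so the clang baseline cancels out and the result is the pojs-to-wasm ratio for each benchmark. The `Row` type and the variable names are illustrative.

```typescript
interface Row {
  algo: string;
  pojs: number; // value relative to clang, copied from the slide
  wasm: number; // value relative to clang, copied from the slide
}

const rows: Row[] = [
  { algo: "bounding_box_10000", pojs: 1.66, wasm: 1.13 },
  { algo: "loop_sum_math_array_10000", pojs: 0.98, wasm: 0.78 },
  { algo: "image_convolution_128", pojs: 7.75, wasm: 0.98 },
  { algo: "image_convolution_faster_128", pojs: 27.36, wasm: 0.74 },
];

// Both columns share the clang baseline, so it cancels out in the ratio.
const ratios = rows.map((r) => ({ algo: r.algo, pojsOverWasm: r.pojs / r.wasm }));
for (const r of ratios) {
  console.log(`${r.algo}: ${r.pojsOverWasm.toFixed(1)}`);
}
```

The ratio stays close to 1 for the simple loops and grows by an order of magnitude for the convolution benchmarks, which is where the gap between plain JavaScript and wasm shows most clearly in these figures.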