Slide 1

Slide 1 text

Creating a serverless Python environment for scientific computing with WebAssembly, for data scientists and Python lovers
Jeongkyu Shin
Lablup Inc. / Google Developers Expert

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

[1] https://openwho.org/channels/covid-19-national-languages [2] https://www.youtube.com/watch?v=PjhoPEUcrmI?t=33

Slide 5

Slide 5 text

What I will talk about today
• Python
• Scientific computation / environments
• Computation resources
• Web and Mozilla
• Iodide Project
• App shells
• Demo
• What to do (with you)

Slide 6

Slide 6 text

Python
• Not only the greatest language for beginners, but also the greatest scientific language
  • Julia: Hello?
• Slow but fast
• GIL but scalable
• Community-driven
• Combines with many tools / frameworks / libraries
• Can bind anything written with a keyboard
[1] https://xkcd.com/353/

Slide 7

Slide 7 text

Scientific Computation
• How humankind consumes electricity:
  • Office work?
  • Games?
  • Adult videos?
  • No! Human beings use energy for scientific computation!
• From FORTRAN to Python
  • IMSL, BLAS, LAPACK, OpenMP
  • cuBLAS, cuLAPACK, cuSOLVER
  • GSL, ROOTS, Numerical Recipes
  • And NumPy / SciPy

Slide 8

Slide 8 text

Scientific environments
• Libraries
  • NumPy, SciPy, Pandas, Matplotlib, scikit-learn…
• Platforms
  • Anaconda, Canopy, ActivePython, PyIMSL, Python(x,y)
• Container images
  • MLWorkspace, Backend.AI Scientific Kernels

Slide 9

Slide 9 text

Burning fire:
• Complex computation resources
  • CPU, GPU, ASICs
  • Drivers, libraries
• Ultra-scale computation resources
  • GPU cloud
  • Distributed clusters
[1] https://cloud.google.com/blog/products/ai-machine-learning/cloud-tpu-pods-break-ai-training-records

Slide 10

Slide 10 text

Cutting a chicken with an oversized knife
• We do not need the nuke
  • e.g. machine learning study
  • A fraction of a GPU is enough (2GB)
  • In fact, there is no need to use a GPU for studying
• Wag the dog: scientific training workshop
  • Preparation: 2 hr.
  • Training: 4 hr.
[1] http://www.inven.co.kr/board/webzine/2097/1177426 (Before modification)
[2] https://www.yna.co.kr/view/GYH20090602001500044 (Note: the typo has now been corrected)

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

Back to the future: battlefield web
• Java / MS Java
• ActiveX
• Shockwave / Adobe Flash
• NaCl / PNaCl
  • For Chrome / ChromeOS extensions
• .EXE everywhere
• Facepalm
[1] https://en.wikipedia.org/wiki/Facepalm#/media/File:Paris_Tuileries_Garden_Facepalm_statue.jpg

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

Mozilla
• Firefox
• Rust
• WebAssembly (WASM) by W3C
  • Can be a compilation target for low-level languages
• Emscripten
  • LLVM-based toolchain for compiling to asm.js / WASM
• So, what can we do with this?

Slide 15

Slide 15 text

[1] https://www.pcgamesn.com/quake-live-ditching-web-browsers-standalone-client

Slide 16

Slide 16 text

Iodide Project
• An experimental tool for scientific communication and exploration on the web
  • Scientific computing
  • Data science
• Why no web tech. for scientific computing?
  • JavaScript in the early 21st century
  • Are you serious?
• Now: everyone uses Jupyter as the UI for the Python stack
https://github.com/iodide-project

Slide 17

Slide 17 text

Iodide to Pyodide
• Web-based scientific environment
  • Complete notebook
  • Visualization
• Everybody loves Python
  • Why don't we compile the scientific Python stack to WASM?
  • And run it in the JavaScript VM in the browser?
• ?!
• Does it work? Does it?
• It works!
https://github.com/iodide-project/pyodide

Slide 18

Slide 18 text

Pyodide: The Python science stack in the browser
• Created by Michael Droettboom
• Python runtime + scientific packages compiled to WASM
  • NumPy, SciPy, Pandas, Matplotlib…
• Pros
  • Instant, easy
  • Can combine magical ideas
• Cons
  • Big & slow

Slide 19

Slide 19 text

Pyodide: Packages (as of Aug. 2020; only common packages are listed)

Slide 20

Slide 20 text

Pyodide: Bootstrapping

    window.languagePluginUrl = 'pyodide/';

Runtime test:

    let resultPane = document.querySelector('#result-pane');
    languagePluginLoader.then(() => {
      resultPane.innerHTML = '';
      let result = pyodide.runPython(`
    import sys
    sys.version
    `);
      resultPane.innerHTML = result;
    });

Slide 21

Slide 21 text

Pyodide: Bootstrapping

    let resultPane = document.querySelector('#result-pane');
    languagePluginLoader.then(() => {
      resultPane.innerHTML = '';
      let result = pyodide.runPython(`
    import sys
    sys.version
    `);
      resultPane.innerHTML = result;
    });

• Note
  • Sometimes result will be 'undefined'
  • Timing issue (will be covered later)
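A note on the return value: runPython hands back the value of the snippet's final expression (sys.version in the slide). The slides do not show how this is implemented inside Pyodide, so the following is only a minimal CPython sketch of the general idea. The helper name run_and_return_last is hypothetical and is not part of any Pyodide API.

```python
import ast


def run_and_return_last(source, namespace=None):
    """Execute `source` and return the value of its final expression, if any."""
    namespace = {} if namespace is None else namespace
    tree = ast.parse(source, mode="exec")
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the trailing expression so it can be evaluated for its value.
        last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<cell>", "exec"), namespace)
        return eval(compile(last, "<cell>", "eval"), namespace)
    # The code ends with a statement (assignment, print call, ...): no value.
    exec(compile(tree, "<cell>", "exec"), namespace)
    return None


# The slide's snippet ends with a bare expression, so a value comes back.
print(run_and_return_last("import sys\nsys.version"))
# A snippet ending with a statement yields None instead.
print(run_and_return_last("x = 1"))
```

Under this reading, a snippet that ends with a statement has no value to return, which is worth keeping in mind alongside the timing issue mentioned in the note.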

Slide 22

Slide 22 text

Practical problems
• Everyday use
  • File system to store code / results
  • Python module loading: numpy, scipy, matplotlib
  • Web UI to work / study
• Stand-alone / portable
  • Iodide: written in Django
  • Needs installation on a server / web connections
• Stand-alone / portable solution
  • To make usage scenarios simple

Slide 23

Slide 23 text

Solutions for practical problems
• File system for runtime
  • Use BrowserFS (https://github.com/jvilk/BrowserFS)

    BrowserFS.install(window);
    BrowserFS.configure({ fs: "LocalStorage" }, function(e) {
      let fs = BrowserFS.BFSRequire('fs');
      fs.writeFileSync('/test.txt', 'Python+WebAssembly is Awesome!');
      languagePluginLoader.then(async () => {
        let FS = pyodide._module.FS;
        let PATH = pyodide._module.PATH;
        // Create an Emscripten interface for the BrowserFS
        let BFS = new BrowserFS.EmscriptenFS(FS, PATH);
        // Create mount point in Emscripten FS
        FS.createFolder(FS.root, 'data', true, true);
        // Mount BrowserFS into Emscripten FS
        FS.mount(BFS, {root: '/'}, '/data');
        // Open file in BrowserFS from Python and show contents
        let result = await pyodide.runPythonAsync(`
    import numpy as np
    import sys
    import glob
    import js
    print(sys.version)
    print(np.__version__)
    f = open('/data/test.txt')
    print(f.readline())
    `);
      });
    });

• Workflow
  • LocalStorage in this example
  • Use Emscripten for the WASM backend

Slide 24

Slide 24 text

Solutions for practical problems
• Module loading: use the Promise-ready running API
  • runPythonAsync automatically detects packages and dynamically imports them.

    let resultPane = document.querySelector('#result-pane');
    languagePluginLoader.then(async () => {
      resultPane.innerHTML = '';
      let result = await pyodide.runPythonAsync(`
    import numpy as np
    import sys
    print(sys.version)
    np.__version__
    `);
      // Only update the pane when the last expression produced a value
      if (typeof result !== "undefined") {
        resultPane.innerHTML = result;
      }
    });

Console log
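How the package detection could work: one plausible approach is to parse the Python source and collect the names of imported modules before running it, so the matching packages can be fetched first. The sketch below illustrates that idea in plain CPython. The function name find_imported_modules is hypothetical, and Pyodide's own detection logic is not shown in the slides and may differ.

```python
import ast


def find_imported_modules(source):
    """Return the sorted top-level module names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" depends on the top-level package "a".
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) refer to the current package; skip them.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return sorted(names)


code = "import numpy as np\nimport sys\nprint(sys.version)\nnp.__version__\n"
print(find_imported_modules(code))
```

A loader would then compare these names against the list of packages it can serve and download only the ones that are missing.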

Slide 25

Slide 25 text

Making / Testing a Simple Python REPL IDE
• Problem 1: cannot get stdout
  • Solution: manual stdout reading pipeline

    languagePluginLoader.then(() => {
      pyodide.runPython(`
    import sys
    import io
    sys.stdout = io.StringIO()
    `);
      ...
    });

After executing each block:

    let stdout = pyodide.runPython("sys.stdout.getvalue()");
    let stdout_console = document.createElement('div');
    stdout_console.innerText = stdout;
    resultPane.appendChild(stdout_console);
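The Python side of this pipeline can be tried outside the browser. The sketch below is a variation on the slide's approach, not the talk's actual code: it swaps sys.stdout for an io.StringIO only while one block runs and restores it afterwards, whereas the slide replaces sys.stdout once and reads it after each block. The helper name run_block is hypothetical.

```python
import io
import sys


def run_block(source, namespace):
    """Run one code block and return whatever it printed."""
    buffer = io.StringIO()
    original = sys.stdout
    sys.stdout = buffer  # print() now writes into the in-memory buffer
    try:
        exec(source, namespace)
    finally:
        sys.stdout = original  # always restore, even if the block raises
    return buffer.getvalue()


session = {}
print(repr(run_block("x = 21\nprint(x * 2)", session)))
print(repr(run_block("print('x is still', x)", session)))
```

Because each call uses a fresh buffer, no separate cleanup step is needed in this variant. Sharing one namespace dictionary across calls keeps variables alive between blocks, as a REPL would.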

Slide 26

Slide 26 text

Making / Testing a Simple Python REPL IDE
• Problem 2: Iodide dependency
  • Solution: make an Iodide mock-up in code
  • matplotlib (and other plotting libraries) is monkeypatched to create a canvas through iodide.output and draw on it
  • So let's provide a mock-up object like this:

    globalThis.iodide = {
      output: {
        element: (tagName) => {
          let outputPane = document.createElement(tagName);
          document.querySelector("#result-pane").appendChild(outputPane);
          return outputPane;
        }
      }
    };

Slide 27

Slide 27 text

'More' practical problems
• Data science
  • Size, access, speed, ease of use
• Runtime size
  • Cannot be delivered over an internet connection
  • 150~450MB for the full stack
  • Depends on the libraries you compile

Slide 28

Slide 28 text

Problem solving with an application
• Electron app with Chromium
  • Stand-alone browser environment with Node.js
• App / Web dual mode
  • App mode
    • Complete scientific stack with local Pyodide packages
  • Web mode
    • Selectable library loading with ESNext dynamic import
    • pyodide.runPythonAsync will do the job

Slide 29

Slide 29 text

Architecture (overview)
Limitation: Web Workers cannot modify the DOM because they run in an isolated context

Slide 30

Slide 30 text

Architecture (implementation)
Implemented with Web Components
Custom pyodide.module for ES modules

Slide 31

Slide 31 text

Application building & distribution
• Automatic build script
  • Dockerized build environment for WASM / Pyodide
  • rollup.js for the Electron app
  • Electron packager with build script
• Distribution
  • GitHub with source code and runtime
  • (Planned) Linux / Windows / Mac app stores for easier distribution on Windows

Slide 32

Slide 32 text

'More' troubles
• Problem 3: String buffer
  • Should clean string buffers!
  • Solution: run stdout cleanup code after each execution

    pyodide.runPython(`sys.stdout.truncate(0);sys.stdout.seek(0)`);

• Problem 4: WebWorker limitation
  • The WebWorker runtime does not provide main-thread DOM access
  • Solution: build a data pipeline routine on the message system
• Limitation: multimedia outputs
  • They are generated as a part of WorkerGlobalScope
  • Still no good way to solve the problem: any ideas?
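The cleanup for Problem 3 uses two calls for a reason: truncate(0) discards the stored text but leaves the stream position where it was, so seek(0) is needed to move the write position back to the start. A minimal CPython sketch of reading and resetting a shared buffer follows. The helper name drain is hypothetical and does not come from the talk.

```python
import io


def drain(buffer):
    """Return the buffered text and reset the buffer for the next execution."""
    text = buffer.getvalue()
    buffer.truncate(0)  # drop the stored text (the position is not changed)
    buffer.seek(0)      # move the write position back to the beginning
    return text


stdout_buffer = io.StringIO()  # stands in for the replaced sys.stdout
print("first block", file=stdout_buffer)
print(repr(drain(stdout_buffer)))
print("second block", file=stdout_buffer)
print(repr(drain(stdout_buffer)))
```

Draining after every execution keeps each block's output separate, so earlier output is not shown again with later blocks.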

Slide 33

Slide 33 text

With this work plus other hidden (and tedious) details, the app is now ready to use: let's test the stand-alone IDE app!

Slide 34

Slide 34 text

Demo: Simple data science • Data → Analyze → Visualize

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

Go further
• Connect with the Web
  • pyodide.pyimport: access a Python object from JavaScript
  • Internal module: js
    • Provides direct access from Python to the container DOM

    from js import document, window

• Connect with local file storage
  • Convert BrowserFS to the Node.js FileSystem API
  • Enable nodeIntegration=True in the Electron newWindow option

Slide 37

Slide 37 text

Building Pyodide
• To enable FileSystem access with NODEFS
  • Recent versions of Emscripten remove Node.js FileSystem support from the default FS support
  • https://emscripten.org/docs/api_reference/Filesystem-API.html
• How
  • Add options to ./Makefile

    OPTFLAGS=-O3 -lnodefs.js -lworkerfs.js -s NODERAWFS=1
    …
    LDFLAGS=-s NODERAWFS=1

Slide 38

Slide 38 text

Building Pyodide
• To add / update libraries
  • If you have your own scientific libraries, compile them
• Building a Pyodide package
  • Use 'mkpkg' in the Pyodide source code
  • Generates meta.yaml
  • bin/pyodide mkpkg [PACKAGE_NAME]

    package:
      name: numpy
      version: 1.15.4
    source:
      url: https://files.pythonhosted.org/packages/...
      sha256: 3d734559db35aa3697dadcea492a423118c5c...
      patches:
        - patches/add-emscripten-cpu.patch
        - patches/disable-maybe-uninitialized.patch
        - patches/dont-include-execinfo.patch
        - patches/fix-longdouble.patch
        - patches/fix-static-init-of-nditer-pywrap.patch
        - patches/force_malloc.patch
        - patches/init-alloc-cache.patch
        - patches/use-local-blas-lapack.patch
        - patches/fix-install-with-skip-build.patch
    build:
      skip_host: False
      cflags: -include math.h -I../../config
    test:
      imports:
        - numpy
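The sha256 entry in meta.yaml is a checksum of the source archive named by url. When writing or updating a recipe by hand, the digest can be computed with the Python standard library. The sketch below assumes the archive is already available as a binary stream. The helper name sha256_of_stream and the example bytes are hypothetical stand-ins, not real package data.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# In-memory stand-in for a downloaded source archive (hypothetical bytes).
archive = io.BytesIO(b"example source archive")
print(sha256_of_stream(archive))
```

For a real archive, open the downloaded file in binary mode and pass the file object, then compare the result with the value stored in the recipe.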

Slide 39

Slide 39 text

Demo: Web+Pyodide+Web+Fun

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

Performance

    Matrix Dot (4096x4096)            0.44s     11.95s     12.85s
    Vector Dot (524288)               0.04ms    1.30ms     0.82ms
    SVD (2048x1024)                   28.00s    0.37s      12.85s
    Cholesky decomp. (2048x2048)      0.06s     3.96s      1.99s
    Eigendecomposition (2048x2048)    3.91s     149.73s    67.16s

Tested on iMac, Intel i9 9900K (8 cores / 5GHz)
• Near native
• Basically, it is single-threaded
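The benchmark code behind these numbers is not included in the slides. As a rough sketch of how such timings could be collected, the harness below measures the median wall-clock time of a callable. The names time_call and vector_dot are hypothetical, and the workload is a small pure-Python dot product chosen so the example runs without NumPy. For the operations on the slide, one would pass the corresponding NumPy / SciPy routine instead.

```python
import statistics
import time


def time_call(func, *args, repeats=5):
    """Return the median wall-clock time in seconds of func(*args)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def vector_dot(a, b):
    """Pure-Python dot product, used here only as a stand-in workload."""
    return sum(x * y for x, y in zip(a, b))


n = 4096  # much smaller than the 524288-element vector on the slide
a = [float(i) for i in range(n)]
b = [1.0] * n
elapsed = time_call(vector_dot, a, b)
print(f"vector dot ({n}): {elapsed * 1e3:.3f} ms")
```

Running the same script in native CPython and inside Pyodide would give one comparable figure per environment for each operation.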

Slide 42

Slide 42 text

Limitations
• Good enough, but not for production
  • With Firefox, the user may change the config: dom.max_script_run_time
• Single-threaded
  • Slow when performing matrix calculations
• Bad for heavy workloads, good enough for studying

Slide 43

Slide 43 text

Ideas
• WASM + MicroPython
  • "Usable" Python runtime on browsers
  • Python-based SPA solution
• Full Web-Python ecosystem with WASM
  • Micropip: (experimental) supports pure Python package installation
  • PyPI for WASM-Python
  • Dynamically loadable Python packages on the web
• JupyterLab integration
  • Use the local Pyodide runtime as an IPython kernel
  • Some projects exist (e.g. jyve) but with security holes
• And…

Slide 44

Slide 44 text

Thank you for listening!
inureyes / inureyes / jeongkyu.shin
End!
Source code: https://github.com/inureyes/pyodide-console