Creating a serverless Python environment
for scientific computing
with WebAssembly,
for data scientists and Python lovers
Jeongkyu Shin
Lablup Inc. / Google Developers Expert
What I will talk about today
• Python
• Scientific computation / environments
• Calculation resources
• Web and Mozilla
• Iodide Project
• App shells
• Demo
• What to do (with you)
Python
• Not only the greatest
language for beginners, but
also the greatest scientific
language
• Julia: Hello?
• Slow but Fast
• GIL but scalable
• Community-driven
• Combine with many tools /
frameworks / libraries
• Can bind anything written with
a keyboard
[1] https://xkcd.com/353/
Scientific Computation
• How does humankind consume electricity?
• Office work?
• Games?
• Adult video?
• No! Human beings use energy for scientific computation!
• From FORTRAN to Python
• IMSL, BLAS, LAPACK, OpenMP
• cuBLAS, cuLAPACK, cuSOLVER
• GSL, ROOTS, Numerical Recipes
• And numpy / scipy
Cut the chicken with a sledge knife
• We do not need the nuke
• e.g. Machine Learning study
• Fraction of GPU is enough (2GB)
• In fact, no need to use GPU for studying
• Wag the dog: Scientific training
workshop
• Preparation: 2hr.
• Training: 4hr.
[1] http://www.inven.co.kr/board/webzine/2097/1177426 (Before modification)
[2] https://www.yna.co.kr/view/GYH20090602001500044 (Note: now typo modified)
Back to the future: battlefield web
• Java / MS Java
• ActiveX
• Shockwave / Adobe Flash
• NaCl / PNaCl
• For Chrome / ChromeOS extension
• .EXE everywhere
• Facepalm
[1] https://en.wikipedia.org/wiki/Facepalm#/media/File:Paris_Tuileries_Garden_Facepalm_statue.jpg
Mozilla
• Firefox
• Rust
• WebAssembly (WASM) by W3C
• Can be a compilation target for low-level languages
• Emscripten
• LLVM-based Toolchain for compiling to asm.js / WASM
• So, what can we do with this?
Iodide Project
• An experimental tool for scientific communication and
exploration on the web
• Scientific computing
• Data science
• Why no web tech. for scientific computing?
• JavaScript in early 21st century
• Are you serious?
• Now: everyone uses Jupyter as UI for Python stack
https://github.com/iodide-project
Iodide to Pyodide
• Web-based scientific environment
• Complete notebook
• Visualization
• Everybody loves Python
• Why don’t we compile the scientific Python stack to WASM?
• And run it in the JavaScript VM in the browser?
• ?!
• Does it work? Does it?
• It works!
https://github.com/iodide-project/pyodide
Pyodide: The Python science stack in the browser
• Created by Michael Droettboom
• Python runtime + scientific packages compiled to WASM
• NumPy, SciPy, Pandas, Matplotlib…
• Pros
• Instant, easy
• Can combine magical ideas
• Cons
• Big & slow
Pyodide: Packages
As of August 2020. Only common packages are listed.
Pyodide: Bootstrapping
window.languagePluginUrl = 'pyodide/';
Runtime test:
let resultPane = document.querySelector('#result-pane');
languagePluginLoader.then(() => {
  resultPane.innerHTML = '';
  // runPython returns the value of the last Python expression
  let result = pyodide.runPython(`
import sys
sys.version
`);
  resultPane.innerHTML = result;
});
Pyodide: Bootstrapping
let resultPane = document.querySelector('#result-pane');
languagePluginLoader.then(() => {
  resultPane.innerHTML = '';
  let result = pyodide.runPython(`
import sys
sys.version
`);
  resultPane.innerHTML = result;
});
• Note
• Sometimes the result will
be ‘undefined’
• This is a timing issue (covered
later)
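The snippet above relies on runPython handing back the value of the last expression in the Python block (here, sys.version). A minimal Python sketch of that behaviour follows. It makes no claim about how Pyodide implements it, and run_block is a hypothetical helper name used only for illustration.

```python
import ast


def run_block(source, namespace=None):
    """Execute a block of code and return the value of its final expression.

    Hypothetical helper: returns None when the block does not end
    with an expression (for example, when it ends with an assignment).
    """
    namespace = {} if namespace is None else namespace
    tree = ast.parse(source, mode="exec")

    last_expr = None
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Detach the trailing expression so it can be evaluated separately.
        last_expr = ast.Expression(tree.body.pop().value)

    # Run everything except the trailing expression.
    exec(compile(tree, "<block>", "exec"), namespace)

    if last_expr is None:
        return None
    return eval(compile(last_expr, "<block>", "eval"), namespace)


version = run_block("import sys\nsys.version")
print(version)
```

A block that ends in a statement rather than an expression yields None here, which is one plain reason a caller can see an empty result.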
Practical problems
• Everyday use
• File system to store code / results
• Python module loading: numpy, scipy, matplotlib
• Web UI to work / study
• Stand-alone / Portable
• Iodide: Written in Django
• Need installation on server / web connections
• Stand-alone / Portable solution
• To make usage scenarios simple
Solutions for Practical problems
• File system for runtime
• Use BrowserFS (https://github.com/jvilk/BrowserFS)
BrowserFS.install(window);
BrowserFS.configure({
  fs: "LocalStorage"
}, function(e) {
  let fs = BrowserFS.BFSRequire('fs');
  fs.writeFileSync('/test.txt', 'Python+WebAssembly is Awesome!');
  languagePluginLoader.then(async () => {
    let FS = pyodide._module.FS;
    let PATH = pyodide._module.PATH;
    // Create an Emscripten interface for the BrowserFS
    let BFS = new BrowserFS.EmscriptenFS(FS, PATH);
    // Create mount point in Emscripten FS
    FS.createFolder(FS.root, 'data', true, true);
    // Mount BrowserFS into Emscripten FS
    FS.mount(BFS, {root: '/'}, '/data');
    // Open file in BrowserFS from python and show contents
    let result = await pyodide.runPythonAsync(`
import numpy as np
import sys
import glob
import js
print(sys.version)
print(np.__version__)
f = open('/data/test.txt')
print(f.readline())
`);
  });
});
• Workflow
• LocalStorage in this example
• Use Emscripten for WASM
backend
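On the Python side, the mounted directory behaves like any other path. The sketch below shows the read step with a context manager so the file handle is closed again (the slide code leaves it open). A temporary directory stands in for the '/data' mount point, and read_first_line is a hypothetical helper name.

```python
import os
import tempfile


def read_first_line(path):
    """Return the first line of a text file without its trailing newline."""
    with open(path, encoding="utf-8") as f:
        return f.readline().rstrip("\n")


# Assumption: a temporary directory plays the role of the '/data' mount point.
mount_point = tempfile.mkdtemp()
test_file = os.path.join(mount_point, "test.txt")

# Stand-in for the fs.writeFileSync call on the JavaScript side.
with open(test_file, "w", encoding="utf-8") as f:
    f.write("Python+WebAssembly is Awesome!")

first_line = read_first_line(test_file)
print(first_line)
```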
Solutions for Practical problems
• Module loading: Use the Promise-based run API
• runPythonAsync automatically detects imported packages and loads
them dynamically.
let resultPane = document.querySelector('#result-pane');
languagePluginLoader.then(async () => {
  resultPane.innerHTML = '';
  // await already yields the value of the last Python expression
  let result = await pyodide.runPythonAsync(`
import numpy as np
import sys
print(sys.version)
np.__version__
`);
  if (typeof result !== "undefined") {
    resultPane.innerHTML = result;
  }
});
Console Log
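To load packages on demand, the runtime first has to know which modules a code block imports. One way to find out is to walk the syntax tree, as in the sketch below. This illustrates the idea only and is not taken from Pyodide's source; find_imported_modules is a hypothetical name.

```python
import ast


def find_imported_modules(source):
    """Return the sorted top-level module names imported by the source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # 'import scipy.linalg' needs the 'scipy' package.
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as 'from . import x'.
            if node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return sorted(modules)


code = """
import numpy as np
import sys
print(sys.version)
np.__version__
"""
detected = find_imported_modules(code)
print(detected)
```

A loader could then compare this list with the packages it knows how to fetch and download only the missing ones before running the block.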
Making / Testing Simple Python REPL IDE
• Problem 1: cannot get stdout
• Solution: manual stdout reading pipeline
languagePluginLoader.then( ()=>{
pyodide.runPython(`
import sys
import io
sys.stdout = io.StringIO()
`);
...
});
After executing each block:
let stdout = pyodide.runPython("sys.stdout.getvalue()")
let stdout_console = document.createElement('div');
stdout_console.innerText = stdout;
resultPane.appendChild(stdout_console);
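The same capture idea can be written in plain Python. Note that this sketch swaps in a different technique from the slide: it uses contextlib.redirect_stdout with a fresh buffer per block, where the slide replaces sys.stdout once. run_and_capture is a hypothetical helper name.

```python
import contextlib
import io


def run_and_capture(source, namespace=None):
    """Execute a code block and return everything it printed to stdout."""
    namespace = {} if namespace is None else namespace
    buffer = io.StringIO()
    # stdout is redirected only while the block runs, then restored.
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


captured = run_and_capture("print('hello from Python')")
print(repr(captured))
```

Because each call gets its own buffer, nothing from an earlier block can leak into a later one.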
Making / Testing Simple Python REPL IDE
• Problem 2: Iodide dependency
• Solution: Make an Iodide mock-up in code
• matplotlib (and other plotting libraries) is
monkeypatched to create a canvas through
iodide.output and draw on it
• So let’s provide a mock-up object like this:
globalThis.iodide = {
  output: {
    element: (tagName) => {
      let outputPane = document.createElement(tagName);
      document.querySelector("#result-pane").appendChild(outputPane);
      return outputPane;
    }
  }
};
“More” practical problems
• Data science
• Size, access, speed, ease of use
• Runtime size
• Too large to deliver over an internet
connection
• 150~450MB for the full stack
• depends on your compiled libraries
Problem Solving with application
• Electron app. with Chromium
• Stand-alone browser environment with Node.js
• App / Web dual mode
• App mode
• Complete scientific stack with local Pyodide packages
• Web mode
• Selectable library loading with ESNext dynamic import
• pyodide.runPythonAsync will do the job
Architecture (overview)
Limitation:
Web Workers cannot modify the DOM because they run in an isolated context
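Since the worker cannot touch the page, it has to describe its output in messages and let the main thread apply them. The sketch below is an analogy in plain Python, using a thread and a queue in place of a Web Worker and postMessage. The message format and all names are assumptions for illustration; the app's real protocol is not shown in these slides.

```python
import queue
import threading


def worker(outbox):
    """Runs 'off the main thread': it only sends messages, never touches the page."""
    outbox.put({"type": "stdout", "payload": "result text"})
    outbox.put({"type": "done", "payload": None})


def main_thread_loop(inbox, page):
    """Applies incoming messages to the page until the worker signals completion."""
    while True:
        message = inbox.get()
        if message["type"] == "done":
            break
        page.append(message["payload"])


page = []  # stand-in for the DOM, owned by the main thread only
messages = queue.Queue()

thread = threading.Thread(target=worker, args=(messages,))
thread.start()
main_thread_loop(messages, page)
thread.join()
print(page)
```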
Architecture (implementation)
Implemented with
Web Components
Custom pyodide.module
for ES modules
Application Building & Distributing
• Automatic build script
• Dockerized build environment for WASM / Pyodide
• rollup.js for Electron app
• Electron packager with build script
• Distribution
• GitHub with source code and runtime
• (Planned) Linux / Windows / Mac app stores for easier
distribution
“More” troubles
• Problem 3: String buffer
• Should clean string buffers!
• Solution: run stdout cleanup code after each execution
pyodide.runPython(`sys.stdout.truncate(0);sys.stdout.seek(0)`);
• Problem 4: WebWorker limitation
• WebWorker runtime does not provide main thread DOM access
• Solution: make a data pipeline routine with the message system
• Limitation: multimedia outputs
• They are generated as a part of WorkerGlobalScope
• Still no good way to solve the problem: Any ideas?
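The cleanup call for Problem 3 does two things: truncate(0) drops the buffered text and seek(0) moves the write position back to the start. A small Python sketch of reading and resetting the buffer between blocks follows; drain is a hypothetical helper name.

```python
import io


def drain(buffer):
    """Return the buffered text, then empty the buffer for the next block."""
    text = buffer.getvalue()
    buffer.truncate(0)  # discard the contents
    buffer.seek(0)      # move the write position back to the start
    return text


buffer = io.StringIO()

buffer.write("output of block 1\n")
first = drain(buffer)

buffer.write("output of block 2\n")
second = drain(buffer)

print(repr(first), repr(second))
```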
With this work + other hidden (& tedious) stuff,
the app is now ready to use:
Let’s test the stand-alone IDE app!
Demo: Simple data science
• Data → Analyze → Visualize
Go further
• Connect with Web
• pyodide.pyimport: Access a Python object from JavaScript
• Internal module: js
• Provides direct access from Python to the container DOM
from js import document, window
• Connect with local file storage
• Convert BrowserFS to the Node.js file system API
• Enable nodeIntegration: true in the webPreferences of the Electron BrowserWindow
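The js module exists only inside Pyodide, so code meant to run in both the browser and regular CPython needs a fallback. The sketch below appends a div to the result pane when a DOM is available and prints otherwise. show_message is a hypothetical helper, and the '#result-pane' selector is borrowed from the earlier slides.

```python
try:
    from js import document  # provided by Pyodide; absent in regular CPython
except ImportError:
    document = None


def show_message(text, selector="#result-pane"):
    """Append text to the page when a DOM is available, otherwise print it.

    Returns True when the message was written to the page.
    """
    if document is None:
        print(text)
        return False
    element = document.createElement("div")
    element.innerText = text
    document.querySelector(selector).appendChild(element)
    return True


shown_in_page = show_message("Hello from Python")
```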
Building Pyodide
• To enable file system access with
NODEFS
• Recent versions of Emscripten remove
Node.js file system support from the default FS
support
• https://emscripten.org/docs/api_reference/Filesystem-API.html
• How
• Add options to ./Makefile
OPTFLAGS=-O3 -lnodefs.js -lworkerfs.js -s NODERAWFS=1
…
LDFLAGS=-s NODERAWFS=1
Building Pyodide
• To add / update libraries
• If you have your own scientific
libraries, compile them
• Building a Pyodide package
• Use ‘mkpkg’ in the Pyodide source
code
• Generates meta.yaml
• bin/pyodide mkpkg [PACKAGE_NAME]
package:
  name: numpy
  version: 1.15.4
source:
  url: https://files.pythonhosted.org/packages/...
  sha256: 3d734559db35aa3697dadcea492a423118c5c...
  patches:
    - patches/add-emscripten-cpu.patch
    - patches/disable-maybe-uninitialized.patch
    - patches/dont-include-execinfo.patch
    - patches/fix-longdouble.patch
    - patches/fix-static-init-of-nditer-pywrap.patch
    - patches/force_malloc.patch
    - patches/init-alloc-cache.patch
    - patches/use-local-blas-lapack.patch
    - patches/fix-install-with-skip-build.patch
build:
  skip_host: False
  cflags: -include math.h -I../../config
test:
  imports:
    - numpy
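The sha256 field in meta.yaml pins the exact source archive. The sketch below shows how such a checksum can be compared against downloaded bytes. It illustrates the concept only; how the Pyodide build performs this check is not covered in these slides, and matches_sha256 is a hypothetical helper name.

```python
import hashlib


def matches_sha256(data, expected_hex):
    """Return True when the SHA-256 digest of data equals the expected hex string."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


# Well-known SHA-256 digest of empty input, used here only as a test value.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

ok = matches_sha256(b"", EMPTY_SHA256)
tampered = matches_sha256(b"modified archive", EMPTY_SHA256)
print(ok, tampered)
```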
Limitations
• Good enough, but not for production
• With Firefox, users may need to change the config: dom.max_script_run_time
• Single-threaded
• Slow when performing matrix calculations
• Bad for heavy workload, good enough for studying
Ideas
• WASM + MicroPython
• “Usable” Python runtime in browsers
• Python-based SPA solution
• Full Web-Python ecosystem with WASM
• Micropip: (experimental) supports pure Python package installation
• PyPI for WASM-Python
• Dynamically loadable Python packages on the web
• JupyterLab integration
• Use local Pyodide runtime as IPython kernel
• Some projects exist (e.g. jyve), but with security holes
• And…
Thank you for listening :)
[email protected]
inureyes inureyes
jeongkyu.shin
Source codes:
https://github.com/inureyes/pyodide-console