Data Shape
Blaze's type system: a datashape combines dimensions (fixed sizes or var for variable length) with dtypes (int32, string, float64, ...).
var * { x: int32, y: string, z: float64 } is a tabular datashape: a variable-length collection of records.
A record is an ordered struct dtype, a collection of types keyed by labels.
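A minimal sketch of parsing such a datashape with the datashape package that Blaze builds on (the .shape / .measure attributes shown here are how that package splits the dimensions from the record dtype):

>>> from datashape import dshape
>>> ds = dshape("var * { x: int32, y: string, z: float64 }")
>>> ds.shape      # the dimensions (the leading var)
>>> ds.measure    # the record (struct) part, keyed by labels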
"var * {name: string, amount: float64}" irismongo: source: mongodb://localhost/mydb::iris server.yaml YAML 39 Builds off of Blaze uniform interface to host data remotely through a JSON web API. $ blaze-server server.yaml -e localhost:6363/compute.json Blaze Server — Lights up your Dark Data
With Blaze you can layer expressions over any data.
• Write once, deploy anywhere.
• Practically, expressions will work better on specific data structures, formats, and engines.
• You will need to copy from one format and/or engine to another.
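A minimal sketch of the "write once" idea with Blaze's expression API (the file name, connection string, and column names are hypothetical):

>>> from blaze import Data, by
>>> d = Data('accounts.csv')                            # hypothetical CSV file
>>> # d = Data('postgresql://localhost/db::accounts')   # same expression, different engine
>>> totals = by(d.name, total=d.amount.sum())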
Odo: a library for turning things into other things
• Factored out from the Blaze project
• Handles a huge variety of conversions
• odo is cp with types, for data
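The basic call pattern is odo(source, target); a minimal sketch (file and table names are hypothetical):

>>> from odo import odo
>>> import pandas as pd
>>> df = odo('accounts.csv', pd.DataFrame)              # CSV -> DataFrame
>>> odo(df, 'sqlite:///accounts.db::accounts')          # DataFrame -> SQL table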
dask can be a backend/engine for blaze

>>> import dask.array as da
>>> x = da.from_array(...)                  # Make a dask array
>>> from blaze import Data, log, compute
>>> d = Data(x)                             # Wrap with Blaze
>>> y = log(d + 1)[:5].sum(axis=1)          # Do work as usual
>>> result = compute(y)                     # Fall back to dask
• Linux, 32- and 64-bit x86 CPUs, and NVIDIA GPUs
• Python 2 and 3
• NumPy versions 1.6 through 1.9
• Does not require a C/C++ compiler on the user's system.
• < 70 MB to install.
• Does not replace the standard Python interpreter (all of your existing Python libraries are still available).
• object mode: Compiled code operates on Python objects. The only significant performance improvement is compilation of loops that can be compiled in nopython mode (see below).
• nopython mode: Compiled code operates on "machine native" data. Usually within 25% of the performance of equivalent C or FORTRAN.
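A minimal sketch of forcing nopython mode with @jit (the function here is a made-up example, not from the talk):

import numpy as np
from numba import jit

@jit(nopython=True)       # raise an error instead of silently falling back to object mode
def sum_of_squares(a):
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * a[i]
    return total

x = np.random.rand(1000000)
sum_of_squares(x)         # first call compiles; later calls run the native code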
1. Create a realistic benchmark test case. (Do not use your unit tests as a benchmark!)
2. Run a profiler on your benchmark. (cProfile is a good choice.)
3. Identify hotspots that could potentially be compiled by Numba with a little refactoring. (See the rest of this talk and the online documentation.)
4. Apply @numba.jit and @numba.vectorize as needed to critical functions. (Small rewrites may be needed to work around Numba limitations.)
5. Re-run the benchmark to check whether there was a performance improvement.
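A minimal profiling sketch for step 2 (the benchmark() entry point is hypothetical):

import cProfile
import pstats

cProfile.run('benchmark()', 'benchmark.prof')    # profile the whole benchmark run
pstats.Stats('benchmark.prof').sort_stats('cumulative').print_stats(10)    # ten most expensive calls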
Sometimes you can't create a simple or efficient array expression or ufunc. Use Numba to work with array elements directly.
• Example: Suppose you have a boolean grid and you want to find the maximum number of neighbors a cell has in the grid:
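The code for this example is not in the extracted slides; here is a minimal sketch of how it might look in nopython mode (function and variable names are mine):

import numpy as np
from numba import jit

@jit(nopython=True)
def max_neighbors(grid):
    # For every True cell, count how many of its 8 neighbors are also True,
    # and return the largest count found anywhere in the grid.
    n, m = grid.shape
    best = 0
    for i in range(n):
        for j in range(m):
            if not grid[i, j]:
                continue
            count = 0
            for di in range(-1, 2):
                for dj in range(-1, 2):
                    if di == 0 and dj == 0:
                        continue
                    ni = i + di
                    nj = j + dj
                    if ni >= 0 and ni < n and nj >= 0 and nj < m and grid[ni, nj]:
                        count += 1
            if count > best:
                best = count
    return best

grid = np.random.rand(200, 200) > 0.5
max_neighbors(grid)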
One of the first libraries I wrote (in 1999)
• extended the "umath" module by adding new "universal functions" to compute many scientific functions by wrapping C and Fortran libs.
• Bessel functions are solutions to the differential equation

  $x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \alpha^2)\, y = 0, \qquad y = J_\alpha(x)$

  with the integral representation

  $J_n(x) = \frac{1}{\pi} \int_0^{\pi} \cos(n\tau - x \sin\tau)\, d\tau$
%timeit vj0(x)
10000 loops, best of 3: 75 us per loop

In [7]: from scipy.special import j0
In [8]: %timeit j0(x)
10000 loops, best of 3: 75.3 us per loop

But! Now the code is in Python and can be experimented with more easily (and moved to the GPU / accelerator more easily)!
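The vj0 above was presumably built with numba.vectorize; a minimal sketch of that pattern, using the integral representation from the previous slide as a stand-in kernel (the real scipy.special.j0 uses optimized Cephes approximations, not this simple quadrature):

import math
import numpy as np
from numba import vectorize

@vectorize(['float64(float64)'])
def vj0(x):
    # Approximate J0(x) = (1/pi) * integral from 0 to pi of cos(x*sin(t)) dt
    # with a midpoint rule; illustrative accuracy only.
    n = 256
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(x * math.sin(t))
    return total * h / math.pi

x = np.linspace(0.0, 10.0, 10000)
vj0(x)    # @vectorize turns the scalar kernel into a NumPy ufunc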
There are reports of a SciPy author who got a 2x speed-up by removing their Cython type annotations and surrounding the function with numba.jit (with a few minor changes needed to the code). As soon as Numba's ahead-of-time compilation moves beyond the experimental stage, one can legitimately use Numba to create a library that you ship to others (who then don't need to have Numba installed, or just need a Numba run-time installed). SciPy (and NumPy) would look very different if Numba had existed 16 years ago when SciPy was getting started... and you would all be happier.
With the PyData stack you often have multi-threaded Python. In the PyData stack we quite often release the GIL:
• NumPy does it
• SciPy does it (quite often)
• Scikit-learn (now) does it
• Pandas (now) does it when possible
• Cython makes it easy
• Numba makes it easy
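A minimal sketch of the Numba side of this: nogil=True lets a nopython-compiled function drop the GIL, so plain Python threads can run it in parallel (the kernel and chunking below are made up for illustration):

import numpy as np
from concurrent.futures import ThreadPoolExecutor
from numba import jit

@jit(nopython=True, nogil=True)
def kernel(x, out):
    # Operates only on "machine native" data, so Numba can release the GIL.
    for i in range(x.shape[0]):
        out[i] = x[i] * x[i] + 1.0

def run_threaded(x, n_threads=4):
    out = np.empty_like(x)
    step = (x.shape[0] + n_threads - 1) // n_threads
    with ThreadPoolExecutor(n_threads) as pool:
        # Slices are views, so each thread writes into its own piece of out.
        futures = [pool.submit(kernel, x[i:i + step], out[i:i + step])
                   for i in range(0, x.shape[0], step)]
        for f in futures:
            f.result()
    return out

run_threaded(np.random.rand(4000000))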
• Object mode falls back to running code in the Python interpreter when native compilation is not possible
• Generalized ufuncs (@guvectorize)
• Call ctypes and cffi functions directly and pass them as arguments
• Preliminary support for types that understand the buffer protocol
• Pickle Numba functions to run on remote execution engines
• "numba annotate" to dump an HTML annotated version of compiled code
• See: http://numba.pydata.org/numba-doc/0.20.0/
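A minimal @guvectorize sketch for the generalized-ufuncs bullet above, following the pattern from the Numba documentation (the function name is mine):

import numpy as np
from numba import guvectorize, float64

@guvectorize([(float64[:], float64, float64[:])], '(n),()->(n)')
def add_scalar(x, y, res):
    # res is allocated by NumPy from the layout '(n),()->(n)';
    # the kernel fills it in place instead of returning a value.
    for i in range(x.shape[0]):
        res[i] = x[i] + y

a = np.arange(12.0).reshape(3, 4)
add_scalar(a, 10.0)    # broadcasts over the leading dimension like a ufunc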
What doesn't work (yet):
• Dictionaries and user-defined classes (tuples do work!)
• List, set and dictionary comprehensions
• Recursion
• Exceptions with non-constant parameters
• Most string operations (buffer support is very preliminary!)
• yield from
• Closures inside a JIT function (compiling JIT functions inside a closure works...)
• Modifying globals
• Passing an axis argument to NumPy array reduction functions
• Easy debugging (you have to debug in Python mode).
• "JIT Classes"
• Better support for strings/bytes, buffers, and parsing use-cases
• More coverage of the NumPy API (advanced indexing, etc.)
• Documented extension API for adding your own types, low-level function implementations, and targets
• Better debug workflows
• A new GPU target: the Heterogeneous System Architecture, supported by AMD APUs
• Support for named tuples in nopython mode
• Limited support for lists in nopython mode
• On-disk caching of compiled functions (opt-in)
• A simulator for debugging GPU functions with the Python debugger on the CPU
• Can choose to release the GIL in nopython functions
• Many speed improvements
• Support for ARMv7 (Raspberry Pi 2)
• Python 3.5 support
• NumPy 1.10 support
• Faster loading of pre-compiled functions from the disk cache
• ufunc compilation for multithreaded CPU and GPU targets (features previously only in NumbaPro).
• Try out Numba on your numerical and NumPy-related projects: conda install numba
• Your feedback helps us make Numba better! Tell us what you would like to see: https://github.com/numba/numba
• Stay tuned for more exciting stuff this year...
• DARPA XDATA program (Chris White and Wade Shen), which helped fund Numba, Blaze, Dask and Odo.
• Investors of Continuum.
• Clients and customers of Continuum who help support these projects.
• NumFOCUS volunteers
• PyData volunteers