(one datatype) • Contiguous (fixed length strides) • Increasingly releasing the GIL (becoming multi-threaded) What about non-contiguous blocks of memory (or disk)? What about arrays that can never be realised to memory? What about distributing the computation across a cluster or HPC?
• Provides N-dimensional virtual arrays of arbitrary size. • Defines a simple adapter interface for data source containment. • Concatenate or tile virtual arrays to increase their extent. • Stack virtual arrays to increase their dimensionality. • Performs lazy operations, ◦ indexing and slicing to subset a virtual array. ◦ element-wise arithmetic operators. ◦ statistical aggregation operators. • Requires an explicit request to realize a concrete result. • Out-of-core streaming to a data target.
tool. Implements a generalised N-dimensional gridded data model known as a Cube. Deferred loading, virtual array and lazy evaluation capability provided by Biggus. scitools.org.uk github.com/SciTools Iris
• borrows from the NumPy API. • provides an Array abstract base class. • requires, • dtype property • shape property • __getitem__ magic method • is data source agnostic.
an expression evaluation graph to achieve out-of-core processing. a b add a+b Producer Node Producer Node Consumer Node In-memory Result Node chunk a chunk b chunk (a+b) Q Q Q
operations. • Lazy rolling window capability. • Generate dask expression graph. • Roll in to dask.array. • Requires: dask.array auto-chunking & masked arrays TODO