
Standardizing on a single N-dimensional array API for Python

Numerical computing and deep learning libraries for Python all offer array (or tensor) data structures and associated compute functionality with similar APIs. However, there are many subtle differences, making it hard for users to migrate from one library to another, or for library authors to write code that supports multiple array libraries. The Consortium for Python Data API Standards (https://data-apis.org/) recently released the first version of its array API standard, which aims to address these issues, for community review.

In this talk, we will start with an overview of the array API standard's goals, benefits, and API surface, and then focus on some of the key technical issues, such as reconciling in-place operations with immutable/mutable array data models, dtype casting rules, and zero-copy exchange protocols. Finally, we will look at initial implementations in NumPy and PyTorch, and at plans for use in downstream libraries like SciPy and scikit-learn.

Toulouse Data Science

June 17, 2021

Transcript

  1. Standardizing on a single
    N-dimensional array API for Python
    Ralf Gommers
    15 June 2021

  2. How often do you write code for novel array/tensor APIs,
    vs. rewriting for another library or for higher performance?

  3. Array-based computing in Python

  4. Today’s Python data ecosystem
    Can we make it easy to build on top of multiple array data structures?

  5. Example: scikit-image, CuPy & Dask

  6. Example: einops package
    Einops is a popular package for array manipulation (reshaping,
    concatenating, stacking, etc.). It supports all major array/tensor
    libraries.
    It has:
    ● 700 LoC for public APIs
    ● 550 LoC for backends
    transpose() is still relatively well-behaved; it gets worse for other
    functions.
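    As a sketch of why the backend layer needs per-library glue even for
    something as simple as transposition, dispatch might look like this
    (illustrative only - the dispatch mechanism and names here are not
    einops' actual code):
    def transpose(x, axes):
        # Dispatch on the library that produced the array
        module = type(x).__module__.split(".")[0]
        if module == "numpy":
            return x.transpose(axes)
        elif module == "torch":
            return x.permute(axes)  # PyTorch spells this permute()
        elif module == "tensorflow":
            import tensorflow as tf
            return tf.transpose(x, perm=axes)
        raise TypeError(f"unsupported array type: {type(x)}")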

  7. State of compatibility today
    All libraries share common concepts and functionality, but there are many
    small (and some large) incompatibilities. It's very painful to translate
    code from one array library to another.
    Let’s look at some examples!

  8. So compatibility is poor?
    Fix it: create a standard!

  10. Consortium for Python Data API Standards
    A new organization, with participation from maintainers of many array
    (a.k.a. tensor) and dataframe libraries.
    Concrete goals for the first year:
    1. Define a standardization methodology and the necessary tooling for it
    2. Publish an RFC for an array API standard
    3. Publish an RFC for a dataframe API standard (expected within a month)
    4. Finalize the 2021.0x API standards after community review
    See data-apis.org and github.com/data-apis for more on the Consortium

  11. Goals for and scope of the array API
    Goal 1: enable writing code & packages that support multiple array libraries
    Goal 2: make it easy for end users to switch between array libraries
    In scope:
    ● Syntax and semantics of functions and objects in the API
    ● Casting rules, broadcasting, indexing, Python operator support
    ● Data interchange & device support
    Out of scope:
    ● Execution semantics (e.g. task scheduling, parallelism, lazy evaluation)
    ● Non-standard dtypes, masked arrays, I/O, subclassing the array object, C API
    ● Error handling & behaviour for invalid inputs to functions and methods

  12. Array- and array-consuming libraries
    Data interchange between array libs: using DLPack, this works for any two
    libraries as long as they support the device the data resides on.
    x = xp.from_dlpack(x_other)
    Portable code in array-consuming libs:
    def softmax(x):
        # grab the standard namespace from the passed-in array
        xp = get_array_api(x)
        x_exp = xp.exp(x)
        partition = xp.sum(x_exp, axis=1, keepdims=True)
        return x_exp / partition
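    A minimal sketch of get_array_api, assuming only the __array_namespace__
    entry point the standard defines (the helper name comes from the slide;
    it is not a published library function):
    def get_array_api(x):
        # A standard-compliant array hands back its API namespace
        if hasattr(x, "__array_namespace__"):
            return x.__array_namespace__()
        raise TypeError(f"{type(x)} does not support the array API standard")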

  13. What does the full API surface look like?
    ● 1 array object with
    ○ 6 attributes: ndim, shape, size, dtype, device, T
    ○ dunder methods to support all Python operators
    ○ __array_api_version__, __array_namespace__, __dlpack__
    ● 11 dtype literals: bool, (u)int8/16/32/64, float32/64
    ● 1 device object
    ● 4 constants: inf, nan, pi, e
    ● ~125 functions:
    ○ Array creation & manipulation (20)
    ○ Element-wise math & logic (6)
    ○ Statistics (7)
    ○ Linear algebra (22)
    ○ Search, sort & set (7)
    ○ Utilities, dtypes, broadcasting (8)
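    As an example, with NumPy's reference implementation (merged later as the
    experimental numpy.array_api module; treat the import path as provisional)
    this surface looks like:
    import numpy.array_api as xp

    x = xp.ones((2, 3), dtype=xp.float32)
    x.ndim, x.shape, x.size, x.dtype, x.device  # attributes from the standard
    y = x.T @ x  # dunder methods cover Python operators, including @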

  14. Latest: github.com/data-apis/array-api/

  15. Mutability & copies/views
    x = ones(4)
    # y may be a view on the data of x
    y = x[:2]
    # modifies x if y is a view
    y += 1
    Mutable operations and the concept of views are important for strided
    in-memory array implementations (NumPy, CuPy, PyTorch, MXNet).
    They are problematic for libraries based on immutable data structures or
    delayed evaluation (TensorFlow, JAX, Dask).
    Decisions in the API standard:
    1. Support in-place operators
    2. Support item and slice assignment
    3. Do not support the out= keyword
    4. Warn users that mixing mutating operations and views may result in
       implementation-specific behavior
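    A concrete illustration of the gap - NumPy mutates through the view, while
    an immutable-array library such as JAX does not:
    import numpy as np
    import jax.numpy as jnp

    x = np.ones(4)
    y = x[:2]  # a view on x's buffer
    y += 1     # in-place: x is now [2., 2., 1., 1.]

    xj = jnp.ones(4)
    yj = xj[:2]  # a copy; JAX arrays are immutable
    yj += 1      # rebinds yj to a new array; xj is still all ones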

  16. Dtype casting rules
    x = xp.arange(5)  # will be an integer dtype
    y = xp.ones(5, dtype=xp.float32)
    # This may give float32, float64, or raise
    dtype = (x * y).dtype
    Casting rules are straightforward to align between libraries when the
    dtypes are of the same kind.
    Mixed integer and floating-point casting is very inconsistent between
    libraries, and hard to change - hence it will remain unspecified.
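    The inconsistency is easy to demonstrate - at the time of this talk, NumPy
    and PyTorch disagree on exactly this case:
    import numpy as np
    import torch

    (np.arange(5) * np.ones(5, dtype=np.float32)).dtype           # float64
    (torch.arange(5) * torch.ones(5, dtype=torch.float32)).dtype  # torch.float32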

  17. Data-dependent output shape/dtype
    # Boolean indexing, and even slicing in some cases,
    # results in shapes that depend on the values in `x`
    x2 = x[:, x > 3]
    val = somefunc(x)
    x3 = x[:val]

    # Functions for which the output shape depends on values
    unique(x)
    nonzero(x)

    # NumPy does value-based casting
    x = np.ones(3, dtype=np.float32)
    x + 1       # float32 output
    x + 100000  # float64 output

    Data-dependent output shapes or dtypes are problematic because of:
    ● static memory allocation (TensorFlow, JAX)
    ● graph-based scheduling (Dask)
    ● JIT compilation (Numba, PyTorch, JAX, Gluon)
    Value-based dtype results can be avoided. Value-based shapes can be
    important - the API standard will include but clearly mark such
    functionality.
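    For instance, in NumPy the output sizes below are only known once the data
    has been seen:
    import numpy as np

    x = np.array([0, 4, 4, 5])
    np.unique(x)   # array([0, 4, 5]) - size depends on the number of distinct values
    np.nonzero(x)  # (array([1, 2, 3]),) - size depends on the non-zero count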

  18. DLPack - a device-aware zero-copy protocol
    Improved Python API:
    x_mylib = from_dlpack(x_otherlib)
    Getting stream handling right was hard:
    def from_dlpack(x):
        device = x.__dlpack_device__()
        consumer_stream = _find_exchange_stream(device)
        dlpack_caps = x.__dlpack__(stream=consumer_stream)
        return _convert_to_consumer_array(dlpack_caps)

    def __dlpack__(self, /, *, stream=None):
        # stream: an optional pointer to a stream, as a Python integer,
        # provided by the consumer; the producer uses it to make the
        # array safe to operate on (e.g., via cudaStreamWaitEvent)
        return dlpack_capsule
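    End to end, an exchange looks like this (np.from_dlpack landed in NumPy
    after this talk; the two objects share one buffer, so no copy is made):
    import numpy as np
    import torch

    x_t = torch.arange(3)
    x_np = np.from_dlpack(x_t)  # zero-copy view of the same memory
    x_t[0] = 42
    x_np[0]                     # 42 - the mutation is visible on both sides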

  19. Where are we today? (1/2)
    The array API standard is >95% complete and published for community review.
    A mechanism for future extensions is also defined. Open discussion points
    include:
    ● unique is the only polymorphic function (output type depends on
      keywords) - should it be changed?
    ● Type promotion for reductions, and one-off promotion corner cases
    ● Resolving issues that come up during implementation in libraries
    The NumPy Enhancement Proposal for adoption (NEP 47) is merged in draft
    status, and the reference implementation is progressing nicely; it will be
    merged with experimental status in the next few weeks:
    https://github.com/numpy/numpy/pull/18585

  20. Where are we today? (2/2)
    ● PyTorch has decided that the array API standard will be adopted
    ● JAX and CuPy will wait until NumPy has an implementation, and then add
      compatibility in the same way. Dask hasn't confirmed yet, but in general
      aims for a NumPy-compatible API too
    ● MXNet and ONNX have stated they will implement the standard
    ● TensorFlow will likely add support in tf.experimental (not confirmed yet)

  21. What is next? — array API standard
    1. Complete the library-independent test suite
    2. Merge reference implementation in NumPy
    3. Prototype implementations in other array libraries & use downstream
    (SciPy, scikit-learn, scikit-image, domain-specific libraries)
    4. Get sign-off from maintainers of each array library ⇒ array API v2021 final

  22. What is next? — Data APIs roadmap

  23. How can you help?
    Give feedback! Is your use case covered? See a small gap in functionality?
    Contribute! Portable test & benchmarking suites, remaining design issues
    Implement! The standard is complete enough to adopt today (draft mode)
    Spread awareness! Blog, reference in your talk, ...
    Support! Funding or engineering time -- lots more to do, also for dataframes

  24. To learn more
    Consortium:
    ● Website & introductory blog posts: data-apis.org
    ● Array API main repo: github.com/data-apis/array-api
    ● Latest version of the standard: data-apis.github.io/array-api/latest
    ● Members: github.com/data-apis/governance
    Find me at: [email protected], rgommers, ralfgommers
    Try this at home - installing the latest version of all seven array
    libraries in one env to experiment:
    conda create -n many-libs python=3.7
    conda activate many-libs
    conda install cudatoolkit=10.2
    pip install numpy torch jax jaxlib tensorflow mxnet cupy-cuda102 dask toolz sparse
