HPy: a future-proof way of extending Python?

HPy (https://github.com/pyhandle/hpy) is a joint project which is being developed by PyPy, CPython and Cython developers. It aims to design a better C API for writing Python extensions which is more friendly to alternative implementations and which would allow CPython itself more freedom to experiment (e.g. by using a real GC instead of refcounting).

Antonio Cuni

April 15, 2020

Transcript

  1. HPy: a future-proof way of extending
    Python?
    Antonio Cuni
    Python Language Summit 2020
    antocuni (Python Language Summit 2020) HPy 1 / 16


  2. What is HPy?
    An attempt to "solve" some of the problems with the
    C-API
    Idea born at EP2019, discussion between:
    PyPy devs
    CPython devs
    Cython dev(s)
    https://github.com/pyhandle/hpy
    https://hpy.readthedocs.io/en/latest/
    https://mail.python.org/archives/list/[email protected]/
    #hpy on freenode

  3. What is the problem?
    The C-API is too tied to CPython internals
    Many implementation details are exposed by/baked
    into the API
    CPython can’t evolve / change its details
    Alternative implementations have to "emulate"
    CPython
    Related work:
    Stable ABI, PEP 384
    https://pythoncapi.readthedocs.io/

  4. Exposed details
    Reference counting
    Objects as C pointers (PyObject *)
    Implicit assumption that Python-level "is" is the same as
    C-level "=="
    Structs not fully opaque
    ob_refcnt, ob_type, PyTypeObject, ...
    PEP 384 goes in the right direction
    Borrowed references
    PyList_GetItem ==> borrowed reference
    In PyPy, [1, 2, 3, 4] is represented as a C long[]
    No references to borrow!
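
The borrowed-reference problem can be illustrated with a small Python model. This is only a sketch under stated assumptions: `Box` and `UnboxedIntList` are hypothetical names invented for the example (they are not PyPy classes), and the storage is modelled with the standard `array` module to stand in for a C long[].

```python
from array import array


class Box:
    """Hypothetical stand-in for a heap-allocated Python int object."""

    def __init__(self, value):
        self.value = value


class UnboxedIntList:
    """Toy list that stores raw machine integers, with no objects inside."""

    def __init__(self, values):
        self._storage = array("l", values)  # plays the role of a C long[]

    def get_item(self, index):
        # There is no stored object to lend out, so a fresh box has to be
        # created on every access.
        return Box(self._storage[index])


items = UnboxedIntList([1, 2, 3, 4])
first = items.get_item(0)
second = items.get_item(0)

same_value = first.value == second.value  # both boxes hold the same number
same_object = first is second             # but they are distinct objects
```

In this model every access produces a new object, so an API that promises a borrowed reference would need some extra mechanism to keep that object alive on the caller's behalf.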

  5. Refcounting vs GC
    Refcounting prevents using a "real" GC
    The GC is not (only) about collecting garbage!
    It should be called Memory Manager
    State-of-the-art GCs:
    super-fast allocation, fast deallocation
    no/minimal pauses
    multi-threading (not the PyPy GC)
    ...
    E.g., on gcbench.py, PyPy is ~25x faster than
    CPython!

  6. CPython
    Can’t evolve the VM
    Can’t experiment with many ideas
    Refcounting is a big problem
    "Refcounting is the major blocker / problem for the Gilectomy.
    The next step with the Gilectomy is to switch
    CPython to tracing garbage collection, which is much more
    amenable to running across multiple threads."
    Larry Hastings

  7. Alternative implementations
    Standard solution: compatibility layer to emulate
    CPython
    PyPy: cpyext
    IronPython: IronClad
    Jython: Jython Native Interface
    ...
    Massive amount of precious developer hours wasted
    Poor results
    https://morepypy.blogspot.com/2018/09/
    inside-cpyext-why-emulating-cpython-c.html

  8. HPy solution
    Fully opaque data structures by default
    GC-friendly: handles
    Like file descriptors or Windows's HANDLE
    HPy_Dup() ==> Py_INCREF()
    HPy_Close() ==> Py_DECREF()
    Each handle must be closed individually
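    /* C-API: a and b are the same pointer, so either name can be decref'd */
    PyObject *a = PyLong_FromLong(42);
    PyObject *b = a;
    Py_INCREF(b);
    Py_DECREF(a);
    Py_DECREF(a); // Ok

    /* HPy: a and b are distinct handles; each must be closed exactly once */
    HPy a = HPyLong_FromLong(ctx, 42);
    HPy b = HPy_Dup(ctx, a);
    HPy_Close(ctx, a);
    HPy_Close(ctx, a); // WRONG! b is the handle that is still open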
    HPyContext passed everywhere (useful for
    subinterpreters, etc.)
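
The rule that each handle must be closed individually can be modelled in a few lines of Python. This is a toy sketch, not the HPy implementation: `HandleTable` and `HandleError` are hypothetical names, and the model chooses to raise on a double close, which a real implementation is not guaranteed to do.

```python
class HandleError(Exception):
    """Raised when a closed or unknown handle is used (hypothetical)."""


class HandleTable:
    """Toy model of handle semantics; not the real HPy implementation."""

    def __init__(self):
        self._objects = {}  # handle number -> object
        self._next = 1

    def new(self, obj):
        handle = self._next
        self._next += 1
        self._objects[handle] = obj
        return handle

    def deref(self, handle):
        if handle not in self._objects:
            raise HandleError(f"handle {handle} is not open")
        return self._objects[handle]

    def dup(self, handle):
        # A duplicate is a new, independent handle to the same object.
        return self.new(self.deref(handle))

    def close(self, handle):
        if handle not in self._objects:
            raise HandleError(f"handle {handle} is not open")
        del self._objects[handle]

    def open_count(self):
        return len(self._objects)


table = HandleTable()
a = table.new(42)
b = table.dup(a)

table.close(a)
try:
    table.close(a)  # closing the same handle twice is an error
    double_close_detected = False
except HandleError:
    double_close_detected = True

b_still_valid = table.deref(b) == 42
table.close(b)      # the correct way to release the duplicate
```

Unlike two aliases of one `PyObject *`, the handles `a` and `b` are different values, which is what lets the model notice the mistake.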

  9. HPy strategy to conquer the world
    Zero overhead on CPython
    Using macros and static inline functions to map HPy to
    the C-API
    Incremental adoption
    Port existing extensions one function at a time
    Faster on alternative implementations
    3x faster than cpyext on PyPy
    2x faster on GraalPython (could be optimized further)
    Better debugging experience
    "The handle created at foo.c:543 was never closed"
    (Optional) Universal ABI: one binary for multiple
    versions/implementations
    Cython backend
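
The debugging message quoted above suggests that a debug build could record where each handle was opened. The source does not describe how this is done, so the following is only a guess at the idea: `DebugHandleTable` is a hypothetical name, and the call site is captured with `sys._getframe`, which is an implementation detail of this sketch.

```python
import sys


class DebugHandleTable:
    """Toy handle table that remembers where each handle was opened."""

    def __init__(self):
        self._open = {}  # handle number -> (object, "file:line")
        self._next = 1

    def new(self, obj):
        frame = sys._getframe(1)  # the frame that called new()
        site = f"{frame.f_code.co_filename}:{frame.f_lineno}"
        handle = self._next
        self._next += 1
        self._open[handle] = (obj, site)
        return handle

    def close(self, handle):
        del self._open[handle]

    def leak_report(self):
        # One message per handle that is still open.
        return [
            f"The handle created at {site} was never closed"
            for _obj, site in self._open.values()
        ]


table = DebugHandleTable()
closed = table.new("closed properly")
leaked = table.new("forgotten")
table.close(closed)

report = table.leak_report()
```

Because every handle is a separate entry, a leak can be attributed to one specific creation site, something a shared reference count cannot express.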

  10. HPy targets
    [Diagram: build pipeline]
    CPython ABI: foo.c (#include) ==> hpy/cpython.h ==> gcc ==> foo.cpython-37m.so ==> CPython

  11. HPy targets
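    [Diagram: build pipeline, two targets from the same foo.c]
    CPython ABI: foo.c (#include) ==> hpy/cpython.h ==> gcc ==> foo.cpython-37m.so ==> CPython
    HPy Universal ABI: foo.c (#include) ==> hpy/universal.h ==> gcc ==> foo.hpy-1.so ==> PyPy, GraalPython, and CPython via hpy.universal (hpy/universal.cpython-37m.so)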

  12. HPy targets
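    [Diagram: build pipeline, three targets from the same foo.c]
    CPython ABI: foo.c (#include) ==> hpy/cpython.h ==> gcc ==> foo.cpython-37m.so ==> CPython
    HPy Universal ABI: foo.c (#include) ==> hpy/universal.h ==> gcc ==> foo.hpy-1.so ==> PyPy, GraalPython, and CPython via hpy.universal (hpy/universal.cpython-37m.so)
    BarPython ABI: foo.c (#include) ==> bar-python/hpy.h ==> gcc ==> foo.bar-python.so ==> BarPython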

  13. CPython ABI
    // hpy/cpython.h
    typedef struct { PyObject *_o; } HPy;

    static inline HPy HPy_Dup(HPyContext ctx, HPy handle) {
        Py_XINCREF(handle._o);
        return handle;
    }

    static inline HPy HPyLong_FromLong(HPyContext ctx, long v) {
        return (HPy){PyLong_FromLong(v)};
    }

  14. Universal ABI
    // hpy/universal.h
    /* a word-sized opaque field: can be an index, a pointer, whatever */
    typedef struct { HPy_ssize_t _i; } HPy;

    /* declared before the struct so that its members can use the type */
    typedef struct _HPyContext_s *HPyContext;

    struct _HPyContext_s {
        int ctx_version;
        ...
        HPy (*ctx_Dup)(HPyContext ctx, HPy h);
        HPy (*ctx_Long_FromLong)(HPyContext ctx, long value);
        ...
    };

    static inline HPy HPy_Dup(HPyContext ctx, HPy h) {
        return ctx->ctx_Dup(ctx, h);
    }

    static inline HPy HPyLong_FromLong(HPyContext ctx, long value) {
        return ctx->ctx_Long_FromLong(ctx, value);
    }
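
The indirection through the context is what lets one compiled extension run on several implementations. A rough Python analogy, assuming two made-up runtimes: the function names `make_direct_context`, `make_table_context` and `extension_double` are hypothetical, and the `ctx_Long_AsLong` slot is an assumption added so that the example can read a value back.

```python
from types import SimpleNamespace


def make_direct_context():
    """Context for a hypothetical VM where a handle is the object itself."""
    return SimpleNamespace(
        ctx_version=1,
        ctx_Long_FromLong=lambda ctx, value: int(value),
        ctx_Long_AsLong=lambda ctx, h: h,
    )


def make_table_context():
    """Context for a hypothetical VM where a handle is a table index."""
    table = []

    def long_from_long(ctx, value):
        table.append(int(value))
        return len(table) - 1

    def long_as_long(ctx, h):
        return table[h]

    return SimpleNamespace(
        ctx_version=1,
        ctx_Long_FromLong=long_from_long,
        ctx_Long_AsLong=long_as_long,
    )


def extension_double(ctx, value):
    """Stand-in for extension code: it only ever calls through the context."""
    h = ctx.ctx_Long_FromLong(ctx, value)
    return ctx.ctx_Long_AsLong(ctx, h) * 2


result_direct = extension_double(make_direct_context(), 21)
result_table = extension_double(make_table_context(), 21)
```

The extension function never inspects the handle, so each runtime stays free to choose its own representation.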

  15. Implementation on PyPy
    # pseudocode
    class HandleManager:
        def __init__(self):
            # GC-managed! The items inside the list might move in memory
            self.handles_w = []

        def new(self, w_obj):
            i = self._find_empty_index()
            self.handles_w[i] = w_obj
            return i
        ...

    def ctx_Long_FromLong(space, value):
        w_obj = space.newint(value)
        return handle_manager.new(w_obj)

    def make_context():
        ctx = lltype.malloc(HPyContext)
        ctx.ctx_Long_FromLong = ctx_Long_FromLong
        ...
        return ctx
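
The pseudocode leaves `_find_empty_index` unspecified. One plausible way to fill it in, shown here as a runnable toy, is a free list of closed slots. The free list, the `deref` and `close` methods, and the reserved slot 0 are all assumptions of this sketch and are not taken from the PyPy sources.

```python
class HandleManager:
    """Runnable toy version of the handle manager sketched above."""

    def __init__(self):
        self.handles_w = [None]  # slot 0 reserved (assumption: "null" handle)
        self.free_slots = []     # indices that were closed and can be reused

    def _find_empty_index(self):
        if self.free_slots:
            return self.free_slots.pop()
        self.handles_w.append(None)
        return len(self.handles_w) - 1

    def new(self, w_obj):
        i = self._find_empty_index()
        self.handles_w[i] = w_obj
        return i

    def deref(self, i):
        return self.handles_w[i]

    def close(self, i):
        # Drop the reference so the GC can reclaim the object, then
        # remember the slot for reuse.
        self.handles_w[i] = None
        self.free_slots.append(i)


manager = HandleManager()
h1 = manager.new("first")
h2 = manager.new("second")
manager.close(h1)
h3 = manager.new("third")  # reuses the slot freed by h1
```

Since the extension only ever sees the integer index, the objects in `handles_w` can be moved by the GC without invalidating any handle.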

  16. Current status
    No type objects yet (WIP)
    Produce native CPython and HPy Universal
    extensions
    ultrajson-hpy
    CPython ABI: as fast as ultrajson
    Universal ABI on CPython: 10% slower
    Universal ABI on PyPy: 3x faster
    https://github.com/pyhandle/ultrajson-hpy
    The current approach works on the technical level
    The biggest challenge will be adoption
    Speed on PyPy might be the most important driving force
    in the short term

  17. Next steps
    Short term
    Custom types in C
    Validate the approach by porting PicoNumpy
    https://github.com/paugier/piconumpy
    Debug mode
    Medium term
    Cython backend
    Experiment with the real numpy
    Long term
    PEP
    Official PSF/CPython endorsement?
    Hypothetical sci-fi future :)
    CPython switches to HPy internally