Extending Pandas using Apache Arrow and Numba

The latest release of Pandas introduced the ability to extend it with custom dtypes. By using Apache Arrow as the in-memory storage and Numba for fast, vectorized computations on these memory regions, it is possible to extend Pandas in pure Python while achieving the same performance as the built-in types. In this talk, we implement a native string type as an example.
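
To give a feel for why Arrow-backed storage suits vectorized string kernels, here is a minimal sketch of an Arrow-style string column: all strings share one contiguous UTF-8 data buffer, and an offsets buffer marks where each one starts and ends. This is an illustration, not the code from the talk. The helper names (`encode_strings`, `byte_lengths`, `get_string`) are hypothetical, null handling is left out, and plain Python loops stand in for what would presumably be Numba-compiled loops over real Arrow buffers.

```python
# Sketch of an Arrow-style string column: one contiguous UTF-8 data buffer
# plus an offsets buffer with len(values) + 1 entries.
# Assumption: no null values (Arrow tracks those in a separate validity bitmap).
from array import array


def encode_strings(values):
    """Pack a list of str into (offsets, data) buffers. Hypothetical helper."""
    offsets = array("i", [0])  # C int, 32-bit on common platforms
    data = bytearray()
    for value in values:
        data.extend(value.encode("utf-8"))
        offsets.append(len(data))  # end of this string, start of the next
    return offsets, bytes(data)


def byte_lengths(offsets):
    """Per-string byte length, computed from the offsets buffer alone."""
    return [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]


def get_string(offsets, data, i):
    """Decode the i-th string by slicing the shared data buffer."""
    return data[offsets[i]:offsets[i + 1]].decode("utf-8")


offsets, data = encode_strings(["pandas", "arrow", "", "numba"])
lengths = byte_lengths(offsets)
```

Because `byte_lengths` only reads a flat integer buffer and never creates a Python string object per element, it is the kind of tight loop that a JIT compiler can turn into fast machine code.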


Uwe L. Korn

July 08, 2018