Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Making Pandas Fly (PyDataAmsterdam 2020)

June 18, 2020

Making Pandas Fly (PyDataAmsterdam 2020)

Another variant of the recent talks, this one focuses on making Pandas faster by digging into NumPy, using my `dtype_diet` memory-saving tool and understanding what's going on with some of Pandas' low level functions. See https://ianozsvald.com/ for more.


June 18, 2020

More Decks by ianozsvald

Other Decks in Science


  1.  Interim Chief Data Scientist  19+ years experience 

    Team coaching & public courses – Higher Performance! Introductions By [ian]@ianozsvald[.com] Ian Ozsvald 2nd Edition!
  2.  All volunteers – go say thank you in #lobby

     NumFOCUS benefits us all Thank the organisers! By [ian]@ianozsvald[.com] Ian Ozsvald
  3.  Pandas – Saving RAM – Calculating faster by dropping

    to Numpy  Advice for “being highly performant” Today’s goal By [ian]@ianozsvald[.com] Ian Ozsvald
  4. NumPy vs Pandas overhead (ser.sum()) By [ian]@ianozsvald[.com] Ian Ozsvald 25

    files, 83 functions Very few NumPy calls! Thanks!
  5. Overhead with ser.values.sum() By [ian]@ianozsvald[.com] Ian Ozsvald 18 files, 51

    functions Many fewer Pandas calls (but still a lot!)
  6. Is Pandas unnecessarily slow – NO! By [ian]@ianozsvald[.com] Ian Ozsvald

    https://github.com/pandas-dev/pandas/issues/34773 - the truth is a bit complicated!
  7.  Install optional (but great!) Pandas dependencies – bottleneck –

    numexpr  Investigate https://github.com/ianozsvald/dtype_diet  Investigate my ipython_memory_usage (PyPI/Conda) Being highly performant By [ian]@ianozsvald[.com] Ian Ozsvald https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html
  8.  Mistakes slow us down (PAY ATTENTION!) – Try nullable

    Int64 & boolean, forthcoming Float64 – Write tests (unit & end-to-end) – Codify your assumptions – bulwark library – https://github.com/ianozsvald/notes_to_self Being highly performant By [ian]@ianozsvald[.com] Ian Ozsvald
  9.  Make it right then make it fast  Think

    about being performant  See blog for my classes  I’d love a postcard if you learned something new! Summary By [ian]@ianozsvald[.com] Ian Ozsvald