Slide 1

Slide 1 text

Making Pandas Fly (live from London)
Ian Ozsvald
@IanOzsvald – ianozsvald.com
PyData Amsterdam 2020

Slide 2

Slide 2 text

Introductions
• Interim Chief Data Scientist
• 19+ years experience
• Team coaching & public courses – Higher Performance!
2nd Edition!
By [ian]@ianozsvald[.com] Ian Ozsvald

Slide 3

Slide 3 text

Thank the organisers!
• All volunteers – go say thank you in #lobby
• NumFOCUS benefits us all

Slide 4

Slide 4 text

Today’s goal
• Pandas
  – Saving RAM
  – Calculating faster by dropping to NumPy
• Advice for “being highly performant”
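The "Saving RAM" goal can be illustrated with a minimal sketch. This is not the notebook from the talk: the DataFrame, its column names and its values are made up for illustration, and the two techniques shown (a categorical dtype for repeated strings, downcasting integers) are just common ways to shrink a DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a low-cardinality string column and a small-integer column.
rng = np.random.default_rng(0)
n = 100_000
df = pd.DataFrame({
    "city": rng.choice(["London", "Amsterdam", "Paris"], size=n),
    "count": rng.integers(0, 100, size=n, dtype=np.int64),
})

# deep=True also counts the memory held by the Python string objects.
before = df.memory_usage(deep=True).sum()

slim = df.assign(
    # Repeated strings are stored once, plus a small integer code per row.
    city=df["city"].astype("category"),
    # Values fit in 0..99, so to_numeric can pick the smallest unsigned type.
    count=pd.to_numeric(df["count"], downcast="unsigned"),
)
after = slim.memory_usage(deep=True).sum()

print(f"before: {before:,} bytes, after: {after:,} bytes")
```

The exact byte counts depend on the pandas version and platform, so it is worth measuring on your own data before and after any dtype change.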

Slide 5

Slide 5 text

Demo
• Go to Notebook for demo

Slide 6

Slide 6 text

NumPy vs Pandas overhead (ser.sum())
25 files, 83 functions
Very few NumPy calls!
Thanks!

Slide 7

Slide 7 text

Overhead...

Slide 8

Slide 8 text

Overhead with ser.values.sum()
18 files, 51 functions
Many fewer Pandas calls (but still a lot!)
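A rough way to see the difference between the two calls on your own machine is to time them side by side. The sketch below assumes a float Series of made-up random data; the Series size and repeat count are arbitrary choices, and the timings will vary with hardware, pandas version and whether bottleneck is installed.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: one million random floats.
ser = pd.Series(np.random.default_rng(0).random(1_000_000))

# Both routes compute the same total (up to floating-point rounding).
total_pandas = ser.sum()
total_numpy = ser.values.sum()

# ser.sum() goes through the Pandas machinery before reaching the reduction;
# ser.values.sum() fetches the underlying NumPy array and sums it directly.
repeats = 100
t_pandas = timeit.timeit(lambda: ser.sum(), number=repeats)
t_numpy = timeit.timeit(lambda: ser.values.sum(), number=repeats)

print(f"ser.sum():        {t_pandas / repeats * 1e6:.1f} µs per call")
print(f"ser.values.sum(): {t_numpy / repeats * 1e6:.1f} µs per call")
```

Note that dropping to NumPy also drops Pandas behaviour such as skipping NaN values, so the two calls only agree when the data has no missing values.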

Slide 9

Slide 9 text

Is Pandas unnecessarily slow?
Missing? The bottleneck library!
This certainly helps

Slide 10

Slide 10 text

Is Pandas unnecessarily slow – NO!
https://github.com/pandas-dev/pandas/issues/34773 – the truth is a bit complicated!

Slide 11

Slide 11 text

Being highly performant
• Install optional (but great!) Pandas dependencies
  – bottleneck
  – numexpr
• Investigate https://github.com/ianozsvald/dtype_diet
• Investigate my ipython_memory_usage (PyPI/Conda)
https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html
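A quick check of whether the two optional dependencies are present in the current environment might look like the sketch below. The helper name is_installed is hypothetical; the two option names are Pandas' own settings that control whether it uses each library when available.

```python
import importlib.util

import pandas as pd


def is_installed(name: str) -> bool:
    """Return True if the named package can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


# Report on the optional accelerators without importing them.
status = {name: is_installed(name) for name in ("bottleneck", "numexpr")}
for name, present in status.items():
    print(f"{name}: {'installed' if present else 'missing'}")

# Pandas options that say whether it will use each library if it is installed.
print("compute.use_bottleneck:", pd.get_option("compute.use_bottleneck"))
print("compute.use_numexpr:", pd.get_option("compute.use_numexpr"))
```

If either package is reported as missing, installing it with your usual package manager (pip or conda) is normally all that is needed; Pandas picks it up on the next import.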

Slide 12

Slide 12 text

Being highly performant
• Mistakes slow us down (PAY ATTENTION!)
  – Try nullable Int64 & boolean, forthcoming Float64
  – Write tests (unit & end-to-end)
  – Codify your assumptions – bulwark library
  – https://github.com/ianozsvald/notes_to_self
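The nullable dtypes and the "codify your assumptions" advice can be sketched together. The example uses plain assert statements rather than the bulwark API, and the column names, values and the check_assumptions helper are all hypothetical.

```python
import pandas as pd

# Nullable dtypes keep integers as integers and booleans as booleans
# even when values are missing (no silent conversion to float or object).
df = pd.DataFrame({
    "visits": pd.array([3, None, 7], dtype="Int64"),
    "is_member": pd.array([True, None, False], dtype="boolean"),
})


def check_assumptions(frame: pd.DataFrame) -> pd.DataFrame:
    """Fail loudly if the frame breaks what later code relies on."""
    assert str(frame["visits"].dtype) == "Int64", "visits must be nullable Int64"
    assert str(frame["is_member"].dtype) == "boolean", "is_member must be boolean"
    assert (frame["visits"].dropna() >= 0).all(), "visits must be non-negative"
    return frame


checked = check_assumptions(df)
print(checked.dtypes)
print("total visits (missing skipped):", checked["visits"].sum())
```

Calling a check like this at the start of each processing step means a bad input stops the pipeline immediately instead of producing a wrong answer later.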

Slide 13

Slide 13 text

Summary
• Make it right then make it fast
• Think about being performant
• See blog for my classes
• I’d love a postcard if you learned something new!

Slide 14

Slide 14 text

Covid 19 UK economic impact?