Slide 1

Slide 1 text

www.morconsulting.c
High Performance Python – find your bottleneck and go much faster
London Python Usergroup, Oct 2013
Ian Ozsvald, @IanOzsvald, MorConsulting.com

Slide 2

Slide 2 text

[email protected] @IanOzsvald London Python Usergroup October 2013
“High Performance Python”
• Publishing mid 2014
• Please join the mailing list via IanOzsvald.com

Slide 3

Slide 3 text

About Ian Ozsvald
• “Exploiter of Data” in MorConsulting.com
• Teach: PyCon, EuroSciPy, EuroPython
• Various ML/Parallel/Data projects
• ShowMeDo.com
• IanOzsvald.com

Slide 4

Slide 4 text

I teach High Performance...
• because it is embarrassingly easy to collect lots of data
• turning data->actionable information is hard

Slide 5

Slide 5 text

The Julia Set Fractal

Slide 6

Slide 6 text

The code (1/2)

Slide 7

Slide 7 text

The code (2/2)
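The code on these two slides was shown as images and did not survive in the transcript. As a stand-in, here is a minimal pure-Python sketch of a Julia set calculation. The function names, grid bounds, and the constant c are assumptions chosen for illustration, not the values from the talk.

```python
"""Pure-Python Julia set sketch (illustrative; not the code from the slides)."""

# Assumed values: bounds of the complex plane and the Julia constant c.
X1, X2, Y1, Y2 = -1.8, 1.8, -1.8, 1.8
C_REAL, C_IMAG = -0.62772, -0.42193


def calculate_z(maxiter, zs, cs):
    """Count iterations of z = z*z + c until |z| >= 2 or maxiter is reached."""
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while n < maxiter and abs(z) < 2:
            z = z * z + c
            n += 1
        output[i] = n
    return output


def build_inputs(width):
    """Build a width x width grid of start points and a matching list of c."""
    x_step = (X2 - X1) / width
    y_step = (Y2 - Y1) / width
    xs = [X1 + i * x_step for i in range(width)]
    ys = [Y1 + j * y_step for j in range(width)]
    zs = [complex(x, y) for y in ys for x in xs]
    cs = [complex(C_REAL, C_IMAG)] * len(zs)
    return zs, cs


if __name__ == "__main__":
    zs, cs = build_inputs(width=100)
    output = calculate_z(maxiter=300, zs=zs, cs=cs)
    print(len(output), max(output))
```

The inner while loop is the part that dominates the runtime, which makes it a natural target for the profiling and compilation steps on the later slides.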

Slide 8

Slide 8 text

cProfile & runsnakerun
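The slide itself was a screenshot. As a rough sketch of the function-level profiling step, the standard library's cProfile and pstats modules can be driven from code as below. The slow_sum function is a hypothetical stand-in for a real hot spot, and profile_call is a helper name invented for this example.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: sum of squares in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show only the top few entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_call(slow_sum, 100000)
    print(report)
```

From the command line, `python -m cProfile -o julia.prof julia.py` writes the same statistics to a file (the name julia.prof is an assumption here), which a viewer such as RunSnakeRun can then load.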

Slide 9

Slide 9 text

kernprof.py -v -l julia.py
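For line-by-line timing, kernprof (from line_profiler) expects the functions of interest to be marked with a @profile decorator, which it injects as a builtin when run with -l. A minimal sketch follows, assuming a hypothetical count_iterations function. The fallback definition is a common idiom that keeps the script runnable without kernprof.

```python
# kernprof -l injects `profile` as a builtin; define a no-op otherwise
# so the same file still runs under plain `python`.
try:
    profile
except NameError:
    def profile(func):
        return func


@profile
def count_iterations(z, c, maxiter):
    """Hypothetical kernel to be timed line by line."""
    n = 0
    while n < maxiter and abs(z) < 2:
        z = z * z + c
        n += 1
    return n


if __name__ == "__main__":
    print(count_iterations(0j, complex(-0.62772, -0.42193), 300))
```

Run through the command on the slide, each line of the decorated function is reported with its hit count and time.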

Slide 10

Slide 10 text

memory_profiler.py -v -l julia.py
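memory_profiler follows the same decorator pattern and reports memory use per line of the decorated function. A minimal sketch, assuming a hypothetical build_grid function; the ImportError fallback is only there so the file still runs where memory_profiler is not installed.

```python
# Use memory_profiler's decorator if available; otherwise fall back to
# a no-op so the script still runs.
try:
    from memory_profiler import profile
except ImportError:
    def profile(func):
        return func


@profile
def build_grid(width):
    """Hypothetical memory-heavy step: a width x width list of complex points."""
    step = 3.6 / width
    xs = [-1.8 + i * step for i in range(width)]
    ys = [-1.8 + j * step for j in range(width)]
    zs = [complex(x, y) for y in ys for x in xs]
    return zs


if __name__ == "__main__":
    print(len(build_grid(200)))
```

The package is commonly invoked as `python -m memory_profiler julia.py`.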

Slide 11

Slide 11 text

mprof (memory_profiler)

Slide 12

Slide 12 text

Cython (annotated->gcc)

Slide 13

Slide 13 text

Cython (annotated->gcc)

Slide 14

Slide 14 text

Cython (annotated->gcc)

Slide 15

Slide 15 text

Cython results
Straight CPython (in VM): 8.9s
Cython1: 5.5s (uses Python lists)
Cython3: 3s (uses numpy arrays)
Cython4: 0.15s (expanded math)
66* speed-up
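The slide does not say what "expanded math" covered. One plausible reading, shown here in plain Python as an assumption, is replacing complex arithmetic and abs() with explicit operations on the real and imaginary parts, which gives a compiler simple floating-point code to work with. Both function names are hypothetical.

```python
def iterations_abs(z, c, maxiter):
    """Baseline: complex arithmetic with abs() for the escape test."""
    n = 0
    while n < maxiter and abs(z) < 2:
        z = z * z + c
        n += 1
    return n


def iterations_expanded(zr, zi, cr, ci, maxiter):
    """Same iteration written out on real and imaginary parts."""
    n = 0
    # |z| < 2 is equivalent to zr^2 + zi^2 < 4, which avoids a square root.
    while n < maxiter and (zr * zr + zi * zi) < 4.0:
        # (zr + zi*j)^2 = (zr^2 - zi^2) + (2*zr*zi)*j
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
        n += 1
    return n


if __name__ == "__main__":
    print(iterations_abs(complex(0.5, 0.0), 0j, 50))
    print(iterations_expanded(0.5, 0.0, 0.0, 0.0, 50))
```

In Cython the expanded form would typically be combined with typed variables; this sketch only shows the arithmetic rewrite itself.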

Slide 16

Slide 16 text

Numba (continuum.io)
• Just in Time Compiler (like PyPy/JVM)
• Numpy-compatible (unlike PyPy)
• LLVM toolchain
• Multiple targets (e.g. GPU)
• Installation is a touch tricky...

Slide 17

Slide 17 text

Numba (continuum.io)
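The code on this slide was an image. As a rough idea of how Numba is applied, the sketch below decorates a scalar Julia kernel with numba.jit. The function name is hypothetical, and the talk's version used numpy arrays, whereas this one stays on scalars to remain short. The fallback decorator is an assumption added so the file runs even where Numba is not installed.

```python
# Use Numba's JIT if available; otherwise fall back to a no-op decorator
# with the same call shape so the script still runs uncompiled.
try:
    from numba import jit
except ImportError:
    def jit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator


@jit(nopython=True)
def julia_iterations(z, c, maxiter):
    """Hypothetical scalar kernel, compiled on first call when Numba is present."""
    n = 0
    while n < maxiter and abs(z) < 2:
        z = z * z + c
        n += 1
    return n


if __name__ == "__main__":
    print(julia_iterations(0j, complex(-0.62772, -0.42193), 300))
```

The first call includes compilation time, so timings are usually taken from a second call.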

Slide 18

Slide 18 text

Numba results
Straight CPython (in VM): 8.9s
Numba: 1.4s (using numpy arrays – faster than Cython3 at 3s)
Numba2: 0.45s (expanded math)
22* speed-up (vs 66* for Cython)

Slide 19

Slide 19 text

Missed out...
• Ignored PyPy, ShedSkin, Parakeet...
• Didn't look at parallelisation
• Didn't look at other memory profilers
• This will go into the book and maybe training in London

Slide 20

Slide 20 text

Thank you!
• Training interest?
• Want to interview more CTOs
• Join book mailing list (on ianozsvald.com)
• [email protected]
• @IanOzsvald
• MorConsulting.com
• GitHub/IanOzsvald