Making Computations Execute Very Quickly

Presentation at PyData London 2015, 2015-06-20. It proposes not using Cython, NumPy and Numba to speed up Python computations, but instead using C++, or better D or Chapel, in a polyglot programming approach.

Code used as examples comes from https://github.com/russel/Pi_Quadrature

Russel Winder

June 20, 2015

Transcript

  1. Copyright © 2015 Russel Winder. Making Computations Execute Very Quickly
    Dr Russel Winder, email: russel@winder.org.uk, twitter: @russel_winder, web: http://www.russel.org.uk
  2. Python is slow…
  3. …at computation.
  4. CPU Bound vs I/O Bound 1/2
    • Python is entirely fine for essentially I/O bound activity:
      • Managing user interfaces via native code widgets (Qt, GTK, Wx, …)
      • Managing networking activity.
    Common theme here: the use of an event loop.
  5. CPU Bound vs I/O Bound 2/2
    • Python uses hardware floating point, but via the Python heap.
    • Python uses hardware integers for small integer values, but via the Python heap.
    Result: non-trivial numerical activity is slow.
  9. What is the value of π?
  10. Well that's easy, it's…
  11. π
  12. Exactly.
  13. It's simples. Александр Орлов, 2009
  14. Albeit irrational.
  15. Approximating π
    • What is its value represented as a floating point number?
    • We can only obtain an approximation.
    • There is a plethora of possible algorithms to choose from; a popular one is to employ the following integral equation:
    $$\frac{\pi}{4} = \int_0^1 \frac{1}{1+x^2}\,dx$$
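Why this integral yields π: the integrand is the derivative of the arctangent, so the integral can be evaluated in closed form. The short derivation below is added as a reader's aid and is not taken from the slides:

$$\int_0^1 \frac{1}{1+x^2}\,dx = \Big[\arctan x\Big]_0^1 = \arctan 1 - \arctan 0 = \frac{\pi}{4}$$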
  16. One possible algorithm
    • Use quadrature to estimate the value of the integral, which is the area under the curve:
    $$\pi \approx \frac{4}{n} \sum_{i=1}^{n} \frac{1}{1+\left(\frac{i-0.5}{n}\right)^2}$$
    With $n = 3$ there is not much to do, but potentially lots of error. Use $n = 10^7$ or $n = 10^9$? Embarrassingly parallel.
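The quadrature sum on this slide can be sketched in pure Python as follows. This is a minimal illustration written for this transcript, not the code from the Pi_Quadrature repository; the function name `pi_quadrature` is a hypothetical choice.

```python
def pi_quadrature(n: int) -> float:
    """Estimate pi by midpoint-rule quadrature of 1 / (1 + x^2) over [0, 1].

    Hypothetical helper for illustration; n is the number of strips.
    """
    delta = 1.0 / n  # width of each strip
    total = 0.0
    for i in range(1, n + 1):
        x = (i - 0.5) * delta  # midpoint of the i-th strip
        total += 1.0 / (1.0 + x * x)
    return 4.0 * delta * total


if __name__ == "__main__":
    # Larger n gives a smaller error at the cost of more loop iterations.
    for n in (3, 1_000, 1_000_000):
        print(n, pi_quadrature(n))
```

Every iteration of the loop creates and discards Python float objects, which is the heap overhead the earlier slides describe; the loop body has no dependency between iterations other than the running sum.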
  17. Code!
  18. C++, D, Chapel
  19. Because addition is commutative and associative, the expression can be decomposed into a sum of partial sums.
  20. a + b + c + d + e + f = (a + b) + (c + d) + (e + f)
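The decomposition into partial sums can be checked with a small sketch, assuming the same quadrature terms as the earlier slide. The names `term`, `partial_sum` and `pi_from_partial_sums` are hypothetical. Note that floating point addition is only approximately associative, so the regrouped result may differ from the single-loop result in the last few bits.

```python
def term(i: int, n: int) -> float:
    """The i-th term of the quadrature sum (hypothetical helper)."""
    x = (i - 0.5) / n
    return 1.0 / (1.0 + x * x)


def partial_sum(start: int, end: int, n: int) -> float:
    """Sum of the terms for i in the half-open range [start, end)."""
    return sum(term(i, n) for i in range(start, end))


def pi_from_partial_sums(n: int, chunks: int) -> float:
    """Estimate pi by summing `chunks` equally sized partial sums."""
    if n % chunks != 0:
        raise ValueError("this sketch assumes chunks divides n exactly")
    size = n // chunks
    partials = [
        partial_sum(1 + k * size, 1 + (k + 1) * size, n) for k in range(chunks)
    ]
    return 4.0 / n * sum(partials)


if __name__ == "__main__":
    whole = pi_from_partial_sums(1_200, 1)  # one big sum
    split = pi_from_partial_sums(1_200, 4)  # four partial sums
    print(whole, split, abs(whole - split))
```

Because each partial sum depends only on its own index range, the partial sums can be computed in any order, or at the same time.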
  21. Scatter Gather: map reduce, data parallel
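The scatter/gather shape can be sketched with the standard library's `concurrent.futures`: the map step scatters index ranges to workers, and the reduce step gathers and adds the partial sums. This is an illustrative sketch, not the talk's code; `partial_sum` and `pi_scatter_gather` are hypothetical names. A thread pool is used here only to show the structure; under CPython's GIL it is not expected to speed up this CPU-bound loop.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def partial_sum(bounds: tuple) -> float:
    """Sum the quadrature terms for i in [start, end); bounds = (start, end, n)."""
    start, end, n = bounds
    total = 0.0
    for i in range(start, end):
        x = (i - 0.5) / n
        total += 1.0 / (1.0 + x * x)
    return total


def pi_scatter_gather(n: int, tasks: int, executor: Executor) -> float:
    """Estimate pi by mapping slices onto an executor and reducing the results."""
    if n % tasks != 0:
        raise ValueError("this sketch assumes tasks divides n exactly")
    size = n // tasks
    slices = [(1 + k * size, 1 + (k + 1) * size, n) for k in range(tasks)]
    partials = executor.map(partial_sum, slices)  # scatter (map)
    return 4.0 / n * sum(partials)  # gather (reduce)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(pi_scatter_gather(1_000_000, 4, pool))
```

Passing the executor in as a parameter keeps the scatter/gather logic independent of how the work is actually run; a `ProcessPoolExecutor` could be supplied instead, since `partial_sum` is a top-level function taking a plain tuple.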
  22. Code!
  23. C++, D, Chapel
  24. The Python data model and its GIL make Python unsuitable for parallel computation.
  25. PyPy and NumPy do not help, nor do Cython, Numba, etc., as much as they perhaps should.
  26. Native code, e.g. C++, D, Chapel, is the way forward for CPU-bound components of a Python-based system.
  27. And then there are OpenCL and OpenGL, soon to be replaced by Vulkan.
  28. Making Computations Execute Very Quickly
    Dr Russel Winder, email: russel@winder.org.uk, twitter: @russel_winder, web: http://www.russel.org.uk