
How PyPy can help High Performance Computing

Antonio Cuni

August 23, 2018

Transcript

  1. Short bio

    PyPy core dev since 2006
    pdb++, CFFI, vmprof, capnpy, ...
    @antocuni
    https://github.com/antocuni
    https://bitbucket.org/antocuni
  2. Python REAL strong points

    Expressive & simple APIs
    Uniform type system (everything is an object)
    Powerful abstractions
  3. JSONObject jsonObj = new JSONObject(jsonString);
     JSONArray jArray = jsonObj.getJSONArray("data");
     int length = jArray.length();
     for (int i = 0; i < length; i++) {
         JSONObject jObj = jArray.getJSONObject(i);
         String id = jObj.optString("id");
         String name = jObj.optString("name");
         JSONArray ingredientArray = jObj.getJSONArray("Ingredients");
         int size = ingredientArray.length();
         ArrayList<String> Ingredients = new ArrayList<>();
         for (int j = 0; j < size; j++) {
             JSONObject json = ingredientArray.getJSONObject(j);
             Ingredients.add(json.optString("name"));
         }
     }
     // googled for "getJSONArray example", found this:
     // https://stackoverflow.com/questions/32624166/how-to-get-json-array-within-json-object
  4. import json
     obj = json.loads(string)
     for item in obj['data']:
         id = item['id']
         name = item['name']
         ingredients = []
         for ingr in item["ingredients"]:
             ingredients.append(ingr['name'])
  5. So far so good, BUT

    abstraction: iterators
    abstraction: temp objects
    abstraction: classes/methods/functions
    core of computation
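A minimal sketch of what these layers cost, not taken from the talk: the `Vec2` class and both function names are hypothetical. Both functions compute the same sums. The first allocates a temporary object and dispatches a method call on every iteration; the second is only the core of computation.

```python
import timeit


class Vec2(object):
    """Hypothetical 2D vector, used only to add an abstraction layer."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Every addition allocates a new temporary Vec2.
        return Vec2(self.x + other.x, self.y + other.y)


def sum_abstract(n):
    """Accumulate through the Vec2 abstraction."""
    total = Vec2(0, 0)
    for i in range(n):
        total = total + Vec2(i, i * 2)
    return total.x, total.y


def sum_inlined(n):
    """Same computation with the abstraction removed by hand."""
    x = y = 0
    for i in range(n):
        x += i
        y += i * 2
    return x, y


if __name__ == "__main__":
    for func in (sum_abstract, sum_inlined):
        seconds = timeit.timeit(lambda: func(10000), number=20)
        print("%s: %.4f s" % (func.__name__, seconds))
```

The absolute timings depend on the interpreter and the machine; the point of the sketch is only the relative gap between the two versions on a given implementation.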
  6. Example of temporary objects

    Bound methods

    In [ ]: class A(object):
                def foo(self):
                    return 42

            a = A()
            bound_foo = a.foo
            %timeit a.foo()
            %timeit bound_foo()
  7. Ideally

    Think of concepts, not implementation details

    Real world

    Details leak to the user
  8. 1. Work around in the user code

    e.g. create bound methods beforehand
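A minimal sketch of this workaround, assuming a class `A` like the one in the bound-method example; the two helper function names are hypothetical. The attribute lookup is hoisted out of the loop so that the bound method is created once. Note that recent CPython versions (3.7 and later) already avoid the temporary object for the direct `a.foo()` call pattern, so the measured difference may be small or absent there.

```python
import timeit


class A(object):
    def foo(self):
        return 42


def call_via_attribute(a, n):
    """Look up a.foo on every iteration."""
    total = 0
    for _ in range(n):
        total += a.foo()
    return total


def call_prebound(a, n):
    """Create the bound method once, before the loop."""
    foo = a.foo
    total = 0
    for _ in range(n):
        total += foo()
    return total


if __name__ == "__main__":
    a = A()
    for func in (call_via_attribute, call_prebound):
        seconds = timeit.timeit(lambda: func(a, 10000), number=20)
        print("%s: %.4f s" % (func.__name__, seconds))
```

Both versions return the same result; only the second one pushes an implementation detail into the user code.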
  9. 2. Work around in the language specs

    range vs xrange
    dict.keys vs .iterkeys
    int vs long
    array.array vs list

    Easier to implement
    Harder to use
    Clutter the language unnecessarily
    More complex to understand
    Not really Pythonic
  10. 3. Stay in C as much as possible

    In [29]: numbers = range(1000)
             %timeit [x*2 for x in numbers]
    10000 loops, best of 3: 47.1 µs per loop

    In [31]: import numpy as np
             numbers = np.arange(1000)
             %timeit numbers*2
    The slowest run took 17.59 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 1.48 µs per loop
  11. Python in the HPC world

    Python as a glue-only language
    Tradeoff between speed and code quality
  12. PyPy

    Alternative Python implementation
    Ideally: no visible difference to the user
    JIT compiler
    http://pypy.org
  13. How fast is PyPy?

    Wrong question

    Up to 80x faster in extreme cases
    10x faster in good cases
    2x faster on "random" code
    Sometimes it's just slower
  14. PyPy flaws

    Far from being perfect
    It leaks different implementation details than CPython does
    e.g. JIT warmup, GC peculiarities
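A minimal sketch of how JIT warmup can be observed, not taken from the talk: `work` and `time_batches` are hypothetical names, and the batch sizes are arbitrary. The same function is timed over several consecutive batches. On a JIT-compiled implementation the first batches are expected to be slower than the later ones; on CPython the batches should take roughly the same time.

```python
import time


def work(n):
    """A small numeric loop, the kind of code a tracing JIT can optimize."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def time_batches(batches=5, n=200000):
    """Return the wall-clock time of each consecutive call to work(n)."""
    timings = []
    for _ in range(batches):
        start = time.perf_counter()
        work(n)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    for index, seconds in enumerate(time_batches()):
        print("batch %d: %.4f s" % (index, seconds))
```

When benchmarking, it is usually worth reporting the warm batches separately from the first ones instead of averaging everything together.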
  15. Python as a first class language

    No longer "just glue"
  16. Example: Sobel filter

    Extended version: "The Joy of PyPy: Abstractions for Free", EP 2017
    https://speakerdeck.com/antocuni/the-joy-of-pypy-jit-abstractions-for-free
    https://www.youtube.com/watch?v=NQfpHQII2cU
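A minimal pure-Python sketch of a Sobel filter written with abstractions instead of raw index arithmetic. This is an illustration under assumptions, not the code from the linked talk: the `Image` class, its flat-list storage, and the clamping to 255 are all choices made here.

```python
class Image(object):
    """Hypothetical grayscale image stored as a flat list, indexed as img[x, y]."""

    def __init__(self, width, height, data=None):
        self.width = width
        self.height = height
        self.data = data if data is not None else [0] * (width * height)

    def __getitem__(self, xy):
        x, y = xy
        return self.data[y * self.width + x]

    def __setitem__(self, xy, value):
        x, y = xy
        self.data[y * self.width + x] = value


def sobel(img):
    """Return the gradient magnitude of img; border pixels are left at 0."""
    out = Image(img.width, img.height)
    for y in range(1, img.height - 1):
        for x in range(1, img.width - 1):
            # Horizontal gradient (right column minus left column).
            gx = (-1 * img[x - 1, y - 1] + 1 * img[x + 1, y - 1]
                  - 2 * img[x - 1, y] + 2 * img[x + 1, y]
                  - 1 * img[x - 1, y + 1] + 1 * img[x + 1, y + 1])
            # Vertical gradient (bottom row minus top row).
            gy = (-1 * img[x - 1, y - 1] - 2 * img[x, y - 1] - 1 * img[x + 1, y - 1]
                  + 1 * img[x - 1, y + 1] + 2 * img[x, y + 1] + 1 * img[x + 1, y + 1])
            # Clamp the magnitude to the 8-bit range.
            out[x, y] = min(int((gx * gx + gy * gy) ** 0.5), 255)
    return out
```

Every pixel access goes through `__getitem__` with a temporary tuple, which is the kind of overhead a JIT is in a position to remove.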
  17. cpyext

    PyPy version of Python.h
    Compatibility layer
    Most C extensions just work: numpy, scipy, pandas, etc.
    Slow :(
    Use CFFI whenever it's possible
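A minimal sketch of calling C through CFFI instead of a C extension module. It assumes the `cffi` package is installed and a Unix-like platform where `ffi.dlopen(None)` exposes the standard C library; the wrapper name `c_abs` is hypothetical, and it returns `None` when either assumption does not hold.

```python
def c_abs(value):
    """Call the C library's abs() through CFFI in ABI mode.

    Returns None if cffi is not installed or the C library cannot be opened.
    """
    try:
        from cffi import FFI
    except ImportError:
        return None

    ffi = FFI()
    # Declare the C signature we want to call.
    ffi.cdef("int abs(int x);")
    try:
        # dlopen(None) loads the standard C namespace on Unix-like systems.
        libc = ffi.dlopen(None)
    except OSError:
        return None
    return libc.abs(value)


if __name__ == "__main__":
    print(c_abs(-5))
```

ABI mode is used here because it needs no C compiler; for real projects CFFI's API mode is generally the more robust choice.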
  18. We are working on it

    Future status (hopefully)

    All C extensions will just work
    C code as fast as today, Python code super-fast
    The best of both worlds
    PyPy as the default choice for HPC
    My personal estimate: 6 months of work and we have a fast cpyext (let's talk about money :))