The Joy of PyPy JIT: Abstractions for Free

EuroPython 2017

Antonio Cuni

July 13, 2017

Transcript

  1. PyPy: Abstractions for free. Antonio Cuni, EuroPython 2017, July 12 2017

    antocuni (EuroPython 2017) PyPy: abstractions for free July 12 2017 1 / 34
  2. About me: PyPy core dev; pdb++, cffi, vmprof, capnpy, ...

    @antocuni, http://antocuni.eu
    Source code of this demo: https://bitbucket.org/pypy/extradoc/src/extradoc/talk/ep2017/the-joy-of-pypy-jit/
  3. General question. Q: "How fast is PyPy?" A: "It depends"
  5. The joy of PyPy: no single "speedup" factor. The better the code, the greater the speedup.
  6. Good code: correct, readable, easy to maintain, nice APIs, fast
  7. Abstractions: functions, classes, inheritance, etc. PRO: readability. CON: cost of abstraction?
  9. Image: greyscale, stored as (w, h, data), where data = array.array('B') of w * h bytes; pixel (x, y) is at index x + w*y
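
The layout described on this slide can be sketched as follows. This is a minimal illustration written for Python 3; the helper names `make_image` and `pixel_index` are hypothetical and do not come from the talk.

```python
import array

def make_image(w, h):
    """Return (w, h, data), with data a flat array of w*h zeroed bytes."""
    return w, h, array.array('B', [0]) * (w * h)

def pixel_index(w, x, y):
    """Index of pixel (x, y) in the flat, row-major byte array."""
    return x + w * y

w, h, data = make_image(4, 3)
data[pixel_index(w, 2, 1)] = 255   # set pixel (2, 1) to white
```

Each row occupies `w` consecutive bytes, so moving one pixel down adds `w` to the index and moving one pixel right adds 1.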
  10. Version 0

        # Python 2 code (note xrange), as shown in the talk
        import array
        from math import sqrt

        def sobel(img):
            w, h, data = img
            data_out = array.array('B', [0]) * (w*h)
            out = w, h, data_out
            for y in xrange(1, h-1):
                for x in xrange(1, w-1):
                    dx = (-1.0 * data[(x-1) + w*(y-1)] +
                           1.0 * data[(x+1) + w*(y-1)] +
                          -2.0 * data[(x-1) + w*y    ] +
                           2.0 * data[(x+1) + w*y    ] +
                          -1.0 * data[(x-1) + w*(y+1)] +
                           1.0 * data[(x+1) + w*(y+1)])
                    dy = (-1.0 * data[(x-1) + w*(y-1)] +
                          -2.0 * data[x     + w*(y-1)] +
                          -1.0 * data[(x+1) + w*(y-1)] +
                           1.0 * data[(x-1) + w*(y+1)] +
                           2.0 * data[x     + w*(y+1)] +
                           1.0 * data[(x+1) + w*(y+1)])
                    value = min(int(sqrt(dx*dx + dy*dy) / 2.0), 255)
                    data_out[x + w*y] = value
            return out
  11. Version 0: PyPy is ~59x faster than CPython
  12. Version 1

        def get(img, x, y):
            w, h, data = img
            i = x + y*w
            return data[i]

        def set(img, x, y, value):
            w, h, data = img
            i = x + y*w
            data[i] = value

        def sobel(img):
            w, h, data = img
            out = w, h, array.array('B', [0]) * (w*h)
            for y in xrange(1, h-1):
                for x in xrange(1, w-1):
                    dx = (-1.0 * get(img, x-1, y-1) +
                           1.0 * get(img, x+1, y-1) +
                          -2.0 * get(img, x-1, y)   +
                           2.0 * get(img, x+1, y)   +
                          -1.0 * get(img, x-1, y+1) +
                           1.0 * get(img, x+1, y+1))
                    dy = ...
                    ...
  13. Version 1: PyPy is ~97x faster than CPython
  14. Version 2

        class Image(object):
            def __init__(self, width, height, data=None):
                self.width = width
                self.height = height
                if data is None:
                    self.data = array.array('B', [0]) * (width*height)
                else:
                    self.data = data

            def __getitem__(self, idx):
                x, y = idx
                return self.data[x + y*self.width]

            def __setitem__(self, idx, value):
                x, y = idx
                self.data[x + y*self.width] = value
  15. Version 2: PyPy is ~170x faster than CPython
  16. Version 3

        _Point = namedtuple('_Point', ['x', 'y'])

        class Point(_Point):
            def __add__(self, other):
                ox, oy = other
                x = self.x + ox
                y = self.y + oy
                return self.__class__(x, y)

        class ImageIter(object):
            def __init__(self, x0, x1, y0, y1):
                self.it = itertools.product(xrange(x0, x1), xrange(y0, y1))

            def __iter__(self):
                return self

            def next(self):
                x, y = next(self.it)
                return Point(x, y)

        class Image(v2.Image):
            def noborder(self):
                return ImageIter(1, self.width-1, 1, self.height-1)
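
To make the `Point` arithmetic on this slide concrete, here is a small sketch adapted to Python 3 (`range` instead of `xrange`). The 4x4 image size and the variable names `q` and `interior` are assumptions chosen only for illustration.

```python
from collections import namedtuple
import itertools

_Point = namedtuple('_Point', ['x', 'y'])

class Point(_Point):
    def __add__(self, other):
        # "other" can be any 2-item sequence, e.g. a plain tuple offset
        ox, oy = other
        return self.__class__(self.x + ox, self.y + oy)

p = Point(3, 4)
q = p + (-1, 1)   # neighbour one pixel to the left and one pixel down

# Non-border points of a hypothetical 4x4 image, in the order that
# itertools.product yields them (the y coordinate varies fastest).
interior = [Point(x, y)
            for x, y in itertools.product(range(1, 3), range(1, 3))]
```

Because `Point` overrides `__add__`, adding a tuple produces a new `Point` rather than the concatenated tuple that a plain namedtuple would return.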
  17. Version 3

        def sobel(img):
            img = Image(*img)
            out = Image(img.width, img.height)
            for p in img.noborder():
                dx = (-1.0 * img[p + (-1,-1)] +
                       1.0 * img[p + ( 1,-1)] +
                      -2.0 * img[p + (-1, 0)] +
                       2.0 * img[p + ( 1, 0)] +
                      -1.0 * img[p + (-1, 1)] +
                       1.0 * img[p + ( 1, 1)])
                dy = (-1.0 * img[p + (-1,-1)] +
                      -2.0 * img[p + ( 0,-1)] +
                      -1.0 * img[p + ( 1,-1)] +
                       1.0 * img[p + (-1, 1)] +
                       2.0 * img[p + ( 0, 1)] +
                       1.0 * img[p + ( 1, 1)])
                value = min(int(sqrt(dx*dx + dy*dy) / 2.0), 255)
                out[p] = value
  18. Version 3: PyPy is ~435x faster than CPython
  19. Version 4

        class Kernel(object):
            def __init__(self, matrix):
                self.height = len(matrix)
                self.width = len(matrix[0])
                self.matrix = matrix

            def __call__(self, img, p):
                value = 0.0
                for j, row in enumerate(self.matrix, -(self.height/2)):
                    for i, k in enumerate(row, -(self.width/2)):
                        value += img[p + (i, j)] * k
                return value

        Gx = Kernel([[-1.0, 0.0, +1.0],
                     [-2.0, 0.0, +2.0],
                     [-1.0, 0.0, +1.0]])

        Gy = Kernel([[-1.0, -2.0, -1.0],
                     [ 0.0,  0.0,  0.0],
                     [+1.0, +2.0, +1.0]])

        def sobel(img):
            ...
            dx = Gx(img, p)
            dy = Gy(img, p)
            ...
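
The `enumerate(..., start)` trick in `Kernel.__call__` maps matrix positions to pixel offsets centred on the current point. The sketch below isolates that mapping. It is written for Python 3, so it uses `//` where the slide relies on Python 2 integer division, and the generator name `kernel_offsets` is hypothetical.

```python
GX_MATRIX = [[-1.0, 0.0, +1.0],
             [-2.0, 0.0, +2.0],
             [-1.0, 0.0, +1.0]]

def kernel_offsets(matrix):
    """Yield ((i, j), k) pairs: pixel offset and kernel weight."""
    height = len(matrix)
    width = len(matrix[0])
    # For a 3x3 matrix both counters start at -1, so the centre is (0, 0).
    for j, row in enumerate(matrix, -(height // 2)):
        for i, k in enumerate(row, -(width // 2)):
            yield (i, j), k

offsets = list(kernel_offsets(GX_MATRIX))
```

Here `offsets` begins with `((-1, -1), -1.0)` and ends with `((1, 1), 1.0)`, which matches the hand-written terms of `dx` in the earlier versions plus the zero-weight centre column.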
  20. Version 4: PyPy massively slower :( (still 76x faster than CPython). I’m a liar. PyPy sucks.
  21. Version 5

        from pypytools.codegen import Code

        def Kernel(matrix):
            height = len(matrix)
            width = len(matrix[0])
            code = Code()
            with code.block('def apply(img, p):'):
                code.w('value = 0.0')
                for j, row in enumerate(matrix, -(height/2)):
                    for i, k in enumerate(row, -(width/2)):
                        if k == 0:
                            continue
                        code.w('value += img[p+{delta}] * {k}', delta=(i, j), k=k)
                code.w('return value')
            code.compile()
            return code['apply']
  22. Version 5

        # GENERATED CODE
        def apply(img, p):
            value = 0.0
            value += img[p+(-1, -1)] * -1.0
            value += img[p+(1, -1)] * 1.0
            value += img[p+(-1, 0)] * -2.0
            value += img[p+(1, 0)] * 2.0
            value += img[p+(-1, 1)] * -1.0
            value += img[p+(1, 1)] * 1.0
            return value
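
The same unrolling idea can be sketched without `pypytools`, using only string building and the built-in `exec`. This is an illustration under stated assumptions, not the talk's implementation: `make_kernel`, the tuple-based `Point`, and the dict-backed `img` are all hypothetical stand-ins, and the code targets Python 3.

```python
def make_kernel(matrix):
    """Generate an unrolled apply(img, p) function for the given matrix."""
    height = len(matrix)
    width = len(matrix[0])
    lines = ['def apply(img, p):', '    value = 0.0']
    for j, row in enumerate(matrix, -(height // 2)):
        for i, k in enumerate(row, -(width // 2)):
            if k == 0:
                continue   # zero weights contribute nothing, so skip them
            lines.append('    value += img[p + {!r}] * {!r}'.format((i, j), k))
    lines.append('    return value')
    source = '\n'.join(lines)
    namespace = {}
    exec(source, namespace)   # compile the generated source
    return namespace['apply'], source

class Point(tuple):
    """Stand-in point type: adding a 2-tuple offset yields a new Point."""
    def __add__(self, other):
        return Point((self[0] + other[0], self[1] + other[1]))

# Stand-in 3x3 "image" whose pixel value equals its x coordinate.
img = {(x, y): float(x) for x in range(3) for y in range(3)}

gx_apply, gx_source = make_kernel([[-1.0, 0.0, +1.0],
                                   [-2.0, 0.0, +2.0],
                                   [-1.0, 0.0, +1.0]])
result = gx_apply(img, Point((1, 1)))
```

The generated function contains no loop and no matrix lookups, only six straight-line additions, which is the shape of code the hand-written `dx` expression had in versions 0 to 3.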
  23. Version 5: PyPy is ~428x faster again
  24. The cost of abstraction

    CPython: each version is ~2-3x slower than the previous one; v3 is ~8.5x slower than v0
    PyPy: abstractions (almost) for free; v5 is ~18% slower than v0, v1, v2
  25. PyPy JIT 101: What is the JIT doing? Which code is optimized away? Very rough explanation.

    For a deeper view: http://speakerdeck.com/u/antocuni/p/pypy-jit-under-the-hood and http://www.youtube.com/watch?v=cMtBUvORCfU
  26. Loops and guards

        def compute(n):
            total = 0
            i = 0
            while i < n:
                total += i
                i += 1
            return total

        cdef loop0(i, n, total):
            assert isinstance(n, int)
            while True:
                assert i < n
                total = int_add_ovf(total, i)
                assert not_overflow(total)
                i = int_add_ovf(i, 1)
                assert not_overflow(i)
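
The pseudo-trace on this slide can be mimicked in plain Python to see what the guards protect. This is only a toy model, not how PyPy's JIT is implemented: `GuardFailed`, the Python-level `int_add_ovf`, and the choice to simply return when the `i < n` guard fails are all assumptions made for illustration.

```python
import sys

class GuardFailed(Exception):
    """An assumption baked into the specialised loop no longer holds."""

def int_add_ovf(a, b):
    # Model a machine-word addition: fail the guard instead of overflowing.
    result = a + b
    if not -sys.maxsize - 1 <= result <= sys.maxsize:
        raise GuardFailed('integer overflow')
    return result

def loop0(i, n, total):
    if not isinstance(n, int):
        raise GuardFailed('n is not an int')
    while True:
        if not i < n:
            # In this model, a failing "i < n" guard just leaves the loop.
            return total
        total = int_add_ovf(total, i)
        i = int_add_ovf(i, 1)

def compute(n):
    # The original function from the slide, for comparison.
    total = 0
    i = 0
    while i < n:
        total += i
        i += 1
    return total
```

As long as every guard holds, `loop0(0, n, 0)` computes the same value as `compute(n)`. When a guard fails, the real JIT falls back to a slower, more general path, which this model represents with an exception.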
  28. Bridges (1)

        def compute(n):
            total = 0
            i = 0
            while i < n:
                if i % 2:
                    total += i
                else:
                    total += (i-5)
                i += 1
            return total

        cdef loop0(i, n, total):
            assert isinstance(n, int)
            while True:
                assert i < n
                assert i % 2 != 0
                total = int_add_ovf(total, i)
                assert not_overflow(total)
                i = int_add_ovf(i, 1)
                assert not_overflow(i)
  30. Bridges (2)

        def compute(n):
            total = 0
            i = 0
            while i < n:
                if i % 2:
                    total += i
                else:
                    total += (i-5)
                i += 1
            return total

        cdef loop0(i, n, total):
            assert isinstance(n, int)
            while True:
                assert i < n
                if i % 2 != 0:
                    total = int_add_ovf(total, i)
                    assert not_overflow(total)
                    i = int_add_ovf(i, 1)
                    assert not_overflow(i)
                else:
                    tmp = int_sub_ovf(i, 5)
                    assert not_overflow(tmp)
                    total = int_add_ovf(total, tmp)
                    i = int_add_ovf(i, 1)
                    assert not_overflow(i)
  31. Inlining

        def fn(a, b):
            return a + b

        def compute(n):
            total = 0
            i = 0
            while i < n:
                total = fn(total, i)
                i += 1
            return total

        assert version(globals()) == 42
        assert id(fn.__code__) == 0x1234
        #
        assert isinstance(n, int)
        while True:
            assert i < n
            total = int_add_ovf(total, i)  # inlined!
            assert not_overflow(total)
            i = int_add_ovf(i, 1)
            assert not_overflow(i)
  32. Classes

        import math

        class Point(object):
            def __init__(self, x, y):
                self.x = x
                self.y = y

            def distance(self):
                return math.hypot(self.x, self.y)

        def compute(points):
            total = 0
            for p in points:
                total += p.distance()
            return total

        cdef loop0(total, list_iter):
            assert version(globals()) == 42
            assert version(math.__dict__) == 23
            assert version(Point.__dict__) == 56
            assert id(Point.distance.__globals__) == 0x1
            assert version(Point.distance.__globals__) =
            assert id(Point.distance.__code__) == 0x5678
            while True:
                p = next(list_iter)
                assert isinstance(p, Point)
                # <inlined Point.distance>
                assert isinstance(p.x, float)
                assert isinstance(p.y, float)
                p_x = p.x
                p_y = p.y
                tmp = c_call(math.hypot, p_x, p_y)
                # </inlined>
                total = float_add(total, tmp)
  33. Virtuals

        def compute(n):
            total = 0.0
            i = 0.0
            while i < n:
                p = Point(i, i+1)
                total += p.distance()
                i += 1
            return total

        assert ...
        assert isinstance(n, int)
        assert isinstance(i, float)
        while True:
            assert i < n
            # Point() is "virtualized" into p_x and p_y
            p_x = i
            p_y = float_add(i, 1.0)
            #
            # inlined call to Point.distance
            tmp = c_call(math.hypot, p_x, p_y)
            total = float_add(total, tmp)
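
The effect of virtualization can be illustrated by writing out by hand what the trace on this slide amounts to. In this sketch, `compute_virtualized` is a hypothetical, manually flattened version of `compute`: the `Point` is never allocated and its two fields live in local variables. On PyPy the JIT is expected to reach a similar shape on its own, so this is a picture of the result, not something you need to write.

```python
import math

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance(self):
        return math.hypot(self.x, self.y)

def compute(n):
    # Readable version: allocates one Point per iteration.
    total = 0.0
    i = 0.0
    while i < n:
        p = Point(i, i + 1)
        total += p.distance()
        i += 1
    return total

def compute_virtualized(n):
    # Hand-flattened version: the Point exists only as p_x and p_y.
    total = 0.0
    i = 0.0
    while i < n:
        p_x = i
        p_y = i + 1.0
        total += math.hypot(p_x, p_y)
        i += 1
    return total
```

Both functions perform the same floating-point operations in the same order, so they return identical results; only the object allocation and method call differ.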
  35. More PyPy at EuroPython

    PyPy Help Desk: tomorrow, 10:30-12:00 and 14:00-15:30. Come and ask us questions!
    "PyPy meets Python 3 and numpy", Armin Rigo: Friday, 14:00
    Or, just talk to us :) @pypyproject, @antocuni