
PyPy Use Case and Experiment

These are my slides from PyCon TW 2017. The talk is titled "PyPy's approach to construct domain-specific language runtime", and the slides cover some experiments and benchmarks related to PyPy, a JIT-enabled implementation of Python.

Tsundere Chen

June 11, 2017

Transcript

  1. Test Environment • Vagrant, Ubuntu 16.04 • Benchmark results on the host OS and the guest OS were very close, so I ran the benchmarks in a VM (BTW, it's really easy to get your VM dirty)
  2. Test Versions • CPython2 2.7.12 • CPython3 3.5.2 • PyPy2 5.1.2 (installed via apt-get) • PyPy2 5.6.0 (compiled from source) • PyPy3 5.7.1 (compiled from source)
  3. Why only CPython & PyPy • Cython • You need to learn Cython's syntax, which mixes C and Python • Jython • The latest release, Jython 2.7.0, came out in May 2015, so it is outdated
  4. Some notes • PyPy3 is still in beta, so it is no surprise if it is slower than CPython 3 • Not every module runs faster on PyPy than on CPython; examples follow later
  5. Why is PyPy3 in beta? • CPython and PyPy are developed in different ways • CPython • Focuses on Python 3; Python 2 is only maintained when security issues come up • PyPy • Focuses on PyPy2; PyPy3 is also updated, but it is not the main line of development
  6. PyPy Installation • If you want to compile PyPy from scratch • First, install the dependencies • http://doc.pypy.org/en/latest/build.html • Then, cd to pypy/goal
  7. PyPy Installation • No-JIT • <Python/PyPy> ../../rpython/bin/rpython --opt=2 • JIT-Enabled • <Python/PyPy> ../../rpython/bin/rpython --opt=jit
  8. PyPy Installation • Notice • Compiling PyPy takes a long time, and compiling it with the JIT enabled takes even longer • It usually takes 30 minutes or more • You need at least 4 GB of RAM to compile it on a 64-bit machine; make sure you have enough, or the build may be killed by the system
  9. Not every case should use PyPy • For example, CPython runs the code below faster than PyPy
 myStr = ""
 for x in xrange(1, 10**6):
     myStr += str(x)
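
The loop on the slide builds a string by repeated concatenation. As a rough sketch, the snippet below compares that pattern with `"".join`, which is the form usually suggested when repeated `+=` turns out to be slow. The function names and the `timed` helper are illustrative and not from the talk, and the snippet uses `range` so that it runs on both Python 2 and Python 3. Actual timings depend on the interpreter and version, so none are shown.

```python
import timeit


def build_with_concat(n):
    # Same pattern as the slide: grow the string one piece at a time.
    s = ""
    for x in range(1, n):
        s += str(x)
    return s


def build_with_join(n):
    # Collect the pieces first, then join them once.
    return "".join([str(x) for x in range(1, n)])


def timed(func, n):
    # Return the result together with the elapsed wall-clock time in seconds.
    start = timeit.default_timer()
    result = func(n)
    return result, timeit.default_timer() - start


if __name__ == "__main__":
    for func in (build_with_concat, build_with_join):
        _, seconds = timed(func, 10 ** 5)
        print(func.__name__, seconds)
```

Running the same file under CPython and PyPy is one way to check whether the claim on the slide holds for your versions.
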
  10. Example • Language: Brainf*ck • 8 commands
 +  mem[ptr] += 1          -  mem[ptr] -= 1
 <  ptr -= 1               >  ptr += 1
 ,  input()                .  print()
 [  while(mem[ptr]){       ]  }
  11. Our Goal • Build a Brainf*ck interpreter • Build a Brainf*ck-to-Python translator, and compile it with PyPy
  12. Interpreter • Just read in the file and execute the commands • But we can add a JIT here
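
A minimal sketch of such an interpreter in plain Python, without any JIT. The tape size of 30000 cells, the 8-bit wraparound, and treating end of input as 0 are assumptions chosen for this example; the talk does not specify them. The names `build_bracket_map` and `run` are illustrative.

```python
def build_bracket_map(program):
    # Pair up every "[" with its matching "]" once, before execution.
    stack, bracket_map = [], {}
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            bracket_map[start] = pos
            bracket_map[pos] = start
    return bracket_map


def run(program, input_data=""):
    mem = [0] * 30000          # assumed tape size
    ptr = pc = read_pos = 0
    out = []
    bracket_map = build_bracket_map(program)
    while pc < len(program):
        ch = program[pc]
        if ch == "+":
            mem[ptr] = (mem[ptr] + 1) % 256   # assumed 8-bit cells
        elif ch == "-":
            mem[ptr] = (mem[ptr] - 1) % 256
        elif ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == ".":
            out.append(chr(mem[ptr]))
        elif ch == ",":
            if read_pos < len(input_data):
                mem[ptr] = ord(input_data[read_pos])
                read_pos += 1
            else:
                mem[ptr] = 0                  # assumed end-of-input behaviour
        elif ch == "[" and mem[ptr] == 0:
            pc = bracket_map[pc]              # skip the loop body
        elif ch == "]" and mem[ptr] != 0:
            pc = bracket_map[pc]              # jump back to the loop start
        pc += 1
    return "".join(out)
```

For example, `run("++++++++[>++++++++<-]>+.")` computes 8 * 8 + 1 = 65 in the second cell and returns the character with that code.
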
  13. What to do to add JIT • We need to find the "Reds" and "Greens" • Greens -> The variables that define which instruction is being executed (e.g. the program counter) • Reds -> What's being manipulated (e.g. the memory)
  14. What to do to add JIT • from rpython.rlib.jit import JitDriver • jitdriver = JitDriver(greens=[], reds=[]) • and add jit_merge_point to your main loop
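
A sketch of where the `JitDriver` pieces go in a main loop. The choice of greens (`pc`, `program`) and reds (`ptr`, `mem`), the tiny four-command loop, and the 16-cell tape are assumptions made for this example. The fallback class is a stand-in that only exists so the file also runs on an interpreter without RPython installed; it does no JIT compilation, and this sketch has not been run through the RPython translator.

```python
try:
    from rpython.rlib.jit import JitDriver
except ImportError:
    class JitDriver(object):
        """Hypothetical stand-in used only when RPython is unavailable."""

        def __init__(self, greens, reds):
            self.greens, self.reds = greens, reds

        def jit_merge_point(self, **live_vars):
            pass


# Greens identify the position in the interpreted program;
# reds are the data the program manipulates.
jitdriver = JitDriver(greens=["pc", "program"], reds=["ptr", "mem"])


def mainloop(program):
    pc = 0
    ptr = 0
    mem = [0] * 16  # assumed tiny tape, enough for the illustration
    while pc < len(program):
        # Tell the JIT that one iteration of the interpreter loop starts here.
        jitdriver.jit_merge_point(pc=pc, program=program, ptr=ptr, mem=mem)
        ch = program[pc]
        if ch == "+":
            mem[ptr] += 1
        elif ch == "-":
            mem[ptr] -= 1
        elif ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        pc += 1
    return mem
```
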
  15. Optimize • Speed up loops • Every loop has to look up its jump address in a dictionary, but the dictionary is static, so we can move the lookup into its own function and mark it with the @elidable decorator to speed it up
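
A sketch of the lookup function described above. The name `get_matching_bracket` and the shape of `bracket_map` are assumptions for this example. `elidable` is imported from `rpython.rlib.jit`; the fallback decorator is a hypothetical no-op so the snippet still runs without RPython.

```python
try:
    from rpython.rlib.jit import elidable
except ImportError:
    def elidable(func):
        # Hypothetical stand-in: plain Python needs no hint, so do nothing.
        return func


@elidable
def get_matching_bracket(bracket_map, pc):
    # Safe to mark as elidable only because bracket_map is never modified
    # after parsing: the same arguments always give the same result.
    return bracket_map[pc]


if __name__ == "__main__":
    # For the program "[-]": position 0 pairs with position 2.
    bracket_map = {0: 2, 2: 0}
    print(get_matching_bracket(bracket_map, 0))
```

The main loop would then call `get_matching_bracket(bracket_map, pc)` instead of indexing the dictionary directly.
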
  16. Basic Knowledge • It reads in a Brainf*ck file and turns it into IR • Then you can choose to optimize the IR • Finally, it turns the IR into Python code, which is compiled with PyPy to generate a binary file • Brainf*ck Code -> IR -> Python Code -> Binary File
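
A rough sketch of that pipeline, not the talk's actual ir.py. The IR here is just a list of hypothetical operation names, and the generated Python is executed with `exec` instead of being compiled to a binary, so that the example stays self-contained. The 30000-cell tape and 8-bit cells are assumptions.

```python
IR_NAMES = {"+": "inc", "-": "dec", ">": "right", "<": "left",
            ".": "out", ",": "in", "[": "loop", "]": "end"}

TEMPLATES = {
    "inc": "mem[p] = (mem[p] + 1) % 256",
    "dec": "mem[p] = (mem[p] - 1) % 256",
    "right": "p += 1",
    "left": "p -= 1",
    "out": "out.append(chr(mem[p]))",
    "in": "mem[p] = ord(inp.pop(0)) if inp else 0",
    "loop": "while mem[p]:",
}


def to_ir(source):
    # Drop every character that is not one of the 8 commands.
    return [IR_NAMES[ch] for ch in source if ch in IR_NAMES]


def to_python(ir):
    lines, depth = [], 0
    for op in ir:
        if op == "end":
            depth -= 1
            continue
        lines.append("    " * depth + TEMPLATES[op])
        if op == "loop":
            depth += 1
            lines.append("    " * depth + "pass")  # keeps empty loops valid
    return "\n".join(lines)


def run_generated(code, input_text=""):
    env = {"mem": [0] * 30000, "p": 0, "out": [], "inp": list(input_text)}
    exec(code, env)
    return "".join(env["out"])
```

An optimization pass would sit between `to_ir` and `to_python`, rewriting the list before any Python is generated.
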
  17. Architecture • ir.py -> Brainf*ck to IR and IR to Python • trans.py -> Main program • python trans.py <input> <output> <optmode> • optmode 1 turns optimization on, 0 turns it off • opt.py -> Optimization tricks
  18. Optimizations • opt_contract (Contract) • A run like "+++++" means we have to do "mem[p] += 1" five times • But because we have the IR, we can change the instruction to a single "mem[p] += 5" • This trick applies to "+", "-", ">" and "<"
  19. Optimizations • opt_clearloop (Clear Loop) • A command like [-] means while(mem[p]), do mem[p] -= 1 • We know what the result is, so we can set mem[p] to zero directly
 mem[p] = 0
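
A minimal sketch of a clear-loop pass. It assumes a hypothetical IR made of the operation names "loop", "dec" and "end" plus a new "clear" operation; the real opt.py may represent things differently.

```python
def opt_clearloop(ir):
    # Replace every ["loop", "dec", "end"] triple (the IR for "[-]")
    # with a single "clear" operation, i.e. mem[p] = 0.
    out = []
    i = 0
    while i < len(ir):
        if ir[i:i + 3] == ["loop", "dec", "end"]:
            out.append("clear")
            i += 3
        else:
            out.append(ir[i])
            i += 1
    return out
```

The code generator then only needs one extra template that emits `mem[p] = 0` for "clear".
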
  20. Optimizations • opt_multiloop & opt_copyloop (Multiplication and Copy) • A command like [->+>+<<] copies mem[p]'s value to mem[p+1] and mem[p+2], and sets mem[p] to zero • If we know what the loop is doing, we can replace it with something shorter
  21. Optimizations • opt_multiloop & opt_copyloop (Multiplication and Copy) • The same trick applies to [->++<], which makes mem[p+1] = 2 * mem[p] and sets mem[p] = 0 • Which is multiplication
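
One way to recognize such loops is to simulate the loop body once and record how much each cell changes per iteration. The sketch below is an illustration under assumptions: the function names are hypothetical, the body must contain only `+ - < >`, and cells are unbounded integers. Strictly speaking the loop adds to the target cells, so the sketch uses `+=`; the result matches the slides whenever the targets start at zero.

```python
def analyze_multiply_loop(body):
    # body is the text between "[" and "]". Returns {offset: factor} when the
    # loop is a copy/multiply loop, or None when it is not.
    if any(ch not in "+-<>" for ch in body):
        return None
    offset, deltas = 0, {}
    for ch in body:
        if ch == ">":
            offset += 1
        elif ch == "<":
            offset -= 1
        else:
            deltas[offset] = deltas.get(offset, 0) + (1 if ch == "+" else -1)
    # The pointer must return to where it started, and the counter cell
    # must go down by exactly one per iteration.
    if offset != 0 or deltas.get(0, 0) != -1:
        return None
    return {off: f for off, f in deltas.items() if off != 0 and f != 0}


def apply_multiply_loop(mem, p, factors):
    # Do the whole loop in one step.
    for off, factor in factors.items():
        mem[p + off] += factor * mem[p]
    mem[p] = 0
```

For `[->+>+<<]` the analysis yields a factor of 1 at offsets 1 and 2 (a copy), and for `[->++<]` a factor of 2 at offset 1 (a multiplication).
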
  22. Optimizations • opt_offsetops (Operation Offsets) • In Brainf*ck there is a pointer indicating where we are now, and the pointer usually moves a lot • What if we calculate the offset for each instruction directly, so we don't need to move the pointer around?
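
A sketch of the idea for a straight-line block (no loops, no I/O), which is an assumption made to keep the example small. Each `+` or `-` is turned into an operation that carries its own offset from the starting pointer, and the pointer is moved once at the end. The tuple layout is hypothetical.

```python
def opt_offsetops(block):
    # Returns (ops, final_offset), where each op is ("add", offset, delta).
    offset = 0
    ops = []
    for ch in block:
        if ch == ">":
            offset += 1
        elif ch == "<":
            offset -= 1
        elif ch in "+-":
            ops.append(("add", offset, 1 if ch == "+" else -1))
        else:
            raise ValueError("only + - < > are handled in this sketch")
    return ops, offset


def run_offsetops(mem, p, ops, final_offset):
    # Apply every operation relative to p, then move the pointer once.
    for _, offset, delta in ops:
        mem[p + offset] += delta
    return p + final_offset
```

For `>+>-<<` this gives two `add` operations at offsets 1 and 2 and a final pointer move of 0, so the pointer never has to move at all.
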
  23. Optimizations • opt_cancel (Cancel Instructions) • ++++-->>+-<<< does the same thing as ++< • So why waste time on all these instructions?
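
Contraction and cancellation can both be sketched with one small stack-based pass: adjacent operations of the same kind are merged into a net amount, and anything that nets to zero is dropped. The function name and the `(kind, amount)` tuples are assumptions for this example, not the talk's opt.py.

```python
DELTAS = {"+": ("add", 1), "-": ("add", -1),
          ">": ("move", 1), "<": ("move", -1)}


def contract_and_cancel(code):
    out = []
    for ch in code:
        if ch not in DELTAS:
            if ch in ",.[]":
                out.append((ch, None))  # kept as-is; also blocks merging
            continue
        kind, delta = DELTAS[ch]
        if out and out[-1][0] == kind:
            # Merge with the previous operation of the same kind.
            total = out.pop()[1] + delta
            if total != 0:
                out.append((kind, total))
        else:
            out.append((kind, delta))
    return out


if __name__ == "__main__":
    print(contract_and_cancel("+++++"))
    print(contract_and_cancel("++++-->>+-<<<"))
```

Because a cancelled pair is popped from the stack, the operations on either side of it become adjacent and can merge in turn, which is how the sequence from the slide reduces to the same result as `++<`.
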
  24. Wait a sec... • Not every case can use JIT • Because a JIT needs time to warm up and analyze the code • The warm-up may take longer than your code actually runs • And it's important not to include the warm-up time when you do benchmarking
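
One simple way to keep warm-up out of a measurement is to call the function a few times before starting the clock. The sketch below is an illustration; the `benchmark` name and the default warm-up and repeat counts are arbitrary assumptions, and how many warm-up runs a JIT actually needs depends on the workload.

```python
import timeit


def benchmark(func, warmup=5, repeat=10):
    # Untimed runs: give the JIT a chance to trace and compile hot loops.
    for _ in range(warmup):
        func()
    # Timed runs: only these count toward the result.
    timings = []
    for _ in range(repeat):
        start = timeit.default_timer()
        func()
        timings.append(timeit.default_timer() - start)
    return min(timings)


if __name__ == "__main__":
    print(benchmark(lambda: sum(range(10 ** 5))))
```

Whether warm-up should be excluded depends on the question being asked: for a short-lived script the warm-up is part of the real cost.
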
  25. Wait a sec... • And do you really need JIT? • It may cost a lot to bring a JIT into a project • Sometimes buying more servers is a better choice than bringing a JIT into your project
  26. Wait a sec... • But if you have analyzed your project and know how difficult it would be to bring PyPy and JIT into it, then you're good to go! • BTW, the executable built with JIT enabled is bigger than the one built without JIT
  27. References • Tutorial: Writing an Interpreter with PyPy, Part 1 • https://morepypy.blogspot.tw/2011/04/tutorial-writing-interpreter-with-pypy.html • PyPy - Tutorial for Brainf*ck Interpreter • http://wdv4758h.github.io/posts/2015/01/pypy-tutorial-for-brainfuck-interpreter/ • matslina/bfoptimization • https://github.com/matslina/bfoptimization/ • Virtual Machine Constructions for Dummies • https://www.slideshare.net/jserv/vm-construct