Slide 1

Slide 1 text

Gccpy – Python Front-end to GCC
Philip Herron - @redzor [email protected] - http://redbrain.co.uk

Slide 2

Slide 2 text

Introduction
● Fully ahead-of-time compiler for pure Python code (no RPython)
● Currently targeting Python 2.4 syntax (simple exception handling)
● Implemented on top of GCC for a feature-rich middle end and back end
● Not using LLVM, as it wasn't as mature back in 2009 when I was developing the idea
● I didn't like LLVM's code base at the time

Slide 3

Slide 3 text

What is a Front End?

Slide 4

Slide 4 text

Why develop this
● GOAL: Is JIT compilation better than ahead-of-time compilation with current technologies and languages?
– No one has really tested this fully, but it always ends up in flame wars.
● Basically GCCPY vs PyPy.
● To learn and gain a deep understanding of compilers and code generation
– Python is argued to be slow
● Inspired by PHC, a PHP compiler, and Paul Biggar's PhD work on optimizing dynamic languages
● Surely making Python code native is faster, right...?
– This isn't necessarily true

Slide 5

Slide 5 text

History
● Always wanted a deep, full project to get into, and got accepted to GSoC 2010
● PHC - http://www.phpcompiler.org/ PHP compiler: Paul Biggar
– PhD thesis: http://paulbiggar.com/research/#phd-dissertation
● Wanted to learn how compilers and languages really worked
● Python is awesome. Paul also wanted to do this exact project; he says so in his Google talk on his PhD (it's on YouTube, search his name).
● In the end PHC failed for many reasons, I believe mostly because it was a home-brew compiler!
– If you reuse GCC with all its lovely middle/back-end optimizers, you could have something special
– Also why I can't stand compilers which output C, because it's LAZY

Slide 6

Slide 6 text

Hello World
$ gccpy -g -O2 -fpy-gen-main helloworld.py -o helloworld
$ ./helloworld

Slide 7

Slide 7 text

Compiling Python – DOT IL
● Developed DOT_IL (Python with braces ;) )
– It is a dynamically typed language with some optimizations
– Constant folding at a high level
– Sits on top of GCC GENERIC, which is very C-like: no control structures, almost three-address code
– Lowered to GIMPLE and into the GCC middle-end optimizers, and finally the back end with target-specific optimizations
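The high-level constant folding mentioned above can be illustrated with a minimal sketch. This is not DOT_IL or gccpy code: it uses Python's standard `ast` module as a stand-in IL, and `ConstantFolder`, `fold_constants` and the `_FOLDABLE` table are hypothetical names chosen for the example. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast
import operator

# Hypothetical folding table: only a few arithmetic operators are handled.
_FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


class ConstantFolder(ast.NodeTransformer):
    """Replace `<const> op <const>` sub-expressions with their computed value."""

    def visit_BinOp(self, node):
        # Fold the children first so nested constants collapse bottom-up.
        self.generic_visit(node)
        fold = _FOLDABLE.get(type(node.op))
        if (fold is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            value = fold(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold_constants(source):
    """Parse source, fold constant arithmetic, and return the rewritten source."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold_constants("x = 2 * 3 + y"))
```

Only the `2 * 3` part is folded here, because `y` is not known until runtime; that is the kind of work a front end can do before handing the code to the GCC middle end.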

Slide 8

Slide 8 text

The whole Pipeline together!

Slide 9

Slide 9 text

Dynamic Typing
● What something points to!
● Basically looks similar to CPython: gpy_object_t and PyObject are very similar structures in principle.
● The runtime is tuned for ahead-of-time compilation: more behaviour is compiled in rather than figured out dynamically at runtime.
● Objects are allocated at runtime into gpy_object_t
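As a rough illustration of the "tag plus payload" idea behind structures like gpy_object_t and PyObject, here is a small Python sketch. The real layout of gpy_object_t is not shown on the slide, so `TypeTag`, `BoxedObject`, `box` and `add` are all hypothetical names, and the two tags are an arbitrary subset chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any


class TypeTag(Enum):
    # Hypothetical tags; the real runtime's type set is not shown on the slide.
    INT = auto()
    STR = auto()


@dataclass
class BoxedObject:
    """Stand-in for a gpy_object_t / PyObject style record: type tag plus payload."""
    tag: TypeTag
    payload: Any


def box(value):
    """Allocate a boxed object at runtime for a supported Python value."""
    if isinstance(value, int):
        return BoxedObject(TypeTag.INT, value)
    if isinstance(value, str):
        return BoxedObject(TypeTag.STR, value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def add(lhs, rhs):
    """Runtime add: operand types are only known by inspecting the tags."""
    if lhs.tag is not rhs.tag:
        raise TypeError("operand types differ")
    return BoxedObject(lhs.tag, lhs.payload + rhs.payload)


print(add(box(2), box(3)))
```

A variable is then just a reference to such a box, which is what "what something points to" amounts to in this sketch.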

Slide 10

Slide 10 text

Memory Addressing

Slide 11

Slide 11 text

Issues
● Empty class declarations such as:
    class myClass: pass
● Won't be supported by gccpy

Slide 12

Slide 12 text

Latency and Speed
● Programmers have forgotten how powerful C is!
● Programmers have forgotten how powerful old hardware is.
● The "add more resources" argument is broken.

Slide 13

Slide 13 text

Demo 1
Hello World!

Slide 14

Slide 14 text

You're gonna make this face next

Slide 15

Slide 15 text

Structure of the Gccpy Front-End

Slide 16

Slide 16 text

Gccpy vs Others
● Nuitka is pretty much a clone of ShedSkin....
– They generate optimized, embedded Python/C++ depending on libpython.so.
● Cython is awesome for mixing pure Python + C
– Dialect of Python that supports C types
– Can be used to 'compile' Python, but it's not really compiling
● PyPy: JIT'd Python boasting lots of runtime optimization.
● Numba and Parakeet are extremely DSL-specific

Slide 17

Slide 17 text

Compile Time vs Runtime
● EVERYTHING is already JIT'ed before running your program
– Reduces the size of the software stack: no intermediate virtual machine.
● Runtime optimizations on the language are very hard to do.
– JIT optimization mostly revolves around target CPU funkiness
● GCC is already aware of this: -mtune
– Languages and architectures don't lend themselves to the current state of dynamic languages.

Slide 18

Slide 18 text

Good and the Bad
● Developing this is a huge task!
– Takes a long time; everything is from scratch
● Generating sensible code
– JITs are in principle a better idea for the future of dynamic languages, but I don't agree with this in our current landscape
– PyPy has a budget!
– Ahead-of-time binaries are always going to be around
– Embedded systems will always benefit from this type of project.
– Native applications without the need to change language

Slide 19

Slide 19 text

Gccpy Optimizations
● Global dictionary lookup offsets all calculated at compile time
– No runtime lookups!
● Jump tables calculated at compile time with labels and gotos in the compiled code!
– No runtime jump tables
● No middle-ish crap to manage, like parsing
● No need to pseudo-compile .pyc or manage the JIT.
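The first bullet, resolving global lookups to fixed offsets at compile time, can be sketched in a few lines of Python. This is only an illustration of the general technique, not gccpy's implementation: the tuple-based instruction format and the names `assign_slots`, `compile_program` and `run` are assumptions made up for the example.

```python
def assign_slots(names):
    """'Compile time': give each global name a fixed slot index, in first-seen order."""
    slots = {}
    for name in names:
        slots.setdefault(name, len(slots))
    return slots


def compile_program(program):
    """Rewrite instructions so they carry slot indices instead of names.

    program: list of ("store", name, value) or ("load", name) tuples
    (a hypothetical instruction format for this sketch).
    """
    slots = assign_slots(instr[1] for instr in program)
    compiled = []
    for instr in program:
        op, name, *rest = instr
        compiled.append((op, slots[name], *rest))
    return compiled, len(slots)


def run(compiled, slot_count):
    """'Runtime': plain list indexing, no name hashing or dictionary probing."""
    globals_table = [None] * slot_count
    loaded = []
    for op, slot, *rest in compiled:
        if op == "store":
            globals_table[slot] = rest[0]
        elif op == "load":
            loaded.append(globals_table[slot])
    return loaded


program = [
    ("store", "x", 1),
    ("store", "y", 2),
    ("load", "y"),
    ("store", "x", 5),
    ("load", "x"),
]
compiled, slot_count = compile_program(program)
print(compiled)
print(run(compiled, slot_count))
```

The dictionary exists only inside `compile_program`; by the time `run` executes, every access is a direct index, which is the point of the "No runtime lookups!" claim.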

Slide 20

Slide 20 text

Project Status
● First milestone (lucy) accomplished
– All basic core principles in place
– List/dict/integer/string: all types need their API filled out more; the community has been helping with that
– Stdlib needs to be done (huge task); going to implement enough to bootstrap the rest of the CPython stdlib.
● __builtin__, sys, etc...

Slide 21

Slide 21 text

Major Projects
● Module compilation with __init__.py (should modules be .so or .a or both?)
● Threading and multi-core with no GIL (should only affect the gvs (gccpy global virtual stack))
● Need a garbage collector
– boehm-gc, used in gcj (GNU Java) and objc/objc++, is meant to be slow but will do in the meantime, I think

Slide 22

Slide 22 text

Thanks!
Twitter @redzor
[email protected]
http://redbrain.co.uk
https://github.com/redbrain/gccpy
http://gcc.gnu.org/wiki/PythonFrontEnd
Questions?!