Slide 1

Slide 1 text

Code Generation in Python: Dismantling Jinja

Slide 2

Slide 2 text

Discuss this presentation, give feedback: bit.ly/codegeneration

Slide 3

Slide 3 text

Code Generation?

Slide 4

Slide 4 text

eval is evil. Or is it?

Slide 5

Slide 5 text

Why is eval evil?
Security & Performance

Slide 6

Slide 6 text

Security
Code injection
Namespace pollution

Slide 7

Slide 7 text

Performance
No bytecode
Code makes code that code runs

Slide 8

Slide 8 text

So: Why?
No suitable alternatives

Slide 9

Slide 9 text

Because of this: use responsibly

Slide 10

Slide 10 text

Eval 101

Slide 11

Slide 11 text

Compile
>>> code = compile('a = 1 + 2', '<string>', 'exec')
>>> code
<code object <module> at 0x1004d5120, file "<string>", line 1>

Slide 12

Slide 12 text

Eval
>>> ns = {}
>>> exec code in ns
>>> ns['a']
3

Slide 13

Slide 13 text

AST #1
>>> import ast
>>> ast.parse('a = 1 + 2')
<_ast.Module object at 0x1004fd250>
>>> code = compile(_, '<string>', 'exec')

Slide 14

Slide 14 text

AST #2
>>> n = ast.Module([
...     ast.Assign([ast.Name('a', ast.Store())],
...                ast.BinOp(ast.Num(1), ast.Add(),
...                          ast.Num(2)))])
>>> ast.fix_missing_locations(n)
>>> code = compile(n, '<string>', 'exec')
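The slide builds the tree with the Python 2 era node classes. A minimal sketch of the same idea on a current Python 3 follows; it assumes Python 3.8 or newer, where `ast.Constant` replaces `ast.Num` and `ast.Module` needs a `type_ignores` list. The filename `"<ast>"` and the variable names are illustrative only.

```python
import ast

# Build the tree for "a = 1 + 2" by hand instead of parsing a string.
tree = ast.Module(
    body=[
        ast.Assign(
            targets=[ast.Name(id="a", ctx=ast.Store())],
            value=ast.BinOp(
                left=ast.Constant(value=1),
                op=ast.Add(),
                right=ast.Constant(value=2),
            ),
        )
    ],
    type_ignores=[],
)

# Hand-built nodes carry no line/column information; the compiler needs it.
ast.fix_missing_locations(tree)

code = compile(tree, "<ast>", "exec")

# Execute in an explicit namespace rather than the current globals.
namespace = {}
exec(code, namespace)
value_of_a = namespace["a"]
```

No source string is involved at any point, so there is nothing to inject into; the tree is the program.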

Slide 15

Slide 15 text

Recap
No strings passed to eval()/exec
Explicit compilation to bytecode
Execution in explicit namespace
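The three recap points can be sketched together in Python 3 syntax (the slides use the Python 2 `exec code in ns` statement). The source string, the filename `"<generated>"` and the helper `run` are assumptions made for illustration.

```python
GENERATED_SOURCE = "total = sum(values)\n"

# Explicit compilation: done once, the code object is reused afterwards.
code = compile(GENERATED_SOURCE, "<generated>", "exec")


def run(values):
    """Execute the precompiled code in a fresh, explicit namespace."""
    namespace = {"values": values}
    exec(code, namespace)  # a code object is passed, not a string
    return namespace["total"]


first = run([1, 2, 3])
second = run([10, 20])
```

Each call gets its own dictionary, so names assigned by the generated code never leak into the caller's globals or into the next run.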

Slide 16

Slide 16 text

Template Engine Architecture

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Overview
2nd iteration
Generates Python code
Python semantics
Different scoping

Slide 19

Slide 19 text

Pipeline
Lexer → Parser → Identifier Analyzer → Code Generator → Python Source → Bytecode → Runtime
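To make the pipeline concrete, here is a toy version that supports only `{{ name }}` expressions. It is a sketch under stated assumptions, not Jinja2's implementation: the regular expression, the function names and the dictionary used as context are all hypothetical, and the parser and identifier analyzer are collapsed into a few lines.

```python
import re
from html import escape

# Toy syntax: only {{ name }} expressions are recognised.
VAR_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def tokenize(source):
    """Lexer: split the template into ('data', text) and ('var', name) tokens."""
    pos = 0
    for match in VAR_RE.finditer(source):
        if match.start() > pos:
            yield ("data", source[pos:match.start()])
        yield ("var", match.group(1))
        pos = match.end()
    if pos < len(source):
        yield ("data", source[pos:])


def generate_source(template):
    """Code generator: emit Python source for a generator function `root`."""
    tokens = list(tokenize(template))
    # "Identifier analysis": collect every name the template reads.
    names = sorted({value for kind, value in tokens if kind == "var"})
    lines = ["def root(context):"]
    for name in names:
        lines.append(f"    l_{name} = context[{name!r}]")
    # Keeps `root` a generator even when the template is empty.
    lines.append("    if 0: yield None")
    for kind, value in tokens:
        if kind == "data":
            lines.append(f"    yield {value!r}")
        else:
            lines.append(f"    yield escape(str(l_{value}))")
    return "\n".join(lines) + "\n"


def compile_template(template):
    """Python source -> bytecode -> callable living in an explicit namespace."""
    code = compile(generate_source(template), "<template>", "exec")
    namespace = {"escape": escape}
    exec(code, namespace)
    return namespace["root"]


root = compile_template("Hello {{ name }}!")
rendered = "".join(root({"name": "<World>"}))
```

Printing `generate_source("Hello {{ name }}!")` shows the intermediate Python source, which is the stage a debugger or traceback rewriting would work with.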

Slide 20

Slide 20 text

Complexities
Different scoping
WSGI & generating
Debug-ability
Restricted environments

Slide 21

Slide 21 text

Input
<ul>
{% for item in seq %}
  <li>{{ item }}</li>
{% endfor %}
</ul>

Slide 22

Slide 22 text

Behavior
print "<ul>"
for each item in the variable seq
    push the scope
    print "<li>"
    print the value of item and escape it as necessary
    print "</li>"
    pop the scope
print "</ul>"

Slide 23

Slide 23 text

Naive:
write(u'<ul>')
for _tmp in context['seq']:
    context.push({'item': _tmp})
    write(u'<li>')
    write(autoescape(context['item']))
    write(u'</li>')
    context.pop()
write(u'</ul>')

Slide 24

Slide 24 text

Actual:
l_seq = context.resolve('seq')
write(u'<ul>')
for l_item in l_seq:
    write(u'<li>')
    write(autoescape(l_item))
    write(u'</li>')
write(u'</ul>')
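The difference between the two strategies can be tried out with a small stand-in context. The `Context` class below is a hypothetical simplification written for this sketch (it is not Jinja2's class), and `html.escape` stands in for `autoescape`.

```python
from html import escape


class Context:
    """Minimal stack of dictionaries; the innermost layer wins on lookup."""

    def __init__(self, data):
        self._layers = [dict(data)]

    def push(self, layer):
        self._layers.append(dict(layer))

    def pop(self):
        self._layers.pop()

    def resolve(self, name):
        for layer in reversed(self._layers):
            if name in layer:
                return layer[name]
        raise KeyError(name)

    __getitem__ = resolve


def render_naive(context):
    """Emulates template scoping at runtime with push/pop on every iteration."""
    out = []
    write = out.append
    write("<ul>")
    for _tmp in context["seq"]:
        context.push({"item": _tmp})
        write("<li>")
        write(escape(str(context["item"])))
        write("</li>")
        context.pop()
    write("</ul>")
    return "".join(out)


def render_actual(context):
    """Resolves names once and lets Python locals carry the loop variable."""
    out = []
    write = out.append
    l_seq = context.resolve("seq")
    write("<ul>")
    for l_item in l_seq:
        write("<li>")
        write(escape(str(l_item)))
        write("</li>")
    write("</ul>")
    return "".join(out)


data = {"seq": ["a", "<b>"]}
naive_output = render_naive(Context(data))
actual_output = render_actual(Context(data))
```

Both functions produce the same markup; the second one does a single context lookup and no stack manipulation inside the loop, because the compiler already knows that `item` is a loop-local name.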

Slide 25

Slide 25 text

?

Slide 26

Slide 26 text

Introduction to Compilation

Slide 27

Slide 27 text

The Art of Code Generation
Low Level versus High Level

Slide 28

Slide 28 text

Low Level Code Generation
a = 1 + 2

  2           0 LOAD_CONST               1 (1)
              3 LOAD_CONST               2 (2)
              6 BINARY_ADD
              7 STORE_FAST               0 (a)

Slide 29

Slide 29 text

High Level Code Generation
a = 1 + 2

Assign(targets=[Name(id='a', ctx=Store())],
       value=BinOp(left=Num(n=1),
                   op=Add(),
                   right=Num(n=2)))

Slide 30

Slide 30 text

Building Blocks
Bytecode
Abstract Syntax Trees
Source code

Slide 31

Slide 31 text

Bytecode
Undocumented
Does not work on GAE
Implementation specific

Slide 32

Slide 32 text

AST
More limited
Easier to debug
Does not segfault the interpreter

Slide 33

Slide 33 text

Source
Works always
Very limited
Hard to debug without hacks

Slide 34

Slide 34 text

The Tale of Two Pieces of Code
(very similar pieces of code)

Slide 35

Slide 35 text

Fast
def foo():
    a = 0
    for x in xrange(100):
        a += x
    print a
foo()

Slide 36

Slide 36 text

Slower
a = 0
for x in xrange(100):
    a += x
print a

Slide 37

Slide 37 text

?

Slide 38

Slide 38 text

Slower
  2           0 LOAD_CONST               0 (0)
              3 STORE_NAME               0 (a)

  3           6 SETUP_LOOP              30 (to 39)
              9 LOAD_NAME                1 (xrange)
             12 LOAD_CONST               1 (100)
             15 CALL_FUNCTION            1
             18 GET_ITER
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_NAME               2 (x)

  4          25 LOAD_NAME                0 (a)
             28 LOAD_NAME                2 (x)
             31 INPLACE_ADD
             32 STORE_NAME               0 (a)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK

  5     >>   39 LOAD_NAME                0 (a)
             42 PRINT_ITEM
             43 PRINT_NEWLINE

Slide 39

Slide 39 text

Fast
  2           0 LOAD_CONST               1 (0)
              3 STORE_FAST               0 (a)

  3           6 SETUP_LOOP              30 (to 39)
              9 LOAD_GLOBAL              0 (xrange)
             12 LOAD_CONST               2 (100)
             15 CALL_FUNCTION            1
             18 GET_ITER
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               1 (x)

  4          25 LOAD_FAST                0 (a)
             28 LOAD_FAST                1 (x)
             31 INPLACE_ADD
             32 STORE_FAST               0 (a)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK

  5     >>   39 LOAD_FAST                0 (a)
             42 PRINT_ITEM
             43 PRINT_NEWLINE
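The same comparison can be reproduced on a current interpreter with the `dis` module. This sketch assumes CPython 3, so `range` replaces `xrange` and the exact opcode names differ from the Python 2 listings above; for that reason the code only looks for the `NAME` and `FAST` fragments in opcode names rather than asserting a precise instruction sequence.

```python
import dis

# The "slower" variant: the same loop compiled as module-level code.
MODULE_SOURCE = "a = 0\nfor x in range(100):\n    a += x\n"


def summed():
    """The "fast" variant: the loop lives inside a function body."""
    a = 0
    for x in range(100):
        a += x
    return a


module_code = compile(MODULE_SOURCE, "<module-level>", "exec")

module_ops = {ins.opname for ins in dis.get_instructions(module_code)}
function_ops = {ins.opname for ins in dis.get_instructions(summed)}

# Module-level variables go through dictionary lookups by name ...
module_uses_names = any("_NAME" in op for op in module_ops)
# ... while function locals live in indexed slots on the frame.
function_uses_fast = any("FAST" in op for op in function_ops)
```

Calling `dis.dis(module_code)` and `dis.dis(summed)` prints the full listings for side-by-side reading.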

Slide 41

Slide 41 text

Example
>>> def foo():
...     a = 42
...     locals()['a'] = 23
...     return a
...
>>> foo()
42

Slide 42

Slide 42 text

A Story About Semantics

Slide 43

Slide 43 text

Remember
print "<ul>"
for each item in the variable seq
    push the scope
    print "<li>"
    print the value of item and escape it as necessary
    print "</li>"
    pop the scope
print "</ul>"

Slide 44

Slide 44 text

That's not how Python works … so how do you generate code for it?

Slide 45

Slide 45 text

Tracking
Keep track of identifiers
Emulate desired semantics

Slide 46

Slide 46 text

Scopes
Context in Jinja2 is a data source
Context in Django is a data store

Slide 47

Slide 47 text

Source
<ul>
{% for item in seq %}
  {% include "item.html" %}
{% endfor %}
</ul>

Slide 48

Slide 48 text

Code
l_seq = context.resolve('seq')
write(u'<ul>')
for l_item in l_seq:
    t1 = env.get_template('item.html')
    for event in yield_from(t1, context, {'item': l_item}):
        yield event
write(u'</ul>')

Slide 49

Slide 49 text

What happens in the include … stays in the include
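One way to picture this isolation is a child scope layered over the parent's data. The sketch below uses `collections.ChainMap` purely as an illustration; it is an assumption of this example and not how Jinja2 implements includes, and `render_include` is a hypothetical helper.

```python
from collections import ChainMap


def render_include(parent_context, extra):
    """Render the included template with its own layer of variables."""
    child = parent_context.new_child(dict(extra))
    # Anything the include assigns lands in the child's layer only.
    child["leaked"] = True
    return "<li>%s</li>" % child["item"]


context = ChainMap({"seq": ["a", "b"]})
output = "".join(
    render_include(context, {"item": item}) for item in context["seq"]
)
```

After rendering, the parent context still contains only `seq`: neither the loop variable handed to the include nor the name assigned inside it is visible to the including template.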

Slide 50

Slide 50 text

Impossible
@contextfunction
def get_users_and_store(context, var='users'):
    context[var] = get_all_users()
    return u''

Slide 51

Slide 51 text

Practical Examples

Slide 52

Slide 52 text

Source

Slide 53

Slide 53 text

Generated
def root(context):
    l_sequence = context.resolve('sequence')
    yield u'\n'

Slide 54

Slide 54 text

Source

Slide 55

Slide 55 text

Generated
def root(context):
    l_sequence = context.resolve('sequence')
    yield u'\n'

Slide 56

Slide 56 text

Source

Item: {{ item }}

Slide 57

Slide 57 text

Generated
def root(context):
    l_item = context.resolve('item')
    l_sequence = context.resolve('sequence')
    yield u'\n\n

Item: '
    yield escape(l_item)

Slide 58

Slide 58 text

Source
{% extends "layout.html" %}
{% block body %}

Hello World!

{% endblock %}

Slide 59

Slide 59 text

Generated
def root(context):
    parent_template = environment.get_template('layout.html', None)
    for name, parent_block in parent_template.blocks.iteritems():
        context.blocks.setdefault(name, []).append(parent_block)
    for event in parent_template.root_render_func(context):
        yield event

def block_body(context):
    if 0: yield None
    yield u'\n

Hello World!

\n'

blocks = {'body': block_body}

Slide 60

Slide 60 text

Source
{% block body %}{% endblock %}

Slide 61

Slide 61 text

Generated
def root(context):
    yield u'\n'
    for event in context.blocks['body'][0](context):
        yield event

def block_body(context):
    if 0: yield None

blocks = {'body': block_body}

Slide 62

Slide 62 text

Source
{% extends "layout.html" %}
{% block title %}Hello | {{ super() }}{% endblock %}

Slide 63

Slide 63 text

Generated
def root(context):
    parent_template = environment.get_template('layout.html', None)
    for name, parent_block in parent_template.blocks.iteritems():
        context.blocks.setdefault(name, []).append(parent_block)
    for event in parent_template.root_render_func(context):
        yield event

def block_title(context):
    l_super = context.super('title', block_title)
    yield u'Hello | '
    yield escape(context.call(l_super))

blocks = {'title': block_title}
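The generated code above relies on a per-name list of block functions, with the child's version first and each parent's version appended behind it. A stripped-down sketch of that mechanism, in Python 3 and with a plain dictionary instead of a context object, could look like this; `call_super` and all function names are hypothetical stand-ins.

```python
def call_super(blocks, name, current):
    """Return the rendering of the next block function after `current`."""
    chain = blocks[name]
    return chain[chain.index(current) + 1](blocks)


def layout_title(blocks):
    """Block as defined in the parent template."""
    yield "My Site"


def layout_root(blocks):
    """Parent template body: always renders the first function in the chain."""
    yield "<title>"
    yield from blocks["title"][0](blocks)
    yield "</title>"


def child_title(blocks):
    """Block as overridden in the child, calling super() in the middle."""
    yield "Hello | "
    yield from call_super(blocks, "title", child_title)


def child_root(blocks):
    """Child template body: register parent blocks, then delegate to the parent."""
    blocks.setdefault("title", []).append(layout_title)
    yield from layout_root(blocks)


blocks = {"title": [child_title]}
page = "".join(child_root(blocks))
```

Because the parent only ever calls `blocks[name][0]`, overriding a block is just a matter of who gets into the list first, and `super()` is a lookup of the next entry.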

Slide 64

Slide 64 text

Why Does Jinja Do …

Slide 65

Slide 65 text

Why … manual code generation?
Originally the only option
AST compilation was new in 2.6
GAE traditionally did not allow it

Slide 66

Slide 66 text

Why … generators instead of buffer.append()?
Required for WSGI streaming
unless greenlets are in use
Downside: StopIteration :-(
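A short sketch of what the generator style buys: the same render function can either be joined into one string or handed to a server that sends chunks as they are produced. The function names and the dictionary context are assumptions for this example; a WSGI application would return the iterator from `stream` as its response body after encoding the chunks to bytes.

```python
from html import escape


def root(context):
    """Generated-style render function: yields chunks instead of appending."""
    yield "<ul>"
    for item in context["seq"]:
        yield "<li>"
        yield escape(str(item))
        yield "</li>"
    yield "</ul>"


def render(context):
    """Buffered use: collect everything, return one string."""
    return "".join(root(context))


def stream(context):
    """Streaming use: hand the lazy iterator to the consumer."""
    return root(context)


chunks = stream({"seq": ["a", "b"]})
first_chunk = next(chunks)  # nothing after "<ul>" has been computed yet
rest = "".join(chunks)
```

With `buffer.append()` the caller cannot see any output until the whole template has run, which rules out streaming unless the render call runs in its own greenlet.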

Slide 67

Slide 67 text

Why … map "var_x" to "l_var_x"?
Reversible for debugging purposes
Does not clash with internals
See templatetk for a better approach

Slide 68

Slide 68 text

How Does Jinja Do …

Slide 69

Slide 69 text

How … does automatic escaping work?
Markup object
Operator overloading
Compile-time and runtime

Slide 70

Slide 70 text

Const

{{ "Hello World!" }}

def root(context):
    yield u'

<strong>Hello World!</strong>

'

Slide 71

Slide 71 text

Runtime

{{ variable }}

def root(context):
    l_variable = context.resolve('variable')
    yield u'

%s

' % (
        escape(l_variable),
    )

Slide 72

Slide 72 text

Control #1
{% autoescape false %}

{{ variable }}

{% endautoescape %}

def root(context):
    l_variable = context.resolve('variable')
    t_1 = context.eval_ctx.save()
    context.eval_ctx.autoescape = False
    yield u'

%s

' % (
        l_variable,
    )
    context.eval_ctx.revert(t_1)

Slide 73

Slide 73 text

Control #2
{% autoescape flag %}

{{ variable }}

{% endautoescape %}

def root(context):
    l_variable = context.resolve('variable')
    l_flag = context.resolve('flag')
    t_1 = context.eval_ctx.save()
    context.eval_ctx.autoescape = l_flag
    yield u'%s%s%s' % (
        (context.eval_ctx.autoescape and escape or to_string)((context.eval_ctx.autoescape and Markup or identity)(u'

')),
        (context.eval_ctx.autoescape and escape or to_string)(l_variable),
        (context.eval_ctx.autoescape and escape or to_string)((context.eval_ctx.autoescape and Markup or identity)(u'

')),
    )
    context.eval_ctx.revert(t_1)

Slide 74

Slide 74 text

How … far does the Markup object go?
All operators are overloaded
All string operations are safe
Necessary due to operator support

Slide 75

Slide 75 text

Example
>>> from markupsafe import Markup
>>> Markup('%s') % '<insecure>'
Markup(u'&lt;insecure&gt;')
>>> Markup('') + '<insecure>' + Markup('')
Markup(u'&lt;insecure&gt;')
>>> Markup('Complex&nbsp;value').striptags()
u'Complex\xa0value'
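The operator overloading idea can be sketched with the standard library alone. `SafeText` below is a hypothetical, heavily simplified stand-in written for this example; it is not markupsafe's implementation, and it only covers `+` and positional `%` formatting.

```python
from html import escape as _html_escape


def escape(value):
    """Escape plain values; pass through anything already marked safe."""
    if hasattr(value, "__html__"):
        return SafeText(value.__html__())
    return SafeText(_html_escape(str(value), quote=True))


class SafeText(str):
    """A string marked as safe markup; mixing in plain strings escapes them."""

    def __html__(self):
        return self

    def __add__(self, other):
        return SafeText(str.__add__(self, escape(other)))

    def __radd__(self, other):
        return SafeText(str.__add__(escape(other), self))

    def __mod__(self, args):
        # Positional arguments only; mappings are out of scope for this sketch.
        if not isinstance(args, tuple):
            args = (args,)
        return SafeText(str.__mod__(self, tuple(escape(arg) for arg in args)))


formatted = SafeText("<em>%s</em>") % "<insecure>"
joined = SafeText("<em>") + "<insecure>" + SafeText("</em>")
reflected = "<b>" + SafeText("x")
```

The point of overloading every operator is that a safe string can never silently absorb an unescaped one: whichever side of the operation is plain text gets escaped, and the result is safe again.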

Slide 76

Slide 76 text

How … do undefined values work?
Configurable
Replaced by special object
By default one level of silence

Slide 77

Slide 77 text

Example
>>> from jinja2 import Undefined
>>> unicode(Undefined(name='missing_var'))
u''
>>> unicode(Undefined(name='missing_var').attribute)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UndefinedError: 'missing_var' is undefined
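"One level of silence" can be illustrated with a small stand-in object: printing it yields an empty string, but going one step further (attribute access) raises. The classes below are a hypothetical sketch in Python 3 and only mimic the behavior shown on the slide; they are not Jinja2's classes.

```python
class UndefinedError(Exception):
    """Raised when an undefined value is used beyond the silent level."""


class Undefined:
    """Placeholder for a missing variable."""

    def __init__(self, name):
        self._name = name

    # First level: rendering, truth testing and iteration stay silent.
    def __str__(self):
        return ""

    def __bool__(self):
        return False

    def __iter__(self):
        return iter(())

    # Second level: touching an attribute of something undefined is an error.
    def __getattr__(self, attribute):
        if attribute.startswith("_"):
            # Leave private and dunder lookups to normal Python behavior.
            raise AttributeError(attribute)
        raise UndefinedError(f"{self._name!r} is undefined")


missing = Undefined("missing_var")
rendered = str(missing)

try:
    missing.attribute
except UndefinedError as exc:
    message = str(exc)
```

Making this behavior configurable then comes down to swapping in a different class, for example one that raises already in `__str__`.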

Slide 78

Slide 78 text

Q&A

Slide 79

Slide 79 text

@mitsuhiko
http://lucumr.pocoo.org/

Slide 80

Slide 80 text

Oh hai. We're hiring http://fireteam.net/careers