Performant Django, Djangocon 2014 Edition

Performant Django Ara Anjargolian Co-Founder & CTO, YCharts

17 PERFORMANCE SECRETS DJANGONAUTS WON'T TELL YOU

Performance work can be divided into two distinct areas: front-end and back-end.
Handling them effectively requires very different approaches.

But first, a quick note about frontend performance
"80-90% of the end-user response time is spent on the frontend. Start there." -Steve Souders

Front-End Performance Work
•  Can be universally applied
•  Requires systems/tooling changes
•  Often has clear, system-independent best practices

Best Practice: Cache static assets forever (as long as they don’t change)
Why: Download assets as infrequently as possible
Solution: Already done! (As long as you use CachedStaticFilesStorage or CachedFilesMixin with your own storage)
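
Caching "forever" is safe when a file's served name changes whenever its content changes, which is what a hash in the filename gives you. A minimal sketch of that idea in plain Python; `hashed_name` is a hypothetical helper written for illustration, not Django's implementation:

```python
import hashlib
import posixpath


def hashed_name(path, content):
    """Return path with a short content hash inserted before the extension.

    Hypothetical helper for illustration only; not Django's code.
    """
    digest = hashlib.sha256(content).hexdigest()[:12]
    root, ext = posixpath.splitext(path)
    return "{}.{}{}".format(root, digest, ext)


# Same path, different content: the served name changes, so a far-future
# cache header on the old name can never serve stale content.
old = hashed_name("css/site.css", b"body { color: black; }")
new = hashed_name("css/site.css", b"body { color: navy; }")
```

Unchanged content always maps to the same name, so browsers keep their cached copy across deploys.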

Best Practice: Bundle/minify/compress static assets
Why: Reduce # of requests, download time
Solution: Use a static asset manager. Two good ones: django-pipeline, webassets.
Bonus points: Lower the number of requests by using data URIs for images (which pipeline supports)
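
For the data URI bonus point, this is roughly what inlining an image means. A sketch only: `to_data_uri` and the sample bytes are made up for illustration, and an asset manager would do this step for you at build time.

```python
import base64


def to_data_uri(data, mime_type):
    """Encode raw bytes as a base64 data URI that can be inlined in CSS/HTML."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:{};base64,{}".format(mime_type, encoded)


# A stand-in for real image bytes; in practice you would read a small icon file.
fake_png = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(fake_png, "image/png")

# The image now travels inside the stylesheet instead of costing a request.
css_rule = ".icon {{ background-image: url({}); }}".format(uri)
```

This pays off for small images only, since base64 makes the payload about a third larger.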

Best Practice: Serve static files via a CDN.
Why: Less latency
Solution: One good way: Use django-storages + the STATICFILES_STORAGE setting to store files in cloud file storage (e.g. S3) and point the CDN to it.
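
A sketch of what that settings change might look like. Everything here is an assumption to adapt: the bucket and CDN host are placeholders, the backend's dotted path differs between django-storages versions, and recent Django versions configure this through the STORAGES setting rather than STATICFILES_STORAGE.

```python
# settings.py fragment (a sketch; every value below is a placeholder)

# Store collected static files in S3 through django-storages.
# The dotted path differs between django-storages versions; check yours.
STATICFILES_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"

AWS_STORAGE_BUCKET_NAME = "example-static-bucket"  # hypothetical bucket
AWS_S3_CUSTOM_DOMAIN = "cdn.example.com"           # hypothetical CDN host

# Static URLs are then rendered against the CDN host, which pulls from the bucket.
STATIC_URL = "https://{}/".format(AWS_S3_CUSTOM_DOMAIN)
```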

Best Practice: Serve more stuff as static assets.
Why: Static assets can be served faster and more efficiently than dynamic assets.
Solution: Front-end templates, and static-y data structures that can be served as JSON. All that’s required are some custom management commands.
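
The core of such a management command can be very small. A sketch under assumptions: `export_static_json` and the sample data are hypothetical, and in a real project the function would be called from the `handle()` method of a custom command and write into a directory that your static file collection picks up.

```python
import json
import os
import tempfile


def export_static_json(data, out_dir, name):
    """Write data as compact JSON into out_dir and return the file path."""
    path = os.path.join(out_dir, name)
    with open(path, "w") as fh:
        json.dump(data, fh, separators=(",", ":"), sort_keys=True)
    return path


# Hypothetical "static-y" data: a list that only changes on deploy.
sectors = [{"id": 1, "name": "Energy"}, {"id": 2, "name": "Technology"}]
out_dir = tempfile.mkdtemp()
written = export_static_json(sectors, out_dir, "sectors.json")
```

The resulting file gets bundled, hashed and served from the CDN like any other static asset, with no view or database query involved.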

Back-End Performance Work
•  Can really only be done on a case-by-case basis.
•  Often only requires code changes.
•  Is very site- and situation-specific.

OK, I lied, there are some global backend performance to-dos.
•  Use cached sessions (django.contrib.sessions.backends.cache or django.contrib.sessions.backends.cached_db)
•  Use the cached template loader
•  If you’re starting a new project, or use a ton of heavyweight templates, consider using jinja2 as your template engine.
But on to the real stuff!
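
The first two to-dos are settings changes. A sketch, assuming a Django version that has the TEMPLATES setting (added in 1.8, after this talk); older versions set TEMPLATE_LOADERS instead, with the same nested loader structure.

```python
# settings.py fragment (a sketch; adjust to your Django version)

# Sessions: read from the cache, fall back to the database on a miss.
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"

# Templates: wrap the normal loaders in the cached loader so each template
# is compiled once per process instead of on every render.
TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [],
        "OPTIONS": {
            "loaders": [
                (
                    "django.template.loaders.cached.Loader",
                    [
                        "django.template.loaders.filesystem.Loader",
                        "django.template.loaders.app_directories.Loader",
                    ],
                ),
            ],
        },
    },
]
```

When "loaders" is set explicitly like this, leave APP_DIRS unset, since Django rejects the combination.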

OK, I lied, first a disclaimer
DO NOT try to “optimize” every view.
•  This is an utter waste of time, as there will be diminishing returns.
•  Optimizing on the backend often means adding complexity. And in a multi-programmer environment, complexity is expensive!

Backend performance work starts with a profile of the “problem” view
Use a profiler middleware! (A good one: https://gist.github.com/Miserlou/3649773)

What does a profile look like?

Understanding a profile
Things to look for:
•  Tons of time spent in SQL?
•  Functions being called way too many times
•  Functions taking longer than you would expect
•  Some combination of the two
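
To get a feel for reading one outside of a middleware, here is a sketch using the standard library's cProfile. The two functions are hypothetical stand-ins for a slow view and a helper it calls too often; this is not the code of the middleware linked above.

```python
import cProfile
import io
import pstats


def slow_helper(n):
    """Stand-in for a function that is cheap once but called far too often."""
    return sum(i * i for i in range(n))


def problem_view():
    """Stand-in for a slow view; the name is hypothetical."""
    return [slow_helper(200) for _ in range(500)]


profiler = cProfile.Profile()
profiler.enable()
problem_view()
profiler.disable()

# Sort by cumulative time and keep the top 10 rows.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

In the printed table, the ncalls column is where "called way too many times" shows up, and cumtime is where "taking longer than you would expect" shows up.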

What if the problem is SQL?
First use django-debug-toolbar or django-devserver to identify the problem queries. Is the issue one slow query? Too many queries?

SQL Tricks, Part 1
•  select_related(): Helps avoid extra queries to grab objects referenced by foreign keys/one-to-one relationships
•  values()/values_list(): Avoid Python object creation overhead when dicts/tuples are good enough
•  db_index=True: If you are looking up objects by a field that’s not a primary/foreign key and does not have a uniqueness constraint on it, you might need this
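
What select_related() saves you, shown at the SQL level. This sketch uses sqlite3 directly rather than the Django ORM so it runs on its own; the tables and the `run` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE quote (id INTEGER PRIMARY KEY, company_id INTEGER, price REAL);
    INSERT INTO company VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO quote VALUES (1, 1, 10.0), (2, 2, 20.0), (3, 1, 11.0);
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, the way a debug toolbar would."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Without select_related(): one query for the quotes, then one more per
# quote to fetch its company (the classic N+1 pattern).
query_count = 0
naive = []
for _id, company_id, price in run("SELECT id, company_id, price FROM quote ORDER BY id"):
    (name,) = run("SELECT name FROM company WHERE id = ?", (company_id,))[0]
    naive.append((name, price))
naive_queries = query_count

# With select_related(): the related rows arrive through a JOIN in one query.
query_count = 0
joined = run(
    "SELECT company.name, quote.price FROM quote "
    "JOIN company ON company.id = quote.company_id ORDER BY quote.id"
)
joined_queries = query_count
```

Both approaches return the same rows; the first needs one query plus one per quote, the second needs one in total.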

SQL Tricks, Part 2
•  prefetch_related(): Like select_related(), except the “join” is done in Python and thus works for M2M
•  only(): Only grab the fields in the model you need (USE WITH CAUTION!)
•  defer(): Get all fields except those stated in defer()
•  bulk_create(): When writing lots of rows to the same table
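
The “join in Python” idea behind prefetch_related() can be sketched without the ORM: one query for the main objects, one query for all related rows, then a grouping step in Python. The schema below is hypothetical and the code is an illustration of the technique, not Django's implementation.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE article_tags (article_id INTEGER, tag_id INTEGER);
    INSERT INTO article VALUES (1, 'Caching'), (2, 'Profiling');
    INSERT INTO tag VALUES (1, 'django'), (2, 'speed');
    INSERT INTO article_tags VALUES (1, 1), (1, 2), (2, 2);
""")

# Query 1: the main objects.
articles = conn.execute("SELECT id, title FROM article ORDER BY id").fetchall()

# Query 2: all related rows for those objects at once, using IN (...).
ids = [article_id for article_id, _title in articles]
placeholders = ",".join("?" for _ in ids)
rows = conn.execute(
    "SELECT article_tags.article_id, tag.label FROM article_tags "
    "JOIN tag ON tag.id = article_tags.tag_id "
    "WHERE article_tags.article_id IN ({}) "
    "ORDER BY tag.id".format(placeholders),
    ids,
).fetchall()

# The "join" in Python: group the related rows by their parent id.
tags_by_article = defaultdict(list)
for article_id, label in rows:
    tags_by_article[article_id].append(label)

result = {title: tags_by_article[article_id] for article_id, title in articles}
```

Two queries cover any number of articles, where a per-article lookup would cost one query each.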

What if the problem is SQL and none of the above helps?
•  raw(): Roll your own SQL that can perhaps use stuff specific to the DB, or fancier queries.
•  Denormalization: Fewer joins, precomputed data
•  NoSQL: Maybe the data you are storing in a relational database doesn’t map well to a relational database.

What if the problem is in the Python?
Common issues:
•  Algorithmic issues like n^2 paths that don’t need to be n^2
•  Doing extra work like constantly re-evaluating a loop invariant inside a loop
•  Marginally slow functions that didn’t seem like a problem until they were called 10k times in a request.
Basically: People doing bad stuff inside loops.
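
A small made-up example of the loop invariant case. The function names and the numbers are hypothetical; the point is only where the `sum(rates)` call sits.

```python
def total_with_tax_slow(prices, rates):
    """Recomputes the same combined rate on every iteration."""
    total = 0.0
    for price in prices:
        combined = sum(rates)  # loop invariant: never changes inside the loop
        total += price * (1 + combined)
    return total


def total_with_tax_fast(prices, rates):
    """Computes the combined rate once, before the loop."""
    combined = sum(rates)
    total = 0.0
    for price in prices:
        total += price * (1 + combined)
    return total


prices = [10.0, 20.0, 30.0]
rates = [0.05, 0.02]
slow_total = total_with_tax_slow(prices, rates)
fast_total = total_with_tax_fast(prices, rates)
```

Both return the same value; the second does the rate computation once instead of once per price.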

Warning: sometimes you have to get weird

Example 1
Decimal('0.1') used in a function that’s called a lot.
    >>> import timeit
    >>> timeit.timeit("Decimal('0.1')", "from decimal import Decimal", number=50000)
    0.33817005157470703
Solution:
    DECIMAL_POINT_ONE = Decimal('0.1')
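
A sketch for measuring the difference yourself. The two `scale_*` functions are hypothetical stand-ins for the hot function; timings vary by machine and Python version, so no expected numbers are shown.

```python
import timeit
from decimal import Decimal

DECIMAL_POINT_ONE = Decimal("0.1")  # built once, at import time


def scale_inline(value):
    return value * Decimal("0.1")  # parses the string on every call


def scale_constant(value):
    return value * DECIMAL_POINT_ONE  # reuses the module-level constant


amount = Decimal("42")
inline_seconds = timeit.timeit(lambda: scale_inline(amount), number=50000)
constant_seconds = timeit.timeit(lambda: scale_constant(amount), number=50000)
print("inline:   %.4fs" % inline_seconds)
print("constant: %.4fs" % constant_seconds)
```

Building the constant from the string '0.1' rather than the float 0.1 also keeps the value exact, since the float is only a binary approximation of one tenth.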

Example 2
reverse() is slow. When it’s called a lot it becomes a problem.
Solution:
    base_url = reverse('company', args=['SYM'])
    …
    item['url'] = base_url.replace('SYM', symbol)

What if you optimized your Python/SQL and you’re still slow?
Cache. Then cache some more.

Many types of caching
•  View cache
•  Template fragment cache
•  Function-level cache (via a package like django-cache-utils, django-cache-helper)
•  Query cache (django-cache-machine, django-cacheops)
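
To show the shape of function-level caching, here is a sketch with the standard library's functools.lru_cache. This is a swapped-in technique, not the API of the packages named above: it caches per process in memory, while those packages store results in Django's cache backend so they can be shared across processes. `expensive_lookup` is a hypothetical function.

```python
import functools

call_count = 0


@functools.lru_cache(maxsize=1024)
def expensive_lookup(symbol):
    """Stand-in for a slow computation; the name and body are hypothetical."""
    global call_count
    call_count += 1
    return symbol.lower() + "-report"


first = expensive_lookup("SYM")
second = expensive_lookup("SYM")  # served from the cache, body not re-run
```

Whatever the layer, the hard part is the same: deciding when a cached result is stale and must be dropped.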

The End
Questions? @ara818 [email protected] http://github.com/ara818
Like solving complex performance problems? YCharts is hiring!
