Upgrade to Pro — share decks privately, control downloads, hide ads and more …

django in the real world (PyConPK)

django in the real world (PyConPK)

Israel Fermín Montilla

December 16, 2017
Tweet

More Decks by Israel Fermín Montilla

Other Decks in Programming

Transcript

  1. django in the real world yes! it scales!... YAY! Israel

    Fermin Montilla Software Engineer @ dubizzle December 16, 2017
  2. from iferminm import more data Software Engineer @ dubizzle Venezuelan

    living in Dubai, UAE T: @iferminm blog: http://iffm.me
  3. What will we see in this talk? Pareto Principle The

    simple django project Measuring Common bottlenecks
  4. Basic concepts: Pareto principle The Pareto principle states that, for

    many events, roughly 80% of the effects come from 20% of the causes –Wikipedia
  5. Basic concepts: Pareto principle The Pareto principle states that, for

    many events, roughly 80% of the effects come from 20% of the causes –Wikipedia For example: 20% of the code produces 80% of the bugs.
  6. Reduce query counts 1 subs = S u b s

    c r i p t i o n . o b j e c t s . f i l t e r ( 2 u s e r i d=user . pk 3 ) 4 f o r s in subs : 5 packages . append ( s . package . name)
  7. Reduce query counts 1 subs = S u b s

    c r i p t i o n . o b j e c t s . f i l t e r ( 2 u s e r i d=user . pk 3 ) 4 f o r s in subs : 5 packages . append ( s . package . name) N hits to the database
  8. Reduce query counts 1 subs = S u b s

    c r i p t i o n . o b j e c t s . f i l t e r ( 2 u s e r i d=user . pk 3 ) 4 f o r s in subs : 5 packages . append ( s . package . name) N hits to the database 1 subs = S u b s c r i p t i o n . o b j e c t s . f i l t e r ( 2 u s e r i d=user . pk 3 ) . s e l e c t r e l a t e d ( ’ package ’ )
  9. Reduce query counts 1 subs = S u b s

    c r i p t i o n . o b j e c t s . f i l t e r ( 2 u s e r i d=user . pk 3 ) 4 f o r s in subs : 5 packages . append ( s . package . name) N hits to the database 1 subs = S u b s c r i p t i o n . o b j e c t s . f i l t e r ( 2 u s e r i d=user . pk 3 ) . s e l e c t r e l a t e d ( ’ package ’ ) Will join the table and return it in one hit
  10. Reduce query counts Use it wisely and measure 1 user

    = User . o b j e c t s . s e l e c t r e l a t e d ( 2 ’ sodas ’ 3 ) . get ( pk=request . data [ ’ u s e r i d ’ ] ) 4 5 # No a d d i t i o n a l query 6 user . sodas . a l l ()
  11. Reduce query counts Use it wisely and measure 1 user

    = User . o b j e c t s . s e l e c t r e l a t e d ( 2 ’ sodas ’ 3 ) . get ( pk=request . data [ ’ u s e r i d ’ ] ) 4 5 # No a d d i t i o n a l query 6 user . sodas . a l l () 1 # T r i g g e r s an a d d i t i o n a l query 2 user . sodas . f i l t e r (name=’ p e p s i ’ ) 3 4 # Sometimes i t ’ s b e t t e r to use the cached r e s u l t 5 # and f i l t e r in memory 6 [ s f o r s in user . sodas . a l l () i f s . name == ’ pe psi ’ ]
  12. Reduce query counts Use the Prefetch object! 1 # A

    product has many s u b s c r i p t i o n s and 2 # a s u b s c r i p t i o n can have many products 3 4 queryset = S u b s c r i p t i o n . o b j e c t s . f i l t e r ( 5 s t a t u s=e x p i r e d 6 ) . s e l e c t r e l a t e d ( ’ c r e d i t s ’ ) 7 8 p r e f e t c h = Prefetch ( ’ s u b s c r i p t i o n s ’ , 9 queryset=queryset ) 10 products = Product . o b j e c t s . p r e f e t c h r e l a t e d ( 11 p r e f e t c h 12 ) . f i l t e r ( s e c t i o n=’ jobs ’ )
  13. Reduce query time Indexing 1 c l a s s

    U s e r P r o f i l e ( models . Model ) : 2 user = models . ForeignKey ( ’ auth user ’ ) 3 dob = models . DateField ( db index=True ) 4 e x t e r n a l i d = models . I n t e g e r F i e l d ( 5 db index=True 6 )
  14. Reduce query time Indexing 1 c l a s s

    U s e r P r o f i l e ( models . Model ) : 2 user = models . ForeignKey ( ’ auth user ’ ) 3 dob = models . DateField ( db index=True ) 4 e x t e r n a l i d = models . I n t e g e r F i e l d ( 5 db index=True 6 ) Note: Your DBMS updates your indices in write time (INSERT and UPDATE)
  15. Some notes on indexing You need to measure before you

    do it. Run EXPLAIN on the query (Seq scan) Index by workload If you filter on multiple columns use index together Meta option Check if the index is used before you push it. Run EXPLAIN again
  16. Expensive JOINs Sometimes you might want to separate them into

    two different queries. 1 # You may want to see the c r e d i t spending behavior of your u s e r s 2 C r e d i t . o b j e c t s . f i l t e r ( 3 s u b s c r i p t i o n p k g t y p e=’ motors ’ 4 ) . s e l e c t r e l a t e d ( ’ r e s o u r c e ’ ) 5 6 # Sometimes two q u e r i e s might perform b e t t e r 7 s u b s i d s = S u b s c r i p t i o n . o b j e c t s . f i l t e r ( 8 pkg type=’ motors ’ 9 ) . v a l u e s l i s t ( ’ id ’ , f l a t=True ) 10 11 C r e d i t . o b j e c t s . f i l t e r ( 12 s u b s c r i p t i o n i d i n=s u b s i d s 13 ) . s e l e c t r e l a t e d ( ’ r e s o u r c e ’ )
  17. Avoid whole table COUNT() queries After some point, having exact

    numbers is not important 1 PropertyForRent . o b j e c t s . count ()
  18. Avoid whole table COUNT() queries After some point, having exact

    numbers is not important 1 PropertyForRent . o b j e c t s . count () You can instead do a raw SQL query 1 # Postgres 2 SELECT r e l t u p l e s FROM p g c l a s s 3 WHERE relname = ’ p r o p e r t y f o r r e n t ’ 4 5 # MySQL 6 SELECT t a b l e r o w s FROM information schema . t a b l e s 7 WHERE table schema = DATABASE() 8 AND table name = ’ p r o p e r t y f o r r e n t ’
  19. Avoid whole table COUNT() queries After some point, having exact

    numbers is not important 1 PropertyForRent . o b j e c t s . count () You can instead do a raw SQL query 1 # Postgres 2 SELECT r e l t u p l e s FROM p g c l a s s 3 WHERE relname = ’ p r o p e r t y f o r r e n t ’ 4 5 # MySQL 6 SELECT t a b l e r o w s FROM information schema . t a b l e s 7 WHERE table schema = DATABASE() 8 AND table name = ’ p r o p e r t y f o r r e n t ’ This could reduce up to 90% response time
  20. Use persistent connections 1 DATABASES = { 2 ’ d

    e f a u l t ’ : { 3 # The usual . . . 4 ’CONN MAX AGE ’ : None , 5 } 6 }
  21. Know your ORM Read the full ORM docs at least

    once Use F expressions to reference values within the queryset Use Q expressions for advanced filters Explore the aggregation framework Use values(), values list(), only() and defer() when the results are too big
  22. Denormalize Evaluate huge joins Don’t use Generic Relations Figure: Response

    time reduction after denormalizing a Generic Relation
  23. Russian Doll Caching 1 {% cache MIDDLE TTL ” ads

    ” request .GET. page %} 2 {% i n c l u d e ” s e c t i o n s / property / postheader . html” %} 3 <div c l a s s=”ads−l i s t ”> 4 {% f o r ad in ads %} 5 {% cache LONG TTL ” a d d e s c r i p t i o n ” ad id ad . l a s t u p d a t e d %} 6 {% i n c l u d e ” s e c t i o n s / property / a d t e a s e r . html” %} 7 {% endcache %} 8 {% endfor %} 9 {% endcache %}
  24. Further optimization Minimize your CSS and JS (django-compressor, webassets or

    django-pipeline) Optimize your static images Optimize user uploaded images Serve your media and static content from a CDN Do slow work later... (celery or python-rq) Use slave replicas for read operations (and database routers)