2017 - Alex Becker - Dangers of Django

PyBay
August 13, 2017

While convenient, Django's ORM has many pitfalls for the unwary, which can lead to data integrity bugs:
- autocommit by default
- get_or_create/update_or_create not always atomic
- invalid values are silently coerced to None
- validation is enforced inconsistently

This talk shows brief examples of bugs due to each.

Transcript

  1. Autocommit by default

    Queries are committed as soon as they are executed. Suppose we implement a toy service for tracking API limits:

        def increment_calls(request):
            key = request.GET['api_key']
            counter = CallCounter.objects.get(key=key)
            counter.calls += 1
            counter.save()
            return HttpResponse(status=200)

    What happens if another view modifies the same counter between lines 3 and 5?
  2. Not only will any changes they make to counter.calls get overwritten, every other field on the instance will be reset as well, because save() writes all fields by default.

    Solution: set ATOMIC_REQUESTS = True.
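The lost update described in slides 1 and 2 can be reproduced without Django. The following is a minimal sketch using the standard-library sqlite3 module; the `call_counter` table, its `plan` column, and the `load`/`save` helpers are hypothetical stand-ins for the model and ORM calls, not code from the talk.

```python
import sqlite3

# Hypothetical stand-in for the CallCounter table; not the talk's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_counter (key TEXT PRIMARY KEY, calls INTEGER, plan TEXT)"
)
conn.execute("INSERT INTO call_counter VALUES ('abc', 0, 'free')")
conn.commit()


def load(key):
    """Read the whole row into memory, roughly as an ORM get() would."""
    calls, plan = conn.execute(
        "SELECT calls, plan FROM call_counter WHERE key = ?", (key,)
    ).fetchone()
    return {"key": key, "calls": calls, "plan": plan}


def save(row):
    """Write every column back, roughly as a default ORM save() would."""
    conn.execute(
        "UPDATE call_counter SET calls = ?, plan = ? WHERE key = ?",
        (row["calls"], row["plan"], row["key"]),
    )
    conn.commit()


# Request A reads the counter (line 3 of the view).
a = load("abc")

# Before A saves, request B increments the counter and changes another field.
b = load("abc")
b["calls"] += 1
b["plan"] = "pro"
save(b)

# Request A increments its stale copy and saves it (lines 4 and 5).
a["calls"] += 1
save(a)

final = load("abc")
print(final)
```

Two increments were attempted, yet the stored count ends at 1, and the `plan` change made by request B is gone because request A wrote back its stale copy of every column.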
  3. get_or_create is not atomic

    Suppose we create CallCounters on demand:

        def increment_calls(request):
            key = request.GET['api_key']
            # get_or_create returns an (object, created) tuple
            counter, created = CallCounter.objects.get_or_create(key=key)
            counter.calls += 1
            counter.save()
            return HttpResponse(status=200)

    What happens if two requests are made to increment_calls at the same time?
  4. Let's consult the get_or_create docs:

    "This method is atomic assuming correct usage, correct database configuration, and correct behavior of the underlying database."

    Translation: get_or_create relies entirely on the database to prevent duplicates.

    Solution: only use get_or_create when uniqueness is enforced via database constraints.
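To illustrate why the database constraint matters, here is a small sketch in plain sqlite3 rather than Django. It interleaves two lookups before either insert, which is the race from slide 3, once without and once with a UNIQUE constraint. The `counters` table and the `racing_inserts` and `make_db` helpers are assumptions made for this example.

```python
import sqlite3


def make_db(unique):
    """Create an in-memory table, optionally with a UNIQUE constraint on key."""
    conn = sqlite3.connect(":memory:")
    constraint = "UNIQUE" if unique else ""
    conn.execute(f"CREATE TABLE counters (key TEXT {constraint}, calls INTEGER)")
    return conn


def racing_inserts(conn, key):
    """Simulate two requests that both look up `key` before either inserts."""
    lookup = "SELECT 1 FROM counters WHERE key = ?"
    insert = "INSERT INTO counters (key, calls) VALUES (?, 0)"

    # Both requests run the "get" step and find nothing.
    a_missing = conn.execute(lookup, (key,)).fetchone() is None
    b_missing = conn.execute(lookup, (key,)).fetchone() is None

    # Both requests then run the "create" step.
    if a_missing:
        conn.execute(insert, (key,))
    if b_missing:
        try:
            conn.execute(insert, (key,))
        except sqlite3.IntegrityError:
            pass  # the database rejected the duplicate; the existing row is used

    count = conn.execute(
        "SELECT COUNT(*) FROM counters WHERE key = ?", (key,)
    ).fetchone()[0]
    return count


without_constraint = racing_inserts(make_db(unique=False), "abc")
with_constraint = racing_inserts(make_db(unique=True), "abc")
print(without_constraint, with_constraint)
```

Without the constraint the table ends up with two rows for the same key; with it, the second insert fails and only one row exists.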
  5. Validation is inconsistently enforced

    Django offers lots of useful model validation:

        class User(Model):
            username = CharField(unique=True)
            state = CharField(choices=us_states)
            created_at = DateField()
  6. But this still works:

        user = User(
            username='alice',
            state='Canada',
            created_at='Stardate 7130.4',
        )
        user.save()

    Django expects model instances to be updated through forms, even though most projects do not do this.

    Solution: always use forms or call Model.full_clean(). Or add SQL CHECK constraints manually.
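As a rough illustration of the CHECK-constraint option, the sketch below uses sqlite3 directly. The `users` schema, the shortened two-state list, the GLOB date pattern, and the `try_insert` helper are all assumptions made for the example; a real project would write the equivalent constraints for its own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema; the list of states is shortened for illustration.
conn.execute(
    """
    CREATE TABLE users (
        username   TEXT UNIQUE NOT NULL,
        state      TEXT NOT NULL CHECK (state IN ('CA', 'NY')),
        created_at TEXT NOT NULL
            CHECK (created_at GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')
    )
    """
)


def try_insert(username, state, created_at):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO users VALUES (?, ?, ?)", (username, state, created_at)
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert("bob", "CA", "2017-08-13")
rejected = try_insert("alice", "Canada", "Stardate 7130.4")
print(ok, rejected)
```

With the constraints in place the database itself refuses the invalid row, regardless of which code path tried to write it.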
  7. Bonus: invalid values coerced to None

    The value in the database is 'Stardate 7130.4', but Django will coerce it to None:

        > User.objects.filter(created_at__isnull=True)
        []
        > User.objects.get(username='alice').created_at is None
        True
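One way such a mismatch can arise is sketched below, assuming a SQLite database (the talk does not say which backend was used). SQLite will store arbitrary text in a DATE column, so the row is not NULL at the SQL level, while a lenient conversion step on read yields None. The `lenient_parse_date` helper is hypothetical and only stands in for whatever conversion an ORM applies.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, created_at DATE)")
# SQLite accepts a non-date string in a DATE column.
conn.execute("INSERT INTO users VALUES ('alice', 'Stardate 7130.4')")


def lenient_parse_date(raw):
    """Hypothetical converter: return a date, or None if `raw` is not ISO formatted."""
    try:
        return date.fromisoformat(raw)
    except (TypeError, ValueError):
        return None


# The raw stored value is the original string.
stored = conn.execute(
    "SELECT created_at FROM users WHERE username = 'alice'"
).fetchone()[0]

# A NULL filter runs in SQL, so it does not match the row.
null_rows = conn.execute(
    "SELECT username FROM users WHERE created_at IS NULL"
).fetchall()

# The application-level value after conversion is None.
loaded = lenient_parse_date(stored)
print(repr(stored), null_rows, loaded)
```

The filter and the loaded attribute disagree because one is evaluated by the database and the other after conversion in application code.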