Markus H
October 18, 2014
7.3k

# You Should(n't) Normalize Your Database

October 18, 2014

## Transcript

2. ### ABOUT ME Markus Holtermann M.Sc. student of Computer Science @

TU Berlin EuroPython 2014 local organizer Part of the server team ubuntuusers.de
3. ### HOW DO WE STORE OUR DATA? Files Relational Database Document

Store / NoSQL
4. ### FIRST APPROACH Name Home planet Gender Padmé Naboo Female Luke

Tatooine Male Leia Alderaan, Naboo Female Problem Multiple values in a single cell

6. ### REQUIREMENTS (1NF) Only atomic values Each row is unique i.e.

it has a primary key
7. ### EXAMPLE (1NF) PersonID Name Home planet Gender 1 Padmé Naboo

Female 2 Luke Tatooine Male 3 Leia Alderaan Female 3 Leia Naboo Female
8. ### PROBLEM (1NF) Table suffers from update anomalies PersonID Name Home

planet Gender 3 Leia Alderaan Female 3 Leia Naboo Male

10. ### REQUIREMENTS (2NF) 1NF All non-key attributes are fully functional dependent

on the primary key
11. ### EXAMPLE (2NF) PersonID Name Gender 1 Padmé Female 2 Luke

Male 3 Leia Female PersonID Name 1 Naboo 2 Tatooine 3 Alderaan 3 Naboo
12. ### PROBLEMS (2NF) Table suffers from insert anomalies PersonID Name 1

Naboo 2 Tatooine 3 Alderaan 3 Naboo ___ Dagobah
13. ### PROBLEMS (2NF) Table suffers from deletion anomalies PersonID Name Gender

1 Padmé Female 2 Luke Male 3 Leia Female PersonID Name 1 Naboo 2 Tatooine 3 Alderaan 3 Naboo

15. ### REQUIREMENTS (3NF) 2NF All attributes are dependent on the primary

key (and nothing else)
16. ### EXAMPLE (3NF) PersonID Name Gender 1 Padmé Female 2 Luke

Male 3 Leia Female PersonID PlanetID 1 10 2 11 3 10 3 12 PlanetID Name Water 10 Naboo 85% 11 Tatooine 1% 12 Alderaan 78% 13 Dagobah 88%
17. ### PROBLEMS (3NF) Normally no anomalies Solvable by Boyce–Codd Normal Form

(BCNF) If BCNF doesn't solve it: 4NF or 5NF

20. ### SPECIFICATION A Page with a name and unique slug and

text Unlimited number of revisions of the page text Latest revision of a page is determined by creation date
21. ### DATABASE SCHEMA (1) Page + PageID Name Slug Revision +

RevisionID PageID Text Version Date
22. ### DJANGO MODEL f r o m d j a n

g o . d b i m p o r t m o d e l s f r o m d j a n g o . u t i l s . f u n c t i o n a l i m p o r t c a c h e d _ p r o p e r t y f r o m d j a n g o . u t i l s . t i m e z o n e i m p o r t n o w c l a s s P a g e ( m o d e l s . M o d e l ) : n a m e = m o d e l s . C h a r F i e l d ( ' N a m e ' , m a x _ l e n g t h = 2 5 5 ) s l u g = m o d e l s . S l u g F i e l d ( ' S l u g ' , m a x _ l e n g t h = 2 5 5 , u n i q u e = T r u e ) @ c a c h e d _ p r o p e r t y d e f l a s t _ r e v ( s e l f ) : r e t u r n s e l f . r e v i s i o n _ s e t . f i r s t ( ) c l a s s R e v i s i o n ( m o d e l s . M o d e l ) : p a g e = m o d e l s . F o r e i g n K e y ( P a g e , o n _ d e l e t e = m o d e l s . P R O T E C T ) t e x t = m o d e l s . T e x t F i e l d ( ' T e x t ' ) v e r s i o n _ d a t e = m o d e l s . D a t e T i m e F i e l d ( ' D a t e ' , d e f a u l t = n o w ) c l a s s M e t a : o r d e r i n g = ( ' - v e r s i o n _ d a t e ' , )
23. ### DATABASE SCHEMA (2) Page + PageID Name Slug Last Revision

Revision + RevisionID PageID Text Version Date
24. ### DJANGO MODEL f r o m d j a n

g o . d b i m p o r t m o d e l s , t r a n s a c t i o n f r o m d j a n g o . u t i l s . t i m e z o n e i m p o r t n o w c l a s s P a g e ( m o d e l s . M o d e l ) : n a m e = m o d e l s . C h a r F i e l d ( ' N a m e ' , m a x _ l e n g t h = 2 5 5 ) s l u g = m o d e l s . S l u g F i e l d ( ' S l u g ' , m a x _ l e n g t h = 2 5 5 , u n i q u e = T r u e ) l a s t _ r e v = m o d e l s . F o r e i g n K e y ( ' R e v i s i o n ' , o n _ d e l e t e = m o d e l s . P R O T E C T , n u l l = T r u e , u n i q u e = T r u e , r e l a t e d _ n a m e = ' + ' ) c l a s s R e v i s i o n ( m o d e l s . M o d e l ) : p a g e = m o d e l s . F o r e i g n K e y ( P a g e , o n _ d e l e t e = m o d e l s . P R O T E C T ) t e x t = m o d e l s . T e x t F i e l d ( ' T e x t ' ) v e r s i o n _ d a t e = m o d e l s . D a t e T i m e F i e l d ( ' D a t e ' , d e f a u l t = n o w ) c l a s s M e t a : o r d e r i n g = ( ' - v e r s i o n _ d a t e ' , )

26. ### BENCHMARK DATA PostgreSQL 9.3.5 6,300 pages approx. 6 million revisions

in total 1 to 2,000 revisions per page

28. ### NORMALIZED SCHEMA f r o m d j a n

g o . s h o r t c u t s i m p o r t g e t _ o b j e c t _ o r _ 4 0 4 , r e n d e r f r o m n o r m a l i z e d . m o d e l s i m p o r t P a g e d e f s h o w ( r e q u e s t , s l u g ) : p a g e = g e t _ o b j e c t _ o r _ 4 0 4 ( P a g e , s l u g = s l u g ) r e v i s i o n = p a g e . l a s t _ r e v c o n t e x t = { ' p a g e ' : p a g e , ' r e v i s i o n ' : r e v i s i o n , } r e t u r n r e n d e r ( r e q u e s t , ' s h o w . h t m l ' , c o n t e x t )
29. ### DENORMALIZED SCHEMA f r o m d j a n

g o . s h o r t c u t s i m p o r t g e t _ o b j e c t _ o r _ 4 0 4 , r e n d e r f r o m d e n o r m a l i z e d . m o d e l s i m p o r t P a g e d e f s h o w ( r e q u e s t , s l u g ) : p a g e = g e t _ o b j e c t _ o r _ 4 0 4 ( P a g e , s l u g = s l u g ) r e v i s i o n = p a g e . l a s t _ r e v c o n t e x t = { ' p a g e ' : p a g e , ' r e v i s i o n ' : r e v i s i o n , } r e t u r n r e n d e r ( r e q u e s t , ' s h o w . h t m l ' , c o n t e x t )
30. ### RESULTS normalized denormalized Requests / minute 13,000 15,000 Concurrent connections

24 24 | 32 Quotient 541.67 625 | 468.75

EVERY PAGE
32. ### NORMALIZED SCHEMA f r o m d j a n

g o . s h o r t c u t s i m p o r t g e t _ o b j e c t _ o r _ 4 0 4 , r e n d e r f r o m d j a n g o . d b i m p o r t m o d e l s f r o m n o r m a l i z e d . m o d e l s i m p o r t P a g e d e f l i s t _ v i e w ( r e q u e s t ) : p a g e s = P a g e . o b j e c t s . a l l ( ) r e t u r n r e n d e r ( r e q u e s t , ' l i s t . h t m l ' , { ' p a g e s ' : p a g e s } ) 6,301 QUERIES! ONE PER PAGE!
33. ### NORMALIZED SCHEMA (SLIGHTLY OPTIMIZED) f r o m d j

a n g o . s h o r t c u t s i m p o r t g e t _ o b j e c t _ o r _ 4 0 4 , r e n d e r f r o m d j a n g o . d b i m p o r t m o d e l s f r o m n o r m a l i z e d . m o d e l s i m p o r t P a g e d e f l i s t _ v i e w ( r e q u e s t ) : p a g e s = P a g e . o b j e c t s . a l l ( ) . a n n o t a t e ( l a s t _ r e v _ v e r s i o n _ d a t e = m o d e l s . M a x ( ' r e v i s i o n _ _ v e r s i o n _ d a t e ' ) ) r e t u r n r e n d e r ( r e q u e s t , ' l i s t . h t m l ' , { ' p a g e s ' : p a g e s } )
34. ### DENORMALIZED SCHEMA f r o m d j a n

g o . s h o r t c u t s i m p o r t g e t _ o b j e c t _ o r _ 4 0 4 , r e n d e r f r o m d e n o r m a l i z e d . m o d e l s i m p o r t P a g e d e f l i s t _ v i e w ( r e q u e s t ) : p a g e s = P a g e . o b j e c t s . s e l e c t _ r e l a t e d ( ' l a s t _ r e v ' ) . a l l ( ) r e t u r n r e n d e r ( r e q u e s t , ' l i s t . h t m l ' , { ' p a g e s ' : p a g e s } )
35. ### RESULTS normalized denormalized Requests / minute 12 | 24 88

Concurrent connections 4 4 Quotient 3 | 6 22