Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Good Schema Design and Why It Matters

Good Schema Design and Why It Matters

A talk I gave at DjangoCon Europe 2014.

Andrew Godwin

May 15, 2014
Tweet

More Decks by Andrew Godwin

Other Decks in Programming

Transcript

  1. Explicit Schema ID int Name text Weight uint 1 2

    3 Alice Bob Charles 76 84 65 Implicit Schema { "id": 342, "name": "David", "weight": 44, }
  2. Explicit Schema Normalised or semi normalised structure JOINs to retrieve

    related data Implicit Schema Embedded structure Related data retrieved naturally with object
  3. Silent Failure { "id": 342, "name": "David", "weight": 74, }

    { "id": 342, "name": "Ellie", "weight": "85kg", } { "id": 342, "nom": "Frankie", "weight": 77, } { "id": 342, "name": "Frankie", "weight": -67, }
  4. You can't put your database in a VCS You can

    put your schema in a VCS But your data won't always survive.
  5. The Nesting Problem { "id": 123, "name": "Andrew", "friends": [

    {"id": 456, "name": "David"}, {"id": 789, "name": "Mazz"}, ], "likes": [ {"id": 22, "liker": {"id": 789, "name", "Mazz"}}, ], }
  6. Schemaless Columns ID int Name text Weight uint Data json

    1 Alice 76 { "nickname": "Al", "bgcolor": "#ff0033" }
  7. Comparison (never trust benchmarks) Loading 1.2 million records PostgreSQL MongoDB

    76 sec 8 min Sequential scan PostgreSQL MongoDB 980 ms 980 ms Index scan (Postgres GINhash) PostgreSQL MongoDB 0.7 ms 1 ms
  8. General Advice Write heavy → Fewer indexes Read heavy →

    Denormalise Keep large data away from read/write heavy data Blob stores/filesystems are DBs too