Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Dubious Database Design

Andrew Godwin
September 07, 2015

Dubious Database Design

My talk from DjangoCon US 2015.

Andrew Godwin

September 07, 2015
Tweet

More Decks by Andrew Godwin

Other Decks in Programming

Transcript

  1. DUBIOUS
    Database
    DESIGN

    View full-size slide

  2. Andrew Godwin
    Hi, I'm
    Author of 1.7 Django & South migrations
    Senior Software Engineer at
    Only hates MySQL a little

    View full-size slide

  3. “Do this. Don't ask why.”

    View full-size slide

  4. Learning from failure.

    View full-size slide

  5. “Redis is fast!”

    View full-size slide

  6. Read-only, forever.

    View full-size slide

  7. “Redis is fast!”

    View full-size slide

  8. GET chapter-1
    GET chapter-2

    View full-size slide

  9. GET entry-123
    GET entry-124
    GET entry-125

    View full-size slide

  10. ZRANGEBYSCORE ....
    GET entry-123
    GET entry-124
    GET entry-125

    View full-size slide

  11. Request page
    Look up key range
    Multi-get key range
    Get speaker details

    View full-size slide

  12. SELECT ... JOIN ... WHERE ...

    View full-size slide

  13. Ignoring JOIN
    2

    View full-size slide

  14. “Joins are slow!”

    View full-size slide

  15. {"id": 11, "post": "abc", "author": 1}
    {"id": 12, "post": "def", "author": 2}
    {"id": 13, "post", "ghi", "author": 3}
    {"id": 1, "name": "Andrew"}
    {"id": 2, "name": "Brenda"}
    {"id": 3, "name": "Carol"}

    View full-size slide

  16. n
    number of authors
    m
    ×
    number of posts

    View full-size slide

  17. scan all posts
    build dict of author -> posts
    scan all authors and emit with posts

    View full-size slide

  18. {
    "id": 11,
    "post":"abc",
    "author": {"name": "Andrew"}
    }

    View full-size slide

  19. {
    "id": 11,
    "post":"abc",
    "author": {
    "name": "Andrew",
    "last_seen": 120993013,
    }
    }

    View full-size slide

  20. The server's running, it's fine!
    3

    View full-size slide

  21. Write new save file
    Write new save file
    Delete old save file

    View full-size slide

  22. Write new save file
    Write new save file
    ?

    View full-size slide

  23. Tell payment processor to send
    Mark as processing
    Find unpaid clients
    Mark as paid

    View full-size slide

  24. Tell payment processor to send
    Mark as processing
    Find unpaid clients
    Mark as paid

    View full-size slide

  25. The Fastidious Modeller
    4

    View full-size slide

  26. TwitterUser
    FacebookUser
    LinkedInUser
    EmailUser

    View full-size slide

  27. SELECT ... FROM TwitterUser
    SELECT ... FROM EmailUser
    SELECT ... FROM LinkedInUser

    View full-size slide

  28. The database isn't magic.

    View full-size slide

  29. The Table Lover
    5

    View full-size slide

  30. “How do I make tables at runtime?”

    View full-size slide

  31. Tables/columns per language
    Tables/columns per customer
    Configurable CMS columns

    View full-size slide

  32. Columns per language
    300 - 400 language variants
    x
    A couple of translated cols per table
    x

    View full-size slide

  33. DDL is very expensive.

    View full-size slide

  34. Use JSON, hstore, or EAV-style table!

    View full-size slide

  35. 6
    The Cold Boot

    View full-size slide

  36. Decent cache hit rate
    Application servers mostly utilised

    View full-size slide

  37. Great engineering!

    View full-size slide

  38. What would happen if I deleted
    the entire cache?

    View full-size slide

  39. The Optimist
    7

    View full-size slide

  40. Sharded PostgreSQL
    ElasticSearch
    Riak
    Redis
    Flat files
    +
    +
    +
    +

    View full-size slide

  41. Don't forget redundancy.
    And backups.

    View full-size slide

  42. What happens if just one dies?

    View full-size slide

  43. n services = n points of failure

    View full-size slide

  44. The Primary Optimist
    8

    View full-size slide

  45. “The highest value PK
    is the most recent”

    View full-size slide

  46. “Autoincrement will work
    and scale forever”

    View full-size slide

  47. “IDs are numbers we
    can do maths on”

    View full-size slide

  48. The Function Lover
    9

    View full-size slide

  49. "Why waste time fetching columns
    and rendering them separately?"

    View full-size slide

  50. CREATE FUNCTION

    View full-size slide

  51. CREATE FUNCTION ... import jinja2

    View full-size slide

  52. bit.ly/whynotpg

    View full-size slide

  53. SELECT render(template, id)
    FROM pages
    WHERE %s ~ url;

    View full-size slide

  54. There's a reason behind
    every rule.

    View full-size slide

  55. Ask why, or try yourself.
    Don't write it off without context.

    View full-size slide

  56. Thanks.
    Andrew Godwin
    @andrewgodwin

    View full-size slide