Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Dubious Database Design

Andrew Godwin
September 07, 2015

Dubious Database Design

My talk from DjangoCon US 2015.

Andrew Godwin

September 07, 2015
Tweet

More Decks by Andrew Godwin

Other Decks in Programming

Transcript

  1. DUBIOUS
    Database
    DESIGN

    View Slide

  2. Andrew Godwin
    Hi, I'm
    Author of 1.7 Django & South migrations
    Senior Software Engineer at
    Only hates MySQL a little

    View Slide

  3. “Do this. Don't ask why.”

    View Slide

  4. Learning from failure.

    View Slide

  5. Spacelog
    1

    View Slide

  6. Spacelog
    1

    View Slide

  7. “Redis is fast!”

    View Slide

  8. View Slide

  9. Spacelog
    1

    View Slide

  10. Read-only, forever.

    View Slide

  11. “Redis is fast!”

    View Slide

  12. GET chapter-1
    GET chapter-2

    View Slide

  13. GET entry-123
    GET entry-124
    GET entry-125

    View Slide

  14. ZRANGEBYSCORE ....
    GET entry-123
    GET entry-124
    GET entry-125

    View Slide

  15. Request page
    Look up key range
    Multi-get key range
    Get speaker details

    View Slide

  16. SELECT ... JOIN ... WHERE ...

    View Slide

  17. Ignoring JOIN
    2

    View Slide

  18. “Joins are slow!”

    View Slide

  19. {"id": 11, "post": "abc", "author": 1}
    {"id": 12, "post": "def", "author": 2}
    {"id": 13, "post", "ghi", "author": 3}
    {"id": 1, "name": "Andrew"}
    {"id": 2, "name": "Brenda"}
    {"id": 3, "name": "Carol"}

    View Slide

  20. n
    number of authors
    m
    ×
    number of posts

    View Slide

  21. scan all posts
    build dict of author -> posts
    scan all authors and emit with posts

    View Slide

  22. HASH JOIN

    View Slide

  23. {
    "id": 11,
    "post":"abc",
    "author": {"name": "Andrew"}
    }

    View Slide

  24. {
    "id": 11,
    "post":"abc",
    "author": {
    "name": "Andrew",
    "last_seen": 120993013,
    }
    }

    View Slide

  25. The server's running, it's fine!
    3

    View Slide

  26. Write new save file
    Write new save file
    Delete old save file

    View Slide

  27. Write new save file
    Write new save file
    ?

    View Slide

  28. Tell payment processor to send
    Mark as processing
    Find unpaid clients
    Mark as paid

    View Slide

  29. Tell payment processor to send
    Mark as processing
    Find unpaid clients
    Mark as paid

    View Slide

  30. The Fastidious Modeller
    4

    View Slide

  31. TwitterUser
    FacebookUser
    LinkedInUser
    EmailUser

    View Slide

  32. SELECT ... FROM TwitterUser
    SELECT ... FROM EmailUser
    SELECT ... FROM LinkedInUser

    View Slide

  33. View Slide

  34. View Slide

  35. The database isn't magic.

    View Slide

  36. The Table Lover
    5

    View Slide

  37. “How do I make tables at runtime?”

    View Slide

  38. View Slide

  39. Tables/columns per language
    Tables/columns per customer
    Configurable CMS columns

    View Slide

  40. Columns per language
    300 - 400 language variants
    x
    A couple of translated cols per table
    x

    View Slide

  41. DDL is very expensive.

    View Slide

  42. Use JSON, hstore, or EAV-style table!

    View Slide

  43. 6
    The Cold Boot

    View Slide

  44. Decent cache hit rate
    Application servers mostly utilised

    View Slide

  45. Great engineering!

    View Slide

  46. What would happen if I deleted
    the entire cache?

    View Slide

  47. View Slide

  48. The Optimist
    7

    View Slide

  49. Sharded PostgreSQL
    ElasticSearch
    Riak
    Redis
    Flat files
    +
    +
    +
    +

    View Slide

  50. Don't forget redundancy.
    And backups.

    View Slide

  51. What happens if just one dies?

    View Slide

  52. n services = n points of failure

    View Slide

  53. The Primary Optimist
    8

    View Slide

  54. “The highest value PK
    is the most recent”

    View Slide

  55. “Autoincrement will work
    and scale forever”

    View Slide

  56. “IDs are numbers we
    can do maths on”

    View Slide

  57. The Function Lover
    9

    View Slide

  58. "Why waste time fetching columns
    and rendering them separately?"

    View Slide

  59. CREATE FUNCTION

    View Slide

  60. CREATE FUNCTION ... import jinja2

    View Slide

  61. bit.ly/whynotpg

    View Slide

  62. View Slide

  63. SELECT render(template, id)
    FROM pages
    WHERE %s ~ url;

    View Slide

  64. View Slide

  65. There's a reason behind
    every rule.

    View Slide

  66. Ask why, or try yourself.
    Don't write it off without context.

    View Slide

  67. Thanks.
    Andrew Godwin
    @andrewgodwin

    View Slide