Dubious Database Design

077e9a0cb34fa3eba2699240c9509717?s=47 Andrew Godwin
September 07, 2015

Dubious Database Design

My talk from DjangoCon US 2015.

077e9a0cb34fa3eba2699240c9509717?s=128

Andrew Godwin

September 07, 2015
Tweet

Transcript

  1. DUBIOUS Database DESIGN

  2. Andrew Godwin Hi, I'm Author of 1.7 Django & South

    migrations Senior Software Engineer at Only hates MySQL a little
  3. “Do this. Don't ask why.”

  4. Learning from failure.

  5. Spacelog 1

  6. Spacelog 1

  7. “Redis is fast!”

  8. None
  9. Spacelog 1

  10. Read-only, forever.

  11. “Redis is fast!”

  12. GET chapter-1 GET chapter-2

  13. GET entry-123 GET entry-124 GET entry-125

  14. ZRANGEBYSCORE .... GET entry-123 GET entry-124 GET entry-125

  15. Request page Look up key range Multi-get key range Get

    speaker details
  16. SELECT ... JOIN ... WHERE ...

  17. Ignoring JOIN 2

  18. “Joins are slow!”

  19. {"id": 11, "post": "abc", "author": 1} {"id": 12, "post": "def",

    "author": 2} {"id": 13, "post", "ghi", "author": 3} {"id": 1, "name": "Andrew"} {"id": 2, "name": "Brenda"} {"id": 3, "name": "Carol"}
  20. n number of authors m × number of posts

  21. scan all posts build dict of author -> posts scan

    all authors and emit with posts
  22. HASH JOIN

  23. { "id": 11, "post":"abc", "author": {"name": "Andrew"} }

  24. { "id": 11, "post":"abc", "author": { "name": "Andrew", "last_seen": 120993013,

    } }
  25. The server's running, it's fine! 3

  26. Write new save file Write new save file Delete old

    save file
  27. Write new save file Write new save file ?

  28. Tell payment processor to send Mark as processing Find unpaid

    clients Mark as paid
  29. Tell payment processor to send Mark as processing Find unpaid

    clients Mark as paid
  30. The Fastidious Modeller 4

  31. TwitterUser FacebookUser LinkedInUser EmailUser

  32. SELECT ... FROM TwitterUser SELECT ... FROM EmailUser SELECT ...

    FROM LinkedInUser
  33. None
  34. None
  35. The database isn't magic.

  36. The Table Lover 5

  37. “How do I make tables at runtime?”

  38. None
  39. Tables/columns per language Tables/columns per customer Configurable CMS columns

  40. Columns per language 300 - 400 language variants x A

    couple of translated cols per table x
  41. DDL is very expensive.

  42. Use JSON, hstore, or EAV-style table!

  43. 6 The Cold Boot

  44. Decent cache hit rate Application servers mostly utilised

  45. Great engineering!

  46. What would happen if I deleted the entire cache?

  47. None
  48. The Optimist 7

  49. Sharded PostgreSQL ElasticSearch Riak Redis Flat files + + +

    +
  50. Don't forget redundancy. And backups.

  51. What happens if just one dies?

  52. n services = n points of failure

  53. The Primary Optimist 8

  54. “The highest value PK is the most recent”

  55. “Autoincrement will work and scale forever”

  56. “IDs are numbers we can do maths on”

  57. The Function Lover 9

  58. "Why waste time fetching columns and rendering them separately?"

  59. CREATE FUNCTION

  60. CREATE FUNCTION ... import jinja2

  61. bit.ly/whynotpg

  62. None
  63. SELECT render(template, id) FROM pages WHERE %s ~ url;

  64. None
  65. There's a reason behind every rule.

  66. Ask why, or try yourself. Don't write it off without

    context.
  67. Thanks. Andrew Godwin @andrewgodwin