Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Dubious Database Design
Search
Andrew Godwin
September 07, 2015
Programming
0
250
Dubious Database Design
My talk from DjangoCon US 2015.
Andrew Godwin
September 07, 2015
Tweet
Share
More Decks by Andrew Godwin
See All by Andrew Godwin
Reconciling Everything
andrewgodwin
1
200
Django Through The Years
andrewgodwin
0
86
Writing Maintainable Software At Scale
andrewgodwin
0
330
A Newcomer's Guide To Airflow's Architecture
andrewgodwin
0
240
Async, Python, and the Future
andrewgodwin
2
540
How To Break Django: With Async
andrewgodwin
1
570
Taking Django's ORM Async
andrewgodwin
0
580
The Long Road To Asynchrony
andrewgodwin
0
510
The Scientist & The Engineer
andrewgodwin
1
570
Other Decks in Programming
See All in Programming
Prepare for Jakarta EE 11 - Performance and Developer Productivity
ivargrimstad
0
800
DMMプラットフォームがTiDB Cloudを採用した背景
pospome
8
4.1k
SIMD Parallel Programming with the Vector API
josepaumard
0
180
[技育CAMPアカデミア]アイディアを形に!【超入門】スマホアプリ開発〜リリースまでの流れをご紹介
teamlab
PRO
0
380
Elm 0.19.0 Changes
bkuhlmann
0
490
冗長なエラーログを削減し、スタックトレースを手に入れる / Reducing Verbose Error Logs and Obtaining Stack Traces
upamune
0
760
デフォルトにして至高、RubyMineの大好きな所
ruzia
0
390
見た目から始める生産性向上
ikumatadokoro
7
850
Tailwind CSSを本気でカスタマイズする方法
fsubal
13
5.3k
ゆるい個人開発のススメ
kuroppe1819
10
990
Git Lint
bkuhlmann
4
750
MicrosoftのPlatform Engineeringガイドを読んで実際になにかやってみた
ymd65536
1
340
Featured
See All Featured
How STYLIGHT went responsive
nonsquared
92
4.8k
YesSQL, Process and Tooling at Scale
rocio
164
13k
Fireside Chat
paigeccino
21
2.6k
Designing the Hi-DPI Web
ddemaree
276
33k
5 minutes of I Can Smell Your CMS
philhawksworth
199
19k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
60
14k
Happy Clients
brianwarren
92
6.4k
Principles of Awesome APIs and How to Build Them.
keavy
121
16k
Intergalactic Javascript Robots from Outer Space
tanoku
266
26k
[RailsConf 2023] Rails as a piece of cake
palkan
23
4k
From Idea to $5000 a Month in 5 Months
shpigford
377
45k
Thoughts on Productivity
jonyablonski
58
3.8k
Transcript
DUBIOUS Database DESIGN
Andrew Godwin Hi, I'm Author of 1.7 Django & South
migrations Senior Software Engineer at Only hates MySQL a little
“Do this. Don't ask why.”
Learning from failure.
Spacelog 1
Spacelog 1
“Redis is fast!”
None
Spacelog 1
Read-only, forever.
“Redis is fast!”
GET chapter-1 GET chapter-2
GET entry-123 GET entry-124 GET entry-125
ZRANGEBYSCORE .... GET entry-123 GET entry-124 GET entry-125
Request page Look up key range Multi-get key range Get
speaker details
SELECT ... JOIN ... WHERE ...
Ignoring JOIN 2
“Joins are slow!”
{"id": 11, "post": "abc", "author": 1} {"id": 12, "post": "def",
"author": 2} {"id": 13, "post", "ghi", "author": 3} {"id": 1, "name": "Andrew"} {"id": 2, "name": "Brenda"} {"id": 3, "name": "Carol"}
n number of authors m × number of posts
scan all posts build dict of author -> posts scan
all authors and emit with posts
HASH JOIN
{ "id": 11, "post":"abc", "author": {"name": "Andrew"} }
{ "id": 11, "post":"abc", "author": { "name": "Andrew", "last_seen": 120993013,
} }
The server's running, it's fine! 3
Write new save file Write new save file Delete old
save file
Write new save file Write new save file ?
Tell payment processor to send Mark as processing Find unpaid
clients Mark as paid
Tell payment processor to send Mark as processing Find unpaid
clients Mark as paid
The Fastidious Modeller 4
TwitterUser FacebookUser LinkedInUser EmailUser
SELECT ... FROM TwitterUser SELECT ... FROM EmailUser SELECT ...
FROM LinkedInUser
None
None
The database isn't magic.
The Table Lover 5
“How do I make tables at runtime?”
None
Tables/columns per language Tables/columns per customer Configurable CMS columns
Columns per language 300 - 400 language variants x A
couple of translated cols per table x
DDL is very expensive.
Use JSON, hstore, or EAV-style table!
6 The Cold Boot
Decent cache hit rate Application servers mostly utilised
Great engineering!
What would happen if I deleted the entire cache?
None
The Optimist 7
Sharded PostgreSQL ElasticSearch Riak Redis Flat files + + +
+
Don't forget redundancy. And backups.
What happens if just one dies?
n services = n points of failure
The Primary Optimist 8
“The highest value PK is the most recent”
“Autoincrement will work and scale forever”
“IDs are numbers we can do maths on”
The Function Lover 9
"Why waste time fetching columns and rendering them separately?"
CREATE FUNCTION
CREATE FUNCTION ... import jinja2
bit.ly/whynotpg
None
SELECT render(template, id) FROM pages WHERE %s ~ url;
None
There's a reason behind every rule.
Ask why, or try yourself. Don't write it off without
context.
Thanks. Andrew Godwin @andrewgodwin