Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Dubious Database Design
Search
Andrew Godwin
September 07, 2015
Programming
0
260
Dubious Database Design
My talk from DjangoCon US 2015.
Andrew Godwin
September 07, 2015
Tweet
Share
More Decks by Andrew Godwin
See All by Andrew Godwin
Reconciling Everything
andrewgodwin
1
240
Django Through The Years
andrewgodwin
0
140
Writing Maintainable Software At Scale
andrewgodwin
0
380
A Newcomer's Guide To Airflow's Architecture
andrewgodwin
0
290
Async, Python, and the Future
andrewgodwin
2
580
How To Break Django: With Async
andrewgodwin
1
630
Taking Django's ORM Async
andrewgodwin
0
650
The Long Road To Asynchrony
andrewgodwin
0
570
The Scientist & The Engineer
andrewgodwin
1
660
Other Decks in Programming
See All in Programming
Progressive Web Apps for Rails developers
siaw23
2
550
実践サーバーレスパフォーマンスチューニング ~その実力に迫る~ / Practical Serverless Performance Tuning ~A Close Look at its Power~
seike460
PRO
2
180
モジュラモノリス、その前に / Modular monolith, before that
euglena1215
8
720
sqlcを利用してsqlに型付けを
kamiyam
0
240
学生の時に開催したPerl入学式をきっかけにエンジニアが組織に馴染むために勉強会を主催や仲間と参加して職能間の境界を越えていく
ohmori_yusuke
2
140
ML-прайсинг_на_Lamoda__вошли_и_вышли__приключение_на_20_минут__Слава_Цыганков.pdf
lamodatech
0
220
Integrating AI in Your Enterprise Java Applications
ivargrimstad
0
330
tsconfig.jsonの最近の新機能 ファイルパス編
uhyo
7
1.7k
Memory API: Patterns, Use Cases, and Performance
josepaumard
1
180
pytest プラグインを開発して DRY に自動テストを書こう
inuatsu
2
260
Compose Multiplatform과 Ktor로 플랫폼의 경계를 넘어보자
kwakeuijin
0
280
Unlocking Python's Core Magic
leew
0
130
Featured
See All Featured
Put a Button on it: Removing Barriers to Going Fast.
kastner
58
3.5k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
25
660
Adopting Sorbet at Scale
ufuk
73
9k
Scaling GitHub
holman
458
140k
Unsuck your backbone
ammeep
668
57k
What's new in Ruby 2.0
geeforr
341
31k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
504
140k
Designing with Data
zakiwarfel
98
5.1k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
92
16k
Designing for humans not robots
tammielis
249
25k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
225
22k
Typedesign – Prime Four
hannesfritz
39
2.3k
Transcript
DUBIOUS Database DESIGN
Andrew Godwin Hi, I'm Author of 1.7 Django & South
migrations Senior Software Engineer at Only hates MySQL a little
“Do this. Don't ask why.”
Learning from failure.
Spacelog 1
Spacelog 1
“Redis is fast!”
None
Spacelog 1
Read-only, forever.
“Redis is fast!”
GET chapter-1 GET chapter-2
GET entry-123 GET entry-124 GET entry-125
ZRANGEBYSCORE .... GET entry-123 GET entry-124 GET entry-125
Request page Look up key range Multi-get key range Get
speaker details
SELECT ... JOIN ... WHERE ...
Ignoring JOIN 2
“Joins are slow!”
{"id": 11, "post": "abc", "author": 1} {"id": 12, "post": "def",
"author": 2} {"id": 13, "post", "ghi", "author": 3} {"id": 1, "name": "Andrew"} {"id": 2, "name": "Brenda"} {"id": 3, "name": "Carol"}
n number of authors m × number of posts
scan all posts build dict of author -> posts scan
all authors and emit with posts
HASH JOIN
{ "id": 11, "post":"abc", "author": {"name": "Andrew"} }
{ "id": 11, "post":"abc", "author": { "name": "Andrew", "last_seen": 120993013,
} }
The server's running, it's fine! 3
Write new save file Write new save file Delete old
save file
Write new save file Write new save file ?
Tell payment processor to send Mark as processing Find unpaid
clients Mark as paid
Tell payment processor to send Mark as processing Find unpaid
clients Mark as paid
The Fastidious Modeller 4
TwitterUser FacebookUser LinkedInUser EmailUser
SELECT ... FROM TwitterUser SELECT ... FROM EmailUser SELECT ...
FROM LinkedInUser
None
None
The database isn't magic.
The Table Lover 5
“How do I make tables at runtime?”
None
Tables/columns per language Tables/columns per customer Configurable CMS columns
Columns per language 300 - 400 language variants x A
couple of translated cols per table x
DDL is very expensive.
Use JSON, hstore, or EAV-style table!
6 The Cold Boot
Decent cache hit rate Application servers mostly utilised
Great engineering!
What would happen if I deleted the entire cache?
None
The Optimist 7
Sharded PostgreSQL ElasticSearch Riak Redis Flat files + + +
+
Don't forget redundancy. And backups.
What happens if just one dies?
n services = n points of failure
The Primary Optimist 8
“The highest value PK is the most recent”
“Autoincrement will work and scale forever”
“IDs are numbers we can do maths on”
The Function Lover 9
"Why waste time fetching columns and rendering them separately?"
CREATE FUNCTION
CREATE FUNCTION ... import jinja2
bit.ly/whynotpg
None
SELECT render(template, id) FROM pages WHERE %s ~ url;
None
There's a reason behind every rule.
Ask why, or try yourself. Don't write it off without
context.
Thanks. Andrew Godwin @andrewgodwin