Small Data: Databases in the Real World
Andrew Godwin
August 04, 2014
A talk I gave at PyCon AU 2014.
Transcript
SMALL DATA: DATABASES IN THE REAL WORLD. Andrew Godwin, @andrewgodwin
Andrew Godwin Core Developer Senior Engineer
BIG DATA: What does it mean? What is 'big'?
1,000 rows? 1,000,000 rows? 1,000,000,000 rows? 1,000,000,000,000 rows?
Scalable designs are a tradeoff: NOW vs LATER
Small company? Agency? Focus on ease of change, not scalability.
You don't need to scale from day one. But always leave yourself scaling points.
Rapid development, Continuous deployment, Hardware choice, Scaling 'breakpoints'
Rapid development: It's all about schema change overhead.
Explicit Schema
ID (int)   Name (text)   Weight (uint)
1          Alice         76
2          Bob           84
3          Charles       65
Implicit Schema
{ "id": 342, "name": "David", "weight": 44, }
Silent Failure
{ "id": 342, "name": "David", "weight": 74, }
{ "id": 342, "name": "Ellie", "weight": "85kg", }
{ "id": 342, "nom": "Frankie", "weight": 77, }
{ "id": 342, "name": "Frankie", "weight": -67, }
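The silent failures on this slide (a string where a number belongs, a misspelled key, a negative weight) are the kind of thing an explicit schema rejects at write time. With an implicit schema, the application has to do that check itself. A minimal sketch, assuming the three fields from the explicit-schema slide; the `validate` function and `EXPECTED_FIELDS` table are hypothetical names for illustration, not something shown in the talk:

```python
# Hypothetical validator mirroring the slide's explicit schema
# (id int, name text, weight uint). Not code from the talk.
EXPECTED_FIELDS = {"id": int, "name": str, "weight": int}


def has_type(value, expected_type):
    """True if value has the expected type; bool is never accepted as an int."""
    return isinstance(value, expected_type) and not isinstance(value, bool)


def validate(record):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not has_type(record[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    for field in record:
        if field not in EXPECTED_FIELDS:
            problems.append(f"unexpected field: {field}")
    # 'uint' on the slide: a weight must not be negative.
    weight = record.get("weight")
    if has_type(weight, int) and weight < 0:
        problems.append("weight must be non-negative")
    return problems


for record in [
    {"id": 342, "name": "Ellie", "weight": "85kg"},
    {"id": 342, "nom": "Frankie", "weight": 77},
    {"id": 342, "name": "Frankie", "weight": -67},
]:
    print(record, "->", validate(record))
```

Each of the three bad records from the slide comes back with at least one problem, while the "David" record passes.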
Continuous deployment: It's 11pm. Do you know where your locks are?
Add NULL and backfill. 1-to-1 relation and backfill. DBMS-supported type changes.
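"Add NULL and backfill" can be sketched as two steps: add the column as nullable so the schema change is quick, then fill it in small committed batches so no single transaction holds locks for long. The example below is an illustration under assumptions, not the talk's code: it uses SQLite from the standard library so it runs anywhere, and the `person` table, the `weight_g` column and the batch size are all made up.

```python
import sqlite3


def add_column_and_backfill(conn, batch_size=1000):
    """Add a nullable column, then fill it in small batches."""
    # Step 1: a nullable column with no default. In PostgreSQL, for instance,
    # this is a metadata-only change rather than a table rewrite.
    conn.execute("ALTER TABLE person ADD COLUMN weight_g INTEGER")
    # Step 2: walk the primary key in batches and commit after each one,
    # so each transaction touches only a few rows.
    last_id = 0
    while True:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM person WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size))]
        if not ids:
            break
        conn.execute(
            "UPDATE person SET weight_g = weight_kg * 1000 "
            "WHERE id BETWEEN ? AND ? AND weight_g IS NULL",
            (ids[0], ids[-1]))
        conn.commit()
        last_id = ids[-1]


# Hypothetical demo data, reusing the names from the schema slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, weight_kg INTEGER)")
conn.executemany(
    "INSERT INTO person (name, weight_kg) VALUES (?, ?)",
    [("Alice", 76), ("Bob", 84), ("Charles", 65)])
conn.commit()
add_column_and_backfill(conn, batch_size=2)
```

While the backfill runs, application code has to tolerate NULL in the new column; only once every row is filled can a NOT NULL constraint be considered.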
Hardware choice: ZOMG RUN IT ON THE CLOUD
VMs are TERRIBLE at IO: Up to 10x slowdown, even with VT-d.
Memory is king: Your database loves it. Don't let other apps steal it.
Adding more power goes far: Especially with PostgreSQL or read-only replicas.
Sharding point, Vertical split, Consistency leeway
Sharding point: Datasets partitioned by primary key
Migration plan: Implement consistent hashing on primary key. Make large number of logical shards (2048?). Map logical shards to single physical shard. Migrate shards using replication.
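One way to read this migration plan: hash each primary key into a fixed, generous number of logical shards, keep a lookup table from logical shard to physical server, and start with every logical shard on the same server. Moving data later means changing table entries, not rehashing keys. The sketch below uses a stable hash modulo a fixed shard count rather than a hash ring, which is an assumption about what the slide means; the server names `db-1` and `db-2` and all function names are hypothetical.

```python
import hashlib

NUM_LOGICAL_SHARDS = 2048  # the figure the slide floats, with a question mark


def logical_shard(primary_key):
    """Map a primary key to a logical shard with a hash that is stable across runs."""
    digest = hashlib.sha1(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOGICAL_SHARDS


# Day one: every logical shard lives on the single physical database.
shard_map = {shard: "db-1" for shard in range(NUM_LOGICAL_SHARDS)}


def physical_shard(primary_key):
    """Look up which server currently holds the row for this primary key."""
    return shard_map[logical_shard(primary_key)]


def migrate(shards, new_server):
    """Repoint logical shards, assuming their data was already replicated over."""
    for shard in shards:
        shard_map[shard] = new_server


shard = logical_shard(342)
print(physical_shard(342))   # db-1
migrate([shard], "db-2")
print(physical_shard(342))   # db-2
```

Because the key-to-logical-shard mapping never changes, only the small `shard_map` table has to be updated and distributed when a shard moves.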
Vertical split: Entirely unrelated tables
Migration plan: Replicate database to new server. Route split tables there, disable replication. Or: slowly backfill new datastore with fallback lookup.
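The second option, a slow backfill with fallback lookup, can be sketched as a thin wrapper: read from the new datastore first, fall back to the old one on a miss, and copy the value across as you go. In this illustration plain dicts stand in for the two datastores, and the `FallbackStore` class is a hypothetical name, not an API from the talk.

```python
class FallbackStore:
    """Read from the new store, falling back to the old one and copying on a miss."""

    def __init__(self, new_store, old_store):
        self.new_store = new_store
        self.old_store = old_store

    def get(self, key):
        if key in self.new_store:
            return self.new_store[key]
        # Miss: fetch from the old datastore (raises KeyError if absent there
        # too) and backfill the new one so the next read is served directly.
        value = self.old_store[key]
        self.new_store[key] = value
        return value

    def set(self, key, value):
        # All writes go to the new store only; the old one is being retired.
        self.new_store[key] = value


old = {"alice": 76, "bob": 84}
new = {}
store = FallbackStore(new, old)
print(store.get("alice"))  # 76, copied into `new` as a side effect
store.set("bob", 85)
print(store.get("bob"))    # 85, the new store wins over the stale old value
```

A background job can walk the remaining keys at its own pace; once the new store holds everything, the fallback path and the old datastore can be removed.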
Denormalisation: It's not free!
Migration plan: Add NULL fields to dependent tables. App code to fetch and fill if not present. Possibly prefill on save of new items.
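A rough sketch of this plan, with made-up `customer` and `purchase` tables in SQLite: the dependent table gets a nullable copy of the customer's name, reads fill it in when it is missing, and new rows are prefilled on save. Table, column and function names are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        customer_name TEXT  -- denormalised copy, NULL until filled
    );
    INSERT INTO customer (id, name) VALUES (1, 'Alice');
    INSERT INTO purchase (id, customer_id) VALUES (10, 1);
""")


def customer_name_for_purchase(conn, purchase_id):
    """Return the denormalised name, fetching and filling it if not present."""
    row = conn.execute(
        "SELECT customer_id, customer_name FROM purchase WHERE id = ?",
        (purchase_id,)).fetchone()
    if row is None:
        raise LookupError(f"no purchase with id {purchase_id}")
    customer_id, name = row
    if name is None:
        name = conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)).fetchone()[0]
        conn.execute(
            "UPDATE purchase SET customer_name = ? WHERE id = ?",
            (name, purchase_id))
        conn.commit()
    return name


def create_purchase(conn, purchase_id, customer_id):
    """Prefill the denormalised field when a new item is saved."""
    name = conn.execute(
        "SELECT name FROM customer WHERE id = ?", (customer_id,)).fetchone()[0]
    conn.execute(
        "INSERT INTO purchase (id, customer_id, customer_name) VALUES (?, ?, ?)",
        (purchase_id, customer_id, name))
    conn.commit()


first_read = customer_name_for_purchase(conn, 10)  # fills the NULL field
create_purchase(conn, 11, 1)                       # prefilled on save
```

The "not free" part shows up as soon as a customer is renamed: every copied `customer_name` is then stale until something rewrites it, and this sketch does not handle that.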
Consistency leeway: Can you take inconsistent views?
Migration plan: Change your site! Talk to your designers! Deliberately introduce inconsistency!
Big Data isn't one thing: It depends on type, size, complexity, throughput, latency...
Focus on the current problems: Future problems don't matter if you never get there.
Efficiency and iterating fast matter: The smaller you are, the more time is worth.
Good architecture affects product: You're not writing a system in a vacuum.
Thanks! Andrew Godwin @andrewgodwin andrewgodwin@eventbrite.com are hiring!