Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Small Data: Databases in the Real World
Search
Andrew Godwin
August 04, 2014
Programming
650
2
Share
Small Data: Databases in the Real World
A talk I gave at PyCon AU 2014.
Andrew Godwin
August 04, 2014
More Decks by Andrew Godwin
See All by Andrew Godwin
Reconciling Everything
andrewgodwin
1
380
Django Through The Years
andrewgodwin
0
300
Writing Maintainable Software At Scale
andrewgodwin
0
510
A Newcomer's Guide To Airflow's Architecture
andrewgodwin
0
410
Async, Python, and the Future
andrewgodwin
2
730
How To Break Django: With Async
andrewgodwin
1
800
Taking Django's ORM Async
andrewgodwin
0
800
The Long Road To Asynchrony
andrewgodwin
0
750
The Scientist & The Engineer
andrewgodwin
1
830
Other Decks in Programming
See All in Programming
いつか誰かが、と思っていた フロントエンド刷新5年間の実践知
kiichisugihara
1
230
アクセシビリティ試験の"その後"を仕組み化する
yuuumiravy
1
180
PHPer、Cloudflare に引っ越す
suguruooki
1
120
2026-04-15 Spring IO - I Can See Clearly Now
jonatan_ivanov
1
150
「Linuxサーバー構築標準教科書」を読んでみた #ツナギメオフライン.7
akase244
0
1.4k
How We Benchmarked Quarkus: Patterns and anti-patterns
hollycummins
1
170
クラウドネイティブなエンジニアに向ける Raycastの魅力と実際の活用事例
nealle
2
230
2026年のソフトウェア開発を考える(2026/05版) / Software Engineering Scrum Fest Niigata 2026 Edition
twada
PRO
18
8.2k
属人化しないコード品質の作り方_2026.04.07.pdf
muraaano
0
280
10 Tips of AWS ~Gen AI on AWS~
licux
5
510
Spec-Driven Development with AI Agents (Workshop, May 2026)
antonarhipov
2
210
【26新卒研修】OpenAPI/Swagger REST API研修
dip_tech
PRO
0
110
Featured
See All Featured
Skip the Path - Find Your Career Trail
mkilby
1
110
The Straight Up "How To Draw Better" Workshop
denniskardys
239
140k
Dominate Local Search Results - an insider guide to GBP, reviews, and Local SEO
greggifford
PRO
0
160
Max Prin - Stacking Signals: How International SEO Comes Together (And Falls Apart)
techseoconnect
PRO
0
150
Agile Leadership in an Agile Organization
kimpetersen
PRO
0
140
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.2k
The SEO Collaboration Effect
kristinabergwall1
1
440
Making the Leap to Tech Lead
cromwellryan
135
9.8k
How GitHub (no longer) Works
holman
316
150k
Unlocking the hidden potential of vector embeddings in international SEO
frankvandijk
0
780
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
270
The MySQL Ecosystem @ GitHub 2015
samlambert
251
13k
Transcript
Andrew Godwin @andrewgodwin SMALL DATA REAL WORLD DATABASES IN THE
Andrew Godwin Core Developer Senior Engineer
BIG DATA What does it mean? What is 'big'?
1,000 rows? 1,000,000 rows? 1,000,000,000 rows? 1,000,000,000,000 rows?
Scalable designs are a tradeoff: NOW LATER vs
Small company? Agency? Focus on ease of change, not scalability
You don't need to scale from day one But always
leave yourself scaling points
Rapid development Continuous deployment Hardware choice Scaling 'breakpoints'
Rapid development It's all about schema change overhead
Explicit Schema ID int Name text Weight uint 1 2
3 Alice Bob Charles 76 84 65 Implicit Schema { "id": 342, "name": "David", "weight": 44, }
Silent Failure { "id": 342, "name": "David", "weight": 74, }
{ "id": 342, "name": "Ellie", "weight": "85kg", } { "id": 342, "nom": "Frankie", "weight": 77, } { "id": 342, "name": "Frankie", "weight": -67, }
Continuous deployment It's 11pm. Do you know where your locks
are?
Add NULL and backfill 1-to-1 relation and backfill DBMS-supported type
changes
Hardware choice ZOMG RUN IT ON THE CLOUD
VMs are TERRIBLE at IO Up to 10x slowdown, even
with VT-d.
Memory is king Your database loves it. Don't let other
apps steal it.
Adding more power goes far Especially with PostgreSQL or read-only
replicas
None
Sharding point Vertical split Consistency leeway
Sharding point Datasets paritioned by primary key
Migration plan Implement consistent hashing on primary key Make large
number of logical shards (2048?) Map logical shards to single physical shard Migrate shards using replication
Vertical split Entirely unrelated tables
Migration plan Replicate database to new server Route split tables
there, disable replication - or - Slowly backfill new datastore with fallback lookup
Denormalisation It's not free!
Migration plan Add NULL fields to dependent tables App code
to fetch and fill if not present Possibly prefill on save of new items
Consistency leeway Can you take inconsistent views?
Migration plan Change your site! Talk to your designers! Deliberately
introduce inconsistency!
Big Data isn't one thing It depends on type, size,
complexity, throughput, latency...
Focus on the current problems Future problems don't matter if
you never get there
Efficiency and iterating fast matters The smaller you are, the
more time is worth
Good architecture affects product You're not writing a system in
a vacuum
Thanks! Andrew Godwin @andrewgodwin
[email protected]
are hiring!