Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Small Data: Databases in the Real World
Search
Andrew Godwin
August 04, 2014
Programming
2
600
Small Data: Databases in the Real World
A talk I gave at PyCon AU 2014.
Andrew Godwin
August 04, 2014
Tweet
Share
More Decks by Andrew Godwin
See All by Andrew Godwin
Reconciling Everything
andrewgodwin
1
330
Django Through The Years
andrewgodwin
0
220
Writing Maintainable Software At Scale
andrewgodwin
0
450
A Newcomer's Guide To Airflow's Architecture
andrewgodwin
0
360
Async, Python, and the Future
andrewgodwin
2
680
How To Break Django: With Async
andrewgodwin
1
730
Taking Django's ORM Async
andrewgodwin
0
730
The Long Road To Asynchrony
andrewgodwin
0
660
The Scientist & The Engineer
andrewgodwin
1
780
Other Decks in Programming
See All in Programming
ぬるぬる動かせ! Riveでアニメーション実装🐾
kno3a87
1
230
Tool Catalog Agent for Bedrock AgentCore Gateway
licux
7
2.5k
デザイナーが Androidエンジニアに 挑戦してみた
874wokiite
0
550
The Past, Present, and Future of Enterprise Java with ASF in the Middle
ivargrimstad
0
170
Testing Trophyは叫ばない
toms74209200
0
890
@Environment(\.keyPath)那么好我不允许你们不知道! / atEnvironment keyPath is so good and you should know it!
lovee
0
130
Compose Multiplatform × AI で作る、次世代アプリ開発支援ツールの設計と実装
thagikura
0
170
Updates on MLS on Ruby (and maybe more)
sylph01
1
180
How Android Uses Data Structures Behind The Scenes
l2hyunwoo
0
480
Zendeskのチケットを Amazon Bedrockで 解析した
ryokosuge
3
320
Amazon RDS 向けに提供されている MCP Server と仕組みを調べてみた/jawsug-okayama-2025-aurora-mcp
takahashiikki
1
120
Oracle Database Technology Night 92 Database Connection control FAN-AC
oracle4engineer
PRO
1
470
Featured
See All Featured
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
Fireside Chat
paigeccino
39
3.6k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.6k
How to Ace a Technical Interview
jacobian
279
23k
Automating Front-end Workflow
addyosmani
1370
200k
Building Flexible Design Systems
yeseniaperezcruz
329
39k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
61k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
34
6k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.2k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
44
2.5k
Six Lessons from altMBA
skipperchong
28
4k
Transcript
Andrew Godwin @andrewgodwin SMALL DATA REAL WORLD DATABASES IN THE
Andrew Godwin Core Developer Senior Engineer
BIG DATA What does it mean? What is 'big'?
1,000 rows? 1,000,000 rows? 1,000,000,000 rows? 1,000,000,000,000 rows?
Scalable designs are a tradeoff: NOW LATER vs
Small company? Agency? Focus on ease of change, not scalability
You don't need to scale from day one But always
leave yourself scaling points
Rapid development Continuous deployment Hardware choice Scaling 'breakpoints'
Rapid development It's all about schema change overhead
Explicit Schema ID int Name text Weight uint 1 2
3 Alice Bob Charles 76 84 65 Implicit Schema { "id": 342, "name": "David", "weight": 44, }
Silent Failure { "id": 342, "name": "David", "weight": 74, }
{ "id": 342, "name": "Ellie", "weight": "85kg", } { "id": 342, "nom": "Frankie", "weight": 77, } { "id": 342, "name": "Frankie", "weight": -67, }
Continuous deployment It's 11pm. Do you know where your locks
are?
Add NULL and backfill 1-to-1 relation and backfill DBMS-supported type
changes
Hardware choice ZOMG RUN IT ON THE CLOUD
VMs are TERRIBLE at IO Up to 10x slowdown, even
with VT-d.
Memory is king Your database loves it. Don't let other
apps steal it.
Adding more power goes far Especially with PostgreSQL or read-only
replicas
None
Sharding point Vertical split Consistency leeway
Sharding point Datasets paritioned by primary key
Migration plan Implement consistent hashing on primary key Make large
number of logical shards (2048?) Map logical shards to single physical shard Migrate shards using replication
Vertical split Entirely unrelated tables
Migration plan Replicate database to new server Route split tables
there, disable replication - or - Slowly backfill new datastore with fallback lookup
Denormalisation It's not free!
Migration plan Add NULL fields to dependent tables App code
to fetch and fill if not present Possibly prefill on save of new items
Consistency leeway Can you take inconsistent views?
Migration plan Change your site! Talk to your designers! Deliberately
introduce inconsistency!
Big Data isn't one thing It depends on type, size,
complexity, throughput, latency...
Focus on the current problems Future problems don't matter if
you never get there
Efficiency and iterating fast matters The smaller you are, the
more time is worth
Good architecture affects product You're not writing a system in
a vacuum
Thanks! Andrew Godwin @andrewgodwin
[email protected]
are hiring!