Small Data: Databases in the Real World
Andrew Godwin
August 04, 2014
A talk I gave at PyCon AU 2014.
Transcript
SMALL DATA: DATABASES IN THE REAL WORLD Andrew Godwin @andrewgodwin
Andrew Godwin Core Developer Senior Engineer
BIG DATA What does it mean? What is 'big'?
1,000 rows? 1,000,000 rows? 1,000,000,000 rows? 1,000,000,000,000 rows?
Scalable designs are a tradeoff: NOW vs LATER
Small company? Agency? Focus on ease of change, not scalability
You don't need to scale from day one But always leave yourself scaling points
Rapid development; Continuous deployment; Hardware choice; Scaling 'breakpoints'
Rapid development It's all about schema change overhead
Explicit Schema
ID (int)  Name (text)  Weight (uint)
1         Alice        76
2         Bob          84
3         Charles      65
Implicit Schema { "id": 342, "name": "David", "weight": 44, }
Silent Failure
{ "id": 342, "name": "David", "weight": 74, }
{ "id": 342, "name": "Ellie", "weight": "85kg", }
{ "id": 342, "nom": "Frankie", "weight": 77, }
{ "id": 342, "name": "Frankie", "weight": -67, }
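The silent failures on this slide (a string where a number belongs, a misspelled key, a negative weight) are the kind of thing an explicit schema rejects and an implicit one quietly stores. A minimal sketch of an application-side check, assuming the three fields shown on the slide; `EXPECTED_FIELDS` and `validate_person` are hypothetical names for illustration, not something from the talk:

```python
# Hypothetical schema for the slide's records: field name -> required type.
EXPECTED_FIELDS = {"id": int, "name": str, "weight": int}


def validate_person(record):
    """Return a list of problems found in one schemaless record."""
    problems = []
    for field, field_type in EXPECTED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], field_type):
            problems.append(f"{field} should be {field_type.__name__}")
    for field in record:
        if field not in EXPECTED_FIELDS:
            problems.append(f"unexpected field: {field}")
    # A uint column would reject this; an implicit schema will not.
    weight = record.get("weight")
    if isinstance(weight, int) and weight < 0:
        problems.append("weight must not be negative")
    return problems


records = [
    {"id": 342, "name": "David", "weight": 74},
    {"id": 342, "name": "Ellie", "weight": "85kg"},
    {"id": 342, "nom": "Frankie", "weight": 77},
    {"id": 342, "name": "Frankie", "weight": -67},
]
for record in records:
    print(record, validate_person(record))
```

Only the first record comes back with an empty problem list; the other three each trip one of the checks.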
Continuous deployment It's 11pm. Do you know where your locks are?
Add NULL and backfill; 1-to-1 relation and backfill; DBMS-supported type changes
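As a rough illustration of "add NULL and backfill", the sketch below adds a nullable column and then fills it a few rows per transaction, so no single statement holds locks for long. It uses SQLite from the standard library purely so it runs anywhere; the `person` table, the `weight_kg` and `weight_grams` columns, and the batch size are all assumptions made for the example.

```python
import sqlite3


def add_column_and_backfill(conn, batch_size=1000):
    """Add a nullable column, then fill it in small batches."""
    # Step 1: add the column as nullable, with no default to compute.
    conn.execute("ALTER TABLE person ADD COLUMN weight_grams INTEGER")
    conn.commit()
    # Step 2: backfill a few rows per transaction so locks are held briefly.
    while True:
        cursor = conn.execute(
            "UPDATE person SET weight_grams = weight_kg * 1000 "
            "WHERE id IN (SELECT id FROM person "
            "WHERE weight_grams IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cursor.rowcount == 0:
            break  # nothing left to fill


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, "
    "weight_kg INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO person (id, name, weight_kg) VALUES (?, ?, ?)",
    [(1, "Alice", 76), (2, "Bob", 84), (3, "Charles", 65)],
)
conn.commit()
add_column_and_backfill(conn, batch_size=2)
print(conn.execute("SELECT name, weight_grams FROM person ORDER BY id").fetchall())
```

The loop only terminates because the source column is `NOT NULL`; a backfill whose computed value can itself be NULL would need a different "done" marker.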
Hardware choice ZOMG RUN IT ON THE CLOUD
VMs are TERRIBLE at IO Up to 10x slowdown, even with VT-d.
Memory is king Your database loves it. Don't let other apps steal it.
Adding more power goes far Especially with PostgreSQL or read-only replicas
Sharding point; Vertical split; Consistency leeway
Sharding point Datasets partitioned by primary key
Migration plan: Implement consistent hashing on primary key; Make large number of logical shards (2048?); Map logical shards to single physical shard; Migrate shards using replication
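One way to read the sharding plan above: hash each primary key onto a fixed set of logical shards, keep a separate table from logical shard to physical server, and later move logical shards one at a time. The sketch below uses a stable hash modulo a fixed shard count, which is a simpler stand-in for a full consistent-hashing ring; the server names `db-1` and `db-2` and all function names are hypothetical.

```python
import hashlib

LOGICAL_SHARDS = 2048  # the slide's suggested order of magnitude


def logical_shard(primary_key):
    """Map a primary key to a logical shard; stable across processes."""
    digest = hashlib.md5(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % LOGICAL_SHARDS


# To begin with, every logical shard lives on the single physical server.
shard_map = {shard: "db-1" for shard in range(LOGICAL_SHARDS)}


def physical_shard(primary_key):
    """Look up which server currently holds this key."""
    return shard_map[logical_shard(primary_key)]


def move_logical_shard(shard, new_server):
    """Repoint one logical shard, once its data has been replicated over."""
    shard_map[shard] = new_server


print(physical_shard(42))
move_logical_shard(logical_shard(42), "db-2")
print(physical_shard(42))
```

Because keys never change logical shard, scaling out only edits `shard_map`; the hash function and the application's key handling stay untouched.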
Vertical split Entirely unrelated tables
Migration plan: Replicate database to new server; Route split tables there, disable replication - or - Slowly backfill new datastore with fallback lookup
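The second option, a slow backfill with fallback lookup, might look roughly like this: reads try the new datastore first, fall back to the old one on a miss, and copy the value across as they go. Plain dicts stand in for the two datastores here, and `FallbackStore` is a hypothetical name used only for this sketch.

```python
class FallbackStore:
    """Read from the new datastore first, falling back to the old one."""

    def __init__(self, new_store, old_store):
        self.new_store = new_store
        self.old_store = old_store

    def get(self, key):
        if key in self.new_store:
            return self.new_store[key]
        value = self.old_store[key]  # raises KeyError if it exists nowhere
        self.new_store[key] = value  # backfill so the next read skips the fallback
        return value

    def put(self, key, value):
        # All writes go to the new datastore only.
        self.new_store[key] = value


old = {"alice": 76, "bob": 84}
new = {}
store = FallbackStore(new, old)
store.get("alice")        # miss in new, found in old, copied across
store.put("charles", 65)  # new data never touches the old store
print(new)
```

Rows that are never read would still need a background sweep before the old store can be switched off.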
Denormalisation It's not free!
Migration plan: Add NULL fields to dependent tables; App code to fetch and fill if not present; Possibly prefill on save of new items
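A small sketch of the "fetch and fill if not present" step, assuming a hypothetical `comment` table that gains a nullable `author_name` column copied from a hypothetical `author` table. SQLite is used only so the example is self-contained.

```python
import sqlite3


def author_name_for_comment(conn, comment_id):
    """Return the denormalised author name, filling it in on first read."""
    author_id, author_name = conn.execute(
        "SELECT author_id, author_name FROM comment WHERE id = ?",
        (comment_id,),
    ).fetchone()
    if author_name is None:
        # Not yet denormalised: fetch from the source table and fill it in.
        (author_name,) = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()
        conn.execute(
            "UPDATE comment SET author_name = ? WHERE id = ?",
            (author_name, comment_id),
        )
        conn.commit()
    return author_name


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE comment (id INTEGER PRIMARY KEY, author_id INTEGER NOT NULL, "
    "body TEXT, author_name TEXT)"
)
conn.execute("INSERT INTO author (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO comment (id, author_id, body) VALUES (10, 1, 'Hello')")
conn.commit()
print(author_name_for_comment(conn, 10))
```

This is where "it's not free" bites: if the author is later renamed, every filled-in copy has to be updated or invalidated as well.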
Consistency leeway Can you take inconsistent views?
Migration plan: Change your site! Talk to your designers! Deliberately introduce inconsistency!
Big Data isn't one thing It depends on type, size, complexity, throughput, latency...
Focus on the current problems Future problems don't matter if you never get there
Efficiency and iterating fast matter The smaller you are, the more time is worth
Good architecture affects product You're not writing a system in a vacuum
Thanks! Andrew Godwin @andrewgodwin
[email protected]
are hiring!