Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Small Data: Databases in the Real World
Search
Andrew Godwin
August 04, 2014
Programming
2
600
Small Data: Databases in the Real World
A talk I gave at PyCon AU 2014.
Andrew Godwin
August 04, 2014
Tweet
Share
More Decks by Andrew Godwin
See All by Andrew Godwin
Reconciling Everything
andrewgodwin
1
320
Django Through The Years
andrewgodwin
0
210
Writing Maintainable Software At Scale
andrewgodwin
0
450
A Newcomer's Guide To Airflow's Architecture
andrewgodwin
0
360
Async, Python, and the Future
andrewgodwin
2
670
How To Break Django: With Async
andrewgodwin
1
730
Taking Django's ORM Async
andrewgodwin
0
730
The Long Road To Asynchrony
andrewgodwin
0
660
The Scientist & The Engineer
andrewgodwin
1
770
Other Decks in Programming
See All in Programming
STUNMESH-go: Wireguard NAT穿隧工具的源起與介紹
tjjh89017
0
370
技術的負債で信頼性が限界だったWordPress運用をShifterで完全復活させた話
rvirus0817
1
1.7k
プロダクトという一杯を作る - プロダクトチームが味の責任を持つまでの煮込み奮闘記
hiliteeternal
0
460
Android 15以上でPDFのテキスト検索を爆速開発!
tonionagauzzi
0
200
Understanding Ruby Grammar Through Conflicts
yui_knk
1
110
『リコリス・リコイル』に学ぶ!! 〜キャリア戦略における計画的偶発性理論と変わる勇気の重要性〜
wanko_it
1
530
Terraform やるなら公式スタイルガイドを読もう 〜重要項目 10選〜
hiyanger
13
3.1k
あまり知られていない MCP 仕様たち / MCP specifications that aren’t widely known
ktr_0731
0
270
React 使いじゃなくても知っておきたい教養としての React
oukayuka
18
5.7k
書き捨てではなく継続開発可能なコードをAIコーディングエージェントで書くために意識していること
shuyakinjo
1
270
Google I/O recap web編 大分Web祭り2025
kponda
0
2.9k
[DevinMeetupTokyo2025] コード書かせないDevinの使い方
takumiyoshikawa
2
280
Featured
See All Featured
Building Flexible Design Systems
yeseniaperezcruz
328
39k
Designing for Performance
lara
610
69k
For a Future-Friendly Web
brad_frost
179
9.9k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
9
770
jQuery: Nuts, Bolts and Bling
dougneiner
64
7.8k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.3k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
30
9.6k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
50
5.5k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.1k
Designing for humans not robots
tammielis
253
25k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Designing Experiences People Love
moore
142
24k
Transcript
Andrew Godwin @andrewgodwin SMALL DATA REAL WORLD DATABASES IN THE
Andrew Godwin Core Developer Senior Engineer
BIG DATA What does it mean? What is 'big'?
1,000 rows? 1,000,000 rows? 1,000,000,000 rows? 1,000,000,000,000 rows?
Scalable designs are a tradeoff: NOW LATER vs
Small company? Agency? Focus on ease of change, not scalability
You don't need to scale from day one But always
leave yourself scaling points
Rapid development Continuous deployment Hardware choice Scaling 'breakpoints'
Rapid development It's all about schema change overhead
Explicit Schema ID int Name text Weight uint 1 2
3 Alice Bob Charles 76 84 65 Implicit Schema { "id": 342, "name": "David", "weight": 44, }
Silent Failure { "id": 342, "name": "David", "weight": 74, }
{ "id": 342, "name": "Ellie", "weight": "85kg", } { "id": 342, "nom": "Frankie", "weight": 77, } { "id": 342, "name": "Frankie", "weight": -67, }
Continuous deployment It's 11pm. Do you know where your locks
are?
Add NULL and backfill 1-to-1 relation and backfill DBMS-supported type
changes
Hardware choice ZOMG RUN IT ON THE CLOUD
VMs are TERRIBLE at IO Up to 10x slowdown, even
with VT-d.
Memory is king Your database loves it. Don't let other
apps steal it.
Adding more power goes far Especially with PostgreSQL or read-only
replicas
None
Sharding point Vertical split Consistency leeway
Sharding point Datasets paritioned by primary key
Migration plan Implement consistent hashing on primary key Make large
number of logical shards (2048?) Map logical shards to single physical shard Migrate shards using replication
Vertical split Entirely unrelated tables
Migration plan Replicate database to new server Route split tables
there, disable replication - or - Slowly backfill new datastore with fallback lookup
Denormalisation It's not free!
Migration plan Add NULL fields to dependent tables App code
to fetch and fill if not present Possibly prefill on save of new items
Consistency leeway Can you take inconsistent views?
Migration plan Change your site! Talk to your designers! Deliberately
introduce inconsistency!
Big Data isn't one thing It depends on type, size,
complexity, throughput, latency...
Focus on the current problems Future problems don't matter if
you never get there
Efficiency and iterating fast matters The smaller you are, the
more time is worth
Good architecture affects product You're not writing a system in
a vacuum
Thanks! Andrew Godwin @andrewgodwin andrewgodwin@eventbrite.com are hiring!