A talk I gave at PyCon AU 2014.
DATABASES IN THE
What does it mean?
What is 'big'?
Scalable designs are a tradeoff:
Small company? Agency?
Focus on ease of change, not scalability
You don't need to scale
from day one
But always leave yourself scaling points
It's all about schema change overhead
ID int | Name text | Weight uint
It's 11pm. Do you know where your locks are?
Add NULL and backfill
1-to-1 relation and backfill
DBMS-supported type changes
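The "add NULL and backfill" approach can be sketched like this — a minimal illustration using sqlite3 with hypothetical table and column names; in production you'd batch far more rows and watch lock durations:

```python
import sqlite3

# Hypothetical example: add a NULL-able column, then backfill in small
# batches so no single statement holds locks for long.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (name) VALUES (?)",
                 [("a",), ("b",), ("c",)])

# Step 1: adding a NULL-able column is cheap -- no table rewrite, no
# long-held lock (true for PostgreSQL and SQLite; check your DBMS).
conn.execute("ALTER TABLE items ADD COLUMN weight INTEGER")  # defaults to NULL

BATCH = 2  # keep batches small in production (e.g. a few thousand rows)
while True:
    rows = conn.execute(
        "SELECT id FROM items WHERE weight IS NULL LIMIT ?", (BATCH,)
    ).fetchall()
    if not rows:
        break
    # Step 2: backfill each batch in its own short transaction.
    conn.executemany("UPDATE items SET weight = 0 WHERE id = ?", rows)
    conn.commit()

print(conn.execute(
    "SELECT COUNT(*) FROM items WHERE weight IS NULL").fetchone()[0])  # 0
```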
ZOMG RUN IT ON THE CLOUD
VMs are TERRIBLE at IO
Up to 10x slowdown, even with VT-d.
Memory is king
Your database loves it. Don't let other apps steal it.
Adding more power goes far
Especially with PostgreSQL or read-only replicas
Datasets partitioned by primary key
Implement consistent hashing on primary key
Make large number of logical shards (2048?)
Map multiple logical shards to each physical shard
Migrate shards using replication
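The scheme above might be sketched as follows — a hypothetical illustration, with made-up server names and a simple hash-based mapping; the key point is that keys hash to *logical* shards, so migration only repoints a shard, it never rehashes keys:

```python
import hashlib

NUM_LOGICAL_SHARDS = 2048  # many logical shards, as suggested above

# Hypothetical mapping: each logical shard lives on one physical server.
physical_servers = ["db1", "db2"]
shard_map = {s: physical_servers[s % len(physical_servers)]
             for s in range(NUM_LOGICAL_SHARDS)}

def logical_shard(primary_key):
    """Hash the primary key into a stable logical shard number."""
    digest = hashlib.md5(str(primary_key).encode()).hexdigest()
    return int(digest, 16) % NUM_LOGICAL_SHARDS

def server_for(primary_key):
    return shard_map[logical_shard(primary_key)]

# A key always maps to the same logical shard, even when the shard
# itself is migrated (via replication) to a new physical server:
shard = logical_shard(42)
shard_map[shard] = "db3"   # move just this shard; other keys untouched
print(server_for(42))      # "db3"
```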
Entirely unrelated tables
Replicate database to new server
Route split tables there, disable replication
- or -
Slowly backfill new datastore with fallback lookup
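The "backfill with fallback lookup" option can be sketched in a few lines — a toy model with dicts standing in for the old and new datastores; names are illustrative, not from the talk:

```python
# Hypothetical sketch: while tables migrate to a new datastore, reads
# check the new store first and fall back to the old one, copying rows
# forward so the backfill completes gradually as data is touched.
old_store = {1: {"name": "widget"}, 2: {"name": "gadget"}}
new_store = {}

def get(key):
    row = new_store.get(key)
    if row is None:
        row = old_store.get(key)   # fallback lookup in the old store
        if row is not None:
            new_store[key] = row   # lazily backfill the new datastore
    return row

get(1)
print(1 in new_store)  # True -- row migrated on first read
```

A background job can sweep the remaining untouched rows so the old store can eventually be retired.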
It's not free!
Add NULL fields to dependent tables
App code to fetch and fill if not present
Possibly prefill on save of new items
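Those three steps can be sketched together — a hypothetical example with dicts in place of tables, showing the NULL field, the fetch-and-fill read path, and prefilling on save:

```python
# Hypothetical sketch of "fetch and fill": a denormalised column starts
# as NULL (None); app code fills it on first read, and saves of new
# items prefill it so fresh rows never need the backfill.
users = {1: {"name": "Ada"}}
orders = {10: {"user_id": 1, "user_name": None}}  # new NULL field

def order_user_name(order_id):
    order = orders[order_id]
    if order["user_name"] is None:             # not yet backfilled
        order["user_name"] = users[order["user_id"]]["name"]
    return order["user_name"]

def save_order(order_id, user_id):
    # Prefill the denormalised field on save of new items.
    orders[order_id] = {"user_id": user_id,
                        "user_name": users[user_id]["name"]}

print(order_user_name(10))  # "Ada" -- filled on first access
```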
Can you take inconsistent views?
Change your site!
Talk to your designers!
Deliberately introduce inconsistency!
Big Data isn't one thing
It depends on data type, size, and complexity
Focus on the current problems
Future problems don't matter if you never get there
Efficiency and iterating fast matters
The smaller you are, the more your time is worth
Good architecture affects product
You're not writing a system in a vacuum