Slide 1

Slide 1 text

I Hate Your Database I Hate Your Database Andrew Godwin Andrew Godwin @andrewgodwin @andrewgodwin flickr.com/96dpi flickr.com/96dpi

Slide 2

Slide 2 text

Hate? Databases? Hate? Databases? flickr.com/parkerblohm flickr.com/parkerblohm

Slide 3

Slide 3 text

· Misuse · Misuse · Ignorance · Ignorance · Lies · Lies

Slide 4

Slide 4 text

Different databases, Different databases, different occasions different occasions flickr.com/maistora flickr.com/maistora

Slide 5

Slide 5 text

Relational Relational Document Document Key-Value Key-Value Graph Graph Object / Hierarchial Object / Hierarchial Spatial Spatial Time-series / RRD Time-series / RRD Search Search

Slide 6

Slide 6 text

Relational Relational PostgreSQL PostgreSQL MySQL MySQL SQLite SQLite PostgreSQL PostgreSQL MySQL MySQL SQLite SQLite MongoDB MongoDB CouchDB CouchDB Document Document Redis Redis Cassandra Cassandra Riak Riak Key-value Key-value

Slide 7

Slide 7 text

Some Quick Theory Some Quick Theory flickr.com/iamdabe flickr.com/iamdabe

Slide 8

Slide 8 text

A Atomicity tomicity C Consistency onsistency I Isolation solation D Durability urability

Slide 9

Slide 9 text

C Consistency onsistency A Availability vailability P Partition Tolerance artition Tolerance

Slide 10

Slide 10 text

Relational Databases Relational Databases flickr.com/maistora flickr.com/maistora

Slide 11

Slide 11 text

Common Pitfalls Common Pitfalls SELECT item1, item2, item3 FROM basket; SELECT item1, item2, item3 FROM basket; INSERT INTO groups (name, people) INSERT INTO groups (name, people) VALUES ('friends', 'aaron,betty,chris,damien') VALUES ('friends', 'aaron,betty,chris,damien') SELECT file_bytes FROM pictures WHERE filename = SELECT file_bytes FROM pictures WHERE filename = 'foo.jpg' 'foo.jpg'

Slide 12

Slide 12 text

names = set() names = set() for book in Book.objects.filter(year=2012): for book in Book.objects.filter(year=2012): names.add(book.author.name) names.add(book.author.name)

Slide 13

Slide 13 text

names = Author.objects.filter( names = Author.objects.filter( books__year=2012 books__year=2012 ).values_list('name', flat=True).distinct() ).values_list('name', flat=True).distinct()

Slide 14

Slide 14 text

No transactional DDL No transactional DDL Poor query optimiser Poor query optimiser MyISAM: Full-table locking, no transactions MyISAM: Full-table locking, no transactions Oracle Oracle Very fast for some operations Very fast for some operations MySQL MySQL

Slide 15

Slide 15 text

Little integrity checking (slowly being fixed) Little integrity checking (slowly being fixed) Impossible to do some table alterations Impossible to do some table alterations No concurrent access No concurrent access Very low overhead Very low overhead Very portable Very portable SQLite SQLite

Slide 16

Slide 16 text

Slow default configuration Slow default configuration Can be a little harder to learn / less familiar Can be a little harder to learn / less familiar Almost too many features Almost too many features Incredibly reliable Incredibly reliable PostgreSQL PostgreSQL

Slide 17

Slide 17 text

Document Databases Document Databases flickr.com/babyowls flickr.com/babyowls

Slide 18

Slide 18 text

db.insert({ db.insert({ 'name': 'Sally', 'name': 'Sally', 'tags': ['django', 'python', 'search'], 'tags': ['django', 'python', 'search'], 'addresses': [ 'addresses': [ {'type': 'jabber', 'jid': '[email protected]'}, {'type': 'jabber', 'jid': '[email protected]'}, {'type': 'phone', 'number': '011899981199'}, {'type': 'phone', 'number': '011899981199'}, ], ], }) }) db.find({ db.find({ 'tags': 'python', 'tags': 'python', 'addresses.type': 'jabber', 'addresses.type': 'jabber', }) })

Slide 19

Slide 19 text

No fixed schema No fixed schema Low barrier to entry Low barrier to entry Closer to Python datatypes Closer to Python datatypes Advantages Advantages

Slide 20

Slide 20 text

Immature (but improving) Immature (but improving) No transactions No transactions No integrity checking No integrity checking Problems Problems

Slide 21

Slide 21 text

Key-value stores Key-value stores flickr.com/zebble flickr.com/zebble

Slide 22

Slide 22 text

db.set('foo', 'bar) db.set('foo', 'bar) x = db.get('foo') x = db.get('foo') db.sadd('names', 'andrew') db.sadd('names', 'andrew') db.sadd('names', 'brian') db.sadd('names', 'brian') y = db.scard('names') y = db.scard('names')

Slide 23

Slide 23 text

Horizontal scaling (but with drawbacks) Horizontal scaling (but with drawbacks) Extremely fast Extremely fast Can only fetch objects by key Can only fetch objects by key Batch/map-reduce queries Batch/map-reduce queries Transactions not possible Transactions not possible Traits Traits

Slide 24

Slide 24 text

Other database types Other database types flickr.com/tusnelda flickr.com/tusnelda

Slide 25

Slide 25 text

Knowledge of projections useful Knowledge of projections useful Spatial indexes really speed up some problems Spatial indexes really speed up some problems Generally add-on to existing DB Generally add-on to existing DB Spatial Spatial

Slide 26

Slide 26 text

Hierarchial key-value store Hierarchial key-value store Allows multiple writers for appends Allows multiple writers for appends Supports very large files Supports very large files Filesystems Filesystems

Slide 27

Slide 27 text

Allow efficient neighbour queries Allow efficient neighbour queries Generally not much use for anything else Generally not much use for anything else Graph Databases Graph Databases

Slide 28

Slide 28 text

Deliberately loses old data Deliberately loses old data Useful for logging or statistics Useful for logging or statistics Round-Robin Database Round-Robin Database

Slide 29

Slide 29 text

Final Thoughts Final Thoughts flickr.com/pagedooley flickr.com/pagedooley

Slide 30

Slide 30 text

It's unlikely your data all It's unlikely your data all fits in one paradigm. fits in one paradigm.

Slide 31

Slide 31 text

Just buying bigger servers Just buying bigger servers goes a long way goes a long way

Slide 32

Slide 32 text

If it sounds too good to be true, If it sounds too good to be true, it probably is. it probably is.

Slide 33

Slide 33 text

Fin. Fin. Andrew Godwin / @andrewgodwin Andrew Godwin / @andrewgodwin flickr.com/oimax flickr.com/oimax