Upgrade to Pro — share decks privately, control downloads, hide ads and more …

I Hate Your Database

I Hate Your Database

A talk I gave at Djangocon Europe 2012

Andrew Godwin

June 05, 2012
Tweet

More Decks by Andrew Godwin

Other Decks in Technology

Transcript

  1. I Hate Your Database
    I Hate Your Database
    Andrew Godwin
    Andrew Godwin
    @andrewgodwin
    @andrewgodwin
    flickr.com/96dpi
    flickr.com/96dpi

    View Slide

  2. Hate? Databases?
    Hate? Databases?
    flickr.com/parkerblohm
    flickr.com/parkerblohm

    View Slide

  3. · Misuse
    · Misuse
    · Ignorance
    · Ignorance
    · Lies
    · Lies

    View Slide

  4. Different databases,
    Different databases,
    different occasions
    different occasions
    flickr.com/maistora
    flickr.com/maistora

    View Slide

  5. Relational
    Relational
    Document
    Document
    Key-Value
    Key-Value
    Graph
    Graph
    Object / Hierarchial
    Object / Hierarchial
    Spatial
    Spatial
    Time-series / RRD
    Time-series / RRD
    Search
    Search

    View Slide

  6. Relational
    Relational PostgreSQL
    PostgreSQL
    MySQL
    MySQL
    SQLite
    SQLite
    PostgreSQL
    PostgreSQL
    MySQL
    MySQL
    SQLite
    SQLite
    MongoDB
    MongoDB
    CouchDB
    CouchDB
    Document
    Document
    Redis
    Redis
    Cassandra
    Cassandra
    Riak
    Riak
    Key-value
    Key-value

    View Slide

  7. Some Quick Theory
    Some Quick Theory
    flickr.com/iamdabe
    flickr.com/iamdabe

    View Slide

  8. A
    Atomicity
    tomicity
    C
    Consistency
    onsistency
    I
    Isolation
    solation
    D
    Durability
    urability

    View Slide

  9. C
    Consistency
    onsistency
    A
    Availability
    vailability
    P
    Partition Tolerance
    artition Tolerance

    View Slide

  10. Relational Databases
    Relational Databases
    flickr.com/maistora
    flickr.com/maistora

    View Slide

  11. Common Pitfalls
    Common Pitfalls
    SELECT item1, item2, item3 FROM basket;
    SELECT item1, item2, item3 FROM basket;
    INSERT INTO groups (name, people)
    INSERT INTO groups (name, people)
    VALUES ('friends', 'aaron,betty,chris,damien')
    VALUES ('friends', 'aaron,betty,chris,damien')
    SELECT file_bytes FROM pictures WHERE filename =
    SELECT file_bytes FROM pictures WHERE filename =
    'foo.jpg'
    'foo.jpg'

    View Slide

  12. names = set()
    names = set()
    for book in Book.objects.filter(year=2012):
    for book in Book.objects.filter(year=2012):
    names.add(book.author.name)
    names.add(book.author.name)

    View Slide

  13. names = Author.objects.filter(
    names = Author.objects.filter(
    books__year=2012
    books__year=2012
    ).values_list('name', flat=True).distinct()
    ).values_list('name', flat=True).distinct()

    View Slide

  14. No transactional DDL
    No transactional DDL
    Poor query optimiser
    Poor query optimiser
    MyISAM: Full-table locking, no transactions
    MyISAM: Full-table locking, no transactions
    Oracle
    Oracle
    Very fast for some operations
    Very fast for some operations
    MySQL
    MySQL

    View Slide

  15. Little integrity checking (slowly being fixed)
    Little integrity checking (slowly being fixed)
    Impossible to do some table alterations
    Impossible to do some table alterations
    No concurrent access
    No concurrent access
    Very low overhead
    Very low overhead
    Very portable
    Very portable
    SQLite
    SQLite

    View Slide

  16. Slow default configuration
    Slow default configuration
    Can be a little harder to learn / less familiar
    Can be a little harder to learn / less familiar
    Almost too many features
    Almost too many features
    Incredibly reliable
    Incredibly reliable
    PostgreSQL
    PostgreSQL

    View Slide

  17. Document Databases
    Document Databases flickr.com/babyowls
    flickr.com/babyowls

    View Slide

  18. db.insert({
    db.insert({
    'name': 'Sally',
    'name': 'Sally',
    'tags': ['django', 'python', 'search'],
    'tags': ['django', 'python', 'search'],
    'addresses': [
    'addresses': [
    {'type': 'jabber', 'jid': '[email protected]'},
    {'type': 'jabber', 'jid': '[email protected]'},
    {'type': 'phone', 'number': '011899981199'},
    {'type': 'phone', 'number': '011899981199'},
    ],
    ],
    })
    })
    db.find({
    db.find({
    'tags': 'python',
    'tags': 'python',
    'addresses.type': 'jabber',
    'addresses.type': 'jabber',
    })
    })

    View Slide

  19. No fixed schema
    No fixed schema
    Low barrier to entry
    Low barrier to entry
    Closer to Python datatypes
    Closer to Python datatypes
    Advantages
    Advantages

    View Slide

  20. Immature (but improving)
    Immature (but improving)
    No transactions
    No transactions
    No integrity checking
    No integrity checking
    Problems
    Problems

    View Slide

  21. Key-value stores
    Key-value stores
    flickr.com/zebble
    flickr.com/zebble

    View Slide

  22. db.set('foo', 'bar)
    db.set('foo', 'bar)
    x = db.get('foo')
    x = db.get('foo')
    db.sadd('names', 'andrew')
    db.sadd('names', 'andrew')
    db.sadd('names', 'brian')
    db.sadd('names', 'brian')
    y = db.scard('names')
    y = db.scard('names')

    View Slide

  23. Horizontal scaling (but with drawbacks)
    Horizontal scaling (but with drawbacks)
    Extremely fast
    Extremely fast
    Can only fetch objects by key
    Can only fetch objects by key
    Batch/map-reduce queries
    Batch/map-reduce queries
    Transactions not possible
    Transactions not possible
    Traits
    Traits

    View Slide

  24. Other database types
    Other database types flickr.com/tusnelda
    flickr.com/tusnelda

    View Slide

  25. Knowledge of projections useful
    Knowledge of projections useful
    Spatial indexes really speed up some problems
    Spatial indexes really speed up some problems
    Generally add-on to existing DB
    Generally add-on to existing DB
    Spatial
    Spatial

    View Slide

  26. Hierarchial key-value store
    Hierarchial key-value store
    Allows multiple writers for appends
    Allows multiple writers for appends
    Supports very large files
    Supports very large files
    Filesystems
    Filesystems

    View Slide

  27. Allow efficient neighbour queries
    Allow efficient neighbour queries
    Generally not much use for anything else
    Generally not much use for anything else
    Graph Databases
    Graph Databases

    View Slide

  28. Deliberately loses old data
    Deliberately loses old data
    Useful for logging or statistics
    Useful for logging or statistics
    Round-Robin Database
    Round-Robin Database

    View Slide

  29. Final Thoughts
    Final Thoughts
    flickr.com/pagedooley
    flickr.com/pagedooley

    View Slide

  30. It's unlikely your data all
    It's unlikely your data all
    fits in one paradigm.
    fits in one paradigm.

    View Slide

  31. Just buying bigger servers
    Just buying bigger servers
    goes a long way
    goes a long way

    View Slide

  32. If it sounds too good to be true,
    If it sounds too good to be true,
    it probably is.
    it probably is.

    View Slide

  33. Fin.
    Fin.
    Andrew Godwin / @andrewgodwin
    Andrew Godwin / @andrewgodwin
    flickr.com/oimax
    flickr.com/oimax

    View Slide