A Guide to the Post Relational Revolution

Presentation held at Scandinavian Developer Conference, April 2012

Theo Hultberg

April 17, 2012

Transcript

  1. Chief Architect at [company logo]. Co-organizer of the local Ruby, Scala, and JavaScript user groups. More rep on StackOverflow than both Jeff & Joel.
  2. JOINS ARE A CRUTCH: why split up your data if all you’re going to do is assemble it over and over again?
  3. THE RELATIONAL MODEL ISN’T A GOLDEN HAMMER: the existence of object relational mappers should be proof enough.
  4. the Bigtable model, “column oriented”, “sparse tables”, found in Cassandra and HBase [slide diagram: each row key maps to a sorted set of column keys, each column key to a value plus a timestamp]
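
     A minimal Python sketch, not from the slides, of the data shape described above: a sparse row whose column keys are kept sorted, each column holding a value and a timestamp. All names are illustrative.

         import bisect
         import time

         class Row:
             """One row in a Bigtable-style store: a sparse, sorted map of
             column key -> (value, timestamp)."""
             def __init__(self):
                 self.columns = {}
                 self._keys = []   # column keys kept in sorted order

             def put(self, column_key, value, timestamp=None):
                 if column_key not in self.columns:
                     bisect.insort(self._keys, column_key)
                 self.columns[column_key] = (value, timestamp or time.time())

             def slice(self, start, stop):
                 # range scan over the sorted column keys, [start, stop)
                 lo = bisect.bisect_left(self._keys, start)
                 hi = bisect.bisect_left(self._keys, stop)
                 return [(k,) + self.columns[k] for k in self._keys[lo:hi]]

         table = {}   # row key -> Row; only written columns exist (sparse)
         table.setdefault("user:42", Row()).put("name", "Theo")
         table["user:42"].put("city", "Gothenburg")
         print(table["user:42"].slice("a", "z"))
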
  5. “datastructure server”, e.g. Redis [slide diagram: a key maps to a plain value, to a list or set of values, or to a sorted set or hash of keys and values; operations like INCREMENT, APPEND, SLICE, CAS]
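
     A rough sketch with the redis-py client of the operations named on the slide; it assumes a Redis server on localhost, and the keys are made up for illustration.

         import redis

         r = redis.Redis(host="localhost", port=6379)

         r.incr("pageviews")                      # atomic increment of a counter
         r.append("log:today", "GET /index\n")    # append to a string value
         print(r.getrange("log:today", 0, 20))    # slice of a string value
         r.rpush("recent_users", "theo", "anna")  # list
         r.sadd("seen_ips", "10.0.0.1")           # set
         r.zadd("leaderboard", {"theo": 42})      # sorted set: member -> score
         r.hset("user:1", "name", "Theo")         # hash: field -> value
         # check-and-set (CAS) is done with WATCH/MULTI/EXEC transactions
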
  6. complex objects with lists, numbers, strings; secondary indexes* and partial updates; MongoDB, CouchDB, RavenDB, Lotus Notes (* subject to availability)

     {
       "firstName": "John",
       "lastName": "Smith",
       "age": 25,
       "address": {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": "10021"
       },
       "phoneNumber": [
         { "type": "home", "number": "212 555-1234" },
         { "type": "cell", "number": "646 555-4567" }
       ]
     }
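
     As an illustration of the secondary indexes and partial updates mentioned above, a small pymongo sketch using a document like the one on the slide; the database and collection names are made up.

         from pymongo import ASCENDING, MongoClient

         people = MongoClient("mongodb://localhost:27017").demo.people

         people.insert_one({
             "firstName": "John", "lastName": "Smith", "age": 25,
             "address": {"city": "New York", "state": "NY"},
         })
         people.create_index([("address.city", ASCENDING)])  # secondary index on a nested field
         people.update_one({"lastName": "Smith"},             # partial update: only "age" is touched
                           {"$inc": {"age": 1}})
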
  7. DIVERSITY: I haven’t even mentioned search & indexing systems like Solr and Elastic Search, or distributed filesystems.
  8. SOMETIMES TABLES ARE GREAT, TOO: but mostly when you rely heavily on GROUP BY, SUM, AVG, etc. and can’t precompute.
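
     The kind of ad-hoc aggregation the slide refers to, sketched with Python's built-in sqlite3 module; table and column names are invented.

         import sqlite3

         db = sqlite3.connect(":memory:")
         db.execute("CREATE TABLE visits (page TEXT, duration REAL)")
         db.executemany("INSERT INTO visits VALUES (?, ?)",
                        [("/", 1.2), ("/", 3.4), ("/about", 0.5)])

         # GROUP BY with SUM and AVG, computed at query time instead of precomputed
         for row in db.execute("SELECT page, COUNT(*), SUM(duration), AVG(duration) "
                               "FROM visits GROUP BY page"):
             print(row)
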
  9. CAP (consistency, availability, partition tolerance)

  10. OK?

  11. divide the keyspace into shards, or regions (and store each one redundantly) [slide diagram: a keyspace from A to Z divided by data size into shards, each shard stored as several replicas]
  12. split a shard when it grows too big, move one of the new shards onto a new node [slide diagram: one shard is split in two and the new shard, with its replicas, moves to a new node]
  13. in reality there are chunks, tablets or “virtual shards” that are distributed over physical shards [slide diagram: the keyspace is divided into many virtual shards that are mapped onto physical shards]
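
     A toy Python sketch of the range-sharding scheme in slides 11–13: the keyspace is divided into contiguous key ranges, and a shard that grows past a threshold is split so one half can be moved to another node. The split policy and sizes are illustrative.

         import bisect

         class Shards:
             def __init__(self):
                 self.bounds = [""]   # lower bound of each shard, kept sorted
                 self.shards = [{}]   # shard index -> {key: value}

             def _index(self, key):
                 return bisect.bisect_right(self.bounds, key) - 1

             def put(self, key, value, max_size=4):
                 i = self._index(key)
                 self.shards[i][key] = value
                 if len(self.shards[i]) > max_size:
                     self._split(i)

             def _split(self, i):
                 keys = sorted(self.shards[i])
                 mid = keys[len(keys) // 2]   # new boundary in the middle of the range
                 upper = {k: v for k, v in self.shards[i].items() if k >= mid}
                 self.shards[i] = {k: v for k, v in self.shards[i].items() if k < mid}
                 self.bounds.insert(i + 1, mid)
                 self.shards.insert(i + 1, upper)   # this half could now move to a new node

         s = Shards()
         for k in ["ant", "bee", "cat", "dog", "elk", "fox"]:
             s.put(k, 1)
         print(s.bounds)   # ['', 'cat'] after the first split
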
  14. HBASE, MONGODB: sharding is easy in theory, hard in practice; lots of data needs to be moved when adding nodes.
  15. each node is responsible for a range of the keyspace; keys are hashed and mapped to the first following node, and (optionally) replicated to subsequent nodes [slide diagram: a keyspace from 0 to 2^n with nodes placed along it, hash(key) mapping to the next node and replication continuing to the nodes after it]
  16. when a new node is added, only part of the keyspace needs to be moved [slide diagram: the keyspace from 0 to 2^n with a new node inserted between two existing nodes]
  17. in practice, “virtual nodes” are evenly distributed over the keyspace, and then mapped onto physical nodes [slide diagram: the keyspace from 0 to 2^n divided among many virtual nodes]
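
     A compact sketch of the consistent-hashing idea in slides 15–17: every physical node owns many virtual nodes on a hash ring, and a key is stored on the node owning the first virtual node at or after hash(key). Node names and the vnode count are made up.

         import bisect
         import hashlib

         def h(s):
             return int(hashlib.md5(s.encode()).hexdigest(), 16)

         class Ring:
             def __init__(self, nodes, vnodes=64):
                 self.ring = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
                 self.points = [p for p, _ in self.ring]

             def node_for(self, key):
                 # first virtual node at or after hash(key), wrapping around
                 i = bisect.bisect(self.points, h(key)) % len(self.ring)
                 return self.ring[i][1]

         ring = Ring(["node-a", "node-b", "node-c"])
         print(ring.node_for("user:42"))
         # adding a node only claims the keys that hash closest to its virtual
         # nodes; the rest of the keyspace stays where it is
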
  18. GOSSIP, HINTED HANDOFF, LOG STRUCTURED STORAGE, COMPACTION, VECTOR CLOCKS, READ REPAIR, JOURNALING, QUORUMS, EVENTUAL CONSISTENCY, DYNAMO, MAP/REDUCE, 2PC: a few of the things I haven’t mentioned, look them up.
  19. GIVE A LOT OF THOUGHT TO YOUR PRIMARY KEYS: range queries over cleverly designed primary keys can be very powerful, and good keys are required for efficient sharding.
  20. M04L7NOC5NQS M04L7O05MIU2 M04NX42YFUCR M04NYR7VWKJC M04NZA8MJOOA M04NZB88CT14 M04NZPOCE8DM M04NZQ9G2T0S M04NZQE7E5VX M04NZSK4V3JN
      M04NZTRG661R M04NZTSUITJ7 M04NZUAILUS5 M04NZUG4DTXN M04NZWB9VV0C M04NZWW52T8N M04NZX2JEVO9 M04NZX7WD77W M04NZXGOLDEX M04NZXKNQWB3 M04NZXLGJ3M6 M04NZY7GO39G M04NZZ2SQF1I M04O013HN9L9 M04O014DASE6 M04O02PE8AD3 M04O02PGJBR1 M04O03UPTRWG M04O04833ZTL M04O04GH21JF M04O04JQ8B57 M04O04UHK3U4 M04O056QBNBH M04O05E8XO8N M04O069O8CDK M04O06MG47WK M04O07BHELVD M04O07F30WYX M04O0B39DGEA
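
     One way to act on the advice in slide 19 (not the scheme behind the keys shown above): compose the primary key from an entity id and a zero-padded, sortable timestamp, so a single range scan over the sorted keys answers time-window queries. The key layout is an invented convention.

         from datetime import datetime

         def event_key(site_id, ts, seq):
             # components are fixed-width and lexicographically sortable
             return f"{site_id}:{ts:%Y%m%d%H%M%S}:{seq:06d}"

         keys = sorted([
             event_key("site42", datetime(2012, 4, 17, 10, 30), 1),
             event_key("site42", datetime(2012, 4, 17, 11, 0), 2),
             event_key("site42", datetime(2012, 5, 1, 9, 0), 3),
         ])

         # "all events for site42 in April 2012" becomes one range scan
         start, stop = "site42:20120401", "site42:20120501"
         print([k for k in keys if start <= k < stop])
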
  21. DELETING DATA IS NOT TRIVIAL: sometimes delete operations can be more costly than inserts, so design your cleaning process early.
  22. MONGODB: we’re currently pushing around 5K updates/s over three replica sets, each update incrementing up to 20 numbers.
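
     A hedged pymongo sketch of the kind of counter updates described above, using the $inc operator to bump several numbers in one atomic document update; field names and the document id are made up.

         from pymongo import MongoClient

         stats = MongoClient("mongodb://localhost:27017").demo.stats

         stats.update_one(
             {"_id": "site42:2012-04-17"},
             {"$inc": {"pageviews": 1, "visits": 1, "browsers.firefox": 1}},
             upsert=True,   # create the document on the first increment
         )
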
  23. CASSANDRA: low level building blocks, no single point of failure, great horizontal scalability, TTL on values.
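
     A small sketch with the DataStax Python driver of the TTL feature mentioned above; the keyspace, table and schema are assumed to exist and are purely illustrative.

         from datetime import datetime, timezone
         from cassandra.cluster import Cluster

         session = Cluster(["127.0.0.1"]).connect("demo")

         # the inserted values expire (and are later compacted away) after 24 hours
         session.execute(
             "INSERT INTO visits (visitor_id, page, ts) VALUES (%s, %s, %s) USING TTL 86400",
             ("visitor-1", "/index", datetime.now(timezone.utc)),
         )
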
  24. CASSANDRA: we use it to store data about website visits, indexing it to support complex queries.