What?
“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
Thursday, April 15, 2010
Not a single technique
Not a single type of data
Not a single type of use case
What’s out there?

| Store | Storage type | License | Implemented in |
|---|---|---|---|
| Amazon Dynamo | Key/Value | n/a | ? |
| Cassandra | Column family | ASL 2.0 | Java |
| CouchDB | Document | ASL 2.0 | Erlang |
| Dynomite | Key/Value | BSD/MIT-style | Erlang |
| HBase | Column family | ASL 2.0 | Java |
| MongoDB | Document | AGPL v3.0 | C++ |
| Neo4J | Graph | AGPL v3.0 / Commercial | Java |
| Riak | Key/Value | ASL 2.0 | Erlang |
| Redis | Key/Value | BSD/MIT-style | C |
| Scalaris | Key/Value | ASL 2.0 | Erlang |
| Tokyo Cabinet | Key/Value | LGPL | C |
| Voldemort | Key/Value | ASL 2.0 | Java |
Distribution
Columns: Masterless, Master/Slave, Hot standby
Amazon Dynamo X
Cassandra X
CouchDB X
Dynomite X
HBase ?
MongoDB X X
Neo4J*
Riak X
Redis X
Scalaris X
Tokyo Cabinet
Voldemort X
* Neo4J HA coming “soon”
Of the web
“...Django may be built for the Web, but CouchDB is built of the Web. I’ve never seen software that so completely embraces the philosophies behind HTTP. CouchDB makes Django look old-school in the same way that Django makes ASP look outdated” - http://jacobian.org/writing/of-the-web/
Of the web
“...CouchDB may succeed, and it may fail; who knows. I’m sure of one thing, though: this is what the software of the future looks like” - http://jacobian.org/writing/of-the-web/
The Ring
One Ring size to rule them all,
One Ring size to find them,
One Ring size to bring them all
and in the cluster bind them...
Cluster - Read (GET)
[Diagram: Instance A, Instance B and Instance C handling a read. Speech bubbles: “I can haz ?”, “Hey C! I need”, “Okidoki, now where’s he...a yeah in my fourth slice”]
Riak “stuff”
Bucket: Container/keyspace. Determines the number of replicas for its contents.
Consistent Hashing: Key hashing technique used to distribute keys on the ring.
Gossiping: Shares state, bucket and ring knowledge in the cluster.
Hinted Handoff: Covering for a failed “neighbor” node while it is gone.
Links: Allow retrieval of “weakly” linked objects.
Merkle Tree: Data structure for an efficient summary of keys. Gossiped.
Node: One server. Runs vnodes, which claim partitions.
Partition: One slice (part) of the ring.
Read Repair: Auto correction of out-of-date objects.
Replica: Number of copies of the same object in the cluster.
Ring: The complete “space”, divided into partitions which are claimed by vnodes.
Vector Clock: Version control technique used for objects.
Vnode: Runs in a node and claims one partition in the ring.
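To make the ring, partition, replica and vnode entries concrete, here is a minimal sketch of consistent hashing onto a fixed ring. It is a simplification under stated assumptions, not Riak's actual code: the function names, the 64-partition ring, the replica count of 3, the "bucket/key" hashing input and the round-robin claim are all illustrative choices. The only parts taken from the slides are the ideas themselves: keys are hashed onto a ring, the ring is cut into partitions, and each partition is claimed by a vnode running on some node.

```python
import hashlib

RING_SIZE = 2 ** 160    # SHA-1 yields 160-bit values, so the ring has 2**160 positions
NUM_PARTITIONS = 64     # assumed number of partitions for this sketch
N_REPLICAS = 3          # assumed replica count configured on the bucket


def key_position(bucket: str, key: str) -> int:
    """Hash a bucket/key pair to a position on the ring (hypothetical input format)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_for(position: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Return the index of the equally sized partition that contains position."""
    return position * num_partitions // RING_SIZE


def preference_list(bucket, key, num_partitions=NUM_PARTITIONS, n=N_REPLICAS):
    """Partitions holding the replicas: the key's partition plus the next n-1,
    wrapping around the end of the ring (a simplification)."""
    first = partition_for(key_position(bucket, key), num_partitions)
    return [(first + i) % num_partitions for i in range(n)]


def claim_partitions(nodes, num_partitions=NUM_PARTITIONS):
    """Round-robin claim: partition p is served by a vnode on nodes[p % len(nodes)]."""
    return {p: nodes[p % len(nodes)] for p in range(num_partitions)}


owners = claim_partitions(["node-a", "node-b", "node-c"])
replicas = preference_list("users", "alice")
replica_nodes = [owners[p] for p in replicas]   # nodes that would store the object
```

Because neighboring partitions are claimed by different nodes in this layout, the three replicas of an object land on three different servers, which is what lets a neighbor cover for a failed node.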
A document
{
  "_id": "b098445d587b1f347e48e1a79301de02",
  "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
  "platform": {
    "browser": "mozilla",
    "version": "1.9.1.8"
  },
  "timestamp": 1270131033337
}
_id: Key, either you choose it or CouchDB does it for you
_rev: Revision number
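The roles of `_id` and `_rev` can be illustrated with a toy in-memory store. This is a sketch under assumptions, not CouchDB's implementation: `ToyDocumentStore`, `Conflict` and the way the revision digest is computed are hypothetical. It only mirrors the behavior described on the slide: a key is generated when you do not choose one, and every saved version gets a new revision number.

```python
import hashlib
import json
import uuid


class Conflict(Exception):
    """Raised when an update does not carry the current _rev."""


class ToyDocumentStore:
    """In-memory stand-in for a document database (illustrative only)."""

    def __init__(self):
        self._docs = {}

    def save(self, doc: dict) -> dict:
        doc = dict(doc)
        # Either the caller chooses the key, or one is generated.
        doc_id = doc.setdefault("_id", uuid.uuid4().hex)
        current = self._docs.get(doc_id)
        # An update must name the revision it was based on.
        if current is not None and doc.get("_rev") != current["_rev"]:
            raise Conflict(f"stale revision for {doc_id}")
        generation = 1 if current is None else int(current["_rev"].split("-")[0]) + 1
        body = {k: v for k, v in doc.items() if k != "_rev"}
        # The digest function is an arbitrary choice for this sketch.
        encoded = json.dumps(body, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(encoded).hexdigest()[:32]
        doc["_rev"] = f"{generation}-{digest}"
        self._docs[doc_id] = doc
        return dict(doc)

    def get(self, doc_id: str) -> dict:
        return dict(self._docs[doc_id])


store = ToyDocumentStore()
first = store.save({"platform": {"browser": "mozilla", "version": "1.9.1.8"}})
second = store.save({**first, "timestamp": 1270131033337})
```

Saving `first` again after `second` exists raises `Conflict`, because `first` still carries the older revision.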
Views
Views are stored on disk, exposed as an accessible web resource, incrementally updated, and replicated along with the database
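What "incrementally updated" means can be sketched as follows. This is a toy model, not CouchDB's view engine: `ToyView`, the `(seq, doc)` change format and the `by_browser` map function are hypothetical names introduced for illustration. The idea shown is that the index remembers how far it has read in the list of changes and only re-runs the map function for documents that changed since then.

```python
class ToyView:
    """A map-only view that indexes just the changes it has not seen yet."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.rows_by_doc = {}   # doc id -> rows emitted for its latest version
        self.last_seq = 0       # highest change sequence already indexed

    def update(self, changes):
        """Index changes newer than last_seq; return how many were processed.

        changes is an iterable of (seq, doc) pairs in increasing seq order.
        """
        processed = 0
        for seq, doc in changes:
            if seq <= self.last_seq:
                continue  # already reflected in the index
            if doc.get("_deleted"):
                self.rows_by_doc.pop(doc["_id"], None)
            else:
                # Replace the rows of the previous version of this document.
                self.rows_by_doc[doc["_id"]] = list(self.map_fn(doc))
            self.last_seq = seq
            processed += 1
        return processed

    def query(self):
        """Return (key, value, doc_id) rows ordered by key, then doc id."""
        rows = [(k, v, doc_id)
                for doc_id, emitted in self.rows_by_doc.items()
                for k, v in emitted]
        return sorted(rows, key=lambda r: (r[0], r[2]))


def by_browser(doc):
    """Example map function: emit one row per document, keyed by browser."""
    if "platform" in doc:
        yield doc["platform"]["browser"], 1


view = ToyView(by_browser)
changes = [
    (1, {"_id": "a", "platform": {"browser": "mozilla"}}),
    (2, {"_id": "b", "platform": {"browser": "webkit"}}),
]
view.update(changes)
```

Calling `view.update(changes)` a second time does no work, since both sequence numbers are already covered by `last_seq`.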
CouchDB “stuff”
MVCC: Multi-version concurrency control. Writers do not block readers. Readers do not block writers.
Append only: Hence, won’t corrupt its data files.
Compaction: Append-only writes cause data files to grow. Compaction to the rescue, in the background, for your pleasure.
ACID: Awesome, Cool, Impressive, Dope
BDCRR: Bi-directional, conflict-resolving replication.
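The relationship between append-only storage and compaction can be sketched with a small log-structured store. This is an illustration under assumptions, not CouchDB's file format: `AppendOnlyStore`, the one-JSON-record-per-line layout and the `.compact` suffix are choices made for this example. Updates never overwrite existing bytes, so old versions pile up until compaction copies only the latest version of each document into a fresh file and swaps it in.

```python
import json
import os


class AppendOnlyStore:
    """Toy store that only ever appends records to its data file."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist

    def put(self, doc_id, doc):
        # A new version is appended; earlier versions stay untouched on disk.
        record = json.dumps({"id": doc_id, "doc": doc})
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(record + "\n")

    def _latest(self):
        # Later records win, so a full scan yields the newest version per id.
        latest = {}
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                latest[record["id"]] = record["doc"]
        return latest

    def get(self, doc_id):
        return self._latest()[doc_id]

    def compact(self):
        # Write the live versions to a new file, then swap it in atomically.
        latest = self._latest()
        tmp_path = self.path + ".compact"
        with open(tmp_path, "w", encoding="utf-8") as f:
            for doc_id, doc in latest.items():
                f.write(json.dumps({"id": doc_id, "doc": doc}) + "\n")
        os.replace(tmp_path, self.path)
```

Readers in this sketch rescan the file on every `get`, which keeps the code short; a real store would keep an index instead.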