NoSQL Overview

NoSQL
Mårten Gustafson
Progressive.Net @ Valtech Stockholm 2011-05-12
Thursday, May 12, 2011

Not Only SQL

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique
• Not one type of data
• Not one type of use case

The way I see it

Families
• Graph
• Key/Value
• Document

Flavors
• Stand-alone
  • Isolated instances are common
  • Might have slave replication
• Distributed
  • “Cluster” is default mode of operation
  • No master node
  • Multiple nodes may fail without interrupting service (implies storage distribution)
• Embedded

Disclaimer
I don’t know every product by heart
I’m trying to illustrate my broad reasoning
NoSQL tends to be crammed with religious zealots

Example of my reasoning (Flavor / Family)
• Stand-alone: Neo4J (graph), Redis (key/value), CouchDB and MongoDB (document)
• Distributed: Riak and Voldemort (key/value), Cassandra
• Embedded: Neo4J (graph), Tokyo Cabinet and LevelDB (key/value)

priorities & trade-offs
(No)SQL for me is very much about trade-offs

Does it fit with current data structures?
Don’t underestimate the exercise of making your data “fit” a certain NoSQL product

Ease of adoption?
Client libraries? Does it require driver libraries?

Indices or some sort of search?
What access patterns do you have today? Tomorrow? What kind of reports will customers or management require?

Does it speak HTTP?
For us at Hitta.se this is important since almost everything we do is HTTP based

Availability and redundancy?
What kinds of availability? How does it handle node failures? Network partitions?

Can you monitor it?
How and with what?

Performance?
Does performance scale with additional nodes?

Ease of scaling in and out?
What’s required to add additional nodes? How do you remove a node temporarily or permanently?

Commercial support available?

Does it run on your preferred OS?

Is it properly packaged?
Proper install packages? Sane defaults in terms of service accounts and privileges?

Do you understand it?
Don’t underestimate this

Can you kill it without losing data?
Is your data really durable on disk (assuming that’s what you need)?
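
One way to make the "kill it" question concrete is a toy append-only log that forces each record to disk before returning. This is a minimal sketch, not how any particular product does it: the file layout, the `append_record` and `replay` names, and the JSON-lines format are all assumptions made for illustration. Note that `os.fsync` only asks the OS to flush; whether the device honors it depends on hardware and filesystem.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one JSON record as a line and ask the OS to persist it."""
    line = json.dumps(record, sort_keys=True) + "\n"
    with open(path, "a", encoding="utf-8") as f:
        f.write(line)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push its cache to the device


def replay(path):
    """Rebuild the latest value per key by reading the log from the start."""
    state = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                break  # stop at a torn final line left by a crash mid-write
            state[record["key"]] = record["value"]
    return state


def demo():
    """Write three records to a fresh log and rebuild the state from disk."""
    path = os.path.join(tempfile.mkdtemp(), "store.log")
    append_record(path, {"key": "a", "value": 1})
    append_record(path, {"key": "a", "value": 2})
    append_record(path, {"key": "b", "value": 3})
    return replay(path)
```

Because existing bytes are never rewritten, a kill in the middle of a write can at worst leave a partial last line, which `replay` skips.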

For example...
• I work at Hitta.se
• We love availability
• We like “easy” scalability

availability + scalability =
• multi-master
• storage distribution
• replication
• add & remove nodes
• tune behavior per use case

availability + scalability = ?

availability + scalability = Riak & CouchDB
For us, so far, the answer has been Riak & CouchDB
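
"Storage distribution" and "add & remove nodes" are often handled with consistent hashing, the technique described in the Dynamo paper. The sketch below is a toy ring to show the idea, not Riak's implementation: the `HashRing` class, the node names, and the choice of 64 virtual nodes are assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        """Place `vnodes` points for the node on the ring."""
        for i in range(self.vnodes):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        """Drop every point owned by the node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        """Return the owner of the first point clockwise from the key's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])  # hypothetical node names
owner = ring.node_for("user:42")
```

The property that matters for scaling in and out: when a node joins, the only keys that change owner are the ones the new node takes over, so the rest of the data stays where it is.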

Riak
• Dynamo-inspired key/value store
• Data that must be available as soon as possible on all nodes
• Data that require storage distribution

CouchDB
• Document database
• Data that changes less frequently and is ok to replicate “manually”
• Data that might be local to a single node
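
"Dynamo inspired" usually implies tunable quorums: N replicas per key, R replicas that must answer a read, W replicas that must acknowledge a write. The sketch below only illustrates the arithmetic under a strict-quorum assumption (Dynamo's sloppy quorum and hinted handoff are out of scope); the function names are made up for this example.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """True when every read quorum shares a replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("need 1 <= r <= n and 1 <= w <= n")
    return r + w > n


def overlap_by_enumeration(n, r, w):
    """Brute-force check of the same property, only practical for small n."""
    replicas = range(n)
    return all(
        set(reads) & set(writes)
        for reads in combinations(replicas, r)
        for writes in combinations(replicas, w)
    )


def tolerated_failures(n, r, w):
    """Replicas that may be down while reads / writes still reach quorum."""
    return {"reads": n - r, "writes": n - w}
```

With N=3, R=2, W=2 the quorums overlap and one replica may be down for both reads and writes. Dropping to R=1, W=1 tolerates two failures but gives up the overlap, which is the kind of per-use-case tuning the trade-off is about.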

Common factors Riak & CouchDB
• Good packaging
• Monitorable (lots of stats)
• Easy configuration
• Reliable (append-only disk structures)
• HTTP API
• They embrace HTTP
• Able to store and serve complete web apps

All your webish skillz and tools apply...
• proxies
• load balancers
• caches
• HTTP client libs (ETag, If-Modified-Since, etc)
• language-, platform- and OS-neutral
• MIME / Content-Type
Important for us as it requires no “drivers” and allows us to serve binary+mime
No, I don’t like WS-*
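
As an illustration of "webish tools apply", here is a conditional GET using nothing but the Python standard library. The server is a local stand-in, not Riak or CouchDB; the `/doc` path, the document body, and the way the ETag is derived are assumptions for the example. It uses `If-None-Match`, the request header that pairs with `ETag` (`If-Modified-Since` is the date-based equivalent).

```python
import hashlib
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

DOCUMENT = b'{"name": "example"}'  # hypothetical stored document
ETAG = '"' + hashlib.sha256(DOCUMENT).hexdigest()[:16] + '"'


class DocHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # If the client already holds the current version, send no body.
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(DOCUMENT)))
        self.send_header("ETag", ETAG)
        self.end_headers()
        self.wfile.write(DOCUMENT)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, etag=None):
    """GET /doc, optionally conditional; returns (status, etag, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        headers = {"If-None-Match": etag} if etag else {}
        conn.request("GET", "/doc", headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.getheader("ETag"), resp.read()
    finally:
        conn.close()


def demo():
    """Fetch once, then fetch again with the ETag from the first response."""
    server = HTTPServer(("127.0.0.1", 0), DocHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        first = fetch(port)
        second = fetch(port, etag=first[1])
        return first, second
    finally:
        server.shutdown()
        server.server_close()
```

The first request returns 200 with the body; the second returns 304 with an empty body. Any cache or proxy that understands HTTP can take advantage of the same mechanism without a database-specific driver.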

Go do!
• Test one or more NoSQL thingys
• Get familiar with Brewer’s CAP theorem
• Get familiar with the Dynamo paper

Thx.
Mårten Gustafson
@martengustafson
http://marten.gustafson.pp.se/
