NoSQL Overview

A high-level overview of my thoughts on approaching the NoSQL space. Given at the Progressive.NET meetup hosted by Valtech Stockholm.

Approx 60 minutes.

Mårten Gustafson

May 12, 2011

Transcript

  1. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia

    Thursday, May 12, 2011
  4. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia

    • Not one single technique
    • Not one type of data
    • Not one type of use case
  6. Flavors

    • Stand-alone
      • Isolated instances are common
      • Might have slave replication
    • Distributed
      • “Cluster” is default mode of operation
      • No master node
      • Multiple nodes may fail without interrupting service (implies storage distribution)
    • Embedded
  7. Disclaimer

    I don’t know any and all products by heart.
    I’m trying to illustrate my broad reasoning.

    NoSQL tends to be crammed with religious zealots.
  8. Example of my reasoning

    Flavor / Family: Graph, Key/Value, Document

    Stand alone: Neo4J, Redis, CouchDB, MongoDB
    Distributed: Riak, Voldemort, Cassandra
    Embedded: Neo4J, Tokyo Cabinet, LevelDB
  12. Does it fit with current data structures?

    Don’t underestimate the exercise of making your data “fit” a certain NoSQL product.
  13. Indices or some sort of search?

    What access patterns do you have today? Tomorrow? What kind of reports will customers or management require?
  14. Does it speak HTTP?

    For us at Hitta.se this is important, since almost everything we do is HTTP-based.
  15. Availability and redundancy?

    What kinds of availability? How does it handle node failures? Network partitions?
  16. Ease of scaling in and out?

    What’s required to add additional nodes? How do you remove a node temporarily or permanently?
  17. Is it properly packaged?

    Proper install packages? Sane defaults in terms of service accounts and privileges?
  18. Can you kill it without losing data?

    Is your data really durable on disk, assuming that’s what you need?
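
To make the durability question concrete, here is a minimal Python sketch of what "durable on disk" means at the file level: a write is only expected to survive a hard kill or power loss once it has been flushed out of the process buffer and synced by the operating system. The function name `durable_append` and the one-record-per-line log layout are hypothetical and only for illustration; how a particular NoSQL product handles syncing is something to check in its own documentation.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> None:
    """Append one record and ask for it to reach stable storage before returning."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()             # move Python's userspace buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to the device


if __name__ == "__main__":
    # Hypothetical log file in a temporary directory.
    log_path = os.path.join(tempfile.mkdtemp(), "data.log")
    durable_append(log_path, b"key=value")
    with open(log_path, "rb") as f:
        print(f.read())
```

A store that skips the sync step will usually be faster, but may lose recent writes when it is killed, which is the trade-off the question above is probing.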
  19. For example...

    • I work at Hitta.se
    • We love availability
    • We like “easy” scalability
  21. availability + scalability = Riak & CouchDB

    For us, so far, the answer has been Riak & CouchDB.
  25. Riak: Dynamo inspired key / value store

    • Data that must be available as soon as possible on all nodes
    • Data that require storage distribution

    CouchDB: Document database

    • Data that changes less frequently and is ok to replicate “manually”
    • Data that might be local to a single node
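
The split between the two stores on this slide can be read as a small decision rule. The sketch below restates it in Python; `DataProfile`, its field names and `choose_store` are hypothetical names made up for illustration, and the rule only mirrors the reasoning on the slide, not a general recommendation.

```python
from dataclasses import dataclass


@dataclass
class DataProfile:
    """Hypothetical description of one kind of data an application stores."""
    needed_on_all_nodes_asap: bool    # must be available on all nodes as soon as possible
    needs_storage_distribution: bool  # storage must be spread across nodes


def choose_store(profile: DataProfile) -> str:
    """Pick a store following the slide's reasoning (an assumption, not a rule)."""
    if profile.needed_on_all_nodes_asap or profile.needs_storage_distribution:
        return "Riak"
    # Changes less frequently, ok to replicate "manually", may live on one node.
    return "CouchDB"


if __name__ == "__main__":
    print(choose_store(DataProfile(True, False)))
    print(choose_store(DataProfile(False, False)))
```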
  27. All your webish skillz and tools apply...

    • proxies
    • load balancers
    • caches
    • HTTP client libs (etag, if-modified-since, etc)
    • language-, platform- and OS-neutral
    • MIME / Content-Type

    Important for us as it requires no “drivers” and allows us to serve binary+mime. No, I don’t like WS-*.
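
As an illustration of why plain HTTP tooling is attractive, the following Python sketch shows a conditional GET using `ETag` and `If-None-Match` with nothing but the standard library. The tiny in-process server stands in for an HTTP-speaking data store; the stored document, the `/doc` path and the names `respond`, `StoreHandler` and `fetch` are all hypothetical and do not represent the API of Riak or CouchDB.

```python
import hashlib
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

DOCUMENT = b'{"name": "example"}'  # hypothetical stored value
ETAG = '"' + hashlib.sha1(DOCUMENT).hexdigest() + '"'


def respond(if_none_match):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    headers = {"ETag": ETAG, "Content-Type": "application/json"}
    if if_none_match == ETAG:
        return 304, headers, b""  # client copy is still valid, send no body
    return 200, headers, DOCUMENT


class StoreHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = respond(self.headers.get("If-None-Match"))
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        if body:
            self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(url, etag=None):
    """GET a URL, optionally revalidating a cached copy; returns (status, body, etag)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    try:
        with opener.open(request) as response:
            return response.status, response.read(), response.headers["ETag"]
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return 304, b"", err.headers["ETag"]
        raise


if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 0), StoreHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/doc" % server.server_address[1]
    try:
        status, body, etag = fetch(url)
        print(status, body)
        print(fetch(url, etag=etag)[0])
    finally:
        server.shutdown()
        server.server_close()
```

The same revalidation logic is what proxies and caches placed in front of such a store can rely on, without any store-specific driver.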
  28. Common factors: Riak & CouchDB

    Able to store and serve complete web apps
  29. Go do!

    • Test one or more NoSQL thingys
    • Get familiar with Brewer’s CAP theorem
    • Get familiar with the Dynamo paper