NoSQL Overview

A high-level overview of my thoughts on approaching the NoSQL space. Given at the Progressive.NET meetup hosted by Valtech Stockholm.

Approx 60 minutes.

Mårten Gustafson

May 12, 2011

Transcript

  1. NoSQL. Mårten Gustafson, Progressive.Net @ Valtech Stockholm, 2011-05-12. Thursday, May 12, 2011
  2. Not Only SQL
  3. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
  4. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
    • Not one single technique
  5. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
    • Not one single technique
    • Not one type of data
  6. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
    • Not one single technique
    • Not one type of data
    • Not one type of use case
  7. The way I see it
  8. Families
    • Graph
    • Key/Value
    • Document
  9. Flavors
    • Stand-alone
    • Distributed
    • Embedded
  10. Flavors
    • Stand-alone
      • Isolated instances are common
      • Might have slave replication
    • Distributed
    • Embedded
  11. Flavors
    • Stand-alone
      • Isolated instances are common
      • Might have slave replication
    • Distributed
      • “Cluster” is default mode of operation
      • No master node
      • Multiple nodes may fail without interrupting service (implies storage distribution)
    • Embedded
  12. Disclaimer: I don’t know any and all products by heart. I’m trying to illustrate my broad reasoning.
    NoSQL tends to be crammed with religious zealots.
  13. Example of my reasoning
    Families (columns): Graph, Key/Value, Document
    Flavors (rows):
    • Stand alone: Neo4J, Redis, CouchDB, MongoDB
    • Distributed: Riak, Voldemort, Cassandra
    • Embedded: Neo4J, Tokyo Cabinet, LevelDB
  17. priorities & trade-offs
    (No)SQL for me is very much about trade-offs
  18. Does it fit with current data structures?
    Don’t underestimate the exercise of making your data “fit” a certain NoSQL product
  19. Ease of adoption?
    Client libraries? Does it require driver libraries?
  20. Indices or some sort of search?
    What access patterns do you have today? Tomorrow? What kind of reports will customers or management require?
  21. Does it speak HTTP?
    For us at Hitta.se this is important since almost everything we do is HTTP based
  22. Availability and redundancy?
    What kinds of availability? How does it handle node failures? Network partitions?
  23. Can you monitor it?
    How and with what?
  24. Performance?
    Does performance scale with additional nodes?
  25. Ease of scaling in and out?
    What’s required to add additional nodes? How do you remove a node temporarily or permanently?
  26. Commercial support available?
  27. Does it run on your preferred OS?
  28. Is it properly packaged?
    Proper install packages? Sane defaults in terms of service accounts and privileges?
  29. Do you understand it?
    Don’t underestimate this
  30. Can you kill it without losing data?
    Is your data really durable on disk, assuming that’s what you need?
  31. For example...
    • I work at Hitta.se
    • We love availability
    • We like “easy” scalability
  33. availability + scalability
  34. availability + scalability = multi-master
  35. availability + scalability = storage distribution
  36. availability + scalability = replication
  37. availability + scalability = add & remove nodes
  38. availability + scalability = tune behavior per use case
  39. availability + scalability = ?
  40. availability + scalability = Riak & CouchDB
    For us, so far, the answer has been Riak & CouchDB
  41. Riak | CouchDB
  42. Riak: Dynamo inspired key / value store
    CouchDB
  43. Riak: Dynamo inspired key / value store
    CouchDB: Document database
  44. Riak: Dynamo inspired key / value store
      • Data that must be available as soon as possible on all nodes
    CouchDB: Document database
  45. Riak: Dynamo inspired key / value store
      • Data that must be available as soon as possible on all nodes
    CouchDB: Document database
      • Data that changes less frequently and is OK to replicate “manually”
  46. Riak: Dynamo inspired key / value store
      • Data that must be available as soon as possible on all nodes
      • Data that require storage distribution
    CouchDB: Document database
      • Data that changes less frequently and is OK to replicate “manually”
  47. Riak: Dynamo inspired key / value store
      • Data that must be available as soon as possible on all nodes
      • Data that require storage distribution
    CouchDB: Document database
      • Data that changes less frequently and is OK to replicate “manually”
      • Data that might be local to a single node
  48. Common factors Riak & CouchDB: Good packaging
  49. Common factors Riak & CouchDB: Monitorable (lots of stats)
  50. Common factors Riak & CouchDB: Easy configuration
  51. Common factors Riak & CouchDB: Reliable
    Append-only disk structures
  52. Common factors Riak & CouchDB: HTTP API
  53. Common factors Riak & CouchDB: They embrace HTTP
  54. All your webish skillz and tools apply...
    Important for us as it requires no “drivers” and allows us to serve binary+mime. No, I don’t like WS-*.
  55. All your webish skillz and tools apply...
    • proxies
    • load balancers
    • caches
    • HTTP client libs (etag, if-modified-since, etc)
    • language-, platform- and OS-neutral
    • MIME / Content-Type
    Important for us as it requires no “drivers” and allows us to serve binary+mime. No, I don’t like WS-*.
  56. Common factors Riak & CouchDB: Able to store and serve complete web apps
  57. Go do!
    • Test one or more NoSQL thingys
    • Get familiar with Brewer’s CAP theorem
    • Get familiar with the Dynamo paper
  58. Thx. Mårten Gustafson, @martengustafson, http://marten.gustafson.pp.se/, marten.gustafson@gmail.com
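
Slides 30 and 51 ask whether data survives a kill and mention append-only disk structures. A minimal sketch of that idea follows, assuming a toy single-file log of JSON lines. The file layout and the names `append_record` and `load` are hypothetical, and this is not how Riak or CouchDB store data internally. It only illustrates why appending plus an explicit `fsync` makes a crash easier to reason about than in-place updates.

```python
import json
import os
import tempfile


def append_record(path, key, value):
    """Append one JSON record and force it to disk before returning."""
    line = json.dumps({"key": key, "value": value}) + "\n"
    with open(path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()             # push Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to push its cache to the device


def load(path):
    """Rebuild state by replaying the log; the last write per key wins."""
    state = {}
    if not os.path.exists(path):
        return state
    with open(path, encoding="utf-8") as log:
        for line in log:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                break  # torn final line from a crash mid-write: ignore it
            state[record["key"]] = record["value"]
    return state


def demo():
    """Write two versions of one key, then replay the log."""
    path = os.path.join(tempfile.mkdtemp(), "store.log")
    append_record(path, "city", "Stockholm")
    append_record(path, "city", "Uppsala")
    return load(path)
```

Existing records are never rewritten, so a process killed mid-write can at worst leave one incomplete line at the end of the file, which `load` skips.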
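
Slide 38 mentions tuning behavior per use case, and slide 57 points to the Dynamo paper. One common reading of that tuning in Dynamo-style stores is the choice of replica count N, read quorum R and write quorum W. That reading is an assumption, since the slides do not spell it out. The sketch below uses hypothetical helper names and strict quorums only, to show the arithmetic behind the trade-off.

```python
def quorums_overlap(n, r, w):
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def tolerated_failures(n, r, w):
    """Replicas that may be down while both reads and writes can still reach quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return n - max(r, w)


def describe(n, r, w):
    """Summarize one N/R/W choice as a small dict."""
    return {
        "overlap": quorums_overlap(n, r, w),
        "tolerated_failures": tolerated_failures(n, r, w),
    }
```

With N=3, choosing R=2 and W=2 gives overlapping quorums and tolerates one failed replica. R=1 and W=1 tolerates two failures but a read may miss the latest write.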
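
Slide 55 lists HTTP client libs with etag and if-modified-since among the web tools that apply to an HTTP-speaking store. A minimal sketch of a conditional GET follows, using only the Python standard library. The tiny in-process server is a hypothetical stand-in for any HTTP data store, not Riak's or CouchDB's actual API, and the path `/docs/example` and the stored body are made up for illustration.

```python
import hashlib
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b'{"name": "example document"}'  # hypothetical stored value
ETAG = '"' + hashlib.sha1(BODY).hexdigest() + '"'


class StoreHandler(BaseHTTPRequestHandler):
    """Stand-in store: serves one document and honors If-None-Match."""

    def do_GET(self):
        # Simplification: exact match only, no weak validators or ETag lists.
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(host, port, path, etag=None):
    """GET a resource, optionally revalidating a cached copy by ETag."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        headers = {"If-None-Match": etag} if etag else {}
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        return response.status, response.getheader("ETag"), response.read()
    finally:
        conn.close()


def demo():
    """Fetch once, then revalidate with the ETag from the first response."""
    server = HTTPServer(("127.0.0.1", 0), StoreHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        first = fetch("127.0.0.1", server.server_port, "/docs/example")
        second = fetch("127.0.0.1", server.server_port, "/docs/example", etag=first[1])
        return first, second
    finally:
        server.shutdown()
        server.server_close()
```

The second request carries the ETag from the first. The server answers 304 with an empty body, so the client can keep using its cached copy. Ordinary proxies and caches between client and store can take part in the same exchange.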