Approaching and evaluating NoSQL

Brown bag lunch presentation at TUI / Fritidsresor about approaching and evaluating the NoSQL area.

Approx 60 minutes.

Mårten Gustafson

August 31, 2011

Transcript

  1. NoSQL Mårten Gustafson Brown bag lunch @ TUI / Fritidsresor

    Stockholm 2011-08-31 Wednesday, August 31, 2011
  2. Not Only SQL

  3. Disclaimer: I don’t know any and all products by heart. I’m trying to illustrate my broad reasoning.

    NoSQL tends to be crammed with religious zealots.
  4. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
  5. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia • Not one single technique
  6. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia • Not one single technique • Not one type of data
  7. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia • Not one single technique • Not one type of data • Not one type of use case
  8. “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia • Not one single technique • Not one type of data • Not one type of use case
  9. Example reasoning: “...so our RDBMS is a SPoF that won’t scale”
  10. Example reasoning: “...so our RDBMS is a SPoF that won’t scale” • move to? Multi-master
  11. Example reasoning: “...so our RDBMS is a SPoF that won’t scale” • move to? Multi-master • sacrifices? Atomicity
  12. Example reasoning: “...so our RDBMS is a SPoF that won’t scale” • move to? Multi-master • sacrifices? Atomicity • gives? Distribution / replication
  13. Example reasoning: “...so our RDBMS is a SPoF that won’t scale” • move to? Multi-master • sacrifices? Atomicity • gives? Distribution / replication • requires? De-normalization
  14. Example reasoning: “...so our RDBMS is a SPoF that won’t scale” • move to? Multi-master • sacrifices? Atomicity • gives? Distribution / replication • requires? De-normalization • dissolves? Integrity
  15. Example reasoning: “...so our RDBMS is a SPoF that won’t scale” • move to? Multi-master • sacrifices? Atomicity • gives? Distribution / replication • requires? De-normalization • dissolves? Integrity • toughens? Querying
  16. There is no free lunch.

  17. Key aspects

  18. Key aspects • Type of data

  19. Key aspects • Type of data • Graph
  20. Key aspects • Type of data • Graph • Documents & key/value-pairs
  21. Key aspects • Type of data • Graph • Documents & key/value-pairs • Operational aspect
  22. Key aspects • Type of data • Graph • Documents & key/value-pairs • Operational aspect • Resilient
  23. Key aspects • Type of data • Graph • Documents & key/value-pairs • Operational aspect • Resilient • Atomic
  24. Example (matrix of Atomic / Resilient against type of data) • Graph: Neo4J, InfiniteGraph, ? • Document & key/value-pairs: CouchDB, RavenDB, Redis, Riak, Voldemort, Cassandra
  25. Sample cases

  26. Case A: Data from user • Data set... • ...mix of binary & JSON (access by known key) • ...replicated to all nodes on best effort • ...allows read & write on any node • individual nodes may be unavailable
  27. Case A: Data from user (matrix of Atomic / Resilient against type of data) • Graph: Neo4J, InfiniteGraph, ? • Document & key/value-pairs: CouchDB, RavenDB, Redis, Riak, Voldemort, Cassandra
  28. Case A: Data from user • Riak due to... • outstanding packaging & community • simple on both server and workstations • serious and dedicated engineering team • Lots of statistics • HTTP API • Everything’s tunable (much per request)
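
The "tunable per request" point in Case A can be illustrated with a toy model. The sketch below is not Riak's API; `Node`, `Cluster`, and the `w` / `r` arguments are hypothetical names chosen for illustration. It assumes every key is replicated to all nodes on a best-effort basis and that each request states how many nodes must acknowledge a write (`w`) or answer a read (`r`).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One storage node; 'up' toggles simulated availability."""
    up: bool = True
    data: dict = field(default_factory=dict)


class Cluster:
    """Toy cluster replicating every key to all nodes (hypothetical, not Riak's API)."""

    def __init__(self, size):
        self.nodes = [Node() for _ in range(size)]
        self.clock = 0  # version counter standing in for real conflict handling

    def put(self, key, value, w):
        """Best-effort write to every reachable node; True if at least w acknowledged.

        Nothing is rolled back when fewer than w nodes accept the write.
        """
        self.clock += 1
        acks = 0
        for node in self.nodes:
            if node.up:
                node.data[key] = (self.clock, value)
                acks += 1
        return acks >= w

    def get(self, key, r):
        """Return the newest value if at least r reachable nodes hold the key."""
        replies = [n.data[key] for n in self.nodes if n.up and key in n.data]
        if len(replies) < r:
            return None
        return max(replies, key=lambda reply: reply[0])[1]


cluster = Cluster(3)
cluster.nodes[2].up = False                             # one node is unavailable
relaxed = cluster.put("user:1", {"name": "x"}, w=2)     # True: two nodes acknowledged
strict = cluster.put("user:1", {"name": "y"}, w=3)      # False: only two nodes reachable
latest = cluster.get("user:1", r=1)                     # {"name": "y"}: the write still landed
```

The same data set answers differently depending on the per-request numbers, which is the trade-off between availability and stricter guarantees the case describes.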
  29. Case B: configuration • Data set... • ...is JSON (access by alternating keys) • ...on all nodes • ...seldom updated (read only) • ...consistent and replicated on demand
  30. Case B: configuration (matrix of Atomic / Resilient against type of data) • Graph: Neo4J, InfiniteGraph, ? • Document & key/value-pairs: CouchDB, RavenDB, Redis, Riak, Voldemort, Cassandra
  31. Case B: configuration • CouchDB due to... • simple on both server and workstations • HTTP API • Views for alternating keys • Lots of statistics • Dead simple “one-click” push replication
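
"Views for alternating keys" can be pictured as a map function that emits an alternative key for each document, from which an index is built. CouchDB views are written as JavaScript map functions; the Python below is only a model of the idea, and the document fields (`hostname`, `role`) and function names are hypothetical.

```python
def by_role(doc):
    """Map function: emit (key, value) pairs for documents that have a role."""
    if "role" in doc:
        yield doc["role"], doc["_id"]


def by_hostname(doc):
    """Map function: emit the hostname as an alternative lookup key."""
    if "hostname" in doc:
        yield doc["hostname"], doc["_id"]


def build_view(docs, map_fn):
    """Run the map function over every document and collect an index by key."""
    index = {}
    for doc in docs:
        for key, value in map_fn(doc):
            index.setdefault(key, []).append(value)
    return index


# Hypothetical configuration documents, stored under their primary "_id" key.
DOCS = [
    {"_id": "cfg-1", "hostname": "web01", "role": "frontend"},
    {"_id": "cfg-2", "hostname": "web02", "role": "frontend"},
    {"_id": "cfg-3", "hostname": "db01", "role": "database"},
]

role_view = build_view(DOCS, by_role)
host_view = build_view(DOCS, by_hostname)
frontends = role_view["frontend"]   # ["cfg-1", "cfg-2"]
db_config = host_view["db01"]       # ["cfg-3"]
```

Each view gives the same documents a different access key without changing how they are stored, which suits data that is read often and seldom updated.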
  32. priorities & trade-offs

    (No)SQL for me is very much about trade-offs
  33. does it fit with current data structures?

    Don’t underestimate the exercise of making your data “fit” a certain NoSQL product
  34. ease of adoption?

    Client libraries? Does it require driver libraries?
  35. search and reporting?

    What access patterns do you have today? Tomorrow? What kind of reports will customers or management require?
  36. does it speak HTTP?

    For us at Hitta.se this is important since almost everything we do is HTTP based
  37. availability and redundancy?

    What kinds of availability? How does it handle node failures? Network partitions?
  38. can you monitor it?

    How and with what?
  39. performance?

    Out-of-the-box vs. vertical vs. horizontal?
  40. ease of scaling in and out?

    What’s required to add additional nodes? How do you remove a node temporarily or permanently?
  41. commercial support available?

    And how is the community?
  42. does it run on your preferred OS?

  43. is it properly packaged?

    Proper install packages? Sane defaults in terms of service accounts and privileges?
  44. do you understand it?

    Don’t underestimate this
  45. is your data safe?

    Is your data really durable on disk, assuming that’s what you need?
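
What "durable on disk" means at the lowest level can be sketched as follows: a write only counts once the operating system has been asked to flush it to the device. The helper name `durable_write` is hypothetical, and the sketch makes no claim about how any particular product persists data. It writes to a temporary file, calls `os.fsync`, and then renames the file into place.

```python
import os
import tempfile


def durable_write(path, payload: bytes):
    """Write payload to path so that a completed call has been flushed to disk."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()              # push Python's buffer to the OS
            os.fsync(handle.fileno())   # ask the OS to push its cache to the device
        os.replace(tmp_path, path)      # swap the finished file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A store that skips the flush step can acknowledge writes faster but may lose them on a crash, so it is worth checking which behaviour a product defaults to.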
  46. there is no universal solution

  47. Thx. Mårten Gustafson @martengustafson http://marten.gustafson.pp.se/ marten.gustafson@gmail.com