Approaching and evaluating NoSQL

Brown bag lunch presentation at TUI / Fritidsresor about approaching and evaluating the NoSQL area.

Approx 60 minutes.

Mårten Gustafson

August 31, 2011

Transcript

  1. NoSQL
    Mårten Gustafson
    Brown bag lunch @ TUI / Fritidsresor Stockholm
    2011-08-31
    Wednesday, August 31, 2011

  2. Not Only SQL

  3. Disclaimer
    I don’t know every product by heart
    I’m trying to illustrate my broad reasoning
    NoSQL tends to be crammed with religious zealots

  4. “NoSQL is a movement promoting a loosely
    defined class of non-relational data stores
    that break with a long history of relational
    databases” - Wikipedia

  5. “NoSQL is a movement promoting a loosely
    defined class of non-relational data stores
    that break with a long history of relational
    databases” - Wikipedia
    • Not one single technique

  6. “NoSQL is a movement promoting a loosely
    defined class of non-relational data stores
    that break with a long history of relational
    databases” - Wikipedia
    • Not one single technique
    • Not one type of data

  7. “NoSQL is a movement promoting a loosely
    defined class of non-relational data stores
    that break with a long history of relational
    databases” - Wikipedia
    • Not one single technique
    • Not one type of data
    • Not one type of use case

  8. “NoSQL is a movement promoting a loosely
    defined class of non-relational data stores
    that break with a long history of relational
    databases” - Wikipedia
    • Not one single technique
    • Not one type of data
    • Not one type of use case

  9. Example reasoning
    “...so our RDBMS is a SPoF that won’t scale”

  10. Example reasoning
    “...so our RDBMS is a SPoF that won’t scale”
    move to? Multi-master

  11. Example reasoning
    “...so our RDBMS is a SPoF that won’t scale”
    move to? Multi-master
    sacrifices? Atomicity

  12. Example reasoning
    “...so our RDBMS is a SPoF that won’t scale”
    move to? Multi-master
    sacrifices? Atomicity
    gives? Distribution / replication

  13. Example reasoning
    “...so our RDBMS is a SPoF that won’t scale”
    move to? Multi-master
    sacrifices? Atomicity
    gives? Distribution / replication
    requires? De-normalization

  14. Example reasoning
    “...so our RDBMS is a SPoF that won’t scale”
    move to? Multi-master
    sacrifices? Atomicity
    gives? Distribution / replication
    requires? De-normalization
    dissolves? Integrity

  15. Example reasoning
    “...so our RDBMS is a SPoF that won’t scale”
    move to? Multi-master
    sacrifices? Atomicity
    gives? Distribution / replication
    requires? De-normalization
    dissolves? Integrity
    toughens? Querying

  16. There is no free lunch.

  17. Key aspects

  18. Key aspects
    • Type of data

  19. Key aspects
    • Type of data
      • Graph

  20. Key aspects
    • Type of data
      • Graph
      • Documents & key/value-pairs

  21. Key aspects
    • Type of data
      • Graph
      • Documents & key/value-pairs
    • Operational aspect

  22. Key aspects
    • Type of data
      • Graph
      • Documents & key/value-pairs
    • Operational aspect
      • Resilient

  23. Key aspects
    • Type of data
      • Graph
      • Documents & key/value-pairs
    • Operational aspect
      • Resilient
      • Atomic

  24. Example
                                 Atomic                     Resilient
    Graph                        Neo4J, InfiniteGraph       ?
    Document & key/value-pairs   CouchDB, RavenDB, Redis    Riak, Voldemort, Cassandra
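
The matrix on this slide can be read as a lookup from (type of data, operational aspect) to candidate products. Below is a minimal Python sketch of that reading. The dictionary keys and the `candidates` helper are hypothetical names, and the placement of products in the Atomic and Resilient columns is an assumption reconstructed from the flattened slide text.

```python
# Candidate products per (type of data, operational aspect), as listed on the slide.
# Assumption: the column placement below is reconstructed from the slide text.
MATRIX = {
    ("graph", "atomic"): ["Neo4J", "InfiniteGraph"],
    ("graph", "resilient"): [],  # the slide shows "?" for this cell
    ("document_kv", "atomic"): ["CouchDB", "RavenDB", "Redis"],
    ("document_kv", "resilient"): ["Riak", "Voldemort", "Cassandra"],
}


def candidates(data_type: str, aspect: str) -> list[str]:
    """Return the products the slide lists for this combination (empty if none)."""
    return list(MATRIX.get((data_type, aspect), []))
```

Calling `candidates("document_kv", "resilient")` returns the short list to evaluate further; the matrix only narrows the field, it does not pick a product.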

  25. Sample cases

  26. Case A: Data from user
    • Data set...
      • ...mix of binary & JSON (access by known key)
      • ...replicated to all nodes on best effort
      • ...allows read & write on any node
    • individual nodes may be unavailable

  27. Case A: Data from user
                                 Atomic                     Resilient
    Graph                        Neo4J, InfiniteGraph       ?
    Document & key/value-pairs   CouchDB, RavenDB, Redis    Riak, Voldemort, Cassandra

  28. Case A: Data from user
    • Riak due to...
      • outstanding packaging & community
      • simple on both server and workstations
      • serious and dedicated engineering team
      • Lots of statistics
      • HTTP API
      • Everything’s tunable (much per request)
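
To illustrate what "tunable per request" over an HTTP API can look like, here is a minimal Python sketch that builds (but does not send) a store request. It assumes Riak's `/buckets/<bucket>/keys/<key>` HTTP path and its `w` (write quorum) and `dw` (durable write quorum) query parameters; check the HTTP API documentation for your Riak version, since older releases used a different path. The host, bucket, key, value and the `riak_put_request` helper are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def riak_put_request(base_url, bucket, key, value, w=None, dw=None):
    """Build (but do not send) a PUT that stores a JSON value.

    w and dw are per-request overrides; when omitted, the server-side
    defaults for the bucket apply.
    """
    params = {name: val for name, val in (("w", w), ("dw", dw)) if val is not None}
    url = f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    if params:
        url += "?" + urlencode(params)
    return Request(
        url,
        data=json.dumps(value).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical example: require 2 replicas to acknowledge, 1 of them durably.
req = riak_put_request("http://localhost:8098", "users", "42", {"name": "Ada"}, w=2, dw=1)
```

Passing `req` to `urllib.request.urlopen` would send it to a running node. Keeping the quorum settings in the URL is what lets each call choose its own trade-off between latency and safety.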

  29. Case B: configuration
    • Data set...
      • ...is JSON (access by alternating keys)
      • ...on all nodes
      • ...seldom updated (read only)
      • ...consistent and replicated on demand

  30. Case B: configuration
                                 Atomic                     Resilient
    Graph                        Neo4J, InfiniteGraph       ?
    Document & key/value-pairs   CouchDB, RavenDB, Redis    Riak, Voldemort, Cassandra

  31. Case B: configuration
    • CouchDB due to...
      • simple on both server and workstations
      • HTTP API
      • Views for alternating keys
      • Lots of statistics
      • Dead simple “one-click” push replication
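
The two CouchDB features named above, views for lookups by alternating keys and push replication, can both be driven over plain HTTP. The Python sketch below only builds the JSON documents and URLs involved and sends nothing. It assumes CouchDB's design-document format, the `/_design/<ddoc>/_view/<view>` query path and the `/_replicate` endpoint. The database name, field name, hosts and helper functions are hypothetical.

```python
import json
from urllib.parse import quote, urlencode


def design_doc(ddoc_name, view_name, field):
    """Design document with one view that indexes documents by `field`.

    One such view per lookup field gives access by alternating keys.
    """
    map_fn = "function (doc) { if (doc.%s) { emit(doc.%s, null); } }" % (field, field)
    return {"_id": f"_design/{ddoc_name}", "views": {view_name: {"map": map_fn}}}


def view_url(base_url, db, ddoc_name, view_name, key):
    """URL that looks documents up by key; the key must be JSON-encoded."""
    query = urlencode({"key": json.dumps(key), "include_docs": "true"})
    return f"{base_url}/{quote(db, safe='')}/_design/{ddoc_name}/_view/{view_name}?{query}"


def push_replication_body(source_db, target_url):
    """JSON body to POST to /_replicate on the node that holds source_db."""
    return json.dumps({"source": source_db, "target": target_url})
```

For example, `view_url("http://localhost:5984", "config", "lookup", "by_env", "prod")` yields the URL to GET, and `push_replication_body("config", "http://node2:5984/config")` yields the body for a one-off push to another node.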

  32. priorities & trade-offs
    (No)SQL for me is very much about trade-offs

  33. does it fit with current data structures?
    Don’t underestimate the exercise of making your data “fit” a certain NoSQL product

  34. ease of adoption?
    Client libraries?
    Does it require driver libraries?

  35. search and reporting?
    What access patterns do you have today? Tomorrow?
    What kind of reports will customers or management require?

  36. does it speak HTTP?
    For us at Hitta.se this is important since almost everything we do is HTTP-based

  37. availability and redundancy?
    What kinds of availability?
    How does it handle node failures? Network partitions?
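
One way to reason about node failures in a replicated store is simple quorum arithmetic. This is a minimal Python sketch under a simplified model: N replicas per item, a write acknowledged by W of them and a read answered by R of them. The function names are hypothetical, and real products differ in the details, so treat the result as a starting point for questions rather than a guarantee.

```python
def reads_see_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read quorum overlaps every write quorum (R + W > N)."""
    return r + w > n


def node_failures_tolerated(n: int, quorum: int) -> int:
    """Replicas that can be down while a quorum of this size is still reachable."""
    return n - quorum
```

With N=3, choosing R=2 and W=2 gives overlapping quorums and tolerates one unavailable replica for both reads and writes. R=1 and W=1 tolerates two unavailable replicas but gives up the overlap.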

  38. can you monitor it?
    How and with what?

  39. performance?
    Out-of-the-box vs. vertical vs. horizontal?

  40. ease of scaling in and out?
    What’s required to add additional nodes?
    How do you remove a node temporarily or permanently?

  41. commercial support available?
    And how is the community?

  42. does it run on your preferred OS?

  43. is it properly packaged?
    Proper install packages?
    Sane defaults in terms of service accounts and privileges?

  44. do you understand it?
    Don’t underestimate this

  45. is your data safe?
    Is your data really durable on disk -- assuming that’s what you need
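
As a generic illustration of what "durable on disk" means at the operating-system level: a write call that returns has not necessarily reached the storage device. The Python sketch below forces the data out explicitly. It is not a description of how any product named in this deck works; the `durable_write` helper and file name are hypothetical.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data and ask the OS to flush it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push its cache to the device


# Hypothetical usage in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "record.bin")
durable_write(path, b"payload")
```

When evaluating a data store, the question is when it acknowledges a write relative to a step like `os.fsync`, and whether that is configurable.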

  46. there is no universal solution

  47. Thx.
    Mårten Gustafson
    @martengustafson
    http://marten.gustafson.pp.se/