Slide 1

Slide 1 text

NoSQL
Mårten Gustafson
Brown bag lunch @ TUI / Fritidsresor Stockholm 2011-08-31
Wednesday, August 31, 2011

Slide 2

Slide 2 text

Not Only SQL

Slide 3

Slide 3 text

Disclaimer
I don’t know every product by heart
I’m trying to illustrate my broad reasoning
NoSQL tends to be crammed with religious zealots

Slide 4

Slide 4 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia

Slide 5

Slide 5 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique

Slide 6

Slide 6 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique
• Not one type of data

Slide 7

Slide 7 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique
• Not one type of data
• Not one type of use case

Slide 8

Slide 8 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique
• Not one type of data
• Not one type of use case

Slide 9

Slide 9 text

Example reasoning
“...so our RDBMS is a SPoF that won’t scale”

Slide 10

Slide 10 text

Example reasoning
“...so our RDBMS is a SPoF that won’t scale”
move to? Multi-master

Slide 11

Slide 11 text

Example reasoning
“...so our RDBMS is a SPoF that won’t scale”
move to? Multi-master
sacrifices? Atomicity

Slide 12

Slide 12 text

Example reasoning
“...so our RDBMS is a SPoF that won’t scale”
move to? Multi-master
sacrifices? Atomicity
gives? Distribution / replication

Slide 13

Slide 13 text

Example reasoning
“...so our RDBMS is a SPoF that won’t scale”
move to? Multi-master
sacrifices? Atomicity
gives? Distribution / replication
requires? De-normalization

Slide 14

Slide 14 text

Example reasoning
“...so our RDBMS is a SPoF that won’t scale”
move to? Multi-master
sacrifices? Atomicity
gives? Distribution / replication
requires? De-normalization
dissolves? Integrity

Slide 15

Slide 15 text

Example reasoning
“...so our RDBMS is a SPoF that won’t scale”
move to? Multi-master
sacrifices? Atomicity
gives? Distribution / replication
requires? De-normalization
dissolves? Integrity
toughens? Querying
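
The "requires de-normalization, dissolves integrity" step in the reasoning above can be made concrete with a small sketch. The customer/booking schema, the document layout and the `update_email` helper are all hypothetical and invented for illustration; the point is only that a single atomic update in a normalized RDBMS becomes several independent writes once each document carries its own copy of the data.

```python
import sqlite3

# Normalized: the email lives in exactly one row; bookings reference it by id.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE booking (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        destination TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'old@example.com');
    INSERT INTO booking VALUES (10, 1, 'Crete'), (11, 1, 'Phuket');
""")

# One atomic update; every subsequent join sees the new value.
with db:
    db.execute("UPDATE customer SET email = ? WHERE id = ?", ("new@example.com", 1))

rows = db.execute("""
    SELECT b.destination, c.email
    FROM booking b JOIN customer c ON c.id = b.customer_id
    ORDER BY b.id
""").fetchall()

# De-normalized: every booking document embeds its own copy of the email.
documents = {
    "booking:10": {"destination": "Crete",
                   "customer": {"id": 1, "email": "old@example.com"}},
    "booking:11": {"destination": "Phuket",
                   "customer": {"id": 1, "email": "old@example.com"}},
}


def update_email(docs, customer_id, email, fail_after=None):
    """Rewrite each embedded copy, one document at a time.

    fail_after simulates a crash after that many documents were updated.
    Returns the number of documents actually rewritten.
    """
    updated = 0
    for key in sorted(docs):
        if fail_after is not None and updated >= fail_after:
            return updated  # simulated crash: remaining copies stay stale
        if docs[key]["customer"]["id"] == customer_id:
            docs[key]["customer"]["email"] = email
            updated += 1
    return updated


# The update is interrupted after the first document.
update_email(documents, 1, "new@example.com", fail_after=1)
emails = {doc["customer"]["email"] for doc in documents.values()}
```

After the interrupted update, `rows` shows the same email for both bookings, while `emails` contains both the old and the new address. Nothing in the store enforces that the copies agree, which is the integrity cost the slide asks about.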

Slide 16

Slide 16 text

There is no free lunch.

Slide 17

Slide 17 text

Key aspects

Slide 18

Slide 18 text

Key aspects
• Type of data

Slide 19

Slide 19 text

Key aspects
• Type of data
  • Graph

Slide 20

Slide 20 text

Key aspects
• Type of data
  • Graph
  • Documents & key/value-pairs

Slide 21

Slide 21 text

Key aspects
• Type of data
  • Graph
  • Documents & key/value-pairs
• Operational aspect

Slide 22

Slide 22 text

Key aspects
• Type of data
  • Graph
  • Documents & key/value-pairs
• Operational aspect
  • Resilient

Slide 23

Slide 23 text

Key aspects
• Type of data
  • Graph
  • Documents & key/value-pairs
• Operational aspect
  • Resilient
  • Atomic

Slide 24

Slide 24 text

Example
                              Atomic                     Resilient
Graph                         Neo4J, InfiniteGraph       ?
Document & key/value-pairs    CouchDB, RavenDB, Redis    Riak, Voldemort, Cassandra

Slide 25

Slide 25 text

Sample cases

Slide 26

Slide 26 text

Case A: Data from user
• Data set...
  • ...mix of binary & JSON (access by known key)
  • ...replicated to all nodes on best effort
  • ...allows read & write on any node
  • individual nodes may be unavailable

Slide 27

Slide 27 text

Case A: Data from user
                              Atomic                     Resilient
Graph                         Neo4J, InfiniteGraph       ?
Document & key/value-pairs    CouchDB, RavenDB, Redis    Riak, Voldemort, Cassandra

Slide 28

Slide 28 text

Case A: Data from user
• Riak due to...
  • outstanding packaging & community
  • simple on both server and workstations
  • serious and dedicated engineering team
  • Lots of statistics
  • HTTP API
  • Everything’s tunable (much per request)
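
What "tunable per request" buys in Case A can be sketched with a toy in-memory model. This is not Riak's implementation or API: the `Node` and `Cluster` classes, the explicit `version` argument and the node names are assumptions made for illustration. The `r` and `w` arguments only mimic the idea of choosing, per request, how many replicas must answer a read or acknowledge a write.

```python
class Node:
    """One replica holding key -> (version, value)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Cluster:
    """Toy cluster: every key is replicated to all nodes on a best-effort basis."""

    def __init__(self, names):
        self.nodes = [Node(name) for name in names]

    def put(self, key, value, version, w):
        """Write to every reachable node; report success if at least w acknowledged.

        A write that misses its w target is not rolled back in this toy model.
        """
        acks = 0
        for node in self.nodes:
            if node.up:
                node.data[key] = (version, value)
                acks += 1
        return acks >= w

    def get(self, key, r):
        """Return the newest value if at least r reachable nodes hold the key."""
        replies = [n.data[key] for n in self.nodes if n.up and key in n.data]
        if len(replies) < r:
            return None  # not enough replicas answered
        return max(replies, key=lambda reply: reply[0])[1]


cluster = Cluster(["a", "b", "c"])
first = cluster.put("user:1", "v1", version=1, w=3)    # all three nodes up

cluster.nodes[2].up = False                            # node c becomes unavailable
relaxed = cluster.put("user:1", "v2", version=2, w=2)  # two acks are enough
strict = cluster.put("user:1", "v2", version=2, w=3)   # same write, stricter demand

# Only the node that missed the update is reachable.
cluster.nodes[0].up = False
cluster.nodes[1].up = False
cluster.nodes[2].up = True
stale = cluster.get("user:1", r=1)        # answers, but with old data
unavailable = cluster.get("user:1", r=2)  # refuses rather than risk stale data

# Everyone is back; asking all replicas finds the newest version.
for node in cluster.nodes:
    node.up = True
fresh = cluster.get("user:1", r=3)
```

Low `r` and `w` values keep the store answering while nodes are down, at the price of possibly stale reads; high values trade availability for consistency. Being able to choose per request lets one data set serve both kinds of caller.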

Slide 29

Slide 29 text

Case B: configuration
• Data set...
  • ...is JSON (access by alternating keys)
  • ...on all nodes
  • ...seldom updated (read only)
  • ...consistent and replicated on demand

Slide 30

Slide 30 text

Case B: configuration
                              Atomic                     Resilient
Graph                         Neo4J, InfiniteGraph       ?
Document & key/value-pairs    CouchDB, RavenDB, Redis    Riak, Voldemort, Cassandra

Slide 31

Slide 31 text

Case B: configuration
• CouchDB due to...
  • simple on both server and workstations
  • HTTP API
  • Views for alternating keys
  • Lots of statistics
  • Dead simple “one-click” push replication
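
The "views for alternating keys" and push replication points can be sketched as the JSON payloads a client would send over CouchDB's HTTP API. The database name `config`, the document fields, the view names and the host names are hypothetical. The sketch only builds the design document, a view query URL and a `_replicate` request body; it does not contact a server.

```python
import json
from urllib.parse import quote

# Hypothetical configuration document stored in a "config" database.
config_doc = {
    "_id": "search-frontend",
    "type": "config",
    "environment": "production",
    "host": "web01",
    "settings": {"timeout_ms": 500},
}

# A design document with two views, so the same documents can be looked up
# by host or by environment. CouchDB map functions are JavaScript source
# stored as strings.
design_doc = {
    "_id": "_design/config",
    "views": {
        "by_host": {
            "map": "function (doc) { if (doc.type === 'config') { emit(doc.host, null); } }"
        },
        "by_environment": {
            "map": "function (doc) { if (doc.type === 'config') { emit(doc.environment, null); } }"
        },
    },
}


def view_url(base, db, ddoc, view, key):
    """Build a view query URL; the key parameter must be JSON-encoded."""
    encoded_key = quote(json.dumps(key))
    return f"{base}/{db}/_design/{ddoc}/_view/{view}?key={encoded_key}&include_docs=true"


url = view_url("http://localhost:5984", "config", "config", "by_host", "web01")

# Body for POST /_replicate on the source server: push the local "config"
# database to a second (hypothetical) node.
replication_request = {
    "source": "config",
    "target": "http://couch2.example.com:5984/config",
}
replication_body = json.dumps(replication_request)
```

Because each view is just another index over the same documents, adding a new access key later means adding a view, not reshaping the stored data.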

Slide 32

Slide 32 text

priorities & trade-offs
(No)SQL for me is very much about trade-offs

Slide 33

Slide 33 text

does it fit with current data structures?
Don’t underestimate the exercise of making your data “fit” a certain NoSQL product

Slide 34

Slide 34 text

ease of adoption?
Client libraries? Does it require driver libraries?

Slide 35

Slide 35 text

search and reporting?
What access patterns do you have today? Tomorrow? What kind of reports will customers or management require?

Slide 36

Slide 36 text

does it speak HTTP?
For us at Hitta.se this is important since almost everything we do is HTTP-based

Slide 37

Slide 37 text

availability and redundancy?
What kinds of availability? How does it handle node failures? Network partitions?

Slide 38

Slide 38 text

can you monitor it?
How and with what?

Slide 39

Slide 39 text

performance?
Out-of-the-box vs. vertical vs. horizontal?

Slide 40

Slide 40 text

ease of scaling in and out?
What’s required to add additional nodes? How do you remove a node temporarily or permanently?

Slide 41

Slide 41 text

commercial support available?
And how is the community?

Slide 42

Slide 42 text

does it run on your preferred OS?

Slide 43

Slide 43 text

is it properly packaged?
Proper install packages? Sane defaults in terms of service accounts and privileges?

Slide 44

Slide 44 text

do you understand it?
Don’t underestimate this

Slide 45

Slide 45 text

is your data safe?
Is your data really durable on disk? (Assuming that’s what you need)
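
As a reference point for what "really durable on disk" means at the lowest level, here is a minimal sketch of a write that is not considered done until the operating system has been asked to flush it to stable storage. The `durable_write` helper and the file name are hypothetical; whether and when a given data store does the equivalent is exactly the question to ask of it.

```python
import os
import tempfile


def durable_write(path, payload):
    """Write payload (bytes) to path so that it survives a crash after return.

    Write to a temporary file in the same directory, fsync it, atomically
    rename it over the target, then fsync the directory where supported.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push its cache to disk
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Persist the rename itself; O_DIRECTORY is not available on all platforms.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "record.json")
    durable_write(target, b'{"ok": true}')
    with open(target, "rb") as handle:
        stored = handle.read()
```

Skipping the `fsync` calls makes writes much faster, which is why some stores acknowledge a write before it reaches disk by default. That may be a perfectly good trade-off, as long as it is a deliberate one.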

Slide 46

Slide 46 text

there is no universal solution

Slide 47

Slide 47 text

Thx.
Mårten Gustafson
@martengustafson
http://marten.gustafson.pp.se/
[email protected]