Slide 1

Slide 1 text

NoSQL Mårten Gustafson Progressive.Net @ Valtech Stockholm 2011-05-12 Thursday, May 12, 2011

Slide 2

Slide 2 text

Not Only SQL

Slide 3

Slide 3 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia

Slide 4

Slide 4 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique

Slide 5

Slide 5 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique
• Not one type of data

Slide 6

Slide 6 text

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
• Not one single technique
• Not one type of data
• Not one type of use case

Slide 7

Slide 7 text

The way I see it

Slide 8

Slide 8 text

Families
• Graph
• Key/Value
• Document

Slide 9

Slide 9 text

Flavors
• Stand-alone
• Distributed
• Embedded

Slide 10

Slide 10 text

Flavors
• Stand-alone
  • Isolated instances are common
  • Might have slave replication
• Distributed
• Embedded

Slide 11

Slide 11 text

Flavors
• Stand-alone
  • Isolated instances are common
  • Might have slave replication
• Distributed
  • “Cluster” is the default mode of operation
  • No master node
  • Multiple nodes may fail without interrupting service (implies storage distribution)
• Embedded
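One way to picture “multiple nodes may fail without interrupting service” is a hash ring where every key is stored on several nodes. The Python sketch below is an illustration only, not how any particular product does it; the `Ring` class, the node names and the replica count of 3 are assumptions made up for the example.

```python
import hashlib
from bisect import bisect_right


def _hash(value: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical hash ring: one point per node, no master node."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self.points = sorted((_hash(node), node) for node in nodes)

    def preference_list(self, key):
        """Return the distinct nodes that hold a copy of `key`."""
        hashes = [h for h, _ in self.points]
        start = bisect_right(hashes, _hash(key)) % len(self.points)
        count = min(self.replicas, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]

    def readable(self, key, failed):
        """A key stays readable while at least one of its replicas is up."""
        return any(node not in failed for node in self.preference_list(key))


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"], replicas=3)
owners = ring.preference_list("user:42")

# Two of the three replica holders are down: the key is still served.
print(ring.readable("user:42", failed=set(owners[:2])))
# All three replica holders are down: the key is unavailable.
print(ring.readable("user:42", failed=set(owners)))
```

With three copies per key, any two nodes can fail and every key is still reachable, which is why this kind of availability implies distributing the storage.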

Slide 12

Slide 12 text

Disclaimer
I don’t know each and every product by heart
I’m trying to illustrate my broad reasoning
Speaker notes: NoSQL tends to be crammed with religious zealots

Slide 13

Slide 13 text

Example of my reasoning
Flavor / Family | Graph | Key/Value | Document
Stand alone | Neo4J | Redis | CouchDB, MongoDB
Distributed | | Riak, Voldemort, Cassandra |
Embedded | Neo4J | Tokyo Cabinet, LevelDB |

Slide 17

Slide 17 text

Priorities & trade-offs
Speaker notes: (No)SQL for me is very much about trade-offs

Slide 18

Slide 18 text

Does it fit with current data structures?
Speaker notes: Don’t underestimate the exercise of making your data “fit” a certain NoSQL product

Slide 19

Slide 19 text

Ease of adoption?
Speaker notes: Client libraries? Does it require driver libraries?

Slide 20

Slide 20 text

Indices or some sort of search?
Speaker notes: What access patterns do you have today? Tomorrow? What kind of reports will customers or management require?

Slide 21

Slide 21 text

Does it speak HTTP?
Speaker notes: For us at Hitta.se this is important since almost everything we do is HTTP-based

Slide 22

Slide 22 text

Availability and redundancy?
Speaker notes: What kinds of availability? How does it handle node failures? Network partitions?

Slide 23

Slide 23 text

Can you monitor it?
Speaker notes: How and with what?

Slide 24

Slide 24 text

Performance?
Speaker notes: Does performance scale with additional nodes?

Slide 25

Slide 25 text

Ease of scaling in and out?
Speaker notes: What’s required to add nodes? How do you remove a node temporarily or permanently?

Slide 26

Slide 26 text

Commercial support available?

Slide 27

Slide 27 text

Does it run on your preferred OS?

Slide 28

Slide 28 text

Is it properly packaged?
Speaker notes: Proper install packages? Sane defaults in terms of service accounts and privileges?

Slide 29

Slide 29 text

Do you understand it?
Speaker notes: Don’t underestimate this

Slide 30

Slide 30 text

Can you kill it without losing data?
Speaker notes: Is your data really durable on disk, assuming that’s what you need?

Slide 31

Slide 31 text

For example...
• I work at Hitta.se
• We love availability
• We like “easy” scalability

Slide 33

Slide 33 text

availability + scalability

Slide 34

Slide 34 text

availability + scalability = multi-master

Slide 35

Slide 35 text

availability + scalability = storage distribution

Slide 36

Slide 36 text

availability + scalability = replication

Slide 37

Slide 37 text

availability + scalability = add & remove nodes

Slide 38

Slide 38 text

availability + scalability = tune behavior per use case
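“Tune behavior per use case” can be read, in Dynamo-style stores, as choosing how many replicas must answer a read (R) or a write (W) out of N copies. That reading, the function names and the two example profiles below are assumptions for illustration, not settings taken from the talk.

```python
def check(n: int, r: int, w: int) -> None:
    """Reject quorum settings that can never be satisfied."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")


def overlaps(n: int, r: int, w: int) -> bool:
    """True when every read quorum shares a replica with every write quorum."""
    check(n, r, w)
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> dict:
    """Replica failures each operation survives while still meeting its quorum."""
    check(n, r, w)
    return {"read": n - r, "write": n - w}


# Hypothetical per-use-case profiles as (N, R, W).
profiles = {
    "page view counters": (3, 1, 1),  # favors availability and latency
    "customer records": (3, 2, 2),    # reads always see the latest acknowledged write
}

for name, (n, r, w) in profiles.items():
    print(name, overlaps(n, r, w), tolerated_failures(n, r, w))
```

The trade-off is visible in the numbers: lowering R and W lets more replicas fail per operation, but gives up the guarantee that a read overlaps the most recent write.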

Slide 39

Slide 39 text

availability + scalability = ?

Slide 40

Slide 40 text

availability + scalability = Riak & CouchDB
Speaker notes: For us, so far, the answer has been Riak & CouchDB

Slide 41

Slide 41 text

Riak CouchDB

Slide 42

Slide 42 text

Riak
• Dynamo-inspired key/value store
CouchDB

Slide 43

Slide 43 text

Riak
• Dynamo-inspired key/value store
CouchDB
• Document database

Slide 44

Slide 44 text

Riak
• Dynamo-inspired key/value store
• Data that must be available as soon as possible on all nodes
CouchDB
• Document database

Slide 45

Slide 45 text

Riak
• Dynamo-inspired key/value store
• Data that must be available as soon as possible on all nodes
CouchDB
• Document database
• Data that changes less frequently and is OK to replicate “manually”

Slide 46

Slide 46 text

Riak
• Dynamo-inspired key/value store
• Data that must be available as soon as possible on all nodes
• Data that requires storage distribution
CouchDB
• Document database
• Data that changes less frequently and is OK to replicate “manually”

Slide 47

Slide 47 text

Riak
• Dynamo-inspired key/value store
• Data that must be available as soon as possible on all nodes
• Data that requires storage distribution
CouchDB
• Document database
• Data that changes less frequently and is OK to replicate “manually”
• Data that might be local to a single node

Slide 48

Slide 48 text

Common factors Riak & CouchDB
Good packaging

Slide 49

Slide 49 text

Common factors Riak & CouchDB
Monitorable (lots of stats)

Slide 50

Slide 50 text

Common factors Riak & CouchDB
Easy configuration

Slide 51

Slide 51 text

Common factors Riak & CouchDB
Reliable
Speaker notes: Append-only disk structures
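To make the idea of an append-only disk structure concrete, here is a minimal Python sketch of a log that is only ever appended to and can be recovered after the process is killed mid-write. It is not the storage format of Riak or CouchDB; the record layout (length plus CRC32 header) and the function names are assumptions chosen for the example.

```python
import os
import struct
import zlib

# Assumed record layout: 4-byte payload length, 4-byte CRC32, then the payload.
HEADER = struct.Struct(">II")


def append(path: str, payload: bytes) -> None:
    """Append one record and force it to disk before returning."""
    record = HEADER.pack(len(payload), zlib.crc32(payload)) + payload
    with open(path, "ab") as f:
        f.write(record)
        f.flush()
        os.fsync(f.fileno())  # without this the data may still sit in OS buffers


def recover(path: str) -> list:
    """Read back all complete records, stopping at a torn or corrupt tail."""
    records = []
    if not os.path.exists(path):
        return records
    with open(path, "rb") as f:
        data = f.read()
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # the process died mid-write; earlier records are intact
        records.append(payload)
        offset = start + length
    return records
```

Because existing bytes are never overwritten, a crash can only damage the last record being written, and recovery simply ignores it. A real store would also need compaction and, on some file systems, an fsync of the containing directory.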

Slide 52

Slide 52 text

Common factors Riak & CouchDB
HTTP API

Slide 53

Slide 53 text

Common factors Riak & CouchDB
They embrace HTTP

Slide 54

Slide 54 text

All your webish skillz and tools apply...
Speaker notes: Important for us as it requires no “drivers” and allows us to serve binary content with a MIME type. No, I don’t like WS-*.

Slide 55

Slide 55 text

All your webish skillz and tools apply...
• proxies
• load balancers
• caches
• HTTP client libs (ETag, If-Modified-Since, etc.)
• language-, platform- and OS-neutral
• MIME / Content-Type
Speaker notes: Important for us as it requires no “drivers” and allows us to serve binary content with a MIME type. No, I don’t like WS-*.
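As an example of plain HTTP tooling doing useful work, the Python sketch below shows a conditional GET with an ETag using only the standard library. The tiny in-process server stands in for a data store that speaks HTTP; the handler, the `/doc` path and the JSON body are made up for the example and are not the API of Riak or CouchDB.

```python
import hashlib
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b'{"name": "example"}'
ETAG = '"' + hashlib.sha1(BODY).hexdigest() + '"'


class Handler(BaseHTTPRequestHandler):
    """Stand-in store: serves one document and honors If-None-Match."""

    def do_GET(self):
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)  # client copy is still fresh, send no body
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the example quiet


# Ignore any proxy settings from the environment for this local request.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url, etag=None):
    """GET a URL, optionally revalidating a cached copy by its ETag."""
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    try:
        with opener.open(request) as response:
            return response.status, response.headers.get("ETag"), response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports "Not Modified" as an HTTPError
            return 304, err.headers.get("ETag"), b""
        raise


server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/doc" % server.server_port

status, etag, body = fetch(url)                   # full response with an ETag
status_again, _, body_again = fetch(url, etag)    # revalidation, no body sent
server.shutdown()
server.server_close()
```

The second request transfers no document body, and the same headers are what off-the-shelf caches and proxies use to do this revalidation without any store-specific driver.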

Slide 56

Slide 56 text

Common factors Riak & CouchDB
Able to store and serve complete web apps

Slide 57

Slide 57 text

Go do!
• Test one or more NoSQL thingys
• Get familiar with Brewer’s CAP theorem
• Get familiar with the Dynamo paper

Slide 58

Slide 58 text

Thx.
Mårten Gustafson
@martengustafson
http://marten.gustafson.pp.se/
[email protected]