Slide 1

Slide 1 text

“nosql”
Getting over the bad parts
David Dahl
@effata

Slide 2

Slide 2 text

Rant

Slide 3

Slide 3 text

Overview
‣ Real life lessons
‣ Production systems
‣ Write heavy
‣ MongoDB
‣ Redis
‣ Cassandra

Slide 4

Slide 4 text

generic
‣ Took a DB class?
  - Forget everything you learned!
‣ Denormalize all the things
  - Up to a limit
‣ Consistency is your responsibility
‣ Primary keys
  - Give them a lot of thought
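
A minimal sketch of what "consistency is your responsibility" can look like once data is denormalized. Plain in-memory maps stand in for two tables; the names `users`, `posts`, `addUser`, `addPost`, and `renameUser` are hypothetical and not taken from the talk.

```javascript
// Hypothetical in-memory stand-ins for two denormalized "tables".
const users = new Map(); // userId -> { name }
const posts = new Map(); // postId -> { authorId, authorName, title }

function addUser(id, name) {
  users.set(id, { name });
}

function addPost(id, authorId, title) {
  // Denormalize: copy the author's name into the post so reads need no join.
  posts.set(id, { authorId, authorName: users.get(authorId).name, title });
}

function renameUser(id, newName) {
  users.get(id).name = newName;
  // The database will not do this for us: the application has to
  // update every denormalized copy itself.
  for (const post of posts.values()) {
    if (post.authorId === id) post.authorName = newName;
  }
}

addUser('u1', 'David');
addPost('p1', 'u1', 'Getting over the bad parts');
renameUser('u1', 'Dave');
```

If the loop in `renameUser` is forgotten, the post keeps showing the old name. That trade-off is why denormalization only pays off "up to a limit".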

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

{
  "_id" : ObjectId("51235a80472689978000004e"),
  "access" : {
    "admin" : ["some_app"],
    "deep_access" : { "another_level" : 1 }
  },
  "apps" : ["some_app", "some_other_app"],
  "created_at" : ISODate("2012-07-23T13:31:17Z"),
  "email" : "david@burtcorp.com",
  "state" : "active"
}

Slide 7

Slide 7 text

Default behaviour Reckless writes
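
The slide does not spell out the mechanism, so the following is a sketch under the assumption that "reckless writes" refers to writes whose result the client never checks. `FakeCollection`, `recklessInsert`, and `safeInsert` are hypothetical names, and the duplicate-key rejection stands in for any server-side write error; this is not the MongoDB driver API.

```javascript
// Hypothetical collection that rejects documents with a duplicate _id.
class FakeCollection {
  constructor() {
    this.docs = new Map();
  }

  // Returns an acknowledgement object describing what happened.
  insert(doc) {
    if (this.docs.has(doc._id)) return { ok: false, error: 'duplicate key' };
    this.docs.set(doc._id, doc);
    return { ok: true };
  }
}

// "Reckless": send the write and never look at the result.
function recklessInsert(coll, doc) {
  coll.insert(doc);
}

// Acknowledged: surface the failure to the caller.
function safeInsert(coll, doc) {
  const ack = coll.insert(doc);
  if (!ack.ok) throw new Error(`write failed: ${ack.error}`);
}

const coll = new FakeCollection();
recklessInsert(coll, { _id: 1, state: 'active' });
recklessInsert(coll, { _id: 1, state: 'deleted' }); // rejected, but nobody notices

let caught = null;
try {
  safeInsert(coll, { _id: 1, state: 'deleted' });
} catch (err) {
  caught = err.message;
}
```

The second reckless insert fails silently and the stored document stays `active`. The acknowledged variant reports the same failure as an exception.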

Slide 8

Slide 8 text

Brutally Slow Object Notation

Slide 9

Slide 9 text

Quite the complex beast Sharding

Slide 10

Slide 10 text

Global Write Lock Really? ... Actually, not anymore.

Slide 11

Slide 11 text

Deleting stuff

Slide 12

Slide 12 text

Good stuff
‣ Replication
  - It just works, and it works REALLY well
  - rs.initiate(), rs.add("second.node")
‣ Schemaless + secondary indexes
  - Add whatever, query however
‣ JavaScript CLI
  - db.collection.find({name: "Clive", birthdate: {$gte: ISODate("1975-05-01")}})

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

'some/arbitrary/key' => 'string'
{'single_level': 'hash'}
['list', 'of', 'items']
Set('a', 'b')

Slide 15

Slide 15 text

Moar memory! In memory database

Slide 16

Slide 16 text

Single threaded
a.k.a. That 30s list command I just ran blocked the entire production system (that totally never happened)
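
A toy model of why one slow command hurts so much on a single-threaded server: commands run strictly one at a time, so a cheap command queued behind an expensive one inherits its latency. The function `simulate`, the command names, and the costs are made up for illustration.

```javascript
// Toy model of a single-threaded server: each command waits for
// everything queued before it. Costs are illustrative milliseconds.
function simulate(queue) {
  let clock = 0;
  return queue.map(({ name, costMs }) => {
    clock += costMs; // the only worker is busy for costMs
    return { name, finishedAtMs: clock };
  });
}

const results = simulate([
  { name: 'slow list command', costMs: 30000 },
  { name: 'GET session:42', costMs: 1 },
]);
```

In this model the 1 ms `GET` completes after 30001 ms, because nothing else can run while the list command holds the only thread.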

Slide 17

Slide 17 text

Persistence
‣ RDB - point in time snapshot
  - Entire process forks.
  - Enable overcommit memory!
‣ AOF - write log
  - Very slow on startup
‣ AOF has higher priority on startup
  - Enable at runtime or lose stuff
‣ Monitor your log files!
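
A rough sketch of why a write log can be slow on startup: state is rebuilt by replaying every recorded command, so the work grows with the number of writes rather than with the size of the final data set. `ToyAof` and `replay` are hypothetical names; this is not how Redis stores or parses its AOF file.

```javascript
// Toy append-only log: every write is recorded as a command.
class ToyAof {
  constructor() {
    this.log = [];
  }

  append(command, key, value) {
    this.log.push({ command, key, value });
  }
}

// Startup: rebuild the in-memory state by replaying the whole log.
function replay(aof) {
  const state = new Map();
  let applied = 0;
  for (const { command, key, value } of aof.log) {
    if (command === 'SET') state.set(key, value);
    else if (command === 'DEL') state.delete(key);
    applied += 1; // startup cost grows with log length
  }
  return { state, applied };
}

const aof = new ToyAof();
for (let i = 0; i < 1000; i++) aof.append('SET', 'counter', i);
aof.append('DEL', 'stale');
const { state, applied } = replay(aof);
```

Here the final state holds a single key, yet 1001 commands had to be applied to get there. A point-in-time snapshot would only have had to load that one key.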

Slide 18

Slide 18 text

No clustering
‣ Only master-slave replication
  - No failover
‣ Redis sentinel - promising but not ready
‣ Redis cluster - unstable/“not production ready”
‣ Twemproxy

Slide 19

Slide 19 text

Good stuff
‣ Wicked fast
  - To a limit
‣ Deletion - not a problem
‣ TTL - on key level
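
To illustrate what a TTL on key level means, here is a small sketch of a key-value store where each key carries its own expiry time and is dropped on the first read after it expires. `TtlStore` is a hypothetical class, and the clock is passed in explicitly so the example does not depend on real time. It shows the idea only, not Redis's actual expiry mechanism.

```javascript
// Toy key-value store with per-key TTL and lazy expiry on read.
class TtlStore {
  constructor() {
    this.entries = new Map();
  }

  set(key, value, ttlMs, now) {
    this.entries.set(key, { value, expiresAt: now + ttlMs });
  }

  get(key, now) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key); // expired: remove on access
      return undefined;
    }
    return entry.value;
  }
}

const store = new TtlStore();
store.set('session:42', 'david', 1000, 0);
const fresh = store.get('session:42', 999);
const expired = store.get('session:42', 1000);
```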

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

row_key   column_1   column_2   column_3
row_key   value      value      value

row_key   column_1   column_4
row_key   value      value
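
The layout above can be pictured as a map of maps: each row key owns its own set of columns, and two rows need not share any of them. A minimal sketch, where `columnFamily`, `put`, and the row keys `row_a` and `row_b` are hypothetical names:

```javascript
// Each row key maps to its own columns; rows need not share columns.
const columnFamily = new Map();

function put(rowKey, column, value) {
  if (!columnFamily.has(rowKey)) columnFamily.set(rowKey, new Map());
  columnFamily.get(rowKey).set(column, value);
}

put('row_a', 'column_1', 'value');
put('row_a', 'column_2', 'value');
put('row_a', 'column_3', 'value');
put('row_b', 'column_1', 'value');
put('row_b', 'column_4', 'value');

const columnsOfA = [...columnFamily.get('row_a').keys()];
const columnsOfB = [...columnFamily.get('row_b').keys()];
```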

Slide 22

Slide 22 text

Dynamo By Amazon Not to be confused with DynamoDB - by Amazon

Slide 23

Slide 23 text

Black magic Or maybe I’m just dumb

Slide 24

Slide 24 text

Extremely Java centric
Some of you might think that's a good thing...
1.2 and CQL3 make things a lot better

Slide 25

Slide 25 text

Data modeling Spend a lot of time on it!

Slide 26

Slide 26 text

“No” indexes Secondary indexes only good for low uniqueness (make your own)
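
One way to read "make your own" is to keep a second table that maps the value you want to search by back to the row key, and to update it on every write. The sketch below uses in-memory maps as stand-ins for two tables; `usersById`, `idByEmail`, `saveUser`, and `findByEmail` are hypothetical names, not part of any Cassandra API.

```javascript
// Primary table plus a hand-maintained index table.
const usersById = new Map(); // row key: user id -> record
const idByEmail = new Map(); // hand-made index: email -> user id

function saveUser(id, email) {
  const previous = usersById.get(id);
  if (previous) idByEmail.delete(previous.email); // drop the stale index entry
  usersById.set(id, { id, email });
  idByEmail.set(email, id);
}

function findByEmail(email) {
  const id = idByEmail.get(email);
  return id === undefined ? undefined : usersById.get(id);
}

saveUser('u1', 'old@example.com');
saveUser('u1', 'new@example.com');
```

The lookup by email stays a single key read, at the price of two writes per update and index cleanup that the application has to get right.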

Slide 27

Slide 27 text

Good stuff
‣ Black magic
  - Complex, but well made
‣ TTL on rows and columns
‣ Writes scale linearly “to infinity”
  - Netflix benchmarked 1 million writes/s (EC2)

Slide 28

Slide 28 text

Thank you @effata david@burtcorp.com