Nosql - getting over the bad parts

“nosql” Getting over the bad parts David Dahl @effata

Overview
‣ Real life lessons
‣ Production systems
‣ Write heavy
‣ MongoDB
‣ Redis
‣ Cassandra

Generic
‣ Took a DB class? - Forget everything you learned!
‣ Denormalize all the things - Up to a limit
‣ Consistency is your responsibility
‣ Primary keys - Give them a lot of thought
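To make "denormalize" and "consistency is your responsibility" concrete, here is a minimal sketch that uses plain Python dicts as stand-ins for two collections. No real database is involved, and every name in it (`users`, `posts`, `rename_user`) is hypothetical rather than taken from the talk.

```python
# Toy stand-ins for two collections; no real database is involved.
users = {"u1": {"name": "Clive"}}

# Denormalized: each post carries its own copy of the author's name,
# so a post can be rendered without a second lookup (no join).
posts = {
    "p1": {"author_id": "u1", "author_name": "Clive", "title": "Hello"},
    "p2": {"author_id": "u1", "author_name": "Clive", "title": "Again"},
}


def rename_user(user_id, new_name):
    """Update the user and every denormalized copy of the name.

    Nothing enforces this for us: if the loop is skipped or fails
    halfway, the copies in `posts` silently go stale.
    """
    users[user_id]["name"] = new_name
    for post in posts.values():
        if post["author_id"] == user_id:
            post["author_name"] = new_name


rename_user("u1", "Clive S.")
```

The read path gets cheaper, while every write that touches the duplicated field has to know about all the copies. That trade-off is one way to read the "up to a limit" caveat.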

{
  "_id" : ObjectId("51235a80472689978000004e"),
  "access" : {
    "admin" : ['some_app'],
    "deep_access" : { "another_level" : 1 }
  },
  "apps" : ['some_app', 'some_other_app'],
  "created_at" : ISODate("2012-07-23T13:31:17Z"),
  "email" : "[email protected]",
  "state" : "active"
}

Default behaviour - Reckless writes

Brutally Slow Object Notation

Sharding - Quite the complex beast

Global Write Lock - Really? ... Actually, not anymore.

Deleting stuff

Good stuff
‣ Replication - It just works, and it works REALLY well - rs.initiate(), rs.add("second.node")
‣ Schemaless + secondary indexes - Add whatever, query however
‣ JavaScript CLI - db.collection.find({name: "Clive", birthdate: {$gte: ISODate("1975-05-01")}})

'some/arbitrary/key' => 'string'
                        {'single_level': 'hash'}
                        ['list', 'of', 'items']
                        Set('a', 'b')

Moar memory! - In-memory database

Single threaded - a.k.a. "That 30s list command I just ran blocked the entire production system" (that totally never happened)

Persistence
‣ RDB - point-in-time snapshot - Entire process forks. - Enable overcommit memory!
‣ AOF - write log - Very slow on startup
‣ AOF has higher priority on startup - Enable at runtime or lose stuff
‣ Monitor your log files!
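The difference between a point-in-time snapshot and a write log can be sketched with a toy in-memory store. This is an illustration of the two ideas only, not how Redis implements RDB or AOF. `ToyStore`, `load_from_snapshot` and `load_from_log` are made-up names.

```python
import json


class ToyStore:
    """Toy key-value store illustrating snapshot vs. append-only log."""

    def __init__(self):
        self.data = {}
        self.log = []  # AOF-style: one entry appended per write

    def set(self, key, value):
        self.data[key] = value
        self.log.append(json.dumps(["set", key, value]))

    def snapshot(self):
        # RDB-style: serialize the whole state as of this moment.
        return json.dumps(self.data)


def load_from_snapshot(blob):
    """Restore state from a snapshot; later writes are not in it."""
    store = ToyStore()
    store.data = json.loads(blob)
    return store


def load_from_log(lines):
    """Restore state by replaying every logged write in order.

    Startup cost grows with the number of writes, not with the
    size of the final data set.
    """
    store = ToyStore()
    replayed = 0
    for line in lines:
        op, key, value = json.loads(line)
        if op == "set":
            store.data[key] = value
        replayed += 1
    return store, replayed


store = ToyStore()
for i in range(1000):
    store.set("counter", i)      # same key overwritten 1000 times
snap = store.snapshot()
store.set("late", "write")       # happens after the snapshot

from_snapshot = load_from_snapshot(snap)
from_log, replayed = load_from_log(store.log)
```

In this toy run the snapshot restores a single key and misses the late write, while the log restores everything but has to replay 1001 entries to get there.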

No clustering
‣ Only master-slave replication - No failover
‣ Redis sentinel - promising but not ready
‣ Redis cluster - unstable/"not production ready"
‣ Twemproxy
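Without built-in clustering, keys have to be spread over several independent instances by something in front of them, which is the job a proxy such as Twemproxy takes on. The sketch below shows the basic idea with simple hash-modulo routing. The shard addresses and the `shard_for` helper are hypothetical, and this is not Twemproxy's actual code or configuration.

```python
import hashlib

# Hypothetical instance addresses, for illustration only.
SHARDS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def shard_for(key, shards=SHARDS):
    """Pick a shard for a key using a stable hash of the key.

    hashlib is used instead of the built-in hash() so the mapping
    is the same across processes and restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

With plain modulo routing, changing the number of shards remaps most keys, which is why consistent hashing is a common alternative for this job.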

Good stuff
‣ Wicked fast - To a limit
‣ Deletion - not a problem
‣ TTL - on key level
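Key-level TTL means each key carries its own expiry time. A minimal sketch of that idea follows, with an injectable clock so it can be exercised without sleeping. `TTLStore` is a hypothetical toy class, not a Redis client, and it only expires keys lazily on read.

```python
import time


class TTLStore:
    """Toy key-value store with an optional time-to-live per key."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        # ttl is in seconds; None means the key never expires.
        expires_at = self._clock() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiry: drop the key on read
            return None
        return value


# Usage with a fake clock, so no real waiting is needed.
now = [0.0]
store = TTLStore(clock=lambda: now[0])
store.set("session:1", "alice", ttl=30)
store.set("config", "keep")      # no TTL
now[0] = 31.0                    # 31 "seconds" later
```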

row_key   column_1   column_2   column_3
row_key   value      value      value

row_key   column_1   column_4
row_key   value      value
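Reading the layout above as the wide-row model used by Cassandra, each row key maps to its own set of columns, and columns within a row are kept sorted by name. This can be modelled roughly with nested dicts. The row keys, values and the `get_slice` helper are invented for illustration.

```python
# Toy model: row key -> {column name: value}.
# Rows do not have to share the same set of columns.
rows = {
    "user:1": {"column_1": "a", "column_2": "b", "column_3": "c"},
    "user:2": {"column_1": "d", "column_4": "e"},
}


def get_slice(row_key, start, end):
    """Return one row's columns with names in [start, end], sorted by name."""
    columns = rows.get(row_key, {})
    return [
        (name, columns[name])
        for name in sorted(columns)
        if start <= name <= end
    ]
```

Lookups by row key and slices over column names within one row are the cheap operations in this model. Anything else has to be designed into the keys up front.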

Dynamo - By Amazon. Not to be confused with DynamoDB - by Amazon

Black magic - Or maybe I’m just dumb

Extremely Java centric - Some of you might think that's a good thing... 1.2 and CQL3 make things a lot better

Data modeling - Spend a lot of time on it!

“No” indexes - Secondary indexes are only good for low uniqueness (low-cardinality values); make your own
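"Make your own" index usually means keeping a second lookup structure that the application updates on every write. Here is a minimal sketch with dicts standing in for two tables. All names (`users`, `users_by_email`, `save_user`, `find_by_email`) are hypothetical.

```python
users = {}            # primary data: user_id -> row
users_by_email = {}   # hand-made index: email -> user_id


def save_user(user_id, email, name):
    """Write the row and keep the hand-made index in step with it."""
    old = users.get(user_id)
    if old is not None and old["email"] != email:
        # Drop the stale index entry; nothing does this for us.
        users_by_email.pop(old["email"], None)
    users[user_id] = {"email": email, "name": name}
    users_by_email[email] = user_id


def find_by_email(email):
    """Look up a user through the index instead of scanning all rows."""
    user_id = users_by_email.get(email)
    return users.get(user_id) if user_id is not None else None
```

In a real cluster the two writes are not atomic, so the application also has to decide what happens when one of them fails.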

Good stuff
‣ Black magic - Complex, but well made
‣ TTL on rows and columns
‣ Writes scale linearly “to infinity” - Netflix benchmarked 1 million writes/s (EC2)

Thank you @effata [email protected]
