Intro to Riak at montreal.rb

FHV
June 16, 2015

An introduction to Riak with some background on the problems of distributed databases.


Transcript

  1. Overview • 1. Networks and databases and drama • 2.

    Riak architecture & usage • 3. CRDTs • 4. Distributed data modelling • 5. Etc / Questions
  2. What is Riak? A decentralized key-value database with high availability

    & fault tolerance. What do these things mean??
  3. Key-value store A gigantic associative array. username flohdot location Montreal

    favourite_author Octavia Butler last_book_read The Savage Detectives
  4. Key-value store A gigantic associative array. flohdot_location Montreal flohdot_favourite_author Octavia

    Butler flohdot_last_book_read The Savage Detectives scifidude99_location Toronto scifidude99_favourite_author Terry Pratchett scifidude99_last_book_read Kraken
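The prefixed keys on the slide above can be sketched with a plain Ruby Hash standing in for the store. This is only an illustration, not a Riak client: `namespaced_key` is a hypothetical helper introduced here, and the usernames and fields are the ones from the slide.

```ruby
# A plain Ruby Hash standing in for a key-value store (not a Riak client).
store = {}

# Hypothetical helper: build a flat key by prefixing the field with the username.
def namespaced_key(username, field)
  "#{username}_#{field}"
end

store[namespaced_key('flohdot', 'location')] = 'Montreal'
store[namespaced_key('flohdot', 'favourite_author')] = 'Octavia Butler'
store[namespaced_key('scifidude99', 'location')] = 'Toronto'

# Lookups go through the same helper, so each user's fields stay separate.
flohdot_location = store[namespaced_key('flohdot', 'location')]
scifidude_location = store[namespaced_key('scifidude99', 'location')]
```

Because the store is flat, the key naming scheme is the only structure there is; every reader and writer has to agree on it.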
  5. Distributed DBs Because your data… … is too big for

    one disk? … has too many transactions for one node? … cannot have a single point of failure? … needs backups? … all of the above?
  6. Networks suck $ curl -XPOST http://your_database/important_key -d '{ "important_info": "something

    you care about"}' 500 NOPE … then… $ curl http://your_database/important_key 503 MAYBE LATER
  7. Networks suck What even happened? Data… (A) took the scenic

    route through a wormhole? (B) was eaten by monsters, never to be seen again? (C) returned from the underworld with an evil twin?
  8. Networks suck user1 on nodeA at 8:05am $ curl -XPOST

    -d '{ "important_info": "some initial data"}' http://your_database/important_key user2 on nodeB at 8:06am $ curl -XPOST -d '{ "important_info": "modified data"}' http://your_database/important_key user1 on nodeA at 8:07am $ curl http://your_database/important_key { "important_info": "some initial data"} :(
  9. Networks suck user1 on nodeA at 8:05am $ curl -XPOST

    -d '{ "important_info": "something user1 cares about"}' http://your_database/important_key user2 on nodeB at 8:05am $ curl -XPOST -d '{ "important_info": "total nonsense HAHA"}' http://your_database/important_key ????
  10. Hardware sucks … sometimes hardware fails completely! … or SOME

    hardware fails. … or you need to add or replace hardware.
  11. Hardware sucks … sometimes hardware fails completely! … or SOME

    hardware fails. … or you need to add or replace hardware. If you’re using a distributed database system, it needs to be fault/partition tolerant.
  12. The CAP theorem partition tolerance availability consistency AP CA CP

    At any moment in time, a system cannot be consistent, available, AND partition tolerant.
  13. The CAP theorem partition tolerance availability consistency AP CA CP

    X Distributed systems must be partition tolerant.
  14. The CAP theorem partition tolerance availability consistency AP CA CP

    X Lock during a write, make sure it propagates, then allow reads.
  15. The CAP theorem partition tolerance availability consistency AP CA CP

    X Reads and writes will always succeed, but the data you get might not be the same.
  16. The CAP theorem partition tolerance availability consistency AP CA CP

    X Riak is an AP system with eventual consistency.
  17. The CAP theorem partition tolerance availability consistency AP CA CP

    X Riak is an AP system with tunable eventual consistency.
  18. KV Buckets & Keys (strings) bucket: “books”, key: <ISBN of

    a book> Value: JSON, plaintext, image, anything up to ~1-2MB.
  19. HTTP API $ curl -v -X PUT http://localhost:8091/buckets/books/keys/1594483299 \

    -H "Content-Type: application/json" \ -d '{ "title": "The Brief & Wondrous Life of Oscar Wao", "author": "Junot Diaz" }' & libraries in Ruby, Python, Java, Erlang and more.
  20. Partitioning 64 virtual nodes (vnodes) by default… … each responsible

    for 1/64th of the keyspace. The keys are hashed with SHA1 (2^160 values)… "Harry Potter & the Chamber of Secrets" => 628e87e7ec52e212a7efbc88aaf7dfbf9e314a23
  21. Partitioning 64 virtual nodes (vnodes) by default… … each responsible

    for 1/64th of the keyspace. The keys are hashed with SHA1 (2^160 values)… … and the hashed value determines which vnode owns the data. N1: 0 to 2^154 - 1; N2: 2^154 to 2*2^154 - 1; … N64: 63*2^154 to 2^160 - 1
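A minimal sketch of the partitioning arithmetic above, assuming a ring of 64 equal partitions over the 160-bit SHA-1 keyspace. The method names are hypothetical. Hashing a bare string is a simplification: this is not how Riak builds its hash input, only the "digest divided by partition width" step from the slide.

```ruby
require 'digest/sha1'

RING_SIZE = 64                          # default number of vnodes from the slide
KEYSPACE = 2**160                       # SHA-1 produces 160-bit values
PARTITION_WIDTH = KEYSPACE / RING_SIZE  # 2**154 values per vnode

# Map a 40-character hex SHA-1 digest to a zero-based partition index (0..63).
def partition_for_digest(hex_digest)
  hex_digest.to_i(16) / PARTITION_WIDTH
end

# Hash a plain string key, then look up its partition.
def partition_for_key(key)
  partition_for_digest(Digest::SHA1.hexdigest(key))
end

# The digest quoted on the previous slide, taken as given.
slide_digest = '628e87e7ec52e212a7efbc88aaf7dfbf9e314a23'
slide_partition = partition_for_digest(slide_digest)
```

Dividing by 2^154 keeps only the top 6 bits of the digest, so the digest quoted on the slide lands in zero-based partition 24.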
  22. Replication N: number of (physical) nodes a write eventually replicates

    to. W: number of nodes that must be successfully written to before a successful response. R: number of nodes required to read a value successfully. R + W > N
  23. Replication For a cluster with 5 physical nodes, and N

    = 3. W = 5 R = 1 Slow writes, fast reads.
  24. Replication For a cluster with 5 physical nodes, and N

    = 3. W = 1 R = 5 Fast writes, reads might have conflicts.
  25. Replication For a cluster with 5 physical nodes, and N

    = 3. W = 2 R = 2 “quorum” more than half the replicated nodes, or (floor(N/2) + 1)
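The quorum formula and the R + W > N condition from the replication slides can be checked with a few lines of arithmetic. This is a sketch of the numbers only. The method names are made up for illustration and are not part of any Riak client.

```ruby
# Quorum as defined on the slide: more than half of the N replicas.
def quorum(n)
  (n / 2) + 1 # integer division floors for positive n
end

# True when any set of R replicas must share at least one member
# with any set of W replicas, i.e. R + W > N.
def read_write_overlap?(n:, r:, w:)
  r + w > n
end

n = 3
q = quorum(n)
overlapping = read_write_overlap?(n: n, r: q, w: q)
weak = read_write_overlap?(n: n, r: 1, w: 1)
```

With N = 3 the quorum is 2, and R = W = 2 satisfies R + W > N, so a read always touches at least one replica that took the latest successful write. R = W = 1 does not give that guarantee.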
  26. Conflict resolution allow_mult & last_write_wins true (false) use vector clocks

    for causal context keep options your application resolves conflicts
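To make "vector clocks for causal context" concrete, here is a toy comparison of clocks, modelled as a Hash of actor name to update count. It is a sketch of the general technique, not Riak's implementation. The actor names and method names are assumptions for illustration.

```ruby
# A vector clock modelled as a Hash of actor name => update count.
# Returns true when clock a has seen every update that clock b has seen.
def descends?(a, b)
  b.all? { |actor, count| a.fetch(actor, 0) >= count }
end

# Two clocks are concurrent when neither descends from the other.
# That is the case where the writes conflict and both values must be kept.
def concurrent?(a, b)
  !descends?(a, b) && !descends?(b, a)
end

base       = { 'nodeA' => 1 }               # the original write
from_user1 = { 'nodeA' => 2 }               # user1 updates via nodeA
from_user2 = { 'nodeA' => 1, 'nodeB' => 1 } # user2 updates via nodeB

user1_supersedes_base = descends?(from_user1, base)
writes_conflict = concurrent?(from_user1, from_user2)
```

When one clock descends from the other, the older value can be dropped safely. When the clocks are concurrent, nothing in the clocks says which write should win, so the application (or a CRDT) has to resolve it.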
  27. CALM An eventually consistent system only grows in one direction.

    “Consistency As Logical Monotonicity” Operations should be: Associative Commutative Idempotent
  28. Set Union (1 ∪ 2) ∪ 3 = 1 ∪

    (2 ∪ 3) 1 ∪ 2 = 2 ∪ 1
  29. Set Union (1 ∪ 2) ∪ 3 = 1 ∪

    (2 ∪ 3) 1 ∪ 2 = 2 ∪ 1 1 ∪ 1 = 1
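The three properties on the slide can be checked directly with Ruby's standard `Set`, where `|` is union. A small sketch using single-element sets for the slide's 1, 2 and 3:

```ruby
require 'set'

a = Set[1]
b = Set[2]
c = Set[3]

# Grouping does not matter.
associative = ((a | b) | c) == (a | (b | c))
# Order does not matter.
commutative = (a | b) == (b | a)
# Applying the same update twice changes nothing.
idempotent = (a | a) == a
```

These are the properties that let replicas apply the same updates in any order, any number of times, and still end up in the same state.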
  30. CRDT in practice listed_2015_08_14 “books favourited (by any user) today”

    {“A Wrinkle In Time” } “A Wrinkle In Time” N1 N2 N3
  31. CRDT in practice listed_2015_08_14 “books favourited (by any user) today”

    {“A Wrinkle In Time” } {“A Wrinkle In Time” } {“A Wrinkle In Time” } “A Wrinkle In Time” N1 N2 N3 replicate! replicate!
  32. Set CRDT in practice listed_2015_08_14 “books favourited (by any user)

    today” {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time” } {“A Wrinkle In Time” } “Where the Wild Things Are” N1 N2 N3
  33. CRDT in practice listed_2015_08_14 “books favourited (by any user) today”

    {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time” } “Where the Wild Things Are” N1 N2 N3 replicate!
  34. listed_2015_08_14 “books favourited (by any user) today” CRDT in practice

    {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time” } “Where the Wild Things Are” N1 N2 N3 replicate! replicate!
  35. listed_2015_08_14 “books favourited (by any user) today” CRDT in practice

    {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time” } “Where the Wild Things Are” N1 N2 N3 replicate! replicate! X
  36. Read listed_2015_08_14 Set Union {“A Wrinkle In Time”, “Where the

    Wild Things Are” } {“A Wrinkle In Time”, “Where the Wild Things Are” } {“A Wrinkle In Time” } N1 N2 N3 ∪ ∪ = {“A Wrinkle In Time”, “Where the Wild Things Are” }
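The read on the slide above amounts to a union over whatever each replica currently holds. A minimal sketch, assuming the three replica states shown for N1, N2 and N3 and using plain Ruby sets rather than a Riak client:

```ruby
require 'set'

# Replica states from the slide: N3 missed the second write.
n1 = Set['A Wrinkle In Time', 'Where the Wild Things Are']
n2 = Set['A Wrinkle In Time', 'Where the Wild Things Are']
n3 = Set['A Wrinkle In Time']

# Merging on read is a union over all replica states.
merged = [n1, n2, n3].reduce(Set.new) { |acc, replica| acc | replica }

# The order in which replicas answer does not change the result.
merged_reversed = [n3, n2, n1].reduce(Set.new) { |acc, replica| acc | replica }
```

The stale replica does no harm here: its state is a subset of the others, so the union still returns both books.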
  37. Riak CRDTs Riak Datatypes (Sets, Counters, Maps) An API with

    familiar abstractions for the underlying math and magic of CRDTs.
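As a rough idea of the math behind a counter datatype, here is a toy grow-only counter (a G-Counter). It illustrates the general CRDT technique and is not Riak's implementation: the `GCounter` class and its methods are invented for this sketch. Each node increments only its own slot, and merging takes the per-node maximum.

```ruby
# Toy grow-only counter: one slot per node, merged by taking the max.
class GCounter
  attr_reader :counts

  def initialize(counts = {})
    @counts = counts.dup
  end

  # A node only ever increments its own slot.
  def increment(node, by = 1)
    @counts[node] = @counts.fetch(node, 0) + by
    self
  end

  # The counter's value is the sum of all slots.
  def value
    @counts.values.sum
  end

  # Per-node max is associative, commutative and idempotent.
  def merge(other)
    GCounter.new(@counts.merge(other.counts) { |_node, mine, theirs| [mine, theirs].max })
  end
end

on_n1 = GCounter.new.increment('N1').increment('N1') # two logins seen by N1
on_n2 = GCounter.new.increment('N2')                 # one login seen by N2
merged = on_n1.merge(on_n2)
```

Merging in either order, or merging the same state again, leaves the total at 3, which is why replicas can exchange state freely without double counting.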
  38. Riak CRDTs Set CRDT favourite_books = client.bucket('favourite_books') my_faves_set = Riak::Crdt::Set.new(favourite_books,

    'flohdot', 'sets') my_faves_set.add('Thinking, Fast and Slow') my_faves_set.add('Oryx and Crake') my_faves_set.remove('Oryx and Crake')
  39. Riak CRDTs Counter Datatype user_profiles = client.bucket('user_profiles') flohdot = Riak::Crdt::Map.new(user_profiles,

    '2015_06_16', 'maps') flohdot.batch do |m| m.registers['first_name'] = 'Florencia' # string m.flags['pro_user'] = true # boolean m.counters['logins'].increment end
  40. Riak CRDTs Counter Datatype user_profiles = client.bucket('user_profiles') flohdot = Riak::Crdt::Map.new(user_profiles,

    '2015_06_16', 'maps') flohdot.batch do |m| m.registers['first_name'] = 'Florencia' # string m.flags['pro_user'] = true # boolean m.counters['logins'].increment # yo dawg i herd you like maps so i put some maps in your maps end
  41. Riak data design • Are your objects immutable? • What

    kind of consistency/accuracy is required? • Do you need manual conflict resolution, or will CRDTs do? • Do you need search?
  42. Library data { "title" : "A Game of Thrones", "author"

    : "George RR Martin", "year" : "1996" … }
  43. Leverage namespacing • Finding objects is fast. • No hard

    limit or performance impact on number of buckets. • Extra namespace: bucket types (multitenancy?)
  44. Sources/Learn more • A Little Riak Book by Eric Redmond

    *give me your name if you want the e-book • 7 Databases in 7 Weeks by Eric Redmond & Jim R Wilson • Riak docs http://docs.basho.com/riak/latest • Hector Castro @ Big Ruby 2014 https://www.youtube.com/watch?v=-_3Us7Ystyg#aid=P-4heI_bFwo • Peter Bourgon @ Strangeloop 2014 https://www.youtube.com/watch?v=em9zLzM8O7c • Kyle Kingsbury's Jepsen blog series https://aphyr.com/tags/jepsen
  45. Thanks! Get in touch! @flohdot (Twitter, Github, etc) peerio.com •

    @peerio • github.com/PeerioTechnologies Hiring… soon! & we use Riak.