Learn You Some Riak


An intro to Riak talk given at the Atlanta Ruby Users Group on April 11, 2012.

The talk covered simple Riak usage with the Ruby driver, a high-level overview of how Riak works, reasons you might want to use Riak in your application, and a demo of a toy app that pulls tweets off the Twitter Firehose for analysis with Riak MapReduce queries.


Will Farrington

April 11, 2012

Transcript

  1. Learn You Some Riak

  2. @wfarr github.com/wfarr

  3. Learn You Some Riak

  4. Story Time

  5. None
  6. TL;DR

  7. TL;DR Disclaimer: You should totally read this paper as soon as you get home and bask in its glory.
  8. Dynamo

  9. Buckets, Keys, Values

  10. Fault-Tolerant

  11. Masterless

  12. What is this magic?

  13. CAP Theorem

  14. Consistency

  15. All Nodes See Data at the Same Time

  16. Availability

  17. Every DB Request Gets a Response for Success or Failure, Guaranteed
  18. Partition Tolerance

  19. Your DB keeps working despite arbitrary message loss or failure of a part of the system
  20. All Good for Different Things

  21. You Only Get To Have 2 of the 3

  22. Dynamo Chooses Availability and Partition Tolerance

  23. What is Riak?

  24. Riak is a Dynamo

  25. Bucket-Key: Value

  26. Entries: abc → object, bcd → object, cde → object, def → object. Logs: abc → object, bcd → object, cde → object, def → object.
  27. The Ring

  28. None
  29. VNodes

  30. None
  31. Querying

  32. Map Reduce

  33. mr = Riak::MapReduce.new(client)

  34. mr.filter("tweets") do matches "^testeroftests-" end

  35. fn = "function (v) { return [ JSON.parse(v.values[0].data).text ]; }"

  36. mr.map(fn, :keep => true)

  37. mr.run

  38. Search

  39. Full-text Search

  40. Lucene Syntax

  41. client.search "tweets", "retweeted:true"

  42. Supports Manual Indexing

  43. client.index "tweets", { id: "abcde", text: "#webscale" }

  44. Supports Auto Indexing

  45. t = client['tweets']; t.is_indexed?; t.enable_index!; t.disable_index!

  46. Queries Nodes Intelligently

  47. Tradeoffs

  48. Consistency Availability Partition Tolerance

  49. Consistency Availability Partition Tolerance

  50. The Good

  51. Riak gets to be masterless

  52. Riak gets to be fault-tolerant

  53. Riak gets to be easy to scale

  54. Riak gets to be easy to manage

  55. The Bad

  56. Riak is “only” eventually consistent

  57. Understand Your Tradeoffs

  58. Your Tradeoffs Might Not Be Someone Else’s

  59. There is no silver bullet

  60. None
  61. “The most boring database you’ll ever run in production.” @pharkmillups

  62. Boring Makes Devs Happy

  63. Boring Makes Ops Happy

  64. Boring is Awesome

  65. Devs Ops

  66. Ops Devs

  67. Example Use Cases

  68. Session Storage

  69. Private S3-like Storage

  70. Huge Amounts of Rich Media

  71. Caching Layer

  72. Simple Horizontal Scaling

  73. Logging Systems

  74. Maybe Not Use Cases

  75. Realtime

  76. Replacing Stuff That Isn’t Broken

  77. You Can Use Multiple Databases!

  78. Who Already Uses Riak?

  79. None
  80. None
  81. Demo

  82. Let’s Pretend...

  83. None
  84. Single Server

  85. Nope

  86. Sharding

  87. Nope

  88. “building a distributed system ass first” @jnewland

  89. Go Horizontal

  90. None
  91. Madness?

  92. Nope, just big data

  93. Scenario

  94. Questions?

  95. Thanks! Will Farrington speakerdeck.com/u/wfarr github.com/wfarr/tweetscale
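
The Ring and VNodes slides (27 to 30) lost their diagrams in this transcript, so here is a minimal sketch of the idea in plain Ruby. It is a toy model, not Riak's implementation: the `ToyRing` class, the node names, and the round-robin ownership are all made up for illustration, and hashing the string `"bucket/key"` is a simplification of how Riak builds its hash input. What it keeps from Riak is the shape of the scheme: a 160-bit SHA-1 ring cut into equal partitions (64 by default), one vnode per partition, and replicas (3 by default) written to consecutive partitions.

```ruby
require "digest/sha1"

# Hypothetical, simplified model of a Dynamo-style ring; not Riak's real code.
class ToyRing
  RING_SIZE = 2**160 # SHA-1 produces 160-bit hashes

  attr_reader :partitions

  # nodes:      names of the physical nodes (made up for illustration)
  # partitions: number of equal slices of the ring, one vnode per slice
  def initialize(nodes, partitions: 64)
    @nodes = nodes
    @partitions = partitions
    @partition_size = RING_SIZE / partitions
  end

  # Physical node that owns a partition. Assumption: vnodes are dealt out
  # round-robin, which is simpler than a real claim algorithm.
  def owner(partition)
    @nodes[partition % @nodes.size]
  end

  # Partition index that a bucket/key pair hashes into.
  def partition_for(bucket, key)
    Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16) / @partition_size
  end

  # Nodes holding the n replicas: the key's own partition plus the next
  # n - 1 partitions clockwise around the ring.
  def preference_list(bucket, key, n: 3)
    first = partition_for(bucket, key)
    (0...n).map { |i| owner((first + i) % @partitions) }
  end
end

ring = ToyRing.new(%w[node-a node-b node-c node-d])
replicas = ring.preference_list("tweets", "abcde")
puts replicas.inspect
```

Because any node can compute the same preference list from the key alone, no master has to be consulted to find the data, which is what lets the cluster stay masterless and keep answering requests when one of the replicas is unreachable.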