A Riak Query Tale

An introduction to the abundance of ways you can get data out of Riak.


Mathias Meyer

February 01, 2012


  1. A Riak Query Tale. Mathias Meyer, @roidrage. NoSQL Cologne

  2. http://riakhandbook.com

  3. Riak: Distributed Database, Fault-Tolerant, Content-Agnostic, Scalable on Demand

  4. Querying Data

  5. Key-Value
     $ curl localhost:8098/riak/users/roidrage

  6. Links
     $ curl -v localhost:8098/riak/users/roidrage
     < HTTP/1.1 200 OK
     < Link: </riak/users/klimpong>; riaktag="friend"

  7. Links
     $ curl .../riak/users/roidrage/users,friend,_/
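The link-walking URL above ends in one `bucket,tag,keep` segment, where `_` acts as a wildcard. A minimal sketch of assembling such a path follows; `linkWalkPath` and its step objects are hypothetical helpers for illustration, not part of any Riak client.

```javascript
// Build a link-walking path from a starting object and a list of steps.
// Each step is {bucket, tag, keep}; missing fields become the "_" wildcard.
// linkWalkPath is a hypothetical helper, not a Riak client function.
function linkWalkPath(bucket, key, steps) {
  const segments = steps.map(function (step) {
    const b = step.bucket || "_";
    const t = step.tag || "_";
    // keep: true -> "1", false -> "0", undefined -> "_" (leave it to the server)
    const k = step.keep === undefined ? "_" : (step.keep ? "1" : "0");
    return [b, t, k].map(function (part) {
      return encodeURIComponent(part);
    }).join(",");
  });
  return "/riak/" + encodeURIComponent(bucket) + "/" +
    encodeURIComponent(key) + "/" + segments.join("/") + "/";
}

// Follow "friend" links from users/roidrage into the users bucket.
console.log(linkWalkPath("users", "roidrage", [{ bucket: "users", tag: "friend" }]));
```

Adding more step objects chains further segments onto the path, one per hop.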

  8. Listing Keys
     $ curl .../riak/users?keys=true

  9. Don’t do that!

  10. Streaming Keys
      $ curl .../riak/users?keys=stream

  11. Avoid that!

  12. Loads all the keys.

  13. MapReduce

  14. MapReduce: Transform (Map), Aggregate (Reduce)

  15. Warning: JavaScript

  16. MapReduce
      riak.add("users").
        map("Riak.mapValues").

  17. MapReduce
      var nameLength = function(value) {
        var doc = Riak.mapValues(value)[0];
        return [doc.length];
      }

  18. MapReduce
      riak.add("users").
        map(nameLength).

  19. MapReduce
      riak.add("users").
        map(nameLength).
        reduce("Riak.reduceSum").
        run()

  20. MapReduce
      var average = function(values) {
        // The reduce callback receives (accumulator, current value).
        var total = values.reduce(function(sum, n) {
          return sum + n;
        }, 0);
        return [total / values.length];
      }

  21. MapReduce
      riak.add("users").
        map(nameLength).
        reduce(average).
        run()

  22. MapReduce
      riak.add("users").
        map(nameLength).
        reduce(average).
        run()
      Uh-Oh!

  23. MapReduce
      riak.add(["users", "roidrage"]).
        map(nameLength).
        reduce(average).
        run()
      Better!

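One caveat with a reduce like `average`: Riak may run a reduce function in several batches and feed earlier results back in as inputs, so an average of averages can drift from the true value. The sketch below simulates that locally with made-up lengths and contrasts it with carrying `[sum, count]` pairs. It is one possible workaround under that assumption, not the approach taken in the slides, and `sumPairs` and `finish` are hypothetical names.

```javascript
// The reduce from the slides, kept naive on purpose.
function naiveAverage(values) {
  const total = values.reduce(function (sum, n) { return sum + n; }, 0);
  return [total / values.length];
}

// Alternative: carry [sum, count] pairs so results can be re-reduced safely.
function sumPairs(pairs) {
  const total = pairs.reduce(function (acc, pair) {
    return [acc[0] + pair[0], acc[1] + pair[1]];
  }, [0, 0]);
  return [total];
}

// Final step, run once on the client: turn the single pair into an average.
function finish(pairs) {
  const sum = pairs[0][0];
  const count = pairs[0][1];
  return count === 0 ? null : sum / count;
}

const lengths = [13, 4, 3]; // made-up name lengths
const pairs = lengths.map(function (n) { return [n, 1]; });

// Simulate two reduce batches: the first two inputs, then that result plus the third.
const naiveBatched =
  naiveAverage(naiveAverage(lengths.slice(0, 2)).concat(lengths.slice(2)))[0];
const pairOnePass = finish(sumPairs(pairs));
const pairBatched =
  finish(sumPairs(sumPairs(pairs.slice(0, 2)).concat(pairs.slice(2))));

console.log(naiveBatched);                // 5.75, not the true average of 20/3
console.log(pairOnePass === pairBatched); // true
```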
  24. JavaScript M/R: Breaks with Millions of Objects, Uses External Libraries, Serializes Data for JavaScript

  25. Warning: Erlang

  26. MapReduce
      riak.add('tweets').
        map({language: 'erlang',
             module: 'riak_kv_mapreduce',
             function: 'map_object_value'}).run()

  27. MapReduce
      $ riak attach
      > {ok, C} = riak:local_client().

  28. MapReduce
      C:mapred([{<<"users">>, <<"roidrage">>}],
               [{map, {modfun, riak_kv_mapreduce, map_object_value}, none, false},
                {reduce, {modfun, riak_kv_mapreduce, reduce_count_inputs}, none, true}]).

  29. MapReduce
      ExtractFirstName = fun(RObject, _, _) ->
        Value = riak_object:get_value(RObject),
        [FirstName, _] = re:split(Value, " "),
        [FirstName]
      end.

  30. MapReduce
      C:mapred([{<<"users">>, <<"roidrage">>}],
               [{map, {qfun, ExtractFirstName}, none, true}]).

  31. Erlang M/R: Much more efficient than JavaScript, No serialization, No ad-hoc functions through HTTP

  32. Key-Filters: Reduce MapReduce input, Based on key matches

  33. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["matches", "^roid"]]})

  34. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["to_upper"],
         ["matches", "^ROID"]]})

  35. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["to_upper"],
         ["to_lower"],
         ["matches", "^roid"]]})

  36. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["to_upper"],
         ["ends_with", "RAGE"]]})

  37. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["and", [["string_to_int"],
                  ["less_than", 10]],
                 [["string_to_int"],
                  ["greater_than", 5]]]]})

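The filter lists above read as small pipelines: transform steps rewrite the key and predicate steps accept or reject it. The sketch below is a local simulation of that reading for the filters shown on these slides only. It is an illustration, not Riak's implementation; `keyMatches` and `filterKeys` are hypothetical names, and JavaScript regular expressions stand in for whatever regex engine Riak uses.

```javascript
// Transform filters rewrite the key; predicate filters accept or reject it.
const transforms = {
  to_upper: function (key) { return String(key).toUpperCase(); },
  to_lower: function (key) { return String(key).toLowerCase(); },
  string_to_int: function (key) { return parseInt(key, 10); }
};
const predicates = {
  matches: function (key, pattern) { return new RegExp(pattern).test(String(key)); },
  ends_with: function (key, suffix) { return String(key).endsWith(suffix); },
  less_than: function (key, n) { return key < n; },
  greater_than: function (key, n) { return key > n; }
};

function has(table, name) {
  return Object.prototype.hasOwnProperty.call(table, name);
}

// Run one filter list against a key, left to right.
function keyMatches(key, filters) {
  let current = key;
  for (const [name, ...args] of filters) {
    if (name === "and") {
      // Each argument is its own filter list, applied to the current key.
      const all = args.every(function (sub) { return keyMatches(current, sub); });
      if (!all) return false;
    } else if (has(transforms, name)) {
      current = transforms[name](current);
    } else if (has(predicates, name)) {
      if (!predicates[name](current, ...args)) return false;
    } else {
      throw new Error("filter not covered by this sketch: " + name);
    }
  }
  return true;
}

function filterKeys(keys, filters) {
  return keys.filter(function (key) { return keyMatches(key, filters); });
}

console.log(filterKeys(["roidrage", "klimpong"], [["to_upper"], ["matches", "^ROID"]]));
console.log(filterKeys(["3", "7", "12"], [
  ["and", [["string_to_int"], ["less_than", 10]],
          [["string_to_int"], ["greater_than", 5]]]
]));
```

Note that every key in the bucket still has to be visited before any filter can run.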
  38. Don't use key filters.

  39. Riak 2i: Sorted Secondary Indexes, Simple Reverse Lookups, Maintained Manually, Requires LevelDB

  40. Riak 2i
      curl -X PUT .../riak/users/roidrage -d @- \
           -H "Content-Type: text/plain" \
           -H "X-Riak-Index-firstname_bin: mathias" \
           -H "X-Riak-Index-lastname_bin: meyer"

  41. Riak 2i
      X-Riak-Index-firstname_bin: Mathias
      X-Riak-Index-lastname_bin: Meyer

  42. Riak 2i
      X-Riak-Index-firstname_bin: Mathias
      X-Riak-Index-lastname_bin: Meyer
      X-Riak-Index-age_int: 34

  43. Riak 2i
      X-Riak-Index-firstname_bin: Mathias
      X-Riak-Index-lastname_bin: Meyer
      X-Riak-Index-age_int: 34
      X-Riak-Index-topics_bin: nosql,cloud,operations

  44. Riak 2i
      # Match
      $ curl .../buckets/users/index/firstname_bin/Mathias

  45. Riak 2i
      # Range
      $ curl .../buckets/users/index/firstname_bin/Mathias/Till

  46. Riak 2i
      # Key
      $ curl .../buckets/users/index/$key/roidrage

  47. Ordered Keys! (sort of)
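To make the match and range queries above concrete, here is a small in-memory model of a sorted index: entries are ordered by index value, and a query returns only object keys. It is a sketch under assumptions: the sample entries other than `roidrage` are made up, the range is treated as inclusive at both ends, and nothing here claims to describe the order in which Riak itself returns results.

```javascript
// Hypothetical index entries: one {value, key} pair per indexed object.
const entries = [
  { value: "Till", key: "till" },        // made-up key
  { value: "Mathias", key: "roidrage" },
  { value: "Zoe", key: "zoe" },          // made-up entry
  { value: "Anna", key: "anna" }         // made-up entry
];

function compare(a, b) {
  return a < b ? -1 : (a > b ? 1 : 0);
}

// Keep entries sorted by index value, then by key.
function buildIndex(list) {
  return list.slice().sort(function (x, y) {
    return compare(x.value, y.value) || compare(x.key, y.key);
  });
}

// Range: keys of all entries with from <= value <= to (assumed inclusive).
function rangeQuery(index, from, to) {
  return index
    .filter(function (e) { return e.value >= from && e.value <= to; })
    .map(function (e) { return e.key; });
}

// Exact match is a range whose two ends are the same value.
function matchQuery(index, value) {
  return rangeQuery(index, value, value);
}

const firstnameIndex = buildIndex(entries);
console.log(matchQuery(firstnameIndex, "Mathias"));         // [ 'roidrage' ]
console.log(rangeQuery(firstnameIndex, "Mathias", "Till")); // [ 'roidrage', 'till' ]
```

Binary index values compare as plain strings here, so `mathias` and `Mathias` would be different entries.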

  48. MapReduce
      riak.add({bucket: 'users',
                index: 'lastname_bin',
                key: 'mathias'}).
        map('Riak.mapValuesJson').run()

  49. Riak 2i: No Multi-Index Queries, Requires Extra Work in the App, Returns only keys, Document-partitioned

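Since a query covers a single index and returns only keys, combining two conditions falls to the application. One way to do that extra work is to run one query per index and intersect the key lists before fetching the objects. The sketch below shows only the intersection; `intersectKeys` is a hypothetical helper and the key lists are made-up stand-ins for query results.

```javascript
// Intersect several key lists, e.g. the results of separate single-index queries.
// intersectKeys is a hypothetical helper, not part of any Riak client.
function intersectKeys(keyLists) {
  if (keyLists.length === 0) return [];
  const first = keyLists[0];
  const restSets = keyLists.slice(1).map(function (list) {
    return new Set(list);
  });
  // Deduplicate the first list, then keep keys present in every other list.
  return Array.from(new Set(first)).filter(function (key) {
    return restSets.every(function (set) { return set.has(key); });
  });
}

// Made-up results for firstname_bin = Mathias and lastname_bin = Meyer.
const byFirstname = ["roidrage", "another-mathias"];
const byLastname = ["some-meyer", "roidrage"];

console.log(intersectKeys([byFirstname, byLastname])); // [ 'roidrage' ]
```

The application would then fetch each remaining key with a plain key-value read.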
  50. Riak Search: Full-Text Search, Solr-ish Interface, Integrates with Riak

  51. Riak Search
      curl -X PUT localhost:8098/riak/users -d @- \
           -H "Content-Type: application/json"
      {"props":{"precommit":[{"mod":"riak_search_kv_hook","fun":"precommit"}]}}

  52. Indexing Riak Objects
      curl -X PUT .../riak/users/roidrage \
           -d "Mathias Meyer" \
           -H "Content-Type: text/plain"

  53. Solr-ish Interface
      curl .../solr/users/select?q=value:Mathias

  54. Riak Search
      value:Mathias OR value:Till
      value:Mathias AND value:Meyer
      value:Mat*
      value:[Mathias TO Till]

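Queries like the ones above contain spaces, colons and brackets, so they need URL-encoding before they go into the `q` parameter of the Solr-ish interface. A minimal sketch, where `solrSelectPath` is a hypothetical helper and the host part of the URL is left out:

```javascript
// Build the path and query string for a search against the Solr-ish interface.
// solrSelectPath is a hypothetical helper, not part of any Riak client.
function solrSelectPath(index, query) {
  return "/solr/" + encodeURIComponent(index) +
    "/select?q=" + encodeURIComponent(query);
}

console.log(solrSelectPath("users", "value:Mathias AND value:Meyer"));
console.log(solrSelectPath("users", "value:[Mathias TO Till]"));
```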
  55. MapReduce
      riak.addSearch("users", "value:Mathias").
        map("Riak.mapValues").run()

  56. Riak Search: Full text search of structured data, Term-partitioned, Efficient for one term queries, Multiple Interfaces, No Anti-Entropy

  57. When?

  58. Key Listings: Never! Almost

  59. MapReduce: Analytical Queries, Fixed Dataset

  60. Key Filters: Never!

  61. Riak 2i: Simple Lookups and Range Queries, Unbounded Queries, Full

  62. Riak Search: Larger documents, Full indexing, Flexible queries, Low frequency

  63. Questions?