A Riak Query Tale

An introduction to the abundance of ways you can get data out of Riak.

Mathias Meyer

February 01, 2012

Transcript

  1. A Riak Query Tale. Mathias Meyer, @roidrage. NoSQL Cologne

  2. http://riakhandbook.com

  3. Riak: Distributed Database; Fault-Tolerant; Content-Agnostic; Scalable on Demand

  4. Querying Data

  5. Key-Value
     $ curl localhost:8098/riak/users/roidrage

  6. Links
     $ curl -v localhost:8098/riak/users/roidrage
     < HTTP/1.1 200 OK
     < Link: </riak/users/klimpong>; riaktag="friend"

  7. Links
     $ curl .../riak/users/roidrage/users,friend,_/
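
The `Link` header shown above carries the target bucket, key, and tag of each link. A minimal sketch of pulling those out on the client side, assuming the header format from the slide; `parseRiakLinks` is a hypothetical helper, not part of any Riak client library.

```javascript
// Hypothetical helper: parse a Riak Link header value into link objects.
// Assumes the format shown on the slide: </riak/BUCKET/KEY>; riaktag="TAG"
function parseRiakLinks(header) {
  const links = [];
  const pattern = /<\/riak\/([^/>]+)\/([^>]+)>;\s*riaktag="([^"]*)"/g;
  let match;
  while ((match = pattern.exec(header)) !== null) {
    links.push({ bucket: match[1], key: match[2], tag: match[3] });
  }
  return links;
}

const links = parseRiakLinks('</riak/users/klimpong>; riaktag="friend"');
console.log(links);
```

Entries without a key and a `riaktag` are skipped by the pattern, so only object links end up in the result.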

  8. Listing Keys
     $ curl .../riak/users?keys=true

  9. Don’t do that!

  10. Streaming Keys
      $ curl .../riak/users?keys=stream

  11. Avoid that!

  12. Loads all the keys.

  13. MapReduce

  14. MapReduce: Transform (Map), Aggregate (Reduce)

  15. Warning: JavaScript

  16. MapReduce
      riak.add("users").
        map("Riak.mapValues").
        run()
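
For orientation, a client call like the one above ends up as a JSON job description sent to Riak's HTTP MapReduce endpoint. The sketch below builds such a job for the same query. The field names follow the HTTP API as best understood here, and the snippet only constructs the body; it sends nothing.

```javascript
// Sketch of the JSON job a client would POST to Riak's /mapred endpoint.
// One map phase that calls the built-in Riak.mapValues on every object
// in the "users" bucket and keeps the results.
const job = {
  inputs: "users",
  query: [
    { map: { language: "javascript", name: "Riak.mapValues", keep: true } },
  ],
};

const body = JSON.stringify(job);
console.log(body);
```
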
  17. MapReduce
      var nameLength = function(value) {
        var doc = Riak.mapValues(value)[0];
        return [doc.length];
      }

  18. MapReduce
      riak.add("users").
        map(nameLength).
        run()

  19. MapReduce
      riak.add("users").
        map(nameLength).
        reduce("Riak.reduceSum").
        run()
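
To see what the two phases do to the data, here is a local simulation that never talks to Riak. The built-ins `Riak.mapValues` and `Riak.reduceSum` are replaced by plain stand-ins, and the sample values are made up for illustration.

```javascript
// Local stand-in for the map/reduce pipeline above; no Riak involved.
// Sample data is invented for illustration.
const users = { roidrage: "Mathias Meyer", klimpong: "Till" };

// Map: runs once per object and returns a list of results.
const nameLength = (value) => [value.length];

// Reduce: receives the combined list of map results and returns a list.
const reduceSum = (values) => [values.reduce((sum, n) => sum + n, 0)];

const mapped = Object.values(users).flatMap(nameLength);
const reduced = reduceSum(mapped);

console.log(mapped);  // [ 13, 4 ]
console.log(reduced); // [ 17 ]
```
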
  20. MapReduce
      var average = function(values) {
        // The reduce callback receives (accumulator, current value).
        var total = values.reduce(function(sum, n) {
          return sum + n;
        }, 0);
        return [(total / values.length)];
      }

  21. MapReduce
      riak.add("users").
        map(nameLength).
        reduce(average).
        run()

  22. MapReduce
      riak.add("users").
        map(nameLength).
        reduce(average).
        run()
      Uh-Oh!

  23. MapReduce
      riak.add(["users", "roidrage"]).
        map(nameLength).
        reduce(average).
        run()
      Better!
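
One caveat with a reduce like `average`: Riak is documented to possibly run a reduce function more than once, feeding earlier outputs back in together with new inputs, so an average of averages can come out wrong. A common workaround, sketched here locally with invented numbers, is to carry `[sum, count]` pairs through the reduce and divide only at the very end. The names `toPair`, `sumAndCount`, and `mean` are hypothetical.

```javascript
// Map emits a [sum, count] pair per object instead of a bare length.
const toPair = (length) => [[length, 1]];

// Reduce merges pairs; feeding its own output back in gives the same result.
const sumAndCount = (pairs) => [
  pairs.reduce(([s, c], [ps, pc]) => [s + ps, c + pc], [0, 0]),
];

const lengths = [13, 4, 7]; // invented sample name lengths
const pairs = lengths.flatMap(toPair);

// One pass over everything...
const [onePass] = sumAndCount(pairs);
// ...versus two batches whose results are reduced again.
const [batched] = sumAndCount([
  ...sumAndCount(pairs.slice(0, 2)),
  ...sumAndCount(pairs.slice(2)),
]);

const mean = ([sum, count]) => sum / count;
console.log(mean(onePass), mean(batched)); // 8 8
```
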
  24. JavaScript M/R: Breaks with Millions of Objects; Uses External Libraries; Serializes Data for JavaScript

  25. Warning: Erlang

  26. MapReduce
      riak.add('tweets').
        map({language: 'erlang',
             module: 'riak_kv_mapreduce',
             function: 'map_object_value'}).run()

  27. MapReduce
      $ riak attach
      > {ok, C} = riak:local_client().

  28. MapReduce
      C:mapred([{<<"users">>, <<"roidrage">>}],
               [{map, {modfun, riak_kv_mapreduce, map_object_value}, none, false},
                {reduce, {modfun, riak_kv_mapreduce, reduce_count_inputs}, none, true}]).

  29. MapReduce
      ExtractFirstName = fun(RObject, _, _) ->
        Value = riak_object:get_value(RObject),
        [FirstName, _] = re:split(Value, " "),
        [FirstName]
      end.

  30. MapReduce
      C:mapred([{<<"users">>, <<"roidrage">>}],
               [{map, {qfun, ExtractFirstName}, none, true}]).

  31. Erlang M/R: Much more efficient than JavaScript; No serialization; No ad-hoc functions through HTTP

  32. Key-Filters: Reduce MapReduce input; Based on key matches

  33. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["matches", "^roid"]]})

  34. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["to_upper"],
         ["matches", "^ROID"]]})

  35. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["to_upper"],
         ["to_lower"],
         ["matches", "^roid"]]})

  36. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["to_upper"],
         ["ends_with", "RAGE"]]})

  37. Key-Filters
      riak.add({bucket: 'users', key_filters:
        [["and", [["string_to_int"],
                  ["less_than", 10]],
                 [["string_to_int"],
                  ["greater_than", 5]]]]})

  38. Don't use key filters.
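
A rough local model of a key-filter chain may help explain why the advice is this blunt: the filters have to be applied to every key in the bucket, which implies listing all of them first. The sketch assumes transforms run left to right and the first predicate decides. The keys are invented, and `applyKeyFilters` is a hypothetical illustration, not Riak's implementation.

```javascript
// Hypothetical local model of a key-filter chain; not Riak's implementation.
const transforms = {
  to_upper: (key) => key.toUpperCase(),
  to_lower: (key) => key.toLowerCase(),
  string_to_int: (key) => parseInt(key, 10),
};
const predicates = {
  matches: (key, pattern) => new RegExp(pattern).test(key),
  ends_with: (key, suffix) => key.endsWith(suffix),
  less_than: (key, n) => key < n,
  greater_than: (key, n) => key > n,
};

function applyKeyFilters(keys, filters) {
  // Note: every key in the bucket is visited.
  return keys.filter((original) => {
    let key = original;
    for (const [name, arg] of filters) {
      if (transforms[name]) key = transforms[name](key);
      else return predicates[name](key, arg);
    }
    return true;
  });
}

const keys = ["roidrage", "klimpong", "Roidster"];
const matched = applyKeyFilters(keys, [["to_upper"], ["matches", "^ROID"]]);
console.log(matched); // [ 'roidrage', 'Roidster' ]
```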

  39. Riak 2i: Sorted Secondary Indexes; Simple Reverse Lookups; Maintained Manually; Requires LevelDB

  40. Riak 2i
      curl -X PUT .../riak/users/roidrage -d @- \
           -H "Content-Type: text/plain" \
           -H "X-Riak-Index-firstname_bin: mathias" \
           -H "X-Riak-Index-lastname_bin: meyer"

  41. Riak 2i
      X-Riak-Index-firstname_bin: Mathias
      X-Riak-Index-lastname_bin: Meyer

  42. Riak 2i
      X-Riak-Index-firstname_bin: Mathias
      X-Riak-Index-lastname_bin: Meyer
      X-Riak-Index-age_int: 34

  43. Riak 2i
      X-Riak-Index-firstname_bin: Mathias
      X-Riak-Index-lastname_bin: Meyer
      X-Riak-Index-age_int: 34
      X-Riak-Index-topics_bin: nosql,cloud,operations
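
The header names follow a pattern: `X-Riak-Index-`, then the field name, then a `_bin` or `_int` suffix for the value type. A small sketch of deriving them from a plain object, assuming integers map to `_int` and everything else to `_bin`, and that multiple values are comma-joined as on the `topics_bin` slide. `indexHeaders` is a hypothetical helper.

```javascript
// Hypothetical helper: derive 2i headers from a plain object.
// Integers get the _int suffix, everything else _bin; arrays are comma-joined.
function indexHeaders(fields) {
  const headers = {};
  for (const [name, value] of Object.entries(fields)) {
    const values = Array.isArray(value) ? value : [value];
    const suffix = values.every((v) => Number.isInteger(v)) ? "int" : "bin";
    headers[`X-Riak-Index-${name}_${suffix}`] = values.join(",");
  }
  return headers;
}

const indexes = indexHeaders({
  firstname: "Mathias",
  lastname: "Meyer",
  age: 34,
  topics: ["nosql", "cloud", "operations"],
});
console.log(indexes);
```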

  44. Riak 2i
      # Match
      $ curl .../buckets/users/index/firstname_bin/Mathias

  45. Riak 2i
      # Range
      $ curl .../buckets/users/index/firstname_bin/Mathias/Till

  46. Riak 2i
      # Key (quoted so the shell does not expand $key)
      $ curl '.../buckets/users/index/$key/roidrage'

  47. Ordered Keys! (sort of)
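
Because index entries are kept sorted, match and range lookups can be pictured as scans over a sorted list of (index value, object key) pairs. The local model below uses invented data and assumes ranges are inclusive on both ends. It illustrates the idea only, not the on-disk layout.

```javascript
// Local model of a sorted secondary index: [indexValue, objectKey] pairs.
// All entries are invented for illustration.
const firstnameIndex = [
  ["Till", "klimpong"],
  ["Anna", "anna"],
  ["Ulf", "ulf"],
  ["Mathias", "roidrage"],
  ["Sean", "sean"],
].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));

// Range lookup, assumed inclusive on both ends; returns only keys.
const range = (index, from, to) =>
  index.filter(([value]) => value >= from && value <= to).map(([, key]) => key);

// An exact match is a range whose ends are the same value.
const match = (index, value) => range(index, value, value);

console.log(match(firstnameIndex, "Mathias"));         // [ 'roidrage' ]
console.log(range(firstnameIndex, "Mathias", "Till")); // [ 'roidrage', 'sean', 'klimpong' ]
```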

  48. MapReduce
      riak.add({bucket: 'users',
                index: 'lastname_bin',
                key: 'mathias'}).
        map('Riak.mapValuesJson').run()

  49. Riak 2i: No Multi-Index Queries; Requires Extra Work in the App; Returns only keys; Document-partitioned

  50. Riak Search: Full-Text Search; Solr-ish Interface; Integrates with Riak

  51. Riak Search
      curl -X PUT localhost:8098/riak/users -d @- \
           -H "Content-Type: application/json"
      {"props":{"precommit":
        [{"mod":"riak_search_kv_hook","fun":"precommit"}]}}

  52. Indexing Riak Objects
      curl -X PUT .../riak/users/roidrage \
           -d "Mathias Meyer" \
           -H "Content-Type: text/plain"

  53. Solr-ish Interface
      curl .../solr/users/select?q=value:Mathias

  54. Riak Search
      value:Mathias OR value:Till
      value:Mathias AND value:Meyer
      value:Mat*
      value:[Mathias TO Till]
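
Queries with spaces or brackets need URL encoding before they go into the `q` parameter of the Solr-ish endpoint. A minimal sketch, assuming a local node on port 8098 as in the earlier slides; `searchUrl` is a hypothetical helper.

```javascript
// Hypothetical helper: build a Solr-ish select URL for a Riak Search index.
// The base URL is a placeholder for a local node.
function searchUrl(base, index, query) {
  return `${base}/solr/${index}/select?q=${encodeURIComponent(query)}`;
}

const url = searchUrl("http://localhost:8098", "users", "value:[Mathias TO Till]");
console.log(url);
// http://localhost:8098/solr/users/select?q=value%3A%5BMathias%20TO%20Till%5D
```
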
  55. MapReduce
      riak.addSearch("users", "value:Mathias").
        map("Riak.mapValues").run()

  56. Riak Search: Full text search of structured data; Term-partitioned; Efficient for one term queries; Multiple Interfaces; No Anti-Entropy
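
"Term-partitioned" contrasts with the "document-partitioned" layout of 2i. The toy model below, with invented data and three partitions, shows the difference in how many partitions a one-term query has to ask. It is a conceptual sketch only: the hash is a stand-in for consistent hashing, and real 2i queries ask a covering set of partitions, simplified here to all of them.

```javascript
const PARTITIONS = 3;

// Toy hash; stands in for consistent hashing.
const hash = (s) =>
  [...s].reduce((h, ch) => (h * 31 + ch.charCodeAt(0)) >>> 0, 7) % PARTITIONS;

// Invented documents: key -> text.
const docs = { roidrage: "mathias meyer", klimpong: "till", anna: "anna" };

// Document-partitioned: index entries live with the document.
const byDocument = Array.from({ length: PARTITIONS }, () => new Map());
// Term-partitioned: index entries live where the term hashes to.
const byTerm = Array.from({ length: PARTITIONS }, () => new Map());

for (const [key, text] of Object.entries(docs)) {
  for (const term of text.split(" ")) {
    for (const [index, owner] of [[byDocument, hash(key)], [byTerm, hash(term)]]) {
      const postings = index[owner].get(term) || [];
      index[owner].set(term, postings.concat(key));
    }
  }
}

function queryByDocument(term) {
  // Any partition may hold matching documents, so all are asked.
  const keys = byDocument.flatMap((p) => p.get(term) || []);
  return { keys, partitionsAsked: PARTITIONS };
}

function queryByTerm(term) {
  // Only the partition owning the term is asked.
  return { keys: byTerm[hash(term)].get(term) || [], partitionsAsked: 1 };
}

console.log(queryByDocument("mathias"), queryByTerm("mathias"));
```
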
  57. When?

  58. Key Listings: Never! Almost.

  59. MapReduce: Analytical Queries; Fixed Dataset

  60. Key Filters: Never!

  61. Riak 2i: Simple Lookups and Range Queries; Unbounded Queries; Full Fault-Tolerance

  62. Riak Search: Larger documents; Full indexing; Flexible queries; Low frequency terms

  63. Questions?