A Riak Query Tale

An introduction to the abundance of ways you can get data out of Riak.

Mathias Meyer

February 01, 2012

Transcript

  1. A Riak Query Tale
    Mathias Meyer, @roidrage
    NoSQL Cologne

  2. http://riakhandbook.com

  3. Riak
    Distributed Database
    Fault-Tolerant
    Content-Agnostic
    Scalable on Demand

  4. Querying Data

  5. Key-Value
    $ curl localhost:8098/riak/users/roidrage

  6. Links
    $ curl -v localhost:8098/riak/users/roidrage
    < HTTP/1.1 200 OK
    < Link: ; riaktag="friend"

  7. Links
    $ curl .../riak/users/roidrage/users,friend,_/

  8. Listing Keys
    $ curl .../riak/users?keys=true

  9. Don’t do that!

  10. Streaming Keys
    $ curl .../riak/users?keys=stream

  11. Loads all the keys.

  12. MapReduce
    Transform (Map)
    Aggregate (Reduce)

  13. Warning:
    JavaScript

  14. MapReduce
    riak.add("users").
         map("Riak.mapValues").
         run()

  15. MapReduce
    var nameLength = function(value) {
      var doc = Riak.mapValues(value)[0];
      return [doc.length];
    }

  16. MapReduce
    riak.add("users").
         map(nameLength).
         run()

  17. MapReduce
    riak.add("users").
         map(nameLength).
         reduce("Riak.reduceSum").
         run()

  18. MapReduce
    var average = function(values) {
      var total = values.reduce(function(sum, n) {
        return sum + n;
      }, 0);
      return [total / values.length];
    }

  19. MapReduce
    riak.add("users").
         map(nameLength).
         reduce(average).
         run()
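
The map and reduce phases above can be tried without a cluster. The following is a minimal local sketch, not Riak itself: the `Riak` object is a stand-in that assumes `Riak.mapValues` returns the stored values of an object, and the two sample users are hypothetical.

```javascript
// Local stand-in for Riak's built-ins (assumption: mapValues returns the
// stored values of an object, reduceSum adds up its inputs).
const Riak = {
  mapValues: (object) => object.values.map((v) => v.data),
  reduceSum: (values) => [values.reduce((sum, n) => sum + n, 0)],
};

// Map phase from the slides: emit the length of the stored name.
const nameLength = (value) => {
  const doc = Riak.mapValues(value)[0];
  return [doc.length];
};

// Reduce phase from the slides: average over all mapped lengths.
const average = (values) => {
  const total = values.reduce((sum, n) => sum + n, 0);
  return [total / values.length];
};

// Hypothetical sample objects standing in for the "users" bucket.
const users = [
  { key: "roidrage", values: [{ data: "Mathias Meyer" }] },
  { key: "till", values: [{ data: "Till" }] },
];

const mapped = users.flatMap(nameLength); // one length per user
const summed = Riak.reduceSum(mapped);    // total of all lengths
const averaged = average(mapped);         // mean length
console.log(mapped, summed, averaged);
```

In this sketch the reduce function sees all mapped values in a single call, which keeps the average easy to follow; a real cluster may hand a reduce function its inputs in several batches.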

  20. MapReduce
    riak.add("users").
         map(nameLength).
         reduce(average).
         run()
    Uh-Oh!

  21. MapReduce
    riak.add(["users", "roidrage"]).
         map(nameLength).
         reduce(average).
         run()
    Better!

  22. JavaScript M/R
    Breaks with Millions of Objects
    Uses External Libraries
    Serializes Data for JavaScript

  23. Warning:
    Erlang

  24. MapReduce
    riak.add('tweets').
         map({language: 'erlang',
              module: 'riak_kv_mapreduce',
              function: 'map_object_value'}).run()

  25. MapReduce
    $ riak attach
    > {ok, C} = riak:local_client().

  26. MapReduce
    C:mapred([{<<"users">>, <<"roidrage">>}],
      [{map, {modfun, riak_kv_mapreduce, map_object_value}, none, false},
       {reduce, {modfun, riak_kv_mapreduce, reduce_count_inputs}, none, true}]).

  27. MapReduce
    ExtractFirstName = fun(RObject, _, _) ->
      Value = riak_object:get_value(RObject),
      [FirstName, _] = re:split(Value, " "),
      [FirstName]
    end.

  28. MapReduce
    C:mapred([{<<"users">>, <<"roidrage">>}],
      [{map, {qfun, ExtractFirstName}, none, true}]).

  29. Erlang M/R
    Much more efficient than JavaScript
    No serialization
    No ad-hoc functions through HTTP

  30. Key-Filters
    Reduce MapReduce input
    Based on key matches

  31. Key-Filters
    riak.add({bucket: 'users', key_filters:
      [["matches", "^roid"]]})

  32. Key-Filters
    riak.add({bucket: 'users', key_filters:
      [["to_upper"],
       ["matches", "^ROID"]]})

  33. Key-Filters
    riak.add({bucket: 'users', key_filters:
      [["to_upper"],
       ["to_lower"],
       ["matches", "^roid"]]})

  34. Key-Filters
    riak.add({bucket: 'users', key_filters:
      [["to_upper"],
       ["ends_with", "RAGE"]]})

  35. Key-Filters
    riak.add({bucket: 'users', key_filters:
      [["and", [["string_to_int"],
                ["less_than", 10]],
               [["string_to_int"],
                ["greater_than", 5]]]]})
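
To make the filter chains above easier to follow, here is a simplified local model of how they could be evaluated: transformations rewrite the key, and the first predicate decides whether the key is kept. The function `keepKey` and the two lookup tables are hypothetical names for this sketch and not Riak's implementation; only the filter names are taken from the slides.

```javascript
// Transformations rewrite the key before a predicate looks at it.
const transforms = {
  to_upper: (key) => String(key).toUpperCase(),
  to_lower: (key) => String(key).toLowerCase(),
  string_to_int: (key) => parseInt(key, 10),
};

// Predicates decide whether the (possibly transformed) key is kept.
const predicates = {
  matches: (key, pattern) => new RegExp(pattern).test(key),
  ends_with: (key, suffix) => String(key).endsWith(suffix),
  less_than: (key, n) => key < n,
  greater_than: (key, n) => key > n,
  // "and" takes several complete filter chains; all of them must accept.
  and: (key, ...chains) => chains.every((chain) => keepKey(key, chain)),
};

// Hypothetical evaluator for one filter chain against one key.
function keepKey(key, filters) {
  let current = key;
  for (const [name, ...args] of filters) {
    if (name in transforms) {
      current = transforms[name](current);
    } else if (name in predicates) {
      return predicates[name](current, ...args);
    } else {
      throw new Error(`unknown filter: ${name}`);
    }
  }
  return true; // a chain without a predicate keeps every key
}

// Usage: every key is still inspected, only the matching ones survive.
const keys = ["roidrage", "till"];
const kept = keys.filter((k) => keepKey(k, [["to_upper"], ["matches", "^ROID"]]));
console.log(kept);
```

Note that even in this toy version every key has to be visited before it can be filtered.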

  36. Don't use key filters.

  37. Riak 2i
    Sorted Secondary Indexes
    Simple Reverse Lookups
    Maintained Manually
    Requires LevelDB

  38. Riak 2i
    curl -X PUT .../riak/users/roidrage -d @- \
         -H "Content-Type: text/plain" \
         -H "X-Riak-Index-firstname_bin: mathias" \
         -H "X-Riak-Index-lastname_bin: meyer"

  39. Riak 2i
    X-Riak-Index-firstname_bin: Mathias
    X-Riak-Index-lastname_bin: Meyer

  40. Riak 2i
    X-Riak-Index-firstname_bin: Mathias
    X-Riak-Index-lastname_bin: Meyer
    X-Riak-Index-age_int: 34

  41. Riak 2i
    X-Riak-Index-firstname_bin: Mathias
    X-Riak-Index-lastname_bin: Meyer
    X-Riak-Index-age_int: 34
    X-Riak-Index-topics_bin: nosql,cloud,operations

  42. Riak 2i
    # Match
    $ curl .../buckets/users/index/firstname_bin/Mathias

  43. Riak 2i
    # Range
    $ curl .../buckets/users/index/firstname_bin/Mathias/Till

  44. Riak 2i
    # Key (quoted so the shell does not expand $key)
    $ curl '.../buckets/users/index/$key/roidrage'
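
The three query shapes above (match, range, key) share one path layout. As a small sketch, a helper could assemble that path; `indexQueryPath` is a hypothetical name introduced here, and host and port are left out on purpose because the slides elide them.

```javascript
// Hypothetical helper: builds the path part of a 2i query as shown on the
// slides. One value gives an exact match, two values give a range.
function indexQueryPath(bucket, index, start, end) {
  const parts = ["buckets", bucket, "index", index, encodeURIComponent(start)];
  if (end !== undefined) {
    parts.push(encodeURIComponent(end)); // second value turns it into a range query
  }
  return "/" + parts.join("/");
}

console.log(indexQueryPath("users", "firstname_bin", "Mathias"));
console.log(indexQueryPath("users", "firstname_bin", "Mathias", "Till"));
console.log(indexQueryPath("users", "$key", "roidrage"));
```

Only the query values are URL-encoded in this sketch; bucket and index names are passed through unchanged so that the special `$key` index keeps its literal form.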

  45. Ordered Keys!
    (sort of)

  46. MapReduce
    riak.add({bucket: 'users',
              index: 'lastname_bin',
              key: 'mathias'}).
         map('Riak.mapValuesJson').run()

  47. Riak 2i
    No Multi-Index Queries
    Requires Extra Work in the App
    Returns only keys
    Document-partitioned

  48. Riak Search
    Full-Text Search
    Solr-ish Interface
    Integrates with Riak

  49. Riak Search
    curl -X PUT localhost:8098/riak/users -d @- \
         -H "Content-Type: application/json"
    {"props":{"precommit":
    [{"mod":"riak_search_kv_hook","fun":"precommit"}
    ]}}

  50. Indexing Riak Objects
    curl -X PUT .../riak/users/roidrage \
         -d "Mathias Meyer" \
         -H "Content-Type: text/plain"

  51. Solr-ish Interface
    curl .../solr/users/select?q=value:Mathias

  52. Riak Search
    value:Mathias OR value:Till
    value:Mathias AND value:Meyer
    value:Mat*
    value:[Mathias TO Till]

  53. MapReduce
    riak.addSearch("users", "value:Mathias").
         map("Riak.mapValues").run()

  54. Riak Search
    Full text search of structured data
    Term-partitioned
    Efficient for one term queries
    Multiple Interfaces
    No Anti-Entropy
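
The difference between a document-partitioned index (2i) and a term-partitioned index (Riak Search) can be illustrated with a toy model. Everything below is an assumption made for illustration: three partitions, no replication, a character-sum hash, and two hypothetical documents. It is not how Riak lays out data internally; it only shows why a one-term query is cheap when the index is partitioned by term.

```javascript
// Toy cluster: three partitions, no replication.
const PARTITIONS = 3;

// Deterministic toy hash (sum of character codes), for illustration only.
const partitionOf = (text) =>
  [...text].reduce((sum, ch) => sum + ch.charCodeAt(0), 0) % PARTITIONS;

// Hypothetical documents: key -> indexed terms.
const docs = {
  roidrage: ["mathias", "meyer"],
  till: ["till", "meyer"],
};

const emptyPartitions = () =>
  Array.from({ length: PARTITIONS }, () => new Map());

// Document-partitioned: index entries live on the document's partition.
function documentPartitioned(documents) {
  const partitions = emptyPartitions();
  for (const [key, terms] of Object.entries(documents)) {
    const index = partitions[partitionOf(key)];
    for (const term of terms) {
      index.set(term, [...(index.get(term) || []), key]);
    }
  }
  return partitions;
}

// Term-partitioned: all entries for a term live on the term's partition.
function termPartitioned(documents) {
  const partitions = emptyPartitions();
  for (const [key, terms] of Object.entries(documents)) {
    for (const term of terms) {
      const index = partitions[partitionOf(term)];
      index.set(term, [...(index.get(term) || []), key]);
    }
  }
  return partitions;
}

// A document-partitioned query cannot know where matches live, so in this
// model it asks every partition.
function queryDocumentPartitioned(partitions, term) {
  const keys = partitions.flatMap((index) => index.get(term) || []);
  return { keys: keys.sort(), partitionsAsked: partitions.length };
}

// A term-partitioned one-term query asks only the partition owning the term.
function queryTermPartitioned(partitions, term) {
  const keys = partitions[partitionOf(term)].get(term) || [];
  return { keys: [...keys].sort(), partitionsAsked: 1 };
}

const byDocument = queryDocumentPartitioned(documentPartitioned(docs), "meyer");
const byTerm = queryTermPartitioned(termPartitioned(docs), "meyer");
console.log(byDocument, byTerm);
```

Both layouts return the same keys in this sketch; they differ in how many partitions a single-term lookup has to contact.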

  55. Key Listings
    Never!
    Almost

  56. MapReduce
    Analytical Queries
    Fixed Dataset

  57. Key Filters
    Never!

  58. Riak 2i
    Simple Lookups and Range Queries
    Unbounded Queries
    Full Fault-Tolerance

  59. Riak Search
    Larger documents
    Full indexing
    Flexible queries
    Low frequency terms
