Slide 1

Slide 1 text

A Riak Query Tale
Mathias Meyer, @roidrage
NoSQL Cologne

Slide 2

Slide 2 text

http://riakhandbook.com

Slide 3

Slide 3 text

Riak
Distributed Database
Fault-Tolerant
Content-Agnostic
Scalable on Demand

Slide 4

Slide 4 text

Querying Data

Slide 5

Slide 5 text

Key-Value
$ curl localhost:8098/riak/users/roidrage

Slide 6

Slide 6 text

Links
$ curl -v localhost:8098/riak/users/roidrage
< HTTP/1.1 200 OK
< Link: ; riaktag="friend"

Slide 7

Slide 7 text

Links
$ curl .../riak/users/roidrage/users,friend,_/
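The link-walking path above is a `bucket,tag,keep` triple appended to the object's URL, with `_` acting as a wildcard. As a rough sketch, an application could assemble such URLs like this. `linkWalkUrl` and its step objects are hypothetical helpers, not part of any Riak client, and the host is assumed to be the one from the Key-Value slide.

```javascript
// Hypothetical helper (not a Riak client API): builds a link-walking URL.
// Each step is {bucket, tag, keep}; any field left out becomes the "_" wildcard.
function linkWalkUrl(base, bucket, key, steps) {
  var wildcard = function (v) { return v === undefined ? "_" : String(v); };
  var segments = steps.map(function (step) {
    return [wildcard(step.bucket), wildcard(step.tag), wildcard(step.keep)].join(",");
  });
  return [base, "riak", bucket, key].concat(segments).join("/") + "/";
}

// Follow "friend" links from users/roidrage into the users bucket.
var friendsUrl = linkWalkUrl("http://localhost:8098", "users", "roidrage",
                             [{ bucket: "users", tag: "friend" }]);
console.log(friendsUrl);
```

Passing more than one step would chain several hops in a single request path.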

Slide 8

Slide 8 text

Listing Keys
$ curl .../riak/users?keys=true

Slide 9

Slide 9 text

Don’t do that!

Slide 10

Slide 10 text

Streaming Keys
$ curl .../riak/users?keys=stream

Slide 11

Slide 11 text

Avoid that!

Slide 12

Slide 12 text

Loads all the keys.

Slide 13

Slide 13 text

MapReduce

Slide 14

Slide 14 text

MapReduce
Transform (Map)
Aggregate (Reduce)
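To make the two phases concrete, here is a small local model in plain JavaScript. It does not talk to Riak; `runJob`, the sample values, and the two phase functions are invented for illustration. Map is applied to every input and returns a list, and reduce folds the combined list into a result list.

```javascript
// Local model of a two-phase job. Nothing here is a Riak API.
function runJob(inputs, mapFn, reduceFn) {
  // Map phase: each input yields a list of results; the lists are concatenated.
  var mapped = inputs.reduce(function (acc, value) {
    return acc.concat(mapFn(value));
  }, []);
  // Reduce phase: receives the mapped list and returns a (usually shorter) list.
  return reduceFn(mapped);
}

// Transform: one value in, a one-element list with its length out.
var lengthOf = function (value) { return [value.length]; };
// Aggregate: sum all numbers into a one-element list.
var sumAll = function (values) {
  return [values.reduce(function (total, n) { return total + n; }, 0)];
};

var sampleUsers = ["Mathias Meyer", "Till"]; // made-up sample values
var totalLength = runJob(sampleUsers, lengthOf, sumAll);
console.log(totalLength);
```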

Slide 15

Slide 15 text

Warning: JavaScript

Slide 16

Slide 16 text

MapReduce
riak.add("users").
     map("Riak.mapValues").
     run()

Slide 17

Slide 17 text

MapReduce
var nameLength = function(value) {
  var doc = Riak.mapValues(value)[0];
  return [doc.length];
}

Slide 18

Slide 18 text

MapReduce
riak.add("users").
     map(nameLength).
     run()

Slide 19

Slide 19 text

MapReduce
riak.add("users").
     map(nameLength).
     reduce("Riak.reduceSum").
     run()

Slide 20

Slide 20 text

MapReduce
var average = function(values) {
  // The first callback argument is the running total, the second the current value.
  var sum = values.reduce(function(total, n) {
    return total + n;
  }, 0);
  return [sum / values.length];
}
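One caveat with an averaging reduce, beyond what the slide itself states: Riak may call a reduce function more than once, feeding the output of earlier calls back in as input. An average of averages is generally not the overall average. The sketch below shows the effect locally, plus one common workaround of carrying `[sum, count]` pairs; the numbers and the `sumAndCount` name are made up for illustration.

```javascript
// The averaging reduce from the slide, run locally.
var average = function (values) {
  var sum = values.reduce(function (total, n) { return total + n; }, 0);
  return [sum / values.length];
};

// Re-reduce-safe variant: inputs are [sum, count] pairs, added component-wise.
var sumAndCount = function (pairs) {
  return [pairs.reduce(function (acc, pair) {
    return [acc[0] + pair[0], acc[1] + pair[1]];
  }, [0, 0])];
};

var lengths = [2, 4, 12]; // made-up name lengths

var oneShot = average(lengths);                       // a single reduce call
var batched = average(average([2, 4]).concat([12]));  // reduce fed its own partial output

// Same batching with pairs: the partial result stays mergeable.
var pairs = lengths.map(function (n) { return [n, 1]; });
var partial = sumAndCount(pairs.slice(0, 2)).concat([pairs[2]]);
var merged = sumAndCount(partial)[0];
var safeAverage = merged[0] / merged[1];

console.log(oneShot, batched, safeAverage);
```

With pairs, the final division happens once, in the application or in a last step after all reduce calls have finished.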

Slide 21

Slide 21 text

MapReduce
riak.add("users").
     map(nameLength).
     reduce(average).
     run()

Slide 22

Slide 22 text

MapReduce
riak.add("users").
     map(nameLength).
     reduce(average).
     run()
Uh-Oh!

Slide 23

Slide 23 text

MapReduce
riak.add(["users", "roidrage"]).
     map(nameLength).
     reduce(average).
     run()
Better!

Slide 24

Slide 24 text

JavaScript M/R
Breaks with Millions of Objects
Uses External Libraries
Serializes Data for JavaScript

Slide 25

Slide 25 text

Warning: Erlang

Slide 26

Slide 26 text

MapReduce
riak.add('tweets').
     map({language: 'erlang',
          module: 'riak_kv_mapreduce',
          function: 'map_object_value'}).run()

Slide 27

Slide 27 text

MapReduce
$ riak attach
> {ok, C} = riak:local_client().

Slide 28

Slide 28 text

MapReduce
C:mapred([{<<"users">>, <<"roidrage">>}],
         [{map, {modfun, riak_kv_mapreduce, map_object_value}, none, false},
          {reduce, {modfun, riak_kv_mapreduce, reduce_count_inputs}, none, true}]).
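The same two built-in Erlang phases can also be described as a JSON job for Riak's HTTP MapReduce endpoint. The sketch below only builds that JSON body; it is based on my understanding of the HTTP job format (`inputs` plus a `query` list of phases), and `erlangPhase` is a hypothetical helper name.

```javascript
// Hypothetical helper: describes one built-in Erlang phase from riak_kv_mapreduce.
function erlangPhase(type, fun, keep) {
  var phase = {};
  phase[type] = {
    language: "erlang",
    module: "riak_kv_mapreduce",
    "function": fun,
    keep: keep // whether this phase's output is returned to the caller
  };
  return phase;
}

// Mirrors the Erlang call: one bucket/key input, a map phase, then a counting reduce.
var job = {
  inputs: [["users", "roidrage"]],
  query: [
    erlangPhase("map", "map_object_value", false),
    erlangPhase("reduce", "reduce_count_inputs", true)
  ]
};

var jobBody = JSON.stringify(job);
console.log(jobBody);
```

The resulting body would be sent as a POST with `Content-Type: application/json` to the `/mapred` resource.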

Slide 29

Slide 29 text

MapReduce
ExtractFirstName = fun(RObject, _, _) ->
    Value = riak_object:get_value(RObject),
    [FirstName, _] = re:split(Value, " "),
    [FirstName]
end.

Slide 30

Slide 30 text

MapReduce
C:mapred([{<<"users">>, <<"roidrage">>}],
         [{map, {qfun, ExtractFirstName}, none, true}]).

Slide 31

Slide 31 text

Erlang M/R
Much more efficient than JavaScript
No serialization
No ad-hoc functions through HTTP

Slide 32

Slide 32 text

Key-Filters
Reduce MapReduce input
Based on key matches

Slide 33

Slide 33 text

Key-Filters
riak.add({bucket: 'users', key_filters:
  [["matches", "^roid"]]})

Slide 34

Slide 34 text

Key-Filters
riak.add({bucket: 'users', key_filters:
  [["to_upper"],
   ["matches", "^ROID"]]})

Slide 35

Slide 35 text

Key-Filters
riak.add({bucket: 'users', key_filters:
  [["to_upper"],
   ["to_lower"],
   ["matches", "^roid"]]})

Slide 36

Slide 36 text

Key-Filters
riak.add({bucket: 'users', key_filters:
  [["to_upper"],
   ["ends_with", "RAGE"]]})

Slide 37

Slide 37 text

Key-Filters
riak.add({bucket: 'users', key_filters:
  [["and", [["string_to_int"],
            ["less_than", 10]],
           [["string_to_int"],
            ["greater_than", 5]]]]})
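The filter lists on these slides read as small pipelines: transforms rewrite the key, predicates accept or reject it, and `and` combines whole sub-pipelines. The following local model mimics that reading for the filters shown in the talk only. It is an illustration of the semantics as I understand them, not Riak's implementation, and `keyMatches` is a hypothetical name.

```javascript
// Transforms rewrite the key before the next filter sees it.
var transforms = {
  to_upper: function (key) { return String(key).toUpperCase(); },
  to_lower: function (key) { return String(key).toLowerCase(); },
  string_to_int: function (key) { return parseInt(key, 10); }
};

// Predicates decide whether the (possibly transformed) key is kept.
var predicates = {
  matches: function (key, pattern) { return new RegExp(pattern).test(key); },
  ends_with: function (key, suffix) { return String(key).endsWith(suffix); },
  less_than: function (key, n) { return key < n; },
  greater_than: function (key, n) { return key > n; }
};

// Applies a filter list left to right and reports whether the key survives.
function keyMatches(key, filters) {
  var current = key;
  for (var i = 0; i < filters.length; i++) {
    var name = filters[i][0];
    var args = filters[i].slice(1);
    if (name === "and") {
      // Every argument is its own filter list, checked against the current key.
      var allPass = args.every(function (sub) { return keyMatches(current, sub); });
      if (!allPass) { return false; }
    } else if (transforms[name]) {
      current = transforms[name](current);
    } else if (predicates[name]) {
      if (!predicates[name](current, args[0])) { return false; }
    } else {
      throw new Error("unknown filter: " + name);
    }
  }
  return true;
}

var between5And10 = [["and", [["string_to_int"], ["less_than", 10]],
                             [["string_to_int"], ["greater_than", 5]]]];
var kept = ["3", "7", "12"].filter(function (k) { return keyMatches(k, between5And10); });
console.log(kept);
```

Note that even in this toy model every key has to be visited, which is the cost behind the advice on the next slide.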

Slide 38

Slide 38 text

Don't use key filters.

Slide 39

Slide 39 text

Riak 2i
Sorted Secondary Indexes
Simple Reverse Lookups
Maintained Manually
Requires LevelDB

Slide 40

Slide 40 text

Riak 2i
curl -X PUT .../riak/users/roidrage -d @- \
     -H "Content-Type: text/plain" \
     -H "X-Riak-Index-firstname_bin: mathias" \
     -H "X-Riak-Index-lastname_bin: meyer"

Slide 41

Slide 41 text

Riak 2i
X-Riak-Index-firstname_bin: Mathias
X-Riak-Index-lastname_bin: Meyer

Slide 42

Slide 42 text

Riak 2i
X-Riak-Index-firstname_bin: Mathias
X-Riak-Index-lastname_bin: Meyer
X-Riak-Index-age_int: 34

Slide 43

Slide 43 text

Riak 2i
X-Riak-Index-firstname_bin: Mathias
X-Riak-Index-lastname_bin: Meyer
X-Riak-Index-age_int: 34
X-Riak-Index-topics_bin: nosql,cloud,operations
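Because the indexes are maintained by the application, something has to turn an object's fields into these headers on every write. A minimal sketch of that step follows; `indexHeaders` is a hypothetical helper, and the rule of picking `_int` for integers and `_bin` for everything else is an assumption made for this example.

```javascript
// Hypothetical helper: maps field values to X-Riak-Index-* request headers.
// Integers get the _int suffix, everything else _bin; arrays are comma-joined.
function indexHeaders(fields) {
  var headers = {};
  Object.keys(fields).forEach(function (name) {
    var values = [].concat(fields[name]); // wrap single values in a list
    var allInts = values.every(function (v) { return Number.isInteger(v); });
    var suffix = allInts ? "_int" : "_bin";
    headers["X-Riak-Index-" + name + suffix] = values.join(",");
  });
  return headers;
}

var userIndexes = indexHeaders({
  firstname: "Mathias",
  lastname: "Meyer",
  age: 34,
  topics: ["nosql", "cloud", "operations"]
});
console.log(userIndexes);
```

The returned object could be merged into the headers of the PUT request that stores the value.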

Slide 44

Slide 44 text

Riak 2i
# Match
$ curl .../buckets/users/index/firstname_bin/Mathias

Slide 45

Slide 45 text

Riak 2i
# Range
$ curl .../buckets/users/index/firstname_bin/Mathias/Till
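A match query and a range query differ only by one extra path segment for the upper bound. A small sketch of building both follows; `indexQueryUrl` is a hypothetical helper, the host is assumed, and percent-encoding the values is a precaution added here rather than something the slides show.

```javascript
// Hypothetical helper: builds a 2i query URL.
// With only `start` it is an exact match; with `end` it becomes a range query.
function indexQueryUrl(base, bucket, index, start, end) {
  var parts = [base, "buckets", bucket, "index", index, encodeURIComponent(start)];
  if (end !== undefined) {
    parts.push(encodeURIComponent(end));
  }
  return parts.join("/");
}

var matchUrl = indexQueryUrl("http://localhost:8098", "users", "firstname_bin", "Mathias");
var rangeUrl = indexQueryUrl("http://localhost:8098", "users", "firstname_bin", "Mathias", "Till");
console.log(matchUrl);
console.log(rangeUrl);
```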

Slide 46

Slide 46 text

Riak 2i
# Key (quoted so the shell does not expand $key)
$ curl '.../buckets/users/index/$key/roidrage'

Slide 47

Slide 47 text

Ordered Keys! (sort of)

Slide 48

Slide 48 text

MapReduce
riak.add({bucket: 'users',
          index: 'firstname_bin',
          key: 'mathias'}).
     map('Riak.mapValuesJson').run()

Slide 49

Slide 49 text

Riak 2i
No Multi-Index Queries
Requires Extra Work in the App
Returns only keys
Document-partitioned
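Since one query covers one index and returns only keys, combining two conditions is left to the application: run one query per index, then intersect the key lists. A sketch of that extra work follows, assuming each response has already been parsed into the `{"keys": [...]}` shape; the sample responses and `intersectKeys` are made up for illustration.

```javascript
// Keeps only the keys that appear in every response.
// Each response is assumed to look like {keys: ["k1", "k2", ...]}.
function intersectKeys(responses) {
  if (responses.length === 0) { return []; }
  return responses
    .map(function (response) { return response.keys; })
    .reduce(function (common, keys) {
      return common.filter(function (key) { return keys.indexOf(key) !== -1; });
    });
}

// Made-up results of two separate index queries.
var byFirstName = { keys: ["roidrage", "user42"] };
var byLastName = { keys: ["user7", "roidrage"] };

var both = intersectKeys([byFirstName, byLastName]);
console.log(both);
```

The values themselves still have to be fetched afterwards, one GET per remaining key or via a MapReduce job fed with those keys.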

Slide 50

Slide 50 text

Riak Search
Full-Text Search
Solr-ish Interface
Integrates with Riak

Slide 51

Slide 51 text

Riak Search
curl -X PUT localhost:8098/riak/users -d @- \
     -H "Content-Type: application/json"
{"props":{"precommit":
  [{"mod":"riak_search_kv_hook","fun":"precommit"}]}}

Slide 52

Slide 52 text

Indexing Riak Objects
curl -X PUT .../riak/users/roidrage \
     -d "Mathias Meyer" \
     -H "Content-Type: text/plain"

Slide 53

Slide 53 text

Solr-ish Interface
curl .../solr/users/select?q=value:Mathias

Slide 54

Slide 54 text

Riak Search
value:Mathias OR value:Till
value:Mathias AND value:Meyer
value:Mat*
value:[Mathias TO Till]
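Queries like these contain spaces, colons, and brackets, so they need percent-encoding before they go into the `q` parameter of the Solr-style URL. A minimal sketch follows; `searchUrl` is a hypothetical helper and the host is assumed.

```javascript
// Hypothetical helper: builds a Solr-style select URL for a search index.
function searchUrl(base, index, query) {
  return base + "/solr/" + index + "/select?q=" + encodeURIComponent(query);
}

var rangeSearch = searchUrl("http://localhost:8098", "users", "value:[Mathias TO Till]");
var prefixSearch = searchUrl("http://localhost:8098", "users", "value:Mat*");
console.log(rangeSearch);
console.log(prefixSearch);
```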

Slide 55

Slide 55 text

MapReduce
riak.addSearch("users", "value:Mathias").
     map("Riak.mapValues").run()

Slide 56

Slide 56 text

Riak Search
Full text search of structured data
Term-partitioned
Efficient for one-term queries
Multiple Interfaces
No Anti-Entropy

Slide 57

Slide 57 text

When?

Slide 58

Slide 58 text

Key Listings
Never!
Almost

Slide 59

Slide 59 text

MapReduce
Analytical Queries
Fixed Dataset

Slide 60

Slide 60 text

Key Filters
Never!

Slide 61

Slide 61 text

Riak 2i
Simple Lookups and Range Queries
Unbounded Queries
Full Fault-Tolerance

Slide 62

Slide 62 text

Riak Search
Larger documents
Full indexing
Flexible queries
Low frequency terms

Slide 63

Slide 63 text

Questions?