Riak Search: The Next Generation

Tom Santero
September 17, 2013

Presentation on Yokozuna (https://github.com/basho/yokozuna) at the NYC Riak Meetup group

Transcript

  1. Riak Search the next generation Tuesday, September 17, 13

  2. tsantero @ basho.com

  5. 2.0 coming soon..

  6. the history of Riak Search

  7. home grown full-text search

  8. lucene

  9. SCALE

  10. Naive Hashing: NODE # = HASH(KEY) % NUM_NODES • NH(Ka) = 0 • NH(Kb) = 1 • NH(Kc) = 2 • NH(Kd) = 0 • ...
  11. Naive Hashing (diagram: keys Ka through Kr distributed across NODE 0, NODE 1, and NODE 2)
  12. Naive Hashing (diagram: NODE 3 added; keys Ka through Kr redistributed across NODE 0 through NODE 3)
  13. Naive Hashing: K * (NN - 1) / NN => K • K = # OF KEYS • NN = # OF NODES • AS NN GROWS FACTOR ESSENTIALLY BECOMES 1, THUS ALL KEYS MOVE
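The naive scheme on slides 10 to 13 can be sketched in a few lines. This is only an illustration under stated assumptions: the SHA-1 based `stable_hash` helper, the synthetic key names, and the 3-to-4 node resize are all hypothetical choices, not taken from the talk.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic integer hash; Python's built-in hash() is salted per process."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def naive_node(key: str, num_nodes: int) -> int:
    """NODE # = HASH(KEY) % NUM_NODES, as on the slide."""
    return stable_hash(key) % num_nodes


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose node changes when the cluster is resized."""
    moved = sum(
        1 for k in keys if naive_node(k, old_nodes) != naive_node(k, new_nodes)
    )
    return moved / len(keys)


# Hypothetical workload: 10,000 synthetic keys, cluster grows from 3 to 4 nodes.
keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 3, 4)
print(f"{fraction:.2%} of keys change nodes")
```

With a roughly uniform hash, a key keeps its node only when `h % 3 == h % 4`, which holds for 3 of every 12 hash values, so about three quarters of the keys move. That matches the slide's (NN - 1) / NN factor with NN = 4.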
  14. Consistent Hashing: PARTITION # = HASH(KEY) % PARTITIONS • # PARTITIONS REMAINS CONSTANT • KEY ALWAYS MAPS TO SAME PARTITION • NODES OWN PARTITIONS • PARTITIONS CONTAIN KEYS • EXTRA LEVEL OF INDIRECTION
  15. Consistent Hashing (diagram: partitions P1 through P9 owned by NODE 0, NODE 1, and NODE 2; keys Ka through Kr stored in the partitions)
  16. Consistent Hashing (diagram: same partitions P1 through P9 and keys Ka through Kr, with NODE 3 added)
  17. Consistent Hashing: NN * K/Q => K/Q • K = # OF KEYS • NN = # OF NODES • Q = # OF PARTITIONS • AS K GROWS NN BECOMES CONSTANT, THUS K/Q KEYS MOVE
  18. Consistent Hashing: logical vs physical partitioning scheme • uniform distribution • even division of keys
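The extra level of indirection on slides 14 to 18 can be sketched as follows. The partition count `Q = 64`, the round-robin `initial_owners` claim, and the `add_node` hand-off rule are hypothetical simplifications chosen for illustration; Riak's actual claim algorithm is not described on the slides and is not reproduced here.

```python
import hashlib
from typing import List

Q = 64  # number of partitions; stays fixed as nodes come and go (assumed value)


def stable_hash(key: str) -> int:
    """Deterministic integer hash; Python's built-in hash() is salted per process."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def partition_of(key: str, q: int = Q) -> int:
    """PARTITION # = HASH(KEY) % PARTITIONS; independent of the node count."""
    return stable_hash(key) % q


def initial_owners(q: int, num_nodes: int) -> List[int]:
    """Hypothetical claim strategy: hand out partitions round-robin."""
    return [p % num_nodes for p in range(q)]


def add_node(owners: List[int], new_node: int) -> List[int]:
    """The new node claims every (new_node + 1)-th partition; the rest stay put."""
    node_count = new_node + 1
    new_owners = list(owners)
    for p in range(new_node, len(owners), node_count):
        new_owners[p] = new_node
    return new_owners


keys = [f"key-{i}" for i in range(10_000)]
before = initial_owners(Q, 3)
after = add_node(before, 3)
moved = sum(1 for k in keys if before[partition_of(k)] != after[partition_of(k)])
print(f"{moved / len(keys):.2%} of keys change nodes")
```

A key never changes partition here; only whole partitions change owner, and each transferred partition carries roughly K/Q keys. In this sketch the new node takes 16 of 64 partitions, so about a quarter of the keys move, compared with about three quarters under modulo-by-node-count hashing for the same resize.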
  19. the future of Riak Search

  21. persistence distributing Solr querying indexing

  22. Each Riak node runs an instance of Solr
  23. Solr index = Riak bucket • document = RObj value • plaintext, JSON, XML
  24. Distributed Searching in Solr: query • faceting • highlighting • stats • spell check • term vectors
  25. SolrCloud

  26. SolrCloud

  27. Harvest vs Yield

  28. A better measure of Availability

  29. Yield = Queries Completed / Queries Offered

  30. Harvest = Data Available / Total Dataset

  31. Harvest Yield

  32. Manage Harvest by storing Index Replicas
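The two ratios on slides 29 and 30, and the effect of index replicas on harvest, can be illustrated with a small sketch. It takes yield as completed queries over offered queries (the usual definition, and an assumption about the slide's intent), treats all index partitions as equally sized, and uses hypothetical node names and a made-up 6-partition layout.

```python
from typing import Dict, Set


def yield_ratio(queries_completed: int, queries_offered: int) -> float:
    """Yield: share of offered queries that were answered at all."""
    return queries_completed / queries_offered


def harvest_ratio(data_available: int, total_dataset: int) -> float:
    """Harvest: share of the dataset reflected in an answer."""
    return data_available / total_dataset


def reachable_partitions(replicas: Dict[int, Set[str]], down: Set[str]) -> int:
    """Count index partitions that still have a replica on a live node."""
    return sum(1 for nodes in replicas.values() if nodes - down)


# Hypothetical layout: 6 index partitions spread over 3 nodes.
nodes = ["node0", "node1", "node2"]
single = {p: {nodes[p % 3]} for p in range(6)}                        # 1 copy each
double = {p: {nodes[p % 3], nodes[(p + 1) % 3]} for p in range(6)}    # 2 copies each
down = {"node0"}

harvest_single = harvest_ratio(reachable_partitions(single, down), len(single))
harvest_double = harvest_ratio(reachable_partitions(double, down), len(double))
print(harvest_single, harvest_double)
```

With one node down, the unreplicated layout loses the two partitions that lived only on that node, while the replicated layout can still cover every partition from the surviving nodes.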

  33. Term vs Document Partitioning Schemes

  34. Term Based Partitioning (diagram: Node 0, Node 1, Node 2)
  35. Document Based Partitioning (diagram: Node 0, Node 1, Node 2)
  36. Replicas (diagram: Node 0, Node 1, Node 2)
  37. Quorums

  38. Concurrency => Siblings

  39. Read Repair (Anti-Entropy)

  40. replica replica replica

  41. replica replica replica X

  42. replica replica replica replica replica replica
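Slides 37 to 42 name quorums and read repair without spelling out the mechanics. A minimal sketch of the general idea follows, under loud assumptions: a single integer `version` stands in for Riak's vector clocks, so concurrent writes and siblings are not modeled; an unreachable replica is represented as `None`; and the `Versioned` type and `read_with_repair` function are hypothetical names, not Riak APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Versioned:
    value: str
    version: int  # simplification: stand-in for a vector clock


Replica = Optional[Dict[str, Versioned]]  # None models an unreachable replica


def read_with_repair(replicas: List[Replica], key: str, r: int) -> Optional[Versioned]:
    """Read from all reachable replicas, return the newest value, fix stale copies."""
    reachable = [rep for rep in replicas if rep is not None]
    if len(reachable) < r:
        raise RuntimeError("read quorum not met")
    found = [rep[key] for rep in reachable if key in rep]
    if not found:
        return None
    newest = max(found, key=lambda v: v.version)
    for rep in reachable:
        current = rep.get(key)
        if current is None or current.version < newest.version:
            rep[key] = newest  # read repair: overwrite the stale or missing copy
    return newest


# Three replicas of one key: one current, one stale, one missing the key.
a = {"k": Versioned("new", 2)}
b = {"k": Versioned("old", 1)}
c: Dict[str, Versioned] = {}
result = read_with_repair([a, b, c], "k", r=2)
print(result, b["k"], c["k"])
```

Repair in this sketch only happens for keys that are actually read, which is one reason a background process that compares replicas proactively is useful.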

  43. Active Anti-Entropy (self healing clusters)

  44. real-time updates • persistent • non-blocking • disk-based

  46. = hashes marked “dirty”

  51. = keys to read-repair
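The diagrams on slides 45 to 51 did not survive transcription, but the legends suggest comparing hash trees between replicas to find keys that need read repair. The sketch below shows that general idea with a two-level tree. Everything in it is an assumption for illustration: the `SEGMENTS` count, the `HashTree` class, and the `keys_to_repair` function are hypothetical, the tree is rebuilt in memory from scratch, and the incremental "dirty" marking and on-disk persistence mentioned on slides 44 and 46 are not modeled.

```python
import hashlib
from typing import Dict, List, Set

SEGMENTS = 8  # number of leaf segments (assumed value)


def h(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def segment_of(key: str) -> int:
    """Assign each key to a fixed leaf segment."""
    return int(h(key.encode("utf-8")), 16) % SEGMENTS


class HashTree:
    """Two levels: per-segment hashes of (key, value-hash) pairs, plus a root hash."""

    def __init__(self, data: Dict[str, str]):
        self.leaves: List[Dict[str, str]] = [{} for _ in range(SEGMENTS)]
        for key, value in data.items():
            self.leaves[segment_of(key)][key] = h(value.encode("utf-8"))
        self.segment_hashes = [
            h(repr(sorted(leaf.items())).encode("utf-8")) for leaf in self.leaves
        ]
        self.root = h("".join(self.segment_hashes).encode("utf-8"))


def keys_to_repair(a: HashTree, b: HashTree) -> Set[str]:
    """Descend only into segments whose hashes differ; return the differing keys."""
    if a.root == b.root:
        return set()
    differing: Set[str] = set()
    for seg in range(SEGMENTS):
        if a.segment_hashes[seg] == b.segment_hashes[seg]:
            continue
        leaf_a, leaf_b = a.leaves[seg], b.leaves[seg]
        for key in leaf_a.keys() | leaf_b.keys():
            if leaf_a.get(key) != leaf_b.get(key):
                differing.add(key)
    return differing


replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k1": "v1", "k2": "stale"}  # k2 diverged, k3 is missing
print(sorted(keys_to_repair(HashTree(replica_a), HashTree(replica_b))))
```

When the root hashes match, the comparison stops immediately, so replicas that agree cost almost nothing to check. Only mismatching segments are examined key by key.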

  52. Questions? make it so!