Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Riak Search: The Next Generation
Search
Tom Santero
September 17, 2013
Programming
0
130
Riak Search: The Next Generation
Presentation on Yokozuna (
https://github.com/basho/yokozuna
) at the NYC Riak Meetup group
Tom Santero
September 17, 2013
Tweet
Share
More Decks by Tom Santero
See All by Tom Santero
DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker
tsantero
1
310
Buridan's Principle
tsantero
1
290
Release Engineering from the Ground Up
tsantero
1
260
Beyond Fast and Slow
tsantero
0
220
Choose Your Own Consistency
tsantero
2
180
Erlang Fight Club
tsantero
5
440
Riak on Ruby: Keys, Values and CRDTs
tsantero
0
260
Consensus, Raft and Rafter
tsantero
22
3.4k
Riak: Distributed Storage for Games You Don't Have to Worry About
tsantero
6
1.8k
Other Decks in Programming
See All in Programming
組織に自動テストを書く文化を根付かせる戦略(2024秋版) / Building Automated Test Culture 2024 Autumn Edition
twada
PRO
10
4.3k
外部システム連携先が10を超えるシステムでのアーキテクチャ設計・実装事例
kiwasaki
1
170
2万ページのSSG運用における工夫と注意点 / Vue Fes Japan 2024
chinen
3
1.3k
デプロイを任されたので、教わった通りにデプロイしたら障害になった件 ~俺のやらかしを越えてゆけ~
techouse
50
31k
カラム追加で増えるActiveRecordのメモリサイズ イメージできますか?
asayamakk
3
1.1k
C#/.NETのこれまでのふりかえり
tomokusaba
1
140
Jakarta Concurrencyによる並行処理プログラミングの始め方 (JJUG CCC 2024 Fall)
tnagao7
0
200
RailsのPull requestsのレビューの時に私が考えていること
yahonda
4
1.5k
破壊せよ!データ破壊駆動で考えるドメインモデリング / data-destroy-driven
minodriven
14
3.8k
生成 AI を活用した toitta 切片分類機能の裏側 / Inside toitta's AI-Based Factoid Clustering
pokutuna
0
530
Boost Performance and Developer Productivity with Jakarta EE 11
ivargrimstad
0
440
PagerDuty を軸にした On-Call 構築と運用課題の解決 / PagerDuty Japan Community Meetup 4
horimislime
1
100
Featured
See All Featured
Intergalactic Javascript Robots from Outer Space
tanoku
268
27k
How to Ace a Technical Interview
jacobian
275
23k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
250
21k
YesSQL, Process and Tooling at Scale
rocio
167
14k
Mobile First: as difficult as doing things right
swwweet
222
8.9k
Practical Orchestrator
shlominoach
186
10k
Six Lessons from altMBA
skipperchong
26
3.4k
10 Git Anti Patterns You Should be Aware of
lemiorhan
653
59k
For a Future-Friendly Web
brad_frost
174
9.4k
Typedesign – Prime Four
hannesfritz
39
2.4k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
13
1.9k
Building Applications with DynamoDB
mza
90
6k
Transcript
Riak Search the next generation Tuesday, September 17, 13
tsantero @ basho.com Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
2.0 coming soon.. Tuesday, September 17, 13
the history of Riak Search Tuesday, September 17, 13
home grown full-text search Tuesday, September 17, 13
lucene Tuesday, September 17, 13
SCALE Tuesday, September 17, 13
NODE # = HASH(KEY) % NUM_NODES NH(Ka) = 0 NH(Kb)
= 1 NH(Kc) = 2 NH(Kd) = 0 ... Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Kg Ki NODE 3 Ke Kf Kh Kj Kk Kl Km Kn Ko Kp Kq Kr Naive Hashing Tuesday, September 17, 13
K * (NN - 1) / NN => K •
K = # OF KEYS • NN = # OF NODES • AS NN GROWS FACTOR ESSENTIALLY BECOMES 1, THUS ALL KEYS MOVE Naive Hashing Tuesday, September 17, 13
PARTITION # = HASH(KEY) % PARTITIONS • # PARTITIONS REMAINS
CONSTANT • KEY ALWAYS MAPS TO SAME PARTITION • NODES OWN PARTITIONS • PARTITIONS CONTAIN KEYS • EXTRA LEVEL OF INDIRECTION Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr NODE 3 Consistent Hashing Tuesday, September 17, 13
NN * K/Q => K/Q • K = # OF
KEYS • NN = # OF NODES • Q = # OF PARTITIONS • AS K GROWS NN BECOMES CONSTANT, THUS K/Q KEYS MOVE Consistent Hashing Tuesday, September 17, 13
uniform distribution Consistent Hashing {logical vs physical partitioning scheme even
division of keys Tuesday, September 17, 13
the future of Riak Search Tuesday, September 17, 13
Tuesday, September 17, 13
persistence distributing Solr querying indexing Tuesday, September 17, 13
Each Riak node runs an instance of Solr Tuesday, September
17, 13
Solr index = riak bucket document = RObj value plaintext,
JSON, XML Tuesday, September 17, 13
Distributed Searching in Solr query faceting highlighting stats spell check
term vectors Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
Harvest vs Yield Tuesday, September 17, 13
A better measure of Availability Tuesday, September 17, 13
Queries Issues Queries Offered Yield = Tuesday, September 17, 13
Harvest = Data Available Total Dataset Tuesday, September 17, 13
Harvest Yield Tuesday, September 17, 13
Manage Harvest by storing Index Replicas Tuesday, September 17, 13
Term vs Document Partitioning Schemes Tuesday, September 17, 13
Node 0 Node 1 Node 2 Term Based Partitioning Tuesday,
September 17, 13
Node 0 Node 1 Node 2 Document Based Partitioning Tuesday,
September 17, 13
Replicas Node 0 Node 1 Node 2 Tuesday, September 17,
13
Quorums Tuesday, September 17, 13
Concurrency => Siblings Tuesday, September 17, 13
Read Repair (Anti-Entropy) Tuesday, September 17, 13
replica replica replica Tuesday, September 17, 13
replica replica replica X Tuesday, September 17, 13
replica replica replica replica replica replica Tuesday, September 17, 13
Active Anti-Entropy (self healing clusters) Tuesday, September 17, 13
real-time updates persistent non-blocking disk-based Tuesday, September 17, 13
Tuesday, September 17, 13
= hashes marked “dirty” Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
= keys to read-repair Tuesday, September 17, 13
Questions? make it so! Tuesday, September 17, 13