Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Speaker Deck
PRO
Sign in
Sign up for free
Riak Search: The Next Generation
Tom Santero
September 17, 2013
Programming
0
120
Riak Search: The Next Generation
Presentation on Yokozuna (
https://github.com/basho/yokozuna
) at the NYC Riak Meetup group
Tom Santero
September 17, 2013
Tweet
Share
More Decks by Tom Santero
See All by Tom Santero
DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker
tsantero
1
260
Buridan's Principle
tsantero
1
190
Release Engineering from the Ground Up
tsantero
1
160
Beyond Fast and Slow
tsantero
0
180
Choose Your Own Consistency
tsantero
2
160
Erlang Fight Club
tsantero
5
410
Riak on Ruby: Keys, Values and CRDTs
tsantero
0
240
Consensus, Raft and Rafter
tsantero
22
2.9k
Riak: Distributed Storage for Games You Don't Have to Worry About
tsantero
6
1.8k
Other Decks in Programming
See All in Programming
GDG Seoul IO Extended 2022 - Android Compose
taehwandev
0
280
Dagger + Anvil: Learning to Love Dependency Injection
vrallev
3
240
Airflowはすごいぞ!
hankehly
0
370
What's new in Android development tools まとめ
mkeeda
0
270
Jetpack Compose best practices 動画紹介 @GoogleI/O LT会
takakitojo
0
270
Oracle REST Data Service: APEX Office Hours
thatjeffsmith
0
680
VisualProgramming_GoogleHome_LINE
nearmugi
1
140
How we run a Realtime Puzzle Fighting Game on AWS Serverless
falken
0
240
言語処理ライブラリ開発における失敗談 / NLPHacks
taishii
1
430
GitHub Actions を導入した経緯
tamago3keran
1
420
Seleniumでイキってたらサーバを絞め落としかけてた話
kenfujita
0
360
Springin‘でみんなもクリエイターに!
ueponx
0
130
Featured
See All Featured
ReactJS: Keep Simple. Everything can be a component!
pedronauck
655
120k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
151
13k
Ruby is Unlike a Banana
tanoku
91
9.2k
No one is an island. Learnings from fostering a developers community.
thoeni
9
1.3k
jQuery: Nuts, Bolts and Bling
dougneiner
56
6.4k
Adopting Sorbet at Scale
ufuk
63
7.6k
Making Projects Easy
brettharned
98
4.3k
How GitHub Uses GitHub to Build GitHub
holman
465
280k
Fashionably flexible responsive web design (full day workshop)
malarkey
396
62k
WebSockets: Embracing the real-time Web
robhawkes
57
5.1k
Why You Should Never Use an ORM
jnunemaker
PRO
47
7.2k
Making the Leap to Tech Lead
cromwellryan
113
7.4k
Transcript
Riak Search the next generation Tuesday, September 17, 13
tsantero @ basho.com Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
2.0 coming soon.. Tuesday, September 17, 13
the history of Riak Search Tuesday, September 17, 13
home grown full-text search Tuesday, September 17, 13
lucene Tuesday, September 17, 13
SCALE Tuesday, September 17, 13
NODE # = HASH(KEY) % NUM_NODES NH(Ka) = 0 NH(Kb)
= 1 NH(Kc) = 2 NH(Kd) = 0 ... Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Naive Hashing Tuesday, September 17, 13
NODE 0 NODE 1 NODE 2 Ka Kb Kc Kd
Kg Ki NODE 3 Ke Kf Kh Kj Kk Kl Km Kn Ko Kp Kq Kr Naive Hashing Tuesday, September 17, 13
K * (NN - 1) / NN => K •
K = # OF KEYS • NN = # OF NODES • AS NN GROWS FACTOR ESSENTIALLY BECOMES 1, THUS ALL KEYS MOVE Naive Hashing Tuesday, September 17, 13
PARTITION # = HASH(KEY) % PARTITIONS • # PARTITIONS REMAINS
CONSTANT • KEY ALWAYS MAPS TO SAME PARTITION • NODES OWN PARTITIONS • PARTITIONS CONTAIN KEYS • EXTRA LEVEL OF INDIRECTION Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr Consistent Hashing Tuesday, September 17, 13
P9 P6 P3 P8 P5 P2 P7 P4 P1 NODE
0 NODE 1 NODE 2 Ka Kb Kc Kd Ke Kf Kg Kh Ki Kj Kk Km Kl Kp Kn Ko Kq Kr NODE 3 Consistent Hashing Tuesday, September 17, 13
NN * K/Q => K/Q • K = # OF
KEYS • NN = # OF NODES • Q = # OF PARTITIONS • AS K GROWS NN BECOMES CONSTANT, THUS K/Q KEYS MOVE Consistent Hashing Tuesday, September 17, 13
uniform distribution Consistent Hashing {logical vs physical partitioning scheme even
division of keys Tuesday, September 17, 13
the future of Riak Search Tuesday, September 17, 13
Tuesday, September 17, 13
persistence distributing Solr querying indexing Tuesday, September 17, 13
Each Riak node runs an instance of Solr Tuesday, September
17, 13
Solr index = riak bucket document = RObj value plaintext,
JSON, XML Tuesday, September 17, 13
Distributed Searching in Solr query faceting highlighting stats spell check
term vectors Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
SolrCloud Tuesday, September 17, 13
Harvest vs Yield Tuesday, September 17, 13
A better measure of Availability Tuesday, September 17, 13
Queries Issues Queries Offered Yield = Tuesday, September 17, 13
Harvest = Data Available Total Dataset Tuesday, September 17, 13
Harvest Yield Tuesday, September 17, 13
Manage Harvest by storing Index Replicas Tuesday, September 17, 13
Term vs Document Partitioning Schemes Tuesday, September 17, 13
Node 0 Node 1 Node 2 Term Based Partitioning Tuesday,
September 17, 13
Node 0 Node 1 Node 2 Document Based Partitioning Tuesday,
September 17, 13
Replicas Node 0 Node 1 Node 2 Tuesday, September 17,
13
Quorums Tuesday, September 17, 13
Concurrency => Siblings Tuesday, September 17, 13
Read Repair (Anti-Entropy) Tuesday, September 17, 13
replica replica replica Tuesday, September 17, 13
replica replica replica X Tuesday, September 17, 13
replica replica replica replica replica replica Tuesday, September 17, 13
Active Anti-Entropy (self healing clusters) Tuesday, September 17, 13
real-time updates persistent non-blocking disk-based Tuesday, September 17, 13
Tuesday, September 17, 13
= hashes marked “dirty” Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
Tuesday, September 17, 13
= keys to read-repair Tuesday, September 17, 13
Questions? make it so! Tuesday, September 17, 13