Riak Search: The Next Generation
Tom Santero
September 17, 2013
Presentation on Yokozuna (https://github.com/basho/yokozuna) at the NYC Riak Meetup group
Transcript
Riak Search the next generation Tuesday, September 17, 13
tsantero @ basho.com
2.0 coming soon...
the history of Riak Search
home grown full-text search
lucene
SCALE
Naive Hashing
  NODE # = HASH(KEY) % NUM_NODES
  NH(Ka) = 0
  NH(Kb) = 1
  NH(Kc) = 2
  NH(Kd) = 0
  ...
Naive Hashing [diagram: keys Ka through Kr spread across NODE 0, NODE 1, and NODE 2]
Naive Hashing [diagram: NODE 3 added; keys Ka through Kr redistributed across NODE 0 through NODE 3]
Naive Hashing
  K * (NN - 1) / NN => K
  • K = # OF KEYS
  • NN = # OF NODES
  • AS NN GROWS FACTOR ESSENTIALLY BECOMES 1, THUS ALL KEYS MOVE
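The cost of resizing under naive hashing can be checked empirically. The sketch below is illustrative only: SHA-1 stands in for the slide's unspecified HASH function, and the names naive_node and fraction_moved are hypothetical helpers, not part of Riak.

```python
import hashlib


def naive_node(key: str, num_nodes: int) -> int:
    """NODE # = HASH(KEY) % NUM_NODES, with SHA-1 as a stand-in hash."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose node changes when the cluster is resized."""
    moved = sum(
        1 for k in keys if naive_node(k, old_nodes) != naive_node(k, new_nodes)
    )
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10_000)]
print(f"3 -> 4 nodes: {fraction_moved(keys, 3, 4):.1%} of keys move")
```

Going from 3 to 4 nodes, a key stays put only when its hash gives the same remainder modulo 3 and modulo 4, so roughly three quarters of the keys should move, in line with the (NN - 1) / NN factor above.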
Consistent Hashing
  PARTITION # = HASH(KEY) % PARTITIONS
  • # PARTITIONS REMAINS CONSTANT
  • KEY ALWAYS MAPS TO SAME PARTITION
  • NODES OWN PARTITIONS
  • PARTITIONS CONTAIN KEYS
  • EXTRA LEVEL OF INDIRECTION
Consistent Hashing [diagram: partitions P1 through P9 owned by NODE 0, NODE 1, and NODE 2; keys Ka through Kr placed in partitions]
Consistent Hashing [diagram: NODE 3 added; partitions P1 through P9 and keys Ka through Kr unchanged]
Consistent Hashing
  NN * K/Q => K/Q
  • K = # OF KEYS
  • NN = # OF NODES
  • Q = # OF PARTITIONS
  • AS K GROWS NN BECOMES CONSTANT, THUS K/Q KEYS MOVE
Consistent Hashing
  • uniform distribution
  • logical vs physical partitioning scheme
  • even division of keys
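The extra level of indirection can be sketched in a few lines. This is a simplified illustration, not Riak's actual ring or claim algorithm: SHA-1 stands in for HASH, nine partitions mirror P1 through P9 on the slides, and the rule in add_node (take partitions one at a time from the most loaded node) is an assumption made for the example.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 9  # fixed for the life of the cluster; mirrors P1..P9


def partition_for(key: str) -> int:
    """PARTITION # = HASH(KEY) % PARTITIONS; never depends on node count."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NUM_PARTITIONS


def node_for(key: str, owners: dict) -> int:
    """Look up the node through the partition that contains the key."""
    return owners[partition_for(key)]


def add_node(owners: dict, new_node: int) -> dict:
    """Hand the new node its fair share of partitions (hypothetical rule)."""
    owners = dict(owners)
    node_count = len(set(owners.values())) + 1
    fair_share = len(owners) // node_count
    for _ in range(fair_share):
        # Take the lowest-numbered partition from the most loaded node.
        donor = Counter(owners.values()).most_common(1)[0][0]
        partition = min(p for p, n in owners.items() if n == donor)
        owners[partition] = new_node
    return owners


before = {p: p % 3 for p in range(NUM_PARTITIONS)}  # 3 nodes, 3 partitions each
after = add_node(before, new_node=3)

keys = [f"key-{i}" for i in range(10_000)]
moved = [k for k in keys if node_for(k, before) != node_for(k, after)]
moved_partitions = sorted(p for p in before if before[p] != after[p])
print(f"partitions reassigned: {moved_partitions}")
print(f"keys moved: {len(moved) / len(keys):.1%}")
```

Only keys that live in a reassigned partition change nodes; every other key keeps both its partition and its owner, which is the contrast with the naive scheme.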
the future of Riak Search
persistence distributing Solr querying indexing
Each Riak node runs an instance of Solr
Solr index = Riak bucket; document = RObj value (plaintext, JSON, XML)
Distributed Searching in Solr
  • query
  • faceting
  • highlighting
  • stats
  • spell check
  • term vectors
SolrCloud
Harvest vs Yield
A better measure of Availability
Yield = Queries Completed / Queries Offered
Harvest = Data Available / Total Dataset
Harvest Yield
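A small worked example of the two ratios, following Fox and Brewer's definitions (yield as completed over offered queries, harvest as available over total data). The function names and the three-shard scenario are hypothetical and chosen only for illustration.

```python
def yield_fraction(queries_completed: int, queries_offered: int) -> float:
    """Share of offered queries that were answered at all."""
    return queries_completed / queries_offered


def harvest_fraction(data_available: int, total_dataset: int) -> float:
    """Share of the full dataset reflected in an answer."""
    return data_available / total_dataset


# Hypothetical scenario: an index split into 3 equal shards with no
# replicas, one shard unreachable. Every query still gets an answer
# (yield stays at 1.0), but each answer covers only two thirds of the data.
print(yield_fraction(queries_completed=1000, queries_offered=1000))
print(harvest_fraction(data_available=2, total_dataset=3))
```

A system that instead rejected queries while the shard was down would keep harvest at 1.0 for the answers it gave and pay for it in yield.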
Manage Harvest by storing Index Replicas
Term vs Document Partitioning Schemes
Term Based Partitioning [diagram: Node 0, Node 1, Node 2]
Document Based Partitioning [diagram: Node 0, Node 1, Node 2]
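The difference between the two schemes shows up in which nodes a query has to touch. The sketch below builds a toy inverted index both ways; the sample documents, the CRC32 placement hash, and all function names are assumptions for illustration and are not how Riak Search or Solr place data.

```python
import zlib
from collections import defaultdict

NUM_NODES = 3
docs = {"doc1": "riak search", "doc2": "solr search", "doc3": "riak solr"}


def stable_hash(text: str) -> int:
    """Deterministic stand-in for a placement hash."""
    return zlib.crc32(text.encode("utf-8"))


def term_partitioned(documents: dict, num_nodes: int) -> list:
    """Each node holds the complete posting list for a subset of terms."""
    nodes = [defaultdict(set) for _ in range(num_nodes)]
    for doc_id, text in documents.items():
        for term in text.split():
            nodes[stable_hash(term) % num_nodes][term].add(doc_id)
    return nodes


def doc_partitioned(documents: dict, num_nodes: int) -> list:
    """Each node holds a full local index over a subset of documents."""
    nodes = [defaultdict(set) for _ in range(num_nodes)]
    for doc_id, text in documents.items():
        local_index = nodes[stable_hash(doc_id) % num_nodes]
        for term in text.split():
            local_index[term].add(doc_id)
    return nodes


def search_term_partitioned(nodes: list, term: str) -> set:
    """A single-term query touches exactly one node."""
    owner = nodes[stable_hash(term) % len(nodes)]
    return set(owner.get(term, set()))


def search_doc_partitioned(nodes: list, term: str) -> set:
    """Every node is asked and the partial results are merged."""
    hits = set()
    for local_index in nodes:
        hits |= local_index.get(term, set())
    return hits


by_term = term_partitioned(docs, NUM_NODES)
by_doc = doc_partitioned(docs, NUM_NODES)
print(sorted(search_term_partitioned(by_term, "riak")))
print(sorted(search_doc_partitioned(by_doc, "riak")))
```

Both searches return the same documents; the trade-off is one node per term (with multi-term queries needing cross-node joins) against a scatter-gather over every node for each query.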
Replicas [diagram: Node 0, Node 1, Node 2]
Quorums
Concurrency => Siblings
Read Repair (Anti-Entropy)
[diagram: three replicas]
[diagram: three replicas, one marked X]
[diagram: six replicas]
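Read repair can be sketched as "read every replica, pick the newest value, write it back to the stale ones". The example below is a deliberate simplification: replicas are plain dictionaries and an integer version stands in for the vector clocks Riak actually uses, so it cannot represent siblings. The name read_with_repair is hypothetical.

```python
def read_with_repair(replicas: list, key: str):
    """Return the newest value for key and repair stale or missing copies.

    Each replica maps key -> (version, value). The integer version is an
    assumption for this sketch; it replaces Riak's vector clocks.
    """
    responses = [replica.get(key) for replica in replicas]
    present = [r for r in responses if r is not None]
    if not present:
        return None
    newest = max(present, key=lambda versioned: versioned[0])
    for replica, seen in zip(replicas, responses):
        if seen != newest:
            replica[key] = newest  # repair happens as a side effect of the read
    return newest[1]


replicas = [{"k": (2, "new")}, {"k": (1, "old")}, {}]
print(read_with_repair(replicas, "k"))
print(replicas)
```

Because the repair only runs when a key is read, data that is never read is never repaired this way, which is the gap active anti-entropy is meant to close.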
Active Anti-Entropy (self healing clusters)
real-time updates, persistent, non-blocking, disk-based
[hash tree diagram; marked nodes = hashes marked “dirty”]
[hash tree diagram; marked nodes = keys to read-repair]
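Riak's active anti-entropy compares hash trees between replicas to find the keys that need read repair. The sketch below shows the general idea with a two-level tree built in memory. It is not Riak's implementation (which is persistent and updated incrementally); the segment count, helper names, and tree layout are assumptions for illustration.

```python
import hashlib

NUM_SEGMENTS = 8  # hypothetical; real trees are much wider and deeper


def _sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def segment_of(key: str) -> int:
    """Assign each key to a fixed leaf segment of the tree."""
    return int(_sha1(key.encode("utf-8")), 16) % NUM_SEGMENTS


def build_tree(store: dict):
    """Return (root hash, per-segment hashes, per-segment key hashes)."""
    segments = [dict() for _ in range(NUM_SEGMENTS)]
    for key, value in store.items():
        segments[segment_of(key)][key] = _sha1(value.encode("utf-8"))
    segment_hashes = [
        _sha1(repr(sorted(seg.items())).encode("utf-8")) for seg in segments
    ]
    root = _sha1("".join(segment_hashes).encode("utf-8"))
    return root, segment_hashes, segments


def keys_to_repair(store_a: dict, store_b: dict) -> set:
    """Walk down only the branches whose hashes differ."""
    root_a, seg_hashes_a, segs_a = build_tree(store_a)
    root_b, seg_hashes_b, segs_b = build_tree(store_b)
    if root_a == root_b:
        return set()  # replicas agree; nothing below the root is inspected
    differing = set()
    for i in range(NUM_SEGMENTS):
        if seg_hashes_a[i] == seg_hashes_b[i]:
            continue
        for key in segs_a[i].keys() | segs_b[i].keys():
            if segs_a[i].get(key) != segs_b[i].get(key):
                differing.add(key)
    return differing


replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k1": "v1", "k2": "stale"}
print(sorted(keys_to_repair(replica_a, replica_b)))
```

The keys returned are the ones that would then be handed to read repair; matching subtrees are skipped entirely, which keeps the exchange cheap when replicas mostly agree.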
Questions? make it so!