designing for concurrency with riak
Mathias Meyer
May 29, 2012
http://riakhandbook.com
Transcript
designing for concurrency with riak
nosql matters
mathias meyer, @roidrage
http://riakhandbook.com
design for concurrency?
design data for concurrency
data starts out simple
ID  Username  Email
1   roidrage  [email protected]
2   thomas    [email protected]
3   karen     [email protected]
single source of truth
always consistent
mostly consistent
monotonic
increase number of sources
replication
ID  Username  Email
1   roidrage  [email protected]
2   thomas    [email protected]
3   karen     [email protected]
eventual consistency*
* if no new updates are made to the object, eventually all accesses will return the last updated value. (werner vogels, 2008, http://queue.acm.org/detail.cfm?id=1466448)
multiple clients
ID  Username  Email
1   roidrage  [email protected]
2   thomas    [email protected]
3   karen     [email protected]
Client 1: PUT
Client 2: PUT
conflicting writes
siblings
data diverges
the challenge
determine the winner
determine order
designing data for concurrency
designing data for non-monotonic writes
no atomicity in riak
no coordination
all state is in the data
(eventual) consistency and logical monotonicity*
* hellerstein: the declarative imperative: experiences and conjectures in distributed logic (2010)
designing data with conflicts in mind
write now, converge later
rethink the data structures
ID  Username  Email
1   roidrage  [email protected]
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]"
}
track updates
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-1",
      "timestamp": 1337001337,
      "updates": {
        "firstname": "Mathias",
        "lastname": "Meyer"
      }
    }
  ]
}
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-2",
      "timestamp": 1337001337,
      "updates": {
        "email": "[email protected]"
      }
    }
  ]
}
apply all updates ordered by time
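A minimal Python sketch of what "apply all updates ordered by time" could mean on the client. It assumes that every sibling carries the same base attributes, that `updates` is a JSON object of attribute/value pairs, and that the client name is an acceptable tie-breaker for equal timestamps. The function name `converge` and the example.com addresses are hypothetical.

```python
def converge(siblings):
    """Merge sibling documents by replaying every tracked change in time order."""
    # Assumption: all siblings share the same base attributes, so take them
    # from the first sibling and leave the changelog out of the merged view.
    merged = {k: v for k, v in siblings[0].items() if k != "changes"}
    # Collect the changes of all siblings into one list.
    changes = [c for s in siblings for c in s.get("changes", [])]
    # Oldest first; the client name breaks ties so every reader gets the same order.
    for change in sorted(changes, key=lambda c: (c["timestamp"], c["client"])):
        merged.update(change["updates"])
    return merged


base = {"id": 1, "username": "roidrage", "email": "old@example.com"}
sibling_1 = dict(base, changes=[{
    "client": "client-1", "timestamp": 1337001337,
    "updates": {"firstname": "Mathias", "lastname": "Meyer"}}])
sibling_2 = dict(base, changes=[{
    "client": "client-2", "timestamp": 1337001337,
    "updates": {"email": "new@example.com"}}])

user = converge([sibling_1, sibling_2])
```

The result is only the current view of the object. A real client would probably write the combined changelog back along with it, so that later readers can converge again.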
what about removing data?
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-1",
      "timestamp": 1337001337,
      "updates": [
        { "_op": "delete", "attribute": "email" }
      ]
    }
  ]
}
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-2",
      "timestamp": 1337001337,
      "updates": [
        { "_op": "add", "attribute": "email", "value": "[email protected]" }
      ]
    }
  ]
}
keep a changelog
client converges data
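One way a client might converge such a changelog is to replay the `_op` entries oldest first. This is a sketch, not Riak functionality: `apply_ops` is a hypothetical helper, and the timestamps and the example.com address are made up for illustration.

```python
def apply_ops(document, changes):
    """Replay add/delete operations, oldest first, onto a copy of the document."""
    result = dict(document)
    # sorted() is stable, so changes with equal timestamps keep their input order.
    for change in sorted(changes, key=lambda c: c["timestamp"]):
        for op in change["updates"]:
            if op["_op"] == "add":
                result[op["attribute"]] = op["value"]
            elif op["_op"] == "delete":
                result.pop(op["attribute"], None)
    return result


doc = {"id": 1, "username": "roidrage", "email": "old@example.com"}
delete_email = {"client": "client-1", "timestamp": 1337001337,
                "updates": [{"_op": "delete", "attribute": "email"}]}
add_email = {"client": "client-2", "timestamp": 1337001340,
             "updates": [{"_op": "add", "attribute": "email",
                          "value": "new@example.com"}]}

converged = apply_ops(doc, [add_email, delete_email])
```

Here the delete is older than the add, so the new address survives regardless of the order in which the changes arrive. With identical timestamps the outcome would depend on input order, which is why time alone is a weak ordering.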
time as a means of ordering*
* leslie lamport: time, clocks, and the ordering of events in a distributed system (1978)
time is not a guarantee for uniqueness
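Because two clients can produce the same timestamp, each change can carry its own unique id in addition to the time. The sketch below assumes time-based UUIDs from Python's `uuid.uuid1()`; the helper names `new_change` and `merge_changes` are hypothetical.

```python
import time
import uuid


def new_change(updates):
    """Build a changelog entry that is unique even if timestamps collide."""
    return {
        "id": str(uuid.uuid1()),        # unique per change
        "timestamp": int(time.time()),  # seconds, used only for ordering
        "updates": updates,
    }


def merge_changes(*changelogs):
    """Drop duplicate changes by id, then order by (timestamp, id)."""
    unique = {c["id"]: c for log in changelogs for c in log}
    return sorted(unique.values(), key=lambda c: (c["timestamp"], c["id"]))


first = new_change([{"_op": "delete", "attribute": "email"}])
second = new_change([{"_op": "add", "attribute": "email", "value": "x@example.com"}])

# The same change showing up in two siblings is kept only once.
merged = merge_changes([first, second], [second])
```

Sorting on the id as a second key does not recover the true order of two changes made in the same second. It only guarantees that every client picks the same order.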
vector clocks?
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "id": "ca0cb932-a74e-11e1-9ce4-1093e90b5d80",
      "timestamp": 1337001337,
      "updates": [
        { "_op": "delete", "attribute": "email" }
      ]
    }
  ]
}
timelines*
* riak at yammer: http://basho.com/blog/technical/2011/03/28/Riak-and-Scala-at-Yammer/
time-ordered series of events
kept per user
{
  "events": [
    {
      "id": "ca0cb932-a74e-11e1-9ce4-1093e90b5d80",
      "timestamp": 1337001337,
      "event": {
        "type": "push",
        "repository": "rails/rails",
        "sha1": "0ea43bf"
      }
    },
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "event": {
        "type": "pull_request",
        "repository": "rails/rails",
        "sha1": "84efda0"
      }
    }
  ]
}
clients dedup, sort and truncate
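Dedup, sort and truncate could look roughly like this in Python. The function name `merge_timelines`, the newest-first order and the `limit` parameter are assumptions for illustration, and the short ids in the example stand in for real UUIDs.

```python
def merge_timelines(siblings, limit=100):
    """Merge sibling timelines: dedup by event id, sort newest first, truncate."""
    events = {}
    for sibling in siblings:
        for event in sibling["events"]:
            events[event["id"]] = event  # the same id in several siblings is kept once
    ordered = sorted(events.values(),
                     key=lambda e: (e["timestamp"], e["id"]),
                     reverse=True)
    return {"events": ordered[:limit]}  # drop the oldest entries beyond the limit


push = {"id": "a", "timestamp": 1337001337, "event": {"type": "push"}}
pull = {"id": "b", "timestamp": 1337001338, "event": {"type": "pull_request"}}
fork = {"id": "c", "timestamp": 1337001339, "event": {"type": "fork"}}

timeline = merge_timelines([{"events": [push, pull]},
                            {"events": [pull, fork]}], limit=2)
```

With `limit=2` the merged timeline keeps `fork` and `pull` and drops `push`, the oldest event.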
observation: clients manage the data
sets, counters, graphs
monotonic data structures
sets
an unordered bag of unique items
simplest thing that could possibly work...in riak
secondary indexes
X-Riak-Index-tags_bin: nosql, cloud, infrastructure
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]"
}
always unique
useful for simple things
useful for object associations
add-only
set: time-ordered list of operations
{
  "set": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "add",
      "value": "roidrage"
    }
  ]
}
{
  "set": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "add",
      "value": "roidrage"
    },
    {
      "id": "56707cee-a757-11e1-8e1b-1093e90b5d80",
      "timestamp": 1337001339,
      "op": "add",
      "value": "josh"
    }
  ]
}
{
  "set": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "add",
      "value": "roidrage"
    },
    {
      "id": "56707cee-a757-11e1-8e1b-1093e90b5d80",
      "timestamp": 1337001339,
      "op": "add",
      "value": "josh"
    },
    {
      "id": "a525f16c-a968-11e1-8b07-1093e90b5d80",
      "timestamp": 1337001343,
      "op": "remove",
      "value": "josh"
    }
  ]
}
slightly inefficient
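Reading such a set means replaying the whole list, which is where the inefficiency comes from. A minimal sketch, with `set_members` as a hypothetical helper and the operations taken from the slides:

```python
def set_members(document):
    """Replay the time-ordered operations to compute the current members."""
    members = set()
    # Order by timestamp, with the operation id as a deterministic tie-breaker.
    for op in sorted(document["set"], key=lambda o: (o["timestamp"], o["id"])):
        if op["op"] == "add":
            members.add(op["value"])
        elif op["op"] == "remove":
            members.discard(op["value"])
    return members


document = {"set": [
    {"id": "e018f024-a74e-11e1-9feb-1093e90b5d80", "timestamp": 1337001337,
     "op": "add", "value": "roidrage"},
    {"id": "56707cee-a757-11e1-8e1b-1093e90b5d80", "timestamp": 1337001339,
     "op": "add", "value": "josh"},
    {"id": "a525f16c-a968-11e1-8b07-1093e90b5d80", "timestamp": 1337001343,
     "op": "remove", "value": "josh"},
]}

members = set_members(document)
```

Every read walks every operation ever recorded, and the list keeps growing even when the set itself shrinks.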
2-phase set*
* https://github.com/aphyr/meangirls
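A two-phase set keeps one grow-only collection of added items and one of removed items. Merging siblings is a union of both, and the value is the difference. The sketch below is not the meangirls code; `merge_2p` and `value_2p` are hypothetical names, and items are kept in sorted lists so the result serializes to JSON.

```python
def merge_2p(a, b):
    """Merge two siblings by taking the union of adds and the union of removes."""
    return {
        "adds": sorted(set(a["adds"]) | set(b["adds"])),
        "removes": sorted(set(a["removes"]) | set(b["removes"])),
    }


def value_2p(two_phase_set):
    """Current members: everything added that was never removed."""
    return set(two_phase_set["adds"]) - set(two_phase_set["removes"])


sibling_a = {"adds": ["roidrage", "karen"], "removes": []}
sibling_b = {"adds": ["roidrage", "josh"], "removes": ["josh"]}

merged = merge_2p(sibling_a, sibling_b)
current = value_2p(merged)
```

The merge is commutative and idempotent, so no timestamps are needed. The price is that an item can never return once it has been removed.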
{
  "set": {
    "adds": ["roidrage", "josh"],
    "removes": ["josh"]
  }
}
counters
increment, decrement
{
  "counter": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "incr",
      "value": 4
    }
  ]
}
g-counters*
* a comprehensive study of convergent and commutative replicated data types, http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf
{
  "elements": {
    "client-1": 1,
    "client-2": 3,
    "client-3": 5
  }
}
value = 1 + 3 + 5 = 9
counters are easy when you increment only
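A grow-only counter can be sketched in a few lines of Python. Each client only ever raises its own slot, siblings merge by taking the per-client maximum, and the value is the sum of all slots. The function names are hypothetical, and the starting document mirrors the one on the slide.

```python
def increment(counter, client, amount=1):
    """Return a new counter in which only this client's slot has grown."""
    elements = dict(counter["elements"])
    elements[client] = elements.get(client, 0) + amount
    return {"elements": elements}


def merge(a, b):
    """Merge siblings by keeping the highest count seen for every client."""
    clients = a["elements"].keys() | b["elements"].keys()
    return {"elements": {c: max(a["elements"].get(c, 0), b["elements"].get(c, 0))
                         for c in clients}}


def value(counter):
    """The counter value is the sum of all per-client counts."""
    return sum(counter["elements"].values())


start = {"elements": {"client-1": 1, "client-2": 3, "client-3": 5}}

# Two clients increment concurrently, which produces two siblings.
sibling_a = increment(start, "client-1", 2)
sibling_b = increment(start, "client-3")

total = value(merge(sibling_a, sibling_b))
```

Taking the maximum only works because slots never decrease. That is why decrements need a different structure.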
convergent replicated data types*
* shapiro et al.: a comprehensive study of convergent and commutative replicated data types, http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf
statebox for erlang*
* https://github.com/mochi/statebox
knockbox for clojure*
* https://github.com/reiddraper/knockbox
data represents state
state-based means growth
data increases with lots of updates
dealing with growth
truncate
roll up, discard
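A possible shape for rolling up a counter log: fold every operation older than a cutoff into a single `value` and keep only the newer entries. `roll_up`, the `cutoff` parameter and the `decr` operation name are assumptions, and the ids are shortened for illustration.

```python
INCREMENTS = {"inc", "incr"}  # both spellings appear in the slides
DECREMENTS = {"decr"}         # assumption: the slides do not name a decrement op


def roll_up(document, cutoff):
    """Fold operations older than `cutoff` into `value`; keep the newer ones."""
    total = document.get("value", 0)
    kept = []
    for op in document["counter"]:
        if op["timestamp"] >= cutoff:
            kept.append(op)
        elif op["op"] in INCREMENTS:
            total += op["value"]
        elif op["op"] in DECREMENTS:
            total -= op["value"]
        else:
            raise ValueError("unknown operation: " + op["op"])
    return {"counter": kept, "value": total}


log = {"counter": [
    {"id": "op-1", "timestamp": 1337001337, "op": "inc", "value": 1},
    {"id": "op-2", "timestamp": 1337001340, "op": "inc", "value": 1},
    {"id": "op-3", "timestamp": 1337001347, "op": "inc", "value": 1},
]}

compacted = roll_up(log, cutoff=1337001347)
```

A sibling written before the roll-up may still carry the discarded entries, so a later merge could count them twice. Discarding safely needs some agreement between clients about what has already been folded in.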
{
  "counter": [
    {
      "id": "458f5936-a752-11e1-a876-1093e90b5d80",
      "timestamp": 1337001347,
      "op": "inc",
      "value": 1
    }
  ],
  "value": 2
}
garbage collection
not easy with riak
not easy with stateful data
garbage collection requires coordination
network partitions cause stale data
the solution?
trade off data size vs. consistency
commutative replicated data types*
* shapiro et al.: a comprehensive study of convergent and commutative replicated data types, http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf
operations instead of state
not yet possible with riak
eventual consistency is hard
thanks