Mathias Meyer
May 29, 2012
designing for concurrency with riak
http://riakhandbook.com
Transcript
designing for concurrency with riak
nosql matters
mathias meyer, @roidrage
http://riakhandbook.com
design for concurrency?
design data for concurrency
data starts out simple
ID  Username  Email
1   roidrage  [email protected]
2   thomas    [email protected]
3   karen     [email protected]
single source of truth
always consistent
mostly consistent
monotonic
increase number of sources
replication
ID  Username  Email
1   roidrage  [email protected]
2   thomas    [email protected]
3   karen     [email protected]
[the same table, shown once per replica]
eventual consistency*
* "if no new updates are made to the object, eventually all accesses will return the last updated value." werner vogels, 2008, http://queue.acm.org/detail.cfm?id=1466448
multiple clients
ID  Username  Email
1   roidrage  [email protected]
2   thomas    [email protected]
3   karen     [email protected]
Client 1: PUT    Client 2: PUT
conflicting writes
siblings
data diverges
the challenge
determine the winner
determine order
designing data for concurrency
designing data for non-monotonic writes
no atomicity in riak
no coordination
all state is in the data
(eventual) consistency and logical monotonicity*
* hellerstein: the declarative imperative: experiences and conjectures in distributed logic (2010)
designing data with conflicts in mind
write now, converge later
rethink the data structures
ID  Username  Email
1   roidrage  [email protected]

{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]"
}
track updates
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-1",
      "timestamp": 1337001337,
      "updates": {
        "firstname": "Mathias",
        "lastname": "Meyer"
      }
    }
  ]
}

{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-2",
      "timestamp": 1337001337,
      "updates": {
        "email": "[email protected]"
      }
    }
  ]
}
apply all updates ordered by time
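A minimal sketch of what "apply all updates ordered by time" could look like on the client, in Python. It assumes each sibling carries a "changes" list whose "updates" entry is a mapping from attribute to value, and that siblings agree on everything outside "changes". The function name `converge`, the tie-break on client name, and the example.com addresses are illustrative assumptions, not something the slides specify.

```python
from typing import Any


def converge(siblings: list[dict[str, Any]]) -> dict[str, Any]:
    """Replay every tracked change from all siblings, oldest first."""
    # Assumption: siblings agree on everything outside "changes",
    # so the first one serves as the starting point.
    result = {k: v for k, v in siblings[0].items() if k != "changes"}

    # Gather the changes recorded in every sibling.
    changes = [c for s in siblings for c in s.get("changes", [])]

    # Order by timestamp; ties fall back to the client name so that
    # every reader replays the changes in the same order.
    changes.sort(key=lambda c: (c["timestamp"], c["client"]))

    for change in changes:
        result.update(change["updates"])
    return result


# Two siblings written concurrently by different clients (placeholder emails).
sibling_1 = {
    "id": 1, "username": "roidrage", "email": "old@example.com",
    "changes": [{"client": "client-1", "timestamp": 1337001337,
                 "updates": {"firstname": "Mathias", "lastname": "Meyer"}}],
}
sibling_2 = {
    "id": 1, "username": "roidrage", "email": "old@example.com",
    "changes": [{"client": "client-2", "timestamp": 1337001337,
                 "updates": {"email": "new@example.com"}}],
}
merged = converge([sibling_1, sibling_2])
```

Because the sort key is deterministic, the result does not depend on the order in which the siblings are returned.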
what about removing data?
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-1",
      "timestamp": 1337001337,
      "updates": [
        { "_op": "delete", "attribute": "email" }
      ]
    }
  ]
}

{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "client": "client-2",
      "timestamp": 1337001337,
      "updates": [
        { "_op": "add", "attribute": "email", "value": "[email protected]" }
      ]
    }
  ]
}
keep a changelog
client converges data
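One way a client might converge siblings that carry "_op" entries, sketched in Python. It merges the changelogs, drops entries seen twice, replays "add" and "delete" in time order, and keeps the merged changelog in the result. Identifying a change by (client, timestamp), the function names, and the example.com addresses are assumptions made for this sketch.

```python
def merge_changelogs(siblings):
    """Union the changelogs of all siblings, oldest change first."""
    seen = set()
    merged = []
    for sibling in siblings:
        for change in sibling.get("changes", []):
            # Assumption: (client, timestamp) identifies a change.
            key = (change["client"], change["timestamp"])
            if key not in seen:
                seen.add(key)
                merged.append(change)
    # Deterministic order: timestamp first, client name as tie-breaker.
    merged.sort(key=lambda c: (c["timestamp"], c["client"]))
    return merged


def converge(siblings):
    """Replay the merged changelog and keep it alongside the data."""
    changes = merge_changelogs(siblings)
    result = {k: v for k, v in siblings[0].items() if k != "changes"}
    for change in changes:
        for update in change["updates"]:
            if update["_op"] == "add":
                result[update["attribute"]] = update["value"]
            elif update["_op"] == "delete":
                result.pop(update["attribute"], None)
    result["changes"] = changes
    return result


# One client deletes the email while another sets a new one.
sibling_1 = {
    "id": 1, "username": "roidrage", "email": "old@example.com",
    "changes": [{"client": "client-1", "timestamp": 1337001337,
                 "updates": [{"_op": "delete", "attribute": "email"}]}],
}
sibling_2 = {
    "id": 1, "username": "roidrage", "email": "old@example.com",
    "changes": [{"client": "client-2", "timestamp": 1337001337,
                 "updates": [{"_op": "add", "attribute": "email",
                              "value": "new@example.com"}]}],
}
converged = converge([sibling_1, sibling_2])
```

With equal timestamps the outcome hinges entirely on the tie-breaker, which is the weakness the following slides address.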
time as a means of ordering*
* leslie lamport: time, clocks, and the ordering of events in a distributed system (1978)
time is not a guarantee for uniqueness
vector clocks?
{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]",
  "changes": [
    {
      "id": "ca0cb932-a74e-11e1-9ce4-1093e90b5d80",
      "timestamp": 1337001337,
      "updates": [
        { "_op": "delete", "attribute": "email" }
      ]
    }
  ]
}
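Since a timestamp alone cannot tell two changes apart, each change can carry its own identifier. A small Python sketch, assuming the client generates a version 1 UUID per change (the ids on the slides have that shape); the helper name `new_change` is hypothetical.

```python
import time
import uuid


def new_change(updates):
    """Build a changelog entry with a unique id next to its timestamp."""
    return {
        # uuid1 combines a timestamp with a node id, so two changes made
        # in the same second still get distinct identifiers.
        "id": str(uuid.uuid1()),
        "timestamp": int(time.time()),
        "updates": updates,
    }


first = new_change([{"_op": "delete", "attribute": "email"}])
second = new_change([{"_op": "delete", "attribute": "email"}])
```

The id makes deduplication safe; the timestamp is still what orders the changes.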
timelines*
* riak at yammer: http://basho.com/blog/technical/2011/03/28/Riak-and-Scala-at-Yammer/
time-ordered series of events
kept per user
{
  "events": [
    {
      "id": "ca0cb932-a74e-11e1-9ce4-1093e90b5d80",
      "timestamp": 1337001337,
      "event": {
        "type": "push",
        "repository": "rails/rails",
        "sha1": "0ea43bf"
      }
    },
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "event": {
        "type": "pull_request",
        "repository": "rails/rails",
        "sha1": "84efda0"
      }
    }
  ]
}
clients dedup, sort and truncate
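The three steps can be sketched in a few lines of Python. The function name `merge_timelines`, the `limit` parameter, the newest-first ordering, and the short ids in the example are assumptions for illustration.

```python
def merge_timelines(siblings, limit=100):
    """Dedup events by id, sort newest first, keep at most `limit`."""
    by_id = {}
    for sibling in siblings:
        for event in sibling["events"]:
            # Dedup: the same event id may appear in several siblings.
            by_id[event["id"]] = event

    # Sort: timestamp first, id as tie-breaker for a stable order.
    newest_first = sorted(
        by_id.values(),
        key=lambda e: (e["timestamp"], e["id"]),
        reverse=True,
    )

    # Truncate: drop the oldest events beyond the limit.
    return {"events": newest_first[:limit]}


# Placeholder events; "e1" is present in both siblings.
e1 = {"id": "e1", "timestamp": 1337001337, "event": {"type": "push"}}
e2 = {"id": "e2", "timestamp": 1337001340, "event": {"type": "pull_request"}}
e3 = {"id": "e3", "timestamp": 1337001343, "event": {"type": "push"}}

timeline = merge_timelines([{"events": [e1, e2]}, {"events": [e1, e3]}], limit=2)
```

Here `timeline` holds e3 and e2; e1 is deduplicated and then falls off the end.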
observation: clients manage the data
sets, counters, graphs
monotonic data structures
sets
an unordered bag of unique items
simplest thing that could possibly work...in riak
secondary indexes
X-Riak-Index-tags_bin: nosql, cloud, infrastructure

{
  "id": 1,
  "username": "roidrage",
  "email": "[email protected]"
}
always unique
useful for simple things
useful for object associations
add-only
set: time-ordered list of operations
{
  "set": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "add",
      "value": "roidrage"
    }
  ]
}

{
  "set": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "add",
      "value": "roidrage"
    },
    {
      "id": "56707cee-a757-11e1-8e1b-1093e90b5d80",
      "timestamp": 1337001339,
      "op": "add",
      "value": "josh"
    }
  ]
}

{
  "set": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "add",
      "value": "roidrage"
    },
    {
      "id": "56707cee-a757-11e1-8e1b-1093e90b5d80",
      "timestamp": 1337001339,
      "op": "add",
      "value": "josh"
    },
    {
      "id": "a525f16c-a968-11e1-8b07-1093e90b5d80",
      "timestamp": 1337001343,
      "op": "remove",
      "value": "josh"
    }
  ]
}
slightly inefficient
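A rough Python sketch of reading such a set: merge the operation lists of all siblings, then replay them in time order to get the current members. The function names are assumptions. The replay on every read is where the inefficiency comes from.

```python
def merge_ops(siblings):
    """Union the operations of all siblings, deduplicated by id."""
    by_id = {}
    for sibling in siblings:
        for op in sibling["set"]:
            by_id[op["id"]] = op
    ops = sorted(by_id.values(), key=lambda op: (op["timestamp"], op["id"]))
    return {"set": ops}


def members(doc):
    """Replay add/remove operations in time order."""
    result = set()
    for op in sorted(doc["set"], key=lambda op: (op["timestamp"], op["id"])):
        if op["op"] == "add":
            result.add(op["value"])
        elif op["op"] == "remove":
            result.discard(op["value"])
    return result


add_roidrage = {"id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
                "timestamp": 1337001337, "op": "add", "value": "roidrage"}
add_josh = {"id": "56707cee-a757-11e1-8e1b-1093e90b5d80",
            "timestamp": 1337001339, "op": "add", "value": "josh"}
remove_josh = {"id": "a525f16c-a968-11e1-8b07-1093e90b5d80",
               "timestamp": 1337001343, "op": "remove", "value": "josh"}

merged = merge_ops([
    {"set": [add_roidrage, add_josh]},
    {"set": [add_roidrage, add_josh, remove_josh]},
])
current = members(merged)
```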
2-phase set*
* https://github.com/aphyr/meangirls
{
  "set": {
    "adds": ["roidrage", "josh"],
    "removes": ["josh"]
  }
}
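A 2-phase set in this shape can be merged and read with plain set operations. The sketch below is in Python with assumed function names; it is not the meangirls implementation. Note that in a 2-phase set a removed element cannot be added again, because the remove always wins.

```python
def merge(a, b):
    """Merge two siblings by taking the union of adds and of removes."""
    return {
        "set": {
            "adds": sorted(set(a["set"]["adds"]) | set(b["set"]["adds"])),
            "removes": sorted(set(a["set"]["removes"]) | set(b["set"]["removes"])),
        }
    }


def members(doc):
    """Everything that was added and never removed."""
    return set(doc["set"]["adds"]) - set(doc["set"]["removes"])


sibling_a = {"set": {"adds": ["roidrage", "josh"], "removes": []}}
sibling_b = {"set": {"adds": ["roidrage"], "removes": ["josh"]}}
merged = merge(sibling_a, sibling_b)
current = members(merged)
```

Both lists only ever grow, so the merge gives the same result in any order and any number of times.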
counters
increment, decrement
{
  "counter": [
    {
      "id": "e018f024-a74e-11e1-9feb-1093e90b5d80",
      "timestamp": 1337001337,
      "op": "incr",
      "value": 4
    }
  ]
}
g-counters*
* a comprehensive study of convergent and commutative replicated data types http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf
{
  "elements": {
    "client-1": 1,
    "client-2": 3,
    "client-3": 5
  }
}
value = 1 + 3 + 5 = 9
counters are easy when you increment only
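A g-counter in the "elements" shape can be sketched in Python as follows: each client increments only its own slot, siblings merge by taking the per-client maximum, and the value is the sum of all slots. The function names are assumptions for this sketch.

```python
def increment(counter, client, amount=1):
    """Return a new counter with the client's own slot increased."""
    if amount < 0:
        raise ValueError("a g-counter can only grow")
    elements = dict(counter["elements"])
    elements[client] = elements.get(client, 0) + amount
    return {"elements": elements}


def merge(a, b):
    """Per-client maximum: each slot only grows, so the larger one is newer."""
    clients = set(a["elements"]) | set(b["elements"])
    return {
        "elements": {
            c: max(a["elements"].get(c, 0), b["elements"].get(c, 0))
            for c in clients
        }
    }


def value(counter):
    """The counter's value is the sum over all client slots."""
    return sum(counter["elements"].values())


counter = {"elements": {"client-1": 1, "client-2": 3, "client-3": 5}}

# Two clients increment concurrently, producing two siblings.
sibling_a = increment(counter, "client-1")
sibling_b = increment(counter, "client-3", 2)
merged = merge(sibling_a, sibling_b)
```

Here `value(counter)` is 9 as on the slide, and `value(merged)` is 12 because neither concurrent increment is lost.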
convergent replicated data types*
* shapiro et al.: a comprehensive study of convergent and commutative replicated data types http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf
statebox for erlang*
* https://github.com/mochi/statebox
knockbox for clojure*
* https://github.com/reiddraper/knockbox
data represents state
state-based means growth
data increases with lots of updates
dealing with growth
truncate
roll up, discard
{
  "counter": [
    {
      "id": "458f5936-a752-11e1-a876-1093e90b5d80",
      "timestamp": 1337001347,
      "op": "inc",
      "value": 1
    }
  ],
  "value": 2
}
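Rolling up could look like this Python sketch: operations older than a cutoff are folded into "value" and discarded, newer ones stay in the list. The function names, the `cutoff` parameter, the "decr" op name, and the short ids are assumptions; the sketch accepts both "incr" and "inc" because the slides use both spellings.

```python
INCREMENTS = {"incr", "inc"}  # both spellings appear on the slides


def signed(op):
    """Turn an operation into a positive or negative amount."""
    # "decr" is a hypothetical name for the decrement operation.
    return op["value"] if op["op"] in INCREMENTS else -op["value"]


def roll_up(doc, cutoff):
    """Fold operations older than `cutoff` into the running value."""
    total = doc.get("value", 0)
    remaining = []
    for op in doc["counter"]:
        if op["timestamp"] < cutoff:
            total += signed(op)      # roll up, then discard
        else:
            remaining.append(op)     # too recent, keep the operation
    return {"counter": remaining, "value": total}


def value(doc):
    """Rolled-up value plus all operations still in the list."""
    return doc.get("value", 0) + sum(signed(op) for op in doc["counter"])


doc = {
    "counter": [
        {"id": "op-1", "timestamp": 1337001337, "op": "incr", "value": 2},
        {"id": "op-2", "timestamp": 1337001347, "op": "inc", "value": 1},
    ]
}
rolled = roll_up(doc, cutoff=1337001340)
```

This is only safe if no other sibling still carries a rolled-up operation; otherwise a later merge would count it twice.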
garbage collection
not easy with riak
not easy with stateful data
garbage collection requires coordination
network partitions cause stale data
the solution?
trade off data size vs. consistency
commutative replicated data types*
* shapiro et al.: a comprehensive study of convergent and commutative replicated data types http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf
operations instead of state
not yet possible with riak
eventual consistency is hard
thanks