Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
SWIM: Scalable Weakly Consistent Infection Styl...
Search
Paul Hinze
July 22, 2015
Technology
400
2
Share
SWIM: Scalable Weakly Consistent Infection Style Process Group Membership Protocol
Papers We Love, Chicago
July 22, 2015
Paul Hinze
July 22, 2015
More Decks by Paul Hinze
See All by Paul Hinze
Getting Good at System Failure Analysis
phinze
0
470
Applying Graph Theory to Infrastructure As Code
phinze
6
2.2k
Infrastructure as Code with Terraform and Friends
phinze
2
330
Smoke & Mirrors: The Primitives of High Availability
phinze
1
750
Git: Everybody's Favorite MMO
phinze
0
210
Shut Up and Pipe! Unix-style Object Collaboration in Rack and Vagrant
phinze
0
180
Freighthop: Vagrant on Rails
phinze
0
330
Puppet Modules Are Our Friends
phinze
0
130
Who Needs Clouds?: HA in Your Datacenter
phinze
1
550
Other Decks in Technology
See All in Technology
Purview 勉強会報告 Microsoft Purview 入門しようとしてみた
masakichixo
1
440
Purview Endpoint DLP 動かしてみた
kozakigh
0
440
セキュリティ対策、何からはじめる? CloudNative環境の脅威モデリングと リスク評価実践入門 #cloudnativekaigi
varu3
5
980
エンタープライズの厳格な制約を開発者に意識させない:クラウドネイティブ開発基盤設計/cloudnative-kaigi-golden-path
mhrtech
0
440
会社説明資料|株式会社ギークプラス ソフトウェア事業部
geekplus_tech
0
300
クラウドからエッジまで ~ 1,700台を支える監視設計~
optfit
0
110
インプロセスQAのための要因から捉えるプロジェクトリスクマネジメントnano #1 開発リソース効率状態への対処 #jasstnano
barus_qa
0
150
Oracle AI Database@Google Cloud:サービス概要のご紹介
oracle4engineer
PRO
6
1.4k
AWS WAFの運用を地道に改善し、自社で運用可能にするプラクティス
andpad
1
310
20260515 OpenIDファウンデーション・ジャパンご紹介
oidfj
0
130
オライリーイベント登壇資料「鉄リサイクル・産廃業界におけるAI技術実応用のカタチ」
takarasawa_
0
410
開発サイクルのボーダーレス化に伴う組織変革から学んだこと / Organizational Transformation Amid the Borderless Development Cycle
mii3king
0
190
Featured
See All Featured
A Tale of Four Properties
chriscoyier
163
24k
The Anti-SEO Checklist Checklist. Pubcon Cyber Week
ryanjones
0
140
Marketing to machines
jonoalderson
1
5.3k
Hiding What from Whom? A Critical Review of the History of Programming languages for Music
tomoyanonymous
2
810
Darren the Foodie - Storyboard
khoart
PRO
3
3.3k
Why You Should Never Use an ORM
jnunemaker
PRO
61
9.8k
Building a A Zero-Code AI SEO Workflow
portentint
PRO
0
520
What's in a price? How to price your products and services
michaelherold
247
13k
How to optimise 3,500 product descriptions for ecommerce in one day using ChatGPT
katarinadahlin
PRO
1
3.6k
KATA
mclloyd
PRO
35
15k
Bash Introduction
62gerente
615
210k
AI Search: Implications for SEO and How to Move Forward - #ShenzhenSEOConference
aleyda
1
1.2k
Transcript
SWIM Scalable Weakly Consistent Infection Style Process Group Membership Protocol
Paul Hinze phinze
Paul Hinze phinze death stare
Armon Dadgar armon creator of Serf and Consul
ma
None
None
Process Group Membership Protocol Who is alive
None
Process Group Membership Protocol
SWIM Scalable Weakly Consistent Infection Style Process Group Membership Protocol
Scalable The SWIM effort is motivated by the unscalability of
traditional heartbeating protocols.
Heartbeating A B C A A B B Failure Detection
+ Membership
Heartbeating
Evaluating Protocols Completeness Speed Accuracy Overhead
Evaluating Protocols Completeness Speed Accuracy Overhead heartbeating Yes Limit *
Interval High Nodes2 !
Key Insight Failure Detection State Updates. Solve separately from
Failure Detection ping! ack! A B C D {B,C,D}
Failure Detection ping! ack! A B C D {B,C,D}
Indirect Ping ping(C)! ack! B C D ping(C)! ping ack
fail {B,C,D}
Indirect Ping ping(C)! B C D ping(C)! fail ...C is
dead! fail
Key Insight Failure Detection State Updates. Solve separately from
Key Insight State Updates Failure Detection Piggyback onto messages.
State Updates ping! (B is dead) ack! (D just joined)
A B C D {B,C} {D}
Infection Style A B C D {B,C} {A,B,D} {A,B} weakly
consistent
Evaluating Protocols Completeness Speed Accuracy Overhead SWIM Yes, eventually 1
* Interval High-ish O(N)
Improvements Time Bounded Completeness Increased Accuracy
Completeness A B {B, C, D, ..., N} N
Completeness A B {B, C, D, ..., N} N 1.
Shuffle List 2. Iterate
Completeness Fixed Time
Evaluating Protocols Completeness Speed Accuracy Overhead SWIM Yes, fixed time
1 * Interval High-ish O(N)
Accuracy B C D ...C is MAYBE dead! A
Accuracy B C D I heard C might be dead.
A
Accuracy B C D I'm not dead! A
Accuracy
Evaluating Protocols Completeness Speed Accuracy Overhead SWIM Yes, fixed time
1 * Interval High O(N)
Evaluating Protocols Completeness Speed Accuracy Overhead SWIM Yes, fixed time
1 * Interval High O(N)
None
Limitations Update Latency Problem Solution Separate Gossip Timer
Limitations Cannot Handle Network Partitions Problem Solution Track and Retry
Recently Dead Nodes
Limitations No Concept of Graceful Leave (vs Failure) Problem Solution
Broadcast and Tracking of "Intents"
Limitations New Nodes Take Too Long Materialize Initial State Problem
Solution Anti-entropy TCP State Syncs
Limitations No built-in facility for user data Problem Solution Implement
user payloads (ordering via lamport clocks)
Limitations No peer metadata, only IP Addresses Problem Solution Inject
versioned peer metadata into state messages
Limitations No encryption Problem Solution Implement AES-GCM with key rotation
56 nodes 2K nodes Performance
Implementations Memberlist https://github.com/hashicorp/memberlist Serf lib https://github.com/hashicorp/serf Serf http://www.serfdom.io Consul http://www.consul.io
Implementations events, queries, and scripts serf CLI serf lib memberlist
Implementations service discovery, K/V, health checks consul serf lib memberlist
Implementations service discovery, K/V, health checks
Thanks Cloud icons by Julien Deveaux from the Noun Project