There's no Clusterf*ck without a Cluster
Dan Hopkins
April 19, 2014
Programming
Transcript

There's No Clusterf*ck without a Cluster
How @GoVictorOps went from unicorns and broken to boring and working

Premature availabilization?
• Connect you with your monitors
• Harass you when stuff breaks
@boulderDanH

Availability is our DNA
• Scala
• Akka
• Kafka
• Shard key

What is clustering?

An online encyclopedia says
• Computers working together
(appeals to authority)

A dictionary says
• clus·ter noun \ˈkləs-tər\ a number of similar things that occur together
(includes pronunciation for legitimacy)

Our definition
• Who is currently in the cluster?
• Tell me when nodes are coming and going
• High Availability / scaling

Requirements 1.0
1. Logical actor tree
2. Service discovery
3. Lead me to success

Logical actor tree
• Failover
• Hand off

Service discovery
• “cluster://user/victorops/broadcaster” ! “hello”
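
The logical address on the service discovery slide suggests that senders name an actor by its place in the logical tree and let the cluster work out which node currently hosts it. A minimal Python sketch of that lookup follows, under stated assumptions: the `Directory` class, its method names, and the `akka.tcp://...` node addresses are hypothetical stand-ins, and "sending" is only recorded in a list rather than going over the network.

```python
class Directory:
    """Hypothetical registry: logical actor path -> node that hosts it."""

    SCHEME = "cluster://"

    def __init__(self):
        self._hosts = {}     # logical path -> physical node address
        self.delivered = []  # (node address, logical path, message)

    def register(self, logical_path, node_address):
        # Called whenever a node starts hosting the actor
        # (initial start, failover, or hand off).
        self._hosts[logical_path] = node_address

    def tell(self, uri, message):
        # Resolve the logical URI to a node, then record the delivery.
        if not uri.startswith(self.SCHEME):
            raise ValueError("not a cluster URI: " + uri)
        logical_path = uri[len(self.SCHEME):]
        node = self._hosts[logical_path]  # KeyError if no node hosts this path
        self.delivered.append((node, logical_path, message))
        return node


directory = Directory()
directory.register("user/victorops/broadcaster", "akka.tcp://app@10.0.0.1:2552")
first = directory.tell("cluster://user/victorops/broadcaster", "hello")

# Failover: another node takes over; senders keep using the same logical URI.
directory.register("user/victorops/broadcaster", "akka.tcp://app@10.0.0.2:2552")
second = directory.tell("cluster://user/victorops/broadcaster", "hello")
```

The point of the indirection is that the sender's code does not change when the actor moves; only the registry entry does.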

Tradeoffs are everywhere
• Vector clocks are totally cool
• Async consensus?

Implementation
• Routers / Patterns
• Native = Truth

Actor state
• Easy and Tempting
• Painful to unwind

What could go wrong?
• Partitions are permanent
• Want some config? How about six!
  ◦ failure-detector.threshold x 2
  ◦ failure-detector.min-std-deviation x 2
  ◦ failure-detector.acceptable-heartbeat-pause x 2
• Hazelcast uses hazelcast.max.no.heartbeat.seconds
• ZooKeeper uses “session timeout”
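
The three Akka settings named on this slide are the knobs of a phi accrual failure detector: `threshold` is the suspicion level at which a node is declared unreachable, `min-std-deviation` puts a floor under the measured spread of heartbeat intervals, and `acceptable-heartbeat-pause` adds slack before suspicion starts to grow. The sketch below shows how they interact. It is a simplified illustration that uses the exact normal tail probability, so it should not be read as Akka's actual implementation; the function names and sample values are assumptions chosen for the example.

```python
import math
import sys
from statistics import mean, pstdev


def phi(elapsed_ms, intervals_ms, min_std_deviation_ms=100.0,
        acceptable_pause_ms=0.0):
    """Suspicion level after `elapsed_ms` without a heartbeat."""
    # The acceptable pause shifts the expected arrival time later.
    mu = mean(intervals_ms) + acceptable_pause_ms
    # The minimum std deviation stops very regular heartbeats
    # from making the detector hypersensitive.
    sigma = max(pstdev(intervals_ms), min_std_deviation_ms)
    y = (elapsed_ms - mu) / sigma
    # Probability that a heartbeat would still arrive later than this.
    p_later = 0.5 * math.erfc(y / math.sqrt(2))
    p_later = max(p_later, sys.float_info.min)  # avoid log10(0)
    return -math.log10(p_later)


def is_available(elapsed_ms, intervals_ms, threshold=8.0, **settings):
    return phi(elapsed_ms, intervals_ms, **settings) < threshold


history = [1000.0] * 10  # heartbeats arrived exactly once per second

on_time = phi(1000.0, history)
late = phi(2000.0, history)
late_with_slack = phi(2000.0, history, acceptable_pause_ms=3000.0)
```

With these sample values a heartbeat that is one second overdue already crosses a threshold of 8, while the same delay with three seconds of acceptable pause does not. That sensitivity is one reason tuning two copies of each setting can be tricky.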

More picking on Akka
• Logging during failures is sparse
• Remoting / Failure detection weren’t bulkheaded

Recap
1. Logical actor tree
2. Service discovery
3. Lead me to success

Requirements 2.0
1. Member lists
2. Easy to configure, ability to add machines w/o config
3. Pass remoting address around

What is Hazelcast?
• Distributed maps & locks
• Multicast (IGMP)

Implementation
akka.remote.quarantine-systems-for = "off"
akka.remote.gate-invalid-addresses-for = 0 s
src: akka-devel
• Publish Akka address using a map
• Detect nodes joining / leaving cluster
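
The two bullets above describe a small pattern: each node publishes its Akka remoting address into a shared map, and membership events keep that map current. The sketch below illustrates the idea with a plain dictionary standing in for the distributed map. No Hazelcast API is used, and `AddressBook`, its method names, the member ids, and the addresses are all hypothetical.

```python
class AddressBook:
    """Hypothetical stand-in for a distributed map of member addresses."""

    def __init__(self):
        self.addresses = {}   # member id -> Akka remoting address
        self._listeners = []

    def subscribe(self, listener):
        # listener(kind, member_id, akka_address), kind is "joined" or "left"
        self._listeners.append(listener)

    def member_joined(self, member_id, akka_address):
        # A joining node publishes its address for everyone else to use.
        self.addresses[member_id] = akka_address
        self._notify("joined", member_id, akka_address)

    def member_left(self, member_id):
        # Drop the departed node's address so nobody keeps sending to it.
        akka_address = self.addresses.pop(member_id, None)
        if akka_address is not None:
            self._notify("left", member_id, akka_address)

    def _notify(self, kind, member_id, akka_address):
        for listener in self._listeners:
            listener(kind, member_id, akka_address)


events = []
book = AddressBook()
book.subscribe(lambda kind, member_id, address: events.append((kind, member_id)))

book.member_joined("node-1", "akka.tcp://app@10.0.0.1:2552")
book.member_joined("node-2", "akka.tcp://app@10.0.0.2:2552")
book.member_left("node-1")
```

A single in-process dictionary is trivially consistent. In a real cluster each node holds its own view of the map, so the views can disagree.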

What went wrong?
• Multicast
• In memory
• Cluster Client
• Member list isn’t consistent across cluster

Recap on requirements 2.0
1. Member lists
2. Easy to configure
3. Pass remoting address around

Requirements 3.0
1. Member list is consistent
2. Cluster clients are first class

Cluster Membership
• Consistent - Zk
• Probably consistent - Gossip
• YOLO consistency - Hazelcast

… no seriously, this is the logo

What is ZooKeeper?
• Clustered, consistent file system
• API is focused on building distributed concepts

Implementation
• Cluster Membership = EPHEMERAL
• Leader Election = SEQUENTIAL
• “Cluster” = EPHEMERAL_SEQUENTIAL
• Store akka addresses in ephemeral nodes
• Curator project
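
The mapping above leans on two znode flags. Ephemeral nodes disappear when the session that created them ends, which gives membership. Sequential nodes receive a monotonically increasing suffix, which gives an ordering that can be used for leader election. The toy simulation below shows how the two combine. It runs in memory and does not use the ZooKeeper or Curator API; `FakeZk`, the `/cluster/member-` prefix, the session names, and the addresses are hypothetical, and "lowest sequence number is the leader" is a common convention assumed here rather than something the slides state.

```python
import itertools


class FakeZk:
    """In-memory stand-in for ephemeral sequential znodes."""

    def __init__(self):
        self._nodes = {}                  # path -> (owning session, data)
        self._counter = itertools.count()

    def create_ephemeral_sequential(self, prefix, data, session):
        # Sequential: append a zero-padded, ever-increasing counter.
        path = "%s%010d" % (prefix, next(self._counter))
        self._nodes[path] = (session, data)
        return path

    def expire_session(self, session):
        # Ephemeral: every node owned by the dead session is removed.
        owned = [p for p, (owner, _) in self._nodes.items() if owner == session]
        for path in owned:
            del self._nodes[path]

    def members(self, prefix):
        # Membership = data of the surviving nodes, in creation order.
        paths = sorted(p for p in self._nodes if p.startswith(prefix))
        return [self._nodes[p][1] for p in paths]

    def leader(self, prefix):
        # Leader = member holding the lowest sequence number.
        current = self.members(prefix)
        return current[0] if current else None


PREFIX = "/cluster/member-"
zk = FakeZk()
zk.create_ephemeral_sequential(PREFIX, "akka.tcp://app@10.0.0.1:2552", session="a")
zk.create_ephemeral_sequential(PREFIX, "akka.tcp://app@10.0.0.2:2552", session="b")
zk.create_ephemeral_sequential(PREFIX, "akka.tcp://app@10.0.0.3:2552", session="c")

leader_before = zk.leader(PREFIX)
zk.expire_session("a")   # the first node's session times out
leader_after = zk.leader(PREFIX)
```

Because each node stores its Akka address as the znode's data, reading the children yields both the member list and the addresses needed to reach each member.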

The Good
• Reputation
• Strong Consistency
• Cluster clients / Service Discovery

What was / is hard?
• Twitter’s Zk library
• External Cluster Manager

The final tally
• Solid concept of membership
• Keep things simple
• Log / Graph / Monitor everything

Questions?