Lithium
A split-brain resolver for Akka-Cluster
Dennis van der Bij
@MrDnx
DennisVDB
Slide 2
OMS
• SwissBorg’s OMS (order management system)
• Aggregates the prices of 4 crypto-exchanges
• Best-execution
Slide 3
OMS’ objectives
• Best-execution
• High availability
Slide 4
OMS cluster
Node-2 Node-3
Node-1
Node-4 Node-5
• Persistent actors
• Singleton actors
• …
You are here
S = Super-important singleton
Slide 5
Unreachable nodes
Node-2 Node-3
Node-1
Node-4 Node-5
• S cannot be reached
• Need to start S on a reachable node
• Singleton actors are not migrated
when nodes are unreachable
S
Partition A
Partition B
Dead or alive?
Slide 6
Membership state
• Leader chosen deterministically
• Leader manages state transitions on
convergence
• Convergence cannot be reached with
unreachable nodes
• Eventually-perfect failure detector (FD)*
• Nodes cannot become fully-fledged
members or gracefully leave the
cluster
*Hayashibara, Naohiro, et al. "The φ accrual failure detector." Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004
[Diagram: member states Joining, Up, Leaving, Exiting, Removed, and Down; four of the transitions are labelled "Leader"]
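The membership rules on this slide can be sketched in Scala. This is a simplified model, not Akka's implementation: `Member`, `leader`, and `canConverge` are hypothetical names, and the sketch assumes the leader is simply the first reachable member in address order.

```scala
object MembershipSketch {
  // Hypothetical member model: an address plus what the failure detector currently reports.
  final case class Member(address: String, reachable: Boolean)

  // Assumption: every node sorts the same member list the same way,
  // so all nodes pick the same leader without an election round.
  def leader(members: Set[Member]): Option[Member] =
    members.toList.filter(_.reachable).sortBy(_.address).headOption

  // Simplification of the slide's rule: no convergence while any member is unreachable.
  def canConverge(members: Set[Member]): Boolean =
    members.forall(_.reachable)
}
```

With Node-4 and Node-5 marked unreachable, `canConverge` stays false, so in this model the leader cannot promote joining nodes or complete graceful leaves until the unreachable nodes are dealt with.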
Slide 7
Remove from membership state
Node-2 Node-3
Node-1
Node-2 Node-3
Node-1
S
Slide 8
Remove from membership state
Node-4 Node-5
S
Slide 9
Split-brain
Node-2 Node-3
Node-1
Node-4 Node-5
Network partition
One cluster becomes two clusters
S
S
Slide 10
Split-brain resolver
• Prevent split-brains from happening in the first place
• Pick only one partition that will survive
- Survivor will down the unreachable nodes
- Non-survivors will down themselves
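A minimal Scala sketch of that rule, assuming each side only knows which nodes it can and cannot reach. The names (`decide`, `survives`, `DownUnreachable`, `DownSelf`) are hypothetical, and the majority check is only a stand-in strategy, not Lithium's code.

```scala
object SplitBrainSketch {
  type Node = String

  sealed trait Decision
  // The surviving side removes the nodes it cannot reach.
  final case class DownUnreachable(nodes: Set[Node]) extends Decision
  // A non-surviving side shuts down every node it contains.
  final case class DownSelf(nodes: Set[Node]) extends Decision

  // Stand-in strategy (assumption): the side holding a strict majority survives.
  def survives(reachable: Set[Node], unreachable: Set[Node]): Boolean =
    reachable.size > unreachable.size

  // Every side runs the same deterministic rule on its own view,
  // so at most one side can decide that it is the survivor.
  def decide(reachable: Set[Node], unreachable: Set[Node]): Decision =
    if (survives(reachable, unreachable)) DownUnreachable(unreachable)
    else DownSelf(reachable)
}
```

The important property is that the decision needs no communication across the partition: both sides reach complementary conclusions from local information only.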
Slide 11
Existing solutions
• Lightbend SBR
- Multiple strategies
- Multi-DC
- Starting at $50’000 per year
• Four OSS SBRs
- Two used in production, single strategy (MOIA)
- Two others, multiple strategies (fail my tests)
Slide 12
Lithium
• Strategies
- Static-quorum, keep-majority, keep-oldest, and keep-referee
• Multi-datacenter support
• Tests, tests, tests
Slide 13
Static-quorum
• Pick partition with at least N nodes
• Downs the cluster when it has more than 2N − 1 nodes, or when no partition has at least N nodes
Slide 14
Static-quorum
Node-2 Node-3
Node-1
Node-4 Node-5
N = 3
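The static-quorum rule can be sketched as a small Scala function. This is an illustration under assumptions, not Lithium's implementation: `decide`, `quorumSize`, and the two decision objects are hypothetical names, and each side is assumed to know only how many nodes it can and cannot reach.

```scala
object StaticQuorumSketch {
  sealed trait Decision
  case object DownUnreachable extends Decision // this side survives
  case object DownSelf extends Decision        // this side shuts itself down

  def decide(quorumSize: Int, reachable: Int, unreachable: Int): Decision = {
    val total = reachable + unreachable
    // With more than 2N - 1 nodes, two sides could each hold a quorum,
    // so every side downs itself and the whole cluster goes down.
    if (total > 2 * quorumSize - 1) DownSelf
    // A side with at least N reachable nodes is the unique survivor.
    else if (reachable >= quorumSize) DownUnreachable
    // Fewer than N reachable nodes: this side cannot be the survivor.
    else DownSelf
  }
}
```

With N = 3 and the five nodes shown above, a 3/2 split keeps the three-node side and downs the two-node side. A 2/2/1 split leaves no side with a quorum, so every side downs itself.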
Keep-oldest
• Pick partition containing the oldest member
• Oldest member hosts the singleton instance
• Nearly the entire cluster is downed when the oldest member is alone in its partition
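A minimal Scala sketch of keep-oldest, assuming each member carries a number that orders it by age. `Member`, `upNumber`, and `decide` are hypothetical names for illustration, not Lithium's API.

```scala
object KeepOldestSketch {
  // Hypothetical member model: a lower upNumber means the member joined earlier.
  final case class Member(name: String, upNumber: Int)

  sealed trait Decision
  case object DownUnreachable extends Decision // this side survives
  case object DownSelf extends Decision        // this side shuts itself down

  // `reachable` always contains the deciding node itself, so the union is never empty.
  def decide(reachable: Set[Member], unreachable: Set[Member]): Decision = {
    val oldest = (reachable ++ unreachable).minBy(_.upNumber)
    if (reachable.contains(oldest)) DownUnreachable else DownSelf
  }
}
```

The last bullet follows directly from this rule: if the oldest member is cut off on its own, its one-node side survives and the other side, however large, downs itself.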