Lithium: a split-brain resolver for Akka-Cluster

In Akka-Cluster, once some nodes become unreachable, no node can join or even leave the cluster. To bring the cluster back to a fully working state, the unreachable nodes must be downed. However, there is no way of knowing whether a node has crashed or is the victim of a network partition, so downing done incorrectly can lead to data corruption, a split-brain, and a headache fixing it.

To recover from unreachable nodes automatically and correctly, Lightbend provides a resolver through its subscription. For individuals and companies that cannot afford the subscription, some open-source solutions exist, but they do not come near it in terms of features and correctness. To close that gap, I developed an open-source split-brain resolver called Lithium as part of my EPFL master project.

In this talk I will introduce Lithium, explain how it helps recover the cluster from unreachable nodes, describe its internals, and cover everything you need to know to set it up.

Dennis van der Bij

October 09, 2019

Transcript

  1. OMS • SwissBorg’s OMS (order management system) • Aggregates the prices of 4 crypto-exchanges • Best-execution
  2. OMS cluster • Persistent actors • Singleton actors • … [Diagram: five nodes, Node-1 to Node-5, with a "You are here" marker; S = super-important singleton]
  3. Unreachable nodes • S cannot be reached • Need to start S on a reachable node • Singleton actors are not migrated when nodes are unreachable [Diagram: Node-1 to Node-5 split into Partition A and Partition B, with S on the other side: dead or alive?]
  4. Membership state • Leader chosen deterministically • Leader manages state transitions on convergence • Convergence cannot be reached with unreachable nodes • Eventually-perfect failure detector (FD)* • Nodes cannot become fully-fledged members or gracefully leave the cluster [Diagram: member states Joining, Up, Leaving, Exiting, Removed and Down, with four transitions labelled "Leader"] *Hayashibara, Naohiro, et al. "The φ accrual failure detector." Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004
  5. Split-brain resolver • Prevent split-brains from happening in the first place • Pick only one partition that will survive - Survivor will down the unreachable nodes - Non-survivors will down themselves
  6. Existing solutions • Lightbend SBR - Multiple strategies - Multi-DC - Starting at $50’000 per year • Four OSS SBRs - Two used in production, single strategy (MOIA) - Two others, multiple strategies (fail my tests)
  7. Static-quorum • Pick partition with at least N nodes • Downs the cluster: more than 2N − 1 nodes, or no partition with at least N nodes
  8. Keep-majority • Pick partition with a majority of nodes (or lowest address) • Downs the cluster: no partition with a majority
  9. Keep-oldest • Pick partition containing the oldest member • Oldest member hosts the singleton instance • Nearly entire cluster is downed when oldest is alone
  10. Keep-referee • Pick the partition containing the “referee” node • Downs most of the cluster when the referee is alone
  11. How it works • Provide instance of DowningProvider • Each cluster member runs an instance of Lithium
  12. Tests, tests, tests • ~70% of LOC are tests • Unit tests + property-based tests • “multi-jvm” tests
  13. Multi-JVM tests • Simulate a cluster locally • Split links between members programmatically • Observe how it gets resolved
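
The four strategies in slides 7 to 10 each answer one question per partition: does this side survive and down the unreachable nodes, or does it down itself? The sketch below models that decision in Python as pure functions over one side's view of the cluster. It is an illustration of the rules as stated on the slides, not Lithium's actual code or API: the `View` type, the `split` helper, the function names and the two decision labels are all hypothetical, and details such as node roles or indirectly connected nodes are left out.

```python
from dataclasses import dataclass

DOWN_UNREACHABLE = "down-unreachable"  # this side survives
DOWN_SELF = "down-self"                # this side shuts itself down


@dataclass(frozen=True)
class View:
    """One side's view of the cluster (hypothetical model, not Lithium's API)."""
    reachable: frozenset    # addresses this side can reach, itself included
    unreachable: frozenset  # addresses this side cannot reach

    @property
    def members(self):
        return self.reachable | self.unreachable


def _survive_if(condition):
    return DOWN_UNREACHABLE if condition else DOWN_SELF


def static_quorum(view, quorum_size):
    # With more than 2N - 1 members, two sides could each hold N nodes,
    # so every side downs itself.
    if len(view.members) > 2 * quorum_size - 1:
        return DOWN_SELF
    return _survive_if(len(view.reachable) >= quorum_size)


def keep_majority(view):
    mine, theirs = len(view.reachable), len(view.unreachable)
    if mine != theirs:
        return _survive_if(mine > theirs)
    # Equal halves: the side holding the lowest address survives.
    return _survive_if(min(view.members) in view.reachable)


def keep_oldest(view, oldest):
    return _survive_if(oldest in view.reachable)


def keep_referee(view, referee):
    return _survive_if(referee in view.reachable)


def split(members, side):
    """Build the view of `side` when it is cut off from all other members."""
    side = frozenset(side)
    return View(side, frozenset(members) - side)


# Five nodes split 2 / 3, as in the partition pictured on slide 3.
nodes = {"node-1", "node-2", "node-3", "node-4", "node-5"}
side_a = split(nodes, {"node-1", "node-2"})
side_b = split(nodes, {"node-3", "node-4", "node-5"})

for name, view in (("A", side_a), ("B", side_b)):
    print(name, keep_majority(view), keep_oldest(view, oldest="node-1"))
```

Each side evaluates the same function on its own view, so in this model the two sides reach opposite decisions without exchanging any messages. Here `node-1` is arbitrarily treated as the oldest member, which is why keep-oldest keeps the smaller side A while keep-majority keeps side B.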