
From Mainframe to Microservice: An Introduction to Distributed Systems

Tyler Treat
November 01, 2014

An introductory overview of distributed systems—what they are and why they're difficult to build. We explore fundamental ideas and practical concepts in distributed programming. What is the CAP theorem? What is distributed consensus? What are CRDTs? We also look at options for solving the split-brain problem while considering the trade-off of high availability as well as options for scaling shared data.


Transcript

  1. An Introduction to Distributed Systems
     ❖ Building a foundation of understanding
     ❖ Why distributed systems?
     ❖ Universal fallacies
     ❖ Characteristics and the CAP theorem
     ❖ Common pitfalls
     ❖ Digging deeper
     ❖ Byzantine Generals Problem and consensus
     ❖ Split-brain
     ❖ Hybrid consistency models
     ❖ Scaling shared data and CRDTs
  2. “A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” –Leslie Lamport
  3. Scale Up vs. Scale Out
     Vertical Scaling
     ❖ Add resources to a node
     ❖ Increases node capacity, load is unaffected
     ❖ System complexity unaffected
     Horizontal Scaling
     ❖ Add nodes to a cluster
     ❖ Decreases load, capacity is unaffected
     ❖ Availability and throughput w/ increased complexity
  4. Why Distributed Systems?
     ❖ Availability: serve every request
     ❖ Fault tolerance: resilient to failures
     ❖ Throughput: parallel computation
     ❖ Architecture: decoupled, focused services
     ❖ Economics: scale-out becoming manageable/cost-effective
  5. Universal Fallacy #1: The network is reliable.
     ❖ Message delivery is never guaranteed
     ❖ Best effort
     ❖ Is it worth it?
     ❖ Resiliency/redundancy/failover
  6. Universal Fallacy #2: Latency is zero.
     ❖ We cannot defy the laws of physics
     ❖ LAN to WAN deteriorates quickly
     ❖ Minimize network calls (batch)
     ❖ Design asynchronous systems
  7. Universal Fallacy #3: Bandwidth is infinite.
     ❖ Out of our control
     ❖ Limit message sizes
     ❖ Use message queueing
  8. Universal Fallacy #4: The network is secure.
     ❖ Everyone is out to get you
     ❖ Build in security from day 1
     ❖ Multi-layered
     ❖ Encrypt, pentest, train developers
  9. Universal Fallacy #5: Topology doesn’t change.
     ❖ Network topology is dynamic
     ❖ Don’t statically address hosts
     ❖ Collection of services, not nodes
     ❖ Service discovery
  10. Universal Fallacy #6: There is one administrator.
     ❖ May integrate with third-party systems
     ❖ “Is it our problem or theirs?”
     ❖ Conflicting policies/priorities
     ❖ Third parties constrain; weigh the risk
  11. Universal Fallacy #7: Transport cost is zero.
     ❖ Monetary and practical costs
     ❖ Building/maintaining a network is not trivial
     ❖ The “perfect” system might be too costly
  12. Universal Fallacy #8: The network is homogeneous.
     ❖ Networks are almost never homogeneous
     ❖ Third-party integration?
     ❖ Consider interoperability
     ❖ Avoid proprietary protocols
  13. Characteristics of a Reliable Distributed System
     ❖ Fault-tolerant: nodes can fail
     ❖ Available: serve all the requests, all the time
     ❖ Scalable: behave correctly with changing topologies
     ❖ Consistent: state is coordinated across nodes
     ❖ Secure: access is authenticated
     ❖ Performant: it’s fast!
  14. CAP Theorem
     ❖ Presented in 1998 by Eric Brewer
     ❖ Impossible to guarantee all three:
       ❖ Consistency
       ❖ Availability
       ❖ Partition tolerance
  15. Consistency
     ❖ Linearizable - there exists a total order of all state updates and each update appears atomic
     ❖ E.g. mutexes make operations appear atomic (sketch below)
     ❖ When operations are linearizable, we can assign a unique “timestamp” to each one (total order)
     ❖ A system is consistent if every node shares the same total order
     ❖ Consistency which is both global and instantaneous is impossible
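
Not from the deck, but a minimal single-machine sketch of the mutex analogy above: a lock makes concurrent increments appear atomic, so every thread observes one total order of updates. The distributed-systems problem is getting separate nodes to agree on such an order. The LinearizableCounter class is illustrative.

        import threading

        class LinearizableCounter:
            """The lock makes each update appear atomic, so all threads
            observe updates in a single total order (the 'timestamps')."""

            def __init__(self):
                self._lock = threading.Lock()
                self._value = 0
                self.history = []  # the total order of updates

            def increment(self):
                with self._lock:  # the operation appears instantaneous
                    self._value += 1
                    self.history.append(self._value)

        counter = LinearizableCounter()
        threads = [threading.Thread(target=counter.increment) for _ in range(10)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print(counter.history)  # one total order: [1, 2, ..., 10]
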
  16. Consistency
     ❖ Eventual consistency: replicas allowed to diverge, eventually converge
     ❖ Strong consistency: replicas can’t diverge; requires linearizability
  17. Availability
     ❖ Every request received by a non-failing node must be served
     ❖ If a piece of data required for a request is unavailable, the system is unavailable
     ❖ 100% availability is a myth
  18. Partition Tolerance
     ❖ A partition is a split in the network (many causes)
     ❖ Partition tolerance means partitions can happen
     ❖ CA is easy when your network is perfectly reliable
     ❖ Your network is not perfectly reliable
  19. Common Pitfalls
     ❖ Halting failure - machine stops
     ❖ Network failure - network connection breaks
     ❖ Omission failure - messages are lost
     ❖ Timing failure - clock skew
     ❖ Byzantine failure - arbitrary failure
  20. Byzantine Generals Problem
     ❖ Consider a city under siege by two allied armies
     ❖ Each army has a general
     ❖ One general is the leader
     ❖ Armies must agree when to attack
     ❖ Must use messengers to communicate
     ❖ Messengers can be captured by defenders
  21. Byzantine Generals Problem
     ❖ Send 100 messages, attack no matter what
       ❖ A might attack without B
     ❖ Send 100 messages, wait for acks, attack if confident
       ❖ B might attack without A
     ❖ Messages have overhead
     ❖ Can’t reliably make a decision (provably impossible)
  22. Distributed Consensus
     ❖ Replace 2 generals with N generals
     ❖ Nodes must agree on data value
     ❖ Solutions:
       ❖ Multi-phase commit
       ❖ State replication
  23. Two-Phase Commit (sketch below)
     ❖ Blocking protocol
     ❖ Coordinator waits for cohorts
     ❖ Cohorts wait for commit/rollback
     ❖ Can deadlock
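
A minimal, single-process sketch of the two-phase commit flow, assuming in-memory cohorts; the Cohort class and two_phase_commit function are illustrative, and a real implementation would use RPC plus durable logs so participants can recover.

        class Cohort:
            def prepare(self, txn):
                # Vote yes only if the transaction can be made durable locally.
                return txn.get("valid", True)

            def commit(self, txn):
                print("commit", txn["id"])

            def rollback(self, txn):
                print("rollback", txn["id"])

        def two_phase_commit(cohorts, txn):
            """Phase 1: collect votes. Phase 2: commit only if every cohort voted yes.
            The protocol blocks: if the coordinator dies after phase 1, cohorts wait."""
            votes = [c.prepare(txn) for c in cohorts]  # phase 1: prepare/vote
            if all(votes):
                for c in cohorts:                      # phase 2: commit everywhere
                    c.commit(txn)
                return True
            for c in cohorts:                          # any "no" vote aborts everywhere
                c.rollback(txn)
            return False

        two_phase_commit([Cohort(), Cohort(), Cohort()], {"id": 42, "valid": True})
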
  24. State Replication
     ❖ E.g. Paxos, Raft protocols
     ❖ Elect a leader (coordinator)
     ❖ All changes go through leader
     ❖ Each change appends log entry
     ❖ Each node has log replica
  25. State Replication (sketch below)
     ❖ Must have quorum (majority) to proceed
     ❖ Commit once quorum acks
     ❖ Quorums mitigate partitions
     ❖ Logs allow state to be rebuilt
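
A toy sketch of the quorum rule, not Raft or Paxos themselves: the leader appends an entry to each replica's log and treats it as committed once a majority has acknowledged it. Replica and Leader are illustrative names.

        class Replica:
            def __init__(self):
                self.log = []

            def append(self, entry):
                self.log.append(entry)
                return True  # ack; a real replica might be down or partitioned

        class Leader:
            """All changes go through the leader; an entry commits
            once a quorum (majority) of replicas has appended it."""

            def __init__(self, replicas):
                self.replicas = replicas
                self.commit_index = -1

            def replicate(self, entry):
                acks = sum(1 for r in self.replicas if r.append(entry))
                if acks >= len(self.replicas) // 2 + 1:  # quorum reached
                    self.commit_index += 1
                    return True
                return False  # not committed; the leader retries later

        cluster = [Replica() for _ in range(5)]
        leader = Leader(cluster)
        print(leader.replicate({"op": "set", "key": "x", "value": 1}))  # True
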
  26. Split-Brain (sketch below)
     ❖ Optimistic (AP) - let partitions work as usual
     ❖ Pessimistic (CP) - quorum partition works, fence others
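
A sketch of the pessimistic (CP) rule, assuming each side of a partition can count the peers it still reaches; may_serve_writes is an illustrative helper, not an API from any particular system.

        def may_serve_writes(reachable_nodes, cluster_size):
            """Only the side of a partition that still sees a majority keeps
            accepting writes; the minority fences itself until the split heals."""
            return len(reachable_nodes) >= cluster_size // 2 + 1

        # A 5-node cluster split into {a, b, c} | {d, e}:
        print(may_serve_writes({"a", "b", "c"}, 5))  # True: quorum partition works
        print(may_serve_writes({"d", "e"}, 5))       # False: minority is fenced
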
  27. Hybrid Consistency Models (sketch below)
     ❖ Weak == available, low latency, stale reads
     ❖ Strong == fresh reads, less available, high latency
     ❖ How do you choose a consistency model?
     ❖ Hybrid models
     ❖ Weaker models when possible (likes, followers, votes)
     ❖ Stronger models when necessary
     ❖ Tunable consistency models (Cassandra, Riak, etc.)
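
A sketch of the arithmetic behind tunable consistency in systems like Cassandra and Riak: with N replicas per key, a read is guaranteed to overlap the latest write whenever R + W > N. The helper below is illustrative, not a real client API.

        N = 3  # replicas per key

        def read_sees_latest_write(write_quorum, read_quorum, n=N):
            """Strong consistency requires the read and write quorums to overlap."""
            return read_quorum + write_quorum > n

        print(read_sees_latest_write(write_quorum=1, read_quorum=1))  # False: fast, possibly stale
        print(read_sees_latest_write(write_quorum=2, read_quorum=2))  # True: quorum reads/writes
        print(read_sees_latest_write(write_quorum=3, read_quorum=1))  # True: write-all, read-one
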
  28. Scaling Shared Data
     ❖ Sharing mutable data at large scale is difficult
     ❖ Solutions:
       ❖ Immutable data
       ❖ Last write wins
       ❖ Application-level conflict resolution
       ❖ Causal ordering, e.g. vector clocks (sketch below)
       ❖ Distributed data types (CRDTs)
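
A minimal vector-clock sketch showing how causal ordering separates "happened before" from concurrent (conflicting) updates; the vc_* helpers are illustrative.

        def vc_increment(clock, node):
            """Bump this node's entry before it emits an event."""
            clock = dict(clock)
            clock[node] = clock.get(node, 0) + 1
            return clock

        def vc_merge(a, b):
            """Element-wise max: the clock after observing both histories."""
            return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

        def happened_before(a, b):
            """a -> b iff a <= b element-wise and a != b; otherwise concurrent or equal."""
            return a != b and all(a.get(n, 0) <= b.get(n, 0) for n in set(a) | set(b))

        a = vc_increment({}, "n1")    # {"n1": 1}
        b = vc_increment(a, "n2")     # {"n1": 1, "n2": 1}
        c = vc_increment(a, "n1")     # {"n1": 2}, written without seeing b
        print(happened_before(a, b))  # True: a causally precedes b
        print(happened_before(b, c))  # False: b and c are concurrent (a conflict)
        print(vc_merge(b, c))         # {"n1": 2, "n2": 1} after resolving it
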
  29. CRDT
     ❖ Conflict-free Replicated Data Type
     ❖ Convergent: state-based
     ❖ Commutative: operations-based
     ❖ E.g. distributed sets, lists, maps, counters
     ❖ Update concurrently w/o writer coordination
  30. CRDT (sketch below)
     ❖ CRDTs always converge (provably)
     ❖ Operations commute (order doesn’t matter)
     ❖ Highly available, eventually consistent
     ❖ Always reach consistent state
     ❖ Drawbacks:
       ❖ Requires knowledge of all clients
       ❖ Must be associative, commutative, and idempotent
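
A sketch of a state-based (convergent) CRDT, the grow-only counter: each node increments only its own slot, and merge takes an element-wise max, which is associative, commutative, and idempotent. Note it needs to know the set of clients up front; GCounter is an illustrative name.

        class GCounter:
            """Grow-only counter: replicas converge no matter how often
            or in what order their states are merged."""

            def __init__(self, node_ids):
                self.counts = {n: 0 for n in node_ids}  # requires knowing all clients

            def increment(self, node_id, amount=1):
                self.counts[node_id] += amount  # only touch your own slot

            def value(self):
                return sum(self.counts.values())

            def merge(self, other):
                for n, c in other.counts.items():
                    self.counts[n] = max(self.counts.get(n, 0), c)

        a = GCounter(["n1", "n2"])
        b = GCounter(["n1", "n2"])
        a.increment("n1")
        b.increment("n2")
        b.increment("n2")
        a.merge(b)
        b.merge(a)
        print(a.value(), b.value())  # 3 3: both replicas converge
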
  31. CRDT (demo below)
     ❖ Add to set is associative, commutative, idempotent
       ❖ add(“a”), add(“b”), add(“a”) => {“a”, “b”}
     ❖ Adding and removing items is not
       ❖ add(“a”), remove(“a”) => {}
       ❖ remove(“a”), add(“a”) => {“a”}
     ❖ CRDTs require interpretation of common data structures w/ limitations
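
The slide's examples, run against a plain set: add commutes and is idempotent, but add and remove do not commute, which is why a naive replicated set cannot converge without a richer type like the two-phase set on the next slide.

        s = set()
        for element in ["a", "b", "a"]:  # add("a"), add("b"), add("a")
            s.add(element)               # add is commutative and idempotent
        print(sorted(s))                 # ['a', 'b'], regardless of order or repeats

        s1 = set()
        s1.add("a")
        s1.discard("a")                  # add("a"), remove("a") => set()
        s2 = set()
        s2.discard("a")
        s2.add("a")                      # remove("a"), add("a") => {"a"}
        print(s1, s2)                    # different results: the operations don't commute
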
  32. Two-Phase Set (sketch below)
     ❖ Use two sets, one for adding, one for removing
     ❖ Elements can be added once and removed once
     ❖ { “a”: [“a”, “b”, “c”], “r”: [“a”] } => {“b”, “c”}
     ❖ add(“a”), remove(“a”) => {“a”: [“a”], “r”: [“a”]}
     ❖ remove(“a”), add(“a”) => {“a”: [“a”], “r”: [“a”]}
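
A sketch of the two-phase set along the lines the slide describes: adds and removes go to separate sets, a removal acts as a tombstone, and merged replicas converge regardless of the order the operations arrived in. TwoPhaseSet is an illustrative name.

        class TwoPhaseSet:
            """An element is present if it was added and never removed.
            Removal is permanent (a tombstone), so add/remove in any order converges."""

            def __init__(self):
                self.added = set()
                self.removed = set()

            def add(self, e):
                self.added.add(e)

            def remove(self, e):
                self.removed.add(e)  # tombstone, even if the add hasn't arrived yet

            def contains(self, e):
                return e in self.added and e not in self.removed

            def merge(self, other):
                self.added |= other.added
                self.removed |= other.removed

        s = TwoPhaseSet()
        for e in ["a", "b", "c"]:
            s.add(e)
        s.remove("a")
        print([e for e in ["a", "b", "c"] if s.contains(e)])  # ['b', 'c']

        r1 = TwoPhaseSet(); r1.add("a"); r1.remove("a")  # add("a"), remove("a")
        r2 = TwoPhaseSet(); r2.remove("a"); r2.add("a")  # remove("a"), add("a")
        r1.merge(r2); r2.merge(r1)
        print(r1.contains("a"), r2.contains("a"))        # False False: replicas agree
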
  33. Further Readings
     ❖ Jepsen series, Kyle Kingsbury (aphyr)
     ❖ A Comprehensive Study of Convergent and Commutative Replicated Data Types, Shapiro et al.
     ❖ In Search of an Understandable Consensus Algorithm, Ongaro et al.
     ❖ CAP Twelve Years Later, Eric Brewer
     ❖ Many, many more…