OCB: Cloud Native Disaster Recovery for Stateful Workloads

Cloud Native Disaster Recovery
Raffaele Spazzoli, Architect at Red Hat, Lead at TAG Storage
Alex Chircop, CEO at StorageOS, Co-chair TAG Storage

Cloud Native Disaster Recovery

Concern | Traditional DR | Cloud Native DR
Type of deployment | Active/passive, rarely active/active | Active/active
Disaster detection and recovery trigger | Human | Autonomous
Disaster recovery procedure execution | Mix of manual and automated tasks | Automated
Recovery Time Objective (RTO) | From close to zero to hours | Close to zero
Recovery Point Objective (RPO) | From zero to hours | Exactly zero for strongly consistent deployments. Theoretically unbounded, practically close to zero for eventually consistent deployments.
DR process owner | Often the storage team | Application team
Capabilities needed for DR | From storage (backup/restore, volume replication) | From networking (east-west communication, global load balancer)

The information in this table reflects generally accepted attributes and measurements for Disaster Recovery architectures.

CNDR - Reference Architecture

Traditional DR strategies are still possible in the cloud. Here we are focusing on a new approach.

Availability and Consistency

Some definitions

High Availability: High Availability (HA) is a property of a system that allows it to continue performing normally in the presence of failures. What happens when a component in a Failure Domain is lost?

Consistency: Consistency is the property of a distributed stateful workload by which all of the instances of the workload "observe" the same state.

Disaster Recovery: Disaster recovery (DR) refers to the strategy for recovering from the complete loss of a datacenter. What happens when an entire Failure Domain is lost?

Failure Domain: Failure domains are areas which may fail due to a single event. Examples: nodes, racks, Kubernetes clusters, network zones and datacenters.

CAP Theorem

Product | CAP choice (either Availability or Consistency)
DynamoDB | Availability
Cassandra | Availability
CockroachDB | Consistency
MongoDB | Consistency

PACELC corollary: in the absence of a network partition, one can only optimize for either latency or consistency.

Consensus Protocols

Consensus protocols allow for the coordination of distributed processes by agreeing on actions to be taken.

Shared State Consensus Protocols: Protocols in which all participants perform the same action. They are implemented around the concepts of leader election and strict majority: Paxos, Raft.

Unshared State Consensus Protocols: Protocols in which all participants perform different actions. They require the acknowledgment of all participants and are vulnerable to network partitioning: 2PC, 3PC.

Reliable Replicated Data Store: Building on consensus protocols and the concept of sharing a log of operations, it is possible to build a Reliable Replicated Data Store. Apache BookKeeper is an example of a Reliable Replicated Data Store (for the log abstraction use case: append only).
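
The difference between the strict-majority rule and the all-participants rule can be sketched in a few lines. This is only an illustration of the counting rule, not an implementation of Paxos, Raft or 2PC; the function names are hypothetical and the participant counts are made-up examples.

```python
def strict_majority(participants: int) -> int:
    """Smallest number of votes that is more than half of the participants."""
    return participants // 2 + 1


def majority_can_commit(participants: int, reachable: int) -> bool:
    """Majority-based protocols (Paxos, Raft): a strict majority is enough."""
    return reachable >= strict_majority(participants)


def all_ack_can_commit(participants: int, reachable: int) -> bool:
    """All-acknowledge protocols (2PC, 3PC): every participant must answer."""
    return reachable == participants


if __name__ == "__main__":
    # (total participants, participants still reachable) - assumed sample values
    for total, reachable in [(3, 3), (3, 2), (3, 1), (2, 1)]:
        print(
            f"{reachable}/{total} reachable:",
            "majority commit =", majority_can_commit(total, reachable),
            "| all-ack commit =", all_ack_can_commit(total, reachable),
        )
```

With three participants, losing one still leaves a strict majority of two, while the all-acknowledge rule blocks as soon as any participant is unreachable. With only two participants, losing one also loses the majority.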

Anatomy of a Stateful Application

Stateful Workload Logical Tiers

Replicas: Replicas are a way to increase availability of a stateful workload.

Partitions: Partitions are a way to increase the general throughput of the workload. This is achieved by breaking the state space into partitions or shards.

Putting it all together
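
A minimal sketch of how partitions and replicas combine, assuming hash-based sharding and round-robin replica placement. Real products use their own schemes (range partitioning, consistent hashing, and so on); the function names and the failure domain names below are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def replica_domains(shard: int, failure_domains: list[str], replicas: int) -> list[str]:
    """Place each replica of a shard in a different failure domain."""
    if replicas > len(failure_domains):
        raise ValueError("need at least one failure domain per replica")
    start = shard % len(failure_domains)
    return [
        failure_domains[(start + i) % len(failure_domains)]
        for i in range(replicas)
    ]


if __name__ == "__main__":
    domains = ["region-a", "region-b", "region-c"]  # assumed failure domains
    for key in ["user:1", "user:2", "order:42"]:
        shard = shard_for(key, num_shards=4)
        print(key, "-> shard", shard, "-> replicas in", replica_domains(shard, domains, 3))
```

Sharding spreads keys across independent groups to raise throughput, while each shard keeps one replica per failure domain so that losing a domain does not lose the shard.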

Examples of Consensus Protocol choices

Product | Replica consensus protocol | Shard consensus protocol
Etcd | Raft | N/A (no support for shards)
Consul | Raft | N/A (no support for shards)
Zookeeper | Atomic Broadcast (a derivative of Paxos) | N/A (no support for shards)
ElasticSearch | Paxos | N/A (no support for transactions)
Cassandra | Paxos | Supported, but details are not available.
MongoDB | Paxos | Homegrown protocol.
CockroachDB | Raft | 2PC
YugabyteDB | Raft | 2PC
TiKV | Raft | Percolator
Spanner | Raft | 2PC + high-precision time service
Kafka | A custom derivative of PacificA | Custom implementation of 2PC

Strongly-Consistent vs Eventually-Consistent CNDR

Concern | Strongly-Consistent | Eventually-Consistent
RPO | Zero | Theoretically unbounded, practically close to zero. Temporary inconsistency can happen. Note: eventual consistency does not mean eventual correctness.
RTO | Few seconds. | Few seconds.
Latency | Strong sensitivity to latency between failure domains; single transaction latency will be >= 2 x worst latency between failure domains. | No sensitivity to latency between failure domains.
Throughput | Theoretically scales linearly with the number of instances; practically, it depends on the workload type and the maximum throughput available between failure domains. | Theoretically scales linearly with the number of instances; practically, it depends on the workload type.
Minimum required failure domains | Three | Two
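
The latency bound for strongly consistent deployments is simple arithmetic, sketched below. The function name and the latency figures are hypothetical; the only rule taken from the slide is "single transaction latency >= 2 x worst latency between failure domains".

```python
def min_strong_txn_latency_ms(link_latency_ms: dict[tuple[str, str], float]) -> float:
    """Lower bound on single-transaction latency for a strongly consistent
    deployment: twice the worst latency between any two failure domains."""
    return 2 * max(link_latency_ms.values())


if __name__ == "__main__":
    # Assumed one-way latencies between three failure domains, in milliseconds.
    links = {
        ("dc-a", "dc-b"): 10.0,
        ("dc-a", "dc-c"): 35.0,
        ("dc-b", "dc-c"): 28.0,
    }
    print("transaction latency is at least", min_strong_txn_latency_ms(links), "ms")
```

With these assumed numbers the slowest link is 35 ms, so a strongly consistent transaction cannot complete in less than 70 ms, regardless of how fast the other links are.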

CNDR - Strong Consistency - Kubernetes Reference Architecture

CNDR - Eventual Consistency - Kubernetes Reference Architecture

References

TAG Storage Cloud Native Disaster Recovery

Demos and reference implementations:
Geographically Distributed Stateful Workloads - Part One: Cluster Preparation
Geographically Distributed Stateful Workloads - Part Two: CockroachDB
Geographically Distributed Stateful Workloads - Part Three: Keycloak

Thank you

Short Demo

Demo Scenario

DR Simulation

Red Hat Livestreaming
