Slide 1

Slide 1 text

Mastering Aurora PostgreSQL Clusters for Disaster Recovery
MyDBOps OpenSource Database Meetup
Date: Saturday, October 7th, 2023
Time: 2 pm to 5 pm IST

Slide 2

Slide 2 text

About Me
Co-Founder & CTO
[email protected]
A data guy by job but a DBA by nature
• Network Engineer
• Cloud Architect
• Database Administrator
• Data Engineer
• Data Architect
Social Media Handles
• @BhuviTheDataGuy
• https://TheDataGuy.in
• /in/rbhuvanesh

Slide 3

Slide 3 text

About ShellKode
We are a born-in-the-cloud company specializing in Modernization, Security, Data, and AI/ML to empower businesses with cutting-edge technologies and drive transformative growth.
Achievements
• One of the fastest growing AWS partners
• Public Sector Badge
• Well Architected Program
• 50+ Happy Customers
• 55+ AWS Certified Architects
• 4 Service Delivery Centers: Bengaluru, Coimbatore, Hyderabad, Florida
Services
• AI/ML: Chatbot, Decision Making AI, Recommendation Engine
• Modernisation: Migration, Containerise, DevOps
• Data: Data Engineering, Data Analytics, DataOps
• GenAI: Multi Model, Large Language Model, Foundational Model
• Security
• Managed Services

Slide 4

Slide 4 text

Aurora – The differentiator
Features
• Storage and compute layers are decoupled and scale independently
• Data is maintained as 2 copies per Availability Zone and 6 copies per Region
• Storage auto-scales in 10 GB chunks
• Aurora-native replication
• Auto scaling for read replicas
• Provision a replica in a few minutes
• Higher throughput compared with native RDS instances

Slide 5

Slide 5 text

Aurora Global Databases
• Replicate your data across Regions worldwide
• Best fit for geographically distributed applications
• Fully managed failover
• Guaranteed RPO
• Low-latency replication
• Fail over to any Region at any time
• Supports global write forwarding

Slide 6

Slide 6 text

Architecture
• Physical + log replication
• Asynchronous replication
• Replication lag typically under 1 second
• Custom replication service
• Runs over the AWS backbone network
• Encrypted connections
• Supports up to 5 secondary Regions
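The lag figure above can be checked on a live cluster. The sketch below is one way to do it, assuming an Aurora PostgreSQL global database where the aurora_global_db_status() function is available on the primary. PRIMARY_ENDPOINT, DB_NAME, DB_USER and the 1000 ms threshold are placeholders chosen for illustration, not values from the talk.

```shell
#!/usr/bin/env bash
# Sketch: inspect cross-Region replication lag on an Aurora PostgreSQL global database.
# PRIMARY_ENDPOINT, DB_NAME and DB_USER are hypothetical placeholders.

# Print one line per Region: region, durability lag (ms), RPO lag (ms).
show_global_lag() {
  psql "host=${PRIMARY_ENDPOINT} dbname=${DB_NAME} user=${DB_USER} sslmode=require" \
    -At -F ' ' \
    -c "SELECT aws_region, durability_lag_in_msec, rpo_lag_in_msec FROM aurora_global_db_status();"
}

# Classify a single lag reading against a threshold (both in milliseconds).
# Any fractional part is dropped before the integer comparison.
classify_lag() {
  local lag_ms=${1%.*}
  local threshold_ms=${2:-1000}
  if [ "$lag_ms" -lt "$threshold_ms" ]; then
    echo "ok"
  else
    echo "lagging"
  fi
}
```

Running show_global_lag prints the raw numbers; classify_lag 410 prints "ok" with the default threshold, which can be lowered or raised to match the RPO you are targeting.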

Slide 7

Slide 7 text

Aurora Replication vs Logical replication
Source: aws.amazon.com

Slide 8

Slide 8 text

Managed Failover
Switchover
Formerly known as "managed planned failover," this method is ideal for controlled situations such as operational maintenance and other planned operational processes. Because it ensures that the secondary DB clusters are synchronized with the primary before making any other changes, it guarantees an RPO of 0 (no data loss).
Failover
Use this method to recover from unplanned outages. You perform a cross-Region failover to one of the secondary DB clusters in your Aurora global database.
*New: failback is now possible with managed failover. After the failover, once the old primary is back, it is automatically rebuilt as a secondary cluster.
• Switchover time: up to 7 mins
• New primary promotion time: up to 1.5 mins
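Both operations map to AWS CLI commands. The following is a minimal sketch, assuming a recent AWS CLI v2 that includes switchover-global-cluster. The global cluster identifier and the target cluster ARN passed in are placeholders. The run helper is an illustration device: it prints the command unless DRY_RUN=0 is set, so the script can be reviewed before anything is changed.

```shell
#!/usr/bin/env bash
# Sketch: planned switchover vs. unplanned failover for an Aurora global database.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Planned switchover: secondaries are synchronized first, so RPO is 0.
# $1 = global cluster identifier, $2 = ARN of the secondary cluster to promote
switchover_global() {
  run aws rds switchover-global-cluster \
    --global-cluster-identifier "$1" \
    --target-db-cluster-identifier "$2"
}

# Unplanned failover: promotes the target even if it lags behind the primary,
# so --allow-data-loss must be stated explicitly.
failover_global() {
  run aws rds failover-global-cluster \
    --global-cluster-identifier "$1" \
    --target-db-cluster-identifier "$2" \
    --allow-data-loss
}
```

With the default dry run, switchover_global my-global arn:aws:rds:ap-south-2:111122223333:cluster:my-secondary only prints the command. Set DRY_RUN=0 to execute it against the Region configured for the CLI.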

Slide 9

Slide 9 text

Headless Cluster
• Low-cost DR solution
• The burstable instance family is not supported for headless clusters
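A headless cluster is a secondary cluster that has storage but no DB instances, which is where the cost saving comes from. A minimal sketch of creating one is shown here, assuming the global cluster already exists. All identifiers and the engine version are placeholders, and the engine version is assumed to match the primary. The run helper prints the command unless DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch: add a headless (instance-less) secondary cluster to a global database.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = global cluster id, $2 = new secondary cluster id,
# $3 = secondary Region, $4 = engine version (must match the primary)
create_headless_secondary() {
  run aws rds create-db-cluster \
    --region "$3" \
    --db-cluster-identifier "$2" \
    --global-cluster-identifier "$1" \
    --engine aurora-postgresql \
    --engine-version "$4"
  # No create-db-instance call follows: leaving the cluster without
  # instances is what makes it headless.
}
```

An encrypted global database would also need a --kms-key-id for the secondary Region. During a disaster, an instance has to be added to the cluster before it can serve traffic.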

Slide 10

Slide 10 text

Managed RPO
• The global_db_rpo parameter enforces that the clusters stay in sync
• Minimum value = 20 seconds and maximum = 68 years
• Ensures that at least one secondary cluster stays within the RPO limit
• Pauses all transaction commits on the primary cluster until one of the replicas catches up on the lag
Diagram: Replication Lag Detected (labels: 25 secs, 35 secs)
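The parameter is set in the DB cluster parameter group of the primary cluster, where its full name is rds.global_db_rpo. The sketch below validates the value against the range on the slide before calling the CLI. The parameter group name is a placeholder, the exact upper bound of 2147483647 seconds is my reading of "68 years", and ApplyMethod=immediate assumes the parameter is dynamic.

```shell
#!/usr/bin/env bash
# Sketch: set the managed RPO on the primary cluster's parameter group.

RPO_MIN=20            # seconds, from the slide
RPO_MAX=2147483647    # seconds, roughly 68 years (assumed exact upper bound)

# Print the --parameters value for a given RPO, or fail if it is out of range.
rpo_parameter() {
  local seconds=$1
  case "$seconds" in
    ''|*[!0-9]*)
      echo "RPO must be a whole number of seconds" >&2
      return 1
      ;;
  esac
  if [ "${#seconds}" -gt 10 ] ||
     [ "$seconds" -lt "$RPO_MIN" ] ||
     [ "$seconds" -gt "$RPO_MAX" ]; then
    echo "RPO must be between $RPO_MIN and $RPO_MAX seconds" >&2
    return 1
  fi
  echo "ParameterName=rds.global_db_rpo,ParameterValue=${seconds},ApplyMethod=immediate"
}

# Apply the RPO. $1 = DB cluster parameter group name (placeholder), $2 = seconds
set_global_rpo() {
  local spec
  spec=$(rpo_parameter "$2") || return 1
  aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name "$1" \
    --parameters "$spec"
}
```

For example, set_global_rpo my-primary-cluster-pg 25 requests a 25-second RPO, while a value such as 10 is rejected locally before any API call is made.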

Slide 11

Slide 11 text

Real world Experience

Slide 12

Slide 12 text

The dark side of the global_db_rpo parameter
• It keeps enforcing the block on transactions even when there is no secondary cluster
• Diagram: Primary and Secondary → Removing → A regional Aurora cluster
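One way to avoid this situation, offered as a sketch rather than a procedure from the talk, is to reset the parameter before detaching the cluster from the global database. The parameter group name, global cluster identifier and cluster ARN are placeholders, and the run helper prints each command unless DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch: clear the managed RPO before turning a global cluster into a regional one.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Reset rds.global_db_rpo to its default in the given cluster parameter group.
# $1 = DB cluster parameter group name
reset_global_rpo() {
  run aws rds reset-db-cluster-parameter-group \
    --db-cluster-parameter-group-name "$1" \
    --parameters "ParameterName=rds.global_db_rpo,ApplyMethod=immediate"
}

# Detach a cluster from the global database so it becomes a regional cluster.
# $1 = global cluster identifier, $2 = ARN of the cluster to detach
detach_from_global() {
  run aws rds remove-from-global-cluster \
    --global-cluster-identifier "$1" \
    --db-cluster-identifier "$2"
}
```

Calling reset_global_rpo first and detach_from_global second means the RPO enforcement is already gone by the time the cluster stands alone.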

Slide 13

Slide 13 text

The dark side of the global_db_rpo parameter
• Regional failovers (within the same Region) can block transactions for up to 5 mins

Slide 14

Slide 14 text

TLS Certificate
You will not get all the certificates in all the Regions.

    aws rds --region ap-south-2 \
      create-db-instance \
      --db-instance-identifier bhuvi-secondary-cluster-2 \
      --db-cluster-identifier bhuvi-secondary-cluster \
      --db-instance-class db.r5.large \
      --db-parameter-group-name bhuvi-secondary-pg \
      --enable-performance-insights \
      --performance-insights-kms-key-id xxxx \
      --ca-certificate-identifier rds-ca-2019 \
      --engine aurora-postgresql

    aws rds describe-certificates \
      --region ap-south-2 | jq \
      '.Certificates[].CertificateIdentifier'
    "rds-ca-rsa2048-g1"

    aws rds describe-certificates \
      --region ap-south-1 | jq \
      '.Certificates[].CertificateIdentifier'
    "rds-ca-ecc384-g1"
    "rds-ca-rsa4096-g1"
    "rds-ca-rsa2048-g1"
    "rds-ca-2019"
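The mismatch above can be caught before creating an instance. The sketch below checks whether a CA identifier is offered in a Region. It uses the CLI's own --query and --output text options in place of jq. The function names are made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: check that a CA certificate identifier exists in a Region
# before passing it to create-db-instance.

# Return 0 if the first argument appears among the remaining arguments.
has_cert() {
  local wanted=$1 id
  shift
  for id in "$@"; do
    if [ "$id" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# $1 = Region, $2 = CA certificate identifier
region_has_cert() {
  local ids
  ids=$(aws rds describe-certificates \
          --region "$1" \
          --query 'Certificates[].CertificateIdentifier' \
          --output text) || return 2
  # Unquoted on purpose: the text output is a whitespace-separated list.
  has_cert "$2" $ids
}
```

With the outputs shown on the slide, region_has_cert ap-south-2 rds-ca-2019 would fail and region_has_cert ap-south-1 rds-ca-2019 would succeed.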

Slide 15

Slide 15 text

Solution for TLS Certificate
• The global bundle certificate can be used to connect to RDS/Aurora instances from any Region.
• It works when your RDS instance has the certificate rds-ca-2019 and when it has rds-ca-rsa2048-g1.
• But you will not get the option to choose among all the certificates in every Region.
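As a sketch of the client side, the global bundle can be downloaded once and passed to psql as the root certificate. The endpoint, database name and user in the usage note are placeholders. Port 5432 and the local file name global-bundle.pem are assumptions of this example.

```shell
#!/usr/bin/env bash
# Sketch: connect to an Aurora PostgreSQL endpoint in any Region with the global CA bundle.

BUNDLE_URL="https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem"

# Download the bundle. $1 = target file (defaults to global-bundle.pem)
download_bundle() {
  curl -fsSL -o "${1:-global-bundle.pem}" "$BUNDLE_URL"
}

# Build a libpq connection string that verifies the server certificate
# and host name against the bundle.
# $1 = endpoint, $2 = database, $3 = user, $4 = bundle path (optional)
pg_conninfo() {
  printf 'host=%s port=5432 dbname=%s user=%s sslmode=verify-full sslrootcert=%s\n' \
    "$1" "$2" "$3" "${4:-global-bundle.pem}"
}
```

After download_bundle, a connection can be opened with psql "$(pg_conninfo my-secondary.cluster-ro-xxxx.ap-south-2.rds.amazonaws.com appdb appuser)". The same file works for the primary and every secondary Region.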

Slide 16

Slide 16 text

Quiz
1. Can we use different KMS keys for global clusters (primary cluster and secondary cluster)?
2. In a peering connection, the secondary cluster endpoints do not resolve from the primary Region, even though the VPC and subnets have DNS resolution enabled. Why?

Slide 17

Slide 17 text

KMS key for Global Clusters
• Both clusters use different storage volumes
• KMS keys can be the default key or a CMK
• You can have a different CMK for each cluster
• You can use a combination of default + CMK
Peering – DNS resolution
• Peered VPCs will not resolve RDS endpoints over the private network by default
• Enable DNS hostnames and DNS resolution in both the requester and accepter peering connection settings.
Peering – Security Group
• In a peering connection, you cannot whitelist a security group ID if the VPC is in a different Region.
• You can whitelist:
  • A specific IP
  • The IP range of the subnet
  • The IP range of the VPC
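The two peering points can be sketched with the AWS CLI as follows. The peering connection ID, security group ID, Regions and CIDR are placeholders, and port 5432 assumes the default PostgreSQL port. For a cross-Region peering, each side's DNS option is set from that side's own Region. The run helper prints each command unless DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch: DNS resolution and security group rules for cross-Region VPC peering.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Let the remote VPC resolve this side's endpoints to private IPs.
# $1 = Region of this side, $2 = peering connection id, $3 = requester | accepter
enable_peering_dns() {
  case "$3" in
    requester|accepter) ;;
    *) echo "side must be requester or accepter" >&2; return 1 ;;
  esac
  run aws ec2 modify-vpc-peering-connection-options \
    --region "$1" \
    --vpc-peering-connection-id "$2" \
    "--$3-peering-connection-options" AllowDnsResolutionFromRemoteVpc=true
}

# Allow PostgreSQL traffic from a CIDR, since a security group ID from
# another Region cannot be referenced.
# $1 = Region, $2 = security group id, $3 = CIDR of the peer IP, subnet or VPC
allow_postgres_from_cidr() {
  run aws ec2 authorize-security-group-ingress \
    --region "$1" \
    --group-id "$2" \
    --protocol tcp \
    --port 5432 \
    --cidr "$3"
}
```

A typical sequence is enable_peering_dns once per side, followed by allow_postgres_from_cidr on the database security group with the peer VPC's CIDR.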

Slide 18

Slide 18 text

Thank You!!