Slide 1

Slide 1 text

© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Slide 2

Slide 2 text

Design patterns for multi-Region applications and data in AWS
DAT207-S
Sponsored by YugabyteDB
Amey Banarse
VP of Solutions Engineering, YugabyteDB

Slide 3

Slide 3 text

Amey Banarse
VP of Solutions Engineering, YugabyteDB

Slide 4

Slide 4 text

© 2024 – All Rights Reserved
Run your business-critical applications using PostgreSQL-compatible and Cassandra-inspired APIs while enjoying
● seamless scalability
● built-in resilience
● flexible geo-distribution
● cost efficiency
without compromising on performance

Slide 5

Slide 5 text

YugabyteDB: Building on Top of Postgres Innovations
1. Distributed Postgres: Fully PostgreSQL compatible API for workload portability. Leverage resilience, dynamic scalability, and multi-site distribution in the DB to make your app cloud native.
2. Enterprise Grade by Default: Bring capabilities of leading commercial RDBMS to Postgres in a cloud-native architecture, e.g. DR and replication, performance and observability, security, etc.
3. Architected as a Cloud DBMS: Postgres architected as a flexible managed service in any public or private cloud. We are reimagining Postgres as a native cloud service, not just running it in the cloud.

Slide 6

Slide 6 text

Lim ∞ = Postgres Without Limits

Slide 7

Slide 7 text

PostgreSQL has become the default database API
○ Powerful RDBMS capabilities: matches Oracle features
○ Robust and mature: hardened over 30 years
○ Fully open source: permissive license, large community
○ Cloud providers adopting: managed services on all clouds
“Most popular database” of 2022*
“DBMS of the year” over multiple years (2017, 2018, 2020)**
* https://www.eversql.com/most-popular-databases
** https://db-engines.com/en/blog_post/85

Slide 8

Slide 8 text

PostgreSQL Compatibility and Cloud Native Architecture are Critical
How much Postgres compatibility?
● Wire: Can use PostgreSQL client drivers and psql shell
● Syntax: Parses PG syntax, but execution is different
● Feature: Supports some advanced PG features, but they will work differently
● Runtime: Exactly like Postgres. Port over all existing apps; PG developers are instantly at home.
How cloud native (distributed) is the architecture?
● Low: Cannot deliver high data durability, availability, scale, or best-in-class DR
● Medium: Delivers data durability and some vertical scale. Weak HA, horizontal scale, and DR
● High: High data durability, availability, scalability, DR, multi-region

Slide 9

Slide 9 text

PostgreSQL Compatibility and Cloud Native Architecture are Critical
Postgres compatibility levels: Wire, Syntax, Feature, Runtime
Cloud native architecture levels: Low, Medium, High
● PG Innovation Threshold: Can benefit from Postgres innovation (like pgvector for gen AI, QoS for multi-tenancy, etc.)

Slide 10

Slide 10 text

PostgreSQL Compatibility and Cloud Native Architecture are Critical
Postgres compatibility levels: Wire, Syntax, Feature, Runtime
Cloud native architecture levels: Low, Medium, High
● PG Innovation Threshold: Can benefit from Postgres innovation (like pgvector for gen AI, QoS for multi-tenancy, etc.)
● Cloud DBMS Innovation Threshold: Can innovate on distributed, cloud native architecture (like zero downtime, global apps, fast auto-scaling, connection scaling, etc.)
● Can innovate on both dimensions

Slide 11

Slide 11 text

PostgreSQL Compatibility and Cloud Native Architecture are Critical
Postgres compatibility levels: Wire, Syntax, Feature, Runtime
Cloud native architecture levels: Low, Medium, High
● PG Innovation Threshold
● Cloud DBMS Innovation Threshold

Slide 12

Slide 12 text

Resilience
The ability of a system to readily respond to or recover from change, disruption, or a crisis

Slide 13

Slide 13 text

Resilience was always critical: So what changed?
● Cloud native = More failures: Commodity servers fail, network interruptions are common
● Bigger scale = More failures: More apps as everything is digital and more headless services
● Viral success = More failures: Unexpected successes can overwhelm systems

Slide 14

Slide 14 text

Just resilience is no longer enough . . .

Slide 15

Slide 15 text

Modern applications demand ultra-resilience
● Customers expect always-on apps
● Nations run on digital infrastructure
● Brand reputation requires uptime

Slide 16

Slide 16 text

Modern applications need resilience built into Postgres, not layered on top of it

Slide 17

Slide 17 text

From resilience to ultra-resilience . . . for no downtime, no limits
● In-Region resilience
● Multi-Region BCDR
● Zero-downtime operations
● Data protection
● Peak and freak events
● Grey failures

Slide 18

Slide 18 text

How do you architect for zero downtime with YugabyteDB?
● Assume nodes and zones will fail often
● Users should experience zero impact
● RPO=0, RTO~3s with sync replication
● Replication lag typically <500 ms with async replication
Diagram 1: Sync replication across regions (Region 1, Region 2, Region 3) within a cluster
Diagram 2: Async replication between two clusters in different regions
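The trade-off between the two replication modes can be put in rough numbers. With synchronous replication, a write is acknowledged only after it is replicated, so a failover loses nothing (RPO=0). With asynchronous replication, writes still in flight to the other cluster can be lost, and that exposure is roughly the write rate multiplied by the replication lag. The sketch below is back-of-the-envelope arithmetic only, not a YugabyteDB API; the function name and the 10,000 writes/sec workload are assumptions made for illustration.

```python
def writes_at_risk(write_rate_per_sec: float, replication_lag_ms: float) -> float:
    """Estimate how many acknowledged writes have not yet reached the
    target cluster at the moment the source cluster fails."""
    if write_rate_per_sec < 0 or replication_lag_ms < 0:
        raise ValueError("rate and lag must be non-negative")
    return write_rate_per_sec * (replication_lag_ms / 1000.0)


# Hypothetical workload of 10,000 writes/sec (not a figure from the slides).
# Async replication at the slide's 500 ms lag ceiling:
print(writes_at_risk(10_000, 500))
# Sync replication, modeled here as zero lag:
print(writes_at_risk(10_000, 0))
```

The estimate is an upper bound for a steady write rate; bursty traffic or a lag spike at failure time would change the number.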

Slide 19

Slide 19 text

Let’s dive into real-world examples of ultra-resilient architectures

Slide 20

Slide 20 text


Slide 21

Slide 21 text

Business objective: Get Paramount+ closer to its end users
With the anticipated expansion through globalization and the release of new services and content, Paramount+ needed a database platform that could perform and scale to support peak demands and provide the best user experience.
Multi-Region/cloud deployment
• High availability and resilience
• Performance at peak scale
Compliance with local laws
• Conform to GDPR regulations
• Conform to local security laws

Slide 22

Slide 22 text

YugabyteDB and Paramount+: Global authentication and user profiles
Use case: Powers user login, authorizes content viewing, and manages profile information (watchlist, account details, preferences, etc.); YugabyteDB is the system of record for global authentication and all user profiles to view content
Past challenges: Paramount+ original single-region architecture with MySQL
● Slow read performance due to MySQL being limited to a single region
● No horizontal scalability due to limited primary and follower architecture (a single 64-core node handled all writes)
● Costly downtimes due to no region-level fault tolerance in single-region architecture
● Potential data loss due to high replication lag and potential primary node failure

Slide 23

Slide 23 text

New multi-Region design that powered Super Bowl 2024
● Multi-Region: Stretch Sync deployment
o Verified RPO=0 and RTO <10 secs on failures
o Global DB: 3 Regions (east, central, west) on public cloud IaaS, 5 AZs, replication RF=5
● Performance and scalability
o Read latencies <30 ms, transactional multi-region write latencies ~100 ms
o Scaled clusters seamlessly for peak events (AFC playoffs, Top Gun: Maverick launch, etc.)
● PostgreSQL runtime compatibility
o Live migration from MySQL to YugabyteDB
● Ecosystem integrations and extensibility
o Compliance with local laws for data residency
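To see why RF=5 across three Regions can give RPO=0 through a Region failure, it helps to work through the majority-quorum arithmetic used by consensus-based replication. The sketch below is illustrative only: the function names are made up for this example, and the 2/2/1 split of replicas across Regions is an assumption, since the slide gives only the totals (3 Regions, 5 AZs, RF=5).

```python
from typing import Mapping


def majority(rf: int) -> int:
    """Smallest number of replicas that forms a majority of rf."""
    return rf // 2 + 1


def tolerated_failures(rf: int) -> int:
    """Replica failures a majority quorum can absorb."""
    return rf - majority(rf)


def survives_region_loss(replicas_per_region: Mapping[str, int]) -> bool:
    """True if losing the most heavily loaded region still leaves a majority."""
    rf = sum(replicas_per_region.values())
    largest = max(replicas_per_region.values())
    return rf - largest >= majority(rf)


# Hypothetical placement of the five replicas; the slide gives only the totals.
placement = {"east": 2, "central": 2, "west": 1}
print(tolerated_failures(5))            # 2
print(survives_region_loss(placement))  # True
```

Under this assumed placement, any single Region can fail and three of five replicas remain, which is still a majority. A 3/1/1 split would not have that property, because losing the Region with three replicas leaves only two.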

Slide 24

Slide 24 text

YugabyteDB and Paramount+: Real-world success story
Consistent global growth on YugabyteDB
● Launched AFC playoffs, Grammys, Top Gun: Maverick
● Expecting 3-6x growth across some events
● Helped expand business to the EU region for Paramount+ International
https://www.paramountpressexpress.com/cbs-sports/shows/nfl-on-cbs/releases/?view=109115-nfl-on-cbs-scores-the-most-watched-nfl-divisional-playoff-game-ever-with-more-than-50-million-viewers

Slide 25

Slide 25 text

Peak event: Super Bowl 2024
Use case
• Media livestreaming platform
• User registrations and entitlement lookup
Peak scale
• CBS Sports’ presentation of Super Bowl LVIII was the most-watched telecast in history, with 123.4 million viewers across platforms
Challenges
• Massively scaling user entitlements lookup
• Resilience
• Low latency for users around the world

Slide 26

Slide 26 text

YugabyteDB and a large financial investment firm: A success story
Use case: This financial app aggregates and stores retail customers' portfolio data, organizes it by customer/account, and seamlessly transmits it to fintechs and third-party aggregators such as Intuit
Past challenges
• Federal regulations mandated increasing data retention from 4 days to 2 years, but the projected cost of the current solution was high
• The service was expected to become more critical and receive more queries, so it needed more efficiency and flexibility
• Had to move from on-prem to AWS to support resilience and higher scale at lower cost and with higher agility
Key requirements
• Support 150 TB (up from 2 TB)
• Support 150K ops/sec (up from 1K ops/sec)
• 600K bulk ops/sec
• Predictable reads at scale with P99 < 10 ms
• Resilience and availability: effective business continuity for cloud outage scenarios at very high throughput

Slide 27

Slide 27 text

Large financial investment firm: Technical results achieved
Scalability
• 15 nodes across 3 AZs, RF=3, with data scale up to 150 TB
• 700K ops/sec
• Data retention of 2 years
Performance: P99 read latency < 3 ms, write latency < 5 ms
Resilience: YugabyteDB is deployed in each AWS Region across multiple zones for business continuity
Disaster recovery:
• YugabyteDB clusters in two AWS Regions (us-east-1 and us-east-2)
• Bidirectional, asynchronous data replication between them
Diagram: AWS Region us-east-1 (Availability Zones 1, 2, 3) and AWS Region us-east-2 (Availability Zones 1, 2, 3), linked by async replication

Slide 28

Slide 28 text

Global retailer survives regional cloud outage
Use case: YugabyteDB powers the Global Product Catalog system serving 1.6 billion products to end customers in the US
● 36 nodes across 3 public IaaS Regions: US-East, US-West, and US-Central
● Preferred leaders implemented for low-latency access in US-Central
● Cluster- and topology-aware drivers quickly identify newly added nodes
● Supports reactive microservices and event-driven architecture patterns
Diagram: Multi-Region Product Catalog YugabyteDB cluster spanning US-West, US-Central, and US-East
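The idea behind a topology-aware driver can be sketched as a selection rule: prefer a healthy node in the closest Region, and fall back to the next Region when none is available. The code below is a plain-Python illustration of that rule, not the YugabyteDB driver's implementation or API; the `Node` type, the `pick_node` function, and the host names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    host: str
    region: str
    healthy: bool = True


def pick_node(nodes: Sequence[Node], preferred_regions: Sequence[str]) -> Optional[Node]:
    """Return the first healthy node, searching regions in preference order.

    Returns None when no healthy node exists in any listed region.
    """
    for region in preferred_regions:
        for node in nodes:
            if node.healthy and node.region == region:
                return node
    return None


# Hypothetical cluster view during a US-Central outage.
cluster = [
    Node("central-1.example.internal", "us-central", healthy=False),
    Node("east-1.example.internal", "us-east"),
    Node("west-1.example.internal", "us-west"),
]
chosen = pick_node(cluster, ["us-central", "us-east", "us-west"])
print(chosen.host if chosen else "no healthy node")
```

A real driver would also refresh its view of the cluster so that newly added nodes become candidates; that refresh step is omitted here.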

Slide 29

Slide 29 text

Global retailer survives regional cloud outage
Results achieved with YugabyteDB
● Service remained resilient and available through the Texas cloud and power outage
● Applications automatically redirected to other regions
● No data loss (RPO = 0) and RTO <10 secs
● Sustained high throughput of 250K+ TPS, geo-distributed for low-latency read access
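On the application side, an RTO of under 10 seconds is only invisible to users if requests that fail during the failover window are retried. The sketch below shows one common way to do that, retrying with exponential backoff inside a time budget. It is a generic pattern, not code from the retailer or from YugabyteDB; the function name, the 10-second default budget (borrowed from the RTO figure above), and the backoff values are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    op: Callable[[], T],
    budget_secs: float = 10.0,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Run op, retrying on ConnectionError until the time budget is spent."""
    deadline = clock() + budget_secs
    delay = base_delay
    while True:
        try:
            return op()
        except ConnectionError:
            # Give up if the next wait would run past the budget.
            if clock() + delay > deadline:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Hypothetical operation that fails twice (as during a failover), then succeeds.
attempts = {"count": 0}

def flaky_query() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("node unavailable")
    return "ok"

print(call_with_retry(flaky_query, sleep=lambda _: None))
```

Only idempotent operations are safe to retry blindly; a write that may have committed before the connection dropped needs its own handling.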

Slide 30

Slide 30 text

Lessons learned
✓ Prepared for unexpected bursts
✓ Built for expected peaks
✓ Surviving DDoS attacks
✓ Flexible expansion, anywhere
✓ Multitenancy
✓ No performance compromise

Slide 31

Slide 31 text

What can go wrong…
● Entire Region or data center failure: low probability, but we see it happen regularly
● Failures that last a while
● Complex process to “heal” once the Region/DC is back online
What you want…
● Ability to trade off between steady-state performance (latency) and potential data loss (RPO)
● Very quick recovery (low RTO)
● Ability to run DR drills, planned switchover, and chaos testing

Slide 32

Slide 32 text

Customers around the world trust YugabyteDB to run their business

Slide 33

Slide 33 text

YugabyteDB is on a path to become the default database in the enterprise
● 2016, 2021, 2022, 2023: Resilience and scale (scalable YSQL and YCQL, sync and async replication, geo-residency)
● 2024: Enhanced PG compatibility (like Postgres + built-in resilience, on-demand scaling)
● 2025: Serverless (serverless offering for small workloads that go to 0)
● 2026, 2027: Multitenancy (workload consolidation on a single cluster)
Great for:
● Workloads that need resilience (HA), scale, or geo-distribution
● Mid-size workloads that may need unpredictable scale in the future
● Low-scale workloads that require QoS
● Small-scale standalone cloud native workloads

Slide 34

Slide 34 text

Thank you!
Please complete the session survey in the mobile app
Amey Banarse
@ameybanarse