Slide 1

Slide 1 text

Take the Stress out of Retail Peak Traffic Events
Unmasking Black Friday Database Frights
October 31, 2023
© 2023 All Rights Reserved

Slide 2

Slide 2 text

YugabyteDB Speakers
Amey Banarse, VP, Global Field Engineering
Michael Haag, Director, Product Marketing

Slide 3

Slide 3 text

State of the retail industry today
Constant Change: New normal since 2020
Global Competition: Cross-category comparisons
Data First Mentality: Data = Differentiation

Slide 4

Slide 4 text

Two Stages of DevOps/SREs at Retailers

Slide 5

Slide 5 text

Two Stages of DevOps/SREs at Retailers
Stressed
Preparing for Stress
November & December

Slide 6

Slide 6 text

Why is it so challenging?

Slide 7

Slide 7 text

All critical services depend on a transactional database

Slide 8

Slide 8 text

Time to make your life easier by rethinking your business-critical services with a distributed SQL database.

Slide 9

Slide 9 text

Four e-commerce scenarios that don’t need to be stressful anymore
React to Peak Demand
Support Global Shoppers
Deliver Accurate Inventory
Survive the Unexpected

Slide 10

Slide 10 text

Trick (or Treat) #1: Architect for Peak Demand

Slide 11

Slide 11 text

Peak Events Place Huge Burdens on DevOps/SRE Teams
● Unpredictable traffic
● Huge deviation from the “norm” that can be hard to test and model
● Uncertainty around changes or new services
● Limited team availability during major holidays
● Must lock environment way in advance

Slide 12

Slide 12 text

Challenges retailers may face with existing database
Manual sharding is hard and time consuming
Single system requires complex vertical scaling
Multiple DBs to deliver high throughput

Slide 13

Slide 13 text

How can a distributed database help?
Automatic sharding
On-demand scaling
Performance at scale

Slide 14

Slide 14 text

How do you architect for scale?
Option 1: Scale up (migrate to larger machines)
● Limits scalability
● Disruptive
● Not easy to scale back down when demand shrinks
Option 2: Manually shard PostgreSQL
● Operationally challenging (requires in-house experts)
● Fragile
● Application burden
Option 3: Add caches / read replicas to PostgreSQL
● Does not solve write scalability
● App becomes complex (primary & replica endpoints)
● Cache coherence, operationally hard
Option 4: Scale out (and in) automatically (Distributed SQL)
● Operationally simple (database handles the scaling)
● Transparent to applications
● Scale out and scale back in as needed
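As a rough illustration of what "scale out automatically" in Option 4 can look like, the sketch below hashes each row key into a fixed hash space and assigns contiguous ranges of that space to shards. It is a simplified model under stated assumptions, not any database's actual implementation: the 16-bit hash space, the use of MD5, and the names `HASH_SPACE`, `key_hash`, `shard_for_key`, and `distribution` are all hypothetical choices made for this example.

```python
import hashlib

# Assumed size of the hash space for this sketch (16-bit, 0x0000..0xFFFF).
HASH_SPACE = 0x10000


def key_hash(key: str) -> int:
    """Map a key to a stable integer in [0, HASH_SPACE)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first two bytes of the digest as a 16-bit hash value.
    return int.from_bytes(digest[:2], "big")


def shard_for_key(key: str, num_shards: int) -> int:
    """Return the index of the shard whose hash range contains the key.

    The hash space is cut into num_shards contiguous, near-equal ranges.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return key_hash(key) * num_shards // HASH_SPACE


def distribution(keys, num_shards: int):
    """Count how many of the given keys land on each shard."""
    counts = [0] * num_shards
    for key in keys:
        counts[shard_for_key(key, num_shards)] += 1
    return counts


if __name__ == "__main__":
    skus = [f"sku-{i}" for i in range(10_000)]
    print("4 shards:", distribution(skus, 4))
    print("8 shards:", distribution(skus, 8))
```

Because shards own contiguous hash ranges, doubling the shard count in this model splits each range in two: a key on shard `i` of 4 lands on shard `2*i` or `2*i + 1` of 8, so the application never has to re-map keys itself. That is the contrast with Option 2, where the routing logic lives in the application.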

Slide 15

Slide 15 text

Real-World Example: Top 5 Global Retailer
Extensive Product Catalog
● System of record for all products sold online and across thousands of physical stores
● Over 100M items, including products from external merchants & partners
Key Requirements
● High throughput, support multiple TBs in a few hours from Kafka and refresh entire catalog nightly while handling transactional updates
● Low latency with 90:10 reads to writes
● Multi-region with strong consistency, transactions & secondary indexes
● Easily scale during peak seasons with guaranteed resilience
● Non-disruptive Day 2 operations (scale up / down, replacing nodes, …)
● Platform agnostic for on-prem data center and public cloud Azure IaaS
Past Challenges
Several issues with previous Apache Cassandra database:
● Data not consistent in global product catalog
● Needed scale and flexibility
● Needed Multi-region deployments with strong consistency & data accuracy

Slide 16

Slide 16 text

Real-World Example: Top 5 Global Retailer
Results achieved with YugabyteDB
● Handling holiday peak traffic since 2020 season
● Scaled cluster seamlessly to >150 nodes of YugabyteDB during peak traffic events in 2022 (Black Friday, Cyber Monday, …)
● Linear scale: >250K business transactions/sec; 1.25M DB IOPS; 3+ billion product mappings
● Low latencies: P99 read latencies within 3-5 ms; read optimizations with low latency access using Preferred Leaders
● Multi-region, transactional writes ~75 ms; deployed in RF3 across 3 Azure Regions: US-East, US-West and US-Central
Extensive Product Catalog
● System of record for all products sold online and across thousands of physical stores
● Over 100M items, including products from external merchants & partners

Slide 17

Slide 17 text

Trick (or Treat) #2: Design for Global Shoppers

Slide 18

Slide 18 text

Cross-border shopping is becoming increasingly popular
54% of US digital shoppers bought from foreign sites
$1T Cross-border e-Commerce in 2020

Slide 19

Slide 19 text

Challenges you may face with existing database
Single writer means high latency
Complex deployment configurations
Hard to meet data residency requirements

Slide 20

Slide 20 text

How can a distributed database help?
Single, unified database for global customer reach
Deployment flexibility
Geo-distribution for compliance and performant user experience

Slide 21

Slide 21 text

Example Global deployment spanning US, EU & APJ Regions
● Single YB cluster providing Strong Consistency across multi-region
● Scalable and highly available operational data tier
● Business continuity, able to withstand Region failure with RPO=0 and low RTO < 10s
● Geo-partitioning, Data Locality & Compliance
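The claim that a three-region cluster can lose a region with RPO=0 follows from majority-quorum arithmetic, sketched below. This assumes Raft-style replication in which a write is acknowledged only after a majority of replicas have it, with one replica per region; the function names are hypothetical and the model ignores leader election time (which is what the RTO figure covers).

```python
def majority(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority quorum."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """How many replicas can be lost while a majority still survives."""
    return replication_factor - majority(replication_factor)


def can_commit(replication_factor: int, healthy_replicas: int) -> bool:
    """True if enough replicas are healthy to acknowledge new writes."""
    return healthy_replicas >= majority(replication_factor)


if __name__ == "__main__":
    rf = 3  # one replica in each of three regions (assumption for this sketch)
    print("majority needed:", majority(rf))
    print("region failures tolerated:", tolerated_failures(rf))
    print("writes continue with one region down:", can_commit(rf, rf - 1))
    print("writes continue with two regions down:", can_commit(rf, rf - 2))
```

With a replication factor of 3, every acknowledged write already exists in at least two regions, so losing any single region cannot lose acknowledged data. Losing two regions would stop writes in this model rather than risk inconsistency.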

Slide 22

Slide 22 text

Flexible deployment options in a single database
● Stretched Cluster (Synchronous Replication)
  Primary Use Case: Active-Active-Active config with Zone-level or region-level resilience
  Performance Notes: RPO=0 and RTO=3-10s; low latency reads & high throughput
● xCluster Asynchronous Replication
  Primary Use Case: Active-Active or Active-Passive for disaster recovery solution
  Performance Notes: Very low read and write latency, high throughput in each cluster
● Read Replicas
  Primary Use Case: Fast reads for globally distributed customers
  Performance Notes: Low latency, high-throughput reads
● Row-level Geo-Partitioning
  Primary Use Case: Satisfy data residency, compliance and regulatory requirements
  Performance Notes: Data pinned to specific geographic locations and high performance in region
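To make the row-level geo-partitioning option more concrete, here is a minimal sketch of the underlying idea: a column on each row decides which geography stores it. In a real deployment the database does this placement itself; the `Order` type, the `RESIDENCY_RULES` mapping, and the function names below are hypothetical and exist only for illustration, with the US, EU and APJ geographies taken from the example deployment earlier in the deck.

```python
from dataclasses import dataclass

# Hypothetical mapping from a row's country code to the geography that
# must store it. Real residency rules would come from compliance teams.
RESIDENCY_RULES = {
    "US": "us",
    "DE": "eu",
    "FR": "eu",
    "IN": "apj",
    "JP": "apj",
}


@dataclass(frozen=True)
class Order:
    order_id: int
    country: str  # the column that drives placement in this sketch


def partition_for(order: Order) -> str:
    """Return the geography whose partition must hold this row."""
    try:
        return RESIDENCY_RULES[order.country]
    except KeyError:
        # Refuse to guess a location for data with no residency rule.
        raise ValueError(f"no residency rule for country {order.country!r}") from None


def group_by_partition(orders):
    """Group order IDs by the geography that stores them."""
    placement = {}
    for order in orders:
        placement.setdefault(partition_for(order), []).append(order.order_id)
    return placement


if __name__ == "__main__":
    orders = [Order(1, "US"), Order(2, "DE"), Order(3, "JP"), Order(4, "FR")]
    print(group_by_partition(orders))
```

The sketch fails loudly for a country with no rule instead of falling back to a default region, since silently storing data in the wrong geography is the outcome residency requirements are meant to prevent.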

Slide 23

Slide 23 text

Admiral Transforms Visitor Relationship Management with YugabyteDB Managed
Key Initiative
Utilize YugabyteDB's capabilities for improved targeting and user experiences for the Visitor Relationship Management platform.
Challenges
● ClickHouse (existing database) lacked horizontal scalability to support growing customer base
● Manage database sprawl across multiple databases
● Higher TCO due to higher maintenance and infrastructure costs
YugabyteDB Managed Business Impact
● Lower TCO with reduced maintenance and infrastructure costs
● Eliminate database sprawl and complexity with simplified infrastructure
● Scale horizontally to meet growing needs

Slide 24

Slide 24 text

Trick (or Treat) #3: Ensure Accurate Inventory Results

Slide 25

Slide 25 text

What happens when your data isn’t accurate…

Slide 26

Slide 26 text

What happens when your data isn’t accurate…
… you have to send one of these to your customer!

Slide 27

Slide 27 text

Challenges you may face with existing database
Only supports eventual consistency
Failures lead to downtime or data loss
Complex app logic slows innovation and UX

Slide 28

Slide 28 text

How can a distributed database help?
Native, consistent CDC
Built-in resilience
ACID transactions with strongly consistent data

Slide 29

Slide 29 text

Narvar Achieves 4x Lower TCO
“Partnering with Yugabyte helps us focus on our customers instead of worrying if our systems can keep pace with our rapidly growing business.”
Ram Ravichandran, CTO and co-founder
Key Initiative
Narvar was challenged with AWS databases becoming expensive at scale, while customers and data privacy regulations demanded a multi-cloud solution.
Challenges
● Struggled to control costs of AWS DynamoDB
● Unable to satisfy multi-cloud requirements
● Need to scale on demand during peak seasons
● GDPR compliance required multi-region solution
YugabyteDB Business Impact
Narvar switched to YugabyteDB and achieved 4x lower TCO, simplified ops and met key performance goals:
● 10k+ reads per second
● <3ms read latency
● <10ms write latency

Slide 30

Slide 30 text

Trick (or Treat) #4: Mitigate Risks for Business as Normal

Slide 31

Slide 31 text


Slide 32

Slide 32 text

Challenges you may face with existing database
Expensive, bolted-on resilience
Poor RTO & RPO hurts customer experience
Decrease in DBA and app team efficiency

Slide 33

Slide 33 text

How can a distributed database help?
Built-in resilience
Very Low RPO/RTO
Increase productivity

Slide 34

Slide 34 text

Global Retailer’s Multi-Region Deployment Shrugs Off Major Storm
● 27 Nodes across 3 Azure Regions: US-East, US-West Seattle and US-Central
● Strong consistency between all the Regions
● System of Record for Product Catalog of 100+ million items with Billions of mappings, serving over 250K ops/sec
Map labels: US-East, US-West, US-Central

Slide 35

Slide 35 text

Global Retailer’s Multi-Region Deployment Shrugs Off Major Storm
● 27 Nodes across 3 Azure Regions: US-East, US-West Seattle and US-Central
● Strong consistency between all the Regions
● System of Record for Product Catalog of 100+ million items with Billions of mappings, serving over 250K ops/sec
● Service remained resilient and available through Texas cloud / power outage
● No data loss, RPO = 0, RTO < 10 secs
Map labels: US-East, US-West, US-South-Texas

Slide 36

Slide 36 text

Closing Thoughts and Next Steps

Slide 37

Slide 37 text

Four e-commerce scenarios that don’t need to be stressful anymore
React to Peak Demand
Support Global Shoppers
Deliver Accurate Inventory
Survive the Unexpected

Slide 38

Slide 38 text

Simplified DBaaS for all Use Cases
● Cluster Provisioning, Scaling - Horizontal & Vertical on any IaaS
● Automated Day 2 Operations
  ○ Rolling Upgrades
  ○ Security Key Rotations
  ○ Backups & Recovery
  ○ Point in Time Restore
● Enterprise Integrations
  ○ Observability
  ○ Metrics & Monitoring
Diagram labels: Orchestration Engine API User Interface Universes & Infrastructure User Interaction DevOps API CI/CD Automation

Slide 39

Slide 39 text

With YugabyteDB, you can architect for zero downtime
Multi-Region xCluster: Async replication between two YB clusters in different regions
Multi-Region Stretch: Sync replication across regions within a cluster
● Assume nodes and zones will fail often
● A database should offer continuous availability without heavy lift
● Users should have zero impact
Map labels: US-West, US-Central, US-East, St Louis, Kansas City

Slide 40

Slide 40 text

A distributed, transactional database with built-in resilience, seamless scalability, and more, with Postgres-compatible and CQL-inspired APIs.
We’ve eliminated SQL and NoSQL limitations while simplifying the application architecture for building next-generation services and for embracing database modernization.

Slide 41

Slide 41 text

Learn More
yugabyte.com/retail-ecommerce/
cloud.yugabyte.com

Slide 42

Slide 42 text

Thank You
Join us on Slack: www.yugabyte.com/slack
Star us on GitHub: github.com/yugabyte/yugabyte-db