Zero downtime with little effort

YugabyteDB Japan
September 28, 2023

In this presentation, Authlete explains how they achieve database updates without downtime using YugabyteDB.

Transcript

  1. Authlete introduction
      - Founded 2015 in Tokyo
      - About 30 members from/in a dozen countries
      - Identity component as a service
      - Authorization backend for backends
      - Managed (SaaS) or on-premise
  2. Authlete infrastructure (dedicated cloud)
      [Diagram: GKE Kubernetes cluster (multi-AZ) running three Authlete servers, each paired with a CloudSQL proxy, connected to CloudSQL MySQL HA (primary and failover).]
  3. Background on problem (1/2)
      CloudSQL maintenance
      - Approx. quarterly
      - 1 to 2 minutes of downtime
      - Notified in advance
      - Need to notify customers
      - Customers often want to reschedule (possible within 28 days of the date)
      - High support overhead
      About 5 min/year of downtime (roughly 0.001% of the year, i.e. about 99.999% availability) is itself not a big issue, but highly opinionated customers are.
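As a rough check of the downtime figure on this slide, the sketch below converts minutes of downtime per year into percentages. The 4 and 8 minute cases assume four maintenance windows per year at 1 and 2 minutes each; the function names are illustrative and not from the talk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_percent(minutes_down_per_year: float) -> float:
    """Share of the year spent down, as a percentage."""
    return 100.0 * minutes_down_per_year / MINUTES_PER_YEAR


def availability_percent(minutes_down_per_year: float) -> float:
    """Share of the year spent up, as a percentage."""
    return 100.0 - downtime_percent(minutes_down_per_year)


if __name__ == "__main__":
    # 4 = quarterly windows of 1 minute, 8 = quarterly windows of 2 minutes,
    # 5 = the approximate yearly figure quoted on the slide.
    for minutes in (4, 5, 8):
        print(
            f"{minutes} min/year -> "
            f"{downtime_percent(minutes):.4f}% downtime, "
            f"{availability_percent(minutes):.4f}% availability"
        )
```

Under these assumptions, a few minutes per year stays in "five nines" territory, which matches the slide's point that the support overhead, not the downtime itself, is the real cost.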
  4. Background on problem (2/2)
      Performance
      - We are still far from hitting a limit that scaling up cannot solve
      - But we have had on-premise customers who are looking for very high scalability
      High availability
      - Ability to distribute cross-region (and globally) opens up new options for us
      - Being able to scale up/out the DB without downtime is an advantage
  5. Solutions we explored, a.k.a. “Why don’t you just use Spanner?”
      We have discussed Vitess, Spanner, CockroachDB, YugabyteDB, and TiDB since 2020
      - Prefer managed solutions (don’t want to hire a DBA team)
      - Need good compatibility with our Java DB middleware (MySQL/Postgres)
      - Ideally should offer an on-premise DB solution for our on-premise customers
      - Good support and partnership opportunities
  6. Getting Started (Initial tests)
      To make sure YugabyteDB was compatible with Authlete, the first step was to pass our unit tests with Yugabyte as the DB.
      [Diagram: Authlete backed by PostgreSQL / MySQL, next to Authlete backed by YugabyteDB.]
  7. Initial tests
      Just had to set one flag to pass all tests.
      [Diagram: Authlete on YugabyteDB without the flag 🥲 (some tests fail); Authlete on YugabyteDB with yb_enable_read_committed_isolation=true 😄 (all tests pass).]
  8. Performance tuning
      Next, we tried running a load test.
      - We want to observe the behavior under stress (errors, panics, timeouts)
      - Measure performance and draw conclusions
      • Create a cluster on Yugabyte Managed
      • Add the Yugabyte smart driver
      [Diagram labels: 3 Authlete pods on GKE; Authlete perftest client; VPC peering; Yugabyte Managed cluster on GCP.]
  9. Performance tuning
      Thanks to Gwenn, we tuned the performance even further:
      - Enabled a bunch of flags (for the most part, I didn’t know what these flags did)
      - We tested upgrading the Yugabyte Managed cluster while running a load test
      - We achieved satisfactory results
  10. Column type change
      - Liquibase rollback was failing due to an unsupported data type change
      - At that time, PostgreSQL supported this change but Yugabyte did not
      - The Yugabyte team provided support and helped us fix our issue in a recent version
  11. Retries and refresh timeout
      - Reworked DB calls to handle retries
      - Lowered the Yugabyte Smart driver node list refresh timeout
      [Diagram: Authlete holds node list 1, 2, 3 for three YB nodes; after refresh and retry, the node list is 1, 3. Zero downtime retry.]
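The retry-plus-refresh idea on this slide can be illustrated with a toy model. The sketch below is not Authlete's code and does not use the Yugabyte smart driver; `Cluster`, `Client`, `NodeDown`, and the refresh-on-failure behavior are all hypothetical names and assumptions chosen only to show why a stale node list causes errors and how a retry after a refresh hides them from callers.

```python
import time


class NodeDown(Exception):
    """Raised when the chosen node cannot be reached (hypothetical error type)."""


class Cluster:
    """Toy stand-in for a multi-node database cluster."""

    def __init__(self, nodes):
        self.live = set(nodes)

    def live_nodes(self):
        return sorted(self.live)

    def query(self, node, sql):
        if node not in self.live:
            raise NodeDown(f"node {node} is unreachable")
        return f"node {node}: ok ({sql})"


class Client:
    """Caches the node list; refreshes it when stale or after a failure."""

    def __init__(self, cluster, refresh_interval_s):
        self.cluster = cluster
        self.refresh_interval_s = refresh_interval_s
        self.nodes = cluster.live_nodes()
        self.last_refresh = time.monotonic()
        self.next_index = 0

    def refresh(self):
        self.nodes = self.cluster.live_nodes()
        self.last_refresh = time.monotonic()

    def pick_node(self):
        # A shorter interval means a dead node leaves the cached list sooner.
        if time.monotonic() - self.last_refresh >= self.refresh_interval_s:
            self.refresh()
        node = self.nodes[self.next_index % len(self.nodes)]  # round-robin
        self.next_index += 1
        return node

    def execute(self, sql, max_attempts=3, backoff_s=0.01):
        for attempt in range(1, max_attempts + 1):
            node = self.pick_node()
            try:
                return self.cluster.query(node, sql)
            except NodeDown:
                if attempt == max_attempts:
                    raise
                time.sleep(backoff_s * attempt)  # simple linear backoff
                self.refresh()  # assumption: refresh eagerly on failure


if __name__ == "__main__":
    cluster = Cluster([1, 2, 3])
    client = Client(cluster, refresh_interval_s=3600)
    print(client.execute("SELECT 1"))
    cluster.live.discard(2)  # node 2 goes away, e.g. during a rolling upgrade
    print(client.execute("SELECT 1"))
    print("node list after retry:", client.nodes)
```

In this model the caller never sees the failure of node 2: the first attempt fails, the node list is refreshed to 1, 3, and the retry succeeds, which mirrors the before/after node lists shown on the slide.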
  12. Integration
      - Easy integration in CI/CD pipelines
      - Almost the same arguments as PostgreSQL
      - Built-in tools in the Docker image to manage the DB
  13. Conclusions and next steps
      Pilot project for SaaS
      - We are planning on running a pilot project with customer cooperation
      - Evaluation of real-world use of Yugabyte-backed Authlete SaaS
      Yugabyte officially supported for on-premise customers
      - We are communicating to our customers that Yugabyte is a supported DB option
      General availability as an option for SaaS customers
      - Upon completion of the pilot project, we expect to offer Yugabyte as an HA option