Hydra: geo-redundant failover for PostgreSQL

OnGres
April 19, 2018

High Availability (HA) is a very desirable goal for most PostgreSQL deployments, if not a strict requirement, and there is plenty of technology and literature on providing HA for a PostgreSQL cluster.

However, at Groupon we manage hundreds of database servers distributed across several datacenters around the world, and our internal policy mandates a Disaster Recovery mechanism to switch a complete datacenter over to another.

With this main requirement we have built Hydra, a soon-to-be-open-sourced solution that implements geo-redundant failover. It relies on Consul for distributed consistency and as the basis for the multi-datacenter approach, and is implemented as a lightweight agent in Java 8.


Transcript

1. ALVARO HERNANDEZ (@ahachete). CEO, DBA and Java software developer. Founder of OnGres, 8Kdata and ToroDB. Well-known member of the PostgreSQL community and world-class database expert (30+ talks in the last 2 years).
2. JOSE FINOTTO. Engineering Manager at Groupon for relational databases, leading efforts to develop solutions and integrations across the whole Groupon infrastructure ecosystem, principally focused on the database side. With more than 15 years of experience in development and database administration, Jose is especially interested in automation and process improvement.
3. Groupon in numbers: • During Q4 2017 we had 49.5 million unique customers who had bought at least one Groupon deal during the trailing 12 months. • More than a billion Groupons sold in our history. • 2017 revenue of over 2.8 billion USD. • More than 171 million downloads of our app.
4. GDS: • FreeBSD • ZFS • Snapshots, with support for PostgreSQL and MySQL: zfs rollback tank/home/database@timeframe • Ansible • Internal firewall • Package repository: Poudriere • CARP: Common Address Redundancy Protocol
5. GDS instance types: • Single instance per host ✓ Hardware dedicated to one instance. • Multiple instances per host ✓ Each instance has a share of the memory. ✓ Firewall mappings redirect traffic between the instances and the applications.
6. Groupon PostgreSQL architecture (diagram): behind the DC firewall, two hosts each run several instances; every instance consists of a PgBouncer (an RO port and an RW port, e.g. :5433/:5434) in front of PostgreSQL (e.g. :5432) on its own ZFS volume, and CARP provides the virtual IPs (an RO VIP and an RW VIP).
7. GDS numbers: • We have over 1,900 clusters, approx. 1,000 of them production clusters. • The balance is roughly 45% PostgreSQL and 45% MySQL. • At the moment, 3 datacenters around the globe.
8. HA problem: background. • Mission-critical services need to survive a whole datacenter (DC) failure. • Databases are replicated cross-DC, following a master + secondaries (with cascading replication) architecture. • How do we ensure exactly one master globally? How do we failover within a DC? How do we failover a whole cluster to another DC?
9. Hydra: • A multi-datacenter-aware high availability solution. • Ensures a single global master and provides intra- and inter-DC failover. • Service-independent layer: currently supports PostgreSQL and MySQL. • Offers an HTTP/REST management API for integration. • A project developed by Groupon and OnGres (8Kdata). • The source code will be open sourced soon.
10. Multi-cluster, multi-datacenter: • A single Hydra can manage multiple database clusters. • Each cluster may span multiple datacenters. Hydra knows where the primary and secondaries are, and which to promote, stop, etc. • It relies on HashiCorp's Consul for consistent, distributed consensus and as the basis for multi-datacenter support.
11. Failover features: • Manual or automatic (automatic will not be used in production). • Secondaries are automatically re-pointed to the new master (even in cascading scenarios). • An algorithm for new-master election on automatic failover, and detection of secondaries that are ahead of the newly elected master. • Graceful shutdown for masters.
12. Service abstraction: • Hydra is not designed specifically for PostgreSQL, but rather around a service abstraction that assumes a primary (read-write) + secondaries (read-only) architecture with data replication. • The current version supports PostgreSQL and MySQL. • A service-dependent layer is provided for every service (promote/stop operations, lag monitoring, etc.).
13. Server-agent architecture: • Hydra installs an agent (lightweight Java 8) alongside a Consul agent and the service (PostgreSQL/MySQL). • A (separate) Hydra server (also lightweight Java 8) exposes a REST API to manage all the datacenters and clusters. Stateless. • Master and secondary nodes are exposed via (consistent) DNS to the database clients.
14. The entry point: DNS. • The entry point is one of the difficult parts of an HA architecture. • How do you inform database clients of topology changes? • Use DNS: this is the alternative chosen.
15. The entry point: DNS. • Hydra uses Consul DNS. • A Consul cluster is deployed in each datacenter. • When Consul DNS is queried in any datacenter, it gives a consistent (or nearest) response. • Hydra updates the following DNS names for each service: ✓ primary ✓ secondaries: tries to return the standby nearest to the querying datacenter ✓ localsecondaries: standbys in the datacenter where the primary is ✓ remotesecondaries: standbys not in the datacenter where the primary is ✓ tertiaries: standbys in cascading replication.
16. Detailed multi-datacenter architecture (diagram): two datacenters, SNC1 and SAC1, each with its own Consul cluster; every PostgreSQL node (the master and its secondaries, including cascading replicas) runs a Hydra agent and a Consul agent; a stateless Hydra server exposes the REST API for GUI/other Hydra management; GDS clients locate nodes via DNS queries in SNC1 and SAC1; replication (including cascading replication) flows between the nodes, and the Consul clusters synchronize via (async) gossip.
17. How Hydra (Java) manages PG: • ProcessBuilder allows the agent to work as a DBA does ✓ used to start/stop/promote via pg_ctl or custom scripts ✓ also to execute core utils to read/write recovery.conf (since some environments require sudo/su). • JDBC is great for executing queries and system functions and retrieving the results.
18. REST API: • Hydra can run as an agent (co-located with the Consul agent and the service it manages) or as a server. • The server exposes a REST API for all supported operations, based on HTTP endpoints with JSON. • The API is documented and exposed via Swagger, allowing automatic generation of clients in several languages. • All operations are asynchronous.
19. REST API: Swagger (II). • Include here one (no more) good screenshot of Swagger (ask Adrián). • It should contain a good example of one or more methods, with plenty of color. Perhaps some methods expanded and some collapsed, so the full list is visible. Or the detail of one method; or both, in which case two screenshots are better!
20. API authentication: • User authentication is performed with sessions, which are maintained via cryptographically signed JWT tokens. • Tokens are created via login, currently done with API key/secret pairs. Tokens could also be generated via OAuth2 (Google/GitHub/other integrations), which is great for auditing. • Groupon requested simplified auth: pre-created tokens and infinite-duration sessions (no login required, revocable).
21. Hydra deployment: • With support from Groupon, it's fully Ansible-ized. • The Hydra agent requires a co-located Consul agent and a specific config file, which is service-dependent (PostgreSQL or MySQL). • The Hydra server requires a separate, simpler config file. • Java 8 (and Consul) are the only prerequisites. Tested and validated on Groupon's FreeBSD with OpenJDK 8.
22. Testing framework & Docker: • Since the first demo, the Hydra source code supports launching a full Hydra architecture as a Docker fleet from Maven (the Java build tool), with no prerequisites other than Java, Maven and Docker. It is now the CI environment of the project. • An automated test suite that continuously causes failovers (exercising automated new-master election and node rejoining) will be published with v1.1 for deeper testing.
23. Status and open-sourcing Hydra: • Hydra is going live at Groupon in H1 2018. • It is currently closed-source. • Groupon is considering open-sourcing Hydra under a BSD license.