Slide 1

Slide 1 text

HYDRA: Multi-datacenter HA for PostgreSQL & MySQL

Slide 2

Slide 2 text

ALVARO HERNANDEZ (@ahachete), CEO
DBA and Java Software Developer
OnGres, 8Kdata and ToroDB Founder
Well-known member of the PostgreSQL Community
World-class database expert (30+ talks in the last 2 years)

Slide 3

Slide 3 text

JOSE FINOTTO, Engineering Manager
Engineering Manager at Groupon for relational databases, leading the development of solutions and integrations across the Groupon infrastructure ecosystem, with a primary focus on databases. With more than 15 years of experience in development and database administration, Jose is particularly interested in automation and process improvement.

Slide 4

Slide 4 text

Groupon in numbers
● During Q4 2017, we had 49.5 million unique customers who had bought at least one of Groupon's deals during the trailing 12 months.
● More than a billion Groupons sold in our history.
● 2017 revenue of over USD 2.8 billion.
● More than 171 million downloads of our app.

Slide 5

Slide 5 text

Groupon and relational databases
History: Monolithics, Microservices, full stack GDS

Slide 6

Slide 6 text

GDS
● FreeBSD
● ZFS
● Snapshots: zfs rollback tank/home/database@timeframe
● Support for PostgreSQL and MySQL
● Ansible
● Internal Firewall
● Package repository: Poudriere
● CARP: Common Address Redundancy Protocol

Slide 7

Slide 7 text

GDS Instance Types
● Single instance per host
✓ Hardware dedicated to one instance.
● Multiple instances per host
✓ Each instance has a share of the memory.
✓ Firewall mappings redirect the traffic between the instances and the application.

Slide 8

Slide 8 text

Groupon PostgreSQL Architecture (diagram)
Elements shown: DC Firewall; Host 1 and Host 2; CARP; virtual IPs (RO VIP, RW VIP).
Three instances are shown, each with its own PostgreSQL port, PgBouncer RO and RW ports, and ZFS volume:
PG :5432, PgBouncer RO :5433 / RW :5434
PG :5435, PgBouncer RO :5436 / RW :5437
PG :5438, PgBouncer RO :5439 / RW :5440

Slide 9

Slide 9 text

GDS Numbers
● We have over 1900 clusters.
● Approx. 1000 production clusters.
● The balance is 45% PostgreSQL and 45% MySQL.
● At the moment, 3 datacenters around the globe.

Slide 10

Slide 10 text

HA problem: background
● Mission-critical services need to survive the failure of a whole datacenter (DC).
● Databases are replicated cross-DC and follow a master + secondary architecture (with cascading replication).
● How do we ensure exactly one master globally? How do we fail over within a DC? How do we fail over a whole cluster to another DC?

Slide 11

Slide 11 text

Hydra
● A multi-datacenter-aware high availability solution.
● Ensures a single global master and provides intra- and inter-DC failover.
● Service-independent layer: currently implements PostgreSQL and MySQL.
● Offers an HTTP/REST management API for integration.
● Project developed by Groupon and OnGres (8Kdata).
● The source code will be open sourced soon.

Slide 12

Slide 12 text

Multi-cluster, multi-datacenter
● A single Hydra can manage multiple database clusters.
● Each cluster may span multiple datacenters. Hydra knows where the primary and secondaries are, and which to promote, stop, etc.
● It relies on HashiCorp's Consul for consistent, distributed consensus and as the basis for multi-datacenter support.
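The single-global-master guarantee can be pictured as a lock held in a consistent store: a node may act as master only while it holds the lock for its cluster. The following is a minimal sketch under stated assumptions, not Hydra's code. `ConsistentKV` is a hypothetical in-memory stand-in for a consistent key-value store (Consul offers session-based locks on its KV store), and the key layout and node names are invented for illustration.

```python
import threading


class ConsistentKV:
    """In-memory stand-in for a consistent key-value store (hypothetical)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def acquire(self, key, owner):
        """Atomically take the lock at `key`; succeed only if free or already ours."""
        with self._mutex:
            holder = self._data.get(key)
            if holder is None or holder == owner:
                self._data[key] = owner
                return True
            return False

    def release(self, key, owner):
        """Release the lock only if `owner` currently holds it."""
        with self._mutex:
            if self._data.get(key) == owner:
                del self._data[key]
                return True
            return False

    def holder(self, key):
        """Return the current lock holder, or None if the lock is free."""
        with self._mutex:
            return self._data.get(key)


def try_become_master(kv, cluster, node):
    """A node may act as master only while it holds the cluster-wide lock."""
    # The key layout "clusters/<name>/master" is an assumption for this sketch.
    return kv.acquire(f"clusters/{cluster}/master", node)
```

Because acquisition is atomic, two nodes in different datacenters racing for the same cluster cannot both succeed; the loser stays a secondary until the holder releases the lock.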

Slide 13

Slide 13 text

Failover features
● Manual or automatic (automatic failover will not be used in production).
● Secondaries are automatically pointed to the new master (even in cascading scenarios).
● Algorithm for new master election on automatic failover, with detection of secondaries that are ahead of the new master.
● Graceful shutdown for masters.
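One common way to elect a new master is to pick the secondary that has replayed the most WAL. The sketch below illustrates that idea and the detection of secondaries that ended up ahead of the chosen node. It is an assumption about how such an election could work, not the Hydra algorithm shown on the diagram slides; the `Secondary` type, the tie-break on a preferred datacenter, and the node names are hypothetical. The only PostgreSQL fact relied on is the LSN text format (`high/low` in hexadecimal).

```python
from dataclasses import dataclass


def parse_lsn(lsn):
    """Convert a PostgreSQL LSN such as '16/B374D848' to an integer byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass(frozen=True)
class Secondary:
    name: str
    datacenter: str
    replayed_lsn: str  # e.g. the result of pg_last_wal_replay_lsn() on PostgreSQL 10+


def elect_new_master(candidates, preferred_dc):
    """Pick the most advanced secondary; break ties in favour of preferred_dc."""
    if not candidates:
        raise ValueError("no candidates available for promotion")
    return max(
        candidates,
        key=lambda s: (parse_lsn(s.replayed_lsn), s.datacenter == preferred_dc),
    )


def ahead_of(new_master, others):
    """Secondaries that replayed more WAL than the new master."""
    limit = parse_lsn(new_master.replayed_lsn)
    return [s for s in others if parse_lsn(s.replayed_lsn) > limit]
```

A secondary reported by `ahead_of` cannot simply be re-pointed at the new master, since it holds data the new master never received; it would need to be rewound or rebuilt first.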

Slide 14

Slide 14 text

Service abstraction
● Hydra is not designed specifically for PostgreSQL, but rather for a service abstraction, which assumes a primary (read-write) + secondaries (read-only) architecture with data replication.
● The current version supports PostgreSQL and MySQL.
● A service-dependent layer is provided for every service (promote/stop operations, lag monitoring, etc.).
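The split between a service-independent core and a service-dependent layer can be sketched as an abstract interface plus generic logic that only calls that interface. This is an illustrative sketch, not Hydra's actual (Java) API: the class names, method names, and the lag threshold are all assumptions.

```python
from abc import ABC, abstractmethod


class ReplicatedService(ABC):
    """Hypothetical service-dependent layer: one subclass per supported service."""

    @abstractmethod
    def promote(self):
        """Turn this secondary into the primary (read-write) node."""

    @abstractmethod
    def stop(self):
        """Shut the service down gracefully."""

    @abstractmethod
    def replication_lag_seconds(self):
        """Return how far this node is behind its upstream, in seconds."""


class InMemoryService(ReplicatedService):
    """Toy implementation used only to exercise the generic logic below."""

    def __init__(self, role, lag_seconds=0.0):
        self.role = role
        self.running = True
        self._lag_seconds = lag_seconds

    def promote(self):
        self.role = "primary"

    def stop(self):
        self.running = False

    def replication_lag_seconds(self):
        return self._lag_seconds


def switch_over(old_primary, new_primary, max_lag_seconds):
    """Service-independent switchover: it only calls the abstract operations."""
    if new_primary.replication_lag_seconds() > max_lag_seconds:
        raise RuntimeError("candidate is too far behind to be promoted")
    old_primary.stop()
    new_primary.promote()
```

Supporting another database would then mean adding one more subclass, while `switch_over` stays unchanged.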

Slide 15

Slide 15 text

Server-agent architecture
● Hydra installs an agent (lightweight, Java 8) alongside a Consul agent and the service (PG/MySQL).
● A separate Hydra server (also lightweight, Java 8) exposes a REST API to manage all the datacenters and clusters. It is stateless.
● Master and secondary nodes are exposed via (consistent) DNS to clients of the databases.

Slide 16

Slide 16 text

Basic architecture

Slide 17

Slide 17 text

The entry point: DNS
● The entry point is one of the difficult parts of an HA architecture.
● How do you inform database clients of topology changes?
● Use DNS. This is the alternative chosen.

Slide 18

Slide 18 text

The entry point: DNS
• Hydra uses Consul DNS.
• A Consul cluster is deployed in each data center.
• In any data center, a Consul DNS query returns the consistent or nearest response.
• Hydra updates the following DNS names for each service:
✓ primary
✓ secondaries: tries to return the standby nearest to the data center
✓ localsecondaries: standbys in the data center where the primary is
✓ remotesecondaries: standbys not in the data center where the primary is
✓ tertiaries: standbys in cascading replication
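The DNS names listed above can be derived from the replication topology. The sketch below shows one way to do that grouping, assuming a hypothetical `Node` record and reading "tertiaries" as standbys that replicate from another standby rather than from the primary. Nearest-first ordering of `secondaries` is left to the DNS layer and not modelled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    name: str
    datacenter: str
    upstream: Optional[str]  # node this one replicates from; None for the primary


def dns_groups(nodes):
    """Group node names under the DNS names from the slide."""
    primary = next(n for n in nodes if n.upstream is None)
    standbys = [n for n in nodes if n.upstream is not None]
    return {
        "primary": [primary.name],
        "secondaries": [n.name for n in standbys],
        "localsecondaries": [
            n.name for n in standbys if n.datacenter == primary.datacenter
        ],
        "remotesecondaries": [
            n.name for n in standbys if n.datacenter != primary.datacenter
        ],
        # Assumption: a tertiary is a standby fed by another standby (cascading).
        "tertiaries": [n.name for n in standbys if n.upstream != primary.name],
    }
```

After a failover, recomputing these groups from the new topology and publishing them is what lets clients follow the change without reconfiguration.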

Slide 19

Slide 19 text

Detailed, multiple-datacenters architecture (diagram)
Two datacenters, SNC1 and SAC1, each with its own Consul cluster; the Consul clusters communicate via gossip (async). Every PG node (the PG master and the other PG nodes) runs a Hydra agent and a Consul agent. Arrows show replication and cascading replication between PG nodes. GDS clients use DNS, with DNS queries in SAC1 and SNC1. The Hydra server exposes a REST API to the GUI and other Hydra management.

Slide 20

Slide 20 text

Automatic failover algorithm

Slide 21

Slide 21 text

Automatic failover algorithm

Slide 22

Slide 22 text

Automatic failover algorithm

Slide 23

Slide 23 text

How Hydra (Java) manages PG
• ProcessBuilder allows an agent to work as a DBA does
✓ Used to start/stop/promote using pg_ctl / custom scripts
✓ Also used to execute core utils to read/write recovery.conf (since some environments require sudo/su)
• JDBC is great for executing queries and system functions and retrieving the results
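Driving pg_ctl from an agent process amounts to building an argument list and spawning it. Hydra does this in Java with ProcessBuilder; the sketch below shows the same idea with Python's `subprocess` as an analogous mechanism. The function names, the `run_as` wrapper, and the choice of `-m fast` are assumptions for illustration, while the pg_ctl subcommands and the `-D` and `-m` options are standard.

```python
import subprocess


def pg_ctl_command(action, data_dir, run_as=None):
    """Build the argv for a pg_ctl action, optionally wrapped in sudo -u."""
    if action == "promote":
        command = ["pg_ctl", "promote", "-D", data_dir]
    elif action == "stop":
        # "-m fast" rolls back active transactions, disconnects clients,
        # then shuts the server down cleanly.
        command = ["pg_ctl", "stop", "-D", data_dir, "-m", "fast"]
    elif action == "start":
        command = ["pg_ctl", "start", "-D", data_dir]
    else:
        raise ValueError(f"unsupported action: {action}")
    if run_as is not None:
        # Some environments require switching user before touching the data dir.
        command = ["sudo", "-u", run_as] + command
    return command


def run(command, runner=subprocess.run):
    """Execute the command and report success; `runner` is injectable for testing."""
    result = runner(command, capture_output=True, text=True)
    return result.returncode == 0
```

Passing the command as a list avoids shell quoting problems. State checks such as `SELECT pg_is_in_recovery()` would go through the database driver rather than a spawned process.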

Slide 24

Slide 24 text

REST API
● Hydra can run as an agent (co-located with the Consul agent and the service it manages), or as a server.
● The server exposes a REST API for all supported operations, based on HTTP endpoints with JSON.
● The API is documented and exposed via Swagger, allowing automatic creation of clients in several languages.
● All operations are asynchronous.

Slide 25

Slide 25 text

REST API. Swagger (I)

Slide 26

Slide 26 text

REST API. Swagger (II)
● Include here one (and only one) good Swagger screenshot (ask Adrián).
● It should contain a good example of one or more methods, with plenty of color. Perhaps up-to-date methods, collapsed, so that the full list is visible. Or the detail of one method, or both, in which case 2 screenshots would be better!

Slide 27

Slide 27 text

DEMO!!!!!!!!!

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

API Authentication
● User authentication is performed with sessions, which are maintained via cryptographically signed JWT tokens.
● Tokens are created via login, currently done with API key/secret pairs. Tokens could also be generated via OAuth2 (Google/GitHub/other integrations), which is great for auditing.
● Groupon requested simplified auth: pre-created tokens with infinite-duration sessions (no login required, revocable).
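A signed, non-expiring, revocable token of the kind described above can be sketched with the standard library alone. The example assumes HS256 (HMAC-SHA256) signing and a `jti` claim checked against a revocation set; the slides do not say which algorithm or claims Hydra uses, so both are assumptions, as are the function names.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data):
    """Base64url-encode bytes without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    """Decode base64url text, restoring the padding JWT strips."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims, secret):
    """Create an HS256-signed JWT; omitting 'exp' means the token never expires."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token, secret, revoked_ids=frozenset()):
    """Return the claims if the signature is valid and the token is not revoked."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        claims = json.loads(_b64url_decode(payload_b64))
        given = _b64url_decode(signature_b64)
    except ValueError:
        return None  # malformed token
    # Always verify with HS256, regardless of what the token header claims.
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(given, expected):
        return None
    if not isinstance(claims, dict) or claims.get("jti") in revoked_ids:
        return None
    return claims
```

Since such a token carries no expiry, the revocation set is the only way to invalidate it, which is why the token identifier is checked on every request.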

Slide 30

Slide 30 text

Hydra deployment
● With support from Groupon, it's fully Ansible-ized.
● The Hydra agent requires a co-located consul-agent and a specific config file, which is service-dependent (PostgreSQL or MySQL).
● The Hydra server requires a separate, simpler config file.
● Java 8 (and Consul) are the only prerequisites. Tested and validated on Groupon FreeBSD with OpenJDK 8.

Slide 31

Slide 31 text

Sample agent config-file

Slide 32

Slide 32 text

Sample server config-file

Slide 33

Slide 33 text

Testing framework & Docker
● Since the first demo, the Hydra source code supports launching a Docker fleet with a full Hydra architecture from Maven (the Java build tool), with no requirements other than Java, Maven and Docker. It is now the CI environment of the project.
● An automated test that continuously triggers failovers (and uses automated new master election and node rejoining) will be published with v1.1 for deeper testing.

Slide 34

Slide 34 text

Status and open sourcing Hydra
● Hydra is going live in H1 2018 at Groupon.
● It is currently closed source.
● Groupon is considering open sourcing Hydra under a BSD license.

Slide 35

Slide 35 text

Questions?
www.ongres.com
@ongresinc