Slide 1

Slide 1 text

PostgreSQL and Pacemaker from the Ground Up
Brian Cosgrove
Braintree

Slide 2

Slide 2 text

What is HA? Why do we want it?

Slide 3

Slide 3 text

High Availability
High availability refers to a system or component that is continuously operational for a desirably long length of time. Availability can be measured relative to "100% operational" or "never failing." A widely-held but difficult-to-achieve standard of availability for a system or product is known as "five 9s" (99.999 percent) availability [1].
[1] http://searchdatacenter.techtarget.com/definition/high-availability
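To make "five 9s" concrete, the availability percentage can be turned into a yearly downtime budget. This is a minimal sketch; the function name and the 365-day year are assumptions made for illustration, not part of the talk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years (assumption)


def downtime_budget_minutes(availability: float) -> float:
    """Minutes per year a system may be down at the given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for label, availability in [("three 9s", 0.999), ("four 9s", 0.9999), ("five 9s", 0.99999)]:
    print(f"{label}: {downtime_budget_minutes(availability):.2f} minutes/year")
```

Under these assumptions, five 9s leaves a little over five minutes of downtime per year.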

Slide 4

Slide 4 text

$50,000,000,000 / year
$95,000 / minute
http://venturebeat.com/2015/09/17/paypals-braintree-is-now-likely-bigger-than-square-and-stripe-combined/
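The per-minute figure follows from dividing the yearly volume by the number of minutes in a year. A quick check, assuming a 365-day year and volume spread evenly over time (both simplifications):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years (assumption)

annual_volume_usd = 50_000_000_000  # yearly figure from the slide
per_minute_usd = annual_volume_usd / MINUTES_PER_YEAR

# Rounds to roughly the $95,000 / minute quoted on the slide.
print(f"${per_minute_usd:,.0f} per minute")
```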

Slide 5

Slide 5 text

Designing HA: Detecting failure
Reliable and quick detection of failure allows us to minimize user-facing impact.
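"Reliable" and "quick" pull in opposite directions: reacting to a single missed health check is fast but prone to false alarms, while waiting for several misses is slower but steadier. The sketch below illustrates that trade-off with a consecutive-miss threshold. `FailureDetector`, `threshold`, and `observe` are hypothetical names chosen for this example; this is not how Pacemaker or Corosync implement detection.

```python
from dataclasses import dataclass


@dataclass
class FailureDetector:
    """Declares a node failed after `threshold` consecutive missed checks."""

    threshold: int = 3  # hypothetical default; lower is quicker, higher is more reliable
    misses: int = 0

    def observe(self, check_ok: bool) -> bool:
        """Record one health-check result; return True once the node counts as failed."""
        if check_ok:
            self.misses = 0  # any success resets the streak
        else:
            self.misses += 1
        return self.misses >= self.threshold


detector = FailureDetector(threshold=3)
results = [detector.observe(ok) for ok in [False, False, True, False, False, False]]
print(results)
```

Only the final check in the example trips the detector, because the earlier streak of two misses was reset by a successful check.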

Slide 6

Slide 6 text

Designing HA: Eliminate Single Points of Failure
Principle: Add redundancy to the system so that failure of a component does not mean failure of the entire system.
Use PostgreSQL with synchronous replication to provide a hot standby that can be promoted if the primary fails.
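A standby only removes the single point of failure if it is actually connected and replicating synchronously. On the primary, the `pg_stat_replication` view reports a `state` and a `sync_state` for each connected standby. The sketch below assumes the rows of that view have already been fetched as dictionaries (the fetching code and the helper name `has_sync_standby` are assumptions for illustration) and checks whether a synchronous standby is streaming.

```python
def has_sync_standby(replication_rows):
    """Return True if at least one standby is streaming with sync_state 'sync'.

    `replication_rows` is assumed to be a list of dicts built from
    SELECT application_name, state, sync_state FROM pg_stat_replication;
    """
    return any(
        row.get("state") == "streaming" and row.get("sync_state") == "sync"
        for row in replication_rows
    )


# Hypothetical sample rows, for illustration only.
healthy = [{"application_name": "standby1", "state": "streaming", "sync_state": "sync"}]
degraded = [{"application_name": "standby1", "state": "streaming", "sync_state": "async"}]

print(has_sync_standby(healthy), has_sync_standby(degraded), has_sync_standby([]))
```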

Slide 7

Slide 7 text

Designing HA: Reliable failover
Pacemaker can automate the promotion of standbys.

Slide 8

Slide 8 text

“The automated failover of our main production database could be described as the root cause of both of these downtime events… we've made changes to our Pacemaker configuration to ensure failover of the 'active' database role will only occur when initiated by a member of our operations team.” - A post-mortem

Slide 9

Slide 9 text

Mitigating some of the risks involved in automated failover
1. Have one candidate for fail-over per database cluster
2. Fail-over is a one-way operation - no flapping
3. Let humans take over if Pacemaker is confused
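The three rules can be read as a small decision table. The sketch below is one possible encoding, written only to illustrate the rules; the `Decision` enum, the `decide` function, and its parameters are hypothetical and do not describe Pacemaker's internals or Braintree's configuration.

```python
from enum import Enum


class Decision(Enum):
    HOLD = "hold"          # nothing to do
    PROMOTE = "promote"    # promote the single standby
    ESCALATE = "escalate"  # stop automating and page a human


def decide(primary_healthy: bool, standby_healthy: bool,
           already_failed_over: bool, cluster_state_clear: bool) -> Decision:
    if not cluster_state_clear:
        return Decision.ESCALATE  # rule 3: the cluster view is ambiguous, humans take over
    if primary_healthy:
        return Decision.HOLD
    if already_failed_over:
        return Decision.ESCALATE  # rule 2: fail-over is one-way, never flap back
    if not standby_healthy:
        return Decision.ESCALATE  # rule 1: there is only one candidate, and it is unusable
    return Decision.PROMOTE


print(decide(primary_healthy=False, standby_healthy=True,
             already_failed_over=False, cluster_state_clear=True))
```

Because there is exactly one candidate and at most one automatic promotion, every other failure path ends with a human rather than with another automated move.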

Slide 10

Slide 10 text

The nuts and bolts: Pacemaker and Corosync at Braintree

Slide 11

Slide 11 text

Our topology
● Each Postgres cluster gets its own Pacemaker cluster
● Protect against split-brains by introducing a third server which only provides a vote in leader elections
● Achieve even more isolation by running each Pacemaker cluster on its own “heartbeat” VLAN
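The vote-only third server matters because of majority arithmetic: with two voters, a network partition leaves each side with exactly half the votes, so neither side can safely claim leadership. A minimal sketch of a strict-majority check, where `has_quorum` is a hypothetical helper and not a Corosync API:

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """True when this partition holds a strict majority of all votes."""
    return 2 * reachable_votes > total_votes


# Two-node cluster split in half: neither side has a majority.
print(has_quorum(1, 2))

# Three votes (two database servers plus the vote-only server):
# the side that can still reach a second voter wins, the isolated node does not.
print(has_quorum(2, 3), has_quorum(1, 3))
```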

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Putting it together: Resources
Resources are controlled by init-like OCF scripts
Resources for our installation fall into the following categories:
● VIPs (virtual IP addresses) - “IPaddr2” resource
● pgsql resources - these are the Postgres clusters themselves
● STONITH - we use a custom resource that operates via SNMP on our APC PDUs

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_configuring_stonith.html

Slide 18

Slide 18 text

Thanks!
Brian Cosgrove
Software Engineer
twitter.com/cosgroveb
[email protected]
www.braintreepayments.com