3 TABLE OF CONTENTS • A brief history of PostgreSQL at Zalando • Live DEMO (what can possibly go wrong?) • How to stop worrying (and embrace Patroni) • Kubernetes: the real thing • What is in the name: Postgres Operator
10 Spilo Docker image at Zalando • PGDATA on an external volume (EBS or i3/c5 NVMe) • Configuration based on environment variables • One container per EC2 instance • PostgreSQL versions from 9.4 up to 10 • Plenty of extensions (all contrib, PostGIS, TimescaleDB, PL/V8, pg_cron, etc.) • Additional tools (pgbouncer, pgq) • Extremely lightweight (69MB)
12 [Architecture diagram: Spilo cluster on AWS. Components shown: Cloud Formation Stack; Auto-Scaling group across Availability Zones A, B and C, each instance with a root volume and a data volume; one master DB and replica DBs; cluster and replica ELB security groups; master Elastic IP (db.zalando); replica Elastic Load Balancer (db-repl.zalando, GET /replica); ports 5432 and 8008; S3 bucket for backup + WAL; Etcd. User data: Docker image, backup schedule, superuser password, replication password, Postgres parameters.]
14 What is Patroni • Automatic failover solution for PostgreSQL streaming replication • A daemon that manages one PostgreSQL instance • Keeps the state of the cluster in a DCS (Etcd, Zookeeper, Consul, Kubernetes), also referred to as a consistency layer • For new instances, decides whether to initialize a new cluster or join an existing one • For running instances, executes promotion/demotion when necessary • A number of additional related functions (global configuration, scheduled actions, pause mode, pg_rewind support, etc.)
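The promote/demote decision described above can be illustrated with a toy leader lock. This is a minimal sketch, not Patroni's actual code: `InMemoryDCS`, `LeaderKey` and `ha_cycle` are hypothetical names, the DCS is an in-memory stand-in for Etcd or a similar store, and a real implementation would also check replication lag, timelines and more before promoting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaderKey:
    holder: str        # member that currently owns the leader lock
    expires_at: float  # time after which the lock counts as stale


class InMemoryDCS:
    """Toy stand-in for Etcd/ZooKeeper/Consul (hypothetical, not a real client)."""

    def __init__(self) -> None:
        self._leader: Optional[LeaderKey] = None

    def acquire_or_renew(self, member: str, now: float, ttl: float) -> bool:
        """Take the leader key if it is free, expired, or already held by `member`."""
        key = self._leader
        if key is None or key.expires_at <= now or key.holder == member:
            self._leader = LeaderKey(holder=member, expires_at=now + ttl)
            return True
        return False

    def leader(self, now: float) -> Optional[str]:
        """Return the current leader, or None if the key is missing or expired."""
        key = self._leader
        if key is not None and key.expires_at > now:
            return key.holder
        return None


def ha_cycle(dcs: InMemoryDCS, member: str, is_primary: bool,
             now: float, ttl: float = 30.0) -> str:
    """One iteration of a simplified HA loop; returns the action to take."""
    if dcs.acquire_or_renew(member, now, ttl):
        return "keep primary" if is_primary else "promote"
    return "demote" if is_primary else f"follow {dcs.leader(now)}"
```

Each daemon would call `ha_cycle` periodically for its own instance. A member that stops renewing the key loses it once the TTL expires, which is what allows another member to promote.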
15 What Patroni is not • Not an arbiter for the whole HA cluster • Not a Swiss Army knife of Postgres maintenance • Not a substitute for proper monitoring • Not a tool to use if you don’t understand how Etcd (or another DCS that you use) works • Not a silver bullet (but tries to balance ease of use against extensibility) • Not just an internal project of Zalando (IBM Compose, Red Hat and many other companies use it)
18 What is Kubernetes? • A set of open-source components running on one or more servers • A container orchestration system • An abstraction layer over your real or virtualized hardware • An “infrastructure as code” system • Automatic resource allocation • Next step after Ansible/Chef/Puppet
19 What Kubernetes is not • An operating system • A magical way to make your infrastructure scalable • An excuse to fire your devops (someone has to configure it) • A good solution for running 2-3 servers
22 Building a PostgreSQL cluster on Kubernetes • A StatefulSet to bind pods to persistent volumes and provide auto-recovery • A Service to route client connections • Spilo as a Docker container (Patroni + PostgreSQL) for HA • Secrets to store database user passwords
23 Manual deployment of an HA PostgreSQL cluster on Kubernetes • At least four long YAML manifests to write • Different parts of the PostgreSQL configuration spread over multiple manifests • No easy way to work with a cluster as a whole (update, delete) • Manual generation of DB objects, e.g. users and their passwords
24 Initial approach to automation: Helm • A template for your manifests • Only one place to fill in deployment-related values • Requires running a special pod (tiller) in your Kubernetes cluster • github.com/kubernetes/charts/blob/master/incubator/patroni
25 Kubernetes operator pattern • Implement a controller application to act on custom resources • CRD (custom resource definition) to describe a domain-specific object (e.g. a Postgres cluster) • Encapsulates the knowledge of a human operating the service • https://coreos.com/blog/introducing-operators.html
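The core of the operator pattern is a reconcile step that compares the desired state declared in a custom resource with what is actually running. The sketch below shows that idea in isolation, without any Kubernetes client. `PostgresClusterSpec` and `reconcile` are hypothetical names chosen for illustration and do not reflect the Postgres Operator's actual types. The actions are plain strings where a real controller would make API calls.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PostgresClusterSpec:
    """Hypothetical custom resource: the desired state declared by a user."""
    name: str
    instances: int
    version: str


def reconcile(spec: PostgresClusterSpec, observed: Dict[str, str]) -> List[str]:
    """Return the actions needed to move the observed state to the desired one.

    `observed` maps a pod name to the PostgreSQL version running in that pod.
    """
    actions: List[str] = []
    wanted = [f"{spec.name}-{i}" for i in range(spec.instances)]
    for pod in wanted:
        if pod not in observed:
            actions.append(f"create {pod}")
        elif observed[pod] != spec.version:
            actions.append(f"update {pod} to {spec.version}")
    # Anything running that the spec no longer asks for gets removed.
    for pod in sorted(observed):
        if pod not in wanted:
            actions.append(f"delete {pod}")
    return actions
```

A controller would run this on every change to the custom resource or to the objects it owns. When the two states already agree, the function returns an empty list and nothing happens.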
28 Just a piece of cake • Operator starts pods with the Spilo Docker image • Operator provides environment variables to Spilo • Operator makes sure all Kubernetes objects are in sync • Spilo generates the Patroni configuration • Patroni creates roles and configures PostgreSQL • Patroni makes sure there is only one master • Patroni uses Kubernetes for cluster state and the leader lock • Patroni changes service endpoints on failover
31 Should you run your PostgreSQL clusters on Kubernetes? Strong interest in the community • Zalando Postgres Operator • CrunchyData Postgres Operator • Red Hat Project Atomic • KubeDB • Project Habitat
32 Why not AWS RDS or Aurora PostgreSQL? Not an easy answer :) Full control • Independent of cloud provider • Real superuser available • Custom extensions, PAM • Streaming/WAL replication in and out • Local storage (NVMe SSDs), which RDS does not support Costs? Cost of development? ...
34 Using Kubernetes as a consistency store ● Use annotations on: ○ Pods for cluster members ○ A dedicated Endpoint for cluster configuration ○ The service-related Endpoint for leader information ● Reliability: always use Endpoints ● Compatibility mode: use ConfigMaps instead of Endpoints http://patroni.readthedocs.io/en/latest/kubernetes.html
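Storing leader information in annotations works as a lock because Kubernetes rejects an update that carries a stale resource version, so two members cannot both overwrite the same object. The sketch below imitates that compare-and-swap with an in-memory object. `FakeEndpoint`, `Conflict`, `try_take_leader` and the `leader` annotation key are assumptions made for illustration, not the Kubernetes client API or Patroni's exact annotation layout.

```python
from typing import Dict, Tuple


class Conflict(Exception):
    """Raised when the object changed after it was read."""


class FakeEndpoint:
    """In-memory stand-in for a Kubernetes object with annotations (hypothetical)."""

    def __init__(self) -> None:
        self.resource_version = 0
        self.annotations: Dict[str, str] = {}

    def read(self) -> Tuple[int, Dict[str, str]]:
        """Return the current version together with a copy of the annotations."""
        return self.resource_version, dict(self.annotations)

    def update(self, expected_version: int, annotations: Dict[str, str]) -> None:
        """Compare-and-swap: accept the write only if nobody updated in between."""
        if expected_version != self.resource_version:
            raise Conflict("stale resource version")
        self.annotations = dict(annotations)
        self.resource_version += 1


def try_take_leader(endpoint: FakeEndpoint, member: str) -> bool:
    """Try to record `member` as leader; lose gracefully on a race."""
    version, annotations = endpoint.read()
    if annotations.get("leader") not in (None, member):
        return False  # somebody else already holds the lock
    annotations["leader"] = member
    try:
        endpoint.update(version, annotations)
    except Conflict:
        return False  # another member wrote first
    return True
```

If two members read the same version and both try to write, only the first update succeeds. The second one gets a conflict and has to re-read before trying again.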
35 OAuth2 PAM authentication ● PAM module written in C ● Open-source: https://github.com/CyberDem0n/pam-oauth2 ● Equivalent of arbitrarily long, automatically generated, auto-expiring passwords ● Can supply arbitrary key=value pairs to check in the OAuth response (e.g. realm=/employees)
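The key=value check mentioned above can be pictured as follows. This is only a sketch of the idea in Python, not the C module's code: `parse_requirements`, `token_allows_login` and the `uid` login field are hypothetical, and the token information is assumed to have already been fetched and decoded into a dictionary.

```python
from typing import Dict, List, Mapping


def parse_requirements(args: List[str]) -> Dict[str, str]:
    """Turn ["realm=/employees", ...] into {"realm": "/employees", ...}."""
    required: Dict[str, str] = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        required[key] = value
    return required


def token_allows_login(token_info: Mapping[str, object], username: str,
                       required: Mapping[str, str],
                       login_field: str = "uid") -> bool:
    """Accept the login only if the token belongs to `username` and every
    required key is present in the response with exactly the expected value."""
    if token_info.get(login_field) != username:
        return False
    return all(
        key in token_info and str(token_info[key]) == value
        for key, value in required.items()
    )
```

For example, with `required = parse_requirements(["realm=/employees"])`, a token response of `{"uid": "alice", "realm": "/employees"}` would let `alice` in, while the same token with a different realm would be rejected.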
37 Made possible by great people inside and outside of Zalando • Patroni and Spilo (github.com/zalando/patroni, github.com/zalando/spilo): Alexander Kukushkin, Ants Aasma, Feike Steenbergen, Josh Berkus • Postgres Operator (github.com/zalando-incubator/postgres-operator): Murat Kabilov, Sergey Dudoladov, Manuel Gómez • PAM OAuth2 (https://github.com/CyberDem0n/pam-oauth2): Alexander Kukushkin • Put it all together in a sane way: Jan Mußler