Cloud-Native PostgreSQL on Kubernetes

Slide 1

CLOUD NATIVE POSTGRESQL ON KUBERNETES
ÁLVARO HERNÁNDEZ

Slide 2

`whoami`
● Founder & CEO, OnGres
● 20+ years as a PostgreSQL user and DBA
● Mostly doing R&D to create new, innovative software on Postgres
● Frequent speaker at PostgreSQL and database conferences
● Principal Architect of ToroDB
● Founder and President of the NPO Fundación PostgreSQL
● AWS Data Hero
Álvaro Hernández @ahachete

Slide 3

THE “STACK” PROBLEM

Slide 4

//POSTGRESQL AND ORACLE INSTALL SIZE

    $ podman images --format "table {{.Repository}} {{.Tag}} {{.Size}}" \
        docker.io/library/postgres
    REPOSITORY                  TAG     SIZE
    docker.io/library/postgres  alpine  76.9 MB
    docker.io/library/postgres  12.0    356 MB

    $ podman images --format "table {{.Repository}} {{.Tag}} {{.Size}}" \
        docker.io/store/oracle/database-enterprise
    REPOSITORY                                  TAG       SIZE
    docker.io/store/oracle/database-enterprise  12.2.0.1  3.46 GB

Slide 5

//POSTGRES IS “JUST A KERNEL”
Postgres is like the Linux kernel.
Running Postgres in production requires “a Red Hat” of PostgreSQL: a curated set of open source components built, verified and packaged together.

Slide 6

//THE POSTGRESQL ECOSYSTEM

Slide 7

//AN ENTERPRISE-GRADE POSTGRESQL STACK

Slide 8

// CONFIGURATION
● OS, filesystem tuning
● PostgreSQL default configuration is very conservative.
● Resources:
  ○ https://postgresqlco.nf
  ○ PostgreSQL Configuration for Humans

Slide 9

// CONNECTION POOLING
Benchmark setup: pgbench, scale 2000, m4.large (2 vCPU, 8 GB RAM, 1k IOPS)

Slide 10

// CONNECTION POOLING
● PgPool?
● PgBouncer?
● Odyssey?
● Where do we place the pool?
  ○ Client-side
  ○ Server-side
  ○ Middleware
  ○ Some or all of the above

Slide 11

// HIGH AVAILABILITY
● Manual?
● PgPool?
● repmgr?
● Patroni?
● pg_auto_failover?
● PAF?
● Stolon?

Slide 12

// BACKUPS AND DR
● pg_dump?
● Barman?
● pgBackRest?
● WAL-E / WAL-G?
● pg_probackup?
● To disk? To cloud storage?

Slide 13

// CENTRALIZED LOGGING
● Logs on every server
● There is no good solution for this
● Cloud-native solutions like fluentd or Loki may work
● Store the logs on Timescale

Slide 14

// NETWORK PROXY. ENTRYPOINT PROBLEM
● Entrypoint: how do I locate the master, if it might be changing?
● How do I obtain traffic metrics?
● Is it possible to manage traffic: duplicate it, A/B route it to test clusters, or even inspect it?
● Offload TLS?

Slide 15

// MONITORING
● Zabbix?
● Okmeter?
● Pganalyze?
● Pgwatch2?
● PoWA?
● New Relic?
● DataDog?
● Prometheus?

Slide 16

// MANAGEMENT INTERFACE
● There are no tools like OEM…
● UI oriented towards cluster management
● ClusterControl?
● Elephant Shed?

Slide 17

//WHERE DO WE DEPLOY THE STACK?

Slide 18

DEPLOYING THE POSTGRES STACK ON KUBERNETES

Slide 19

//WHY KUBERNETES?
● K8s is “the JVM” of distributed systems architecture: an abstraction layer & API to deploy and automate infrastructure.
● K8s provides APIs for node and IP discovery, secret management, network proxying and load balancing, storage allocation, etc.
● A PostgreSQL deployment can be fully automated!

Slide 20

//K8S OPERATORS: AUTOMATE POSTGRESQL OPS!
● Operators are just applications, developed for K8s
● They understand PostgreSQL operations
● They call K8s APIs to execute the operations
● They automate:
  ○ Minor version upgrades (rolling strategy)
  ○ Explicit vacuums
  ○ Repacks / reindex
  ○ Health checks
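The operator idea can be illustrated with a minimal sketch of a reconcile step: compare the desired state with the observed one and derive the operations to run. This is not StackGres code and makes no Kubernetes API calls; `ClusterState` and `plan_actions` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical snapshot of a Postgres cluster (not a real K8s or StackGres type)."""
    instances: int
    minor_version: str


def plan_actions(desired: ClusterState, observed: ClusterState) -> List[str]:
    """Return the operations needed to move the observed state to the desired one."""
    actions = []
    if observed.instances < desired.instances:
        actions.append(f"add {desired.instances - observed.instances} replica(s)")
    elif observed.instances > desired.instances:
        actions.append(f"remove {observed.instances - desired.instances} replica(s)")
    if observed.minor_version != desired.minor_version:
        # A real operator would restart instances one at a time (rolling strategy).
        actions.append(
            f"rolling upgrade {observed.minor_version} -> {desired.minor_version}"
        )
    return actions
```

A real operator would run this comparison repeatedly, triggered by changes reported by the K8s API, and translate each planned action into API calls.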

Slide 21

//CLOUD NATIVE
Cloud native applications are:
● designed to be packaged in containers
● scalable, and can be orchestrated for high availability
They also follow cloud-native best practices, including:
● Single-process hierarchy per container
● Sidecar containers to separate concerns
● Design for mostly ephemeral containers

Slide 22

//CONTAINERS ARE NOT SLIM VMS
● A container is an abstraction over a process hierarchy, with its own network and process namespaces and virtualized storage.
● But it is just a process hierarchy. Not many processes!
● No kernel, kernel modules, device drivers, no init system, bare minimum OS.
● It should contain just the binary of your process plus the dynamic libraries and support files it needs.

Slide 23

TURNING POSTGRESQL CLOUD NATIVE

Slide 24

//IS POSTGRESQL FOR CONTAINERS?
● Overhead is minimal (1-2%): it is just a wrapper over the processes!
● Containers are as ephemeral as the process hierarchy they wrap.
● Advantage: they can be restarted somewhere else if they fail.
● It’s easier with stateless apps. But storage can be easily decoupled from containers: there are many storage persistence technologies.
● The entrypoint problem is typically solved by the container orchestration layer.

Slide 25

//MINIMAL CONTAINER IMAGE
● It’s not about disk space or I/O. It’s about security and good design principles.
● PostgreSQL binaries are small, so the container image should not be huge. Remove:
  ○ Non-essential PostgreSQL binaries
  ○ Docs, psql
  ○ Non-system OS tools: everything except /bin, /sbin, /lib*
  ○ Init system, if any!

Slide 26

//LEVERAGE THE SIDECAR PATTERN
If a container should only have a single process hierarchy, how can we add support daemons like monitoring or HA agents?
● In K8s a pod is a set of 1+ containers that share the same namespaces, and run side-by-side on the same host.
● Sidecar pattern: deploy side functionality (like agents) to side containers (sidecars) on the same pod as PostgreSQL’s container.
● Sidecars share the same IP and port space and the same process space (they can send kill signals to processes), and see the same persistent volume mount.
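To make the sidecar pattern concrete, here is a minimal sketch that builds a pod manifest as a plain Python dictionary. The function name, image names, volume name and mount path are assumptions made for illustration, not StackGres defaults. `shareProcessNamespace` is the pod-level Kubernetes field that lets containers in a pod see and signal each other's processes.

```python
import json
from typing import Dict


def pod_with_sidecars(name: str, sidecars: Dict[str, str]) -> dict:
    """Build a pod manifest with a Postgres container plus the given sidecars.

    `sidecars` maps a container name to an image (both hypothetical here).
    """
    # Every container mounts the same persistent volume.
    data_mount = {"name": "pgdata", "mountPath": "/var/lib/postgresql/data"}
    containers = [
        {"name": "postgres", "image": "postgres:12.0", "volumeMounts": [data_mount]}
    ]
    for sidecar_name, image in sidecars.items():
        containers.append(
            {"name": sidecar_name, "image": image, "volumeMounts": [data_mount]}
        )
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Lets sidecars see and signal the Postgres processes.
            "shareProcessNamespace": True,
            "containers": containers,
            "volumes": [
                {
                    "name": "pgdata",
                    "persistentVolumeClaim": {"claimName": f"{name}-data"},
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = pod_with_sidecars("pg-0", {"pgbouncer": "example/pgbouncer:1.11.0"})
    print(json.dumps(pod, indent=2))
```

Containers in a pod already share the network namespace, so the sidecars reach PostgreSQL on localhost without any extra configuration.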

Slide 27

//HIGH AVAILABILITY (HA)
● HA is a core concept of cloud native.
● K8s provides mechanisms for leader election and HA. But they are not a good fit for PostgreSQL!
● Leader election needs to be replication lag and topology aware.
● We also need to run operations after {fail,switch}over.
● Use PostgreSQL-specific HA mechanisms.
● Use K8s to automatically restart pods if they fail, and scale replicas.
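What "replication lag and topology aware" means can be sketched as a candidate-selection function. This is a simplified illustration, not how Patroni or any other HA tool is implemented; the `Replica` fields, the `cascading` flag and the lag threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical view of a replica at failover time."""
    name: str
    lag_bytes: int           # how far behind the failed leader this replica is
    healthy: bool = True
    cascading: bool = False  # replicates from another replica, not from the leader


def pick_new_leader(replicas: List[Replica], max_lag_bytes: int) -> Optional[Replica]:
    """Pick the healthy, directly attached replica with the least lag.

    Returns None when no replica is within the allowed lag, so the caller
    can decide whether to wait or to accept potential data loss.
    """
    candidates = [
        r for r in replicas
        if r.healthy and not r.cascading and r.lag_bytes <= max_lag_bytes
    ]
    return min(candidates, key=lambda r: r.lag_bytes, default=None)
```

A generic leader election would treat all pods as equal; here a replica that is unhealthy, too far behind, or in the wrong position in the topology is never promoted.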

Slide 28

//CENTRALIZED LOGGING
● A pattern that is not exclusive to containers, but reinforced in K8s.
● DBAs should not need to log in to every container to check logs.
● Centralized logs make it possible to:
  ○ Correlate events across multiple servers (leader / replicas).
  ○ Manage log persistence once.
  ○ Run periodic reporting and alerting processes (like pgBadger).
  ○ Correlate with centralized monitoring (like Prometheus).
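Correlating events across servers amounts to merging per-server log streams into one timeline. The sketch below assumes each stream is already sorted by time and that all timestamps use the same ISO-8601 format and time zone, so they compare correctly as strings. The tuple layout and the function name are made up for this example.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# (ISO-8601 timestamp, server name, message): a hypothetical log record layout.
LogLine = Tuple[str, str, str]


def merge_logs(*streams: Iterable[LogLine]) -> Iterator[LogLine]:
    """Lazily merge already-sorted per-server streams into one timeline."""
    return heapq.merge(*streams, key=lambda line: line[0])
```

Because `heapq.merge` is lazy, the merged timeline can be consumed while the streams are still being read, without loading every log into memory.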

Slide 29

STACKGRES

Slide 30

PRE-DEMO

Slide 31

//STACKGRES: CLOUD NATIVE POSTGRESQL
Running on Kubernetes.
Embracing multi-cloud and on-premises.
Enterprise-grade, highly opinionated PostgreSQL stack.
DB-as-a-Service without vendor lock-in.
Root access.
Open source!

Slide 32

//THE STACKGRES STACK (I)
UBI 8 minimal image
Vanilla PostgreSQL v11, v12
Persistent storage via StorageClass
Tuned by default, user configurable
Util container

Slide 33

//THE STACKGRES STACK (II)
Connection pooling
Automatic Failover + HA: Patroni
Scale to any number of nodes
Envoy: RW + RO entry points

Slide 34

//THE STACKGRES STACK (III)
Centralized log management
Monitoring w/ Prometheus (built-in or external)
Backup to Cloud Storage or K8s volume

Slide 35

//THE STACKGRES STACK (IV)
CLI & API cluster management
Web UI management interface
Automatic, minor version rolling upgrades
Integration with OLM

Slide 36

//STACKGRES ARCHITECTURE

Slide 37

//STACKGRES ARCHITECTURE

Slide 38

//STACKGRES ARCHITECTURE
● Storage Class behavior:

Slide 39

//STACKGRES ARCHITECTURE
● Networking (Envoy, coming in 0.8)

Slide 40

DEPLOY A POSTGRESQL-aaS WITH STACKGRES

Slide 41

//CRDs: STACKGRES HIGH-LEVEL OBJECTS
● CRDs (Custom Resource Definitions) define custom Kubernetes objects. StackGres uses them extensively.
● They define high-level concepts, such as a Postgres Cluster.
● StackGres defines the following:
  ○ Postgres cluster
  ○ Postgres configuration
  ○ Connection pooling configuration
  ○ Instance profile

Slide 42

//CRDs: Instance Profile

    apiVersion: stackgres.io/v1alpha1
    kind: StackGresProfile
    metadata:
      name: size-s
    spec:
      cpu: "1000m"
      memory: "2Gi"
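The `cpu` and `memory` values in the profile use Kubernetes quantity notation: `1000m` is 1000 millicores (one full core) and `2Gi` is 2 × 2^30 bytes. A minimal sketch of the conversion follows; it handles only the millicore suffix and the binary memory suffixes, not the full quantity grammar, and the function names are made up for this example.

```python
# Binary suffixes used for memory quantities (subset of the K8s notation).
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_cores(quantity: str) -> float:
    """Convert a CPU quantity such as "1000m" or "2" into cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "2Gi" into bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain number of bytes
```

With the `size-s` profile, `cpu_cores("1000m")` gives 1.0 and `memory_bytes("2Gi")` gives 2147483648.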

Slide 43

//CRDs: Postgres configuration

    apiVersion: stackgres.io/v1alpha1
    kind: StackGresPostgresConfig
    metadata:
      name: postgresconf
    spec:
      pg_version: "12"
      postgresql.conf:
        shared_buffers: '256MB'
        random_page_cost: '1.5'
        password_encryption: 'scram-sha-256'
        wal_compression: 'on'

Slide 44

//CRDs: PgBouncer configuration

    apiVersion: stackgres.io/v1alpha1
    kind: StackGresConnectionPoolingConfig
    metadata:
      name: pgbouncerconf
    spec:
      pgbouncer_version: "1.11.0"
      pgbouncer.ini:
        pool_mode: transaction
        max_client_conn: '200'
        default_pool_size: '200'
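The `pgbouncer.ini` map in the object above corresponds to settings in the `[pgbouncer]` section of PgBouncer's configuration file. How StackGres actually renders that file is not shown in the slides; the following is only a sketch of the idea, with a hypothetical function name.

```python
from typing import Mapping


def render_pgbouncer_ini(settings: Mapping[str, str]) -> str:
    """Render a settings map as the [pgbouncer] section of pgbouncer.ini."""
    lines = ["[pgbouncer]"]
    # Sort keys so the output is stable across runs.
    for key in sorted(settings):
        lines.append(f"{key} = {settings[key]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_pgbouncer_ini({
        "pool_mode": "transaction",
        "max_client_conn": "200",
        "default_pool_size": "200",
    }))
```

A complete `pgbouncer.ini` also needs a `[databases]` section, which this sketch leaves out.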

Slide 45

//CRDs: Postgres cluster configuration

    apiVersion: stackgres.io/v1alpha1
    kind: StackGresCluster
    metadata:
      name: stackgres
    spec:
      instances: 2
      pg_version: '12.0'
      pg_config: 'postgresconf'
      connection_pooling_config: 'pgbouncerconf'
      resource_profile: 'size-s'
      volume_size: '10Gi'
      postgres_exporter_version: '0.5.1'
      prometheus_autobind: true
      sidecars:
        - connection-pooling
        - postgres-util
        - prometheus-postgres-exporter
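The cluster object refers to the other objects by name: `pg_config`, `connection_pooling_config` and `resource_profile` match the `metadata.name` of the configuration, pooling and profile objects from the previous slides. A minimal sketch of checking those references before creating a cluster follows. The field-to-kind mapping is inferred from the slides, and the function is hypothetical, not part of StackGres.

```python
from typing import Dict, List, Set

# Cluster spec field -> kind of the object it refers to (inferred from the slides).
REFERENCE_FIELDS = {
    "pg_config": "StackGresPostgresConfig",
    "connection_pooling_config": "StackGresConnectionPoolingConfig",
    "resource_profile": "StackGresProfile",
}


def missing_references(spec: dict, existing: Dict[str, Set[str]]) -> List[str]:
    """Return "Kind/name" for every referenced object that does not exist.

    `existing` maps a kind to the set of object names already created.
    """
    missing = []
    for field, kind in REFERENCE_FIELDS.items():
        name = spec.get(field)
        if name is not None and name not in existing.get(kind, set()):
            missing.append(f"{kind}/{name}")
    return missing
```

An empty result means every named configuration exists, so the cluster object can be created.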

Slide 46

DEMO

Slide 47

//STACKGRES ROADMAP

Slide 48

//STACKGRES.IO
https://stackgres.io
https://gitlab.com/ongresinc/stackgres

Slide 49

QUESTIONS?
Álvaro Hernández @ahachete / www.ongres.com
//THANK YOU