Postgres on Kubernetes - Dos and Don'ts

Running databases in containers has been the biggest anti-pattern of the last decade. The world, however, moves on: stateful container workloads are becoming more common, and so are databases in Kubernetes. People love the additional convenience when it comes to deployment, scalability, and operation.

With PostgreSQL on its way to becoming the world’s most beloved database, there are quite a few things to keep in mind when running it on k8s. Let us evaluate the important Dos and especially the Don’ts.

Presentation by Chris Engelbert from simplyblock (https://www.simplyblock.io).

Transcript

  1. Chris Engelbert, DevRel @ simplyblock. Previous fun companies: Ubisoft / Blue Byte, Hazelcast, Instana, clevabit, Timescale. Interests: Developer Relations, Anything Performance Engineering, Backend Technologies, Fairy Tales (AMD, Intel, Nvidia). @noctarius2k @[email protected] @noctarius.com
  2. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind!
  3. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind! Never run Stateful Workloads in k8s!
  4. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind! Never run Stateful Workloads in k8s! Persistent Data will kill you! Too slow!
  5. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind! Never run Stateful Workloads in k8s! Persistent Data will kill you! Too slow! Nobody understands Kubernetes!
  6. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind! Never run Stateful Workloads in k8s! Persistent Data will kill you! Too slow! Nobody understands Kubernetes! What’s the benefit; databases don’t need autoscaling!
  7. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind! Never run Stateful Workloads in k8s! Persistent Data will kill you! Too slow! Nobody understands Kubernetes! What’s the benefit; databases don’t need autoscaling! Databases and applications should be separated!
  8. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind! Never run Stateful Workloads in k8s! Persistent Data will kill you! Too slow! Nobody understands Kubernetes! What’s the benefit; databases don’t need autoscaling! Databases and applications should be separated! Not another layer of indirection / abstraction!
  9. Why not to run a database in Kubernetes? K8s is not designed with Databases in mind! Never run Stateful Workloads in k8s! Persistent Data will kill you! Too slow! Nobody understands Kubernetes! What’s the benefit; databases don’t need autoscaling! Databases and applications should be separated! Not another layer of indirection / abstraction!
  10. Why? No Cloud-Vendor Lock-In; Faster Time To Market; Decreasing cost; Automation; Unified deployment architecture; Need read-only replicas.
  11. Backup and Recovery: You want Continuous Backup and PITR. Roll your own pg_basebackup or pg_dump (don’t!). (https://www.ovhcloud.com/de/bare-metal/backup-storage/)
  12. Backup and Recovery: You want Continuous Backup and PITR. Roll your own pg_basebackup or pg_dump (don’t!). Use tools like pgbackrest, barman, PGHoard, … (https://www.ovhcloud.com/de/bare-metal/backup-storage/)
  13. Backup and Recovery: You want Continuous Backup and PITR. Roll your own pg_basebackup or pg_dump (don’t!). Use tools like pgbackrest, barman, PGHoard, … Upload backups to S3? Cost! (https://www.ovhcloud.com/de/bare-metal/backup-storage/)
  14. Backup and Recovery: You want Continuous Backup and PITR. Roll your own pg_basebackup or pg_dump (don’t!). Use tools like pgbackrest, barman, PGHoard, … Upload backups to S3? Cost! 😅 Test Your Backups 😅 (https://www.ovhcloud.com/de/bare-metal/backup-storage/)
  15. PostgreSQL Configuration: shared_buffers, (maintenance_)work_mem, effective_cache_size. The PostgreSQL configuration isn’t influenced too much.
  16. PostgreSQL Configuration: shared_buffers, (maintenance_)work_mem, effective_cache_size. The PostgreSQL configuration isn’t influenced too much. Use Huge Pages!
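As a rough illustration of the three settings named on this slide, the sketch below derives starting values from a container's memory limit. The percentages are common rules of thumb and are assumptions here, not figures from the talk; the function name `suggest_pg_memory_settings` is hypothetical.

```python
def suggest_pg_memory_settings(memory_limit_mib: int, max_connections: int = 100) -> dict:
    """Return rule-of-thumb starting values (assumptions, not benchmark results)."""
    # Assumption: roughly 25% of the container memory limit for shared_buffers.
    shared_buffers = memory_limit_mib // 4
    # Assumption: roughly 75% as a planner hint; this setting allocates nothing.
    effective_cache_size = memory_limit_mib * 3 // 4
    # Assumption: 1/16 of memory for maintenance tasks, capped at 2 GiB.
    maintenance_work_mem = min(memory_limit_mib // 16, 2048)
    # work_mem applies per sort or hash operation, so keep it small per connection.
    work_mem = max((memory_limit_mib - shared_buffers) // (max_connections * 4), 4)
    return {
        "shared_buffers": f"{shared_buffers}MB",
        "effective_cache_size": f"{effective_cache_size}MB",
        "maintenance_work_mem": f"{maintenance_work_mem}MB",
        "work_mem": f"{work_mem}MB",
    }


if __name__ == "__main__":
    # Example: a Postgres container with an 8 GiB memory limit.
    for name, value in suggest_pg_memory_settings(8192).items():
        print(f"{name} = {value}")
```

The output lines use the `name = value` form that `postgresql.conf` expects. Treat the numbers as a first guess to validate against the real workload.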
  17. Extensions! Do you need PG Extensions? Do you need more? Is the extension part of the container image?
  18. Extensions! Do you need PG Extensions? Do you need more? Is the extension part of the container image? If not, you need to build your own layer…
  19. Extensions! Do you need PG Extensions? Do you need more? Is the extension part of the container image? If not, you need to build your own layer… or use some magic (more on this later).
  20. Storage: Use Persistent Volumes (local volumes are a bad idea). Should be dynamically provisioned. CSI provider enables encryption at rest.
  21. Storage: Use Persistent Volumes (local volumes are a bad idea). Should be dynamically provisioned. CSI provider enables encryption at rest. High IOPS (SSD or NVMe).
  22. Storage: Use Persistent Volumes (local volumes are a bad idea). Should be dynamically provisioned. CSI provider enables encryption at rest. High IOPS (SSD or NVMe). Low Latency.
  23. Storage: Use Persistent Volumes (local volumes are a bad idea). Should be dynamically provisioned. CSI provider enables encryption at rest. High IOPS (SSD or NVMe). Low Latency. Database performance is as fast as your storage.
  24. Storage: Use Persistent Volumes (local volumes are a bad idea). Should be dynamically provisioned. CSI provider enables encryption at rest. High IOPS (SSD or NVMe). Low Latency. Database performance is as fast as your storage. I’d recommend disaggregated storage!
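To make "dynamically provisioned" concrete, here is a minimal sketch that builds a StorageClass and a PersistentVolumeClaim as plain dictionaries and prints them as JSON (which `kubectl apply -f` accepts). The provisioner name `csi.example.com`, the object names, and the `encrypted` parameter are placeholders: parameter keys are specific to each CSI driver, so check your driver's documentation.

```python
import json


def storage_class(name: str, provisioner: str) -> dict:
    """StorageClass for dynamic provisioning through a CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Placeholder: encryption parameters differ between CSI drivers.
        "parameters": {"encrypted": "true"},
        # Delay binding until the pod is scheduled, so the volume lands near it.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
        # Keep the underlying volume if the claim is deleted by accident.
        "reclaimPolicy": "Retain",
    }


def volume_claim(name: str, storage_class_name: str, size: str) -> dict:
    """PersistentVolumeClaim that triggers provisioning from the class above."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(json.dumps(storage_class("pg-fast", "csi.example.com"), indent=2))
    print(json.dumps(volume_claim("pg-data", "pg-fast", "100Gi"), indent=2))
```

In practice an operator or a StatefulSet `volumeClaimTemplates` entry usually creates the claim for you; the sketch only shows which fields matter.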
  25. Requests, Limits, and Quotas: Use Resource Requests, Limits, Quotas. CPU and memory requests need to be accurate to prevent contention and ensure predictable performance. (Diagram labels: Capacity, Limits, Requests, Used.)
  26. Requests, Limits, and Quotas: Use Resource Requests, Limits, Quotas. CPU and memory requests need to be accurate to prevent contention and ensure predictable performance. (Diagram labels: Capacity, Limits, Requests, Used. https://codimite.ai/blog/kubernetes-resources-and-scaling-a-beginners-guide/)
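One way to get predictable performance is to set requests equal to limits, which places the pod in Kubernetes' Guaranteed QoS class. The sketch below shows that together with a namespace-level ResourceQuota. Choosing Guaranteed QoS is an assumption of this example, not something the slide prescribes, and the names and sizes are hypothetical.

```python
import json


def guaranteed_resources(cpu: str, memory: str) -> dict:
    """Container resources with requests == limits (Guaranteed QoS class)."""
    wanted = {"cpu": cpu, "memory": memory}
    return {"requests": dict(wanted), "limits": dict(wanted)}


def namespace_quota(name: str, cpu: str, memory: str) -> dict:
    """ResourceQuota capping the total requests and limits in a namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": name},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical sizes for a single Postgres container and its namespace.
    print(json.dumps(guaranteed_resources("4", "16Gi"), indent=2))
    print(json.dumps(namespace_quota("pg-quota", "16", "64Gi"), indent=2))
```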
  27. Make it big! Enable Huge Pages! In your OS and the Resource Descriptor. https://www.percona.com/blog/using-huge-pages-with-postgresql-running-inside-kubernetes/
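The two places named on the slide can be sketched as follows: the number of huge pages to reserve on the node (for example through the `vm.nr_hugepages` sysctl) and the matching `hugepages-2Mi` entry in the pod's resources. The 10% headroom on top of `shared_buffers` is an assumption of this sketch; on PostgreSQL 15 and later the server can report the exact figure through `shared_memory_size_in_huge_pages`.

```python
import json


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def huge_pages_needed(shared_buffers_mib: int, page_size_mib: int = 2,
                      headroom_percent: int = 10) -> int:
    """Pages to reserve on the node, e.g. as the value of vm.nr_hugepages."""
    pages = ceil_div(shared_buffers_mib, page_size_mib)
    # Assumption: extra room for shared memory beyond shared_buffers.
    return pages + ceil_div(pages * headroom_percent, 100)


def resources_with_huge_pages(cpu: str, memory: str, pages: int,
                              page_size_mib: int = 2) -> dict:
    """Container resources; huge page requests must equal their limits."""
    wanted = {
        "cpu": cpu,
        "memory": memory,
        f"hugepages-{page_size_mib}Mi": f"{pages * page_size_mib}Mi",
    }
    return {"requests": dict(wanted), "limits": dict(wanted)}


if __name__ == "__main__":
    pages = huge_pages_needed(shared_buffers_mib=2048)
    print(f"vm.nr_hugepages = {pages}")
    print(json.dumps(resources_with_huge_pages("4", "16Gi", pages), indent=2))
```

PostgreSQL itself also needs `huge_pages = on` (or `try`) in its configuration to use the reserved pages.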
  28. Connection Pooling: Resiliency and Overhead. Never use PostgreSQL without Connection Pooling! Optimizes Overhead and Resource Utilization.
  29. Connection Pooling: Resiliency and Overhead. Never use PostgreSQL without Connection Pooling! Optimizes Overhead and Resource Utilization. Handles failovers, central switching of Primary.
  30. Connection Pooling: Resiliency and Overhead. Never use PostgreSQL without Connection Pooling! Optimizes Overhead and Resource Utilization. Handles failovers, central switching of Primary. Enables easy use of Read-Replicas.
  31. Connection Pooling: Resiliency and Overhead. Never use PostgreSQL without Connection Pooling! Optimizes Overhead and Resource Utilization. Handles failovers, central switching of Primary. Enables easy use of Read-Replicas. PgBouncer, PgPool-II, pgagroal, PgCat, Odyssey, …
  32. Connection Pooling: Resiliency and Overhead. Never use PostgreSQL without Connection Pooling! Optimizes Overhead and Resource Utilization. Handles failovers, central switching of Primary. Enables easy use of Read-Replicas. PgBouncer, PgPool-II, pgagroal, PgCat, Odyssey, … https://tembo.io/blog/postgres-connection-poolers
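As an example of what a pooler configuration can look like, the sketch below renders a minimal `pgbouncer.ini` for PgBouncer, one of the poolers listed on the slide. The host name `postgres-primary`, the database name, and the pool sizes are hypothetical; transaction pooling is one possible mode and does not suit applications that rely on session state.

```python
import configparser
import io


def pgbouncer_ini(db_name: str, primary_host: str,
                  pool_size: int = 20, max_clients: int = 500) -> str:
    """Render a minimal pgbouncer.ini as text."""
    config = configparser.ConfigParser(interpolation=None)
    # Clients connect to PgBouncer; only this entry knows where the primary is,
    # so a failover means changing one place instead of every application.
    config["databases"] = {
        db_name: f"host={primary_host} port=5432 dbname={db_name}",
    }
    config["pgbouncer"] = {
        "listen_addr": "0.0.0.0",
        "listen_port": "6432",
        "auth_type": "scram-sha-256",
        "auth_file": "/etc/pgbouncer/userlist.txt",
        # Server connections are shared between clients per transaction.
        "pool_mode": "transaction",
        "max_client_conn": str(max_clients),
        "default_pool_size": str(pool_size),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical service and database names.
    print(pgbouncer_ini("appdb", "postgres-primary"))
```

With these values, up to 500 client connections share 20 server connections per database and user pair, which is where the overhead saving comes from.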
  33. Networking and Access Control: Use Network Policies. Enable TLS (you remember?!). (https://timeclock365.com/tc22-door-access-controller/)
  34. Networking and Access Control: Use Network Policies. Enable TLS (you remember?!). Set up Security Policies. (https://timeclock365.com/tc22-door-access-controller/)
  35. Networking and Access Control: Use Network Policies. Enable TLS (you remember?!). Set up Security Policies. Configure RBAC (Role-Based Access Control). (https://timeclock365.com/tc22-door-access-controller/)
  36. Networking and Access Control: Use Network Policies. Enable TLS (you remember?!). Set up Security Policies. Configure RBAC (Role-Based Access Control). Think about a policy manager such as OPA or Kyverno. (https://timeclock365.com/tc22-door-access-controller/)
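A network policy for a database typically allows ingress on the Postgres port only from the pods that need it. The following sketch builds such a NetworkPolicy; the label keys and values (`app: postgres`, `role: backend`) and the policy name are hypothetical, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy objects.

```python
import json


def postgres_ingress_policy(name: str, db_labels: dict, client_labels: dict,
                            port: int = 5432) -> dict:
    """Allow TCP traffic to the database pods only from selected client pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            # The pods this policy protects.
            "podSelector": {"matchLabels": db_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Only pods carrying these labels may connect.
                    "from": [{"podSelector": {"matchLabels": client_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    policy = postgres_ingress_policy(
        "allow-backend-to-postgres",
        db_labels={"app": "postgres"},
        client_labels={"role": "backend"},
    )
    print(json.dumps(policy, indent=2))
```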
  37. Observability and Alerting: Like anything cloud, make sure you have monitoring (meaning observability) and alerting!
  38. Observability and Alerting: Like anything cloud, make sure you have monitoring (meaning observability) and alerting! Prometheus Exporter, Log Collector, Aggregation, Analysis, Traceability, …
  39. Observability and Alerting: Like anything cloud, make sure you have monitoring (meaning observability) and alerting! Prometheus Exporter, Log Collector, Aggregation, Analysis, Traceability, … Datadog, Instana, Dynatrace, Grafana, …
  40. Operator: Use a Postgres Kubernetes Operator. Handles or configures many of the typical tasks (HA, backup, …).
  41. Operator: Use a Postgres Kubernetes Operator. Handles or configures many of the typical tasks (HA, backup, …). Brings cloud-nativeness to PG.
  42. Operator: Use a Postgres Kubernetes Operator. Handles or configures many of the typical tasks (HA, backup, …). Brings cloud-nativeness to PG. Integrates PG into k8s.
  43. Operator: Use a Postgres Kubernetes Operator. Handles or configures many of the typical tasks (HA, backup, …). Brings cloud-nativeness to PG. Integrates PG into k8s. If not, use Helm Charts.
  44. Operator

    | Feature | CloudNativePG | Crunchy Postgres for Kubernetes | OnGres StackGres | KubeDB | Zalando Postgres Operator |
    | --- | --- | --- | --- | --- | --- |
    | Supported versions | 12, 13, 14, 15, 16 | 11, 12, 13, 14, 15, 16 | 12, 13, 14, 15, 16 | 9.6, 10, 11, 12, 13, 14 | 11, 12, 13, 14, 15, 16 |
    | Postgres Clusters | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Streaming replication | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Supports Extensions | ✔ | ✔ | ✔ | ✔ | ✔ |
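  45. Operator

    | Feature | CloudNativePG | Crunchy Postgres for Kubernetes | OnGres StackGres | KubeDB | Zalando Postgres Operator |
    | --- | --- | --- | --- | --- | --- |
    | Hot Standby | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Warm Standby | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Automatic Failover | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Continuous Archiving | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Restore from WAL archive | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Supports PITR | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Manual backups | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Scheduled backups | ✔ | ✔ | ✔ | ✔ | ✔ |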
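  46. Operator

    | Feature | CloudNativePG | Crunchy Postgres for Kubernetes | OnGres StackGres | KubeDB | Zalando Postgres Operator |
    | --- | --- | --- | --- | --- | --- |
    | Backups via Kubernetes | ✔ | ✘ | ✔ | ✔ | ✘ |
    | Custom resources | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Uses default PG images | ✘ | ✔ | ✔ | ✘ | ✘ |
    | CLI access | ✔ | ✔ | ✔ | ✔ | ✘ |
    | WebUI | ✘ | ✘ | ✔ | ✔ | ✘ |
    | Tolerations | ✔ | ✔ | ✔ | ✔ | ✔ |
    | Node affinity | ✔ | ✔ | ✔ | ✔ | ✔ |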
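To show what "integrates PG into k8s" means in practice, here is a minimal sketch of a custom resource for CloudNativePG, one of the operators compared above. It assumes the operator is already installed in the cluster; the cluster name, instance count, volume size, and storage class name are hypothetical, and other operators use different resource kinds and fields.

```python
import json


def cnpg_cluster(name: str, instances: int, size: str, storage_class: str) -> dict:
    """Minimal CloudNativePG Cluster resource: one primary plus replicas."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            # Total number of Postgres instances (primary included).
            "instances": instances,
            # The operator creates one persistent volume claim per instance.
            "storage": {"size": size, "storageClass": storage_class},
        },
    }


if __name__ == "__main__":
    # Hypothetical values; pipe the JSON output to `kubectl apply -f -`.
    print(json.dumps(cnpg_cluster("pg-main", 3, "100Gi", "pg-fast"), indent=2))
```

The operator then takes care of creating the pods, setting up streaming replication, and failing over, which is the work a hand-written StatefulSet would leave to you.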
  47. Pinning and Tainting: Always use specific, dedicated machines for your database (unless you’re running super small databases).
  48. Pinning and Tainting: Always use specific, dedicated machines for your database (unless you’re running super small databases). Pin your database containers to those hosts.
  49. Pinning and Tainting: Always use specific, dedicated machines for your database (unless you’re running super small databases). Pin your database containers to those hosts. Taint the hosts to prevent anything else from running on them.
  50. Pinning and Tainting: Always use specific, dedicated machines for your database (unless you’re running super small databases). Pin your database containers to those hosts. Taint the hosts to prevent anything else from running on them (except the minimum necessary Kubernetes services, like kube-proxy).
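Pinning and tainting come down to three pieces: a node label, a node taint, and a matching `nodeSelector` plus toleration in the pod spec. The sketch below builds the pod-side fragment and includes a simplified check of whether a toleration matches a taint. The label `workload=postgres` and the taint `dedicated=postgres:NoSchedule` are hypothetical names, and `tolerates` is a reduced illustration of the scheduler's rule, not its actual implementation.

```python
import json

# Node side, run once per dedicated node (node name is a placeholder):
#   kubectl label nodes <node-name> workload=postgres
#   kubectl taint nodes <node-name> dedicated=postgres:NoSchedule


def pinned_scheduling(label_key: str, label_value: str,
                      taint_key: str, taint_value: str) -> dict:
    """Pod spec fragment: land on labeled nodes and tolerate their taint."""
    return {
        "nodeSelector": {label_key: label_value},
        "tolerations": [
            {
                "key": taint_key,
                "operator": "Equal",
                "value": taint_value,
                "effect": "NoSchedule",
            }
        ],
    }


def tolerates(taint: dict, tolerations: list) -> bool:
    """Simplified taint matching: same key, same effect, value equal or Exists."""
    for toleration in tolerations:
        same_key = toleration.get("key") == taint["key"]
        same_effect = toleration.get("effect") in (None, "", taint["effect"])
        same_value = (toleration.get("operator") == "Exists"
                      or toleration.get("value") == taint.get("value"))
        if same_key and same_effect and same_value:
            return True
    return False


if __name__ == "__main__":
    fragment = pinned_scheduling("workload", "postgres", "dedicated", "postgres")
    print(json.dumps(fragment, indent=2))
```

The taint keeps other workloads off the database nodes, while the node selector keeps the database off every other node; you need both for a dedicated setup.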