Running Your Own Postgres-as-a-service in Kubernetes | Linux Open Source Summit 2020 | Lukas Fittl

Running a database is hard. And that's especially the case when you need tens or hundreds of databases that should be configured consistently, and have capabilities such as HA, backups, monitoring, and more.

As someone who is part of a team that has built a solution in this space, leveraging Kubernetes for the underlying cluster management, this talk will be an introduction to the tools available today to run open-source databases, and in particular PostgreSQL, at scale.

We'll review what solutions exist today for achieving this, how they interact with Kubernetes concepts such as Custom Resource Definitions (CRDs), and what matters when deciding to pick a solution, or when building your own.

The target audience for this talk is both folks who need a database like PostgreSQL to run at scale, with many servers, and those who are just starting out but are interested in topics such as how to provide a database for microservices in a Kubernetes environment.

Azure Postgres

June 30, 2020

Transcript

  1. @LukasFittl Running Postgres-as-a-Service In Kubernetes

  2. @LukasFittl

  3. Why (not) run Postgres in Kubernetes?

  4. Consistency - Kubernetes manages all workloads, including databases

  5. Better Portability - Consistent deployment experience across clouds, instead of relying on cloud-specific APIs
  6. Low Latency - Co-locating Compute and Database allows certain workloads to perform better
  7. High Effort - Running anything in Kubernetes is complex, and databases are worse
  8. How to deploy Postgres in Kubernetes

  9. Postgres

  10. Postgres: High Availability, Scale Out Capabilities, Backups, Connection Pooling, Monitoring, K8S Integration
  11. Zalando Postgres Operator

  12. Crunchy Data PostgreSQL Operator

  13. PostgreSQL Hyperscale - Azure Arc

  14. Postgres: High Availability, Scaling Capabilities, Backups, Connection Pooling, Monitoring, K8S Integration
  15. K8S Integration: CLIs, Operators & API Servers; Namespace handling; Storage

  16. Zalando Postgres Operator: Operator, Postgres; CRD: postgresql

  17. Crunchy Data PostgreSQL Operator: “pgo” CLI, Operator, API Server, Postgres; CRDs: Pgcluster, Pgpolicy, Pgreplica, Pgtask
    Zalando Postgres Operator: Operator, Postgres; CRD: postgresql
  18. PostgreSQL Hyperscale - Azure Arc: “azdata” CLI, Operator, API Server, Postgres (Coordinator), Postgres (Data Node), Postgres (Data Node); CRDs: DatabaseService, DatabaseServiceTask
    Crunchy Data PostgreSQL Operator: “pgo” CLI, Operator, API Server, Postgres; CRDs: Pgcluster, Pgpolicy, Pgreplica, Pgtask
    Zalando Postgres Operator: Operator, Postgres; CRD: postgresql
  19. PostgreSQL Hyperscale - Azure Arc:

        kind: DatabaseService
        metadata:
          ...
        spec:
          docker:
            ...
          engine:
            type: Postgres
            version: 12
          monitoring:
            ...
          scale:
            shards: 2
          scheduling:
            default:
              resources:
                requests:
                  memory: 256Mi
          service:
            port: 5432
            type: NodePort
          storage:
            volumeSize: 1Gi

    Zalando Postgres Operator:

        kind: postgresql
        metadata:
          ...
        spec:
          databases:
            foo: zalando
          numberOfInstances: 2
          postgresql:
            version: "12"
          preparedDatabases:
            bar: {}
          teamId: acid
          users:
            foo_user: []
            zalando:
            - superuser
            - createdb
          volume:
            size: 1Gi

    Crunchy Data PostgreSQL Operator:

        kind: Pgcluster
        metadata:
          ...
        spec:
          ArchiveStorage: ...
          BackrestStorage: ...
          PrimaryStorage: ...
          ReplicaStorage: ...
          WALStorage: ...
          ...
          clustername: hippo
          database: hippo
          exporterport: "9187"
          limits: {}
          name: hippo
          namespace: pgo
          podAntiAffinity:
            default: preferred
          port: "5432"
          primaryhost: hippo
          replicas: "0"
          resources:
            memory: 128Mi
          rootsecretname: hippo-postgres-secret
          syncReplication: null
          tablespaceMounts: {}
          tlsOnly: false
  20. Namespace handling: Crunchy Data PostgreSQL Operator Namespace Modes
    dynamic: Operator can create, delete, update any namespaces and manage RBAC. Operator requires ClusterRole privilege.
    readonly: Namespaces need to be pre-created and RBAC pre-configured. Operator requires ClusterRole privilege.
    disabled: Deploy to single namespace, no ClusterRole privilege required.
  21. Storage
    Generally, expect Persistent Volume Claims (PVCs) to be utilized for the database storage.
    Crunchy Data PostgreSQL Operator also supports tablespaces, to utilize different storage types within the same database server (be careful when using it).
    Diagram labels: Postgres, Persistent Volume, Persistent Volume Claim, Network Storage
  22. Postgres: High Availability, Scaling Capabilities, Backups, Connection Pooling, Monitoring, K8S Integration
  23. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary), Sync Rep, Operator
  24. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary), Operator; Node Failure; step: Detect
  25. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Primary), Operator; Node Failure; step: Promote
  26. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Secondary), PG2 (Primary), Sync Rep, Operator; step: Recover
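
The Detect, Promote, Recover sequence from Scenario 1 can be sketched as a toy model. This is only an illustration of the order of role changes shown on the slides, not how any of the operators implement failover; the `Node` class and the three function names are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    role: str            # "primary" or "secondary"
    healthy: bool = True


def detect(nodes: List[Node]) -> Optional[Node]:
    """Detect: return the primary if it is unhealthy, else None."""
    for node in nodes:
        if node.role == "primary" and not node.healthy:
            return node
    return None


def promote(nodes: List[Node]) -> Optional[Node]:
    """Promote: turn the first healthy secondary into a primary."""
    for node in nodes:
        if node.role == "secondary" and node.healthy:
            node.role = "primary"
            return node
    return None


def recover(failed: Node) -> None:
    """Recover: the old primary comes back and rejoins as a secondary."""
    failed.healthy = True
    failed.role = "secondary"


nodes = [Node("PG1", "primary"), Node("PG2", "secondary")]
nodes[0].healthy = False            # node failure hits PG1

failed = detect(nodes)
new_primary = promote(nodes) if failed else None
if failed and new_primary:
    recover(failed)

for node in nodes:
    print(node.name, node.role)
```

After the run, PG1 is a secondary and PG2 is the primary, matching the final slide of the scenario.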
  27. Scenario 2: Disaster Recovery to another K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary), Sync Rep, Operator
    K8S Cluster 2: PG3 (Secondary), Async Replication, Operator
  28. Scenario 2: Disaster Recovery to another K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary), Sync Rep, Operator; Large-Scale Data Center Failure
    K8S Cluster 2: PG3 (Primary), Operator; step: Promote
  29. High Availability (HA within Same K8S Cluster / HA across K8S Clusters)
    Zalando Postgres Operator: Built-In / Manual
    Crunchy Data PostgreSQL Operator: Built-In / Manual
    PostgreSQL Hyperscale - Azure Arc: Built-In / Manual
  30. Pod Anti-Affinity
    label: failure-domain.beta.kubernetes.io/region=westus2
    failure-domain…/zone=0, failure-domain…/zone=1, failure-domain…/zone=2
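
As a rough sketch of how zone labels like these are typically used, the snippet below builds a pod anti-affinity stanza in the shape of the Kubernetes Pod `affinity` field and prints it as JSON. The zone label key is assumed to be the standard `failure-domain.beta.kubernetes.io/zone` label that the slide appears to abbreviate, and the `pg-cluster` selector label and the `anti_affinity` helper are hypothetical names used only for illustration.

```python
import json

# Assumption: the abbreviated "failure-domain…/zone" on the slide refers to
# this well-known Kubernetes node label.
ZONE_LABEL = "failure-domain.beta.kubernetes.io/zone"


def anti_affinity(cluster_name: str, topology_key: str) -> dict:
    """Build a podAntiAffinity stanza that keeps pods of one database
    cluster out of the same topology domain (here: the same zone)."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    # Hypothetical label; use whatever label your pods carry.
                    "labelSelector": {"matchLabels": {"pg-cluster": cluster_name}},
                    "topologyKey": topology_key,
                }
            ]
        }
    }


affinity = anti_affinity("hippo", ZONE_LABEL)
print(json.dumps(affinity, indent=2))
```

With a stanza like this on the database pods, the scheduler would refuse to place two pods of the same cluster in one zone; a softer `preferredDuringSchedulingIgnoredDuringExecution` variant also exists when strict separation is not required.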

  31. Postgres: High Availability, Scaling Capabilities, Backups, Connection Pooling, Monitoring, K8S Integration
  32. Backups (Local Volume Backups / Point-in-time-Restore / Offsite Backups)
    Zalando Postgres Operator: n/a / Built-in (wal-e) / Built-in (wal-e)
    Crunchy Data PostgreSQL Operator: Built-in (pgBackRest) / Built-in / Built-in (Amazon S3)
    PostgreSQL Hyperscale - Azure Arc: Built-in / Built-in / Built-in (K8S Volume Mount)
  33. Postgres: High Availability, Scaling Capabilities, Backups, Connection Pooling, Monitoring, K8S Integration
  34. Monitoring (Metrics / Logs)
    Zalando Postgres Operator: not built in / not built in
    Crunchy Data PostgreSQL Operator: Grafana / Built-in pgbadger
    PostgreSQL Hyperscale - Azure Arc: Grafana + Azure Monitor / Kibana + Azure Log Analytics
  35. PostgreSQL Hyperscale - Azure Arc

  36. Postgres: High Availability, Scaling Capabilities, Backups, Connection Pooling, Monitoring, K8S Integration
  37. Connection Pooling
    Diagram labels: Application, pgbouncer, Postgres
    pgbouncer is important for idle connection scaling in Postgres
    Idle connection in Postgres: 5-10MB
    Idle connection in pgbouncer: <1MB
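
To put the per-connection figures above in perspective, here is a back-of-the-envelope estimate. It is a sketch only: the client count of 500 is a made-up workload, `idle_memory_mb` is a hypothetical helper, and the calculation ignores the smaller pool of server connections that pgbouncer itself keeps open to Postgres.

```python
def idle_memory_mb(connections: int, mb_per_connection: float) -> float:
    """Rough memory held by idle connections, in MB."""
    if connections < 0 or mb_per_connection < 0:
        raise ValueError("inputs must be non-negative")
    return connections * mb_per_connection


# Hypothetical workload: 500 mostly idle client connections.
CLIENTS = 500

# Slide figures: 5-10 MB per idle Postgres connection, under 1 MB in pgbouncer.
postgres_low = idle_memory_mb(CLIENTS, 5)
postgres_high = idle_memory_mb(CLIENTS, 10)
pgbouncer_upper = idle_memory_mb(CLIENTS, 1)  # upper bound, the slide says "<1MB"

print(f"Postgres direct: {postgres_low:.0f}-{postgres_high:.0f} MB")
print(f"pgbouncer:       under {pgbouncer_upper:.0f} MB")
```

Under these assumptions, the same idle clients cost roughly 2.5 to 5 GB when connected directly to Postgres and less than 0.5 GB when parked in pgbouncer.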
  38. Connection Pooling (Pgbouncer)
    Zalando Postgres Operator: Built-in
    Crunchy Data PostgreSQL Operator: Built-in
    PostgreSQL Hyperscale - Azure Arc: Planned
  39. Postgres: High Availability, Scaling Capabilities, Backups, Connection Pooling, Monitoring, K8S Integration
  40. Scaling Capabilities
    Read Replicas: Help you scale the read performance; Max data size = max storage size per node
    Diagram labels: Application, Postgres, Postgres (Read Only), Postgres (Read Only)
  41. Scaling Capabilities
    Hyperscale (Citus): Scales both read and write performance; Max data size = # of data nodes * storage size per node
    Diagram labels: Application, Postgres (Coordinator), Postgres (Data Node), Postgres (Data Node)
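
The two capacity formulas from the Scaling Capabilities slides can be compared with a short calculation. The node sizes and counts below are invented for illustration, and both function names are hypothetical.

```python
def read_replica_capacity_gb(storage_per_node_gb: float, replicas: int) -> float:
    """Read replicas: every node holds a full copy, so adding replicas
    does not raise the maximum data size."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return storage_per_node_gb


def hyperscale_capacity_gb(storage_per_node_gb: float, data_nodes: int) -> float:
    """Hyperscale (Citus): data is spread across data nodes, so the
    maximum data size grows with the number of data nodes."""
    if data_nodes < 1:
        raise ValueError("need at least one data node")
    return data_nodes * storage_per_node_gb


# Hypothetical sizing: 1024 GB of storage per node.
replica_max = read_replica_capacity_gb(1024, replicas=2)
hyperscale_max = hyperscale_capacity_gb(1024, data_nodes=4)

print(f"Primary + 2 read replicas: {replica_max:.0f} GB max data size")
print(f"Coordinator + 4 data nodes: {hyperscale_max:.0f} GB max data size")
```

In this example, adding read replicas leaves the ceiling at 1024 GB, while four data nodes raise it to 4096 GB.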
  42. Grow to 100’s of database nodes, without re-architecting your application
    Scaling Up: Block growth on 1 (monolithic) database vs. Scaling Out: 18 Total Nodes
  43. Demo: Deploying & Scaling out with PostgreSQL Hyperscale - Azure Arc
  44. Thank you! lukas@fittl.com @LukasFittl