Running Your Own Postgres-as-a-service in Kubernetes | Linux Open Source Summit 2020 | Lukas Fittl

Running a database is hard, especially when you need tens or hundreds of databases that are configured consistently and offer capabilities such as HA, backups, monitoring, and more.

As part of a team that has built a solution in this space, using Kubernetes for the underlying cluster management, I will use this talk to introduce the tools available today to run open-source databases, and in particular PostgreSQL, at scale.

We'll review which solutions exist today, how they interact with Kubernetes concepts such as Custom Resource Definitions (CRDs), and what matters when picking a solution or building your own.

The target audience for this talk is both folks who need to run a database like PostgreSQL at scale, with many servers, and those who are just starting out but are interested in topics such as how to provide a database for microservices in a Kubernetes environment.

Transcript

  1. @LukasFittl
    Running Postgres-as-a-Service
    In Kubernetes

  2. Why (not) run Postgres in Kubernetes?

  3. Consistency - Kubernetes manages all workloads,
    including databases

  4. Better Portability - Consistent deployment experience
    across clouds, instead of relying on cloud-specific APIs

  5. Low Latency - Co-locating compute and database
    allows certain workloads to perform better

  6. High Effort - Running anything in Kubernetes is
    complex, and databases are worse

  7. How to deploy Postgres in Kubernetes

  8. Postgres
    High Availability
    Scale Out Capabilities
    Backups
    Connection Pooling
    Monitoring
    K8S Integration

  9. Zalando Postgres Operator

  10. Crunchy Data PostgreSQL Operator

  11. PostgreSQL Hyperscale - Azure Arc

  12. Postgres
    High Availability
    Scaling Capabilities
    Backups
    Connection Pooling
    Monitoring
    K8S Integration

  13. K8S Integration
    CLIs, Operators & API Servers
    Namespace handling
    Storage

  14. Zalando Postgres Operator
      Components: Operator, Postgres
      CRD: postgresql

  15. Crunchy Data PostgreSQL Operator
      Components: “pgo” CLI, API Server, Operator, Postgres
      CRDs: Pgcluster, Pgpolicy, Pgreplica, Pgtask
    Zalando Postgres Operator
      Components: Operator, Postgres
      CRD: postgresql

  16. PostgreSQL Hyperscale - Azure Arc
      Components: “azdata” CLI, API Server, Operator, Postgres (Coordinator), Postgres (Data Node), Postgres (Data Node)
      CRDs: DatabaseService, DatabaseServiceTask
    Crunchy Data PostgreSQL Operator
      Components: “pgo” CLI, API Server, Operator, Postgres
      CRDs: Pgcluster, Pgpolicy, Pgreplica, Pgtask
    Zalando Postgres Operator
      Components: Operator, Postgres
      CRD: postgresql

  17. Zalando Postgres Operator:

    kind: postgresql
    metadata:
      ...
    spec:
      databases:
        foo: zalando
      numberOfInstances: 2
      postgresql:
        version: "12"
      preparedDatabases:
        bar: {}
      teamId: acid
      users:
        foo_user: []
        zalando:
        - superuser
        - createdb
      volume:
        size: 1Gi

    Crunchy Data PostgreSQL Operator:

    kind: Pgcluster
    metadata:
      ...
    spec:
      ArchiveStorage:
        ...
      BackrestStorage:
        ...
      PrimaryStorage:
        ...
      ReplicaStorage:
        ...
      WALStorage:
        ...
      ...
      clustername: hippo
      database: hippo
      exporterport: "9187"
      limits: {}
      name: hippo
      namespace: pgo
      podAntiAffinity:
        default: preferred
      port: "5432"
      primaryhost: hippo
      replicas: "0"
      resources:
        memory: 128Mi
      rootsecretname: hippo-postgres-secret
      syncReplication: null
      tablespaceMounts: {}
      tlsOnly: false

    PostgreSQL Hyperscale - Azure Arc:

    kind: DatabaseService
    metadata:
      ...
    spec:
      docker:
        ...
      engine:
        type: Postgres
        version: 12
      monitoring:
        ...
      scale:
        shards: 2
      scheduling:
        default:
          resources:
            requests:
              memory: 256Mi
      service:
        port: 5432
        type: NodePort
      storage:
        volumeSize: 1Gi

  18. Namespace handling
    Crunchy Data PostgreSQL Operator namespace modes:
      dynamic: The operator can create, delete, and update any namespace and manage RBAC. The operator requires ClusterRole privilege.
      readonly: Namespaces need to be pre-created and RBAC pre-configured. The operator requires ClusterRole privilege.
      disabled: Deploy to a single namespace; no ClusterRole privilege required.
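
The three namespace modes can be compared side by side with a small lookup table. The following is a minimal sketch in Python, not part of the operator itself: the `NamespaceMode` class and the `pick_mode` helper are hypothetical names introduced for illustration, and the attributes only encode what the slide states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceMode:
    name: str
    manages_namespaces: bool   # operator creates/deletes/updates namespaces and RBAC
    needs_cluster_role: bool   # operator requires ClusterRole privilege


# Values taken from the slide; the data structure itself is illustrative.
MODES = {
    "dynamic": NamespaceMode("dynamic", manages_namespaces=True, needs_cluster_role=True),
    "readonly": NamespaceMode("readonly", manages_namespaces=False, needs_cluster_role=True),
    "disabled": NamespaceMode("disabled", manages_namespaces=False, needs_cluster_role=False),
}


def pick_mode(can_grant_cluster_role: bool, operator_manages_namespaces: bool) -> str:
    """Hypothetical helper: choose a mode from two yes/no constraints."""
    if not can_grant_cluster_role:
        # Without ClusterRole privilege only the single-namespace mode is possible.
        return "disabled"
    return "dynamic" if operator_manages_namespaces else "readonly"


if __name__ == "__main__":
    print(pick_mode(can_grant_cluster_role=False, operator_manages_namespaces=False))
```

The main decision is whether the cluster administrators are willing to grant ClusterRole privilege at all; if not, only the single-namespace mode remains.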

  19. Storage
    Generally, expect Persistent Volume Claims (PVCs) to be used for database storage.

    Crunchy Data PostgreSQL Operator also supports tablespaces, which let you use different storage types within the same database server (be careful when using them).

    Diagram labels: Postgres, Persistent Volume, Persistent Volume Claim, Network Storage
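
To make the PVC idea concrete, here is a minimal sketch that builds a PersistentVolumeClaim manifest as a plain Python dictionary and prints it as JSON (which `kubectl apply -f` accepts as well as YAML). The claim name `pgdata` and the helper `pvc_manifest` are assumptions for illustration; the operators discussed in this talk generate their own claims, and this is not their code.

```python
import json
from typing import Optional


def pvc_manifest(name: str, size: str, storage_class: Optional[str] = None) -> dict:
    """Build a core/v1 PersistentVolumeClaim requesting `size` of storage."""
    spec = {
        # ReadWriteOnce: the volume is mounted read-write by a single node,
        # the usual choice for one Postgres instance.
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Omitting storageClassName lets the cluster's default class apply.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    # "pgdata" is a hypothetical claim name; 1Gi matches the sizes on slide 17.
    print(json.dumps(pvc_manifest("pgdata", "1Gi"), indent=2))
```

The size passed here corresponds to fields such as `volume.size` or `storage.volumeSize` in the custom resources shown on slide 17.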

  20. Postgres
    High Availability
    Scaling Capabilities
    Backups
    Connection Pooling
    Monitoring
    K8S Integration

  21. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary), Sync Rep
    Operator

  22. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary)
    Node Failure
    Operator: Detect

  23. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Primary)
    Node Failure
    Operator: Promote

  24. Scenario 1: Automated Failover within a K8S cluster
    K8S Cluster 1: PG1 (Secondary), PG2 (Primary), Sync Rep
    Operator: Recover
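
The detect, promote, recover sequence in Scenario 1 can be sketched as a tiny state machine. This is a toy model under simplifying assumptions (health is a boolean flag, there is no fencing, replication is not modeled); the `Node` and `Operator` classes are hypothetical and do not reflect how any of the operators in this talk are implemented.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    name: str
    role: str            # "primary" or "secondary"
    healthy: bool = True


class Operator:
    """Toy stand-in for an operator's failover loop."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes: Dict[str, Node] = {n.name: n for n in nodes}

    def _healthy_primary_exists(self, exclude: str = "") -> bool:
        return any(
            n.role == "primary" and n.healthy and n.name != exclude
            for n in self.nodes.values()
        )

    def reconcile(self) -> List[str]:
        """Detect a missing primary and promote the first healthy secondary."""
        if self._healthy_primary_exists():
            return []
        events = ["detect: no healthy primary"]
        candidates = [n for n in self.nodes.values() if n.role == "secondary" and n.healthy]
        if not candidates:
            events.append("stuck: no healthy secondary to promote")
            return events
        candidates[0].role = "primary"
        events.append(f"promote: {candidates[0].name}")
        return events

    def recover(self, name: str) -> str:
        """Bring a failed node back; it rejoins as secondary if another primary exists."""
        node = self.nodes[name]
        node.healthy = True
        if node.role == "primary" and self._healthy_primary_exists(exclude=name):
            node.role = "secondary"
        return f"recover: {name} is {node.role}"


if __name__ == "__main__":
    pg1, pg2 = Node("PG1", "primary"), Node("PG2", "secondary")
    op = Operator([pg1, pg2])
    pg1.healthy = False          # simulated node failure
    print(op.reconcile())
    print(op.recover("PG1"))
```

A real operator additionally has to make sure the old primary cannot accept writes before it promotes a replacement; that step is deliberately left out of this sketch.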

  25. Scenario 2: Disaster Recovery to another K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary), Sync Rep, Operator
    K8S Cluster 2: PG3 (Secondary), Async Replication, Operator

  26. Scenario 2: Disaster Recovery to another K8S cluster
    K8S Cluster 1: PG1 (Primary), PG2 (Secondary), Sync Rep, Operator
    Large-Scale Data Center Failure
    K8S Cluster 2: PG3 (Primary), Operator
    Promote

  27. High Availability
                                         HA within Same K8S Cluster   HA across K8S Clusters
    Zalando Postgres Operator            Built-In                     Manual
    Crunchy Data PostgreSQL Operator     Built-In                     Manual
    PostgreSQL Hyperscale - Azure Arc    Built-In                     Manual

  28. Pod Anti-Affinity
    label: failure-domain.beta.kubernetes.io/region=westus2
    failure-domain…/zone=0 failure-domain…/zone=1 failure-domain…/zone=2
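
As an illustration of how zone labels feed into scheduling, the sketch below builds the `affinity` section of a pod spec that asks Kubernetes to spread pods of the same database cluster across zones. The pod label key `cluster-name` and the value `hippo` are assumptions for this example; the topology key is the zone label from the slide (newer Kubernetes versions use `topology.kubernetes.io/zone` instead). The default of "preferred" mirrors the `podAntiAffinity: default: preferred` setting visible in the Pgcluster resource on slide 17.

```python
import json


def zone_anti_affinity(
    cluster_name: str,
    required: bool = False,
    topology_key: str = "failure-domain.beta.kubernetes.io/zone",
) -> dict:
    """Return a pod-spec `affinity` section that keeps matching pods in different zones."""
    term = {
        # "cluster-name" is a hypothetical pod label; use whatever label
        # identifies the pods of one Postgres cluster in your setup.
        "labelSelector": {"matchLabels": {"cluster-name": cluster_name}},
        "topologyKey": topology_key,
    }
    if required:
        # Hard rule: a pod stays unschedulable if no other zone is available.
        rules = {"requiredDuringSchedulingIgnoredDuringExecution": [term]}
    else:
        # Soft rule: the scheduler tries to spread pods but may co-locate them.
        rules = {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {"weight": 100, "podAffinityTerm": term}
            ]
        }
    return {"affinity": {"podAntiAffinity": rules}}


if __name__ == "__main__":
    print(json.dumps(zone_anti_affinity("hippo"), indent=2))
```

The soft ("preferred") variant keeps a database schedulable in small clusters, while the hard ("required") variant guarantees separation at the cost of pods that may stay pending.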

  29. Postgres
    High Availability
    Scaling Capabilities
    Backups
    Connection Pooling
    Monitoring
    K8S Integration

  30. Backups
                                         Local Volume Backups     Point-in-time-Restore    Offsite Backups
    Zalando Postgres Operator            n/a                      Built-in (wal-e)         Built-in (wal-e)
    Crunchy Data PostgreSQL Operator     Built-in (pgBackRest)    Built-in                 Built-in (Amazon S3)
    PostgreSQL Hyperscale - Azure Arc    Built-in                 Built-in                 Built-in (K8S Volume Mount)

  31. Postgres
    High Availability
    Scaling Capabilities
    Backups
    Connection Pooling
    Monitoring
    K8S Integration

  32. Monitoring
                                         Metrics                   Logs
    Zalando Postgres Operator            not built in              not built in
    Crunchy Data PostgreSQL Operator     Grafana                   Built-in pgbadger
    PostgreSQL Hyperscale - Azure Arc    Grafana + Azure Monitor   Kibana + Azure Log Analytics

  33. PostgreSQL Hyperscale - Azure Arc

  34. Postgres
    High Availability
    Scaling Capabilities
    Backups
    Connection Pooling
    Monitoring
    K8S Integration

  35. Connection Pooling
    Diagram labels: Application, pgbouncer, Postgres
    pgbouncer is important for idle connection scaling in Postgres
    Idle connection in Postgres: 5-10 MB
    Idle connection in pgbouncer: <1 MB
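
A quick back-of-the-envelope calculation shows why these per-connection figures matter. The sketch below uses the numbers from the slide; the client count (1000) and the pool size (20 server connections kept open by pgbouncer) are hypothetical values chosen for illustration, and the pooled figure is an upper bound because the slide gives "<1 MB" per pgbouncer connection.

```python
POSTGRES_IDLE_MB = (5, 10)   # per idle Postgres connection, range from the slide
PGBOUNCER_IDLE_MB = 1        # slide says "<1 MB", so treat this as an upper bound


def estimate_idle_memory(clients: int, pool_size: int) -> dict:
    """Estimate memory for `clients` mostly idle connections, with and without pooling."""
    low, high = POSTGRES_IDLE_MB
    return {
        # Every client holds its own Postgres backend.
        "direct_mb": (clients * low, clients * high),
        # Clients connect to pgbouncer; only `pool_size` backends exist in Postgres.
        "pooled_mb_upper_bound": clients * PGBOUNCER_IDLE_MB + pool_size * high,
    }


if __name__ == "__main__":
    # Hypothetical workload: 1000 application connections, pool of 20 backends.
    print(estimate_idle_memory(clients=1000, pool_size=20))
```

With these assumed numbers, connecting directly costs roughly 5 to 10 GB for idle connections alone, while the pooled setup stays around 1.2 GB or less.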

  36. Connection Pooling
                                         Pgbouncer
    Zalando Postgres Operator            Built-in
    Crunchy Data PostgreSQL Operator     Built-in
    PostgreSQL Hyperscale - Azure Arc    Planned

  37. Postgres
    High Availability
    Scaling Capabilities
    Backups
    Connection Pooling
    Monitoring
    K8S Integration

  38. Scaling Capabilities
    Read Replicas:
    Help you scale read performance; max data size = max storage size per node
    Diagram labels: Application, Postgres, Postgres (Read Only), Postgres (Read Only)

  39. Scaling Capabilities
    Hyperscale (Citus):
    Scales both read and write performance; max data size = number of data nodes * storage size per node
    Diagram labels: Application, Postgres (Coordinator), Postgres (Data Node), Postgres (Data Node)
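
The two capacity formulas from the scaling slides can be written out directly. This is a minimal sketch of the arithmetic only; the function names and the 512 GB per-node figure are assumptions for illustration, and real capacity also depends on replication, indexes, and free-space headroom that the slide formulas do not cover.

```python
def max_data_size_read_replicas(storage_per_node_gb: float, replicas: int) -> float:
    """Every replica holds a full copy, so adding replicas does not add capacity."""
    return storage_per_node_gb


def max_data_size_hyperscale(data_nodes: int, storage_per_node_gb: float) -> float:
    """Data is distributed across data nodes, so capacity grows with the node count."""
    return data_nodes * storage_per_node_gb


if __name__ == "__main__":
    # Hypothetical sizing: 512 GB of storage per node.
    print(max_data_size_read_replicas(512, replicas=2))   # primary + 2 read replicas
    print(max_data_size_hyperscale(2, 512))               # coordinator + 2 data nodes
```

With the same per-node storage, two read replicas leave the maximum data size at 512 GB, while two data nodes double it to 1024 GB.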

  40. Scaling Up vs. Scaling Out
    Grow to 100s of database nodes, without re-architecting your application
    Scaling Up: Block growth on 1 (monolithic) database
    Scaling Out: 18 Total Nodes
  41. Demo
    Deploying & Scaling out with
    PostgreSQL Hyperscale - Azure Arc
