PostgreSQL on Kubernetes at Scale

Prague PostgreSQL Meetup talk, 28.05.2018

Oleksii Kliukin

May 28, 2018

Transcript

  1. PostgreSQL on Kubernetes at Scale
    Prague PostgreSQL Meetup
    Oleksii Kliukin
    28-05-2018

  2. About me
    Oleksii Kliukin
    Postgres Engineer @ Zalando
    [email protected]
    Twitter: @hintbits

  3. TABLE OF CONTENTS
    A brief history of PostgreSQL at Zalando
    Live DEMO (what can possibly go wrong?)
    How to stop worrying (and embrace Patroni)
    Kubernetes: the real thing
    What is in the name: Postgres Operator

  4. ZALANDO: WE BRING FASHION TO PEOPLE IN 15 COUNTRIES

  5. ZALANDO AT A GLANCE
    as at May 2017
    • > 300 databases in data centers
    • > 150 Postgres clusters on AWS EC2
    • > 200 Postgres clusters on Kubernetes

  6. Let’s start with names
    [Diagram: a PostgreSQL (HA) cluster is a primary plus two standbys connected by streaming replication; a PostgreSQL instance holds the databases template0, template1 and postgres]

  7. Brief history of PostgreSQL at Zalando

  8. Vintage (DC) and modern (AWS) PostgreSQL environments
    [Diagram: DC1 and DC2 (one hour delay) with an NFS WAL archive; AWS VPC]

  9. Should you run your PostgreSQL inside a container?

  10. Spilo Docker image at Zalando
    • PGDATA on an external volume (EBS or i3/c5 NVMe)
    • Configuration based on environment variables
    • One container per EC2 instance
    • PostgreSQL versions from 9.4 up to 10
    • Plenty of extensions (all contrib, PostGIS, TimescaleDB, PL/V8, pg_cron, etc.)
    • Additional tools (pgbouncer, pgq)
    • Extremely lightweight (69MB)

  11. github.com/zalando/spilo

  12. [Diagram: Spilo deployment on AWS as a Cloud Formation stack]
    • Auto-Scaling group inside the cluster security group, spanning Availability Zones A, B and C
    • One Master DB and two Replica DBs, each with a root volume and a data volume
    • Master Elastic IP: db.zalando, ports 5432, 8008
    • Replica Elastic Load Balancer (in the Replica ELB security group): db-repl.zalando, ports 5432, 8008, GET /replica
    • S3 bucket: Backup + WAL
    • Etcd
    • User Data: Docker image, backup schedule, superuser password, replication password, Postgres parameters

  13. Patroni is a secret ingredient to make it all work

  14. What is Patroni
    • Automatic failover solution for PostgreSQL streaming replication
    • A daemon that manages one PostgreSQL instance
    • Keeps the state of the cluster in a DCS (distributed configuration store: Etcd, Zookeeper, Consul, Kubernetes), also referred to as a consistency layer
    • For new instances, decides whether to initialize a new cluster or join an existing one
    • For running instances, executes promotion/demotion when necessary
    • A number of additional related functions (global configuration, scheduled actions, pause mode, pg_rewind support, etc.)
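
The decisions listed above can be pictured as one function that maps what a member sees in the DCS to a single action. This is only a simplified sketch and not Patroni's actual code: the function name, its parameters and the action strings are all hypothetical, and real Patroni handles many more cases (pause mode, pg_rewind, timeouts).

```python
def next_action(me, has_data_dir, cluster_initialized, leader, running_as):
    """Pick one action for a member from the state it reads in the DCS.

    leader is the member name holding the leader key, or None if it is free.
    running_as is "primary", "replica", or None when Postgres is not running.
    All names here are illustrative assumptions, not Patroni identifiers.
    """
    if not has_data_dir:
        # New instance: bootstrap a fresh cluster or copy an existing one.
        if cluster_initialized:
            return "join existing cluster"
        return "initialize new cluster"
    if leader == me:
        # We hold the key: make sure Postgres really runs as primary.
        return "keep leader key" if running_as == "primary" else "promote"
    if leader is None:
        return "race for leader key"
    # Another member holds the key, so this one must not act as primary.
    return "demote" if running_as == "primary" else "follow leader"


fresh = next_action("pg-0", False, False, None, None)
joiner = next_action("pg-1", False, True, "pg-0", None)
stale_primary = next_action("pg-0", True, True, "pg-1", "primary")
new_leader = next_action("pg-1", True, True, "pg-1", "replica")
print(fresh, joiner, stale_primary, new_leader, sep=" | ")
```

The point of the sketch is that every member runs the same logic against the same shared state, so no separate arbiter process is needed.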

  15. What Patroni is not
    • Not an arbiter for the whole HA cluster
    • Not a Swiss Army knife of Postgres maintenance
    • Not a substitute for proper monitoring
    • Not a tool to use if you don’t understand how Etcd (or another DCS that you use) works
    • Not a silver bullet (but tries to balance ease of use vs extensibility)
    • Not just an internal project of Zalando (IBM Compose, Red Hat and many other companies use it)

  16. Why distributed consistency?
    [Diagram: three primary candidates each send "Take leader" to the Etcd cluster]
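
The race in this slide works because the store offers an atomic "create the key only if it does not exist" operation, so exactly one candidate can win. A minimal in-memory sketch of that idea follows; `TinyDCS` and `run_election` are hypothetical stand-ins written for illustration and do not use a real Etcd client.

```python
import threading


class TinyDCS:
    """In-memory stand-in for a DCS such as Etcd (hypothetical, for illustration)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._leader = None

    def take_leader(self, candidate):
        """Atomically create the leader key; succeed only if it is still free."""
        with self._lock:
            if self._leader is None:
                self._leader = candidate
                return True
            return False

    def leader(self):
        with self._lock:
            return self._leader


def run_election(candidates):
    """Let all candidates race for the leader key; return the leader and per-candidate results."""
    dcs = TinyDCS()
    results = {}

    def campaign(name):
        results[name] = dcs.take_leader(name)

    threads = [threading.Thread(target=campaign, args=(c,)) for c in candidates]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dcs.leader(), results


leader, results = run_election(["node-a", "node-b", "node-c"])
winners = [name for name, won in results.items() if won]
print(f"leader={leader}, winners={winners}")
```

Which candidate wins is not deterministic, but there is always exactly one winner. A real DCS additionally attaches a time-to-live to the key so that a dead leader loses it automatically.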

  17. github.com/zalando/patroni

  18. What is Kubernetes?
    • A set of open-source components running on one or more servers
    • A container orchestration system
    • An abstraction layer over your real or virtualized hardware
    • An “infrastructure as code” system
    • Automatic resource allocation
    • Next step after Ansible/Chef/Puppet

  19. What Kubernetes is not
    • An operating system
    • A magical way to make your infrastructure scalable
    • An excuse to fire your devops (someone has to configure it)
    • A good solution for running 2-3 servers

  20. Terminology: traditional DC compared to Kubernetes
    Traditional infrastructure → Kubernetes
    • Physical server → Node
    • Virtual machine → Pod
    • Individual application → Container
    • NAS/SAN → Persistent Volumes
    • Load balancer → Service/Endpoint
    • Application registry/hardware information → Labels
    • Password files → Secrets

  21. Declarative resource description (manifest)
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      ports:
      - port: 80
        name: web
      clusterIP: None
      selector:
        app: nginx

  22. Building a PostgreSQL cluster on Kubernetes
    • A StatefulSet to bind pods with persistent volumes and provide auto-recovery
    • A service to route client connections
    • Spilo as a Docker container (Patroni + PostgreSQL) for HA
    • Secrets to store database user passwords

  23. Manual deployment of HA PostgreSQL cluster on Kubernetes
    • At least four long YAML manifests to write
    • Different parts of PostgreSQL configuration spread over multiple manifests
    • No easy way to work with a cluster as a whole (update, delete)
    • Manual generation of DB objects, e.g. users and their passwords

  24. Initial approach to automation: Helm
    • A template for your manifests
    • Only one place to fill in deployment-related values
    • Requires running a special pod (tiller) in your Kubernetes cluster
    github.com/kubernetes/charts/blob/master/incubator/patroni

  25. Kubernetes operator pattern
    • Implement a controller application to act on custom resources
    • CRD (custom resource definition) to describe a domain-specific object (e.g. a Postgres cluster)
    • Encapsulates knowledge of a human operating the service
    https://coreos.com/blog/introducing-operators.html
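
At the core of such a controller is a reconcile step: compare the objects the custom resource asks for with the objects that exist, and derive the actions that close the gap. The sketch below illustrates that step with plain dictionaries; the function and the object names are hypothetical and no Kubernetes client is involved.

```python
def reconcile(desired, actual):
    """Return the actions that bring `actual` objects in line with `desired`.

    Both arguments map an object name to its spec (plain dicts here).
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical objects derived from one custom resource.
desired = {
    "statefulset/acid-minimal-cluster": {"replicas": 2},
    "service/acid-minimal-cluster": {"port": 5432},
}
actual = {
    "statefulset/acid-minimal-cluster": {"replicas": 1},
    "service/old-cluster": {"port": 5432},
}
actions = reconcile(desired, actual)
for action in actions:
    print(action)
```

A real operator runs this comparison every time it is notified of a change to a watched resource, and usually also on a periodic resync.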

  26. Zalando Postgres operator
    • Defines a custom Postgres resource
    • Watches instances of Postgres, creates/updates/deletes corresponding Kubernetes objects
    • Allows updating running-cluster resources (memory, CPU, volumes) and Postgres configuration
    • Creates databases and users, and automatically generates passwords
    • Auto-repairs, smart rolling updates (switchover to replicas before updating the master)
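
The "smart rolling update" bullet amounts to an ordering rule: restart the replicas first, switch the master role over to an already updated replica, and only then restart the former master. The sketch below shows only that ordering; it is not the operator's implementation, and the function name, pod names and step tuples are hypothetical.

```python
def rolling_update_plan(pods, master):
    """Order the steps so the master pod is restarted last, after a switchover."""
    if master not in pods:
        raise ValueError(f"{master} is not a member of {pods}")
    replicas = [p for p in pods if p != master]
    plan = [("restart", p) for p in replicas]
    if replicas:
        # Hand the master role to an already updated replica first.
        plan.append(("switchover", master, replicas[0]))
    plan.append(("restart", master))
    return plan


plan = rolling_update_plan(["pg-0", "pg-1", "pg-2"], master="pg-1")
for step in plan:
    print(step)
```

With this order, clients see one short switchover instead of a full restart of the master.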

  27. Simple Postgres manifest
    apiVersion: "acid.zalan.do/v1"
    kind: postgresql
    metadata:
      name: acid-minimal-cluster
    spec:
      teamId: "ACID"
      volume:
        size: 1Gi
      numberOfInstances: 2
      users:
        # database owner
        zalando:
        - superuser
        - createdb
        # role for application foo
        foo_user:
      # databases: name->owner
      databases:
        foo: zalando
      postgresql:
        version: "10"

  28. Just a piece of cake
    • Operator starts pods with Spilo Docker image
    • Operator provides environment variables to Spilo
    • Operator makes sure all Kubernetes objects are in sync
    • Spilo generates Patroni configuration
    • Patroni creates roles and configures PostgreSQL
    • Patroni makes sure there is only one master
    • Patroni uses Kubernetes for cluster state and leader lock
    • Patroni creates roles and applies configuration
    • Patroni changes service endpoints on failover

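  29. [Diagram: inside the Kubernetes cluster, a DB deployer deploys the cluster manifest; the operator pod, set up with the operator config map and infrastructure roles, watches it and creates the stateful set (Spilo pod running Patroni), the service with its endpoint, and the cluster secrets; the client application connects through the service]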

  30. Should you run your PostgreSQL clusters on Kubernetes?
    Strong interest in the community
    • Zalando Postgres Operator
    • CrunchyData Postgres Operator
    • Red Hat Project Atomic
    • KubeDB
    • Project Habitat

  31. Why not AWS RDS or Aurora PostgreSQL?
    Not an easy answer :)
    Full control
    • Independent of cloud provider
    • Real superuser available
    • Custom extensions, PAM
    • Streaming/WAL replication in and out
    • Local storage (NVMe SSDs), which is not supported on RDS
    Costs? Cost of development? ...

  32. Bonus projects
    • Kubernetes-native Patroni
    • PAM OAuth2 for PostgreSQL

  33. Using Kubernetes as a consistency store
    ● Use annotations on:
    ○ Pods for cluster members
    ○ Dedicated Endpoint for cluster configuration
    ○ Service-related Endpoint for leader information
    ● Reliability: always use Endpoints
    ● Compatibility mode: use ConfigMaps, not Endpoints
    http://patroni.readthedocs.io/en/latest/kubernetes.html
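
Kubernetes objects have no built-in expiry, but every object carries a resourceVersion, and an update sent with a stale version is rejected. That is enough to build a leader lock out of annotations. The sketch below imitates this with an in-memory object; `FakeEndpoint`, `try_acquire` and the annotation names `leader` and `renewTime` are assumptions made for illustration and are not taken from Patroni's source or from a Kubernetes client library.

```python
class Conflict(Exception):
    """Raised when an update carries a stale resourceVersion."""


class FakeEndpoint:
    """In-memory stand-in for a Kubernetes object with annotations."""

    def __init__(self):
        self.resource_version = 1
        self.annotations = {}

    def read(self):
        return self.resource_version, dict(self.annotations)

    def replace(self, expected_version, annotations):
        # Mimics the API server rejecting a write based on an outdated read.
        if expected_version != self.resource_version:
            raise Conflict("object was modified")
        self.annotations = dict(annotations)
        self.resource_version += 1


def try_acquire(endpoint, member, now, ttl=30):
    """Take the leader annotation if it is free, expired, or already ours."""
    version, ann = endpoint.read()
    holder = ann.get("leader")
    expired = now - float(ann.get("renewTime", 0)) > ttl
    if holder not in (None, member) and not expired:
        return False
    ann.update({"leader": member, "renewTime": str(now)})
    try:
        endpoint.replace(version, ann)
    except Conflict:
        return False  # somebody else updated the object first
    return True


ep = FakeEndpoint()
first = try_acquire(ep, "pg-0", now=100.0)     # key is free
second = try_acquire(ep, "pg-1", now=110.0)    # held by pg-0, not expired
renewal = try_acquire(ep, "pg-0", now=120.0)   # holder renews its own lock
takeover = try_acquire(ep, "pg-1", now=200.0)  # lock was not renewed in time
print(first, second, renewal, takeover, ep.annotations["leader"])
```

The expiry is emulated by the clients themselves: the holder must keep rewriting `renewTime`, and any other member may take over once the timestamp is older than the TTL.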

  34. OAUTH2 PAM authentication
    ● PAM module written in C
    ● Open-source: https://github.com/CyberDem0n/pam-oauth2
    ● Equivalent of arbitrarily long, automatically generated, auto-expiring passwords
    ● Can supply arbitrary key=value pairs to check in the OAuth response (e.g. realm=/employees)
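
The key=value check can be illustrated outside of C. The sketch below assumes the token info endpoint returns a flat JSON object and that login succeeds only when the login field matches the user name and every required pair is present with the expected value. It mirrors the idea only: the function names are hypothetical and the real module's behaviour may differ in detail.

```python
import json


def parse_requirements(pairs):
    """Turn ['realm=/employees'] into {'realm': '/employees'}."""
    required = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        required[key] = value
    return required


def token_allows_login(tokeninfo_json, login_field, username, pairs):
    """Check a token info response against the login name and required pairs."""
    info = json.loads(tokeninfo_json)
    if info.get(login_field) != username:
        return False
    required = parse_requirements(pairs)
    return all(k in info and str(info[k]) == v for k, v in required.items())


# Hypothetical token info response for an employee token.
response = '{"uid": "jdoe", "realm": "/employees"}'
employee_ok = token_allows_login(response, "uid", "jdoe", ["realm=/employees"])
wrong_realm = token_allows_login(response, "uid", "jdoe", ["realm=/services"])
wrong_user = token_allows_login(response, "uid", "postgres", ["realm=/employees"])
print(employee_ok, wrong_realm, wrong_user)
```

Because the token itself is sent as the password, it behaves like a long generated password that stops working when the token expires.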

  35. OAUTH2 PAM authentication
    Operator configuration:
      pam_configuration: https://info.example.com/oauth2/tokeninfo?access_token= uid realm=/employees
      pam_role_name: users
    The operator sets the PAM_OAUTH2 Spilo environment variable and adds a line to pg_hba.conf:
      hostssl all +users all pam
    Spilo writes /etc/pam.d/postgresql using the PAM_OAUTH2 value.

  36. Made possible by great people inside and outside of Zalando
    Patroni and Spilo: github.com/zalando/patroni, github.com/zalando/spilo
    Alexander Kukushkin, Ants Aasma, Feike Steenbergen, Josh Berkus
    Postgres Operator: github.com/zalando-incubator/postgres-operator
    Murat Kabilov, Sergey Dudoladov, Manuel Gómez
    PAM OAuth2: https://github.com/CyberDem0n/pam-oauth2
    Alexander Kukushkin
    Put it all together in a sane way:
    Jan Mußler

  37. “HIRE THE BEST PEOPLE YOU CAN, AND GET OUT OF THEIR WAY.”

  38. We are hiring Database Engineers
    https://jobs.zalando.com/jobs/570376-database-engineer-postgresql
