Traditional Databases in Kubernetes

Describes how traditional databases such as PostgreSQL clusters are redesigned to run on Kubernetes in a way that suits its cloud-native nature.

Kunal Kushwaha

January 16, 2020

Transcript

  1. 2.

    About Me
    • Senior Engineer @ NTT Open Source Software Center
    • Contributes to OSS container projects such as podman and KubeVirt
    • Docker Community Leader & Meetup Organizer
    • Did a detailed evaluation of legacy application migration to Kubernetes using KubeVirt
      ◦ Presented at Open Source Summit, Japan (2019)
      ◦ https://events19.linuxfoundation.org/wp-content/uploads/2018/07/Running-Legacy-VMs-with-Kubernetes.pdf
    • Currently working on fixing issues in KubeVirt for smooth migration of traditional DB and HA applications to Kubernetes
  2. 3.

    Major bottlenecks for DB migration to Kubernetes
    • Lack of virtual IPs on custom networks
      ◦ Traditionally, in an HA architecture, the data path and the control path each have their own network.
      ◦ A K8s Service cannot be bound to secondary networks created through a meta CNI such as Multus.
    • Lack of a fencing mechanism for data consistency
      ◦ Shared storage is quite popular in traditional HA architectures.
      ◦ A fencing mechanism is not yet available in Kubernetes, which makes it difficult for traditional HA applications to migrate without modification.
    • Lack of static IPs
      ◦ Static IPs were heavily used in traditional infrastructure.
      ◦ They are hard to get, especially in non-cloud infrastructure, and are very costly in the cloud too.
  3. 4.

    Alternative solutions for Kubernetes
    • Replace shared disk with streaming replication.
      ◦ Avoid shared disk, because no reliable fencing mechanism is available.
      ◦ Scales better (even to multi-region level).
    • Run HA packages in a side-car.
      ◦ Move HA packages out of the DB containers into side-car containers within the same pod.
      ◦ The side-car runs alongside each DB instance and writes the DB state to an external key-value store.
      ◦ It participates in leader election using the state stored in the external key-value store.
    • Store cluster state outside the DB cluster.
      ◦ Storing cluster state outside the DB cluster helps to deal with split-brain conditions.
      ◦ Cluster state can be stored in a distributed and strongly consistent key-value store, so it can be accessed independently of the DB pods.
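The side-car idea above can be illustrated with a minimal Python sketch. It is not taken from Patroni or any operator: `StateStore` is an in-memory stand-in for an external key-value store, and the key layout `/<cluster>/members/<name>` and the field names `role` and `wal_position` are assumptions made for this example only.

```python
import json
from typing import Dict


class StateStore:
    """In-memory stand-in for an external key-value store (assumption:
    a real deployment would use a distributed, strongly consistent store)."""

    def __init__(self) -> None:
        self._kv: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._kv[key] = value

    def get_prefix(self, prefix: str) -> Dict[str, str]:
        # Return every key/value pair whose key starts with the prefix.
        return {k: v for k, v in self._kv.items() if k.startswith(prefix)}


class Sidecar:
    """Hypothetical side-car: runs next to one DB instance and reports its state."""

    def __init__(self, store: StateStore, cluster: str, member: str) -> None:
        self.store = store
        self.key = f"/{cluster}/members/{member}"  # hypothetical key layout

    def report(self, role: str, wal_position: int) -> None:
        # Publish the local DB state so that it lives outside the DB pod.
        state = {"role": role, "wal_position": wal_position}
        self.store.put(self.key, json.dumps(state))


def cluster_view(store: StateStore, cluster: str) -> Dict[str, dict]:
    """Read the whole cluster state without contacting any DB pod."""
    prefix = f"/{cluster}/members/"
    return {
        key[len(prefix):]: json.loads(value)
        for key, value in store.get_prefix(prefix).items()
    }
```

Because the state is kept in the store rather than in the DB pods, `cluster_view` still returns the last reported state of a member whose pod has disappeared, which is what lets the remaining members reason about the cluster during a failure.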
  4. 5.
  5. 6.

    Popular open source DB Operators
    • zalando/postgres-operator
      ◦ Built by Zalando, a European e-commerce company.
      ◦ Based on zalando/patroni, a template for PostgreSQL HA with ZooKeeper, etcd, or Consul.
      ◦ Production quality.
    • sorintlab/stolon
      ◦ Developed by sorintlab for running PostgreSQL in cloud native environments.
      ◦ Similar in design to Patroni, but an independent implementation.
    • CrunchyData/postgres-operator
      ◦ Developed by a team of PostgreSQL veterans.
      ◦ Uses zalando/patroni for HA functionality since the latest release.
  6. 7.

    Zalando PG Operator
    The operator creates & manages PostgreSQL HA clusters.
    Built using open source projects:
    • Patroni: a Python daemon that manages one PostgreSQL instance.
      ◦ It keeps cluster state in a distributed & strongly consistent key-value store.
    • Spilo: a Docker image that bundles PostgreSQL and Patroni together.
  7. 9.

    HA Implementation
    As PostgreSQL cannot talk to Kubernetes directly, Patroni manages PostgreSQL.
    • Patroni runs alongside PostgreSQL & keeps cluster state in a distributed & strongly consistent key-value store such as etcd.
    • The leader node's name is set as the value of the leader key, which expires after a predefined TTL.
    • The leader node updates the leader key more often than the TTL, preventing expiration.
    • A non-leader node is not allowed to update the leader key with its name.
    • Each instance watches the leader key.
    • Once the leader key expires, each remaining instance decides if it is “healthy enough” ★ to become leader.
    • The first “healthy” ★ instance that creates the leader key with its name becomes leader.
    ★ A member is never healthy if the old master is still running.
    ★ A member is never healthy if its WAL is behind some other member, or if it is too far behind the last known master position.
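The leader-key rules above can be sketched in Python as follows. This is a simplified model under stated assumptions, not Patroni's actual code: `LeaseStore` is an in-memory stand-in for a store such as etcd, time is passed in explicitly as `now` so the behaviour is deterministic, and the names `Member`, `is_healthy`, `run_cycle`, `LEADER_KEY`, and `max_lag` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

LEADER_KEY = "leader"  # hypothetical key name


class LeaseStore:
    """In-memory stand-in for a consistent key-value store with expiring keys."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry time)

    def get(self, key: str, now: float) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            return None  # key is missing or its TTL has passed
        return entry[0]

    def create(self, key: str, value: str, ttl: float, now: float) -> bool:
        # Succeeds only if no live key exists: the first creator wins.
        if self.get(key, now) is not None:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def refresh(self, key: str, value: str, ttl: float, now: float) -> bool:
        # Only the current holder may extend the TTL of the key.
        if self.get(key, now) != value:
            return False
        self._data[key] = (value, now + ttl)
        return True


@dataclass
class Member:
    name: str
    wal_position: int
    old_leader_running: bool = False  # True if this member still sees the old leader up


def is_healthy(member: Member, members: List[Member],
               last_leader_wal: int, max_lag: int) -> bool:
    """Apply the two health rules listed above."""
    if member.old_leader_running:
        return False
    if any(other.wal_position > member.wal_position for other in members):
        return False  # some other member has more WAL
    return last_leader_wal - member.wal_position <= max_lag


def run_cycle(store: LeaseStore, member: Member, members: List[Member],
              now: float, ttl: float = 30.0,
              last_leader_wal: int = 0, max_lag: int = 0) -> str:
    """One pass of the HA loop for a single member; returns its resulting role."""
    leader = store.get(LEADER_KEY, now)
    if leader == member.name:
        store.refresh(LEADER_KEY, member.name, ttl, now)  # keep the key alive
        return "leader"
    if leader is not None:
        return "replica"  # someone else holds a live leader key
    # The leader key has expired: healthy members race to create it.
    if is_healthy(member, members, last_leader_wal, max_lag) and \
            store.create(LEADER_KEY, member.name, ttl, now):
        return "leader"
    return "replica"
```

Calling `run_cycle` for each member at a regular interval shorter than `ttl` mirrors the slide: the leader keeps refreshing the key, replicas cannot overwrite it, and a new leader appears only after the key has expired and a healthy member creates it first.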
  8. 16.

    Credits & References
    Diagram credits
    • PostgreSQL and Kubernetes: DBaaS without a vendor-lock
    Referenced materials
    • Blue elephant on-demand: PostgreSQL + Kubernetes, FOSDEM 2018, Brussels
    • zalando/patroni: A template for PostgreSQL High Availability with ZooKeeper, etcd, or Consul
    • An introduction to stolon: cloud native PostgreSQL high availability
    • sorintlab/stolon: PostgreSQL cloud native High Availability and more.
    • https://crunchydata.github.io/postgres-operator/latest/gettingstarted/design/designoverview/
    • Crunchy Data Container Suite Documentation
  9. 17.