Slide 1

Slide 1 text

Traditional Databases in Kubernetes
Kunal Kushwaha
@kunalkushwaha

Slide 2

Slide 2 text

About Me
● Senior Engineer @ NTT Open Source Software Center
● Contributes to OSS container projects such as Podman and KubeVirt
● Docker Community Leader & Meetup Organizer
● Carried out a detailed evaluation of legacy application migration to Kubernetes using KubeVirt.
  ○ Presented at Open Source Summit, Japan (2019)
  ○ https://events19.linuxfoundation.org/wp-content/uploads/2018/07/Running-Legacy-VMs-with-Kubernetes.pdf
● Currently working on fixing issues in KubeVirt for smooth migration of traditional DB and HA applications to Kubernetes.

Slide 3

Slide 3 text

Major bottlenecks for DB migration in Kubernetes
● Lack of virtual IPs on custom networks
  ○ Traditionally, in an HA architecture the data path and the control path each have their own network.
  ○ A Kubernetes Service cannot be bound to secondary networks created through a meta CNI plugin such as Multus.
● Lack of a fencing mechanism for data consistency
  ○ Shared storage is quite popular in traditional HA architectures.
  ○ Kubernetes does not yet provide a fencing mechanism, which makes it difficult for traditional HA applications to migrate without modification.
● Lack of static IPs
  ○ Static IPs were heavily used in traditional infrastructure.
  ○ They are hard to obtain, especially on non-cloud infrastructure, and they are costly in the cloud as well.

Slide 4

Slide 4 text

Alternative solutions for Kubernetes
● Replace the shared disk with streaming replication.
  ○ Avoid shared disks, since no reliable fencing mechanism is available.
  ○ Scales better (even to multi-region level).
● Run HA packages in a sidecar.
  ○ Move HA packages out of the DB containers into sidecar containers within the same pod.
  ○ The sidecar runs alongside each DB instance and publishes the DB state to an external key-value store.
  ○ It participates in leader election using the state stored in the external key-value store.
● Store cluster state outside the DB cluster.
  ○ Storing cluster state outside the DB cluster helps to deal with split-brain conditions.
  ○ Cluster state can be stored in a distributed, strongly consistent key-value store, so it can be accessed independently of the DB pods.
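The sidecar idea above can be sketched as follows. This is a minimal illustration, not any operator's actual code: `KeyValueStore` is an in-memory stand-in for an external store such as etcd, and the key layout, `report_state`, `cluster_view`, and the probe function are all hypothetical names chosen for this example.

```python
import json
from typing import Callable, Dict


class KeyValueStore:
    """In-memory stand-in for an external, strongly consistent key-value store."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get_prefix(self, prefix: str) -> Dict[str, str]:
        return {k: v for k, v in self._data.items() if k.startswith(prefix)}


def report_state(store: KeyValueStore, cluster: str, member: str,
                 probe: Callable[[], dict]) -> None:
    """What a sidecar would do periodically: probe the local DB and publish its state."""
    store.put(f"/{cluster}/members/{member}", json.dumps(probe()))


def cluster_view(store: KeyValueStore, cluster: str) -> Dict[str, dict]:
    """Read the whole cluster state without contacting any DB pod."""
    prefix = f"/{cluster}/members/"
    return {k[len(prefix):]: json.loads(v)
            for k, v in store.get_prefix(prefix).items()}


# Hypothetical probes; a real sidecar would query its local database instead.
store = KeyValueStore()
report_state(store, "pgdemo", "pg-0", lambda: {"role": "leader", "wal": 1000})
report_state(store, "pgdemo", "pg-1", lambda: {"role": "replica", "wal": 990})
view = cluster_view(store, "pgdemo")
```

Because the state lives outside the DB pods, `cluster_view` still works when a pod is unreachable, which is the property the last bullet relies on.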

Slide 5

Slide 5 text

Building Blocks
● Persistent Volume
● StatefulSet
● Service
● ConfigMaps
● Secrets
● Operator Model
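To show how these building blocks fit together, here is a rough sketch that builds a StatefulSet manifest as a Python dictionary and prints it as JSON (which `kubectl apply -f` also accepts). The helper name `postgres_statefulset`, the image tag, and the ConfigMap and Secret names are assumptions made for illustration; the Service, ConfigMap, and Secret objects themselves would have to be created separately.

```python
import json


def postgres_statefulset(name: str, replicas: int, storage: str) -> dict:
    """Build a StatefulSet manifest; all names and the image are illustrative."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Headless Service that gives each pod a stable DNS name.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "postgres",
                        "image": "postgres:16",  # assumed image tag
                        # Configuration and credentials come from a
                        # ConfigMap and a Secret (hypothetical names).
                        "envFrom": [
                            {"configMapRef": {"name": f"{name}-config"}},
                            {"secretRef": {"name": f"{name}-credentials"}},
                        ],
                        "volumeMounts": [{
                            "name": "data",
                            "mountPath": "/var/lib/postgresql/data",
                        }],
                    }],
                },
            },
            # One PersistentVolumeClaim per replica, bound to a Persistent Volume.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


manifest = postgres_statefulset("pgdemo", replicas=3, storage="10Gi")
print(json.dumps(manifest, indent=2))
```

An operator automates exactly this kind of object creation, plus the day-two actions (failover, resize, upgrade) that a plain StatefulSet does not handle.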

Slide 6

Slide 6 text

Popular open source DB Operators
● zalando/postgres-operator
  ○ Built by Zalando, a European e-commerce company.
  ○ Based on zalando/patroni, a template for PostgreSQL HA with ZooKeeper, etcd, or Consul.
  ○ Production quality.
● sorintlab/stolon
  ○ Developed by sorintlab for running PostgreSQL in cloud native environments.
  ○ Similar in design to Patroni, but an independent implementation.
● CrunchyData/postgres-operator
  ○ Developed by a team of PostgreSQL veterans.
  ○ Uses zalando/patroni for HA functionality as of the latest release.

Slide 7

Slide 7 text

Zalando PG Operator
The operator creates and manages PostgreSQL HA clusters. It is built using open source projects:
● Patroni: A Python daemon that manages one PostgreSQL instance.
  ○ It keeps cluster state in a distributed, strongly consistent key-value store.
● Spilo: A Docker image that bundles PostgreSQL and Patroni together.

Slide 8

Slide 8 text

Operator Actions

Slide 9

Slide 9 text

HA Implementation
PostgreSQL cannot talk to Kubernetes directly, so Patroni manages PostgreSQL.
● Patroni runs alongside PostgreSQL and keeps cluster state in a distributed, strongly consistent key-value store such as etcd.
● The leader node's name is set as the value of the leader key, which expires after a predefined TTL.
● The leader node updates the leader key more often than the TTL, preventing expiration.
● A non-leader node is not allowed to update the leader key with its own name.
● Each instance watches the leader key.
● Once the leader key expires, each remaining instance decides whether it is "healthy enough"* to become leader.
● The first "healthy"* instance that creates the leader key with its own name becomes leader.
★ A member is never healthy if the old master is still running.
★ A member is never healthy if its WAL is behind some other member's, or if it is too far behind the last known master position.
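The leader-key rules above can be sketched in a few lines. This is a simplified simulation under stated assumptions, not Patroni's implementation: `LeaseStore` is an in-memory stand-in for etcd, time is passed in explicitly as `now`, and `Member`, `is_healthy`, `try_promote`, and the `max_lag` threshold are hypothetical names introduced for this example.

```python
import dataclasses
from typing import Dict, List, Optional, Tuple


class LeaseStore:
    """In-memory stand-in for a strongly consistent store with TTL keys."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)

    def get(self, key: str, now: float) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            return None  # missing or expired
        return entry[0]

    def create(self, key: str, value: str, ttl: float, now: float) -> bool:
        """Create the key only if it is absent or expired."""
        if self.get(key, now) is not None:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def refresh(self, key: str, value: str, ttl: float, now: float) -> bool:
        """Extend the TTL only if the caller is the current owner."""
        if self.get(key, now) != value:
            return False
        self._data[key] = (value, now + ttl)
        return True


@dataclasses.dataclass
class Member:
    name: str
    wal_position: int


def is_healthy(member: Member, members: List[Member], last_leader_wal: int,
               max_lag: int, old_leader_running: bool) -> bool:
    """Apply the two starred rules from the slide."""
    if old_leader_running:
        return False
    if any(m.wal_position > member.wal_position for m in members):
        return False  # some other member has more WAL
    return last_leader_wal - member.wal_position <= max_lag


def try_promote(store: LeaseStore, member: Member, members: List[Member],
                last_leader_wal: int, ttl: float, now: float,
                max_lag: int = 100, old_leader_running: bool = False) -> bool:
    """A healthy member races to create the leader key; the first one wins."""
    if not is_healthy(member, members, last_leader_wal, max_lag, old_leader_running):
        return False
    return store.create("leader", member.name, ttl, now)


store = LeaseStore()
pg1, pg2 = Member("pg-1", 990), Member("pg-2", 1000)
survivors = [pg1, pg2]

created = store.create("leader", "pg-0", ttl=30, now=0)        # pg-0 becomes leader
renewed = store.refresh("leader", "pg-0", ttl=30, now=20)      # renewed before expiry
hijacked = store.refresh("leader", "pg-1", ttl=30, now=20)     # non-leader is rejected
too_early = try_promote(store, pg2, survivors, 1000, ttl=30, now=40)
# pg-0 stops renewing; the key expires at t=50.
lagging = try_promote(store, pg1, survivors, 1000, ttl=30, now=51)
promoted = try_promote(store, pg2, survivors, 1000, ttl=30, now=51)
```

In a real store the create-if-absent step must be a single atomic operation (for example a transaction or compare-and-swap); the plain Python check here is only safe because the simulation is single-threaded.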

Slide 10

Slide 10 text

Failover simulation

Slide 11

Slide 11 text

Failover simulation

Slide 12

Slide 12 text

Failover simulation

Slide 13

Slide 13 text

Failover simulation

Slide 14

Slide 14 text

Failover simulation

Slide 15

Slide 15 text

Failover simulation

Slide 16

Slide 16 text

Credits & References
Diagram credits
● PostgreSQL and Kubernetes: DBaaS without a vendor-lock
Referenced materials
● Blue elephant on-demand: PostgreSQL + Kubernetes, FOSDEM 2018, Brussels
● zalando/patroni: A template for PostgreSQL High Availability with ZooKeeper, etcd, or Consul
● An introduction to stolon: cloud native PostgreSQL high availability
● sorintlab/stolon: PostgreSQL cloud native High Availability and more.
● https://crunchydata.github.io/postgres-operator/latest/gettingstarted/design/designoverview/
● Crunchy Data Container Suite Documentation

Slide 17

Slide 17 text

Thank You

Slide 18

Slide 18 text

Comparison