A description of Sopra Steria's journey of backup on Kubernetes. Presented by Michael Courcy at Cloud Native Storage Data / KubeCon 2019 North America (San Diego).
Sopra Steria is a digital services and transformation company (similar to Capgemini or Atos) with more than 30,000 employees worldwide and annual revenues of over 4 billion euros.
• Many customers, including big accounts: banking, government, army, energy, insurance…
• Kubernetes is the tool that we always wanted:
  • Contracts with many small apps are easy to resuscitate
  • Size matters when you want to reduce cost
  • All the usual benefits of Kubernetes and DevOps practices, of course
  • It provides common tools for all the teams in a consistent manner (CI/CD, home-made or those we choose to invest in)
The platform spans:
• Cloud & on-prem
• Modern databases: Amazon RDS … Elastic
• Multiple storage: AWS EBS, AWS S3
• Keep it simple: don't want to read manuals; don't want storage overlays
• Make it secure: RBAC, audit, end-to-end encryption…; reduce attack surface (zero-trust)
• Room for innovation: no vendor or data lock-in; give our customers choice

What needs protecting:
• Business data: Mongo, PG, MySQL
• Namespace resources: Deployments, Secrets, Services …
• etcd: describes the whole cluster state
• Machines: root/OS filesystem + attached devices

Backup Requirements
• Capture business data and cluster info
• Backup location choice

Recovery Requirements
• Restore business data and cluster info
• Restore location choice

Management Requirements
• Self-service portal
• Automation
• Compliance reporting
• Avoid extra tooling
1. Disaster recovery: don't put all your eggs in a single basket
2. Sovereignty: country rules govern data location
3. Costs: different for primary, secondary and archive storage

Option 1: Build
Option 2: Migrate
Expensive! Better way: intelligent data mobility
… so we still did not have a solution

The old way
• Expensive …
• Hard to rebuild: nodes and disks are always changing
• Need to shut down all the machines for complete consistency
• Going back in time means going back in time for all tenants

EBS + Lambda
• Works only on AWS
• Works only with EBS PVCs; what about a Ceph volume built on EBS?
• Saves everything; can't apply different policies
• Not so easy to rebuild a PV from a snapshot; needs AWS credentials

Port forward + cron server
• Port forwarding is not stable
• The cron server becomes the SPOF
• Hard to introduce a new policy
• Database credentials are shared on the cron server

Kubernetes CronJob
• Many CronJobs across different projects need supervision
• In a multi-tenant environment, there is no way to detect the arrival of new data to protect
• Very hard to share good practices or blueprints
• Need to share the backup storage credentials between teams …
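One gap noted above is that per-project CronJobs give no way to notice new, unprotected data in a multi-tenant cluster. The following is a minimal Python sketch of the idea behind a policy-based alternative, under stated assumptions: policies select namespaces by label, and any namespace that no policy selects is reported. The `Policy` class, the label names, and the sample namespaces are hypothetical illustrations, not the API of any particular backup product, and a real check would read namespaces from the cluster rather than from a dictionary.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical backup policy: applies to namespaces carrying all selector labels."""
    name: str
    selector: dict  # label key -> required label value

    def matches(self, labels: dict) -> bool:
        # Every selector label must be present on the namespace with the same value.
        return all(labels.get(k) == v for k, v in self.selector.items())


def unprotected_namespaces(namespaces: dict, policies: list) -> list:
    """Return the sorted names of namespaces that no policy selects."""
    return sorted(
        ns for ns, labels in namespaces.items()
        if not any(p.matches(labels) for p in policies)
    )


# Sample data, made up for illustration only.
namespaces = {
    "billing": {"tier": "gold"},
    "crm": {"tier": "bronze"},
    "new-team-app": {},  # freshly created, nobody registered a backup for it
}
policies = [
    Policy("hourly-gold", {"tier": "gold"}),
    Policy("daily-bronze", {"tier": "bronze"}),
]

print(unprotected_namespaces(namespaces, policies))
```

Running such a check on a schedule would surface newly created namespaces without requiring each team to write and supervise its own job.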
we enjoy working together
• Easy to use: management interface (CRDs under the hood); policy-based; extensible
• Secure: multitenancy support; RBAC support; encryption
• Built for Kubernetes: captures both application configuration and data
Overall, it fits our requirements around ecosystem support, scale, retention policies, encryption, and DR.
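Retention policies are among the requirements listed above. As a rough illustration of what such a policy decides, here is a minimal Python sketch that assumes a simple scheme: keep the newest backup from each of the most recent `daily` days and each of the most recent `weekly` ISO weeks. The function name, parameters, and sample dates are hypothetical; this is not how any specific product implements retention.

```python
from datetime import datetime


def select_to_keep(timestamps, daily=7, weekly=4):
    """Keep the newest backup of each of the last `daily` days and `weekly` ISO weeks."""
    keep = set()
    seen_days, seen_weeks = [], []
    for ts in sorted(timestamps, reverse=True):  # walk from newest to oldest
        day = ts.date()
        week = tuple(ts.isocalendar())[:2]  # (ISO year, ISO week number)
        if day not in seen_days and len(seen_days) < daily:
            seen_days.append(day)
            keep.add(ts)
        if week not in seen_weeks and len(seen_weeks) < weekly:
            seen_weeks.append(week)
            keep.add(ts)
    return sorted(keep)


# Sample data: one backup per day at noon, 1 to 20 November 2019.
backups = [datetime(2019, 11, d, 12) for d in range(1, 21)]
kept = select_to_keep(backups, daily=3, weekly=2)
for ts in kept:
    print(ts.isoformat())
```

Everything not returned by `select_to_keep` would be a candidate for deletion; different applications could simply be given different `daily` and `weekly` values.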
important to have agility
• Can I implement different levels of backup granularity based on the app?
• How do I bring data from production into my CI/CD pipelines?
• Can I extend the protection to my favorite database on my own?
• Can you migrate my application from OpenShift v3.x to v4.x?
• Can I keep/choose storage from multiple vendors and still have a single backup system?
• Can I go across clouds and clusters without adding a layer of storage abstraction (SDS)?