The Journey of Backups in Kubernetes

Kasten
November 18, 2019

A description of Sopra Steria's backup journey on Kubernetes. Presented by Michael Courcy at Cloud Native Storage Data / KubeCon 2019 North America (San Diego).

Video at https://youtu.be/hpIVtaeGDbg

Transcript

  1. Who is Sopra Steria, plus our cloud native environment

    • Sopra Steria is a digital services and transformation company (similar to Capgemini or Atos) with more than 30,000 collaborators worldwide and more than 4 billion euros in annual revenue.
    • A lot of customers, big accounts: Banking, Government, Army, Energy, Insurance…
    • Kubernetes is the tool that we always wanted:
      • Contracts with many small apps are easy to resuscitate
      • Size matters when you want to reduce cost
      • All the usual benefits of Kubernetes and DevOps practices, of course
      • Provides common tools for all the teams in a consistent manner (CI/CD, home-made or those we choose to invest in)

    Environment diagram labels: Cloud & On-Prem · Amazon RDS … Elastic · Modern Databases · Multiple Storage · AWS EBS · AWS S3
  2. Our Kubernetes backup requirements, plus key principles

    Key principles:
    • 1. Keep it simple
      • Don't want to read manuals
      • Don't want storage overlays
    • 2. Make it secure
      • RBAC, audit, end-to-end encryption,…
      • Reduce attack surface (zero-trust)
    • 3. Room for innovation
      • No vendor or data lock-ins
      • Give our customers choice

    Scope:
    • Business data: Mongo, PG, MySQL
    • Namespace resources: Deployments, secrets, services …
    • etcd: describes the whole cluster state
    • Machines: root/OS filesystem + attached devices

    Backup Requirements:
    • Capture business data and cluster info
    • Backup location choice

    Recovery Requirements:
    • Restore business data and cluster info
    • Restore location choice

    Management Requirements:
    • Self-service portal
    • Automation
    • Compliance Reporting
    • Avoid extra tooling
  3. Mobility for Kubernetes applications is critical, plus the government wants it

    • 1. Disaster Recovery: don't put all your eggs in a single basket
    • 2. Sovereignty: country rules govern data location
    • 3. Costs: different for Primary, Secondary and Archive storage

    • Option 1: Build
    • Option 2: Migrate
    • Expensive!
    • Better way: Intelligent data mobility
  4. How our backup journey evolved: character building along the way

    The old way
    • Expensive …
    • Hard to rebuild: nodes and disks are always changing
    • Need to shut down all the machines for complete consistency
    • Going back in time means going back in time for all tenants

    EBS + Lambda
    • Works only on AWS
    • Works only with EBS PVCs; what about a Ceph volume built on EBS?
    • Saves everything; can't apply different policies
    • Not so easy to rebuild a PV from a snapshot; needs AWS credentials

    Port Forward + Cron Server
    • Port forwarding is not stable
    • The cron server becomes the SPOF
    • Hard to introduce a new policy
    • Database credentials are shared on the cron server

    Kubernetes Cron Job
    • Many CronJobs across different projects need supervision
    • In a multi-tenant environment, no way to detect the arrival of new data to protect
    • Very hard to share good practices or blueprints
    • Need to share credentials for the backup storage between teams …

    … so we still did not have a solution
  5. Why we chose Kasten K10: met our requirements… and we enjoy working together

    • Easy to use: Management interface (CRDs under the hood); Policy-based; Extensible
    • Secure: Multitenancy support; RBAC support; Encryption
    • Built for Kubernetes: Captures both application configuration and data

    Overall, it fits our requirements around ecosystem support, scale, retention policies, encryption, and DR.
  6. Requirements don't stop: our customers want more too… so important to have agility

    • Can I implement different levels of backup granularity based on the app?
    • How do I bring data from production into my CI/CD pipelines?
    • Can I extend the protection to my favorite database on my own?
    • Can you migrate my application from OpenShift v3.x to v4.x?
    • Can I keep/choose storage from multiple vendors and still have a single backup system?
    • Can I go across clouds and clusters without adding a layer of storage abstraction (SDS)?
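
Two points in the slides lend themselves to a small illustration: the complaint that per-project CronJobs give "no way to detect the arrival of new data to protect", and the customer question about different backup granularity per app. The sketch below is not Kasten K10's API and is not taken from the talk. It is a minimal, self-contained Python model that assumes namespaces carry labels and that each policy selects namespaces by label. The names `BackupPolicy`, `resolve_policies`, `unprotected`, the `tier` label, and the sample namespaces are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BackupPolicy:
    """Hypothetical policy: which namespaces it covers and how it runs."""
    name: str
    selector: dict       # labels a namespace must carry to be covered
    frequency: str       # e.g. "hourly" or "daily" (illustrative only)
    retention_days: int  # how long backups are kept (illustrative only)


def matches(selector, labels):
    """True when every key/value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def resolve_policies(namespaces, policies):
    """Map each namespace to the first policy whose selector matches, else None."""
    assignment = {}
    for ns_name, labels in namespaces.items():
        assignment[ns_name] = next(
            (p.name for p in policies if matches(p.selector, labels)),
            None,
        )
    return assignment


def unprotected(assignment):
    """Namespaces that no policy covers: the 'new data' nobody noticed."""
    return sorted(ns for ns, policy in assignment.items() if policy is None)


# Hypothetical cluster state: namespace name -> labels.
namespaces = {
    "billing-prod": {"tier": "critical"},
    "wiki": {"tier": "standard"},
    "new-team-sandbox": {},  # created by a tenant, never labelled
}

# Hypothetical policies giving different granularity per tier.
policies = [
    BackupPolicy("hourly-90d", {"tier": "critical"}, "hourly", 90),
    BackupPolicy("daily-7d", {"tier": "standard"}, "daily", 7),
]

assignment = resolve_policies(namespaces, policies)
print(assignment)
print("unprotected:", unprotected(assignment))
```

Because selection is declarative, a newly created namespace either matches an existing policy or shows up in the unprotected list. That is the kind of central visibility the per-project CronJob approach lacked. Note that in this sketch an empty selector matches every namespace, so a catch-all policy would have to be listed last.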