
CI/CD, Kubernetes, and Databases: Better Together

December 11, 2018

Adding data to both your CI and CD pipelines is one of the last steps of the DevOps journey, and probably the scariest given the impact of getting it wrong. This talk covers how one can safely combine Kubernetes, databases, and the CI/CD pipeline to make the process safer and more stable than today’s status quo and, in today’s rapid deployment culture, make databases “shift left” and reduce DBA burnout. This includes leveraging techniques and building an open-source toolkit to deliver automated schema changes, cloning, sandboxing, masking for production-like data in staging, and rapid data movement for fast database creation. More importantly, this talk will show how these benefits can help with internal culture shift by breaking down silos and bringing a traditionally conservative database group more fully into the automation fold.



  1. CI/CD, Kubernetes, and Databases: Better Together

    Niraj Tolia (@nirajtolia) and Tom Manville (@tdmanv)
  2. about us

    Niraj Tolia: Co-founder & CEO @ Kasten. Previously at EMC, Maginatics, HP, CMU.
    Tom Manville: Founding Engineer @ Kasten. Previously at Dropbox, Maginatics, U. Mich.
  3. our goal: move fast and test with real data

  4. what we will not cover in this talk

    • Kubernetes Ready for Production Stateful Apps (presented at SNIA’s 2018 Storage Developer Conference)
    • Implementing a Data Protection Strategy (KubeCon Seattle, Wednesday, December 12, 2:35pm)
  5. current state of databases in a cloud-native world

  6. cloud-native and databases: why is there so much fear and risk?

    DBAs and Ops: Still see database groups isolated from both dev and infra ops groups. Not part of app dev.
    Automation Gap: Not built into CI/CD pipelines. Test datasets have manual imports and get stale quickly.
    Snowflakes: Databases are isolated from the application, might have manual changes applied, treated as pets.
  7. What should the future look like?

  8. None
  9. increasing agility with databases in a cloud-native environment

    Source Control: Include all schema changes, upgrade changes, tools, etc. in the application repository
    CI/CD Pipeline: Automate testing of all database changes and modifications
    Database Infrastructure: Deliver database infrastructure and configuration as code
    Kubernetes to tie it all together!
  10. how kubernetes makes a difference

    Enforces Good DevOps Hygiene: Immutability, config as code, and automation make repeatable and reliable testing easy
    Efficient, High Resource Utilization: Declarative systems approach supports reliable use of multiple testing environments to test at scale
    Universal Control Plane: Use the same management plane as you use for all other components of your application
  11. ci/cd advantages for databases

    Catch issues early
    • Unit tests for coverage
    • Integration and staging environments for behavioral testing
    Engineering agility
    • Faster change iteration with automated testing
    • High velocity prod DB deployments
    Automated testing
    • Enforces that the app and DB are always in sync
    • Higher-confidence releases
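As a rough illustration of schema changes that live in the application repository and are exercised by the pipeline, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the real one; the `MIGRATIONS` list, the `schema_version` table, and both function names are hypothetical and do not come from the talk.

```python
import sqlite3

# Hypothetical, ordered schema changes as they might be checked into the app repo.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"),
    (2, "ALTER TABLE users ADD COLUMN display_name TEXT"),
]


def apply_migrations(conn):
    """Apply pending migrations in version order and return the resulting version."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0  # MAX() over an empty table yields NULL
    for version, statement in sorted(MIGRATIONS):
        if version > current:
            conn.execute(statement)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            current = version
    conn.commit()
    return current


def column_names(conn, table):
    """Return the column names of a table in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

A CI job could run this against a throwaway database and assert on the resulting schema, so that a change that leaves the app and the database out of sync fails before it reaches production.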
  12. But, it’s a database! So, what about the data?

  13. Need to safely test with production data (but not in

  14. data based testing: number of integration challenges

    Storage Integration: Might need to integrate with volume-level storage APIs for efficiency.
    Database Integration: For consistent data capture, including with eventually consistent data stores.
    Application Integration: Polyglot persistence in micro-service based applications needs app-level coordination. So does data masking to protect sensitive data.
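One way to picture application-level masking is a deterministic transform applied to sensitive fields before data leaves production. The sketch below is an assumption-heavy illustration, not Kanister's masking mechanism: the field names (`email`, `name`), the keyed-hash approach, and both function names are hypothetical.

```python
import hashlib
import hmac


def mask_email(email, secret):
    """Replace an email with a stable pseudonym derived from a keyed hash."""
    digest = hmac.new(secret, email.lower().encode("utf-8"), hashlib.sha256)
    return f"user-{digest.hexdigest()[:12]}@example.invalid"


def mask_row(row, secret):
    """Return a copy of the row with the (hypothetical) sensitive fields masked."""
    masked = dict(row)
    masked["email"] = mask_email(row["email"], secret)
    masked["name"] = "REDACTED"
    return masked
```

Because the same input always maps to the same pseudonym, rows that refer to the same person in different data stores still line up after masking, which matters when several services each keep their own copy of the data.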
  15. Supporting Data Mobility

  16. kanister: A Kubernetes-native framework for application-level data management

    • Supports complex data management workflows
    • Easy to integrate against your CI/CD pipeline
    • Actions invoked via Custom Resources (CRs)
    • Easy to extend via simple “recipes” or Blueprints
    https://github.com/kanisterio
  17. kanister: the highlights

    Control Plane Integration
    • Ties K8s and DB control planes
    • Library support for complex workflows (e.g., scale up/down)
    Database Manipulation
    • Filters
    • Masking
    • Incremental Capture
    Data Capture/Export
    • File/Block integration via native API and CSI v1.0
    • S3 API support for object stores
    Visit https://kasten.io/kanister for more information
  18. kanister workflow

    Components: ActionSet (Custom K8s Resource), Blueprint (Custom K8s Resource), Kanister Controller, Stateful Application
    1. ActionSet Creation
    2. Blueprint Discovery
    3. Action Execution (KubeExec / KubeTask)
    4. Status Update
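The four workflow steps can be pictured with a toy, in-memory model. This is only a sketch of the control flow under loud assumptions: the class names mirror the slide, but the fields, the status strings, and the `reconcile` function are hypothetical and are not Kanister's actual controller code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A phase is modeled as a function that takes the target object's name.
Phase = Callable[[str], str]


@dataclass
class Blueprint:
    actions: Dict[str, List[Phase]]


@dataclass
class ActionSet:  # step 1: a user or pipeline creates this object
    action: str
    blueprint: str
    target: str
    status: str = "pending"
    outputs: List[str] = field(default_factory=list)


def reconcile(action_set, blueprints):
    """Resolve the blueprint, run its phases in order, and record the status."""
    # Step 2: blueprint discovery.
    blueprint = blueprints.get(action_set.blueprint)
    if blueprint is None or action_set.action not in blueprint.actions:
        action_set.status = "failed"
        return action_set
    # Step 3: action execution, one phase at a time.
    action_set.status = "running"
    try:
        for phase in blueprint.actions[action_set.action]:
            action_set.outputs.append(phase(action_set.target))
    except Exception:
        action_set.status = "failed"
        return action_set
    # Step 4: status update.
    action_set.status = "complete"
    return action_set
```

The point of the split is that the ActionSet only names what to do and to which object, while the Blueprint carries the database-specific steps.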
  19. kanister actionset (abridged)

    apiVersion: cr.kanister.io/v1alpha1
    kind: ActionSet
    spec:
      actions:
      - name: backup
        blueprint: postgresql
        object:
          kind: StatefulSet
          name: postgresql-cluster
          namespace: default
        configMaps:
          ...
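Since actions are invoked through custom resources, a pipeline step could generate the ActionSet rather than hand-write it. The sketch below only builds the manifest as JSON using the fields visible on the slide; the `build_action_set` helper and the `metadata.generateName` entry are assumptions, and the elided `configMaps` section is deliberately left out.

```python
import json


def build_action_set(action, blueprint, name, namespace):
    """Build an ActionSet manifest (as a dict) from the fields shown on the slide."""
    return {
        "apiVersion": "cr.kanister.io/v1alpha1",
        "kind": "ActionSet",
        # Assumption: metadata is not shown on the abridged slide.
        "metadata": {"generateName": f"{action}-"},
        "spec": {
            "actions": [
                {
                    "name": action,
                    "blueprint": blueprint,
                    "object": {
                        "kind": "StatefulSet",
                        "name": name,
                        "namespace": namespace,
                    },
                    # configMaps is elided on the slide and omitted here.
                }
            ]
        },
    }


manifest = build_action_set("backup", "postgresql", "postgresql-cluster", "default")
print(json.dumps(manifest, indent=2))
```

The printed JSON could then be handed to whatever the pipeline uses to create Kubernetes objects; which namespace the ActionSet itself belongs in depends on the installation and is not covered by this sketch.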
  20. kanister blueprint (abridged)

    apiVersion: cr.kanister.io/v1alpha1
    kind: Blueprint
    actions:
      backup:
        type: StatefulSet
        phases:
        - func: KubeExec
          args:
          - '{{ .StatefulSet.Namespace }}'
          - '{{ index .StatefulSet.Pods 0 }}'
          - postgresql-tools-sidecar
          - bash
          - -c
          - wal-e ...
        - func: ...
      restore:
        ...
  21. Demo!

  22. demo: pipeline setup

    Diagram labels: Application Code, Config Definition, Database Schema; Source Control; Integration Pipeline; Deployment Pipeline; Production Cluster; Data Mobility
  23. integration demo: data flow setup

    Production Kubernetes Cluster: Namespace: demo (App Pod, DB Pod); NS: kio
    Integration Kubernetes Cluster: Namespace: test (App Pod, DB Pod); NS: kio
    Between the clusters: Firewall; Object Storage holding the App + Data Snapshot
    ⓵ App Export
    ⓶ App Import
    ⓷ Test Invocation
    ⓸ Data Population
    K10: Policy and Orchestration (e.g., Periodic Import or Export) + Kanister: Data Manipulation and Mobility
  24. end-to-end demo

  25. advanced topics (hopefully) coming soon to a conf. near you

    CD w/ schema changes: Deploying schema changes (and rollbacks) can be a lot more involved. Backup/recovery is a critical part of this.
    Managed Services: Apart from cost, these slides apply to managed services too, but do track emerging best practices.
    Masking and Sampling: Kanister has support for injecting your own code to mask sensitive data or extract only a subset.
    Dataset Promotion: There are situations where you might want to promote data from dev → staging → prod.
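Extracting only a subset can be sketched as hash-based sampling on a stable key. This is an illustrative assumption about what user-supplied sampling code might look like, not Kanister functionality; `in_sample`, `sample_rows`, and the `id` key are hypothetical names.

```python
import hashlib


def in_sample(key, percent):
    """Return True if the key falls inside the requested percentage sample."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


def sample_rows(rows, key_field, percent):
    """Keep only the rows whose key hashes into the sample."""
    return [row for row in rows if in_sample(str(row[key_field]), percent)]
```

Hashing the key instead of drawing random numbers keeps the subset stable between runs, and a larger percentage always contains the smaller sample, so a test dataset can grow without reshuffling which records are included.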
  26. kubernetes, ci/cd, and databases: wrapping up

    01 Automate your DB Pipeline: Deploy database updates and changes with increased confidence.
    02 Leverage Kubernetes: Deliver greater agility to your dev teams by allowing easy and reliable testing.
    03 Use Real Data: Test on production data to reduce the code quality risk of running against synthetic or stale data.
    04 Make DB Engineering Agile: Integrate database teams into your DevOps and Agile journey. Break apart the silos!
    Build & Standardize your DB Pipeline on Kubernetes!
  27. Questions?

    You can also find us at: Booth S/E15, www.kasten.io, @kastenhq, @nirajtolia, @tdmanv
    Image is the cover art from Better Together, a Jack Johnson song.