Keptn @ Cloud Native Meetup Portland

Keptn is an event-based control plane for continuous delivery and automated operations for cloud-native applications.

Jürgen Etzlstorfer

January 28, 2020

Transcript

  1. Because cloud-native delivery and operations is a BIG challenge for enterprises!

    That is why we are building Keptn.
  2. The reality – based on facts (https://dynatrace.ai/acsurvey)

    • MTTI: Mean Time to Innovation
    • MTTR: Mean Time to Remediate
    • Figures on the slide: 4.8 days, 4 hours, ~10 min; 12.5 days, 2 days, ~1 hour
    • Only < 5% are “Cloud Native”
  3. Delivery pipelines: What this looks like in the real world…

    Mixed information about
    • Process (build, deploy, test, evaluate, …)
    • Target platform (k8s, …)
    • Environments (dev, hardening, …)
    • Tools (Terraform, Helm, hey, …)
    No clear separation of concerns
    • Developers
      • Define which artifact to use
      • Want fast feedback on their code
    • DevOps Engineers
      • Define which tools to use
      • Ensure tools are properly configured
    • Site Reliability Engineers
      • Define delivery processes
      • Define operations workflows
    Confidential
  4. What is Keptn?

    Keptn is an event-based control plane for continuous delivery and automated operations for cloud-native applications.
  5. Why Keptn?

    Scriptless delivery and operations
    • Declarative approach for delivery or operations automation
    • Share definitions across any number of microservices w/o individual pipelines and scripts
    Separation of concerns
    • Processes defined by SREs
    • Tooling defined by DevOps
    • Artifacts defined by Devs
    Event-based automation for extensibility
    • CloudEvents for all delivery and operations steps
    • Simple and fast integration by registering to those events
    Built-in Observability
    • Built-in tracing capabilities for all deployments and operations flows
    • Visualization in the Bridge
  6. How Keptn manages Continuous Delivery

    For each stage:
    • New artifact
    • Update Config
    • Update Environment
    • Run Tests
    • Validate Quality Gate
    • Rollback if failed
    Repeat for other stages.
    Built-in Quality Gate for CD, which can also be used stand-alone.
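The delivery loop on this slide can be sketched in Python as follows. This is only an illustration under assumptions: the function name `deliver`, the `quality_gate` callback, and the log strings are hypothetical and are not part of Keptn's API.

```python
from typing import Callable, List


def deliver(artifact: str, stages: List[str],
            quality_gate: Callable[[str, str], bool]) -> List[str]:
    """Walk an artifact through the stages, stopping at the first failed gate."""
    log: List[str] = []
    for stage in stages:
        # Steps named on the slide, recorded here as plain log lines.
        log.append(f"{stage}: update config for {artifact}")
        log.append(f"{stage}: update environment")
        log.append(f"{stage}: run tests")
        if quality_gate(artifact, stage):
            log.append(f"{stage}: quality gate passed")
        else:
            # A failed gate rolls back and ends delivery; later stages are skipped.
            log.append(f"{stage}: quality gate failed, rollback")
            break
    return log


if __name__ == "__main__":
    # Hypothetical gate that only fails in the "hardening" stage.
    for line in deliver("myservice:2.0.0", ["dev", "hardening", "prod"],
                        lambda artifact, stage: stage != "hardening"):
        print(line)
```

In this sketch the artifact never reaches "prod", because the loop stops at the first stage whose quality gate fails.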
  7. How Quality Gates help build better software faster

    In cloud-native environments, full automation of quality gates is a must-have!
    • Service Level Indicators (SLIs) define relevant indicators for the service
      • Metrics: response time, error rate, CPU usage, …
      • Infrastructure: architecture constraints, …
      • Business: conversion rates, user satisfaction, …
    • Service Level Objectives (SLOs) define objectives using the SLIs
      • Compare against fixed thresholds
      • Compare against previous versions of the service
    Keptn Lighthouse queries SLIs (from Prometheus, Dynatrace, Neotys) and evaluates SLOs.
  8. Quality Gates Example

    Grading example (SLI, criterion and scoring, measured value, score):
    • Response Time: <200 scores 1, <500 scores 0.5; value: 400, score: 0.5
    • SQL Statements: =0% scores 1, <+5% and >-5% scores 0.5; value: 0%, score: 1
    • Total: 1.5/2 (75%)

    Source:

        spec_version: "1.0"
        comparison:
          compare_with: "single_result"
          # other possible options:
          # compare_with: "several_results"
        objectives:
          - sli: response_time_95
            pass:
              - criteria:
                  - "<200"
            warning:
              - criteria:
                  - "<500"
          - sli: sql_statements
            pass:
              - criteria:
                  - "=0%"
            warning:
              - criteria:
                  - "<+5%"
                  - ">-5%"
        total_score:
          pass: "90%"
          warning: "75%"
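The grading example above (1 point for a passed objective, 0.5 for a warning, summed into a percentage) can be reproduced with a small Python sketch. It is a simplified illustration, not the Keptn Lighthouse implementation: the function names, the `(value, previous)` input format, and the reading of `%` criteria as relative change against a previous value are assumptions made for this example.

```python
import re
from typing import Dict, List, Optional, Tuple

# Operator, signed number, optional "%" marking a relative criterion.
_CRITERION = re.compile(r"^(<=|>=|<|>|=)([+-]?\d+(?:\.\d+)?)(%?)$")

# Objectives copied from the slide's example file.
OBJECTIVES: Dict[str, Dict[str, List[str]]] = {
    "response_time_95": {"pass": ["<200"], "warning": ["<500"]},
    "sql_statements": {"pass": ["=0%"], "warning": ["<+5%", ">-5%"]},
}


def criterion_met(criterion: str, value: float,
                  previous: Optional[float] = None) -> bool:
    match = _CRITERION.match(criterion)
    if match is None:
        raise ValueError(f"unsupported criterion: {criterion}")
    op, number, percent = match.groups()
    target = float(number)
    if percent:
        # Assumption: "%" criteria compare the change relative to a previous value.
        if not previous:
            raise ValueError("relative criterion needs a non-zero previous value")
        observed = (value - previous) / previous * 100.0
    else:
        observed = value
    return {"<": observed < target, "<=": observed <= target,
            ">": observed > target, ">=": observed >= target,
            "=": observed == target}[op]


def score_objective(objective: Dict[str, List[str]], value: float,
                    previous: Optional[float] = None) -> float:
    if all(criterion_met(c, value, previous) for c in objective["pass"]):
        return 1.0
    if all(criterion_met(c, value, previous) for c in objective["warning"]):
        return 0.5
    return 0.0


def evaluate(measurements: Dict[str, Tuple[float, Optional[float]]]):
    scores = {name: score_objective(obj, *measurements[name])
              for name, obj in OBJECTIVES.items()}
    total = sum(scores.values()) / len(scores) * 100.0
    return scores, total


def result(total: float, pass_at: float = 90.0, warn_at: float = 75.0) -> str:
    return "pass" if total >= pass_at else "warning" if total >= warn_at else "fail"


if __name__ == "__main__":
    # Response time 400 (absolute); SQL statements unchanged at 12 (0% change).
    scores, total = evaluate({"response_time_95": (400, None),
                              "sql_statements": (12, 12)})
    print(scores, total, result(total))
```

With the slide's values, response time earns 0.5 and SQL statements earns 1, giving 1.5/2 (75%), which meets the `warning` threshold in `total_score` but not the `pass` threshold.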
  9. Keptn is more than CD: Automated Remediation

    • Alert by Problem
    • Get remediation action
    • Execute remediation action
    • Re-validate Quality Gate
    • Resolved?
    Examples:
    • Rollback to old version
    • Toggle feature flag
    • Scale up Deployment/restart Pods
    • Clear disk
    • YOUR scripted operation tasks
  10. Remove hard dependencies and integrations

    Pipeline steps: Build, Prepare, Deploy, Test, Notify, Rollback
    Tools: Config Mgmt., Deploy, Test, Monitoring, ChatOps, Rollback
  11. Remove hard dependencies and integrations: Eventing

    Pipeline steps: Build, Prepare, Deploy, Test, Notify, Rollback
    Tools: Config Mgmt., Deploy, Test, Monitoring, ChatOps, Rollback
    Example event: Event:Deploy Artifact:container1 Stage:Dev Strategy:Blue/Green
    • Which events to generate
    • Who consumes events
  12. Let us build an architecture that supports this paradigm

    • Application Plane: Define overall process for delivery and operations
    • Control Plane: Follow application logic and communicate/configure required services
    • Services: Deploy Service, Test Service, Validation Service, Remediation Service, Config Service, … Service
    • Inputs: Artifact / Microservice, API
    • Roles: Site Reliability Engineer, DevOps, Developer
    • Files: shipyard.yaml, uniform.yaml
  13. Defining your process – Shipyard files

    Shipyard specifies STAGES and WHAT TO DO in these stages.
    https://github.com/keptn/spec/blob/master/shipyard.md

        keptn: v1
        type: shipyard
        stages:
          - name: "dev"
            deployment: "direct"
            test: "functional"
            promotion: "automatic"
          - name: "hardening"
            deployment: "blue-green"
              - approval: "manual"
            test: "performance"
            promotion: "manual"
          - name: "prod"
            deployment: "blue-green"
            release: "canary"
              - interval: "10m"
              - increase: "20%"
  14. Defining your integrations – Uniform files

    Uniform specifies WHO reacts to which EVENTS.

        keptn: v1
        type: uniform
        services:
          - name: "slack-trail"
            image: "keptn/slack:1.0"
            env:
              - name: "SLACK_WEBHOOK"
                value: "https://hooks.sl..."
            topics:
              - "*"
          - name: "deploy-svc"
            image: "argo/argocd-svc:1.3"
            topics:
              - "start_deploy"
          - name: "performance-test"
            image: "keptn/jmeter:0.7"
            topics:
              - "start_test"
          ...
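How a uniform maps events to services can be illustrated with a short Python sketch. The service names and topics are taken from the slide; the `subscribers` function and the use of shell-style wildcard matching for `"*"` are assumptions made for this example and do not describe how Keptn itself routes events.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Service name -> subscribed topics, as listed in the uniform on the slide.
UNIFORM: Dict[str, List[str]] = {
    "slack-trail": ["*"],
    "deploy-svc": ["start_deploy"],
    "performance-test": ["start_test"],
}


def subscribers(topic: str, uniform: Dict[str, List[str]] = UNIFORM) -> List[str]:
    """Return the services whose topic patterns match the given event topic."""
    return [name for name, patterns in uniform.items()
            if any(fnmatchcase(topic, pattern) for pattern in patterns)]


if __name__ == "__main__":
    for topic in ("start_deploy", "start_test"):
        print(topic, "->", subscribers(topic))
```

Because `slack-trail` subscribes to `"*"`, it receives every event in this sketch, while the deploy and test services only see their own topics.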
  15. Defining quality gates – Service Level Objective files

    SLO defines the QUALITY CRITERIA of a service.

        ---
        spec_version: '0.1.1'
        comparison:
          compare_with: "single_result"
          include_result_with_score: "pass"
          aggregate_function: avg
        objectives:
          - sli: response_time_p95
            pass:
              - criteria:
                  - "<=+10%"
                  - "<600"
            warning:
              - criteria:
                  - "<=800"
        total_score:
          pass: "90%"
          warning: "75%"
  16. Zero-Touch Toolchain Integration

    $ keptn wear uniform <GitHub, Slack ...>
    Tool categories: Config, ChatOps, IT Autom, Deploy, Test, Observe
  17. Re-Think Pipelines

    $ keptn create project keptn-sample {stage(perf),prod(bg)}
    • STAGING: Direct Update
    • PROD: Blue/Green Update
  18. Zero-Touch Cloud Native Services

    $ keptn onboard service myservice [xxx.yaml]
    • STAGING: Direct Update
    • PROD: Blue/Green Update
  19. Automated Multi-Stage Delivery

    $ keptn new artifact myservice:1.0.0
    • STAGING (Direct Update, Performance): Score 90 / 100, Promote? PROMOTE
    • PROD (Blue/Green Update): Score 75 / 100, Keep? KEEP
  20. Automated Quality Gates

    $ keptn new artifact myservice:2.0.0
    • STAGING (Direct Update, Performance): Score 45 / 100, Promote? ABORT
    • PROD (Blue/Green Update): stays on 1.0.0
  21. Self-Healing Blue/Green Deployments

    $ keptn new artifact myservice:3.0.0
    • STAGING (Direct Update, Performance): Score 85 / 100, Promote? PROMOTE
    • PROD (Blue/Green Update): Score 80 / 100, Keep? REVERT