Slide 1

Slide 1 text

Introduction to Flagger: A Progressive Delivery Operator for Kubernetes
@mathetake, Kubernetes Meetup Tokyo #32

Slide 2

Slide 2 text

Today's talk
An introduction so that you can start evaluating Flagger tomorrow. I will cover:
• Background: what is Progressive Delivery?
• Introduction to Flagger
• Flagger's internal behavior

Slide 3

Slide 3 text

1. Background
Note: this talk assumes GitOps, but the points apply to other approaches as well.

Slide 4

Slide 4 text

Continuous X
• Everyone loves Continuous X
  • The standard in the cloud native community
  • We can't live without it
• A rich set of software
  • Spinnaker, Flux, ArgoCD, …
• Goal: Agility & Reliability
  • Automated testing, deployment

Slide 5

Slide 5 text

Continuous X
• Continuous X allows us to test/deploy tons of applications in a day
• Merging a PR completes the deployment, with no manual operations
• Rollback = reverting the PR
  • Easy!
• Hooray for the reconciliation loop
  • Nothing to fear

Slide 6

Slide 6 text

Continuous X
(Same bullets as Slide 5, with an overlay added:)
No Free Lunch

Slide 7

Slide 7 text

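Continuous Delivery is hard
• What about reliability after a CD deploy?
  • Right after a deploy, we still have to watch logs and metrics
  • If things look bad, revert!
• In the end, there is a human in the loop
  • The rollback operation itself can cause an incident
  • We humans are all fallible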

Slide 8

Slide 8 text

Continuous Delivery is hard
(Same bullets as Slide 7, with an overlay added:)
We want to handle this in a smoother, slicker way

Slide 9

Slide 9 text

What is Progressive Delivery

Slide 10

Slide 10 text

What is Progressive Delivery

Slide 11

Slide 11 text

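What is Progressive Delivery
• Naive CD means all-or-nothing deployment
  • The new version replaces the old one entirely
  • A rollback is the same thing in reverse
• Progressive Delivery = CD++
  • Something that makes ordinary CD slicker
  • Deploys gradually rather than all-or-nothing
  • Automates the gradual deployment
  • Reduces the risk of production deployments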

Slide 12

Slide 12 text

Test in production meets P.D.
• Progressive Delivery = deploying to production gradually ≒ trying out (testing) the new version in the production environment
• Close to the 'Test in production' way of thinking
  • Staging cannot be production
  • Faster iteration, better agility
All-or-nothing deployment + Test in production = high risk
Progressive Delivery + Test in production = low risk

Slide 13

Slide 13 text

Implementation
• Spinnaker
• K8s native
  • Argo-rollouts
    • Based on native Services*
  • weaveworks/flagger
    • Based on a service mesh / ingress controller
• Basic operation: roll out a little ⇔ check metrics
* Older versions could not do traffic shifting in combination with a service mesh

Slide 14

Slide 14 text

2. Introduction to Flagger

Slide 15

Slide 15 text

What is Flagger
Give developers confidence in automating the production releases
• OSS for Progressive Delivery
• Developed by Stefan at Weaveworks
• Went GA on June 17, 2020
• Originally developed on the assumption of Istio → later added support for multiple platforms

Slide 16

Slide 16 text

Flagger's Features
• Service mesh native
  • Istio, SMI (linkerd, crossover), App Mesh + ingress controllers
  • Fine tuned traffic shifting
• GitOps native
• Multiple deployment strategies
• Custom metrics
• Manual gating (approve, pause, resume), webhooks
• Alerting

Slide 17

Slide 17 text

Deployment Strategy
1. Canary Release (progressive traffic shifting)
  • For HTTP and gRPC applications
2. A/B Testing (HTTP headers and cookies traffic routing)
  • For applications that require session affinity
3. Blue/Green (traffic switching)
  • Any workload
4. Blue/Green Mirroring (traffic shadowing)
  • For idempotent applications (the safest option)
  • Mirroring can also be used as a preliminary stage before a Canary Release
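As a rough sketch of how the four strategies above might be expressed, the fragments below show only the analysis section of a Canary resource. The field names (stepWeight, maxWeight, iterations, match, mirror) are taken from the Flagger Canary API as commonly documented; every value, and the x-canary header name, is an illustrative assumption rather than something stated on the slides.

```yaml
# 1. Canary Release: shift traffic to the canary in steps, up to maxWeight.
analysis:
  interval: 1m
  threshold: 10
  maxWeight: 50
  stepWeight: 5
---
# 2. A/B Testing: route matching requests to the canary for a fixed
#    number of iterations instead of shifting by weight.
analysis:
  interval: 1m
  threshold: 10
  iterations: 10
  match:
    - headers:
        x-canary:            # hypothetical header name
          exact: "insider"
---
# 3. Blue/Green: run the checks for a number of iterations, then switch.
analysis:
  interval: 1m
  threshold: 10
  iterations: 10
---
# 4. Blue/Green Mirroring: same as Blue/Green, but live traffic is
#    mirrored to the canary while the checks run.
analysis:
  interval: 1m
  threshold: 10
  iterations: 10
  mirror: true
```

These are fragments for comparison only; each would sit under spec in a full Canary manifest.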

Slide 18

Slide 18 text

Deployment Strategy - Canary Release

Slide 19

Slide 19 text

Deployment Strategy - A/B testing

Slide 20

Slide 20 text

Deployment Strategy - Blue Green

Slide 21

Slide 21 text

Control loop / phase (diagram)
Phases shown: Initialize, Stable, Analysis, Promotion, Rollback
Other labels: Deployment strategy, Traffic shifting, webhook, metrics check, Manual gating, Alert provider

Slide 22

Slide 22 text

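Flagger CRD
1. Canary
  • The main resource
  • For minimal functionality, this alone is enough
2. MetricTemplate
  • The query for the metrics to analyze
  • Specifies the metric provider
3. AlertProvider
  • Specifies where delivery notifications are sent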

Slide 23

Slide 23 text

Flagger CRD: Canary

apiVersion:
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  provider: istio
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  progressDeadlineSeconds: 60
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    name: podinfo
    port: 9898
    targetPort: 9898
    portName: http
    portDiscovery: true
    match:
      - uri:
          prefix: /
    timeout: 5s
  analysis:
    interval: 1m
    threshold: 10
    maxWeight: 50
    stepWeight: 5
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: "database connections"
        templateRef:
          name: db-connections
        thresholdRange:
          min: 2
          max: 100
        interval: 1m
    webhooks:
      - name: "load test"
        type: rollout
        url: http://flagger-loadtester.test/
        metadata:
          cmd: "…"

Slide 24

Slide 24 text

Flagger CRD: Canary
(Same Canary manifest as Slide 23.)
Callout (targetRef): specifying the application
- Deployment
- DaemonSet

Slide 25

Slide 25 text

Flagger CRD: Canary
(Same Canary manifest as Slide 23.)
Callout (autoscalerRef): specifying the HPA (optional)

Slide 26

Slide 26 text

Flagger CRD: Canary
(Same Canary manifest as Slide 23.)
Callout (service): the service definition

Slide 27

Slide 27 text

Flagger CRD: Canary
(Same Canary manifest as Slide 23.)
Callout (analysis): the canary analysis definition

Slide 28

Slide 28 text

Flagger CRD: MetricTemplate

apiVersion:
kind: MetricTemplate
metadata:
  name: not-found-percentage
  namespace: istio-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.istio-system:9090
  query: |
    100 - sum(
        rate(
            istio_requests_total{
              reporter="destination",
              destination_workload_namespace="{{ namespace }}",
              destination_workload="{{ target }}",
              response_code!="404"
            }[{{ interval }}]
        )
    )
    /
    sum(
        rate(
            istio_requests_total{
              reporter="destination",
              destination_workload_namespace="{{ namespace }}",
              destination_workload="{{ target }}"
            }[{{ interval }}]
        )
    ) * 100

Slide 29

Slide 29 text

Flagger CRD: MetricTemplate
(Same MetricTemplate manifest as Slide 28.)
Callout (spec.provider): specifying the metric provider
- Prometheus
- Datadog
- CloudWatch

Slide 30

Slide 30 text

Flagger CRD: MetricTemplate
(Same MetricTemplate manifest as Slide 28.)
Callout (spec.query): the query template
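For context, a MetricTemplate only takes effect once a Canary refers to it. The fragment below is a minimal sketch of such a reference, using the templateRef field seen in the Canary example on Slide 23. The metric name and the 5% ceiling are assumptions chosen for illustration.

```yaml
# Fragment of a Canary spec (not a complete manifest).
analysis:
  metrics:
    - name: "404s percentage"          # illustrative display name
      templateRef:
        name: not-found-percentage     # the MetricTemplate from Slide 28
        namespace: istio-system
      thresholdRange:
        max: 5                         # assumed limit: fail the check above 5%
      interval: 1m
```

The {{ namespace }}, {{ target }} and {{ interval }} placeholders in the query are filled in per Canary, so one template can be shared by several Canary resources.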

Slide 31

Slide 31 text

Flagger CRD: Alert Provider

apiVersion:
kind: AlertProvider
metadata:
  name: on-call
  namespace: flagger
spec:
  type: slack
  channel: on-call-alerts
  username: flagger
  # address:
  secretRef:
    name: on-call-url
---
apiVersion: v1
kind: Secret
metadata:
  name: on-call-url
  namespace: flagger
data:
  address:
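An AlertProvider is wired to a Canary through the alerts list of the analysis section. The following fragment is a sketch of that link, assuming the alerts, severity and providerRef fields of the Canary API; the alert name and the chosen severity are illustrative.

```yaml
# Fragment of a Canary spec (not a complete manifest).
analysis:
  alerts:
    - name: "on-call Slack"     # illustrative name
      severity: error           # assumed: only notify on failures
      providerRef:
        name: on-call           # the AlertProvider defined above
        namespace: flagger
```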

Slide 32

Slide 32 text

3. How does Flagger work?

Slide 33

Slide 33 text


Slide 34

Slide 34 text

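Canary Initialization
• When a Canary CR is created, Flagger creates several resources
• Application-related resources: basically copies of the original definitions
  • Deployment/DaemonSet: <name>-primary
    • The label is converted from "app: foo" to "app: foo-primary"
  • HPA: <name>-primary
  • Mounted ConfigMaps and Secrets: <name>-primary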

Slide 35

Slide 35 text

Canary Initialization
(Same bullets as Slide 34, with an overlay added:)
Existing Deployments and HPAs can be used as-is

Slide 36

Slide 36 text

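Canary Initialization
• When a Canary CR is created, Flagger creates several resources
• It creates the following three Services (plus a VirtualService or similar, depending on the mesh)
  • <name>.<namespace>.svc.cluster.local
    • selector: app=<name>-primary
  • <name>-primary.<namespace>.svc.cluster.local
    • selector: app=<name>-primary
  • <name>-canary.<namespace>.svc.cluster.local
    • selector: app=<name>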
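To make the selectors concrete, here is a sketch of what the three generated Services could look like for an application named foo. The namespace test and port 9898 are borrowed from the Canary example on Slide 23 as assumptions; the real objects Flagger generates carry additional metadata that is omitted here.

```yaml
# Apex service: the address clients normally call. It selects primary pods.
apiVersion: v1
kind: Service
metadata:
  name: foo
  namespace: test
spec:
  selector:
    app: foo-primary
  ports:
    - name: http
      port: 9898
      targetPort: 9898
---
# Primary service: also selects the Flagger-managed primary pods.
apiVersion: v1
kind: Service
metadata:
  name: foo-primary
  namespace: test
spec:
  selector:
    app: foo-primary
  ports:
    - name: http
      port: 9898
      targetPort: 9898
---
# Canary service: selects the pods of the user-managed deployment.
apiVersion: v1
kind: Service
metadata:
  name: foo-canary
  namespace: test
spec:
  selector:
    app: foo
  ports:
    - name: http
      port: 9898
      targetPort: 9898
```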

Slide 37

Slide 37 text

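Canary Initialization
• The original, user-managed deployment is called the canary
• The copied, Flagger-managed deployment is called the primary
• Once everything has been created, the user-managed deployment is set to replicas = 0
  • (This assumes replicas was not set in the original manifest)
  • The canary ends up with zero pods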

Slide 38

Slide 38 text

Canary Initialization (diagram)
Resources shown: deployment foo, HPA foo, secret foo, configmap foo; deployment foo-primary, HPA foo-primary, secret foo-primary, configmap foo-primary; service foo, service foo-primary, service foo-canary
Callout: Managed by Users

Slide 39

Slide 39 text

Canary Initialization (diagram)
Resources shown: deployment foo, HPA foo, secret foo, configmap foo; deployment foo-primary, HPA foo-primary, secret foo-primary, configmap foo-primary; service foo, service foo-primary, service foo-canary
Callout: Managed by Flagger

Slide 40

Slide 40 text

Canary Initialization (diagram)
Resources shown: deployment foo (replicas: 0), HPA foo, secret foo, configmap foo; deployment foo-primary, HPA foo-primary, secret foo-primary, configmap foo-primary; service foo, service foo-primary, service foo-canary
Callout: Set 'replicas = 0' on canary

Slide 41

Slide 41 text

Canary Initialization (diagram)
Resources shown: deployment foo (replicas: 0), HPA foo, secret foo, configmap foo; deployment foo-primary, HPA foo-primary, secret foo-primary, configmap foo-primary; service foo, service foo-primary, service foo-canary
Callout: Traffic

Slide 42

Slide 42 text

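Change Detection
• After initialization, Flagger waits for a change to the target
  • Target = deployment, daemonset
  • A change in the hash of spec.template
• Changes to ConfigMaps and Secrets mounted by the target are also detected
• At the time of detection, the change is not applied to the primary (naturally)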

Slide 43

Slide 43 text

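Promotion Process (note: this is only one example)
• Roll out the canary: remove replicas = 0
  • Roll back if it does not become healthy
• Repeat the following until 100% of the traffic flows to the canary
  • Shift a few percent of the traffic to the canary
  • Check the metrics: are they within the specified range?
    • If not, roll back
• Copy the deployment, configmap, secret, and hpa from the canary to the primary
  • That is, copy the user-managed resources to the Flagger-managed resources
• Route the traffic back to the primary
  • It can also be shifted back in stages (progressive promotion)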
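The loop above is driven by a handful of analysis fields. The fragment below is a sketch that reuses the interval, threshold, maxWeight and stepWeight values from the Canary example on Slide 23. The stepWeightPromotion field and its value are an assumption about how progressive promotion is configured, so check it against the Flagger version in use.

```yaml
# Fragment of a Canary spec (not a complete manifest).
analysis:
  interval: 1m              # one iteration of the loop per minute
  threshold: 10             # roll back after 10 failed metric checks
  stepWeight: 5             # shift 5% more traffic to the canary per iteration
  maxWeight: 50             # stop shifting at 50%, then promote
  # With these values the weight goes 5, 10, ..., 50, i.e. 10 steps,
  # so a successful analysis takes at least about 10 x 1m = 10 minutes.
  stepWeightPromotion: 25   # assumed field: return traffic to the primary
                            # in 25% steps instead of all at once
```

Smaller steps and longer intervals give the metrics more time to expose a bad release, at the cost of a slower rollout.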

Slide 44

Slide 44 text

Promotion Process (timeline diagram)
Scale up canary → Start analysis → Finish analysis → Update primary → Start progressive promotion → Scale down canary → Promotion finish

Slide 45

Slide 45 text

Rollback
• Rollback = bringing the number of pods belonging to the original deployment or daemonset down to zero
  • For a deployment: set 'replicas: 0'
  • For a daemonset: set a nodeSelector with a label that does not exist
• The manifest itself is not rolled back
  • GitOps friendly
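The two cases above can be pictured as the following partial objects, which show only the fields a rollback would touch. The name foo follows the earlier diagrams, and the nodeSelector label is a hypothetical placeholder, not the label Flagger actually uses.

```yaml
# Deployment case: the user-managed (canary) deployment is scaled to zero.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  replicas: 0
---
# DaemonSet case: a nodeSelector that matches no node leaves zero pods.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: foo
spec:
  template:
    spec:
      nodeSelector:
        example.com/no-such-label: "true"   # hypothetical label on no node
```

Because only the live objects are changed, the manifest in Git still describes the new version, and the primary keeps serving the old one.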

Slide 46

Slide 46 text

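Webhooks
• A webhook can be configured for each phase (a non-2xx response triggers a rollback)
  • confirm-rollout … before the canary is rolled out
  • pre-rollout … before the analysis starts
  • rollout … on every iteration of the analysis loop
  • confirm-promotion … before promotion to the primary
  • post-rollout … after promotion to the primary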
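A sketch of how the five hook types could appear together in one Canary follows. The type values come from the slide and the flagger-loadtester.test host from the Canary example on Slide 23. The gate path, the commands, the webhook names and the notifier URL are illustrative assumptions.

```yaml
# Fragment of a Canary spec (not a complete manifest).
analysis:
  webhooks:
    - name: "approve rollout"
      type: confirm-rollout       # checked before the canary is rolled out
      url: http://flagger-loadtester.test/gate/check   # assumed gate endpoint
    - name: "smoke test"
      type: pre-rollout           # runs once before the analysis starts
      url: http://flagger-loadtester.test/
      timeout: 30s
      metadata:
        type: bash
        cmd: "curl -s http://podinfo-canary.test:9898/healthz"   # assumed command
    - name: "load test"
      type: rollout               # runs on every analysis iteration
      url: http://flagger-loadtester.test/
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"   # assumed command
    - name: "approve promotion"
      type: confirm-promotion     # checked before promotion to the primary
      url: http://flagger-loadtester.test/gate/check   # assumed gate endpoint
    - name: "notify"
      type: post-rollout          # called after promotion has finished
      url: http://notifier.test/  # hypothetical receiver
```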

Slide 47

Slide 47 text

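Considerations / Limitations
• The number of pods simply doubles
• The Services are created by Flagger
  • Existing Services cannot be used as-is (they can, but that breaks the GitOps model)
• Canary releases based on native Services (the argo-rollouts approach) are not possible
  • Blue/Green is possible, but a service mesh or ingress controller is basically assumed
• Changes to the HPA are not detected
• It is not fully automated (and not absolutely safe)
  • Bugs and errors that cannot be seen in the metrics can still occur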

Slide 48

Slide 48 text

Conclusion
• Progressive Delivery is CD++
  • Reduces the risk of production releases
  • Balances agility and reliability
• Flagger implements Progressive Delivery on k8s (service mesh)
  • Multiple deployment strategies and metrics providers
  • Change detection for configmaps/secrets
• We welcome all contributions!
  • Stackdriver support, StatefulSet support, HPA change detection, etc.
• Feel free to ask questions on the weaveworks community Slack: #flagger
