Introduction to Flagger

Kubernetes Meetup Tokyo #32

mathetake

July 28, 2020

Transcript

  1. Introduction to Flagger: A Progressive Delivery Operator for Kubernetes

    @mathetake, Kubernetes Meetup Tokyo #32
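  2. Today's talk

    I will cover the following:
    • Background: what is Progressive Delivery?
    • Introduction to Flagger
    • Flagger's internal behavior
    An introduction so that you can start evaluating Flagger tomorrow.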
  3. 1. Background (Note: I assume GitOps here, but the discussion applies to other schools of thought as well.)

  4. Continuous X

    • Everyone loves Continuous X
    • The standard in the Cloud Native community
    • We cannot live without it
    • A rich set of software: Spinnaker, Flux, ArgoCD, …
    • Goal: Agility & Reliability
    • Automated testing, deployment
    https://www.weave.works/blog/automate-kubernetes-with-gitops
  5. Continuous X

    • Continuous X allows us to test/deploy tons of applications in a day
    • Merging a PR completes the deployment; no manual operations
    • Rollback = reverting the PR
    • Easy!
    • Hooray for the reconciliation loop
    • Nothing to be afraid of
  6. Continuous X

    Same content as slide 5, with one overlay: No Free Lunch
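  7. Continuous Delivery is hard

    • What about reliability after a deploy done by CD?
    • Right after a deploy, you have no choice but to watch logs and metrics
    • If things look bad, revert!
    • In the end, there is a human in the loop
    • The rollback operation itself can cause an incident
    • We humans are all fallible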
  8. Continuous Delivery is hard

    Same content as slide 7, with one overlay: we want to handle this in a smarter, smoother way.
  9. What is Progressive Delivery https://carlossg.github.io/presentations/2019-06_cdsummit/#/2/1

  11. What is Progressive Delivery

    • Naive CD means all-or-nothing deployment
      • The new version replaces all of the old one
      • A rollback is the same thing in reverse
    • Progressive Delivery = CD++ *
      • It makes ordinary CD smoother
      • Deploy gradually instead of all-or-nothing
      • Automate the "deploy gradually" part
      • It reduces the risk of production deployments
    * https://qiita.com/mumoshu/items/63b29bca6a052d8c7087
  12. Test in production meets P.D.

    • Progressive Delivery = deploying to production gradually ≒ trying out (testing) the new version in the production environment
    • Close to the idea of 'Test in production'
      • Staging cannot be production
      • Faster iteration, improved agility
    All-or-nothing deployment + Test in production = high risk
    Progressive Delivery + Test in production = low risk
    https://www.getambassador.io/docs/latest/topics/concepts/progressive-delivery/
  13. Implementation

    • Spinnaker
    • K8s native
      • Argo-rollouts: based on native Services*
      • weaveworks/flagger: based on a service mesh or ingress controller
    • Basic operation: roll out a little ⇔ check metrics
    * Older versions could not do traffic shifting in combination with a service mesh.
  14. 2. Introduction to Flagger

  15. What is Flagger

    • OSS for Progressive Delivery
    • github.com/weaveworks/flagger
    • Developed by Stefan at Weaveworks
    • Reached GA on June 17, 2020
    • Originally developed with Istio as a prerequisite → later gained support for multiple platforms
    Give developers confidence in automating the production releases
  16. Flagger's Features

    • Service mesh native
      • Istio, SMI (linkerd, crossover), Appmesh + ingress controllers
      • Fine tuned traffic shifting
    • Gitops native
    • Multiple deployment strategies
    • Custom metrics
    • Manual gating (approve, pause, resume), Webhooks
    • Alerting
  17. Deployment Strategy

    1. Canary Release (progressive traffic shifting)
      • For HTTP and gRPC applications
    2. A/B Testing (HTTP headers and cookies traffic routing)
      • For applications that need session affinity
    3. Blue/Green (traffic switching)
      • Any workload
    4. Blue/Green Mirroring (traffic shadowing)
      • For idempotent applications (the safest option)
      • Mirroring can also be used as a preliminary stage before a Canary Release
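The A/B Testing strategy above routes requests by HTTP headers and cookies. A minimal sketch of that routing decision follows. The header name `x-canary`, the value `insider`, and the cookie pattern are illustrative assumptions, not values taken from the slides, and in a real setup the mesh or ingress controller performs this matching.

```python
import re

# Hypothetical match rules for illustration only.
RULES = [
    {"header": "x-canary", "exact": "insider"},
    {"header": "cookie", "regex": r"(^|;\s*)canary=always(;|$)"},
]


def routes_to_canary(headers: dict[str, str]) -> bool:
    """Return True when any rule matches the request headers."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    for rule in RULES:
        value = lowered.get(rule["header"])
        if value is None:
            continue
        if "exact" in rule and value == rule["exact"]:
            return True
        if "regex" in rule and re.search(rule["regex"], value):
            return True
    return False
```

Because the decision depends only on the request, the same user keeps hitting the same version, which is why this strategy suits applications that need session affinity.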
  18. Deployment Strategy - Canary Release

  19. Deployment Strategy - A/B testing

  20. Deployment Strategy - Blue Green

  21. Control loop / phase

    Phases: Initialize, Stable, Analysis, Promotion, Rollback
    • Deployment strategy
    • Traffic shifting
    • Webhook
    • Metrics check
    • Manual gating
    • Alert provider
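The phases named on the control loop slide can be read as a small state machine. The sketch below uses the slide's phase names, but the event names and the transition table are assumptions made for illustration; they are not Flagger's internal status values.

```python
# Phase names come from the slide; events and transitions are assumed.
TRANSITIONS = {
    ("Initialize", "initialized"): "Stable",
    ("Stable", "change_detected"): "Analysis",
    ("Analysis", "checks_passed"): "Promotion",
    ("Analysis", "checks_failed"): "Rollback",
    ("Promotion", "promoted"): "Stable",
    ("Rollback", "rolled_back"): "Stable",
}


def next_phase(phase: str, event: str) -> str:
    """Return the next phase, or stay in place for an unknown event."""
    return TRANSITIONS.get((phase, event), phase)


def run(events: list[str], phase: str = "Initialize") -> list[str]:
    """Replay events and return every phase visited, including the start."""
    history = [phase]
    for event in events:
        phase = next_phase(phase, event)
        history.append(phase)
    return history
```

Both the success path and the failure path end in Stable, which matches the idea of a loop that waits for the next change.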
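  22. Flagger CRD

    1. Canary
      • The main resource
      • Sufficient on its own if you only need the minimum functionality
    2. MetricTemplate
      • The query for the metrics to analyze
      • Specifies the metric provider
    3. AlertProvider
      • Specifies where delivery notifications are sent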
  23. Flagger CRD: Canary

    apiVersion: flagger.app/v1beta1
    kind: Canary
    metadata:
      name: podinfo
      namespace: test
    spec:
      provider: istio
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: podinfo
      progressDeadlineSeconds: 60
      autoscalerRef:
        apiVersion: autoscaling/v2beta1
        kind: HorizontalPodAutoscaler
        name: podinfo
      service:
        name: podinfo
        port: 9898
        targetPort: 9898
        portName: http
        portDiscovery: true
        match:
          - uri:
              prefix: /
        timeout: 5s
      analysis:
        interval: 1m
        threshold: 10
        maxWeight: 50
        stepWeight: 5
        metrics:
          - name: request-success-rate
            thresholdRange:
              min: 99
            interval: 1m
          - name: "database connections"
            templateRef:
              name: db-connections
            thresholdRange:
              min: 2
              max: 100
            interval: 1m
        webhooks:
          - name: "load test"
            type: rollout
            url: http://flagger-loadtester.test/
            metadata:
              cmd: "…"
  24. Flagger CRD: Canary

    Same manifest as slide 23. Highlighted: specifying the application (targetRef)
    • Deployment
    • DaemonSet
  25. Flagger CRD: Canary

    Same manifest as slide 23. Highlighted: specifying the HPA (autoscalerRef, optional)
  26. Flagger CRD: Canary

    Same manifest as slide 23. Highlighted: the service definition (service)
  27. Flagger CRD: Canary

    Same manifest as slide 23. Highlighted: the canary analysis definition (analysis)
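To make the `analysis` values in the manifest concrete, here is a small sketch of the arithmetic they imply. It assumes one weight step per `interval` and that `threshold` counts failed checks before rollback; the function names are hypothetical.

```python
from typing import Optional


def weight_schedule(step_weight: int, max_weight: int) -> list[int]:
    """Canary traffic weights visited when every check passes."""
    return list(range(step_weight, max_weight + 1, step_weight))


def minimum_promotion_minutes(interval_min: int, step_weight: int, max_weight: int) -> int:
    """Lower bound on analysis time: one interval per weight step."""
    return interval_min * len(weight_schedule(step_weight, max_weight))


def rollback_minutes(interval_min: int, threshold: int) -> int:
    """Time until rollback if every check fails: one interval per failed check."""
    return interval_min * threshold


def within_range(value: float, minimum: Optional[float] = None,
                 maximum: Optional[float] = None) -> bool:
    """Check a metric value against a thresholdRange with optional min/max."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True
```

With the slide's values (`interval: 1m`, `stepWeight: 5`, `maxWeight: 50`, `threshold: 10`), the canary passes through ten weight steps, so a fully successful analysis takes at least about ten minutes under these assumptions.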
  28. Flagger CRD: MetricTemplate

    apiVersion: flagger.app/v1beta1
    kind: MetricTemplate
    metadata:
      name: not-found-percentage
      namespace: istio-system
    spec:
      provider:
        type: prometheus
        address: http://prometheus.istio-system:9090
      query: |
        100 - sum(
            rate(
                istio_requests_total{
                  reporter="destination",
                  destination_workload_namespace="{{ namespace }}",
                  destination_workload="{{ target }}",
                  response_code!="404"
                }[{{ interval }}]
            )
        )
        /
        sum(
            rate(
                istio_requests_total{
                  reporter="destination",
                  destination_workload_namespace="{{ namespace }}",
                  destination_workload="{{ target }}"
                }[{{ interval }}]
            )
        ) * 100
  29. Flagger CRD: MetricTemplate

    Same manifest as slide 28. Highlighted: specifying the metric provider (provider)
    • Prometheus
    • Datadog
    • CloudWatch
  30. Flagger CRD: MetricTemplate

    Same manifest as slide 28. Highlighted: the query template (query)
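The query in the MetricTemplate contains `{{ namespace }}`, `{{ target }}`, and `{{ interval }}` placeholders. The sketch below shows one way such placeholders could be filled in and what the resulting expression computes. It is an illustration written for this transcript, not Flagger's own rendering code, and both function names are hypothetical.

```python
import re


def render_query(template: str, variables: dict[str, str]) -> str:
    """Replace every {{ name }} placeholder with its value."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value for placeholder {name!r}")
        return variables[name]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def not_found_percentage(non_404_rate: float, total_rate: float) -> float:
    """Same arithmetic as the query: 100 - non-404 / total * 100."""
    return 100 - non_404_rate / total_rate * 100
```

For example, if 95 of every 100 requests per second return something other than 404, the expression evaluates to roughly 5, which is then compared against the `thresholdRange` of the metric.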
  31. Flagger CRD: Alert Provider

    apiVersion: flagger.app/v1beta1
    kind: AlertProvider
    metadata:
      name: on-call
      namespace: flagger
    spec:
      type: slack
      channel: on-call-alerts
      username: flagger
      # address: https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK
      secretRef:
        name: on-call-url
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: on-call-url
      namespace: flagger
    data:
      address: <encoded-url>
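The `<encoded-url>` placeholder in the Secret stands for a base64-encoded value, because Kubernetes stores the values under `data` in base64. A minimal sketch of producing and checking such a value follows; the helper names are hypothetical and the URL is the placeholder from the slide, not a real webhook.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Base64-encode a string for use under a Secret's `data` field."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse of encode_secret_value, useful for verifying a manifest."""
    return base64.b64decode(encoded).decode("utf-8")


# Placeholder URL copied from the slide's commented-out `address` field.
webhook_url = "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
encoded_url = encode_secret_value(webhook_url)
```

Base64 is an encoding, not encryption, so the Secret manifest still needs the usual care before it is committed to a GitOps repository.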
  32. 3. How does Flagger work?

  33. Overview https://github.com/stefanprodan/gitops-progressive-delivery

  34. Canary Initialization

    • As soon as the Canary CR is created, Flagger creates several resources
    • Application-related resources: basically copies of the original definitions
      • Deployment/DaemonSet: <canary.spec.targetRef.name>-primary
        • The label is rewritten from "app: foo" to "app: foo-primary"
      • HPA: <canary.spec.autoscalerRef.name>-primary
      • Mounted configmaps and secrets: <name>-primary
  35. Canary Initialization

    Same content as slide 34, with one overlay: existing Deployments and HPAs can be used as they are.
  36. Canary Initialization

    • As soon as the Canary CR is created, Flagger creates several resources
    • It creates the following three Services (plus a VirtualService etc., depending on the mesh)
      • <service.name>.<namespace>.svc.cluster.local
        • selector: app=<name>-primary
      • <service.name>-primary.<namespace>.svc.cluster.local
        • selector: app=<name>-primary
      • <service.name>-canary.<namespace>.svc.cluster.local
        • selector: app=<name>
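The naming rules from the Canary Initialization slides can be collected in one place. The sketch below only derives names and selectors as the slides describe them; it does not talk to a cluster, and the function name and return shape are assumptions for illustration.

```python
def initial_resources(target: str, autoscaler: str, service: str, namespace: str) -> dict:
    """Names of the resources created at initialization, per the slides."""
    domain = f"{namespace}.svc.cluster.local"
    primary_selector = f"app={target}-primary"
    return {
        # Copy of the user's Deployment/DaemonSet.
        "workload": f"{target}-primary",
        # Copy of the user's HPA.
        "hpa": f"{autoscaler}-primary",
        # Three Services, keyed by DNS name, with their pod selectors.
        "services": {
            f"{service}.{domain}": primary_selector,
            f"{service}-primary.{domain}": primary_selector,
            f"{service}-canary.{domain}": f"app={target}",
        },
    }
```

Note that the plain `<service.name>` Service selects the primary pods, so ordinary clients keep talking to the stable version while only the `-canary` Service points at the user-managed workload.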
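  37. Canary Initialization

    • The original, user-managed deployment is called the canary
    • The copied, Flagger-managed deployment is called the primary
    • Once everything has been created, the user-managed deployment is set to replicas = 0
      • (assuming replicas was not set originally)
      • The canary ends up with zero pods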
  38. Canary Initialization (diagram)

    Resources shown: deployment foo, HPA foo, secret foo, configmap foo; deployment foo-primary, HPA foo-primary, secret foo-primary, configmap foo-primary; service foo, service foo-primary, service foo-canary
    Annotation: Managed by Users
  39. Canary Initialization (diagram)

    Same diagram as slide 38. Annotation: Managed by Flagger
  40. Canary Initialization (diagram)

    Same diagram as slide 38, with deployment foo at replicas: 0. Annotation: Set 'replicas = 0' on canary
  41. Canary Initialization (diagram)

    Same diagram as slide 38, with deployment foo at replicas: 0. Annotation: Traffic
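  42. Change Detection

    • After initialization, Flagger waits for a change to the target
      • Target = deployment, daemonset
      • A change in the hash of spec.template
      • Changes to the configmaps and secrets mounted by the target are also detected
    • At the moment of detection, the change is not yet reflected in the primary (naturally)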
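Detecting a change through the hash of `spec.template` can be sketched as follows. The canonical-JSON plus SHA-256 scheme used here is an assumption chosen for illustration; the slides do not say which hash Flagger computes.

```python
import hashlib
import json


def template_hash(pod_template: dict) -> str:
    """Hash a pod template in a way that ignores key ordering."""
    canonical = json.dumps(pod_template, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(last_applied_hash: str, pod_template: dict) -> bool:
    """True when the current template differs from the last one seen."""
    return template_hash(pod_template) != last_applied_hash
```

Storing only the last hash keeps the comparison cheap, and since the hash covers just the pod template, edits elsewhere in the manifest do not start a new analysis in this sketch.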
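  43. Promotion Process (note: this is only one example)

    • Roll out the canary: remove "replicas = 0"
      • Roll back if it does not become healthy
    • Repeat the following until 100% of the traffic flows to the canary
      • Shift a few percent of the traffic to the canary
      • Check the metrics: are they within the specified range?
      • If not, roll back
    • Copy the deployment, configmap, secret, and hpa from canary to primary
      • That is, copy the user-managed resources to the Flagger-managed resources
    • Return the traffic to the primary
      • This can also be done in stages (progressive promotion)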
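The shift-and-check loop of the promotion process can be simulated in a few lines. This is a sketch under assumptions: `metric_ok` is a hypothetical callback standing in for the real metric queries, and `threshold` is treated as the number of failed checks tolerated before rollback (with `threshold=1` the loop behaves as the simplified description, where a single failed check rolls back).

```python
from typing import Callable


def run_canary(step_weight: int, max_weight: int, threshold: int,
               metric_ok: Callable[[int], bool]) -> tuple[str, list[int]]:
    """Simulate traffic shifting; return the outcome and the weights visited."""
    weight = 0
    failures = 0
    history: list[int] = []
    while weight < max_weight:
        # Shift a few more percent of traffic to the canary.
        weight = min(weight + step_weight, max_weight)
        history.append(weight)
        # Stay at this weight until the check passes or failures run out.
        while not metric_ok(weight):
            failures += 1
            if failures >= threshold:
                return "rollback", history
    return "promote", history
```

For example, `run_canary(10, 50, 1, lambda w: w < 30)` stops at the 30% step and reports a rollback, while a check that always passes walks every step up to `max_weight` and reports a promotion.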
  44. Promotion Process https://github.com/weaveworks/flagger/pull/593

    Stages shown in the diagram:
    • Scale up canary
    • Start analysis
    • Finish analysis
    • Update primary
    • Start progressive promotion
    • Scale down canary
    • Promotion finish
  45. Rollback

    • rollback = bringing the pods that belong to the original deployment or daemonset down to zero
      • For a deployment: set 'replicas: 0'
      • For a daemonset: set a nodeSelector with a label that does not exist
    • The manifest itself is not rolled back
      • GitOps friendly
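The two rollback mechanisms differ only in how the pod count is driven to zero. A minimal sketch operating on manifest dictionaries follows; the label key `example.invalid/scale-to-zero` is a made-up placeholder, since the slide does not name the label Flagger uses.

```python
import copy

# Hypothetical label that no node carries.
NON_EXISTENT_LABEL = "example.invalid/scale-to-zero"


def scale_to_zero(manifest: dict) -> dict:
    """Return a copy of the workload manifest that schedules zero pods."""
    result = copy.deepcopy(manifest)
    spec = result.setdefault("spec", {})
    if result.get("kind") == "Deployment":
        spec["replicas"] = 0
    elif result.get("kind") == "DaemonSet":
        # A DaemonSet has no replicas field, so make it match no node instead.
        pod_spec = spec.setdefault("template", {}).setdefault("spec", {})
        pod_spec["nodeSelector"] = {NON_EXISTENT_LABEL: "true"}
    else:
        raise ValueError(f"unsupported kind: {result.get('kind')!r}")
    return result
```

Neither branch touches the pod template's image or configuration, which reflects the point that the manifest itself is not rolled back: the Git repository remains the source of truth and the fix arrives as a new commit.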
  46. Webhooks

    • A webhook can be configured for each phase (a non-2xx response triggers a rollback)
      • confirm-rollout … before the canary rollout
      • pre-rollout … before the analysis starts
      • rollout … on every iteration of the analysis loop
      • confirm-promotion … before promotion to the primary
      • post-rollout … after promotion to the primary
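The webhook rule above (anything other than 2xx counts as a failure) can be written as a small check over the hook types in the order the slide lists them. This sketch follows the slide's simplified description; how each hook type is actually handled on failure may differ per type, and the function names are hypothetical.

```python
from typing import Optional

# Hook types in the order given on the slide.
HOOK_ORDER = [
    "confirm-rollout",
    "pre-rollout",
    "rollout",
    "confirm-promotion",
    "post-rollout",
]


def is_success(status_code: int) -> bool:
    """Only 2xx responses count as a passing webhook."""
    return 200 <= status_code < 300


def first_failing_hook(responses: dict[str, int]) -> Optional[str]:
    """Return the earliest hook whose response is not 2xx, if any."""
    for hook in HOOK_ORDER:
        status = responses.get(hook)
        if status is not None and not is_success(status):
            return hook
    return None
```

A manual gate fits the same model: an approval endpoint that answers with a non-2xx status until someone approves keeps the rollout from proceeding.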
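  47. Considerations / Limitations

    • The number of pods simply doubles
    • Services are created by Flagger
      • Existing Services cannot be used as they are (they can, but that is not acceptable from a GitOps point of view)
      • Canary releases using native Services are not possible (the argo-rollouts approach)
      • Blue/Green is possible, but a service mesh or ingress controller is basically assumed
    • Changes to the HPA are not detected
    • It is not fully automated (it is not absolutely safe)
      • There can be bugs and errors that metrics cannot reveal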
  48. Conclusion

    • Progressive Delivery is CD++
      • It reduces the risk of production releases
      • It balances agility and reliability
    • Flagger provides Progressive Delivery on k8s (with a service mesh)
      • Multiple deployment strategies and metrics providers
      • Change detection for configmaps and secrets
    • We welcome all contributions!
      • Stackdriver support, StatefulSet support, HPA change detection, etc.
    • Feel free to ask questions on the weaveworks community slack: https://slack.weave.works/ #flagger
  49. References

    • https://github.com/weaveworks/flagger
    • https://docs.flagger.app/
    • https://www.weave.works/blog/announcing-flagger-1-0
    • https://medium.com/google-cloud-jp/gke-istio-flagger%E3%81%AB%E3%82%88%E3%82%8Bprogressive-delivery-5f1ea9b627c1
    • https://www.slideshare.net/weaveworks/whats-new-in-flagger-10-with-stefan-prodan
    • https://medium.com/google-cloud/automated-canary-deployments-with-flagger-and-istio-ac747827f9d1
    • https://medium.com/@dlorenc/pitfalls-of-progressive-delivery-114c6e3f9dbb
    • https://carlossg.github.io/presentations/2019-06_cdsummit
    • https://medium.com/@copyconstruct/testing-in-production-the-safe-way-18ca102d0ef1
    • https://www.infoq.com/presentations/progressive-delivery
    • https://qiita.com/mumoshu/items/63b29bca6a052d8c7087
    • https://www.getambassador.io/docs/latest/topics/concepts/progressive-delivery/