Kubernetes at NU.nl (2019-09-05)

Tibo Beijen
September 05, 2019

Slides of the presentation about Kubernetes practices and learnings at NU.nl.
This presentation was the first of two at the Dutch Kubernetes meetup, held at the Sanoma Netherlands offices on Sept. 5th, 2019.

#kubernetes #aws #devops #cncf #nu.nl #cloud-computing

Transcript

  1. @

  2. NU.nl About • First Dutch digital news platform • Unique visitors: 7 mln/month, 2.1 mln/day • Page hits: ~12 mln/day • API: ~150k rpm (~2,500 rps)
  3. NU.nl Sanoma • Part of Sanoma • NL: NU.nl, Viva, Libelle, Scoopy • FI: Helsingin Sanomat • Reaching ~9.8 mln Dutch people / month
  4. IT organization Teams • NU.nl teams: Web 1 (application / front-end-ish), Web 2 (application / back-end-ish / infra), Feature 1 & 2 (cross-discipline), iOS, Android • Sanoma teams: DevSupport, Mediatool, Content Aggregation
  5. NU.nl Growing number of teams • Increased number of parallel workflows: testing, releasing, roadmaps • Knowing about everything no longer possible • Aligning ‘procedures by agreement’ increasingly hard
  6. Current infrastructure AWS accounts & VPCs • [Diagram: VPC sanoma with RDS, ElastiCache, ALBs, EC2 and CloudFront hosting the API, CMS, WWW and other applications; separate VPCs nu-test and nu-prod, each containing a K8S cluster and other workloads]
  7. Development workflow From code to release • Code • Automated tests • Code review • Manually initiated deploy to test • Feature test • Manually initiated deploy to staging • Exploratory test • Manually initiated deploy to production
  8. DevOps practices Solid foundation • All infra in code • Terraform • Tooling provides mechanisms for: authorization, managing TF state files
  9. DevOps practices But… • Setting up additional test environments slow • Slow feedback loop • Terraform plan vs apply (surprise surprise, it didn’t work) • Ansible (~20 minutes) • Vagrant? (but not fully representative of EC2) • Config drift • Hard to nail down every system package version • EC2 instances having different lifecycles
  10. DevOps practices But… (part 2) • No scaling infra* • Heavily invested in Ansible • Config & secrets management problematic • GUIs time consuming • No change history • Or highly detached from code history • No context • Not overly secret • *Yes, we know it’s 2019
  11. DevOps practices But… (part 3) • Current deployment system assumes fixed set of servers • Possible alternatives include: • ASG rolling updates (can get slow) • Pull current application code on start-up (even slower) • Bake AMI • Periodically poll for application version to be deployed • Works quite well • …as long as new code combined with config doesn’t break • So a certain level of orchestration would be needed
  12. Timing What direction to move? • DevOps challenges • Desire to improve delivery process, having true artifacts • Early 2018: • Containers are a well-established way of ‘packaging’ an application • Kubernetes getting out of early-adopters phase • NU.nl (re-)launching a new product: NUjij
  13. Improvement layers A journey or a destination? 1: Containers as artifacts • Versatile • Forces us to do certain things right • 12factor • Centralized logging • Easily moved through a pipeline • Lots of tooling
  14. Improvement layers A journey or a destination? 2: A flexible platform to deploy and run containerized applications on • Tackling challenges at platform level instead of per application: • Scaling • Security updates • Observability • Deployment & configuration process
  15. Improvement layers A journey or a destination? 2: A flexible platform to deploy and run containerized applications on • Kubernetes • Rapidly increasing adoption • Short feedback loop • Ability to run locally (unlike, say, ECS) • Easily stamp out deployments for: • feature testing/demo-ing • e2e tests
  16. Narrowing the scope Let’s not get carried away • The goal is not: • To chop up all of our applications into nano-/micro-services • They’re not that monolithic anyway • To put everything in Kubernetes • Managed AWS services where possible • Redis, RDS • Focus on agility and efficiency of what we change most frequently: code
  17. Multiple clusters By criticality • 3 AWS accounts, 3 clusters: • osc-nu-prod: production • osc-nu-test: test, staging • osc-nu-dev: proofing infra changes
  18. Kops Why Kops? • Manages cluster upgrades • Rolling upgrade • Draining nodes • EKS not yet available • Let alone in eu-west-1
  19. Components kube-system • Networking • Calico • EFS • previousnext/k8s-aws-efs • No AZ restrictions when re-scheduling pods • Creates a new EFS filesystem for each PersistentVolumeClaim • Security & reliability (isolated IOPS budgets) • Slow on initial deploy
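    A hedged illustration of what that looks like from an application's point of view (the storage class name ‘efs’ is an assumption, not necessarily the one used at NU.nl):

      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: shared-data            # hypothetical claim name
      spec:
        storageClassName: efs        # assumed class backed by previousnext/k8s-aws-efs,
                                     # which creates a dedicated EFS filesystem per claim
        accessModes:
          - ReadWriteMany            # EFS mounts from any AZ, so pods reschedule freely
        resources:
          requests:
            storage: 5Gi             # EFS is elastic; the size is mostly informational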
  20. Components kube-system • AWS IAM Authenticator • The ‘Zalando suite’: • Skipper • Skipper DaemonSet • kube-ingress-aws-controller Deployment • ExternalDNS • Configures PowerDNS (& others) based on ingress host
  21. Components Zalando skipper • Skipper DaemonSet • Feature rich (metrics, shadow traffic, blue/green) • kube-ingress-aws-controller Deployment • https://github.com/zalando-incubator/kube-ingress-aws-controller • Sets up & manages ALB • Finds appropriate ACM certificate • Supports multiple ACM certificates per ALB
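    A minimal Ingress sketch (host and service names hypothetical) of how the pieces cooperate: kube-ingress-aws-controller provisions the ALB and selects an ACM certificate matching the host, ExternalDNS creates the DNS record, and skipper routes traffic to the service:

      apiVersion: extensions/v1beta1   # Ingress API group as of 2019
      kind: Ingress
      metadata:
        name: www
      spec:
        rules:
          - host: www.example.nl       # hypothetical; ExternalDNS creates this record,
                                       # the ALB controller finds an ACM cert covering it
            http:
              paths:
                - backend:
                    serviceName: www   # hypothetical Service that skipper routes to
                    servicePort: 80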
  22. Components Autoscaling • Horizontal Pod Autoscaler • Scales number of pods based on (CPU) utilization • Cluster autoscaler • Running on master nodes • Scales ASG out when pods pending • Scales ASG in when nodes underutilized
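    A minimal HorizontalPodAutoscaler sketch (names and thresholds hypothetical). When scaled-up pods no longer fit on the nodes, they go pending and cluster-autoscaler grows the ASG:

      apiVersion: autoscaling/v1
      kind: HorizontalPodAutoscaler
      metadata:
        name: www
      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: www                        # hypothetical deployment to scale
        minReplicas: 3
        maxReplicas: 20
        targetCPUUtilizationPercentage: 70 # measured against the pods' CPU request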
  23. Jenkins Temporary deployment for running tests • Deploy to temp. namespace • Jenkins-SU • Run tests in deployment • Deploy to test/staging/production • By bumping image version • Production: Jenkins-SU • Clean up temp. namespace • Jenkins-SU
  24. Jenkins Jenkins-SU • Sets up namespace • Adding RBAC for Jenkins • Only if ns name matches pattern ‘Jenkins-*’ • Deletes namespace • Only if ns name matches pattern ‘Jenkins-*’ • Avoids need for Jenkins to be able to delete every namespace

      curl -X POST --user "${JENKINS_SU_AUTH}" --data "{\"name\": \"${K8S_BUILD_NS}\"}" http://su.jenkins-su/ns/
      curl -X DELETE --user "${JENKINS_SU_AUTH}" --data "{\"name\": \"${K8S_BUILD_NS}\"}" http://su.jenkins-su/ns/
  25. Kubernetes in action Questions • Will it be stable? • Will we be able to operate it? • Should we wait for EKS? • Do we actually want EKS? What will EKS be like?
  26. Incident 1 Accidentally trying to load an Elasticsearch index of 90 GB • Misconfigured ElastAlert (trying to read the entire index) • No memory limit configured
  27. Incident 1 Accidentally trying to load an Elasticsearch index of 90 GB • Required manual intervention: yes • Stopping the bleeding: • Remove ElastAlert • Permanent fixes: • Don’t load the entire index • Apply limits
  28. Incident 2 Rapid traffic increase affecting core components • 2019-03-18 Utrecht shooting • 11:11 First article published • 11:56 Breaking push notification • CPU-burstable pods driving nodes to 100% CPU • Core components (kubelet, ingress) suffering
  29. Incident 2 Rapid traffic increase affecting core components • [Diagram: a node running pods with 0.4 CPU request / 0.8 CPU limit alongside kubelet and skipper; at 80% node CPU utilization all is well, but with pods bursting to 120% of node capacity kubelet and skipper run into problems]
  30. Incident 2 Rapid traffic increase affecting core components • Required manual intervention: no • Fixes: • Reduce the amount of CPU pods can burst • Increase resource requests of skipper • Mind QoS classes: Guaranteed, Burstable, Best effort • Reserve CPU & memory for kubelet • --kube-reserved • --system-reserved
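    Two sketches of those fixes, with hypothetical values. Equal requests and limits put a pod such as skipper in the Guaranteed QoS class; reserving node capacity keeps the kubelet responsive even when pods burst:

      # Container resources: requests == limits on every resource -> Guaranteed QoS
      resources:
        requests:
          cpu: 500m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 256Mi

      # Reserving CPU & memory for kubelet/system daemons, sketched as a kops
      # cluster-spec fragment (equivalent to --kube-reserved / --system-reserved):
      kubelet:
        kubeReserved:
          cpu: 100m
          memory: 256Mi
        systemReserved:
          cpu: 100m
          memory: 256Mi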
  31. Incident 3 Application update increasing memory footprint • Upgrade including move from MongoDB 3 to MongoDB 4 • HorizontalPodAutoscaler based on CPU • Scaling based on CPU not kicking in • New, increased memory footprint causing OOMKilled pods
  32. Incident 3 Application update increasing memory footprint • Required manual intervention: yes • Stopping the bleeding: • Increase memory limit of Talk pods • Permanent fixes: • Adjust CPU request/limit & HPA thresholds • Scale on both CPU and memory • Note: not all applications ‘give back’ memory • Set memory limit higher than request to prevent ‘snowball effect’
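    A resources sketch along those lines (values hypothetical): the memory limit sits well above the request, so a pod can temporarily absorb traffic from an OOMKilled neighbour instead of becoming the next domino:

      resources:
        requests:
          cpu: 250m
          memory: 512Mi   # what the scheduler packs nodes by
        limits:
          cpu: 500m
          memory: 1Gi     # headroom above the request against the snowball effect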
  33. Incident 3 OOMKilled snowball effect • [Diagram, steps 1-4: one pod in a set gets OOMKilled; while it restarts, the remaining pods absorb its load, grow in memory usage and get OOMKilled in turn]
  34. That’s not fine Is it? • On the positive side: • All incidents are the result of (lack of) resource limit configuration • This can be learned • On the negative side: • This needs to be learned • Note: ‘availability bias’
  35. Automation Improving the pipeline • Automating setting the image version is not enough • Rolling out Kubernetes manifests still a manual task • Updating configuration & secrets still a manual task • Duplication in manifests between stages • Not easily seen what parts are different • Differences intentional or accidental? • This actually slows us down • Does git represent the current state?

      kubectl -n talk get secrets env -o json | jq -r '.data | map_values(@base64d) | to_entries | .[] | .key + "=\"" + .value + "\""'
  36. Helm The package manager for Kubernetes • Charts • Configured via values • It’s like Terraform modules • Or Ansible group_vars • Leveraging community knowledge and efforts • E.g. prometheus-operator • No need to copy charts, able to reference • Helm v3
  37. SOPS: Secrets OPerationS Secrets management stinks, use some sops! • By Mozilla • Manage AWS API access, not keys • Versatile • YAML, JSON, ENV, INI, binary (plain text) • Not limited to Kubernetes • Meaningful diffs • Alternatives considered: • Kamus • Bitnami SealedSecrets
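    A minimal .sops.yaml sketch (KMS key ARN hypothetical). Anyone whose IAM identity may use the KMS key can edit the file with plain `sops secrets.yaml`; values are encrypted per key, which keeps diffs meaningful:

      creation_rules:
        - path_regex: .*/secrets\.yaml$
          kms: arn:aws:kms:eu-west-1:111122223333:key/00000000-0000-0000-0000-000000000000
          encrypted_regex: ^(data|stringData)$   # only encrypt the values of these keys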
  38. Helmfile Wiring it together • Charts • Referenced from online chart sources or local • Environments • Test, staging, production • Referencing values and secrets • Releases • Release name • Reference to chart • Values (can be a templated file, using vars and secrets from environment)
  39. Helmfile Wiring it together • [Diagram: a Helmfile environment combines environment values, secrets (SOPS) and ENV vars, feeding values into releases X, Y and Z]
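    A minimal helmfile.yaml sketch of that wiring (names and paths hypothetical):

      environments:
        test:
          values:
            - environments/test/values.yaml
          secrets:
            - environments/test/secrets.yaml     # SOPS-encrypted, decrypted on the fly
        production:
          values:
            - environments/production/values.yaml
          secrets:
            - environments/production/secrets.yaml

      releases:
        - name: talk
          namespace: talk
          chart: ./charts/talk                   # local chart; a repo reference also works
          values:
            - values/talk.yaml.gotmpl            # templated file; can pull in environment
                                                 # values, secrets and ENV vars, e.g.
                                                 # tag: {{ requiredEnv "IMAGE_TAG" }}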
  40. Helmfile Wiring it together • Advantages: • Meaningful git diffs • Easily manage multiple releases in a single pipeline, e.g.: • Everything related to monitoring and logging • kube-system • Declarative definition • Of what would otherwise be numerous helm args and steps in CI/CD pipeline
  41. Helmfile Wiring it together • Advantages (continued): • Ability to pass in ENV vars • E.g. build result image tags • Ability to reference complex charts created by the community • Charts as a building block allow re-use. Example: • Instead of plain YAML you write a chart • If fitting the workflow, the chart can be a published artifact • Chart can be re-used, e.g. in e2e tests
  42. Helmfile Wiring it together • Disadvantages: • 2 levels of templating • Chart itself • Only if writing own charts • Environment & release values into Helm values • Template error messages not overly clear • Or even misleading • At least it breaks
  43. Helmfile Final words But Tiller? • Helm as a templating engine • Option: using Helm 2 ‘Tillerless’ • Tiller outside of cluster, not bypassing RBAC • Start using Helm as package manager when Helm 3 settles down • Easy removal of temp. per-feature deploys • Diffs
  44. Auto-scaling Types of scaling • Reactive • Breaking news • K8S cluster-autoscaler • Can’t schedule pod? Add nodes. • Predictive • Ticket sale start • Black Friday
  45. Auto-scaling Types of scaling • From within cluster • K8S cluster-autoscaler • From outside of cluster • ASG scaling policies
  46. Cluster auto-scaler Bag of tricks • Mix predictive and reactive • Add ASG instances without telling cluster-autoscaler • Traffic is expected to arrive by the time cluster-autoscaler would start to scale in, leaving plenty of resources as needed • Pause pods • Lower-priority pods that can safely be evicted • Effectively ‘creating headroom’ in the cluster (see the sketch below)
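    A sketch of the pause-pod trick (names and sizes hypothetical): placeholder pods with negative priority reserve capacity; when real pods need room, the scheduler evicts the placeholders and cluster-autoscaler adds nodes to restore the headroom:

      apiVersion: scheduling.k8s.io/v1
      kind: PriorityClass
      metadata:
        name: overprovisioning
      value: -1                           # below the default 0: evicted first
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: overprovisioning
      spec:
        replicas: 2                       # amount of headroom to keep around
        selector:
          matchLabels: {app: overprovisioning}
        template:
          metadata:
            labels: {app: overprovisioning}
          spec:
            priorityClassName: overprovisioning
            containers:
              - name: pause
                image: k8s.gcr.io/pause:3.1   # does nothing; just holds the reservation
                resources:
                  requests:
                    cpu: "1"
                    memory: 1Gi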
  47. Considerations When engaging ‘ludicrous mode’™ • Can the control plane handle scale? • Kops • Size master nodes for max. cluster size • Overhead cost • EKS • What’s behind the abstraction? • ELB 503s exist after all • Plan: proofs of concept
  48. Consider EKS Managed control plane • EKS: • Managed control plane • Easier: EKS IAM roles for pods • Launched 2019-09-04 (yesterday)* • Probably cheaper (≈2/3 the cost of 3× m4.large masters) • Kops: • Total control over setup • Smooth rolling upgrade process • No VPC CNI pod density limitations • * https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/
  49. EKS IAM roles for pods Also possible on DIY clusters, officially launched yesterday • OIDC federated access (OpenID Connect identity provider) • Assume role via Security Token Service (STS) • Projected service account tokens (JWT) in pod • STS validates JWT tokens against the OIDC provider • Boils down to: • Enable/set up prerequisites in cluster • Add ServiceAccount having IAM role annotation to pod • Use a recent AWS SDK
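    What that boils down to on the application side, sketched with a hypothetical role ARN: annotate a ServiceAccount, reference it from the pod, and a recent AWS SDK exchanges the projected JWT for credentials via STS:

      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: talk
        annotations:
          # IAM role to assume; the role trusts the cluster's OIDC provider
          eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/talk-app
      ---
      # In the pod spec:
      #   serviceAccountName: talk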
  50. Multiple clusters per AWS account Don’t lock ourselves in a corner • [Diagram: moving from api.<aws-account-name>.<k8s-sanoma-domain> (single Route53 zone) to api.<cluster-name>.<aws-account-name>.<k8s-sanoma-domain>, with per-account Route53 zones delegated via NS records]
  51. CI/CD to separate cluster Similar flows • No more taints and tolerations • Similar authorization mechanism to all deploy targets • Possibly IAM • No need for Jenkins-SU • Clusters should be cattle anyway
  52. Pipelines GitOps • Manage namespaces via pipeline: • kube-system • monitor • Creation of application namespaces including RBAC • Helmfile
  53. System applications Small improvements • Prometheus-operator • PrometheusRule resource type • Default dashboards • EFS • https://github.com/previousnext/k8s-aws-efs • Current. Works well but not a lot of active development. • 2 contributors, 46 stars • https://github.com/kubernetes-incubator/external-storage • De facto EFS provisioner. 146 contributors, 1630 stars • Bonus: no more time-consuming initial volume set-up
  54. Expand Increase return on investment • Add more applications • Facilitate parallel testing & development workflows • Feature testing • Mobile app development • E2e tests
  55. Links Further reading • Scaling & spot instances: https://itnext.io/the-definitive-guide-to-running-ec2-spot-instances-as-kubernetes-worker-nodes-68ef2095e767 • EKS: https://medium.com/glia-tech/productionproofing-eks-ed52951ffd6c • QoS: https://www.replex.io/blog/everything-you-need-to-know-about-kubernetes-quality-of-service-qos-classes • Failure stories: https://k8s.af/
  56. Know your limits • Automate all the things • Everything code • Kubernetes is a journey, not a destination • All should be cattle. No pets allowed!
  57. ?