Kubernetes - Beyond the basics (vol 1.)

So you've got a good grasp of the Kubernetes basics: you can create a cluster and deploy some applications to it. What about taking things up a notch?

In this talk, with demos, we'll cover some more advanced topics within Kubernetes, such as:

Influencing the scheduling of pods
Controlling applications being scheduled using admission controllers
Auto scaling of applications and clusters
Options for extending/customising Kubernetes using Custom Resources
Adding a service mesh to improve traffic shaping

After this talk attendees should have a much clearer understanding of the additional capability in Kubernetes and associated platforms that they may want to use to improve their application platform.

Readers should have a good understanding of the basic Kubernetes concepts and constructs.

Shahid Iqbal

March 01, 2019


  1. Kubernetes: Going beyond the basics. Shahid Iqbal | Freelance consultant | @shahiddev
  2. Very brief intro
    • Freelance hands-on consultant working on .NET, Azure & Kubernetes
    • .NET developer/architect for over a decade & Microsoft MVP
    • Based in the UK and working globally
    • Co-organiser of the MK.net meetup in the UK
    • @shahiddev on Twitter
    • https://www.linkedin.com/in/shahiddev/
    • https://blog.headforcloud.com
    • https://sessionize.com/shahid-iqbal
  3. Agenda
    • Cover more detailed concepts within Kubernetes
    • Scheduling
    • Admission controllers
    • Options for extending K8s
    • Scaling: virtual node, KEDA
    • Demos!
  4. Not covering
    • Fundamentals of Kubernetes
    • Deep dive into creating custom controllers/operators
    • Disclaimer: the definition of “advanced” topics is very subjective
  5. Audience participation
  6. Pod scheduling
  7. Control plane node: etcd, API server, scheduler, controller manager, cloud controller manager
  8. Scheduling pods
    • Create a pod
    • Scheduler detects no node assigned
    • Assigns a node
  9. Why influence pod scheduling/placement?
    • Heterogeneous cluster with specialised hardware/software
    • Allocate to teams/multi-tenancy
    • Regulatory requirements
    • Application architecture requires components to be co-located or separated
  10. Approaches to influencing pod scheduling: node selector, node affinity/anti-affinity, node taints/tolerations, pod affinity/anti-affinity, custom scheduler
  11. Node selector: add nodeSelector in our PodSpec
  12. None
  13. Node selector: if you need a custom node label, add a custom key-value pair label to the node
  14. Node selector issues
    • Basic matching on exact key-value pair
    • Pods will not be scheduled (they stay Pending) if no node is found with a matching label
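
The node selector slides can be sketched as a small manifest builder. This is a minimal sketch, not the manifest from the talk: the pod name `web`, the image `nginx` and the label `disktype=ssd` are assumptions chosen for illustration.

```python
import json


def pod_with_node_selector(name, image, selector):
    """Return a Pod manifest (as a dict) that only fits nodes carrying every label in `selector`."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            # Exact key-value match only; no operators are available here.
            "nodeSelector": dict(selector),
        },
    }


# Hypothetical label: the node must already carry disktype=ssd.
pod = pod_with_node_selector("web", "nginx", {"disktype": "ssd"})
print(json.dumps(pod, indent=2))
```

The matching node label would be added with `kubectl label nodes <node-name> disktype=ssd`, and the printed JSON can be fed to `kubectl apply -f -` since kubectl accepts JSON as well as YAML.
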
  15. Node affinity/anti-affinity
    • Allows pods to decide which node to use based on labels
    • Match on conditions rather than exactly with a key-value pair: “In”, “NotIn”, “Exists”, “DoesNotExist”, “Gt”, “Lt”
    • Can be selective on how “demanding” you are
  16. Specifying demand for node
    • requiredDuringSchedulingIgnoredDuringExecution: hard requirement → Pod not scheduled if no node matches
    • preferredDuringSchedulingIgnoredDuringExecution: soft requirement → Pod still scheduled
    • requiredDuringSchedulingRequiredDuringExecution: not implemented yet
  17. Node affinity
  18. Node affinity
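
The node affinity slides are image-only in this transcript, so here is a hedged sketch of what the hard and soft forms look like in a pod spec. The helper names `match_expr` and `node_affinity`, the `disktype` label and the weight of 50 are assumptions for illustration; `kubernetes.io/os` is a well-known node label.

```python
import json


def match_expr(key, operator, values=None):
    """Build one matchExpressions entry; Exists/DoesNotExist take no values."""
    expression = {"key": key, "operator": operator}
    if values is not None:
        expression["values"] = [str(v) for v in values]
    return expression


def node_affinity(required=(), preferred=()):
    """`required` is a list of expressions; `preferred` is a list of (weight, expression) pairs."""
    rules = {}
    if required:
        # Hard requirement: the pod stays unscheduled if no node matches.
        rules["requiredDuringSchedulingIgnoredDuringExecution"] = {
            "nodeSelectorTerms": [{"matchExpressions": list(required)}]
        }
    if preferred:
        # Soft requirement: weights (1-100) only influence node ranking.
        rules["preferredDuringSchedulingIgnoredDuringExecution"] = [
            {"weight": weight, "preference": {"matchExpressions": [expression]}}
            for weight, expression in preferred
        ]
    return {"nodeAffinity": rules}


affinity = node_affinity(
    required=[match_expr("kubernetes.io/os", "In", ["linux"])],
    preferred=[(50, match_expr("disktype", "In", ["ssd"]))],
)
print(json.dumps({"affinity": affinity}, indent=2))
```

The printed fragment belongs under `spec` of a pod (or a pod template in a deployment).
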

  19. Node selectors/affinity issues
    • If we want to prevent certain nodes from being used, we cannot do this easily
    • We would need to ensure EVERY deployment had a node anti-affinity
  20. Taints & tolerations
    • Allows nodes to repel pods based on taint
    • Nodes are tainted and Pods tolerate taints
    • Taints comprise a key, a value and an effect
    • Effects: NoSchedule, PreferNoSchedule, NoExecute
  21. Taints & tolerations
    • Existing running pods can be evicted if a node is tainted and a pod doesn’t tolerate it
    • K8s adds taints to nodes in certain circumstances (node problems)
  22. Taints & tolerations: taint node
  23. Taints & tolerations: schedule without toleration
  24. Taints & tolerations: add toleration to pod spec
  25. Taints & tolerations: schedule with toleration
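
The four demo steps above (taint node, schedule without toleration, add toleration, schedule with toleration) can be modelled in a few lines. This is a simplified sketch of the matching rules, not the scheduler's actual code; the `gpu=true:NoSchedule` taint is a hypothetical example.

```python
def tolerates(toleration, taint):
    """Simplified check of whether one toleration matches one taint."""
    # A toleration with no effect matches taints of every effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key", "")
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with an empty key tolerates every taint.
        return key == "" or key == taint["key"]
    # Equal (the default operator): key and value must both match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def schedulable(pod_tolerations, node_taints):
    """A node is usable only if every NoSchedule/NoExecute taint is tolerated."""
    return all(
        any(tolerates(toleration, taint) for toleration in pod_tolerations)
        for taint in node_taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
    )


# Equivalent of: kubectl taint nodes <node-name> gpu=true:NoSchedule
gpu_taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}
# Entry for the pod spec's `tolerations` list.
gpu_toleration = {"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}

print(schedulable([], [gpu_taint]))                # False
print(schedulable([gpu_toleration], [gpu_taint]))  # True
```

PreferNoSchedule taints are deliberately ignored by `schedulable`, since they only make a node less attractive rather than ruling it out.
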

  26. Taints vs node affinity
    • With node affinity any pod could be scheduled unless the pod explicitly sets node anti-affinity; the node has no say in the matter
    • Taints allow nodes to repel all pods unless they tolerate the taint, including currently running pods!
    • Use taints to prevent “casual scheduling” on certain nodes, e.g. where you have a limited/expensive resource
  27. Inter-pod affinity/anti-affinity
    • Select nodes used for pod based on what other pods are running on it
    • Ensure certain components run on same node, e.g. cache alongside app
    • Ensure certain components don’t run in same zone, e.g. ensure app components can tolerate node loss
  28. Inter-pod affinity/anti-affinity
    • Same constructs for indicating strictness: requiredDuringSchedulingIgnoredDuringExecution, preferredDuringSchedulingIgnoredDuringExecution
    • topologyKey references a node label
    • The “level” of infrastructure that is used to apply the rules, e.g. hostname or failure domain or availability zone
  29. TopologyKey
    • Label key kubernetes.io/hostname, label values: Node-1, Node-2, Node-3, Node-4
    • Label key failure-domain.beta.kubernetes.io/zone, label values: 1, 1, 2, 2
  30. Inter-pod affinity/anti-affinity

  31. Pod affinity (diagram of Node 1 to Node 4). web: Replicas: 4, PodAffinity: cache (preferred), PodAntiAffinity: web, topologyKey: kubernetes.io/hostname
  32. Pod affinity (diagram, continued): four web pods placed. cache: PodAffinity: web (preferred), PodAntiAffinity: cache, topologyKey: kubernetes.io/hostname
  33. Pod affinity (diagram, continued): cache pods placed
  34. Pod affinity, zone topology (diagram of Zone 1 with Node 1 and Node 2, Zone 2 with Node 3 and Node 4). web: Replicas: 2, PodAffinity: cache (preferred), PodAntiAffinity: web, topologyKey: zone
  35. Pod affinity, zone topology (diagram, continued): two web pods placed. cache: PodAffinity: web (preferred), PodAntiAffinity: cache, topologyKey: zone
  36. Pod affinity, zone topology (diagram, continued): cache pods placed
  37. Web front end
  38. Cache
  39. Pod distribution
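
The web front end rules from the pod affinity slides (prefer nodes running cache, avoid nodes already running web, hostname topology) could be expressed roughly as below. This is a sketch under assumptions: the `app` label key, the weight of 100, and treating the anti-affinity as a hard rule are not stated in the slides.

```python
import json

HOSTNAME = "kubernetes.io/hostname"


def app_selector(app):
    """Label selector matching pods labelled app=<app> (the `app` key is an assumption)."""
    return {"matchExpressions": [{"key": "app", "operator": "In", "values": [app]}]}


web_affinity = {
    # Soft rule: prefer nodes that already run a cache pod.
    "podAffinity": {
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {
                "weight": 100,
                "podAffinityTerm": {
                    "labelSelector": app_selector("cache"),
                    "topologyKey": HOSTNAME,
                },
            }
        ]
    },
    # Hard rule (assumed): never place two web pods in the same topology domain.
    "podAntiAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {"labelSelector": app_selector("web"), "topologyKey": HOSTNAME}
        ]
    },
}
print(json.dumps({"affinity": web_affinity}, indent=2))
```

Swapping `HOSTNAME` for a zone label key spreads the pods per zone instead of per node, which is the difference between the two sets of diagrams.
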

  40. Custom scheduler
    • Scheduler written in any language
    • Needs access to API server
    • To use, define the scheduler name in the pod spec
  41. Controlling/Extending K8s
  42. Taking more control…
    • Want to have more control over resources that are created
    • Apply custom policies to resources (e.g. must have certain labels)
    • Prevent certain resources being created
    • Inject additional logic transparently into resources
  43. Admission controllers
    • Code that intercepts API server requests before they are persisted
    • Controllers can be validating (can inspect the objects but not modify), mutating (can modify the objects), or both
    • Enabled/disabled using kube-apiserver
    • Limited options on managed K8s providers
    • Are compiled into the API server binary
  44. Admission controllers: DefaultTolerationSeconds, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, ResourceQuota, Priority, NamespaceLifecycle, LimitRanger, ServiceAccount, PersistentVolumeClaimResize, DefaultStorageClass
  45. API request lifecycle: HTTP handler → AuthN/AuthZ → mutating admission controllers (which call mutating admission webhooks) → object schema validation → validating admission controllers (which call validating admission webhooks) → persistence (etcd). Adapted from: https://banzaicloud.com/blog/k8s-admission-webhooks/
  46. Admission webhooks
    • Implemented by two “special” admission controllers
    • MutatingAdmissionWebhook: modifies resources/creates new resources
    • ValidatingAdmissionWebhook: use to block resource creation
    • Controllers invoke HTTP callback
    • Logic doesn’t need to be compiled into API server
    • Logic can be hosted inside/outside the cluster
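
The decision logic behind a validating webhook can be sketched as a pure function from an AdmissionReview request to an AdmissionReview response. This is an illustrative sketch, not the demo from the talk: the required `team` label is a hypothetical policy, and a real webhook would also need an HTTPS server and a ValidatingWebhookConfiguration pointing at it.

```python
import json

# Hypothetical policy: every object must carry these labels.
REQUIRED_LABELS = ("team",)


def review(admission_review):
    """Return the AdmissionReview response for a 'must have certain labels' policy."""
    request = admission_review["request"]
    metadata = (request.get("object") or {}).get("metadata", {})
    labels = metadata.get("labels") or {}
    missing = [label for label in REQUIRED_LABELS if label not in labels]

    # The response must echo the uid of the request it answers.
    response = {"uid": request["uid"], "allowed": not missing}
    if missing:
        response["status"] = {"message": "missing required labels: " + ", ".join(missing)}
    return {
        "apiVersion": admission_review.get("apiVersion", "admission.k8s.io/v1"),
        "kind": "AdmissionReview",
        "response": response,
    }


sample = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {"uid": "abc-123", "object": {"metadata": {"name": "web", "labels": {}}}},
}
print(json.dumps(review(sample), indent=2))
```

Keeping the policy in a plain function makes it testable without a cluster; the HTTP layer only has to decode the request body and encode the result.
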

  48. Open Policy Agent (OPA)
    • Admission controllers let you tightly control what can run in your cluster
    • The OPA framework uses admission control but abstracts the lower-level details
    • https://www.openpolicyagent.org
  49. Extending Kubernetes API
    • Build abstractions on top of K8s resources
    • Create entirely new resources within K8s
    • Use kubectl to manage custom resources
  50. Extending Kubernetes API options: extension API servers, custom resource definitions, custom controllers
  51. Custom Resource Definitions (CRDs)
    • A new resource type alongside the built-in types
    • Can use kubectl to create and delete
    • Stored in etcd
    • Useless without controller to act on resource
  52. Custom Resource Definition
  53. Creating a Foo resource
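
The CRD and Foo slides are image-only here, so the following is a hedged sketch of the two objects involved rather than the manifests shown in the talk. The group `example.com`, the `replicas` field and the resource name `my-foo` are assumptions; the sketch uses the `apiextensions.k8s.io/v1` API, whereas a deck of this age may well have used `v1beta1`.

```python
import json

GROUP = "example.com"  # hypothetical API group

# The definition: teaches the API server about a new "Foo" kind.
foo_crd = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    # The name must be <plural>.<group>.
    "metadata": {"name": "foos." + GROUP},
    "spec": {
        "group": GROUP,
        "scope": "Namespaced",
        "names": {"plural": "foos", "singular": "foo", "kind": "Foo"},
        "versions": [
            {
                "name": "v1",
                "served": True,
                "storage": True,
                "schema": {
                    "openAPIV3Schema": {
                        "type": "object",
                        "properties": {
                            "spec": {
                                "type": "object",
                                "properties": {"replicas": {"type": "integer"}},
                            }
                        },
                    }
                },
            }
        ],
    },
}

# An instance: created once the definition above has been applied.
my_foo = {
    "apiVersion": GROUP + "/v1",
    "kind": "Foo",
    "metadata": {"name": "my-foo"},
    "spec": {"replicas": 2},
}

print(json.dumps(foo_crd, indent=2))
print(json.dumps(my_foo, indent=2))
```

After both are applied, `kubectl get foos` lists the resource, but nothing happens to it until a controller watches for Foo objects and acts on them.
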

  54. None
  55. Custom controllers
    • Can be used to customise behaviour of existing resources
    • Often paired with CRDs to add behaviour to custom resources
    • Often implemented in Go
    • Operator ~= CRDs + custom controllers
  56. Well known operators: https://github.com/operator-framework/awesome-operators
  57. Writing your own operator? https://github.com/operator-framework
  58. Scaling application & clusters
  59. Autoscaling
    • Horizontal Pod Autoscaler (HPA): scale number of pods based on metrics; v2 HPA can use external metrics
    • Vertical Pod Autoscaler (VPA): increase the resources for a given pod based on metrics (scale up)
    • Cluster Autoscaler (CA): scale cluster if pods are waiting to be scheduled; relies on cloud provider to increase node count
    • Virtual kubelet/node: OSS project to connect external compute resource to K8s cluster; interact with resource via familiar K8s API
  60. Auto scaling triggers
    • Horizontal scaling can be based on metrics from pod
    • V1 HPA uses CPU/Memory
    • V2 HPA (beta) can scale from almost any metric including external metrics (e.g. queue depth)
    • VPA: CPU/Memory usage of pod
    • Cluster autoscaler: based on pods waiting to be scheduled due to insufficient cluster resources
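
To make the horizontal scaling trigger concrete, the replica calculation documented for the HPA is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below applies that formula and clamps it to the configured bounds; it leaves out the tolerance band and stabilisation behaviour of the real controller, and the default bounds are assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA calculation: scale in proportion to metric/target, then clamp."""
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    return max(min_replicas, min(max_replicas, raw))


# 4 pods averaging 200m CPU against a 100m target: double the pods.
print(desired_replicas(4, 200, 100))   # 8
# 4 pods at half the target: halve the pods.
print(desired_replicas(4, 50, 100))    # 2
# A large spike is capped at max_replicas.
print(desired_replicas(4, 1000, 100))  # 10
```

The same arithmetic applies to an external metric such as queue depth: only the source of `current_metric` changes.
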
  61. Scale to zero
    • Out of the box, Kubernetes is unable to auto-scale pods to zero instances*
    • Desirable to scale certain microservices to zero instances: message handlers, “functions” style applications
    • * K8s 1.15 adds support for this via feature gate
  62. KEDA: Kubernetes Event Driven Autoscaler
    • Open source project led by Microsoft and Red Hat
    • Allows Kubernetes deployments to be auto scaled based on events
    • Scale up from zero -> n instances
    • Scale down from n -> zero instances
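
As a rough illustration of how a deployment is handed to KEDA, the sketch below builds a ScaledObject with a zero lower bound. Treat it as an assumption-laden sketch: the names `orders-consumer` and `orders` are hypothetical, the `keda.sh/v1alpha1` API group reflects later KEDA releases rather than the version current when this talk was given, and trigger metadata keys differ between scalers and releases, so they should be checked against the documentation for the installed version.

```python
import json

scaled_object = {
    "apiVersion": "keda.sh/v1alpha1",
    "kind": "ScaledObject",
    "metadata": {"name": "orders-scaler"},
    "spec": {
        # The deployment KEDA scales on our behalf (hypothetical name).
        "scaleTargetRef": {"name": "orders-consumer"},
        # Zero lower bound: no pods run while there are no events.
        "minReplicaCount": 0,
        "maxReplicaCount": 10,
        "triggers": [
            {
                "type": "rabbitmq",
                # Assumed keys for the RabbitMQ scaler; verify for your KEDA version.
                "metadata": {
                    "queueName": "orders",
                    "mode": "QueueLength",
                    "value": "5",
                    "hostFromEnv": "RABBITMQ_HOST",
                },
            }
        ],
    },
}
print(json.dumps(scaled_object, indent=2))
```
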
  63. How KEDA works
  64. KEDA scalers/event sources
    • AWS CloudWatch
    • AWS Simple Queue Service
    • Azure Event Hub†
    • Azure Service Bus Queues and Topics
    • Azure Storage Queues
    • GCP PubSub
    • Kafka
    • Liiklus
    • Prometheus
    • RabbitMQ
    • Redis Lists
    • Others in development
  65. Virtual Kubelet/Node
  66. Virtual Kubelet implementations: Azure Container Instances, AWS Fargate, HashiCorp Nomad, Service Fabric Mesh, Azure IoT Edge, …others
  67. Azure Container Instances
    • “Serverless” containers
    • No infrastructure required
    • Per-second billing for running containers
    • Good for: testing images, short-lived containers, bursting for sudden spikes
  68. Bursting load using virtual node: bursting to ACI to continue scaling beyond cluster capacity
  69. Virtual nodes option in AKS
  71. Virtual node recap
    • Virtual node was tainted to prevent pods being scheduled “accidentally”
    • The e-commerce shop deployment to burst was configured with:
      • Toleration for the virtual node taint: now allows pods to be scheduled on the virtual node
      • Node anti-affinity to the virtual node (soft): prevents usage of the virtual node unless there is no other choice
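
The two settings in the recap combine into one pod spec fragment, sketched below. The taint key `virtual-kubelet.io/provider` and the node label `type=virtual-kubelet` are assumptions based on how AKS virtual nodes are commonly labelled; check the actual taints and labels with `kubectl describe node` on the virtual node.

```python
import json

burst_scheduling = {
    # Toleration: lets the pod land on the tainted virtual node at all.
    "tolerations": [
        {"key": "virtual-kubelet.io/provider", "operator": "Exists"},
    ],
    # Soft node anti-affinity: prefer any node that is NOT the virtual node,
    # so the virtual node is only used when regular nodes are full.
    "affinity": {
        "nodeAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": 1,
                    "preference": {
                        "matchExpressions": [
                            {"key": "type", "operator": "NotIn", "values": ["virtual-kubelet"]}
                        ]
                    },
                }
            ]
        }
    },
}
print(json.dumps(burst_scheduling, indent=2))
```

Because the anti-affinity is only preferred, pods still fall through to the virtual node once the regular nodes run out of capacity, which is what makes the burst possible.
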
  72. Wrapping it up
    • Many powerful constructs available in Kubernetes to control pod scheduling
    • Admission webhooks allow customisation of resources with minimal code
    • Custom resources with controllers give you ultimate extensibility
    • Virtual node may allow for “serverless” K8s clusters in the future
  73. Where can I go to learn more?
    • http://www.katacoda.com
    • https://www.katacoda.com/openshift/courses/operatorframework
    • https://github.com/Azure-Samples/virtual-node-autoscale
    • http://bit.ly/k8s-microservices-video
  74. Thank you! Questions? Shahid Iqbal | Freelance consultant | @shahiddev on Twitter | https://www.linkedin.com/in/shahiddev/ | https://blog.headforcloud.com