Kubernetes - Beyond the basics (vol 1.)

So you've got a good grasp of the Kubernetes basics: you can create a cluster and deploy some applications to it. What about taking things up a notch?

In this talk, with demos, we'll cover some more advanced topics within Kubernetes, such as:

Influencing the scheduling of pods
Controlling applications being scheduled using admission controllers
Auto scaling of applications and clusters
Options for extending/customising Kubernetes using Custom Resources
Adding a service mesh to improve traffic shaping

After this talk attendees should have a much clearer understanding of the additional capabilities in Kubernetes and associated platforms that they may want to use to improve their application platform.

Attendees should have a good understanding of basic Kubernetes concepts and constructs.

Shahid Iqbal

March 01, 2019

Transcript

  1. @shahiddev
    Shahid Iqbal | Freelance consultant
    @shahiddev
    Kubernetes
    Going beyond the basics

  2. @shahiddev
    Very brief intro
    Freelance hands-on consultant working on .NET, Azure & Kubernetes
    .NET developer/Architect for over a decade & Microsoft MVP
    Based in the UK and working globally
    Co-organiser of the MK.net meetup in the UK
    @shahiddev on Twitter
    https://www.linkedin.com/in/shahiddev/
    https://blog.headforcloud.com
    https://sessionize.com/shahid-iqbal

  3. @shahiddev
    Agenda
    Cover more detailed concepts within Kubernetes
    Scheduling
    Admission controllers
    Options for extending K8s
    Scaling
    - Virtual node
    - KEDA
    Demos!

  4. @shahiddev
    Not covering
    Fundamentals of Kubernetes
    Deep dive into creating custom controllers/operators
    Disclaimer: the definition of “advanced” topics is very subjective

  5. @shahiddev
    Audience participation

  6. @shahiddev
    Pod scheduling

  7. @shahiddev
    Control plane node
    etcd
    API server
    Scheduler
    Controller manager
    Cloud controller manager

  8. @shahiddev
    Scheduling pods
    Create a pod → scheduler detects no node assigned → scheduler assigns a node

  9. @shahiddev
    Why influence pod scheduling/placement?
    Heterogeneous cluster with specialised hardware/software
    Allocate to teams/multi-tenancy
    Regulatory requirements
    Application architecture requires components to be co-located or separated

  10. @shahiddev
    Approaches to influencing pod scheduling
    Node selector
    Node affinity/anti-affinity
    Node taints/tolerations
    Pod affinity/anti-affinity
    Custom scheduler

  11. @shahiddev
    Node selector
    Add nodeSelector in our PodSpec
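    For example, a minimal pod spec with a nodeSelector might look like this (the disktype label is illustrative, not from the deck):
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
      nodeSelector:
        disktype: ssd   # pod is only scheduled onto nodes carrying this exact label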

  12. @shahiddev
    Node selector
    If you need to add a custom node label:
    Add a custom key-value pair label to the node
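    As a sketch, labelling a node and verifying it (node and label names are illustrative):
    # add a custom key-value label to a node
    kubectl label nodes node-1 disktype=ssd
    # confirm the label is present
    kubectl get nodes --show-labels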

  13. @shahiddev
    Node selector issues
    Basic matching on an exact key-value pair
    Pods remain unscheduled (Pending) if no node with a matching label is found

  14. @shahiddev
    Node affinity/anti-affinity
    Allows pods to decide which node to use based on labels
    Match on conditions rather than exactly with a key-value pair.
    “In”, “NotIn”, “Exists”, “DoesNotExist”, “Gt”, “Lt”
    You can choose how “demanding” the rule is

  15. @shahiddev
    Specifying demand for node
    requiredDuringSchedulingIgnoredDuringExecution
    Hard requirement → pod is not scheduled if no node matches
    preferredDuringSchedulingIgnoredDuringExecution
    Soft requirement → pod is still scheduled even if no node matches
    requiredDuringSchedulingRequiredDuringExecution
    Not implemented yet
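    A minimal sketch combining a hard and a soft rule in one pod spec (label keys/values are illustrative):
    apiVersion: v1
    kind: Pod
    metadata:
      name: with-node-affinity
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:    # hard: pod stays Pending if nothing matches
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values: ["linux"]
          preferredDuringSchedulingIgnoredDuringExecution:   # soft: preferred, but pod schedules anyway
          - weight: 1
            preference:
              matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd"]
      containers:
      - name: nginx
        image: nginx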

  16. @shahiddev
    Node affinity

  17. @shahiddev
    Node affinity

  18. @shahiddev
    Node selectors/affinity issues
    If we want to prevent certain nodes from being used, we cannot do this easily.
    We would need to ensure EVERY deployment had a node anti-affinity.

  19. @shahiddev
    Taints & Tolerations
    Allows nodes to repel pods based on taint
    Nodes are tainted and Pods tolerate taints
    A taint comprises a key, a value and an effect:
    NoSchedule, PreferNoSchedule, NoExecute

  20. @shahiddev
    Taints and tolerations
    Existing running pods can be evicted if a node is tainted and a pod doesn’t tolerate it (NoExecute)
    K8s adds taints to nodes in certain circumstances (e.g. node problems)

  21. @shahiddev
    Taints & tolerations
    Taint node
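    For example (key, value and node name are illustrative):
    # repel pods that don't tolerate gpu=true
    kubectl taint nodes node-1 gpu=true:NoSchedule
    # the same command with a trailing '-' removes the taint
    kubectl taint nodes node-1 gpu=true:NoSchedule-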

  22. @shahiddev
    Taints & tolerations
    Schedule without toleration

  23. @shahiddev
    Taints & tolerations
    Add toleration to pod spec
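    A sketch of the matching toleration, reusing the illustrative gpu=true taint from above (pod spec fragment):
    spec:
      tolerations:
      - key: "gpu"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"   # tolerates gpu=true:NoSchedule, so the tainted node is allowed again
      containers:
      - name: gpu-app
        image: gpu-app:latest  # illustrative image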

  24. @shahiddev
    Taints & tolerations
    Schedule with toleration

  25. @shahiddev
    Taints vs Node affinity
    With node affinity any pod could be scheduled unless the pod explicitly sets node anti-affinity – the node has no say in the matter
    Taints allow nodes to repel all pods unless they tolerate the taint – including currently running pods!
    Use taints to prevent “casual scheduling” on certain nodes
    E.g. where you have a limited/expensive resource

  26. @shahiddev
    Inter-pod affinity/anti-affinity
    Select nodes for a pod based on what other pods are running on them
    Ensure certain components run on the same node
    e.g. cache alongside app
    Ensure certain components don’t run in the same zone
    e.g. ensure app components can tolerate node loss

  27. @shahiddev
    Inter-pod affinity/anti-affinity
    Same constructs for indicating strictness:
    requiredDuringSchedulingIgnoredDuringExecution
    preferredDuringSchedulingIgnoredDuringExecution
    topologyKey
    References a node label
    The “level” of infrastructure that is used to apply the rules
    E.g. hostname, failure domain or availability zone
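    A sketch matching the web/cache example on the next slides (labels are illustrative); pod spec fragment for the web deployment:
    spec:
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: cache              # prefer nodes already running a cache pod
              topologyKey: kubernetes.io/hostname
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: web                  # never put two web pods on the same node
            topologyKey: kubernetes.io/hostname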

  28. @shahiddev
    topologyKey
    Label key: kubernetes.io/hostname → label value per node: Node-1, Node-2, Node-3, Node-4
    Label key: failure-domain.beta.kubernetes.io/zone → label value per node: 1, 1, 2, 2

  29. @shahiddev
    Inter-pod affinity/anti-affinity

  30. @shahiddev
    Pod affinity
    [Diagram: Nodes 1–4; a cache pod runs on Node 1; a web deployment (replicas: 4) declares podAffinity: cache (preferred), podAntiAffinity: web, topologyKey: kubernetes.io/hostname]

  31. @shahiddev
    Pod affinity
    [Diagram: the four web replicas spread one per node across Nodes 1–4 because of podAntiAffinity: web; the cache deployment declares podAffinity: web (preferred), podAntiAffinity: cache, topologyKey: kubernetes.io/hostname]

  32. @shahiddev
    Pod affinity
    [Diagram: the cache replicas land one per node alongside the web pods, satisfying podAffinity: web (preferred) and podAntiAffinity: cache at the hostname level]

  33. @shahiddev
    Pod affinity – Zone topology
    [Diagram: Zone 1 holds Nodes 1–2, Zone 2 holds Nodes 3–4; a cache pod runs on Node 1; a web deployment (replicas: 2) declares podAffinity: cache (preferred), podAntiAffinity: web, topologyKey: zone]

  34. @shahiddev
    Pod affinity – Zone topology
    [Diagram: the two web replicas land one per zone because of podAntiAffinity: web at the zone level; the cache deployment declares podAffinity: web (preferred), podAntiAffinity: cache, topologyKey: zone]

  35. @shahiddev
    Pod affinity – Zone topology
    [Diagram: the cache replicas land one per zone alongside the web pods, satisfying the zone-level affinity/anti-affinity rules]

  36. @shahiddev
    Web front end

  37. @shahiddev
    Cache

  38. @shahiddev
    Pod distribution

  39. @shahiddev
    Custom scheduler
    Scheduler written in any language
    Needs access to API server
    To use it, set the scheduler name in the pod spec
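    A sketch of opting in to a custom scheduler (my-scheduler is an illustrative name):
    apiVersion: v1
    kind: Pod
    metadata:
      name: custom-scheduled-pod
    spec:
      schedulerName: my-scheduler   # the default scheduler ignores this pod; my-scheduler binds it
      containers:
      - name: app
        image: nginx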

  40. @shahiddev
    Controlling/Extending K8s

  41. @shahiddev
    Taking more control…
    Want to have more control over resources that are created
    Apply custom policies to resources (e.g. must have certain labels)
    Prevent certain resources being created
    Inject additional logic transparently into resources

  42. @shahiddev
    Admission controllers
    Code that intercepts API server requests before they are persisted
    Controllers can be
    Validating – can inspect the objects but not modify
    Mutating – can modify the objects
    Both
    Enabled/disabled using kube-apiserver flags
    Limited options on managed K8s providers
    Compiled into the API server binary

  43. @shahiddev
    Admission controllers
    DefaultTolerationSeconds
    MutatingAdmissionWebhook
    ValidatingAdmissionWebhook
    ResourceQuota
    Priority
    NamespaceLifecycle
    LimitRanger
    ServiceAccount
    PersistentVolumeClaimResize
    DefaultStorageClass

  44. @shahiddev
    API request lifecycle
    HTTP handler → AuthN/AuthZ → mutating admission controllers (incl. mutating admission webhooks) → object schema validation → validating admission controllers (incl. validating admission webhooks) → persistence (etcd)
    Adapted from: https://banzaicloud.com/blog/k8s-admission-webhooks/

  45. @shahiddev
    Admission Webhooks
    Implemented by two “special” admission controllers
    MutatingAdmissionWebhook – modifies resources/creates new resources
    ValidatingAdmissionWebhook – use to block resource creation
    These controllers invoke an HTTP callback
    Logic doesn’t need to be compiled into the API server
    Logic can be hosted inside/outside the cluster
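    A hedged sketch of registering a validating webhook (names, namespace and path are illustrative; newer clusters use the admissionregistration.k8s.io/v1 API):
    apiVersion: admissionregistration.k8s.io/v1beta1
    kind: ValidatingWebhookConfiguration
    metadata:
      name: pod-policy.example.com
    webhooks:
    - name: pod-policy.example.com
      rules:                          # which requests get sent to the webhook
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
      clientConfig:
        service:                      # callback hosted in-cluster; a url works for external hosting
          namespace: webhook-ns
          name: webhook-svc
          path: "/validate"
        caBundle: "<base64 CA cert>"  # placeholder: CA that signed the webhook's serving cert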

  46. @shahiddev
    QUICK DEMO
    ADMISSION WEBHOOKS

  47. @shahiddev
    Open Policy Agent (OPA)
    Admission controllers let you tightly control what can run in your cluster.
    The OPA framework builds on admission control but abstracts away the lower-level details.
    https://www.openpolicyagent.org

  48. @shahiddev
    Extending Kubernetes API
    Build abstractions on top of K8s resources
    Create entirely new resources within K8s
    Use kubectl to manage custom resources

  49. @shahiddev
    Extending Kubernetes API options
    Extension API servers
    Custom resource definitions
    Custom controllers

  50. @shahiddev
    Custom Resource Definitions (CRDs)
    A new resource type alongside the built-in types
    Can use kubectl to create and delete
    Stored in etcd
    Useless without a controller to act on the resource
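    A minimal sketch of a CRD for the Foo type used on the next slides (group name is illustrative; newer clusters use apiextensions.k8s.io/v1):
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: foos.samplecontroller.example.com   # must be <plural>.<group>
    spec:
      group: samplecontroller.example.com
      version: v1alpha1
      scope: Namespaced
      names:
        plural: foos
        singular: foo
        kind: Foo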

  51. @shahiddev
    Custom Resource Definition

  52. @shahiddev
    Creating a Foo resource
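    A sketch of a matching Foo instance (the spec fields are whatever your controller understands; these are illustrative):
    apiVersion: samplecontroller.example.com/v1alpha1
    kind: Foo
    metadata:
      name: example-foo
    spec:
      deploymentName: example-foo
      replicas: 1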

  53. @shahiddev
    Custom controllers
    Can be used to customise the behaviour of existing resources
    Often paired with CRDs to add behaviour to custom resources
    Often implemented in Go
    Operator ≈ CRDs + custom controllers

  54. @shahiddev
    Well known operators
    https://github.com/operator-framework/awesome-operators

  55. @shahiddev
    Writing your own operator?
    https://github.com/operator-framework

  56. @shahiddev
    Scaling applications & clusters

  57. @shahiddev
    Autoscaling
    Horizontal Pod Autoscaler (HPA)
    Scale number of pods based on metrics
    v2 HPA – can use external metrics
    Vertical Pod Autoscaler (VPA)
    Increase the resources for a given pod based on metrics (scale up)
    Cluster Autoscaler (CA)
    Scale cluster if pods are waiting to be scheduled
    Relies on cloud provider to increase node count
    Virtual kubelet/node
    OSS project to connect external compute resources to a K8s cluster
    Interact with the resources via the familiar K8s API

  58. @shahiddev
    Auto scaling triggers
    Horizontal scaling can be based on metrics from the pod
    V1 HPA uses CPU/memory
    V2 HPA (beta) can scale on almost any metric, including external metrics (e.g. queue depth)
    VPA uses CPU/memory usage of the pod
    Cluster autoscaler is triggered by pods waiting to be scheduled due to insufficient cluster resources
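    As a sketch, a v2 HPA scaling a deployment on CPU (names and thresholds are illustrative):
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: web-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 60   # add pods when average CPU utilisation exceeds 60%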

  59. @shahiddev
    Scale to zero
    Out of the box, Kubernetes is unable to auto-scale pods to zero instances*
    Desirable to scale certain microservices to zero instances
    Message handlers
    “functions” style applications
    * K8s 1.15 adds support for this via feature gate

  60. @shahiddev
    KEDA – Kubernetes Event Driven Autoscaler
    Open source project led by Microsoft and Red Hat
    Allows Kubernetes deployments to be auto-scaled based on events
    Scale up from zero -> n instances
    Scale down from n -> zero instances
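    A hedged sketch of a KEDA ScaledObject (the API group and trigger metadata vary by KEDA version; names are illustrative):
    apiVersion: keda.k8s.io/v1alpha1    # later KEDA releases use keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: queue-consumer-scaler
    spec:
      scaleTargetRef:
        deploymentName: queue-consumer  # the deployment KEDA scales between 0 and n replicas
      triggers:
      - type: azure-queue
        metadata:
          queueName: orders                 # illustrative queue name
          connection: AzureWebJobsStorage   # env var/secret holding the connection string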

  61. @shahiddev
    How KEDA works

  62. @shahiddev
    KEDA scalers/event sources
    • AWS CloudWatch
    • AWS Simple Queue Service
    • Azure Event Hub†
    • Azure Service Bus Queues and Topics
    • Azure Storage Queues
    • GCP PubSub
    • Kafka
    • Liiklus
    • Prometheus
    • RabbitMQ
    • Redis Lists
    Others in development

  63. @shahiddev
    Virtual Kubelet/Node

  64. @shahiddev
    Virtual Kubelet implementations
    Azure Container Instances
    AWS Fargate
    Hashicorp Nomad
    Service Fabric Mesh
    Azure IoT Edge
    …others

  65. @shahiddev
    Azure Container Instances
    “Serverless” containers
    No infrastructure required
    Per-second billing for running containers
    Good for:
    Testing images
    Short lived containers
    Bursting for sudden spikes

  66. @shahiddev
    Bursting load using virtual node
    Bursting to ACI to continue scaling beyond cluster capacity

  67. @shahiddev
    Virtual nodes option in AKS

  68. @shahiddev
    DEMO
    KEDA
    VIRTUAL NODE SCALING

  69. @shahiddev
    Virtual node recap
    • The virtual node was tainted to prevent pods being scheduled “accidentally”
    • The e-commerce shop deployment to burst was configured with:
    • A toleration for the virtual node taint – now allows pods to be scheduled on the virtual node
    • Node anti-affinity to the virtual node (soft) – prevents usage of the virtual node unless there is no other choice
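    A hedged sketch of that configuration as a pod spec fragment (the taint key and node label are those used by AKS virtual nodes at the time; verify against your cluster):
    spec:
      tolerations:
      - key: virtual-kubelet.io/provider   # the taint carried by the virtual node
        operator: Exists
        effect: NoSchedule
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:   # soft: avoid the virtual node while real nodes have capacity
          - weight: 100
            preference:
              matchExpressions:
              - key: type
                operator: NotIn
                values: ["virtual-kubelet"]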

  70. @shahiddev
    Wrapping it up
    Many powerful constructs available in Kubernetes to control pod scheduling
    Admission webhooks allow customisation of resources with minimal code
    Custom resources with controllers give you ultimate extensibility
    Virtual node may allow for “serverless” k8s clusters in the future

  71. @shahiddev
    Where can I go to learn more?
    http://www.katacoda.com
    https://www.katacoda.com/openshift/courses/operatorframework
    https://github.com/Azure-Samples/virtual-node-autoscale
    http://bit.ly/k8s-microservices-video

  72. @shahiddev
    Shahid Iqbal | Freelance consultant
    @shahiddev
    Thank you!
    Questions?
    @shahiddev on Twitter
    https://www.linkedin.com/in/shahiddev/
    https://blog.headforcloud.com
