
Kubernetes 101


Patrick D'appollonio Vega

January 14, 2022


  1. Kubernetes 101 Starting from the (very, very) beginning

  2. Let's start: At the beginning, there was a big bang,

    a surprisingly hot and dense state... (oh no, not that early!)
  3. Send your questions! https://pigeonhole.at/KUBECTL Anonymous questions are always welcome!

  4. Who's this guy? Patrick D'appollonio New to Sourced, started ~6

    months ago! Chilean software developer turned DevOps connoisseur. Been doing Kubernetes stuff before with: Ubisoft (live games), DreamWorks (movie rendering), HPE (consulting).
  5. Kube-what?

  6. Kubernetes Ancient Greek for "helmsman" -- no kidding. For us

    it's a highly available Container Orchestrator. Created at Google, inspired by its internal cluster manager "Borg". Battle-tested at Google-level scale. It does "container magic"...
  7. In more depth... Kubernetes is a group of several Go

    applications revolving around containers and container orchestration. The core requirement is to have, at minimum, the following features: scheduling, deployment, scaling, load balancing, health monitoring, resource allocation, redundancy.
  8. ... but there's so much more! It's possible to extend

    Kubernetes using custom resources, to the point where you stop seeing it as a container orchestrator and start seeing it as a high-resilience environment. But that's a discussion for another time 😉
  9. Bits & pieces

  10. Control Plane A group of applications that run on the

    "control" side of Kubernetes: API Server, Kube Controller Manager, Cloud Controller Manager, Scheduler and etcd. etcd is the document database where all the Kubernetes data is stored. Written in Go, highly performant and RAM-hungry. It's a "cluster" on its own too, and as such it needs consensus.
  11. Worker Nodes They run your apps, and they can be

    grouped depending on their capabilities. You can have CPU intensive nodes, or RAM intensive nodes. You can also have GPU-powered nodes for machine learning, or even Windows-powered nodes for Windows workloads. The main ingredient? The Kubelet.
  12. (image-only slide)
  13. ... looks hella complicated!

  14. Splitting by piece: API Server The API Server is the

    brains of the operation. It coordinates all the efforts from all controllers. While controllers act, the API Server is the source of truth for them. We interact with the API server using kubectl. Kubectl is the tool that talks to the Kubernetes API to emit "orders/instructions."
  15. kubectl Another Go application. This one runs on your machine.

    It sends instructions to Kubernetes using common HTTP verbs: GET to list, POST to create, PUT/PATCH to update, DELETE to, well, delete. Available on all platforms.
  16. ... BTW, Let's start a flame war... Is it... kube-control

    koob-control koo-bay-ctl koob-cuttle koo-bay-cuddle koo-bectal koobec-tee-ell https://youtu.be/2wgAIvXpJqU
  17. $KUBECONFIG In order for kubectl to know who we are

    and what we can do, it needs our "credentials". Standard Kubernetes is very old-fashioned: it uses certificates and tokens to authenticate users. Kubernetes-to-Kubernetes components also use TLS for communication.
  18. Standard $KUBECONFIG:

    apiVersion: v1
    kind: Config

    clusters:
    - name: personal-cluster
      cluster:
        server: ""
        certificate-authority-data: "LS0tLS1CRUd..."

    users:
    - name: patrick-account
      user:
        client-certificate-data: "LS0tLS1CRUd..."
        client-key-data: "LS0tLS1E1Q..."

    contexts:
    - name: patrick-on-personal
      context:
        cluster: personal-cluster
        user: patrick-account

    current-context: patrick-on-personal
  19. Third-party identity $KUBECONFIG:

    apiVersion: v1
    kind: Config

    clusters:
    - ...

    users:
    - name: patrick-account
      user:
        auth-provider:
          name: oidc
          config:
            client-id: kubernetes
            client-secret: <token>
            id-token: <token>
            refresh-token: <refresh-token>

    contexts:
    - ...

    current-context: ...
  20. GCP, GKE, Azure $KUBECONFIG:

    apiVersion: v1
    kind: Config

    clusters:
    - ...

    users:
    - name: patrick-account
      user:
        auth-provider:
          name: gcp
          config:
            access-token: <access-token>
            cmd-path: gcloud
            cmd-args: config config-helper --format=json
            expiry: <exp-date>
            expiry-key: '{.credential.token_expiry}'
            token-key: '{.credential.access_token}'

    ...
  21. KUBECONFIGs can be mixed and matched Some people prefer a single KUBECONFIG

    with multiple entries. Others prefer to point the KUBECONFIG environment variable to their files and be more declarative. There's no right or wrong answer.
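As a sketch of the second approach (file names here are made up for illustration), the KUBECONFIG environment variable accepts a colon-separated list of files that kubectl merges at read time:

```shell
# Hypothetical file names: one kubeconfig file per cluster, merged at read time.
export KUBECONFIG="$HOME/.kube/personal.yaml:$HOME/.kube/work.yaml"
# kubectl now sees the clusters, users and contexts from both files.
```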
  22. Side note: all Kubernetes objects follow the same format -- including KUBECONFIG and Kubernetes LIST responses!

    apiVersion: example.com/v1
    kind: Foo
    metadata:
      name: my-example
      namespace: my-namespace # ?
      labels:
        tier: prod
        group: sales
      annotations:
        labels.foo.example.com: my-value
    spec:
      key1: value1
      keyN: valueN
  23. Role-based access control In order to authenticate and authorize, we

    use Kubernetes Service Accounts, tied to Roles via RoleBindings.
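As a sketch of what that tie looks like (the namespace, account and role names here are made up), a Role lists allowed verbs over resources, and a RoleBinding attaches it to a subject:

```yaml
# Hypothetical example: let the "deployer" service account read pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: my-namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: my-namespace
subjects:
- kind: ServiceAccount
  name: deployer
  namespace: my-namespace
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```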
  24. How do apps live inside Kubernetes?

  25. The Pod is the smallest deployment object in Kubernetes One

    Kubernetes pod can have multiple "containers" inside. These containers can have a specific behaviour: initialization containers, normal containers, sidecars. Pods are stateless unless they have a way to keep state attached to them.
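For instance, a pod with an init container that must finish before the normal container starts could look like this (the pod name, image and command here are made up for illustration):

```yaml
# Hypothetical example: the init container runs to completion
# before the "normal" container is started.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ["sh", "-c", "until nc -z db 5432; do sleep 1; done"]
  containers:
  - name: app
    image: patrickdappollonio/hello-docker
```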
  26. Example pod: this runs a single container inside the pod with the name "my-application":

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-application
    spec:
      containers:
      - name: container
        image: patrickdappollonio/hello-docker
  27. Example pod: this runs a single container inside the pod with the name "my-application":

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-application
    spec:
      containers:
      - name: container
        image: patrickdappollonio/hello-docker
  28. Pod Networking All containers inside a pod share the same

    network namespace: they reach each other over localhost, and they all share the pod's hostname and port space. Cluster-wide, routing is handled by a software-defined network (made with either "iptables" or "ipvs" or third-party software). Routing between the containers of a pod is "exclusive" and it can't be disrupted* However, one of the containers in a pod can "own" the network and force all traffic to go through it. (topic for another talk)
  29. Exposing pods as Services Once a pod is created, it

    only exists within its own network space -- unless other solutions like hostPort are used. To expose it, we need to register it at the cluster level using a Kubernetes Service. Without going into detail, creating a Service for a pod means you "tie" the Pod to the Service using Labels. The Service then gives you a unique FQDN: ${name}.${namespace}.svc.cluster.local
  30. Pro tip: lots of details about this in my follow-up

    blog post about Zero-Downtime deployments in Kubernetes! Sourced Confluence link: https://sourced.atlassian.net/l/c/8Chh4t1P
  31. (image-only slide)
  32. Services? Services are the way to expose Pods as applications

    to other parts of the cluster, whether they're in the same namespace or a different one. There are two everyday kinds of Services: ClusterIP and NodePort. And another two used for custom behaviour: ExternalName and LoadBalancer.
  33. ClusterIP The default. Exposes a service to the rest of

    the cluster, instead of just the Pod network. Very useful to avoid exposing "more than necessary". NodePort A kind of Service that should be used for debugging or troubleshooting. Lets you claim a port between 30000 and 32767 (by default; the range can be changed) and exposes the app on all nodes through that port. Kinds of Services
  34. A labeled Pod, exposed via a ClusterIP and a NodePort Service:

    # The Pod, labeled so Services can select it
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-application
      labels:
        app: my-application
    spec:
      containers:
      - name: container
        image: patrickdappollonio/hello-docker

    # ClusterIP Service
    apiVersion: v1
    kind: Service
    metadata:
      name: hello-svc
    spec:
      type: ClusterIP
      selector:
        app: my-application
      ports:
      - port: 8000

    # NodePort Service
    apiVersion: v1
    kind: Service
    metadata:
      name: hello-svc
    spec:
      type: NodePort
      selector:
        app: my-application
      ports:
      - port: 8000
        nodePort: 31234
  35. LoadBalancer This one is special. It's nothing more than a

    mixture of ClusterIP behaviour (exposed to the cluster) and NodePort (exposed on a random port). It uses the "Cloud Controller Manager" to provision an actual Load Balancer (either appliance or VM-based) and point the "members" of the Load Balancer either to the NodePort service or directly to the Pods. Locally, it's just a Service object that keeps track of its "members" and updates the cloud LB accordingly.
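A minimal sketch (the Service name is made up; the selector reuses the app: my-application label from the earlier examples):

```yaml
# Hypothetical example: the cloud provider provisions an external LB
# whose members point at this Service's pods.
apiVersion: v1
kind: Service
metadata:
  name: hello-lb
spec:
  type: LoadBalancer
  selector:
    app: my-application
  ports:
  - port: 80
    targetPort: 8000
```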
  36. (image-only slide)
  37. ExternalName We know the following: containers in a pod are exposed to

    the other containers of that pod; Services expose these pods as applications to the entire cluster. What if the application isn't running inside the cluster, but we want the fancy FQDN for it and/or want to make it look like it's internal to the cluster? ExternalName to the rescue!
  38. ExternalName example

    apiVersion: v1
    kind: Service
    metadata:
      name: sourced-bucket
      namespace: prod
    spec:
      type: ExternalName
      externalName: sourced-bucket.s3.us-west-2.amazonaws.com

    The code above exposes the Sourced bucket in S3 us-west-2 using a DNS CNAME redirection, as:
    sourced-bucket.prod.svc.cluster.local
    sourced-bucket.prod # cluster-level fqdn
    sourced-bucket # if in the same namespace
  39. Adding resiliency

  40. Rule of thumb Never put all your eggs in the

    same basket
  41. In any of the previous cases, having a single Pod

    is not good for our HA: if it goes down, our application goes down. For all stateless applications, Kubernetes offers: DEPLOYMENTS The fight for not having a single Pod
  42. Kubernetes Deployment Allows you to make multiple copies of your

    app, all running at the same time. With a Service, you can round-robin load-balance all of them through kube-proxy. Allows you to perform rollout releases with zero downtime (shoutout to the zero-downtime article again!). Allows a basic "template" of a Pod.
  43. The same Pod we saw before, but now as a Deployment with 3 replicas:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-application
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-application
      template:
        metadata:
          labels:
            app: my-application
        spec:
          containers:
          - name: container
            image: patrickdappollonio/hello-docker
  44. StatefulSets, DaemonSets, all-the-sets...

  45. StatefulSet On a Deployment, Pod names are randomized, and redeploying

    a pod might not put it back on the same node where it originally was. Multiple reasons might require you to either get predictable names or remember where your pods are deployed. StatefulSets are the solution. DaemonSet If you have 8 nodes and you launch 8 pods, chances are they will be randomly distributed across your 8 nodes, but you might end up with nodes with zero pods and nodes with more than one. DaemonSets let you run exactly one pod per node.
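A minimal DaemonSet sketch (the name and image are made up) -- note it looks like a Deployment template, but has no replicas field, since the number of nodes decides that:

```yaml
# Hypothetical example: run one logging agent pod on every node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-logger
spec:
  selector:
    matchLabels:
      app: node-logger
  template:
    metadata:
      labels:
        app: node-logger
    spec:
      containers:
      - name: logger
        image: example.org/logging/agent
```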
  46. Pro tip: StatefulSets are a great way to migrate old

    applications (running, say, in VMs) to Kubernetes and the cloud. It requires less cloud-readiness to launch an application as a StatefulSet than as a Deployment. There are several features in place to help Pods maintain their "state" (except in-memory state) as much as possible. Use this to your customer's advantage!
  47. Feeding configuration to our pods The case for ConfigMaps and Secrets

  48. ConfigMaps & Secrets Both ConfigMaps and Secrets work almost the

    same way: both allow you to mount their data either as "files" (volumes) or environment variables (whenever possible). Both hold non-binary data -- although binary data can be converted to a string via base64. The main difference here is: Secrets are encrypted... ... but ConfigMaps are not *
  49. ConfigMap vs. "Opaque" secret:

    # ConfigMap
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: game-config
    data:
      game.properties: |
        enemies=aliens
        lives=3
        enemies.cheat=true
        secret.code.lives=30
      backend-vs: "v1.3.0"

    # "Opaque" secret
    apiVersion: v1
    kind: Secret
    metadata:
      name: game-secrets
    type: Opaque
    stringData:
      db_password: "covfefe123"
    # data:
    #   db_password: Y292ZmVmZTEyMwo=
  50. ... wait "Opaque" secret? I can see it clearly!

  51. TL;DR There are multiple kinds of Secret types, like: "Opaque"

    (or normal) secrets, Service account tokens, Docker configs, Basic auth, SSH auth, TLS, Bootstrap tokens... but they're all "Opaque" secrets internally.
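For example, a TLS secret just declares a different type and a pair of well-known keys (placeholders below stand in for real base64-encoded data):

```yaml
# Hypothetical example: a TLS secret holds a cert/key pair
# under the fixed keys tls.crt and tls.key.
apiVersion: v1
kind: Secret
metadata:
  name: my-tls-cert
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded-cert>
  tls.key: <base64-encoded-key>
```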
  52. Some of these secret kinds are meant to be "handled"

    by Controllers (apps that extend the Kubernetes functionality). However, adoption has been lacking beyond Kubernetes' own standards. The goal is that secrets could be handled differently depending on their type.
  53. Feeding a ConfigMap value to a Pod as an environment variable:

    # The ConfigMap
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: game-config-env-file
    data:
      enemies: aliens

    # The Pod consuming it
    apiVersion: v1
    kind: Pod
    metadata:
      name: live-running-server
    spec:
      containers:
      - name: test-container
        image: example.org/gameco/shooter
        env:
        - name: ENEMY_TYPE
          valueFrom:
            configMapKeyRef:
              name: game-config-env-file
              key: enemies

    (you can also mount them as files, but it'll be left as an exercise for the reader 😉)
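For the impatient reader, a sketch of the file-mount variant (the mount path is made up): each key in the ConfigMap becomes a file under the mount path, with the value as its contents.

```yaml
# Hypothetical example: the key "enemies" from the ConfigMap
# becomes the file /etc/game/enemies containing "aliens".
apiVersion: v1
kind: Pod
metadata:
  name: live-running-server
spec:
  containers:
  - name: test-container
    image: example.org/gameco/shooter
    volumeMounts:
    - name: config
      mountPath: /etc/game
  volumes:
  - name: config
    configMap:
      name: game-config-env-file
```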
  54. Actual data storage Persistent Volume Claims, Persistent Volumes and Storage Classes

  55. Persistent Volume Claims (PVC) They're a way to "request block

    storage" from Kubernetes. They can be used to store data, logs, database backend information, etc. Think of it as any folder backed by a storage engine somewhere.
  56. Our game will save data in the /opt/shooter-data folder; the volume will be backed by the block storage mechanism configured in the cluster:

    # The claim
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-volume
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi

    # The Pod mounting it
    apiVersion: v1
    kind: Pod
    metadata:
      name: hello-docker
      labels:
        app: hello-docker-app
    spec:
      containers:
      - name: shooter
        image: example.org/gamingco/shooter
        volumeMounts:
        - name: game-state-storage
          mountPath: /opt/shooter-data
      volumes:
      - name: game-state-storage
        persistentVolumeClaim:
          claimName: my-volume
  57. Persistent Volumes (PV) Not to be confused with Persistent Volume Claims.

    Persistent Volumes (PVs) are the traditional way to mount a local-or-network classic volume for block storage. You can also host "cloud" block storage this way if there's a filesystem implementation for it (like FUSE). In plain English: imagine the cluster is in your house and you want to mount your USB drive as storage for your cluster. You'd configure it through a Persistent Volume.
  58. Persistent Volumes have been slowly declining in use in favour

    of Storage Classes (SC), mainly due to the latter's cloud-ready behaviour. In other words: can I mount my good old Toshiba external drive on 3 computers at once with the same USB cable? Probably not.
  59. StorageClass (SC) Storage Classes are the "level up" from Persistent

    Volumes (PV). They allow providers to sell bulk block (or sometimes non-block) storage through Kubernetes. Persistent Volumes often double-check that the amount of storage you're requesting is available at the destination; Storage Classes assume that, since it's cloud-based, the "capacity" is infinite (as long as you keep paying 😉). Finally, it's quite common for PVCs bound to an SC to "expand" (contrary to my Toshiba that's still at 2 TB). Disclaimer: some SCs might still check available capacity against "quotas".
  60. You can configure either one as the storage backend of a Persistent Volume Claim:

    # Backed by a StorageClass
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-volume
    spec:
      storageClassName: premium-rwo
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi

    # Backed by a specific PersistentVolume
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-volume
    spec:
      storageClassName: ""
      volumeName: "pv-toshiba"
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi

    PSA: depending on your config, your Persistent Volume (PV) might need a spec.claimRef field too.
  61. Cloud Kubernetes services (EKS, GKE, AKS) often come with preinstalled

    and preconfigured Storage Classes:

    $ kubectl get storageclasses
    NAME                 PROVISIONER             RECLAIMPOLICY  ...
    premium-rwo          pd.csi.storage.gke.io   Delete         ...
    standard (default)   kubernetes.io/gce-pd    Delete         ...
    standard-rwo         pd.csi.storage.gke.io   Delete         ...
  62. With this, now you can launch an entire end-to-end stack

    in Kubernetes that can be exposed to the world.
  63. For a future next episode: Jobs, CronJobs Controllers, Operators Custom

    Resource Definitions Ingress Controllers and Service Meshes ... and more!
  64. For now: questions?