Kubernetes 101

Kubernetes 101: Starting from the (very, very) beginning

Let's start:
At the beginning, there was a big bang, a surprisingly hot and dense... (oh, oh, not that early!)

Send your questions!
https://pigeonhole.at/KUBECTL
Anonymous questions are always welcome!

Who's this guy?
Patrick D'appollonio
- New to Sourced, started ~6 months ago!
- Chilean software developer turned DevOps connoisseur
- Been doing Kubernetes stuff before with:
  - Ubisoft (live games)
  - DreamWorks (movie rendering)
  - HPE (consulting)

Kube-what?

Kubernetes
- Ancient Greek for "helmsman" -- no kidding.
- For us, it's a highly available container orchestrator.
- Created at Google as a successor to its internal "Borg" system.
- Battle-tested at high scales, Google-level scale.
- It does "container magic"...

In more depth...
Kubernetes is a group of several Go applications, revolving around containers and container orchestration. The core requirement is to have, at minimum, the following features:
- Scheduling
- Deployment
- Scaling
- Load balancing
- Health monitoring
- Resource allocation
- Redundancy

... but there's so much more!
You can extend what Kubernetes does using custom resources, to the point that you stop looking at it as container orchestration and start looking at it as a high-resilience environment.
But that's a discussion for another time 😉

Bits & pieces

Control Plane
A group of applications that run on the "control" side of Kubernetes:
- API Server
- Kube Controller Manager
- Cloud Controller Manager
- Scheduler
- etcd
etcd is the key-value database where all the Kubernetes data is stored. Written in Go, highly performant and RAM hungry. It's a "cluster" on its own too, and as such needs consensus.

Worker Nodes
They run your apps, and they can be grouped depending on their capabilities. You can have CPU-intensive nodes, or RAM-intensive nodes. You can also have GPU-powered nodes for machine learning, or even Windows-powered nodes for Windows workloads.
The main ingredient? The Kubelet.

... looks hella complicated!

Splitting by piece: API Server
The API Server is the brains of the operation. It coordinates all the efforts from all controllers. While controllers act, the API Server is the source of truth for them.
We interact with the API Server using kubectl. Kubectl is the tool that talks to the Kubernetes API to emit "orders/instructions."

kubectl
Another Go application. This one runs on your machine. It sends instructions to Kubernetes using common HTTP verbs:
- GET to list
- POST to create
- PUT/PATCH to update
- DELETE to, well, delete
Available for all platforms.
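
The verb mapping on this slide can be sketched in a few lines of Python. This is only an illustration, not how kubectl is implemented: `request_for` is a hypothetical helper, and it only covers Pods under the core `/api/v1` path.

```python
def request_for(action, namespace, name=None):
    """Return the (HTTP method, URL path) a kubectl-like client could use for Pods.

    Hypothetical helper for illustration; real clients handle many more
    resource types, API groups and options.
    """
    methods = {
        "list": "GET",
        "create": "POST",
        "update": "PUT",
        "patch": "PATCH",
        "delete": "DELETE",
    }
    if action not in methods:
        raise ValueError(f"unsupported action: {action}")

    path = f"/api/v1/namespaces/{namespace}/pods"
    # "list" and "create" target the collection; the others target one named Pod.
    if action not in ("list", "create"):
        if name is None:
            raise ValueError(f"{action} needs a Pod name")
        path = f"{path}/{name}"
    return methods[action], path
```

For example, `request_for("delete", "default", "my-application")` yields a DELETE against the path of that single Pod, while `request_for("list", "default")` yields a GET against the collection.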

... BTW, let's start a flame war...
Is it...
- kube-control
- koob-control
- koo-bay-ctl
- koob-cuttle
- koo-bay-cuddle
- koo-bectal
- koobec-tee-ell
https://youtu.be/2wgAIvXpJqU

$KUBECONFIG
In order for kubectl to know who we are and what we can do, it needs our "credentials".
Standard Kubernetes is very old-fashioned: it uses certificates and tokens to authenticate users. Kubernetes components also use TLS to communicate with each other.

Standard $KUBECONFIG

    apiVersion: v1
    kind: Config

    clusters:
    - name: personal-cluster
      cluster:
        certificate-authority-data: "LS0tLS1CRUd..."
        server: "https://192.168.0.1:6443"

    users:
    - name: patrick-account
      user:
        client-certificate-data: "LS0tLS1CRUd..."
        client-key-data: "LS0tLS1E1Q..."

    contexts:
    - name: patrick-on-personal
      context:
        cluster: personal-cluster
        user: patrick-account

    current-context: patrick-on-personal
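
How a client follows `current-context` to a cluster and a user can be sketched as below. This is a minimal sketch, assuming the kubeconfig has already been parsed into a Python dict; `resolve_context` is a hypothetical helper and the dict only mirrors the names used on this slide.

```python
KUBECONFIG = {
    "current-context": "patrick-on-personal",
    "clusters": [
        {"name": "personal-cluster",
         "cluster": {"server": "https://192.168.0.1:6443"}},
    ],
    "users": [
        {"name": "patrick-account",
         "user": {"client-certificate-data": "LS0tLS1CRUd...",
                  "client-key-data": "LS0tLS1E1Q..."}},
    ],
    "contexts": [
        {"name": "patrick-on-personal",
         "context": {"cluster": "personal-cluster", "user": "patrick-account"}},
    ],
}


def resolve_context(kubeconfig, context_name=None):
    """Return the (cluster, user) entries that a context points to."""
    name = context_name or kubeconfig["current-context"]
    # Index every section by its "name" so the context can refer to entries by name.
    contexts = {c["name"]: c["context"] for c in kubeconfig["contexts"]}
    clusters = {c["name"]: c["cluster"] for c in kubeconfig["clusters"]}
    users = {u["name"]: u["user"] for u in kubeconfig["users"]}
    context = contexts[name]
    return clusters[context["cluster"]], users[context["user"]]
```

A context is only a named pair: it says "this user, on that cluster", which is why the same file can hold many clusters and many users.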

Third-party identity $KUBECONFIG

    apiVersion: v1
    kind: Config

    clusters:
    - ...

    users:
    - name: patrick-account
      user:
        auth-provider:
          config:
            client-id: kubernetes
            client-secret: <token>
            id-token: <token>
            refresh-token: <refresh-token>
          name: oidc

    contexts:
    - ...

    current-context: ...

GCP, GKE, Azure $KUBECONFIG

    apiVersion: v1
    kind: Config

    clusters:
    - ...

    users:
    - name: patrick-account
      user:
        auth-provider:
          config:
            access-token: <access-token>
            cmd-path: gcloud
            cmd-args: config config-helper --format=json
            expiry: <exp-date>
            expiry-key: '{.credential.token_expiry}'
            token-key: '{.credential.access_token}'
          name: gcp

    ...

KUBECONFIGs can be mixed and matched
Some people prefer a single KUBECONFIG with multiple entries. Others prefer to point the KUBECONFIG environment variable to their files and be more declarative.
There's no right or wrong.

Side note: all Kubernetes objects follow the same format

    apiVersion: example.com/v1
    kind: Foo
    metadata:
      name: my-example
      namespace: my-namespace # ?
      labels:
        tier: prod
        group: sales
      annotations:
        labels.foo.example.com: my-value
    spec: # ?
      key1: value1
      keyN: valueN

Including KUBECONFIG and Kubernetes LIST responses!
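
Because the shape is shared, generic tooling can treat every object the same way. A rough sketch, assuming the object is already a parsed dict; `describe` is a hypothetical helper, and it only requires `apiVersion` and `kind` since a KUBECONFIG carries no `metadata`.

```python
def describe(obj):
    """Summarize any Kubernetes-style object using only its common fields."""
    for field in ("apiVersion", "kind"):
        if field not in obj:
            raise ValueError(f"not a Kubernetes-style object, missing {field!r}")
    metadata = obj.get("metadata", {})
    # Namespace and name are optional here: cluster-scoped objects have no
    # namespace, and a kubeconfig has no metadata section at all.
    namespace = metadata.get("namespace", "-")
    name = metadata.get("name", "-")
    return f'{obj["apiVersion"]} {obj["kind"]} {namespace}/{name}'
```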

Role-based access control
In order to authenticate and authorize, we use Kubernetes Service Accounts, tied to Roles through Role Bindings.
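
The relationship between the three pieces can be sketched as follows. This is a toy model, not the real authorizer: the role name, the service account and `is_allowed` are all made up for illustration, and real rules also carry API groups, namespaces and resource names.

```python
# A Role is a list of rules: which verbs are allowed on which resources.
ROLES = {
    "pod-reader": [{"resources": ["pods"], "verbs": ["get", "list", "watch"]}],
}

# A RoleBinding ties subjects (here, one service account) to a Role.
BINDINGS = [
    {"role": "pod-reader", "subjects": ["system:serviceaccount:default:my-app"]},
]


def is_allowed(subject, verb, resource, roles=ROLES, bindings=BINDINGS):
    """Return True if any Role bound to the subject allows verb on resource."""
    for binding in bindings:
        if subject not in binding["subjects"]:
            continue
        for rule in roles[binding["role"]]:
            if verb in rule["verbs"] and resource in rule["resources"]:
                return True
    # Nothing matched: access is denied by default.
    return False
```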

How do apps live inside Kubernetes?

The Pod is the smallest deployment object in Kubernetes
One Kubernetes pod can have multiple "containers" inside. These containers can have a specific behaviour:
- Initialization containers
- Normal containers
- Sidecars
Pods are stateless unless they have a way to keep a state attached to them.

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-application
    spec:
      containers:
      - name: container
        image: patrickdappollonio/hello-docker

Example pod: this runs a single container inside the pod with the name "my-application"

Pod Networking
All containers inside a pod share a "virtual network" (or software-defined network, made with either "iptables" or "ipvs" or third-party software). This network is based on conventions:
- All containers in the pod share one network namespace: the pod name becomes the hostname, and the containers reach each other over localhost
- Routing between these containers is "exclusive" and it can't be disrupted*
* However, one of the containers in a pod can "own" the network and force all traffic to go through it. (topic for another talk)

Exposing pods as Services
Once a pod is created, it only exists within its own network space -- unless other solutions like hostPort are used.
To expose it, we need to register it at the cluster level using a Kubernetes Service. Without going into detail, creating a Service for a pod means you have to "tie" the Pod to a Service using Labels.
The Service will then give you a unique FQDN: ${name}.${namespace}.svc.cluster.local
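
The naming convention can be written down as a tiny helper. A minimal sketch: `service_fqdn` is a hypothetical function, and `cluster.local` is assumed as the cluster domain (it is the usual default, but clusters can be configured with a different one).

```python
def service_fqdn(name, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name of a Service."""
    return f"{name}.{namespace}.svc.{cluster_domain}"
```

For a Service called `hello-svc` in the `default` namespace, this gives `hello-svc.default.svc.cluster.local`.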

Pro tip: lots of details about this in my follow-up blog post about Zero-Downtime deployments in Kubernetes!
Sourced Confluence link: https://sourced.atlassian.net/l/c/8Chh4t1P

Services?
Services are the way to expose Pods as applications to other parts of the cluster, whether they're in the same namespace or a different one.
There are two commonly used kinds of Services:
- ClusterIP
- NodePort
And two others used for custom behaviour: ExternalName and LoadBalancer.

Kinds of Services
ClusterIP
The default. Exposes a service to the rest of the cluster, instead of just the Pod network. Very useful to avoid exposing "more than necessary".
NodePort
A kind of Service that should be used for debugging or troubleshooting. It "catches" a random port between 30000 and 32767 (by default, can be changed) and exposes the app on all nodes through that port.

The Pod, now with a label:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-application
      labels:
        app: my-application
    spec:
      containers:
      - name: container
        image: patrickdappollonio/hello-docker

A ClusterIP Service selecting it:

    apiVersion: v1
    kind: Service
    metadata:
      name: hello-svc
    spec:
      type: ClusterIP
      selector:
        app: my-application
      ports:
      - port: 8000

The same Service as a NodePort:

    apiVersion: v1
    kind: Service
    metadata:
      name: hello-svc
    spec:
      type: NodePort
      selector:
        app: my-application
      ports:
      - port: 8000
        nodePort: 31234
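
The "tie" between a Service and its Pods is just label matching. A simplified sketch: `selector_matches` and `endpoints_for` are hypothetical helpers, the second Pod is invented for contrast, and only equality-based selectors are covered.

```python
PODS = [
    {"name": "my-application", "labels": {"app": "my-application"}},
    {"name": "something-else", "labels": {"app": "other", "tier": "prod"}},
]


def selector_matches(selector, labels):
    """True when every key/value pair of the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector, pods):
    """Return the names of the Pods a Service with this selector would target."""
    if not selector:
        # A Service without a selector does not pick any Pods by itself.
        return []
    return [pod["name"] for pod in pods if selector_matches(selector, pod["labels"])]
```

Extra labels on a Pod don't matter: the Pod only has to carry everything the selector asks for.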

Kinds of Services
LoadBalancer
This one is special. It's nothing more than a mixture of ClusterIP behaviour (exposed to the cluster) and NodePort behaviour (exposed on a random port).
It uses the "Cloud Controller Manager" to provision an actual load balancer (either appliance or VM-based) and point the "members" of the load balancer either to the NodePort service or directly to the Pod.
Inside the cluster, it's a local object that keeps track of the "members" of the Service and updates the cloud LB accordingly.

Kinds of Services
ExternalName
We know the following:
- Containers in a pod are exposed to the other containers of that pod
- Services expose these pods to the entire cluster
What if the application isn't running inside the cluster, but we want the fancy FQDN for it and/or to make it look like it's internal to the cluster?
ExternalName to the rescue!

ExternalName example

    apiVersion: v1
    kind: Service
    metadata:
      name: sourced-bucket
      namespace: prod
    spec:
      type: ExternalName
      externalName: sourced-bucket.s3.us-west-2.amazonaws.com

The code above exposes the Sourced bucket in S3 US West 2 using a DNS CNAME redirection, as follows:

    sourced-bucket.prod.svc.cluster.local   # cluster-level FQDN
    sourced-bucket.prod                     # from any namespace in the cluster
    sourced-bucket                          # if in the same namespace

Adding resiliency

Rule of thumb
Never put all your eggs in the same basket

The fight for not having a single Pod
In any of the previous cases, having a single Pod is not good for our HA: if it goes down, our application goes down.
For all stateless applications, Kubernetes offers: DEPLOYMENTS

Kubernetes Deployment
- Allows you to make multiple copies of your app, all running at the same time. With a Service, you can round-robin load balance all of them through kube-proxy
- Allows you to perform rollout releases with zero downtime (shoutout to the zero-downtime article again!)
- Allows a basic "template" of a Pod
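
Round-robin itself is a simple idea, and can be sketched like this. The sketch only illustrates the concept with made-up replica addresses; it is not kube-proxy's code, and how kube-proxy actually picks a backend depends on the proxy mode the cluster runs.

```python
import itertools


def round_robin(backends):
    """Yield backends one after another, starting over at the end of the list."""
    return itertools.cycle(backends)


# Hypothetical IPs for the 3 replicas of a Deployment.
replicas = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
picker = round_robin(replicas)
first_four = [next(picker) for _ in range(4)]
```

After four requests, `first_four` holds the three replicas in order and then the first one again.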

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-application
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-application
      template:
        metadata:
          labels:
            app: my-application
        spec:
          containers:
          - name: container
            image: patrickdappollonio/hello-docker

The same Pod we saw before, but now as a Deployment with 3 replicas

StatefulSets, DaemonSets, all-the-sets...

StatefulSet
On a Deployment, Pod names are randomized, and redeploying a pod might not put the pod back in the same node where it originally was. Multiple reasons might require you to either get predictable names or remember where your pods are deployed. StatefulSets are the solution.
DaemonSet
If you have 8 nodes, and you launch 8 pods, chances are they will be randomly distributed through your 8 nodes, but you might have nodes with zero pods and nodes with more than one. DaemonSets allow you to evenly distribute one-pod-per-node.
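
The two guarantees can be pictured with a couple of hypothetical helpers. This is a sketch of the outcome only, not of how the controllers work: StatefulSet pods get the set's name plus an ordinal starting at 0, and a DaemonSet ends up with exactly one pod on each node.

```python
def statefulset_pod_names(name, replicas):
    """Predictable names: <statefulset name>-<ordinal>, counting from 0."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def daemonset_pods_per_node(nodes):
    """One pod per node, no matter how many nodes there are."""
    return {node: 1 for node in nodes}
```

A StatefulSet named `db` with 3 replicas gives `db-0`, `db-1` and `db-2`, and those names stay the same across restarts.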

Pro tip: StatefulSets are a great way to migrate old applications (running, say, in VMs) to Kubernetes and the cloud.
It requires less cloud-readiness to launch an application as a StatefulSet than as a Deployment. There are several features in place to help Pods maintain their "state" (except in-memory state) as much as possible. Use this to your customer's advantage!

Feeding configuration to our pods
The case for ConfigMaps and Secrets

ConfigMaps & Secrets
Both ConfigMaps and Secrets work almost the same way:
- Both allow you to mount their data either as "files" (volumes) or environment variables (whenever possible)
- Both hold non-binary data -- although binary data can be converted to a string via base64
The main difference here is:
- Secrets are meant for sensitive data and can be encrypted at rest (when the cluster is configured for it; by default they are only base64-encoded)...
- ... but ConfigMaps are not

ConfigMap

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: game-config
    data:
      game.properties: |
        enemies=aliens
        lives=3
        enemies.cheat=true
        secret.code.lives=30
      backend-vs: "v1.3.0"

"Opaque" secret

    apiVersion: v1
    kind: Secret
    metadata:
      name: game-secrets
    type: Opaque
    stringData:
      db_password: "covfefe123"

    # data:
    #   db_password: Y292ZmVmZTEyMw==
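
The relationship between `stringData` (plain text) and `data` (base64) can be checked with a few lines of Python. A small sketch using only the standard library; the helper names are made up. Note that base64 is an encoding, not encryption: anyone can decode it.

```python
import base64


def encode_secret_value(plain):
    """Encode a plain-text value the way a Secret's "data" field expects it."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Turn a value from a Secret's "data" field back into plain text."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_secret_value("covfefe123")
# A stray trailing newline (for instance from `echo` without `-n`) is part of
# the value too, and it changes the encoded result.
encoded_with_newline = encode_secret_value("covfefe123\n")
```

Here `encoded` is `Y292ZmVmZTEyMw==`, while `encoded_with_newline` is `Y292ZmVmZTEyMwo=`: a common source of passwords that mysteriously don't work.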

... wait "Opaque" secret? I can see it clearly!

TL;DR
There are multiple kinds of Secret types, like:
- "Opaque" (or normal) secrets
- Service account tokens
- Docker configs
- Basic Auth
- SSH auth
- TLS
- Bootstrap Tokens
... but they're all "Opaque" secrets internally.

Some of these secret kinds are meant to be "handled" by Controllers (apps that can extend the Kubernetes functionality). However, adoption has been lacking, besides Kubernetes' own standards.
The goal is that secrets could be handled differently depending on their type.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: game-config-env-file
    data:
      enemies: aliens

    apiVersion: v1
    kind: Pod
    metadata:
      name: live-running-server
    spec:
      containers:
      - name: test-container
        image: example.org/gameco/shooter
        env:
        - name: ENEMY_TYPE
          valueFrom:
            configMapKeyRef:
              name: game-config-env-file
              key: enemies

(you can also mount them as files, but it'll be left as an exercise for the reader 😉)
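
From the application's point of view, the value is just a regular environment variable. A minimal sketch of what the game server could do, assuming it is written in Python; `enemy_type` and its `"unknown"` fallback are made up for illustration.

```python
import os


def enemy_type(environ=None):
    """Read ENEMY_TYPE, which Kubernetes fills in from the ConfigMap key."""
    env = os.environ if environ is None else environ
    # Fall back to a placeholder when the variable was not injected,
    # for example when running outside the cluster.
    return env.get("ENEMY_TYPE", "unknown")
```

The app never talks to the Kubernetes API for this: the value is injected when the container starts.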

Actual data storage
Persistent Volume Claims, Persistent Volumes and Storage Classes.

Persistent Volume Claims (PVC)
They're a way to "request block storage" from Kubernetes. They can be used to store data, logs, database backend information, etc.
Think of it as any folder backed by a storage engine somewhere.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-volume
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi

    apiVersion: v1
    kind: Pod
    metadata:
      name: hello-docker
      labels:
        app: hello-docker-app
    spec:
      containers:
      - name: shooter
        image: example.org/gamingco/shooter
        volumeMounts:
        - name: game-state-storage
          mountPath: /opt/shooter-data
      volumes:
      - name: game-state-storage
        persistentVolumeClaim:
          claimName: my-volume

Our game will save data in the /opt/shooter-data folder. The volume will be backed by the block storage mechanism configured in the cluster.
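
A quick note on the `30Gi` in the claim: the `Gi` suffix is binary (powers of 1024), not decimal. A rough sketch of the arithmetic; `quantity_to_bytes` is a hypothetical helper that only understands whole numbers with a binary suffix, which is far less than the real quantity format accepts.

```python
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def quantity_to_bytes(quantity):
    """Convert strings like "30Gi" to a number of bytes (binary suffixes only)."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    # No known suffix: treat the value as a plain number of bytes.
    return int(quantity)
```

So the claim on this slide asks for 30 × 1024³ bytes, a bit more than 32 billion bytes.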

Persistent Volumes (PV)
Do not confuse them with Persistent Volume Claims. Persistent Volumes (PVs) are a traditional way to mount a local-or-network classic volume for block storage. You can also host "cloud" block storage with this if there's a filesystem implementation for it (like FUSE).
In plain English: imagine the cluster is in your house, and you want to mount your USB volume as storage for your cluster. You'll configure it through a Persistent Volume.

Persistent Volumes have been slowly declining in use in favour of Storage Classes (SC), mainly due to the latter's cloud-ready behaviour.
In other words, can I mount my good-old Toshiba external drive to 3 computers at once with the same USB? Probably not.

StorageClass (SC)
Storage Classes are the "level up" from Persistent Volumes (PV). They make it possible to sell bulk block (or sometimes non-block) storage through Kubernetes.
- Persistent Volumes often double-check that the amount of storage you're requesting is available in the destination.
- Storage Classes assume that, since it's cloud-based, the "capacity" is infinite (as long as you keep paying 😉)
- Finally, it's quite common for PVCs bound to an SC to "expand" (contrary to my Toshiba that's still at 2 TB)
Disclaimer: some SCs might still check for available capacity against "quotas"

You can configure either one as the storage backend of a Persistent Volume Claim:

With a Storage Class:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-volume
    spec:
      storageClassName: premium-rwo
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi

With a Persistent Volume:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-volume
    spec:
      storageClassName: ""
      volumeName: "pv-toshiba"
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi

PSA: depending on your config, your Persistent Volume (PV) might need a spec.claimRef field too.

Cloud Kubernetes services (EKS, GKE, AKS) often come with preinstalled and preconfigured Storage Classes:

    $ kubectl get storageclasses
    NAME                 PROVISIONER             RECLAIMPOLICY   ...
    premium-rwo          pd.csi.storage.gke.io   Delete          ...
    standard (default)   kubernetes.io/gce-pd    Delete          ...
    standard-rwo         pd.csi.storage.gke.io   Delete          ...

With this, now you can launch an entire end-to-end stack
in Kubernetes that can be exposed to the world.

For a future episode:
- Jobs, CronJobs
- Controllers, Operators
- Custom Resource Definitions
- Ingress Controllers and Service Meshes
- ... and more!

For now: questions?
