Slide 1

Kuberception: Self Hosting Kubernetes
Tasdik Rahman (@tasdikrahman | tasdikrahman.me)

Slide 2

Who is this talk for?
● Kubernetes cluster operators
● People evaluating alternatives to KOPS/kubeadm etc. or a managed solution

Slide 3

Agenda
1. What is self-hosted Kubernetes?
2. Why?
3. How does it work?
4. Learnings from running it in production.
5. What's next?

Slide 4

Brief Intro to Kubernetes

Slide 5

Picture credits: https://elastisys.com/

Slide 6

What is self-hosted Kubernetes?

Slide 7

Self-hosted Kubernetes runs all required and optional components of a Kubernetes cluster on top of Kubernetes itself. The kubelet manages itself or is managed by the system init, and all other Kubernetes components can be managed through the Kubernetes APIs. *Ref: CoreOS Tectonic docs
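Because the control plane is made up of ordinary Kubernetes objects, it can be inspected with the same tooling as any other workload. A minimal check, assuming a bootkube-style cluster where the control-plane pods run in the kube-system namespace with a tier=control-plane label (label names vary between setups):

$ kubectl get pods -n kube-system -l tier=control-plane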

Slide 8

Self-hosted Kubernetes

Slide 9

Is it something new? * https://github.com/kubernetes/kubernetes/issues/246

Slide 10

Why?

Slide 11

Desired control-plane component properties
● Highly available
● Able to tolerate node failures
● Scales up and down with requirements
● Rollbacks and upgrades
● Monitoring and alerting
● Resource allocation
● RBAC

Slide 12

How does self-hosted Kubernetes address them?
● Small dependencies
● Deployment consistency
● Introspection
● Cluster upgrades
● Easier highly-available configurations
● Streamlined cluster lifecycle management

Slide 13

Small dependencies
The on-host requirements are minimal: a kubelet, a kubeconfig, and a container runtime.

Slide 14

No distinction between master and worker nodes

Slide 15

You select master nodes by adding labels to them:
$ kubectl label node node1 master=true
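The self-hosted control-plane manifests can then target the labeled nodes with a nodeSelector. A minimal, illustrative sketch of pinning a component to the master nodes (the names, image, and flags below are placeholders, not the exact bootkube-rendered manifest):

$ kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-apiserver
  template:
    metadata:
      labels:
        k8s-app: kube-apiserver
    spec:
      nodeSelector:
        master: "true"   # schedule only on nodes labeled master=true
      hostNetwork: true  # the apiserver must be reachable before cluster networking is up
      containers:
      - name: kube-apiserver
        image: k8s.gcr.io/hyperkube:v1.11.0   # placeholder version
        command: ["/hyperkube", "apiserver"]  # real flags (etcd servers, certs, ...) elided
EOF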

Slide 16

Introspection
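In a self-hosted cluster, introspection means debugging the control plane with the same kubectl workflow used for applications; for example (object names assume a bootkube-style layout):

$ kubectl -n kube-system get pods
$ kubectl -n kube-system describe daemonset kube-apiserver
$ kubectl -n kube-system logs <kube-scheduler-pod-name>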

Slide 17

Cluster upgrades
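Since each component is a Deployment or DaemonSet, an upgrade becomes a rolling image update driven through the API. A sketch, assuming the object and container names produced by bootkube-rendered manifests:

$ kubectl -n kube-system set image daemonset/kube-apiserver kube-apiserver=k8s.gcr.io/hyperkube:v1.11.1
$ kubectl -n kube-system rollout status daemonset/kube-apiserver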

Slide 18

Easier Highly Available configurations
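Making the control plane highly available then largely reduces to replica counts: the scheduler and controller-manager use leader election, so scaling their Deployments is enough. For example, assuming they run as Deployments in kube-system:

$ kubectl -n kube-system scale deployment/kube-scheduler --replicas=3
$ kubectl -n kube-system scale deployment/kube-controller-manager --replicas=3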

Slide 19

Streamlined cluster lifecycle management
$ kubectl apply -f kube-apiserver.yaml
$ kubectl apply -f controller-manager.yaml
$ kubectl apply -f flannel.yaml
$ kubectl apply -f my-app.yaml

Slide 20

How does it work?

Slide 21

Three main problems to solve for it to work

Slide 22

● Bootstrapping
● Upgrades
● Disaster recovery

Slide 23

Bootstrapping
● Control plane running as DaemonSets and Deployments, making use of Secrets and ConfigMaps.
● But... we need a control plane to apply these Deployments and DaemonSets in the first place.

Slide 24

Credits: ASAPScience

Slide 25

Then how should we solve this?

Slide 26

Use a temporary, static control plane to bootstrap the cluster

Slide 27

Bootkube: Looking at it from 10,000 feet
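At the command level, bootstrapping with bootkube is two steps: render the control-plane assets, then run the temporary control plane until the self-hosted one takes over. A sketch (these flags existed in earlier bootkube releases; check bootkube render --help for your version):

$ bootkube render --asset-dir=assets --api-servers=https://node1.example.com:6443 --etcd-servers=https://node1.example.com:2379
$ bootkube start --asset-dir=assets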

Slide 28

[Diagram: bootkube runs on the initial master node, uses the temporary control-plane manifests to bring up a temporary control plane, and then applies the self-hosted control-plane manifests through it.]

Slide 29

[Diagram: etcd; bootkube running the ephemeral kube-apiserver, controller-manager, and scheduler; and the self-hosted api-server, scheduler, and controller-manager on a system kubelet managed by the system init. Ephemeral control plane being brought up by bootkube.]

Slide 30

[Diagram: the self-hosted api-server, scheduler, and controller-manager keep running on the system kubelet (managed by the system init), backed by etcd. Bootkube exits, bringing down the ephemeral control plane.]

Slide 31

No content

Slide 32

Does this even work?

Slide 33

Controller node (master)

Slide 34

Demo

Slide 35

Do a Kubernetes control-plane component version upgrade on the test cluster
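One way to watch the upgrade happen is to list every control-plane pod together with its image before and after:

$ kubectl -n kube-system get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'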

Slide 36

Learnings from running it on production clusters

Slide 37

What went wrong and what went right
● Using the right instance types for the compute instances.
● Self-hosted etcd outage.
● api-server crashing during an image upgrade.
● Appropriate resource limits.
● Disaster recovery (etcd backups / bootkube recover / Heptio Ark).
● Blue-green clusters.
● Kubelet OOM'd.
● Cross-checking for compatibility with the cluster upgrade.

Slide 38

What’s next?

Slide 39

Automate the boring stuff Credits: AlSweigart

Slide 40

Automate
● Extend Kubernetes by leveraging CRDs.
● The cluster-upgrade part can be delegated to an operator.
● Custom systemd/shell scripts.

Slide 41

Future of bootkube
● Will be replaced by the kubelet pod API.
○ The write API would enable an external installation program to set up the control plane of a self-hosted Kubernetes cluster without requiring an existing API server.

Slide 42

Links
● GitHub repo used in the demo for setting up the self-hosted k8s test cluster with Typhoon: https://github.com/tasdikrahman/infra
● https://typhoon.psdn.io/: used as the baseline for this demo to create the self-hosted k8s cluster.

Slide 43

References
● SIG Cluster Lifecycle spec on self-hosted Kubernetes
● bootkube: design principles
● bootkube: how does it work
● bootkube: upgrading the Kubernetes cluster
● SIG Cluster Lifecycle Google Groups: early discussions on self-hosting

Slide 44

Credits
● @rmenn, @hashfyre, @gappan28 for teaching me what I know.
● @aaronlevy and @dghubble for always being there on #bootkube on the k8s Slack to clear up any questions about bootkube.
● @kubernetesio for sharing the slide template.
● The OSS contributors out there who have made k8s and the ecosystem around it what it is today.
● Arjun for lending me his laptop for DevOpsDays.

Slide 45

Questions? tasdikrahman.me | @tasdikrahman