
Intel and ARM, let Kubernetes rule them all!

A presentation on multi-platform Kubernetes I delivered at DevOps Gathering 2017 https://devops-gathering.io/2017/

Video recording: https://youtu.be/MxeVwAbRbDM
Location: Bochum, Germany

Lucas Käldström

March 24, 2017

Transcript

  1. $ whoami

     - A Swedish-speaking second-year Upper Secondary School (High School) student from Finland
     - A person who has never attended a computing class :)
     - A maintainer of Kubernetes for the past year
     - The "kubernetes-on-arm" guy 2 . 1
  2. What have I been tinkering with?

     - My first open source project was kubernetes-on-arm. It was the first easy solution for running Kubernetes on Raspberry Pis.
     - However, I wasn't satisfied with a side project; I wanted it in core, so I implemented multiarch support for Kubernetes in spring 2016. I also wrote a multi-platform proposal.
     - I worked on and maintained minikube in the early days of that project, until I moved on to kubeadm in August 2016 and started focusing on SIG-Cluster-Lifecycle issues, which I find very interesting and challenging. 2 . 2
  3. “Platform agnostic. The specifications developed will not be platform specific such that they can be implemented on a variety of architectures and operating systems.” -- CNCF Values 3 . 2
  4. Why is multi-platform functionality important for Kubernetes long-term?

     $ kubectl motivate multiplatform
     1. We don't know which platform will be the dominant one 20 years from now.
     2. By letting new architectures join the project, and more people with them, we'll see a stronger ecosystem and sound competition.
     3. The risk of vendor lock-in on the default platform is significantly reduced. 3 . 3
  5. What could Kubernetes on ARM be used for right now?

     - In classrooms: teaching newcomers how Kubernetes works using Raspberry Pis is the ideal way to let them actually see what it's all about.
     - A master's thesis on teaching Kubernetes concepts by letting students use small Raspberry Pi clusters: KubeCloud: A Small-Scale Tangible Cloud Computing Environment
     - The world's first 10nm processor is an ARM processor; exciting times! Microsoft Pledges to Use ARM Server Chips, Threatening Intel's Dominance 3 . 4
  6. How can I set up Kubernetes on another architecture?

     Since kubeadm was announced, it has been super easy to set up Kubernetes in an official way on ARM, and now also on ppc64le and s390x. Example setup on an ARM machine:

     $ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
     $ cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
     deb http://apt.kubernetes.io/ kubernetes-xenial main
     EOF
     $ apt-get update && apt-get install -y docker.io kubeadm
     $ kubeadm init
     ...
     $ kubectl apply -f https://git.io/weave-kube-1.6
     $ # DONE!

     TL;DR: Kubernetes shouldn't have different install paths for different platforms; it should just work out of the box. 4
  7. Oh, wow, how does that work under the hood? A quick intro to cross-compiling and manifest lists 5 . 1
  8. How does it work under the hood?

     - Kubernetes releases server binaries for all supported architectures (amd64, arm, arm64, ppc64le, s390x) and node binaries for all supported platforms (+ windows/amd64).
     - All Docker images in the core k8s repo are built and pushed for all architectures using a semi-standardized Makefile.
     - Debian packages are provided for all architectures as well; they basically just download the binaries and package them as debs.
     - kubeadm is aware of which architecture it's running on at init time and generates manifests for the right architecture. 5 . 2
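The architecture awareness above boils down to translating the kernel's machine name into the Go architecture label used in image names and in the beta.kubernetes.io/arch node label. A minimal sketch of that translation; the `goarch_for` helper is hypothetical, not the actual kubeadm code, but the mapping itself is the standard one:

```shell
# Hypothetical helper: map `uname -m` output to the Go/Kubernetes
# architecture name used in image names and node labels.
goarch_for() {
  case "$1" in
    x86_64)        echo amd64   ;;
    armv6l|armv7l) echo arm     ;;
    aarch64|arm64) echo arm64   ;;
    ppc64le)       echo ppc64le ;;
    s390x)         echo s390x   ;;
    *) echo "unsupported: $1" >&2; return 1 ;;
  esac
}

# On a Raspberry Pi 3 running a 32-bit OS, `uname -m` reports armv7l:
goarch_for armv7l   # prints: arm
```

In practice you would call it as `goarch_for "$(uname -m)"` and substitute the result into the image name, e.g. gcr.io/google_containers/pause-$ARCH.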
  9. A quick recap on cross-compiling and cross-building

     Binaries and Docker images released by Kubernetes are cross-compiled and cross-built for non-amd64 architectures. Cross-compilation with Go is relatively easy:

     $ # Cross-compile main.go to ARM 32-bit
     $ GOOS=linux GOARCH=arm CGO_ENABLED=0 go build main.go
     $ # Cross-compile main.go (which contains cgo code) to ARM 32-bit
     $ GOOS=linux GOARCH=arm CGO_ENABLED=1 CC=arm-linux-gnueabihf-gcc go build main.go

     Cross-building is a little bit harder; one may have to use QEMU to emulate another architecture:

     $ # Cross-build an armhf image with a RUN command that is executed on an amd64 host
     $ cat Dockerfile
     FROM armhf/debian:jessie
     COPY qemu-arm-static /usr/bin/
     RUN apt-get install iptables nfs-common
     COPY hyperkube /
     $ # Register the binfmt_misc module in the kernel and download QEMU
     $ docker run --rm --privileged multiarch/qemu-user-static:register --reset
     $ curl -sSL https://foo-qemu-download.com/x86_64_qemu-arm-static.tar.gz | tar -xz
     $ docker build -t gcr.io/google_containers/hyperkube-arm:v1.x.y . 5 . 3
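Per-architecture builds like the ones above are usually driven by a loop over the supported platforms. A dry-run sketch, assuming a static (CGO_ENABLED=0) build so no cross C compiler is needed; it only prints the commands, but with a Go toolchain installed you would execute them instead:

```shell
# Generate one cross-compile command per supported architecture.
# CGO_ENABLED=0 keeps the binary statically linked, which is why
# no per-arch C toolchain is required here.
cmds=$(for arch in amd64 arm arm64 ppc64le s390x; do
  echo "GOOS=linux GOARCH=$arch CGO_ENABLED=0 go build -o main-$arch main.go"
done)
echo "$cmds"
```

This is essentially what the semi-standardized Makefiles in the core repo do, just with more caching and container tooling around it.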
  10. “I don't want to have the architecture in the image name!!” Me neither. Enter manifest lists. 5 . 4
  11. Imagine this scenario...

      $ go build my-cool-app.go
      $ docker build -t luxas/my-cool-app-amd64:v1.0.0 .
      ...
      $ docker push luxas/my-cool-app-amd64:v1.0.0
      $ # ARM
      $ GOARCH=arm go build my-cool-app.go
      $ docker build -t luxas/my-cool-app-arm:v1.0.0 .
      ...
      $ docker push luxas/my-cool-app-arm:v1.0.0
      $ # ARM 64-bit
      $ GOARCH=arm64 go build my-cool-app.go
      $ docker build -t luxas/my-cool-app-arm64:v1.0.0 .
      ...
      $ docker push luxas/my-cool-app-arm64:v1.0.0

      Then you get excited, create a k8s cluster of amd64, arm and arm64 nodes, and try to run your application on that cluster. But which architecture should you use?

      $ kubectl run --image luxas/my-cool-app-???:v1.0.0 my-cool-app --port 80
      $ kubectl expose deployment my-cool-app --port 80

      This is the hardest problem with a multi-platform cluster: if you hardcode the architecture here, it will fail on all other machines. Ideally I would like to do this:

      $ kubectl run --image luxas/my-cool-app:v1.0.0 my-cool-app --port 80
      $ kubectl expose deployment my-cool-app --port 80 5 . 5
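Without manifest lists, a common workaround is to run one Deployment per architecture and pin each arch-specific image to matching nodes via a nodeSelector on the arch label the kubelet registers. A sketch of the ARM variant; the file name and object names here are illustrative, not from the deck:

```shell
# Write an illustrative per-arch Deployment manifest: the nodeSelector
# ensures the arm image only ever lands on 32-bit ARM nodes.
cat <<'EOF' > my-cool-app-arm.yaml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: my-cool-app-arm
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: my-cool-app
    spec:
      nodeSelector:
        beta.kubernetes.io/arch: arm
      containers:
      - name: my-cool-app
        image: luxas/my-cool-app-arm:v1.0.0
        ports:
        - containerPort: 80
EOF
```

Since all three Deployments share the app: my-cool-app label, a single Service can still load-balance across every architecture. The obvious downside is the triplicated objects, which is exactly what manifest lists remove.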
  12. Fortunately, that's totally possible! "Manifest list" is currently a Docker registry and client feature only, but I hope the general idea can propagate to other CRI implementations in the future. The idea is very simple: you have one tag (e.g. luxas/my-cool-app:v1.0.0) that serves as a "redirector" to platform-specific images. The client then downloads the right image digest based on the platform it's running on. See the Docker registry v2 schema 2 API reference. 5 . 6
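On the registry side, a manifest list is just a small JSON document that points at per-platform image manifests by digest. A sketch of its shape per the v2 schema 2 media types; the digests and sizes below are made-up placeholders:

```shell
# Write an illustrative manifest list. A registry serves this document
# when the client sends an Accept header for
# application/vnd.docker.distribution.manifest.list.v2+json.
cat <<'EOF' > manifest-list.json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "digest": "sha256:aaaa...",
      "size": 1234,
      "platform": { "architecture": "amd64", "os": "linux" }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "digest": "sha256:bbbb...",
      "size": 1234,
      "platform": { "architecture": "arm", "os": "linux" }
    }
  ]
}
EOF
```

The client matches its own os/architecture pair against the platform entries and then pulls the referenced digest, so the user-visible tag stays architecture-free.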
  13. Ok, so now that I know what a manifest list is, how do I create it?

      $ go build my-app.go
      $ docker build -t luxas/my-app-amd64:v1.0.0 .
      ...
      $ docker push luxas/my-app-amd64:v1.0.0
      $ # ARM
      $ GOARCH=arm go build my-app.go
      $ docker build -t luxas/my-app-arm:v1.0.0 .
      ...
      $ docker push luxas/my-app-arm:v1.0.0
      $ # ARM 64-bit
      $ GOARCH=arm64 go build my-app.go
      $ docker build -t luxas/my-app-arm64:v1.0.0 .
      ...
      $ docker push luxas/my-app-arm64:v1.0.0
      $ wget https://github.com/estesp/manifest-tool/releases/download/v0.4.0/manifest-tool-linux-amd64
      $ mv manifest-tool-linux-amd64 manifest-tool && chmod +x manifest-tool
      $ export PLATFORMS=linux/amd64,linux/arm,linux/arm64
      $ ./manifest-tool push from-args \
          --platforms $PLATFORMS \              # Which platforms the manifest list includes
          --template luxas/my-app-ARCH:v1.0.0 \ # ARCH is a placeholder for the real architecture
          --target luxas/my-app:v1.0.0          # The name of the resulting manifest list 5 . 7
  14. How has the Kubernetes road to multiarch been?

      v1.2:
      - The first release I participated in; I made the release bundle include ARM 32-bit binaries.
      v1.3:
      - Server Docker images are released for ARM, both 32- and 64-bit.
      - The kubelet chooses the right pause image and registers itself with beta.kubernetes.io/{os,arch}.
      v1.4:
      - kubeadm released as an official deployment method that supports ARM 32- and 64-bit.
      - Unfortunately, I had to use a patched Golang version for building ARM 32-bit binaries...
      v1.6:
      - The patched Golang version for ARM could be removed.
      - I re-enabled ppc64le builds and the community contributed s390x builds. 5 . 8
  15. Demo! Set up a cluster consisting of 2x Up Board, 2x Odroid C2, 3x Raspberry Pi 3 6 . 1
  16. With kubeadm this gets easy:

      KUBE_HYPERKUBE_IMAGE=luxas/hyperkube:v1.6.0-kubeadm-workshop-2 kubeadm-new init --config kubeadm.yaml
      sudo cp /etc/kubernetes/admin.conf $HOME/
      sudo chown $(id -u):$(id -g) $HOME/admin.conf
      export KUBECONFIG=$HOME/admin.conf
      kubectl apply -f weave.yaml
      kubectl taint no pi5 beta.kubernetes.io/arch=arm64:NoSchedule
      kubectl taint no pi6 pi7 beta.kubernetes.io/arch=arm:NoSchedule 6 . 2
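With those taints in place, only pods that explicitly tolerate them are scheduled onto the ARM nodes. A sketch of the tolerations such a pod spec would carry, mirroring the taint keys and values applied above; the file name is illustrative:

```shell
# Write an illustrative tolerations snippet: a pod template carrying
# these tolerates the NoSchedule taints on both the arm and arm64 nodes.
cat <<'EOF' > arm-tolerations.yaml
spec:
  template:
    spec:
      tolerations:
      - key: beta.kubernetes.io/arch
        operator: Equal
        value: arm
        effect: NoSchedule
      - key: beta.kubernetes.io/arch
        operator: Equal
        value: arm64
        effect: NoSchedule
EOF
```

Tainting the non-amd64 nodes this way keeps arch-unaware workloads off them by default, while multiarch-ready workloads opt in via tolerations.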
  17. # Create the Dashboard Deployment and Service
      kubectl apply -f demos/dashboard/dashboard.yaml

      # Create the Heapster Deployment and Service
      kubectl apply -f demos/monitoring/heapster.yaml

      # Deploy Traefik as the Ingress Controller and use Ngrok to
      # expose the Traefik Service to the Internet
      kubectl apply -f demos/loadbalancing/traefik-common.yaml
      kubectl apply -f demos/loadbalancing/traefik-ngrok.yaml

      # Expose the Dashboard to the world
      kubectl apply -f demos/dashboard/ingress.yaml

      # Get the public ngrok URL
      curl -sSL $(kubectl -n kube-system get svc ngrok -o template --template \
        "{{.spec.clusterIP}}")/api/tunnels | jq ".tunnels[].public_url" | sed 's/"//g;/http:/d'

      # Create InfluxDB and Grafana for saving the Heapster data
      kubectl apply -f demos/monitoring/influx-grafana.yaml

      # Create the Prometheus Operator, a Prometheus instance and a sample metrics app
      kubectl apply -f demos/monitoring/prometheus-operator.yaml
      kubectl apply -f demos/monitoring/sample-prometheus-instance.yaml

      # Create a Custom Metrics API server
      kubectl apply -f demos/monitoring/custom-metrics.yaml 6 . 3
  18. $ kubectl get no -owide
      NAME       STATUS     AGE   VERSION                                    EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION
      pi5        Ready      42m   v1.6.0-beta.4                              <none>        Debian GNU/Linux 8 (jessie)    4.9.13-bee42-v8
      pi6        Ready      43m   v1.6.0-beta.4                              <none>        Raspbian GNU/Linux 8 (jessie)  4.4.50-hypriotos-v7+
      pi7        NotReady   43m   v1.6.0-beta.4                              <none>        Raspbian GNU/Linux 8 (jessie)  4.4.50-hypriotos-v7+
      upboard1   Ready      46m   v1.7.0-alpha.0.1446+33eb8794c93d5b-dirty   <none>        Ubuntu 16.04.2 LTS             4.4.0-67-generic
      upboard2   NotReady   43m   v1.6.0-beta.4                              <none>        Ubuntu 16.04.2 LTS             4.4.0-66-generic

      $ kubectl get po --all-namespaces -owide
      NAMESPACE        NAME                                                   READY   STATUS    RESTARTS   AGE   IP                NODE
      custom-metrics   custom-metrics-apiserver-2410399496-j6dg1              1/1     Running   0          22m   10.47.0.2         pi6
      default          prometheus-operator-1505754769-n7kj8                   1/1     Running   0          30m   10.44.0.4         upboard2
      default          prometheus-sample-metrics-prom-0                       2/2     Running   0          29m   10.44.0.8         upboard2
      default          sample-metrics-app-2440858958-1h5wf                    1/1     Running   0          1m    10.45.0.6         pi5
      default          sample-metrics-app-2440858958-35fdz                    1/1     Running   0          1m    10.44.0.11        upboard2
      default          sample-metrics-app-2440858958-56r2x                    1/1     Running   0          1m    10.44.0.9         upboard2
      default          sample-metrics-app-2440858958-9grc1                    1/1     Running   0          29m   10.45.0.4         pi5
      default          sample-metrics-app-2440858958-f5w1t                    1/1     Running   0          4m    10.45.0.5         pi5
      default          sample-metrics-app-2440858958-km3gq                    1/1     Running   0          12m   10.47.0.3         pi6
      default          sample-metrics-app-2440858958-lntqp                    1/1     Running   0          1m    10.47.0.5         pi6
      default          sample-metrics-app-2440858958-nst8h                    1/1     Running   0          4m    10.47.0.4         pi6
      kube-system      etcd-upboard1                                          1/1     Running   0          44m   192.168.200.211   upboard1
      kube-system      heapster-57121549-mtx6f                                1/1     Running   0          41m   10.44.0.2         upboard2
      kube-system      kube-dns-3913472980-l3rkl                              3/3     Running   0          44m   10.32.0.2         upboard1
      kube-system      kube-proxy-0jwxh                                       1/1     Running   0          42m   192.168.200.215   pi5
      kube-system      kube-proxy-7ks9n                                       1/1     Running   0          45m   192.168.200.211   upboard1
      kube-system      kube-proxy-ktxqd                                       1/1     Running   0          43m   192.168.200.212   upboard2
      kube-system      kube-proxy-snp6v                                       1/1     Running   0          43m   192.168.200.216   pi6
      kube-system      kubernetes-dashboard-2731141917-rdbj2                  1/1     Running   0          41m   10.44.0.1         upboard2
      kube-system      monitoring-grafana-4071825559-rbs3w                    1/1     Running   0          34m   10.45.0.2         pi5
      kube-system      monitoring-influxdb-1373127269-pzwhx                   1/1     Running   0          34m   10.45.0.3         pi5
      kube-system      ngrok-3984100120-f5900                                 1/1     Running   0          41m   10.44.0.3         upboard2
      kube-system      pv-controller-manager-3769581161-dcn66                 1/1     Running   0          40m   10.47.0.1         pi6
      kube-system      self-hosted-kube-apiserver-kk6hk                       1/1     Running   1          45m   192.168.200.211   upboard1
      kube-system      self-hosted-kube-controller-manager-1546170996-40n6g   1/1     Running   0          45m   192.168.200.211   upboard1
      kube-system      self-hosted-kube-scheduler-3991062876-6s94c            1/1     Running   1          45m   192.168.200.211   upboard1
      kube-system      traefik-ingress-controller-3665677306-f5dhj            1/1     Running   0          41m   10.45.0.1         pi5
      kube-system      weave-net-3h3xm                                        2/2     Running   0          42m   192.168.200.215   pi5
      kube-system      weave-net-f7wwj                                        2/2     Running   0          43m   192.168.200.212   upboard2
      kube-system      weave-net-kxcr2                                        2/2     Running   0          43m   192.168.200.216   pi6
      kube-system      weave-net-n3tvh                                        2/2     Running   0          44m   192.168.200.211   upboard1
      wardle           wardle-apiserver-3982025089-3grzx                      2/2     Running   0          32m   10.44.0.10        upboard2 6 . 4
  19. $ kubectl get svc --all-namespaces
      NAMESPACE        NAME                         CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
      custom-metrics   api                          10.100.246.198   <none>        443/TCP          35m
      default          kubernetes                   10.96.0.1        <none>        443/TCP          52m
      default          prometheus-operated          None             <none>        9090/TCP         35m
      default          sample-metrics-app           10.97.141.133    <none>        8080/TCP         35m
      default          sample-metrics-prom          10.105.118.16    <nodes>       9090:30999/TCP   35m
      kube-system      heapster                     10.107.113.203   <none>        80/TCP           48m
      kube-system      kube-dns                     10.96.0.10       <none>        53/UDP,53/TCP    52m
      kube-system      kubernetes-dashboard         10.99.233.145    <none>        80/TCP           48m
      kube-system      monitoring-grafana           10.105.105.151   <none>        80/TCP           44m
      kube-system      monitoring-influxdb          10.99.193.162    <none>        8086/TCP         44m
      kube-system      ngrok                        10.107.224.120   <none>        80/TCP           48m
      kube-system      traefik-ingress-controller   10.102.162.4     <none>        80/TCP           48m
      kube-system      traefik-web                  10.109.90.245    <none>        80/TCP           48m
      wardle           api                          10.99.51.75      <none>        443/TCP          39m

      $ kubectl top node
      NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
      pi5        352m         8%     487Mi           56%
      upboard1   428m         10%    984Mi           75%
      pi6        414m         10%    449Mi           58% 6 . 5
  20. $ kubectl api-versions
      apiregistration.k8s.io/v1alpha1
      apps/v1beta1
      authentication.k8s.io/v1
      authentication.k8s.io/v1beta1
      authorization.k8s.io/v1
      authorization.k8s.io/v1beta1
      autoscaling/v1
      autoscaling/v2alpha1
      batch/v1
      batch/v2alpha1
      certificates.k8s.io/v1beta1
      custom-metrics.metrics.k8s.io/v1alpha1
      extensions/v1beta1
      monitoring.coreos.com/v1alpha1
      policy/v1beta1
      rbac.authorization.k8s.io/v1alpha1
      rbac.authorization.k8s.io/v1beta1
      rook.io/v1beta1
      settings.k8s.io/v1alpha1
      storage.k8s.io/v1
      storage.k8s.io/v1beta1
      v1
      wardle.k8s.io/v1alpha1

      $ kubectl apply -f demos/sample-apiserver/my-flunder.yaml
      flunder "my-first-flunder" configured

      $ kubectl get flunders
      NAME               KIND
      my-first-flunder   Flunder.v1alpha1.wardle.k8s.io 6 . 6
  21. $ curl -sSLk https://10.100.246.198/apis/custom-metrics.metrics.k8s.io/v1alpha1\
      /namespaces/default/services/sample-metrics-app/http_requests_total
      {
        "kind": "MetricValueList",
        "apiVersion": "custom-metrics.metrics.k8s.io/v1alpha1",
        "metadata": {},
        "items": [
          {
            "describedObject": {
              "kind": "Service",
              "namespace": "default",
              "name": "sample-metrics-app",
              "apiVersion": "/__internal"
            },
            "metricName": "http_requests_total",
            "timestamp": "2017-03-24T13:14:13Z",
            "window": 60,
            "value": "299m"
          }
        ]
      }

      $ kubectl get hpa
      NAME                     REFERENCE                       TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
      sample-metrics-app-hpa   Deployment/sample-metrics-app   333m / 100   2         10        10         31m 6 . 7
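The sample-metrics-app-hpa above scales the Deployment on the custom http_requests_total metric. A sketch of what such an HPA object could look like in the autoscaling/v2alpha1 API of that era; the field layout follows that alpha API, but treat the manifest as illustrative rather than the exact one used in the demo:

```shell
# Write an illustrative HPA that scales 2-10 replicas on a custom
# metric served by a Service, via the custom metrics API shown above.
cat <<'EOF' > sample-metrics-hpa.yaml
apiVersion: autoscaling/v2alpha1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-metrics-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: sample-metrics-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: sample-metrics-app
      metricName: http_requests_total
      targetValue: 100
EOF
```

With the observed value at 333m against a target of 100, the controller keeps scaling out, which matches the REPLICAS column hitting the maximum of 10.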
  22. What's yet to be done here?

      The current situation is OK and works, but it could obviously be improved. Here are some shout-outs to the community:
      - Automated CI testing for the other architectures using kubeadm
        - We might be able to use the CNCF cluster here?
      - Formalize a standard specification for how Kubernetes binaries should be compiled and how server images should be built
      - Official Kubernetes projects should publish binaries for at least amd64, arm, arm64, ppc64le, s390x and windows (node only)
      - Manifest lists should be built for the server images
        - This is blocked on gcr.io not supporting v2 schema 2 :(
      - Implement the manifest list feature in other CRI-compliant implementations
      - Create an external Admission Controller that applies platform data 7 . 2