Knative: The missing serving layer for Kubernetes

Proprietary + Confidential Knative: The missing serving layer for Kubernetes
Ahmet Alp Balkan twitter.com/ahmetb youtube.com/ahmetb github.com/ahmetb

About me
Working at Google Cloud on container-based developer experiences (GKE, Kubernetes, Cloud Run, Knative).
Creator of several open source projects:
- Google microservices-demo
- Krew (Kubernetes SIG CLI)
- kubectx/kubens
Previously worked at Microsoft Azure; was a Docker maintainer.

Kubernetes is a great platform to deploy and run microservices.
–Everyone

Kubernetes is a generic platform to run any workload, and "services" deserve better networking, rollout and monitoring capabilities from the infrastructure they run on.
–me 😇

Kubernetes: the good parts ✔
1. A "declarative" and "goal-state driven" API.
2. Manage a large set of machines (i.e. a cluster).
3. APIs to run container workloads on those machines (Pod, Deployment, StatefulSet, ...).
4. Extensibility to define your own APIs (CRDs) and controllers around them to actuate resources.

Pod: smallest deployment unit (1..N containers)
ReplicaSet: a scalable set of identical stateless Pods
Deployment: a ReplicaSet, but with revisions and rolling updates
StatefulSet: Pods with stable identities and persistent storage
Job: run a Pod to completion
CronJob: run a Job periodically

Microservices
noun. service, but smaller. usually a twelve-factor app.
• serves an API or web page
• stateless replicas
• load balancing
• autoscaling
• rollouts (blue/green)
• rollbacks
• service discovery
• secure transport (TLS)
• request metrics
• graceful termination
• shield from spikes/DoS
• concurrency limits
• ...

Where Kubernetes falls short (DIY)
• serves an API or web page
• stateless replicas
• load balancing
• autoscaling
• rollouts (blue/green)
• rollbacks
• service discovery
• secure transport (TLS)
• request metrics
• graceful termination
• shield from spikes/DoS
• concurrency limits
• ...

[Diagram: a client exchanging HTTP requests and HTTP responses with a microservice.]

Kubernetes has no notion of application-layer (L7) requests (HTTP, gRPC, ...).
[Diagram: a client and a microservice connected by a TCP socket.]

Where Kubernetes falls short: Load balancing
• Per-connection.
• Causes uneven distribution
  ◦ a single client establishing too many connections
• Naturally "sticky sessions"
  ◦ a client is routed to the same Pod even if it is degraded or faulty
[Diagram: a row of Pods behind the load balancer.]

Where Kubernetes falls short: Autoscaling
• Based only on CPU and memory
• Delayed metrics collection
  ◦ cannot easily handle spiky traffic patterns
  ◦ it might be too late when it's time to scale up

[Figure, built up over several slides: a Pod (1.5 CPU) with an autoscaling target of 1.0 CPU. Load segments labeled 0.4 cpu, 0.6 cpu and 0.2 cpu are added in turn, after which a second Pod appears.]
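The CPU-based autoscaling described above is typically configured with a HorizontalPodAutoscaler. The manifest below is only a sketch: the `hello-web` name is borrowed from the Deployment shown later in this deck, `maxReplicas` is an illustrative value, and the 1 CPU target mirrors the "Autoscaling target: 1.0 CPU" label in the figure.

```yaml
# Sketch of a CPU-based HPA; names and numbers are assumptions for illustration.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-web              # assumed to match the Deployment it scales
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-web
  minReplicas: 1               # the usual lower bound; no scale to zero here
  maxReplicas: 5               # illustrative value
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: AverageValue
          averageValue: "1"    # average of 1.0 CPU per Pod, as in the figure
```

Note that the only signal here is resource usage sampled after the fact; nothing in this object refers to requests.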

Where Kubernetes falls short: Meat shielding, concurrency controls, rapid autoscaling
• No support for highly spiky traffic patterns.
• Need a proxy or gateway to "front" the requests and "buffer" them.
• No "max N requests per container".
[Diagram: a "meat shield" in front of a group of Pods.]

Where Kubernetes falls short: Rollouts (blue/green deployments)
• Can't split traffic per-request, e.g.
  ◦ 95% v1
  ◦ 5% v2
• Need to implement blue/green rollouts yourself.
  ◦ The Deployment API gives some options for rolling updates, but not quite blue/green.
[Diagram: Pod v1 receiving 95% and Pod v2 receiving 5% of traffic.]
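For reference, the rolling-update options that the Deployment API does offer look roughly like the sketch below. The replica count, surge settings and the `2.0` image tag are illustrative assumptions; the rest reuses the `hello-web` manifest from later in this deck.

```yaml
# Sketch: a rolling update replaces Pods gradually, but it controls Pod counts,
# not the percentage of requests each version receives.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 4                  # illustrative value
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1              # at most one extra Pod during the rollout
      maxUnavailable: 0        # never drop below the desired replica count
  selector:
    matchLabels:
      app: hello
      tier: web
  template:
    metadata:
      labels:
        app: hello
        tier: web
    spec:
      containers:
        - name: main
          image: gcr.io/google-samples/hello-app:2.0   # assumed new version tag
```

Once the rollout finishes, all traffic goes to the new version; there is no field here for holding a 95%/5% split.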

Where Kubernetes falls short: Scale to zero
• Unused replicas keep consuming resources.
• Hard to have high utilization, because we almost always overprovision in Kubernetes.
[Diagram: idle Pods.]

Knative to the rescue

Knative.dev: Kubernetes-based platform to deploy and manage modern serverless workloads.
Make your developers more productive
Knative components build on top of Kubernetes, abstracting away the complex details and enabling developers to focus on what matters. Built by codifying the best practices shared by successful real-world implementations, Knative solves the "boring but difficult" parts of deploying and managing cloud native services so you don't have to.
Highlights
• Focused API with higher level abstractions for common app use-cases.
• Stand up a scalable, secure, stateless service in seconds.
• Loosely coupled features let you use the pieces you need.
• Pluggable components let you bring your own logging and monitoring, networking, and service mesh.
• Knative is portable: run it anywhere Kubernetes runs, never worry about vendor lock-in.
• Idiomatic developer experience, supporting common patterns such as GitOps, DockerOps, ManualOps.
• Knative can be used with common tools and frameworks such as Django, Ruby on Rails, Spring, and many more.

AHMET'S DEFINITION
A set of extensions to Kubernetes that supercharges your cluster to run stateless services more efficiently.
Heavily customizable and pluggable.
Strong open source community involving Google, Red Hat, VMware, IBM and SAP.

Knative enhances Kubernetes: Load balancing
Kubernetes: Connection-based. Unintentionally sticky sessions. Possibly uneven.
Knative: Per-request (HTTP, gRPC, …).
More: https://ahmet.im/blog/knative-better-kubernetes-networking/

Knative enhances Kubernetes: Scale to zero
Kubernetes: N/A
Knative: Scale application to 0 if inactive for a while. Activate (0→1) on the next request.
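As a rough illustration of how scale bounds can be expressed, the sketch below sets per-revision annotations on a Knative Service. The annotation keys are the ones I recall from the Knative Serving autoscaling documentation (older releases spelled them `minScale`/`maxScale`), so verify them against your installed version; the `max-scale` value is an arbitrary assumption.

```yaml
# Sketch: allow this service to scale down to zero replicas when idle.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"    # permit scale to zero
        autoscaling.knative.dev/max-scale: "10"   # illustrative upper bound
    spec:
      containers:
        - image: gcr.io/ahmetb-demo/hello-app:1.0
```

Setting `min-scale` to `"1"` or higher would instead keep that many replicas warm and avoid the 0→1 activation delay.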

Knative enhances Kubernetes: Autoscaling
Kubernetes: Memory/CPU based autoscaling (slow). No meat shield, spiky traffic will crash Pod.
Knative: Rapid, request-oriented autoscaling. Handles traffic spikes by buffering requests.
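A minimal sketch of request-oriented autoscaling settings on a Knative Service follows. `containerConcurrency` is a field of the revision template, and the two annotations are the autoscaling keys as I recall them from the Knative docs; the numeric values are assumptions chosen only for illustration.

```yaml
# Sketch: scale on in-flight requests and cap concurrency per container.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/metric: "concurrency"   # scale on concurrent requests
        autoscaling.knative.dev/target: "10"            # aim for ~10 in-flight requests per Pod
    spec:
      containerConcurrency: 20   # hard limit: "max N requests per container"
      containers:
        - image: gcr.io/ahmetb-demo/hello-app:1.0
```

Requests beyond the hard limit are queued in front of the container rather than delivered to it, which is the buffering behavior the slide refers to.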

Knative enhances Kubernetes: Request metrics
Kubernetes: N/A. You have to implement it yourself.
Knative: Reports "golden signals" such as request count, latency, error rate.

Knative enhances Kubernetes: Blue-green deployments
Kubernetes: Doesn't know about "new versions" or "requests" to split traffic among them.
Knative: Each deploy creates a new Revision. Split traffic between Revisions declaratively.

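Declarative traffic splitting between Revisions can be sketched with the `traffic` block of a Knative Service. The revision names `hello-00001` and `hello-00002` are hypothetical (they follow Knative's default naming pattern, but yours may differ), and the 95/5 split echoes the earlier rollout slide.

```yaml
# Sketch: send 95% of requests to one Revision and 5% to another.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: gcr.io/ahmetb-demo/hello-app:1.0
  traffic:
    - revisionName: hello-00001   # hypothetical existing Revision
      percent: 95
    - revisionName: hello-00002   # hypothetical newer Revision
      percent: 5                  # percentages must add up to 100
```

Shifting the rollout forward is then a matter of editing the two `percent` values and re-applying the manifest.
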
Migrating a Kubernetes microservice to Knative is easy
1. Kubernetes Deployment → shorten to Knative Service
2. Kubernetes Service
3. Kubernetes Ingress
4. Kubernetes HorizontalPodAutoscaler

Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
      tier: web
  template:
    metadata:
      labels:
        app: hello
        tier: web
    spec:
      containers:
      - name: main
        image: gcr.io/google-samples/hello-app:1.0
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
Kubernetes Service
apiVersion: v1
kind: Service
metadata:
  name: hello-web
spec:
  type: LoadBalancer
  selector:
    app: hello
    tier: web
  ports:
  - port: 80
    targetPort: 8080

[The same Deployment and Service manifests, annotated one callout at a time:]
• ports (port: 80, targetPort: 8080): no need, Knative will give us a $PORT
• labels and selectors: no need for all these labels and selectors
• replicas: 1: Knative autoscales
• type: LoadBalancer: Knative creates both internal and external endpoints by default
• name: main: no need for a container name if you have only one

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
      - image: gcr.io/ahmetb-demo/hello-app:1.0
        resources:
          limits:
            cpu: 100m
            memory: 256Mi

apiVersion: serving.knative.dev/v1   # was: apps/v1
kind: Service                        # was: Deployment
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
      - image: gcr.io/ahmetb-demo/hello-app:1.0
        resources:
          limits:
            cpu: 100m
            memory: 256Mi

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
      - image: gcr.io/ahmetb-demo/hello-app:1.0
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
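If the application needs a specific listening port, it can optionally be declared on the container, as in the sketch below. To my understanding of the Knative runtime contract, the declared `containerPort` is what gets passed to the container as `$PORT`, with 8080 as the default when nothing is declared; treat the value shown as illustrative.

```yaml
# Sketch: optionally declare the port that Knative exposes to the app as $PORT.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: gcr.io/ahmetb-demo/hello-app:1.0
          ports:
            - containerPort: 8080   # optional; illustrative value
```

The application itself should read the `PORT` environment variable rather than hard-coding a number.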

What if you don't even need Kubernetes to have this?
Cloud Run

Cloud Run
Run stateless containers on Google's managed serverless infrastructure.
Container image to production URL in a few seconds.
Runs any language or framework.
Pay only during requests, idle time is free.

Cloud Run
Pay only for what you use. Charged only during requests.

How to get Knative on GCP?

Thank you
Ahmet Alp Balkan
Software Engineer, Google Cloud
twitter.com/ahmetb github.com/ahmetb youtube.com/ahmetb
Resources:
• cloud.run (docs)
• knative.dev (docs)
• knative.tips (my notes)
