Slide 1

Slide 1 text

Senior Software Engineer, Google Cloud

Slide 2

Slide 2 text

Working on container-based experiences and services such as Kubernetes/GKE, Cloud Run and Knative. Maintainer of some popular open source projects in the Kubernetes operator productivity space. Prior to Google, worked at Microsoft on porting Docker to Windows and bringing docker-{machine,registry} to the Azure cloud, and is a former Docker maintainer.

Slide 3

Slide 3 text

I can write my code, and “git push”. It deploys to self-scaling managed infrastructure. HOW DID WE LOSE SIGHT OF THIS?

Slide 4

Slide 4 text

Heroku and App Engine would later be dubbed as: .

Slide 5

Slide 5 text

Let’s attempt to define serverless...

Slide 6

Slide 6 text

● No infra management ● Autoscaling based on demand ● Managed security ● Pay only for usage, no cost if unused ● Event-driven or request-driven services

PROVOCATIVE FOLLOW-UP QUESTIONS: Can databases be serverless? (Google Cloud Firestore, Google Cloud Pub/Sub, Amazon Aurora DB ...) Can containers be serverless?

Slide 7

Slide 7 text

Containers (Flexibility) vs. Serverless (Velocity)

Slide 8

Slide 8 text

Can be ?

Slide 9

Slide 9 text

developer experience ● Build a container image ● Deploy the application ● Expose at an endpoint ● Request-level load balancing ● Set up SSL/TLS ● Scale up based on demand ● Scale down to zero ● Canary deployments ● Monitor metrics

Slide 10

Slide 10 text

developer experience ● Build a container image→ Dockerfile/pack ● Deploy the application → Deployment ● Expose at an endpoint → ClusterIP svc ● Request-level load balancing → Ingress ● Set up SSL/TLS → Ingress+cert-manager? ● Scale up based on demand → HPA (CPU/mem) ● Scale down to zero → ??? ● Canary deployments → DIY? ● Monitor metrics → DIY Prometheus
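Spelled out as manifests, even the “deploy + expose” steps of the DIY mapping above already take several Kubernetes objects. A minimal sketch (app name and image are hypothetical); Ingress, cert-manager, HPA and Prometheus would each add more YAML on top:

```yaml
# Hypothetical app "hello": Deployment + ClusterIP Service only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 2
  selector:
    matchLabels: {app: hello}
  template:
    metadata:
      labels: {app: hello}
    spec:
      containers:
      - name: hello
        image: gcr.io/example/hello   # an image you built and pushed yourself
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: hello
spec:
  type: ClusterIP
  selector: {app: hello}
  ports:
  - port: 80
    targetPort: 8080
```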

Slide 11

Slide 11 text

Bringing “serverless-like” features to Kubernetes

Slide 12

Slide 12 text

● Source-to-URL deploys: Kubernetes needs container images built/pushed.
● Request-based load balancing and rapid autoscaling: Kubernetes load balancing only happens at the TCP layer among Pods; overloaded Pods will crash during traffic spikes, dropping other traffic.
● Scale-to-zero: Kubernetes cannot do this natively.
● Canary deployments, rollouts/rollbacks: Kubernetes cannot natively split traffic (it lacks L7 HTTP load balancing) and has no notion of immutable revisions to cleanly roll back to.
● Out-of-the-box application monitoring metrics: Kubernetes doesn’t provide monitoring signals beyond CPU/memory.

Slide 13

Slide 13 text

AHMET’S DEFINITION: An open source implementation (and API) that supercharges a Kubernetes cluster to run stateless services more effectively. Heavily customizable and pluggable. Strong open source community involving Google, Red Hat, IBM, VMware, SAP and others.

Slide 14

Slide 14 text

● Rapid autoscaling w/ request-layer load balancing: Knative performs request/RPC-level (Layer 7) load balancing and is designed to handle traffic spikes without overloading Pods.
● Can scale to zero: Knative can shut down unused applications and wake them up on the first request.
● Can do canary deployments, rollouts/rollbacks: Each Knative deployment creates a new immutable Revision. You can split traffic among Revisions by percentage.
● Out-of-the-box application monitoring metrics: Knative exposes HTTP golden signals (request count, latency, error code metrics) for all applications over Prometheus or other telemetry drivers.
● Still Kubernetes. :)
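A minimal Knative Service showing the percentage-based traffic split between Revisions (service, image and Revision names are hypothetical; the resource shape follows the Knative `serving.knative.dev/v1` API):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
      - image: gcr.io/example/hello
  traffic:
  # Canary rollout: 10% of requests to the newest Revision,
  # 90% pinned to a known-good earlier Revision.
  - latestRevision: true
    percent: 10
  - revisionName: hello-00001   # hypothetical earlier Revision
    percent: 90
```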

Slide 15

Slide 15 text

But is still and is not ?

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Cloud Run ● Container image to production URL in a few seconds ● Runs applications or binaries in any language ● Fully managed, rapid autoscaling ● Pay during requests, idle time is free
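As a sketch of the “image to production URL” flow (service, image and region are hypothetical; `gcloud run deploy` and these flags are the real CLI surface):

```shell
# Deploy a prebuilt container image to Cloud Run (fully managed).
gcloud run deploy hello \
  --image gcr.io/example/hello \
  --region us-central1 \
  --allow-unauthenticated
# Prints the service's https:// production URL when the rollout completes.
```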

Slide 18

Slide 18 text

DEMO

Slide 19

Slide 19 text

● Container must serve on the given $PORT. (Single port number, but you can customize which port.) ● Container must serve plain HTTP (or unencrypted h2c for gRPC). ● Container should do work only while serving requests. Background CPU usage is throttled to ~0. ● Only Linux x86_64 executables. No 32-bit support.
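A server honoring this contract can be sketched with the Python standard library alone (the response body and defaulting to 8080 for local runs are illustrative choices, not part of the contract):

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Plain-HTTP handler; the platform terminates TLS before the container."""
    def do_GET(self):
        body = b"Hello from Cloud Run!\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep request logging quiet

def make_server() -> HTTPServer:
    # Contract: listen on the single port injected via $PORT
    # (falling back to 8080 here for local development).
    port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("", port), Handler)

# A container entrypoint would run: make_server().serve_forever()
```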

Slide 20

Slide 20 text

Most serverless platforms: concurrency = 1. Cloud Run: concurrency = 50 (multiple requests per container instance).
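Concurrency is configurable per service. A sketch with a hypothetical service name (`--concurrency` is a real flag on `gcloud run`):

```shell
# Allow up to 50 simultaneous requests per container instance.
gcloud run services update hello --concurrency=50
```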

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Cloud Run (fully managed) • Implements the Knative API • Runs on Google’s infrastructure • Pay per request • Designed to go from 0 qps to tens of thousands of qps very quickly.
Cloud Run for Anthos • Hosted open source Knative • Runs next to other Kubernetes workloads in your GKE clusters anywhere (GCP/AWS/Azure/on-prem) • Same serverless developer experience • Tunable/extensible for your needs, as Knative is heavily customizable and pluggable.

Slide 23

Slide 23 text

[Architecture diagram] Developer & Operator drive the Knative API via UI, CLI, or YAML. Implementations: Cloud Run (Google’s infra, gVisor) and Cloud Run for Anthos (Knative on Anthos/GKE: GKE on GCP, GKE on AWS or Azure, GKE on-prem).

Slide 24

Slide 24 text

e.g. Azure Container Instances (ACI) and AWS Fargate ● Cloud Run only runs stateless HTTP server workloads. HTTP- or event-triggered only; no background batch processing. ● Cloud Run has rapid autoscaling out of the box. No network configuration, autoscaling configuration, etc. ● Pricing model is “pay only during requests”.

Slide 25

Slide 25 text

DEMO

Slide 26

Slide 26 text

● Knative ○ Domain name per service ○ Automatic TLS for domains ○ Configurable min/max instances ○ Declarative events support ● Cloud Run ○ Configurable max instances ○ gRPC/streaming support ○ Support for VPC networks, CDN, configurable L7 LB ● Google Cloud Buildpacks github.com/GoogleCloudPlatform/buildpacks
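The “configurable min/max instances” item above maps to autoscaling annotations on the Knative Revision template. A sketch with hypothetical names (`autoscaling.knative.dev/minScale` and `maxScale` are the real annotation keys):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"   # keep one warm instance
        autoscaling.knative.dev/maxScale: "10"  # cap the scale-out
    spec:
      containers:
      - image: gcr.io/example/hello
```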

Slide 27

Slide 27 text

● Get started: deploy your serverless containers to Cloud Run. ● Learn more about the Knative open source project. ● github.com/ahmetb/ Answers to most of your questions about Cloud Run. twitter.com/ahmetb github.com/ahmetb