Kubernetes at Cruise: Two Years of Multitenancy

Karl Isenberg
November 19, 2019

Cruise has been working on self-driving cars for six years and growing exponentially for most of that time. Two years ago they started using Kubernetes, betting on namespace-level multitenancy to provide isolation between teams and projects. Today they have over 40 internal tenants, 100,000 pods, 4,000 nodes, and… an embarrassing number of KubeDNS replicas.

This session will take you through the motivations, story, and results of migrating to multitenant Kubernetes, along with some hard-earned Pro Tips from the trenches.

You’ll also learn about the open source tooling they built around Spinnaker, Vault, Google Cloud, and Istio to integrate with their multitenant Kubernetes.

Come see how they went from barely isolated to very isolated and saved a few million dollars doing it!

Transcript

  1. Building the world’s most advanced autonomous vehicles... and running the backend on Kubernetes.
  2. [Diagram labels: Kubernetes, Cloud Storage, Cloud Routers, Interconnects, Physical Routers, On-Premises, Cloud NAT, Internet]
  3. [Diagram labels: Ingre, RBACSync [C], GKE, Spinnaker, Ingress Controller (private), Runscope, Stackdriver Logging, Network Load Balancer, Private Cloud DNS, Internal Load Balancer, Container Registry, Ingress, Security, Deployment, Observability, Forge Image Builder [C], Cruise PaaS, Ingress Controller (public), Cloud NAT, Egress, Management, Juno [C], Isopod [C]]

    Legend: [C] Cruise Projects. Other logos unaffiliated with Cruise.
  4. Multitenancy at Scale

    Clusters: 12-26
    Largest Cluster: ~1,000 nodes (64 or 32 vCPU each)
  5. Multitenancy is when multiple applications operate in a shared environment.

    Tenants are logically isolated, but physically integrated. The more physical integration, the harder it is to preserve logical isolation.
  6. Why Multitenancy?

    - Lower Cloud Costs: Higher collocation allows for higher utilization of cloud resources (compute, network, storage).
    - Lower Operational Costs: Fewer clusters can be managed by fewer platform engineers.
    - Higher Scale Validation: Validate real workloads at real scale by postponing cluster proliferation.
    - Higher Consistency: Focus on production readiness and tenant-facing improvements before scaling cluster operations.
  7. Multitenancy: Layers of Isolation

    - Domain Isolation
    - Identity Isolation
    - Permission Isolation
    - Resource Isolation
    - Network Isolation
    - Integration Isolation
    - System Isolation

    [Diagram axis: Isolation / Cost]
  8. Identity & Authentication

    User Identity
    - G Suite User Accounts
    - Okta Single Sign-On (SSO)
    - Duo Security (2FA)

    Service Identity
    - GCP Service Accounts
    - K8s Service Accounts
    - Signed Certificates
    - JSON Web Tokens (JWT)
  9. Daytona (OSS)

    Vault client for servers & containers.
    https://github.com/cruise-automation/daytona

    - Vault Login: Kubernetes service accounts used for Vault authentication.
    - Secrets Injection: Init container side-loads secrets.
    - Identity Translation: Vault generates temporary credentials on-demand.

    [Diagram labels: Kubernetes Pod, App Container, In-Memory Volume, Secrets, Login, Vault]

    See also: GKE Workload Identity (beta)
  10. k-rail (OSS)

    Security & operational policy enforcement tool.
    https://github.com/cruise-automation/k-rail

    - Audit: Validating webhook logs policy violations
    - Enforce: Validating webhook optionally enforces policies
    - Apply Defaults: Mutating webhook applies policy defaults

    Prevent Privilege Escalation & Lateral Movement
    - No Bind Mounts
    - No Host Network
    - No Host PID
    - No New Capabilities
    - No Privileged Container
    - No Helm Tiller
    - Default Docker Seccomp profile

    See also: GKE Metadata Concealment (beta)
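
The audit and enforce modes on this slide can be illustrated with a minimal Python sketch. This is not k-rail's implementation: `find_violations` and `review` are hypothetical names, only a subset of the listed policies is covered, and the Pod is assumed to be a plain dictionary using the standard Kubernetes Pod spec field names.

```python
def find_violations(pod: dict) -> list:
    """Return the names of the slide's policies that a Pod manifest violates."""
    spec = pod.get("spec", {})
    violations = []

    if spec.get("hostNetwork"):
        violations.append("No Host Network")
    if spec.get("hostPID"):
        violations.append("No Host PID")
    # A hostPath volume is how a Pod bind-mounts a directory from the node.
    if any("hostPath" in volume for volume in spec.get("volumes", [])):
        violations.append("No Bind Mounts")

    containers = spec.get("initContainers", []) + spec.get("containers", [])
    for container in containers:
        security = container.get("securityContext", {})
        if security.get("privileged"):
            violations.append("No Privileged Container")
        if security.get("capabilities", {}).get("add"):
            violations.append("No New Capabilities")

    # Report each policy once, keeping first-seen order.
    return list(dict.fromkeys(violations))


def review(pod: dict, enforce: bool) -> dict:
    """Audit mode (enforce=False) only reports; enforce mode rejects violations."""
    violations = find_violations(pod)
    return {"allowed": not (enforce and violations), "violations": violations}
```

Running `review` with `enforce=False` mirrors the audit step (violations are reported but the Pod is admitted), while `enforce=True` rejects any Pod with at least one violation.
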
  11. Domain Isolation

    - Environmental: Dev, Test, Stage, Prod
    - Organizational: Org, Dept, Team, Personal
    - Architectural: Project, System, Component
  12. Environmental vs Organizational Domains

    Clusters (columns) vs Namespaces (rows):

                Dev     Staging   Prod
    Team A      Pods    Pods      Pods
    Team B      Pods    Pods      Pods
    Team C      Pods    Pods      Pods
  13. Environmental vs Architectural Domains

    Clusters (columns) vs Namespaces (rows):

                 Dev     Staging   Prod
    Project A    Pods    Pods      Pods
    Project B    Pods    Pods      Pods
    Project C    Pods    Pods      Pods
  14. Group Role Binding

        apiVersion: rbacsync.getcruise.com/v1alpha
        kind: RBACSyncConfig
        metadata:
          name: namespace-bindings
          namespace: backend
        spec:
          bindings:
          - group: [email protected]
            roleRef:
              apiGroup: rbac.authorization.k8s.io
              kind: Role
              name: namespace-admin
          - group: [email protected]
            roleRef:
              apiGroup: rbac.authorization.k8s.io
              kind: Role
              name: namespace-editor

    [Diagram labels: GSuite Groups, RBACSync, RBACSyncConfig, ClusterRoleBinding, RoleBinding, ClusterRole, Role]

    RBACSync (OSS): https://github.com/cruise-automation/rbacsync

    See also: Google Groups for GKE (beta)
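
One way to picture the group-to-role mapping on this slide is as an expansion of each group binding into a standard Kubernetes RoleBinding. The sketch below is an illustration under assumptions, not RBACSync's code: `build_role_bindings` is a hypothetical helper, the group membership is passed in as a plain dictionary (no G Suite call is made), the generated binding names are an invented naming scheme, and the sketch assumes group members become `User` subjects.

```python
def build_role_bindings(config: dict, members_by_group: dict) -> list:
    """Expand each group binding of an RBACSyncConfig-shaped dict into a RoleBinding."""
    config_name = config["metadata"]["name"]
    namespace = config["metadata"]["namespace"]

    role_bindings = []
    for binding in config["spec"]["bindings"]:
        role_ref = binding["roleRef"]
        # Assumption: every member of the group becomes a User subject.
        subjects = [
            {"apiGroup": "rbac.authorization.k8s.io", "kind": "User", "name": email}
            for email in sorted(members_by_group.get(binding["group"], []))
        ]
        role_bindings.append({
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            # Hypothetical naming scheme: <config name>-<role name>.
            "metadata": {"name": f"{config_name}-{role_ref['name']}", "namespace": namespace},
            "roleRef": role_ref,
            "subjects": subjects,
        })
    return role_bindings
```

Keeping the membership lookup outside the function makes the expansion easy to test with fixed data; a real controller would refresh memberships periodically and reconcile the resulting bindings.
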
  15. Vault Workspaces

    Standard hierarchy for storing and authorizing application secrets.

    Group                 Permissions   Path
    Tenant Admin          admin         secret/<prefix>/<namespace>/*
    Tenant Contractor     list          secret/<prefix>/<namespace>/*
    App Service Account   list, get     secret/<prefix>/<namespace>/<env>/<app>/*
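
The path hierarchy on this slide can be sketched as a small lookup, shown here in Python. The helper names (`workspace_grants`, `matches`, `is_allowed`) and the sample values are hypothetical. The permission labels (`admin`, `list`, `get`) are taken from the slide rather than from Vault's own capability names, and the sketch assumes `admin` implies every other permission and that a trailing `*` means a prefix match.

```python
def workspace_grants(prefix: str, namespace: str, env: str, app: str) -> dict:
    """Build the slide's three grants as (path pattern, permissions) pairs."""
    tenant_path = f"secret/{prefix}/{namespace}/*"
    app_path = f"secret/{prefix}/{namespace}/{env}/{app}/*"
    return {
        "Tenant Admin": (tenant_path, {"admin"}),
        "Tenant Contractor": (tenant_path, {"list"}),
        "App Service Account": (app_path, {"list", "get"}),
    }


def matches(pattern: str, path: str) -> bool:
    """Treat a trailing '*' as a prefix match; anything else must match exactly."""
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return path == pattern


def is_allowed(grant: tuple, permission: str, path: str) -> bool:
    """Check one grant; 'admin' is assumed to imply every other permission."""
    pattern, permissions = grant
    return matches(pattern, path) and ("admin" in permissions or permission in permissions)
```

With these assumptions, an app's service account can read only beneath its own `<env>/<app>` subtree, while tenant-level groups see the whole namespace subtree.
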
  16. Isopod (OSS)

    DSL for Kubernetes configuration without YAML.
    https://github.com/cruise-automation/isopod

    - Domain Specific Language: Loosely typed with local runtime type validation
    - Less YAML: Skylark backed by Kubernetes Go Client
    - Flexible Reuse: Alternative to Helm & Terraform for addon management
  17. Juno (internal project)

    Cruise infra self-service resource provisioner.

    Resource Management
    - GCP Project
    - Vault Workspace
    - K8s Namespace

    Related OSS Projects
    - namespace-configuration-operator
    - rbac-permissions-operator
  18. Resource Isolation

    Built-In Types
    - CPU, GPU
    - Memory
    - Persistent Storage (for each Storage Class)
    - Ephemeral Storage

    Storage Volumes
    - OS Root
    - Container Images
    - Container Root
    - Ephemeral Storage Volumes
    - Persistent Local Storage Volumes

    Quotas & Limits
    - Resource Quota: Namespace Limits & Usage
    - Limit Range: Pod Default Requests & Limits
    - Defaults & Overrides (Juno)
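
The interplay of Limit Range defaults and Resource Quota limits can be sketched in a few lines of Python. This is a simplified model, not the Kubernetes admission code: `apply_defaults` and `admit` are hypothetical names, only resource requests are modeled, and quantities are plain integers (for example millicores and MiB) rather than Kubernetes quantity strings.

```python
def apply_defaults(container_requests: dict, default_requests: dict) -> dict:
    """Limit Range step: fill in any request the container did not set itself."""
    return {**default_requests, **container_requests}


def admit(containers: list, default_requests: dict, quota_hard: dict, quota_used: dict):
    """Resource Quota step: admit the pod only if namespace usage stays within limits.

    Returns (allowed, usage), where usage is updated only when the pod is admitted.
    """
    effective = [apply_defaults(requests, default_requests) for requests in containers]
    new_usage = {
        resource: quota_used.get(resource, 0) + sum(c.get(resource, 0) for c in effective)
        for resource in quota_hard
    }
    allowed = all(new_usage[resource] <= quota_hard[resource] for resource in quota_hard)
    return allowed, (new_usage if allowed else quota_used)
```

Applying the defaults first matters: a container that sets no requests still counts against the namespace quota at its defaulted size.
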
  19. Shared Tunnels

    NAT Gateways
    - NAT Gateway Terraform Module (network label routing)
    - Cloud NAT
    - Whitelists

    Ingress / Egress QPS
    - No Built-In Isolation
    - Network Stack shared with Network Storage (NAS/SAN/Cloud)

    Bandwidth
    - No Built-In Isolation
    - CNI Bandwidth Plugin (Calico)
    - Istio Rate Limits (Quota Rules)
  20. Virtual Firewalls

    Network Policy
    - IP Block
    - Namespace Selector
    - Pod Selector

    Service Authorization
    - Istio mTLS
    - Istio Authorization Policy

    Rule Based Access
    - Istio Denier Rule
    - Istio List Checker Adapter

    [Diagram labels: Mesh Boundary, Workload Pods, Workload Container, Istio Sidecar, Egress Gateway, Private Ingress Gateway, Public Ingress Gateway, External Service, Private Network, Local Client, Public Client]
  21. Shared Ingress

    Isolation Options
    - Separate Private & Public (shown)
    - Dedicated Ingress Node Pool
    - Dedicated Ingress VMs (shown)
    - Dedicated Ingress Per Tenant
  22. Shared DNS

    Isolation Options
    - Node Local Cache
    - Dedicated DNS Per Node Pool
    - Dedicated DNS Per Cluster

    [Diagram labels: Node, Application Pod, KubeDNS Pod, SkyDNS, DnsMasq, Sidecar (probes), Datadog Daemon, DataDog Agent, Google Cloud DNS, CoreDNS, AWS Route 53, etcd, *.cluster.local, Private Zones, DNS Requests (UDP), Application, Default Resolver, resolv.conf, Network Local Cache]
  23. Shared Observability

    Logs
    - Log Visibility (Container, Platform, Audit)
    - Log-Based Metrics (Edit Perms)
    - Fluentd DaemonSet Vertical Autoscaling

    Metrics
    - Kube State Metrics not HA
    - DaemonSet Agent HA & Slow or Local & Fast
    - Sidecar Agent Duplicate Metrics
    - DogStatsD vs Prometheus Style

    Distributed Tracing
    - OpenTelemetry vs OpenCensus vs OpenTracing
    - Stackdriver vs DataDog vs Jaeger vs Zipkin

    Dashboard Management
    - Platform Dashboards
    - Workload Dashboards
    - Dashboard Templates

    [Diagram: Container & Ingress Log Export. Labels: Team Project, PaaS Project, Kubernetes Engine, Monitoring, Logging, Logs Based Metrics, Cloud Load Balancing, Cloud Pub/Sub, Container Logs, Ingress Logs]
  24. System Isolation

    Machines
    - Dedicated Node Pool
    - Dedicated Cluster

    Cluster Components
    - API Server
    - Scheduler
    - Cluster Autoscaler
    - Kube Proxy (iptables)

    Networks
    - Dedicated IP Ranges
    - Dedicated Subnet
    - Dedicated Network
    - Dedicated Interconnects

    [Diagram labels: No Isolation (Shared Cluster Admin), Logical Isolation (Soft Multitenancy), Physical Isolation (Hard Multitenancy), System Isolation (Single Tenancy)]
  25. Was it worth it?

    Costs
    - Shared Downtime
    - Incompatible Tooling

    Challenges
    - Single Tenant Integrations
    - Managed CRD Installation
    - Managed Internal Platform Model
    - Kubernetes Itself

    Benefits
    - Lower Cloud Costs
    - Lower Operational Costs
    - Higher Scale Validation
    - Higher Consistency
    - Prioritized Security Investments
    - Expertise Building