instructions.
• A process is a binary loaded into memory with a PID (loaded from another process!)
• CPUs execute instructions, which use memory and I/O via syscalls
need to run processes
• We need to configure these things
• We need to install binaries on these ‘things’
• We need to configure these binaries
• We need to run these binaries as processes
• We need to secure and manage these things and processes
• We need to route traffic to these processes
• We need to update these things
• We need to update these processes

How, how, how?!
10.52.0.0/16 Stage 10.51.0.0/16 VendorX 10.53.0.0/16 Ops 10.54.0.0/16 Portal IT 172.16.0.0/16

An environment is:
• An AWS account and permissions
• A VPC
• Route tables
• Everything required to run Namely
• The ability to deploy components
An IPv4 address is 32 bits. That’s 32 1s and 0s:

11110000 11110000 11110000 11110000

The /8 and /16 denote how many leading bits ‘to keep’ as the network part, counting from the most significant bit (big-endian). This determines how IPs are allocated.

A route table directs an IP range to a target.
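To make the prefix notation concrete, here is a minimal sketch using Python's standard `ipaddress` module. The route table and its targets (`local`, `igw-example`) are hypothetical values made up for illustration; a real AWS route table is configured on the VPC, not in application code. The lookup assumes the usual "most specific route wins" rule.

```python
import ipaddress


def describe(cidr):
    """Summarise a CIDR block: how many leading bits are kept, how many are free."""
    net = ipaddress.ip_network(cidr)
    mask_bits = f"{int(net.netmask):032b}"
    return {
        "kept_bits": net.prefixlen,
        "free_bits": net.max_prefixlen - net.prefixlen,
        "addresses": net.num_addresses,
        "netmask": " ".join(mask_bits[i:i + 8] for i in range(0, 32, 8)),
    }


# Hypothetical route table: (IP range, target). Not taken from the slides.
ROUTES = [
    ("10.52.0.0/16", "local"),
    ("0.0.0.0/0", "igw-example"),
]


def lookup(ip, routes=ROUTES):
    """Return the target of the most specific (longest-prefix) matching route."""
    addr = ipaddress.ip_address(ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(describe("10.52.0.0/16"))
print(lookup("10.52.3.4"), lookup("8.8.8.8"))
```

A /16 keeps the first 16 bits fixed and leaves 16 bits free, so the block holds 65,536 addresses.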
10.100.12.0/22
• Zone 1c, private subnet, 10.100.28.0/22
• Zone 1d, private subnet, 10.100.44.0/22
• Zone 1a, public subnet, 10.100.10.0/22
• Zone 1c, public subnet, 10.100.32.0/22
• Zone 1d, public subnet, 10.100.40.0/22

[Diagram: Nodes, ENIs, and security groups (SG)]
• Public ELBs with public IPs
• Private ELBs with subnet IPs

Try using the dig command to find out how DNS names are mapped to IPs.
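As a rough programmatic counterpart to `dig`, the sketch below asks the operating system's resolver for the addresses behind a name, using only Python's standard `socket` module. Unlike `dig`, it goes through the local resolver configuration (including the hosts file) rather than querying a DNS server directly, and it does not show record types or TTLs.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the OS resolver gives for a name."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Swap in an ELB's DNS name to see which IPs it currently maps to.
    print(resolve("localhost"))
```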
container is just a binary, with its dependencies
• Abstracts node-management issues for processes
  ◦ IP addresses, ports, security, quotas
  ◦ Also in this space: ECS, Docker Swarm, GAE
• Built-in config, secret, service discovery, scaling management
• Automates some cloud infra, like load balancers
• Foundational-focused API (doesn’t solve some things)
  ◦ Allows for extensibility
a cluster
• Has a name and labels
• Has one unique private IP address
• Has one or more containers
• All containers in a pod share networking and storage
  ◦ Can “see” each other on localhost
• A ‘Sidecar’ is simply another container in a pod, usually auto-injected

Pod
Name: slug-bcddbcd8-1sa3a
Labels:
• app: slug
Containers:
• Slug
  ◦ image: namely/slug
  ◦ ports: 50051
  ◦ env
• Istio
specify how many instances of a pod should be running
• Represents desired state
  ◦ Kubernetes will try to get current state to desired state
• You usually don’t deal with these though
• But they are why Pods are re-created after you delete them!

ReplicaSet
Name: slug-bcddbcd8-1sa3a
Labels:
• app: slug
Spec:
• Replicas: 2
• Selector
  ◦ Labels
• Template
  ◦ Same as Pod!

[Diagram: the ReplicaSet owns two Pods]
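The "desired state" idea can be sketched as a tiny reconcile function. This is a toy model, not how the Kubernetes controller is implemented: the pod records, the `new_pod` helper, and the naming scheme are all hypothetical.

```python
import itertools

_ids = itertools.count(1)


def new_pod(template):
    """Create a pod record from a template (hypothetical naming scheme)."""
    app = template["app"]
    return {"name": f"{app}-{next(_ids)}", "labels": {"app": app}}


def reconcile(desired_replicas, pods, template):
    """Return a pod list whose length matches the desired replica count."""
    pods = list(pods)
    while len(pods) < desired_replicas:
        pods.append(new_pod(template))  # too few: create pods from the template
    del pods[desired_replicas:]         # too many: remove the extras
    return pods


template = {"app": "slug"}
pods = reconcile(2, [], template)    # initial state: two pods are created
pods.pop()                           # someone deletes a pod
pods = reconcile(2, pods, template)  # the next pass re-creates it
print([p["name"] for p in pods])
```

Running the same function repeatedly is harmless: once the current state matches the desired state, it changes nothing.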
deal with these
  ◦ Through Spinnaker!
• Deployments wind down old ReplicaSets and scale up new ones.
• They support various strategies for how things are updated.

Deployment
Name: slug
Labels:
• app: slug
Spec:
• Replicas: 4
• Strategy
  ◦ Rolling-update
• Selector
  ◦ Labels
• Template
  ◦ Same as Pod!

[Diagram: the Deployment owns RS-old and RS-new]
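A rolling update can be pictured as alternately scaling the new ReplicaSet up and the old one down. The sketch below is a simplified simulation that assumes a surge budget of one extra pod and no unavailable pods; the function name and parameters are illustrative, and the real controller also waits for readiness, which is not modelled here.

```python
def rolling_update(replicas, max_surge=1):
    """Return the (old, new) pod counts at each step of a simplified rolling update."""
    old, new = replicas, 0
    steps = [(old, new)]
    while new < replicas:
        surge = min(max_surge, replicas - new)
        new += surge             # scale the new ReplicaSet up first
        steps.append((old, new))
        old -= min(surge, old)   # then scale the old ReplicaSet down
        steps.append((old, new))
    return steps


for old, new in rolling_update(4):
    print(f"RS-old: {old}  RS-new: {new}")
```

With four replicas the total never drops below four and never exceeds five, so capacity is preserved throughout the update.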
• Uses an internal DNS service
  ◦ Currently CoreDNS
• Also uses a private network (10.3.0.0/16)
• Uses labels to match service names with pods
• Has three types.
  ◦ ClusterIP (default)
  ◦ LoadBalancer
    ▪ Allows external traffic to flow to internal pods
    ▪ On AWS creates an ELB
  ◦ Don’t worry about type three

Service
Name: slug
Labels:
• app: slug
Spec:
• Type
• Ports
• Selector
  ◦ Labels
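Label matching is simple enough to sketch directly: a pod is selected when its labels contain every key/value pair in the service's selector. The pod names and IPs below are made-up sample data, and the dictionaries only loosely mirror the real Kubernetes objects.

```python
def selects(selector, labels):
    """True when every key/value pair in the selector appears in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints(service, pods):
    """Return the IPs of the pods that the service's selector matches."""
    return [p["ip"] for p in pods if selects(service["selector"], p["labels"])]


# Made-up sample data for illustration.
pods = [
    {"name": "slug-a", "ip": "10.2.0.11", "labels": {"app": "slug"}},
    {"name": "slug-b", "ip": "10.2.0.12", "labels": {"app": "slug", "tier": "api"}},
    {"name": "other-a", "ip": "10.2.0.13", "labels": {"app": "other"}},
]
service = {"name": "slug", "selector": {"app": "slug"}}

print(endpoints(service, pods))
```

Extra labels on a pod do not prevent a match; only the keys named in the selector are compared.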
4 Layer 3 Layer 2 Layer 1

• Application: HTTP/HTTP2
• Blah
• Transport: TCP/UDP
• Network: IPv4/IPv6
• Data Link: Ethernet*
• Physical: Raw wires(less)

This is important
(single service)
• Ingress is L7. It knows about HTTP and HTTP/2
• Not natively implemented by Kubernetes
  ◦ Only the schema is defined
  ◦ Third parties implement it
  ◦ We tried Contour, Nginx, and Istio
• Allows us to compose, shape and route traffic declaratively
  ◦ Used for gRPC traffic
  ◦ Used for our APIs
  ◦ Used to better handle AATE egress

Ingress
Name: slug
Labels:
  app: slug
Spec:
  rules:
  - host: '*.i.namely.com'
    http:
      paths:
      - backend:
          serviceName: slug
          servicePort: 80
        path: /api/slug
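What an Ingress rule asks an implementation to do can be sketched as a host and path match. This is a toy model, not the behaviour of Contour, Nginx, or Istio: the rule is flattened into a plain dictionary, the wildcard is assumed to cover exactly one DNS label, and paths are assumed to match by prefix on whole path segments.

```python
# One rule, flattened from the Ingress on this slide (illustrative shape only).
RULES = [
    {"host": "*.i.namely.com", "path": "/api/slug", "service": "slug", "port": 80},
]


def host_matches(pattern, host):
    """Match a host, letting a leading '*.' stand for exactly one DNS label."""
    if pattern.startswith("*."):
        first_label, _, rest = host.partition(".")
        return bool(first_label) and rest == pattern[2:]
    return pattern == host


def path_matches(prefix, path):
    """Match a path by prefix, on whole path segments."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def route(host, path, rules=RULES):
    """Return (service, port) for the first matching rule, or None."""
    for rule in rules:
        if host_matches(rule["host"], host) and path_matches(rule["path"], path):
            return rule["service"], rule["port"]
    return None


print(route("slug.i.namely.com", "/api/slug/v1"))
print(route("example.com", "/api/slug"))
```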
do you determine ‘healthy’ pod
• No standard metrics, tracing
• No way to test failures
• Can’t optimize traffic (inter-zone)
• Standard retries, fail-fast
L4 and L7
• Emits stuff for us
• Big plans for Namely
  ◦ A/B testing
  ◦ Failure testing
  ◦ Better traffic management

[Diagram]
• Some Pod, 10.2.123.12: App Container, Istio
• Other Pod, 10.2.98.128: App Container, Istio
• 10.3.0.0/16
• Istio: “Hey K8S, tell me every service and every pod IP”
• Traffic goes straight to the Pod’s IP
that are not automatable, we continue having to staff humans to maintain the system. If we have to staff humans to do the work, we are feeding the machines with the blood, sweat, and tears of human beings. Think The Matrix with less special effects and more pissed off System Administrators (engineers).