Real World Kubernetes
Deployments
failure domains, upgrades, high-availability
@coreoslinux
@brandonphilips
Slide 2
Slide 2 text
Follow Along Instructions
http://bit.ly/1XeUbMW
Stickers Upfront
Decorate your laptop, dog, kid, phone.
Slide 3
Slide 3 text
Brandon Philips
CTO, CoreOS
github.com/philips
Slide 5
Slide 5 text
Build, Store and Distribute your Containers
quay.io
Slide 6
Slide 6 text
Linux
Slide 7
Slide 7 text
Secure the Internet
MISSION
Slide 8
Slide 8 text
Separate Apps from OS
STRATEGY
Slide 9
Slide 9 text
Make Servers Consistent
STRATEGY
Slide 10
Slide 10 text
Tolerate Machine Failures
STRATEGY
Slide 11
Slide 11 text
Make Servers Easy to Upgrade
STRATEGY
Slide 12
Slide 12 text
Simplify Application Upgrades
STRATEGY
Slide 18
Slide 18 text
Application Packaging
1
Slide 19
Slide 19 text
Abstract away app from the OS
OS App
Slide 22
Slide 22 text
Linux at Scale
2
Slide 23
Slide 23 text
Patches to the OS and kernel are hard
Retest after updates
No automation
SECURITY
Dependency breakage
Uptime risk
APPLICATION
Slide 25
Slide 25 text
Auto-updating browsers fixed security
We got HTML5 at the same time
Slide 26
Slide 26 text
Clustering
3
Slide 27
Slide 27 text
Operations Paradise
Easy scale out
Painless app upgrades
Tolerant of machine failure
Slide 28
Slide 28 text
App Req/sec: 6,000
App Healthy: True
Slide 29
Slide 29 text
App Req/sec: 6,000
App Healthy: True
Slide 30
Slide 30 text
App Req/sec: 7,000
App Healthy: True
Slide 31
Slide 31 text
App Req/sec: 8,000
App Healthy: True
Slide 32
Slide 32 text
App Req/sec: 7,000
App Healthy: True
Slide 33
Slide 33 text
App Req/sec: 6,000
App Healthy: True
Slide 34
Slide 34 text
App Req/sec: 8,000
App Healthy: True
Slide 35
Slide 35 text
App Req/sec: 7,000
App Healthy: True
Slide 36
Slide 36 text
App Req/sec: 8,000
App Healthy: True
Slide 37
Slide 37 text
App Req/sec: 8,000
App Healthy: True
Slide 38
Slide 38 text
3
Application packaging
Clustering
Linux at scale
Slide 39
Slide 39 text
3
Application packaging
Clustering
Linux at scale
Slide 40
Slide 40 text
Follow Along Instructions
https://github.com/philips/repositories
2016-OSCON-containers-at-scale
Slide 41
Slide 41 text
CoreOS+Kubernetes
vagrant, aws, bare metal, etc
coreos.com/kubernetes/docs/latest/
Slide 42
Slide 42 text
kubernetes
architecture in practice
Slide 43
Slide 43 text
worker
kubelet
worker
kubelet
worker
kubelet
scheduler
& API
worker
kubelet
worker
kubelet
worker
kubelet
Slide 44
Slide 44 text
worker
kubelet
worker
kubelet
scheduler
& API
Slide 45
Slide 45 text
worker &
API
works on 1 node too
Slide 46
Slide 46 text
kube-aws
Initial Cluster Setup
Slide 47
Slide 47 text
worker
kubelet
worker
kubelet
controller
scheduler, etcd
& API
Slide 48
Slide 48 text
Demo
Boot up a Cluster
Slide 49
Slide 49 text
Demo
Run an App
Slide 50
Slide 50 text
Demo
Understand the Network
Slide 51
Slide 51 text
Domains
Let's Talk About Failure
Slide 52
Slide 52 text
Failure domains are regions or
components of the infrastructure
which contain a potential for failure.
Slide 53
Slide 53 text
These regions can be physical or
logical boundaries, and each has its
own risks and challenges to architect
for.
Slide 54
Slide 54 text
Failure Feud
- Machine Failure
- Network/Disks/RAM/Processor/Power Supply
- Rack Failure
- Network/Power
- Data Center Failure
- Network/Power/Fire/Semi-trucks
- Internet Failure
- Network/Political/Natural
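The failure levels above suggest spreading copies of an app across domains so that losing one machine, rack, or data center leaves some copies running. The following is a minimal sketch of that idea, not how any particular scheduler does it; the machine inventory, the names, and the round-robin policy are all assumptions made for illustration.

```python
from itertools import cycle

# Hypothetical inventory: machine name -> (data center, rack).
MACHINES = {
    "m1": ("dc-a", "rack-1"),
    "m2": ("dc-a", "rack-1"),
    "m3": ("dc-a", "rack-2"),
    "m4": ("dc-b", "rack-3"),
    "m5": ("dc-b", "rack-4"),
}

DATA_CENTER, RACK = 0, 1  # positions inside the label tuple


def spread(replicas, machines, level):
    """Place replicas round-robin across the failure domains at `level`."""
    by_domain = {}
    for name, labels in sorted(machines.items()):
        by_domain.setdefault(labels[level], []).append(name)
    # One rotating pool of machines per domain, visited in turn.
    pools = [cycle(by_domain[domain]) for domain in sorted(by_domain)]
    return [next(pools[i % len(pools)]) for i in range(replicas)]


def survivors(placement, machines, level, failed_domain):
    """Return the replicas still running after one whole domain fails."""
    return [m for m in placement if machines[m][level] != failed_domain]


placement = spread(4, MACHINES, RACK)
print(placement)
print(survivors(placement, MACHINES, RACK, "rack-1"))
```

Spreading by rack protects against rack failure but says nothing about data centers; passing `DATA_CENTER` as the level trades one for the other, which is why each domain has to be architected for on its own terms.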
Slide 55
Slide 55 text
Failure Analysis
Kid Celebrating
Slide 57
Slide 57 text
Kid Hitting His Eye Failure Analysis
- Failure is caused by human error
- Celebration continues; eye unnecessary
- Kid has two eyes; can continue seeing
- Brain elects new eye automatically
Slide 58
Slide 58 text
Primary Datastore
etcd operations
Slide 59
Slide 59 text
/etc
distributed
hence, the name...
Slide 60
Slide 60 text
a clustered
key-value store
GET and SET operations
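To make GET and SET concrete, here is a toy single-process key-value store. It is only a sketch of the data model: `ToyKV`, its method names, and the example key are hypothetical, it is not etcd's client API, and it does none of the clustering that makes etcd useful.

```python
class ToyKV:
    """In-memory key-value store with a revision counter (illustration only)."""

    def __init__(self):
        self._data = {}
        self.revision = 0  # bumped on every write

    def set(self, key, value):
        """Store value under key and return the revision of this write."""
        self.revision += 1
        self._data[key] = (value, self.revision)
        return self.revision

    def get(self, key):
        """Return (value, revision_of_last_write), or None if the key is absent."""
        return self._data.get(key)


kv = ToyKV()
kv.set("/config/app/replicas", "3")  # hypothetical key
kv.set("/config/app/replicas", "5")
print(kv.get("/config/app/replicas"))
```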
Slide 61
Slide 61 text
a building block for
higher order systems
primitives for building reliable
distributed systems
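One example of such a primitive is an atomic compare-and-swap, from which a lock or a simple leader election can be built. The sketch below assumes a hypothetical in-memory `ToyStore`; a real clustered store would have to make the same operation atomic across machines, which is the hard part this toy skips.

```python
class ToyStore:
    """In-memory store exposing one primitive: compare-and-swap."""

    def __init__(self):
        self._data = {}

    def compare_and_swap(self, key, expected, new):
        """Set key to `new` only if its current value equals `expected`.

        None stands for "key is absent", both as `expected` and as `new`.
        Returns True if the swap happened.
        """
        if self._data.get(key) != expected:
            return False
        if new is None:
            self._data.pop(key, None)
        else:
            self._data[key] = new
        return True


def try_acquire(store, lock_key, owner):
    """Take the lock only if nobody holds it."""
    return store.compare_and_swap(lock_key, None, owner)


def release(store, lock_key, owner):
    """Release the lock only if `owner` is the current holder."""
    return store.compare_and_swap(lock_key, owner, None)


store = ToyStore()
print(try_acquire(store, "/locks/leader", "node-a"))  # first caller wins
print(try_acquire(store, "/locks/leader", "node-b"))  # second caller is refused
```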
Slide 62
Slide 62 text
Demo
play.etcd.io
Slide 70
Slide 70 text
Failure Analysis
etcd
Slide 71
Slide 71 text
worker
kubelet
worker
kubelet
scheduler
& API
Slide 72
Slide 72 text
kube-aws
high availability in cloud
Slide 73
Slide 73 text
scheduler
& API
EBS
{
ASG
Slide 74
Slide 74 text
etcd protects against
- Machine Failure
- Replication, automatic leader election
- Flaky Disk Failure
- CRC checksums on WAL files
- Network Failure
- Timeouts and linearized state machine
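Two of these protections can be sketched in a few lines. The first part is the usual majority arithmetic behind replication: a cluster keeps working while a majority of members is reachable. The second part shows how a CRC stored next to a record exposes corruption on read. The record layout here is an assumption for illustration and is not etcd's actual WAL format.

```python
import zlib


def quorum(members):
    """Smallest majority of a cluster with `members` nodes."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can fail while a majority remains."""
    return members - quorum(members)


def encode_record(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 (4 bytes, big-endian)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def decode_record(record: bytes) -> bytes:
    """Return the payload, or raise if it no longer matches its checksum."""
    stored = int.from_bytes(record[:4], "big")
    payload = record[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: record is corrupt")
    return payload


for n in (1, 3, 4, 5):
    print(n, "members tolerate", tolerated_failures(n), "failure(s)")
```

Note that four members tolerate no more failures than three, which is why odd cluster sizes are the common choice.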
Slide 75
Slide 75 text
etcd does not protect against
- Denial of Service
- Future work on proxies
- Lying etcd Peers
- We do a ton of functional testing as a hedge
- Buggy or Broken Clients
- Client deleting all keys requires restore from backup
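The last point is worth spelling out: replication faithfully copies a bad delete to every member, so only a backup taken beforehand helps. Here is a toy sketch of the snapshot-and-restore cycle using a plain dictionary and JSON; the keys and helper names are hypothetical, and a real restore uses etcd's own backup tooling rather than anything like this.

```python
import json


def snapshot(data):
    """Serialize the whole keyspace to a string that can be stored off-cluster."""
    return json.dumps(data, sort_keys=True)


def restore(blob):
    """Rebuild the keyspace from a snapshot produced by snapshot()."""
    return json.loads(blob)


# Hypothetical keyspace.
keyspace = {"/registry/pods/web": "running", "/registry/pods/db": "running"}

backup = snapshot(keyspace)   # taken on a schedule, before anything goes wrong
keyspace.clear()              # a buggy client deletes every key
keyspace = restore(backup)    # state is back to the time of the backup
print(keyspace)
```

Anything written between the backup and the bad delete is lost, so backup frequency sets the size of that window.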
Slide 76
Slide 76 text
Demo
etcd restore backup
Slide 77
Slide 77 text
1 2 3 4
{
Log
Slide 78
Slide 78 text
1 2 3 4
Entries
Slide 79
Slide 79 text
1 2 3 4
Indexes
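The log, entries, and indexes on these slides can be pictured as an append-only list in which every entry gets the next index, and current state is whatever results from replaying the entries in order. This is a toy sketch with hypothetical names and operations, not etcd's implementation.

```python
class ToyLog:
    """Append-only log; entry with index i is stored at position i - 1."""

    def __init__(self):
        self.entries = []

    def append(self, op, key, value=None):
        """Add an entry and return its 1-based index."""
        self.entries.append((op, key, value))
        return len(self.entries)

    def replay(self, upto=None):
        """Rebuild state from entries 1..upto (all entries if upto is None)."""
        state = {}
        for op, key, value in self.entries[:upto]:
            if op == "set":
                state[key] = value
            elif op == "delete":
                state.pop(key, None)
        return state


log = ToyLog()
log.append("set", "/a", "1")     # index 1
log.append("set", "/b", "2")     # index 2
log.append("delete", "/a")       # index 3
log.append("set", "/b", "3")     # index 4
print(log.replay())        # state after all four entries
print(log.replay(upto=2))  # state as of index 2
```

Because state is a pure function of the log prefix, two members that agree on the entries up to an index also agree on the state at that index.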
Slide 80
Slide 80 text
Kubernetes Control
API Server, Scheduler, Controller Manager
Slide 81
Slide 81 text
Failure Analysis
Kubernetes
Slide 82
Slide 82 text
Demo
etcd down for API server
Slide 83
Slide 83 text
worker
kubelet
worker
kubelet
scheduler
& API
Slide 84
Slide 84 text
scheduler
& API
Slide 85
Slide 85 text
Demo
etcd restore for API server
Slide 86
Slide 86 text
scheduler
& API
Slide 87
Slide 87 text
Demo
node partition from API
Slide 88
Slide 88 text
worker
kubelet
worker
kubelet
scheduler
& API
Slide 89
Slide 89 text
Demo
node scaling up
Slide 90
Slide 90 text
worker
kubelet
worker
kubelet
scheduler
& API
worker
kubelet
Slide 91
Slide 91 text
Demo
node scheduled outage API
Slide 92
Slide 92 text
worker
kubelet
worker
kubelet
scheduler
& API
Slide 93
Slide 93 text
Demo
node unplanned outage
Slide 94
Slide 94 text
worker
kubelet
worker
kubelet
scheduler
& API
Slide 95
Slide 95 text
Demo
node downgrade/upgrade outage
Slide 96
Slide 96 text
worker
kubelet
worker
kubelet
scheduler
& API
Slide 97
Slide 97 text
Future Work
Upstream Kubernetes and Elsewhere
Slide 98
Slide 98 text
Upstream
rktnetes
Auth/OIDC
Node self-signed TLS
Slide 99
Slide 99 text
Scaling
15x scheduler performance
30k pods on 1k nodes
SIG-scale
Slide 100
Slide 100 text
Automatic Node Drain
Locksmith Design Doc
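One way to read this slide: a node should be drained of its pods before it reboots for an update, and a shared lock should limit how many nodes do so at once. The sketch below assumes a semaphore-style reboot lock and a dictionary of pods per node; every name in it is hypothetical and it does not reflect the contents of the design doc.

```python
class RebootSemaphore:
    """Allows at most `max_holders` nodes to reboot at the same time."""

    def __init__(self, max_holders=1):
        self.max_holders = max_holders
        self.holders = set()

    def acquire(self, node):
        if node in self.holders:
            return True
        if len(self.holders) >= self.max_holders:
            return False
        self.holders.add(node)
        return True

    def release(self, node):
        self.holders.discard(node)


def drain_and_reboot(node, sem, pods_by_node):
    """Move pods off `node`, then 'reboot' it, only while holding the lock."""
    if not sem.acquire(node):
        return False  # another node is rebooting; try again later
    try:
        others = sorted(n for n in pods_by_node if n != node)
        if not others:
            return False  # nowhere to move the pods; refuse to drain
        evicted, pods_by_node[node] = pods_by_node[node], []
        # Reschedule evicted pods round-robin onto the remaining nodes.
        for i, pod in enumerate(evicted):
            pods_by_node[others[i % len(others)]].append(pod)
        return True  # the node would reboot here and come back empty
    finally:
        sem.release(node)


pods = {"n1": ["a", "b"], "n2": ["c"], "n3": []}
sem = RebootSemaphore(max_holders=1)
print(drain_and_reboot("n1", sem, pods), pods)
```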
Slide 102
Slide 102 text
Performance etcd3 / ZooKeeper snapshot disabled
Slide 103
Slide 103 text
Performance etcd3 / ZooKeeper snapshot disabled
Slide 104
Slide 104 text
Memory
10GB
2.4GB
0.8GB
512MB data - 2M 256B keys
Slide 105
Slide 105 text
Sounds good, but...
Is anyone successful with all this in prod?
Slide 106
Slide 106 text
Publicly traded options exchange
Slide 107
Slide 107 text
Containers on CoreOS are powering ISE's high-throughput,
low-latency financial exchange
Running in production
Bare metal & AWS
Billions of transactions a day
150 million req/sec