Kubernetes is an open-
source system for
automating deployment,
scaling, and management of
containerized applications
Slide 21
Slide 21 text
Kubernetes builds upon 15 years
of experience of running
production workloads at Google,
combined with best-of-breed
ideas and practices from the
community
Slide 22
Slide 22 text
I’m not here to tell
you that you should
adopt Kubernetes
Slide 23
Slide 23 text
Or even to go too deep
into the technical
details of our
migration
I’d like to share an anecdote
from our ongoing journey
Slide 30
Slide 30 text
The only slide with bullets, I promise!
• Why we migrated our monolith to Kubernetes
• How we approached a large cross-team project
• Where we are today
• What we learned in the process
• Where we’re headed
Slide 31
Slide 31 text
Why?
Slide 32
Slide 32 text
Context
Slide 33
Slide 33 text
The monolith
Slide 34
Slide 34 text
Ruby on Rails
Slide 35
Slide 35 text
github.com/
github/
github
Slide 36
Slide 36 text
GitHub dot com
the website
Slide 37
Slide 37 text
10 years old
Slide 38
Slide 38 text
Extremely
important to
early velocity
Slide 39
Slide 39 text
Increasing
complexity
Slide 40
Slide 40 text
Diffusion of
responsibility
Slide 41
Slide 41 text
No content
Slide 42
Slide 42 text
Incredibly high
performance
hardware
Slide 43
Slide 43 text
Incredibly reliable
hardware
Slide 44
Slide 44 text
Incredibly low
latency
networking
Slide 45
Slide 45 text
Incredibly high
throughput
networking
Slide 46
Slide 46 text
No content
Slide 47
Slide 47 text
Unit of compute
==
instance
Slide 48
Slide 48 text
Instance setup tightly
coupled with
configuration
management
Slide 49
Slide 49 text
API-driven,
testable, but brutal
feedback loop
Slide 50
Slide 50 text
Human-managed provisioning and
load balancing config
Slide 51
Slide 51 text
High level of effort
required to get a
service into
production
Slide 52
Slide 52 text
No content
Slide 53
Slide 53 text
Our customer
base is growing
Slide 54
Slide 54 text
Our customers are
growing
Slide 55
Slide 55 text
Our ecosystem is
growing
Slide 56
Slide 56 text
Our organization is
growing
Slide 57
Slide 57 text
We’re shipping
new products
Slide 58
Slide 58 text
We’re improving
existing products
Slide 59
Slide 59 text
Our customers
expect increasing
speed and reliability
Slide 60
Slide 60 text
We saw indications that our
approach was struggling to
deal with these forces
Slide 61
Slide 61 text
The engineering culture at
GitHub was attempting to
evolve to encourage individual
teams to act as maintainers of
their own services
Slide 62
Slide 62 text
SRE's tools and practices for running services
had not yet evolved to match
Slide 63
Slide 63 text
Easier to add functionality to an existing service
Slide 64
Slide 64 text
Unsurprisingly, the
monolith kept
growing
Slide 65
Slide 65 text
Increasing CI duration
Slide 66
Slide 66 text
Increasing deploy duration
Slide 67
Slide 67 text
Inflexible
infrastructure
Slide 68
Slide 68 text
Inefficient infrastructure
Slide 69
Slide 69 text
Private
cloud
lock-in
Slide 70
Slide 70 text
Developer and user experience trending downward
Slide 71
Slide 71 text
The planets aligned in a way
that made all of these
problems visible all at once
Slide 72
Slide 72 text
Hack week
Slide 73
Slide 73 text
Given a week to ship
something new and
innovative, what might we
expect engineers to do?
Slide 74
Slide 74 text
1) spend ~1 day on
Puppet, provisioning,
and load balancing
config
Slide 75
Slide 75 text
2) reach out to SRE
on Thursday and
ask for our help?
Slide 76
Slide 76 text
3) build hack week
features as a PR
against the monolith
Slide 77
Slide 77 text
Microcosm of the larger problems
with our approach
Slide 78
Slide 78 text
Incentives
not aligned
with the outcomes
we desired
Slide 79
Slide 79 text
No content
Slide 80
Slide 80 text
Our on-ramp went
in the
wrong direction
Slide 81
Slide 81 text
High
effort
required
Slide 82
Slide 82 text
No content
Slide 83
Slide 83 text
We decided to make
an investment in
our tools
Slide 84
Slide 84 text
We decided to make
an investment in
our processes
Slide 85
Slide 85 text
We decided to make
an investment in
our technology
Slide 86
Slide 86 text
To support the other ongoing
changes in our organization,
we decided that we would work to
level the playing field
Slide 87
Slide 87 text
To support the decomposition
of the monolith, we decided
that we would work to
provide a better experience
for new services
Slide 88
Slide 88 text
To enable SRE to spend more
time on interesting services,
we decided to work to reduce
the amount of time we needed
to spend on boring services
Slide 89
Slide 89 text
To reduce the time we spent
on boring services, we
decided to work to make the
service provisioning process
entirely self-service
Slide 90
Slide 90 text
To bring the infrastructure-
building feedback loop down,
we decided to base this new
future on a container
orchestration platform
Slide 91
Slide 91 text
To leverage the experience
of Google and the strength
of the community, we
decided to build this new
approach with Kubernetes
Slide 92
Slide 92 text
How?
Slide 93
Slide 93 text
okay sorry, a few more bullets
• Passion team
• Prototype
• Pick an impactful and visible target
• Product vision and project plan
• Pwork
• Pause and regroup