Our migration to microservices
The monolith
Containerizing the monolith
Conclusion
Contents
Slide 8
Slide 8 text
Our migration to microservices
Slide 9
Slide 9 text
$ rails new soundcloud
How we started
Slide 10
Slide 10 text
Around 2012
Slide 11
Slide 11 text
Around 2012
Slide 12
Slide 12 text
Around 2012
Slide 13
Slide 13 text
Around 2012
Slide 14
Slide 14 text
After
Slide 15
Slide 15 text
After
Slide 16
Slide 16 text
But what about deployment?
Slide 17
Slide 17 text
Component Environment Zone
Deployment
Our abstractions
Slide 18
Slide 18 text
Deployment
The process
Slide 19
Slide 19 text
The monolith
Slide 20
Slide 20 text
Track User Playlist
The monolith
Core entities
Slide 21
Slide 21 text
360 Chef-provisioned bare-metal machines
Rails 2.3
Capistrano deployment
The monolith
The technology
Slide 22
Slide 22 text
The monolith
The architecture
Public API
Internal API
Public API
Strangler
Internal API
caching/strangler
Another BFF
Many microservices
Public Web
Workers
Slide 23
Slide 23 text
The monolith
The components
Public API
MoshiMoshi (Internal API)
Public Web Assets
MoshiMoshi Comments (Internal API)
Workers Cron Shell Migration
Slide 24
Slide 24 text
The monolith
The hosts
component
statsd-exporter
mtail statsd
passenger-exporter
…
Slide 25
Slide 25 text
Utilization Deployment Lack of confidence
The monolith
The issues
Slide 26
Slide 26 text
Containerizing the monolith
Slide 27
Slide 27 text
Throw it into Kubernetes
Slide 28
Slide 28 text
Congrats, your monolith is a microservice now
Slide 29
Slide 29 text
Thank You!
Slide 30
Slide 30 text
1.5 engineers
1 year until the last bit was cleaned up
The project
Slide 31
Slide 31 text
The first milestone
Slide 32
Slide 32 text
Docker development container
Tests on GoCD
The first milestone
Slide 33
Slide 33 text
The proof of concept
Slide 34
Slide 34 text
First staging component
The proof of concept
Does it even work?
Slide 35
Slide 35 text
The proof of concept
PROBLEM!
Init script Nginx Passenger
Passenger
Process
Rails App
???
Where are my env variables?
Slide 36
Slide 36 text
The proof of concept
Env variables with The Perl Hack™
SOLUTION
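The slide does not show the hack itself. As a rough illustration of the underlying problem (variables set on the container do not reach the Rails app behind Nginx and Passenger), the sketch below renders selected variables into Passenger's Nginx `passenger_env_var` directives at container start. It is written in Python rather than Perl, and the function name, the `APP_` prefix filter, and the variable names are hypothetical.

```python
def render_passenger_env(environ, prefix="APP_"):
    """Return Nginx config lines that forward matching variables to the app.

    `prefix` is a hypothetical filter; values containing `$` may need
    extra care because Nginx treats `$` specially in config strings.
    """
    lines = []
    for name in sorted(environ):
        if not name.startswith(prefix):
            continue
        # Escape backslashes and double quotes so the value stays one quoted string.
        value = environ[name].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'passenger_env_var {name} "{value}";')
    return "\n".join(lines)


# Example with made-up variables; a real init script would pass os.environ.
print(render_passenger_env({"APP_DB_HOST": "db.internal", "HOME": "/root"}))
```

An init script could write this output into a file that the Nginx server block includes before Nginx starts.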
Slide 37
Slide 37 text
Productionizing
Slide 38
Slide 38 text
Deployment Monitoring Logs
Productionizing
Does it run where it matters?
Slide 39
Slide 39 text
Productionizing
Anatomy of a traffic serving pod
component statsd statsd-exporter
passenger-exporter
mtail twemproxy
twemproxy-cu
twemproxy-exporter
init
Slide 40
Slide 40 text
Don’t choke service discovery
Be allocatable
Productionizing
Sizing the pods
Slide 41
Slide 41 text
Productionizing
Sizing the pods
CPU units for main container: 3
Passenger processes: 16
Slide 42
Slide 42 text
Productionizing
PROBLEM!
Stdout vs. the log metrics exporter
Component
STDOUT
Log aggregator
File???
Mtail
Slide 43
Slide 43 text
Productionizing
Mtail with The Rotatelogs Hack™
SOLUTION
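The conflict on the previous slide is that the component logs to stdout for the log aggregator, while mtail reads from a file. A minimal sketch of the general idea, assuming a small tee process sits between the two: every line is passed through to stdout and also copied to a size-capped file. Python's `RotatingFileHandler` stands in here for Apache's `rotatelogs`; the function name, path, and size limits are hypothetical.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler


def tee_to_rotating_file(lines, path, out=sys.stdout, max_bytes=1_000_000, backups=1):
    """Copy each log line to `out` and to a size-capped file a tailer can follow."""
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger = logging.Logger("tee")  # standalone logger, not registered globally
    logger.addHandler(handler)
    count = 0
    try:
        for line in lines:
            text = line.rstrip("\n")
            out.write(text + "\n")  # still visible to the log aggregator
            out.flush()
            logger.info(text)       # copy for the file-based metrics exporter
            count += 1
    finally:
        handler.close()
    return count
```

In a container this would be driven by the component's output, for example by calling `tee_to_rotating_file(sys.stdin, "/var/log/app/current.log")` at the end of a pipe. Capping the file size keeps the copy from filling the pod's disk.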
Slide 44
Slide 44 text
Productionizing Public API
Slide 45
Slide 45 text
Orchestration
Productionizing Public API
Slide 46
Slide 46 text
Productionizing Public API
PROBLEM!
DNS latency and our excessive usage
Component
Service Service Service
Slide 47
Slide 47 text
Productionizing Public API
CoreDNS and the DNS Hack™
SOLUTION
Slide 48
Slide 48 text
Productionizing Internal API
Slide 49
Slide 49 text
Highest throughput
Productionizing Internal API
Slide 50
Slide 50 text
Productionizing Internal API
PROBLEM!
Latency was too high
Slide 51
Slide 51 text
Productionizing Internal API
Optimize GC and make cheaper SQL queries
SOLUTION
Slide 52
Slide 52 text
Productionizing Internal API
PROBLEM!
Error spikes during deployment
Slide 53
Slide 53 text
Productionizing Internal API
The preStop Trick™
SOLUTION
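The slide does not spell the trick out. A common pattern for error spikes during rolling deployments is a Kubernetes `preStop` hook that delays shutdown, so that traffic stops arriving before the process exits. The sketch below builds such a pod spec fragment as a Python dictionary; the container name, image, and durations are hypothetical, and this may differ from what was actually deployed.

```python
import json


def container_with_prestop(name, image, drain_seconds=15):
    """Build a container spec fragment with a preStop sleep hook."""
    return {
        "name": name,
        "image": image,
        "lifecycle": {
            "preStop": {
                # Keep serving while endpoints and load balancers catch up.
                "exec": {"command": ["sh", "-c", f"sleep {drain_seconds}"]}
            }
        },
    }


def pod_spec(container, drain_seconds=15, shutdown_seconds=30):
    """Wrap the container in a pod spec whose grace period covers drain plus shutdown."""
    return {
        "terminationGracePeriodSeconds": drain_seconds + shutdown_seconds,
        "containers": [container],
    }


spec = pod_spec(container_with_prestop("internal-api", "example/monolith:latest"))
print(json.dumps(spec, indent=2))
```

The grace period has to be longer than the sleep, otherwise the container is killed before it can shut down cleanly.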
Slide 54
Slide 54 text
Productionizing Internal API
PROBLEM!
Error spikes during deployment (still???)
Slide 55
Slide 55 text
Productionizing Internal API
The Pre Start Trick™
SOLUTION
Slide 56
Slide 56 text
The rest
Slide 57
Slide 57 text
● Workers
● Cron jobs
● Shell / Migration hosts
● Cleanup!
The rest
Slide 58
Slide 58 text
Finishing
Slide 59
Slide 59 text
Current status
Slide 60
Slide 60 text
Current status
Number of pods
On-prem: ~1000
Cloud: ~140
Slide 61
Slide 61 text
Current status
Traffic
On-prem: 25K RPS
Cloud: 3K RPS
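Combining the traffic figures with the pod counts from the previous slide gives a rough average load per pod. This is only back-of-the-envelope arithmetic: it assumes every pod serves traffic, which overcounts, since workers and cron jobs also run as pods, so the real per-pod load on traffic-serving pods is likely higher.

```python
def rps_per_pod(total_rps, pods):
    """Average requests per second handled by one pod."""
    return total_rps / pods


on_prem = rps_per_pod(25_000, 1000)  # 25K RPS across ~1000 pods
cloud = rps_per_pod(3_000, 140)      # 3K RPS across ~140 pods
print(f"on-prem: {on_prem:.1f} RPS/pod, cloud: {cloud:.1f} RPS/pod")
# prints "on-prem: 25.0 RPS/pod, cloud: 21.4 RPS/pod"
```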
Slide 62
Slide 62 text
Current status
Many deploys
Slide 63
Slide 63 text
Conclusions
Slide 64
Slide 64 text
One Infrastructure One Delivery Process
Conclusions
What we solved
Slide 65
Slide 65 text
Step by step Controlled rollouts Managing expectations
Conclusions
How we did it
Slide 66
Slide 66 text
Improved utilization Increased confidence Enabling new initiatives
Conclusions
Benefits
Slide 67
Slide 67 text
Assess current progress
Evaluate costs and benefits
Conclusions
Should you do it?