Development, Deployment and Collaboration at Etsy
Daniel Schauenberg
dschauenberg@etsy.com
@mrtazz
Slide 3
Etsy Stats
Slide 4
Etsy Stats
Slide 5
Item by TheBackPackShoppe
Slide 6
http://www.flickr.com/photos/brianglanz/1095706242
Slide 7
avg 50 deploys/day
Slide 8
avg n > m deploys/day
Slide 9
How comfortable are you deploying a change right now?
Slide 10
http://www.flickr.com/photos/renaissancechambara/2349811492
small change
Slide 11
Config Flags
Item by RocajoStudio
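A config flag lets a small change ship dark and be turned on gradually. The following is a minimal sketch of that idea, not Etsy's implementation: the flag names, the `FLAGS` table, and the `bucket` and `is_enabled` helpers are all hypothetical, and the percentage rollout is an assumption about how such flags are commonly used.

```python
import hashlib

# Hypothetical flag table; names and rollout percentages are illustrative only.
FLAGS = {
    "new_checkout": {"enabled": True, "percent": 10},
    "new_search": {"enabled": False, "percent": 100},
}


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, flags=FLAGS) -> bool:
    """Return True if the flag is switched on and the user falls inside the rollout."""
    cfg = flags.get(flag)
    if cfg is None or not cfg["enabled"]:
        # Unknown or disabled flags always take the old code path.
        return False
    return bucket(flag, user_id) < cfg["percent"]
```

Hashing the flag name together with the user ID keeps each user's answer stable across requests while giving different flags independent rollout groups.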
Slide 13
“If this is your first day at Etsy, you deploy the site”
Slide 14
Developer VMs
Slide 15
Developer VMs
• KVM
• Every engineer has one
• Fully Chef’d with the Etsy Stack
• Different sizes and Chef roles
Slide 17
Continuous Integration
Slide 19
Continuous Integration
• Run a set of tests before each deploy
• Full QA suite
• Princess/Production smoker tests
• Try (yup, there is one)
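The idea of running a set of tests before each deploy can be sketched as a simple gate. This is an illustration under stated assumptions, not the actual CI setup: `run_gate` and the suite names in the usage note are hypothetical, and each suite is modeled as a plain callable that returns True on success.

```python
from typing import Callable, Dict, List, Tuple


def run_gate(suites: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run every suite and return (ok, names of the suites that failed)."""
    failed: List[str] = []
    for name, suite in suites.items():
        try:
            passed = suite()
        except Exception:
            # A suite that crashes counts as a failure rather than aborting the gate.
            passed = False
        if not passed:
            failed.append(name)
    return (not failed, failed)
```

For example, `run_gate({"unit": run_unit, "smoke": run_smoke})` would return `(True, [])` only when both hypothetical suites pass, and a deploy tool could refuse to proceed otherwise.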
Slide 20
http://www.flickr.com/photos/egfocus/6962179321
Slide 21
The Bobs
• LXC virtualized hosts
• 14 per physical host
• Spread over 3 SSDs
• Most of them attached to try
Slide 23
Item by decomodwalls
Slide 24
Deployinator
Slide 25
Deployinator
• 2 Buttons, no ambiguity
• Overview of current state of deploy
• Links to Logwatcher and Dashboards
• Easy to add stacks for new tools to deploy
Slide 26
http://www.flickr.com/photos/jbgeronimi/6363087361
Slide 28
Monitoring
Slide 29
shouldigraphit.com
Slide 30
Monitoring
• Devs monitor their own features
• Everybody can access all the graphs
• Dashboard All The Things!
• Stream All The Logs!
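Feature-level monitoring usually starts with the code emitting its own metrics. The slides do not name a tool, so this sketch assumes a StatsD-style collector listening on UDP; the metric names are hypothetical, and the host and port are assumed defaults.

```python
import socket


def counter_line(name: str, value: int = 1) -> str:
    """Format a counter increment in the StatsD line protocol."""
    return f"{name}:{value}|c"


def timer_line(name: str, millis: int) -> str:
    """Format a timing measurement in milliseconds."""
    return f"{name}:{millis}|ms"


def send(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one metric line over UDP (fire and forget; host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

A feature owner might call `send(counter_line("checkout.completed"))` on each successful checkout and then graph that counter on a dashboard.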
Slide 34
On Call
Slide 35
If you are writing code, you are on-call
Slide 36
On-Call Schedules
• ops on-call
• dev on-call
• payments on-call
• support on-call
Slide 38
Dev On-Call
• On-call for 3 days
• All developers who are not in another rotation
• L1 and L2 escalations
• L1 if it’s your first time
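A three-day rotation like the one above can be computed from a start date and a list of names. This is only a sketch: the `on_call` helper, the roster, and the dates are hypothetical, and the L1/L2 escalation pairing is not modeled because the slides do not describe how it is assigned.

```python
from datetime import date
from typing import Sequence

SHIFT_DAYS = 3  # shift length taken from the slide


def on_call(engineers: Sequence[str], start: date, day: date) -> str:
    """Return who is on call on `day` for a rotation that began on `start`."""
    if not engineers:
        raise ValueError("rotation needs at least one engineer")
    if day < start:
        raise ValueError("day is before the rotation started")
    # Integer division turns elapsed days into a shift index.
    shift = (day - start).days // SHIFT_DAYS
    return engineers[shift % len(engineers)]
```

With a roster of three engineers, each person comes up once every nine days, and the result depends only on the date, so anyone can work out the schedule without shared state.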
Slide 39
Incident Response
Slide 40
Incident Response
• “This graph looks funny”
• “Hey I just got paged for elevated error rate after deploys”
• “Supergrep is going crazy!!”
Slide 41
Is the site down?
Slide 43
#warroom
Slide 44
#warroom
• only outage-related conversations
• coordinate investigation, communication, countermeasures and monitoring
• good place to lurk for new engineers
Slide 45
Post Mortems
Slide 46
blameless
Slide 47
Everybody’s invited
Slide 48
Learning Opportunity
Slide 49
Summary
Slide 50
Summary
• These are things that work for *us*
• Culture is an on-going effort
• Share everything
• Encourage learning/teaching
Slide 51
Summary
• Lunch ’n learns
• DC visits
• On-call for a day
• Bootcamps/Senior rotations