side project
100% CouchDB
donated hosting
IrisCouch
Slide 6
December 2013
Slide 7
January 2014
Slide 8
February 2014
» company founded & funded
» 100% hosted on Joyent
» several skimdbs load-balanced by Fastly
» hand-built CouchDB + SpiderMonkey
» automation by bash
» Twitter tells us when we're down
Slide 9
This is when I arrive.
(funding means you can hire!)
» PagerDuty account: first thing I did
» Nagios all hooked up & monitoring basic host health
» we have maybe 10 hosts total driving the registry
Slide 10
Funding also means attention
from bounty-hunters.
Slide 11
security audit
Slide 12
Stabilization stage 1
reactive
» monitor everything more deeply
» methodically identify & monitor causes of outages
» react quickly to fix problems
» Twitter is no longer telling us when we're down
Slide 13
Stabilization stage 2
proactive
» our second devops person: Ben Coe
» recurring problems fixed in the apps
» monitoring checks self-heal
» redundancy everywhere
» automation!
» our night shift is bored!
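A minimal sketch of what "monitoring checks self-heal" could look like, assuming a check that tries a fix before paging anyone. The names `self_healing_check`, `probe`, and `heal` are hypothetical and not taken from the npm tooling; a real setup would more likely wire this into Nagios event handlers.

```python
import time
from typing import Callable


def self_healing_check(
    probe: Callable[[], bool],
    heal: Callable[[], None],
    retries: int = 2,
    wait_seconds: float = 0.0,
) -> str:
    """Run a health probe and attempt a fix before escalating.

    Returns "ok" if the probe passes immediately, "healed" if it passes
    after a heal attempt, and "page" if every attempt fails.
    All names here are illustrative assumptions.
    """
    if probe():
        return "ok"
    for _ in range(retries):
        heal()                    # e.g. restart the service (hypothetical)
        time.sleep(wait_seconds)  # give the service time to come back
        if probe():
            return "healed"
    return "page"                 # only now does a human get woken up
```

Only the "page" result needs to reach PagerDuty, which is one way a night shift ends up bored.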
Slide 14
June 2013
Superficially
similar.
Slide 15
major changes
100% on AWS
Ubuntu Trusty
70/30 split between us-west-2 & us-east-1
100% automated with Ansible
52 running instances, variable
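To make the numbers concrete, here is a rough sketch of how a 70/30 split works out across 52 instances. It assumes the 70% share belongs to us-west-2 (the order the regions are listed in) and uses a hypothetical helper, `split_instances`; the instance count is variable, so this is a snapshot, not a rule.

```python
from math import floor
from typing import Dict


def split_instances(total: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split `total` instances across regions in proportion to `weights`.

    Uses the largest-remainder method so the counts always sum to `total`.
    Hypothetical helper for illustration only.
    """
    weight_sum = sum(weights.values())
    exact = {region: total * w / weight_sum for region, w in weights.items()}
    counts = {region: floor(share) for region, share in exact.items()}
    leftover = total - sum(counts.values())
    # Hand leftover instances to the regions with the largest fractional part.
    by_remainder = sorted(
        weights, key=lambda region: exact[region] - counts[region], reverse=True
    )
    for region in by_remainder[:leftover]:
        counts[region] += 1
    return counts


# Assumption: us-west-2 carries the 70% share.
print(split_instances(52, {"us-west-2": 70, "us-east-1": 30}))
```

Under that assumption the split comes to roughly 36 instances in us-west-2 and 16 in us-east-1.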
Slide 16
the stack
» Fastly CDN for Varnish cache & geolocality
» nginx to serve static files
» pound to terminate TLS
» CouchDB for package metadata & app logic
» Nagios + PagerDuty for monitoring
» InfluxDB + Grafana for metrics
» Tarsnap for backups
Slide 19
weak points
» single points of failure: Fastly, write primary
» still looking for an off-AWS backup
» expensive to run: too many CouchDBs
» too entangled with CouchDB
» complex in odd places: the skimworker, for example
Slide 20
I now praise
CouchDB
Slide 21
my next goal:
make it cheap
Slide 22
by next week
HAProxy
50-50 region balance
cheaper by far