About Me • Director of Development for Deseret Digital Media • Utah PHP Usergroup President • I Make (and Break) Web Stuff (10 years) • Salt User in Production since 0.8 (I <3 Salt)
This Presentation • Lessons learned at DDM & previous jobs • Insight into our process of increasing “DevOps” • We’re still learning, but this is what we’ve found. • Slides will be posted online, so don’t worry about copying slide content. • Feel free to ask on-topic questions during the talk, and we’ll take questions at the end.
About DDM • Deseret Digital Media runs local websites like KSL.com and DeseretNews.com • Also runs national and international websites like OK.com, familia.com.br, etc. • ~10 million pageviews a day across sites. • ~150 internal VMs, a few dozen physical machines, some AWS sprinkled around.
Challenges We Faced • Giant mash-up of technologies • Tightly-coupled & fragile infrastructure • Debugging production-only bugs was difficult • Bugs that were part code, part environment were a nightmare to track down.
What does DevOps Mean To Us? • DevOps: Dev & Ops, a Culture of Collaboration • Our Goal: “10 deploys a day without issues” • Everyone shares the goal of quick development of features AND a stable system that stays up.
Team Realizations • Hardest problem already solved: awesome team • No foreseeable rapid expansion, must operate at our current scale • Each Project’s Director of Development was acting as the bridge between Dev and Ops, but would become a bottleneck.
DevOp Engineer • Well Defined Role: • Ownership over the TOOLS to improve DevOps efforts. • Resource for other teams to help use DevOps Tools. • Easy to work with, aptitude for systems & ops, likes to try new things.
Promoting From Within • A seasoned dev for your team already knows: • Your Pain Points • Your System’s Quirks • How the “Chaos Works” • Knows the people & personalities on your team
Increasing Ops Among Devs • Identify Devs who like “Ops” & want to learn • Pair Dev with Op / Director • The learning Dev does the work, not the Op / Director. • Pair program if needed.
Metrics • Everyone has access to Network, Server, and Application Metrics. • Consolidate & reduce places to look. We try to pipe everything to StatsD / Graphite • Each developer trained to add & track metrics in production. • We’re okay with 98% uptime of stats to avoid complexity.
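A minimal sketch of what "adding a metric" can look like for a developer, assuming a StatsD daemon listening on UDP port 8125 (the usual default). The metric names, host, and helper functions are hypothetical and not taken from DDM's code. UDP is fire-and-forget, which fits the "98% uptime of stats" trade-off: a dropped packet is just a missing data point.

```python
import socket


def format_metric(name, value, metric_type):
    """Build a StatsD line such as 'web.requests:1|c'."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line over UDP without waiting for any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric names, for illustration only.
counter_line = format_metric("web.signup.success", 1, "c")    # counter
timer_line = format_metric("web.signup.duration", 320, "ms")  # timer in ms

if __name__ == "__main__":
    send_metric(counter_line)
    send_metric(timer_line)
```

In practice most teams use an existing StatsD client library rather than raw sockets; the point here is only how small the wire format is.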
Real-Time Logging • Harder & more complicated at scale • Still trying to solve this well; we have lots of logs. • Start with a small window of data (e.g. 48 hours) and expand the window from there. • We’re trying Logstash, ElasticSearch, and Kibana right now. • Generate Statistics off our Logs
Tracking Changes • Everything, everything, everything in git (we use GitHub) • Everyone has access to all repos • Everyone does work through Pull Requests • Everyone has their work code reviewed* • *You can merge w/o a review, but must be willing to defend your choice
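As a rough illustration of "generate statistics off our logs", the sketch below counts HTTP status codes in a batch of log lines. The log format (`METHOD PATH STATUS`) and the function name are assumptions made for this example, not DDM's actual format or tooling. Lines that do not match are skipped rather than treated as errors.

```python
from collections import Counter


def status_counts(lines):
    """Count status codes in lines shaped like 'GET /path 200' (assumed format)."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Skip anything that does not look like METHOD PATH STATUS.
        if len(parts) >= 3 and parts[2].isdigit():
            counts[parts[2]] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "GET / 200",
    "GET /missing 404",
    "POST /login 200",
    "not a request line",
]

if __name__ == "__main__":
    for status, count in sorted(status_counts(sample).items()):
        print(status, count)
```

Counts like these can then be sent to the same metrics system as everything else, which keeps the number of places to look small.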
Everyone Can Deploy • Automated our deployment process to a single step. • Everyone can deploy, deployments are logged • Easy rollback is a requirement! • Implementing feature flags to turn off single parts of our application.
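A minimal sketch of a feature flag, assuming flags live in a simple dictionary. The flag names and the `render_article` function are hypothetical; a real setup would likely load flags from config or a datastore so they can be flipped without a deploy. Unknown flags default to off, so a typo disables a feature instead of enabling it.

```python
# Hypothetical flags, for illustration only.
FLAGS = {"new_comments": True, "related_stories": False}


def is_enabled(flag, flags=FLAGS):
    """Return True only if the flag exists and is switched on."""
    return flags.get(flag, False)


def render_article(body, flags=FLAGS):
    """Assemble a page, leaving out any part whose flag is off."""
    parts = [body]
    if is_enabled("new_comments", flags):
        parts.append("[comments]")
    if is_enabled("related_stories", flags):
        parts.append("[related stories]")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_article("Story text"))
```

Turning off a misbehaving part of the application then means changing one flag value, not rolling back the whole deploy.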
Automated Tests • If you want to trust your Devs, you need tests • For Legacy Apps, we wrote Integration Tests • New Apps & Refactored Legacy Parts have Unit Tests • Continuous Integration to make sure tests run
Dev Salt Master • Every server has two minions: • Admin Salt (aka root) • Dev Salt (aka bob) • Each connects to a different master server: • All Devs have access to Dev Salt Master • Trusted Devs get access to Admin Salt Master
Dev Environment • Developers own the Dev Environment • Dev Teams manage the Salt States for their Env • Vagrant + Salt for their Env • Who makes changes? Developers • DevOp helps advise & offer support
Stage Environment • Stage & Production use the same salt repos, different branches • Developers make all the changes for Application Servers • All Changes through Pull Requests • We’ll worry about env changes before code • We release small changes quickly; large or long-running branches are scary & dangerous
Production Environment • Merge change to Production Branch • salt \* state.highstate • Reminder: Small quick changes over time, never a large change at once.
Level of “DevOps” Skills • Thinks about their impact on Ops: Everyone • Able to debug issues with production: Most • Able to make changes to environments: Many • “Awesome DevOp”: Some
Where We Are At • All Dev Environments using Vagrant + Salt • All New Stage & Prod Environments are Salty • Some Legacy Stage & Production Envs are Salty • Continuously working on getting our stuff salty.
Increase Team’s Insight • Make sure devs can see & understand how their code performs • Increase responsibility of team for those metrics. • If they break it, they fix it. Do not always bail them out. • Everyone can see everything.
Mentor Those With Desire / Aptitude • Give Developers Safe Environment to Learn • Let them submit code-reviewed changes for Stage & Production • When teaching / mentoring, let the learner drive, kindly offer advice and help. • It takes time, but worth the investment.