Slide 1

Slide 1 text

Mentoring Devs Into DevOps Justin Carmony Director of Development Deseret Digital Media

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Business Track

Slide 4

Slide 4 text

Let’s Measure The Audience • Who here is a… • System Administrator? • Developer? • Manager / Management? • “DevOp?”

Slide 5

Slide 5 text

Confession: I’m a Developer

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Self-Taught Ops Because There Was No One Else To Do It

Slide 8

Slide 8 text

About Me • Director of Development for Deseret Digital Media • Utah PHP Usergroup President • I Make (and Break) Web Stuff (10 years) • Salt User in Production since 0.8 (I <3 Salt)

Slide 9

Slide 9 text

This Presentation • Lessons learned at DDM & previous jobs • Insight into our process of increasing “DevOps” • We’re still learning, but this is what we’ve found. • Slides will be posted online, so don’t worry about copying slide content. • Feel free to ask on-topic questions during, and we’ll have questions at the end.

Slide 10

Slide 10 text

About DDM • Deseret Digital Media runs local websites like KSL.com and DeseretNews.com • Also runs national and international websites like OK.com, familia.com.br, etc. • ~10 million pageviews a day across sites. • ~150 internal VMs, a few dozen physical machines, some AWS sprinkled around.

Slide 11

Slide 11 text

Let’s Start With a Story!

Slide 12

Slide 12 text

You Work for an Awesome Tech Company

Slide 13

Slide 13 text

Team Is Working Hard to Build New Things!

Slide 14

Slide 14 text

You launch your awesome product!

Slide 15

Slide 15 text

A Few More Features…

Slide 16

Slide 16 text

… and next thing you know…

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Awesome Job Team, We Rock!

Slide 19

Slide 19 text

We Need Real-Time XYZ Feature! ASAP! &#$%!

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

“Huh, it works if you just turn off caching…” - Dev @ 80th Hour This Week

Slide 22

Slide 22 text

“I’m sure this will work…”

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

“Our servers are melting!”

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

“We Need a Better Solution!”

Slide 35

Slide 35 text

So…

Slide 36

Slide 36 text

Where Do We Start?

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

We Have This Problem

Slide 39

Slide 39 text

Challenges We Faced • Giant mash-up of technologies • Tightly-coupled & fragile infrastructure • Debugging production-only bugs was difficult • Bugs that were part code, part environment were a nightmare to track down.

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

So One Day… We Had A Genius Idea!

Slide 42

Slide 42 text

Let’s Hire a DevOp!

Slide 43

Slide 43 text

I’m Not Joking We Actually Said This

Slide 44

Slide 44 text

Two Problems with this “Idea”

Slide 45

Slide 45 text

Problem #1 - We Didn’t Understand What We Really Wanted

Slide 46

Slide 46 text

Step 1: Hire a DevOp! Step 2: ????????????! Step 3: Profit! Everything Works Perfectly!

Slide 47

Slide 47 text

Problem #2 - People Who Are Great At Dev & Ops Are Hard To Find

Slide 48

Slide 48 text

Expectation:

Slide 49

Slide 49 text

Reality:

Slide 50

Slide 50 text

Honest Team Discussion: What is it we’re really looking for?

Slide 51

Slide 51 text

We Discovered a Few Things

Slide 52

Slide 52 text

What does DevOps Mean To Us? • DevOps: Dev & Ops, a Culture of Collaboration • Our Goal: “10 deploys a day without issues” • Everyone shares the goal of quick development of features AND a stable system that stays up.

Slide 53

Slide 53 text

Team Structure Devs: 30 Ops: 2 DevOps: 1 Hiring one person won’t just solve all our problems!

Slide 54

Slide 54 text

Team Realizations • Hardest problem already solved: awesome team • No foreseeable rapid expansion, must operate at our current scale • Each Project’s Director of Development was acting as the bridge between Dev and Ops, but would become a bottleneck.

Slide 55

Slide 55 text

Teams Already Had Some Ad-Hoc DevOps Tools - Real-time Logging - Capistrano Deploys - Nagios Alerts - Server Metrics - Puppet for File Mgmt - App Stats w/ Graphite - Graphite Dashboards - Salt for Cfg Management - Homebrewed Metrics Sys. - Homebrewed Alert System

Slide 56

Slide 56 text

Step 1: Hire a DevOp! Step 2: ????????????! Step 3: Profit! Everything Works Perfectly!

Slide 57

Slide 57 text

We Formed A Strategy

Slide 58

Slide 58 text

Step #1: Promote Dev to DevOp Role

Slide 59

Slide 59 text

WAIT! Isn’t that the advice you just said was a bad idea?!

Slide 60

Slide 60 text

DevOp Engineer • Well Defined Role: • Ownership over the TOOLS to improve DevOps efforts. • Resource for other teams to help use DevOps Tools. • Easy to work with, aptitude for systems & ops, likes to try new things.

Slide 61

Slide 61 text

Promoting From Within • A seasoned dev on your team already knows: • Your Pain Points • Your System’s Quirks • How the “Chaos Works” • The people & personalities on your team

Slide 62

Slide 62 text

Step #2: Change Team Structure

Slide 63

Slide 63 text

Team Structure Devs: 30 Ops: 2

Slide 64

Slide 64 text

Team Structure Goal: Spread Out Expertise By Increasing Ops Experience & Skills Among Devs Dev Ops

Slide 65

Slide 65 text

Team Structure Dev Ops

Slide 66

Slide 66 text

Increasing Ops Among Devs • Identify Devs who liked “Ops” & wanted to Learn • Pair Dev with Op / Director • The learning Dev does the work, not the Op / Director. • Pair program if needed.

Slide 67

Slide 67 text

Step #3: Increase Everyone’s Insight

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

Metrics • Everyone has access to Network, Server, and Application Metrics. • Consolidate & reduce places to look. We try to pipe everything to StatsD / Graphite • Each developer trained to add & track metrics in production. • We’re okay with 98% uptime of stats to avoid complexity.
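As a rough illustration of what "add & track metrics" can look like, here is a minimal sketch of sending a metric to StatsD over UDP. The function names, the metric names, and the host are made up for this example; only the `name:value|type` line format and the default port 8125 come from StatsD itself. UDP is fire-and-forget, which fits the "okay with 98% uptime of stats" trade-off: a lost packet or a stats server that is down never breaks the application.

```python
import socket


def format_metric(name, value, kind="c"):
    """Build one StatsD line: <name>:<value>|<type>.

    kind is "c" (counter), "ms" (timer) or "g" (gauge).
    """
    if kind not in ("c", "ms", "g"):
        raise ValueError("unknown metric type: %r" % (kind,))
    return "%s:%s|%s" % (name, value, kind)


def send_metric(name, value, kind="c", host="127.0.0.1", port=8125):
    """Send one metric over UDP. Returns False instead of raising on failure."""
    payload = format_metric(name, value, kind).encode("ascii")
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
        return True
    except OSError:
        # Stats are best-effort: never let a metrics problem break the app.
        return False
```

A developer would call something like `send_metric("signup.completed", 1)` or `send_metric("search.render_time", 320, "ms")` from application code (both metric names are hypothetical).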

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

Real-Time Logging

Slide 72

Slide 72 text

Real-Time Logging • Harder & more complicated at scale • Still trying to solve this well; we have lots of logs. • Start with a small window of data (e.g. 48 hours) and start to expand the window. • We’re trying Logstash, ElasticSearch, and Kibana right now. • Generate Statistics off our Logs
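To make "small window" and "statistics off our logs" concrete, here is a small sketch that counts HTTP status codes for log lines from the last 48 hours. The log format (`ISO-timestamp status path`) and the function names are assumptions for this example, not DDM's actual format or tooling; in practice this job would be done by the Logstash / ElasticSearch pipeline.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(hours=48)  # start small, widen later


def parse_line(line):
    """Split a hypothetical 'timestamp status path' log line."""
    stamp, status, path = line.split(" ", 2)
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S"), status, path


def status_counts(lines, now, window=WINDOW):
    """Count status codes for lines no older than `window` before `now`."""
    cutoff = now - window
    counts = Counter()
    for line in lines:
        when, status, _path = parse_line(line)
        if cutoff <= when <= now:
            counts[status] += 1
    return counts
```

The resulting counts could then be pushed into the same metrics system as everything else, so there is one less place to look.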

Slide 73

Slide 73 text

Tracking Changes • Everything, everything, everything in git (we use GitHub) • Everyone has access to all repos • Everyone does work through Pull Requests • Everyone has their work code reviewed * * - You can merge w/o a review, but must be willing to defend your choice

Slide 74

Slide 74 text

Deploys

Slide 75

Slide 75 text

Everyone Can Deploy • Automated our deployment process to a single step. • Everyone can deploy, deployments are logged • Easy rollback is a requirement! • Implementing feature flags to turn off single parts of our application.
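A feature flag can be as simple as a lookup that guards one part of a page. The sketch below is only an illustration of the idea; the class, the flag name, and `render_page` are hypothetical and not taken from DDM's code.

```python
class FeatureFlags:
    """Tiny in-memory flag store; unknown flags fall back to a default."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name, default=False):
        return bool(self._flags.get(name, default))

    def disable(self, name):
        """Turn one feature off without touching the rest of the app."""
        self._flags[name] = False


def render_page(flags):
    """Return the page parts to render; the risky part is behind a flag."""
    parts = ["article"]
    if flags.is_enabled("realtime_comments"):
        parts.append("realtime_comments")
    return parts
```

If the real-time feature starts melting servers, calling `flags.disable("realtime_comments")` removes just that part while the rest of the page keeps working, which is a much smaller step than a full rollback.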

Slide 76

Slide 76 text

Tests Unit Functional Integration Acceptance etc

Slide 77

Slide 77 text

Automated Tests • If you want to trust your Devs, you need tests • For Legacy apps we wrote Integration Tests • New Apps & Refactored Legacy Parts have Unit Tests • Continuous Integration to make sure tests run

Slide 78

Slide 78 text

Step #4: Devs Use The Ops Tools

Slide 79

Slide 79 text

Devs can grok salt

Slide 80

Slide 80 text

Safe Environment For Devs to Learn salt \* cmd.run "rm -rf /tmp /*" Salt is awesome, but it can’t recover from that

Slide 81

Slide 81 text

Dev Salt Master Devs Can Look Into Every Server

Slide 82

Slide 82 text

Dev Salt Master • Every server has two minions: • Admin Salt (aka root) • Dev Salt (aka bob) • Each connects to a different master server: • All Devs have access to Dev Salt Master • Trusted Devs get access to Admin Salt Master
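One way to picture the two-minion setup is as two separate minion configurations on the same box, each with its own master, user, and working directories. The sketch below generates such a pair. The hostnames, directory layout, and helper names are assumptions made up for this example; the setting names (`master`, `user`, `id`, `pki_dir`, `cachedir`, `sock_dir`, `log_file`, `pidfile`) are standard Salt minion options. The directories for the unprivileged minion would need to be writable by that user.

```python
def minion_config(role, master, user, minion_id):
    """Return settings for one of the two minions on a server.

    Every path includes the role so the two minions never share state.
    """
    return {
        "master": master,
        "user": user,
        "id": minion_id,
        "pki_dir": "/etc/salt-%s/pki" % role,
        "cachedir": "/var/cache/salt-%s" % role,
        "sock_dir": "/var/run/salt-%s" % role,
        "log_file": "/var/log/salt-%s/minion" % role,
        "pidfile": "/var/run/salt-%s/minion.pid" % role,
    }


def render(config):
    """Render the settings as simple 'key: value' lines."""
    return "\n".join("%s: %s" % (key, config[key]) for key in sorted(config)) + "\n"


# Hypothetical hostnames; "bob" is the unprivileged user from the slide.
admin = minion_config("admin", "salt-admin.example.com", "root", "web01")
dev = minion_config("dev", "salt-dev.example.com", "bob", "web01")
```

Because the dev minion runs as an unprivileged user, a command sent from the Dev Salt Master can look around a server but cannot do what root can.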

Slide 83

Slide 83 text

Everything Salty in Git Reminder:

Slide 84

Slide 84 text

Dev Environment • Developers own the Dev Environment • Dev Teams manage the Salt States for their Env • Vagrant + Salt for their Env • Who makes changes? Developers • DevOp helps advise & offer support

Slide 85

Slide 85 text

Team Structure Dev Ops

Slide 86

Slide 86 text

Stage Environment • Stage & Production use the same salt repos, different branches • Developers make all the changes for Application Servers • All Changes through Pull Requests • We’ll worry about env changes before code • Small changes we release quickly; large or long-running branches are scary & dangerous

Slide 87

Slide 87 text

Production Environment • Merge change to Production Branch • salt \* state.highstate • Reminder: Small quick changes over time, never a large change at once.
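In the spirit of small, safe changes, a highstate can be previewed before it is applied: Salt's `state.highstate` accepts `test=True` to report what would change without changing it. The sketch below builds both commands; the helper names are hypothetical and the dry-run-first order is a suggestion, not something the slides prescribe.

```python
import subprocess


def highstate_command(target="*", dry_run=False):
    """Build the salt command as an argument list (no shell, so '*' is literal)."""
    cmd = ["salt", target, "state.highstate"]
    if dry_run:
        cmd.append("test=True")  # report changes without applying them
    return cmd


def rollout_plan(target="*"):
    """Dry run first, then the real highstate."""
    return [highstate_command(target, dry_run=True), highstate_command(target)]


def run_plan(plan):
    """Run each command in order, stopping at the first failure."""
    for cmd in plan:
        subprocess.check_call(cmd)
```

Passing the target as a list element rather than through a shell also avoids the quoting problem that the backslash in `salt \* ...` works around.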

Slide 88

Slide 88 text

Environment Caveats • Ops & DevOps manage VM Hosts, Physical Load Balancers, Firewalls, etc • Ops & DevOps manage servers that deal with data: • MySQL • MongoDB • etc

Slide 89

Slide 89 text

Mentoring Devs

Slide 90

Slide 90 text

Mentoring Devs • Not every Dev will become an amazing DevOp • That’s okay!

Slide 91

Slide 91 text

Level of “DevOps” Skills • Thinks about their impact on Ops: Everyone • Able to debug issues with production: Most • Able to make changes to environments: Many • “Awesome DevOp”: Some

Slide 92

Slide 92 text

So Everything Is Awesome for us, right?

Slide 93

Slide 93 text

Honesty Slide: We Have Skeletons In Our Closets

Slide 94

Slide 94 text

Where We Are At • All Dev Environments using Vagrant + Salt • All New Stage & Prod Environments are Salty • Some Legacy Stage & Production Envs are Salty • Continuously working on getting our stuff salty.

Slide 95

Slide 95 text

Making This Work For Your Team

Slide 96

Slide 96 text

Honest Introspection • Determine for your team what are your… • Strengths • Weaknesses • Problems • Goals

Slide 97

Slide 97 text

Increase Team’s Insight • Make sure devs can see & understand how their code performs • Increase responsibility of team for those metrics. • If they break it, they fix it. Do not always bail them out. • Everyone can see everything.

Slide 98

Slide 98 text

Mentor Those With Desire / Aptitude • Give Developers Safe Environment to Learn • Let them submit code-reviewed changes for Stage & Production • When teaching / mentoring, let the learner drive, kindly offer advice and help. • It takes time, but it is worth the investment.

Slide 99

Slide 99 text

A Few Final Thoughts

Slide 100

Slide 100 text

Team Culture Matters

Slide 101

Slide 101 text

Positive Influence

Slide 102

Slide 102 text

Questions?

Slide 103

Slide 103 text

Thank You Justin Carmony Email: [email protected] Twitter: @JustinCarmony IRC: carmony #salt #uphpu Website: [email protected]

Slide 104

Slide 104 text

p.s. we’re hiring, email / pm / tweet me