Slide 1

Slide 1 text

Developing Exceptional Architecture Design with Open Source and DevOps

Slide 2

Slide 2 text

Chris Wahl
Chief Technologist, Rubrik
[email protected]
chriswahl
@ChrisWahl

Slide 3

Slide 3 text

Agenda
• What The Heck is Going On?
• Learning from Open Source
• Practice what you Preach
• So, What Does This Look Like?
• Interesting Use Cases!
• Parting Thoughts

Slide 4

Slide 4 text

What The Heck is Going On?
A fancy version of “Start with Why”

Slide 5

Slide 5 text

The Counter-Industrial Revolution
“At the cusp of the 20th and 21st centuries, intangible assets overtook tangible assets in the economy”
• Computerized information
• Innovative property
• Economic competencies

Slide 6

Slide 6 text

The Counter-Industrial Revolution
1. The optimal scale for digital assets is global.
2. The home has re-emerged as a workplace for teleworkers.
3. The gig economy has sprung up at the expense of wage-based employment.
4. Many firms delay IPO, preferring to remain private because they regard venture capitalists as better positioned to understand the value of their intangible assets.

Slide 7

Slide 7 text

The Counter-Industrial Revolution
Spiral Motion
• Linear building → Circular evolution
• Batch time → Real-time
• Physical I/O → Data (& memes) I/O

Slide 8

Slide 8 text

With that said …

Slide 9

Slide 9 text

Link: https://twitter.com/kevinbehr/status/1098707964721094656

Slide 10

Slide 10 text

Link: https://twitter.com/ChrisWahl/status/1080986977485406208

Slide 11

Slide 11 text

Learning from Open Source
A sprinkle of development, a dash of chaos, and a pinch of creativity

Slide 12

Slide 12 text

Open Source Projects are Hard
It all stems from why, how, and what.
• How do you get folks interested?
• What carrots or sticks do you have available?
• How many goats can you herd at once?
• How do you scale?
If you don’t believe in the mission, no one else will either.

Slide 13

Slide 13 text

That should sound a bit familiar

Slide 14

Slide 14 text

Overview – 2011 Project
• Managing a few thousand pizza boxes
• Working on a very small team
• Leaving behind cheeky MOTDs for the lulz

Slide 15

Slide 15 text

Challenges
• Team of two
• Customers: 500+ developers around the world
• Environment scaled in really wonky ways, daily

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Looked for Inspiration Elsewhere
• Found the OpenStack project
• Figured Nova and Cinder would be interesting projects
• Knew enough Python to get messy
• Largely just interested in large-scale development
• How does the other side operate?
• What lessons could I adapt to ops?

Slide 18

Slide 18 text

Learnings
• My “two pizza team” was kind of nice
• The grass wasn’t any greener, just a different shade of sad
• DVCS is dope!

Slide 19

Slide 19 text

Learnings
Operations often gets in the way
• Sometimes for good – security, performance, capacity planning
• Sometimes for bad – lack of understanding, silos, NIMBY

Slide 20

Slide 20 text

Takeaways
• We adopted Git as our DVCS
• Peer reviews = death to CABs!

Slide 21

Slide 21 text

Takeaways
Contacted the facilities team to get a seating chart
• Found where the dev leaders sat
• Went to lunch, talked about our challenges
• Shared knowledge and roadmap to align efforts
• Borrowed a dev to clean up my messy code + ServiceNow workflows
I learned that communication is always* the problem

Slide 22

Slide 22 text

*but sometimes the problem is just DNS

Slide 23

Slide 23 text

Practice what you Preach
A different approach to operating a team

Slide 24

Slide 24 text

Some call this DevOps ¯\_(ツ)_/¯

Slide 25

Slide 25 text

Link: https://twitter.com/ScribblingOn/status/1101548686344114176

Slide 26

Slide 26 text

Achievement Unlocked: Developer

Slide 27

Slide 27 text

Link: https://twitter.com/srockets/status/1097642296261074945

Slide 28

Slide 28 text

Technical debt is a tradeoff for speed.
Eventually, you need to pay it down.

Slide 29

Slide 29 text

Link: https://twitter.com/tactical_intel/status/1101717575388487680

Slide 30

Slide 30 text

Infra vendors

Slide 31

Slide 31 text

So, What Does This World Look Like?
How the sausage is CI/CD pipelined

Slide 32

Slide 32 text

As It Applies to Rubrik
• 100% distributed team, globally
• Workflows, documentation, and pipelines drive almost everything
• Building “all the things” in two on-prem facilities and three clouds (Azure, AWS, GCP) in a variety of regions
• Default = native tools > overlays
• We have no dedicated “ops” people
• The work is never done, but that’s OK

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

We operate as engineers
There are some folks who are pretty good at advocating for this:
“Netflix’s engineering culture is predicated on Freedom & Responsibility, the idea that everyone (and every team) at Netflix is entrusted with a core responsibility and they are free to operate with freedom to satisfy their mission.”
This is the premise behind contractual operations

Slide 35

Slide 35 text

My team’s skill requirements:
• Learn a language
• Learn Git
• Learn REST (and GraphQL)
• Act like an adult

Slide 36

Slide 36 text

Contractual Operations
Any system we create or operate is too complex for any one person or team to know all of it.
Take API contracts as an example:
• A machine-readable definition of an API interface (e.g. Swagger).
• A definition of the surface area of the resources that are available.
• A way to describe, communicate, and collaborate around APIs.
• These are often extremely abstract.
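
As a rough illustration of what checking against an API contract could look like, here is a minimal Python sketch. The `CONTRACT` structure, the endpoint path, and the field names are all hypothetical; they only loosely mirror the idea of a Swagger response schema and are not Rubrik's API or the OpenAPI format itself.

```python
# Hypothetical contract for one endpoint. The path and field names are
# illustrative assumptions, not taken from any real API.
CONTRACT = {
    "path": "/v1/clusters/{id}",
    "method": "GET",
    "response": {
        "required": {"id": str, "name": str, "node_count": int},
    },
}


def violations(contract, payload):
    """Return human-readable contract violations for a response payload."""
    problems = []
    for field, expected_type in contract["response"]["required"].items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems
```

A consumer and a producer can both run a check like this in their pipelines, so neither side needs to understand the other's internals, only the agreed surface area.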

Slide 37

Slide 37 text

Contractual Operations: Requirements
• Clearly define your inputs and outputs
• Document the heck out of everything
• Let people know when things change
• Be creative within that realm
• Build for scale
• Unless it works idempotently, it’s wrong
• Assume you will spontaneously combust at any moment
• Embrace Service Ownership
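
To make the idempotency requirement concrete, here is a small Python sketch under stated assumptions: `ensure_tags` and the `vm` dictionary are hypothetical stand-ins for whatever resource a real workflow manages. The point is that the function describes a desired end state, so running it twice gives the same result as running it once.

```python
def ensure_tags(resource, desired):
    """Converge resource['tags'] on the desired tags.

    Returns True if anything had to change, False if the resource
    already matched. Safe to call any number of times.
    """
    current = resource.setdefault("tags", {})
    changed = False
    for key, value in desired.items():
        if key not in current or current[key] != value:
            current[key] = value
            changed = True
    return changed


# Hypothetical resource; a real one would come from an API call.
vm = {"name": "build-agent-01"}
first_run = ensure_tags(vm, {"owner": "platform", "env": "ci"})
second_run = ensure_tags(vm, {"owner": "platform", "env": "ci"})
```

Here the first run makes changes and the second run is a no-op, which is what makes the operation safe to retry from a pipeline.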

Slide 38

Slide 38 text

This works regardless of locality or topology

Slide 39

Slide 39 text

Production Readiness
❑ Security
❑ Logging
❑ Monitoring
❑ Alerting
❑ Observability (ephemeral)
❑ Enabling Features
❑ Testing
❑ SLOs
❑ Costs
❑ Performance
❑ Deployment / Build
❑ Enablement
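
One possible way to treat the checklist above as a pipeline gate rather than a slide is sketched below in Python. The item names come from the slide; the `readiness_gaps` function and the idea of passing in a set of completed items are assumptions made for illustration.

```python
# Checklist items as listed on the slide.
CHECKLIST = [
    "Security", "Logging", "Monitoring", "Alerting",
    "Observability (ephemeral)", "Enabling Features", "Testing",
    "SLOs", "Costs", "Performance", "Deployment / Build", "Enablement",
]


def readiness_gaps(completed):
    """Return checklist items not yet signed off, in checklist order."""
    return [item for item in CHECKLIST if item not in completed]


def is_production_ready(completed):
    """A service is ready only when no checklist item is outstanding."""
    return not readiness_gaps(completed)
```

A release job could call `readiness_gaps` and fail the build with the returned list as its error message.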

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

Musings on Building for Scale
• Consider everything an artifact that needs a unique ID
• Use declarative languages, DSLs, or native tooling when possible
• Suggest changes in DVCS (Git), not by hand or dead tree (paper)
• Style guides are dope
• No tests = stop, just stop!
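
The "unique ID for every artifact" idea can be sketched in Python as a content-derived identifier. This is only one possible scheme, assumed here for illustration: the `artifact_id` helper, the `kind` prefix, and the 12-character truncation are all hypothetical choices, not something the talk prescribes.

```python
import hashlib
import json


def artifact_id(kind, content):
    """Derive a stable ID from an artifact's kind and JSON-serializable content.

    Keys are sorted before hashing so that two dictionaries with the same
    data always produce the same ID, regardless of key order.
    """
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{kind}\n{canonical}".encode("utf-8")).hexdigest()
    return f"{kind}-{digest[:12]}"
```

Because the ID depends only on content, rebuilding the same definition yields the same ID, and any change to the definition yields a new one.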

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

Things My Team Likes
• Azure DevOps is fairly solid as a build / release tool
• Jira works for project management, Kanban boards, and getting the attention of other engineers ☺
• IBM Red Hat CloudForms / ManageIQ (cloud management)
• Terraform (config management; built on our Go SDK)
• Ansible (config management; built on our Python SDK)

Slide 45

Slide 45 text

In Conclusion

Slide 46

Slide 46 text

Thoughts
• Communicate your plans
• Find pain points to solve and innovation to copy
• Design out in the open
• Don’t consume anything that doesn’t have awesome APIs
• Test everything
• Assume good intent
• Only punish those who build by hand (mistakes happen)
• Version control everything
• Pipeline everything
• Test everything (!!!)

Slide 47

Slide 47 text

Thank you, wonderful humans!
[email protected]
chriswahl
@ChrisWahl