The Two Sides to Google Infrastructure for Everyone Else
My talk from Velocity Santa Clara. The format was a debate, with myself, looking at the pros and cons of adopting software and practices from other organisations wholesale, using the GIFEE (Google Infrastructure for Everyone Else) meme as an example.
Fewer than 1% of US workers are software engineers or programmers. Source: US Bureau of Labor Statistics, 2002: 1,069,000 jobs in a working-age population of 185 million (about 0.6%). Gareth Rushgrove
“Goal of SRE team isn’t zero outages – SRE and product devs are incentive aligned to spend the error budget to get maximum feature velocity.” Dan Luu, ex-Google. http://danluu.com/google-sre-book/
“If a human operator needs to touch your system during normal operations, you have a bug. The definition of normal changes as your systems grow.” Carla Geisser, Google SRE
“Your startup with a single-purpose application does not have the luxury of having your operations team say ‘I’m sorry you’re over your error budget.’” John Vincent, Ops Hero
(without introducing more risk) The field of sociotechnical systems suggests that all human systems include both a technical system and a social system. https://en.wikipedia.org/wiki/Coevolution#Technological_coevolution
Better outcomes are usually obtained by a reciprocal process of joint optimization, through which both the technical system and the social system change. https://en.wikipedia.org/wiki/Coevolution#Technological_coevolution