Move Fast and Break Nothing

Zach Holman
October 01, 2014

Move fast and break things. Scale later. Ship it. They're the rallying cries of our industry. While they've helped propel us forward, they're still crude solutions. We're moving fast, but what specifically is okay for us to break, and how much can we break it? How do we quickly ship really difficult changes? The next step is being smarter about the answers to these questions. Turns out, there are some really cool approaches we can take these days to stay fast while our companies grow.

You can also read the full text accompaniment to this talk at: http://zachholman.com/talk/move-fast-break-nothing/

Transcript

  1. move fast & break nothing

  2. None
  3. “Move fast and break things” — Mark Zuckerberg, Facebook

  4. “Move fast and fix things” — Mark Zuckerberg, Facebook

  5. “Move fast with stable infra” — Mark Boringberg, Facebook

  6. “Move fast and be bold” — Miguel Velazquez, Facebook

  7. “Drop the ‘the’ — it’s cleaner.” — JUSTIN TIMBERLAKE

    SWOOOOOON
  8. moving quickly and safely is good philosophy

  9. what does it mean in practice?

  10. @holman

  11. @holman!

  12. None
  13. None
  14. got me thinking about building environments where success is a byproduct
  15. 3 areas: code, process & talk

  16. code

  17. move fast

  18. people like fast: better product

  19. people like fast: see what works

  20. people like fast: momentum

  21. and break things

  22. and break things: fix later and… stuff

  23. some things you can’t break

  24. billing · permissions · upgrades · migrations

  25. how do you move fast in dicey code?

  26. we redid github’s permissions code

  27. hella scary.

  28. .01% errors = unacceptable

  29. 0 errors = mandatory

  30. how do we ship this fast + safe?

  31. tests are great, of course

  32. production can differ from what your tests expect

  33. production can differ from what you even expect

  34. ensure that your code doesn’t change production behavior

  35. parallel codepaths

  36. request

  37. request → user.old_code (always gets run)

  38. request → user.old_code (always gets run), user.new_code (also gets run)

  39. request → user.old_code (always gets run), user.new_code (also gets run) → response

  40. run both, compare

  41. /github/dat-science

  42. science "new-auth" do |e|
        e.control { user.slow_auth }
        e.candidate { user.fast_auth }
      end

  43. always runs and returns your result:
      science "new-auth" do |e|
        e.control { user.slow_auth }
        e.candidate { user.fast_auth }
      end

  44. can be run as a percentage:
      science "new-auth" do |e|
        e.control { user.slow_auth }
        e.candidate { user.fast_auth }
      end

  45. collects the two results:
      science "new-auth" do |e|
        e.control { user.slow_auth }
        e.candidate { user.fast_auth }
      end
  46. attempts vs. mismatches

  47. 75th and 99th percentile performance

  48. quickly iterate until you can prove safety

  49. build into your existing process

  50. what do you have today that you can already use?

  51. each layer of process added is expensive

  52. instead grow process laterally

  53. tests rule everything around me

  54. ci as code maintenance

  55. failing test: removing html class without cleaning css

  56. failing test: removing css without removing html class

  57. failing test: adding <img> that’s not on our cdn

  58. failing test: invalid css (scss linting)

  59. automates gruntwork out of code review

  60. we test blog posts, too

  61. failing test: images off our cdn

  62. failing test: image non-retina or too big

  63. failing test: using an oft-used phrase

  64. process

  65. what’s important?

  66. feature launches need: designers · developers · product managers · qa · lawyers · marketing · ops
  67. text files are rad

  68. anpp (apple new product process)

  69. anpp (apple new product process): giant-ass checklist

  70. anpp (apple new product process): lists at tim-level and small team-level

  71. anpp (apple new product process): happens as very first step
  72. None

  73. ship lists

  74. ship lists: happens often as last step

  75. ship lists: all the todos for teams for their +1
  76. None
  77. simple todo lists make you think less

  78. ownership

  79. don’t throw code over the wall

  80. tie code to writers

  81. tie code to writers: notifies you if you screw up
  82. None
  83. authors know their own code

  84. authors should fix their bugs

  85. how can your process help out?

  86. tie code to writers

  87. tie code to writers

  88. tie code to writers: chrome has owner files

  89. tie code to writers: dictates responsibility

  90. tie code to writers: makes mentors visible

  91. tie code to writers

  92. tie code to writers: per-file ownership

  93. tie code to writers:
      class BranchesController
        areas_of_responsibility :git
      end

  94. None
  95. None
  96. tie code to writers: issues get assigned to the appropriate team
  97. talk

  98. more communication! yayyyyyyyy crap

  99. email · meetings · chat · issues · im · on-call · video · al

  100. more is good, better is better

  101. most things aren’t emergencies

  102. be mindful of time: most things aren’t emergencies

  103. be mindful of time: measure impact of your actions

  104. imagine:

  105. 1. page a coworker for help
       2. they get woken up
       3. their phone takes a selfie of them
       4. selfie gets posted into chat
  106. None
  107. would this make you respect your coworker’s time more?

  108. add empathy to your process

  109. empathy stems from being exposed to real pain

  110. everyone should do some support

  111. there’s a difference between reading about it & helping with it
  112. institutional teaching

  113. “[we] have a responsibility to be teachers—that this should be a central part of [our] jobs” — Ed Catmull, Creativity, Inc.

  114. “…it’s just logic that someday we won’t be here.” — Ed Catmull, Creativity, Inc.
  115. more productive if workers are smart

  116. how do you teach without making it lame?

  117. chatops: learning by osmosis

  118. app crashes. here’s a normal flow:

  119. app crashes. here’s a normal flow:
       1. employee gets paged
       2. they ssh into… something
       3. they fix it… somehow
  120. no visibility means no teaching

  121. no visibility means no improvement

  122. HUBOT

  123. app crashes. here’s a better flow:
       1. employee gets paged
       2. they manage it in a chat room
       3. they fix it and people can watch
  124. visibility means teaching

  125. visibility means improvement

  126. real-life debugging

  127. github’s wifi is down: floor two, san francisco (right now!)

  128. None
  129. None
  130. None
  131. None
  132. debug in the open

  133. feedback

  134. the blue angels on feedback

  135. None
  136. two-way street

  137. two-way street: how well do you receive feedback?

  138. two-way street: how well do you give feedback?

  139. improving feedback across an org

  140. None
  141. moving fast with a degree of caution

  142. need to be fast

  143. need to be safe

  144. takes effort to achieve both

  145. @holman
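
Slides 35 to 48 describe running the old and new code side by side and comparing the results. The following is a minimal Ruby sketch of that idea and not the dat-science library itself: the `Experiment` class, its `percent:` option, the `stats` hash and the `percentile` helper are hypothetical names chosen for illustration. It assumes the control block always runs and supplies the return value, that the candidate runs only for a sampled share of calls, and that a mismatch or a candidate exception is counted instead of reaching the user.

```ruby
# Minimal sketch of a parallel-codepath experiment.
# Hypothetical class for illustration; this is not the dat-science API.
class Experiment
  attr_reader :stats

  def initialize(name, percent: 100, rng: Random.new)
    @name = name
    @percent = percent # share of calls (0..100) that also run the candidate
    @rng = rng
    @stats = { attempts: 0, mismatches: 0, control_ms: [], candidate_ms: [] }
  end

  def control(&block)
    @control = block
  end

  def candidate(&block)
    @candidate = block
  end

  # Always runs the control and returns its result. The candidate runs for a
  # sampled percentage of calls and can never change what the caller sees.
  def run
    control_value, control_ms = timed(&@control)
    return control_value unless @candidate && @rng.rand(100) < @percent

    @stats[:attempts] += 1
    @stats[:control_ms] << control_ms
    begin
      candidate_value, candidate_ms = timed(&@candidate)
      @stats[:candidate_ms] << candidate_ms
      @stats[:mismatches] += 1 unless candidate_value == control_value
    rescue StandardError
      # A crashing candidate is recorded as a mismatch, not raised to the user.
      @stats[:mismatches] += 1
    end
    control_value
  end

  # Nearest-rank percentile of the timings stored under :control_ms or
  # :candidate_ms, e.g. percentile(:candidate_ms, 99).
  def percentile(key, pct)
    sorted = @stats[key].sort
    return nil if sorted.empty?

    sorted[((pct / 100.0) * sorted.size).ceil - 1]
  end

  private

  # Returns [value, elapsed milliseconds] for the given block.
  def timed
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    value = yield
    elapsed = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
    [value, elapsed]
  end
end

# Usage with stand-in blocks (the slides use user.slow_auth / user.fast_auth).
experiment = Experiment.new("new-auth", percent: 100)
experiment.control   { :allowed }
experiment.candidate { :allowed }
experiment.run # => :allowed
```

Watching `stats[:mismatches]` against `stats[:attempts]`, together with the 75th and 99th percentile timings, is one way to "quickly iterate until you can prove safety" before the new path replaces the old one.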
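
Slides 54 to 61 list CI failures such as an `<img>` that is not on the CDN, or an HTML class and its CSS drifting apart. The slides do not show how GitHub implements these checks, so the following is only a rough sketch of what such checks could look like. The helper names and the `cdn.example.com` host are assumptions, and regular expressions stand in for a real HTML or CSS parser.

```ruby
require "uri"
require "set"

# Hypothetical CDN host, used only for illustration.
CDN_HOST = "cdn.example.com"

# Returns the src of every <img> tag that is not served from cdn_host.
# Relative and unparseable URLs are reported as off-CDN.
def images_off_cdn(html, cdn_host = CDN_HOST)
  html.scan(/<img\b[^>]*?\bsrc=["']([^"']+)["']/i).flatten.reject do |src|
    begin
      URI.parse(src).host == cdn_host
    rescue URI::InvalidURIError
      false
    end
  end
end

# Class names used in class="..." attributes.
def html_classes(html)
  html.scan(/\bclass=["']([^"']*)["']/i).flatten.flat_map(&:split).to_set
end

# Class names that appear in CSS selectors such as `.avatar {`.
def css_classes(css)
  css.scan(/\.(-?[A-Za-z_][\w-]*)/).flatten.to_set
end

# Classes present on only one side: styled but unused, or used but unstyled.
def orphaned_classes(html, css)
  used = html_classes(html)
  styled = css_classes(css)
  { unused_css: (styled - used).to_a.sort, unstyled_html: (used - styled).to_a.sort }
end
```

A test suite could fail the build whenever `images_off_cdn` or either list in `orphaned_classes` is non-empty, which takes this kind of gruntwork out of code review.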
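
Slide 93 shows only the declaration `areas_of_responsibility :git`, and slide 96 says issues get assigned to the appropriate team. One way such a class-level macro could be wired up is sketched below. The `Ownership` module, the `TEAM_FOR_AREA` table and the `@git-team` handle are all hypothetical, since the slides do not show the implementation.

```ruby
# Hypothetical sketch of a class-level ownership macro; the real
# implementation behind the slide is not shown in the talk.
module Ownership
  # Maps each class to the areas declared for it.
  def self.registry
    @registry ||= {}
  end

  # Classes that declared the given area.
  def self.owners_of(area)
    registry.select { |_klass, areas| areas.include?(area) }.keys
  end

  # Class macro, used as: areas_of_responsibility :git
  def areas_of_responsibility(*areas)
    Ownership.registry[self] = areas
  end
end

class BranchesController
  extend Ownership
  areas_of_responsibility :git
end

# Hypothetical area-to-team table used to decide who gets notified.
TEAM_FOR_AREA = { git: "@git-team" }.freeze

# Teams to assign when something in klass breaks; empty if nothing is declared.
def teams_for(klass)
  Ownership.registry.fetch(klass, []).map { |area| TEAM_FOR_AREA[area] }.compact
end

teams_for(BranchesController) # => ["@git-team"]
```

Keeping the declaration inside the class means ownership travels with the code, so an exception tracker or issue bot can look up the team from the class that failed.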