Choose Boring Technology

When you're starting or running a company, how do you choose technology? The prevailing advice du jour is something along the lines of "use the best tool for the job." This is obviously right, but it is also devoid of meaning in an unfortunate way that lets people define "best" and "job" as myopically as they like.

Dan McKinley

July 23, 2015

Transcript

  2. Hey

  3. I’m Dan McKinley. That’s me in the hole. It’s a metaphor.

  4. I work for a company called Stripe. Before that, I was an early employee at Etsy, where I worked for a lot of years. I
    acquired a great deal of practical experience at Etsy so I’m going to be referring to my time there a lot.

  5. Etsy wasn’t mature as an engineering organization when I got there, but I was eventually spoiled rotten when it came
    to technology culture at Etsy.
    As I’ve gone back into the wider tech world, I’ve had to confront some questions I hadn’t really considered in a few
    years.
    And I’ve realized I have opinions about these things. That’s what this talk is about.

  6. So how do you choose technology? This was eventually more or less handled for me at Etsy. Now I need to worry
    about it again.

  7. You can achieve anything with software. And I definitely believe that companies don’t usually succeed or fail because
    of specific technical choices.
    But technology choices are relevant. They affect how straight the path is between you and achieving your goals. They
    affect your efficiency.

  8. Another question that I care about is: how do you make developers happy? This matters to me, as a developer. But
    also as a leader in a software organization—productivity and retention rely on this.

  9. If you ask developers, a lot of them will tell you they’re happy when they’re working with technology that’s exciting. Or
    another thing they say is that they like to work on hard problems.
    Those things may or may not be true. But what I’ve learned with experience is that it’s not really the case at the
    highest levels of fulfillment. They don’t talk about nodejs in heaven.

  10. You can probably tell from the title of this talk that I don’t think that chasing shiny technology is right. But look, I’ve
    been there.
    I, too, once chased shiny technology.

  11. Etsy early on was a big ball of PHP, written by an overall brilliant person who was unfortunately learning PHP as he
    was writing it.
    I spent years trying to avoid dealing with the results of that. At one point I tried building Scala services that talked to
    MongoDB.
    I wrote blog posts about this that Etsy employees are still giving me shit about. And with good cause.

  12. I think it’s fair to say that I’m a completely different kind of engineer now. I tend to be focused on things that are only
    vaguely engineering. I talk at design conferences, or in the “business” track. I care a lot more about product than your
    average engineer.

  13. I view this less as the result of getting old and cranky and more as the result of climbing up Maslow’s hierarchy of
    needs. Maslow’s hierarchy, briefly, is the idea that you have to satisfy your more basic needs before higher levels of
    intellectual fulfillment are possible.

  14. The same is basically true about software. You can’t ask intelligent questions about the direction of the product if
    you’re worried about which database to use or which alerting system to use.
    In my career to date, I’ve been pretty lucky to have my most basic needs fulfilled. And I want to help get others to this
    state.

  15. So, try to think of me as a time traveler from your future. I’ve been through the shiny technology wars you might be
    fighting today. It’s better over here. The air is fresher. Food tastes better.

  16. So, on to the problem of choosing technology. A thing that I think is obviously true is that as human beings, we have
    limited attention. We can only worry about so much stuff at one time.

  17. I personally model that like this. You could say that we all get a limited number of innovation tokens to spend. This is a
    construct I just made up, but I think it’s helpful. And since I created this currency I also decided to put Elon Musk on it.
    These represent our limited capacity to do something creative, or hard. We really don’t have that many of these to
    allocate. Early on in a company’s life, we get like maybe three. Not too many more than that.

  18. So what’s your company trying to do? Well, Etsy, where I used to work, is trying to reshape the world economy.

  19. I dunno, that sounds like a big job. That probably requires at least one of your tokens.

  20. The company where I work now is trying to increase the GDP of the internet.

  21. Again, that sounds like a pretty complicated thing to be doing. We probably have to spend at least one of our tokens
    on that. Maybe two. Maybe all of them!

  22. If you think about innovation as a scarce resource, it starts to make less sense to be on the front lines of innovating on
    databases. Or on programming paradigms.
    The point isn’t really that these things can’t work. Of course they can work. But exciting new technology takes a great
    deal more attention to work than boring, proven technology does.

  23. To get at the reason for that I want to talk about the philosophy of knowledge a little bit. What can we know about a
    piece of technology? This is not actually a frivolous question. It’s really important.

  24. Now look, I don’t like Donald Rumsfeld. But he’s associated with the following, which is thoroughly relevant to our
    subject.

  25. And that’s this. When we don’t know something, there are really two different categories that that lack of knowledge
    can be in.
    There are known unknowns, that is, things that we know that we don’t know. And there are unknown unknowns, things
    that we don’t know and that we don’t know that we don’t know.

  26. This applies in technology. This is an example of a known unknown. For a given database, we might not know what
    happens when a network partition occurs. But we know that a network partition is possible. Since we know that this is
    possible, we can test for this. Or we can just cross our fingers and hope that it doesn’t happen. Either way, we are
    informed about the possibility.

  27. There are also unknown unknowns in technology. This is a good example I saw a few months ago. This person had a
    java process that was writing stats to a file, and that was causing GC pauses. It took him forever to figure this out
    because the possibility hadn’t occurred to him. That’s an unknown unknown.

  28. Now, it’s important to realize that both categories are present in all software. There are always bugs that nobody
    knows about, even in software that’s been around forever.

  29. But it’d be wrong to say that all technology is therefore equivalent. New technology has larger magnitudes for both of
    these sets.
    New tech typically has more known unknowns, and many more unknown unknowns. And this is really important.

  30. Boring technology in a nutshell is technology that’s well understood. We know what it’s capable of, and at least as
    importantly, we also know what it’s not capable of. We know how boring technology fails.

  31. So, ok, all you have to do is pick proven technology, and you’re all set, right? Well, no. The combination of things that
    you choose also matters.

  32. Let’s say that you’re already using this stack. You have python, memcached, mysql, and apache.

  33. Let’s say you have a new problem to solve. Do you think it makes sense to add ruby to your existing stack?

  34. I think most people’s intuition there is “probably not.” We know that the marginal utility of adding ruby isn’t going to
    outweigh the complexity hit we take by adding it. Python and ruby feel pretty equivalent.

  35. And we’ve had formal proofs since the 1930s that all problems can in principle be solved with one or the other.

  36. Ok, so how about adding redis? We already have mysql and memcached, but should we add redis?

  37. About here is where people lose it and start beating the polyglot programmer drums. There’s something about the idea
    of adding a new database that has people storming the Bastille, saying “you can’t stop us from using the best tool for
    the job!”
    People tend to think that what they're doing when they acquiesce to this is that they're giving developers freedom. And
    sure, it is freedom, but it's freedom very narrowly defined.

  38. What’s going on there? Let’s try to tease this apart.

  39. This is what we’re implicitly saying when we want to add a piece of technology.
    Except in relatively rare cases where it’s not possible to solve a problem with our existing stack, we’re saying that the
    new tech is going to be so much better in the near term that this benefit outweighs the cost of having two pieces of
    technology around in perpetuity.

  40. We can actually start to formalize this idea, and think about it in a structured way.

  41. Well, sort of. I don’t expect to see this published in ACM. But here goes.

  42. Your job is basically what my friend Coda says, here. You’re supposed to be solving business problems with
    technology.

  43. We can model that as a bipartite graph. On the left side we have business problems, and on the right side we have
    technical solutions.

  44. As practitioners we have to try to connect all of the nodes on the left side so that our problems are solved. Adding an
    edge here is making a technology choice.

  45. Every choice has a maintenance cost, but we also get the benefit of the technology that we choose.

  46. Every choice has maintenance costs, but every choice also helps us solve the problem. So we have a nonzero benefit,
    and a nonzero cost for every choice.

  47. When we add more than one edge, we can make a choice. We can use the same technology that we’ve already paid
    for …

  48. Or we could pick a different piece of technology. We have to pay for that new tech, too, but maybe we get so much
    development velocity that it’s worth it.

  49. We can start to think about this mathematically. We’re trying to minimize this cost function. The total cost of our
    operations is all of the maintenance costs we take on from our choices, minus the development velocity we get from
    every choice.
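
As a rough sketch of that cost function: the formula from the slide itself isn't in this transcript, so the technology names, problem names, and numbers below are made up for illustration. The one modeling assumption is that each distinct technology is paid for once, while every edge contributes its own velocity.

```python
from typing import Dict, Tuple

# Hypothetical numbers, in arbitrary units; none of these come from the talk.
MAINTENANCE_COST: Dict[str, float] = {"mysql": 5.0, "memcached": 3.0, "redis": 4.0}

# An allocation maps each business problem to (technology, velocity benefit).
Allocation = Dict[str, Tuple[str, float]]


def total_cost(allocation: Allocation) -> float:
    """Maintenance for each distinct technology, minus velocity from every edge."""
    technologies = {tech for tech, _ in allocation.values()}
    maintenance = sum(MAINTENANCE_COST[tech] for tech in technologies)
    velocity = sum(benefit for _, benefit in allocation.values())
    return maintenance - velocity


# Reuse the technology we've already paid for ...
reuse: Allocation = {
    "sessions": ("memcached", 2.0),
    "activity feeds": ("memcached", 1.0),
}
# ... or add a second one that gives more velocity on one problem.
add_new: Allocation = {
    "sessions": ("memcached", 2.0),
    "activity feeds": ("redis", 3.0),
}

print(total_cost(reuse))    # maintenance 3.0, velocity 3.0
print(total_cost(add_new))  # maintenance 7.0, velocity 5.0
```

With these invented numbers reuse comes out cheaper. Raise the velocity benefit of the new technology far enough and the comparison flips, so the result depends entirely on which numbers you believe.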

  50. The way we behave really depends on what we believe about which term dominates this equation in the real world.
    If technology is really expensive to operate, the costs dominate. If technology really makes a huge difference in how
    easy your job is, the benefits dominate.

  51. So, depending, you might decide to make an allocation like this. Here we’ve picked many different technologies to use
    to solve all of our problems.

  52. And that makes complete sense if each additional technology choice is cheap.
    If we think that we get more out of using each new technology than we’ll pay for operationalizing it, then doing it this
    way makes sense.

  53. This is an alternative strategy. Here we’ve chosen just a few technologies,

  54. And that’s what we should do if we think that each technology we add comes with a lot of baggage.

  55. Here in reality, new technology choices come with a great deal of baggage.

  56. This is reality. Costs to operate a technology in perpetuity tend to outstrip the convenience you get by using something
    different.

  57. So this tends to be the right way to do it. We should generally pick the smallest set of tech that lets us get the job
    done.

  58. That’s the case because operating a piece of technology at a professional level turns out to be really hard. It’s easy to
    get started with a lot of technology, but harder to do a really good job with it.

  59. This is why. Adding the technology is easy, living with it is hard. These are all the things you have to worry about.

  60. Polyglot programming is not the kind of freedom we are looking for.
    If you’re giving individual teams or individual engineers free rein to make local decisions about infrastructure, you’re
    hurting yourself globally.
    It’s handing developers the chains so that they’re free to imprison themselves with operational toil, forever.

  61. There’s more to this than just avoiding operational overhead. By embracing polyglot programming, you’re also
    discarding real benefits that only arise when everyone’s using a shared platform.

  62. A good example of this from my experience is Etsy’s activity feeds. I built this with a small team back in 2010.

  63. Here’s a totally reasonable way to build activity feeds, if that’s all you’re trying to do. You could write events to mysql,
    aggregate them into a feed offline, stuff the feed into redis, and then serve the feeds to end users from redis. This
    would totally work great.

  64. But when we set out to build activity feeds, we didn’t have redis. We did have memcached. They’re sort of similar but
    they have very different guarantees. The most relevant difference to us here is that Redis is persistent, and memcached
    isn’t.
    We didn’t add redis to our stack to make activity feeds. We made do with what we had.

  65. And that required a good bit of extra effort up front. Since memcached isn’t persistent, we had to write a bunch of extra
    code to possibly generate the feed fresh for frontend requests. We couldn’t just assume that the feed would exist when
    the user came to the site.
    That was hard work we wouldn’t have had to do if we added redis, but we got through it.
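
Roughly, that extra code amounts to a read path that treats the cache as disposable. This is a minimal sketch, not Etsy's implementation: plain dicts stand in for memcached and mysql so it runs on its own, and `build_feed`, `get_feed`, the key format, and the sample events are all hypothetical.

```python
from typing import Dict, List

# Stand-ins so the sketch is self-contained: one dict for the non-persistent
# cache (memcached in the talk), one for durable event storage (mysql).
cache: Dict[str, List[str]] = {}
events_by_user: Dict[int, List[str]] = {
    42: ["alice favorited a shop", "bob listed an item"],
}


def build_feed(user_id: int) -> List[str]:
    """Aggregate a feed from durable events, newest first (hypothetical logic)."""
    return list(reversed(events_by_user.get(user_id, [])))


def get_feed(user_id: int) -> List[str]:
    """Serve from the cache when possible; regenerate when the entry is missing."""
    key = f"feed:{user_id}"
    feed = cache.get(key)
    if feed is None:
        # The cache can lose entries at any time, so a miss is a normal case,
        # not an error: rebuild from durable storage and store the result.
        feed = build_feed(user_id)
        cache[key] = feed
    return feed


first = get_feed(42)   # miss: built from events, then cached
cache.clear()          # simulate a cache restart or eviction
second = get_feed(42)  # miss again: rebuilt with the same result
```

The design choice is that nothing in the request path assumes the feed already exists, which is what a non-persistent cache forces on you.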

  66. Then we walked away. We didn’t do anything related to activity feeds for years after that.

  67. But a funny thing happened. The usage of activity feeds exploded by 20 times. And it was totally fine.
    This is the greatest purely technical achievement in my entire career.

  68. The reason it was totally fine was because we used the shared stack. We had to plug in more mysql shards and
    memcached boxes, but people were doing that anyway.

  69. If we’d done redis just for activity feeds, you can be sure that redis would have become distressed as the feature
    scaled up 20 times. And we would have had to go back and work on redis just to keep activity feeds working.

  70. Or more likely, someone else would have had to do it. Our team didn’t exist at all a year later; we were all working on
    different things. Making a mess for others to clean up strikes me as even worse. That’s what you’re doing by adding a
    piece of technology that makes sense locally.

  71. This is an example, but it’s not an absolute principle. Obviously sometimes it does make sense to add new technology
    to your stack.

  72. So I wanted to finish by talking about how we should go about doing that.

  73. First of all, it’s important to recognize that adding technology is a process. Technology has global effects on your
    company; it isn’t something that should be left to individual engineers.
    I don’t care if you’re a flat organization, a holacracy, or if you have 500 middle managers. You have to figure out how
    to talk to each other before you add new technology.

  74. When we were all using real hardware, it was usually the case that talking to at least one other person was necessary
    before adding something new. Now everybody’s on AWS, and this is no longer true. Engineers can sit in a corner and
    proliferate new systems all day.
    I don’t think that real hardware is a good thing on balance, but I do think that talking to people is a positive thing. We
    just have to work harder to do this now, and have those conversations on purpose.

  75. The first question you should talk about is how you’d solve the problem without adding anything new.

  76. I think that you’ll notice that pretty often, this is enough to end the conversation. Because a high percentage of the
    time, the problem to be solved is that someone wants to use a new piece of tech for its own sake. You should not
    entertain this impulse as a serious person.

  77. But anyway, assuming that you have a real problem, the answer is rarely that you can’t do it. If you have a functioning
    website of any kind and you think you can’t accomplish a specific new feature with what you already have, you’re
    probably just not thinking hard enough.
    You may need to resort to unnatural acts, but you can get pretty far with a minimal stack.

  78. Again, you might have to do really awkward things, and it’s possible that those are too costly. But you should talk
    about and write down what those things are.

  79. And if you decide to try out a new piece of technology, you should figure out low-risk ways to get started. Your tactic
    should not be to rewrite your entire application with it in one step. You should be proving the technology in production
    with minimal risk, and then gradually gaining confidence in it.
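
One low-risk way to get started is to send a small, stable slice of traffic to the new system and widen it as confidence grows. This is a minimal sketch under that assumption; the talk doesn't prescribe a mechanism, and `rollout_bucket`, `use_new_backend`, and the 5% figure are hypothetical.

```python
import hashlib


def rollout_bucket(user_id: int) -> int:
    """Map a user to a stable bucket in [0, 100) using a hash of the ID."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def use_new_backend(user_id: int, percent: int) -> bool:
    """True if this user falls inside the current rollout percentage."""
    return rollout_bucket(user_id) < percent


# Count how many of 10,000 sample users would hit the new system at 5%.
enabled = sum(use_new_backend(uid, 5) for uid in range(10_000))
print(f"{enabled} of 10000 users routed to the new backend at 5%")
```

Because the bucket depends only on the user ID, a user who is in at 5% stays in at 50%, so raising the percentage only ever adds users to the new path.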

  80. But ultimately, if you’re adding a redundant piece of technology, your goal is to replace something with it. Your goal
    shouldn’t be to operate two pieces of technology that are redundant with one another forever—commit to replacing
    what you have, or don’t add the technology.

  81. So, in closing

  82. This is what you should do, most of the time. Choose technology that’s well understood, with failure modes that are
    known.

  83. Use technology that lets you focus your attention on what really matters.

  84. Don’t choose tech because of testimonials on Hacker News. Hacker News is kind of like Fox News, and not just
    because it’s dominated by libertarians.
    Something terrible is happening somewhere in the world all the time, so cable news always has a story. Someone’s
    porting their site to a NoSQL database right now, and they’ll write an unreasonable blog post about it that will be on
    HN. It’s unreasonable to extrapolate in either of these scenarios.

  85. Choose a few globally optimal technologies. Don’t make local decisions. Be kind to your future coworkers. Be kind to
    your future selves.

  86. It’s important to master the tools that you do pick.

  87. Every piece of software has this curve to some degree. When you start out you encounter a bunch of problems, but
    you expect to get them ironed out over time.

  88. There’s a natural tendency to want to give up on something in its infancy. When you’ve got a lot of problems with a
    thing, people freak out and want to switch to something else.
    If you encounter this and you’re naive, it can lead to a lot of wreckage. If you do one project with one database,
    encounter some of its quirks and then immediately give up, you can pretty rapidly wind up with ten different databases
    in production.

  89. If you do that you miss out on the part of the curve that we call “mastery.” It’s possible that given enough time with
    something, you can reach a state of minimal problems. Probably not zero problems, but the situation will feel like it’s
    stabilized.
    Now the given curve here, both the magnitude and the shape of it, varies across different kinds of technology. It’s true
    that you’re probably going to have a better time with mysql than with mongodb. But you’re not going to have zero
    problems with mysql, and you should not expect that on the path to mastery.

  90. There’s an unfortunate dilemma inherent in mastering your tools: having done that, you know where the bodies are
    buried. Familiarity with tools can breed contempt.

  91. There are tradeoffs with every tool. You always have things that are good,

  92. and you have things that aren’t great. That’s just the reality of mapping technology solutions onto problems in
    imperfect ways.

  93. Human nature is to obsess about the pain points. Or at least this is my nature. I think a lot of engineers suffer from the
    same thing, though, and technology doesn’t help. We don’t usually set up alerts reminding us about how well
    everything is going, if we all just step back and reflect. Although that’s a good idea for your next hack week.

  94. So it’s also human nature to look at another piece of technology and notice that it solves a couple of those pain points.
    And this is the definition of naïveté in engineering.

  95. Because as we’ve seen, we might not even think to ask a bunch of questions about a new piece of tech that we should
    be asking.

  96. There can be a lot of pain points hidden in our own blind spots.

  97. So we recognize that we have all of these cognitive issues: we’re susceptible to the green grass fallacy. We know we
    will tend to give up on our tools too quickly. We’re all people who got into this business because we like technology,
    and that will lead us to chase shiny new stuff.
    Humans are amazing animals that have figured out a method for containing the damage created by our own
    psychology. It’s called “society.”
    The way we protect ourselves from our own natures is to have a process. Don’t let technology choices happen without
    discussion. Have a process.

  98. Real happiness comes from what you can do after conquering technical choices, not from what you get from making
    technical choices.
    There’s a tendency among programmers to think that if they’re writing code, by definition they’re not wasting their time.
    This is a tar pit.

  99. Real happiness comes from achieving your higher-level goals. Not from solving interesting technical riddles that you
    create for yourself.
