Choose Boring Technology

Slide 2

Slide 2 text

Hey

Slide 3

Slide 3 text

I’m Dan McKinley. That’s me in the hole. It’s a metaphor.

Slide 4

Slide 4 text

I work for a company called Stripe. Before that, I was an early employee at Etsy, where I worked for a lot of years. I acquired a great deal of practical experience at Etsy so I’m going to be referring to my time there a lot.

Slide 5

Slide 5 text

Etsy wasn’t mature as an engineering organization when I got there, but I was eventually spoiled rotten when it came to technology culture at Etsy. As I’ve gone back into the wider tech world, I’ve had to confront some questions I hadn’t really considered in a few years. And I’ve realized I have opinions about these things. That’s what this talk is about.

Slide 6

Slide 6 text

So how do you choose technology? This was eventually more or less handled for me at Etsy. Now I need to worry about it again.

Slide 7

Slide 7 text

You can achieve anything with software. And I definitely believe that companies don’t usually succeed or fail because of specific technical choices. But technology choices are relevant. They affect how straight the path is between you and achieving your goals. They affect your efficiency.

Slide 8

Slide 8 text

Another question that I care about is: how do you make developers happy? This matters to me, as a developer. But also as a leader in a software organization—productivity and retention rely on this.

Slide 9

Slide 9 text

If you ask developers, a lot of them will tell you they’re happy when they’re working with technology that’s exciting. Or another thing they say is that they like to work on hard problems. Those things may or may not be true. But what I’ve learned with experience is that it’s not really the case at the highest levels of fulfillment. They don’t talk about nodejs in heaven.

Slide 10

Slide 10 text

You can probably tell from the title of this talk that I don’t think that chasing shiny technology is right. But look, I’ve been there. I, too, once chased shiny technology.

Slide 11

Slide 11 text

Etsy early on was a big ball of PHP, written by an overall brilliant person who was unfortunately learning PHP as he was writing it. I spent years trying to avoid dealing with the results of that. At one point I tried building Scala services that talked to MongoDB. I wrote blog posts about this that Etsy employees are still giving me shit about. And with good cause.

Slide 12

Slide 12 text

I think it’s fair to say that I’m a completely different kind of engineer now. I tend to be focused on things that are only vaguely engineering. I talk at design conferences, or in the “business” track. I care a lot more about product than your average engineer.

Slide 13

Slide 13 text

I view this less as the result of getting old and cranky and more as the result of climbing up Maslow’s hierarchy of needs. Maslow’s hierarchy, briefly, is the idea that you have to satisfy your more basic needs before higher levels of intellectual fulfillment are possible.

Slide 14

Slide 14 text

The same is basically true about software. You can’t ask intelligent questions about the direction of the product if you’re worried about which database to use or which alerting system to use. In my career to date, I’ve been pretty lucky to have my most basic needs fulfilled. And I want to help get others to this state.

Slide 15

Slide 15 text

So, try to think of me as a time traveler from your future. I’ve been through the shiny technology wars you might be fighting today. It’s better over here. The air is fresher. Food tastes better.

Slide 16

Slide 16 text

So, on to the problem of choosing technology. A thing that I think is obviously true is that as human beings, we have limited attention. We can only worry about so much stuff at one time.

Slide 17

Slide 17 text

I personally model that like this. You could say that we all get a limited number of innovation tokens to spend. This is a construct I just made up, but I think it’s helpful. And since I created this currency I also decided to put Elon Musk on it. These represent our limited capacity to do something creative, or hard. We really don’t have that many of these to allocate. Early on in a company’s life, we get like maybe three. Not too many more than that.

Slide 18

Slide 18 text

So what’s your company trying to do? Well, Etsy, where I used to work, is trying to reshape the world economy.

Slide 19

Slide 19 text

I dunno, that sounds like a big job. That probably requires at least one of your tokens.

Slide 20

Slide 20 text

The company where I work now is trying to increase the GDP of the internet.

Slide 21

Slide 21 text

Again, that sounds like a pretty complicated thing to be doing. We probably have to spend at least one of our tokens on that. Maybe two. Maybe all of them!

Slide 22

Slide 22 text

If you think about innovation as a scarce resource, it starts to make less sense to be on the front lines of innovating on databases. Or on programming paradigms. The point isn’t really that these things can’t work. Of course they can work. But exciting new technology takes a great deal more attention to work than boring, proven technology does.

Slide 23

Slide 23 text

To get at the reason for that I want to talk about the philosophy of knowledge a little bit. What can we know about a piece of technology? This is not actually a frivolous question. It’s really important.

Slide 24

Slide 24 text

Now look, I don’t like Donald Rumsfeld. But he’s associated with the following, which is thoroughly relevant to our subject.

Slide 25

Slide 25 text

And that’s this. When we don’t know something, there are really two different categories that that lack of knowledge can be in. There are known unknowns, that is, things that we know that we don’t know. And there are unknown unknowns, things that we don’t know and that we don’t know that we don’t know.

Slide 26

Slide 26 text

This applies in technology. This is an example of a known unknown. For a given database, we might not know what happens when a network partition occurs. But we know that a network partition is possible. Since we know that this is possible, we can test for this. Or we can just cross our fingers and hope that it doesn’t happen. Either way, we are informed about the possibility.

Slide 27

Slide 27 text

There are also unknown unknowns in technology. This is a good example I saw a few months ago. This person had a java process that was writing stats to a file, and that was causing GC pauses. It took him forever to figure this out because the possibility hadn’t occurred to him. That’s an unknown unknown.

Slide 28

Slide 28 text

Now, it’s important to realize that both categories are present in all software. There are always bugs that nobody knows about, even in software that’s been around forever.

Slide 29

Slide 29 text

But it’d be wrong to say that all technology is therefore equivalent. New technology has larger magnitudes for both of these sets. New tech typically has more known unknowns, and many more unknown unknowns. And this is really important.

Slide 30

Slide 30 text

Boring technology in a nutshell is technology that’s well understood. We know what it’s capable of, and at least as importantly, we also know what it’s not capable of. We know how boring technology fails.

Slide 31

Slide 31 text

So, ok, all you have to do is pick proven technology, and you’re all set, right? Well, no. The combination of things that you choose also matters.

Slide 32

Slide 32 text

Let’s say that you’re already using this stack. You have python, memcached, mysql, and apache.

Slide 33

Slide 33 text

Let’s say you have a new problem to solve. Do you think it makes sense to add ruby to your existing stack?

Slide 34

Slide 34 text

I think most people’s intuition there is “probably not.” We know that the marginal utility of adding ruby isn’t going to outweigh the complexity hit we take by adding it. Python and ruby feel pretty equivalent.

Slide 35

Slide 35 text

And we’ve had formal proofs since the 1930s that anything you can compute with one, you can in principle compute with the other.

Slide 36

Slide 36 text

Ok, so how about adding redis? We already have mysql and memcached, but should we add redis?

Slide 37

Slide 37 text

About here is where people lose it and start beating the polyglot programmer drums. There’s something about the idea of adding a new database that has people storming the Bastille, saying “you can’t stop us from using the best tool for the job!” People tend to think that what they’re doing when they acquiesce to this is that they’re giving developers freedom. And sure, it is freedom, but it’s freedom very narrowly defined.

Slide 38

Slide 38 text

What’s going on there? Let’s try to tease this apart.

Slide 39

Slide 39 text

This is what we’re implicitly saying when we want to add a piece of technology. Except in relatively rare cases where it’s not possible to solve a problem with our existing stack, we’re saying that the new tech is going to be so much better in the near term that this benefit outweighs the cost of having two pieces of technology around in perpetuity.

Slide 40

Slide 40 text

We can actually start to formalize this idea, and think about it in a structured way.

Slide 41

Slide 41 text

Well, sort of. I don’t expect to see this published in ACM. But here goes.

Slide 42

Slide 42 text

Your job is basically what my friend Coda says, here. You’re supposed to be solving business problems with technology.

Slide 43

Slide 43 text

We can model that as a bipartite graph. On the left side we have business problems, and on the right side we have technical solutions.

Slide 44

Slide 44 text

As practitioners we have to try to connect all of the nodes on the left side so that our problems are solved. Adding an edge here is making a technology choice.

Slide 45

Slide 45 text

Every choice has a maintenance cost, but we also get the benefit of the technology that we choose.

Slide 46

Slide 46 text

Every choice has maintenance costs, but every choice also helps us solve the problem. So we have a nonzero benefit, and a nonzero cost for every choice.

Slide 47

Slide 47 text

When we add more than one edge, we can make a choice. We can use the same technology that we’ve already paid for …

Slide 48

Slide 48 text

Or we could pick a different piece of technology. We have to pay for that new tech, too, but maybe we get so much development velocity that it’s worth it.

Slide 49

Slide 49 text

We can start to think about this mathematically. We’re trying to minimize this cost function. The total cost of our operations is all of the maintenance costs we take on from our choices, minus the development velocity we get from every choice.
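
As a rough illustration of that cost function, here is a minimal Python sketch. Everything in it is an assumption made for the example: the problem names, the maintenance and velocity numbers, and the `total_cost` helper are invented, not taken from the talk. It reads the slide as charging maintenance once per distinct technology in use (reusing something you’ve already paid for is free) and subtracting the velocity gained on each edge.

```python
from typing import Dict, Tuple

# Hypothetical numbers, chosen only to illustrate the shape of the trade-off.
MAINTENANCE: Dict[str, float] = {"mysql": 10.0, "memcached": 6.0, "redis": 8.0}

# Velocity benefit of solving a given problem with a given technology.
VELOCITY: Dict[Tuple[str, str], float] = {
    ("orders", "mysql"): 5.0,
    ("sessions", "memcached"): 4.0,
    ("feeds", "memcached"): 2.0,
    ("feeds", "redis"): 5.0,
}


def total_cost(allocation: Dict[str, str]) -> float:
    """Maintenance for each distinct technology, minus velocity on each edge."""
    technologies = set(allocation.values())
    maintenance = sum(MAINTENANCE[tech] for tech in technologies)
    velocity = sum(VELOCITY[(problem, tech)] for problem, tech in allocation.items())
    return maintenance - velocity


# Each allocation maps a business problem to the technology chosen for it.
shared_stack = {"orders": "mysql", "sessions": "memcached", "feeds": "memcached"}
with_redis = {"orders": "mysql", "sessions": "memcached", "feeds": "redis"}

print(total_cost(shared_stack))  # 16 maintenance - 11 velocity = 5.0
print(total_cost(with_redis))    # 24 maintenance - 14 velocity = 10.0
```

With these invented numbers the redis edge is quicker to build on, but the extra maintenance outweighs the gain. Different numbers would flip the result, so the sketch only shows the mechanics, not a conclusion.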

Slide 50

Slide 50 text

The way we behave really depends on what you believe about which term dominates this equation in the real world. If technology is really expensive to operate, the costs dominate. If technology really makes a huge difference in how easy your job is, the benefits dominate.

Slide 51

Slide 51 text

So, depending, you might decide to make an allocation like this. Here we’ve picked many different technologies to use to solve all of our problems.

Slide 52

Slide 52 text

And that makes complete sense if each additional technology choice is cheap. If we think that we get more out of using each new technology than we’ll pay for operationalizing it, then doing it this way makes sense.

Slide 53

Slide 53 text

This is an alternative strategy. Here we’ve chosen just a few technologies.

Slide 54

Slide 54 text

And that’s what we should do if we think that each technology we add comes with a lot of baggage.

Slide 55

Slide 55 text

Here in reality, new technology choices come with a great deal of baggage.

Slide 56

Slide 56 text

This is reality. Costs to operate a technology in perpetuity tend to outstrip the convenience you get by using something different.

Slide 57

Slide 57 text

So this tends to be the right way to do it. We should generally pick the smallest set of tech that lets us get the job done.
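
One way to make “smallest set of tech” concrete is to treat it as a set-cover problem: each technology can plausibly solve some set of problems, and you want few technologies that together cover all of them. The Python sketch below uses the standard greedy heuristic, which is an approximation and not guaranteed to find the true minimum. The capability table and the `pick_small_stack` helper are hypothetical and are not from the talk.

```python
from typing import Dict, List, Set


def pick_small_stack(problems: Set[str], capabilities: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick technologies until every problem is covered."""
    uncovered = set(problems)
    chosen: List[str] = []
    while uncovered:
        # Take the technology that solves the most still-unsolved problems.
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(capabilities), key=lambda tech: len(capabilities[tech] & uncovered))
        if not capabilities[best] & uncovered:
            raise ValueError(f"no candidate solves: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= capabilities[best]
    return chosen


# Made-up table of which problems each technology could be pressed into solving.
capabilities = {
    "mysql": {"orders", "search", "feeds", "queue"},
    "memcached": {"sessions", "feeds"},
    "redis": {"feeds", "queue", "sessions"},
    "elasticsearch": {"search"},
}
problems = {"orders", "search", "feeds", "queue", "sessions"}

stack = pick_small_stack(problems, capabilities)
print(stack)  # ['mysql', 'memcached']
```

The table deliberately lists things a tool can do awkwardly as well as things it does well. Whether the awkward uses are too costly is a judgment the code can’t make for you.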

Slide 58

Slide 58 text

That’s the case because operating a piece of technology at a professional level turns out to be really hard. It’s easy to get started with a lot of technology, but harder to do a really good job with it.

Slide 59

Slide 59 text

This is why. Adding the technology is easy, living with it is hard. These are all the things you have to worry about.

Slide 60

Slide 60 text

Polyglot programming is not the kind of freedom we are looking for. If you’re giving individual teams or individual engineers free rein to make local decisions about infrastructure, you’re hurting yourself globally. It’s handing developers the chains so that they’re free to imprison themselves with operational toil, forever.

Slide 61

Slide 61 text

There’s more to this than just avoiding operational overhead. By embracing polyglot programming, you’re also discarding real benefits that only arise when everyone’s using a shared platform.

Slide 62

Slide 62 text

A good example of this from my experience is Etsy’s activity feeds. I built this with a small team back in 2010.

Slide 63

Slide 63 text

Here’s a totally reasonable way to build activity feeds, if that’s all you’re trying to do. You could write events to mysql, aggregate them into a feed offline, stuff the feed into redis, and then serve the feeds to end users from redis. This would totally work great.

Slide 64

Slide 64 text

But when we set out to build activity feeds, we didn’t have redis. We did have memcached. They’re sort of similar, but they have very different guarantees. The most relevant difference to us here is that redis is persistent, and memcached isn’t. We didn’t add redis to our stack to make activity feeds. We made do with what we had.

Slide 65

Slide 65 text

And that required a good bit of extra effort up front. Since memcached isn’t persistent, we had to write a bunch of extra code to possibly generate the feed fresh for frontend requests. We couldn’t just assume that the feed would exist when the user came to the site. That was hard work we wouldn’t have had to do if we added redis, but we got through it.
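
The extra code described here amounts to treating every cache miss as a normal case. Below is a minimal Python sketch of that pattern. It is not Etsy’s implementation: `VolatileCache` is an in-memory stand-in rather than a real memcached client, and `get_feed` and `build_from_events` are hypothetical names.

```python
from typing import Callable, Dict, List, Optional


class VolatileCache:
    """Stand-in for a non-persistent cache: entries can vanish at any time."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}

    def get(self, key: str) -> Optional[List[str]]:
        return self._data.get(key)

    def set(self, key: str, value: List[str]) -> None:
        self._data[key] = value

    def evict(self, key: str) -> None:
        self._data.pop(key, None)


def get_feed(
    user_id: int,
    cache: VolatileCache,
    build_feed: Callable[[int], List[str]],
) -> List[str]:
    """Serve the cached feed if present; otherwise rebuild it and cache it."""
    key = f"feed:{user_id}"
    feed = cache.get(key)
    if feed is None:
        # No durability guarantee, so the request path must be able to rebuild.
        feed = build_feed(user_id)
        cache.set(key, feed)
    return feed


calls: List[int] = []


def build_from_events(user_id: int) -> List[str]:
    # Placeholder for the expensive step of aggregating stored events.
    calls.append(user_id)
    return [f"event for user {user_id}"]


cache = VolatileCache()
get_feed(42, cache, build_from_events)  # miss: builds and caches
get_feed(42, cache, build_from_events)  # hit: served from cache
cache.evict("feed:42")                  # simulate the cache losing the entry
get_feed(42, cache, build_from_events)  # miss again: rebuilt on demand
print(len(calls))  # 2
```

With a persistent store the rebuild branch could be left out. Keeping it is the price of staying on the shared stack.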

Slide 66

Slide 66 text

Then we walked away. We didn’t do anything related to activity feeds for years after that.

Slide 67

Slide 67 text

But a funny thing happened. The usage of activity feeds exploded by 20 times. And it was totally fine. This is the greatest purely technical achievement in my entire career.

Slide 68

Slide 68 text

The reason it was totally fine was because we used the shared stack. We had to plug in more mysql shards and memcached boxes, but people were doing that anyway.

Slide 69

Slide 69 text

If we’d done redis just for activity feeds, you can be sure that redis would have become distressed as the feature scaled up 20 times. And we would have had to go back and work on redis just to keep activity feeds working.

Slide 70

Slide 70 text

Or more likely, someone else would have had to do it. Our team didn’t exist at all a year later; we were all working on different things. Making a mess for others to clean up strikes me as even worse. That’s what you’re doing by adding a piece of technology that makes sense locally.

Slide 71

Slide 71 text

This is an example, but it’s not an absolute principle. Obviously sometimes it does make sense to add new technology to your stack.

Slide 72

Slide 72 text

So I wanted to finish by talking about how we should go about doing that.

Slide 73

Slide 73 text

First of all, it’s important to recognize that adding technology is a process. Technology has global effects on your company, so it isn’t something that should be left to individual engineers. I don’t care if you’re a flat organization, a holacracy, or if you have 500 middle managers. You have to figure out how to talk to each other before you add new technology.

Slide 74

Slide 74 text

When we were all using real hardware, it was usually the case that talking to at least one other person was necessary before adding something new. Now everybody’s on AWS, and this is no longer true. Engineers can sit in a corner and proliferate new systems all day. I don’t think that real hardware is a good thing on balance, but I do think that talking to people is a positive thing. We just have to work harder to do this now, and have those conversations on purpose.

Slide 75

Slide 75 text

The first question you should talk about is how you’d solve the problem without adding anything new.

Slide 76

Slide 76 text

I think that you’ll notice that pretty often, this is enough to end the conversation. Because a high percentage of the time, the problem to be solved is that someone wants to use a new piece of tech for its own sake. You should not entertain this impulse as a serious person.

Slide 77

Slide 77 text

But anyway, assuming that you have a real problem, the answer is rarely that you can’t do it. If you have a functioning website of any kind and you think you can’t accomplish a specific new feature with what you already have, you’re probably just not thinking hard enough. You may need to resort to unnatural acts, but you can get pretty far with a minimal stack.

Slide 78

Slide 78 text

Again, you might have to do really awkward things, and it’s possible that those are too costly. But you should talk about and write down what those things are.

Slide 79

Slide 79 text

And if you decide to try out a new piece of technology, you should figure out low-risk ways to get started. Your tactic should not be to rewrite your entire application with it in one step. You should be proving the technology in production with minimal risk, and then gradually gaining confidence in it.
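
One common low-risk way to start is a percentage rollout: send a small, stable slice of traffic to the new system and widen it as confidence grows. The Python sketch below is only an illustration of that idea. The `in_rollout` and `read_feed` helpers and the feature name `new-feed-store` are hypothetical and do not come from the talk.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a user in or out of a percentage rollout."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    return bucket < percent


def read_feed(user_id: str, percent: int) -> str:
    # Both return values are placeholders for "what you already run"
    # and "the new thing you are trying out".
    if in_rollout(user_id, "new-feed-store", percent):
        return "new store"
    return "existing store"


users = [f"user-{n}" for n in range(1000)]
at_5 = sum(in_rollout(u, "new-feed-store", 5) for u in users)
at_50 = sum(in_rollout(u, "new-feed-store", 50) for u in users)
print(at_5, at_50)  # roughly 5% and 50% of users; exact counts depend on the hash
```

Because a user’s bucket never changes, raising the percentage only adds users to the new path. Setting it back to 0 returns everyone to the existing one.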

Slide 80

Slide 80 text

But ultimately, if you’re adding a redundant piece of technology, your goal is to replace something with it. Your goal shouldn’t be to operate two pieces of technology that are redundant with one another forever—commit to replacing what you have, or don’t add the technology.

Slide 81

Slide 81 text

So, in closing:

Slide 82

Slide 82 text

This is what you should do, most of the time. Choose technology that’s well understood, with failure modes that are known.

Slide 83

Slide 83 text

Use technology that lets you focus your attention on what really matters.

Slide 84

Slide 84 text

Don’t choose tech because of testimonials on Hacker News. Hacker News is kind of like Fox News, and not just because it’s dominated by libertarians. Something terrible is happening somewhere in the world all the time, so cable news always has a story. Someone’s porting their site to a NoSQL database right now, and they’ll write an unreasonable blog post about it that will be on HN. It’s unreasonable to extrapolate in either of these scenarios.

Slide 85

Slide 85 text

Choose a few globally optimal technologies. Don’t make local decisions. Be kind to your future coworkers. Be kind to your future selves.

Slide 86

Slide 86 text

It’s important to master the tools that you do pick.

Slide 87

Slide 87 text

Every piece of software has this curve to some degree. When you start out you encounter a bunch of problems, but you expect to get them ironed out over time.

Slide 88

Slide 88 text

There’s a natural tendency to want to give up on something in its infancy. When you’ve got a lot of problems with a thing, people freak out and want to switch to something else. If you encounter this and you’re naive, it can lead to a lot of wreckage. If you do one project with one database, encounter some of its quirks and then immediately give up, you can pretty rapidly wind up with ten different databases in production.

Slide 89

Slide 89 text

If you do that you miss out on the part of the curve that we call “mastery.” It’s possible that given enough time with something, you can reach a state of minimal problems. Probably not zero problems, but the situation will feel like it’s stabilized. Now the given curve here, both the magnitude and the shape of it, varies across different kinds of technology. It’s true that you’re probably going to have a better time with mysql than with mongodb. But you’re not going to have zero problems with mysql, and you should not expect that on the path to mastery.

Slide 90

Slide 90 text

There’s an unfortunate dilemma inherent in mastering your tools: having done that, you know where the bodies are buried. Familiarity with tools can breed contempt.

Slide 91

Slide 91 text

There are tradeoffs with every tool. You always have things that are good,

Slide 92

Slide 92 text

and you have things that aren’t great. That’s just the reality of mapping technology solutions onto problems in imperfect ways.

Slide 93

Slide 93 text

Human nature is to obsess about the pain points. Or at least this is my nature. I think a lot of engineers suffer from the same thing, though, and technology doesn’t help. We don’t usually set up alerts reminding us about how well everything is going, if we all just step back and reflect. Although that’s a good idea for your next hack week.

Slide 94

Slide 94 text

So it’s also human nature to look at another piece of technology and notice that it solves a couple of those pain points. And this is the definition of naïveté in engineering.

Slide 95

Slide 95 text

Because as we’ve seen, we might not even think to ask a bunch of questions about a new piece of tech that we should be asking.

Slide 96

Slide 96 text

There can be a lot of pain points hidden in our own blind spots.

Slide 97

Slide 97 text

So we recognize that we have all of these cognitive issues: we’re susceptible to the green grass fallacy. We know we will tend to give up on our tools too quickly. We’re all people who got into this business because we like technology, and that will lead us to chase shiny new stuff. Humans are amazing animals that have figured out a method for containing the damage created by our own psychology. It’s called “society.” The way we protect ourselves from our own natures is to have a process. Don’t let technology choices happen without discussion. Have a process.

Slide 98

Slide 98 text

Real happiness comes from what you can do after conquering technical choices, not from what you get from making technical choices. There’s a tendency among programmers to think that if they’re writing code, by definition they’re not wasting their time. This is a tar pit.

Slide 99

Slide 99 text

Real happiness comes from achieving your higher-level goals. Not from solving interesting technical riddles that you create for yourself.
