Right the First Time - Speaker Deck

Slide 1

Slide 1 text

Shawn M Moore @sartak Right the First Time Hi everybody. I’m Shawn Moore.

Slide 2

Slide 2 text

Best Practical I work for Best Practical. I’m the director of engineering there. We’re based in Boston.

Slide 3

Slide 3 text

You might know Best Practical because we make Request Tracker, or RT. As in rt.cpan.org or rt.perl.org.

Slide 4

Slide 4 text

You might know me personally from some of the stuff I’ve put on CPAN. Not for the modules I’d prefer to be known for, though! :)

Slide 5

Slide 5 text

Craftsmanship I’m here to talk about craftsmanship. It’s something I’ve thought a lot about in the eighteen years I’ve been programming. Well, mostly only the last five years or so. Before that I was happy just to get something working. Now the standards I apply to myself, and hence my output, are much higher.

Slide 6

Slide 6 text

Mark Dominus recently tweeted this. He put it in quotation marks but as far as I can tell the quote is original to him, from almost ten years ago now. So I bet Dominus is a really smart programmer by now. The implication here is that it always matters, except for like, one-liners. And new programmers, no matter how smart, vastly underestimate how important it is to do it right the first time, because of how frequently bad code will come back to haunt them.

Slide 7

Slide 7 text

Don’t do this with your code. When you’re working and spot some nearby problematic code, take a little bit of extra time to fix it the right way. You’ll be happier in the long run, and it pays dividends. If the project has a standard of being accepting of sloppy code, sloppy code is what you get!

Slide 8

Slide 8 text

Code that looks like through some divine inspiration you got it exactly right the first time Bernie Cosell – Coders at Work There’s a quote I like from the book Coders at Work. “I want every routine you work on to look as if it was just written. I do not want to see any evidence of afterthoughts or things gone wrong followed by something to correct the error or a mysterious piece of code saying, "This routine returns the wrong value every now and then so I've got to fix it." I don't want to see any of that. I want to see code that looks like through some divine inspiration you got it exactly right the first time.”

Slide 9

Slide 9 text

Code that looks like through some divine inspiration you got it exactly right the first time Bernie Cosell – Coders at Work Say as you’re working on something you realize a better way of doing it. It happens all the time because you understand the big picture better as time goes on. You damn well better circle back and improve upon the existing code to use that new style too. You’ll see examples of this done poorly EVERYWHERE YOU LOOK. How many times have you come across some code that follows an otherwise-deprecated structure that had been almost but not completely replaced by a new one? Now you have to keep two separate systems in your head at the same time. Good luck with that. It’s such a great feeling to remove the last user of a module and so be able to get rid of it.

Slide 10

Slide 10 text

So when you install a shiny new pole, go back and move the clock that was there behind it. Divine inspiration looks like you somehow already knew the perfect place for the clock in advance. Sloppiness is instead writing in magic marker on the pole and then you become a slide that people laugh at.

Slide 11

Slide 11 text

understanding > writing You spend waaaayyyy more time reading and debugging code than writing it. So while it’s tempting and expedient to do so, don’t take the false economy of abbreviated or sloppy names, cute shortcuts, skipping comments, etc. Anything you can do to make your code just a tiny bit clearer will pay off every single time someone interacts with it.

Slide 12

Slide 12 text

Naming One of the most important places you can spend your time improving understanding is in choosing good names for your API. It shapes how developers think about, and come to understand, your API.

Slide 13

Slide 13 text

wantarray() wantarray is a good example of a bad name in Perl. It tells you about the context your sub was called in. It returns true if it was list context. Yet this name refers to wanting an array. There’s no such thing as array context. Everywhere else in Perl, arrays and lists are treated as distinct things, but wantarray is one glaring exception that makes it just that much harder to grok Perl. perldoc perlfunc even says “This function should have been named wantlist() instead.” That sucks because in order to really understand Perl, it’s vitally important to really understand context, and this just makes it a tiny bit harder.
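To make the context point concrete, here is a minimal sketch of what wantarray reports. The helper name context_of is made up for illustration; wantarray itself is the real builtin, which returns true in list context, false but defined in scalar context, and undef in void context.

```perl
use strict;
use warnings;

# context_of() is a hypothetical helper that reports how it was called.
sub context_of {
    my $want = wantarray;
    return 'list'   if $want;          # true: list context
    return 'scalar' if defined $want;  # false but defined: scalar context
    return;                            # undef: void context, nothing to return
}

my @in_list   = context_of();   # assigning to an array gives list context
my $in_scalar = context_of();   # assigning to a scalar gives scalar context

print "array assignment:  $in_list[0]\n";
print "scalar assignment: $in_scalar\n";
```

Note that the array on the left-hand side is what imposes list context here. There is still no "array context", which is exactly why the name misleads.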

Slide 14

Slide 14 text

Guide developers with your APIs This is what I strive for when I architect something. You want to build APIs that steer developers in the right direction. Ideally they couldn’t possibly use it incorrectly. Apple does a really great job of this with its APIs. There’s a consistency and clarity across everything from Foundation to UIKit that just makes you feel like a wizard who can do anything. (At least for the first few months, until reality settles in.) There’s an obvious, straightforward way to do things. APIs are orthogonal and work together. A bad API is one you fight with, one whose documentation you have to look up every single time you use it, and one you quickly come to use only reluctantly.

Slide 15

Slide 15 text

addslashes() mysql_escape_string() mysql_real_escape_string() Here’s a wonderful example of doing this poorly. How many developers do you think stopped at mysql_escape_string instead of going on to mysql_real_escape_string, leaving themselves vulnerable to SQL injection? By the way, the right answer is not to escape strings yourself but to use parameterized queries, that way there’s never any question whether escaping happened correctly. Thankfully we’ve more or less settled on this in Perl. At least as far as I’ve ever seen. Disclaimer: I haven’t seen your codebase.

Slide 16

Slide 16 text

extends 'Super::Class' has 'attr' override 'method' I think Moose and PSGI do a good job of this. The proliferation of Moose-lites, and even things like Class::Accessor ‘antlers’, is a testament to that. And how quickly everyone switched to PSGI and immediately reaped the benefits is largely because it had the right API. HTTP::Engine was a similar idea a year earlier with a clunkier API and so it saw very little uptake.

Slide 17

Slide 17 text

while (<>) { … } I think this is a good example of an API that doesn’t guide you in the right direction. It’s SO easy to use, which is exactly the kind of thing that gives Perl its ubiquity and longevity. But it has a pretty fatal flaw: it opens each filename with the two-argument form of open, so a name that starts or ends with a pipe gets run as a shell command. This is a security hole waiting to happen. So I’m reluctant to use it.

Slide 18

Slide 18 text

$ ls -l
-rw-r--r-- bar
-rw-r--r-- foo
-rw-r--r-- rm -rf * |
-rwxr-xr-x script.pl
$ cat script.pl
#!/usr/bin/perl
while (<>) { }
$ ./script.pl *
Can't open script.pl: No such file or directory at ./script.pl line 2.

The script opens bar for reading, opens foo for reading, opens “rm -rf *” with a shell because of the |, and then by the time it tries to open script.pl for reading, it’s been deleted.

Slide 19

Slide 19 text

while (<<>>) { … } This is how you fix it: use the double diamond operator from the latest version of Perl. I think the solution of adding a new safe version is kind of like our mysql_real_escape_string. Gross. I would have fixed the existing operator and added a new unsafe version. That’s probably why I’d be a terrible pumpking. After all, that ship sailed decades ago, we can’t change the existing behavior. Victim of our own success.
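For a Perl that predates the double diamond (it arrived in 5.22), one way to get similar safety is to loop over the arguments yourself with three-argument open. This is only a sketch, and read_files is a hypothetical helper rather than anything from the talk.

```perl
use strict;
use warnings;

# read_files() is a hypothetical helper: it treats every argument as a
# literal filename and never hands it to a shell.
sub read_files {
    my @names = @_;
    my @lines;
    for my $name (@names) {
        # Three-argument open keeps the mode separate from the name, so a
        # name such as "rm -rf * |" is just an oddly named file.
        open my $fh, '<', $name or do {
            warn "Can't open $name: $!\n";
            next;
        };
        push @lines, <$fh>;
        close $fh;
    }
    return @lines;
}

print read_files(@ARGV);
```

Unlike the diamond operators, this sketch does not fall back to STDIN when no arguments are given, and it slurps everything into memory. Both are simplifications to keep the example short.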

Slide 20

Slide 20 text

SYNOPSIS

use utf8::all; # Turn on UTF-8, all of it.
open my $in, '<', 'contains-utf8'; # UTF-8 already turned on here
print length 'føø bār'; # 7 UTF-8 characters
my $utf8_arg = shift @ARGV; # @ARGV is UTF-8 too (only for main)

So. One trick for achieving an API that guides developers to do things right is to write the synopsis for your module or feature first, before you do any coding. It’s documentation, but it’s one screenful of real, runnable code that introduces your API. Be optimistic in that synopsis. Then try to achieve it. You might not always get there, but it’ll serve as a good guide. The synopsis is the first, and sometimes only, thing I look at when I judge whether I want to use a module. If there are irrelevant details I’m way more reluctant to use that module. That synopsis should serve as your first unit test too.
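Here is a rough sketch of what "the synopsis is your first unit test" could look like. Temperature is an invented module, defined inline only so the example runs on its own; in a real distribution the synopsis lines would be copied from the documentation into a test file.

```perl
use strict;
use warnings;
use Test::More;

# Temperature is a made-up module, defined inline so the sketch runs as-is.
package Temperature {
    sub new           { my ($class, %args) = @_; return bless {%args}, $class }
    sub in_fahrenheit { my ($self) = @_; return $self->{celsius} * 9 / 5 + 32 }
}

# The synopsis, copied verbatim from the (imagined) documentation.
my $temp = Temperature->new(celsius => 100);
my $f    = $temp->in_fahrenheit;

# If the synopsis ever stops being true, this test fails.
is($f, 212, 'the synopsis behaves as documented');
done_testing;
```

Writing the two synopsis lines first, then making the test pass, keeps the API you promised and the API you shipped from drifting apart.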

Slide 21

Slide 21 text

Remove doubt from your code Brent Simmons – How Not to Crash Remove doubt from your code by using proper names. Make it clear that you aren’t vulnerable to the rm issue by using while (<<>>). Use (not necessarily fatal) assertions or parameter validation to enforce your expectations, and to communicate them to maintainers. Be strict in what you accept. Explicit is better.
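As an illustration of fatal versus non-fatal checks, here is a small sketch using the core Carp module. The routine set_priority and its rules are hypothetical, chosen only to show the shape of the idea.

```perl
use strict;
use warnings;
use Carp qw(carp croak);

# set_priority() is a hypothetical routine used only for illustration.
sub set_priority {
    my ($ticket, $priority) = @_;

    # Fatal validation: without a ticket there is nothing sensible to do.
    croak 'set_priority: ticket must be a hash reference'
        unless ref $ticket eq 'HASH';

    # Non-fatal assertion: complain from the caller's perspective, then
    # fall back to a documented default.
    unless (defined $priority && $priority =~ /\A[0-9]+\z/) {
        carp 'set_priority: expected a non-negative integer, using 0';
        $priority = 0;
    }

    $ticket->{priority} = $priority;
    return $ticket;
}

my $ticket = set_priority({ id => 1 }, 5);
print "priority is $ticket->{priority}\n";
```

Either way, the checks double as documentation: a maintainer reading the top of the routine learns exactly what it expects.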

Slide 22

Slide 22 text

Consistency is important too. Similar things should look similar. Why are all these icons different from each other? I spent more time thinking about the sign itself rather than the restrictions it lists. Did the designer add the no-cell-phones icon later, is that why the stroke is going the other direction? The stroke for no power (which, frankly, I’m not sure why that’s listed as a prohibited thing) is thinner, does that mean it’s only a little bit forbidden? Its circle is thicker though, which draws the eye, subtly suggesting it is more forbidden. Now I’ve got doubt. And hey, why is hamburger-with-cup the universal sign for food? What do they use outside of the West?

Slide 23

Slide 23 text

Now I’m confused and all of a sudden I noticed that I opened a bunch of tabs on Wikipedia articles. So… why not just use the unicode character for ⃠, or at least copy and paste the first set of circle and line shapes in Photoshop? I’m taking this critique to a ridiculous length here. But actually, not really. These are all exactly the kinds of questions that subtle differences in similar code raise. So go that extra mile and make things consistent so that these irrelevant, useless questions would never even be asked. Remove doubt. As a fringe benefit, making them consistent will make it easier to refactor later on, too.

Slide 24

Slide 24 text

I’ll write comments later… “I’ll come back after I’m done and fill in comments later…” NO! You need to be in the mindset you were in while writing the code to be able to explain it properly. After all, the whole point of comments is specifically to communicate what's in your mind that isn't already apparent from the code. Capture it while you can! Don’t wait til it’s left your mind in the hopes that it will come back. It won’t. At best you’ll end up just describing the code itself, which is not what comments are for. Comments are for things that aren’t immediately apparent from reading the code. But usually what happens is you move onto the next thing and you never write those comments.

Slide 25

Slide 25 text

You’re trying to compress that very rich picture in your head for the reader on the other end Bret Victor – Media for Thinking the Unthinkable “[You] have a very rich picture in your head. And [you’re] trying to compress that picture to transmit over a very low-bandwidth channel, which is this stream of symbols. So it's a very lossy compression. And we've got the reader on the other end trying to decompress this picture into their own head. And it doesn't work very well.” So to make sure that there’s no miscommunication, you want to be very precise, more verbose than feels natural, and overly self-critical. I’d go so far as to say that you want to do this in all technical communication, whether it be code, comments, tests, documentation, IRC, etc.

Slide 26

Slide 26 text

git forensics Your tricky code may be sensible today. But will it still be sensible when someone comes to look at it in a year? It’s not enough to think of your code as it exists right now. Consider how this code will look in six months, a year, six years. Surely you haven't ever encountered six-year-old code before. And if you have I'm sure every time it was just wonderful. Will your future maintainer have to perform git forensics to figure out why you did it this way instead of another? Instead of making them guess, just leave a comment! Write good commit messages too.

Slide 27

Slide 27 text

What separates programming from painting, writing, etc is that there is no finished product Brent Simmons – How Not to Crash It’s important to have a plan for that six year old code, and to know that the code you’re writing today will probably become six years old. How will the code you’re writing today be received six years from now? Put in five extra minutes today to save hours, maybe days, over that time span.

Slide 28

Slide 28 text

The most depressing thing… is a chunk of code… you dare not modify Simon Peyton Jones – Coders at Work I’m sure no one here has been in this situation before. Lately I’ve been striving to be in the exact opposite situation. I go well out of my way to become master over the code I’m writing. If there’s something I don’t understand, I spend time tumbling down the rabbit hole. If I catch myself thinking “gosh, I hope I don’t ever have to touch THAT code” then something’s not right in paradise.

Slide 29

Slide 29 text

Loosely coupled modules When you do come across six year old code you might decide the best course of action is to rewrite it. Your job will be much easier if the old code was loosely coupled—using, and offering, only very small APIs. That also helps in understanding the codebase, since you can more readily understand each module in isolation, and then you can understand the connections between modules, and then you’re done. It’s very natural. Deep coupling means you have to understand everything globally before you can understand anything separately.

Slide 30

Slide 30 text

Sereal Sereal is a good example of this. I am a pretty good programmer but I am not the wizard its authors are. There’s no way I could have written it. But I can certainly use it confidently, because it offers a very straightforward API, just sereal_encode and sereal_decode. If the code to encode data structures were in the ORM layer and the code to decode Sereal were in the frontend, it’d make everyone MUCH grumpier every time they saw it.

Slide 31

Slide 31 text

Being abstract is profoundly different from being vague Edsger W. Dijkstra - An Introduction to Implementation Issues (EWD 656) That leads me to my next point which is about abstraction. The whole point of abstraction is to build new tools you then use to work at a higher level in a precise way. When you’re interacting with a database, you don’t want to have to care about the file I/O operations that need to happen. You want to deal in the realm of SQL queries, which is a very precise system. There’s nothing vague about it. It’s just that details you don’t care about for the task you’re performing are handled behind the scenes. Dealing with the I/O yourself in order to do set operations in order to do SQL queries would be a disaster. There’s value in knowing how the database does its work, for example to create more efficient indexes, but that’s an expert-level concern. A specialization.

Slide 32

Slide 32 text

Good programmers are more facile at jumping between layers of abstraction Peter Seibel – Coders at Work I worked with someone once who was terrible at this. Every bugfix and feature would be done in the wrong place. It drove me absolutely mental and it was a serious drag on the entire project. It’d be like in the database example if instead of changing a column name in a SQL query he’d do a search-and-replace on the text stream as it comes in from the database file. In stark contrast, the student I mentored for the Gnome Outreach Program for Women, Upasana Shukla, was perfectly good at this. By the way she is a real pleasure to work with.

Slide 33

Slide 33 text

I’m a big fan of this idea called “gumption”. Basically, deliberate and focused self improvement. Practice practice practice. Do things outside your comfort zone to expand your comfort zone. Be better. Never stagnate; never stop learning.

Slide 34

Slide 34 text

One technique I’ve been using lately is creating a todo list for each feature branch I’m working on. Here’s my OmniFocus project for a new feature that is nearly done. You can see there’s a lot of atomic next actions here: tests, documentation, ask questions of others, investigations into existing code, create tickets. Basically a continuous braindump of all the stuff I think to do as I’m working on the feature. Or even when I’m lying in bed trying to sleep and my subconscious, which has been chewing on the problem, reveals more insight. And I break down tasks into smaller chunks as I progress and understand the problem better. It’s almost fractal, but thankfully it does bottom out.

Slide 35

Slide 35 text

Thanks to this I have a very high level of confidence that I’ve completely finished a feature. There are no loose ends, because I know every single consideration has been captured and dealt with. That leads to a sort of mastery. I never have anxiety that I’ve forgotten something. So it means I get my branches right the first time! By the way, read Getting Things Done. It’s fantastic. It’ll change your life!

Slide 36

Slide 36 text

Many people do this with a plain text file directly in the repo. I’ve mindmelded with OmniFocus so I use that, but anything that externalizes your intent is good. Use all your brainpower on solving each problem, rather than remembering the list of things to do. Divide and conquer. It’s also handy if you have to put the branch down for days, or months, and come back to it later. Usually you’ve forgotten everything so you have to research just to get it back into your brain. But with this, you already have your list of things you’ve done, and a list of what’s left to finish.

Slide 37

Slide 37 text

I’d shy away from using RT or JIRA for this because you'd generate a lot of email as you open and close tiny tickets. Even my tiny feature branch ended up with 23 tasks. It’s also more friction than you want. For example you’d want to assign each and every ticket to yourself. There’s also no “inbox” which means capture requires a little bit more effort to locate the project and create a related ticket. But anyone who’s worked with me will tell you that I have a tendency to make thousands of atomic tickets in RT and JIRA anyway.

Slide 38

Slide 38 text

Programmers should not waste their time debugging, they should not introduce the bugs to start with Edsger W. Dijkstra – The Humble Programmer (EWD 340) “If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with.” Better APIs, clearer code, assertions, and guided APIs all help to avoid introducing bugs.

Slide 39

Slide 39 text

Be careful, and don’t be optimistic. Optimists write bugs. Brent Simmons – How Not to Crash

Slide 40

Slide 40 text

There’s a talk from a researcher at JPL about the code in the Curiosity Mars rover project, and how they made it robust. This is important because any bug could instantly wreck a 2 billion dollar mission, not to mention jeopardize future funding. Of course consumer software requires nowhere near that level of reliability, but reliability in general is important for all software, so it’s worth at least knowing what organizations like NASA and JPL do. When you think of code for space programs, you probably think of teams of scientists producing three lines of FORTRAN a day. Any more than that would be too hasty! https://vimeo.com/84991949

Slide 41

Slide 41 text

But that’s not how they roll any more. That discipline and vigilance are difficult to maintain. The entire talk was about how JPL does code review, either from peers or modern source code analysis tools. The vast majority of code review feedback led to changes in code, rather than being rejected or ignored. He claimed “This worked remarkably well”. Even if code review isn’t a policy for your entire company yet, I hope you can consider doing it in your individual team.

Slide 42

Slide 42 text

Shorten the conceptual gap between the static program and the dynamic process Edsger W. Dijkstra – A Case against the GO TO Statement (EWD 215) “Our intellectual powers are rather geared to master static relations and our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost best to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.”

Slide 43

Slide 43 text

Shorten the conceptual gap between the static program and the dynamic process Edsger W. Dijkstra – A Case against the GO TO Statement (EWD 215) Your program is more or less a gigantic text file. When you run your program, the text isn’t itself doing anything. The process that results from your program is a living, breathing beast. It’s important that the program and its process match as closely as possible, otherwise it’s very difficult to understand and modify the program to get the desired behavior out of the process.

Slide 44

Slide 44 text

Shorten the conceptual gap between the static program and the dynamic process Edsger W. Dijkstra – A Case against the GO TO Statement (EWD 215) It’s the difference between a recipe and its resulting dish. What you wouldn’t want in a recipe is nonsensical, extraneous steps. Like if you’re following a recipe and step 4 is flush your toilet. Why the hell is that in the recipe? Because the original guy had to take a leak while he was waiting for his noodles to boil. You might think that’s ridiculous, but it happens ALL THE DAMN TIME in code. “Oh I was trying to fix something so I just left that in there and didn’t bother to remove it when I figured out that the problem was somewhere else.” Use “git add -p” to select exactly the changes that do what you need and throw the rest away.

Slide 45

Slide 45 text

Add the debug code early and leverage it throughout the whole process Casey Muratori – Working on The Witness “If you’re going to debug one thing in a piece of code, you’re going to debug a bunch of things, so you might as well add the debug code early and use it throughout the whole process.” You’ll have better tools at the end of it. Your infrastructure team does a great job with this with its dashboards.

Slide 46

Slide 46 text

A work project involved making sense of, and porting parts of, a monstrously complicated spreadsheet. You feed it a few numbers, it runs a shitload of arithmetic, and spits out a few numbers. We basically use this as a specification since this gigantic XLS would not be appropriate to ship to end users. The thing’s a mess. There are dozens of tabs, and thousands of cells per tab, etc.

Slide 47

Slide 47 text

Inevitably something will go wrong with our calculations and we’ll need to investigate why. No one carries the spreadsheet in their head, nor could anyone ever hope to work backwards from the outputs to figure out which intermediate calculations went wrong. We need some way to surface all those bazillions of lines of arithmetic. So I thought hey, printf. I even spruced it up with some formatting and highlighting of like terms on hover. It was beautiful!

Slide 48

Slide 48 text

But it was utterly useless. I was the only one who could ever debug anything, because I was the only one who could correlate debug output with the spreadsheet, and every time it was very time consuming. So back to the drawing board… the whole point is we want to figure out where our calculations diverge from the spreadsheet’s. So I threw away both our implementation based on the spreadsheet’s documented “reference algorithm” (which is a euphemism for bullcrap) and the debug UI and produced something better.

Slide 49

Slide 49 text

So what I ended up with was more or less a straightforward Excel interpreter. I hew as closely to the spreadsheet’s format as possible. That way differences immediately jump out at you. Building the right tools like this not only dramatically cuts down on time for both debugging and customizations, but also lets the next programmer come along and see exactly what’s going on. The static program, in this case the spec, is INCREDIBLY similar to the dynamic process, which is surfaced in this way.

Slide 50

Slide 50 text

Laziness Impatience Hubris Larry Wall said these are the three great virtues of a programmer. You’ve probably read Programming Perl but maybe for some of you it’s been a while. So let’s review. Laziness: “The quality that makes you go to great effort to reduce overall energy expenditure. It makes you […] document what you wrote so you don't have to answer so many questions about it.”

Slide 51

Slide 51 text

Laziness Impatience Hubris Impatience: “This makes you write programs that don't just react to your needs, but actually anticipate them.”

Slide 52

Slide 52 text

Laziness Impatience Hubris Hubris: “The quality that makes you write (and maintain) programs that other people won't want to say bad things about.”

Slide 53

Slide 53 text

svn ci -m ‘hack hack hack’ The days of just putting whatever you happened to have changed into the repository are over. We could get away with it back in the Subversion days because the tooling wasn’t as good. Git doesn’t suck. Spend the time to get good at it. After all it’s an API to the complete history of your codebase. Which is pretty damn useful.

Slide 54

Slide 54 text

Atomic commits rebase -i add -p I make atomic commits. Each commit changes the smallest thing it possibly could, without breaking the codebase. That way they’re easy to review, easy to bisect, and easy to reorder. It helps you form a narrative. Each commit is a baby step toward your goal. You want to get really good with git rebase -i and git add -p. These let you produce good commit history which is an investment in future maintenance. I don't know if your policy allows push --force to rewrite published history but even if it doesn't, holding back your pushes until you're a bit more confident about the code change can go a long way.

Slide 55

Slide 55 text

log bisect blame The git history is important because it gives you context for the changes that you’ve made. It’s one more piece of the puzzle in figuring out why code exists as it does today. If, say, you’re debugging a piece of a subroutine and see that when it was committed, you hadn’t yet converted to PSGI, that’s a pretty big hint that that subroutine should be updated to take PSGI into account. That sort of thing. Blame lets you see the commit that each line of code comes from. Hugely useful. Bisect lets you binary search over commits to find some change.

Slide 56

Slide 56 text

Start commit messages with a verb Reference ticket ID? One small way you can improve your commit messages is to start them with a verb. “Added x”. “Removed y.” “Refactored z.” That way they make sense as you read them in a log, or rebase, or any other context. It also helps you kickstart writing your commit message. Another idea that helps tremendously is to put a ticket ID right into the commit message. Anything you can do to provide more context for future maintainers helps. That way when you dive into new code you have a better chance of understanding the whats and the whys.

Slide 57

Slide 57 text

Don’t run with scissors. Don’t even touch these scissors. They have a bunch of poison on them. Brent Simmons – How Not to Crash

Slide 58

Slide 58 text

Billy, don’t be a hero. You don’t get bonus points for being a hero. You know what’s actually really valuable that doesn’t get enough credit? Creating simple, readable code that anyone can modify. I don’t know what the culture at your company is like but I try to be very mindful in noticing when a team member isn’t causing problems. That deserves recognition.

Slide 59

Slide 59 text

Thank you! Thank you very much for your attention.