Right the First Time

Shawn M Moore @sartak

Hi everybody. I’m Shawn Moore.

Best Practical

I work for Best Practical. I’m the director of engineering there. We’re based in Boston.

You might know Best Practical because we make Request Tracker,
or RT. As in rt.cpan.org or rt.perl.org.

You might know me personally from some of the stuff
I’ve put on CPAN. Not for the modules I’d prefer to be known for, though! :)

Craftsmanship

I’m here to talk about craftsmanship. It’s something I’ve thought a lot about in the eighteen years I’ve been programming. Well, mostly only the last five years or so. Before that I was happy just to get something working. Now the standards I apply to myself, and hence my output, are much higher.

Mark Dominus recently tweeted this. He put it in quotation marks, but as far as I can tell the quote is original to him, from almost ten years ago now. So I bet Dominus is a really smart programmer by now. The implication here is that it always matters, except for, like, one-liners. And new programmers, no matter how smart, vastly underestimate how important it is to do it right the first time, because of how frequently bad code will come back to haunt them.

Don’t do this with your code. When you’re working and spot some nearby problematic code, take a little bit of extra time to fix it the right way. You’ll be happier in the long run, and it pays dividends. If the project’s standard is to accept sloppy code, sloppy code is what you get!

Code that looks like through some divine inspiration you got it exactly right the first time
Bernie Cosell – Coders at Work

There’s a quote I like from the book Coders at Work. “I want every routine you work on to look as if it was just written. I do not want to see any evidence of afterthoughts or things gone wrong followed by something to correct the error or a mysterious piece of code saying, ‘This routine returns the wrong value every now and then so I've got to fix it.’ I don't want to see any of that. I want to see code that looks like through some divine inspiration you got it exactly right the first time.”

Say as you’re working on something you realize a better way of doing it. It happens all the time because you understand the big picture better as time goes on. You damn well better circle back and improve the existing code to use that new style too. You’ll see examples of this done poorly EVERYWHERE YOU LOOK. How many times have you come across some code that follows an otherwise-deprecated structure that had been almost but not completely replaced by a new one? Now you have to keep two separate systems in your head at the same time. Good luck with that. It’s such a great feeling to remove the last user of a module and so be able to get rid of it.

So when you install a shiny new pole, go back
and move the clock that was there behind it. Divine inspiration looks like you somehow already knew the perfect place for the clock in advance. Sloppiness is instead writing in magic marker on the pole and then you become a slide that people laugh at.

understanding > writing

You spend waaaayyyy more time reading and debugging code than writing it. So while it’s tempting and expedient, don’t fall for the false economy of abbreviated or sloppy names, cute shortcuts, skipped comments, etc. Anything you can do to make your code just a tiny bit clearer will pay off every single time someone interacts with it.

Naming

One of the most important places you can spend your time improving understanding is in choosing good names for your API. Names shape how developers think about, and come to understand, your API.

wantarray()

wantarray is a good example of a bad name in Perl. It tells you about the context your sub was called in: it returns true in list context, false in scalar context, and undef in void context. Yet the name refers to wanting an array, and there’s no such thing as array context. Everywhere else in Perl, arrays and lists are treated as distinct things, but wantarray is one glaring exception that makes it just that much harder to grok Perl. perldoc perlfunc even says “This function should have been named wantlist() instead.” That sucks because in order to really understand Perl, it’s vitally important to really understand context, and this name makes that a tiny bit harder.
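To make the three contexts concrete, here is a minimal sketch. The sub name calling_context and the @seen array are made up for illustration; only wantarray itself is Perl's.

```perl
use strict;
use warnings;

my @seen;   # records the context of every call, so void context is observable

# Hypothetical helper: name the context this sub was called in.
sub calling_context {
    my $want = wantarray;   # true: list, defined but false: scalar, undef: void
    my $context = $want ? 'list' : defined $want ? 'scalar' : 'void';
    push @seen, $context;
    return $context;
}

my @list   = calling_context();   # list context
my $scalar = calling_context();   # scalar context
calling_context();                # void context, return value discarded

print join(', ', @seen), "\n";    # list, scalar, void
```

Note that nothing here involves an array as such: assigning to @list puts the call in list context, which is exactly why wantlist would have been the better name.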

Guide developers with your APIs

This is what I strive for when I architect something. You want to build APIs that steer developers in the right direction. Ideally they couldn’t possibly use it incorrectly. Apple does a really great job of this with its APIs. There’s a consistency and clarity across everything from Foundation to UIKit that just makes you feel like a wizard who can do anything. (At least for the first few months, until reality settles in.) There’s an obvious, straightforward way to do things. APIs are orthogonal and work together. A bad API is one you fight with, one whose documentation you have to look up every single time you use it, and one you quickly come to use only reluctantly.

addslashes() mysql_escape_string() mysql_real_escape_string()

Here’s a wonderful example of doing this poorly. How many developers do you think stopped at mysql_escape_string instead of going on to mysql_real_escape_string, leaving themselves vulnerable to SQL injection? By the way, the right answer is not to escape strings yourself but to use parameterized queries; that way there’s never any question whether escaping happened correctly. Thankfully we’ve more or less settled on this in Perl. At least as far as I’ve ever seen. Disclaimer: I haven’t seen your codebase.

    extends 'Super::Class'
    has 'attr'
    override 'method'

I think Moose and PSGI do a good job of this. The proliferation of Moose-lites, and even things like Class::Accessor ‘antlers’, is a testament to that. And how quickly everyone switched to PSGI and immediately reaped the benefits is largely because it had the right API. HTTP::Engine was a similar idea a year earlier with a clunkier API and so it saw very little uptake.

while (<>) { … }

I think this is a good example of an API that doesn’t guide you in the right direction. It’s SO easy to use, which is exactly the kind of thing that gives Perl its ubiquity and longevity. But it has a pretty fatal flaw: it opens each argument with two-argument open, so a file name that ends in a pipe character is run as a shell command. This is a security hole waiting to happen. So I’m reluctant to use it.

    $ ls -l
    -rw-r--r-- bar
    -rw-r--r-- foo
    -rw-r--r-- rm -rf * |
    -rwxr-xr-x script.pl
    $ cat script.pl
    #!/usr/bin/perl
    while (<>) { }
    $ ./script.pl *
    Can't open script.pl: No such file or directory at ./script.pl line 2.

The script opens bar for reading, opens foo for reading, opens “rm -rf *” with a shell because of the |, and then by the time it tries to open script.pl for reading, it’s been deleted.

while (<<>>) { … }

This is how you fix it: use the double diamond operator, added in Perl 5.22. I think the solution of adding a new safe version is kind of like our mysql_real_escape_string. Gross. I would have fixed the existing operator and added a new unsafe version. That’s probably why I’d be a terrible pumpking. After all, that ship sailed decades ago; we can’t change the existing behavior. Victim of our own success.
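As a rough sketch of what using the double diamond looks like in practice, the helper below counts lines across a list of files. The sub name count_lines is hypothetical, and the code assumes Perl 5.22 or later.

```perl
use v5.22;      # <<>> needs Perl 5.22 or later; this also enables strict
use warnings;

# Hypothetical helper: count the lines in the named files.
# <<>> treats every entry of @ARGV as a literal file name, so a name such as
# "rm -rf * |" is opened as a file instead of being handed to a shell.
sub count_lines {
    my @files = @_;
    return 0 unless @files;     # with an empty @ARGV, <<>> would read STDIN
    local @ARGV = @files;       # <<>> iterates over the files named in @ARGV
    my $count = 0;
    while (defined(my $line = <<>>)) {
        $count++;
    }
    return $count;
}

print count_lines(@ARGV), "\n" if @ARGV;
```

Run against the directory from the previous slide, this would simply count the lines of the oddly named file like any other, and script.pl would still be there afterwards.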

SYNOPSIS

    use utf8::all;                      # Turn on UTF-8, all of it.

    open my $in, '<', 'contains-utf8';  # UTF-8 already turned on here
    print length 'føø bār';             # 7 UTF-8 characters
    my $utf8_arg = shift @ARGV;         # @ARGV is UTF-8 too (only for main)

So. One trick for achieving an API that guides developers to do things right is to write the synopsis for your module or feature first, before you do any coding. It’s documentation, but it’s one screenful of real, runnable code that introduces your API. Be optimistic in that synopsis. Then try to achieve it. You might not always get there, but it’ll serve as a good guide. The synopsis is the first, and sometimes only, thing I look at when I judge whether I want to use a module. If there are irrelevant details I’m way more reluctant to use that module. That synopsis should serve as your first unit test too.

Remove doubt from your code
Brent Simmons – How Not to Crash

Remove doubt from your code by using proper names. Make it clear that you aren’t vulnerable to the rm issue by using while (<<>>). Use (not necessarily fatal) assertions or parameter validation to enforce your expectations, and to communicate them to maintainers. Be strict in what you accept. Explicit is better.
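Here is one way fatal validation and a non-fatal assertion might sit side by side, as a sketch only. The average sub is a made-up example; Carp and Scalar::Util ship with Perl.

```perl
use strict;
use warnings;
use Carp qw(carp croak);
use Scalar::Util qw(looks_like_number);

# Hypothetical example: average a list of numbers, stating expectations up front.
sub average {
    my @numbers = @_;

    # Fatal validation: these calls can never produce a meaningful answer.
    croak 'average() needs at least one number' unless @numbers;
    for my $n (@numbers) {
        my $shown = defined $n ? "'$n'" : 'undef';
        croak "average() got non-numeric value $shown"
            unless looks_like_number($n);
    }

    # Non-fatal assertion: suspicious, but the program can carry on.
    carp 'average() called in void context; result is discarded'
        unless defined wantarray;

    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;
}

print average(2, 4, 9), "\n";   # 5
```

The checks double as documentation: a maintainer reading the top of the sub learns what inputs it expects without having to trace any callers.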

Consistency is important too. Similar things should look similar. Why are all these icons different from each other? I spent more time thinking about the sign itself than about the restrictions it lists. Did the designer add the no-cell-phones icon later, is that why the stroke is going the other direction? The stroke for no power (which, frankly, I’m not sure why that’s listed as a prohibited thing) is thinner, does that mean it’s only a little bit forbidden? Its circle is thicker though, which draws the eye, subtly suggesting it is more forbidden. Now I’ve got doubt. And hey, why is hamburger-with-cup the universal sign for food? What do they use outside of the West?

Now I’m confused and all of a sudden I notice that I’ve opened a bunch of tabs on Wikipedia articles. So… why not just use the Unicode character for ⃠, or at least copy and paste the first set of circle and line shapes in Photoshop? I’m taking this critique to a ridiculous length here. But actually, not really. These are all exactly the kinds of questions that subtle differences in similar code raise. So go that extra mile and make things consistent so that these irrelevant, useless questions would never even be asked. Remove doubt. As a fringe benefit, making them consistent will make it easier to refactor later on, too.

I’ll write comments later…

“I’ll come back after I’m done and fill in comments later…” NO! You need to be in the mindset you were in while writing the code to be able to explain it properly. After all, the whole point of comments is specifically to communicate what’s in your mind that isn’t already apparent from the code. Capture it while you can! Don’t wait until it’s left your mind in the hopes that it will come back. It won’t. At best you’ll end up just describing the code itself, which is not what comments are for. But usually what happens is you move on to the next thing and you never write those comments.

You’re trying to compress that very rich picture in your head for the reader on the other end
Bret Victor – Media for Thinking the Unthinkable

“[You] have a very rich picture in your head. And [you’re] trying to compress that picture to transmit over a very low-bandwidth channel, which is this stream of symbols. So it's a very lossy compression. And we've got the reader on the other end trying to decompress this picture into their own head. And it doesn't work very well.” So to make sure that there’s no miscommunication, you want to be very precise, more verbose than feels natural, and overly self-critical. I’d go so far as to say that you want to do this in all technical communication, whether it be code, comments, tests, documentation, IRC, etc.

git forensics

Your tricky code may be sensible today. But will it still be sensible when someone comes to look at it in a year? It’s not enough to think of your code as it exists right now. Consider how this code will look in six months, a year, six years. Surely you haven't ever encountered six-year-old code before. And if you have I'm sure every time it was just wonderful. Will your future maintainer have to perform git forensics to figure out why you did it this way instead of another? Instead of making them guess, just leave a comment! Write good commit messages too.

What separates programming from painting, writing, etc is that there is no finished product
Brent Simmons – How Not to Crash

It’s important to have a plan for that six-year-old code, and to know that the code you’re writing today will probably become six years old. How will the code you’re writing today be received six years from now? Put in five extra minutes today to save hours, maybe days, over that time span.

The most depressing thing… is a chunk of code… you dare not modify
Simon Peyton Jones – Coders at Work

I’m sure no one here has been in this situation before. Lately I’ve been striving to be in the exact opposite situation. I go well out of my way to become master over the code I’m writing. If there’s something I don’t understand, I spend time tumbling down the rabbit hole. If I catch myself thinking “gosh, I hope I don’t ever have to touch THAT code” then something’s not right in paradise.

Loosely coupled modules

When you do come across six-year-old code you might decide the best course of action is to rewrite it. Your job will be much easier if the old code was loosely coupled: using, and offering, only very small APIs. That also helps in understanding the codebase, since you can more readily understand each module in isolation, and then you can understand the connections between modules, and then you’re done. It’s very natural. Deep coupling means you have to understand everything globally before you can understand anything separately.

Sereal

Sereal is a good example of this. I am a pretty good programmer but I am not the wizard its authors are. There’s no way I could have written it. But I can certainly use it confidently, because it offers a very straightforward API, just encode_sereal and decode_sereal. If the code to encode data structures were in the ORM layer and the code to decode Sereal were in the frontend, it’d make everyone MUCH grumpier every time they saw it.

Being abstract is profoundly different from being vague
Edsger W. Dijkstra – An Introduction to Implementation Issues (EWD 656)

That leads me to my next point which is about abstraction. The whole point of abstraction is to build new tools you then use to work at a higher level in a precise way. When you’re interacting with a database, you don’t want to have to care about the file I/O operations that need to happen. You want to deal in the realm of SQL queries, which is a very precise system. There’s nothing vague about it. It’s just that details you don’t care about for the task you’re performing are handled behind the scenes. Dealing with the I/O yourself in order to do set operations in order to do SQL queries would be a disaster. There’s value in knowing how the database does its work, for example to create more efficient indexes, but that’s an expert-level concern. A specialization.

Good programmers are more facile at jumping between layers of abstraction
Peter Seibel – Coders at Work

I worked with someone once who was terrible at this. Every bugfix and feature would be done in the wrong place. It drove me absolutely mental and it was a serious drag on the entire project. It’d be like in the database example if instead of changing a column name in a SQL query he’d do a search-and-replace on the text stream as it comes in from the database file. In stark contrast, the student I mentored for the GNOME Outreach Program for Women, Upasana Shukla, was perfectly good at this. By the way she is a real pleasure to work with.

I’m a big fan of this idea called “gumption”. Basically, deliberate and focused self-improvement. Practice practice practice. Do things outside your comfort zone to expand your comfort zone. Be better. Never stagnate; never stop learning.

One technique I’ve been using lately is creating a todo list for each feature branch I’m working on. Here’s my OmniFocus project for a new feature that is nearly done. You can see there’s a lot of atomic next actions here: tests, documentation, questions to ask of others, investigations into existing code, tickets to create. Basically a continuous braindump of all the stuff I think to do as I’m working on the feature. Or even when I’m lying in bed trying to sleep and my subconscious, which has been chewing on the problem, reveals more insight. And I break down tasks into smaller chunks as I progress and understand the problem better. It’s almost fractal, but thankfully it does bottom out.

Thanks to this I have a very high level of
confidence that I’ve completely finished a feature. There are no loose ends, because I know every single consideration has been captured and dealt with. That leads to a sort of mastery. I never have anxiety that I’ve forgotten something. So it means I get my branches right the first time! By the way, read Getting Things Done. It’s fantastic. It’ll change your life!

Many people do this with a plain text file directly in the repo. I’ve mindmelded with OmniFocus so I use that, but anything that externalizes your intent is good. Use all your brainpower on solving each problem, rather than remembering the list of things to do. Divide and conquer. It’s also handy if you have to put the branch down for days, or months, and come back to it later. Usually you’ve forgotten everything so you have to research just to get it back into your brain. But with this, you already have your list of things you’ve done, and a list of what’s left to finish.

I’d shy away from using RT or JIRA for this because you'd generate a lot of email as you open and close tiny tickets. Even my tiny feature branch ended up with 23 tasks. It’s also more friction than you want. For example you’d want to assign each and every ticket to yourself. There’s also no “inbox”, which means capture requires a little bit more effort to locate the project and create a related ticket. But anyone who’s worked with me will tell you that I have a tendency to make thousands of atomic tickets in RT and JIRA anyway.

Programmers should not waste their time debugging, they should not introduce the bugs to start with
Edsger W. Dijkstra – The Humble Programmer (EWD 340)

“If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with.” Better names, clearer code, assertions, and APIs that guide developers all help to avoid introducing bugs.

Be careful, and don’t be optimistic. Optimists write bugs. Brent
Simmons – How Not to Crash

There’s a talk from a researcher at JPL about the code in the Curiosity rover project on Mars, and how they made it robust. This is important because any bug could instantly wreck a 2 billion dollar mission, not to mention jeopardize future funding. Of course consumer software requires nowhere near that level of reliability, but reliability in general is important for all software, so it’s worth at least knowing what organizations like NASA and JPL do. When you think of code for space programs, you probably think of teams of scientists producing three lines of FORTRAN a day. Any more than that would be too hasty! https://vimeo.com/84991949

But that’s not how they roll any more. That discipline and vigilance are difficult to maintain. The entire talk was about how JPL does code review, either by peers or by modern source code analysis tools. The vast majority of code review feedback led to changes in code, rather than being rejected or ignored. He claimed “This worked remarkably well”. Even if code review isn’t a policy for your entire company yet, I hope you can consider doing it in your individual team.

Shorten the conceptual gap between the static program and the dynamic process
Edsger W. Dijkstra – A Case against the GO TO Statement (EWD 215)

“Our intellectual powers are rather geared to master static relations and our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost best to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.”

Your program is more or less a gigantic text file. When you run your program, the text isn’t itself doing anything. The process that results from your program is a living, breathing beast. It’s important that the program and its process match as closely as possible, otherwise it’s very difficult to understand and modify the program to get the desired behavior out of the process.

It’s the difference between a recipe and its resulting dish. What you wouldn’t want in a recipe is nonsensical, extraneous steps. Like if you’re following a recipe and step 4 is flush your toilet. Why the hell is that in the recipe? Because the original guy had to take a leak while he was waiting for his noodles to boil. You might think that’s ridiculous, but it happens ALL THE DAMN TIME in code. “Oh I was trying to fix something so I just left that in there and didn’t bother to remove it when I figured out that the problem was somewhere else.” Use “git add -p” to select exactly the changes that do what you need and throw the rest away.

Add the debug code early and leverage it throughout the whole process
Casey Muratori – Working on The Witness

“If you’re going to debug one thing in a piece of code, you’re going to debug a bunch of things, so you might as well add the debug code early and use it throughout the whole process.” You’ll have better tools at the end of it. Your infrastructure team does a great job with this with its dashboards.

A work project involved making sense of, and porting parts of, a monstrously complicated spreadsheet. You feed it a few numbers, it runs a shitload of arithmetic, and spits out a few numbers. We basically use this as a specification since this gigantic XLS would not be appropriate to ship to end users. The thing’s a mess. There are dozens of tabs, and thousands of cells per tab, etc.

Inevitably something will go wrong with our calculations and we’ll need to investigate why. No one carries the spreadsheet in their head, nor could anyone ever hope to work backwards from the outputs to figure out which intermediate calculations went wrong. We need some way to surface all those bazillions of lines of arithmetic. So I thought hey, printf. I even spruced it up with some formatting and highlighting of like terms on hover. It was beautiful!

But it was utterly useless. I was the only one who could ever debug anything, because I was the only one who could correlate debug output with the spreadsheet, and every time it was very time consuming. So back to the drawing board… the whole point is we want to figure out where our calculations diverge from the spreadsheet’s. So I threw away both our implementation based on the spreadsheet’s documented “reference algorithm” (which is a euphemism for bullcrap) and the debug UI and produced something better.

So what I ended up with was more or less a straightforward Excel interpreter. I hew as closely to the spreadsheet’s format as possible. That way differences immediately jump out at you. Building the right tools like this not only dramatically cuts down the time for both debugging and customizations, but also lets the next programmer come along and see exactly what’s going on. The static program, in this case the spec, is INCREDIBLY similar to the dynamic process, which is surfaced in this way.
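The real interpreter isn’t shown in the talk, so the following is only a toy sketch of the idea under my own assumptions: every intermediate value is stored under its spreadsheet cell name, which makes divergence from the spreadsheet easy to report. The sheet contents, the sub names, and the reference values are all made up, and there is no cycle detection.

```perl
use strict;
use warnings;

# Hypothetical miniature sheet: a cell is either a constant or a formula,
# written as a code reference that receives a lookup function for other cells.
my %sheet = (
    A1 => 100,
    A2 => 0.07,
    A3 => sub { my $cell = shift; $cell->('A1') * $cell->('A2') },   # =A1*A2
    A4 => sub { my $cell = shift; $cell->('A1') + $cell->('A3') },   # =A1+A3
);

# Compute one cell, caching every intermediate value under its cell name.
sub cell_value {
    my ($sheet, $values, $name) = @_;
    return $values->{$name} if exists $values->{$name};
    my $def = $sheet->{$name};
    die "No such cell: $name\n" unless defined $def;
    my $lookup = sub { cell_value($sheet, $values, $_[0]) };
    return $values->{$name} = ref($def) eq 'CODE' ? $def->($lookup) : $def;
}

# Evaluate the whole sheet and return a hash of cell name => value.
sub evaluate_sheet {
    my ($sheet) = @_;
    my %values;
    cell_value($sheet, \%values, $_) for sort keys %$sheet;
    return \%values;
}

# List the cells whose computed value differs from the spreadsheet's value.
sub diverging_cells {
    my ($computed, $expected, $tolerance) = @_;
    $tolerance //= 1e-9;
    return grep { abs($computed->{$_} - $expected->{$_}) > $tolerance }
           sort keys %$expected;
}

my $computed   = evaluate_sheet(\%sheet);
my %from_excel = (A1 => 100, A2 => 0.07, A3 => 7, A4 => 108);   # made-up reference
print "Diverges at $_ (ours: $computed->{$_}, spreadsheet: $from_excel{$_})\n"
    for diverging_cells($computed, \%from_excel);
```

Because the report names the cell rather than a line of arithmetic, anyone with the spreadsheet open can go straight to where the two disagree.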

Laziness Impatience Hubris

Larry Wall said these are the three great virtues of a programmer. You’ve probably read Programming Perl but maybe for some of you it’s been a while. So let’s review. Laziness: “The quality that makes you go to great effort to reduce overall energy expenditure. It makes you […] document what you wrote so you don't have to answer so many questions about it.”

Impatience: “This makes you write programs that don't just react to your needs, but actually anticipate them.”

Hubris: “The quality that makes you write (and maintain) programs that other people won't want to say bad things about.”

svn ci -m 'hack hack hack'

The days of just putting whatever you happened to have changed into the repository are over. We could get away with it back in the Subversion days because the tooling wasn’t as good. Git doesn’t suck. Spend the time to get good at it. After all it’s an API to the complete history of your codebase. Which is pretty damn useful.

Atomic commits rebase -i add -p

I make atomic commits. Each commit changes the smallest thing it possibly could, without breaking the codebase. That way they’re easy to review, easy to bisect, and easy to reorder. It helps you form a narrative. Each commit is a baby step toward your goal. You want to get really good with git rebase -i and git add -p. These let you produce good commit history which is an investment in future maintenance. I don't know if your policy allows push --force to rewrite published history but even if it doesn't, holding back your pushes until you're a bit more confident about the code change can go a long way.

log bisect blame

The git history is important because it gives you context for the changes that you’ve made. It’s one more piece of the puzzle in figuring out why code exists as it does today. If, say, you’re debugging a piece of a subroutine and see that when it was committed, you hadn’t yet converted to PSGI, that’s a pretty big hint that that subroutine should be updated to take PSGI into account. That sort of thing. Blame lets you see the commit that each line of code comes from. Hugely useful. Bisect lets you binary search over commits to find the one that introduced some change.
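To show why bisecting is so cheap, here is a sketch of the binary search idea in Perl. It is not how git implements bisect; the sub name, the commit names, and the is-bad callback are all made up, and it assumes every commit after the first bad one is also bad.

```perl
use strict;
use warnings;

# Given commits ordered oldest to newest, return the first one for which
# $is_bad returns true, or undef if none is bad. Assumes that once a commit
# is bad, every later commit is bad too.
sub first_bad_commit {
    my ($commits, $is_bad) = @_;
    my ($low, $high) = (0, scalar @$commits);   # first bad index is in [$low, $high]
    while ($low < $high) {
        my $mid = int(($low + $high) / 2);
        if ($is_bad->($commits->[$mid])) {
            $high = $mid;        # culprit is here or earlier
        }
        else {
            $low = $mid + 1;     # culprit is strictly later
        }
    }
    return $low < @$commits ? $commits->[$low] : undef;
}

my @commits = qw(c1 c2 c3 c4 c5 c6 c7 c8);
my %broken  = map { $_ => 1 } qw(c6 c7 c8);     # pretend the bug arrived in c6
my $checks  = 0;
my $culprit = first_bad_commit(\@commits, sub { $checks++; $broken{ $_[0] } });

print "$culprit found after $checks checks\n";  # c6 found after 3 checks
```

Eight commits take three checks, and doubling the history adds only one more, which is also why small atomic commits pay off: the commit you land on is small enough to read.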

Start commit messages with a verb
Reference ticket ID?

One small way you can improve your commit messages is to start them with a verb. “Added x”. “Removed y.” “Refactored z.” That way they make sense as you read them in a log, or rebase, or any other context. It also helps you kickstart writing your commit message. Another idea that helps tremendously is to put a ticket ID right into the commit message. Anything you can do to provide more context for future maintainers helps. That way when you dive into new code you have a better chance of understanding the whats and the whys.

Don’t run with scissors. Don’t even touch these scissors. They
have a bunch of poison on them. Brent Simmons – How Not to Crash

Billy, don’t be a hero. You don’t get bonus points
for being a hero. You know what’s actually really valuable that doesn’t get enough credit? Creating simple, readable code that anyone can modify. I don’t know what the culture at your company is like but I try to be very mindful in noticing when a team member isn’t causing problems. That deserves recognition.

Thank you! Thank you very much for your attention
