Thinking of Documentation as Code [YUIConf 2013]

Hi, everybody. I’m Evan Goer. It’s great to be back!

A little background: I’m a former technical writer, now a frontend engineer at Intuit, makers of fine products that help people deal with their financial lives, such as TurboTax, QuickBooks, and Mint. Oh, and I also wrote a book!

So! Thinking of documentation as code. Or the alternative title…

So I’m including this silly title for a serious reason. We all naturally and rightly want to make fun of the idea that there even is such a thing as a simple “tip” or “trick” for something as complex as software development, or writing documentation. As many of you know, I’m not a big fan of airy statements about writing like, “Be clear,” or “Write short sentences,” or airy statements about software engineering, for that matter. I mean, clear writing! Well-crafted software! Who doesn’t want that? But these are just platitudes. How exactly do I “be clear” when I’m writing? How do I write “well-crafted software”? What is that, anyway? These statements might make us feel good, but other than that, they’re not helpful.

What would be helpful is a guideline, a heuristic, a razor. Like Occam’s Razor: “Among competing hypotheses, select the one with the fewest assumptions.”

So let’s start with a fable. You and a couple of teammates are asked to take on a legacy project, and after investigating it, you realize that it has no tests at all. Oh no! The three of you are a bit despondent over this, when your boss walks in. He says, “Don’t worry, we’ll get these tests done. In fact, I have this great unit testing solution for you. I’ve been talking to this vendor, and they have this awesome ‘unit test repository’ for storing tests! It’s specially designed to hold tests. It comes with its own internal, specialized version control system. Isn’t that great? Oh, and the repository has all sorts of custom workflow rules and access rights we can set up. All built in! Also, they have this special proprietary tool for writing these tests. You have to edit tests using this tool, but don’t worry – it’s really, really productive! The three of you are going to love it.”

What would your reaction be? Imagine the three of you in the room hearing this spiel. Raise your hand if you think this sounds like a great idea.

This is how I imagine most software engineering teams reacting. You wouldn’t accept any of this for your tests, would you? It’s insane. And yet people are really quick to store their documentation in all kinds of strange places and deal with documentation source in all sorts of strange ways that would never fly for source code. Or tests or build scripts.

And so here’s our heuristic, our razor. Like Occam’s Razor, it’s not a platitude to make us feel good, it’s a tool to help make a decision when you have lots of messy, expensive options to choose from. “To first order, software documentation is like code.” I would love to take the credit for this, but I can’t, because I have no doubt that the idea is not original to me. Today we’re going to see what it implies (and where it doesn’t quite work).

We have to be careful. No heuristic is perfect. They can be overly simple, or just wrong. I’ll start off by making it clear: if you have a specialized team of technical writers or support personnel who are fully responsible for writing docs, you can throw everything I’m about to say out the window. Everything I’m saying from here on out only makes sense if you have software engineers writing all of the docs, or at least some of the docs themselves.

So another way of stating this heuristic is, treat your documentation like any other “meta-code” related to your project. The great thing about this heuristic is that it’s easy to apply, because you have years of experience thinking about software code. We’ll see how it helps us quickly zero in on the right answer for documentation.

Start with an obvious one: where should we store our documentation source? Maybe a specialized knowledge base? Maybe a wiki sounds reasonable? Raise your hand if you store your documentation source somewhere like that.

Let’s apply our heuristic. To first order, software documentation is like code. Where do we store code? We store our unit tests in a hosted git repo, for the same reasons we store our library code in a hosted git repo. We get very sophisticated version control of course, but also distribution, collaboration, backup. And the reason we do this is, version control systems for software developers have been evolving for decades, under intense scrutiny. A modern version control system is the beneficiary of this evolution. Specialized repositories for documentation generally have watered-down, enterprisey, poorly reinvented versions of these features. YOU, as software developers, don’t need or want any of that stuff when you have real version control at your disposal.

As soon as you leave documentation outside of your repo, not only do you degrade critical features – that’s not even the worst thing. The worst thing is that you destroy your normal everyday engineering workflow. Editing code is normal, editing docs…

Does it make sense to create documentation source in a binary format? It sounds a bit crazy, but yes, people do this all the time. Let’s apply our heuristic again. To first order, software documentation is like code. And once again, once you frame the decision that way, you know the answer immediately. Why do we write our code as plain text? So you can edit it in vim, emacs, Sublime Text, an IDE, whatever makes sense for you. So you can bring to bear all of the utilities we’ve developed over the last few decades for searching, inspecting, and manipulating text. So you can deal with it sanely in version control. If you choose a proprietary, binary format for your docs, once again, you blow apart your normal engineering workflow. Now everyone is forced to use a specialized tool to make changes, which means everyone is less productive (assuming they even have the right tool in the first place).

As software engineers, we know that knocking out reams and reams of code is not necessarily a good thing. Who’s looked under the hood of healthcare.gov? Raise your hand. We know that the more code we write, the more our programs slow down, the more bugs they will have, the harder they will be to maintain. The same is true for documentation. In fact, it goes double for documentation.

All along our heuristic has been, “To first order, documentation is like code.” Let’s shine some attention on the second-order effects, where the analogy breaks down. Documentation is like code… except that documentation is much, much harder to regression test. You can test a few things in an automated fashion: you can run spellcheck. Seriously, run spellcheck! You can factor out your code examples as separate files, and run them automatically. And documentation generators like Doxygen and Javadoc at least get the basic structure right. But the only way to keep documentation correct and up to date is manual review by humans, and lots of it. So documentation is more expensive than code. Which leads us to a corollary…
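As a rough illustration of the "factor out your code examples and run them automatically" idea, here is a minimal sketch in Python. It assumes the examples live as standalone `.py` files in one directory and that an example "passes" if it exits with status 0; the function name `run_examples` and the directory layout are hypothetical, not something prescribed in the talk.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_examples(examples_dir: Path) -> dict[str, bool]:
    """Run every *.py file in examples_dir; map file name -> exited with 0?"""
    results = {}
    for example in sorted(examples_dir.glob("*.py")):
        proc = subprocess.run(
            [sys.executable, str(example)],  # same interpreter as the caller
            capture_output=True,
            text=True,
            timeout=30,
        )
        results[example.name] = proc.returncode == 0
    return results


if __name__ == "__main__":
    # Demo with two throwaway example files: one that works, one that fails.
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp)
        (docs / "good.py").write_text("print('hello')\n")
        (docs / "broken.py").write_text("raise SystemExit(1)\n")
        for name, ok in run_examples(docs).items():
            print(f"{name}: {'ok' if ok else 'FAILED'}")
```

A check like this only tells you that an example still runs, not that the prose around it is still true, so it reduces the manual review burden without removing it.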

I used to run into this situation all the time back when I was a technical writer. I’d be writing the documentation for a new tool, and I’d wonder why there was so much work to do. I’d go over to the engineer and say, “This section is getting really huge. Are you sure we need to do all this?” And the engineer would say, “Oh. No, we could make this simpler.” And poof! Just like that, my job would get twice as easy.

Writing lots of documentation is a code smell. Part of a technical writer’s job is to spot this and bring this to your attention. As a software engineer, you should be looking for it too. Are you finding yourself writing complicated install documentation? Well, why is your installation process so complicated? Maybe you could reduce it down to three steps? Or one? Are you finding yourself writing lots of API docs? Well, why does your API have such a huge public surface? Could you simplify it? Does the world really need all those methods?

BTW, the absolute worst when it comes to documentation is dev environment setup docs. Holy crap those are bad. In fact, I’d like you to take a pledge. Everyone please raise your hand & repeat after me.

Among my friends, it’s become a bit of a running joke that I hate wikis. The truth is, I don’t have a problem with wikis as a useful general-purpose tool. My issues start with using wikis for technical documentation.

Wikis are general purpose, which means they usually lack the semantics and features that a documentation nerd like me looks for. This isn’t to say that you couldn’t have a wiki that supports those features. But most don’t.

Another problem is editing. Typing into a browser textarea all day sucks. And again, wikis could fix this. Today, we have full-blown browser-based coding environments that are… actually decent. A wiki could use one of those, rather than some random jQuery plugin. Or the wiki could make offline editing easy. And some wikis do.

The third and biggest problem with wikis is that now docs are siloed outside of your repo, and they use a different workflow. That is super, super bad. But again, you could imagine a wiki that tied its source very closely with your repo. Maybe the wiki is a kind of alternative “view” on a big directory of Markdown or reStructuredText files.

So in theory, you could build a wiki that answered my concerns. GitHub wikis are actually pretty close. They are tied to your repo! They support some extensions to Markdown that are suited for documentation. For lightweight docs, they’re… not bad.

User comments. Don’t do this. Imagine if you treated your code that way. “We’re going to allow anyone in the world to append methods into our module.” Again: treat docs as a first-class citizen. Harvest contributions the same way you do with code.

You might think this is a niche-y issue, but smart people keep making this mistake over and over. The screenshot above is of AngularJS. Angular is a really cool project, worth checking out. But the AngularJS documentation is not exactly the project’s strong suit, and the comments are not helping at all. They consist of confusion, of people complaining about the docs. Lots of noise, very little signal. That stuff belongs in issues or IRC, not in the final product.

Or React.js. Really interesting new library from Facebook. Doing really innovative things with performance. Very smart people working on it. And of course, every doc page allows Facebook comments. These are even worse than AngularJS, which at least uses DISQUS.

Old school. PHP.net. Longtime dedicated doc team. They have a hard job to do, and I think the core docs are really solid. But there’s lots of lousy, unsafe, deprecated advice in the user comments.

Thinking about your docs holistically means you’ll never make this mistake. Don’t allow user comments. Ever.

Because documentation is so expensive, you want to add it judiciously. You need a balance. What to write, and what not to write. Your code covers the what. If your code is well written, it will illuminate the how. But even good code can fall down on covering “how”, and code rarely does a good job of explaining the why. That’s where documentation really shines. Code covers the stuff on the left. Documentation covers the stuff on the right.

It’s still reasonable to write documentation that covers the “what”. A lot of people, particularly casual and new users, are never going to read your source. You’re going to need to write some docs to get them over the initial hump. That’s okay. But what you really want to focus on are the second two. “How do I do X with this library?” “How does Y work, exactly?” And even more than that, the “Why”. “Why would I call this method? What were the designers thinking when they wrote this, what did they intend?”

Focusing on “Why” and “How” will help you avoid writing terrible API documentation. The worst documentation is API documentation that simply restates the name of the method. “getConnection(): gets the connection.” Shoot me now! Ask yourself: why and when do I call this? How does it work with other methods? Etc.
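To make the contrast concrete, here is a small sketch in Python of the same method documented both ways. The `ConnectionPool` class, its methods, and its behavior are all hypothetical, invented only to show a docstring that restates the name next to one that covers why, when, and how.

```python
class ConnectionPool:
    """Hypothetical pool, used only to contrast two docstring styles."""

    def __init__(self, size: int = 2) -> None:
        # Stand-in "connections"; a real pool would hold sockets or handles.
        self._free = [f"conn-{i}" for i in range(size)]

    def get_connection_restated(self) -> str:
        """Gets the connection."""  # restates the name, teaches nothing
        return self._free.pop()

    def get_connection(self) -> str:
        """Borrow a connection from the pool.

        Why: opening a connection is slow, so callers share a fixed set
        instead of opening one per request.
        When: call this right before you need the connection, and hand it
        back with release() as soon as you are done.
        How it fails: raises IndexError if every connection is in use.
        """
        return self._free.pop()

    def release(self, conn: str) -> None:
        """Return a borrowed connection so other callers can reuse it."""
        self._free.append(conn)
```

Both methods have identical bodies; only the second docstring tells a reader something the signature does not already say.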

Given that we’re duplicating information when we cover the “what” and “how”, how can we put on our developer hats and apply the principle of DRY (don’t repeat yourself) to documentation? One obvious example is Javadoc and other documentation generators. As I mentioned before, at the very least that gets you the basic object/method structure right, even if the comments turn out to be a pack of lies.

Do you have code examples? Could those code examples be broken out as separate files? Could those separate example files be part of your unit test suite?

What about docs for command line tools? I love it when command line tools generate their usage statements from their own argument parsing code. I’m sure some of you are allergic to fancy arg parsers, “that’s way more than I need,” but that ability to tie docs & code together is really powerful and good.

Of course DRY concepts in documentation can be much simpler than any of that. A humble cross-reference (“for more about configuring widgets, refer to section …”) is a form of DRY documentation. Or a glossary entry that automatically links back to its definition. Simple little building blocks like these can reduce duplication and errors.
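As one example of usage text generated from argument parsing code, here is a minimal sketch using Python's standard `argparse` module. The tool name `builddocs` and its options are hypothetical; the point is that the help text is derived from the same declarations that do the parsing, so the two cannot drift apart.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the CLI once; argparse derives usage and help from it."""
    parser = argparse.ArgumentParser(
        prog="builddocs",  # hypothetical tool name
        description="Build project documentation (illustrative only).",
    )
    parser.add_argument("source", help="directory containing the doc source")
    parser.add_argument(
        "--format",
        choices=["html", "man", "pdf"],
        default="html",
        help="output format (default: %(default)s)",
    )
    return parser


if __name__ == "__main__":
    # Print the generated help instead of maintaining it by hand.
    print(build_parser().format_help())
```

Adding a new option to `build_parser()` updates both the parser and the printed help in one place, which is the DRY property described above.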

And finally, speaking of builds, people want to consume your documentation in all sorts of different ways. I mean, man pages are awesome! How do we get there? HTML is a good target format for documentation. It’s not the best authoring format. Most of you know this – you write your READMEs as Markdown, not HTML. Writing docs in HTML is kind of a pain, even for frontend engineers!

The key is to author in an abstract format that is designed to be built into other formats, like reStructuredText or Markdown with some extensions. Not only are these lightweight markup formats easier to read & author, but they are unopinionated about the end target format, whether it’s HTML or something else. For example: this construct isn’t an HTML hypertext link, this is a cross-reference. And if you render it as PDF, it’ll say, “refer to Page 117,” which means it’s still usable when printed. Or, this construct isn’t a specially formatted red div, this is an admonition, a caution – which again, you can display differently in a web page vs. a man page vs. a printed page.
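The idea of a semantic construct that renders differently per target can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how reStructuredText or any real toolchain is implemented: the `CrossRef` and `Admonition` classes, both render functions, and the page-number table are hypothetical, and HTML escaping is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class CrossRef:
    """A semantic cross-reference: names a section, not a URL."""
    target_id: str
    title: str


@dataclass
class Admonition:
    """A semantic callout such as a caution, independent of styling."""
    kind: str
    text: str


def render_html(node) -> str:
    """Render a node for the web: links and styled divs."""
    if isinstance(node, CrossRef):
        return f'<a href="#{node.target_id}">{node.title}</a>'
    if isinstance(node, Admonition):
        return f'<div class="admonition {node.kind}">{node.text}</div>'
    raise TypeError(f"unsupported node: {node!r}")


def render_print(node, page_numbers: dict[str, int]) -> str:
    """Render a node for paper: page references and plain-text labels."""
    if isinstance(node, CrossRef):
        return f"{node.title} (refer to page {page_numbers[node.target_id]})"
    if isinstance(node, Admonition):
        return f"{node.kind.upper()}: {node.text}"
    raise TypeError(f"unsupported node: {node!r}")
```

For instance, `CrossRef("widgets", "Configuring widgets")` becomes a hyperlink under `render_html`, while `render_print` with `{"widgets": 117}` turns the same node into a page reference.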

This is just a sample of the different ways you can use the heuristic. Not only can it help you avoid wasting lots of time on ratholes (like user comments), but it opens up possibilities. Documentation isn’t some alien thing; many of the concepts that we use every day in software engineering apply just as well to documentation.

That’s it. You can reach me at my email address, or “evangoer” on Twitter or any channel you can think of. Thank you!


Evan Goer
