Finally! Trustworthy and Sensible API Documentation with GraphQL

Finally! Trustworthy and Sensible API Documentation with GraphQL @gjtorikian
ARE YOU IN THE AUDIENCE AND LOOKING AT THESE NOTES? IF SO, HELLO!

Finally! Trustworthy and Sensible API Documentation with GraphQL @gjtorikian
Heya! My name is Garen Torikian, and today I want to talk to you about API documentation. I know that this slot is a bit rough because it's right before lunchtime, which is one of my three favorite times of day. But I hope you'll stick around and find this talk useful.
The focus of this talk is going to be on the "Finally!" and "Trustworthy" portions of the title. But before I get into the details, I want to introduce myself.

I am a programmer @gjtorikian html-proofer jekyll & nanoc plugins markdowntutorial.com …other stuff!
I'm an engineer working at GitHub.com. I work on the API that supports the GitHub Platform.

I was a technical writer @gjtorikian
But before that, I was a technical writer. I actually started the documentation team at GitHub almost five years ago, and before that I had been writing technical documentation for nearly a decade. And the truth is, I love writing. I sort of miss doing it as part of my full-time job, and I'm excited every time I get to dive in and explain some computer system with words. My move into full-time programming is a relatively new thing. It was only about two years ago that I decided to do more coding than explaining. And what I realized is that after I became an engineer, I was still interested in the same types of problems I was interested in while I was doing documentation.

@gjtorikian Writing is easy. Maintaining is hard.
This is a controversial slide, I know. I like to tell people that it's easy to write documentation. Anyone can write something down. They may not be able to write it down well, but they can write it down. The hard part is maintaining documentation: making sure that it stays accurate and true as the software evolves. Part of the reason I wanted to move from documentation into engineering was to focus on keeping documentation accurate. I reasoned that if I could concentrate on documentation accuracy from an engineering perspective, I could help mitigate some of the frustrations that I had experienced as a writer.

I'm interested in how we write @gjtorikian
But what I realized when I took a step back from writing was that I am, and have been, obsessed with how we write. Not just which word is the mot juste, but also the processes and software that we use to do our documentation. And eventually even how our documentation is consumed. And this is something that affects both engineers and technical writers. Engineers who aren't writers don't know how to provide useful documentation, and writers who aren't engineers sometimes aren't given the best tools to do their work.

What do API docs look like? @gjtorikian
I want to talk about how API documentation is written, and what the history of that process has been as far as I've seen it. I want to talk about why it's traditionally been untrustworthy, and more importantly, I want to talk about the ways people are now rethinking how these technical systems are built, and how those changes are benefiting documentation teams and users.
I want to look at where we were, where we have been, and where we're going. But just so that we all have a shared understanding of what I'm talking about, I want to quickly define what I mean by API and documentation.

the best way to build and ship software
A P wha? @gjtorikian Bank
Let's pretend this blob right here is a bank. I can put up a little label there that says bank. You can tell that it's a bank now because it has columns and looks very bureaucratic.
Let's say you want to take money out of a bank. It's Saturday, you have a lot of time to waste, so you decide to skip the ATM and stand in line. What happens when you get there? The bank teller doesn't just hand you the keys and say, "All right, your money is kept in the back left of the vault, good luck!"

A P wha? @gjtorikian Customer Bank Money Ask
In reality, it works a bit more like this. You approach the bank and ask for some money. You'll fill out a form or a slip, you'll write down your account number, how much you want to take out, and you'll sign the form and hand it to the teller.

A P wha? @gjtorikian Customer Bank Money Get Ask
The teller, acting as the bank, goes to the vault and gets the money.

A P wha? @gjtorikian Customer Bank Money Receive Deliver Ask Get
And then the money comes back to you, through the bank.

A P wha? @gjtorikian Receive Deliver Ask Get Customer Bank Money
And that's pretty much the flow for how you can get money from a bank.

A P wha? @gjtorikian User Program/Server Data Receive Deliver Ask Get
So an API isn't very different from that scenario. The user, the person sitting at a computer somewhere, makes a request for some data. That request could be made to a program running on your computer, or you could be asking an external website, like GitHub, Google Maps, Wikipedia, whatever. The server goes and fetches the data you asked for, and then hands it back to you.

@gjtorikian User Program/Server Data Receive Deliver Ask Get A P wha?
The API is the layer here that facilitates the "Ask" portion of the flow. It's sort of a gatekeeper between the user and the data.
This kind of system is becoming more and more common. The more stuff we put online, the more people realize that they want to fetch that stuff and do things with all that public data. Rather than companies deciding what data users should and should not have access to, and instead of just granting them access to everything, an API determines how you can get at information that's stored somewhere else in a safe and sanctioned way. Essentially, it's a contract between you and data.
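To make the ask/get/deliver/receive flow a bit more concrete, here is a minimal Python sketch of the bank analogy. The `Bank` class, its `withdraw` method, and the account data are all made up for illustration; they don't correspond to any real API.

```python
class Bank:
    """Hypothetical server: the vault is private, `withdraw` is the API."""

    def __init__(self):
        # The "vault": data the customer never touches directly.
        self._vault = {"12345": 100}

    def withdraw(self, account_number, amount):
        """Ask: the only sanctioned way to get money out of the vault."""
        balance = self._vault.get(account_number)
        if balance is None:
            raise KeyError("unknown account")
        if amount <= 0 or amount > balance:
            raise ValueError("invalid amount")
        # Get: the bank, not the customer, goes into the vault.
        self._vault[account_number] = balance - amount
        # Deliver: the money comes back through the bank.
        return amount


bank = Bank()
cash = bank.withdraw("12345", 40)  # Receive
```

The customer only ever sees `withdraw`. How the vault is laid out can change without the customer noticing, as long as the form they fill out stays the same.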

Types of documentation @gjtorikian
The other part of preliminary information is what I mean when I talk about documentation. Not all documentation is written the same. I'm a little old school, and the way I learned it was that there essentially were three main types.

Reference @gjtorikian Terms, glossaries
There's reference documentation. In a technical writing context, that might be defining words or phrases that are unique to your software. When describing an API, these would be the functions or methods that you would need to call to get access to your data.

Conceptual @gjtorikian What is…?
There's conceptual documentation, which can describe a technical system, what it's used for, or what it does. For example, explaining how Git works, or describing how photos are uploaded and stored online.

Procedural @gjtorikian How do I…?
And there's procedural information, which is usually some kind of ordered list explaining how to accomplish something in a series of steps. This'll be something like a step-by-step guide to opening a pull request or instructions on how to change your password.

Types of documentation @gjtorikian Reference Conceptual Procedural
Of the three choices, today, when I'm talking specifically about API documentation,

Types of documentation @gjtorikian Reference Conceptual Procedural
I'm really only going to be concentrating on the first kind, reference documentation. So, you have a server sitting somewhere, there's some information you want to collect, and you want to know how you can make the request. The kind of documentation I'm talking about is the kind that explains to you what the function calls are, what they do, what arguments they take, and so on. It's a reference sheet, a dictionary, that enables you to speak the language that the server will understand.

Reference docs are the most important @gjtorikian
When it comes to APIs, reference documentation is the most pertinent form of documentation, period. Let me go back to that bank analogy again. If you don't find and fill the right forms, you won't be able to do anything with the money sitting in the vault on the other side. A server sitting somewhere online is the same thing. It's a mysterious black box. If you don't know how to talk to it, you're not going to be able to get anything from it. And if, as a writer, you don't adequately describe how someone is going to interface with your system, you run the risk of them not using your system at all. So accurate and consistent reference documentation for APIs is super important.

@gjtorikian 2006 2012 2017
Ok, so, now I've defined the technical system I'm talking about, and the kind of documentation I'm referring to. The theme of this talk is how, finally, trustworthy API documentation can be produced. And in order to explain how we got to where we are, we're going to travel through the years. Each segment is going to build on the last, culminating in an understanding of how API documentation has been produced, and how current API systems are trending towards more "trustworthiness."

2006 @gjtorikian Documentation next to code
So it's time for all of us to find our mental time machine, step inside, and transport ourselves to 2006. Maybe it was a better time for you, maybe it was a worse time, but we can all agree, it was a point in time.
For me, this was a pretty good year, because I got my first job doing technical documentation. I didn't plan it, I didn't even know writing documentation was a job. And what I experienced in 2006 was the notion of documentation next to the code.

@gjtorikian 2006 Documentation next to code
By a show of hands, does anyone know what this is?
That's right, it's a book. People used to print documentation on these things. This particular one is for some architecture software, AutoCAD, from my first job. It's over one thousand two hundred pages long. I'm not really that old, but for my first technical writing gig, I would write documentation on the computer, we'd send it out to some editor somewhere off-site for proofreading, and a few months later, ta-da, we'd get a book. And this book would come with you when you bought your software.

2006: Documentation next to code @gjtorikian
So how did a book like this get made? At the time we used to use this pretty amazing tool called Docomatic. The name alone is magnificent. By the way, I realized while I was preparing for this talk that this company is based in Berlin, so it's entirely possible that someone who made this software is here today, so, hello, if you're here.
This screenshot is from the latest release, and while I don't know how the software works now, in 2006, what Docomatic did was pretty simple: it would scan all the files in your project that contained your code. As a writer, I'd go through the software's source code files one by one and write documentation based on my understanding of the source. Red boxes next to a name meant that there was missing documentation, and green boxes meant that there was at least some documentation.

2006: Documentation next to code @gjtorikian
The API I was documenting was a set of methods that a user could call to automate drawing shapes, adding shading, and other architecture-y things. And you can see some of the stuff I needed to write about is pretty common to nearly every API. I had to provide a summary of the functionality. I had to list out all the parameters, what their types were, and what they did. In theory, this was a good idea. You know, the engineers had all these files split up, and I could just go down them and write some content describing the system. As a technical writer though, I was forbidden from changing the source code. What that meant was that although I could see all the source at any time, I had to keep all the documentation in an entirely separate folder, stored in a separate space. This caused a number of problems.

2006: Documentation next to code @gjtorikian
• Prone to missing updates
• Requires engineering approval
• Slow turnaround
The first problem was that this documentation was prone to missing updates. As a writer, I had very little insight into the changes that the engineers were making. That meant that I could spend a week documenting some part of the system, rescan all the files, and then discover that some arguments had changed, some method names had changed, or something I had documented had disappeared entirely.

2006: Documentation next to code @gjtorikian
• Prone to missing updates
• Requires engineering approval
• Slow turnaround
Because I was working in my own little silo, I also had to send the documentation that I had written back to the engineering team for approval. So they would work on the code, I would document it, then I would send what I wrote back to them to ask if I had understood everything. At which point, invariably, they'd say something like "Oh, yeah, we changed this part of the system. Sorry!"

2006: Documentation next to code @gjtorikian
• Prone to missing updates
• Requires engineering approval
• Slow turnaround
And of course, because of all that, I could only produce documentation at a much slower pace. Working at a tech company is very manic. Sometimes there will be lulls, and sometimes you'll be forced to speed up and play catch-up as deadlines approach. What that means, of course, is that the documentation written at the beginning of the project is sometimes the clearest and most accurate, but the stuff you write towards the end of the cycle is sloppy and not carefully reviewed.

@gjtorikian 2012 Documentation from the code
Let's jump forward another six years. I had moved around between different jobs, but by and large the process I experienced was the same. Writers weren't allowed access to the same code and systems that the engineers were using, so very often the things that were being built were not being described accurately.
I took a role as a technical writer working on APIs at a startup around 2012. And I really wanted to write documentation in better ways than I had been exposed to. That's when I first experienced working with documentation from the code.

@gjtorikian Languages are awesome
I need to take a brief tangent for a moment here. I love languages. I love learning new ones, I love hearing them spoken, I love discovering the words and phrases and slang that people use to communicate with each other. I like the word games you can create with them, and I love how they're an expression of a culture.
Let's talk about language for a bit.

@gjtorikian The book is blue
Let's say you have this sentence in English. The book is blue. If you look at that, and you know English, you can break it down grammatically into various parts.

@gjtorikian The book is blue
There's an article, there's a noun, there's a verb, and there's an adjective.
You know that, in English, this is describing a single item, the quality of it, and its state.

@gjtorikian The book is blue El libro es azul
Let's look at that same sentence in Spanish. El libro es azul.

@gjtorikian El libro es azul The book is blue
The words are completely different, but the concepts are the exact same. There's an article, a noun, a verb, and an adjective.
If you speak English and don't speak Spanish, or vice versa, you can still understand the concepts, even if they're in a different language, right? If I told you that "libro" meant "book," you would immediately know that it was a noun, what it looked like, what you did with it, et cetera. You're able to parse the meaning of the sentence into different parts.

@gjtorikian Programming languages are similar
The thing is, programming languages behave the same way. You might hear someone talk about how they prefer to write their code in Ruby, or in JavaScript, or in Python, whatever. The point is, the underlying concepts are fundamentally the same. They all abide by a grammar that is shared between them. They just use different words to get their point across.
And because they have a grammar, that also means that their structure can be parsed, analyzed, and represented in different ways.

@gjtorikian El libro es azul The book is blue
In the slide before I had highlighted the different parts of the sentences: article, noun, verb, adjective.

2012: Documentation from the code @gjtorikian
    # Public: Says hello to a person
    # name - A {String} naming a person
    hello = (name) ->
      print "Hello, #{name}"
For a programmer, when they're writing code in their editor, their words are represented the same way. When someone is writing code, the text that they look at doesn't look like this.

2012: Documentation from the code @gjtorikian
    # Public: Says hello to a person
    # name - A {String} naming a person
    hello = (name) ->
      print "Hello, #{name}"
It looks like this. Color delineates each part of the program. In yellow you have comments, in purple are function names, and in light blue you have a string of text.
Here's where it starts to get interesting. Before, as a writer, I would have had to keep a separate document that kept track of each method and its arguments, and create documentation alongside it that was prone to going out of sync.
But once we start to consider that we can extract the "comment portion" of the code, we can start to look at the two pieces not as independent entities, but as parts that can work together.

2012: Documentation from the code @gjtorikian
    # Public: Says hello to a person
    # name - A {String} naming a person
    hello = (name) ->
      print "Hello, #{name}"
We can generate documentation straight off of the comments. When the documentation is right next to the code, there's no need to keep track of whether or not a method is going to change. There's less of a chance that the documentation is inaccurate: at a glance, one hopes, you ought to be able to verify if the documentation is correct or not. Everything is kept in the same place.
But there's still the potential for a problem.
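As a rough illustration of pulling documentation out of comments, here is a small Python sketch that collects the comment lines sitting directly above a function definition in the snippet from the slide. The `extract_docs` helper and its regular expression are assumptions invented for this example; real documentation generators parse the language properly rather than matching lines.

```python
import re

SOURCE = '''\
# Public: Says hello to a person
# name - A {String} naming a person
hello = (name) ->
  print "Hello, #{name}"
'''


def extract_docs(source):
    """Collect the comment lines directly above each `name = (args) ->` line."""
    docs = {}
    pending = []  # comment lines seen since the last non-comment line
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            pending.append(stripped.lstrip("#").strip())
            continue
        match = re.match(r"(\w+)\s*=\s*\(([^)]*)\)\s*->", stripped)
        if match and pending:
            docs[match.group(1)] = pending
        pending = []
    return docs


docs = extract_docs(SOURCE)
```

Here `docs` maps the function name `hello` to the two comment lines above it, which is the raw material a generator could turn into a reference page.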

2012: Documentation from the code @gjtorikian
    # Public: Says hello to a person
    # name - A {String} naming a person
    hello = (first, last) ->
      print "Hello, #{first} #{last}"
What happens if the programmer changes this method? Now, it no longer takes a single argument name, and instead takes two arguments: a first name and a last name. All of the tests that the programmer has written for the code would pass. If we only generated our documentation through the comments, this would pass through unnoticed.

2012: Documentation from the code @gjtorikian
    # Public: Says hello to a person
    # name - A {String} naming a person
    hello = (first, last) ->
      print "Hello, #{first} #{last}"
A good documentation tool will refuse to build the documentation if it notices a discrepancy. This way, a writer can be certain that any change to the code is always reflected in the documentation.
But this requires a rather intelligent documentation generating tool.
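Here is a hedged sketch of what "refusing to build" could look like, written in Python. It compares the parameters named in a docstring against the function's real signature using the standard `inspect` module. The `check_documented_params` helper and the `:param name:` convention it looks for are assumptions for this example, not the behavior of any particular documentation tool.

```python
import inspect
import re


def hello(first, last):
    """Says hello to a person

    :param name: A string naming a person
    """
    print("Hello, {} {}".format(first, last))


def check_documented_params(func):
    """Raise ValueError if documented and actual parameters differ."""
    documented = re.findall(r":param (\w+):", inspect.getdoc(func) or "")
    actual = list(inspect.signature(func).parameters)
    if documented != actual:
        raise ValueError(
            "{}: documented {} but signature has {}".format(
                func.__name__, documented, actual))
```

Calling `check_documented_params(hello)` raises a `ValueError`, because the docstring still describes `name` while the signature now takes `first` and `last`. A build script could run a check like this over every public function and stop before publishing stale docs.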

@gjtorikian El libro es azul The book is blue Das Buch ist blau 本は青です книга синяя
* From Google Translate. I know. I'M SORRY.
There are hundreds of different languages. Even though you might know about grammar rules, you might not be as familiar with how different languages are parsed. But even still, as a human, it's easy to understand that book and libro both mean the same thing.

2012: Documentation from the code @gjtorikian
CoffeeScript:
    # Public: Says hello to a person
    # name - A {String} naming a person
    hello = (first, last) ->
      print "Hello, #{first} #{last}"
Python:
    def hello(first, last):
        """Says hello to a person
        :param name: A {String} naming a person"""
        print("Hello, %s %s" % (first, last))
Ruby:
    # Public: Says hello to a person
    # name - A {String} naming a person
    def hello(first, last)
      puts "Hello, #{first} #{last}"
    end
But depending on what language your code is written in, a documentation tool will need to know the specifics of each grammar. On this slide, we're documenting the same method written in three different programming languages. As a human, it's easy to reason that the yellowish string is the comment style for each of these languages. And while a tool or several tools could be built to parse each language, more likely, the documentation tool will parse each block in different ways. There is no single tool that will turn the comments from your API code into documentation for every language.
So while this is an improvement towards accuracy, the reliability of documenting things the same way changes as you move between different programming languages.

@gjtorikian Documentation is the code 2017
But let's keep building on that idea that the code and the documentation are two fragments representing the same idea.
Coming up now to the present day, we'll get into the last portion of the talk. The documentation is not coming from the code: the documentation is the code.

@gjtorikian 2017: Documentation is the code GraphQL
About two years ago, the fine folks at facebook.com announced GraphQL, which they had been using internally for about five years at that point to build APIs for their website and mobile apps.

@gjtorikian 2017: Documentation is the code
• NOT a programming language
• Specification for how APIs behave
• Fully introspective
GraphQL isn't a programming language.

@gjtorikian 2017: Documentation is the code
• NOT a programming language
• Specification for how APIs behave
• Fully introspective
It's more of a specification for how to model data and how to make requests to get that data. GraphQL is a way in which you can describe an API. It's a brand new way of building out APIs that exists independently of the way it's implemented. Which means that, if you write an API using GraphQL, it can be implemented the same way regardless of the programming language.
Let's go back to that bank analogy. GraphQL is a specification for what the withdrawal form, which you hand to the teller, looks like. It specifies the fields, and it specifies how to hand it over to the bank. That means that every bank, no matter what, has the same kind of form to use when withdrawing money. It lowers the cognitive barrier if you decide to switch banks or servers and need money or data.

@gjtorikian 2017: Documentation is the code
• NOT a programming language
• Specification for how APIs behave
• Fully introspective
A GraphQL schema is fully introspective, which means that every part of the schema knows about every other part. That means that a GraphQL schema can parse itself. I'll go more into that in a little bit. The most interesting aspect of GraphQL is that it has a very documentation-centric approach.

@gjtorikian 2017: Documentation is the code
    # A place to go to for delicious eats and drinks.
    type Restaurant {
      # What the owners of the Restaurant decided to call it.
      name: String!
      # The Restaurant's average rating.
      rating: Int!
      # The physical address of the Restaurant.
      address: String
      # Whether or not this Restaurant will deliver food to you.
      offersDelivery: Boolean
      # Whether or not you can order to-go at this Restaurant.
      hasTakeout: Boolean
    }
On this slide is an example of a GraphQL schema. This is the blueprint for how someone can communicate with an API. At the surface, it's building on a lot of the concepts I've mentioned. Here, we're describing a restaurant. Each separate line here represents a field. And each field has a clearly marked type, and a comment. This schema can be parsed, and because it's language independent, it doesn't suffer from the problem mentioned earlier about needing different tools for different languages. There's no such thing as a GraphQL schema for Ruby or a GraphQL schema for JavaScript or a GraphQL schema for Python. There's just this one schema specification.
A GraphQL schema is not just a blueprint for how users can interact with an API; it also self-documents. If the schema changes, the documentation automatically changes too. The documentation is the code.
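To show why a schema like this lends itself to generated reference docs, here is a toy Python sketch that reads a shortened copy of the `Restaurant` type and pairs each field with its type and comment. The `document_type` helper is invented for this example and only handles the simple `#`-comment layout from the slide; a real tool would more likely use a GraphQL parser or the server's introspection rather than line matching.

```python
import re

SCHEMA = '''\
# A place to go to for delicious eats and drinks.
type Restaurant {
  # What the owners of the Restaurant decided to call it.
  name: String!
  # The Restaurant's average rating.
  rating: Int!
  # The physical address of the Restaurant.
  address: String
}
'''


def document_type(schema):
    """Return (type_name, type_description, [(field, type, description), ...])."""
    type_name = type_description = None
    fields = []
    comment = None  # most recent comment line, waiting for its owner
    for line in schema.splitlines():
        line = line.strip()
        if line.startswith("#"):
            comment = line.lstrip("#").strip()
        elif line.startswith("type "):
            type_name = line.split()[1]
            type_description, comment = comment, None
        else:
            match = re.match(r"(\w+):\s*([\w!\[\]]+)$", line)
            if match:
                fields.append((match.group(1), match.group(2), comment))
            comment = None
    return type_name, type_description, fields


name, description, fields = document_type(SCHEMA)
for field, field_type, text in fields:
    print("{} ({}): {}".format(field, field_type, text))
```

Because the field names, types, and descriptions all live in the one schema, renaming `rating` or changing its type changes the generated reference entry in the same step.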

@gjtorikian LIVE DEMO 2017: Documentation is the code
I HOPE YOU ARE ENJOYING THE TALK SO FAR. LUNCH IS ALMOST HERE!

@gjtorikian 2017: Documentation is the code
• Good for writers
• Good for engineers
• Great for users
Why is this new way of writing APIs so great for writers? Because it allows them to concentrate on documenting the parts of the system that are more opaque. Writers can focus on the conceptual and procedural kinds of content with the confidence of knowing that the reference material is always, forcibly, in sync with the code.

@gjtorikian 2017: Documentation is the code
• Good for writers
• Good for engineers
• Great for users
It's great for engineers, because it forces them to think about how what they're building and changing is going to affect end users.

@gjtorikian 2017: Documentation is the code
• Good for writers
• Good for engineers
• Great for users
And it's great for users, because rather than hope and guess that the reference documentation they're receiving is accurate, they can be certain of it. As well, the level of tooling that's available to them allows them to learn about and interact with APIs in new ways.

@gjtorikian 2006 Documentation next to code 2012 Documentation from the code 2017 Documentation is the code
We're really only just getting started with GraphQL, and I'm pretty excited about what the future for GraphQL documentation holds. It's really mind-boggling to me how in the span of ten or eleven years, we've gone from publishing hefty books like this to having autocompletion as the default. I'm really looking forward to how the technology improves over the next few years.

Thank you @gjtorikian github.com/welp/welp.reviews
IT IS NOW TIME FOR LUNCH!!!
