Garen Torikian
September 12, 2017

Finally! Trustworthy and Sensible API Documentation with GraphQL

GitHub is migrating its public API system from REST to GraphQL. In this talk, I'd like to go over my personal experiences writing API documentation for ten years, and why GraphQL's "everything needs documentation" approach is a much-needed improvement. I'll go over past API tooling and methodologies, current best practices to make "sane" autogenerated API documentation, and why I think that GraphQL is a boon for technical writers, developers, and end users.

Transcript

  1. Finally! Trustworthy and Sensible API Documentation with GraphQL @gjtorikian ARE

    YOU IN THE AUDIENCE AND LOOKING AT THESE NOTES? IF SO, HELLO!
  2. Finally! Trustworthy and Sensible API Documentation with GraphQL @gjtorikian Heya!

    My name is Garen Torikian, and today I want to talk to you about API documentation. I know that this slot is a bit rough because it's right before lunchtime, which is one of my three favorite times of day. But I hope you'll stick around and find this talk useful.
 
 And the focus of this talk is going to be on the "Finally!" and "Trustworthy!" portions of the title. But before I get into the details, I want to introduce myself.
  3. @gjtorikian html-proofer jekyll & nanoc plugins markdowntutorial.com …other stuff! I

    am a programmer I'm an engineer working at GitHub.com. I work on the API that supports the GitHub Platform.
  4. I was a technical writer @gjtorikian But before that, I

    was a technical writer. I actually started the documentation team at GitHub almost five years ago, and before that I had been writing technical documentation for nearly a decade. And the truth is, I love writing. I sort of miss doing it as part of my full-time job, and I'm excited every time I get to dive in and explain some computer system with words. My move into full-time programming is a relatively new thing. It was only about two years ago that I decided to do more coding than explaining. And what I realized is that after I became an engineer, I was still interested in the same types of problems I was interested in while I was doing documentation.
  5. @gjtorikian Writing is easy Maintaining is hard This is a

    controversial slide, I know. I like to tell people that it's easy to write documentation. Anyone can write something down. They may not be able to write it down well, but they can write it down. The hard part is maintaining documentation. To make sure that it stays accurate and true as the software evolves. Part of the reason I wanted to move from documentation and into engineering was to focus on keeping accurate documentation. I reasoned that if I could concentrate on the documentation accuracy from an engineering perspective, in some way I could help mitigate some of the frustrations that I had experienced as a writer.
  6. I'm interested in how we write @gjtorikian But what I

    realized when I took a step back from writing was that I am and I have been obsessed with how we write. Not just which word is the mot juste, but also the processes and software that we use to do our documentation. And eventually even how our documentation is consumed. And this is something that affects both engineers and technical writers. Engineers who aren't writers don't know how to provide useful documentation, and writers who aren't engineers sometimes aren't given the best tools to do their work.
  7. What do API docs look like? @gjtorikian I want to

    talk about how API documentation is written, what the history of that process has been as far as I've seen it. I want to talk about why it's traditionally been untrustworthy, and more importantly, I want to talk about the ways people are now rethinking the way in which these technical systems are being built, and how those changes are benefiting documentation teams and users. 
 I want to look at where we were, where we have been, and where we're going. But just so that we all have a shared understanding of what I'm talking about, I want to quickly define what I mean by API and documentation.
  8. ! the best way to build and ship software 8

    A P wha? @gjtorikian Bank Let's pretend this blob right here is a bank. I can put up a little label there that says bank. You can tell that it's a bank now because it has columns and looks very bureaucratic. 
 Let's say you want to take money out of a bank. It's Saturday, you have a lot of time to waste, so you decide to skip the ATM and stand in line. What happens when you get there? The bank teller doesn't just hand you the keys and say, "All right, your money is kept in the back left of the vault, good luck!"
  9. ! the best way to build and ship software 9

    A P wha? @gjtorikian Customer Bank Money Ask In reality, it works a bit more like this. You approach the bank and ask for some money. You'll fill out a form or a slip, you'll write down your account number, how much you want to take out, and you'll sign the form and hand it to the teller.
  10. ! the best way to build and ship software 10

    A P wha? @gjtorikian Customer Bank Money Get Ask The teller, acting as the bank, goes to the vault and gets the money.
  11. ! the best way to build and ship software 11

    A P wha? @gjtorikian Customer Bank Money Receive Deliver Ask Get And then the money comes back to you, through the bank.
  12. ! the best way to build and ship software 12

    A P wha? @gjtorikian Receive Deliver Ask Get Customer Bank Money And that's pretty much the flow for how you can get money from a bank.
  13. ! the best way to build and ship software 13

    A P wha? @gjtorikian User Program/Server Data Receive Deliver Ask Get So an API isn't very different from that scenario. The user, the person sitting at a computer somewhere, makes a request for some data. That request could be made to a program running on your computer, or you could be asking an external website, like GitHub, Google Maps, Wikipedia, whatever. The server goes and fetches the data you asked for, and then hands it back to you.
  14. ! the best way to build and ship software 14

    @gjtorikian User Program/Server Data Receive Deliver Ask Get A P wha? The API is the layer here that facilitates the "Ask" portion of the flow. It's sort of a gatekeeper between the user and the data.
 
 This kind of system is becoming more and more common. The more stuff we put online, the more people realize that they want to fetch that stuff and do things with all that public data. Rather than companies deciding what data users should and should not have access to, and instead of just granting them access to everything, an API determines how you can get at information that's stored somewhere else in a safe and sanctioned way. Essentially, it's a contract between you and the data.
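The "Ask" layer can be sketched in a few lines of Python. This is only an illustration of the bank analogy: the names `Vault`, `BankAPI`, and `withdraw`, and the toy signature check, are made up for this example and do not come from any real system.

```python
class Vault:
    """The data store. Customers never touch this directly."""

    def __init__(self, balances):
        self._balances = dict(balances)

    def balance(self, account):
        return self._balances[account]

    def take(self, account, amount):
        self._balances[account] -= amount
        return amount


class BankAPI:
    """The teller: the only sanctioned way to reach the vault."""

    def __init__(self, vault):
        self._vault = vault

    def withdraw(self, account, amount, signature):
        # "Ask": validate the withdrawal form before touching the vault.
        if signature != account:  # toy check standing in for real authentication
            raise PermissionError("signature does not match account")
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self._vault.balance(account):
            raise ValueError("insufficient funds")
        # "Get" and "Deliver": fetch the money and hand it back.
        return self._vault.take(account, amount)


vault = Vault({"acct-1": 100})
teller = BankAPI(vault)
cash = teller.withdraw("acct-1", 40, signature="acct-1")
print(cash, vault.balance("acct-1"))
```

The customer only ever calls `withdraw`; how the vault stores its balances can change without the form changing, which is the "contract" part of the idea.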
  15. Types of documentation @gjtorikian The other part of preliminary information

    is what I mean when I talk about documentation. Not all documentation is written the same. I'm a little old school, and the way I learned it was that there essentially were three main types.
  16. Reference @gjtorikian Terms, glossaries There's reference documentation. In a technical

    writing context, that might be defining words or phrases that are unique to your software. When describing an API, these would be the functions or methods that you would need to call to get access to your data.
  17. Conceptual @gjtorikian What is…? There's conceptual documentation, which can describe

    a technical system, what it's used for, or what it does. For example, explaining how Git works, or describing how photos are uploaded and stored online.
  18. Procedural @gjtorikian How do I…? And there's procedural information, which

    is usually some kind of ordered list explaining how to accomplish something in a series of steps. This'll be something like a step-by-step guide to opening a pull request or instructions on how to change your password.
  19. ! the best way to build and ship software 19

    Types of documentation @gjtorikian Reference Conceptual Procedural Of the three choices, today, when I'm talking specifically about API documentation,
  20. ! the best way to build and ship software 20

    Types of documentation @gjtorikian Reference Conceptual Procedural I'm really only going to be concentrating on the first kind, Reference documentation. So, you have a server sitting somewhere, there's some information you want to collect, and you want to know how you can make the request. The kind of documentation I'm talking about is the kind that explains to you what the function calls are, what they do, what arguments they take, and so on. It's a reference sheet, a dictionary, that enables you to speak the language that the server will understand.
  21. Reference docs are the most important @gjtorikian When it comes

    to APIs, reference documentation is the most pertinent form of documentation, period. Let me go back to that bank analogy again. If you don't find and fill the right forms, you won't be able to do anything with the money sitting in the vault on the other side. A server sitting somewhere online is the same thing. It's a mysterious black box. If you don't know how to talk to it, you're not going to be able to get anything from it. And if, as a writer, you don't adequately describe how someone is going to interface with your system, you run the risk of them not using your system at all. So accurate and consistent reference documentation for APIs is super important.
  22. @gjtorikian 2006 2012 2017 Ok, so, now I've defined the

    technical system I'm talking about, and the kind of documentation I'm referring to. The theme of this talk is how, finally, trustworthy API documentation can be produced. And in order to explain how we got to where we are, we're going to travel through the years. Each segment is going to build on the previous one, culminating in an understanding of how API documentation has been produced, and how current API systems are trending towards more "trustworthiness."
  23. 2006 @gjtorikian Documentation next to code So it's time for

    all of us to find our mental time machine, step inside, and transport ourselves to 2006. Maybe it was a better time for you, maybe it was a worse time, but we can all agree, it was a point in time. 
 
 For me, this was a pretty good year, because I got my first job doing technical documentation. I didn't plan it, I didn't even know writing documentation was a job. And what I experienced in 2006 was the notion of documentation next to the code.
  24. @gjtorikian 2006 Documentation next to code By a show of

    hands, does anyone know what this is?
 
 That's right, it's a book. People used to print documentation on these things. This particular one is for some architecture software, AutoCAD, which was my first job. It's over one thousand two hundred pages long. I'm not really that old, but for my first technical writing gig, I would write documentation on the computer, we'd send it out to some editor somewhere off-site for proofreading, and a few months later, ta-da, we'd get a book. And this book would come with you when you bought your software.
  25. ! the best way to build and ship software 25

    2006: Documentation next to code @gjtorikian So how did a book like this get made? At the time we used to use this pretty amazing tool called Docomatic. The name alone is magnificent. By the way, I realized while I was preparing for this talk that this company is based in Berlin, so it's entirely possible that someone who made this software is here today, so, hello, if you're here.
 
 This screenshot is from the latest release, and while I don't know how the software works now, in 2006, what Docomatic did was pretty simple: it would scan all the files in your project that contained your code. As a writer, I'd go through the software's source code files one by one and write documentation based on my understanding of the source. Red boxes next to a name meant that documentation was missing, and green boxes meant that there was at least some documentation.
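A rough analogue of that red/green scan, sketched in Python. This is not how Docomatic itself worked; it is an illustration using Python's standard `ast` module, and the sample functions (`draw_circle`, `add_shading`) are invented for the example.

```python
import ast

SOURCE = '''
def draw_circle(radius):
    """Draws a circle with the given radius."""

def add_shading(shape, density):
    pass
'''


def documentation_status(source):
    """Map each function name to True (documented) or False (missing)."""
    tree = ast.parse(source)
    status = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            status[node.name] = ast.get_docstring(node) is not None
    return status


status = documentation_status(SOURCE)
for name, documented in sorted(status.items()):
    # "green" means at least some documentation exists, "red" means none.
    print("green" if documented else "red", name)
```

A scan like this only says whether documentation exists, not whether it is still correct, which is the gap the rest of the talk is about.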
  26. ! the best way to build and ship software 26

    2006: Documentation next to code @gjtorikian The API I was documenting was a set of methods that a user could call to automate drawing shapes, add shading, and other architecture-y things. And you can see some of the stuff I needed to write about is pretty common to nearly every API. I had to provide a summary of the functionality. I had to list out all the parameters, what their types were, and what they did. In theory, this was a good idea. You know, the engineers had all these files split up, and in theory I could just go down them and write some content describing the system. As a technical writer though, I was forbidden from changing the source code. What that meant was that although I could see all the source at any time, I had to keep all the documentation in an entirely separate folder, stored in a separate space. This caused a number of problems.
  27. ! the best way to build and ship software 27

    2006: Documentation next to code @gjtorikian •Prone to missing updates •Required engineering approval •Slow turnaround The first problem was that this documentation was prone to missing updates. As a writer, I had very little insight into the changes that the engineers were making. That meant that I could spend a week documenting some part of the system, rescan all the files, and then discover that some arguments had changed, some method names had changed, or something I had documented had disappeared entirely.
  28. ! the best way to build and ship software 28

    2006: Documentation next to code @gjtorikian •Prone to missing updates •Requires engineering approval •Slow turnaround Because I was working in my own little silo, I also had to send the documentation that I had written back to the engineering team for approval. So they would work on the code, I would document it, then I would send what I wrote back to them to ask if I had understood everything. At which point, invariably, they'd say something like "Oh, yeah, we changed this part of the system. Sorry!"
  29. ! the best way to build and ship software 29

    2006: Documentation next to code @gjtorikian •Prone to missing updates •Requires engineering approval •Slow turnaround And of course, because of all that, I could only produce documentation at a much slower pace. Working at a tech company is very manic. Sometimes there will be lulls, and sometimes you'll be forced to speed up and play catch-up as deadlines approach. What that means, of course, is that the documentation written at the beginning of the project is sometimes the clearest and most accurate, but the stuff you write towards the end of the cycle is sloppy and not carefully reviewed.
  30. @gjtorikian 2012 Documentation from the code Let's jump forward another

    six years. I had moved around between different jobs, but by and large the process I experienced was the same. Writers weren't allowed access to the same code and systems that the engineers were using, so very often the things that were being built were not being described accurately.
 
 I took a role as a technical writer working on APIs in a startup around 2012. And I really wanted to write documentation in better ways than I had been exposed to. That's when I first experienced working with documentation from the code.
  31. @gjtorikian Languages are awesome I need to take a brief

    tangent for a moment here. I love languages. I love learning new ones, I love hearing them spoken, I love discovering the words and phrases and slang that people use to communicate with each other. I like the word games you can create with them, and I love how they're an expression of a culture. 
 
 Let's talk about language for a bit.
  32. @gjtorikian The book is blue Let's say you have this

    sentence in English. The book is blue. If you look at that, and you know English, you can break it down grammatically into various parts.
  33. @gjtorikian The book is blue There's an article, there's a

    noun, there's a verb, and there's an adjective.
 
 You know that, in English, this is describing a single item, the quality of it, and its state.
  34. @gjtorikian The book is blue El libro es azul Let's

    look at that same sentence in Spanish. El libro es azul.
  35. @gjtorikian El libro es azul The book is blue The

    words are completely different, but the concepts are the exact same. There's an article, a noun, a verb, and an adjective.
 
 If you speak English and don't speak Spanish, or vice versa, you can still understand the concepts, even if they're in a different language, right? If I told you that "libro" meant "book," you would immediately know that it was a noun, what it looked like, what you did with it, et cetera. You're able to parse the meaning of the sentence into different parts.
  36. @gjtorikian Programming languages are similar The thing is, programming languages

    also behave the same way. You might hear someone talk about how they prefer to write their code in Ruby, or in JavaScript, or in Python, whatever. The point is, the underlying concepts are fundamentally the same. They all abide by a grammar that is shared between them. They just use different words to get their point across.
 
 And because they have a grammar, that also means that their structure can be parsed, analyzed, and represented in different ways.
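As a small illustration of "parsed, analyzed, and represented in different ways," here is a sketch that uses Python's standard `ast` module to break a one-function program into its grammatical parts. The `hello` function is a Python rendering of the example used later in the talk, chosen here only for illustration.

```python
import ast

tree = ast.parse('def hello(name):\n    print("Hello, " + name)\n')
func = tree.body[0]

# Much like labeling article / noun / verb / adjective in a sentence.
parts = {
    "kind": type(func).__name__,
    "name": func.name,
    "arguments": [arg.arg for arg in func.args.args],
    "body": [type(stmt).__name__ for stmt in func.body],
}
print(parts)
```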
  37. @gjtorikian El libro es azul The book is blue In

    the slide before I had highlighted the different parts of the sentences, article, noun, verb, adjective.
  38. ! the best way to build and ship software 38

    2012: Documentation from the code @gjtorikian

        # Public: Says hello to a person
        # name - A {String} naming a person
        hello = (name) -> print "Hello, #{name}"

    For a programmer, when they're writing code in their editor, their words are represented the same way. When someone is writing code, the text that they look at doesn't look like this.
  39. ! the best way to build and ship software 39

    2012: Documentation from the code @gjtorikian

        # Public: Says hello to a person
        # name - A {String} naming a person
        hello = (name) -> print "Hello, #{name}"

    It looks like this. Color delineates each part of the program. In yellow you have comments, in purple are function names, and in light blue you have a string of text.

 Here's where it starts to get interesting. Before, as a writer, I would have had to keep a separate document that kept track of each method and its arguments, and create documentation alongside it that was prone to going out of sync.
 
 But, once we start to consider that we can extract the "comment portion" of the code out, we can start to look at the two pieces as not independent entities, but parts that can work together.
  40. ! the best way to build and ship software 40

    2012: Documentation from the code @gjtorikian

        # Public: Says hello to a person
        # name - A {String} naming a person
        hello = (name) -> print "Hello, #{name}"

    We can generate documentation straight off of the comments. When the documentation is right next to the code, there's no need to keep track of whether or not a method is going to change. There's less of a chance that the documentation is inaccurate: at a glance, one hopes, you ought to be able to verify if the documentation is correct or not. Everything is kept in the same place.
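A minimal sketch of pulling the "comment portion" out of source code and turning it into documentation. It uses Python's standard `tokenize` module on a Python rendering of the slide's example; the helper names `leading_comments` and `render_docs` are hypothetical, and a real documentation generator does far more than this.

```python
import io
import tokenize

SOURCE = '''\
# Public: Says hello to a person
# name - A {String} naming a person
def hello(name):
    print("Hello, " + name)
'''


def leading_comments(source):
    """Collect the comment lines that appear before the first line of code."""
    comments = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments.append(tok.string.lstrip("#").strip())
        elif tok.type not in (tokenize.NL, tokenize.NEWLINE):
            break  # first real code token: stop collecting
    return comments


def render_docs(source):
    """Render the first comment as a summary and the rest as indented details."""
    summary, *details = leading_comments(source)
    return "\n".join([summary, ""] + ["  " + detail for detail in details])


print(render_docs(SOURCE))
```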
 
 But, there's still the potential for a problem.
  41. ! the best way to build and ship software 41

    2012: Documentation from the code @gjtorikian

        # Public: Says hello to a person
        # name - A {String} naming a person
        hello = (first, last) -> print "Hello, #{first} #{last}"

    What happens if the programmer changes this method? Now, it no longer takes a single argument name, and instead takes two arguments: a first name and a last name.
 All of the tests that the programmer has written for the code would pass. If we only generated our documentation through the comments, this would pass through unnoticed.
  42. ! the best way to build and ship software 42

    2012: Documentation from the code @gjtorikian

        # Public: Says hello to a person
        # name - A {String} naming a person
        hello = (first, last) -> print "Hello, #{first} #{last}"
        !

    A good documentation tool will refuse to build the documentation if it notices a discrepancy. This way, a writer can be certain that any changes to the code are always reflected in the documentation.
 
 But this requires a rather intelligent documentation generating tool.
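To give a sense of what such a check involves, here is a minimal sketch in Python. It compares the parameters named in a docstring against the function's real signature, using the standard `inspect` and `re` modules. The `:param name:` docstring convention, the `DocumentationMismatch` exception, and `check_documentation` are assumptions made for this example, not the behavior of any particular tool.

```python
import inspect
import re


def hello(first, last):
    """Says hello to a person.

    :param name: A string naming a person
    """
    print(f"Hello, {first} {last}")


class DocumentationMismatch(Exception):
    """Raised when the documented parameters differ from the real ones."""


def check_documentation(func):
    documented = set(re.findall(r":param (\w+):", inspect.getdoc(func) or ""))
    actual = set(inspect.signature(func).parameters)
    if documented != actual:
        raise DocumentationMismatch(
            f"{func.__name__}: documented {sorted(documented)}, "
            f"but the code takes {sorted(actual)}"
        )


try:
    check_documentation(hello)
    build_failed = False
except DocumentationMismatch as error:
    build_failed = True  # a real tool would stop the documentation build here
    print(error)
```

Even this toy version has to understand both the comment convention and the language's function signatures, which is why the tooling gets complicated quickly.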
  43. @gjtorikian El libro es azul The book is blue Das

    Buch ist blau 本は青です* книга синяя* *From Google Translate. I know. I'M SORRY. There are hundreds of different languages. Even though you might know about grammar rules, you might not be as familiar with how different languages are parsed. But even still, as a human, it's easy to understand that book and libro both mean the same thing.
  44. ! the best way to build and ship software 44

    2012: Documentation from the code @gjtorikian

    CoffeeScript:

        # Public: Says hello to a person
        # name - A {String} naming a person
        hello = (first, last) -> print "Hello, #{first} #{last}"

    Python:

        def hello(first, last):
            """Says hello to a person

            :param name: A {String} naming a person"""
            print("Hello, %s %s" % (first, last))

    Ruby:

        # Public: Says hello to a person
        # name - A {String} naming a person
        def hello(first, last)
          puts "Hello, #{first} #{last}"
        end

    But depending on what language your code is written in, a documentation tool will need to know the specifics of each grammar. On this slide, we're documenting the same method written three different ways in different programming languages. As a human, it's easy to reason that the yellowish text is the comment style for each of these languages. And while a tool or several tools could be built to parse each language, more likely, the documentation tool will parse each block in different ways. There is no single tool that will turn the comments from your API code into documentation for every language.
 
 So while this is an improvement towards accuracy, the reliability of documenting things the same way changes as you move between different programming languages.
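The per-language problem can be made concrete with a small sketch. The table of regular expressions below is a deliberately naive, hypothetical stand-in for real parsers (`COMMENT_GRAMMARS` and `extract_docs` are invented names); the point is only that every language needs its own entry, and anything not in the table cannot be documented at all.

```python
import re

# One rule per language. A real tool would need a proper parser for each.
COMMENT_GRAMMARS = {
    "coffeescript": re.compile(r"^[ \t]*#[ \t]?(.*)$", re.MULTILINE),
    "ruby": re.compile(r"^[ \t]*#[ \t]?(.*)$", re.MULTILINE),
    "python": re.compile(r'"""(.*?)"""', re.DOTALL),
}


def extract_docs(language, source):
    if language not in COMMENT_GRAMMARS:
        raise ValueError(f"no comment grammar registered for {language!r}")
    return [text.strip() for text in COMMENT_GRAMMARS[language].findall(source)]


RUBY_SOURCE = '''\
# Public: Says hello to a person
def hello(first, last)
  puts "Hello, #{first} #{last}"
end
'''

PYTHON_SOURCE = '''\
def hello(first, last):
    """Says hello to a person"""
    print("Hello, %s %s" % (first, last))
'''

ruby_docs = extract_docs("ruby", RUBY_SOURCE)
python_docs = extract_docs("python", PYTHON_SOURCE)
print(ruby_docs, python_docs)
```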
  45. @gjtorikian Documentation is the code 2017 But let's keep building

    on that idea that the code and the documentation are two fragments representing the same idea.
 
 Coming up now to the present day, we'll get into the last portion of the talk. The documentation is not coming from the code: the documentation is the code.
  46. ! the best way to build and ship software 46

    @gjtorikian 2017: Documentation is the code GraphQL About two years ago, the fine folks at facebook.com announced GraphQL, which they had been using internally for about five years at that point to build APIs for their website and mobile apps.
  47. ! the best way to build and ship software 47

    @gjtorikian •NOT a programming language •Specification for how APIs behave •Slow turnaround 2017: Documentation is the code GraphQL isn't a programming language.
  48. ! the best way to build and ship software 48

    @gjtorikian •NOT a programming language •Specification for how APIs behave •Slow turnaround 2017: Documentation is the code It's more of a specification for how to model data and how to make requests to get that data. GraphQL is a way in which you can describe an API. It's a brand new way of building out APIs that exists independently of the way it's implemented. Which means that, if you write an API using GraphQL, it can be implemented the same way regardless of the programming language.
 
 Let's go back to that bank analogy. GraphQL is a specification for what the withdrawal form, which you hand to the teller, looks like. It specifies the fields, and it specifies how to hand it over to the bank. That means that every bank, no matter what, has the same kind of form to use when withdrawing money. It lowers the cognitive barrier if you decide to switch banks, or servers, when you need money, or data.
  49. ! the best way to build and ship software 49

    @gjtorikian •NOT a programming language •Specification for how APIs behave •Fully introspective 2017: Documentation is the code A GraphQL schema is fully introspective, which means that every part of the schema knows about every other part. That means that a GraphQL schema can parse itself. I'll go more into that in a little bit. The most interesting aspect of GraphQL is that it has a very documentation-centric approach.
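As a sketch of what introspection looks like in practice: GraphQL defines a `__type` meta-field that asks a schema to describe one of its own types. The Python below only builds the request body a client might send and contacts no server. The type name `Restaurant` and the helper `build_request_body` are assumptions for illustration, and sending JSON with `query` and `variables` keys is the common convention over HTTP, not something this talk specifies.

```python
import json

# Ask the schema to describe one of its own types, including descriptions.
INTROSPECTION_QUERY = """
query DescribeType($name: String!) {
  __type(name: $name) {
    name
    description
    fields {
      name
      description
      type { name kind ofType { name kind } }
    }
  }
}
"""


def build_request_body(type_name):
    """Build the JSON body a client would typically POST to a GraphQL endpoint."""
    return json.dumps(
        {"query": INTROSPECTION_QUERY, "variables": {"name": type_name}}
    )


body = build_request_body("Restaurant")
print(body)
```

Because the descriptions come back from the same schema that answers ordinary queries, documentation tools can be built on top of this one query instead of on per-language comment parsers.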
  50. ! the best way to build and ship software 50

    @gjtorikian 2017: Documentation is the code

        # A place to go to for delicious eats and drinks.
        type Restaurant {
          # What the owners of the Restaurant decided to call it.
          name: String!
          # The Restaurant's average rating.
          rating: Int!
          # The physical address of the Restaurant.
          address: String
          # Whether or not this Restaurant will deliver food to you.
          offersDelivery: Boolean
          # Whether or not you can order to-go at this Restaurant.
          hasTakeout: Boolean
        }

    On this slide is an example of a GraphQL schema. This is the blueprint for how someone can communicate with an API. On the surface, it's building on a lot of the concepts I've mentioned. Here, we're describing a restaurant. Each separate line here represents a field. And each field has a clearly marked type, and a comment. This schema can be parsed, and because it's language independent, it doesn't suffer from the problem mentioned earlier about needing different tools for different languages. There's no such thing as a GraphQL schema for Ruby or a GraphQL schema for JavaScript or a GraphQL schema for Python. There's just this one schema specification.
 A GraphQL schema is not just a blueprint for how users can interact with an API; it also self-documents. If the schema changes, the documentation automatically changes too. The documentation is the code.
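To illustrate "if the schema changes, the documentation automatically changes too," here is a toy sketch that turns a schema like the one on the slide into reference text. It is a naive line-based reader written for this example (`document_schema` is a hypothetical helper), not a real GraphQL parser, and it assumes the comment-above-the-field layout shown on the slide.

```python
SCHEMA = """\
# A place to go to for delicious eats and drinks.
type Restaurant {
  # What the owners of the Restaurant decided to call it.
  name: String!
  # The Restaurant's average rating.
  rating: Int!
  # The physical address of the Restaurant.
  address: String
}
"""


def document_schema(schema):
    docs = []
    pending = []  # comment lines waiting for the definition they describe
    for raw in schema.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            pending.append(line.lstrip("#").strip())
        elif line.startswith("type "):
            name = line.split()[1]
            docs.append(f"{name}: {' '.join(pending)}")
            pending = []
        elif ":" in line:
            field, field_type = (part.strip() for part in line.split(":", 1))
            required = "required" if field_type.endswith("!") else "optional"
            docs.append(f"  {field} ({field_type}, {required}): {' '.join(pending)}")
            pending = []
    return docs


docs = document_schema(SCHEMA)
print("\n".join(docs))
```

Renaming a field or changing its type in `SCHEMA` changes the generated text on the next run, with no separate document to keep in sync.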
  51. ! the best way to build and ship software 51

    @gjtorikian LIVE DEMO 2017: Documentation is the code I HOPE YOU ARE ENJOYING THE TALK SO FAR. LUNCH IS ALMOST HERE!
  52. ! the best way to build and ship software 52

    @gjtorikian •Good for writers •Good for engineers •Great for users 2017: Documentation is the code Why is this new way of writing APIs so great for writers? Because it allows them to concentrate on documenting the parts of the system that are more opaque. Writers can focus on the conceptual and procedural kinds of content with the confidence of knowing that the reference material is always, forcibly, in sync with the code.
  53. ! the best way to build and ship software 53

    @gjtorikian •Good for writers •Good for engineers •Great for users 2017: Documentation is the code It's great for engineers, because it forces them to think about how what they're building and changing is going to affect end users.
  54. ! the best way to build and ship software 54

    @gjtorikian •Good for writers •Good for engineers •Great for users 2017: Documentation is the code And it's great for users, because rather than hope and guess that the reference documentation they're receiving is accurate, they can be certain of it. As well, the level of tooling that's available to them allows them to learn about and interact with APIs in new ways.
  55. @gjtorikian Documentation next to code 2012 2006 2017 Documentation from

    the code Documentation is the code We're really only just getting started with GraphQL, and I'm pretty excited about what the future for GraphQL documentation holds. It's really mind-boggling to me how, in the span of ten or eleven years, we've gone from publishing hefty books like this to having autocompletion as the default. I'm really looking forward to how the technology improves over the next few years.