Understanding Documentation Systems - ConFoo Vancouver 2016

December 06, 2016

This talk will cover the internals of how a software documentation system works. It will include specific examples from the Sphinx documentation generator and cover the concepts Sphinx uses to make documentation generation powerful.

The talk aims to impart a high-level understanding of the concepts behind documentation generation, which apply to most documentation tooling, whether it be CommonMark, RST, or Asciidoc.



  1. Who am I • Co-Founder of Read the Docs •

    Co-Founder of Write the Docs • I come from the Python world
  2. Today • Understand why docs are important • Understand the

    underlying model for documentation tools • Be able to reason more completely about how to extend doc tooling
  3. Documentation Systems • Lightweight Markup Language • Templates • Command

    Line Interface • Common Output Formats (HTML, PDF)
  4. Jekyll • Built on Markdown • A simpler execution model

    • Great for static sites, blogs, and documentation. • A great number of extensions, and support on GitHub Pages
  5. Asciidoctor • Built on Asciidoc markup • Has powerful semantic

    constructs • Compatible with Docbook XML • Implementation in Ruby, but support for JVM, JS.
  6. Sphinx • Built on reStructuredText Markup • Has powerful semantic

    constructs • Has a lot of primitives for documenting software • Has a lot of extensions, especially around testing & documenting source code
  7. @ericholscher Lightweight Markup Language • Base format to generate other

    formats from • Readable as plain text • Works well with programmer tools • Works great in code comments/docstrings
  8. Semantic Meaning • The power of HTML & LWML

    • Semantics mean you are saying what something is, not how to display it • Once you know what it is, you can display it properly • “Separation of Concerns”
  9. # HTML (Bad) <b>issue 72</b> # HTML (Good) <span

    class="issue">issue 72</span> # CSS (Good) .issue { font-weight: bold; } Classic HTML Example
  10. # Bad <font color="red">Warning: Don’t do this!</font> # Good

    <span class="warning">Don’t do this!</span> # Best .. warning:: Don’t do this Classic RST Example
  11. # Markdown Check out [PEP 8](https://www.python.org/dev/peps/pep-0008/) # RST

    Check out :pep:`8` # Asciidoc See pep:8 to get started. Semantic Comparison
  12. # Markdown This is an [idempotent](http://docs.foo.com/glossary#term-idempotent) implementation.

    # RST This is an :term:`idempotent` implementation. # Asciidoc This is an term:idempotent implementation Semantic Comparison
  13. See our image :ref[scatter plot] to see more information.

    :: include{file=other-file.md} CommonMark Proposal https://talk.commonmark.org/t/generic-directives-plugins-syntax/444
  14. Semantic Markup • Shows the intent of your words

    • Works across output formats • You can style warnings differently in HTML, PDF, ePub, etc.
  15. Inline Markup • Anything that is included in the

    page content itself • Used for embedding things into the rendered output
  16. Coding Reference ---------------- Generally we follow :pep:`8` in our

    code. However, we also have our own :doc:`style-guide` with exceptions. Inline Markup Example
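
The inline markup on the slide above can be pictured as a lookup from role name to a small function. This is a minimal sketch, not how Sphinx or docutils implement roles: the `ROLES` table, `expand_roles`, and the `.html` link target for `:doc:` are assumptions made for illustration. Only the PEP URL pattern comes from the slides.

```python
import re

# Hypothetical role handlers: each turns the text inside the backticks into HTML.
def pep_role(text):
    number = int(text)
    url = "https://www.python.org/dev/peps/pep-%04d/" % number
    return '<a class="pep" href="%s">PEP %d</a>' % (url, number)

def doc_role(text):
    # Assumption: every document is rendered to "<name>.html" next to this page.
    return '<a class="doc" href="%s.html">%s</a>' % (text, text)

ROLES = {"pep": pep_role, "doc": doc_role}

# Matches :role:`text`
ROLE_PATTERN = re.compile(r":(\w+):`([^`]+)`")

def expand_roles(line):
    """Replace every known :role:`text` in a line with its rendered form."""
    def replace(match):
        name, text = match.groups()
        handler = ROLES.get(name)
        # Unknown roles are left untouched rather than guessed at.
        return handler(text) if handler else match.group(0)
    return ROLE_PATTERN.sub(replace, line)

print(expand_roles("Generally we follow :pep:`8` in our code."))
```

Because the source only says "this is a PEP", the handler is free to decide how that is displayed in each output.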
  17. Page Level Markup • Allows you to nest content

    inside a block • Great for reference endpoints
  18. Page Level Markup • .. directive-name:: • Main source

    of extensibility • A lot of Sphinx’s power comes through directives
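
A rough way to see why directives are the main extension point: the directive line names what the content is, and a separate table decides how each output format shows it. The sketch below is a toy under stated assumptions. `parse_directive`, `RENDERERS`, and `render` are hypothetical names, and real docutils directives are classes that produce doctree nodes rather than strings.

```python
import re

# Matches ".. directive-name:: optional argument"
DIRECTIVE_PATTERN = re.compile(r"^\.\. ([\w-]+)::\s*(.*)$")

def parse_directive(line):
    """Return (name, argument) for a directive line, or None for plain text."""
    match = DIRECTIVE_PATTERN.match(line)
    return match.groups() if match else None

# Hypothetical renderers, keyed by (directive name, output format).
RENDERERS = {
    ("warning", "html"): lambda text: '<div class="warning">%s</div>' % text,
    ("warning", "text"): lambda text: "WARNING: %s" % text,
}

def render(line, output_format):
    """Render one source line for the given output format."""
    parsed = parse_directive(line)
    if parsed is None:
        return line  # Not a directive: pass the text through unchanged.
    name, argument = parsed
    renderer = RENDERERS.get((name, output_format))
    if renderer is None:
        raise ValueError("no %s renderer for directive %r" % (output_format, name))
    return renderer(argument)

print(render(".. warning:: Don't do this", "html"))
print(render(".. warning:: Don't do this", "text"))
```

Adding a new directive or a new output format in this toy means adding entries to the table, without touching the source documents.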
  19. Templates • Allow for extending with specific logic in the

    markup language • Jinja & Liquid are the major ones
  20. Reader • Get input and read it into memory •

    Quite simple, generally don’t need to do much
  21. Parser • Takes the input and actually turns it into

    a Doctree • Handles directives, inline markup, etc. • RST is the only parser implemented in Docutils • Implemented with a line-based recursive state machine
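
To make "line-based state machine" concrete, here is a deliberately tiny sketch that handles only titles and paragraphs. It is an assumption-laden toy: it is not recursive, it represents the doctree as nested lists, and the `parse` function and its state names are invented for illustration rather than taken from docutils.

```python
def parse(text):
    """Parse a tiny RST-like subset into a nested-list doctree.

    States: "body" (between blocks), "title" (just saw a title line),
    and "paragraph" (collecting paragraph lines).
    """
    doctree = ["document"]
    state = "body"
    buffer = []
    lines = text.splitlines()
    underline_chars = set("=-")

    def flush():
        # Close the paragraph being collected, if there is one.
        if buffer:
            doctree.append(["paragraph", " ".join(buffer)])
            del buffer[:]

    for index, line in enumerate(lines):
        stripped = line.strip()
        next_line = lines[index + 1].strip() if index + 1 < len(lines) else ""
        is_underline = bool(stripped) and set(stripped) <= underline_chars
        if not stripped:
            flush()
            state = "body"
        elif is_underline and state == "title":
            state = "body"  # Consume the underline below a title.
        elif state == "body" and next_line and set(next_line) <= underline_chars:
            doctree.append(["title", stripped])
            state = "title"
        else:
            buffer.append(stripped)
            state = "paragraph"
    flush()
    return doctree

print(parse("Title\n=====\n\nWords that have bold\nin them.\n\nParagraph.\n"))
```

Each line is classified using only the current state and a one-line lookahead, which is the core idea behind a line-based parser.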
  22. Doctree • AST for Docutils • Source of Truth •

    Hierarchy with a document at the root node
  23. Transformer • Take the doctree and modify it in place

    • Allows for full knowledge of the content • Table of Contents • Implemented by traversing nodes
  24. <document ids="title" names="title" source="test.rst" title="Title"> <title> Title <paragraph> Words that

    have <strong> bold in them <paragraph> Paragraph. Transform Example
  25. Writer • Takes the Doctree and writes it to actual

    files • HTML, XML, PDF, etc. • Implemented with a Visitor pattern
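
The Visitor pattern mentioned above can be sketched in a few lines: the tree walks itself and calls a `visit_<tag>` method on the way into each node and a `depart_<tag>` method on the way out. This is a toy model, not docutils' writer classes. `Node`, `HTMLWriter`, and the `TAGS` mapping are assumptions made for illustration.

```python
import html


class Node:
    """Toy doctree node: a tag name plus child nodes or text strings."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)

    def walkabout(self, visitor):
        """Call visit_<tag>, walk the children, then call depart_<tag>."""
        getattr(visitor, "visit_" + self.tag)(self)
        for child in self.children:
            if isinstance(child, Node):
                child.walkabout(visitor)
            else:
                visitor.visit_text(child)
        getattr(visitor, "depart_" + self.tag)(self)


class HTMLWriter:
    """Toy writer: collects HTML fragments while the tree is walked."""

    # Hypothetical mapping from doctree tags to HTML elements.
    TAGS = {"document": "article", "title": "h1", "paragraph": "p", "strong": "strong"}

    def __init__(self):
        self.body = []

    def visit_text(self, text):
        self.body.append(html.escape(text))

    def __getattr__(self, name):
        # Build visit_<tag> / depart_<tag> handlers from the TAGS table.
        action, _, tag = name.partition("_")
        if action in ("visit", "depart") and tag in self.TAGS:
            template = "<%s>" if action == "visit" else "</%s>"
            return lambda node: self.body.append(template % self.TAGS[tag])
        raise AttributeError(name)


tree = Node("document", [
    Node("title", ["Title"]),
    Node("paragraph", ["Words that have ", Node("strong", ["bold"]), " in them"]),
])
writer = HTMLWriter()
tree.walkabout(writer)
print("".join(writer.body))
```

A second writer class with different handlers could produce another format from the same tree, which is the point of keeping writers separate from the doctree.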
  26. def paragraph(self, block): p = nodes.paragraph() p.line = block.start_line append_inlines(p,

    block.inline_content) self.current_node.append(p) Markdown Parser
  27. TOC Transformer • Enabled with `.. contents::` directive • Adds

    a pending node during parsing • Gets properly set with a transform
  28. <topic classes="contents" ids="toc" names="toc"> <title> TOC <pending> .. internal attributes:

    .transform: docutils.transforms.parts.Contents .details: TOC Transformer
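
The pending-node idea can be sketched as follows: the parser cannot know every title while it is still reading the page, so it leaves a placeholder, and a later pass over the finished tree fills it in. This is a simplified illustration. `Node` and `contents_transform` are hypothetical stand-ins, and the assumption that each section's first child is its title is made only for this example.

```python
class Node:
    """Toy doctree node: a tag name plus child nodes or text strings."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)

    def traverse(self):
        """Yield this node and every descendant node, depth first."""
        yield self
        for child in self.children:
            if isinstance(child, Node):
                for descendant in child.traverse():
                    yield descendant


def contents_transform(document):
    """Replace each pending node with a bullet list of section titles."""
    nodes = list(document.traverse())
    # Assumption: the first child of every section is its title node.
    titles = [n.children[0].children[0] for n in nodes if n.tag == "section"]
    for node in nodes:
        for index, child in enumerate(node.children):
            if isinstance(child, Node) and child.tag == "pending":
                items = [Node("list_item", [title]) for title in titles]
                node.children[index] = Node("bullet_list", items)


document = Node("document", [
    Node("topic", [Node("title", ["TOC"]), Node("pending")]),
    Node("section", [Node("title", ["Install"])]),
    Node("section", [Node("title", ["Usage"])]),
])
contents_transform(document)
toc = document.children[0].children[1]
print(toc.tag, [item.children[0] for item in toc.children])
```

The transform runs after parsing, so it sees sections that appear later in the page than the table of contents itself.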
  29. Sphinx • Builds on top of the standard docutils concepts

    • Adds its own abstractions, but uses the same docutils machinery underneath • Goes from single page to project
  30. Sphinx Application • Main level of orchestration for Sphinx •

    Handles configuration & building • Sphinx()
  31. Sphinx Environment • Keeps state for all the files for

    a project • Serialized to disk in between runs • Works as a cache
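
A minimal sketch of an environment that works as a cache: it remembers each source file's modification time and parsed result, is saved to disk between runs, and reports only the files that need re-reading. Everything here is an assumption for illustration. The `Environment` class below is a toy, it stores JSON for simplicity, and its "doctree" is just a list of lines. Sphinx's own environment and on-disk format are different.

```python
import json
import os


class Environment:
    """Toy build environment: per-file mtimes plus parsed results."""

    def __init__(self, mtimes=None, doctrees=None):
        self.mtimes = mtimes or {}
        self.doctrees = doctrees or {}

    @classmethod
    def load(cls, path):
        """Load a saved environment, or start fresh if none exists yet."""
        if os.path.exists(path):
            with open(path) as handle:
                data = json.load(handle)
            return cls(data["mtimes"], data["doctrees"])
        return cls()

    def save(self, path):
        """Serialize the environment so the next run can reuse it."""
        with open(path, "w") as handle:
            json.dump({"mtimes": self.mtimes, "doctrees": self.doctrees}, handle)

    def outdated(self, filenames):
        """Return the files that are new or changed since they were last read."""
        return [name for name in filenames
                if self.mtimes.get(name) != os.path.getmtime(name)]

    def read(self, filename):
        """Parse one file (here: split it into lines) and record its mtime."""
        with open(filename) as handle:
            self.doctrees[filename] = handle.read().splitlines()
        self.mtimes[filename] = os.path.getmtime(filename)
```

On a second run, loading the saved environment and calling `outdated()` on the same unchanged files returns an empty list, so nothing has to be parsed again.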
  32. Sphinx Builder • Handles generating output • Where templates &

    styles are implemented • Allow customization of specific elements
  33. Oftentimes you can save a lot of effort

    by writing simple extensions to your tools