GitHub Flavored Ruby - Speaker Deck

Slide 1

Slide 1 text

GITHUB FLAVORED RUBY Tom Preston-Werner Hello, my name is Tom Preston-Werner

Slide 2

Slide 2 text

@mojombo you should follow me and read my blog tom.preston-werner.com You can find me on Twitter and GitHub as mojombo, and read my blog at http://tom.preston-werner.com.

Slide 3

Slide 3 text

I’m a cofounder and CTO of GitHub.

Slide 4

Slide 4 text

RakeGem Readme Driven Development TomDoc Semantic Versioning Relentless Modularization Today I’m going to talk about five ideas that we use at GitHub to streamline how we approach building RubyGems.

Slide 5

Slide 5 text

RELENTLESS MODULARIZATION --2 MINUTES--

Slide 6

Slide 6 text

[bar chart: lines of code, 0 to 20,000, comparing Tinything and Bigshit] Think of a small project you’ve done. Maybe it has 1000 lines of code. It’s a pleasure to work with. Easy to maintain. You love working on it. Now think of a huge monolithic project you’ve been part of. I’m betting you’d give anything to stay away from that code.

Slide 7

Slide 7 text

FUUUUUUUUUUUUUUUU How the hell does this even happen!? Large code has a tendency to become messy and tightly coupled. Sometimes you don’t even realize this is happening. Without a weapon to fight this trend, you’ll end up spending your days untangling slinkies instead of clapping like an idiot while they slink down the stairs.

Slide 8

Slide 8 text

“How do I decide what to modularize?” Sometimes it can be tricky to decide what to modularize and when something should be extracted. There’s a simple heuristic I like to use. Modularize...

Slide 9

Slide 9 text

EVERYTHINGGGGGGGGG EVERYTHINGGGGGGGGGGGG. If you find yourself wondering if you should modularize something or not, just remember this baby staring into your soul and you’ll do the right thing.

Slide 10

Slide 10 text

github.com grit smoke chimney bertrpc proxymachine ernie failbot gerve resque RockQueue jekyll nodeload albino markup camo gollum heaven stratocaster amen haystack hubot services help.github.com jobs At GitHub we embrace modularization in a big way. We continually extract pieces of the main GitHub.com Rails app into their own components. Then we give them funny names.

Slide 11

Slide 11 text

A neat trick that I use to approach modularity is by remembering my good childhood friend Mr. Rogers. He liked to make believe, and so do I.

Slide 12

Slide 12 text

Make Believe Open Source Libraries I make believe that whatever I’m working on is going to be open sourced. This forces me to use proper abstractions and prevents me from coupling the code too tightly with the main app.

Slide 13

Slide 13 text

Make Believe Open Source Libraries Even better is if you really DO open source your libraries and components. We’re a huge fan of this at GitHub. We try to open source anything that does not represent core business value.

Slide 14

Slide 14 text

MODULARIZE TO PREVENT PAIN KEY CONCEPT Small projects are easy and enjoyable to write and maintain. Big projects are hard and suck to maintain. Save yourself some pain and modularize like you mean it!

Slide 15

Slide 15 text

README DRIVEN DEVELOPMENT --6 MINUTES--

Slide 16

Slide 16 text

[waterfall] In 1970 Winston Royce wrote a paper about managing large software projects. In it he outlined a methodology that came to be known as Waterfall. Even though he wrote about this system as an example of what NOT to do, enterprises and government ignored that and started using it anyway. =/ Overspecifying requirements is a disaster. We’ve all embraced Agile techniques to escape its tyranny.

Slide 17

Slide 17 text

[cowboy] I don’t follow anyone’s rules... ...not even my own But in reaction to Waterfall, we’re tempted to go too far in the other direction and become cowboy coders. This is just as bad.

Slide 18

Slide 18 text

A perfect implementation of the wrong specification is useless. Either way you can end up with the wrong specification. A perfect implementation of the wrong specification is useless.

Slide 19

Slide 19 text

IS THERE A MIDDLE GROUND? there must be a middle ground, right? Surely there must be some solution that lies between these two extremes. Something that’s not OVER specified or UNDER specified.

Slide 20

Slide 20 text

WRITE YOUR README FIRST There’s already a document that we write that contains the information we need to understand a project and how it works. It’s called the README. What if we wrote our READMEs first? We could think through the problem domain enough to prevent big mistakes, but still leave ourselves with enough flexibility to end up with a correct implementation.

Slide 21

Slide 21 text

Readme.md Spec.md When I first started doing this, it was amazing. But it can be confusing if you have an empty repository with just a README file and no implementation. I’ve solved this problem by renaming README to SPEC during the initial phase. Then I move parts of the SPEC into the README as I implement features, thereby keeping the code and the docs in sync.

Slide 22

Slide 22 text

google://readme driven development I’ve written a blog post that further explains this idea. It’s on my weblog. Just search for “readme driven development” and it’ll be the first result.

Slide 23

Slide 23 text

USE RDD TO SPECIFY THE RIGHT PRODUCT KEY CONCEPT RDD helps you build better software by having you write down your thoughts before you start coding, and it keeps you from locking in the wrong specification by writing too much.

Slide 24

Slide 24 text

RAKEGEM --12 MINUTES--

Slide 25

Slide 25 text

Rakegem is a Minecraft plugin I created that totally makes it easy to harvest Rubies from a standard grass block. It’s really great when... Naw, I’m just kidding.

Slide 26

Slide 26 text

RAKE-BASED GEM BUILDER and deployer, doccer, tester, and manifester Rakegem is a flexible, customizable Rake-based gem builder, and more.

Slide 27

Slide 27 text

github.com/ mojombo/ rakegem If you want to follow along, load up this URL. You’ll see just how simple it really is.

Slide 28

Slide 28 text

NO DEPENDENCIES like, for real. no gems involved*. * except yours, duh Rakegem has NO DEPENDENCIES whatsoever.

Slide 29

Slide 29 text

HAND-ROLLED GEMSPEC + SIMPLE RAKE TASKS RubyGems already has a great system for specifying everything about how the gem works. It’s called the gemspec. Rakegem gives you a template gemspec that’s easy to fill out and doesn’t involve any magic. It combines that with a simple Rakefile that handles all the build and release dynamics for you.
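To give a feel for what a hand-rolled gemspec involves, here is a minimal sketch. It is not the Rakegem template itself; the gem name, version, summary, author, and file list are placeholders made up for illustration. The only API used is the standard `Gem::Specification` that ships with RubyGems.

```ruby
require "rubygems"

# Hypothetical gem: every name and value below is a placeholder,
# not something taken from the Rakegem template.
spec = Gem::Specification.new do |s|
  s.name          = "scoped"
  s.version       = "0.1.0"
  s.summary       = "Placeholder summary for an example gem."
  s.authors       = ["Your Name"]
  s.files         = %w[README.md Rakefile lib/scoped.rb]
  s.require_paths = ["lib"]
end

# The build artifact is named after the gem name and version.
puts "#{spec.full_name}.gem"
```

Because the gemspec is plain Ruby with no generator in between, anything a Rakefile needs (name, version, file list) can be read straight off the `spec` object.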

Slide 30

Slide 30 text

GEMSPEC Here’s what the gemspec template looks like. It provides a lot of guidance about how to write your gemspec so you don’t have to dig through mountains of documentation.

Slide 31

Slide 31 text

RAKEFILE The Rakefile can be copied directly to your project without modification. Everything it needs it can get from the gemspec.

Slide 32

Slide 32 text

$ rake -T
rake build         # Build scoped-0.1.0.gem into the pkg directory
rake clobber_rdoc  # Remove rdoc products
rake console       # Open an irb session preloaded with this library
rake coverage      # Generate RCov test coverage and open in your browser
rake gemspec       # Generate scoped.gemspec
rake rdoc          # Build the rdoc HTML Files
rake release       # Create tag v0.1.0 and build and push scoped-0.1.0.gem to Rubygems
rake rerdoc        # Force a rebuild of the RDOC files
rake test          # Run tests
rake validate      # Validate scoped.gemspec

It adds Rake tasks for all your normal needs: building the gem and docs, running tests, and doing releases.

Slide 33

Slide 33 text

RAKEGEM - CUSTOMIZATION The beauty of this system is that it’s infinitely customizable. Since the entire system is embedded in your project as simple code, you can change anything you want to get the perfect workflow.

Slide 34

Slide 34 text

RAKEFILE Here’s what the release task looks like. I like to use a version number that looks like “vX.Y.Z”, but maybe you don’t. To change how Rakegem works, just change that line of code!
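The slide's code is not reproduced in this transcript, so here is a rough sketch of the idea rather than the actual Rakegem release task. The helper `release_commands` and its `tag_format` parameter are hypothetical names; the point is only that the tag format is one line of plain code that you can edit.

```ruby
# Hypothetical helper (not part of Rakegem): builds the shell commands a
# release task might run. Change tag_format to change the tag naming scheme.
def release_commands(name, version, tag_format: "v%s")
  tag      = format(tag_format, version)      # "v0.1.0" with the default
  gem_file = "pkg/#{name}-#{version}.gem"

  [
    "git tag #{tag}",
    "git push origin #{tag}",
    "gem push #{gem_file}"
  ]
end

puts release_commands("scoped", "0.1.0")
puts release_commands("scoped", "0.1.0", tag_format: "release-%s")
```

Returning the commands instead of running them keeps the sketch safe to execute; a real task would pass each string to Rake's `sh`.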

Slide 35

Slide 35 text

STOP FIGHTING YOUR GEM BUILDING SYSTEM KEY CONCEPT Your gem management system should be simple and customizable. Rakegem gives you the ultimate power and freedom to get things done without any hassle.

Slide 36

Slide 36 text

TOMDOC --16 MINUTES--

Slide 37

Slide 37 text

FOUR LEVELS of documentation Line Code API Book I’ve identified four levels of code documentation. Line-level docs explain tricky lines of code within methods. Code-level docs describe how methods or classes work. API-level docs are for end users of your library. Book-level docs provide a long format overview suitable to beginners.

Slide 38

Slide 38 text

FOUR LEVELS of documentation Code TomDoc is my solution to Code-level docs.

Slide 39

Slide 39 text

tomdoc.org If you’d like to follow along, you can find the specification at this URL.

Slide 40

Slide 40 text

WHY DOCUMENT CODE? what does it do? is it considered public? what params are expected? what types are the params? what are valid options? how do I use the damn thing? what type is the return? There are a lot of things we ask ourselves when looking at new code. Ruby is especially difficult to unravel because of its flexibility. If we don’t write down what we’re thinking when we write code, that information is easily lost to the ghosts of time.

Slide 41

Slide 41 text

PAST TOM AND FUTURE TOM I’d like to introduce you to Past Tom. He’s been looking out for me for a long time. Four years ago he was writing TomDoc that I still read today. Every time I’m coding now, I think about Future Tom. If I write good docs, I know he’ll look back at me from the future and give me two big thumbs up, because I’ve saved him a ton of time and stress.

Slide 42

Slide 42 text

class Gollum
  class Wiki
    #
    #
    #
    def exist?
      # ...
    end
  end
end

Here’s some code. If all we have is the method signature, it’s hard to tell what’s going on. Even something simple like what type it returns requires reading the code.

Slide 43

Slide 43 text

class Gollum
  class Wiki
    # Public: Check whether the wiki's git repo exists on the filesystem.
    #
    # Returns true if the repo exists, and false if it does not.
    def exist?
      # ...
    end
  end
end

what does it do? is it considered public? what type is the return? With just a few short words, we can solve a lot of problems and make sure that future developers who work with this code don’t change it in unpredictable ways.

Slide 44

Slide 44 text

class Gollum
  class Wiki
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    def write_page(name, format, data, commit = {})
      # ...
    end
  end
end

Maybe you think that’s too trivial and reading the code would be fine. Ok, how about this example. Not so simple now, is it? We can get some idea of what the method does, and even though the argument names are good, there is no visibility into specifics about either. As coders, we rely on specifics to write good code.

Slide 45

Slide 45 text

class Gollum
  class Wiki
    # Public: Write a new version of a page to the Gollum repo root.
    #
    # name   - The String name of the page.
    # format - The Symbol format of the page.
    # data   - The new String contents of the page.
    # commit - The commit Hash details:
    #          :message   - The String commit message.
    #          :name      - The String author full name.
    #          :email     - The String email address.
    #          :parent    - Optional Grit::Commit parent to this update.
    #          :tree      - Optional String SHA1 of the tree to create the
    #                       index from.
    #          :committer - Optional Gollum::Committer instance. If provided,
    #                       assume that this operation is part of a batch of
    #                       updates and the commit happens later.
    #
    # Returns the String SHA1 of the newly written version, or the
    # Gollum::Committer instance if this is part of a batch update.
    def write_page(name, format, data, commit = {})
      # ...
    end
  end
end

what params are expected? what types are the params? what are valid options? With a little bit of extra work we can illuminate what this method does and make it obvious how to use it without having to dig through long method chains and a ton of code.

Slide 46

Slide 46 text

class Gollum
  class Page
    #
    #
    #
    #
    #
    #
    #
    #
    #
    #
    def self.cname(name)
      # ...
    end
  end
end

One last example. Here’s a simple method. The name was obvious to me when I wrote it, but two years later, it’s a different story.

Slide 47

Slide 47 text

class Gollum
  class Page
    # Convert a human page name into a canonical page name.
    #
    # name - The String human page name.
    #
    # Examples
    #
    #   Page.cname("Bilbo Baggins")
    #   # => 'Bilbo-Baggins'
    #
    # Returns the String canonical name.
    def self.cname(name)
      # ...
    end
  end
end

how do I use the damn thing? With just a few short lines of TomDoc, I’ve ensured that every developer that sees this code for the rest of time will understand and be able to use this method in the proper fashion. That’s a pretty big benefit for a few minutes of effort!

Slide 48

Slide 48 text

The TomDoc specification is designed to be as simple as possible. You should be able to read the spec once and know how to write TomDoc without referring back to it very often. Code docs should be optimized for humans. We are the ones reading and writing it.

Slide 49

Slide 49 text

ONE MORE THING Oooh.

Slide 50

Slide 50 text

This is Eric Hodel. He likes hats. He also likes TomDoc, and he just happens to be the maintainer of RDoc.

Slide 51

Slide 51 text

RDOC 3.10 WILL SUPPORT TOMDOC He’s added TomDoc support to the latest versions of RDoc and if you install 3.10 or later, you can convert your TomDoc’d code to nice HTML output without any extra tools!

Slide 52

Slide 52 text

rdoc --markup=tomdoc Here’s the magic incantation.

Slide 53

Slide 53 text

And here’s what it looks like.

Slide 54

Slide 54 text

Kickass.

Slide 55

Slide 55 text

CODE DOCUMENTATION IS FOR HUMANS KEY CONCEPT Stop optimizing your docs for machines, and start writing them for Future You. TomDoc is easy to write, easy to read, and saves everyone a boatload of time.

Slide 56

Slide 56 text

SEMANTIC VERSIONING --23 MINUTES--

Slide 57

Slide 57 text

DEPENDENCY HELL Version Lock Version Promiscuity There’s a dread place in software development called dependency hell. It’s where you end up when you have version requirements that are either overly specific or so broad that incompatible versions can sneak in and screw up your system.

Slide 58

Slide 58 text

semver.org You can find the Semantic Versioning spec at this URL. It’s very short and easy to follow.

Slide 59

Slide 59 text

PUBLIC API Remember TomDoc? The hardest part of implementing SemVer is defining a public API for your project. Without a public API that tells people what classes/methods/etc they can and cannot use, it is impossible to tell users how those things change over time. Remember TomDoc? If you use TomDoc and the Public/Internal/Deprecated designators, you can easily define your public API without a lot of extra work. So do that.
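As a small illustration of those designators, here is a sketch of one class carrying all three. The class `VersionLabel` and its methods are invented for this example; only the `Public:`, `Internal:`, and `Deprecated:` prefixes come from TomDoc.

```ruby
# Hypothetical class used only to show the three TomDoc designators.
class VersionLabel
  # Public: Render a version label from its three parts.
  #
  # major - The Integer major version.
  # minor - The Integer minor version.
  # patch - The Integer patch version.
  #
  # Examples
  #
  #   VersionLabel.render(2, 4, 3)
  #   # => "2.4.3"
  #
  # Returns the String version label.
  def self.render(major, minor, patch)
    join_parts([major, minor, patch])
  end

  # Deprecated: Render a version label. Use VersionLabel.render instead.
  #
  # Returns the String version label.
  def self.to_label(major, minor, patch)
    render(major, minor, patch)
  end

  # Internal: Join version parts with dots. Not part of the public API,
  # so it may change in any release.
  #
  # parts - The Array of Integer version parts.
  #
  # Returns the String joined parts.
  def self.join_parts(parts)
    parts.join(".")
  end
end
```

Read this way, only `render` is a promise to users; `to_label` is on its way out, and `join_parts` can change without touching the major version.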

Slide 60

Slide 60 text

2.4.3 major minor patch In SemVer, there are three numbers that comprise the version number. Major, minor, and patch.

Slide 61

Slide 61 text

MAJOR backwards incompatible big changes The major version number must be incremented anytime the public API changes in a backwards incompatible way. If you’re a responsible software developer, you don’t want this to happen very often. Maintaining backwards compatibility is a big part of not screwing over your users.

Slide 62

Slide 62 text

MINOR backwards compatible new functionality big internal changes may contain bug fixes The minor version must be incremented when new functionality is added to the public API. These changes must always be backwards compatible.

Slide 63

Slide 63 text

PATCH backwards compatible bug fixes only The patch version must be incremented if bugs are fixed to bring the code back into line with the documentation. These must always be backwards compatible and must not change the public API in any way.
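The three rules can be sketched as a tiny function. The name `bump` is made up for this example, and it only handles plain "X.Y.Z" strings (no pre-release or build suffixes). Under SemVer, bumping a number resets the numbers to its right to zero.

```ruby
# Hypothetical helper: returns the next version string for a given change type.
#   :major - backwards incompatible change to the public API
#   :minor - backwards compatible new functionality
#   :patch - backwards compatible bug fix
def bump(version, level)
  major, minor, patch = version.split(".").map { |part| Integer(part, 10) }

  case level
  when :major then "#{major + 1}.0.0"
  when :minor then "#{major}.#{minor + 1}.0"
  when :patch then "#{major}.#{minor}.#{patch + 1}"
  else raise ArgumentError, "unknown level: #{level.inspect}"
  end
end

puts bump("2.4.3", :patch)
puts bump("2.4.3", :minor)
puts bump("2.4.3", :major)
```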

Slide 64

Slide 64 text

gem "gollum", "~> 2.4" BUNDLER If you follow these rules, you can avoid dependency hell in your project by using Bundler’s pessimistic version constraint operator. This rule means that any version >= 2.4.0 and < 3.0.0 will satisfy the requirement.
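If you want to check a constraint yourself, RubyGems exposes the same matching logic through `Gem::Requirement` and `Gem::Version`. This is a minimal sketch; the `satisfies?` helper is a made-up convenience wrapper, and the version numbers are arbitrary examples.

```ruby
require "rubygems"

# The same constraint string as in the Gemfile line above.
REQUIREMENT = Gem::Requirement.new("~> 2.4")

# Hypothetical convenience wrapper around Gem::Requirement#satisfied_by?.
def satisfies?(version)
  REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end

%w[2.3.9 2.4.0 2.9.1 3.0.0].each do |version|
  puts "#{version}: #{satisfies?(version)}"
end
```

Anything in the 2.x line from 2.4.0 onward matches, and the next major version does not, which is exactly the guarantee SemVer lets you lean on.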

Slide 65

Slide 65 text

If you’re worried about large version numbers, you can relax. They’re numbers. It’s not like they’re going to run out.

Slide 66

Slide 66 text

USE VERSION NUMBERS TO CONVEY MEANING KEY CONCEPT Why bother with three part version numbers if you’re not going to convey consistent meaning with them? You may as well just use a single incrementing number if that’s the case. If you follow SemVer you can save yourself from dependency hell.

Slide 67

Slide 67 text

SUMMARY --28 MINUTES--

Slide 68

Slide 68 text

Is your system broken down into small manageable pieces? RELENTLESS MODULARIZATION

Slide 69

Slide 69 text

Are you wasting time because of too much or too little planning? README DRIVEN DEVELOPMENT