Slide 1

Slide 1 text

Augie is a co-maintainer of Mercurial; Nathaniel works on gRPC. NEXT: Let's jump right into some code... Code Unto Others Augie Fackler & Nathaniel Manista Google, Inc. 30 May 2016 Code Unto Others http://localhost:8080/print/ 1 of 75 5/30/16, 09:41

Slide 2

Slide 2 text

We're just going to jump right into code: here's a class in a years-established, widely used codebase.

* 1700 lines!
* 74 public methods and 36 private methods!
* 18 public attributes and 9 private attributes!

Slide 3

Slide 3 text

Okay, so we're good, right? Quick lecture?

Is this a problem?
Not for the compiler, interpreter, or tools that process the class.
Not for the hardware that executes the class.
Not for users who use the software built with the class.

Slide 4

Slide 4 text

Most software is developed collaboratively, so this is a problem. The guilty party here is localrepo.localrepository in Mercurial, which we picked as a test case because the maintainers already know it's a problem and it's publicly visible.

Software is Made of People

Slide 5

Slide 5 text

This is exactly what we call "readability". You don't scale in time: you won't be on the project forever. You don't scale in communication: your project may have hundreds of collaborating developers maintaining its code and tens of thousands of developer-users writing other software that relies on it. NEXT: Hierarchy of code needs.

Readability

Your software needs to describe itself to readers the way you would describe it.

Slide 6

Slide 6 text

Readability is third priority at best, but orthogonal to the others, so there's no excuse. NEXT: Our first specific complaint with localrepository is lack of cohesion.

Correct
Efficient
Readable
Self-Actualization
Kill All Humans

Slide 7

Slide 7 text

A grab-bag with too little relationship among its parts. Roughly, this class is serving three roles: storage logic, business logic on top of that storage, and being a data container for related data. It could probably be (at least) three layers of objects via composition.

Lack Of Cohesion

Slide 8

Slide 8 text

Some of its methods do actual work; some merely wrap work-doing methods for convenience. At least the "set" method comes out and says so in its docstring.

Mixes function and convenience

    def set(self, expr, *args):
        '''Find revisions matching a revset and emit changectx instances.

        This is a convenience wrapper around ``revs()`` that iterates the
        result and is a generator of changectx instances.
        '''
        for r in self.revs(expr, *args):
            yield self[r]

Slide 9

Slide 9 text

Mixes layers of abstraction (Low-level structures and high-level structures that use those low-level structures.)

Slide 10

Slide 10 text

Most people's working memory is seven plus or minus two items. Ten is a sharp person. Ninety-two is right out. How many is too many? When you can't talk about a group of elements without using the language of subdivision and segmentation, that's when the group is too large.

Just too many elements in its API

Slide 11

Slide 11 text

1700 lines is roughly a magazine article or long blog post. No one should look at a single class and think "I'd better put a pot of coffee on". NEXT: Well, that's emotional and subjective; can we be more concrete?

Text is too long.

Slide 12

Slide 12 text

A quadratic number of relationships is surprising but also not realistic.

    def long_function(parameter):
        v1 = # expression using parameter
        v2 = # expression using parameter, v1
        v3 = # expression using parameter, v1, v2
        v4 = # expression using parameter, v1, v2, v3
        v5 = # expression using parameter, v1...v4
        #
        # (much more)
        #
        v30 = # expression using parameter, v1...v29
        return # expression using parameter, v1...v30

Slide 13

Slide 13 text

In practice most values are computed from a few recently computed values, and relationships are linear. NEXT: But the reader doesn't know that in the moment!

    def more_realistic_long_function(parameter):
        v1 = # expression using parameter
        v2 = # expression using parameter, v1
        v3 = # expression using parameter, v1, v2
        v4 = # expression using v1, v2, v3
        v5 = # expression using v3, v4
        #
        # (much more)
        #
        v30 = # expression using v28, v29
        return # expression using v29, v30

Slide 14

Slide 14 text

At the last line, the reader has to keep parameter and v1 through v30 in working memory; all of those values are *potentially* used in that return expression. Code length is about the reader's working memory too. Ends of scopes allow the reader to "clear head". NEXT: localrepository not inflicted by enemy.

    def long_function_as_read(parameter):
        v1 = # expression using parameter
        v2 = # expression using parameter, v1
        v3 = # expression using parameter, v1, v2
        v4 = # expression using v1, v2, v3
        v5 = # expression using v3, v4
        #
        # (much more)
        #
        v30 = # expression using v28, v29
        # Now imagine yourself about to read
        # the next line...

Slide 15

Slide 15 text

No ill will was intended by past authors, yet here we are. How can we do better? Let's reflect on the software development process and figure out how it got to this undesirable state without an evil actor. TRANSITION: First, we'll talk about what goes into a change.

localrepository wasn't inflicted by a malevolent enemy

Slide 16

Slide 16 text

All changes must satisfy these requirements, and they get reviewed and considered mostly on their incremental merits. Occasionally someone will look at the bigger picture, but that's rare. TRANSITION: It's a near-universal truth that we underestimate how long software lives.

Requirements of any software change:
A change must meet a need in the domain of the software.
A change's author must understand the problem sufficiently to create the change.

Slide 17

Slide 17 text

Structure is typically given only as much attention as is required to avoid broken code. And of course authors have to understand the problems that they are commanding computers to solve. TRANSITION: So what we're looking for is some silver bullets.

Most of the time we guess about the lifetime of code, we guess short, and the code outlives our projections.

Slide 18

Slide 18 text

Tools, practices, and so on that we can add or do that cost little but keep the code clean on an incremental basis. It's important for these to be cheap because "pay a lot now; you'll be happy in ten years" won't sell.

Silver Bullets

Slide 19

Slide 19 text

Some early work in the field. Most of the time people cite this statement, it's to describe how easy it is to infer behavior from state relative to how hard it is to infer state from behavior. We agree with that, but really we bring it up now to point out that he's saying he can understand things without even mentioning the problem domain.

"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious."
Fred Brooks, The Mythical Man-Month

Slide 20

Slide 20 text

Mastering a particular problem domain can't generically be made easier, but readability and maintainability emerge from code structure, not problem domain.

Slide 21

Slide 21 text

Again we're speaking independently of problem domain. What's in control, and to what does it pass control? What's changing, and what observes and reacts to that change? Out-parameters hide change. Mutable global state hides change and runs the risk of being a hidden input to nearly every function in your system. TRANSITION: Let's talk about how we cope with similar things in the physical world.

If it's apparent where the state is changed and how program control is passed from one component to another, most of the puzzle is solved.
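The remark that out-parameters hide change can be sketched as follows. This is only an illustration under assumed names: `append_squares_into` and `squares_of` are hypothetical and do not come from the talk.

```python
# Hypothetical names for illustration; not from the talk.

def append_squares_into(values, results):
    """Out-parameter style: mutates `results` instead of returning a value."""
    for value in values:
        results.append(value * value)


def squares_of(values):
    """Return-value style: the new state appears as an assignment at the call site."""
    return [value * value for value in values]


hidden = []
append_squares_into([1, 2, 3], hidden)  # `hidden` changed, but nothing here says so

visible = squares_of([1, 2, 3])  # the change is introduced exactly where it is assigned
```

Both calls compute the same list; the difference is only in whether a reader can see, at the call site, where state changes.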

Slide 22

Slide 22 text

We want to exclude undesirable influences from the system under test.

Why do we perform experiments in laboratories?

Slide 23

Slide 23 text

Failure rates increase when contaminants are circulating around.

Why do we manufacture materially complex goods in clean rooms?

Slide 24

Slide 24 text

Deliberately, artificially simple environments not only reduce the problems that can take place inside them but also reduce the questions that can be asked about how the things inside them work.

Do Where the Doing is Simplest

Slide 25

Slide 25 text

This sounds good, right?

Code Where the Coding is Simplest

Slide 26

Slide 26 text

This is just a constant at class scope. What's remarkable about it? Is it meant to:

* Override a superclass constant?
* Be overridden by a subclass?
* Be shadowed by a subclass's instance variable?
* Be used from outside the class?

    class bar(foo):
        ANSWER = 42

Slide 27

Slide 27 text

Understand that other maintainers of your code are going to have those questions about what you put in a class. The maintainer of this code is going to need to know the answers to all of those.

Placing a code element at class-scope invites questions.

Slide 28

Slide 28 text

Avoid placing code elements at class-scope unless you have no alternative.

Slide 29

Slide 29 text

"'only ever used from' implies 'should be syntactically nested inside'" is one of the biggest misconceptions we see. Should all programs be written as one giant main method?

The class "looks nice" with it there.
I really want it there!
It's only ever used from the class.

Slide 30

Slide 30 text

So where do things go?

The class cannot function as required by its users without the code element at class scope.

Slide 31

Slide 31 text

Only "promote" code elements into other scopes when required to do so.

Place all code elements at module-scope by default.

Slide 32

Slide 32 text

A test input string that is only used in one test method: let's not go overboard; that's fine as a local constant in the one test method. Now that we've talked some about where to put code, let's talk about when it's reasonable to use a class.

Classes: module-scope.
Functions: module-scope.
Constants: module-scope.
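A small sketch of the module-scope default, under assumed names (`_ANSWER`, `_doubled`, `Bar`, and `compute` are hypothetical, loosely echoing the `ANSWER = 42` slide). The constant and the helper live at module scope, so none of the inheritance questions raised earlier apply to them.

```python
# Hypothetical names for illustration; not from the talk.

_ANSWER = 42  # module scope: not part of any class, so it cannot be overridden or shadowed


def _doubled(value):
    """Module-level helper; it needs nothing from any instance."""
    return 2 * value


class Bar:
    """Holds at class scope only what its users require to be there."""

    def __init__(self, offset):
        self._offset = offset

    def compute(self):
        return _doubled(_ANSWER) + self._offset
```

A reader of `Bar` no longer has to ask whether a subclass overrides the constant, because the class does not own it.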

Slide 33

Slide 33 text

Only use classes for what classes are for. And what are classes for?

Be a class realist.

Slide 34

Slide 34 text

An important note here is that there's no behavior, just data.

1. Classes provide a way to structure and aggregate data.

Slide 35

Slide 35 text

2. Purely abstract classes define types.

Slide 36

Slide 36 text

3. Classes implement types.

Slide 37

Slide 37 text

This is probably the most recognizably traditional use of classes.

4. Classes provide a way to create arbitrarily many instances that behave in ways that are mostly similar, but different according to values specified at construction.

Slide 38

Slide 38 text

3 is implementing an abstract type; 4 is parametric behavior.

Maintainable classes typically avoid mixing these. Mixing (3) and (4) is okay, but not the others.

Slide 39

Slide 39 text

Some things classes are not for:

Slide 40

Slide 40 text

How can that be? They're so different!

Being functions.

Slide 41

Slide 41 text

Need state to compute something? Better a public function using a private class than a public class that exists just to expose a single public method.

    my_object = MyClass(construction_parameter)
    my_value = my_object.my_method(method_parameter)

    # is the same as
    my_value = MyClass(construction_parameter).my_method(
        method_parameter)

    # rename `my_method` to `__call__`...
    my_value = MyClass(construction_parameter)(
        method_parameter)

    # elide the class entirely
    my_value = my_function_that_was_a_class(
        construction_parameter, method_parameter)
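One way "a public function using a private class" might look, as a minimal sketch. The names `_RunningTotal` and `cumulative_sums` are hypothetical and chosen only for illustration.

```python
# Hypothetical names for illustration; not from the talk.

class _RunningTotal:
    """Private helper that holds state for the duration of one computation."""

    def __init__(self, start):
        self._total = start

    def add(self, amount):
        self._total += amount
        return self._total


def cumulative_sums(start, amounts):
    """Public entry point; callers never see or construct _RunningTotal."""
    running_total = _RunningTotal(start)
    return [running_total.add(amount) for amount in amounts]
```

The public API here is a single function; the stateful class stays an implementation detail that can be changed or removed without affecting callers.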

Slide 42

Slide 42 text

Code smell of using classes for namespacing; use a module instead.

Being concrete, but never being instantiated.

Slide 43

Slide 43 text

When you're writing an interface and a handful of implementations of that interface that you think are the only ones that will ever need to exist.

Enumerated polymorphism.

Slide 44

Slide 44 text

This looks like what we just covered...

    class MyClass(object):
        MY_CONSTANT = 'value in class'

        @classmethod
        def my_class_method(cls, parameter):
            # some implementation

Slide 45

Slide 45 text

Don't use class objects for enumerated polymorphism. Python has the abc module for polymorphism; use that.

    class MyFirstSubClass(MyClass):
        MY_CONSTANT = 'value in first subclass'

        @classmethod
        def my_class_method(cls, parameter):
            # some different implementation

    class MySecondSubClass(MyClass):
        MY_CONSTANT = 'value in second subclass'

        @classmethod
        def my_class_method(cls, parameter):
            # some still different implementation
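A hedged sketch of the abc-based alternative: a purely abstract class defines the type, and ordinary instantiated classes implement it. `Greeter`, `_FormalGreeter`, `_CasualGreeter`, and the two factory functions are hypothetical names, not taken from the talk.

```python
# Hypothetical names for illustration; not from the talk.
import abc


class Greeter(abc.ABC):
    """Purely abstract: defines a type and nothing else."""

    @abc.abstractmethod
    def greet(self, name):
        """Returns a greeting for the given name."""
        raise NotImplementedError()


class _FormalGreeter(Greeter):
    def greet(self, name):
        return "Good day, {}.".format(name)


class _CasualGreeter(Greeter):
    def greet(self, name):
        return "Hi, {}!".format(name)


def formal_greeter():
    """Returns a Greeter that greets formally."""
    return _FormalGreeter()


def casual_greeter():
    """Returns a Greeter that greets casually."""
    return _CasualGreeter()
```

Callers work with instances of the `Greeter` type rather than with class objects, and the set of implementations can grow without touching the abstract class.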

Slide 46

Slide 46 text

Picture the science-fiction movie in which an alien thing falls to earth and takes over the scientist who finds and studies it. The scientist becomes the villain because the thing couldn't be used without its user becoming it.

Slide 47

Slide 47 text

Intended to be helper code, but "you can't use this without becoming it" is a very antisocial kind of "help". If you just wanted help and your public API is enlarged by accepting it, that's not help! NEXT: Let's talk some about how to maximize the clarity of class implementations when they're in use.

Mixins (Eww, yuck!)
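To make the "public API is enlarged" complaint concrete, here is a small sketch that compares a mixin with a plain helper function. All names (`JsonMixin`, `MixedInPoint`, `_to_json`, `Point`) are hypothetical.

```python
# Hypothetical names for illustration; not from the talk.
import json


class JsonMixin:
    """Mixin style: to use the helper, a class must inherit from (become) it."""

    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)


class MixedInPoint(JsonMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y


def _to_json(fields):
    """Plain helper: usable without changing any class's public API."""
    return json.dumps(fields, sort_keys=True)


class Point:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    def description(self):
        return _to_json({"x": self._x, "y": self._y})
```

`MixedInPoint` now exposes `to_json` to every user whether or not that was intended, while `Point` exposes only what its author chose to make public.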

Slide 48

Slide 48 text

We've talked a fair amount about what classes are for. Now let's take a little more time to talk about classes, in terms of things to avoid structurally in their implementing code.

Design of Classes

Slide 49

Slide 49 text

This isn't always bad, but it's often a path to confusion. It's also a great way to make your class hard or impossible to subclass (see also Josh Bloch's _Effective Java_, item 17). We bring it up because it displays a layering violation: one element using another shows the API spread across at least two layers of abstraction.

Avoid self-use of public APIs
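The subclassing hazard can be sketched with a pair of hypothetical classes, `Basket` and `CountingBasket`, in the spirit of the well-known counting-collection example; they are not from the talk or from Mercurial.

```python
# Hypothetical names for illustration; not from the talk.

class Basket:
    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def add_all(self, items):
        for item in items:
            self.add(item)  # self-use of a public method


class CountingBasket(Basket):
    """Tries to count additions, unaware that add_all() calls add()."""

    def __init__(self):
        super().__init__()
        self.added = 0

    def add(self, item):
        self.added += 1
        super().add(item)

    def add_all(self, items):
        self.added += len(items)
        super().add_all(items)  # dispatches back to self.add(), counting again


basket = CountingBasket()
basket.add_all(["apple", "pear"])
# basket.added is now 4 even though only 2 items were added.
```

Had `Basket.add_all` been written against the private list (for example with `self._items.extend(items)`), the subclass author would not need to know how the base class is implemented.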

Slide 50

Slide 50 text

We've found it's worth going a step further and not passing self as a parameter to functions called from within a class definition. It's a good way to end up with infinite recursion or reference cycles.

Avoid self-escape in class implementation

Slide 51

Slide 51 text

Standard remedy: do you really want the subsystem to which you are passing "self" to be able later to call any arbitrary method of self? Mostly the answer is "no, just one part of self". So just pass that one part of "self".

Pass something less than self:
The value of an instance field.
A custom type composed of the values of several instance fields.
A bound method.
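A minimal sketch of the bound-method option, assuming a hypothetical callback-driven subsystem (`Registry`) and a hypothetical client class (`Counter`); neither comes from the talk.

```python
# Hypothetical names for illustration; not from the talk.

class Registry:
    """Stand-in for a subsystem that calls back later."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)

    def fire(self, event):
        for callback in self._callbacks:
            callback(event)


class Counter:
    def __init__(self, registry):
        self._count = 0
        # Pass a bound method rather than `self`: the registry is handed
        # exactly one capability instead of the whole object.
        registry.register(self._on_event)

    def _on_event(self, event):
        self._count += 1

    def count(self):
        return self._count


registry = Registry()
counter = Counter(registry)
registry.fire("first")
registry.fire("second")
```

The registry's code never mentions `Counter`, so a reader of either class can tell which single interaction exists between them.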

Slide 52

Slide 52 text

Never have "these fields are used during a call to [method]". (Nathaniel's leaked ten megabytes story.)

Minimize instance state.
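The "fields used only during a call" pattern, and the remedy of using locals, might look like this. `LeakyParser` and `Parser` are hypothetical names for illustration.

```python
# Hypothetical names for illustration; not from the talk.

class LeakyParser:
    def parse(self, text):
        # This field is only meaningful during parse(), yet it stays
        # referenced by the instance after the call returns.
        self._tokens = text.split()
        return len(self._tokens)


class Parser:
    def parse(self, text):
        tokens = text.split()  # local: unreachable once parse() returns
        return len(tokens)


leaky = LeakyParser()
leaky.parse("a b c")  # leaky._tokens still holds ['a', 'b', 'c'] afterwards

parser = Parser()
parser.parse("a b c")  # parser carries no leftover state
```

With a large input, the leftover field keeps that data alive for as long as the instance lives, and every later reader has to wonder who else depends on it.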

Slide 53

Slide 53 text

We've talked about them being the default place for code elements. Now we're going to talk about how to organize things in your modules for easier maintenance.

Anything to say about modules?

Slide 54

Slide 54 text

If you don't have a directed acyclic graph of modules, you want your code and continuous integration to scream it rather than have it slip by accidentally because you allowed imports to be sprinkled throughout your functions.

Always place imports at the top of your modules.

Slide 55

Slide 55 text

If a function in your module returns an instance of referenced_module.UsefulType, import referenced_module into your module. This helps maintain abstraction and avoid circularity and catches places where you're accidentally referring to non-public code elements.

Import modules referenced in specification.

    import referenced_module
    import used_module

    def useful_value(parameter):
        """Returns a referenced_module.UsefulType value."""
        return used_module.create_useful_value(parameter + 5)

Slide 56

Slide 56 text

The underscore only looks weird the first eight hundred times you type it.

Default to using private visibility for code elements and only "promote" them to public when you make them a deliberate and intentional part of a public API.

Slide 57

Slide 57 text

Advanced Techniques

Slide 58

Slide 58 text

Consider now restricting your judgement later.

Slide 59

Slide 59 text

You've spent the last week on a software problem. Understanding, designing, coding, testing, debugging. You were thinking about the problem not just while working but also at meals, while bathing, right before sleep, right after waking. Are you in the right position to judge what's obvious about the problem and what's obscure?

Like those others you are compromised. What's a limit for which you could have signed up back when your judgement was objective?

Slide 60

Slide 60 text

The longer you spent understanding the subject area and authoring code that solves a problem, the less qualified you are to decide anything about what's obvious and not obvious about the code.

Line limits and complexity limits on functions, classes, and modules.

Slide 61

Slide 61 text

Does anyone respect this claim? You don't want to be this programmer. How many of you want to investigate a bug in someone else's thousand-line class?

"Once you understand everything that's involved in horckleblaxing a fnastitude it's obvious why this function needs to be two hundred lines long!"

Slide 62

Slide 62 text

NEXT: More than any other reason, laziness.

What makes conforming to line count limits hard?

Slide 63

Slide 63 text

Since your function is defined in terms of a, b, c, x, y, and z, you probably implemented it in terms of a, b, c, x, y, and z, were happy to be done, and walked away.

    def long_function(a, b, c):
        """Returns triplet of x, y, and z."""
        #
        # Looooooooooong function body omitted
        #
        #
        #
        #
        #
        # (But trust us, it's really loooooooong!)
        # ↓ ↓ ↓ ↓ ↓ ↓

Slide 64

Slide 64 text

Go ahead and define the D through W.

Reduce the size of your too-large code elements.
Helper functions for parts of a process.
Data structures for partial results.

They are a kindness to your co-maintainers with no effect on your public API. Code too obvious? Be too obvious rather than not obvious enough. This is "code unto others", right?
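As a sketch of "helper functions for parts of a process" and "data structures for partial results", here is a small public function split into named private steps. The names (`_Statistics`, `_statistics`, `_mean`, `_spread`, `summarize`) and the computation itself are hypothetical stand-ins for a long function body.

```python
# Hypothetical names for illustration; not from the talk.
import collections

# Data structure for a partial result shared between steps.
_Statistics = collections.namedtuple("_Statistics", ("count", "total"))


def _statistics(values):
    return _Statistics(count=len(values), total=sum(values))


def _mean(statistics):
    return statistics.total / statistics.count


def _spread(values, mean):
    """Largest absolute distance of any value from the mean."""
    return max(abs(value - mean) for value in values)


def summarize(values):
    """Returns a (count, mean, spread) triplet for a non-empty list of numbers."""
    statistics = _statistics(values)
    mean = _mean(statistics)
    return statistics.count, mean, _spread(values, mean)
```

Each helper ends a scope, so a reader of `summarize` only has to hold three names in mind, and none of the private elements enlarges the public API.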

Slide 65

Slide 65 text

Preexisting abbreviations like "TCP" and "HTTP" are fine, but don't create new ones. NEXT: blue marble.

Don't abbreviate when naming.

Slide 66

Slide 66 text

Software engineering is a cosmopolitan endeavor. Spend a little horizontal space writing names that are more easily understood by readers who aren't as skilled with your language.

Slide 67

Slide 67 text

In storytelling, across media and in both fiction and nonfiction, characters are introduced in isolation before interacting with others. Except when they aren't; that's a narrative technique to surprise the reader. "1984" telescreen moment. Kool-Aid Man.

Surprises are the opposite of what we want when reading code.

Slide 68

Slide 68 text

Having read a lot of code, we've found that this is an encumbrance. The reader has to pause their comprehension and scan ahead for _shadowy_stranger_function further down in the file.

    def _first_familiar_function(parameter):
        # some implementation

    def _second_familiar_function(parameter):
        # some implementation

    def high_level_function(parameter):
        v1 = _first_familiar_function(parameter)
        v2 = _second_familiar_function(parameter)
        return _shadowy_stranger_function(v1, v2)

Slide 69

Slide 69 text

Only mutual recursion can get in the way of this. How many of you write mutually recursive functions every day?

Sort your code elements in definition-before-use order.

Slide 70

Slide 70 text

So because private code elements are used by public code elements, this means that private code elements appear earlier in files and public code elements appear later in files. NEXT: Pushback is about users having to scroll.

    def _first_familiar_function(parameter):
        # some implementation

    def _second_familiar_function(parameter):
        # some other implementation

    def _shadowy_stranger_function(first, second):
        # not so shadowy and strange anymore!

    def public_function(first_parameter, second_parameter):
        v1 = _first_familiar_function(first_parameter)
        v2 = _second_familiar_function(second_parameter)
        return _shadowy_stranger_function(v1, v2)

Slide 71

Slide 71 text

Your code has two audiences: maintainers and users.

"But I want the public parts of my code at the top so that users don't have to scroll!"

Slide 72

Slide 72 text

You can please all of the people some of the time; this is not one of those times.

Maintainers and users are two different audiences of readers.

Slide 73

Slide 73 text

Your maintainers have to read your file, so write your file to make your maintainers most happy. Send your users to the documentation generated from your code, since that's all they need to be happy and you have additional tools there to make them happy.

Slide 74

Slide 74 text

The more you know about your code, the harder you may find relating to new readers and users of it.

Code Unto Others
Readability is independent of correctness, efficiency, and problem domain.
Classes invite questions and complexity.
Consider setting your own judgement aside.
When forced to choose, favor writing for maintainers rather than users.

Slide 75

Slide 75 text

Thank you!