Out of the Tar Pit (Part 1)

AUTHORS Ben Moseley, Peter Marks
PUBLISHED 2006

The biggest problem in the development and maintenance of large-scale software systems is complexity – large systems are hard to understand.

“I conclude that there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies.” Tony Hoare, 1980 Turing Award Speech

SIMPLICITY IS HARD

METHODS OF UNDERSTANDING

TESTING Testing gains an understanding of the system from the outside – treating the system as a “black box”. We draw conclusions about a system based on how it behaves in response to specific inputs.

INFORMAL REASONING Informal reasoning gains an understanding of the system from the inside, by directly examining its code and data. Since more information is available, hopefully a more accurate understanding can be achieved.

“…testing is hopelessly inadequate… (it) can be used very effectively to show the presence of bugs but never to show their absence.” Edsger Dijkstra (1971)

CAUSES OF COMPLEXITY

STATE “computers… have very large numbers of states. This makes
conceiving, describing, and testing them hard. Software systems have orders-of-magnitude more states than computers do.”

TESTING The key problem is that a test (of any kind) on a system or component that is in one particular state tells you nothing at all about the behavior of that system or component when it happens to be in another state.

INFORMAL REASONING Large amounts of state make a system hard to reason about, because a developer must mentally model the behavior of the system in all possible states, as well as the code paths that can potentially lead to each state. Each additional bit of state doubles the total number of possible states.
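The doubling claim can be checked by brute force. The following is a minimal Python sketch, assuming each "bit of state" is an independent boolean flag; `count_states` is a hypothetical helper written for this illustration.

```python
from itertools import product


def count_states(num_flags: int) -> int:
    """Enumerate every combination of `num_flags` boolean flags and count them."""
    return sum(1 for _ in product([False, True], repeat=num_flags))


# Each extra flag doubles the number of states a test suite would have to cover.
for n in range(1, 6):
    print(n, count_states(n))
```

With only five flags there are already 32 distinct states, and a test that passes in one of them says nothing about the other 31.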

CONTROL Most languages have an implied or explicit order of
execution. In practice, this means each line in the program must be understood in context with the surrounding system.

CONTROL Do any of these lines depend on one another?

    a := b + 3
    c := d + 2
    e := f * 4
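One way to answer the question is to run the three assignments in every possible order and compare the results. This is a rough Python sketch, assuming `a` through `f` are six distinct variables with no aliasing; the `STATEMENTS` table and `run` helper are made up for the illustration.

```python
from itertools import permutations

# Each statement reads one variable and writes a different one.
STATEMENTS = {
    "a": lambda env: env["b"] + 3,
    "c": lambda env: env["d"] + 2,
    "e": lambda env: env["f"] * 4,
}


def run(order, inputs):
    """Execute the statements in the given order on a copy of the inputs."""
    env = dict(inputs)
    for target in order:
        env[target] = STATEMENTS[target](env)
    return env


inputs = {"b": 1, "d": 2, "f": 3}
results = [run(order, inputs) for order in permutations(STATEMENTS)]

# True when every ordering produced the same final environment.
all_equal = all(result == results[0] for result in results)
print(all_equal)
```

Under these assumptions the order is irrelevant, yet the language still forces the author to pick one and the reader to confirm it does not matter.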

CODE VOLUME Code volume is often a secondary effect of
external complexity – often large amounts of code are spent managing state or specifying control.

CODE VOLUME Code volume is easy to measure, and it
interacts badly with other forms of complexity – systems increase in complexity non-linearly as size increases.

COMPLEXITY Complexity breeds complexity. This is a specific, unfortunate variation of the broken windows theory.

MANAGING COMPLEXITY

OBJECT ORIENTATION OO languages combine state with logic to access
and manipulate that state. In this way, state can be encapsulated and its integrity can be enforced by the object’s methods.

OBJECT ORIENTATION If state can be accessed in multiple places,
the code that enforces integrity may be repeated – and not necessarily in the same file.

OBJECT ORIENTATION It’s relatively easy to enforce single-object integrity, but
enforcing integrity between objects is much more difficult.
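The difference between single-object and cross-object integrity can be sketched in Python. The `Account` class and `transfer` function below are hypothetical examples invented for this illustration, not something taken from the paper.

```python
class Account:
    """Holds a balance and guards its own invariant: it never goes negative."""

    def __init__(self, balance: int = 0) -> None:
        if balance < 0:
            raise ValueError("balance must be non-negative")
        self._balance = balance

    @property
    def balance(self) -> int:
        return self._balance

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def withdraw(self, amount: int) -> None:
        if amount <= 0 or amount > self._balance:
            raise ValueError("invalid withdrawal")
        self._balance -= amount


def transfer(source: Account, target: Account, amount: int) -> None:
    """Cross-object rule: the combined balance must not change.

    Neither Account can enforce this alone, so the rule lives outside both.
    """
    source.withdraw(amount)  # raises before any change if funds are short
    target.deposit(amount)


a, b = Account(100), Account(20)
transfer(a, b, 30)
print(a.balance, b.balance)
```

The non-negative rule sits in one place, inside `Account`. The "total is preserved" rule has no natural home, and every piece of code that moves money must remember to go through `transfer`.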

OBJECT ORIENTATION “The bottom line is that all forms of OOP rely on state and in general all behavior is affected by this state. As a result of this, OOP suffers directly from the problems associated with state.”

FUNCTIONAL Functional programming has its roots in the completely stateless lambda calculus.

STATE Modern FP languages are classified as pure – no state and no side effects – or impure, which, while recommending that developers avoid state, still allow its use.

REFERENTIAL TRANSPARENCY A property of systems that implies that when supplied with the same inputs, a given function will always return exactly the same result. Guaranteed referential transparency removes a weakness of testing, since external state can no longer affect the validity of the test.

CONTROL Most FP languages have an implied left-to-right control flow, and so suffer from the same control flow complexity as other languages. Some languages gain a small advantage because they encourage a function-based approach to control flow (e.g., map and reduce instead of normal flow statements like for and while).
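To make the contrast concrete, here is a small Python sketch that computes the same value twice, once with an explicit loop and once with `map` and `reduce`. The `prices` list and the doubling step are arbitrary choices for the illustration.

```python
from functools import reduce

prices = [3, 5, 8]

# Explicit control flow: the loop spells out the order of evaluation
# and mutates an accumulator on every step.
total_loop = 0
for price in prices:
    total_loop += price * 2

# Function-based control flow: map and reduce say what to compute,
# and the iteration itself is left to the library.
doubled = map(lambda price: price * 2, prices)
total_fn = reduce(lambda acc, value: acc + value, doubled, 0)

print(total_loop, total_fn)
```

Both versions yield the same total. The second simply has no loop variable or mutable accumulator for the reader to track.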

KINDS OF STATE Immutable state does not generally increase the complexity of a system – only mutable state does.

    procedure int getNextCounter() {
        // counter is declared elsewhere
        counter := counter + 1
        return counter
    }

This procedure relies on external state, and so its behavior cannot reliably be reasoned about or tested with automated tests.

KINDS OF STATE Mutable state can be simulated by passing in external dependencies and returning a new, modified result.

    procedure int getNextCounter(int oldCounter) {
        newCounter := oldCounter + 1
        return newCounter
    }

This modified procedure maintains referential transparency and can be reasoned about and tested.
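The two counter procedures translate directly into runnable Python. This is a sketch only; the names `get_next_counter_stateful` and `get_next_counter` are adapted from the pseudocode for this illustration.

```python
counter = 0  # module-level mutable state, "declared elsewhere"


def get_next_counter_stateful() -> int:
    """Result depends on hidden state, so identical calls can differ."""
    global counter
    counter += 1
    return counter


def get_next_counter(old_counter: int) -> int:
    """Result depends only on the argument, so identical calls always agree."""
    return old_counter + 1


first = get_next_counter_stateful()
second = get_next_counter_stateful()
print(first, second)  # the same call, two different answers

print(get_next_counter(5), get_next_counter(5))  # the same call, the same answer
```

A test of the stateful version only holds for the value `counter` happened to have when the test ran. A test of the second version holds everywhere.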

FUNCTIONAL There’s nothing technically preventing a functional system from maintaining
state in a mutable collection and passing it from function to function.

FUNCTIONAL Doing so maintains the property of referential transparency but does not reduce the complexity of the system. Notably, this is the approach taken by Redux – all system state is passed from transform to transform. Referential transparency is maintained because no transform mutates the global state – each returns a modified copy.
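The pattern can be sketched in a few lines of Python. This is not the Redux API, only an illustration of the idea; the `reducer` function, the action shape, and the state keys are all assumptions made up for the example.

```python
def reducer(state: dict, action: dict) -> dict:
    """Return the next state as a new dict; the input dict is never mutated."""
    if action["type"] == "increment":
        # Copy the old state and override only the field that changes.
        return {**state, "counter": state["counter"] + 1}
    # Unknown actions leave the state unchanged.
    return state


initial = {"counter": 0, "user": "anon"}
after = reducer(initial, {"type": "increment"})

print(initial["counter"], after["counter"])
```

The transform is referentially transparent, but the whole state dictionary still flows through every call, so the number of states a developer has to keep in mind is the same as before.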

LOGIC PROGRAMMING TL;DR: Logic programming is super weird.

TYPES OF COMPLEXITY

ESSENTIAL Essential complexity is inherent to the problem being solved – and so is visible to the user.

ACCIDENTAL Accidental complexity is everything else – complexity that the dev team could avoid in an ideal world (e.g., complexity due to performance issues, bad design decisions, or a suboptimal language choice).

“One implication of this definition is that if the user doesn’t even know what something is, then it cannot possibly be essential.”

Optimistically assuming that the users actually understand the problem they want solved, of course.

TO BE CONTINUED

Joshua Tompkins
