Papers We Love, Too Mini: Out of the Tar Pit

Out of the Tar Pit Kyle Isom

Out of the Tar Pit • Ben Moseley and Peter Marks
• Published in 2006 • 63 pages and 2.5 pages of references • Only covering the first part

Summary • “Complexity causes more problems in large software than
anything else” • “What is the way out of the tar pit? … we believe there can be no doubt it is simplicity.”

What makes building software hard? • Brooks identified four elements:
• complexity • conformity • changeability • invisibility

What makes building software hard? • The last three are either kinds of
complexity or problems only because of complexity • Complexity is the root cause of most problems

Why is complexity bad? • It makes reasoning about a
system difficult • Hoare: “the price of reliability is the pursuit of utmost simplicity” • Focus is on complexity that makes large systems difficult to understand

Where does complexity come from? Simplicity is hard

How do we understand a system? • Testing • Informal
reasoning

Testing is limited • Improves error detection • We can
only test what we can think to test

Informal reasoning is powerful • Fewer errors are created •
Avoids bugs in the first place

Where does complexity come from? • State • Control •
Code volume • Complexity • Power

Complexity from State • Increases the surface area that must
be reasoned about • Can only test system in a specific state • A single bit of state doubles the possible number of states • State contaminates
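The doubling claim can be checked by enumeration. A minimal sketch in Python, assuming the state consists only of independent boolean flags; the helper `all_states` and the flag names are made up for illustration:

```python
from itertools import product

def all_states(flag_names):
    """Enumerate every combination of values for independent boolean flags."""
    return [dict(zip(flag_names, values))
            for values in product([False, True], repeat=len(flag_names))]

# Hypothetical flags; each extra flag doubles the number of states to consider.
two = all_states(["logged_in", "dirty"])
three = all_states(["logged_in", "dirty", "locked"])
print(len(two), len(three))
```

Each of those states is a separate situation a test or a reader may have to account for.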

Complexity from Control • Control: ordering of things • We
don’t really want to have to think about this • Control is almost always implicit in a language • Requires specifying how system works, not what it should do

Complexity from Control • Artificial ordering is imposed where it
usually isn’t needed • Reasoner has to expend effort (possibly incorrectly) removing ordering
    a := b + 3
    c := d + 2
    e := f * 4
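The three assignments on this slide do not depend on one another, so the textual order carries no meaning. A small Python sketch, with made-up input values, that runs them in every possible order and checks that the result is the same:

```python
from itertools import permutations

# The three assignments from the slide, keyed by the variable they define.
STATEMENTS = {
    "a": lambda v: v["b"] + 3,
    "c": lambda v: v["d"] + 2,
    "e": lambda v: v["f"] * 4,
}

def run(order, inputs):
    """Apply the assignments in the given order to a copy of the inputs."""
    values = dict(inputs)
    for target in order:
        values[target] = STATEMENTS[target](values)
    return values

inputs = {"b": 1, "d": 10, "f": 5}  # illustrative values, not from the talk
outcomes = [run(order, inputs) for order in permutations(["a", "c", "e"])]
assert all(outcome == outcomes[0] for outcome in outcomes)
```

A reader of the imperative version has to establish this independence by hand before being allowed to ignore the ordering.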

Complexity from Control • Concurrency is specified explicitly in most
languages • This leads to further difficulties in reasoning

Complexity from Volume • Complexity often increases nonlinearly with code
size • The best code, then, is code that isn’t written • The less there is to reason about, the easier the whole is to reason about

Other causes of complexity • Complexity breeds complexity • The
difficulty of building simple systems • Language power tends to encourage cleverness • Generally, complexity = {state, control}

Classical Complexity Management • Object-oriented programming • Functional programming •
Logic programming

The OO approach to complexity management • Essentially imperative approach
• Object: state + procedures for state access and manipulation • Enforcing constraints is awkward

The OO approach to complexity management • Intensional identity: objects
are uniquely identified separately from attributes • Extensional identity: identity via attribute comparison

The OO approach to complexity management • Object-oriented (and conventional
imperative) programs suffer from both state-derived and control-derived complexity

FP approach to complexity management • Explicit attempt to avoid
state • System gains referential transparency • This has benefits for testing and concurrency as well

FP approach to complexity management • Functional programs still implicitly
specify control
(defun info (uuid)
  "Retrieve metadata about an entry."
  (let ((entry (lookup-entry uuid)))
    (pairlis '(:id :created :size :parent)
             (list (entry-uuid entry)
                   (entry-created entry)
                   (entry-size entry)
                   (entry-parent entry)))))

Interlude: Kinds of state • Authors usually mean mutable state
when they say state

FP approach to complexity management • FP simulates state with
functional values • Often have a pool of global values • This can lead to hidden, implicit, mutable state • Avoids many state-derived complexity issues

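Modularity • FP: the outcome of a function can be
determined by examining the arguments • Stateful programming: the arguments alone tell you nothing, because the outcome may depend on hidden state • Tradeoff between complexity (with shortcuts for some kinds of change) and simplicity (with easier testing and reasoning)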
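The contrast on this slide can be sketched in a few lines of Python. The names `total_pure` and `Cart` and the prices are hypothetical and only serve the illustration:

```python
def total_pure(prices, tax_rate):
    """The result depends only on the arguments."""
    return sum(prices) * (1 + tax_rate)

class Cart:
    """The result also depends on mutable state that is set elsewhere."""

    def __init__(self):
        self.tax_rate = 0.0

    def total(self, prices):
        return sum(prices) * (1 + self.tax_rate)

cart = Cart()
before = cart.total([10, 20])
cart.tax_rate = 0.5           # some other part of the program changes the state
after = cart.total([10, 20])  # same arguments, different answer
```

The two calls to `cart.total` look identical at the call site, which is why examining the arguments is not enough in the stateful version.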

Logic Programming • Not derived from von Neumann architecture • Specify
what needs to be done, not how • Running the system amounts to attempting a formal proof of statements about the problem

Logic Programming • No mutable state • Purity is the
only guarantee that state-related problems will not occur • Prolog has an operational commitment to process the program in its textual order

Classifying Complexity • Essential: the essence of the problem as
seen by the users • Accidental: everything else • We’d ideally use tooling that lets us program using only the language of the users’ problem

Recommended General Approach • Start with informal requirements from users
• Ultimately, something needs to happen • Building formal requirements must be done without introducing accidental complexity

State in the Ideal World • No state • Informal
requirements specify data: input and derived • All data mentioned by users is essential data • Essential data does not necessarily imply essential state

Input Data • System might need this data in the
future • System does not need this data in the future

Essential derived data • Immutable • Can be re-derived from
the input data as needed • Doesn’t need to be stored • Mutable • Used where data cannot be easily re-derived from input data • Both are cases of accidental state

Accidental derived data • Not in users’ requirements • Accidental
state

State in the ideal world • Some essential state is
unavoidable • Pure functional programs can simulate accidental and essential state • Ideal world removes all non-essential state

Limitations • Formal specifications are essentially the same as formal
requirements • Formal specifications should ideally derive entirely from users’ informal requirements

Formal Specifications • Property-based: what is required; includes algebraic (equational
axiomatic semantics) • Model-based: potential model (often stateful) with behavioural description

Required Accidental Complexity • Sometimes it’s more natural to model
the problem in a non-ideal way • e.g. derived data dependent on a series of user inputs over time • Accidental state may be required for performance or ease of expression

Required Accidental Complexity • This requires awareness that accidental state
has been introduced • Increases risk of system entering an inconsistent state
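One way to picture the inconsistency risk is a stored total that stops matching the data it was derived from. A minimal Python sketch; the `Order` class and its deliberately buggy method are hypothetical:

```python
class Order:
    def __init__(self, items):
        self.items = list(items)         # essential state: input from the user
        self.cached_total = sum(items)   # accidental state: derived, kept for speed

    def add_item_buggy(self, price):
        """Updates the essential state but forgets the cached copy."""
        self.items.append(price)

    def total_derived(self):
        """Re-derives the total from the input data on every call."""
        return sum(self.items)

order = Order([5, 7])
order.add_item_buggy(3)
print(order.cached_total, order.total_derived())
```

The derived-on-demand version cannot disagree with the input data; the cached version can, and nothing in the code flags it.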

Dealing with Complexity

Dealing with Complexity • There will always be complexity that
is either required or practically useful in some way • Avoid complexity where possible • Separate accidental complexity from essential complexity

Dealing with Complexity • Avoid having to explicitly manage
state • Declare the accidental state, and leave its management to a separate infrastructure

Dealing with Complexity • Where ease-of-expression is concerned, treat accidental
state as essential state for separation • All state should be separated from logic • This gets further partitioned into accidental and essential

Dealing with Complexity • Structure: essential complexity + accidental but
useful complexity • Structuring the system like this → system functions correctly if “accidental but useful” complexity is removed • It could be unacceptably inefficient
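Memoization is a convenient stand-in for "accidental but useful" complexity: removing it changes performance, not results. A Python sketch using the standard `functools.lru_cache`; the pricing rule in `shipping_cost` is invented for the example:

```python
from functools import lru_cache

def shipping_cost(weight_kg):
    """Essential logic: a hypothetical pricing rule."""
    return 5 + 2 * weight_kg

# Accidental but useful: a cache layered on top, without touching the logic.
cached_shipping_cost = lru_cache(maxsize=None)(shipping_cost)

weights = [1, 2, 2, 3, 1]
with_cache = [cached_shipping_cost(w) for w in weights]
without_cache = [shipping_cost(w) for w in weights]
assert with_cache == without_cache
```

Because the cache is declared separately from the function it wraps, dropping it is a one-line change and the essential logic is untouched.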

Dealing with Complexity • Separated components may be very different
• “May be ideal to use different languages for each” • Emphasis on restricting power of languages • “The weaker the language, the more simple it is to reason about”

Dealing with Complexity

Essential State • Foundation of system • Specification is completely
self-contained

Essential logic • Business logic • Expresses what must be
true in terms of the state • Says nothing about how, when, or why state changes • Only references essential state
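As a rough illustration of logic that only states what must be true, the Python sketch below expresses a constraint and a derived relation as pure functions over the essential state. The `Account` type, the rules, and the sample data are all hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Account:
    owner: str
    balance: int

def valid(accounts):
    """What must hold: no negative balances and no duplicate owners."""
    owners = [a.owner for a in accounts]
    return all(a.balance >= 0 for a in accounts) and len(owners) == len(set(owners))

def low_balance_owners(accounts, threshold):
    """Derived relation: owners whose balance is below the threshold."""
    return sorted(a.owner for a in accounts if a.balance < threshold)

accounts = [Account("ann", 10), Account("bob", 0)]
print(valid(accounts), low_balance_owners(accounts, 5))
```

Neither function says how or when the accounts change; they only describe relationships over whatever the essential state currently is.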

Accidental State and Control • Least important part of system
• Changes to this part never affect the other parts • Changes to either of the other parts can affect this part

Dealing with Complexity • Simplicity is hard • Paying the
up-front cost of simplicity will cause fewer difficulties than dealing with the resultant complexity • Complexity spreads
