constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”
outside (black box) • Draw conclusions based on observation • Testing with one set of inputs says nothing at all about what to expect with different inputs • Informal Reasoning • Attempt to understand from the inside • Draw conclusions based on reading the code
software system and been told to “try it again”, or “reload the document”, or “restart the program”, or “reboot your computer” or “re-install the program” or even “re-install the operating system and then the program” has direct experience of the problems that state causes for writing reliable, understandable software –Out of the Tarpit
program is ...difficult. • Testing a system/component in one particular state tells you nothing about it in a different state • Hidden state, inter-component interactions, system state after a sequence of inputs, etc. • Informal reasoning often revolves around a case-by-case mental simulation of behavior • Contamination: Even indirect use of stateful procedures by stateless ones can only be understood in the context of state
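The testing point above can be illustrated with a minimal TypeScript sketch. The `Counter` class and the `add` function are hypothetical names invented for this illustration. The stateful method gives different answers for the same argument depending on call history, while the stateless function does not.

```typescript
// Hypothetical example: hidden state changes the result of identical calls.
class Counter {
    private total = 0; // hidden state, invisible to the caller

    // Same argument, different result depending on every earlier call.
    add(n: number): number {
        this.total += n;
        return this.total;
    }
}

// Stateless counterpart: the result depends only on the arguments.
function add(total: number, n: number): number {
    return total + n;
}

const counter = new Counter();
const first = counter.add(5);  // 5
const second = counter.add(5); // 10: same input, different output

const pureFirst = add(0, 5);   // 5
const pureSecond = add(0, 5);  // 5: same input, same output
```

A test that observes `first` says nothing about `second`; a test of `add(0, 5)` holds for every later call with those arguments.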
• If always enforced implicitly (e.g., imperative programming), program must be understood in that context • Often over-specifies the system • Developer must specify ordering/flow (how) instead of declaring constraints (what) • Forces the developer (and compiler) to work out whether that ordering affects the computation in order to understand the program • Can significantly complicate informal reasoning about a program • Misinterpretation of ordering's significance can result in subtle bugs
initial state and inputs) in the presence of concurrency tell you nothing about the next time they are run with the same initial state and inputs • Informal reasoning is made exponentially more complex with each additional piece of state
sd2 = 0.0;
for (var u: number = 0; u < n; ++u) {
    for (i = 0; i < this.k; ++i) Huu[i] = this.g[i][u] = 0.0;
    for (var v = 0; v < n; ++v) {
        if (!(u === v)) {
            // The following loop randomly displaces nodes that are at identical po
            var maxDisplaces = n; // avoid infinite loop in the case of numerical i
            while (maxDisplaces--) {
                sd2 = 0.0;
                for (i = 0; i < this.k; ++i) {
                    var dx = d[i] = x[i][u] - x[i][v] + 0.0;
                    sd2 += d2[i] = dx * dx;
                }
                if (sd2 > 1e-9) break;
                var rd = this.offsetDir();
                for (i = 0; i < this.k; ++i) x[i][v] += rd[i];
            }
            var l: number = Math.sqrt(sd2);
            var D: number = this.D[u][v];
            var weight = this.G != null ? this.G[u][v] : 1.0;
            if (weight > 1 && l > D || !isFinite(D)) {
                for (i = 0; i < this.k; ++i) this.H[i][u][v] = 0.0;
            } else {
                if (weight > 1.0) {
                    weight = 1.0;
                }
                var D2: number = D * D;
                var gs: number = 2.0 * weight * (l - D) / (D2 * l);
                var l3 = l * l * l;
                var hs: number = 2.0 * -weight / (D2 * l3);
                //if (!isFinite(gs))
                //    console.log(gs);
                for (i = 0; i < this.k; ++i) {
                    this.g[i][u] += d[i] * gs;
                    Huu[i] -= this.H[i][u][v] = hs * (l3 + D * (d2[i] - sd2) + l *
                }
with state and control, complexity tends to increase non-linearly with code size • Compounds the problems caused by state and control, non-linearly • Managing code volume indirectly manages complexity
because state/control/LOC complexity makes comprehension difficult • Simplicity is hard • Requires significant effort to arrive at the simplest solution to a problem (time pressures, existing complexity, etc.) • Power Corrupts • In the absence of language-enforced guarantees, mistakes will happen • Restriction of power • e.g., garbage collection, immutability • The more that is possible in a language, the harder it is to understand systems constructed in it
intensional identities • objects with identical attributes/values are not equivalent • domain-specific equality must be explicitly defined • No guarantee that they adhere to any standard equivalence relation
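The identity points above can be seen directly in TypeScript. This is a small sketch; the `Point` type and `pointEquals` function are hypothetical and exist only for this illustration.

```typescript
// Hypothetical value type used only for this illustration.
interface Point {
    readonly x: number;
    readonly y: number;
}

const a: Point = { x: 1, y: 2 };
const b: Point = { x: 1, y: 2 };
const alias = a;

// Built-in equality compares object identity, not attribute values.
const sameIdentity = a === b;    // false: two distinct objects
const sameObject = alias === a;  // true: two references to one object

// Domain-specific equality has to be written by hand, and nothing in the
// language checks that it is reflexive, symmetric, and transitive.
function pointEquals(p: Point, q: Point): boolean {
    return p.x === q.x && p.y === q.y;
}

const sameValue = pointEquals(a, b); // true
```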
Explicit "shared state" concurrency • Sequential execution unless explicitly made concurrent • Shared state is usually mutable to concurrently executing code
Roots in stateless Lambda Calculus • Emphasis on immutability • Function composition over sequential statements • Equivalent in power to a Turing machine (Von Neumann architecture)
Transparency (given the same set of arguments, a function will always return the same result) • Simplifies testing, in contrast to stateful models • Informal reasoning also simplified
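As a rough sketch of why this simplifies testing, consider a pure function (the `discount` function and its test table are hypothetical). Because the result depends only on the arguments, a table of input/output cases is a complete description of each test, and any call can be replaced by its value.

```typescript
// Pure function: the result depends only on the arguments.
function discount(price: number, percent: number): number {
    return price - (price * percent) / 100;
}

// Each case is independent of the others; order and repetition do not matter.
const cases: Array<[number, number, number]> = [
    [100, 10, 90],
    [200, 25, 150],
    [80, 0, 80],
];
const allPass = cases.every(
    ([price, percent, expected]) => discount(price, percent) === expected
);

// Substitution: a call may be replaced by its value without changing meaning.
const computedTwice = discount(100, 10) + discount(100, 10);
const substituted = 2 * 90;
```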
Neumann architecture • Pure logic programming simply makes statements about a problem and the desired solution • Describe using axioms • Constrain with required attributes • Solutions are formal logical consequences of axioms and constraints • "Running" the system is equivalent to constructing a formal proof
essence of, the problem (as seen by the users) • Complexity that must be dealt with, even in an ideal world • i.e., with a language and infrastructure to directly express the users' problem • Accidental • Everything else • Performance issues, language expressiveness, infrastructure, hardware limitations (e.g., computational complexity), etc.
no relevant ambiguity/omissions • Simply execute formal requirements • No control, just declaration of facts and constraints • Absolute simplicity • The essence of declarative programming
by the user is the only real essential state • Essential derived immutable data: Can always be derived from the user's input data and can be ignored • Essential derived mutable data: Can be derived from input data, but can also be changed by the user (should be treated as input data; more later) • Accidental derived data: Since it is accidental, the user doesn't need it and it is not in the user's informal requirements, so it can be ignored • e.g., data derived for performance
as it is entirely accidental • Not usually present in informal requirements • Control flow is about how to execute • Results should be independent of control • Concurrency • Concurrent and sequential execution are identical if you assume execution takes zero time (synchrony hypothesis)
e.g., caching of derived data • Accidental state is sometimes important for ease of expression • Logic is sometimes easier to express in terms of accumulated values • e.g., current position in a simulation can be computed from all previous inputs over time, but is more easily expressed as accidental state • Note: time is considered an input alongside other inputs
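The simulation example above can be sketched in TypeScript in both forms. The `Sample` type, `positionFrom`, and `Simulation` are hypothetical names, and the motion model (velocity multiplied by a time step) is an assumption chosen only to keep the arithmetic simple.

```typescript
// Hypothetical input: a velocity sample and the time step it applies to.
interface Sample {
    readonly velocity: number;
    readonly dt: number;
}

// Derived form: position is a pure function of the whole input history.
function positionFrom(samples: readonly Sample[]): number {
    return samples.reduce((pos, s) => pos + s.velocity * s.dt, 0);
}

// Accumulated form: the same value kept as (accidental) state, which is
// often the easier way to express step-by-step logic.
class Simulation {
    private position = 0;

    step(s: Sample): void {
        this.position += s.velocity * s.dt;
    }

    current(): number {
        return this.position;
    }
}

const inputs: Sample[] = [
    { velocity: 2, dt: 1 },
    { velocity: -1, dt: 2 },
    { velocity: 4, dt: 0.5 },
];

const derived = positionFrom(inputs); // 2 - 2 + 2 = 2

const sim = new Simulation();
inputs.forEach(s => sim.step(s));
const accumulated = sim.current();    // same value, reached via mutation
```

Both forms agree as long as every input reaches `step` exactly once and in order; the derived form has no such condition to get wrong.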
or desired • To maintain the simplest, practical model we should • Avoid state and control where not absolutely and truly essential (within reasonable confines of ease of expression) • Separate accidental state and control from the essential data and logic
accidental state for performance • Instead, declare what accidental state should be used • Leave implementation to separate infrastructure • e.g., caching infrastructure • Removes possibility of state inconsistency within the system (correctness of external infrastructure notwithstanding) • Complexity is the enemy of performance and optimization • It is far simpler to improve the performance of a slow system designed for simplicity than to remove complexity from a complex system designed to be fast (which may not even be fast, because its complexity hides optimization opportunities)
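One way to picture "declare, don't implement" is a generic memoizer that lives outside the logic. This is only a sketch under stated assumptions: `shippingCost` and `memoize` are hypothetical, and a real caching infrastructure would also handle eviction and invalidation.

```typescript
// Essential logic: a pure function with no knowledge of caching.
function shippingCost(weightKg: number): number {
    return 5 + 1.5 * weightKg;
}

// Hypothetical infrastructure: a generic memoizer kept apart from the logic.
function memoize<R>(fn: (arg: number) => R): (arg: number) => R {
    const cache: { [key: string]: R } = {};
    return (arg: number): R => {
        const key = String(arg);
        if (Object.prototype.hasOwnProperty.call(cache, key)) {
            return cache[key] as R; // cache hit: fn is not called again
        }
        const result = fn(arg);
        cache[key] = result;
        return result;
    };
}

// The "declaration": this derived value should be cached.
const cachedShippingCost = memoize(shippingCost);

const direct = shippingCost(2);          // 8
const viaCache = cachedShippingCost(2);  // 8, computed once and then reused
```

Because `shippingCost` is pure, the cached and uncached versions cannot disagree, so the cache can be added or removed without touching the logic.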
state is the best way to express parts of the logic, externalize its derivation from the system logic and treat it as input • System logic should be free of complexity (state and control) • Other accidental complexity that is not useful (e.g., not needed for ease of expression) • Avoid it; this may require discipline and additional effort to arrive at a simpler model
Use the least powerful language necessary for each separately specified component • e.g. use a language without control primitives when only specifying state • The weaker the language, the simpler it is to reason about
on any other components • Essential Logic • The heart of the system • Isolated from all accidental complexity • May require changes if essential state component changes • Accidental State and Control • May require changes if essential components change • Nothing essential depends on this *Arrows show reference
structuring data • a means to manipulate structured data • a mechanism for maintaining integrity and consistency of state • A clear separation of logical and physical layers of the system • Logical model to minimize complexity addressed separately from designing an efficient physical storage model
representing all data • Manipulation: means to specify derived data • Integrity: means to specify constraints on the data • Data Independence: Clear separation between the logical data and its physical representation
records each consisting of a heterogeneous set of uniquely named attributes • Contains no duplicates and has no ordering • Best thought of as a value • Base relation: stored directly • Derived Relation: A "view" defined in terms of other relations • Relation variables, "relvars" reference relation values • Path independent, i.e. no explicit connections need to be made between relation types • Contrast with network and hierarchical models (and OOP approaches)
Selects a subset of records according to some criteria • Project: Selects a subset of attributes • Product: Cartesian product of args • Union: All records in both args
Difference: All records in the first arg but not in the second • Join: Constructs all records that result from matching identical attributes of the argument relations' records • Divide: Returns all records of the first arg which occur in the second arg associated with each record of the third arg
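A few of the operators above can be sketched over plain arrays of flat records. This is an illustrative approximation, not a full relational engine: `Row`, `Relation`, and all function names are hypothetical, duplicates are removed by comparing a canonical JSON key, and product and divide are left out.

```typescript
// A relation value sketched as an array of flat records.
type Row = Record<string, string | number>;
type Relation = readonly Row[];

// Canonical key for a record, independent of attribute order.
function rowKey(row: Row): string {
    return JSON.stringify(Object.keys(row).sort().map(k => [k, row[k]]));
}

// Remove duplicate records so the result behaves like a set.
function distinct(rows: readonly Row[]): Relation {
    const seen: Record<string, boolean> = {};
    return rows.filter(r => {
        const key = rowKey(r);
        if (seen[key]) return false;
        seen[key] = true;
        return true;
    });
}

// Restrict: keep the records that satisfy a predicate.
function restrict(r: Relation, pred: (row: Row) => boolean): Relation {
    return r.filter(pred);
}

// Project: keep only the named attributes, then drop duplicates.
function project(r: Relation, attrs: readonly string[]): Relation {
    return distinct(r.map(row => {
        const out: Row = {};
        attrs.forEach(attr => {
            const value = row[attr];
            if (value !== undefined) out[attr] = value;
        });
        return out;
    }));
}

// Union: all records in either argument.
function union(r: Relation, s: Relation): Relation {
    return distinct(r.concat(s));
}

// Difference: records of the first argument that are not in the second.
function difference(r: Relation, s: Relation): Relation {
    const inS: Record<string, boolean> = {};
    s.forEach(row => { inS[rowKey(row)] = true; });
    return r.filter(row => !inS[rowKey(row)]);
}

// Natural join: combine records that agree on every shared attribute.
function join(r: Relation, s: Relation): Relation {
    const out: Row[] = [];
    r.forEach(a => s.forEach(b => {
        const shared = Object.keys(a).filter(
            k => Object.prototype.hasOwnProperty.call(b, k)
        );
        if (shared.every(k => a[k] === b[k])) out.push({ ...a, ...b });
    }));
    return distinct(out);
}

const employees: Relation = [
    { emp: "Ann", dept: "R&D" },
    { emp: "Bob", dept: "Ops" },
    { emp: "Cid", dept: "R&D" },
];
const depts: Relation = [
    { dept: "R&D", floor: 3 },
    { dept: "Ops", floor: 1 },
];

const rnd = restrict(employees, row => row["dept"] === "R&D"); // Ann, Cid
const deptNames = project(employees, ["dept"]);                // two records
const located = join(employees, depts);                        // three records
```

Note that `join` needs no explicit link between the two relations; the shared `dept` attribute is enough, which is the path independence the relational model promises.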
storage representation • Analogous to the essential / accidental split • Hints may be declaratively provided to the storage system to optimize the physical representation
are based upon functional programming and the relational model • All essential state takes the form of relations • All essential logic is expressed using relational algebra, extended with pure user-defined functions • Recommendations of the essential / accidental split previously discussed are put into practice
relvars (the names/types of base relvars) • Essential Logic: Both functional and relational parts. Main part consists of derived relations and integrity constraints; can make use of an arbitrary set of pure user-defined functions • Accidental State and Control: Isolated (from each other) performance hints, e.g. whether to store derived relvar values, concurrency hints, etc. • Other: Interfacing (Feeders and Observers) *Arrows show data-flow
assignments • Causes changes to the essential state • Specified in some state manipulation language by the infrastructure • Integrity constraints asserted • Observers • Generate output in response to changes of the values of derived relvars • Invoked by infrastructure • Both feeders and observers are tasked with converting between relational (flat) and input/output structured data
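The feeder/observer roles above might look roughly like the following in TypeScript. Everything here is a hypothetical sketch: the `Reading` type, the comma-separated input format, the single temperature constraint, and the change detection via JSON comparison are assumptions made for illustration, not part of any specified infrastructure.

```typescript
type Reading = { readonly sensor: string; readonly celsius: number };

// Essential state: one base relvar holding a relation value.
let readings: readonly Reading[] = [];

// Derived relvar: defined purely in terms of the base relvar.
function hot(rs: readonly Reading[]): readonly Reading[] {
    return rs.filter(r => r.celsius > 30);
}

// Observers are registered with, and invoked by, the infrastructure.
type Observer = (hotNow: readonly Reading[]) => void;
const observers: Observer[] = [];

// Hypothetical infrastructure step: check the integrity constraint, apply the
// assignment, and notify observers only if the derived relvar changed.
function assignReadings(next: readonly Reading[]): boolean {
    const valid = next.every(r => r.celsius > -273.15);
    if (!valid) return false; // assignment rejected, state unchanged
    const before = JSON.stringify(hot(readings));
    readings = next;
    const after = hot(readings);
    if (JSON.stringify(after) !== before) observers.forEach(o => o(after));
    return true;
}

// Feeder: converts structured input into a relational assignment.
function feed(line: string): boolean {
    const parts = line.split(",");
    const sensor = parts[0];
    const celsius = Number(parts[1]);
    if (sensor === undefined || parts.length !== 2 || isNaN(celsius)) return false;
    return assignReadings(readings.concat([{ sensor, celsius }]));
}

// Observer: converts the derived relation into an output format.
const alerts: string[] = [];
observers.push(hotNow => {
    alerts.push(hotNow.map(r => r.sensor).join("|"));
});

const acceptedCool = feed("a,21");    // accepted; derived relvar unchanged
const acceptedHot = feed("b,35");     // accepted; observer sees sensor "b"
const acceptedBad = feed("c,-500");   // rejected by the integrity constraint
```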
avoid useless accidental state • The core FRP system cannot be in a "bad state" • Derived state is not normally stored • Fixing bugs should never require exhaustive search through essential state • Accidental state does not need to be considered when developing the logic of your system
no access to state at all; referentially transparent • State represented with relations has no subjective bias in how related data is accessed • Integrity constraints are imposed in a purely declarative manner and do not interact with each other, so adding them increases the overall complexity of the system only linearly • In contrast to OOP/imperative programming, where interactions between methods cause complexity to grow at a much higher rate • More amenable to performance tuning
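The claim about integrity constraints can be made concrete with a small sketch in which each constraint is an independent predicate over the whole state. The `Order`, `State`, and `Constraint` types and the three example rules are hypothetical.

```typescript
type Order = { readonly id: number; readonly qty: number; readonly customer: string };
type State = { readonly orders: readonly Order[]; readonly customers: readonly string[] };

// A constraint is a labeled, pure predicate over the state.
type Constraint = { readonly label: string; readonly holds: (s: State) => boolean };

// Constraints are listed declaratively; none refers to any other,
// so adding one more does not change the meaning of the rest.
const constraints: readonly Constraint[] = [
    {
        label: "quantity is positive",
        holds: s => s.orders.every(o => o.qty > 0),
    },
    {
        label: "order ids are unique",
        holds: s => s.orders.every((o, i) =>
            s.orders.every((p, j) => i === j || o.id !== p.id)),
    },
    {
        label: "every order references a known customer",
        holds: s => s.orders.every(o => s.customers.indexOf(o.customer) >= 0),
    },
];

// Labels of all constraints the given state violates.
function violations(s: State): string[] {
    return constraints.filter(c => !c.holds(s)).map(c => c.label);
}

// Hypothetical infrastructure step: keep the current state if the proposed
// one violates any constraint.
function assign(current: State, proposed: State): State {
    return violations(proposed).length === 0 ? proposed : current;
}

const good: State = {
    customers: ["acme"],
    orders: [{ id: 1, qty: 2, customer: "acme" }],
};
const bad: State = {
    customers: ["acme"],
    orders: [{ id: 1, qty: 0, customer: "nobody" }],
};

const goodViolations = violations(good); // []
const badViolations = violations(bad);   // two labels
const afterAssign = assign(good, bad);   // stays at `good`
```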
components (essential logic) • Logic is simply a set of equations equating relvars with relations calculated by their expressions • No implicit ordering • No explicit parallelism • Lack of state in essential logic makes implicit parallelism much simpler to implement at the infrastructure level
flat relations discourages building larger compound data abstractions, making access more flexible • Minimal commitment to subjective groupings (only base relations) • (Discourages) Data Hiding • Benefits for referential transparency • testing • informal reasoning