Towards "annex", a Fact Based Dependency System

Mark Hibberd
September 19, 2014

Knowledge is not static. Yet when dealing with program artifacts, we choose to seal off what we know at the point in time when we know the least. This is wrong.

Context is important. Yet when defining dependencies on artifacts, instead of directly specifying the query we want (and hence embedding its context), we manually translate our request into antiquated notions of metadata, encoded as a number, embedded in a string. Yes, semantic versioning is wrong.

Reproducibility is essential. Yet most existing dependency systems force a trade-off of rigour and reproducibility against flexibility and ease of use. This is not necessary.

Drawing on well-understood foundations from Datalog and deductive databases, and utilizing functional programming fundamentals, “annex” takes a different view on how to manage artifacts. We should be able to ask: “Give me the latest binary-compatible versions of X with no known CVE”, or “Give me the last stable builds of my dependencies that have been tested in IE 9, Chrome and Firefox”. In a more general context outside of dependency resolution, queries such as “What platforms has build x of my library been tested on?” provide a useful understanding of the current state of artifacts. Finally, it should be possible to phrase all of these questions with a first-class notion of time, for example “Give me the same dependencies as when I last asked this query”.
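As a rough illustration of such queries, the sketch below models facts ascribed to atoms in an append-only log and answers a query against the world as it stood at a given version. The names (`Fact`, `Entry`, `world_at`, `select`), the atom label and the log contents are assumptions made for this example, not annex's actual API, and Python is used only for brevity (the talk's implementation is in Haskell).

```python
from dataclasses import dataclass

# Hypothetical model for illustration; these are not annex's actual types.

@dataclass(frozen=True)
class Fact:
    key: str
    value: str

@dataclass(frozen=True)
class Entry:
    version: int   # store-local time, e.g. 1 for @v1
    atom: str      # e.g. "boxer-1.2.1"
    fact: Fact

def world_at(log, version):
    """Return {atom: frozenset of facts} as known at `version`."""
    world = {}
    for entry in log:
        if entry.version <= version:
            world.setdefault(entry.atom, set()).add(entry.fact)
    return {atom: frozenset(facts) for atom, facts in world.items()}

def select(world, wanted):
    """Atoms in `world` that carry every fact in `wanted`, sorted by name."""
    return sorted(atom for atom, facts in world.items()
                  if all(f in facts for f in wanted))

# Facts are only ever appended; earlier versions stay queryable.
LOG = (
    Entry(1, "boxer-1.2.1", Fact("feature", "it-works")),
    Entry(2, "boxer-1.2.1", Fact("tested", "FreeBSD-9.1")),
)

query = [Fact("tested", "FreeBSD-9.1")]
print(select(world_at(LOG, 1), query))  # the fact is not yet known at version 1
print(select(world_at(LOG, 2), query))  # the atom qualifies from version 2 on
```

Because the log is never rewritten, asking the same query at the same version gives the same answer later on, which is one way to read "the same dependencies as when I last asked".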

This talk will start by walking through the concepts behind “annex”, before taking a deeper look at the design and implementation (in Haskell) and a multi-language demonstration. We will look at how its functional underpinnings give rise to very desirable properties for a dependency system. These properties include trivial distribution and caching, guaranteed reproducibility with minimal context, and predictable performance. Interestingly, holding steadfastly to functional programming principles also contributes to delivering a humane user experience in the face of complexity.
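One way to picture the caching property is content addressing: if an artifact is stored under an address derived from its bytes, a local copy at that address can be reused without asking the server again. The sketch below is a minimal illustration under assumptions: the use of SHA-256, the in-memory `Store` class and the `fetch` function are all hypothetical and not taken from annex itself.

```python
import hashlib

# Hypothetical sketch: SHA-256 and the in-memory stores are assumptions.

def address(data: bytes) -> str:
    """Content address: the SHA-256 hex digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()

class Store:
    """In-memory stand-in for a storage directory keyed by address."""

    def __init__(self):
        self.blobs = {}
        self.downloads = 0

    def put(self, data: bytes) -> str:
        addr = address(data)
        self.blobs[addr] = data
        return addr

    def download(self, addr: str) -> bytes:
        self.downloads += 1
        return self.blobs[addr]

def fetch(addr: str, local: Store, remote: Store) -> bytes:
    """Return the artifact at `addr`, downloading only on a local miss."""
    if addr in local.blobs:
        return local.blobs[addr]
    data = remote.download(addr)
    # The address doubles as a checksum, so the remote need not be trusted.
    if address(data) != addr:
        raise ValueError("downloaded bytes do not match the requested address")
    local.blobs[addr] = data
    return data

remote, local = Store(), Store()
addr = remote.put(b"contents of bin/boxer")
fetch(addr, local, remote)
fetch(addr, local, remote)
print(remote.downloads)  # prints 1: the second fetch is served locally
```

Since an address never changes meaning, cached entries never need invalidating in this model.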

Presented at Strange Loop, 19 September 2014: https://thestrangeloop.com/sessions/towards-annex-a-fact-based-dependency-system


Transcript

  1. towards annɛx @markhibberd

  2. “The enemy is the gramophone mind, whether or not one agrees with the record that is being played at the moment.” George Orwell, The Freedom of the Press
  3. one Motivation

  4. one Or, we are doing it ALL wrong

  5. 2.0.1

  6. 2.0.2

  7. 2.1.0

  8. 3.0.0

  9. Surely This is a Joke

  10. LANGUAGE WARS ARE PASSE

  11. TIME IS A THING

  12. commit

  13. ci

  14. publish

  15. platform test

  16. production

  17. performance

  18. cve

  19. CLOSED WORLD ASSUMPTIONS

  20. Dependencies Cost Too Much

  21. two Concepts

  22. eminence boxer napoleon snowball

  23. napoleon boxer snowball eminence wai base

  24. napoleon boxer snowball eminence wai base postgres

  25. napoleon boxer snowball eminence wai base postgres libpq OS

  26. napoleon boxer snowball eminence wai base postgres libpq OS CLANG

  27. annex is a fact store

  28. START WITH SOMETHING WE HAVE (OR CAN GET)

  29. boxer a family

  30. family/123-abc-456 a family

  31. boxer 1.2.1 an atom

  32. atom/123-abc-456 an atom

  33. a fact fact/123-abc-457: commit: bd2f074…02

  34. fact/123-abc-458: api-signature: […] a fact

  35. fact/123-abc-459: feature: it-works a fact

  36. We ascribe FACTS to ATOMS

  37. atom/123-abc-456 fact/123-abc-457 fact/123-abc-458 fact/123-abc-459

  38. boxer 1.2.1 commit: bd2f074…02 api-signature: […] feature: it-works

  39. The view of FACTS against ATOMS at a point in TIME is a WORLD
  40. boxer facts/… snowball facts/… napoleon facts/… a single world

  41. Worlds Change Over Time

  42. @v1 boxer 1.2.1 commit: bd2f074…02 api-signature: […] feature: it-works

  43. fact/123-abc-460: tested: FreeBSD-9.1

  44. @v1 boxer 1.2.1 commit: bd2f074…02 api-signature: […] feature: it-works
    @v2 boxer 1.2.1 commit: bd2f074…02 api-signature: […] feature: it-works tested: FreeBSD-9.1
  45. Usability #1 Design decisions should be framed in terms of predictability and repeatability
  46. annex is a data store

  47. Because we believe in interacting with an open world doesn’t mean we have to trust it.
  48. atom/123-abc-456 fact/123-abc-457 fact/123-abc-458 fact/123-abc-459 artifact/123-abc-459

  49. boxer 1.2.1 commit: bd2f074…02 api-signature: […] feature: it-works artifact: tag: bin/boxer flags: […] address: e2f1…bc74
  50. points to annex/storage/e2f1…bc74/data /info boxer 1.2.1 artifact: tag: bin/boxer flags: […] address: e2f1…bc74
  51. annex/storage/e2f1…bc74/data /info local/storage/e2f1…bc74/data /info free predictable caching

  52. Usability #2 Never download something already on a user’s machine

  53. annex is a language

  54. :boxer :has :feature “multi-part-put” :has :commit “abcd-1345” :has :branch “master”
    :snowball :is :compatable-with atom/…
    :napoleon :semver >= 1.3 < 1.4
  55. First Class Notion of Time

  56. :boxer :has :feature “multi-part-put” :has :commit “abcd-1345” :has :branch “master”
    :snowball :is :compatable-with atom/…
    :napoleon :semver >= 1.3 < 1.4
    +annex.example.com@v12345
  57. :boxer :has :feature “multi-part-put” :has :commit “abcd-1345” :has :branch “master”
    :snowball :is :compatable-with atom/…
    :napoleon :semver >= 1.3 < 1.4
    +annex.internal.com@v123
  58. :boxer :has :feature “multi-part-put” :has :commit “abcd-1345” :has :branch “master”
    :snowball :is :compatable-with atom/…
    :napoleon :semver >= 1.3 < 1.4
    :no-cve +annex.example.com@HEAD
  59. :no-cve +annex.example.com@HEAD assumes irrefutable facts

  60. Usability #3 Precision is important, users should only have to specify what is important to them
  61. Usability #4 Flexibility must never come at the cost of determinism
  62. annex is a tool

  63. annex fetch boxer.ax +server@v123

  64. annex fetch +server@v123

  65. annex fetch -u +server@v123

  66. .stable.ax: +server@v123

  67. annex fetch

  68. Usability #5 Don’t generate files that a user wouldn’t write by hand
  69. annex fetch +repository@v123
    annex fetch +repository@v678
    annex fetch +repository@v123
    Instant By Design
  70. annex atom --create family/1d…3b

  71. annex fact atom/12…ef feature red

  72. git checkout -b topic/feature
    git add src/Boxer.hs
    git commit -m ‘Great change! fixes #12 annex: :feature winning’
    git push origin topic/feature
    annex fact atom/12…ef --git HEAD
  73. git notes add --ref=annex \
      -m “:feature again” HEAD
    git push origin refs/notes/*
    annex fact atom/12…ef --git HEAD
  74. Usability #6 Leverage tools already in use

  75. Usability #7 Don’t be as bad as Git

  76. build.sbt:
    scalaVersion := 2.11
    annexDependencies := List( atom(“ivory”) .has(“feature”, “puts”) )
  77. napoleon.cabal.annex: name: napoleon depends-on: eminence :feature fix-#12

  78. three A Deeper Look

  79. resolution

  80. annex.mth.io github.com/ambiata/boxer Resolution

  81. annex.mth.io github.com/ambiata/boxer Retrieve Facts Resolution

  82. annex.mth.io github.com/ambiata/boxer Retrieve Facts Resolution

  83. annex.mth.io github.com/ambiata/boxer Send Query Resolution

  84. annex.mth.io github.com/ambiata/boxer Synchronize Artifacts Resolution

  85. development

  86. Time-Dependent Resolution

  87. eminence boxer napoleon snowball

  88. napoleon.ax: :boxer :has :feature ingestion
    :snowball :has :feature timeline
    .stable.ax: +annex.mth.io@v123
  89. [ci] stable [ci] edge annex fetch @HEAD annex fetch

  91. Usability #8 Design for simulation, notifications and metrics

  92. An Open World

  93. eminence boxer napoleon snowball

  94. eminence boxer napoleon snowball _.js wai

  95. napoleon.ax: :wai :source hackage :semver == 2.1.*
    :underscore.js :source cdnjs :semver == 1.*
  96. napoleon.ax: :wai :source hackage :semver == 2.1.*
    :underscore.js :source cdnjs :semver == 1.* :tested-on ie4
  97. Usability #9 Start with the premise that you need to interact with less principled systems
  98. Source Substitution

  99. eminence boxer napoleon snowball

  100. eminence boxer napoleon snowball Working On A Feature

  101. napoleon.ax: :boxer :has :feature ingestion
    :snowball :has :feature timeline
  102. eminence boxer napoleon snowball Need A Bug Fix

  103. annex fetch --source-substitute \ eminence ../eminence

  104. annex fetch --source-substitute \ eminence ../eminence \ --ignore-constraints

  105. napoleon.ax: :boxer :has :feature ingestion
    :snowball :has :feature timeline
    :eminence :has :commit ab34…f3e1 :transitive
  106. annex fact git/HEAD fix ‘#112’

  107. napoleon.ax: :boxer :has :feature ingestion
    :snowball :has :feature timeline
    :eminence :has :fix #112 :transitive
  108. eminence boxer napoleon snowball Ship It

  109. eminence boxer napoleon snowball Never had to touch intermediates

  110. Binary Substitution

  111. Requires deduction of output signature BEFORE it is built

  112. eminence boxer napoleon snowball Depends On Transitives

  113. Nix Style Build The World + Better Language Support Essential

  114. distribution

  115. If we assume an open world, multiple annex fact stores are a reality
  116. Handling time is non-trivial in a distributed system

  117. Annex (currently) chooses federation over being a truly distributed system

  118. +red@v123 Time Axis Is Localized To A Given Store

  119. Immutability Gives Us ∞ Read Replicas +red@v123 red green blue

  120. Query Controlled Writes red green blue +red@v123 => +red@v124

  121. Working on a model of facts that always commutes

  122. trust

  123. Authenticated FAMILY and ATOM owners

  124. Signed Facts

  125. Mediation and Fact Views

  126. annex.mth.io github.com/ambiata/boxer Fact Mediation annex.inside.ambiata.com [signed-by:…] [owned-by:…]

  127. solving

  128. DPLL / SAT based solution w/ inspiration from OPIUM paper

  129. Main challenge is mapping fact model to equation

  130. Main challenge in mapping fact model is deducing identity from user specified query
  131. Secondary challenge is caching partial solutions and reducing duplicated work where possible
  132. Reproducibility of solver (a cover up)

  133. napoleon.ax: :boxer :has :feature ingestion
    :snowball :has :feature timeline
    :annex-resolver :has :version 1
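The solving slides name DPLL as the basis of the resolver. For readers unfamiliar with it, the following is a minimal textbook DPLL sketch (unit propagation plus branching), not annex's solver. The clause encoding at the end is a hand-made assumption: the talk notes that mapping the fact model to such an equation is the hard part, and the `boxer 1.3.0` candidate is invented purely for the example.

```python
def simplify(clauses, literal):
    """Assume `literal` is true: drop satisfied clauses, shrink the rest."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue                       # clause already satisfied
        result.append(clause - {-literal}) # the opposite literal is now false
    return result

def dpll(clauses, assignment=frozenset()):
    """Return a frozenset of true literals satisfying `clauses`, or None."""
    # Unit propagation: a one-literal clause forces that literal.
    while True:
        if any(len(c) == 0 for c in clauses):
            return None                    # an empty clause is a conflict
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        (literal,) = unit
        clauses = simplify(clauses, literal)
        assignment = assignment | {literal}
    if not clauses:
        return assignment                  # every clause is satisfied
    # Branch: try a literal from the first clause, then its negation.
    literal = next(iter(clauses[0]))
    for choice in (literal, -literal):
        found = dpll(simplify(clauses, choice), assignment | {choice})
        if found is not None:
            return found
    return None

# Hypothetical encoding: 1 = napoleon, 2 = boxer 1.2.1, 3 = boxer 1.3.0.
CLAUSES = [
    frozenset({1}),         # napoleon must be selected
    frozenset({-1, 2, 3}),  # napoleon needs some version of boxer
    frozenset({-2, -3}),    # at most one version of boxer
    frozenset({-3}),        # boxer 1.3.0 is ruled out by the query
]
print(sorted(dpll(CLAUSES), key=abs))  # prints [1, 2, -3]
```

A positive number in the result means the candidate is selected, a negative one that it is excluded. Variables that no clause constrains may be left out of the returned assignment.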
  134. four A Look Forward

  135. Deeper Analytics (chart: axis 0 to 100, April to July)

  136. Deeper Analytics (same chart): arbitrary queries and reporting over atoms

  137. Deeper Analytics (same chart): inference of relevant facts for customer issues

  138. Deeper Analytics (same chart): predict failure in advance
  139. Fixing ALL the COMPILERS

  140. Easier extension via deductive rules

  141. Commutative fact model and non-linear versioning

  142. Better Support for Operational / Runtime Dependencies

  143. STEAL these ideas

  144. end transmission.

  145. towards annɛx @markhibberd

  146. Images
    Unmodified, licences specified at Wikimedia links:
    http://commons.wikimedia.org/wiki/File:Clock_Gare_de_Paris-Est.jpg
    http://en.wikipedia.org/wiki/File:Bundesarchiv_Bild_101III-Merz-014-12A,_Russland,_Beginn_Unternehmen_Zitadelle,_Panzer.jpg
    http://commons.wikimedia.org/wiki/File:French_-_Door_with_Cat_Hole_-_Walters_64164.jpg
    Unmodified, CC BY 2.0 (https://creativecommons.org/licenses/by/2.0/):
    https://www.flickr.com/photos/timothymorgan/75288582/
    https://www.flickr.com/photos/timothymorgan/75288583/
    https://www.flickr.com/photos/timothymorgan/75294154/
    https://www.flickr.com/photos/timothymorgan/75593155/