
The Mace Model Checker


This was an internal presentation at the University of Stavanger on the Mace Model Checker, a tool for model checking distributed, concurrent systems for correctness.

Christian Stigen Larsen

January 17, 2013


Transcript

  1. « [Mace consists of tools to] enhance development, testing and understanding of the execution of distributed systems ... »
  2. [Diagram: High-level description → Verification and compilation → Generated code]
  3. Design principles
    • Service objects in a hierarchy of layers
    • Events as a unified concurrency model
    • Aspects for cross-cutting concerns
  4. Layers
    [Diagram labels: Upper layer, Lower layer, Provides, Uses, Upcalls, Downcalls]
    • Implement downcalls to receive them
    • Implement upcalls to receive them
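The layering on this slide can be illustrated with a minimal Python sketch. It is not Mace code: the class names `Transport` and `Chat` and the methods `send`, `arrive`, `deliver` and `say` are hypothetical, chosen only to show a downcall travelling from the upper layer to the lower one and an upcall travelling back.

```python
class Transport:
    """Lower layer (hypothetical): provides send() and raises deliver() upcalls."""

    def __init__(self):
        self.upper = None   # the layer above, once registered
        self.wire = []      # stands in for the network

    def register(self, upper):
        self.upper = upper

    def send(self, dest, payload):
        # Downcall: implemented here so the lower layer receives it.
        self.wire.append((dest, payload))

    def arrive(self, src, payload):
        # Simulated network arrival, passed upward as an upcall.
        if self.upper is not None:
            self.upper.deliver(src, payload)


class Chat:
    """Upper layer (hypothetical): uses Transport and implements its upcall."""

    def __init__(self, transport):
        self.transport = transport
        self.inbox = []
        transport.register(self)

    def say(self, dest, text):
        self.transport.send(dest, text.encode("utf-8"))  # downcall

    def deliver(self, src, payload):
        # Upcall: implemented here so the upper layer receives it.
        self.inbox.append((src, payload.decode("utf-8")))
```

Calling `Chat(Transport()).say("b", "hi")` places one entry on the lower layer's `wire`, and a later `arrive("b", b"yo")` on the transport ends up in the upper layer's `inbox`.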
  5. Concurrency
    • State = enumeration and variables
    • Transitions = upcalls from below, downcalls from above, scheduled events
    • Guards = only transition if condition is true
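As a rough illustration of the state, transition and guard model described here, the following Python sketch dispatches events through guarded transitions. It is an assumption-laden toy, not how Mace generates code: the `Service` class, its event names and its `handle` method are invented for the example.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    JOINING = auto()
    JOINED = auto()


class Service:
    """Toy service: an enumerated state plus ordinary state variables."""

    def __init__(self):
        self.state = State.INIT
        self.delivered = []  # a state variable
        # Event name -> list of (guard, action) pairs.
        self.transitions = {
            "join": [(lambda: self.state is State.INIT, self._start_join)],
            "join_reply": [(lambda: self.state is State.JOINING, self._finish_join)],
            "deliver": [(lambda: self.state is State.JOINED, self._deliver)],
        }

    def _start_join(self):
        self.state = State.JOINING

    def _finish_join(self):
        self.state = State.JOINED

    def _deliver(self, msg):
        self.delivered.append(msg)

    def handle(self, event, *args):
        """Fire the first transition whose guard holds; report whether one fired."""
        for guard, action in self.transitions.get(event, []):
            if guard():
                action(*args)
                return True
        return False
```

With this sketch, a `deliver` event is ignored until the service has gone through `join` and `join_reply`, which mirrors the `guard ( state == joined )` idea.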
  6. Concurrency

    states {
      init;
      preJoining;
      joining;
      joined;
    }

    state_variables {
      NodeKey myhash;
      /* ... */
    }

    transitions {
      scheduler global_maintenance()
      guard ( state == joined ) {
        NodeKey d = myhash;
        /* ... */
        TCP.route(n, GlobalSample(d));
      }

      upcall forward(const NodeKey& from, const NodeKey& to, /* etc */)
      guard ( state == joined ) {
        nextHop = make_routing_decision(msg.key);
        return true;
      }

      /* etc */
    }
  7. Domain Specific Language (DSL)

    states {
      init;
      preJoining;
      joining;
      joined;
    }

    state_variables {
      NodeKey myhash;
      /* ... */
    }

    transitions {
      scheduler global_maintenance()
      guard ( state == joined ) {
        NodeKey d = myhash;
        /* ... */
        TCP.route(n, GlobalSample(d));
      }

      upcall forward(const NodeKey& from, const NodeKey& to, /* etc */)
      guard ( state == joined ) {
        nextHop = make_routing_decision(msg.key);
        return true;
      }

      /* etc */
    }
  8. Failures

    // local detection
    detect {
      guard = (range != pre(range));
      error = notifyNewRange;
    }

    // distributed detection across myleafset
    detect {
      guard = (state == joined);
      nodes = myleafset;
      send = {
        message = LeafsetPush(myhash, myleafset);
        period = 5sec;
      }
      receive = {
        message = LeafsetPull;
        period = 5min;
      }
      error = leafFailed;
    }
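The distributed `detect` block above pairs a send period with a receive period. One plausible reading, sketched below in Python, is a periodic probe plus a timeout on replies. This is an interpretation, not the semantics Mace defines: the `FailureDetector` class, its `tick` and `on_receive` methods, and the use of explicit timestamps instead of a real clock are all assumptions made for the example.

```python
class FailureDetector:
    """Probe each node every send_period; fail it after receive_timeout of silence."""

    def __init__(self, nodes, send_period, receive_timeout, now=0.0):
        self.send_period = send_period
        self.receive_timeout = receive_timeout
        self.last_sent = {n: None for n in nodes}   # when we last probed each node
        self.last_heard = {n: now for n in nodes}   # when each node last answered
        self.failed = set()

    def on_receive(self, node, now):
        # Record a reply from a node that is still being monitored.
        if node in self.last_heard:
            self.last_heard[node] = now

    def tick(self, now):
        """Return (nodes to probe now, nodes newly declared failed)."""
        to_probe, newly_failed = [], []
        for node in list(self.last_heard):
            if now - self.last_heard[node] > self.receive_timeout:
                # Silent for too long: stop monitoring and report the failure.
                del self.last_heard[node]
                del self.last_sent[node]
                self.failed.add(node)
                newly_failed.append(node)
                continue
            last = self.last_sent[node]
            if last is None or now - last >= self.send_period:
                self.last_sent[node] = now
                to_probe.append(node)
        return to_probe, newly_failed
```

Using the periods from the slide, `FailureDetector(leafset, send_period=5, receive_timeout=300)` would ask for a probe every 5 seconds and report a node once 5 minutes pass without a reply; the caller decides what the probe message and the error handler are.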
  9. Analysis
    • Uses aspects to generate debugging and logging code
    • Causal-paths: Think distributed call graphs
    • Model checking: Detect liveness violations
  10. Model Checking
    « While our experience is restricted to MaceMC, we believe our random execution algorithms for finding liveness violations and the critical transition generalize to any state-exploration model checker capable of replaying executions. »
  11. Model Checking
    « we believe our algorithms generalize to any model checker capable of replaying executions. »
  12. Search Algorithms
    • Bounded depth-first search (BDFS)
    • Random walks
    • Isolating the critical transition
    • Combined exhaustive search and random walks
    • Reducing the search space
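The first three items on this slide can be sketched on a toy transition system. The Python below is a simplified illustration under stated assumptions, not MaceMC's implementation: the four-state `GRAPH`, the function names, and the linear scan used to locate the critical step are all invented for the example.

```python
import random

# Toy system (assumption): state 3 is the "live" goal, state 2 is a dead end.
GRAPH = {0: [1, 2], 1: [3], 2: [2], 3: [3]}


def successors(state):
    return GRAPH[state]


def bounded_dfs(initial, successors, is_bad, max_depth):
    """Return a path to a state where is_bad holds, or None within the depth bound."""
    stack = [(initial, [initial])]
    while stack:
        state, path = stack.pop()
        if is_bad(state):
            return path
        if len(path) - 1 < max_depth:
            for nxt in successors(state):
                stack.append((nxt, path + [nxt]))
    return None


def random_walk_reaches(start, successors, is_live, max_steps, rng):
    """Follow random transitions; report whether a live state is reached in time."""
    state = start
    for _ in range(max_steps):
        if is_live(state):
            return True
        options = successors(state)
        if not options:
            return False
        state = rng.choice(options)
    return is_live(state)


def critical_step(path, successors, is_live, walks, max_steps, rng):
    """Index of the first state on path from which no random walk became live."""
    for i, state in enumerate(path):
        recovered = any(
            random_walk_reaches(state, successors, is_live, max_steps, rng)
            for _ in range(walks)
        )
        if not recovered:
            return i
    return None
```

In this toy graph, `bounded_dfs(0, successors, lambda s: s == 2, 3)` finds the path `[0, 2]`, and random walks from state 2 never reach state 3, so the step into state 2 is the one after which recovery is no longer observed. Random walks only give evidence, not proof, that a state is dead.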
  13. Some questions
    • Events are functions and therefore in-process? No remote events?
    • Are states set explicitly by code?
    • I’m missing a discussion on synchronization and details on external event handling
    • ... but this is all in the C++ source code :)
  14. References
    • Charles Killian et al., «Mace: Language support for building distributed systems», UCSD
    • Charles Killian et al., «Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code»