Hansei: Property-based Development of Concurrent Systems


Slides from my Erlang Workshop 2012 presentation. Related paper: http://dl.acm.org/citation.cfm?id=2364505


Joseph Blomstedt

September 14, 2012

Transcript

  1. 1.

    Hansei: Property-based Development of Concurrent Systems
    Joseph Blomstedt (@jtuple), Basho Technologies, joe@basho.com
    Erlang Workshop, September 2012
    Tuesday, September 25, 2012
  2. 12.

    Erlang Testing Tools
    • Property-based Testing: QuickCheck, PropEr, Triq
    • Interleaving Tools: PULSE, Concuerror, McErlang
  3. 20.

    Hansei Goals
    • Enable end-to-end testing, from prototype to final implementation
    • Use existing OTP behaviors
      Extended behaviors with property information
    • Message interleaving across VMs
  4. 24.

    QuickCheck eqc_statem
    command(State) ->
        %% Commands to run against stateful system
        oneof(Cmds).
    precondition(State, Cmd) ->
        %% Return true if cmd is valid in current state.
    next_state(State, Result, Cmd) ->
        %% Update test state after a given cmd.
    postcondition(State, Cmd, Result) ->
        %% Test postconditions.
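    The callback roles above can be illustrated without QuickCheck itself. The following is a minimal plain-Erlang sketch, assuming a toy counter as the system under test. The module name, the inc/dec commands, and the simplified two- and three-argument callbacks are all hypothetical; eqc_statem's real callbacks take symbolic calls and results, and it generates and shrinks the command list rather than taking a fixed one.

    ```erlang
    -module(statem_sketch).
    -export([run/1]).

    %% Model state is the expected counter value (hypothetical toy model).
    %% A decrement is only valid when the counter is positive.
    precondition(Count, dec) -> Count > 0;
    precondition(_Count, inc) -> true.

    %% Update the model after a command.
    next_state(Count, inc) -> Count + 1;
    next_state(Count, dec) -> Count - 1.

    %% The observed result must match what the model predicts.
    postcondition(Count, Cmd, Result) -> Result =:= next_state(Count, Cmd).

    %% Stand-in for the real system under test (hypothetical).
    apply_cmd(Actual, inc) -> Actual + 1;
    apply_cmd(Actual, dec) -> Actual - 1.

    %% Walk a fixed command list, checking each step against the model.
    run(Cmds) -> run(Cmds, 0, 0).

    run([], _Model, _Actual) -> ok;
    run([Cmd | Rest], Model, Actual) ->
        case precondition(Model, Cmd) of
            false -> {invalid, Cmd};
            true ->
                Result = apply_cmd(Actual, Cmd),
                case postcondition(Model, Cmd, Result) of
                    true -> run(Rest, next_state(Model, Cmd), Result);
                    false -> {failed, Cmd, Result}
                end
        end.
    ```

    Calling statem_sketch:run([inc, inc, dec]) walks three valid commands, while statem_sketch:run([dec]) is rejected by the precondition before the system is touched.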
  5. 27.

    Hansei
    • Test consists of a test module and a set of process modules
    • Events
      External events, timers, things you do not care to model
    • Calls/casts map to simulated receive/reply semantics
  6. 28.

    Hansei Test
    [Diagram: a test module alongside six test processes, three labeled (server) and three labeled (fsm)]
  7. 31.

    Test Module
    after_call, after_cast, initial_state, after_event, post_call, post_cast, post_event, always, event, precondition, process_modules
  8. 32.

    Hansei Operating Modes
    • Simulation
      Used during prototyping / modeling
    • Tracing
    • Tracing + Interception
  9. 33.

    Simulation Mode
    • Calls/casts mapped to command sequences
    • Generates sequence of events, calls, casts
    • Runs against simulated system of processes
    • Shrinks sequence when postconditions fail
  10. 35.

    Tracing Mode
    • Generate event sequences, not calls/casts
    • Run against external stateful system
    • Erlang tracing used to capture the actual calls/casts that occurred
    • Verify events + observed calls/casts against model and final cluster state
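    As a rough illustration of capturing messages with Erlang tracing, the sketch below uses erlang:trace/3 with the 'receive' flag on a throwaway process and collects the resulting trace messages. The module and function names are hypothetical, and this is not Hansei's code; it only shows the kind of raw observation a tracing mode could build on.

    ```erlang
    -module(trace_sketch).
    -export([observe/1]).

    %% Send Msgs to a throwaway process and return what tracing saw it receive.
    observe(Msgs) ->
        Target = spawn(fun loop/0),
        %% The calling process becomes the tracer and is sent
        %% {trace, Pid, 'receive', Msg} for each message Target receives.
        1 = erlang:trace(Target, true, ['receive']),
        lists:foreach(fun(M) -> Target ! M end, Msgs),
        Observed = collect(Target, length(Msgs)),
        exit(Target, kill),
        Observed.

    %% The traced process simply drains its mailbox.
    loop() ->
        receive _ -> loop() end.

    %% Gather up to N trace messages, giving up after one second of silence.
    collect(_Target, 0) -> [];
    collect(Target, N) ->
        receive
            {trace, Target, 'receive', Msg} -> [Msg | collect(Target, N - 1)]
        after 1000 ->
            []
        end.
    ```

    A test harness could then compare the observed list against the calls and casts its model expected for the generated events.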
  11. 36.

    Tracing + Implementation
    • Modify implementation to enable controlling message interleaving
    • Implemented as a proxy process that delays forwarding messages until told to do so by the test module
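    The proxy idea above can be sketched in a few lines of plain Erlang. This is a minimal sketch under stated assumptions, not Hansei's implementation: the module name, the '$release' control message, and the first-in-first-out release order are all hypothetical. A real interception layer would presumably let the test module choose which held message to deliver next.

    ```erlang
    -module(proxy_sketch).
    -export([start/1, release/1]).

    %% Start a proxy in front of Dest. Anything sent to the proxy is held
    %% in arrival order until release/1 is called.
    start(Dest) ->
        spawn(fun() -> loop(Dest, queue:new()) end).

    %% Ask the proxy to forward its oldest held message, if it has one.
    %% Returns forwarded, empty, or timeout.
    release(Proxy) ->
        Proxy ! {'$release', self()},
        receive
            {Proxy, Reply} -> Reply
        after 1000 -> timeout
        end.

    loop(Dest, Held) ->
        receive
            {'$release', From} ->
                case queue:out(Held) of
                    {{value, Msg}, Rest} ->
                        Dest ! Msg,
                        From ! {self(), forwarded},
                        loop(Dest, Rest);
                    {empty, _} ->
                        From ! {self(), empty},
                        loop(Dest, Held)
                end;
            Msg ->
                %% Ordinary traffic is buffered rather than delivered.
                loop(Dest, queue:in(Msg, Held))
        end.
    ```

    With Proxy = proxy_sketch:start(self()), messages sent to Proxy stay buffered, and each proxy_sketch:release(Proxy) call delivers exactly one of them to the destination.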
  12. 37.

    Simple Example
    • Nodes join together and form a cluster
    • Nodes periodically gossip membership state to other known nodes
    • Prototype nodes as gen_servers
  13. 39.

    Node Server (2/5)
    -record(state, {id, members}).

    init(Node) ->
        {ok, #state{id=Node, members=[Node]}}.

    handle_call(get_members, _From, State) ->
        {reply, State#state.members, State};
    handle_call(get_state, _From, State) ->
        {reply, State, State}.
  14. 40.
  15. 41.

    Node Server (4/5)
    events(#state{id=Node, members=Members}) ->
        {call, ?MODULE, send_gossip, [Node, [elements(Members)]]}.

    precondition({send_gossip, [Node, [OtherNode]]}, S) ->
        all([lists:member(OtherNode, S#state.members),
             Node /= OtherNode]).
  16. 42.

    Node Server (5/5)
    handle_event({join, [OtherNode]}, State) ->
        %% Adopt the other node's membership list and add ourselves to it.
        OtherState = gen_server:call(OtherNode, get_state),
        Members = OtherState#state.members,
        Members2 = ordsets:add_element(State#state.id, Members),
        {noreply, State#state{members=Members2}};
    handle_event({send_gossip, [OtherNode]}, State) ->
        gen_server:cast(OtherNode, {gossip, State}),
        {noreply, State}.
  17. 43.

    Test Module (1/2)
    -record(state, {nodes, singleton}).

    prop_riak() ->
        hansei_test:simulate(?MODULE).

    process_modules() ->
        lists:duplicate(?CLUSTER_SIZE, riak_node).

    initial_state(Procs) ->
        #state{nodes=Procs, singleton=Procs}.
  18. 44.

    Test Module (2/2)
    events(#state{nodes=Nodes}) ->
        {call, ?MODULE, join, [elements(Nodes), [elements(Nodes)]]}.

    precondition({join, [Node, [OtherNode]]}, S) ->
        Singleton = S#state.singleton,
        all([Node /= OtherNode,
             lists:member(Node, Singleton),
             (Singleton == S#state.nodes) or lists:member(OtherNode, Singleton)]).

    %% Assumption: the state S is passed as a second argument, as in
    %% precondition/2; the slide uses S in the body without binding it.
    after_event({join, [Node, [OtherNode]]}, S) ->
        Singleton = S#state.singleton -- [Node, OtherNode],
        S#state{singleton=Singleton}.
  19. 45.

    Extended Example
    • Cluster maintains a weak leader
      Lowest node id in the cluster is considered the leader
      No actual leader election or failure detection
    • Property we care about
      At all times, there is only one node that believes it is the leader of a cluster
  20. 47.

    Extended Node Server (2/4)
    -record(state, {id, members, leader}).

    init(Node) ->
        {ok, #state{id=Node, members=[Node], leader=Node}}.

    handle_call(get_leader, _From, State) ->
        {reply, State#state.leader, State};
  21. 48.

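    Extended Node Server (3/4)
    %% Assumption: Leader is the gossiped leader (it is unbound on the slide),
    %% which is consistent with the counterexample trace.
    handle_cast({gossip, #state{members=OtherMembers, leader=Leader}},
                State=#state{members=Members}) ->
        Members2 = ordsets:union(Members, OtherMembers),
        %% The current leader recomputes the leader as the lowest node id;
        %% any other node adopts the gossiped leader.
        Leader2 = case is_leader(State) of
                      true -> hd(lists:sort(Members2));
                      false -> Leader
                  end,
        State2 = State#state{members=Members2, leader=Leader2},
        {noreply, State2}.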
  22. 49.

    Extended Node Server (4/4)
    handle_event({join, [OtherNode]}, State) ->
        OtherState = gen_server:call(OtherNode, get_state),
        #state{members=Members, leader=Leader} = OtherState,
        Members2 = ordsets:add_element(State#state.id, Members),
        {noreply, State#state{members=Members2, leader=Leader}};
    handle_event({send_gossip, [OtherNode]}, State) ->
        gen_server:cast(OtherNode, {gossip, State}),
        {noreply, State}.
  23. 50.

    Extended Test Module
    always(S) ->
        all([begin
                 Members = riak_node:get_members(Node),
                 one_leader(Members)
             end || Node <- S#state.nodes]).

    one_leader(Members) ->
        Leaders = [Leader || Node <- Members,
                             Leader <- [riak_node:get_leader(Node)],
                             Leader == Node],
        length(lists:usort(Leaders)) < 2.
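    The single-leader invariant can also be read as a pure function over a snapshot of the cluster. The sketch below assumes a hypothetical representation, a map from node id to the leader that node currently believes in, instead of querying live riak_node processes; the module name and the sample values are illustrative only.

    ```erlang
    -module(leader_prop_sketch).
    -export([one_leader/1]).

    %% Views maps each node id to the leader that node currently believes in.
    %% The property holds when at most one node names itself as leader.
    one_leader(Views) ->
        SelfLeaders = [Node || {Node, Leader} <- maps:to_list(Views),
                               Leader =:= Node],
        length(SelfLeaders) < 2.
    ```

    For example, one_leader(#{1 => 1, 3 => 1}) holds, while one_leader(#{1 => 1, 3 => 3}) does not, because both nodes consider themselves the leader.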
  24. 51.

    Counterexample
    [{init,{test_state,undefined,undefined,riak_model,0,[],undefined,undefined,simulate}},
     {set,{var,1},{call,hansei_test,init_dynamic,[]}},
     {set,{var,2},{call,hansei_test,init_system,[riak_model]}},
     {set,{var,3},{call,riak_model,join,[1,[3]]}},
     {set,{var,4},{call,hansei_test,rcvmsg,[3,{1,{call,get_state}}]}},
     {set,{var,5},{call,hansei_test,rcvreply,[1,{3,{state,3,[3],3}}]}},
     {set,{var,6},{call,riak_node,send_gossip,[1,[3]]}},
     {set,{var,7},{call,hansei_test,rcvmsg,[3,{1,{cast,{gossip,{state,1,[1,3],3}}}}]}},
     {set,{var,8},{call,riak_node,send_gossip,[3,[1]]}},
     {set,{var,9},{call,riak_node,send_gossip,[1,[3]]}},
     {set,{var,16},{call,hansei_test,rcvmsg,[3,{1,{cast,{gossip,{state,1,[1,3],3}}}}]}},
     {set,{var,18},{call,hansei_test,rcvmsg,[1,{3,{cast,{gossip,{state,3,[1,3],1}}}}]}}]
    {postcondition,false}
  25. 52.

    [Diagram: message sequence among nodes 2, 3, and 1. Labels: join 3; call: get_state; cast: gossip([1,3], 3); send_gossip 3; send_gossip 3; send_gossip 1; reply: ([3], 3); cast: gossip([1,3], 3); cast: gossip([1,3], 1). Node states shown: [1,3], 3; [1,3], 1; [3], 3; [1,3], 3; [1], 1; [1,3], 1]
  26. 53.

    Versioned leader state
    • Add version number to gossiped state
    • Leader increments version when changed
    • Node updates leader only if newer version
    • After changes, model passes without issue
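    The versioning rule above can be sketched as two small pure functions. The {Version, Leader} pair, the module name, and the function names are assumptions made for illustration; the slides do not show how the version is actually stored in the gossiped state.

    ```erlang
    -module(versioned_leader_sketch).
    -export([new/1, elect/2, merge/2]).

    %% Leader state is a {Version, Leader} pair (representation assumed).
    new(Node) -> {0, Node}.

    %% Choosing the same leader again leaves the version alone;
    %% choosing a different leader bumps it.
    elect({Version, Leader}, Leader) -> {Version, Leader};
    elect({Version, _Old}, NewLeader) -> {Version + 1, NewLeader}.

    %% A node adopts gossiped leader state only if it is strictly newer,
    %% so stale gossip that arrives late is ignored.
    merge({LocalV, _} = Local, {RemoteV, _} = Remote) ->
        case RemoteV > LocalV of
            true -> Remote;
            false -> Local
        end.
    ```

    With this rule, a node holding {1, 1} that later receives an older {0, 3} keeps {1, 1}, which is the kind of late, stale gossip the unversioned model accepted.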
  27. 54.

    [Diagram: message sequence among nodes 2, 3, and 1. Labels: join 3; call: get_state; cast: gossip([1,3], 3); send_gossip 3; send_gossip 3; send_gossip 1; reply: ([3], 3); cast: gossip([1,3], 3); cast: gossip([1,3], 1). Node states shown: [1,3], 1; [1,3], 1; [3], 3; [1,3], 3; [1], 1; [1,3], 1]
  28. 55.

    Riak Implementation
    • Simple example similar to Riak clustering system
    • Can run tracing/interception mode against Riak
    • Use riak_test to bring up multiple Riak nodes
    • Change process_modules to return a list [{node(), riak_core_gossip}]
  29. 56.

    Open source
    • Hansei will be released as open source: http://github.com/basho/hansei
    • Apache License (most likely)
    • Soon!
  30. 57.

    Future Plans
    • Simulate monitors + links
    • Simulate dropping messages
      Earlier prototype did, recent changes broke code
    • Support process exits, supervisors
    • Add properties to most of riak_core
    • Use Hansei in construction of basho_ensemble
      New dynamic ensemble, leader election library