CUFP 2015: Modeling state transitions with specification-based random testing

What if you thought about tests only in terms of properties and counterexamples? Properties may assert failures and/or successes. Counterexamples to a set of properties can “shrink” to smaller failures that are easier to reason about. Properties and counterexamples are the foundation of QuickCheck, a tool to generate tests over concurrent and non-deterministic code.

The difficult component of most real-world approaches to generative testing is understanding the bounds and requirements surrounding a problem/feature/application. Using Erlang’s QuickCheck implementation, we’ll walk through an example which models a continuous, side-effecting, hashtree-based synchronization mechanism, called Active Anti-Entropy (AAE), as an abstract state machine. By being able to query the (Erlang) process state and compare it against our model state, we can ensure that our system matches its intended specification – which is a whole lot more important than tests being green.

Zeeshan Lakhani

September 05, 2015

Transcript

  1. Modeling State Transitions with Specification-based Random Testing

  2. Talk by Zeeshan Lakhani,
    software engineer at Basho Technologies, Inc and
    founder/organizer of Papers We Love

  3. “It's really hard to explain that test infrastructure for distributed systems is actually a fantastically deep and cool problem.”
    Jay Kreps, 05/15/2015 [1]

  5. extremely painful for you

  6. The Inspiration: Dynamo [2] [3]

  8. curl "$RIAK_HOST/search/query/cufp?wt=json&q=name_s:lj_*"
    • distributed across a cluster of nodes
    • all index entries for a given riak object are co-located on the same physical machine
    • b/c of replication, not all nodes need to be queried, only a coverage set [4]

  9. “A liveness property guarantees that ‘something good eventually happens’; for example, all requests eventually receive a response.”
    Bailis, Ghodsi, “Eventual consistency today: limitations, extensions, and beyond”, Communications of the ACM #56 [5]

  10. optimistic replication and resolution [6] [7] [8]
    • “To ensure convergence, replicas must exchange information with one another about which writes they have seen”
    • This exchange is called anti-entropy
    • read-repair is an anti-entropy mechanism to repair out-of-date replicas
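To make the read-repair idea above concrete, here is a minimal sketch. It assumes each replica tags its value with a plain integer version, which is a simplification (Riak uses vector clocks), and the module and function names are hypothetical. A read picks the newest reply and reports which replicas are out of date and would receive a repair write.

```erlang
-module(read_repair_sketch).
-export([read/1]).

%% Replicas is a non-empty list of {ReplicaId, {Version, Value}}.
%% Integer versions are an assumption made for this sketch only.
%% Returns the newest {Version, Value} and the ids of stale replicas.
read(Replicas) when Replicas =/= [] ->
    Newest = {Version, _Value} = lists:max([VV || {_Id, VV} <- Replicas]),
    Stale = [Id || {Id, {V, _}} <- Replicas, V < Version],
    {Newest, Stale}.
```

Calling `read_repair_sketch:read/1` with replies from three replicas, one of which holds an older version, returns the newest value together with the single stale replica id.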

  11. but what about “cold data?”

  12. or search queries that may return inconsistent results?

  13. read-repair all the keys?

  14. riak’s AAE (active anti-entropy) [9]
    • background process using/storing merkle trees to keep divergent and missing data in sync
    • disk-based: issues with in-memory trees
    • containing billions of persistent keys: build a tree once
    • real-time updates - liveness
    • non-blocking - can’t affect incoming write-rate
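The reason hash trees keep this background sync cheap can be sketched in a few lines: keys are grouped into segments, each segment gets one hash, and only segments whose hashes differ need their keys exchanged. The sketch below is an illustration under assumptions, not Riak's hashtree module: `erlang:phash2/2` for segment placement, SHA-256 over `term_to_binary/1` for segment hashes, and all names are hypothetical.

```erlang
-module(segment_sketch).
-export([segments/2, differing/3]).

%% Group {Key, ValueHash} pairs into NumSegs segments and return a map
%% of Segment => hash of that segment's sorted contents.
segments(Pairs, NumSegs) ->
    Grouped = lists:foldl(
                fun({K, _} = P, Acc) ->
                        Seg = erlang:phash2(K, NumSegs),
                        maps:update_with(Seg, fun(Ps) -> [P | Ps] end, [P], Acc)
                end, #{}, Pairs),
    maps:map(fun(_Seg, Ps) ->
                     crypto:hash(sha256, term_to_binary(lists:sort(Ps)))
             end, Grouped).

%% Segments whose hashes differ, or that exist on one side only.
differing(PairsA, PairsB, NumSegs) ->
    A = segments(PairsA, NumSegs),
    B = segments(PairsB, NumSegs),
    Segs = lists:usort(maps:keys(A) ++ maps:keys(B)),
    [S || S <- Segs, maps:get(S, A, none) =/= maps:get(S, B, none)].
```

Two identical key sets yield no differing segments, so nothing has to be exchanged; a single changed value marks exactly the one segment that holds its key.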

  15. • riak search / yokozuna iterates over entropy data containing {bucket_type, bucket}/key and object hash
    • riak search / yokozuna trees always repair entries remote_missing... key-value data is canonical

  16. a bug in the wild
    riak_kv_vnode:actual_put:1460 vnode-kv:
    {r_object,<<"fruit_aae">>,<<"testfor spaces 342">>...
    [info] <0.5785.0>@yz_solr:get_pairs:421 yz:
    [{struct,[{<<"vsn">>,<<"2">>},
    {<<"riak_bucket_type">>,<<"default">>},
    {<<"riak_bucket_name">>,<<"fruit_aae">>},
    {<<"riak_key">>,<<"testfor spaces 342">>}...

  17. merkle trees [10]
    • a complete binary tree with an n-bit value associated with each node
    • Each internal node value is the result of a hash of the node values of its children (n is the number of bits returned by the hash function)
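A minimal Erlang sketch of the merkle tree definition above. It assumes SHA-256 as the hash function (so n = 256) and, when a level has an odd number of nodes, promotes the unpaired node unchanged; a complete binary tree as described on the slide never hits that case. The module and function names are hypothetical and this is not Riak's hashtree implementation.

```erlang
-module(merkle_sketch).
-export([root/1]).

%% Hash one leaf value. SHA-256 is an assumption; any hash would do.
leaf(Bin) when is_binary(Bin) ->
    crypto:hash(sha256, Bin).

%% Root hash of a non-empty list of leaf binaries.
root(Leaves) when Leaves =/= [] ->
    reduce([leaf(L) || L <- Leaves]).

%% Keep pairing levels until a single node, the root, is left.
reduce([Root]) -> Root;
reduce(Level) -> reduce(pair(Level)).

%% An internal node is the hash of its two children's values.
%% An unpaired node at the end of a level is promoted unchanged.
pair([A, B | Rest]) ->
    [crypto:hash(sha256, <<A/binary, B/binary>>) | pair(Rest)];
pair([A]) -> [A];
pair([]) -> [].
```

Because every internal value depends on all leaves beneath it, two replicas holding the same leaves compute the same root, and any changed leaf changes the root.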

  18. [11]

  19. [4]

  20. [4]

  21. [4]

  22. [4]

  23. [12]

  25. • Generators
    • SHRINKING
    • Controlled Randomness
    • Randomized scheduling to find race conditions (Pulse)

  26. shrinking
    %% @doc - Perform BAD_is_member action
    -spec bad_is_member(list(), non_neg_integer()) -> boolean().
    bad_is_member(S, N) ->
        lists:member(N, lists:sublist(S, 5)).

  27. shrinking

    stateful_sm:is_member([5, 0, 5, 1, 4], 3) -> false
    stateful_sm:is_member([5, 0, 5, 1, 4], 4) -> true
    stateful_sm:is_member([5, 0, 5, 1, 4], 2) -> false
    stateful_sm:add([5, 0, 5, 1, 4], 3) -> [3, 5, 0, 5, 1, 4]
    stateful_sm:is_member([3, 5, 0, 5, 1, 4], 2) -> false
    stateful_sm:add([3, 5, 0, 5, 1, 4], 0) -> [0, 3, 5, 0, 5, 1, 4]
    stateful_sm:add([0, 3, 5, 0, 5, 1, 4], 2) -> [2, 0, 3, 5, 0, 5, 1, 4]
    stateful_sm:add([2, 0, 3, 5, 0, 5, 1, 4], 0) -> [0, 2, 0, 3, 5, 0, 5, 1, 4]
    stateful_sm:is_member([0, 2, 0, 3, 5, 0, 5, 1, 4], 4) -> false
    Reason: false
    Shrinking xxxxx....x..(6 times)
    [{set,{var,1},{call,stateful_sm,add,[[5,1,4],0]}}]
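The shrinking step can be illustrated without QuickCheck. The sketch below is a hand-rolled greedy shrinker, not QuickCheck's algorithm: it only tries deleting one list element at a time and keeps any smaller input that still fails. `bad_is_member/2` is copied from the slide; `prop/1`, `shrink/2` and `delete_nth/2` are hypothetical names introduced here.

```erlang
-module(shrink_sketch).
-export([bad_is_member/2, prop/1, shrink/2]).

%% The buggy function from the slide: it only looks at the first 5 elements.
bad_is_member(S, N) ->
    lists:member(N, lists:sublist(S, 5)).

%% Property: the buggy function should agree with lists:member/2.
prop({S, N}) ->
    bad_is_member(S, N) =:= lists:member(N, S).

%% Greedy shrink: drop one element at a time while the property still fails.
%% Stops when no single deletion yields a failing input.
shrink(Prop, {S, N}) ->
    Candidates = [{delete_nth(I, S), N} || I <- lists:seq(1, length(S))],
    case [C || C <- Candidates, not Prop(C)] of
        [Smaller | _] -> shrink(Prop, Smaller);
        [] -> {S, N}
    end.

%% Remove the element at 1-based position I.
delete_nth(I, L) ->
    {Front, [_ | Back]} = lists:split(I - 1, L),
    Front ++ Back.
```

Starting from the failing input in the transcript, `shrink_sketch:shrink(fun shrink_sketch:prop/1, {[0, 2, 0, 3, 5, 0, 5, 1, 4], 4})` ends at a six-element list with `4` in the last position, the smallest shape that still exposes the five-element cutoff.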

  28. my generation
    vclock() ->
        ?LET(VclockSym, vclock_sym(), eval(VclockSym)).
    vclock_sym() ->
        ?LAZY(
           oneof([
               {call, vclock, fresh, []},
               ?LETSHRINK([Clock], [vclock_sym()],
                          {call, ?MODULE, increment,
                           [noshrink(binary(4)), nat(), Clock]})
           ])).
    not_empty(G) ->
        ?SUCHTHAT(X, G, X /= [] andalso X /= <<>>).

  29. increment(Actor, Count, Vclock) ->
        lists:foldl(
          fun vclock:increment/2,
          Vclock,
          lists:duplicate(Count, Actor)).
    riak_object() ->
        ?LET({{Bucket, Key}, Vclock, Value},
             {bkey(), vclock(), binary()},
             riak_object:set_vclock(
               riak_object:new(Bucket, Key, Value),
               Vclock)).
    bkey() ->
        {non_blank_string(),  %% bucket
         non_blank_string()}. %% key

  30. a generated riak object
    riak_object() ->
        ?LET({{Bucket, Key}, Vclock, Value},
             {bkey(), vclock(), binary()},
             riak_object:set_vclock(
               riak_object:new(Bucket, Key, Value),
               Vclock)).

  31. a generated riak object
    {r_object,<<"|plVWx&F">>,<<"?#sjiGS|">>,
    [{r_content,{dict,0,16,16,8,80,48,
    {[],[],[],[],[],[],[],[],[],[],[],[],[],[],
    [],[]},
    {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
    [],[]}}},
    <<"t+">>}],
    [],
    {dict,1,16,16,8,80,48,
    {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
    {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
    [[clean|true]],
    []}}},
    undefined}

  32. modeling state declaratively
    -include_lib("eqc/include/eqc.hrl").
    -include_lib("eqc/include/eqc_statem.hrl").
    Provides functions for testing operations with side-effects, which are specified via an abstract state machine.

  33. a few terms
    • commands - symbolic calls to run during test sequences
    • symbolic variables - generated during test generation
    • dynamic values - generated during test execution
    • next_state - operates during test generation and execution
    • preconditions - ensure a logical precedence between operations; checked during command generation and used in shrinking
    • postcondition - predicate called during test execution (that must hold) with the dynamic state before the call
    • aggregate - collects a list of values in each test, and shows the distribution of list elements
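How these terms fit together can be shown with a toy runner in plain Erlang. This is a sketch only: it does not use eqc_statem, it does no generation or shrinking, and the counter kept in the process dictionary is a made-up system under test. It only shows the order in which a precondition, the real call, the postcondition (against the model state before the call) and next_state are applied.

```erlang
-module(statem_sketch).
-export([run/1, incr/0, read/0]).

%% System under test: a counter kept in the process dictionary.
incr() ->
    put(counter, read() + 1),
    ok.

read() ->
    case get(counter) of
        undefined -> 0;
        N -> N
    end.

%% Model: an integer that mirrors what each call should do.
next_state(S, _Res, {call, _, incr, []}) -> S + 1;
next_state(S, _Res, {call, _, read, []}) -> S.

%% Every command is always allowed in this toy model.
precondition(_S, _Call) -> true.

%% Checked against the model state *before* the call.
postcondition(_S, {call, _, incr, []}, Res) -> Res =:= ok;
postcondition(S, {call, _, read, []}, Res) -> Res =:= S.

%% Run symbolic calls in order; stop at the first failed postcondition.
run(Cmds) ->
    erase(counter),
    run(Cmds, 0).

run([], _S) -> ok;
run([{call, M, F, Args} = Call | Rest], S) ->
    true = precondition(S, Call),
    Res = apply(M, F, Args),
    case postcondition(S, Call, Res) of
        true -> run(Rest, next_state(S, Res, Call));
        false -> {postcondition_failed, Call, S, Res}
    end.
```

A command list such as `[{call, statem_sketch, incr, []}, {call, statem_sketch, read, []}]` passes, because the model and the real counter both end at 1.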

  34. the model’s state record [13]
    -record(state, {yz_idx_tree,
                    kv_idx_tree,
                    yz_idx_objects = dict:new(),
                    kv_idx_objects = dict:new(),
                    trees_updated = false,
                    both = []}).
    %% Initialize State
    initial_state() ->
        #state{}.

  35. • we start 1 riak key-value idx-tree process and 1 yokozuna idx-tree process
    • this happens per vnode (each vnode is responsible for a partition)
    • each process contains multiple hashtrees due to preflist overlap
    • some nodes contain data not part of preflist, otherwise always divergent

  36. %% We want to constrain all data to fit in the last portion of our
    %% hash space so that it maps to partition 0.
    hashfun({_Bucket, _Key}=ID) ->
        %% Top valid hash value
        Top = (1 bsl 160) - 1,
        %% Calculate partition size
        PartitionSize = chash:ring_increment(?RING_SIZE),
        %% Generate an integer hash
        Hash = intify_sha(riak_core_util:chash_std_keyfun(ID)),
        %% Map the hash to 1/?RING_SIZE of the full hash space
        SmallHash = Hash rem PartitionSize,
        %% Force the hash into the last 1/?RING_SIZE block
        Top - SmallHash.
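The arithmetic in this hash function can be checked in isolation. The sketch below assumes the partition size is the 160-bit space divided evenly by a power-of-two ring size; that stands in for `chash:ring_increment/1`, whose exact definition is not shown on the slide. All names here are hypothetical.

```erlang
-module(hashspace_sketch).
-export([last_block_hash/2, block_of/2]).

%% Top of a 160-bit hash space, as on the slide.
top() -> (1 bsl 160) - 1.

%% Assumed partition size: the space split evenly into RingSize blocks.
%% RingSize is assumed to be a power of two.
partition_size(RingSize) -> (1 bsl 160) div RingSize.

%% Fold any integer hash into the last 1/RingSize block of the space.
last_block_hash(Hash, RingSize) ->
    top() - (Hash rem partition_size(RingSize)).

%% Zero-based index of the block a hash falls into.
block_of(Hash, RingSize) ->
    Hash div partition_size(RingSize).
```

With a ring size of 8, `Hash rem PartitionSize` is always below 2^157, so `Top - SmallHash` stays at or above 2^160 - 2^157 and lands in block 7, the last one, whatever the input hash was.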

  37. tree (process) operations
    • start
    • insert
    • update
    • compare

  38. insert into the key-val hashtree
    insert_kv_tree_pre(S, _Args) ->
        S#state.kv_idx_tree /= undefined.
    insert_kv_tree_command(S) ->
        {call, ?MODULE, insert_kv_tree,
         [insert_method(), eqc_util:riak_object(),
          S#state.kv_idx_tree]}.
    insert_kv_tree_next(S, _V, [_, RObj, _]) ->
        {ok, TreeData} = dict:find(?TEST_INDEX_N,
                                   S#state.kv_idx_objects),
        S#state{kv_idx_objects=dict:store(
                                 ?TEST_INDEX_N,
                                 set_treedata(RObj, TreeData),
                                 S#state.kv_idx_objects),
                trees_updated=false}.

  39. insert_kv_tree(Method, RObj, {ok, TreePid}) ->
        {Bucket, Key} = eqc_util:get_bkey_from_object(RObj),
        Items = [{object, {Bucket, Key}, RObj}],
        case Method of
            sync ->
                riak_kv_index_hashtree:insert(
                  Items, [], TreePid);
            async ->
                riak_kv_index_hashtree:async_insert(
                  Items, [], TreePid)
        end.
    insert_kv_tree_post(_S, _Args, _Res) ->
        true.

  40. insert into the search hashtree
    insert_yz_tree_command(S) ->
        {call, ?MODULE, insert_yz_tree,
         [insert_method(), eqc_util:riak_object(),
          S#state.yz_idx_tree]}.
    insert_yz_tree_next(S, _V, [_, RObj, _]) ->
        {ok, TreeData} = dict:find(?TEST_INDEX_N,
                                   S#state.yz_idx_objects),
        S#state{yz_idx_objects=dict:store(
                                 ?TEST_INDEX_N,
                                 set_treedata(RObj, TreeData),
                                 S#state.yz_idx_objects),
                trees_updated=false}.
    insert_yz_tree(Method, RObj, {ok, TreePid}) ->
        BKey = eqc_util:get_bkey_from_object(RObj),
        yz_index_hashtree:insert(
          Method, ?TEST_INDEX_N, BKey,
          yz_kv:hash_object(RObj), TreePid, []).

  41. -spec insert_both(sync|async, obj(), {ok, tree()},
                      {ok, tree()}) -> {ok, ok}.
    insert_both(Method, RObj, YZOkTree, KVOkTree) ->
        {insert_yz_tree(Method, RObj, YZOkTree),
         insert_kv_tree(Method, RObj, KVOkTree)}.

  42. [4]

  43. update the hashtrees
    update_pre(S, _Args) ->
        S#state.yz_idx_tree /= undefined andalso
            S#state.kv_idx_tree /= undefined.
    update_command(S) ->
        {call, ?MODULE, update,
         [S#state.yz_idx_tree, S#state.kv_idx_tree]}.
    update_next(S, _Value, _Args) ->
        S#state{trees_updated=true}.
    update({ok, YZTreePid}, {ok, KVTreePid}) ->
        yz_index_hashtree:update(?TEST_INDEX_N, YZTreePid),
        riak_kv_index_hashtree:update(?TEST_INDEX_N, KVTreePid),
        ok.
    update_post(S, _Args, _Res) ->
        ...
        eq(ModelKVKeyCount, RealKVKeyCount) and
            eq(ModelYZKeyCount, RealYZKeyCount).

  44. [4]

  45. compare the hashtrees
    compare_pre(S, _Args) ->
        S#state.yz_idx_tree /= undefined andalso S#state.kv_idx_tree /= undefined
            andalso S#state.trees_updated.
    compare_command(S) ->
        {call, ?MODULE, compare, [S#state.yz_idx_tree, S#state.kv_idx_tree]}.
    compare({ok, YZTreePid}, {ok, KVTreePid}) ->
        Remote =
            fun(get_bucket, {L, B}) ->
                    riak_kv_index_hashtree:exchange_bucket(?TEST_INDEX_N, L, B, KVTreePid);
               (key_hashes, Segment) ->
                    riak_kv_index_hashtree:exchange_segment(?TEST_INDEX_N, Segment, KVTreePid);
               (_, _) -> ok
            end,
        AccFun = fun(KeyDiff, Count) ->
                         lists:foldl(fun(Diff, InnerCount) ->
                                             case repair(0, Diff) of
                                                 full_repair -> InnerCount + 1;
                                                 _ -> InnerCount
                                             end
                                     end, Count, KeyDiff)
                 end,
        yz_index_hashtree:compare(?TEST_INDEX_N, Remote, AccFun, 0, YZTreePid).

  46. [4]

  47. compare_post(S, _Args, Res) ->
        YZTreeData = dict:fetch(?TEST_INDEX_N, S#state.yz_idx_objects),
        KVTreeData = dict:fetch(?TEST_INDEX_N, S#state.kv_idx_objects),
        LeftDiff = dict:fold(fun(BKey, Hash, Count) ->
                                     case dict:find(BKey, KVTreeData) of
                                         {ok, Hash} -> Count;
                                         {ok, _OtherHash} -> Count;
                                         error -> Count+1
                                     end
                             end, 0, YZTreeData),
        RightDiff = dict:fold(fun(BKey, Hash, Count) ->
                                      case dict:find(BKey, YZTreeData) of
                                          {ok, Hash} -> Count;
                                          {ok, _OtherHash} -> Count;
                                          error -> Count+1
                                      end
                              end, LeftDiff, KVTreeData),
        eq(RightDiff, Res).
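What this postcondition counts can be seen on a tiny example. The sketch below reproduces only the counting logic with plain dicts; the module and function names are hypothetical. As in the case clauses above, a key present on both sides is not counted even when its hashes differ, so the total is the number of keys missing from either side.

```erlang
-module(diff_sketch).
-export([missing_count/2, example/0]).

%% Count keys present in exactly one of the two dicts.
%% Keys present in both are skipped, even if their hashes differ.
missing_count(YZ, KV) ->
    Missing = fun(From, In) ->
                      length([K || K <- dict:fetch_keys(From),
                                   not dict:is_key(K, In)])
              end,
    Missing(YZ, KV) + Missing(KV, YZ).

%% a is missing from KV, c is missing from YZ, b exists on both sides
%% with different hashes and is therefore not counted.
example() ->
    YZ = dict:from_list([{a, 1}, {b, 2}]),
    KV = dict:from_list([{b, 3}, {c, 4}]),
    missing_count(YZ, KV).
```

Here `diff_sketch:example()` evaluates to 2: one key missing on each side.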

  48. property test
    prop_correct() ->
        ?FORALL(Cmds, commands(?MODULE, #state{}),
                aggregate(command_names(Cmds),
                          ?TRAPEXIT(begin
                                        {H, S, Res} = run_commands(?MODULE, Cmds),
                                        catch yz_index_hashtree:destroy(
                                                element(2, S#state.yz_idx_tree)),
                                        catch riak_kv_index_hashtree:destroy(
                                                element(2, S#state.kv_idx_tree)),
                                        pretty_commands(?MODULE,
                                                        Cmds,
                                                        {H, S, Res},
                                                        Res == ok)
                                    end))).

  49. property test
    .................
    OK, passed 100 tests
    22.3% {yz_index_hashtree_eqc,insert_yz_tree,3}
    20.7% {yz_index_hashtree_eqc,insert_kv_tree,3}
    18.1% {yz_index_hashtree_eqc,insert_both,4}
    17.7% {yz_index_hashtree_eqc,update,2}
    9.2% {yz_index_hashtree_eqc,start_kv_tree,0}
    8.8% {yz_index_hashtree_eqc,start_yz_tree,0}
    3.3% {yz_index_hashtree_eqc,compare,2}

  50. thanks
    John Daily
    Christopher Meiklejohn
    Sean Cribbs
    Jon Meredith
    Joe DeVivo

  51. footnotes
    1. bit.ly/1JS3nRB
    2. bit.ly/1Ne15if
    3. bit.ly/1LZ5sxj
    4. bit.ly/1f2JElM
    5. bit.ly/1JG4cLU
    6. bit.ly/1FmYpGS
    7. bit.ly/1JG4cLU
    8. bit.ly/1UqXAZW
    9. bit.ly/1JEHCAX
    10. bit.ly/1GjNLj7
    11. bit.ly/1JDqdJ8
    12. bit.ly/1EE9LLA
    13. bit.ly/1INKmwQ
