Stateful PBT, with a game logic case study

Lou Xun
October 12, 2018

The talk I delivered at Code BEAM Lite Berlin about stateful PBT.

Transcript

  1. 2.

    Topics
    • Testing: example-based, TDD
    • Property-Based Testing
    • Stateful PBT, fixing a concurrency bug
    • Elixir
    • PropCheck (PropEr)
  2. 3.

    About Me
    • LOU Xun (楼洵)
    • Erlang since univ. |> Elixir ~4 years
    • Software Engineer @ CCP Games
    • ESI (player-facing APIs for game data)
    • Internal Tools and Pipelines
    • Chat System (ejabberd in Elixir!)
  3. 6.

    EVE Online
    • Sci-fi (spaceship!) MMO, sandbox by players
    • fleet fights!
    • large scale: 6000+ on the same battlefield
    • consequential: B-R cost $300,000+ (2014)
    • single Python process, Time Dilation (TiDi)
    • Elixir?!
  4. 12.

    Core Rules
    • Location, Item
    • Attribute
    • Relationship
    • ==> LIAR (for source of truth)
    • logical foundation of everything in space
  5. 13.

    LIAR Goals
    • Prototype to replace current impl.
    • Each Location in an Erlang Process (Actor)
    • Multicore parallelism (multi-node?)
    • faster cores ($$$) vs. more cores ($)
    • Message passing, “eventual consistency”
    • DSL, give more power to Game Design
  6. 14.

    Defined APIs
    • Relationship
    • add/remove modifiers (source -> target)
    • propagate updates (A -> B -> C)
    • DAG
    • Item, Attribute: new, get/set…
    • Location: start/stop (Actor)
  7. 15.

    TDD
    • Defined APIs make it easy to adopt
    • Incremental, iterative development
    • Focus on single feature
    • local, then remote
    • Example-based
    • Most tests we write are example-based
  8. 16.

    test "add item modifier should modify the target attribute value" do
      Liar.start_location(1)
      i2 = simple_item(2, %{1 => 10})
      i3 = simple_item(3, %{2 => 20})
      assert :ok = Liar.load_item(1, i2)
      assert :ok = Liar.load_item(1, i3)
      assert :ok = Liar.add_item_modifier(:add, {2, 1}, {3, 2})
      assert 30 == Liar.get_value({3, 2})
    end

    • Modifier carries source value
    • add source to target (both {item, attribute})
    • in this case, add 10 to 20 => 30
  9. 17.

    Flaws
    • Heavy and duplicated setup
    • 5 lines out of 7, for the first test case…
  10. 18.

    Flaws
    • Simple and static input
    • 10 + 20 = 30
  11. 19.

    Flaws
    • Needs a human to think of edge cases
    • 0? -1? inf? NaN??
  12. 22.

    property "new attribute have correct data" do
      forall {id, value} <- {integer(), float()} do
        attr = Attribute.new(id, value)
        assert Attribute.get_value(attr) == Attribute.get_base_value(attr)
      end
    end
  16. 26.

    • Generators instead of static input
    • Defines input boundary
    • Randomize input from large search spaces
    • Find (minimal) counter examples for you
    • How to define useful properties?

    property "new attribute have correct data" do
      forall {id, value} <- {integer(), float()} do
        attr = Attribute.new(id, value)
        assert Attribute.get_value(attr) == Attribute.get_base_value(attr)
      end
    end
  19. 30.

    Finding Properties
    • Modeling: simpler, inefficient impl.
    • quicksort == bubble sort
    • Partial invariant
    • list size/elements don’t change
    • Symmetric properties
    • encoder/decoder pair
    credit: Fred Hebert, propertesting.com
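
The three kinds of property can be sketched in plain Elixir without any PBT library. This is only an illustration under assumptions: `PropertySketch` and its function names are made up here, `Enum.sort/1` stands in for the "real" implementation, insertion sort plays the simple model, and `random_list/0` is a hand-rolled substitute for a proper generator (so there is no shrinking).

```elixir
defmodule PropertySketch do
  # Deliberately simple (and slow) model: insertion sort.
  def model_sort(list), do: Enum.reduce(list, [], &insert/2)

  defp insert(x, []), do: [x]
  defp insert(x, [h | t]) when x <= h, do: [x, h | t]
  defp insert(x, [h | t]), do: [h | insert(x, t)]

  # Modeling: the real implementation must agree with the simple model.
  def agrees_with_model?(list), do: Enum.sort(list) == model_sort(list)

  # Partial invariant: sorting neither adds nor drops elements.
  def keeps_elements?(list),
    do: Enum.frequencies(Enum.sort(list)) == Enum.frequencies(list)

  # Symmetric property: decoding an encoded term returns the original.
  def round_trips?(term),
    do: :erlang.binary_to_term(:erlang.term_to_binary(term)) == term

  # Hand-rolled stand-in for a generator: 0 to 19 integers in -100..100.
  def random_list do
    len = :rand.uniform(20) - 1
    Stream.repeatedly(fn -> :rand.uniform(201) - 101 end) |> Enum.take(len)
  end

  # Run all three properties against `runs` random inputs.
  def check(runs \\ 100) do
    Enum.all?(1..runs, fn _ ->
      list = random_list()
      agrees_with_model?(list) and keeps_elements?(list) and round_trips?(list)
    end)
  end
end
```

Calling `PropertySketch.check()` returns `true` when every random input satisfies all three properties. With PropCheck, the generator and the loop would be replaced by `forall`.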
  20. 31.

    One More Thi… Flaw
    • TDD: rarely cross-feature test cases!
    • load_item… unload and load again, does it work?
    • (hint: it doesn’t)
    • (hint2: no one would ever think of this)
    • Most other forms of testing as well
    • How is the system used in the real world?
    • Generator, but for user behaviours?
  21. 32.

    Stateful PBT
    • Simulate real world usage of a system
    • Model the system with an “abstract statem”
    • Generate a sequence of commands
    • Execute all the commands
    • Check result / invariants
    • or, even just running all commands can fail
  22. 33.

    Almost Stateless…

    property "Liar top level APIs" do
      forall cmds in commands(__MODULE__) do
        ...setup ...
        {history, state, result} = run_commands(__MODULE__, cmds)
        ...tear down ...
        result == :ok
        ...custom output ...
      end
    end

    commands and run_commands
    • use defined callbacks
    • represent 2 steps in stateful PBT
  23. 37.

    Library Example
    • init: {[], []} (library, user)
    • command: new_book, borrow, return
    • precondition: true, library/user have the book
    • next_state: {[A], []} -> {[], [A]}
    • postcondition: only one A exists! (invariant)
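
The library model above could look roughly like this as plain Elixir functions. It is a sketch, not wired into PropCheck: `LibraryModel`, `invariant?/1` and `run/1` are names invented here, `next_state/2` takes one argument fewer than the real callback, and commands are simple tuples rather than symbolic calls. It only exercises the abstract model; a PBT library would also generate the commands and compare the model against the real system.

```elixir
defmodule LibraryModel do
  # Abstract state: {books in the library, books held by the user}.
  def initial_state, do: {[], []}

  # precondition: is this command valid in the current model state?
  def precondition(_state, {:new_book, _book}), do: true
  def precondition({library, _user}, {:borrow, book}), do: book in library
  def precondition({_library, user}, {:return, book}), do: book in user

  # next_state: the abstract model transition.
  def next_state({library, user}, {:new_book, book}),
    do: {[book | library], user}

  def next_state({library, user}, {:borrow, book}),
    do: {List.delete(library, book), [book | user]}

  def next_state({library, user}, {:return, book}),
    do: {[book | library], List.delete(user, book)}

  # Invariant: a book exists exactly once across library and user.
  def invariant?({library, user}) do
    all = library ++ user
    all == Enum.uniq(all)
  end

  # Walk a command list through the model, stopping at the first command
  # whose precondition does not hold.
  def run(commands) do
    Enum.reduce_while(commands, {:ok, initial_state()}, fn cmd, {:ok, state} ->
      if precondition(state, cmd) do
        {:cont, {:ok, next_state(state, cmd)}}
      else
        {:halt, {:invalid, cmd}}
      end
    end)
  end
end
```

For example, `LibraryModel.run([{:new_book, :a}, {:borrow, :a}])` moves the model from `{[:a], []}` to `{[], [:a]}`, while borrowing from an empty library is rejected as invalid.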
  24. 38.

    Case Study: Concurrency Bug
    • Not live demo… (just for the look)
    • Read and use test output
    • Effectiveness vs. example-based tests
    • Tips on writing a stateful PBT
    • Inspiration for finding system property
  25. 39.

    Shrinking
    • As important as generating
    • removes inconsequential commands (noise)
    • focus on real problems
    • Tries to minimize the counter example
    • originally 27 commands…
    • shrank to 9 (1/3)
  26. 40.

    Symbolic Calls

    Commands: [
      {:set, {:var, 1}, {:call, Liar, :start_location, [9]}},
      {:set, {:var, 2}, {:call, Liar, :load_item, [9, Liar.Item<id: 92>]}},
      {:set, {:var, 7}, {:call, Liar, :start_location, [7]}},
      {:set, {:var, 14}, {:call, Liar, :load_item, [7, Liar.Item<id: 88>]}},
      {:set, {:var, 23}, {:call, Liar, :add_item_modifier, [:dr_add, {92, 1}, {88, 46}]}},
      {:set, {:var, 24}, {:call, Liar, :unload_item, [92]}},
      {:set, {:var, 25}, {:call, Liar, :unload_item, [88]}},
      {:set, {:var, 26}, {:call, Liar, :load_item, [9, Liar.Item<id: 77>]}},
      {:set, {:var, 27}, {:call, Liar, :add_item_modifier, [:dr_add, {77, 11}, {77, 9}]}}
    ]
  27. 41.

    Actual Calls

    Liar.start_location(9)
    Liar.load_item(9, Liar.Item<id: 92>)
    Liar.start_location(7)
    Liar.load_item(7, Liar.Item<id: 88>)
    Liar.add_item_modifier(:dr_add, {92, 1}, {88, 46})
    Liar.unload_item(92)
    Liar.unload_item(88)
    Liar.load_item(9, Liar.Item<id: 77>)
    Liar.add_item_modifier(:dr_add, {77, 11}, {77, 9})
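
How a symbolic command list turns into actual calls can be sketched with `apply/3`. This is an assumption-laden illustration, not PropEr's implementation: `SymbolicSketch` is a made-up module, and `String` functions stand in for `Liar`, which is not available here. Each `{:set, {:var, n}, {:call, m, f, args}}` is executed in order, and a later `{:var, n}` argument is replaced by the result stored for that variable.

```elixir
defmodule SymbolicSketch do
  # Evaluate symbolic commands in order, keeping each result under its var number.
  def run(commands) do
    Enum.reduce(commands, %{}, fn {:set, {:var, n}, {:call, mod, fun, args}}, env ->
      # Replace any {:var, k} argument with the result recorded earlier.
      real_args = Enum.map(args, &resolve(&1, env))
      Map.put(env, n, apply(mod, fun, real_args))
    end)
  end

  defp resolve({:var, k}, env), do: Map.fetch!(env, k)
  defp resolve(other, _env), do: other

  # Stand-in command list; the second call consumes the first call's result.
  def example do
    run([
      {:set, {:var, 1}, {:call, String, :upcase, ["liar"]}},
      {:set, {:var, 2}, {:call, String, :length, [{:var, 1}]}}
    ])
  end
end
```

Keeping the calls symbolic until execution is what lets the library print, replay and shrink a failing sequence as plain data.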
  28. 42.

    Actual Calls
    Auto-gen’ed later!

    === Debug Commands ===
    # item generation
    item2_814 = X.simple_item(814, ...)
    # repro steps
    Liar.start_location(44)
    Liar.load_item(44, item2_814)
    Liar.unload_item(814)
    Liar.load_item(44, item2_814)
  29. 43.

    Actual Calls
    Looks sane…
  33. 47.

    Captured Logs
    • Direct cause: trying to get attribute from nil item
    • First line: which actor crashed (“Location 7”)
    • Last line: which message it was handling when it crashed
    • “remove item modifier at target location”
    • no “remove modifier” commands…
    • must have happened during item unload!

    [error] GenServer {Liar.Runtime.LocationRegistry, 7} terminating
    ** (FunctionClauseError) ...
        (liar) lib/liar/item.ex:52: Liar.Item.get_attribute(nil, 46)
    ...
    Last message: {:"$gen_cast", {:rim_target, {92, 1}, {88, 46}}}
  35. 53.

    Observe
    • Erlang (thus Elixir) provides strong isolation
    • one crashed Actor doesn’t damage any other
    • nor the VM
    • PropEr shows us all and only the necessary steps
    • to produce and observe the failure

    Liar.load_item(9, Liar.Item<id: 77>)
    Liar.add_item_modifier(:dr_add, {77, 11}, {77, 9})
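
The isolation point can be seen in a few lines. This is a minimal sketch with invented names (`IsolationSketch`), using two unlinked `Agent` processes as stand-ins for two Locations: one is killed, the other keeps serving requests.

```elixir
defmodule IsolationSketch do
  def run do
    # Agent.start/1 (not start_link) keeps the processes unlinked from the caller.
    {:ok, survivor} = Agent.start(fn -> 0 end)
    {:ok, victim} = Agent.start(fn -> 0 end)

    ref = Process.monitor(victim)
    Process.exit(victim, :kill)

    # Wait for confirmation that the victim is gone, then use the survivor.
    receive do
      {:DOWN, ^ref, :process, _pid, reason} ->
        Agent.update(survivor, &(&1 + 1))

        %{
          victim_exit: reason,
          victim_alive?: Process.alive?(victim),
          survivor_value: Agent.get(survivor, & &1)
        }
    end
  end
end
```

This is also why the failing sequence needs the extra commands at the end: the crash of one Location stays invisible until a later command touches the state it left behind.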
  36. 54.

    Validating the Fix

    ◊ mix test test/liar_pbt_test.exs
    Excluding tags: [skip: true]

    OK: The input passed the test.
    .

    Finished in 0.1 seconds
    1 property, 0 failures

    Randomized with seed 667925
  37. 55.

    Other Bugs Revealed
    • Item still registered after unload
    • Leftover outgoing modifiers after unload
    • Wrong return format
    • Bug in dependent package
    • …
  39. 57.

    Lines of Code

                    Blank   Comment   Code
    code              238       185    927
    TDD                80         1    351
    stateful PBT       53         2    177

    (old data, PBT not complete)
    FUN!
  40. 58.

    Lines of Code

                    Blank   Comment   Code
    code              278       228   1081
    example test       86         1    370
    stateful PBT       99        10    386

    • more commands, even Process.exit!
    • refactor for readability
    • ~100 lines for debug output!
  41. 60.

    propertesting.com
    • by Fred Hebert
    • this talk is highly inspired by him
    • Free for online reading
    • Learn You Some Erlang
    • Erlang in Anger
    • The Zen of Erlang
    • and more…
  42. 62.

    Five Callbacks
    • init
    • command: control command generation
    • precondition: validate generated command
    • next_state
    • postcondition
  43. 64.

    command “filtering”
    • No locations: only generate start_location
    • No items: only start_location or load_item
    • Has items: most functions are valid
    • Has modifiers: can remove modifiers

    def command(%__MODULE__{items: items} = state) when map_size(items) == 0 do
      frequency([
        {1, {:call, Liar, :start_location, [gen_new_lid(state)]}},
        {50, {:call, Liar, :load_item, [gen_loaded_lid(state), gen_new_item(state)]}}
      ])
    end
  44. 65.

    command “filtering”
    • No locations: only generate start_location
    • No items: only start_location or load_item
    • Has items: most functions are valid
    • Has modifiers: can remove modifiers
    • Forces you to think about how flexible the system is
    • NOT used during shrinking!
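
The state-dependent filtering can be sketched without a generator library. Everything here is illustrative: `CommandSketch` is a made-up module, the state is a plain map, the weights other than 1 and 50 are invented, `:remove_item_modifier` is a hypothetical command name, and `pick/1` is a hand-rolled weighted choice standing in for `frequency`.

```elixir
defmodule CommandSketch do
  # Weighted candidates per model state, mirroring the four cases on the slide.
  def candidates(%{locations: []}), do: [{1, :start_location}]

  def candidates(%{items: items}) when map_size(items) == 0,
    do: [{1, :start_location}, {50, :load_item}]

  def candidates(%{modifiers: []}),
    do: [{1, :start_location}, {10, :load_item}, {10, :unload_item},
         {20, :add_item_modifier}]

  def candidates(_state),
    do: [{1, :start_location}, {10, :load_item}, {10, :unload_item},
         {20, :add_item_modifier}, {20, :remove_item_modifier}]

  # Pick one command name with probability proportional to its weight.
  def pick(state) do
    weighted = candidates(state)
    total = weighted |> Enum.map(&elem(&1, 0)) |> Enum.sum()
    roll = :rand.uniform(total)

    Enum.reduce_while(weighted, roll, fn {weight, cmd}, left ->
      if left <= weight, do: {:halt, cmd}, else: {:cont, left - weight}
    end)
  end
end
```

Raising or lowering a weight changes which interleavings are explored most often, which is one way to steer the test toward different bugs.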
  45. 66.

    precondition
    • Validate arguments (exist in StateM…)
    • Correct shrinking relies on this
    • WTH no locations!
    • Shrink: remove several commands, then use precondition to validate the remaining sequence

    Commands: [
      {:set, {:var, 2}, {:call, Liar, :load_item, [10, Liar.Item<id: 21>]}}
    ]
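
Why shrinking depends on preconditions can be shown with a deliberately naive shrinker. This is not how PropEr shrinks; it is a sketch with invented names (`ShrinkSketch`, `valid?/1`, `shrink/2`) and simplified command tuples. It drops one command at a time and keeps the shorter sequence only if every precondition still holds and the failure still reproduces.

```elixir
defmodule ShrinkSketch do
  # A sequence is valid when load_item only targets already started locations.
  def valid?(commands) do
    result =
      Enum.reduce_while(commands, [], fn
        {:start_location, lid}, locations ->
          {:cont, [lid | locations]}

        {:load_item, lid, _item}, locations ->
          if lid in locations, do: {:cont, locations}, else: {:halt, :invalid}
      end)

    result != :invalid
  end

  # Greedy shrink: try removing each command in turn; recurse on the first
  # shorter sequence that is still valid and still fails.
  def shrink(commands, fails?) do
    candidate =
      commands
      |> Enum.with_index()
      |> Enum.map(fn {_cmd, index} -> List.delete_at(commands, index) end)
      |> Enum.find(fn shorter -> valid?(shorter) and fails?.(shorter) end)

    if candidate, do: shrink(candidate, fails?), else: commands
  end
end
```

Suppose loading item 21 is what triggers the bug. Starting from four commands, the shrinker ends with two, and never proposes the lone `load_item` from the slide because its precondition fails:

```elixir
commands = [
  {:start_location, 7},
  {:start_location, 10},
  {:load_item, 7, 5},
  {:load_item, 10, 21}
]

fails? = fn cmds -> Enum.any?(cmds, &match?({:load_item, _, 21}, &1)) end
ShrinkSketch.shrink(commands, fails?)
# => [{:start_location, 10}, {:load_item, 10, 21}]
```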
  46. 67.

    DRY
    • Functions to list valid arguments
    • Wrap generators using ^

    def command(%__MODULE__{locations: []} = state) do
      {:call, Liar, :start_location, [gen_new_lid(state)]}
    end

    def precondition(state, {:call, Liar, :load_item, [lid, item]}),
      do: Enum.member?(loaded_lids(state), lid) && Enum.member?(new_item_ids(state), item.id)

    defp loaded_lids(state), do: state.locations
    defp gen_loaded_lid(state), do: loaded_lids(state) |> elements()
  47. 68.

    Five Callbacks
    • init
    • command
    • precondition
    • next_state: abstract model transition
    • postcondition: check result / invariant
  48. 69.

    Stateful Test
    • Don’t repeat your logic!
    • Use simple state
    • Use an inefficient algorithm
    • Check (partial) invariants
    • “only 1 book exists in library + user”
  49. 70.

    General Notes
    • “Fixing” the model (test) is normal
    • mix propcheck.clean
    • Adjust frequency to expose different bugs
    • And number / size of tests
    • especially helpful if setup is heavy
    • Nevertheless, great tools help (mix test )
  50. 71.

    LIAR Specific
    • Testing all Locations (multiple actors)
    • Requires synchronization
    • LIAR’s “consistency guarantee”
    • sync “call” after certain commands
    • reason for the observe step in case study
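
The sync "call" trick relies on a standard Erlang guarantee: messages from one process to another arrive in the order they were sent. A minimal sketch, with a made-up `LocationSketch` GenServer rather than LIAR's real Location actor: after any number of asynchronous casts, one synchronous call from the same process returns only once those casts have been handled.

```elixir
defmodule LocationSketch do
  use GenServer

  def start, do: GenServer.start(__MODULE__, 0)

  # Asynchronous update: returns before the server has handled it.
  def add(pid, n), do: GenServer.cast(pid, {:add, n})

  # Synchronous read: handled only after every earlier message from this caller.
  def value(pid), do: GenServer.call(pid, :value)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast({:add, n}, total), do: {:noreply, total + n}

  @impl true
  def handle_call(:value, _from, total), do: {:reply, total, total}
end
```

In a test, `add(pid, 10)` and `add(pid, 20)` followed by `value(pid)` yields 30 deterministically. The guarantee covers a single sender and receiver pair only, so updates that hop across several actors still need a sync point per actor.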
  51. 73.

    Stateful property tests are particularly useful when “what the code should do”, what the user perceives, is simple, but “how the code does it”, how it is implemented, is complex.
    –Fred, propertesting.com
  52. 74.

    EVE Rules
    • What is the user’s perspective on attributes?
    • Actually quite simple:
    • base_value + all modifiers -> real value
    • modifiers carry source values
    • recursively apply the same simple rule!
  53. 75.

    LIAR “Property”

    defp calculate_value(state, item_attr) do
      base_value = ( ... get base value)

      state
      |> resolve_graph()
      |> Graph.in_edges(item_attr)
      |> Enum.map(fn e ->
        {mod, _} = e.label
        {mod, calculate_value(state, e.v1)}
      end)
      |> TestModifiers.evaluate(base_value)
    end
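
A self-contained version of the same recursive rule, for readers without the graph library at hand. This is a sketch under assumptions: `ValueSketch` and its state shape (a map of base values plus a list of `{op, source, target}` modifiers) are invented here, only an `:add` modifier is modelled, and the modifier graph is assumed to be a DAG so the recursion terminates.

```elixir
defmodule ValueSketch do
  # state.attrs:     %{{item, attr} => base_value}
  # state.modifiers: [{op, source, target}], source and target are {item, attr}
  def value(%{attrs: attrs, modifiers: modifiers} = state, target) do
    base = Map.fetch!(attrs, target)

    modifiers
    |> Enum.filter(fn {_op, _source, t} -> t == target end)
    |> Enum.reduce(base, fn {op, source, _target}, acc ->
      # The modifier carries the recursively computed value of its source.
      apply_op(op, acc, value(state, source))
    end)
  end

  defp apply_op(:add, acc, source_value), do: acc + source_value

  # Chain A -> B -> C: {1, 1} feeds {2, 1}, which feeds {3, 2}.
  def example_state do
    %{
      attrs: %{{1, 1} => 5, {2, 1} => 10, {3, 2} => 20},
      modifiers: [{:add, {1, 1}, {2, 1}}, {:add, {2, 1}, {3, 2}}]
    }
  end
end
```

With `example_state/0`, `{2, 1}` evaluates to 10 + 5 = 15 and `{3, 2}` to 20 + 15 = 35. Recomputing everything on each read is far too slow for production, which is exactly what makes it a good model to test a cached implementation against.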
  54. 76.

    Inspired?
    • Think of properties from a user’s perspective
    • does it have a “simple” mental model?
    • Real impl. can’t afford the simple model
    • calculate once, store the result, propagate
    • Caching vs. recursive calculation
  55. 77.

    PBT and co.
    • Complement, not replacement
    • (example-based) TDD for dev., PBT for verification
    • Better understanding of your system & domain
    • Do require some effort to get comfortable with
    • not suitable for all problems
    • but gives a lot of satisfaction when useful :)