Jason Reich
August 30, 2012

A presentation on efficient testing of higher-order properties with mixed quantification.

Transcript

1. Advances in Lazy SmallCheck
Jason S. Reich, Matthew Naylor, Colin Runciman
30/08/12 – IFL 2012, Oxford, UK
Wednesday, 29 August 12
6. A ‘conjectured’ property

    prop_ReduceFold :: ([Bool] -> Bool) -> Property
    prop_ReduceFold r = exists $ \f z -> forAll $ \xs -> foldr f z xs == r xs

“All reductions on lists of Boolean values to a single Boolean value can be expressed as a foldr.”

Callouts: existential quantifier, functional values, nested quantification.
7. Property-based testing

                        QuickCheck   SmallCheck           Lazy SmallCheck (2008)
Strategy                Random       Bounded Exhaustive   Lazy Bounded Exhaustive
Demand-driven           ◦            ◦                    •
Functional values       •            •                    ◦
Existentials            ◦            •                    ◦
Nested quantification   •            •                    ◦
9. In SmallCheck...

    >>> test prop_ReduceFold
    Depth 0:
      Completed 4 test(s) without failure.
    ...
    Depth 2:
      Failed test no. 3. Test values follow.
      []-> True
      [True]-> True
      [True,True]-> True
      [True,True,True]-> True
      [True,True,False]-> True
      [True,False]-> True
12. In SmallCheck...

      ... True
      [True,False]-> True
      [True,False,True]-> True
      [True,False,False]-> True
      [False]-> True
      [False,True]-> True
      [False,True,True]-> True
      [False,True,False]-> True
      [False,False]-> False
      [False,False,True]-> True
      [False,False,False]-> True

    r = (/= [False, False])
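
The counterexample r = (/= [False, False]) can be double-checked by brute force, since there are only 16 total functions of type Bool -> Bool -> Bool and two choices of z. The following is a minimal sketch in plain Haskell, not the SmallCheck or Lazy SmallCheck implementation. The names binaryOps, boolLists and foldrWitnesses are hypothetical helpers introduced here, and the search only considers total, fully defined functions over lists up to a chosen length.

```haskell
import Control.Monad (replicateM)

-- All 16 total functions Bool -> Bool -> Bool, one per 4-entry truth table.
-- (Hypothetical helper, not part of any testing library.)
binaryOps :: [Bool -> Bool -> Bool]
binaryOps =
  [ \a b -> case (a, b) of
      (False, False) -> ff
      (False, True)  -> ft
      (True,  False) -> tf
      (True,  True)  -> tt
  | [ff, ft, tf, tt] <- replicateM 4 [False, True] ]

-- Every list of Booleans with length at most n.
boolLists :: Int -> [[Bool]]
boolLists n = concatMap (\k -> replicateM k [False, True]) [0 .. n]

-- Pairs (index of f in binaryOps, z) such that foldr f z agrees with r
-- on every list of length at most n.
foldrWitnesses :: Int -> ([Bool] -> Bool) -> [(Int, Bool)]
foldrWitnesses n r =
  [ (i, z)
  | (i, f) <- zip [0 ..] binaryOps
  , z <- [False, True]
  , all (\xs -> foldr f z xs == r xs) (boolLists n) ]
```

Under these assumptions, foldrWitnesses 3 (/= [False, False]) finds no witness: r [] forces z = True, r [False] forces f False True = True, and then foldr f z [False, False] = f False True = True, which disagrees with r. By contrast, foldrWitnesses 3 and is non-empty, because and is foldr (&&) True.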

17. LSC?
• Lazy SmallCheck (Runciman et al., 2008).
• Check – Property-based testing library.
• Small – Exhaustive search for minimal counterexamples in bounded test-data space.
• Lazy – Space includes partial values and evaluation order guides search.
18. Benefit of being lazy

    prop_insertSet :: Char -> [Char] -> Property
    prop_insertSet x xs = isOrdered xs ==> isOrdered (insert x xs)

In SC:

    >>> depthCheck 7 prop_insertSet
    Depth 7:
      Completed 109600 test(s) without failure.
      But 108576 did not meet ==> condition.
21. Benefit of being lazy

    prop_insertSet :: Char -> [Char] -> Property
    prop_insertSet x xs = isOrdered xs ==> isOrdered (insert x xs)

Callout: lazy antecedent.

In LSC 2008:

    >>> depthCheck 7 prop_insertSet
    OK, required 1716 tests at depth 7

1.6% of tests performed by SC.
24. Benefit of being lazy

    prop_insertSet :: Char -> [Char] -> Property
    prop_insertSet x xs = isOrdered xs ==> isOrdered (insert x xs)

Callout: lazy antecedent.

• xs = (1:0:⊥) falsifies the antecedent.
• Therefore, LSC doesn’t need to test xs = [1,0], xs = [1,0,2,3], xs = [1,0,5,4], etc.
• Or even any value of x for this class of xs.
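
The pruning argument rests on the antecedent returning False before the undefined tail is ever inspected. A small sketch of that behaviour follows. The slides do not show the definition of isOrdered, so the one below is an assumed, conventional definition, and Int is used instead of Char to match the numeric example; partialList and antecedentOnPartial are hypothetical names.

```haskell
-- Assumed definition; the talk does not show how isOrdered is written.
isOrdered :: Ord a => [a] -> Bool
isOrdered (x : y : rest) = x <= y && isOrdered (y : rest)
isOrdered _              = True

-- The partial value 1:0:⊥ from the slide, with ⊥ modelled by undefined.
partialList :: [Int]
partialList = 1 : 0 : undefined

-- Evaluates to False without touching the undefined tail:
-- 1 <= 0 is False and (&&) never demands its second argument.
antecedentOnPartial :: Bool
antecedentOnPartial = isOrdered partialList
```

Because the result is already False for the partial value, every total list that refines it (and every choice of x) would give the same outcome, which is why none of them needs to be generated.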

28. Property-based testing

                        Lazy SmallCheck (2008)    Lazy SmallCheck (2012)
Strategy                Lazy Bounded Exhaustive   Lazy Bounded Exhaustive
Demand-driven           •                         •
Functional values       ◦                         •
Existentials            ◦                         •
Nested quantification   ◦                         •

+ speedup!
+ better display of counterexamples
31. In LSC 2012...

    prop_ReduceFold :: ([Bool] -> Bool) -> Property
    prop_ReduceFold r = exists $ \f z -> forAll $ \xs -> foldr f z xs == r xs

    >>> test prop_ReduceFold
    ...
    Depth 6:
      Var 0: { [] -> False
             ; _:[] -> False
             ; _:_:_ -> True }

“Tests for multi-item lists.”

Callout: wildcard patterns.
36. Functional values I

    >>> :{
    >>| let prop_BitString p =
    >>|       p [False, False, True, False, False, True]
    >>|    && p [False, False, False, False, True, True]
    >>|   ==> p [False, False, False, False, False, True]
    >>| :}
    >>> test prop_BitString
    ...
    Depth 14:
      Var 0: { _:_:False:_:False:_ -> False
             ; _:_:False:_:True:_ -> True
             ; _:_:True:_ -> True }

Partial function + Wildcard patterns = Minimal example
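
The reported functional value can be transcribed by hand into an ordinary Haskell function to confirm that it falsifies prop_BitString. This is only a sketch: implies is a Bool-valued stand-in for the library's ==>, and counterexampleP and propBitString are hypothetical names. Inputs outside the three reported patterns are left as an error, mirroring the partiality of the displayed function.

```haskell
-- Bool-valued implication, standing in for the library's (==>).
implies :: Bool -> Bool -> Bool
implies a b = not a || b

-- Hand transcription of the counterexample shown at depth 14.
counterexampleP :: [Bool] -> Bool
counterexampleP (_ : _ : False : _ : False : _) = False
counterexampleP (_ : _ : False : _ : True : _)  = True
counterexampleP (_ : _ : True : _)              = True
counterexampleP _ = error "counterexampleP: input outside the reported patterns"

-- The property from the slide, restated with the Bool-valued implication.
propBitString :: ([Bool] -> Bool) -> Bool
propBitString p =
  (p [False, False, True, False, False, True]
     && p [False, False, False, False, True, True])
    `implies` p [False, False, False, False, False, True]
```

The first two bit strings match the patterns ending in True and the third matches the pattern ending in False, so the antecedent holds while the consequent fails. Only the third and fifth list elements are ever inspected, which is what the wildcards express.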
37. Functional values II
• LSC now generates partial functions including wildcard patterns.
• Tries in disguise!
• Wildcards explicit but partiality of functions is a result of partial values.
• Users need to implement ‘Argument’ instance for functional value argument types.
• ‘deriveArgument’ does this automatically using Template Haskell.
39. Displaying counterexamples

In LSC 2008:

    >>> :{
    >>| let prop_Foldx1 :: (Bool -> Bool -> Bool) -> [Bool]
    >>|                 -> Bool
    >>|     prop_Foldx1 f xs = (not.null) xs
    >>|       ==> foldl1 f xs == foldr1 f xs
    >>| :}
    >>> test $ prop_Foldx1 $ const not
    ...
    [False,False,False]

Displays the first totally defined counterexample.
41. Displaying counterexamples

In LSC 2012:

    >>> :{
    >>| let prop_Foldx1 :: (Bool -> Bool -> Bool) -> [Bool]
    >>|                 -> Bool
    >>|     prop_Foldx1 f xs = (not.null) xs
    >>|       ==> foldl1 f xs == foldr1 f xs
    >>| :}
    >>> test $ prop_Foldx1 $ const not
    ...
    Var 0: _:_:False:[]

Displaying partial values gives more information!
Uses “Chasing Bottoms” (Danielsson, 2004).
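
The partial counterexample _:_:False:[] says that the first two elements are never inspected. A short sketch in plain Haskell, with undefined standing in for the wildcards, shows the two folds disagreeing without forcing them. The names partialCounterexample, leftResult and rightResult are hypothetical.

```haskell
-- The partial value _:_:False:[] with the wildcards modelled by undefined.
partialCounterexample :: [Bool]
partialCounterexample = [undefined, undefined, False]

-- foldl1 (const not) [a, b, False] = const not (const not a b) False = not False
leftResult :: Bool
leftResult = foldl1 (const not) partialCounterexample

-- foldr1 (const not) [a, b, False] = const not a (const not b False) = not (not False)
rightResult :: Bool
rightResult = foldr1 (const not) partialCounterexample
```

Here leftResult is True and rightResult is False, so the property fails for every choice of the first two elements, which is more than the single total list [False,False,False] conveys.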
46. Quantification I

    >>> :{
    >>| let prop_Skolem :: (Peano -> Peano -> Bool)
    >>|                 -> Property
    >>|     prop_Skolem r = exists $ \f -> forAll $ \x ->
    >>|       (exists $ \y -> r x y)
    >>|       <=>
    >>|       (r x (f x))
    >>| :}
    >>> :s +s
    >>> depthCheck 8 prop_Skolem
    LSC2: Passed in 3342802 tests.
    (60.85 secs, 61941317512 bytes)

Callouts: existential quantifiers, nested quantification.

n.b. Don’t even get to depth 3 in SC.
47. Quantification II
• Lazy pruning is beneficial for existentials too.
• Nested quantification necessary for existentials to be useful.
• Adds forAll and exists to Property DSL.
• Required a complete rethink of underlying structure and refutation algorithm.

49. Performance

Name         Ratio
Catch        0.28
Circuits1    1.00
Circuits2    1.04
Circuits3    0.56
Countdown1   0.55
Countdown2   1.01
Huffman1     0.67
Huffman2     0.59
ListSet1     0.80
Mate         0.60
RedBlack     0.66
SumPuz       0.97
Turner       0.62
Geo. Mean    0.68

Ratio = LSC2012 execution time / LSC2008 execution time. Ratio < 1 is improvement.
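
The geometric mean row can be reproduced from the thirteen per-benchmark ratios. A minimal sketch follows; ratios and geometricMean are hypothetical names, and the figures are the rounded values from the slide, so the result only matches to the precision shown there.

```haskell
-- Per-benchmark ratios as printed on the slide (Catch .. Turner).
ratios :: [Double]
ratios =
  [ 0.28, 1.00, 1.04, 0.56, 0.55, 1.01, 0.67
  , 0.59, 0.80, 0.60, 0.66, 0.97, 0.62 ]

-- Geometric mean computed in log space: exp of the mean of the logarithms.
geometricMean :: [Double] -> Double
geometricMean xs = exp (sum (map log xs) / fromIntegral (length xs))
```

With these inputs, geometricMean ratios rounds to 0.68, in agreement with the table.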
50. Related work
• Koen Claessen, Shrinking and Showing Functions (Functional Pearl), Haskell 2012.
• Extends QuickCheck’s functional value capabilities.
• Uses tries (different formulation) to provide additional features.
• Must wrap functional values in a ‘modifier’.
52. Further work
• Looking at Claessen’s trie formulation.
• Could use SYB instead of TH for automatic instances.
• Difficult to judge the depth of a functional value.
• Parallel LSC (with JMCT)
• Naive so far. Testing on 8 cores.
• Scales well for most examples.
53. Conclusions I
• SC’s handling of functional values wasn’t entirely satisfying.
• New formulation for LSC leverages the ‘lazy’ for maximum effect.
• Displaying a partial counterexample gives more information than a totally defined one.
54. Conclusions II
• Existentials also benefit from laziness.
• Making things more complicated can strangely make them faster?
• Broader range of functionality, looking for interesting applications.