Jason Reich
August 30, 2012

A presentation on efficient testing of higher-order properties with mixed quantification.

## Transcript

Lazy SmallCheck
Jason S. Reich, Matthew Naylor, Colin Runciman
30/08/12 – IFL 2012, Oxford, UK
Wednesday, 29 August 12

6. A ‘conjectured’ property
prop_ReduceFold :: ([Bool] -> Bool) -> Property
prop_ReduceFold r = exists $ \f z ->
  forAll $ \xs ->
    foldr f z xs == r xs
“All reductions on lists of Boolean values to a single Boolean value can be expressed as a foldr.”
Existential quantifier
Functional values
Nested quantification

7. Property-based testing
| | QuickCheck | SmallCheck | Lazy SmallCheck (2008) |
| --- | --- | --- | --- |
| Strategy | Random | Bounded Exhaustive | Lazy Bounded Exhaustive |
| Demand-driven | ○ | ○ | ● |
| Functional values | ● | ● | ○ |
| Existentials | ○ | ● | ○ |
| Nested quantification | ● | ● | ○ |

9. In SmallCheck...
>>> test prop_ReduceFold
Depth 0:
Completed 4 test(s) without failure.
...
Depth 2:
Failed test no. 3. Test values follow.
[]->
True
[True]->
True
[True,True]->
True
[True,True,True]->
True
[True,True,False]->
True
[True,False]->
True

12. In SmallCheck...
[True,False,True]->
True
[True,False,False]->
True
[False]->
True
[False,True]->
True
[False,True,True]->
True
[False,True,False]->
True
[False,False]->
False
[False,False,True]->
True
[False,False,False]->
True
r = (/= [False, False])
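
One way to double-check that this r really is a counterexample is to brute-force the existential by hand. The sketch below is plain Haskell and does not use the SmallCheck API; the helper names (`allBinary`, `listsUpTo`, `expressibleAsFoldr`) are invented for illustration, and the search only covers lists up to a chosen length.

```haskell
-- All 16 functions of type Bool -> Bool -> Bool, built from their truth tables.
allBinary :: [Bool -> Bool -> Bool]
allBinary =
  [ \a b -> case (a, b) of
      (False, False) -> ff
      (False, True)  -> ft
      (True,  False) -> tf
      (True,  True)  -> tt
  | ff <- bools, ft <- bools, tf <- bools, tt <- bools ]
  where
    bools = [False, True]

-- Every list of Booleans with length 0 .. n.
listsUpTo :: Int -> [[Bool]]
listsUpTo n = concatMap (\k -> sequence (replicate k [False, True])) [0 .. n]

-- The reduction from the slide.
r :: [Bool] -> Bool
r = (/= [False, False])

-- Is there some f and z with foldr f z xs == g xs for all xs up to length n?
-- (Hypothetical helper: an eager stand-in for "exists f z. forAll xs. ...".)
expressibleAsFoldr :: Int -> ([Bool] -> Bool) -> Bool
expressibleAsFoldr n g =
  or [ all (\xs -> foldr f z xs == g xs) (listsUpTo n)
     | f <- allBinary, z <- [False, True] ]
```

In GHCi, `expressibleAsFoldr 3 r` evaluates to `False`, whereas `expressibleAsFoldr 3 and` evaluates to `True`, since `and` is `foldr (&&) True`.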

13. Lazy SmallCheck: A refresher

17. LSC?
• Lazy SmallCheck (Runciman et al., 2008).
• Check – Property-based testing library.
• Small – Exhaustive search for minimal counterexamples in a bounded test-data space.
• Lazy – The space includes partial values, and evaluation order guides the search.

18. Benefit of being lazy
prop_insertSet :: Char -> [Char] -> Property
prop_insertSet x xs = isOrdered xs
  ==> isOrdered (insert x xs)
In SC:
>>> depthCheck 7 prop_insertSet
Depth 7:
Completed 109600 test(s) without failure.
But 108576 did not meet ==> condition.

21. Benefit of being lazy
prop_insertSet :: Char -> [Char] -> Property
prop_insertSet x xs = isOrdered xs
  ==> isOrdered (insert x xs)
Lazy antecedent
In LSC 2008:
>>> depthCheck 7 prop_insertSet
OK, required 1716 tests at depth 7
1.6% of tests performed by SC

24. Benefit of being lazy
prop_insertSet :: Char -> [Char] -> Property
prop_insertSet x xs = isOrdered xs
  ==> isOrdered (insert x xs)
Lazy antecedent
• xs = (1:0:⊥) falsifies the antecedent.
• Therefore, LSC doesn’t need to test xs = [1,0], xs = [1,0,2,3], xs = [1,0,5,4], etc.
• Nor does it need to try any value of x for this class of xs.
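
The pruning argument above can be observed directly in plain Haskell. The following is a minimal sketch, not LSC code: the slides do not show `isOrdered`, so the definition here is an assumption, and `implies` is a hypothetical stand-in for a lazy `==>`. The undefined tail plays the role of ⊥.

```haskell
-- Assumed definition; the talk does not show isOrdered.
isOrdered :: Ord a => [a] -> Bool
isOrdered (x:y:zs) = x <= y && isOrdered (y:zs)
isOrdered _        = True

-- A partial list: everything after the first two elements is undefined.
partialXs :: [Int]
partialXs = 1 : 0 : undefined

-- The antecedent is decided after inspecting only the first two elements.
antecedentOnPartial :: Bool
antecedentOnPartial = isOrdered partialXs

-- Hypothetical lazy implication: the consequent is never forced
-- when the antecedent is False.
implies :: Bool -> Bool -> Bool
implies False _ = True
implies True  b = b

-- The whole property holds on the partial input without touching
-- the undefined tail or the (here undefined) consequent.
propOnPartial :: Bool
propOnPartial = isOrdered partialXs `implies` undefined
```

Because `antecedentOnPartial` is `False` without the tail being demanded, every total list that refines `1:0:⊥` is covered by this single test.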

25. New features

28. Property-based testing
| | Lazy SmallCheck (2008) | Lazy SmallCheck (2012) |
| --- | --- | --- |
| Strategy | Lazy Bounded Exhaustive | Lazy Bounded Exhaustive |
| Demand-driven | ● | ● |
| Functional values | ○ | ● |
| Existentials | ○ | ● |
| Nested quantification | ○ | ● |

+ speedup!
+ better display of counterexamples

31. In LSC 2012...
prop_ReduceFold :: ([Bool] -> Bool) -> Property
prop_ReduceFold r = exists $ \f z -> forAll $ \xs ->
  foldr f z xs == r xs
>>> test prop_ReduceFold
...
Depth 6:
Var 0: { [] -> False
       ; _:[] -> False
       ; _:_:_ -> True }
“Tests for multi-item lists.”
Wildcard patterns

36. Functional values I
>>> :{
>>| let prop_BitString p =
>>|       p [False, False, True, False, False, True]
>>|       && p [False, False, False, False, True, True]
>>|       ==> p [False, False, False, False, False, True]
>>| :}
>>> test prop_BitString
...
Depth 14:
Var 0: { _:_:False:_:False:_ -> False
       ; _:_:False:_:True:_ -> True
       ; _:_:True:_ -> True }
Partial function + Wildcard patterns = Minimal example

37. Functional values II
• LSC now generates partial functions, including wildcard patterns.
• Tries in disguise!
• Wildcards are explicit, but the partiality of functions is a result of partial values.
• Users need to implement an ‘Argument’ instance for the argument types of functional values.
• ‘deriveArgument’ does this automatically.
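
To make the "tries in disguise" remark concrete, here is a minimal sketch of a trie-shaped representation of functions on [Bool] with wildcard branches. It is an illustration only and is not LSC's actual data structure: the types `ListFn` and `HeadFn` and the function `apply` are hypothetical, and partiality is modelled explicitly with `Maybe`, whereas the slide says LSC obtains it from partial values.

```haskell
-- A function from [Bool] to r, represented as a trie over the argument.
data ListFn r
  = Any r                                -- wildcard: _ -> r
  | Split (Maybe r) (Maybe (HeadFn r))   -- [] -> ... ; (x:xs) -> ...

-- How the head of a non-empty list is inspected before the tail.
data HeadFn r
  = AnyHead (ListFn r)                          -- _ : xs
  | ByHead (Maybe (ListFn r)) (Maybe (ListFn r)) -- False : xs ; True : xs

-- Apply a trie to an argument; Nothing marks an undefined branch.
apply :: ListFn r -> [Bool] -> Maybe r
apply (Any r)        _      = Just r
apply (Split nil _)  []     = nil
apply (Split _ cons) (x:xs) = cons >>= \h -> applyHead h x xs

applyHead :: HeadFn r -> Bool -> [Bool] -> Maybe r
applyHead (AnyHead k)  _ xs = apply k xs
applyHead (ByHead f t) x xs = (if x then t else f) >>= \k -> apply k xs

-- { [] -> False ; _:[] -> False ; _:_:_ -> True }
multiItem :: ListFn Bool
multiItem =
  Split (Just False)
        (Just (AnyHead (Split (Just False)
                              (Just (AnyHead (Any True))))))

-- A partial function: only defined on lists starting with True.
startsTrue :: ListFn Bool
startsTrue = Split Nothing (Just (ByHead Nothing (Just (Any True))))
```

Here `multiItem` encodes the "tests for multi-item lists" function reported for prop_ReduceFold, and `startsTrue` shows a function that is undefined on most of its domain.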

39. Displaying counterexamples
>>> :{
>>| let prop_Foldx1 :: (Bool -> Bool -> Bool) -> [Bool]
>>|                 -> Bool
>>|     prop_Foldx1 f xs = (not . null) xs
>>|       ==> foldl1 f xs == foldr1 f xs
>>| :}
>>> test $ prop_Foldx1 $ const not
...
[False,False,False]
In LSC 2008
Displays the first totally defined counterexample.

41. Displaying counterexamples
>>> :{
>>| let prop_Foldx1 :: (Bool -> Bool -> Bool) -> [Bool]
>>|                 -> Bool
>>|     prop_Foldx1 f xs = (not . null) xs
>>|       ==> foldl1 f xs == foldr1 f xs
>>| :}
>>> test $ prop_Foldx1 $ const not
...
Var 0: _:_:False:[]
In LSC 2012
Displaying partial values
Uses “Chasing Bottoms” (Danielsson, 2004)
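
The partial counterexample can be checked by hand in plain Haskell. This sketch does not use LSC: the implication is written out with `(||)`, the wildcards are modelled as `undefined`, and the names `propFoldx1`, `partialCounterexample` and `completions` are chosen for illustration.

```haskell
-- The property from the slide, with the implication spelled out.
propFoldx1 :: (Bool -> Bool -> Bool) -> [Bool] -> Bool
propFoldx1 f xs = null xs || foldl1 f xs == foldr1 f xs

-- The LSC 2012 counterexample _:_:False:[] with wildcards as undefined.
partialCounterexample :: [Bool]
partialCounterexample = [undefined, undefined, False]

-- Every total list that refines the partial counterexample.
completions :: [[Bool]]
completions = [ [a, b, False] | a <- [False, True], b <- [False, True] ]

-- The property is falsified without ever forcing the two wildcards,
-- because 'const not' ignores its first argument.
failsOnPartial :: Bool
failsOnPartial = not (propFoldx1 (const not) partialCounterexample)

-- Consequently every completion is a counterexample too.
failsOnAllCompletions :: Bool
failsOnAllCompletions = all (not . propFoldx1 (const not)) completions
```

The totally defined counterexample `[False,False,False]` shown by LSC 2008 is one of the four `completions`; the partial form tells the reader that the first two elements are irrelevant.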

46. Quantification I
>>> :{
>>| let prop_Skolem :: (Peano -> Peano -> Bool)
>>|                 -> Property
>>|     prop_Skolem r = exists $ \f -> forAll $ \x ->
>>|       (exists $ \y -> r x y)
>>|       <=>
>>|       (r x (f x))
>>| :}
>>> :s +s
>>> depthCheck 8 prop_Skolem
LSC2: Passed in 3342802 tests.
(60.85 secs, 61941317512 bytes)
Existential quantifiers
Nested quantification
N.B. SC doesn’t even get to depth 3.

47. Quantiﬁcation II
• Lazy pruning is beneficial for existentials too.
• Nested quantification is necessary for existentials to be useful.
• Adds forAll and exists to the Property DSL.
• Required a complete rethink of the underlying structure and refutation algorithm.
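
As a rough model of what nested quantification means, the Skolem property can be written with eager quantifiers over a small finite domain. This is a sketch under stated assumptions, not the LSC implementation: `forAllIn`, `existsIn`, `functionsOn` and `skolem` are hypothetical names, Int replaces Peano, functions are enumerated as lookup tables, and there is no lazy pruning.

```haskell
-- Bounded quantifiers over an explicit finite domain.
forAllIn :: [a] -> (a -> Bool) -> Bool
forAllIn dom p = all p dom

existsIn :: [a] -> (a -> Bool) -> Bool
existsIn dom p = any p dom

-- Every function from dom to dom, represented by its table of outputs.
-- Inputs outside dom are mapped to themselves.
functionsOn :: [Int] -> [Int -> Int]
functionsOn dom =
  [ \x -> maybe x id (lookup x (zip dom outs))
  | outs <- mapM (const dom) dom ]

-- exists f. forAll x. (exists y. r x y) <=> r x (f x)
skolem :: [Int] -> (Int -> Int -> Bool) -> Bool
skolem dom r =
  existsIn (functionsOn dom) $ \f ->
    forAllIn dom $ \x ->
      existsIn dom (\y -> r x y) == r x (f x)
```

For a three-element domain this enumerates 27 candidate functions; the number grows as n^n, which gives some intuition for why eager enumeration becomes impractical at small depths.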

48. Evaluation

49. Performance
| Name | Ratio |
| --- | --- |
| Catch | 0.28 |
| Circuits1 | 1.00 |
| Circuits2 | 1.04 |
| Circuits3 | 0.56 |
| Countdown1 | 0.55 |
| Countdown2 | 1.01 |
| Huffman1 | 0.67 |
| Huffman2 | 0.59 |
| ListSet1 | 0.80 |
| Mate | 0.60 |
| RedBlack | 0.66 |
| SumPuz | 0.97 |
| Turner | 0.62 |
| Geo. Mean | 0.68 |

Ratio = LSC 2012 execution time / LSC 2008 execution time.
Ratio < 1 is improvement.
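
The summary row can be recomputed from the thirteen per-benchmark ratios. The snippet below is a small sketch; `ratios` and `geometricMean` are names chosen here, and the values are simply those listed on the slide.

```haskell
-- Per-benchmark ratios in slide order (Catch .. Turner).
ratios :: [Double]
ratios =
  [ 0.28, 1.00, 1.04, 0.56, 0.55, 1.01, 0.67
  , 0.59, 0.80, 0.60, 0.66, 0.97, 0.62 ]

-- Geometric mean via the mean of logarithms.
geometricMean :: [Double] -> Double
geometricMean xs = exp (sum (map log xs) / fromIntegral (length xs))
```

`geometricMean ratios` comes out at roughly 0.68, in line with the Geo. Mean entry. A geometric mean is the usual choice for averaging ratios, since it treats a halving and a doubling of run time symmetrically.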

50. Related work
• Koen Claessen, Shrinking and Showing
• Extends QuickCheck’s functional value capabilities.
• Uses tries (different formulation) to
• Must wrap functional values in a ‘modifier’.

52. Further work
• Looking at Claessen’s trie formulation.
• Could use SYB instead of TH for automatic instances.
• Difficult to judge the depth of a functional value.
• Parallel LSC (with JMCT)
• Naive so far. Testing on 8 cores.
• Scales well for most examples.

53. Conclusions I
• SC’s handling of functional values wasn’t entirely satisfying.
• New formulation for LSC leverages the ‘lazy’ for maximum effect.
• Displaying partial counterexample gives