The Best Test Data is Random Test Data

Introduction to property-based testing, DevConf.cz 2015.

Fraser Tweedale

February 07, 2015

Transcript

  1. The Best Test Data is Random Test Data An introduction

    to property-based testing Fraser Tweedale Red Hat, Inc. @hackuador February 7, 2015
  2. About me Developer at Red Hat FreeIPA identity management and

    Dogtag PKI Mostly Python and Java at work Mostly Haskell for other projects
  3. This talk Introduce property-based testing; motivate with examples Concepts will

    be demonstrated in Haskell using QuickCheck A brief look at property-based testing in other languages Discussion of limitations Alternative approaches
  4. Property-based testing A property-based testing framework: 1. Gives you a

    way to state properties of functions 2. Gives you a way to declare how to generate arbitrary values of your types 3. Provides generators for standard types (usually) 4. Attempts to falsify your properties and reports counterexamples.
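
The four parts listed on this slide can be sketched in a few lines of plain Python. This is a minimal illustration using only the standard library, not QuickCheck or pyqcy; the names `check`, `gen_int_list` and `prop_rev_app_wrong` are hypothetical and chosen for this sketch.

```python
import random


def rev(xs):
    """Function under test: reverse a list."""
    return list(reversed(xs))


def gen_int_list(rng, max_len=10):
    """Generator: an arbitrary list of small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]


def check(prop, gens, runs=100, seed=0):
    """Try to falsify prop; return a counterexample tuple, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        args = tuple(gen(rng) for gen in gens)
        if not prop(*args):
            return args
    return None


def prop_rev_app(xs, ys):
    """Property: reversing a concatenation swaps the reversed parts."""
    return rev(xs + ys) == rev(ys) + rev(xs)


def prop_rev_app_wrong(xs, ys):
    """Deliberately wrong variant: forgets to swap the operands."""
    return rev(xs + ys) == rev(xs) + rev(ys)


if __name__ == "__main__":
    print(check(prop_rev_app, [gen_int_list, gen_int_list]))
    print(check(prop_rev_app_wrong, [gen_int_list, gen_int_list]))
```

Run as a script, the first call should report no counterexample and the second should print a pair of lists that falsifies the wrong property. Real frameworks also shrink counterexamples to a minimal form, which this sketch does not attempt.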
  5. Applications Check laws and invariants of algorithms, data, abstractions Check

    code against a model implementation Properties are meaningful documentation
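
Checking code against a model implementation might look like the following sketch. It assumes a hand-written `insertion_sort` as the code under test and uses Python's built-in `sorted` as the model; both the function and the helper `run_model_check` are illustrative names, not part of the talk.

```python
import random


def insertion_sort(xs):
    """Implementation under test: a hand-written insertion sort."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left past every element greater than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def prop_matches_model(xs):
    """Property: the implementation agrees with the model, sorted()."""
    return insertion_sort(xs) == sorted(xs)


def run_model_check(runs=200, seed=1):
    """Return a list on which the two disagree, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if not prop_matches_model(xs):
            return xs
    return None
```

The property needs no knowledge of how sorting works; it only states that two implementations must agree on every input.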
  6. Reversing a list

        rev :: [a] -> [a]
        rev []     = []
        rev (x:xs) = rev xs ++ [x]

        prop_RevUnit :: Int -> Bool
        prop_RevUnit x = rev [x] == [x]

        prop_RevApp :: [Int] -> [Int] -> Bool
        prop_RevApp xs ys = rev (xs ++ ys) == rev ys ++ rev xs
  7. Exhaustion Naïve use of preconditions resulting in not enough

    test cases Solution: custom generator to ensure precondition satisfied Better solution: redesign data types such that precondition is invariant
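
The exhaustion problem can be illustrated with a property whose precondition is "the list is sorted". The sketch below is plain Python with hypothetical helper names; it compares filtering random lists by the precondition against a custom generator that only produces sorted lists.

```python
import bisect
import random


def is_sorted(xs):
    return all(a <= b for a, b in zip(xs, xs[1:]))


def prop_insert_keeps_sorted(x, xs):
    """Property: inserting into a sorted list yields a sorted list."""
    ys = list(xs)
    bisect.insort(ys, x)
    return is_sorted(ys)


def run_with_precondition(runs=100, seed=0):
    """Naive approach: generate any list, discard unsorted ones."""
    rng = random.Random(seed)
    tested = 0
    for _ in range(runs):
        xs = [rng.randint(0, 100) for _ in range(rng.randint(0, 10))]
        if not is_sorted(xs):
            continue  # precondition failed, test case discarded
        assert prop_insert_keeps_sorted(rng.randint(0, 100), xs)
        tested += 1
    return tested


def run_with_custom_generator(runs=100, seed=0):
    """Custom generator: every generated list satisfies the precondition."""
    rng = random.Random(seed)
    tested = 0
    for _ in range(runs):
        xs = sorted(rng.randint(0, 100) for _ in range(rng.randint(0, 10)))
        assert prop_insert_keeps_sorted(rng.randint(0, 100), xs)
        tested += 1
    return tested
```

With the naive approach most generated lists are discarded, and the ones that survive are mostly very short; the custom generator exercises the property on every run.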
  8. Trivial test data Trivial test data can result in tests

    passing vacuously. Use collect or cover to inspect distribution Use frequency to govern distribution
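
As a rough analogue of `collect` and `frequency`, the following plain-Python sketch labels each test case and compares two generators. The helper names and the "length below 2 is trivial" rule are assumptions made for this illustration.

```python
import random
from collections import Counter


def rev(xs):
    return list(reversed(xs))


def gen_uniform_len(rng):
    """Length uniform in 0..3, so about half of all cases have length < 2."""
    return [rng.randint(0, 9) for _ in range(rng.randint(0, 3))]


def gen_weighted_len(rng):
    """Weighted choice of length, similar in spirit to frequency."""
    n = rng.choices([0, 1, 5, 20], weights=[1, 1, 4, 4])[0]
    return [rng.randint(0, 9) for _ in range(n)]


def collect_labels(gen, runs=1000, seed=0):
    """Check rev (rev xs) == xs and tally a label per case, like collect."""
    rng = random.Random(seed)
    labels = Counter()
    for _ in range(runs):
        xs = gen(rng)
        assert rev(rev(xs)) == xs
        labels["trivial" if len(xs) < 2 else "non-trivial"] += 1
    return labels
```

Comparing `collect_labels(gen_uniform_len)` with `collect_labels(gen_weighted_len)` shows how much of the test budget each generator spends on lists that cannot expose an ordering bug.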
  9. Infinite data structures Useful, but be careful what you evaluate

    Use sized when defining generators for recursive data
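
The idea behind `sized` is to pass a size budget into a recursive generator so that generation terminates. A minimal sketch in plain Python, with a hypothetical tuple-based tree encoding:

```python
import random


def gen_tree(rng, size):
    """Generate a binary tree as nested tuples; size bounds the recursion."""
    if size <= 0 or rng.random() < 0.3:
        return ("leaf", rng.randint(0, 9))
    # Halve the budget for each child so generation always terminates.
    return ("node", gen_tree(rng, size // 2), gen_tree(rng, size // 2))


def depth(tree):
    """Depth of a tree: 0 for a leaf, 1 + deepest child for a node."""
    if tree[0] == "leaf":
        return 0
    return 1 + max(depth(tree[1]), depth(tree[2]))
```

Without the size argument, a generator that recurses with fixed probability can produce very large trees or fail to terminate in practice; halving the budget keeps the depth logarithmic in the starting size.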
  10. Property-based testing implementations Most languages have at least one implementation

    Incomplete list: https://en.wikipedia.org/wiki/QuickCheck Some decent or popular implementations are missing Python: pyqcy Java: Functional Java (fj.test)
  11. Python example

        from pyqcy import *

        def rev(l):
            return list(reversed(l))

        @qc
        def prop_rev_unit(x=int_()):
            assert rev([x]) == [x]

        @qc
        def prop_rev_app(xs=list_(of=int), ys=list_(of=int)):
            assert rev(xs + ys) == rev(ys) + rev(xs)

        if __name__ == '__main__':
            main()
  12. Java example

        import static org.junit.Assert.*;

        import java.util.ArrayList;

        import org.junit.contrib.theories.*;
        import org.junit.runner.RunWith;
        import com.pholser.junit.quickcheck.ForAll;

        @RunWith(Theories.class)
        public class RevTestCase {
            // next slide
        }
  13. Java example

        // rev and app are list helpers not shown on the slides.
        @Theory
        public void revUnit(@ForAll Integer x) {
            ArrayList<Integer> xs = new ArrayList<>();
            xs.add(x);
            assertEquals(rev(xs), xs);
        }

        @Theory
        public void revApp(
            @ForAll ArrayList<Integer> xs,
            @ForAll ArrayList<Integer> ys
        ) {
            assertEquals(
                rev(app(xs, ys)),
                app(rev(ys), rev(xs))
            );
        }
  14. Randomness

        prop_verify_eq :: Password -> Bool
        prop_verify_eq s = verify (hash s) s

        prop_verify_neq :: Password -> Password -> Property
        prop_verify_neq s s' = not (s == s') ==> not (verify (hash s) s')
  15. Randomness Previous slide: what if hash truncates input before hashing?

    Some bugs are unlikely to be found with random data Workaround: mutate or fuzz data in domain-relevant way
  16. Randomness

        fuzz :: Password -> Gen Password
        fuzz = {- truncation / extension / permutation / etc -}

        prop_verify_fuzzed :: Password -> Property
        prop_verify_fuzzed s = forAll (fuzz s) (prop_verify_neq s)
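
To make the truncation scenario concrete, here is a plain-Python sketch. The buggy `bad_hash`, which silently truncates its input to 8 characters, is a hypothetical stand-in for the bug described on the slides, and the fuzzing strategy shown (extension by one character) is only one of the mutations the slide lists.

```python
import hashlib
import random
import string


def bad_hash(password):
    """Hypothetical buggy hash: truncates input to 8 characters."""
    return hashlib.sha256(password[:8].encode()).hexdigest()


def verify(hashed, password):
    return hashed == bad_hash(password)


def gen_password(rng):
    n = rng.randint(1, 16)
    return "".join(rng.choice(string.ascii_letters) for _ in range(n))


def independent(rng, password):
    """Second password drawn independently of the first."""
    return gen_password(rng)


def fuzz(rng, password):
    """Domain-relevant mutation: extend the password by one character."""
    return password + rng.choice(string.ascii_letters)


def find_counterexample(make_other, runs=200, seed=0):
    """Look for s != t where t still verifies against the hash of s."""
    rng = random.Random(seed)
    for _ in range(runs):
        s = gen_password(rng)
        t = make_other(rng, s)
        if s != t and verify(bad_hash(s), t):
            return (s, t)
    return None
```

Two independently generated passwords almost never share their first 8 characters, so `find_counterexample(independent)` is very unlikely to expose the bug, while `find_counterexample(fuzz)` finds it as soon as a password of 8 or more characters is generated.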
  17. Failure cases Arbitrary is great for generating random valid data

    How to specify behaviour given invalid data?
  18. Failure cases

        dump :: JSON -> String
        load :: String -> Maybe JSON

        prop_dumpLoad :: JSON -> Bool
        prop_dumpLoad a = load (dump a) == Just a

        loadSpec :: Spec
        loadSpec =
          describe "load" $
            it "fails on bogus input" $
              load "bogus" `shouldBe` Nothing
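
The same split between a round-trip property for valid data and an example-based test for invalid data can be sketched with Python's standard `json` module. The wrapper functions and the generator `gen_json` are assumptions for this illustration; `load` returns a 1-tuple on success so that a parsed `null` can be told apart from a failure, mirroring `Just` and `Nothing`.

```python
import json
import random


def dump(value):
    return json.dumps(value)


def load(text):
    """Return (value,) on success, or None on invalid input."""
    try:
        return (json.loads(text),)
    except json.JSONDecodeError:
        return None


def gen_json(rng, size=3):
    """Generate an arbitrary JSON-compatible value; size bounds nesting."""
    scalars = [None, True, False, rng.randint(-1000, 1000),
               "s" + str(rng.randint(0, 99))]
    if size <= 0 or rng.random() < 0.5:
        return rng.choice(scalars)
    if rng.random() < 0.5:
        return [gen_json(rng, size - 1) for _ in range(rng.randint(0, 3))]
    return {"k" + str(i): gen_json(rng, size - 1)
            for i in range(rng.randint(0, 3))}


def prop_dump_load(value):
    """Round-trip property for valid data."""
    return load(dump(value)) == (value,)


def check_dump_load(runs=200, seed=0):
    """Return a value that fails the round trip, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = gen_json(rng)
        if not prop_dump_load(value):
            return value
    return None
```

The failure case stays a plain example: `load("bogus")` is expected to return `None`. Floats are left out of the generator on purpose to keep the equality check simple.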
  19. Conclusion Property-based testing is true automated testing More thorough testing

    in less time ($$$) Relieves developer of burden of finding and manually writing tests for corner cases Properties are meaningful documentation The best test data is random test data, but... a bit of domain-specific non-randomness is sometimes useful; examples still have their place.
  20. Exhaustive testing The best test data is all of the

    data Check that property holds for all values Supports existential properties Available in several languages SmallCheck (Haskell), smallcheck4scala, autocheck (C++), ocamlcheck, python-doublecheck
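
Exhaustive testing over a bounded space can be sketched with `itertools.product`. This is a plain-Python illustration of the idea rather than the API of SmallCheck or any of the listed tools; `all_lists`, `rev_app_holds_for_all` and `exists` are hypothetical names.

```python
from itertools import product


def rev(xs):
    return list(reversed(xs))


def all_lists(alphabet, max_len):
    """Yield every list over alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for combo in product(alphabet, repeat=n):
            yield list(combo)


def rev_app_holds_for_all(alphabet, max_len):
    """Universal check: the property holds for every pair in the space."""
    return all(
        rev(xs + ys) == rev(ys) + rev(xs)
        for xs in all_lists(alphabet, max_len)
        for ys in all_lists(alphabet, max_len)
    )


def exists(pred, values):
    """Existential check: some value in the bounded space satisfies pred."""
    return any(pred(v) for v in values)
```

Because the space is enumerated completely, a pass is a guarantee up to the chosen bound, and existential properties such as "some list of length 2 equals its own reverse" can be checked directly with `exists`.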
  21. Proof The best test data is no test data Some

    languages have theorem-proving capabilities Properties become theorems; no proof, no program Program extraction to other languages Completeness proofs rev example: http://is.gd/EhanO1
  22. Resources QuickCheck: A Lightweight Tool for Random Testing of Haskell

    Programs (2000) Koen Claessen, John Hughes: http://is.gd/mpsY7G Automated Unit Testing your Java using ScalaCheck by Tony Morris: http://is.gd/j0R7qq UCSD CSE 230 lecture: http://is.gd/0YfxOr QuickCheck: Beyond the Basics by Dave Laing: http://is.gd/pGKnhg Recommended Haskell learning path: https://github.com/bitemyapp/learnhaskell
  23. Thanks for listening Copyright 2015 Fraser Tweedale This work is

    licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Feedback http://devconf.cz/f/72 Slides https://github.com/frasertweedale/talks/ Email [email protected] Twitter @hackuador