(EuroClojure) Generative Testing: Properties, State and Beyond

Slides from a talk I delivered on 25.06.2015 at EuroClojure in Barcelona.

The talk discusses how property-based testing compares to traditional testing methods and demonstrates its principles on simple examples. We will also see how it fits into a TDD workflow. To bust the myth that property-based testing is inapplicable in real-world settings, we'll bring up some use cases from industry. Afterwards we'll move on from immutable, static properties to a more dynamic setting and see how tools in the Clojure ecosystem allow us to validate stateful computations by generating test scenarios. Finally, we'll wander into the world of concurrency and the automated detection of race conditions.

Video: https://www.youtube.com/watch?v=D2TCuDXmyw4

Jan Stępień

June 24, 2015

Transcript

  1. (hola barcelona)

  2. Generative Testing Properties, State and Beyond Jan Stępień @janstepien jan@stepien.cc

    The photo of Barcelona is © Michele Ursino 2013, flickr.com/photos/micurs/9954407993
  3. © Giuseppe Milo 2015, flic.kr/p/urkX1S

  4. lein new project

     (ns project.core-test
       (:require [clojure.test :refer :all]
                 [project.core :refer :all]))

     (deftest a-test
       (testing "FIXME, I fail."
         (is (= 0 1))))
  5. Why?

     user=> (= 0 1)
     false

  6. Properties

  7. /

  8. (deftest test-division
       (is (= 5 (/ 15 3)))
       (is (= 0.5 (/ 1.0 2)))
       (is (thrown? ArithmeticException (/ 1 0))))
  9. (deftest test-division
       (are [a b c] (= (/ a b) c)
         15  3 5
         1.0 2 0.5))
  10. $\frac{a}{b} = \frac{a}{b}$

      $\frac{a}{b} \cdot b = \frac{a}{b} \cdot b$

      $\frac{a}{b} \cdot b = a$

      (= (* (/ a b) b) a)
  11. (deftest test-division
        (are [a b] (= (* (/ a b) b) a)
          15  3
          1.0 2))
  12. (deftest test-division
        (let [a (arbitrary-number)
              b (arbitrary-number)]
          (is (= (* (/ a b) b) a))))
  13. QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs

      Koen Claessen, Chalmers University of Technology, koen@cs.chalmers.se; John Hughes, Chalmers University of Technology, rjmh@cs.chalmers.se

      ABSTRACT QuickCheck is a tool which aids the Haskell programmer in formulating and testing properties of programs. Properties are described as Haskell functions, and can be automatically tested on random input, but it is also possible to define custom test data generators. We present a number of case studies, in which the tool was successfully used, and also point out some pitfalls to avoid. Random testing is especially suitable for functional programs because properties can be stated at a fine grain. When a function is built from separately tested components, then random testing suffices to obtain good coverage of the definition under test.

      1. INTRODUCTION Testing is by far the most commonly used approach to ensuring software quality. It is also very labour intensive, accounting for up to 50% of the cost of software development. Despite anecdotal evidence that functional programs require somewhat less testing (`Once it type-checks, it usually works'), in practice it is still a major part of functional program development. The cost of testing motivates efforts to automate it, wholly

      [...] monad are hard to test), and so testing can be done at a fine grain. A testing tool must be able to determine whether a test is passed or failed; the human tester must supply an automatically checkable criterion of doing so. We have chosen to use formal specifications for this purpose. We have designed a simple domain-specific language of testable specifications which the tester uses to define expected properties of the functions under test. QuickCheck then checks that the properties hold in a large number of cases. The specification language is embedded in Haskell using the class system. Properties are normally written in the same module as the functions they test, where they serve also as checkable documentation of the behaviour of the code.

      A testing tool must also be able to generate test cases automatically. We have chosen the simplest method, random testing [11], which competes surprisingly favourably with systematic methods in practice. However, it is meaningless to talk about random testing without discussing the distribution of test data. Random testing is most effective when the distribution of test data follows that of actual data, but when testing reuseable code units as opposed to whole systems this is not possible, since the distribution of actual data in all subsequent reuses is not known. A uniform dis-
  14. [org.clojure/test.check "0.6.2"]

  15. (require '[clojure.test.check :as tc])
      (require '[clojure.test.check [generators :as gen]
                                    [properties :as prop]])

      (def prop-division
        (prop/for-all [a gen/int
                       b gen/int]
          (= (* (/ a b) b) a)))

      ;; notice that it's ∀a ∈ ℤ ∀b ∈ ℤ (a/b · b = a)

      (tc/quick-check 1000 prop-division)
  16. {:result #<ArithmeticException: Divide by zero>,
       :seed 1420978137717,
       :failing-size 0,
       :num-tests 1,
       :fail [0 0],
       :shrunk {:total-nodes-visited 0,
                :depth 0,
                :result #<ArithmeticException: Divide by zero>,
                :smallest [0 0]}}
  17. (def gen-non-zero-int
        (gen/such-that #(not= 0 %) gen/int))

      (gen/sample gen/int)
      #_=> (0 0 -1 -2 -2 0 1 -4 -1 0)

      (gen/sample gen-non-zero-int)
      #_=> (-2 1 1 2 4 -4 -5 1 3 -7)
  18. (def prop-division
        (prop/for-all [a gen/int
                       b gen-non-zero-int]
          (= (* (/ a b) b) a)))

      (tc/quick-check 1000 prop-division)
      #_=> {:result true, :num-tests 1000, :seed 1420978581409}
  19. (def gen-double
        (gen/fmap double gen/int))

      (gen/sample gen-double)
      #_=> (0.0 0.0 0.0 1.0 -1.0 -1.0 2.0 -4.0 7.0 3.0)

      (def gen-non-zero-double
        (gen/such-that #(not= 0.0 %) gen-double))
  20. (def prop-division
        (prop/for-all [a gen-double
                       b gen-non-zero-double]
          (= (* (/ a b) b) a)))

      (tc/quick-check 1000 prop-division)
      #_=> {:result false
            :seed 1421584282754
            :failing-size 25
            :num-tests 26
            :fail [23.0 21.0]
            :shrunk {:total-nodes-visited 9
                     :depth 0
                     :result false
                     :smallest [23.0 21.0]}}
  21. (= (* (/ 23.0 21.0) 21.0) 23.0)
      #_=> false

      (* (/ 23.0 21.0) 21.0)
      #_=> 23.000000000000004
  22. reverse

      (= (comp reverse reverse) identity)

      (= (reverse (reverse coll)) coll)
  23. (ns project.core-test)

      (defn reverse [coll] ())

      (require '[clojure.test.check.clojure-test :refer [defspec]])

      (defspec prop-reverse-composed
        (prop/for-all [coll (gen/list gen/int)]
          (= coll (reverse (reverse coll)))))
  24. (clojure.test/run-tests 'project.core-test)

      {:result false,
       :seed 1420987857457,
       :failing-size 1,
       :num-tests 2,
       :fail [(0)],
       :shrunk {:total-nodes-visited 1
                :depth 0
                :result false
                :smallest [(0)]}}
  25. (clojure.test/run-tests 'project.core-test)

      {:result false,
       :seed 1420981522879,
       :failing-size 2,
       :num-tests 3,
       :fail [(2 1)],
       :shrunk {:total-nodes-visited 4
                :depth 2
                :result false
                :smallest [(0)]}}
  26. www.stylefruits.de/hosen/lee/farbe-hellblau/40-70-euro/seite-2

  27. (path->descriptor ctx path) (descriptor->path ctx desc)

  28. (prop/for-all [descr gen-descriptor
                     ctx gen-context]
        (= descr
           (->> descr
                (descriptor->path ctx)
                (path->descriptor ctx))))
  29. “Testing the Hard Stuff and Staying Sane” by John Hughes

  30. “Jepsen IV: Hope Springs Eternal” by Kyle Kingsbury

  31. State

  32. Testing Telecoms Software with Quviq QuickCheck

      Thomas Arts, IT University of Göteborg, Gothenburg, Sweden and Quviq AB, thomas.arts@ituniv.se; John Hughes, Chalmers University, Gothenburg, Sweden and Quviq AB, rjmh@cs.chalmers.se; Joakim Johansson and Ulf Wiger, Ericsson AB, Älvsjö, Sweden, joakim.l.johansson@ericsson.com, ulf.wiger@ericsson.com

      Abstract We present a case study in which a novel testing tool, Quviq QuickCheck, is used to test an industrial implementation of the Megaco protocol. We considered positive and negative testing and we used our developed specification to test an old version in order to estimate how useful QuickCheck could potentially be when used early in development. The results of the case study indicate that, by using Quviq QuickCheck, we would have been able to detect faults early in the development. We detected faults that had not been detected by other testing techniques. We found unclarities in the specifications and potential faults when the software is used in a different setting. The results are considered promising enough to Ericsson that they are investing in an even larger case study, this time from the beginning of the development of a new product.

      Categories and Subject Descriptors D.2.5 [Software Engineering]: Testing and Debugging—Testing tools; D.2.4 [Software Engineering]: Software/Program Verification—Formal methods

      General Terms Verification

      Keywords Test Automation, Property Based Testing

      1. Introduction [...] it potentially reduce testing time? Would it find obscure bugs, and help to improve final product quality? Our study is small and qualitative, but it suggests that the answer to all of these questions is a resounding “Yes”. The rest of the paper is structured as follows. In section 2 we give an introduction to Quviq QuickCheck, and in section 3 we explain our case study, giving some background on the software testing methods already used at Ericsson. In section 4 we explain our approach in detail, with samples of the testing code and a description of the extensions we made in parallel to QuickCheck, to better support this kind of testing. In 5 we present the results we obtained, and in 8 we draw conclusions.

      2. Quviq QuickCheck Quviq QuickCheck is a property-based testing tool, developed from Claessen and Hughes’ earlier QuickCheck tool for Haskell [3] and a re-design for Erlang [2]. Apart from adaption to an Erlang setting, Quviq QuickCheck includes a number of extensions, of which the most significant is an ability to simplify failing test cases automatically. Quviq QuickCheck is a product of Quviq AB. A user of QuickCheck writes properties that are expected to hold, as Erlang source code making use of the QuickCheck API. For example, one property of the standard list reversal function is
  33. Modus Operandi
      ▶ Generate actions changing the state
      ▶ Apply each action to the state, if possible
      ▶ After each application verify the model
  34. Operations on Vectors
      ▶ conj
      ▶ pop

  35. (ns operations-on-vectors
        (:require [clojure.test.check [clojure-test :refer [defspec]]
                                      [generators :as gen]]
                  [states.core :as states]))

      (defspec prop-vec-ops
        (states/run-commands commands next-step postcondition
                             {:init-state {:vec []}}))
  36. (defn commands [{:keys [vec elems]}]
        (let [conj-gen (gen/tuple (gen/return 'conj)
                                  (gen/return vec)
                                  gen/int)]
          (if (seq elems)
            (gen/one-of [conj-gen
                         (gen/return ['pop vec])])
            conj-gen)))

      (gen/sample (commands {:vec 'a, :elems [1 2 3]}))
      #_=> ([conj a 0] [pop a] [pop a] [conj a -3] ...)
  37. (defn next-step [state var [fn _ elem]]
        (case fn
          conj (-> state
                   (update-in [:elems] conj elem)
                   (assoc :vec var))
          pop  (-> state
                   (update-in [:elems] rest)
                   (assoc :vec var))))
  38. (defn postcondition [{:keys [elems]} [fn & _] val]
        (case fn
          pop (= val elems)
          true))
  39. (clojure.test/run-tests 'operations-on-vectors)
      #_=> {:result #<Postcondition unsatisfied {:elems (-2 7 -1) :vec var-2}>,
            :seed 1421594608384,
            :failing-size 9,
            :num-tests 10,
            :fail [((set var-0 (conj [] -1))
                    (set var-1 (conj var-0 7))
                    (set var-2 (conj var-1 -2))
                    (set var-3 (pop var-2)))],
            :shrunk {...}}
  40. (defn postcondition [{:keys [elems]} [fn & _] val]
        (case fn
          pop (= val elems)
          true))
  41. (defn postcondition [{:keys [elems]} [fn & _] val]
        (case fn
          pop (= val (reverse elems))
          true))
  42. (clojure.test/run-tests 'operations-on-vectors)

      Testing operations-on-vectors
      {:result true, :num-tests 100, :seed 14215224}

      Ran 1 tests containing 1 assertions.
      0 failures, 0 errors.
  43. github.com/michalmarczyk/ctries.clj (esp. the ctries.clj-test namespace)

  44. Beyond

  45. “Testing the Hard Stuff and Staying Sane” John Hughes

  46. “Generative testing with clojure.test.check” Philip Potter

  47. github.com/jstepien/states
      github.com/czan/stateful-check
      github.com/krestenkrab/triq
      github.com/manopapad/proper
      basho.co.jp/quickchecking-poolboy-for-fun-and-profit
      yellerapp.com/posts/2015-04-13-effective-test-check.html
      mostlyerlang.com/2014/05/04/quickcheck

  48. Generative Testing Properties, State and Beyond Jan Stępień @janstepien jan@stepien.cc
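
The "Modus Operandi" from slide 33 (generate actions, apply each one if possible, verify the model after each application) can be sketched with plain test.check, without the states library used on the slides. This is only an illustration under assumptions: the names gen-command, step, agrees? and run-commands are hypothetical, and the model is a simple list kept next to the real vector rather than the symbolic variables shown in the talk.

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; Hypothetical command format: [:conj n] or [:pop].
(def gen-command
  (gen/one-of [(gen/tuple (gen/return :conj) gen/int)
               (gen/return [:pop])]))

;; Apply one command to both the real vector and a list-based model
;; (newest element first). Returns nil when the command does not
;; apply, i.e. a pop on an empty vector.
(defn step [{:keys [real model]} [op x]]
  (case op
    :conj {:real (conj real x) :model (cons x model)}
    :pop  (when (seq model)
            {:real (pop real) :model (rest model)})))

;; The vector must hold the model's elements in insertion order.
(defn agrees? [{:keys [real model]}]
  (= (seq real) (seq (reverse model))))

;; Run the commands one by one, skipping inapplicable ones and
;; checking the model after every application.
(defn run-commands [cmds]
  (loop [state {:real [] :model ()}
         cmds  cmds]
    (cond
      (not (agrees? state)) false
      (empty? cmds)         true
      :else (recur (or (step state (first cmds)) state)
                   (rest cmds)))))

(def prop-vector-matches-model
  (prop/for-all [cmds (gen/vector gen-command)]
    (run-commands cmds)))

(tc/quick-check 100 prop-vector-matches-model)
```

Unlike the commands function on slide 36, this sketch generates commands without looking at the current state and simply skips a pop that cannot be applied. That keeps the generator trivial, at the cost of wasting some generated steps.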