Scala Italy 2018 - Random Data Generation with ScalaCheck

ScalaCheck is a well-known library for property-based testing. However, property-based testing is not always possible when side effects are involved, for example when writing an integration test that involves data being stored in a database.

When writing non-property-based tests, we often need to initialise some data and then verify some assertions on it. However, manual data generation can make our data biased and prevent us from spotting bugs in our code.

Generating our data randomly not only makes our tests less biased, but also makes them a lot more readable by highlighting which part of our data is actually relevant to each test.

In this talk, we will discuss how to reuse some of the existing ScalaCheck code to generate random instances of given types and how these can be combined to generate random case classes. We will analyse the properties of a ScalaCheck generator and provide examples of how we can manipulate existing generators to meet our needs.
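
As a rough illustration of that idea (a minimal sketch, not code taken from the talk), two existing ScalaCheck generators can be combined into a generator for a case class and then evaluated once with a fixed seed. The names `Person`, `Person.gen`, and `Sample.one` are hypothetical, and the sketch assumes a ScalaCheck version on the classpath that exposes `org.scalacheck.rng.Seed`.

```scala
import org.scalacheck.{Arbitrary, Gen}
import org.scalacheck.rng.Seed

// Hypothetical case class, used only for illustration.
final case class Person(name: String, age: Int)

object Person {
  // Two existing generators combined into a generator for the case class.
  val gen: Gen[Person] = for {
    name <- Gen.alphaStr
    age  <- Gen.choose(0, 100)
  } yield Person(name, age)

  // Placed in the companion so implicit resolution finds it without imports.
  implicit val arbitraryPerson: Arbitrary[Person] = Arbitrary(gen)
}

object Sample {
  // Hypothetical helper: evaluates the generator once with a fixed seed.
  // The result is an Option because a generator may filter out every candidate.
  def one[T](seed: Long)(implicit arb: Arbitrary[T]): Option[T] =
    arb.arbitrary.apply(Gen.Parameters.default, Seed(seed))
}
```

Calling `Sample.one[Person](42L)` twice should yield the same value, since the seed fixes the random choices made by the generator.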

Daniela Sfregola

September 15, 2018

Transcript

  1. RANDOM DATA GENERATION WITH SCALACHECK @DANIELASFREGOLA SCALA ITALY 2018

  2. HOW TO STEAL OTHER PEOPLE'S CODE AND LOOK COOL

  3. HOW TO DO OPEN SOURCE AND LOOK COOL

  5. TESTS TESTS TESTS

  6. BUT I WROTE UNIT TESTS FOR IT

  7. 100% UNIT TEST COVERAGE

  8. ALL DONE, IT WORKS!

  9. 2 UNIT TESTS - 0 INTEGRATION TESTS

  11. COMMON APPROACHES STATIC FIXTURES

  13. COMMON APPROACHES SCALACHECK

  14. SCALACHECK - PROPERTY BASED TESTING

      property("startsWith") = forAll { (a: String, b: String) =>
        (a+b).startsWith(a)
      }
      // + String.startsWith: OK, passed 100 tests.

      property("concatenate") = forAll { (a: String, b: String) =>
        (a+b).length > a.length && (a+b).length > b.length
      }
      // ! String.concat: Falsified after 0 passed tests.
      // > ARG_0: ""
      // > ARG_1: ""

  15. PROPERTY BASED TESTING - PROS
      > Test data is less biased
      > On failing, counter-example provided
      > Higher confidence that our code probably works

  16. PROPERTY BASED TESTING - CONS
      > Restructuring your tests as properties is not always immediate
      > Not always applicable with side effects
      > Configurations do affect the test result

  19. COMMON APPROACHES

  20. CAN WE COMPROMISE?

  21. CAN WE REUSE SOME OF THE SCALACHECK MAGIC?

  22. RANDOM DATA GENERATOR GITHUB.COM/DANIELASFREGOLA/RANDOM-DATA-GENERATOR

  23. RANDOM DATA GENERATOR

      case class Example(text: String, n: Int)

      val example: Example = random[Example]
      // Example(ਈ⼝ꏣᰣ∯෢ꪔ䃂ᅟ䑪⡨⿽ᵅ䎎ߐ, 73967257)

  24. RANDOM DATA GENERATOR

      case class User(name: String, surname: String)

      "create a user" {
        val user = random[User]
        Post("/users", user) ~> check {
          status === StatusCodes.Created
          assertCreation(user)
        }
      }

  25. FIX YOUR SEED

      Each session has a seed number associated

      Generating random data with seed -2481216758852790303

      Use it to debug problematic tests

      export RANDOM_DATA_GENERATOR_SEED=-2481216758852790303
      unset RANDOM_DATA_GENERATOR_SEED

  26. LESS BIASED TEST DATA

      For every session different test data will be randomly* selected

      * We can still fix the seed when needed!

  27. BUGS BUGS EVERYWHERE

  28. EASIER TO MAINTAIN

      case class User(name: String, surname: String, age: Int)

      "create a user" {
        val user = random[User]
        Post("/users", user) ~> check {
          status === Created
          assertCreated(user)
        }
      }

  29. IMPROVED READABILITY

      case class User(name: String, surname: String, age: Int)

      "reject user creation of an underage user" {
        val user = random[User].copy(age = 17)
        Post("/users", user) ~> check {
          status === BadRequest
          assertNotCreated(user)
        }
      }

  30. HOW DOES IT WORK?

  31. SCALACHECK [1]
      [1] github.com/rickynils/scalacheck

  32. SCALACHECK-SHAPELESS [2] AUTOMATICALLY INFERS ARBITRARY[T] IF:
      > T is a case class
      > T is a sealed trait
      [2] github.com/alexarchambault/scalacheck-shapeless

  33. LET'S LOOK AT THE CODE!

  34. import org.scalacheck._

      trait RandomDataGenerator extends ShapelessLike {

        private val seed = RandomDataGenerator.seed

        def random[T](implicit arb: Arbitrary[T]): T = {
          val gen: Gen[T] = arb.arbitrary
          val optT: Option[T] = gen.apply(Gen.Parameters.default, seed)
          optT.get // !!!
        }
      }

  35. WHY AN OPTION?

      Arbitrary { Gen.chooseNum(1, 100).suchThat(_ > 200) }

  36. ARBITRARY OF CUSTOM TYPE

      import java.util.Currency
      import scala.collection.JavaConversions._

      implicit val arbitraryCurrency: Arbitrary[Currency] = Arbitrary {
        Gen.oneOf(Currency.getAvailableCurrencies.toSeq)
      }

      random[Currency]
      // java.util.Currency = OMR

  37. MAKE SURE THAT THE GENERATED TEST DATA MAKES SENSE

  38. CUSTOMISE YOUR ARBITRARY

      Before:
      random[String]
      // ᭞❱᭟ⳘԺ〈ᦙ᠓ꍊꎼꙐႀ⤌惲

      After:
      /** Generates a string of alpha characters */
      implicit val arbitraryString: Arbitrary[String] = Arbitrary(Gen.alphaStr)

      random[String]
      // hqtbonxacrmvmuMpofwtasrojjnycwuoTfkrhOpli

  39. CUSTOMISE YOUR ARBITRARY

      case class Person(name: String, age: Int)

      implicit val arbitraryPerson: Arbitrary[Person] = Arbitrary {
        for {
          name <- Gen.oneOf("Daniela", "John", "Martin")
          age <- Gen.choose(0, 100)
        } yield Person(name, age)
      }

      random[Person]
      // Person(John,16)

  40. WHAT RANDOM DATA GENERATOR IS NOT FOR

  41. TYPE CLASS DERIVATION WITH SHAPELESS IMPACTS ON COMPILATION TIME

  42. MILES SABIN, TYPELEVEL SCALA REBOOTED, SCALAEXCHANGE 2016

  43. CAN WE DO BETTER?

  44. TYPE CLASS DERIVATION WITH MAGNOLIA [3] BY @PROPENSIVE
      [3] github.com/propensive/magnolia

  45. SCALACHECK-MAGNOLIA [4] BY @ETATY
      [4] github.com/etaty/scalacheck-magnolia

  46. 248% SPEED UP WITH MAGNOLIA !!!!!

  47. RANDOM DATA GENERATOR MAGNOLIA GITHUB.COM/DANIELASFREGOLA/RANDOM-DATA-GENERATOR-MAGNOLIA

  48. SHAPELESS VS MAGNOLIA

  49. WRAP UP (1)
      > A compromise between using ScalaCheck and predefined fixtures
      > Customise the data generation to your context
      > Do not ignore random test failures
      > Use it only for test purposes

  50. WRAP UP (2)
      > Magnolia is for faster type class derivation
      > Shapeless has more features
      > Open Source is awesome!

  51. "Testing shows the presence, not the absence of bugs"
      (Edsger W. Dijkstra, 1969)

  52. THANK YOU!
      > Random Data Generator: github.com/DanielaSfregola/random-data-generator
      > Random Data Generator Magnolia: github.com/DanielaSfregola/random-data-generator-magnolia
      > Twitter: @DanielaSfregola
      > Blog: danielasfregola.com