Random Data Generation with ScalaCheck - Scalar 2017

ScalaCheck is a well-known library for property-based testing. However, property-based testing is not always possible when side effects are involved, for example when writing an integration test that involves data being stored in a database. When writing non-property-based tests, we often need to initialize some data and then verify some assertions on it. However, generating data manually can bias it and prevent us from spotting bugs in our code. Generating our data randomly not only makes our tests less biased, it also makes them a lot more readable by highlighting which part of the data is actually relevant to the test.
In this talk, we will discuss how to reuse some of the existing ScalaCheck code to generate random instances of given types and how these can be combined to generate random case classes. We will analyse the properties of a ScalaCheck generator and provide examples of how we can manipulate existing generators to meet our needs.
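
As a rough illustration of the idea described above, here is a minimal sketch of drawing a single random value from a ScalaCheck generator and of combining generators into a case class. It assumes ScalaCheck is on the classpath; the names `RandomSketch`, `User` and `arbitraryUser` are hypothetical and used only for this example, and the bounded retry is a choice made for this sketch rather than something taken from the talk.

```scala
import org.scalacheck.{Arbitrary, Gen}

object RandomSketch {
  // Hypothetical helper: draws one value from the implicit Arbitrary[T].
  // Gen.sample returns an Option[T], so we retry a bounded number of times
  // in case a filtered generator yields None.
  def random[T](implicit arb: Arbitrary[T]): T =
    Iterator
      .continually(arb.arbitrary.sample)
      .take(100)
      .collectFirst { case Some(value) => value }
      .getOrElse(sys.error("generator produced no value in 100 attempts"))

  // Illustrative case class, not taken from the talk.
  final case class User(name: String, age: Int)

  // Combines two existing generators into a generator of User.
  implicit val arbitraryUser: Arbitrary[User] = Arbitrary {
    for {
      name <- Gen.alphaStr.suchThat(_.nonEmpty) // non-empty alphabetic string
      age  <- Gen.choose(0, 120)                // inclusive range
    } yield User(name, age)
  }
}
```

With these definitions in scope, `RandomSketch.random[RandomSketch.User]` should return a `User` with a non-empty alphabetic name and an age between 0 and 120.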

Daniela Sfregola

April 08, 2017

Transcript

  1. RANDOM DATA GENERATION WITH SCALACHECK @DANIELASFREGOLA SCALAR 2017

  2. TESTS TESTS TESTS

  3. None
  4. COMMON APPROACHES: STATIC FIXTURES

  5. HM, WORKED IN TESTS WHEN I POURED WATER DIRECTLY INTO DRAIN BY EMILBRONIKOWSKI
  6. COMMON APPROACHES: SCALACHECK (github.com/rickynils/scalacheck)

  7. SCALACHECK - PROPERTY BASED TESTING

    property("startsWith") = forAll { (a: String, b: String) =>
      (a+b).startsWith(a)
    }
    // + String.startsWith: OK, passed 100 tests.

    property("concatenate") = forAll { (a: String, b: String) =>
      (a+b).length > a.length && (a+b).length > b.length
    }
    // ! String.concat: Falsified after 0 passed tests.
    // > ARG_0: ""
    // > ARG_1: ""
  8. PROPERTY BASED TESTING - PROS

    > Test data is less biased
    > On failing, counter-example provided
    > Higher confidence that our code probably works
  9. PROPERTY BASED TESTING - CONS

    > Not always immediate
    > Configurations do affect the test result
    > Not always applicable with side effects
  10. None
  11. None
  12. None
  13. COMMON APPROACHES

  14. CAN WE COMPROMISE?

  15. LET'S DO IT!

    def random[T]: T = ???

  16. CAN WE REUSE SOME SCALACHECK MAGIC?

  17. Gen[T]

    package org.scalacheck

    sealed abstract class Gen[T] {
      def sample: Option[T] = apply(Gen.Parameters.default, Seed.random())
    }
  18. RANDOM DATA GENERATOR - WIP

    import org.scalacheck.Gen

    def random[T]: T = {
      val gen: Gen[T] = ???
      val optT: Option[T] = gen.sample
      optT.get
    }
  19. Arbitrary[T]

    package org.scalacheck

    sealed abstract class Arbitrary[T] {
      val arbitrary: Gen[T]
    }
  20. RANDOM DATA GENERATOR - WIP

    import org.scalacheck.{Arbitrary, Gen}

    def random[T](implicit arb: Arbitrary[T]): T = {
      val gen: Gen[T] = arb.arbitrary
      val optT: Option[T] = gen.sample
      optT.get
    }
  21. SCALACHECK-SHAPELESS (github.com/alexarchambault/scalacheck-shapeless) AUTOMATICALLY INFERS ARBITRARY[T] IF:

    > T is a case class
    > T is a sealed trait
  22. RANDOM DATA GENERATOR - DONE?

    import org.scalacheck.{Arbitrary, Gen}
    import org.scalacheck.Shapeless._

    def random[T](implicit arb: Arbitrary[T]): T = {
      val gen: Gen[T] = arb.arbitrary
      val optT: Option[T] = gen.sample
      optT.get
    }
  23. RANDOM DATA GENERATOR

    scala> random[String]
    res0: String = ح㻞ꔛᵏ⌧㈽フᲆ哃᩠ ꕸẃḷ╏䉁

    scala> random[Int]
    res1: Int = 2147483647

    scala> random[Int]
    res2: Int = -407671469

    scala> case class Bro(a: String, b: Double)
    defined class Bro

    scala> random[Bro]
    res3: Bro = Bro(ꌏ ࠞ ໐㞸㣣ߖ啾࿏奤ಧ 㞧㽡Ṥᘭ!㭾梖䅱ϩ⅖๳☥梠,-2.4322029262034435E58
  24. RANDOM DATA GENERATOR

    case class User(name: String, surname: String)

    "create a user" {
      val user = random[User]
      Post("/users", user) ~> check {
        status === StatusCodes.Created
        assertCreation(user)
      }
    }
  25. CAN WE MAKE TESTS DETERMINISTIC ON DEMAND?

  26. FIX YOUR SEED

    Each session has a seed number associated

        Generating random data with seed -2481216758852790303

    Use it to debug problematic tests

        export RANDOM_DATA_GENERATOR_SEED=-2481216758852790303
        unset RANDOM_DATA_GENERATOR_SEED
  27. Gen[T]

    package org.scalacheck

    sealed abstract class Gen[T] {
      def sample: Option[T] = apply(Gen.Parameters.default, Seed.random())
    }
  28. RANDOM DATA GENERATOR - DONE!

    import org.scalacheck.{Arbitrary, Gen}
    import org.scalacheck.rng.Seed
    import org.scalacheck.Shapeless._

    val seedNum: Long = ???

    def random[T](implicit arb: Arbitrary[T]): T = {
      val gen: Gen[T] = arb.arbitrary
      val optT: Option[T] = gen.apply(Gen.Parameters.default, Seed(seedNum))
      optT.get
    }
  29. RANDOM DATA GENERATOR GITHUB.COM/DANIELASFREGOLA/RANDOM-DATA-GENERATOR

  30. RANDOM DATA GENERATOR PROS

  31. EASIER TO MAINTAIN

    case class User(name: String, surname: String, age: Int)

    "create a user" {
      val user = random[User]
      Post("/users", user) ~> check {
        status === Created
        assertCreated(user)
      }
    }
  32. IMPROVED READABILITY

    case class User(name: String, surname: String, age: Int)

    "reject user creation of an underage user" {
      val user = random[User].copy(age = 17)
      Post("/users", user) ~> check {
        status === BadRequest
        assertNotCreated(user)
      }
    }
  33. LESS BIASED TEST DATA

    For every session different test data will be randomly* selected
    * We can still fix the seed when needed!
  34. BUGS BUGS EVERYWHERE

  35. RANDOM DATA GENERATOR LESSONS LEARNED

  36. SCALACHECK-SHAPELESS IS NOT ALWAYS ENOUGH

  37. ARBITRARY OF CUSTOM TYPE

    import java.util.Currency
    import scala.collection.JavaConversions._

    implicit val arbitraryCurrency: Arbitrary[Currency] = Arbitrary {
      Gen.oneOf(Currency.getAvailableCurrencies.toSeq)
    }

    random[Currency]
    > java.util.Currency = OMR
  38. MAKE SURE THAT THE GENERATED TEST DATA MAKES SENSE

  39. CUSTOMISE YOUR ARBITRARY

    random[String]
    > ᭞❱᭟ⳘԺ〈ᦙ᠓ꍊꎼꙐႀ⤌惲

    /** Generates a string of alpha characters */
    implicit val arb: Arbitrary[String] = Arbitrary(Gen.alphaStr)

    random[String]
    > hqtbonxacrmvmuMpofwtasrojjnycwuoTfkrhOpli
  40. CUSTOMISE YOUR ARBITRARY

    case class Person(name: String, age: Int)

    implicit val arbitraryPerson: Arbitrary[Person] = Arbitrary {
      for {
        name <- Gen.oneOf("Daniela", "John", "Martin")
        age <- Gen.choose(0, 100)
      } yield Person(name, age)
    }

    random[Person]
    > Person(John,16)
  41. SHAPELESS impacts on COMPILATION TIME

  42. MILES SABIN, TYPELEVEL SCALA REBOOTED, SCALAEXCHANGE 2016

  43. INDUCTIVE HEURISTICS TO THE RESCUE!

    Faster compilation of inductive implicits
    > Typelevel Scala [ #129 - Merged ]
    > Lightbend Scala [ #5649 - Open ]
  44. CACHING Arbitrary[T] HELPS

  45. shapeless.cachedImplicit

    import shapeless._

    object CachedArbitraryImplicits {
      implicit val arbA: Arbitrary[A] = cachedImplicit
      implicit val arbB: Arbitrary[B] = cachedImplicit
    }
  46. RANDOM DATA GENERATOR IS FOR TESTING

  47. WRAP UP

    > A compromise between test strategies
    > Customise your data generation
    > Meant for testing
    > Do not ignore random test failures
  48. "Testing shows the presence, not the absence of bugs" (Edsger W. Dijkstra, 1969)
  49. THANK YOU!

    > Random Data Generator: github.com/DanielaSfregola/random-data-generator
    > Twitter: @DanielaSfregola
    > Blog: danielasfregola.com
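
The seed-fixing idea from the slides could be sketched roughly as follows. This is not the library's actual implementation: `SeededRandomSketch` and `randomWith` are hypothetical names, and reading `RANDOM_DATA_GENERATOR_SEED` with a fallback to a fresh random seed is an assumption about how such a helper might behave. It only relies on `Gen.apply(Gen.Parameters, Seed)` and `Seed(Long)` from ScalaCheck, as shown on the slides.

```scala
import org.scalacheck.{Arbitrary, Gen}
import org.scalacheck.rng.Seed
import scala.util.Try

object SeededRandomSketch {
  // Assumption: take the seed from the environment variable named on the
  // slides if it parses as a Long, otherwise pick a fresh one per session.
  val seedNum: Long =
    sys.env
      .get("RANDOM_DATA_GENERATOR_SEED")
      .flatMap(s => Try(s.trim.toLong).toOption)
      .getOrElse(scala.util.Random.nextLong())

  // Hypothetical helper: generates a value of T from an explicit seed.
  // The same seed and the same generator always give the same value.
  def randomWith[T](seed: Long)(implicit arb: Arbitrary[T]): T =
    arb.arbitrary
      .apply(Gen.Parameters.default, Seed(seed))
      .getOrElse(sys.error(s"no value generated for seed $seed"))

  // Uses the session seed, so a failing run can be replayed by exporting it.
  def random[T](implicit arb: Arbitrary[T]): T = randomWith[T](seedNum)
}
```

Note that in this sketch every call to `random[T]` for a given `T` within one session returns the same value, because the same seed is reused each time. That keeps replaying a failure simple, at the cost of variety within a single run.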