Random Data Generation with ScalaCheck - Scalar 2017

ScalaCheck is a well-known library for property-based testing. However, property-based testing is not always possible when side effects are involved, for example when writing an integration test that stores data in a database. When writing non-property-based tests, we often need to initialize some data and then verify some assertions on it. However, generating that data by hand can bias it and prevent us from spotting bugs in our code. Generating our data randomly not only makes our tests less biased, it also makes them a lot more readable by highlighting which part of the data is actually relevant to the test.
In this talk, we will discuss how to reuse some of the existing ScalaCheck code to generate random instances of given types and how these can be combined to generate random case classes. We will analyse the properties of a ScalaCheck generator and provide examples of how we can manipulate existing generators to meet our needs.

Daniela Sfregola

April 08, 2017

Transcript

  1. RANDOM DATA GENERATION
    WITH SCALACHECK
    @DANIELASFREGOLA
    SCALAR 2017

  2. TESTS TESTS TESTS

  3. [Image-only slide]

  4. COMMON APPROACHES
    STATIC FIXTURES

  5. HM, WORKED IN TESTS WHEN I POURED WATER
    DIRECTLY INTO DRAIN
    BY EMILBRONIKOWSKI

  6. COMMON APPROACHES
    SCALACHECK 1
    1 github.com/rickynils/scalacheck

  7. SCALACHECK - PROPERTY BASED TESTING
    property("startsWith") = forAll { (a: String, b: String) =>
      (a + b).startsWith(a)
    }
    // + String.startsWith: OK, passed 100 tests.
    property("concatenate") = forAll { (a: String, b: String) =>
      (a + b).length > a.length && (a + b).length > b.length
    }
    // ! String.concatenate: Falsified after 0 passed tests.
    // > ARG_0: ""
    // > ARG_1: ""
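
The second property fails because the empty string is a counter-example: concatenating two empty strings gives a result that is no longer than either input. A minimal sketch of a version that holds, assuming ScalaCheck is on the classpath. The enclosing object name `StringSpecification` is an assumption, since the slide does not show where the properties are defined. The only change is relaxing `>` to `>=`:

```scala
import org.scalacheck.Properties
import org.scalacheck.Prop.forAll

// Hypothetical enclosing object; the "String" prefix matches the slide's output.
object StringSpecification extends Properties("String") {

  property("startsWith") = forAll { (a: String, b: String) =>
    (a + b).startsWith(a)
  }

  // Concatenation never makes a string shorter, so >= holds even for "".
  property("concatenate") = forAll { (a: String, b: String) =>
    (a + b).length >= a.length && (a + b).length >= b.length
  }
}
```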

  8. PROPERTY BASED TESTING - PROS
    > Test data is less biased
    > On failing,
    counter-example provided
    > Higher confidence that
    our code probably works

  9. PROPERTY BASED TESTING - CONS
    > Not always immediate
    > Configurations do affect
    the test result
    > Not always applicable
    with side effects

  10. [Image-only slide]

  11. [Image-only slide]

  12. [Image-only slide]

  13. COMMON APPROACHES

  14. CAN WE COMPROMISE?

  15. LET'S DO IT!
    def random[T]: T = ???

  16. CAN WE REUSE SOME
    SCALACHECK
    MAGIC?

  17. Gen[T]
    package org.scalacheck
    sealed abstract class Gen[T] {
      def sample: Option[T] =
        apply(Gen.Parameters.default, Seed.random())
    }

  18. RANDOM DATA GENERATOR - WIP
    import org.scalacheck.Gen
    def random[T]: T = {
      val gen: Gen[T] = ???
      val optT: Option[T] = gen.sample
      optT.get
    }

  19. Arbitrary[T]
    package org.scalacheck
    sealed abstract class Arbitrary[T] {
      val arbitrary: Gen[T]
    }

  20. RANDOM DATA GENERATOR - WIP
    import org.scalacheck.{Arbitrary, Gen}
    def random[T](implicit arb: Arbitrary[T]): T = {
      val gen: Gen[T] = arb.arbitrary
      val optT: Option[T] = gen.sample
      optT.get
    }
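
`gen.sample` returns an `Option[T]` because a generator can fail to produce a value, for example when a `suchThat` filter rejects the candidate. Calling `.get` directly can therefore throw an uninformative `NoSuchElementException`. A minimal sketch of a variant that retries a few times and then fails with a clearer message, assuming ScalaCheck is on the classpath. The names `RetryingRandom`, `randomOrFail` and `maxAttempts` are hypothetical and not part of ScalaCheck or the talk:

```scala
import org.scalacheck.{Arbitrary, Gen}
import scala.annotation.tailrec

object RetryingRandom {

  /** Samples the implicit generator up to `maxAttempts` times (hypothetical helper). */
  def randomOrFail[T](maxAttempts: Int = 100)(implicit arb: Arbitrary[T]): T = {
    val gen: Gen[T] = arb.arbitrary

    @tailrec
    def loop(attemptsLeft: Int): T =
      gen.sample match {
        case Some(value)              => value
        case None if attemptsLeft > 1 => loop(attemptsLeft - 1)
        case None =>
          throw new IllegalStateException(
            s"Generator produced no value after $maxAttempts attempts")
      }

    loop(maxAttempts)
  }
}
```

With this shape a call site looks like `RetryingRandom.randomOrFail[Int]()`; the extra empty parameter list is the price of making the retry count configurable.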

  21. SCALACHECK-SHAPELESS 2
    AUTOMATICALLY INFERS ARBITRARY[T] IF:
    > T is a case class
    > T is a sealed trait
    2 github.com/alexarchambault/scalacheck-shapeless

  22. RANDOM DATA GENERATOR - DONE?
    import org.scalacheck.{Arbitrary, Gen}
    import org.scalacheck.Shapeless._
    def random[T](implicit arb: Arbitrary[T]): T = {
      val gen: Gen[T] = arb.arbitrary
      val optT: Option[T] = gen.sample
      optT.get
    }

  23. RANDOM DATA GENERATOR
    scala> random[String]
    res0: String = ح㻞ꔛᵏ⌧㈽フᲆ哃᩠ ꕸẃḷ╏䉁
    scala> random[Int]
    res1: Int = 2147483647
    scala> random[Int]
    res2: Int = -407671469
    scala> case class Bro(a: String, b: Double)
    defined class Bro
    scala> random[Bro]
    res3: Bro = Bro(ꌏ
    ࠞ ໐㞸㣣ߖ啾࿏奤ಧ 㞧㽡Ṥᘭ!㭾梖䅱ϩ⅖๳☥梠,-2.4322029262034435E58

  24. RANDOM DATA GENERATOR
    case class User(name: String, surname: String)
    "create a user" in {
      val user = random[User]
      Post("/users", user) ~> check {
        status === StatusCodes.Created
        assertCreation(user)
      }
    }

  25. CAN WE MAKE TESTS
    DETERMINISTIC
    ON DEMAND?

  26. FIX YOUR SEED
    Each session has a seed number associated
    Generating random data with seed -2481216758852790303
    Use it to debug problematic tests
    export RANDOM_DATA_GENERATOR_SEED=-2481216758852790303
    unset RANDOM_DATA_GENERATOR_SEED
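
One way to wire this up, sketched with the Scala standard library only: read the seed from the environment when it is set, otherwise pick a fresh one, and always log it so that a failing run can be replayed. The object `SeedResolution` and the helper `resolveSeed` are hypothetical; the actual library may resolve and report its seed differently.

```scala
import scala.util.{Random, Try}

object SeedResolution {
  // Environment variable name as shown on the slide.
  val SeedVariable = "RANDOM_DATA_GENERATOR_SEED"

  /** Uses the seed from `env` when present and numeric, otherwise a fresh random one. */
  def resolveSeed(env: Map[String, String] = sys.env): Long = {
    val seed = env
      .get(SeedVariable)
      .flatMap(raw => Try(raw.trim.toLong).toOption)
      .getOrElse(Random.nextLong())
    // Log the seed on every run so a failure can be reproduced later.
    println(s"Generating random data with seed $seed")
    seed
  }
}
```

Passing the environment as a parameter keeps the helper easy to exercise in a test without touching the real process environment.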

  27. Gen[T]
    package org.scalacheck
    sealed abstract class Gen[T] {
      def sample: Option[T] =
        apply(Gen.Parameters.default, Seed.random())
    }

  28. RANDOM DATA GENERATOR - DONE!
    import org.scalacheck.{Arbitrary, Gen}
    import org.scalacheck.rng.Seed
    import org.scalacheck.Shapeless._
    val seedNum: Long = ???
    def random[T](implicit arb: Arbitrary[T]): T = {
      val gen: Gen[T] = arb.arbitrary
      val optT: Option[T] = gen.apply(Gen.Parameters.default, Seed(seedNum))
      optT.get
    }
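
Replacing `Seed.random()` with a fixed seed makes generation reproducible: the same seed and parameters yield the same value. A small sketch that checks this, assuming ScalaCheck 1.13 or later is on the classpath. `SeededRandom` and `randomWithSeed` are hypothetical names, and this version returns the `Option` rather than calling `.get`:

```scala
import org.scalacheck.{Arbitrary, Gen}
import org.scalacheck.rng.Seed

object SeededRandom {

  /** Runs the implicit generator for `T` from an explicit seed (hypothetical helper). */
  def randomWithSeed[T](seedNum: Long)(implicit arb: Arbitrary[T]): Option[T] =
    arb.arbitrary.apply(Gen.Parameters.default, Seed(seedNum))
}

object SeededRandomDemo extends App {
  val first  = SeededRandom.randomWithSeed[String](42L)
  val second = SeededRandom.randomWithSeed[String](42L)
  // Same seed, same parameters: both runs must agree.
  assert(first == second)
}
```

One consequence worth keeping in mind: a function that always starts from the same `Seed(seedNum)` returns the same value on every call for a given type. Producing different values across calls within one session would additionally require advancing the seed between calls.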

  29. RANDOM DATA
    GENERATOR
    GITHUB.COM/DANIELASFREGOLA/RANDOM-DATA-GENERATOR

  30. RANDOM DATA GENERATOR
    PROS

  31. EASIER TO MAINTAIN
    case class User(name: String, surname: String, age: Int)
    "create a user" in {
      val user = random[User]
      Post("/users", user) ~> check {
        status === Created
        assertCreated(user)
      }
    }

  32. IMPROVED READABILITY
    case class User(name: String, surname: String, age: Int)
    "reject user creation of an underage user" in {
      val user = random[User].copy(age = 17)
      Post("/users", user) ~> check {
        status === BadRequest
        assertNotCreated(user)
      }
    }

  33. LESS BIASED TEST DATA
    For every session
    different test data
    will be randomly* selected
    * We can still fix the seed when needed!

  34. BUGS
    BUGS EVERYWHERE

  35. RANDOM DATA GENERATOR
    LESSONS LEARNED

  36. SCALACHECK-SHAPELESS
    IS NOT ALWAYS
    ENOUGH

  37. ARBITRARY OF CUSTOM TYPE
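    import java.util.Currency
    import org.scalacheck.{Arbitrary, Gen}
    // JavaConversions is deprecated; convert explicitly with asScala instead
    import scala.collection.JavaConverters._
    implicit val arbitraryCurrency: Arbitrary[Currency] =
      Arbitrary {
        Gen.oneOf(Currency.getAvailableCurrencies.asScala.toSeq)
      }
    random[Currency]
    > java.util.Currency = OMR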

  38. MAKE SURE THAT
    THE GENERATED TEST DATA
    MAKES SENSE

  39. CUSTOMISE YOUR ARBITRARY
    random[String]
    > ᭞❱᭟ⳘԺ〈ᦙ᠓ꍊꎼꙐႀ⤌惲
    /** Generates a string of alpha characters */
    implicit val arb: Arbitrary[String] = Arbitrary(Gen.alphaStr)
    random[String]
    > hqtbonxacrmvmuMpofwtasrojjnycwuoTfkrhOpli
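
`Gen.alphaStr` can still return an empty string, and its length depends on the generator's size parameter. When a test needs, say, a non-empty name of bounded length, the generator can be tightened further. A minimal sketch, assuming ScalaCheck is on the classpath; the bounds 1 to 20 and the names `NameArbitrary`, `nameGen` and `arbName` are illustrative choices, not part of the talk:

```scala
import org.scalacheck.{Arbitrary, Gen}

object NameArbitrary {

  /** Non-empty strings of 1 to 20 letters. */
  val nameGen: Gen[String] =
    for {
      length <- Gen.choose(1, 20)
      chars  <- Gen.listOfN(length, Gen.alphaChar)
    } yield chars.mkString

  // Bringing this into scope overrides the default Arbitrary[String].
  implicit val arbName: Arbitrary[String] = Arbitrary(nameGen)
}
```

After `import NameArbitrary._`, a call such as `random[String]` would pick up `arbName` in place of the default instance, in the same way as the `Gen.alphaStr` override on the slide.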

  40. CUSTOMISE YOUR ARBITRARY
    case class Person(name: String, age: Int)
    implicit val arbitraryPerson: Arbitrary[Person] =
      Arbitrary {
        for {
          name <- Gen.oneOf("Daniela", "John", "Martin")
          age  <- Gen.choose(0, 100)
        } yield Person(name, age)
      }
    random[Person]
    > Person(John,16)

  41. SHAPELESS
    impacts on
    COMPILATION TIME

  42. MILES SABIN, TYPELEVEL SCALA REBOOTED, SCALAEXCHANGE 2016

  43. INDUCTIVE HEURISTICS TO THE RESCUE!
    Faster compilation of inductive implicits
    > Typelevel Scala [ #129 - Merged ]
    > Lightbend Scala [ #5649 - Open ]

  44. CACHING
    Arbitrary[T]
    HELPS

  45. shapeless.cachedImplicit
    import shapeless._
    object CachedArbitraryImplicits {
      implicit val arbA: Arbitrary[A] = cachedImplicit
      implicit val arbB: Arbitrary[B] = cachedImplicit
    }

  46. RANDOM DATA GENERATOR
    IS FOR TESTING

  47. WRAP UP
    > A compromise between test strategies
    > Customise your data generation
    > Meant for testing
    > Do not ignore random test failures

  48. Testing shows the presence,
    not the absence of bugs
    — Edsger W. Dijkstra, 1969

  49. THANK YOU!
    > Random Data Generator:
    github.com/DanielaSfregola/random-data-generator
    > Twitter: @DanielaSfregola
    > Blog: danielasfregola.com
