Scala Italy 2018 - Random Data Generation with ScalaCheck

ScalaCheck is a well-known library for property-based testing. However, property-based testing is not always possible when side effects are involved, for example when writing an integration test that stores data in a database.

When writing non-property-based tests, we often need to initialise some data and then verify some assertions on it. However, generating that data by hand can bias it and prevent us from spotting bugs in our code.

Generating our data randomly not only makes our tests less biased, it also makes them a lot more readable by highlighting which part of the data is actually relevant to each test.

In this talk, we will discuss how to reuse some of the existing ScalaCheck code to generate random instances of given types and how these can be combined to generate random case classes. We will analyse the properties of a ScalaCheck generator and provide examples of how we can manipulate existing generators to meet our needs.

Daniela Sfregola

September 15, 2018

Transcript

  1. RANDOM DATA GENERATION
    WITH SCALACHECK
    @DANIELASFREGOLA
    SCALA ITALY 2018

  2. HOW TO
    STEAL OTHER PEOPLE'S CODE
    AND LOOK COOL

  3. HOW TO
    DO OPEN SOURCE
    AND LOOK COOL

  5. TESTS TESTS TESTS

  6. BUT I WROTE UNIT TESTS FOR IT

  7. 100% UNIT TEST COVERAGE

  8. ALL DONE, IT WORKS!

  9. 2 UNIT TESTS - 0 INTEGRATION TESTS

  11. COMMON APPROACHES
    STATIC FIXTURES

  13. COMMON APPROACHES
    SCALACHECK

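  14. SCALACHECK - PROPERTY BASED TESTING
    import org.scalacheck.Properties
    import org.scalacheck.Prop.forAll
    // wrapper object added so the snippet compiles; its name is illustrative
    object StringSpecification extends Properties("String") {
      property("startsWith") = forAll { (a: String, b: String) =>
        (a + b).startsWith(a)
      }
      // + String.startsWith: OK, passed 100 tests.
      property("concatenate") = forAll { (a: String, b: String) =>
        // fails as soon as either string is empty
        (a + b).length > a.length && (a + b).length > b.length
      }
      // ! String.concatenate: Falsified after 0 passed tests.
      // > ARG_0: ""
      // > ARG_1: ""
    }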
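
The second property on this slide is falsified because concatenating with an empty string does not make the result strictly longer. Below is a minimal sketch of a weakened property that should hold, using `>=` instead of `>`. The names `ConcatenateProperty` and `holds` are illustrative and not from the talk.

```scala
import org.scalacheck.Prop.forAll
import org.scalacheck.{Prop, Test}

object ConcatenateProperty {
  // Weakened version of the "concatenate" property: >= instead of >,
  // so that empty strings no longer falsify it.
  val concatenate: Prop = forAll { (a: String, b: String) =>
    (a + b).length >= a.length && (a + b).length >= b.length
  }

  // Runs the property with ScalaCheck's default test parameters
  // and reports whether every generated case passed.
  def holds: Boolean =
    Test.check(Test.Parameters.default, concatenate).passed
}
```

Calling `ConcatenateProperty.holds` runs the check programmatically, which can be handy outside of a `Properties` object.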

  15. PROPERTY BASED TESTING - PROS
    > Test data is less biased
    > On failing,
    counter-example provided
    > Higher confidence that
    our code probably works

  16. PROPERTY BASED TESTING - CONS
    > Restructuring your tests
    as properties
    is not always immediate
    > Not always applicable
    with side effects
    > Configurations do affect
    the test result

  19. COMMON APPROACHES

  20. CAN WE COMPROMISE?

  21. CAN WE REUSE
    SOME OF THE
    SCALACHECK MAGIC?

  22. RANDOM DATA
    GENERATOR
    GITHUB.COM/DANIELASFREGOLA/RANDOM-DATA-GENERATOR

  23. RANDOM DATA GENERATOR
    import com.danielasfregola.randomdatagenerator.RandomDataGenerator._
    case class Example(text: String, n: Int)
    val example: Example = random[Example]
    // Example(ਈ⼝ꏣᰣ∯෢ꪔ䃂ᅟ䑪⡨⿽ᵅ䎎ߐ, 73967257)

  24. RANDOM DATA GENERATOR
    case class User(name: String, surname: String)
    "create a user" in {
      val user = random[User]
      // `routes` stands for the route under test (name assumed, not on the slide)
      Post("/users", user) ~> routes ~> check {
        status === StatusCodes.Created
        assertCreation(user)
      }
    }

  25. FIX YOUR SEED
    Each session has an associated seed number:
    Generating random data with seed -2481216758852790303
    Use it to reproduce and debug problematic tests:
    export RANDOM_DATA_GENERATOR_SEED=-2481216758852790303
    Remove it again once you are done:
    unset RANDOM_DATA_GENERATOR_SEED
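
Why a fixed seed reproduces the same data can be illustrated with plain ScalaCheck. This sketch does not show how random-data-generator reads `RANDOM_DATA_GENERATOR_SEED`. It only relies on the fact that a ScalaCheck generator applied twice to the same parameters and `Seed` yields the same value. `FixedSeedDemo` and `generate` are hypothetical names.

```scala
import org.scalacheck.Gen
import org.scalacheck.rng.Seed

object FixedSeedDemo {
  private val params = Gen.Parameters.default

  // Generates one alphabetic string from an explicit seed value.
  // The same seed value always leads to the same string.
  def generate(seedValue: Long): String =
    Gen.alphaStr.pureApply(params, Seed(seedValue))
}
```

For example, `FixedSeedDemo.generate(-2481216758852790303L)` returns the same string on every call, while a different seed value will most likely give a different one.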

  26. LESS BIASED TEST DATA
    For every session
    different test data
    will be randomly* selected
    * We can still fix the seed when needed!

  27. BUGS
    BUGS EVERYWHERE

  28. EASIER TO MAINTAIN
    case class User(name: String, surname: String, age: Int)
    "create a user" in {
      val user = random[User]
      // `routes` stands for the route under test (name assumed, not on the slide)
      Post("/users", user) ~> routes ~> check {
        status === Created
        assertCreated(user)
      }
    }

  29. IMPROVED READABILITY
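    case class User(name: String, surname: String, age: Int)
    "reject user creation of an underage user" in {
      // only the field that matters for this test is set explicitly
      val user = random[User].copy(age = 17)
      // `routes` stands for the route under test (name assumed, not on the slide)
      Post("/users", user) ~> routes ~> check {
        status === BadRequest
        assertNotCreated(user)
      }
    }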

  30. HOW DOES IT WORK?

  31. SCALACHECK [1]
    [1] github.com/rickynils/scalacheck

  32. SCALACHECK-SHAPELESS [2]
    AUTOMATICALLY INFERS ARBITRARY[T] IF:
    > T is a case class
    > T is a sealed trait
    [2] github.com/alexarchambault/scalacheck-shapeless
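
For a sense of what this derivation saves us from writing, here is a hand-written sketch of instances similar in spirit to what gets inferred for a case class and a sealed trait. It uses plain ScalaCheck only. The `Shape` hierarchy and the object name are made up for illustration, and the instances scalacheck-shapeless actually derives may differ in detail.

```scala
import org.scalacheck.{Arbitrary, Gen}
import org.scalacheck.Arbitrary.arbitrary

object ManualArbitrary {
  case class Example(text: String, n: Int)

  // Case class: one generator per field, combined in a for-comprehension.
  implicit val arbitraryExample: Arbitrary[Example] = Arbitrary {
    for {
      text <- arbitrary[String]
      n    <- arbitrary[Int]
    } yield Example(text, n)
  }

  // Sealed trait (hypothetical hierarchy): one generator per subtype,
  // then a random choice between them.
  sealed trait Shape
  case class Circle(radius: Int) extends Shape
  case class Square(side: Int) extends Shape

  private val circleGen: Gen[Shape] = arbitrary[Int].map(Circle(_))
  private val squareGen: Gen[Shape] = arbitrary[Int].map(Square(_))

  implicit val arbitraryShape: Arbitrary[Shape] =
    Arbitrary(Gen.oneOf(circleGen, squareGen))
}
```

Every new field or subtype would require touching these instances by hand, which is the maintenance cost automatic derivation removes.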

  33. LET'S LOOK AT THE CODE!

  34. import org.scalacheck._
    trait RandomDataGenerator extends ShapelessLike {
      private val seed = RandomDataGenerator.seed
      def random[T](implicit arb: Arbitrary[T]): T = {
        val gen: Gen[T] = arb.arbitrary
        // Gen#apply returns None when the generator cannot produce a value
        val optT: Option[T] = gen.apply(Gen.Parameters.default, seed)
        optT.get // !!! throws if no value was generated
      }
    }

  35. WHY AN OPTION?
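    Arbitrary {
      // chooseNum(1, 100) never yields a value above 200,
      // so the filter rejects everything and no value is generated
      Gen.chooseNum(1, 100).suchThat(_ > 200)
    }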
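
The effect of such a filter can be observed directly by applying a generator to explicit parameters and a seed. This is a small sketch in plain ScalaCheck. The seed value `42L` and the names `WhyAnOption`, `impossible` and `satisfiable` are arbitrary choices for the example.

```scala
import org.scalacheck.Gen
import org.scalacheck.rng.Seed

object WhyAnOption {
  private val params = Gen.Parameters.default
  private val seed   = Seed(42L) // arbitrary fixed seed, chosen for this sketch

  // No number in [1, 100] is greater than 200: the filter always rejects.
  val impossible: Gen[Int] = Gen.chooseNum(1, 100).suchThat(_ > 200)

  // Every number in [1, 100] is positive: the filter always accepts.
  val satisfiable: Gen[Int] = Gen.chooseNum(1, 100).suchThat(_ > 0)

  // Gen#apply returns an Option because a filter may discard the value.
  val noValue: Option[Int]   = impossible(params, seed)  // None
  val someValue: Option[Int] = satisfiable(params, seed) // Some(n), 1 <= n <= 100
}
```

Calling `.get` on the first result would throw, which is why overly strict filters in a custom `Arbitrary` are worth avoiding.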

  36. ARBITRARY OF CUSTOM TYPE
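    import java.util.Currency
    import org.scalacheck.{Arbitrary, Gen}
    import scala.collection.JavaConverters._
    implicit val arbitraryCurrency: Arbitrary[Currency] =
      Arbitrary {
        // getAvailableCurrencies returns a java.util.Set, converted explicitly
        Gen.oneOf(Currency.getAvailableCurrencies.asScala.toSeq)
      }
    random[Currency]
    // java.util.Currency = OMR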

  37. MAKE SURE THAT
    THE GENERATED TEST DATA
    MAKES SENSE

  38. CUSTOMISE YOUR ARBITRARY
    Before:
    random[String]
    // ᭞❱᭟ⳘԺ〈ᦙ᠓ꍊꎼꙐႀ⤌惲
    After:
    /** Generates a string of alpha characters */
    implicit val arbitraryString: Arbitrary[String] = Arbitrary(Gen.alphaStr)
    random[String]
    // hqtbonxacrmvmuMpofwtasrojjnycwuoTfkrhOpli

  39. CUSTOMISE YOUR ARBITRARY
    case class Person(name: String, age: Int)
    implicit val arbitraryPerson: Arbitrary[Person] =
      Arbitrary {
        for {
          name <- Gen.oneOf("Daniela", "John", "Martin")
          age  <- Gen.choose(0, 100)
        } yield Person(name, age)
      }
    random[Person]
    // Person(John,16)
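
Existing generators can also be reused and narrowed rather than rewritten. The sketch below starts from the same `Person` generator and derives an adults-only variant and a fixed-size list from it. The names `adultGen`, `threeAdultsGen` and `sampleAdults` are hypothetical, and the age bound of 18 is just an example.

```scala
import org.scalacheck.Gen
import org.scalacheck.rng.Seed

object PersonGenerators {
  case class Person(name: String, age: Int)

  // Same generator as on the slide.
  val personGen: Gen[Person] = for {
    name <- Gen.oneOf("Daniela", "John", "Martin")
    age  <- Gen.choose(0, 100)
  } yield Person(name, age)

  // Reuse personGen, but replace the age with one from a narrower range.
  val adultGen: Gen[Person] = for {
    person <- personGen
    age    <- Gen.choose(18, 100)
  } yield person.copy(age = age)

  // Build a list of exactly three people out of the narrowed generator.
  val threeAdultsGen: Gen[List[Person]] = Gen.listOfN(3, adultGen)

  // Hypothetical helper: evaluates the list generator with an explicit seed.
  def sampleAdults(seedValue: Long): List[Person] =
    threeAdultsGen.pureApply(Gen.Parameters.default, Seed(seedValue))
}
```

Narrowing with a second `Gen.choose` instead of `suchThat(_.age >= 18)` keeps the generator from discarding values, so it cannot end up producing nothing.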

  40. WHAT
    RANDOM DATA GENERATOR
    IS NOT FOR

  41. TYPE CLASS DERIVATION
    WITH SHAPELESS
    IMPACTS ON COMPILATION TIME

  42. MILES SABIN, TYPELEVEL SCALA REBOOTED, SCALAEXCHANGE 2016

  43. CAN WE DO BETTER?

  44. TYPE CLASS DERIVATION
    WITH MAGNOLIA [3]
    BY @PROPENSIVE
    [3] github.com/propensive/magnolia

  45. SCALACHECK-MAGNOLIA [4]
    BY @ETATY
    [4] github.com/etaty/scalacheck-magnolia

  46. 248% SPEED UP WITH MAGNOLIA !!!!!

  47. RANDOM DATA
    GENERATOR MAGNOLIA
    GITHUB.COM/DANIELASFREGOLA/RANDOM-DATA-GENERATOR-MAGNOLIA

  48. SHAPELESS
    VS
    MAGNOLIA

  49. WRAP UP (1)
    > A compromise between using
    ScalaCheck and predefined fixtures
    > Customise the data generation
    to your context
    > Do not ignore random test failures
    > Use it only for test purposes

  50. WRAP UP (2)
    > Magnolia is for faster type class derivation
    > Shapeless has more features
    > Open Source is awesome!

  51. Testing shows the presence,
    not the absence of bugs
    — Edsger W. Dijkstra, 1969

  52. THANK YOU!
    > Random Data Generator:
    github.com/DanielaSfregola/random-data-generator
    > Random Data Generator Magnolia:
    github.com/DanielaSfregola/random-data-generator-magnolia
    > Twitter: @DanielaSfregola
    > Blog: danielasfregola.com
