Easy and Efficient Data Validation with Cats - Scala Central 2017 #7

When we build a client/server application, we often need to validate requests: can the user associated with the request perform this operation? Can they access or modify the data? Is the input well-formed? When the data validation component of an application is poorly designed, the code quickly loses expressiveness and becomes difficult to maintain. Business rules don’t help: they keep adding requirements to the validation, making it harder and harder to represent clearly and to maintain. At the same time, when validation fails, it should be fairly straightforward to understand why the request was rejected, so that action can be taken accordingly.

This talk introduces Cats, a Scala library based on category theory, and some of its most interesting components for data validation. In particular, we’ll discuss some options for achieving efficient and expressive data validation. We will also argue that, compared to other options in the language, Cats is particularly well suited to the task thanks to its easy-to-use data types and more approachable syntax. Throughout the talk, you will see numerous examples of how data validation can be achieved in a clean and robust way, and how it can easily be integrated into our code without any specific knowledge of category theory.

Daniela Sfregola

April 25, 2017

Transcript

  1. EASY AND EFFICIENT
    DATA VALIDATION
    WITH CATS
    @DANIELASFREGOLA
    SCALA CENTRAL #7

  2. HELLOOOOO
    > ex Java Developer
    > OOP background
    > I am not a mathematician!

  3. FUNCTIONAL BUZZWORDS
    FREE ZONE

  4. DATA VALIDATION
    > In almost every application
    > Can become complex quite quickly
    > Needs to be maintained

  5. ...NO NEED TO
    REINVENT THE WHEEL

  6. MAP
    scala> Some("daniela").map(s => "yo " + s)
    res0: Option[String] = Some(yo daniela)
    scala> None.map(s => "yo " + s)
    res1: Option[String] = None

  7. MAP + FLATTEN
    scala> Some("daniela").map(s => Some("yo " + s))
    res2: Option[Option[String]] = Some(Some(yo daniela))
    scala> Some("daniela").map(s => Some("yo " + s)).flatten
    res3: Option[String] = Some(yo daniela)

  8. FLATMAP
    map + flatten
    scala> Some("daniela").flatMap(s => Some("yo " + s))
    res4: Option[String] = Some(yo daniela)
    scala> None.flatMap(s => Some("yo " + s))
    res5: Option[String] = None

  9. FOR-COMPREHENSION
    map + flatMap (+ filter)
    scala> for { a <- Some(1); b <- Some(5) } yield a + b
    res6: Option[Int] = Some(6)
    scala> for { a <- Some(1); b <- None } yield a + b
    res7: Option[Int] = None
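
As a rough illustration of the "map + flatMap" caption, a for-comprehension over `Option` is rewritten by the compiler into nested `flatMap` and `map` calls. The object and value names below are only for illustration and do not come from the talk.

```scala
object ForDesugaring {
  val a: Option[Int] = Some(1)
  val b: Option[Int] = Some(5)
  val none: Option[Int] = None

  // for-comprehension syntax
  val sugared: Option[Int] = for { x <- a; y <- b } yield x + y

  // roughly what the compiler generates: flatMap for every generator
  // except the last one, which becomes map
  val desugared: Option[Int] = a.flatMap(x => b.map(y => x + y))

  // a None anywhere in the chain short-circuits the whole expression
  val shortCircuited: Option[Int] = for { x <- a; y <- none } yield x + y
}
```

Both forms produce the same value; the for-comprehension just reads better once more than two steps are chained.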

  10. CASE STUDY

  11. SCALA
    2.11
    NOT THE LATEST VERSION !!!

  12. OPTION
    package scala
    sealed abstract class Option[+A] {
      def map[B](f: A => B): Option[B] = ???
      def flatMap[B](f: A => Option[B]): Option[B] = ???
    }
    final case class Some[+A](x: A) extends Option[A]
    case object None extends Option[Nothing]

  13. OPTION
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Option[String] = ???
    def validatePhone(p: String): Option[String] = ???
    def validateData(d: Data): Option[Data] =
      for {
        validEmail <- validateEmail(d.email)
        validPhone <- validatePhone(d.phone)
      } yield Data(validEmail, validPhone)

  14. OPTION
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: Option[Data] = Some(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: Option[Data] = None
    > validateData(Data(okEmail, badPhone))
    res2: Option[Data] = None
    > validateData(Data(badEmail, okPhone))
    res3: Option[Data] = None
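
The slides leave `validateEmail` and `validatePhone` as `???`. A minimal sketch of how they might look with `Option`, assuming an email is accepted when it matches a simple `name@host.tld` pattern and a phone when it holds only digits and spaces with an optional leading `+`. Both rules, the regular expressions and the sample address used in the usage note are assumptions made for this example, not the rules used in the talk.

```scala
object OptionValidation {
  case class Data(email: String, phone: String)

  // Hypothetical rules, not taken from the talk
  private val EmailPattern = """[^@\s]+@[^@\s]+\.[^@\s]+"""
  private val PhonePattern = """\+?[0-9 ]+"""

  // String.matches checks the whole string against the pattern
  def validateEmail(e: String): Option[String] =
    if (e.matches(EmailPattern)) Some(e) else None

  def validatePhone(p: String): Option[String] =
    if (p.matches(PhonePattern)) Some(p) else None

  def validateData(d: Data): Option[Data] =
    for {
      validEmail <- validateEmail(d.email)
      validPhone <- validatePhone(d.phone)
    } yield Data(validEmail, validPhone)
}
```

With these rules, `validateData(Data("daniela@example.com", "+1 1234567890123"))` is a `Some`, while any invalid field collapses the result to `None` with no hint about which field failed.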

  15. Y U NO TELL ME
    WHICH ONE IS THE WRONG ONE?

  16. JUST DO NOT USE OPTION

  17. EITHER (2.11)
    package scala.util
    sealed abstract class Either[+A, +B]
    final case class Left[+A, +B](a: A) extends Either[A, B]
    final case class Right[+A, +B](b: B) extends Either[A, B]
    // no map
    // no flatMap

  18. EITHER (2.11)
    package scala.util
    final case class LeftProjection[+A, +B](e: Either[A, B])
    final case class RightProjection[+A, +B](e: Either[A, B])
    /**
    * Right(12).left.map(x => "flower") // Result: Right(12)
    * Left(12).left.map(x => "flower") // Result: Left("flower")
    *
    * Right(12).right.map(x => "flower") // Result: Right("flower")
    * Left(12).right.map(x => "flower") // Result: Left(12)
    **/
    // same for flatMap!
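
Because the projections do have `map` and `flatMap`, a for-comprehension is possible in 2.11 as long as every generator goes through `.right`. The sketch below assumes two hypothetical validators with made-up rules. Note that `.right` still compiles on later Scala versions but is deprecated from 2.13 onwards.

```scala
object RightProjectionExample {
  case class Data(email: String, phone: String)

  // Hypothetical rules, chosen only to make the sketch runnable
  def validateEmail(e: String): Either[List[String], String] =
    if (e.contains("@")) Right(e) else Left(List("Invalid email format"))

  def validatePhone(p: String): Either[List[String], String] =
    if (p.nonEmpty && p.forall(c => c.isDigit || c == ' ' || c == '+')) Right(p)
    else Left(List("Phone number must be numeric"))

  // .right selects the right projection, which has map and flatMap,
  // so the for-comprehension compiles even though Either is unbiased in 2.11
  def validateData(d: Data): Either[List[String], Data] =
    for {
      e <- validateEmail(d.email).right
      p <- validatePhone(d.phone).right
    } yield Data(e, p)
}
```

This version stops at the first `Left`, so it reports at most one failing field.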

  19. EITHER (2.11)
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Either[List[String], String] = ???
    def validatePhone(p: String): Either[List[String], String] = ???
    def validateData(d: Data): Either[List[String], Data] = {
      val validEmail = validateEmail(d.email)
      val validPhone = validatePhone(d.phone)
      (validEmail, validPhone) match {
        case (Right(e), Right(p)) => Right(Data(e, p))
        case (Left(errE), Left(errP)) => Left(errE ++ errP)
        case (Left(errE), _) => Left(errE)
        case (_, Left(errP)) => Left(errP)
      }
    }

  20. EITHER (2.11)
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: Either[List[String],Data] = Right(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: Either[List[String],Data] =
    Left(List("Invalid email format", "Phone number must be numeric"))
    > validateData(Data(okEmail, badPhone))
    res2: Either[List[String],Data] = Left(List("Phone number must be numeric"))
    > validateData(Data(badEmail, okPhone))
    res3: Either[List[String],Data] = Left(List("Invalid email format"))

  21. EITHER (2.11)
    Which one is the error?
    Which one is the valid value?
    Either is not biased*
    *things have changed in Scala 2.12

  22. EITHER (2.11)
    Combining Either instances
    is not always
    easy or maintainable

  23. SCALA
    2.12

  24. EITHER (2.12)
    package scala.util
    sealed abstract class Either[+A, +B] {
      def map[Y](f: B => Y): Either[A, Y] = ???
      def flatMap[AA >: A, Y](f: B => Either[AA, Y]): Either[AA, Y] = ???
    }
    final case class Left[+A, +B](a: A) extends Either[A, B]
    final case class Right[+A, +B](b: B) extends Either[A, B]

  25. EITHER (2.12)
    package scala.util
    final case class LeftProjection[+A, +B](e: Either[A, B])
    final case class RightProjection[+A, +B](e: Either[A, B])
    // right projection used by default
    /**
    * Right(12).left.map(x => "flower") // Result: Right(12)
    * Left(12).left.map(x => "flower") // Result: Left("flower")
    *
    * Right(12).right.map(x => "flower") // Result: Right("flower")
    * Left(12).right.map(x => "flower") // Result: Left(12)
    **/
    // same for flatMap!

  26. EITHER (2.12)
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Either[List[String], String] = ???
    def validatePhone(p: String): Either[List[String], String] = ???
    def validateData(d: Data): Either[List[String], Data] =
      for {
        validEmail <- validateEmail(d.email)
        validPhone <- validatePhone(d.phone)
      } yield Data(validEmail, validPhone)

  27. EITHER (2.12)
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: Either[List[String],Data] = Right(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: Either[List[String],Data] = Left(List("Invalid email format"))
    > validateData(Data(okEmail, badPhone))
    res2: Either[List[String],Data] = Left(List("Phone number must be numeric"))
    > validateData(Data(badEmail, okPhone))
    res3: Either[List[String],Data] = Left(List("Invalid email format"))

  34. EITHER (2.12)
    > only one validation is performed
    > ideal only when error accumulation
    is not needed
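
To make "only one validation is performed" concrete, the sketch below records which validators actually run. The `calls` buffer and the validation rules are hypothetical and exist only for the demonstration; the code assumes Scala 2.12 or later, where `Either` is right-biased.

```scala
object FailFastEither {
  // Records which validators were invoked (demonstration only)
  val calls = scala.collection.mutable.ListBuffer.empty[String]

  def validateEmail(e: String): Either[List[String], String] = {
    calls += "email"
    if (e.contains("@")) Right(e) else Left(List("Invalid email format"))
  }

  def validatePhone(p: String): Either[List[String], String] = {
    calls += "phone"
    if (p.nonEmpty && p.forall(c => c.isDigit || c == ' ' || c == '+')) Right(p)
    else Left(List("Phone number must be numeric"))
  }

  // flatMap on a Left never evaluates the next generator,
  // so validatePhone is skipped when the email is invalid
  def validate(email: String, phone: String): Either[List[String], (String, String)] =
    for {
      e <- validateEmail(email)
      p <- validatePhone(phone)
    } yield (e, p)
}
```

After calling `validate("email.com", "not-a-valid-phone")`, `calls` contains only `"email"`: the phone validator was never reached, which is why its error cannot appear in the result.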

  35. CATS
    0.9.0
    GITHUB.COM/TYPELEVEL/CATS

  36. BIASED EITHER WITH 2.11
    import cats.syntax.either._
    * Xor was removed in cats 0.8.0

  37. VALIDATED
    package cats.data
    sealed abstract class Validated[+E, +A] {
      def map[B](f: A => B): Validated[E, B]
      // no flatMap
    }
    final case class Valid[+A](a: A) extends Validated[Nothing, A]
    final case class Invalid[+E](e: E) extends Validated[E, Nothing]
    //...but we have something else *really* useful!

  38. VALIDATED AND APPLY*
    import cats.{Apply, Semigroup}
    import cats.data.Validated
    import cats.implicits._
    // E needs a Semigroup so that errors from both sides can be combined
    def accumulate[E: Semigroup, A1, A2, B](v1: Validated[E, A1],
                                            v2: Validated[E, A2])
                                           (f: (A1, A2) => B): Validated[E, B] =
      (v1 |@| v2).map(f)
    // same as: Apply[Validated[E, ?]].map2(v1, v2)(f)
    * More info on Apply at http://typelevel.org/cats/typeclasses/applicative.html

  39. VALIDATED
    import cats.implicits._
    import cats.data.Validated
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Validated[List[String], String] = ???
    def validatePhone(p: String): Validated[List[String], String] = ???
    def validateData(d: Data): Validated[List[String], Data] = {
      val validEmail = validateEmail(d.email)
      val validPhone = validatePhone(d.phone)
      (validEmail |@| validPhone).map(Data)
    }

  40. VALIDATED
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: cats.data.Validated[List[String],Data] = Valid(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: cats.data.Validated[List[String],Data] =
    Invalid(List("Invalid email format", "Phone number must be numeric"))
    > validateData(Data(okEmail, badPhone))
    res2: cats.data.Validated[List[String],Data] = Invalid(List("Phone number must be numeric"))
    > validateData(Data(badEmail, okPhone))
    res3: cats.data.Validated[List[String],Data] = Invalid(List("Invalid email format"))
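
A minimal sketch of validators that would behave like the session above, assuming the Cats library is on the classpath. The rules, the `Result` alias and the sample address in the test data are hypothetical. The sketch calls `Apply.map2` rather than `|@|`, since the `|@|` syntax used on the slides (Cats 0.9.0) was dropped in Cats 1.0 in favour of tuple syntax, whereas `map2` is available in both.

```scala
import cats.Apply
import cats.data.Validated
import cats.data.Validated.{Invalid, Valid}
import cats.implicits._

object ValidatedSketch {
  case class Data(email: String, phone: String)

  // Alias so that Apply can be summoned without a type lambda
  type Result[A] = Validated[List[String], A]

  // Hypothetical rules, not taken from the talk
  def validateEmail(e: String): Result[String] =
    if (e.contains("@")) Valid(e) else Invalid(List("Invalid email format"))

  def validatePhone(p: String): Result[String] =
    if (p.nonEmpty && p.forall(c => c.isDigit || c == ' ' || c == '+')) Valid(p)
    else Invalid(List("Phone number must be numeric"))

  // Both validators always run; if both fail, the two error lists
  // are concatenated using the Semigroup instance for List
  def validateData(d: Data): Result[Data] =
    Apply[Result].map2(validateEmail(d.email), validatePhone(d.phone))(Data(_, _))
}
```

Unlike the `Either` for-comprehension, neither validation depends on the other, which is what allows both errors to be reported together.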

  41. VALIDATEDNEL
    package cats.data
    final case class NonEmptyList[+A](head: A, tail: List[A])
    type ValidatedNel[E, A] = Validated[NonEmptyList[E], A]

  42. USE AN EXPRESSIVE ERROR TYPE
    object ErrorCode extends Enumeration {
      type ErrorCode = Value
      val InvalidEmailFormat, ...,
          PhoneMustBeNumeric = Value
    }
    import ErrorCode._
    case class Err(code: ErrorCode, msg: String)
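
One benefit of a structured error type is that callers can branch on the code instead of parsing message text. A small sketch, using only the two codes named on the slide (the elided ones are left out) and a hypothetical `fieldFor` helper that is not part of the talk:

```scala
object ErrorCodes {
  object ErrorCode extends Enumeration {
    type ErrorCode = Value
    // Only the two codes named on the slide; the elided ones are left out
    val InvalidEmailFormat, PhoneMustBeNumeric = Value
  }
  import ErrorCode._

  case class Err(code: ErrorCode, msg: String)

  val errors: List[Err] = List(
    Err(InvalidEmailFormat, "Invalid email format"),
    Err(PhoneMustBeNumeric, "Phone number must be numeric")
  )

  // Hypothetical helper: map an error code to the input field it refers to
  def fieldFor(err: Err): String = err.code match {
    case InvalidEmailFormat => "email"
    case PhoneMustBeNumeric => "phone"
  }
}
```

The message stays free-form for humans, while the code gives client logic something stable to match on.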

  43. OUR FINAL SOLUTION
    import cats.data._
    import cats.implicits._
    case class Data(email: String, phone: String)
    def validateEmail(e: String): ValidatedNel[Err, String] = ???
    def validatePhone(p: String): ValidatedNel[Err, String] = ???
    def validateData(d: Data): ValidatedNel[Err, Data] = {
      val validEmail = validateEmail(d.email)
      val validPhone = validatePhone(d.phone)
      validEmail |@| validPhone map (Data)
    }

  44. VALIDATEDNEL + ERR
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: cats.data.ValidatedNel[Err,Data] = Valid(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: cats.data.ValidatedNel[Err,Data] = Invalid(NonEmptyList(
    Err("InvalidEmailFormat","Invalid email format"),
    Err("PhoneMustBeNumeric","Phone number must be numeric")))
    > validateData(Data(okEmail, badPhone))
    res2: cats.data.ValidatedNel[Err,Data] = Invalid(NonEmptyList(
    Err("PhoneMustBeNumeric","Phone number must be numeric")))
    > validateData(Data(badEmail, okPhone))
    res3: cats.data.ValidatedNel[Err,Data] = Invalid(NonEmptyList(
    Err("InvalidEmailFormat","Invalid email format")))

  45. HOW TO STRUCTURE VALIDATION
    WITHIN YOUR APPLICATION?

  46. STEP 1
    > Pick an error representation
    > stick to it!
    case class Err(code: ErrorCode, msg: String)

  47. STEP 2
    > Use a type alias
    type Validation[T] = ValidatedNel[Err, T]

  48. STEP 3
    > Create a companion object
    > make it simple for your team

  49. A CONCRETE EXAMPLE
    sealed trait Err {
      val code: String
      val msg: String
      val values: Seq[AnyRef]
    }
    case class BadRequest(code: String,
                          msg: String) extends Err {
      val values = Seq.empty
    }
    case class NotFound(code: String,
                        msg: String,
                        values: Seq[AnyRef]) extends Err

  50. A CONCRETE EXAMPLE
    import cats.data._
    type Validation[T] = ValidatedNel[Err, T]
    object Validation extends AccumulateArities {
      def success[T](t: T): Validation[T] = Validated.valid(t)
      def failure[T](e: Err): Validation[T] = Validated.invalidNel(e)
    }

  51. A CONCRETE EXAMPLE
    import cats.Apply
    trait AccumulateArities {
      /** Accumulate function for Validation[T] of arity 2 */
      def accumulate[T1,T2,Z](v1: Validation[T1],
                              v2: Validation[T2])
                             (f: (T1,T2) => Z): Validation[Z] =
        Apply[Validation].map2(v1,v2)(f)
      /** Accumulate function for Validation[T] of arity 3 */
      def accumulate[T1,T2,T3,Z](v1: Validation[T1],
                                 v2: Validation[T2],
                                 v3: Validation[T3])
                                (f: (T1,T2,T3) => Z): Validation[Z] =
        Apply[Validation].map3(v1,v2,v3)(f)
      // ...until arity 22!
    }
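
A sketch of how calling code might use such a companion object, assuming Cats is on the classpath. To stay self-contained it redefines a simplified `Err` (a plain case class instead of the sealed hierarchy), includes only the arity-2 `accumulate`, and adds hypothetical validators and a hypothetical `render` helper; none of these names beyond `Validation`, `success`, `failure` and `accumulate` come from the talk.

```scala
import cats.Apply
import cats.data.{Validated, ValidatedNel}
import cats.implicits._

object ValidationUsage {
  // Simplified stand-in for the Err hierarchy from the slides
  final case class Err(code: String, msg: String)

  type Validation[T] = ValidatedNel[Err, T]

  object Validation {
    def success[T](t: T): Validation[T] = Validated.valid(t)
    def failure[T](e: Err): Validation[T] = Validated.invalidNel(e)

    // Arity 2 only; the slides define the remaining arities as well
    def accumulate[T1, T2, Z](v1: Validation[T1], v2: Validation[T2])(
        f: (T1, T2) => Z): Validation[Z] =
      Apply[Validation].map2(v1, v2)(f)
  }

  final case class Data(email: String, phone: String)

  // Hypothetical rules, not taken from the talk
  def validateEmail(e: String): Validation[String] =
    if (e.contains("@")) Validation.success(e)
    else Validation.failure(Err("InvalidEmailFormat", "Invalid email format"))

  def validatePhone(p: String): Validation[String] =
    if (p.nonEmpty && p.forall(c => c.isDigit || c == ' ' || c == '+')) Validation.success(p)
    else Validation.failure(Err("PhoneMustBeNumeric", "Phone number must be numeric"))

  def validateData(d: Data): Validation[Data] =
    Validation.accumulate(validateEmail(d.email), validatePhone(d.phone))(Data(_, _))

  // Collapse the result into a single message, e.g. for a response body
  def render(v: Validation[Data]): String =
    v.fold(
      errs => errs.toList.map(_.msg).mkString("; "),
      d => s"ok: ${d.email}"
    )
}
```

Team members then only need `Validation.success`, `Validation.failure` and `Validation.accumulate`, without touching Cats syntax directly.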

  52. SUMMARY
    > Do not reinvent the wheel
    > Choose an expressive type
    > Customise the solution
    to your needs

  53. THANK YOU!
    > Code on github:
    github.com/DanielaSfregola/data-validation
    > Twitter: @DanielaSfregola
    > Blog: danielasfregola.com
