Easy and Efficient Data Validation with Cats - Typelevel Summit 2017 NYC

When we build a client/server application, we often need to validate requests: can the user associated with the request perform this operation? Can they access or modify the data? Is the input well-formed? When the data validation component is not well designed, the code quickly becomes inexpressive and difficult to maintain. Business rules make matters worse: each new requirement adds to the validation logic, making it harder to represent clearly and to maintain. At the same time, when validation fails, it should be straightforward to understand why the request was rejected, so that appropriate action can be taken. This talk introduces Cats, a Scala library based on category theory, and some of its most interesting components for data validation. In particular, we’ll discuss some options for achieving efficient and expressive data validation. We will also argue that, compared to other options in the language, Cats is particularly suited to the task thanks to its easy-to-use data types and more approachable syntax. Throughout the talk, you will see numerous examples of how data validation can be achieved in a clean and robust way, and how it can be easily integrated into our code without any specific knowledge of category theory.

Daniela Sfregola

March 23, 2017

Transcript

  1. EASY AND EFFICIENT
    DATA VALIDATION
    WITH CATS
    @DANIELASFREGOLA
    TYPELEVEL SUMMIT NYC 2017

  2. HELLOOOOO
    > ex Java Developer
    > OOP background
    > I am not a mathematician !

  3. FUNCTIONAL BUZZWORDS
    FREE ZONE

  4. DATA VALIDATION
    > In almost every application
    > Can become complex quite quickly
    > Needs to be maintained

  5. ...NO NEED TO
    REINVENT THE WHEEL

  6. MAP
    scala> Some("daniela").map(s => "yo " + s)
    res0: Option[String] = Some(yo daniela)
    scala> None.map(s => "yo " + s)
    res1: Option[String] = None

  7. FLATMAP
    map + flatten
    scala> Some("daniela").flatMap(s => Some("yo " + s))
    res2: Option[String] = Some(yo daniela)
    scala> None.flatMap(s => Some("yo " + s))
    res3: Option[String] = None
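
The "map + flatten" reading of flatMap can be checked directly. The following is a minimal sketch using only the standard library; the `parse` helper and the object name are hypothetical and exist only for illustration.

```scala
object FlatMapSketch {
  // Hypothetical helper: turn a string of digits into an Int, or None otherwise.
  def parse(s: String): Option[Int] =
    if (s.nonEmpty && s.forall(_.isDigit)) Some(s.toInt) else None

  val input: Option[String] = Some("42")

  // map alone nests the result: Option[Option[Int]]
  val nested: Option[Option[Int]] = input.map(parse)

  // flatten removes one level of nesting
  val viaFlatten: Option[Int] = nested.flatten

  // flatMap performs both steps at once
  val viaFlatMap: Option[Int] = input.flatMap(parse)
}
```

Both `viaFlatten` and `viaFlatMap` hold the same value, which is why flatMap is the natural way to chain steps that can each fail.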

  8. FOR-COMPREHENSION
    map + flatMap + filter
    scala> for { a <- Some(1); b <- Some(5) } yield a + b
    res4: Option[Int] = Some(6)
    scala> for { a <- Some(1); b <- None } yield a + b
    res5: Option[Int] = None
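
To make the "map + flatMap + filter" summary concrete, here is a small sketch of how a for-comprehension over Option corresponds to explicit method calls. The object and value names are assumptions made for this example.

```scala
object ForSketch {
  val a: Option[Int] = Some(1)
  val b: Option[Int] = Some(5)
  val missing: Option[Int] = None

  // Sugared form
  val sugared: Option[Int] = for { x <- a; y <- b } yield x + y

  // Roughly what the compiler generates: flatMap for every generator except the last, map for the last
  val desugared: Option[Int] = a.flatMap(x => b.map(y => x + y))

  // A None anywhere in the chain makes the whole result None
  val shortCircuit: Option[Int] = for { x <- a; y <- missing } yield x + y

  // A guard is translated into a filtering step (withFilter)
  val guarded: Option[Int] = for { x <- a if x > 1 } yield x
}
```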

  9. CASE STUDY

  10. SCALA
    2.11
    NOT THE LATEST VERSION !!!

  11. OPTION
    package scala
    sealed abstract class Option[+A] {
      def map[B](f: A => B): Option[B] = ???
      def flatMap[B](f: A => Option[B]): Option[B] = ???
    }
    final case class Some[+A](x: A) extends Option[A]
    case object None extends Option[Nothing]

  12. OPTION
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Option[String] = ???
    def validatePhone(p: String): Option[String] = ???
    def validateData(d: Data): Option[Data] =
      for {
        validEmail <- validateEmail(d.email)
        validPhone <- validatePhone(d.phone)
      } yield Data(validEmail, validPhone)
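
The slide leaves `validateEmail` and `validatePhone` as `???`. As a rough sketch, they could be filled in as below. The validation rules and the sample address `daniela@example.com` are assumptions made for this example, not the rules used in the talk.

```scala
object OptionValidationSketch {
  case class Data(email: String, phone: String)

  // Assumed rule: exactly one '@' with non-empty text on both sides.
  def validateEmail(e: String): Option[String] =
    e.split("@", -1) match {
      case Array(local, domain) if local.nonEmpty && domain.nonEmpty => Some(e)
      case _ => None
    }

  // Assumed rule: optional leading '+', then only digits and spaces, with at least one digit.
  def validatePhone(p: String): Option[String] = {
    val body = p.stripPrefix("+")
    if (body.exists(_.isDigit) && body.forall(c => c.isDigit || c == ' ')) Some(p)
    else None
  }

  // Same shape as on the slide: the first None ends the comprehension.
  def validateData(d: Data): Option[Data] =
    for {
      validEmail <- validateEmail(d.email)
      validPhone <- validatePhone(d.phone)
    } yield Data(validEmail, validPhone)
}
```

Whatever the rules are, the result type can only say "valid" or "not valid"; it has no room to carry the reason for a rejection.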

  13. OPTION
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: Option[Data] = Some(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: Option[Data] = None
    > validateData(Data(okEmail, badPhone))
    res2: Option[Data] = None
    > validateData(Data(badEmail, okPhone))
    res3: Option[Data] = None

  14. Y U NO TELL ME
    WHICH ONE IS THE WRONG ONE?

  15. JUST DO NOT USE OPTION

  16. EITHER (2.11)
    package scala.util
    sealed abstract class Either[+A, +B]
    final case class Left[+A, +B](a: A) extends Either[A, B]
    final case class Right[+A, +B](b: B) extends Either[A, B]
    // no map
    // no flatMap

  17. EITHER (2.11)
    package scala.util
    final case class LeftProjection[+A, +B](e: Either[A, B])
    final case class RightProjection[+A, +B](e: Either[A, B])
    /**
    * Right(12).left.map(x => "flower") // Result: Right(12)
    * Left(12).left.map(x => "flower") // Result: Left("flower")
    *
    * Right(12).right.map(x => "flower") // Result: Right("flower")
    * Left(12).right.map(x => "flower") // Result: Left(12)
    **/
    // same for flatMap!

  18. EITHER (2.11)
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Either[List[String], String] = ???
    def validatePhone(p: String): Either[List[String], String] = ???
    def validateData(d: Data): Either[List[String], Data] = {
      val validEmail = validateEmail(d.email)
      val validPhone = validatePhone(d.phone)
      (validEmail, validPhone) match {
        case (Right(e), Right(p))     => Right(Data(e, p))
        case (Left(errE), Left(errP)) => Left(errE ++ errP)
        case (Left(errE), _)          => Left(errE)
        case (_, Left(errP))          => Left(errP)
      }
    }
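
For reference, a self-contained version of this pattern-matching approach might look like the sketch below. The validation rules and the sample email address are assumptions; the error messages are the ones shown in the talk's output.

```scala
object EitherAccumulationSketch {
  case class Data(email: String, phone: String)
  type Errors = List[String]

  // Assumed rule: exactly one '@' with non-empty text on both sides.
  def validateEmail(e: String): Either[Errors, String] =
    e.split("@", -1) match {
      case Array(local, domain) if local.nonEmpty && domain.nonEmpty => Right(e)
      case _ => Left(List("Invalid email format"))
    }

  // Assumed rule: optional leading '+', then only digits and spaces, with at least one digit.
  def validatePhone(p: String): Either[Errors, String] = {
    val body = p.stripPrefix("+")
    if (body.exists(_.isDigit) && body.forall(c => c.isDigit || c == ' ')) Right(p)
    else Left(List("Phone number must be numeric"))
  }

  // Both validators always run; errors are merged by hand in the match.
  def validateData(d: Data): Either[Errors, Data] =
    (validateEmail(d.email), validatePhone(d.phone)) match {
      case (Right(e), Right(p))     => Right(Data(e, p))
      case (Left(errE), Left(errP)) => Left(errE ++ errP)
      case (Left(errE), _)          => Left(errE)
      case (_, Left(errP))          => Left(errP)
    }
}
```

This accumulates every error, but the number of match cases grows quickly as fields are added, which is the maintainability concern raised later in the talk.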

  19. EITHER (2.11)
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: Either[List[String],Data] = Right(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: Either[List[String],Data] =
    Left(List("Invalid email format", "Phone number must be numeric"))
    > validateData(Data(okEmail, badPhone))
    res2: Either[List[String],Data] = Left(List("Phone number must be numeric"))
    > validateData(Data(badEmail, okPhone))
    res3: Either[List[String],Data] = Left(List("Invalid email format"))

  20. EITHER (2.11)
    Which one is the error?
    Which one is the valid value?
    Either is not biased*
    *things have changed in Scala 2.12

  21. EITHER (2.11)
    Combining Either instances
    is not always
    easy or maintainable

  22. SCALA
    2.12

  23. EITHER (2.12)
    package scala.util
    sealed abstract class Either[+A, +B] {
      def map[Y](f: B => Y): Either[A, Y] = ???
      def flatMap[AA >: A, Y](f: B => Either[AA, Y]): Either[AA, Y] = ???
    }
    final case class Left[+A, +B](value: A) extends Either[A, B]
    final case class Right[+A, +B](value: B) extends Either[A, B]

  24. EITHER (2.12)
    package scala.util
    final case class LeftProjection[+A, +B](e: Either[A, B])
    final case class RightProjection[+A, +B](e: Either[A, B])
    // right projection used by default
    /**
    * Right(12).left.map(x => "flower") // Result: Right(12)
    * Left(12).left.map(x => "flower") // Result: Left("flower")
    *
    * Right(12).right.map(x => "flower") // Result: Right("flower")
    * Left(12).right.map(x => "flower") // Result: Left(12)
    **/
    // same for flatMap!

  25. EITHER (2.12)
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Either[List[String], String] = ???
    def validatePhone(p: String): Either[List[String], String] = ???
    def validateData(d: Data): Either[List[String], Data] =
      for {
        validEmail <- validateEmail(d.email)
        validPhone <- validatePhone(d.phone)
      } yield Data(validEmail, validPhone)

  26. EITHER (2.12)
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: Either[List[String],Data] = Right(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: Either[List[String],Data] = Left(List("Invalid email format"))
    > validateData(Data(okEmail, badPhone))
    res2: Either[List[String],Data] = Left(List("Phone number must be numeric"))
    > validateData(Data(badEmail, okPhone))
    res3: Either[List[String],Data] = Left(List("Invalid email format"))

  33. EITHER (2.12)
    > only one validation is performed
    > ideal only when error accumulation
    is not needed
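
The claim that only one validation is performed can be observed with a small experiment. This sketch assumes Scala 2.12 or later (right-biased Either); the simplified rules and the `phoneChecks` counter are hypothetical and exist only to make the behaviour visible.

```scala
object FailFastSketch {
  // Counts how many times the second validator actually runs (demonstration only).
  var phoneChecks = 0

  // Simplified, assumed rule: an email must contain '@'.
  def validateEmail(e: String): Either[List[String], String] =
    if (e.contains("@")) Right(e) else Left(List("Invalid email format"))

  // Simplified, assumed rule: a phone number is a non-empty string of digits.
  def validatePhone(p: String): Either[List[String], String] = {
    phoneChecks += 1
    if (p.nonEmpty && p.forall(_.isDigit)) Right(p)
    else Left(List("Phone number must be numeric"))
  }

  // flatMap on Either stops at the first Left, so later generators are never evaluated.
  def validate(email: String, phone: String): Either[List[String], (String, String)] =
    for {
      e <- validateEmail(email)
      p <- validatePhone(phone)
    } yield (e, p)
}
```

With a bad email and a bad phone number, `validate` reports only the email error and `phoneChecks` stays at zero, because the phone validator is never called.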

  34. CATS
    0.9.0
    GITHUB.COM/TYPELEVEL/CATS

  35. BIASED EITHER WITH 2.11
    import cats.syntax.either._
    * Xor removed from cats 0.8.0

  36. VALIDATED
    package cats.data
    sealed abstract class Validated[+E, +A] {
      def map[B](f: A => B): Validated[E, B] = ???
    }
    final case class Valid[+A](a: A) extends Validated[Nothing, A]
    final case class Invalid[+E](e: E) extends Validated[E, Nothing]
    // no flatMap
    // ...but we have something else *really* useful!

  37. VALIDATED AND APPLY*
    import cats.{Apply, Semigroup}
    import cats.data.Validated
    import cats.implicits._
    // Semigroup[E] is required so that errors from both sides can be combined
    def accumulate[E: Semigroup, A1, A2, B](v1: Validated[E, A1],
                                            v2: Validated[E, A2])
                                           (f: (A1, A2) => B): Validated[E, B] =
      (v1 |@| v2).map(f)
    // same as: Apply[Validated[E, ?]].map2(v1, v2)(f)
    * More info on Apply at http://typelevel.org/cats/typeclasses/applicative.html
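
To see what this kind of accumulation does without any library machinery, here is a stripped-down, standard-library-only sketch. `Checked`, `Ok`, `Bad` and this `map2` are hypothetical stand-ins written for this example; they are not the Cats implementation, and the error type is fixed to `List[String]` to keep things short.

```scala
object AccumulateSketch {
  // Simplified stand-in for a Validated-like type with errors fixed to List[String].
  sealed trait Checked[+A]
  final case class Ok[+A](value: A) extends Checked[A]
  final case class Bad(errors: List[String]) extends Checked[Nothing]

  // Both inputs are always inspected; if both failed, their errors are concatenated.
  def map2[A, B, C](ca: Checked[A], cb: Checked[B])(f: (A, B) => C): Checked[C] =
    (ca, cb) match {
      case (Ok(a), Ok(b))     => Ok(f(a, b))
      case (Bad(e1), Bad(e2)) => Bad(e1 ++ e2)
      case (Bad(e1), _)       => Bad(e1)
      case (_, Bad(e2))       => Bad(e2)
    }
}
```

The two inputs are independent of each other, which is what makes accumulation possible. A flatMap, by contrast, needs the first value before it can even build the second computation, so it has nothing to accumulate once the first step fails.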

  38. VALIDATED
    import cats.implicits._
    import cats.data.Validated
    case class Data(email: String, phone: String)
    def validateEmail(e: String): Validated[List[String], String] = ???
    def validatePhone(p: String): Validated[List[String], String] = ???
    def validateData(d: Data): Validated[List[String], Data] = {
      val validEmail = validateEmail(d.email)
      val validPhone = validatePhone(d.phone)
      (validEmail |@| validPhone).map(Data)
    }

  39. VALIDATED
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: cats.data.Validated[List[String],Data] = Valid(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: cats.data.Validated[List[String],Data] =
    Invalid(List("Invalid email format", "Phone number must be numeric"))
    > validateData(Data(okEmail, badPhone))
    res2: cats.data.Validated[List[String],Data] = Invalid(List("Phone number must be numeric"))
    > validateData(Data(badEmail, okPhone))
    res3: cats.data.Validated[List[String],Data] = Invalid(List("Invalid email format"))

  40. VALIDATEDNEL
    package cats.data
    final case class NonEmptyList[+A](head: A, tail: List[A]) // a OneAnd[List, A] alias in older Cats versions
    type ValidatedNel[E, A] = Validated[NonEmptyList[E], A]
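
The point of a non-empty list as the error type is that an invalid result can never carry zero errors. The sketch below is a simplified stand-in written for this example (the names `Nel`, `one` and `concat` are assumptions, not the Cats API).

```scala
object NelSketch {
  // Simplified stand-in for a non-empty list: the head always exists.
  final case class Nel[+A](head: A, tail: List[A]) {
    def toList: List[A] = head :: tail

    // Concatenating two non-empty lists is again non-empty.
    def concat[B >: A](other: Nel[B]): Nel[B] = Nel(head, tail ++ other.toList)
  }

  // Build a non-empty list holding a single element.
  def one[A](a: A): Nel[A] = Nel(a, Nil)
}
```

With a plain `List[String]` nothing stops a validator from returning an invalid result with an empty error list; with a non-empty list that state cannot be constructed.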

  41. USE AN EXPRESSIVE ERROR TYPE
    object ErrorCode extends Enumeration {
      type ErrorCode = Value
      val InvalidEmailFormat, ...,
          PhoneMustBeNumeric = Value
    }
    import ErrorCode._
    case class Err(code: ErrorCode, msg: String)

  42. OUR FINAL SOLUTION
    import cats.data._
    import cats.implicits._
    case class Data(email: String, phone: String)
    def validateEmail(e: String): ValidatedNel[Err, String] = ???
    def validatePhone(p: String): ValidatedNel[Err, String] = ???
    def validateData(d: Data): ValidatedNel[Err, Data] = {
      val validEmail = validateEmail(d.email)
      val validPhone = validatePhone(d.phone)
      validEmail |@| validPhone map (Data)
    }

  43. VALIDATEDNEL + ERR
    val okEmail = "[email protected]"; val badEmail = "email.com"
    val okPhone = "+1 1234567890123"; val badPhone = "not-a-valid-phone"
    > validateData(Data(okEmail, okPhone))
    res0: cats.data.ValidatedNel[Err,Data] = Valid(Data([email protected],+1 1234567890123))
    > validateData(Data(badEmail, badPhone))
    res1: cats.data.ValidatedNel[Err,Data] = Invalid(NonEmptyList(
    Err("InvalidEmailFormat","Invalid email format"),
    Err("PhoneMustBeNumeric","Phone number must be numeric")))
    > validateData(Data(okEmail, badPhone))
    res2: cats.data.ValidatedNel[Err,Data] = Invalid(NonEmptyList(
    Err("PhoneMustBeNumeric","Phone number must be numeric")))
    > validateData(Data(badEmail, okPhone))
    res3: cats.data.ValidatedNel[Err,Data] = Invalid(NonEmptyList(
    Err("InvalidEmailFormat","Invalid email format")))

  44. HOW TO STRUCTURE VALIDATION
    WITHIN YOUR APPLICATION?

  45. STEP 1
    > Pick an error representation
    > stick to it!
    case class Err(code: ErrorCode, msg: String)

  46. STEP 2
    > Use a type alias
    type Validation[T] = ValidatedNel[Err, T]

  47. STEP 3
    > Create a companion object
    > make it simple for your team

  48. A CONCRETE EXAMPLE
    sealed trait Err {
      val code: String
      val msg: String
      val values: Seq[AnyRef]
    }
    case class BadRequest(code: String,
                          msg: String) extends Err {
      val values = Seq.empty
    }
    case class NotFound(code: String,
                        msg: String,
                        values: Seq[AnyRef]) extends Err

  49. A CONCRETE EXAMPLE
    import cats.data._
    type Validation[T] = ValidatedNel[Err, T]
    object Validation extends AccumulateArities {
      def success[T](t: T): Validation[T] = Validated.valid(t)
      def failure[T](e: Err): Validation[T] = Validated.invalidNel(e)
    }

  50. A CONCRETE EXAMPLE
    import cats.Apply
    trait AccumulateArities {
      /** Accumulate function for Validation[T] of arity 2 */
      def accumulate[T1,T2,Z](v1: Validation[T1],
                              v2: Validation[T2])
                             (f: (T1,T2) => Z): Validation[Z] =
        Apply[Validation].map2(v1,v2)(f)
      /** Accumulate function for Validation[T] of arity 3 */
      def accumulate[T1,T2,T3,Z](v1: Validation[T1],
                                 v2: Validation[T2],
                                 v3: Validation[T3])
                                (f: (T1,T2,T3) => Z): Validation[Z] =
        Apply[Validation].map3(v1,v2,v3)(f)
      // ...until arity 22!
    }

  51. SUMMARY
    > Do not reinvent the wheel
    > Choose an expressive type
    > Customise the solution
    to your needs

  52. THANK YOU!
    > Code on github:
    github.com/DanielaSfregola/data-validation
    > Twitter: @DanielaSfregola
    > Blog: danielasfregola.com
