The Last Frontier and Beyond

Shannon
November 22, 2019

Transcript

  1. The last frontier and beyond
    Think Outside The Box™
    of what’s

  2. What is FP good for?

  3. FP is good for
    modeling data

  4. FP is good for
    modeling effects

  5. FP apps are like
    nice little boxes

  6. But still...

  7. Data “escapes”
    the nice and tidy
    realm of our FP
    programs

  19. In every project
    We need to write some specific code for:
    ● Serializing data
      ● In JSON, Avro, Protobuf, Thrift, JSON, JSON, JSON
    ● Validating user input
    ● Reading configurations
    ● Accessing data stored in databases
    ● Generating random data
    ● Pretty printing
    ● Comparing values

  21. Across projects
    Writing such code
    ● Is repetitive
    ● Mechanical
    ● Brings almost no business value

  22. Can we do
    better?

  23. Let’s derive this boilerplate at compile-time

  30. Compile-time derivation
    ● Many solutions:
      ● Scala macros, scalameta
      ● shapeless
      ● magnolia
      ● scalaz-deriving
    ● And specialized libs:
      ● scodec
      ● circe
      ● avro4s
      ● scalacheck-shapeless, etc.
    Many different APIs
    Not easily customized
    Slows down compilation
    Unfriendly error messages

  31. But it gets worse…

  32. Compile-time
    derivation makes
    the model
    difficult to evolve

  33. The evolution problem

  35. Let’s solve the evolution problem!

  36. To solve the evolution problem
    We first need:
    ● A uniform way to abstract over the structure of data
    ● A runtime reification of this abstraction
    ● A method to derive “operations” from this reification

  37. Such an
    abstraction
    exists: it’s called
    a schema

  38. Let’s define schemas

  39. The “A” in ADT
    Every Algebraic Data Type can be represented using only:
    ● Unit
    ● Sum (Either)
    ● Product (Tuple2)
    ● A way to handle recursive types
    type Bit = Either[Unit, Unit]
    type Byte = (Bit, (Bit, (Bit, (Bit, (Bit, (Bit, (Bit, Bit)))))))
    type Option[A] = Either[Unit, A]
    // intuitively: Either[Unit, (A, List[A])]
    type List[A] = Fix[λ[α => Either[Unit, (A, α)]]]

  40. The “A” in ADT (cont’d)
    The same principle applies to our beloved sealed traits and case classes
    sealed trait User
    case class Admin(credentials: String) extends User
    case class Customer(firstName: String, lastName: String, age: Int) extends User

    type Admin_ = String
    type Customer_ = (String, (String, Int))
    type User_ = Either[Admin_, Customer_]

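    The encoding on the two slides above can be checked with a small round trip. This is only a sketch: the helpers `toRepr` and `fromRepr` are hypothetical names that do not appear in the talk, and the wrapping object exists only to make the snippet self-contained.

    ```scala
    object AdtEncoding {
      sealed trait User
      final case class Admin(credentials: String) extends User
      final case class Customer(firstName: String, lastName: String, age: Int) extends User

      // Structural representations, as on the slide
      type Admin_    = String
      type Customer_ = (String, (String, Int))
      type User_     = Either[Admin_, Customer_]

      // Hypothetical helper: case classes to sums and products
      def toRepr(u: User): User_ = u match {
        case Admin(c)          => Left(c)
        case Customer(f, l, a) => Right((f, (l, a)))
      }

      // Hypothetical helper: sums and products back to case classes
      def fromRepr(r: User_): User = r match {
        case Left(c)            => Admin(c)
        case Right((f, (l, a))) => Customer(f, l, a)
      }
    }
    ```

    Since `fromRepr(toRepr(u)) == u` for every `User`, the two functions form the kind of isomorphism the following slides rely on.
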
  41. That’s all we need
    to abstract over
    any possible data
    structure

  42. But that wouldn’t be very convenient

  43. In real-world applications
    We also need some “convenience” constructors for schemas:
    ● Primitive types
    ● Sequences
    ● Records
    ● Unions

  44. Isomorphisms
    Given a Schema[A] and an Iso[A, B], we can build a Schema[B]
    val bit: Schema[Bit] = unit :+: unit
    val bit2Boolean = Iso[Either[Unit, Unit], Boolean]
      { bit => bit.fold(_ => true, _ => false) }
      { bool => if (bool) Left(()) else Right(()) }
    val boolean: Schema[Boolean] = iso(bit, bit2Boolean)

  45. It’s actually
    slightly more
    complicated

  46. Yay! Higher-Kinded Recursion Schemes
    ● Like regular recursion schemes, but the carrier of algebras is of kind * -> *
    ● Functions are replaced by natural transformations
    ● Actually not that big of a deal, but makes one feel smart
    sealed trait SchemaF[S[_], A]
    case class Sum[S[_], A, B](left: S[A], right: S[B]) extends SchemaF[S, A \/ B]
    case class Prod[S[_], A, B](left: S[A], right: S[B]) extends SchemaF[S, (A, B)]
    // etc…

  47. OK... now what?

  48. Where have we got to so far?
    Remember, we want:
    ● A uniform way to abstract over the structure of data ✓
    ● A runtime reification of this abstraction ✓
    ● A method to derive “operations” from this reification ❓

  49. What is an “operation”?
    An operation on A is something equivalent to a function that:
    ● Takes an A as argument
    ● Returns an A
    ● Takes an A and returns an A
    In summary, simply F[A].

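    To make the three bullet points above concrete, here is a hedged sketch of three operations that all have the shape F[A]. The names `Encoder`, `Generator` and `Normalizer` are illustrative assumptions, not types from the talk or from any library.

    ```scala
    object Operations {
      // Takes an A as argument (a consumer of A)
      type Encoder[A] = A => String
      // Returns an A (a producer of A)
      type Generator[A] = () => A
      // Takes an A and returns an A
      type Normalizer[A] = A => A

      // One instance of each, for A = Int
      val encodeInt: Encoder[Int]   = n => n.toString
      val constInt: Generator[Int]  = () => 42
      val absInt: Normalizer[Int]   = n => math.abs(n)
    }
    ```

    In each case the operation is a type constructor applied to A, which is why a single derivation mechanism can target all of them.
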
  50. What is “deriving”?
    Deriving an operation F from a schema is coming up with a function:
    Schema[A] => F[A] for any A
    Such a polymorphic function is called a natural transformation and is written:
    Schema ~> F
    So “deriving F” means “building a Schema ~> F”

  51. And how do we do that?
    Intuitively, a schema is a tree.
    So we fold that tree into an F[_].
    Starting from the leaves (primitive types) we walk back up the tree, combining smaller F[_] into bigger ones.
    For example, when we reach a Prod node we combine the F[A] and F[B] into an F[(A, B)].
    This is typically done by a (higher-kinded) catamorphism of an algebra over a schema

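    The fold described above can be sketched without any recursion-scheme machinery. The following is a deliberately simplified, first-order stand-in and not SchemaZ's actual encoding: `Algebra`, `IntS`, `StringS` and the local `Show` trait are all assumptions made for illustration. The `to` method plays the role of Schema ~> F.

    ```scala
    object SchemaFold {
      // To derive an operation F, supply one case per kind of schema node
      trait Algebra[F[_]] {
        def int: F[Int]
        def string: F[String]
        def prod[A, B](fa: F[A], fb: F[B]): F[(A, B)]
        def sum[A, B](fa: F[A], fb: F[B]): F[Either[A, B]]
      }

      // A simplified schema tree; `to` folds it bottom-up into an F[A]
      sealed trait Schema[A] { def to[F[_]](alg: Algebra[F]): F[A] }
      case object IntS extends Schema[Int] {
        def to[F[_]](alg: Algebra[F]): F[Int] = alg.int
      }
      case object StringS extends Schema[String] {
        def to[F[_]](alg: Algebra[F]): F[String] = alg.string
      }
      final case class Prod[A, B](l: Schema[A], r: Schema[B]) extends Schema[(A, B)] {
        def to[F[_]](alg: Algebra[F]): F[(A, B)] = alg.prod(l.to(alg), r.to(alg))
      }
      final case class Sum[A, B](l: Schema[A], r: Schema[B]) extends Schema[Either[A, B]] {
        def to[F[_]](alg: Algebra[F]): F[Either[A, B]] = alg.sum(l.to(alg), r.to(alg))
      }

      // The operation we derive here: a pretty printer
      trait Show[A] { def show(a: A): String }

      val showAlgebra: Algebra[Show] = new Algebra[Show] {
        def int: Show[Int] = new Show[Int] { def show(a: Int): String = a.toString }
        def string: Show[String] = new Show[String] {
          def show(a: String): String = "\"" + a + "\""
        }
        def prod[A, B](fa: Show[A], fb: Show[B]): Show[(A, B)] = new Show[(A, B)] {
          def show(p: (A, B)): String = "(" + fa.show(p._1) + ", " + fb.show(p._2) + ")"
        }
        def sum[A, B](fa: Show[A], fb: Show[B]): Show[Either[A, B]] = new Show[Either[A, B]] {
          def show(e: Either[A, B]): String = e.fold(a => fa.show(a), b => fb.show(b))
        }
      }

      // Deriving Show for a (String, Int) pair: leaves first, then the Prod node
      val pairShow: Show[(String, Int)] = Prod(StringS, IntS).to(showAlgebra)
    }
    ```

    Deriving another operation (a generator, an equality check) would only require another `Algebra` instance; the schema itself stays untouched.
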
  52. Solving the evolution problem

  54. The evolution problem: recap
    ● Only one version of each type in the code base
    ● Backward compatibility (new nodes read old data)
    ● Forward compatibility (old nodes read new data)
    It’s “just” a matter of coming up with alternative readers.

  55. The evolution strategy
    1. Define a set of backward/forward compatible migration steps
    2. Define other schemas in terms of the current one
    3. Use that to produce an upgrading/downgrading schema
    4. Derive a reader from it

  56. Step 1: Migration steps
    Just an ADT describing b/f compatible migration steps
    sealed trait MigrationStep
    case class AddField[A](name: String, schema: Schema[A], default: A) extends MigrationStep
    case class RenameField(oldName: String, newName: String) extends MigrationStep
    // etc.

  57. Step 2: define migrations
    We use this ADT to define older schemas in terms of the current one
    // The current version can be manually defined or derived at compile-time
    val personV2: Schema[Person] = ???

    val personV1: Schema[Person] =
      Schema
        .upgradingVia(AddField("age", prim(ScalaInt), 0))
        .to(personV2)

    val personV0: Schema[Person] =
      Schema
        .upgradingVia(RenameField("name", "username"))
        .to(personV1)

  58. Step 3: Upgrading/downgrading schemas
    Let’s suppose the current version of Person looks like:
    case class Person(age: Int, username: String, email: String)
    The personV1 upgrading schema from the previous slide could be manually written as:
    val personV1 = iso(
      personV2,
      Iso[(String, String), Person]
        (pair => Person(0, pair._1, pair._2))
        (pers => (pers.username, pers.email))
    )

  59. Step 4: Deriving readers
    Upgrading/downgrading schemas are… just schemas!
    We can derive operations from them like we do with other schemas:
    val personV1Reads: Reads[Person] = personV1.to[play.api.libs.json.Reads]

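    The four steps can be illustrated independently of SchemaZ. In this sketch a reader is modelled as a plain function over a string map; `Raw`, `Reader`, `addField` and `renameField` are hypothetical names chosen for the example, and the real library works on schemas rather than directly on readers.

    ```scala
    import scala.util.Try

    object Migration {
      final case class Person(age: Int, username: String, email: String)

      // Assumption: serialized data is a flat record of string fields
      type Raw = Map[String, String]
      type Reader[A] = Raw => Option[A]

      // Reader for the current version of Person
      val readV2: Reader[Person] = raw =>
        for {
          age      <- raw.get("age").flatMap(s => Try(s.toInt).toOption)
          username <- raw.get("username")
          email    <- raw.get("email")
        } yield Person(age, username, email)

      // Mirrors AddField: older data lacks the field, so fill in the default first
      def addField[A](name: String, default: String)(next: Reader[A]): Reader[A] =
        raw => next(if (raw.contains(name)) raw else raw.updated(name, default))

      // Mirrors RenameField: older data stores the value under the old key
      def renameField[A](oldName: String, newName: String)(next: Reader[A]): Reader[A] =
        raw => next(raw.get(oldName).fold(raw)(v => (raw - oldName).updated(newName, v)))

      // Older readers are defined in terms of the current one
      val readV1: Reader[Person] = addField("age", "0")(readV2)
      val readV0: Reader[Person] = renameField("name", "username")(readV1)
    }
    ```

    There is still a single `Person` type in the code base; only the readers differ per version, which is the point made in the recap slide.
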
  60. Problem solved!!!

  61. Err… well,
    that’s actually
    not that easy

  62. Problem #1: not enough type safety
    ● The « recursion-scheme-y » encoding hides the internal structure
    ● But we need to make sure that a given migration « makes sense »
    ● Do our introduced Isos align with the rest of the schema?

  63. Solution #1: Introduce a phantom type
    ● Tag the schema constructors with a type representing their internal structure
    ● Use that structure to verify stuff at compile time
    ● A migration becomes a function: SchemaZ[R1, A] => SchemaZ[R2, A]
    sealed trait Tagged[R]
    type SchemaZ[Repr, A] = Schema[A] with Tagged[Repr]

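    A rough illustration of the phantom-type idea follows. It uses a plain extra type parameter rather than the `with Tagged[Repr]` trick shown on the slide, and `fieldNames` and `addIntField` are hypothetical names; treat it as a sketch of the principle, not of SchemaZ itself.

    ```scala
    object PhantomTag {
      // Repr is a phantom: no field uses it, it only records structure for the compiler
      final case class SchemaZ[Repr, A](fieldNames: List[String])

      final case class Person(age: Int, username: String, email: String)

      // The older version of Person is structurally a pair of strings
      val personV1: SchemaZ[(String, String), Person] =
        SchemaZ(List("username", "email"))

      // A migration changes Repr (R becomes (Int, R)) but keeps the target type A
      def addIntField[R, A](s: SchemaZ[R, A], name: String): SchemaZ[(Int, R), A] =
        SchemaZ(name :: s.fieldNames)

      // The annotation is checked at compile time against the migration's result type
      val personV2: SchemaZ[(Int, (String, String)), Person] =
        addIntField(personV1, "age")
    }
    ```

    Annotating `personV2` with any other representation type would be rejected by the compiler, which is the kind of check the slide asks for.
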
  64. Problem #2: Scalac doesn’t help (how surprising is that, huh?)
    ● Migrations are in fact dependent functions SchemaZ[R1, A] => SchemaZ[R2, A] where R2 depends on R1.
    ● In the general case, scalac fails to infer R2.
    ● (It even ends up saying stuff like one was not equal to one, charming)

  65. Solution #2: Just give up…
    ● … On solving the general case
    ● Everything works « at the shallowest depth »
    ● You can add/remove/rename fields of a record (resp. branches of a union)
    ● But you cannot change their inner schema
    ● So let’s just force the user to
      ● define their schemas at top-level and
      ● compose schemas using functions

  66. Problem #3: This isn’t practical, at all
    ● Leads to too finely grained definitions
    ● When migrating a schema, you need to redefine all the schemas that depend on it
    ● You end up redefining everything for each version
    ● That’s precisely what we wanted to avoid in the first place

  67. Solution #3: Type-level Schema Registry
    ● Define a Version as a heterogeneous list (acting as a stack) of functions (that construct schemas)
    ● Each such constructor can depend on the results of what’s defined « below »
    ● Perform some implicit wizardry to « weave » these functions together
    ● Voilà!

  68. The end result
    val current = Current
      .schema(
        record(
          "name" -*>: prim(JsonString) :*: "active" -*>: prim(JsonBool),
          Iso[(String, Boolean), User](User.apply)(u => (u.name, u.active))
        )
      ).schema((u: Schema[User]) => … ) // Some Person schema depending on User
    val version0 = current.migrate[User].change(_.addField("name", "John Doe"))
    val personV0 = version0.lookup[Person] // will contain a migrated User

  69. Schemas give us
    a lot of things
    for “free”!

  70. Random data generators
    val personGen = personSchema.to[Gen]

  71. Eq
    val personEq = personSchema.to[Eq]

  72. Ordering
    val personOrd = personSchema.to[Ordering]

  73. Show
    val personShow = personSchema.to[Show]

  74. Forms and UIs
    val personForm = personSchema.to[Form]

  75. Avro
    type Avro[A] = GenericContainer => A
    val personAvro = personSchema.to[Avro]

  76. SQL queries and migrations

  77. Generic data pipelines

  78. Coming soon… SchemaZ!
    These ideas are in active development: https://github.com/spartanz/schemaz
    So far we have:
    ● Schema representation ✓
    ● Derivation mechanism ✓
    ● Migration/evolution
    Ask me anything @ValentinKasas
    Your contribution is very welcome!

  79. Special thanks
    John A De Goes (@jdegoes)
    Dominic Egger (@GrafBlutwurst)

  80. I’m @ValentinKasas
    Solution architect @ 47 Degrees
    We’re Hiring!
    https://47deg.com