Keep Your Data Safe With Refined Types

Regardless of whether you use a statically or dynamically typed language, specifying your inputs and outputs is a very important step in system design. If you are not surgically precise in defining which data your program takes and produces, you are looking for trouble during the operation phase of the system lifecycle. Making guesses and undeclared assumptions might be easier when writing the code but will certainly bite you as your system lives in production.

Clojure, being dynamically typed, might not give you strong compile-time guarantees. But enforcing the shape of the data on system boundaries allows us to have an untyped data transformation layer and stay sane. Today, we will look into specific Clojure instruments for dealing with strongly shaped data, pitfalls, and hard lessons we’ve learned so far delivering reliable and maintainable systems in Clojure.

Oleksii Kachaiev

June 19, 2018


  1. @Me

    • CTO at Attendify
    • 5+ years with Clojure in production
    • Creator of Muse
    • Aleph & Netty contributor
    • More: protocols, algebras, Haskell, Idris
    • @kachayev on Twitter & GitHub
  2. "No Types" ™ In The Wild

    I don't like the term "dynamic language", but you all know what I mean.
    Almost no compile-time correctness guarantees.
  3. "No Types" ™ In The Wild

    You can still do a lot and go really far.
    Fewer data structures require fewer checks, right?
    Kind of a "banned" topic in the community.
  4. U Y No Types?

    We still need some kind of "types":
    • to model data in advance
    • to validate your data
    Otherwise you'll mess something up quickly.
  5. A choice between “you want to take your pain up front or gradually over time”

    Source: Clojure the Devil… is in the detail
  6. Any data-intensive application built around a model and implemented in a dynamically typed language remains informally defined and requires a number of prayers quadratic in the number of undefined data types. At some point, supporting such a system becomes indistinguishable from magic.
  7. To Take From This Talk

    • undefined data shape = someone's assumption
    • simple when designing, impossible when operating
    • typing data with Int and String doesn't help a lot
    • you don't have to type data transformations (as long as input & output are covered)
  8. When To Validate?

    • RPC request comes in
    • RPC response goes out
    • Reading from & writing to DB (disks, caches etc)
    • Reading from & writing to Kafka (queues, logs etc)
    • And more!
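As a rough illustration of validating only at the boundary, here is a minimal sketch using `schema.core` (it assumes the plumatic/schema library is on the classpath). The schemas, `wrap-validation`, `create-ticket`, and `create-ticket-rpc` are hypothetical names invented for this example; they are not from the talk.

```clojure
(require '[schema.core :as s])

;; Hypothetical request/response shapes, for illustration only.
(def CreateTicketRequest
  {:title s/Str
   :quantity s/Num})

(def CreateTicketResponse
  {:id s/Uuid
   :title s/Str
   :quantity s/Num})

(defn wrap-validation
  "Wraps `handler` so its input and output are checked at the boundary.
  `s/validate` returns the value when it conforms and throws otherwise."
  [handler request-schema response-schema]
  (fn [request]
    (->> request
         (s/validate request-schema)
         handler
         (s/validate response-schema))))

;; The handler itself stays an unchecked data transformation.
(defn create-ticket [request]
  (assoc request :id (java.util.UUID/randomUUID)))

(def create-ticket-rpc
  (wrap-validation create-ticket CreateTicketRequest CreateTicketResponse))
```

The same wrapper shape could sit in front of database or queue readers and writers; only the schemas change.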
  9. (require '[schema.core :as s])

    (def Event
      {:id s/Uuid
       :name s/Str
       :online? s/Bool
       :sits s/Num
       :tickets [{:id s/Uuid
                  :title s/Str
                  :description (s/maybe s/Str)
                  :quantity s/Num
                  (s/optional-key :price) s/Num
                  :status (s/enum :open :closed)}]})
  10. Are Those Even "Types"?

    Checking things at runtime opens a lot of doors!
    Idris is built on the idea that types are values.
    The same goes for your runtime: it's just data!
    You can operate on it as you need.
  11. (def TicketStatus (s/enum :open :closed))

    (def Ticket
      {:id s/Uuid
       :title s/Str
       :description (s/maybe s/Str)
       :quantity s/Num
       (s/optional-key :price) s/Num
       :status TicketStatus})

    (def Event
      {:id s/Uuid
       :name s/Str
       :online? s/Bool
       :sits s/Num
       :tickets [Ticket]})

    (def CreateTicketRequest (dissoc Ticket :id))
  12. (s/check Ticket
        {:id 42
         :title "Early Bird"
         :text "Some randomness"
         :quantity 100
         :status :skipped})
    ;; => {:id (not (instance? java.util.UUID 42))
    ;;     :description missing-required-key
    ;;     :status (not (#{:open :closed} :skipped))
    ;;     :text disallowed-key}
  13. Our Goals Are

    • Readability and Soundness (harder than it seems)
    • Being as precise as we can
    • Avoid as many bugs as possible
    • Provide clean and useful error messages
    • Keep serialization and business logic separated
  14. How To Define Optional? #1: s/maybe

    (def TicketDescription
      {:description (s/maybe s/Str)})

    ;; works
    {:description nil}
    {:description "A lot of free places!"}

    ;; doesn't
    {:description 1457}
    {:text "Really a lot!"}
  15. How To Define Optional? #1: s/maybe

    Looks good! It's "type-safe" and it's functional! (functor, wheeee)
    But it's still error-prone ☹
  16. How To Define Optional? #1: s/maybe

    (let [draft (read-from-db db-conn ticket-id)]
      (rpc/call "createFreeTicket"
                {:id ticket-id
                 :title (:title draft)
                 ;; :decsription is misspelled, so the lookup yields nil,
                 ;; and a nil description can still pass s/maybe
                 :description (sanitize-html (:decsription draft))
                 :price default-price
                 :quantity 100}))
  17. Tolerant-Reader They Say

    ;; clojure.spec has an "open key space" design,
    ;; meaning unknown keys are okay
    ;; combine with nil-able values
    (rpc/call "createFreeTicket"
              {:id ticket-id
               :title (:title draft)
               ;; the misspelled key is accepted as just another unknown key
               :decsription (sanitize-html (:description draft))
               :price default-price
               :quantity 100})
    ;; welcome to data hell
    ;; :trollface:
  18. How To Define Optional? #2: optional-key

    (def TicketDescription
      {(s/optional-key :description) s/Str})

    ;; now you have more work to do
    (cond-> ticket
      (do-i-have-description?) (assoc :description "This would be amazing!"))

    (harder to mess up, but still...)
  19. How To Define Optional? #3: All The Above!

    (def TicketDescription
      {(s/optional-key :description) (s/maybe s/Str)})

    ;; hmm... now you can pass whatever you want!
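To make the "whatever you want" point concrete, this small sketch (assuming plumatic/schema is available) checks the three ways of saying "no description" or "some description" against the combined schema. `s/check` returns `nil` when a value conforms, so all three shapes are accepted and a consumer has to handle each of them.

```clojure
(require '[schema.core :as s])

(def TicketDescription
  {(s/optional-key :description) (s/maybe s/Str)})

;; s/check returns nil when the value conforms to the schema
(s/check TicketDescription {})                                ; key absent
(s/check TicketDescription {:description nil})                ; key present, value nil
(s/check TicketDescription {:description "Free drinks!"})     ; key present, value set

;; a non-string description is still rejected
(s/check TicketDescription {:description 1457})
```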
  20. How To Define Optional? #3: All The Above!

    Your API users will beg you for this! It's so super flexible!
    Please, just don't.
  21. How To Define Optional?

    Being "type-safe" is not a goal.
    Being "functional" is not a goal.
    Our goal is to reduce the number of errors.
  22. Define Optional With Own "Void"

    ;; domain specific voided value
    (def UnlimitedPurchase {:limited? (s/eq false)})
    (def PurchaseLimit {:limited? (s/eq true) :limit s/Num})

    ;; generic voided value
    (def NoTicketDescription {:description {:nothing (s/eq true)}})
    (def TicketDescription {:description {:just s/Str}})
  23. Be Precise!

    (def PositiveInt
      (s/constrained s/Int pos? 'should-be-positive))

    (def NonEmptyStr
      (s/constrained s/Str
                     #(not (clojure.string/blank? %))
                     'should-not-be-blank))
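A short usage sketch for the two refinements above, assuming plumatic/schema is on the classpath. The `Seat` schema is a hypothetical example added only to show how the constrained schemas behave under `s/check`.

```clojure
(require '[schema.core :as s]
         '[clojure.string :as str])

(def PositiveInt
  (s/constrained s/Int pos? 'should-be-positive))

(def NonEmptyStr
  (s/constrained s/Str #(not (str/blank? %)) 'should-not-be-blank))

;; Hypothetical schema that uses both refinements.
(def Seat
  {:row PositiveInt
   :label NonEmptyStr})

;; conforms, so s/check returns nil
(s/check Seat {:row 3 :label "A"})

;; both values have the right base type but violate their predicates,
;; so s/check returns an error map keyed by :row and :label
(s/check Seat {:row 0 :label "  "})
```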
  24. Combine Things!

    (defn BoundedListOf [dt left right]
      (s/constrained [dt]
                     #(<= left (count %) right)
                     'collection-length-should-conform-boundaries))

    {:id s/Uuid
     :name NonEmptyStr
     :online? s/Bool
     :sits PositiveInt
     :tickets (BoundedListOf Ticket 1 25)}
  25. Express Business Rules

    (def -Event
      {:id s/Uuid
       :name s/Str
       :online? s/Bool
       :sits s/Num
       :tickets [Ticket]})

    (def Event
      (-> -Event
          (s/constrained
            (fn [{:keys [sits tickets]}]
              (>= sits (apply + (map :quantity tickets))))
            'tickets-quantities-should-not-exceed-sits-count)
          (s/constrained ...)))
  26. Sum Types

    data Result a b = Ok a | Error b

    enum Result<T, E> {
        Ok(T),
        Error(E),
    }

    type result('good, 'bad) =
      | Ok('good)
      | Error('bad);
  27. Sum Types In Clojure

    (defn Result [ok error]
      (s/either {:ok ok} {:error error}))

    WARN: Deprecated!
  28. Sum Types In Clojure

    (personal opinion)
    • schema is designed for validation, not modeling
    • No "difference by construction"
    • Easy to mess up
  29. Just Specify Discriminator

    (defn Result [ok error]
      (s/conditional
        #(contains? % :ok) {:ok ok}
        #(contains? % :error) {:error error}))

    (better, but still...)
  30. This Is Bad :(

    (def Ticket
      {:id Id
       :type (s/enum "free" "paid")
       :name NonEmptyStr
       :quantity (TypedRange int 1 1e4)
       :description (Maybe NonEmptyStr)
       (s/optional-key :priceInCents) PositiveInt
       (s/optional-key :taxes) [Tax]
       (s/optional-key :fees) (s/enum :absorb :pass)
       :status (s/enum :open :closed)})
  31. Way Better!

    (def FreeTicket
      {:id Id
       :type (s/eq "free")
       :title NonEmptyStr
       :quantity (TypedRange int 1 1e4)
       :description (Maybe NonEmptyStr)
       :status (s/enum :open :closed)})

    (def PaidTicket
      (assoc FreeTicket
             :type (s/eq "paid")
             :priceInCents PositiveInt
             :taxes [Tax]
             :fees (s/enum :absorb :pass)))
  32. After Cosmetic Changes...

    (def Ticket
      (s/conditional
        #(= "free" (:type %)) FreeTicket
        #(= "paid" (:type %)) PaidTicket))

    turned into

    (def Ticket
      (dispatch-on :type
        "free" FreeTicket
        "paid" PaidTicket))
  33. (def EmptyScrollableList
      {:items (s/eq [])
       :totalCount (s/eq 0)
       :hasNext (s/eq false)
       :hasPrev (s/eq false)
       :nextPageCursor (s/eq nil)
       :prevPageCursor (s/eq nil)})

    (defn NonEmptyScrollableList [dt]
      (dispatch-on (juxt :hasNext :hasPrev)
        [false false] (SinglePage dt)
        [true false] (FirstPage dt)
        [false true] (LastPage dt)
        [true true] (ScrollableListSlice dt)))

    (defn ScrollableList [dt]
      (dispatch-on :totalCount
        0 EmptyScrollableList
        :else (NonEmptyScrollableList dt)))
  34. (def -Ticket
      {:id s/Uuid
       :title s/Str
       :description (s/maybe s/Str)
       :quantity s/Num
       (s/optional-key :price) s/Num
       :status (s/enum :open :closed)})

    (def Ticket
      (s/constrained -Ticket
                     (fn [{:keys [quantity status]}]
                       (or (= :closed status) (< 0 quantity)))))

    (def CreateTicketRequest (dissoc Ticket :id :status))
  35. So Far So Good?

    (s/check CreateTicketRequest
             {:title "Works?"
              :description "Probably not :("
              :quantity 10})
    ;;=> {:id missing-required-key, :status missing-required-key}

    ;; but why?
    (class -Ticket)
    ;;=> clojure.lang.PersistentArrayMap
    (class Ticket)
    ;;=> schema.core.Constrained
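The reason is that `s/constrained` returns a record wrapping the map, and `dissoc` of a key that is not one of the record's own fields leaves the wrapped schema untouched. One possible workaround with plain schema, sketched below under the assumption that plumatic/schema is on the classpath, is to reduce the underlying map before constraining it. `BrokenRequest` is a hypothetical name used only to contrast the two approaches, and note that the constraint is not carried over to the reduced schema.

```clojure
(require '[schema.core :as s])

(def -Ticket
  {:id s/Uuid
   :title s/Str
   :description (s/maybe s/Str)
   :quantity s/Num
   (s/optional-key :price) s/Num
   :status (s/enum :open :closed)})

(def Ticket
  (s/constrained -Ticket
                 (fn [{:keys [quantity status]}]
                   (or (= :closed status) (< 0 quantity)))))

;; dissoc on the Constrained record does not reach the wrapped map,
;; so :id and :status are still required
(def BrokenRequest (dissoc Ticket :id :status))

;; dissoc on the plain map removes the keys as intended,
;; at the cost of dropping the constraint attached to Ticket
(def CreateTicketRequest (dissoc -Ticket :id :status))

(s/check CreateTicketRequest
         {:title "Works?"
          :description "Now it does"
          :quantity 10})
```

Any constraint that still applies has to be re-attached by hand after each such reduction.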
  36. schema-refined

    • https://github.com/KitApps/schema-refined
    • schema on steroids (a lot of them)
    • refined: constrained on steroids
    • Struct: product types (maps) on steroids
    • StructDispatch: conditional on steroids
  37. Predicates

    ;; "manually" with refined and predicates
    (def LatCoord (r/refined double (r/OpenClosedInterval -90.0 90.0)))

    ;; the same using built-in types
    ;; (or functions to create types from other types, a.k.a. generics)
    (def LngCoord (r/OpenClosedIntervalOf double -180.0 180.0))

    ;; Product type using a simple map
    (def GeoPoint {:lat LatCoord :lng LngCoord})

    ;; using built-in types
    (def Route (r/BoundedListOf GeoPoint 2 50))
  38. Now What?

    (def input [{:lat 48.8529 :lng 2.3499}
                {:lat 51.5085 :lng -0.0762}
                {:lat 40.0086 :lng 28.9802}])

    ;; Route now is a valid schema,
    ;; so you can use it as any other schema
    (schema/check Route input)
  39. Predicates... More!

    (def InZurich
      {:lat (r/refined double (r/OpenInterval 47.34 47.39))
       :lng (r/refined double (r/OpenInterval 8.51 8.57))})

    (def InRome
      {:lat (r/refined double (r/OpenInterval 41.87 41.93))
       :lng (r/refined double (r/OpenInterval 12.46 12.51))})
  40. Predicates... Compose!

    ;; you can use schemas as predicates
    ;; First/Last are good examples of predicate "generics"
    (def RouteFromZurich (r/refined Route (r/First InZurich)))
    (def RouteToRome (r/refined Route (r/Last InRome)))

    ;; And, Or, Not, On
    (def RouteFromZurichToRome
      (r/refined Route (r/And (r/First InZurich) (r/Last InRome))))
  41. Predicates... Compose!

    ;; or even more
    (def FromZurichToRome (r/And (r/First InZurich) (r/Last InRome)))

    (defn LessNHops [n]
      (r/BoundedSize 2 (+ 2 n)))

    (def RouteFromZurichToRomeWithLess3Hops
      (r/refined Route (r/And FromZurichToRome (LessNHops 3))))
  42. Readability Matters. A Lot

    ;; following the rule
    ;; {v: T | P(v)}

    (def Coord (refined double (OpenClosedInterval -180.0 180.0)))
    ;; #Refined{v: double | v ∈ (-180.0, 180.0]}

    (def QuickRoute (BoundedListOf double 2 4))
    ;; #Refined{v: [double] | (count v) ∈ [2, 4]}

    (refined [double] (Rest (OpenInterval 0 1)))
    ;; #Refined{v: [double] | ∀v' ∊ (rest v): v' ∈ (0, 1)}
  43. (def -FreeTicket
      (Struct :id Id
              :type (s/eq "free")
              :title NonEmptyStr
              :quantity (OpenIntervalOf 1 1e4)
              :description (s/maybe NonEmptyStr)
              :status (s/enum :open :closed)))

    (def FreeTicket
      (guard -FreeTicket '(:quantity :status) enough-sits-when-open))
    ;; #<StructMap {:description (constrained Str should-not-be-blank)
    ;;              :type (eq "free")
    ;;              :title (constrained Str should-not-be-blank)
    ;;              :status (enum :open :closed)
    ;;              :id java.lang.String
    ;;              :quantity (constrained int should-be-bounded-by-range-given)}
    ;;   Guarded with
    ;;     enough-sits-when-open over '(:quantity :status)>
  44. Carry Guards Carefully

    (def -PaidTicket
      (assoc FreeTicket
             :type (s/eq "paid")
             :priceInCents PositiveInt
             :taxes [Tax]
             :fees (s/enum :absorb :pass)))

    (def PaidTicket
      (guard -PaidTicket '(:taxes :fees) pass-tax-included))
    ;; #<StructMap {...}
    ;;   Guarded with
    ;;     enough-sits-when-open over '(:quantity :status)
    ;;     pass-tax-included over '(:taxes :fees)>
  45. Respectful Sum Type

    (def Ticket
      (StructDispatch :type
                      "free" FreeTicket
                      "paid" PaidTicket))
    ;; #<StructDispatch on '(:type):
    ;;     free => {...}
    ;;     paid => {...}>

    (def CreateTicketRequest (dissoc Ticket :id :status))
    ;; this works!
  46. Track Guards Applicability

    (dissoc PaidTicket :status)
    ;; #<StructMap {...}
    ;;   Guarded with
    ;;     pass-tax-included over '(:taxes :fees)>
    ;; (only one guard left)
  47. Catch Modeling Issues In Advance

    (def CreateFreeTicket (dissoc Ticket :type))
    ;; CompilerException java.lang.IllegalArgumentException:
    ;; You are trying to dissoc key ':type'
    ;; that is used in dispatch function.
    ;; Even thought it's doable theoretically,
    ;; we are kindly encourage you
    ;; to avoid such kind of manipulations.
    ;; Otherwise it's gonna be a mess.
    ;; , compiling:(form-init467997445288647843.clj:1:23)
  48. Philosophical

    Extending a type (assoc, merge etc) is simpler
    • to implement
    • to grasp mentally
    We still fully support reduction (dissoc).
    "Request" types are a perfect use case.
  49. What's Inside StructMap

    ;; potemkin's helper creates a type that acts like a map
    (def-map-type StructMap
      [data    ;; <- key/value pairs themselves
       guards  ;; <- guards appended
       mta]    ;; <- meta information
      (meta [_] ...)
      (with-meta [_ m] ...)
      (keys [_] ...)
      (assoc [_ k v] ...)
      (dissoc [_ k] ...)
      (get [_ k default-value] ...))
  50. What's Inside

    (def-map-type StructDispatchMap
      [keys-slice
       downstream-slice  ;; <- keys slices collected from options
       dispatch-fn
       options
       guards
       updates           ;; <- delayed assoc, dissoc operations
       mta]
      ...)
  51. Put Everything Together

    (extend-type StructMap ;; same for StructDispatch
      Guardable
      (append-guard [^StructMap this guard])
      (get-guards [^StructMap this])

      s/Schema
      (spec [this] this)
      (explain [^StructMap this])

      schema-spec/CoreSpec
      (subschemas [^StructMap this])
      (checker [^StructMap this params]))

    (defmethod print-method StructMap ;; same for StructDispatch
      [^StructMap struct ^java.io.Writer writer])
  52. What Do We Have Now?

    • Fewer tests (way-way-way fewer)
    • Fewer bugs (way-way-way fewer)
    • More confidence
    • Better sleep
  53. Error Messages

    • Clean & friendly errors are hard
    • You should invest a lot from the very beginning
    • ... just this to make it happen
    • Context sensitivity is super useful
    • Machines and humans crave different messages
  54. More Features

    • Separated "business" and "serialization" logic
    • Catch more "unreasonable" predicates
    • Support for "generics"
    • ... functions are not always the best fit