Clojure LA: Building type-safe Clojure services with HTTP, JSON and Avro

Originally presented at Clojure Los Angeles at Pivotal Labs Santa Monica office.

This talk covers the design and features of Duckula: an approach for building type-safe backend services in Clojure.

Łukasz Korecki

September 10, 2019

Transcript

  1. Building type-safe Clojure services with HTTP, JSON and Avro

    (= :Duckula (+ \ \‍♂))
    Lukasz Korecki, @lukaszkorecki
    Clojure Los Angeles, September 2019
  2. Agenda

    - What is EnjoyHQ and how it works
    - What problems we’ve run into and how we tried to solve them
    - What didn’t work
    - What did: our RabbitMQ framework
    - Avro
    - Duckula
    - What’s next
    - Questions
  3. EnjoyHQ

    - Three applications in one (sort-of)
      - Ingesting data from external systems such as Zendesk, Intercom, Salesforce
      - Collaboration and document authoring platform
      - A search engine and analytics
    - “Helping UX research and product teams understand their customers faster”
      - Means reading a lot of written feedback and watching a lot of video recordings of user interviews, testing sessions etc.
      - We help to organize all of that raw data and research findings
  4. The stack

    Frontend:
    - React + TypeScript
    - Ruby on Rails
    - Postgres
    - RabbitMQ

    Backend:
    - Clojure
    - Compojure
    - Component
    - Data Stores
      - Postgres
      - Redis
      - RabbitMQ
      - Elasticsearch
      - RethinkDB (going away)
  5. Issues

    - It’s not clear what the client needs from the backend when reading
    - It’s not clear what the backend needs when the client writes data
    - Many services, fewer clients
    - Lots of endpoints, no convention (GET /documents vs POST /document/set-subject)
    - Documentation vs reality
    - “I don’t know what the frontend actually needs” - a backend engineer
    - “I need an endpoint where I can do 3 things at once” - a frontend engineer
    - We need metrics - it’s hard to consistently instrument endpoints across many services
  6. Early attempts

    - Enforcing schemas and types at different levels:
      - Postgres
      - Plumatic Schema in most places
      - pre/post conditions for simple checks (“is the query arg a string?”)
      - “Smart” Ring middleware
    - Hard to infer metrics from dynamic routes such as “GET /documents/:id”
    - Downside: impossible to document the API automatically, so API docs are written by hand
    - Also, Schemas cannot be shared with a client written in a language other than Clojure
  7. Issues

    Potential security issue with Schema:

    (require '[schema.core :as s])

    (s/defn some-handler
      [{:keys [db-conn redis-conn] :as component} ; untyped!
       params :- {:query s/Str :page s/Num}]
      (let [{:keys [query page]} params]
        query ; at this point, guaranteed to be a string
        page  ; at this point, guaranteed to be a number (long, int etc)
        #_ ...))
  8. Issues

    (s/defn some-handler
      [component :- s/Any ; <- actually
       params :- {:query s/Str :page s/Num}]
      (let [{:keys [db-conn redis-conn]} component
            {:keys [query page]} params]
        query ; at this point, guaranteed to be a string
        page  ; at this point, guaranteed to be a number (long, int etc)
        #_ ...))

    When validation failed, the error was reported to our exception tracking software, and Schema included all arguments passed to the function in the exception metadata. DB credential leaks are not fun.
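One way to avoid this class of leak is to make sure a validation failure only ever describes the offending parameters. The following is a minimal sketch in plain Clojure, not the approach used in the talk: `validate-params` and its predicate map are hypothetical stand-ins for a schema library.

```clojure
(defn validate-params
  "Hypothetical helper: checks `params` against a map of key -> predicate.
  On failure the ex-data lists only the names of the failing keys, never
  the values of other arguments (such as a component holding DB connections)."
  [spec params]
  (let [failed (vec (for [[k pred] spec
                          :when (not (pred (get params k)))]
                      k))]
    (when (seq failed)
      (throw (ex-info "Invalid params" {:failed-keys failed})))
    params))

(defn some-handler
  "The component is passed through untouched and is never attached to errors."
  [component params]
  (let [{:keys [query page]} (validate-params {:query string? :page number?}
                                              params)]
    {:query query :page page}))
```

Because the component never reaches the validation helper, an exception tracker that records the ex-data sees key names only.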
  9. Bunnicula

    Framework for creating systems of RabbitMQ consumers and publishers. Smooths out the setup of RabbitMQ clients, channels, exchanges etc., as well as creating components for asynchronous communication.
    https://blog.getenjoyhq.com/bunnicula-asynchronous-messaging-with-rabbitmq-for-clojure/
  10. Bunnicula

    - RabbitMQ from day 0, before adopting Clojure
    - Bunnicula was built to replace the original Ruby code, and to allow the frontend layer to queue up messages to be processed by the Clojure services
    - Built-in logging, metrics and error reporting
    - All pluggable via protocols and Components
    - Uses JSON by default as the message format
    - Borrowed the concept of serializers and deserializers from Kafka so that JSON can be easily replaced with a different format, for example Transit
    - We use Avro (more on that in a minute)
  11. Bunnicula

    Simple architecture:
    - Consumers are functions which receive the payload and the Components they depend on
    - All Consumers emit metrics
    - Publishers are a simple function call
    - Safety is guaranteed by sharing Avro schemas
      - When published, messages are serialized into Avro, guaranteeing that the consumer will receive known input
      - Avro is more compact, so queues take less memory
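The "consumers are functions, and all of them emit metrics" idea can be illustrated in plain Clojure. This is a sketch under stated assumptions, not Bunnicula's actual API: the argument order, the `with-metrics` wrapper and the atom-based counter are all hypothetical.

```clojure
(defn with-metrics
  "Hypothetical wrapper: counts successfully processed and failed messages
  in the `metrics` atom, then returns or rethrows as the consumer did."
  [consumer-fn metrics]
  (fn [components payload]
    (try
      (let [result (consumer-fn components payload)]
        (swap! metrics update :ok (fnil inc 0))
        result)
      (catch Exception e
        (swap! metrics update :fail (fnil inc 0))
        (throw e)))))

(defn index-document
  "Example consumer: receives the components it depends on plus the payload."
  [{:keys [store]} payload]
  (swap! store conj payload)
  :ack)

(def metrics (atom {}))

(def consumer (with-metrics index-document metrics))
```

Calling `(consumer {:store (atom [])} {:id 1})` runs the consumer and increments the `:ok` counter; the consumer itself stays unaware of instrumentation.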
  12. Avro

    - Simple
      - Primitive types (long, string)
      - Complex types (enums, records, maps)
    - Part of the Hadoop ecosystem
    - Supports recursive schemas
    - Plays nicely with Clojure
      - Schemas can be defined in JSON or EDN
      - Schemas can be reloaded when working in the REPL - no separate compilation step is required!
    - Can be used to safely exchange data between services written in different languages
    - https://github.com/nomnom-insights/abracad - our own fork of the original library by Damballa, fixing a couple of long-standing bugs
  13. Why not Protocol Buffers or Thrift?

    Even though both Protocol Buffers and Thrift have great Java support, they don’t fit the Clojure workflow very well:
    - Schemas are defined in a separate IDL (Avro has its own IDL too)
    - Require (re)compiling to Java classes
    - Reloading code is not as straightforward
    - Need to store auto-generated code
    - Most importantly: neither plays well with HTTP and JSON, therefore it’s hard to migrate
    - Designed for service-to-service communication - gRPC for web is not a thing (at the moment)
  14. Why not GraphQL?

    - Significant departure from the current setup
    - One more thing to learn and add to the stack
    - We don’t actually care that much about gathering data from multiple services and merging it
    - Mutations look somewhat complex
  15. Goals

    - HTTP + JSON support
    - Out-of-the-box validation of inputs and outputs
    - Built-in instrumentation (metrics, logs, exception tracking)
    - Enforcing conventions
    - Playing nicely with Component
    - Standard response format
    - Built-in API documentation
  16. Duckula

    Effectively RPC over HTTP + JSON
    Only POST
    Routes tend to follow Clojure namespaces + function names:

    brother-eye.handler.subscriptions/create     POST subscriptions/create
    brother-eye.handler.subscriptions/get-all    POST subscriptions/get-all
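The naming convention on this slide can be expressed as a small function. This is only an illustration of the convention; `handler->route` is a hypothetical helper and not part of Duckula, where routes are declared explicitly in the config.

```clojure
(require '[clojure.string :as str])

(defn handler->route
  "Hypothetical helper: derives a route from a fully qualified handler
  symbol by keeping the last namespace segment and the function name."
  [handler-sym]
  (let [last-segment (last (str/split (namespace handler-sym) #"\."))]
    (str last-segment "/" (name handler-sym))))

(handler->route 'brother-eye.handler.subscriptions/create)
;; => "subscriptions/create"
```

Since every route is a static string and the method is always POST, the same string can double as a stable metric name, which sidesteps the problem of inferring metrics from dynamic routes such as GET /documents/:id.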
  17. (def config
        {:name "test-server-rpc"
         :endpoints {;; composite schemas
                     "/search/test" {:request ["shared/Query" "search/test/Request"]
                                     :response ["shared/Query" "search/test/Response"]
                                     :handler example.handler.search/handler}
                     "/number/multiply" {:request "number/multiply/Request"
                                         :response "number/multiply/Response"
                                         :handler example.handler.number/handler}
                     ;; no validation
                     "/echo" {:handler handler.echo/handler}}})

      (def app (duckula.handler/build config))
  18. (def config
        {:name "test-server-rpc"
         :endpoints {;; composite schemas
                     "/search/test" {:request ["shared/Query" "search/test/Request"]
                                     :response ["shared/Query" "search/test/Response"]
                                     :handler example.handler.search/handler}
                     "/number/multiply" {:request "number/multiply/Request"
                                         :response "number/multiply/Response"
                                         :handler example.handler.number/handler}
                     ;; no validation
                     "/echo" {:handler handler.echo/handler}}})

      (def app (duckula.handler/build config))

    POST /number/multiply
    The body has to conform to the Request schema, and the handler has to respond with data conforming to the Response schema.
  19. (def config
        {:name "test-server-rpc"
         :endpoints {;; composite schemas
                     "/search/test" {:request ["shared/Query" "search/test/Request"]
                                     :response ["shared/Query" "search/test/Response"]
                                     :handler example.handler.search/handler}
                     "/number/multiply" {:request "number/multiply/Request"
                                         :response "number/multiply/Response"
                                         :handler example.handler.number/handler}
                     ;; no validation
                     "/echo" {:handler handler.echo/handler}}})

      (def app (duckula.handler/build config))

    Route definition with validations; when several schemas are listed, they are merged together. The handler is just a function receiving the Ring request map.
  20. (def config
        {:name "test-server-rpc"
         :endpoints {;; composite schemas
                     "/search/test" {:request ["shared/Query" "search/test/Request"]
                                     :response ["shared/Query" "search/test/Response"]
                                     :handler example.handler.search/handler}
                     "/number/multiply" {:request "number/multiply/Request"
                                         :response "number/multiply/Response"
                                         :handler example.handler.number/handler}
                     ;; no validation
                     "/echo" {:handler handler.echo/handler}}})

      (def app (duckula.handler/build config))

    Validation is optional, but you still get metrics, routing and exception handling.
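The request/response contract described on these slides can be sketched as ordinary Ring-style middleware. This is a simplified illustration, not Duckula's implementation: plain predicates stand in for Avro schemas, and `wrap-validation`, the `:a`/`:b`/`:result` field names and the 400/500 status codes are assumptions made for the example.

```clojure
(defn wrap-validation
  "Hypothetical middleware: `valid-request?` and `valid-response?` are plain
  predicates standing in for Avro schema checks on the request and response
  bodies."
  [handler valid-request? valid-response?]
  (fn [request]
    (if-not (valid-request? (:body request))
      {:status 400 :body {:error "request does not conform to schema"}}
      (let [response (handler request)]
        (if (valid-response? (:body response))
          response
          {:status 500 :body {:error "response does not conform to schema"}})))))

(defn multiply-handler
  "A handler is just a function of the Ring request map."
  [{:keys [body]}]
  {:status 200 :body {:result (* (:a body) (:b body))}})

(def multiply
  (wrap-validation multiply-handler
                   (fn [body] (and (number? (:a body)) (number? (:b body))))
                   (fn [body] (number? (:result body)))))

(multiply {:body {:a 2 :b 3}})
;; => {:status 200, :body {:result 6}}
```

The handler never sees a body that failed the request check, and a handler that returns malformed data is caught before the response leaves the service.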
  21. It’s all Ring!

    Services backed by Duckula can be used like any other Ring-compatible router. All Ring middlewares are supported. It’s a glorified Ring middleware, so it can be mounted as a sub-route in a Compojure context:

    (compojure/defroutes app
      (compojure/GET "/some/endpoint" [] "foo")
      (compojure/context "/rpc-api" []
        (duckula.handler/build (assoc config :prefix "/rpc-api"))))
  22. No silver bullets

    - Sharing schemas between services is a bit problematic
      - We’re using a git submodule, updated every time we deploy
      - Confluent (the Kafka company) released a schema registry service, but it takes time to integrate it
    - Avro errors can be annoyingly bad sometimes:

      Not in union ["null","string"]: {}

      (somewhere we’re passing an empty map where the schema expects either null or a string; that’s all we know)
  23. Roadmap

    - Open source it
    - HTTP+Avro (there is some support already, but it needs more work)
    - OpenAPI/Swagger support:
      - Will generate documentation based on Avro schemas and the config
      - Will allow for clients to be generated based on that
    - clj-http middleware for making it easier to build service-to-service sync comms
  24. PS: Twirp

    https://twitchtv.github.io/twirp/docs/intro.html
    Twitch.tv project with similar goals:
    - HTTP + JSON
    - Protocol Buffers IDL to describe services and their input/output formats
    - Generates clients and servers based on the Protocol Buffers configuration
    - Has a spec: https://twitchtv.github.io/twirp/docs/spec_v6.html
    - Announcement blog post echoes a lot of my experiences: https://blog.twitch.tv/twirp-a-sweet-new-rpc-framework-for-go-5f2febbf35f#d991

    Used at Twitch and GitHub
  25. Questions?

    Slides will be available on the #clojure-losangeles channel on Clojurians Slack.
    Also my Twitter: @lukaszkorecki