Clojure LA: Building type-safe Clojure services with HTTP, JSON and Avro

Building type-safe Clojure services with HTTP, JSON and Avro
(= :Duckula (+ \ \‍♂))
Lukasz Korecki, @lukaszkorecki
Clojure Los Angeles, September 2019

Agenda
- What is EnjoyHQ and how it works
- What problems we’ve run into and how we tried to solve them
- What didn’t work
- What did: our RabbitMQ framework
- Avro
- Duckula
- What’s next
- Questions

EnjoyHQ
- Three applications in one (sort-of):
  - Ingesting data from external systems such as Zendesk, Intercom, Salesforce
  - Collaboration and document authoring platform
  - A search engine and analytics
- “Helping UX research and product teams understand their customers faster”
  - Means reading a lot of written feedback and watching a lot of video recordings of user interviews, testing sessions etc.
  - We help to organize all of that raw data and research findings

EnjoyHQ Web Frontend Router

Connections between components omitted for clarity

The stack
Frontend:
- React + TypeScript
- Ruby on Rails
- Postgres
- RabbitMQ
Backend:
- Clojure
- Compojure
- Component
- Data stores:
  - Postgres
  - Redis
  - RabbitMQ
  - Elasticsearch
  - RethinkDB (going away)

Issues
- It’s not clear what the client needs from the backend when reading
- It’s not clear what the backend needs when the client writes data
- Many services, fewer clients
- Lots of endpoints, no convention (GET /documents vs POST /document/set-subject)
- Documentation vs reality
- “I don’t know what the frontend actually needs” - a backend engineer
- “I need an endpoint where I can do 3 things at once” - a frontend engineer
- We need metrics: it’s hard to consistently instrument endpoints across many services

Early attempts
- Enforcing schemas and types at different levels:
  - Postgres
  - Plumatic Schema in most places
  - pre/post conditions for simple checks (“is the query arg a string?”)
- “Smart” Ring middleware
  - Hard to infer metrics from dynamic routes such as “GET /documents/:id”
- Downside: impossible to document the API automatically, so API docs are written by hand
- Also, Schemas cannot be shared with a client written in a language other than Clojure
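As an illustration of the pre/post conditions mentioned above, here is a minimal sketch using only clojure.core. The function name, its arguments and its return value are made up for the example and are not taken from the EnjoyHQ codebase.

```clojure
(defn search-documents
  "Hypothetical search function guarded by pre/post conditions."
  [query page]
  {:pre  [(string? query)    ; is the query arg a string?
          (pos-int? page)]   ; is the page a positive integer?
   :post [(vector? %)]}      ; the result must be a vector
  ;; Stand-in for a real search call.
  [(str "results for " query) page])

(search-documents "feedback" 1)
;; A call such as (search-documents 42 1) throws an AssertionError.
```

These checks are cheap to write, but a failed condition only produces an AssertionError at runtime; nothing about the contract is visible to a client or to generated documentation.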

Issues
Potential security issue with Schema:

    (require '[schema.core :as s])

    (s/defn some-handler
      [{:keys [db-conn redis-conn] :as component} ; untyped!
       params :- {:query s/Str :page s/Num}]
      (let [{:keys [query page]} params]
        query ; at this point, guaranteed to be a string
        page  ; at this point, guaranteed to be a number (long, int etc)
        #_ ...))

Issues

    (s/defn some-handler
      [component :- s/Any ; <- actually
       params :- {:query s/Str :page s/Num}]
      (let [{:keys [db-conn redis-conn]} component
            {:keys [query page]} params]
        query ; at this point, guaranteed to be a string
        page  ; at this point, guaranteed to be a number (long, int etc)
        #_ ...))

What happened: failed validation errors were reported to our exception tracking software, and Schema included all arguments passed to the function in the exception metadata. DB credential leaks are not fun.
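One way to avoid this kind of leak is to validate only the request params, so that the component never becomes part of the error data. The sketch below uses plain clojure.core instead of Schema; validate-params! and the shape of the component map are hypothetical, and this is not necessarily how the problem was fixed at EnjoyHQ.

```clojure
(defn validate-params!
  "Hypothetical validator: checks only the params map and, on failure,
  throws an exception whose data contains the params and nothing else."
  [params]
  (when-not (and (string? (:query params))
                 (number? (:page params)))
    (throw (ex-info "Invalid params" {:params params}))))

(defn some-handler
  [component params]
  ;; The component (DB and Redis connections) is never passed to the
  ;; validator, so it cannot end up in the reported exception data.
  (validate-params! params)
  {:query (:query params) :page (:page params)})

(def reported-data
  ;; Simulates what an exception tracker would receive on bad input.
  (try
    (some-handler {:db-conn {:password "secret"}} {:query 42 :page 1})
    (catch clojure.lang.ExceptionInfo e
      (ex-data e))))
```

Here reported-data holds only the offending params; the connection details in the component stay out of the exception.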

What worked?

https://github.com/nomnom-insights/nomnom.bunnicula

Bunnicula
Framework for creating systems of RabbitMQ consumers and publishers. Smooths out the setup of RabbitMQ clients, channels, exchanges etc., as well as creating components for asynchronous communication.
https://blog.getenjoyhq.com/bunnicula-asynchronous-messaging-with-rabbitmq-for-clojure/

Bunnicula
- RabbitMQ from day 0, before adopting Clojure
- Bunnicula was built to replace the original Ruby code, and to allow the frontend layer to queue up messages to be processed by the Clojure services
- Built-in logging, metrics and error reporting
- All pluggable via protocols and Components
- Uses JSON by default as the message format
- Borrowed the concept of serializers and deserializers from Kafka, so that JSON can easily be replaced with a different format, for example Transit
- We use Avro (more on that in a minute)

Bunnicula
Simple architecture:
- Consumers are functions which receive the payload and the Components they depend on
- All Consumers emit metrics
- Publishers are a simple function call
- Safety is guaranteed by sharing Avro schemas
- When published, messages are serialized into Avro, guaranteeing that the consumer will receive known input
- Avro is more compact than JSON, so queues take less memory

avro.apache.org

Avro
- Simple:
  - Primitive types (long, string)
  - Complex types (enums, records, maps)
- Part of the Hadoop ecosystem
- Supports recursive schemas
- Plays nicely with Clojure:
  - Schemas can be defined in JSON or EDN
  - Schemas can be reloaded when working in the REPL; a separate compilation step is not required!
- Can be used to safely exchange data between services written in different languages
- https://github.com/nomnom-insights/abracad - our own fork of the original library by Damballa, fixing a couple of long-standing bugs
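To make the "schemas as EDN" point concrete, here is a sketch of an Avro record schema written as plain Clojure data. The record name, namespace and fields are hypothetical. The map only mirrors the JSON form of an Avro schema; turning it into a usable schema object is left to the Avro library and is not shown here.

```clojure
;; Illustrative Avro record schema expressed as Clojure data (EDN).
;; All names below are made up for the example.
(def request-schema
  {:type "record"
   :name "Request"
   :namespace "search.test"
   :fields [;; primitive types
            {:name "query" :type "string"}
            {:name "page" :type "long"}
            ;; complex type: an enum with two allowed symbols
            {:name "sort"
             :type {:type "enum" :name "Sort" :symbols ["asc" "desc"]}}
            ;; union: the value may be null or a string
            {:name "cursor" :type ["null" "string"] :default nil}]})
```

Because the schema is just data in a var, re-evaluating the def in the REPL is enough to pick up a change.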

Why not Protocol Buffers or Thrift?
Even though both Protocol Buffers and Thrift have great Java support, they don’t fit the Clojure workflow very well:
- Schemas are defined in a separate IDL (Avro has its own IDL too)
- Require (re)compiling to Java classes
- Reloading code is not as straightforward
- Need to store auto-generated code
- Most importantly: neither plays well with HTTP and JSON, therefore it’s hard to migrate
  - Designed for service-to-service communication
  - gRPC for web is not a thing (at the moment)

Why not GraphQL?
- Significant departure from the current setup
- One more thing to learn and add to the stack
- We don’t actually care that much about gathering data from multiple services and merging it
- Mutations look somewhat complex

Name TBC

Goals
- HTTP + JSON support
- Out-of-the-box validation of inputs and outputs
- Built-in instrumentation (metrics, logs, exception tracking)
- Enforcing conventions
- Playing nicely with Component
- Standard response format
- Built-in API documentation

Duckula
Effectively RPC over HTTP + JSON
Only POST
Routes tend to follow Clojure namespaces + function names:

    brother-eye.handler.subscriptions/create   ->  POST subscriptions/create
    brother-eye.handler.subscriptions/get-all  ->  POST subscriptions/get-all
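The naming convention above can be sketched as a small function. handler->route is a hypothetical helper written for this illustration; it is not part of Duckula, where routes are declared explicitly in the config.

```clojure
(require '[clojure.string :as str])

(defn handler->route
  "Hypothetical helper: derives a route path from a fully qualified
  handler symbol by keeping the last namespace segment and the
  function name."
  [handler-sym]
  (let [ns-segments (str/split (namespace handler-sym) #"\.")]
    (str (last ns-segments) "/" (name handler-sym))))

(handler->route 'brother-eye.handler.subscriptions/create)
;; => "subscriptions/create"
```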

    (def config
      {:name "test-server-rpc"
       :endpoints {;; composite schemas
                   "/search/test" {:request ["shared/Query" "search/test/Request"]
                                   :response ["shared/Query" "search/test/Response"]
                                   :handler example.handler.search/handler}
                   "/number/multiply" {:request "number/multiply/Request"
                                       :response "number/multiply/Response"
                                       :handler example.handler.number/handler}
                   ;; no validation
                   "/echo" {:handler handler.echo/handler}}})

    (def app (duckula.handler/build config))

POST /number/multiply: the body has to conform to the Request schema, and the handler has to respond with data conforming to the Response schema.

Route definition with validations: schemas are merged together. The handler is just a function receiving the Ring request map.

Validation is optional, but you still get metrics, routing and exception handling.
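Since a handler is just a function of the Ring request map, a handler for an endpoint like /number/multiply might look roughly like the sketch below. The field names (:input, :multiplier, :result), the assumption that the parsed JSON body is available as a map under :body, and the shape of the returned map are all guesses made for illustration; the actual Request and Response schemas are not shown in the slides.

```clojure
(defn multiply-handler
  "Hypothetical handler for POST /number/multiply. Assumes the JSON body
  has already been parsed and validated, and is available as a map
  under :body."
  [request]
  (let [{:keys [input multiplier]} (:body request)]
    {:status 200
     :body {:result (* input multiplier)}}))

(multiply-handler {:request-method :post
                   :uri "/number/multiply"
                   :body {:input 6 :multiplier 7}})
;; => {:status 200, :body {:result 42}}
```

Because the handler is a plain function of a map, it can be exercised from the REPL or a unit test without starting an HTTP server.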

It’s all Ring!
Services backed by Duckula can be used like any other Ring-compatible router. All Ring middlewares are supported. It’s a glorified Ring middleware, so it can be mounted as a sub-route in a Compojure context:

    (require '[compojure.core :as compojure])

    (compojure/defroutes app
      (compojure/GET "/some/endpoint" [] "foo")
      (compojure/context "/rpc-api" []
        (duckula.handler/build (assoc config :prefix "/rpc-api"))))

Short demo

But...

No silver bullets
- Sharing schemas between services is a bit problematic
  - We’re using a git submodule, updated every time we deploy
  - Confluent (the Kafka company) released a schema registry service, but it takes time to integrate it
- Avro errors can be annoyingly bad sometimes:

    Not in union ["null","string"]: {}

  (somewhere we’re passing an empty map where the schema expects either null or a string; that’s all we know)

Roadmap
- Open source it
- HTTP + Avro (there is some support already, but it needs more work)
- OpenAPI/Swagger support:
  - Will generate documentation based on Avro schemas and the config
  - Will allow clients to be generated based on that
- clj-http middleware to make it easier to build service-to-service sync comms

PS
Twirp https://twitchtv.github.io/twirp/docs/intro.html
Twitch.tv project with similar goals:
- HTTP + JSON
- Protocol Buffers IDL to describe services and their input/output formats
- Generates clients and servers based on the Protocol Buffers configuration
- Has a spec: https://twitchtv.github.io/twirp/docs/spec_v6.html
- Announcement blog post echoes a lot of my experiences: https://blog.twitch.tv/twirp-a-sweet-new-rpc-framework-for-go-5f2febbf35f#d991
Used at Twitch and GitHub

Thank you

Questions?
Slides will be available on the #clojure-losangeles channel on Clojurians Slack.
Also my Twitter: @lukaszkorecki
