Clojure at Attendify (2nd ed)

We will try to compact 4+ years and 220k+ lines of code (running in production right now) into a couple of hours of informal conversation, covering both success stories and pitfalls and featuring almost all the “buzzwords”: microservices, event sourcing, Kafka, property-based testing, GraphQL and others. We have a lot to talk about but are still looking forward to an open conversation. So… all your questions like “why don’t you use core.async” or “when will you switch to spec” are highly appreciated (and desirable).

Oleksii Kachaiev

November 14, 2017

Transcript

  1. @ Alexey Kachayev, Ivan Kryvoruchko | November, 2017

  2. Who We Are

    • 3k+ mobile apps for events
    • Content management system
    • Social network for attendees
    • Leads retrieval mobile application
    • Marketing automation platform
    • Event registration system (in development)
  3. In Figures

    • 9K+ events with 500K+ speakers
    • 3.7M+ mobile devices
    • 1.7M+ users shared 1.4M posts with 11.3M+ likes
    • Mobile API with 19K req/min at peak
    • 118 repos and 25+ services (micro!)
    • 4B+ messages in Kafka (weekly)
    • 360M+ documents in ElasticSearch
  4. What Do We Use Clojure For?

    • For everything
    • No, seriously
  5. Elaborate?

    • Application servers
    • Data pipeline
    • ETL jobs
    • Internal tools
    • UI
    • Scheduling, talking to databases, data manipulation, … you name it
  6. Clojure at Attendify

    • Main & default language
    • 4+ years in production
    • 12 engineers (and hiring)
    • CLOC: 112K+ Clojure, 78K+ ClojureScript
    • It’s still a technology, not magic
  7. Why Do We Use Clojure?

    • Because we like it
    • No, seriously
  8. Elaborate?

    • Clojure = ideas + access to Java libraries and JVM
      - immutability, epochal time model, atoms, transformation chains, EDN, uniform domain modeling, declarative approach, runtime polymorphism with custom dispatch etc
    • It’s easier to use Clojure ideas in Clojure
      - no, seriously
      - “haskell in python” << “python” & “haskell in python” << “haskell”
      - btw, “java in clojure” >> “java”
      - discipline & convention don’t work, language core does
  9. Agenda

    • Libraries & ecosystem
    • Project setup & dev tools
    • Servers & microservices
    • HTTP
    • GraphQL
    • Data: PG, Kafka and others
    • Errors handling: why do you need to
    • Property based testing
  10. Libraries & ecosystem

    • A lot of stuff
      - Clojars, Clojure Toolbox, ClojureWerkz
    • You still have Java
      - There are a lot of wrappers, but when it’s necessary…
      - You still have tooling: GC logs, OOM analyzers, profilers and more
    • You still work with Maven
      - lein, wagon, checkouts
    • Lisp-style ecosystem
      - A lot of libraries for each task (including not-very-production-ready ones)
  11. Project Setup

    • Configuration is always tricky
      - environ library, .lein-env-dev files
    • components
      - Manage (stateful) world around you
      - Show code dependencies
    • dev/user.clj
      - With (go) and (reset)
    • nREPL
      - Try ultra (plugin for lein)
  13. Project Setup: component

    (defrecord CSVExporter []
      component/Lifecycle
      (start [component]
        (if (:executor component)
          component
          (assoc component :executor (e/fixed-thread-executor threads-count))))
      (stop [component]
        (when-let [executor (:executor component)]
          (.shutdownNow executor))
        (assoc component :executor nil)))

    Callouts: prepare (start) • cleanup (stop)
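
The idempotent start/stop shape on this slide can be tried without the component library. What follows is a minimal sketch under stated assumptions: the `Lifecycle` protocol here is a hypothetical stand-in for `component/Lifecycle`, and a plain JDK fixed thread pool replaces `e/fixed-thread-executor`.

```clojure
(ns sketch.lifecycle
  (:import (java.util.concurrent Executors ExecutorService)))

;; Hypothetical stand-in for component/Lifecycle.
(defprotocol Lifecycle
  (start [this])
  (stop [this]))

(defrecord CSVExporter [threads-count executor]
  Lifecycle
  (start [this]
    ;; Starting an already started component must not create a second pool.
    (if executor
      this
      (assoc this :executor (Executors/newFixedThreadPool (int threads-count)))))
  (stop [this]
    ;; Stopping is safe to call on a component that was never started.
    (when executor
      (.shutdownNow ^ExecutorService executor))
    (assoc this :executor nil)))
```

Calling `start` twice returns a component holding the same executor, which is what makes `(reset)` in the REPL safe to repeat.
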
  15. Project Setup: component

    (component/system-map
      …
      :csv-exporter
      (component/using
        (export/new-csv-exporter
          {:immediate-export-limit csv-immediate-export-limit
           :immediate-export-timeout csv-immediate-export-timeout
           :threads-count csv-export-threads-count})
        [:s3-client :email-notifier :profiles-search-handler
         :elastic-client :settings-event-store])
      … )

    (e/export csv-exporter account-id other-params)

    Callouts: component • config • deps • convention
  17. Project Setup: REPL

    $ lein repl
    nREPL server started on port 63248 on host 127.0.0.1 - nrepl://127.0.0.1:63248
    user=> (go)
    :started
    user=> (keys system)
    (:mailchimp-client :app-component :kafka-producer …)
    user=> (:s3-client system)
    #augustine.s3.S3Client{:url …}
    user=> (reset)
    :reloading (…)
    :resumed
    user=>

    Callout: try this once
  18. Servers & Microservices

    • RPC and GraphQL, not REST
      - JSON-RPC for clients and servers, nippy for inter-server communication
    • aleph: smart wrapper for Netty
      - With ring & compojure (migrating to aleph only)
    • manifold, not core.async
      - Futures & chaining, executors, scheduling
    • Own library to define/use RPC
      - There are a few open-source libraries, like slacker and castra, tho’
  20. Servers & Microservices

    (defservice "audience.fetchTimeline"
      :allowed-auth #{:builder}
      :validate-params {:profileId s/Str
                        (s/optional-key :includeTypes) [s/Str]
                        (s/optional-key :pageSize) s/Int}
      :response-format Timeline
      (fn [{:keys [timeline-handler]} {:keys [profileId]}]
        (timeline/timeline-entries …)))

    Callouts: macro • method name • component(s) • call params • schema enforced • check when :debug?
  21. Servers & Microservices

    (rpc/call http-client
      {:method "audience.fetchTimeline"
       :params {:profileId "42"}})

  23. Servers & Microservices

    (rpc/call http-client
      {:method "audience.fetchTimeline"
       :params {:profileId "42"}
       :codec :nippy
       :idempotent? true})

    Callouts: returns manifold.deferred • component with aleph.http • retries policy • request/response serializer
  24. Servers & Microservices

    • defmulti rpc/execute
    • defservice compiles down to defmethod
    • ring route to call rpc/execute
    • ring handlers to take care of
      - Logging
      - Errors handling & reporting
      - Serialization format (JSON by default)
  25. Servers & Microservices

    (defservice "audience.fetchTimeline"
      :allowed-auth #{:builder}
      :validate-params {:profileId s/Str
                        (s/optional-key :includeTypes) [s/Str]
                        (s/optional-key :pageSize) s/Int}
      :response-format Timeline
      :dispatch-on rpc/execute
      (fn [{:keys [timeline-handler]} {:keys [profileId]}]
        (timeline/timeline-entries …)))

    Callouts: :dispatch-on rpc/execute • “magic”
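
The "defservice compiles down to defmethod" idea can be sketched in a few lines. This is a heavily reduced, hypothetical macro, not the library from the talk: it only registers a handler on a multimethod that dispatches on the method name, and it leaves out auth, schema validation and response checks. The `"math.add"` service is invented for illustration.

```clojure
(ns sketch.rpc)

;; One multimethod dispatches every RPC call on its method name.
(defmulti execute (fn [_system request] (:method request)))

(defmethod execute :default [_system request]
  {:error (str "unknown method: " (:method request))})

;; Hypothetical, much-reduced defservice: it expands to a defmethod
;; that passes the system (components) and the call params to the handler.
(defmacro defservice [method-name handler]
  `(defmethod execute ~method-name [system# request#]
     {:result (~handler system# (:params request#))}))

(defservice "math.add"
  (fn [_system {:keys [a b]}] (+ a b)))
```

A single ring route can then hand every decoded request to `execute`, so adding a service never touches the routing table.
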
  27. manifold || core.async

    • core.async is opinionated
      - a beautiful idea fighting an ugly reality
      - pretending everything is fine
    • manifold forces you to make decisions
      - reality is ugly, find a way to deal with it
      - good luck, at least you know the truth!

    More: https://github.com/ztellman/manifold/blob/master/docs/rationale.md
  28. manifold || core.async

    • Concurrency is (still) hard
    • Implementing a server you (still) care about
      - Capacity planning
      - Interruptions and termination logic
      - Exceptions
      - Timeouts
      - Tail latency during spikes
      - Etc etc etc
  29. HTTP with Aleph

    • Hundreds of millions of HTTP requests daily
      - It’s hard, no matter what technology you use
    • “Upgraded” aleph server:
      - backpressure with no “waiters”, using j.u.c.Semaphore
      - custom thread pool for RPC task execution
      - a lot of (smaller) thread pools so as not to interfere with the server
      - “manual” j.u.c.RejectedExecutionException handlers instead of the ones built into aleph
      - graceful shutdown
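
Backpressure with no "waiters" means a request either gets a permit immediately or is rejected; nothing queues. A minimal sketch of that idea, assuming a synchronous ring-style handler (with aleph the permit would have to be released when the returned deferred completes) and a hypothetical `wrap-max-concurrent` middleware name:

```clojure
(ns sketch.backpressure
  (:import (java.util.concurrent Semaphore)))

(defn wrap-max-concurrent
  "Rejects immediately with 503 once `limit` requests are in flight."
  [handler limit]
  (let [semaphore (Semaphore. (int limit))]
    (fn [request]
      ;; tryAcquire never blocks, so no caller waits for a permit.
      (if (.tryAcquire semaphore)
        (try
          (handler request)
          (finally (.release semaphore)))
        {:status 503 :body "too many concurrent requests"}))))
```

Rejecting early keeps latency bounded during spikes, and a client with a retry policy can treat the 503 as a signal to back off.
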
  32. HTTP with Aleph: Own Wrapper

    (let [rpc-thread-count (atom 0)
          rpc-thread-factory (e/thread-factory
                               #(format "augustine-rpc-%s-%d" project (swap! rpc-thread-count inc))
                               (deliver (promise) nil))
          ;; unbounded
          rpc-executor (e/utilization-executor 0.9 512 {:thread-factory rpc-thread-factory})
          semaphore (Semaphore. (or max-concurrent-requests 1024))]

    Callouts: to use our names • separate executor (not default) • pretty hard control • (deliver (promise) nil)
  33. HTTP with Aleph

  34. HTTP with Aleph

    • Using aleph client
      - Moving on with a default executor is (generally) a bad idea
      - But we had been doing this for 2+ years tho’
    • “Upgraded” aleph client:
      - Accepts more than one host to deal with server failures
      - Retry policy that understands backpressure control, idempotency flags and connection errors
      - multipart/form-data encoding (back-ported)
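
A retry policy that respects idempotency flags and connection errors can be sketched as follows. Everything here is an assumption for illustration: `send!` is a hypothetical function of host and request, connection errors are modelled as `java.io.IOException`, and backpressure handling is left out.

```clojure
(ns sketch.retry)

(defn call-with-retries
  "Tries `hosts` in order. Moves on to the next host only when the
  request is idempotent and the failure was a connection error."
  [send! hosts {:keys [idempotent?] :as request}]
  (loop [[host & more] hosts]
    ;; recur is not allowed inside try, so capture the outcome first.
    (let [outcome (try
                    {:ok (send! host request)}
                    (catch java.io.IOException e
                      {:failed e}))]
      (cond
        (contains? outcome :ok)      (:ok outcome)
        (and idempotent? (seq more)) (recur more)
        :else                        (throw (:failed outcome))))))
```

A non-idempotent request fails on the first connection error, because the client cannot know whether the server already acted on it.
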
  35. HTTP with Aleph

    • A good HTTP client is harder than you might think
      - Connection pooling, dealing with keep-alive, embracing server-side interruptions
      - Unexpected server behavior (who reads RFCs anyway?)
      - A lot of different (!) timeouts (connection, request sending, response first byte, response headers, chunks, body etc)
      - Essential complexity from a library point of view
  38. HTTP with Aleph

    ¯\_(ツ)_/¯ (╯°□°)╯︵ ┻━┻

    Code: https://github.com/ztellman/aleph/blob/2ba3484ead3a8443667d3a3caddf9739c6841d9c/src/aleph/http.clj

  39. GraphQL

    • Query language for APIs
    • Using it for the Attendify Leads mobile application
    • Own wrapper on top of graphql-java
    • Everything is data ¯\_(ツ)_/¯
  40. GraphQL

    • There are lacinia, graphql-clj and others
    • Libraries solve the least of your problems: parsing & validating requests
    • Key problem is dealing with client-side flexibility and loading data
    • Lucky us: most queries are compiled to SQL
    • Ideally: use muse to figure out a way to fetch data
  41. Data

    • korma & hikari to work with PostgreSQL
      - Not really happy
    • Java API for Kafka producers/consumers
      - Own wrapper, but there are a few open-source ones, like kinsky & clj-kafka
    • carmine to work with Redis
      - Used to use the built-in tasks/jobs queue, not anymore
    • Event sourcing with our own library
      - There are a few open-source libraries, like rill and cqrs-server, tho’
  42. Data: Korma (In Theory)

    (defentity applications
      (table "application")
      (pk :id)
      (transform #(update % :name capitalize)))

    (defn select-app [{:keys [conn]} app-id]
      (-> (select applications
            (db conn)
            (fields :name :icon)
            (where {:id app-id}))
          first))
  44. Data: Korma (In Practice)

    (defn fetch-profile-with-sources* [db apikey id]
      (->> (models/execute-query
             db
             [(format "SELECT ps.*,
                              ss.integration_id as source_integration_id,
                              ss.remote_id as source_remote_id,
                              ss.payload as source_payload,
                              ss.is_deactivated as source_is_deactivated
                       FROM %s ps
                       LEFT OUTER JOIN %s ss on (ps.id = ss.profile_id)
                       WHERE ps.apikey = ? AND ps.id = ?"
                      (tables/route tables/profile-tbl apikey)
                      (tables/route tables/profile-source-tbl apikey))
              [apikey id]]
             {:raw-results? true})
           (functor/fmap
             (fn [profiles]
               (->> profiles
                    utils/aggregate-profiles
                    (map transform-profile)
                    first
                    (#(dissoc % :password)))))))

    Callouts: query, not DSL • component, not entities • either • manual transform • partitioning
  45. Data: Korma (In Practice)

    • Connection pooling
    • Sophisticated queries
    • Serialization handling (array, hstore, jsonb)
    • Transactions, retries, rollbacks
    • java.sql.SQLException
    • Table partitioning
  46. Data: Kafka Consumer

    • Internal library
    • Manage threads, subscribe/unsubscribe
    • Handle serialization
    • Exceptions, errors
    • Commit offsets
  48. Data: Kafka Consumer

    (component/using
      (kafka/new-whisper
        {:name "pulse-audience-latest"
         :topic "attendify.s.a.v1"
         :on-message (partial pulse-audience/notify!
                              pulse-audience-expiration-threshold
                              pulse-audience-cache-ttl-ms)
         :threads 32
         :kafka-config (assoc kafka-config :offset-reset "latest")})
      [:redis-connector :db])

    Callouts: internal library • sub/unsub consumers • called on each message • passed as a first arg
  50. Event Sourcing: Own Library

    (def event-store-config
      {:initial-state {:segments []}
       :journal (event-sourcing/map->SyncPartitionedSQLJournal
                  {:snapshot-entity settings-snapshot-entity
                   :journal-entity settings-journal-entity})
       :commands [{:cmd/name "create-segment"
                   :cmd/schema {:name s/Str
                                :query s/Str
                                :createdBy s/Str}
                   :cmd/init (fn [state cmd]
                               (e/right (assoc cmd :id (next-id))))
                   :cmd/fold (fn [state cmd]
                               (e/right (update state :segments conj cmd)))}]})

    Callouts: “updating” state • prepare command • validate params • how to persist • “the beginning” • errors handling
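
The split between preparing a command and folding it into state can be shown with plain functions. This is a minimal sketch, not the library from the talk: the `commands` table, `execute` and `replay` are hypothetical names, the id is passed in instead of generated, and the either-based error handling from the slide is left out.

```clojure
(ns sketch.event-sourcing)

;; Hypothetical command table, keyed by command name.
(def commands
  {"create-segment"
   {:init (fn [_state cmd next-id] (assoc cmd :id next-id))
    :fold (fn [state cmd] (update state :segments conj cmd))}})

(defn execute
  "Prepares a command with :init, then applies it with :fold.
  Returns the new state plus the command as it would be journaled."
  [state cmd next-id]
  (let [{:keys [init fold]} (get commands (:cmd/name cmd))
        prepared (init state cmd next-id)]
    {:state (fold state prepared) :journaled prepared}))

(defn replay
  "Rebuilds state from the initial state by folding the journal."
  [initial-state journal]
  (reduce (fn [state cmd]
            ((get-in commands [(:cmd/name cmd) :fold]) state cmd))
          initial-state
          journal))
```

Because only prepared commands are journaled and `:fold` is pure, replaying the journal reproduces exactly the state that `execute` returned.
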
  51. Property Based Testing

    • “Official” org.clojure/test.check library
    • A lot of “supporting” libraries
  52. Property Based Testing

    (defn gen-entity [n]
      (gen/fmap
        (fn [[id rev entity]]
          (assoc entity :id id :rev rev))
        (gen/tuple gen-maybe-flake
                   gen-maybe-flake
                   (gen/resize n (gen/map gen/keyword gen/string-alphanumeric)))))

    (…)

    (defspec resolving-invalid-map-siblings 100
      (prop/for-all [items (gen/such-that
                             (fn [items] (< 1 (count (set (map :id items)))))
                             (gen/not-empty (gen/vector (gen-entity 5))))]
        (either/left? (merge-items default-resolver items))))
  53. Property Based Testing

    (defn gen-entity [n]
      (gen/fmap
        (fn [[id rev entity]]
          (assoc entity :id id :rev rev))
        (gen/tuple gen-maybe-flake
                   gen-maybe-flake
                   (gen/resize n (gen/map gen/keyword gen/string-alphanumeric)))))

    (…)

    (defspec resolving-invalid-map-siblings 100
      (prop/for-all [items (gen/such-that
                             (fn [items] (< 1 (count (set (map :id items)))))
                             (gen/not-empty (gen/vector (gen-entity 5))))]
        (either/left? (merge-items default-resolver items))))

    Callouts: describe properties • define generators • combining provided gens
  54. Errors Handling

    • Most of your production code
      - No, seriously
    • No good story here
      - Clojure gives us only exceptions and nil
    • Own library for either
      - cats seems to be the standard here, but no one cares about monads, right?
    • Typing your data, not your code
      - schema instead of spec
  56. Errors Handling (In Practice)

    (->> (pg-profile/fetch-profile-by-email* db' apikey email)
         (either/bind #(check-profile-nil kp apikey token email %))
         (either/bind #(check-profile-deleted kp apikey token email %))
         (either/bind #(check-profile-claimed-or-registered kp apikey token email %))
         (either/bind #(check-valid-password kp apikey token email % password))
         (either/bind #(deactivate-old-sessions db' apikey % token))
         (either/bind
           (fn [profile]
             (let [session-id (or session-id (generate-session-id))
                   now (Timestamp/from (Instant/now))]
               (->> (insert-session db' …)
                    (either/fmap (fn [session] {…})))))))

    Callout: errors handling
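
The chain above relies on `bind` short-circuiting on the first failure. A minimal sketch of that mechanic, assuming a map-based either (`{:right v}` or `{:left e}`) rather than the internal library or cats; the two checks are hypothetical simplifications of the ones on the slide.

```clojure
(ns sketch.either)

(defn right [value] {:right value})
(defn left [error] {:left error})
(defn left? [e] (contains? e :left))

(defn bind
  "Applies f to the value of a right; passes a left through untouched."
  [f e]
  (if (left? e) e (f (:right e))))

(defn check-not-nil [profile]
  (if (nil? profile) (left :profile-not-found) (right profile)))

(defn check-not-deleted [profile]
  (if (:deleted? profile) (left :profile-deleted) (right profile)))

(defn validate [profile]
  ;; The function comes first in bind, so steps chain with ->>.
  (->> (right profile)
       (bind check-not-nil)
       (bind check-not-deleted)))
```

Once a step returns a left, every later `bind` is a no-op, so the caller sees the first error without any nested conditionals.
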
  58. Errors Handling (In Practice)

    (lete [profiles (fetch-access-keys-by-events db apikey restricted-event-ids)
           labels (labels/fetch-labels {:db db} apikey {:type "access-key"})
           events-access-keys (->> labels
                                   (filter #(some (set (:events %)) restricted-event-ids))
                                   (map :id))
           profile-ids-blacklist (->> profiles
                                      (remove #(some events-access-keys
                                                     (vec (.getArray (:access_keys %)))))
                                      (map :id))]
      (perform-fetching-by-events db apikey event-ids profile-ids-blacklist))

    Callouts: macro • deferred[either[a’]] • deferred[either[a’]] • a’
  60. Dynamic Typing (In Theory)

    (defn new-app-from-opts
      [{:keys [environ private? project custom-handler exec-fn
               request-log-level request-log-transform-fn
               max-concurrent-requests await-termination-ms]
        :or {private? true
             project "Attendify"
             exec-fn execute
             request-log-level :debug
             request-log-transform-fn identity
             max-concurrent-requests 16384}}]
      (map->AppComponent {:environ environ …}))

    Callout: “defaults”
  63. Dynamic Typing (In Practice)

    (new-app-from-opts {:environ {}
                        :private? false
                        :project "Riverside"
                        :exec-fn rpc/execute
                        :max-conucrrent-requests 1024
                        :await-termination-ms 2000})

    1. How many concurrent requests are allowed?
    2. When will we notice???
  65. Dynamic Typing (In Practice)

    (defn new-app-from-opts
      [{:keys [environ private? project custom-handler exec-fn
               request-log-level request-log-transform-fn
               max-concurrent-requests await-termination-ms]
        :or {private? true
             project "Attendify"
             exec-fn execute
             request-log-level :debug
             request-log-transform-fn identity
             max-concurrent-requests 16384}
        :as opts}]
      (schema.core/validate datatypes/AppOptions opts)
      (…))

    Callouts: (schema.core/validate datatypes/AppOptions opts) • closed key space (by default)
  66. Errors Handling

    • schema: closed key space by default
    • clojure.spec: open key space by design
    • ¯\_(ツ)_/¯
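
Why a closed key space matters for the misspelled option shown earlier can be demonstrated without schema. A minimal sketch, assuming a hand-written `allowed-keys` set and a hypothetical `validate-closed!` helper in place of `schema.core/validate`:

```clojure
(ns sketch.closed-keys
  (:require [clojure.set :as set]))

;; Assumed subset of the option keys from the slides.
(def allowed-keys
  #{:environ :private? :project :exec-fn
    :max-concurrent-requests :await-termination-ms})

(defn validate-closed!
  "Throws when opts contains a key outside allowed-keys; returns opts otherwise."
  [opts]
  (let [unknown (set/difference (set (keys opts)) allowed-keys)]
    (when (seq unknown)
      (throw (ex-info "Unknown option keys" {:unknown unknown})))
    opts))
```

With an open key space the misspelled `:max-conucrrent-requests` is silently ignored and the default of 16384 applies; with a closed one the call fails at startup and names the offending key.
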
  67. We’re hiring #_:) More info: https://attendify.workable.com/j/6A2EBF73B6

  68. Q&A