Microservices in Clojure

Lessons learned after 1.5+ years of developing/evolving microservices at Attendify. Ideas, problems, pitfalls, plans.

Oleksii Kachaiev

April 17, 2015

Transcript

  1. Lessons learned: Microservices in Clojure, by Alexey Kachayev for KyivClojure #7, 2015
  2. About Me
    ‣ Alexey Kachayev, @kachayev
    ‣ CTO at Attendify.com
    ‣ Clojure, Scala, Erlang engineer
    ‣ Active open source contributor
    ‣ Author of Fn.py library (Python)
    ‣ Hobbies: Haskell, Rust, CRDTs, compilers
  3. Agenda
    ‣ Product overview
    ‣ The big idea behind Microservices
    ‣ What we built and why
    ‣ Problems and pitfalls
    ‣ The Road Not Taken
  4. Attendify Product Overview

  5. Attendify
    ‣ Mobile applications builder
    ‣ Thousands of mobile apps
    ‣ Private social networks in each application
    ‣ Real-time analytics
    ‣ Sponsored Posts (ads)
    ‣ EventWall for screen projection
  6. Attendify Hub

  7. Social

  8. Multi-Event App

  9. The Idea

  10. Microservices
    ‣ Your Server As a Function [1]
    ‣ Scaling, multiple languages and bla-bla-bla…
    ‣ Hyped just like NoSQL, BigData, etc.
    ‣ Just google it to find more information
    ‣ We use it because it’s convenient
    ‣ Just like splitting your code into small functions
    ‣ We moved away from a Django project almost 2 years ago
  11. Applicability
    ‣ If you don’t know how to split your system into small services:
      - it’s too small to be split
      - you don’t know your system well enough
      - how are you going to scale your engineering team?
  12. What We Built

  13. Current State
    ‣ 7 services in Clojure (out of 23 in total)
    ‣ 82 RPC endpoints in Clojure (out of 290+ in total)
    ‣ 17k+ LOC of Clojure code, 2850+ commits
    ‣ 4-6M requests handled each day
    ‣ 3 engineers work with Clojure on a regular basis
    ‣ Not a Clojure-only company (also Erlang, Scala, Go)
  14. Brief History
    ‣ Started 1.5 years ago
    ‣ With 2 services in Clojure (sophisticated data processing modules)
    ‣ Didn’t choose any existing microservices framework or platform
  15. Ready-to-use Solutions
    ‣ All such systems are built to fit predefined requirements (like any framework)
    ‣ We didn’t know all requirements in advance
    ‣ Requirements are subject to change (continuously)
    ‣ There is no “right way”
    ‣ Non-technical requirements (e.g. organization structure) are rarely portable
  16. Started From…
    ‣ JSON-RPC 2.0 protocol over HTTP transport
    ‣ Server: jetty & ring
    ‣ Service: implicit, ad-hoc definition, code copy & paste
    ‣ Deploy: JAR (uber), upstart, fab
    ‣ Discovery: URI with environment variables
    ‣ Security: HMAC request signature
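The discovery and signing items on this slide can be sketched with plain Java interop. This is a minimal illustration under stated assumptions, not Attendify's code: the slide only says that URIs come from environment variables and that requests carry an HMAC signature, so the hash choice (HMAC-SHA256), the hex encoding, the `events.get` method name, and the `<NAME>_URI` variable naming are all hypothetical.

```clojure
(require '[clojure.string :as str])
(import '(javax.crypto Mac)
        '(javax.crypto.spec SecretKeySpec)
        '(java.security MessageDigest))

;; A JSON-RPC 2.0 request body, written as a literal string so the sketch
;; needs no JSON library. The method name and params are hypothetical.
(def request-body
  "{\"jsonrpc\":\"2.0\",\"method\":\"events.get\",\"params\":[42],\"id\":1}")

(defn service-uri
  "Reads a service URI from an environment variable named <NAME>_URI
  (an assumed naming scheme), falling back to `default` when unset."
  [service-name default]
  (or (System/getenv (str (str/upper-case service-name) "_URI"))
      default))

(defn hmac-sha256-hex
  "Returns the hex-encoded HMAC-SHA256 of `message` under `secret`."
  [^String secret ^String message]
  (let [mac (doto (Mac/getInstance "HmacSHA256")
              (.init (SecretKeySpec. (.getBytes secret "UTF-8") "HmacSHA256")))]
    (apply str (map #(format "%02x" %)
                    (.doFinal mac (.getBytes message "UTF-8"))))))

(defn valid-signature?
  "Recomputes the signature and compares it with the one received.
  MessageDigest/isEqual avoids an early-exit string comparison."
  [secret message signature]
  (MessageDigest/isEqual
    (.getBytes ^String (hmac-sha256-hex secret message) "UTF-8")
    (.getBytes ^String signature "UTF-8")))
```

A client would send the signature alongside the body (for example in a header), and the server would recompute it with the shared secret before dispatching the call.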
  17. Next steps (1)
    ‣ Better JSON-RPC:
      - meta information
      - another multiplexing procedure
      - named params
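Named params and meta information might look like the sketch below. JSON-RPC 2.0 allows `params` to be an object (by-name) instead of an array (by-position); the `:meta` member, the `events.get` method, and all function names here are assumptions for illustration, since the slides do not show the actual envelope.

```clojure
(defn rpc-request
  "Builds a JSON-RPC 2.0 request as a Clojure map. `params` is a map, so
  arguments are passed by name. `:meta` is a hypothetical extension member
  for cross-cutting data (caller name, trace info and so on)."
  [method params meta]
  {:jsonrpc "2.0"
   :id      (str (java.util.UUID/randomUUID))
   :method  method
   :params  params
   :meta    meta})

(defn get-event
  "Hypothetical handler. Named params map directly onto destructuring,
  including defaults for optional arguments."
  [{:keys [event-id limit] :or {limit 10}}]
  {:event-id event-id :limit limit})

(defn rpc-response
  "Wraps a result in a response that echoes the request id."
  [request result]
  {:jsonrpc "2.0" :id (:id request) :result result})
```

With positional params, adding an optional argument changes the meaning of every later position; with named params the caller simply omits the key and the handler's default applies.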
  18. Next steps (2)
    ‣ Deployment procedure:
      - move all fab commands to shared library
      - save uberjars (each version) on S3
      - ping and http-based health checker
      - report all activity to Slack
      - run:as (to connect local service to QA or Prod clusters)
  19. Next steps (3)
    ‣ Switched to httpkit (http server & client)
      - better benchmarks but not really applicable for our case
      - wanted to use core.async for service definitions, but still using futures (it’s ok for us)
  20. More Services, New Problems
    ‣ logs: unification, collect/process
    ‣ error tracking: new type of errors (inter-service communication)
    ‣ auth: different levels and procedures
    ‣ metrics: collect, view, analyze
    ‣ protocol: dynamic typing is hard to scale
  21. Solutions So Far
    ‣ logs: used Loggly, not really a problem now
    ‣ error tracking: Rollbar for failure reports, either abstraction, timeout handling as a first-class citizen
    ‣ metrics: used Graphite, now using InfluxDB
    ‣ protocol: schema library for params definition and validation
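The "either abstraction" plus first-class timeouts can be sketched with nothing but futures, which the talk says are still in use. The `[:left ...]` / `[:right ...]` vector encoding and the `call-with-timeout` name are assumptions; the slides do not show how either values are represented.

```clojure
(defn call-with-timeout
  "Runs `f` in a future and returns an either-style vector:
  [:right value] on success, [:left reason] on exception or timeout.
  The caller always gets a value back and never has to catch."
  [timeout-ms f]
  (let [fut    (future
                 (try
                   [:right (f)]
                   (catch Exception e
                     [:left {:type :exception :message (.getMessage e)}])))
        result (deref fut timeout-ms ::timeout)]
    (if (= ::timeout result)
      (do
        ;; Best effort: interrupt the work so it does not keep running.
        (future-cancel fut)
        [:left {:type :timeout :after-ms timeout-ms}])
      result)))

(defn either->response
  "Turns an either value into a response map."
  [[tag payload]]
  (case tag
    :right {:result payload}
    :left  {:error payload}))
```

Because a timeout is just another `:left`, callers handle it on the same code path as any other failure instead of treating it as a special case.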
  22. Augustine
    ‣ shared library with s3-wagon
    ‣ defservice macro that uses multimethod
    ‣ protocol definition/validation with schema
    ‣ auth level specification and control
    ‣ errors, exceptions and timeouts handling
    ‣ meta information, req/resp ID with flake algorithm
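Augustine is not shown in the slides, so the following is only a guess at the general shape of a service-definition macro built on a multimethod. The names `handle-rpc` and `defendpoint` are hypothetical, and a plain required-keys check stands in for the schema library the talk mentions, to keep the sketch dependency-free.

```clojure
;; One multimethod per service, dispatching on the RPC method name.
(defmulti handle-rpc :method)

;; Unknown methods get the standard JSON-RPC "Method not found" error.
(defmethod handle-rpc :default [{:keys [method]}]
  {:error {:code -32601 :message (str "Method not found: " method)}})

(defmacro defendpoint
  "Hypothetical helper: registers `body` as the handler for `method-name`.
  `required` is a vector of param keys that must be present; if any is
  missing the body is not run and an 'Invalid params' error is returned."
  [method-name required params-binding & body]
  `(defmethod handle-rpc ~method-name [request#]
     (let [params#  (:params request#)
           missing# (remove #(contains? params# %) ~required)]
       (if (seq missing#)
         {:error {:code -32602
                  :message (str "Missing params: " (vec missing#))}}
         (let [~params-binding params#]
           {:result (do ~@body)})))))

;; Usage: the endpoint body only sees validated, destructured params.
(defendpoint "events.get" [:event-id] {:keys [event-id]}
  {:id event-id :name "KyivClojure"})
```

Keeping validation and error shaping inside the macro is what removes the copy-and-paste service definitions described on the "Started From…" slide.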
  24. Problems and Pitfalls

  25. Java
    ‣ You don’t need to know Java to write Clojure
    ‣ Your http server/framework is written in Java
    ‣ “Java in Clojure” is easier than “Java in Java”
    ‣ Most probably you will deal with Java code somehow
    ‣ GC is your “good but very unpredictable” friend
  26. Java We Use
    ‣ io operations, streaming & buffering
    ‣ XLS reader
    ‣ base64
    ‣ timers
    ‣ java.text.SimpleDateFormat
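Two of the items above, base64 and `java.text.SimpleDateFormat`, are short enough to show as interop. This is an illustrative sketch, not code from the talk; it assumes Java 8+ for `java.util.Base64`, and the function names and date pattern are made up for the example.

```clojure
(import '(java.util Base64 Date TimeZone)
        '(java.text SimpleDateFormat))

(defn base64-encode
  "Encodes a string as base64 using its UTF-8 bytes."
  [^String s]
  (.encodeToString (Base64/getEncoder) (.getBytes s "UTF-8")))

(defn base64-decode
  "Decodes a base64 string back into a UTF-8 string."
  [^String s]
  (String. (.decode (Base64/getDecoder) s) "UTF-8"))

(defn format-utc
  "Formats `date` as a UTC timestamp. A fresh SimpleDateFormat is created
  on every call because instances are not thread-safe."
  [^Date date]
  (let [fmt (doto (SimpleDateFormat. "yyyy-MM-dd'T'HH:mm:ss'Z'")
              (.setTimeZone (TimeZone/getTimeZone "UTC")))]
    (.format fmt date)))
```

The per-call formatter is the kind of Java detail that leaks into Clojure code: sharing one `SimpleDateFormat` across request threads is a classic source of corrupted output.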
  27. Data Communication
    ‣ Databases: Riak, Redis, CouchDB, PostgreSQL
    ‣ Our Clojure services are “data-centric” (mostly about data manipulations)
    ‣ “single data responsibility” sounds good, but doesn’t work in our case
    ‣ Databases are used a lot for cross-service communication to decrease inter-service coupling
  28. The Road Not Taken

  29. Actual Problems
    ‣ scaling is a hard problem even with the best tools
    ‣ there is no “critical” problem that we can’t solve
    ‣ there is a lot of room for improvement
    ‣ there is even more room for experiments
  30. Investigations (1)
    ‣ active investigations
    ‣ error processing (even with either, trying monads)
    ‣ distributed tracing (partially solved with req IDs)
    ‣ service discovery & (smart) load balancing
    ‣ binary protocol & TCP for inter-server communication
  31. Investigations (2)
    ‣ not really active investigations
    ‣ auto-generated SDKs
    ‣ back pressure control (looking at Hystrix)
    ‣ core.async (long story)
    ‣ task cancellation
  32. core.async (1)
    ‣ Your Server as a Transducer
    ‣ augustine library accepts channel as a return type
    ‣ httpkit provides async interface
    ‣ but… futures work fine for us (still?)
    ‣ still experimenting…
  33. core.async (2)
    ‣ better timeouts
    ‣ better multiplexing
    ‣ easier to deal with back-pressure control
    ‣ async abstractions are very leaky
    ‣ would have to reimplement most of the code
    ‣ hard to debug (just like futures)
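The "better timeouts" point can be illustrated with a small sketch. It assumes `org.clojure/core.async` is on the classpath, and the function name and result shape are hypothetical; it is not taken from the augustine library.

```clojure
(require '[clojure.core.async :as async])

(defn call-with-timeout-async
  "Runs `f` on a separate thread and waits for whichever comes first:
  its result or a timeout channel. A timeout is just another channel
  taking part in the same `alts!!` choice."
  [timeout-ms f]
  (let [result-ch    (async/thread (f))
        [value port] (async/alts!! [result-ch (async/timeout timeout-ms)])]
    (if (= port result-ch)
      {:status :ok :value value}
      {:status :timeout})))
```

Since `alts!!` accepts any number of channels, the same shape extends to waiting on several services at once, which is the multiplexing benefit the slide lists next. Note that a `nil` result from `f` closes the channel without a value, one example of the leakiness the slide warns about.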
  34. Finagle-Clojure
    ‣ github.com/finagle/finagle-clojure
    ‣ good interface to work with Thrift
    ‣ easy to start with basic template and docs
    ‣ inconvenient Scala runtime
    ‣ not-really-idiomatic Clojure
    ‣ no more comments for now (not using in production)
  35. Thoughts
    ‣ No regrets about our technical decision(s)
    ‣ We have time to solve problems & concerns
    ‣ Clojure is ok for product development
    ‣ Clojure is ok when supporting old code
    ‣ Clojure development is hard when # engineers > 1**
    ‣ ** it’s hard to work with people in any case
  36. alexey@attendify.com We’re hiring!

  37. Thank You! Questions?