Adding some Rust to your Kafka

Using two Rust implementations in a bank simulation, and comparing them with implementations in Clojure and Kotlin.

Gerard Klijs

May 28, 2019

Transcript

  1. Adding some Rust to your Kafka

  2. About me

    Gerard Klijs • @GKlijs • https://github.com/gklijs • Java developer with • Used Kafka several times • Author of schema_registry_converter • Living in Papendrecht
  3. Contents

    • Test setup
    • 3 languages
    • 4 implementations
    • Results of the tests
    • Conclusion
    • Questions
  4. Bank simulation with end-to-end performance test

  5. None
  6. Overview of the whole system

    Every yellow component will be explained in detail; all of them except the command handler are written only in Clojure. We use the schema registry for schema management, which also makes the messages smaller. All messages have a string as key and Avro as value type. Meaning of the colors:
    • Orange: the Confluent Platform parts
    • Yellow: the parts of open bank mark
    • Green: an NGINX instance
    • Light blue: PostgreSQL databases
  7. None
  8. Topology

    Common dependency of the other parts. It has several functions:
    • One way to set the logging for all components.
    • All components know the topics and data types without needing to connect.
    • Generates Avro objects for (de)serialization.
    • Functions wrapping the (Java) Kafka Consumer and Producer.
    • Functions for dealing with IBAN and UUID.
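As an illustration of the kind of IBAN helper mentioned above, here is a minimal sketch of the standard mod-97 checksum test. The original topology is written in Clojure; this Rust version and the function name `is_valid_iban` are assumptions made for illustration only. It does not check country-specific lengths or formats.

```rust
/// Checks an IBAN with the standard mod-97 rule: move the first four
/// characters to the end, map letters to 10..=35, and require that the
/// resulting number leaves remainder 1 when divided by 97.
/// Hypothetical helper, not taken from the open-bank-mark code.
fn is_valid_iban(iban: &str) -> bool {
    // Allow the usual grouping in blocks of four by ignoring whitespace.
    let compact: String = iban.chars().filter(|c| !c.is_whitespace()).collect();
    if compact.len() < 5
        || compact.len() > 34
        || !compact.chars().all(|c| c.is_ascii_alphanumeric())
    {
        return false;
    }
    // Safe to split on a byte index: the string is ASCII-only here.
    let (head, tail) = compact.split_at(4);
    let mut remainder: u32 = 0;
    for c in tail.chars().chain(head.chars()) {
        // to_digit(36) maps '0'..'9' to 0..=9 and letters to 10..=35.
        let value = match c.to_digit(36) {
            Some(v) => v,
            None => return false,
        };
        // Digits shift the running number by one decimal place, letters by two.
        remainder = if value < 10 {
            (remainder * 10 + value) % 97
        } else {
            (remainder * 100 + value) % 97
        };
    }
    remainder == 1
}
```

Calling `is_valid_iban("NL91 ABNA 0417 1643 00")` accepts the commonly used Dutch example IBAN, while changing any single digit makes the check fail.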
  9. None
  10. Synchronizer

    Makes sure both the correct topics and schemas are set. Checks whether the replication factor from the config can be used, and takes the minimum of the number of available nodes and the configured value. Note that this has only been used on a clean Kafka cluster, and there is currently no check that topic properties are correct.
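The replication factor rule described above can be sketched as a single function. The synchronizer itself is written in Clojure; the Rust function and parameter names below are hypothetical.

```rust
/// Picks the replication factor to use when creating a topic: the
/// configured value, capped by the number of brokers that are available.
/// Both parameter names are assumptions made for this sketch.
fn effective_replication_factor(configured: u16, available_brokers: u16) -> u16 {
    configured.min(available_brokers)
}
```

With a configured factor of 3 on a single-broker development cluster this yields 1, so topic creation would not be rejected for lack of brokers.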
  11. None
  12. Heartbeat

    Just a simple producer for easy debugging. It uses a simple message with just a long value. It exposes an nREPL. An nREPL is a network REPL, which can be used to execute code remotely and get the result back. This is a powerful concept, making it possible to apply fixes as the code runs, or to solve bugs interactively. With the nREPL the pace at which messages are sent can be changed.
  13. None
  14. Command generator

    Consumes the heartbeats and generates a command for each received heartbeat. At first this will mostly be ConfirmAccountCreation; as it runs there will be fewer of these. It randomly creates different kinds of ConfirmMoneyTransfer, which might fail because they would cause the balance to drop below the limit.
  15. None
  16. Command handler

    Handles the different kinds of command.
    • AccountCreationCommand: generates a new IBAN. If it does not exist yet, creates a balance using the default values; if it does exist, gives back an AccountCreationFailed.
    • ConfirmMoneyTransfer: if the supplied token is correct and there is enough money, makes the transfer. Updates both to and from if they are ‘open-bank’ IBANs, and creates a BalanceChanged event for each changed balance.
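The money transfer rules above could look roughly like the following in-memory sketch. It is not the actual command handler: the real implementations use Kafka and PostgreSQL, while here a `HashMap` stands in for the database, and an IBAN is treated as an ‘open-bank’ IBAN exactly when it is present in that map. All type, field and function names are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical balance record; amounts are in cents.
#[derive(Debug, PartialEq)]
struct Balance {
    amount: i64,
    limit: i64, // lowest amount the balance may reach
    token: String,
}

/// Hypothetical event emitted for each balance that changed.
#[derive(Debug, PartialEq)]
struct BalanceChanged {
    iban: String,
    new_amount: i64,
    changed_by: i64,
}

#[derive(Debug, PartialEq)]
enum TransferError {
    InvalidToken,
    InsufficientFunds,
}

struct Transfer<'a> {
    from: &'a str,
    to: &'a str,
    token: &'a str,
    amount: i64,
}

fn handle_transfer(
    balances: &mut HashMap<String, Balance>,
    t: &Transfer,
) -> Result<Vec<BalanceChanged>, TransferError> {
    // Validate first, so that a failed transfer changes nothing.
    if let Some(from) = balances.get(t.from) {
        if from.token != t.token {
            return Err(TransferError::InvalidToken);
        }
        if from.amount - t.amount < from.limit {
            return Err(TransferError::InsufficientFunds);
        }
    }
    let mut events = Vec::new();
    // Only accounts known to this bank are updated; others are skipped.
    if let Some(from) = balances.get_mut(t.from) {
        from.amount -= t.amount;
        events.push(BalanceChanged {
            iban: t.from.to_string(),
            new_amount: from.amount,
            changed_by: -t.amount,
        });
    }
    if let Some(to) = balances.get_mut(t.to) {
        to.amount += t.amount;
        events.push(BalanceChanged {
            iban: t.to.to_string(),
            new_amount: to.amount,
            changed_by: t.amount,
        });
    }
    Ok(events)
}
```

A transfer between two known accounts returns two `BalanceChanged` events; a transfer to an external IBAN returns only one, for the sender.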
  17. GraphQL-endpoint

  18. GraphQL-endpoint

    Exposes a GraphQL endpoint to make it easy to issue commands and get the results back in the frontend. All services have their own consumer, and share the producer and the database.
    • Transaction service: makes it possible to query or subscribe to balance changed events.
    • Account creation service: used to create an account. Links the username used to log in with the UUID sent for the account creation, so the user gets the same IBAN back when logging in at another time.
    • Money transfer service: tries to transfer money, and provides feedback.
  19. None
  20. None
  21. None
  22. Frontend

  23. Frontend

    The frontend is built from several parts that all end up in an NGINX container to be served.
    • The JavaScript part is built using ClojureScript; an important part is the re-graph library. For ClojureScript re-frame is often used, which uses React to update the DOM depending on a global state. ClojureScript uses the Google Closure compiler to reduce the size of the resulting JavaScript.
    • Bulma is used for the CSS, with just the colors set differently and some additional animations.
    • The output from the tests is added to NGINX to make it easily accessible.
  24. None
  25. Running a test

    When the test is run it will do several kinds of transactions that either increase or lower the money on the balance, in such a way that as much goes in as goes out after 10 runs. It measures the time until the new balance comes in. During the test the load on the system is increased using the nREPL of the heartbeat: increasing the number of heartbeats in turn triggers additional commands to be processed. Also during the test, both the CPU and memory of parts of the system are measured using lispyclouds/clj-docker-client. All the data is written to a file so it can be analyzed later on.
  26. None
  27. Output of a test

    The generated files can be compared to other files to generate graphs. All the data is combined, and for each point with the same load some statistics are calculated, most often the mean and the standard error. For different values, graphs are generated in the public folder of the frontend so they can be easily viewed. They are available at the background tab at open-bank.
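For reference, the mean and standard error for one load point can be computed as in the sketch below. The actual analysis code is written in Clojure; this Rust function and its name are assumptions, and it uses the sample standard deviation (dividing by n - 1), which may differ from what the project does.

```rust
/// Returns the mean and the standard error of the mean for a set of
/// measurements, or None when there are fewer than two samples.
/// Hypothetical helper, using the sample variance (n - 1 in the divisor).
fn mean_and_standard_error(samples: &[f64]) -> Option<(f64, f64)> {
    let n = samples.len();
    if n < 2 {
        return None;
    }
    let n_f = n as f64;
    let mean = samples.iter().sum::<f64>() / n_f;
    let variance = samples
        .iter()
        .map(|x| (x - mean).powi(2))
        .sum::<f64>()
        / (n_f - 1.0);
    // Standard error of the mean: standard deviation divided by sqrt(n).
    Some((mean, (variance / n_f).sqrt()))
}
```

For the latencies `[1.0, 2.0, 3.0, 4.0, 5.0]` this gives a mean of 3.0 and a standard error of about 0.707.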
  28. Clojure

  29. Clojure and Kafka

    • Rich Hickey is the creator and Benevolent Dictator for Life.
    • Runs on the JVM, and has interop with Java.
    • Cognitect is the company behind Clojure; it has several products around Clojure, like Datomic, an elastically scaling transactional database.
    • Multiple recent libraries, besides the consumer and the producer sometimes also supporting streams, the admin client and Avro.
    • At the time I started the project the latest Clojure was still Java 6 compatible, and there was no recent Clojure Kafka client.
    • Some fuss with Jackson in combination with other libraries; explicit Jackson versions were needed to make it work.
  30. Code example producer

  31. Code example start consuming

  32. Kotlin

  33. Kotlin with Spring and Kafka

    • Kotlin is closely tied to IntelliJ IDEA.
    • Can change Java code to Kotlin automatically.
    • A bit more functional than Java, and often immutable defaults.
    • Spring makes it easy to set up and have something working fast.
    • Getting the Avro serializers to work was a puzzle: finding the right properties to use Avro serialization.
    • With Spring Cloud Stream, the Kafka Streams API is used under the hood.
    • Easiest is to start with Spring Initializr.
    • Make sure to use the kotlin-maven-allopen and kotlin-maven-noarg plugins to compile.
  34. None
  35. Code example money transfer

  36. Rust

  37. Rust and Kafka

    • Systems programming language with a focus on safety and speed.
    • Mozilla was the first investor in Rust and continues to sponsor the work of the open source project.
    • Used by Dropbox in production.
    • Two libraries: one that recently became more active and was bumped to librdkafka 1.0.0, and another one in pure Rust that has little activity and few features.
    • No support for Avro when I started.
    • Created a library that uses the schema registry to transform bytes to Value and the other way around, and also to set a schema in the schema registry.
    • The library is more low level than Java; things like logging have to be set up. Some examples are available, making it easy.
  38. Dockerfile pure rust library

  39. Database update

  40. Code example money transfer

  41. Some results of the 10 runs on TravisCI (2 cpu)

    Language                   Clojure   Kotlin   Rust(rdkafka)   Rust(kafka)
    Docker image size (MB)     152       206      102             8
    Average start (ms)         2988      12878    2222            1929
    Max load reached (msg/s)   310       330      260             220
  42. Some graphs, more available at https://open-bank.gklijs.tech/

  43. Latency

    • Rust-kafka quickly rises because it only sends one message at a time.
    • Rust-rdkafka eventually goes up; it is stressing the Kafka broker more than the JVM languages.
    • Both JVM languages are pretty close.
  44. None
  45. CPU load Kafka broker

    • Rust-kafka is causing high CPU load because every message is sent separately.
    • Rust-rdkafka eventually goes up; it is stressing the Kafka broker more than the JVM languages, I don’t know why.
    • Both JVM languages are pretty close.
  46. None
  47. CPU command handler

    • Rust-kafka is the lowest; it is a pretty simple and bare Docker image.
    • Rust-rdkafka only needs slightly more.
    • Clojure is pretty close to Rust, after the JIT has kicked in.
    • The Kotlin JIT seems about equally effective, but there is more overhead because of Spring.
  48. None
  49. Conclusion, use Rust when:

    • Startup time is important, but there are other options for the JVM with GraalVM, like Quarkus or Micronaut.
    • Memory footprint matters.
    • A small Docker image is important.
    • Memory safety is important.
    But:
    • Be sure to test whether in your case the broker can keep up.
    • Make sure that what the application needs to do can be done with Rust.
    • Development may take a bit longer.
  50. Questions? Code available at open-bank-mark