Slide 1

Slide 1 text

Advanced Karafka

Slide 2

Slide 2 text

Karafka hooks/events

- Based on dry-monitor
- You can subscribe to many events to add logging, instrumentation, or error handling with external services, to extend logic, etc.

Slide 3

Slide 3 text

Karafka hooks/events

# 1. Subscribing with a block
Karafka.monitor.subscribe("sync_producer.call.retry") do |event|
  do_something_with_the_event(event)
end

# 2. Subscribing with a listener class
class ExampleListener
  def self.on_sync_producer_call_retry(_event)
  end
end

Karafka.monitor.subscribe(ExampleListener)

Slide 4

Slide 4 text

Karafka hooks/events - supported events:

- params.params.deserialize
- params.params.deserialize.error
- connection.listener.before_fetch_loop
- connection.listener.fetch_loop
- connection.listener.fetch_loop.error
- connection.client.fetch_loop.error
- connection.batch_delegator.call
- connection.message_delegator.call
- fetcher.call.error
- backends.inline.process
- process.notice_signal
- consumers.responders.respond_with
- async_producer.call.error
- async_producer.call.retry
- sync_producer.call.error
- sync_producer.call.retry
- app.initializing
- app.initialized
- app.running
- app.stopping
- app.stopping.error
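The listener-class subscription shown on slide 3 works by convention: an event name such as "sync_producer.call.retry" maps to a listener method named on_sync_producer_call_retry. A plain-Ruby sketch of that mapping (the helper below is hypothetical, not part of Karafka):

```ruby
# Hypothetical helper (not Karafka code) illustrating how an event name
# from the list above maps to a listener method name: dots become
# underscores and the result is prefixed with "on_".
def listener_method_for(event_name)
  :"on_#{event_name.tr('.', '_')}"
end

listener_method_for("sync_producer.call.retry")
# => :on_sync_producer_call_retry
```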

Slide 5

Slide 5 text

Waterdrop

- Standalone gem used by Karafka for publishing messages
- You can use WaterDrop directly, but it's better to use Responders, as they provide some extras and are more convenient to use
- WaterDrop 2.0 already exists, but the current Karafka version (1.4) uses WaterDrop 1.4, so keep that in mind when reading the docs

Slide 6

Slide 6 text

Waterdrop

# sync
WaterDrop::SyncProducer.call(
  { user_id: 1 }.to_json,
  topic: "users",
  key: "user-1",
  partition_key: "1"
)

# async
WaterDrop::AsyncProducer.call({ user_id: 1 }.to_json, topic: "users")

Slide 7

Slide 7 text

Async Producers

- Non-blocking - operations that publish to Kafka can gain some extra performance
- Somewhat protective against Kafka being unavailable
- Could lead to loss of messages (some of them might not be published at all)
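Because async publishing can silently drop messages, it is worth at least observing failures via the "async_producer.call.error" event from the supported-events list. Below is a self-contained sketch of the subscribe pattern - MiniMonitor is a hypothetical stand-in for Karafka.monitor, so the mechanics can be shown without a running Kafka:

```ruby
# MiniMonitor is a hypothetical stand-in (not Karafka code) for the
# Karafka.monitor pub/sub object, used here only to demonstrate hooking
# "async_producer.call.error" so failed publishes are visible.
class MiniMonitor
  def initialize
    @subscribers = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(event_name, &block)
    @subscribers[event_name] << block
  end

  def instrument(event_name, payload)
    @subscribers[event_name].each { |block| block.call(payload) }
  end
end

monitor = MiniMonitor.new
failed_messages = []

monitor.subscribe("async_producer.call.error") do |event|
  failed_messages << event[:error]
end

# Simulate the event an instrumented producer would emit on failure
monitor.instrument("async_producer.call.error", error: "delivery failed")
failed_messages # => ["delivery failed"]
```

With real Karafka, the same block would be passed to Karafka.monitor.subscribe in an initializer.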

Slide 8

Slide 8 text

Async Responders

class ExampleResponder < ApplicationResponder
  topic :sync_topic
  topic :async_topic, async: true
end

Slide 9

Slide 9 text

Serializers

- Easily customisable when using Responders
- By default, the JSON serializer is used
- Requires an equivalent deserializer on the consumer's side
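The contract can be sketched in plain Ruby: a serializer and its matching deserializer are callables that mirror each other. The class names below are illustrative (the default JSON pair Karafka ships is conceptually similar):

```ruby
require "json"

# Illustrative serializer/deserializer pair (class names are hypothetical):
# whatever the serializer turns a payload into, the deserializer on the
# consumer's side must be able to turn back.
class ExampleJsonSerializer
  def self.call(object)
    object.to_json
  end
end

class ExampleJsonDeserializer
  def self.call(raw_payload)
    JSON.parse(raw_payload)
  end
end

raw = ExampleJsonSerializer.call("user_id" => 1)
ExampleJsonDeserializer.call(raw) # => { "user_id" => 1 }
```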

Slide 10

Slide 10 text

Example XML Serializer

class KarafkaResponderXmlSerializer
  def self.call(object)
    object.to_xml
  end
end

class ExampleResponder < ApplicationResponder
  topic :users, serializer: KarafkaResponderXmlSerializer

  def respond(user)
    respond_to :users, user
  end
end

Slide 11

Slide 11 text

Deserializers

- Configurable when declaring a topic on the consumer's side
- By default, the JSON deserializer is used

Slide 12

Slide 12 text

Deserializers

class KarafkaExampleXmlDeserializer
  def self.call(params)
    Hash.from_xml(params.raw_payload)
  end
end

KarafkaApp.routes.draw do
  topic :users do
    consumer UserConsumer
    deserializer KarafkaExampleXmlDeserializer
  end
end

Slide 13

Slide 13 text

Responding from consumers

class PagesConsumer < ApplicationConsumer
  def consume
    respond_with page_id: 1
  end
end

class PagesResponder < ApplicationResponder
  topic :pages_from_consumer

  def respond(payload_with_page_id)
    respond_to :pages_from_consumer, payload_with_page_id
  end
end

Slide 14

Slide 14 text

Responding from consumers

KarafkaApp.consumer_groups.draw do
  consumer_group :group_for_kafka_example do
    batch_fetching true

    topic :pages do
      consumer PagesConsumer
      responder PagesResponder
      batch_consuming true
    end
  end
end

Slide 15

Slide 15 text

Testing Karafka - consumers

- Dedicated gem for testing consumers: "karafka-testing"
- It provides two helpers:
  - karafka_consumer_for
  - publish_for_karafka

Slide 16

Slide 16 text

Testing Karafka - consumers

RSpec.configure do |config|
  config.include Karafka::Testing::RSpec::Helpers
end

RSpec.describe UsersConsumer do
  subject(:consumer) { karafka_consumer_for(:users) }

  before do
    publish_for_karafka({ "user_id" => 1 }.to_json)
  end

  it "does some stuff" do
    # some potential mock
    consumer.consume
    # do some assertion here
  end
end

Slide 17

Slide 17 text

Testing Karafka - responders

WaterDrop.setup do |config|
  config.deliver = !Rails.env.test?
end

RSpec.describe UsersResponder do
  subject(:responder) { described_class.new }

  describe "#call" do
    let(:payload) { { "user_id" => 1 } }
    let(:data) do
      [[payload.to_json, { topic: "users" }]]
    end

    it "publishes stuff" do
      responder.call(payload)

      expect(responder.messages_buffer["users"]).to eq data
    end
  end
end

Slide 18

Slide 18 text

karafka-sidekiq-backend

- A separate gem
- Useful when you need to maximize throughput on the consumers' side
- High price to pay: messages are no longer ordered :(

Slide 19

Slide 19 text

karafka-sidekiq-backend

# 1. Global configuration
class KarafkaApp < Karafka::App
  setup do |config|
    config.backend = :sidekiq
  end
end

class ApplicationWorker < Karafka::BaseWorker
end

# 2. Per-topic configuration
KarafkaApp.routes.draw do
  consumer_group :example_consumer_group do
    topic :users do
      backend :sidekiq
      consumer UserConsumer
      worker KarafkaWorkers::UserWorker
      interchanger Interchangers::UserInterchanger # optional
    end
  end
end

Slide 20

Slide 20 text

Manual offset management

- Make sure that you know what you are doing and why - in most cases you don't need this feature
- Karafka handles offset management out of the box - it commits offsets after processing an individual message or a batch (depending on the "batch_fetching" setting)

Slide 21

Slide 21 text

Manual offset management

class App < Karafka::App
  setup do |config|
    config.kafka.automatically_mark_as_consumed = false
  end

  consumer_groups.draw do
    consumer_group :users do
      automatically_mark_as_consumed false
    end

    consumer_group :accounts do
      automatically_mark_as_consumed true
    end
  end
end

Slide 22

Slide 22 text

Manual offset management

class UsersConsumer < ApplicationConsumer
  def consume
    do_something_with_the_batch(params_batch)

    mark_as_consumed!(params_batch.last) # blocking/sync operation
    # or
    mark_as_consumed(params_batch.last) # non-blocking/async operation
  end
end

Slide 23

Slide 23 text

What if there is an exception on the consumer's side?

- If the consumer blows up with an error, it will pause for a while (configurable) and retry later
- Messages will never be skipped
- That means the consumer will get stuck and not process any other messages until the issue is addressed

Slide 24

Slide 24 text

What if there is an exception on the consumer's side?

- By default, the consumer will retry every 10 seconds (configurable via the "pause_timeout" config param)
- You can also enable exponential backoff ("pause_exponential_backoff" - disabled by default). It might be a good idea to also set "pause_max_timeout" so the retry delay doesn't go out of control
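To get a feel for why "pause_max_timeout" matters, here is a plain-Ruby sketch of capped exponential backoff. The function name and formula are illustrative only - they are not Karafka's exact implementation:

```ruby
# Illustrative only (not Karafka internals): each retry attempt doubles
# the pause until the cap stops it from growing without bound.
def pause_duration(pause_timeout, attempt, pause_max_timeout)
  [pause_timeout * (2**attempt), pause_max_timeout].min
end

(0..6).map { |attempt| pause_duration(10, attempt, 300) }
# => [10, 20, 40, 80, 160, 300, 300]
```

Without the cap, attempt 6 alone would already mean a 640-second pause.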

Slide 25

Slide 25 text

Integration with Sentry

module KarafkaSentryListener
  PROBLEM_POSTFIXES = %w[
    _error
    _retry
  ].freeze

  class << self
    def method_missing(method_name, *args, &block)
      return super unless eligible?(method_name)

      Raven.capture_exception(args.last[:error])
    end

    def respond_to_missing?(method_name, include_private = false)
      eligible?(method_name) || super
    end

    private

    def eligible?(method_name)
      PROBLEM_POSTFIXES.any? do |postfix|
        method_name.to_s.end_with?(postfix)
      end
    end
  end
end

Slide 26

Slide 26 text

Integration with Sentry

# in an initializer
Karafka.monitor.subscribe(KarafkaSentryListener)

Slide 27

Slide 27 text

Integration with NewRelic

class KarafkaNewRelicListener
  class << self
    def method_missing(method_name, *args, &block)
      return super unless method_name.to_s.end_with?("_error")

      NewRelic::Agent.notice_error(args.last[:error])
    end

    def respond_to_missing?(method_name, include_private = false)
      method_name.to_s.end_with?("_error") || super
    end
  end
end

Slide 28

Slide 28 text

Integration with NewRelic

# in an initializer
Rails.application.config.to_prepare do
  Karafka::BaseConsumer.class_eval do
    def consume(*)
    end
  end

  Karafka::BaseConsumer.descendants.each do |consumer_class|
    consumer_class.instance_eval do
      include ::NewRelic::Agent::Instrumentation::ControllerInstrumentation
      add_transaction_tracer :consume, category: :task
    end

    consumer_class.class_eval do
      include ::NewRelic::Agent::Instrumentation::ControllerInstrumentation
      add_transaction_tracer :consume, category: :task
    end
  end
end

Karafka.monitor.subscribe(KarafkaNewRelicListener)

Slide 29

Slide 29 text

Thanks!