
I Can’t Believe It’s Not A Queue: Using Kafka with Rails - RailsConf 2016

Video: https://www.youtube.com/watch?v=s3VQIGD5iGo

Your existing message system is great, until it gets overloaded. Then what? That's when you should try Kafka.

Kafka's designed to be resilient. It takes the stress out of moving from a Rails monolith into a scalable system of microservices. Since you can capture every event that happens in your app, it's great for logging. You can even use Kafka's distributed, ordered log to simulate production load in your staging environment.

Come and learn about Kafka, where it fits in your Rails app, and how to make it do the things that message queues simply can't.

hone · May 04, 2016
Transcript

  1. Agenda
     • What is Kafka?
     • Kafka + Ruby
     • Use Case: Metrics
     • Other Patterns
  2. Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
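
     The "commit log" part is the key difference from a queue: messages are appended to an ordered, immutable log and each consumer tracks its own read position, so nothing is removed when it is consumed. A toy model of the idea (not the ruby-kafka API):

        # A partition is an append-only log; each consumer group keeps its own
        # offset instead of dequeuing messages, so many readers share one log.
        partition = ["m0", "m1", "m2"]               # ordered, immutable messages
        offsets   = { "billing" => 1, "audit" => 3 } # per-group read positions
        partition[offsets["billing"]]                # => "m1" — nothing was removed
        offsets["billing"] += 1                      # "consuming" just advances a pointer
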
  3. "hundreds of thousands to millions of messages a second on

    a small cluster" Tom Crayford Heroku Kafka
  4. Send a message

     require "kafka"

     kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
     producer = kafka.producer
     producer.produce("hello1", topic: "test-messages")
  5. Keyed Message

     require "kafka"

     kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
     producer = kafka.producer
     producer.produce("hello1", topic: "test-messages")
     producer.produce("hello2", key: "x", topic: "test-messages")
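
     The key decides which partition a message lands on: messages with the same key always go to the same partition, which preserves their relative order. A sketch of the idea (ruby-kafka's default partitioner hashes keys along these lines; the helper name here is made up):

        require "zlib"

        # Hypothetical helper showing key-based partitioning: hash the key,
        # modulo the partition count. Same key in, same partition out.
        def partition_for(key, partition_count)
          Zlib.crc32(key) % partition_count
        end

        partition_for("x", 32)  # every message keyed "x" lands on this partition
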
  6. Message to a Partition

     require "kafka"

     kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
     producer = kafka.producer
     producer.produce("hello1", topic: "test-messages")
     producer.produce("hello2", key: "x", topic: "test-messages")
     producer.produce("hello3", topic: "test-messages", partition: 1)
  7. Deliver Messages

     require "kafka"

     kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
     producer = kafka.producer
     producer.produce("hello1", topic: "test-messages")
     producer.produce("hello2", key: "x", topic: "test-messages")
     producer.produce("hello3", topic: "test-messages", partition: 1)

     # `produce` only buffers locally; nothing reaches the brokers until this call.
     producer.deliver_messages
  8. Async Producer

     # `async_producer` will create a new asynchronous producer.
     producer = kafka.async_producer(
       # Trigger a delivery once 100 messages have been buffered.
       delivery_threshold: 100,

       # Trigger a delivery every second.
       delivery_interval: 1,
     )
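
     With the async producer, `produce` returns immediately and a background thread flushes the buffer whenever either trigger fires. A usage sketch; the trade-off, worth stating, is that messages buffered but not yet delivered are lost if the process dies:

        # Buffered in memory; the background thread delivers once 100 messages
        # pile up or one second passes, whichever comes first.
        150.times { |i| producer.produce("event #{i}", topic: "test-messages") }

        # Flush anything still buffered and stop the delivery thread.
        producer.shutdown
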
  9. Serialization

     require "json"

     event = {
       "name" => "pageview",
       "url"  => "https://example.com/posts/123",
       # ...
     }

     data = JSON.dump(event)
     producer.produce(data, topic: "events")
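
     Kafka treats message values as opaque bytes, so the consumer has to undo whatever serialization the producer picked. The decode side, assuming a consumer set up like the one a few slides later:

        # Consumer-side counterpart to JSON.dump above.
        consumer.each_message do |message|
          event = JSON.parse(message.value)
          puts event["name"], event["url"]
        end
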
  10. Rails Producer

      # config/initializers/kafka_producer.rb
      require "kafka"

      # Configure the Kafka client with the broker hosts and the Rails logger.
      $kafka = Kafka.new(
        seed_brokers: ["kafka1:9092", "kafka2:9092"],
        logger: Rails.logger,
      )
  11. Rails Producer

      # ...
      # Set up an asynchronous producer that delivers its buffered messages
      # every ten seconds:
      $kafka_producer = $kafka.async_producer(
        delivery_interval: 10,
      )

      # Make sure to shut down the producer when exiting.
      at_exit { $kafka_producer.shutdown }
  12. Rails Producer

      class OrdersController < ApplicationController
        def create
          @order = Order.create!(params[:order])

          event = {
            order_id: @order.id,
            amount: @order.amount,
            timestamp: Time.now,
          }
          $kafka_producer.produce(event.to_json, topic: "order_events")
        end
      end
  13. Consumer Groups

      consumer = kafka.consumer(group_id: "my-consumer")
      consumer.subscribe("greetings")

      consumer.each_message do |message|
        puts message.topic, message.partition
        puts message.offset, message.key, message.value
      end
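
      The group_id is what makes consumers scale: processes sharing a group_id split the topic's partitions between them, while a different group_id gets its own independent copy of the stream. A sketch of the effect:

         # Run this same code in two dynos/processes:
         kafka.consumer(group_id: "my-consumer")  # each instance is assigned a
                                                  # disjoint subset of partitions
         # A separate group re-reads everything at its own pace:
         kafka.consumer(group_id: "audit")
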
  14. Heroku Router Logs

      $ heroku logs -a issuetriage -p router
      2016-05-04T04:57:12.222253+00:00 heroku[router]: at=info method=GET path="/haiwen/seafile" host=issuetriage.herokuapp.com request_id=cf59a503-3159-4d7c-8287-3ba52d7c44df fwd="144.76.27.118" dyno=web.2 connect=0ms service=166ms status=200 bytes=36360
  19. POST Request Body

      83 <40>1 2012-11-30T06:45:29+00:00 host app web.3 - State changed from starting to up
      119 <40>1 2012-11-30T06:45:26+00:00 host app web.3 - Starting process with command `bundle exec rackup config.ru -p 24405`
  20. POST Request Body

      83 <40>1 2012-11-30T06:45:29+00:00 host app web.3 - State changed from starting to up
      119 <40>1 2012-11-30T06:45:26+00:00 host app web.3 - Starting process with command `bundle exec rackup config.ru -p 24405`

      These frames do NOT conform to RFC 5424: the STRUCTURED-DATA field is left out but not replaced with a NILVALUE.
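
      The leading integer on each frame is octet-counting framing (RFC 6587): it states how many bytes of syslog message follow, which is how several messages get packed into one POST body. A minimal hand-rolled sketch of splitting frames (the next slide does this properly with Syslog::Stream):

         require "stringio"

         # Hypothetical frame splitter: each frame is "<byte count> <message>".
         def split_frames(body)
           io = StringIO.new(body)
           frames = []
           until io.eof?
             count = io.gets(" ").to_i  # the leading "83" or "119"
             frames << io.read(count)   # read exactly that many bytes
           end
           frames
         end
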
  21. Log Drain App (Producer)

      def process_messages(body_text)
        stream = Syslog::Stream.new(
          Syslog::Stream::OctetCountingFraming.new(StringIO.new(body_text)),
          parser: Syslog::Parser.new(allow_missing_structured_data: true)
        )
        messages = stream.messages.to_a
  22. Log Drain App (Producer)

        $kafka_pools[:producer].with do |producer|
          messages.each do |message|
            producer.produce(message.to_h.to_json, topic: message.procid) if message.procid == "router"
          end
        end
      end
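
      The deck never shows how $kafka_pools is built; the `.with do |producer|` checkout suggests a pool of producers shared across web threads. One way it could be set up, assuming the connection_pool gem (the size and timeout are made up):

         require "kafka"
         require "connection_pool"

         # Assumed setup for $kafka_pools: a pool of synchronous producers so
         # concurrent requests don't share one producer's buffer.
         $kafka_pools = {
           producer: ConnectionPool.new(size: 5, timeout: 5) do
             Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"]).producer
           end,
         }
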
  23. Create a Heroku Kafka Cluster

      $ heroku addons:create heroku-kafka:beta-dev -a kafka-demo
      Creating kafka-reticulated-61055... done, (free)
      Adding kafka-reticulated-61055 to kafka-demo... done
      The cluster should be available in 15-45 minutes.
      Run `heroku kafka:wait` to wait until the cluster is ready.
      ! WARNING: Kafka is in beta. Beta releases have a higher risk of data loss and downtime.
      ! Use with caution.
      Use `heroku addons:docs heroku-kafka` to view documentation.
  24. Cluster Info

      $ heroku kafka:info
      === KAFKA_URL
      Name:        kafka-reticulated-61055
      Created:     2016-04-19 19:54 UTC
      Plan:        Beta Dev
      Status:      available
      Version:     0.9.0.0
      Topics:      2 topics (see heroku kafka:list)
      Connections: 0 consumers (0 applications)
      Messages:    0.37 messages/s
      Traffic:     28 Bytes/s in / 12.1 KB/s out
  25. Topic Info

      $ heroku kafka:topic router
      === KAFKA_URL :: router
      Producers:          0.0 messages/second (0 Bytes/second) total
      Consumers:          20.8 KB/second total
      Partitions:         32 partitions
      Replication Factor: 1
      Compaction:         Compaction is disabled for router
      Retention:          24 hours
  26. Tail Topic

      $ heroku kafka:tail router
      router 20 2627 378 {"prival":158,"version":1,"timestamp":"2016-05-04 08:33:23 +0000","hostname":"ho
      router 20 2628 371 {"prival":158,"version":1,"timestamp":"2016-05-04 08:59:00 +0000","hostname":"ho
      router 20 2629 370 {"prival":158,"version":1,"timestamp":"2016-05-04 09:22:29 +0000","hostname":"ho
  27. Metrics Aggregator (Consumer)

      consumer = Kafka.new(...).consumer(group_id: "metrics")
      consumer.subscribe("router", default_offset: :latest)

      redis = Redis.new(url: ENV['REDIS_URL'])
      metrics = RouteMetrics.new(redis)

      consumer.each_message do |message|
        json = JSON.parse(message.value)
        route = Route.new(json)
        metrics.insert(route) if route.path
      end
  28. Metrics Aggregator (Consumer)

      def insert(route)
        path = route.path
        path_digest = Digest::SHA256.hexdigest(path)
        @redis.hset "routes", path, path_digest

        [:service, :connect].each do |metric|
          value = route.send(metric).to_i
          key = "#{path_digest}::#{metric}"
          @redis.hincrby key, "sum", value
          @redis.hincrby key, "count", 1
          @redis.hset key, "average",
            @redis.hget(key, "sum").to_i / @redis.hget(key, "count").to_f
        end

        @redis.hincrby "#{path_digest}::statuses", route.status, 1
      end
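
      Reading the aggregates back out is then just a few hash lookups. A sketch of a hypothetical read side (`report_for` is not from the deck; the key names follow the insert method above):

         # Hypothetical reader for the hashes written by #insert.
         def report_for(path)
           digest = @redis.hget("routes", path)
           {
             service_avg: @redis.hget("#{digest}::service", "average").to_f,
             connect_avg: @redis.hget("#{digest}::connect", "average").to_f,
             statuses:    @redis.hgetall("#{digest}::statuses"),
           }
         end
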
  29. Replay (Consumer)

      consumer = Kafka.new(...).consumer(group_id: "replay")
      consumer.subscribe("router", default_offset: :latest)
      client = HttpClient.httpClient(...)

      consumer.each_message do |message|
        json = JSON.parse(message.value)
        route = Route.new(json)
        controller.fork.start do
          client.get(java.net.URI.new("#{ENV['REPLAY_HOST']}#{route.path}")).then do |response|
            puts response.get_body.get_text
          end
        end
      end
  30. Kafka’s unique design can be used to help Rails apps become fast, scalable, and durable.