I Can’t Believe It’s Not A Queue: Using Kafka with Rails - RailsConf 2016

Video: https://www.youtube.com/watch?v=s3VQIGD5iGo

Your existing message system is great, until it gets overloaded. Then what? That's when you should try Kafka.

Kafka's designed to be resilient. It takes the stress out of moving from a Rails monolith into a scalable system of microservices. Since you can capture every event that happens in your app, it's great for logging. You can even use Kafka's distributed, ordered log to simulate production load in your staging environment.

Come and learn about Kafka, where it fits in your Rails app, and how to make it do the things that message queues simply can't.

hone

May 04, 2016

Transcript

  1. 1.
  2. 4.
  3. 7.
  4. 9.

    Agenda
    • What is Kafka?
    • Kafka + Ruby
    • Use Case: Metrics
    • Other Patterns
  5. 11.

    Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
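To make the "partitioned commit log" idea concrete, here is a toy in-memory model (an illustration only, not the Kafka API): a topic is a set of partitions, each an append-only log; producing returns an offset, and consumers read forward from an offset without deleting anything. The `ToyTopic` class and its methods are hypothetical names for this sketch.

```ruby
# Toy model of Kafka's core abstraction (illustration only, NOT the
# real Kafka client API).
class ToyTopic
  def initialize(partitions: 2)
    @partitions = Array.new(partitions) { [] }
  end

  # Append a message. Messages with the same key always land in the
  # same partition, which is what preserves per-key ordering.
  def produce(value, key: nil)
    partition = key ? key.hash % @partitions.size : rand(@partitions.size)
    log = @partitions[partition]
    log << value
    [partition, log.size - 1] # [partition, offset]
  end

  # Read every message at or after `offset` — reading does not consume.
  def fetch(partition, offset)
    @partitions[partition][offset..] || []
  end
end

topic = ToyTopic.new
p1, _ = topic.produce("hello1", key: "x")
p2, _ = topic.produce("hello2", key: "x")
topic.fetch(p1, 0) # => ["hello1", "hello2"] (same partition, in order)
```

Because the log is durable and replayable, two independent consumer groups can each read the full stream at their own pace — the property the talk leans on for both metrics and replay.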
  6. 14.

    "hundreds of thousands to millions of messages a second on a small cluster" — Tom Crayford, Heroku Kafka
  7. 15.
  8. 20.
  9. 28.

    Send a Message

    require "kafka"

    kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
    producer = kafka.producer
    producer.produce("hello1", topic: "test-messages")
  10. 29.

    Keyed Message

    require "kafka"

    kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
    producer = kafka.producer
    producer.produce("hello1", topic: "test-messages")
    producer.produce("hello2", key: "x", topic: "test-messages")
  11. 30.

    Message to a Partition

    require "kafka"

    kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
    producer = kafka.producer
    producer.produce("hello1", topic: "test-messages")
    producer.produce("hello2", key: "x", topic: "test-messages")
    producer.produce("hello3", topic: "test-messages", partition: 1)
  12. 31.

    Deliver Messages

    require "kafka"

    kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
    producer = kafka.producer
    producer.produce("hello1", topic: "test-messages")
    producer.produce("hello2", key: "x", topic: "test-messages")
    producer.produce("hello3", topic: "test-messages", partition: 1)
    producer.deliver_messages
  13. 32.

    Async Producer

    # `async_producer` will create a new asynchronous producer.
    producer = kafka.async_producer(
      # Trigger a delivery once 100 messages have been buffered.
      delivery_threshold: 100,

      # Trigger a delivery every second.
      delivery_interval: 1,
    )
  14. 33.

    Serialization

    event = {
      "name" => "pageview",
      "url"  => "https://example.com/posts/123",
      # ...
    }

    data = JSON.dump(event)
    producer.produce(data, topic: "events")
  15. 34.

    Rails Producer

    # config/initializers/kafka_producer.rb
    require "kafka"

    # Configure the Kafka client with the broker hosts and the Rails
    # logger.
    $kafka = Kafka.new(
      seed_brokers: ["kafka1:9092", "kafka2:9092"],
      logger: Rails.logger,
    )
  16. 35.

    Rails Producer

    # ...

    # Set up an asynchronous producer that delivers its buffered
    # messages every ten seconds:
    $kafka_producer = $kafka.async_producer(
      delivery_interval: 10,
    )

    # Make sure to shut down the producer when exiting.
    at_exit { $kafka_producer.shutdown }
  17. 36.

    Rails Producer

    class OrdersController < ApplicationController
      def create
        @order = Order.create!(params[:order])
        event = {
          order_id: @order.id,
          amount: @order.amount,
          timestamp: Time.now,
        }
        $kafka_producer.produce(event.to_json, topic: "order_events")
      end
    end
  18. 38.

    Consumer Groups

    consumer = kafka.consumer(group_id: "my-consumer")
    consumer.subscribe("greetings")

    consumer.each_message do |message|
      puts message.topic, message.partition
      puts message.offset, message.key, message.value
    end
  19. 44.

    Heroku Router Logs

    $ heroku logs -a issuetriage -p router
    2016-05-04T04:57:12.222253+00:00 heroku[router]: at=info method=GET path="/haiwen/seafile" host=issuetriage.herokuapp.com request_id=cf59a503-3159-4d7c-8287-3ba52d7c44df fwd="144.76.27.118" dyno=web.2 connect=0ms service=166ms status=200 bytes=36360
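The router line is a series of `key=value` pairs (logfmt), some quoted. A minimal sketch of pulling those pairs out in Ruby — the `parse_logfmt` helper is hypothetical, not code from the talk:

```ruby
# Minimal logfmt parsing sketch (hypothetical helper, not from the
# talk): extract key=value pairs, handling quoted values such as
# path="/haiwen/seafile".
def parse_logfmt(line)
  line.scan(/(\w+)=("[^"]*"|\S+)/).to_h do |key, value|
    [key, value.delete_prefix('"').delete_suffix('"')]
  end
end

line = 'at=info method=GET path="/haiwen/seafile" status=200 service=166ms'
parse_logfmt(line)
# => {"at"=>"info", "method"=>"GET", "path"=>"/haiwen/seafile",
#     "status"=>"200", "service"=>"166ms"}
```

Fields like `path`, `status`, `service`, and `connect` are exactly what the metrics consumer later in the talk aggregates.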
  24. 50.

    POST Request Body

    83 <40>1 2012-11-30T06:45:29+00:00 host app web.3 - State changed from starting to up
    119 <40>1 2012-11-30T06:45:26+00:00 host app web.3 - Starting process with command `bundle exec rackup config.ru -p 24405`
  25. 51.

    POST Request Body

    83 <40>1 2012-11-30T06:45:29+00:00 host app web.3 - State changed from starting to up
    119 <40>1 2012-11-30T06:45:26+00:00 host app web.3 - Starting process with command `bundle exec rackup config.ru -p 24405`

    This does NOT conform to RFC 5424: it leaves out the STRUCTURED-DATA field but does not replace it with a NILVALUE.
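The leading integers (`83`, `119`) are octet-counting framing (RFC 6587): each frame is `<LEN> <MSG>`, where `LEN` is the byte length of the message that follows. A minimal sketch of splitting such a body into frames — the `each_frame` helper is hypothetical; the talk itself uses a syslog parsing gem for this:

```ruby
require "stringio"

# Sketch of RFC 6587 octet-counting framing: read digits up to the
# first space, then read exactly that many bytes as one message.
# (Hypothetical helper, not the parser used in the talk.)
def each_frame(io)
  frames = []
  until io.eof?
    len_str = +""
    while (ch = io.read(1)) && ch != " "
      len_str << ch
    end
    frames << io.read(Integer(len_str))
  end
  frames
end

body = "5 hello6 world!"
each_frame(StringIO.new(body)) # => ["hello", "world!"]
```

Counting bytes rather than scanning for newlines is what lets syslog messages safely contain embedded newlines.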
  26. 54.
  27. 55.

    Log Drain App (Producer)

    def process_messages(body_text)
      messages = []
      stream = Syslog::Stream.new(
        Syslog::Stream::OctetCountingFraming.new(StringIO.new(body_text)),
        parser: Syslog::Parser.new(allow_missing_structured_data: true)
      )
      messages = stream.messages.to_a
  28. 56.

    Log Drain App (Producer)

      $kafka_pools[:producer].with do |producer|
        messages.each do |message|
          producer.produce(message.to_h.to_json, topic: message.procid) if message.procid == "router"
        end
      end
    end
  29. 59.

    Create a Heroku Kafka Cluster

    $ heroku addons:create heroku-kafka:beta-dev -a kafka-demo
    Creating kafka-reticulated-61055... done, (free)
    Adding kafka-reticulated-61055 to kafka-demo... done
    The cluster should be available in 15-45 minutes.
    Run `heroku kafka:wait` to wait until the cluster is ready.
    ! WARNING: Kafka is in beta. Beta releases have a higher risk of data loss and downtime.
    ! Use with caution.
    Use `heroku addons:docs heroku-kafka` to view documentation.
  30. 63.

    Cluster Info

    $ heroku kafka:info
    === KAFKA_URL
    Name:        kafka-reticulated-61055
    Created:     2016-04-19 19:54 UTC
    Plan:        Beta Dev
    Status:      available
    Version:     0.9.0.0
    Topics:      2 topics (see heroku kafka:list)
    Connections: 0 consumers (0 applications)
    Messages:    0.37 messages/s
    Traffic:     28 Bytes/s in / 12.1 KB/s out
  31. 64.

    Topic Info

    $ heroku kafka:topic router
    === KAFKA_URL :: router
    Producers:          0.0 messages/second (0 Bytes/second) total
    Consumers:          20.8 KB/second total
    Partitions:         32 partitions
    Replication Factor: 1
    Compaction:         Compaction is disabled for router
    Retention:          24 hours
  32. 65.

    Tail Topic

    $ heroku kafka:tail router
    router 20 2627 378 {"prival":158,"version":1,"timestamp":"2016-05-04 08:33:23 +0000","hostname":"ho
    router 20 2628 371 {"prival":158,"version":1,"timestamp":"2016-05-04 08:59:00 +0000","hostname":"ho
    router 20 2629 370 {"prival":158,"version":1,"timestamp":"2016-05-04 09:22:29 +0000","hostname":"ho
  33. 67.

    Metrics Aggregator (Consumer)

    consumer = Kafka.new(...).consumer(group_id: "metrics")
    consumer.subscribe("router", default_offset: :latest)

    redis   = Redis.new(url: ENV["REDIS_URL"])
    metrics = RouteMetrics.new(redis)

    consumer.each_message do |message|
      json  = JSON.parse(message.value)
      route = Route.new(json)
      metrics.insert(route) if route.path
    end
  34. 68.

    Metrics Aggregator (Consumer)

    def insert(route)
      path        = route.path
      path_digest = Digest::SHA256.hexdigest(path)
      @redis.hset "routes", path, path_digest

      [:service, :connect].each do |metric|
        value = route.send(metric).to_i
        key   = "#{path_digest}::#{metric}"
        @redis.hincrby key, "sum", value
        @redis.hincrby key, "count", 1
        @redis.hset key, "average",
          @redis.hget(key, "sum").to_i / @redis.hget(key, "count").to_f
      end

      @redis.hincrby "#{path_digest}::statuses", route.status, 1
    end
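The Redis hashes above keep a running `sum` and `count` per path digest and metric, recomputing the average on every insert. The same bookkeeping in plain Ruby, as an in-memory sketch (the `RollingMetric` class is hypothetical; the talk does this with HINCRBY/HSET on Redis):

```ruby
# In-memory sketch of the rolling-average bookkeeping the Redis hashes
# perform above (hypothetical class; the talk uses Redis commands).
class RollingMetric
  attr_reader :sum, :count

  def initialize
    @sum   = 0
    @count = 0
  end

  # Equivalent of HINCRBY key "sum" value / HINCRBY key "count" 1.
  def insert(value)
    @sum   += value
    @count += 1
  end

  # Equivalent of the stored "average" field: sum / count.
  def average
    @sum / @count.to_f
  end
end

service = RollingMetric.new
[166, 120, 98].each { |ms| service.insert(ms) }
service.average # => 128.0
```

Storing sum and count (rather than the average alone) is what makes the aggregate cheap to update incrementally as each router message arrives.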
  35. 69.

    Replay (Consumer)

    consumer = Kafka.new(...).consumer(group_id: "replay")
    consumer.subscribe("router", default_offset: :latest)

    client = HttpClient.httpClient(...)

    consumer.each_message do |message|
      json  = JSON.parse(message.value)
      route = Route.new(json)
      controller.fork.start do
        client.get(java.net.URI.new("#{ENV['REPLAY_HOST']}#{route.path}")).then do |response|
          puts response.get_body.get_text
        end
      end
    end
  36. 77.

    Kafka's unique design can be used to help Rails apps become fast, scalable, and durable
  37. 78.