Slide 1

Kafka & Karafka - Monitoring

Slide 2

Monitoring - Warning - No monitoring? Forget about using Kafka in production

Slide 3

Monitoring - Kafka cluster itself vs. application-related metrics - Both are critical - Karafka uses ruby-kafka (versions 1.x) under the hood, which has great integration with Datadog/StatsD
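
A minimal sketch of switching that integration on, assuming the ruby-kafka and dogstatsd-ruby gems are installed and a Datadog agent is reachable over StatsD. The file name and the environment variable names are illustrative assumptions, not something the slides specify.

```ruby
# config/initializers/kafka_monitoring.rb (hypothetical file name)

# Requiring the reporter subscribes it to ruby-kafka's instrumentation events.
require "kafka/datadog"

# Address of the Datadog agent's StatsD listener.
# Both environment variable names are assumptions for this sketch.
Kafka::Datadog.host = ENV.fetch("DD_AGENT_HOST", "127.0.0.1")
Kafka::Datadog.port = Integer(ENV.fetch("DD_DOGSTATSD_PORT", "8125"))

# Prefix for every emitted metric; "ruby_kafka" is the library default,
# which is why the metrics on later slides start with ruby_kafka.
Kafka::Datadog.namespace = "ruby_kafka"
```

This has to load before the first producer or consumer is created, otherwise early events go unreported.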

Slide 4

Kafka Broker metrics - Under-replicated partitions

Slide 5

Kafka Broker metrics - Active Controller Count

Slide 6

Kafka Broker metrics - messages_in_per_sec

Slide 7

Kafka Broker metrics - bytes_in_per_sec

Slide 8

Kafka Broker metrics - bytes_out_per_sec

Slide 9

Kafka Broker metrics - Leader count

Slide 10

Kafka Broker metrics - Offline Partitions Count

Slide 11

Kafka Broker metrics - Replication Max Lag

Slide 12

Kafka Broker metrics - (AWS MSK) data_logs_disk_used
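
Three of the broker metrics above have a well-known healthy value: exactly one active controller across the cluster, zero offline partitions, and zero under-replicated partitions. A rough sketch of turning those into alerts follows. `BrokerSnapshot` and `broker_alerts` are hypothetical names, and the numbers would come from whatever monitoring backend is in use.

```ruby
# Hypothetical container for cluster-wide values read from a metrics backend.
BrokerSnapshot = Struct.new(
  :active_controller_count,      # summed across all brokers
  :offline_partitions_count,
  :under_replicated_partitions,
  keyword_init: true
)

# Returns a list of human-readable alerts; an empty list means healthy.
def broker_alerts(snapshot)
  alerts = []

  # A cluster should have exactly one active controller at any time.
  unless snapshot.active_controller_count == 1
    alerts << "active controller count is #{snapshot.active_controller_count}, expected 1"
  end

  # Offline partitions have no leader, so they cannot be read or written.
  if snapshot.offline_partitions_count > 0
    alerts << "#{snapshot.offline_partitions_count} offline partition(s)"
  end

  # Under-replicated partitions have fewer in-sync replicas than configured.
  if snapshot.under_replicated_partitions > 0
    alerts << "#{snapshot.under_replicated_partitions} under-replicated partition(s)"
  end

  alerts
end

snapshot = BrokerSnapshot.new(
  active_controller_count: 1,
  offline_partitions_count: 0,
  under_replicated_partitions: 2
)
puts broker_alerts(snapshot)
```

The throughput, leader count, and disk metrics have no universal threshold; they are better compared against the cluster's own baseline.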

Slide 13

Producer metrics - is stuff working at all? - ruby_kafka.api.calls

Slide 14

Producer metrics - are messages getting published? - ruby_kafka.producer.deliver.messages

Slide 15

Producer metrics - are messages getting published? - ruby_kafka.producer.deliver.attempts

Slide 16

Producer metrics - are messages getting published? - ruby_kafka.producer.deliver.errors
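
One way to read the three producer delivery metrics together, as a sketch only. It assumes `deliver.attempts` reflects how many attempts each delivery call needed and `deliver.errors` counts delivery calls that ended in an exception, which is my understanding of the reporter rather than something stated on the slides. `producer_status` and its parameters are hypothetical names.

```ruby
# Classifies one monitoring window for a producer.
#   delivered_messages - sum of ruby_kafka.producer.deliver.messages
#   avg_attempts       - average of ruby_kafka.producer.deliver.attempts
#   errors             - sum of ruby_kafka.producer.deliver.errors
def producer_status(delivered_messages:, avg_attempts:, errors:)
  # Any failed delivery call is the most urgent signal.
  return :failing if errors.positive?

  # More than one attempt per call means retries are happening.
  return :retrying if avg_attempts > 1.0

  # No errors and no retries, but nothing was published either.
  return :idle if delivered_messages.zero?

  :ok
end

puts producer_status(delivered_messages: 1_200, avg_attempts: 1.0, errors: 0)
puts producer_status(delivered_messages: 900, avg_attempts: 1.8, errors: 0)
```

Whether `:idle` deserves an alert depends on the application; for a producer that should publish continuously it usually does.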

Slide 17

Consumer metrics - does stuff work at all? - ruby_kafka.api.calls

Slide 18

Consumer metrics - does stuff work at all? - ruby_kafka.api.errors

Slide 19

Consumer metrics - are messages getting consumed? - ruby_kafka.consumer.lag{*} by {topic,partition}
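
For reference, consumer lag per partition is conventionally the distance between the end of the log and the consumer's position. The sketch below uses that definition (log end offset minus committed offset); the exact offsets ruby-kafka uses for its own lag metric may differ slightly. `consumer_lag` and the sample data are hypothetical.

```ruby
# Both arguments are hashes keyed by [topic, partition].
#   log_end_offsets   - offset the next produced message would get
#   committed_offsets - offset of the next message the group will consume
# A partition with no committed offset is treated as unconsumed from
# offset 0, which is an assumption made for this sketch.
def consumer_lag(log_end_offsets, committed_offsets)
  log_end_offsets.each_with_object({}) do |(topic_partition, log_end), lags|
    committed = committed_offsets.fetch(topic_partition, 0)
    # Clamp at zero in case the two readings were taken at different moments.
    lags[topic_partition] = [log_end - committed, 0].max
  end
end

log_end   = { ["orders", 0] => 120, ["orders", 1] => 80 }
committed = { ["orders", 0] => 100, ["orders", 1] => 80 }

consumer_lag(log_end, committed).each do |(topic, partition), lag|
  puts "#{topic}/#{partition}: lag #{lag}"
end
```

Grouping by topic and partition, as on the slide, matters because a single stuck partition is easy to miss in a topic-wide average.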

Slide 20

Consumer metrics - are messages getting consumed? - ruby_kafka.consumer.messages by {topic,partition}

Slide 21

Consumer metrics - is there anything wrong going on with consumers? - ruby_kafka.consumer.leave_group

Slide 22

Consumer metrics - is there anything wrong going on with consumers? - ruby_kafka.consumer.sync_group

Slide 23

Consumer metrics - is there anything wrong going on with consumers? - ruby_kafka.consumer.join_group

Slide 24

Consumer metrics - is there anything wrong going on with consumers? - ruby_kafka.fetcher.queue_size

Slide 25

Thanks!