Running large scale Kafka clusters with minimum toil

Balthazar Rouberol showcases the tooling his Data Reliability Team built at Datadog to alleviate operational toil when running large Kafka clusters. He dives into the sources of toil and time consumption, the tools implemented to reduce them, and monitoring and general good practices.

Balthazar Rouberol

October 03, 2019