
How to scale a Logging Infrastructure


Logging infrastructure using ELK + Kafka

Paul Stack

June 03, 2015



Transcript

  1. How do you scale a logging infrastructure to accept a billion messages a day? Paul Stack http://twitter.com/stack72 mail: [email protected]
  2. About Me • Infrastructure Engineer for a cool startup :) • Reformed ASP.NET / C# Developer • DevOps Extremist • Conference Junkie
  3. “Kafka is a distributed publish-subscribe messaging system that is designed to be fast, scalable, and durable” Source: Cloudera Blog
  4. Introduction to Kafka • Kafka is made up of ‘topics’, ‘producers’, ‘consumers’ and ‘brokers’ • Communication is via TCP • Backed by ZooKeeper
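To make the vocabulary on this slide concrete, here is a toy in-memory model of the topic/producer/consumer/broker relationship. This is a sketch only, not the real Kafka client API: `Broker`, `publish` and `consume` are illustrative names, and real Kafka adds partitions, replication and network transport on top.

```python
from collections import defaultdict

# Toy in-memory broker: topics are named, append-only logs.
# Producers append messages; consumers read from an offset they
# track themselves, which is the key idea Kafka consumers also use.
class Broker:
    def __init__(self):
        self.topics = defaultdict(list)  # topic name -> ordered log

    def publish(self, topic, message):
        # A producer appends to the end of the topic's log.
        self.topics[topic].append(message)

    def consume(self, topic, offset=0):
        # A consumer reads everything from its own offset onward.
        return self.topics[topic][offset:]

broker = Broker()
broker.publish("app-logs", "GET /index 200")
broker.publish("app-logs", "GET /missing 404")
print(broker.consume("app-logs", offset=1))  # ['GET /missing 404']
```

Because consumers own their offsets, many independent consumers can read the same topic at different speeds without the broker tracking any of them.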
  5. Kafka Producers • Producers are responsible for choosing which topic to publish data to • The producer is also responsible for choosing a partition to write to • Partition selection can be round-robin or via a partitioning function
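The two partition-selection strategies the slide names can be sketched in a few lines of plain Python. This is not the Kafka client's own partitioner, just an illustration: `NUM_PARTITIONS` and the CRC32 hash are assumptions standing in for whatever the real client is configured with.

```python
import itertools
import zlib

NUM_PARTITIONS = 3  # assumed partition count for illustration

# Strategy 1 - round robin: spread keyless messages evenly.
_rr = itertools.cycle(range(NUM_PARTITIONS))
def round_robin_partition():
    return next(_rr)

# Strategy 2 - partitioning function: hash the message key so the
# same key always lands on the same partition, preserving per-key order.
def keyed_partition(key: bytes) -> int:
    return zlib.crc32(key) % NUM_PARTITIONS

print([round_robin_partition() for _ in range(4)])        # [0, 1, 2, 0]
print(keyed_partition(b"host-1") == keyed_partition(b"host-1"))  # True
```

Round robin maximises balance; a keyed partitioner trades some balance for ordering guarantees per key (e.g. all log lines from one host staying in sequence).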
  6. The Numbers • Logs kept in ES for 30 days, then archived • 12 billion documents active in ES • ES footprint of about 25-30 TB on EBS volumes • Average doc size ~1.2 KB • V-Day 2015: ~750M docs collected without failure
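A quick back-of-the-envelope check on these figures (the arithmetic below uses only numbers from the slide; the raw-vs-stored gap is plausibly replicas and index overhead, which is an assumption, not something the slide states):

```python
# Figures from the slide
docs = 12_000_000_000      # documents active in ES
avg_doc_kb = 1.2           # average document size in KB

# Raw document volume, KB -> TB (1 TB = 1024**3 KB)
raw_tb = docs * avg_doc_kb / 1024**3
print(round(raw_tb, 1))    # ~13.4 TB raw, vs 25-30 TB stored

# The talk's headline rate, as a sustained per-second figure
per_sec = 1_000_000_000 / 86_400
print(int(per_sec))        # ~11574 messages/second
```

So roughly 13 TB of raw documents roughly doubling to the observed 25-30 TB is consistent with, for example, one replica per shard, and a billion messages a day means sustaining on the order of 11-12k messages per second.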
  7. Monitoring - Nagios • Alerts on: • ES cluster • ZooKeeper and Kafka nodes • Logstash / Redis nodes
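An ES-cluster alert of the kind this slide lists would typically be a Nagios plugin, which signals state through the standard plugin contract: print one status line and return 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below assumes that contract; the mapping from Elasticsearch's green/yellow/red cluster health to alert levels is an illustrative choice, not taken from the deck.

```python
# Minimal Nagios-style check sketch. A real plugin would fetch the
# cluster status (e.g. from ES's cluster health endpoint) and finish
# with sys.exit(check_es_cluster(status)).
def check_es_cluster(status: str) -> int:
    """Map an Elasticsearch cluster status to a Nagios exit code."""
    if status == "green":
        print("OK - ES cluster is green")
        return 0
    if status == "yellow":
        print("WARNING - ES cluster is yellow")
        return 1
    if status == "red":
        print("CRITICAL - ES cluster is red")
        return 2
    print("UNKNOWN - unexpected status %r" % status)
    return 3

code = check_es_cluster("yellow")  # prints the WARNING line, returns 1
```

The same exit-code pattern covers the other checks on the slide (ZooKeeper, Kafka, Logstash, Redis); only the status-gathering step changes per service.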