Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to scale a Logging Infrastructure
Search
Paul Stack
June 03, 2015
Technology
0
190
How to scale a Logging Infrastructure
Logging infrastructure using ELK + Kafka
Paul Stack
June 03, 2015
Tweet
Share
More Decks by Paul Stack
See All by Paul Stack
Infrastructure as Software
stack72
0
82
Mirror, Mirror on the way, what is the vainest metric of them all?
stack72
1
2.4k
Continuously Delivering Infrastructure to the Cloud
stack72
0
210
DevOops 2016
stack72
0
130
The Quest for Infrastructure Management 2.0
stack72
0
160
The Biggest Trick Consultants Ever Pulled was Telling The World Continuous Delivery is Easy
stack72
1
130
The Transition from Product to Infrastructure
stack72
0
75
Continuous Delivery - the missing parts
stack72
0
980
Windows: Having its ass kicked by puppet and powershell
stack72
0
150
Other Decks in Technology
See All in Technology
AI駆動開発における設計思想 認知負荷を下げるフロントエンドアーキテクチャ/ 20251211 Teppei Hanai
shift_evolve
PRO
2
390
年間40件以上の登壇を続けて見えた「本当の発信力」/ 20251213 Masaki Okuda
shift_evolve
PRO
1
130
「図面」から「法則」へ 〜メタ視点で読み解く現代のソフトウェアアーキテクチャ〜
scova0731
0
160
Challenging Hardware Contests with Zephyr and Lessons Learned
iotengineer22
0
210
2025年 開発生産「可能」性向上報告 サイロ解消からチームが能動性を獲得するまで/ 20251216 Naoki Takahashi
shift_evolve
PRO
1
150
SSO方式とJumpアカウント方式の比較と設計方針
yuobayashi
7
680
AI 駆動開発勉強会 フロントエンド支部 #1 w/あずもば
1ftseabass
PRO
0
370
Lookerで実現するセキュアな外部データ提供
zozotech
PRO
0
110
MapKitとオープンデータで実現する地図情報の拡張と可視化
zozotech
PRO
1
140
初めてのDatabricks AI/BI Genie
taka_aki
0
160
5分で知るMicrosoft Ignite
taiponrock
PRO
0
370
AWSを使う上で最低限知っておきたいセキュリティ研修を社内で実施した話 ~みんなでやるセキュリティ~
maimyyym
2
1.3k
Featured
See All Featured
It's Worth the Effort
3n
187
29k
Raft: Consensus for Rubyists
vanstee
141
7.2k
Mobile First: as difficult as doing things right
swwweet
225
10k
Balancing Empowerment & Direction
lara
5
800
Into the Great Unknown - MozCon
thekraken
40
2.2k
How to Ace a Technical Interview
jacobian
280
24k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
359
30k
Done Done
chrislema
186
16k
Automating Front-end Workflow
addyosmani
1371
200k
Making Projects Easy
brettharned
120
6.5k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.1k
Designing for humans not robots
tammielis
254
26k
Transcript
How do you scale a logging infrastructure to accept a
billion messages a day? Paul Stack http://twitter.com/stack72 mail:
[email protected]
About Me Infrastructure Engineer for a cool startup :) Reformed
ASP.NET / C# Developer DevOps Extremist Conference Junkie
Background Project was to replace the legacy ‘logging solution’
Iteration 0: A Developer created a single box with the
ELK all in 1 jar
Time to make it production ready now
None
Iteration 1: Using Redis as the input mechanism for LogStash
None
None
Enter Apache Kafka
“Kafka is a distributed publish- subscribe messaging system that is
designed to be fast, scalable, and durable” Source: Cloudera Blog
Introduction to Kafka • Kafka is made up of ‘topics’,
‘producers’, ‘consumers’ and ‘brokers’ • Communication is via TCP • Backed by Zookeeper
Kafka Topics Source: http://kafka.apache.org/documentation.html
Kafka Producers • Producers are responsible to chose what topic
to publish data to • The producer is responsible for choosing a partition to write to • Can be handled round robin or partition functions
Kafka Consumers • Consumption can be done via: • queuing
• pub-sub
Kafka Consumers • Kafka consumer group • Strong ordering
Kafka Consumers • Strong ordering
https://github.com/opentable/puppet-exhibitor
None
Iteration 2 Introduction of Kafka
None
None
Iteration 3 Further ‘Improvements’ to the cluster layout
None
The Numbers • Logs kept in ES for 30 days
then archived • 12 billion documents active in ES • ES space was about 25 - 30TB in EBS volumes • Average Doc Size ~ 1.2KB • V-Day 2015: ~750M docs collected without failure
What about metrics and monitoring?
Monitoring - Nagios • Alerts on • ES Cluster •
zK and Kafka Nodes • Logstash / Redis nodes
None
https://github.com/stack72/nagios-elasticsearch
Metrics - Kafka Offset Monitor
https://github.com/opentable/KafkaOffsetMonitor
Metrics - ElasticSearch
None
None
None
Visibility Rocks!
None
So what would I do differently?
Questions?
Paul Stack @stack72