How to scale a Logging Infrastructure
Paul Stack
June 03, 2015
Logging infrastructure using ELK + Kafka
Transcript
How do you scale a logging infrastructure to accept a billion messages a day?
Paul Stack
http://twitter.com/stack72
mail: [email protected]
About Me
• Infrastructure Engineer for a cool startup :)
• Reformed ASP.NET / C# Developer
• DevOps Extremist
• Conference Junkie
Background
• Project was to replace the legacy ‘logging solution’
Iteration 0: A developer created a single box running ELK all in one jar
Time to make it production ready now
Iteration 1: Using Redis as the input mechanism for LogStash
Enter Apache Kafka
“Kafka is a distributed publish-subscribe messaging system that is designed to be fast, scalable, and durable”
Source: Cloudera Blog
Introduction to Kafka
• Kafka is made up of ‘topics’, ‘producers’, ‘consumers’ and ‘brokers’
• Communication is via TCP
• Backed by ZooKeeper
Kafka Topics
Source: http://kafka.apache.org/documentation.html
Kafka Producers
• Producers are responsible for choosing which topic to publish data to
• The producer is also responsible for choosing which partition to write to
• Partition choice can be handled round robin or by a partition function
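The two partition strategies named on this slide can be sketched in a few lines. This is a minimal illustration, not the code of any Kafka client: the function names `round_robin_partitioner` and `hash_partitioner`, the key `"web-01"` and the partition count of 3 are all assumptions made for the example, and `zlib.crc32` is used only because it is stable between runs.

```python
import itertools
import zlib


def round_robin_partitioner(num_partitions):
    """Return a chooser that cycles through partitions and ignores the key."""
    counter = itertools.count()

    def choose(_key):
        return next(counter) % num_partitions

    return choose


def hash_partitioner(num_partitions):
    """Return a chooser that always maps the same key to the same partition."""

    def choose(key):
        # crc32 is an illustrative stand-in, not the hash a real client uses.
        return zlib.crc32(key.encode("utf-8")) % num_partitions

    return choose


rr = round_robin_partitioner(3)
by_host = hash_partitioner(3)

rr_choices = [rr("web-01") for _ in range(6)]
host_choices = [by_host("web-01") for _ in range(6)]

print(rr_choices)    # cycles through 0, 1, 2 twice
print(host_choices)  # the same partition number six times
```

Round robin spreads load evenly, while a key-based function keeps all messages for one key (here, a hypothetical host name) in one partition, which matters once ordering comes into play.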
Kafka Consumers
• Consumption can be done via:
  • queuing
  • pub-sub
Kafka Consumers
• Kafka consumer group
• Strong ordering
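How consumer groups give both queuing and pub-sub can be shown with a small in-memory simulation. It is a sketch under stated assumptions rather than real Kafka client code: the helpers `assign_partitions` and `deliver`, the group names and the consumer names are hypothetical. It relies on two Kafka properties: each partition is read by exactly one consumer within a group, and every group receives every message.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group (round robin)."""
    assignment = defaultdict(list)
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return dict(assignment)


def deliver(log, groups):
    """Deliver every message once per group, to the owner of its partition.

    log:    dict of partition number -> ordered list of messages
    groups: dict of group name -> list of consumer names
    """
    received = defaultdict(list)
    for group, consumers in groups.items():
        owners = assign_partitions(sorted(log), consumers)
        for consumer, partitions in owners.items():
            for partition in partitions:
                # Messages are handed over in partition order.
                received[(group, consumer)].extend(log[partition])
    return dict(received)


log = {0: ["a1", "a2"], 1: ["b1", "b2"]}

# One shared group: the consumers split the partitions (queuing).
queue_view = deliver(log, {"indexers": ["c1", "c2"]})

# One group per consumer: each consumer sees every message (pub-sub).
pubsub_view = deliver(log, {"indexer": ["c1"], "archiver": ["c2"]})

print(queue_view)
print(pubsub_view)
```

Because a partition has a single reader per group, messages within a partition arrive in the order they were written; ordering across partitions is not guaranteed.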
https://github.com/opentable/puppet-exhibitor
Iteration 2: Introduction of Kafka
Iteration 3: Further ‘Improvements’ to the cluster layout
The Numbers
• Logs kept in ES for 30 days, then archived
• 12 billion documents active in ES
• ES space was about 25-30 TB in EBS volumes
• Average doc size ~1.2 KB
• V-Day 2015: ~750M docs collected without failure
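A quick back-of-the-envelope check ties these figures together. The inputs come from the slides and the talk title; the use of decimal units (1 TB = 10^9 KB) and the variable names are assumptions made for this sketch.

```python
DOCS_ACTIVE = 12_000_000_000     # documents active in ES (from the slide)
AVG_DOC_KB = 1.2                 # average document size in KB (from the slide)
RETENTION_DAYS = 30              # retention before archiving (from the slide)
TARGET_PER_DAY = 1_000_000_000   # "a billion messages a day" (talk title)

# Raw document volume, assuming decimal units: 1 TB = 1e9 KB.
raw_tb = DOCS_ACTIVE * AVG_DOC_KB / 1e9

# Average daily intake implied by 12 billion documents over 30 days.
docs_per_day = DOCS_ACTIVE / RETENTION_DAYS

# Sustained rate needed to accept a billion messages in 24 hours.
target_per_second = TARGET_PER_DAY / 86_400

print(f"raw document volume: {raw_tb:.1f} TB")
print(f"average intake: {docs_per_day / 1e6:.0f}M docs/day")
print(f"billion-a-day rate: {target_per_second:,.0f} msgs/s")
```

The raw volume works out to roughly half of the 25-30 TB reported on disk. The slides do not say what accounts for the difference, so the gap should not be read as a measured overhead figure.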
What about metrics and monitoring?
Monitoring - Nagios
• Alerts on
  • ES cluster
  • ZooKeeper and Kafka nodes
  • Logstash / Redis nodes
https://github.com/stack72/nagios-elasticsearch
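An alert on the ES cluster could look roughly like the following Nagios-style check. This is a hedged sketch and not the code from the repository linked above: the function name `check_cluster_health`, the sample response and the choice to map yellow to WARNING are assumptions. It relies only on two well-known conventions: the Elasticsearch cluster health API reports a `status` of green, yellow or red, and Nagios plugins signal state through exit codes 0 to 3.

```python
import json

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Assumed mapping for this example; a real check may choose differently.
STATUS_TO_EXIT = {"green": OK, "yellow": WARNING, "red": CRITICAL}


def check_cluster_health(response_body):
    """Turn a cluster health JSON body into (exit_code, message)."""
    try:
        status = json.loads(response_body)["status"]
    except (ValueError, KeyError, TypeError):
        return UNKNOWN, "UNKNOWN - could not parse cluster health response"
    code = STATUS_TO_EXIT.get(status, UNKNOWN)
    return code, f"{LABELS[code]} - cluster status is {status}"


# Hypothetical response body, trimmed to a few fields.
sample = '{"cluster_name": "logs", "status": "yellow", "number_of_nodes": 3}'
code, message = check_cluster_health(sample)
print(code, message)
```

In a real plugin the body would come from an HTTP GET against the cluster's `_cluster/health` endpoint, and the script would finish with `sys.exit(code)` so Nagios can read the state.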
Metrics - Kafka Offset Monitor
https://github.com/opentable/KafkaOffsetMonitor
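The core number an offset monitor tracks is consumer lag: how far a consumer group's committed offset trails the end of each partition's log. The sketch below shows only that calculation. The function name `consumer_lag` and the offset values are made up for illustration, and it does not reflect how KafkaOffsetMonitor itself is implemented.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Return per-partition lag: messages written but not yet consumed."""
    lag = {}
    for partition, end in log_end_offsets.items():
        # A partition with no committed offset is treated as unread from 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


# Hypothetical offsets for a three-partition topic.
log_end = {0: 1500, 1: 1480, 2: 1510}
committed = {0: 1500, 1: 1200, 2: 1490}

lag = consumer_lag(log_end, committed)
total_lag = sum(lag.values())

print(lag)        # partition 1 is the one falling behind
print(total_lag)
```

Lag that keeps growing means the consumers cannot keep up with the producers, which makes it a useful early warning for a pipeline like this one.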
Metrics - ElasticSearch
Visibility Rocks!
So what would I do differently?
Questions?
Paul Stack @stack72