Ido Barkan
Using Druid: Analyzing web access logs for 8 billion events per day
AppsFlyer
July 27, 2016
Transcript
Ido Barkan: Analyzing web access logs for 8 billion events per day
5xx Errors
AppsFlyer receives around 8 billion web events per day.
Micro Services Architecture Real Time Attr.
AWS Elastic Load Balancer log entry
A log line:
2016-02-06T12:51:54.201846Z Appsflyer-web 139.162.156.169:50435 10.10.8.90:6555 0.000021 0.001916 0.00001 200 200 780 2 "POST https://track.appsflyer.com:443/... HTTP/1.1" "Dalvik/1.6.0 (Linux; U; Android 4.4.4; SM-J110H Build/KTU84P)" ECDHE-RSA-AES128-SHA TLSv1
$ head -1 195229424603_elasticloadbalancing_eu-west-1_appsflyer-web_.log | wc -c
331
Total: 300-1500 bytes => subsampling of 1/10 => approx. 223 GB daily.
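As a rough check on the 223 GB figure, the arithmetic can be reproduced under two assumptions that the slide does not state: that the estimate uses the 300-byte lower bound as the line size, and that "GB" here means 2^30 bytes. The constant and function names below are illustrative only.

```python
EVENTS_PER_DAY = 8_000_000_000  # "around 8B web events per day"
SAMPLE_ONE_IN = 10              # "sub sampling of 1/10"
BYTES_PER_LINE = 300            # assumption: lower bound of the 300-1500 byte range


def daily_volume_gib(events=EVENTS_PER_DAY, one_in=SAMPLE_ONE_IN,
                     line_bytes=BYTES_PER_LINE):
    """Estimate the sampled daily log volume in GiB (2**30 bytes)."""
    sampled_lines = events // one_in
    return sampled_lines * line_bytes / 2**30


print(round(daily_volume_gib(), 1))
```

Under these assumptions the estimate comes out at roughly 223.5, which is consistent with the slide; with longer average lines the daily volume would be proportionally larger.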
What was missing? No transparency of incoming web requests.
• # of error (400 / 500) responses grouped by app
• # of events grouped by app
• # of events grouped by response code
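To make the three questions concrete, here is a minimal sketch of the aggregations over a handful of made-up events. The app ids and the (app, status) tuple shape are hypothetical, and "error (400 / 500)" is read here as any 4xx or 5xx status, which is an assumption.

```python
from collections import Counter

# Hypothetical, already-parsed events: (app id, HTTP response code).
events = [
    ("app.a", 200), ("app.a", 500), ("app.b", 200),
    ("app.b", 404), ("app.a", 200), ("app.b", 503),
]


def is_error(status):
    """Assumption: an "error" is any 4xx or 5xx response."""
    return 400 <= status < 600


# Number of error responses grouped by app.
errors_by_app = Counter(app for app, status in events if is_error(status))
# Number of events grouped by app.
events_by_app = Counter(app for app, _ in events)
# Number of events grouped by response code.
events_by_code = Counter(status for _, status in events)
```

At 8 billion events per day these counts cannot be kept in one process like this, which is why the rest of the talk looks at a dedicated analytics store.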
What wasn't missing?
• No single-event granularity, only analytics
• No fancy enterprise features (role-based access, alerts etc.)
Possible solutions
1. Our own ELK: will not hold the volume
2. SaaS-based ELK (logz.io, loggly...): expensive and gives more than we want.
Data flow: Log to bucket → Trigger Lambda → Druid sink service
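The Lambda step of this flow might look roughly like the sketch below. The slides do not show the actual function, so everything here is an assumption: the handler is built from two injected callables, read_lines (fetch a log object's lines from the bucket) and send_to_sink (forward one line to the Druid sink service), both hypothetical. Only the shape of the S3 notification event follows the AWS format.

```python
from urllib.parse import unquote_plus


def new_log_objects(event):
    """Yield (bucket, key) for every object named in an S3 notification event."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def make_handler(read_lines, send_to_sink):
    """Build a Lambda-style handler from two injected (hypothetical) callables."""
    def handler(event, context=None):
        forwarded = 0
        for bucket, key in new_log_objects(event):
            for line in read_lines(bucket, key):
                send_to_sink(line)
                forwarded += 1
        return {"forwarded": forwarded}
    return handler
```

Injecting the two callables keeps the sketch free of any AWS client dependency and lets the handler be exercised locally with a fake event.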
Druid configured naively
• 3 data nodes (historical+RT)
• 1 master (coordinator)
• 1 broker
• No data duplication
• 7d data retention
• Only 5 machines
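The slides do not include the actual cluster configuration. As an illustration only, "no data duplication" and "7d data retention" could be expressed as Druid coordinator retention rules along these lines: load the last seven days with a single replica, then drop everything older. The function name and its defaults are assumptions.

```python
import json


def retention_rules(days=7, replicas=1, tier="_default_tier"):
    """Build a Druid rule chain: keep `days` of data, drop the rest."""
    return [
        {
            "type": "loadByPeriod",
            "period": f"P{days}D",               # ISO 8601 period, e.g. P7D
            "tieredReplicants": {tier: replicas},  # 1 replica = no duplication
        },
        {"type": "dropForever"},
    ]


print(json.dumps(retention_rules(), indent=2))
```

Rules are evaluated in order, so segments newer than the period match the load rule and anything older falls through to the drop rule.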
Basic log processing
2016-02-06T12:51:54.201846Z Appsflyer-web 139.162.156.169:50435 10.10.8.90:6555 0.000021 0.001916 0.00001 200 200 780 2 "POST https://track.appsflyer.com:443/... HTTP/1.1" "Dalvik/1.6.0 (Linux; U; Android 4.4.4; SM-J110H Build/KTU84P)" ECDHE-RSA-AES128-SHA TLSv1
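A minimal sketch of how such a line could be split into named fields is shown below. It is not the parser used in the talk. The field names follow the classic ELB access log layout, and the output key names (method, url, protocol) are chosen for illustration.

```python
import shlex
from datetime import datetime

FIELDS = [
    "timestamp", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
]


def to_int(value):
    """Return an int, or None for placeholders such as "-"."""
    return int(value) if value.lstrip("-").isdigit() else None


def parse_elb_line(line):
    """Parse one classic ELB access log line into a dict."""
    values = shlex.split(line)  # keeps the quoted request and user agent whole
    if len(values) < len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    entry = dict(zip(FIELDS, values))
    entry["timestamp"] = datetime.strptime(
        entry["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
    for name in ("elb_status_code", "backend_status_code",
                 "received_bytes", "sent_bytes"):
        entry[name] = to_int(entry[name])
    for name in ("request_processing_time", "backend_processing_time",
                 "response_processing_time"):
        entry[name] = float(entry[name])
    # The request field is "METHOD URL PROTOCOL".
    method, url, protocol = entry.pop("request").split(" ", 2)
    entry.update(method=method, url=url, protocol=protocol)
    return entry


SAMPLE = ('2016-02-06T12:51:54.201846Z Appsflyer-web 139.162.156.169:50435 '
          '10.10.8.90:6555 0.000021 0.001916 0.00001 200 200 780 2 '
          '"POST https://track.appsflyer.com:443/... HTTP/1.1" '
          '"Dalvik/1.6.0 (Linux; U; Android 4.4.4; SM-J110H Build/KTU84P)" '
          'ECDHE-RSA-AES128-SHA TLSv1')

entry = parse_elb_line(SAMPLE)
```

For analytics, only a few of these fields (timestamp, status codes, the app derived from the URL) would need to become dimensions; the rest can be dropped before ingestion.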
Demo!
• druidquery
• caravel
Thank you
[email protected]
Questions?
We are hiring!