Ido Barkan
AppsFlyer
July 27, 2016
Using Druid: Analyzing web access logs for 8 billion events per day
Transcript
Ido Barkan: Analyzing web access logs for 8 billion events per day
5xx Errors
Appsflyer gets around 8B web events per day.
Micro Services Architecture Real Time Attr.
AWS Elastic load balancer Log entry
A log line
2016-02-06T12:51:54.201846Z Appsflyer-web 139.162.156.169:50435 10.10.8.90:6555 0.000021 0.001916 0.00001 200 200 780 2 "POST https://track.appsflyer.com:443/... HTTP/1.1" "Dalvik/1.6.0 (Linux; U; Android 4.4.4; SM-J110H Build/KTU84P)" ECDHE-RSA-AES128-SHA TLSv1
$ head -1 195229424603_elasticloadbalancing_eu-west-1_appsflyer-web_.log | wc -c
331
Total: 300-1500 bytes => subsampling of 1/10 => approx. 223 GB daily
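The daily figure on this slide can be roughly reproduced with a back-of-the-envelope calculation. The sketch below assumes that every sampled line weighs in at the low end of the quoted range (300 bytes) and that "GB" is counted in binary units (2^30 bytes). Both are assumptions made only to show that the numbers are mutually consistent, not something the slides state.

```python
# Rough volume estimate for sampled ELB access logs.
EVENTS_PER_DAY = 8_000_000_000   # "around 8B web events per day" (from the slides)
SAMPLE_EVERY = 10                # keep 1 line out of every 10 (from the slides)
ASSUMED_BYTES_PER_LINE = 300     # assumption: low end of the 300-1500 byte range


def daily_volume_gib(events_per_day: int, sample_every: int, bytes_per_line: int) -> float:
    """Return the sampled log volume per day in GiB (2**30 bytes)."""
    sampled_lines = events_per_day // sample_every
    return sampled_lines * bytes_per_line / 2**30


if __name__ == "__main__":
    volume = daily_volume_gib(EVENTS_PER_DAY, SAMPLE_EVERY, ASSUMED_BYTES_PER_LINE)
    print(f"{volume:.1f} GiB per day")  # about 223.5
```

With longer lines (up to 1500 bytes) the same formula gives up to five times that volume, so the quoted figure reads as a lower-bound estimate.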
What was missing?
No transparency of incoming web requests.
• # of error (400 / 500) responses grouped by app
• # of events grouped by app
• # of events grouped by response code
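These three questions map naturally onto Druid native groupBy queries. The sketch below only builds the JSON query bodies. The data source name (`elb_access_logs`), the dimension names (`app_id`, `elb_status_code`) and the rolled-up `count` metric are hypothetical, since the talk does not show the actual schema. Which status codes count as "errors" is likewise left to the caller.

```python
import json

DATA_SOURCE = "elb_access_logs"        # hypothetical data source name
STATUS_DIMENSION = "elb_status_code"   # hypothetical dimension name


def group_by_query(dimension, interval, status_codes=None):
    """Build a Druid native groupBy query that sums events per dimension value.

    Assumes the data source was ingested with a rolled-up metric named "count".
    """
    query = {
        "queryType": "groupBy",
        "dataSource": DATA_SOURCE,
        "granularity": "all",
        "dimensions": [dimension],
        "aggregations": [
            {"type": "longSum", "name": "events", "fieldName": "count"}
        ],
        "intervals": [interval],
    }
    if status_codes:
        # Keep only rows whose status code matches one of the given values.
        query["filter"] = {
            "type": "or",
            "fields": [
                {"type": "selector", "dimension": STATUS_DIMENSION, "value": str(code)}
                for code in status_codes
            ],
        }
    return query


if __name__ == "__main__":
    day = "2016-02-06/2016-02-07"
    errors_by_app = group_by_query("app_id", day, status_codes=[400, 500])
    events_by_app = group_by_query("app_id", day)
    events_by_code = group_by_query(STATUS_DIMENSION, day)
    print(json.dumps(errors_by_app, indent=2))
```

Each dictionary would be serialized to JSON and POSTed to the broker node.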
What wasn’t missing?
• No single-event granularity: only analytics
• No fancy enterprise features (role-based access, alerts, etc.)
Possible solutions
1. Our own ELK: will not hold the volume
2. SaaS-based ELK (logz.io, loggly...): expensive and gives more than we want
Data flow: Log to bucket → Trigger Lambda → Druid sink service
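One possible shape for the Lambda step is sketched below. It relies only on the standard S3 event notification structure (`Records[].s3.bucket.name` and `Records[].s3.object.key`). The `read_lines` and `send_to_sink` callables are hypothetical stand-ins for reading the log object and for calling the Druid sink service, neither of which is described in the slides.

```python
from urllib.parse import unquote_plus


def object_refs(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def make_handler(read_lines, send_to_sink):
    """Build a Lambda-style handler.

    read_lines(bucket, key) -> iterable of log lines   (hypothetical helper)
    send_to_sink(line)      -> forwards one line        (hypothetical helper)
    """
    def handler(event, context):
        forwarded = 0
        for bucket, key in object_refs(event):
            for line in read_lines(bucket, key):
                send_to_sink(line)
                forwarded += 1
        return {"forwarded": forwarded}

    return handler


if __name__ == "__main__":
    fake_event = {
        "Records": [
            {"s3": {"bucket": {"name": "logs"}, "object": {"key": "elb/a%3Db.log"}}}
        ]
    }
    sent = []
    handler = make_handler(lambda bucket, key: ["line-1", "line-2"], sent.append)
    print(handler(fake_event, None), sent)
```

Injecting the two helpers keeps the handler testable without AWS credentials. In a real deployment they would wrap an S3 client and an HTTP call to the sink.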
Druid configured naively
• 3 data nodes (historical+RT)
• 1 master (coordinator)
• 1 broker
• No data duplication
• 7d data retention
• Only 5 machines
Basic log processing
2016-02-06T12:51:54.201846Z Appsflyer-web 139.162.156.169:50435 10.10.8.90:6555 0.000021 0.001916 0.00001 200 200 780 2 "POST https://track.appsflyer.com:443/... HTTP/1.1" "Dalvik/1.6.0 (Linux; U; Android 4.4.4; SM-J110H Build/KTU84P)" ECDHE-RSA-AES128-SHA TLSv1
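A minimal sketch of what such processing might look like: the line above follows the Classic Load Balancer access log layout (15 space-separated fields, two of them quoted), so it can be split with `shlex` and mapped to named fields. The field names follow the AWS documentation. The output record shape is an assumption, not the format the talk's pipeline actually used.

```python
import shlex

# Field order of a Classic ELB access log entry.
FIELDS = [
    "timestamp", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
]


def _to_int(value):
    """Convert a numeric field to int; ELB writes '-' when it has no value."""
    return int(value) if value.isdigit() else None


def parse_elb_line(line):
    """Parse one access log line into a dict of named fields."""
    values = shlex.split(line)  # honours the double-quoted request and user agent
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for name in ("elb_status_code", "backend_status_code", "received_bytes", "sent_bytes"):
        record[name] = _to_int(record[name])
    # The request field has the form 'METHOD URL PROTOCOL'.
    parts = record.pop("request").split(" ", 2)
    if len(parts) == 3:
        record["method"], record["url"], record["http_version"] = parts
    return record


if __name__ == "__main__":
    sample = (
        '2016-02-06T12:51:54.201846Z Appsflyer-web 139.162.156.169:50435 '
        '10.10.8.90:6555 0.000021 0.001916 0.00001 200 200 780 2 '
        '"POST https://track.appsflyer.com:443/... HTTP/1.1" '
        '"Dalvik/1.6.0 (Linux; U; Android 4.4.4; SM-J110H Build/KTU84P)" '
        'ECDHE-RSA-AES128-SHA TLSv1'
    )
    print(parse_elb_line(sample))
```

Only a handful of these fields (for example the status code and something identifying the app) need to become Druid dimensions. The rest can be dropped before ingestion to keep the data small.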
Demo!
• druidquery
• caravel
Thank you
[email protected]
Questions?
We are hiring!