Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Casual Log Collection and Querying with fluent-...
Search
UENISHI Kota
June 01, 2013
Technology
3
450
Casual Log Collection and Querying with fluent-plugin-riak
My talk at RubyKaigi 2013
http://rubykaigi.org/2013/talk/S70
UENISHI Kota
June 01, 2013
Tweet
Share
More Decks by UENISHI Kota
See All by UENISHI Kota
Storage Systems in Preferred Networks
kuenishi
0
32
Metadata Management in Distributed File Systems
kuenishi
2
510
Behind The Scenes: Cloud Native Storage System for AI
kuenishi
2
400
Apache Ozone behind Simulation and AI Industries
kuenishi
0
380
Distributed Deep Learning with Chainer and Hadoop
kuenishi
3
1.2k
A Few Ways to Accelerate Deep Learning
kuenishi
0
1.1k
Introducing Retz
kuenishi
5
1.1k
Introducing Retz and how to develop practical frameworks
kuenishi
3
740
Formalization and Proof of Distributed Systems (ja)
kuenishi
10
6.4k
Other Decks in Technology
See All in Technology
AIエージェントを支える設計
tkikuchi1002
12
2.5k
株式会社島津製作所_研究開発(集団協業と知的生産)の現場を支える、OSS知識基盤システムの導入
akahane92
1
1.3k
AI駆動開発 with MixLeap Study【大阪支部 #3】
lycorptech_jp
PRO
0
280
【CEDEC2025】大規模言語モデルを活用したゲーム内会話パートのスクリプト作成支援への取り組み
cygames
PRO
1
540
怖くない!GritQLでBiomeプラグインを作ろうよ
pal4de
1
150
ecspressoの設計思想に至る道 / sekkeinight2025
fujiwara3
12
2.2k
テキストからの実世界知能の実現に向けて
sumoai
0
110
AI によるドキュメント処理を加速するためのOCR 結果の永続化と再利用戦略
tomoaki25
0
230
ML Pipelineの開発と運用を OpenTelemetryで繋ぐ @ OpenTelemetry Meetup 2025-07
getty708
0
330
公開初日に個人環境で試した Gemini CLI 体験記など / Gemini CLI実験レポート
you
PRO
3
1.2k
東京海上日動におけるセキュアな開発プロセスの取り組み
miyabit
0
210
【Λ(らむだ)】最近のアプデ情報 / RPALT20250729
lambda
0
170
Featured
See All Featured
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3.1k
Build The Right Thing And Hit Your Dates
maggiecrowley
37
2.8k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Unsuck your backbone
ammeep
671
58k
[RailsConf 2023] Rails as a piece of cake
palkan
56
5.7k
The World Runs on Bad Software
bkeepers
PRO
70
11k
Bootstrapping a Software Product
garrettdimon
PRO
307
110k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
126
53k
A designer walks into a library…
pauljervisheath
207
24k
Speed Design
sergeychernyshev
32
1k
Mobile First: as difficult as doing things right
swwweet
223
9.7k
GitHub's CSS Performance
jonrohan
1031
460k
Transcript
Casual Log Collection and Querying with fluent-plugin-riak @kuenishi from @basho
2013/6/1 RubyKaigi
Who the hell are you? •UENISHI, Kota (@kuenishi) •Basho Japan
KK •devoted to Distributed Systems for ~6 yrs •msgpack-erlang, Jubatus
Casual Log Collection •Aggregate Every Log with Fluentd •Put Them
all into <Some Storage You Like> •Ask your Query to <Some Storage You Like>
Whole Sketch
fluentd: casual log collector http://www.flickr.com/photos/markchadwick/8757802771/ http://www.flickr.com/photos/usdagov/5681152426/ before: logs are scattered
all over the servers in chaos after: all logs flows cleanly via fluentd in order
Nagios MongoDB Hadoop Alerting Amazon S3 Analysis Archiving MySQL Apache
Frontend Access logs syslogd App logs System logs Backend Databases
Nagios MongoDB Hadoop Alerting Amazon S3 Analysis Archiving MySQL Apache
Frontend Access logs syslogd App logs System logs Backend Databases filter / buffer / routing
Nagios MongoDB Hadoop Alerting Amazon S3 Analysis Archiving MySQL Apache
Frontend Access logs syslogd App logs System logs Backend Databases filter / buffer / routing Riak
what’s ? •Distributed Key-Value Store •Focused on •Availability •Scalability •Easy
Operation, ҆ (Sleep)
when Riak? •Hadoop is too much •MongoDB is too small
•Document DB aspect of Riak •put them all into Riak
Not Only KVS •Aspect of Document Database •MapReduce in JavaScript
/ Erlang
Buy it if interested
fluent-plugin-riak JSON
fluent.conf <match apache.**> type riak # define the cluster via
pb ports nodes 192.168.0.1:8087 192.168.0.2:8087 </match>
log everything as JSON { "host":"103.5.142.5", "user":"-", "method":"PUT", "path":"/buckets/moriyoshi/object/riaklogo.png", "code":"200",
"size":"0", "referer":"", "agent":"", "time":"2013-05-27T05:42:09Z", "tag":"riak.cluster2" }, ...
How to Query
Ruby Cluent for Querying irb> q = client.bucket(‘fluentlog’) irb> q
= q.map(“function(v){ return [v]; }”).reduce(“function(values){ return values; }“, :keep => false) irb> r = q.run()
Debug distributed JS http://www.flickr.com/photos/heatsink/110859301/
Any Other Rubyish way? http://www.flickr.com/photos/snazzyshot/5366645175/
ripple
github.com/basho/ripple •a rich Ruby toolkit for Riak, consists of •Riak
client •Riak-sessions •Ripple
http://www.flickr.com/photos/toco/2612055052/
None
None
Mohair: Not Only NoSQL http://www.flickr.com/photos/frank-wouters/2464743512/
JSON { "host":"103.5.142.5", "user":"-", "method":"PUT", "path":"/buckets/moriyoshi/object/riaklogo.png", "code":"200", "size":"0", "referer":"", "agent":"",
"time":"2013-05-27T05:42:09Z", "tag":"riak.cluster2" }, ...
SQL create table apachelogs { host varchar(16), user varchar(256), method
varchar(5), path varchar(1024), code integer, size integer, referer text, agent varchar(1024), time timestamp, tag varchar(1024) }
“Mohair” for Querying > select * from fluentlog \ where
method = “GET” group by host
Converting SQL to MapReduce •SQL -(parslet)-> JS -> Riak mapred
•where sentence is at Map •group by, count(-) is at Reduce
Chef’s Capricious Roadmap •Secondary Index Support •Query Optimization •types: timestamp,
float •nested columns •insert / delete
check it out! github: basho/riak kuenishi/fluent-plugin-riak kuenishi/mohair (kuenishi/fluent-logger-erlang)
Conclusion •NoSQL is not NoSQL any more •put’em all into
Riak via Fluentd •Query via SQL with Mohair •waiting for pull requests
Questions? •
[email protected]
•Riak Meetup (7/10) •Riak SCR (twice in a
month) •ιϑτΣΞσβΠϯ7݄߸(nginx/riak) •σʔλϕʔεΤϯδχΞཆಡຊ