Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Casual Log Collection and Querying with fluent-...
Search
UENISHI Kota
June 01, 2013
Technology
3
460
Casual Log Collection and Querying with fluent-plugin-riak
My talk at RubyKaigi 2013
http://rubykaigi.org/2013/talk/S70
UENISHI Kota
June 01, 2013
Tweet
Share
More Decks by UENISHI Kota
See All by UENISHI Kota
Storage Systems in Preferred Networks
kuenishi
0
50
Metadata Management in Distributed File Systems
kuenishi
2
520
Behind The Scenes: Cloud Native Storage System for AI
kuenishi
2
420
Apache Ozone behind Simulation and AI Industries
kuenishi
0
400
Distributed Deep Learning with Chainer and Hadoop
kuenishi
3
1.2k
A Few Ways to Accelerate Deep Learning
kuenishi
0
1.1k
Introducing Retz
kuenishi
5
1.2k
Introducing Retz and how to develop practical frameworks
kuenishi
3
750
Formalization and Proof of Distributed Systems (ja)
kuenishi
10
6.4k
Other Decks in Technology
See All in Technology
us-east-1 の障害が 起きると なぜ ソワソワするのか
miu_crescent
PRO
2
730
設計は最強のプロンプト - AI時代に武器にすべきスキルとは?-
kenichirokimura
1
340
ユーザーストーリー x AI / User Stories x AI
oomatomo
0
160
AI-ready"のための"データ基盤 〜 LLMOpsで事業貢献するための基盤づくり
ismk
0
150
仕様駆動 x Codex で 超効率開発
ismk
2
1.3k
Snowflakeとdbtで加速する 「TVCMデータで価値を生む組織」への進化論 / Evolving TVCM Data Value in TELECY with Snowflake and dbt
carta_engineering
2
230
ソフトウェアテストのAI活用_ver1.50
fumisuke
0
290
AWS資格は取ったけどIAMロールを腹落ちできてなかったので、年内に整理してみた
hiro_eng_
0
160
エンタープライズ企業における開発効率化のためのコンテキスト設計とその活用
sergicalsix
1
280
ソフトウェア品質を支える テストとレビュー再考 / 吉澤 智美さん
findy_eventslides
1
950
手を動かしながら学ぶデータモデリング - 論理設計から物理設計まで / Data modeling
soudai
PRO
0
310
The Twin Mandate of Observability
charity
1
1.3k
Featured
See All Featured
GitHub's CSS Performance
jonrohan
1032
470k
Designing for Performance
lara
610
69k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.8k
Balancing Empowerment & Direction
lara
5
740
Become a Pro
speakerdeck
PRO
29
5.6k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
31
2.9k
The MySQL Ecosystem @ GitHub 2015
samlambert
251
13k
Connecting the Dots Between Site Speed, User Experience & Your Business [WebExpo 2025]
tammyeverts
10
660
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
How STYLIGHT went responsive
nonsquared
100
5.9k
Git: the NoSQL Database
bkeepers
PRO
431
66k
A Modern Web Designer's Workflow
chriscoyier
697
190k
Transcript
Casual Log Collection and Querying with fluent-plugin-riak @kuenishi from @basho
2013/6/1 RubyKaigi
Who the hell are you? •UENISHI, Kota (@kuenishi) •Basho Japan
KK •devoted to Distributed Systems for ~6 yrs •msgpack-erlang, Jubatus
Casual Log Collection •Aggregate Every Log with Fluentd •Put Them
all into <Some Storage You Like> •Ask your Query to <Some Storage You Like>
Whole Sketch
fluentd: casual log collector http://www.flickr.com/photos/markchadwick/8757802771/ http://www.flickr.com/photos/usdagov/5681152426/ before: logs are scattered
all over the servers in chaos after: all logs flows cleanly via fluentd in order
Nagios MongoDB Hadoop Alerting Amazon S3 Analysis Archiving MySQL Apache
Frontend Access logs syslogd App logs System logs Backend Databases
Nagios MongoDB Hadoop Alerting Amazon S3 Analysis Archiving MySQL Apache
Frontend Access logs syslogd App logs System logs Backend Databases filter / buffer / routing
Nagios MongoDB Hadoop Alerting Amazon S3 Analysis Archiving MySQL Apache
Frontend Access logs syslogd App logs System logs Backend Databases filter / buffer / routing Riak
what’s ? •Distributed Key-Value Store •Focused on •Availability •Scalability •Easy
Operation, ҆ (Sleep)
when Riak? •Hadoop is too much •MongoDB is too small
•Document DB aspect of Riak •put them all into Riak
Not Only KVS •Aspect of Document Database •MapReduce in JavaScript
/ Erlang
Buy it if interested
fluent-plugin-riak JSON
fluent.conf <match apache.**> type riak # define the cluster via
pb ports nodes 192.168.0.1:8087 192.168.0.2:8087 </match>
log everything as JSON { "host":"103.5.142.5", "user":"-", "method":"PUT", "path":"/buckets/moriyoshi/object/riaklogo.png", "code":"200",
"size":"0", "referer":"", "agent":"", "time":"2013-05-27T05:42:09Z", "tag":"riak.cluster2" }, ...
How to Query
Ruby Cluent for Querying irb> q = client.bucket(‘fluentlog’) irb> q
= q.map(“function(v){ return [v]; }”).reduce(“function(values){ return values; }“, :keep => false) irb> r = q.run()
Debug distributed JS http://www.flickr.com/photos/heatsink/110859301/
Any Other Rubyish way? http://www.flickr.com/photos/snazzyshot/5366645175/
ripple
github.com/basho/ripple •a rich Ruby toolkit for Riak, consists of •Riak
client •Riak-sessions •Ripple
http://www.flickr.com/photos/toco/2612055052/
None
None
Mohair: Not Only NoSQL http://www.flickr.com/photos/frank-wouters/2464743512/
JSON { "host":"103.5.142.5", "user":"-", "method":"PUT", "path":"/buckets/moriyoshi/object/riaklogo.png", "code":"200", "size":"0", "referer":"", "agent":"",
"time":"2013-05-27T05:42:09Z", "tag":"riak.cluster2" }, ...
SQL create table apachelogs { host varchar(16), user varchar(256), method
varchar(5), path varchar(1024), code integer, size integer, referer text, agent varchar(1024), time timestamp, tag varchar(1024) }
“Mohair” for Querying > select * from fluentlog \ where
method = “GET” group by host
Converting SQL to MapReduce •SQL -(parslet)-> JS -> Riak mapred
•where sentence is at Map •group by, count(-) is at Reduce
Chef’s Capricious Roadmap •Secondary Index Support •Query Optimization •types: timestamp,
float •nested columns •insert / delete
check it out! github: basho/riak kuenishi/fluent-plugin-riak kuenishi/mohair (kuenishi/fluent-logger-erlang)
Conclusion •NoSQL is not NoSQL any more •put’em all into
Riak via Fluentd •Query via SQL with Mohair •waiting for pull requests
Questions? •
[email protected]
•Riak Meetup (7/10) •Riak SCR (twice in a
month) •ιϑτΣΞσβΠϯ7݄߸(nginx/riak) •σʔλϕʔεΤϯδχΞཆಡຊ