My talk at RubyKaigi 2013 http://rubykaigi.org/2013/talk/S70
Casual Log Collection and Querying with fluent-plugin-riak
@kuenishi from @basho
2013/6/1 RubyKaigi
Who the hell are you?
• UENISHI, Kota (@kuenishi)
• Basho Japan KK
• devoted to Distributed Systems for ~6 yrs
• msgpack-erlang, Jubatus
Casual Log Collection
• Aggregate Every Log with Fluentd
• Put Them all into Riak
• Ask your Query to Riak
Whole Sketch
fluentd: casual log collector
before: logs are scattered all over the servers in chaos
after: all logs flow cleanly via fluentd, in order
http://www.flickr.com/photos/markchadwick/8757802771/ http://www.flickr.com/photos/usdagov/5681152426/
[Diagram labels] Access logs, App logs, System logs, Databases; Apache, Frontend, Backend, syslogd; Alerting, Analysis, Archiving; Nagios, MongoDB, MySQL, Hadoop, Amazon S3
[Diagram labels] same as above, plus: filter / buffer / routing
[Diagram labels] same as above, plus: Riak
what's Riak?
• Distributed Key-Value Store
• Focused on
  • Availability
  • Scalability
  • Easy Operation (Sleep)
when Riak?
• Hadoop is too much
• MongoDB is too small
• Document DB aspect of Riak
• put them all into Riak
Not Only KVS
• Aspect of Document Database
• MapReduce in JavaScript / Erlang
Buy it if interested
fluent-plugin-riak
JSON
fluent.conf
type riak
# define the cluster via pb ports
nodes 192.168.0.1:8087 192.168.0.2:8087
log everything as JSON
{"host":"103.5.142.5","user":"-","method":"PUT","path":"/buckets/moriyoshi/object/riaklogo.png","code":"200","size":"0","referer":"","agent":"","time":"2013-05-27T05:42:09Z","tag":"riak.cluster2"},
...
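A record like the one on this slide can be read back with Ruby's standard json library. The following is only a minimal sketch that parses the sample record locally; it does not talk to Riak or Fluentd. Note that every value in the record, including "code" and "size", arrives as a string, so numeric use needs an explicit conversion.

```ruby
require 'json'
require 'time'

# One record copied from the slide; nothing here contacts Riak or Fluentd.
line = '{"host":"103.5.142.5","user":"-","method":"PUT",' \
       '"path":"/buckets/moriyoshi/object/riaklogo.png","code":"200",' \
       '"size":"0","referer":"","agent":"","time":"2013-05-27T05:42:09Z",' \
       '"tag":"riak.cluster2"}'

record = JSON.parse(line)

# Values are JSON strings, so convert them before numeric or time use.
code      = Integer(record['code'])
logged_at = Time.iso8601(record['time'])

puts "#{record['method']} #{record['path']} -> #{code} (#{logged_at.year})"
```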
How to Query
Ruby Client for Querying
irb> q = client.bucket('fluentlog')
irb> q = q.map("function(v){ return [v]; }").reduce("function(values){ return values; }", :keep => false)
irb> r = q.run()
Debug distributed JS
http://www.flickr.com/photos/heatsink/110859301/
Any Other Rubyish way?
http://www.flickr.com/photos/snazzyshot/5366645175/
ripple
github.com/basho/ripple
• a rich Ruby toolkit for Riak, consisting of
  • Riak client
  • Riak-sessions
  • Ripple
http://www.flickr.com/photos/toco/2612055052/
Mohair: Not Only NoSQL
http://www.flickr.com/photos/frank-wouters/2464743512/
JSON
{"host":"103.5.142.5","user":"-","method":"PUT","path":"/buckets/moriyoshi/object/riaklogo.png","code":"200","size":"0","referer":"","agent":"","time":"2013-05-27T05:42:09Z","tag":"riak.cluster2"},
...
SQL
create table apachelogs {
  host varchar(16),
  user varchar(256),
  method varchar(5),
  path varchar(1024),
  code integer,
  size integer,
  referer text,
  agent varchar(1024),
  time timestamp,
  tag varchar(1024)
}
"Mohair" for Querying
> select * from fluentlog \
  where method = "GET" group by host
Converting SQL to MapReduce
• SQL -(parslet)-> JS -> Riak mapred
• the where clause runs in the Map phase
• group by and count(-) run in the Reduce phase
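The split described on this slide can be illustrated locally in plain Ruby. This is a rough sketch of the idea only, not Mohair's generated JavaScript and not the Riak API: the sample records, `map_phase`, and `reduce_phase` are hypothetical names made up for illustration. The where predicate is applied per record in the map step, and grouping plus counting happens in the reduce step.

```ruby
# Hypothetical sample records; only "host" and "method" matter here.
LOGS = [
  { 'host' => '10.0.0.1', 'method' => 'GET' },
  { 'host' => '10.0.0.2', 'method' => 'PUT' },
  { 'host' => '10.0.0.1', 'method' => 'GET' },
  { 'host' => '10.0.0.3', 'method' => 'GET' }
].freeze

# Map step: apply the WHERE predicate to a single record,
# emitting the record if it matches and nothing otherwise.
def map_phase(record)
  record['method'] == 'GET' ? [record] : []
end

# Reduce step: GROUP BY host and count the rows in each group.
def reduce_phase(values)
  values.group_by { |r| r['host'] }.transform_values(&:size)
end

mapped = LOGS.flat_map { |r| map_phase(r) }
counts = reduce_phase(mapped)
p counts
```

In a real cluster the reduce function may be applied more than once to partial results, which this single-pass sketch does not model.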
Chef's Capricious Roadmap
• Secondary Index Support
• Query Optimization
• types: timestamp, float
• nested columns
• insert / delete
check it out!
github:
• basho/riak
• kuenishi/fluent-plugin-riak
• kuenishi/mohair
• (kuenishi/fluent-logger-erlang)
Conclusion
• NoSQL is not NoSQL any more
• put'em all into Riak via Fluentd
• Query via SQL with Mohair
• waiting for pull requests
Questions?
• [email protected]
• Riak Meetup (7/10)
• Riak SCR (twice a month)
• Software Design magazine, July issue (nginx/riak)
• the book "データベースエンジニア養成読本" (Database Engineer Training Reader)