Slide 1

Slide 1 text

Casual Log Collection and Querying with fluent-plugin-riak @kuenishi from @basho 2013/6/1 RubyKaigi

Slide 2

Slide 2 text

Who the hell are you? •UENISHI, Kota (@kuenishi) •Basho Japan KK •devoted to Distributed Systems for ~6 yrs •msgpack-erlang, Jubatus

Slide 3

Slide 3 text

Casual Log Collection •Aggregate Every Log with Fluentd •Put Them all into Riak •Ask your Query to Riak

Slide 4

Slide 4 text

Whole Sketch

Slide 5

Slide 5 text

fluentd: casual log collector http://www.flickr.com/photos/markchadwick/8757802771/ http://www.flickr.com/photos/usdagov/5681152426/ •before: logs are scattered all over the servers, in chaos •after: all logs flow cleanly through fluentd, in order

Slide 6

Slide 6 text

[Diagram] Sources: Access logs (Apache), App logs (Frontend, Backend), System logs (syslogd), Databases. Destinations: Alerting (Nagios), Analysis (MongoDB, MySQL, Hadoop), Archiving (Amazon S3).

Slide 7

Slide 7 text

[Diagram] Sources: Access logs (Apache), App logs (Frontend, Backend), System logs (syslogd), Databases. Middle: filter / buffer / routing. Destinations: Alerting (Nagios), Analysis (MongoDB, MySQL, Hadoop), Archiving (Amazon S3).

Slide 8

Slide 8 text

[Diagram] Sources: Access logs (Apache), App logs (Frontend, Backend), System logs (syslogd), Databases. Middle: filter / buffer / routing. Destinations: Alerting (Nagios), Analysis (MongoDB, MySQL, Hadoop), Archiving (Amazon S3), Riak.

Slide 9

Slide 9 text

what’s Riak? •Distributed Key-Value Store •Focused on •Availability •Scalability •Easy Operation (sound sleep)

Slide 10

Slide 10 text

when Riak? •Hadoop is too much •MongoDB is too small •Document DB aspect of Riak •put them all into Riak

Slide 11

Slide 11 text

Not Only KVS •Aspect of Document Database •MapReduce in JavaScript / Erlang

Slide 12

Slide 12 text

Buy it if interested

Slide 13

Slide 13 text

fluent-plugin-riak JSON

Slide 14

Slide 14 text

fluent.conf
  type riak
  # define the cluster via pb ports
  nodes 192.168.0.1:8087 192.168.0.2:8087

Slide 15

Slide 15 text

log everything as JSON { "host":"103.5.142.5", "user":"-", "method":"PUT", "path":"/buckets/moriyoshi/object/riaklogo.png", "code":"200", "size":"0", "referer":"", "agent":"", "time":"2013-05-27T05:42:09Z", "tag":"riak.cluster2" }, ...
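
A minimal sketch of how a raw access log line could become the flat JSON record above. The field names come from the slide; the regular expression, the helper name `apache_line_to_record`, and the sample line are assumptions for illustration and are not how fluentd or fluent-plugin-riak is implemented.

```ruby
require "json"
require "time"

# Assumed pattern for an Apache combined-format line (illustrative only).
APACHE_COMBINED = /\A(?<host>\S+) \S+ (?<user>\S+) \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+) [^"]*" (?<code>\d{3}) (?<size>\S+) "(?<referer>[^"]*)" "(?<agent>[^"]*)"\z/

# Hypothetical helper: parse one log line into a hash shaped like the slide's
# record. Every value stays a string, as in the slide. Returns nil when the
# line does not match the assumed pattern.
def apache_line_to_record(line, tag)
  m = APACHE_COMBINED.match(line.chomp)
  return nil unless m

  # Normalize the Apache timestamp to UTC ISO 8601.
  time = Time.strptime(m[:time], "%d/%b/%Y:%H:%M:%S %z").utc
  {
    "host"    => m[:host],
    "user"    => m[:user],
    "method"  => m[:method],
    "path"    => m[:path],
    "code"    => m[:code],
    "size"    => m[:size],
    "referer" => m[:referer],
    "agent"   => m[:agent],
    "time"    => time.iso8601,
    "tag"     => tag
  }
end

if __FILE__ == $PROGRAM_NAME
  line = '103.5.142.5 - - [27/May/2013:05:42:09 +0000] ' \
         '"PUT /buckets/moriyoshi/object/riaklogo.png HTTP/1.1" 200 0 "" ""'
  puts JSON.generate(apache_line_to_record(line, "riak.cluster2"))
end
```

Keeping each record flat (no nested objects) is what makes the later column-style queries straightforward.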

Slide 16

Slide 16 text

How to Query

Slide 17

Slide 17 text

Ruby Client for Querying
irb> q = client.bucket('fluentlog')
irb> q = q.map("function(v){ return [v]; }").reduce("function(values){ return values; }", :keep => false)
irb> r = q.run()
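
To make the shape of such a query concrete, here is a sketch that only builds the JSON job description Riak's HTTP MapReduce endpoint (`/mapred`) accepts, without contacting a cluster. The helper name `mapred_job` is hypothetical. One assumption differs from the slide: `keep` is set to true on the final phase so that its output would be returned to the caller.

```ruby
require "json"

# Hypothetical helper: describe a two-phase JavaScript MapReduce job over a
# whole bucket as a plain hash. Nothing is sent to Riak here.
def mapred_job(bucket, map_source, reduce_source)
  {
    "inputs" => bucket,
    "query"  => [
      # Intermediate phase: results are passed on, not returned.
      { "map" => { "language" => "javascript", "source" => map_source, "keep" => false } },
      # Final phase: keep the results (assumption, see the note above).
      { "reduce" => { "language" => "javascript", "source" => reduce_source, "keep" => true } }
    ]
  }
end

if __FILE__ == $PROGRAM_NAME
  job = mapred_job("fluentlog",
                   "function(v){ return [v]; }",
                   "function(values){ return values; }")
  puts JSON.pretty_generate(job)
end
```

Using a whole bucket as input makes Riak walk every key in it, which is fine for casual querying but expensive on large buckets.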

Slide 18

Slide 18 text

Debug distributed JS http://www.flickr.com/photos/heatsink/110859301/

Slide 19

Slide 19 text

Any Other Rubyish way? http://www.flickr.com/photos/snazzyshot/5366645175/

Slide 20

Slide 20 text

ripple

Slide 21

Slide 21 text

github.com/basho/ripple •a rich Ruby toolkit for Riak, consisting of •Riak client •Riak-sessions •Ripple

Slide 22

Slide 22 text

http://www.flickr.com/photos/toco/2612055052/

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

Mohair: Not Only NoSQL http://www.flickr.com/photos/frank-wouters/2464743512/

Slide 26

Slide 26 text

JSON { "host":"103.5.142.5", "user":"-", "method":"PUT", "path":"/buckets/moriyoshi/object/riaklogo.png", "code":"200", "size":"0", "referer":"", "agent":"", "time":"2013-05-27T05:42:09Z", "tag":"riak.cluster2" }, ...

Slide 27

Slide 27 text

SQL
create table apachelogs (
  host varchar(16),
  user varchar(256),
  method varchar(5),
  path varchar(1024),
  code integer,
  size integer,
  referer text,
  agent varchar(1024),
  time timestamp,
  tag varchar(1024)
)
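
The JSON record stores every value as a string, while the schema above declares `code` and `size` as integers and `time` as a timestamp. A minimal sketch of that coercion follows; the `COLUMN_TYPES` table and the `coerce_row` helper are hypothetical names, not part of Mohair.

```ruby
require "time"

# Assumed mapping from column name to a coercion, mirroring the schema's
# non-string columns. Columns not listed here stay strings.
COLUMN_TYPES = {
  "code" => ->(v) { Integer(v, 10) },
  "size" => ->(v) { Integer(v, 10) },
  "time" => ->(v) { Time.iso8601(v) }
}.freeze

# Hypothetical helper: return a copy of the record with typed values.
def coerce_row(record)
  record.each_with_object({}) do |(column, value), row|
    caster = COLUMN_TYPES[column]
    row[column] = caster ? caster.call(value) : value
  end
end

if __FILE__ == $PROGRAM_NAME
  row = coerce_row("host" => "103.5.142.5", "code" => "200", "size" => "0",
                   "time" => "2013-05-27T05:42:09Z")
  p row
end
```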

Slide 28

Slide 28 text

"Mohair" for Querying
> select * from fluentlog \
  where method = "GET" group by host

Slide 29

Slide 29 text

Converting SQL to MapReduce •SQL -(parslet)-> JS -> Riak mapred •the where clause runs in the Map phase •group by and count(-) run in the Reduce phase
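
The split between the two phases can be sketched in plain Ruby over an in-memory array. This is a local simulation under stated assumptions, not Mohair's generated JavaScript: the function names are hypothetical, and a real Riak reduce function may be re-applied to its own output, which this sketch ignores. It corresponds roughly to counting rows per host where method = "GET".

```ruby
# Map phase: apply the where predicate to one record and emit zero or one
# result, the way a Riak map function returns a list.
def map_phase(record, column, value)
  record[column] == value ? [record] : []
end

# Reduce phase: group the surviving records by a column and count each group.
def reduce_phase(records, group_column)
  records.each_with_object(Hash.new(0)) do |record, counts|
    counts[record[group_column]] += 1
  end
end

# Run both phases in order over a list of records.
def run_query(records, where_column, where_value, group_column)
  mapped = records.flat_map { |r| map_phase(r, where_column, where_value) }
  reduce_phase(mapped, group_column)
end

if __FILE__ == $PROGRAM_NAME
  logs = [
    { "host" => "10.0.0.1", "method" => "GET" },
    { "host" => "10.0.0.1", "method" => "PUT" },
    { "host" => "10.0.0.2", "method" => "GET" }
  ]
  p run_query(logs, "method", "GET", "host")
end
```

Filtering in the map phase keeps the data that reaches the reduce phase small, which is why the where clause belongs there.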

Slide 30

Slide 30 text

Chef’s Capricious Roadmap •Secondary Index Support •Query Optimization •types: timestamp, float •nested columns •insert / delete

Slide 31

Slide 31 text

check it out! github: basho/riak kuenishi/fluent-plugin-riak kuenishi/mohair (kuenishi/fluent-logger-erlang)

Slide 32

Slide 32 text

Conclusion •NoSQL is not NoSQL any more •put’em all into Riak via Fluentd •Query via SQL with Mohair •waiting for pull requests

Slide 33

Slide 33 text

Questions? •[email protected] •Riak Meetup (7/10) •Riak SCR (twice a month) •Software Design magazine, July issue (nginx/riak) •Database Engineer Training Reader (データベースエンジニア養成読本)