Casual Log Collection and Querying with fluent-plugin-riak

My talk at RubyKaigi 2013 http://rubykaigi.org/2013/talk/S70

UENISHI Kota

June 01, 2013

Transcript

  1. Casual Log Collection and Querying with fluent-plugin-riak
    @kuenishi from @basho
    2013/6/1 RubyKaigi

  2. Who the hell are you?
    •UENISHI, Kota (@kuenishi)
    •Basho Japan KK
    •devoted to Distributed Systems for ~6 yrs
    •msgpack-erlang, Jubatus

  3. Casual Log Collection
    •Aggregate Every Log with Fluentd
    •Put them all into Riak
    •Send your queries to Riak

  4. Whole Sketch

  5. fluentd: casual log collector
    http://www.flickr.com/photos/markchadwick/8757802771/ http://www.flickr.com/photos/usdagov/5681152426/
    before: logs are scattered in chaos across all the servers
    after: all logs flow cleanly, in order, via fluentd

  6. [Diagram: log sources and destinations]
    Sources: Access logs (Apache), App logs (Frontend, Backend), System logs (syslogd), Databases
    Destinations: Alerting (Nagios), Analysis (MongoDB, MySQL, Hadoop), Archiving (Amazon S3)

  7. [Diagram: same sources and destinations as slide 6]
    Joined in the middle by: filter / buffer / routing

  8. [Diagram: same sources and destinations as slide 6]
    Joined in the middle by: filter / buffer / routing
    Added: Riak

  9. What's Riak?
    •Distributed Key-Value Store
    •Focused on
      •Availability
      •Scalability
      •Easy Operation, sound sleep (安眠)

  10. when Riak?
    •Hadoop is too much
    •MongoDB is too small
    •Document DB aspect of Riak
    •put them all into Riak

  11. Not Only KVS
    •Aspect of Document Database
    •MapReduce in JavaScript / Erlang

  12. Buy it if interested

  13. fluent-plugin-riak
    JSON

  14. fluent.conf
    # tag pattern assumed from the "riak.cluster2" tag in the sample record
    <match riak.**>
      type riak
      # define the cluster via pb ports
      nodes 192.168.0.1:8087 192.168.0.2:8087
    </match>
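
The `nodes` value above is a space-separated list of `host:port` pairs, each pointing at a Protocol Buffers port. As a minimal sketch of how such a value could be split into per-node settings: `parse_nodes` and the `:host` / `:pb_port` key names are hypothetical, chosen for illustration, and are not taken from the plugin's source.

```ruby
# Split a "nodes" value into one hash per cluster node.
# parse_nodes and the :host / :pb_port keys are illustrative assumptions.
def parse_nodes(value)
  value.split.map do |node|
    host, port = node.split(":", 2)
    { host: host, pb_port: Integer(port, 10) }
  end
end

NODES = parse_nodes("192.168.0.1:8087 192.168.0.2:8087")
NODES.each { |node| puts "#{node[:host]} -> pb port #{node[:pb_port]}" }
```

Listing more than one node lets a client fall back to another cluster member when one is unreachable, which fits Riak's focus on availability.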

  15. log everything as JSON
    {
    "host":"103.5.142.5",
    "user":"-",
    "method":"PUT",
    "path":"/buckets/moriyoshi/object/riaklogo.png",
    "code":"200",
    "size":"0",
    "referer":"",
    "agent":"",
    "time":"2013-05-27T05:42:09Z",
    "tag":"riak.cluster2"
    },
    ...
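
The deck does not show the raw log line behind this record. Assuming it came from an access log in the common combined layout, the following sketch shows how one line could be turned into a record with the same field names. `LINE_FORMAT`, `to_record`, and the sample line are hypothetical; in the setup described here that parsing is done by fluentd, not by hand-written Ruby.

```ruby
require "json"
require "time"

# Assumed combined-log-style layout; the raw line is not shown in the deck.
LINE_FORMAT = /\A(?<host>\S+) \S+ (?<user>\S+) \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+) [^"]*" (?<code>\d+) (?<size>\S+) "(?<referer>[^"]*)" "(?<agent>[^"]*)"\z/

# Turn one access-log line into a flat record of string fields plus a tag.
# Returns nil when the line does not match the assumed layout.
def to_record(line, tag)
  m = LINE_FORMAT.match(line)
  return nil unless m

  record = m.names.zip(m.captures).to_h
  # Normalize the timestamp to ISO 8601 in UTC, as in the sample record.
  record["time"] = Time.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z").utc.iso8601
  record.merge("tag" => tag)
end

SAMPLE_LINE = '103.5.142.5 - - [27/May/2013:05:42:09 +0000] ' \
              '"PUT /buckets/moriyoshi/object/riaklogo.png HTTP/1.1" 200 0 "" ""'
RECORD = to_record(SAMPLE_LINE, "riak.cluster2")
puts JSON.generate(RECORD)
```

Every value stays a string, which matches the record above where `code` and `size` are quoted.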

  16. How to Query

  17. Ruby Client for Querying
    irb> q = client.bucket('fluentlog')
    irb> q = q.map("function(v){ return [v]; }").
               reduce("function(values){ return values; }", :keep => false)
    irb> r = q.run()

  18. Debug distributed JS
    http://www.flickr.com/photos/heatsink/110859301/

  19. Any Other Rubyish way?
    http://www.flickr.com/photos/snazzyshot/5366645175/

  20. ripple

  21. github.com/basho/ripple
    •a rich Ruby toolkit for Riak, consisting of
      •Riak client
      •Riak-sessions
      •Ripple

  22. http://www.flickr.com/photos/toco/2612055052/

  25. Mohair: Not Only NoSQL
    http://www.flickr.com/photos/frank-wouters/2464743512/

  26. JSON
    {
    "host":"103.5.142.5",
    "user":"-",
    "method":"PUT",
    "path":"/buckets/moriyoshi/object/riaklogo.png",
    "code":"200",
    "size":"0",
    "referer":"",
    "agent":"",
    "time":"2013-05-27T05:42:09Z",
    "tag":"riak.cluster2"
    },
    ...

  27. SQL
    create table apachelogs (
      host varchar(16),
      user varchar(256),
      method varchar(5),
      path varchar(1024),
      code integer,
      size integer,
      referer text,
      agent varchar(1024),
      time timestamp,
      tag varchar(1024)
    );
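
The JSON record stores every field as a string, while the table definition above declares `code` and `size` as integers and `time` as a timestamp. A minimal sketch of that mapping follows; `COLUMN_TYPES` and `cast_row` are hypothetical names, and this is an illustration of the idea, not Mohair's implementation.

```ruby
require "json"
require "time"

# Typed columns copied from the table definition; all other columns stay strings.
COLUMN_TYPES = { "code" => :integer, "size" => :integer, "time" => :timestamp }.freeze

# Parse one stored JSON value and cast each field to its declared column type.
def cast_row(json)
  JSON.parse(json).each_with_object({}) do |(column, value), row|
    row[column] = case COLUMN_TYPES[column]
                  when :integer   then Integer(value, 10)
                  when :timestamp then Time.iso8601(value)
                  else value
                  end
  end
end

ROW = cast_row('{"host":"103.5.142.5","method":"PUT","code":"200","size":"0","time":"2013-05-27T05:42:09Z"}')
p ROW
```

Casting at read time keeps the stored documents schema-free, so the same bucket can hold records of different shapes.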

  28. "Mohair" for Querying
    > select * from fluentlog \
    where method = "GET" group by host

  29. Converting SQL to MapReduce
    •SQL -(parslet)-> JS -> Riak mapred
    •the where clause runs in the Map phase
    •group by and count(-) run in the Reduce phase
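
The split described above can be sketched in plain Ruby over in-memory data: the filter is applied once per stored object, and grouping and counting happen on whatever survives. `STORED`, `map_phase`, and `reduce_phase` are hypothetical, and the sample rows are made up for illustration. The real pipeline generates JavaScript that runs inside Riak, where a reduce function may also be invoked more than once on partial results; this single-pass sketch ignores that.

```ruby
require "json"

# Hypothetical stored values, shaped like the JSON log records in this deck.
STORED = [
  { "host" => "103.5.142.5", "method" => "GET" },
  { "host" => "103.5.142.5", "method" => "PUT" },
  { "host" => "103.5.142.6", "method" => "GET" },
  { "host" => "103.5.142.5", "method" => "GET" }
].map { |record| JSON.generate(record) }

# Map phase: runs once per stored object and applies the where predicate.
def map_phase(raw, column, expected)
  row = JSON.parse(raw)
  row[column] == expected ? [row] : []
end

# Reduce phase: groups the surviving rows and counts each group.
def reduce_phase(rows, group_column)
  rows.group_by { |row| row[group_column] }.transform_values(&:size)
end

# Roughly: count rows per host where method = "GET"
mapped = STORED.flat_map { |raw| map_phase(raw, "method", "GET") }
COUNTS = reduce_phase(mapped, "host")
p COUNTS
```

Filtering in the map phase means rows that fail the predicate are dropped before anything is passed on to the reduce phase.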

  30. Chef’s Capricious Roadmap
    •Secondary Index Support
    •Query Optimization
    •types: timestamp, float
    •nested columns
    •insert / delete

  31. check it out!
    github:
    basho/riak
    kuenishi/fluent-plugin-riak
    kuenishi/mohair
    (kuenishi/fluent-logger-erlang)

  32. Conclusion
    •NoSQL is not NoSQL any more
    •put’em all into Riak via Fluentd
    •Query via SQL with Mohair
    •waiting for pull requests

  33. Questions?
    [email protected]
    •Riak Meetup (7/10)
    •Riak SCR (twice a month)
    •Software Design magazine, July issue (nginx/riak)
    •データベースエンジニア養成読本 (Database Engineer Training Guide)
