
Norikra to realtime log analytics

Norikra meetup #2
2015-06-03

Harukasan

June 03, 2015

Transcript

  1. Harukasan / MICHII Shunsuke
     - Infrastructure engineer at pixiv since 2012
     - Develops content distribution / converter / storage systems
     - Distributes up to 16 Gbps of image traffic
     - Log collection/analytics platform
     - Elasticsearch/Kibana
     - Fluentd
  2. Agenda
     - Log ecosystem
     - Batch processing vs. stream processing
     - Getting started with Norikra
     - Norikra deployment
  3. [Diagram] Applications, databases, and storage services send logs to log storage (HDFS, RDB / other storage) via rsync, syslog, ssh, custom scripts, …
  4. [Diagram] Fluentd collects logs from applications, databases, and storage services and delivers them to log storage: HDFS, RDB / other storage, Google BigQuery, Elasticsearch, MongoDB, Treasure Data.
  5. [Diagram] Same components as slide 4.
  6. [Diagram] Slide 4 plus a visualisation / analytics layer: Kibana, Spreadsheet, HRForecast, Tableau, GrowthForecast, custom scripts.
  7. [Diagram] Same as slide 6, plus GAS.
  8. [Diagram] Same as slide 6, plus Shib.
  9. [Diagram] Same as slide 6.
  10. [Diagram, labeled "pixiv"] Sources: applications, database, storage. Collector: Fluentd. Log storage: RDB / other storage, Google BigQuery, Elasticsearch, MongoDB. Visualisation / analytics: Kibana, HRForecast, Tableau, custom scripts. Also shown: Jenkins.
  11. Log ecosystem with Fluentd
      - Every log can be streamed to any type of storage or queue
      - Every log is converted to structured data
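As a rough illustration of what "converted to structured data" means, the sketch below parses one raw access-log line into a record with named fields. The log format, the sample line, and the field names are assumptions made for this example; they are not taken from the talk, and Fluentd's own parsers are configured differently.

```python
import json
import re

# Hypothetical Apache-style access log line (assumed format, for illustration only).
LINE = '192.0.2.1 - - [03/Jun/2015:12:34:56 +0900] "GET /index.html HTTP/1.1" 200 1234'

# Named groups become the keys of the structured record.
PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_line(line):
    """Turn one raw log line into a dict, or return None if it does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    record = m.groupdict()
    record["size"] = int(record["size"])  # numeric field; status stays a string
    return record


record = parse_line(LINE)
print(json.dumps(record, sort_keys=True))
```

Once every log line is a record like this, any downstream storage or query engine can work on fields such as `status` instead of re-parsing text.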
  12. Batch processing
      Daily / Weekly / Monthly Reporting
      - page views
      - conversion count
      - num. of events

      [Sample report]

          Daily Report
          ================
          - Updated 2015/06/03

          ▪ Page views
          2015/05/30 (Wed) 888888 PV
          2015/05/30 (Thu) 888888 PV
          2015/05/30 (Fri) 888888 PV
          2015/05/30 (Sat) 888888 PV
          2015/05/31 (Sun) 888888 PV ★ record high
          2015/06/01 (Mon) 888888 PV
          2015/06/02 (Tue) 888888 PV
          2015/06/03 (Wed) 888888 PV

          ▪ New registrations
          2015/05/30 (Wed) 8888 users
  13. Offline Analysis
      - Excel is awesome
      - Analyze small data on laptops
      - Many techniques and much know-how available in Japan
  14. Sometimes, batch processes are too heavy
      Per-minute report
      - to catch access bursts
      - to see changes within the day
      Per-minute notification
      - to report errors
      - to detect attacks
  15. Stream processing for realtime analytics
      - Processes small data (in most cases, in memory)
      - High throughput
      - Low latency
      [Diagram] A data stream divided into 1-minute time windows.
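The time-window idea can be sketched in a few lines. This is a minimal illustration, not how Norikra is implemented: it assumes events arrive in timestamp order, uses plain numbers of seconds as timestamps, and the function name `tumbling_counts` is made up for the example.

```python
def tumbling_counts(timestamps, window=60):
    """Yield (window_start, count) each time a fixed-size window closes.

    Only the counter for the current window is kept in memory, which is why
    this style of processing stays small and fast.
    Assumes timestamps (in seconds) arrive in non-decreasing order.
    """
    current_start = None
    count = 0
    for ts in timestamps:
        start = int(ts // window) * window
        if current_start is None:
            current_start = start
        if start != current_start:
            # The previous window is complete: emit its result and reset.
            yield current_start, count
            current_start, count = start, 0
        count += 1
    if current_start is not None:
        # Flush the last (possibly partial) window.
        yield current_start, count


events = [0, 12, 59, 60, 61, 125]
print(list(tumbling_counts(events)))  # [(0, 3), (60, 2), (120, 1)]
```

Unlike a real stream processor, this sketch emits nothing for windows that contain no events and closes a window only when a later event arrives.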
  16. Realtime Aggregation

          SELECT
            COUNT(1, status REGEXP '^2..$') AS count_2xx,
            COUNT(1, status REGEXP '^3..$') AS count_3xx,
            COUNT(1, status REGEXP '^4..$') AS count_4xx,
            COUNT(1, status REGEXP '^5..$') AS count_5xx
          FROM access_log.win:time_batch(1 min)
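For readers unfamiliar with the filtered `COUNT` form, the following Python sketch computes the same four counters for the events of a single window. It only mirrors the query's logic; the sample statuses and the helper name `count_status_classes` are assumptions for illustration.

```python
import re

# One regular expression per output column, matching the query's REGEXP filters.
STATUS_CLASSES = {
    "count_2xx": re.compile(r"^2..$"),
    "count_3xx": re.compile(r"^3..$"),
    "count_4xx": re.compile(r"^4..$"),
    "count_5xx": re.compile(r"^5..$"),
}


def count_status_classes(statuses):
    """Count how many statuses in one window fall into each HTTP status class."""
    counts = {name: 0 for name in STATUS_CLASSES}
    for status in statuses:
        for name, pattern in STATUS_CLASSES.items():
            if pattern.match(str(status)):
                counts[name] += 1
    return counts


# Hypothetical statuses seen during one 1-minute window.
window = ["200", "200", "304", "404", "500", "503"]
print(count_status_classes(window))
```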
  17. Output from fluent-plugin-norikra

          <source>
            type forward
          </source>

          <match log.**>
            # output to Norikra
            type norikra
            norikra localhost:26571   # specify norikra host (26571: default port)
            target_map_tag true       # create target with tag
          </match>
  18. Sweep from Norikra

          <source>
            type norikra
            norikra localhost:26571
            <fetch>
              method sweep            # sweep output of query
              target gf               # specify query group
              tag query_name          # use query_name as tag
              tag_prefix norikra.gf   # add tag prefix
              interval 10s
            </fetch>
            …
  19. Sweep from Norikra

            …
            <fetch>
              method sweep                 # sweep output of query
              target idobata               # specify query group
              tag query_name               # use query_name as tag
              tag_prefix norikra.idobata   # add tag prefix
              interval 10s
            </fetch>
            …
  20. Sweep from Norikra

            …
            <fetch>
              method sweep            # sweep output of query
              target es               # specify query group
              tag query_name          # use query_name as tag
              tag_prefix norikra.es   # add tag prefix
              interval 10s
            </fetch>
          </source>
  21. Output to GrowthForecast

          <match norikra.gf.**>
            type growthforecast
            remove_prefix norikra.gf
            name_key_pattern .
            gfapi_url http://localhost:5125/api/
            graph_path norikra/${tag}/${key_name}
          </match>
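To make the routing easier to follow, here is a small Python trace of how a query result would plausibly end up at a GrowthForecast graph URL under the settings above. It reflects my reading of the configuration (`tag_prefix` plus query name, then `remove_prefix`, then `graph_path` substitution) and is not the plugins' actual code; the helper names are hypothetical, and `status_count` / `count_2xx` are the query name and column used elsewhere in this deck.

```python
def fetch_tag(tag_prefix, query_name):
    """Tag of a fetched event: the configured tag_prefix followed by the query name."""
    return f"{tag_prefix}.{query_name}"


def strip_prefix(tag, prefix):
    """Mimic remove_prefix: drop the leading 'prefix.' from the tag if present."""
    head = prefix + "."
    return tag[len(head):] if tag.startswith(head) else tag


def graph_url(gfapi_url, graph_path, tag, key_name):
    """Expand the ${tag} and ${key_name} placeholders and append to the API base URL."""
    path = graph_path.replace("${tag}", tag).replace("${key_name}", key_name)
    return gfapi_url.rstrip("/") + "/" + path


tag = fetch_tag("norikra.gf", "status_count")   # norikra.gf.status_count
short_tag = strip_prefix(tag, "norikra.gf")     # status_count
url = graph_url(
    "http://localhost:5125/api/",
    "norikra/${tag}/${key_name}",
    short_tag,
    "count_2xx",
)
print(url)  # http://localhost:5125/api/norikra/status_count/count_2xx
```

Under this reading, each output column of a query in group `gf` gets its own graph, grouped by query name.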
  22. HTTP Status count

          Name:  status_count
          Group: gf
          Query:
            SELECT
              COUNT(1, status REGEXP '^2..$') AS count_2xx,
              COUNT(1, status REGEXP '^3..$') AS count_3xx,
              COUNT(1, status REGEXP '^4..$') AS count_4xx,
              COUNT(1, status REGEXP '^5..$') AS count_5xx
            FROM access_log.win:time_batch(1 min)
  23. HTTP Status count

          Name:  status_count
          Group: mackerel
          Query:
            SELECT
              COUNT(1, status REGEXP '^2..$') AS count_2xx,
              COUNT(1, status REGEXP '^3..$') AS count_3xx,
              COUNT(1, status REGEXP '^4..$') AS count_4xx,
              COUNT(1, status REGEXP '^5..$') AS count_5xx
            FROM access_log.win:time_batch(1 min)
  24. HTTP Status count

          Name:  notify_error
          Group: idobata
          Query:
            SELECT
              "Notify: over 1000 access" AS message,
              COUNT(*) AS count
            FROM access_log.win:time_batch(1 min)
            HAVING COUNT(*) > 1000
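The intent of this query, emitting a notification only when a 1-minute window holds more than 1000 accesses, can be sketched as follows. The function name and the returned record shape are assumptions for illustration; the key point is that the threshold applies to the per-window aggregate, not to individual events.

```python
THRESHOLD = 1000  # accesses per 1-minute window, as in the query


def notify_if_over(window_count, threshold=THRESHOLD):
    """Return a notification record if the window's access count exceeds the threshold.

    Returns None otherwise, i.e. no event is emitted for quiet windows.
    """
    if window_count > threshold:
        return {"message": "Notify: over 1000 access", "count": window_count}
    return None


print(notify_if_over(998))    # None
print(notify_if_over(1500))   # {'message': 'Notify: over 1000 access', 'count': 1500}
```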
  25. Hardware structure
      - Norikra needs a lot of memory (min. 8 GB)
      - Not many CPU cores are required
      - Norikra is still a SPOF
      - Norikra can't share query stats between active/standby
  26. Build environment
      - Install JVM 1.7 via apt
      - Build JRuby with xbuild

          xbuild/ruby-install jruby-1.7.18 ~/local/jruby-1.7.18/
  27. Daemonize with Supervisord

          [program:norikra]
          command=/home/norikra/local/jruby-1.7.18/bin/norikra start \
            --logdir=/var/log/norikra \
            -s /home/norikra/norikra/norikra-stat.json \
            --ui-context-path=/norikra \
            -Xmx2048m …
          user=norikra
          directory=/home/norikra/norikra
          autostart=true
          autorestart=true
          environment=LANG=C