Introduce log collector in Go into production

Tatsuhiko Kubo
December 06, 2015

Transcript

  1. @cubicdaiya / Tatsuhiko Kubo
    Site Reliability Engineer @ Mercari, Inc.
    • C, Go, Lua, nginx
    • ngx_small_light, ngx_dynamic_upstream, nginx-build, slackboard, cachectl, gaurun, etc.
  2. Agenda
    • Casual introduction of log-collectors in Go
      • fluent-forwarder, fluent-agent-hydra, Heka
    • The case of Mercari
      • Fluentd and fluent-agent-hydra
      • The road to introducing fluent-agent-hydra into the production environment
  3. fluent-agent-hydra
    • https://github.com/fujiwara/fluent-agent-hydra
    • Features
      • in_tail & (in|out)_forward
      • Handles multiple files per process
      • Monitoring API
      • Configuration with TOML
      • Supports LTSV and JSON
  4. Heka
    • https://github.com/mozilla-services/heka
    • Event processing / windowed stream operations
    • Provides a plugin system
      • Users can write plugins in Go or Lua
      • Custom Input, Output, Encoder, Decoder, Filter
    • Configuration with TOML
  5. Fluentd
    • Fluentd is a very flexible and robust log-collector
      • Plugin system
      • Active developers and communities
    • But its performance is not always sufficient
  6. Pascal: Mercari analysis base (Before)
    [Diagram labels: OpenResty (x3); Google BigQuery; Developer; Data Scientist; Analyze by SQL; send events (x3), powered by cookpad/puree-(ios|android); utilize events (x3); in_tail & out_forward]
  7. Pascal: Mercari analysis base (Now)
    [Diagram labels: OpenResty (x3), each with hydra (fluent-agent-hydra); Google BigQuery; Developer; Data Scientist; Analyze by SQL; send events (x3), powered by cookpad/puree-(ios|android); utilize events (x3); in_tail & out_forward]
  8. Pascal: Mercari analysis base
    • Built with the software blocks below
      • Puree, OpenResty (ngx_lua), Fluentd, fluent-agent-hydra, Google BigQuery
    • Aggregates various logs into Google BigQuery
      • In-app events
      • A/B testing
      • etc.
  9. Why we switched to fluent-agent-hydra
    • Pascal needs to run with low server resource usage
      • Fluentd had consumed a non-negligible amount of CPU (50-60% at peak)
    • Pascal handles a moderately high workload
      • OpenResty processes a lot of JSON and outputs various logs
      • Requests come not only from devices but also from API servers
  10. Why we switched to fluent-agent-hydra
    • fluent-agent-hydra
      • Uses less CPU than Fluentd
        • Half that of Fluentd in our case!
      • Can handle multiple logs in one process
      • Simple
  11. fluent-agent-hydra internals
    [Diagram: main() runs, starts each goroutine with go func(), then waits for a signal. Goroutines: monitor, out_forwarder, in_forwarder, and a watcher & in_tail per file. Some goroutines start more goroutines.]
  12. Goroutines communicate through channels
    [Diagram: the same goroutines (monitor, out_forwarder, in_forwarder, watcher & in_tail per file), connected by channels whose receivers are monitor and out_forwarder.]
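The goroutine layout above can be sketched in a few lines of Go. This is only an illustration of the pattern (one tailing goroutine per file, a single forwarder receiving from a shared channel), not fluent-agent-hydra's actual code; the `record` type and the `tail`, `forward`, and `run` functions are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// record is a hypothetical stand-in for one tailed log line.
type record struct {
	Tag  string
	Line string
}

// tail plays the role of an in_tail goroutine: it sends each line on out.
func tail(tag string, lines []string, out chan<- record, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, l := range lines {
		out <- record{Tag: tag, Line: l}
	}
}

// forward plays the role of out_forwarder: it is the only receiver on in
// and returns the number of records it received once in is closed.
func forward(in <-chan record) int {
	n := 0
	for range in {
		n++
	}
	return n
}

// run starts one tail goroutine per file plus one forwarder, waits for the
// tails to finish, and returns how many records were forwarded.
func run(files map[string][]string) int {
	records := make(chan record)
	done := make(chan int)
	go func() { done <- forward(records) }()

	var wg sync.WaitGroup
	for tag, lines := range files {
		wg.Add(1)
		go tail(tag, lines, records, &wg)
	}
	wg.Wait()
	close(records) // no more senders, so the forwarder's loop ends
	return <-done
}

func main() {
	n := run(map[string][]string{
		"nginx.access": {"GET /", "GET /items"},
		"nginx.error":  {"upstream timed out"},
	})
	fmt.Println("forwarded records:", n)
}
```

Because the forwarder is the single receiver, the tailing goroutines never need to share state directly; they only send on the channel.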
  13. Monitoring fluent-agent-hydra
    • fluent-agent-hydra provides monitoring APIs
    • Application stats
      • Current positions in the tailed log files
      • Count and bytes sent per log
      • Other information (e.g. errors)
    • System stats
      • Powered by golang-stats-api-handler
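As a rough idea of what a system-stats endpoint of this kind involves, here is a minimal sketch built only on the Go standard library. It is not the golang-stats-api-handler implementation; the `systemStats` struct, the `collect` and `statsHandler` functions, and the chosen subset of fields are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"runtime"
)

// systemStats is a hypothetical, reduced set of runtime statistics.
type systemStats struct {
	GoVersion    string `json:"go_version"`
	GoOS         string `json:"go_os"`
	GoArch       string `json:"go_arch"`
	CPUNum       int    `json:"cpu_num"`
	GoMaxProcs   int    `json:"gomaxprocs"`
	GoroutineNum int    `json:"goroutine_num"`
	HeapAlloc    uint64 `json:"heap_alloc"`
	HeapObjects  uint64 `json:"heap_objects"`
	GCNum        uint32 `json:"gc_num"`
}

// collect reads the current values from the runtime package.
func collect() systemStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return systemStats{
		GoVersion:    runtime.Version(),
		GoOS:         runtime.GOOS,
		GoArch:       runtime.GOARCH,
		CPUNum:       runtime.NumCPU(),
		GoMaxProcs:   runtime.GOMAXPROCS(0), // 0 queries without changing it
		GoroutineNum: runtime.NumGoroutine(),
		HeapAlloc:    m.HeapAlloc,
		HeapObjects:  m.HeapObjects,
		GCNum:        m.NumGC,
	}
}

// statsHandler writes the collected stats as JSON.
func statsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(collect()); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

func main() {
	// Call the handler in-process instead of binding a port.
	rec := httptest.NewRecorder()
	statsHandler(rec, httptest.NewRequest("GET", "/system", nil))
	fmt.Print(rec.Body.String())
}
```

In a real agent the handler would be registered with `http.HandleFunc` and served on the monitor host and port.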
  14. Application Stats
    curl -s [MonitorHost]:[MonitorPort]/ | jq .
    {
      "receiver": null,
      "servers": [
        { "error": "", "alive": true, "address": "host1:24224" },
        { "error": "", "alive": true, "address": "host2:24225" }
      ],
      "files": {
        "/var/log/nginx/access.log": {
          "error": "",
          "position": 472132311,
          "tag": "nginx.access_log"
        },
        …
    …
  15. System Stats
    curl -s [MonitorHost]:[MonitorPort]/system | jq .
    {
      "gc_pause": [],
      "gc_pause_per_second": 0,
      "gc_per_second": 0,
      "gc_num": 216,
      "gc_last": 1449345754024961000,
      "gc_next": 98962453,
      "heap_objects": 590421,
      "heap_released": 0,
      "heap_inuse": 90906624,
      "heap_idle": 11984896,
      "heap_sys": 102891520,
      "cgo_call_num": 1,
      "gomaxprocs": 8,
      "goroutine_num": 31,
      "cpu_num": 8,
      "go_arch": "amd64",
      "go_os": "linux",
      "go_version": "go1.5.1",
      "time": 1449345783047388700,
      "memory_alloc": 88036000,
      "memory_total_alloc": 10373554664,
      "memory_sys": 111020280,
      "memory_lookups": 1343,
      "memory_mallocs": 162497629,
      "memory_frees": 161907208,
      "memory_stack": 917504,
      "heap_alloc": 88036000
    }
  16. After switching to fluent-agent-hydra
    BigQuery error in load operation: Error processing job
    Field:xxx: Could not convert value to integer (bad value or out of range)
    Field:yyy: Could not convert value to integer (bad value or out of range)
    Field:xxx: Could not convert value to integer (bad value or out of range)
    Field:xxx: Could not convert value to integer (bad value or out of range)
    Field:xxx: Could not convert value to integer (bad value or out of range)
    …
    !?
  17. BigQuery's demand
    • Google BigQuery demands a fixed table schema and a strict data format
    ▪ schema.json
      [ { "name":"value", "type":"INTEGER" }, ]
    ▪ valid data format
      {"value":150}
    ▪ invalid data format
      {"value":150.0}
      {"value":"150"}
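To make the valid and invalid cases above concrete, the following sketch checks whether a field in a JSON record is a bare integer literal. It only mirrors the three examples on the slide and is not BigQuery's actual validation logic; `isIntegerField` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// isIntegerField reports whether field in the JSON object line is written
// as an integer literal (150), not as a float (150.0) or a string ("150").
func isIntegerField(line, field string) bool {
	dec := json.NewDecoder(strings.NewReader(line))
	dec.UseNumber() // keep the literal text instead of converting to float64
	var rec map[string]interface{}
	if err := dec.Decode(&rec); err != nil {
		return false
	}
	n, ok := rec[field].(json.Number)
	if !ok {
		return false // missing, string, bool, etc.
	}
	_, err := n.Int64() // fails for literals such as "150.0"
	return err == nil
}

func main() {
	for _, line := range []string{
		`{"value":150}`,
		`{"value":150.0}`,
		`{"value":"150"}`,
	} {
		fmt.Println(line, isIntegerField(line, "value"))
	}
}
```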
  18. Special conversion behavior for numerical values
    • fluent-agent-hydra treats a numerical value as float64 even if it is an integer
      • When the log format is JSON
    • Why?
      • Because fluent-agent-hydra uses encoding/json and unmarshals JSON into interface values
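The behavior described above is standard for Go's `encoding/json`: when a JSON number is unmarshaled into an `interface{}` value, it becomes a `float64`. The sketch below demonstrates just that; `decode` is a hypothetical helper, not a function from fluent-agent-hydra.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decode unmarshals one JSON object into interface values.
func decode(line string) (map[string]interface{}, error) {
	var rec map[string]interface{}
	err := json.Unmarshal([]byte(line), &rec)
	return rec, err
}

func main() {
	rec, err := decode(`{"value":150}`)
	if err != nil {
		panic(err)
	}
	// The integer literal 150 is now a float64.
	fmt.Printf("%T\n", rec["value"]) // prints "float64"
}
```

A serializer that preserves the Go type of each value will then write the field as a floating-point number.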
  19. [Diagram: OpenResty outputs {"value":150} to a log file. fluent-agent-hydra (in_tail & out_forward) forwards it as {"value":150.0} (actually msgpack) to the Log Server (out_forward & out_file). There, bq load xxx.log schema.json into Google BigQuery fails with "Error processing job", because schema.json is [ { "name":"value", "type":"INTEGER" }, ] and the data is {"value":150.0}.]
  20. By the way,
    • fluent-agent-hydra provides the Types directive for converting types
      # in config.toml
      Types = "value:integer"
  21. By the way,
    • fluent-agent-hydra provides the Types directive for converting types
      # in config.toml
      Types = "value:integer"
    • But at that time this was supported only for LTSV…
  22. Now
    • fluent-agent-hydra provides the Types directive for converting types
      # in config.toml
      Types = "value:integer"
    • Values are always converted to int64, whether the format is LTSV or JSON
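The effect of such a type conversion can be sketched as a post-processing step on the decoded record. This is an assumption about the general technique, not fluent-agent-hydra's implementation; `convertIntegers` and its signature are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// convertIntegers rewrites the named fields from float64 to int64.
// Fields that are missing or hold another type are left untouched.
func convertIntegers(rec map[string]interface{}, fields ...string) {
	for _, f := range fields {
		if v, ok := rec[f].(float64); ok {
			rec[f] = int64(v)
		}
	}
}

func main() {
	var rec map[string]interface{}
	if err := json.Unmarshal([]byte(`{"value":150,"name":"item"}`), &rec); err != nil {
		panic(err)
	}
	convertIntegers(rec, "value")
	fmt.Printf("%T %v\n", rec["value"], rec["value"]) // prints "int64 150"
}
```

Note that `int64(v)` truncates any fractional part, so this conversion is only safe for fields that are known to hold whole numbers.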
  23. Pascal: Mercari analysis base (Now)
    [Diagram labels: OpenResty (x3), each with hydra (fluent-agent-hydra); Google BigQuery; Developer; Data Scientist; Analyze by SQL; send events (x3), powered by cookpad/puree-(ios|android); utilize events (x3); in_tail & out_forward]
  24. Summary
    • Fluentd is a very flexible and robust log-collector
      • But its performance is not always sufficient
    • There are some alternatives written in Go
    • fluent-agent-hydra might fit the case below
      • You want a faster, lightweight log-collector for in_tail & out_forward
      • But it is less robust than Fluentd
        • e.g. position files are not supported