Fluentd - CNCF Paris

Horgix
February 15, 2018

A talk on Fluentd: an introduction to what it is and how it works, plus some real-life feedback on its usage. Presented at the Cloud Native Paris Meetup on 15 February 2018: https://www.meetup.com/Cloud-Native-Computing-Paris/events/247273583/

Transcript

  1. Fluentd: Who am I

    ▼ Alexis “Horgix” Chotard
    ▼ Systems Engineer & Consultant @ Xebia
    ▼ Love to automate things
    ▼ Striving to do everything cleanly
  2. Fluentd: What I’ll present to you today

    ▼ Theory
      ▽ Introduction to Fluentd
      ▽ How Fluentd works
    ▼ Experience feedback
      ▽ Context @ Photobox
      ▽ Photobox logs lifecycle
    ▼ Conclusion
  3. Fluentd: How it’s made

    ▼ Written in Ruby, but with performance-sensitive parts in C
    ▼ Uses MessagePack (https://msgpack.org/)
    ▼ 650+ plugins (data sources / data outputs)
    ▼ “Regular PC box can handle 18,000 messages/second with a single process”
  4. Fluentd: And a bit of history...

    ▼ 2011: Conceived by Treasure Data; open-sourced in October
    ▼ 2016: Joins the CNCF in November

    https://www.cncf.io/blog/2016/11/09/fluentd-joins-cloud-native-computing-foundation/
  5. Fluentd: Distribution

    Fluentd vs td-agent:
    ▼ td-agent is a stable distribution package of Fluentd
    ▼ Much the same relationship as Moby/Docker
    ▼ Community support vs Treasure Data, Inc. maintainers
    ▼ Gem vs rpm/deb/dmg packages
    ▼ ...
  6. Fluentd: Distribution

    Others:
    ▼ Maintained Docker image available
    ▼ Official Kubernetes DaemonSet
    ▼ Fluent Bit / td-agent-bit: lightweight forwarder only

    https://hub.docker.com/r/fluent/fluentd
    https://docs.fluentd.org/v0.12/articles/kubernetes-fluent
  7. Fluentd: Log lifecycle

    ▼ Fluentd gets logs from inputs
    ▼ ... and associates a (single) tag with each of them
    ▼ Matches the tag against outputs
    ▼ Then sends the log accordingly
    ▼ Can also re-emit logs after modification with a new tag: filtering!

    Input → filter 1 → ... → filter N → Output
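The lifecycle above can be modelled in a few lines of Python. This is only a toy sketch of tag-based routing, not how Fluentd is implemented: `tag_matches`, `run_pipeline` and the rule tuples are hypothetical names. It assumes the documented wildcard semantics (`*` matches exactly one tag part, `**` matches zero or more parts) and that an event goes to the first output whose pattern matches.

```python
def tag_matches(pattern, tag):
    """Return True if a dot-separated tag matches a Fluentd-style pattern."""
    def match(p, t):
        if not p:
            return not t
        head, rest = p[0], p[1:]
        if head == "**":
            # "**" may swallow zero or more tag parts.
            return any(match(rest, t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if head == "*" or head == t[0]:
            return match(rest, t[1:])
        return False

    return match(pattern.split("."), tag.split("."))


def run_pipeline(tag, record, filters, outputs):
    """Apply every matching filter in order, then deliver to the first matching output."""
    for pattern, transform in filters:
        if tag_matches(pattern, tag):
            record = transform(record)
    for pattern, sink in outputs:
        if tag_matches(pattern, tag):
            sink.append((tag, record))
            return True
    return False  # no output matched this tag


# Hypothetical rules: one filter that adds a field, one in-memory "output".
access_log = []
filters = [("myapp.access", lambda r: {**r, "hello": "Cloud Native Paris"})]
outputs = [("myapp.access", access_log)]

run_pipeline("myapp.access", {"event": "data"}, filters, outputs)
print(access_log)
```

The filters here keep the tag unchanged; re-emitting with a new tag would amount to calling `run_pipeline` again with the rewritten tag.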
  8. Fluentd: Log lifecycle

    <source>
      @type http
      port 9880
    </source>

    <filter myapp.access>
      @type record_transformer
      <record>
        hello "Cloud Native Paris"
      </record>
    </filter>

    <match myapp.access>
      @type file
      path /var/log/fluent/access.log
    </match>

    http://the.fluentd.host.example:9880/myapp.access?json={"event":"data"}
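To try the HTTP input from a script, the event can be sent as a form-encoded POST whose path is the tag and whose `json` field holds the record. The sketch below only builds the request, using the example host from the slide; `build_event_request` is a hypothetical helper, and nothing is sent unless the last line is uncommented.

```python
import json
import urllib.parse
import urllib.request


def build_event_request(base_url, tag, record):
    """Build (but do not send) a POST request for Fluentd's HTTP input.

    The tag goes in the URL path and the event in a form field named `json`.
    """
    body = urllib.parse.urlencode({"json": json.dumps(record)}).encode("ascii")
    return urllib.request.Request(f"{base_url}/{tag}", data=body, method="POST")


request = build_event_request(
    "http://the.fluentd.host.example:9880", "myapp.access", {"event": "data"}
)
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))

# To actually send it to a running agent:
# urllib.request.urlopen(request)
```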
  9. Fluentd: Logging driver for Kubernetes

    “Kubernetes doesn’t specify a logging agent, but two optional logging agents are packaged with the Kubernetes release: Stackdriver Logging for use with Google Cloud Platform, and Elasticsearch. You can find more information and instructions in the dedicated documents. Both use fluentd with custom configuration as an agent on the node.”

    https://kubernetes.io/docs/concepts/cluster-administration/logging/
  10. Fluentd: Technical context

    Immutable infrastructure:
    1. Build an image
    2. Deploy this image, without changing it across environments
  11. Fluentd: Goals

    ▼ Do things cleanly, as Cloud Native as possible
    ▼ Ship logs!
      ▽ Logstash? Nope, decision already made
      ▽ Let's give Fluentd a try (td-agent, actually)
    ▼ Have a configuration that is reusable across projects
  12. Fluentd: What we have to do

    ▼ Embed the agent inside images (AMI)
    ▼ Configure it
    ▼ Make it take logs from:
      ▽ Application
      ▽ Nginx access logs (moved away from the ELB ones)
    ▼ Send them to the Fluentd forwarders
  13. Fluentd: Logging destination - 12 factor

    “A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout.”

    https://12factor.net/logs
  14. Fluentd: Parsing logs

    ▼ Complex log formats are a pain to parse:
      ▽ Multiline tracebacks
      ▽ Varying formats
      ▽ Strong coupling between the log format of the app and the parsed format
    ▼ Solution:
      ▽ Make nginx log in JSON
      ▽ Make the application log in JSON too!
  15. Fluentd: JSON logging - Nginx

    log_format json '{ '
    {% for elt in nginx_log_elements %}
        '"{{ elt }}": "${{ elt }}"{{ ',' if not loop.last else '' }} '
    {% endfor %}
        '}';
    access_log syslog:server=unix:/dev/log,nohostname json;

    Result:

    log_format json '{ '
        '"http_referer": "$http_referer", '
        '"request_uri": "$request_uri", '
        '"status": "$status" '
        '}';
    access_log syslog:server=unix:/dev/log,nohostname json;
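What the template loop does can be reproduced in plain Python, which makes the comma handling easy to check. This is an illustrative sketch rather than the deck's tooling: `render_log_format` is a hypothetical helper, and the element list is the one visible in the slide's result.

```python
def render_log_format(elements):
    """Render an nginx `log_format json` directive from a list of variable names."""
    lines = ["log_format json '{ '"]
    for index, name in enumerate(elements):
        # Every entry except the last one is followed by a comma.
        comma = "," if index < len(elements) - 1 else ""
        entry = f'"{name}": "${name}"{comma} '
        lines.append(f"    '{entry}'")
    lines.append("    '}';")
    return "\n".join(lines)


nginx_log_elements = ["http_referer", "request_uri", "status"]
print(render_log_format(nginx_log_elements))
```

Values are interpolated as-is here; nginx's `log_format` also accepts an `escape=json` parameter, which may be worth considering when variables can contain quotes.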
  16. Fluentd: JSON logging - Gunicorn (app)

    [loggers]
    keys=root, gunicorn.error

    [handlers]
    keys=console

    [formatters]
    keys=json

    [logger_root]
    level=INFO
    handlers=console

    [logger_gunicorn.error]
    level=ERROR
    handlers=console
    propagate=0
    qualname=gunicorn.error

    [handler_console]
    class=StreamHandler
    formatter=json
    args=(sys.stdout, )

    [formatter_json]
    class=jsonlogging.JSONFormatter
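The deck does not show what `jsonlogging.JSONFormatter` looks like, so the following is a hypothetical stand-in built only on the standard library. It illustrates why JSON logging helps with multiline tracebacks: the whole traceback ends up inside a single JSON string, so each event stays on one line. The field names are assumptions.

```python
import json
import logging
import sys


class JSONFormatter(logging.Formatter):
    """Hypothetical stand-in for the deck's `jsonlogging.JSONFormatter`."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # json.dumps escapes the newlines, so the event stays on one line.
            payload["traceback"] = self.formatException(record.exc_info)
        return json.dumps(payload)


# Same wiring as the config above: a stream handler on stdout with the JSON formatter.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JSONFormatter())
logger = logging.getLogger("myapp")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("request failed")
```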
  17. Fluentd: Getting logs from journald

    <source>
      @type systemd
      tag hellocncf
      path /run/log/journal
      filters [{ "_SYSTEMD_UNIT": "myapp.service" }]
      read_from_head true
      <storage>
        @type local
        persistent true
        path /run/td-agent/myapp.pos
      </storage>
    </source>
  18. Fluentd: Parsing JSON

    <match hellocncf>
      @type parser
      tag myapp.parsed
      key_name full_message
      reserve_data yes
      format json
      inject_key_prefix parsed_
    </match>
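As a rough illustration of what this parser configuration does to a record, here is a Python sketch. It reflects my reading of the options (`key_name` selects the field to parse, `reserve_data` keeps the original fields, `inject_key_prefix` prefixes the parsed keys) and is not the plugin's code; `parse_json_field` and the sample event are hypothetical.

```python
import json


def parse_json_field(record, key_name="full_message", prefix="parsed_", reserve_data=True):
    """Parse the JSON string stored under `key_name` and merge it into the record."""
    parsed = json.loads(record[key_name])
    # With reserve_data, the original fields are kept alongside the parsed ones.
    result = dict(record) if reserve_data else {}
    for key, value in parsed.items():
        result[prefix + key] = value
    return result


# Hypothetical event as it might arrive from the journald source.
event = {"full_message": '{"status": "200", "request_uri": "/"}', "host": "web-1"}
print(parse_json_field(event))
```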
  19. Fluentd: Log enrichment with AWS metadata

    <match foo.**>
      type ec2_metadata
      aws_key_id YOUR_AWS_KEY_ID
      aws_sec_key YOUR_AWS_SECRET/KEY
      metadata_refresh_seconds 300 # Optional, default 300 seconds
      output_tag ${instance_id}.${tag}
      <record>
        instance_id ${instance_id}
        instance_type ${instance_type}
        az ${availability_zone}
        private_ip ${private_ip}
        vpc_id ${vpc_id}
        ami_id ${image_id}
        account_id ${account_id}
        project ${tagset_project}
      </record>
    </match>
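The `${...}` placeholders in this block can be mimicked with Python's `string.Template`, which happens to use the same syntax. The sketch below only illustrates the substitution idea: the metadata values are made up, `enrich` is a hypothetical helper, and the real plugin fetches these values from AWS.

```python
from string import Template

# Made-up metadata; the plugin would obtain these values from AWS.
metadata = {
    "instance_id": "i-0123456789abcdef0",
    "instance_type": "t2.micro",
    "availability_zone": "eu-west-1a",
}


def enrich(tag, record, output_tag, record_template, metadata):
    """Expand ${...} placeholders into a new tag and extra record fields."""
    values = {**metadata, "tag": tag}
    new_tag = Template(output_tag).substitute(values)
    extra = {key: Template(value).substitute(values) for key, value in record_template.items()}
    return new_tag, {**record, **extra}


new_tag, enriched = enrich(
    "foo.access",
    {"status": "200"},
    output_tag="${instance_id}.${tag}",
    record_template={"instance_id": "${instance_id}", "az": "${availability_zone}"},
    metadata=metadata,
)
print(new_tag)
print(enriched)
```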
  20. Fluentd: Label - preconfigured forward

    Labels: like namespaces for Fluentd rules

    Relabel-ing:

    <match {{ app_name }}.app.**>
      @type relabel
      @label @PRECONFIGURED_FORWARD
    </match>

    <label @PRECONFIGURED_FORWARD>
      <match **>
        @type forward
        <server>
          name tdagent-forwarder
          host my.tdagent.forwarder.example.org
          port 5150
        </server>
        dns_round_robin true
      </match>
    </label>
  21. Fluentd: Why Fluentd is cool

    ▼ Many plugins for everything you can think of
    ▼ Really simple configurations
    ▼ Tags flow is straightforward
  22. Fluentd: Could be better

    ▼ Ruby :(
    ▼ Can be a pain with complex log formats (non-JSON, multiline, ...)
    ▼ Documentation
  23. A technical conference for developers and tech leads about data

    ▼ Data science: the next chapter
    ▼ Streaming: data in motion
    ▼ Big data architecture essentials

    Save the date! 17 May 2018, Pan Piper: 2-4 Impasse Lamier, 75011 Paris
    dataxday.fr
  24. Paris Container Day is the pioneering conference in France dedicated to the container ecosystem and its best practices.

    This year's theme: “Living with Orchestration”

    Save the date! 26 June 2018 at the Cap Event Center, 3 Quai de Grenelle, 75015 Paris
    paris-container-day.fr