Introduction to Logstash

Or: How to wring useful data from those files just lying around consuming disk space

Elasticsearch Inc

November 13, 2013

Transcript

  1. Introduction to Logstash
    Or: How to wring useful data from those files just lying around consuming disk space
  2. Logs…
    In computing, a logfile or simply log is a file that records events taking place in the execution of a system in order to provide an audit trail that can be used to understand the activity of the system and to diagnose problems. The act of keeping a logfile is called logging. Logs are essential to understand the activities of complex systems, particularly in the case of applications with little user interaction (such as server applications). It can also be useful to combine log file entries from multiple sources. This approach, in combination with statistical analysis, may yield correlations between seemingly unrelated events on different servers. Other solutions employ network-wide querying and reporting.
    http://en.wikipedia.org/wiki/Logfile
  8. Logs…
    127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
    127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
    [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
    Nov 4 14:56:40 blackbox kernel[0]: Sandbox: collabpp(40395) deny mach-lookup com.apple.coreservices.launchservicesd
    1383583716.483 45760 172.19.42.101 TCP_MISS/200 82216 CONNECT www.google.com:443 - HIER_DIRECT/216.239.32.20 -
    Nov 4 14:59:07 blackbox pf[209]: 00:00:00.000017 rule 1.800.icefloor.10/0(match): block in on en1: 172.19.42.1.9300 > 172.19.42.1.63825: Flags [S.], seq 4062841714, ack 1097543482, win 65535, options [mss 16344,nop,wscale 4,nop,nop,TS val 585988791 ecr 585984807,sackOK,eol], length 0
    2012-05-04 11:10:42,650|ERROR| |[ACTIVE] ExecuteThread: '51' for queue: 'weblogic.kernel.Default (self-tuning)'| com.some.crazy.method|ConnectionRequest to http://xx.xx.xx.xx:xxxx/XML/something.xml failed with status code [401] (specified timeout: 8 seconds)
    2012-05-04 17:17:20,870 [[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'] INFO another.crazy.method.name - Error goes here…
    14:52:41,755 ERROR [org.jboss.msc.service.fail] MSC00001: Failed to start service jboss.as: org.jboss.msc.service.StartException in service jboss.as: Failed to start service
        at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1767) [jboss-msc-1.0.2.GA.jar:1.0.2.GA
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [rt.jar:1.7.0_11]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [rt.jar:1.7.0_11]
        at java.lang.Thread.run(Unknown Source) [rt.jar:1.7.0_11]
    2013-11-02 12:27:11 11986 [Note] /usr/local/Cellar/mysql/5.6.12/bin/mysqld: ready for connections.
    {DATE} + {DATA} = Log
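The "{DATE} + {DATA} = Log" idea can be sketched in a few lines of Python. This is only an illustration, not how Logstash does it: the helper name `split_log_line` and the `PATTERNS` table are hypothetical, and only three of the timestamp styles from the samples above are covered.

```python
import re
from datetime import datetime

# Hypothetical table: a regex that finds the timestamp, plus the strptime
# format needed to parse it. Only three of the sample styles are covered.
PATTERNS = [
    # Apache access log: [10/Oct/2000:13:55:36 -0700]
    (re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"),
     "%d/%b/%Y:%H:%M:%S %z"),
    # Apache error log: [Wed Oct 11 14:32:52 2000]
    (re.compile(r"\[(\w{3} \w{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4})\]"),
     "%a %b %d %H:%M:%S %Y"),
    # Leading "2013-11-02 12:27:11" followed by whitespace, so the
    # comma-millisecond variants are deliberately not matched here.
    (re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?=\s)"),
     "%Y-%m-%d %H:%M:%S"),
]


def split_log_line(line):
    """Return (timestamp, data); timestamp is None if no pattern matches."""
    for pattern, fmt in PATTERNS:
        match = pattern.search(line)
        if match:
            when = datetime.strptime(match.group(1), fmt)
            # Everything except the timestamp is the "data" half.
            rest = line[:match.start()] + line[match.end():]
            return when, " ".join(rest.split())
    return None, line


if __name__ == "__main__":
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    when, data = split_log_line(sample)
    print(when.isoformat(), "|", data)
```

Lines in any other style, such as the syslog or JBoss samples, come back with `None` as the timestamp. Every new format needs another entry in the table.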
  9. Logs… PRO:
    Human-readable text format
    Fairly simple to understand (again, for a human)
    …As long as it’s only a few at a time
    …And not spanning too many servers
  10. Logs… CON:
    Format variations ≈ ∞
    Date formats, Data fields, Strings, Integers, oh my!
    Which box(es) are having the error?
    Which file will I find the error in?
    Search = PAIN
  11. Search = PAIN
    grep
    cat file.log | grep pattern
    cat file.log | egrep 'pattern1|pattern2'
    sort? columns? if/then?
    Multiply this effort across n servers?
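To make the "multiply this effort" point concrete, here is a rough Python equivalent of running the `egrep` one-liner over several files at once. The function name `grep_files` and the file names in the demo are made up for illustration. The sketch also assumes every log has already been copied to one machine, which is itself part of the pain.

```python
import re
import tempfile
from pathlib import Path


def grep_files(paths, pattern):
    """Yield (path, line_number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield str(path), number, line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical logs standing in for files collected from two servers.
    with tempfile.TemporaryDirectory() as tmp:
        web = Path(tmp, "web01.log")
        db = Path(tmp, "db01.log")
        web.write_text("GET /index.html 200\nGET /admin 403\n", encoding="utf-8")
        db.write_text("ready for connections\nERROR too many connections\n",
                      encoding="utf-8")
        for path, number, line in grep_files([web, db], r"403|ERROR"):
            print(f"{Path(path).name}:{number}: {line}")
```

This reports which file and line matched, but sorting, column extraction and conditional logic would each need more hand-written code.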
  12. Logstash Flow
    tail -f | (pipe) grep | (pipe) >

  13. Logstash Flow
    Input Filter Output
    tail -f | (pipe) grep | (pipe) >
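The Input, Filter, Output analogy with a shell pipeline can be sketched as three chained Python generators. This only illustrates the shape of the flow and is not Logstash's actual implementation. The function names and the `{"message": ...}` event layout are assumptions made for the example.

```python
import io
import sys


def read_input(stream):
    """Input stage: yield one event per line, like `tail -f` feeding a pipe."""
    for line in stream:
        yield {"message": line.rstrip("\n")}


def keep_matching(events, needle):
    """Filter stage: pass on only events containing `needle`, like `grep`."""
    for event in events:
        if needle in event["message"]:
            yield event


def write_output(events, sink):
    """Output stage: write each event to `sink`, like the `>` redirect."""
    count = 0
    for event in events:
        sink.write(event["message"] + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # An in-memory stream stands in for a real log file.
    source = io.StringIO("INFO started\nERROR disk full\nINFO stopped\n")
    write_output(keep_matching(read_input(source), "ERROR"), sys.stdout)
```

Because each stage only consumes and yields events, a stage can be replaced without touching the other two, much like swapping one command in a shell pipeline.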
  14. Live demo time!

  15. Input Filter Output
    Input: elasticsearch eventlog exec file ganglia gelf generator graphite heroku imap log4j lumberjack pipe rabbitmq redis relp s3 snmptrap sqlite sqs stdin syslog tcp twitter udp unix varnishlog websocket wmi xmpp zenoss zeromq …
    Filter: advisor alter anonymize checksum cidr cipher clone csv date dns drop environment gelfify geoip grep grok json kv metaevent metrics multiline mutate range ruby split syslog_pri translate urldecode useragent uuid xml zeromq …
    Output: elasticsearch email exec file ganglia gelf google_cloud_storage graphite jira librato loggly lumberjack mongodb nagios opentsdb pagerduty pipe rabbitmq redis riak riemann s3 sns sqs statsd stdout syslog tcp udp xmpp zabbix zeromq …
  16. Questions?

  17. Commercial Support
    Development
    Production
    Get support from the core team that built Elasticsearch, Kibana, and Logstash
    http://elasticsearch.com/support/
  18. Thank you for coming!