Robin Moffatt
November 10, 2015

Data Discovery and Systems Diagnostics with the ELK stack

Video: https://vimeo.com/145630431

The suite of tools made up of Elasticsearch, Logstash and Kibana (ELK) offers a powerful and flexible way to explore and analyse data. Whether log file or streamed, batch or realtime, system or business data - ELK makes it easy to ingest and analyse. This presentation will use the tools to demonstrate both the systems monitoring and interactive diagnostic capabilities that ELK gives, and how it can be used for data discovery against "Big Data" from a variety of sources including Hadoop. With live demos, we'll explore what the ELK components are, the options to ingest data, and the powerful visualisation and search capabilities provided.


Transcript

  1. [email protected] www.rittmanmead.com @rittmanmead
    Robin Moffatt, Principal Consultant, Rittman Mead | YoDB November 2015
    Data Discovery & Systems Diagnostics with the ELK Stack
  2. Robin Moffatt
    • Principal Consultant with Rittman Mead
      - OBIEE & ODI / SysAdmin / Performance
    • Previously …
      - OBIEE/DW developer at large UK retailer
      - SQL Server DBA, BusinessObjects, DB2, COBOL…
    • Blog: http://ritt.md/rmoff
    • Twitter: @rmoff
    • IRC: rmoff / #obihackers / freenode
  3. ELK
    T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : [email protected] W : www.rittmanmead.com
  4. ELK
    • Elasticsearch - schema-free, distributed data store
    • Logstash - centralised data processing
    • Kibana - analytics and visualisation
  5. Data Discovery
  6. ELK - Data Discovery
  7. Systems Diagnostics
  8. Elasticsearch
    • The core component of the ELK stack
    • Based on Apache Lucene (same as Cloudera’s Solr)
    • Distributed for scalability & resilience
    • Near-realtime document indexing
  9. Elasticsearch Uses - Search and Analytics
    • Search
      - Soundcloud
      - GitHub
    • Analytics
      - The Guardian’s Ophan application https://www.elastic.co/assets/bltd061cc55096a5780/case-study-the-guardian.pdf
    A quarter of a billion events per day … typically the lag before something shows up on the dashboard is somewhere between three to five seconds… http://tnw.to/s3NV5
  10. Elasticsearch
    • Stores data as JSON documents within an index
    • An index is made up of shards
    • Shards are distributed around a cluster automatically
      - Resilience and scale-out are simple
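The index/shard/cluster relationship described on this slide can be pictured with a small Python sketch. It assumes the commonly documented idea that a document lands on a primary shard chosen by hashing its routing key modulo the number of primary shards. The hash used here (crc32), the round-robin shard placement, and all names (pick_shard, place_shards, node-a, doc-1 and so on) are hypothetical stand-ins for illustration, not Elasticsearch's actual internals.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List


def pick_shard(routing_key: str, primary_shards: int) -> int:
    """Map a routing key (for example a document id) to a shard number."""
    # crc32 is only a deterministic stand-in hash for this sketch.
    return zlib.crc32(routing_key.encode("utf-8")) % primary_shards


def place_shards(primary_shards: int, nodes: List[str]) -> Dict[int, str]:
    """Spread shard numbers over nodes round-robin (a deliberate simplification)."""
    return {shard: nodes[shard % len(nodes)] for shard in range(primary_shards)}


def distribute(doc_ids: Iterable[str], primary_shards: int,
               nodes: List[str]) -> Dict[str, List[str]]:
    """Return which (hypothetical) node ends up holding each document id."""
    placement = place_shards(primary_shards, nodes)
    by_node = defaultdict(list)
    for doc_id in doc_ids:
        by_node[placement[pick_shard(doc_id, primary_shards)]].append(doc_id)
    return dict(by_node)


layout = distribute(["doc-1", "doc-2", "doc-3", "doc-4"],
                    primary_shards=5, nodes=["node-a", "node-b"])
for node, ids in sorted(layout.items()):
    print(node, ids)
```

Because the shard is derived from the key alone, any node can work out where a document lives, which is one way to read the slide's point that scale-out is simple.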
  11. Working with the REST API
    • curl is all you need ;-)
      - optionally with jq for syntax highlighting of the resulting JSON
    • Sense is a free plugin from Elastic for Kibana and useful for rapid prototyping of more complex interactions
  12. Elasticsearch REST API
    $ curl -XPOST 'http://es:9200/viz/characters/' -d '{"name":"finbarr saunders"}'
  13. Elasticsearch REST API
    $ curl -XGET 'http://localhost:9200/viz/_search?q=roger'
    […]
    "hits" : {
      "total" : 1,
      "max_score" : 0.11506981,
      "hits" : [ {
        "_index" : "viz",
        "_type" : "characters",
        "_id" : "AUyyNUrTI0Rm5Pb-t8_l",
        "_score" : 0.11506981,
        "_source":{"name":"roger mellie","notes":"the man on the tele"}
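The two curl calls on the REST API slides (index a document, then search for it) translate directly to any HTTP client. The sketch below builds the equivalent requests with Python's standard library. The helper names (build_index_request, build_search_request, send) and the BASE_URL value are assumptions for illustration; the index name `viz`, the type `characters`, and the document body are taken from the slides. The typed path reflects the Elasticsearch release current at the time of the talk; newer releases removed mapping types, so that path segment would differ there.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def build_index_request(index, doc_type, document, base_url=BASE_URL):
    """POST a JSON document to /<index>/<type>/, letting the server pick the id."""
    url = "{}/{}/{}/".format(base_url, index, doc_type)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_search_request(index, query, base_url=BASE_URL):
    """GET /<index>/_search?q=<query>, the URI search used on the slide."""
    url = "{}/{}/_search?{}".format(
        base_url, index, urllib.parse.urlencode({"q": query}))
    return urllib.request.Request(url, method="GET")


def send(request):
    """Send a prepared request and decode the JSON response (needs a live node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


index_req = build_index_request("viz", "characters", {"name": "finbarr saunders"})
search_req = build_search_request("viz", "roger")
print(index_req.get_method(), index_req.full_url)
print(search_req.get_method(), search_req.full_url)
```

Calling `send(index_req)` and then `send(search_req)` against a running node would perform the same round trip as the curl commands; the sketch stops short of that so it runs without a cluster.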
  14. Elasticsearch-Hadoop
    • Two-way connector between Hadoop and Elasticsearch
    • Read/Write with Elasticsearch from Hive, Pig, Spark, etc
    https://www.elastic.co/products/hadoop
    [Diagram labels: Tweets, Website logs, Blog post metadata; Flume, CSV, Logstash; HDFS, Hive; elasticsearch-hadoop; Elasticsearch; Kibana]
    http://ritt.md/elk-hadoop-01
  15. Logstash: input → filter → output
    • input: kafka log csv tsv json syslog tcp jdbc stdin twitter
    • filter: grok geoip mutate drop
    • output: elasticsearch email kafka nagios pagerduty stdout file
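The input, filter, output flow named on this Logstash slide can be pictured as a chain of small functions over events. The Python sketch below is only a conceptual model under that assumption; it is not how Logstash is implemented, and every name in it (line_input, drop_blank, add_length, run_pipeline) is hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Event = Dict[str, object]
Filter = Callable[[Event], Optional[Event]]


def line_input(lines: Iterable[str]) -> Iterator[Event]:
    """Input stage: wrap each raw line in an event with a 'message' field."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def drop_blank(event: Event) -> Optional[Event]:
    """Filter stage: discard events whose message is empty (like a 'drop' filter)."""
    return event if str(event["message"]).strip() else None


def add_length(event: Event) -> Optional[Event]:
    """Filter stage: enrich the event with a derived field (like a 'mutate' filter)."""
    return {**event, "length": len(str(event["message"]))}


def run_pipeline(events: Iterable[Event], filters: List[Filter],
                 output: Callable[[Event], None]) -> None:
    """Pass each event through the filters in order, then hand survivors to the output."""
    for event in events:
        current: Optional[Event] = event
        for apply_filter in filters:
            current = apply_filter(current)
            if current is None:
                break  # a filter dropped the event
        else:
            output(current)


collected: List[Event] = []
run_pipeline(line_input(["a", "", "bcd"]), [drop_blank, add_length], collected.append)
for event in collected:
    print(event)
```

Swapping `collected.append` for a function that ships the event elsewhere mirrors how an output stage is interchangeable while the filters stay the same.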
  16. Logstash
    • Does Logstash support <foo> … yes, probably!
      - Vast number of supported input (and output) formats
    Inputs: couchdb_changes drupal_dblog elasticsearch exec eventlog file ganglia gelf generator graphite github heartbeat heroku irc imap jmx kafka log4j lumberjack meetup pipe puppet_facter relp rss rackspace rabbitmq redis snmptrap stdin sqlite s3 sqs stomp syslog tcp twitter unix udp varnishlog wmi websocket xmpp zenoss zeromq
    http://www.elastic.co/guide/en/logstash/master/input-plugins.html
    Outputs: boundary circonus csv cloudwatch datadog datadog_metrics email elasticsearch exec file google_bigquery google_cloud_storage ganglia gelf graphtastic graphite hipchat http irc influxdb juggernaut jira kafka lumberjack librato loggly mongodb metriccatcher nagios null nagios_nsca opentsdb pagerduty pipe riemann redmine rackspace rabbitmq redis riak s3 sqs stomp statsd solr_http sns syslog stdout tcp udp websocket xmpp zabbix zeromq
    http://www.elastic.co/guide/en/logstash/master/output-plugins.html
  17. Logstash Filters
    • Powerful data processing
      - Extract fields from input (grok)
      - Enrich data (geoip, dns)
      - Reformat (csv, split, multiline, json, xml)
    Filters: alter anonymize collate csv cidr clone cipher checksum date dns drop elasticsearch extractnumbers environment elapsed fingerprint geoip grok i18n json json_encode kv mutate metrics multiline metaevent prune punct ruby range syslog_pri sleep split throttle translate uuid urldecode useragent xml zeromq
    http://www.elastic.co/guide/en/logstash/master/filter-plugins.html
  18. Grok - Time to get your RegEx on!
    [Figure: Input data → Grok pattern → Key/Value output]
    http://grokdebug.herokuapp.com/
  19. Logstash in Action
    filter { grok { match => [ "message", "\[%{TIMESTAMP_ISO8601:timestamp}\] \[%{DATA:Component}\] \[%{WORD:Severity} (:%{NUMBER:LogLevelNum})?\] […]
    input { file { path => ["nqserver.log"] } }
    output { elasticsearch { host => "localhost" } }
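The grok pattern on this slide names four fields: timestamp, Component, Severity and LogLevelNum. As a rough illustration of what grok does with such a pattern, the Python sketch below hand-translates it into a plain regular expression with named groups. This is an approximation, not grok's own pattern definitions: the timestamp expression is a simplified stand-in for TIMESTAMP_ISO8601, the sample log lines are hypothetical, and the sketch assumes the optional level follows the severity directly as `:N` (the spacing in the slide transcript is ambiguous).

```python
import re
from typing import Dict, Optional

# Simplified, hand-written equivalents of the grok sub-patterns (assumptions).
LOG_LINE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2}T[\d:.]+(?:Z|[+-]\d{2}:?\d{2})?)\]\s+"
    r"\[(?P<Component>.*?)\]\s+"
    r"\[(?P<Severity>\w+)(?::(?P<LogLevelNum>\d+))?\]"
)


def parse(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields found at the start of the line, or None if no match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    # Leave out optional groups that did not take part in the match.
    return {key: value for key, value in match.groupdict().items() if value is not None}


# Hypothetical sample line shaped to fit the pattern above.
sample = "[2015-11-10T09:23:16.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] example message"
fields = parse(sample)
print(fields)
```

The resulting dictionary is the key/value output that Logstash would then send on to Elasticsearch as fields of the event.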
  20. About Rittman Mead
    • Oracle BI and DW Gold partner
    • Winner of five UKOUG Partner of the Year awards in 2013 and 2014 - including BI
    • World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI
    • Approximately 80 consultants worldwide
    • All expert in Oracle BI and DW
    • Offices in US (Atlanta), Europe, Australia and India
    • Skills in broad range of supporting Oracle tools:
      - OBIEE, OBIA
      - ODIEE
      - Essbase, Oracle OLAP
      - GoldenGate
      - Endeca
    Systems Diagnostics with ELK
  21. #EOF
    email: [email protected]
    web: http://ritt.md/rmoff
    twitter: @rmoff
    irc: rmoff @ #obihackers
    Data Discovery: http://ritt.md/go-elk-1
    System Diagnostics & Monitoring: http://ritt.md/go-elk-2 http://ritt.md/go-elk-3
    Logstash & Kafka: http://ritt.md/kafka-elk http://ritt.md/kafka-pipelines