Data Discovery and Systems Diagnostics with the ELK stack

Robin Moffatt
November 10, 2015

Video: https://vimeo.com/145630431

The suite of tools made up of Elasticsearch, Logstash and Kibana (ELK) offers a powerful and flexible way to explore and analyse data. Whether the data comes from log files or streams, in batch or realtime, from systems or the business, ELK makes it easy to ingest and analyse. This presentation uses the tools to demonstrate both the systems monitoring and interactive diagnostic capabilities that ELK provides and how it can be used for data discovery against "Big Data" from a variety of sources, including Hadoop. With live demos, we'll explore what the ELK components are, the options for ingesting data, and the powerful visualisation and search capabilities provided.

Transcript

  1. [email protected] www.rittmanmead.com @rittmanmead
    Robin Moffatt, Principal Consultant, Rittman Mead | YoDB November 2015
    Data Discovery & Systems Diagnostics with the ELK Stack

  2. Robin Moffatt
    • Principal Consultant with Rittman Mead
    - OBIEE & ODI / SysAdmin / Performance
    • Previously …
    - OBIEE/DW developer at large UK retailer
    - SQL Server DBA, BusinessObjects, DB2, COBOL…
    • Blog: http://ritt.md/rmoff
    • Twitter: @rmoff
    • IRC: rmoff / #obihackers / freenode

  3. ELK
    T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India)
    E : [email protected]
    W : www.rittmanmead.com

  4. ELK
    • Elasticsearch - schema-free, distributed data store
    • Logstash - centralised data processing
    • Kibana - analytics and visualisation

  5. Data Discovery

  6. ELK - Data Discovery

  7. Systems Diagnostics

  8. ELK - Systems Diagnostics

  9. Getting started is easy!
    1. Download
    2. Unarchive
    3. Run
    4. …
    5. erm
    6. that’s it!

  10. [Diagram: logs → Logstash → Elasticsearch → Kibana]

  11. DEMO!

  12. [Diagram: twitter, csv → Logstash → Elasticsearch → Kibana]

  13. DEMO!

  14. Elasticsearch
    • The core component of the ELK stack
    • Based on Apache Lucene (as is Apache Solr, which Cloudera Search builds on)
    • Distributed for scalability & resilience
    • Near-realtime document indexing

  15. Elasticsearch Uses - Search and Analytics
    • Search
    - SoundCloud
    - GitHub
    • Analytics
    - The Guardian’s Ophan application
    https://www.elastic.co/assets/bltd061cc55096a5780/case-study-the-guardian.pdf
    "A quarter of a billion events per day … typically the lag before something shows up on the dashboard is somewhere between three to five seconds…"
    http://tnw.to/s3NV5
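
To put the quoted Guardian figure in perspective, here is a quick back-of-the-envelope calculation. It assumes events arrive evenly across the day, which real traffic will not do, so treat the result as an average rather than a peak.

```python
# Average ingest rate implied by "a quarter of a billion events per day".
# Assumption: events are spread evenly over 24 hours.
EVENTS_PER_DAY = 250_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
print(f"~{events_per_second:,.0f} events per second on average")
```

That works out to roughly 2,900 events per second, each visible on a dashboard within a few seconds.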

  16. Elasticsearch
    • Stores data as JSON documents within an index
    • An index is made up of shards
    • Shards are distributed around a cluster automatically
    - Resilience and scale-out are simple
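
How a document ends up on a particular shard can be sketched in a few lines. Elasticsearch hashes a routing value (the document ID by default) and takes it modulo the number of primary shards. The sketch below is an illustration only: `zlib.crc32` stands in for the hash Elasticsearch really uses, and the shard count and document IDs are assumptions.

```python
import zlib
from collections import Counter

NUM_PRIMARY_SHARDS = 5  # assumption for illustration


def shard_for(doc_id, num_shards=NUM_PRIMARY_SHARDS):
    """Pick a shard: hash(routing value) modulo the number of primary shards.

    crc32 is a stand-in; Elasticsearch itself uses a different hash function.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


# Hypothetical document IDs
doc_ids = [f"doc-{n}" for n in range(1000)]
distribution = Counter(shard_for(doc_id) for doc_id in doc_ids)

for shard, count in sorted(distribution.items()):
    print(f"shard {shard}: {count} documents")
```

Because the result depends on the shard count, the same ID would map elsewhere if the count changed, which is why the number of primary shards is chosen when an index is created.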

  17. Elasticsearch Administration
    Kopf - https://github.com/lmenezes/elasticsearch-kopf
    Marvel

  18. Working with the REST API
    • curl is all you need ;-)
    - optionally with jq for syntax highlighting of the resulting JSON
    • Sense is a free plugin from Elastic for Kibana, useful for rapid prototyping of more complex interactions

  19. Elasticsearch REST API
    $ curl -XPOST 'http://es:9200/viz/characters/' \
        -d '{"name":"finbarr saunders"}'

  20. Elasticsearch REST API
    $ curl -XPOST 'http://es:9200/viz/characters/' \
        -d '{"name":"roger mellie",
             "notes":"the man on the tele"}'
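
The same indexing call can be built from a script rather than from curl. The sketch below uses only the Python standard library and constructs the request without sending it, so it runs without a cluster. The helper name `index_request` is made up for this example, and the host `es` is the one used on the slides.

```python
import json
import urllib.request

ES_URL = "http://es:9200"  # host name as used on the slides


def index_request(index, doc_type, document):
    """Build (but do not send) the POST that indexes one JSON document."""
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/{doc_type}/",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = index_request(
    "viz", "characters",
    {"name": "roger mellie", "notes": "the man on the tele"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running cluster:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The `/index/type/` URL shape follows the slides. Newer Elasticsearch releases have removed mapping types, so the path will differ there.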

  21. Elasticsearch REST API
    $ curl -XGET 'http://localhost:9200/viz/_search?q=roger'
    […]
    "hits" : { "total" : 1, "max_score" : 0.11506981,
    "hits" : [ {
    "_index" : "viz",
    "_type" : "characters",
    "_id" : "AUyyNUrTI0Rm5Pb-t8_l",
    "_score" : 0.11506981,
    "_source":{"name":"roger mellie","notes":"the man on the tele"}
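
Once a search response comes back, the matching documents sit under `hits.hits`, each with its original JSON in `_source`. A minimal sketch of pulling them out follows. The response text is rebuilt from the excerpt on the slide, with the elided parts simply left out, so it is not a complete Elasticsearch reply.

```python
import json

# Rebuilt from the slide excerpt; anything the slide elided is omitted.
response_text = """
{
  "hits": {
    "total": 1,
    "max_score": 0.11506981,
    "hits": [
      {
        "_index": "viz",
        "_type": "characters",
        "_id": "AUyyNUrTI0Rm5Pb-t8_l",
        "_score": 0.11506981,
        "_source": {"name": "roger mellie", "notes": "the man on the tele"}
      }
    ]
  }
}
"""

response = json.loads(response_text)

# Flatten each hit into one dict: metadata plus the stored document fields.
matches = [
    {"id": hit["_id"], "score": hit["_score"], **hit["_source"]}
    for hit in response["hits"]["hits"]
]

for match in matches:
    print(match["name"], "-", match["notes"])
```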

  22. Elasticsearch REST API
    $ curl -XDELETE 'http://localhost:9200/viz'

  23. Elasticsearch-Hadoop
    • Two-way connector between Hadoop and Elasticsearch
    • Read/Write with Elasticsearch from Hive, Pig, Spark, etc
    https://www.elastic.co/products/hadoop
    [Diagram components: Tweets, Website logs, Blog post metadata; Flume, CSV, Logstash; Hive, HDFS; elasticsearch-hadoop; Elasticsearch; Kibana]
    http://ritt.md/elk-hadoop-01

  24. Logstash
    input: kafka, log, csv, tsv, json, syslog, tcp, jdbc, stdin, twitter
    filter: grok, geoip, mutate, drop
    output: elasticsearch, email, kafka, nagios, pagerduty, stdout, file
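
The input, filter and output stages can be pictured as three functions chained together, each consuming the events the previous one emits. The sketch below is a plain-Python analogy and not Logstash code. The function names, the derived `length` field and the sample lines are all made up for illustration.

```python
import json
import sys


def read_input(lines):
    """input: wrap each raw line in an event, with the text under 'message'."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def apply_filters(events):
    """filter: skip blank events, then add a derived field to the rest."""
    for event in events:
        if not event["message"].strip():
            continue  # comparable in spirit to a drop filter
        event["length"] = len(event["message"])  # comparable to adding a field
        yield event


def write_output(events, stream=sys.stdout):
    """output: emit each event as one JSON line and return how many were written."""
    count = 0
    for event in events:
        stream.write(json.dumps(event) + "\n")
        count += 1
    return count


sample_lines = ["first log line\n", "\n", "second log line\n"]  # hypothetical input
written = write_output(apply_filters(read_input(sample_lines)))
```

Because each stage is a generator, events stream through one at a time instead of being collected first, which is the same shape as a log shipper tailing a file.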

  25. Logstash
    • Does Logstash support …? Yes, probably!
    - Vast number of supported input (and output) formats
    Inputs: couchdb_changes, drupal_dblog, elasticsearch, exec, eventlog, file, ganglia, gelf, generator, graphite, github, heartbeat, heroku, irc, imap, jmx, kafka, log4j, lumberjack, meetup, pipe, puppet_facter, relp, rss, rackspace, rabbitmq, redis, snmptrap, stdin, sqlite, s3, sqs, stomp, syslog, tcp, twitter, unix, udp, varnishlog, wmi, websocket, xmpp, zenoss, zeromq
    http://www.elastic.co/guide/en/logstash/master/input-plugins.html
    Outputs: boundary, circonus, csv, cloudwatch, datadog, datadog_metrics, email, elasticsearch, exec, file, google_bigquery, google_cloud_storage, ganglia, gelf, graphtastic, graphite, hipchat, http, irc, influxdb, juggernaut, jira, kafka, lumberjack, librato, loggly, mongodb, metriccatcher, nagios, null, nagios_nsca, opentsdb, pagerduty, pipe, riemann, redmine, rackspace, rabbitmq, redis, riak, s3, sqs, stomp, statsd, solr_http, sns, syslog, stdout, tcp, udp, websocket, xmpp, zabbix, zeromq
    http://www.elastic.co/guide/en/logstash/master/output-plugins.html

  26. Logstash Filters
    • Powerful data processing
    - Extract fields from input (grok)
    - Enrich data (geoip, dns)
    - Reformat (csv, split, multiline, json, xml)
    Filters: alter, anonymize, collate, csv, cidr, clone, cipher, checksum, date, dns, drop, elasticsearch, extractnumbers, environment, elapsed, fingerprint, geoip, grok, i18n, json, json_encode, kv, mutate, metrics, multiline, metaevent, prune, punct, ruby, range, syslog_pri, sleep, split, throttle, translate, uuid, urldecode, useragent, xml, zeromq
    http://www.elastic.co/guide/en/logstash/master/filter-plugins.html
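
To give a feel for what a filter does to an event, here is a small Python analogy of two of the listed filters: splitting `key=value` pairs into fields (in the spirit of kv) and casting a field to a number (in the spirit of a mutate conversion). The function names and the log line are made up for illustration, and the behaviour only loosely mirrors the real plugins.

```python
def kv_filter(event, source="message"):
    """Split whitespace-separated 'key=value' pairs in event[source] into fields."""
    enriched = dict(event)
    for token in event.get(source, "").split():
        key, sep, value = token.partition("=")
        if sep and key:  # ignore tokens that are not key=value
            enriched[key] = value
    return enriched


def convert_integer(event, field):
    """Cast one field to int, leaving the event untouched if the field is absent."""
    converted = dict(event)
    if field in converted:
        converted[field] = int(converted[field])
    return converted


# Hypothetical log line, not taken from the talk
event = {"message": "user=alice action=login duration_ms=153"}
event = convert_integer(kv_filter(event), "duration_ms")
print(event)
```

Each function returns a new dict instead of mutating its argument, which keeps the stages easy to test in isolation.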

  27. Grok - Time to get your RegEx on!
    [Diagram: Input data + Grok pattern → Key/Value output]
    http://grokdebug.herokuapp.com/

  28. Logstash in Action
    input { file { path => ["nqserver.log"] } }
    filter {
      grok {
        match => [ "message",
          "\[%{TIMESTAMP_ISO8601:timestamp}\] \[%{DATA:Component}\] \[%{WORD:Severity}(:%{NUMBER:LogLevelNum})?\]" ]
      }
    }
    output {
      elasticsearch { host => "localhost" }
    }
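
Under the hood, a grok expression is a regular expression in which each `%{NAME:field}` stands for a named, reusable sub-pattern. The sketch below expands the grok expression from this slide into a Python regex and applies it to one line. The four pattern definitions are simplified stand-ins and not Logstash's own, and the log line is hypothetical, shaped only to fit the expression.

```python
import re

# Simplified stand-ins for four stock grok patterns (assumptions for
# illustration; the definitions shipped with Logstash are stricter).
PATTERNS = {
    "TIMESTAMP_ISO8601": r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?",
    "DATA": r".*?",
    "WORD": r"\b\w+\b",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
}


def grok_to_regex(grok):
    """Expand each %{NAME:field} into a named capture group and compile it."""
    def expand(m):
        return f"(?P<{m.group(2)}>{PATTERNS[m.group(1)]})"
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, grok))


GROK = (r"\[%{TIMESTAMP_ISO8601:timestamp}\] \[%{DATA:Component}\] "
        r"\[%{WORD:Severity}(:%{NUMBER:LogLevelNum})?\]")

# Hypothetical log line shaped to fit the expression
line = "[2015-11-10T09:15:02.000+00:00] [OracleBIServerComponent] [NOTIFICATION:1] [] starting"

fields = grok_to_regex(GROK).search(line).groupdict()
print(fields)
```

The resulting dict is the key/value output that Logstash would attach to the event as fields, ready to be searched and aggregated in Elasticsearch.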

  29. Logstash -> Elasticsearch

  30. About Rittman Mead
    • Oracle BI and DW Gold partner
    • Winner of five UKOUG Partner of the Year awards in 2013 and 2014 - including BI
    • World-leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI
    • Approximately 80 consultants worldwide
    • All expert in Oracle BI and DW
    • Offices in US (Atlanta), Europe, Australia and India
    • Skills in broad range of supporting Oracle tools:
    - OBIEE, OBIA
    - ODIEE
    - Essbase, Oracle OLAP
    - GoldenGate
    - Endeca
    Systems Diagnostics with ELK

  31.

  32. System Diagnostics

  33. System Monitoring

  34. System Monitoring

  35. Performance Diagnostics

  36. #EOF
    email: [email protected]
    web: http://ritt.md/rmoff
    twitter: @rmoff
    irc: rmoff @ #obihackers
    Data Discovery
    http://ritt.md/go-elk-1
    System Diagnostics & Monitoring
    http://ritt.md/go-elk-2
    http://ritt.md/go-elk-3
    Logstash & Kafka
    http://ritt.md/kafka-elk
    http://ritt.md/kafka-pipelines