Elastic{ON} 2018: Latest in Logstash

Elastic Co
March 01, 2018

Much has happened since 5.0: persistent queues, the pipeline viewer (x-ray vision, basically), the ability to run multiple pipelines at the same time for different use cases, and a move to the latest version of JRuby. It's all laying the foundation for even more goodness to come. See where the Logstash roadmap is headed and what to expect next.

Transcript

  1. Elastic
    February 28, 2018
    @jordansissel / @andrewvc
    What’s the Latest in Logstash
    Jordan Sissel and Andrew Cholakian

  2. Thank you to our community
    You complete the picture

  3. The Logstash Plugins Community
    In the last year, 1,413 Logstashers have helped us with 9,237
    issues, comments, and pull requests to our logstash-plugins
    repository. 864 pull requests were opened.

  4. The Logstash Core Community
    In the last year, 76 Logstashers have helped us with 1,131
    issues, comments, and pull requests to our logstash-plugins
    repository. 179 pull requests were opened.

  5. Triaging in Two Parts

  6. We couldn’t do it without you!

  7. queue.type: persisted # (v5.4)
    [Diagram: three inputs feed the queue, which feeds three filters and two outputs]
    Theme: Don’t lose data.
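
The idea behind a persisted queue can be sketched in a few lines of Python. This is a toy illustration, not Logstash's implementation: the class name `TinyPersistedQueue`, the file names, and the ack counter are all assumptions made up for the example. It shows only the ordering that matters: an event is written to disk before the input is told it was accepted, and it is forgotten only after an output acknowledges it.

```python
import json
import os
import tempfile


class TinyPersistedQueue:
    """Toy write-ahead queue: events hit disk before the input is acknowledged."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "events.log")
        self.ack_path = os.path.join(directory, "acked.count")

    def push(self, event):
        # Append and fsync so the event survives a crash after push() returns.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(event) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def _acked(self):
        # Number of events already delivered downstream (0 if none yet).
        if not os.path.exists(self.ack_path):
            return 0
        with open(self.ack_path, encoding="utf-8") as f:
            return int(f.read() or 0)

    def unacked(self):
        # Everything written but not yet acknowledged by the outputs.
        if not os.path.exists(self.log_path):
            return []
        with open(self.log_path, encoding="utf-8") as log:
            events = [json.loads(line) for line in log]
        return events[self._acked():]

    def ack(self, count):
        # Record that `count` more events were delivered downstream.
        total = self._acked() + count
        with open(self.ack_path, "w", encoding="utf-8") as f:
            f.write(str(total))


def demo():
    with tempfile.TemporaryDirectory() as directory:
        queue = TinyPersistedQueue(directory)
        queue.push({"message": "a"})
        queue.push({"message": "b"})
        queue.ack(1)  # only "a" was delivered before the simulated crash
        restarted = TinyPersistedQueue(directory)  # simulate a restart
        return restarted.unacked()
```

In `demo()`, only the first event is acknowledged before the simulated restart, so the second event is still available when the queue is reopened.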

  8. WHAT IF WE CAN’T
    DELIVER SOMETHING?
    Theme: Don’t lose data.

  9. Simple things should be simple

  10. bin/logstash --modules netflow

  11. bin/logstash --modules arcsight

  12. bin/logstash --modules

  13. A Tale of Two* Pipelines
    *(or more)

  14. [Diagram: Beats and TCP inputs, dissect and grok filters, TCP and Elasticsearch outputs, all in one pipeline]
    input {
      beats { port => 3444 tags => ["apache"] }
      tcp { port => 4222 tags => ["firewall"] }
    }
    filter {
      if "apache" in [tags] {
        dissect { ... }
      } else if "firewall" in [tags] {
        grok { ... }
      }
    }
    output {
      if "apache" in [tags] {
        elasticsearch { ... }
      } else if "firewall" in [tags] {
        tcp { ... }
      }
    }

  15. [The single-pipeline configuration from the previous slide, repeated twice]

  16. Multiple Pipelines (v6.0)
    # first pipeline
    input {
      beats { port => 3444 tags => ["apache"] }
    }
    filter {
      dissect { ... }
    }
    output {
      elasticsearch { ... }
    }
    # second pipeline
    input {
      tcp { port => 4222 tags => ["firewall"] }
    }
    filter {
      grok { ... }
    }
    output {
      tcp { ... }
    }
    [Diagram: Beats -> dissect -> ES; TCP -> grok -> TCP]
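
As a rough Python analogy (not Logstash code; the stub filters and the `PIPELINES` table are invented for illustration), the change swaps one pipeline that branches on tags for two pipelines that each do one job:

```python
def dissect_stub(event):
    # Stand-in for the dissect filter.
    return {**event, "parsed_by": "dissect"}


def grok_stub(event):
    # Stand-in for the grok filter.
    return {**event, "parsed_by": "grok"}


def single_pipeline(event):
    """One pipeline: every event walks through the tag conditionals."""
    if "apache" in event["tags"]:
        return "elasticsearch", dissect_stub(event)
    if "firewall" in event["tags"]:
        return "tcp", grok_stub(event)
    return None, event


# Multiple pipelines: each input is wired to its own filter and output,
# so no tag checks are needed once an event is in the right pipeline.
PIPELINES = {
    "apache": (dissect_stub, "elasticsearch"),
    "firewall": (grok_stub, "tcp"),
}


def run_pipeline(pipeline_id, event):
    filter_fn, output = PIPELINES[pipeline_id]
    return output, filter_fn(event)
```

Both forms produce the same result for a given event; the second simply removes the branching and keeps the two flows independent of each other.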

  17. X-Pack Central Management (v6.0)
    [Diagram: six blocks of placeholder configuration text, in two slightly different variants]
    Three Logstash instances.
    Doing the same thing.
    Let’s simplify this.

  18. X-Pack Central Management (v6.0)
    [Diagram: four identical blocks of placeholder configuration text]
    One configuration source.

  19. WITH GREAT POWER
    COMES GREAT

  20. --experimental-java-execution (v6.1)
    Load Configuration
    filter {
      if "debug" in [tags] {
        drop { }
      }
      grok {
        match => { … }
      }
    }
    Transform to Java
    if event.getField("tags")... {
      drop.execute(event);
      if event.isCancelled() {
        return;
      }
    }
    grok.execute(event);
    Compile bytecode
    0: new #121
    3: dup
    4: iconst_2
    5: fconst_1
    6: invokespecial #122
    9: astore_1
    10: aload_1
    11: ldc #47
    13: aload_0
    14: getfield #6
    17: invokeinterface #52, 3

  21. --experimental-java-execution (v6.1)
    Ruby
    Load Configuration
    filter {
      if "debug" in [tags] {
        drop { }
      }
      grok {
        match => { … }
      }
    }
    Transform to Ruby
    if event.get("tags").include?(
      drop.execute(event)
      return if event.cancelled?
    end
    grok.execute(event)
    "Compile"
    code = config.compile()
    pipeline = eval(code)
    loop do
      batch = queue.pop()
      pipeline.execute(batch)
    end
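
Both execution slides follow the same shape: load the configuration, transform it into code, then run that code against each batch. A minimal Python sketch of that shape follows. The `CONFIG` structure, `compile_filters`, and the stub filters are assumptions for illustration and do not reflect Logstash internals; closures stand in for the generated Ruby or Java.

```python
from typing import Callable, List

Event = dict


def drop(event: Event) -> None:
    # Stand-in for the drop filter: mark the event as cancelled.
    event["cancelled"] = True


def grok_stub(event: Event) -> None:
    # Stand-in for grok; the real filter parses the message with patterns.
    event["parsed"] = True


# Hypothetical pre-parsed form of the filter section shown on the slide.
CONFIG = [
    {"if_tag": "debug", "then": [{"plugin": drop}]},
    {"plugin": grok_stub},
]


def compile_filters(config: List[dict]) -> Callable[[Event], None]:
    """Build one callable up front instead of walking the config per event."""
    steps: List[Callable[[Event], None]] = []
    for node in config:
        if "if_tag" in node:
            tag = node["if_tag"]
            body = compile_filters(node["then"])

            def conditional(event: Event, tag=tag, body=body) -> None:
                if tag in event.get("tags", []):
                    body(event)

            steps.append(conditional)
        else:
            steps.append(node["plugin"])

    def run(event: Event) -> None:
        for step in steps:
            step(event)
            if event.get("cancelled"):
                return  # same role as `return if event.cancelled?`

    return run
```

The compile step runs once per configuration load; after that, each event only pays for the calls in `run`. An event tagged `debug` is cancelled before the grok stand-in ever sees it.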

  22. Logstash 6.2: Protect credentials with the keystore
    logstash-keystore (Logstash 6.2)
    % logstash-keystore create
    % logstash-keystore add es_password
    # use es_password in the pipeline:
    output {
      elasticsearch {
        hosts => …
        user => "elastic"
        password => "${es_password}"
      }
    }
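
To illustrate what `${es_password}` substitution amounts to, here is a small Python sketch. It is not how Logstash resolves keystore entries (the real keystore is a file managed by `logstash-keystore`); the `resolve` helper and the in-memory `secrets` dict are assumptions for the example.

```python
import re

# Matches ${name} where name is made of letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve(text, secrets):
    """Replace each ${name} in `text` with the matching value from `secrets`."""

    def lookup(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"no secret named {name!r}")
        return secrets[name]

    return PLACEHOLDER.sub(lookup, text)
```

The point of the indirection is that the pipeline text only ever contains the name `es_password`, so the configuration can be shared or committed without exposing the value.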

  23. Upcoming Features

  24. Visual Pipeline Builder

  25. More Security Modules
    (SIEM, IDS, Firewalls, etc.)

  26. SNMP Poller
    (It’s not a trap!)

  27. Centralized Management
    Improvements

  28. How we do it today
    xpack.management.enabled: true
    xpack.management.pipeline.id: ["apache", "cloudwatch_logs"]
    [Diagram: Logstash nodes LS1, LS2, LS3; pipelines apache and cloudwatch logs]

  29. Where we Want to Go
    xpack.management.enabled: true
    xpack.management.pipeline.node_groups: ["webserver_logs", "security_logs"]
    [Diagram: Logstash nodes LS1, LS2, LS3; node groups webserver_logs and security_logs; pipelines apache logs, cloudwatch logs, nginx logs, azure activity logs]

  30. Pipeline Settings Will Also Be
    Supported!

  31. Ancillary Files in Config
    Management

  32. Without Config Management
    [Diagram: Config File, Grok Patterns, GeoIP Database]

  33. Config Management Today
    [Diagram: Config File, Grok Patterns, GeoIP Database]

  34. Config Management in the Future
    [Diagram: Config File, Grok Patterns, GeoIP Database]

  35. Java Plugin API

  36. Java Plugin API Goals
    • Allow people to use additional JVM languages to develop plugins
    • Enable performance optimization where required

  37. Java Plugin API Plan
    • We’re starting with a low-level API, optimized for performance
    • Will add sugar on top
    • We plan to support the current plugin API indefinitely
    • Some esoteric APIs, like flush, may go away

  38. Logstash -> Logstash:
    Making it a better story

  39. Currently, things are a little ugly

  40. How we do it today
    output {
      lumberjack { … }
    }
    input {
      beats { … }
    }
    ? ? ?

  41. Interpipeline Communications

  42. What it looks like
    - pipeline.id: senderone
      config.string: "input { generator { message => huhx } } output { internal { send_to => [foo] } }"
    - pipeline.id: sendertwo
      config.string: "input { generator { message => whutx } } output { internal { send_to => [foo] } }"
    - pipeline.id: out
      config.string: "input { internal { address => foo } } output { stdout { codec => json_lines } }"
    [Diagram: senderone and sendertwo both feed out]
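
The addressing model in this configuration can be mimicked with a tiny in-process router. The Python below is only a sketch of the concept: `AddressBus` and its methods are hypothetical names, and nothing here reflects how Logstash wires pipelines together internally.

```python
from collections import defaultdict


class AddressBus:
    """Toy in-process router: outputs send to addresses, inputs listen on them."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def listen(self, address, handler):
        # Roughly the role of `internal { address => foo }` on the input side.
        self._listeners[address].append(handler)

    def send(self, addresses, event):
        # Roughly the role of `internal { send_to => [foo] }` on the output side.
        for address in addresses:
            for handler in self._listeners[address]:
                handler(dict(event))  # each receiver gets its own copy


def demo():
    bus = AddressBus()
    received = []
    bus.listen("foo", received.append)       # pipeline "out"
    bus.send(["foo"], {"message": "huhx"})   # pipeline "senderone"
    bus.send(["foo"], {"message": "whutx"})  # pipeline "sendertwo"
    return received
```

Senders only need to know the address name, not which pipeline listens on it, which is what lets several pipelines fan in to one receiver.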

  43. Share a Port
    [Diagram: Beats Ingest + Routing; Enrich Weblogs + Output; Enrich ETL Logs + Output; Enrich Metrics + Output]

  44. Buffer as Needed
    [Diagram: Logs Ingest; two PQs; Output to Elasticsearch; Output to S3]

  45. Put it Together
    [Diagram: Beats Ingest and Route; three PQs; Enrich Weblogs; Enrich ETL Logs; Enrich Metrics + Output to Metrics ES Cluster; Output to Logging ES Cluster]

  46. Language Improvements on the Horizon

  47. The Language Can Move Forward
    • Is not null checks
    • Array/field reference conflation (is [foo] an array, or a field reference?)
    • What else does the lang need?
    • Can we experiment with new languages?

  48. An Ephemeral World
    Or, how I learned to stop worrying about my disk and
    embrace ephemeral storage

  49. An Ephemeral World
    Part 1: End to End ACKs

  50. Resiliency today isn’t as easy as
    it could be

  51. You need to think about disks if you use the PQ

  52. You need to be able to tolerate data loss if you use the in-memory queue

  53. What if we could give you the
    best of both worlds?

  54. End-to-End (E2E) ACKs solve
    this problem

  55. We currently buffer at each stage

  56. Each event must be persisted, then acknowledged

  57. With E2E ACKs, Nodes Can Fully Fail

  58. Where we can replay, or delay and asynchronously acknowledge, we can skip persistence

  59. Where we can’t replay, the PQ is still best
    • The TCP, UDP, and Syslog protocols offer no facility to replay
    • The best strategy here is to store ASAP on Logstash, and then try to get it elsewhere ASAP

  60. E2E ACK Summary
    • Currently in progress
    • Will likely require rewrites to input plugins to be efficient
    • Only works for things that are replayable
    • Luckily, a lot of things are replayable
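
The end-to-end acknowledgement idea can be sketched as follows. This is a conceptual Python toy under stated assumptions, not the in-progress Logstash design: `ReplayableSource` stands in for any input that can be re-read from an offset, and `run_once` stands in for a node that keeps nothing on disk.

```python
class ReplayableSource:
    """Stand-in for an input that can be re-read from a committed offset."""

    def __init__(self, records):
        self._records = list(records)
        self.committed = 0  # offset of the first record not yet acknowledged

    def read_from_committed(self):
        # Yield (offset, record) pairs for everything not yet acknowledged.
        return list(enumerate(self._records))[self.committed:]

    def commit(self, offset):
        self.committed = max(self.committed, offset + 1)


def run_once(source, deliver, fail_after=None):
    """Process without persisting; ack the source only after delivery succeeds."""
    for count, (offset, record) in enumerate(source.read_from_committed()):
        if fail_after is not None and count >= fail_after:
            return  # simulate the node dying; nothing after this was acked
        deliver(record)
        source.commit(offset)
```

If the node fails partway, a fresh node simply resumes from the last committed offset. A crash between `deliver` and `commit` would replay that one record, so this sketch gives at-least-once delivery rather than exactly-once.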

  61. An Ephemeral World
    Part 2: Distributed Plugin State/Execution

  62. Problem 1:
    Plugins store metadata on disk

  63. This all must be backed up

  64. Problem 2:
    Inputs cannot share work

  65. Today, Logstashes cannot coordinate their access to a service like S3
    Active
    Failover

  66. Why Local State is Irritating
    • Users must implement backups of metadata
    • Scaling past one box requires manual partitioning, or is not possible
    • Failover is tricky, and involves restoring backed up metadata

  67. Solution: Centralized
    State/Distributed Exec

  68. Local File Metadata Will Still Work!
    Active
    Failover

  69. But This is Way Cooler
    Indexing cluster
    Management Cluster
    Management Data

  70. How it Works
    • Plugin state kept in ES
    • Leader election through ES
    • Task assignment through ES
    • Still in design phase, we have a PoC in progress
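
Since the slide says the design is still open, the following is only one common way to do leader election on top of a shared store, sketched in Python. `SharedStore` is an in-memory stand-in for the shared cluster, and the lease document, `try_become_leader`, and the 30-second lease are all assumptions for illustration, not the proof of concept mentioned above.

```python
import threading


class SharedStore:
    """In-memory stand-in for a shared store that offers compare-and-set."""

    def __init__(self):
        self._lock = threading.Lock()
        self._docs = {}

    def get(self, key):
        with self._lock:
            return self._docs.get(key)

    def compare_and_set(self, key, expected, new):
        # Write `new` only if the stored value still equals `expected`.
        with self._lock:
            if self._docs.get(key) != expected:
                return False
            self._docs[key] = new
            return True


def try_become_leader(store, node_id, now, lease_seconds=30):
    """Claim or renew the lease if it is free, expired, or already ours."""
    current = store.get("leader")
    if current is None or current["expires"] <= now or current["node"] == node_id:
        lease = {"node": node_id, "expires": now + lease_seconds}
        return store.compare_and_set("leader", current, lease)
    return False
```

A node that wins keeps renewing before the lease expires; if it dies, the lease runs out and another node can take over without restoring any local metadata.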

  71. Problem 3:
    The DLQ is Local

  72. Solution: Create an ES DLQ

  73. We Need Your Help!
    • Pitch in on PRs
    • Report bugs
    • Review PRs