
Elastic{ON} 2018: Latest in Logstash

Elastic Co
March 01, 2018

Much has happened since 5.0: persistent queues, the pipeline viewer (x-ray vision, basically), the ability to run multiple pipelines at the same time for different use cases, and a move to the latest version of JRuby. It's all laying the foundation for even more goodness to come. See where the Logstash roadmap is headed and what to expect next.


Transcript

  1. Elastic
    February 28, 2018
    @jordansissel / @andrewvc
    What’s the Latest in Logstash
    Jordan Sissel and Andrew Cholakian


  2. Thank you to our community
    You complete the picture


  3. The Logstash Plugins Community
    In the last year, 1,413 Logstashers have helped us with 9,237
    issues, comments, and pull requests to our logstash-plugins
    repository. 864 pull requests were opened.

  4. The Logstash Core Community
    In the last year, 76 Logstashers have helped us with 1,131
    issues, comments, and pull requests to our logstash
    repository. 179 pull requests were opened.

  5. Triaging in Two Parts


  6. We couldn’t do it without you!


  7. The Latest


  8. queue.type: persisted # (v5.4)
    Theme: Don’t lose data.
    [Diagram: three inputs, a queue, three filters, two outputs]
    Icons made by Freepik from www.flaticon.com, licensed under CC 3.0 BY

  9. WHAT IF WE CAN’T DELIVER SOMETHING?
    Theme: Don’t lose data.


  11. Simple things should be simple


  12. bin/logstash --modules netflow



  14. bin/logstash --modules arcsight



  16. bin/logstash --modules


  17. A Tale of Two* Pipelines
    *(or more)

  18. input {
      beats { port => 3444 tags => ["apache"] }
      tcp { port => 4222 tags => ["firewall"] }
    }
    filter {
      if "apache" in [tags] {
        dissect { ... }
      } else if "firewall" in [tags] {
        grok { ... }
      }
    }
    output {
      if "apache" in [tags] {
        elasticsearch { ... }
      } else if "firewall" in [tags] {
        tcp { ... }
      }
    }
    [Diagram: TCP, BEATS; DISSECT, GROK; TCP, ES]

  19. input {
      beats { port => 3444 tags => ["apache"] }
      tcp { port => 4222 tags => ["firewall"] }
    }
    filter {
      if "apache" in [tags] {
        dissect { ... }
      } else if "firewall" in [tags] {
        grok { ... }
      }
    }
    output {
      if "apache" in [tags] {
        elasticsearch { ... }
      } else if "firewall" in [tags] {
        tcp { ... }
      }
    }

  20. Multiple Pipelines (v6.0)
    input {
      beats { port => 3444 tags => ["apache"] }
    }
    filter {
      dissect { ... }
    }
    output {
      elasticsearch { ... }
    }
    input {
      tcp { port => 4222 tags => ["firewall"] }
    }
    filter {
      grok { ... }
    }
    output {
      tcp { ... }
    }
    [Diagram: BEATS, DISSECT, ES; TCP, GROK, TCP]


  22. X-Pack Central Management (v6.0)
    [Diagram: repeated blocks of placeholder configuration text]
    Three Logstash instances.
    Doing the same thing.
    Let’s simplify this.

  23. X-Pack Central Management (v6.0)
    [Diagram: repeated blocks of placeholder configuration text]
    One configuration source.



  26. WITH GREAT POWER
    COMES GREAT


  27. --experimental-java-execution (v6.1)
    Load Configuration
    filter {
      if "debug" in [tags] {
        drop { }
      }
      grok {
        match => { … }
      }
    }
    Transform to Java
    if event.getField("tags")... {
      drop.execute(event);
      if event.isCancelled() {
        return;
      }
    }
    grok.execute(event);
    Compile bytecode
    0: new #121
    3: dup
    4: iconst_2
    5: fconst_1
    6: invokespecial #122
    9: astore_1
    10: aload_1
    11: ldc #47
    13: aload_0
    14: getfield #6
    17: invokeinterface #52, 3

  28. --experimental-java-execution (v6.1)
    Load Configuration
    filter {
      if "debug" in [tags] {
        drop { }
      }
      grok {
        match => { … }
      }
    }
    Transform to Ruby
    if event.get("tags").include?(
      drop.execute(event)
      return if event.cancelled?
    end
    grok.execute(event)
    “Compile” Ruby
    code = config.compile()
    pipeline = eval(code)
    loop do
      batch = queue.pop()
      pipeline.execute(batch)
    end
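The load, transform, and "compile" steps on slides 27 and 28 can be sketched in plain Ruby. This is a toy illustration, not Logstash's actual compiler: `Event`, `compile_config`, `run_pipeline`, and the hash-shaped config are hypothetical names invented for the example.

```ruby
# Toy event: a hash of fields plus a cancelled flag (hypothetical, not LogStash::Event).
class Event
  def initialize(fields)
    @fields = fields
    @cancelled = false
  end

  def get(name)
    @fields[name]
  end

  def set(name, value)
    @fields[name] = value
  end

  def cancel
    @cancelled = true
  end

  def cancelled?
    @cancelled
  end
end

# "Transform to Ruby": turn a tiny declarative config into Ruby source text.
# The config keys (:drop_tag, :add_field) are assumptions made for this sketch.
def compile_config(config)
  <<~RUBY
    lambda do |event|
      if Array(event.get("tags")).include?(#{config[:drop_tag].inspect})
        event.cancel
        next
      end
      event.set(#{config[:add_field].inspect}, true)
    end
  RUBY
end

# "Compile": eval the generated source once, then reuse the lambda for every batch.
def run_pipeline(config, batches)
  pipeline = eval(compile_config(config))
  batches.flat_map do |batch|
    batch.each { |event| pipeline.call(event) }
    batch.reject(&:cancelled?)
  end
end

config = { drop_tag: "debug", add_field: "parsed" }
batch = [Event.new("tags" => ["debug"]), Event.new("tags" => ["apache"])]
puts run_pipeline(config, [batch]).length
```

Evaluating the generated source once and reusing the result for each batch is the point of the "compile" step. Slide 27 shows the same idea with generated Java and bytecode in place of Ruby source.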

  29. Logstash 6.2: Protect credentials with the keystore
    logstash-keystore (Logstash 6.2)
    % logstash-keystore create
    % logstash-keystore add es_password
    # use es_password in the pipeline:
    output {
      elasticsearch {
        hosts => …
        user => "elastic"
        password => "${es_password}"
      }
    }
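The `${es_password}` reference is resolved when the configuration is loaded. Below is a minimal Ruby sketch of that kind of substitution, assuming a plain hash stands in for the keystore. `resolve_secrets` is a hypothetical helper, not Logstash's implementation.

```ruby
# Replace ${name} references in a config string with values from a secret store.
# `secrets` is a plain Hash standing in for the keystore (an assumption of this sketch).
def resolve_secrets(config_text, secrets)
  config_text.gsub(/\$\{(\w+)\}/) do
    key = Regexp.last_match(1)
    # Fail loudly rather than hand an unresolved placeholder to a plugin.
    secrets.fetch(key) { raise KeyError, "no secret named #{key}" }
  end
end

config = 'password => "${es_password}"'
puts resolve_secrets(config, { "es_password" => "s3cr3t" })
```

The pipeline file only ever contains the reference, so it can be shared or committed without exposing the password itself.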

  30. Upcoming Features


  31. Quick Hits


  32. Visual Pipeline Builder


  33. More Security Modules
    (SIEM, IDS, Firewalls, etc.)


  34. SNMP Poller
    (It’s not a trap!)


  35. Centralized Management Improvements

  36. Node Groups


  37. How we do it today
    xpack.management.enabled: true
    xpack.management.pipeline.id: ["apache", "cloudwatch_logs"]
    [Diagram: LS1, LS2, LS3; pipelines: apache, cloudwatch logs]

  38. Where we Want to Go
    xpack.management.enabled: true
    xpack.management.pipeline.node_groups: ["webserver_logs", "security_logs"]
    [Diagram: LS1, LS2, LS3; node groups: webserver_logs, security_logs; pipelines: apache logs, cloudwatch logs, nginx logs, azure activity logs]

  39. Pipeline Settings Will Also Be Supported!

  40. Ancillary Files in Config Management

  41. Without Config Management
    [Diagram: Config File, Grok Patterns, GeoIP Database]

  42. Config Management Today
    [Diagram: Config File, Grok Patterns, GeoIP Database]

  43. Config Management in the Future
    [Diagram: Config File, Grok Patterns, GeoIP Database]

  44. Java Plugin API


  45. Java Plugin API Goals
    • Allow people to use additional JVM languages to develop plugins
    • Enable performance optimization where required

  46. Java Plugin API Plan
    • We’re starting with a low-level API, optimized for performance
    • Will add sugar on top
    • We plan to support the current plugin API indefinitely
    • Some esoteric APIs, like flush, may go away

  47. Logstash -> Logstash:
    Making it a better story


  48. Currently, things are a little ugly


  49. How we do it today
    output {
      lumberjack { … }
    }
    input {
      beats { … }
    }
    ? ? ?

  50. Interpipeline Communications


  51. What it looks like
    - pipeline.id: senderone
      config.string: "input { generator { message => huhx } } output { internal { send_to => [foo] } }"
    - pipeline.id: sendertwo
      config.string: "input { generator { message => whutx } } output { internal { send_to => [foo] } }"
    - pipeline.id: out
      config.string: "input { internal { address => foo } } output { stdout { codec => json_lines } }"
    [Diagram: senderone, sendertwo, out]
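A rough sketch of address-based delivery between pipelines follows. It is an in-process model with hypothetical names (`AddressBus`, `listen`, `send_to`). The `internal` plugin on the slide was still a proposal, so its real behavior may differ.

```ruby
# In-memory registry mapping an address to the queues of pipelines listening on it.
class AddressBus
  def initialize
    @listeners = Hash.new { |hash, address| hash[address] = [] }
  end

  # An "input { internal { address => foo } }" would register here.
  def listen(address)
    queue = Queue.new
    @listeners[address] << queue
    queue
  end

  # An "output { internal { send_to => [foo] } }" would call this for each event.
  def send_to(addresses, event)
    addresses.each do |address|
      # Every listener gets its own copy so pipelines cannot mutate each other's events.
      @listeners[address].each { |queue| queue << event.dup }
    end
  end
end

bus = AddressBus.new
out_queue = bus.listen("foo")

bus.send_to(["foo"], { "message" => "huhx", "pipeline" => "senderone" })
bus.send_to(["foo"], { "message" => "whutx", "pipeline" => "sendertwo" })

puts out_queue.pop["message"]
puts out_queue.pop["message"]
```

Two senders feed one receiver here, mirroring the three pipelines in the slide's configuration.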

  52. Share a Port
    [Diagram: Beats Ingest + Routing; Enrich Weblogs + Output; Enrich ETL Logs + Output; Enrich Metrics + Output]

  53. Buffer as Needed
    [Diagram: Logs Ingest; two PQs; Output to Elasticsearch; Output to S3]

  54. Put it Together
    [Diagram: Beats Ingest and Route; three PQs; Enrich Weblogs; Enrich ETL Logs; Enrich Metrics + Output to Metrics ES Cluster; Output to Logging ES Cluster]

  55. Language Improvements on the Horizon

  56. The Language Can Move Forward
    • “Is not null” checks
    • Array/field reference conflation (is [foo] an array, or a field reference?)
    • What else does the lang need?
    • Can we experiment with new languages?

  57. An Ephemeral World
    Or, how I learned to stop worrying about my disk and
    embrace ephemeral storage


  58. An Ephemeral World
    Part 1: End to End ACKs


  59. Resiliency today isn’t as easy as it could be

  60. You need to think about disks if you use the PQ

  61. You need to be able to tolerate data loss if you use the in-memory queue

  62. What if we could give you the best of both worlds?

  63. End-to-End (E2E) ACKs solve this problem

  64. We currently buffer at each stage


  65. Each event must be persisted, then acknowledged


  66. With E2E ACKs, Nodes Can Fully Fail


  67. Where we can replay, or delay and asynchronously
    acknowledge, we can skip persistence

  68. Where we can’t replay, the PQ is still best
    • The TCP, UDP, and Syslog protocols offer no facility to replay
    • The best strategy here is to store ASAP on Logstash, and then try to get it elsewhere ASAP

  69. E2E ACK Summary
    • Currently in progress
    • Will likely require rewrites to input plugins to be efficient
    • Only works for things that are replayable
    • Luckily, a lot of things are replayable
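The end-to-end ACK idea from the preceding slides can be sketched as follows. For a replayable source, the read position is committed only after the output acknowledges the batch, so a failure before the ACK leads to a replay rather than a loss. `ReplayableSource` and `run_once` are hypothetical names; this is not Logstash code.

```ruby
# A source that can be re-read from its last committed offset (like a file or a log topic).
class ReplayableSource
  attr_reader :committed

  def initialize(records)
    @records = records
    @committed = 0 # index of the first record not yet acknowledged
  end

  # Read without committing; the batch stays replayable until it is acked.
  def read(max)
    @records[@committed, max] || []
  end

  # Called only after the output confirms delivery.
  def ack(count)
    @committed += count
  end
end

# Process one batch with no local persistence: commit only on successful delivery.
# The block plays the role of the output and returns true when delivery succeeded.
def run_once(source, batch_size)
  batch = source.read(batch_size)
  return false if batch.empty?

  delivered = yield(batch)
  source.ack(batch.length) if delivered
  delivered
end

source = ReplayableSource.new(%w[a b c])
run_once(source, 2) { |_batch| false } # output failed: nothing is committed
run_once(source, 2) { |batch| puts batch.join(","); true } # same batch is replayed
puts source.committed
```

Nothing is written to local disk in this model. Durability comes from the source's ability to serve the same records again, which is why slide 68 still recommends the PQ for inputs that cannot replay.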

  70. An Ephemeral World
    Part 2: Distributed Plugin State/Execution


  71. Problem 1:
    Plugins store metadata on disk


  72. This all must be backed up


  73. Problem 2:
    Inputs cannot share work


  74. Today, Logstashes cannot coordinate their access to a service like S3
    [Diagram: Active, Failover]

  75. Why Local State is Irritating
    • Users must implement backups of metadata
    • Scaling past one box requires manual partitioning, or is not possible
    • Failover is tricky, and involves restoring backed up metadata

  76. Solution: Centralized State/Distributed Exec

  77. Local File Metadata Will Still Work!
    [Diagram: Active, Failover]

  78. But This is Way Cooler
    [Diagram: Indexing cluster, Management Cluster, Management Data]

  79. How it Works
    • Plugin state kept in ES
    • Leader election through ES
    • Task assignment through ES
    • Still in design phase, we have a PoC in progress
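Since the design was still in progress, the following is only a guess at the general shape: leader election built on an atomic acquire-or-renew write with a lease. An in-memory `SharedStore` stands in for Elasticsearch, and every name here is hypothetical.

```ruby
# Stand-in for a shared store that offers an atomic "take the lease if free or expired".
class SharedStore
  def initialize
    @lock = Mutex.new
    @leases = {}
  end

  # Returns true if `node` holds the lease for `key` after this call.
  # `now` and `ttl` are plain numbers (seconds) so the example stays deterministic.
  def acquire(key, node, now, ttl)
    @lock.synchronize do
      lease = @leases[key]
      if lease.nil? || lease[:expires_at] <= now || lease[:owner] == node
        @leases[key] = { owner: node, expires_at: now + ttl }
        true
      else
        false
      end
    end
  end
end

store = SharedStore.new
puts store.acquire("s3-input", "LS1", 0, 30)  # LS1 becomes the active node
puts store.acquire("s3-input", "LS2", 10, 30) # LS2 stays on standby
puts store.acquire("s3-input", "LS2", 45, 30) # LS1's lease lapsed, so LS2 takes over
```

Because the lease lives in the shared store rather than on a node's disk, a failover node has nothing to restore from backup before it takes over, which addresses the complaints on slide 75.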

  80. Problem 3:
    The DLQ is Local


  81. Solution: Create an ES DLQ


  82. We Need Your Help!
    • Pitch in on PRs
    • Report bugs
    • Review PRs

  83. Thank You!
