Slide 1

Slide 1 text

What’s the Latest in Logstash
Jordan Sissel and Andrew Cholakian (@jordansissel / @andrewvc)
Elastic, February 28, 2018

Slide 2

Slide 2 text

Thank you to our community
You complete the picture

Slide 3

Slide 3 text

The Logstash Plugins Community
In the last year, 1,413 Logstashers have helped us with 9,237 issues, comments, and pull requests to our logstash-plugins repository. 864 pull requests were opened.

Slide 4

Slide 4 text

The Logstash Core Community
In the last year, 76 Logstashers have helped us with 1,131 issues, comments, and pull requests to our logstash-plugins repository. 179 pull requests were opened.

Slide 5

Slide 5 text

Triaging in Two Parts

Slide 6

Slide 6 text

We couldn’t do it without you!

Slide 7

Slide 7 text

The Latest

Slide 8

Slide 8 text

Theme: Don’t lose data.
queue.type: persisted # (v5.4)
[Diagram: inputs -> queue -> filters -> outputs]
Icons made by Freepik from www.flaticon.com, licensed by CC 3.0 BY

Slide 9

Slide 9 text

WHAT IF WE CAN’T DELIVER SOMETHING? Theme: Don’t lose data.

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Simple things should be simple

Slide 12

Slide 12 text

bin/logstash --modules netflow

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

bin/logstash --modules arcsight

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

bin/logstash --modules

Slide 17

Slide 17 text

A Tale of Two* Pipelines
*(or more)

Slide 18

Slide 18 text

input {
  beats { port => 3444 tags => ["apache"] }
  tcp { port => 4222 tags => ["firewall"] }
}
filter {
  if "apache" in [tags] {
    dissect { ... }
  } else if "firewall" in [tags] {
    grok { ... }
  }
}
output {
  if "apache" in [tags] {
    elasticsearch { ... }
  } else if "firewall" in [tags] {
    tcp { ... }
  }
}
[Diagram labels: TCP, BEATS, DISSECT, GROK, TCP, ES]

Slide 19

Slide 19 text

[The single-pipeline configuration from slide 18, shown again twice.]

Slide 20

Slide 20 text

Multiple Pipelines (v6.0)
Pipeline 1:
input { beats { port => 3444 tags => ["apache"] } }
filter { dissect { ... } }
output { elasticsearch { ... } }
Pipeline 2:
input { tcp { port => 4222 tags => ["firewall"] } }
filter { grok { ... } }
output { tcp { ... } }
[Diagram: BEATS -> DISSECT -> ES; TCP -> GROK -> TCP]

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

X-Pack Central Management (v6.0)
Three Logstash instances. Doing the same thing. Let’s simplify this.

Slide 23

Slide 23 text

X-Pack Central Management (v6.0)
One configuration source.

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

WITH GREAT POWER COMES GREAT

Slide 27

Slide 27 text

--experimental-java-execution (v6.1)
Load Configuration:
filter {
  if "debug" in [tags] { drop { } }
  grok { match => { … } }
}
Transform to Java:
if event.getField("tags")... {
  drop.execute(event);
  if event.isCancelled() { return; }
}
grok.execute(event);
Compile bytecode:
0: new #121
3: dup
4: iconst_2
5: fconst_1
6: invokespecial #122
9: astore_1
10: aload_1
11: ldc #47
13: aload_0
14: getfield #6
17: invokeinterface #52, 3

Slide 28

Slide 28 text

--experimental-java-execution (v6.1)
Load Configuration:
filter {
  if "debug" in [tags] { drop { } }
  grok { match => { … } }
}
Transform to Ruby:
if event.get("tags").include?(
  drop.execute(event)
  return if event.cancelled?
end
grok.execute(event)
“Compile” Ruby:
code = config.compile()
pipeline = eval(code)
loop do
  batch = queue.pop()
  pipeline.execute(batch)
end
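The Ruby flow on this slide (generate source from the configuration, eval it once, then run it for every batch) can be sketched in a few lines. This is a toy illustration, not Logstash's actual compiler: the Event class, compile_config, and run_pipeline are hypothetical names, and the "generated" code is hard-coded rather than derived from a real configuration.

```ruby
# Toy event exposing only the methods the generated code below relies on.
class Event
  def initialize(fields)
    @fields = fields
    @cancelled = false
  end

  def get(name)
    @fields[name]
  end

  def cancel
    @cancelled = true
  end

  def cancelled?
    @cancelled
  end
end

# Hypothetical stand-in for config.compile(): returns Ruby source for a lambda
# that drops events tagged "debug" and collects every other event.
def compile_config
  <<~RUBY
    lambda do |event, processed|
      if (event.get("tags") || []).include?("debug")
        event.cancel
        return
      end
      processed << event
    end
  RUBY
end

# Evaluate the generated source once, then reuse the resulting lambda per batch.
def run_pipeline(batches)
  pipeline = eval(compile_config)
  processed = []
  batches.each do |batch|
    batch.each { |event| pipeline.call(event, processed) }
  end
  processed
end
```

The point of the sketch is the shape of the work: the configuration is turned into source text and evaluated once, and every event then pays only for calling the evaluated code.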

Slide 29

Slide 29 text

Logstash 6.2: Protect credentials with the keystore
logstash-keystore (Logstash 6.2)
% logstash-keystore create
% logstash-keystore add es_password
# use es_password in the pipeline:
output {
  elasticsearch {
    hosts => …
    user => "elastic"
    password => "${es_password}"
  }
}
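As a rough illustration of what the ${es_password} reference implies, the sketch below substitutes ${name} placeholders in a configuration string from a key/value store. It is an assumption-laden toy: resolve_secrets is a hypothetical helper, and a plain Hash stands in for the keystore, which in Logstash is a protected file rather than an in-memory map.

```ruby
# Replace every ${name} placeholder with the matching entry from the store.
# The Hash is a hypothetical stand-in for the real keystore.
def resolve_secrets(text, keystore)
  text.gsub(/\$\{(\w+)\}/) do
    key = Regexp.last_match(1)
    # Fail loudly instead of leaving an unresolved placeholder in the config.
    keystore.fetch(key) { raise KeyError, "no keystore entry for #{key}" }
  end
end
```

The secret never appears in the pipeline file itself; only the placeholder does, and it is resolved when the configuration is loaded.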

Slide 30

Slide 30 text

Upcoming Features

Slide 31

Slide 31 text

Quick Hits

Slide 32

Slide 32 text

Visual Pipeline Builder

Slide 33

Slide 33 text

More Security Modules (SIEM, IDS, Firewalls, etc.)

Slide 34

Slide 34 text

SNMP Poller (It’s not a trap!)

Slide 35

Slide 35 text

Centralized Management Improvements

Slide 36

Slide 36 text

Node Groups

Slide 37

Slide 37 text

How we do it today
xpack.management.enabled: true
xpack.management.pipeline.id: ["apache", "cloudwatch_logs"]
[Diagram labels: LS1, LS2, LS3; apache, cloudwatch logs]

Slide 38

Slide 38 text

Where we Want to Go
xpack.management.enabled: true
xpack.management.pipeline.node_groups: ["webserver_logs", "security_logs"]
[Diagram labels: LS1, LS2, LS3; apache logs, cloudwatch logs, webserver_logs, security_logs, nginx logs, azure activity logs]

Slide 39

Slide 39 text

Pipeline Settings Will Also Be Supported!

Slide 40

Slide 40 text

Ancillary Files in Config Management

Slide 41

Slide 41 text

Without Config Management
[Diagram labels: Config File, Grok Patterns, GeoIP Database]

Slide 42

Slide 42 text

Config Management Today
[Diagram labels: Config File, Grok Patterns, GeoIP Database]

Slide 43

Slide 43 text

Config Management in the Future
[Diagram labels: Config File, Grok Patterns, GeoIP Database]

Slide 44

Slide 44 text

Java Plugin API

Slide 45

Slide 45 text

Java Plugin API Goals
• Allow people to use additional JVM languages to develop plugins
• Enable performance optimization where required

Slide 46

Slide 46 text

Java Plugin API Plan
• We’re starting with a low level API, optimized for performance
• Will add sugar on top
• We plan to support the current plugin API indefinitely
• Some esoteric APIs, like flush, may go away

Slide 47

Slide 47 text

Logstash -> Logstash: Making it a better story

Slide 48

Slide 48 text

Currently, things are a little ugly

Slide 49

Slide 49 text

How we do it today
output { lumberjack { … } }
input { beats { … } }
? ? ?

Slide 50

Slide 50 text

Interpipeline Communications

Slide 51

Slide 51 text

What it looks like
- pipeline.id: senderone
  config.string: "input { generator { message => huhx } } output { internal { send_to => [foo] } }"
- pipeline.id: sendertwo
  config.string: "input { generator { message => whutx } } output { internal { send_to => [foo] } }"
- pipeline.id: out
  config.string: "input { internal { address => foo } } output { stdout { codec => json_lines } }"
[Diagram: senderone and sendertwo -> out]
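One way to picture the send_to / address pairing is an in-process bus keyed by address. The sketch below is only a mental model under that assumption, not how Logstash implements it: PipelineBus, listen, and send_to are hypothetical names, and each listening pipeline is reduced to a plain Ruby Queue.

```ruby
require "thread"

# Hypothetical in-process bus: each address maps to the inbox of the
# pipeline that declared an input listening on it.
class PipelineBus
  def initialize
    @listeners = {}
  end

  # Called by a receiving pipeline; returns the inbox for the address.
  def listen(address)
    @listeners[address] ||= Queue.new
  end

  # Called by a sending pipeline; each target address gets its own copy.
  def send_to(addresses, event)
    addresses.each do |address|
      inbox = @listeners.fetch(address) do
        raise ArgumentError, "no pipeline listening on #{address}"
      end
      inbox << event.dup
    end
  end
end
```

With this model, senderone and sendertwo both call send_to(["foo"], event) and the out pipeline pops events from the inbox returned by listen("foo").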

Slide 52

Slide 52 text

Share a Port
[Diagram labels: Beats Ingest + Routing; Enrich Weblogs + Output; Enrich ETL Logs + Output; Enrich Metrics + Output]

Slide 53

Slide 53 text

Buffer as Needed
[Diagram labels: Logs Ingest; PQ; PQ; Output to Elasticsearch; Output to S3]

Slide 54

Slide 54 text

Put it Together
[Diagram labels: Beats Ingest and Route; PQ; PQ; PQ; Enrich Weblogs; Enrich ETL Logs; Enrich Metrics + Output to Metrics ES Cluster; Output to Logging ES Cluster]

Slide 55

Slide 55 text

Language Improvements on the Horizon

Slide 56

Slide 56 text

The Language Can Move Forward
• “Is not null” checks
• Array/field reference conflation (is [foo] an array, or a field reference?)
• What else does the lang need?
• Can we experiment with new languages?

Slide 57

Slide 57 text

An Ephemeral World Or, how I learned to stop worrying about my disk and embrace ephemeral storage

Slide 58

Slide 58 text

An Ephemeral World Part 1: End to End ACKs

Slide 59

Slide 59 text

Resiliency today isn’t as easy as it could be

Slide 60

Slide 60 text

You need to think about disks if you use the PQ

Slide 61

Slide 61 text

You need to be able to tolerate data loss if you use the in-memory queue

Slide 62

Slide 62 text

What if we could give you the best of both worlds?

Slide 63

Slide 63 text

End-to-End (E2E) ACKs solve this problem

Slide 64

Slide 64 text

We currently buffer at each stage

Slide 65

Slide 65 text

Each event must be persisted, then acknowledged

Slide 66

Slide 66 text

With E2E ACKs, Nodes Can Fully Fail

Slide 67

Slide 67 text

Where we can replay, or delay and acknowledge asynchronously, we can skip persistence

Slide 68

Slide 68 text

Where we can’t replay, the PQ is still best
• The TCP, UDP, and Syslog protocols offer no facility to replay
• The best strategy here is to store ASAP on Logstash, and then try to get it elsewhere ASAP

Slide 69

Slide 69 text

E2E ACK Summary
• Currently in progress
• Will likely require rewrites to input plugins to be efficient
• Only works for things that are replayable
• Luckily, a lot of things are replayable
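The core idea of end-to-end ACKs can be sketched as follows: acknowledge upstream only after the output has accepted a batch, and let the source replay anything unacknowledged. Since the feature is described here as in progress, everything in this sketch is hypothetical, including ReplayableSource, drain, and the offset-based replay model.

```ruby
# Hypothetical replayable source: it remembers the last acknowledged offset
# and keeps serving the records after it until an ACK moves the offset.
class ReplayableSource
  attr_reader :acked_offset

  def initialize(records)
    @records = records
    @acked_offset = 0
  end

  def read_batch(size)
    @records[@acked_offset, size] || []
  end

  def ack(count)
    @acked_offset += count
  end
end

# Pull batches, hand each one to the output block, and ACK upstream only
# when the block succeeds. Nothing is persisted locally; a failed batch is
# simply read again on the next attempt.
def drain(source, batch_size, max_attempts)
  delivered = []
  max_attempts.times do
    batch = source.read_batch(batch_size)
    break if batch.empty?
    begin
      yield batch
      delivered.concat(batch)
      source.ack(batch.size)
    rescue StandardError
      next # no ACK was sent, so the same batch is replayed
    end
  end
  delivered
end
```

If the node running drain dies mid-batch, the source has not been acknowledged, so another node can pick up from the same offset. That is why replayable inputs could skip local persistence.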

Slide 70

Slide 70 text

An Ephemeral World Part 2: Distributed Plugin State/Execution

Slide 71

Slide 71 text

Problem 1: Plugins store metadata on disk

Slide 72

Slide 72 text

This all must be backed up

Slide 73

Slide 73 text

Problem 2: Inputs cannot share work

Slide 74

Slide 74 text

Today, Logstashes cannot coordinate their access to a service like S3
[Diagram labels: Active, Failover]

Slide 75

Slide 75 text

Why Local State is Irritating
• Users must implement backups of metadata
• Scaling past one box requires manual partitioning, or is not possible
• Failover is tricky, and involves restoring backed up metadata

Slide 76

Slide 76 text

Solution: Centralized State/Distributed Exec

Slide 77

Slide 77 text

Local File Metadata Will Still Work!
[Diagram labels: Active, Failover]

Slide 78

Slide 78 text

But This is Way Cooler
[Diagram labels: Indexing cluster, Management Cluster, Management Data]

Slide 79

Slide 79 text

How it Works
• Plugin state kept in ES
• Leader election through ES
• Task assignment through ES
• Still in design phase, we have a PoC in progress
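Leader election through a shared store is commonly built on a versioned, compare-and-set write to a lease document. The sketch below shows that general technique only; it is not the design referred to on this slide, which is still being worked out. StateStore is a hypothetical in-memory stand-in for an Elasticsearch index, and try_acquire_lease, the "leader" document id, and the lease fields are all assumptions.

```ruby
# Hypothetical in-memory stand-in for an index of versioned documents.
class StateStore
  def initialize
    @docs = {}
    @lock = Mutex.new
  end

  def get(id)
    @lock.synchronize { @docs[id] }
  end

  # Write only if the stored version equals expected_version
  # (nil means the document must not exist yet).
  def put_if_version(id, expected_version, body)
    @lock.synchronize do
      current = @docs[id]
      current_version = current && current[:version]
      return false unless current_version == expected_version
      @docs[id] = { version: (current_version || 0) + 1, body: body }
      true
    end
  end
end

# A node becomes leader if it creates the lease, renews its own lease,
# or takes over a lease that has expired.
def try_acquire_lease(store, node, now, ttl)
  lease = store.get("leader")
  new_body = { holder: node, expires_at: now + ttl }
  if lease.nil?
    store.put_if_version("leader", nil, new_body)
  elsif lease[:body][:expires_at] <= now || lease[:body][:holder] == node
    store.put_if_version("leader", lease[:version], new_body)
  else
    false
  end
end
```

Because every write is conditional on the version that was read, two nodes racing for an expired lease cannot both win; the loser's write is rejected and it stays a follower.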

Slide 80

Slide 80 text

Problem 3: The DLQ is Local

Slide 81

Slide 81 text

Solution: Create an ES DLQ

Slide 82

Slide 82 text

We Need Your Help!
• Pitch in on PRs
• Report bugs
• Review PRs

Slide 83

Slide 83 text

Thank You!