Slide 1

Slide 1 text

Dive Deep with Logstash: From Pipelines to Persistent Queues
Colin Surprenant, Software Engineer (colinsurprenant)
Andrew Cholakian, Software Engineer (andrewvc)
Feb 2016

Slide 2

Slide 2 text

Agenda
1 Logstash quick intro/overview
2 The (Old) Life of an Event
3 Moving the Old Pipeline Into the Future
4 Java Event
5 Persistence

Slide 3

Slide 3 text

Collect -> Parse / Transform -> Store

Slide 4

Slide 4 text

Input Plugins, Filter Plugins, Output Plugins: ~200 plugins
Examples: date, advisor, alter, anonymize, checksum, cidr, cipher, clone, collate, csv, dns, drop, elapsed, elasticsearch, environment, extractnumbers, fingerprint, gelfify, geoip, useragent, grep, grok, grokdiscovery, i18n, json, json_encode, kv, metaevent, metrics, multiline, mutate, noop

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Definitions
Pipeline
• 3-stage processing
• Orchestrates data flow
• Manages queuing
• Manages plugin lifecycle
Event
• Internal data representation
• Raw input data turned into an Event at input
• Event mutated across filters
• The main API in the config language
• The main API in plugins

Slide 7

Slide 7 text

Codecs
Inputs and outputs handle the transport; codecs handle the data format.
input: data -> decode -> event
output: event -> encode -> data

Slide 8

Slide 8 text

Example Configuration - Codecs
input {
  file { codec => line }
}
filter { … }
output {
  file { … codec => json }
}

Slide 9

Slide 9 text

Codecs - Input Decoding

Slide 10

Slide 10 text

Codecs - Output Encoding
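The decode and encode roles described on the codec slides can be sketched in plain Ruby. This is a minimal illustration, not the Logstash codec API: the class names `LineCodecSketch` and `JsonCodecSketch` are hypothetical, and an event is modeled as a plain Hash.

```ruby
require "json"

# Hypothetical stand-ins for illustration; these are not the real Logstash codec classes.
class LineCodecSketch
  # Decode: turn raw input data into one event (a Hash here) per line.
  def decode(data)
    data.each_line.map { |line| { "message" => line.chomp } }
  end
end

class JsonCodecSketch
  # Encode: serialize one event into the output data format.
  def encode(event)
    JSON.generate(event)
  end
end

raw = "first line\nsecond line\n"
events = LineCodecSketch.new.decode(raw)
encoded = events.map { |event| JsonCodecSketch.new.encode(event) }
puts encoded
```

The input side only knows how to fetch bytes and the output side only knows how to ship them; the codec is the one place where the data format is interpreted.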

Slide 11

Slide 11 text

Agenda
1 Logstash quick intro/overview
2 The (Old) Life of an Event
3 Moving the Old Pipeline Into the Future
4 Java Event
5 Persistence

Slide 12

Slide 12 text

The Old (Logstash <= 2.1) Pipeline
One event at a time with buffered queues
Input + Codec (x2) -> Sized Queue (20) -> Filter Worker 1, Filter Worker 2, ... Filter Worker (n) -> Sized Queue (20) -> Output A Worker 1, Output A Worker 2, Output B Worker 1
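The shape of the old pipeline can be approximated with Ruby threads and two bounded queues. This is a sketch under assumptions, not the Logstash 2.1 source: the method name, the `:shutdown` signal, and the stand-in filter are all hypothetical; only the queue capacity of 20 comes from the slide.

```ruby
require "thread"

# Sketch only: threads and SizedQueue standing in for the three stages.
def run_old_pipeline(lines, filter_workers: 3)
  input_to_filter  = SizedQueue.new(20) # capacity shown on the slide
  filter_to_output = SizedQueue.new(20)
  delivered = Queue.new

  input = Thread.new do
    # push blocks whenever the queue already holds 20 events
    lines.each { |line| input_to_filter.push({ "message" => line }) }
    filter_workers.times { input_to_filter.push(:shutdown) } # hypothetical stop signal
  end

  filters = Array.new(filter_workers) do
    Thread.new do
      # One event at a time: pop, filter, push to the next queue.
      while (event = input_to_filter.pop) != :shutdown
        event["length"] = event["message"].length # stand-in for a filter plugin
        filter_to_output.push(event)
      end
    end
  end

  output = Thread.new do
    while (event = filter_to_output.pop) != :shutdown
      delivered.push(event) # stand-in for an output plugin
    end
  end

  input.join
  filters.each(&:join)
  filter_to_output.push(:shutdown)
  output.join
  Array.new(delivered.size) { delivered.pop }
end

processed = run_old_pipeline(%w[a bb ccc])
puts processed.size
```

Every event crosses two queues and is handed between threads twice, which is the cost the later slides set out to remove.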

Slide 20

Slide 20 text

Agenda
1 Logstash quick intro/overview
2 The (Old) Life of an Event
3 Moving the Old Pipeline Into the Future
4 Java Event
5 Persistence

Slide 21

Slide 21 text

How can we recover data in the event of a full crash?

Slide 22

Slide 22 text

Can we achieve “at least once” message delivery?

Slide 23

Slide 23 text

Can we do these things without sacrificing performance?

Slide 24

Slide 24 text

The Simplest Thing we Could Do
Make Both Queues Durable
Input + Codec (x2) -> Sized Queue (20) -> Filter Worker 1, Filter Worker 2, ... Filter Worker (n) -> Sized Queue (20) -> Output A Worker 1, Output A Worker 2, Output B Worker 1

Slide 25

Slide 25 text

The Simplest Thing we Could Do
Make Both Queues Durable
Input + Codec (x2) -> Persistent Sized Queue (20) -> Filter Worker 1, Filter Worker 2, ... Filter Worker (n) -> Persistent Sized Queue (20) -> Output A Worker 1, Output A Worker 2, Output B Worker 1

Slide 26

Slide 26 text

One Durable Queue, One In-Memory
Make the First Queue Durable
Input + Codec (x2) -> Persistent Sized Queue (20) -> Filter Worker 1, Filter Worker 2, ... Filter Worker (n) -> Sized Queue (20) -> Output A Worker 1, Output A Worker 2, Output B Worker 1

Slide 27

Slide 27 text

One Durable Queue
Input + Codec (x2) -> Persistent Sized Queue (20) -> 3 x (Filters -> Outputs)

Slide 28

Slide 28 text

One Durable Queue + Batcher
Input + Codec (x2) -> Persistent Sized Queue (20) -> 3 x (Batcher -> Filters -> Outputs)
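The batcher step can be illustrated with a small helper that drains a queue into micro-batches. This is a sketch, not the Logstash batcher: `take_batch` is a hypothetical name, and the policy (block for the first event, then take whatever is immediately available) is an assumption chosen to keep the example short.

```ruby
require "thread"

# Hypothetical batcher: returns between 1 and batch_size events.
def take_batch(queue, batch_size)
  batch = [queue.pop] # blocking pop: wait until at least one event exists
  while batch.size < batch_size
    begin
      batch << queue.pop(true) # non-blocking pop raises ThreadError when the queue is empty
    rescue ThreadError
      break
    end
  end
  batch
end

queue = SizedQueue.new(20)
7.times { |i| queue.push({ "message" => "line #{i}" }) }

first  = take_batch(queue, 5)
second = take_batch(queue, 5)
puts first.size
puts second.size
```

Filters and outputs then operate on the whole batch, so the per-event queue handoff is paid once per batch instead of once per event.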

Slide 29

Slide 29 text

The (New) Life of an Event
Logstash 2.2+

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

AFile.Log (Line Data - 1, Line Data - 2, Line Data - 3, Line Data - 4) -> File Input -> JSON Codec -> Synchronous Queue -> Worker Threads
Queue “ACK” for persistence

Slide 32

Slide 32 text

AFile.Log (Line Data - 1, Line Data - 2, Line Data - 3, Line Data - 4) -> File Input -> JSON Codec -> Synchronous Queue -> Worker Threads
Queue “ACK” for persistence
Line Data - 1 is read by the File Input and decoded by the JSON Codec into Event / Line 1

Slide 33

Slide 33 text

JSON Codec -> Synchronous Queue -> Worker Threads: 3 x (Batcher -> Filters -> Outputs)
Queue “ACK” for persistence
Event / Line 1 is in the queue. Where to?

Slide 34

Slide 34 text

JSON Codec -> Synchronous Queue -> Worker Threads: 3 x (Batcher -> Filters -> Outputs)
Queue “ACK” for persistence
Event / Line 1 is handed to one of the worker threads

Slide 35

Slide 35 text

JSON Codec -> Synchronous Queue -> Worker Threads: 3 x (Batcher -> Filters -> Outputs)
Queue “ACK” for persistence

Slide 37

Slide 37 text

Where are my threads?

Slide 38

Slide 38 text

The Old (Logstash <= 2.1) Pipeline
One event at a time with buffered queues
Input + Codec (x2) -> Sized Queue (20) -> Filter Worker 1, Filter Worker 2, ... Filter Worker (n) -> Sized Queue (20) -> Output A Worker 1, Output A Worker 2, Output B Worker 1
Threads: 2 x Input Thread, 3 x Filter Worker Thread, Output Processing Thread, Output Delegating Thread, 3 x Output Worker Thread

Slide 39

Slide 39 text

A Simpler Threading Story
Input + Codec (x2) -> Persistent Sized Queue (20) -> 3 x (Batcher -> Filters -> Outputs)
Threads: 2 x Input Thread, 3 x Pipeline Worker Thread
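The simpler threading model, where one pipeline worker thread batches, filters, and outputs on its own, might look like the following. It is a sketch under assumptions: the method name, the `:shutdown` signal, and the default worker count and batch size are hypothetical, and a plain in-memory `SizedQueue` stands in for the persistent queue.

```ruby
require "thread"

# Sketch only: each pipeline worker thread runs batcher -> filters -> outputs itself.
def run_worker_pipeline(lines, workers: 3, batch_size: 4)
  queue = SizedQueue.new(20)
  flushed = Queue.new # collects every batch a worker "outputs"

  input = Thread.new do
    lines.each { |line| queue.push({ "message" => line }) }
    workers.times { queue.push(:shutdown) } # hypothetical stop signal, one per worker
  end

  threads = Array.new(workers) do
    Thread.new do
      done = false
      until done
        batch = []
        while batch.size < batch_size
          event = queue.pop
          if event == :shutdown
            done = true
            break
          end
          batch << event
        end
        next if batch.empty?
        batch.each { |e| e["length"] = e["message"].length } # filters, applied to the batch
        flushed.push(batch)                                  # outputs receive the whole batch
      end
    end
  end

  input.join
  threads.each(&:join)
  Array.new(flushed.size) { flushed.pop }
end

batches = run_worker_pipeline(%w[a bb ccc dddd eeeee])
puts batches.map(&:size).inspect
```

There is no second queue and no separate output thread: an event that leaves the queue stays with one thread until it has been output.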

Slide 40

Slide 40 text

NG Pipeline Performance
Logstash 2.2+

Slide 41

Slide 41 text

Apache Parser, no IO

Slide 42

Slide 42 text

Apache Pipeline (no IO) Overview
input { stdin {} }
filter {
  grok { … }
  geoip { … }
  useragent { … }
  date { … }
}
output { codec => dots }
Full Config @ https://gist.github.com/andrewvc/a5708783166e01d904ef

Slide 43

Slide 43 text

User Execution Time for Apache Parser
Parsing Apache common log format with Geo-IP, Date, and UserAgent filters
[Bar chart: user execution time (lower is better), Logstash 2.2.0 (NG) vs. Logstash 2.1.2]

Slide 44

Slide 44 text

System Execution Time for Apache Parser
Parsing Apache common log format with Geo-IP, Date, and UserAgent filters
[Bar chart: system execution time (lower is better), Logstash 2.2.0 (NG) vs. Logstash 2.1.2]

Slide 45

Slide 45 text

Wall Clock Execution Time
Parsing Apache common log format with Geo-IP, Date, and UserAgent filters
[Bar chart: wall clock execution time in seconds (lower is better), Logstash 2.2.0 (NG) vs. Logstash 2.1.2]

Slide 46

Slide 46 text

Performance Summary
-26% -30% -13%
Wall Time: This is the total processing time as measured by the clock on the wall.
User Time: Time spent executing userspace code.
System Time: Time spent in kernel code, including resolving lock contention.

Slide 47

Slide 47 text

Apache Parser, With IO

Slide 48

Slide 48 text

Apache Pipeline (with IO) Overview
input { file { … } }
filter {
  grok { … }
  geoip { … }
  useragent { … }
  date { … }
}
output { elasticsearch { … } }
Full test info @ https://github.com/elastic/logstash/pull/4340#issuecomment-164062362

Slide 49

Slide 49 text

Event Throughput / Time
28.67% speedup on new pipeline comparing best case - worst case
[Bar chart: events per second (larger is better), Logstash 2.2.0 (NG) vs. Logstash 2.1.2]

Slide 50

Slide 50 text

Performance Tips
• TEST EVERY CHANGE
• Tune worker count with -w. More IO = more workers!
• Tune batch size with -b. Bigger batches are not always better!
• Batch size of the pipeline is the new max batch size for output plugins
• Monitor GC activity for memory pressure!

Slide 51

Slide 51 text

Agenda
1 Logstash quick intro/overview
2 The (Old) Life of an Event
3 Moving the Old Pipeline Into the Future
4 Java Event
5 Persistence

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

Event Object
Simplified Object Composition
Event: Accessors, Timestamp
1 Field reference handling
  - Config API
  - Plugin API
2 Date/Time normalization
3 Notable functions:
  - sprintf()
  - to_json() & from_json()

Slide 54

Slide 54 text

Event object
1 API for logstash config syntax
2 API for plugins development

Slide 55

Slide 55 text

Event Object
Logstash Config Accessors
filter {
  if [type] == "syslog" {
    mutate {
      add_field => ["[times][created_at]", "%{syslog_timestamp}"]
      add_field => ["[times][received_at]", "%{@timestamp}"]
    }
  }
}
1 Field reference in conditional expression
2 Nested field reference

Slide 56

Slide 56 text

Event Object
Logstash Config sprintf()
filter {
  if [type] == "syslog" {
    mutate {
      add_field => ["[times][created_at]", "%{syslog_timestamp}"]
      add_field => ["[times][received_at]", "%{@timestamp}"]
    }
  }
}
1 sprintf format string - refer to field values from within strings
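How nested field references and sprintf format strings resolve against an event can be sketched with a plain Hash. This is an illustration only: `parse_field_reference`, `get_field`, `set_field`, and `sprintf_sketch` are hypothetical helpers, and the real Event accessors handle more cases (arrays, timestamps, and so on). Leaving an unknown `%{field}` untouched is an assumption of this sketch.

```ruby
# Hypothetical helpers for illustration; not the Logstash Event implementation.

# Turn "[times][created_at]" into ["times", "created_at"]; a bare name is a top-level field.
def parse_field_reference(reference)
  parts = reference.scan(/\[([^\[\]]+)\]/).flatten
  parts.empty? ? [reference] : parts
end

# Walk the nested Hash one key at a time; return nil if any level is missing.
def get_field(data, reference)
  parse_field_reference(reference).reduce(data) do |node, key|
    node.is_a?(Hash) ? node[key] : nil
  end
end

# Create intermediate Hashes as needed, then set the last key.
def set_field(data, reference, value)
  *parents, last = parse_field_reference(reference)
  target = parents.reduce(data) { |node, key| node[key] ||= {} }
  target[last] = value
end

# Replace each %{field} with the field's value; unknown fields are left as written.
def sprintf_sketch(data, format)
  format.gsub(/%\{([^}]+)\}/) do |placeholder|
    value = get_field(data, placeholder[2..-2])
    value.nil? ? placeholder : value.to_s
  end
end

event = { "type" => "syslog", "syslog_timestamp" => "Feb 17 10:00:00" }
set_field(event, "[times][created_at]", sprintf_sketch(event, "%{syslog_timestamp}"))
puts get_field(event, "[times][created_at]")
```

The same reference syntax serves both the config language and plugin code, which is why the Event sits at the center of both APIs.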

Slide 57

Slide 57 text

Event object
1 API for logstash config syntax
2 API for plugins development

Slide 58

Slide 58 text

Event Object
Ruby plugin API
event[@target] = value
if event["[deep][field]"] == value
  event.tag("sometag")
end
event["[deep][field]"] = event.sprintf(format)
event.timestamp = LogStash::Timestamp.new
json = event.to_json
e = Event.from_json(s)
1 field reference getters & setters
2 tag setter
3 sprintf function
4 timestamp getter & setter
5 json serialization/deserialization

Slide 59

Slide 59 text

Why Java?

Slide 60

Slide 60 text

Because

Slide 61

Slide 61 text

Java Event Performance - config #1
1 Dec 16: Pipeline-TNG merge, 70% increase
2 Jan 27: Java Event merge, 60% increase

Slide 62

Slide 62 text

Lies, damned lies and benchmarks

Slide 63

Slide 63 text

Java Event Performance - config #2
1 Dec 16: Pipeline-TNG merge, 70% increase
2 Jan 27: Java Event merge, 90% decrease
3 Jan 31: Java Event fix merge, 50% increase

Slide 64

Slide 64 text

Why Java?
Java API
• Paves the way for native Java/Scala/Clojure/Groovy plugins
• Share plugins with ES Ingest Node
Faster Serialization
• Java Serializable interface
• Pure inner Java data structures
Faster Persistence
• Leverage Faster Serialization
• Direct access to Java NIO + Memory Mapping

Slide 65

Slide 65 text

Java Event
100% Ruby plugins compatibility
Components: Event, Accessors, Timestamp, JRuby API Proxy
1 explicit control over Ruby/Java type conversions
2 Pure Java internal objects representation

Slide 66

Slide 66 text

Agenda
1 Logstash quick intro/overview
2 The (Old) Life of an Event
3 Moving the Old Pipeline Into the Future
4 Java Event
5 Persistence

Slide 67

Slide 67 text

Persistence
1 Reliability
2 Backpressure

Slide 68

Slide 68 text

Reliability: legacy pipeline
1 merged filter + output stages
2 single intermediate queue

Slide 69

Slide 69 text

Reliability: legacy pipeline
Output Stage
1 look ma - another queue
2 bulk requests batching 1000s of items

Slide 70

Slide 70 text

The Road to Reliability
Steps: Java Event; Filter & Output Merged; Micro Batching & Acknowledgement
Two of the three steps are marked done (✓ ✓)

Slide 71

Slide 71 text

The Road to Reliability
input -> Persistent Queue (batch N, batch N+1, batch N+2) -> filter + output
filter + output takes batch N and sends ACK N back to the queue
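The batch and ACK exchange can be modeled with a small in-memory class. This is a sketch of the idea, not the Logstash persistent queue: `AckQueueSketch` and its methods are hypothetical, nothing is written to disk, and `recover` simply simulates what a restart would do with batches that were never acknowledged.

```ruby
# Hypothetical in-memory stand-in for a persistent queue with acknowledgements.
class AckQueueSketch
  def initialize
    @pending = []       # written, not yet handed to a worker
    @in_flight = {}     # batch id => events handed out but not yet acknowledged
    @next_batch_id = 0
  end

  def write(event)
    @pending << event
  end

  # Hand out up to batch_size events; they stay tracked until acked.
  def read_batch(batch_size)
    events = @pending.shift(batch_size)
    return nil if events.empty?
    id = @next_batch_id
    @next_batch_id += 1
    @in_flight[id] = events
    [id, events]
  end

  # ACK N: batch N went through filters and outputs, so it can be dropped.
  def ack(batch_id)
    @in_flight.delete(batch_id)
  end

  # Simulated restart: unacknowledged batches go back to the front of the queue.
  def recover
    @pending = @in_flight.values.flatten(1) + @pending
    @in_flight.clear
  end

  def unacked_count
    @in_flight.values.sum(&:size)
  end
end

queue = AckQueueSketch.new
6.times { |i| queue.write("event #{i}") }
id_a, _batch_a = queue.read_batch(2)
queue.read_batch(2)  # handed out, never acknowledged
queue.ack(id_a)      # first batch is done
queue.recover        # simulate a crash and restart
_id, replayed = queue.read_batch(2)
puts replayed.inspect
```

Because an unacknowledged batch is replayed in full, an output may see the same event twice after a crash, which is what “at least once” delivery means in practice.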

Slide 72

Slide 72 text

Persistence
1 Reliability
2 Backpressure

Slide 73

Slide 73 text

Backpressure
Propagate to producer
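Backpressure from a bounded queue can be shown with Ruby's `SizedQueue`. This is a minimal sketch: the capacity of 2 and the `try_push` helper are hypothetical, and a non-blocking push is used only so that the "queue is full" condition is visible instead of blocking the example.

```ruby
require "thread"

queue = SizedQueue.new(2) # tiny capacity to make the effect visible

# Hypothetical producer helper: reports whether the queue accepted the event.
def try_push(queue, event)
  queue.push(event, true) # non-blocking push raises ThreadError when the queue is full
  true
rescue ThreadError
  false
end

accepted = (1..3).map { |i| try_push(queue, "event #{i}") }
puts accepted.inspect

queue.pop # a consumer frees one slot
puts try_push(queue, "event 3")
```

With the default blocking push, the producer simply waits at this point, so a slow consumer slows the input down rather than letting events pile up without bound.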

Slide 74

Slide 74 text

Intermediate Queuing Architecture
Payments Server, Database, Web Server, elasticsearch

Slide 75

Slide 75 text

The Road to Reliability
input -> Variable Size Persistent Queue -> filter + output

Slide 76

Slide 76 text

Simplified Architecture
Payments Server, Database, Web Server, elasticsearch
1 Variable size persistent queue

Slide 78

Slide 78 text

Please attribute Elastic with a link to elastic.co
Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/
Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders.