Alexander Reelsen
@spinscale
Introduction to
Elasticsearch Ingest Node
What?
• Elasticsearch had no way to enrich JSON documents before indexing
• Logstash usually handles document enrichment
• Ingesting Apache logs required a full ELK setup
• Getting data from a Beat into Elasticsearch required Logstash in between
• What if we had a little bit of enrichment power in Elasticsearch itself?
Will Logstash be replaced?
No
Definitions
• Pipeline
  • Describes how a document is enriched
  • Stored in the cluster state
  • Can be configured for index operations
  • Consists of a series of processors
• Processor
  • A single step that changes a document
  • Configured as part of a pipeline
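
To make the two definitions concrete, here is a minimal sketch in Python that models a pipeline as an ordered list of processors, each of which takes a document and returns a changed copy. This is only a toy model of the concept, not how Elasticsearch implements it; the helper names (set_field, lowercase, run_pipeline) and the sample document are made up for illustration.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Processor = Callable[[Document], Document]


def set_field(field: str, value: Any) -> Processor:
    """Hypothetical processor: set a field to a fixed value."""
    def apply(doc: Document) -> Document:
        updated = dict(doc)  # copy so the input document is left untouched
        updated[field] = value
        return updated
    return apply


def lowercase(field: str) -> Processor:
    """Hypothetical processor: lowercase the string stored in a field."""
    def apply(doc: Document) -> Document:
        updated = dict(doc)
        updated[field] = str(updated[field]).lower()
        return updated
    return apply


def run_pipeline(processors: List[Processor], doc: Document) -> Document:
    """Apply every processor in order; each one is a single step."""
    for processor in processors:
        doc = processor(doc)
    return doc


# A pipeline is just the ordered series of processors.
pipeline = [set_field("environment", "demo"), lowercase("level")]
enriched = run_pipeline(pipeline, {"message": "disk full", "level": "WARN"})
```

The order of the list matters: each processor sees the output of the one before it, which is the sense in which a pipeline is a series of steps.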
APIs
• PUT _ingest/pipeline/my-pipeline-id
• GET _ingest/pipeline/my-pipeline-id
• DELETE _ingest/pipeline/my-pipeline-id
• POST _ingest/pipeline/_simulate
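
As a rough illustration of how these four endpoints might be called, the sketch below builds (but does not send) the HTTP requests with Python's standard library. The base URL, the pipeline body with a single set processor, and the sample document are assumptions for the example; check the request bodies against the documentation for your Elasticsearch version.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local single-node cluster


def ingest_request(method: str, path: str, body=None) -> Request:
    """Build an HTTP request object for an ingest endpoint without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return Request(f"{BASE_URL}/{path}", data=data, headers=headers, method=method)


# Assumed example pipeline: one "set" processor that adds a field.
pipeline_body = {
    "description": "example pipeline (illustrative)",
    "processors": [{"set": {"field": "environment", "value": "demo"}}],
}

put_req = ingest_request("PUT", "_ingest/pipeline/my-pipeline-id", pipeline_body)
get_req = ingest_request("GET", "_ingest/pipeline/my-pipeline-id")
delete_req = ingest_request("DELETE", "_ingest/pipeline/my-pipeline-id")

# _simulate takes a pipeline definition plus sample documents to run it against.
simulate_body = {
    "pipeline": pipeline_body,
    "docs": [{"_source": {"message": "disk full", "level": "WARN"}}],
}
sim_req = ingest_request("POST", "_ingest/pipeline/_simulate", simulate_body)
```

Passing any of these objects to urllib.request.urlopen would send it to the cluster; the simulate call is the one to try first, since it does not store or index anything.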
Ingestion inside of a cluster
[Diagram: request PUT foo/bar/1; cluster nodes labeled C, P, R, R]
Ingestion inside of a cluster
[Diagram: request PUT foo/bar/1?pipeline_id=my-pipeline; cluster nodes labeled C, P, R, R]
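
The only difference between the two index requests is the query parameter that names the pipeline. A small sketch of building both request paths follows; the helper index_path is hypothetical, and the parameter name pipeline_id is copied from the slide, so verify it against the documentation for your Elasticsearch version before relying on it.

```python
from urllib.parse import urlencode


def index_path(index, doc_type, doc_id, pipeline=None, param_name="pipeline_id"):
    """Build the path of an index request, optionally routed through a pipeline.

    param_name defaults to the name shown on the slide (an assumption here).
    """
    path = f"{index}/{doc_type}/{doc_id}"
    if pipeline is not None:
        path += "?" + urlencode({param_name: pipeline})
    return path


plain = index_path("foo", "bar", 1)
with_pipeline = index_path("foo", "bar", 1, pipeline="my-pipeline")
```

Keeping the parameter name as an argument makes it easy to adjust if your version expects a different name.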
Dedicated ingest nodes
[Diagram: request PUT foo/bar/1?pipeline_id=my-pipeline; cluster nodes labeled C, P, R, R; two nodes are configured with node.ingest: true, the other four with node.ingest: false]
Demo: Using pipelines
Writing your own processor
• Custom processors can be written and packaged as plugins
• Use any JVM language
• Processors are fully unit testable!
• Beware of the security manager!