Slide 1

Slide 1 text

1 Alexander Reelsen @spinscale Introduction into Elasticsearch Ingest Node

Slide 2

Slide 2 text

2 What?
• Elasticsearch had no way to enrich JSON documents before indexing
• Logstash usually handled document enrichment
• Processing Apache logs required a full ELK setup
• Getting data from a Beat into Elasticsearch required Logstash in between
• What if we had a little bit of enrichment power in Elasticsearch itself?

Slide 3

Slide 3 text

3 Will Logstash be replaced? No.

Slide 4

Slide 4 text

4 Definitions
• Pipeline
  • Guide to document enrichment
  • Stored inside the cluster state
  • Index operations can have a pipeline configured
  • A pipeline consists of a series of processors
• Processor
  • A single step to change a document
  • Configurable as part of a pipeline
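The two definitions above can be modeled in a few lines of Python. This is only a toy, in-memory sketch of the idea (a pipeline is an ordered list of processors, and each processor is a single step that changes a document); it is not Elasticsearch code, and the helper names `set_field`, `lowercase` and `run_pipeline` are hypothetical.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Processor = Callable[[Document], Document]


def set_field(field: str, value: Any) -> Processor:
    """Toy processor: return a copy of the document with `field` set to `value`."""
    def run(doc: Document) -> Document:
        updated = dict(doc)
        updated[field] = value
        return updated
    return run


def lowercase(field: str) -> Processor:
    """Toy processor: return a copy of the document with `field` lowercased."""
    def run(doc: Document) -> Document:
        updated = dict(doc)
        updated[field] = str(updated[field]).lower()
        return updated
    return run


def run_pipeline(processors: List[Processor], doc: Document) -> Document:
    """Apply each processor in order; the output of one step feeds the next."""
    for step in processors:
        doc = step(doc)
    return doc


# A pipeline is just an ordered series of processors.
pipeline = [set_field("environment", "production"), lowercase("hostname")]
original = {"hostname": "WEB-01"}
result = run_pipeline(pipeline, original)
print(result)
```

Each toy processor copies the document instead of mutating it, so the input stays untouched and every step can be tested in isolation.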

Slide 5

Slide 5 text

5 APIs
• PUT _ingest/pipeline/my-pipeline-id
• GET _ingest/pipeline/my-pipeline-id
• DELETE _ingest/pipeline/my-pipeline-id
• POST _ingest/pipeline/_simulate
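As a rough illustration of the four endpoints above, the following sketch builds (but does not send) the corresponding HTTP requests with Python's standard library. The base URL `http://localhost:9200`, the helper `ingest_request`, and the field names in the example pipeline and test document are assumptions made for this example.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def ingest_request(method, path, body=None):
    """Build an HTTP request for an ingest endpoint without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return Request(f"{BASE_URL}/{path}", data=data, headers=headers, method=method)


# A pipeline body: a description plus an ordered list of processors.
pipeline_body = {
    "description": "hypothetical example pipeline",
    "processors": [{"set": {"field": "environment", "value": "production"}}],
}

put_req = ingest_request("PUT", "_ingest/pipeline/my-pipeline-id", pipeline_body)
get_req = ingest_request("GET", "_ingest/pipeline/my-pipeline-id")
delete_req = ingest_request("DELETE", "_ingest/pipeline/my-pipeline-id")

# _simulate takes the pipeline inline together with test documents.
simulate_req = ingest_request(
    "POST",
    "_ingest/pipeline/_simulate",
    {"pipeline": pipeline_body, "docs": [{"_source": {"message": "hello"}}]},
)

for req in (put_req, get_req, delete_req, simulate_req):
    print(req.get_method(), req.full_url)
```

Against a running cluster, each request could be sent with `urllib.request.urlopen(req)`; the simulate call is a convenient way to try a pipeline on sample documents without storing it.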

Slide 6

Slide 6 text

6 Processors
• Append, Convert, Date, Date Index Name, Fail
• Foreach, Grok, Gsub, Join, JSON, KV, Lowercase
• Remove, Rename, Script, Set, Split, Sort, Trim, Uppercase, Dot Expander
• Plugins: useragent, geoip, attachment
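To give a feel for how a few of these built-in processors are configured, here is a sketch of a pipeline definition assembled as a Python dictionary and serialized to JSON. The field names (`tags`, `hostname`, `bytes`, `host`, `debug`) are hypothetical; only the processor types and their basic options are taken from the Elasticsearch ingest documentation.

```python
import json

# Field names are hypothetical; each processor is a single-key object
# mapping the processor type to its options.
pipeline = {
    "description": "Illustrative pipeline using a few built-in processors",
    "processors": [
        {"split": {"field": "tags", "separator": ","}},
        {"lowercase": {"field": "hostname"}},
        {"convert": {"field": "bytes", "type": "integer"}},
        {"rename": {"field": "hostname", "target_field": "host"}},
        {"remove": {"field": "debug"}},
    ],
}

# Processors run in the order listed, so "rename" sees the lowercased value.
processor_types = [next(iter(step)) for step in pipeline["processors"]]
print(processor_types)
print(json.dumps(pipeline, indent=2))
```

The resulting JSON is the kind of body that would be sent with `PUT _ingest/pipeline/<id>`.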

Slide 7

Slide 7 text

7 Ingestion inside of a cluster
[Diagram: C sends PUT foo/bar/1 to the cluster; shard copies labeled P, R, R]

Slide 8

Slide 8 text

8 Ingestion inside of a cluster
[Diagram: C sends PUT foo/bar/1?pipeline=my-pipeline to the cluster; shard copies labeled P, R, R]
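The index request on this slide can be sketched as a small URL builder. The query parameter is assumed here to be named `pipeline`, as in the released Elasticsearch 5.x index API; the base URL and the helper `index_url` are likewise assumptions for illustration.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:9200"  # assumption: a local node


def index_url(index, doc_type, doc_id, pipeline=None):
    """Build the URL of an index request, optionally routed through a pipeline."""
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    # Only add the query string when a pipeline id was given.
    query = "?" + urlencode({"pipeline": pipeline}) if pipeline else ""
    return f"{BASE_URL}/{path}{query}"


plain = index_url("foo", "bar", "1")
with_pipeline = index_url("foo", "bar", "1", pipeline="my-pipeline")
print(plain)
print(with_pipeline)
```

Without the parameter the document is indexed as-is; with it, the named pipeline processes the document before it is indexed.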

Slide 9

Slide 9 text

9 Dedicated ingest nodes
[Diagram: C sends PUT foo/bar/1?pipeline=my-pipeline; shard copies labeled P, R, R; two nodes set node.ingest: true, four nodes set node.ingest: false]

Slide 10

Slide 10 text

10 Demo: Using pipelines

Slide 11

Slide 11 text

11 Writing your own processor

Slide 12

Slide 12 text

12 Writing your own processor
• Custom processors can be packaged as your own plugins
• Use any JVM language
• Processors are fully unit testable!
• Beware of the security manager!

Slide 13

Slide 13 text

13 Demo: Writing your own processor

Slide 14

Slide 14 text

14 Further reading

Slide 15

Slide 15 text

15 Further reading
https://speakerdeck.com/elastic/ingest-node-enriching-documents-within-elasticsearch
https://www.elastic.co/guide/en/elasticsearch/reference/master/ingest.html
https://www.elastic.co/blog/ingest-node-a-clients-perspective
https://www.elastic.co/guide/en/elasticsearch/reference/master/ingest-apis.html
https://www.elastic.co/blog/new-way-to-ingest-part-1
https://www.elastic.co/blog/ingesting-and-exploring-scientific-papers-using-elastic-cloud
https://www.elastic.co/blog/writing-your-own-ingest-processor-for-elasticsearch
https://github.com/spinscale/elasticsearch-ingest-opennlp
https://github.com/spinscale/elasticsearch-ingest-langdetect
https://github.com/spinscale/cookiecutter-elasticsearch-ingest-processor

Slide 16

Slide 16 text

16 Questions?