Introduction into the Elasticsearch Ingest Node

1 Alexander Reelsen @spinscale Introduction into Elasticsearch Ingest Node

2 What?
• Elasticsearch had no way to enrich JSON documents before indexing
• Logstash usually handles document enrichment
• Ingesting Apache logs required a full ELK setup
• Getting data from a Beat into Elasticsearch required Logstash in between
• What if we had a little bit of enrichment power in Elasticsearch?

3 Will Logstash be replaced? No

4 Definitions
• Pipeline
  • Guide to document enrichment
  • Stored inside ClusterState
  • Index operations can have a pipeline configured
  • A pipeline consists of a series of processors
• Processor
  • A single step to change a document
  • Configurable as part of a pipeline
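To make the two definitions concrete, here is a minimal sketch of what a pipeline body might look like, written as a Python dictionary and serialized to JSON. The field names (`environment`, `hostname`) and the description are made up for illustration; `set` and `lowercase` are two of the processors the deck lists later.

```python
import json

# Hypothetical pipeline body: a description plus an ordered list of processors.
# The field names below are illustrative assumptions, not taken from the talk.
pipeline = {
    "description": "Example pipeline (illustrative only)",
    "processors": [
        {"set": {"field": "environment", "value": "production"}},
        {"lowercase": {"field": "hostname"}},
    ],
}

# Each processor entry is a one-key object: processor type -> its configuration.
processor_types = [next(iter(entry)) for entry in pipeline["processors"]]

# This JSON string is what would be sent as the body when storing the pipeline.
body = json.dumps(pipeline, indent=2)
print(processor_types)
print(body)
```

The order of the `processors` list matters, since each processor is one step applied to the document in sequence.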

5 APIs
• PUT _ingest/pipeline/my-pipeline-id
• GET _ingest/pipeline/my-pipeline-id
• DELETE _ingest/pipeline/my-pipeline-id
• POST _ingest/pipeline/_simulate
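The four endpoints above could be exercised from Python roughly as follows. This is a sketch that only builds the requests with the standard library and does not send them; the base URL, the helper name `ingest_request`, and the sample pipeline and document are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port


def ingest_request(method, path, body=None):
    """Build (but do not send) a request against an _ingest endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        f"{BASE_URL}/{path}", data=data, headers=headers, method=method
    )


# Illustrative pipeline with a single processor.
pipeline = {
    "description": "demo",
    "processors": [{"set": {"field": "foo", "value": "bar"}}],
}

put_req = ingest_request("PUT", "_ingest/pipeline/my-pipeline-id", pipeline)
get_req = ingest_request("GET", "_ingest/pipeline/my-pipeline-id")
delete_req = ingest_request("DELETE", "_ingest/pipeline/my-pipeline-id")

# _simulate takes the pipeline inline plus sample documents under "docs".
simulate_req = ingest_request(
    "POST",
    "_ingest/pipeline/_simulate",
    {"pipeline": pipeline, "docs": [{"_source": {"message": "hello"}}]},
)

for req in (put_req, get_req, delete_req, simulate_req):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send it to a running cluster. The simulate call is useful for trying a pipeline against sample documents without indexing anything.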

6 Processors
• Append, Convert, Date, Date Index Name, Fail
• Foreach, Grok, Gsub, Join, JSON, KV, Lowercase
• Remove, Rename, Script, Set, Split, Sort, Trim, Uppercase, Dot Expander
• Plugins: useragent, geoip, attachment
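To illustrate how a chain of processors changes a document step by step, here is a toy model in plain Python. It is not how Elasticsearch implements processors; it only mimics the behavior of four of the listed ones (`set`, `lowercase`, `rename`, `split`) on a flat dictionary, and the toy `split` uses a plain string separator. All function names and the sample document are hypothetical.

```python
# Toy stand-ins for four processors; each one mutates the document in place.
def set_processor(doc, field, value):
    doc[field] = value


def lowercase_processor(doc, field):
    doc[field] = doc[field].lower()


def rename_processor(doc, field, target_field):
    doc[target_field] = doc.pop(field)


def split_processor(doc, field, separator):
    doc[field] = doc[field].split(separator)


TOY_PROCESSORS = {
    "set": set_processor,
    "lowercase": lowercase_processor,
    "rename": rename_processor,
    "split": split_processor,
}


def run_pipeline(processors, doc):
    """Apply each processor in order to a copy of the document."""
    result = dict(doc)
    for entry in processors:
        # Each entry is a one-key dict: processor name -> configuration.
        ((name, config),) = entry.items()
        TOY_PROCESSORS[name](result, **config)
    return result


processors = [
    {"lowercase": {"field": "host"}},
    {"split": {"field": "tags", "separator": ","}},
    {"rename": {"field": "host", "target_field": "hostname"}},
    {"set": {"field": "seen", "value": True}},
]

doc = {"host": "WEB-01", "tags": "a,b"}
result = run_pipeline(processors, doc)
print(result)
```

Because each step sees the output of the previous one, reordering the list can change the result: here `rename` must come after `lowercase`, since `lowercase` refers to the field by its old name.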

7 Ingestion inside of a cluster
[Diagram: PUT foo/bar/1 sent to a cluster with nodes labeled C, P, R, R]

8 Ingestion inside of a cluster
[Diagram: PUT foo/bar/1?pipeline_id=my-pipeline sent to a cluster with nodes labeled C, P, R, R]
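The only difference between the two requests on these slides is the query parameter naming the pipeline. A small sketch of building both URLs follows; the helper `index_url` and the base URL are assumptions. The parameter name defaults to `pipeline_id` as written on the slide, while the reference documentation for released versions names it `pipeline`, so it is kept configurable here and worth checking against your version.

```python
from urllib.parse import urlencode


def index_url(index, doc_type, doc_id, pipeline=None,
              param_name="pipeline_id", base_url="http://localhost:9200"):
    """Build the URL for an index request, optionally routed through a pipeline.

    param_name follows the slide; pass param_name="pipeline" if your
    Elasticsearch version expects that name instead.
    """
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    if pipeline is not None:
        url += "?" + urlencode({param_name: pipeline})
    return url


plain_url = index_url("foo", "bar", 1)
piped_url = index_url("foo", "bar", 1, pipeline="my-pipeline")
print(plain_url)
print(piped_url)
```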

9 Dedicated ingest nodes
[Diagram: PUT foo/bar/1?pipeline_id=my-pipeline sent to a cluster with nodes labeled C, P, R, R; node settings shown: node.ingest: true (2 nodes), node.ingest: false (4 nodes)]

10 Demo: Using pipelines

11 Writing your own processor

12 Writing your own processor
• Processors can be written as your own plugins
• Use any JVM language
• Processors are fully unit testable!
• Beware of the security manager!

13 Demo: Writing your own processor

14 Further reading

15 Further reading
• https://speakerdeck.com/elastic/ingest-node-enriching-documents-within-elasticsearch
• https://www.elastic.co/guide/en/elasticsearch/reference/master/ingest.html
• https://www.elastic.co/blog/ingest-node-a-clients-perspective
• https://www.elastic.co/guide/en/elasticsearch/reference/master/ingest-apis.html
• https://www.elastic.co/blog/new-way-to-ingest-part-1
• https://www.elastic.co/blog/ingesting-and-exploring-scientific-papers-using-elastic-cloud
• https://www.elastic.co/blog/writing-your-own-ingest-processor-for-elasticsearch
• https://github.com/spinscale/elasticsearch-ingest-opennlp
• https://github.com/spinscale/elasticsearch-ingest-langdetect
• https://github.com/spinscale/cookiecutter-elasticsearch-ingest-processor

16 Questions?
