Introduction into the Elasticsearch Ingest Node

This is a short introduction to the Elasticsearch Ingest Node. The corresponding blog post is at https://www.elastic.co/blog/writing-your-own-ingest-processor-for-elasticsearch

Alexander Reelsen

January 16, 2017
Transcript

  1. 1
    Alexander Reelsen
    @spinscale
    Introduction into
    Elasticsearch Ingest Node

  2. 2
    What?
    • Elasticsearch had no way to enrich JSON documents before indexing
    • Logstash usually handles document enrichment
    • Ingesting Apache logs required a full ELK setup
    • Getting data from a Beat into Elasticsearch required Logstash in between
    • What if we had a little bit of enrichment power in Elasticsearch?

  3. 3
    Will Logstash be replaced?
    No

  4. 4
    Definitions
    • Pipeline
        • Guide to document enrichment
        • Stored inside ClusterState
        • Index operations can have a pipeline configured
        • A pipeline consists of a series of processors
    • Processor
        • A single step to change a document
        • Configurable as part of a pipeline
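
The relationship between the two definitions can be sketched in a few lines of Python. This is a toy model only, not how Elasticsearch implements ingest: the helper names `set_field`, `lowercase` and `run_pipeline` are hypothetical, and a document is represented as a plain dictionary.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Processor = Callable[[Document], Document]


def set_field(field: str, value: Any) -> Processor:
    """Return a processor that sets `field` to `value` (hypothetical helper)."""
    def run(doc: Document) -> Document:
        updated = dict(doc)  # copy so the caller's document is left untouched
        updated[field] = value
        return updated
    return run


def lowercase(field: str) -> Processor:
    """Return a processor that lowercases the string stored in `field`."""
    def run(doc: Document) -> Document:
        updated = dict(doc)
        updated[field] = str(updated[field]).lower()
        return updated
    return run


def run_pipeline(processors: List[Processor], doc: Document) -> Document:
    """Apply each processor in order; every step sees the previous step's output."""
    for processor in processors:
        doc = processor(doc)
    return doc


# A pipeline is an ordered series of processors.
pipeline = [set_field("env", "production"), lowercase("level")]
result = run_pipeline(pipeline, {"level": "WARN", "message": "disk almost full"})
print(result)
```

Each processor performs exactly one change, and the pipeline is nothing more than the ordered list of those steps.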

  5. 5
    APIs
    • PUT _ingest/pipeline/my-pipeline-id
    • GET _ingest/pipeline/my-pipeline-id
    • DELETE _ingest/pipeline/my-pipeline-id
    • POST _ingest/pipeline/_simulate
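
As a rough illustration of the four endpoints above, the following sketch builds (but does not send) the corresponding HTTP requests with Python's standard library. The base URL `http://localhost:9200`, the helper `build_request` and the example pipeline body are assumptions made for this sketch; the simulate body follows the `pipeline` plus `docs` layout described in the Elasticsearch ingest API documentation.

```python
import json
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port


def build_request(method: str, path: str, body: Optional[dict] = None) -> Request:
    """Build an HTTP request for an ingest endpoint without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(f"{BASE_URL}/{path}", data=data, headers=headers, method=method)


# Hypothetical pipeline with a single set processor.
pipeline_body = {
    "description": "example pipeline for illustration",
    "processors": [{"set": {"field": "env", "value": "production"}}],
}

put_req = build_request("PUT", "_ingest/pipeline/my-pipeline-id", pipeline_body)
get_req = build_request("GET", "_ingest/pipeline/my-pipeline-id")
delete_req = build_request("DELETE", "_ingest/pipeline/my-pipeline-id")

# _simulate takes an inline pipeline plus sample documents to run through it.
simulate_req = build_request(
    "POST",
    "_ingest/pipeline/_simulate",
    {"pipeline": pipeline_body, "docs": [{"_source": {"message": "hello"}}]},
)

for req in (put_req, get_req, delete_req, simulate_req):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send it to a running cluster.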

  6. 6
    Processors
    • Append, Convert, Date, Date Index Name, Fail
    • Foreach, Grok, Gsub, Join, JSON, KV, Lowercase
    • Remove, Rename, Script, Set, Split, Sort, Trim, Uppercase, Dot Expander
    • Plugins: useragent, geoip, attachment
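
To show how several of these processors might be combined, here is a sketch of a pipeline body for Apache access logs, written as a Python dictionary and serialized to JSON. It rests on assumptions: that the raw log line arrives in a field named `message`, and that the `%{COMBINEDAPACHELOG}` grok pattern yields `timestamp` and `response` fields. Adjust the field names to whatever your documents actually contain.

```python
import json

# Illustrative pipeline: parse the raw line, turn the timestamp into a date,
# convert the status code to a number, then drop the original line.
apache_pipeline = {
    "description": "Parse Apache access log lines (illustrative)",
    "processors": [
        {"grok": {"field": "message", "patterns": ["%{COMBINEDAPACHELOG}"]}},
        {"date": {"field": "timestamp", "formats": ["dd/MMM/yyyy:HH:mm:ss Z"]}},
        {"convert": {"field": "response", "type": "integer"}},
        {"remove": {"field": "message"}},
    ],
}

# Each processor entry is a single-key object whose key names the processor type.
processor_names = [next(iter(step)) for step in apache_pipeline["processors"]]

body = json.dumps(apache_pipeline, indent=2)
print(body)
```

The resulting JSON is the kind of body that would be sent with a `PUT _ingest/pipeline/...` request.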

  7. 7
    Ingestion inside of a cluster
    [Diagram labels: C, P, R, R]
    Request: PUT foo/bar/1

  8. 8
    Ingestion inside of a cluster
    [Diagram labels: C, P, R, R]
    Request: PUT foo/bar/1?pipeline=my-pipeline
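
The only difference between the two index requests is the query parameter that names the pipeline. The sketch below builds both variants without sending them. Released Elasticsearch versions document this parameter as `pipeline`; the helper `index_request`, the base URL and the sample document are assumptions for illustration.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def index_request(
    index: str,
    doc_type: str,
    doc_id: str,
    source: dict,
    pipeline: Optional[str] = None,
    base_url: str = "http://localhost:9200",  # assumption: local default port
) -> Request:
    """Build a PUT index request, optionally routed through an ingest pipeline."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    if pipeline is not None:
        # The pipeline is selected per request via a query parameter.
        url += "?" + urlencode({"pipeline": pipeline})
    return Request(
        url,
        data=json.dumps(source).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


plain = index_request("foo", "bar", "1", {"message": "hello"})
piped = index_request("foo", "bar", "1", {"message": "hello"}, pipeline="my-pipeline")

print(plain.full_url)
print(piped.full_url)
```

The document body is identical in both cases; any enrichment happens inside the cluster before the document is indexed.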

  9. 9
    Dedicated ingest nodes
    [Diagram labels: C, P, R, R; two nodes marked node.ingest: true, four marked node.ingest: false]
    Request: PUT foo/bar/1?pipeline=my-pipeline
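
A toy model may help picture the effect of the `node.ingest` setting. It assumes that a node with ingest disabled hands a pipeline request to an ingest-enabled node, and that ingest nodes are chosen in simple rotation. The node names, the rotation strategy and the helper `pick_handler` are assumptions of this sketch, not a description of Elasticsearch internals.

```python
from itertools import cycle
from typing import Dict, Iterator

# Hypothetical node names mapped to their node.ingest setting
# (two ingest nodes and four non-ingest nodes, as on the slide).
NODES: Dict[str, bool] = {
    "node-a": True,
    "node-b": False,
    "node-c": False,
    "node-d": False,
    "node-e": False,
    "node-f": True,
}


def pick_handler(
    receiving_node: str,
    has_pipeline: bool,
    nodes: Dict[str, bool],
    rotation: Iterator[str],
) -> str:
    """Return the node that would run the pipeline in this toy model."""
    if not has_pipeline or nodes[receiving_node]:
        # No pipeline requested, or the receiving node can run it itself.
        return receiving_node
    # Otherwise hand the request to the next ingest-enabled node.
    return next(rotation)


# Rotate over ingest-enabled nodes in a stable (sorted) order.
rotation = cycle(sorted(name for name, enabled in NODES.items() if enabled))

first = pick_handler("node-b", True, NODES, rotation)
second = pick_handler("node-c", True, NODES, rotation)
local = pick_handler("node-f", True, NODES, rotation)
no_pipeline = pick_handler("node-b", False, NODES, rotation)

print(first, second, local, no_pipeline)
```

In this model, pipeline work lands only on the nodes marked `node.ingest: true`, which is the point of dedicating nodes to ingest.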

  10. 10
    Demo: Using pipelines

  11. 11
    Writing your own processor

  12. 12
    Writing your own processor
    • Processors can be written as your own plugins
    • Use any JVM language
    • Processors are fully unit testable!
    • Beware of the security manager!

  13. 13
    Demo: Writing your own processor

  14. 14
    Further reading

  15. 15
    Further reading
    https://speakerdeck.com/elastic/ingest-node-enriching-documents-within-elasticsearch
    https://www.elastic.co/guide/en/elasticsearch/reference/master/ingest.html
    https://www.elastic.co/blog/ingest-node-a-clients-perspective
    https://www.elastic.co/guide/en/elasticsearch/reference/master/ingest-apis.html
    https://www.elastic.co/blog/new-way-to-ingest-part-1
    https://www.elastic.co/blog/ingesting-and-exploring-scientific-papers-using-elastic-cloud
    https://www.elastic.co/blog/writing-your-own-ingest-processor-for-elasticsearch
    https://github.com/spinscale/elasticsearch-ingest-opennlp
    https://github.com/spinscale/elasticsearch-ingest-langdetect
    https://github.com/spinscale/cookiecutter-elasticsearch-ingest-processor

  16. 16
    Questions?
