Elasticsearch Performance Best Practices

Presentation held at the JAX conference in Mainz on April 20, 2016.

Patrick Peschlow

April 20, 2016

Transcript

  1. Elasticsearch Performance Best Practices
    Patrick Peschlow


  2. Fundamentals


  3. Documents
    {
      _id: "1",
      author: "Patrick Peschlow",
      title: "Elasticsearch Performance"
    }

    {
      _id: "2",
      author: "Patrick Peschlow",
      title: "Elasticsearch Scalability"
    }


  4. Queries
    {
      match: {
        content: "performance"
      }
    }


  5. Results
    {
      hits: {
        total: 1,
        hits: [
          {
            _id: "1",
            _score: 0.15342641,
            _source: {
              author: "Patrick Peschlow",
              title: "Elasticsearch Performance"
            }
          }
        ]
      }
    }
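
    For context, such a match query is sent as the body of a search request against an index;
    a minimal sketch (the index name "talks" is a placeholder):

    GET /talks/_search
    {
      "query": {
        "match": {
          "content": "performance"
        }
      }
    }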


  6.–20. How indexing works
    [Animation slides: the indexing diagram is built up step by step from the components
    Indexing Buffer, Transaction Log (translog), persisted segments, and searchable segments,
    together with the operations index(), translog(), refresh(), flush(), and merge():
    index() writes a document into the indexing buffer and appends it to the transaction log,
    refresh() makes buffered documents searchable as new segments, flush() persists segments
    and clears the transaction log, and merge() combines persisted segments in the background.]
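
    The refresh() and flush() steps above can also be tuned and triggered explicitly; a minimal
    sketch (the index name and interval are placeholders, settings syntax as of the 2.x line):

    PUT /my_index/_settings
    {
      "index": { "refresh_interval": "30s" }
    }

    POST /my_index/_refresh
    POST /my_index/_flush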

  21.–25. How searching works
    [Animation slides: a Searcher operates on the persisted segments; the operations search(),
    query(), compute_hits_to_return(), and fetch() are shown step by step: the query phase
    determines which hits to return, and the fetch phase retrieves them.]

  26. Scaling out


  27. •Synchronous
    •Returns only when all replicas have acknowledged receipt
    Replication
    [Diagram: primary shard P1 on Node 1, replica shard R1 on Node 2]


  28. •High availability
    •Automatic failover if the primary fails
    Replication benefits


  29. •Increase capacity for search requests
    •Default: round-robin
    Replication benefits


  30. •Good: Number of replicas can be changed dynamically
    •Desired level of fault tolerance?
    •It’s all about risk
    •If shard recovery is quick, maybe one replica is enough?
    •More replicas require more hardware resources
    •To increase search throughput, scaling up is also an option
    How many replicas are needed?
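
    Because the replica count is dynamic, it can be changed on a live index; a minimal sketch
    (index name and value are placeholders):

    PUT /my_index/_settings
    {
      "index": { "number_of_replicas": 2 }
    }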


    •Partitioning of documents by some "routing" value
    •Default: document ID hash
    Sharding
    [Diagram: primary shards P1 on Node 1, P2 and P3 on Node 2]


  32. •Scale out
    •Distribute a large index onto multiple machines
    Sharding benefits


  33. •Increase capacity for write operations
    •Inserts, updates, deletes (and merges!)
    Sharding benefits


  34. •Parallelize searches
    •Unit of work in the search thread pool: the shard
    Sharding benefits


  35. •Distributed search requires coordination
    •Need to aggregate results from different shards
    •Similar to aggregating results from the segments of a shard
    Sharding drawbacks


  36. •Bad: Number of shards needs to be set on index creation
    •Finding the right number requires some care
    •Formulate assumptions/expectations
    •Test and measure
    •Overallocate a little
    •Maximum shard size?
    •Often cited: 50 GB
    •Mainly a rule of thumb for quick recovery
    How many shards are needed?
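
    In contrast, the shard count has to be chosen when the index is created; a minimal sketch
    (index name and numbers are placeholders):

    PUT /my_index
    {
      "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 1
      }
    }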


  37. •Maybe you can just use multiple indices?
    •Searching multiple indices is easy
    •Indices are more flexible (e.g., creation, deletion)
    •But: Every index consumes certain resources
    •Cluster state, in-memory data structures
    •Recommendation: Shard an index if…
    •…you suspect that one shard might not be enough
    •…and there is no indicator for a "smarter" approach
    When to shard?


  38. User-based approach
    [Diagram: a virtual index made up of Index 1 (shards P1, P2, P3) and Index 2 (shard P1);
    users 1 to 9 are each assigned to one of these shards/indices, so a search by user 1 only
    needs to hit the shard holding that user's documents.]
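
    One way to implement such a user-based layout is custom routing, so that all documents of a
    user land on the same shard; a minimal sketch (index, type, and routing values are
    placeholders, 2.x-style URLs with a type):

    PUT /messages/message/1?routing=user_1
    {
      "user": "user_1",
      "text": "Hello"
    }

    GET /messages/_search?routing=user_1
    {
      "query": {
        "term": { "user": "user_1" }
      }
    }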


  39. Time-based approach
    [Diagram: a virtual index made up of daily indices 2016-03-20 through 2016-04-20, each with
    a shard P1; a search within the last 3 days only needs to query the three most recent
    daily indices.]
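
    With time-based indices, a search over the last few days simply lists the matching daily
    indices; a minimal sketch (index names are placeholders):

    GET /logs-2016-04-18,logs-2016-04-19,logs-2016-04-20/_search
    {
      "query": {
        "match": { "message": "timeout" }
      }
    }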


  40. •Separate concerns
    •Master nodes
    •Data nodes
    •Client/aggregator nodes
    •Client applications
    •HTTP client
    •TransportClient
    •NodeClient
    Cluster nodes


  41. Mapping


  42. Examples
    "filename" : {
      "type" : "string",
      "index" : "not_analyzed"
    }

    "filename_german" : {
      "type" : "string",
      "index" : "analyzed",
      "analyzer" : "german"
    }

    "filename_fancy" : {
      "type" : "string",
      "index" : "analyzed",
      "analyzer" : "my_fancy_analyzer"
    }


  43. •Which fields to analyze, and how?
    •Which data to store for analyzed fields?
    •Term frequencies, positions, offsets?
    •Field norms?
    •Term vectors?
    •Which fields not to index at all?
    Indexing fields
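
    A minimal sketch of how these choices look in a 2.x-style mapping (index, type, and field
    names are placeholders): index_options, norms, and term_vector control what is stored for
    an analyzed field, and "index": "no" keeps a field out of the index entirely:

    PUT /my_index
    {
      "mappings": {
        "doc": {
          "properties": {
            "content": {
              "type": "string",
              "index_options": "offsets",
              "norms": { "enabled": false },
              "term_vector": "with_positions_offsets"
            },
            "internal_flag": {
              "type": "string",
              "index": "no"
            }
          }
        }
      }
    }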


  44. •Consider indexing fields multiple times
    •Index time vs. query time solutions
    •multi-fields, copy_to
    •Disable default multiple indexing that you don't need
    •Need the _all field?
    •Need raw fields?
    Indexing fields multiple times
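
    A minimal sketch of these options in a 2.x-style mapping (index, type, field, and analyzer
    names are placeholders): a multi-field keeps both an analyzed and a raw version, copy_to
    collects values into another field at index time, and _all can be disabled if unused:

    PUT /my_index
    {
      "mappings": {
        "doc": {
          "_all": { "enabled": false },
          "properties": {
            "filename": {
              "type": "string",
              "analyzer": "german",
              "copy_to": "fulltext",
              "fields": {
                "raw": { "type": "string", "index": "not_analyzed" }
              }
            },
            "fulltext": { "type": "string" }
          }
        }
      }
    }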


  45. •Be careful with dynamic mapping/templates
    •May lead to huge mappings (cluster state)
    •For known unknowns, consider the key-value pattern
    •Define just two fields: key and value
    Indexing unknown fields
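
    A sketch of the key-value pattern; one common variant (an assumption here, not prescribed by
    the slide) uses a nested field so that each key stays paired with its value:

    PUT /my_index
    {
      "mappings": {
        "doc": {
          "properties": {
            "attributes": {
              "type": "nested",
              "properties": {
                "key":   { "type": "string", "index": "not_analyzed" },
                "value": { "type": "string" }
              }
            }
          }
        }
      }
    }

    A document then carries arbitrary attributes without growing the mapping:

    { "attributes": [ { "key": "color", "value": "red" }, { "key": "size", "value": "XL" } ] }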


  46. •Do you need to store the whole _source?
    •Needed for, e.g., Reindex API, Update API
    •Can you exclude some fields from the _source?
    •Do you need to store _source at all?
    •Disable _source and only store a few selected fields?
    Storing fields
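
    A minimal sketch of trimming _source in a 2.x-style mapping (index, type, and field names
    are placeholders); "excludes" drops selected fields from the stored _source, while
    "_source": { "enabled": false } would disable it entirely:

    PUT /my_index
    {
      "mappings": {
        "doc": {
          "_source": {
            "excludes": [ "content_base64" ]
          }
        }
      }
    }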


  47. Indexing


  48. •Limit the size of potentially large document fields
    •And hope that no one notices
    •Huge documents can OutOfMemory your cluster
    Limit input size


  49. •Update = Read, Delete, Create
    •To replace a whole document, just index it again
    •Reduces network traffic
    •Specify update as partial document or script
    •Update by ID or by query
    •Small updates might take a while
    •A single expensive field is enough
    Update API
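
    A minimal sketch of a partial update by ID (index, type, ID, and field are placeholders,
    2.x-style URL with a type):

    POST /talks/talk/1/_update
    {
      "doc": {
        "title": "Elasticsearch Performance Best Practices"
      }
    }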


  50. •Parent-child relationships
    •Model 1:N relations between documents
    •Advantage: Individual updates but combined queries
    •Warning: Performance issues with frequent refreshes
    •Observed query slowdowns between 300 ms and 5 seconds
    Relations


  51. •Reduces overhead
    •Less network overhead
    •Only one translog fsync per bulk
    •Optimum bulk size depends on the document size
    •When in doubt, prefer smaller bulks
    •Still hitting a limit with bulk indexing?
    •The bottleneck might not be at the server
    •Try concurrent indexing with multiple clients
    Bulk indexing
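
    A minimal sketch of a bulk request (index, type, and IDs are placeholders); the body is
    newline-delimited JSON, one action line optionally followed by a document line:

    POST /_bulk
    { "index": { "_index": "talks", "_type": "talk", "_id": "1" } }
    { "title": "Elasticsearch Performance" }
    { "index": { "_index": "talks", "_type": "talk", "_id": "2" } }
    { "title": "Elasticsearch Scalability" }
    { "delete": { "_index": "talks", "_type": "talk", "_id": "3" } }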


  52. •Depends on many factors
    •External data source?
    •Zero downtime?
    •Live index? Update API usage? Versioning? Possible deletes?
    •Ways to speed up reindexing
    •Bulk indexing
    •Disable refresh
    •Decrease number of replicas
    •The Reindex API only covers some scenarios
    Reindexing
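
    A sketch of the settings typically relaxed on the target index during reindexing and
    restored afterwards (index name and values are placeholders):

    PUT /new_index/_settings
    {
      "index": {
        "refresh_interval": "-1",
        "number_of_replicas": 0
      }
    }

    (bulk index everything, then restore)

    PUT /new_index/_settings
    {
      "index": {
        "refresh_interval": "1s",
        "number_of_replicas": 1
      }
    }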


  53. Search


  54. •Limit the amount of data transferred
    •Don’t request more hits than needed
    •Don’t return fields not needed
    •Limit the number of indices/shards queried
    •Only query those where hits are possible
    •Request aggregations/facets only when needed
    •Might not have changed when requesting the next results page
    Reduce search overhead
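
    A minimal sketch of limiting hits and returned fields per request (index and field names
    are placeholders):

    GET /talks/_search
    {
      "size": 10,
      "_source": [ "title", "author" ],
      "query": {
        "match": { "content": "performance" }
      }
    }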


  55. •Avoid deep pagination
    •Sorting millions of documents is expensive
    •To iterate over lots of documents use scroll search
    •Sort by _doc and use the scroll parameter
    Deep pagination
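
    A minimal sketch of a scroll search sorted by _doc (index name, page size, and timeout are
    placeholders):

    GET /talks/_search?scroll=1m
    {
      "size": 1000,
      "sort": [ "_doc" ],
      "query": { "match_all": {} }
    }

    POST /_search/scroll
    {
      "scroll": "1m",
      "scroll_id": "<scroll_id from the previous response>"
    }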


  56. •Search offset
    •Number of rows requested
    •Number of search terms
    •Total length of the search string
    Limit user input


  57. •Some defaults reduce accuracy
    •Need more accurate scoring?
    •Set search_type to dfs_query_then_fetch
    •But: one more round trip
    •What is accurate scoring anyway? (e.g., deleted documents)
    •Need more accurate counts in aggregations?
    •Set shard_size higher than size
    •But: more work for each shard
    Accuracy vs. speed trade-offs with sharding
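
    Minimal sketches of both knobs (index, field, and size values are placeholders):

    GET /talks/_search?search_type=dfs_query_then_fetch
    {
      "query": { "match": { "content": "performance" } }
    }

    GET /talks/_search
    {
      "size": 0,
      "aggs": {
        "top_authors": {
          "terms": { "field": "author", "size": 10, "shard_size": 100 }
        }
      }
    }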


  58. •Force Merge API (aka Optimize)
    •Turning 20 segments into 1 can be highly beneficial
    •But: merges will invalidate caches
    •Most useful for indices not modified anymore
    Optimize indices?
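
    A minimal sketch of the Force Merge call (index name is a placeholder; on versions before
    2.1 the endpoint was called _optimize):

    POST /old_index/_forcemerge?max_num_segments=1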


  59. •Page cache (OS)
    •Node query cache
    •Shard request cache
    •Disabled by default
    •Field data cache
    •Not as relevant as it used to be
    Caches
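
    A sketch of enabling the shard request cache, which is off by default in the 2.x line
    (index name and field are placeholders); it only caches requests with size 0, e.g. pure
    aggregations:

    PUT /talks/_settings
    { "index.requests.cache.enable": true }

    GET /talks/_search?request_cache=true
    {
      "size": 0,
      "aggs": { "authors": { "terms": { "field": "author" } } }
    }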


  60. •Profile API
    •Detailed timing analysis for a query
    •Slow log
    Slow queries
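
    Minimal sketches of both tools (index name and thresholds are placeholders): "profile": true
    returns a timing breakdown with the response, and the slow log thresholds are index settings:

    GET /talks/_search
    {
      "profile": true,
      "query": { "match": { "content": "performance" } }
    }

    PUT /talks/_settings
    {
      "index.search.slowlog.threshold.query.warn": "5s",
      "index.search.slowlog.threshold.fetch.warn": "1s"
    }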


  61. Misc


  62. •Java: Avoid blacklisted versions
    •GC: Use CMS
    •Heap size
    •Measure how much is needed
    •No more than roughly 30 GB (enables pointer compression)
    •Number of processors?
    •Set JVM options (defaults based on OS virtual processors)
    •Set the Elasticsearch processors configuration property
    JVM


  63. •DRAM: The more, the better
    •Page cache is crucial for performance
    •Disk: Local SSD is best
    •CPU: 8 cores are nice
    •Consider separating into hot and cold nodes
    •Set up allocation constraints
    Hardware resources
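
    A sketch of the hot/cold separation via shard allocation filtering; the attribute name
    box_type and its values are a convention (an assumption here), and the data nodes must be
    started with a matching custom attribute (e.g. node.box_type: hot) in their configuration:

    PUT /logs-2016-04-20/_settings
    {
      "index.routing.allocation.require.box_type": "hot"
    }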


  64. •Monitoring
    •Check out the API
    •Detailed case studies
    •Lots of examples on the web
    •Configuration may differ between Elasticsearch versions
    •The official channels (GitHub, forum, documentation) are great
    •Outdated (mostly pre 2.x) topics
    •Field data without doc_values, filters vs. queries, scan+scroll,
    split brain, unnecessary recovery, cluster state without diffs
    There is more


  65. codecentric AG
    Merscheider Straße 1
    42699 Solingen, Deutschland
    tel: +49 (0) 212.23362854
    fax: +49 (0) 212.23362879
    [email protected]
    www.codecentric.de
    blog.codecentric.de
    Questions?
    Dr. Patrick Peschlow
    Head of Development - CenterDevice
