Elasticsearch Performance Best Practices

Presentation held at the JAX conference in Mainz on April 20, 2016.

Patrick Peschlow

April 20, 2016

Transcript

  1. Elasticsearch Performance Best Practices (Patrick Peschlow)

  2. Fundamentals

  3. Documents
     { _id: "1", author: "Patrick Peschlow", title: "Elasticsearch Performance" }
     { _id: "2", author: "Patrick Peschlow", title: "Elasticsearch Scalability" }

  4. Queries
     { match: { content: "performance" } }

  5. Results
     { hits: { total: 1, hits: [ { _id: "1", _score: 0.15342641,
       _source: { author: "Patrick Peschlow", title: "Elasticsearch Performance" } } ] } }

  6.–20. How indexing works (diagram, built up step by step across these slides; an API sketch follows the list):
     • index() writes incoming documents to the in-memory Indexing Buffer.
     • translog() appends every operation to the Transaction Log for durability.
     • refresh() turns the Indexing Buffer into a segment that is searchable but not yet persisted.
     • flush() persists segments to disk and clears the Transaction Log.
     • merge() combines persisted segments into fewer, larger ones in the background.
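
     As an illustration of how these operations surface in the REST API (not on the slides; "my_index" is a placeholder), refresh and flush can also be triggered explicitly:

       POST /my_index/_refresh   (make recently indexed documents searchable now)
       POST /my_index/_flush     (persist segments and clear the transaction log)
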
  21.–25. How searching works (diagram, built up step by step across these slides):
     • search() is handled by a Searcher on top of the persisted segments.
     • query() identifies and scores the matching documents.
     • compute_hits_to_return() selects the top hits across segments.
     • fetch() retrieves the stored fields of those hits.

  26. Scaling out

  27. Replication
      • Synchronous
      • Returns only when all replicas have acknowledged receipt
      (Diagram: Node 1 holds primary P1, Node 2 holds replica R1)

  28. Replication benefits
      • High availability
      • Automatic failover if the primary fails

  29. Replication benefits
      • Increase capacity for search requests
      • Default: round-robin

  30. How many replicas are needed?
      • Good: the number of replicas can be changed dynamically (example below)
      • Desired level of fault tolerance? It’s all about risk
      • If shard recovery is quick, maybe one replica is enough?
      • More replicas require more hardware resources
      • To increase search throughput, scaling up is also an option
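
      For example (not on the slide; "my_index" is a placeholder), the replica count is a dynamic index setting:

        PUT /my_index/_settings
        { "index": { "number_of_replicas": 2 } }
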
  31. Sharding
      • Partitioning of documents by some "routing" value
      • Default: document ID hash
      (Diagram: Node 1 holds P1, Node 2 holds P2 and P3)

  32. Sharding benefits
      • Scale out
      • Distribute a large index onto multiple machines

  33. Sharding benefits
      • Increase capacity for write operations
      • Inserts, updates, deletes (and merges!)

  34. Sharding benefits
      • Parallelize searches
      • Unit of work in the search thread pool: the shard

  35. Sharding drawbacks
      • Distributed search requires coordination
      • Need to aggregate results from different shards
      • Similar to aggregating results from the segments of a shard

  36. How many shards are needed?
      • Bad: the number of shards must be set at index creation (example below)
      • Finding the right number requires some care
        • Formulate assumptions/expectations
        • Test and measure
        • Overallocate a little
      • Maximum shard size? Often cited: 50 GB
        • Mainly a rule of thumb for quick recovery
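
      For example (not on the slide; names are placeholders), the shard count is fixed in the settings at index creation time:

        PUT /my_index
        { "settings": { "number_of_shards": 3, "number_of_replicas": 1 } }
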
  37. When to shard?
      • Maybe you can just use multiple indices?
        • Searching multiple indices is easy
        • Indices are more flexible (e.g., creation, deletion)
      • But: every index consumes certain resources
        • Cluster state, in-memory data structures
      • Recommendation: shard an index if…
        • …you suspect that one shard might not be enough
        • …and there is no indicator for a "smarter" approach

  38. User-based approach
      (Diagram: users 1–9 are partitioned over the shards P1–P3 of Index 1 and over Index 2; a search by user only hits the shard holding that user, and together the indices form a "virtual index")

  39. Time-based approach
      (Diagram: one index per day, from 2016-03-20 to 2016-04-20; a search within the last 3 days only queries the three most recent daily indices, which together form a "virtual index"; example below)
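
      For illustration (not on the slide; the index names are made up), such a search addresses only the relevant daily indices:

        POST /logs-2016-04-18,logs-2016-04-19,logs-2016-04-20/_search
        { "query": { "match": { "content": "performance" } } }
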
  40. Cluster nodes
      • Separate concerns
        • Master nodes
        • Data nodes
        • Client/aggregator nodes
      • Client applications
        • HTTP client
        • TransportClient
        • NodeClient

  41. Mapping

  42. Examples
      "filename" : { "type" : "string", "index" : "not_analyzed" }
      "filename_german" : { "type" : "string", "index" : "analyzed", "analyzer" : "german" }
      "filename_fancy" : { "type" : "string", "index" : "analyzed", "analyzer" : "my_fancy_analyzer" }

  43. Indexing fields
      • Which fields to analyze, and how?
      • Which data to store for analyzed fields?
        • Term frequencies, positions, offsets?
        • Field norms?
        • Term vectors?
      • Which fields not to index at all?

  44. Indexing fields multiple times
      • Consider indexing fields multiple times
      • Index-time vs. query-time solutions
        • multi-fields, copy_to (example below)
      • Disable unneeded multiple indexing done by default
        • Need the _all field?
        • Need raw fields?
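
      A minimal multi-field sketch in the style of slide 42 (the field names are made up): the same input is indexed once with the german analyzer and once verbatim:

        "filename" : {
          "type" : "string", "analyzer" : "german",
          "fields" : {
            "raw" : { "type" : "string", "index" : "not_analyzed" }
          }
        }
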
  45. Indexing unknown fields
      • Be careful with dynamic mapping/templates
        • May lead to huge mappings (cluster state)
      • For known unknowns, consider the key-value pattern
        • Define just two fields: key and value (sketch below)
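
      A key-value sketch (not on the slide; the names are made up, and using nested objects is one possible choice): arbitrary attributes become entries under two fixed fields instead of ever-new top-level fields:

        "attributes" : {
          "type" : "nested",
          "properties" : {
            "key" :   { "type" : "string", "index" : "not_analyzed" },
            "value" : { "type" : "string" }
          }
        }
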
  46. Storing fields
      • Do you need to store the whole _source?
        • Needed for, e.g., the Reindex API and the Update API
      • Can you exclude some fields from the _source? (example below)
      • Do you need to store _source at all?
        • Disable _source and only store a few selected fields?
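
      For illustration (not on the slide; the type and field names are placeholders), excluding a field from the stored _source:

        "mappings" : {
          "document" : {
            "_source" : { "excludes" : [ "huge_blob" ] }
          }
        }
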
  47. Indexing

  48. Limit input size
      • Limit the size of potentially large document fields
        • And hope that no one notices
      • Huge documents can OutOfMemory your cluster

  49. Update API
      • Update = read, delete, create
      • To replace a whole document, just index it again
      • Reduces network traffic
      • Specify the update as a partial document or a script (example below)
      • Update by ID or by query
      • Small updates might take a while
        • A single expensive field is enough
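
      A minimal partial update (not on the slide; the index, type, and field names are placeholders):

        POST /my_index/document/1/_update
        { "doc": { "title": "Elasticsearch Performance, revised" } }
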
  50. Relations
      • Parent-child relationships
        • Model 1:N relations between documents
      • Advantage: individual updates but combined queries (mapping sketch below)
      • Warning: performance issues with frequent refreshes
        • Observed query slowdowns between 300 ms and 5 seconds
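
      A parent-child mapping sketch in 2.x syntax (not on the slides; the type names are made up):

        "mappings" : {
          "question" : {},
          "answer" : { "_parent" : { "type" : "question" } }
        }
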
  51. Bulk indexing
      • Reduces overhead (request format sketched below)
        • Less network overhead
        • Only one translog fsync per bulk
      • Optimum bulk size depends on the document size
        • When in doubt, prefer smaller bulks
      • Still hitting a limit with bulk indexing?
        • The bottleneck might not be at the server
        • Try concurrent indexing with multiple clients
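
      The bulk request body alternates action lines and document lines (sketch, not on the slides; names are placeholders):

        POST /_bulk
        { "index": { "_index": "my_index", "_type": "document", "_id": "1" } }
        { "author": "Patrick Peschlow", "title": "Elasticsearch Performance" }
        { "index": { "_index": "my_index", "_type": "document", "_id": "2" } }
        { "author": "Patrick Peschlow", "title": "Elasticsearch Scalability" }
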
  52. Reindexing
      • Depends on many factors
        • External data source? Zero downtime? Live index?
        • Update API usage? Versioning? Possible deletes?
      • Ways to speed up reindexing (example below)
        • Bulk indexing
        • Disable refresh
        • Decrease the number of replicas
      • The Reindex API only covers some scenarios
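
      For example (not on the slide; "my_index" is a placeholder), refresh and replication can be disabled for the duration of the reindex and reverted afterwards:

        PUT /my_index/_settings
        { "index": { "refresh_interval": "-1", "number_of_replicas": 0 } }
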
  53. Search

  54. Reduce search overhead
      • Limit the amount of data transferred (example below)
        • Don’t request more hits than needed
        • Don’t return fields that are not needed
      • Limit the number of indices/shards queried
        • Only query those where hits are possible
      • Request aggregations/facets only when needed
        • They might not have changed when requesting the next results page
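
      For example (not on the slide; names are placeholders), requesting only ten hits and only the title field:

        POST /my_index/_search
        { "size": 10, "_source": [ "title" ], "query": { "match": { "content": "performance" } } }
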
  55. Deep pagination
      • Avoid deep pagination
        • Sorting millions of documents is expensive
      • To iterate over lots of documents, use a scroll search (example below)
        • Sort by _doc and use the scroll parameter
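
      A scroll search sketch (not on the slide; names are placeholders): sorting by _doc avoids scoring, and each response returns a scroll_id used to fetch the next batch:

        POST /my_index/_search?scroll=1m
        { "size": 1000, "sort": [ "_doc" ], "query": { "match_all": {} } }

        POST /_search/scroll
        { "scroll": "1m", "scroll_id": "<scroll_id from the previous response>" }
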
  56. Limit user input
      • Search offset
      • Number of rows requested
      • Number of search terms
      • Total length of the search string

  57. Accuracy vs. speed trade-offs with sharding
      • Some defaults reduce accuracy
      • Need more accurate scoring? (example below)
        • Set search_type to dfs_query_then_fetch
        • But: one more round trip
        • What is accurate scoring anyway? (e.g., deleted documents)
      • Need more accurate counts in aggregations?
        • Set shard_size higher than size
        • But: more work for each shard
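
      Both knobs in one request for illustration (not on the slides; names are placeholders):

        POST /my_index/_search?search_type=dfs_query_then_fetch
        {
          "query": { "match": { "content": "performance" } },
          "aggs": { "authors": { "terms": { "field": "author", "size": 10, "shard_size": 50 } } }
        }
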
  58. Optimize indices?
      • Force Merge API (aka Optimize) (example below)
      • Turning 20 segments into 1 can be highly beneficial
      • But: merges will invalidate caches
      • Most useful for indices that are not modified anymore
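
      For example (not on the slide; the index name is made up), merging a no-longer-written daily index down to a single segment:

        POST /logs-2016-03-20/_forcemerge?max_num_segments=1
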
  59. Caches
      • Page cache (OS)
      • Node query cache
      • Shard request cache
        • Disabled by default (example below)
      • Field data cache
        • Not as relevant as it used to be
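
      The shard request cache can be enabled per index (not on the slide; assuming the 2.x setting name; "my_index" is a placeholder):

        PUT /my_index/_settings
        { "index.requests.cache.enable": true }
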
  60. Slow queries
      • Profile API (example below)
        • Detailed timing analysis for a query
      • Slow log
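
      Profiling only requires one extra flag in the search body (not on the slide; names are placeholders):

        POST /my_index/_search
        { "profile": true, "query": { "match": { "content": "performance" } } }
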
  61. Misc

  62. JVM
      • Java: avoid blacklisted versions
      • GC: use CMS
      • Heap size
        • Measure how much is needed
        • No more than roughly 30 GB (enables pointer compression)
      • Number of processors? (example below)
        • Set JVM options (defaults are based on the OS’s virtual processors)
        • Set the Elasticsearch processors configuration property
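
      For illustration (values are examples, not recommendations; assuming Elasticsearch 2.x conventions):

        # environment variable read by the 2.x startup scripts
        ES_HEAP_SIZE=16g

        # elasticsearch.yml: processor count used to size thread pools
        processors: 8
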
  63. Hardware resources
      • DRAM: the more, the better
        • The page cache is crucial for performance
      • Disk: local SSD is best
      • CPU: 8 cores are nice
      • Consider separating into hot and cold nodes
        • Set up allocation constraints (sketch below)
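
      A hot/cold allocation sketch (not on the slides; the attribute name "box_type" is a common convention, not an Elasticsearch built-in): tag nodes with an attribute and pin indices to it:

        # elasticsearch.yml on a hot node
        node.box_type: hot

        PUT /logs-2016-04-20/_settings
        { "index.routing.allocation.require.box_type": "hot" }
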
  64. There is more
      • Monitoring: check out the API
      • Detailed case studies: lots of examples on the web
        • Configuration may differ between Elasticsearch versions
      • The official channels (GitHub, forum, documentation) are great
      • Outdated (mostly pre-2.x) topics: field data without doc_values, filters vs. queries, scan+scroll, split brain, unnecessary recovery, cluster state without diffs

  65. Questions? Dr. Patrick Peschlow, Head of Development - CenterDevice
      codecentric AG, Merscheider Straße 1, 42699 Solingen, Germany
      tel: +49 (0) 212.23362854, fax: +49 (0) 212.23362879
      patrick.peschlow@codecentric.de, www.codecentric.de, blog.codecentric.de