Elastic{ON} 2018 - What's Evolving in Elasticsearch

Elastic Co

March 01, 2018
Transcript

  1. What’s Evolving in Elasticsearch

    Clinton Gormley & Simon Willnauer (@clintongormley, @s1m0nw), Team and Tech Leads, Elasticsearch. 28 February 2018.
  2. Disk on a Diet

    • Sparse doc values: only pay for what you use
    • Removed _all field, replaced by "default_fields": ["*"]
  3. Sample Metricbeat Dataset: apples-to-apples index size improvements

    [Bar chart, axis 0 to 100: 5.x, 6.0 OOTB, _all disabled]
  4. Data Rollups (coming soon to X-Pack)

    Flexible bucketing and filtering by time, histograms, and terms.
    [Diagram: hosts prod-1.myco.com to prod-5.myco.com; groupings: Date Histogram, Histogram, Terms]
  5. Data Rollups (coming soon to X-Pack)

    Uses the new composite aggregation:
    • Paginate through all buckets of a multi-level aggregation
    • Accurate term counts
    • Sorted by “natural” order
    [Diagram: hosts prod-1.myco.com to prod-5.myco.com]
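The pagination described on this slide can be sketched in Python as follows. This is a minimal illustration, not the rollup feature's own code: the aggregation name `rollup`, the `host` field, the page size, and the in-memory `fake_search` stand-in for a cluster are all assumptions made for the example. The request shape (`composite`, `sources`, `after`) follows the composite aggregation's request body, and each page resumes from the key of the last bucket already seen.

```python
from typing import Callable, Dict, Iterator, Optional


def composite_request(after: Optional[Dict] = None, page_size: int = 2) -> Dict:
    """Build a search body with a two-level composite aggregation."""
    composite: Dict = {
        "size": page_size,
        "sources": [
            {"day": {"date_histogram": {"field": "@timestamp", "interval": "1d"}}},
            {"host": {"terms": {"field": "host"}}},  # "host" is a hypothetical field
        ],
    }
    if after is not None:
        composite["after"] = after  # resume after the last bucket key seen
    return {"size": 0, "aggs": {"rollup": {"composite": composite}}}


def all_buckets(search: Callable[[Dict], Dict]) -> Iterator[Dict]:
    """Yield every bucket by requesting page after page until one is empty."""
    after = None
    while True:
        response = search(composite_request(after))
        buckets = response["aggregations"]["rollup"]["buckets"]
        if not buckets:
            return
        yield from buckets
        after = buckets[-1]["key"]


# Hypothetical pre-sorted buckets standing in for a real index.
FAKE_BUCKETS = [
    {"key": {"day": 1519776000000, "host": f"prod-{i}.myco.com"}, "doc_count": i}
    for i in range(1, 6)
]


def fake_search(body: Dict) -> Dict:
    """Stand-in for a cluster: serves FAKE_BUCKETS one page at a time."""
    composite = body["aggs"]["rollup"]["composite"]
    after = composite.get("after")
    start = 0
    if after is not None:
        start = 1 + next(i for i, b in enumerate(FAKE_BUCKETS) if b["key"] == after)
    page = FAKE_BUCKETS[start:start + composite["size"]]
    return {"aggregations": {"rollup": {"buckets": page}}}


hosts = [bucket["key"]["host"] for bucket in all_buckets(fake_search)]
```

With a real client, `fake_search` would be replaced by a function that sends the body to the search endpoint and returns the parsed response.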
  6. Data Rollups (coming soon to X-Pack)

    Flexible bucketing and filtering by time, histograms, and terms.
    [Diagram fields: @timestamp, datacenter, url.path]
  9. Data Rollups (coming soon to X-Pack)

    Flexible bucketing and filtering by time, histograms, and terms.
    [Diagram fields: @timestamp, datacenter, url.path; annotated "group by"]
  11. Data Rollups (coming soon to X-Pack)

    Flexible bucketing and filtering by time, histograms, and terms.
    [Diagram fields: @timestamp, datacenter, url.path; annotated "filter by"]
  12. Data Rollups (coming soon to X-Pack)

    The more data you have, the more space you save, easily 90%+.
    [Diagram: Raw data]
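To make the idea of a rollup concrete, here is a small Python sketch that groups raw events by time bucket, datacenter, and url.path and keeps only summary values per group. It is an illustration under stated assumptions, not the X-Pack implementation: the one-hour interval, the `bytes` metric field, and the output document shape are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

HOUR_MS = 60 * 60 * 1000  # assumed bucket width: one hour in milliseconds


def roll_up(events: Iterable[Dict], interval_ms: int = HOUR_MS) -> List[Dict]:
    """Collapse raw events into one summary document per group."""
    groups: Dict = defaultdict(lambda: {"doc_count": 0, "bytes_sum": 0})
    for event in events:
        # Round the timestamp down to the start of its time bucket.
        bucket_start = event["@timestamp"] - event["@timestamp"] % interval_ms
        key = (bucket_start, event["datacenter"], event["url.path"])
        groups[key]["doc_count"] += 1
        groups[key]["bytes_sum"] += event["bytes"]  # "bytes" is a hypothetical metric
    return [
        {
            "@timestamp": key[0],
            "datacenter": key[1],
            "url.path": key[2],
            **groups[key],
        }
        for key in sorted(groups)
    ]


raw = [
    {"@timestamp": 0, "datacenter": "east", "url.path": "/a", "bytes": 10},
    {"@timestamp": 1000, "datacenter": "east", "url.path": "/a", "bytes": 20},
    {"@timestamp": 2000, "datacenter": "east", "url.path": "/a", "bytes": 30},
    {"@timestamp": HOUR_MS, "datacenter": "east", "url.path": "/a", "bytes": 5},
]
rolled = roll_up(raw)
```

Here four raw events become two summary documents. The saving grows with the number of raw events that fall into each group, which is the effect the slide describes.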
  13. Scalable Cross Cluster Search

    Search across two major versions.
    [Diagram: Kibana and Elasticsearch clusters labeled 5.latest, 6.latest, and 7.x]
  14. Improved Search Scalability

    Searches across many shards are more scalable:
    • Fast pre-check phase excludes any shards that can’t match the query.
    • Limits on the number of shards searched in parallel, so that a single query cannot dominate the cluster.
    • Batched reduction of results reduces memory usage on the coordinating node.
    [Diagram: a multi-shard search request over Shard 1 ... Shard N, narrowed to the subset of shards containing results]
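The third bullet, batched reduction, can be illustrated with a short Python sketch. It shows the general technique only and is not Elasticsearch's coordinating-node code: the `Hit` shape, the function names, and the batch size are assumptions. Instead of holding every shard's results until the end, the coordinator merges them a few shards at a time and keeps only the running top N.

```python
import heapq
from typing import Iterable, List, Tuple

Hit = Tuple[float, str]  # (score, document id); a hypothetical result shape


def _merge(best: List[Hit], buffered: List[List[Hit]], top_n: int) -> List[Hit]:
    """Fold buffered shard results into the running top-N list."""
    candidates = best + [hit for hits in buffered for hit in hits]
    return heapq.nlargest(top_n, candidates)


def batched_reduce(shard_results: Iterable[List[Hit]], top_n: int,
                   batch_size: int) -> List[Hit]:
    """Reduce per-shard hits to a global top N, batch_size shards at a time."""
    best: List[Hit] = []
    buffered: List[List[Hit]] = []
    for hits in shard_results:
        buffered.append(hits)
        if len(buffered) == batch_size:
            best = _merge(best, buffered, top_n)
            buffered = []  # memory for the merged batch can now be released
    if buffered:
        best = _merge(best, buffered, top_n)
    return best


shards = [
    [(3.0, "a"), (1.0, "b")],
    [(2.5, "c")],
    [(4.0, "d"), (0.5, "e")],
]
top_two = batched_reduce(shards, top_n=2, batch_size=2)
```

At any moment the function holds at most `batch_size` shard responses plus `top_n` hits, regardless of how many shards the request touches.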
  15. Index Sorting: much speedier sorted queries

    • Sort at index time vs. query time
    • Optimize on-disk format for some use cases
    • Improve query performance at the cost of index performance
    5.x, query for top 3 player scores: Player 1 (600), Player 2 (0), Player 3 (200), Player 4 (700), Player 5 (300), ..., Player 1907 (800)
    6.x, query for top 3 player scores: Player 1907 (800), Player 4 (700), Player 1 (600), Player 5 (300), Player 3 (200), Player 2 (0)
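The difference between the two layouts on this slide can be sketched in Python using the slide's own player scores. This is an illustration of the general idea (early termination on a pre-sorted index), not Lucene's implementation, and the function names are made up for the example. With an unsorted index every document must be visited to find the top 3. With an index sorted by score at write time, the query can stop after the first 3.

```python
import heapq
from typing import List, Tuple

Doc = Tuple[str, int]  # (player, score)

# Index order as shown for 5.x on the slide.
UNSORTED: List[Doc] = [
    ("Player 1", 600), ("Player 2", 0), ("Player 3", 200),
    ("Player 4", 700), ("Player 5", 300), ("Player 1907", 800),
]
# Index order as shown for 6.x: sorted by score, descending, at index time.
PRESORTED: List[Doc] = sorted(UNSORTED, key=lambda doc: doc[1], reverse=True)


def top_n_unsorted(docs: List[Doc], n: int) -> Tuple[List[Doc], int]:
    """Visit every document, keeping the n best in a min-heap."""
    heap: List[Tuple[int, str]] = []
    visited = 0
    for name, score in docs:
        visited += 1
        heapq.heappush(heap, (score, name))
        if len(heap) > n:
            heapq.heappop(heap)  # drop the current worst
    top = [(name, score) for score, name in sorted(heap, reverse=True)]
    return top, visited


def top_n_presorted(docs: List[Doc], n: int) -> Tuple[List[Doc], int]:
    """Stop as soon as n documents have been read."""
    top = docs[:n]
    return top, len(top)


slow_top, slow_visited = top_n_unsorted(UNSORTED, 3)
fast_top, fast_visited = top_n_presorted(PRESORTED, 3)
```

Both calls return the same three players, but the second reads 3 documents instead of 6. The cost moves to index time, where the documents have to be kept in sorted order.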
  16. SQL Client (coming soon to X-Pack)

    SELECT course, avg(age), count(*) FROM mytable WHERE match(uni, "oxford") GROUP BY course HAVING avg(age) > 18 ORDER BY course, avg(age)
  17. SQL Client (coming soon to X-Pack)

    CLI, JDBC, Kibana Canvas, SQL over REST: GET /_sql {}
  18. SQL Client (coming soon to X-Pack)

    CLI, JDBC, ODBC, Kibana Canvas, SQL over REST: GET /_sql {}
  19. Faster Prefix Queries: index prefixes for faster querying

    "title": { "type": "text", "index_prefix": { "min_chars": 2, "max_chars": 6 } }
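Why indexing prefixes speeds up prefix queries can be shown with a small Python sketch. It is a toy inverted index, not the Elasticsearch mapping feature itself: the function names and sample titles are assumptions, while `min_chars` 2 and `max_chars` 6 are taken from the slide. Each term's prefixes are stored at index time, so a prefix query within that length range becomes a single dictionary lookup instead of a scan over all terms.

```python
from collections import defaultdict
from typing import Dict, List, Set


def term_prefixes(term: str, min_chars: int = 2, max_chars: int = 6) -> List[str]:
    """Return the prefixes of term with lengths min_chars..max_chars."""
    longest = min(max_chars, len(term))
    return [term[:n] for n in range(min_chars, longest + 1)]


def build_prefix_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every indexed prefix to the ids of documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, title in docs.items():
        for term in title.lower().split():
            for prefix in term_prefixes(term):
                index[prefix].add(doc_id)
    return index


titles = {  # hypothetical sample documents
    1: "Elasticsearch in action",
    2: "Elastic stack overview",
    3: "Lucene internals",
}
prefix_index = build_prefix_index(titles)
matches = prefix_index.get("elast", set())
```

The trade-off is a larger index: each term contributes up to five extra entries with these settings.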
  20. Faster Phrase and Prefix Queries: index shingles for faster phrase queries

    "match_phrase_prefix": { "title": "phrase and pref*" }
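A shingle is a run of adjacent words indexed as a single term. The Python sketch below illustrates why that helps phrase queries. It is a simplified model, not the Elasticsearch analyzer: it uses two-word shingles, whitespace tokenization, and ignores positions, so it is exact for two-word phrases and only a candidate check for longer ones.

```python
from typing import List


def shingles(tokens: List[str], size: int = 2) -> List[str]:
    """Return every run of `size` adjacent tokens joined by a space."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def phrase_matches(doc_text: str, phrase: str) -> bool:
    """Check a phrase of two or more words against a document's shingles."""
    doc_shingles = set(shingles(doc_text.lower().split()))
    wanted = shingles(phrase.lower().split())
    # Each shingle lookup replaces a positional comparison between two terms.
    return bool(wanted) and all(s in doc_shingles for s in wanted)


title = "Index shingles for faster phrase queries"
title_shingles = shingles(title.lower().split())
```

A two-word phrase such as "faster phrase" then resolves to one term lookup, at the cost of indexing one extra term per adjacent word pair.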
  21. Faster Top-N Queries

    GET /_search { "size": 2, "query": { "match": { "text": "quick brown fox" } } }
  22. Faster Top-N Queries

    GET /_search { "size": 2, "query": { "match": { "text": "quick brown fox" } } }
    Collect all docs because:
    • Aggregations
    • Total hits
    • To find top-N documents
  24. Faster Top-N Queries: when total hits and aggregations are not required

    GET /_search { "size": 2, "query": { "match": { "text": "quick brown fox" } } }
    DOC  quick  brown  fox  total
    1    2      3      1    6
    2    2      3      0    5
    3    0      3      1    4
    4    2      0      1    3
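The table on this slide can be used to sketch how a top-N query may skip documents once total hits and aggregations are not needed. The Python below is a simplified illustration of score-upper-bound pruning, not Lucene's algorithm: the `postings` layout and function name are assumptions, and the per-term scores are copied from the slide. A document is fully scored only if the best score it could possibly reach beats the current Nth best.

```python
import heapq
from typing import Dict, List, Tuple

# term -> {doc id: score}, taken from the slide's table (zero entries omitted).
postings: Dict[str, Dict[int, int]] = {
    "quick": {1: 2, 2: 2, 4: 2},
    "brown": {1: 3, 2: 3, 3: 3},
    "fox": {1: 1, 3: 1, 4: 1},
}


def top_n(postings: Dict[str, Dict[int, int]], n: int) -> Tuple[List[Tuple[int, int]], int]:
    """Return the n best (score, doc) pairs and how many docs were fully scored."""
    max_score = {term: max(docs.values()) for term, docs in postings.items()}
    all_docs = sorted({doc for docs in postings.values() for doc in docs})
    heap: List[Tuple[int, int]] = []  # min-heap of (score, doc)
    scored = 0
    for doc in all_docs:
        terms = [term for term, docs in postings.items() if doc in docs]
        upper_bound = sum(max_score[term] for term in terms)
        if len(heap) == n and upper_bound <= heap[0][0]:
            continue  # cannot beat the current n-th best, skip scoring
        scored += 1
        score = sum(postings[term][doc] for term in terms)
        if len(heap) < n:
            heapq.heappush(heap, (score, doc))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc))
    return sorted(heap, reverse=True), scored


best, fully_scored = top_n(postings, 2)
```

With "size": 2, documents 1 and 2 fill the top two with totals 6 and 5. Documents 3 and 4 can reach at most 4 and 3, so they are skipped without being scored.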
  25. More Search Features

    • Scripted Search Similarity
    • Significant Text Aggregation
    • Ranking Evaluation API
    • Korean Analyzer
    • Nano-second timestamps
  26. Attribute-Based Access Control (X-Pack feature)

    { "attrs": ["team:finance", "country:usa"] … }
    { "attrs": ["team:finance", "country:usa", "clearance:secret"] … }
    All attributes must be present.
  27. Audit Log Events Ignore Policies (X-Pack feature)

    Fine-grained filtering of the security audit log:
    xpack.security.audit.logfile.events.ignore_filters:
      ignore_bulk_logging:
        users: ["beats"]
        indices: ["filebeat*", "metricbeat*"]
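How such an ignore policy might be evaluated can be sketched in Python. This is an illustration under assumptions, not the X-Pack code: the event shape and function names are hypothetical, and the matching rule used here (an event is ignored when its user matches the policy and every index it touches matches one of the policy's patterns) is my reading of the feature rather than something stated on the slide. The policy values are the slide's.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# The policy from the slide, expressed as a Python dict.
POLICIES: Dict[str, Dict[str, List[str]]] = {
    "ignore_bulk_logging": {
        "users": ["beats"],
        "indices": ["filebeat*", "metricbeat*"],
    }
}


def matches_any(value: str, patterns: List[str]) -> bool:
    """True if value matches at least one wildcard pattern."""
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_ignored(event: Dict, policies: Dict = POLICIES) -> bool:
    """True if the audit event satisfies every condition of some policy."""
    for policy in policies.values():
        user_ok = "users" not in policy or matches_any(event["user"], policy["users"])
        indices_ok = "indices" not in policy or all(
            matches_any(index, policy["indices"]) for index in event["indices"]
        )
        if user_ok and indices_ok:
            return True
    return False


bulk_event = {"user": "beats", "indices": ["filebeat-2018.02.28"]}
mixed_event = {"user": "beats", "indices": ["filebeat-2018.02.28", "orders"]}
other_user = {"user": "alice", "indices": ["metricbeat-2018.02.28"]}
```

Under this reading, routine Beats bulk traffic is dropped from the audit log, while the same user touching any other index is still recorded.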
  28. Distributed watch execution (X-Pack feature, Gold)

    • Watches are no longer executed only on the master node
    • They are executed on nodes which hold shards of the .watches index
    • Configure all or specific nodes dedicated to watch execution
  29. Cluster Protection: Circuit Breakers and Soft Limits

    • Circuit breakers track memory usage, now including Lucene’s requirements.
    • Soft limits prevent users from running dangerous requests and help admins protect their clusters from unwitting users.
    • Added limits on highlighting, terms query, n-gram and shingle analysers, nested docs
  30. Ops-Based Recovery (6.0)

    [Diagram: Primary with operations 1 2 3 4 5 6 7; Replica with operations 1 2 3 4 5 6 7]
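The idea behind ops-based recovery can be sketched in Python: a replica that fell behind is caught up by replaying only the operations it is missing, identified by sequence number, rather than by copying whole files. This is a conceptual illustration, not Elasticsearch's recovery code: the dict-based operation log and function names are assumptions, and sequence numbers start at 1 here only to mirror the slide's diagram.

```python
from typing import Dict, List


def local_checkpoint(ops: Dict[int, str]) -> int:
    """Highest sequence number up to which no operation is missing."""
    checkpoint = 0
    while checkpoint + 1 in ops:
        checkpoint += 1
    return checkpoint


def recover(primary: Dict[int, str], replica: Dict[int, str]) -> List[int]:
    """Replay onto the replica every primary operation above its checkpoint."""
    checkpoint = local_checkpoint(replica)
    missing = [seq for seq in sorted(primary) if seq > checkpoint]
    for seq in missing:
        replica[seq] = primary[seq]
    return missing


primary_ops = {seq: f"op-{seq}" for seq in range(1, 8)}   # operations 1..7
replica_ops = {seq: f"op-{seq}" for seq in range(1, 5)}   # replica stopped at 4
replayed = recover(primary_ops, replica_ops)
```

Only operations 5 to 7 cross the network in this example, however large the underlying index is.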
  31. Index Lifecycle Management (coming soon to X-Pack)

    Hot Phase - Index to my-logs-write, Search on my-logs-read
    [Diagram: Hot Nodes, Warm Nodes, Cold Nodes; shard labels 1 2 3]
  32. Index Lifecycle Management (coming soon to X-Pack)

    Hot Phase - Rollover
    [Diagram: Hot Nodes, Warm Nodes, Cold Nodes; shard labels 1 2 3, 1 2 3]
  33. Index Lifecycle Management (coming soon to X-Pack)

    Warm Phase - Allocate
    [Diagram: Hot Nodes, Warm Nodes, Cold Nodes; shard labels 1 2 3, 1 2 3]
  34. Index Lifecycle Management (coming soon to X-Pack)

    Warm Phase - Shrink
    [Diagram: Hot Nodes, Warm Nodes, Cold Nodes; shard labels 2 3, 1 2 3, 1]
  35. Index Lifecycle Management (coming soon to X-Pack)

    Warm Phase - Compress
    [Diagram: Hot Nodes, Warm Nodes, Cold Nodes; shard labels 1, 1 2 3]
  36. Index Lifecycle Management (coming soon to X-Pack)

    Cold Phase - Allocate
    [Diagram: Hot Nodes, Warm Nodes, Cold Nodes; shard labels 1, 1 2 3]
  37. Index Lifecycle Management (coming soon to X-Pack)

    Delete Phase
    [Diagram: Hot Nodes, Warm Nodes, Cold Nodes; shard labels 1 2 3, 1]
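The phase sequence walked through on slides 31 to 37 can be summarized in a short Python sketch. It is a toy model, not the Index Lifecycle Management API: the age thresholds in days are invented for illustration, and only the phase names and the actions listed per phase come from the slides.

```python
from typing import List, Tuple

# (phase, minimum index age in days, actions). The ages are hypothetical.
POLICY: List[Tuple[str, int, List[str]]] = [
    ("hot", 0, ["rollover"]),
    ("warm", 7, ["allocate", "shrink", "compress"]),
    ("cold", 30, ["allocate"]),
    ("delete", 90, ["delete"]),
]


def phase_for(age_days: int) -> Tuple[str, List[str]]:
    """Return the latest phase whose minimum age the index has reached."""
    current = POLICY[0]
    for phase in POLICY:
        if age_days >= phase[1]:
            current = phase
    return current[0], current[2]


phase, actions = phase_for(10)
```

An index written today sits on hot nodes. As it ages it is moved to warm nodes, shrunk and compressed, then moved to cold nodes, and finally deleted.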
  38. Zen Discovery v2: automatic management of master nodes

    minimum_master_nodes = 3
    [Diagram: four Master Eligible Nodes]
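The value on this slide follows from the usual majority rule for master-eligible nodes: a quorum is more than half of them. The Python sketch below shows the arithmetic that today has to be set by hand and that automatic management would take over. The function names are made up for the example.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many master-eligible nodes can be lost while keeping a quorum."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


slide_value = quorum(4)  # four master-eligible nodes, as in the diagram
```

For the four nodes in the diagram this gives 3, matching minimum_master_nodes = 3. Note that four nodes tolerate only one failure, the same as three nodes.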
  39. Other Talks You Should See

    • Elasticsearch Consensus: The Past, the Present, and the Future. Boaz Leskes, Jason Tedor, David Turner, Yannick Welsch. Wednesday, 3:30pm
    • Get the Lay of the Lucene Land. Adrien Grand. Wednesday, 10:30am
    • The State of Geo in Elasticsearch. Nick Knize, Thomas Neirynck. Thursday, 9:30am
    • The State of the Elasticsearch Java Client. Nik Everett. Thursday, 2:30pm
    • Elasticsearch SQL. Costin Leau. Thursday, 3:30pm
  40. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/

    Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders. Please attribute Elastic with a link to elastic.co