Two Years of Elasticsearch in Development and Production

Presentation held at the Elasticsearch User Group Berlin on March 31, 2015.

Patrick Peschlow

March 31, 2015

Transcript

  1. codecentric AG, Patrick Peschlow: Two Years of Elasticsearch in Development and Production
  2. Mapping
    − Disable the _all field (unless you really need it)
    − Prefer _source over stored fields
      − _source is useful anyway (for updates, reindexing, highlighting)
    − Only analyze/compute what you need
      − not_analyzed, field norms, term frequencies and positions
    − Be careful with dynamic mapping and dynamic templates
      − Can lead to undesired fields or types in the index
      − Can considerably grow the cluster state
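
The mapping advice on this slide can be sketched as a request body. This is a minimal sketch in the ES 1.x mapping syntax that was current when the talk was given; the type name `article` and all field names are hypothetical and only serve as illustration.

```python
import json

# Hypothetical "article" type; every field name here is an assumption.
mapping = {
    "article": {
        "_all": {"enabled": False},   # do not build the catch-all field
        "dynamic": "strict",          # reject unmapped fields instead of adding them
        "properties": {
            # analyzed full-text field with default settings
            "title": {"type": "string"},
            # exact-value field: no analysis needed
            "status": {"type": "string", "index": "not_analyzed"},
            # full text that is only matched, never scored by length or phrase
            "body": {
                "type": "string",
                "norms": {"enabled": False},  # skip field norms
                "index_options": "docs",      # skip term frequencies and positions
            },
        },
    }
}


def mapping_body() -> str:
    """Serialize the mapping as it would be sent in a put-mapping request."""
    return json.dumps(mapping, indent=2, sort_keys=True)


print(mapping_body())
```

Dropping positions on `body` means phrase queries on that field no longer work, so this setting only fits fields where that is acceptable.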
  3. Queries
    − Pagination
      − Don’t load too many results with a single query
      − Avoid deep pagination
      − Consider using the scan+scroll API when you don’t need sorting
    − Think about index-time vs. query-time solutions
      − Prefix query vs. edge ngrams?
      − Sorting via script vs. indexing another field?
      − Don’t be afraid to index a source field twice
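
The index-time vs. query-time trade-off for prefix matching can be illustrated in plain Python. This is a toy model, not Elasticsearch's implementation: `edge_ngrams`, `build_index` and `prefix_search` are hypothetical helpers, and the gram limits are assumed values. All prefixes are computed while indexing, so a prefix search becomes a single exact lookup.

```python
def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 10) -> list[str]:
    """Return the prefixes of `term` that are min_gram..max_gram characters long."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Index-time work: map every edge ngram of every word to document IDs."""
    index: dict[str, set[int]] = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            for gram in edge_ngrams(word):
                index.setdefault(gram, set()).add(doc_id)
    return index


def prefix_search(index: dict[str, set[int]], prefix: str) -> set[int]:
    """Query-time work: one exact term lookup, no term enumeration."""
    return index.get(prefix.lower(), set())


docs = {
    1: "Elasticsearch in production",
    2: "Elastic clusters",
    3: "Production stories",
}
index = build_index(docs)
print(sorted(prefix_search(index, "elast")))  # → [1, 2]
print(sorted(prefix_search(index, "prod")))   # → [1, 3]
```

The price is a larger index, and in this sketch a query longer than `max_gram` characters finds nothing, which is the kind of detail worth testing before choosing the index-time route.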
  4. Filters and Caching
    − Use filters for yes/no criteria that don’t need scoring
      − In contrast to queries, filter results can be cached
    − Tricky caching behavior
      − Some filters are cached by default, others not (depends on cost)
      − Caching may also depend on how often filters are used
      − Pay special attention to compound filters
    − Possible to override caching behavior and cache key
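
A request that combines a scored query with an unscored filter might look like the following. This is a sketch in the ES 1.x request syntax (`filtered` query, `_cache` and `_cache_key` options) that was current at the time of the talk; the helper `filtered_search` and the field names `title` and `status` are hypothetical.

```python
import json


def filtered_search(text, status, cache=None, cache_key=None):
    """Build an ES 1.x style search body: match query plus term filter.

    `cache` and `cache_key` are only added when given, so the default
    caching behavior stays untouched otherwise.
    """
    term_filter = {"status": status}
    if cache is not None:
        term_filter["_cache"] = cache          # override default caching
    if cache_key is not None:
        term_filter["_cache_key"] = cache_key  # override the cache key
    return {
        "query": {
            "filtered": {
                "query": {"match": {"title": text}},  # scored part
                "filter": {"term": term_filter},      # yes/no part, no scoring
            }
        }
    }


body = filtered_search("elasticsearch", "published", cache=False)
print(json.dumps(body, indent=2))
```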
  5. Filters and Ordering
    − Elements of bool filters are executed sequentially
      − Place more selective filters first
    − Consider using "accelerator" filters
      − Redundant filters that reduce work for heavyweight filters
    − Learn about possible "strategy" settings for filtered queries
      − Controls how filter and query parts are interleaved
      − Measure, don’t guess
    − Note: With ES 2.0 queries and filters might get unified
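
Why ordering matters can be shown with a toy model of sequential filter evaluation. This is plain Python and not how Elasticsearch executes filters internally; `run_filters` and both predicates are hypothetical. It counts how often each filter is invoked when the selective filter comes first versus last.

```python
def run_filters(docs, filters):
    """Apply filters in order with short-circuiting.

    Returns the matching documents and the number of calls per filter.
    """
    calls = [0] * len(filters)
    matches = []
    for doc in docs:
        for i, predicate in enumerate(filters):
            calls[i] += 1
            if not predicate(doc):
                break  # later filters never see this document
        else:
            matches.append(doc)
    return matches, calls


docs = [{"id": i, "tenant": i % 100, "score": i % 7} for i in range(10_000)]


def selective(doc):
    """Cheap filter that keeps 1% of the documents."""
    return doc["tenant"] == 42


def heavyweight(doc):
    """Stand-in for an expensive filter (for example a script or geo filter)."""
    return doc["score"] in (1, 3, 5)


matches_a, calls_a = run_filters(docs, [selective, heavyweight])
matches_b, calls_b = run_filters(docs, [heavyweight, selective])
print(calls_a)  # → [10000, 100]
print(calls_b)  # heavyweight filter now runs on all 10000 documents
```

Both orders return the same documents; only the amount of work for the heavyweight filter differs, which is also the idea behind a redundant "accelerator" filter placed in front of it.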
  6. Analysis Tooling
    − Use the search/explain feature (score computation)
    − Use the validate/explain feature (query rewriting, cache usage)
    − Make sure your analyzers work correctly
      − Use the analyze API
      − Check out the "inquisitor" and "extended-analyze" plugins
    − When in doubt, take a look at the terms in your index
      − http://rosssimpson.com/blog/2014/05/06/using-luke-with-elasticsearch/
      − "skywalker" plugin
  7. Replication and Search Preference
    − With replicas, we can get different results for the same search
      − Searches are routed to replicas in "round robin" fashion
      − Deleted documents still affect scoring
      − Segment merging (physical deletion) can differ among replicas
    − Solution: Use the search "preference" parameter
      − For consistent results by user, choose user ID as preference
    [Figure labels: doc1 doc2 doc3 doc4 doc1 doc2]
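
Passing the user ID as search preference could be sketched as follows. The helper `search_url`, the base URL, the index name and the `user-` prefix are assumptions for illustration; `preference` itself is the search request parameter named on the slide.

```python
from urllib.parse import urlencode


def search_url(base_url: str, index: str, user_id: int) -> str:
    """Build a search URL that pins one user to the same shard copies."""
    # Custom preference values must not start with "_", since that prefix
    # is reserved for built-in values such as _primary or _local.
    params = urlencode({"preference": f"user-{user_id}"})
    return f"{base_url}/{index}/_search?{params}"


print(search_url("http://localhost:9200", "articles", 4711))
# → http://localhost:9200/articles/_search?preference=user-4711
```

Different users still see different shard copies, so load keeps being spread, but each individual user gets stable scoring across repeated searches.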
  8. Aggregations (Facets)
    − Load aggregations as lazily as possible
      − Do you really need to offer all of them on the UI right away?
      − Can you hide some less relevant ones by default?
    − Only load aggregations once when retrieving paginated results
      − Consider not requesting them again when just switching the page
      − They likely stay the same
    − Many aggregations use approximation algorithms
      − Don’t expect results to be 100% exact
  9. Field Data
    − Some operations require document field data
      − Sorting, aggregation, parent-child queries, some scripts
    − Field data is usually loaded for all documents
      − Leads to high memory consumption or OutOfMemoryError
    − Use "doc values": Store field data on the file system
      − Let the OS do the caching
      − Can be enabled on a per-field basis
    − Note: With ES 2.0 "doc values" might become the default
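
Enabling doc values per field might be sketched like this, assuming the ES 1.x mapping option `doc_values` and hypothetical field names. The helper `with_doc_values` is not part of any client library; it refuses analyzed string fields, for which doc values were not available in ES 1.x.

```python
def with_doc_values(properties: dict, fields: list[str]) -> dict:
    """Return a copy of the mapping properties with doc values enabled
    on the given fields (the input mapping is left unchanged)."""
    updated = {name: dict(spec) for name, spec in properties.items()}
    for name in fields:
        spec = updated[name]
        if spec.get("type") == "string" and spec.get("index") != "not_analyzed":
            raise ValueError(f"doc values need a not_analyzed string: {name}")
        spec["doc_values"] = True
    return updated


# Hypothetical fields used for sorting and aggregations.
properties = {
    "title": {"type": "string"},
    "category": {"type": "string", "index": "not_analyzed"},
    "created": {"type": "date"},
}

print(with_doc_values(properties, ["category", "created"]))
```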
  10. Unit/Integration Testing
    − Set up a comprehensive test suite
      − Test expectations about matches
      − Prevent regressions when changing or modifying analyzers
    − The Elasticsearch Java client is embeddable
      − No mocks or test doubles needed
    − Try it by solving the "mapping challenge"
      − https://github.com/peschlowp/elasticsearch-mapping-challenge
  11. Indexing and Real-Time Requirements
    − Default refresh interval: 1 second
      − Targeted at human users
    − What if API clients want RYOW (read-your-own-writes) semantics for search?
      − Refresh after every request?
    − Recommendation: Leave RYOW to the primary database, if at all
      − Provide a separate API if needed
  12. Bulk Indexing
    − For optimum bulk size, consider document size, not count
    − Be careful with merge throttling
      − Elasticsearch might throttle indexing anyway
      − Look out for "now throttling indexing" log messages
      − Is it worth it?
    − Decrease refresh rate (or disable completely)
    − Reduce number of replicas (or set to zero)
      − Add missing replicas later, much cheaper than "live" replication
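
Sizing bulk requests by bytes rather than by document count can be sketched as a small generator. The helper `bulk_chunks`, the index and type names, and the 5 MB default are assumptions, not recommendations from the talk; the right limit has to be measured. The output follows the newline-delimited bulk format (one action line, one source line per document).

```python
import json


def bulk_chunks(docs, index, doc_type, max_bytes=5 * 1024 * 1024):
    """Yield bulk request bodies that each stay within max_bytes.

    A single document larger than max_bytes still gets its own body.
    """
    lines, size = [], 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index, "_type": doc_type}})
        source = json.dumps(doc)
        # +2 accounts for the newline that terminates each of the two lines
        entry_size = len(action.encode("utf-8")) + len(source.encode("utf-8")) + 2
        if lines and size + entry_size > max_bytes:
            yield "\n".join(lines) + "\n"
            lines, size = [], 0
        lines += [action, source]
        size += entry_size
    if lines:
        yield "\n".join(lines) + "\n"


docs = [{"id": i, "text": "x" * 100} for i in range(50)]
for body in bulk_chunks(docs, "articles", "article", max_bytes=1000):
    print(len(body.encode("utf-8")), "bytes,", body.count("\n") // 2, "documents")
```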
  13. Update API
    − Update = Delete + Add
      − Only saves network traffic
    − Even small updates might take a while
      − Consider splitting (nested documents or parent-child relationships)
    − "Partial document" update trickiness
      − Fields are replaced, except for inner objects which are merged
      − To replace inner objects, consider wrapping them in an array
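
The merge behavior of partial document updates can be modeled in a few lines. This is a plain-Python simulation of the semantics described on the slide, not the Elasticsearch implementation; `apply_partial_update` and the sample document are hypothetical.

```python
def apply_partial_update(source: dict, partial: dict) -> dict:
    """Simulate a partial document update.

    Plain fields and arrays are replaced; inner objects are merged
    recursively with the existing inner object.
    """
    result = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = apply_partial_update(result[key], value)
        else:
            result[key] = value
    return result


doc = {"title": "Old", "address": {"city": "Solingen", "zip": "42699"}}

merged = apply_partial_update(doc, {"title": "New", "address": {"city": "Berlin"}})
print(merged)
# → {'title': 'New', 'address': {'city': 'Berlin', 'zip': '42699'}}

replaced = apply_partial_update(doc, {"address": [{"city": "Berlin"}]})
print(replaced)
# → {'title': 'Old', 'address': [{'city': 'Berlin'}]}
```

In the first call the stale `zip` value survives the update, which is the surprise the slide warns about. Wrapping the inner object in an array, as in the second call, makes it a plain field that is replaced as a whole.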
  14. Cluster Settings
    − Safety
      − Choose a unique cluster name
      − Consider using unicast discovery
    − Recovery
      − gateway.recover_after_nodes
      − gateway.recover_after_time
      − gateway.expected_nodes
    − Stability
      − minimum_master_nodes
  15. Split Brain
    − Prevent split brains caused by network partitioning
      − Set minimum_master_nodes to a quorum of the master-eligible nodes
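
The quorum rule is simple arithmetic: more than half of the master-eligible nodes. A minimal sketch, with hypothetical helper names, shows the value for a few cluster sizes and which side of a network partition may still elect a master.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum: strictly more than half of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def can_elect_master(partition_size: int, master_eligible_nodes: int) -> bool:
    """True if one side of a network partition still reaches the quorum."""
    return partition_size >= minimum_master_nodes(master_eligible_nodes)


for n in (1, 2, 3, 5):
    print(n, minimum_master_nodes(n))
# → 1 1
# → 2 2
# → 3 2
# → 5 3

# Three master-eligible nodes split into groups of 2 and 1:
print(can_elect_master(2, 3), can_elect_master(1, 3))  # → True False
```

Since at most one side of a partition can hold a majority, at most one master is elected. With only two master-eligible nodes neither side of a split reaches the quorum, which is why an odd number such as three is the usual choice.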
  16. Split Brain
    − Prevent split brains when single links fail
      − Upgrade to ES 1.4.x
  17. Split Brain
    − Monitor the cluster for split brains
      − Ask each node who is master
      − Use the cat master API
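
Such a check might be sketched as follows: query the cat master API on every node and compare the answers. The node URLs are hypothetical, and the helpers `fetch_master_id` and `split_brain_suspected` are illustrative names, not part of any client library. The `h=id` parameter limits the cat output to the master's node ID.

```python
from urllib.request import urlopen


def fetch_master_id(node_url: str, timeout: float = 2.0) -> str:
    """Ask a single node which node it currently considers master."""
    with urlopen(f"{node_url}/_cat/master?h=id", timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def split_brain_suspected(master_by_node: dict[str, str]) -> bool:
    """True if the nodes do not all report the same master."""
    return len(set(master_by_node.values())) > 1


def check_cluster(node_urls: list[str]) -> bool:
    """Query every node directly and compare the reported masters."""
    answers = {url: fetch_master_id(url) for url in node_urls}
    return split_brain_suspected(answers)


# Example with hypothetical node addresses (requires a running cluster):
# check_cluster(["http://node1:9200", "http://node2:9200", "http://node3:9200"])
```

Each node has to be asked directly; a request through a load balancer would only return the view of whichever node happens to answer.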
  18. Dedicated Master Nodes
    [Diagram labels: Node 1 (master), Node 2 (master), Node 3 (master), other nodes]
  19. Distributed Search
    [Diagram labels: Client; Compute global statistics; Get local top hits; Get global top hits fields]
  20. Aggregator Nodes
    [Diagram labels: Node 1 (data), Node 2 (data), Node 3 (client); Search]
  21. Aggregator Nodes
    [Diagram labels: Node 1 (data), Node 2 (data), Node 3 (client); Indexing; preferable]
  22. Java Clients
    − NodeClient
      − Joins the cluster as a client node
      − Potentially saves a network hop
      − Will participate in distributed searches
    − TransportClient
      − More lightweight than NodeClient
    − Some HTTP Client
      − Smaller memory footprint
      − Pay attention to settings: Chunking, long-lived HTTP connections
  23. Some Stories from Production
    − The close/open gamble
    − Last resort single node
    − The devastating query
    − About upgrades
  24. Designing for Scalability
    − Think about scaling right from the start
      − Fixed number of shards per index
      − Shard key cannot be changed later
      − Distributed searches are expensive
    − Patterns in the data can be used for optimization
      − Time-based data
      − User-based data
  25. User-based Data: Separate Indexes
    [Diagram labels: Index 1, Index 2, ..., Index N; User 1, User 2, ..., User N]
    − Disadvantage: Resource consumption, larger cluster state
  26. User-based Data: Shared Index
    [Diagram labels: Shard 1, Shard 2, ..., Shard M; Search by user 1; filter by user 1]
    − Disadvantage: Distributed search
  27. User-based Data: Shared Index with Routing
    [Diagram labels: Shard 1, Shard 2, ..., Shard M; User 1 to User N placed on individual shards; Search by user 1; filter by user 1]
    − Disadvantage: At most one shard per user (capacity)
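
How routing narrows a search to one shard can be modeled with a hash of the routing key. This is a toy model: MD5 is only a deterministic stand-in and not the hash function Elasticsearch uses, and `shard_for` and `shards_to_search` are hypothetical helpers.

```python
import hashlib
from typing import Optional


def shard_for(routing_key: str, number_of_shards: int) -> int:
    """Map a routing key to a shard number (stand-in hash, see above)."""
    digest = hashlib.md5(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % number_of_shards


def shards_to_search(number_of_shards: int, routing_key: Optional[str] = None) -> list[int]:
    """Without routing every shard is queried; with routing exactly one."""
    if routing_key is None:
        return list(range(number_of_shards))
    return [shard_for(routing_key, number_of_shards)]


print(shards_to_search(5))             # → [0, 1, 2, 3, 4]
print(shards_to_search(5, "user-1"))   # a single shard number
```

The modulo over the shard count is also why the number of shards is fixed: changing it would send existing routing keys to different shards. The user filter is still needed on top of routing, because several users share each shard.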
  28. User-based Data: Aliases
    − With aliases the approach chosen can be hidden from clients
      − Aliases can even carry filter and routing information
      − Present separate "user" indexes (aliases) to the client
    − Advantage
      − Flexibility: Adapt mapping to physical indexes/shards on demand
    − Limitation
      − Huge number of users means lots of aliases (cluster state)
      − Still much better than huge number of indexes
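
A per-user alias that carries both a filter and a routing value could be built like this. The body follows the aliases API (`actions` with `add` entries); the helper names, the alias naming scheme `user_<id>`, the index name and the `user_id` field are assumptions for illustration.

```python
import json


def user_alias_action(user_id: int, index: str) -> dict:
    """One alias per user, pointing at a shared physical index."""
    return {
        "add": {
            "index": index,
            "alias": f"user_{user_id}",
            "filter": {"term": {"user_id": user_id}},  # only this user's documents
            "routing": str(user_id),                   # only this user's shard
        }
    }


def aliases_body(user_ids: list[int], index: str) -> dict:
    """Request body for the aliases API covering several users at once."""
    return {"actions": [user_alias_action(uid, index) for uid in user_ids]}


print(json.dumps(aliases_body([1, 2], "users_shared_1"), indent=2))
```

Clients then search `user_1` as if it were a dedicated index. Moving a heavy user to a dedicated index later only requires pointing the alias elsewhere.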
  29. Zero Downtime Migration
    − Possible reasons
      − Backwards-incompatible mapping changes
      − Index/shard reaches its capacity
    − Needs a lot of careful thinking
      − Especially challenging if the update API is used
  30. Questions?
    Dr. rer. nat. Patrick Peschlow
    codecentric AG
    Merscheider Straße 1
    42699 Solingen
    tel +49 (0) 212.23 36 28 54
    fax +49 (0) 212.23 36 28 79
    patrick.peschlow@codecentric.de
    www.codecentric.de