Scaling Elasticsearch Successfully

Presentation held at the data2day conference in Karlsruhe on November 27, 2014.

Patrick Peschlow

November 27, 2014

Transcript

  1. codecentric AG
    Patrick Peschlow
    Scaling Elasticsearch Successfully

  2. Cluster Basics
    [Diagram: a cluster of nodes with 1 master node, 6 data nodes, and 4 client nodes]

  3. Split Brain
    − Set minimum_master_nodes to a quorum of the master-eligible nodes
      − Prevents split brains caused by "full" network partitioning
      − But: Split brains may still occur when single links fail, e.g., due to overload
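
As a minimal sketch of the quorum rule above, assuming the usual majority formula over the master-eligible nodes (the function name is illustrative and not part of Elasticsearch):

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the smallest majority (quorum) of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 2, 3, 5):
    print(n, "->", minimum_master_nodes(n))
```

With 3 master-eligible nodes the result is 2, so a single isolated node cannot elect itself master.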

  4. Sharding
    − Enables larger indexes
    − Parallelizes/scales direct operations on individual documents
      − Index, update, delete, get
    [Diagram: Shards 1 to 3 distributed across Node 1 and Node 2]

  5. Routing
    − By default, the document _id field is used for shard key calculation
    − Can be overridden via explicit "routing"
      − For example, select the shard depending on a user ID or some document field
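
The routing idea can be sketched as "hash the key, then take it modulo the number of primary shards". The sketch below is an assumption-laden illustration: `shard_for` is a hypothetical helper, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch uses internally.

```python
import zlib
from typing import Optional


def shard_for(doc_id: str, num_primary_shards: int, routing: Optional[str] = None) -> int:
    """Pick a shard from the routing value, falling back to the document ID."""
    key = routing if routing is not None else doc_id
    # crc32 is a stand-in (assumption); Elasticsearch uses its own hash function.
    return zlib.crc32(key.encode("utf-8")) % num_primary_shards


# Without routing, each document is placed according to its own ID.
print(shard_for("doc-1", 5), shard_for("doc-2", 5))
# With routing by user ID, all documents of that user share one shard.
print(shard_for("doc-1", 5, routing="user-42"), shard_for("doc-2", 5, routing="user-42"))
```

Because the result depends on the number of primary shards, this also hints at why that number cannot simply be changed after index creation.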

  6. Distributed Search
    − Sharding (by default) implies distributed search
      − Tends to make each individual search request (much) slower
    − A single search may involve several round trips to various nodes
      1. Gather global information for more accurate scoring
      2. Perform the actual search and compute scores
      3. Retrieve the final set of documents from the relevant shards
      − In between: coordination/reduction by the node that initially received the request
    − The desired behavior may be specified on a per-request basis ("search type")
      − By default, step 1 is omitted
      − Steps 2 and 3 may be combined into one (but that’s risky with pagination)
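
The default flow (steps 2 and 3, without step 1) can be illustrated with a small in-memory simulation. Everything here is hypothetical: the `Shard` type, the precomputed scores, and the function names only mimic the coordination pattern, not the actual Elasticsearch implementation.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical in-memory "shard": document ID -> (score, source document).
Shard = Dict[str, Tuple[float, dict]]


def query_phase(shard: Shard, size: int) -> List[Tuple[float, str]]:
    """Step 2: a shard returns only scores and IDs of its local top hits."""
    return heapq.nlargest(size, ((score, doc_id) for doc_id, (score, _) in shard.items()))


def search(shards: List[Shard], size: int) -> List[dict]:
    """Coordinate a search with two round trips across all shards."""
    # Round trip 1: collect the local top hits of every shard.
    candidates = []
    for shard_no, shard in enumerate(shards):
        for score, doc_id in query_phase(shard, size):
            candidates.append((score, doc_id, shard_no))
    # Reduction on the coordinating node: keep only the global top hits.
    top = heapq.nlargest(size, candidates)
    # Round trip 2 (step 3): fetch documents only from the relevant shards.
    return [shards[shard_no][doc_id][1] for _, doc_id, shard_no in top]


shards = [
    {"a": (1.0, {"title": "A"}), "b": (3.0, {"title": "B"})},
    {"c": (2.0, {"title": "C"})},
]
print(search(shards, 2))
```

Each shard has to return `size` candidates even though most are discarded during reduction, which is one reason why distributed searches cost more than single-shard searches.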

  7. Sharding Gotcha
    − The number of shards needs to be chosen on index creation
      − No shard splitting later
    − General recommendation for determining the number of shards
      − Define metrics for a shard "capacity limit"
      − Test the capacity limit of a single shard
        − Use realistic data and workloads
      − Depending on the expected total amount of data, calculate the required number of shards
      − Overallocate a little
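
The calculation in the last two bullets might look as follows. This is only a sketch: it assumes the capacity limit is expressed as data size in GB (the slide leaves the metric open), and the 20 % default overallocation is an arbitrary illustration.

```python
import math


def required_shards(expected_total_gb: float, shard_capacity_gb: float,
                    overallocation_percent: int = 20) -> int:
    """Estimate the shard count from a measured single-shard capacity limit."""
    # Shards needed to hold the expected data at the tested capacity limit.
    base = math.ceil(expected_total_gb / shard_capacity_gb)
    # Add some headroom, because the number of shards cannot be changed later.
    extra = math.ceil(base * overallocation_percent / 100)
    return base + extra


print(required_shards(expected_total_gb=500, shard_capacity_gb=50))
```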

  8. Replication
    [Diagram: primary shards 1 to 3 and replicas 1 to 3 spread across Node 1 and Node 2]
    − Enables HA
    − Parallelizes/scales read operations
      − Get, search

  9. Consistency
    − "consistency"
      − How many shard copies need to be available to permit a write operation
      − all, quorum (default), one
    − "replication"
      − Complete a write request as soon as the primary is done
      − Or only when all replicas have acknowledged the write (translog)
      − sync (default), async
    − "preference"
      − On which shards to execute a search
      − round robin (default), local, primary, only some shards or nodes, arbitrary string
      − Helps to avoid an inconsistent user experience when scoring differs between replicas
        − May happen because documents marked for deletion still affect scoring
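
One way to use the "arbitrary string" preference is to pass a per-session value, so that repeated searches of one user hit the same shard copies. The helper below is a hypothetical sketch that only builds the request path; the index name and session ID are made up.

```python
from urllib.parse import urlencode


def search_path(index: str, session_id: str) -> str:
    """Build a search path that pins one user session to the same shard copies."""
    # Searches that send the same custom preference string are executed on the
    # same shard copies, so scores stay stable between requests.
    return "/{}/_search?{}".format(index, urlencode({"preference": session_id}))


print(search_path("products", "session-4711"))
```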

  10. Index Aliases
    − A logical name for one or more Elasticsearch indexes
    − Decouples the client view from physical storage
      − Create views on an index, e.g., for different users
      − Search across multiple indexes
    − Enables changes without clients noticing
      − Point an alias to something new, e.g., switch to another index
    − Limitation: Writes are only permitted for aliases that point to a single index
    − Recommendation: Use aliases right from the start
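
Switching an alias to another index can be expressed as a single request to the `_aliases` endpoint that removes the old association and adds the new one. The sketch below only builds the request body; the alias and index names are invented for illustration.

```python
import json


def switch_alias(alias: str, old_index: str, new_index: str) -> str:
    """Build a request body that moves an alias from one index to another."""
    actions = [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]
    return json.dumps({"actions": actions})


# The body would be sent as POST /_aliases.
print(switch_alias("products", "products_v1", "products_v2"))
```

Since both actions travel in one request, clients addressing the alias should never observe a state in which it points to no index.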

  11. Designing for Scalability
    − Why should we think about scaling right from the start?
      − Fixed number of shards per index
      − Each new index involves some basic costs
      − Distributed searches are expensive
    − Consider possible patterns in your data
      − Time-based data
      − User-based data
      − Or maybe none at all
    − Recommended reading
      − http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scale.html

  12. Time-based Data
    − Assumptions
      − Documents arrive with (close-to-real-time) timestamps
      − (Almost) no updates of existing documents
    − Examples
      − Log files
      − Tweets

  13. One Index per Time Frame
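    [Diagram: one index per day (some-old-index, ..., 2014-11-25, 2014-11-26, 2014-11-27); "current" (used for indexing) marks the newest index; a search for the "last 2 days" spans only the most recent indexes]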
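
With one index per day, a search for a time range only needs the names of the indexes covering that range. The sketch below assumes indexes are named by ISO date, as in the diagram, and that "last 2 days" includes today; both are assumptions, as is the helper name.

```python
from datetime import date, timedelta
from typing import List


def indexes_for_last_days(today: date, days: int) -> List[str]:
    """Return the daily index names covering the last `days` days, oldest first."""
    return [(today - timedelta(days=offset)).isoformat()
            for offset in reversed(range(days))]


names = indexes_for_last_days(date(2014, 11, 27), 2)
# The names can be joined into a multi-index search path.
print("/" + ",".join(names) + "/_search")
```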

  14. Observations
    − Relatively simple to implement
      − Thanks to index templates and aliases
    − The cost of error is small
      − Frequent index creation facilitates quick improvements
    − But more complicated when updates/deletes of individual documents are needed

  15. User-based Data
    − Assumption
      − Documents form disjoint partitions with respect to visibility
    − Examples
      − Unrelated users on the same platform
      − Unrelated tenants with multiple users each

  16. One Index per User
    [Diagram: User 1 to User N, each with a dedicated index (Index 1 to Index N)]
    − Disadvantage
      − Each index consumes resources, so this does not scale to large numbers of users

  17. Single Index
    [Diagram: a search by user 1 goes to all shards (Shard 1 to Shard M) of a single index, with "filter by user 1"]
    − Disadvantage
      − Distributed search even for users with little data

  18. Single Index with Routing
    [Diagram: users (User 1 to User N) are routed to the shards of a single index (Shard 1 to Shard M), several users per shard; a search by user 1 with "filter by user 1" addresses a single shard]
    − Disadvantage
      − Some shards may become much bigger than others

  19. Observations
    − Clients do not need to know the approach chosen
      − Aliases can be associated with filter and routing information
      − In all cases, the client may address separate "user" indexes (aliases)
    − It is possible to combine the approaches behind the scenes
      − For example, start with "single index with routing"
      − Depending on the need, migrate big users to dedicated indexes
    − Regardless of the approach chosen, we may always hit capacity limits
      − An index or a shard (and thus, with it, the index) may become too large
      − Then we basically have to deal with a "one big index" scenario
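
A per-user alias on a shared index can carry both a filter and a routing value, so the client addresses what looks like a dedicated "user" index. The sketch below builds such an alias action; the index name `users`, the field `user_id`, and the alias naming scheme are assumptions made for illustration.

```python
import json


def user_alias_action(user_id: int, shared_index: str = "users",
                      user_field: str = "user_id") -> dict:
    """Build an alias action that exposes one user's documents as a 'user index'."""
    return {
        "add": {
            "index": shared_index,
            "alias": "user_{}".format(user_id),
            # Only documents of this user are visible through the alias.
            "filter": {"term": {user_field: user_id}},
            # Reads and writes through the alias go to this user's shard.
            "routing": str(user_id),
        }
    }


# The body would be sent as POST /_aliases.
print(json.dumps({"actions": [user_alias_action(1)]}))
```

If a big user later moves to a dedicated index, only the alias definition has to change; the client keeps addressing `user_1`.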

  20. One Big Index
    − What to do when an index has reached its capacity?
      − Let’s say we even overallocated a bit, but growth is larger than expected
    − Option 1: Extend the index by a second one
    − Option 2: Migrate to a new index with more shards
    − Note: Searching multiple indexes is the same as searching a sharded index
      − 1 index with 50 shards ≈ 50 indexes with 1 shard each
      − In both cases, 50 Lucene indexes are searched

  21. Extending an Index
    − Create a second index for new documents
      − Define an alias so that search considers both indexes
    − Challenge: Which index to address for updates, deletes, everything "by ID"?
      − Boils down to some kind of "sharding" in the application
      − Documents need to carry something that can be used as a "shard key"
    [Diagram: the client faces the old and the new index, with "???" marking the open question of which one to address]

  22. Possible Approaches
    − Use information from the main DB for mapping documents to indexes
      − For example, everything beyond a certain "creation date" is directed to the new index
        − Need to add client-side logic for mapping dates to index names
      − Alternatively, store the index name directly in the main DB
      − Only applicable if there actually is a main DB
    − Encode the index name into the document ID
      − For example, UUID followed by index name
      − Does not require a main DB
      − Need to add logic during document ID generation
      − Clients need to know how to extract the index name from the document ID
    − A bit fragile overall, as it depends on non-search parts of the application
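
The second approach, "UUID followed by index name", could be sketched like this. The separator character and both function names are assumptions; the slide does not prescribe a concrete ID format.

```python
import uuid

SEPARATOR = "@"  # assumption: any character that never occurs in a UUID


def new_document_id(index_name: str) -> str:
    """Generate a document ID that carries the name of its index."""
    return "{}{}{}".format(uuid.uuid4(), SEPARATOR, index_name)


def index_for(document_id: str) -> str:
    """Extract the index name from a document ID created by new_document_id."""
    # Split at the first separator; the UUID part never contains it.
    _, separator, index_name = document_id.partition(SEPARATOR)
    if not separator or not index_name:
        raise ValueError("document ID does not carry an index name")
    return index_name


doc_id = new_document_id("documents_v2")
print(doc_id, "->", index_for(doc_id))
```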

  23. Migrating to a New Index
    [Diagram: client, old index, new index, migrator]

  24. Migrating to a New Index
    [Diagram: client, old index, new index]

  25. Migrating to a New Index
    [Diagram: client, old index, new index]
    − Can be done easily with downtime, but usually we want zero downtime

  26. Migrator
    − Helper application that reads from the old index and writes to the new index
    − Read via the scan+scroll API
      − Iterate in batches over a snapshot of the data
    − Write via the bulk API
      − Send batches of documents in single requests
      − Bulk size needs to be determined empirically
    − Notes
      − Requires _source, but storing it is really a best practice anyway
      − Consider having the migrator read and write in parallel
      − Consider (partially) disabling replication during migration
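
The write side of such a migrator might turn each batch of scroll hits into one bulk request body. The sketch below assumes hits shaped like search results (`_type`, `_id`, `_source`) and uses the `create` bulk action, in line with the create-only writes described later in the talk; the function name and sample data are made up.

```python
import json
from typing import Iterable, List


def bulk_body(hits: Iterable[dict], target_index: str) -> str:
    """Turn one batch of hits into a newline-delimited bulk request body."""
    lines: List[str] = []
    for hit in hits:
        # Action line: create the document in the new index under its old ID.
        action = {"create": {"_index": target_index,
                             "_type": hit["_type"],
                             "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        # Source line: the original document, which is why _source is required.
        lines.append(json.dumps(hit["_source"]))
    # Every line of a bulk body, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


batch = [{"_type": "doc", "_id": "1", "_source": {"title": "hello"}}]
print(bulk_body(batch, "new_index"), end="")
```

The number of hits per batch corresponds to the bulk size that, as noted above, has to be determined empirically.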

  27. Zero Downtime Migration
    [Diagram: client, old index, new index]

  28. Zero Downtime Migration
    [Diagram: client, old index, new index; labels: Writes, Reads]

  29. Zero Downtime Migration
    [Diagram: client, old index, new index, migrator; labels: Writes, Reads]

  30. Zero Downtime Migration
    [Diagram: client, old index, new index; labels: Writes, Reads]

  31. Zero Downtime Migration
    [Diagram: client, old index, new index]

  32. Caveats
    [Diagram: client, old index, new index, migrator; labels: Writes, Reads]
    − Annotation: Create only

  33. Caveats
    [Diagram: client, old index, new index, migrator; labels: Writes, Reads]
    − Annotation: Make deletes irreversible

  34. Caveats
    [Diagram: client, old index, new index, migrator; labels: Writes, Reads]
    − Annotation: Sync migrator start with writing to new index

  35. Summary of Steps
    − For read operations, use an alias pointing to the old index
    − Create a new index
    − Set index.gc_deletes to a large enough value (makes deletes irreversible)
    − Direct writes to both indexes
    − Wait until all old-index-only writes have been refreshed (check via search)
    − Run the migrator using op_type=create (prevents lost updates)
    − When the migrator is done, stop indexing into the old index
    − Switch the read alias to the new index
    − Delete the old index
    − Note: Having a global "indexing queue" eases the implementation of some steps
      − Single point where we need to make changes or monitor things
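
Why create-only writes prevent lost updates can be shown with a toy simulation in which plain dictionaries stand in for the two indexes. All names are hypothetical, and deletes (the index.gc_deletes step) are deliberately not modeled.

```python
from typing import Dict

Index = Dict[str, str]  # hypothetical in-memory index: document ID -> content


def index_doc(index: Index, doc_id: str, content: str) -> None:
    """Regular client write: creates or overwrites the document."""
    index[doc_id] = content


def create_doc(index: Index, doc_id: str, content: str) -> bool:
    """Create-only write: refuses to overwrite an existing document."""
    if doc_id in index:
        return False
    index[doc_id] = content
    return True


def migrate(snapshot: Index, new: Index) -> None:
    """Copy a snapshot of the old index using create-only writes."""
    for doc_id, content in snapshot.items():
        create_doc(new, doc_id, content)


old: Index = {"1": "v1", "2": "v1"}
new: Index = {}
snapshot = dict(old)             # the migrator reads its snapshot
for target in (old, new):        # afterwards the client updates doc 1 in both indexes
    index_doc(target, "1", "v2")
migrate(snapshot, new)           # the stale "v1" of doc 1 is rejected
print(new)
```

With a regular write in `migrate`, the stale snapshot version of document 1 would overwrite the newer client write in the new index.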

  36. Support for the Update API
    − Things are more complex when the application uses the Update API
      − Updates to the new index require an existing document
    − Possible solutions
      − Buffer writes to the new index, only run them when the migrator is done
        − But need to prevent duplicate updates
        − For example, by explicit versioning or a synced start of buffering and migrator
      − Another idea is to turn each update into a full re-indexing during migration
        − Start using the Update API again only after the migration is done
    − Once again, having an "indexing queue" is highly beneficial
    − Recipes for different scenarios will be detailed in the codecentric blog :-)

  37. Questions?
    Dr. rer. nat. Patrick Peschlow
    codecentric AG
    Merscheider Straße 1
    42699 Solingen
    tel +49 (0) 212.23 36 28 54
    fax +49 (0) 212.23 36 28 79
    [email protected]
    www.codecentric.de