ElasticSearch UserGroup Berlin meetup

Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited.

Elastic Fantastic
ElasticSearch Usergroup Berlin
simon.willnauer@elasticsearch.com
@s1m0nw

What is this guy talking about?
• Shard Allocation
  • What is this and why do I need it?
  • Is it new?
  • What is new?
• Improvements in the pipeline
  • “new stuff in Lucene 4” ...wait, in what?
  • things you care about that will come soon in ES

In the beginning was the single node...

    curl -XPUT localhost:9200/index_1 -d '{
      "settings" : {
        "number_of_shards" : 3,
        "number_of_replicas" : 0
      }
    }'

[Diagram: node 1 holding the shards of index_1; shard labels 1P, 2R, 2P, 3P]

And two indices....

    curl -XPUT localhost:9200/index_2 -d '{
      "settings" : {
        "number_of_shards" : 2,
        "number_of_replicas" : 0
      }
    }'

[Diagram: node 1 holding the shards of both indices; shard labels 1P, 2P and 1P, 2P, 3P]

2nd Node... now what?

[Diagram: shards spread across node 1 and node 2; shard labels 1P, 2P, 1P, 2P, 3P, 2P, 1P, 1P]

But you'd rather have this, no?

[Diagram: shards spread across node 1 and node 2; shard labels 1P, 2P, 1P, 2P, 1P, 1P, 3P, 3P]

Quick Demo

    curl -XPUT localhost:9200/index_1 -d '{
      "settings" : {
        "number_of_shards" : 3,
        "number_of_replicas" : 0
      }
    }'

    curl -XPUT localhost:9200/index_2 -d '{
      "settings" : {
        "number_of_shards" : 2,
        "number_of_replicas" : 0
      }
    }'

Behave like the previous ShardsAllocator:

    curl -XPUT localhost:9200/_cluster/settings -d '{
      "transient" : {
        "cluster.routing.allocation.balance.index" : 0.0,
        "cluster.routing.allocation.balance.shard" : 1.0,
        "cluster.routing.allocation.balance.primary" : 0.0
      }
    }'

What changed?
• EvenShardCountAllocator
  • balanced across shards and nodes
  • no notion of an index
  • tries to put same amount of shards on each node
• BalancedShardsAllocator
  • based on a weight function
  • weights are calculated per node in an index context
  • users can influence the weight of an attribute
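To make the difference between the two allocators concrete, here is a minimal sketch of a per-node, per-index weight. It is a simplified model written for this transcript, not the actual BalancedShardsAllocator code: the function name `node_weight`, the `cluster` layout, and the omission of the primary term are all assumptions made for illustration.

```python
from collections import Counter

def node_weight(cluster, node, index, shard_factor, index_factor):
    """Weight of `node` in the context of `index` (simplified, primaries ignored).

    `cluster` maps a node name to a list of index names, one entry per shard.
    Positive weight: the node holds more than its fair share.
    """
    total = shard_factor + index_factor
    theta_shard = shard_factor / total   # normalized "balance over # of shards"
    theta_index = index_factor / total   # normalized "balance over indices"
    num_nodes = len(cluster)
    avg_shards = sum(len(shards) for shards in cluster.values()) / num_nodes
    avg_index_shards = sum(Counter(shards)[index] for shards in cluster.values()) / num_nodes
    own = cluster[node]
    return (theta_shard * (len(own) - avg_shards)
            + theta_index * (Counter(own)[index] - avg_index_shards))

# Hypothetical layout: equal shard counts, but each index lives on one node only.
cluster = {
    "node1": ["index_1", "index_1"],
    "node2": ["index_2", "index_2"],
}

def delta(shard_factor, index_factor):
    """Weight difference between the two nodes in the context of index_1."""
    return (node_weight(cluster, "node1", "index_1", shard_factor, index_factor)
            - node_weight(cluster, "node2", "index_1", shard_factor, index_factor))

even_delta = delta(1.0, 0.0)       # shard count only: the layout looks balanced
balanced_delta = delta(0.4, 0.55)  # index-aware: the layout looks unbalanced
print(even_delta, balanced_delta)
```

With only the shard count considered, both nodes weigh the same, so nothing would move. Once the index factor is non-zero, the node that holds all shards of `index_1` gets a higher weight than the node that holds none.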

What can I adjust?
• how important to you is
  • balance over # of shards
  • balance over indices
  • balance over primaries
• how aggressively rebalancing acts
  • a threshold defining the minimum delta between 2 nodes to issue a rebalance operation. Default is 1.0f
• ...more to come
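The threshold can be read as a simple gate on the weight difference between the heaviest and the lightest node. The sketch below assumes that reading; `needs_rebalance` is a hypothetical helper, and whether the real comparison is strict or inclusive is an assumption here.

```python
def needs_rebalance(node_weights, threshold=1.0):
    """Return True if the weight delta between two nodes exceeds the threshold.

    `node_weights` maps a node name to its weight for one index.
    The strict '>' comparison is an assumption of this sketch.
    """
    delta = max(node_weights.values()) - min(node_weights.values())
    return delta > threshold

# Hypothetical weights for two nodes in the context of a single index.
skewed = {"node1": 0.6, "node2": -0.6}   # delta 1.2
mild = {"node1": 0.3, "node2": -0.3}     # delta 0.6

print(needs_rebalance(skewed))                 # delta above the default 1.0
print(needs_rebalance(mild))                   # delta below the default 1.0
print(needs_rebalance(skewed, threshold=2.0))  # a higher threshold is less aggressive
```

Raising the threshold makes the cluster tolerate larger imbalances before moving shards; lowering it makes rebalancing more aggressive.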

Example settings...

Acts like EvenShardCountsAllocator:

    curl -XPUT localhost:9200/_cluster/settings -d '{
      "transient" : {
        "cluster.routing.allocation.balance.index" : 0.0,
        "cluster.routing.allocation.balance.shard" : 1.0,
        "cluster.routing.allocation.balance.primary" : 0.0
      }
    }'

Defaults:

    curl -XPUT localhost:9200/_cluster/settings -d '{
      "transient" : {
        "cluster.routing.allocation.balance.index" : 0.55,
        "cluster.routing.allocation.balance.shard" : 0.4,
        "cluster.routing.allocation.balance.primary" : 0.05
      }
    }'
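If the settings body is generated rather than typed by hand, a JSON serializer rules out syntax slips such as a trailing comma. A minimal sketch follows; `balance_settings` is a hypothetical helper that only builds the request body and does not talk to a cluster.

```python
import json

PREFIX = "cluster.routing.allocation.balance."

def balance_settings(index, shard, primary):
    """Build the transient cluster-settings body for the three balance factors."""
    body = {
        "transient": {
            PREFIX + "index": index,
            PREFIX + "shard": shard,
            PREFIX + "primary": primary,
        }
    }
    return json.dumps(body, indent=2)

# Shard count only, as in the first example above.
even_body = balance_settings(index=0.0, shard=1.0, primary=0.0)
print(even_body)
```

The resulting string can be passed to `curl -XPUT localhost:9200/_cluster/settings -d` as in the commands above.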

Future Work?
• Ways to expand the weight function
  • size of a shard
  • average number of requests on a shard
  • number of docs in the shard
  • <your requirement goes here>
• Eventually we want the weight function to be customizable, so that users can balance their cluster based on their needs.

Improvements in the Pipeline
• Lucene 4.0 / 4.1
  • Codec Support (0.21)
  • Concurrent Flushing (0.21)
  • Spellchecking / Suggestions (0.21)
  • Similarity per Field
• FieldData Refactoring
  • API (0.21)
  • Implementations (0.2?)

Lucene 4.0 / 4.1
• Many features under the hood
• Massive improvements in terms of memory consumption internally
• Compression built in
• Fast FuzzyQuery
• Faster Batch-Indexing
• Bloom Filters built at index time
• refresh might be much cheaper now
• Default encoding on disk is based on blocks...

FieldData?
• Used for Faceting, Sorting, Scoring
• Until 0.20 not very flexible
• Implementation details leaked into the interface
• 0.21 adds a new interface in order to improve memory and runtime performance
• new FieldData will allow specialized implementations / data-structures per field
• Defaults will be much more memory efficient (UTF-8 bytes vs. UTF-16 chars)
• Future implementations can even read from MemoryMaps etc.
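The "UTF-8 bytes vs. UTF-16 chars" point can be illustrated by comparing encoded sizes of a few terms. This is only a sketch of the encoding arithmetic; it ignores object overhead and says nothing about the actual FieldData data structures. The sample terms and the helper name `encoded_sizes` are made up for illustration.

```python
def encoded_sizes(term):
    """Return (utf8_bytes, utf16_bytes) for a term, without a byte-order mark."""
    return len(term.encode("utf-8")), len(term.encode("utf-16-le"))

# Hypothetical field values; mostly-ASCII terms are common in practice.
for term in ["elasticsearch", "berlin", "Zürich"]:
    utf8_bytes, utf16_bytes = encoded_sizes(term)
    print(f"{term}: {utf8_bytes} bytes as UTF-8, {utf16_bytes} bytes as UTF-16")
```

For ASCII-only terms UTF-8 needs one byte per character where UTF-16 needs two, so the saving approaches half for such data. Terms with many non-Latin characters benefit less.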

That’s it folks.... Ask your questions...!


Simon Willnauer
