Slide 1

Stories from Support: Top Problems and Solutions
Chris Earle, Support Engineer (@pickypg)
Mark Walkom, Support Engineer (@warkolm)

Slide 2

Agenda
• An Old Gem
  • This One Weird Phrase Will Endear You To Any Elastic Engineer
• The Big Issues
  • Fielddata
  • Cluster State
  • Recovery
  • Node Sizing
  • Sharding
• Of Course, There’s More!
  • Configuration Options
  • Queries
  • Aggregations
  • Indexing

Slide 3

“It Depends”
Pretty Much Every Elastic Engineer Ever

Slide 5

The Big Issues
1. Fielddata
2. Cluster State
3. Recovery
4. Node Sizing
5. Sharding

Slide 6

Fielddata
• 60% of total heap can be allocated to fielddata in 1.X
• 100% of data nodes will be impacted
• 2.X (mostly) solves this, thanks to doc_values being on by default

Slide 7

Doc Values!
• Columnar store of values
• Written to disk at index time
• Leverages the operating system’s filesystem cache
• Will require a reindex for existing data
• GET /_cat/fielddata?v is your friend
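As a minimal sketch (index, type, and field names here are made up for illustration): in 1.x you can opt in to doc_values explicitly on a not_analyzed field, while in 2.x this is already the default for not_analyzed strings and numerics.

# 1.x mapping sketch; "logs-example", "event", and "status" are illustrative names
PUT /logs-example
{
  "mappings": {
    "event": {
      "properties": {
        "status": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        }
      }
    }
  }
}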

Slide 8

Doc Value Caveats
• Analyzed strings do not currently support doc_values, which means that you must avoid using such fields for sorting, aggregating, and scripting
• Analyzed strings are generally tokenized into multiple terms, which means that there is an array of values
• With few exceptions (e.g., significant terms), aggregating against analyzed strings is not doing what you want
• Unless you want the individual tokens, scripting is largely not useful
• Big improvement coming in ES 2.3 (“keyword” field)
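One standard workaround, not spelled out on the slide, is a multi-field: keep the analyzed string for full-text search and add a not_analyzed sub-field backed by doc_values for sorting and aggregations. Index and field names below are illustrative.

# Multi-field sketch: search on "message", aggregate and sort on "message.raw"
PUT /logs-example
{
  "mappings": {
    "event": {
      "properties": {
        "message": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed",
              "doc_values": true
            }
          }
        }
      }
    }
  }
}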

Slide 9

The Big Issues: 2. Cluster State

Slide 10

Cluster State
• Every cluster state change is sent to every node
• Requires a lot of short-lived, potentially large network messages
• Gets worse with more nodes or indices
• Mappings tend to be the largest portion
• GET /_cluster/state?pretty
  • Not stored in memory as JSON, so this is just to give the idea (it’s likely 5% of it, at best)
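To see which part of the state is biggest (usually the mappings), the cluster state API can be filtered by metric and by index; the index name below is illustrative.

# Fetch only the metadata (mappings, settings) portion of the cluster state
GET /_cluster/state/metadata?pretty

# Or narrow it to a single index's metadata and routing table
GET /_cluster/state/metadata,routing_table/logs-example?pretty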

Slide 11

State Of The Union
• ES 2.0 introduces cluster state diffs between nodes
  • Changes become far more manageable and a large cluster state is no longer as problematic
• Reducing your mapping size helps too
  • Do not allow dynamic mappings in production
  • Do not use _types to separate data
    • Create a “type” field to do this for you
• Prefer changes in bulk rather than one-by-one (allow changes to be batched)
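A sketch of both mapping recommendations in an index template, with illustrative names: dynamic mappings are rejected outright, and documents carry an explicit "type" field instead of separate _types.

# Illustrative template applied to indices matching logs-*
PUT /_template/logs-template
{
  "template": "logs-*",
  "mappings": {
    "_default_": {
      "dynamic": "strict",
      "properties": {
        "type": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}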

Slide 12

The Big Issues: 3. Recovery

Slide 13

Recovery
• Restarting a node or otherwise needing to replicate shards
• Terribly slow process
  • Segment by segment
• Minor risk of corruption pre-ES 1.5 with sketchy networks

Slide 14

Fully Recovered!
• 1.5: Used temporary file names
• 1.6: Asynchronous allocation and synced flushing
• 1.7: Delayed allocation and prioritized allocation
• 2.0: Cancelled allocation
• 2.1: Prioritized allocation for replicas
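Two of the knobs behind these improvements, sketched with illustrative values and index names: a synced flush before planned maintenance (1.6+), and delayed plus prioritized allocation (1.7+).

# Before a planned restart: synced flush so unchanged shards recover almost instantly
POST /_flush/synced

# Wait 5 minutes before reallocating shards from a departed node,
# and recover this index ahead of lower-priority ones
PUT /logs-example/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m",
    "index.priority": 10
  }
}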

Slide 15

The Big Issues: 4. Node Sizing

Slide 16

Node Sizing
• Elasticsearch is a parallel processing machine
• Java can be a slow garbage collecting calculator
• Slow disks: the problem for every data store?
• A few huge boxes or a ton of tiny boxes?

Slide 17

And How Long Is A Piece Of String?
Memory
• 50% of system RAM to heap
• Up to 30500M, no more, or your heap loses optimizations!
CPU
• Indexing tends to be CPU bound
• At least 2 cores per instance
IO
• Disks get hammered for other reasons, including write-impacting
• Translog in 2.0 fsyncs for every index operation
• SSDs or Flash are always welcome
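One quick way to see how close nodes are to these limits is the _cat/nodes API; the column selection below is a sketch using standard _cat/nodes fields, and the ?v flag just adds headers.

# Per-node heap, RAM, and load at a glance
GET /_cat/nodes?v&h=name,heap.percent,heap.max,ram.percent,load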

Slide 18

The Big Issues: 5. Sharding

Slide 19

Do You Know Where Your Shards Are At Night?
• Elasticsearch 1.X defaults to 5 primary, 1 replica
• Elasticsearch 2.0 defaults to 5 primary, 1 replica
• Increase primaries for higher write throughput and to spread load
• 50GB is the rule-of-thumb max size for a primary shard, more for recovery than performance
• Replicas are not backups; rarely see a benefit with more than 1
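Primary shard counts are fixed at index creation (only replicas can change later), so they are usually set up front; the index name and counts below are illustrative.

# Explicit shard layout at index creation
PUT /logs-example
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}

# See where shards actually live
GET /_cat/shards?v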

Slide 20

Bonus* time!

Slide 21

Configuration
• discovery.zen.minimum_master_nodes
• discovery.zen.ping.unicast.hosts
• cluster.name
• node.name
• network.host (ES 2.0)
• ES_HEAP_SIZE
• Automation is key
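A minimal elasticsearch.yml sketch covering those settings; every value here is illustrative, and ES_HEAP_SIZE is set in the environment (e.g., the service defaults file), not in this file.

# elasticsearch.yml, illustrative values only
cluster.name: production-logs
node.name: node-1
network.host: 192.168.1.10
discovery.zen.ping.unicast.hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
# (master-eligible nodes / 2) + 1, so 2 for three master-eligible nodes
discovery.zen.minimum_master_nodes: 2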

Slide 22

Queries
• Deep pagination
  • ES 2.0 has a soft limit of 10K hits per request; linearly more expensive per shard
  • Use the scan and/or scroll API
• Leading wildcards
  • Equivalent to a full table scan (bad)
• Scripting
  • Without parameters
  • Dynamically (inline)
• Unnecessary filter caching (e.g., exact date ranges down to the millisecond)
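A sketch of scan/scroll in place of deep from/size pagination; the index name and page size are illustrative, and each response returns a _scroll_id to pass back for the next page.

# Open a scrolled search; the scan search type skips scoring and sorting (1.x/2.0),
# and "size" is applied per shard in scan mode
GET /logs-example/_search?search_type=scan&scroll=1m
{
  "size": 500,
  "query": { "match_all": {} }
}
# Then repeatedly call GET /_search/scroll?scroll=1m with the returned _scroll_id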

Slide 23

Aggregations
• Cardinality
  • Setting the threshold to 40K (or higher) is memory intensive and generally unnecessary
• Using in place of search
  • Searching will be faster
• Enormous sizes
  • Requesting large shard sizes (relative to actual size)
  • Linearly more expensive per shard touched
  • Generally unnecessary
• Returning hits when you don’t want them
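A sketch combining two of these points, with illustrative index and field names: a modest precision_threshold on the cardinality aggregation, and size: 0 so no hits are returned alongside it.

GET /logs-example/_search
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id",
        "precision_threshold": 1000
      }
    }
  }
}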

Slide 24

Indexing
• Too many shards
  • If your shards are small (defining “small” as < 5GB) and they outnumber your nodes, then you have too many
• Refreshing too fast
  • This controls “near real time” search
• Merge throttling
  • Disable it on SSDs
  • Make it single threaded on HDDs (see node sizing link)
• Not using bulk processing
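Two of these knobs as a sketch, with illustrative index name, interval, and documents: slowing down refresh on a write-heavy index, and sending documents through the bulk API.

# Refresh every 30s instead of the default 1s on a write-heavy index
PUT /logs-example/_settings
{
  "index": { "refresh_interval": "30s" }
}

# Bulk API: newline-delimited action/document pairs
POST /_bulk
{ "index": { "_index": "logs-example", "_type": "event" } }
{ "status": 200, "message": "ok" }
{ "index": { "_index": "logs-example", "_type": "event" } }
{ "status": 503, "message": "unavailable" }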

Slide 25

Thanks!