
Not all Nodes are Created Equal - Scaling Elasticsearch

Elasticsearch is famous for being easy to set up and for having good defaults. A single node can go a long way and a handful of nodes will deliver a surprising punch. However, there comes a point where generic defaults become less than ideal and cluster architecture starts to matter. In this session, we will talk about capacity planning and custom setups suitable for large Elasticsearch deployments.

The talk was given at the Elasticsearch meetup in Tel Aviv on 24 Nov 2014

Elasticsearch Inc

November 24, 2014

Transcript

  1. Elasticsearch
     • Real time Search and Analytics Engine
     • Schema-free, REST & JSON based document store
     • Distributed and horizontally scalable
     • Open Source: Apache License 2.0
     • Zero configuration
     • Written in Java, extensible
  2. Big Numbers Are No Fun
     they cost money, time and hair™
  3. So we want to know…
     • that the money needs to be spent
     • but also that we’re safe
  4. So how do we go from
     • “I need to index 500GB (or 500MB) of application data per day”
     • “I need to serve 10,000 (or 3) requests per second”
  5. To…
     • I need 20 nodes (or 2)
     • with SSDs (or maybe spinning disks are fine)
     • with 64GB (or 8GB) of RAM each
  6. Each node:
     • stores data: indexes, stores and searches data
     • can become master: performs cluster administration
     • receives requests: coordination and response merging
     [diagram: a 3-node cluster in which every node plays all three roles]
  7. Node role separation
     [diagram: a cluster of all-role data nodes, then the same cluster split into dedicated data nodes, dedicated master nodes and client nodes]
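     A note on how this split was configured in the Elasticsearch 1.x era the talk covers: two boolean settings, node.master and node.data, control the roles. A minimal sketch using -Des.* startup overrides (the same keys can live in elasticsearch.yml):

     # dedicated master-eligible node: holds no data, does cluster administration only
     bin/elasticsearch -Des.node.master=true -Des.node.data=false
     # data node: indexes, stores and searches data, never elected master
     bin/elasticsearch -Des.node.master=false -Des.node.data=true
     # client node: no data, not master-eligible; receives requests and merges responses
     bin/elasticsearch -Des.node.master=false -Des.node.data=false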
  8. Lots of data == Sharding
     [diagram: one index split into shards 1-4, each with a copy (replica), spread across nodes]
  9. More nodes, less sharing
     [diagram: the same shards and copies spread over eight nodes, so each shard shares its node with fewer neighbours]
  10. Indexing a single doc
      # curl -XPUT localhost:9200/index1/type/id -d '{ "f": 1 }'
      [diagram: any node accepts the request and forwards it to the shard that owns the document]
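      Which shard owns the document is decided by the receiving node with the routing formula (the documented behaviour of this era): shard = hash(routing) % number_of_primary_shards, where routing defaults to the document id. It can be overridden per request with an arbitrary routing value, e.g.:

      # curl -XPUT 'localhost:9200/index1/type/id?routing=user42' -d '{ "f": 1 }'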
  11. Bulk indexing
      # curl -XPUT localhost:9200/index1/type/_bulk -d ……
      [diagram: any node splits the bulk request across shards 1-4]
      more shards == scaling without sharing resources
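      The body elided above is newline-delimited JSON: one action line, then one source line per document, and the request must end with a newline. A minimal sketch:

      # cat bulk.json
      { "index": { "_id": "1" } }
      { "f": 1 }
      { "index": { "_id": "2" } }
      { "f": 2 }
      # curl -XPOST localhost:9200/index1/type/_bulk --data-binary @bulk.json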
  12. Search
      # curl localhost:9200/index/_search?q=something
      [diagram: any node fans the query out to one copy of every shard and merges the responses]
  13. Search - more replicas
      # curl localhost:9200/index/_search?q=something
      [diagram: two concurrent searches, each served by a different copy of the same four shards]
      more replicas == scaling without sharing resources
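      Unlike the shard count, the replica count is a dynamic index setting, so this kind of search scaling can be done on a live index; a sketch:

      # curl -XPUT localhost:9200/index1/_settings -d '{ "index": { "number_of_replicas": 2 } }'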
  14. Search - single shard, single request
      [diagram: inside one shard, an inverted index mapping terms such as "nosql", "New York" and "lat=6.9 lon=50" to posting lists of document ids]
      search time ~ number of docs; hits ~ number of docs in shard
  15. In short
      • indexing: # shards → higher throughput
      • searching:
        • # shards → more data (fixed latency)
        • # replicas → higher throughput
      • and: shard capacity is the base metric
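      Both knobs are per-index settings; in this era the shard count is fixed at creation time while replicas can change later, so the shard count is the one to get right up front. A sketch of setting both explicitly:

      # curl -XPUT localhost:9200/index1 -d '{ "settings": { "number_of_shards": 4, "number_of_replicas": 1 } }'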
  16. Shard size
      [benchmark setup: one shard on one node; a single indexer feeding a representative doc mix, a single searcher running a representative query mix; keep adding data until search takes too long, e.g. 160ms]
      [chart: search time vs. data in the shard over time → the max shard size]
  17. Shard throughput (version 2)
      [benchmark setup: one shard of a known, fixed size on one node; multiple indexers with the doc mix, and a growing number of searchers → max queries/sec]
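      A minimal sketch of the first experiment, assuming an index named bench and a file of representative docs already in bulk format; the "took" field in the search response is the server-side latency in milliseconds:

      while true; do
        # keep growing the shard with representative data
        curl -s -XPOST localhost:9200/bench/doc/_bulk --data-binary @docs.json > /dev/null
        # sample search latency against the growing shard
        curl -s 'localhost:9200/bench/_search?q=something' | grep -o '"took":[0-9]*'
      done
      # stop when "took" crosses the latency budget (160ms above) and record the shard size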
  18. What did we learn?
      • Max Shard Capacity: dictated by latency
      • (Max) Shard/Node Throughput: indexing, searching
      • Resources needed to support a shard under required load: CPU, Memory, IO
  19. What does it tell us?
      • How many shards we need
        • to fit the data
        • to support our indexing/searching requirements
      • How many shards we can put on one node
        • if we didn’t max out resources
      • How many nodes we need (a worked example follows)
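      A back-of-envelope pass over that logic, with every number invented purely for illustration:

      # measured: max shard size 25GB at acceptable latency; 500GB/day to index
      #   → 500 / 25 = 20 shards per daily index
      # measured: one node sustains 5 such shards under the target indexing/search load
      #   → 20 / 5 = 4 nodes; × 2 for one replica = 8 data nodes
      # plus a few small dedicated master nodes on top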
  20. Do more with less?
      • Tweak queries / data structures
        • measure the effect
      • Invest in your real bottleneck
        • faster storage • more memory • more cores
      • Use your resources efficiently
        • dedicated nodes for hot indices (see the sketch after this list)
        • shared nodes for old data
      • Sometimes, you just need those nodes
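      One common way to split hot and cold nodes in this era is shard allocation filtering on a custom node attribute; a sketch assuming an attribute named box_type and a hypothetical daily index:

      # elasticsearch.yml on hot nodes:  node.box_type: hot
      # elasticsearch.yml on cold nodes: node.box_type: cold
      # pin today's index to the hot tier:
      curl -XPUT localhost:9200/logs-2014.11.24/_settings -d '{ "index.routing.allocation.require.box_type": "hot" }'
      # later, migrate it to the cold tier:
      curl -XPUT localhost:9200/logs-2014.11.24/_settings -d '{ "index.routing.allocation.require.box_type": "cold" }'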
  21. Keep it simple
      • a few nodes can take you a surprisingly long way
      • defaults == predictable
      • dedicated master nodes: as soon as you start to grow
      • even simple experiments teach you a lot