Scaling Elasticsearch

Janko Marohnić

February 09, 2018

Transcript

  1. Scaling Elasticsearch (janko-m, @jankomarohnic, at AT&T M2X)

  2. Elasticsearch is a distributed, RESTful search engine

  3. Elasticsearch is a distributed, RESTful search engine … { }

  4. Elasticsearch is a distributed, RESTful search engine

    GET /users/_search {"query":{"term":{"first_name":"janko"}}}
    => { ... }
  5. Elasticsearch is a distributed, RESTful search engine

    "car"
    A car (or automobile) is a wheeled motor vehicle used for transportation. Most definitions of car say they run primarily on roads, seat one to eight people, have four tires, and mainly transport people rather than goods. Cars came into global use during the 20th century, and developed economies depend on them. The year 1886 is regarded as the birth year of the modern car, when German inventor Karl Benz built his Benz Patent-Motorwagen. Cars did not become widely available until the early 20th century. One of the first cars that was accessible to the masses was the 1908 Model T, an American car manufactured by the Ford Motor Company. Cars were rapidly adopted in the US, where they replaced animal-drawn carriages and carts, but took much longer to be accepted in Western Europe and other parts of the world.
  6. Cluster – group of nodes

    Node – server that holds part of the data
    Index – similar to a "table" in a relational database
    Shard – partition of an index
  7. M2X

    • Stores time-series data
    • Datapoints (temperature, humidity, speed, etc.)
    • Geolocations
    • Triggers
    • Widgets & Dashboards
  8. M2X – datapoints

    Month             New Datapoints
    July 2016             42,216,421
    August 2016           38,456,296
    September 2016        43,252,336
    October 2016          89,572,942
    November 2016        222,608,051
    December 2016        338,588,651
    January 2017       2,326,317,955
    February 2017      7,192,489,182
    ⠇                  ⠇
  9. #1 – Time-series indices datapoints

  10. #1 – Time-series indices

    … datapoints-201501 datapoints-201502 datapoints-201503 datapoints-201504 datapoints-201505
    … datapoints-201601 datapoints-201602 datapoints-201603 datapoints-201604 datapoints-201605 …
  11. #1 – Time-series indices

    … datapoints-20170101 datapoints-20170102 datapoints-20170103 datapoints-20170104 datapoints-20170105
    … datapoints-20170201 datapoints-20170202 datapoints-20170203 datapoints-20170204 datapoints-20170205 …
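
The naming scheme on slides 10 and 11 (one index per month, then one per day) can be sketched as below. This is only an illustration: the helper names and the `prefix` parameter are hypothetical, and only the resulting index names come from the slides.

```python
from datetime import date


def monthly_index(day: date, prefix: str = "datapoints") -> str:
    """Name of the monthly index that holds datapoints for `day`."""
    return f"{prefix}-{day:%Y%m}"


def daily_index(day: date, prefix: str = "datapoints") -> str:
    """Name of the daily index that holds datapoints for `day`."""
    return f"{prefix}-{day:%Y%m%d}"


print(monthly_index(date(2015, 1, 15)))  # datapoints-201501
print(daily_index(date(2017, 1, 1)))     # datapoints-20170101
```

Writes go to the index derived from the datapoint's timestamp, so old indices stop changing and can be queried or dropped independently.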
  12. #2 – Reduce number of indices you need to query

    …                      …
    datapoints-20161229    29 December 2016
    datapoints-20161230    30 December 2016
    datapoints-20161231    31 December 2016
    datapoints-20170101    1 January 2017
    datapoints-20170102    2 January 2017
    datapoints-20170103    3 January 2017
    datapoints-20170104    4 January 2017
    datapoints-20170105    5 January 2017
    datapoints-20170106    6 January 2017
    datapoints-20170107    7 January 2017
    …                      …
    Device created
  13. #2 – Reduce number of indices you need to query

    …                      …
    datapoints-20161229    29 December 2016
    datapoints-20161230    30 December 2016
    datapoints-20161231    31 December 2016
    datapoints-20170101    1 January 2017
    datapoints-20170102    2 January 2017
    datapoints-20170103    3 January 2017
    datapoints-20170104    4 January 2017
    datapoints-20170105    5 January 2017
    datapoints-20170106    6 January 2017
    datapoints-20170107    7 January 2017
    …                      …
    Device created
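
One reading of the "Device created" marker is that a device cannot have datapoints in indices older than its creation date, so those indices can be skipped. A minimal sketch under that assumption follows; the function name `indices_since` and its parameters are hypothetical.

```python
from datetime import date, timedelta


def indices_since(created: date, today: date, prefix: str = "datapoints") -> list:
    """Daily indices that can contain data for a device created on `created`."""
    names = []
    day = created
    while day <= today:
        names.append(f"{prefix}-{day:%Y%m%d}")
        day += timedelta(days=1)
    return names


# A device created on 1 January 2017, queried on 3 January 2017:
print(indices_since(date(2017, 1, 1), date(2017, 1, 3)))
# ['datapoints-20170101', 'datapoints-20170102', 'datapoints-20170103']
```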
  14. #3 – Query fewer indices at once

    Q: Fetch 100 latest datapoints
    GET /all-datapoints/_search
    GET /v1-datapoints-20170616/_search
    GET /v1-datapoints-20170615/_search
    GET /v1-datapoints-20170614/_search
    GET /v1-datapoints-20170613/_search
    GET /v1-datapoints-20170612/_search
    …
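
The slide appears to contrast one search across every index with a walk through the daily indices, newest first, that stops once enough hits are collected. A sketch of that loop is below. `search_index` is a hypothetical callable standing in for a real `GET /<index>/_search` request; it is assumed to return at most `size` hits, newest first.

```python
def latest_hits(indices_newest_first, search_index, wanted=100):
    """Collect up to `wanted` hits, querying one index at a time."""
    hits = []
    for index in indices_newest_first:
        remaining = wanted - len(hits)
        if remaining <= 0:
            break  # enough results, older indices are never touched
        hits.extend(search_index(index, remaining))
    return hits


# Usage with an in-memory stand-in for Elasticsearch:
fake_data = {
    "v1-datapoints-20170616": list(range(60)),
    "v1-datapoints-20170615": list(range(60)),
    "v1-datapoints-20170614": list(range(60)),
}
queried = []


def fake_search(index, size):
    queried.append(index)
    return fake_data[index][:size]


result = latest_hits(list(fake_data), fake_search, wanted=100)
print(len(result), queried)
```

With 60 datapoints per day, only the two newest indices are queried.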
  15. #4 – # shards ≈ # nodes

  16. #4 – # shards ≈ # nodes 5 shards

  17. #4 – # shards ≈ # nodes 5 shards

  18. #4 – # shards ≈ # nodes 15 shards
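
The shard count is part of the index settings and is fixed when the index is created, so the rule of thumb has to be applied up front. The sketch below builds a create-index body with one primary shard per node. The helper name and the default of one replica are assumptions, and the slides do not state the cluster size.

```python
def index_settings(node_count: int, replicas: int = 1) -> dict:
    """Create-index body with roughly one primary shard per node."""
    # Assumption: shards == nodes is a starting point, not a hard rule.
    return {
        "settings": {
            "number_of_shards": node_count,
            "number_of_replicas": replicas,
        }
    }


print(index_settings(15))
```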

  19. #5 – Determine a good sharding strategy

    QUERY ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ RESULT
  20. #5 – Determine a good sharding strategy QUERY ⚙ RESULT

  21. #5 – Determine a good sharding strategy

    Device: Stream 1, Stream 2, Stream 3, Stream 4, Stream 5, …
    SHARDING
    Q: Fetch stream values => 1 shard
    Q: Fetch device values => 15 shards
  22. #5 – Determine a good sharding strategy

    Device: Stream 1, Stream 2, Stream 3, Stream 4, Stream 5, …
    SHARDING
    Q: Fetch stream values => 1 shard
    Q: Fetch device values => 1 shard
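
Elasticsearch picks a shard by hashing a routing value (the document ID by default) modulo the number of primary shards, and the `routing` parameter on index and search requests overrides that value. The toy model below shows why routing on the device keeps all of a device's streams on one shard. It uses `zlib.crc32` as a stand-in hash, not Elasticsearch's real one, and the key formats are hypothetical.

```python
import zlib


def shard_for(routing_key: str, number_of_shards: int = 15) -> int:
    """Toy shard selection: hash(routing) % number_of_shards."""
    # Stand-in hash for illustration only.
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_shards


streams = [("device-1", f"stream-{n}") for n in range(1, 6)]

# Routing on the stream: each stream may land on a different shard.
by_stream = {shard_for(f"{device}/{stream}") for device, stream in streams}

# Routing on the device: every stream of the device lands on the same shard.
by_device = {shard_for(device) for device, _ in streams}

print(len(by_device))  # 1
```

A search that passes the same routing value only needs to visit that one shard.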
  23. #6 – Upgrade Elasticsearch

    5.1, 5.0
    2.4, 2.3, 2.2, 2.1, 2.0
    1.7, 1.6, 1.5, 1.4, 1.3
    0.90
  24. #6 – Upgrade Elasticsearch

    v1.x: Loads fields into memory and creates a data structure for searching at query time
    • Slow searches
    • No available memory for caching
    • OutOfMemory exceptions for large indices
  25. #6 – Upgrade Elasticsearch

    v2.x: Creates a columnar data structure on disk at write time
    • Fast searches
    • Small memory usage
    • Works for indices of any size
  26. #7 – Scroll in large pages

    GET /datapoints/_search?size=1000&scroll=true
    GET /_search/_scroll
    GET /_search/_scroll
    GET /_search/_scroll
    GET /_search/_scroll
    …
  27. #7 – Scroll in large pages

    GET /datapoints/_search?size=10000&scroll=true
    GET /_search/_scroll
    GET /_search/_scroll
    GET /_search/_scroll
    GET /_search/_scroll
    …
    5x faster datapoint exports
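
The requests on these slides are shorthand. In the real API the `scroll` parameter takes a keep-alive duration such as `1m`, and follow-up pages are fetched from `/_search/scroll` with the `_scroll_id` returned by the previous response. The loop itself can be sketched as below. `first_page` and `next_page` are hypothetical callables that wrap those two HTTP requests and return the parsed JSON response.

```python
def scroll_all(first_page, next_page):
    """Yield every hit of a scrolled search, page by page."""
    page = first_page()
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Each response carries the ID to use for the next page.
        page = next_page(page["_scroll_id"])


# Usage with canned responses instead of a live cluster:
pages = iter([
    {"_scroll_id": "a", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}},
    {"_scroll_id": "b", "hits": {"hits": [{"_id": "3"}]}},
    {"_scroll_id": "c", "hits": {"hits": []}},
])
exported = list(scroll_all(lambda: next(pages), lambda scroll_id: next(pages)))
print(len(exported))  # 3
```

A larger `size` means fewer round trips for the same export, which is presumably where the speedup on slide 27 comes from.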
  28. #8 – Use cached counts when possible GET /all-datapoints/datapoint/_count

  29. #8 – Use cached counts when possible

    GET /_cat/indices

    index        docs.count
    users            110843
    …
    datapoints  43879824976
    …
    devices          180301
    streams        11793537

    Datapoint count speedup from 18s to 0.7s
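
The cat API returns plain text, and its `h` parameter selects columns, so `GET /_cat/indices?h=index,docs.count` yields two whitespace-separated columns per line. A small parser for that shape is sketched below. The function name is hypothetical and the sample rows are the ones shown on the slide.

```python
def parse_doc_counts(cat_output: str) -> dict:
    """Map index name to docs.count from two-column _cat/indices output."""
    counts = {}
    for line in cat_output.splitlines():
        parts = line.split()
        # Skip header rows, blank lines, and anything unexpected.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


sample = """\
index docs.count
users 110843
datapoints 43879824976
devices 180301
streams 11793537
"""
counts = parse_doc_counts(sample)
print(counts["datapoints"])  # 43879824976
```

With time-series indices, the total is the sum of the counts of every index that shares the prefix.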
  30. #9 – Use timeouts API request Elasticsearch request 30s

  31. #9 – Use timeouts

    GET /datapoints/_search {"query":{…}}
    GET /datapoints/_search {"query":{…},"timeout":"30s"}

  32. #9 – Use timeouts API request Elasticsearch request 30s
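
The search `timeout` is best effort: when it expires, Elasticsearch returns the hits gathered so far and sets `timed_out` to true in the response. A sketch of adding the timeout and checking that flag follows. Both helper names are hypothetical, and the `match_all` query is only a placeholder.

```python
def with_timeout(body: dict, timeout: str = "30s") -> dict:
    """Return a copy of a search body with a request timeout added."""
    return {**body, "timeout": timeout}


def is_partial(response: dict) -> bool:
    """True when Elasticsearch reports that the search hit its timeout."""
    return bool(response.get("timed_out", False))


body = with_timeout({"query": {"match_all": {}}})
print(body)
print(is_partial({"took": 30000, "timed_out": True, "hits": {"hits": []}}))
```

Matching the Elasticsearch timeout to the API request timeout keeps searches from running after the client has already given up.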

  33. #10 – Use unscored queries

    { "query": { "bool": { "must": { "term": { "first_name": "janko" } } } } }
  34. #10 – Use unscored queries

    { "_index": "accounts", "_type": "account", "_id": "AVWGIxr7A4FHE06BKOJi", "_score": 11.933598, "_source": { … } }
  35. #10 – Use unscored queries

    { "query": { "bool": { "filter": { "term": { "first_name": "janko" } } } } }
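
Clauses under `filter` run in filter context: they only decide whether a document matches, so no relevance score is computed for them and their results can be cached. The rewrite from slide 33 to slide 35 can be sketched as moving `must` clauses into `filter`. The helper names are hypothetical, and the rewrite is only appropriate when the `_score` is not used.

```python
def as_list(clauses):
    """Normalize a bool clause (missing, single dict, or list) to a list."""
    if clauses is None:
        return []
    return list(clauses) if isinstance(clauses, list) else [clauses]


def unscored(body: dict) -> dict:
    """Return a copy of a bool query with its `must` clauses moved to `filter`."""
    bool_query = dict(body["query"]["bool"])
    clauses = as_list(bool_query.pop("filter", None)) + as_list(bool_query.pop("must", None))
    bool_query["filter"] = clauses
    return {**body, "query": {"bool": bool_query}}


scored = {"query": {"bool": {"must": {"term": {"first_name": "janko"}}}}}
print(unscored(scored))
# {'query': {'bool': {'filter': [{'term': {'first_name': 'janko'}}]}}}
```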
  36. The End