Scaling Elasticsearch - Speaker Deck

Slide 1

Slide 1 text

Scaling Elasticsearch janko-m @jankomarohnic at AT&T M2X

Slide 2

Slide 2 text

Elasticsearch is a distributed, RESTful search engine

Slide 3

Slide 3 text

Elasticsearch is a distributed, RESTful search engine … { }

Slide 4

Slide 4 text

Elasticsearch is a distributed, RESTful search engine GET /users/_search {"query":{"term":{"first_name":"janko"}}} => { ... }

Slide 5

Slide 5 text

Elasticsearch is a distributed, RESTful search engine "car" A car (or automobile) is a wheeled motor vehicle used for transportation. Most definitions of car say they run primarily on roads, seat one to eight people, have four tires, and mainly transport people rather than goods. Cars came into global use during the 20th century, and developed economies depend on them. The year 1886 is regarded as the birth year of the modern car, when German inventor Karl Benz built his Benz Patent-Motorwagen. Cars did not become widely available until the early 20th century. One of the first cars that was accessible to the masses was the 1908 Model T, an American car manufactured by the Ford Motor Company. Cars were rapidly adopted in the US, where they replaced animal-drawn carriages and carts, but took much longer to be accepted in Western Europe and other parts of the world.

Slide 6

Slide 6 text

Cluster – group of nodes
Node – server that holds part of the data
Index – similar to a "table" in a relational database
Shard – partition of an index

Slide 7

Slide 7 text

M2X
• Stores time-series data
• Datapoints (temperature, humidity, speed, etc.)
• Geolocations
• Triggers
• Widgets & Dashboards

Slide 8

Slide 8 text

M2X – datapoints
Month             New Datapoints
July 2016             42,216,421
August 2016           38,456,296
September 2016        43,252,336
October 2016          89,572,942
November 2016        222,608,051
December 2016        338,588,651
January 2017       2,326,317,955
February 2017      7,192,489,182
⋮                              ⋮

Slide 9

Slide 9 text

#1 – Time-series indices datapoints

Slide 10

Slide 10 text

#1 – Time-series indices … datapoints-201501 datapoints-201502 datapoints-201503 datapoints-201504 datapoints-201505 … datapoints-201601 datapoints-201602 datapoints-201603 datapoints-201604 datapoints-201605 …

Slide 11

Slide 11 text

#1 – Time-series indices … datapoints-20170101 datapoints-20170102 datapoints-20170103 datapoints-20170104 datapoints-20170105 … datapoints-20170201 datapoints-20170202 datapoints-20170203 datapoints-20170204 datapoints-20170205 …
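The naming scheme on these slides can be sketched as a small helper that maps a datapoint's timestamp to the index it belongs in. This is only an illustration: the function names and the `prefix` parameter are hypothetical, and the slides do not say how M2X actually derives the index name.

```python
from datetime import datetime, timezone


def monthly_index(ts: datetime, prefix: str = "datapoints") -> str:
    """Index name for monthly partitioning, e.g. datapoints-201502."""
    return f"{prefix}-{ts:%Y%m}"


def daily_index(ts: datetime, prefix: str = "datapoints") -> str:
    """Index name for daily partitioning, e.g. datapoints-20170103."""
    return f"{prefix}-{ts:%Y%m%d}"


# A datapoint is written to the index that covers its own timestamp.
ts = datetime(2017, 1, 3, 14, 30, tzinfo=timezone.utc)
print(monthly_index(ts))
print(daily_index(ts))
```

Smaller time buckets keep each index small, at the cost of more indices to manage and query.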

Slide 12

Slide 12 text

#2 – Reduce number of indices you need to query
…
datapoints-20161229    29 December 2016
datapoints-20161230    30 December 2016
datapoints-20161231    31 December 2016
datapoints-20170101    1 January 2017
datapoints-20170102    2 January 2017
datapoints-20170103    3 January 2017
datapoints-20170104    4 January 2017
datapoints-20170105    5 January 2017
datapoints-20170106    6 January 2017
datapoints-20170107    7 January 2017
…
Device created
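One reading of this slide is that a device cannot have datapoints in indices older than its creation date, so those indices can be left out of the query. A minimal sketch under that assumption follows; `indices_since` and its parameters are hypothetical names, not part of M2X or Elasticsearch.

```python
from datetime import date, timedelta
from typing import List


def indices_since(created: date, today: date, prefix: str = "datapoints") -> List[str]:
    """Daily index names from the device's creation date up to today, inclusive.

    Indices older than `created` are skipped because the device did not exist yet.
    """
    days = (today - created).days
    return [
        f"{prefix}-{created + timedelta(days=offset):%Y%m%d}"
        for offset in range(days + 1)
    ]


# A device created on 1 January 2017, queried on 7 January 2017.
names = indices_since(date(2017, 1, 1), date(2017, 1, 7))
print(len(names), names[0], names[-1])
```

The resulting names can be joined with commas in the request path, since Elasticsearch accepts a comma-separated list of indices in a search URL.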

Slide 13

Slide 13 text

Slide 14

Slide 14 text

#3 – Query fewer indices at once
Q: Fetch 100 latest datapoints
GET /all-datapoints/_search
GET /v1-datapoints-20170616/_search
GET /v1-datapoints-20170615/_search
GET /v1-datapoints-20170614/_search
GET /v1-datapoints-20170613/_search
GET /v1-datapoints-20170612/_search
…
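The per-day requests above suggest walking the indices from newest to oldest and stopping once enough datapoints have been collected. The sketch below assumes that interpretation. The `search` callable is a hypothetical stand-in for whatever client performs the request; it is expected to return up to `size` hits from one index, newest first.

```python
from typing import Callable, Dict, Iterable, List


def fetch_latest(
    indices_newest_first: Iterable[str],
    search: Callable[[str, int], List[Dict]],
    limit: int = 100,
) -> List[Dict]:
    """Collect up to `limit` hits, querying one index at a time, newest first."""
    results: List[Dict] = []
    for index in indices_newest_first:
        remaining = limit - len(results)
        if remaining <= 0:
            break  # enough datapoints; older indices are never touched
        results.extend(search(index, remaining))
    return results


# In-memory stand-in for the cluster, for illustration only.
fake_data = {
    "v1-datapoints-20170616": [{"value": i} for i in range(60)],
    "v1-datapoints-20170615": [{"value": i} for i in range(60)],
    "v1-datapoints-20170614": [{"value": i} for i in range(60)],
}
queried: List[str] = []


def fake_search(index: str, size: int) -> List[Dict]:
    queried.append(index)
    return fake_data[index][:size]


hits = fetch_latest(list(fake_data), fake_search, limit=100)
print(len(hits), queried)
```

In this toy run the third index is never queried, which is the point of the tip: most "latest N" requests are satisfied by the newest one or two indices.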

Slide 15

Slide 15 text

#4 – # shards ≈ # nodes

Slide 16

Slide 16 text

#4 – # shards ≈ # nodes 5 shards

Slide 17

Slide 17 text

#4 – # shards ≈ # nodes 5 shards

Slide 18

Slide 18 text

#4 – # shards ≈ # nodes 15 shards

Slide 19

Slide 19 text

#5 – Determine a good sharding strategy QUERY ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ ⚙ RESULT

Slide 20

Slide 20 text

#5 – Determine a good sharding strategy QUERY ⚙ RESULT

Slide 21

Slide 21 text

#5 – Determine a good sharding strategy
Device
Stream 1 Stream 2 Stream 3 Stream 4 Stream 5 …
SHARDING
Q: Fetch stream values => 1 shard
Q: Fetch device values => 15 shards

Slide 22

Slide 22 text

#5 – Determine a good sharding strategy
Device
Stream 1 Stream 2 Stream 3 Stream 4 Stream 5 …
SHARDING
Q: Fetch stream values => 1 shard
Q: Fetch device values => 1 shard
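The difference between the two sharding slides can be illustrated with a toy model. Elasticsearch chooses a shard from a hash of the document's routing value modulo the number of primary shards; the sketch below uses `zlib.crc32` purely as a stand-in hash (not the one Elasticsearch uses), and the key formats and the 15-shard count are assumptions taken from the slide.

```python
import zlib
from typing import Iterable, Set

NUM_SHARDS = 15  # assumed from the "15 shards" on the slide


def shard_for(routing_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Toy shard selection: stand-in hash of the routing key, modulo shard count."""
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def shards_touched(routing_keys: Iterable[str], num_shards: int = NUM_SHARDS) -> Set[int]:
    """Set of shards that hold documents for the given routing keys."""
    return {shard_for(key, num_shards) for key in routing_keys}


device = "device-1"  # hypothetical identifiers
streams = [f"{device}/stream-{n}" for n in range(1, 6)]

# Routing by stream: each stream lands on its own shard, so a device-wide
# query may have to visit several shards.
by_stream = shards_touched(streams)

# Routing by device: every stream of the device shares one routing key,
# so both stream and device queries hit a single shard.
by_device = shards_touched(device for _ in streams)

print(len(by_stream), len(by_device))
```

The trade-off of routing by the coarser key is that very large devices concentrate their data on one shard.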

Slide 23

Slide 23 text

#6 – Upgrade Elasticsearch
5.1, 5.0
2.4, 2.3, 2.2, 2.1, 2.0
1.7, 1.6, 1.5, 1.4, 1.3
0.90

Slide 24

Slide 24 text

#6 – Upgrade Elasticsearch
v1.x: Loads fields into memory and creates a data structure for searching at query time
• Slow searches
• No available memory for caching
• OutOfMemory exceptions for large indices

Slide 25

Slide 25 text

#6 – Upgrade Elasticsearch
v2.x: Creates a columnar data structure on disk at write time
• Fast searches
• Small memory usage
• Works for indices of any size

Slide 26

Slide 26 text

#7 – Scroll in large pages
GET /datapoints/_search?size=1000&scroll=1m
GET /_search/scroll
GET /_search/scroll
GET /_search/scroll
GET /_search/scroll
…

Slide 27

Slide 27 text

#7 – Scroll in large pages
GET /datapoints/_search?size=10000&scroll=1m
GET /_search/scroll
GET /_search/scroll
GET /_search/scroll
GET /_search/scroll
…
5x faster datapoint exports
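The effect of page size on an export can be seen in a small simulation. The sketch below is not tied to any client library: `start` and `next_page` are hypothetical callables standing in for the initial search and the follow-up scroll requests, and the response shape (`scroll_id`, `hits`) is simplified compared with a real Elasticsearch response.

```python
from typing import Callable, Dict, Iterator, Tuple


def scroll_all(
    start: Callable[[int], Dict],
    next_page: Callable[[str], Dict],
    size: int,
) -> Iterator[Dict]:
    """Yield every hit, fetching `size` documents per round trip until a page is empty."""
    page = start(size)
    while page["hits"]:
        yield from page["hits"]
        page = next_page(page["scroll_id"])


def make_fake_cluster(total: int) -> Tuple[Callable, Callable, Dict]:
    """In-memory stand-in that serves `total` documents and counts requests."""
    state = {"pos": 0, "size": 0, "requests": 0}

    def next_page(scroll_id: str) -> Dict:
        state["requests"] += 1
        lo = state["pos"]
        hi = min(lo + state["size"], total)
        state["pos"] = hi
        return {"scroll_id": scroll_id, "hits": [{"n": i} for i in range(lo, hi)]}

    def start(size: int) -> Dict:
        state["size"] = size
        return next_page("fake-scroll-id")

    return start, next_page, state


for size in (1000, 10000):
    start, next_page, state = make_fake_cluster(total=25000)
    docs = list(scroll_all(start, next_page, size))
    print(size, len(docs), state["requests"])
```

Fewer, larger pages mean fewer round trips for the same export; the actual speedup depends on document size and network latency, so the "5x" on the slide is a measurement from the talk, not something this simulation reproduces.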

Slide 28

Slide 28 text

#8 – Use cached counts when possible GET /all-datapoints/datapoint/_count

Slide 29

Slide 29 text

#8 – Use cached counts when possible
GET /_cat/indices
index       docs.count
users       110843
…
datapoints  43879824976
…
devices     180301
streams     11793537
Datapoint count speedup from 18s to 0.7s
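Reading the count from the cat output could look roughly like the sketch below. It assumes the response has been reduced to two whitespace-separated columns, index name and `docs.count`, as shown on the slide (the cat APIs accept an `h` parameter to select columns); `cached_doc_count` is a hypothetical helper name.

```python
def cached_doc_count(cat_output: str, prefix: str) -> int:
    """Sum docs.count over all indices whose name starts with `prefix`.

    Lines that are not "<index> <integer>" (headers, elisions) are ignored.
    """
    total = 0
    for line in cat_output.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        index, count = parts
        if index.startswith(prefix):
            total += int(count)
    return total


# Sample text mirroring the figures on the slide.
sample = """\
index       docs.count
users       110843
datapoints  43879824976
devices     180301
streams     11793537
"""

print(cached_doc_count(sample, "datapoints"))
```

Summing by prefix also covers the case where datapoints are split across many time-based indices.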

Slide 30

Slide 30 text

#9 – Use timeouts API request Elasticsearch request 30s

Slide 31

Slide 31 text

#9 – Use timeouts
GET /datapoints/_search {"query":{…}}
GET /datapoints/_search {"query":{…},"timeout":"30s"}

Slide 32

Slide 32 text

#9 – Use timeouts API request Elasticsearch request 30s
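One way to keep the Elasticsearch request within the API request's time budget is to attach the timeout to every search body in a single place. The helper names below are hypothetical. Note that the search `timeout` is best effort: when it fires, Elasticsearch returns whatever hits it has gathered and sets `timed_out` to true in the response, so callers may want to check for partial results.

```python
def with_timeout(body: dict, seconds: int = 30) -> dict:
    """Return a copy of the search body with a timeout attached, e.g. "30s"."""
    return {**body, "timeout": f"{seconds}s"}


def is_partial(response: dict) -> bool:
    """True when the search hit its timeout and the results may be incomplete."""
    return bool(response.get("timed_out", False))


query = {"query": {"term": {"first_name": "janko"}}}
print(with_timeout(query))
print(is_partial({"timed_out": True, "hits": {"hits": []}}))
```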

Slide 33

Slide 33 text

#10 – Use unscored queries { "query": { "bool": { "must": { "term": { "first_name": "janko" } } } } }

Slide 34

Slide 34 text

#10 – Use unscored queries { "_index": "accounts", "_type": "account", "_id": "AVWGIxr7A4FHE06BKOJi", "_score": 11.933598, "_source": { … } }

Slide 35

Slide 35 text

#10 – Use unscored queries { "query": { "bool": { "filter": { "term": { "first_name": "janko" } } } } }
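The change between the scored and unscored versions is only the clause name inside `bool`: `must` computes a relevance `_score`, while `filter` skips scoring and lets Elasticsearch cache the result. A minimal sketch of building both bodies, with hypothetical helper names:

```python
def scored_term(field: str, value: str) -> dict:
    """Term query in a `must` clause: matches are ranked by relevance score."""
    return {"query": {"bool": {"must": {"term": {field: value}}}}}


def unscored_term(field: str, value: str) -> dict:
    """Same term query in a `filter` clause: yes/no matching, no scoring."""
    return {"query": {"bool": {"filter": {"term": {field: value}}}}}


print(scored_term("first_name", "janko"))
print(unscored_term("first_name", "janko"))
```

For exact-match lookups such as IDs, timestamps, or names, the score carries no useful information, so the filter form is usually the better default.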

Slide 36

Slide 36 text

The End