Slide 1

Slide 1 text

ELASTICSEARCH & MARVEL EKO KURNIAWAN KHANNEDY

Slide 2

Slide 2 text

EKO KURNIAWAN KHANNEDY
▸ Principal Software Development Engineer at Blibli.
▸ Part of the Research and Development team at Blibli.
▸ Codes in Scala and Java, and sometimes in Ruby (this demo uses Ruby).
▸ https://www.linkedin.com/in/khannedy

Slide 3

Slide 3 text

AGENDA
▸ What is Elasticsearch?
▸ Cluster
▸ Shard & Replication
▸ Distributed Document Store
▸ Distributed Search
▸ Marvel
▸ DEMO
  ▸ Setup Elasticsearch
  ▸ Scale Horizontally
  ▸ Data In & Data Out
  ▸ Zero Downtime Migration
  ▸ Marvel Monitoring Tool

Slide 4

Slide 4 text

WHAT IS ELASTICSEARCH?

Slide 5

Slide 5 text

YOU KNOW, FOR SEARCH

Slide 6

Slide 6 text

CLUSTER

Slide 7

Slide 7 text

ELASTICSEARCH
▸ Elasticsearch is built to be always available and to scale with your needs. Scale can come from buying bigger servers (vertical scaling) or from buying more servers (horizontal scaling).
▸ Real scalability comes from horizontal scaling: the ability to add more nodes to the cluster and to spread load and reliability between them.
▸ Elasticsearch is distributed by nature; it knows how to manage multiple nodes to provide scale and high availability. This also means that your application doesn’t need to care about it.

Slide 8

Slide 8 text

ELASTICSEARCH CLUSTER
▸ A node is a running instance of Elasticsearch.
▸ A cluster consists of one or more nodes with the same cluster.name that work together to share their data and workload.
▸ As nodes are added to or removed from the cluster, the cluster reorganizes itself to spread the data evenly.

Slide 9

Slide 9 text

CLUSTER HEALTH
▸ GREEN: All primary and replica shards are active.
▸ YELLOW: All primary shards are active, but not all replica shards are active.
▸ RED: Not all primary shards are active.
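The three health states can be written down as a small decision function. This is a minimal sketch, not Elasticsearch code: the `Shard` record, the `cluster_health` helper, and the index name `blogs` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    index: str
    number: int
    primary: bool  # True for a primary, False for a replica
    active: bool   # True once the copy is allocated and started


def cluster_health(shards):
    """Return 'green', 'yellow' or 'red' following the three rules."""
    if any(s.primary and not s.active for s in shards):
        return "red"
    if any(not s.primary and not s.active for s in shards):
        return "yellow"
    return "green"


# One node, one index with 3 primaries and 1 replica each: the replicas
# have no second node to live on, so they stay inactive.
single_node = [Shard("blogs", n, True, True) for n in range(3)] + \
              [Shard("blogs", n, False, False) for n in range(3)]
print(cluster_health(single_node))  # yellow
```

A single-node cluster with replicas configured therefore reports yellow until a second node joins.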

Slide 10

Slide 10 text

AN EMPTY CLUSTER

Slide 11

Slide 11 text

ADD AN INDEX

Slide 12

Slide 12 text

A TWO-NODE CLUSTER

Slide 13

Slide 13 text

SCALE HORIZONTALLY

Slide 14

Slide 14 text

CLUSTER AFTER KILLING ONE NODE

Slide 15

Slide 15 text

SHARD & REPLICA

Slide 16

Slide 16 text

ELASTICSEARCH SHARD & REPLICA
▸ By default, Elasticsearch gives each index 5 primary shards and 1 replica.
▸ You can change the number of replicas at runtime, without downtime.
▸ You cannot change the number of primary shards of an existing index. To change it, create a new index and migrate the data from the old index to the new one.
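As an illustration, the two settings are passed as JSON request bodies: the shard count when the index is created, the replica count at any time through the index settings endpoint. The sketch below only builds the requests and sends nothing; the helper names and the index name `blogs` are assumptions made for this example.

```python
import json


def create_index_request(index, shards, replicas):
    """Build the (method, path, body) triple for creating an index."""
    body = {"settings": {"number_of_shards": shards,
                         "number_of_replicas": replicas}}
    return "PUT", f"/{index}", json.dumps(body)


def update_replicas_request(index, replicas):
    """Build the request that changes the replica count of a live index."""
    body = {"number_of_replicas": replicas}
    return "PUT", f"/{index}/_settings", json.dumps(body)


# Print what would be sent; no HTTP call is made here.
for method, path, body in (create_index_request("blogs", 3, 1),
                           update_replicas_request("blogs", 2)):
    print(method, path, body)
```

The settings update carries only the replica count, because the number of primary shards is fixed once the index exists.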

Slide 17

Slide 17 text

3 SHARDS WITH NO REPLICA

Slide 18

Slide 18 text

3 SHARDS WITH 1 REPLICA

Slide 19

Slide 19 text

3 SHARDS WITH 2 REPLICAS
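The three layouts differ only in how many copies of each shard exist: an index with P primaries and R replicas has P * (1 + R) shard copies. The sketch below spreads those copies over three nodes with a toy round-robin rule. It is not the real Elasticsearch allocator; `place_copies` is a hypothetical helper whose only guarantee is that two copies of the same shard never land on the same node.

```python
def place_copies(primaries, replicas, nodes):
    """Toy placement: copy c of shard s goes to node (s + c) % nodes.

    Copy 0 is the primary. A copy is left unassigned when every node
    already holds a copy of that shard.
    """
    layout = {n: [] for n in range(nodes)}
    unassigned = []
    for s in range(primaries):
        for c in range(replicas + 1):
            label = f"P{s}" if c == 0 else f"R{s}"
            if c < nodes:
                layout[(s + c) % nodes].append(label)
            else:
                unassigned.append(label)
    return layout, unassigned


for replicas in (0, 1, 2):
    layout, unassigned = place_copies(primaries=3, replicas=replicas, nodes=3)
    total = 3 * (1 + replicas)
    print(f"{replicas} replica(s): {total} shard copies -> {layout}")
```

With 2 replicas on 3 nodes, every node ends up holding one copy of every shard, so any single node can serve any document.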

Slide 20

Slide 20 text

DISTRIBUTED DOCUMENT STORE

Slide 21

Slide 21 text

ROUTING A DOCUMENT TO A SHARD
shard = hash(routing) % number_of_primary_shards
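The routing formula also explains why the number of primary shards is fixed at index creation: change the divisor and existing documents would be looked up on the wrong shard. A minimal sketch follows, assuming the routing value is the document ID (the default) and using `zlib.crc32` as a stand-in hash. It is not the hash function Elasticsearch uses, and the document IDs are made up.

```python
import zlib


def route(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards"""
    # zlib.crc32 is a stand-in chosen because it is deterministic across
    # runs; it is not the hash Elasticsearch uses internally.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = ["user-1", "user-2", "user-3", "user-4"]
with_3 = {d: route(d, 3) for d in doc_ids}
with_4 = {d: route(d, 4) for d in doc_ids}
moved = [d for d in doc_ids if with_3[d] != with_4[d]]

print("3 primary shards:", with_3)
print("4 primary shards:", with_4)
print("documents whose shard would change:", moved)
```

Any document listed in the last line would become unreachable by ID if the shard count were changed in place, which is why a new index and a migration are needed instead.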

Slide 22

Slide 22 text

CREATING, INDEXING AND DELETING A DOCUMENT

Slide 23

Slide 23 text

RETRIEVING A DOCUMENT

Slide 24

Slide 24 text

UPDATING A DOCUMENT

Slide 25

Slide 25 text

IMMUTABILITY
▸ The inverted index that is written to disk is immutable: it doesn’t change. Ever. This immutability has important benefits.
▸ There is no need for locking. If you never have to update the index, you never have to worry about multiple processes trying to make changes at the same time.

Slide 26

Slide 26 text

DELETES AND UPDATES
▸ Because segments are immutable, a document cannot be removed from an existing segment, nor can a segment be updated to hold a newer version of the document.
▸ Instead, every commit point includes a .del file that lists which documents have been deleted.
▸ When a document is deleted, it is actually only marked as deleted in the .del file.
▸ Document updates work in a similar way: when a document is updated, the old version is marked as deleted, and the new version is indexed in a new segment.
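The mark-as-deleted idea can be modelled in a few lines. This is a toy model under simplifying assumptions (one document per segment, no merging, no real files); `ToyIndex` and its methods are hypothetical names, not part of Elasticsearch or Lucene.

```python
class ToyIndex:
    """Append-only segments plus a set of deletion markers."""

    def __init__(self):
        self.segments = []    # each segment: immutable tuple of (doc_id, source)
        self.deleted = set()  # (segment_no, position) pairs, like a .del file

    def _mark_deleted(self, doc_id):
        for seg_no, segment in enumerate(self.segments):
            for pos, (existing_id, _) in enumerate(segment):
                if existing_id == doc_id:
                    self.deleted.add((seg_no, pos))

    def index(self, doc_id, source):
        # An update marks the old version deleted, then writes a new segment.
        self._mark_deleted(doc_id)
        self.segments.append(((doc_id, source),))

    def delete(self, doc_id):
        # Nothing is removed from a segment; only a marker is added.
        self._mark_deleted(doc_id)

    def get(self, doc_id):
        for seg_no, segment in enumerate(self.segments):
            for pos, (existing_id, source) in enumerate(segment):
                if existing_id == doc_id and (seg_no, pos) not in self.deleted:
                    return source
        return None


idx = ToyIndex()
idx.index("1", {"title": "first draft"})
idx.index("1", {"title": "second draft"})
print(idx.get("1"))       # {'title': 'second draft'}
print(len(idx.segments))  # 2
idx.delete("1")
print(idx.get("1"))       # None
```

Both versions of the document remain stored after the update; only the marker set decides which one is visible.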

Slide 27

Slide 27 text

DISTRIBUTED SEARCH

Slide 28

Slide 28 text

DISTRIBUTED SEARCH
▸ Search requires a more complicated execution model because we don’t know which documents will match the query: they could be on any shard in the cluster.
▸ Finding all matching documents is only half the story. Results from multiple shards must be combined into a single sorted list before they are returned.
▸ For this reason, search is executed in a two-phase process called “query then fetch”.
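The two phases can be sketched with in-memory shards. This is a simplified model, not the real implementation: the shard contents, the scores, and the function names are invented for the example, and every document is assumed to match the query.

```python
import heapq

# Three toy shards; each maps doc id -> (score, stored document).
SHARDS = [
    {"a": (0.9, "doc a"), "b": (0.2, "doc b")},
    {"c": (0.7, "doc c"), "d": (0.4, "doc d")},
    {"e": (0.8, "doc e"), "f": (0.1, "doc f")},
]


def query_phase(shard_no, from_, size):
    """Each shard returns only ids and scores for its top from + size hits."""
    hits = [(score, doc_id, shard_no)
            for doc_id, (score, _) in SHARDS[shard_no].items()]
    return heapq.nlargest(from_ + size, hits)


def search(from_, size):
    # Query phase: gather lightweight (score, id, shard) entries.
    candidates = []
    for shard_no in range(len(SHARDS)):
        candidates.extend(query_phase(shard_no, from_, size))
    # The coordinating node merges them and keeps the requested page.
    page = sorted(candidates, reverse=True)[from_:from_ + size]
    # Fetch phase: load full documents only for the hits on that page.
    return [SHARDS[shard_no][doc_id][1] for _, doc_id, shard_no in page]


print(search(from_=0, size=3))  # ['doc a', 'doc e', 'doc c']
print(search(from_=3, size=2))  # ['doc d', 'doc b']
```

Only IDs and scores travel in the first phase; full documents are read once the coordinating node knows which hits make the page.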

Slide 29

Slide 29 text

QUERY PHASE

Slide 30

Slide 30 text

FETCH PHASE

Slide 31

Slide 31 text

DEEP PAGINATION
▸ Remember that each shard must build a priority queue of length from + size, all of which needs to be passed back to the coordinating node. The coordinating node then needs to sort through number_of_shards * (from + size) documents in order to find the correct size documents.
▸ With big enough from values, the sorting process can become very heavy indeed, using vast amounts of CPU, memory, and bandwidth.
▸ For this reason, we strongly advise against deep paging.
▸ As an alternative, we can use the Scan & Scroll API for deep pagination.
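The cost on the coordinating node grows with both the shard count and the page offset, since it has to sort number_of_shards * (from + size) entries. A small worked example, assuming 5 shards and a page size of 10 (`coordinator_entries` is a hypothetical helper name):

```python
def coordinator_entries(number_of_shards, from_, size):
    """Entries the coordinating node must sort for one page of results."""
    return number_of_shards * (from_ + size)


for from_ in (0, 1_000, 10_000, 100_000):
    n = coordinator_entries(number_of_shards=5, from_=from_, size=10)
    print(f"from={from_:>7}  size=10  entries sorted={n:>7}")
```

Fetching 10 results at offset 100,000 makes the coordinating node sort 500,050 entries, of which it throws away all but 10.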

Slide 32

Slide 32 text

MARVEL

Slide 33

Slide 33 text

WHAT IS MARVEL?
▸ Marvel is a monitoring tool for Elasticsearch.
▸ Marvel can monitor all nodes in an Elasticsearch cluster.
▸ Marvel runs on top of Kibana.

Slide 34

Slide 34 text

CLUSTER METRICS

Slide 35

Slide 35 text

INDEX METRICS

Slide 36

Slide 36 text

NODE METRICS

Slide 37

Slide 37 text

DEMO

Slide 38

Slide 38 text

THANKS

Slide 39

Slide 39 text

REFERENCES
▸ https://www.elastic.co/
▸ https://www.elastic.co/products/elasticsearch
▸ https://www.elastic.co/products/marvel
▸ https://www.elastic.co/learn