Slide 1

Slide 1 text

Cross cluster search with Elasticsearch
@lucacavanna

Slide 2

Slide 2 text

Agenda
1. Why cross cluster search?
2. How does it work?
3. How does it compare to tribe node?

Slide 3

Slide 3 text

Why cross cluster search?

Slide 4

Slide 4 text


Slide 5

Slide 5 text


Slide 6

Slide 6 text

How does cross cluster search work?

Slide 7

Slide 7 text

Search detour

Slide 8

Slide 8 text

Search API
Client: GET /posts/_search
Cluster:
• node1: logs 2P, posts 1P
• node2: logs 3P, users 1P
• node3: logs 1P, posts 2P
Indices:
• posts index: 2 primaries
• logs index: 3 primaries
• users index: 1 primary

Slide 9

Slide 9 text

Search request parsing
Client: GET /posts/_search
The coordinating node parses the request

Slide 10

Slide 10 text

Query phase
Client: GET /posts/_search
The query is executed on the relevant shards
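To make "relevant shards" concrete, here is a minimal Python sketch that looks up where the shards of one index live. It assumes that a label such as "posts 1P" in the cluster diagram means primary shard 1 of the posts index; the `CLUSTER` mapping and the `relevant_shards` helper are illustrative names, not part of Elasticsearch.

```python
from typing import Dict, List, Tuple

# Cluster layout from the diagram, read as (index name, primary shard number).
# This reading of the "2P"/"1P" labels is an assumption.
CLUSTER: Dict[str, List[Tuple[str, int]]] = {
    "node1": [("logs", 2), ("posts", 1)],
    "node2": [("logs", 3), ("users", 1)],
    "node3": [("logs", 1), ("posts", 2)],
}


def relevant_shards(cluster: Dict[str, List[Tuple[str, int]]], index: str) -> List[Tuple[int, str]]:
    """Return (shard number, node) pairs holding shards of the given index."""
    return sorted(
        (shard, node)
        for node, shards in cluster.items()
        for name, shard in shards
        if name == index
    )


# GET /posts/_search only needs to reach the two posts shards.
print(relevant_shards(CLUSTER, "posts"))
```

With this layout, a search on posts touches node1 and node3 only; node2 holds no posts shard.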

Slide 11

Slide 11 text

Reduce phase
Client: GET /posts/_search
The coordinating node receives up to "size" hits from each shard and reduces them to the overall top hits
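The reduction step can be sketched in a few lines of Python. This is a simplified illustration, not the actual Elasticsearch implementation: it assumes each shard returns (score, document id) pairs already sorted by descending score, and the function name `reduce_top_hits` and the shard labels are made up for the example.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[float, str, str]  # (score, shard label, document id)


def reduce_top_hits(per_shard: Dict[str, List[Tuple[float, str]]], size: int) -> List[Hit]:
    """Merge the per-shard results into the overall top `size` hits."""
    candidates = [
        (score, shard, doc_id)
        for shard, hits in per_shard.items()
        for score, doc_id in hits[:size]  # a shard contributes at most `size` hits
    ]
    # Keep the `size` best candidates, highest score first.
    return heapq.nlargest(size, candidates, key=lambda hit: hit[0])


per_shard = {
    "posts[1]": [(2.1, "a"), (0.4, "b")],
    "posts[2]": [(1.7, "c"), (1.2, "d")],
}
print(reduce_top_hits(per_shard, size=2))
```

Only document ids and scores are needed at this point; the documents themselves are not transferred until the fetch phase.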

Slide 12

Slide 12 text

Fetch phase
Client: GET /posts/_search
The coordinating node fetches the top hits from the relevant shards
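A rough Python sketch of how the fetch requests could be planned: the top hits chosen during reduction are grouped by shard, so each shard is asked only for the documents it owns. The `plan_fetch` helper and the hit tuples are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Hit = Tuple[float, str, str]  # (score, shard label, document id)


def plan_fetch(top_hits: List[Hit]) -> Dict[str, List[str]]:
    """Group the ids of the top hits by the shard that holds them."""
    by_shard: Dict[str, List[str]] = defaultdict(list)
    for _score, shard, doc_id in top_hits:
        by_shard[shard].append(doc_id)
    return dict(by_shard)


top_hits = [(2.1, "posts[1]", "a"), (1.7, "posts[2]", "c"), (1.5, "posts[1]", "e")]
print(plan_fetch(top_hits))
```

Shards that contributed no top hit receive no fetch request at all.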

Slide 13

Slide 13 text

Search response
The coordinating node returns the top hits to the client

Slide 14

Slide 14 text

Cross cluster search

Slide 15

Slide 15 text

Register remote clusters
PUT /_cluster/settings
{
  "persistent" : {
    "search.remote" : {
      "australia" : { "seeds": "host_au:9300" },
      "usa" : { "seeds": "host_us:9300" }
    }
  }
}
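When many clusters need to be registered, the request body can be generated rather than written by hand. The Python sketch below builds the same JSON as the slide; the helper name `remote_cluster_settings` is an assumption, and sending the body (a PUT to /_cluster/settings with any HTTP client) is left out on purpose.

```python
import json
from typing import Dict


def remote_cluster_settings(seeds_by_alias: Dict[str, str]) -> dict:
    """Build the persistent settings body that registers remote clusters."""
    return {
        "persistent": {
            "search.remote": {
                alias: {"seeds": seeds} for alias, seeds in seeds_by_alias.items()
            }
        }
    }


body = remote_cluster_settings({"australia": "host_au:9300", "usa": "host_us:9300"})
print(json.dumps(body, indent=2))
```

The aliases chosen here ("australia", "usa") are the prefixes later used in index expressions such as usa:posts.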

Slide 16

Slide 16 text

Client: GET /posts,usa:posts,australia:posts/_search
The coordinating node parses the request
Clusters:
• europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
• usa: node2 (posts 2P, posts 1P)
• australia: node1 (logs 1P), node2 (posts 1P)
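Part of parsing the request is deciding which index names are local and which point at a remote cluster. A simplified Python sketch follows, assuming that any name of the form alias:index is remote; the `split_indices` helper and the empty-string key for the local cluster are conventions of this example only.

```python
from typing import Dict, List

LOCAL = ""  # key used in this sketch for indices on the local cluster


def split_indices(expression: str) -> Dict[str, List[str]]:
    """Group a comma-separated index expression by cluster alias."""
    groups: Dict[str, List[str]] = {}
    for name in expression.split(","):
        alias, separator, index = name.partition(":")
        if not separator:
            # No "alias:" prefix, so the index lives on the local cluster.
            alias, index = LOCAL, name
        groups.setdefault(alias, []).append(index)
    return groups


print(split_indices("posts,usa:posts,australia:posts"))
```

Each remote group can then be resolved against the cluster registered under that alias, while the local group follows the ordinary search path.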

Slide 17

Slide 17 text

Client: GET /posts,usa:posts,australia:posts/_search
The coordinating node fetches info about remote indices and their shards

Slide 18

Slide 18 text

Client: GET /posts,usa:posts,australia:posts/_search
The query is executed on the relevant local shards

Slide 19

Slide 19 text

Client: GET /posts,usa:posts,australia:posts/_search
The query is also executed on the relevant remote shards

Slide 20

Slide 20 text

Client: GET /posts,usa:posts,australia:posts/_search
The coordinating node receives up to "size" hits from each shard, local and remote, and reduces them to the overall top hits

Slide 21

Slide 21 text

Client: GET /posts,usa:posts,australia:posts/_search
The coordinating node fetches the top hits from the relevant shards

Slide 22

Slide 22 text

Client: GET /posts,usa:posts,australia:posts/_search
The coordinating node returns the top hits to the client

Slide 23

Slide 23 text

Search response
"hits" : [
  { "_index" : "australia:posts", ... },
  { "_index" : "posts", ... },
  { "_index" : "usa:posts", ... }
]
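Because every hit carries the cluster alias in its "_index" value, a client can tell where each document came from. A small Python sketch, assuming hits are plain dictionaries shaped like the response above; the `hits_per_cluster` helper and the "(local)" label are illustrative choices.

```python
from collections import Counter
from typing import Dict, List


def hits_per_cluster(hits: List[Dict[str, str]]) -> Dict[str, int]:
    """Count hits per cluster alias, using "(local)" for unprefixed indices."""
    counts: Counter = Counter()
    for hit in hits:
        alias, separator, _index = hit["_index"].partition(":")
        counts[alias if separator else "(local)"] += 1
    return dict(counts)


hits = [
    {"_index": "australia:posts"},
    {"_index": "posts"},
    {"_index": "usa:posts"},
]
print(hits_per_cluster(hits))
```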

Slide 24

Slide 24 text

Connection settings

Slide 25

Slide 25 text

Set the initial connect timeout
PUT /_cluster/settings
{
  "persistent" : {
    "search.remote" : {
      "initial_connect_timeout": "30s",
      "australia" : { "seeds": "host_au:9300" },
      "usa" : { "seeds": "host_us:9300" }
    }
  }
}

Slide 26

Slide 26 text

Limit connections per cluster
PUT /_cluster/settings
{
  "persistent" : {
    "search.remote" : {
      "australia" : {
        "seeds": "host_au:9300",
        "connections_per_cluster": 3
      },
      "usa" : {
        "seeds": "host_us:9300",
        "connections_per_cluster": 1
      }
    }
  }
}

Slide 27

Slide 27 text

Select nodes to connect to
PUT /_cluster/settings
{
  "persistent" : {
    "search.remote" : {
      "node.attr": "gateway",
      "australia" : { "seeds": "host_au:9300" },
      "usa" : { "seeds": "host_us:9300" }
    }
  }
}

Slide 28

Slide 28 text

Client: GET /posts,usa:posts,australia:posts/_search
The coordinating node only communicates with the nodes marked as "gateway"
Clusters:
• europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
• usa: node2 (posts 2P, posts 1P) [gateway]
• australia: node1 (logs 1P) [gateway], node2 (posts 1P)
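The combined effect of the node attribute filter and the connection limit can be sketched as follows. This is an illustration under stated assumptions, not Elasticsearch code: gateway nodes are taken to be those started with a node attribute named gateway set to true, and the node dictionaries and the `select_gateway_nodes` helper are invented for the example.

```python
from typing import Dict, List


def select_gateway_nodes(nodes: List[Dict], attr: str, connections_per_cluster: int) -> List[str]:
    """Pick the remote nodes to connect to: only nodes carrying the attribute,
    and no more of them than the connection limit allows."""
    eligible = [
        node["name"]
        for node in nodes
        if node.get("attributes", {}).get(attr) == "true"
    ]
    return eligible[:connections_per_cluster]


# Hypothetical view of the australia cluster: node1 is the gateway.
australia_nodes = [
    {"name": "node1", "attributes": {"gateway": "true"}},
    {"name": "node2", "attributes": {}},
]
print(select_gateway_nodes(australia_nodes, "gateway", 3))
```

Even though posts lives on node2 in australia, the coordinating node would only talk to node1, which forwards the shard-level requests inside its own cluster.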

Slide 29

Slide 29 text

How does it compare to tribe node?

Slide 30

Slide 30 text

Tribe node
• Adding remote clusters requires node restart

Cross cluster search
• Remote clusters can be dynamically registered

Slide 31

Slide 31 text

Tribe node
• Adding remote clusters requires node restart
• Doesn't support indices with same names on different clusters

Cross cluster search
• Remote clusters can be dynamically registered
• No limitations on index naming

Slide 32

Slide 32 text

Tribe node
• Adding remote clusters requires node restart
• Doesn't support indices with same names on different clusters
• Requires an additional node (tribe) to join all the remote clusters

Cross cluster search
• Remote clusters can be dynamically registered
• No limitations on index naming
• No additional nodes required

Slide 33

Slide 33 text

Tribe node
• Adding remote clusters requires node restart
• Doesn't support indices with same names on different clusters
• Requires an additional node (tribe) to join all the remote clusters
• Bi-directional connections to every remote node

Cross cluster search
• Remote clusters can be dynamically registered
• No limitations on index naming
• No additional nodes required
• Uni-directional connections to selected gateway nodes

Slide 34

Slide 34 text

Tribe node
• Receives all cluster state updates from remote clusters

Cross cluster search
• Retrieves info on demand from remote clusters

Slide 35

Slide 35 text

Tribe node
• Receives all cluster state updates from remote clusters
• Works with almost every API

Cross cluster search
• Retrieves info on demand from remote clusters
• Specific to the search API

Slide 36

Slide 36 text

Coming soon with Elasticsearch 5.3.0
https://www.elastic.co/guide/en/elasticsearch/reference/5.x/modules-cross-cluster-search.html

Slide 37

Slide 37 text

Thank you