ES in Rome - Cross cluster search with Elasticsearch

Luca Cavanna
February 07, 2017

Transcript

  1. 2.

    Agenda
    1. Why cross cluster search?
    2. How does it work?
    3. How does it compare to tribe node?
  4. 8.

    Search API
    Client request: GET /posts/_search
    cluster: node1 (logs 2P, posts 1P), node2 (logs 3P, users 1P), node3 (logs 1P, posts 2P), where e.g. "2P" is the primary of shard 2
    posts index: 2 primaries
    logs index: 3 primaries
    users index: 1 primary
  5. 9.

    Search request parsing
    Client request: GET /posts/_search
    cluster: node1 (logs 2P, posts 1P), node2 (logs 3P, users 1P), node3 (logs 1P, posts 2P)
    The coordinating node parses the request
  6. 10.

    Query phase
    Client request: GET /posts/_search
    cluster: node1 (logs 2P, posts 1P), node2 (logs 3P, users 1P), node3 (logs 1P, posts 2P)
    The query gets executed on the relevant shards
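
As a rough illustration of "relevant shards", the sketch below looks up which nodes hold a shard of the searched index, using the layout shown on the slide. The `CLUSTER` table and `relevant_shards` helper are hypothetical names for this example, not Elasticsearch APIs, and replicas are ignored.

```python
# Shard layout copied from the slide: node -> list of (index, shard label).
CLUSTER = {
    "node1": [("logs", "2P"), ("posts", "1P")],
    "node2": [("logs", "3P"), ("users", "1P")],
    "node3": [("logs", "1P"), ("posts", "2P")],
}


def relevant_shards(cluster, index):
    """Return sorted (node, shard label) pairs that hold a shard of `index`."""
    return sorted(
        (node, shard)
        for node, shards in cluster.items()
        for idx, shard in shards
        if idx == index
    )


# GET /posts/_search only needs the two posts shards.
targets = relevant_shards(CLUSTER, "posts")
print(targets)  # [('node1', '1P'), ('node3', '2P')]
```

With this layout, node2 holds no posts shard, so it would not take part in the query phase for this request.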
  7. 11.

    Reduce phase
    Client request: GET /posts/_search
    cluster: node1 (logs 2P, posts 1P), node2 (logs 3P, users 1P), node3 (logs 1P, posts 2P)
    The coordinating node receives "size" hits per shard and performs reduction
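
The reduction step can be pictured as a merge of per-shard result lists. The following is a minimal sketch, assuming each shard returns its own top hits as score and document id pairs; `ShardHit`, `reduce_top_hits`, and the sample scores are invented for illustration and do not mirror Elasticsearch internals.

```python
import heapq
from typing import NamedTuple


class ShardHit(NamedTuple):
    score: float
    shard: str   # hypothetical shard label, e.g. "posts[1]"
    doc_id: str


def reduce_top_hits(per_shard_hits, size):
    """Merge per-shard top hits into the global top `size`, best score first."""
    all_hits = (hit for hits in per_shard_hits for hit in hits)
    return heapq.nlargest(size, all_hits, key=lambda hit: hit.score)


# Each shard already returned its own top 2 hits (size=2).
shard1 = [ShardHit(3.2, "posts[1]", "a"), ShardHit(1.1, "posts[1]", "b")]
shard2 = [ShardHit(2.7, "posts[2]", "c"), ShardHit(2.0, "posts[2]", "d")]

top = reduce_top_hits([shard1, shard2], size=2)
print([hit.doc_id for hit in top])  # ['a', 'c']
```

Only scores and ids are needed at this point; the document bodies are loaded later, in the fetch phase.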
  8. 12.

    Fetch phase
    Client request: GET /posts/_search
    cluster: node1 (logs 2P, posts 1P), node2 (logs 3P, users 1P), node3 (logs 1P, posts 2P)
    The coordinating node fetches the top hits from the relevant shards
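
One way to think about the fetch phase: after reduction, the winning document ids are grouped by the shard that holds them, so only those shards are contacted. This is a sketch under that assumption; `plan_fetch` and the shard labels are hypothetical.

```python
from collections import defaultdict


def plan_fetch(top_hits):
    """Group (shard, doc_id) pairs by shard, keeping ranking order per shard."""
    plan = defaultdict(list)
    for shard, doc_id in top_hits:
        plan[shard].append(doc_id)
    return dict(plan)


# Globally ranked top hits produced by the reduce phase (illustrative data).
top_hits = [("posts[1]", "a"), ("posts[2]", "c"), ("posts[1]", "b")]

fetch_plan = plan_fetch(top_hits)
print(fetch_plan)  # {'posts[1]': ['a', 'b'], 'posts[2]': ['c']}
```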
  9. 13.

    Search response
    cluster: node1 (logs 2P, posts 1P), node2 (logs 3P, users 1P), node3 (logs 1P, posts 2P)
    The coordinating node returns the top hits back to the client
  10. 15.

    Register remote clusters

    PUT /_cluster/settings
    {
      "persistent" : {
        "search.remote" : {
          "australia" : { "seeds": "host_au:9300" },
          "usa" : { "seeds": "host_us:9300" }
        }
      }
    }
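
Elasticsearch settings can generally be written either nested, as on the slide, or as dotted keys. The sketch below builds the slide's request body as a Python dict and flattens it to show the individual setting names involved; `flatten` is a hypothetical helper, and nothing here talks to a cluster.

```python
import json


def flatten(settings, prefix=""):
    """Flatten nested settings into dotted keys."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Request body from the slide for PUT /_cluster/settings.
body = {
    "persistent": {
        "search.remote": {
            "australia": {"seeds": "host_au:9300"},
            "usa": {"seeds": "host_us:9300"},
        }
    }
}

flat = flatten(body["persistent"])
print(json.dumps(flat, indent=2))
```

The flattened form has two keys, `search.remote.australia.seeds` and `search.remote.usa.seeds`, one seed setting per cluster alias.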
  11. 16.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    The coordinating node parses the request
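
Parsing the index expression amounts to splitting each entry on the `cluster:index` separator. A minimal sketch follows, assuming plain comma-separated names with no wildcards; `group_indices` and the `"(local)"` key are made up for this example, and a real implementation would also check that each alias is a registered remote cluster.

```python
def group_indices(expression, local_key="(local)"):
    """Group a comma-separated index expression by cluster alias."""
    groups = {}
    for item in expression.split(","):
        cluster, sep, index = item.partition(":")
        if not sep:
            # No "alias:" prefix, so the index lives on the local cluster.
            cluster, index = local_key, item
        groups.setdefault(cluster, []).append(index)
    return groups


groups = group_indices("posts,usa:posts,australia:posts")
print(groups)
# {'(local)': ['posts'], 'usa': ['posts'], 'australia': ['posts']}
```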
  12. 17.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    The coordinating node fetches info about remote indices and their shards
  13. 18.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    The query gets executed on the relevant local shards
  14. 19.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    The query also gets executed on the relevant remote shards
  15. 20.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    The coordinating node receives "size" hits per shard and performs reduction
  16. 21.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    The coordinating node fetches the top hits from the relevant shards
  17. 22.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    The coordinating node returns the top hits back to the client
  18. 23.

    Search response

    "hits" : [
      { "_index" : "australia:posts", ... },
      { "_index" : "posts", ... },
      { "_index" : "usa:posts", ... }
    ]
  19. 25.

    Limit connections per cluster

    PUT /_cluster/settings
    {
      "persistent" : {
        "search.remote" : {
          "initial_connect_timeout": "30s",
          "australia" : { "seeds": "host_au:9300" },
          "usa" : { "seeds": "host_us:9300" }
        }
      }
    }
  20. 26.

    Limit connections per cluster

    PUT /_cluster/settings
    {
      "persistent" : {
        "search.remote" : {
          "australia" : { "seeds": "host_au:9300", "connections_per_cluster": 3 },
          "usa" : { "seeds": "host_us:9300", "connections_per_cluster": 1 }
        }
      }
    }
  21. 27.

    Select nodes to connect to

    PUT /_cluster/settings
    {
      "persistent" : {
        "search.remote" : {
          "node.attr": "gateway",
          "australia" : { "seeds": "host_au:9300" },
          "usa" : { "seeds": "host_us:9300" }
        }
      }
    }
  22. 28.

    Client request: GET /posts,usa:posts,australia:posts/_search
    europe: node1 (posts 1P, users 1P), node2 (posts 3P, posts 2P)
    usa: node2 (posts 2P, posts 1P)
    australia: node1 (logs 1P), node2 (posts 1P)
    Two remote nodes are labelled "gateway"
    The coordinating node only communicates with the nodes marked as "gateway"
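
The combined effect of the node attribute filter and the connection limit can be sketched as a filter followed by a cap. This assumes remote nodes advertise the attribute with the value "true" (for instance via `node.attr.gateway: true` in their configuration); `pick_gateways` and the node names are hypothetical, and the real selection logic may differ.

```python
def pick_gateways(remote_nodes, attr, max_connections):
    """Return names of nodes carrying `attr`, capped at `max_connections`."""
    eligible = [
        node["name"]
        for node in remote_nodes
        if node["attrs"].get(attr) == "true"
    ]
    return eligible[:max_connections]


# Illustrative remote cluster: two of three nodes are marked as gateways.
usa_nodes = [
    {"name": "usa_node1", "attrs": {"gateway": "true"}},
    {"name": "usa_node2", "attrs": {}},
    {"name": "usa_node3", "attrs": {"gateway": "true"}},
]

print(pick_gateways(usa_nodes, "gateway", 1))  # ['usa_node1']
print(pick_gateways(usa_nodes, "gateway", 3))  # ['usa_node1', 'usa_node3']
```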
  23. 30.

    Tribe node
    • Adding remote clusters requires node restart

    Cross cluster search
    • Remote clusters can be dynamically registered
  24. 31.

    Tribe node
    • Adding remote clusters requires node restart
    • Doesn't support indices with same names on different clusters

    Cross cluster search
    • Remote clusters can be dynamically registered
    • No limitations on indices naming
  25. 32.

    Tribe node
    • Adding remote clusters requires node restart
    • Doesn't support indices with same names on different clusters
    • Requires an additional node (tribe) to join all the remote clusters

    Cross cluster search
    • Remote clusters can be dynamically registered
    • No limitations on indices naming
    • No additional nodes required
  26. 33.

    Tribe node
    • Adding remote clusters requires node restart
    • Doesn't support indices with same names on different clusters
    • Requires an additional node (tribe) to join all the remote clusters
    • Bi-directional connections to every remote node

    Cross cluster search
    • Remote clusters can be dynamically registered
    • No limitations on indices naming
    • No additional nodes required
    • Uni-directional connections to selected gateway nodes
  27. 34.

    Tribe node
    • Receives all cluster state updates from remote clusters

    Cross cluster search
    • Retrieves info on demand from remote clusters
  28. 35.

    Tribe node
    • Receives all cluster state updates from remote clusters
    • Works with almost every API

    Cross cluster search
    • Retrieves info on demand from remote clusters
    • Specific to search API