Optimizing ElasticSearch on Google Compute Engine

Bhuvanesh
February 22, 2020

If you run Elasticsearch clusters on GCE, you need to look at capacity planning and at OS-level and Elasticsearch-level optimization. I presented this at GDG Delhi on Feb 22, 2020.

Transcript

  1. Agenda
    • Short intro to GCE
    • Elasticsearch terms
    • Capacity planning & architecture
    • Best practices for a production-grade ES cluster
  2. Compute Engine
    Compute Engine delivers configurable virtual machines running in Google’s data centers with access to high-performance networking infrastructure and block storage.
    • Live migration for VMs: Compute Engine virtual machines can live-migrate between host systems without rebooting, which keeps your applications running even when host systems require maintenance.
    • Preemptible VMs: Run batch jobs and fault-tolerant workloads on preemptible VMs to reduce your vCPU and memory costs by up to 80% while still getting the same performance and capabilities as regular VMs.
    • Sole-tenant nodes: Sole-tenant nodes are physical Compute Engine servers dedicated exclusively for your use. They simplify deployment for bring-your-own-license (BYOL) applications and give you access to the same machine types and VM configuration options as regular compute instances.
  3. What is Elasticsearch?
    • First released in 2010
    • Open source search and analytics engine
    • Elasticsearch is the central component of the Elastic Stack
    • Distributed processing
    • Works with all types of data (textual, numerical, geospatial, structured, and unstructured)
    • Powerful REST API
    • And everything is indexed
  4. Use cases
    • Logging and log analytics
    • Infrastructure metrics and container monitoring
    • Application performance monitoring
    • Geospatial data analysis and visualization
    • Security analytics
    • Enterprise search
    • Website search
    • And more
  5. ES Terms
    Master Node
    • The master node controls the cluster.
    • It is responsible for maintaining the metadata about the cluster.
    • It decides where to place the data and when to relocate it.
    • Multiple nodes can hold the master role.
    • But Elasticsearch elects only one of them as the active master.
    • In the event of failure, a new master is elected from the remaining master-eligible nodes.
  6. ES Terms
    Data Node
    • All of your data is stored here.
    • Responsible for managing the stored data.
    • Performs the operations when it is queried.
    Ingest Node
    • Pre-processes documents before they are actually indexed.
    • The ingest node intercepts bulk and index requests, applies transformations, and then passes the documents back to the index or bulk APIs.
  7. Memory
    Elasticsearch uses memory in two ways:
    1. Java heap
    2. Other processes
    “More memory, more time on garbage collection”
  8. Network
    From the GCP docs:
    • The egress traffic from a given VM instance is subject to maximum network egress throughput caps.
    • These caps depend on the number of vCPUs that the VM instance has. Each vCPU is subject to a 2 Gbps cap for peak performance.
    • Each additional vCPU increases the network cap, up to a theoretical maximum of 32 Gbps for each instance.
    • The actual performance you experience will vary depending on your workload. All caps are meant as maximum possible performance, not sustained performance.
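The cap rule quoted on this slide can be sketched as a small calculation. This is only an illustration of the two numbers given above (2 Gbps per vCPU, 32 Gbps per instance); the function name `egress_cap_gbps` is hypothetical, and some machine types (shared-core ones, for example) may follow different limits.

```python
def egress_cap_gbps(vcpus: int, per_vcpu_gbps: float = 2.0, max_gbps: float = 32.0) -> float:
    """Theoretical peak egress cap for a VM, using the figures quoted on the slide.

    Assumption: the cap grows linearly with vCPU count until it reaches the
    per-instance maximum. This is a peak figure, not sustained throughput.
    """
    if vcpus < 1:
        raise ValueError("a VM needs at least one vCPU")
    return min(vcpus * per_vcpu_gbps, max_gbps)


# Example: compare a few VM sizes.
for n in (2, 4, 16, 64):
    print(f"{n} vCPUs -> {egress_cap_gbps(n)} Gbps")
```

Under these assumptions the cap stops growing at 16 vCPUs, so a larger VM buys more CPU and memory for Elasticsearch but no extra peak egress.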
  9. How to identify the right VM size?
    1. Simulate your workload and run a load test.
    2. Or use Rally (https://github.com/elastic/rally).
  10. Swapping
    • Memory-based operations are very fast, but we can’t give the server unlimited memory.
    • The OS will swap out the memory of unused applications.
    • That’s bad for performance.
    Prevent swapping
    1. At the OS level (temporarily): sudo swapoff -a
    2. Configure swappiness in the kernel: vm.swappiness=1
    3. Enable the memory lock in elasticsearch.yml: bootstrap.memory_lock: true
  11. JVM Heap
    • By default, Elasticsearch tells the JVM to use a heap with a minimum and maximum size of 1 GB.
    • When moving to production, it is important to configure the heap size so that Elasticsearch has enough heap available.
    • Set the heap size to no more than 50% of your total memory.
    “The more heap available to Elasticsearch, the more memory it can use for its internal caches; larger heaps can cause longer garbage collection pauses” (from Elastic)
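As a rough sketch of the 50% rule on this slide, the helper below derives a heap size from the VM's RAM and prints the matching JVM flags. The function names are hypothetical, and the 31 GB ceiling is an assumption that is not on the slide: it reflects the common guidance to keep the heap below the JVM's compressed-pointer threshold (roughly 32 GB).

```python
def recommended_heap_gb(total_ram_gb: float, fraction: float = 0.5, ceiling_gb: int = 31) -> int:
    """Whole-GB heap size: at most `fraction` of RAM, capped at `ceiling_gb`.

    Assumptions: `fraction` follows the slide's 50% rule; `ceiling_gb` is an
    added safety cap that does not come from the slide.
    """
    if total_ram_gb < 2:
        raise ValueError("need at least 2 GB of RAM for a 1 GB heap")
    return int(min(total_ram_gb * fraction, ceiling_gb))


def jvm_options(heap_gb: int) -> list:
    """Minimum and maximum heap flags, set to the same value."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


# Example: a VM with 16 GB of RAM.
print(jvm_options(recommended_heap_gb(16)))
```

The two flags would go into Elasticsearch's jvm.options file; the rest of the RAM is left to the operating system.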
  12. ulimit
    ulimit sets per-process resource limits; here we raise the maximum number of open file descriptors for the elasticsearch user.
    In /etc/security/limits.conf:
        elasticsearch - nofile 65535
    On Ubuntu, in /etc/pam.d/su:
        session required pam_limits.so
    With systemd, allow the service to lock memory in /usr/lib/systemd/system/elasticsearch.service:
        LimitMEMLOCK=infinity
    Then reload the unit files:
        sudo systemctl daemon-reload
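  13. MMAP
    Elasticsearch uses a mmapfs directory by default to store its indices. The default OS limit on mmap counts is likely to be too low, so raise it.
    On the running system:
        sysctl -w vm.max_map_count=262144
    Permanently, in /etc/sysctl.conf:
        vm.max_map_count = 262144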
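The kernel settings from the swapping and mmap slides can be checked with a short script. This is a minimal sketch, assuming a Linux host where sysctl values are exposed under /proc/sys; the names `RECOMMENDED`, `read_sysctl` and `find_problems` are hypothetical, and the thresholds are simply the values shown on the slides.

```python
from pathlib import Path

# Values taken from the slides: swappiness should be at most 1,
# max_map_count should be at least 262144.
RECOMMENDED = {
    "vm.swappiness": ("max", 1),
    "vm.max_map_count": ("min", 262144),
}


def read_sysctl(name: str, root: str = "/proc/sys") -> int:
    """Read an integer sysctl value, e.g. vm.swappiness -> /proc/sys/vm/swappiness."""
    return int(Path(root, *name.split(".")).read_text().strip())


def find_problems(current: dict) -> list:
    """Return one message per setting that misses the recommended value."""
    problems = []
    for name, (kind, limit) in RECOMMENDED.items():
        value = current[name]
        ok = value <= limit if kind == "max" else value >= limit
        if not ok:
            sign = "<=" if kind == "max" else ">="
            problems.append(f"{name}={value} (want {sign} {limit})")
    return problems


# Example with typical distribution defaults (no host access needed).
print(find_problems({"vm.swappiness": 60, "vm.max_map_count": 65530}))
```

On a real node, `find_problems({n: read_sysctl(n) for n in RECOMMENDED})` would run the same check against the live kernel values.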
  14. Operating System & File System
    Operating systems
    • Windows
    • Debian
    • Ubuntu
    • CentOS
    • RedHat
    File systems
    • Windows: NTFS
    • Linux: Ext4 if you have less than 1 TB of data, XFS for more than 1 TB
  15. Some parameters for a generic workload
    indices.memory.index_buffer_size: 40%
    indices.query.cache.enabled: false
    thread_pool.bulk.queue_size: 3000
    thread_pool.index.queue_size: 3000
    store.throttle.type: 'none'
    index.refresh_interval: "1m"
  16. Local SSD
    • Max size of one Local SSD disk = 375 GB
    • You can add up to 8 Local SSDs per instance (3 TB)
    • You can’t reboot/stop the VM
    • In case of maintenance: replace the node
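The Local SSD arithmetic on this slide (8 disks of 375 GB each, 3 TB in total) can be turned into a small sizing helper. This is a sketch under the limits stated on the slide only; the constant and function names are hypothetical, and it ignores replicas and free-space headroom.

```python
import math

LOCAL_SSD_GB = 375    # size of one Local SSD disk, from the slide
MAX_LOCAL_SSDS = 8    # disks per instance, from the slide


def local_ssds_needed(data_gb: float) -> int:
    """Number of Local SSD disks one data node needs to hold `data_gb`."""
    if data_gb <= 0:
        raise ValueError("data size must be positive")
    disks = math.ceil(data_gb / LOCAL_SSD_GB)
    if disks > MAX_LOCAL_SSDS:
        raise ValueError(
            f"{data_gb} GB exceeds one instance's limit of "
            f"{LOCAL_SSD_GB * MAX_LOCAL_SSDS} GB; spread the data over more nodes"
        )
    return disks


# Example: 1000 GB of index data on a single node.
print(local_ssds_needed(1000))
```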
  17. How many nodes
    • Master: 3 nodes
    • Ingest: 2 nodes
    • Data: 2-3 nodes (for a fresh setup)
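One way to read the "3 master nodes" recommendation is through majority voting: a master can only be elected while more than half of the master-eligible nodes are reachable. The sketch below is plain arithmetic, not an Elasticsearch API, and the function names are hypothetical.

```python
def quorum(master_eligible: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a majority remains."""
    return master_eligible - quorum(master_eligible)


# Example: compare small cluster sizes.
for n in (1, 2, 3, 5):
    print(f"{n} masters: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Under this reasoning, two master-eligible nodes tolerate no failures at all, while three is the smallest count that survives the loss of one.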
  18. Rally for the benchmark tests
    What is Rally? You want to benchmark Elasticsearch? Then Rally is for you. It can help you with the following tasks:
    • Setup and teardown of an Elasticsearch cluster for benchmarking
    • Management of benchmark data and specifications even across Elasticsearch versions
    • Running benchmarks and recording results
    • Finding performance problems by attaching so-called telemetry devices
    • Comparing performance results
    Install it with:
        pip3 install esrally
  19. How to run esrally
        esrally --track=nyc_taxis \
          --target-hosts=10.20.4.157:9200 \
          --pipeline=benchmark-only \
          --challenge=append-no-conflicts-index-only \
          --on-error=continue \
          --report-format=markdown \
          --report-file=/opt/report.md