Scaling Elasticsearch: Designing for Performance and Availability

Elasticsearch has become a popular tool for distributed search, log analytics, and data visualization. What's the best way to scale out your deployment when you've reached capacity? In this presentation, we'll cover topics like shard allocation, clustering, and best practices to increase performance and stability of an Elasticsearch cluster. A good talk for those either working with existing deployments or looking to learn about the scaling capabilities of Elasticsearch.

Tyler L

May 08, 2015

Transcript

  1. scaling elasticsearch:
    designing for performance
    and availability
    OpenWest 2015
    Tyler Langlois

  2. www.elastic.co Copyright Elastic 2015 Copying, publishing and/or
    distributing without written permission is strictly prohibited
    Speaker Bio
    ● Infrastructure Engineering @ Elastic
    ○ Previous: Qualtrics, Sandia National Laboratories, Blue Coat Systems, BYU
    ● Background in systems, security, *nix,
    smattering of different coding experience
    (scripting, web dev, devops)
    ● Happy as long as I’m automating things
    in a terminal
    ● Permanent mental bindings for vim and zsh
    leothrix
    tylerjl
    tjll.net
    Introduction

  3. What We’ll Cover - Overview
    ● When & Why Scale?
    ● Elasticsearch Scalability Primer
    ● Basic Mechanisms
    ○ Node Types
    ○ Sharding
    ○ Adding Nodes
    ● Going Further
    ○ Per-host resource tuning
    ○ Optimizing Use
    ● Q&A
    Introduction

  4. When?
    ● Different for each use case
    ○ measure and tune for optimal balance between horizontal scale and performance
    ● Places to look for bottlenecking:
    ○ Memory - Resident dataset occupancy in the JVM Heap
    ○ Load - Look at hot threads for query load, indexing load, etc.
    ○ I/O - CPU iowait for indications of I/O needing to be scaled out
    ● TBH, most often going to be Java OOM errors in elasticsearch logs
    ● How to find these?
    ○ Memory pressure: curl '[elasticsearch]:9200/_nodes/stats/jvm?human&format=yaml'
    ○ Dashboards: Marvel, bigdesk, kopf, paramedic, & more
    ○ $ tail -F /var/log/elasticsearch/$clustername.log
    ● tl;dr: [before] you have performance or stability issues
    When & Why Scale?
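
    The memory pressure check above can be automated once the node stats response is parsed. The following is a minimal sketch, assuming the response has already been decoded from JSON; the helper name, the sample data, and the 75% threshold are illustrative assumptions, not values from the talk.

```python
def nodes_over_heap_threshold(stats, threshold_percent=75):
    """Return the names of nodes whose JVM heap usage is at or above the threshold.

    `stats` is a decoded /_nodes/stats/jvm response. The threshold is an
    assumed value for illustration; pick one that suits your cluster.
    """
    flagged = []
    for node in stats.get("nodes", {}).values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used >= threshold_percent:
            flagged.append(node["name"])
    return sorted(flagged)


# Hypothetical, trimmed-down response body used only for illustration.
sample = {
    "nodes": {
        "a1": {"name": "es-data-1", "jvm": {"mem": {"heap_used_percent": 62}}},
        "b2": {"name": "es-data-2", "jvm": {"mem": {"heap_used_percent": 91}}},
    }
}

print(nodes_over_heap_threshold(sample))
```

    Running a check like this on a schedule is one way to notice heap pressure before OOM errors show up in the logs.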

  5. Why?
    ● Meant to scale horizontally by design
    ● Native horizontal scaling means:
    ○ Fewer cluster state headaches (master election,
    distributed queries, etc.)
    ○ First-class sharding and replicas
    ● Cheaper clusters (scale out, not up)
    ● High availability (master election, data redundancy)
    ● Eas(y|ier) performance scaling
    ● tl;dr when you want more performance or fault tolerance
    When & Why Scale?
    Don’t stress! We’ll make horizontal scaling easy.

  6. Before Diving In
    There’s some really basic stuff to mention first that applies globally:
    ● Increase the JVM heap to 50% of available RAM, capping at 30GB or so.
    ○ [Compressed pointers]
    ● Increase file descriptor limits (set both hard and soft to 65535)
    ● bootstrap.mlockall: true (avoid swapping the ES JVM)
    ● Understand your environment and configure appropriately
    ○ Appropriate I/O scheduler, network links, cores > MHz
    ○ On AWS?
    ■ discovery.zen.ping.multicast.enabled: false
    ■ Mount volume with good iops for ES data volume (point path.data at a RAID)
    ● Choose a unique cluster.name
    ● Google for “elasticsearch pre-flight checklist”
    Elasticsearch Scalability Primer
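
    The heap rule of thumb above can be written as a small helper. This is only a sketch: recommended_heap_gb is a hypothetical name, and the 30 GB cap is the approximate figure from the slide rather than an exact limit.

```python
def recommended_heap_gb(total_ram_gb, cap_gb=30):
    """Half of the available RAM, capped at roughly 30 GB (compressed pointers)."""
    return min(total_ram_gb / 2, cap_gb)


for ram in (8, 16, 64, 128):
    print("%d GB RAM -> %.0f GB heap" % (ram, recommended_heap_gb(ram)))
```

    The other half of the RAM is left to the operating system, which Elasticsearch relies on for filesystem caching.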

  7. Sharding
    ● Each index in Elasticsearch is composed of 1 to n shards
    ● Each shard can be backed by any number of replicas
    ● Documents are assigned to shards according to: shard = hash(routing) % number_of_primary_shards, where routing usually equals the document’s internal _id (customizable)
    ● Replicas are simply copies of primaries
    ● Documents are written to primaries, which get replicated to replicas. Reads can occur on either.
    Basic Mechanisms
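
    The routing formula can be tried out with a few lines of code. This is a sketch under a stated assumption: zlib.crc32 stands in for the hash, since Elasticsearch uses its own internal hash function, and shard_for is a hypothetical helper name.

```python
import zlib


def shard_for(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards.

    zlib.crc32 is only a stand-in here; Elasticsearch uses its own hash.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = ["doc-%d" % i for i in range(1000)]

# Count how many documents would be routed elsewhere if an index
# went from 3 primary shards to 4.
moved = sum(1 for d in doc_ids if shard_for(d, 3) != shard_for(d, 4))
print("%d of %d documents would land on a different shard" % (moved, len(doc_ids)))
```

    Because the shard is derived from the number of primaries, changing that number would send existing documents to the wrong place, which is why the primary shard count is fixed once an index is created.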

  8. Indexes & Shards - Primaries
    ● Indexes broken into settings.number_of_shards shards (default 5)
    ● Example with n = 3
    ● P = Primary shard
    ● Primary shards can serve both reads and writes
    ● Elasticsearch handles all of this under the covers
    Elasticsearch Scalability Primer

  9. Indexes & Shards - Replicas
    ● Shards replicated into settings.number_of_replicas shards (default 1)
    ● Example with three shards and one replica
    ● R = Replica shard
    ● Replicas can serve reads; they’re not just idle hot standbys
    ● number_of_replicas can be changed at any time
    Elasticsearch Scalability Primer

  10. Cluster State
    ● Increase replica count for better redundancy and read balancing
    ● Example with three shards, two replicas, and three nodes
    Notes:
    ● Elasticsearch inherently knows to place replicas apart from other copies of the same shard (replicas or primaries)
    ● Moving replicas and primaries is handled automatically (set shards and replicas, add needed number of nodes, and forget [kinda])
    Elasticsearch Scalability Primer
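
    The three shards, two replicas, three nodes example can be sketched as a toy placement routine. This is not Elasticsearch's allocator, only an illustration of the rule that copies of the same shard never share a node; allocate and the node names are hypothetical. A real cluster leaves surplus replicas unassigned instead of raising an error.

```python
def allocate(primaries, replicas, nodes):
    """Place every copy of each shard on a different node (toy round-robin)."""
    copies = 1 + replicas
    if copies > len(nodes):
        # A real cluster would leave the extra replicas unassigned instead.
        raise ValueError("not enough nodes to keep copies of a shard apart")
    layout = {node: [] for node in nodes}
    for shard in range(primaries):
        for copy in range(copies):
            node = nodes[(shard + copy) % len(nodes)]
            kind = "P" if copy == 0 else "R"
            layout[node].append("%s%d" % (kind, shard))
    return layout


layout = allocate(primaries=3, replicas=2, nodes=["node-1", "node-2", "node-3"])
for node, shards in sorted(layout.items()):
    print(node, shards)
```

    With two replicas per primary, each of the three nodes ends up holding one copy of every shard, so any single node can be lost without losing data.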

  11. Minor Caveats
    This process is magical and wonderful; however, there are things you should be aware of:
    ● After settings.number_of_shards is set, it’s permanent for that index
    ○ This seems like a biggie at first, but either re-indexing or naturally rotating indices (daily logs, etc.) mitigates this
    ○ number_of_replicas can be changed at any time (factor in needed time to copy shard data over)
    ● Need to take some minor steps to avoid split brain, which is not a good time
    ○ In our example 3-node cluster, setting discovery.zen.minimum_master_nodes = 2 works
    ○ Varies depending on cluster architecture (will cover more on that later)
    Elasticsearch Scalability Primer
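
    The value of 2 for the three-node example follows the usual quorum rule: a majority of the master-eligible nodes. A minimal sketch, where the function name is a hypothetical helper and not an Elasticsearch API:

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum: more than half of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


for count in (1, 2, 3, 5):
    print("%d master-eligible -> minimum_master_nodes = %d"
          % (count, minimum_master_nodes(count)))
```

    Only master-eligible nodes count toward the quorum, which is why the right value depends on the cluster architecture.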

  12. Sharding Demo
    Pray to the demo gods, for the terminal is dark and full of terrors
    Basic Mechanisms

  13. Sharding
    Provides easy:
    ● Scaling
    ○ Overallocate shards to begin with (nodes can have multiple primaries
    allocated) and as more nodes are added, ES naturally rebalances and
    distributes load across shards
    ● High Availability
    ○ Replicas take over for lost primaries as soon as they’re lost
    ○ Change # of replicas at any time
    ○ If a node doesn’t return, replicas that don’t exist (unallocated) get allocated
    by being copied over to other nodes, restoring a good cluster state
    Sharding is the basic mechanism for horizontal scaling; use it.
    Basic Mechanisms

  14. Sharding - con’t
    Strategies
    ● number_of_shards
    ○ Temptation is to create 1,000 shards and never worry about needing to fool
    with number of shards again
    ○ Don’t do it!
    ○ Shards have overhead, need to find a balance
    ○ Test rates with a single node; extrapolate
    ● Things to remember:
    ○ Reads/searches can be served by primaries or replicas — leverage for
    easier search scaling
    ○ ES will allocate shards for optimal distribution; a cluster in a yellow or red state indicates that some replica (yellow) or primary (red) shards are not allocated
    Basic Mechanisms
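
    As a rough mental model of the health colors mentioned above: green means every shard copy is allocated, yellow means at least one replica is not, and red means at least one primary is not. The helper below is an illustrative sketch, not an Elasticsearch API.

```python
def cluster_health(unassigned_primaries, unassigned_replicas):
    """Map counts of unassigned shards to a health color."""
    if unassigned_primaries > 0:
        return "red"      # some data is unavailable
    if unassigned_replicas > 0:
        return "yellow"   # all data is available, but redundancy is reduced
    return "green"


print(cluster_health(0, 0))
print(cluster_health(0, 3))
print(cluster_health(1, 3))
```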

  15. Sharding - Allocation
    You can trust Elasticsearch to make smart decisions when allocating shards. Things to be aware of:
    ● Arbitrary node tags can be defined and considered when routing:
    ○ node.aisle: east
    ○ cluster.routing.allocation.awareness.attributes: aisle
    ● Disk space magic
    ○ By default: cluster.routing.allocation.disk.threshold_enabled: true
    ○ Note that filling disks will start making elasticsearch nervous for you
    Basic Mechanisms
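
    The goal of allocation awareness can be illustrated with a small check: copies of the same shard should span more than one value of the awareness attribute. This sketch only verifies a layout; it is not how Elasticsearch allocates. The node names, the layout, and the helper name are hypothetical.

```python
def unbalanced_shards(copies, node_attrs, attribute="aisle"):
    """Return shard ids whose copies all sit behind a single attribute value."""
    result = []
    for shard, nodes in sorted(copies.items()):
        values = {node_attrs[node][attribute] for node in nodes}
        if len(nodes) > 1 and len(values) < 2:
            result.append(shard)
    return result


# Hypothetical cluster: two nodes in the east aisle, one in the west.
node_attrs = {
    "node-1": {"aisle": "east"},
    "node-2": {"aisle": "east"},
    "node-3": {"aisle": "west"},
}
# Shard id -> nodes holding a copy (primary or replica).
copies = {0: ["node-1", "node-3"], 1: ["node-1", "node-2"]}

print(unbalanced_shards(copies, node_attrs))
```

    In this layout, shard 1 has both copies in the east aisle, so losing that aisle would take the shard offline.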

  16. Why and How Would One Scale?
    ● Our three-node example cluster has some failure cases:
    ○ Load on datanodes - if we start heavily querying on datanodes and we lose the heartbeat from a master, things get confused
    ○ If we start putting too much data into the cluster, we can exhaust RAM and
    get OOM errors
    ○ Data at rest - again, too much data may either 1) exhaust filesystem space
    or 2) demand too many I/O operations from hardware
    We’ll address all these!
    Elasticsearch Scalability Primer

  17. Node Types
    Cluster Design… made easy!
    ● node.data: true
    ○ Indicates this node should allow shards to be allocated to it,
    thus holding data
    ● node.data: false
    ○ Remain a member of the cluster, but do not accept primaries or
    replicas
    ● node.master: true
    ○ Node should be master-eligible (cluster will only have one
    active master at a time, but have multiple eligible nodes)
    ● node.master: false
    ○ Do not enter pool of master-eligible nodes, and thus will never
    have possibility of becoming a master
    Master Nodes
    Data nodes are self-explanatory; what do master nodes do?
    ● Coordinate cluster operations
    ● Index creation, shard allocation, etc.
    ● Cluster metadata (fields, etc.)
    Basic Mechanisms

  18. Node Types
    This means we can design each node with specific purposes with node.data and node.master
    ● node.master: true, node.data: true
    ○ Multi-purpose node: can act as a master and hold data as well.
    ○ Example use case would be our simple three-node cluster.
    ● node.master: false, node.data: true
    ○ Dedicated data node.
    ○ Will never serve as a master, only allocating data to its elasticsearch process.
    ● node.master: true, node.data: false
    ○ Dedicated master node.
    ○ Will only serve as a master/master-eligible node and not allow shards to be allocated to itself.
    ● node.master: false, node.data: false
    ○ Client node.
    ○ Holds no data or master responsibilities, but can query cluster, send data (serve as indexer), etc.
    Basic Mechanisms
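
    The four combinations of node.master and node.data can be summarized as a lookup table. A small sketch, where node_role is a hypothetical helper and the role labels are the ones used on this slide:

```python
def node_role(master, data):
    """Name the role implied by a node's node.master / node.data settings."""
    roles = {
        (True, True): "multi-purpose",
        (False, True): "dedicated data",
        (True, False): "dedicated master",
        (False, False): "client",
    }
    return roles[(master, data)]


for master in (True, False):
    for data in (True, False):
        print("node.master=%s node.data=%s -> %s"
              % (master, data, node_role(master, data)))
```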

  19. Node Types - con’t
    Why split node responsibilities?
    Separation of concerns for elasticsearch processes.
    ● Work-heavy nodes (data nodes, for example) can drop out of the cluster due to stop-the-world garbage collection events. Dedicated masters keep strenuous load to a minimum to ensure availability.
    ● Dedicated client nodes can be used for misc. purposes (searching, indexing, etc.) that do not impact data processing or master functions
    ● Spend money on beefy nodes (data) and let more, smaller nodes worry about cluster operations
    Basic Mechanisms

  20. Adding Nodes
    Multicast vs. Unicast
    ● Multicast (zen)
    ○ If you’re self-hosting and in a protected subnet, it’s up to you
    ○ Easy cluster formation, re-discovery, etc.
    ○ Not always best in production (random joining, broadcasts, etc.)
    ● Unicast
    ○ Needs >=1 discovery IP
    ○ After cluster is discovered, gossip takes care of the rest
    ○ Needs config, but less prone to surprises
    ○ Only option for some hosting types (DO, Linode, etc.) — EC2 has plugin to
    manage discovery if that’s your jive
    Easy!
    Basic Mechanisms

  21. Resource Tuning
    Some notes:
    ● The defaults are defaults for a reason
    ● Don’t go turning knobs until you’ve covered the basics that handle the majority of cases
    ● Stuff like JVM tuning is probably overkill (use a supported JVM and run with it)
    ● ES trusts the OS with lots of tasks, so take the time to set proper descriptor
    limits/heap size/etc.
    ● Due to responsibilities ES gives to OS, tuning there can have big effect
    Going Further

  22. Resource Tuning - I/O
    Under the covers, Lucene is talking to disk a lot (immutable structures). How to
    improve in cases of I/O thrashing?
    ● RAID
    ○ Hardware-level — RAID up EC2 disks and present it to path.data
    ○ Software-level — Give ES “path.data: /dev/sdb,/dev/sdc” and let it
    stripe
    ■ Be careful of striping (newer versions of ES stripe even more safely)
    ● I/O Scheduling
    ○ SSD — ec2: noop, everything else: deadline
    ○ Rotating media — leave at cfq
    ● When bulk indexing in a batch, set index.refresh_interval to -1 (revert later)
    Going Further
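
    The refresh tip amounts to two updates of the index settings: one before the bulk load and one after. The sketch below only builds the JSON bodies that would be sent with a PUT to the index's _settings endpoint; the helper name is hypothetical, and the restore value of 1s assumes the index was using the default interval.

```python
import json


def refresh_settings_body(interval):
    """Build the settings body that changes an index's refresh interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


# Sent before the bulk load: stop periodic refreshes.
disable = refresh_settings_body("-1")
# Sent after the bulk load: restore refreshes (assumes the default of 1s).
restore = refresh_settings_body("1s")

print(disable)
print(restore)
```

    Forgetting the second step leaves newly indexed documents invisible to searches, hence the reminder to revert.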

  23. Optimizing Use
    Sometimes it’s a better idea to tune usage patterns rather than ramp up
    horsepower. (we’ll go into each)
    ● Query smarter
    ○ filters — bitsets, excluding docs, etc.
    ● Managing the Data Lifecycle
    ○ Time-series indices with automatic data expiration
    ○ optimize, close, snapshot, delete
    ● Doc Values
    ○ Perfect for things that don’t need analysis (log levels, IPs, anything .raw)
    ○ Trade disk space for RAM
    Going Further

  24. Optimizing Use - Query Smarter
    If you know how to reduce your in-memory dataset, you can get much more mileage
    out of Elasticsearch
    ● Filters
    ○ Define most-specific filters first to drastically reduce eligible results
    ○ Stuff like Kibana makes this easy (point and click)
    ● Bitsets
    ○ ES intelligently caches filters for later re-use
    ○ Dead simple data structure easily re-queried (docs 1, 2, 4 match a filter —
    [1, 1, 0, 1] stored in memory to re-use)
    ● With well-designed doc schemas, sometimes don’t need fulltext search for
    some use cases (queries for logs from specific hosts/processes, etc.)
    Going Further
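
    The bitset idea is easy to show in a few lines. This is a toy model using plain lists and 1-based document ids to match the example on the slide; the filter names are made up, and the real cached structures are more compact.

```python
def filter_bitset(num_docs, matching_doc_ids):
    """One bit per document: 1 if the document matches the filter, else 0."""
    matches = set(matching_doc_ids)
    return [1 if doc_id in matches else 0 for doc_id in range(1, num_docs + 1)]


def intersect(a, b):
    """Combine two cached filters with a bitwise AND."""
    return [x & y for x, y in zip(a, b)]


host_filter = filter_bitset(4, [1, 2, 4])    # docs 1, 2, 4 match
level_filter = filter_bitset(4, [2, 3, 4])   # hypothetical second filter

print(host_filter)                           # [1, 1, 0, 1]
print(intersect(host_filter, level_filter))  # [0, 1, 0, 1]
```

    Once a filter's bitset is cached, combining it with other filters is a cheap bitwise operation rather than a fresh pass over the documents.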

  25. Optimizing Use - Managing the Data Lifecycle
    Don’t need to keep time-sensitive data forever — retire it!
    ● (daily|weekly|monthly) indices, optimize, close, snapshot, delete
    ○ Logstash/fluentd can write to time-specific indices
    ○ Once these no longer receive writes, optimize them for maximum efficiency
    ○ Close them once they’re outside query window
    ○ Snapshot them to S3 eventually for long-term storage, delete if disk space
    needed
    ○ Rotate bucket objects into Glacier if you’re a ninja
    ● Curator is perfect for these types of tasks
    ● Alternatively, use expiring docs [with caution] (more expensive to detect/use)
    Going Further
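
    The lifecycle above can be sketched as a function that maps a daily index name to the next action based on its age. This is an illustration only: the age thresholds are assumed values, the function name is hypothetical, and the logstash-YYYY.MM.DD naming is Logstash's default daily pattern. It decides what to do but does not call Elasticsearch.

```python
from datetime import date, datetime


def lifecycle_action(index_name, today, optimize_after=1, close_after=30,
                     delete_after=90, prefix="logstash-"):
    """Pick the lifecycle step for a daily index (thresholds are in days)."""
    if not index_name.startswith(prefix):
        raise ValueError("unexpected index name: %s" % index_name)
    day = datetime.strptime(index_name[len(prefix):], "%Y.%m.%d").date()
    age = (today - day).days
    if age >= delete_after:
        return "snapshot, then delete"
    if age >= close_after:
        return "close"
    if age >= optimize_after:
        return "optimize"
    return "keep open for writes"


today = date(2015, 5, 8)
for name in ("logstash-2015.05.08", "logstash-2015.05.07",
             "logstash-2015.04.01", "logstash-2015.01.01"):
    print(name, "->", lifecycle_action(name, today))
```

    A tool like Curator applies the same kind of age-based rules, so in practice a sketch like this is mostly useful for reasoning about retention windows.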

  26. Optimizing Use - doc_values
    Specific to some use cases, but extremely useful
    ● Normally tokenized terms are analyzed and added to Lucene segments in that analyzed format
    ● This means a facet/aggregation (terms panel in Kibana 3 terminology) has to bring every possible value for a field into consideration when determining common values
    ● High cardinality for an analyzed field = OOM the JVM
    ● doc_values can alleviate memory pressure
    ○ Write out discrete values for non-analyzed strings, numeric types, geo points to disk on index
    ○ When performing aggregations, just look at the values instead of pulling all possible values into memory and calculating (I/O vs. memory)
    Going Further
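
    The difference between the two layouts can be shown with a toy model. The inverted index maps a term to the documents containing it, while an aggregation needs the opposite view: a document's value. Without doc_values that view is built in heap at query time; with doc_values it is written to disk at index time. The data and helper names below are made up for illustration.

```python
from collections import Counter

# Toy inverted index for a log level field: term -> ids of matching docs.
inverted_index = {"error": [2, 5], "info": [1, 3, 4], "warn": [6]}


def uninvert(inverted):
    """Build the doc -> value view that an aggregation needs."""
    column = {}
    for term, doc_ids in inverted.items():
        for doc_id in doc_ids:
            column[doc_id] = term
    return column


def terms_aggregation(column, matching_docs):
    """Count values only for the documents that matched the query."""
    return Counter(column[doc_id] for doc_id in matching_docs)


doc_values = uninvert(inverted_index)
print(terms_aggregation(doc_values, [1, 2, 3, 5]))
```

    With the column already available, the aggregation only reads values for the matching documents instead of loading every term for the field into memory.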

  27. Questions?
    Q&A

  28. Information
    ● Elasticsearch documentation
    ○ www.elastic.co/guide
    ○ Elasticsearch - The Definitive Guide - for in-depth learning
    ○ Official documentation, API docs, etc.
    ○ Client library docs (javascript, ruby, python, java, php)
    ● Get involved in the ES community
    ○ www.elastic.co/community/meetups
    ○ SLC Meetup!
    ● Give feedback at: https://joind.in/talk/view/13968
    Additional Resources
