Maintaining performance in distributed systems

Slide 1

Slide 1 text

Maintaining performance in distributed systems
Alexander Reelsen, @spinscale

Slide 2

Slide 2 text

Agenda
Distributed Systems
Elasticsearch
Performance aspects
Hardware & Operating System
JVM & GC
Libraries & Application

Slide 3

Slide 3 text

About...
Me: Software Engineer at Elasticsearch. Interested in all things search & scale. Search Meetup Munich: http://www.meetup.com/Search-Meetup-Munich/events/218856224/
Elasticsearch: Founded in 2012. Products: Elasticsearch, Logstash, Kibana, elasticsearch for Apache Hadoop, Marvel, Shield. Professional Services: Support subscriptions, trainings.

Slide 4

Slide 4 text

Distributed systems

Slide 5

Slide 5 text

Distributed systems
Fallacies of distributed computing
1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.
by Peter Deutsch
https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing

Slide 6

Slide 6 text

Distributed systems expectations: Redundancy, Resiliency, Recovery, Scalability, Availability, ...

Slide 7

Slide 7 text

Example...
Cope with node outage
Maintenance, network split, power loss, garbage collection

Slide 8

Slide 8 text

Example: Outages
Cope with node outage
Maintenance, network split, power loss, garbage collection
Still operational
No data loss
CRUD works

Slide 9

Slide 9 text

Example: Recovery
Nodes come back
Maintenance, network failure ends...
Self healing
Shift data back
Higher load

Slide 10

Slide 10 text

Example: Recovery
Nodes come back
Maintenance, network failure ends...

Slide 12

Slide 12 text

Example: Data distribution
Scalability
Writes vs. reads

Slide 18

Slide 18 text

Example: Data distribution
Read & write scalability

Slide 20

Slide 20 text

But... performance?
...is affected by this
More coordination, more distance
Different boundaries compared to single process applications
Hard to predict/test
On many different layers: Application & Libraries, Runtime environment (JVM), OS, Hardware, Network

Slide 21

Slide 21 text

Elasticsearch Introduction

Slide 22

Slide 22 text

What is Elasticsearch?
HTTP & JSON

Slide 23

Slide 23 text

What is Elasticsearch?
HTTP & JSON, schema-less

Slide 24

Slide 24 text

What is Elasticsearch?
HTTP & JSON, schema-less, distributed

Slide 25

Slide 25 text

What is Elasticsearch?
HTTP & JSON, schema-less, distributed, document-oriented

Slide 26

Slide 26 text

What is Elasticsearch?
HTTP & JSON, schema-less, distributed, document-oriented, near-realtime

Slide 27

Slide 27 text

What is Elasticsearch?
HTTP & JSON, schema-less, distributed, document-oriented, near-realtime, search

Slide 28

Slide 28 text

What is Elasticsearch?
HTTP & JSON, schema-less, distributed, document-oriented, near-realtime, search, analytics

Slide 29

Slide 29 text

Master
Cluster always has one master
Reelection on node failure
[Diagram: Node 1]

Slide 30

Slide 30 text

Node joins
Node 2 & Node 3 ping around
[Diagram: Node 1, Node 2, Node 3]

Slide 31

Slide 31 text

Node joins
Node 2 & Node 3 join cluster
[Diagram: Node 1, Node 2, Node 3]

Slide 32

Slide 32 text

Data distribution
Index: Collection of documents
[Diagram: Node 1 holding index a]

Slide 33

Slide 33 text

Data distribution
Shards: Units of scale
[Diagram: Node 1 holding shards a0, a1, a2]

Slide 34

Slide 34 text

Data distribution
Shards: Units of scale
[Diagram: shards a0, a1, a2 spread across Node 1, Node 2, Node 3]
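
The idea of shards as units of scale can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and CRC32 stands in for whatever hash function Elasticsearch uses internally. What it shows is the general technique of hashing a routing key (by default the document ID) and taking it modulo the number of primary shards.

```python
import zlib


def shard_for(routing_key: str, number_of_primary_shards: int) -> int:
    """Map a routing key (by default the document ID) to a primary shard number."""
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Assumption: CRC32 is a stand-in for the hash Elasticsearch uses internally.
    digest = zlib.crc32(routing_key.encode("utf-8"))
    return digest % number_of_primary_shards


def distribute(doc_ids, number_of_primary_shards):
    """Group document IDs by the shard they would land on."""
    shards = {number: [] for number in range(number_of_primary_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, number_of_primary_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    for shard, docs in distribute([f"doc-{n}" for n in range(10)], 3).items():
        print(f"a{shard}: {docs}")
```

Because the result depends on the number of primary shards, changing that number would send existing documents to different shards, which is why it is chosen when the index is created.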

Slide 35

Slide 35 text

Data distribution
Primary shards, replica shards
[Diagram: primary shards a0, a1, a2 and replica shards a0, a2, a1 spread across Node 1, Node 2, Node 3]
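
A rough sketch of why replicas help with node outages: a replica is never placed on the same node as its primary, so losing one node still leaves a copy of every shard. The round-robin placement below is an assumption made for illustration, not the allocation algorithm Elasticsearch actually uses, and the function names are hypothetical.

```python
def place_shards(nodes, number_of_shards, number_of_replicas, index_name="a"):
    """Spread primaries and replicas round-robin so copies of a shard never share a node."""
    if number_of_replicas >= len(nodes):
        # Simplification for this sketch: refuse layouts that would need
        # two copies of the same shard on one node.
        raise ValueError("need more nodes than replicas per shard")
    placement = {node: [] for node in nodes}
    for shard in range(number_of_shards):
        for copy in range(number_of_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((f"{index_name}{shard}", role))
    return placement


def shards_available_without(placement, failed_node):
    """Return the shard names that still have at least one copy after a node fails."""
    return {
        shard
        for node, copies in placement.items()
        if node != failed_node
        for shard, _role in copies
    }


if __name__ == "__main__":
    layout = place_shards(["Node 1", "Node 2", "Node 3"], 3, 1)
    for node, copies in layout.items():
        print(node, copies)
    print(sorted(shards_available_without(layout, "Node 2")))
```

With three nodes, three shards and one replica, each node ends up with two shard copies, and any single node can fail without a shard becoming unavailable.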

Slide 36

Slide 36 text

Data distribution
Different scaling strategies per index
[Diagram: index a as shards a0, a1, a2 plus replicas a0, a2, a1; index b as three copies of b0; spread across Node 1, Node 2, Node 3]
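
Per-index scaling comes down to two index settings, `number_of_shards` and `number_of_replicas`. The sketch below builds the settings body for two indices that would match the picture on this slide, assuming index a has three primaries with one replica each and index b has one primary with two replicas. That reading of the diagram is an assumption, and the helper names are hypothetical.

```python
import json


def index_settings(number_of_shards, number_of_replicas):
    """Build the settings body sent when creating an index."""
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        }
    }


def total_shard_copies(body):
    """Primaries plus all their replicas, i.e. how many shards the cluster must host."""
    settings = body["settings"]
    return settings["number_of_shards"] * (1 + settings["number_of_replicas"])


# Assumed reading of the diagram: index a scales writes across three primaries,
# index b keeps one small shard but copies it to every node for reads.
index_a = index_settings(number_of_shards=3, number_of_replicas=1)
index_b = index_settings(number_of_shards=1, number_of_replicas=2)

if __name__ == "__main__":
    print(json.dumps(index_a, indent=2))
    print(json.dumps(index_b, indent=2))
```

More primaries spread indexing work over more nodes, while more replicas add copies that can serve searches.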

Slide 37

Slide 37 text

Elasticsearch can easily max out...
CPU: Indexing, searching, highlighting
I/O: Indexing, searching, merging
Memory: Aggregations, indices
Network: Relocation, snapshot & restore

Slide 38

Slide 38 text

But... performance?
Competing resources
Resizing is out of our control
Requires thorough testing & configuration

Slide 39

Slide 39 text

Hardware & operating system

Slide 40

Slide 40 text

Quiz
What is locked memory?
What is the best scheduler for SSDs?
Is TRIM supported on every FS?
What is mechanical sympathy?

Slide 41

Slide 41 text

Hardware
Bigger is better? It depends...
CPU: # cores, more parallel threads
Main memory: No limit
Disk: SAN vs. local, SSD vs. spindle
Bare metal vs. virtualization
https://speakerdeck.com/elasticsearch/life-after-ec2

Slide 42

Slide 42 text

SSDs are awesome
TRIM
Write amplification
Garbage collection
Coding for SSDs: http://codecapsule.com/2014/02/12/coding-for-ssds-part-1-introduction-and-table-of-contents/

Slide 43

Slide 43 text

Operating systems
File system descriptors, file system cache
Memlocked memory: bootstrap.mlockall: true
NUMA
http://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases
http://queue.acm.org/detail.cfm?id=2513149
Don't swap out if you need performance!
OOM killer: Just don't...
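
Both the file descriptor limit and the locked-memory limit are per-process limits set by the operating system, and memory locking can only succeed if the process is allowed to lock enough memory. A small sketch of how one might inspect both limits from Python on a Unix-like system follows. It uses the standard `resource` module; the helper names are hypothetical, and it only reads the limits rather than changing them.

```python
import resource


def limit(name):
    """Return (soft, hard) for the named rlimit, or None if the platform lacks it."""
    rlimit_id = getattr(resource, name, None)
    if rlimit_id is None:
        return None
    return resource.getrlimit(rlimit_id)


def describe(value):
    """Render a limit value, mapping the 'no limit' sentinel to a readable word."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def report():
    """Collect one line per limit that matters for a search node."""
    lines = []
    for label, name in (
        ("open files", "RLIMIT_NOFILE"),
        ("locked memory (bytes)", "RLIMIT_MEMLOCK"),
    ):
        pair = limit(name)
        if pair is None:
            lines.append(f"{label}: not available on this platform")
        else:
            soft, hard = pair
            lines.append(f"{label}: soft={describe(soft)} hard={describe(hard)}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A low soft limit for locked memory is a common reason a memory-lock request fails at startup, so it is worth checking before assuming the heap cannot be swapped out.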

Slide 44

Slide 44 text

JVM

Slide 45

Slide 45 text

Quiz
When does the JIT compiler kick in?
Are client/server JVMs different?
What’s the default thread stack size?
Is there a memory based thread limit?

Slide 46

Slide 46 text

JVM tricks
Keep the heap below 32 GB so the JVM can use compressed pointers
Serialize everything yourself
JVM versions tend to be incompatible
Use the server VM, allocate all memory on startup
Reduce thread stack size
http://rdiyewar-tech.blogspot.de/2013/02/outofmemoryerror-because-of-default.html
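
These tricks map onto a handful of standard JVM options: `-server`, `-Xms` and `-Xmx` set to the same value so the heap is allocated up front, and `-Xss` for the thread stack size. The Python helper below is a hypothetical sketch that assembles such a flag list and does the stack-size arithmetic. The concrete values in the usage example are illustrative assumptions, not recommendations from the talk.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
# The slide's rule of thumb: stay below 32 GB so compressed pointers remain usable.
COMPRESSED_POINTER_LIMIT = 32 * GIB


def jvm_flags(heap_bytes, thread_stack_kib):
    """Build JVM options with a fixed-size heap and an explicit thread stack size."""
    if heap_bytes >= COMPRESSED_POINTER_LIMIT:
        raise ValueError("heap must stay below 32 GB to keep compressed pointers")
    heap_mib = heap_bytes // MIB
    return [
        "-server",
        f"-Xms{heap_mib}m",  # initial heap == maximum heap: allocate everything on startup
        f"-Xmx{heap_mib}m",
        f"-Xss{thread_stack_kib}k",
    ]


def stack_memory_bytes(thread_count, thread_stack_kib):
    """Address space reserved for thread stacks, outside the heap."""
    return thread_count * thread_stack_kib * 1024


if __name__ == "__main__":
    print(" ".join(jvm_flags(16 * GIB, 256)))
    # 2000 threads at 1 MiB each versus 256 KiB each (illustrative numbers).
    print(stack_memory_bytes(2000, 1024) // MIB, "MiB")
    print(stack_memory_bytes(2000, 256) // MIB, "MiB")
```

The arithmetic is the point of reducing the stack size: thread stacks live outside the heap, so thousands of threads with large stacks reserve a lot of memory on top of what `-Xmx` allows.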

Slide 47

Slide 47 text

JVM Threads
The JVM is good at managing threads, but not several thousands of them
A single thread pool does not fit all
Solution: Dedicated thread pools, sized by the number of available CPUs and the complexity of their tasks
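
The dedicated-pool idea can be sketched as follows. Python's `concurrent.futures` stands in for the JVM's executors here, and the pool names and size multipliers are hypothetical, chosen only to show pools sized differently from the same CPU count.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def pool_sizes(cpus):
    """Derive one size per task type from the CPU count (multipliers are assumptions)."""
    return {
        "search": 3 * cpus,  # many short, read-only tasks
        "index": cpus,       # heavier, CPU-bound tasks
        "management": 1,     # rare housekeeping work
    }


def build_pools(cpus=None):
    """Create one bounded executor per task type instead of a single shared pool."""
    cpus = cpus or os.cpu_count() or 1
    return {
        name: ThreadPoolExecutor(max_workers=size, thread_name_prefix=name)
        for name, size in pool_sizes(cpus).items()
    }


def shutdown_pools(pools):
    """Wait for queued work to finish and release the worker threads."""
    for pool in pools.values():
        pool.shutdown(wait=True)


if __name__ == "__main__":
    pools = build_pools()
    future = pools["search"].submit(lambda: threading.current_thread().name)
    print("ran on", future.result())
    shutdown_pools(pools)
```

Keeping the pools separate means a burst of slow indexing tasks cannot take up the threads that searches need, and each pool's upper bound keeps the total thread count predictable.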