Slide 1

Slide 1 text

Threads
Adam Dubiel

Slide 2

Slide 2 text

Being abstract is something profoundly different from being vague … The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.
Edsger W. Dijkstra

Slide 3

Slide 3 text

12 million users
700+ microservices
600+ engineers

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

JVM starts

Slide 9

Slide 9 text

VM threads come to life (~20 threads)

Slide 10

Slide 10 text

spring context boots up (+~15 threads)

Slide 11

Slide 11 text

traffic starts flowing (+~80 threads)

Slide 12

Slide 12 text

The process is born

Slide 13

Slide 13 text

systemd(1)
 |-agetty(1338)
 |-sshd(1133)---sshd(1790)---bash(1791)

Slide 14

Slide 14 text

java -version

Slide 15

Slide 15 text

systemd(1)
 |-agetty(1338)
 |-sshd(1133)-...-bash(1791)-java(1876)

Slide 16

Slide 16 text

systemd(1)
 |-agetty(1338)
 |-java(1876)

Slide 17

Slide 17 text

Process
Memory
T1, T2, ..., TN

Slide 18

Slide 18 text

Threads in JVM

Slide 19

Slide 19 text

pstree -Ap

Slide 20

Slide 20 text

systemd(1)
 |-agetty(1338)
 |-java(1876)-+-{java}(1893)
 |            |-{java}(1892)
 |            |-{java}(1896)
 |            |-{java}(1890)
 |            |-{java}(1882)
 |            |-{java}(1881)
 |            |-{java}(1885)
 |            |-{java}(1886)
 |            |-{java}(1887)
 |            |-{java}(1879)
 |            |-{java}(1880)
 |            |-{java}(1897)
 |            |-{java}(1898)
 |            |-{java}(1889)

Slide 21

Slide 21 text

JVM threads == system threads

Slide 22

Slide 22 text

pthread_t tid;
int ret = pthread_create(&tid, &attr, (void* (*)(void*)) thread_native_entry, thread);

http://hg.openjdk.java.net/jdk10/jdk10/hotspot/file/5ab7a67bc155/src/os/linux/vm/os_linux.cpp#l731

Slide 23

Slide 23 text

If JVM threads are system threads, they are:
  observable using system tools
  scheduled by the system scheduler

Slide 24

Slide 24 text

ls /proc/1876/task
1876 1893 1892 1896 1890 1882 1881 1885 1886 1887 1879 1880 1897 1898 1889
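The same directory can be read from inside the process. The following is a minimal sketch, assuming Linux with /proc mounted; the class name `OsThreads` and its `count` helper are hypothetical names made up for this illustration. It compares the number of live Java threads reported by the JVM with the number of OS threads listed for the process. The two numbers usually differ, because VM-internal threads such as GC workers are typically not reported as Java threads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

final class OsThreads {

    // Counts the entries of a /proc/<pid>/task style directory (one entry per OS thread).
    // Returns -1 when the directory cannot be read, e.g. on a non-Linux system.
    static long count(Path taskDir) {
        try (Stream<Path> entries = Files.list(taskDir)) {
            return entries.count();
        } catch (IOException | UncheckedIOException | SecurityException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Live Java threads as reported by the JVM's management interface.
        int javaThreads = ManagementFactory.getThreadMXBean().getThreadCount();
        // All OS threads of this process; "self" resolves to the current process on Linux.
        long osThreads = count(Paths.get("/proc/self/task"));
        System.out.println("Java threads: " + javaThreads);
        System.out.println("OS threads:   " + (osThreads < 0 ? "unavailable" : String.valueOf(osThreads)));
    }
}
```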

Slide 25

Slide 25 text

ps H -p <pid> -o pid,tid,pmem,pcpu,time,comm

Slide 26

Slide 26 text

top -H -p <pid>

Slide 27

Slide 27 text

ps H -p <pid> -o pid,tid,pmem,pcpu,time,comm
  pid: process id
  tid: thread id

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

 PID  TID %MEM %CPU     TIME COMMAND
8511 8511 28.4  0.0 00:00:00 java
8511 8520 28.4  0.5 00:00:00 java
8511 8521 28.4 30.2 00:00:09 java
8511 8522 28.4 30.0 00:00:09 java
8511 8523 28.4 30.2 00:00:09 java
8511 8524 28.4 30.3 00:00:10 java
8511 8525 28.4  1.9 00:00:00 java
8511 8526 28.4  0.0 00:00:00 java
8511 8527 28.4  0.0 00:00:00 java
8511 8528 28.4  0.0 00:00:00 java

Slide 32

Slide 32 text

How to match sys & JVM thread?

Slide 33

Slide 33 text

jstack <pid>

Slide 34

Slide 34 text

jcmd <pid> Thread.print

Slide 35

Slide 35 text

"pool-1-thread-10" #19 prio=5 os_prio=31 tid=0x00007fcd05868000 nid=0x6103 runnable [0x000070000ffc5000]

Slide 37

Slide 37 text

pool-1-thread-10 tid=0x00007fcd05868000 nid=0x6103
  tid: address of the JVM's internal thread structure
  nid: hex(OS thread id)

Slide 38

Slide 38 text

pool-1-thread-10 nid=0x6103

Slide 39

Slide 39 text

 PID  TID %MEM %CPU     TIME COMMAND
8511 8511 28.4  0.0 00:00:00 java
8511 8520 28.4  0.5 00:00:00 java
8511 8521 28.4 30.2 00:00:09 java
8511 8522 28.4 30.0 00:00:09 java
8511 8523 28.4 30.2 00:00:09 java
8511 8524 28.4 30.3 00:00:10 java
8511 8525 28.4  1.9 00:00:00 java
8511 8526 28.4  0.0 00:00:00 java
8511 8527 28.4  0.0 00:00:00 java
8511 8528 28.4  0.0 00:00:00 java

Slide 40

Slide 40 text

hex(8521) = 0x2149
jstack <pid> | grep -i 2149

Slide 41

Slide 41 text

jstack <pid> | grep `printf %x 8521`
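The conversion between the decimal thread id shown by ps/top and the hexadecimal nid in a thread dump can also be sketched in Java. This is an illustration only, assuming the `nid=0x...` form shown on these slides; `NidConverter` and its methods are hypothetical helper names, not part of any JDK tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NidConverter {

    private static final Pattern NID = Pattern.compile("nid=0x([0-9a-fA-F]+)");

    // Decimal OS thread id (as shown by ps/top) -> the nid=0x... form used in thread dumps.
    static String toNid(int osThreadId) {
        return "0x" + Integer.toHexString(osThreadId);
    }

    // Extracts the nid from one thread-dump header line and returns it as a decimal id.
    // Returns -1 when the line carries no hexadecimal nid field.
    static int parseNid(String dumpLine) {
        Matcher m = NID.matcher(dumpLine);
        return m.find() ? Integer.parseInt(m.group(1), 16) : -1;
    }

    public static void main(String[] args) {
        String line = "\"GC task thread#0 (ParallelGC)\" os_prio=0 "
                + "tid=0x00007fd7ec01f800 nid=0x2149 runnable";
        System.out.println(toNid(8521));    // thread id taken from the ps output
        System.out.println(parseNid(line)); // back to the decimal id
    }
}
```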

Slide 42

Slide 42 text

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fd7ec01f800 nid=0x2149 runnable hex(8521)

Slide 43

Slide 43 text

Java 8
http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/os/linux/vm/os_linux.cpp#l5038

void os::set_native_thread_name(const char *name) {
  // Not yet implemented.
  return;
}

Slide 44

Slide 44 text

Java 9+
http://hg.openjdk.java.net/jdk10/jdk10/hotspot/file/5ab7a67bc155/src/os/linux/vm/os_linux.cpp#l5038

void os::set_native_thread_name(const char *name) {
  if (Linux::_pthread_setname_np) {
    char buf [16]; // according to glibc manpage, 16 chars incl. '/0'
    snprintf(buf, sizeof(buf), "%s", name);
    buf[sizeof(buf) - 1] = '\0';
    const int rc = Linux::_pthread_setname_np(pthread_self(), buf);
    // ERANGE should not happen; all other errors should just be ignored.
    assert(rc != ERANGE, "pthread_setname_np failed");
  }
}

Slide 45

Slide 45 text

Thread name propagation
  Java 9+
  Thread.setName() is propagated to pthreads
  15 chars limit
  Java thread names visible in system tools
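A minimal sketch of what the 15-character limit means for naming threads. `NativeThreadName`, `visibleName` and the thread name used in `main` are hypothetical; the truncation is modelled on characters and therefore assumes ASCII-only names, while the native buffer is measured in bytes.

```java
final class NativeThreadName {

    // Linux keeps at most 15 characters of a thread name (16 bytes including the terminator).
    static final int MAX_VISIBLE = 15;

    // Approximates what system tools would show for a Java thread name; assumes ASCII.
    static String visibleName(String javaName) {
        return javaName.length() <= MAX_VISIBLE ? javaName : javaName.substring(0, MAX_VISIBLE);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> { }, "placeholder");
        worker.setName("order-processing-worker-1"); // hypothetical, deliberately too long
        worker.start();
        worker.join();
        System.out.println(worker.getName() + " -> " + visibleName(worker.getName()));
    }
}
```

One practical consequence: if the distinguishing part of a name (for example a worker number) sits at the end of a long name, it may be cut off in ps, top and pstree, so short prefixes are easier to work with.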

Slide 46

Slide 46 text

systemd(1)
 |-agetty(1338)
 |-java(1876)-+-{C1 CompilerThre}(1893)
 |            |-{C2 CompilerThre}(1892)
 |            |-{Common-Cleaner}(1896)
 |            |-{Finalizer}(1890)
 |            |-{G1 Conc#0}(1882)
 |            |-{G1 Main Marker}(1881)
 |            |-{G1 Refine#0}(1885)
 |            |-{G1 Refine#1}(1886)
 |            |-{G1 Young RemSet}(1887)
 |            |-{GC Thread#0}(1879)
 |            |-{GC Thread#1}(1880)
 |            |-{Service Thread}(1895)
 |            |-{XNIO-1 Accept}(1904)
 |            |-{XNIO-1 I/O-1}(1903)

Slide 47

Slide 47 text

 PID  TID %MEM %CPU     TIME COMMAND
3799 3799 15.9  0.0 00:00:00 java
3799 3800 15.9  2.5 00:00:08 java
3799 3804 15.9 10.0 00:00:00 GC Thread#0
3799 3805 15.9 10.0 00:00:00 GC Thread#1
3799 3806 15.9  0.0 00:00:00 G1 Main Marker
3799 3807 15.9  0.0 00:00:00 G1 Conc#0
3799 3808 15.9  0.0 00:00:00 G1 Refine#0
3799 3809 15.9  0.0 00:00:00 G1 Refine#1
3799 3810 15.9  0.0 00:00:00 G1 Young RemSet
3799 3811 15.9  0.0 00:00:00 VM Thread

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

Cost of a thread

Slide 50

Slide 50 text

Cost of thread
  stack memory (1MB by default): explicit
  context switches: implicit
  safepointing: implicit
  gc roots: implicit

Slide 51

Slide 51 text

Thread stack

Slide 52

Slide 52 text

Heap
Offheap: thread stacks live here

Slide 53

Slide 53 text

java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics -version

Slide 55

Slide 55 text

Thread (reserved=18548KB, committed=18548KB)
       (thread #18)
       (stack: reserved=18468KB, committed=18468KB)
       (malloc=59KB #101)
       (arena=21KB #34)

Slide 56

Slide 56 text

Thread (reserved=18548KB, committed=18548KB)
       (thread #18)                                    number of threads
       (stack: reserved=18468KB, committed=18468KB)    memory taken from the system
       (malloc=59KB #101)
       (arena=21KB #34)

18468KB / 18 ≈ 1026KB, roughly 1MB per thread

Slide 57

Slide 57 text

total app memory = heap size + thread count * stack size + ...

Slide 58

Slide 58 text

How many threads can I spawn?
RAM: Heap, Offheap

Slide 59

Slide 59 text

How many threads can I spawn?
RAM: 2GB
Heap: 512MB
Offheap

Slide 60

Slide 60 text

How many threads can I spawn?
RAM: 2GB
Heap: 512MB
Offheap: 300MB for stuff.., 1.2GB for threads

Slide 61

Slide 61 text

How many threads can I spawn?
1.2GB / 1MB ≈ 1200
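The same back-of-the-envelope estimate as a small Java sketch. `ThreadBudget` and its method names are hypothetical; the inputs are the slide figures (2GB RAM, 512MB heap, 300MB for other off-heap use, 1MB per stack), and the thread count of 115 is the sum of the approximate counts quoted earlier in the deck (20 + 15 + 80). Without the rounding to 1.2GB the division gives 1236 rather than 1200.

```java
final class ThreadBudget {

    // Upper bound on the thread count: memory left after heap and other
    // off-heap use, divided by the stack size of a single thread.
    static long maxThreads(long ramMb, long heapMb, long otherOffHeapMb, long stackMb) {
        long availableMb = ramMb - heapMb - otherOffHeapMb;
        return availableMb <= 0 ? 0 : availableMb / stackMb;
    }

    // total app memory = heap size + thread count * stack size + other off-heap use
    static long totalMemoryMb(long heapMb, long threads, long stackMb, long otherOffHeapMb) {
        return heapMb + threads * stackMb + otherOffHeapMb;
    }

    public static void main(String[] args) {
        System.out.println(maxThreads(2048, 512, 300, 1));   // slide figures, in MB
        System.out.println(totalMemoryMb(512, 115, 1, 300)); // ~115 threads from the deck
    }
}
```

This is an upper bound on paper only: it treats every stack as fully used and ignores operating system limits on the number of threads.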

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

VSZ vs RES

Slide 65

Slide 65 text

VSZ vs RES
VSZ: virtual memory size
  address space the process can access
  doesn't mean there is enough physical memory

Slide 66

Slide 66 text

VSZ vs RES
RES: resident set size
  memory actually used by the process
  the only metric showing real memory usage

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7873
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 7873
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Slide 70

Slide 70 text

Why is less more?

Slide 71

Slide 71 text

CPU CPU CPU CPU waiting..

Slide 72

Slide 72 text

Cost of thread
  stack memory (1MB by default): explicit
  context switches: implicit
  safepointing: implicit
  gc roots: implicit

Slide 73

Slide 73 text

Context switch

Slide 74

Slide 74 text

core
L1 cache: exclusive
L2 cache: exclusive
L3 cache: shared

Slide 75

Slide 75 text

core
L1 cache: <1 ns
L2 cache: ~4 ns
L3 cache: ~20-40 ns
RAM: ~100 ns

Slide 76

Slide 76 text

core RAM data flows L2 -> L1

Slide 77

Slide 77 text

core, RAM
Another thread is scheduled on the core: the cached data belonged to the previous thread and is probably useless for the new one.

Slide 78

Slide 78 text

Context switches
  up to 30 microseconds each
  observable via perf and vmstat
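Besides perf and vmstat, Linux also exposes context-switch counters as text under /proc. The sketch below reads them from Java; this is a different mechanism from the two tools named on the slide, shown only as an illustration. `ContextSwitches` and `field` are hypothetical names. The field names `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` are the ones used in /proc status files; per-thread values are expected under /proc/<pid>/task/<tid>/status.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

final class ContextSwitches {

    // Reads one numeric "key:<whitespace>value" field from /proc status text; -1 if absent.
    static long field(List<String> statusLines, String key) {
        for (String line : statusLines) {
            if (line.startsWith(key + ":")) {
                return Long.parseLong(line.substring(key.length() + 1).trim());
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status"); // Linux only
        if (!Files.isReadable(status)) {
            System.out.println("no /proc/self/status on this system");
            return;
        }
        List<String> lines = Files.readAllLines(status, StandardCharsets.UTF_8);
        System.out.println("voluntary:    " + field(lines, "voluntary_ctxt_switches"));
        System.out.println("nonvoluntary: " + field(lines, "nonvoluntary_ctxt_switches"));
    }
}
```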

Slide 79

Slide 79 text

CPU CPU CPU CPU

Slide 80

Slide 80 text

Back to the app...

Slide 81

Slide 81 text

No content

Slide 82

Slide 82 text

VM threads come to life (~20 threads)

Slide 83

Slide 83 text

G1 GC threads
VMThread
C1 & C2
PeriodicTaskThread

Slide 84

Slide 84 text

traffic starts flowing (+~80 threads)
spring context boots up (+~15 threads)

Slide 85

Slide 85 text

Managing threads

Slide 86

Slide 86 text

new Thread()
Executors.newFixedThreadPool()
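The difference between the two approaches is thread reuse: a fixed pool runs many tasks on a bounded set of threads, whereas `new Thread()` per task creates one thread per task. A minimal sketch, with `PoolReuse` and `distinctWorkers` as hypothetical names and the pool size and task count chosen arbitrarily:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PoolReuse {

    // Runs `tasks` trivial tasks on a fixed pool and returns how many
    // distinct worker threads ended up executing them.
    static int distinctWorkers(int poolSize, int tasks) {
        Set<String> workers = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> workers.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // already submitted tasks still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pool", e);
        }
        return workers.size();
    }

    public static void main(String[] args) {
        // 100 tasks, but never more worker threads than the pool size.
        System.out.println(distinctWorkers(4, 100));
    }
}
```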

Slide 87

Slide 87 text

Spring default: SimpleAsyncTaskExecutor
TaskExecutor implementation that fires up a new Thread for each task, executing it asynchronously. By default, the number of concurrent threads is unlimited.
This implementation does not reuse threads!

Slide 88

Slide 88 text

There is hope! http://my-little-pony-friendship-is-magic-rakoon1.wikia.com/wiki/Radiant_Hope

Slide 89

Slide 89 text

Thread pool

Slide 90

Slide 90 text

Thread pool what happens when it's full?

Slide 91

Slide 91 text

Thread pool how big is it?

Slide 92

Slide 92 text

Thread pool

Slide 93

Slide 93 text

new ThreadPoolExecutor(
    corePoolSize,
    maxPoolSize,
    keepAliveTime,
    keepAliveTimeUnit,
    taskQueue,
    threadFactory,
    rejectionPolicy
)

Slide 94

Slide 94 text

new ThreadPoolExecutor(
    corePoolSize,
    maxPoolSize,
    keepAliveTime,
    keepAliveTimeUnit,
    taskQueue,          // task queue implementation
    threadFactory,
    rejectionPolicy
)

Slide 95

Slide 95 text

new ThreadPoolExecutor(
    corePoolSize,
    maxPoolSize,
    keepAliveTime,
    keepAliveTimeUnit,
    taskQueue,
    threadFactory,
    rejectionPolicy     // what happens when the queue is full and the pool is at maxPoolSize
)
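A concrete instance of this constructor call might look like the sketch below. All sizes, the queue capacity and the "worker-" name prefix are illustrative assumptions rather than recommendations, and `BoundedPool` is a hypothetical class name. Note that the JDK's own parameter names differ slightly from the slide's (`maximumPoolSize`, `unit`, `workQueue`, `handler`).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedPool {

    // Builds a pool with explicit limits on threads and queued tasks.
    static ThreadPoolExecutor create() {
        AtomicInteger counter = new AtomicInteger();
        // Short names stay readable in system tools despite the 15-character limit.
        ThreadFactory namedThreads =
                runnable -> new Thread(runnable, "worker-" + counter.incrementAndGet());
        return new ThreadPoolExecutor(
                4,                                   // corePoolSize
                8,                                   // maximumPoolSize
                60, TimeUnit.SECONDS,                // keepAliveTime and its unit
                new ArrayBlockingQueue<>(100),       // bounded task queue
                namedThreads,                        // threadFactory
                new ThreadPoolExecutor.AbortPolicy() // rejection policy (the JDK default)
        );
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create();
        pool.execute(() -> System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

With a bounded queue, threads beyond `corePoolSize` are only created once the queue is full, so an unbounded queue effectively makes `maximumPoolSize` irrelevant.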

Slide 96

Slide 96 text

Rejection: Happy path Thread Pool Caller Thread

Slide 97

Slide 97 text

Rejection: Default unhappy path Thread Pool Caller Thread

Slide 98

Slide 98 text

Rejection: CallerRunsPolicy Thread Pool Caller Thread

Slide 99

Slide 99 text

Rejection: AbortPolicy Thread Pool Caller Thread RejectedExecutionException

Slide 100

Slide 100 text

ThreadPoolExecutor.DiscardPolicy Thread Pool Caller Thread /dev/null

Slide 101

Slide 101 text

ThreadPoolExecutor.DiscardOldestPolicy Thread Pool Caller Thread /dev/null
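The rejection policies can be observed with a deliberately tiny pool. In the sketch below (hypothetical class `RejectionDemo`), one worker thread is kept busy, the single queue slot is filled, and a third task is submitted so that the policy has to decide its fate. The three outcome labels are made up for this example; "discarded" only means the task neither ran on the caller nor raised an exception. DiscardOldestPolicy is left out because it would queue the third task instead of dropping it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RejectionDemo {

    // Saturates a 1-thread pool with a 1-slot queue, submits one more task
    // and reports what the given policy did with it.
    static String thirdTaskOutcome(RejectedExecutionHandler policy) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1), policy);
        String caller = Thread.currentThread().getName();
        String[] ranOn = new String[1];
        try {
            pool.execute(() -> awaitQuietly(release)); // occupies the only worker
            pool.execute(() -> { });                   // fills the only queue slot
            pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
            return caller.equals(ranOn[0]) ? "ran on caller" : "discarded";
        } catch (RejectedExecutionException e) {
            return "rejected";
        } finally {
            release.countDown(); // let the blocked worker finish
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(thirdTaskOutcome(new ThreadPoolExecutor.AbortPolicy()));
        System.out.println(thirdTaskOutcome(new ThreadPoolExecutor.CallerRunsPolicy()));
        System.out.println(thirdTaskOutcome(new ThreadPoolExecutor.DiscardPolicy()));
    }
}
```

CallerRunsPolicy is the only one of the three that slows the producer down instead of losing work, which makes it a simple form of backpressure.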

Slide 102

Slide 102 text

traffic starts flowing (+~80 threads)

Slide 103

Slide 103 text

No content

Slide 104

Slide 104 text

No content

Slide 105

Slide 105 text

once used threads are kept alive

Slide 106

Slide 106 text

ThreadPoolExecutor.prestartAllCoreThreads()
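A short sketch of the effect, assuming a pool whose core and maximum sizes are equal; `Prestart` and `poolSizes` are hypothetical names. A freshly constructed pool has no threads yet, because they are created lazily as tasks arrive, and `prestartAllCoreThreads()` creates the core threads up front so the first requests do not pay for thread creation.

```java
import java.util.Arrays;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class Prestart {

    // Returns {pool size before prestart, threads started by prestart, pool size after}.
    static int[] poolSizes(int coreSize) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            int before = pool.getPoolSize();             // no threads yet: created lazily
            int started = pool.prestartAllCoreThreads(); // create all core threads now
            int after = pool.getPoolSize();
            return new int[] {before, started, after};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(poolSizes(4)));
    }
}
```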

Slide 107

Slide 107 text

traffic starts flowing (+~80 threads)
spring context boots up (+~15 threads)

Slide 108

Slide 108 text

-Xlog:thread+os:file=/tmp/threads.log

Slide 109

Slide 109 text

Key takeaways
  JVM threads are system threads
  they can be observed using system tools
  WebFlux, because less should be more
  tune your thread pools

Slide 110

Slide 110 text

github.com/adamdubiel @dubieladam

Slide 111

Slide 111 text

No content