Voyeurs in the JVM land

Jaroslaw Palka

March 18, 2018

  1. Voyeurs in the JVM land

  2. about me: Jarek Pałka. Allegro.tech, doing stuff, back to coding, hell yeah!!! JDD, 4Developers and one more conference (still under development), where I serve as dictator for life. JVM, bytecode, parsers, graphs and other cool things (like ponies). Owner at Symentis trainings; former chief architect, development manager, head of development, backend developer and performance guy
  3. You are all invited!

  4. agenda: JDK with batteries included; JVM logging and tracking; Linux tools for curious; other tools for weirdos
  5. JDK with batteries included: jps, jmap, jstack, jstat, jcmd

  6. how it works: JVM stores metrics in memory-mapped files

  7. test: lsof +d /tmp/hsperfdata_jarek
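
The same directory can be inspected from Java. The sketch below is only an illustration, assuming the HotSpot default location on Linux (`/tmp/hsperfdata_<user>`), where each file is named after the pid of a running JVM; the class and method names (`HsperfdataLister`, `listJvmPids`) are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class HsperfdataLister {

    // Returns the entry names found in the given hsperfdata directory.
    // On HotSpot each entry is named after the pid of a running JVM.
    static List<String> listJvmPids(Path dir) {
        List<String> pids = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return pids; // directory missing, e.g. perf data disabled
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                pids.add(entry.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pids;
    }

    public static void main(String[] args) {
        // Assumption: Linux + HotSpot defaults, as on the slide
        Path dir = Paths.get("/tmp", "hsperfdata_" + System.getProperty("user.name"));
        System.out.println(dir + " -> " + listJvmPids(dir));
    }
}
```

Run with no other JVMs started, the list should still contain at least the pid of the JVM running this class.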

  8. jps: lists all running JVM processes

  9. jstack: dumps stacks of all JVM threads (in a selected process)

    jstack -l [pid] # to include locks info
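
A similar dump can be produced from inside a process with the standard `java.lang.management` API. This is a rough sketch, not what jstack itself does internally; the class name `InProcessThreadDump` and the output format are invented for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class InProcessThreadDump {

    // Builds a jstack-like text dump of all live threads in the current JVM.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Request lock details only where the JVM supports them
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The lock this thread is blocked or waiting on, if any
                out.append(" on ").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```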
  10. jmap: prints heap information or a histogram, or dumps heap content to a file
  11. jmap -heap [pid] # to print heap usage
    jmap -histo [pid] # to print histogram
    jmap -dump:file=jvm.dump [pid] # to dump heap
  12. jstat: samples a running JVM for selected metrics

    jstat -gc [pid]

  13. jcmd

  14. one tool to rule them all, a one-stop shop for all commands available in the JVM
  15. let’s play with it: jcmd [pid] help

  16. JVM logging and tracking

  17. JVM has tons of diagnostic options

  18. garbage collection

  19. jstat -gc [pid] [interval] or

  20. -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintClassHistogramAfterFullGC -XX:+PrintClassHistogramBeforeFullGC -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10M
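
For a quick look without GC logs, collector counters are also exposed in-process through `GarbageCollectorMXBean`. The sketch below only reads cumulative counts and times, far less than the flags above record; the names `GcCounters`, `totalCollections` and `report` are illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounters {

    // Sums collection counts over all collectors of the current JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when undefined for a collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // One line per collector: name, number of collections, accumulated time.
    static String report() {
        StringBuilder out = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append(gc.getName())
               .append(": collections=").append(gc.getCollectionCount())
               .append(", timeMs=").append(gc.getCollectionTime())
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
        System.gc(); // only a hint; the JVM may ignore it
        System.out.print(report());
    }
}
```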

  21. safepoint

  22. what? “Imagine if you will a JVM full of mutator threads, all busy, sweating, mutating the heap. Some of them have <gasp> shared mutable state. They’re mutating each others state, concurrently, like animals. Some stand in corners mutating their own state (go blind they will). Suddenly a neon sign flashes the word PINEAPPLES. One by one the mutators stop their rampant heap romping and wait, sweat dripping. When the last mutator stops, a bunch of elves come in, empty the ashtrays, fill up all the drinks, mop up the puddles, and quickly as they can they vanish back to the north pole. The sign is turned off and the threads go back to it” (Nitsan Wakart)
  23. “At a safepoint the mutator thread is at a known and well defined point in its interaction with the heap. This means that all the references on the stack are mapped (at known locations) and the JVM can account for all of them. As long as the thread remains at a safepoint we can safely manipulate the heap + stack such that the thread’s view of the world remains consistent when it leaves the safepoint.” (Nitsan Wakart)
  24. -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 (see: Debugging JVM safepoint pauses)

  25. just in time compilation

  26. -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining
    -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=mylogfile.log -XX:+PrintAssembly

  27. TLAB

  28. what? “A Thread Local Allocation Buffer (TLAB) is a region of Eden that is used for allocation by a single thread. It enables a thread to do object allocation using thread local top and limit pointers, which is faster than doing an atomic operation on a top pointer that is shared across threads. A thread acquires a TLAB at its first object allocation after a GC scavenge. The size of the TLAB is computed via a somewhat complex process described below. The TLAB is released when it is full (or nearly so), or the next GC scavenge occurs. TLABs are allocated only in Eden, never from From-Space or the OldGen.” (Ross K)
  29. should I care? you want as many allocations as possible to happen in TLABs, period
  30. -XX:+PrintTLAB (see: The Real Thing)

  31. native memory tracking

  32. Stack Overflow: Java process taking more memory than its max heap size
  33. java -XX:NativeMemoryTracking=[off|summary|detail]
    jcmd [pid] VM.native_memory summary
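
Part of the memory outside the heap is also visible through standard MXBeans. The sketch below is not a replacement for native memory tracking: it only shows heap, non-heap and NIO buffer pool usage, and assumes the HotSpot buffer pool name `direct`. The class name `MemoryOutsideHeap` is made up for the example.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.nio.ByteBuffer;

class MemoryOutsideHeap {

    // Bytes currently held by direct NIO buffers (pool named "direct" on HotSpot).
    static long directBufferBytes() {
        long total = 0;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName()) && pool.getMemoryUsed() > 0) {
                total += pool.getMemoryUsed();
            }
        }
        return total;
    }

    // Heap vs. non-heap usage as seen by the JVM, plus the NIO buffer pools.
    static String report() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        StringBuilder out = new StringBuilder();
        out.append("heap used:     ").append(memory.getHeapMemoryUsage().getUsed()).append('\n');
        out.append("non-heap used: ").append(memory.getNonHeapMemoryUsage().getUsed()).append('\n');
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            out.append(pool.getName()).append(" buffers: ")
               .append(pool.getMemoryUsed()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Direct buffers live outside the Java heap, so -Xmx does not bound them
        ByteBuffer offHeap = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        System.out.print(report());
        System.out.println("direct buffer capacity: " + offHeap.capacity());
    }
}
```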

  34. a weapon of mass destruction

  35. or pair made in heaven

  36. FlightRecorder: “Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data about a running Java application. It is integrated into the Java Virtual Machine (JVM) and causes almost no performance overhead, so it can be used even in heavily loaded production environments.” (Oracle Help Center)
  37. java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:StartFlightRecording=duration=60s,filename=myrecording.jfr

  38. warning: as of now, you can’t use it to analyze production systems
  39. until JDK 10 comes out, this is the official statement

  40. java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder
    jcmd [pid] JFR.start name=recording
    jcmd [pid] JFR.start name=recording filename=recording.jfr
  41. Java Mission Control

  42. Linux tools for curious: sysstat, sysdig, perf

  43. sysstat

    pidstat -t -d -p [pid] 1 # IO usage per thread
    pidstat -t -w -p [pid] 1 # task switching per thread
    pidstat -r -p [pid] 1 # page faults per process
  44. warning: forget about strace, the ptrace syscall is not what you want :)
  45. tracing syscalls

  46. sysdig

    sysdig proc.pid=[pid] -w [pid].scap # record events
    csysdig -r [pid].scap # analyze
  47. perf

    perf record -p [pid] -o [pid].perf # record events
    perf report -i [pid].perf # analyze
  48. tools for weirdos: honest profiler, flamegraphs

  49. honest profiler: it uses the unofficial JVM API call AsyncGetCallTrace, as opposed to other profilers, which use JVMTI (JVM Tool Interface)
  50. here goes a long, boring discussion about the complexity of the OpenJDK global safepoint mechanism
  51. “It accurately profiles applications, avoiding an inherent bias towards places that have safepoints. It profiles applications with significantly lower overhead than traditional profiling techniques, making it suitable for use in production.” (Honest profiler wiki)
  52. The Pros and Cons of AGCT

  53. java -agentpath:../honest-profiler/liblagent.so=logPath=honest.logs Main

  54. tools I didn’t mention: GCViewer, JITWatch, PrintAssembly, Solaris Studio, Censum, Memory Analyzer Tool
  55. Q&A

  56. links: JVM Anatomy Park; Nitsan’s blog; Chris Newland’s blog (JITWatch author); Marcus Hirt’s blog (all stuff JMC); System calls in the Linux kernel; sysdig; perf: Linux profiling with performance counters
  57. Java Microbenchmark Harness

  58. “Make it correct, make it clear, make it concise, make it fast. In that order.” (Wes Dyer)
  59. “JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM.” (JMH wiki)
  60. mvn archetype:generate \
    -DinteractiveMode=false \
    -DarchetypeGroupId=org.openjdk.jmh \
    -DarchetypeArtifactId=jmh-java-benchmark-archetype \
    -DgroupId=org.sample \
    -DartifactId=test \
    -Dversion=1.0

    http://openjdk.java.net/projects/code-tools/jmh/
  61. benchmarks: these are public non-static methods annotated with @Benchmark

    import org.openjdk.jmh.annotations.Benchmark;

    public class CodeBenchmark {
        @Benchmark
        public void testMethod() {
        }
    }
  62. managing state & life cycle: more complex examples will need to work with some data (state); this is what state objects are for
  63. @State(Scope.Benchmark)
    public class CodeBenchmarkState {
        public final ArrayList<Integer> list = new ArrayList<>();
    }
  64. public class CodeBenchmark {
        @Benchmark
        public void testMethod(CodeBenchmarkState state) {
            state.list.add(0); // the list lives on the injected state object
        }
    }

  65. note on scopes

  66. Scope.Benchmark: with benchmark scope, all instances of the same type will be shared across all worker threads
  67. Scope.Group: with group scope, all instances of the same type will be shared across all threads within the same group. Each thread group will be supplied with its own state object
  68. Scope.Thread: with thread scope, all instances of the same type are distinct, even if multiple state objects are injected in the same benchmark
  69. lifecycle: every state object can have @Setup and @TearDown fixtures

  70. time for first benchmark: let’s compare iteration speed over a primitive array, ArrayList and LinkedList
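
The workloads for such a comparison might look like the plain-Java sketch below. It deliberately contains no JMH code: each `sum` method is a candidate body for a `@Benchmark` method, with the data built once in a state object. The class and method names (`IterationWorkloads`, `primitiveArray`, `fill`, `sum`) are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class IterationWorkloads {

    // Builds the primitive array 0..size-1.
    static int[] primitiveArray(int size) {
        int[] values = new int[size];
        for (int i = 0; i < size; i++) {
            values[i] = i;
        }
        return values;
    }

    // Fills the given list (ArrayList, LinkedList, ...) with 0..size-1.
    static List<Integer> fill(List<Integer> target, int size) {
        for (int i = 0; i < size; i++) {
            target.add(i);
        }
        return target;
    }

    // Iterates over a primitive array; the result is returned so it cannot be optimized away.
    static long sum(int[] values) {
        long total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Iterates over a boxed list through its iterator; same contract as above.
    static long sum(List<Integer> values) {
        long total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int size = 1_000;
        System.out.println(sum(primitiveArray(size)));
        System.out.println(sum(fill(new ArrayList<>(), size)));
        System.out.println(sum(fill(new LinkedList<>(), size)));
    }
}
```

Returning the sum from each method matters once these become benchmarks: a returned value is consumed by the harness, so the loop is not a candidate for dead code elimination.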
  71. running benchmarks

    mvn package
    java -jar target/benchmarks.jar

  72. forks, warm-ups and iterations: by default JMH forks the JVM for each run of a benchmark; within each fork you have two phases, warm-up and iteration; the number of repetitions of each phase can be controlled from the command line
  73. command line

    -f - number of forks
    -wi - number of warm-ups
    -i - number of iterations
  74. java -jar target/benchmarks.jar -f 1 -i 5 -wi 5

  75. parameterized tests: JMH supports parameterized tests through the @Param annotation. Test parameters should be public non-final fields on state objects; they are injected right before the call to setup fixture methods
  76. @State(Scope.Benchmark)
    public class CodeBenchmark {
        @Param({"0.1", "0.2", "0.5", "0.75", "1.0"})
        public float loadFactor;

        private Map<String, String> map;

        @Setup
        public void setUp() {
            map = new HashMap<>(16, loadFactor);
        }
    }
  77. controlling parameters: you can override the values of the parameters with command line options

    java -jar target/benchmarks.jar -p loadFactor=0.8,0.9
  78. dead code

  79. … and black holes

  80. one of the dangers JMH tries to mitigate is dead code elimination by the JIT; to avoid it, consume return values from functions with black holes

    @Benchmark
    public void testMethod(Blackhole blackhole) {
        blackhole.consume(codeBenchmark());
    }
  81. asymmetric tests

  82. sometimes you want to benchmark your concurrent code, like the performance of read and write paths; this is where @Group and @GroupThreads come in
  83. @State(Scope.Benchmark)
    public class CodeBenchmark {
        @Benchmark
        @Group("benchmarkGroup")
        @GroupThreads(1)
        public void testWrites() {
        }

        @Benchmark
        @Group("benchmarkGroup")
        @GroupThreads(1)
        public void testReads(Blackhole blackhole) {
        }
    }
  84. time for third benchmark: compare performance of various thread-safe counter implementations

    public class Counter {
        private long counter;

        public void inc() {
            ++counter;
        }

        public long counter() {
            return counter;
        }
    }
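
Candidate implementations for that comparison could look like the sketch below. It uses only the JDK (`synchronized`, `AtomicLong`, `LongAdder`); the `hammer` helper checks correctness under contention and is not a benchmark. All names here are illustrative, not taken from the talk.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class Counters {

    interface Counter {
        void inc();
        long counter();
    }

    // Mutual exclusion through the object's monitor
    static final class SynchronizedCounter implements Counter {
        private long counter;
        public synchronized void inc() { ++counter; }
        public synchronized long counter() { return counter; }
    }

    // Lock-free increments on a single shared cell
    static final class AtomicCounter implements Counter {
        private final AtomicLong counter = new AtomicLong();
        public void inc() { counter.incrementAndGet(); }
        public long counter() { return counter.get(); }
    }

    // Spreads contended updates over several cells, summed on read
    static final class AdderCounter implements Counter {
        private final LongAdder counter = new LongAdder();
        public void inc() { counter.increment(); }
        public long counter() { return counter.sum(); }
    }

    // Increments the counter from several threads and returns the final value.
    static long hammer(Counter counter, int threads, int incrementsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.inc();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join makes the workers' updates visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.counter();
    }

    public static void main(String[] args) {
        System.out.println(hammer(new SynchronizedCounter(), 4, 100_000));
        System.out.println(hammer(new AtomicCounter(), 4, 100_000));
        System.out.println(hammer(new AdderCounter(), 4, 100_000));
    }
}
```

In a JMH version, each implementation would typically sit in a shared state object, with `inc()` and `counter()` placed in the same `@Group` so the write and read paths run concurrently.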
  85. profilers: they can provide some insights into your code

    java -jar benchmark.jar -lprof
    java -jar benchmark.jar -prof hs_gc
  86. reporters: and last but not least, writing test results to files

    java -jar benchmark.jar -lr
    java -jar benchmark.jar -rf csv -rff results.csv
  87. tips and tricks: on laptops, CPU frequency governors can trick you; it’s easy to control them on Linux with cpufreq-set