Slide 1

Slide 1 text

Do you trust profilers? I once did too Johannes Bechberger The best JDKTM Krzysztof Ślusarski

Slide 2

Slide 2 text

Image by Yatheesh Gowda from Pixabay Nightmares of a profiler developer Johannes Bechberger’s gruesome tales on Java profiling APIs

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Client Admin Server Question? Set Question JSON

Slide 6

Slide 6 text

Premature Optimization We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. — Donald Knuth “ Source: Jacob Appelbaum, Wikipedia

Slide 7

Slide 7 text

So the event came... ... and the server was overwhelmed

Slide 8

Slide 8 text

Premature Optimization We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. — Donald Knuth “ Source: Jacob Appelbaum, Wikipedia

Slide 9

Slide 9 text

Profilers to the rescue

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Source: Abgedreht Karlsruhe

Slide 12

Slide 12 text

main serverLoop handleQuestionRequest currentQuestion isQuestionEnabled fun main() { ... serverLoop() } fun serverLoop() { while (true) { req = ... if (req.isQuestionRequest) handleQuestionRequest(req) ... } } fun handleQuestionRequest(req) { if (isQuestionEnabled()) { emit(currentQuestion().json) } else { emit({}) } } parseJSON parseJSON

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

main serverLoop handleQuestionRequest fun main() { ... serverLoop() } fun serverLoop() { while (true) { req = ... if (req.isQuestionRequest) handleQuestionRequest(req) ... } } fun handleQuestionRequest(req) { if (isQuestionEnabled()) { emit(currentQuestion().json) } else { emit({}) } }

Slide 15

Slide 15 text

Debugging Testing Profiling Toolbox

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Who of you used a profiler?

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

What is profiling? A report on the amounts of time spent in each routine of a program, used to find and tune away the hot spots in it. — The Jargon File “

Slide 21

Slide 21 text

Different Profilers

Slide 22

Slide 22 text

Instrumenting Profilers Inserting instructions into the code automatically fun serverLoop() { logEntry(“serverLoop”) while (true) { req = ... if (req.isQuestionRequest) handleQuestionRequest(req) ... } logExit(“serverLoop”) } But JITs?

Slide 23

Slide 23 text

Sampling Profilers fun serverLoop() { while (true) { req = ... if (req.isQuestionRequest) handleQuestionRequest(req) ... } }

Slide 24

Slide 24 text

Profilers aren’t rocket science

Slide 25

Slide 25 text

So let’s write our own *ignoring safepoint bias * https://mostlynerdless.de

Slide 26

Slide 26 text

Main agentmain(args) premain(args) Profiler sample sleep Store start store

Slide 27

Slide 27 text

Main agentmain(args) premain(args)

Slide 28

Slide 28 text

Main Class java -javaagent:./target/tiny_profiler.jar=agentArgs ...

Slide 29

Slide 29 text

Normally public static void main(String[] args) { ... } public static void main(String args[]) { ... } void main() { ... }

Slide 30

Slide 30 text

Main Class java -javaagent:./target/tiny_profiler.jar=agentArgs ... public static void agentmain(String agentArgs) { premain(agentArgs); } JVM start later attach attach public static void premain(String agentArgs) { Main main = new Main(); main.run(new Options(agentArgs)); }

Slide 31

Slide 31 text

Main class private void run(Options options) { Thread t = new Thread( Profiler.newInstance(options)); t.setDaemon(true); t.setName("Profiler"); t.start(); } public class Options { Duration interval; Optional flamePath; boolean printMethodTable; }

Slide 32

Slide 32 text

Profiler sample sleep

Slide 33

Slide 33 text

Profiler public static Profiler newInstance(Options options) { Profiler profiler = new Profiler(options); Runtime.getRuntime() .addShutdownHook( new Thread(profiler::onEnd)); return profiler; }

Slide 34

Slide 34 text

Profiler private void sample() { Thread.getAllStackTraces() .forEach( (thread, st) -> { if (thread.isDaemon()) { return; } store.addSample(st); } }); }

Slide 35

Slide 35 text

StackTraceElement class StackTraceElement { Class> declaringClassObject; String classLoaderName; String moduleName; String moduleVersion; String declaringClass; String methodName; String fileName; int lineNumber; }

Slide 36

Slide 36 text

Store

Slide 37

Slide 37 text

Store class Store { Optional flamePath; Map methodOnTopSampleCount; Map methodSampleCount; long totalSampleCount = 0; Node rootNode = new Node("root"); void addSample(StackTraceElement[] st) void printMethodTable() void storeFlameGraphIfNeeded() }

Slide 38

Slide 38 text

Flamegraphs would be great main serverLoop handleQuestionRequest currentQuestion isQuestionEnabled parseJSON parseJSON

Slide 39

Slide 39 text

Turning traces into flamegraphs parseJSON currentQuestion handleQR serverLoop main private static class Node { final String method; final Map children; long samples = 0; void addTrace(List trace) }

Slide 40

Slide 40 text

Turning traces into flamegraphs parseJSON currentQuestion handleQR serverLoop main main serverLoop handleQR currentQuestion parseJSON 1 1 1 1 1

Slide 41

Slide 41 text

Flamegraphs would be great main serverLoop handleQuestionRequest currentQuestion isQuestionEnabled parseJSON parseJSON

Slide 42

Slide 42 text

Turning traces into flamegraphs parseJSON currentQuestion handleQR serverLoop main main serverLoop handleQR currentQuestion parseJSON 2 2 2 2 2

Slide 43

Slide 43 text

Flamegraphs would be great main serverLoop handleQuestionRequest currentQuestion isQuestionEnabled parseJSON parseJSON

Slide 44

Slide 44 text

Turning traces into flamegraphs currentQuestion handleQR serverLoop main main serverLoop handleQR currentQuestion parseJSON 3 3 3 3 2

Slide 45

Slide 45 text

Flamegraphs would be great main serverLoop handleQuestionRequest currentQuestion isQuestionEnabled parseJSON parseJSON

Slide 46

Slide 46 text

Turning traces into flamegraphs isQuestionEnabled handleQR serverLoop main main serverLoop handleQR currentQuestion parseJSON 4 4 4 3 2 parseJSON isQuestionEnabled parseJSON 1 1

Slide 47

Slide 47 text

Turning traces into flamegraphs main serverLoop handleQR currentQuestion parseJSON 4 4 4 3 2 isQuestionEnabled parseJSON 1 1 main serverLoop handleQuestionRequest currentQuestion isQEnabled parseJSON parseJSON

Slide 48

Slide 48 text

With the power of d3-flame-graph

Slide 49

Slide 49 text

Reality

Slide 50

Slide 50 text

Java/JDK Mission Control Application Performance Monitors, ... Sampling Profiler External VisualVM Netbeans async-profiler Sync Async Forte Analyzer Built-In Java/JDK Flight Recorder 2016 1991 ASGCT 2002 2012 2018 Open Source 2010

Slide 51

Slide 51 text

Obtaining a profile

Slide 52

Slide 52 text

Async Profiler java \ -agentpath:libasyncProfiler.so=start,\ event=cpu,\ file=flame.html,flamegraph \ arguments Download from GitHub

Slide 53

Slide 53 text

JDK Flight Recorder (JFR) java \ -XX:+UnlockDiagnosticVMOptions \ -XX:+DebugNonSafepoints \ -XX:+FlightRecorder \ -XX:StartFlightRecording=filename=file.jfr \ arguments Already included in your JDK 8+

Slide 54

Slide 54 text

Don’t trust them

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

Safepoint fun serverLoop() { while (true) { req = ... if (req.isQuestionRequest) handleQuestionRequest(req) ... } } Safepoint requested? Safepoint requested? Safepoint requested? Safepoint requested? Safepoint requested? Yes Stop!

Slide 57

Slide 57 text

Safepoint

Slide 58

Slide 58 text

Please stop, please, when you’re ready. Stop! using Safepoints fully asynchronous Safepoint bias

Slide 59

Slide 59 text

Please stop, please, when you’re ready. Stop! Thread 1 using Safepoints fully asynchronous Safepoint bias with multiple threads

Slide 60

Slide 60 text

AsyncGetCallTrace

Slide 61

Slide 61 text

write writeBytes FileOutputStream::write BufferedOutputStream::flushBuffer BufferedOutputStream::implFlush BufferedOutputStream::flush PrintStream::implWrite PrintStream::write ... PrintStream::println BasicSample::waitForever BasicSample::main JavaCalls::call_helper jni_invoke_static ... _pthread_start ucontext frames num_frames 16 ASGCT_CallTrace Stack walking C/C++ frames Java frames Stack lineno (BCI) 25 method_id &FileOutputStr eam::write ...

Slide 62

Slide 62 text

Tests in the OpenJDK?

Slide 63

Slide 63 text

1

Slide 64

Slide 64 text

But its a good one?

Slide 65

Slide 65 text

public class ASGCTBaseTest { // ... public static void main(String[] args) { if (!check()) { throw ...; } } private static native boolean check(); } https://github.com/openjdk/jdk/blob/7d25bf7722f6fbe3633dc718adf6f755e354adb9/test/hotspot/jtreg/serviceability/AsyncGetCallTrace/MyPackage/ASGCTBaseTest.java checked just this method

Slide 66

Slide 66 text

Take profiles with a grain of salt

Slide 67

Slide 67 text

Profilers have bugs, e.g. JDK-8302320 public class Main { public static void main(String[] args) ... { Class> klass = Main.class; Method mainMethod = klass.getMethod("test"); mainMethod.invoke(null); } static void test() { javaLoop(); } static void javaLoop() { /* endless loop */ } } missing Flamegraph

Slide 68

Slide 68 text

But surely OpenJ9 is better?

Slide 69

Slide 69 text

0 Tests for AsyncGetCallTrace

Slide 70

Slide 70 text

Or GetStackTrace?

Slide 71

Slide 71 text

2

Slide 72

Slide 72 text

Safepoint biased is bad

Slide 73

Slide 73 text

@Benchmark public void blameSetResult() { byte b = 0; for (int i = 0; i < size; i++) { b += buffer[i]; } setResult(b); /* LINE 38 */ } private void setResult(byte b) { setResult(b == 1); /* LINE 90 */ } @CompilerControl(CompilerControl.Mode.DONT_INLINE) private void setResult(boolean b) { result = b; } http://psy-lob-saw.blogspot.com/2016/02/why-most-sampling-java-profilers-are.html

Slide 74

Slide 74 text

Profilers are just software

Slide 75

Slide 75 text

Tests could better > 1

Slide 76

Slide 76 text

Performance and accuracy could be better

Slide 77

Slide 77 text

Could be safer* *crashes are rare in practice

Slide 78

Slide 78 text

Not designed with safety in mind AGCT is a legacy mechanism that was created for one single unsupported purpose – David Holmes I should add that the CrashProtection mechanism was mainly put in place as a result of having to deliver JFR from JRockit into Hotspot under a deadline, upholding feature-parity. The stack walking code was in really bad shape back then. – Markus Grönlund “ “

Slide 79

Slide 79 text

TDD Test Driven Development

Slide 80

Slide 80 text

TDD

Slide 81

Slide 81 text

RDD Rage Driven Development

Slide 82

Slide 82 text

More Tests

Slide 83

Slide 83 text

No content

Slide 84

Slide 84 text

No content

Slide 85

Slide 85 text

Testing Approaches

Slide 86

Slide 86 text

Layered Oracle Instrumentation based traces GetStackTrace ASGCT at safe point AsyncGetCallTrace checks fun serverLoop() { logEntry(“serverLoop”) while (true) { req = ... if (req.isQuestionRequest) handleQuestionRequest(req) ... } logExit(“serverLoop”) }

Slide 87

Slide 87 text

Fuzzing for safety memory AsyncGetCallTrace(..., frame = •, ...)

Slide 88

Slide 88 text

Experimentation technique

Slide 89

Slide 89 text

Profiling Loop Model Hypothesis Evaluation

Slide 90

Slide 90 text

Profilers are great But don’t trust them too much

Slide 91

Slide 91 text

@parttimen3rd on Twitter parttimenerd on GitHub mostlynerdless.de @SweetSapMachine sapmachine.io