Your Program as a Transpiler: Improving Application Performance by Applying Compiler Design

The term "transpiler" indicates a program that translates source code from one programming language into another; but transpilers are close cousins of compilers and, at the end of the day, compilers are just programs that transform an input into an output. As GraalVM becomes more and more relevant, with new native-first frameworks such as Quarkus coming into the picture, we have the opportunity to get huge performance boosts in our applications. But we need to learn to think differently about our own code, recognizing the parts that can be processed statically and those that need to be processed dynamically. In other words, we need to understand what the "compiler" part of our own programs is! We will explore this brave new world together and get a sneak peek at what is coming next in the Drools rule engine and the jBPM platform.

Edoardo Vacchi

April 13, 2019
Transcript

  1. About Me
    • Edoardo Vacchi @evacchi
    • Research @ UniMi / Horsa
    • Research @ UniCredit R&D
    • Drools and jBPM Team @ Red Hat
  2. Motivation
    • Language implementation is often seen as a dark art
    • But some design patterns are simple at their core
    • Best practices can be applied to everyday programming
  3. Motivation (cont'd)
    • As GraalVM becomes more and more relevant, thinking differently about our own code can buy us quick performance wins
  4. Boot-time vs. Run-time
    • There is always a pre-processing phase where you prepare your program for execution
    • Then there is the actual execution phase
  5. Goals
    1. How can we factor pre-processing out of program run-time?
    2. Can we factor it out of the program?
  6. Example: A Quick DI Framework
    https://github.com/evacchi/reflection-vs-codegen
    public class Example {
        private final Animal animal;
        @Inject
        public Example(Animal animal) { this.animal = animal; }
        public Animal animal() { return animal; }
    }
    public interface Animal {}
    @InjectCandidate
    public class Dog implements Animal {}
  7. Binder binder = new Binder();
    binder.scan();
    Example ex = binder.createInstance(Example.class);
    Animal animal = ex.animal();
    Objects.requireNonNull(animal);
    assert animal instanceof Dog;
    https://github.com/evacchi/reflection-vs-codegen
    Vaguely inspired by Guice https://github.com/google/guice
  8. public class Binder {
        public Binder scan() {
            Reflections reflections = new Reflections();
            reflections.getTypesAnnotatedWith(InjectCandidate.class)
                    .forEach(t -> bindings.put(interfaceOf(t), constructorOf(t)));
            return this;
        }
        public <T> T createInstance(Class<? extends T> t) {
            return (T) Arrays.stream(t.getDeclaredConstructors())
                    .filter(c -> c.getAnnotation(Inject.class) != null)
                    .peek(c -> c.setAccessible(true))
                    .map(this::createInstance)
                    .findFirst().get();
        }
        ...
    }
    https://github.com/evacchi/reflection-vs-codegen
  9. Example: Boot Time
    % time java io.github.evacchi.Reflective
    6.94s user 0.29s system 259% cpu 2.785 total
    • Not much, but not great
    • Might be fine for long-running
    • Not great for microservices or serverless
  10. GraalVM: “One VM to Rule Them All”
    • Polyglot VM with cross-language JIT
    • Java Bytecode and JVM Languages
    • Dynamic Languages (Truffle API)
    • Native binary compilation (SubstrateVM)
  12. Native Image
    % javac A.java
    % time java A
    hello
    java A 0.09s user 0.02s system 39% cpu 0.267 total
    % native-image A
    ...
    % time ./a
    hello
    ./a 0.00s user 0.00s system 86% cpu 0.001 total
    public class A {
        public static void main(String[] args) {
            System.out.println("hello");
        }
    }
  13. Our DI Framework breaks
    % native-image io.github.evacchi.Reflective
    Build on Server(pid: 24595, port: 42437)*
    [io.github.evacchi.reflective:24595] classlist: 11,753.18 ms
    [io.github.evacchi.reflective:24595] (cap): 1,514.71 ms
    [io.github.evacchi.reflective:24595] setup: 4,127.10 ms
    [io.github.evacchi.reflective:24595] analysis: 1,006.40 ms
    Fatal error: com.oracle.svm.core.util.VMError$HostedError: should not reach here
        at com.oracle.svm.core.util.VMError.shouldNotReachHere(VMError.java:62)
        ...
    Error: Image build request failed with exit status 1
  14. Native Image: Restrictions
    • Native binary compilation
    • Restriction: "closed-world assumption"
    • No dynamic code loading
    • You must declare classes upon which you plan to do reflection
  15. Transpilers vs. Compilers
    • Compiler: translates code written in a language (source code) into code written in a target language (object code). The target language may be at a lower level of abstraction
    • Transpiler: translates code written in a language into code written in another language at the same level of abstraction (Source-to-Source Translator).
  16. Are transpilers simpler than compilers?
    • Lower-level languages are complex
      • They are not: if anything, they're simple
    • Syntactic sugar is not a higher level of abstraction
      • It is: a concise construct is expanded at compile-time
    • Proper compilers do low-level optimizations
      • You are thinking of optimizing compilers.
  17. The distinction is moot
    • It is pretty easy to write a crappy compiler, call it a transpiler and feel at peace with yourself
    • Writing a good transpiler is no different or harder than writing a good compiler
    • So, how do you write a good compiler?
  18. Compiler-like workflows
    • At least two classes of problems can be solved with compiler-like workflows
    • Data transformation problems
    • Boot time optimization problems
  20. What's a compilation phase?
    • It's your setup phase.
    • You do it only once before the actual processing of your program begins
    • But do you have to do it every single time it starts?
  21. Configuring the application
    • Will that configuration change across runs?
    • Do you have to repackage the application to bundle the new configuration?
  22. Application wiring
    • You are building an immutable Dockerized microservice
    • Do you really need all that Runtime Reflection?
    • Do you really need Runtime Dependency Injection?
    public class Example {
        private final Animal animal;
        @Inject
        public Example(Animal animal) { this.animal = animal; }
        public Animal animal() { return animal; }
    }
    public interface Animal {}
    @InjectCandidate
    public class Dog implements Animal {}
  23. All these things make your startup slow!
    • But it's done only once!
      • Never is better than once
    • But it's flexible
      • Ask yourself when you last changed dependencies/startup config/classpath at runtime
      • If it's recent, ask yourself what price you pay for that flexibility
  24. Compiling a programming language
    • You start from a text representation of a program
    • The text representation is fed to a parser
    • The parser returns a parse tree
    • The parse tree is refined into an abstract syntax tree (AST)
    • The AST is further refined through intermediate representations (IRs)
    • Up until the final representation is returned
  26. Recognize your compiler passes
    1. Collect your resources
    2. Find all the dependencies between resources
    3. Build a data structure representation (e.g. a graph)
    4. Visit the resulting structure as many times as you like
    5. Generate the (source) code
  27. What makes a compiler a proper compiler
    • Not optimization
    • Compilation Phases
    • You can have as many as you like
  28. Compilation Phases
    • Misconception: one pass doing many things is better than doing many passes, each doing one thing
    • It is not: the complexity is the same
  29. Compilation Phases
    • Better separation of concerns
    • Better testability (you can test each intermediate result)
    • You can choose when and where each phase gets evaluated
  30. Example. A Configuration File
    1 Read file from (class)path
    2 Unmarshal file into a typed object
    3 Resolve includes
    4 Resolve variables
    5 Validate data types and values
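The configuration-file phases can be sketched as separate methods, each returning an intermediate result that can be tested on its own. This is only an illustration under stated assumptions: the `key=value` format, the `${name}` variable syntax, the `port` key and the class name `ConfigPhases` are all hypothetical and not taken from the talk; reading the file (phase 1) and resolving includes (phase 3) are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: three of the five phases, one method per phase.
final class ConfigPhases {
    // Assumed variable syntax: ${name}
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Phase 2: unmarshal "key=value" lines into a map (stand-in for a typed object).
    static Map<String, String> unmarshal(String text) {
        Map<String, String> raw = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) continue;
            int eq = trimmed.indexOf('=');
            if (eq < 0) throw new IllegalArgumentException("Missing '=': " + trimmed);
            raw.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return raw;
    }

    // Phase 4: replace each ${name} with the raw value of another key (one level only).
    static Map<String, String> resolveVariables(Map<String, String> raw) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            Matcher m = VAR.matcher(entry.getValue());
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                String value = raw.get(m.group(1));
                if (value == null) {
                    throw new IllegalArgumentException("Undefined variable: " + m.group(1));
                }
                m.appendReplacement(out, Matcher.quoteReplacement(value));
            }
            m.appendTail(out);
            resolved.put(entry.getKey(), out.toString());
        }
        return resolved;
    }

    // Phase 5: validate data types and values (here: a single integer key, "port").
    static int validatePort(Map<String, String> resolved) {
        int port = Integer.parseInt(resolved.get("port"));
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }
}
```

Because every phase takes the previous result as a plain value, any prefix of the chain could in principle run at build time, with only the remaining phases left for startup.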
  31. Example. An ORM Library
    1 Scan classpath for annotations
    2 Find relationships between classes
    3 Fetch the DB Catalog
    4 Synthesize prepared statements
    5 Synthesize entity implementations
  32. Example. A DI Framework
    1 Scan classpath for annotations
    2 Find relationships between classes
    3 Verify all deps are satisfied
    4 Find a cycle-free path
    5 Synthesize factories
    public class Example {
        private final Animal animal;
        @Inject
        public Example(Animal animal) { this.animal = animal; }
        public Animal animal() { return animal; }
    }
    public interface Animal {}
    @InjectCandidate
    public class Dog implements Animal {}
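Phases 3 and 4 of the DI example (verify that all dependencies are satisfied, then find a cycle-free path) can be sketched as a depth-first topological sort over a dependency map. This is not the code from the linked repository: the class name `DependencyPlanner` and the representation of components as plain strings are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: component name -> names of the components it depends on.
final class DependencyPlanner {

    // Phase 3: every dependency must itself be a known component.
    static void verifySatisfied(Map<String, List<String>> deps) {
        for (Map.Entry<String, List<String>> entry : deps.entrySet()) {
            for (String dependency : entry.getValue()) {
                if (!deps.containsKey(dependency)) {
                    throw new IllegalStateException(
                            entry.getKey() + " depends on unknown component " + dependency);
                }
            }
        }
    }

    // Phase 4: order components so that dependencies come before their dependents.
    static List<String> creationOrder(Map<String, List<String>> deps) {
        verifySatisfied(deps);
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String name : deps.keySet()) {
            visit(name, deps, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String name, Map<String, List<String>> deps,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(name)) return;
        // Meeting a node that is still being visited means we walked around a cycle.
        if (!inProgress.add(name)) {
            throw new IllegalStateException("Dependency cycle at " + name);
        }
        for (String dependency : deps.get(name)) {
            visit(dependency, deps, done, inProgress, order);
        }
        inProgress.remove(name);
        done.add(name);
        order.add(name);
    }
}
```

With `Example` depending on `Animal`, the resulting order puts `Animal` first; a factory synthesizer (phase 5) could then walk that list once and emit constructor calls in a safe order.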
  33. Binder binder = new Binder();
    binder.scan();
    Example ex = binder.createInstance(Example.class);
    Animal animal = ex.animal();
    Objects.requireNonNull(animal);
    assert animal instanceof Dog;
    https://github.com/evacchi/reflection-vs-codegen
  34. public class Binder {
        public Binder scan() {
            Reflections reflections = new Reflections();
            reflections.getTypesAnnotatedWith(InjectCandidate.class)
                    .forEach(t -> bindings.put(interfaceOf(t), constructorOf(t)));
            return this;
        }
        public <T> T createInstance(Class<? extends T> t) {
            return (T) Arrays.stream(t.getDeclaredConstructors())
                    .filter(c -> c.getAnnotation(Inject.class) != null)
                    .peek(c -> c.setAccessible(true))
                    .map(this::createInstance)
                    .findFirst().get();
        }
        ...
    }
    https://github.com/evacchi/reflection-vs-codegen
    • At run-time, complexity is easy to miss
    • This loop gets executed at each instance creation
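One way to keep the constructor search out of the per-instance path, while still using reflection, is to resolve the annotated constructor once per class and cache it. The sketch below is not the repository's `Binder`: `ConstructorCache`, the `scans` counter, the sample `Greeter` class and the locally declared `@Inject` annotation are hypothetical stand-ins.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the talk's @Inject annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.CONSTRUCTOR)
@interface Inject {}

// Hypothetical sample component with an injectable constructor.
class Greeter {
    @Inject
    Greeter() {}
}

final class ConstructorCache {
    private final Map<Class<?>, Constructor<?>> cache = new ConcurrentHashMap<>();
    // Counts how often the reflective search actually runs (for demonstration only).
    final AtomicInteger scans = new AtomicInteger();

    // The search over declared constructors runs at most once per class.
    Constructor<?> injectConstructorOf(Class<?> type) {
        return cache.computeIfAbsent(type, this::scan);
    }

    private Constructor<?> scan(Class<?> type) {
        scans.incrementAndGet();
        Constructor<?> constructor = Arrays.stream(type.getDeclaredConstructors())
                .filter(c -> c.isAnnotationPresent(Inject.class))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException(
                        "No @Inject constructor on " + type.getName()));
        constructor.setAccessible(true);
        return constructor;
    }
}
```

This only moves the cost from every `createInstance` call to the first one; the lookup still happens at run time, which is the part the rest of the talk moves to build time.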
  35. public void scan() {
        Reflections reflections = new Reflections();
        // resolve injection candidates
        reflections.getTypesAnnotatedWith(InjectCandidate.class);
        // resolve injected constructors
        reflections.getConstructorsAnnotatedWith(Inject.class);
        // collect candidates
        reflections.forEach(this::collect);
        // resolve mappings
        resolveMappings();
    }
  36. Compilation Phases
    • Phases are now more apparent
    • Run-time is not affected
    • Careful: classpath scanning is still costly
    • Avoid doing it more than once!
    • Do we need to scan() at each startup?
  37. Code-generated DI
    GeneratedBinder binder = new GeneratedBinder();
    Example ex = binder.createInstance(Example.class);
    Animal animal = ex.animal();
    Objects.requireNonNull(animal);
    assert animal instanceof Dog;
    public class GeneratedBinder {
        public <T> T createInstance(Class<?> type) {
            if (Example.class == type) return (T) new Example(new Dog());
            if (Animal.class == type) return (T) new Dog();
            throw new UnsupportedOperationException();
        }
    }
    cf. https://google.github.io/dagger/
    Binder binder = new Binder();
    binder.scan();
    Example ex = binder.createInstance(Example.class);
    Animal animal = ex.animal();
    Objects.requireNonNull(animal);
    assert animal instanceof Dog;
  38. FOR THE LOVE OF GOD
    DO NOT CONCATENATE STRINGS
    Seriously, Stop. You're Killing Kittens.
    Not Even StringBuilder
  39. Use proper code generation tooling
    • Use a type-safe API to generate source or byte code
    • JavaParser provides code generation APIs
    • JavaPoet
    • Byte Buddy
    • ASM
    • etc.
    • Don't like APIs? A templating engine is fine too
    • It won't be type-safe, but it's still OK
    • Hell, even String.format() is better
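For the "a templating engine is fine too" option, a minimal template-based generator built on `String.format()` might look like the sketch below. It is not type-safe and is not how the linked repository generates code; `BinderTemplate`, its templates and the shape of the `bindings` map (type name mapped to a constructor expression) are assumptions for illustration. The template literal is split across lines only for readability; the variable parts are filled in by `String.format()`.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: renders a GeneratedBinder-like class from fixed templates.
final class BinderTemplate {
    private static final String CLASS_TEMPLATE =
            "package %s;%n%n"
            + "public class %s {%n"
            + "    @SuppressWarnings(\"unchecked\")%n"
            + "    public <T> T createInstance(Class<?> type) {%n"
            + "%s"
            + "        throw new UnsupportedOperationException();%n"
            + "    }%n"
            + "}%n";

    private static final String CASE_TEMPLATE =
            "        if (%s.class == type) return (T) %s;%n";

    // bindings: simple type name -> Java expression that constructs an instance.
    static String render(String packageName, String className, Map<String, String> bindings) {
        String cases = bindings.entrySet().stream()
                .map(e -> String.format(CASE_TEMPLATE, e.getKey(), e.getValue()))
                .collect(Collectors.joining());
        return String.format(CLASS_TEMPLATE, packageName, className, cases);
    }
}
```

Calling `BinderTemplate.render("io.github.evacchi", "GeneratedBinder", bindings)` with entries for `Example` and `Animal` yields source text of the same shape as the `GeneratedBinder` shown earlier in the deck. Nothing checks that the expressions in the map are valid Java, which is the trade-off the slide mentions.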
  40. JavaParser
    private void generateJavaSources(Bindings bindings) {
        String packageName = "io.github.evacchi";
        String className = "GeneratedBinder";
        String sourceFileName = packageName + "." + className;
        CompilationUnit cu = new CompilationUnit();
        ClassOrInterfaceDeclaration cls = cu
                .setPackageDeclaration("io.github.evacchi")
                .addClass(className);
        MethodDeclaration methodDeclaration = cls
                .addMethod("createInstance")
                .setTypeParameters(new NodeList<>(new TypeParameter("T")))
                .setType("T")
                .setModifiers(Modifier.Keyword.PUBLIC)
                .addParameter(Class.class, "type");
        ...
    }
    https://github.com/evacchi/reflection-vs-codegen
  41. Write a build plug-in
    • A Maven/Gradle/SBT/Whatever Plug-In
    • An Annotation Processor
    • A Quarkus Extension
  42. DI: Annotation Processor
    The processor is triggered by the Java compiler for claimed annotations.
    Bindings bindings = processInjectionCandidates(
            env.getElementsAnnotatedWith(InjectCandidate.class));
    processInjectionSites(
            env.getElementsAnnotatedWith(Inject.class), bindings);
    generateJavaSources(bindings);
    https://github.com/evacchi/reflection-vs-codegen
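A bare-bones annotation processor, using only the standard `javax.annotation.processing` API, could look roughly like this. It is a simplified sketch rather than the processor from the repository: instead of generating a binder it writes an index of annotated classes, so that a run-time component would not need to scan the classpath. The fully qualified annotation name, the class name `CandidateIndexProcessor` and the index path `META-INF/inject-candidates.txt` are all assumptions.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Set;
import java.util.TreeSet;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;
import javax.tools.StandardLocation;

// Assumed fully qualified name of the talk's @InjectCandidate annotation.
@SupportedAnnotationTypes("io.github.evacchi.InjectCandidate")
class CandidateIndexProcessor extends AbstractProcessor {
    private final Set<String> candidates = new TreeSet<>();

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment env) {
        // Collect the qualified names of all annotated classes seen in this round.
        for (TypeElement annotation : annotations) {
            for (Element element : env.getElementsAnnotatedWith(annotation)) {
                if (element instanceof TypeElement) {
                    candidates.add(((TypeElement) element).getQualifiedName().toString());
                }
            }
        }
        // Write the index once, after the last round.
        if (env.processingOver() && !candidates.isEmpty()) {
            writeIndex();
        }
        return false; // do not claim the annotation; other processors may still see it
    }

    private void writeIndex() {
        // Hypothetical index location, one class name per line.
        try (Writer writer = processingEnv.getFiler()
                .createResource(StandardLocation.CLASS_OUTPUT, "", "META-INF/inject-candidates.txt")
                .openWriter()) {
            for (String name : candidates) {
                writer.write(name);
                writer.write('\n');
            }
        } catch (IOException e) {
            processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR, String.valueOf(e.getMessage()));
        }
    }
}
```

To have `javac` pick up such a processor, it is typically listed in `META-INF/services/javax.annotation.processing.Processor` or passed with the `-processor` option.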
  43. Example
    % time java io.github.evacchi.Reflective
    6.94s user 0.29s system 259% cpu 2.785 total
    % time java io.github.evacchi.Codegen
    0.08s user 0.01s system 111% cpu 0.087 total
  44. Example
    % time java io.github.evacchi.Reflective
    6.94s user 0.29s system 259% cpu 2.785 total
    % time java io.github.evacchi.Codegen
    0.08s user 0.01s system 111% cpu 0.087 total
    % time ./io.github.evacchi.codegen
    ./io.github.evacchi.codegen 0.00s user 0.00s system 86% cpu 0.003 total
  45. The Submarine Initiative
    “The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” Edsger W. Dijkstra
  46. AI and Automation Platform
    • Drools rule engine
    • jBPM workflow platform
    • OptaPlanner constraint solver
  47. Drools and jBPM
    rule R1 when
        // constraints
        $r : Result()
        $p : Person( age >= 18 )
    then
        // consequence
        $r.setValue( $p.getName() + " can drink");
    end
    Drools jBPM
  48. Drools
    DRL
    rule R1 when
        // constraints
        $r : Result()
        $p : Person( age >= 18 )
    then
        // consequence
        $r.setValue( $p.getName() + " can drink");
    end
    var r = declarationOf(Result.class, "$r");
    var p = declarationOf(Person.class, "$p");
    var rule = rule("com.example", "R1").build(
            pattern(r),
            pattern(p)
                    .expr("e", person -> person.getAge() >= 18,
                            alphaIndexedBy(int.class, GREATER_OR_EQUAL, 1, this::getAge, 18),
                            reactOn("age")),
            on(p, r).execute(
                    ($p, $r) -> $r.setValue($p.getName() + " can drink")));
  49. jBPM
    RuleFlowProcessFactory factory = RuleFlowProcessFactory.createProcess("demo.orderItems");
    factory.variable("order", new ObjectDataType("com.myspace.demo.Order"));
    factory.variable("item", new ObjectDataType("java.lang.String"));
    factory.name("orderItems");
    factory.packageName("com.myspace.demo");
    factory.dynamic(false);
    factory.version("1.0");
    factory.visibility("Private");
    factory.metaData("TargetNamespace", "http://www.omg.org/bpmn20");
    org.jbpm.ruleflow.core.factory.StartNodeFactory startNode1 = factory.startNode(1);
    startNode1.name("Start");
    startNode1.done();
    org.jbpm.ruleflow.core.factory.ActionNodeFactory actionNode2 = factory.actionNode(2);
    actionNode2.name("Show order details");
    actionNode2.action(kcontext -> {
  50. Take Aways
    • Do more in the pre-processing phase (compile-time)
    • Do less during the processing phase (run-time)
    • In other words, separate what you can do once from what you have to do repeatedly
    • Process in phases
    • Move all or some of your phases to compile-time
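A small everyday instance of "separate what you can do once from what you have to do repeatedly" is regular-expression handling in the JDK: `Pattern.compile` is the one-time step, matching is the repeated step. The class `VersionCheck` and its regular expression are made up for this illustration and are not from the talk.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: the same check with and without a separate "compile" phase.
final class VersionCheck {
    // Done once, when the class is initialized.
    private static final Pattern SEMVER = Pattern.compile("\\d+\\.\\d+\\.\\d+");

    // Done repeatedly: only the matching happens on each call.
    static boolean isSemver(String text) {
        return SEMVER.matcher(text).matches();
    }

    // For contrast: String.matches compiles the expression again on every call.
    static boolean isSemverRecompiling(String text) {
        return text.matches("\\d+\\.\\d+\\.\\d+");
    }
}
```

Both methods return the same answers; the difference is only in when the preparation work is paid for, which is the same question the deck asks about classpath scanning and dependency wiring.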
  51. Resources
    • Full Source Code https://github.com/evacchi/reflection-vs-codegen
    • KIE.org Drools, jBPM, OptaPlanner
    • Submarine https://github.com/kiegroup/submarine-examples
    • Drools Blog http://blog.athico.com
    • Other resources
    • GraalVM.org
    • Quarkus.io
    • Dagger https://google.github.io/dagger/
    Edoardo Vacchi @evacchi
  52. Q&A