Code Duplication in the Graal Compiler

Virtual Machine Meetup

October 06, 2017

Transcript

  1. CODE DUPLICATION IN THE GRAAL COMPILER

    Virtual Machine Meetup (VMM), Prague 2017. David Leopoldseder, JKU Linz, Austria; Lukas Stadler, Oracle Labs, Linz, Austria; Thomas Würthinger, Oracle Labs, Zurich, Switzerland.
  2. DUPLICATION IN GRAAL

    Duplication Simulation; Duplication; Inlining; Loop Optimizations: Loop Unrolling, Loop Unswitching, Loop Peeling, Loop Inversion
  3. DUPLICATION IN GRAAL

    Duplication; Inlining; Loop Optimizations: Loop Unrolling, Loop Unswitching, Loop Peeling, Loop Inversion; Duplication Simulation
  4. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }
  5. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }

    Annotations: Fastpath; Slowpath
  6. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }

    Annotations: Fastpath; Slowpath; Expensive operation
  7. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }

    Annotations: Fastpath; Slowpath; Expensive operation; p = 2 in the false branch
  8. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }

    Annotations: Fastpath; Slowpath; Expensive operation; p = 2 in the false branch; Strength Reduction: if we know p == 2 and x.length >= 0, then x.length / 2 -> x.length >> 1
  9. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }

    Annotations: Fastpath; Slowpath; Expensive operation; p = 2 in the false branch; Merge is an optimization boundary; Strength Reduction: if we know p == 2 and x.length >= 0, then x.length / 2 -> x.length >> 1
  10. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }

    Annotations: Fastpath; Slowpath; Expensive operation; p = 2 in the false branch; Merge is an optimization boundary; Only allowed if isPowerOf2(a) == true; Strength Reduction: if we know p == 2 and x.length >= 0, then x.length / 2 -> x.length >> 1
  11. MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }

    Annotations: Fastpath; Slowpath; Expensive operation; p = 2 in the false branch; Merge is an optimization boundary; Solution -> Duplication; Strength Reduction: if we know p == 2 and x.length >= 0, then x.length / 2 -> x.length >> 1
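
The strength reduction on these slides is conditioned on x.length >= 0. A minimal Java sketch of why that precondition matters follows; the class and method names are illustrative and not from the talk. Java integer division rounds toward zero, while an arithmetic right shift rounds toward negative infinity, so the two forms only agree for non-negative dividends.

```java
// Illustrative only: class and method names are assumptions, not part of the talk.
class StrengthReductionCheck {
    // Division by the constant 2, as it appears once p is known to be 2.
    static int divByTwo(int n) {
        return n / 2;
    }

    // Strength-reduced form: arithmetic shift right by one.
    static int shiftByOne(int n) {
        return n >> 1;
    }

    // True when replacing n / 2 with n >> 1 preserves the result for this n.
    static boolean reductionIsSafe(int n) {
        return divByTwo(n) == shiftByOne(n);
    }

    public static void main(String[] args) {
        // Array lengths are never negative, so the rewrite is safe for x.length.
        System.out.println(reductionIsSafe(new int[7].length)); // true
        // For a negative odd value the forms differ: -7 / 2 is -3, -7 >> 1 is -4.
        System.out.println(reductionIsSafe(-7)); // false
    }
}
```
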
  12. MOTIVATING EXAMPLE: WHAT WE WANT IN THE END

    int f(int a, int b, int[] x) { if (a > b) { return x.length / a; } else { return x.length >> 1; } }
  13. MOTIVATING EXAMPLE: WHAT WE WANT IN THE END

    int f(int a, int b, int[] x) { if (a > b) { return x.length / a; } else { return x.length >> 1; } }

    Annotations: Slowpath; Fastpath; Approx. 15-90x faster on Nehalem
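
To make the before/after relationship concrete, here is a small Java sketch that places the original method next to the duplicated form from this slide. The class name, method names, and sample inputs are assumptions for illustration. Both versions compute the same result; the second one simply no longer has a merge in front of the division.

```java
// Illustrative sketch; class name, method names, and inputs are assumptions.
class MotivatingExample {
    // Original method: the division sits after the merge, where p is unknown.
    static int original(int a, int b, int[] x) {
        int p;
        if (a > b) {
            p = a;
        } else {
            p = 2;
        }
        return x.length / p;
    }

    // After duplicating the merge block into both branches.
    static int duplicated(int a, int b, int[] x) {
        if (a > b) {
            return x.length / a;  // still a general division
        } else {
            return x.length >> 1; // p == 2 and x.length >= 0
        }
    }

    public static void main(String[] args) {
        int[] x = new int[37];
        System.out.println(original(5, 3, x) + " " + duplicated(5, 3, x)); // 7 7
        System.out.println(original(1, 3, x) + " " + duplicated(1, 3, x)); // 18 18
    }
}
```
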
  14. ENABLED OPTIMIZATIONS

    Read Eliminations; Escape Analysis Opportunities; Canonicalizations; Conditional Eliminations; Check Eliminations; (Devirtualization); (Partial Redundancy Eliminations)
  15. PROBLEMS TO SOLVE

    P1: We need to determine which duplications will increase peak performance.
  16. PROBLEMS TO SOLVE

    P1: We need to determine which duplications will increase peak performance. P2: We need to know which optimizations are enabled by a certain duplication.
  17. PROBLEMS TO SOLVE

    P1: We need to determine which duplications will increase peak performance. P2: We need to know which optimizations are enabled by a certain duplication. P3: Finding those optimization opportunities after duplication is compile-time intensive; therefore, we need a way to perform this kind of analysis in acceptable time in a JIT compiler.
  18. APPROACH

    Steps: (1) Simulate a duplication per merge. (2) Trade off every possible duplication: sort by Benefit, Cost, Probability; decide if duplication is beneficial. (3) Duplicate & Optimize.

    Data flow: Initial IR -> Optimization Potential For each Duplication -> Beneficial Duplications -> Optimized IR

    Optimizations shown: Constant Folding; Conditional Elimination; PEA (Partial Escape Analysis) & Scalar Replacement; Strength Reduction
  22. APPROACH

    Simulation Tier: simulate a duplication per merge. Trade-off Tier: trade off every possible duplication; sort by Benefit, Cost, Probability; decide if duplication is beneficial. Optimization Tier: Duplicate & Optimize.

    Data flow: Initial IR -> Optimization Potential For each Duplication -> Beneficial Duplications -> Optimized IR

    Optimizations shown: Constant Folding; Conditional Elimination; PEA (Partial Escape Analysis) & Scalar Replacement; Strength Reduction
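
The trade-off step (sort by benefit, cost, probability, then decide) could be sketched in Java roughly as below. This is a hypothetical illustration, not Graal code: the `Candidate` type, the ranking by probability-weighted benefit, and the size budget are all assumptions, since the slides do not state the exact decision rule. The input is taken to be the per-duplication results of the simulation step.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the trade-off step. None of these types are Graal
// classes; the ranking key and the size budget are assumptions for illustration.
class DuplicationTradeOffSketch {
    // Outcome of simulating one duplication (one merge copied into one predecessor).
    static final class Candidate {
        final String merge;
        final String predecessor;
        final double benefit;     // estimated cycles saved
        final int cost;           // estimated extra instructions
        final double probability; // relative execution probability of the predecessor

        Candidate(String merge, String predecessor, double benefit, int cost, double probability) {
            this.merge = merge;
            this.predecessor = predecessor;
            this.benefit = benefit;
            this.cost = cost;
            this.probability = probability;
        }

        double weightedBenefit() {
            return benefit * probability;
        }
    }

    // Sorts candidates by probability-weighted benefit (highest first) and accepts
    // those with a positive benefit while the assumed size budget is not exceeded.
    static List<Candidate> selectBeneficial(List<Candidate> simulated, int sizeBudget) {
        List<Candidate> sorted = new ArrayList<>(simulated);
        Comparator<Candidate> byWeightedBenefit = Comparator.comparingDouble(Candidate::weightedBenefit);
        sorted.sort(byWeightedBenefit.reversed());

        List<Candidate> accepted = new ArrayList<>();
        int used = 0;
        for (Candidate c : sorted) {
            if (c.weightedBenefit() > 0 && used + c.cost <= sizeBudget) {
                accepted.add(c);
                used += c.cost;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Candidate> simulated = new ArrayList<>();
        // Else branch: latency difference 31, probability 0.9, cost 5 (from the slides).
        simulated.add(new Candidate("merge", "else", 31.0, 5, 0.9));
        // Then branch: still a general division, so no benefit (cost and probability assumed).
        simulated.add(new Candidate("merge", "then", 0.0, 5, 0.1));
        for (Candidate c : selectBeneficial(simulated, 8)) {
            System.out.println(c.merge + " <- " + c.predecessor);
        }
    }
}
```

Keeping the selection separate from the simulation mirrors the flow on the slide: the graph is only changed for the duplications that survive this step.
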
  23. RECAP MOTIVATING EXAMPLE

    int f(int a, int b, int[] x) { int p; if (a > b) { p = a; } else { p = 2; } return x.length / p; }
  24. APPROACH

    Blocks: [int p; if (a > b)], [p = a;], [p = 2;], [return x.length / p;]
  25. DOMINATOR TREE

    Blocks: [int p; if (a > b)], [p = a;], [p = 2;], [return x.length / p;] (Merge Block)
  26. DUPLICATION SIMULATION

    Blocks: [int p; if (a > b)], [p = a;], [p = 2;], [return x.length / p;]

    Copy Merge Block: two copies of [return x.length / p;]
  27. DUPLICATION SIMULATION

    Blocks: [int p; if (a > b)], [p = a;], [p = 2;], [return x.length / p;]

    Simulate Dominance Relation: two copies of [return x.length / p;]
  28. DUPLICATION SIMULATION

    Blocks: [int p; if (a > b)], [p = a;], [p = 2;], [return x.length / p;]

    Copy Propagation: in the two copies of [return x.length / p;], p is replaced by a and by 2
  29. DUPLICATION SIMULATION

    Blocks: [int p; if (a > b)], [return x.length / p;]; copies: [return x.length / 2;], [return x.length / a;]

    Apply Optimizations
  30. DUPLICATION SIMULATION

    Blocks: [int p; if (a > b)], [return x.length / p;]; copies: [return x.length / 2;], [return x.length / a;]

    Strength Reduction: x.length / 2 -> x.length >> 1
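
The simulation steps on the preceding slides (copy the merge block, propagate the value of p from each predecessor, check which optimization applies) can be sketched as a toy Java model. It is not Graal's IR: the merge block is hard-wired to `return x.length / p`, and the `simulate` method and its parameters are hypothetical. The sketch only reports what the division would become; it changes nothing.

```java
// Hypothetical mini-model, not Graal's IR: the merge block is fixed to
// "return x.length / p" and each predecessor supplies one value for p.
class DuplicationSimulationSketch {
    // Describes the division after copy propagation into one predecessor.
    // constant is null when the predecessor's value for p is not a constant;
    // symbol is the name used for that value in this case.
    static String simulate(String symbol, Integer constant) {
        if (constant == null) {
            // Nothing known about the divisor: the general division stays.
            return "x.length / " + symbol;
        }
        if (constant > 1 && Integer.bitCount(constant) == 1) {
            // Constant power of two and x.length >= 0: strength reduction applies.
            return "x.length >> " + Integer.numberOfTrailingZeros(constant);
        }
        return "x.length / " + constant;
    }

    public static void main(String[] args) {
        // True branch: p = a (not a constant).
        System.out.println(simulate("a", null)); // x.length / a
        // False branch: p = 2.
        System.out.println(simulate("p", 2));    // x.length >> 1
    }
}
```
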
  31. BENEFIT / COST -> TRADE OFF

    Blocks: [int p; if (a > b)], [return x.length / p;]; copies: [return x.length >> 1;], [return x.length / a;]
  32. BENEFIT / COST -> TRADE OFF

    Blocks: [int p; if (a > b)], [return x.length / p;]; copies: [return x.length >> 1;], [return x.length / a;]

    Benefit -> (Latency(Div) - Latency(Shift)) * Probability = 31 * 0.9 = 27.9
  33. BENEFIT / COST -> TRADE OFF

    Blocks: [int p; if (a > b)], [return x.length / p;]; copies: [return x.length >> 1;], [return x.length / a;]

    Benefit -> (Latency(Div) - Latency(Shift)) * Probability = 31 * 0.9 = 27.9

    Cost -> 1 Additional Return (+ 4 Instructions) + 1 Additional Shift + 1 Additional Read = 6; 6 - 1 Jump from branch = 5 Instructions
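
The arithmetic on this slide can be written out as a short Java sketch. The class and method names are illustrative, and the individual latencies (32 and 1 cycles) are assumptions: the slides only give their difference, 31.

```java
// Worked version of the slide's arithmetic. Names are illustrative; the split
// of the latency difference into 32 and 1 cycles is an assumption.
class TradeOffArithmetic {
    // Benefit = (Latency(Div) - Latency(Shift)) * Probability
    static double benefit(int latencyDiv, int latencyShift, double probability) {
        return (latencyDiv - latencyShift) * probability;
    }

    // Cost = added instructions minus the jump that the duplication removes.
    static int cost(int returnInstructions, int shifts, int reads, int removedJumps) {
        return returnInstructions + shifts + reads - removedJumps;
    }

    public static void main(String[] args) {
        // 31 * 0.9, the slide's benefit (subject to floating-point rounding).
        System.out.println(benefit(32, 1, 0.9));
        // 4 + 1 + 1 - 1 = 5 instructions, the slide's cost.
        System.out.println(cost(4, 1, 1, 1));
    }
}
```
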