Managed Runtime Systems: Lecture 05 - Just In Time Compilation Part 1

zakkak
February 27, 2018

Transcript

  1. Managed Runtime Systems
    Lecture 05: Just In Time Compilation Part 1
    Foivos Zakkak
    https://foivos.zakkak.net
    Except where otherwise noted, this presentation is licensed under the
    Creative Commons Attribution 4.0 International License.
    Third party marks and brands are the property of their respective holders.

  2. Acknowledgments
    The following slides are based on the corresponding slides by Mario
    Wolczko on Dynamic Compilation
    Managed Runtime Systems 1 of 29 https://foivos.zakkak.net


  3. Interpretation is Slow
    Steps through a single (or a few) bytecode(s) at a time
    Execution patterns unsuitable for HW-prefetchers
    Not enough data at hand to perform optimizations

  4. Dynamic Compilation to the Rescue
    Produce machine code
    Better execution patterns for HW-prefetchers
    More data at hand to perform optimizations
    Generation of HW-specific code
    Generation of case-specific code

  5. Dynamic Compiler Design Decisions

  6. How Much to Compile
    Basic Block
    Method
    Multiple Methods (via inlining)
    Multiple Consecutive Methods (via tracing)

  7. When to Compile
    Ahead of Time (AoT)
    At class loading
    At installation (see Android’s dex2oat)
    Just in Time (JIT)
    Compile at first invocation (stall till compilation completes)
    Re-use on later invocations
    Practical JIT
    Start compiling after a threshold number of invocations
    Interpret while compiling to avoid stalls
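
A minimal sketch of the "Practical JIT" policy above, assuming a per-method invocation counter and a fixed threshold. `Method`, `COMPILE_THRESHOLD`, and the callables passed in are hypothetical names for illustration. Compilation is synchronous here, whereas a real VM would typically compile in the background and keep interpreting in the meantime.

```python
from typing import Callable, Optional

COMPILE_THRESHOLD = 3  # hypothetical value; real VMs tune this


class Method:
    """A method that starts out interpreted and is compiled once it is hot."""

    def __init__(self, name: str,
                 interpret: Callable[[int], int],
                 compile_fn: Callable[[], Callable[[int], int]]):
        self.name = name
        self.interpret = interpret        # slow path
        self.compile_fn = compile_fn      # produces the fast path
        self.compiled: Optional[Callable[[int], int]] = None
        self.invocations = 0

    def invoke(self, arg: int) -> int:
        if self.compiled is not None:
            return self.compiled(arg)     # re-use on later invocations
        self.invocations += 1
        if self.invocations >= COMPILE_THRESHOLD:
            # Assumption: compile synchronously; this call is still interpreted.
            self.compiled = self.compile_fn()
        return self.interpret(arg)
```

With a threshold of 3, the first three calls run in the interpreter and every later call uses the compiled version.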

  8. How Much to Optimize
    Minimal
    Replace bytecodes with function calls, like macro expansion
    Simple
    Use HW-registers
    Perform peephole optimizations
    (substituting instruction sequences with more efficient ones)
    Advanced
    Run various analyses on code
    Generate multiple instances of same method for different cases
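
To illustrate the "Simple" level, here is a sketch of a peephole pass over a toy stack-machine instruction list. The instruction names (`load`, `push`, `add`, `dup`, `mul`) and the two rewrite rules are assumptions chosen for the example, not the rules of any particular compiler.

```python
def peephole(code):
    """Single left-to-right pass over a list of instruction tuples."""
    out = []
    i = 0
    while i < len(code):
        pair = code[i:i + 2]
        if pair == [("push", 0), ("add",)]:
            # x + 0 == x: drop both instructions
            i += 2
        elif len(pair) == 2 and pair[0][0] == "load" and pair[0] == pair[1]:
            # Loading the same local twice: load once, then duplicate the top of stack
            out += [pair[0], ("dup",)]
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out


program = [("load", "a"), ("load", "a"), ("push", 0), ("add",), ("mul",)]
print(peephole(program))  # [('load', 'a'), ('dup',), ('mul',)]
```

Both versions compute `a * a`; the optimized one does it in three instructions instead of five.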


  9. Interesting Facts About JIT
    Term originates from just-in-time manufacturing
    (aka kanban method)
    Appeared around the time of Java’s uptake
    Terrible misnomer (should be just too late)
    Universally misapplied
    (e.g., to dynamic compilation after first execution)
    “JIT” is not a noun

  10. So! When should I JIT compile?
    Key factors:
    Speed of interpretation
    Speed of compilation
    Speed of compiled code
    [Plot: Time against Number of Executions (0 to 5)]

  11. Measuring Speed
    1. Measure the whole run in clock cycles
    2. Subtract GC time, native code, and anything unrelated
    3. Divide by number of bytecodes to obtain cycles per bytecode
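
The three steps amount to a single subtraction and division. A small sketch, with made-up numbers and hypothetical parameter names:

```python
def cycles_per_bytecode(total_cycles, gc_cycles, native_cycles,
                        unrelated_cycles, bytecodes_executed):
    """Steps 1-3: whole run, minus unrelated work, per executed bytecode."""
    relevant = total_cycles - gc_cycles - native_cycles - unrelated_cycles
    return relevant / bytecodes_executed


# Illustrative figures only, not measurements.
print(cycles_per_bytecode(1_000_000, 150_000, 50_000, 0, 40_000))  # 20.0
```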

  12. Measuring Speed
    Interpretation time $T_i$
    Compilation time $T_c$
    Execution time of compiled code $T_e$
    Compiler over Interpreter ratio $t = \frac{T_c}{T_i}$ (mnemonic: translate)
    Interpreter over Compiled code ratio $r = \frac{T_i}{T_e}$ (mnemonic: run)

  13. The break-even point
    Break-even number of executions: $n = \frac{T_c}{T_i - T_e} = \frac{t\,r}{r - 1}$
    Example: $T_e = 1$, $T_i = 2$, $T_c = 10$, $t = 5$, $r = 2$, $n = 10$

  14. Break-even point against $r$
    $n = \frac{t\,r}{r - 1}$ for $t = 1, 2, 5, 10, 20, 50, 100$
    [Plot: $n$ on a log axis ($10^1$ to $10^4$) against $r$ (0 to 20), one curve per value of $t$]

  15. Speedup against Number of Executions
    $\text{Speedup} = \frac{n\,T_i}{T_c + n\,T_e} = \frac{n}{t + n/r}$ for $r = 5$, $t = 100$
    [Plot: Speedup on a log axis ($10^{-2}$ to $10^{1}$) against Number of Executions on a log axis ($10^0$ to $10^6$), marking Speedup $= 1$ at $n = 125$]
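
The break-even and speedup relations can be checked numerically. This sketch assumes the usual cost model: interpreting `n` executions costs `n * t_interp`, while compiling first costs `t_compile + n * t_exec`. The function and parameter names are hypothetical. `t` is the compiler-over-interpreter ratio and `r` the interpreter-over-compiled-code ratio.

```python
def break_even(t_interp, t_compile, t_exec):
    """Executions needed before compiling pays off: n*t_interp == t_compile + n*t_exec."""
    if t_exec >= t_interp:
        raise ValueError("compiled code must be faster than interpretation")
    return t_compile / (t_interp - t_exec)


def break_even_from_ratios(t, r):
    """Same quantity expressed with t = t_compile/t_interp and r = t_interp/t_exec."""
    return t * r / (r - 1)


def speedup(n, t, r):
    """Interpreting n times divided by compiling once and running n times."""
    return n / (t + n / r)


print(break_even(t_interp=2, t_compile=10, t_exec=1))  # 10.0
print(break_even_from_ratios(t=5, r=2))                # 10.0
print(speedup(n=125, t=100, r=5))                      # 1.0
```

For `t = 100` and `r = 5` the speedup is below 1 until 125 executions, and it approaches `r` as the number of executions grows.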

  16. Interaction of JITed and non-JITed Code
    JITed code can invoke non-JITed code
    JITed code can fall back to its non-JITed counterpart

  17. JITed Code Management
    Maintain a code cache holding the JITed code
    VMs manage it to ensure new code can be added when needed
    Compilers rely on a big buffer to produce the code
    The JITed code is then copied to the code cache
    Calls to that code are redirected (e.g. through method tables)
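
A minimal sketch of the flow described above, assuming a bump-allocated code cache and a dictionary standing in for the method table. `CodeCache`, `install`, and `lookup` are hypothetical names, and the byte strings are placeholders rather than real machine code.

```python
class CodeCache:
    def __init__(self, capacity):
        self.memory = bytearray(capacity)
        self.top = 0              # bump pointer: next free offset
        self.method_table = {}    # method name -> (offset, length)

    def install(self, name, buffer):
        """Copy finished code out of the compiler's buffer and redirect calls."""
        if self.top + len(buffer) > len(self.memory):
            raise MemoryError("code cache full; the VM must free space first")
        start = self.top
        self.memory[start:start + len(buffer)] = buffer
        self.top += len(buffer)
        self.method_table[name] = (start, len(buffer))  # future calls go here
        return start

    def lookup(self, name):
        """Return the entry for compiled code, or None if the method is not compiled."""
        return self.method_table.get(name)


cache = CodeCache(16)
cache.install("foo", b"\x90\x90\xc3")
cache.install("bar", b"\xc3")
print(cache.lookup("bar"))  # (3, 1)
```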

  18. When to Remove Code
    When the corresponding code becomes unreachable
    (e.g. class unloading)
    When the code cache doesn’t have enough space for new code

  20. What if Still Active?
    Keep an activation-counter
    Scan the stacks for activation records
    Sounds familiar?

  21. Stack Scanning
    Walk the stack looking for return addresses in the code
    The calling convention must allow for stack scanning
    Standard placement of return addresses in the stack
    Access to saved registers (e.g. SP, FP) of suspended threads
    Depends on the underlying architecture, OS, and calling convention
    The OS can make this impossible
    (e.g., by saving register state in kernel space).
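
As a sketch of what the scan computes, assume the stack walk has already produced a list of return addresses for one thread, and that the VM knows the address range of each compiled method. The function below (a hypothetical helper) reports which compiled methods still have an activation on that stack.

```python
def active_methods(return_addresses, code_ranges):
    """Names of compiled methods with at least one frame on the stack.

    return_addresses: addresses found by walking one thread's stack
    code_ranges: {name: (start, end)} in the code cache, end exclusive
    """
    active = set()
    for addr in return_addresses:
        for name, (start, end) in code_ranges.items():
            if start <= addr < end:
                active.add(name)
    return active


ranges = {"foo": (0x1000, 0x1040), "bar": (0x1040, 0x10A0), "baz": (0x10A0, 0x1100)}
stack = [0x1010, 0x7FFF0000, 0x1050]   # the middle address lies outside the code cache
print(sorted(active_methods(stack, ranges)))  # ['bar', 'foo']
```

Here `baz` has no activation, so its code could be dropped without patching any frame.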

  22. What if Still Active?
    Let it be
    Drop it and patch any activation records pointing to it
    Do the housekeeping (fix state, fix return address, etc.)
    Make it resume in the interpreter
    Called dynamic deoptimization
    It’s slow!!!

  23. What About Fragmentation in the Code Cache
    Self VM uses compaction
    Similar to GC
    Stop the world
    Compact code cache
    Update links (method tables) and return addresses (activation records)
    HotSpot™ drops code till enough space becomes available
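
The compaction steps can be sketched as a sliding compactor over a byte array. This is an illustration under simplifying assumptions (one method table, return addresses given as a flat list, the world already stopped), not a description of how the Self VM implements it. All names are hypothetical.

```python
def compact(memory, method_table, return_addresses):
    """Slide live code to the start of the cache and fix up references.

    memory: bytearray holding the code cache
    method_table: {name: (start, length)} for live methods only
    return_addresses: offsets into the cache taken from activation records
    """
    new_table = {}
    moved = []   # (old_start, old_end, new_start)
    free = 0
    # Process methods in address order so a move never overwrites unmoved code.
    for name, (start, length) in sorted(method_table.items(), key=lambda kv: kv[1][0]):
        memory[free:free + length] = memory[start:start + length]
        moved.append((start, start + length, free))
        new_table[name] = (free, length)     # update links
        free += length
    new_returns = []
    for addr in return_addresses:            # update return addresses
        for old_start, old_end, new_start in moved:
            if old_start <= addr < old_end:
                addr = new_start + (addr - old_start)
                break
        new_returns.append(addr)
    return new_table, new_returns, free
```

The third return value is the new bump pointer: everything from that offset onward is free for new code.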

  24. Using the Heap
    Make compiled code an object in the heap
    Pros
    Share memory management mechanisms
    Cons
    Compiled code usually lives long
    A tiny bug may give write access to generated code

  25. Impact of JITed Code on Garbage Collection
    More roots
    Pointers might be:
    in HW-registers
    in stack frames of JITed code
    or even hardcoded in the JITed code (as literals)
    A HW-register might even contain a pointer in the middle of
    (de)construction, e.g. pointer arithmetic

  26. Keeping Track of Roots in Registers
    All activations (except the top) are at a call site
    For each call site keep a register map
    The register map indicates which registers are live

  27. Keeping Track of Roots in Registers: Example
    Self compilers inject a register map word after each call:
    each bit represents a register, with the bit set if the register is
    live and has an oop
    The return sequence skips over the map
    A simple stack scan locates all the register maps
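
Decoding such a register map word is a bit test per register. The sketch below assumes bit `i` of the word stands for register `i`. The actual bit assignment in Self is not given in the slides, and the helper names are hypothetical.

```python
def oop_registers(register_map, num_registers=32):
    """Indices of registers whose bit is set, i.e. live and holding an oop."""
    return [r for r in range(num_registers) if (register_map >> r) & 1]


def roots_at_call_site(register_map, saved_registers):
    """Pick the GC roots out of the saved register values of one activation."""
    return [saved_registers[r]
            for r in oop_registers(register_map, len(saved_registers))]


saved = [10, 0xA0, 0xB0, 7, 0xC0, 3]       # saved register values of a frame
print(oop_registers(0b10110, 8))            # [1, 2, 4]
print(roots_at_call_site(0b10110, saved))   # [160, 176, 192]
```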

  28. Keeping Track of Roots in Registers of Top Frame
    No root call site with register map
    Can be suspended at any instruction
    Only allow suspension at GC safe points, with register maps
    Maintain register maps at fixed locations and use abstract
    interpretation to derive the map at current point
    Replay the compiler to produce the register map(s)

  29. GC Safe Points
    Safe points are placed at JITed code entry and back-branches
    At safe points ask the VM whether the thread should suspend
    To stop-the-world all threads need to reach a safe point
    While waiting for others to reach a safe point, threads can scan themselves
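
A rough sketch of safe-point polling, using Python threads to stand in for mutator threads. `SafepointState` and `worker` are hypothetical names. In real JITed code the poll is a few inlined instructions at method entry and loop back-branches rather than a method call.

```python
import threading


class SafepointState:
    def __init__(self, n_threads):
        self.n_threads = n_threads
        self.requested = threading.Event()     # VM asks threads to suspend
        self.resume = threading.Event()        # VM lets threads continue
        self.parked = threading.Semaphore(0)   # released once per parked thread

    def poll(self):
        """Executed by mutator threads at every safe point."""
        if self.requested.is_set():
            self.parked.release()   # report: this thread is at a safe point
            self.resume.wait()      # stay suspended until the VM resumes us

    def stop_the_world(self):
        self.resume.clear()
        self.requested.set()
        for _ in range(self.n_threads):
            self.parked.acquire()   # returns only when all threads are parked

    def resume_the_world(self):
        self.requested.clear()
        self.resume.set()


def worker(state, counters, index, done):
    while not done.is_set():
        counters[index] += 1        # stand-in for the loop body
        state.poll()                # safe point at the loop back-branch
```

Once `stop_the_world()` returns, every worker is blocked inside `poll()`, so the VM can inspect stacks and registers without them changing underneath it.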

  30. Keeping Track of Roots on the Stack
    Keep stack maps for each call site
    Keep the stack maps in the code cache
    Add a pointer to them after the call

  31. Keeping Track of Roots in the Code
    If the compiler hardcodes pointers in the JITed code:
    They might be in a complex form, e.g. assembled piecemeal
    The compiler emits a table identifying the locations of these
    references
    The VM needs to be able to use them as roots
    The VM needs to be able to alter them if needed
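
A sketch of how such a table can be used, under the simplifying assumption that each embedded pointer is one contiguous 8-byte little-endian literal. It does not handle the piecemeal case mentioned above. The function names and the `forwarding` map (old address to new address after the GC moved an object) are hypothetical.

```python
import struct


def read_embedded_pointers(code, table):
    """Pointers hardcoded in the code, to be treated as GC roots."""
    return [struct.unpack_from("<Q", code, offset)[0] for offset in table]


def update_embedded_pointers(code, table, forwarding):
    """Rewrite embedded pointers whose targets were moved by the GC."""
    for offset in table:
        (old,) = struct.unpack_from("<Q", code, offset)
        struct.pack_into("<Q", code, offset, forwarding.get(old, old))


code = bytearray(20)                      # placeholder for generated code
struct.pack_into("<Q", code, 2, 0x1000)   # pointer literal at offset 2
struct.pack_into("<Q", code, 12, 0x2000)  # pointer literal at offset 12
table = [2, 12]                           # emitted by the compiler

update_embedded_pointers(code, table, {0x1000: 0x5000})
print([hex(p) for p in read_embedded_pointers(code, table)])  # ['0x5000', '0x2000']
```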