Managed Runtime Systems: Lecture 09 - Concurrency

zakkak
March 26, 2018


  1. Managed Runtime Systems
    Lecture 09: Concurrency
    Foivos Zakkak
    https://foivos.zakkak.net
    Except where otherwise noted, this presentation is licensed under the
    Creative Commons Attribution 4.0 International License.
    Third party marks and brands are the property of their respective holders.


  2. Concurrency
    Processors are multi-core
    Take advantage of it
    Multiple single-threaded VMs or one multi-threaded VM?

  3. Multiple Single-threaded VMs
    Pros

    Simplicity

    Less code

    No data-races
    Cons

    Duplication of overheads
    (e.g. Class Loading, JIT)

    Slower communication
    (data-transfers)

    Slower synchronization

  4. One Multi-threaded VM
    Pros

    Fast communication

    Better resource utilization

    Opportunities for better
    scheduling/memory-locality
    Cons

    Increased complexity

    Incompatibility with
    well-known optimizations

    Increased contention due to
    shared runtime data structures

  5. State-of-the-art
    Multiple multi-threaded VMs
    VMs still don’t scale well
    (no smart scheduling, GC on large heaps, profiling, etc.)
    VMs still depend on shared memory (they are very slow otherwise)

  6. Design Decisions
    Atomicity (necessary for efficient concurrent algorithms)
    Locking (necessary to provide mutual exclusion)
    Scheduling (fundamental, especially for fine-grained parallelism)
    Memory model (defines expected behavior; should be intuitive)
    Explicit concurrency (Threads, Monitors, Actors, etc.)
    Implicit concurrency (Auto-parallelization, SIMD HW-acceleration)

  7. Atomicity
    Provided through intrinsics
    The interpreter invokes native code that executes atomic
    instructions
    JIT compilers generate atomic instructions inline (see the sketch below)
    When atomic instructions are unavailable, locks are used instead
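    A minimal sketch (not from the slides) of how this surfaces at the language
    level: AtomicInteger.compareAndSet is intrinsified by HotSpot-style JITs into
    a hardware compare-and-swap (e.g. lock cmpxchg on x86), so the retry loop
    below needs no locks.

      import java.util.concurrent.atomic.AtomicInteger;

      // Sketch: compareAndSet is treated as an intrinsic and typically compiles
      // down to a single hardware CAS instruction where one is available.
      public class AtomicCounter {
          private final AtomicInteger value = new AtomicInteger();

          // Classic CAS retry loop: read, compute, attempt to publish atomically.
          public int increment() {
              int current;
              do {
                  current = value.get();
              } while (!value.compareAndSet(current, current + 1));
              return current + 1;
          }

          public static void main(String[] args) throws InterruptedException {
              AtomicCounter c = new AtomicCounter();
              Runnable work = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
              Thread t1 = new Thread(work), t2 = new Thread(work);
              t1.start(); t2.start();
              t1.join(); t2.join();
              System.out.println(c.value.get()); // 200000, without any locking
          }
      }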

  8. Locking
    Avoid it if possible (through careful code writing and code analysis)
    Embedding locks in objects is inefficient
    (most objects are not used for locking)
    Empirically:

    Most locks are not contended

    Once a thread acquires a lock, it usually acquires it again later
    (see the sketch below)
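    As a minimal illustration (not from the slides) of why embedding a full lock
    in every object would be wasteful: any Java object may be used as a monitor,
    but most objects, like the array below, never are, so VMs keep only a few
    bits of lock state in the header and materialize real lock data lazily.

      // Illustration: `lock` is used as a monitor, `data` never is.
      public class PerObjectLocking {
          private final Object lock = new Object();  // used as a monitor
          private final int[] data = new int[1024];  // never synchronized on
          private int next = 0;

          public void add(int v) {
              synchronized (lock) {          // lock state for `lock` is kept in
                  if (next < data.length) {  // its header and expanded lazily
                      data[next++] = v;
                  }
              }
          }
      }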

  9. Thin Locks
    3-state locking (unlocked, thin, fat)
    Embed a pointer to the lock record, plus the lock state, in the object’s header
    Thin locking installs a pointer to a lock record in the acquiring thread’s
    stack frame using CAS (see the sketch below)
    No OS-level synchronization as long as the lock remains uncontended
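    A hypothetical sketch of the uncontended path (class and field names are
    invented; real VMs pack this state into the object header word rather than
    using a separate Java object):

      import java.util.concurrent.atomic.AtomicReference;

      // Hypothetical model of thin locking, for illustration only.
      class ThinLockSketch {
          // Stand-in for the header word: null = unlocked,
          // a LockRecord = thin-locked by that record's owner.
          final AtomicReference<LockRecord> header = new AtomicReference<>();

          static final class LockRecord {   // conceptually lives in the owner's stack frame
              final Thread owner = Thread.currentThread();
          }

          // Uncontended acquire: a single CAS, no OS involvement.
          boolean tryThinLock(LockRecord record) {
              return header.compareAndSet(null, record);
          }

          // Uncontended release: clear the header word again.
          void thinUnlock(LockRecord record) {
              header.compareAndSet(record, null);
          }
      }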

  10. Thin Locks - Contended
    If the lock is owned by another thread, create a fat lock:
    1. Create an OS mutex and condition variable
    2. Set the header pointer to the fat lock record
    (containing pointers to the mutex and condition variable)
    3. Wait on the fat lock to be notified
    The owner thread will observe the fat lock on release and notify the
    waiters (see the sketch below)
    Once a lock has been contended, it is never thin-locked again
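    Complementing the thin-lock sketch above, a hypothetical sketch of the fat
    lock that a contending thread creates; ReentrantLock and Condition stand in
    for the OS mutex and condition variable, and the thin-to-fat transition
    races are not modelled here:

      import java.util.concurrent.locks.Condition;
      import java.util.concurrent.locks.ReentrantLock;

      // Hypothetical fat-lock record, for illustration only.
      final class FatLockSketch {
          private final ReentrantLock mutex = new ReentrantLock(); // "OS mutex"
          private final Condition released = mutex.newCondition(); // "OS condition"
          private Thread owner;                                    // guarded by mutex

          // Contending thread: block until the lock is free, then take ownership.
          void acquire() throws InterruptedException {
              mutex.lock();
              try {
                  while (owner != null) released.await();  // wait to be notified
                  owner = Thread.currentThread();
              } finally {
                  mutex.unlock();
              }
          }

          // Owner thread: on release, notify any waiters blocked on the fat lock.
          void release() {
              mutex.lock();
              try {
                  owner = null;
                  released.signalAll();
              } finally {
                  mutex.unlock();
              }
          }
      }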

  11. Biased Locks
    Bias the lock towards the first thread that acquires it, so repeated
    acquisitions by that thread need no atomic operations; the bias is revoked
    when another thread tries to acquire the lock
    See https://dl.acm.org/citation.cfm?id=1167496 for more

  12. Scheduling
    Many VMs rely on OS threads both for VM and application threads
    Scheduling is delegated to the OS
    BUT:

    OS threads are preemptible (what happens to safepoints?)

    Runtime threads may cause long delays when preempted

    The VM knows more about its threads than the OS does

  13. Memory Model
    See
    https://speakerdeck.com/zakkak/the-java-memory-model

  14. Explicit Concurrency
    Different models offer different benefits
    Actors are message-passing based, making them a better fit for systems
    without shared memory
    Threads and locks give more flexibility but are hard to get right
    There are more models, e.g., task-based, map-reduce, etc.
    (see the sketch below)
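    A small illustration (not from the slides) of the difference in flavor:
    the same computation expressed with an explicit thread plus a lock, and
    with a task submitted to an executor that owns the scheduling.

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.Future;

      public class ExplicitModels {
          public static void main(String[] args) throws Exception {
              // Threads and locks: flexible, but all coordination is manual.
              int[] result = new int[1];
              Object lock = new Object();
              Thread worker = new Thread(() -> {
                  int sum = 0;
                  for (int i = 1; i <= 100; i++) sum += i;
                  synchronized (lock) { result[0] = sum; }
              });
              worker.start();
              worker.join();
              synchronized (lock) { System.out.println(result[0]); } // 5050

              // Task-based: submit work and let the pool schedule it.
              ExecutorService pool = Executors.newFixedThreadPool(2);
              Future<Integer> future = pool.submit(() -> {
                  int sum = 0;
                  for (int i = 1; i <= 100; i++) sum += i;
                  return sum;
              });
              System.out.println(future.get());                      // 5050
              pool.shutdown();
          }
      }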

  15. Implicit Concurrency - Auto parallelization
    Limited by:
    1. hardware support (HW-acceleration)
    2. code analysis
    3. register allocation algorithms
    The use of certain patterns/libraries may help the runtime parallelize
    (e.g. vectors, streams, etc.; see the sketch below)
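    A minimal illustration (not from the slides): the hand-written loop forces
    the compiler to prove the iterations independent, while the stream pipeline
    states the data-parallelism explicitly and lets the library and runtime
    split the work.

      import java.util.stream.LongStream;

      public class ImplicitParallelism {
          public static void main(String[] args) {
              long n = 1_000_000;

              // Plain loop: auto-parallelization must analyse the loop-carried sum.
              long sum = 0;
              for (long i = 1; i <= n; i++) sum += i * i;

              // Stream pipeline: the pattern itself declares the work data-parallel.
              long parallelSum = LongStream.rangeClosed(1, n)
                                           .parallel()
                                           .map(i -> i * i)
                                           .sum();

              System.out.println(sum == parallelSum); // true
          }
      }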