
Managed Runtime Systems: Lecture 09 - Concurrency

March 26, 2018




  1. Managed Runtime Systems
     Lecture 09: Concurrency
     Foivos Zakkak
     https://foivos.zakkak.net
     Except where otherwise noted, this presentation is licensed under the Creative Commons Attribution 4.0 International License. Third-party marks and brands are the property of their respective holders.
  2. Concurrency
     Processors are multi-core; take advantage of it.
     Multiple single-threaded VMs or one multi-threaded VM?
     Managed Runtime Systems 1 of 14 https://foivos.zakkak.net
  3. Multiple Single-threaded VMs
     Pros:
     ▪ Simplicity
     ▪ Less code
     ▪ No data races
     Cons:
     ▪ Duplication of overheads (e.g. class loading, JIT)
     ▪ Slower communication (data transfers)
     ▪ Slower synchronization
  4. One Multi-threaded VM
     Pros:
     ▪ Fast communication
     ▪ Better resource utilization
     ▪ Opportunities for better scheduling and memory locality
     Cons:
     ▪ Increased complexity
     ▪ Incompatibility with well-known optimizations
     ▪ Increased contention due to shared runtime data structures
  5. State of the Art
     Multiple multi-threaded VMs.
     VMs still don't scale well (no smart scheduling, GC on large heaps, profiling, etc.).
     VMs still depend on shared memory (they are very slow otherwise).
  6. Design Decisions
     ▪ Atomicity (necessary for efficient concurrent algorithms)
     ▪ Locking (necessary to provide mutual exclusion)
     ▪ Scheduling (fundamental, especially for fine-grained parallelism)
     ▪ Memory model (defines expected behavior; should be intuitive)
     ▪ Explicit concurrency (threads, monitors, actors, etc.)
     ▪ Implicit concurrency (auto-parallelization, SIMD HW acceleration)
  7. Atomicity
     Provided through intrinsics:
     ▪ The interpreter invokes native code that translates to atomic instructions.
     ▪ JIT compilers generate atomic instructions directly.
     ▪ When atomic instructions are unavailable, locks are used instead.
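To make the intrinsics concrete, here is a small Java sketch. `AtomicInteger.compareAndSet` is one such intrinsic: HotSpot's JIT compiles it to a hardware compare-and-swap (e.g. `LOCK CMPXCHG` on x86) rather than an ordinary method call.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        // compareAndSet is a JIT intrinsic: the compiler emits a single
        // atomic CAS instruction instead of calling into the runtime.
        AtomicInteger counter = new AtomicInteger(0);

        boolean first = counter.compareAndSet(0, 1);  // succeeds: 0 -> 1
        boolean second = counter.compareAndSet(0, 2); // fails: value is now 1

        System.out.println(first + " " + second + " " + counter.get());
        // prints "true false 1"
    }
}
```

In the interpreter, the same operation goes through a native stub that executes the equivalent atomic instruction.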
  8. Locking
     Avoid it if possible (through careful code writing and code analysis).
     Embedding locks in objects is inefficient (most objects are never used for locking).
     Empirically:
     ▪ Most locks are not contended
     ▪ Once a thread acquires a lock, it usually takes it again later
  9. Thin Locks
     3-state locking (unlocked, thin, fat).
     A pointer to the lock record and the lock state are embedded in the object's header.
     Thin locking sets this pointer to a local record (in the locking thread's stack frame) using CAS.
     No synchronization is needed as long as the lock remains uncontended.
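The thin-lock fast path can be sketched as follows. The class and method names here are invented for illustration; a real VM CASes a bit pattern in the object header itself, not a Java reference.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch of a thin-lock fast path (invented names, not a
// real VM API). The object's header word is modeled as an
// AtomicReference: null means "unlocked", a non-null value means
// "thin-locked" by the thread owning the referenced record.
public class ThinLock {
    static final class LockRecord {
        // In a real VM this record lives in the locking thread's stack frame.
        final Thread owner = Thread.currentThread();
    }

    private final AtomicReference<LockRecord> header = new AtomicReference<>(null);

    /** Uncontended fast path: a single CAS, no OS-level synchronization. */
    boolean tryThinLock(LockRecord rec) {
        // unlocked (null) -> thin-locked (points to our stack-local record)
        return header.compareAndSet(null, rec);
    }

    void thinUnlock(LockRecord rec) {
        // release only if we still hold the thin lock
        header.compareAndSet(rec, null);
    }
}
```

If the CAS fails, the lock is held by another thread and the slow path (inflation to a fat lock, next slide) takes over.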
  10. Thin Locks - Contended
     If the lock is owned by another thread, create a fat lock:
     1. Create an OS mutex and condition variable.
     2. Set the header pointer to a fat-lock record (containing pointers to the mutex and condition).
     3. Wait on the fat lock to be notified.
     The owner thread will observe the fat lock on release and notify the waiters.
     Once a lock has been contended, it is never thin-locked again.
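The inflation steps above can be sketched in Java, using `ReentrantLock` and `Condition` as stand-ins for the OS mutex and condition variable. The `FatLock` name and layout are illustrative, not any real VM's data structure.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative fat-lock record (invented names): once a thin lock is
// contended, the VM inflates it to a structure like this one.
final class FatLock {
    final ReentrantLock mutex = new ReentrantLock(); // step 1: OS mutex
    final Condition released = mutex.newCondition(); // step 1: condition
    volatile Thread owner;                           // current lock holder

    void contendedLock() throws InterruptedException {
        mutex.lock();
        try {
            while (owner != null)   // step 3: wait until notified
                released.await();
            owner = Thread.currentThread();
        } finally {
            mutex.unlock();
        }
    }

    void unlock() {
        mutex.lock();
        try {
            owner = null;
            released.signalAll();   // owner notifies the waiters on release
        } finally {
            mutex.unlock();
        }
    }
}
```

Step 2 (publishing the fat-lock pointer in the object header) is omitted here; it would be another CAS on the header word.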
  11. Scheduling
     Many VMs rely on OS threads for both VM and application threads, so scheduling is delegated to the OS. BUT:
     ▪ OS threads are preemptible (what about safepoints?)
     ▪ Runtime threads may cause big delays when preempted
     ▪ The VM knows more about its threads than the OS does
  12. Explicit Concurrency
     Different models offer different benefits:
     ▪ Actors are based on message passing, so they are a better fit for systems without shared memory.
     ▪ Threads and locks give more flexibility but are hard to get right.
     ▪ There are more models, e.g. task-based, map-reduce, etc.
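A minimal message-passing sketch (invented names, not a real actor library) shows why actors need no shared-memory locking: state is confined to the actor and mutated only in response to messages drawn from its mailbox.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy actor: the only way to affect its state is to send it a message.
public class CounterActor {
    private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private int count = 0; // confined to the actor; never shared directly

    public void send(String msg) { mailbox.add(msg); }

    /** Process one message; run this in a loop on the actor's own thread. */
    public int processOne() throws InterruptedException {
        String msg = mailbox.take(); // blocks until a message arrives
        if (msg.equals("inc")) count++;
        return count;
    }
}
```

Because all mutation happens on the actor's own thread, the same design works when "mailbox" is a network channel instead of a queue, which is why actors suit systems without shared memory.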
  13. Implicit Concurrency - Auto-parallelization
     Limited by:
     1. hardware support (HW acceleration)
     2. code analysis
     3. register allocation algorithms
     The use of certain patterns/libraries may help (e.g. vectors, streams, etc.).
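As an example of such a pattern, Java's Stream API lets the programmer declare that iterations are independent, so the runtime can split the work across cores (by default on the common ForkJoinPool) without any code analysis.

```java
import java.util.stream.IntStream;

public class ParallelSum {
    public static void main(String[] args) {
        // The stream pipeline states that iterations are independent,
        // so the runtime is free to partition the range across cores.
        long sum = IntStream.rangeClosed(1, 1_000_000)
                            .parallel()
                            .asLongStream() // widen before summing to avoid overflow
                            .sum();
        System.out.println(sum); // prints 500000500000
    }
}
```

A plain `for` loop computing the same sum would force the compiler to prove independence itself, which is exactly the code-analysis limit the slide mentions.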