Reagents: Lock-free programming for the masses

Efficient concurrent programming libraries are essential for taking advantage of fine-grained parallelism on multicore hardware. In this post, I will introduce reagents, a composable, lock-free concurrency library for expressing fine-grained parallel programs on Multicore OCaml. Reagents offer a high-level DSL for experts to specify efficient concurrency libraries, and also allow consumers of those libraries to extend them further without knowing the details of the underlying implementation.

KC Sivaramakrishnan

August 02, 2016

Transcript

  5. Multicore OCaml

    Layer               Concurrency                       Parallelism
    Compiler            Fibers                            Domains
    Language + Stdlib   Effects                           Domain API
    Libraries           Cooperative threading libraries   Reagents: lock-free programming

    • 12M fibers/s on 1 core
    • 30M fibers/s on 4 cores
  7. JVM: java.util.concurrent; .NET: System.Collections.Concurrent

    • Synchronization: reentrant locks, semaphores, R/W locks, reentrant R/W locks, condition variables, countdown latches, cyclic barriers, phasers, exchangers
    • Data structures: queues (nonblocking; blocking (array & list); synchronous; priority, nonblocking; priority, blocking), deques, sets, maps (hash & skiplist)
    • Not composable
  9. lock-free: under contention, at least 1 thread makes progress

    • wait-free: under contention, each thread makes progress
    • obstruction-free: a single thread in isolation makes progress
  11. Compare-and-swap (CAS)

        module CAS : sig
          val cas : 'a ref -> expect:'a -> update:'a -> bool
        end = struct
          (* atomically... *)
          let cas r ~expect ~update =
            if !r = expect then (r := update; true)
            else false
        end

    • Implemented atomically by processors
    • x86: CMPXCHG and friends
    • arm: LDREX, STREX, etc.
    • ppc: lwarx, stwcx, etc.
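The CAS specification above can be tried out with the OCaml standard library. The following is a minimal sketch, assuming OCaml 4.12 or later, where `Atomic.compare_and_set` provides an atomic CAS on `'a Atomic.t` cells. The names `cas`, `counter` and `incr_cas` are illustrative and are not part of the slides or of the reagents library. Note that `Atomic.compare_and_set` compares with physical equality, which is also what the hardware instructions do.

```ocaml
(* Sketch: the slide's [cas] interface, backed by the stdlib Atomic module. *)
let cas (r : 'a Atomic.t) ~(expect : 'a) ~(update : 'a) : bool =
  Atomic.compare_and_set r expect update

let counter = Atomic.make 0

(* Typical CAS retry loop: read the current value, compute the new one,
   and publish it only if no other thread changed the cell in between. *)
let rec incr_cas r =
  let cur = Atomic.get r in
  if cas r ~expect:cur ~update:(cur + 1) then () else incr_cas r

let () =
  incr_cas counter;
  incr_cas counter
```

A CAS whose expected value does not match the current contents returns `false` and leaves the cell untouched, which is what drives the retry loop.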
  12. Treiber stack: push

        module type TREIBER_STACK = sig
          type 'a t
          val push : 'a t -> 'a -> unit
          ...
        end

        module Treiber_stack : TREIBER_STACK = struct
          type 'a t = 'a list ref

          (* Retry until the CAS on the head of the list succeeds. *)
          let rec push s t =
            let cur = !s in
            if CAS.cas s ~expect:cur ~update:(t :: cur) then ()
            else (backoff (); push s t)
        end
  13. Treiber stack: try_pop

        module type TREIBER_STACK = sig
          type 'a t
          val push : 'a t -> 'a -> unit
          val try_pop : 'a t -> 'a option
        end

        module Treiber_stack : TREIBER_STACK = struct
          type 'a t = 'a list ref

          let rec push s t = ...

          (* Returns None on an empty stack instead of blocking. *)
          let rec try_pop s =
            match !s with
            | [] -> None
            | (x :: xs) as cur ->
              if CAS.cas s ~expect:cur ~update:xs then Some x
              else (backoff (); try_pop s)
        end
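For readers who want to run the stack from the two slides above, here is a self-contained sketch that swaps the slide's `CAS` module and `'a list ref` for the standard library's `Atomic` module (OCaml 4.12 or later). The `create` function and the no-op `backoff` are assumptions added for illustration; a real implementation would spin or sleep for a randomized interval in `backoff`.

```ocaml
module Treiber_stack : sig
  type 'a t
  val create : unit -> 'a t
  val push : 'a t -> 'a -> unit
  val try_pop : 'a t -> 'a option
end = struct
  type 'a t = 'a list Atomic.t

  let create () = Atomic.make []

  (* Placeholder (assumption): real code would back off before retrying. *)
  let backoff () = ()

  let rec push s x =
    let cur = Atomic.get s in
    (* Succeeds only if the head is still physically [cur]. *)
    if Atomic.compare_and_set s cur (x :: cur) then ()
    else (backoff (); push s x)

  let rec try_pop s =
    match Atomic.get s with
    | [] -> None
    | x :: xs as cur ->
      if Atomic.compare_and_set s cur xs then Some x
      else (backoff (); try_pop s)
end
```

Because OCaml lists are immutable and the CAS compares the head pointer physically, a successful CAS guarantees that the list observed by `Atomic.get` is the one being replaced.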
  14. Concurrency libraries are indispensable, but hard to build and extend

    The problem:

        let v = Treiber_stack.pop s1 in
        Treiber_stack.push s2 v

    is not atomic.
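To make the problem concrete, the sketch below runs the pop and the push as two separate atomic steps and records what another thread could observe in between. It is single-threaded and uses bare `Atomic` cells as stand-ins for the two stacks; the names `s1`, `s2`, `pop`, `push` and `in_flight` are illustrative assumptions, and each operation makes a single CAS attempt rather than retrying.

```ocaml
(* Two stacks represented directly as atomic lists, for brevity. *)
let s1 : int list Atomic.t = Atomic.make [ 42 ]
let s2 : int list Atomic.t = Atomic.make []

(* Each operation is one CAS attempt, so it is atomic on its own. *)
let pop s =
  match Atomic.get s with
  | [] -> None
  | x :: xs as cur ->
    if Atomic.compare_and_set s cur xs then Some x else None

let push s x =
  let cur = Atomic.get s in
  ignore (Atomic.compare_and_set s cur (x :: cur))

(* Snapshot of both stacks taken between the pop and the push. *)
let in_flight =
  match pop s1 with
  | None -> None
  | Some v ->
    (* At this point the element is in neither stack. *)
    let seen = (Atomic.get s1, Atomic.get s2) in
    push s2 v;
    Some seen
```

The snapshot shows both stacks empty: the element has left `s1` but has not yet reached `s2`, so a concurrent observer could see a state that no atomic transfer would ever produce.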
  15. Scalable concurrent algorithms can be built and extended using abstraction and composition

    Reagents:

        Treiber_stack.pop s1 >>> Treiber_stack.push s2

    is atomic.
  17. PLDI 2012

    • Sequential >>> : software transactional memory
    • Parallel <*> : Join Calculus
    • Selective <+> : Concurrent ML
    • Still lock-free!
  18. Lambda: the ultimate abstraction

        val f : 'a -> 'b
        val g : 'b -> 'c

    [Diagram: f takes 'a to 'b; g takes 'b to 'c]
  19. Lambda abstraction vs. reagent abstraction

    • Lambda abstraction: f takes 'a to 'b
    • Reagent abstraction: R takes 'a to 'b, with type ('a,'b) Reagent.t

        val run : ('a,'b) Reagent.t -> 'a -> 'b
  20. Thread Interaction

        module type Reagents = sig
          type ('a,'b) t

          (* shared memory *)
          module Ref : Ref.S with type ('a,'b) reagent = ('a,'b) t

          (* communication channels *)
          module Channel : Channel.S with type ('a,'b) reagent = ('a,'b) t
          ...
        end
  23. Message passing: swap

        module type Channel = sig
          type ('a,'b) endpoint
          type ('a,'b) reagent
          val mk_chan : unit -> ('a,'b) endpoint * ('b,'a) endpoint
          val swap : ('a,'b) endpoint -> ('a,'b) reagent
        end

    [Diagram: for c : ('a,'b) endpoint, swap c takes 'a and yields 'b; swap on the dual endpoint takes 'b and yields 'a]
  25. Shared state: upd

        type 'a ref
        val upd : 'a ref -> f:('a -> 'b -> ('a * 'c) option) -> ('b, 'c) Reagent.t

    [Diagram: swap (message passing) next to upd f on r : 'a ref, which takes 'b and yields 'c]
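  26. Building blocks

    • swap: message passing
    • upd f: shared state

    [Diagram: reagents R and S, each taking 'a to 'b]
  27. Disjunction

    • swap: message passing
    • upd f: shared state
    • R <+> S: disjunction

    [Diagram: R takes 'a to 'b; S takes 'a to 'c]
  28. Conjunction

    • swap: message passing
    • upd f: shared state
    • R <+> S: disjunction
    • R <*> S takes 'a to ('b * 'c)
  29. Summary of combinators

    • swap: message passing
    • upd f: shared state
    • R <+> S: disjunction
    • R <*> S: conjunction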
  30. Treiber stack with reagents

        module type TREIBER_STACK = sig
          type 'a t
          val create : unit -> 'a t
          val push : 'a t -> ('a, unit) Reagent.t
          val pop : 'a t -> (unit, 'a) Reagent.t
          ...
        end

        module Treiber_stack : TREIBER_STACK = struct
          type 'a t = 'a list Ref.ref

          let create () = Ref.ref []

          (* The pushed value arrives as the reagent's input, not as an extra argument. *)
          let push r = Ref.upd r (fun xs x -> Some (x :: xs, ()))

          let pop r = Ref.upd r (fun l () ->
            match l with
            | [] -> None (* block *)
            | x :: xs -> Some (xs, x))
          ...
        end
  32. Composability

    • Transfer elements atomically: Treiber_stack.pop s1 >>> Treiber_stack.push s2
    • Consume elements atomically: Treiber_stack.pop s1 <*> Treiber_stack.pop s2
    • Consume elements from either: Treiber_stack.pop s1 <+> Treiber_stack.pop s2
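The all-or-nothing behaviour of these three compositions can be illustrated with a toy model. The sketch below is a sequential, single-threaded stand-in and is not how the reagents library is implemented: a reagent is modelled as a function that either fails (`None`, where the real library would block or retry) or returns a result together with a log of pending writes, and `run` applies the log only if the whole composition succeeds. All definitions here (`write`, `t`, `upd`, `run`, `transfer`, and the operator bodies) are assumptions made for illustration; the model also does not check the rule that a reaction may write each location at most once.

```ocaml
(* Toy model: a reagent maps an input and a write log to a result and
   an extended log, or fails with None. *)
type write = unit -> unit
type ('a, 'b) t = 'a -> write list -> ('b * write list) option

(* Read the ref, compute the new state, and defer the write to commit time. *)
let upd (r : 's ref) (f : 's -> 'a -> ('s * 'b) option) : ('a, 'b) t =
 fun a log ->
  match f !r a with
  | None -> None
  | Some (s', b) -> Some (b, (fun () -> r := s') :: log)

(* Sequential composition: feed the output of r1 into r2. *)
let ( >>> ) r1 r2 = fun a log ->
  match r1 a log with
  | None -> None
  | Some (b, log') -> r2 b log'

(* Choice: try r1, fall back to r2 if r1 cannot proceed. *)
let ( <+> ) r1 r2 = fun a log ->
  match r1 a log with
  | Some _ as res -> res
  | None -> r2 a log

(* Conjunction: both must succeed on the same input. *)
let ( <*> ) r1 r2 = fun a log ->
  match r1 a log with
  | None -> None
  | Some (b, log') ->
    (match r2 a log' with
     | None -> None
     | Some (c, log'') -> Some ((b, c), log''))

(* Commit all pending writes only when the whole reagent succeeded. *)
let run r a =
  match r a [] with
  | None -> None
  | Some (b, log) -> List.iter (fun w -> w ()) log; Some b

(* Stack operations in the style of the slides, on plain refs. *)
let push s = upd s (fun xs x -> Some (x :: xs, ()))
let pop s = upd s (fun l () ->
  match l with [] -> None | x :: xs -> Some (xs, x))

let transfer s1 s2 = pop s1 >>> push s2
```

In this model, `run (transfer s1 s2) ()` moves one element in a single commit, while `run (pop s1 <*> pop s2) ()` leaves both stacks untouched when either one is empty.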
  35. Composability

    Transform an arbitrary blocking reagent into a non-blocking reagent:

        val lift : ('a -> 'b option) -> ('a,'b) t
        val constant : 'a -> ('b,'a) t

        let attempt (r : ('a,'b) t) : ('a,'b option) t =
          (r >>> lift (fun x -> Some (Some x))) <+> (constant None)

        let try_pop stack = attempt (pop stack)
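To see why `attempt` never blocks, it helps to evaluate it in a simplified model. The sketch below treats a reagent as a pure function `'a -> 'b option`, where `None` stands for "would block". This is an assumption made for illustration only: the real library's reagents are abstract and involve shared state and retries. The definitions of `lift`, `constant`, `>>>`, `<+>` and `run` follow the signatures on the slide, and `head` is a hypothetical blocking reagent.

```ocaml
(* Pure toy model: None means the reagent cannot proceed (would block). *)
type ('a, 'b) t = 'a -> 'b option

let lift (f : 'a -> 'b option) : ('a, 'b) t = f
let constant (c : 'a) : ('b, 'a) t = fun _ -> Some c

let ( >>> ) (r1 : ('a, 'b) t) (r2 : ('b, 'c) t) : ('a, 'c) t =
 fun a -> match r1 a with None -> None | Some b -> r2 b

let ( <+> ) (r1 : ('a, 'b) t) (r2 : ('a, 'b) t) : ('a, 'b) t =
 fun a -> match r1 a with Some _ as v -> v | None -> r2 a

let run (r : ('a, 'b) t) (a : 'a) : 'b option = r a

(* As on the slide: wrap a successful result in Some, otherwise fall
   through to the always-ready constant None. *)
let attempt (r : ('a, 'b) t) : ('a, 'b option) t =
  (r >>> lift (fun x -> Some (Some x))) <+> constant None

(* Hypothetical blocking reagent: cannot proceed on an empty list. *)
let head : ('a list, 'a) t = function [] -> None | x :: _ -> Some x
```

In this model `run head []` cannot proceed, whereas `run (attempt head) []` always produces an answer: the left branch of `<+>` fails and the `constant None` branch takes over.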
  38. Dining philosophers

    • Philosophers alternate between thinking and eating
    • A philosopher can only eat after obtaining both forks
    • No philosopher starves

        type fork =
          {drop : (unit,unit) endpoint;
           take : (unit,unit) endpoint}

        let mk_fork () =
          let drop, take = mk_chan () in
          {drop; take}

        let drop f = swap f.drop
        let take f = swap f.take

        let eat l_fork r_fork =
          (* Acquire both forks in one atomic step. *)
          run (take l_fork <*> take r_fork) ();
          (* ...
           * eat
           * ... *)
          spawn @@ run (drop l_fork);
          spawn @@ run (drop r_fork)
  39. Status: https://github.com/ocamllabs/reagents

    • Synchronization: locks, reentrant locks, semaphores, R/W locks, reentrant R/W locks, condition variables, countdown latches, cyclic barriers, phasers, exchangers
    • Data structures: queues (nonblocking; blocking (array & list); synchronous; priority, nonblocking; priority, blocking), stacks (Treiber; elimination backoff), counters, deques, sets, maps (hash & skiplist)
  40. STM vs Reagents

    • STM is more ambitious: atomic { … }. Reagents are conservative.
    • Reagents don't allow multiple writes to the same memory location.
    • Reagents are lock-free. STMs are typically obstruction-free.