
Concurrency in Haskell

Anupam
August 06, 2023

An introduction to Concurrency in Haskell. Talk presented at the FPIndia+ElixirDelhi meetup on 5th August 2023.

Transcript

  1. Concurrency vs Parallelism

    • Suspend execution and switch context = Multitasking
    • Cooperative context switching = Concurrency
    • Automatic time slicing = Preemptive multitasking (still concurrency, even on one processor)
    • Multiple processors running multiple threads = Parallelism
  2. Haskell

    • Concurrent Execution
    • Haskell has lightweight threads that are very efficient
    • Haskell IO multiplexes all ongoing IO requests using efficient operating system primitives such as epoll on Linux. Thus applications with lots of lightweight threads, all doing IO simultaneously, perform very well
  3. Haskell

    • Shared Data, Many Producers and Consumers
    • Haskell provides powerful primitives which make it easy to precisely control and synchronise access to shared data
  4. Scope of this talk

    • Only Explicit Concurrency
    • Discussion of Haskell’s Concurrency API
    • No special syntax, only library functions
    • Rationale behind the API design
    • Solving problems from first principles
  5. Haskell is a Pure Language

    • All expressions are pure
    • Concurrency does not make sense for pure expressions (parallelism still does)
    • But you can still perform effects with the IO Monad, e.g. putStrLn :: String -> IO ()
    • Effectful values have to be sequenced to have an effect
  6. Sequencing Effects

    • putStrLn "Hello" is a pure value
    • printMyValue = putStrLn "Hello" does not actually print anything
    • The *only* way to run effects is by declaring *one* value to be the main effect -
    • main = printMyValue prints "Hello"!
  7. Multiple Effects?

    • Haskell provides high level functions to sequence effectful values together, e.g. monadic sequencing.

        (>>) :: IO a -> IO b -> IO b
  8. Multiple Effects?

    • Haskell provides high level functions to compose effectful values together, e.g. monadic sequencing and bind.

        (>>)  :: IO a -> IO b -> IO b
        (>>=) :: IO a -> (a -> IO b) -> IO b
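A minimal sketch of the two operators in use. The names `greet`, `twice` and `countLog` are illustrative and not from the talk; the point is only that defining an IO value runs nothing until it is sequenced into `main`.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- An IO value is only a description; defining it runs nothing.
greet :: IO ()
greet = putStrLn "Hello"

-- (>>) runs the left action, discards its result, then runs the right one.
twice :: IO ()
twice = greet >> greet

-- (>>=) passes the result of the left action to the function on the right.
countLog :: IO Int
countLog = do
  ref <- newIORef (0 :: Int)
  let bump = modifyIORef ref (+ 1)
  bump >> bump >> bump                      -- three sequenced effects
  readIORef ref >>= \n -> pure (n * 10)

main :: IO ()
main = do
  twice               -- prints "Hello" twice
  countLog >>= print  -- prints 30
```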

  9. Combinators all the way

    • Combinator: a function that can combine multiple values into a single value
    • Bind is a combinator. Haskell is big on combinators!
    • Makes sense that the concurrency APIs are also combinator-heavy
    • No special syntax needed, just function calls
    • Combinators compose
    • So concurrent code feels like using Lego bricks to build larger programs
  10. Haskell’s philosophy

    • Like Lego
    • Haskell provides a small set of orthogonal general purpose APIs for concurrency
    • That you can combine in multiple ways to create the desired functionality.
    • That means you can build your own models of concurrency, including -
      • Shared memory
      • Transactions
      • Actor models
      • Etc.
  11. “Easiest” concurrency model

    • Inversion of control a.k.a. Callbacks

        -- This call returns immediately
        getFoo params \fooResult ->

          -- This code will run only when the result is available
          use fooResult

        … Meanwhile do something else …

  12. forkIO

    • forkIO just runs some IO action in a separate thread

        forkIO :: IO () -> IO ThreadId

    • Something like getFoo could be implemented using forkIO -

        getFoo params handler = forkIO do
          fooResult <- makeAPICall params
          handler fooResult

    • Note the monadic sequencing to invoke the handler
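The callback style above can be made runnable with a few assumptions: `makeAPICall` is a hypothetical stand-in that just sleeps and doubles its argument, and an MVar (covered later in the talk) is used only so that `main` can wait for the callback before exiting.

```haskell
import Control.Concurrent (ThreadId, forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-in for a slow network call.
makeAPICall :: Int -> IO Int
makeAPICall n = do
  threadDelay 10000   -- 10 ms
  pure (n * 2)

-- Returns immediately; the handler runs later on the forked thread.
getFoo :: Int -> (Int -> IO ()) -> IO ThreadId
getFoo params handler = forkIO $ do
  fooResult <- makeAPICall params
  handler fooResult

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- getFoo 21 (putMVar done)
  putStrLn "meanwhile, doing something else"
  result <- takeMVar done   -- block until the callback has fired
  print result              -- prints 42
```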
  13. Parametricity

    • Inlining doesn’t change the meaning of a program. So this program -

        getFoo params \fooResult ->
          use fooResult
        meanwhile do something else

    • Is equivalent to -

        forkIO do
          fooResult <- makeAPICall params
          use fooResult
        meanwhile do something else
  14. Parallel calls

    • Running getFoo and getBar together
    • Use

        concurrently :: IO a -> IO b -> IO (a,b)

    • Usage

        (fooResult, barResult) <- concurrently getFoo getBar

  15. What happens if one fails?

        (fooResult, barResult) <- concurrently getFoo getBar

    • If either action throws an exception at any time, then the other action is cancelled, and the exception is re-thrown by concurrently.
  16. What if we want to do something different?

    • Run getFoo and getBar together, cancel the other as soon as one returns
    • Use

        race :: IO a -> IO b -> IO (Either a b)

    • Usage

        res <- race getFoo getBar
        case res of
          Left fooResult -> do something with Foo
          Right barResult -> do something with Bar

    Sum types FTW!
  17. More than two actions

    • Use Concurrently with the Applicative instance

        newtype Concurrently a = Concurrently (IO a)

        (<$>) :: (a -> b) -> Concurrently a -> Concurrently b
        (<*>) :: Concurrently (a -> b) -> Concurrently a -> Concurrently b

        action :: Concurrently (Foo, Bar, Baz)
        action = mk3Tuple <$> getFoo <*> getBar <*> getBaz

        -- Note: separated for clarity
        mk3Tuple = (,,)
  18. How’d that work again?

    • Let the types guide you -

        mk3Tuple :: a -> (b -> (c -> (a,b,c)))
        (<$>) :: (u -> v) -> Concurrently u -> Concurrently v

        (mk3Tuple <$>) :: Concurrently a -> Concurrently (b -> (c -> (a,b,c)))
        getFoo :: Concurrently Foo

        (mk3Tuple <$> getFoo) :: Concurrently (b -> (c -> (Foo,b,c)))
        -- (<*>) specialised to a Bar argument:
        (<*>) :: Concurrently (Bar -> b) -> Concurrently Bar -> Concurrently b

        (mk3Tuple <$> getFoo <*> getBar) :: Concurrently (c -> (Foo,Bar,c))
        (mk3Tuple <$> getFoo <*> getBar <*> getBaz) :: Concurrently (Foo,Bar,Baz)
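To see how an Applicative instance makes `(,,) <$> a <*> b <*> c` run its arguments concurrently, here is a minimal sketch with a hand-rolled newtype. `Conc`, `runConc` and `slow` are hypothetical names, and this is not the implementation from the async package: it ignores exceptions and cancellation and uses only forkIO and MVar from base.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical, simplified stand-in for Concurrently.
newtype Conc a = Conc { runConc :: IO a }

instance Functor Conc where
  fmap f (Conc io) = Conc (fmap f io)

instance Applicative Conc where
  pure = Conc . pure
  -- Fork both sides, then wait for both results and apply.
  Conc ioF <*> Conc ioA = Conc $ do
    fVar <- newEmptyMVar
    aVar <- newEmptyMVar
    _ <- forkIO (ioF >>= putMVar fVar)
    _ <- forkIO (ioA >>= putMVar aVar)
    f <- takeMVar fVar
    a <- takeMVar aVar
    pure (f a)

-- An action that sleeps for 50 ms before returning its argument.
slow :: a -> Conc a
slow x = Conc (threadDelay 50000 >> pure x)

main :: IO ()
main = do
  triple <- runConc ((,,) <$> slow (1 :: Int) <*> slow 'b' <*> slow True)
  print triple   -- prints (1,'b',True)
```

The three delays overlap, so the whole expression takes roughly one delay rather than three.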
  19. Racing multiple actions

    • What if we want to run multiple actions, but return the result from the first one to complete?
    • Use the Alternative instance with Concurrently!

        (<|>) :: Concurrently a -> Concurrently a -> Concurrently a

        action = getFoo <|> getBar <|> getBaz

    Error: can’t match Foo with Bar with Baz
  20. Racing multiple actions

        data OneOfThree a b c = First a | Second b | Third c

        (<|>) :: Concurrently a -> Concurrently a -> Concurrently a
        (<$>) :: (a -> b) -> Concurrently a -> Concurrently b

        First <$> getFoo :: Concurrently (OneOfThree Foo b c)
        Second <$> getBar :: Concurrently (OneOfThree a Bar c)
        Third <$> getBaz :: Concurrently (OneOfThree a b Baz)

        action :: Concurrently (OneOfThree Foo Bar Baz)
        action = (First <$> getFoo) <|> (Second <$> getBar) <|> (Third <$> getBaz)
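  21. Timeout function

    threadDelay can be used to wait for a given number of microseconds

        threadDelay :: Int -> IO ()

        timeout :: Int -> IO a -> IO (Maybe a)
        -- race returns an Either, so collapse both sides into one Maybe
        timeout micros io = either id id <$> race (Nothing <$ threadDelay micros) (Just <$> io)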
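A self-contained sketch of the timeout idea that needs only base. `simpleRace` and `simpleTimeout` are hypothetical names: `simpleRace` is a deliberately simplified race built from forkIO, an MVar and killThread, and unlike the library function it does not propagate exceptions.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Simplified race: first result wins, then both threads are killed.
-- Assumption: neither action throws; no exception handling is done here.
simpleRace :: IO a -> IO b -> IO (Either a b)
simpleRace left right = do
  result <- newEmptyMVar
  t1 <- forkIO (left >>= putMVar result . Left)
  t2 <- forkIO (right >>= putMVar result . Right)
  winner <- takeMVar result
  killThread t1   -- killing an already finished thread is a no-op
  killThread t2
  pure winner

-- The delay is in microseconds, as with threadDelay.
simpleTimeout :: Int -> IO a -> IO (Maybe a)
simpleTimeout micros action =
  either (const Nothing) Just <$> simpleRace (threadDelay micros) action

main :: IO ()
main = do
  fast <- simpleTimeout 500000 (pure "done")
  slow <- simpleTimeout 10000 (threadDelay 500000 >> pure "done")
  print fast   -- Just "done"
  print slow   -- Nothing
```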
  22. Manipulating a data structure concurrently

    • Assume a function -

        getURL :: String -> Concurrently String

    • We need to map over a list of URLs, fetching their contents concurrently, and returning all of them in a list

        mapConcurrently :: (a -> Concurrently b) -> [a] -> Concurrently [b]

    • Haskellers would recognise that mapConcurrently = traverse
    • Traverse requires Concurrently to be Applicative, which as we saw it already is

        traverse getURL ["foo.com", "bar.com", "baz.com"]
  23. Tree of Actions

    • This is what we had in the previous slide -

        traverse getURL ["foo.com", "bar.com", "baz.com"]

    • This launches 3 threads, and an exception in any of them will kill all the other threads as well.
    • Using race or concurrently, we are building a tree of threads, where all the threads are always cleaned up.
    • For now, we will not discuss exceptions further in this presentation
  24. Cases for which there’s no prebuilt abstraction

    • That actually happens less than you think
    • Note that I said “prebuilt”, not “inbuilt”
    • All of these functions are written, in user code, upon lower level primitives
    • The implementations are very simple. Usually you can count the number of lines on one hand.
    • You can write your own abstractions very easily
    • But most of the time, you will just reach into your standard Haskell toolbox
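As an illustration of how little code such an abstraction needs, here is a rough sketch of a concurrently-like function in user code. `simpleConcurrently` is a hypothetical name and a simplification: it uses MVars (introduced a few slides later) and omits the exception propagation and sibling cancellation that the real function provides.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Simplified: assumes neither action throws an exception.
simpleConcurrently :: IO a -> IO b -> IO (a, b)
simpleConcurrently left right = do
  leftVar  <- newEmptyMVar
  rightVar <- newEmptyMVar
  _ <- forkIO (left  >>= putMVar leftVar)
  _ <- forkIO (right >>= putMVar rightVar)
  a <- takeMVar leftVar    -- wait for the left result
  b <- takeMVar rightVar   -- then for the right one
  pure (a, b)

main :: IO ()
main = do
  pair <- simpleConcurrently (threadDelay 20000 >> pure (1 :: Int))
                             (threadDelay 10000 >> pure "two")
  print pair   -- prints (1,"two")
```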
  25. A complicated flow

    • Race two actions getInt and getBool concurrently.
    • If getBool finishes first, and is true, then answer with 100, cancelling getInt
    • Else wait for and return the result of getInt
    • Since we need to inspect the result of one action before we decide whether to cancel the other action, this is not doable with race and concurrently
  26. Multiple sequential calls

        getFoo params \fooResult -> do
          getBar fooResult \barResult -> do
            getBaz fooResult barResult \bazResult -> do
              do other things…

    Callback hell!
  27. Do notation

        do fooResult <- getFoo params
           barResult <- getBar fooResult
           bazResult <- getBaz fooResult barResult
           do other things…

    Much better, but how?
  28. Shared data between threads

        MVar a

        newEmptyMVar :: IO (MVar a)
        putMVar :: MVar a -> a -> IO ()
        takeMVar :: MVar a -> IO a

    putMVar and takeMVar are blocking operations (putMVar waits while the MVar is full, takeMVar while it is empty)
    Hence they are also synchronised
  29. Also remember ThreadId

        forkIO :: IO () -> IO ThreadId

    You can control the thread with the ThreadId
    For example, cancel the thread

        killThread :: ThreadId -> IO ()
  30. A complicated flow

        complicatedFlow = do
          v <- newEmptyMVar
          -- Fork getInt first so that its ThreadId can be handed to the Bool handler
          threadInt <- forkIO (getInt >>= intHandler v)
          _ <- forkIO (getBool >>= boolHandler v threadInt)
          takeMVar v

          where
            boolHandler v threadInt b = when b do
              killThread threadInt
              putMVar v 100

            intHandler v i = putMVar v i
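A runnable variant of this flow, under a few assumptions: getInt and getBool are passed in as parameters so the function can be exercised with stub actions, `killThread` from Control.Concurrent is used for cancellation, and the stubs in `main` simply sleep before returning.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (when)

complicatedFlow :: IO Int -> IO Bool -> IO Int
complicatedFlow getInt getBool = do
  v <- newEmptyMVar
  threadInt <- forkIO (getInt >>= putMVar v)
  _ <- forkIO $ do
    b <- getBool
    -- Only a True result short-circuits; False leaves getInt running.
    when b $ do
      killThread threadInt
      putMVar v 100
  takeMVar v

main :: IO ()
main = do
  let slowInt = threadDelay 200000 >> pure 7
  r1 <- complicatedFlow slowInt (threadDelay 10000 >> pure True)
  r2 <- complicatedFlow slowInt (threadDelay 10000 >> pure False)
  print (r1, r2)   -- prints (100,7)
```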
  31. Async - Await (but better)

        data Promise a = Promise (MVar a) ThreadId

        async :: IO a -> IO (Promise a)
        async action = do
          var <- newEmptyMVar
          tid <- forkIO (action >>= putMVar var)
          return (Promise var tid)

        await :: Promise a -> IO a
        await (Promise var _) = takeMVar var

        cancel :: Promise a -> IO ()
        cancel (Promise _ tid) = killThread tid
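A minimal runnable sketch of this Promise API. It deviates from the slide in one labelled way: `await` uses `readMVar` instead of `takeMVar`, an assumption made here so that a promise can be awaited more than once without blocking. Cancellation uses `killThread` from Control.Concurrent.

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

data Promise a = Promise (MVar a) ThreadId

async :: IO a -> IO (Promise a)
async action = do
  var <- newEmptyMVar
  tid <- forkIO (action >>= putMVar var)
  pure (Promise var tid)

-- readMVar leaves the value in place, so repeated awaits all succeed.
await :: Promise a -> IO a
await (Promise var _) = readMVar var

cancel :: Promise a -> IO ()
cancel (Promise _ tid) = killThread tid

main :: IO ()
main = do
  p1 <- async (threadDelay 20000 >> pure (2 :: Int))
  p2 <- async (threadDelay 10000 >> pure (3 :: Int))
  unwanted <- async (threadDelay 10000000 >> pure (0 :: Int))
  cancel unwanted          -- the 10 second action is never awaited
  a <- await p1
  b <- await p2
  print (a * b)            -- prints 6
```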
  32. A complicated flow

        complicatedFlow = do
          boolPromise <- async getBool
          intPromise <- async getInt
          -- race already tags its results with Left and Right
          v <- race
            (await boolPromise)
            (await intPromise)
          case v of
            Left True -> do
              cancel intPromise
              pure 100
            Left False -> await intPromise
            Right i -> pure i
  33. Monad instance

        instance Monad Promise where
          return a = Promise (return a) ???
          Promise v tid >>= f = Promise ???

    • We need a ThreadId, but don’t have one
    • We need to construct a new MVar for synchronisation but don’t have an IO context

    Remember, (>>=) :: Promise a -> (a -> Promise b) -> Promise b
  34. Monad Instance Tweaks

        data Promise a = Promise { runPromise :: IO (IO a, IO ()) }

        async :: IO a -> IO (Promise a)
        async action = do
          var <- newEmptyMVar
          tid <- forkIO (action >>= putMVar var)
          return (Promise (return (takeMVar var, killThread tid)))

        await :: Promise a -> IO a
        await p = do
          (wait, _) <- runPromise p
          wait

        cancel :: Promise a -> IO ()
        cancel p = do
          (_, stop) <- runPromise p
          stop
  35. Monad instance, Try 2

        instance Monad Promise where
          -- The first component must itself be an IO action yielding the value
          return a = Promise (return (return a, return ()))
          x >>= f = Promise do
            var <- newEmptyMVar
            cvar <- newEmptyMVar
            t1 <- forkIO do
              (take1, cancel1) <- runPromise x
              putMVar cvar cancel1
              a <- take1
              (take2, cancel2) <- runPromise (f a)
              putMVar var take2
              _ <- takeMVar cvar
              putMVar cvar cancel2
            return (join (takeMVar var), killThread t1 >> join (takeMVar cvar))
  36. Multiple sequential calls

        do fooResult <- await $ getFoo params
           barResult <- await $ getBar fooResult
           bazResult <- await $ getBaz fooResult barResult
           do other things…

  37. Software Transactional Memory

    Looks very similar to IO, and MVar

        data STM a
        instance Monad STM

        data TVar a

        newTVar :: a -> STM (TVar a)
        readTVar :: TVar a -> STM a
        writeTVar :: TVar a -> a -> STM ()
  38. But More

    Conversion to IO

        atomically :: STM a -> IO a

    A concurrency operator

        orElse :: STM a -> STM a -> STM a

    And this mysterious function

        retry :: STM a
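A small sketch of how `retry` and `orElse` fit together, assuming the `stm` package (which ships with GHC) is available. `takeFrom` is a hypothetical helper: it retries on an empty list, and `orElse` falls through to the second transaction when the first one retries.

```haskell
import Control.Concurrent.STM

-- Take the head of the list in the TVar; retry (block) if it is empty.
takeFrom :: TVar [a] -> STM a
takeFrom var = do
  items <- readTVar var
  case items of
    []       -> retry
    (x : xs) -> writeTVar var xs >> pure x

main :: IO ()
main = do
  primary <- newTVarIO ([] :: [String])
  backup  <- newTVarIO ["from backup"]
  -- primary is empty, so its transaction retries and orElse tries backup.
  item <- atomically (takeFrom primary `orElse` takeFrom backup)
  putStrLn item   -- prints "from backup"
```

If both lists were empty, the combined transaction would block until one of the two TVars changed.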
  39. STM Example

    • IN: Shared List of URLs
      OUT: Shared List of URL contents
    • We have a producer that adds URLs to the IN list
    • Multiple consumers that take URLs from the IN list, make network requests, and push the contents to the OUT list
  40. No Sweat

    Shared data

        inList :: TVar [String]
        outList :: TVar [String]

        producer = forever do
          url <- generateURL
          atomically do
            urls <- readTVar inList
            writeTVar inList (url : urls)
  41. No Sweat

        consumer = forever do
          url <- atomically do
            urls <- readTVar inList
            case urls of
              [] -> retry
              url:tail -> do
                writeTVar inList tail
                return url
          contents <- fetch url
          atomically do
            tail <- readTVar outList
            writeTVar outList (contents:tail)
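The producer and consumers can be put together into a finite, runnable sketch, assuming the `stm` package that ships with GHC. The names `consumeOne` and `runPipeline` are hypothetical, `fetch` is a stand-in that makes no network request, and the TVars are created locally rather than shared as top-level values.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forM_, forever, replicateM_)
import Data.List (sort)

-- Hypothetical stand-in for a network fetch.
fetch :: String -> IO String
fetch url = pure ("contents of " ++ url)

-- One consumer step: block until a URL is available, fetch it, store the result.
consumeOne :: TVar [String] -> TVar [String] -> IO ()
consumeOne inList outList = do
  url <- atomically $ do
    urls <- readTVar inList
    case urls of
      []         -> retry
      (u : rest) -> writeTVar inList rest >> pure u
  contents <- fetch url
  atomically (modifyTVar' outList (contents :))

-- Start the workers, feed them the URLs, and wait until every result is in.
-- The workers are left blocked afterwards; a real program would cancel them.
runPipeline :: Int -> [String] -> IO [String]
runPipeline workers urls = do
  inList  <- newTVarIO []
  outList <- newTVarIO []
  replicateM_ workers (forkIO (forever (consumeOne inList outList)))
  forM_ urls $ \u -> atomically (modifyTVar' inList (u :))
  atomically $ do
    out <- readTVar outList
    check (length out == length urls)   -- retries until all results arrive
    pure out

main :: IO ()
main = do
  results <- runPipeline 2 ["foo.com", "bar.com", "baz.com"]
  mapM_ putStrLn (sort results)
```

The order in which consumers finish is not deterministic, which is why the results are sorted before printing.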
  42. Other examples

    • Distributed-process framework - The Actor Model
      https://github.com/haskell-distributed/distributed-process/wiki/The-Actor-Model
    • Communicating Haskell Processes - Message Passing via Channels
      https://www.cs.kent.ac.uk/projects/ofa/chp/