Concurrency in Haskell

Anupam Jain

Concurrency vs Parallelism
• Suspend execution and switch context = Multitasking
• Cooperative context switching = Concurrency
• Automatic time slicing = Parallelism
• Multiple processors running multiple threads = Parallelism

Concurrency
• Concurrent execution and IO
• Shared Data, Many Producers and Consumers

Haskell
• Concurrent Execution
• Haskell has lightweight threads that are very efficient
• Haskell IO multiplexes all ongoing IO requests using efficient operating system primitives such as epoll on Linux. Thus applications with lots of lightweight threads, all doing IO simultaneously, perform very well

Haskell
• Shared Data, Many Producers and Consumers
• Haskell provides powerful primitives which make it easy to precisely control and synchronise access to shared data

Scope of this talk
• Only Explicit Concurrency
• Discussion of Haskell’s Concurrency API
• No special syntax, only library functions
• Rationale behind the API design
• Solving problems from first principles

Haskell is a Pure Language
• All expressions are pure
• Concurrency does not make sense for pure expressions. Parallelism still does
• But you can still perform effects with the IO Monad, e.g. putStrLn :: String -> IO ()
• Effectful values have to be sequenced to have an effect

Sequencing Effects
• putStrLn "Hello" is a pure value
• printMyValue = putStrLn "Hello" does not actually print anything
• The *only* way to sequence effects is by declaring *one* to be the main effect -
• main = printMyValue prints "Hello"!
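The point that an IO value does nothing until it is sequenced into main can be checked with a small sketch. It uses an IORef counter as a stand-in effect; the names bump and demo are illustrative and not from the talk.

```haskell
import Data.IORef (newIORef, modifyIORef, readIORef)

-- Returns how many times the effect actually ran.
demo :: IO Int
demo = do
  counter <- newIORef (0 :: Int)
  let bump    = modifyIORef counter (+ 1)  -- a value describing an effect
      _unused = [bump, bump, bump]         -- stored in a list, never run
  bump >> bump                             -- sequenced twice, so it runs twice
  readIORef counter

main :: IO ()
main = demo >>= print
```

Building the list of three actions has no effect on the counter; only the two sequenced uses of bump are counted, so this prints 2.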

Multiple Effects?
• Haskell provides high level functions to sequence effectful values together, e.g. monadic bind.
    (>>) :: IO a -> IO b -> IO b

Multiple Effects?
• Haskell provides high level functions to compose effectful values together, e.g. monadic bind.
    (>>) :: IO a -> IO b -> IO b
    (>>=) :: IO a -> (a -> IO b) -> IO b
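As a rough illustration of the two operators, the sketch below uses an IORef so the order of effects is observable. The reference and the name demo are assumptions made for the example.

```haskell
import Data.IORef (newIORef, writeIORef, modifyIORef, readIORef)

demo :: IO Int
demo = do
  ref <- newIORef (1 :: Int)
  -- (>>) runs both actions in order and discards the first result.
  writeIORef ref 10 >> modifyIORef ref (* 2)
  -- (>>=) feeds the result of the first action to the function on its right.
  readIORef ref >>= \n -> pure (n + 1)

main :: IO ()
main = demo >>= print
```

The write happens before the doubling, and the value read back is passed to the lambda, so the result is 21.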

Combinators all the way
• Combinator: a function that can combine multiple values into a single value
• Bind is a combinator. Haskell is big on combinators!
• Makes sense that the concurrency APIs are also combinator-heavy
• No special syntax needed, just function calls
• Combinators compose
• So concurrent code feels like using Lego bricks to build larger programs

Haskell’s philosophy
• Like Lego
• Haskell provides a small set of orthogonal general purpose APIs for concurrency
• That you can combine in multiple ways to create the desired functionality.
• That means you can build your own models of concurrency, including -
  • Shared memory
  • Transactions
  • Actor models
  • Etc.

“Easiest” concurrency model
• Inversion of control a.k.a. Callbacks
    -- This call returns immediately
    getFoo params \fooResult ->
      -- This code will run only when the result is available
      use fooResult
    … Meanwhile do something else …

forkIO
• forkIO just runs some IO action in a separate thread
    forkIO :: IO () -> IO ThreadId
• Something like getFoo could be implemented using forkIO -
    getFoo params handler = forkIO do
      fooResult <- makeAPICall params
      handler fooResult
• Note the monadic sequencing to invoke the handler
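A runnable sketch of the callback-style getFoo, under a few assumptions: makeAPICall is a hypothetical stand-in that just doubles its argument, the parameter and result types are fixed to Int, and an MVar is used only so that the demo can wait for the callback to fire.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-in for a slow API call.
makeAPICall :: Int -> IO Int
makeAPICall n = pure (n * 2)

-- Returns immediately; the handler runs later on the new thread.
getFoo :: Int -> (Int -> IO ()) -> IO ThreadId
getFoo params handler = forkIO $ do
  fooResult <- makeAPICall params
  handler fooResult

demo :: IO Int
demo = do
  done <- newEmptyMVar
  _ <- getFoo 21 (putMVar done)  -- the callback hands the result back
  takeMVar done                  -- block until the callback has fired

main :: IO ()
main = demo >>= print
```

The sketch uses `$` where the slide uses a bare do block, which needs the BlockArguments extension.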

Parametricity
• Inlining doesn’t change the meaning of a program. So this program -
    getFoo params \fooResult ->
      use fooResult
    meanwhile do something else
• Is equivalent to -
    forkIO do
      fooResult <- makeAPICall params
      use fooResult
    meanwhile do something else

Parallel calls
• Running getFoo and getBar together
• Use
    concurrently :: IO a -> IO b -> IO (a,b)
• Usage
    (fooResult, barResult) <- concurrently getFoo getBar

What happens if one fails?
    (fooResult, barResult) <- concurrently getFoo getBar
• If either action throws an exception at any time, then the other action is cancelled, and the exception is re-thrown by concurrently.
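To make the failure behaviour concrete without extra dependencies, here is a deliberately simplified stand-in called concurrentlySketch, built only on forkIO and MVar. It is an assumption-laden sketch, not the library implementation: it cancels the right action when the left one throws, but a failure on the right is only re-thrown once the left action has finished.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (IOException, SomeException, onException, throwIO, try)

-- Simplified stand-in for `concurrently`; not the library implementation.
concurrentlySketch :: IO a -> IO b -> IO (a, b)
concurrentlySketch left right = do
  var <- newEmptyMVar
  tid <- forkIO (try right >>= putMVar var)
  -- If the left action throws, cancel the right one and re-throw.
  a <- left `onException` killThread tid
  r <- takeMVar var
  case r of
    Left e  -> throwIO (e :: SomeException)  -- re-throw a failure from the right
    Right b -> pure (a, b)

-- The left action fails at once; the slow right action is cancelled.
demo :: IO (Either String (Int, Int))
demo = do
  r <- try (concurrentlySketch (throwIO (userError "boom"))
                               (threadDelay 5000000 >> pure 2))
  pure (either (\e -> Left (show (e :: IOException))) Right r)

main :: IO ()
main = demo >>= print
```

The demo returns promptly with the left action's exception instead of waiting five seconds for the right action.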

What if we want to do something different?
• Run getFoo and getBar together, cancel the other as soon as one returns
• Use
    race :: IO a -> IO b -> IO (Either a b)
• Usage
    res <- race getFoo getBar
    case res of
      Left fooResult -> do something with Foo
      Right barResult -> do something with Bar
Sum types FTW!

More than two actions
• Use Concurrently with the Applicative instance
    newtype Concurrently a = Concurrently (IO a)
    (<$>) :: (a -> b) -> Concurrently a -> Concurrently b
    (<*>) :: Concurrently (a -> b) -> Concurrently a -> Concurrently b
    action :: Concurrently (Foo, Bar, Baz)
    action = mk3Tuple <$> getFoo <*> getBar <*> getBaz
    -- Note: separated for clarity
    mk3Tuple = (,,)

How’d that work again?
• Let the types guide you -
    mk3Tuple :: a -> (b -> (c -> (a,b,c)))
    (<$>) :: (u -> v) -> Concurrently u -> Concurrently v
    (mk3Tuple <$>) :: Concurrently a -> Concurrently (b -> (c -> (a,b,c)))
    getFoo :: Concurrently Foo
    (mk3Tuple <$> getFoo) :: Concurrently (b -> (c -> (Foo,b,c)))
    (<*> getBar) :: Concurrently (Bar -> b) -> Concurrently b
    (mk3Tuple <$> getFoo <*> getBar) :: Concurrently (c -> (Foo,Bar,c))
    (mk3Tuple <$> getFoo <*> getBar <*> getBaz) :: Concurrently (Foo,Bar,Baz)
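The type walk-through above can be tried out with a toy Applicative. Conc below is a hypothetical, much simplified stand-in for Concurrently: its `<*>` forks the argument side onto its own thread and has no exception handling or cancellation.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical, simplified stand-in for `Concurrently`.
newtype Conc a = Conc { runConc :: IO a }

instance Functor Conc where
  fmap f (Conc io) = Conc (fmap f io)

instance Applicative Conc where
  pure = Conc . pure
  Conc ioF <*> Conc ioA = Conc $ do
    var <- newEmptyMVar
    _ <- forkIO (ioA >>= putMVar var)  -- run the argument on its own thread
    f <- ioF                           -- meanwhile, run the function side here
    a <- takeMVar var                  -- wait for the argument
    pure (f a)

-- Same shape as: mk3Tuple <$> getFoo <*> getBar <*> getBaz
demo :: IO (Int, String, Bool)
demo = runConc ((,,) <$> Conc (pure 1) <*> Conc (pure "two") <*> Conc (pure True))

main :: IO ()
main = demo >>= print
```

Each `<*>` adds one more concurrently running action while the tuple constructor is applied one argument at a time, exactly as the types suggest.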

Racing multiple actions
• What if we want to run multiple actions, but return the result from the first one to complete?
• Use the Alternative instance with Concurrently!
    (<|>) :: Concurrently a -> Concurrently a -> Concurrently a
    action = getFoo <|> getBar <|> getBaz
Error: can’t match Foo with Bar with Baz

Racing multiple actions
    data OneOfThree a b c = First a | Second b | Third c
    (<|>) :: Concurrently a -> Concurrently a -> Concurrently a
    (<$>) :: (a -> b) -> Concurrently a -> Concurrently b
    First <$> getFoo :: Concurrently (OneOfThree Foo b c)
    Second <$> getBar :: Concurrently (OneOfThree a Bar c)
    Third <$> getBaz :: Concurrently (OneOfThree a b Baz)
    action :: Concurrently (OneOfThree Foo Bar Baz)
    action = (First <$> getFoo) <|> (Second <$> getBar) <|> (Third <$> getBaz)

Timeout function
threadDelay can be used to measure time (its argument is in microseconds)
    threadDelay :: Int -> IO ()
    timeout :: Int -> IO a -> IO (Maybe a)
    timeout micros io = either id id <$> race (Nothing <$ threadDelay micros) (Just <$> io)
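A dependency-free sketch of the timeout idea. raceSketch is a simplified stand-in for race, written here with forkIO, MVar and killThread; it ignores exceptions thrown by either action. Because race returns an Either, the two Maybe results are merged with either id id.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)

-- Simplified stand-in for `race`: ignores exceptions thrown by either action.
raceSketch :: IO a -> IO b -> IO (Either a b)
raceSketch left right = do
  done <- newEmptyMVar
  t1 <- forkIO (left  >>= putMVar done . Left)
  t2 <- forkIO (right >>= putMVar done . Right)
  -- Take the first result, then kill both threads (a no-op for a finished one).
  takeMVar done `finally` (killThread t1 >> killThread t2)

-- The limit is in microseconds, as for threadDelay.
timeoutSketch :: Int -> IO a -> IO (Maybe a)
timeoutSketch micros io =
  either id id <$> raceSketch (Nothing <$ threadDelay micros) (Just <$> io)

demo :: IO (Maybe Int, Maybe Int)
demo = do
  fast <- timeoutSketch 1000000 (pure 1)                       -- finishes in time
  slow <- timeoutSketch 10000 (threadDelay 2000000 >> pure 2)  -- exceeds the limit
  pure (fast, slow)

main :: IO ()
main = demo >>= print
```

For real programs, base already ships System.Timeout.timeout with the same shape of signature.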

Manipulating a data structure concurrently
• Assume a function -
    getURL :: String -> Concurrently String
• We need to map over a list of URLs, fetching their contents concurrently, and returning all of them in a list
    mapConcurrently :: (a -> Concurrently b) -> [a] -> Concurrently [b]
• Haskellers would recognise that mapConcurrently = traverse
• Traverse requires Concurrently to be Applicative, which as we saw it already is
    traverse getURL ["foo.com", "bar.com", "baz.com"]
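Operationally, the traversal amounts to forking one thread per element and collecting the results in order. The sketch below spells that out directly with forkIO and MVar rather than going through an Applicative. fakeFetch is a hypothetical stand-in for getURL, and there is no exception handling.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-in for a network fetch.
fakeFetch :: String -> IO String
fakeFetch url = do
  threadDelay 10000
  pure ("contents of " ++ url)

-- Fork one thread per element, then collect the results in input order.
mapConcurrentlySketch :: (a -> IO b) -> [a] -> IO [b]
mapConcurrentlySketch f xs = do
  vars <- mapM spawn xs
  mapM takeMVar vars
  where
    spawn x = do
      var <- newEmptyMVar
      _ <- forkIO (f x >>= putMVar var)
      pure var

demo :: IO [String]
demo = mapConcurrentlySketch fakeFetch ["foo.com", "bar.com", "baz.com"]

main :: IO ()
main = demo >>= mapM_ putStrLn
```

All three fetches overlap, yet the result list keeps the order of the input list because the MVars are read in that order.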

Tree of Actions
• This is what we had in the previous slide -
    traverse getURL ["foo.com", "bar.com", "baz.com"]
• This launches 3 threads, and an exception in any of them will kill all the other threads as well.
• Using race or concurrently, we are building a tree of threads, where all the threads are always cleaned up.
• For now, we will not discuss exceptions further in this presentation

Cases for which there’s no prebuilt abstraction
• That actually happens less than you think
• Note that I said “prebuilt”, not “inbuilt”
• All of these functions are written, in user code, upon lower level primitives
• The implementations are very simple. Usually you can count the number of lines on one hand.
• You can write your own abstractions very easily
• But most of the time, you will just reach into your standard Haskell toolbox

A complicated flow
• Race two actions getInt and getBool concurrently.
• If getBool finishes first, and is true, then answer with 100, cancelling getInt
• Else wait for and return the result of getInt
• Since we need to inspect the result of one action before we decide whether to cancel the other action, this is not doable with race and concurrently

Multiple sequential calls
    getFoo params \fooResult -> do
      getBar fooResult \barResult -> do
        getBaz fooResult barResult \bazResult -> do
          do other things…
Callback hell!

Do notation
    do
      fooResult <- getFoo params
      barResult <- getBar fooResult
      bazResult <- getBaz fooResult barResult
      do other things…
Much better, but how?

Shared data between threads
    MVar a
    newEmptyMVar :: IO (MVar a)
    putMVar :: MVar a -> a -> IO ()
    takeMVar :: MVar a -> IO a
These are blocking operations
Hence they are also synchronised
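A small sketch of the blocking behaviour: an MVar holds at most one value, so putMVar waits while it is full and takeMVar waits while it is empty. The names box and demo are illustrative.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM)

demo :: IO [Int]
demo = do
  box <- newEmptyMVar
  -- Producer: each putMVar blocks while the box is still full.
  _ <- forkIO (mapM_ (putMVar box) [1, 2, 3])
  -- Consumer: each takeMVar blocks while the box is empty.
  replicateM 3 (takeMVar box)

main :: IO ()
main = demo >>= print
```

The two threads hand the values over one at a time, so the consumer sees them in the order they were put: [1,2,3].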

Also remember ThreadId
    forkIO :: IO () -> IO ThreadId
You can control the thread with the ThreadId
For example, cancel the thread
    cancelThread :: ThreadId -> IO ()
(In base, this operation is killThread from Control.Concurrent.)

A complicated flow
    complicatedFlow = do
      v <- newEmptyMVar
      threadInt <- forkIO (getInt >>= intHandler v)
      threadBool <- forkIO (getBool >>= boolHandler v threadInt)
      takeMVar v
      where
        boolHandler v threadInt b = when b do
          cancelThread threadInt
          putMVar v 100
        intHandler v i = putMVar v i
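Here is a runnable variant of this flow, with some assumptions made for testing: getBool and getInt are passed in as parameters, killThread from Control.Concurrent plays the role of cancelThread, and the demo actions are made-up stand-ins.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (when)

complicatedFlow :: IO Bool -> IO Int -> IO Int
complicatedFlow getBool getInt = do
  v <- newEmptyMVar
  threadInt <- forkIO (getInt >>= putMVar v)
  _ <- forkIO $ do
    b <- getBool
    when b $ do
      killThread threadInt  -- cancel the Int computation
      putMVar v 100
  takeMVar v                -- whichever thread answers first

demo :: IO (Int, Int)
demo = do
  let slowInt = threadDelay 200000 >> pure 7
  a <- complicatedFlow (pure True) slowInt   -- Bool arrives first and is True
  b <- complicatedFlow (pure False) slowInt  -- Bool is False, so wait for the Int
  pure (a, b)

main :: IO ()
main = demo >>= print
```

With an immediate True the Int thread is cancelled and the answer is 100; with False the flow falls back to the Int result, 7.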

Async - Await (but better)
    data Promise a = Promise (MVar a) ThreadId
    async :: IO a -> IO (Promise a)
    async action = do
      var <- newEmptyMVar
      tid <- forkIO (action >>= putMVar var)
      return (Promise var tid)
    await :: Promise a -> IO a
    await (Promise var _) = takeMVar var
    cancel :: Promise a -> IO ()
    cancel (Promise _ tid) = cancelThread tid

A complicated flow
    complicatedFlow = do
      boolPromise <- async getBool
      intPromise <- async getInt
      v <- race (await boolPromise) (await intPromise)
      case v of
        Left True -> do
          cancel intPromise
          pure 100
        Left False -> await intPromise
        Right i -> pure i

Monad instance
    instance Monad Promise where
      return a = Promise (return a) ???
      Promise v tid >>= f = Promise ???
• We need a ThreadId, but don’t have one
• We need to construct a new MVar for synchronisation but don’t have an IO context
Remember,
    (>>=) :: Promise a -> (a -> Promise b) -> Promise b

Monad Instance Tweaks
    data Promise a = Promise { runPromise :: IO (IO a, IO ()) }
    async :: IO a -> IO (Promise a)
    async action = do
      var <- newEmptyMVar
      tid <- forkIO (action >>= putMVar var)
      return (Promise (return (takeMVar var, cancelThread tid)))
    await :: Promise a -> IO a
    await p = do
      (takeResult, _) <- runPromise p
      takeResult
    cancel :: Promise a -> IO ()
    cancel p = do
      (_, cancelIt) <- runPromise p
      cancelIt

Monad instance, Try 2
    instance Monad Promise where
      return a = Promise (return (return a, return ()))
      x >>= f = Promise do
        var <- newEmptyMVar
        cvar <- newEmptyMVar
        t1 <- forkIO do
          (take1, cancel1) <- runPromise x
          putMVar cvar cancel1
          a <- take1
          (take2, cancel2) <- runPromise (f a)
          putMVar var take2
          _ <- takeMVar cvar
          putMVar cvar cancel2
        return (join (takeMVar var), cancelThread t1 >> join (takeMVar cvar))

Multiple sequential calls
    do
      fooResult <- await $ getFoo params
      barResult <- await $ getBar fooResult
      bazResult <- await $ getBaz fooResult barResult
      do other things…

Software Transactional Memory
Looks very similar to IO, and MVar
    data STM a
    instance Monad STM
    data TVar a
    newTVar :: a -> STM (TVar a)
    readTVar :: TVar a -> STM a
    writeTVar :: TVar a -> a -> STM ()

But More
Conversion to IO
    atomically :: STM a -> IO a
A concurrency operator
    orElse :: STM a -> STM a -> STM a
And this mysterious function
    retry :: STM a
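A brief sketch of how retry and orElse interact, assuming the stm package that ships with GHC. takeItem and the two queues are illustrative names, and newTVarIO is used as a convenience for creating TVars outside a transaction.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, orElse, readTVar, retry, writeTVar)

-- Take one item from a queue, retrying (blocking) while it is empty.
takeItem :: TVar [a] -> STM a
takeItem queue = do
  items <- readTVar queue
  case items of
    []       -> retry
    (x : xs) -> writeTVar queue xs >> pure x

demo :: IO (Int, [Int])
demo = do
  primary <- newTVarIO []
  backup  <- newTVarIO [10, 20]
  -- primary is empty, so its branch retries and orElse falls through to backup.
  x <- atomically (takeItem primary `orElse` takeItem backup)
  rest <- atomically (readTVar backup)
  pure (x, rest)

main :: IO ()
main = demo >>= print
```

On its own, retry blocks the transaction until one of the TVars it has read changes; inside orElse it instead hands control to the alternative, which here yields 10 and leaves [20] in the backup queue.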

Why?
• Automatic, no-sweat synchronisation
• No need for locking, broadcast channels etc.
• No deadlocks!

STM Example
• IN: Shared List of URLs
• OUT: Shared List of URL contents
• We have a producer that adds URLs to the IN list
• Multiple consumers that take URLs from the IN list, make network requests, and push the contents to the OUT list

No Sweat
Shared data
    inList :: TVar [String]
    outList :: TVar [String]
    producer = forever do
      url <- generateURL
      atomically do
        urls <- readTVar inList
        writeTVar inList (url : urls)

No Sweat
    consumer = forever do
      url <- atomically do
        urls <- readTVar inList
        case urls of
          [] -> retry
          u : rest -> do
            writeTVar inList rest
            return u
      contents <- fetch url
      atomically do
        done <- readTVar outList
        writeTVar outList (contents : done)
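The producer and consumer slides can be combined into a small runnable sketch, again assuming the stm package. A few things are made up so that the program terminates: fakeFetch stands in for fetch, the producer adds a fixed list of three URLs, and the main thread blocks until all results are in.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (TVar, atomically, modifyTVar', newTVarIO, readTVar, retry, writeTVar)
import Control.Monad (forM_, forever, replicateM_)
import Data.List (sort)

-- Hypothetical stand-in for a network fetch.
fakeFetch :: String -> IO String
fakeFetch url = pure ("contents of " ++ url)

-- One consumer step: take a URL (blocking while none is available),
-- fetch it, and push the contents to the OUT list.
consumeOne :: TVar [String] -> TVar [String] -> IO ()
consumeOne inList outList = do
  url <- atomically $ do
    urls <- readTVar inList
    case urls of
      []         -> retry
      (u : rest) -> writeTVar inList rest >> pure u
  contents <- fakeFetch url
  atomically (modifyTVar' outList (contents :))

demo :: IO [String]
demo = do
  inList  <- newTVarIO []
  outList <- newTVarIO []
  let urls = ["foo.com", "bar.com", "baz.com"]
  -- Two consumers share the work.
  replicateM_ 2 (forkIO (forever (consumeOne inList outList)))
  -- Producer: add each URL to the IN list.
  forM_ urls $ \u -> atomically (modifyTVar' inList (u :))
  -- Block until every URL has been processed.
  results <- atomically $ do
    out <- readTVar outList
    if length out < length urls then retry else pure out
  pure (sort results)

main :: IO ()
main = demo >>= mapM_ putStrLn
```

No locks appear anywhere: each atomically block either commits as a whole or is re-run, and the waiting steps simply retry until the TVars they read have changed. The results are sorted because the order in which consumers finish is not deterministic.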

Other examples
• Distributed-process framework - The Actor Model
  https://github.com/haskell-distributed/distributed-process/wiki/The-Actor-Model
• Communicating Haskell Processes - Message Passing via Channels
  https://www.cs.kent.ac.uk/projects/ofa/chp/

Thank You
