Slide 1

Slide 1 text

Anupam Jain
Concurrency in Haskell

Slide 2

Slide 2 text

Concurrency vs Parallelism
• Suspend execution and switch context = Multitasking
• Cooperative context switching = Concurrency
• Automatic time slicing = Preemptive multitasking (still concurrency, even on one processor)
• Multiple processors running multiple threads = Parallelism

Slide 3

Slide 3 text

Concurrency
• Concurrent execution and IO
• Shared Data, Many Producers and Consumers

Slide 4

Slide 4 text

Haskell
• Concurrent Execution
  • Haskell has lightweight threads that are very efficient
  • Haskell IO multiplexes all ongoing IO requests using efficient operating system primitives such as epoll on Linux. Thus applications with lots of lightweight threads, all doing IO simultaneously, perform very well

Slide 5

Slide 5 text

Haskell
• Shared Data, Many Producers and Consumers
  • Haskell provides powerful primitives which make it easy to precisely control and synchronise access to shared data

Slide 6

Slide 6 text

Scope of this talk
• Only Explicit Concurrency
• Discussion of Haskell’s Concurrency API
  • No special syntax, only library functions
• Rationale behind the API design
• Solving problems from first principles

Slide 7

Slide 7 text

Haskell is a Pure Language
• All expressions are pure
• Concurrency does not make sense for pure expressions
  • Parallelism still does
• But you can still perform effects with the IO monad, e.g. putStrLn :: String -> IO ()
• Effectful values have to be sequenced to have an effect

Slide 8

Slide 8 text

Sequencing Effects
• putStrLn "Hello"
  is a pure value
• printMyValue = putStrLn "Hello"
  does not actually print anything
• The *only* way to run effects is by declaring *one* to be the main effect:
• main = printMyValue
  prints "Hello"!
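The point above can be checked with a small program. This is a minimal sketch using only the Prelude; the name greetings is introduced here purely for illustration and is not from the talk.

```haskell
-- Defining this value performs no output; it only describes an effect.
printMyValue :: IO ()
printMyValue = putStrLn "Hello"

-- A list of effect descriptions (hypothetical name). Building the list
-- prints nothing either.
greetings :: [IO ()]
greetings = [putStrLn "one", putStrLn "two"]

main :: IO ()
main = do
  printMyValue              -- runs here, because main sequences it
  sequence_ greetings       -- run the described effects in order
  print (length greetings)  -- counting the actions does not run them again
```

Only the actions reachable from main are ever executed; the rest stay ordinary values that can be stored, passed around, or dropped.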

Slide 9

Slide 9 text

Multiple Effects?
• Haskell provides high level functions to sequence effectful values together, e.g. monadic bind.
    (>>) :: IO a -> IO b -> IO b

Slide 10

Slide 10 text

Multiple Effects?
• Haskell provides high level functions to compose effectful values together, e.g. monadic bind.
    (>>) :: IO a -> IO b -> IO b
    (>>=) :: IO a -> (a -> IO b) -> IO b
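As a small illustration of the two operators, here is a sketch that chains IO actions without do notation. The function name bumpTwice and the use of an IORef counter are assumptions made for the example.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- (>>=) passes the result of the first action to the next one;
-- (>>) sequences two actions and discards the first result.
bumpTwice :: IO Int
bumpTwice =
  newIORef (0 :: Int) >>= \ref ->
    modifyIORef' ref (+ 1) >>
    modifyIORef' ref (+ 1) >>
    readIORef ref

main :: IO ()
main = bumpTwice >>= print
```

Do notation is syntactic sugar for exactly this kind of chain.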


Slide 11

Slide 11 text

Combinators all the way
• Combinator: a function that can combine multiple values into a single value
• Bind is a combinator. Haskell is big on combinators!
• Makes sense that the concurrency APIs are also combinator-heavy
  • No special syntax needed, just function calls
• Combinators compose
  • So concurrent code feels like using Lego bricks to build larger programs

Slide 12

Slide 12 text

Haskell’s philosophy
• Like Lego
• Haskell provides a small set of orthogonal general purpose APIs for concurrency
• That you can combine in multiple ways to create the desired functionality
• That means you can build your own models of concurrency, including:
  • Shared memory
  • Transactions
  • Actor models
  • Etc.

Slide 13

Slide 13 text

“Easiest” concurrency model
• Inversion of control, a.k.a. callbacks

    -- This call returns immediately
    getFoo params \fooResult ->
      -- This code will run only when the result is available
      use fooResult

    -- ... meanwhile, do something else ...


Slide 14

Slide 14 text

forkIO
• forkIO just runs some IO action in a separate thread
    forkIO :: IO () -> IO ThreadId
• Something like getFoo could be implemented using forkIO:
    getFoo params handler = forkIO do
      fooResult <- makeAPICall params
      handler fooResult
• Note the monadic sequencing to invoke the handler
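A runnable version of this callback pattern might look as follows. It is a sketch using only base: makeAPICall is a hypothetical stub standing in for a real remote call, and the MVar in main exists only so the program waits for the callback before exiting.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-in for a slow remote call.
makeAPICall :: Int -> IO String
makeAPICall n = pure ("foo-" ++ show n)

-- Callback-style wrapper: returns immediately and runs the handler
-- on a new lightweight thread once the result is available.
getFoo :: Int -> (String -> IO ()) -> IO ThreadId
getFoo params handler = forkIO $ do
  fooResult <- makeAPICall params
  handler fooResult

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- getFoo 42 (putMVar done)  -- the callback hands the result back
  putStrLn "meanwhile, doing something else"
  result <- takeMVar done        -- block until the callback has fired
  putStrLn result
```

Without the final takeMVar the program could exit before the forked thread runs, because the runtime stops when the main thread finishes.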

Slide 15

Slide 15 text

Referential transparency
• Inlining doesn’t change the meaning of a program. So this program:
    getFoo params \fooResult ->
      use fooResult
    meanwhile do something else
• Is equivalent to:
    forkIO do
      fooResult <- makeAPICall params
      use fooResult
    meanwhile do something else

Slide 16

Slide 16 text

Parallel calls
• Running getFoo and getBar together
• Use
    concurrently :: IO a -> IO b -> IO (a, b)
• Usage
    (fooResult, barResult) <- concurrently getFoo getBar


Slide 17

Slide 17 text

What happens if one fails?
    (fooResult, barResult) <- concurrently getFoo getBar
• If either action throws an exception at any time, then the other action is cancelled, and the exception is re-thrown by concurrently.
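Both behaviours can be observed with a short program. This sketch assumes the async package is installed; getFoo, getBar and failing are hypothetical stubs that only sleep, return, or throw.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (concurrently)
import Control.Exception (ErrorCall (..), throwIO, try)

-- Hypothetical stand-ins for real calls.
getFoo :: IO Int
getFoo = threadDelay 10000 >> pure 1

getBar :: IO String
getBar = threadDelay 20000 >> pure "bar"

-- An action that fails immediately.
failing :: IO String
failing = throwIO (ErrorCall "boom")

main :: IO ()
main = do
  -- Both actions succeed: we get both results as a pair.
  both <- concurrently getFoo getBar
  print both
  -- One action throws: getFoo is cancelled and the exception is
  -- re-thrown by concurrently, so try observes it here.
  r <- try (concurrently getFoo failing) :: IO (Either ErrorCall (Int, String))
  print r
```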

Slide 18

Slide 18 text

What if we want to do something different?
• Run getFoo and getBar together, cancel the other as soon as one returns
• Use
    race :: IO a -> IO b -> IO (Either a b)
• Usage
    res <- race getFoo getBar
    case res of
      Left fooResult -> do something with Foo
      Right barResult -> do something with Bar
Sum types FTW!

Slide 19

Slide 19 text

More than two actions
• Use Concurrently with the Applicative instance
    newtype Concurrently a = Concurrently (IO a)
    (<$>) :: (a -> b) -> Concurrently a -> Concurrently b
    (<*>) :: Concurrently (a -> b) -> Concurrently a -> Concurrently b
    action :: Concurrently (Foo, Bar, Baz)
    action = mk3Tuple <$> getFoo <*> getBar <*> getBaz
• Note: separated for clarity
    mk3Tuple = (,,)
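To actually run such a value, it has to be unwrapped back into IO. The sketch below assumes the async package, whose Concurrently newtype exposes a runConcurrently field; the three stub actions and their result types are hypothetical.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (Concurrently (..))

-- Hypothetical stubs; each sleeps briefly, then returns a value.
getFoo :: Concurrently Int
getFoo = Concurrently (threadDelay 10000 >> pure 1)

getBar :: Concurrently String
getBar = Concurrently (threadDelay 10000 >> pure "bar")

getBaz :: Concurrently Bool
getBaz = Concurrently (threadDelay 10000 >> pure True)

-- The Applicative instance runs all three actions concurrently
-- and combines their results with the tuple constructor.
action :: Concurrently (Int, String, Bool)
action = (,,) <$> getFoo <*> getBar <*> getBaz

main :: IO ()
main = runConcurrently action >>= print
```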

Slide 20

Slide 20 text

How’d that work again?
• Let the types guide you:
    mk3Tuple :: a -> (b -> (c -> (a,b,c)))
    (<$>) :: (u -> v) -> Concurrently u -> Concurrently v
    (mk3Tuple <$>) :: Concurrently a -> Concurrently (b -> (c -> (a,b,c)))
    getFoo :: Concurrently Foo
    (mk3Tuple <$> getFoo) :: Concurrently (b -> (c -> (Foo,b,c)))
    getBar :: Concurrently Bar
    (<*> getBar) :: Concurrently (Bar -> b) -> Concurrently b
    (mk3Tuple <$> getFoo <*> getBar) :: Concurrently (c -> (Foo,Bar,c))
    (mk3Tuple <$> getFoo <*> getBar <*> getBaz) :: Concurrently (Foo,Bar,Baz)

Slide 21

Slide 21 text

Racing multiple actions
• What if we want to run multiple actions, but return the result from the first one to complete?
• Use the Alternative instance with Concurrently!
    (<|>) :: Concurrently a -> Concurrently a -> Concurrently a
    action = getFoo <|> getBar <|> getBaz
• Error: can’t match Foo with Bar with Baz

Slide 22

Slide 22 text

Racing multiple actions
    data OneOfThree a b c = First a | Second b | Third c
    (<|>) :: Concurrently a -> Concurrently a -> Concurrently a
    (<$>) :: (a -> b) -> Concurrently a -> Concurrently b
    First <$> getFoo :: Concurrently (OneOfThree Foo b c)
    Second <$> getBar :: Concurrently (OneOfThree a Bar c)
    Third <$> getBaz :: Concurrently (OneOfThree a b Baz)
    action :: Concurrently (OneOfThree Foo Bar Baz)
    action = (First <$> getFoo) <|> (Second <$> getBar) <|> (Third <$> getBaz)
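A runnable sketch of this three-way race, assuming the async package. The stubs and their delays are hypothetical and chosen so that getBar clearly finishes first.

```haskell
import Control.Applicative ((<|>))
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (Concurrently (..))

data OneOfThree a b c = First a | Second b | Third c
  deriving (Eq, Show)

-- Hypothetical stubs: getBar is far faster than the other two.
getFoo :: Concurrently Int
getFoo = Concurrently (threadDelay 300000 >> pure 1)

getBar :: Concurrently String
getBar = Concurrently (threadDelay 10000 >> pure "bar")

getBaz :: Concurrently Bool
getBaz = Concurrently (threadDelay 300000 >> pure True)

-- Wrapping each result in a constructor gives all three branches
-- the same type, so the Alternative instance can race them.
action :: Concurrently (OneOfThree Int String Bool)
action = (First <$> getFoo) <|> (Second <$> getBar) <|> (Third <$> getBaz)

main :: IO ()
main = runConcurrently action >>= print
```

The losing actions are cancelled as soon as the winner returns, so the program does not wait for the slower stubs.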

Slide 23

Slide 23 text

Timeout function
• threadDelay can be used to measure time (its argument is in microseconds)
    threadDelay :: Int -> IO ()
    timeout :: Int -> IO a -> IO (Maybe a)
    timeout us io = either id id <$> race (Nothing <$ threadDelay us) (Just <$> io)
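Because race returns an Either, the two Maybe results have to be merged before the function has the advertised type. The sketch below shows one way to do that, assuming the async package; the name timeoutSketch is chosen here to avoid clashing with System.Timeout.timeout, which base already provides with the same signature.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)

-- Race the action against a timer. Both sides return a Maybe, so
-- 'either id id' collapses the Either produced by race.
-- The delay is in microseconds, as with threadDelay.
timeoutSketch :: Int -> IO a -> IO (Maybe a)
timeoutSketch micros io =
  either id id <$> race (Nothing <$ threadDelay micros) (Just <$> io)

main :: IO ()
main = do
  -- The action finishes well inside the limit.
  fast <- timeoutSketch 500000 (pure (1 :: Int))
  print fast
  -- The action takes much longer than the limit and is cancelled.
  slow <- timeoutSketch 10000 (threadDelay 500000 >> pure (2 :: Int))
  print slow
```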

Slide 24

Slide 24 text

Manipulating a data structure concurrently
• Assume a function:
    getURL :: String -> Concurrently String
• We need to map over a list of URLs, fetching their contents concurrently, and returning all of them in a list
    mapConcurrently :: (a -> Concurrently b) -> [a] -> Concurrently [b]
• Haskellers would recognise that mapConcurrently = traverse
• traverse requires Concurrently to be Applicative, which, as we saw, it already is
    traverse getURL ["foo.com", "bar.com", "baz.com"]
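A sketch of this idea, assuming the async package. Here getURL is a hypothetical stub that performs no network access, and mapConcurrentlySketch is named differently from the library's own mapConcurrently, which has a more general, IO-based type.

```haskell
import Control.Concurrent.Async (Concurrently (..))

-- Hypothetical stub: a real version would perform an HTTP request.
getURL :: String -> Concurrently String
getURL url = Concurrently (pure ("<contents of " ++ url ++ ">"))

-- With the Applicative instance in place, mapping concurrently over
-- a list is just traverse.
mapConcurrentlySketch :: (a -> Concurrently b) -> [a] -> Concurrently [b]
mapConcurrentlySketch = traverse

fetchAll :: [String] -> IO [String]
fetchAll urls = runConcurrently (mapConcurrentlySketch getURL urls)

main :: IO ()
main = fetchAll ["foo.com", "bar.com", "baz.com"] >>= mapM_ putStrLn
```

The results come back in the order of the input list, regardless of which fetch finishes first.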

Slide 25

Slide 25 text

Tree of Actions
• This is what we had in the previous slide:
    traverse getURL ["foo.com", "bar.com", "baz.com"]
• This launches 3 threads, and an exception in any of them will kill all the other threads as well.
• Using race or concurrently, we are building a tree of threads, where all the threads are always cleaned up.
• For now, we will not discuss exceptions further in this presentation

Slide 26

Slide 26 text

Cases for which there’s no prebuilt abstraction
• That actually happens less than you think
• Note that I said “prebuilt”, not “inbuilt”
  • All of these functions are written, in user code, upon lower level primitives
  • The implementations are very simple. Usually you can count the number of lines on one hand.
• You can write your own abstractions very easily
• But most of the time, you will just reach into your standard Haskell toolbox

Slide 27

Slide 27 text

A complicated flow
• Race two actions getInt and getBool concurrently.
• If getBool finishes first, and is true, then answer with 100, cancelling getInt
• Else wait for and return the result of getInt
• Since we need to inspect the result of one action before we decide whether to cancel the other action, this is not doable with race and concurrently

Slide 28

Slide 28 text

Multiple sequential calls
    getFoo params \fooResult -> do
      getBar fooResult \barResult -> do
        getBaz fooResult barResult \bazResult -> do
          do other things…
Callback hell!

Slide 29

Slide 29 text

Do notation
    do fooResult <- getFoo params
       barResult <- getBar fooResult
       bazResult <- getBaz fooResult barResult
       do other things…
Much better, but how?

Slide 30

Slide 30 text

Shared data between threads
    MVar a
    newEmptyMVar :: IO (MVar a)
    putMVar :: MVar a -> a -> IO ()
    takeMVar :: MVar a -> IO a
These are blocking operations: putMVar waits until the MVar is empty, takeMVar waits until it is full
Hence they are also synchronised
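The blocking behaviour is enough to hand values from one thread to another safely. A minimal sketch using only base; the names producer and consume are made up for the example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM)

-- Each putMVar blocks until the box is empty, so the producer can
-- never overwrite a value the consumer has not yet taken.
producer :: MVar Int -> IO ()
producer box = mapM_ (putMVar box) [1, 2, 3]

-- Each takeMVar blocks until the box is full.
consume :: Int -> MVar Int -> IO [Int]
consume n box = replicateM n (takeMVar box)

main :: IO ()
main = do
  box <- newEmptyMVar
  _ <- forkIO (producer box)
  xs <- consume 3 box
  print xs  -- prints [1,2,3]
```

An MVar therefore acts as a one-slot channel: the two threads take turns without any explicit lock.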

Slide 31

Slide 31 text

Also remember ThreadId
    forkIO :: IO () -> IO ThreadId
You can control the thread with the ThreadId
For example, cancel the thread
    cancelThread :: ThreadId -> IO ()   -- called killThread in Control.Concurrent

Slide 32

Slide 32 text

A complicated flow
    complicatedFlow = do
      v <- newEmptyMVar
      -- Fork getInt first so its ThreadId is available to the Bool handler
      threadInt <- forkIO (getInt >>= putMVar v)
      _ <- forkIO (getBool >>= boolHandler v threadInt)
      takeMVar v

      where
        -- If getBool yields True, cancel getInt and answer with 100
        boolHandler v threadInt b = when b do
          cancelThread threadInt
          putMVar v 100
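For experimentation, the flow can be made runnable with only base. This is a sketch under a few assumptions: the two actions are passed in as parameters so that stubs can be supplied, and killThread from Control.Concurrent plays the role of the slide's cancelThread.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (when)

-- Answer 100 if getBool yields True before getInt finishes;
-- otherwise return the result of getInt.
complicatedFlow :: IO Bool -> IO Int -> IO Int
complicatedFlow getBool getInt = do
  v <- newEmptyMVar
  -- Fork getInt first so its ThreadId is in scope for the Bool thread.
  threadInt <- forkIO (getInt >>= putMVar v)
  _ <- forkIO $ do
    b <- getBool
    when b $ do
      killThread threadInt
      putMVar v 100
  takeMVar v

main :: IO ()
main = do
  -- getBool is True and immediate; getInt is slow, so it is cancelled.
  a <- complicatedFlow (pure True) (threadDelay 500000 >> pure 7)
  print a  -- prints 100
  -- getBool is False, so the flow waits for getInt.
  b <- complicatedFlow (pure False) (threadDelay 10000 >> pure 7)
  print b  -- prints 7
```

If getInt wins the race and getBool later yields True, the Bool thread still writes 100 into the MVar after the answer has been taken; a more careful version could use tryPutMVar or track completion explicitly.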

Slide 33

Slide 33 text

Async - Await (but better)
    data Promise a = Promise (MVar a) ThreadId

    async :: IO a -> IO (Promise a)
    async action = do
      var <- newEmptyMVar
      tid <- forkIO (action >>= putMVar var)
      return (Promise var tid)

    await :: Promise a -> IO a
    await (Promise var _) = takeMVar var

    cancel :: Promise a -> IO ()
    cancel (Promise _ tid) = cancelThread tid
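A runnable sketch of this Promise type using only base. It departs from the slide in two labelled ways: killThread stands in for cancelThread, and await uses readMVar instead of takeMVar so that the same promise can be awaited more than once. The demo function is hypothetical.

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

data Promise a = Promise (MVar a) ThreadId

-- Start the action on a new thread; its result lands in the MVar.
async :: IO a -> IO (Promise a)
async action = do
  var <- newEmptyMVar
  tid <- forkIO (action >>= putMVar var)
  pure (Promise var tid)

-- readMVar leaves the value in place, so await can be called repeatedly.
await :: Promise a -> IO a
await (Promise var _) = readMVar var

cancel :: Promise a -> IO ()
cancel (Promise _ tid) = killThread tid

demo :: IO (Int, Int)
demo = do
  p <- async (threadDelay 10000 >> pure (6 * 7))
  q <- async (threadDelay 10000000 >> pure (0 :: Int))
  cancel q       -- q is never awaited, so cancelling it is safe
  x <- await p
  y <- await p   -- a second await returns the same value
  pure (x, y)

main :: IO ()
main = demo >>= print
```

One limitation of this sketch: if the action throws, nothing is ever put into the MVar, so await would block. The async library handles that case by storing the exception alongside the result.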

Slide 34

Slide 34 text

A complicated flow
    complicatedFlow = do
      boolPromise <- async getBool
      intPromise <- async getInt
      -- race already tags the winner with Left or Right
      v <- race (await boolPromise) (await intPromise)
      case v of
        Left True -> do
          cancel intPromise
          pure 100
        Left False -> await intPromise
        Right i -> pure i

Slide 35

Slide 35 text

Monad instance
    instance Monad Promise where
      return a = Promise (return a) ???
      Promise v tid >>= f = Promise ???
• We need a ThreadId, but don’t have one
• We need to construct a new MVar for synchronisation but don’t have an IO context

Remember, (>>=) :: Promise a -> (a -> Promise b) -> Promise b

Slide 36

Slide 36 text

Monad Instance Tweaks
    data Promise a = Promise { runPromise :: IO (IO a, IO ()) }

    async :: IO a -> IO (Promise a)
    async action = do
      var <- newEmptyMVar
      tid <- forkIO (action >>= putMVar var)
      return (Promise (return (takeMVar var, cancelThread tid)))

    await :: Promise a -> IO a
    await p = do
      (take, _) <- runPromise p
      take

    cancel :: Promise a -> IO ()
    cancel p = do
      (_, cancel) <- runPromise p
      cancel

Slide 37

Slide 37 text

Monad instance, Try 2
    instance Monad Promise where
      return a = Promise (return (return a, return ()))
      x >>= f = Promise do
        var <- newEmptyMVar
        cvar <- newEmptyMVar
        t1 <- forkIO do
          (take1, cancel1) <- runPromise x
          putMVar cvar cancel1
          a <- take1
          (take2, cancel2) <- runPromise (f a)
          putMVar var take2
          _ <- takeMVar cvar
          putMVar cvar cancel2
        return (join (takeMVar var), cancelThread t1 >> join (takeMVar cvar))

Slide 38

Slide 38 text

Multiple sequential calls
    do fooResult <- await $ getFoo params
       barResult <- await $ getBar fooResult
       bazResult <- await $ getBaz fooResult barResult
       do other things…


Slide 39

Slide 39 text

Software Transactional Memory
Looks very similar to IO, and MVar

    data STM a
    instance Monad STM

    data TVar a

    newTVar :: a -> STM (TVar a)
    readTVar :: TVar a -> STM a
    writeTVar :: TVar a -> a -> STM ()

Slide 40

Slide 40 text

But More
Conversion to IO
    atomically :: STM a -> IO a
A concurrency operator
    orElse :: STM a -> STM a -> STM a
And this mysterious function
    retry :: STM a
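How retry and orElse interact is easiest to see on a tiny example. The sketch below assumes the stm package (which ships with GHC); pop and popEither are hypothetical helpers operating on lists held in TVars.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, orElse, readTVar, readTVarIO, retry, writeTVar)

-- Take the head of the list, or retry (abandon this attempt and wait
-- for the TVar to change) if the list is empty.
pop :: TVar [a] -> STM a
pop q = do
  xs <- readTVar q
  case xs of
    [] -> retry
    (y : ys) -> writeTVar q ys >> pure y

-- orElse runs the second transaction only if the first one retries.
popEither :: TVar [a] -> TVar [a] -> STM a
popEither a b = pop a `orElse` pop b

main :: IO ()
main = do
  a <- newTVarIO ([] :: [Int])
  b <- newTVarIO [10, 20]
  x <- atomically (popEither a b)  -- a is empty, so b is used
  rest <- readTVarIO b
  print (x, rest)  -- prints (10,[20])
```

If both lists were empty, the whole transaction would block until another thread wrote to either TVar.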

Slide 41

Slide 41 text

Why?
• Automatic, no-sweat synchronisation
• No need for locking, broadcast channels etc.
• No deadlocks!

Slide 42

Slide 42 text

STM Example
• IN: Shared List of URLs
  OUT: Shared List of URL contents
• We have a producer that adds URLs to the IN list
• Multiple consumers that take URLs from the IN list, make network requests, and push the contents to the OUT list

Slide 43

Slide 43 text

No Sweat
Shared data
    inList :: TVar [String]
    outList :: TVar [String]

    producer = forever do
      url <- generateURL
      atomically do
        urls <- readTVar inList
        writeTVar inList (url : urls)

Slide 44

Slide 44 text

No Sweat
    consumer = forever do
      url <- atomically do
        urls <- readTVar inList
        case urls of
          [] -> retry
          url : rest -> do
            writeTVar inList rest
            return url
      contents <- fetch url
      atomically do
        rest <- readTVar outList
        writeTVar outList (contents : rest)
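The producer and consumers can be wired together into a runnable program. This is a sketch assuming the stm package: fetch is a hypothetical stub that does no network access, the TVars are passed as arguments rather than being global, and runPipeline is a made-up driver that feeds a fixed list of URLs and waits for all results.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (TVar, atomically, modifyTVar', newTVarIO, readTVar, retry, writeTVar)
import Control.Monad (forM_, forever, replicateM_)
import Data.List (sort)

-- Hypothetical stub: a real version would make a network request.
fetch :: String -> IO String
fetch url = pure ("contents of " ++ url)

-- Take one URL (blocking via retry while the IN list is empty),
-- fetch it outside the transaction, then push the result to OUT.
consumer :: TVar [String] -> TVar [String] -> IO ()
consumer inList outList = forever $ do
  url <- atomically $ do
    urls <- readTVar inList
    case urls of
      [] -> retry
      (u : rest) -> writeTVar inList rest >> pure u
  contents <- fetch url
  atomically (modifyTVar' outList (contents :))

-- Start the workers, feed the URLs, and wait until every result is in.
runPipeline :: Int -> [String] -> IO [String]
runPipeline workers urls = do
  inList <- newTVarIO []
  outList <- newTVarIO []
  replicateM_ workers (forkIO (consumer inList outList))
  forM_ urls $ \url -> atomically (modifyTVar' inList (url :))
  atomically $ do
    out <- readTVar outList
    if length out < length urls then retry else pure out

main :: IO ()
main = do
  results <- runPipeline 3 ["a.example", "b.example", "c.example", "d.example"]
  mapM_ putStrLn (sort results)  -- arrival order varies, so sort for display
```

No explicit locks appear anywhere: each atomically block either commits as a whole or is re-run, and retry doubles as the blocking wait.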

Slide 45

Slide 45 text

Other examples
• Distributed-process framework - The Actor Model
  • https://github.com/haskell-distributed/distributed-process/wiki/The-Actor-Model
• Communicating Haskell Processes - Message Passing via Channels
  • https://www.cs.kent.ac.uk/projects/ofa/chp/

Slide 46

Slide 46 text

Thank You