Concurrency in Haskell - Speaker Deck

Slide 1

Slide 1 text

Anupam Jain
Concurrency in Haskell

Slide 2

Slide 2 text

Concurrency vs Parallelism
• Suspend execution and switch context = Multitasking
• Cooperative context switching = Concurrency
• Automatic time slicing = Parallelism
• Multiple processors running multiple threads = Parallelism

Slide 3

Slide 3 text

Concurrency
• Concurrent execution and IO
• Shared data, many producers and consumers

Slide 4

Slide 4 text

Haskell
• Concurrent execution
• Haskell has lightweight threads that are very efficient
• Haskell IO multiplexes all ongoing IO requests using efficient operating system primitives such as epoll on Linux. Thus applications with lots of lightweight threads, all doing IO simultaneously, perform very well

Slide 5

Slide 5 text

Haskell
• Shared data, many producers and consumers
• Haskell provides powerful primitives which make it easy to precisely control and synchronise access to shared data

Slide 6

Slide 6 text

Scope of this talk
• Only explicit concurrency
• Discussion of Haskell's concurrency API
• No special syntax, only library functions
• Rationale behind the API design
• Solving problems from first principles

Slide 7

Slide 7 text

Haskell is a Pure Language
• All expressions are pure
• Concurrency does not make sense for pure expressions; parallelism still does
• But you can still perform effects with the IO monad, e.g. putStrLn :: String -> IO ()
• Effectful values have to be sequenced to have an effect

Slide 8

Slide 8 text

Sequencing Effects
• putStrLn "Hello" is a pure value
• printMyValue = putStrLn "Hello" does not actually print anything
• The *only* way to sequence effects is by declaring *one* to be the main effect:
• main = printMyValue prints "Hello"!

Slide 9

Slide 9 text

Multiple Effects?
• Haskell provides high level functions to sequence effectful values together, e.g. monadic bind.
    (>>) :: IO a -> IO b -> IO b

Slide 10

Slide 10 text

Multiple Effects?
• Haskell provides high level functions to compose effectful values together, e.g. monadic bind.
    (>>) :: IO a -> IO b -> IO b
    (>>=) :: IO a -> (a -> IO b) -> IO b
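A minimal sketch of the two operators, using only the Prelude. The names sayHello, lengthOf and program are made up for illustration. Defining them performs no effects; only main runs them, in the order the operators chain them.

```haskell
module Main where

-- An effectful value; defining it prints nothing.
sayHello :: IO ()
sayHello = putStrLn "Hello"

-- Hypothetical helper: an effectful value that yields a result.
lengthOf :: String -> IO Int
lengthOf s = pure (length s)

-- (>>) runs the left action, discards its result, then runs the right one.
-- (>>=) feeds the left action's result to the function on its right.
program :: IO Int
program = sayHello >> lengthOf "Hello" >>= \n -> print n >> pure n

main :: IO ()
main = program >> pure ()
```

Both operators associate to the left, so program reads as "say hello, then compute the length, then print and return it".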

Slide 11

Slide 11 text

Combinators all the way
• Combinator: a function that can combine multiple values into a single value
• Bind is a combinator. Haskell is big on combinators!
• Makes sense that the concurrency APIs are also combinator-heavy
• No special syntax needed, just function calls
• Combinators compose
• So concurrent code feels like using Lego bricks to build larger programs

Slide 12

Slide 12 text

Haskell's philosophy
• Like Lego
• Haskell provides a small set of orthogonal, general purpose APIs for concurrency
• That you can combine in multiple ways to create the desired functionality
• That means you can build your own models of concurrency, including:
  • Shared memory
  • Transactions
  • Actor models
  • Etc.

Slide 13

Slide 13 text

"Easiest" concurrency model
• Inversion of control, a.k.a. callbacks
    -- This call returns immediately
    getFoo params \fooResult ->
      -- This code will run only when the result is available
      use fooResult
    ... Meanwhile do something else ...

Slide 14

Slide 14 text

forkIO
• forkIO just runs some IO action in a separate thread
    forkIO :: IO () -> IO ThreadId
• Something like getFoo could be implemented using forkIO:
    getFoo params handler = forkIO do
      fooResult <- makeAPICall params
      handler fooResult
• Note the monadic sequencing to invoke the handler
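A runnable sketch of the callback style, using only base. makeAPICall here is a hypothetical stand-in that just doubles its argument, and the MVar named done is an assumption added so that main can wait for the forked thread before exiting.

```haskell
module Main where

import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-in for a slow API call.
makeAPICall :: Int -> IO Int
makeAPICall n = pure (n * 2)

-- Returns immediately; the handler runs later, on the forked thread.
getFoo :: Int -> (Int -> IO ()) -> IO ThreadId
getFoo params handler = forkIO $ do
  fooResult <- makeAPICall params
  handler fooResult

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- getFoo 21 (putMVar done)
  putStrLn "meanwhile, the main thread keeps going"
  result <- takeMVar done -- block until the handler has run
  print result
```

Without the final takeMVar the program could exit before the handler runs, because the runtime stops all threads when main returns.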

Slide 15

Slide 15 text

Parametricity
• Inlining doesn't change the meaning of a program. So this program:
    getFoo params \fooResult ->
      use fooResult
    meanwhile do something else
• Is equivalent to:
    forkIO do
      fooResult <- makeAPICall params
      use fooResult
    meanwhile do something else

Slide 16

Slide 16 text

Parallel calls
• Running getFoo and getBar together
• Use
    concurrently :: IO a -> IO b -> IO (a,b)
• Usage
    (fooResult, barResult) <- concurrently getFoo getBar
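A small sketch of concurrently, assuming the async package (module Control.Concurrent.Async) is installed. getFoo and getBar are hypothetical actions that sleep briefly and return fixed values.

```haskell
module Main where

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (concurrently)

-- Hypothetical actions; threadDelay takes microseconds.
getFoo :: IO Int
getFoo = threadDelay 100000 >> pure 1

getBar :: IO String
getBar = threadDelay 100000 >> pure "bar"

main :: IO ()
main = do
  -- Both actions run at the same time; we wait for both results.
  (fooResult, barResult) <- concurrently getFoo getBar
  print (fooResult, barResult)
```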

Slide 17

Slide 17 text

What happens if one fails?
    (fooResult, barResult) <- concurrently getFoo getBar
• If either action throws an exception at any time, then the other action is cancelled, and the exception is re-thrown by concurrently.

Slide 18

Slide 18 text

What if we want to do something different?
• Run getFoo and getBar together, cancel the other as soon as one returns
• Use
    race :: IO a -> IO b -> IO (Either a b)
• Usage
    res <- race getFoo getBar
    case res of
      Left fooResult -> do something with Foo
      Right barResult -> do something with Bar
• Sum types FTW!
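A sketch of race, again assuming the async package. The two actions and their delays are made up so that the second one is very likely to win; the loser is cancelled by race.

```haskell
module Main where

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)

-- Hypothetical actions with very different latencies (microseconds).
slow :: IO Int
slow = threadDelay 500000 >> pure 1

fast :: IO String
fast = threadDelay 10000 >> pure "bar"

-- The sum type tells us which side finished first.
describe :: Either Int String -> String
describe (Left n) = "foo won: " ++ show n
describe (Right s) = "bar won: " ++ s

main :: IO ()
main = race slow fast >>= putStrLn . describe
```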

Slide 19

Slide 19 text

More than two actions
• Use Concurrently with the Applicative instance
    newtype Concurrently a = Concurrently (IO a)
    (<$>) :: (a -> b) -> Concurrently a -> Concurrently b
    (<*>) :: Concurrently (a -> b) -> Concurrently a -> Concurrently b
    action :: Concurrently (Foo, Bar, Baz)
    action = mk3Tuple <$> getFoo <*> getBar <*> getBaz
• Note: separated for clarity
    mk3Tuple = (,,)

Slide 20

Slide 20 text

How'd that work again?
• Let the types guide you:
    mk3Tuple :: a -> (b -> (c -> (a,b,c)))
    (<$>) :: (u -> v) -> Concurrently u -> Concurrently v
    (mk3Tuple <$>) :: Concurrently a -> Concurrently (b -> (c -> (a,b,c)))
    getFoo :: Concurrently Foo
    (mk3Tuple <$> getFoo) :: Concurrently (b -> (c -> (Foo,b,c)))
    (<*> getBar) :: Concurrently (Bar -> b) -> Concurrently b
    (mk3Tuple <$> getFoo <*> getBar) :: Concurrently (c -> (Foo,Bar,c))
    (mk3Tuple <$> getFoo <*> getBar <*> getBaz) :: Concurrently (Foo,Bar,Baz)

Slide 21

Slide 21 text

Racing multiple actions
• What if we want to run multiple actions, but return the result from the first one to complete?
• Use the Alternative instance with Concurrently!
    (<|>) :: Concurrently a -> Concurrently a -> Concurrently a
    action = getFoo <|> getBar <|> getBaz
• Error: can't match Foo with Bar with Baz

Slide 22

Slide 22 text

Racing multiple actions
    data OneOfThree a b c = First a | Second b | Third c
    (<|>) :: Concurrently a -> Concurrently a -> Concurrently a
    (<$>) :: (a -> b) -> Concurrently a -> Concurrently b
    First <$> getFoo :: Concurrently (OneOfThree Foo b c)
    Second <$> getBar :: Concurrently (OneOfThree a Bar c)
    Third <$> getBaz :: Concurrently (OneOfThree a b Baz)
    action :: Concurrently (OneOfThree Foo Bar Baz)
    action = (First <$> getFoo) <|> (Second <$> getBar) <|> (Third <$> getBaz)
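The same idea as a runnable sketch, assuming the async package. Foo, Bar and Baz are replaced by Int, String and Bool, and the delays are invented so that getBar is very likely to finish first.

```haskell
module Main where

import Control.Applicative ((<|>))
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (Concurrently (..))

data OneOfThree a b c = First a | Second b | Third c
  deriving (Show, Eq)

-- Hypothetical actions; delays are in microseconds.
getFoo :: Concurrently Int
getFoo = Concurrently (threadDelay 300000 >> pure 1)

getBar :: Concurrently String
getBar = Concurrently (threadDelay 10000 >> pure "bar")

getBaz :: Concurrently Bool
getBaz = Concurrently (threadDelay 300000 >> pure True)

-- Wrapping each result in a constructor gives all three the same type,
-- so (<|>) can race them.
action :: Concurrently (OneOfThree Int String Bool)
action = (First <$> getFoo) <|> (Second <$> getBar) <|> (Third <$> getBaz)

main :: IO ()
main = runConcurrently action >>= print
```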

Slide 23

Slide 23 text

Timeout function
• threadDelay can be used to wait for a given time (in microseconds)
    threadDelay :: Int -> IO ()
    timeout :: Int -> IO a -> IO (Maybe a)
    timeout us io = either id id <$> race (Nothing <$ threadDelay us) (Just <$> io)
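A runnable sketch of a race-based timeout, assuming the async package. It is named timeoutSketch to avoid confusion with System.Timeout.timeout from base, which has the same type. Since race returns an Either, the sketch collapses it with either id id.

```haskell
module Main where

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)

-- Race the action against a timer; the delay is in microseconds.
timeoutSketch :: Int -> IO a -> IO (Maybe a)
timeoutSketch micros io =
  either id id <$> race (Nothing <$ threadDelay micros) (Just <$> io)

main :: IO ()
main = do
  -- The action finishes well before the half-second timer.
  r1 <- timeoutSketch 500000 (pure "quick")
  print r1
  -- The timer (10 ms) fires long before the action (500 ms) finishes.
  r2 <- timeoutSketch 10000 (threadDelay 500000 >> pure "slow")
  print r2
```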

Slide 24

Slide 24 text

Manipulating a data structure concurrently
• Assume a function:
    getURL :: String -> Concurrently String
• We need to map over a list of URLs, fetching their contents concurrently, and returning all of them in a list
    mapConcurrently :: (a -> Concurrently b) -> [a] -> Concurrently [b]
• Haskellers would recognise that mapConcurrently = traverse
• traverse requires Concurrently to be Applicative, which as we saw it already is
    traverse getURL ["foo.com", "bar.com", "baz.com"]
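A sketch of the traverse approach, assuming the async package. getURL is a hypothetical stand-in that fabricates a string instead of doing a network request. Note that the async package's own mapConcurrently works on plain IO functions, with type Traversable t => (a -> IO b) -> t a -> IO (t b).

```haskell
module Main where

import Control.Concurrent.Async (Concurrently (..))

-- Hypothetical stand-in for a real HTTP fetch.
getURL :: String -> Concurrently String
getURL url = Concurrently (pure ("contents of " ++ url))

-- traverse runs one action per URL concurrently and keeps list order.
fetchAll :: [String] -> IO [String]
fetchAll urls = runConcurrently (traverse getURL urls)

main :: IO ()
main = fetchAll ["foo.com", "bar.com", "baz.com"] >>= mapM_ putStrLn
```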

Slide 25

Slide 25 text

Tree of Actions
• This is what we had in the previous slide:
    traverse getURL ["foo.com", "bar.com", "baz.com"]
• This launches 3 threads, and an exception in any of them will kill all the other threads as well.
• Using race or concurrently, we are building a tree of threads, where all the threads are always cleaned up.
• For now, we will not discuss exceptions further in this presentation

Slide 26

Slide 26 text

Cases for which there's no prebuilt abstraction
• That actually happens less than you think
• Note that I said "prebuilt", not "inbuilt"
• All of these functions are written, in user code, upon lower level primitives
• The implementations are very simple. Usually you can count the number of lines on one hand.
• You can write your own abstractions very easily
• But most of the time, you will just reach into your standard Haskell toolbox

Slide 27

Slide 27 text

A complicated flow
• Race two actions getInt and getBool concurrently.
• If getBool finishes first, and is true, then answer with 100, cancelling getInt
• Else wait for and return the result of getInt
• Since we need to inspect the result of one action before we decide whether to cancel the other action, this is not doable with race and concurrently

Slide 28

Slide 28 text

Multiple sequential calls
    getFoo params \fooResult -> do
      getBar fooResult \barResult -> do
        getBaz fooResult barResult \bazResult -> do
          do other things…
• Callback hell!

Slide 29

Slide 29 text

Do notation
    do
      fooResult <- getFoo params
      barResult <- getBar fooResult
      bazResult <- getBaz fooResult barResult
      do other things…
• Much better, but how?

Slide 30

Slide 30 text

Shared data between threads
    MVar a
    newEmptyMVar :: IO (MVar a)
    putMVar :: MVar a -> a -> IO ()
    takeMVar :: MVar a -> IO a
• These are blocking operations: putMVar waits while the MVar is full, takeMVar waits while it is empty
• Hence they are also synchronised

Slide 31

Slide 31 text

Also remember ThreadId
    forkIO :: IO () -> IO ThreadId
• You can control the thread with the ThreadId
• For example, cancel the thread
    killThread :: ThreadId -> IO ()

Slide 32

Slide 32 text

A complicated flow
    complicatedFlow = do
      v <- newEmptyMVar
      threadInt <- forkIO (getInt >>= intHandler v)
      threadBool <- forkIO (getBool >>= boolHandler v threadInt)
      takeMVar v
      where
        boolHandler v threadInt b = when b do
          killThread threadInt
          putMVar v 100
        intHandler v i = putMVar v i
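A runnable sketch of this flow using only base, with newEmptyMVar and killThread from Control.Concurrent. Taking getBool and getInt as parameters is an assumption made here so the two branches can be exercised; the delays in main are invented.

```haskell
module Main where

import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (when)

complicatedFlow :: IO Bool -> IO Int -> IO Int
complicatedFlow getBool getInt = do
  v <- newEmptyMVar
  -- Fork getInt first so its ThreadId is available to the other thread.
  threadInt <- forkIO (getInt >>= putMVar v)
  _ <- forkIO $ do
    b <- getBool
    when b $ do
      killThread threadInt
      putMVar v 100
  -- Whichever thread writes first provides the answer.
  takeMVar v

main :: IO ()
main = do
  -- getBool is True and finishes first: getInt is cancelled.
  r1 <- complicatedFlow (pure True) (threadDelay 300000 >> pure 1)
  print r1
  -- getBool is False: we wait for getInt.
  r2 <- complicatedFlow (pure False) (threadDelay 10000 >> pure 1)
  print r2
```

One caveat of this sketch: if both threads reach putMVar before the main thread takes the value, the second writer stays blocked. Using tryPutMVar instead would avoid that.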

Slide 33

Slide 33 text

Async - Await (but better)
    data Promise a = Promise (MVar a) ThreadId
    async :: IO a -> IO (Promise a)
    async action = do
      var <- newEmptyMVar
      tid <- forkIO (action >>= putMVar var)
      return (Promise var tid)
    await :: Promise a -> IO a
    await (Promise var _) = takeMVar var
    cancel :: Promise a -> IO ()
    cancel (Promise _ tid) = killThread tid
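A self-contained sketch of this Promise type using only base. One deliberate deviation, labelled in the code: await uses readMVar rather than takeMVar, so the same promise can be awaited more than once. The sketch ignores exceptions; if the action throws, await blocks, which is one of the problems the real async package solves.

```haskell
module Main where

import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

data Promise a = Promise (MVar a) ThreadId

-- Start the action on a new thread; its result lands in the MVar.
async :: IO a -> IO (Promise a)
async action = do
  var <- newEmptyMVar
  tid <- forkIO (action >>= putMVar var)
  pure (Promise var tid)

-- Assumption: readMVar (not takeMVar) leaves the value in place,
-- so awaiting twice does not block forever.
await :: Promise a -> IO a
await (Promise var _) = readMVar var

cancel :: Promise a -> IO ()
cancel (Promise _ tid) = killThread tid

main :: IO ()
main = do
  p <- async (threadDelay 10000 >> pure (6 * 7 :: Int))
  putStrLn "doing other work while the promise runs"
  r <- await p
  print r
```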

Slide 34

Slide 34 text

A complicated flow
    complicatedFlow = do
      boolPromise <- async getBool
      intPromise <- async getInt
      v <- race
        (Left <$> await boolPromise)
        (Right <$> await intPromise)
      case v of
        Left True -> do
          cancel intPromise
          pure 100
        Left False -> await intPromise
        Right i -> pure i

Slide 35

Slide 35 text

Monad instance
    instance Monad Promise where
      return a = Promise (return a) ???
      Promise v tid >>= f = Promise ???
• We need a ThreadId, but don't have one
• We need to construct a new MVar for synchronisation but don't have an IO context
• Remember, (>>=) :: Promise a -> (a -> Promise b) -> Promise b

Slide 36

Slide 36 text

Monad Instance Tweaks
    data Promise a = Promise { runPromise :: IO (IO a, IO ()) }
    async :: IO a -> IO (Promise a)
    async action = do
      var <- newEmptyMVar
      tid <- forkIO (action >>= putMVar var)
      return (Promise (return (takeMVar var, killThread tid)))
    await :: Promise a -> IO a
    await p = do
      (take, _) <- runPromise p
      take
    cancel :: Promise a -> IO ()
    cancel p = do
      (_, cancel) <- runPromise p
      cancel

Slide 37

Slide 37 text

Monad instance, Try 2
    instance Monad Promise where
      return a = Promise (return (return a, return ()))
      x >>= f = Promise do
        var <- newEmptyMVar
        cvar <- newEmptyMVar
        t1 <- forkIO do
          (take1, cancel1) <- runPromise x
          putMVar cvar cancel1
          a <- take1
          (take2, cancel2) <- runPromise (f a)
          putMVar var take2
          _ <- takeMVar cvar
          putMVar cvar cancel2
        return (join (takeMVar var), killThread t1 >> join (takeMVar cvar))

Slide 38

Slide 38 text

Multiple sequential calls
    do
      fooResult <- await $ getFoo params
      barResult <- await $ getBar fooResult
      bazResult <- await $ getBaz fooResult barResult
      do other things…

Slide 39

Slide 39 text

Software Transactional Memory
• Looks very similar to IO and MVar
    data STM a
    instance Monad STM
    data TVar a
    newTVar :: a -> STM (TVar a)
    readTVar :: TVar a -> STM a
    writeTVar :: TVar a -> a -> STM ()

Slide 40

Slide 40 text

But More
• Conversion to IO
    atomically :: STM a -> IO a
• A concurrency operator
    orElse :: STM a -> STM a -> STM a
• And this mysterious function
    retry :: STM a
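A short sketch of how retry and orElse fit together, assuming the stm package (module Control.Concurrent.STM, shipped with GHC). takeSlot and takeEither are hypothetical helpers: retry abandons the transaction until one of the TVars it read changes, and orElse tries the second transaction when the first one retries.

```haskell
module Main where

import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, orElse, readTVar, retry, writeTVar)

-- Take the value out of a slot, or retry if the slot is empty.
takeSlot :: TVar (Maybe a) -> STM a
takeSlot slot = do
  m <- readTVar slot
  case m of
    Nothing -> retry
    Just x -> writeTVar slot Nothing >> pure x

-- Try the first slot; if that would retry, try the second instead.
takeEither :: TVar (Maybe a) -> TVar (Maybe a) -> STM a
takeEither a b = takeSlot a `orElse` takeSlot b

main :: IO ()
main = do
  a <- newTVarIO Nothing
  b <- newTVarIO (Just "from b")
  r <- atomically (takeEither a b)
  putStrLn r
```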

Slide 41

Slide 41 text

Why?
• Automatic, no-sweat synchronisation
• No need for locking, broadcast channels etc.
• No deadlocks!

Slide 42

Slide 42 text

STM Example
• IN: shared list of URLs
• OUT: shared list of URL contents
• We have a producer that adds URLs to the IN list
• Multiple consumers that take URLs from the IN list, make network requests, and push the contents to the OUT list

Slide 43

Slide 43 text

No Sweat
• Shared data
    inList :: TVar [String]
    outList :: TVar [String]
    producer = forever do
      url <- generateURL
      atomically do
        urls <- readTVar inList
        writeTVar inList (url : urls)

Slide 44

Slide 44 text

No Sweat
    consumer = forever do
      url <- atomically do
        urls <- readTVar inList
        case urls of
          [] -> retry
          url : tail -> do
            writeTVar inList tail
            return url
      contents <- fetch url
      atomically do
        tail <- readTVar outList
        writeTVar outList (contents : tail)
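A runnable sketch of the whole pipeline, assuming the stm package. fetch is a hypothetical stand-in for a network request, and runPipeline, pop and push are names invented here. The TVars are passed as arguments instead of being top-level values, and the producer is a finite loop so that the program terminates.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (STM, TVar, atomically, modifyTVar', newTVarIO, readTVar, retry, writeTVar)
import Control.Monad (forM_, forever, replicateM_)

-- Remove the head of a shared list, retrying while it is empty.
pop :: TVar [a] -> STM a
pop var = do
  xs <- readTVar var
  case xs of
    [] -> retry
    (x : rest) -> writeTVar var rest >> pure x

push :: TVar [a] -> a -> STM ()
push var x = modifyTVar' var (x :)

-- Hypothetical stand-in for a real network fetch.
fetch :: String -> IO String
fetch url = pure ("contents of " ++ url)

consumer :: TVar [String] -> TVar [String] -> IO ()
consumer inList outList = forever $ do
  url <- atomically (pop inList)
  contents <- fetch url
  atomically (push outList contents)

runPipeline :: [String] -> IO [String]
runPipeline urls = do
  inList <- newTVarIO []
  outList <- newTVarIO []
  replicateM_ 2 (forkIO (consumer inList outList))
  forM_ urls (atomically . push inList)
  -- Block (via retry) until every URL has produced a result.
  atomically $ do
    out <- readTVar outList
    if length out < length urls then retry else pure out

main :: IO ()
main = runPipeline ["foo.com", "bar.com", "baz.com"] >>= mapM_ putStrLn
```

The order of the results depends on thread scheduling, so the sketch makes no promise about it.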

Slide 45

Slide 45 text

Other examples
• Distributed-process framework: the Actor Model
  https://github.com/haskell-distributed/distributed-process/wiki/The-Actor-Model
• Communicating Haskell Processes: message passing via channels
  https://www.cs.kent.ac.uk/projects/ofa/chp/

Slide 46

Slide 46 text

Thank You