Slide 1

Slide 1 text

Software Transactional Memory in Haskell (an overview of the implementation)

Slide 2

Slide 2 text

Let's start with WHY

Slide 3

Slide 3 text

FACT: Many modern applications have increasingly stringent concurrency requirements

Slide 4

Slide 4 text

FACT: Commodity multicore systems are increasingly affordable and available

Slide 5

Slide 5 text

FACT: The design and implementation of correct, efficient, and scalable concurrent software remains a daunting task

Slide 6

Slide 6 text

Haskell to the rescue! Meet STM

Slide 7

Slide 7 text

STM protects shared state in concurrent programs

Slide 8

Slide 8 text

STM provides a more user-friendly and scalable alternative to locks by promoting the notion of memory transactions as first-class citizens

Slide 9

Slide 9 text

Transactions, like many of the best ideas in computer science, originated in the data engineering world

Slide 10

Slide 10 text

Transactions are one of the foundations of database technology

Slide 11

Slide 11 text

Full-fledged transactions are defined by the ACID properties. Memory transactions use two of them (A+I).

Slide 12

Slide 12 text

Transactions provide atomicity and isolation guarantees

Slide 13

Slide 13 text

Strong atomicity means all-or-nothing

Slide 14

Slide 14 text

Strong isolation means freedom from interference by other threads

Slide 15

Slide 15 text

Recall that Haskell is a statically typed, lazy, purely functional language

Slide 16

Slide 16 text

Pure means that functions with side-effects must be marked as such

Slide 17

Slide 17 text

The marking is done through the type system at compile time
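
As a minimal sketch of what this marking looks like, compare the two type signatures below. The function names `double` and `announceDouble` are made up for illustration.

```haskell
-- A pure function: the type promises that no side effects happen.
double :: Int -> Int
double x = x * 2

-- An effectful function: the IO in the result type is the marker.
-- It may call pure code freely, but pure code cannot call it.
announceDouble :: Int -> IO Int
announceDouble x = do
  let y = double x
  putStrLn ("double " ++ show x ++ " = " ++ show y)
  return y
```

Trying to use `announceDouble` inside `double` is rejected by the type checker, which is how the compiler enforces the separation.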

Slide 18

Slide 18 text

STM is just another kind of effect, alongside I/O (with a different marker: "STM a" instead of "IO a")

Slide 19

Slide 19 text

Transactional memory needs to be declared explicitly as a TVar

Slide 20

Slide 20 text

The STM library provides an STM-to-IO converter called "atomically"

Slide 21

Slide 21 text

Transactional memory can only be accessed through dedicated functions such as "readTVar", "writeTVar", and "modifyTVar", which can only be called inside STM blocks
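
The pieces introduced so far (the STM marker, TVar, the accessor functions, and "atomically") can be put together in a small sketch. The `Account` alias and the `transfer` and `transferIO` names are hypothetical, chosen only for this example.

```haskell
import Control.Concurrent.STM

-- Hypothetical account type for illustration: a balance held in a TVar.
type Account = TVar Int

-- Runs in STM, so it can touch TVars but cannot perform I/O.
transfer :: Account -> Account -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  writeTVar from (balance - amount)
  modifyTVar to (+ amount)

-- atomically converts the STM action into an IO action
-- whose effects become visible all at once or not at all.
transferIO :: Account -> Account -> Int -> IO ()
transferIO from to amount = atomically (transfer from to amount)
```

Accounts can be created with `newTVarIO 100` and inspected with `readTVarIO`, both of which run directly in IO.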

Slide 22

Slide 22 text

Implementation Overview Of GHC's STM

Slide 23

Slide 23 text

Definition: A transactional memory is a set of tuples of the form (Identity, Version, Value). The version number counts how many times the value has changed.
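
One way to picture this definition is a pure, single-threaded model in which the tuples are stored in a map keyed by identity. This is only a toy model, not how GHC lays out memory; the names `Memory`, `newCell`, and `update` are assumptions made for the sketch.

```haskell
import qualified Data.Map.Strict as Map

type Identity = Int   -- stands in for a TVar's address
type Version  = Int   -- number of committed updates so far

-- The (Identity, Version, Value) tuples, keyed by identity.
type Memory v = Map.Map Identity (Version, v)

-- Allocate a fresh cell at version 0.
newCell :: Identity -> v -> Memory v -> Memory v
newCell ident val = Map.insert ident (0, val)

-- Overwrite a cell's value and bump its version; unknown identities are ignored.
update :: Identity -> v -> Memory v -> Memory v
update ident val = Map.adjust (\(ver, _) -> (ver + 1, val)) ident
```

After two calls to `update` on the same identity, the stored version is 2.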

Slide 24

Slide 24 text

The Transactional Record: Every STM transaction keeps a record of state changes (similar to the tx log in the DB world)

Slide 25

Slide 25 text

STM performs all the effects of a transaction locally in the transactional record

Slide 26

Slide 26 text

Once the transaction has finished its work locally, a version-based consistency check determines if the values read for the entire access set are consistent

Slide 27

Slide 27 text

This version-based consistency check also acquires locks for the write set. While holding those locks, STM updates main memory and then releases them.

Slide 28

Slide 28 text

Rolling back the effects of a transaction means forgetting the current transactional record and starting again

Slide 29

Slide 29 text

Reading: When a readTVar is attempted, STM first searches the transactional record for an existing entry

Slide 30

Slide 30 text

Reading: If the entry is found, STM will use that local view of the TVar

Slide 31

Slide 31 text

Reading: On the first readTVar, a new entry is allocated and the TVar value is read and stored locally

Slide 32

Slide 32 text

Reading: The original TVar does not need to be accessed again for its value until validation time
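
The read path described on the last few slides can be sketched in a pure toy model. The types and the `readVar` function below are illustrative assumptions, not GHC's runtime structures: memory is a map from identity to (version, value), and the transactional record is a map of entries.

```haskell
import qualified Data.Map.Strict as Map

type Identity = Int
type Version  = Int
type Memory v = Map.Map Identity (Version, v)

-- One record entry: the version seen at first access, the local value,
-- and whether the transaction has written to it.
data Entry v = Entry { expectedVersion :: Version, localValue :: v, written :: Bool }

type Record v = Map.Map Identity (Entry v)

-- Read through the record. Returns Nothing if the identity does not exist.
readVar :: Memory v -> Identity -> Record v -> Maybe (v, Record v)
readVar mem ident record =
  case Map.lookup ident record of
    Just entry -> Just (localValue entry, record)   -- existing entry: use the local view
    Nothing    -> do
      (ver, val) <- Map.lookup ident mem            -- first read: go to main memory
      Just (val, Map.insert ident (Entry ver val False) record)
```

In this model a second `readVar` on the same identity returns the locally stored value even if main memory has changed in the meantime; that discrepancy is left for validation to catch.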

Slide 33

Slide 33 text

Writing: Writing to a TVar requires that the variable first be in the transactional record

Slide 34

Slide 34 text

Writing: If it is not currently in the transactional record, a readTVar is performed and the value is stored in a new entry

Slide 35

Slide 35 text

Writing: The version in this entry will be used at validation time to ensure that no updates were made concurrently to this TVar

Slide 36

Slide 36 text

Writing: The value is stored locally in the transactional record until commit time
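
The write path can be sketched in the same kind of pure toy model. Again, the types and the `writeVar` name are assumptions for illustration and do not mirror GHC's actual data structures.

```haskell
import qualified Data.Map.Strict as Map

type Identity = Int
type Version  = Int
type Memory v = Map.Map Identity (Version, v)

-- One record entry: the version seen at first access, the local value,
-- and whether the transaction has written to it.
data Entry v = Entry { expectedVersion :: Version, localValue :: v, written :: Bool }

type Record v = Map.Map Identity (Entry v)

-- Write into the record only; main memory is left untouched until commit.
-- Returns Nothing if the identity does not exist.
writeVar :: Memory v -> Identity -> v -> Record v -> Maybe (Record v)
writeVar mem ident new record =
  case Map.lookup ident record of
    Just entry ->
      Just (Map.insert ident (entry { localValue = new, written = True }) record)
    Nothing -> do
      (ver, _) <- Map.lookup ident mem   -- implicit read to capture the version
      Just (Map.insert ident (Entry ver new True) record)
```

The captured `expectedVersion` is what validation later compares against main memory.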

Slide 37

Slide 37 text

Validation: Before a transaction can make its effects visible to other threads it must check that it has seen a consistent view of memory while it was executing

Slide 38

Slide 38 text

Validation: This is done by checking that TVars hold their expected values (version comparison)

Slide 39

Slide 39 text

Validation: During validation, STM fetches the version numbers for all TVars and checks that they are consistent with its expectations

Slide 40

Slide 40 text

Validation: STM then acquires locks for the write set in ascending order of memory address

Slide 41

Slide 41 text

Validation: STM then reads and checks all version numbers again

Slide 42

Slide 42 text

Validation: If the version numbers are again consistent with its expectations, STM allows the commit to happen
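
A pure toy model can illustrate the two ingredients of validation: the version comparison and the lock order. Because the model is single-threaded, no real locks are taken; `writeSet` only computes the ascending order in which they would be acquired, and the second check corresponds to calling `consistent` again on memory as it stands after locking. All names here are assumptions for the sketch.

```haskell
import qualified Data.Map.Strict as Map

type Identity = Int
type Version  = Int
type Memory v = Map.Map Identity (Version, v)

data Entry v = Entry { expectedVersion :: Version, localValue :: v, written :: Bool }

type Record v = Map.Map Identity (Entry v)

-- True when every entry's expected version matches the version in memory.
consistent :: Memory v -> Record v -> Bool
consistent mem record = all ok (Map.toList record)
  where
    ok (ident, entry) =
      fmap fst (Map.lookup ident mem) == Just (expectedVersion entry)

-- Identities that would need locking, in ascending order
-- (Map.keys returns keys in ascending order).
writeSet :: Record v -> [Identity]
writeSet record = Map.keys (Map.filter written record)
```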

Slide 43

Slide 43 text

Committing: The desired atomicity is guaranteed by:
● Validation having witnessed all TVars with their respective expected values
● Locks being held for all of the TVars in the write set

Slide 44

Slide 44 text

Committing: STM proceeds to increment each locked TVar's num_updates (a.k.a. version) field

Slide 45

Slide 45 text

Committing: STM then writes the new values into the respective current_value fields, and releases the locks

Slide 46

Slide 46 text

Committing: While these updates happen one-by-one, any attempt to read from this set will spin while the lock is held
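
The commit step can be sketched in a pure toy model as well. Locks and spinning readers have no counterpart in a single-threaded model, so the sketch only shows the version bump and the value write-back, plus the rollback case where the record is simply thrown away. The `commit` and `consistent` names and the types are assumptions for illustration.

```haskell
import qualified Data.Map.Strict as Map

type Identity = Int
type Version  = Int
type Memory v = Map.Map Identity (Version, v)

data Entry v = Entry { expectedVersion :: Version, localValue :: v, written :: Bool }

type Record v = Map.Map Identity (Entry v)

-- True when every entry's expected version matches the version in memory.
consistent :: Memory v -> Record v -> Bool
consistent mem record = all ok (Map.toList record)
  where
    ok (ident, entry) =
      fmap fst (Map.lookup ident mem) == Just (expectedVersion entry)

-- On success, each written entry gets its version incremented and its new
-- value stored. On failure, Nothing tells the caller to forget the record
-- and run the transaction again.
commit :: Memory v -> Record v -> Maybe (Memory v)
commit mem record
  | consistent mem record =
      Just (Map.foldrWithKey apply mem (Map.filter written record))
  | otherwise = Nothing
  where
    apply ident entry =
      Map.insert ident (expectedVersion entry + 1, localValue entry)
```

Entries that were only read leave main memory unchanged; they take part in validation but not in the write-back.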

Slide 47

Slide 47 text

Another useful STM abstraction is the TChan, an unbounded FIFO channel

Slide 48

Slide 48 text

Once messages are written to a TChan, they are ready to be consumed by other threads (broadcasting is possible too)

Slide 49

Slide 49 text

TChans are useful when threads need to send signals to each other, as opposed to just accessing shared state
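
A minimal sketch of two threads signalling each other through a TChan follows. The `producer`, `consumer`, and `sumViaChannel` names, and the use of `Nothing` as an end-of-stream marker, are conventions invented for this example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM

-- Hypothetical worker: sends the numbers 1..n, then Nothing as an end marker.
producer :: TChan (Maybe Int) -> Int -> IO ()
producer chan n = do
  mapM_ (atomically . writeTChan chan . Just) [1 .. n]
  atomically (writeTChan chan Nothing)

-- readTChan blocks until a message is available; sum until the end marker.
consumer :: TChan (Maybe Int) -> IO Int
consumer chan = go 0
  where
    go acc = do
      msg <- atomically (readTChan chan)
      case msg of
        Just x  -> go (acc + x)
        Nothing -> return acc

-- Fork the producer and consume its messages on the calling thread.
sumViaChannel :: Int -> IO Int
sumViaChannel n = do
  chan <- newTChanIO
  _ <- forkIO (producer chan n)
  consumer chan
```

Evaluating `sumViaChannel 10` in GHCi returns 55. For broadcasting, the stm library offers `newBroadcastTChan` together with `dupTChan`, so that each consumer reads from its own copy of the channel.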

Slide 50

Slide 50 text

Compile your STM code with:
ghc -threaded -rtsopts program.hs
When running the program:
./program +RTS -N

Slide 51

Slide 51 text

Follow me on GitHub github.com/dserban