Slide 1

Slide 1 text

Concurrent Tries with Efficient Non-blocking Snapshots
Aleksandar Prokopec, Phil Bagwell, Martin Odersky (École Polytechnique Fédérale de Lausanne)
Nathan Bronson (Stanford)

Slide 2

Slide 2 text

Motivation
val numbers = getNumbers()
// compute square roots
numbers foreach { entry =>
  val x = entry.root
  val n = entry.number
  entry.root = 0.5 * (x + n / x)
  if (abs(entry.root - x) < eps) numbers.remove(entry)
}

Slide 3

Slide 3 text

Hash Array Mapped Tries (HAMT)

Slide 4

Slide 4 text

Hash Array Mapped Tries (HAMT) 0 = 000000₂

Slide 5

Slide 5 text

Hash Array Mapped Tries (HAMT) 0

Slide 6

Slide 6 text

Hash Array Mapped Tries (HAMT) 0 16 = 010000₂

Slide 7

Slide 7 text

Hash Array Mapped Tries (HAMT) 0 16

Slide 8

Slide 8 text

Hash Array Mapped Tries (HAMT) 0 16 4 = 000100₂

Slide 9

Slide 9 text

Hash Array Mapped Tries (HAMT) 16 0 4 = 000100₂

Slide 10

Slide 10 text

Hash Array Mapped Tries (HAMT) 16 0 4

Slide 11

Slide 11 text

Hash Array Mapped Tries (HAMT) 16 0 4 12 = 001100₂

Slide 12

Slide 12 text

Hash Array Mapped Tries (HAMT) 16 0 4 12 = 001100₂

Slide 13

Slide 13 text

Hash Array Mapped Tries (HAMT) 16 0 4 12

Slide 14

Slide 14 text

Hash Array Mapped Tries (HAMT) 16 33 0 4 12

Slide 15

Slide 15 text

Hash Array Mapped Tries (HAMT) 16 33 0 4 12 48

Slide 16

Slide 16 text

Hash Array Mapped Tries (HAMT) 16 0 4 12 48 33 37

Slide 17

Slide 17 text

Hash Array Mapped Tries (HAMT) 16 4 12 48 33 37 0 3

Slide 18

Slide 18 text

Hash Array Mapped Tries (HAMT) 4 12 16 20 25 33 37 0 1 8 9 3 48 57

Slide 19

Slide 19 text

Immutable HAMT • used as immutable maps in functional languages 4 12 16 20 25 33 37 0 1 8 9 3

Slide 20

Slide 20 text

Immutable HAMT • updates rewrite path from root to leaf 4 12 16 20 25 33 37 0 1 8 9 3 4 12 8 9 11 insert(11)

Slide 21

Slide 21 text

Immutable HAMT • updates rewrite path from root to leaf 4 12 16 20 25 33 37 0 1 8 9 3 4 12 8 9 11 insert(11) efficient updates - log_k(n)

Slide 22

Slide 22 text

Node compression 48 57 48 57 1 0 1 0 48 57 1 0 1 0 48 57 10 BITPOP(((1 << ((hc >> lev) & 0x1F)) - 1) & BMP)
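
The BITPOP expression above can be read as "count the occupied slots below this hash's slot". A minimal sketch, assuming 32-way branching (5 hash bits per level, as the 0x1F mask suggests); the names `NodeCompression`, `flagAndPos` and `isPresent` are illustrative and not taken from the slides:

```scala
// Sketch: locating an entry in a bitmap-compressed HAMT node.
// Assumption: 5 hash bits are consumed per level (mask 0x1F).
object NodeCompression {
  /** Returns (flag, pos): the bitmap bit for this hash at level `lev`, and the
    * index in the compact array where its entry lives or would be inserted. */
  def flagAndPos(hc: Int, lev: Int, bmp: Int): (Int, Int) = {
    val idx  = (hc >>> lev) & 0x1F                 // 5-bit slice of the hash for this level
    val flag = 1 << idx                            // bit that represents this slot
    val pos  = Integer.bitCount((flag - 1) & bmp)  // BITPOP over the lower slots only
    (flag, pos)
  }

  /** True if the slot selected by `hc` at level `lev` is occupied. */
  def isPresent(hc: Int, lev: Int, bmp: Int): Boolean =
    (bmp & flagAndPos(hc, lev, bmp)._1) != 0
}
```

With only slots 0 and 16 occupied, the entry for slot 16 sits at array index 1, because exactly one lower slot is set.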

Slide 23

Slide 23 text

Node compression 48 57 48 57 1 0 1 0 48 57 1 0 1 0 48 57 10 48 57

Slide 24

Slide 24 text

Ctrie Can mutable HAMT be modified to be thread-safe?

Slide 25

Slide 25 text

Ctrie insert 4 9 12 16 20 25 33 37 0 1 3 48 57 17 = 010001₂

Slide 26

Slide 26 text

Ctrie insert 4 9 12 16 20 25 33 37 0 1 3 48 57 17 = 010001₂ 16 17 1) allocate

Slide 27

Slide 27 text

Ctrie insert 4 9 12 20 25 33 37 0 1 3 48 57 17 = 010001₂ 16 17 2) CAS

Slide 28

Slide 28 text

Ctrie insert 4 9 12 20 25 33 37 0 1 3 48 57 17 = 010001₂ 16 17

Slide 29

Slide 29 text

Ctrie insert 4 9 12 33 37 0 1 3 48 57 18 = 010010₂ 16 17 20 25

Slide 30

Slide 30 text

Ctrie insert 4 9 12 33 37 0 1 3 48 57 18 = 010010₂ 16 17 20 25 1) allocate 16 17 18

Slide 31

Slide 31 text

Ctrie insert 4 9 12 33 37 0 1 3 48 57 18 = 010010₂ 20 25 2) CAS 16 17 18

Slide 32

Slide 32 text

Ctrie insert 4 9 12 33 37 0 1 3 48 57 18 = 010010₂ 20 25 2) CAS 16 17 18 Unless…

Slide 33

Slide 33 text

Ctrie insert 4 9 12 33 37 0 1 3 48 57 18 = 010010₂ 16 17 20 25 T1-1) allocate 16 17 18 Unless… 28 = 011100₂ T1 T2

Slide 34

Slide 34 text

Ctrie insert 4 9 12 0 1 3 18 = 010010₂ 16 17 20 25 T1-1) allocate 16 17 18 Unless… 28 = 011100₂ T1 T2 20 25 28 T2-1) allocate

Slide 35

Slide 35 text

Ctrie insert 4 9 12 0 1 3 18 = 010010₂ 16 17 20 25 T1-1) allocate 16 17 18 28 = 011100₂ T1 T2 20 25 28 T2-2) CAS

Slide 36

Slide 36 text

Ctrie insert 4 9 12 0 1 3 18 = 010010₂ 16 17 20 25 T1-2) CAS 16 17 18 28 = 011100₂ T1 T2 20 25 28 T2-2) CAS

Slide 37

Slide 37 text

Ctrie insert 4 9 12 0 1 3 18 = 010010₂ 16 17 20 25 16 17 18 28 = 011100₂ T1 T2 20 25 28 Lost insert!

Slide 38

Slide 38 text

Ctrie insert – 2nd attempt 4 9 12 0 1 3 16 17 20 25 Solution: I-nodes

Slide 39

Slide 39 text

Ctrie insert – 2nd attempt 4 9 12 0 1 3 16 17 20 25 18 = 010010₂ 28 = 011100₂ T1 T2

Slide 40

Slide 40 text

Ctrie insert – 2nd attempt 4 9 12 0 1 3 16 17 T1 T2 20 25 18 = 010010₂ 28 = 011100₂ 16 17 18 20 25 28 T2-1) allocate T1-1) allocate

Slide 41

Slide 41 text

Ctrie insert – 2nd attempt 4 9 12 0 1 3 16 17 T1 T2 20 25 16 17 18 20 25 28 T2-2) CAS T1-2) CAS

Slide 42

Slide 42 text

Ctrie insert – 2nd attempt 4 9 12 0 1 3 16 17 18 20 25 28

Slide 43

Slide 43 text

Ctrie insert – 2nd attempt 4 9 12 0 1 3 16 17 18 20 25 28 Idea: once added to the Ctrie, I-nodes remain present.
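
The allocate-then-CAS steps on the slides above can be sketched in a heavily simplified form. The sketch assumes a single level of I-nodes, each guarding an immutable set of keys, whereas a real Ctrie nests I-nodes and C-nodes to arbitrary depth. `INode` and `TinyCtrie` here are hypothetical names for illustration only, not the paper's implementation:

```scala
import java.util.concurrent.atomic.AtomicReference

// Sketch: an I-node is a stable, mutable indirection to an immutable node.
// Writers never mutate the node in place; they CAS the I-node's reference.
final class INode(initial: Set[Int]) {
  private val main = new AtomicReference[Set[Int]](initial)

  def read: Set[Int] = main.get

  @annotation.tailrec
  def insert(key: Int): Unit = {
    val old     = main.get      // read the current immutable node
    val updated = old + key     // 1) allocate an updated copy
    // 2) CAS on this I-node; if another thread changed it first, retry
    if (!main.compareAndSet(old, updated)) insert(key)
  }
}

// Assumption: a fixed array of I-nodes stands in for the trie's branching.
final class TinyCtrie(branches: Int) {
  private val slots = Array.fill(branches)(new INode(Set.empty[Int]))

  def insert(key: Int): Unit =
    slots(Math.floorMod(key, branches)).insert(key)

  def keys: Set[Int] = slots.iterator.flatMap(_.read).toSet
}
```

Because the I-node itself is never replaced, two threads updating different branches CAS different references, and two threads updating the same branch conflict on one reference, so the loser retries instead of silently losing its insert.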

Slide 44

Slide 44 text

Ctrie insert – 2nd attempt 4 9 12 0 1 3 16 17 18 20 25 28 Remove operation supported as well - details in the paper.

Slide 45

Slide 45 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28

Slide 46

Slide 46 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 0

Slide 47

Slide 47 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 0

Slide 48

Slide 48 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 0

Slide 49

Slide 49 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 0

Slide 50

Slide 50 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 1

Slide 51

Slide 51 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 2

Slide 52

Slide 52 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 3

Slide 53

Slide 53 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 5

Slide 54

Slide 54 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 5 actual size = 12

Slide 55

Slide 55 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 5 0 1 actual size = 12

Slide 56

Slide 56 text

Ctrie size 4 9 12 0 1 3 16 17 18 20 25 28 size = 5 0 1 CAS actual size = 11

Slide 57

Slide 57 text

Ctrie size 4 9 12 16 17 18 20 25 28 size = 5 0 1 actual size = 11

Slide 58

Slide 58 text

Ctrie size 4 9 12 16 17 18 20 25 28 size = 6 0 1 actual size = 11

Slide 59

Slide 59 text

Ctrie size 4 9 12 16 17 18 20 25 28 size = 6 0 1 actual size = 11 19

Slide 60

Slide 60 text

Ctrie size 4 9 12 16 17 18 20 25 28 size = 6 0 1 actual size = 11 16 17 18 19

Slide 61

Slide 61 text

Ctrie size 4 9 12 16 17 18 20 25 28 size = 6 0 1 actual size = 12 16 17 18 19 CAS

Slide 62

Slide 62 text

Ctrie size 4 9 12 20 25 28 size = 6 0 1 actual size = 12 16 17 18 19

Slide 63

Slide 63 text

Ctrie size 4 9 12 20 25 28 size = 6 0 1 actual size = 12 16 17 18 19

Slide 64

Slide 64 text

Ctrie size 4 9 12 20 25 28 size = 7 0 1 actual size = 12 16 17 18 19

Slide 65

Slide 65 text

Ctrie size 4 9 12 20 25 28 size = 8 0 1 actual size = 12 16 17 18 19

Slide 66

Slide 66 text

Ctrie size 4 9 12 20 25 28 size = 9 0 1 actual size = 12 16 17 18 19

Slide 67

Slide 67 text

Ctrie size 4 9 12 20 25 28 size = 10 0 1 actual size = 12 16 17 18 19

Slide 68

Slide 68 text

Ctrie size 4 9 12 20 25 28 size = 11 0 1 actual size = 12 16 17 18 19

Slide 69

Slide 69 text

Ctrie size 4 9 12 20 25 28 size = 12 0 1 actual size = 12 16 17 18 19

Slide 70

Slide 70 text

Ctrie size 4 9 12 20 25 28 size = 13 0 1 actual size = 12 16 17 18 19

Slide 71

Slide 71 text

Ctrie size 4 9 12 20 25 28 size = 13 0 1 actual size = 12 16 17 18 19 But the size was never 13!

Slide 72

Slide 72 text

Global state information 4 9 12 20 25 28 0 1 16 17 18 19 • size • find • filter • iterator

Slide 73

Slide 73 text

Global state information 4 9 12 20 25 28 0 1 16 17 18 19 • size • find • filter • iterator → snapshot

Slide 74

Slide 74 text

Snapshot using locks 4 9 12 20 25 28 0 1 16 17 18 19

Slide 75

Slide 75 text

Snapshot using locks 4 9 12 20 25 28 0 1 16 17 18 19 • copy expensive

Slide 76

Slide 76 text

Snapshot using locks 4 9 12 20 25 28 0 1 16 17 18 19 • copy expensive • not lock-free

Slide 77

Slide 77 text

Snapshot using locks 4 9 12 20 25 28 0 1 16 17 18 19 • copy expensive • not lock-free • can insert or remove remain lock-free? 0 1 2 CAS

Slide 78

Slide 78 text

Snapshot using locks 4 9 12 20 25 28 0 1 16 17 18 19 • copy expensive • not lock-free • can insert or remove remain lock-free? 0 1 2 CAS

Slide 79

Slide 79 text

Snapshot using logs 4 9 12 20 25 28 0 1 16 17 18 19 • keep a linked list of previous values in each I-node

Slide 80

Slide 80 text

Snapshot using logs 4 9 12 20 25 28 0 1 16 17 18 19 0 1 2 • keep a linked list of previous values in each I-node

Slide 81

Slide 81 text

Snapshot using logs 4 9 12 20 25 28 0 1 16 17 18 19 • keep a linked list of previous values in each I-node • when is it safe to delete old entries? 0 1 2

Slide 82

Slide 82 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 root

Slide 83

Slide 83 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 root

Slide 84

Slide 84 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 snapshot! root

Slide 85

Slide 85 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 snapshot! #2 root 1) create new I-node at #2

Slide 86

Slide 86 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 snapshot! #2 root 2) set snapshot snapshot #1

Slide 87

Slide 87 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 snapshot! #2 root 3) CAS root to new I-node snapshot #1

Slide 88

Slide 88 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root snapshot #1 2

Slide 89

Slide 89 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root snapshot #1 2 generation #2 - ok!

Slide 90

Slide 90 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root snapshot #1 2 generation #1 not ok, too old!

Slide 91

Slide 91 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root 1) create updated node at #2 snapshot #1 2 #2 #2

Slide 92

Slide 92 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root 2) CAS to the updated node snapshot #1 2 #2 #2

Slide 93

Slide 93 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root snapshot #1 2 #2 #2 #1 too old!

Slide 94

Slide 94 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root snapshot #1 2 #2 #2 4 9 12 #2 1) create updated node at #2

Slide 95

Slide 95 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root snapshot #1 2 #2 #2 4 9 12 #2 2) CAS

Slide 96

Slide 96 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 subsequent insert #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 finally, create a new leaf and CAS

Slide 97

Slide 97 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 another insert #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3

Slide 98

Slide 98 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 another insert #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3

Slide 99

Slide 99 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 But... this won't really work... why? #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3

Slide 100

Slide 100 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3 T2: remove 19 16 17 18

Slide 101

Slide 101 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3 T2: remove 19 16 17 18 CAS

Slide 102

Slide 102 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3 T2: remove 19 16 17 18 CAS How to fail this last CAS?

Slide 103

Slide 103 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3 T2: remove 19 16 17 18 DCAS How to fail this last CAS? DCAS

Slide 104

Slide 104 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3 T2: remove 19 16 17 18 How to fail this last CAS? DCAS - software based DCAS

Slide 105

Slide 105 text

Snapshot using immutability 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 0 1 2 3 T2: remove 19 16 17 18 How to fail this last CAS? DCAS - software based ...creates intermediate objects DCAS

Slide 106

Slide 106 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 T2: remove 19 16 17 18 prev 1) set prev field

Slide 107

Slide 107 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 T2: remove 19 16 17 18 prev 2) CAS

Slide 108

Slide 108 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 T2: remove 19 16 17 18 prev 3) read root generation

Slide 109

Slide 109 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 16 17 18 prev 4) if root generation changed CAS prev to FailedNode(prev) FN

Slide 110

Slide 110 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 16 17 18 prev 4) if root generation changed CAS prev to FailedNode(prev) FN

Slide 111

Slide 111 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 16 17 18 prev 5) CAS to previous value FN

Slide 112

Slide 112 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 16 17 18 prev 4) if root generation unchanged CAS prev to null

Slide 113

Slide 113 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 16 17 18 4) if root generation unchanged CAS prev to null

Slide 114

Slide 114 text

GCAS - generation-compare-and-swap 4 9 12 20 25 28 0 1 16 17 18 19 #1 #1 #1 #1 #1 #2 root snapshot #1 #2 #2 4 9 12 #2 0 1 2 3 1) Replace all CAS with GCAS 2) Replace all READ with GCAS_READ (which checks if prev field is null)
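
The numbered GCAS steps on the preceding slides can be sketched roughly as follows. This is a simplified model under stated assumptions: the root's generation is represented by a plain `AtomicReference[Gen]` parameter (`rootGen`), main nodes hold only a set of keys, and read-only snapshots are ignored. The names `Leaves`, `commit` and `rootGen` are hypothetical; only `prev`, `FailedNode`, GCAS and GCAS_READ come from the slides:

```scala
import java.util.concurrent.atomic.AtomicReference

final class Gen // generation stamp, compared by identity

sealed abstract class MainNode {
  // Non-null while a GCAS that installed this node is still uncommitted.
  val prev = new AtomicReference[MainNode]()
}
final class Leaves(val keys: Set[Int]) extends MainNode
final class FailedNode(p: MainNode) extends MainNode { prev.set(p) }

final class INode(init: MainNode, val gen: Gen) {
  private val main = new AtomicReference[MainNode](init)

  def gcas(old: MainNode, n: MainNode, rootGen: AtomicReference[Gen]): Boolean = {
    n.prev.set(old)                    // 1) set prev field
    if (main.compareAndSet(old, n)) {  // 2) CAS
      commit(n, rootGen)               // 3)-5) commit or roll back
      n.prev.get == null               // committed only if prev was cleared
    } else false
  }

  // GCAS_READ: readers help finish a pending GCAS before using the node.
  def gcasRead(rootGen: AtomicReference[Gen]): MainNode = {
    val m = main.get
    if (m.prev.get == null) m else commit(m, rootGen)
  }

  @annotation.tailrec
  private def commit(m: MainNode, rootGen: AtomicReference[Gen]): MainNode =
    m.prev.get match {
      case null => m                   // already committed
      case fn: FailedNode =>           // 5) CAS back to the previous value
        val old = fn.prev.get
        if (main.compareAndSet(m, old)) old else commit(main.get, rootGen)
      case p =>
        if (rootGen.get eq gen) {      // 3) read root generation
          // 4) generation unchanged: CAS prev to null
          if (m.prev.compareAndSet(p, null)) m else commit(m, rootGen)
        } else {
          // 4) generation changed: CAS prev to FailedNode(prev)
          m.prev.compareAndSet(p, new FailedNode(p))
          commit(main.get, rootGen)
        }
    }
}
```

The design point is that a write becomes visible only once `prev` is cleared, so a snapshot that changes the root generation in between can still force the write to fail without needing a hardware DCAS.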

Slide 115

Slide 115 text

Snapshot-based iterator
def iterator =
  if (isSnapshot) new Iterator(root)
  else snapshot().iterator()
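
As a usage illustration, the Scala standard library ships a Ctrie implementation as `scala.collection.concurrent.TrieMap`, whose iterators traverse a snapshot. The sketch below, with a hypothetical `SnapshotDemo` object and made-up data, shows the pattern from the Motivation slide: removing entries while iterating, without the iterator observing those removals.

```scala
import scala.collection.concurrent.TrieMap

object SnapshotDemo {
  /** Returns (size of the earlier snapshot, size of the live map after removals). */
  def run(): (Int, Int) = {
    val numbers = TrieMap[Int, Double]()
    (1 to 100).foreach(n => numbers.put(n, n.toDouble))

    // A read-only snapshot keeps a consistent view of the map at this point.
    val snap = numbers.readOnlySnapshot()

    // The iterator walks its own snapshot, so removing from the live map
    // during the traversal does not disturb it.
    numbers.iterator.foreach { case (n, _) =>
      if (n % 2 == 0) numbers.remove(n)
    }

    (snap.size, numbers.size)
  }
}
```

The snapshot still reports all 100 entries afterwards, while the live map retains only the 50 odd keys.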

Slide 116

Slide 116 text

Snapshot-based size
def size = {
  var sz = 0
  val it = iterator
  while (it.hasNext) { it.next(); sz += 1 }
  sz
}

Slide 117

Slide 117 text

Snapshot-based size
def size = {
  var sz = 0
  val it = iterator
  while (it.hasNext) { it.next(); sz += 1 }
  sz
}
The above is O(n). But by caching the size in nodes, it becomes amortized O(log_k n) (see source code).

Slide 118

Slide 118 text

Snapshot-based atomic clear
def clear() = {
  val or = READ(root)
  val nr = new INode(new Gen)
  if (!CAS(root, or, nr)) clear()
}
(roughly)

Slide 119

Slide 119 text

Evaluation - quad core i7

Slide 120

Slide 120 text

Evaluation – UltraSPARC T2

Slide 121

Slide 121 text

Evaluation – 4x 8-core i7

Slide 122

Slide 122 text

Evaluation – snapshot

Slide 123

Slide 123 text

Conclusion
• snapshots are linearizable and lock-free
• snapshots take constant time
• snapshots are horizontally scalable
• snapshots add no significant overhead to the algorithm if they aren't used
• the approach may be applicable to tree-based lock-free data structures in general (intuition)

Slide 124

Slide 124 text

Thank you!