Slide 1

Slide 1 text

Lightweight Affine Types for Safe Concurrency in Scala
Philipp Haller
KTH Royal Institute of Technology, Stockholm, Sweden
VIMPL 2024, Lund, Sweden, March 12, 2024

Slide 2

Slide 2 text

Goal: Robust, large-scale concurrent and distributed programming
• Reconcile
  – Fault tolerance
  – Scalability
  – Safety
• Provide programming models and languages applicable to a variety of distributed applications
  – Instead of building single "one-off" systems for specific domains

Slide 3

Slide 3 text

A safety challenge: data races
What is a data race?
• A data race occurs
  – when two tasks (threads, processes, actors) concurrently access the same shared variable (or object field) and
  – at least one of the accesses is a write (an assignment)
• In practice, data races are difficult to find and fix → heisenbugs

Slide 4

Slide 4 text

Data race: an example

    var x: Int = 0
    async {
      if (x == 0) {
        x = 1
        assert(x == 1)
      }
    }
    x = 5

Slide 5

Slide 5 text

Data race: an example

    var x: Int = 0
    async {                 // Start a concurrent computation
      if (x == 0) {
        x = 1
        assert(x == 1)
      }
    }
    x = 5

Slide 6

Slide 6 text

Data race: an example

    var x: Int = 0
    async {                 // Start a concurrent computation
      if (x == 0) {
        x = 1
        assert(x == 1)      // Checks whether the condition holds
      }
    }
    x = 5

Slide 7

Slide 7 text

Data race: an example

    var x: Int = 0
    async {                 // Start a concurrent computation
      if (x == 0) {
        x = 1
        assert(x == 1)      // Checks whether the condition holds
      }
    }
    x = 5                   // Concurrent assignment

Slide 8

Slide 8 text

Data race: an example

    var x: Int = 0
    async {                 // Start a concurrent computation
      if (x == 0) {
        x = 1
        assert(x == 1)      // Assertion may fail!
      }
    }
    x = 5                   // Concurrent assignment
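
The race on this slide can be reproduced with standard library futures. The following is a minimal sketch, assuming `scala.concurrent.Future` as a stand-in for the slide's `async`; the function name `raceOnce` is hypothetical. Instead of asserting inside the task, it reports whether the task still saw its own write.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*

// Returns (final value of x, whether the task still saw x == 1 after writing 1).
// Future stands in for the slide's `async`; this is an assumption of the sketch.
def raceOnce(): (Int, Boolean) =
  var x: Int = 0
  val task = Future {
    if x == 0 then
      x = 1
      x == 1 // may be false if the other thread writes 5 in between
    else true
  }
  x = 5 // concurrent, unsynchronized assignment
  val sawOwnWrite = Await.result(task, 5.seconds)
  (x, sawOwnWrite)
```

Both components of the result depend on scheduling: the final value is either 1 or 5, and the Boolean may differ between runs, which is what makes such bugs hard to reproduce.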

Slide 9

Slide 9 text

What about higher-level abstractions?
• The example below looks harmless...

    val input: Int = ...
    val future = async {
      val x = new C()                 // Creates and uses fresh instance
      x.expensiveComputation(input)
    }
    val z = (new C()).get()

• ...until we inspect class C

    class C:
      def expensiveComputation(init: Int): Int =
        set(init)
        ...
      def set(x: Int): Unit =
        Global.f = x
      def get(): Int = Global.f

    object Global:                    // Shared singleton object
      var f: Int = 0                  // Global.f = global variable!

Data race!
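
The hidden sharing can be observed even without concurrency. The following is a minimal sketch that keeps only the `set`/`get` part of class `C` from the slide (the elided body of `expensiveComputation` is left out); the function name `sharedStateDemo` is hypothetical.

```scala
object Global:
  var f: Int = 0

class C:
  def set(x: Int): Unit = Global.f = x
  def get(): Int = Global.f

// Two fresh instances that never exchange references still communicate
// through the singleton: a write via `a` is visible via `b`.
def sharedStateDemo(): Int =
  val a = new C()
  val b = new C()
  a.set(42)
  b.get()
```

Calling `sharedStateDemo()` returns the value written through the other instance, so running `set` and `get` on different threads is an unsynchronized access to the same variable.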

Slide 10

Slide 10 text

Static Data-Race Prevention for Scala
• Scala has not been designed with ownership, uniqueness or anything similar
  • Scala is celebrating its 20th anniversary this year 🎉
• Static data-race prevention for an existing language is a difficult challenge!
• Multiple efforts
• Earlier work:
  • Type system extension for static capabilities, based on annotations:
    Haller and Odersky. Capabilities for Uniqueness and Borrowing. ECOOP 2010: 354-378. https://doi.org/10.1007/978-3-642-14107-2_17
  • Part of my PhD thesis at EPFL

Slide 11

Slide 11 text

Capabilities for Uniqueness and Borrowing (2010)
• Goal: Safe and efficient message passing in concurrent object-oriented programming
• Identified issues of state-of-the-art languages in 2010:
  • Issue 1: Can pass messages by reference but only with severely restricted shape (e.g., trees)
  • Issue 2: Destructive reads potential source of run-time errors
• Our approach:
  • Introduce static capabilities to
    • provide a flexible notion of uniqueness, called “separate uniqueness”
    • enforce at-most-once consumption of unique references (affine types)
  • Implement type system as Scala extension:
    • Using annotations: @unique, @transient, @peer
    • Annotation checker implemented as compiler plug-in

Slide 12

Slide 12 text

Separate Uniqueness vs External Uniqueness
• Separate Uniqueness most closely related to External Uniqueness
  Clarke, Wrigstad. External Uniqueness Is Unique Enough. ECOOP 2003: 176-200. https://doi.org/10.1007/978-3-540-45070-2_9
• Comparison:
  [Figure not preserved in this transcript: object graph with objects A, B, C and references u, r, i, f, f’, s]
  • External uniqueness (A, B, C are objects; object A owns object B, u is a unique reference):
    • References r and i are internal to the ownership context of A
    • Ownership makes reference f’ illegal; uniqueness makes reference f illegal
  • Key difference: Separate uniqueness forbids reference s → enforces full encapsulation

Slide 13

Slide 13 text

Capabilities for uniqueness and borrowing: Post-mortem
• Why was the approach not pursued further?
• Compelling properties:
  • Flexible notion of uniqueness
  • Suitable for preventing data races (demonstrated for actors)
  • Low annotation overhead
  • Promising experience with real-world actor-based Scala code
• Remaining issues:
  • Complexity of the annotation checker
  • Capabilities a completely new concept:
    • For example, own implementation of inference
    • Unclear: capabilities orthogonal to existing implicits in Scala?

Slide 14

Slide 14 text

Capabilities Reloaded: Enter LaCasa
• LaCasa = Lightweight Affinity and Object Capabilities in Scala
• Key novelties:
  • Clarifies the relationship between capabilities and implicits in Scala
  • Reduces annotation overhead further through a novel idea:
    Leverage properties of the Object-Capability Model to restrict aliasing and effects!

Haller and Loiko. LaCasa: Lightweight affinity and object capabilities in Scala. OOPSLA 2016. https://doi.org/10.1145/2983990.2984042

Slide 15

Slide 15 text

Example: Shopping cart
• Actor A:
  • Receives commands for editing shopping cart
  • A checkout command sends shopping cart to actor B

    val cart = Cart()
    def receive = {
      case AddItem(item) => cart.add(item)
      case RemoveItem(item) => cart.remove(item)
      case Checkout() => actorB ! Process(cart)
    }

Goal: maintain isolation of actors A and B even if shopping cart is sent by reference

Slide 16

Slide 16 text

Example: Challenges
• Actors with shared heap could end up concurrently accessing the same shopping cart:
  • After sending cart to actor B, actor A could continue to access cart
  • Actor A could store a reference to cart anywhere in shared heap

    val cart = Cart()
    def receive = {
      case AddItem(item) => cart.add(item)
      case RemoveItem(item) => cart.remove(item)
      case Checkout() =>
        actorB ! Process(cart)
        cart.remove(someItem)
        Global.forLater = cart
    }

Slide 17

Slide 17 text

LaCasa in a nutshell
• Main ideas:
  • Enable creating and maintaining separated/disjoint object graphs
    • Except for sharing of immutable objects
  • Control access to separate object graphs using affine permissions
• Separate object graphs maintained in “boxes”
• Each box has an associated permission

Slide 18

Slide 18 text

Maintaining separate object graphs using boxes
• Given a box initialized with a reference to a fresh instance of class Cart:

    class Cart:
      var items: List[Item] = List()
      ...

    val cartBox: Box[Cart] = ...

• Assuming Item is a deeply immutable type, adding an item using open preserves heap separation:

    val item: Item = ...
    cartBox.open(Spore(item) { env => cart =>   // Spore = closure that captures only item
      cart.items = env :: cart.items
    })

Slide 19

Slide 19 text

Opening boxes
• The contents of a box can only be accessed using open
• open takes a spore that is applied to the contents of the box
  • A spore is a special kind of closure which
    • has an explicit environment
    • tracks the type of its environment using a type refinement, enabling type-based constraints
    • enables operations on its environment, for example, for serialization and duplication/cloning
• open requires the permission associated with the box

Miller, Haller, and Odersky. Spores: A Type-Based Foundation for Closures in the Age of Concurrency and Distribution. ECOOP 2014: 308-333. https://doi.org/10.1007/978-3-662-44202-9_13

Slide 20

Slide 20 text

Spores: overview
• A simple spore without environment:

    val s = Spore((x: Int) => x + 2)

• The above spore has the following type:

    Spore[Int, Int] { type Env = Nothing }

• Spore types are subtypes of corresponding function types:

    sealed trait Spore[-T, +R] extends (T => R) {
      type Env
    }
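
The three snippets on this slide fit together as follows. This is a minimal sketch of a hand-written model of the `Spore` trait, not the API of the Spores library; the companion `apply` and the name `asFunction` are assumptions made for illustration.

```scala
// Hypothetical minimal model of a spore; not the API of the Spores library.
sealed trait Spore[-T, +R] extends (T => R) { type Env }

object Spore {
  // A spore without an environment: its Env is Nothing.
  def apply[T, R](f: T => R): Spore[T, R] { type Env = Nothing } =
    new Spore[T, R] {
      type Env = Nothing
      def apply(x: T): R = f(x)
    }
}

// The type annotation is the one given on the slide.
val s: Spore[Int, Int] { type Env = Nothing } = Spore((x: Int) => x + 2)

// Because Spore extends (T => R), a spore can be used where a function is expected.
val asFunction: Int => Int = s
```

In this model the refinement `{ type Env = Nothing }` is the only place where the (empty) environment shows up in the type.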

Slide 21

Slide 21 text

Spores with environments
• The environment of a spore is initialized explicitly:

    val str = "anonymous function"
    val s2 = Spore(str) {                // Environment initialized with argument str
      env => (x: Int) => x + env.length  // Environment accessed using extra parameter “env”
    }

• The above spore s2 has type:

    Spore[Int, Int] { type Env = String }
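
A spore with an explicit environment can be modelled in the same hand-written style. The sketch below assumes a curried constructor `Spore(env)(body)` that mirrors the call shape on the slide; it is an illustration, not the library's implementation.

```scala
// Hypothetical minimal model of a spore with an explicit environment.
sealed trait Spore[-T, +R] extends (T => R) { type Env }

object Spore {
  // The environment is passed explicitly and its type is recorded in Env.
  def apply[E, T, R](env: E)(body: E => T => R): Spore[T, R] { type Env = E } =
    new Spore[T, R] {
      type Env = E
      def apply(x: T): R = body(env)(x)
    }
}

val str = "anonymous function"

// The type annotation is the one given on the slide.
val s2: Spore[Int, Int] { type Env = String } =
  Spore(str) { env => (x: Int) => x + env.length }
```

Here `str` has 18 characters, so `s2(2)` evaluates to 20. The body never mentions `str` directly; it reaches it only through `env`.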

Slide 22

Slide 22 text

Type-based constraints
• The Env type member of the Spore trait enables expressing type-based constraints on the spore's environment using context parameters
  • (Context parameters used to be called “implicit parameters” in Scala 2.)
• Example: require a spore parameter to only capture immutable types:

    /* Run spore `s` concurrently, immediately returning a future which
     * is eventually completed with the result of `s` of type `T`. */
    def async[T](s: Spore[Unit, T])(using Immutable[s.Env]): Future[T] = ...

  This assumes given instances of the form:

    given intImmutable: Immutable[Int] = new Immutable[Int] {}
    given stringImmutable: Immutable[String] = new Immutable[String] {}
    ...

  Immutable types are types for which instances of type class Immutable exist
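
The constraint can be tried out end to end with a small model. This is a sketch under several assumptions: the `Spore` model is hand-written, `async` is implemented with `scala.concurrent.Future`, and it takes an extra `ExecutionContext` context parameter that the slide's signature does not show. `demoAsync` is a hypothetical name.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.*

// Hypothetical minimal spore model (see the slide's Spore trait).
sealed trait Spore[-T, +R] extends (T => R) { type Env }

object Spore {
  def apply[E, T, R](env: E)(body: E => T => R): Spore[T, R] { type Env = E } =
    new Spore[T, R] {
      type Env = E
      def apply(x: T): R = body(env)(x)
    }
}

// Type class marking immutable types; instances as on the slide.
trait Immutable[T]
object Immutable {
  given intImmutable: Immutable[Int] = new Immutable[Int] {}
  given stringImmutable: Immutable[String] = new Immutable[String] {}
}

// Only spores whose environment type has an Immutable instance are accepted.
// The ExecutionContext parameter is an assumption of this sketch.
def async[T](s: Spore[Unit, T])(using Immutable[s.Env], ExecutionContext): Future[T] =
  Future(s(()))

def demoAsync(): Int = {
  given ExecutionContext = ExecutionContext.global
  val s: Spore[Unit, Int] { type Env = String } =
    Spore("hello") { env => (_: Unit) => env.length }
  Await.result(async(s), 5.seconds)
}
```

A spore whose environment is, say, a `scala.collection.mutable.ArrayBuffer[Int]` would be rejected at the call to `async`, since no `Immutable` instance exists for that type in this sketch.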

Slide 23

Slide 23 text

Boxes and permissions
• The contents of a box can only be accessed using open
• open takes a spore that is applied to the contents of the box
• open requires the permission associated with the box

    class Cart:
      var items: List[Item] = List()
      ...

    val cartBox: Box[Cart] = ...
    val item: Item = ...
    cartBox.open(Spore(item) { env => cart =>
      cart.items = env :: cart.items
    })

• Permission = given instance with the following type:

    CanAccess { type C = cartBox.C }

  Can think of type member C as a static region name

Slide 24

Slide 24 text

Typing open

    // sealed: users unable to create subclasses; primary constructor is private
    sealed class Box[+T] private (private val instance: T) { self =>   // self: alias of this
      // Used to statically associate a box and a permission
      type C

      // Region types of this box and the given permission must be equal
      def open(s: Spore[T, Unit])(using Immutable[s.Env], CanAccess { type C = self.C }) =
        s(instance)
    }
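
The way the type member `C` ties a permission to one particular box can be sketched with a simplified model. The code below is not LaCasa's implementation: `open` takes a plain function instead of a spore, and `Box.fresh`, `Box.permissionFor`, `addItem` and `boxDemo` are hypothetical helpers introduced only for this illustration.

```scala
// Hypothetical, simplified model of a box and its permission.
sealed trait CanAccess { type C }

final class Box[+T] private (private val instance: T) { self =>
  type C // static "region name" of this box

  // Requires a permission whose region type equals this box's region type.
  def open(f: T => Unit)(using CanAccess { type C = self.C }): Unit = f(instance)
}

object Box {
  def fresh[T](init: T): Box[T] = new Box[T](init)

  // Demo-only helper; a real system would hand out permissions in a controlled way.
  def permissionFor(b: Box[Any]): CanAccess { type C = b.C } =
    new CanAccess { type C = b.C }
}

class Cart { var items: List[String] = Nil }

// The permission is a context parameter whose type mentions the box argument.
def addItem(box: Box[Cart], item: String)(using CanAccess { type C = box.C }): Unit =
  box.open(cart => cart.items = item :: cart.items)

def boxDemo(): List[String] = {
  val cartBox = Box.fresh(new Cart)
  val otherBox = Box.fresh(new Cart)
  val perm: CanAccess { type C = cartBox.C } = Box.permissionFor(cartBox)
  addItem(cartBox, "book")(using perm)
  // addItem(otherBox, "pen")(using perm) // rejected: perm is for cartBox.C, not otherBox.C
  var result: List[String] = Nil
  cartBox.open(cart => result = cart.items)(using perm)
  result
}
```

Because `cartBox.C` and `otherBox.C` are distinct abstract types, a permission for one box cannot be used to open the other. The sketch deliberately lets `result` escape through a plain closure; the spore and `Immutable` constraints on the slide are what rule that out in the actual design.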

Slide 25

Slide 25 text

Boxes and immutable types
• Object graph reachable from a box is separate from the rest of the heap, except for immutable objects, which may be shared
  • Reference to immutable object in common heap can be stored in object reachable from a box
  • Reference to immutable object can be extracted from box and stored in common heap

    sealed class Box[+T] ... { self =>
      def extract[R](s: Spore[T, R])
          (using Immutable[R], Immutable[s.Env], CanAccess { type C = self.C }): R =
        s(instance)
    }

Slide 26

Slide 26 text

How object graph separation could be broken
• Example 1: Box[C] where a method of C accesses a top-level object:

    class C:
      def m() =
        val d = TopLevelObject.getD()
        d.m2(this) // could retain this

• Example 2: Box[C] where a field of C has a class type with a method that accesses a top-level object:

    class C:
      var d: D = _

    class D:
      def m2(c: C) =
        TopLevelObject.setC(c) // could retain c

Slide 27

Slide 27 text

Ensuring separation is maintained
• Examples on previous slide are prevented by requiring that the type parameter C of Box[C] conforms to the object-capability model
• Roughly, the object-capability model ensures that instances of a class can only access references that were passed explicitly
• But first: capability-based system security…

Slide 28

Slide 28 text

Object-Capability Model
• Originally a computer security model
  • An ocapability is a transferable right to perform operations on a given object
  • ocapabilities are unforgeable
  • To disambiguate: we call a capability in the Object-Capability Model an ocapability
• In the context of object-capability-secure object-oriented languages: ocapability = object reference
• In order for an OOL to conform to the Object-Capability Model, an object reference can only be obtained by a caller in one of the following ways:
  • Parenthood: If A creates B, A obtains a reference to the newly created B
  • Endowment: If A creates B, B obtains that subset of A’s references with which A chose to endow it
  • Introduction: If A has references to both B and C, A can call a method on B, passing C as an argument (B can retain that reference by storing it in a field)

The Object-Capability Model directly supports the principle of least authority!

Slide 29

Slide 29 text

Restrictions to Enforce the Object-Capability Model
• For an OOL to become object-capability-secure, certain loopholes need to be closed
• Example:
  • Accessing a global singleton object

    object Global:
      private val fld: D = new D
      def evil(): D = fld

    class C:
      def doSomething(): Int =
        val d = Global.evil()
        ...

  • The authority to access Global is not given explicitly to instances of class C
  • Instances of class C can inadvertently obtain references to Global’s instance of class D

Slide 30

Slide 30 text

Restrictions to Enforce the Object-Capability Model (2)
• Enforce the principles of the Object-Capability Model (Parenthood, Endowment, Introduction) through the following restrictions for “ocap” classes¹:
  • Method and constructor parameters are either primitive or ocap class types
  • Methods only access parameters and the receiver (this)
  • Methods only instantiate ocap classes
  • Field types are either primitive or ocap class types
  • Superclasses are ocap
• Previous work on enforcing the Object-Capability Model for JavaScript and Java, e.g.:
  Mettler, Wagner, and Close. Joe-E: A Security-Oriented Subset of Java. NDSS 2010. https://www.ndss-symposium.org/ndss2010/joe-e-security-oriented-subset-java

¹ Simplified. We have not considered exceptions, file I/O etc!
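
The difference between a class that uses ambient authority and one that follows the restrictions can be sketched as follows. All class and method names here (`D.use`, `Leaky`, `Confined`, `leakyDemo`, `ocapDemo`) are hypothetical; `Global.evil` follows the previous slide.

```scala
class D:
  private var calls: Int = 0
  def use(): Int =
    calls += 1
    calls

// Not ocap: the method reaches a D through a global singleton
// that was never passed to the instance.
object Global:
  private val fld: D = new D
  def evil(): D = fld

class Leaky:
  def doSomething(): Int = Global.evil().use()

// ocap-style: the only D this class can touch is the one it was given (endowment).
class Confined(d: D):
  def doSomething(): Int = d.use()

// Two unrelated Leaky instances bump the same hidden D, so the counter advances.
def leakyDemo(): Int =
  val first = new Leaky().doSomething()
  val second = new Leaky().doSomething()
  second - first

// Two Confined instances with separate Ds do not influence each other.
def ocapDemo(): (Int, Int) =
  val a = new Confined(new D)
  val b = new Confined(new D)
  (a.doSomething(), b.doSomething())
```

`Confined` satisfies the listed restrictions in this toy setting: its constructor parameter and field have a class type that itself only touches its own state, and its method only accesses the receiver.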

Slide 31

Slide 31 text

Object-capability model in LaCasa
• Key property: Whether a class is ocap is just one bit of information!
• Prevent safety issues by requiring that the type parameter C of Box[C] is ocap

Slide 32

Slide 32 text

How practical are these restrictions?
• How common are ocap classes in Scala?
  • Such classes can be used safely in concurrent code without changes!
• Empirical study of over 75,000 LOC of open-source Scala code:

    Project          Version      SLOC     GitHub stats
    Scala stdlib     2.11.7       33,107   ✭5,795 👥 257
    Signal/Collect   8.0.6        10,159   ✭123 👥 11
    GeoTrellis       0.10.0-RC2   35,351   ✭400 👥 38
      -engine                      3,868
      -raster                     22,291
      -spark                       9,192

Slide 33

Slide 33 text

Results of empirical study
• In the analyzed medium to large open-source Scala projects, 21-67% of all classes are safe (= ocap):

    Project          #classes/traits   #safe (%)     #dir. unsafe (%)
    Scala stdlib     1,505             644 (43%)     212/861 (25%)
    Signal/Collect   236               159 (67%)     60/77 (78%)
    GeoTrellis
      -engine        190               40 (21%)      124/150 (83%)
      -raster        670               233 (35%)     325/437 (74%)
      -spark         326               101 (31%)     167/225 (74%)
    Total            2,927             1,177 (40%)   888/1,750 (51%)
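
The totals and percentages in the table can be recomputed from its rows. The sketch below copies the numbers from the table; reading the denominator of the last column as "#classes minus #safe" is an interpretation that the numbers happen to be consistent with, not something the slide states.

```scala
// Rows copied from the table: (project, classes/traits, safe, directly unsafe, denominator)
val rows: List[(String, Int, Int, Int, Int)] = List(
  ("Scala stdlib", 1505, 644, 212, 861),
  ("Signal/Collect", 236, 159, 60, 77),
  ("GeoTrellis -engine", 190, 40, 124, 150),
  ("GeoTrellis -raster", 670, 233, 325, 437),
  ("GeoTrellis -spark", 326, 101, 167, 225)
)

// Percentage rounded to the nearest integer, as in the table.
def pct(part: Int, whole: Int): Long = math.round(100.0 * part / whole)

val totalClasses = rows.map(_._2).sum
val totalSafe = rows.map(_._3).sum
val totalDirectlyUnsafe = rows.map(_._4).sum
val totalNotSafe = rows.map(_._5).sum
```

With these definitions, `pct(totalSafe, totalClasses)` gives the overall share of safe classes and `pct(totalDirectlyUnsafe, totalNotSafe)` the share reported in the last column of the Total row.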

Slide 34

Slide 34 text

Consuming permissions
• A box b can only be used if a given of type CanAccess { type C = b.C } is available in the current context
• In LaCasa, a permission may be consumed, such that a box becomes unusable in the continuation of the program
• The continuation must be specified explicitly, however → Continuation-Passing Style (CPS)

Slide 35

Slide 35 text

Consuming permissions: Example
• Example: actor A sends a box to actor B, consuming the permission of the box
  • In the continuation, the permission is unavailable
  • The continuation is specified as a spore which prevents capturing the consumed permission → within the spore's body the box is unusable

    def receive[T](box: Box[T])(using CanAccess { type C = box.C }) =
      ...
      actorB.send(box)(Spore(...) {
        // not allowed to capture a variable of type
        // CanAccess { type C = box.C }
      })

Slide 36

Slide 36 text

Second-class permissions
• Soundness requires permissions, i.e., values of type CanAccess { type C = b.C } for some b, to be second-class, more specifically, to be stack-local
• Stack locality prevents problematic indirect capturing

    def m[T](box: Box[T])(using p: CanAccess { type C = box.C }) =
      val fun = () => p                  // Violates stack locality of p
      actorB.send(box)(thunk((box, fun)) { case (b, f) =>
        given forbidden = f()
        // could still access `box` using `b`!
        ...
      })

  thunk creates a thunk spore with only an environment

Slide 37

Slide 37 text

Further results
• Formalization as a type-and-effect system:
  • Object-oriented core languages with heap separation and concurrency
  • Proof of type soundness
  • Proof of isolation theorem for processes with shared heap and ownership transfer
• Integration with the actor model of concurrency:
  Haller and Odersky. Scala Actors: Unifying thread-based and event-based programming. Theor. Comput. Sci. 410(2-3): 202-220 (2009). https://doi.org/10.1016/j.tcs.2008.09.019

Slide 38

Slide 38 text

LaCasa: Key insights
• Key insight 1: Properties of object-capability model provide essential guarantees for ensuring separation of object graphs
  • Object-capability model practical to enforce in real-world languages!
• Key insight 2: Scala’s contextual abstractions (implicits) and path-dependent types as well as spores provide important static guarantees
  • Other static checks enforced by compiler plugin

Haller and Loiko. LaCasa: Lightweight affinity and object capabilities in Scala. OOPSLA 2016. https://doi.org/10.1145/2983990.2984042

Slide 39

Slide 39 text

Further research directions
• Various improvements
  • Avoid explicit continuation-passing style through a selective CPS transform
  • Type erasure for boxes and permissions
• New implementation for Scala 3
  • Parts exist:
    Haller. Enhancing closures in Scala 3 with Spores3. Scala Symposium 2022: 22-27. https://doi.org/10.1145/3550198.3550428
• Safe region-based memory management
  • Potentially less overhead than GC
  • More predictable performance than GC

Slide 40

Slide 40 text

Further research directions
• Explore connections with capture checking
  • Experimental extension of type checking in Scala 3
  • Type system for tracking references to capabilities in types
    • Capability = variable with capability type
  • Key application: effect polymorphism and effect safety
  • Example: effect-polymorphic map method (of List[A]):

      def map(f: A => B): List[B]

    • The type A => B represents the type of impure function values that can close over arbitrary effect capabilities

Slide 41

Slide 41 text

Further research directions
• Safe deterministic concurrency
  • Various concurrent/parallel programming models with deterministic semantics
    • Parallel collections (collections with bulk-parallel operations)
    • Fork/join parallelism and async-finish parallelism
    • Lattice-based deterministic concurrency, e.g., LVars, Reactive Async, and RACL:
      Arvidsson. Deterministic Concurrency Using Lattices and the Object Capability Model. MSc thesis, KTH Royal Institute of Technology, Sweden, 2018. https://urn.kb.se/resolve?urn=urn%3Anbn%3Ase%3Akth%3Adiva-239917
  • Subtasks may return permissions/ownership
  • Challenge: splitting and recombining isolated object graphs

Slide 42

Slide 42 text

Conclusion
• Extending existing, widely-used programming languages with uniqueness, affine types, and heap separation is a difficult challenge
• The LaCasa project is an ongoing effort to extend Scala with a notion of disjoint object graphs as well as static permissions for access control
• Key insights:
  • The object-capability model provides essential aliasing restrictions in a practical and scalable manner
  • Scala’s path-dependent types and contextual abstractions are an excellent basis for flexible permissions
  • Spores provide essential safety properties (e.g., capturing only immutable types)
• We are currently exploring generalizations of LaCasa for concurrent programming models other than actors

Slide 43

Slide 43 text

Thanks! Do you have any questions?