  1. Lightweight Affine Types for Safe Concurrency in Scala

    Philipp Haller
    KTH Royal Institute of Technology, Stockholm, Sweden
    VIMPL 2024, Lund, Sweden, March 12, 2024
  2. Goal: Robust, large-scale concurrent and distributed programming

    • Reconcile
      – Fault tolerance
      – Scalability
      – Safety
    • Provide programming models and languages applicable to a variety of distributed applications, instead of building single "one-off" systems for specific domains
  3. A safety challenge: data races

    What is a data race?
    • A data race occurs
      – when two tasks (threads, processes, actors) concurrently access the same shared variable (or object field) and
      – at least one of the accesses is a write (an assignment)
    • In practice, data races are difficult to find and fix → heisenbugs
  4. Data race: an example

        var x: Int = 0
        async {
          if (x == 0) {
            x = 1
            assert(x == 1)
          }
        }
        x = 5
  7. Data race: an example

        var x: Int = 0
        async {                  // Start a concurrent computation
          if (x == 0) {          // Checks whether the condition holds
            x = 1
            assert(x == 1)
          }
        }
        x = 5                    // Concurrent assignment
  8. Data race: an example

        var x: Int = 0
        async {                  // Start a concurrent computation
          if (x == 0) {
            x = 1
            assert(x == 1)       // Assertion may fail!
          }
        }
        x = 5                    // Concurrent assignment
  9. What about higher-level abstractions?

    • The example below looks harmless...
    • ...until we inspect class C

        val input: Int = ...
        val future = async {
          val x = new C()                  // Creates and uses fresh instance
          x.expensiveComputation(input)
        }
        val z = (new C()).get()

        class C:
          def expensiveComputation(init: Int): Int =
            set(init)
            ...
          def set(x: Int): Unit = Global.f = x
          def get(): Int = Global.f

        object Global:                     // Shared singleton object
          var f: Int = 0                   // Global.f = global variable!

    → Data race!
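
The hidden sharing on this slide can be observed even without concurrency. The following is a minimal sketch, assuming Scala 3; `C` and `Global` mirror the slide with `expensiveComputation` left out, and `sharedThroughGlobal` is a hypothetical helper added only for the demonstration.

```scala
// Mirrors the slide's Global: one mutable cell shared by the whole program.
object Global:
  var f: Int = 0

// Mirrors the slide's class C (expensiveComputation is left out).
class C:
  def set(x: Int): Unit =
    Global.f = x
  def get(): Int = Global.f

// Hypothetical helper: two unrelated, fresh instances observe the same state.
def sharedThroughGlobal(): Int =
  val a = new C()
  val b = new C()
  a.set(42)   // write through instance a
  b.get()     // read through instance b sees a's write
```

Calling `sharedThroughGlobal()` returns the value written through the other instance, which is why running `set` inside `async` while another task calls `get` constitutes a data race.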
  10. Static Data-Race Prevention for Scala

    • Scala was not designed with ownership, uniqueness, or anything similar
    • Scala is celebrating its 20th anniversary this year 🎉
    • Static data-race prevention for an existing language is a difficult challenge!
    • Multiple efforts
    • Earlier work:
      • Type system extension for static capabilities, based on annotations
      • Part of my PhD thesis at EPFL

    Haller and Odersky. Capabilities for Uniqueness and Borrowing. ECOOP 2010: 354-378 https://doi.org/10.1007/978-3-642-14107-2_17
  11. Capabilities for Uniqueness and Borrowing (2010)

    • Goal: Safe and efficient message passing in concurrent object-oriented programming
    • Identified issues of state-of-the-art languages in 2010:
      • Issue 1: Can pass messages by reference, but only with severely restricted shape (e.g., trees)
      • Issue 2: Destructive reads are a potential source of run-time errors
    • Our approach:
      • Introduce static capabilities to
        • provide a flexible notion of uniqueness, called “separate uniqueness”
        • enforce at-most-once consumption of unique references (affine types)
      • Implement type system as Scala extension:
        • Using annotations: @unique, @transient, @peer
        • Annotation checker implemented as compiler plug-in
  12. Separate Uniqueness vs External Uniqueness

    • Separate Uniqueness is most closely related to External Uniqueness
    • Comparison:
      • External uniqueness (A, B, C are objects; object A owns object B; u is a unique reference):
        • References r and i are internal to the ownership context of A
        • Ownership makes reference f’ illegal; uniqueness makes reference f illegal
      • Key difference: Separate uniqueness forbids reference s → enforces full encapsulation

    Clarke, Wrigstad. External Uniqueness Is Unique Enough. ECOOP 2003: 176-200 https://doi.org/10.1007/978-3-540-45070-2_9
  13. Capabilities for uniqueness and borrowing: Post-mortem

    • Why was the approach not pursued further?
    • Compelling properties:
      • Flexible notion of uniqueness
      • Suitable for preventing data races (demonstrated for actors)
      • Low annotation overhead
      • Promising experience with real-world actor-based Scala code
    • Remaining issues:
      • Complexity of the annotation checker
      • Capabilities are a completely new concept:
        • For example, own implementation of inference
      • Unclear: are capabilities orthogonal to existing implicits in Scala?
  14. Capabilities Reloaded: Enter LaCasa

    • LaCasa = Lightweight Affinity and Object Capabilities in Scala
    • Key novelties:
      • Clarifies the relationship between capabilities and implicits in Scala
      • Reduces annotation overhead further through a novel idea:
        Leverage properties of the Object-Capability Model to restrict aliasing and effects!

    Haller and Loiko. LaCasa: Lightweight affinity and object capabilities in Scala. OOPSLA 2016
  15. Example: Shopping cart

    • Actor A:
      • Receives commands for editing shopping cart
      • A checkout command sends shopping cart to actor B

        val cart = Cart()
        def receive = {
          case AddItem(item)    => cart.add(item)
          case RemoveItem(item) => cart.remove(item)
          case Checkout()       => actorB ! Process(cart)
        }

    Goal: maintain isolation of actors A and B even if shopping cart is sent by reference
  16. Example: Challenges

    • Actors with shared heap could end up concurrently accessing the same shopping cart:
      • After sending cart to actor B, actor A could continue to access cart
      • Actor A could store a reference to cart anywhere in shared heap

        val cart = Cart()
        def receive = {
          case AddItem(item)    => cart.add(item)
          case RemoveItem(item) => cart.remove(item)
          case Checkout() =>
            actorB ! Process(cart)
            cart.remove(someItem)
            Global.forLater = cart
        }
  17. LaCasa in a nutshell

    • Main ideas:
      • Enable creating and maintaining separated/disjoint object graphs
        • Except for sharing of immutable objects
      • Control access to separate object graphs using affine permissions
    • Separate object graphs maintained in “boxes”
    • Each box has an associated permission
  18. Maintaining separate object graphs using boxes

    • Given a box initialized with a reference to a fresh instance of class Cart:

        class Cart:
          var items: List[Item] = List()
          ...
        val cartBox: Box[Cart] = ...

    • Assuming Item is a deeply immutable type, adding an item using open preserves heap separation:

        val item: Item = ...
        cartBox.open(Spore(item) { env => cart =>   // Spore = closure that captures only item
          cart.items = env :: cart.items
        })
  19. Opening boxes

    • The contents of a box can only be accessed using open
    • open takes a spore that is applied to the contents of the box
    • A spore is a special kind of closure which
      • has an explicit environment
      • tracks the type of its environment using a type refinement, enabling type-based constraints
      • enables operations on its environment, for example, for serialization and duplication/cloning
    • open requires the permission associated with the box

    Miller, Haller, and Odersky. Spores: A Type-Based Foundation for Closures in the Age of Concurrency and Distribution. ECOOP 2014: 308-333 https://doi.org/10.1007/978-3-662-44202-9_13
  20. Spores: overview

    • A simple spore without environment:

        val s = Spore((x: Int) => x + 2)

    • The above spore has the following type:

        Spore[Int, Int] { type Env = Nothing }

    • Spore types are subtypes of corresponding function types:

        sealed trait Spore[-T, +R] extends (T => R) {
          type Env
        }
  21. Spores with environments

    • The environment of a spore is initialized explicitly:

        val str = "anonymous function"
        val s2 = Spore(str) {                  // Environment initialized with argument str
          env => (x: Int) => x + env.length    // Environment accessed using extra parameter “env”
        }

    • The above spore s2 has type:

        Spore[Int, Int] { type Env = String }
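
To make the role of the `Env` type member concrete, here is a minimal sketch in Scala 3. It is an assumption-laden stand-in, not the implementation of the spores library: only the environment-carrying constructor is modeled, and `envIsString` and `applySpore` are hypothetical names introduced for illustration.

```scala
// Minimal stand-in for the Spore trait shown on the slides.
sealed trait Spore[-T, +R] extends (T => R):
  type Env

object Spore:
  // Builds a spore whose body only sees its explicit environment `env`.
  // The result type records the environment type as a refinement.
  def apply[E, T, R](env: E)(body: E => T => R): Spore[T, R] { type Env = E } =
    new Spore[T, R]:
      type Env = E
      def apply(x: T): R = body(env)(x)

val str = "anonymous function"
val s2 = Spore(str) { env => (x: Int) => x + env.length }

// Compile-time evidence that the refinement exposes Env = String.
val envIsString = summon[s2.Env =:= String]

// Hypothetical helper: applies the spore like an ordinary function.
def applySpore(x: Int): Int = s2(x)
```

Because a spore is also a function, `applySpore(1)` adds 1 to the length of the captured string (18 characters), while the refinement keeps the environment type visible to the type checker.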
  22. Type-based constraints

    • The Env type member of the Spore trait enables expressing type-based constraints on the spore's environment using context parameters
      • (Context parameters used to be called “implicit parameters” in Scala 2.)
    • Example: require a spore parameter to only capture immutable types:

        /* Run spore `s` concurrently, immediately returning a future which
         * is eventually completed with the result of `s` of type `T`. */
        def async[T](s: Spore[Unit, T])(using Immutable[s.Env]): Future[T] = ...

    • This assumes given instances of the form:

        given intImmutable: Immutable[Int] = new Immutable[Int] {}
        given stringImmutable: Immutable[String] = new Immutable[String] {}
        ...

    • Immutable types are types for which instances of type class Immutable exist
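
The constraint on `s.Env` can be tried out with a small self-contained sketch in Scala 3. The `Spore` and `Immutable` definitions below are simplified stand-ins (assumptions, not the library code), `async` is implemented with a plain `Future` on the global execution context, and `lengthLater` is a hypothetical helper.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*

// Simplified stand-in for the Spore trait from the slides.
sealed trait Spore[-T, +R] extends (T => R):
  type Env

object Spore:
  def apply[E, T, R](env: E)(body: E => T => R): Spore[T, R] { type Env = E } =
    new Spore[T, R]:
      type Env = E
      def apply(x: T): R = body(env)(x)

// Type class marking immutable types, with the two instances from the slide.
trait Immutable[T]
object Immutable:
  given intImmutable: Immutable[Int] = new Immutable[Int] {}
  given stringImmutable: Immutable[String] = new Immutable[String] {}

// Accepts only spores whose environment type has an Immutable instance.
def async[T](s: Spore[Unit, T])(using Immutable[s.Env]): Future[T] =
  Future(s(()))

// Hypothetical helper: the spore captures a String, so the call type-checks.
def lengthLater(): Int =
  val str = "anonymous function"
  val sp = Spore(str) { env => (_: Unit) => env.length }
  Await.result(async(sp), 5.seconds)

// A spore capturing, say, a mutable ArrayBuffer would be rejected at compile
// time here, because no Immutable[ArrayBuffer[...]] instance is defined.
```

The check is purely type-based: `async` never inspects the environment value, it only requires that a given `Immutable[s.Env]` can be found.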
  23. Boxes and permissions

    • The contents of a box can only be accessed using open
    • open takes a spore that is applied to the contents of the box
    • open requires the permission associated with the box

        class Cart:
          var items: List[Item] = List()
          ...
        val cartBox: Box[Cart] = ...
        val item: Item = ...
        cartBox.open(Spore(item) { env => cart =>
          cart.items = env :: cart.items
        })

    • Permission = given instance with the following type:

        CanAccess { type C = cartBox.C }

    • Can think of type member C as a static region name
  24. Typing open

        sealed class Box[+T] private (private val instance: T) { self =>
          type C
          def open(s: Spore[T, Unit])(using Immutable[s.Env], CanAccess { type C = self.C }) =
            s(instance)
        }

    • type C: used to statically associate a box and a permission
    • self: alias of this
    • Primary constructor is private
    • sealed: users are unable to create subclasses
    • Region types of this box and the given permission must be equal
  25. Boxes and immutable types

    • Object graph reachable from a box is separate from the rest of the heap, except for immutable objects, which may be shared
      • Reference to immutable object in common heap can be stored in object reachable from a box
      • Reference to immutable object can be extracted from box and stored in common heap

        sealed class Box[+T] ... { self =>
          def extract[R](s: Spore[T, R])
                        (using Immutable[R], Immutable[s.Env], CanAccess { type C = self.C }): R =
            s(instance)
        }
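
The way a box and its permission are tied together through the type member `C` can be sketched as follows in Scala 3. This is a simplified model under stated assumptions, not LaCasa's implementation: plain functions stand in for spores, nothing is affine, the nested class `Region` is a trick to give every box a distinct `C`, and `Packed`, `Box.make` and `cartSize` are hypothetical names.

```scala
trait Immutable[T]
object Immutable:
  given intImmutable: Immutable[Int] = new Immutable[Int] {}

// Permission: its type member C names the region it grants access to.
trait CanAccess:
  type C

final class Box[+T] private (private val instance: T):
  final class Region          // nested class: a distinct type for every box
  type C = Region

  // Requires a permission whose region equals this box's region.
  def open(f: T => Unit)(using CanAccess { type C = Box.this.C }): Unit =
    f(instance)

  // Only values of immutable type may leave the box.
  def extract[R](f: T => R)(using Immutable[R], CanAccess { type C = Box.this.C }): R =
    f(instance)

// Hypothetical pairing of a box with the permission for its region.
sealed trait Packed[+T]:
  val box: Box[T]
  val access: CanAccess { type C = box.C }

object Box:
  def make[T](init: T): Packed[T] =
    new Packed[T]:
      val box: Box[T] = new Box(init)
      val access: CanAccess { type C = box.C } = new CanAccess { type C = box.C }

class Cart:
  var items: List[String] = List()

def cartSize(): Int =
  val packed = Box.make(new Cart)
  given p: (CanAccess { type C = packed.box.C }) = packed.access
  packed.box.open { cart => cart.items = "book" :: cart.items }
  packed.box.extract(cart => cart.items.length)
  // Opening a second box here would not compile: the only permission in
  // scope is for packed.box.C, and another box has a different region type.
```

The design choice worth noting is that access control is decided by type equality on path-dependent types, so no run-time check is involved in `open` or `extract`.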
  26. How object graph separation could be broken

    • Example 1: Box[C] where a method of C accesses a top-level object:

        class C:
          def m() =
            val d = TopLevelObject.getD()
            d.m2(this) // could retain this

    • Example 2: Box[C] where a field of C has a class type with a method that accesses a top-level object:

        class C:
          var d: D = _
        class D:
          def m2(c: C) = TopLevelObject.setC(c) // could retain c
  27. Ensuring separation is maintained

    • Examples on previous slide are prevented by requiring that the type parameter C of Box[C] conforms to the object-capability model
    • Roughly, the object-capability model ensures that instances of a class can only access references that were passed explicitly
    • But first: capability-based system security…
  28. Object-Capability Model

    • Originally a computer security model
    • An ocapability is a transferable right to perform operations on a given object
    • ocapabilities are unforgeable
    • In the context of object-capability-secure object-oriented languages: ocapability = object reference
    • In order for an OOL to conform to the Object-Capability Model, an object reference can only be obtained by a caller in one of the following ways:
      • Parenthood: If A creates B, A obtains a reference to the newly created B
      • Endowment: If A creates B, B obtains that subset of A’s references with which A chose to endow it
      • Introduction: If A has references to both B and C, A can call a method on B, passing C as an argument (B can retain that reference by storing it in a field)
    • The Object-Capability Model directly supports the principle of least authority!

    To disambiguate: we call a capability in the Object-Capability Model an ocapability
  29. Restrictions to Enforce the Object-Capability Model

    • For an OOL to become object-capability-secure, certain loopholes need to be closed
    • Example:
      • Accessing a global singleton object
      • The authority to access Global is not given explicitly to instances of class C
      • Instances of class C can inadvertently obtain references to Global’s instance of class D

        class C:
          def doSomething(): Int =
            val d = Global.evil()
            ...
        object Global:
          private val fld: D = new D
          def evil(): D = fld
  30. Restrictions to Enforce the Object-Capability Model (2)

    • Enforce the principles of the Object-Capability Model (Parenthood, Endowment, Introduction) through the following restrictions for “ocap” classes (1):
      • Method and constructor parameters are either primitive or ocap class types
      • Methods only access parameters and the receiver (this)
      • Methods only instantiate ocap classes
      • Field types are either primitive or ocap class types
      • Superclasses are ocap
    • Previous work on enforcing the Object-Capability Model for JavaScript and Java, e.g.:
      Mettler, Wagner, and Close. Joe-E: A Security-Oriented Subset of Java. NDSS 2010 https://www.ndss-symposium.org/ndss2010/joe-e-security-oriented-subset-java

    (1) Simplified. We have not considered exceptions, file I/O, etc.
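
As an illustration of these restrictions, the sketch below contrasts two classes that follow them with one that does not. All names (`Counter`, `Tally`, `Registry`, `Leaky`, `leakEscapes`) are hypothetical and chosen only for this example; whether a class counts as ocap in LaCasa is decided by its compiler plugin, not by this code.

```scala
// Follows the restrictions: primitive parameter and field types only,
// and methods touch just their parameters and the receiver.
class Counter(start: Int):
  private var n: Int = start
  def add(k: Int): Int =
    n += k
    n

// Also follows them: its only field holds a freshly created instance of a
// class that itself follows the restrictions (Parenthood).
class Tally:
  private val counter = new Counter(0)
  def record(k: Int): Int = counter.add(k)

// Hypothetical global singleton, analogous to Global on the slides.
object Registry:
  var last: Option[AnyRef] = None

// Violates "methods only access parameters and the receiver": it reaches
// Registry, which was never passed to it, and leaks `this` into it.
class Leaky:
  def remember(): Unit =
    Registry.last = Some(this)

// Hypothetical check: after remember(), the instance is globally reachable.
def leakEscapes(): Boolean =
  val leaky = new Leaky
  leaky.remember()
  Registry.last.exists(_ eq leaky)
```

An instance of `Leaky` placed in a box could escape through `Registry`, which is the kind of aliasing the restrictions rule out.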
  31. Object-capability model in LaCasa

    • Key property: Whether a class is ocap is just one bit of information!
    • Prevent safety issues by requiring that the type parameter C of Box[C] is ocap
  32. How practical are these restrictions?

    • How common are ocap classes in Scala?
      • Such classes can be used safely in concurrent code without changes!
    • Empirical study of over 75,000 LOC of open-source Scala code:

        Project          Version      SLOC     GitHub stats
        Scala stdlib     2.11.7       33,107   ✭5,795 👥 257
        Signal/Collect   8.0.6        10,159   ✭123 👥 11
        GeoTrellis       0.10.0-RC2   35,351   ✭400 👥 38
          -engine                      3,868
          -raster                     22,291
          -spark                       9,192
  33. Results of empirical study

    • In the analyzed medium to large open-source Scala projects, 21-67% of all classes are safe (= ocap):

        Project          #classes/traits   #safe (%)     #dir. unsafe (%)
        Scala stdlib     1,505             644 (43%)     212/861 (25%)
        Signal/Collect   236               159 (67%)     60/77 (78%)
        GeoTrellis
          -engine        190               40 (21%)      124/150 (83%)
          -raster        670               233 (35%)     325/437 (74%)
          -spark         326               101 (31%)     167/225 (74%)
        Total            2,927             1,177 (40%)   888/1,750 (51%)
  34. Consuming permissions

    • A box b can only be used if a given of type CanAccess { type C = b.C } is available in the current context
    • In LaCasa, a permission may be consumed, such that a box becomes unusable in the continuation of the program
    • The continuation must be specified explicitly, however → Continuation-Passing Style (CPS)
  35. Consuming permissions: Example

    • Example: actor A sends a box to actor B, consuming the permission of the box
    • In the continuation, the permission is unavailable
    • The continuation is specified as a spore which prevents capturing the consumed permission → within the spore's body the box is unusable

        def receive[T](box: Box[T])(using CanAccess { type C = box.C }) =
          ...
          actorB.send(box)(Spore(...) {
            // not allowed to capture a variable of type
            // CanAccess { type C = box.C }
          })
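
LaCasa rules out use-after-send statically. The sketch below swaps in a run-time check instead, purely to make the control flow of "consume, then run the explicit continuation" observable; it is not how LaCasa works internally. `Box`, `Receiver` and `senderLosesAccess` are hypothetical stand-ins, and the permission is folded into the box as a flag.

```scala
import scala.collection.mutable.ListBuffer

// Run-time stand-in for a box whose permission can be consumed once.
final class Box[T] private (private val value: T):
  private var live = true

  def open(f: T => Unit): Unit =
    if !live then throw new IllegalStateException("permission consumed")
    f(value)

  // Consumes this box's permission and hands the contents to a fresh box.
  def transfer(): Box[T] =
    if !live then throw new IllegalStateException("permission consumed")
    live = false
    new Box(value)

object Box:
  def apply[T](value: T): Box[T] = new Box(value)

// Hypothetical stand-in for actor B: it just stores what it receives.
final class Receiver[T]:
  private var inbox: List[Box[T]] = Nil
  def received: List[Box[T]] = inbox

  // Consume the sender's permission first, then run the continuation.
  def send(box: Box[T])(continuation: () => Unit): Unit =
    inbox = box.transfer() :: inbox
    continuation()

// Returns true if the sender's attempt to reuse the box is rejected.
def senderLosesAccess(): Boolean =
  val box = Box(ListBuffer("book"))
  val actorB = new Receiver[ListBuffer[String]]
  var rejected = false
  actorB.send(box) { () =>
    try box.open(_ += "pen")
    catch case _: IllegalStateException => rejected = true
  }
  rejected
```

In the static version described on the slide, the body of the continuation would simply fail to type-check, because the required given of type `CanAccess { type C = box.C }` cannot be captured by the spore.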
  36. Second-class permissions

    • Soundness requires permissions, i.e., values of type CanAccess { type C = b.C } for some b, to be second-class, more specifically, to be stack-local
    • Stack locality prevents problematic indirect capturing

        def m[T](box: Box[T])(using p: CanAccess { type C = box.C }) =
          val fun = () => p                 // Violates stack locality of p
          actorB.send(box)(thunk((box, fun)) { case (b, f) =>
            given forbidden = f()
            // could still access `box` using `b`!
            ...
          })

    • thunk creates a thunk spore with only an environment
  37. Further results

    • Formalization as a type-and-effect system:
      • Object-oriented core languages with heap separation and concurrency
      • Proof of type soundness
      • Proof of isolation theorem for processes with shared heap and ownership transfer
    • Integration with the actor model of concurrency:
      Haller and Odersky. Scala Actors: Unifying thread-based and event-based programming. Theor. Comput. Sci. 410(2-3): 202-220 (2009)
  38. LaCasa: Key insights

    • Key insight 1: Properties of the object-capability model provide essential guarantees for ensuring separation of object graphs
      • Object-capability model is practical to enforce in real-world languages!
    • Key insight 2: Scala’s contextual abstractions (implicits) and path-dependent types as well as spores provide important static guarantees
      • Other static checks enforced by compiler plugin

    Haller and Loiko. LaCasa: Lightweight affinity and object capabilities in Scala. OOPSLA 2016
  39. Further research directions

    • Various improvements
      • Avoid explicit continuation-passing style through a selective CPS transform
      • Type erasure for boxes and permissions
    • New implementation for Scala 3
      • Parts exist:
        Haller. Enhancing closures in Scala 3 with Spores3. Scala Symposium 2022: 22-27 https://doi.org/10.1145/3550198.3550428
    • Safe region-based memory management
      • Potentially less overhead than GC
      • More predictable performance than GC
  40. Further research directions

    • Explore connections with capture checking
      • Experimental extension of type checking in Scala 3
      • Type system for tracking references to capabilities in types
        • Capability = variable with capability type
      • Key application: effect polymorphism and effect safety
      • Example: effect-polymorphic map method (of List[A]):

        def map(f: A => B): List[B]

      • The type A => B represents the type of impure function values that can close over arbitrary effect capabilities
  41. Further research directions

    • Safe deterministic concurrency
      • Various concurrent/parallel programming models with deterministic semantics
        • Parallel collections (collections with bulk-parallel operations)
        • Fork/join parallelism and async-finish parallelism
        • Lattice-based deterministic concurrency, e.g., LVars, Reactive Async, and RACL:
          Arvidsson. Deterministic Concurrency Using Lattices and the Object Capability Model. MSc thesis, KTH Royal Institute of Technology, Sweden, 2018. https://urn.kb.se/resolve?urn=urn%3Anbn%3Ase%3Akth%3Adiva-239917
      • Subtasks may return permissions/ownership
      • Challenge: splitting and recombining isolated object graphs
  42. Conclusion

    • Extending existing, widely-used programming languages with uniqueness, affine types, and heap separation is a difficult challenge
    • The LaCasa project is an ongoing effort to extend Scala with a notion of disjoint object graphs as well as static permissions for access control
    • Key insights:
      • The object-capability model provides essential aliasing restrictions in a practical and scalable manner
      • Scala’s path-dependent types and contextual abstractions are an excellent basis for flexible permissions
      • Spores provide essential safety properties (e.g., capturing only immutable types)
    • We are currently exploring generalizations of LaCasa for concurrent programming models other than actors