Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Lightweight Affine Types for Safe Concurrency in Scala

Philipp Haller
March 12, 2024
41

Lightweight Affine Types for Safe Concurrency in Scala

Philipp Haller

March 12, 2024
Tweet

Transcript

  1. Lightweight Affine Types for Safe Concurrency in Scala KTH Royal

    Institute of Technology Stockholm, Sweden Philipp Haller VIMPL 2024 
 Lund, Sweden, March 12, 2024 1
  2. Philipp Haller Goal: Robust, large-scale concurrent and distributed programming •

    Reconcile – Fault tolerance – Scalability – Safety • Provide programming models and languages applicable to a variety of distributed applications Instead of building single "one-of" systems for speci fi c domains Fault tolerance Scalability Safety 2
  3. Philipp Haller A safety challenge: data races What is a

    data race? • A data race occurs – when two tasks (threads, processes, actors) concurrently access the same shared variable (or object field) and – at least one of the accesses is a write (an assignment) • In practice, data races are difficult to find and fix → heisenbugs Fault tolerance Scalability Safety 3
  4. Philipp Haller Data race: an example var x: Int =

    0 async { if (x == 0) { x = 1 assert(x == 1) } } x = 5 4
  5. Philipp Haller Data race: an example var x: Int =

    0 async { if (x == 0) { x = 1 assert(x == 1) } } x = 5 Start a concurrent computation 5
  6. Philipp Haller Data race: an example var x: Int =

    0 async { if (x == 0) { x = 1 assert(x == 1) } } x = 5 Start a concurrent computation Checks whether the condition holds 6
  7. Philipp Haller Data race: an example var x: Int =

    0 async { if (x == 0) { x = 1 assert(x == 1) } } x = 5 Start a concurrent computation Checks whether the condition holds Concurrent assignment 7
  8. Philipp Haller Data race: an example var x: Int =

    0 async { if (x == 0) { x = 1 assert(x == 1) } } x = 5 Start a concurrent computation Assertion may fail! Concurrent assignment 8
  9. Philipp Haller What about higher-level abstractions? • The example to

    the right looks harmless... • ...until we inspect class C val input: Int = ... val future = async { val x = new C() x.expensiveComputation(input) } val z = (new C()).get() class C: def expensiveComputation(init: Int): Int = set(init) ... def set(x: Int): Unit = Global.f = x def get(): Int = Global.f object Global: var f: Int = 0 Shared singleton object Global.f = global variable! Data race! Creates and uses fresh instance 9
  10. Static Data-Race Prevention for Scala • Scala has not been

    designed with ownership, uniqueness or anything similar • Scala is celebrating its 20th anniversary this year 🎉 • Static data-race prevention for an existing language is a difficult challenge! • Multiple efforts • Earlier work: • Type system extension for static capabilities, based on annotations: 
 
 
 
 • Part of my PhD thesis at EPFL 10 Haller and Odersky. Capabilities for Uniqueness and Borrowing. ECOOP 2010: 354-378 https://doi.org/10.1007/978-3-642-14107-2_17
  11. Capabilities for Uniqueness and Borrowing (2010) • Goal: 
 Safe

    and efficient message passing in concurrent object-oriented programming • Identified issues of state-of-the-art languages in 2010: • Issue 1: Can pass messages by reference but only with severely restricted shape (e.g., trees) • Issue 2: Destructive reads potential source of run-time errors • Our approach: • Introduce static capabilities to • provide a flexible notion of uniqueness, called “separate uniqueness” • enforce at-most-once consumption of unique references (affine types) • Implement type system as Scala extension: • Using annotations: @unique, @transient, @peer • Annotation checker implemented as compiler plug-in 11
  12. Separate Uniqueness vs External Uniqueness • Separate Uniqueness most closely

    related to External Uniqueness 
 
 
 
 • Comparison: 
 
 
 
 
 • External uniqueness (A, B, C are objects; object A owns object B, u is a unique reference): • References r and i are internal to the ownership context of A • Ownership makes reference f’ illegal; uniqueness makes reference f illegal • Key difference: Separate uniqueness forbids reference s → enforces full encapsulation 12 Clarke, Wrigstad. External Uniqueness Is Unique Enough. ECOOP 2003: 176-200 https://doi.org/10.1007/978-3-540-45070-2_9
  13. Capabilities for uniqueness and borrowing: Post-mortem • Why was the

    approach not pursued further? • Compelling properties: • Flexible notion of uniqueness • Suitable for preventing data races (demonstrated for actors) • Low annotation overhead • Promising experience with real-world actor-based Scala code • Remaining issues: • Complexity of the annotation checker • Capabilities a completely new concept: • For example, own implementation of inference • Unclear: capabilities orthogonal to existing implicits in Scala? 13
  14. Capabilities Reloaded: Enter LaCasa • LaCasa = Lightweight Affinity and

    Object Capabilities in Scala • Key novelties: • Clarifies the relationship between capabilities and implicits in Scala • Reduces annotation overhead further through a novel idea: 14 Leverage properties of the Object-Capability Model to restrict aliasing and effects! Haller and Loiko. LaCasa: Lightweight af fi nity and object capabilities in Scala. OOPSLA 2016
 https://doi.org/10.1145/2983990.2984042
  15. Example: Shopping cart • Actor A: • Receives commands for

    editing shopping cart • A checkout command sends shopping cart to actor B 15 val cart = Cart() def receive = { case AddItem(item) => cart.add(item) case RemoveItem(item) => cart.remove(item) case Checkout() => actorB ! Process(cart) } Goal: maintain isolation of actors A and B 
 even if shopping cart is sent by reference
  16. Example: Challenges • Actors with shared heap could end up

    concurrently accessing the same shopping cart: • After sending cart to actor B, actor A could continue to access cart • Actor A could store a reference to cart anywhere in shared heap 16 val cart = Cart() def receive = { case AddItem(item) => cart.add(item) case RemoveItem(item) => cart.remove(item) case Checkout() => actorB ! Process(cart) cart.remove(someItem) Global.forLater = cart }
  17. LaCasa in a nutshell • Main ideas: • Enable creating

    and maintaining separated/disjoint object graphs • Except for sharing of immutable objects • Control access to separate object graphs using affine permissions • Separate object graphs maintained in “boxes” • Each box has an associated permission 17
  18. Maintaining separate object graphs using boxes • Given a box

    initialized with a reference to a fresh instance of class Cart: 
 
 
 
 
 
 
 • Assuming Item is a deeply immutable type, adding an item using open preserves heap separation: 18 class Cart: var items: List[Item] = List() ... val cartBox: Box[Cart] = ... val item: Item = ... cartBox.open(Spore(item) { env => cart => cart.items = env :: cart.items }) Spore = closure that captures only item
  19. Opening boxes • The contents of a box can only

    be accessed using open • open takes a spore that is applied to the contents of the box • A spore is a special kind of closure which • has an explicit environment • tracks the type of its environment using a type refinement, enabling type-based constraints • enables operations on its environment, for example, for serialization and duplication/cloning • open requires the permission associated with the box 19 Miller, Haller, and Odersky. Spores: A Type-Based Foundation for Closures in the Age of Concurrency and Distribution. ECOOP 2014: 308-333 https://doi.org/10.1007/978-3-662-44202-9_13
  20. Spores — overview • A simple spore without environment: 


    
 
 
 • The above spore has the following type: 
 
 
 
 • Spore types are subtypes of corresponding function types: 20 val s = Spore((x: Int) => x + 2) Spore[Int, Int] { type Env = Nothing } sealed trait Spore[-T, +R] extends (T => R) { type Env }
  21. Spores with environments • The environment of a spore is

    initialized explicitly: 
 
 
 
 
 
 
 
 • The above spore s2 has type: 21 val str = "anonymous function" val s2 = Spore(str) { env => (x: Int) => x + env.length } Spore[Int, Int] { type Env = String } Environment initialized with argument str Environment accessed using extra parameter “env”
  22. Type-based constraints • The Env type member of the Spore

    trait enables expressing type-based constraints on the spore's environment using context parameters • (Context parameters used to be called “implicit parameters” in Scala 2.) • Example: require a spore parameter to only capture immutable types: 22 /* Run spore `s` concurrently, immediately returning a future which * is eventually completed with the result of `s` of type `T`. */ def async[T](s: Spore[Unit, T])(using Immutable[s.Env]): Future[T] = ... This assumes given instances of the form: given intImmutable: Immutable[Int] = new Immutable[Int] {} given stringImmutable: Immutable[String] = new Immutable[String] {} ... Immutable types are types for which instances of type class Immutable exist
  23. • Permission = given instance with the following type: 


    
 
 
 Boxes and permissions • The contents of a box can only be accessed using open • open takes a spore that is applied to the contents of the box • open requires the permission associated with the box 23 class Cart: var items: List[Item] = List() ... val cartBox: Box[Cart] = ... val item: Item = ... cartBox.open(Spore(item) { env => cart => cart.items = env :: cart.items }) CanAccess { type C = cartBox.C } Can think of type member C as a static region name
  24. Typing open 24 sealed class Box[+T] private (private val instance:

    T) { self => type C def open(s: Spore[T, Unit])(using Immutable[s.Env], CanAccess { type C = self.C }) = s(instance) Used to statically associate a box and a permission Alias of this Primary constructor is private Users unable to create subclasses Region types of this box and the given permission must be equal
  25. Boxes and immutable types • Object graph reachable from a

    box is separate from the rest of the heap, except for immutable objects, which may be shared • Reference to immutable object in common heap can be stored in object reachable from a box • Reference to immutable object can be extracted from box and stored in common heap 25 sealed class Box[+T] ... { self => def extract[R](s: Spore[T, R]) (using Immutable[R], Immutable[s.Env], CanAccess { type C = self.C }): R = s(instance)
  26. How object graph separation could be broken • Example 1:

    Box[C] where a method of C accesses a top-level object: 
 
 
 
 
 
 
 • Example 2: Box[C] where a field of C has a class type with a method that accesses a top-level object: 26 class C: def m() = val d = TopLevelObject.getD() d.m2(this) // could retain this class C: var d: D = _ class D: def m2(c: C) = TopLevelObject.setC(c) // could retain c
  27. Ensuring separation is maintained • Examples on previous slide are

    prevented by requiring that the type parameter C of Box[C] conforms to the object-capability model • Roughly, the object-capability model ensures that instances of a class can only access references that were passed explicitly • But first: capability-based system security… 27
  28. Object-Capability Model • Originally a computer security model • An

    ocapability is a transferable right to perform operations on a given object • ocapabilities are unforgeable • In the context of object-capability-secure object-oriented languages: ocapability = object reference • In order for an OOL to confirm to the Object-Capability Model, an object reference can only be obtained by a caller in one of the following ways: • Parenthood: If A creates B, A obtains a reference to the newly created B • Endowment: If A creates B, B obtains that subset of A’s references with which A chose to endow it • Introduction: If A has references to both B and C, A can call a method on B, passing C as an argument (B can retain that reference by storing it in a field) 28 The Object-Capability Model directly supports the principle of least authority! In order to disambiguate: we call a capability in the Object-Capability Model ocapability
  29. Restrictions to Enforce the Object-Capability Model • For an OOL

    to become object-capability-secure, certain loopholes need to be closed • Example: • Accessing a global singleton object • The authority to access Global is not given explicitly to instances of class C • Instances of class C can inadvertently obtain references to Global’s instance of class D 29 class C: def doSomething(): Int = val d = Global.evil() ... object Global: private val fld: D = new D def evil(): D = fld
  30. Restrictions to Enforce the Object-Capability Model (2) • Enforce the

    principles of the Object-Capability Model (Parenthood, Endowment, Introduction) through the following restrictions for “ocap” classes:1 • Method and constructor parameters are either primitive or ocap class types • Methods only access parameters and the receiver (this) • Methods only instantiate ocap classes • Field types are either primitive or ocap class types • Superclasses are ocap • Previous work on enforcing the Object-Capability Model for JavaScript and Java, e.g.: 
 
 
 
 30 1) Simplified. We have not considered exceptions, file I/O etc! Mettler, Wagner, and Close. Joe-E: A Security-Oriented Subset of Java. NDSS 2010 https://www.ndss-symposium.org/ndss2010/joe-e-security-oriented-subset-java
  31. Object-capability model in LaCasa • Key property: Whether a class

    is ocap is just one bit of information! • Prevent safety issues by requiring that the type parameter C of Box[C] is ocap 31
  32. How practical are these restrictions? • How common are ocap

    classes in Scala? • Such classes can be used safely in concurrent code without changes! • Empirical study of over 75'000 LOC of open-source Scala code: 32 Project Version SLOC GitHub stats Scala stdlib 2.11.7 33,107 ✭5,795 👥 257 Signal/Collect 8.0.6 10,159 ✭123 👥 11 GeoTrellis 0.10.0-RC2 35,351 ✭400 👥 38 -engine 3,868 -raster 22,291 -spark 9,192
  33. Results of empirical study • In the analyzed medium to

    large open-source Scala projects, 
 21-67% of all classes are safe (= ocap): 33 Project #classes/traits #safe (%) #dir. unsafe (%) Scala stdlib 1,505 644 (43%) 212/861 (25%) Signal/Collect 236 159 (67%) 60/77 (78%) GeoTrellis -engine 190 40 (21%) 124/150 (83%) -raster 670 233 (35%) 325/437 (74%) -spark 326 101 (31%) 167/225 (74%) Total 2,927 1,177 (40%) 888/1,750 (51%)
  34. Consuming permissions • A box b can only be used

    if a given of type CanAccess { type C = b.C } is available in the current context • In LaCasa, a permission may be consumed, such that a box becomes unusable in the continuation of the program • The continuation must be specified explicitly, however 
 → Continuation-Passing Style (CPS) 34
  35. Consuming permissions: Example • Example: actor A sends a box

    to actor B, consuming the permission of the box • In the continuation, the permission is unavailable • The continuation is specified as a spore which prevents capturing the consumed permission → within the spore's body the box is unusable 35 def receive[T](box: Box[T]) (using CanAccess { type C = box.C }) = ... actorB.send(box)(Spore(...) { // not allowed to capture a variable of type // CanAccess { type C = box.C } })
  36. Second-class permissions • Soundness requires permissions, i.e., values of type

    CanAccess { type C = b.C } for some b, to be second-class, more specifically, to be stack-local • Stack locality prevents problematic indirect capturing 36 def m[T](box: Box[T]) (using p: CanAccess { type C = box.C }) = val fun = () => p actorB.send(box)(thunk((box, fun)) { case (b, f) => given forbidden = f() // could still access `box` using `b`! ... }) Violates stack locality of p thunk creates a thunk spore with only an environment
  37. Further results • Formalization as a type-and-effect system: • Object-oriented

    core languages with heap separation and concurrency • Proof of type soundness • Proof of isolation theorem for processes with shared heap and ownership transfer • Integration with the actor model of concurrency: 37 Haller and Odersky. Scala Actors: Unifying thread-based and event-based programming. Theor. Comput. Sci. 410(2-3): 202-220 (2009)
 https://doi.org/10.1016/j.tcs.2008.09.019
  38. LaCasa: Key insights • Key insight 1: Properties of object-capability

    model provide essential guarantees for ensuring separation of object graphs • Object-capability model practical to enforce in real-world languages! • Key insight 2: Scala’s contextual abstractions (implicits) and path-dependent types as well as spores provide important static guarantees • Other static checks enforced by compiler plugin 38 Haller and Loiko. LaCasa: Lightweight af fi nity and object capabilities in Scala. OOPSLA 2016
 https://doi.org/10.1145/2983990.2984042
  39. Further research directions • Various improvements • Avoid explicit continuation-passing

    style through a selective CPS transform • Type erasure for boxes and permissions • New implementation for Scala 3 • Parts exist: 
 
 
 
 • Safe region-based memory management • Potentially less overhead than GC • More predictable performance than GC 39 Haller. Enhancing closures in Scala 3 with Spores3. Scala Symposium 2022: 22-27 https://doi.org/10.1145/3550198.3550428
  40. Further research directions • Explore connections with capture checking •

    Experimental extension of type checking in Scala 3 • Type system for tracking references to capabilities in types • Capability = variable with capability type • Key application: effect polymorphism and effect safety • Example: effect-polymorphic map method (of List[A]): • The type A => B represents the type of impure function values that can close over arbitrary effect capabilities 40 def map(f: A => B): List[B]
  41. Further research directions • Safe deterministic concurrency • Various concurrent/parallel

    programming models with deterministic semantics • Parallel collections (collections with bulk-parallel operations) • Fork/join parallelism and async-finish parallelism • Lattice-based deterministic concurrency, e.g., LVars, Reactive Async, and RACL: 
 
 
 
 • Subtasks may return permissions/ownership • Challenge: splitting and recombining isolated object graphs 41 Arvidsson. Deterministic Concurrency Using Lattices and the Object Capability Model.
 MSc thesis, KTH Royal Institute of Technology, Sweden, 2018. https://urn.kb.se/resolve?urn=urn%3Anbn%3Ase%3Akth%3Adiva-239917
  42. Conclusion • Extending existing, widely-used programming languages with uniqueness, affine

    types, and heap separation is a difficult challenge • The LaCasa project is an ongoing effort to extend Scala with a notion of disjoint object graphs as well as static permissions for access control • Key insights: • The object-capability model provides essential aliasing restrictions in a practical and scalable manner • Scala’s path-dependent types and contextual abstractions are an excellent basis for flexible permissions • Spores provide essential safety properties (e.g., capturing only immutable types) • We are currently exploring generalizations of LaCasa for concurrent programming models other than actors 42