Enhancing Closures in Scala with Blocks

Philipp Haller
June 06, 2022

Transcript

  1. Enhancing Closures in Scala with Blocks

    Philipp Haller, Associate Professor
    School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
    13th ACM SIGPLAN Scala Symposium, June 6, 2022, Berlin, Germany
  2. Closures, concurrency, and distribution

    • Using closures in concurrent settings presents safety hazards
      – Example: running a closure on a concurrent thread could cause a data race if a captured variable refers to a shared mutable object
    • Using closures in distributed settings exposes limitations
      – Example: sending a closure from a frontend running on a JavaScript engine to a backend running on a JVM requires a portable serialization scheme
    • … and is a safety risk
      – Example: serializing closures can result in runtime errors (e.g., java.io.NotSerializableException on the JVM)
  3. Example: Concurrency

        val customerData: mutable.Map[Int, CustomerInfo] = ...

        def averageAge(customers: List[Customer]): Future[Float] =
          Future {
            val infos = customers.flatMap { c =>
              customerData.get(c.customerNo) match
                case Some(info) => List(info)
                case None => List()
            }
            val sumAges = infos.foldLeft(0)(_ + _.age).toFloat
            if (infos.nonEmpty) sumAges / infos.size else 0.0f
          }

    Possible data race! (The future's body reads the shared mutable map customerData.)
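One conventional way to avoid the race above, independent of blocks, is to copy the shared map into an immutable snapshot before the future starts. The sketch below assumes that the calling thread is allowed to read `customerData` at that moment; `CustomerInfo`, `Customer`, and `averageAgeSnapshot` are hypothetical stand-ins, not definitions from the talk.

```scala
import scala.collection.mutable
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical stand-ins for the types used on the slide.
final case class CustomerInfo(age: Int)
final case class Customer(customerNo: Int)

val customerData: mutable.Map[Int, CustomerInfo] = mutable.Map.empty

def averageAgeSnapshot(customers: List[Customer])(using ExecutionContext): Future[Float] =
  // Copy the shared mutable map on the calling thread, before the future runs.
  // Assumption: the caller may safely read customerData at this point.
  val snapshot: Map[Int, CustomerInfo] = customerData.toMap
  Future {
    // The future's body now only reads the immutable snapshot.
    val infos = customers.flatMap(c => snapshot.get(c.customerNo))
    val sumAges = infos.foldLeft(0)(_ + _.age).toFloat
    if infos.nonEmpty then sumAges / infos.size else 0.0f
  }
```

Nothing in the types enforces this discipline, which is the gap that type-based constraints on the captured environment are meant to close.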
  4. Example: Serialization

        class Example {
          private val factor: Double = 1.2

          def example(): Unit = {
            val fun = { (persons: List[Person]) =>
              val sumAges = persons.map(_.age).reduce(_ + _)
              (sumAges / persons.size) * factor
            }
            val bytes = serialize(fun)
          }

          def serialize(f: List[Person] => Double): Array[Byte] = { ...

    Uses factor from enclosing scope

        Exception in thread "main" java.io.NotSerializableException: Example
          at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
          at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1379)
          at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
          at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:15
          at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
  5. Example: Serialization

        class Example {
          private val factor: Double = 1.2

          def example(): Unit = {
            val fun = { (persons: List[Person]) =>
              val sumAges = persons.map(_.age).reduce(_ + _)
              (sumAges / persons.size) * this.factor
            }
            val bytes = serialize(fun)
          }

          def serialize(f: List[Person] => Double): Array[Byte] = { ...

    Actually: capturing this (of type Example)!

        Exception in thread "main" java.io.NotSerializableException: Example
          at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
          at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1379)
          at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
          at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:15
          at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
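A common manual workaround, shown here only as a sketch, is to copy the field into a local value so that the closure captures a `Double` instead of `this`. The names `SafeExample` and `localFactor` are hypothetical, and the sketch assumes the Scala 3 compiler on the JVM, where function literals are serializable. Unlike the slide, the average is computed in floating point.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

final case class Person(age: Int)

// Deliberately not Serializable, like Example on the slide.
class SafeExample:
  private val factor: Double = 1.2

  def serialize(f: List[Person] => Double): Array[Byte] =
    val bytes = ByteArrayOutputStream()
    val out = ObjectOutputStream(bytes)
    try out.writeObject(f) finally out.close()
    bytes.toByteArray

  def example(): Array[Byte] =
    // Copy the field into a local value: the closure below refers to
    // localFactor (a Double) and no longer mentions this.
    val localFactor = factor
    val fun = (persons: List[Person]) =>
      val sumAges = persons.map(_.age).sum
      (sumAges.toDouble / persons.size) * localFactor
    serialize(fun)
```

The workaround relies on programmer discipline: reintroducing `factor` in the closure body compiles without warning and fails again at run time.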
  6. Observations

    • Safety issues stem from unrestricted variable capture
      – Concurrency: capture and access shared mutable objects
      – Serialization: capture references to non-serializable objects
    • Potential remedies:
      – Restricting types of captured variables
        • For example, permit only types known to be serializable
      – Provide more capturing modes
        • For example, deeply clone a mutable object upon capture
  7. Outline

    • Goal and requirements
    • Overview of blocks
    • Design and implementation of blocks library
    • Two approaches for environment access
    • Selected related work
    • Conclusion
  9. Goal and requirements

    Goal: An abstraction that makes closures safer and more flexible

    Requirements:
      – Enable constraining the environment (the captured variables) using types
      – Support serialization based on type classes
      – Enable a portable implementation, including serialization
      – Minimize the use of macros
  10. Idea

    • Introduce an abstraction, called "block", which can be seen as a special kind of closure
    • Blocks:
      – have an explicit environment;
      – restrict variable capture to a single variable;
      – track the type of their environment using a type refinement;
      – enable operations on their environment, for example, for serialization and duplication/cloning
  12. Overview

    • A simple block without environment:

        val b = Block((x: Int) => x + 2)

      Function literal not permitted to capture anything!

    • The above block has the following type:

        Block[Int, Int] { type Env = Nothing }

    • Block types are subtypes of corresponding function types:

        sealed trait Block[-T, +R] extends (T => R) {
          type Env
        }
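To illustrate how a type refinement can carry the environment type, here is a simplified stand-in that does not use the blocks library. `MiniBlock` and its `environment` member are hypothetical names; unlike `Block`, this sketch passes the environment as an ordinary parameter and performs no capture checking, so nothing stops the function literal from capturing other variables.

```scala
// Hypothetical, simplified model of a block: a function with a type member
// that records the type of its explicit environment.
trait MiniBlock[-T, +R] extends (T => R) {
  type Env
  def environment: Env
}

object MiniBlock {
  // The result type refines Env, so callers can see the environment type.
  def apply[E, T, R](initEnv: E)(body: (E, T) => R): MiniBlock[T, R] { type Env = E } =
    new MiniBlock[T, R] {
      type Env = E
      val environment: E = initEnv
      def apply(x: T): R = body(initEnv, x)
    }
}

val s = "anonymous function"
// Static type: MiniBlock[Int, Int] { type Env = String }
val b2 = MiniBlock(s)((env: String, x: Int) => x + env.length)

// A MiniBlock can be used wherever a plain function is expected.
val asFunction: Int => Int = b2
```

Because `Env` is visible in the static type of `b2`, code that receives it can demand type class instances for `b2.Env`.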
  13. Blocks with environments

    • The environment of a block is initialized explicitly:

        val s = "anonymous function"
        val b2 = Block(s) { (x: Int) => x + env.length }

      Environment initialized with argument s
      Environment accessed using env

    • The above block b2 has type:

        Block[Int, Int] { type Env = String }
  14. Type-based constraints

    • The Env type member of the Block trait enables expressing type-based constraints on the block's environment using context parameters
    • Example: require a block parameter to only capture thread-safe types:

        /* Run block `b` concurrently, immediately returning a future
         * which is eventually completed with the result of type `T`.
         */
        def future[T](b: Block[Unit, T])(using ThreadSafe[b.Env]): Future[T] = ...

      Thread-safe types are types for which instances of type class ThreadSafe exist
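The constraint can be tried out with a self-contained sketch. Everything below is an assumption made for illustration: `EnvBlock` and `envBlock` stand in for the library's `Block`, the set of `ThreadSafe` instances is arbitrary, and the extra `ExecutionContext` parameter is needed only because the sketch runs the block with `scala.concurrent.Future`.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical minimal stand-in for a block with a tracked environment type.
trait EnvBlock[-T, +R] extends (T => R) {
  type Env
}

def envBlock[E, T, R](env: E)(body: (E, T) => R): EnvBlock[T, R] { type Env = E } =
  new EnvBlock[T, R] {
    type Env = E
    def apply(x: T): R = body(env, x)
  }

// Type class marking environment types that may be shared across threads.
trait ThreadSafe[T]

object ThreadSafe {
  // Illustrative instances only; a real library would choose these carefully.
  given ThreadSafe[Nothing] = new ThreadSafe[Nothing] {}
  given ThreadSafe[Int] = new ThreadSafe[Int] {}
  given ThreadSafe[String] = new ThreadSafe[String] {}
}

// The using clause depends on the argument: b.Env is the environment type of b.
def future[T](b: EnvBlock[Unit, T])(using ThreadSafe[b.Env], ExecutionContext): Future[T] =
  Future(b(()))
```

With these definitions, a block whose environment is a `String` is accepted, while one whose environment is a `scala.collection.mutable.Map` is rejected at compile time because no `ThreadSafe` instance exists for it.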
  15. Serialization

    • One of the design goals for blocks is to support flexible, portable, and safe serialization based on type classes/contextual abstractions
      – Flexibility: enable integration with different serialization frameworks (uPickle, Java serialization, Kryo, Jackson, etc.)
      – Portability: support multiple backends/runtime environments
      – Safety: serializability is determined statically
    • Assumptions:
      – Serialization is primarily used for communication between remote nodes
      – Every node is running the same code
      – No transmission of byte code or source code
  16. Serializing blocks: Approach

    • Instead of serializing the code of a block, what's serialized is
      – a unique identifier that enables instantiating the implementation of the block; and
      – the block's environment.
    • In practice:
      – Create block using a named block builder
      – Block builder identifies the block's implementation
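The approach can be sketched without any serialization framework. The sketch assumes a shared registry from names to builders, which is a simplification chosen here and not necessarily how the blocks library resolves a builder; `NamedBuilder`, `PrependInt`, `Wire`, `send`, and `receive` are all hypothetical names.

```scala
// Hypothetical sketch: the wire format is (builder name, serialized environment).
trait NamedBuilder[T, R] {
  def name: String
  def build(serializedEnv: String): T => R
}

object PrependInt extends NamedBuilder[List[Int], List[Int]] {
  val name = "example.PrependInt"
  def build(serializedEnv: String): List[Int] => List[Int] = {
    val env = serializedEnv.toInt // decode the environment
    xs => env :: xs               // the code itself is never transmitted
  }
}

// Every node runs the same code, so every node can hold the same registry.
val registry: Map[String, NamedBuilder[?, ?]] = Map(PrependInt.name -> PrependInt)

final case class Wire(builderName: String, serializedEnv: String)

def send(builder: NamedBuilder[?, ?], env: Int): Wire =
  Wire(builder.name, env.toString)

// The caller supplies the expected parameter and result types.
def receive[T, R](w: Wire): T => R =
  registry(w.builderName).asInstanceOf[NamedBuilder[T, R]].build(w.serializedEnv)
```

For example, `receive[List[Int], List[Int]](send(PrependInt, 7))` yields a function that prepends 7 to its argument.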
  17. Serializing blocks: Example

    • Step 1: define block using block builder:

        object PrependBuilder extends Block.Builder[Int, List[Int], List[Int]](
          (xs: List[Int]) => env :: xs
        )

      Prepend environment to list parameter

    • Step 2: create serializable representation of block:

        val num: Int = ...
        val data = BlockData(PrependBuilder, Some(num))

      Some(num) is the environment
  18. Serializing blocks: Example (cont'd)

    • Step 3: pickle BlockData (here, using uPickle):

        import upickle.default.*
        import com.phaller.blocks.pickle.given

        val data = BlockData(PrependBuilder, Some(num))
        val pickled = write(data)

    • Output (JSON):

        ["com.example.PrependBuilder",1,"<num>"]

      1 = non-empty environment
  19. Deserializing blocks

    • Step 1: read pickled data with target type PackedBlockData:

        val unpickledData = read[PackedBlockData](pickled)

      – Note: PackedBlockData abstracts from type of environment!

    • Step 2: convert PackedBlockData to block:

        val unpickledBlock = unpickledData.toBlock[List[Int], List[Int]]
  21. Design and implementation

    • Block creation:

        object Block:
          def apply[E, T, R](initEnv: E)(body: T => EnvAsParam[E] ?=> R) =
            new Block[T, R] {
              type Env = E
              def apply(x: T): R = body(x)(using initEnv)
              ...
            }

      Environment passed to context parameter

    • Parameter body is a function returning a context function [1]
    • This means: EnvAsParam[E] parameter does not appear in user code!
  22. Environment access

    • Why is environment of type E compatible with context parameter of type EnvAsParam[E]?

        opaque type EnvAsParam[T] = T    (within Block object)

    • Environment access:

        def env[E](using ep: EnvAsParam[E]): E = ep

    • Why use an opaque type alias?
      – To permit any type for the environment, including types for which there are user-defined givens (implicits) in scope!
      – Without the opaque type alias, the env method could return a user-defined given that happens to have the same type as the environment
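The effect of the opaque alias can be reproduced in isolation. The sketch below assumes a hypothetical object `EnvAccess` with a helper `run` in place of the library's `Block` object, and it places a user-defined `given String` in scope to show that environment access does not pick it up.

```scala
object EnvAccess {
  // Inside this object EnvAsParam[T] is known to equal T; outside it is abstract.
  opaque type EnvAsParam[T] = T

  def env[E](using ep: EnvAsParam[E]): E = ep

  // Hypothetical helper: runs `body` with `initEnv` as its environment.
  def run[E, R](initEnv: E)(body: EnvAsParam[E] ?=> R): R = body(using initEnv)
}

import EnvAccess.*

def demo(): (String, String) = {
  // A user-defined given with the same type as the environment.
  given String = "user-defined given"

  // env needs an EnvAsParam[String]; the given String above does not qualify,
  // because outside EnvAccess the two types are unrelated.
  val fromEnv = run("block environment")(env[String])
  val fromGiven = summon[String]
  (fromEnv, fromGiven)
}
```

If `EnvAsParam[T]` were a transparent alias for `T`, the user-defined `given String` would be an additional candidate for the context parameter of `env`, which is the problem the opaque alias rules out.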
  23. Capture checking

    • Soundness of the approach requires checking that the body of the block does not capture any variable
      – Environment is accessed using env (which is not captured!)
    • Capture checking is done using a macro [2,3]:

        inline def apply[E, T, R](inline initEnv: E)
            (inline body: T => EnvAsParam[E] ?=> R): Block[T, R] { type Env = E } =
          ${ applyCode('initEnv)('body) }
  24. Implementing serialization

    • Recall user definition of block builder:

        object PrependBuilder extends Block.Builder[Int, List[Int], List[Int]](
          (xs: List[Int]) => env :: xs
        )

    • Environment serializer+deserializer obtained when builder is constructed:

        class Builder[E, T, R](body: T => EnvAsParam[E] ?=> R)
            (using ReadWriter[E]) extends TypedBuilder[E, T, R]:
          def createBlock(envOpt: Option[String]): Block[T, R] =
            val initEnv = read[E](envOpt.get)   // deserialize environment
            apply(initEnv)
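The key point, that the builder obtains the environment's serializer and deserializer through a `using` clause when it is constructed, can be sketched with a hand-written type class in place of uPickle's `ReadWriter`. `Codec`, `SketchBuilder`, `pickleEnv`, and `PrependSketch` are hypothetical names, and the environment is passed as an ordinary parameter, not through `env`.

```scala
// Hypothetical stand-in for a serialization type class such as ReadWriter.
trait Codec[E] {
  def write(e: E): String
  def read(s: String): E
}

given intCodec: Codec[Int] = new Codec[Int] {
  def write(e: Int): String = e.toString
  def read(s: String): Int = s.toInt
}

// The codec is resolved once, when the builder is constructed.
class SketchBuilder[E, T, R](body: (E, T) => R)(using codec: Codec[E]) {
  def pickleEnv(e: E): String = codec.write(e)

  def createBlock(envOpt: Option[String]): T => R = {
    // As on the slide, a missing environment is not handled: .get throws on None.
    val initEnv = codec.read(envOpt.get)
    (x: T) => body(initEnv, x)
  }
}

// The Codec[Int] instance is found implicitly at this definition site.
object PrependSketch
    extends SketchBuilder[Int, List[Int], List[Int]]((env, xs) => env :: xs)
```

Because the codec is fixed at the builder's definition, a builder for an environment type without a `Codec` instance does not compile, which is how serializability is determined statically in this sketch.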
  25. Implementing serialization (2)

    • Recall step 2: creating serializable form of a block:

        val data = BlockData(PrependBuilder, Some(num))
        val pickled = write(data)

    • BlockData construction uses a macro to:
      – check that argument builder is a top-level object
      – obtain fully-qualified name of builder object
    • Serialization of BlockData object consists of:
      – fully-qualified name of builder object
      – serialized environment
  27. Environment access

    • As shown, environment accessed using special env member
    • This is suboptimal when environment is a tuple
      – Needed when environment consists of multiple values
    • Example:

        val s = "anonymous function"
        val i = 5
        Block((s, i)) {
          (x: Int) => x + env._1.length - env._2
        }
  28. Alternative environment access

    • An alternative approach provides the environment as an explicit parameter instead of a context parameter
    • Thus, user code needs one more parameter
    • However, it enables the use of pattern matching instead of env._1, env._2, ...:

        val s = "anonymous function"
        val i = 5
        Block((s, i)) {
          case (l, r) => (x: Int) => x + l.length - r
        }

      Can also use parameter untupling
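The shape of this alternative can be tried with a plain curried function. `blockWithEnv` is a hypothetical helper that only models how the environment is passed; it returns an ordinary function and performs no capture checking.

```scala
// Hypothetical helper: the environment is an explicit first parameter of the body.
def blockWithEnv[E, T, R](initEnv: E)(body: E => T => R): T => R =
  body(initEnv)

val s = "anonymous function"
val i = 5

// Pattern matching destructures the tuple environment into named parts.
val b = blockWithEnv((s, i)) { case (l, r) =>
  (x: Int) => x + l.length - r
}
```

Here `b(3)` evaluates to 3 + 18 - 5 = 16, since `s` has 18 characters.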
  30. Selected related work

    • Main inspiration and most closely related work: Spores [4]
      – Implementation of Spores depends on experimental macro system of Scala 2.11
        • Scala 3 has a new, incompatible macro system
      – Portability: there was only experimental Scala.js support for Spores which excluded serialization
      – Blocks explore a unique combination of language features introduced in Scala 3 (context functions, opaque types, and unified macro/multi-stage programming system) in order to reduce complexity of macros
  31. Selected related work (2)

    • Cloud Haskell [5] introduces a new type constructor Static:
      – A value of type Static t can be serialized without knowing how to serialize t (e.g., a function without free variables can be serialized as a symbolic code address; type Static (a -> b))
        • A term static e has type Static t iff e has type t and all of e's free variables are top-level
      – Key idea for serialization: at closure construction time, serialize environment and look up deserializer (which must be a top-level definition)
    • Blocks: (a) environment does not have to be serialized at closure construction time (since type of environment tracked) and (b) deserializer does not have to be a top-level definition
  32. Selected related work (3)

    • Capture checking [6] tracks captures in types
      – Introduces pure closures which don't capture any capabilities
        • Capability = variables with capability type
      – Blocks require checking a stronger property: body closure of block must not capture any variable; restriction to capabilities not sufficient
  34. Conclusion

    • Blocks = special kinds of closures with explicit environments
      – Environment type is part of block type
      – Safety ensured using macros
      – Safe and portable serialization based on type classes
    • Implemented as a library for Scala 3 (https://github.com/phaller/blocks)
      – Explores unique combination of language features
      – Designed to be portable:
        • Scala.js support from the beginning, Scala Native support planned

    Thank You!
  35. References

    [1] Odersky, Blanvillain, Liu, Biboudis, Miller, Stucki. Simplicitly: foundations and applications of implicit function types. Proc. ACM Program. Lang. 2(POPL): 42:1-42:29 (2018)
    [2] Stucki, Biboudis, Odersky. A practical unification of multi-stage programming and macros. GPCE 2018
    [3] Stucki, Brachthäuser, Odersky. Virtual ADTs for portable metaprogramming. MPLR 2021
    [4] Miller, Haller, and Odersky. Spores: a type-based foundation for closures in the age of concurrency and distribution. ECOOP 2014
    [5] Epstein, Black, Peyton Jones. Towards Haskell in the cloud. Haskell Symposium 2011
    [6] Boruch-Gruszecki, Brachthäuser, Lee, Lhoták, Odersky. Tracking Captured Variables in Types. CoRR abs/2105.11896 (2021)