Slide 1

Slide 1 text

Enhancing Closures in Scala with Blocks

Philipp Haller
Associate Professor
School of Electrical Engineering and Computer Science
KTH Royal Institute of Technology
Stockholm, Sweden

13th ACM SIGPLAN Scala Symposium
June 6, 2022
Berlin, Germany

Slide 2

Slide 2 text

Closures, concurrency, and distribution

• Using closures in concurrent settings presents safety hazards
  – Example: running a closure on a concurrent thread could cause a data race if a captured variable refers to a shared mutable object
• Using closures in distributed settings exposes limitations
  – Example: sending a closure from a frontend running on a JavaScript engine to a backend running on a JVM requires a portable serialization scheme
• … and is a safety risk
  – Example: serializing closures can result in runtime errors (e.g., java.io.NotSerializableException on the JVM)

Slide 3

Slide 3 text

Example: Concurrency

    val customerData: mutable.Map[Int, CustomerInfo] = ...

    def averageAge(customers: List[Customer]): Future[Float] =
      Future {
        val infos = customers.flatMap { c =>
          customerData.get(c.customerNo) match
            case Some(info) => List(info)
            case None => List()
        }
        val sumAges = infos.foldLeft(0)(_ + _.age).toFloat
        if (infos.nonEmpty) sumAges / infos.size else 0.0f
      }

Possible data race!
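One conventional way to avoid this particular race, independent of blocks, is to copy the shared map into an immutable snapshot on the calling thread, so that the closure passed to Future captures only immutable data. The following is a minimal sketch under stated assumptions: Customer and CustomerInfo are hypothetical stand-ins for the types on the slide, the wrapper object SnapshotDemo is illustrative, and no other thread is assumed to mutate customerData while the snapshot is taken.

```scala
import scala.collection.mutable
import scala.concurrent.{ExecutionContext, Future}

object SnapshotDemo:
  // Hypothetical stand-ins for the types used on the slide.
  final case class Customer(customerNo: Int)
  final case class CustomerInfo(age: Int)

  // Shared mutable state, as in the slide's example (sample data is made up).
  val customerData: mutable.Map[Int, CustomerInfo] =
    mutable.Map(1 -> CustomerInfo(30), 2 -> CustomerInfo(40))

  def averageAge(customers: List[Customer])(using ExecutionContext): Future[Float] =
    // Copy on the calling thread; the closure below captures only this immutable Map.
    val snapshot: Map[Int, CustomerInfo] = customerData.toMap
    Future {
      // Customers without an entry in the snapshot are skipped.
      val infos = customers.flatMap(c => snapshot.get(c.customerNo))
      val sumAges = infos.foldLeft(0)(_ + _.age).toFloat
      if infos.nonEmpty then sumAges / infos.size else 0.0f
    }
```

This removes the race only by convention: nothing stops a later edit from capturing customerData again, which is the kind of mistake a type-based restriction on captured variables is meant to rule out.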

Slide 4

Slide 4 text

Example: Serialization

    class Example {
      private val factor: Double = 1.2

      def example(): Unit = {
        val fun = { (persons: List[Person]) =>
          val sumAges = persons.map(_.age).reduce(_ + _)
          (sumAges / persons.size) * factor
        }
        val bytes = serialize(fun)
      }

      def serialize(f: List[Person] => Double): Array[Byte] = { ...

Uses factor from enclosing scope

    Exception in thread "main" java.io.NotSerializableException: Example
      at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
      at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1379)
      at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
      at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:15
      at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)

Slide 5

Slide 5 text

Example: Serialization

    class Example {
      private val factor: Double = 1.2

      def example(): Unit = {
        val fun = { (persons: List[Person]) =>
          val sumAges = persons.map(_.age).reduce(_ + _)
          (sumAges / persons.size) * this.factor
        }
        val bytes = serialize(fun)
      }

      def serialize(f: List[Person] => Double): Array[Byte] = { ...

Actually: capturing this (of type Example)!

    Exception in thread "main" java.io.NotSerializableException: Example
      at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
      at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1379)
      at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
      at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:15
      at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
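A common manual workaround, shown here only to make the capture explicit, is to copy the field into a local value so that the closure captures a Double instead of this. The sketch below assumes Java serialization on the JVM and relies on Scala function literals being compiled as serializable lambdas; Person is a hypothetical stand-in, and serialize is one possible implementation of the method elided on the slide.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical stand-in for the Person type used on the slide.
final case class Person(age: Int)

// One possible Java-serialization helper; it throws NotSerializableException
// if the function object or anything it captures is not serializable.
def serialize(f: List[Person] => Double): Array[Byte] =
  val buffer = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(buffer)
  try out.writeObject(f)
  finally out.close()
  buffer.toByteArray

// Deliberately not Serializable, as on the slide.
class Example:
  private val factor: Double = 1.2

  def example(): Array[Byte] =
    // Copy the field into a local value: the closure captures a Double, not `this`.
    val localFactor = factor
    val fun = (persons: List[Person]) =>
      val sumAges = persons.map(_.age).reduce(_ + _)
      (sumAges / persons.size) * localFactor
    serialize(fun)
```

The fix depends on the programmer noticing the implicit this capture; the compiler accepts both versions, and the original one fails only at runtime.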

Slide 6

Slide 6 text

Observations

• Safety issues stem from unrestricted variable capture
  – Concurrency: capture of, and access to, shared mutable objects
  – Serialization: capture of references to non-serializable objects
• Potential remedies:
  – Restrict the types of captured variables
    • For example, permit only types known to be serializable
  – Provide more capturing modes
    • For example, deeply clone a mutable object upon capture

Slide 7

Slide 7 text

Outline

• Goal and requirements
• Overview of blocks
• Design and implementation of blocks library
• Two approaches for environment access
• Selected related work
• Conclusion

Slide 9

Slide 9 text

Goal and requirements

Goal: An abstraction that makes closures safer and more flexible

Requirements:
  – Enable constraining the environment (the captured variables) using types
  – Support serialization based on type classes
  – Enable a portable implementation, including serialization
  – Minimize the use of macros

Slide 10

Slide 10 text

Idea

• Introduce an abstraction, called "block", which can be seen as a special kind of closure
• Blocks:
  – have an explicit environment;
  – restrict variable capture to a single variable;
  – track the type of their environment using a type refinement;
  – enable operations on their environment, for example, for serialization and duplication/cloning

Slide 12

Slide 12 text

Overview

• A simple block without environment:

    val b = Block((x: Int) => x + 2)

  Function literal not permitted to capture anything!

• The above block has the following type:

    Block[Int, Int] { type Env = Nothing }

• Block types are subtypes of corresponding function types:

    sealed trait Block[-T, +R] extends (T => R) {
      type Env
    }

Slide 13

Slide 13 text

Blocks with environments

• The environment of a block is initialized explicitly:

    val s = "anonymous function"
    val b2 = Block(s) {
      (x: Int) => x + env.length
    }

  Environment initialized with argument s
  Environment accessed using env

• The above block b2 has type:

    Block[Int, Int] { type Env = String }

Slide 14

Slide 14 text

Type-based constraints

• The Env type member of the Block trait enables expressing type-based constraints on the block's environment using context parameters
• Example: require a block parameter to only capture thread-safe types:

    /* Run block `b` concurrently, immediately returning a future
     * which is eventually completed with the result of type `T`.
     */
    def future[T](b: Block[Unit, T])(using ThreadSafe[b.Env]): Future[T] = ...

  Thread-safe types are types for which instances of type class ThreadSafe exist
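To make the constraint concrete, the following self-contained sketch shows how a context parameter of type ThreadSafe[b.Env] can accept one block and reject another at compile time. Everything here is an assumption for illustration: Block and block are simplified stand-ins that pass the environment explicitly and perform no capture checking, ThreadSafe is a hypothetical marker type class with two made-up instances, and future takes an additional ExecutionContext that the slide's signature does not show.

```scala
import scala.concurrent.{ExecutionContext, Future}

object ThreadSafeDemo:
  // Simplified stand-in for the library's Block type (no capture checking).
  trait Block[-T, +R] extends (T => R):
    type Env

  def block[E, T, R](initEnv: E)(body: E => T => R): Block[T, R] { type Env = E } =
    new Block[T, R]:
      type Env = E
      def apply(x: T): R = body(initEnv)(x)

  // Hypothetical marker type class: an instance is evidence that values
  // of type T may be shared between threads.
  trait ThreadSafe[T]
  object ThreadSafe:
    given ThreadSafe[Int] = new ThreadSafe[Int] {}
    given ThreadSafe[String] = new ThreadSafe[String] {}

  // Compiles only if a ThreadSafe instance exists for the block's environment type.
  def future[T](b: Block[Unit, T])(using ThreadSafe[b.Env], ExecutionContext): Future[T] =
    Future(b(()))

  // Accepted by `future`: the environment type is String.
  val greeting = block("hello") { env => (_: Unit) => env.length }

  // Rejected by `future` at compile time if uncommented, because there is
  // no ThreadSafe instance for mutable.Map[Int, Int]:
  // val counts = block(scala.collection.mutable.Map(1 -> 2)) { env => (_: Unit) => env.size }
  // future(counts)
```

The dependent type b.Env is what lets the method signature talk about the environment of whichever block is passed in, so the check happens per call site.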

Slide 15

Slide 15 text

Serialization

• One of the design goals for blocks is to support flexible, portable, and safe serialization based on type classes/contextual abstractions
  – Flexibility: enable integration with different serialization frameworks (uPickle, Java serialization, Kryo, Jackson, etc.)
  – Portability: support multiple backends/runtime environments
  – Safety: serializability is determined statically
• Assumptions:
  – Serialization is primarily used for communication between remote nodes
  – Every node is running the same code
  – No transmission of byte code or source code

Slide 16

Slide 16 text

Serializing blocks: Approach

• Instead of serializing the code of a block, what's serialized is
  – a unique identifier that enables instantiating the implementation of the block; and
  – the block's environment.
• In practice:
  – Create block using a named block builder
  – Block builder identifies the block's implementation

Slide 17

Slide 17 text

Serializing blocks: Example

• Step 1: define block using block builder:

    object PrependBuilder extends
      Block.Builder[Int, List[Int], List[Int]](
        (xs: List[Int]) => env :: xs
      )

  Prepend environment to list parameter

• Step 2: create serializable representation of block:

    val num: Int = ...
    val data = BlockData(PrependBuilder, Some(num))

  Some(num) is the environment

Slide 18

Slide 18 text

Serializing blocks: Example (cont'd)

• Step 3: pickle BlockData (here, using uPickle):

    import upickle.default.*
    import com.phaller.blocks.pickle.given

    val data = BlockData(PrependBuilder, Some(num))
    val pickled = write(data)

• Output (JSON):

    ["com.example.PrependBuilder",1,""]

  1 = non-empty environment

Slide 19

Slide 19 text

Deserializing blocks

• Step 1: read pickled data with target type PackedBlockData:

    val unpickledData = read[PackedBlockData](pickled)

  – Note: PackedBlockData abstracts from type of environment!

• Step 2: convert PackedBlockData to block:

    val unpickledBlock = unpickledData.toBlock[List[Int], List[Int]]
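The round trip above can be illustrated without any library by a small sketch in which only a builder identifier and a serialized environment are transferred, and the receiving side looks the builder up by its identifier. This is not the library's implementation: Builder, Packed, pack, and registry are hypothetical names, the environment is encoded as a plain string, and an explicit registry map stands in for however the library resolves the builder object from its fully-qualified name.

```scala
// Hypothetical, dependency-free stand-ins; none of these names are the library's API.
trait Builder[T, R]:
  def id: String                                        // identifies the block's code
  def createBlock(serializedEnv: Option[String]): T => R

object PrependBuilder extends Builder[List[Int], List[Int]]:
  val id = "com.example.PrependBuilder"
  def createBlock(serializedEnv: Option[String]): List[Int] => List[Int] =
    val num = serializedEnv.get.toInt                   // deserialize the environment
    xs => num :: xs                                     // prepend it to the argument

// Every node runs the same code, so every node can construct the same registry.
val registry: Map[String, Builder[?, ?]] =
  Map(PrependBuilder.id -> PrependBuilder)

// What is transferred: the builder identifier and the serialized environment,
// but no code.
final case class Packed(builderId: String, serializedEnv: Option[String]):
  // The caller states the expected function type; the cast is unchecked.
  def toBlock[T, R]: T => R =
    registry(builderId).createBlock(serializedEnv).asInstanceOf[T => R]

def pack(builder: Builder[?, ?], env: Int): Packed =
  Packed(builder.id, Some(env.toString))

// Sender side: pack the builder identifier and the environment.
val packed = pack(PrependBuilder, 1)
// Receiver side: rebuild the function from the identifier and the environment.
val rebuilt = packed.toBlock[List[Int], List[Int]]
```

As with toBlock on the slide, the receiver has to name the parameter and result types itself, because the packed representation abstracts from them.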

Slide 21

Slide 21 text

Design and implementation

• Block creation:

    object Block:
      def apply[E, T, R](initEnv: E)
                        (body: T => EnvAsParam[E] ?=> R) =
        new Block[T, R] {
          type Env = E
          def apply(x: T): R = body(x)(using initEnv)
          ...
        }

  Environment passed to context parameter

• Parameter body is a function returning a context function [1]
• This means: EnvAsParam[E] parameter does not appear in user code!

Slide 22

Slide 22 text

Environment access

• Why is environment of type E compatible with context parameter of type EnvAsParam[E]?

    opaque type EnvAsParam[T] = T

  (within Block object)

• Environment access:

    def env[E](using ep: EnvAsParam[E]): E = ep

• Why use an opaque type alias?
  – To permit any type for the environment, including types for which there are user-defined givens (implicits) in scope!
  – Without the opaque type alias, the env method could return a user-defined given that happens to have the same type as the environment
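The interplay of the opaque type alias and the context function can be tried out in a macro-free sketch. It follows the signatures shown on the slides, but it is a simplification under stated assumptions: the enclosing object is named BlockSketch, the factory is a plain method called block, and, unlike the library's Block.apply, it does not check that the body captures nothing.

```scala
object BlockSketch:
  // Inside this object EnvAsParam[T] is just T; outside, the two types are unrelated.
  opaque type EnvAsParam[T] = T

  sealed trait Block[-T, +R] extends (T => R):
    type Env

  // Returns the environment. Only the context parameter introduced by `block`
  // has type EnvAsParam[E], so user-defined givens of type E are never selected.
  def env[E](using ep: EnvAsParam[E]): E = ep

  // Plain method: performs NO capture checking (the library uses a macro for that).
  def block[E, T, R](initEnv: E)(body: T => EnvAsParam[E] ?=> R): Block[T, R] { type Env = E } =
    new Block[T, R]:
      type Env = E
      def apply(x: T): R = body(x)(using initEnv)

import BlockSketch.*

// A user-defined given that happens to have the same type as the environment.
given String = "a user-defined given"

val s = "anonymous function"
// `env` resolves to the block's environment `s`, not to the given String above.
val b2 = block(s) { (x: Int) => x + env.length }
```

If EnvAsParam were a transparent alias, the context parameter and the user-defined given would both have type String, and env could no longer tell them apart.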

Slide 23

Slide 23 text

Capture checking

• Soundness of the approach requires checking that the body of the block does not capture any variable
  – Environment is accessed using env (which is not captured!)
• Capture checking is done using a macro [2,3]:

    inline def apply[E, T, R](inline initEnv: E)
                             (inline body: T => EnvAsParam[E] ?=> R):
        Block[T, R] { type Env = E } =
      ${ applyCode('initEnv)('body) }

Slide 24

Slide 24 text

Implementing serialization

• Recall user definition of block builder:

    object PrependBuilder extends
      Block.Builder[Int, List[Int], List[Int]](
        (xs: List[Int]) => env :: xs
      )

• Environment serializer+deserializer obtained when builder is constructed:

    class Builder[E, T, R](body: T => EnvAsParam[E] ?=> R)
                          (using ReadWriter[E])
        extends TypedBuilder[E, T, R]:

      def createBlock(envOpt: Option[String]): Block[T, R] =
        val initEnv = read[E](envOpt.get)
        apply(initEnv)

  read[E](envOpt.get): deserialize environment

Slide 25

Slide 25 text

Implementing serialization (2)

• Recall step 2: creating serializable form of a block:

    val data = BlockData(PrependBuilder, Some(num))
    val pickled = write(data)

• BlockData construction uses a macro to:
  – check that argument builder is a top-level object
  – obtain fully-qualified name of builder object
• Serialization of BlockData object consists of:
  – fully-qualified name of builder object
  – serialized environment

Slide 27

Slide 27 text

Environment access

• As shown, the environment is accessed using the special env member
• This is suboptimal when the environment is a tuple
  – A tuple is needed when the environment consists of multiple values
• Example:

    val s = "anonymous function"
    val i = 5
    Block((s, i)) {
      (x: Int) => x + env._1.length - env._2
    }

Slide 28

Slide 28 text

Alternative environment access

• An alternative approach provides the environment as an explicit parameter instead of a context parameter
• Thus, user code needs one more parameter
• However, it enables the use of pattern matching instead of env._1, env._2, ...:

    val s = "anonymous function"
    val i = 5
    Block((s, i)) {
      case (l, r) => (x: Int) => x + l.length - r
    }

  Can also use parameter untupling
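A sketch of how this alternative could look as a plain method is given below. It is not the library's implementation: the names ExplicitEnv and block are illustrative, and no capture checking is performed. The only change relative to the context-parameter version is that the body receives the environment as its first, explicit parameter.

```scala
object ExplicitEnv:
  sealed trait Block[-T, +R] extends (T => R):
    type Env

  // The environment is the first, explicit parameter of the body.
  // As a plain method, this performs no capture checking.
  def block[E, T, R](initEnv: E)(body: E => T => R): Block[T, R] { type Env = E } =
    new Block[T, R]:
      type Env = E
      def apply(x: T): R = body(initEnv)(x)

import ExplicitEnv.*

val s = "anonymous function"
val i = 5

// Pattern matching gives names to the components of the tuple environment.
val b = block((s, i)) { case (l, r) => (x: Int) => x + l.length - r }
```

The type of b still records the environment as (String, Int), so type-based constraints on Env apply unchanged.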

Slide 30

Slide 30 text

Selected related work

• Main inspiration and most closely related work: Spores [4]
  – Implementation of Spores depends on experimental macro system of Scala 2.11
    • Scala 3 has a new, incompatible macro system
  – Portability: there was only experimental Scala.js support for Spores which excluded serialization
  – Blocks explore a unique combination of language features introduced in Scala 3 (context functions, opaque types, and unified macro/multi-stage programming system) in order to reduce complexity of macros

Slide 31

Slide 31 text

Selected related work (2)

• Cloud Haskell [5] introduces a new type constructor Static:
  – A value of type Static t can be serialized without knowing how to serialize t (e.g., a function without free variables can be serialized as a symbolic code address; type Static (a -> b))
    • A term static e has type Static t iff e has type t and all of e's free variables are top-level
  – Key idea for serialization: at closure construction time, serialize environment and look up deserializer (which must be a top-level definition)
• Blocks: (a) environment does not have to be serialized at closure construction time (since type of environment tracked) and (b) deserializer does not have to be a top-level definition

Slide 32

Slide 32 text

Selected related work (3)

• Capture checking [6] tracks captures in types
  – Introduces pure closures which don't capture any capabilities
    • Capability = variables with capability type
  – Blocks require checking a stronger property: body closure of block must not capture any variable; restriction to capabilities not sufficient

Slide 34

Slide 34 text

Conclusion

• Blocks = special kinds of closures with explicit environments
  – Environment type is part of block type
  – Safety ensured using macros
  – Safe and portable serialization based on type classes
• Implemented as a library for Scala 3 (https://github.com/phaller/blocks)
  – Explores unique combination of language features
  – Designed to be portable:
    • Scala.js support from the beginning, Scala Native support planned

Thank You!

Slide 35

Slide 35 text

References

[1] Odersky, Blanvillain, Liu, Biboudis, Miller, Stucki. Simplicitly: foundations and applications of implicit function types. Proc. ACM Program. Lang. 2(POPL): 42:1-42:29 (2018)
[2] Stucki, Biboudis, Odersky. A practical unification of multi-stage programming and macros. GPCE 2018
[3] Stucki, Brachthäuser, Odersky. Virtual ADTs for portable metaprogramming. MPLR 2021
[4] Miller, Haller, and Odersky. Spores: a type-based foundation for closures in the age of concurrency and distribution. ECOOP 2014
[5] Epstein, Black, Peyton Jones. Towards Haskell in the cloud. Haskell Symposium 2011
[6] Boruch-Gruszecki, Brachthäuser, Lee, Lhoták, Odersky. Tracking Captured Variables in Types. CoRR abs/2105.11896 (2021)