Enhancing Closures in Scala with Blocks

Philipp Haller
June 06, 2022

Transcript

  1. Enhancing Closures in Scala with Blocks
    Philipp Haller
    Associate Professor
    School of Electrical Engineering and Computer Science
    KTH Royal Institute of Technology
    Stockholm, Sweden
    13th ACM SIGPLAN Scala Symposium
    June 6, 2022
    Berlin, Germany

  2. Closures, concurrency, and distribution
    • Using closures in concurrent settings presents safety hazards
      – Example: running a closure on a concurrent thread could cause a data race if a captured variable refers to a shared mutable object
    • Using closures in distributed settings exposes limitations
      – Example: sending a closure from a frontend running on a JavaScript engine to a backend running on a JVM requires a portable serialization scheme
    • … and is a safety risk
      – Example: serializing closures can result in runtime errors (e.g., java.io.NotSerializableException on the JVM)

  3. Example: Concurrency
        val customerData: mutable.Map[Int, CustomerInfo] = ...

        def averageAge(customers: List[Customer]): Future[Float] =
          Future {
            val infos = customers.flatMap { c =>
              customerData.get(c.customerNo) match  // possible data race!
                case Some(info) => List(info)
                case None => List()
            }
            val sumAges = infos.foldLeft(0)(_ + _.age).toFloat
            if (infos.nonEmpty) sumAges / infos.size else 0.0f
          }

  4. Example: Serialization
        class Example {
          private val factor: Double = 1.2
          def example(): Unit = {
            val fun = { (persons: List[Person]) =>
              val sumAges = persons.map(_.age).reduce(_ + _)
              (sumAges / persons.size) * factor  // uses factor from enclosing scope
            }
            val bytes = serialize(fun)
          }
          def serialize(f: List[Person] => Double): Array[Byte] = {
            ...

        Exception in thread "main" java.io.NotSerializableException: Example
            at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
            at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1379)
            at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
            at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:15
            at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)

  5. Example: Serialization
        class Example {
          private val factor: Double = 1.2
          def example(): Unit = {
            val fun = { (persons: List[Person]) =>
              val sumAges = persons.map(_.age).reduce(_ + _)
              (sumAges / persons.size) * this.factor  // actually: capturing this (of type Example)!
            }
            val bytes = serialize(fun)
          }
          def serialize(f: List[Person] => Double): Array[Byte] = {
            ...

        Exception in thread "main" java.io.NotSerializableException: Example
            at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1185)
            at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1379)
            at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
            at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:15
            at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
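
The failure shown on the two serialization slides can be reproduced, and avoided by hand, in a few lines. The sketch below is an illustration and not code from the talk: `Person`, the `Try`-based `serialize` helper, and the method names `capturingThis` and `capturingLocalCopy` are assumptions. It relies on the Scala compiler emitting serializable lambdas on the JVM, so Java serialization fails only when a captured value (here `this`) is not serializable.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.util.Try

// Hypothetical stand-in for the Person type used on the slides.
final case class Person(age: Int)

// Hypothetical helper: Java-serializes a function value and reports
// failures (such as NotSerializableException) as a Failure.
def serialize(f: List[Person] => Double): Try[Array[Byte]] = Try {
  val buffer = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(buffer)
  out.writeObject(f)
  out.close()
  buffer.toByteArray
}

class Example: // deliberately not Serializable
  private val factor: Double = 1.2

  // `factor` really means `this.factor`, so the closure captures `this`.
  def capturingThis: List[Person] => Double =
    persons => (persons.map(_.age).sum.toDouble / persons.size) * factor

  // Copying the field into a local value first means the closure
  // captures only a Double, not the enclosing Example instance.
  def capturingLocalCopy: List[Person] => Double =
    val localFactor = factor
    persons => (persons.map(_.age).sum.toDouble / persons.size) * localFactor
```

Serializing `Example().capturingThis` is expected to fail because of the captured `Example` instance, while `Example().capturingLocalCopy` should serialize. The manual workaround depends on programmer discipline, which is the gap the type-based approach of the talk aims to close.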

  6. Observations
    • Safety issues stem from unrestricted variable capture
      – Concurrency: capture and access shared mutable objects
      – Serialization: capture references to non-serializable objects
    • Potential remedies:
      – Restrict the types of captured variables
        • For example, permit only types known to be serializable
      – Provide more capturing modes
        • For example, deeply clone a mutable object upon capture
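
One way to read the copying capture mode for the earlier concurrency example is to copy the shared map on the calling thread, so that the closure passed to `Future` only sees an immutable snapshot. The following is a sketch under assumptions, not the mechanism of the blocks library: the `Customer` and `CustomerInfo` case classes and the sample data are hypothetical, the `ExecutionContext` parameter is added for self-containment, and the copy is shallow, which suffices here only because `CustomerInfo` is immutable and no other thread is assumed to write to the map while the snapshot is taken.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.*

// Hypothetical data model; the slides leave these types unspecified.
final case class CustomerInfo(age: Int)
final case class Customer(customerNo: Int)

val customerData: mutable.Map[Int, CustomerInfo] =
  mutable.Map(1 -> CustomerInfo(30), 2 -> CustomerInfo(40))

def averageAge(customers: List[Customer])(using ExecutionContext): Future[Float] =
  // Taken on the calling thread, before the future starts running.
  val snapshot: Map[Int, CustomerInfo] = customerData.toMap
  Future {
    // The closure reads only the immutable snapshot, never customerData.
    val infos = customers.flatMap(c => snapshot.get(c.customerNo))
    if infos.nonEmpty then infos.map(_.age).sum.toFloat / infos.size else 0.0f
  }
```

Later writes to `customerData` do not affect a future that is already running, at the cost of one copy per call.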

  7. Outline
    • Goal and requirements
    • Overview of blocks
    • Design and implementation of blocks library
    • Two approaches for environment access
    • Selected related work
    • Conclusion

  9. Goal and requirements
    Goal:
    An abstraction that makes closures safer and more flexible
    Requirements:
    – Enable constraining the environment (the captured variables) using types
    – Support serialization based on type classes
    – Enable a portable implementation, including serialization
    – Minimize the use of macros

  10. Idea
    • Introduce an abstraction, called "block", which can be seen as a special kind of closure
    • Blocks:
      – have an explicit environment;
      – restrict variable capture to a single variable;
      – track the type of their environment using a type refinement;
      – enable operations on their environment, for example, for serialization and duplication/cloning

  12. Overview
    • A simple block without environment:
        val b = Block((x: Int) => x + 2)  // function literal not permitted to capture anything!
    • The above block has the following type:
        Block[Int, Int] { type Env = Nothing }
    • Block types are subtypes of corresponding function types:
        sealed trait Block[-T, +R] extends (T => R) {
          type Env
        }

  13. Blocks with environments
    • The environment of a block is initialized explicitly:
        val s = "anonymous function"
        val b2 = Block(s) {             // environment initialized with argument s
          (x: Int) => x + env.length    // environment accessed using env
        }
    • The above block b2 has type:
        Block[Int, Int] { type Env = String }

  14. Type-based constraints
    • The Env type member of the Block trait enables expressing type-based constraints on the block's environment using context parameters
    • Example: require a block parameter to only capture thread-safe types:
        /* Run block `b` concurrently, immediately returning a future
         * which is eventually completed with the result of type `T`.
         */
        def future[T](b: Block[Unit, T])(using ThreadSafe[b.Env]): Future[T] =
          ...
    Thread-safe types are types for which instances of type class ThreadSafe exist
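
The constraint on `b.Env` can be tried out with a small stand-alone model. This is a sketch under assumptions rather than the library's code: the `Block` trait is re-declared without `sealed`, `mkBlock` is a hypothetical helper that builds a block with an explicit environment (no capture checking), the `ThreadSafe` instances are chosen for illustration, and an `ExecutionContext` parameter is added so that `future` has a body.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.*

// Simplified model of the Block trait from the overview slide.
trait Block[-T, +R] extends (T => R):
  type Env

// Hypothetical helper: a block over Unit whose environment type is tracked.
def mkBlock[E, R](initEnv: E)(f: E => R): Block[Unit, R] { type Env = E } =
  new Block[Unit, R]:
    type Env = E
    def apply(u: Unit): R = f(initEnv)

// Marker type class; which types get instances is an assumption here.
trait ThreadSafe[T]
object ThreadSafe:
  given ThreadSafe[String] = new ThreadSafe[String] {}
  given ThreadSafe[Int] = new ThreadSafe[Int] {}

// Only accepts blocks whose environment type has a ThreadSafe instance.
def future[T](b: Block[Unit, T])(using ThreadSafe[b.Env], ExecutionContext): Future[T] =
  Future(b(()))

given ExecutionContext = ExecutionContext.global

val safe = mkBlock("shared text")(_.length)
val unsafe = mkBlock(mutable.Map(1 -> "a"))(_.size)

val ok: Future[Int] = future(safe)
// future(unsafe)  // rejected at compile time: no ThreadSafe instance for mutable.Map[Int, String]
```

Because the environment type is a type member of the block, the missing instance is reported where `future` is called, not as a runtime failure.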

  15. Serialization
    • One of the design goals for blocks is to support flexible, portable, and safe serialization based on type classes/contextual abstractions
      – Flexibility: enable integration with different serialization frameworks (uPickle, Java serialization, Kryo, Jackson, etc.)
      – Portability: support multiple backends/runtime environments
      – Safety: serializability is determined statically
    • Assumptions:
      – Serialization is primarily used for communication between remote nodes
      – Every node is running the same code
      – No transmission of byte code or source code

  16. Serializing blocks: Approach
    • Instead of serializing the code of a block, what's serialized is
      – a unique identifier that enables instantiating the implementation of the block; and
      – the block's environment.
    • In practice:
      – Create block using a named block builder
      – Block builder identifies the block's implementation

  17. Serializing blocks: Example
    • Step 1: define block using block builder:
        object PrependBuilder extends
          Block.Builder[Int, List[Int], List[Int]](
            (xs: List[Int]) => env :: xs  // prepend environment to list parameter
          )
    • Step 2: create serializable representation of block:
        val num: Int = ...
        val data = BlockData(PrependBuilder, Some(num))  // Some(num): environment

  18. Serializing blocks: Example (cont'd)
    • Step 3: pickle BlockData (here, using uPickle):
        import upickle.default.*
        import com.phaller.blocks.pickle.given
        val data = BlockData(PrependBuilder, Some(num))
        val pickled = write(data)
    • Output (JSON):
        ["com.example.PrependBuilder",1,""]
      (1 = non-empty environment)

  19. Deserializing blocks
    • Step 1: read pickled data with target type PackedBlockData:
        val unpickledData = read[PackedBlockData](pickled)
      – Note: PackedBlockData abstracts from type of environment!
    • Step 2: convert PackedBlockData to block:
        val unpickledBlock = unpickledData.toBlock[List[Int], List[Int]]
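
The scheme of the serialization slides (ship a builder name plus the environment, never code) can be modelled without any pickling library. The sketch below is not the blocks API: `Codec`, `registry`, `pickle`, and `unpickle` are hypothetical names, the environment is passed as an explicit parameter instead of through `env`, and a plain `Map` stands in for whatever lookup mechanism resolves a builder name on the receiving node.

```scala
// Hypothetical string codec standing in for a real pickling type class.
trait Codec[E]:
  def write(e: E): String
  def read(s: String): E

given intCodec: Codec[Int] = new Codec[Int]:
  def write(e: Int): String = e.toString
  def read(s: String): Int = s.toInt

// A named builder: the name identifies the block's code on every node,
// and the codec for the environment is fixed when the builder is created.
abstract class Builder[E, T, R](val name: String)(body: E => T => R)(using val codec: Codec[E]):
  def createBlock(pickledEnv: String): T => R = body(codec.read(pickledEnv))

object PrependBuilder
    extends Builder[Int, List[Int], List[Int]]("com.example.PrependBuilder")(
      env => xs => env :: xs
    )

// Every node runs the same code, so every node has the same registry.
val registry: Map[String, Builder[?, ?, ?]] =
  Map(PrependBuilder.name -> PrependBuilder)

// Wire format: (builder name, pickled environment).
def pickle[E, T, R](builder: Builder[E, T, R], env: E): (String, String) =
  (builder.name, builder.codec.write(env))

// The receiver names the expected function type; the environment type stays hidden.
def unpickle[T, R](data: (String, String)): T => R =
  registry(data._1).createBlock(data._2).asInstanceOf[T => R]
```

As with `PackedBlockData` on the slides, the receiving side only states the parameter and result types. The cast in `unpickle` is the unchecked part of this simplified model.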

  21. Design and implementation
    • Block creation:
        object Block:
          def apply[E, T, R](initEnv: E)
                            (body: T => EnvAsParam[E] ?=> R) =
            new Block[T, R] {
              type Env = E
              def apply(x: T): R = body(x)(using initEnv)  // environment passed to context parameter
              ...
            }
    • Parameter body is a function returning a context function [1]
    • This means: EnvAsParam[E] parameter does not appear in user code!

  22. Environment access
    • Why is environment of type E compatible with context parameter of type EnvAsParam[E]?
        opaque type EnvAsParam[T] = T  // (within Block object)
    • Environment access:
        def env[E](using ep: EnvAsParam[E]): E = ep
    • Why use an opaque type alias?
      – To permit any type for the environment, including types for which there are user-defined givens (implicits) in scope!
      – Without the opaque type alias, the env method could return a user-defined given that happens to have the same type as the environment
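
The pieces from the last two slides fit together as follows. This is a simplified sketch assembled from the fragments shown, not the library's implementation: it omits the macro-based capture check, so nothing stops the body from capturing other variables, and the `userGiven` value is added only to illustrate the point about user-defined givens.

```scala
trait Block[-T, +R] extends (T => R):
  type Env

object Block:
  // Transparent inside `Block`, opaque outside: a user-defined given of
  // type E is never mistaken for the environment.
  opaque type EnvAsParam[E] = E

  // Accessor used inside block bodies.
  def env[E](using ep: EnvAsParam[E]): E = ep

  def apply[E, T, R](initEnv: E)(body: T => EnvAsParam[E] ?=> R): Block[T, R] { type Env = E } =
    new Block[T, R]:
      type Env = E
      // Inside `Block`, E and EnvAsParam[E] are the same type.
      def apply(x: T): R = body(x)(using initEnv)

import Block.env

// A user-defined given with the same type as the environment below.
given userGiven: String = "some unrelated given"

val s = "anonymous function"
// `env` resolves to the block's environment `s`, not to `userGiven`.
val b2 = Block(s) { (x: Int) => x + env.length }
```

Applying `b2(1)` adds the length of `s`, not the length of `userGiven`, because only the environment has type `EnvAsParam[String]` outside the `Block` object.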

  23. Capture checking
    • Soundness of the approach requires checking that the body of the block does not capture any variable
      – Environment is accessed using env (which is not captured!)
    • Capture checking is done using a macro [2,3]:
        inline def apply[E, T, R](inline initEnv: E)
                                 (inline body: T => EnvAsParam[E] ?=> R):
            Block[T, R] { type Env = E } =
          ${ applyCode('initEnv)('body) }

  24. Implementing serialization
    • Recall user definition of block builder:
        object PrependBuilder extends Block.Builder[Int, List[Int], List[Int]](
          (xs: List[Int]) => env :: xs
        )
    • Environment serializer+deserializer obtained when builder is constructed:
        class Builder[E, T, R](body: T => EnvAsParam[E] ?=> R)
                              (using ReadWriter[E]) extends TypedBuilder[E, T, R]:
          def createBlock(envOpt: Option[String]): Block[T, R] =
            val initEnv = read[E](envOpt.get)  // deserialize environment
            apply(initEnv)

  25. Implementing serialization (2)
    • Recall step 2: creating serializable form of a block:
        val data = BlockData(PrependBuilder, Some(num))
        val pickled = write(data)
    • BlockData construction uses a macro to:
      – check that argument builder is a top-level object
      – obtain fully-qualified name of builder object
    • Serialization of BlockData object consists of:
      – fully-qualified name of builder object
      – serialized environment

  27. Environment access
    • As shown, environment accessed using special env member
    • This is suboptimal when environment is a tuple
      – Needed when environment consists of multiple values
    • Example:
        val s = "anonymous function"
        val i = 5
        Block((s, i)) {
          (x: Int) => x + env._1.length - env._2
        }

  28. Alternative environment access
    • An alternative approach provides the environment as an explicit parameter instead of a context parameter
    • Thus, user code needs one more parameter
    • However, it enables the use of pattern matching instead of env._1, env._2, ...:
        val s = "anonymous function"
        val i = 5
        Block((s, i)) {
          case (l, r) => (x: Int) => x + l.length - r
        }
    • Can also use parameter untupling
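
A minimal model of this alternative differs from the context-parameter design only in the type of `body`. The sketch below assumes a curried signature `E => T => R`, which is an illustrative guess and not necessarily the library's exact signature, and it again leaves out capture checking.

```scala
trait Block[-T, +R] extends (T => R):
  type Env

object Block:
  // The environment arrives as an ordinary first parameter of `body`.
  def apply[E, T, R](initEnv: E)(body: E => T => R): Block[T, R] { type Env = E } =
    new Block[T, R]:
      type Env = E
      def apply(x: T): R = body(initEnv)(x)

val s = "anonymous function"
val i = 5

// Pattern matching names the tuple components directly.
val viaPattern = Block((s, i)) { case (l, r) => (x: Int) => x + l.length - r }

// Parameter untupling: a two-parameter lambda is adapted to the tuple.
val viaUntupling = Block((s, i)) { (l, r) => (x: Int) => x + l.length - r }
```

Both blocks have environment type `(String, Int)` and compute the same result. The explicit parameter costs one extra binder per block but removes the positional `env._1`, `env._2` accesses.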

  30. Selected related work
    • Main inspiration and most closely related work: Spores [4]
      – Implementation of Spores depends on experimental macro system of Scala 2.11
        • Scala 3 has a new, incompatible macro system
      – Portability: there was only experimental Scala.js support for Spores which excluded serialization
      – Blocks explore a unique combination of language features introduced in Scala 3 (context functions, opaque types, and unified macro/multi-stage programming system) in order to reduce complexity of macros

  31. Selected related work (2)
    • Cloud Haskell [5] introduces a new type constructor Static:
      – A value of type Static t can be serialized without knowing how to serialize t (e.g., a function without free variables can be serialized as a symbolic code address; type Static (a -> b))
        • A term static e has type Static t iff e has type t and all of e's free variables are top-level
      – Key idea for serialization: at closure construction time, serialize environment and look up deserializer (which must be a top-level definition)
    • Blocks: (a) environment does not have to be serialized at closure construction time (since type of environment tracked) and (b) deserializer does not have to be a top-level definition

  32. Selected related work (3)
    • Capture checking [6] tracks captures in types
      – Introduces pure closures which don't capture any capabilities
        • Capabilities = variables with a capability type
      – Blocks require checking a stronger property: the body closure of a block must not capture any variable; a restriction to capabilities is not sufficient

  34. Conclusion
    • Blocks = special kinds of closures with explicit environments
      – Environment type is part of block type
      – Safety ensured using macros
      – Safe and portable serialization based on type classes
    • Implemented as a library for Scala 3 (https://github.com/phaller/blocks)
      – Explores unique combination of language features
      – Designed to be portable:
        • Scala.js support from the beginning, Scala Native support planned
    Thank You!

  35. References
    [1] Odersky, Blanvillain, Liu, Biboudis, Miller, Stucki. Simplicitly: foundations and applications of implicit function types. Proc. ACM Program. Lang. 2(POPL): 42:1-42:29 (2018)
    [2] Stucki, Biboudis, Odersky. A practical unification of multi-stage programming and macros. GPCE 2018
    [3] Stucki, Brachthäuser, Odersky. Virtual ADTs for portable metaprogramming. MPLR 2021
    [4] Miller, Haller, Odersky. Spores: a type-based foundation for closures in the age of concurrency and distribution. ECOOP 2014
    [5] Epstein, Black, Peyton Jones. Towards Haskell in the cloud. Haskell Symposium 2011
    [6] Boruch-Gruszecki, Brachthäuser, Lee, Lhoták, Odersky. Tracking Captured Variables in Types. CoRR abs/2105.11896 (2021)