Tagless Final & Scala 3

ScalaMad 16 & 23 / 2020 (online event)
Juan-José Vázquez, @juanjovazquez, CTO Tecsisa

“The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.”
Edsger Dijkstra

Domain-Specific Languages

“A domain-specific language (DSL) is a computer language specialized to a particular application domain. This is in contrast to a general-purpose language (GPL), which is broadly applicable across domains.”
Wikipedia

Domain-specific languages
DSL               | Application domain
HTML              | Hypertext web pages
SQL               | Database queries
PostScript        | Publishing
Dhall             | Configuration
Hibernate         | ORM
Erlang OTP / Akka | Concurrent & distributed systems
Camel             | Enterprise integration patterns

Domain-specific languages
• Specialized and designed for a specific task (not general purpose)
• Less powerful than GPLs, although some Turing-complete DSLs exist
• Do not have to represent computation directly; they can just declare rules, facts or relationships
• Interpreted in different ways
• Narrower applicability than GPLs in order to perform really well in their domains
• Need semantics (what it means) and syntax (how it looks)
• Bring the power of language theory to a particular domain
• Tend to remain small and simple, so they often lack abstraction capabilities (variables, modules, HOFs, etc.)

Styles of DSLs
• External
  ◦ Own syntax and parser (including lexers, tokenizers, etc.)
  ◦ Dedicated interpreter or compiler
  ◦ Powerful, but you are on your own: lack of abstraction capabilities, tools, IDEs, etc.
• Internal (a.k.a. Embedded)
  ◦ An object language (the DSL) embedded in a metalanguage or host language (GPL)
  ◦ Reuse the host language’s mechanics and types to express the terms of the object language
  ◦ A lot of work is done for you (e.g. parsing)
  ◦ Abstraction capabilities from the host language are reused (e.g. HOAS)
  ◦ The metalanguage might be used for metaprogramming

Styles of embedding
• Deep embedding
  ◦ The program exists as data (e.g. an AST)
  ◦ Interpreters need to be implemented in the metalanguage
• Shallow embedding
  ◦ DSL constructs are composed purely of host language constructs
  ◦ Terms are implemented directly as the values to which they evaluate, bypassing intermediate ASTs and traversals
  ◦ Typically interpreted by the metalanguage in a single way (metacircular interpreter), but techniques such as tagless final remove this limitation by adding an extra layer of abstraction
• Both deep and shallow embeddings can be seen as folds [Gibbons, 2013]

Purpose
• Embedding typed first-order and higher-order languages in a typed metalanguage such as Scala 3
• Enabling multiple interpretations (even non-monadic ones)
• Enforcing compositionality (promoting FP principles and idioms)
• Extensibility and testability
• Serialization and deserialization
• No reliance on fancy types, such as GADTs or dependent types, or on metaprogramming
• Tightness: only correct terms should be representable
• Still allowing static analysis and optimizations

Interlude - Scala 3

Scala 3 - Control Syntax
    // Scala 2
    if (x < 0) "negative" else "positive"
    // Scala 3
    if x < 0 then "negative" else "positive"

    // Scala 2
    for {
      x <- maybeFoo(1)
      y <- maybeBar(x)
    } yield x + y
    // Scala 3
    for
      x <- maybeFoo(1)
      y <- maybeBar(x)
    yield x + y

    // Scala 2
    day match {
      case Weekday(d) => Right(d)
      case _ => Left("bad day")
    }
    // Scala 3
    day match
      case Weekday(d) => Right(d)
      case _ => Left("bad day")

Scala 3 - Optional Braces
    // Scala 2
    trait Foo {
      def bar(x: Int): List[String]
    }
    // Scala 3
    trait Foo:
      def bar(x: Int): List[String]

    // Scala 2
    new Foo {
      def bar(x: Int): List[String] = List.fill(x)("foo")
    }
    // Scala 3
    new Foo:
      def bar(x: Int): List[String] = List.fill(x)("foo")

    // Scala 2
    object Foo {
      def apply(): Foo = ???
    }
    // Scala 3
    object Foo:
      def apply(): Foo = ???

Scala 3 - Enums
    // Scala 2
    sealed trait Color
    object Color {
      case object White extends Color
      case object Red extends Color
      case object Blue extends Color
    }
    // Scala 3
    enum Color:
      case White, Red, Blue

    // Scala 2
    sealed trait Maybe[+T]
    object Maybe {
      case object Empty extends Maybe[Nothing]
      case class Just[T](x: T) extends Maybe[T]
    }
    > val x = Just(3)
    // x: Just[Int] = Just(3)
    // Scala 3
    enum Maybe[+T]:
      case Empty extends Maybe[Nothing]
      case Just(x: T) extends Maybe[T]
    > val x = Just(3)
    // x: Maybe[Int] = Just(3)

Scala 3 - Contextual Abstractions
    // Scala 2
    trait Ord[T] {
      def compare(x: T, y: T): Int
    }
    implicit val IntOrd = new Ord[Int] {
      def compare(x: Int, y: Int): Int = ???
    }
    implicit def ListOrd[T](implicit T: Ord[T]) = new Ord[List[T]] {
      def compare(x: List[T], y: List[T]): Int = ???
    }
    def foo[T](x: T, y: T)(implicit T: Ord[T]) = ???
    def foo[T: Ord](x: T, y: T) = ???
    // Scala 3 (syntax as presented in 2020)
    trait Ord[T]:
      def compare(x: T, y: T): Int
    given Ord[Int]:
      def compare(x: Int, y: Int): Int = ???
    given [T: Ord] as Ord[List[T]]:
      def compare(x: List[T], y: List[T]): Int = ???
    def foo[T](x: T, y: T)(using Ord[T]) = ???
    def foo[T: Ord](x: T, y: T) = ???
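The `given ... as` form on this slide is the pre-release Dotty syntax of the time. As a minimal sketch of the same idea in the syntax that shipped with Scala 3.0, the block below fills in the `???` bodies; the lexicographic list ordering and the helper `max` are illustrative assumptions, not part of the talk.

```scala
// Type class: a total ordering on T.
trait Ord[T]:
  def compare(x: T, y: T): Int

// Anonymous given instance for Int (released Scala 3 `with` syntax).
given Ord[Int] with
  def compare(x: Int, y: Int): Int = Integer.compare(x, y)

// Conditional instance: lists are ordered lexicographically when
// their elements are ordered. The name `listOrd` is hypothetical.
given listOrd[T](using ord: Ord[T]): Ord[List[T]] with
  def compare(xs: List[T], ys: List[T]): Int = (xs, ys) match
    case (Nil, Nil) => 0
    case (Nil, _)   => -1
    case (_, Nil)   => 1
    case (x :: xt, y :: yt) =>
      val c = ord.compare(x, y)
      if c != 0 then c else compare(xt, yt)

// Hypothetical client: the instance is passed via a `using` clause.
def max[T](x: T, y: T)(using ord: Ord[T]): T =
  if ord.compare(x, y) >= 0 then x else y

val biggerInt: Int = max(3, 2)                          // 3
val biggerList: List[Int] = max(List(1, 2), List(1, 3)) // List(1, 3)
```

The compiler assembles `Ord[List[Int]]` from `listOrd` and the `Int` instance, which is the same mechanism the later slides rely on to select an interpreter.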

Scala 3 - Context Function Types
    // Scala 2
    def foo(x: Future[Int])(implicit ec: ExecutionContext): Future[Int] = ???
    // Scala 3
    type Executable[T] = ExecutionContext ?=> T
    def foo(x: Future[Int]): Executable[Future[Int]] = ???

Scala 3 - Opaque Types
    // Scala 2
    case class Logarithm(x: Double) extends AnyVal
    object Logarithm {
      implicit class LogarithmOps(val self: Logarithm) { ... }
    }
    // Scala 3 (syntax as presented in 2020)
    opaque type Logarithm = Double
    object Logarithm:
      def apply(d: Double): Logarithm = math.log(d)
    extension on (x: Logarithm):
      def toDouble = math.exp(x)
      def + (y: Logarithm) = Logarithm(math.exp(x) + math.exp(y))
      def * (y: Logarithm) = x + y

Initial Embedding

Initial Embedding
    enum Exp:
      case Lit(x: Int)
      case Neg(e: Exp)
      case Add(e1: Exp, e2: Exp)
• Encodes expressions of the object language as values of an ADT in the metalanguage
• The data type Exp represents the AST of the object language
• The metalanguage is Scala 3
• Exp is the language of arithmetic expressions with integers, addition and negation (first-order & unityped)

Initial Embedding
    // ti1: Exp
    val ti1 = Add(Lit(8), Neg(Add(Lit(1), Lit(2))))
    // 8 + (- (1 + 2))
• Sample expression expressed as a value of type Exp

Initial Embedding
    def eval(e: Exp): Int = e match
      case Lit(x) => x
      case Neg(e) => - eval(e)
      case Add(e1, e2) => eval(e1) + eval(e2)
    > eval(ti1)
    // 5
• Standard interpreter: evaluator proceeding by case analysis
• Folds recursively over the expression
• Metacircular: integers are Scala integers, addition is Scala addition...

Initial Embedding
    def view(e: Exp): String = e match
      case Lit(x) => x.toString
      case Neg(e) => s"(-${view(e)})"
      case Add(e1, e2) => s"(${view(e1)} + ${view(e2)})"
    > view(ti1)
    // (8 + (-(1 + 2)))
• Non-standard interpreter: pretty-printer
• Different interpretations are possible
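Both interpreters recurse over the expression in the same way. As a hedged sketch of that observation, the block below factors the recursion into a single `fold` and recovers `eval` and `view` as two instantiations of it. The function `fold` and its parameter names are assumptions introduced for illustration; the slides define `eval` and `view` directly by pattern matching.

```scala
enum Exp:
  case Lit(x: Int)
  case Neg(e: Exp)
  case Add(e1: Exp, e2: Exp)

import Exp.*

// One function per constructor: this is the algebra the fold consumes.
def fold[R](e: Exp)(lit: Int => R, neg: R => R, add: (R, R) => R): R =
  e match
    case Lit(x)      => lit(x)
    case Neg(e0)     => neg(fold(e0)(lit, neg, add))
    case Add(e1, e2) => add(fold(e1)(lit, neg, add), fold(e2)(lit, neg, add))

// Standard interpreter: the semantic domain is Int.
def eval(e: Exp): Int =
  fold[Int](e)(identity, x => -x, _ + _)

// Non-standard interpreter: the semantic domain is String.
def view(e: Exp): String =
  fold[String](e)(_.toString, s => s"(-$s)", (a, b) => s"($a + $b)")

val ti1: Exp = Add(Lit(8), Neg(Add(Lit(1), Lit(2))))
val value: Int = eval(ti1)   // 5
val text: String = view(ti1) // (8 + (-(1 + 2)))
```

The three arguments of `fold` have exactly the shape of the constructor functions that the final embedding packs into a type class.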

Final Embedding

Final Embedding
    type Repr = Int
    def lit(x: Int): Repr = x
    def neg(e: Repr): Repr = - e
    def add(e1: Repr, e2: Repr): Repr = e1 + e2
• Represents the expression as its value or by a Scala expression that computes its value
• Repr is an alias for the meaning of the expression, i.e. the semantic type Int in this case
• The functions lit, neg and add compute the meaning of the expression forms of the language (literals, negation and addition)
• The computation is compositional, e.g. the meaning of addition is computed from the meaning of the summands

Final Embedding
    // tf1: Repr
    val tf1 = add(lit(8), neg(add(lit(1), lit(2))))
    // 8 + (- (1 + 2))
• Repr is exactly Int, so just one interpretation is possible: the metacircular interpretation
• The evaluator is hardwired into the expression
• Lowercase replaces uppercase, but we lost abstraction over interpretation. Something else is needed.

Final Embedding
    trait ExpSym[Repr]:
      def lit(x: Int): Repr
      def neg(e: Repr): Repr
      def add(e1: Repr, e2: Repr): Repr
• The constructor functions are now packed into a type class parameterized by the type variable Repr
• ExpSym represents the class of all algebraic expression programs with integer literals, addition and negation for any semantic domain

Final Embedding
    type P[Repr] = ExpSym[Repr] ?=> Repr
    def tf1[Repr]: P[Repr] = add(lit(8), neg(add(lit(1), lit(2))))
• ExpSym is the denotational semantics over the semantic domain Repr, i.e. the meaning of an expression is computed from the meaning of its components, regardless of Repr
• No algorithmic details, just pure syntax: a timeless expression liberated from time and space (the essence of FP, right?)
• The object term is not represented by its AST but by its meaning or denotation in a certain semantic domain:
    val ti1: Exp = Add(Lit(8), …
    // becomes
    def tf1[Repr]: ExpSym[Repr] ?=> Repr = add(lit(8), …

Final Embedding
    trait ExpSym:
      type Repr
      def lit(x: Int): Repr
      def neg(e: Repr): Repr
      def add(e1: Repr, e2: Repr): Repr
• Alternative encoding using type members
• Leads to a more rigid composition through the (infamous?) cake pattern

Final Embedding
    given ExpSym[Int]:
      def lit(x: Int): Int = x
      def neg(e: Int): Int = - e
      def add(e1: Int, e2: Int): Int = e1 + e2
    def eval(x: Int): Int = x
    > eval(tf1)
    // 5
• Interpreters are given by instances of the type class. Sym stands for Symantics: the type class defines the syntax; instances define its semantics
• The evaluator is trivial, just the identity function: a selector of an interpretation as an integer, i.e. the metacircular interpreter.

Final Embedding
    given ExpSym[String]:
      def lit(x: Int): String = x.toString
      def neg(e: String): String = s"(-$e)"
      def add(e1: String, e2: String): String = s"($e1 + $e2)"
    def view(x: String): String = x
    > view(tf1)
    // (8 + (-(1 + 2)))
• The pretty-printing interpreter shows that the final embedding now accepts multiple interpretations, even non-standard ones, as the language is polymorphic
• The evaluator is again trivial; only its type matters, so that the compiler can dispatch the correct type class instance
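The pieces of the final embedding can be assembled into one compilable unit. The following is a sketch under two assumptions: the free-standing helpers `lit`, `neg` and `add` (the slides call them without showing their definitions) simply delegate to the instance in scope, and the instances are written with the `given ... with` syntax of released Scala 3 rather than the 2020 pre-release form. Type arguments are spelled out at the use sites to make the selected interpreter explicit.

```scala
// Syntax of the object language, abstracted over the semantic domain.
trait ExpSym[Repr]:
  def lit(x: Int): Repr
  def neg(e: Repr): Repr
  def add(e1: Repr, e2: Repr): Repr

// Assumed helpers: forward to whichever interpreter is in scope.
def lit[Repr](x: Int)(using s: ExpSym[Repr]): Repr = s.lit(x)
def neg[Repr](e: Repr)(using s: ExpSym[Repr]): Repr = s.neg(e)
def add[Repr](e1: Repr, e2: Repr)(using s: ExpSym[Repr]): Repr = s.add(e1, e2)

// A program: a value of Repr, given an interpreter for Repr.
type P[Repr] = ExpSym[Repr] ?=> Repr

def tf1[Repr]: P[Repr] = add(lit(8), neg(add(lit(1), lit(2))))

// Metacircular interpreter.
given ExpSym[Int] with
  def lit(x: Int): Int = x
  def neg(e: Int): Int = -e
  def add(e1: Int, e2: Int): Int = e1 + e2

// Pretty-printer.
given ExpSym[String] with
  def lit(x: Int): String = x.toString
  def neg(e: String): String = s"(-$e)"
  def add(e1: String, e2: String): String = s"($e1 + $e2)"

def eval(x: Int): Int = x
def view(x: String): String = x

val value: Int = eval(tf1[Int])       // 5
val text: String = view(tf1[String])  // (8 + (-(1 + 2)))
```

The same term `tf1` is interpreted twice without being rebuilt; only the instance selected for `Repr` changes.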

Extensibility - The Expression Problem

“The Expression Problem is a new name for an old problem: to define a datatype by cases, where one can add new cases and new functions over the datatype, without recompiling existing code, and while retaining type safety… Whether a language can solve this is a salient indicator of its capacity for expression.”
Philip Wadler

The expression problem
                  | Variants closed | Variants open
Operations closed |                 | Objects
Operations open   | ADTs            | ???
    enum Exp:
      case Lit(x: Int)
      ...
      case Mul(e1: Exp, e2: Exp)
    // Breaks code! Adjust & recompile!

Extensibility: the final approach
    trait MulSym[Repr]:
      def mul(x: Repr, y: Repr): Repr
• The object language is extended with a new syntactic form mul by defining a new type class MulSym
• New expressions might use previous terms from the unextended language with no changes
    // (ExpSym[Repr], MulSym[Repr]) ?=> Repr
    type PE[Repr] = MulSym[Repr] ?=> P[Repr]
    def tfm1[Repr]: PE[Repr] = add(lit(7), neg(mul(lit(2), lit(2))))
    def tfm2[Repr]: PE[Repr] = mul(lit(7), tf1) // tf1 from the unextended lang

Extensibility: the final approach
    given MulSym[Int]:
      def mul(x: Int, y: Int): Int = x * y
    given MulSym[String]:
      def mul(x: String, y: String): String = s"($x * $y)"
    > eval(tfm2) // Same evaluators as before
    // 35
    > view(tfm2)
    // (7 * (8 + (-(1 + 2))))
• The final encoding accepts not only new interpretations but also new language forms without recompilation
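To make the extension concrete, here is a self-contained sketch in which the `ExpSym` part stands for the existing, untouched code and everything mentioning `MulSym` is the extension. The helper functions and the released `given ... with` syntax are assumptions, as the slides leave the helpers implicit and use the 2020 pre-release syntax.

```scala
// Existing language and interpreters (never edited below).
trait ExpSym[Repr]:
  def lit(x: Int): Repr
  def neg(e: Repr): Repr
  def add(e1: Repr, e2: Repr): Repr

def lit[Repr](x: Int)(using s: ExpSym[Repr]): Repr = s.lit(x)
def neg[Repr](e: Repr)(using s: ExpSym[Repr]): Repr = s.neg(e)
def add[Repr](e1: Repr, e2: Repr)(using s: ExpSym[Repr]): Repr = s.add(e1, e2)

type P[Repr] = ExpSym[Repr] ?=> Repr
def tf1[Repr]: P[Repr] = add(lit(8), neg(add(lit(1), lit(2))))

given ExpSym[Int] with
  def lit(x: Int): Int = x
  def neg(e: Int): Int = -e
  def add(e1: Int, e2: Int): Int = e1 + e2

given ExpSym[String] with
  def lit(x: Int): String = x.toString
  def neg(e: String): String = s"(-$e)"
  def add(e1: String, e2: String): String = s"($e1 + $e2)"

// Extension: a new syntactic form, added as a separate type class.
trait MulSym[Repr]:
  def mul(x: Repr, y: Repr): Repr

def mul[Repr](x: Repr, y: Repr)(using s: MulSym[Repr]): Repr = s.mul(x, y)

// Programs in the extended language need both interpreters.
type PE[Repr] = MulSym[Repr] ?=> P[Repr]

// The old term tf1 is reused unchanged inside a new term.
def tfm2[Repr]: PE[Repr] = mul(lit(7), tf1[Repr])

given MulSym[Int] with
  def mul(x: Int, y: Int): Int = x * y

given MulSym[String] with
  def mul(x: String, y: String): String = s"($x * $y)"

val value: Int = tfm2[Int]      // 35
val text: String = tfm2[String] // (7 * (8 + (-(1 + 2))))
```

Neither `ExpSym` nor its two instances were modified, which is the half of the expression problem that a plain ADT cannot offer.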

The De-Serialization Problem

The de-serialization problem
[Diagram: the expression tf1 is output as an AST (Add, Neg, Add, 8, 1, 2) kept in memory, a DB or a file: EASY. Type checking that AST back into the expression tf1: HARD.]

The de-serialization problem
    enum Tree:
      case Leaf(lbl: String)
      case Node(lbl: String, ts: List[Tree])
• The serialization part is unproblematic, a variation of the pretty-printer
    given ExpSym[Tree]:
      def lit(x: Int): Tree = Node("Lit", Leaf(x.toString) :: Nil)
      def neg(e: Tree): Tree = Node("Neg", e :: Nil)
      def add(e1: Tree, e2: Tree): Tree = Node("Add", e1 :: e2 :: Nil)

The de-serialization problem
    def toTree(t: Tree): Tree = t
    > toTree(tf1)
    // Node(Add, List(Node(Lit,List(Leaf(8))),
    // Node(Neg,List(Node(Add,List(Node(Lit,List(Leaf(1))),
    // Node(Lit,List(Leaf(2)))))))))
• The serializer toTree is just another trivial interpreter, like eval or view
• Produces a JSON-like data structure (or an S-expression for our Lisp friends)

The de-serialization problem
    type ErrMsg = String
    def fromTree[Repr: ExpSym](t: Tree): Either[ErrMsg, Repr] = ...
• Deserialization is necessarily partial: input might be invalid
    object Lit:
      def unapply(t: Tree): Option[Int] = t match
        case Node("Lit", Leaf(s) :: Nil) => Try(s.toInt).toOption
        case _ => None
    ...

The de-serialization problem
    def fromTree[Repr: ExpSym](t: Tree): Either[ErrMsg, Repr] = t match
      case Lit(x) => Right(ExpSym[Repr].lit(x))
      case Neg(t) => fromTree(t).map(ExpSym[Repr].neg)
      case Add(l, r) =>
        for
          l0 <- fromTree(l)
          r0 <- fromTree(r)
        yield ExpSym[Repr].add(l0, r0)
      case t => Left(s"Invalid tree: $t")
    // We lost extensibility

The de-serialization problem
    def evalTree[Repr: ExpSym](t: Tree): Unit = fromTree[Repr](t) match
      case Left(e) => println(s"Error: $e")
      case Right(r) => println(r)
    > evalTree[Int](toTree(tf1)) // We lost polymorphism
    // 5
    > evalTree[Int](Leaf("<bad input>"))
    // Error: Invalid tree: Leaf(<bad input>)
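The round trip can be sketched end to end as follows. This is an illustration rather than the author's code: the patterns are inlined instead of going through the custom extractors `Lit`, `Neg` and `Add` (whose full definitions are elided on the slides), `String.toIntOption` replaces `Try(s.toInt).toOption`, and the value `tree1` is a hypothetical name for the serialized form of tf1.

```scala
enum Tree:
  case Leaf(lbl: String)
  case Node(lbl: String, ts: List[Tree])

import Tree.*

trait ExpSym[Repr]:
  def lit(x: Int): Repr
  def neg(e: Repr): Repr
  def add(e1: Repr, e2: Repr): Repr

given ExpSym[Int] with
  def lit(x: Int): Int = x
  def neg(e: Int): Int = -e
  def add(e1: Int, e2: Int): Int = e1 + e2

// Serializer: interpret the term as a generic labelled tree.
given ExpSym[Tree] with
  def lit(x: Int): Tree = Node("Lit", List(Leaf(x.toString)))
  def neg(e: Tree): Tree = Node("Neg", List(e))
  def add(e1: Tree, e2: Tree): Tree = Node("Add", List(e1, e2))

type ErrMsg = String

// Deserializer: partial, so failures are reported in Left.
// The caller fixes Repr, i.e. chooses one interpreter up front.
def fromTree[Repr](t: Tree)(using s: ExpSym[Repr]): Either[ErrMsg, Repr] =
  t match
    case Node("Lit", List(Leaf(n))) if n.toIntOption.isDefined =>
      Right(s.lit(n.toInt))
    case Node("Neg", List(e)) =>
      fromTree[Repr](e).map(s.neg)
    case Node("Add", List(l, r)) =>
      for
        l0 <- fromTree[Repr](l)
        r0 <- fromTree[Repr](r)
      yield s.add(l0, r0)
    case other =>
      Left(s"Invalid tree: $other")

// tf1 = add(lit(8), neg(add(lit(1), lit(2)))), serialized.
val tree1: Tree =
  val s = summon[ExpSym[Tree]]
  s.add(s.lit(8), s.neg(s.add(s.lit(1), s.lit(2))))

val ok: Either[ErrMsg, Int] = fromTree[Int](tree1)                  // Right(5)
val bad: Either[ErrMsg, Int] = fromTree[Int](Leaf("<bad input>"))   // Left(Invalid tree: Leaf(<bad input>))
```

Because `Repr` is fixed when `fromTree` is called, the result is a value in one semantic domain only, which is the loss of polymorphism the slide points out.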

The de-serialization problem
    // Church encoding (data as a function)
    // newtype Expr = Expr (forall repr. ExpSym repr => repr)
    trait Expr:
      def apply[Repr]: P[Repr]
    given ExpSym[Expr]:
      def lit(x: Int): Expr = new Expr:
        def apply[Repr]: P[Repr] = ExpSym[Repr].lit(x)
      ...
    // We lost extensibility though
    def evalTreeChurch(t: Tree): Unit = fromTree[Expr](t) match
      case Left(e) => println(s"Error: $e")
      case Right(r) =>
        println(r[Int]) // r is again polymorphic
        println(r[String])
    > evalTreeChurch(toTree(tf1))
    // 5
    // (8 + (-(1 + 2)))
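A compact sketch of the Church-encoded wrapper follows. It is an assumption-laden variant of the slide: `apply` takes the interpreter through a `using` clause instead of returning the alias `P[Repr]`, the `neg` and `add` cases elided on the slide are filled in by analogy with `lit`, and the term is built directly through the `ExpSym[Expr]` instance rather than read back from a `Tree`.

```scala
trait ExpSym[Repr]:
  def lit(x: Int): Repr
  def neg(e: Repr): Repr
  def add(e1: Repr, e2: Repr): Repr

given ExpSym[Int] with
  def lit(x: Int): Int = x
  def neg(e: Int): Int = -e
  def add(e1: Int, e2: Int): Int = e1 + e2

given ExpSym[String] with
  def lit(x: Int): String = x.toString
  def neg(e: String): String = s"(-$e)"
  def add(e1: String, e2: String): String = s"($e1 + $e2)"

// A term that can still be interpreted in any semantic domain:
// the universally quantified function is wrapped in a first-class value.
trait Expr:
  def apply[Repr](using ExpSym[Repr]): Repr

// Interpreting into Expr postpones the choice of the real interpreter.
given ExpSym[Expr] with
  def lit(x: Int): Expr = new Expr:
    def apply[Repr](using s: ExpSym[Repr]): Repr = s.lit(x)
  def neg(e: Expr): Expr = new Expr:
    def apply[Repr](using s: ExpSym[Repr]): Repr = s.neg(e.apply[Repr])
  def add(e1: Expr, e2: Expr): Expr = new Expr:
    def apply[Repr](using s: ExpSym[Repr]): Repr =
      s.add(e1.apply[Repr], e2.apply[Repr])

val expr: Expr =
  val s = summon[ExpSym[Expr]]
  s.add(s.lit(8), s.neg(s.add(s.lit(1), s.lit(2))))

// One value, two interpretations.
val asInt: Int = expr.apply[Int]          // 5
val asString: String = expr.apply[String] // (8 + (-(1 + 2)))
```

Deserializing with `fromTree[Expr]` would yield such a value, so the interpreter can be chosen after parsing. `Expr` mentions `ExpSym` only, which is why extensibility is lost again.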

Recap
• DSLs allow us to program at the appropriate level of abstraction according to our domain business rules
• Embedding is a cost-effective way to implement DSLs, as it leverages the features and tools of the metalanguage and its ecosystem
• The initial embedding focuses on defining a syntax as a data type and semantics as evaluators over this data type
• The final embedding focuses on defining a syntax as functions packed in a polymorphic interface and semantics as instances of this interface
• Tagless final allows us to define extensible languages that can be serialized and deserialized while preserving compositionality and type safety

Optimizations

The non-compositionality problem
• Compositionality: the meaning of a complex expression is determined by its structure and the meanings of its constituents
    eval(Add(e1, e2)) === eval(e1) + eval(e2)
• All interpreters have been compositional so far, i.e. they are all folds and context-insensitive
• Many operations, such as program transformations and optimizations, are non-compositional though, i.e. context-sensitive

Pushing negation down
• Part of the conversion to disjunctive normal form (DNF): applying distributive laws and eliminating double negation
    (-(-1)) => 1
    (8 + (-(1 + 2))) => (8 + ((-1) + (-2)))

Pushing negation down - Initial approach
    def pushNeg(e: Exp): Exp = e match
      case Lit(_) => e
      case Neg(Lit(_)) => e
      case Neg(Neg(e)) => pushNeg(e)
      case Neg(Add(e1, e2)) => Add(pushNeg(Neg(e1)), pushNeg(Neg(e2)))
      case Add(e1, e2) => Add(pushNeg(e1), pushNeg(e2))
    val ti1Norm = pushNeg(ti1) // ti1 = Add(Lit(8), Neg(Add(Lit(1), Lit(2))))
    > eval(ti1Norm)
    // 5
    > view(ti1Norm)
    // (8 + ((-1) + (-2)))

Pushing negation down - Final approach
    // The context needs to be explicit
    // to recover compositionality
    enum Ctx:
      case Neg, Pos
    given [Repr] (using s: ExpSym[Repr]) as ExpSym[Ctx => Repr]:
      type R = Ctx => Repr
      def lit(x: Int): R =
        case Pos => s.lit(x)
        case Neg => s.neg(s.lit(x))
      def neg(e: R): R =
        case Pos => e(Neg)
        case Neg => e(Pos)
      def add(e1: R, e2: R): R =
        ctx => s.add(e1(ctx), e2(ctx))

Pushing negation down - Final approach
    def pushNeg[Repr](e: Ctx => Repr): P[Repr] = e(Pos)
    // type P[Repr] = ExpSym[Repr] ?=> Repr
    val tf1Norm = pushNeg(tf1) // tf1 = add(lit(8), neg(add(lit(1), lit(2))))
    > eval(tf1Norm)
    // 5
    > view(tf1Norm)
    // (8 + ((-1) + (-2)))
• Perhaps surprisingly, tagless final allows us to analyze, transform and optimize expressions!
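The transformation can be run as a self-contained sketch. Some choices here are assumptions made to keep the example small and unambiguous: the term takes its interpreter through a plain `using` clause instead of the alias `P[Repr]`, the pattern-matching functions are written with braces, the given instance is named `pushNegSym` (a hypothetical name) and uses the released `given ... with` syntax, and the semantic domain is selected with explicit type arguments.

```scala
trait ExpSym[Repr]:
  def lit(x: Int): Repr
  def neg(e: Repr): Repr
  def add(e1: Repr, e2: Repr): Repr

given ExpSym[Int] with
  def lit(x: Int): Int = x
  def neg(e: Int): Int = -e
  def add(e1: Int, e2: Int): Int = e1 + e2

given ExpSym[String] with
  def lit(x: Int): String = x.toString
  def neg(e: String): String = s"(-$e)"
  def add(e1: String, e2: String): String = s"($e1 + $e2)"

// The context a subterm is interpreted in: under a negation or not.
enum Ctx:
  case Pos, Neg

// An interpreter transformer: the meaning of a term is a function
// from its context to its meaning in the underlying interpreter s.
given pushNegSym[Repr](using s: ExpSym[Repr]): ExpSym[Ctx => Repr] with
  def lit(x: Int): Ctx => Repr = {
    case Ctx.Pos => s.lit(x)
    case Ctx.Neg => s.neg(s.lit(x)) // negation lands on the literal
  }
  def neg(e: Ctx => Repr): Ctx => Repr = {
    case Ctx.Pos => e(Ctx.Neg)      // push the negation inwards
    case Ctx.Neg => e(Ctx.Pos)      // double negation cancels out
  }
  def add(e1: Ctx => Repr, e2: Ctx => Repr): Ctx => Repr =
    ctx => s.add(e1(ctx), e2(ctx))  // negation distributes over +

// Start interpreting in the positive context.
def pushNeg[Repr](e: Ctx => Repr): Repr = e(Ctx.Pos)

def tf1[Repr](using s: ExpSym[Repr]): Repr =
  s.add(s.lit(8), s.neg(s.add(s.lit(1), s.lit(2))))

val normValue: Int = pushNeg(tf1[Ctx => Int])      // 5
val normText: String = pushNeg(tf1[Ctx => String]) // (8 + ((-1) + (-2)))
```

Every method of `pushNegSym` is still compositional; the context sensitivity has been moved into the semantic domain `Ctx => Repr`.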

Typed Higher-Order Languages

Purpose
• Embedding typed, higher-order languages in Scala 3
• Tackling the embedding of both object terms and object types
• Comparing the initial and final approaches
• The running example will be the typed lambda calculus with constants and binding (represented as de Bruijn indices and higher-order abstract syntax)

Typed HO Languages - Initial Embedding

Initial Embedding - The problem of tags
    // lambda-calculus with booleans
    enum Exp:
      case V(v: Var)           // variables
      case B(b: Boolean)       // boolean literals
      case L(f: Exp)           // abstraction
      case A(f: Exp, arg: Exp) // application
    // variables as `de Bruijn` indices
    enum Var:
      case VZ
      case VS(v: Var)
    val ti1 = A(L(V(VZ)), B(true)) // ((b: Boolean) => b)(true)

Initial Embedding - The problem of tags
    type Env = List[Boolean]
    def lookup(v: Var, env: Env): Boolean = ???
    def eval(e: Exp, env: Env) = e match
      case V(v) => lookup(v, env)              // Boolean
      case B(b) => b                           // Boolean
      case L(e0) => x => eval(e0, x :: env)    // function value
      case A(f, arg) => (eval(f, env))(eval(arg, env))
• Different branches return different types, i.e. eval is ill-typed
• Need something more: tags

Initial Embedding - The problem of tags
    // the universal type
    enum U:
      case UB(b: Boolean)
      case UA(f: U => U)
    type Env = List[U]
    def lookup(v: Var, env: Env): U = (v, env) match
      case (VZ, x :: _) => x
      case (VS(v), _ :: env0) => lookup(v, env0)
    // match may not be exhaustive.
    // It would fail on pattern case: (_, Nil)
• UB and UA are discriminators that tell the type of the injected values, i.e. type tags
• lookup is not exhaustive and might fail

Initial Embedding - The problem of tags
    def eval(e: Exp, env: Env): U = e match
      case V(v) => lookup(v, env)
      case B(b) => UB(b)
      case L(e0) => UA(x => eval(e0, x :: env))
      case A(f, arg) => eval(f, env) match
        case UA(f0) => f0(eval(arg, env))
    // match may not be exhaustive.
    // It would fail on pattern case: UB(_)
    val ti2a = A(B(true), B(false))      // compiles but fails as `eval` is partial
    val ti2o = A(L(V(VS(VZ))), B(true))  // open term but `lookup` accepts it
• The language is untyped, so expressions need to be typechecked

Initial Embedding - The problem of tags
    def typecheck(e: Exp): Either[ErrMsg, Exp] = ???
    def safeEval(e: Exp) = typecheck(e) match
      case Right(x) => println(eval(x, Nil))
      case Left(t) => println(s"Type error: $t")
• The presence of type tags and the need for runtime typechecks reveal the lack of type safety
• Ill-typed terms are possible, i.e. the embedding is not tight
• The language is not typed, much like untyped languages, where the tags are not visible but hidden in the runtime
• Ordinary ADTs are unsuitable, i.e. they are too large: we need GADTs

Tagless Initial Embedding - GADTs
    enum Exp[Env, T]:
      case B[Env](b: Boolean) extends Exp[Env, Boolean]
      case V[Env, T](v: Var[Env, T]) extends Exp[Env, T]
      case L[Env, A, B](f: Exp[(A, Env), B]) extends Exp[Env, A => B]
      case A[Env, A, B](f: Exp[Env, A => B], e: Exp[Env, A]) extends Exp[Env, B]
    enum Var[Env, T]:
      case VZ[Env, T]() extends Var[(T, Env), T]
      case VS[Env, A, T](v: Var[Env, T]) extends Var[(A, Env), T]
• The GADT is parameterized not only with the type of the object term but also with the environment, i.e. the free variables in the term
• Constructors express the type system of the calculus, e.g. booleans have the type Boolean in any Env, the application of A => B to A gives you B, all in the same Env, etc.

Tagless Initial Embedding - GADTs
    def lookup[Env, T](v: Var[Env, T], env: Env): T = (v, env) match
      case (VZ(), (x, _)) => x
      case (VS(v), (_, env0)) => lookup(v, env0)
    def eval[Env, T](env: Env, e: Exp[Env, T]): T = (env, e) match
      case (ev, V(v)) => lookup(v, ev)
      case (_, B(b)) => b
      case (ev, L(f)) =>
        def aux[Env0, A, B](f0: Exp[(A, Env0), B], ev0: Env0) =
          (x: A) => eval((x, ev0), f0)
        aux(f, ev)
      case (ev0, A(f, e)) => (eval(ev0, f))(eval(ev0, e))
• The type of eval states that the type parameter T of the GADT expression e is the type of the result for every branch, i.e. no type tags are needed
• Now lookup and eval are total. Ill-typed terms are not representable, so the language is tight
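To isolate what "tagless" buys, the sketch below uses a deliberately smaller language than the one on the slides: booleans, integers, addition and a conditional, with no variables or binders. The constructors `I`, `Add` and `If` are assumptions introduced for this illustration. The point is that the type index `T` lets `eval` return a plain Scala value in every branch, with no universal type and no runtime tag checks.

```scala
// A typed object language as a GADT: T is the type of the term.
enum Exp[T]:
  case B(b: Boolean) extends Exp[Boolean]
  case I(i: Int) extends Exp[Int]
  case Add(l: Exp[Int], r: Exp[Int]) extends Exp[Int]
  case If[T](c: Exp[Boolean], t: Exp[T], e: Exp[T]) extends Exp[T]

import Exp.*

// Each branch refines T, so the result needs no wrapper such as UB/UA.
def eval[T](e: Exp[T]): T = e match
  case B(b)        => b
  case I(i)        => i
  case Add(l, r)   => eval(l) + eval(r)
  case If(c, t, f) => if eval(c) then eval(t) else eval(f)

val sum: Exp[Int] = If(B(true), Add(I(1), I(2)), I(0))
val three: Int = eval(sum) // 3

// Ill-typed terms are rejected by the Scala compiler, for example:
//   Add(B(true), I(1))   // does not compile: B(true) is not an Exp[Int]
```

The slides' version adds the `Env` index on top of this so that variables are typed as well.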

Typed HO Languages - Final Embedding

Tagless Final Embedding - de Bruijn indices
    trait Symantics[Repr[_, _]]:
      def int[Env](x: Int): Repr[Env, Int]
      def add[Env](x: Repr[Env, Int], y: Repr[Env, Int]): Repr[Env, Int]
      def vz[Env, A]: Repr[(A, Env), A]
      def vs[Env, A, B](z: Repr[Env, A]): Repr[(B, Env), A]
      def lam[Env, A, B](f: Repr[(A, Env), B]): Repr[Env, A => B]
      def app[Env, A, B](f: Repr[Env, A => B], repa: Repr[Env, A]): Repr[Env, B]
• Simply typed lambda calculus with integer literals and addition
• Free variables are encoded with de Bruijn indices as typed nested tuples
• A value of type Repr[Env, T] represents a full embedded language expression, given an instance of Symantics for that Repr

Tagless Final Embedding - de Bruijn indices
    type P[Repr[_, _], Env, A] = Symantics[Repr] ?=> Repr[Env, A]
    def td1[Repr[_, _]]: P[Repr, Unit, Int] =
      add(int(1), int(2))
    def td3[Repr[_, _]]: P[Repr, Unit, (Int => Int) => Int] =
      lam(add(app(vz, int(1)), int(2)))
    def td2o[Repr[_, _]]: P[Repr, (Int, Unit), Int => Int] =
      lam(add(vz, vs(vz[Repr, Unit, Int])))
• Expressions are well-typed only in compatible environments
• td2o is only well-typed in the non-empty environment (Int, Unit), i.e. it is an open term

Tagless Final Embedding - de Bruijn indices
    opaque type R[Env, A] = Env => A
    // standard metacircular interpreter
    given Symantics[R]:
      def int[Env](x: Int): R[Env, Int] = R(_ => x)
      def add[Env](x: R[Env, Int], y: R[Env, Int]): R[Env, Int] =
        R(e => x.unR(e) + y.unR(e))
      def vz[Env, A]: R[(A, Env), A] = R(_._1)
      def vs[Env, A, B](z: R[Env, A]): R[(B, Env), A] =
        R((_, e) => z.unR(e))
      def lam[Env, A, B](f: R[(A, Env), B]): R[Env, A => B] =
        R(e => a => f.unR((a, e)))
      def app[Env, A, B](f: R[Env, A => B], repa: R[Env, A]): R[Env, B] =
        R(e => f.unR(e)(repa.unR(e)))

Tagless Final Embedding - de Bruijn indices
    opaque type S[Env, A] = Int => String // `Int`: level index
    // pretty-printer
    given Symantics[S]:
      def int[Env](x: Int): S[Env, Int] = S(_ => x.toString)
      def add[Env](x: S[Env, Int], y: S[Env, Int]): S[Env, Int] =
        S(i => s"(${x.unS(i)} + ${y.unS(i)})")
      def vz[Env, A]: S[(A, Env), A] = S(i => s"x${i - 1}")
      def vs[Env, A, B](z: S[Env, A]): S[(B, Env), A] =
        S(i => z.unS(i - 1))
      def lam[Env, A, B](f: S[(A, Env), B]): S[Env, A => B] =
        S(i => s"\\x$i => ${f.unS(i + 1)}")
      def app[Env, A, B](f: S[Env, A => B], repa: S[Env, A]): S[Env, B] =
        S(i => s"(${f.unS(i)} ${repa.unS(i)})")

Tagless Final Embedding - de Bruijn indices
    def eval[A](e: R[Unit, A]) = e.unR(())
    > eval(td1)
    // 3
    val rd3: (Int => Int) => Int = eval(td3)
    > rd3(_ + 1)
    // 4
    val rd2o = eval(td2o)
    // compilation error
    // `td2o` cannot be evaluated
    // in the `Unit` environment

Tagless Final Embedding - de Bruijn indices
    def view[A](e: S[Unit, A]) = e.unS(0)
    > view(td1)
    // (1 + 2)
    > view(td3)
    // (\x0 => ((x0 1) + 2))

Tagless Final Embedding - HOAS
    trait Symantics[Repr[_]]:
      def int(x: Int): Repr[Int]
      def add(x: Repr[Int], y: Repr[Int]): Repr[Int]
      def lam[A, B](f: Repr[A] => Repr[B]): Repr[A => B]
      def app[A, B](f: Repr[A => B], repa: Repr[A]): Repr[B]
• Simply typed lambda calculus with integer literals and addition
• Named variables instead of indices
• Lambdas are embedded using Scala lambdas (no environment needed, Scala handles this for us)
• A value of type Repr[T] represents a full embedded language expression, given an instance of Symantics for that Repr

Tagless Final Embedding - HOAS
    type P[Repr[_], A] = Symantics[Repr] ?=> Repr[A]
    def th1[Repr[_]]: P[Repr, Int] =
      add(int(1), int(2))
    def th2[Repr[_]]: P[Repr, Int => Int] =
      lam(x => add(x, x))
    def th3[Repr[_]]: P[Repr, (Int => Int) => Int] =
      lam(x => add(app(x, int(1)), int(2)))
• Terms use variable names such as x
• Open terms cannot be expressed at all, since object variables are now Scala variables and open terms cannot be expressed at the top level in Scala

Tagless Final Embedding - HOAS
    opaque type R[A] = A // the identity type
    // standard metacircular interpreter
    given Symantics[R]:
      def int(x: Int): R[Int] = R(x)
      def add(x: R[Int], y: R[Int]): R[Int] = R(x.unR + y.unR)
      def lam[A, B](f: R[A] => R[B]): R[A => B] = R(a => f(R(a)).unR)
      def app[A, B](f: R[A => B], repa: R[A]): R[B] = R(f.unR(repa.unR))

Tagless Final Embedding - HOAS
    opaque type S[A] = Int => String // `Int`: level index
    // pretty-printer
    given Symantics[S]:
      def int(x: Int): S[Int] = S(_ => x.toString)
      def add(x: S[Int], y: S[Int]): S[Int] =
        S(i => s"(${x.unS(i)} + ${y.unS(i)})")
      def lam[A, B](f: S[A] => S[B]): S[A => B] = S { i =>
        val x = s"x$i"
        s"\\$x => ${f(S(_ => x)).unS(i + 1)}"
      }
      def app[A, B](f: S[A => B], repa: S[A]): S[B] =
        S(i => s"(${f.unS(i)} ${repa.unS(i)})")

Tagless Final Embedding - HOAS
    def eval[A](e: R[A]): A = e.unR
    > eval(th1)
    // 3
    val rh2: Int => Int = eval(th2)
    > rh2(1)
    // 2
    val rh3: (Int => Int) => Int = eval(th3)
    > rh3(_ + 1)
    // 4

Tagless Final Embedding - HOAS
    def view[A](e: S[A]): String = e.unS(0)
    > view(th1)
    // (1 + 2)
    > view(th2)
    // (\x0 => (x0 + x0))
    > view(th3)
    // (\x0 => ((x0 1) + 2))
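The HOAS slides can be condensed into one compilable sketch. Several details are assumptions made for the sake of a short, runnable example: `R` and `S` are plain case classes instead of opaque types (whose companions are not shown on the slides), terms take the interpreter through a `using` clause, type arguments of `lam` are written out, and the printer wraps lambdas in parentheses so that the output matches the REPL lines shown above.

```scala
trait Symantics[Repr[_]]:
  def int(x: Int): Repr[Int]
  def add(x: Repr[Int], y: Repr[Int]): Repr[Int]
  def lam[A, B](f: Repr[A] => Repr[B]): Repr[A => B]
  def app[A, B](f: Repr[A => B], a: Repr[A]): Repr[B]

// Semantic domains (case classes stand in for the opaque types).
final case class R[A](unR: A)             // evaluation: the value itself
final case class S[A](unS: Int => String) // printing: fresh-name counter in

// Metacircular interpreter: object lambdas become Scala lambdas.
given Symantics[R] with
  def int(x: Int): R[Int] = R(x)
  def add(x: R[Int], y: R[Int]): R[Int] = R(x.unR + y.unR)
  def lam[A, B](f: R[A] => R[B]): R[A => B] = R((a: A) => f(R(a)).unR)
  def app[A, B](f: R[A => B], a: R[A]): R[B] = R(f.unR(a.unR))

// Pretty-printer: each binder gets a fresh name from the level index.
given Symantics[S] with
  def int(x: Int): S[Int] = S(_ => x.toString)
  def add(x: S[Int], y: S[Int]): S[Int] =
    S(i => s"(${x.unS(i)} + ${y.unS(i)})")
  def lam[A, B](f: S[A] => S[B]): S[A => B] = S { i =>
    val x = s"x$i"
    s"(\\$x => ${f(S[A](_ => x)).unS(i + 1)})"
  }
  def app[A, B](f: S[A => B], a: S[A]): S[B] =
    S(i => s"(${f.unS(i)} ${a.unS(i)})")

// Terms: object variables are ordinary Scala variables.
def th1[Repr[_]](using s: Symantics[Repr]): Repr[Int] =
  s.add(s.int(1), s.int(2))
def th2[Repr[_]](using s: Symantics[Repr]): Repr[Int => Int] =
  s.lam[Int, Int](x => s.add(x, x))
def th3[Repr[_]](using s: Symantics[Repr]): Repr[(Int => Int) => Int] =
  s.lam[Int => Int, Int](f => s.add(s.app(f, s.int(1)), s.int(2)))

val v1: Int = th1[R].unR             // 3
val v2: Int = th2[R].unR(1)          // 2
val v3: Int = th3[R].unR(_ + 1)      // 4
val p2: String = th2[S].unS(0)       // (\x0 => (x0 + x0))
val p3: String = th3[S].unS(0)       // (\x0 => ((x0 1) + 2))
```

No environment appears anywhere: binding and substitution are delegated to Scala, which is what the HOAS encoding is meant to achieve.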

Examples - QUEΛ
    trait MultisetExpr[Repr[_]]:
      def from[A, B](q: Repr[List[A]])(f: Repr[A] => Repr[List[B]]): Repr[List[B]]
      // first-order
      def where[A](cond: Repr[Boolean])(q: Repr[List[A]]): Repr[List[A]]
      def select[A](a: Repr[A]): Repr[List[A]]
      // ...
    trait QUEΛ[Repr[_]] extends MultisetExpr[Repr] with ...
    type P[Repr[_], A] = (QUEΛ[Repr], WorldModel[Repr]) ?=> Repr[A]
    def largeCapitals[Repr[_]]: P[Repr, List[(String, String)]] =
      for
        country <- allCountries // Repr[List[Country]]
        city <- allCities       // Repr[List[City]]
        if country.capital.exists(_ === city.id)
        if city.population > 8000000
      yield city.name -> country.name

Examples - Nivens
    trait Curves[C[_], D[_]]:
      def pure[A](c: Date => Option[A])(from: Date, to: Date, step: Frequency): C[Curve[A]]
      def frequency(f: Frequency): C[Frequency]
      def slice[A](c: C[Curve[A]], from: C[Date], to: C[Date]): C[Curve[A]]
      def concat[A](c1: C[Curve[A]], c2: C[Curve[A]]): C[Curve[A]]
      def downsample[A](c: C[Curve[A]])(f: D[A => A => A])(freq: C[Frequency]): C[Curve[A]]
      // ...
      def map[A, B](c: C[Curve[A]])(f: D[A => B]): C[Curve[B]]
      // higher-order
      // ...
    val settlement: C[Curve[Int] => Curve[Decimal] => Curve[Decimal]] =
      lam(volumes => lam(prices => volumes.downsample(add)(Hourly) * prices))

Takeaways

Takeaways
• Tagless final is a general technique for embedding typed DSLs in GPLs that demands only HKTs (out of the box or via defunctionalization)
• Tagless final produces extensible and optimizable languages whose expressions can be serialized and deserialized
• What is known as tagless final style in the Scala community is based on the general theory but focuses just on abstracting over a monadic computation type (as opposed to a general representation)
• Even so, tagless final demands the re-implementation of many facilities already present in the metalanguage, e.g. lambdas. Why not reuse them? (Stay tuned for the new metaprogramming framework in Dotty)

References
• GitHub code: https://github.com/juanjovazquez/tagless-dotty
• Typed Tagless Final Interpreters (Kiselyov): http://okmij.org/ftp/tagless-final/course/lecture.pdf
• Tagless-final style (Kiselyov et al.): http://okmij.org/ftp/tagless-final/index.html
• Scala 3 (a.k.a. Dotty) documentation: https://dotty.epfl.ch/docs/
• Simplicitly (Odersky et al.): https://infoscience.epfl.ch/record/229878/files/simplicitly_1.pdf
• Folding Domain-Specific Languages (Gibbons): https://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/embedding-short.pdf

THANK YOU!
