EPFL. Working on program transformations focusing on data representation. Miniboxing guy. Scala compiler geek. @VladUreche
Vector[Employee]: ID NAME SALARY | ID NAME SALARY (one record per element)
EmployeeVector: ID ID ... | ... | SALARY SALARY (one column per field)
Iteration over EmployeeVector is 5x faster.
C++ would produce a better representation here, but there are still cases where the C++ representation could be improved upon.
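The two layouts can be sketched in Scala as follows. This is a minimal illustration only: the field types, the `EmployeeVector` definition and the `totalSalary` helpers are assumptions made for the example, not definitions taken from the talk, and the sketch makes no claim about the measured speedup.

```scala
object LayoutSketch {
  // Row-oriented element, as stored in Vector[Employee] (field types are assumed).
  final case class Employee(id: Int, name: String, salary: Float)

  // Column-oriented layout: one array per field (hypothetical definition).
  final class EmployeeVector(
      val ids: Array[Int],
      val names: Array[String],
      val salaries: Array[Float]) {
    def length: Int = ids.length
  }

  // Iterating over Vector[Employee] follows one object reference per element.
  def totalSalary(employees: Vector[Employee]): Float = {
    var sum = 0.0f
    for (e <- employees) sum += e.salary
    sum
  }

  // Iterating over EmployeeVector scans a single Array[Float].
  def totalSalary(employees: EmployeeVector): Float = {
    var sum = 0.0f
    var i = 0
    while (i < employees.length) {
      sum += employees.salaries(i)
      i += 1
    }
    sum
  }
}
```

Both overloads compute the same result; only the memory layout they traverse differs.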
• Vector[T] and Employee are optimal
• Together, Vector[Employee] can be optimized
Challenge: No means of communicating this to the compiler.
You may disagree. We'll have a related work section later.
Defining the transformation → programmer
• based on experience
• based on speculation
• one-time effort
Applying the transformation → compiler (automated)
• repetitive and simple
• affects code readability
• is verbose
• is error-prone
  type Target = Vector[Employee]
  type Result = EmployeeVector
  def toResult(t: Target): Result = ...
  def toTarget(t: Result): Target = ...
  ...
}
The transformation is type-driven: we indicate the type of the target and of the representation.
The improved representation is defined in the host language.
Conversions to/from Vector[Employee] consume/produce an EmployeeVector.
So far so good, but how to execute Vector[Employee] operations on EmployeeVector?
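A self-contained sketch of such a description might look as follows. The `Transformation` trait declared here is a stand-in written for this example, not the plugin's actual API, and the object name `VectorOfEmployeeOpt`, the field types and the conversion bodies are likewise assumptions.

```scala
object TransformationSketch {
  final case class Employee(id: Int, name: String, salary: Float)

  // Column-oriented representation (hypothetical definition).
  final class EmployeeVector(
      val ids: Array[Int],
      val names: Array[String],
      val salaries: Array[Float]) {
    def length: Int = ids.length
  }

  // Stand-in for the trait a transformation description would extend (assumption).
  trait Transformation {
    type Target
    type Result
    def toResult(t: Target): Result
    def toTarget(t: Result): Target
  }

  object VectorOfEmployeeOpt extends Transformation {
    type Target = Vector[Employee]
    type Result = EmployeeVector

    // Splits the records into one array per field.
    def toResult(t: Target): Result =
      new EmployeeVector(
        t.map(_.id).toArray,
        t.map(_.name).toArray,
        t.map(_.salary).toArray)

    // Rebuilds one Employee per index from the three columns.
    def toTarget(t: Result): Target =
      Vector.tabulate(t.length)(i => Employee(t.ids(i), t.names(i), t.salaries(i)))
  }
}
```

Converting a vector with `toResult` and back with `toTarget` yields a vector equal to the original, which is the property the conversions need for values crossing the boundary.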
Applying the transformation (repetitive and simple, verbose, error-prone) can be automated by the compiler.
How: in the paper.
class Vector[T] { … }
Vector[Employee]: ID NAME SALARY | ID NAME SALARY
EmployeeVector: ID ID ... | NAME ... NAME | SALARY SALARY
class NewEmployee(...) extends Employee(...): ID NAME SALARY DEPT
Oooops...
• ... can happen
• Locally the programmer has full control:
  – Knows the values that will be used
  – Can reject non-conforming values
How to use this information? Scopes
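Rejecting non-conforming values could be done in the conversion itself, roughly as sketched below. The class definitions and the `require` check are assumptions for illustration; the talk does not show how the rejection is written.

```scala
object ClosedWorldSketch {
  case class Employee(id: Int, name: String, salary: Float)

  // A subclass with an extra field that the column layout has no room for.
  class NewEmployee(id: Int, name: String, salary: Float, val dept: String)
      extends Employee(id, name, salary)

  // Column-oriented representation with exactly three columns (hypothetical).
  final class EmployeeVector(
      val ids: Array[Int],
      val names: Array[String],
      val salaries: Array[Float])

  // Conversion at the scope boundary: accepts plain Employee values only,
  // and throws IllegalArgumentException for anything else.
  def toResult(t: Vector[Employee]): EmployeeVector = {
    require(
      t.forall(_.getClass == classOf[Employee]),
      "EmployeeVector can only store plain Employee values")
    new EmployeeVector(
      t.map(_.id).toArray,
      t.map(_.name).toArray,
      t.map(_.salary).toArray)
  }
}
```

Without the check, a `NewEmployee` would silently lose its `dept` field on the way into the representation.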
  ... Vector[Employee] =
    for (employee <- employees)
      yield employee.copy(
        salary = (1 + by) * employee.salary
      )
}
Now the method operates on the EmployeeVector representation.
Programmers can freely choose which parts of their code to transform.
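To make "operates on the EmployeeVector representation" concrete, here is a hand-written counterpart of such a method. It is only a sketch of the kind of code a programmer would otherwise write manually: the method name `indexSalary`, its signature, and `indexSalaryRepr` are assumptions, and this is not the compiler's actual output.

```scala
object ScopeSketch {
  final case class Employee(id: Int, name: String, salary: Float)

  // Column-oriented representation (hypothetical definition).
  final class EmployeeVector(
      val ids: Array[Int],
      val names: Array[String],
      val salaries: Array[Float])

  // Method as written by the programmer, against Vector[Employee].
  def indexSalary(employees: Vector[Employee], by: Float): Vector[Employee] =
    for (employee <- employees)
      yield employee.copy(salary = (1 + by) * employee.salary)

  // Hand-written counterpart on the column layout: only the salary column
  // is rebuilt; the id and name arrays are shared with the input, not copied.
  def indexSalaryRepr(employees: EmployeeVector, by: Float): EmployeeVector =
    new EmployeeVector(
      employees.ids,
      employees.names,
      employees.salaries.map(s => (1 + by) * s))
}
```

The second version allocates no `Employee` objects, which is where the representation change pays off.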
• ... classes
  – Inlined immediately after the parser
  – Definitions are visible outside the "scope"
• Locally closed world
  – Incoming/outgoing values go through conversions
  – Programmer can reject unexpected values
• Code can be
  – Left untransformed (using the original repr.)
  – Transformed using different representations
• Between original code and transformed code: coercions
• Between transformed code and transformed code, same transformation: no coercions, even across separate compilation
• Between transformed code and transformed code, different transformation: two coercions (Repr1 → Target → Repr2)
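The two-coercion case can be sketched as a composition of conversions. Everything named here (`Transformation`, `AsArray`, `AsList`, `coerce`) is hypothetical and exists only to show the Repr1 → Target → Repr2 path; in the approach described, the compiler inserts the coercions itself.

```scala
object CoercionSketch {
  // Hypothetical description of a transformation: a pair of conversions.
  trait Transformation[Target, Repr] {
    def toResult(t: Target): Repr
    def toTarget(r: Repr): Target
  }

  // Two made-up representations of the same target type, Vector[Int].
  object AsArray extends Transformation[Vector[Int], Array[Int]] {
    def toResult(t: Vector[Int]): Array[Int] = t.toArray
    def toTarget(r: Array[Int]): Vector[Int] = r.toVector
  }

  object AsList extends Transformation[Vector[Int], List[Int]] {
    def toResult(t: Vector[Int]): List[Int] = t.toList
    def toTarget(r: List[Int]): Vector[Int] = r.toVector
  }

  // A value passing between two differently transformed scopes takes two
  // coercions: first back to the target type, then into the other representation.
  def coerce[T, R1, R2](
      value: R1,
      from: Transformation[T, R1],
      to: Transformation[T, R2]): R2 =
    to.toResult(from.toTarget(value))
}
```

When both scopes use the same transformation, the value already has the right representation and no such call is needed.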
Between original code and transformed code (same or different transformation), the transformation has to preserve the object model.
Handled automatically by the ildl transformation!
• “... without regret” - Tiark Rompf
  – DSLs → small enough to be staged
    • 10000x speed improvements
  – Scala → many features not supported by LMS:
    • Separate compilation/modularization
    • Dynamic dispatch
    • Aliasing
    • Reflection
If we add support, we lose the ability to optimize code :(
• ... machine support
  – Access to the low-level code
  – Can assume a (local) closed world
  – Can speculate based on profiles
  – On the critical path
  – Limited analysis
  – Biggest opportunities are high-level: O(n^2) → O(n)
    • Incoming code is low-level
    • Rarely possible to recover opportunities
Typical solution: Metaprogramming
Full-fledged program transformers
  – :) Lots of power
  – :( Lots of responsibility
    • Compiler invariants
    • Object-oriented model
    • Modularity
  def optimize(tree: AST): AST = { ... }
Can we make metaprogramming “high-level”?
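For a sense of what writing `optimize` by hand involves, here is a toy instance. The miniature `AST` and the constant-folding rule are invented for this sketch; a real compiler tree is far richer, and the transformer author is responsible for every invariant listed above.

```scala
object TransformerSketch {
  // Hypothetical miniature AST, defined only for this example.
  sealed trait AST
  final case class Const(value: Int) extends AST
  final case class Ident(name: String) extends AST
  final case class Add(lhs: AST, rhs: AST) extends AST

  // Bottom-up constant folding: optimizes both operands first, then
  // replaces an addition of two constants with a single constant.
  def optimize(tree: AST): AST = tree match {
    case Add(lhs, rhs) =>
      (optimize(lhs), optimize(rhs)) match {
        case (Const(a), Const(b)) => Const(a + b)
        case (left, right)        => Add(left, right)
      }
    case other => other
  }
}
```

Even this small rewrite works directly on trees rather than on data types, which is the low-level style the question on the slide is aimed at.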
• Solution: data-centric metaprogramming
  – Splitting the responsibility:
    • Defining the Transformation → programmer
    • Applying the Transformation → compiler
  – Scopes
    • Adapt the data representation to the operation
    • Allow speculating properties of the scope
• We've just begun to scratch the surface
  – Many interesting research questions lie ahead