Generic specialization in ooc

Semester project 2012 @ LARA, EPFL


Amos Wenger

June 20, 2012


  1. Specialization in ooc

    Amos Wenger, June 8, 2012

  2. The ooc programming language

    ooc is a general-purpose programming language. It was created in 2009 for an EPFL school project and is now self-hosting, currently at v0.9.4. It produces clean, portable C code, and its SDK works on Windows, OSX, Linux, Haiku, FreeBSD, and probably more. It has been used to create games, power a live-streaming backend architecture (in production), and write compilers, IRC servers, IRC bots, torrent clients, a Lisp implementation, JIT assemblers, package managers, and more.
  3. Class definition

        Logger: class extends Formatter {
            level: Level
            out: Stream

            init: func (=out) {
                level = Level INFO
            }

            log: func (msg: String, level: Level) {
                if (level <= this level)
                    out print(format(msg))
            }
        }
  4. Modules, entry point, string formatting

        import os/Time

        main: func {
            logger := Logger new(stdout)
            logger log("Started at " + Time currentTimeMillis())
            work()
            time := Time currentTimeMillis()
            logger log("Finished at %d" format(time))
        }
  5. Covers (C side)

        // In uv-private.h
        struct uv_loop_s {
            long timestamp;
        };

        // In uv.h
        typedef struct uv_loop_s uv_loop_t;
        UV_EXTERN uv_loop_t* uv_default_loop(void);
        UV_EXTERN int uv_run(uv_loop_t*);
  6. Covers (ooc side)

        Loop_s: cover from uv_loop_s {
            timestamp: Long
        }

        Loop: cover from Loop_s* {
            default: static extern(uv_default_loop) func -> This
            run: extern(uv_run) func
        }

        // example
        loop := Loop default()
        loop run()
  7. Features not covered here

    Well outside the scope of this presentation:

    - Operator overloading
    - Implicit conversions
    - Cover inheritance, compound covers, structured initializers
    - Version blocks
    - Interfaces
    - Custom memory management
    - Enums
    - Pattern matching
  8. Meta-programming in other languages

    - C only allows macros, not generic programming. While this doesn’t prevent the creation of generic containers, type safety is not guaranteed.
    - C++ meta-programming is done via templates: compile-time instantiation, compile-time type safety, and a significant cost in compilation time and binary size. RTTI is available via typeid.
    - JVM-based languages (Java, Scala, Groovy, etc.) have generic classes, with type erasure for backwards compatibility. Compile-time type safety is limited (it can be overridden), and no introspection of generic type parameters is possible at runtime.
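To illustrate the point about C above, here is a minimal sketch of a generic container built on untyped memory. The `Box` type and its `box_*` helpers are hypothetical names made up for this example; they are not part of ooc or of any C library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic box: stores a copy of any value as raw bytes. */
typedef struct {
    void  *data;  /* heap copy of the stored value */
    size_t size;  /* size of the stored value in bytes */
} Box;

/* Copies `size` bytes from `value` into the box. Returns 0 on success. */
int box_init(Box *box, const void *value, size_t size) {
    box->data = malloc(size);
    if (box->data == NULL) return -1;
    memcpy(box->data, value, size);
    box->size = size;
    return 0;
}

/* Copies the stored bytes into `out`. Nothing checks that `out` has the
 * same type as the value that was stored. */
void box_get(const Box *box, void *out) {
    memcpy(out, box->data, box->size);
}

/* Releases the heap copy and resets the box. */
void box_free(Box *box) {
    free(box->data);
    box->data = NULL;
    box->size = 0;
}
```

The container works for any element type, but a call such as `box_get(&box, &some_float)` on a box that holds an `int` compiles without any diagnostic, which is the missing type safety mentioned above.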
  9. Types

    A type can either be:

    - A complex type: object, interface (e.g. String, Logger).
    - A primitive type: cover, cover from (e.g. Int, Boolean).

    Java has a similar distinction (int vs Integer). In ooc, instead of boxing and unboxing, primitive types are allowed as generic type parameters.
  10. Generics - Functions

        identity: func <T> (value: T) -> T {
            value
        }

        // primitive type
        answer := identity(42)

        // object type
        identity("foobar") println()
  11. Generics - Classes

        Container: class <T> {
            value: T
            init: func (=value)
            get: func -> T { value }
        }

    The number of generic parameters is not limited:

        Map: abstract class <K, V> {
            put: abstract func (key: K, value: V)
            get: abstract func (key: K) -> V
        }
  12. The problem

    Non-generic code generates straightforward C code, but generic types add to the semantics of the language and have no natural C translation.

        /* what is the generic version of this? */
        int identity(int value) {
            return value;
        }

    Generic type sizes can vary: operations on generic values must work whatever their actual size is at runtime. So must operations on arrays of generic values.
  13. The solution

    All types in ooc have runtime type information, returned by TypeName class(), where TypeName is the name of the type. This structure contains the width (size in bytes) of the type.

        typedef uint8_t *Any;

        void identity(Class *T, Any value, Any ret) {
            if (ret) {
                memcpy(ret, value, T->size);
            }
        }
  14. The solution

    Calls like this one:

        a := 42
        b := identity(a)

    are translated as:

        int a = 42;
        int b;
        identity(Int_class(), (Any) &a, (Any) &b);
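The generic `identity` function and its call site can be combined into a small self-contained sketch. The `Class` structure below is a simplified stand-in that assumes only a name and a size (the real runtime structure is not shown in these slides), and `identity_demo` is a hypothetical wrapper added for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for ooc's runtime type information (assumption:
 * only the fields needed for this example are modelled). */
typedef struct {
    const char *name;
    size_t size;   /* width of the type in bytes */
} Class;

typedef uint8_t *Any;

/* Stand-in for the Int_class() accessor used in the translated call. */
const Class *Int_class(void) {
    static const Class int_class = { "Int", sizeof(int) };
    return &int_class;
}

/* Generic identity: copies T->size bytes from value into ret. */
void identity(const Class *T, Any value, Any ret) {
    if (ret) {
        memcpy(ret, value, T->size);
    }
}

/* C translation of:  a := 42;  b := identity(a)  */
int identity_demo(void) {
    int a = 42;
    int b;
    identity(Int_class(), (Any) &a, (Any) &b);
    return b;
}
```

Both the argument and the return slot are passed by address, so the same compiled function serves every type whose size is described by a `Class`.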
  15. The solution

    When casting from a generic type to a concrete type, the generic value is unboxed by dereferencing its address.

        void somefunc(Any value) {
            int i = *((int*) value);
            // do something with i
        }

    For arrays of generic types, the position of an element is computed at runtime using its index and the size of an element.
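The array case can be sketched in a few lines of C. The helper names `generic_at`, `generic_set`, and `generic_get_int` are hypothetical and chosen for this example only; the element size is passed explicitly here, where generated code would presumably read it from the type's class structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t *Any;

/* Address of element `index` in an array whose element size is only
 * known at runtime: base address + index * element size. */
Any generic_at(Any data, size_t elem_size, size_t index) {
    return data + index * elem_size;
}

/* Stores a generic value at `index` by copying elem_size bytes. */
void generic_set(Any data, size_t elem_size, size_t index, Any value) {
    memcpy(generic_at(data, elem_size, index), value, elem_size);
}

/* Unboxes element `index` as an int by dereferencing its address. */
int generic_get_int(Any data, size_t index) {
    return *((int *) generic_at(data, sizeof(int), index));
}
```

Every element access therefore costs a multiplication and an addition at runtime, on top of the copy or dereference itself.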
  16. The solution’s problem

    - Passing the address of generic values instead of the values themselves adds an extra indirection (dereference), which incurs a speed penalty.
    - Calling memcpy is much more expensive than the = operator in C. Since the size is only known at runtime, no C compiler is smart enough to optimize memcpy to something else.
    - gc_malloc calls are more expensive than stack allocations (for local generic variables).

    These explanations were based on intuition; the subject of this work was to implement generic specialization in order to assess the performance problem and solve it.
  17. The solution, part II - Specialization

        typedef uint8_t *Any;

        void identity(Class *T, Any value, Any ret) {
            if (ret) {
                memcpy(ret, value, T->size);
            }
        }

        int identity__int(int value) {
            const Class *T = Int_class();
            return value;
        }
  18. The economics of specialization

    41 changed files with 2,233 additions and 2,385 deletions. Net cost: -152 lines of code.
  19. Using specialization

    Our implementation specializes functions that are marked with the inline keyword (pre-existing, but previously unused). It also adds a compiler instruction named #specialize, used to manually mark a type parameter combination for specialization. For example, #specialize ArrayList<Int> would make all lists of integers faster, while all other combinations would work as usual.
  20. Benchmark

    Our benchmark is bubble sort on a simple ArrayList implementation. The full benchmark fits in 100 lines of code.
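The benchmark source itself is not reproduced in these slides. As a rough C-level sketch of what such a comparison contrasts (not the author's benchmark), the two functions below sort the same data: one works on elements whose size is only known at runtime, the other is written directly for `int`. All names, including the `Compare` callback type, are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint8_t *Any;

/* Hypothetical comparator type: negative, zero, or positive result. */
typedef int (*Compare)(const void *a, const void *b);

int compare_int(const void *a, const void *b) {
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Generic bubble sort: element size known only at runtime, swaps done
 * with memcpy. Returns 0 on success, -1 if allocation fails. */
int bubble_sort_generic(Any data, size_t count, size_t elem_size, Compare cmp) {
    Any tmp = malloc(elem_size);
    if (tmp == NULL) return -1;
    for (size_t pass = 0; pass + 1 < count; pass++) {
        for (size_t i = 0; i + 1 < count - pass; i++) {
            Any left = data + i * elem_size;
            Any right = left + elem_size;
            if (cmp(left, right) > 0) {
                memcpy(tmp, left, elem_size);
                memcpy(left, right, elem_size);
                memcpy(right, tmp, elem_size);
            }
        }
    }
    free(tmp);
    return 0;
}

/* Specialized bubble sort for int: plain loads, stores, comparisons. */
void bubble_sort_int(int *data, size_t count) {
    for (size_t pass = 0; pass + 1 < count; pass++) {
        for (size_t i = 0; i + 1 < count - pass; i++) {
            if (data[i] > data[i + 1]) {
                int tmp = data[i];
                data[i] = data[i + 1];
                data[i + 1] = tmp;
            }
        }
    }
}
```

Bubble sort performs many element reads and swaps relative to its code size, so the per-element overhead of the generic version (address arithmetic, `memcpy`, indirect comparison) shows up clearly in timings.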
  21. Why not a larger application?

        removeAt: func (index: SSizeT) -> T {
            element := data[index]
            memmove(data + (index * T size),
                    data + ((index + 1) * T size),
                    (_size - index) * T size)
            _size -= 1
            element
        }
  22. How to fix legacy code

        removeAt: func (index: SSizeT) -> T {
            element := data[index]
            data[index.._size - 1] = data[index + 1.._size]
            element
        }
  23. Source size cost

  24. Performance gains (GCC)

  25. Performance gains (Clang/LLVM)

  26. Conclusion

    Specialization proved to be an interesting alternative to the no-compromise C++ and JVM models. It allows partial specialization of generic types. Unspecialized code remains as fast as generic collections in C (cf. qsort), and specialized code performance is comparable to C++ template code. Further work is needed for legacy code to take advantage of the optimizations implemented here, because of abstraction leaks.
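For readers unfamiliar with the qsort reference, the standard C library function takes the element size and a comparator as runtime arguments, much like the unspecialized generic code described earlier. A minimal sketch follows; `int_cmp` and `sort_ints` are hypothetical names used for this example.

```c
#include <stdlib.h>

/* Comparator for qsort: receives untyped pointers to two elements. */
int int_cmp(const void *a, const void *b) {
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Sorts an int array through qsort's generic interface, where the
 * element size is passed as a runtime argument. */
void sort_ints(int *values, size_t count) {
    qsort(values, count, sizeof(int), int_cmp);
}
```

A C++ `std::sort` over `int` plays the role of the specialized counterpart, since the template is instantiated for the concrete element type at compile time.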
  27. Questions

    Thanks for listening!