Slide 1

Slide 1 text

Specialization in ooc
Amos Wenger
June 8, 2012

Slide 2

Slide 2 text

The ooc programming language
- ooc is a general-purpose programming language.
- It was created in 2009 for an EPFL school project and is now self-hosting, currently at v0.9.4.
- It produces clean, portable C code. Its SDK works on Windows, OSX, Linux, Haiku, FreeBSD, and probably more.
- It has been used to create games, power a live-streaming backend architecture (in production), write compilers, IRC servers, IRC bots, and torrent clients, and implement Lisp, JIT assemblers, package managers, and more.
https://github.com/languages/ooc

Slide 3

Slide 3 text

Class definition
    Logger: class extends Formatter {
        level: Level
        out: Stream

        init: func (=out) {
            level = Level INFO
        }

        log: func (msg: String, level: Level) {
            if (level <= this level)
                out print(format(msg))
        }
    }

Slide 4

Slide 4 text

Modules, entry point, string formatting
    import os/Time

    main: func {
        logger := Logger new(stdout)
        logger log("Started at " + Time currentTimeMillis())
        work()
        time := Time currentTimeMillis()
        logger log("Finished at %d" format(time))
    }

Slide 5

Slide 5 text

Covers (C side)
    // In uv-private.h
    struct uv_loop_s {
        long timestamp;
    };

    // In uv.h
    typedef struct uv_loop_s uv_loop_t;
    UV_EXTERN uv_loop_t* uv_default_loop(void);
    UV_EXTERN int uv_run(uv_loop_t*);

Slide 6

Slide 6 text

Covers (ooc side)
    Loop_s: cover from uv_loop_s {
        timestamp: Long
    }

    Loop: cover from Loop_s* {
        default: static extern(uv_default_loop) func -> This
        run: extern(uv_run) func
    }

    // example
    loop := Loop default()
    loop run()

Slide 7

Slide 7 text

Features not covered here
Well outside the scope of this presentation:
- Operator overloading
- Implicit conversions
- Cover inheritance, compound covers, structured initializers
- Version blocks
- Interfaces
- Custom memory management
- Enums
- Pattern matching

Slide 8

Slide 8 text

Meta-programming in other languages
- C only allows macros, not generic programming. While this doesn't prevent the creation of generic containers, type safety is not guaranteed.
- C++ meta-programming is done via templates: compile-time instantiation, compile-time type safety, and a significant cost in compilation time and binary size. RTTI is available via typeid.
- JVM-based languages (Java, Scala, Groovy, etc.) have generic classes, with type erasure for the sake of backwards compatibility. Compile-time type safety is limited (it can be overridden), and no introspection of type parameters is possible at runtime.
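As a rough illustration of the C point above, the sketch below contrasts an untyped container with a macro-generated one. The names Box, DEFINE_BOX and IntBox are hypothetical and chosen only for this example; they do not come from the presentation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical untyped container: the element type is erased to void *. */
typedef struct Box {
    void *value;
} Box;

static void box_put(Box *box, void *value) { box->value = value; }
static void *box_get(const Box *box) { return box->value; }

/* Hypothetical macro-based alternative: stamps out one typed box per type. */
#define DEFINE_BOX(T, Name)                                            \
    typedef struct Name { T value; } Name;                             \
    static void Name##_put(Name *box, T value) { box->value = value; } \
    static T Name##_get(const Name *box) { return box->value; }

DEFINE_BOX(int, IntBox)

/* Round trip through the untyped box: the caller must cast back correctly. */
static int untyped_roundtrip(int input) {
    Box box;
    box_put(&box, &input);
    /* This cast is unchecked: (double *) would compile just as well. */
    return *(int *) box_get(&box);
}

/* Round trip through the macro-generated box: the compiler checks the type. */
static int typed_roundtrip(int input) {
    IntBox box;
    IntBox_put(&box, input);
    return IntBox_get(&box);
}
```

Both round trips return their input unchanged. The difference is that only the macro version lets the compiler reject a mismatched element type, and it does so at the price of one generated copy of the code per type.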

Slide 9

Slide 9 text

Types
A type can be either:
- A complex type: object, interface. e.g. String, Logger, etc.
- A primitive type: cover, cover from. e.g. Int, Boolean
Java has a similar distinction (int vs Integer). In ooc, instead of boxing and unboxing, primitive types are allowed as generic type parameters.

Slide 10

Slide 10 text

Generics - Functions
    identity: func <T> (value: T) -> T {
        value
    }

    // primitive type
    answer := identity(42)

    // object type
    identity("foobar") println()

Slide 11

Slide 11 text

Generics - Classes
    Container: class <T> {
        value: T
        init: func (=value)
        get: func -> T { value }
    }
The number of generic parameters is not limited:
    Map: abstract class <K, V> {
        put: abstract func (key: K, value: V)
        get: abstract func (key: K) -> V
    }

Slide 12

Slide 12 text

The problem
Non-generic code generates straightforward C code, but generic types add to the semantics of the language and have no natural C translation.
    /* what is the generic version of this? */
    int identity(int value) {
        return value;
    }
Generic type sizes can vary: operations on generic values must work whatever their actual size is at runtime. So must operations on arrays of generic values.

Slide 13

Slide 13 text

The solution
All types in ooc have runtime type information, returned by the TypeName class() function. This structure contains the width of the type.
    typedef uint8_t *Any;

    void identity(Class *T, Any value, Any ret) {
        if (ret) {
            memcpy(ret, value, T->size);
        }
    }

Slide 14

Slide 14 text

The solution
Calls like this one:
    a := 42
    b := identity(a)
are translated as:
    int a = 42;
    int b;
    /* argument order follows identity(T, value, ret): source first, destination last */
    identity(Int_class(), (Any) &a, (Any) &b);
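The definition and the call site can be put together in a small compilable sketch. The Class struct below is a stripped-down stand-in that carries only a name and a size, and Int_class and Double_class are hand-written here, whereas the slides describe them as provided by the language runtime. The two wrapper functions are hypothetical helpers added for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed, minimal runtime type descriptor: only the width matters here. */
typedef struct Class {
    const char *name;
    size_t size;
} Class;

typedef uint8_t *Any;

/* Hand-written stand-ins for the per-type accessors. */
static Class *Int_class(void) {
    static Class c = { "Int", sizeof(int) };
    return &c;
}

static Class *Double_class(void) {
    static Class c = { "Double", sizeof(double) };
    return &c;
}

/* Generic identity: copies T->size bytes from value into ret, if requested. */
static void identity(Class *T, Any value, Any ret) {
    assert(T != NULL && value != NULL);
    if (ret) {
        memcpy(ret, value, T->size);
    }
}

/* The same compiled function serves a 4-byte and an 8-byte type. */
static int identity_int_demo(int a) {
    int b = 0;
    identity(Int_class(), (Any) &a, (Any) &b);
    return b;
}

static double identity_double_demo(double a) {
    double b = 0.0;
    identity(Double_class(), (Any) &a, (Any) &b);
    return b;
}
```

Only one body of identity exists in the binary. The price is that every call passes addresses and goes through memcpy with a size read at runtime.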

Slide 15

Slide 15 text

The solution
When casting from a generic type to a concrete type, the generic value is unboxed by dereferencing its address.
    void somefunc(Any value) {
        int i = *((int*) value);
        // do something with i
    }
For arrays of generic types, the position of an element is computed at runtime using its index and the size of an element.
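The address computation for generic arrays can be sketched as follows. GenericArray and its helper functions are hypothetical names used for illustration; the layout actually generated by the compiler may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t *Any;

/* Hypothetical generic array: raw bytes plus the element width. */
typedef struct GenericArray {
    Any data;
    size_t elem_size;
    size_t length;
} GenericArray;

/* Address of element i is base + i * elem_size, computed at runtime. */
static Any generic_array_at(const GenericArray *arr, size_t index) {
    assert(index < arr->length);
    return arr->data + index * arr->elem_size;
}

/* Store a value by copying elem_size bytes into the computed slot. */
static void generic_array_set(GenericArray *arr, size_t index, Any value) {
    memcpy(generic_array_at(arr, index), value, arr->elem_size);
}

/* Unboxing: cast the generic address to a concrete pointer and dereference. */
static int unbox_int(Any value) {
    return *((int *) value);
}

/* Overwrites element 2 of {10, 20, 30, 40} with 99 and reads it back. */
static int generic_array_demo(void) {
    int storage[4] = { 10, 20, 30, 40 };
    GenericArray arr = { (Any) storage, sizeof(int), 4 };
    int replacement = 99;
    generic_array_set(&arr, 2, (Any) &replacement);
    return unbox_int(generic_array_at(&arr, 2));
}
```

Because elem_size is a runtime value, the multiplication and the memcpy cannot be folded into a plain indexed store, which is the cost discussed on the next slide.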

Slide 16

Slide 16 text

The solution’s problem
- Passing the address of a generic value instead of the value itself adds an indirection (a dereference), which incurs a speed penalty.
- Calling memcpy is much more expensive than the = operator in C. Because the size is only known at runtime (T->size), the C compiler cannot reduce the memcpy call to a plain assignment.
- gc_malloc calls are more expensive than stack allocations (for local generic variables).
These explanations were based on intuition; the subject of this work was to implement generic specialization in order to assess the performance problem and solve it.

Slide 17

Slide 17 text

The solution, part II - Specialization
    typedef uint8_t *Any;

    void identity(Class *T, Any value, Any ret) {
        if (ret) {
            memcpy(ret, value, T->size);
        }
    }

    int identity__int(int value) {
        const Class *T = Int_class();
        return value;
    }

Slide 18

Slide 18 text

The economics of specialization
https://github.com/nddrylliog/rock/tree/specialize
41 changed files with 2,233 additions and 2,385 deletions.
Net cost: -152 lines of code

Slide 19

Slide 19 text

Using specialization
Our implementation specializes functions that are marked with the inline keyword (which already existed but was unused).
It also adds a compiler instruction named #specialize, used to manually mark a type parameter combination for specialization.
For example, #specialize ArrayList<Int> would make all lists of integers faster, and all other combinations would work as usual.

Slide 20

Slide 20 text

Benchmark
Our benchmark is bubble sort on a simple ArrayList implementation. The full benchmark fits in 100 lines of code.
https://github.com/nddrylliog/semester-project/
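To give an idea of what such a benchmark compares, here is a minimal C sketch of bubble sort in a generic, runtime-sized form and in a form specialized for int. It is not the benchmark from the repository above; the function names, the comparison callback and the 64-byte scratch buffer are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t *Any;
typedef int (*Compare)(const void *a, const void *b);

/* Generic form: element width known only at runtime, swaps go through memcpy. */
static void bubble_sort_generic(Any data, size_t length, size_t elem_size,
                                Compare cmp) {
    uint8_t tmp[64]; /* scratch slot; this sketch assumes elem_size <= 64 */
    assert(elem_size <= sizeof tmp);
    for (size_t pass = 0; pass + 1 < length; pass++) {
        for (size_t i = 0; i + 1 < length - pass; i++) {
            Any left = data + i * elem_size;
            Any right = left + elem_size;
            if (cmp(left, right) > 0) {
                memcpy(tmp, left, elem_size);
                memcpy(left, right, elem_size);
                memcpy(right, tmp, elem_size);
            }
        }
    }
}

/* Specialized form: plain int comparisons and assignments, no indirection. */
static void bubble_sort_int(int *data, size_t length) {
    for (size_t pass = 0; pass + 1 < length; pass++) {
        for (size_t i = 0; i + 1 < length - pass; i++) {
            if (data[i] > data[i + 1]) {
                int tmp = data[i];
                data[i] = data[i + 1];
                data[i + 1] = tmp;
            }
        }
    }
}

/* Comparison callback used by the generic form for int elements. */
static int compare_int(const void *a, const void *b) {
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}
```

Both functions produce the same ordering. The generic one pays for an indirect comparison call and three memcpy calls per swap, which is the kind of overhead specialization is meant to remove.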

Slide 21

Slide 21 text

Why not a larger application?
    removeAt: func (index: SSizeT) -> T {
        element := data[index]
        memmove(data + (index * T size),
                data + ((index + 1) * T size),
                (_size - index) * T size)
        _size -= 1
        element
    }

Slide 22

Slide 22 text

How to fix legacy code
    removeAt: func (index: SSizeT) -> T {
        element := data[index]
        data[index.._size - 1] = data[index + 1.._size]
        element
    }
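One way to picture why the range assignment helps: it leaves the element width implicit, so the same source can plausibly be lowered either to a byte-wise move or to typed assignments. The C sketch below shows both forms side by side. The struct and function names are hypothetical, the code is an assumption about what such a lowering could look like rather than the compiler's actual output, and the move count is written as size - index - 1 so that it stays inside the live elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t *Any;

/* Hypothetical generic list: raw storage plus runtime element width. */
typedef struct GenericList {
    Any data;
    size_t elem_size;
    size_t size;
} GenericList;

/* Hypothetical list specialized for int. */
typedef struct IntList {
    int *data;
    size_t size;
} IntList;

/* Generic form: shift the tail down by one slot with memmove. */
static void generic_list_remove_at(GenericList *list, size_t index, Any ret) {
    assert(index < list->size);
    Any slot = list->data + index * list->elem_size;
    if (ret) {
        memcpy(ret, slot, list->elem_size);
    }
    memmove(slot, slot + list->elem_size,
            (list->size - index - 1) * list->elem_size);
    list->size -= 1;
}

/* Specialized form: the same shift written as typed element assignments. */
static int int_list_remove_at(IntList *list, size_t index) {
    assert(index < list->size);
    int element = list->data[index];
    for (size_t i = index; i + 1 < list->size; i++) {
        list->data[i] = list->data[i + 1];
    }
    list->size -= 1;
    return element;
}
```

The legacy version on the previous slide cannot be rewritten this way automatically, because it multiplies by T size and calls memmove by hand.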

Slide 23

Slide 23 text

Source size cost

Slide 24

Slide 24 text

Performance gains (GCC)

Slide 25

Slide 25 text

Performance gains (Clang/LLVM)

Slide 26

Slide 26 text

Conclusion
Specialization proved to be an interesting alternative to the no-compromise C++ and JVM models.
- It allows partial specialization of generic types.
- Unspecialized code remains as fast as generic collections in C (cf. qsort), and specialized code performance is comparable to C++ template code.
- Further work is needed for legacy code to take advantage of the optimizations implemented here, because of abstraction leaks.

Slide 27

Slide 27 text

Questions
Thanks for listening!
https://github.com/nddrylliog/