Motivating Example
7 (decimal) = 000…000111 (binary): 61 leading zero bits
Task: Write a C program that prints the number of leading zero bits in a 64-bit integer
Approach 1: Implement in Plain C

    #include <stdio.h>

    /* Counts the leading zero bits of x; assumes unsigned long is 64 bits wide. */
    int count_leading_zeroes(unsigned long x) {
        int count = 0; /* number of bits needed to represent x */
        unsigned long cur = x;
        while (cur > 0) {
            cur = cur >> 1;
            count++;
        }
        return 64 - count;
    }

    int main() {
        unsigned long value = 0b111;
        int bits = count_leading_zeroes(value);
        printf("%d", bits);
    }
Approach 2: Reuse Existing Functionality

    int main() {
        unsigned long value = 0b111;
        int bits = __builtin_clzl(value);
        printf("%d", bits);
    }

GCC builtins are provided directly by the (GCC) compiler
Approach 2: Reuse Existing Functionality
The call __builtin_clzl(value) compiles to:

    bsr rsi, rsi
    xor rsi, 63

bsr yields the index of the highest set bit; the XOR with 63 turns that index into the leading-zero count.
GCC builtins typically result in efficient machine code
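The two approaches can be compared side by side. The following is a minimal sketch, not code from the study: the names ULONG_BITS, clz_loop, clz_builtin, and check_clz_approaches are hypothetical. The explicit zero check is there because GCC documents the result of __builtin_clzl(0) as undefined, and the bit width is derived from sizeof rather than hard-coded as 64.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned long in bits (64 on common LP64 platforms). */
#define ULONG_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Portable loop in the spirit of Approach 1, without hard-coding 64. */
int clz_loop(unsigned long x) {
    int used = 0; /* number of bits needed to represent x */
    while (x > 0) {
        x >>= 1;
        used++;
    }
    return ULONG_BITS - used;
}

/* Hypothetical wrapper around the builtin: the builtin's result is
   undefined for 0, so that case is handled explicitly. */
int clz_builtin(unsigned long x) {
    return x == 0 ? ULONG_BITS : __builtin_clzl(x);
}

/* Asserts that both approaches agree on a few sample inputs. */
void check_clz_approaches(void) {
    const unsigned long samples[] = {0UL, 1UL, 7UL, 1UL << 31, ~0UL};
    for (int i = 0; i < 5; i++) {
        assert(clz_loop(samples[i]) == clz_builtin(samples[i]));
    }
}
```

Calling check_clz_approaches() from a test driver exercises the zero case, the motivating value 7, and the all-ones value in one go.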
Why Should Tool Developers Care?
Finding: builtins are used by 37% of the projects that we analyzed
Many Tools Exist that Support Developing C Programs
Clang (LLVM), GCC, ICC
Preliminary experimentation suggested that these compilers support common GCC builtins
Many Tools Exist that Support Developing C Programs
Clang (LLVM), GCC, ICC, CompCert, Tiny C Compiler, KCC, Frama-C, CIL, KLEE, DragonEgg, Sulong, and various other tools
Broader Implications of our Study
• Compiler builtins are an important feature, but not widely understood
  Emerging languages like Rust also provide builtins
• Language design
  Demonstrates what features programming languages like C lack
• Developer feedback
  Informs developers how builtin usage affects how well their code can be analyzed
• Implementation and maintenance of compilers
  Informs compiler developers about which builtins are often used
Research Questions
1. How frequently are builtins used?
2. How well do tools that process C code support builtins?
3. How many builtins must be implemented to support most projects?
4. (How does builtin usage vary over a project’s lifetime?)
5. (For what purposes are builtins used?)
Methodology
1. Obtain C projects
2. Filter C projects
3. Extract builtin uses
4. Filter builtin name records
5. Analyze the results
All steps are replicable, see https://github.com/jku-ssw/gcc-builtin-study
Repeating the study on newly added builtins requires little effort
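To make the "extract builtin uses" step concrete, here is a deliberately simplified sketch. It is not the study's tooling (that lives in the repository linked above). The function names is_ident_char, count_builtin_uses, and check_count_builtin_uses are hypothetical, and the sketch assumes that a builtin use can be approximated as any identifier starting with __builtin_. It does not skip comments or string literals, which a real extraction step would have to consider.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* True for characters that may appear inside a C identifier. */
static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Counts identifiers in src that start with "__builtin_" and have at
   least one further character. Simplification: comments and string
   literals are scanned like ordinary code. */
int count_builtin_uses(const char *src) {
    static const char prefix[] = "__builtin_";
    const size_t prefix_len = sizeof(prefix) - 1;
    int count = 0;
    const char *p = src;
    while (*p != '\0') {
        if (is_ident_char(*p)) {
            /* p is at the start of an identifier-like token; find its end. */
            const char *start = p;
            while (is_ident_char(*p)) {
                p++;
            }
            size_t len = (size_t)(p - start);
            if (len > prefix_len && strncmp(start, prefix, prefix_len) == 0) {
                count++;
            }
        } else {
            p++;
        }
    }
    return count;
}

/* Usage example: two builtin calls, one look-alike that must not count. */
void check_count_builtin_uses(void) {
    const char *code =
        "int f(long x) { return __builtin_expect(__builtin_clzl(x), 1)"
        " + my__builtin_helper(x); }";
    assert(count_builtin_uses(code) == 2);
}
```

Scanning whole tokens rather than searching for the substring keeps names such as my__builtin_helper from being counted by mistake.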
Overview
[Figure: bar chart; axis: Number of projects (0 to 2000); bars: Used builtins, Machine-independent; label: 37%]
• GCC builtins are used by many projects
• ~3,000 different builtins were used
• Builtins are used infrequently within a project: 1 builtin every ~6K LOC
Overview
[Figure: bar chart; axis: Number of projects (0 to 2000); bars: Used builtins, Machine-independent, Machine-specific; labels: 37%, 36%]
Many projects rely on architecture-independent builtins
Architecture-independent Builtins
[Figure: what __builtin_clzl() compiles to]
Definition: Architecture-independent builtins are typically supported on all common architectures
Architecture-specific Builtins
[Figure: bar chart; axis: Number of projects (0 to 2000); bars: Used builtins, Machine-independent, Machine-specific; labels: 37%, 36%, 8%]
• Architecture-specific builtins are less frequently used
• A project can rely on both architecture-specific and architecture-independent builtins
Architecture-specific Builtins
[Figure: what vec_perm() compiles to]
Definition: Architecture-specific builtins are typically supported only on a specific architecture
Test Suite for the 100 Most Frequently Used Builtins

    #include <assert.h>

    int main() {
        volatile unsigned long value = -1;
        assert(__builtin_clzl(value) == 0);
        value = (unsigned long)-1 >> 3;
        assert(__builtin_clzl(value) == 3);
        value = (long)((unsigned long)-1 >> 5) - 4;
        assert(__builtin_clzl(value) == 5);
        return 0;
    }

Each test is run through a tool (Compile / Analyze / Parse): Warning? Error?
Goal was to test common and corner cases
Example: Bugs in CompCert
https://github.com/AbsInt/CompCert/issues/243
__builtin_clz for long and long long incorrectly assumed a 32-bit integer
Output of the __builtin_clzl test program from the test suite:

    a.out: test.c:7: int main(): Assertion `__builtin_clzl(value) == 3' failed.
    Aborted
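The failure pattern above (the first assertion passes, the second fails) is what one would expect if the upper 32 bits of the argument were ignored. The sketch below illustrates that effect only; it is not CompCert's code, and the names clz64_correct, clz64_truncating, and check_truncation_mismatch are hypothetical. It assumes that unsigned int is 32 bits wide, so that __builtin_clz operates on 32-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Correct 64-bit count built from the 32-bit builtin.
   Precondition: x != 0 (the builtin is undefined for 0). */
int clz64_correct(uint64_t x) {
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t lo = (uint32_t)x;
    return hi != 0 ? __builtin_clz(hi) : 32 + __builtin_clz(lo);
}

/* Hypothetical buggy variant: treats the argument as a 32-bit integer
   and silently drops the upper half.
   Precondition: the lower 32 bits of x are not all zero. */
int clz64_truncating(uint64_t x) {
    return __builtin_clz((uint32_t)x);
}

/* Mirrors the first two assertions of the test program. */
void check_truncation_mismatch(void) {
    uint64_t all_ones = UINT64_MAX;
    uint64_t shifted = UINT64_MAX >> 3; /* three leading zero bits */

    /* Both variants agree when bit 31 and bit 63 are both set ... */
    assert(clz64_correct(all_ones) == 0);
    assert(clz64_truncating(all_ones) == 0);

    /* ... but the truncating variant misses zeros in the upper half. */
    assert(clz64_correct(shifted) == 3);
    assert(clz64_truncating(shifted) == 0);
}
```

This is why a test that only checks the all-ones value would not catch such a bug, and why the suite also probes values whose leading zeros lie in the upper half.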
Summary and Discussion
• GCC builtins are a challenge for tool developers
• 37% of projects use GCC builtins (mostly machine-independent ones)
• Many tools lack support for GCC builtins
• An exponentially increasing number of builtins must be implemented to support a growing share of projects
@RiggerManuel
https://github.com/jku-ssw/gcc-builtin-study