a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. (6.5 §2)
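A minimal sketch of what this rule means in practice, assuming the classic `a[i] = i++;` pattern (an illustration, not taken from the slides): the read of `i` in `a[i]` is unsequenced relative to the increment, so the statement is undefined. The hypothetical helper `store_and_advance` splits it into two full expressions so that every access is sequenced.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Undefined form:  a[i] = i++;
 *   (the value computation of i in a[i] is unsequenced relative to
 *    the side effect of i++)
 * Well-defined form: one side effect per full expression. */
static size_t store_and_advance(int *a, size_t i) {
    a[i] = (int)i;  /* the store completes at the end of this statement */
    i++;            /* the increment is sequenced after the store */
    return i;       /* caller receives the advanced index */
}
```

A caller would write `i = store_and_advance(a, i);` in place of the undefined statement.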
already really ignore some of the more idiotic C standard rules that introduce pointless undefined behavior: things like the strict aliasing rules are just insane, and the "overflow is u[nd]efined" is bad too. So we use -fno-strict-aliasing -fno-strict-overflow -fno-delete-null-pointer-checks to basically say "those optimizations are fundamentally stupid and wrong, and only encourage compilers to generate random code that doesn't actually match the source code".
(Linus 2017)
*number = 5; free(number); printf("%d\n", *number);
prints 5
Address (offset = 0, pointee) → I32 (value) → Integer (value = 5)
The GC will collect the object if it is no longer referenced
= INT_MAX; int val = a + b; printf("%d\n", val);
UB
Most compilers:
mov edi, .L.str
mov esi, -2147483648
call printf
But: integer overflow is not handled consistently
Lenient C: Signed integer overflow has wraparound semantics (-fno-strict-overflow)
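The wraparound semantics described above can also be written out explicitly, without relying on a compiler flag. The sketch below is an assumption-laden illustration, not code from the talk: the helper name `wrapping_add` is hypothetical, and it assumes a two's-complement `int` with `UINT_MAX == 2 * INT_MAX + 1`, which holds on mainstream platforms. The addition is performed on `unsigned int`, where overflow is defined to wrap, and the result is converted back without any out-of-range conversion.

```c
#include <limits.h>

/* Hypothetical helper: signed addition with two's-complement wraparound. */
static int wrapping_add(int a, int b) {
    /* Unsigned arithmetic wraps modulo UINT_MAX + 1 by definition. */
    unsigned int sum = (unsigned int)a + (unsigned int)b;

    if (sum <= (unsigned int)INT_MAX) {
        return (int)sum;               /* value fits, conversion is exact */
    }
    /* Map the upper half of the unsigned range onto negative ints.
     * UINT_MAX - sum lies in [0, INT_MAX], so every step stays in range. */
    return -(int)(UINT_MAX - sum) - 1;
}
```

Under these assumptions `wrapping_add(INT_MAX, 1)` yields `INT_MIN`, which matches the `-2147483648` constant in the compiled output on the slide for a 32-bit `int`.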
int val = a << b; printf("%d\n", val);
Consensus is impossible!
Slide labels, in extraction order: -O3, ???, x86, 2147483648, PowerPC, UB, 0
mov edi, .L.str
mov esi, <some value>
call printf
Lenient C: Invalid shift values have x86 shift semantics
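As a rough illustration of "x86 shift semantics": for 32-bit operands, the x86 shift instructions use only the low 5 bits of the shift count, so a count of 32 behaves like 0 and a count of 33 like 1. The sketch below models that in portable C. The helper name `lenient_shl` is hypothetical, and whether Lenient C is implemented this way is an assumption; the slide only states the intended semantics.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit left shift with the count masked to
 * 5 bits, mirroring how x86 treats shift counts for 32-bit operands. */
static int32_t lenient_shl(int32_t a, int32_t b) {
    uint32_t count = (uint32_t)b & 31u;            /* keep low 5 bits only */
    uint32_t bits  = (uint32_t)((uint32_t)a << count); /* defined: unsigned, count < 32 */

    if (bits <= (uint32_t)INT32_MAX) {
        return (int32_t)bits;                      /* fits, exact conversion */
    }
    /* Reinterpret the upper half as negative values without an
     * out-of-range conversion (assumes two's-complement int32_t). */
    return -(int32_t)(UINT32_MAX - bits) - 1;
}
```

With this model an invalid count never produces undefined behavior: `lenient_shl(1, 32)` is `1`, and a negative count such as `-1` is treated as `31`.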
Cuoq et al. • C* by Ertl • C-like languages
• Replaces many occurrences of “X has undefined behavior” with “X results in an unspecified value”
• Addresses 14 points
• C* specifies language elements according to the hardware features
• Behavior might be different for every platform
the JVM
ManagedObject
• Address: pointee: ManagedObject, offset: int
• I32Array: values: int[]
• Function: id: long
• I32: value: int
• DoubleArray: values: double[]
Lenient C as a user-friendly C dialect
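To make the class diagram above more concrete, here is a loose C analogue of two of its boxes, `Address` and `I32Array`. This is only a sketch under stated assumptions: the real object model lives on the JVM, the struct layouts and the helpers `address_add` and `address_load` are hypothetical, and the offset is counted in elements here for simplicity. The point it illustrates is that a pointer represented as (pointee, offset) lets every access be checked against the pointee's bounds.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analogue of I32Array: values plus a known length. */
typedef struct {
    int32_t *values;
    size_t   length;
} I32Array;

/* Hypothetical analogue of Address: a pointee and an offset into it. */
typedef struct {
    I32Array *pointee;
    ptrdiff_t offset;   /* element offset (simplifying assumption) */
} Address;

/* Pointer arithmetic only moves the offset; the pointee is unchanged. */
static Address address_add(Address a, ptrdiff_t n) {
    a.offset += n;
    return a;
}

/* Checked load: reports failure instead of reading out of bounds. */
static bool address_load(Address a, int32_t *out) {
    if (a.pointee == NULL || a.offset < 0 ||
        (size_t)a.offset >= a.pointee->length) {
        return false;
    }
    *out = a.pointee->values[a.offset];
    return true;
}
```

In this representation an out-of-bounds or negative offset is detectable at the point of the load, which a raw machine pointer cannot offer.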
Will Dietz, Peng Li, John Regehr, and Vikram Adve. 2012. Understanding integer overflow in C/C++. In Proceedings of the 34th International Conference on Software Engineering (ICSE '12).
• Wang 2013: Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, and Armando Solar-Lezama. 2013. Towards optimization-safe systems: analyzing the impact of undefined behavior. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (SOSP '13).
• Memarian 2016: Kayvan Memarian, Justus Matthiesen, James Lingard, Kyndylan Nienhuis, David Chisnall, Robert N. M. Watson, and Peter Sewell. 2016. Into the depths of C: elaborating the de facto standards. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '16).