Deductive verification of unmodified Linux kernel library functions

Denis Efremov, NRU HSE, defremov@hse.ru
ISoLA 2018, Limassol, Cyprus, 6 November 2018

Motivation

Tools evaluation (Frama-C + AstraVer + Why3) on real code:
• Is our specification language expressive enough to describe this code?
• How far can we go without imposing restrictions on the C syntax?

Tests for our tools:
• When we change our memory/arithmetic models, will a fully proved function still be easy to reprove?

Linux kernel library functions

Linux kernel:
• Doesn’t rely on any other piece of software (no stdlib, only gcc builtins)
• Userspace/kernelspace pointers
• Contains no floating-point operations (well, almost)
• Wide use of gcc extensions
• Heavyweight casting operations: container_of/offsetof, unions, pointers to integers, void * to struct *, pointers to functions, bitwise operations
• You can’t rewrite the code to be more “suitable” for verification in all cases

Contains implementations of many «standard» stdlib functions on strings and memory:
• Generic versions in C
• Architecture-optimized versions in assembler

What can we say about this function?

    int average(int a, int b)
    {
        return (a + b) / 2;
    }

• This is a pure C function;
• It computes the average value of two int values;
• There is a signed integer overflow under certain conditions.

What is deductive verification? What does it look like in practice?
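The "certain conditions" can be made concrete with a small check. The sketch below is not from the slides: the helper names are hypothetical, and it assumes that long long is wider than int, as on mainstream platforms. It tests whether a + b leaves the int range and computes a reference average without intermediate overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the slides): true iff computing a + b in
 * type int would overflow. The sum is formed in long long, assumed wider
 * than int, so the check itself cannot overflow. */
static bool average_sum_overflows(int a, int b)
{
    long long sum = (long long)a + (long long)b;

    return sum > INT_MAX || sum < INT_MIN;
}

/* Hypothetical reference: the average computed without intermediate
 * overflow. Half of a sum of two ints always fits in an int again. */
static int average_reference(int a, int b)
{
    long long half = ((long long)a + (long long)b) / 2;

    assert(INT_MIN <= half && half <= INT_MAX);
    return (int)half;
}
```

For example, average_sum_overflows(INT_MAX, 1) is true, so the original average(INT_MAX, 1) has undefined behavior, while average_reference(INT_MAX, 1) is still well defined.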

What is deductive verification?
Context of a function call

• Calling context for average: a binary search function;
• The indexes l and h are non-negative, and l is not greater than h;
• An integer overflow in m may lead to an out-of-bounds access (base[m]).

    int average(int a, int b)
    {
        return (a + b) / 2;
    }

    int *binsearch(int *base, int n, int key)
    {
        int l = 0, h = n - 1;

        while (l <= h) {
            int m = average(l, h);
            int val = base[m];

            if (val < key) {
                l = m + 1;
            } else if (val > key) {
                h = m - 1;
            } else {
                return base + m;
            }
        }
        return NULL;
    }

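Formal specification for a C function
Contract of a function

Describe the call context (pre-conditions):

$$P : \mathtt{int} \times \mathtt{int} \to \{\top, \bot\}, \qquad P(a, b) \equiv a \ge 0 \land b \ge 0 \land a \le b$$

Describe the functional requirements on results (post-conditions):

$$Q : \mathtt{int} \times \mathtt{int} \times \mathtt{int} \to \{\top, \bot\}, \qquad Q(a, b, r) \equiv r = \frac{a + b}{2}$$

Formal specification for a C function
Error model and code representation

• Define an error (an integer overflow):

$$\mathit{in\_bounds} : \mathbb{Z} \to \{\top, \bot\}, \qquad \mathit{in\_bounds}(n) \equiv \mathit{INT\_MIN} \le n \le \mathit{INT\_MAX}$$

• Formalize the program code: the function $f$ returns the result $f(a, b)$ according to its program code iff it terminates and terminates without an error; otherwise, a special value $\omega$ is returned.

• Prove total correctness:

$$\forall a, b.\; P(a, b) \Rightarrow f(a, b) \ne \omega \land Q(a, b, f(a, b))$$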

Formal specification for a C function
Code should comply with specification

    /*@ requires 0 <= a && 0 <= b;
        requires a <= b;
        ensures \result == (a + b) / 2;
     */
    int average(int a, int b)
    {
        return (a + b) / 2;
    }

Formal specification for a C function
Verification condition

Verification condition generated for average with the pre-conditions 0 <= a && 0 <= b and a <= b:

    predicate in_bounds (n:int) = min <= n /\ n <= max
    constant a : t17
    constant b : t17
    axiom H : of_int 0 <= a /\ of_int 0 <= b /\ a <= b
    axiom H1: in_bounds 2
    constant o : t17
    axiom H2 : to_int o = 2
    goal WP_parameter_average: in_bounds (to_int a + to_int b)

Formal specification for a C function
Pre-condition update

    /*@ requires 0 <= a <= INT_MAX / 2;
        requires 0 <= b <= INT_MAX / 2;
        requires a <= b;
        ensures \result == (a + b) / 2;
     */
    int average(int a, int b)
    {
        return (a + b) / 2;
    }

Formal specification for a C function
Code fix

     int average(int a, int b)
     {
    -    return (a + b) / 2;
    +    return a + (b - a) / 2;
     }
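The code fix can be exercised at run time. The sketch below keeps the expression from the slide; the function names and the long long oracle are assumptions added for illustration, and long long is assumed wider than int.

```c
#include <assert.h>
#include <limits.h>

/* The code fix from the slide: the midpoint is computed without forming
 * a + b. The original pre-condition 0 <= a <= b is checked at run time. */
static int average_fixed(int a, int b)
{
    assert(0 <= a && a <= b);
    return a + (b - a) / 2;    /* b - a lies in [0, INT_MAX]: no overflow */
}

/* Hypothetical oracle: the post-condition \result == (a + b) / 2,
 * evaluated in long long so that the sum cannot overflow. */
static int average_fixed_matches_spec(int a, int b)
{
    long long expected = ((long long)a + (long long)b) / 2;

    return (long long)average_fixed(a, b) == expected;
}
```

With this variant the pre-condition does not need the INT_MAX / 2 bounds: average_fixed(0, INT_MAX) and average_fixed(INT_MAX, INT_MAX) are both well defined.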

Related Work

• M. Torlakcik, «Contracts in OpenBSD» (2010)
  • Frama-C + Jessie deductive verification plugin
  • 12 functions (7 fully proved)
  • Solvers: Simplify (1.5.4), Alt-Ergo (0.7.3), Z3 (2.0)
• N. Carvalho, C. Silva Sousa, J. Sousa Pinto, A. Tomb, «Formal Verification of kLIBC with the WP Frama-C Plug-in» (2014)
  • Frama-C + WP deductive verification plugin
  • 26 functions (14 fully proved)
  • Solvers: Alt-Ergo (0.95.1), CVC3 (2.4.1), Z3 (4.3.1)
• D. R. Cok, I. Blissard, J. Robbins, «C Library annotations in ACSL for Frama-C: experience report» (2017)

Known problems (1)
And how to handle them (ACSL extensions)

    char *strnchr(const char *s, size_t count, int c)
    {
        for (; count-- && *s != '\0'; ++s)
            if (*s == (char)c)
                return (char *)s;
        return NULL;
    }

    char *strnchr(const char *s, size_t count, int c)
    {
        for (; count-- /*@%*/ && *s != '\0'; ++s)
            if (*s == (char) /*@%*/ c)
                return (char *)s;
        return NULL;
    }

• The underflow of an unsigned loop iterator at the last iteration step due to the postfix decrement (marked with /*@%*/ after count--);
• The intended cast to a smaller integer type (marked with /*@%*/ after the cast);
• Pointer casts. Example: unsigned char * to char *.
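The first bullet can be observed directly. The sketch below is a hypothetical instrumented copy of strnchr, not the kernel source: the extra count_out parameter is an assumption added to expose the value left in count when the function returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instrumented copy of strnchr: same loop as on the slide,
 * plus count_out, which receives the final value of count. */
static char *strnchr_traced(const char *s, size_t count, int c,
                            size_t *count_out)
{
    for (; count-- && *s != '\0'; ++s) {
        if (*s == (char)c) {
            *count_out = count;
            return (char *)s;
        }
    }
    *count_out = count;
    return NULL;
}

/* Usage example: the budget of 2 characters runs out before a match. The
 * last evaluation of count-- yields 0 and wraps count around to SIZE_MAX. */
static void strnchr_traced_demo(void)
{
    size_t left = 0;

    assert(strnchr_traced("abc", 2, 'z', &left) == NULL);
    assert(left == SIZE_MAX);
}
```

The wrap-around is well defined for unsigned types in C, which is why it is an intended overflow to be annotated rather than a bug to be fixed.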

VerKer Results (1)

• 26 library functions from Linux (Frama-C + AstraVer + Why3)
  • 17 str* functions
  • 6 mem* functions
  • 3 others
• 25 fully proved functions
  • In memmove we were not able to discharge one verification condition
• In 9 functions there was an intended integer overflow
• In 7 functions there was an intended integer cast to a smaller type
• In 2 functions we «slightly» changed the code to prove them
• Solvers: Alt-Ergo (2.0), CVC4 (1.4), CVC4 (1.6)

VerKer Results (2)

• VC transformation strategy for solver benchmarking
• Total number of verification conditions: 2781
• Number of lemmas: 69 (37 proved automatically)
• Integration with a CI system (Travis CI)
• About 2.6 lines of specification per line of C code on average (about 900 lines of C)
• Open specifications and verification artifacts (proofs):
  http://forge.ispras.ru/projects/verker
  https://github.com/evdenis/verker

Memmove

• The memory model implemented in the AstraVer (Jessie) plugin allows arithmetic operations on pointers only when the pointers belong to the same allocated memory block;
• For memmove, this is not necessarily the case;
• If we state in the specification contract that src and dest may belong to different allocated memory blocks, then it is impossible to prove the VC stating that they belong to the same memory block;
• Comparison of pointers to different memory blocks is undefined behavior in ANSI C.

    void *memmove(void *dest, const void *src, size_t count)
    {
        if (dest <= src)
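To illustrate why the comparison matters, here is a minimal sketch of a direction-choosing copy. It is not the kernel's memmove and not what was verified: the name memmove_sketch is hypothetical, and the addresses are compared as uintptr_t values, a swapped-in technique whose result is implementation-defined rather than undefined for unrelated objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: choose the copy direction from the integer values
 * of the two addresses so that overlapping regions are copied correctly. */
static void *memmove_sketch(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        while (count--)        /* forward copy: dest starts at or below src */
            *d++ = *s++;
    } else {
        d += count;
        s += count;
        while (count--)        /* backward copy: dest starts above src */
            *--d = *--s;
    }
    return dest;
}

/* Usage example: shift four bytes to the right inside one buffer. */
static void memmove_sketch_demo(void)
{
    char buf[] = "abcdef";

    memmove_sketch(buf + 2, buf, 4);
    assert(memcmp(buf, "ababcd", 6) == 0);
}
```

Only the direction test differs from the `dest <= src` form on the slide; whether a verification tool's memory model accepts the integer comparison is a separate question that this sketch does not answer.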

The modified functions
An implicit cast in memset, strcmp

    void *memset(void *s, int c, size_t count)
    {
        char *xs = s;

        while (count--)
            *xs++ = c;
        return s;
    }

    int strcmp(const char *cs, const char *ct)
    {
        unsigned char c1, c2;

        while (1) {
            c1 = *cs++;
            c2 = *ct++;
            if (c1 != c2)
                return c1 < c2 ? -1 : 1;
            if (!c1)
                break;
        }
        return 0;
    }

With the casts made explicit:

    void *memset(void *s, int c, size_t count)
    {
        char *xs = s;

        while (count--)
            *xs++ = (char) c;
        return s;
    }

    int strcmp(const char *cs, const char *ct)
    {
        unsigned char c1, c2;

        while (1) {
            c1 = (unsigned char) *cs++;
            c2 = (unsigned char) *ct++;
            if (c1 != c2)
                return c1 < c2 ? -1 : 1;
            if (!c1)
                break;
        }
        return 0;
    }

With the explicit casts annotated by /*@%*/:

    void *memset(void *s, int c, size_t count)
    {
        char *xs = s;

        while (count--)
            *xs++ = (char) /*@%*/ c;
        return s;
    }

    int strcmp(const char *cs, const char *ct)
    {
        unsigned char c1, c2;

        while (1) {
            c1 = (unsigned char) /*@%*/ *cs++;
            c2 = (unsigned char) /*@%*/ *ct++;
            if (c1 != c2)
                return c1 < c2 ? -1 : 1;
            if (!c1)
                break;
        }
        return 0;
    }
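The effect of the narrowing cast in memset can be checked at run time. This is a sketch with a hypothetical function name. It assumes an 8-bit char and the usual wrap-around conversion, which is implementation-defined when char is signed and the value does not fit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copy of the modified memset with the cast made explicit. */
static void *memset_explicit(void *s, int c, size_t count)
{
    char *xs = s;

    while (count--)
        *xs++ = (char)c;    /* intended narrowing: the low byte of c is kept */
    return s;
}

/* Usage example: 0x141 does not fit in a char, so each byte becomes 0x41. */
static void memset_explicit_demo(void)
{
    unsigned char buf[4] = {0};
    size_t i;

    memset_explicit(buf, 0x141, sizeof buf);
    for (i = 0; i < sizeof buf; i++)
        assert(buf[i] == 0x41);
}
```

A verifier with a strict arithmetic model would report this narrowing as a possible error unless it is marked as intended, which is the role of the /*@%*/ annotation on the slide.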

What’s next (1)

• «Lemma Functions for Frama-C: C Programs as Proofs», G. Volkov, M. Mandrykin, D. Efremov, https://arxiv.org/abs/1811.05879
• Auto-active verification technique for the Frama-C framework
• Lemma-functions ACSL extension
• Interactive proving (Coq) vs auto-active verification technique
• 31 lemma-functions
• Source code: https://github.com/evdenis/verker/tree/lemma_functions

What’s next (2)
Arch-optimized implementations of functions

Function   Implementations on architectures
memmove    powerpc (2), s390, mips, x86_64, alpha, sparc (2)
memcpy     ia64 (2), powerpc (2), s390, mips (2), x86_64 (3), alpha, sparc
memset     ia64, powerpc (2), s390, mips, x86_64, alpha (2), sparc (2)
memchr     powerpc, alpha (2)
memcmp     powerpc (2), sparc
memscan    sparc (4)
strcat     alpha (2)
strchr     alpha (2)
strncmp    powerpc, sparc (2)
strcpy     alpha
strlen     ia64, powerpc, alpha (2), sparc
strrchr    alpha, arm64

How to verify all these implementations?

• Runtime verification
  • Translate the contract of a generic function into assertions (Frama-C + E-ACSL)
  • Extract a particular implementation
  • Integrate a fuzzer (e.g., libFuzzer) with the assertions
  • Run the testing in QEMU (user mode)
  • Catch violations of postconditions

Translate specifications to assertions
Frama-C E-ACSL

    #include <assert.h>
    #include <stdbool.h>

    /*@ requires 0 <= a && 0 <= b;
        requires a <= b;
        ensures \result == (a + b) / 2;
     */
    int average(int a, int b)
    {
        return (a + b) / 2;
    }

    bool average_precondition(int a, int b)
    {
        if (0 <= a && 0 <= b)
            if (a <= b)
                return true;
        return false;
    }

    /* a and b are passed in so that the expected value can be recomputed */
    bool average_postcondition(int a, int b, int ret_value)
    {
        long result = ((long) a + (long) b) / 2;
        if (result == ret_value)
            return true;
        return false;
    }

    int _average(int a, int b)
    {
        assert(average_precondition(a, b));
        int _tmp = average(a, b);
        assert(average_postcondition(a, b, _tmp));
        return _tmp;
    }

Translate specifications to assertions
Frama-C E-ACSL, fuzzing entry point

    #include <assert.h>
    #include <stdbool.h>

    /*@ requires 0 <= a && 0 <= b;
        requires a <= b;
        ensures \result == (a + b) / 2;
     */
    int average(int a, int b)
    {
        return (a + b) / 2;
    }

    bool average_precondition(int a, int b)
    {
        if (0 <= a && 0 <= b)
            if (a <= b)
                return true;
        return false;
    }

    /* a and b are passed in so that the expected value can be recomputed */
    bool average_postcondition(int a, int b, int ret_value)
    {
        long result = ((long) a + (long) b) / 2;
        if (result == ret_value)
            return true;
        return false;
    }

    /* Inputs that violate the pre-condition are skipped, not reported */
    void _fuzz_average(int fuzz_a, int fuzz_b)
    {
        if (average_precondition(fuzz_a, fuzz_b)) {
            int _tmp = average(fuzz_a, fuzz_b);
            assert(average_postcondition(fuzz_a, fuzz_b, _tmp));
        }
    }
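One way to connect such a check to a fuzzer is sketched below. LLVMFuzzerTestOneInput is libFuzzer's standard entry point (built with clang and -fsanitize=fuzzer); everything else is an assumption made for illustration, not the authors' harness. The function under test is the overflow-free variant, so the assertion holds for every input. Swapping in (a + b) / 2 should let a fuzzer run under a sanitizer find the overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Function under test: the overflow-free variant from the code-fix slide. */
static int average(int a, int b)
{
    return a + (b - a) / 2;
}

static bool average_precondition(int a, int b)
{
    return 0 <= a && 0 <= b && a <= b;
}

/* Post-condition evaluated in long long so the sum cannot overflow. */
static bool average_postcondition(int a, int b, int ret_value)
{
    long long expected = ((long long)a + (long long)b) / 2;

    return expected == ret_value;
}

/* libFuzzer entry point: decode two ints from the raw input, skip inputs
 * that violate the pre-condition, and assert the post-condition. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    int a, b;

    if (size < 2 * sizeof(int))
        return 0;                       /* not enough bytes for two ints */
    memcpy(&a, data, sizeof a);
    memcpy(&b, data + sizeof a, sizeof b);

    if (average_precondition(a, b))
        assert(average_postcondition(a, b, average(a, b)));
    return 0;
}
```

Filtering on the pre-condition keeps the fuzzer from reporting calls that the contract does not allow in the first place.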

Questions?

Logic errors. Can you see the contradiction?
The artificial example

    /*@ requires 0 == 1;
        ensures \result == 0 && \result == 1 && \result == 2;
     */
    int main(void)
    {
        int a = 1;
        return a / 0;
    }

• The contradiction in the specification;
• Division by zero in the main function;
• Errors in the specification may lead to missed errors in the code.

Logic errors. Can you see the contradiction?
The real example

    logic Z Count{L}(int *a, Z m, Z n, int v);

    axiom CountSectionEmpty:
      ∀ int *a, v, Z m, n;
        n ≤ m ⇒ Count(a, m, n, v) == 0;

    axiom CountSectionHit:
      ∀ int *a, v, Z n, m;
        a[n] == v ⇒ Count(a, m, n + 1, v) == Count(a, m, n, v) + 1;

Logic errors. Can you see the contradiction?
The real example (continued)

With the axioms CountSectionEmpty and CountSectionHit, all of the following assertions are provable:

    int a = 5;
    assert Count(&a+1, 0, -1, 5) == 0 && Count(&a+1, 0, 0, 5) == 0;
    assert Count(&a+1, 0, 0, 5) == Count(&a+1, 0, -1, 5) + 1;
    assert 0 == 1;
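The three assertions follow from instantiating the two axioms. As a worked sketch, write $p$ for &a+1, so that $p[-1]$ is the variable a, which holds 5:

```latex
\begin{aligned}
&\text{CountSectionEmpty, } m = 0,\ n = -1: && -1 \le 0 \;\Rightarrow\; \mathit{Count}(p, 0, -1, 5) = 0 \\
&\text{CountSectionEmpty, } m = 0,\ n = 0:  && 0 \le 0 \;\Rightarrow\; \mathit{Count}(p, 0, 0, 5) = 0 \\
&\text{CountSectionHit, } m = 0,\ n = -1:   && p[-1] = 5 \;\Rightarrow\; \mathit{Count}(p, 0, 0, 5) = \mathit{Count}(p, 0, -1, 5) + 1 = 1 \\
&\text{Combining the last two lines:}       && 0 = 1
\end{aligned}
```

The contradiction arises because CountSectionHit can be applied to an empty section ($n < m$). A guard such as $m \le n$ on that axiom would block this instantiation; that repair is an observation added here, not something stated on the slides.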

Logic errors. Can you see the contradiction?
How to check yourself

• Insert an incorrect assertion. It should not be proved;
• Perform a special transformation of a verification condition:
  • Try to prove it;
  • Negate it and try to prove the negation;
  • If both are proved, you’ve got a problem.

    /*@ requires 0 == 1;
        ensures \result == 0 && \result == 1 && \result == 2;
     */
    int main(void)
    {
        int a = 1;
        //@ assert 0 == 1;
        return a / 0;
    }
