Chef: Prototyping Symbolic Execution Engines for Interpreted Languages

Stefan Bucur, Johannes Kinder, George Candea
School of Computer and Communication Sciences, EPFL, Switzerland

Automated Software Testing
Symbolic execution is used for bug finding, increasing coverage, and debugging.
It has been applied to device drivers, system utilities, file parsers, distributed systems, and (today) interpreted programs.

Symbolic Execution

    int foo(int x) {
        if (x > 10) {
            return 2*x;
        }
        x = x + 1;
        if (x > 5) {
            return 3*x;
        } else {
            return x;
        }
    }

Execution tree, starting from x ⟵ λ:
• λ > 10: ret 2λ
• λ ≤ 10: x ⟵ λ+1
  • λ+1 > 5: ret 3(λ+1)
  • λ+1 ≤ 5: ret λ+1

Test case: λ = 6 satisfies (λ ≤ 10) ∧ (λ+1 > 5)
Test cases for bug finding and statement coverage
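The tree on this slide can be reproduced with a small sketch. It is not how Chef or S2E work internally: it assumes a single integer input, linear arithmetic only, forking by re-execution, and a brute-force search over a small range standing in for a constraint solver. The names SymInt, Trace, solve, and explore are hypothetical.

```python
class Trace:
    """Records branch decisions and path constraints for one execution."""
    def __init__(self, forced):
        self.decisions = list(forced)   # outcomes forced for the first branches
        self.constraints = []           # predicates on the input lam

    def branch(self, pred):
        i = len(self.constraints)
        if i == len(self.decisions):
            self.decisions.append(True)  # unexplored branch: take the true side
        taken = self.decisions[i]
        self.constraints.append(pred if taken else (lambda lam: not pred(lam)))
        return taken


class SymInt:
    """Symbolic integer of the form a*lam + b over a single input lam."""
    def __init__(self, a, b, trace):
        self.a, self.b, self.trace = a, b, trace

    def __add__(self, k):
        return SymInt(self.a, self.b + k, self.trace)

    def __rmul__(self, k):
        return SymInt(self.a * k, self.b * k, self.trace)

    def __gt__(self, k):
        a, b = self.a, self.b
        return self.trace.branch(lambda lam: a * lam + b > k)

    def concretize(self, lam):
        return self.a * lam + self.b


def foo(x):
    """Python transcription of the slide's foo()."""
    if x > 10:
        return 2 * x
    x = x + 1
    if x > 5:
        return 3 * x
    else:
        return x


def solve(constraints, domain=range(-50, 51)):
    """Brute-force stand-in for a solver: first lam satisfying all constraints."""
    for lam in domain:
        if all(c(lam) for c in constraints):
            return lam
    return None


def explore(fn):
    """Enumerate paths by re-running fn with flipped branch decisions."""
    tests, worklist = [], [[]]
    while worklist:
        prefix = worklist.pop()
        trace = Trace(prefix)
        result = fn(SymInt(1, 0, trace))
        lam = solve(trace.constraints)
        if lam is not None:              # keep only feasible paths
            tests.append((lam, result.concretize(lam)))
        for i in range(len(prefix), len(trace.decisions)):
            worklist.append(trace.decisions[:i] + [not trace.decisions[i]])
    return tests


if __name__ == "__main__":
    for lam, out in explore(foo):
        print(f"test input lam={lam}, expected return {out}")
```

Each generated input exercises a different path of foo. A real engine would replace solve with an SMT solver and would fork states rather than re-execute.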

Programming Languages and Symbolic Execution Engines
• Compiled languages: Scala, Java, C#, C, C++
• Compiled to: Java Bytecode, LLVM, x86, ARM
• Existing engines: BitBlaze, KLEE, JPF, SAGE, S2E
• Interpreted languages: Python, Ruby, Lua, JavaScript, Bash, Perl
• Engines for interpreted languages: ?

Interpreted Languages

    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")

Slide callouts on the code: "Since Python 2.5", "Complete File Read", "Incomplete Specification"

Complex semantics
+ Ambiguity in specifications
+ Evolving language
+ Large standard library
+ Widespread native methods
⇒ Significant Engineering Effort

“Consequently, if you were coming from Mars and tried to
re-implement Python from this document alone, you might have to guess things and in fact you would probably end up implementing quite a different language.” - The Python Language Reference

How can we efficiently obtain a correct symbolic execution engine?

Key idea: Use the language interpreter as executable specification

CHEF + Language X Interpreter = Symbolic Execution Engine for Language X
Program + Symbolic Tests → Symbolic Execution Engine for Language X → Test Cases

Chef Overview
• Built on top of the S2E symbolic execution engine for x86
• Relies on lightweight interpreter instrumentation + optimizations
• Prototyped engines for Python and Lua in 5 + 3 person-days

Talk Outline
1. Challenges
2. Class-Uniform Path Analysis (CUPA)
3. Interpreter Recipe
4. Uses and Evaluation

Testing Interpreted Programs
Naive approach: Run interpreter in a stock symbolic execution engine

    def validateEmail(email):
        pos = email.find("@")
        if pos < 1:
            raise InvalidEmailError()
        if email.rfind(".") < pos:
            raise InvalidEmailError()

The program runs in the Python interpreter (./python program.py), which in turn runs inside the x86 symbolic execution engine (S2E).

Naive approach: Run interpreter in a stock symbolic execution engine

The single call pos = email.find("@") executes the interpreter's native string search (excerpts as shown on the slides; elided parts are marked with ...):

    Py_LOCAL_INLINE(Py_ssize_t)
    fastsearch(const STRINGLIB_CHAR* s, Py_ssize_t n,
               const STRINGLIB_CHAR* p, Py_ssize_t m,
               Py_ssize_t maxcount, int mode)
    {
        unsigned long mask;
        Py_ssize_t skip, count = 0;
        Py_ssize_t i, j, mlast, w;

        w = n - m;

        if (w < 0 || (mode == FAST_COUNT && maxcount == 0))
            return -1;

        /* look for special cases */
        if (m <= 1) {
            ...

            ...
                STRINGLIB_BLOOM_ADD(mask, p[i]);
                if (p[i] == p[0])
                    skip = i - 1;
            }

            for (i = w; i >= 0; i--) {
                if (s[i] == p[0]) {
                    /* candidate match */
                    for (j = mlast; j > 0; j--)
                        if (s[i+j] != p[j])
                            break;
                    if (j == 0)
                        /* got a match! */
                        return i;
                    /* miss: check if previous character is part of ... */
                    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
                        i = i - m;
                    else
                        i = i - skip;
                } else {
                    /* skip: check if previous character is part of ... */
                    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
                        i = i - m;
                }
            }
        }

        if (mode != FAST_COUNT)
            ...

Path Explosion: the engine gets lost in the details of the implementation

High-level Execution Paths

For the validateEmail example, the high-level execution tree has 3 HL paths, while the low-level (x86) execution tree has 10 LL paths.
HL/LL path ratio is low due to path explosion

Goal: Prioritize the low-level paths that maximize the HL/LL path
ratio.

High-level Execution Paths
Alternative approach: Select states at high-level branches
Diagram legend: Fork (low-level), Divergence (high-level)
High-level fork points are unpredictable

Reducing Path Explosion
Global DFS / BFS / randomized strategy
• Fork points clustered in hot spots
• “Fork bomb” (e.g., input dependent loop)
• Clusters grow bigger ⇒ Slower overall progress
• Reduced state diversity
[Figure: program execution tree shaded by selection probability, from low to high]

Reducing Path Explosion
Idea: Partition the state space into groups
1. Select group
2. Select state from group
• Faster progress across all groups
• Increased state diversity
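A short worked example may make the effect of partitioning concrete. The group names and state counts below are hypothetical, chosen only to mimic one "fork bomb" cluster next to two small groups. The sketch compares the probability of picking from each group under a global uniform strategy and under the two-step "select group, then select state" strategy.

```python
from fractions import Fraction


def global_uniform(groups):
    """P(next state comes from each group) when picking uniformly over all states."""
    total = sum(groups.values())
    return {name: Fraction(size, total) for name, size in groups.items()}


def group_uniform(groups):
    """P(next state comes from each group) when a group is picked uniformly first."""
    return {name: Fraction(1, len(groups)) for name in groups}


# Hypothetical state counts: one hot spot dominates the state space.
groups = {"fork_bomb_loop": 997, "parser": 2, "error_path": 1}

if __name__ == "__main__":
    for name in groups:
        print(name, global_uniform(groups)[name], group_uniform(groups)[name])
```

Under the global strategy the hot spot absorbs 997 out of every 1000 selections. With group-first selection, every group gets one third regardless of how large the cluster grows.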

Class-Uniform Path Analysis
States arranged in a class hierarchy (Class 1, Class 2)

Partitioning High-level Paths
• Each high-level instruction (bytecode instruction) is identified by a high-level program counter
• Within a high-level instruction, states are distinguished by the low-level x86 PC
• 1st CUPA class: high-level program counter
• 2nd CUPA class: low-level x86 PC
• Reconstruct high-level execution tree

CUPA Classes
1. High-level PC
   • Uniform HL instruction exploration
   • Obtained via instrumentation
2. x86 PC
   • Uniform native method exploration
   • Approximated as the PC of fork point
Coverage-optimized CUPA in the paper
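The two classes above suggest a two-level selection scheme, which can be sketched as follows. This is a minimal illustration under assumptions, not Chef's implementation: states are opaque objects, both levels are sampled uniformly at random, and the CupaSelector class with its add and select methods is a hypothetical name.

```python
import random
from collections import defaultdict


class CupaSelector:
    """Two-level class-uniform selection: by high-level PC, then by x86 PC."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        # hlpc -> llpc -> list of pending states
        self.tree = defaultdict(lambda: defaultdict(list))

    def add(self, state, hlpc, llpc):
        """Register a state under its high-level PC and low-level (fork) PC."""
        self.tree[hlpc][llpc].append(state)

    def select(self):
        """Remove and return one state, choosing uniformly at each level."""
        if not self.tree:
            raise LookupError("no states left")
        hlpc = self.rng.choice(list(self.tree))
        by_llpc = self.tree[hlpc]
        llpc = self.rng.choice(list(by_llpc))
        states = by_llpc[llpc]
        state = states.pop(self.rng.randrange(len(states)))
        # Prune empty classes so they are never selected again.
        if not states:
            del by_llpc[llpc]
        if not by_llpc:
            del self.tree[hlpc]
        return state


if __name__ == "__main__":
    sel = CupaSelector(random.Random(0))
    for i in range(100):                      # a cluster of states in one loop
        sel.add(("loop", i), hlpc=7, llpc=0x400)
    sel.add("lone", hlpc=9, llpc=0x500)       # a single state elsewhere
    print(sel.select())
```

In this setup the single state at high-level PC 9 is chosen about half the time, even though it is outnumbered 100 to 1 by the states clustered at high-level PC 7.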

Interpreter Loop Instrumentation

    while (true) {
        fetch_instr(hlpc, &opcode, &params);
        chef_log_hlpc(hlpc, opcode);    /* added instrumentation */
        switch (opcode) {
        case LOAD: ...
        case STORE: ...
        case CALL_FUNCTION: ...
        ...
        }
        hlpc++;
    }

The chef_log_hlpc(hlpc, opcode) call lets Chef reconstruct the high-level execution tree and CFG
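To illustrate how a logging hook in the interpreter loop is enough to recover high-level structure, here is a toy sketch. The bytecode, the run function, and high_level_cfg are all hypothetical; the log_hlpc callback only plays the role that chef_log_hlpc plays on the slide, and concrete runs stand in for symbolic states.

```python
from collections import defaultdict

# Hypothetical toy bytecode: (opcode, argument).
PROGRAM = [
    ("LOAD_INPUT", None),        # 0: push the input value
    ("JUMP_IF_NEG", 4),          # 1: pop; jump to 4 if the value is negative
    ("PUSH", "non-negative"),    # 2
    ("HALT", None),              # 3
    ("PUSH", "negative"),        # 4
    ("HALT", None),              # 5
]


def run(program, value, log_hlpc):
    """Interpreter loop that reports every fetched instruction to log_hlpc."""
    stack, hlpc = [], 0
    while True:
        opcode, arg = program[hlpc]
        log_hlpc(hlpc, opcode)               # instrumentation hook
        if opcode == "LOAD_INPUT":
            stack.append(value)
        elif opcode == "PUSH":
            stack.append(arg)
        elif opcode == "JUMP_IF_NEG":
            if stack.pop() < 0:
                hlpc = arg
                continue
        elif opcode == "HALT":
            return stack.pop()
        hlpc += 1


def high_level_cfg(program, inputs):
    """Run each input, record its HLPC trace, and merge traces into CFG edges."""
    edges = defaultdict(set)
    paths = set()
    for value in inputs:
        trace = []
        run(program, value, lambda pc, op: trace.append(pc))
        paths.add(tuple(trace))
        for src, dst in zip(trace, trace[1:]):
            edges[src].add(dst)
    return paths, dict(edges)


if __name__ == "__main__":
    paths, edges = high_level_cfg(PROGRAM, [-2, -1, 0, 1, 2])
    print(len(paths), "distinct high-level paths from 5 runs")
    print(edges)
```

Five runs collapse into two distinct high-level paths here, which is the same many-to-one relationship between low-level and high-level paths that the HL/LL path ratio measures.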

Interpreter Optimizations
• Simple changes to interpreter source
• “Anti-optimizations” in linear performance...
• ... but exponential gains in symbolic mode

Hash neutralization:

    static long
    string_hash(PyStringObject *a)
    {
    #ifdef SYMBEX_HASHES
        return 0;
    #else
        register Py_ssize_t len;
        register unsigned char *p;
        register long x;

        len = Py_SIZE(a);
        p = (unsigned char *) a->ob_sval;
        x = _Py_HashSecret.prefix;
        x ^= *p << 7;
        while (--len >= 0)
            x = (1000003*x) ^ *p++;
        x ^= Py_SIZE(a);
        x ^= _Py_HashSecret.suffix;
        if (x == -1)
            x = -2;
        return x;
    #endif
    }
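The effect of hash neutralization can be mimicked at the Python level. The sketch below is only an analogy to the C change on the slide: NeutralizedKey is a hypothetical wrapper whose hash is constant, so dictionary lookups stay correct but fall back to equality comparisons against colliding keys.

```python
class NeutralizedKey:
    """String wrapper with a constant hash, mirroring `return 0` on the slide."""

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        return 0                 # every key collides; no hash arithmetic on text

    def __eq__(self, other):
        return isinstance(other, NeutralizedKey) and self.text == other.text


# Lookups remain correct: colliding keys are told apart by __eq__.
table = {NeutralizedKey(word): len(word) for word in ["chef", "s2e", "cupa"]}

if __name__ == "__main__":
    print(table[NeutralizedKey("s2e")])
    print(NeutralizedKey("klee") in table)
```

In concrete execution this turns constant-time lookups into linear scans, which is the "anti-optimization". The presumed benefit in symbolic mode is that a symbolic string no longer flows through the multiply-and-xor hash loop, so the solver only has to reason about comparatively simple equality checks.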

Chef Summary
Program + Symbolic Tests → Symbolic Execution Engine for Language X
• Language X Interpreter (+instrumentation)
• CHEF API
• CHEF: HL Tree Reconstr., CUPA State Selection
• S2E: x86 Symbolic Execution

Chef-Prototyped Engines
• Python: 5 person-days, 321 LoC
• Lua: 3 person-days, 277 LoC

Evaluation Questions
How does a Chef-obtained engine...
• ... work for test case generation?
• ... benefit from CUPA and optimizations?
• ... compare to a dedicated implementation?

Using a Chef Engine

    class ArgparseTest(SymbolicTest):
        def setUp(self):
            self.argparse = import_module("argparse")

        def runTest(self):
            parser = self.argparse.ArgumentParser()
            arg_name = self.getSymString(size=3)
            arg_value = self.getSymString(size=3)
            parser.add_argument(arg_name)
            args = parser.parse_args([arg_value])

Slide callouts: "CHEF Symbolic Test Library" (SymbolicTest, getSymString) and "Program" (argparse)

Testing Python Packages
6 popular packages: xlrd, simplejson, argparse, HTMLParser, ConfigParser, unicodecsv
• 10.9K lines of Python code
• 30 min. / package
• > 7,000 tests generated
• 4 undocumented exceptions found
High bug finding potential for dynamic languages

Efficiency
[Figure: path ratio (P / PBaseline), log scale from 0.1 to 10000, per package (xlrd, simplejson, argparse, HTMLParser, ConfigParser, unicodecsv), for four configurations: CUPA + Optimizations, Optimizations Only, CUPA Only, Baseline]

Comparison to Dedicated Engine
• Symbolic execution engine of NICE [1]
• Targets OpenFlow applications in Python
• Case Study: Switch MAC learning algorithm

[1] M. Canini, D. Venzano, P. Peresini, D. Kostic, and J. Rexford. “A NICE way to test OpenFlow applications.” NSDI 2012.

Overhead
[Figure: CHEF overhead (TCHEF / TNICE), log scale from 1 to 1000, versus size of symbolic input in number of Ethernet frames (1 to 10). Annotations on the plot: >100×, 5×, 40×; One-time Initialization; x86 Reasoning Overhead (Instructions + Constraints)]

Chef Engine as Reference Implementation
• Chef-Python: Reference Paths
• NICE Dedicated Engine: Test Cases
• Comparing the two exposes Missing Paths and Duplicate Paths
Chef engine’s correctness outweighs performance penalty
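One way to picture the comparison against reference paths is a set difference plus a multiplicity count. The sketch below is an assumption about how such a check could look, not the procedure used in the talk: paths are modeled as tuples of branch outcomes, compare_paths is a hypothetical helper, and the sample data is invented for illustration.

```python
from collections import Counter


def compare_paths(reference_paths, engine_paths):
    """Return (missing, duplicates) of an engine's paths relative to a reference."""
    counts = Counter(engine_paths)
    missing = set(reference_paths) - set(counts)        # never produced
    duplicates = {path for path, n in counts.items() if n > 1}
    return missing, duplicates


# Invented example data: each path is a tuple of high-level branch outcomes.
reference = [(True,), (False, True), (False, False)]
engine = [(True,), (True,), (False, True)]

if __name__ == "__main__":
    missing, duplicates = compare_paths(reference, engine)
    print("missing:", missing)
    print("duplicates:", duplicates)
```

A missing path means lost coverage, while a duplicate path means redundant test cases. Both are only detectable when a trusted reference set is available.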

Conclusions
Program + Symbolic Tests → CHEF + Language X Interpreter → Test Cases
http://dslab.epfl.ch/proj/chef
