Cloud9: Parallel Symbolic Execution for Automated Real-World Software Testing

Stefan Bucur

April 12, 2011

Transcript

  1. Cloud9: Parallel Symbolic Execution for Automated Real-World Software Testing
    Stefan Bucur, Vlad Ureche, Cristian Zamfir, George Candea
    School of Computer and Communication Sciences
  2. Automated Software Testing
    [Diagram labels: Automated Techniques; Symbolic Execution (λ); Model Checking; Industrial SW Testing; Manual Testing; Static Analysis; Fuzzing]
    Criteria: Scalability, Applicability, Usability
  3. Cloud9 - The Big Picture
    • Parallel symbolic execution
    • Linear scalability on commodity clusters
    • Full symbolic POSIX support
    • Applicable on real-world systems
    • Platform for writing test cases
    • Easy-to-use platform API
  4. Automated Systems Testing (λ Symbolic Execution)
    • Promising for systems testing: KLEE [*]
    • High-coverage test cases
    • Found new bugs
    • ... But applied only on small programs
    [*] C. Cadar, D. Dunbar, D. Engler, “KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs”, OSDI 2008
  5. Symbolic Execution in a Nutshell
    Concrete input: [C9 A0 ... ]
        void proc_pkt(packet_t* pkt) {
          if (pkt->magic != 0xC9) {
            err(pkt);
            return;
          }
          if (pkt->cmd == GET) {
            ...
          } else if ...
          ...
        }
  7. Symbolic Execution in a Nutshell (code as in slide 5, concrete input [C9 A0 ... ])
    Condition highlighted: pkt->magic != 0xC9
  8. Symbolic Execution in a Nutshell (code as in slide 5, concrete input [C9 A0 ... ])
    Conditions highlighted: pkt->magic != 0xC9, then pkt->cmd == GET
  10. Symbolic Execution in a Nutshell (code as in slide 5)
    The concrete input is replaced by a symbolic input λ
  11. Symbolic Execution in a Nutshell (code as in slide 5, symbolic input λ)
    Branch constraints: λ.magic != 0xC9 | λ.magic == 0xC9
  12. Symbolic Execution in a Nutshell (code as in slide 5, symbolic input λ)
    Branch constraints: λ.magic != 0xC9 | λ.magic == 0xC9
    Under λ.magic == 0xC9: λ.cmd == GET | λ.cmd != GET
  13. Symbolic Execution in a Nutshell (code as in slide 5, symbolic input λ)
    Branch constraints: λ.magic != 0xC9 | λ.magic == 0xC9
    Under λ.magic == 0xC9: λ.cmd == GET | λ.cmd != GET
    Number of paths: ~2^(program size)
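
The path tree built up in slides 10 to 13 can be made concrete with a small C sketch. It assumes a hypothetical two-field `packet_t` and a made-up `GET` value, neither of which is defined in the slides, and it has `proc_pkt` return a label for the path taken instead of calling `err`. Each of the three path constraints then maps to one concrete test input, which is the kind of input a symbolic execution engine would ask its constraint solver to produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet layout; the slides do not define packet_t. */
typedef struct {
    uint8_t magic;
    uint8_t cmd;
} packet_t;

enum { GET = 0xA0 };  /* assumed value, not given in the slides */

/* One label per feasible path through proc_pkt. */
typedef enum { PATH_BAD_MAGIC, PATH_GET, PATH_OTHER_CMD } path_id;

/* Same branch structure as the slide code, but returns the path taken. */
path_id proc_pkt(const packet_t *pkt) {
    assert(pkt != NULL);
    if (pkt->magic != 0xC9) {
        return PATH_BAD_MAGIC;   /* constraint: magic != 0xC9 */
    }
    if (pkt->cmd == GET) {
        return PATH_GET;         /* constraint: magic == 0xC9 && cmd == GET */
    }
    return PATH_OTHER_CMD;       /* constraint: magic == 0xC9 && cmd != GET */
}
```

Three inputs such as `{0x00, GET}`, `{0xC9, GET}`, and `{0xC9, 0x01}` cover all three paths. Every further independent branch can double the number of paths, which is the growth slide 13 points at.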
  14. Linear Solution to Exponential Problem
    [Plot: time to test vs. program size, with curves for 1, 2, 4, and 8 workers and a marked testing target]
    Bring testing time down to practical values
  15. Cloud9 Architecture
    [Diagram: W1’s local tree, W2’s local tree, W3’s local tree]
    Each worker runs a local sequential symbolic execution engine (KLEE)
  16. Cloud9 Architecture
    [Diagram legend: candidate nodes, fence nodes]
    • Candidate nodes are selected for exploration
    • Fence nodes bound the local tree
  17. Load Balancing
    [Diagram: load balancer (LB) connected to workers W1, W2, W3]
    Hybrid distributed system: centralized reports, P2P work transfer
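
The split between centralized reports and P2P transfer can be sketched in C as follows. This is a simplified illustration and not Cloud9's actual balancing policy: the `transfer_req` type, the `balance` function, and the "pair the most and least loaded worker" rule are all assumptions made for the example. The balancer only sees queue lengths reported by the workers and answers with a transfer request; the jobs themselves would then move directly from the source worker to the destination worker.

```c
#include <assert.h>
#include <stddef.h>

/* A transfer request the balancer sends to the source worker, which then
   ships the jobs directly to the destination worker (P2P). */
typedef struct {
    size_t src;    /* index of the more loaded worker */
    size_t dst;    /* index of the less loaded worker */
    size_t count;  /* number of jobs to move; 0 means "no transfer" */
} transfer_req;

/* Hypothetical policy: pair the most and least loaded workers and even
   them out if their queues differ by more than `threshold` jobs. */
transfer_req balance(const size_t *queue_len, size_t n, size_t threshold) {
    transfer_req req = {0, 0, 0};
    assert(queue_len != NULL && n > 0);

    size_t max_i = 0, min_i = 0;
    for (size_t i = 1; i < n; i++) {
        if (queue_len[i] > queue_len[max_i]) max_i = i;
        if (queue_len[i] < queue_len[min_i]) min_i = i;
    }

    size_t diff = queue_len[max_i] - queue_len[min_i];
    if (diff > threshold) {
        req.src = max_i;
        req.dst = min_i;
        req.count = diff / 2;  /* move half the gap so both ends meet */
    }
    return req;
}
```

With reported queue lengths `{90, 10, 40}` and a threshold of 5, this sketch asks worker 0 to send 40 jobs to worker 1. Keeping the balancer out of the data path means it handles only small status messages, whatever the size of the transferred work.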
  20. Path-based Encoding
    [Diagram: tree with edges labeled 0 and 1]
    • Nodes are encoded as paths in tree
    • Compact binary representation
    • Two paths can share common prefix
    • Small encoding size
    • For a tree of 2^100 leaves, a path fits in <128 bits (16 bytes)
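
A minimal C sketch of such an encoding, assuming a binary tree in which each node is named by the sequence of 0/1 branch decisions from the root. The `tree_path` type and the helper names are hypothetical and are not taken from Cloud9. In a balanced binary tree with 2^100 leaves, a leaf sits at depth 100, so its path needs 100 bits and fits in the 16-byte buffer used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATH_MAX_BITS 128  /* 16 bytes of branch decisions */

/* A tree node identified by the branch decisions from the root:
   bit i is the decision (0 or 1) taken at depth i. */
typedef struct {
    uint8_t bits[PATH_MAX_BITS / 8];
    size_t  len;  /* number of decisions stored */
} tree_path;

void path_init(tree_path *p) {
    memset(p, 0, sizeof *p);
}

/* Append one branch decision (0 or 1) to the path. */
void path_push(tree_path *p, int bit) {
    assert(p->len < PATH_MAX_BITS);
    if (bit) {
        p->bits[p->len / 8] |= (uint8_t)(1u << (p->len % 8));
    }
    p->len++;
}

/* Read the decision taken at depth i. */
int path_get(const tree_path *p, size_t i) {
    assert(i < p->len);
    return (p->bits[i / 8] >> (i % 8)) & 1;
}

/* Length of the prefix shared by two paths, i.e. the depth of the
   deepest common ancestor of the two nodes. */
size_t path_common_prefix(const tree_path *a, const tree_path *b) {
    size_t n = a->len < b->len ? a->len : b->len;
    size_t i = 0;
    while (i < n && path_get(a, i) == path_get(b, i)) {
        i++;
    }
    return i;
}
```

For the paths 1,1,0 and 1,1,1,0, `path_common_prefix` returns 2. A batch of nodes from the same subtree could therefore be sent as one shared prefix plus short suffixes.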
  21. Load Balancing in Practice
    [Plot: work done [% of total instructions] vs. time [minutes], comparing continuous load balancing, LB stopped after 1 min, and LB stopped after 4 min]
    Load balancing necessary to ensure scalability
  22. Calls into the Environment
        if (fork() == 0) {
          ...
          if ((res = recv(sock, buff, size, 0)) > 0) {
            pthread_mutex_lock(&mutex);
            memcpy(gBuff, buff, res);
            pthread_mutex_unlock(&mutex);
          }
          ...
        } else {
          ...
          pid_t pid = wait(&stat);
          ...
        }
  23. Environment Model
    [Diagram: fork() call from the program under test into the environment (C library / OS)]
    Cannot directly execute symbolically
  24. Environment Model
    [Diagram: fork() call from the program under test; environment (C library / OS); model code; symbolic execution engine]
    Model code: equivalent functionality, executable symbolically
  25. Starting Point
    [Diagram: symbolic execution engine with a POSIX model: network stubs, files]
    Targets: single-threaded isolated nodes, single-threaded utilities
  26. POSIX Environment Model
    [Diagram: symbolic execution engine with the POSIX model]
    Modeled components: network (TCP/UDP/UNIX), files, pipes, threads (pthread_*), processes, signals
    Targets: message passing, servers and clients, multi-threaded programs, distributed systems, asynchronous events, IPC, single-threaded utilities
  27. Key Changes in Symbolic Execution
    Multithreading and Scheduling
    • Deterministic or symbolic scheduling
    • Non-preemptive execution model
    Address Space Isolation
    • Copy on Write (CoW) between processes
    • CoW domains for memory sharing
  28. Symbolic Engine System Calls
    • Symbolic engine support needed for threads/processes
      1. Thread/process lifecycle
      2. Synchronization
      3. Shared memory
    Calls: thread_create, thread_terminate, process_fork, process_terminate, get_context, thread_preempt, thread_sleep, thread_notify, get_wait_list, make_shared
  29. Time to Reach Target Coverage
    [Plot (printf): time to achieve target coverage [minutes] vs. number of workers (1, 4, 8, 24, 48), with curves for 60%, 70%, 80%, and 90% coverage]
    Faster time-to-cover, higher coverage values
  30. Increase in Code Coverage
    [Plot: additional code covered [% of program LOC] vs. index of tested Coreutil (sorted by additional coverage); Coreutils suite, 12 workers, 10 min.]
    Consistent code coverage increase
  31. Exhaustive Exploration
    [Plot: time to complete exhaustive test [hours] vs. number of workers (2, 4, 6, 12, 24, 48); memcached, 7.4×10^4 paths]
    Scalability of exhaustive path exploration
  32. Instruction Throughput
    [Plot: useful work done [# of instructions] vs. number of workers (1, 4, 6, 12, 24, 48), with curves for 4, 6, 8, and 10 minutes; memcached]
    Linear scalability with number of workers
  33. Experimental Setup
    Execute the “whole world” symbolically
    [Diagram: symbolic state containing a client process and a server (memcached / Apache / lighttpd), linked by a TCP stream carrying a symbolic command and the server response]
  34. Symbolic Test Cases
    • Easy-to-use API for developers to write symbolic test cases
    • Basic symbolic memory support
    • POSIX extensions for environment control
    • Network conditions, fault injection, symbolic scheduler
  35. Symbolic Test Cases: Testing HTTP header extension
        make_symbolic(hdrData);
        // Append symbolic header to request
        strcat(req, "X-NewExtension: ");
        strcat(req, hdrData);
        // Enable fault injection on socket
        ioctl(ssock, SIO_FAULT_INJ, RD | WR);
        // Symbolic stream fragmentation
        ioctl(ssock, SIO_PKT_FRAGMENT, RD);
  36. Conclusions
    • Parallel symbolic execution
    • Linear scalability on commodity clusters
    • Full POSIX environment model
    • Real-world systems testing
    • Use cases: increasing coverage, exhaustive path exploration, bug patch verification