Cloud9: Parallel Symbolic Execution for Automated Real-World Software Testing

Stefan Bucur

April 12, 2011

Transcript

  1. Cloud9: Parallel Symbolic Execution for Automated Real-World Software Testing

    Stefan Bucur, Vlad Ureche, Cristian Zamfir, George Candea. School of Computer and Communication Sciences
  2. Automated Software Testing

    Industrial SW testing: manual testing. Automated techniques: symbolic execution (λ), model checking, static analysis, fuzzing. Diagram labels: scalability, applicability, usability.
  3. Cloud9 - The Big Picture

    • Parallel symbolic execution
    • Linear scalability on commodity clusters
    • Full symbolic POSIX support
    • Applicable on real-world systems
    • Platform for writing test cases
    • Easy-to-use platform API
  4. Automated Systems Testing

    λ Symbolic Execution
    • Promising for systems testing: KLEE [*]
    • High-coverage test cases
    • Found new bugs
    • ... But applied only on small programs
    [*] C. Cadar, D. Dunbar, D. Engler, "KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs", OSDI 2008
  5. Memcached, GNU Coreutils, Apache

  6. Symbolic Execution in a Nutshell

    Concrete input: [C9 A0 ... ]

        void proc_pkt(packet_t* pkt) {
          if (pkt->magic != 0xC9) {
            err(pkt);
            return;
          }
          if (pkt->cmd == GET) {
            ...
          } else if ...
          ...
        }

  7. Symbolic Execution in a Nutshell

    Concrete input: [C9 A0 ... ]
  8. Symbolic Execution in a Nutshell

    Concrete input: [C9 A0 ... ]. Branch condition: pkt->magic != 0xC9
  9. Symbolic Execution in a Nutshell

    Concrete input: [C9 A0 ... ]. Branch conditions: pkt->magic != 0xC9, pkt->cmd == GET
  10. Symbolic Execution in a Nutshell

    Concrete input: [C9 A0 ... ]. Branch conditions: pkt->magic != 0xC9, pkt->cmd == GET
  11. Symbolic Execution in a Nutshell

    Same proc_pkt() code, now run on a symbolic input λ
  12. Symbolic Execution in a Nutshell

    Symbolic input λ. First branch: λ.magic == 0xC9 | λ.magic != 0xC9
  13. Symbolic Execution in a Nutshell

    Symbolic input λ. First branch: λ.magic == 0xC9 | λ.magic != 0xC9. Second branch: λ.cmd == GET | λ.cmd != GET
  14. Symbolic Execution in a Nutshell

    Symbolic input λ. First branch: λ.magic == 0xC9 | λ.magic != 0xC9. Second branch: λ.cmd == GET | λ.cmd != GET. ∼2^(program size) paths
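As a rough illustration of the execution tree on these slides, the sketch below assigns one concrete packet to each of the three paths visible in proc_pkt(), the way a constraint solver would turn each path condition into a test input. The `packet_t` layout, the numeric value of `GET`, and the helper names `proc_pkt_path` and `gen_tests` are assumptions made for the example; the slides do not define them, and this is not how KLEE or Cloud9 represent paths internally.

```c
#include <stddef.h>
#include <stdint.h>

#define MAGIC 0xC9
#define GET   0x01 /* hypothetical opcode value; the slides do not give one */

/* Assumed packet layout: only the two fields the slide's code inspects. */
typedef struct {
    uint8_t magic;
    uint8_t cmd;
} packet_t;

/* One identifier per path visible in the slide's code. */
enum { PATH_BAD_MAGIC, PATH_GET, PATH_OTHER_CMD, PATH_COUNT };

/* Mirrors the branching structure of proc_pkt() and reports the path taken. */
int proc_pkt_path(const packet_t *pkt) {
    if (pkt->magic != MAGIC)
        return PATH_BAD_MAGIC;   /* lambda.magic != 0xC9 */
    if (pkt->cmd == GET)
        return PATH_GET;         /* lambda.magic == 0xC9 && lambda.cmd == GET */
    return PATH_OTHER_CMD;       /* lambda.magic == 0xC9 && lambda.cmd != GET */
}

/* Hand-solved path conditions: one concrete input per path. */
size_t gen_tests(packet_t out[PATH_COUNT]) {
    out[PATH_BAD_MAGIC] = (packet_t){ .magic = 0x00,  .cmd = 0 };
    out[PATH_GET]       = (packet_t){ .magic = MAGIC, .cmd = GET };
    out[PATH_OTHER_CMD] = (packet_t){ .magic = MAGIC, .cmd = GET + 1 };
    return PATH_COUNT;
}
```

Feeding each generated packet back through `proc_pkt_path` should reach the path it was generated for. A real engine derives these inputs automatically by handing the accumulated branch conditions to a solver.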
  15. CPU Bottleneck, Memory Exhaustion

  16. Parallel Tree Exploration

    Workers: W1, W2, W3
  17. Parallel Tree Exploration

    Workers: W1, W2, W3. Key research problem: scalable parallel exploration
  18. Linear Solution to Exponential Problem

    [Plot: time to test vs. program size]
  19. Linear Solution to Exponential Problem

    [Plot: time to test vs. program size. Curve: 1 worker. Marker: testing target]
  20. Linear Solution to Exponential Problem

    [Plot: time to test vs. program size. Curves: 1, 2, 4, 8 workers. Marker: testing target] Bring testing time down to practical values
  21. Throw Hardware at the Problem

  22. Scalability Challenges

    Tree structure not known a priori
  23. Scalability Challenges: Static Allocation

  24. Scalability Challenges

  25. Scalability Challenges: Anticipate Allocation

  26. Scalability Challenges

  27. Outline

    • Scalable Parallel Symbolic Execution
    • POSIX Environment Model
    • Evaluation
  28. Cloud9 Architecture

    Global symbolic tree
  29. Cloud9 Architecture

    W1's local tree, W2's local tree, W3's local tree. Each worker runs a local sequential symbolic execution engine (KLEE)
  30. Cloud9 Architecture

    • Candidate nodes are selected for exploration
    • Fence nodes bound the local tree
  31. Load Balancing

    Load balancer (LB) and workers W1, W2, W3. Hybrid distributed system: centralized reports, P2P work transfer
  32. Load Balancing

  33. Load Balancing

    Hybrid distributed system: centralized reports, P2P work transfer
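To make the "centralized reports, P2P work transfer" split concrete, here is a minimal sketch of a balancer decision step. It assumes each worker reports its queue length to the load balancer, which only decides who should send work to whom; the transfer itself would then happen directly between the two workers. The policy (pair the most loaded worker with the least loaded one and move half the gap), the `threshold` parameter, and the names `transfer_t` and `lb_decide` are all hypothetical, since the slides do not describe Cloud9's actual policy.

```c
#include <stddef.h>

/* A transfer request the balancer would send to the source worker. */
typedef struct {
    size_t src;   /* index of the overloaded worker */
    size_t dst;   /* index of the underloaded worker */
    size_t njobs; /* number of candidate nodes to hand over */
} transfer_t;

/*
 * Decide on at most one transfer from the reported queue lengths.
 * Returns 1 and fills *t when the load gap exceeds `threshold`,
 * and returns 0 when the workers are already balanced enough.
 */
int lb_decide(const size_t *queue_len, size_t nworkers,
              size_t threshold, transfer_t *t) {
    if (nworkers < 2)
        return 0;

    size_t hi = 0, lo = 0;
    for (size_t i = 1; i < nworkers; i++) {
        if (queue_len[i] > queue_len[hi]) hi = i;
        if (queue_len[i] < queue_len[lo]) lo = i;
    }

    size_t gap = queue_len[hi] - queue_len[lo];
    if (gap <= threshold)
        return 0;

    t->src = hi;
    t->dst = lo;
    t->njobs = gap / 2; /* moving half the gap evens out the two queues */
    return 1;
}
```

Keeping the balancer out of the data path means it handles only small status reports, which is one plausible reading of why the slides call the design hybrid.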
  34. Work Transfer

    W1. Legend: candidate, fence
  35. Work Transfer

    W1, W2. Legend: candidate, fence
  36. Work Transfer

    W1, W2. Transferred nodes on W2: virtual. Legend: candidate, fence
  37. Work Transfer

    W1, W2. Transferred nodes on W2: virtual. Legend: candidate, fence
  38. Work Transfer

    W1, W2. Transferred nodes on W2: materialized. Legend: candidate, fence
  39. Work Transfer

    W1, W2. Exploration disjointness + completeness. Legend: candidate, fence
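The disjointness and completeness property can be sketched with a small bookkeeping model. It assumes that a transferred node stays in the sender's tree as a fence node and appears in the receiver's tree as a candidate, so every unexplored node is a candidate on exactly one worker. The types and the functions `transfer_work` and `count_candidates` are illustrative assumptions, and the replay step that turns virtual nodes into materialized ones on the receiver is left out.

```c
#include <stddef.h>

typedef enum { NODE_CANDIDATE, NODE_FENCE } node_kind_t;

typedef struct {
    unsigned path_id; /* stands in for the node's encoded path */
    node_kind_t kind;
} node_t;

/*
 * Move up to `want` candidate nodes from src to dst.
 * Each moved node becomes a fence on the source, so the source never
 * explores it again, and is appended to dst as a candidate.
 * Returns the number of nodes moved.
 */
size_t transfer_work(node_t *src, size_t nsrc,
                     node_t *dst, size_t *ndst, size_t dst_cap,
                     size_t want) {
    size_t moved = 0;
    for (size_t i = 0; i < nsrc && moved < want && *ndst < dst_cap; i++) {
        if (src[i].kind != NODE_CANDIDATE)
            continue;
        src[i].kind = NODE_FENCE;
        dst[*ndst].path_id = src[i].path_id;
        dst[*ndst].kind = NODE_CANDIDATE;
        (*ndst)++;
        moved++;
    }
    return moved;
}

/* Number of entries in `nodes` that hold `path_id` as a candidate. */
size_t count_candidates(const node_t *nodes, size_t n, unsigned path_id) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (nodes[i].path_id == path_id && nodes[i].kind == NODE_CANDIDATE)
            count++;
    return count;
}
```

After any transfer, summing `count_candidates` over both workers should give exactly 1 for every node: no node is explored twice (disjointness) and none is dropped (completeness).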
  40. Path-based Encoding

    [Diagram: tree edges labeled with 0 and 1]
    • Nodes are encoded as paths in tree
    • Compact binary representation
    • Two paths can share common prefix
    • Small encoding size
    • For a tree of 2^100 leaves, a path fits in <128 bits (16 bytes)
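A minimal sketch of such an encoding, assuming a binary tree where each branch decision costs one bit: a path to one of 2^100 leaves then needs 100 bits, which fits in a 16-byte buffer. The `path_t` type, the 128-bit cap, and the function names are assumptions made for illustration and are not taken from Cloud9's code.

```c
#include <stdint.h>
#include <string.h>

#define PATH_BITS_CAP 128 /* 16 bytes, enough for a depth-100 path */

typedef struct {
    uint8_t bits[PATH_BITS_CAP / 8]; /* bit i = branch taken at depth i */
    unsigned len;                    /* number of decisions recorded */
} path_t;

void path_init(path_t *p) {
    memset(p, 0, sizeof *p);
}

/* Append one branch decision (0 or 1). Returns -1 if the path is full. */
int path_push(path_t *p, int taken) {
    if (p->len >= PATH_BITS_CAP)
        return -1;
    if (taken)
        p->bits[p->len / 8] |= (uint8_t)(1u << (p->len % 8));
    p->len++;
    return 0;
}

/* Read back the decision made at depth i (caller ensures i < len). */
int path_get(const path_t *p, unsigned i) {
    return (p->bits[i / 8] >> (i % 8)) & 1;
}

/* Length, in decisions, of the prefix two paths have in common. */
unsigned path_common_prefix(const path_t *a, const path_t *b) {
    unsigned n = a->len < b->len ? a->len : b->len;
    unsigned i = 0;
    while (i < n && path_get(a, i) == path_get(b, i))
        i++;
    return i;
}
```

The common-prefix helper hints at why shared prefixes matter: when several nodes are sent together, the part of the path they share only needs to be represented once.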
  41. Load Balancing in Practice

    [Plot: work done (% of total instructions) vs. time (minutes). Series: LB stops after 1 min, LB stops after 4 min, continuous load balancing] Load balancing necessary to ensure scalability
  42. Outline

    • Scalable Parallel Symbolic Execution
    • POSIX Environment Model
    • Evaluation
  43. Calls into the Environment

        if (fork() == 0) {
          ...
          if ((res = recv(sock, buff, size, 0)) > 0) {
            pthread_mutex_lock(&mutex);
            memcpy(gBuff, buff, res);
            pthread_mutex_unlock(&mutex);
          }
          ...
        } else {
          ...
          pid_t pid = wait(&stat);
          ...
        }
  44. Environment Model

    Program under test calls fork() into the environment (C library / OS). Environment: cannot directly execute symbolically
  45. Environment Model

    Program under test calls fork() into the environment (C library / OS). Model code, run by the symbolic execution engine: equivalent functionality, executable symbolically
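As a hedged illustration of what "model code" can look like, the sketch below replaces a kernel socket buffer with a plain in-memory ring buffer. Because it is ordinary C with no system calls, a symbolic execution engine could run it like any other program code. The `stream_t` type, its capacity, and the function names are hypothetical and are not Cloud9's actual model.

```c
#include <stddef.h>

#define STREAM_CAP 64 /* arbitrary capacity chosen for the example */

/* In-memory stand-in for one direction of a TCP stream. */
typedef struct {
    unsigned char data[STREAM_CAP];
    size_t start; /* index of the oldest unread byte */
    size_t size;  /* number of unread bytes */
} stream_t;

void stream_init(stream_t *s) {
    s->start = 0;
    s->size = 0;
}

/* Model of send(): accept up to n bytes, return how many were stored. */
size_t stream_write(stream_t *s, const void *buf, size_t n) {
    const unsigned char *src = buf;
    size_t room = STREAM_CAP - s->size;
    if (n > room)
        n = room;
    for (size_t i = 0; i < n; i++)
        s->data[(s->start + s->size + i) % STREAM_CAP] = src[i];
    s->size += n;
    return n;
}

/* Model of recv(): deliver up to n buffered bytes, return how many. */
size_t stream_read(stream_t *s, void *buf, size_t n) {
    unsigned char *dst = buf;
    if (n > s->size)
        n = s->size;
    for (size_t i = 0; i < n; i++)
        dst[i] = s->data[(s->start + i) % STREAM_CAP];
    s->start = (s->start + n) % STREAM_CAP;
    s->size -= n;
    return n;
}
```

A full model would also need blocking behaviour and error cases. This sketch covers only the data path, to show that the functionality can be expressed without leaving the engine.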
  46. Starting Point

    Symbolic execution engine with a POSIX layer: files, network stubs. Targets: single-threaded utilities, single-threaded isolated nodes
  47. POSIX Environment Model

    Symbolic execution engine with a POSIX layer: files, network (TCP/UDP/UNIX), pipes, threads (pthread_*), processes, signals. Targets and uses: single-threaded utilities, servers and clients, message passing, multi-threaded programs, distributed systems, asynchronous events, IPC
  48. Key Changes in Symbolic Execution

    Multithreading and Scheduling
    • Deterministic or symbolic scheduling
    • Non-preemptive execution model
    Address Space Isolation
    • Copy on Write (CoW) between processes
    • CoW domains for memory sharing
  49. Symbolic Engine System Calls

    • Symbolic engine support needed for threads/processes
      1. Thread/process lifecycle
      2. Synchronization
      3. Shared memory
    Calls (grouped on the slide under 1, 2, 3): thread_create, thread_terminate, process_fork, process_terminate, get_context, thread_preempt, thread_sleep, thread_notify, get_wait_list, make_shared
  50. Outline

    • Scalable Parallel Symbolic Execution
    • POSIX Environment Model
    • Evaluation
  51. Testing Real-World Software: Memcached, GNU Coreutils, Apache

  52. Time to Reach Target Coverage

    [Plot: time to achieve target coverage (minutes) vs. number of workers (1, 4, 8, 24, 48). Series: 60%, 70%, 80%, 90% coverage. Target: printf] Faster time-to-cover, higher coverage values
  53. Increase in Code Coverage

    [Plot: additional code covered (% of program LOC) vs. index of tested Coreutil, sorted by additional coverage. Coreutils suite, 12 workers, 10 min.] Consistent code coverage increase
  54. Exhaustive Exploration

    [Plot: time to complete exhaustive test (hours) vs. number of workers (2, 4, 6, 12, 24, 48). Target: memcached, 7.4×10^4 paths] Scalability of exhaustive path exploration
  55. Instruction Throughput

    [Plot: useful work done (# of instructions) vs. number of workers (1, 4, 6, 12, 24, 48). Series: 4, 6, 8, 10 minutes. Target: memcached] Linear scalability with number of workers
  56. Experimental Setup

    Execute the "whole world" symbolically. Symbolic state: client process and memcached/Apache/lighttpd, connected by a TCP stream (symbolic cmd., srv. response)
  57. Symbolic Test Cases

    • Easy-to-use API for developers to write symbolic test cases
    • Basic symbolic memory support
    • POSIX extensions for environment control
    • Network conditions, fault injection, symbolic scheduler
  58. Symbolic Test Cases

    Testing HTTP header extension

        make_symbolic(hdrData);
        // Append symbolic header to request
        strcat(req, "X-NewExtension: ");
        strcat(req, hdrData);
        // Enable fault injection on socket
        ioctl(ssock, SIO_FAULT_INJ, RD | WR);
        // Symbolic stream fragmentation
        ioctl(ssock, SIO_PKT_FRAGMENT, RD);
  59. Conclusions

    • Parallel symbolic execution
    • Linear scalability on commodity clusters
    • Full POSIX environment model
    • Real-world systems testing
    • Use cases
    • Increasing coverage
    • Exhaustive path exploration
    • Bug patch verification