
Chef: Prototyping Symbolic Execution Engines for Interpreted Languages


Stefan Bucur

March 03, 2014


Transcript

  1. CHEF
    Prototyping Symbolic Execution Engines
    for Interpreted Languages
    School of Computer and Communication Sciences
    EPFL, Switzerland
    Stefan Bucur, Johannes Kinder, George Candea


  2. Automated Software Testing


  3. Automated Software Testing
    Symbolic execution


  4. Automated Software Testing
    Symbolic execution
    is used for
    bug finding, increasing coverage, debugging


  5. Automated Software Testing
    Symbolic execution
    is used for
    bug finding, increasing coverage, debugging
    and applied on
    device drivers, system utilities,
    file parsers, distributed systems,


  6. Automated Software Testing
    Symbolic execution
    is used for
    bug finding, increasing coverage, debugging
    and applied on
    device drivers, system utilities,
    file parsers, distributed systems,
    interpreted programs
    Today


  7. Symbolic Execution
    int foo(int x) {
        if (x > 10) {
            return 2*x;
        }
        x = x + 1;
        if (x > 5) {
            return 3*x;
        } else {
            return x;
        }
    }


  8. Symbolic Execution
    int foo(int x) {
        if (x > 10) {
            return 2*x;
        }
        x = x + 1;
        if (x > 5) {
            return 3*x;
        } else {
            return x;
        }
    }
    x⟵λ


  9. Symbolic Execution
    λ>10 λ≤10
    int foo(int x) {
        if (x > 10) {
            return 2*x;
        }
        x = x + 1;
        if (x > 5) {
            return 3*x;
        } else {
            return x;
        }
    }
    x⟵λ


  10. Symbolic Execution
    λ>10 λ≤10
    int foo(int x) {
        if (x > 10) {
            return 2*x;
        }
        x = x + 1;
        if (x > 5) {
            return 3*x;
        } else {
            return x;
        }
    }
    ret 2λ
    x⟵λ
    x⟵λ+1


  11. Symbolic Execution
    λ+1>5
    λ>10 λ≤10
    int foo(int x) {
        if (x > 10) {
            return 2*x;
        }
        x = x + 1;
        if (x > 5) {
            return 3*x;
        } else {
            return x;
        }
    }
    ret 2λ
    ret λ+1
    λ+1≤5
    x⟵λ
    x⟵λ+1
    ret 3(λ+1)


  12. Symbolic Execution
    Test case: λ = 6
    (λ≤10)∧(λ+1>5)
    λ+1>5
    λ>10 λ≤10
    int foo(int x) {
        if (x > 10) {
            return 2*x;
        }
        x = x + 1;
        if (x > 5) {
            return 3*x;
        } else {
            return x;
        }
    }
    ret 2λ
    ret λ+1
    λ+1≤5
    Test cases for bug finding and statement coverage
    x⟵λ
    x⟵λ+1
    ret 3(λ+1)
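
The walk-through above can be mimicked in a few lines of Python. This is only a minimal sketch, not how a real engine works: the three path conditions are transcribed by hand from the slide, and a brute-force search over a small integer range stands in for a constraint solver. The names `PATHS`, `solve` and `generate_tests` are made up for illustration.

```python
def foo(x):
    """Python port of the C function from the slide."""
    if x > 10:
        return 2 * x
    x = x + 1
    if x > 5:
        return 3 * x
    return x


# Each entry: (path condition over the symbolic input lam, symbolic return value).
# Transcribed by hand from the execution tree on the slide.
PATHS = [
    (lambda lam: lam > 10, lambda lam: 2 * lam),
    (lambda lam: lam <= 10 and lam + 1 > 5, lambda lam: 3 * (lam + 1)),
    (lambda lam: lam <= 10 and lam + 1 <= 5, lambda lam: lam + 1),
]


def solve(condition, candidates=range(-20, 21)):
    """Stand-in for a constraint solver: first candidate satisfying the condition."""
    for value in candidates:
        if condition(value):
            return value
    return None


def generate_tests():
    """Return one (input, expected_output) pair per feasible path."""
    tests = []
    for condition, symbolic_result in PATHS:
        witness = solve(condition)
        if witness is not None:
            tests.append((witness, symbolic_result(witness)))
    return tests


if __name__ == "__main__":
    for test_input, expected in generate_tests():
        # The concrete run must agree with the symbolic return value.
        assert foo(test_input) == expected
        print(f"foo({test_input}) == {expected}")
```

Any input satisfying a path condition is a valid test for that path; the slide's λ = 6 is one witness for the second entry.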


  13. Programming
    Languages
    Symbolic Execution
    Engines


  14. Programming
    Languages
    Symbolic Execution
    Engines
    Scala
    Java C#
    C C++


  15. Programming
    Languages
    Symbolic Execution
    Engines
    Java Bytecode
    LLVM
    x86 ARM
    Scala
    Java C#
    C C++
    Compiled


  16. Programming
    Languages
    Symbolic Execution
    Engines
    Java Bytecode
    LLVM
    x86 ARM
    Scala
    Java C#
    C C++
    Compiled
    BitBlaze
    KLEE
    JPF
    SAGE
    S2E


  17. Programming
    Languages
    Symbolic Execution
    Engines
    Java Bytecode
    LLVM
    x86 ARM
    Scala
    Java C#
    C C++
    Python Ruby
    Lua JavaScript
    Bash Perl
    Compiled
    BitBlaze
    KLEE
    JPF
    SAGE
    S2E


  18. Programming
    Languages
    Symbolic Execution
    Engines
    Java Bytecode
    LLVM
    x86 ARM
    Scala
    Java C#
    C C++
    Python Ruby
    Lua JavaScript
    Bash Perl
    Compiled
    BitBlaze
    KLEE
    JPF
    SAGE
    S2E
    ?


  19. Interpreted Languages
    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")


  20. Interpreted Languages
    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")
    Complex semantics
    Complete
    File Read


  21. Interpreted Languages
    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")
    Complex semantics
    +
    Ambiguity in specifications
    Complete
    File Read
    Incomplete
    Specification


  22. Interpreted Languages
    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")
    Complex semantics
    +
    Ambiguity in specifications
    +
    Evolving language
    Since
    Python 2.5
    Complete
    File Read
    Incomplete
    Specification


  23. Interpreted Languages
    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")
    Complex semantics
    +
    Ambiguity in specifications
    +
    Evolving language
    +
    Large standard library
    Since
    Python 2.5
    Complete
    File Read
    Incomplete
    Specification


  24. Interpreted Languages
    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")
    Complex semantics
    +
    Ambiguity in specifications
    +
    Evolving language
    +
    Large standard library
    +
    Widespread native methods
    Since
    Python 2.5
    Complete
    File Read
    Incomplete
    Specification


  25. Interpreted Languages
    def parse_file(file_name):
        with open(file_name, "r") as f:
            data = f.read()
        return json.loads(data, encoding="utf-8")
    Complex semantics
    +
    Ambiguity in specifications
    +
    Evolving language
    +
    Large standard library
    +
    Widespread native methods
    Since
    Python 2.5
    Complete
    File Read
    Incomplete
    Specification
    Significant
    Engineering Effort


  26. “Consequently, if you were coming from
    Mars and tried to re-implement Python
    from this document alone, you might
    have to guess things and in fact you
    would probably end up implementing
    quite a different language.”
    - The Python Language Reference


  27. How can we efficiently obtain
    a correct symbolic execution engine?


  28. Key idea:
    Use the language interpreter
    as executable specification


  29. Symbolic Execution
    Engine for
    Language X
    CHEF
    Key idea:
    Use the language interpreter
    as executable specification
    Language X
    Interpreter


  30. Symbolic Execution
    Engine for
    Language X
    CHEF
    Program +
    Symbolic Tests
    Test Cases
    Key idea:
    Use the language interpreter
    as executable specification
    Language X
    Interpreter


  31. Chef Overview
    • Built on top of the S2E symbolic
    execution engine for x86
    • Relies on lightweight interpreter
    instrumentation + optimizations
    • Prototyped engines for Python and Lua
    in 5 + 3 person-days


  32. Talk Outline
    1. Challenges
    2. Class-Uniform Path Analysis (CUPA)
    3. Interpreter Recipe
    4. Uses and Evaluation


  34. Testing Interpreted Programs
    Naive approach:
    Run interpreter in a stock symbolic execution engine


  35. Testing Interpreted Programs
    def validateEmail(email):
        pos = email.find("@")
        if pos < 1:
            raise InvalidEmailError()
        if email.rfind(".") < pos:
            raise InvalidEmailError()
    Naive approach:
    Run interpreter in a stock symbolic execution engine


  36. Testing Interpreted Programs
    def validateEmail(email):
        pos = email.find("@")
        if pos < 1:
            raise InvalidEmailError()
        if email.rfind(".") < pos:
            raise InvalidEmailError()
    Naive approach:
    Run interpreter in a stock symbolic execution engine
    Python Interpreter
    ./python program.py


  37. Testing Interpreted Programs
    def validateEmail(email):
        pos = email.find("@")
        if pos < 1:
            raise InvalidEmailError()
        if email.rfind(".") < pos:
            raise InvalidEmailError()
    x86 Symbolic Execution Engine
    (S2E)
    Naive approach:
    Run interpreter in a stock symbolic execution engine
    Python Interpreter
    ./python program.py


  38. pos = email.find("@")
    Naive approach:
    Run interpreter in a stock symbolic execution engine


  39. Py_LOCAL_INLINE(Py_ssize_t)
    fastsearch(const STRINGLIB_CHAR* s, Py_ssize_t n,
    const STRINGLIB_CHAR* p, Py_ssize_t m,
    Py_ssize_t maxcount, int mode)
    {
    unsigned long mask;
    Py_ssize_t skip, count = 0;
    Py_ssize_t i, j, mlast, w;
    w = n - m;
    if (w < 0 || (mode == FAST_COUNT && maxcount == 0))
    return -1;
    /* look for special cases */
    if (m <= 1) {
    pos = email.find("@")
    Naive approach:
    Run interpreter in a stock symbolic execution engine


  40. STRINGLIB_BLOOM_ADD(mask, p[i]);
    if (p[i] == p[0])
    skip = i - 1;
    }
    for (i = w; i >= 0; i--) {
    if (s[i] == p[0]) {
    /* candidate match */
    for (j = mlast; j > 0; j--)
    if (s[i+j] != p[j])
    break;
    if (j == 0)
    /* got a match! */
    return i;
    /* miss: check if previous character is part of
    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
    i = i - m;
    else
    i = i - skip;
    } else {
    /* skip: check if previous character is part of
    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
    i = i - m;
    }
    }
    }
    if (mode != FAST_COUNT)
    pos = email.find("@")
    Naive approach:
    Run interpreter in a stock symbolic execution engine


  41. STRINGLIB_BLOOM_ADD(mask, p[i]);
    if (p[i] == p[0])
    skip = i - 1;
    }
    for (i = w; i >= 0; i--) {
    if (s[i] == p[0]) {
    /* candidate match */
    for (j = mlast; j > 0; j--)
    if (s[i+j] != p[j])
    break;
    if (j == 0)
    /* got a match! */
    return i;
    /* miss: check if previous character is part of
    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
    i = i - m;
    else
    i = i - skip;
    } else {
    /* skip: check if previous character is part of
    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
    i = i - m;
    }
    }
    }
    if (mode != FAST_COUNT)
    pos = email.find("@")
    Naive approach:
    Run interpreter in a stock symbolic execution engine
    Path Explosion


  42. STRINGLIB_BLOOM_ADD(mask, p[i]);
    if (p[i] == p[0])
    skip = i - 1;
    }
    for (i = w; i >= 0; i--) {
    if (s[i] == p[0]) {
    /* candidate match */
    for (j = mlast; j > 0; j--)
    if (s[i+j] != p[j])
    break;
    if (j == 0)
    /* got a match! */
    return i;
    /* miss: check if previous character is part of
    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
    i = i - m;
    else
    i = i - skip;
    } else {
    /* skip: check if previous character is part of
    if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))
    i = i - m;
    }
    }
    }
    if (mode != FAST_COUNT)
    pos = email.find("@")
    Gets lost in the details of the implementation
    Naive approach:
    Run interpreter in a stock symbolic execution engine
    Path Explosion


  43. High-level Execution Paths
    def validateEmail(email):
        pos = email.find("@")
        if pos < 1:
            raise InvalidEmailError()
        if email.rfind(".") < pos:
            raise InvalidEmailError()
    High-level
    execution tree


  44. High-level Execution Paths
    def validateEmail(email):
        pos = email.find("@")
        if pos < 1:
            raise InvalidEmailError()
        if email.rfind(".") < pos:
            raise InvalidEmailError()
    High-level
    execution tree
    Low-level (x86)
    execution tree


  47. High-level Execution Paths
    def validateEmail(email):
        pos = email.find("@")
        if pos < 1:
            raise InvalidEmailError()
        if email.rfind(".") < pos:
            raise InvalidEmailError()
    HL/LL path ratio is low
    due to path explosion
    3 HL paths
    10 LL paths
    High-level
    execution tree
    Low-level (x86)
    execution tree


  48. Goal:
    Prioritize the low-level paths
    that maximize the HL/LL path ratio.
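
To make the HL/LL path ratio concrete, here is a small sketch that computes it for the 3-versus-10 example from the previous slide. Each low-level path is represented by the tuple of high-level program counters it went through; the specific traces and the 4/3/3 split are invented for illustration, and `hl_ll_path_ratio` is a hypothetical helper name.

```python
def hl_ll_path_ratio(low_level_paths):
    """Return (#distinct high-level paths) / (#low-level paths).

    Each low-level path is represented by the tuple of high-level
    program counters it passed through.
    """
    if not low_level_paths:
        return 0.0
    return len(set(low_level_paths)) / len(low_level_paths)


# Hypothetical high-level traces for validateEmail (source line numbers as HLPCs).
NO_AT_SIGN = (2, 3, 4)        # first check fails and raises
DOT_BEFORE_AT = (2, 3, 5, 6)  # second check fails and raises
VALID = (2, 3, 5)             # both checks pass

# Ten low-level paths that collapse onto three high-level paths.
low_level_paths = [NO_AT_SIGN] * 4 + [DOT_BEFORE_AT] * 3 + [VALID] * 3

if __name__ == "__main__":
    print(hl_ll_path_ratio(low_level_paths))
```

A ratio close to 1 would mean that nearly every explored low-level path contributed a new high-level path.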


  49. High-level Execution Paths
    Alternative approach:
    Select states at high-level
    branches


  50. High-level Execution Paths
    Fork
    (low-level)
    Divergence
    (high-level)
    Alternative approach:
    Select states at high-level
    branches


  51. High-level Execution Paths
    Fork
    (low-level)
    Divergence
    (high-level)
    High-level fork points are
    unpredictable
    Alternative approach:
    Select states at high-level
    branches


  52. Talk Outline
    1. Challenges
    2. Class-Uniform Path Analysis (CUPA)
    3. Interpreter Recipe
    4. Uses and Evaluation


  54. Reducing Path Explosion
    Program


  55. Reducing Path Explosion
    Fork points clustered in hot spots
    Program Low High
    Selection probability


  56. Reducing Path Explosion
    Fork points clustered in hot spots
    Program
    “Fork bomb”
    (e.g., input dependent loop)
    Low High
    Selection probability
    Global DFS / BFS / randomized strategy


  57. Reducing Path Explosion
    Fork points clustered in hot spots
    Clusters grow bigger ⇒ Slower overall progress
    Program
    “Fork bomb”
    (e.g., input dependent loop)
    Low High
    Selection probability
    Global DFS / BFS / randomized strategy


  58. Reducing Path Explosion
    Fork points clustered in hot spots
    Clusters grow bigger ⇒ Slower overall progress
    Program
    “Fork bomb”
    (e.g., input dependent loop)
    Low High
    Selection probability
    Reduced state diversity
    Global DFS / BFS / randomized strategy


  59. Reducing Path Explosion
    Program
    Idea:
    Partition the state space into groups


  61. Reducing Path Explosion
    Program
    Idea:
    Partition the state space into groups
    Select group
    Select state
    from group


  62. Reducing Path Explosion
    Program
    Idea:
    Partition the state space into groups
    Select group
    Select state
    from group
    Faster progress across all groups


  63. Reducing Path Explosion
    Program
    Idea:
    Partition the state space into groups
    Select group
    Select state
    from group
    Faster progress across all groups
    Increased state diversity
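
A quick calculation shows why selecting a group first helps. The sketch below compares the chance that the next selected state comes from a given group under a uniform pick over all states versus a uniform pick over groups. The group names and state counts are invented, and both helper functions are hypothetical.

```python
def global_share(group_sizes, group):
    """Chance that a uniformly random pick over all states lands in `group`."""
    return group_sizes[group] / sum(group_sizes.values())


def grouped_share(group_sizes, group):
    """Chance that `group` is picked when a group is chosen uniformly first."""
    return 1 / len(group_sizes) if group in group_sizes else 0.0


# Hypothetical state counts: one input-dependent loop dominates the queue.
group_sizes = {"fork_bomb_loop": 1000, "parser": 20, "error_handler": 4}

if __name__ == "__main__":
    for name in group_sizes:
        print(name, global_share(group_sizes, name), grouped_share(group_sizes, name))
```

With these numbers the loop absorbs almost every global pick, while group-first selection gives each of the three groups the same chance regardless of its size.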


  64. Class-Uniform Path Analysis


  65. Class-Uniform Path Analysis
    States arranged in a class hierarchy
    Class 1
    Class 2


  67. Partitioning High-level Paths


  68. Partitioning High-level Paths
    High-level Instruction
    (Bytecode Instruction)


  70. Partitioning High-level Paths
    High-level Instruction
    (Bytecode Instruction)
    High-level
    Program Counter


  71. Partitioning High-level Paths
    High-level Instruction
    (Bytecode Instruction)
    Low-level
    x86 PC
    High-level
    Program Counter


  72. Partitioning High-level Paths
    High-level Instruction
    (Bytecode Instruction)
    Low-level
    x86 PC
    High-level
    Program Counter
    1st CUPA Class


  73. Partitioning High-level Paths
    High-level Instruction
    (Bytecode Instruction)
    Low-level
    x86 PC
    High-level
    Program Counter
    1st CUPA Class
    2nd CUPA Class
    Reconstruct high-level execution tree


  74. CUPA Classes
    1. High-level PC
    • Uniform HL instruction exploration
    • Obtained via instrumentation
    2. x86 PC
    • Uniform native method exploration
    • Approximated as the PC of fork point
    Coverage-optimized CUPA in the paper
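
The two-level selection described above could look roughly like this in Python. It is a simplified sketch under stated assumptions, not Chef's implementation: states are plain tuples, the classes are recomputed on every selection for brevity, and the names `State`, `cupa_select` and `CLASSIFIERS` are invented for this example. The state list must be non-empty.

```python
import random
from collections import defaultdict, namedtuple

State = namedtuple("State", ["hlpc", "x86_pc"])


def cupa_select(states, classifiers, rng):
    """Pick a state by choosing uniformly among classes at each level.

    `classifiers` is an ordered list of functions mapping a state to its
    class at that level (here: high-level PC first, then x86 PC).
    """
    candidates = list(states)
    for classify in classifiers:
        groups = defaultdict(list)
        for state in candidates:
            groups[classify(state)].append(state)
        # Uniform choice over classes, regardless of how many states each holds.
        candidates = groups[rng.choice(list(groups))]
    return rng.choice(candidates)


CLASSIFIERS = [lambda s: s.hlpc, lambda s: s.x86_pc]

# Hypothetical queue: 200 states stuck at one bytecode (a "fork bomb"),
# and a single state that already reached a different bytecode.
states = [State(hlpc=7, x86_pc=0x4000 + i % 4) for i in range(200)]
states.append(State(hlpc=12, x86_pc=0x5000))

if __name__ == "__main__":
    rng = random.Random(0)
    picks = [cupa_select(states, CLASSIFIERS, rng) for _ in range(2000)]
    share = sum(p.hlpc == 12 for p in picks) / len(picks)
    print(f"share of picks at hlpc 12: {share:.2f}")
```

Even though the lone state is outnumbered 200 to 1, it is selected about half the time, because the first level chooses between two high-level PCs with equal probability.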


  75. Talk Outline
    1. Challenges
    2. Class-Uniform Path Analysis (CUPA)
    3. Interpreter Recipe
    4. Uses and Evaluation


  77. Interpreter Loop Instrumentation
    while (true) {
        fetch_instr(hlpc, &opcode, &params);
        switch (opcode) {
        case LOAD:
            ...
        case STORE:
            ...
        case CALL_FUNCTION:
            ...
        ...
        }
        hlpc++;
    }


  78. Interpreter Loop Instrumentation
    while (true) {
        fetch_instr(hlpc, &opcode, &params);
        chef_log_hlpc(hlpc, opcode);
        switch (opcode) {
        case LOAD:
            ...
        case STORE:
            ...
        case CALL_FUNCTION:
            ...
        ...
        }
        hlpc++;
    }
    Reconstruct high-level execution tree and CFG
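
The same idea can be tried on a toy interpreter. The sketch below is a hypothetical stack machine written in Python, not the instrumented CPython or Lua loop: its opcodes, the `log_hlpc` callback and the helpers `trace` and `cfg_edges` are all invented for illustration. It shows how a single logging call per instruction is enough to recover high-level traces and control-flow edges.

```python
def run(program, inputs, log_hlpc):
    """Run a tiny stack-machine program, reporting every instruction via log_hlpc."""
    stack = []
    hlpc = 0
    while hlpc < len(program):
        opcode, param = program[hlpc]   # fetch
        log_hlpc(hlpc, opcode)          # the single instrumentation call
        if opcode == "LOAD":
            stack.append(inputs[param])
        elif opcode == "PUSH":
            stack.append(param)
        elif opcode == "LESS":
            right, left = stack.pop(), stack.pop()
            stack.append(left < right)
        elif opcode == "JUMP_IF_FALSE":
            if not stack.pop():
                hlpc = param
                continue
        elif opcode == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        hlpc += 1
    return None


# Bytecode for: return "invalid" if pos < 1 else "ok"
PROGRAM = [
    ("LOAD", "pos"),         # 0
    ("PUSH", 1),             # 1
    ("LESS", None),          # 2
    ("JUMP_IF_FALSE", 6),    # 3
    ("PUSH", "invalid"),     # 4
    ("RETURN", None),        # 5
    ("PUSH", "ok"),          # 6
    ("RETURN", None),        # 7
]


def trace(inputs):
    """Return (list of visited high-level PCs, program result) for one run."""
    hlpcs = []
    result = run(PROGRAM, inputs, lambda hlpc, opcode: hlpcs.append(hlpc))
    return hlpcs, result


def cfg_edges(traces):
    """Collect control-flow edges (from_hlpc, to_hlpc) seen across traces."""
    edges = set()
    for hlpcs in traces:
        edges.update(zip(hlpcs, hlpcs[1:]))
    return edges


if __name__ == "__main__":
    runs = [trace({"pos": 0}), trace({"pos": 3})]
    print(sorted(cfg_edges(hlpcs for hlpcs, _ in runs)))
```

The two traces share a prefix and split after instruction 3, which is the kind of high-level divergence the execution tree records.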


  79. Interpreter Optimizations
    static long
    string_hash(PyStringObject *a)
    {
    #ifdef SYMBEX_HASHES
        return 0;
    #else
        register Py_ssize_t len;
        register unsigned char *p;
        register long x;
        len = Py_SIZE(a);
        p = (unsigned char *) a->ob_sval;
        x = _Py_HashSecret.prefix;
        x ^= *p << 7;
        while (--len >= 0)
            x = (1000003*x) ^ *p++;
        x ^= Py_SIZE(a);
        x ^= _Py_HashSecret.suffix;
        if (x == -1)
            x = -2;
        return x;
    #endif
    }
    Hash neutralization


  80. Interpreter Optimizations
    • Simple changes to interpreter source
    static long
    string_hash(PyStringObject *a)
    {
    #ifdef SYMBEX_HASHES
        return 0;
    #else
        register Py_ssize_t len;
        register unsigned char *p;
        register long x;
        len = Py_SIZE(a);
        p = (unsigned char *) a->ob_sval;
        x = _Py_HashSecret.prefix;
        x ^= *p << 7;
        while (--len >= 0)
            x = (1000003*x) ^ *p++;
        x ^= Py_SIZE(a);
        x ^= _Py_HashSecret.suffix;
        if (x == -1)
            x = -2;
        return x;
    #endif
    }
    Hash neutralization


  81. Interpreter Optimizations
    • Simple changes to interpreter source
    • “Anti-optimizations” in linear
    performance...
    static long
    string_hash(PyStringObject *a)
    {
    #ifdef SYMBEX_HASHES
        return 0;
    #else
        register Py_ssize_t len;
        register unsigned char *p;
        register long x;
        len = Py_SIZE(a);
        p = (unsigned char *) a->ob_sval;
        x = _Py_HashSecret.prefix;
        x ^= *p << 7;
        while (--len >= 0)
            x = (1000003*x) ^ *p++;
        x ^= Py_SIZE(a);
        x ^= _Py_HashSecret.suffix;
        if (x == -1)
            x = -2;
        return x;
    #endif
    }
    Hash neutralization


  82. Interpreter Optimizations
    • Simple changes to interpreter source
    • “Anti-optimizations” in linear
    performance...
    • ... but exponential gains in symbolic
    mode
    static long
    string_hash(PyStringObject *a)
    {
    #ifdef SYMBEX_HASHES
        return 0;
    #else
        register Py_ssize_t len;
        register unsigned char *p;
        register long x;
        len = Py_SIZE(a);
        p = (unsigned char *) a->ob_sval;
        x = _Py_HashSecret.prefix;
        x ^= *p << 7;
        while (--len >= 0)
            x = (1000003*x) ^ *p++;
        x ^= Py_SIZE(a);
        x ^= _Py_HashSecret.suffix;
        if (x == -1)
            x = -2;
        return x;
    #endif
    }
    Hash neutralization
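
One way to see why a constant hash can help in symbolic mode is to count how many hash-table buckets an unknown key could land in. The sketch below ports the slide's hash to Python under simplifying assumptions (zero prefix and suffix secrets, unsigned 64-bit wraparound instead of a signed `long`, no special case for -1) and enumerates all two-character strings over a tiny alphabet as a stand-in for a symbolic string. The helper `reachable_buckets` and the table size of 8 are invented for this example.

```python
from itertools import product

MASK = (1 << 64) - 1  # emulate a 64-bit register; sign handling is simplified


def string_hash(text):
    """Port of the hash on the slide, assuming zero prefix and suffix secrets."""
    data = text.encode("latin-1")
    x = (data[0] << 7) if data else 0
    for byte in data:
        x = ((1000003 * x) ^ byte) & MASK
    return x ^ len(data)


def neutralized_hash(text):
    """What the SYMBEX_HASHES build on the slide returns for every string."""
    return 0


def reachable_buckets(hash_function, keys, table_size=8):
    """Set of bucket indices that at least one key in `keys` maps to."""
    return {hash_function(key) % table_size for key in keys}


# Stand-in for a symbolic 2-character string: every string over a tiny alphabet.
KEYS = ["".join(chars) for chars in product("ab@", repeat=2)]

if __name__ == "__main__":
    print("regular hash:    ", sorted(reachable_buckets(string_hash, KEYS)))
    print("neutralized hash:", sorted(reachable_buckets(neutralized_hash, KEYS)))
```

With the regular hash the bucket index depends on the key's contents, so an engine would have to consider several buckets; with the neutralized hash every key maps to bucket 0, trading slower concrete lookups for fewer cases to explore.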


  83. Program +
    Symbolic Tests
    Symbolic Execution
    Engine for
    Language X
    Chef Summary
    Language X
    Interpreter
    (+instrumentation)
    CHEF API
    HL Tree
    Reconstr.
    CUPA
    State
    Selection
    S2E x86 Symbolic Execution
    CHEF


  85. Talk Outline
    1. Challenges
    2. Class-Uniform Path Analysis (CUPA)
    3. Interpreter Recipe
    4. Uses and Evaluation


  87. Chef-Prototyped Engines
    Python
    5 person-days
    321 LoC
    Lua
    3 person-days
    277 LoC


  89. Evaluation Questions
    How does a Chef-obtained engine...
    • ... work for test case generation?
    • ... benefit from CUPA and optimizations?
    • ... compare to a dedicated implementation?


  90. Using a Chef Engine
    class ArgparseTest(SymbolicTest):
        def setUp(self):
            self.argparse = import_module("argparse")
        def runTest(self):
            parser = self.argparse.ArgumentParser()
            arg_name = self.getSymString(size=3)
            arg_value = self.getSymString(size=3)
            parser.add_argument(arg_name)
            args = parser.parse_args([arg_value])


  94. Using a Chef Engine
    class ArgparseTest(SymbolicTest):
        def setUp(self):
            self.argparse = import_module("argparse")
        def runTest(self):
            parser = self.argparse.ArgumentParser()
            arg_name = self.getSymString(size=3)
            arg_value = self.getSymString(size=3)
            parser.add_argument(arg_name)
            args = parser.parse_args([arg_value])
    CHEF
    Symbolic Test
    Library
    Program


  95. Testing Python Packages
    xlrd
    simplejson
    argparse
    HTMLParser
    ConfigParser
    unicodecsv
    6 Popular Packages
    10.9K lines of Python code
    30 min. / package
    > 7,000 tests generated
    4 undocumented exceptions found


  96. Testing Python Packages
    xlrd
    simplejson
    argparse
    HTMLParser
    ConfigParser
    unicodecsv
    6 Popular Packages
    10.9K lines of Python code
    30 min. / package
    > 7,000 tests generated
    4 undocumented exceptions found
    High bug finding potential for dynamic languages


  97. Efficiency
    [Chart: path ratio (P / P_Baseline, log scale from 0.1 to 10000) per package: xlrd, simplejson, argparse, HTMLParser, ConfigParser, unicodecsv]
    Series: CUPA + Optimizations, Baseline


  98. Efficiency
    [Chart: path ratio (P / P_Baseline, log scale from 0.1 to 10000) per package: xlrd, simplejson, argparse, HTMLParser, ConfigParser, unicodecsv]
    Series: CUPA + Optimizations, Optimizations Only, CUPA Only, Baseline


  99. Comparison to Dedicated Engine
    • Symbolic execution engine of NICE [1]
    • Targets OpenFlow applications in Python
    • Case Study: Switch MAC learning algorithm
    [1] M. Canini, D. Venzano, P. Peresini, D. Kostic, and J. Rexford.
    “A NICE way to test OpenFlow applications.” NSDI 2012.


  100. Overhead
    [Plot: CHEF overhead (T_CHEF / T_NICE, log scale from 1 to 1000) versus size of symbolic input (number of Ethernet frames, 1 to 10)]


  101. Overhead
    [Plot: CHEF overhead (T_CHEF / T_NICE, log scale from 1 to 1000) versus size of symbolic input (number of Ethernet frames, 1 to 10)]
    Annotation: >100×


  102. Overhead
    [Plot: CHEF overhead (T_CHEF / T_NICE, log scale from 1 to 1000) versus size of symbolic input (number of Ethernet frames, 1 to 10)]
    Annotations: >100×; One-time Initialization


  103. Overhead
    [Plot: CHEF overhead (T_CHEF / T_NICE, log scale from 1 to 1000) versus size of symbolic input (number of Ethernet frames, 1 to 10)]
    Annotations: >100×; 40×; One-time Initialization; x86 Reasoning Overhead (Instructions + Constraints)


  104. Chef Engine as Reference Implementation
    Chef-Python Reference Paths


  105. Test Cases
    Chef Engine as Reference Implementation
    NICE
    Dedicated
    Engine
    Chef-Python Reference Paths


  107. Test Cases
    Chef Engine as Reference Implementation
    NICE
    Dedicated
    Engine
    Missing Paths
    Duplicate Paths
    Chef-Python Reference Paths
    Chef engine’s correctness outweighs performance penalty
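
The comparison against reference paths can be pictured as a set difference plus a duplicate count. This is a hedged sketch only: how paths were actually encoded and matched in the evaluation is not stated on the slides, so the tuple-of-HLPCs representation, the sample data and the helper name `compare_to_reference` are assumptions.

```python
from collections import Counter


def compare_to_reference(reference_paths, engine_paths):
    """Return (missing, duplicates) for an engine relative to reference paths.

    missing:    reference paths the engine never produced
    duplicates: paths the engine produced more than once, with their counts
    """
    counts = Counter(engine_paths)
    missing = set(reference_paths) - set(counts)
    duplicates = {path: n for path, n in counts.items() if n > 1}
    return missing, duplicates


# Hypothetical data: each path is the tuple of high-level PCs it visited.
reference_paths = [(0, 1, 2), (0, 1, 3, 4), (0, 1, 3, 5)]
engine_paths = [(0, 1, 2), (0, 1, 2), (0, 1, 3, 4)]

if __name__ == "__main__":
    missing, duplicates = compare_to_reference(reference_paths, engine_paths)
    print("missing:", sorted(missing))
    print("duplicates:", duplicates)
```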


  108. Conclusions
    http://dslab.epfl.ch/proj/chef
    CHEF
    Program +
    Symbolic Tests
    Language X
    Interpreter
    Test Cases
