Fuzzing -- From Alchemy to a Science

Talk at Passau University

Rahul Gopinath

June 30, 2021

Transcript

  5. Fuzzing
    from Alchemy
    to Science
    Rahul Gopinath

    View Slide

  6. The story begins 2500 years ago.
    500 BC
    Vedic Sanskrit to Classical Sanskrit
    Aṣṭādhyāyī by Dakṣiputra Pāṇini
    Ad hoc rules to formal specification

    View Slide

  13. 2500 years later...
    2021 CE
    The world is governed by software
    We have a crisis

    View Slide

  16. Growth in software complexity
    Debian 5 ~ 70 million lines
    Smart cars ~ 100 million lines
    Google ~ 2 billion lines
    [Figure: size of the Linux kernel in millions of lines of code (y-axis, 5 M to 35 M) across Linux releases 1.0.0 to 5.7, from 1994 onward (x-axis). Source: Wikipedia]

    View Slide

  18. Growth in vulnerabilities
    [Figure: number of vulnerabilities (y-axis, 2k to 16k) per year, 2001 to 2019 (x-axis). Source: NIST]

    View Slide

  22. Fuzzing
    Crash?
    Program
    Trash deck technique: 1950s - Gerald Weinberg

    View Slide

  23. Fuzzing
    Random Inputs
    Program
    Automatic Checks
    • Memory Bounds Violation
    • Privilege Escalation
    • Safety Violations
    • Metamorphic Relations
    • Differential Execution

    View Slide
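
One of the automatic checks listed above, differential execution, can be sketched as follows. This is only an illustration: `reference_max`, `candidate_max`, and the seeded bug are hypothetical and do not come from the talk. Random inputs are fed to two implementations of the same function, and any disagreement is reported as a failure.

```python
import random


def reference_max(xs):
    """Trusted implementation used as the oracle."""
    return max(xs)


def candidate_max(xs):
    """Hypothetical implementation under test with a seeded bug:
    starting from 0 is wrong when every element is negative."""
    best = 0
    for x in xs:
        if x > best:
            best = x
    return best


def differential_fuzz(trials, rng):
    """Feed random lists to both implementations; collect disagreements."""
    failures = []
    for _ in range(trials):
        xs = [rng.randint(-5, 5) for _ in range(rng.randint(1, 4))]
        if candidate_max(xs) != reference_max(xs):
            failures.append(xs)
    return failures


if __name__ == "__main__":
    for xs in differential_fuzz(1000, random.Random(1))[:3]:
        print("disagreement on", xs)
```

No expected output has to be written by hand here: the second implementation serves as the check, which is what makes the oracle automatic.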

  26. 11
    Random Fuzzing
    $ ./fuzz
    [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu

    2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z

    h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!

    AxB"[email protected]!Kd6;wtAMefFWM(`|J_<1~o}

    z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?

    lR=bF3+;y$3lodQ)KC-i,c{<[~m!]o;{.'}Gj\(X}EtYetrp

    [email protected]{P!AZU7x#4(Rtn!q4nCwqol^y6

    }0|Ko=*JK~;zMKV=9Nai:wxu{J&UV#HaU

    )*BiC<),`+t*gkaPq>&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM

    PG-FKj'\xwuZ1=Q`^`5,[email protected][!CuRzJ2

    D|vBy!^zkhdf3C5PAkR?V((-%>i2Qx]D$qs4O`[email protected]'2\[email protected]
    5:dfd45*(7^%5ap\zIyl"'f,$ee,J4Gw:

    cgNKLie3nx9(`efSlg6#[K"@WjhZ}r[Sc

    un&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/6N

    -wyzj/MTd#A;r*(ds./df3r8Odaf?/<#r
    Program

    View Slide
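
A generator that produces output like the above could look roughly like this. It is a minimal sketch, not the `./fuzz` tool from the slide: the function name `fuzz` and its parameters are assumptions made for illustration.

```python
import random


def fuzz(max_length=100, char_start=32, char_range=95, rng=None):
    """Return a random string of up to max_length printable ASCII characters."""
    rng = rng or random.Random()
    length = rng.randrange(0, max_length + 1)
    return "".join(
        chr(rng.randrange(char_start, char_start + char_range))
        for _ in range(length)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(fuzz(max_length=40, rng=rng))
```

Each line it prints can be piped to the program under test; the fuzzer itself has no knowledge of the input format.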

  27. Random Fuzzing
    $ ./fuzz -int | program
    634111569742810193727424069509
    741355925061499451162464719526
    615957331924826555590537407605
    181400079803446874252046374716
    740973770255348279425601333144
    152724057932073828569041216191
    099859446496509919024810271242
    622974988671421938012464630138
    735355134599327240920259675263
    574528613057084231370741920902
    794677842164654990353575580453
    777282305855352378119038096476
    699871306655084953377039862387
    924957554389878352934547664240
    082431556093837288597262675598
    630851919061829885048834738832
    677022429414980917053939970795
    722006987916088650168665471731 yes
    from sys import stdin

    def is_prime(n: int) -> bool:
        """Primality test using 6k+-1 optimization."""
        if n <= 3:
            return n > 1
        if n % 2 == 0 or n % 3 == 0:
            return False
        i = 5
        while i ** 2 <= n:
            if n % i == 0 or n % (i + 2) == 0:
                return False
            i += 6
        return True

    def main():
        # Each input line holds one integer; convert it before testing.
        for line in stdin:
            num = int(line)
            print(num, is_prime(num))
    Weak point: unequal divisions in input space
    [Figure: input space split into n > 3 and n <= 3]

    View Slide
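
The weak point can be made concrete with a small experiment. The sketch below is an illustration under assumptions (the helper names and the choice of uniformly random 30-digit integers are not from the talk): it counts how often random inputs land in each of the two partitions that `is_prime` distinguishes first.

```python
import random


def partition(n):
    """Name the branch of `if n <= 3` that input n exercises."""
    return "n <= 3" if n <= 3 else "n > 3"


def sample_partitions(trials=10_000, digits=30, seed=0):
    """Draw uniformly random integers below 10**digits and count partitions."""
    rng = random.Random(seed)
    counts = {"n <= 3": 0, "n > 3": 0}
    for _ in range(trials):
        counts[partition(rng.randrange(10 ** digits))] += 1
    return counts


if __name__ == "__main__":
    print(sample_partitions())
```

Only 4 of the 10**30 possible values satisfy `n <= 3`, so a uniform random fuzzer practically never exercises that branch, even though it is half of the first decision in the code.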

  29. Advanced Fuzzing: Instrumentation
    Original:
    def triangle(a, b, c):
        if a == b:
            if b == c:
                return 'Equilateral'
            else:
                return 'Isosceles'
        else:
            if b == c:
                return 'Isosceles'
            else:
                if a == c:
                    return 'Isosceles'
                else:
                    return 'Scalene'
    Instrumented:
    def triangle(a, b, c):
        # Each __probe_*() call is inserted by the instrumenter to record coverage.
        __probe_enter()
        if a == b:
            __probe_1()
            if b == c:
                __probe_2()
                return 'Equilateral'
            else:
                __probe_3()
                return 'Isosceles'
        else:
            __probe_4()
            if b == c:
                __probe_5()
                return 'Isosceles'
            else:
                __probe_6()
                if a == c:
                    __probe_7()
                    return 'Isosceles'
                else:
                    __probe_8()
                    return 'Scalene'

    View Slide
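
In Python, a similar effect to the inserted probes can be obtained without rewriting the source, by registering a trace function. The sketch below is one possible way to do it and not the instrumentation used in the talk; `covered_lines` is a hypothetical helper, and it reports line offsets relative to the `def` line.

```python
import sys


# The same branching structure as the triangle function on the slide.
def triangle(a, b, c):
    if a == b:
        if b == c:
            return "Equilateral"
        else:
            return "Isosceles"
    else:
        if b == c:
            return "Isosceles"
        else:
            if a == c:
                return "Isosceles"
            else:
                return "Scalene"


def covered_lines(func, *args):
    """Run func(*args); return its result and the executed line offsets."""
    covered = set()
    code = func.__code__

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if frame.f_code is code and event == "line":
            covered.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, covered


if __name__ == "__main__":
    for args in [(1, 1, 1), (1, 1, 2), (1, 2, 3)]:
        print(args, *covered_lines(triangle, *args))
```

Comparing the sets returned for different inputs shows which statements a new input reached that earlier inputs did not.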

  32. Instrumentation Based Fuzzing
    def triangle(a, b, c):
        if a == b:
            if b == c:
                return 'Equilateral'
            else:
                return 'Isosceles'
        else:
            if b == c:
                return 'Isosceles'
            else:
                if a == c:
                    return 'Isosceles'
                else:
                    return 'Scalene'
    triangle(1,1,1) covers the path a == b, b == c (Equilateral)
    triangle(1,1,2) covers the path a == b, b != c (Isosceles)

    View Slide

  36. 17
    Instrumentation Guided Fuzzing
    • Coverage Guided
    • Solver Directed

    View Slide

  39. 19
    Coverage Guided Fuzzing
    • Randomly generate inputs
    • Choose seeds with new coverage
    AFL
    First Release: 2013
    AFL Trophy Case

    View Slide

  40. 20
    Coverage Guided Fuzzing
    • Randomly generate inputs
    • Choose seeds with new coverage
    AFL
    Pulling JPEGs out of thin air
    https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
    Valid JPEG in 6 hours on an 8-core machine!

    View Slide
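
The two bullets above can be sketched as a loop. This is a toy under stated assumptions, not how AFL is implemented: AFL instruments compiled binaries and tracks branch transitions, whereas this sketch uses Python line tracing, and `target`, `mutate`, and `coverage_guided_fuzz` are hypothetical names.

```python
import random
import sys


def target(s):
    """Toy program under test: fails only for inputs starting with 'bad'."""
    if len(s) > 0 and s[0] == "b":
        if len(s) > 1 and s[1] == "a":
            if len(s) > 2 and s[2] == "d":
                raise ValueError("bug reached")


def run_with_coverage(func, arg):
    """Return (set of executed line numbers in func, exception or None)."""
    covered, code = set(), func.__code__

    def tracer(frame, event, _arg):
        if frame.f_code is code and event == "line":
            covered.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(arg)
        error = None
    except Exception as exc:
        error = exc
    finally:
        sys.settrace(previous)
    return covered, error


def mutate(s, rng):
    """Replace one character or append one, using lowercase letters."""
    letter = rng.choice("abcdefghijklmnopqrstuvwxyz")
    if s and rng.random() < 0.5:
        pos = rng.randrange(len(s))
        return s[:pos] + letter + s[pos + 1:]
    return s + letter


def coverage_guided_fuzz(seed_input, trials, rng):
    """Mutate corpus entries; keep mutants that execute new lines."""
    corpus = [seed_input]
    seen, _ = run_with_coverage(target, seed_input)
    crashes = []
    for _ in range(trials):
        candidate = mutate(rng.choice(corpus), rng)
        covered, error = run_with_coverage(target, candidate)
        if error is not None:
            crashes.append(candidate)
        if not covered <= seen:  # new coverage: keep as a seed
            seen |= covered
            corpus.append(candidate)
    return corpus, crashes


if __name__ == "__main__":
    corpus, crashes = coverage_guided_fuzz("a", 20000, random.Random(0))
    print("corpus:", corpus)
    print("first crash:", crashes[:1])
```

Because every mutant that reaches a new line is kept as a seed, the three-character prefix is discovered one character at a time instead of having to be guessed all at once.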

  41. Coverage Guided Fuzzing
    • Randomly generate inputs
    • Choose seeds with new coverage
    AFL
    static int is_reserved_word_token(const char *s, int len) {
      const char *reserved[] = {
          "break", "case", "catch", "continue", "debugger", "default",
          "delete", "do", "else", "false", "finally", "for",
          "function", "if", "in", "instanceof", "new", "null",
          "return", "switch", "this", "throw", "true", "try",
          "typeof", "var", "void", "while", "with", "let",
          "undefined", ((void *)0)};
      int i;
      if (!mjs_is_alpha(s[0]))
        return 0;
      for (i = 0; reserved[i] != ((void *)0); i++) {
        if (len == (int)strlen(reserved[i]) && strncmp(s, reserved[i], len) == 0)
          return i + 1;
      }
      return 0;
    }
    Weak point: Magic bytes

    View Slide

  49. 23
    Solver Directed Fuzzing
    • Collect path constraints
    • Solve negated constraints for new inputs
    (a == b)
    (b == c)
    (b != c)
    triangle(1,2,1)

    View Slide
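
The two steps above can be sketched for the triangle example. This is an illustration under assumptions: a real tool would hand the constraints to an SMT solver, while the sketch substitutes a brute-force search over a tiny integer domain, and `traced_triangle`, `solve`, and `explore` are hypothetical names.

```python
from itertools import product


def traced_triangle(a, b, c, path):
    """triangle() that records each branch condition and its outcome."""
    def branch(label, outcome):
        path.append((label, outcome))
        return outcome

    if branch("a == b", a == b):
        if branch("b == c", b == c):
            return "Equilateral"
        return "Isosceles"
    if branch("b == c", b == c):
        return "Isosceles"
    if branch("a == c", a == c):
        return "Isosceles"
    return "Scalene"


CONDITIONS = {
    "a == b": lambda a, b, c: a == b,
    "b == c": lambda a, b, c: b == c,
    "a == c": lambda a, b, c: a == c,
}


def solve(constraints, domain=range(1, 4)):
    """Brute-force stand-in for a constraint solver."""
    for a, b, c in product(domain, repeat=3):
        if all(CONDITIONS[label](a, b, c) == wanted
               for label, wanted in constraints):
            return (a, b, c)
    return None


def explore(seed):
    """Negate each branch of every explored path to reach new paths."""
    worklist, seen_paths, inputs = [seed], set(), []
    while worklist:
        args = worklist.pop()
        path = []
        traced_triangle(*args, path)
        if tuple(path) in seen_paths:
            continue
        seen_paths.add(tuple(path))
        inputs.append(args)
        for i, (label, outcome) in enumerate(path):
            # Keep the prefix, flip the i-th decision, drop the rest.
            solution = solve(path[:i] + [(label, not outcome)])
            if solution is not None:
                worklist.append(solution)
    return inputs, seen_paths


if __name__ == "__main__":
    inputs, paths = explore((1, 1, 1))
    for args in inputs:
        print(args, traced_triangle(*args, []))
```

Starting from triangle(1,1,1), the path is (a == b), (b == c). With this search order, negating the first constraint yields (1, 2, 1) and negating the second yields (1, 1, 2); repeating the process on the new inputs reaches all five paths through the function.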

  50. Solver Directed Fuzzing
    • Collect path constraints
    • Solve constraints for new inputs
    void next_sym() {
      while (1) {
        switch (ch) {
          case '{': next_ch(); sym = LBRA; return;
          case '}': next_ch(); sym = RBRA; return;
          case '(': next_ch(); sym = LPAR; return;
          case ')': next_ch(); sym = RPAR; return;
          case '+': next_ch(); sym = PLUS; return;
          case '-': next_ch(); sym = MINUS; return;
          case '<': next_ch(); sym = LESS; return;
          case ';': next_ch(); sym = SEMI; return;
          case '=': next_ch(); sym = EQUAL; return;
          default:
            if (ch >= '0' && ch <= '9') {
              int_val = 0; /* missing overflow check */
              while (ch >= '0' && ch <= '9') {
                int_val = int_val*10 + (ch - '0'); next_ch(); }
              sym = INT;
            } else if (ch >= 'a' && ch <= 'z') {
              int i = 0; /* missing overflow check */
              while ((ch >= 'a' && ch <= 'z') || ch == '_') {
                id_name[i++] = ch; next_ch(); }
              id_name[i] = '\0';
              sym = 0;
              while (words[sym] != NULL && strcmp(words[sym], id_name) != 0)
                sym++;
              if (words[sym] == NULL)
                if (id_name[1] == '\0') sym = ID; else syntax_error();
            }
            else syntax_error();
            return;
        }
      }
    }
    Weak point: Path explosion

    View Slide

  55. 26
    Fuzzing Parsers
    $ ./fuzz
    [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu

    2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z

    h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!

    AxB"[email protected]!Kd6;wtAMefFWM(`|J_<1~o}

    z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?

    lR=bF3+;y$3lodQ)KC-i,c{<[~m!]o;{.'}Gj\(X}EtYetrp

    [email protected]{P!AZU7x#4(Rtn!q4nCwqol^y6

    }0|Ko=*JK~;zMKV=9Nai:wxu{J&UV#HaU

    )*BiC<),`+t*gkaPq>&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM

    PG-FKj'\xwuZ1=Q`^`5,[email protected][!CuRzJ2

    D|vBy!^zkhdf3C5PAkR?V((-%>i2Qx]D$qs4O`[email protected]'2\[email protected]
    5:dfd45*(7^%5ap\zIyl"'f,$ee,J4Gw:

    cgNKLie3nx9(`efSlg6#[K"@WjhZ}r[Sc

    un&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/6N

    -wyzj/MTd#A;r*(ds./df3r8Odaf?/<#r
    Parser
    Syntax Error
    Interpreter
    #

    View Slide

  58. 27
    Advanced Fuzzing: Specialized Generators
    • Specialize generation for a domain
    80,000 lines of code

    View Slide

  60. 28
    The Holy Grail of Fuzzing
    Parsers

    View Slide

  65. 30
    Monolithic
    Overcoming Parsers

    View Slide

  66. 31

    View Slide

  67. 32
    The Missing Piece: Formal Specification

    View Slide

  68. 33
    Grammar

    View Slide

  73. 34
    Formal Languages
    Formal Language Descriptions
    3. Regular
    Context Free
    Recursively Enumerable
    (Chomsky,1956)
    Easy to produce and parse
    Argument Stack
    Return Stack

    View Slide

  74. Grammar
    Arithmetic expression grammar
    <start>   := <expr>
    <expr>    := <expr> '+' <expr>
               | <expr> '-' <expr>
               | <expr> '/' <expr>
               | <expr> '*' <expr>
               | '(' <expr> ')'
               | <number>
    <number>  := <integer>
               | <integer> '.' <integer>
    <integer> := <digit> <integer>
               | <digit>
    <digit>   := [0-9]
    The name on the left of := is the key; the alternatives on the right are the definition for that key.

    View Slide

  78. Grammar
    Arithmetic expression grammar
    <start>   := <expr>
    <expr>    := <expr> '+' <expr>
               | <expr> '-' <expr>
               | <expr> '/' <expr>
               | <expr> '*' <expr>
               | '(' <expr> ')'
               | <number>
    <number>  := <integer>
               | <integer> '.' <integer>
    <integer> := <digit> <integer>
               | <digit>
    <digit>   := [0-9]
    Expansion Rule: each alternative of a definition
    Terminal Symbol: a quoted string such as '+'
    Nonterminal Symbol: a name in angle brackets such as <expr>

    View Slide

  83. Grammars
    As recognizers
    (8 / 3) * 49
    <start>   := <expr>
    <expr>    := <expr> '+' <expr>
               | <expr> '-' <expr>
               | <expr> '/' <expr>
               | <expr> '*' <expr>
               | '(' <expr> ')'
               | <number>
    <number>  := <integer>
               | <integer> '.' <integer>
    <integer> := <digit> <integer>
               | <digit>
    <digit>   := [0-9]

    View Slide

  87. 38
    Grammars
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-
    ((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4)
    )))) - ++4) / +(-+---((5.6 - --(3 *
    -1.8 * +(6 * +-(((-(-6) * ---+6)) /
    +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3
    - ++9.0 + ---(--+7 / (1 / +++6.37)
    + (1) / 482) / +++-+0)))) + 8.2 - 27
    - -9 / +((+9 * --2 + --+-+-((-1 * +
    (8 - 5 - 6)) * (-(a-+(((+(4))))) - +
    +4) / +(-+---((5.6 - --(3 * -1.8 * +
    (6 * +-(((-(-6) * ---+6)) / +--(+-+-
    7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0
    + ---(--+7 / (1 / +++6.37) + (1) /
    482) / +++-+0)))) * -+5 + 7.513))))
    - (+1 / ++((-84)))))))) * ++5 / +-(-
    -2 - -++-9.0)))) / 5 * --++090 + * -
    +5 + 7.513)))) - (+1 / ++((-84))))))
    )) * 8.2 - 27 - -9 / +((+9 * --2 + -
    -+-+-((-1 * +(8 - 5 - 6)) * (-(a-+((
    (+(4))))) - ++4) / +(-+---((5.6 - --
    (3 * -1.8 * +(6 * +-(((-(-6) * ---+6
    )) / +--(+-+-7 * (-0 * (+(((((2)) +
    8 - 3 - ++9.0 + ---(--+7 / (1 / +++6
    .37) + (1) / 482) / +++-+0)))) * -+5
    + 7.513)))) - (+1 / ++((-84))))))))
    * ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090 ++5 / +-(--2 - -++-9.0)))) /
    5 * --++090
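    <start>   := <expr>
    <expr>    := <expr> '+' <expr>
               | <expr> '-' <expr>
               | <expr> '/' <expr>
               | <expr> '*' <expr>
               | '(' <expr> ')'
               | <number>
    <number>  := <integer>
               | <integer> '.' <integer>
    <integer> := <digit> <integer>
               | <digit>
    <digit>   := [0-9]
    As producers (Hanford 1970; Purdom 1972)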

    View Slide
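
Using a grammar as a producer can be sketched in a few lines. The code below is a minimal illustration and not the generator used in the talk: the nonterminal names in `GRAMMAR` (`<start>`, `<expr>`, `<number>`, `<integer>`, `<digit>`) are assumed names for the arithmetic expression grammar, and the depth limit is one simple way among several to force termination.

```python
import random

GRAMMAR = {
    "<start>": [["<expr>"]],
    "<expr>": [
        ["<expr>", "+", "<expr>"],
        ["<expr>", "-", "<expr>"],
        ["<expr>", "/", "<expr>"],
        ["<expr>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<number>"],
    ],
    "<number>": [["<integer>"], ["<integer>", ".", "<integer>"]],
    "<integer>": [["<digit>", "<integer>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def min_cost(symbol, grammar, seen=frozenset()):
    """Fewest expansion levels needed to turn symbol into terminals."""
    if symbol not in grammar:
        return 0
    if symbol in seen:
        return float("inf")
    seen = seen | {symbol}
    return 1 + min(
        max(min_cost(s, grammar, seen) for s in rule)
        for rule in grammar[symbol]
    )


def produce(symbol, grammar, rng, depth=0, max_depth=8):
    """Expand symbol by picking random rules; past max_depth, pick the
    cheapest rules only so that the expansion terminates."""
    if symbol not in grammar:
        return symbol  # terminal symbol: emit as is
    rules = grammar[symbol]
    if depth >= max_depth:
        costs = [max(min_cost(s, grammar) for s in rule) for rule in rules]
        cheapest = min(costs)
        rules = [r for r, c in zip(rules, costs) if c == cheapest]
    rule = rng.choice(rules)
    return "".join(
        produce(s, grammar, rng, depth + 1, max_depth) for s in rule
    )


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(produce("<start>", GRAMMAR, rng))
```

Every string produced this way is accepted by a parser for the same grammar, so the inputs get past the parsing stage and exercise the code behind it.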

  89. 39
    Grammars
    As effective producers
    Interpreter
    Parser


    8.2 - 27 - -9 / +((+9 * --2 + --+-+-
    ((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4)
    )))) - ++4) / +(-+---((5.6 - --(3 *
    -1.8 * +(6 * +-(((-(-6) * ---+6)) /
    +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3
    - ++9.0 + ---(--+7 / (1 / +++6.37)
    + (1) / 482) / +++-+0)))) + 8.2 - 27
    - -9 / +((+9 * --2 + --+-+-((-1 * +
    (8 - 5 - 6)) * (-(a-+(((+(4))))) - +
    +4) / +(-+---((5.6 - --(3 * -1.8 * +
    (6 * +-(((-(-6) * ---+6)) / +--(+-+-
    7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0
    + ---(--+7 / (1 / +++6.37) + (1) /
    482) / +++-+0)))) * -+5 + 7.513))))
    - (+1 / ++((-84)))))))) * ++5 / +-(-
    -2 - -++-9.0)))) / 5 * --++090 + * -
    +5 + 7.513)))) - (+1 / ++((-84))))))
    )) * 8.2 - 27 - -9 / +((+9 * --2 + -
    -+-+-((-1 * +(8 - 5 - 6)) * (-(a-+((
    (+(4))))) - ++4) / +(-+---((5.6 - --
    (3 * -1.8 * +(6 * +-(((-(-6) * ---+6
    )) / +--(+-+-7 * (-0 * (+(((((2)) +
    8 - 3 - ++9.0 + ---(--+7 / (1 / +++6
    .37) + (1) / 482) / +++-+0)))) * -+5
    + 7.513)))) - (+1 / ++((-84))))))))
    * ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090 ++5 / +-(--2 - -++-9.0)))) /
    5 * --++090

    View Slide

  90. 40
    Where to Get the Grammar From?

    View Slide

  91. Reference Specification?
    The standard spec
    Buggy Implementation
    "Extra" Features
    "Accepted" Bugs
    Postel's Law (the cause of trouble): "Be liberal in what you accept, and conservative in what you send"

    View Slide

  96. https://www.json.org
    object
        { }
        { members }
    members
        pair
        pair , members
    pair
        string : value
    array
        [ ]
        [ elements ]
    elements
        value
        value , elements
    value
        string
        number
        object
        array
        true
        false
        null
    string
        " "
        " chars "
    chars
        char
        char chars
    char
        UNICODE \ [",\,CTRL]
        \" \\ \/ \b \f \n \r \t
        \u hex hex hex hex
    number
        int
        int frac
        int exp
        int frac exp
    int
        digit
        onenine digits
        - digit
        - onenine digits
    frac
        . digits
    exp
        e digits
    hex
        digit
        A - F
        a - f
    digits
        digit
        digit digits
    e
        e e+ e-
        E E+ E-

    View Slide

  103. 43
    Parsing JSON is a Minefield
    http://seriot.ch/
    Expected
    Parse Fail (Expect Success)
    Parse Success (Expect Fail)
    Parse Success (Undefined)
    Parse Fail (Undefined)
    Parser Crash
    Timeout

    View Slide
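
A small example of an implementation accepting more than the reference grammar: Python's standard `json` module parses `NaN`, `Infinity`, and `-Infinity` by default, although the json.org grammar has no such values. The sketch below shows this and one way to reject them; `strict_loads` is a hypothetical helper name, while `parse_constant` is the documented hook of `json.loads`.

```python
import json
import math


def _reject_constant(name):
    """Called by json.loads for 'NaN', 'Infinity' and '-Infinity'."""
    raise ValueError(f"not part of the json.org grammar: {name}")


def strict_loads(text):
    """Parse JSON but refuse the non-standard numeric constants."""
    return json.loads(text, parse_constant=_reject_constant)


if __name__ == "__main__":
    # Accepted by default, although the reference grammar has no such values.
    print(math.isnan(json.loads("NaN")), json.loads("[Infinity]"))
    # Duplicate keys are also accepted; the last value wins.
    print(json.loads('{"a": 1, "a": 2}'))
    try:
        strict_loads("NaN")
    except ValueError as exc:
        print("rejected:", exc)
```

A grammar taken from the specification alone would never generate these inputs, which is one reason to recover the grammar from the implementation instead.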

  104. View Slide

  106. 45
    Where to Get the Grammar From?
    Hand-written parsers already encode the grammar

    View Slide

  109. 46
    How to Extract This Grammar?
    • Inputs -> Dynamic Control Dependence Trees
    • DCD Trees -> Context Free Grammar

    View Slide

  110. Control Dependence Graph
    Statement B is control dependent on A if A determines whether B executes.
    def parse_csv(s,i):
        while s[i:]:
            if is_digit(s[i]):
                n,j = num(s[i:])
                i = i+j
            else:
                comma(s[i])
                i += 1
    [Figure: CDG for parse_csv. The while determines whether the if executes.]

    View Slide

  113. Dynamic Control Dependence Tree
    Each statement execution is represented as a separate node
    def parse_csv(s,i):
        while s[i:]:
            if is_digit(s[i]):
                n,j = num(s[i:])
                i = i+j
            else:
                comma(s[i])
                i += 1
    [Figure: CDG for parse_csv]
    [Figure: DCD Tree for call parse_csv()]

    View Slide

  115. DCD Tree ~ Parse Tree
    • No tracking beyond input buffer
    • Characters are attached to nodes where they are accessed last
    def parse_csv(s,i):
        while s[i:]:
            if is_digit(s[i]):
                n,j = num(s[i:])
                i = i+j
            else:
                comma(s[i])
                i += 1
    Input "12," yields the leaves '1' '2' ','

    View Slide
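
The rule that characters are attached where they are accessed last can be sketched with a wrapper around the input buffer. This is a simplification under assumptions: it tracks only the name of the accessing function instead of full control dependence nodes, `TrackedInput` is a hypothetical class, the parser is a reworked variant of `parse_csv` that passes indices instead of slices, and `sys._getframe` is specific to CPython.

```python
import sys


class TrackedInput:
    """Input buffer that remembers which function read each index last."""

    def __init__(self, text):
        self.text = text
        self.last_access = {}  # index -> name of the function that read it last

    def __getitem__(self, index):
        # Frame 0 is __getitem__ itself; frame 1 is the code reading the input.
        self.last_access[index] = sys._getframe(1).f_code.co_name
        return self.text[index]

    def __len__(self):
        return len(self.text)


def num(s, i):
    """Consume a run of digits starting at i; return the next index."""
    while i < len(s) and s[i].isdigit():
        i += 1
    return i


def comma(s, i):
    """Consume a single comma at i; return the next index."""
    assert s[i] == ","
    return i + 1


def parse_csv(s, i=0):
    """Parse digits and commas, dispatching on the current character."""
    while i < len(s):
        if s[i].isdigit():
            i = num(s, i)
        else:
            i = comma(s, i)
    return i


if __name__ == "__main__":
    tracked = TrackedInput("12,")
    parse_csv(tracked)
    print(tracked.last_access)
```

For the input "12,", the characters '1' and '2' end up attached to `num` and ',' to `comma`, even though `parse_csv` and `num` also looked at some of them earlier. The buffer is the only thing tracked, in line with the first bullet above.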

  117. Parse tree for parse_expr('9+3/4')
    class Ex(Exception): pass  # parse error carrying the input and position

    def is_digit(i): return i in '0123456789'

    def parse_num(s,i):
        n = ''
        while s[i:] and is_digit(s[i]):
            n += s[i]
            i = i +1
        return i,n

    def parse_paren(s, i):
        assert s[i] == '('
        i, v = parse_expr(s, i+1)
        if s[i:] == '': raise Ex(s, i)
        assert s[i] == ')'
        return i+1, v

    def parse_expr(s, i = 0):
        expr, is_op = [], True
        while s[i:]:
            c = s[i]
            if is_digit(c):
                if not is_op: raise Ex(s,i)
                i,num = parse_num(s,i)
                expr.append(num)
                is_op = False
            elif c in ['+', '-', '*', '/']:
                if is_op: raise Ex(s,i)
                expr.append(c)
                is_op, i = True, i + 1
            elif c == '(':
                if not is_op: raise Ex(s,i)
                i, cexpr = parse_paren(s, i)
                expr.append(cexpr)
                is_op = False
            elif c == ')': break
            else: raise Ex(s,i)
        if is_op: raise Ex(s,i)
        return i, expr
    Input: 9+3/4

    View Slide

  119. 51
    def is_digit(i): return i in '0123456789'
    def parse_num(s,i):
        n = ''
        while s[i:] and is_digit(s[i]):
            n += s[i]
            i = i +1
        return i,n
    def parse_paren(s, i):
        assert s[i] == '('
        i, v = parse_expr(s, i+1)
        if s[i:] == '': raise Ex(s, i)
        assert s[i] == ')'
        return i+1, v
    def parse_expr(s, i = 0):
        expr, is_op = [], True
        while s[i:]:
            c = s[i]
            if is_digit(c):
                if not is_op: raise Ex(s,i)
                i,num = parse_num(s,i)
                expr.append(num)
                is_op = False
            elif c in ['+', '-', '*', '/']:
                if is_op: raise Ex(s,i)
                expr.append(c)
                is_op, i = True, i + 1
            elif c == '(':
                if not is_op: raise Ex(s,i)
                i, cexpr = parse_paren(s, i)
                expr.append(cexpr)
                is_op = False
            elif c == ')': break
            else: raise Ex(s,i)
        if is_op: raise Ex(s,i)
        return i, expr
    9+3/4
    Identifying Compatible Nodes
    Which nodes correspond to the same nonterminal?
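
To see what the parser on this slide returns, here is a minimal runnable sketch of the same code. The slide never defines `Ex`, so the exception class below is a hypothetical stand-in; everything else follows the slide, with `is_digit` used consistently.

```python
class Ex(Exception):
    """Hypothetical stand-in for the slide's undefined Ex(s, i) error."""
    def __init__(self, s, i):
        super().__init__(f"parse error at position {i} in {s!r}")

def is_digit(c):
    return c in '0123456789'

def parse_num(s, i):
    # Consume a run of digits starting at position i.
    n = ''
    while s[i:] and is_digit(s[i]):
        n += s[i]
        i += 1
    return i, n

def parse_paren(s, i):
    # Parse '(' expr ')' and return the nested expression list.
    assert s[i] == '('
    i, v = parse_expr(s, i + 1)
    if s[i:] == '':
        raise Ex(s, i)
    assert s[i] == ')'
    return i + 1, v

def parse_expr(s, i=0):
    # is_op is True at the start and right after an operator,
    # i.e. whenever an operand is expected next.
    expr, is_op = [], True
    while s[i:]:
        c = s[i]
        if is_digit(c):
            if not is_op:
                raise Ex(s, i)
            i, num = parse_num(s, i)
            expr.append(num)
            is_op = False
        elif c in ['+', '-', '*', '/']:
            if is_op:
                raise Ex(s, i)
            expr.append(c)
            is_op, i = True, i + 1
        elif c == '(':
            if not is_op:
                raise Ex(s, i)
            i, cexpr = parse_paren(s, i)
            expr.append(cexpr)
            is_op = False
        elif c == ')':
            break
        else:
            raise Ex(s, i)
    if is_op:
        raise Ex(s, i)
    return i, expr

print(parse_expr('9+3/4'))    # (5, ['9', '+', '3', '/', '4'])
print(parse_expr('3*(9+1)'))  # (7, ['3', '*', ['9', '+', '1']])
```

The nested list in the second result is the only structure the parser itself exposes; the parse tree on the slide is derived from the parser's control flow rather than from this return value.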

    View Slide

  121. 52
    3 * (9 + 1)

    View Slide

  123. 52
    (9 + 1) * 3
    3 * (9 + 1)

    View Slide

  126. 53
    3 * (9 + 1)

    View Slide

  128. 53
    9 + 1
    3 * (9 + 1)

    View Slide

  131. 54
    3 * (9 + 1)

    View Slide

  132. 54
    3 (9 + 1) *
    3 * (9 + 1)
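
The three variants shown on these slides can be read as a compatibility test: rearrange or substitute subtrees of `3 * (9 + 1)`, then ask the parser whether the result is still accepted. The sketch below illustrates that check under a loud assumption: Python's own expression parser (`ast.parse`) stands in for the program under test, because the slide inputs contain spaces that the calc.py parser shown earlier would reject. The helper name `accepts` is hypothetical.

```python
import ast

def accepts(text):
    """Oracle: is this input accepted by the parser under test?
    Assumption: Python's expression parser stands in for calc.py."""
    try:
        ast.parse(text, mode='eval')
        return True
    except SyntaxError:
        return False

original = '3 * (9 + 1)'
variants = [
    '(9 + 1) * 3',   # the two operands swapped
    '9 + 1',         # the whole expression replaced by the inner one
    '3 (9 + 1) *',   # the operator swapped with an operand
]

assert accepts(original)
for variant in variants:
    print(f'{variant!r}: accepted = {accepts(variant)}')
```

The first two variants are accepted and the third is rejected, which suggests that the operands and the parenthesized expression share a nonterminal while the operator does not.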

    View Slide

  135. 55
    3*(1)
    1

    View Slide

  136. 56
    3*(1)
    1

    View Slide

  137. 56
    3*(1)
    1
    :=

    View Slide

  138. 56
    3*(1)
    1
    :=

    :=

    View Slide

  139. :=
    |
    |
    |
    :=
    :=
    |
    :=
    := '3' | '1'
    := '(' ')'
    :=
    := '*'
    57

    View Slide

  140. :=
    :=
    |
    :=
    |
    |
    |
    :=
    :=
    |
    :=
    := '3' | '1'
    := '(' ')'
    :=
    := '*'
    57

    View Slide

  141. 58
    def is_digit(i): return i in '0123456789'
    def parse_num(s,i):
        n = ''
        while s[i:] and is_digit(s[i]):
            n += s[i]
            i = i +1
        return i,n
    def parse_paren(s, i):
        assert s[i] == '('
        i, v = parse_expr(s, i+1)
        if s[i:] == '': raise Ex(s, i)
        assert s[i] == ')'
        return i+1, v
    def parse_expr(s, i = 0):
        expr, is_op = [], True
        while s[i:]:
            c = s[i]
            if is_digit(c):
                if not is_op: raise Ex(s,i)
                i,num = parse_num(s,i)
                expr.append(num)
                is_op = False
            elif c in ['+', '-', '*', '/']:
                if is_op: raise Ex(s,i)
                expr.append(c)
                is_op, i = True, i + 1
            elif c == '(':
                if not is_op: raise Ex(s,i)
                i, cexpr = parse_paren(s, i)
                expr.append(cexpr)
                is_op = False
            elif c == ')': break
            else: raise Ex(s,i)
        if is_op: raise Ex(s,i)
        return i, expr
    :=
    :=
    |
    :=
    |
    := '(' ')'
    |
    := '*' | '+' | '-' | '/'
    :=
    |
    : [0-9]
    calc.py Recovered Arithmetic Grammar

    View Slide

  142. 59
    :=
    :=
    |
    :=
    |
    := '(' ')'
    |
    := '*' | '+' | '-' | '/'
    :=
    |
    : [0-9]

    View Slide

  143. 59
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-
    ((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4)
    )))) - ++4) / +(-+---((5.6 - --(3 *
    -1.8 * +(6 * +-(((-(-6) * ---+6)) /
    +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3
    - ++9.0 + ---(--+7 / (1 / +++6.37)
    + (1) / 482) / +++-+0)))) + 8.2 - 27
    - -9 / +((+9 * --2 + --+-+-((-1 * +
    (8 - 5 - 6)) * (-(a-+(((+(4))))) - +
    +4) / +(-+---((5.6 - --(3 * -1.8 * +
    (6 * +-(((-(-6) * ---+6)) / +--(+-+-
    7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0
    + ---(--+7 / (1 / +++6.37) + (1) /
    482) / +++-+0)))) * -+5 + 7.513))))
    - (+1 / ++((-84)))))))) * ++5 / +-(-
    -2 - -++-9.0)))) / 5 * --++090 + * -
    +5 + 7.513)))) - (+1 / ++((-84))))))
    )) * 8.2 - 27 - -9 / +((+9 * --2 + -
    -+-+-((-1 * +(8 - 5 - 6)) * (-(a-+((
    (+(4))))) - ++4) / +(-+---((5.6 - --
    (3 * -1.8 * +(6 * +-(((-(-6) * ---+6
    )) / +--(+-+-7 * (-0 * (+(((((2)) +
    8 - 3 - ++9.0 + ---(--+7 / (1 / +++6
    .37) + (1) / 482) / +++-+0)))) * -+5
    + 7.513)))) - (+1 / ++((-84))))))))
    * ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090 ++5 / +-(--2 - -++-9.0)))) /
    5 * --++090
    :=
    :=
    |
    :=
    |
    := '(' ')'
    |
    := '*' | '+' | '-' | '/'
    :=
    |
    : [0-9]

    View Slide

  145. 61
    ::=
    ::= '"'
    | '['
    | '{'
    |
    | 'true'
    | 'false'
    | 'null'
    ::= +
    | + 'e' +
    ::= '+' | '-' | '.' | [0-9] | 'E' | 'e'
    ::= * '"'
    ::= ']'
    | (',')* ']'
    | ( ',' )+ (',' )* ']'
    ::= '}'
    | ( '"' ':' ',' )*
    '"' ':' '}'
    ::= ' ' | '!' | '#' | '$' | '%' | '&' | '''
    | '*' | '+' | '-' | ',' | '.' | '/' | ':' | ';'
    | '<' | '=' | '>' | '?' | '@' | '[' | ']' | '^'
    | '_', ''',| '{' | '|' | '}' | '~'
    | '[A-Za-z0-9]'
    | '\'
    ::= '"' | '/' | 'b' | 'f' | 'n' | 'r' | 't'
                stm.next()
                if expect_key:
                    raise JSONError(E_DKEY, stm, stm.pos)
                if c == '}':
                    return result
                expect_key = 1
                continue
            # parse out a key/value pair
            elif c == '"':
                key = _from_json_string(stm)
                stm.skipspaces()
                c = stm.next()
                if c != ':':
                    raise JSONError(E_COLON, stm, stm.pos)
                stm.skipspaces()
                val = _from_json_raw(stm)
                result[key] = val
                expect_key = 0
                continue
            raise JSONError(E_MALF, stm, stm.pos)
    def _from_json_raw(stm):
        while True:
            stm.skipspaces()
            c = stm.peek()
            if c == '"':
                return _from_json_string(stm)
            elif c == '{':
                return _from_json_dict(stm)
            elif c == '[':
                return _from_json_list(stm)
            elif c == 't':
                return _from_json_fixed(stm, 'true', True, E_BOOL)
            elif c == 'f':
                return _from_json_fixed(stm, 'false', False, E_BOOL)
            elif c == 'n':
                return _from_json_fixed(stm, 'null', None, E_NULL)
            elif c in NUMSTART:
                return _from_json_number(stm)
            raise JSONError(E_MALF, stm, stm.pos)
    def from_json(data):
        stm = JSONStream(data)
        return _from_json_raw(stm)
    microjson.py Recovered JSON grammar
    Mimid
    Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. ESEC/FSE 2020.
    • JavaScript
    • C
    • Lisp
    • JSON
    • URL
    • CGI

    View Slide

  146. Actual Specification

    View Slide

  147. Dynamic Approximation
    (Mimid)
    Actual Specification

    View Slide

  148. Dynamic Approximation
    (Mimid)
    Static Approximation
    (Static Mimid)
    Actual Specification

    View Slide

  149. Dynamic Approximation
    (Mimid)
    Static Approximation
    (Static Mimid)
    Actual Specification
    IoT & Embedded

    View Slide

  150. 63

    View Slide

  151. Generating Unbiased Samples

    View Slide

  152. Finding Good Samples
    Seed corpus?

    View Slide

  153. Finding Good Samples
    Seed corpus?
    (Blind spots)

    View Slide

  154. Key Idea
    • Differentiate between incomplete and incorrect inputs
    • Solve one input symbol at a time, systematically

    View Slide

  155. 67
    Sample Free Generators

    View Slide

  158. 67
    Sample Free Generators
    A

    View Slide

  159. 67
    Sample Free Generators
    A
    A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0

    View Slide

  160. 67
    Sample Free Generators
    A
    (
    A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0

    View Slide

  161. 67
    Sample Free Generators
    A
    ( 2
    A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0

    View Slide

  162. 67
    Sample Free Generators
    A
    ( 2
    -
    B
    9
    )
    4 )
    A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0
    B ∉ +,-,1,2,3,4,5,6,7,8,9,0,)
    ) ∉ +,-,1,2,3,4,5,6,7,8,9,0

    View Slide

  164. 67
    Sample Free Generators
    A
    ( 2
    -
    B
    9
    )
    4 )
    A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0
    B ∉ +,-,1,2,3,4,5,6,7,8,9,0,)
    ) ∉ +,-,1,2,3,4,5,6,7,8,9,0
    (2-94)
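
The walk-through above (reject `A`, accept `(`, extend to `(2`, and so on until `(2-94)`) can be sketched as a black-box loop. This is a minimal illustration, not the pFuzzer implementation: it assumes a validator that reports one of three outcomes, and the toy language (digits, binary `+` and `-`, parentheses) and the names `classify` and `generate` are hypothetical.

```python
import random

COMPLETE, INCOMPLETE, INCORRECT = 'complete', 'incomplete', 'incorrect'
DIGITS = '0123456789'
ALPHABET = DIGITS + '+-()' + 'AB'   # 'A' and 'B' are never valid

def classify(s):
    """Toy validator: is s a complete input, a viable prefix, or wrong?"""
    depth, prev = 0, 'start'   # prev is one of: start, num, op, open, close
    for c in s:
        if c in DIGITS:
            if prev == 'close':
                return INCORRECT
            prev = 'num'
        elif c in '+-':
            if prev not in ('num', 'close'):
                return INCORRECT
            prev = 'op'
        elif c == '(':
            if prev in ('num', 'close'):
                return INCORRECT
            depth, prev = depth + 1, 'open'
        elif c == ')':
            if depth == 0 or prev not in ('num', 'close'):
                return INCORRECT
            depth, prev = depth - 1, 'close'
        else:
            return INCORRECT
    if depth == 0 and prev in ('num', 'close'):
        return COMPLETE
    return INCOMPLETE

def generate(rng, max_len=12):
    """Grow an input one symbol at a time, keeping only viable prefixes."""
    prefix = ''
    while True:
        if classify(prefix) == COMPLETE and rng.random() < 0.3:
            return prefix
        if len(prefix) >= max_len:
            prefix = ''          # give up on this attempt and start over
            continue
        # Solve the next symbol: keep every character that is not rejected.
        viable = [c for c in ALPHABET if classify(prefix + c) != INCORRECT]
        prefix += rng.choice(viable)

print(classify('A'), classify('(2-9'), classify('(2-94)'))
print(generate(random.Random(0)))
```

No seed inputs are needed: the only feedback used is whether a prefix is wrong (backtrack) or merely unfinished (keep extending).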

    View Slide

  165. Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser-Directed Fuzzing. PLDI 2019.
    Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
    Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSRE 2021 (submitted).
    Sample Free Generators
    A
    ( 2
    -
    B
    9
    )
    4 )

    View Slide

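  169. <start> := <expr>
    <expr> := <expr> '+' <expr>
    | <expr> '-' <expr>
    | <expr> '/' <expr>
    | <expr> '*' <expr>
    | '(' <expr> ')'
    | <number>
    <number> := <integer>
    | <integer> '.' <integer>
    <integer> := <digit> <integer>
    | <digit>
    <digit> := [0-9]
    Fast Fuzzing with Grammars
    fuzz(expr_grammar, '<start>')
    import random
    def gen_key(grammar, key):
        # A tree node is a (key, children) pair; terminals have no children.
        if is_terminal_symbol(key):
            return (key, [])
        else:
            rule = random.choice(grammar[key])
            return (key, gen_rule(grammar, rule))
    def fuzz(grammar, key):
        return collapse(gen_key(grammar, key))
    def gen_rule(grammar, rule):
        return [gen_key(grammar, token) for token in rule]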

    View Slide

  173. Fast Fuzzing with Grammars
    fuzz(expr_grammar, '<start>')
    [Figure: derivation tree with leaves 1, +, 8, collapsed to the string "1 + 8"]
    import random
    def collapse(tree):
        key, children = tree
        if not children: return key
        return ''.join([collapse(c) for c in children])
    def gen_key(grammar, key):
        # A tree node is a (key, children) pair; terminals have no children.
        if is_terminal_symbol(key):
            return (key, [])
        else:
            rule = random.choice(grammar[key])
            return (key, gen_rule(grammar, rule))
    def fuzz(grammar, key):
        return collapse(gen_key(grammar, key))
    def gen_rule(grammar, rule):
        return [gen_key(grammar, token) for token in rule]
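
For a version of this fuzzer that can be run directly, the sketch below encodes the expression grammar as a Python dictionary and adds a depth bound. The bound is an assumption that is not on the slide: four of the six `<expr>` rules expand into two further `<expr>` nodes, so unbounded random expansion tends not to terminate. The dictionary encoding, the extra `grammar` argument to `is_terminal_symbol`, and the cost helpers are likewise hypothetical.

```python
import random

EXPR_GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<expr>', '+', '<expr>'], ['<expr>', '-', '<expr>'],
               ['<expr>', '/', '<expr>'], ['<expr>', '*', '<expr>'],
               ['(', '<expr>', ')'], ['<number>']],
    '<number>': [['<integer>'], ['<integer>', '.', '<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[d] for d in '0123456789'],
}

def is_terminal_symbol(grammar, key):
    return key not in grammar

def key_cost(grammar, key, seen=frozenset()):
    """Minimum derivation depth of key; infinite if it can only recurse."""
    if is_terminal_symbol(grammar, key):
        return 0
    if key in seen:
        return float('inf')
    return min(rule_cost(grammar, rule, seen | {key}) for rule in grammar[key])

def rule_cost(grammar, rule, seen):
    return 1 + max((key_cost(grammar, token, seen) for token in rule), default=0)

def gen_key(grammar, key, rng, depth, max_depth):
    if is_terminal_symbol(grammar, key):
        return (key, [])
    rules = grammar[key]
    if depth >= max_depth:
        # Past the bound, only the cheapest rules remain eligible.
        costs = [rule_cost(grammar, rule, frozenset({key})) for rule in rules]
        rules = [rule for rule, cost in zip(rules, costs) if cost == min(costs)]
    rule = rng.choice(rules)
    return (key, [gen_key(grammar, token, rng, depth + 1, max_depth)
                  for token in rule])

def collapse(tree):
    # Assumes the grammar has no empty rules, so childless nodes are terminals.
    key, children = tree
    if not children:
        return key
    return ''.join(collapse(child) for child in children)

def fuzz(grammar, key, rng, max_depth=8):
    return collapse(gen_key(grammar, key, rng, 0, max_depth))

print(fuzz(EXPR_GRAMMAR, '<start>', random.Random(0)))
```

Passing an explicit `random.Random` instance makes each run reproducible from its seed.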

    View Slide

  177. <start> := <expr>
    <expr> := <expr> '+' <expr>
    | <expr> '-' <expr>
    | <expr> '/' <expr>
    | <expr> '*' <expr>
    | '(' <expr> ')'
    | <number>
    <number> := <integer>
    | <integer> '.' <integer>
    <integer> := <digit> <integer>
    | <digit>
    <digit> := [0-9]
    Compiling the Grammar
    def start():
        expr()
    def expr():
        match (random() % 6):
            case 0: expr(); print('+'); expr()
            case 1: expr(); print('-'); expr()
            case 2: expr(); print('/'); expr()
            case 3: expr(); print('*'); expr()
            case 4: print('('); expr(); print(')')
            case 5: number()
    def number():
        match (random() % 2):
            case 0: integer()
            case 1: integer(); print('.'); integer()
    def integer():
        match (random() % 2):
            case 0: digit(); integer()
            case 1: digit()
    def digit():
        match (random() % 10):
            case 0: print('0')
            case 1: print('1')
            case 2: print('2')
            case 3: print('3')
            case 4: print('4')
            case 5: print('5')
            case 6: print('6')
            case 7: print('7')
    def start(rops):
        expr(rops)
    def expr(rops):
        match (rops.next % 6):
            case 0: expr(rops); print('+'); e
            case 1: expr(rops); print('-'); e
            case 2: expr(rops); print('/'); e
            case 3: expr(rops); print('*'); e
            case 4: print('('); expr(rops); p
            case 5: number(rops)
    def number(rops):
        match (rops.next % 2):
            case 0: integer(rops)
            case 1: integer(rops); print('.')
    def integer(rops):
        match (rops.next % 2):
            case 0: digit(rops); integer(rops
            case 1: digit(rops)
    def digit(rops):
        match (rops.next % 10):
            case 0: print('0')
            case 1: print('1')
            case 2: print('2')
            case 3: print('3')
            case 4: print('4')
            case 5: print('5')
            case 6: print('6')
            case 7: print('7')
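
The compilation step itself can be sketched in a few lines: walk the grammar once and emit one Python function per nonterminal, so that generation no longer interprets the grammar at run time. This is an illustrative sketch, not the F1 compiler. The dictionary encoding, the `gen_` naming scheme, the `out` buffer, and the depth bound (needed for termination) are all assumptions, and `if`/`elif` chains stand in for the slide's `match` pseudocode.

```python
import random

GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<expr>', '+', '<expr>'], ['<expr>', '-', '<expr>'],
               ['<expr>', '/', '<expr>'], ['<expr>', '*', '<expr>'],
               ['(', '<expr>', ')'], ['<number>']],
    '<number>': [['<integer>'], ['<integer>', '.', '<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[d] for d in '0123456789'],
}

def fn_name(key):
    return 'gen_' + key.strip('<>')

def min_depth(grammar, key, seen=()):
    """Smallest derivation depth of key; infinite if it can only recurse."""
    if key not in grammar:
        return 0
    if key in seen:
        return float('inf')
    return 1 + min(rule_depth(grammar, rule, seen + (key,))
                   for rule in grammar[key])

def rule_depth(grammar, rule, seen):
    return max((min_depth(grammar, token, seen) for token in rule), default=0)

def compile_grammar(grammar, max_depth=6):
    """Emit Python source with one generator function per nonterminal."""
    lines = []
    for key, rules in grammar.items():
        depths = [rule_depth(grammar, rule, (key,)) for rule in rules]
        shortest = depths.index(min(depths))   # fallback rule past the bound
        lines.append(f'def {fn_name(key)}(rng, out, depth=0):')
        lines.append(f'    choice = rng.randrange({len(rules)}) '
                     f'if depth < {max_depth} else {shortest}')
        for i, rule in enumerate(rules):
            lines.append(f'    {"if" if i == 0 else "elif"} choice == {i}:')
            for token in rule:
                if token in grammar:
                    lines.append(f'        {fn_name(token)}(rng, out, depth + 1)')
                else:
                    lines.append(f'        out.append({token!r})')
            if not rule:
                lines.append('        pass')
    return '\n'.join(lines) + '\n'

source = compile_grammar(GRAMMAR)
namespace = {}
exec(source, namespace)            # defines gen_start, gen_expr, ...
out = []
namespace['gen_start'](random.Random(0), out)
print(''.join(out))
```

Printing `source` shows functions with the same shape as the hand-written ones on the slide, with the random source passed in explicitly as in the `rops` variant.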

    View Slide

  179. Grammar Fuzzer
    grammar = """

    <start> := <expr>
    <expr> := <expr> '+' <expr>
    | <expr> '-' <expr>
    | <expr> '/' <expr>
    | <expr> '*' <expr>
    | '(' <expr> ')'
    | <number>
    <number> := <integer>
    | <integer> '.' <integer>
    <integer> := <digit> <integer>
    | <digit>
    <digit> := [0-9]

    """
    generate(grammar)
    Fast Fuzzers
    def start():
        expr()
    def expr():
        match (random() % 6):
            case 0: expr(); print('+'); expr()
            case 1: expr(); print('-'); expr()
            case 2: expr(); print('/'); expr()
            case 3: expr(); print('*'); expr()
            case 4: print('('); expr(); print(')')
            case 5: number()
    def number():
        match (random() % 2):
            case 0: integer()
            case 1: integer(); print('.'); integer()
    def integer():
        match (random() % 2):
            case 0: digit(); integer()
            case 1: digit()
    def digit():
        match (random() % 10):
            case 0: print('0')
            case 1: print('1')
            case 2: print('2')
            case 3: print('3')
            case 4: print('4')
            case 5: print('5')
            case 6: print('6')
            case 7: print('7')
    Compiled Grammar (F1)
    Gopinath and Zeller. Building Fast Fuzzers. 2019.
    def start_0(rops):
        r = next(rops)
        if 0 <= r < 43: expr_0()
        elif 43 <= r < 85: expr_1()
        elif 85 <= r < 128: expr_2()
        elif 128 <= r < 171: expr_3()
        elif 171 <= r < 213: expr_4()
        else: expr_5()
    def expr_0(rops):
        r = next(rops)
        if 0 <= r < 43: expr_0()
        elif 43 <= r < 85: expr_1()
        elif 85 <= r < 128: expr_2()
        elif 128 <= r < 171: expr_3()
        elif 171 <= r < 213: expr_4()
        else: expr_5()
        print('+')
        r = next(rops)
        if 0 <= r < 43: expr_0()
        elif 43 <= r < 85: expr_1()
        elif 85 <= r < 128: expr_2()
        elif 128 <= r < 171: expr_3()
        elif 171 <= r < 213: expr_4()
        else: expr_5()
    Grammar VM (F1)

    View Slide

  180. 73
    The Fuzzing Pipeline
    Program Under Test

    View Slide

  181. 73
    The Fuzzing Pipeline
    Program Under Test
    pFuzzer
    Active
    Guidance
    F1 Grammar Fuzzer
    Grammar
    Inputs
    Grammar Miner
    Samples
    Active
    Learning

    View Slide

  182. 74
    The Fuzzing Synergy
    Mimid Grammar Miner
    Parser Directed pFuzzer
    F1 Fuzzer VM
    Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser-Directed Fuzzing. PLDI 2019.
    Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
    Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSTA 2021 (submitted).
    Gopinath and Zeller. Building Fast Fuzzers. 2019 (unpublished).
    Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. ESEC/FSE 2020.

    View Slide

  185. View Slide

  190. 76
    Challenge: Multilevel Envelopes
    POST /InStock HTTP/1.1
    Host: www.stock.org
    Content-Type: application/soap+xml; charset=utf-8
    Content-Length: 312

    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://w3.org/2001/12/soap-encoding">


    IBM



    HTTP POST
    XML PAYLOAD
    SOAP
    RPC Call
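
One way to read this challenge: each layer imposes constraints on the layers inside it. The `Content-Length` header, for instance, must equal the byte length of the XML payload, which a context-free grammar for HTTP alone cannot express. The sketch below shows only that single constraint; the function name and the placeholder payload are hypothetical and not taken from the slide.

```python
def wrap_http_post(path, host, body):
    """Wrap an inner payload in an HTTP POST envelope.
    Content-Length is computed from the payload, not chosen freely."""
    payload = body.encode('utf-8')
    head = [
        f'POST {path} HTTP/1.1',
        f'Host: {host}',
        'Content-Type: application/soap+xml; charset=utf-8',
        f'Content-Length: {len(payload)}',
    ]
    return '\r\n'.join(head).encode('ascii') + b'\r\n\r\n' + payload

# Hypothetical inner payload; a fuzzer would generate this from an XML grammar.
inner = '<name>IBM</name>'
request = wrap_http_post('/InStock', 'www.stock.org', inner)
print(request.decode('utf-8'))
```

A fuzzer for such inputs has to generate the innermost layer first and derive the outer fields from it, or repair them after mutation.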

    View Slide

  191. 77
    Future Challenge: Multilevel Envelopes

    View Slide

  193. 77
    Future Challenge: Multilevel Envelopes
    Mimid

    View Slide

  194. 78
    Challenge: Semantic Envelopes
    example.c
    #include
    int main() {
        int number1, number2, number3;
        number1 = 10;
        number2 = 20;
        number3 = sum(number1, number2);
        if (number3 > 100) return 0;
        return 1;
    }
    $ cc example.c -o example
    1. Syntactically correct
    2. Variables declared before use
    3. Use correct types
    4. Statically conforming
    5. Dynamically conforming
    6. Model conforming
    (McKeeman 1998)
    (parse)
    (compile)
    (link)
    (run)
    (synthesis)

    View Slide

  197. 80

    View Slide

  198. 81
    Fuzzing
    Inputs
    Program
    Behavior

    View Slide

  201. 82

    View Slide

  202. 83
    We Found A Crash

    View Slide

  204. Why Did My Program Crash?

    View Slide

  209. Why Did My Program Crash?
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.
    6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-
    -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(
    --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) +
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6
    - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--
    (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(-
    -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -
    +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-
    (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513)
    ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +((
    +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+
    (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 *
    +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (
    +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.
    37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) -
    (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))
    ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090
    DD Minimized Input
    ((4))
    00000 ?
    ((5)) ?
    (++5) ?

    View Slide
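The "DD Minimized Input" above is the result of delta debugging. A minimal sketch of the idea follows. It is the complement-removal variant of ddmin, not the implementation used in the talk, and `fails` is a hypothetical string-level stand-in for the crashing program (here: any doubly parenthesized chunk triggers the failure).

```python
import re

def fails(s):
    # Hypothetical stand-in for the program under test: the "crash" is
    # triggered by any doubly parenthesized chunk such as ((4)).
    return re.search(r"\(\([^()]+\)\)", s) is not None

def ddmin(inp, fails):
    """Shrink `inp` while `fails(inp)` stays true, by removing chunks."""
    assert fails(inp)
    n = 2  # current number of chunks
    while len(inp) >= 2:
        chunk = max(len(inp) // n, 1)
        reduced = False
        for start in range(0, len(inp), chunk):
            # Try the input with one chunk removed.
            candidate = inp[:start] + inp[start + chunk:]
            if fails(candidate):
                inp = candidate
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(inp):
                break  # single characters tried, nothing can be removed
            n = min(n * 2, len(inp))  # refine granularity
    return inp

print(ddmin("8.2 - 27 - ((4)) * 3", fails))
```

The result is minimal only in the sense that no single character can be removed. It does not say which parts of the input matter, which is the question the following slides raise with `00000 ?`, `((5)) ?` and `(++5) ?`.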

  214. Delta Minimization is useful but not sufficient
    Issue 2937 from Closure
    var A = class extends (class {}){};
    Issue 386 from Rhino
    const [y,y] = [];
    Issue 385 from Rhino
    var {baz:{} = baz => {}} = baz => {};
    Issue 2842 from Closure
    {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}

    View Slide

  215. ( ( 4 ) )

    View Slide

  216. ( ( 4 ) )
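    <start> := <expr>
    <expr> := <term> ' + ' <expr>
      | <term> ' - ' <expr>
      | <term>
    <term> := <factor> ' * ' <term>
      | <factor> ' / ' <term>
      | <factor>
    <factor> := '+' <factor>
      | '-' <factor>
      | '(' <expr> ')'
      | <integer> '.' <integer>
      | <integer>
    <integer> := <digit> <integer>
      | <digit>
    <digit> := [0-9]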

    View Slide
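To make the grammar on this slide concrete, here is a minimal sketch of it as a Python dictionary together with a naive random producer. The nonterminal names follow the usual arithmetic expression grammar and are an assumption, since the transcript lost them. The depth cut-off also relies on an assumption: that each rule lists its simplest alternative last, which holds for this grammar.

```python
import random

GRAMMAR = {
    "<start>": [["<expr>"]],
    "<expr>": [["<term>", " + ", "<expr>"], ["<term>", " - ", "<expr>"], ["<term>"]],
    "<term>": [["<factor>", " * ", "<term>"], ["<factor>", " / ", "<term>"], ["<factor>"]],
    "<factor>": [["+", "<factor>"], ["-", "<factor>"], ["(", "<expr>", ")"],
                 ["<integer>", ".", "<integer>"], ["<integer>"]],
    "<integer>": [["<digit>", "<integer>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}

def generate(symbol="<start>", depth=0, max_depth=8, rng=random):
    """Expand `symbol` into a random string of the grammar."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit as is
    rules = GRAMMAR[symbol]
    # Past the depth limit, always take the last (simplest) alternative
    # so that the expansion is guaranteed to terminate.
    rule = rules[-1] if depth >= max_depth else rng.choice(rules)
    return "".join(generate(s, depth + 1, max_depth, rng) for s in rule)

if __name__ == "__main__":
    for _ in range(5):
        print(generate())
```

Replacing one subtree of `( ( 4 ) )` with a fresh expansion of the same nonterminal is what produces candidates such as `1 * (2 - 3)` on the following slides.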

  220. ( ( 4 ) )
    1 * (2 - 3)
    ✓ Did not reproduce the failure

    View Slide

  224. ( ( 4 ) )
    1 + 3 + 4
    ✓ Did not reproduce the failure

    View Slide

  227. 3 * 4
    ✓ Did not reproduce the failure

    View Slide

  230. ( ( 1 - 2 ) )
    ✘ Reproduced the failure

    View Slide

  238. ( ( 1 - 2 ) )
    ( ( 1 - 2 ) )
    ( ( 2 * 3 + 4 ) )
    ( ( - 2 / 1 ) )
    ( ( 98 - 0 ) )

    View Slide

  239. ( ( 4 ) )
    [Figure: derivation tree of ( ( 4 ) )]

    View Slide

  241. Minimized Input
    ( ( 4 ) )
    Abstract Failure Inducing Input
    ( ( <expr> ) )
    def check(parsed):
        if parsed.is_nested() and parsed.child.is_nested():
            raise Exception()
        return parsed

    View Slide

  243. var A = class extends (class {}){};
    Issue 2937 from Closure
    = class extends (class {}){}

    View Slide

  245. var {baz:{} = baz => {}} = baz => {};
    Issue 385 from Rhino
    var {<$Id1>:{} = <$Id1> => {}} ;

    View Slide

  247. const [y,y] = [];
    Issue 386 from Rhino
    const [<$Id1>,<$Id1>] = []

    View Slide

  249. {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
    Issue 2842 from Closure
    {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}

    View Slide

  250. DDSET
    Minimized Input
    ( ( 4 ) )
    Abstract Failure Inducing Input
    ( ( <expr> ) )
    • Effectively abstracts a minimized input
    • The abstraction identifies where the problem lies
    • Decomposes complex program behaviors
    def check(parsed):
        if parsed.is_nested() and parsed.child.is_nested():
            raise Exception()
        return parsed
    Gopinath, Kampmann, Havrikov, Soremekun, and Zeller. Abstracting Failure Inducing Inputs. ISSTA 2020.
    ISSTA 2020 Distinguished Award

    View Slide
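The abstraction step can be sketched as follows. This is a simplified illustration, not the DDSET implementation: instead of generating random replacements from the grammar, it uses a small fixed pool of hypothetical sample strings per nonterminal (`SAMPLES`) so the run is deterministic. The predicate `doubly_nested` is an assumed stand-in for the failing program, and the tree for `((4))` is built by hand. A node is reported as abstract when every replacement tried at that position still reproduces the failure.

```python
def doubly_nested(s):
    """Stand-in failure predicate: True if s contains ((...)) as one unit."""
    stack, match = [], {}
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")" and stack:
            match[stack.pop()] = i
    # Two adjacent opening parens whose closing parens are adjacent too.
    return any(i + 1 in match and match[i + 1] + 1 == match[i] for i in match)

# Hypothetical replacement strings for each nonterminal.
SAMPLES = {
    "<expr>": ["1", "2 + 3", "(5) * 6"],
    "<term>": ["7", "8 * 9"],
    "<factor>": ["3", "-4", "(5)"],
    "<integer>": ["0", "12"],
    "<digit>": ["7"],
}

def to_string(node):
    sym, children = node
    return sym if not children else "".join(to_string(c) for c in children)

def abstract(node, rebuild, fails):
    """Return the input with every always-failing subtree replaced by its nonterminal."""
    sym, children = node
    if not children:
        return sym  # terminal
    samples = SAMPLES.get(sym, [])
    if samples and all(fails(rebuild(s)) for s in samples):
        return sym  # failure persists for every replacement: abstract this node
    parts = [to_string(c) for c in children]
    out = []
    for i, child in enumerate(children):
        def rebuild_child(s, i=i):
            # Full input with child i replaced by s.
            return rebuild("".join(parts[:i]) + s + "".join(parts[i + 1:]))
        out.append(abstract(child, rebuild_child, fails))
    return "".join(out)

def wrap(inner):
    # <expr> -> <term> -> <factor> -> '(' <expr> ')'
    return ("<expr>", [("<term>", [("<factor>", [("(", []), inner, (")", [])])])])

leaf = ("<expr>", [("<term>", [("<factor>", [("<integer>", [("<digit>", [("4", [])])])])])])
tree = ("<start>", [wrap(wrap(leaf))])

print(abstract(tree, lambda s: s, doubly_nested))
```

A fixed sample pool can wrongly mark a node as abstract if the samples happen to be unrepresentative. Using many random expansions, as on the preceding slides, reduces that risk.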

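  251. Specialized Grammar
    <expr E> is ((<expr>))
    <start E> := <expr E>
    <expr E> := <term E> ' + ' <expr>
      | <term> ' + ' <expr E>
      | <term E> ' - ' <expr>
      | <term> ' - ' <expr E>
      | <term E>
    <term E> := <factor E> ' * ' <term>
      | <factor> ' * ' <term E>
      | <factor E> ' / ' <term>
      | <factor> ' / ' <term E>
      | <factor E>
    <factor E> := '+' <factor E>
      | '-' <factor E>
      | '(' <expr E> ')'
      | '(' <expr E1> ')'
    <expr E1> := <term E2>
    <term E2> := <factor E3>
    <factor E3> := '(' <expr> ')'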

    View Slide

  252. Specialized Grammar
    <expr E> is ((<expr>))
    ((1)) + 2
    (23 * ((3)) - 34)
    (344- 4 + ((223)))
    (1) - 3 * 773 + (-22 + 1)
    1798 - 889 / ((333-1)) * 2 / 3 + 1
    34 + ((4)) -334 + (334 - (22) + 919 * 0 + 1
    98435747+ 88 + (((0))) + (1) - 1 * 7 / 4 * 889 - 2
    8 + ((8)) + --1 + 11223 / 344 - 39 + (1) - 456 + 134 / 45
    437 + 8 - 1 * ((9 + ((1))) - 1 + 99111948 + 3 --1 + (112) - 2 + 445) + 0
    74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * 223 - 1233 + 334672
    2 * ((9)) - (1798 - 889 / (333-1) * 2 / 3 + 100012 + 3434392 + 234 ----6 * 1798 - 889 / (33
    778 - (((1) - 3 * 773 + (-22 + 1) * (4545) - 23 - ((2)) * 773 + (-22 + 1) / 3434 + ---1 + 1 / 34343 + 112
    349 + (((1) - 3 * 3 + (-22 + 1) ((+ (-22 + 1) * (4545) - 23 - (2) * 773 + ((-22 + 1)) / 3434 + ---1 + 1 / 34343 + 1123
    8 + ((8)) + --1 + / 1 - 39 + (1) - 456 + 134 / 45 ))(((1) - 2334 + ((((1)) - 3 * 773 + (-22 + 1) * (2) - 23 - (2) * 773 + (-22 + 1) / 3
    74 + 3 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 45 ))(((1) - 3 * 773
    1+ 33+ 24343433 +23343 - ((74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 4

    View Slide
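A specialized grammar of this kind can be sketched as extra rules layered on the base expression grammar. The nonterminal names (`<expr E>`, `<expr E1>`, and so on) are assumptions made for illustration, since the transcript lost the originals, and the producer is a naive one with a depth cut-off. Every rule marked `E` forces at least one child onto a path that ends in `'(' '(' <expr> ')' ')'`, so every generated string contains the failure-inducing pattern.

```python
import random

BASE = {
    "<expr>": [["<term>", " + ", "<expr>"], ["<term>", " - ", "<expr>"], ["<term>"]],
    "<term>": [["<factor>", " * ", "<term>"], ["<factor>", " / ", "<term>"], ["<factor>"]],
    "<factor>": [["+", "<factor>"], ["-", "<factor>"], ["(", "<expr>", ")"],
                 ["<integer>", ".", "<integer>"], ["<integer>"]],
    "<integer>": [["<digit>", "<integer>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}

# Hypothetical names for the specialized nonterminals.
SPECIALIZED = dict(BASE)
SPECIALIZED.update({
    "<start E>": [["<expr E>"]],
    "<expr E>": [["<term E>", " + ", "<expr>"], ["<term>", " + ", "<expr E>"],
                 ["<term E>", " - ", "<expr>"], ["<term>", " - ", "<expr E>"],
                 ["<term E>"]],
    "<term E>": [["<factor E>", " * ", "<term>"], ["<factor>", " * ", "<term E>"],
                 ["<factor E>", " / ", "<term>"], ["<factor>", " / ", "<term E>"],
                 ["<factor E>"]],
    "<factor E>": [["+", "<factor E>"], ["-", "<factor E>"],
                   ["(", "<expr E>", ")"], ["(", "<expr E1>", ")"]],
    "<expr E1>": [["<term E2>"]],
    "<term E2>": [["<factor E3>"]],
    "<factor E3>": [["(", "<expr>", ")"]],
})

def generate(grammar, symbol, depth=0, max_depth=10, rng=random):
    """Random expansion; past max_depth the last alternative is forced."""
    if symbol not in grammar:
        return symbol
    rules = grammar[symbol]
    rule = rules[-1] if depth >= max_depth else rng.choice(rules)
    return "".join(generate(grammar, s, depth + 1, max_depth, rng) for s in rule)

def doubly_nested(s):
    """True if s contains a ((...)) pair whose parentheses match up."""
    stack, match = [], {}
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")" and stack:
            match[stack.pop()] = i
    return any(i + 1 in match and match[i + 1] + 1 == match[i] for i in match)

if __name__ == "__main__":
    for _ in range(5):
        print(generate(SPECIALIZED, "<start E>"))
```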

  253. Algebra of Grammar Specializations

    View Slide

  254. Algebra of Grammar Specializations
    where
    is = class extends (class {}){} (Closure 2937)
    is var {<$Id2>:{} = <$Id2> => {}} ; (Rhino 385)
    is const [<$Id3>,<$Id3>] = [] (Rhino 386)
    is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}} (Closure 2842)

    View Slide


  259. Algebra of Grammar Specializations
    where
    is = class extends (class {}){}
    is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}
    is var {<$Id2>:{} = <$Id2> => {}} ;
    is const [<$Id3>,<$Id3>] = []
    Mechanized proofs are available
    Gopinath, Nemati, Zeller. Input Algebras. ICSE 2021.

    View Slide

  265. Science of Focused Fuzzing

    where is (())
    is / 0

    View Slide


  269. where is (())
    is / 0
    Isolating & Decomposing Program Behaviors
    Algebra of Program Behaviors
    Science of Program Behaviors

    View Slide

  273. Challenge: Identify Behavior Divergence
    Input -> Behavior
    insert into tbl values (1,2,3) -> update($file)
    select b from tbl -> read($file)
    drop table tbl -> rm($file)
    Program
    action='read'
    $action('tbl')
    assert invoked read: 'tbl.data' ✔
    action='rm'

    View Slide

  275. Challenge: Identify Behavior Divergence
    def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene

    View Slide

  280. Challenge: Identify Behavior Divergence
    LockB()
    LockA()
    DoAB()
    UnlockB()
    UnlockA()
    LockB()
    LockA()
    DoAB()
    UnlockA()
    UnlockB()
    UnlockA()
    LockA()
    DoAB()
    LockB()
    UnlockB()

    View Slide

  281. def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene

    View Slide

  288. 123
    :=
    := (a==b)
    | (a!=b)
    := (b==c)
    | (b!=c)
    := "return Equilateral"
    := "return Isosceles"
    := (b==c)
    | (b!=c)
    := "return Isosceles"
    := (a==c)
    | (a!=c)
    := "return Isosceles"
    := "return Scalene"

    View Slide
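The grammar above describes the possible executions of `triangle` as sentences of branch decisions. A minimal sketch of that idea follows. The nonterminal names in `BEHAVIORS` are hypothetical, since the transcript lost the originals, and the instrumentation (an explicit `trace` list) is an assumption made for illustration. Each run of the program should produce a trace that is a sentence of the behavior grammar.

```python
# Behavior grammar: each sentence is one path through triangle().
BEHAVIORS = {
    "<triangle>": [["(a==b)", "<ab_equal>"], ["(a!=b)", "<ab_differ>"]],
    "<ab_equal>": [["(b==c)", "return Equilateral"], ["(b!=c)", "return Isosceles"]],
    "<ab_differ>": [["(b==c)", "return Isosceles"], ["(b!=c)", "<bc_differ>"]],
    "<bc_differ>": [["(a==c)", "return Isosceles"], ["(a!=c)", "return Scalene"]],
}

def sentences(symbol):
    """Enumerate all sentences (lists of tokens) derivable from `symbol`."""
    if symbol not in BEHAVIORS:
        return [[symbol]]
    result = []
    for rule in BEHAVIORS[symbol]:
        partial = [[]]
        for s in rule:
            partial = [p + q for p in partial for q in sentences(s)]
        result.extend(partial)
    return result

def triangle(a, b, c, trace):
    """Classify a triangle, recording each branch decision in `trace`."""
    if a == b:
        trace.append("(a==b)")
        if b == c:
            trace.extend(["(b==c)", "return Equilateral"])
            return "Equilateral"
        trace.extend(["(b!=c)", "return Isosceles"])
        return "Isosceles"
    trace.append("(a!=b)")
    if b == c:
        trace.extend(["(b==c)", "return Isosceles"])
        return "Isosceles"
    trace.append("(b!=c)")
    if a == c:
        trace.extend(["(a==c)", "return Isosceles"])
        return "Isosceles"
    trace.extend(["(a!=c)", "return Scalene"])
    return "Scalene"

def run(a, b, c):
    trace = []
    triangle(a, b, c, trace)
    return trace

if __name__ == "__main__":
    allowed = sentences("<triangle>")
    for args in [(1, 1, 1), (1, 1, 2), (1, 2, 2), (2, 1, 2), (1, 2, 3)]:
        t = run(*args)
        print(args, " ".join(t), "ok" if t in allowed else "DIVERGES")
```

A trace that is not a sentence of the grammar would indicate a behavior divergence.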

  289. LockB()
    LockA()
    DoAB()
    UnlockB()
    UnlockA()
    Challenge: Identify Behavior Divergence
    "lockA" "lockB" "UnlockB" "UnlockA"

    View Slide

  294. 126
    The Science of Fuzzing
    The Science of Inputs
    Program
    The Science of Behaviors

    View Slide

  297. 128
    Oracles for Fuzzing
    Focused Fuzzing
    Automatic Repair
    Fault Localization
    Beyond Syntax
    Grammar Mining
    Onward

    View Slide

  305. Future Work:
    Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    Problem: How to Fuzz Parsers
    https://rahul.gopinath.org
    Generalize
    Combine
    The Science of Fuzzing

    View Slide

  306. Future Work:
    Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    Problem: How to Fuzz Parsers
    https://rahul.gopinath.org
    Debugging
    Combine
    The Science of Fuzzing

    View Slide