Fuzzing -- From Alchemy to a Science

Talk at Passau University

Rahul Gopinath

June 30, 2021

Transcript

  5. Fuzzing
    from Alchemy
    to Science
    Rahul Gopinath

  6. The story begins 2500 years ago.
    500 BC

  11. 500 BC
    Aṣṭādhyāyī, Dakṣiputra Pāṇini
    Vedic Sanskrit (ad hoc rules) -> Classical Sanskrit (formal specification)

  13. 2500 years later...
    2021 CE
    The world is governed by software
    We have a crisis

  15. Growth in software complexity
    Debian 5 ~ 70 million lines
    Smart cars ~ 100 million lines
    Google ~ 2 billion lines
    [Figure: Linux kernel size in millions of lines of code (y-axis, 5 M to 35 M) across Linux releases 1.0.0 to 5.7 (x-axis). Source: Wikipedia]

  17. Growth in vulnerabilities
    [Figure: number of vulnerabilities (y-axis, 2k to 16k) per year, 2001 to 2019 (x-axis). Source: NIST]

  20. Fuzzing
    Trash deck technique: 1950s - Gerald Weinberg
    Program -> Crash?

  22. Fuzzing
    Random Inputs -> Program -> Automatic Checks
    Automatic checks:
    • Memory Bounds Violation
    • Privilege Escalation
    • Safety Violations
    • Metamorphic Relations
    • Differential Execution

  24. Random Fuzzing
    $ ./fuzz
    [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu

    2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z

    h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!

    AxB"YXRS@!Kd6;wtAMefFWM(`|J_<1~o}

    z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?

    lR=bF3+;y$3lodQ)KC-i,c{<[~m!]o;{.'}Gj\(X}EtYetrp

    bY@aGZ1{P!AZU7x#4(Rtn!q4nCwqol^y6

    }0|Ko=*JK~;zMKV=9Nai:wxu{J&UV#HaU

    )*BiC<),`+t*gkaPq>&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM

    PG-FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2

    D|vBy!^zkhdf3C5PAkR?V((-%>i2Qx]D$qs4O`1@fevnG'2\11Vf3piU37@

    5:dfd45*(7^%5ap\zIyl"'f,$ee,J4Gw:

    cgNKLie3nx9(`efSlg6#[K"@WjhZ}r[Sc

    un&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/6N

    -wyzj/MTd#A;r*(ds./df3r8Odaf?/<#r
    Program
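
The random fuzzing setup on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the `./fuzz` tool from the talk: `fuzz`, `run_with_check`, and `toy_program` are hypothetical names, and the only automatic check used is "did the program raise an uncaught exception".

```python
import random
import string

def fuzz(rng, max_len=40):
    """Return a random string of printable, non-whitespace ASCII characters."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    length = rng.randint(1, max_len)
    return "".join(rng.choice(alphabet) for _ in range(length))

def run_with_check(program, data):
    """Run `program` on `data`; an uncaught exception counts as a crash."""
    try:
        program(data)
        return "ok"
    except Exception as exc:
        return "crash: " + type(exc).__name__

def toy_program(data):
    """Hypothetical program under test: fails on a leading '['."""
    if data.startswith("["):
        raise IndexError("unbalanced bracket")
    return len(data)

rng = random.Random(0)
results = [run_with_check(toy_program, fuzz(rng)) for _ in range(1000)]
crashes = [r for r in results if r != "ok"]
print(len(crashes), "crashes out of", len(results))
```

The check here is deliberately crude; the richer oracles named in the talk (memory bounds, differential execution, and so on) would replace the `except` clause.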

  26. $ ./fuzz -int | program
    634111569742810193727424069509
    741355925061499451162464719526
    615957331924826555590537407605
    181400079803446874252046374716
    740973770255348279425601333144
    152724057932073828569041216191
    099859446496509919024810271242
    622974988671421938012464630138
    735355134599327240920259675263
    574528613057084231370741920902
    794677842164654990353575580453
    777282305855352378119038096476
    699871306655084953377039862387
    924957554389878352934547664240
    082431556093837288597262675598
    630851919061829885048834738832
    677022429414980917053939970795
    722006987916088650168665471731 yes
    Random Fuzzing
    from sys import stdin

    def is_prime(n: int) -> bool:
        """Primality test using 6k+-1 optimization."""
        if n <= 3:
            return n > 1
        if n % 2 == 0 or n % 3 == 0:
            return False
        i = 5
        while i ** 2 <= n:
            if n % i == 0 or n % (i + 2) == 0:
                return False
            i += 6
        return True

    def main():
        # Each line of fuzzer output is one candidate integer.
        for line in stdin:
            num = int(line)
            print(num, is_prime(num))
    Weak point: unequal divisions in input space
    Input space: n > 3 versus n <= 3
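
The weak point named on this slide can be made concrete with a small experiment. The sketch below assumes the fuzzer draws integers uniformly from 0 up to 30 digits, matching the lengths shown on the slide; `branch` is a hypothetical helper that only names which top-level branch of `is_prime` an input would take.

```python
import random

def branch(n):
    """Name the top-level branch that is_prime(n) from the slide would take."""
    return "n <= 3" if n <= 3 else "n > 3"

rng = random.Random(1)
counts = {"n <= 3": 0, "n > 3": 0}
for _ in range(10_000):
    n = rng.randrange(10 ** 30)   # uniformly random integer with up to 30 digits
    counts[branch(n)] += 1
print(counts)
```

Only four of the 10^30 possible values fall into the `n <= 3` partition, so purely random sampling practically never exercises that branch.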

  29. Advanced Fuzzing: Instrumentation
    Original:
    def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene
    Instrumented:
    def triangle(a, b, c):
        __probe_enter()
        if a == b:
            __probe_1()
            if b == c:
                __probe_2()
                return Equilateral
            else:
                __probe_3()
                return Isosceles
        else:
            __probe_4()
            if b == c:
                __probe_5()
                return Isosceles
            else:
                __probe_6()
                if a == c:
                    __probe_7()
                    return Isosceles
                else:
                    __probe_8()
                    return Scalene
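
In Python, a similar effect to the inserted probes can be had without rewriting the source, by using the interpreter's tracing hook. The sketch below swaps the slide's explicit `__probe_N()` calls for `sys.settrace`; `coverage_of` is a hypothetical helper, and the return values are plain strings so that the example runs on its own.

```python
import sys

def triangle(a, b, c):
    if a == b:
        if b == c:
            return "Equilateral"
        else:
            return "Isosceles"
    else:
        if b == c:
            return "Isosceles"
        else:
            if a == c:
                return "Isosceles"
            else:
                return "Scalene"

def coverage_of(func, *args):
    """Run func(*args) and return the set of line offsets executed inside func."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if frame.f_code is code and event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(old)
    return executed

print(sorted(coverage_of(triangle, 1, 1, 1)))
print(sorted(coverage_of(triangle, 1, 1, 2)))
```

The two calls report different line sets, which is the signal an instrumentation-guided fuzzer uses to tell that an input reached new code.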

  33. Instrumentation Based Fuzzing
    def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene
    triangle(1,1,1) -> Equilateral
    triangle(1,1,2) -> Isosceles

  34. Instrumentation Guided Fuzzing
    • Coverage Guided
    • Solver Directed

  37. Coverage Guided Fuzzing
    • Randomly generate inputs
    • Choose seeds with new coverage
    AFL
    First Release: 2013
    AFL Trophy Case

  38. Coverage Guided Fuzzing
    • Randomly generate inputs
    • Choose seeds with new coverage
    AFL
    Pulling JPEGs out of thin air
    https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
    Valid JPEG in 6 hours on an 8-core machine!
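
The two bullets above (generate inputs, keep those with new coverage) can be sketched as a small loop. This is not AFL's algorithm: line coverage collected via `sys.settrace`, the four-letter alphabet, and the `target` function are all assumptions chosen to keep the example short.

```python
import random
import sys

def target(data):
    """Hypothetical program under test: deeper branches need specific characters."""
    if len(data) > 0 and data[0] == "b":
        if len(data) > 1 and data[1] == "a":
            if len(data) > 2 and data[2] == "d":
                raise RuntimeError("bug reached")

def run_with_coverage(func, data):
    """Return (lines executed in func, crashed?) for one input."""
    code, lines = func.__code__, set()

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            lines.add(frame.f_lineno)
        return tracer

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(data)
        crashed = False
    except RuntimeError:
        crashed = True
    finally:
        sys.settrace(old)
    return lines, crashed

def mutate(rng, data):
    """Replace one character or append one, chosen at random."""
    ch = rng.choice("abcd")
    if data and rng.random() < 0.5:
        pos = rng.randrange(len(data))
        return data[:pos] + ch + data[pos + 1:]
    return data + ch

def coverage_guided_fuzz(rng, iterations=20000):
    seeds, seen = [""], set()
    for _ in range(iterations):
        candidate = mutate(rng, rng.choice(seeds))
        lines, crashed = run_with_coverage(target, candidate)
        if crashed:
            return candidate
        if not lines <= seen:        # new coverage: keep the input as a seed
            seen |= lines
            seeds.append(candidate)
    return None

print(coverage_guided_fuzz(random.Random(0)))
```

Each kept seed gets one comparison further into `target`, so the crashing input is assembled one character at a time instead of being guessed in one shot.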

  40. Coverage Guided Fuzzing
    • Randomly generate inputs
    • Choose seeds with new coverage
    AFL
    static int is_reserved_word_token(const char *s, int len) {
      const char *reserved[] = {
          "break", "case", "catch", "continue", "debugger", "default",
          "delete", "do", "else", "false", "finally", "for",
          "function", "if", "in", "instanceof", "new", "null",
          "return", "switch", "this", "throw", "true", "try",
          "typeof", "var", "void", "while", "with", "let",
          "undefined", ((void *)0)};
      int i;
      if (!mjs_is_alpha(s[0]))
        return 0;
      for (i = 0; reserved[i] != ((void *)0); i++) {
        if (len == (int)strlen(reserved[i]) && strncmp(s, reserved[i], len) == 0)
          return i + 1;
      }
      return 0;
    }
    Weak point: Magic bytes

  47. Solver Directed Fuzzing
    • Collect path constraints
    • Solve negated constraints for new inputs
    (a == b)
    (b == c)
    (b != c)
    triangle(1,2,1)
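
The collect-then-negate loop can be sketched for the `triangle` example. Two simplifications are assumed here: `trace` is written by hand to mirror the branches of `triangle` rather than obtained through instrumentation, and `solve` is a brute-force search over a tiny domain standing in for a real constraint solver.

```python
from itertools import product

CONDITIONS = {
    "a == b": lambda a, b, c: a == b,
    "b == c": lambda a, b, c: b == c,
    "a == c": lambda a, b, c: a == c,
}

def trace(a, b, c):
    """Path constraints of triangle(a, b, c) as (condition, outcome) pairs."""
    path = [("a == b", a == b), ("b == c", b == c)]
    if a != b and b != c:
        path.append(("a == c", a == c))
    return path

def solve(constraints, domain=range(1, 4)):
    """Stand-in for a solver: first assignment in the domain that fits."""
    for a, b, c in product(domain, repeat=3):
        if all(CONDITIONS[name](a, b, c) == want for name, want in constraints):
            return (a, b, c)
    return None

def explore(seed):
    """Flip one branch decision at a time to derive inputs for new paths."""
    worklist, seen_paths, inputs = [seed], set(), []
    while worklist:
        args = worklist.pop()
        path = tuple(trace(*args))
        if path in seen_paths:
            continue
        seen_paths.add(path)
        inputs.append(args)
        for k in range(len(path)):
            name, outcome = path[k]
            # Keep the prefix before decision k, negate decision k itself.
            flipped = list(path[:k]) + [(name, not outcome)]
            solution = solve(flipped)
            if solution is not None:
                worklist.append(solution)
    return inputs

for args in explore((1, 1, 1)):
    print(args, trace(*args))
```

Starting from `triangle(1,1,1)`, the exploration ends with one input per feasible path through the function.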

  49. Solver Directed Fuzzing
    • Collect path constraints
    • Solve negated constraints for new inputs
    void next_sym() {
      while (1) {
        switch (ch) {
          case '{': next_ch(); sym = LBRA; return;
          case '}': next_ch(); sym = RBRA; return;
          case '(': next_ch(); sym = LPAR; return;
          case ')': next_ch(); sym = RPAR; return;
          case '+': next_ch(); sym = PLUS; return;
          case '-': next_ch(); sym = MINUS; return;
          case '<': next_ch(); sym = LESS; return;
          case ';': next_ch(); sym = SEMI; return;
          case '=': next_ch(); sym = EQUAL; return;
          default:
            if (ch >= '0' && ch <= '9') {
              int_val = 0; /* missing overflow check */
              while (ch >= '0' && ch <= '9') {
                int_val = int_val*10 + (ch - '0'); next_ch(); }
              sym = INT;
            } else if (ch >= 'a' && ch <= 'z') {
              int i = 0; /* missing overflow check */
              while ((ch >= 'a' && ch <= 'z') || ch == '_') {
                id_name[i++] = ch; next_ch(); }
              id_name[i] = '\0';
              sym = 0;
              while (words[sym] != NULL && strcmp(words[sym], id_name) != 0)
                sym++;
              if (words[sym] == NULL)
                if (id_name[1] == '\0') sym = ID; else syntax_error();
            }
            else syntax_error();
            return;
        }
      }
    }
    Weak point: Path explosion

  53. Fuzzing Parsers
    $ ./fuzz
    [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu

    2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z

    h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!

    AxB"YXRS@!Kd6;wtAMefFWM(`|J_<1~o}

    z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?

    lR=bF3+;y$3lodQ)KC-i,c{<[~m!]o;{.'}Gj\(X}EtYetrp

    bY@aGZ1{P!AZU7x#4(Rtn!q4nCwqol^y6

    }0|Ko=*JK~;zMKV=9Nai:wxu{J&UV#HaU

    )*BiC<),`+t*gkaPq>&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM

    PG-FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2

    D|vBy!^zkhdf3C5PAkR?V((-%>i2Qx]D$qs4O`1@fevnG'2\11Vf3piU37@

    5:dfd45*(7^%5ap\zIyl"'f,$ee,J4Gw:

    cgNKLie3nx9(`efSlg6#[K"@WjhZ}r[Sc

    un&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/6N

    -wyzj/MTd#A;r*(ds./df3r8Odaf?/<#r
    Parser -> Syntax Error
    Interpreter
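
How rarely random text gets past a parser is easy to measure. The sketch below uses Python's `json.loads` as a stand-in parser (an assumption; the slide does not name a specific one) and feeds it random printable strings.

```python
import json
import random
import string

rng = random.Random(0)
accepted = rejected = 0
for _ in range(1000):
    text = "".join(rng.choice(string.printable) for _ in range(20))
    try:
        json.loads(text)
        accepted += 1
    except ValueError:       # json.JSONDecodeError is a subclass of ValueError
        rejected += 1
print("accepted:", accepted, "rejected:", rejected)
```

Nearly every input stops at the syntax check, so the code behind the parser is hardly exercised at all.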

  56. Advanced Fuzzing: Specialized Generators
    • Specialize generation for a domain
    80,000 lines of code

  58. The Holy Grail of Fuzzing
    Parsers

  63. Overcoming Parsers
    Monolithic

  64. The Missing Piece: Formal Specification

  69. Formal Languages
    Formal Language Descriptions (Chomsky, 1956)
    • Regular (Type 3)
    • Context Free: Argument Stack
    • Recursively Enumerable: Return Stack
    Easy to produce and parse

  78. Grammar
    Arithmetic expression grammar
    <start>   := <expr>
    <expr>    := <expr> '+' <expr>
               | <expr> '-' <expr>
               | <expr> '/' <expr>
               | <expr> '*' <expr>
               | '(' <expr> ')'
               | <number>
    <number>  := <integer>
               | <integer> '.' <integer>
    <integer> := <digit> <integer>
               | <digit>
    <digit>   := [0-9]
    Callouts: Key, Definition for key, Expansion Rule, Terminal Symbol, Nonterminal Symbol

  81. Grammars
    As recognizers
    (8 / 3) * 49
    <start>   := <expr>
    <expr>    := <expr> '+' <expr>
               | <expr> '-' <expr>
               | <expr> '/' <expr>
               | <expr> '*' <expr>
               | '(' <expr> ')'
               | <number>
    <number>  := <integer>
               | <integer> '.' <integer>
    <integer> := <digit> <integer>
               | <digit>
    <digit>   := [0-9]

  83. Grammars
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-
    ((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4)
    )))) - ++4) / +(-+---((5.6 - --(3 *
    -1.8 * +(6 * +-(((-(-6) * ---+6)) /
    +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3
    - ++9.0 + ---(--+7 / (1 / +++6.37)
    + (1) / 482) / +++-+0)))) + 8.2 - 27
    - -9 / +((+9 * --2 + --+-+-((-1 * +
    (8 - 5 - 6)) * (-(a-+(((+(4))))) - +
    +4) / +(-+---((5.6 - --(3 * -1.8 * +
    (6 * +-(((-(-6) * ---+6)) / +--(+-+-
    7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0
    + ---(--+7 / (1 / +++6.37) + (1) /
    482) / +++-+0)))) * -+5 + 7.513))))
    - (+1 / ++((-84)))))))) * ++5 / +-(-
    -2 - -++-9.0)))) / 5 * --++090 + * -
    +5 + 7.513)))) - (+1 / ++((-84))))))
    )) * 8.2 - 27 - -9 / +((+9 * --2 + -
    -+-+-((-1 * +(8 - 5 - 6)) * (-(a-+((
    (+(4))))) - ++4) / +(-+---((5.6 - --
    (3 * -1.8 * +(6 * +-(((-(-6) * ---+6
    )) / +--(+-+-7 * (-0 * (+(((((2)) +
    8 - 3 - ++9.0 + ---(--+7 / (1 / +++6
    .37) + (1) / 482) / +++-+0)))) * -+5
    + 7.513)))) - (+1 / ++((-84))))))))
    * ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090 ++5 / +-(--2 - -++-9.0)))) /
    5 * --++090
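    <start>   := <expr>
    <expr>    := <expr> '+' <expr>
               | <expr> '-' <expr>
               | <expr> '/' <expr>
               | <expr> '*' <expr>
               | '(' <expr> ')'
               | <number>
    <number>  := <integer>
               | <integer> '.' <integer>
    <integer> := <digit> <integer>
               | <digit>
    <digit>   := [0-9]
    As producers (Hanford 1970), (Purdom 1972)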
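
Using a grammar as a producer amounts to expanding nonterminals at random until only terminals remain. The sketch below is one simple way to do that, not the method of Hanford or Purdom. The nonterminal names (`<start>`, `<expr>`, `<number>`, `<integer>`, `<digit>`) are assumed, since the transcript lost them, and the `budget` parameter is an added device to force termination.

```python
import random

GRAMMAR = {
    "<start>": [["<expr>"]],
    "<expr>": [
        ["<expr>", "+", "<expr>"],
        ["<expr>", "-", "<expr>"],
        ["<expr>", "/", "<expr>"],
        ["<expr>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<number>"],
    ],
    "<number>": [["<integer>"], ["<integer>", ".", "<integer>"]],
    "<integer>": [["<digit>", "<integer>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}

def min_depths(grammar):
    """Fewest expansion steps needed to turn each nonterminal into terminals."""
    depth = {key: float("inf") for key in grammar}
    changed = True
    while changed:
        changed = False
        for key, alternatives in grammar.items():
            for alt in alternatives:
                cost = 1 + max((depth[s] for s in alt if s in grammar), default=0)
                if cost < depth[key]:
                    depth[key], changed = cost, True
    return depth

def alt_depth(alt, depth):
    """Depth needed to finish the deepest nonterminal in one alternative."""
    return max((depth[s] for s in alt if s in depth), default=0)

def produce(grammar, symbol, rng, budget, depth):
    """Expand `symbol` at random; once `budget` runs out, take the shallowest alternative."""
    if symbol not in grammar:
        return symbol                       # terminal symbol
    alternatives = grammar[symbol]
    if budget <= 0:
        alt = min(alternatives, key=lambda a: alt_depth(a, depth))
    else:
        alt = rng.choice(alternatives)
    return "".join(produce(grammar, s, rng, budget - 1, depth) for s in alt)

rng = random.Random(0)
depth = min_depths(GRAMMAR)
for _ in range(5):
    print(produce(GRAMMAR, "<start>", rng, 8, depth))
```

Every string produced this way is syntactically valid for the grammar by construction, which is what lets generated inputs get past the parser.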

  85. Grammars
    As effective producers
    Parser -> Interpreter
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-
    ((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4)
    )))) - ++4) / +(-+---((5.6 - --(3 *
    -1.8 * +(6 * +-(((-(-6) * ---+6)) /
    +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3
    - ++9.0 + ---(--+7 / (1 / +++6.37)
    + (1) / 482) / +++-+0)))) + 8.2 - 27
    - -9 / +((+9 * --2 + --+-+-((-1 * +
    (8 - 5 - 6)) * (-(a-+(((+(4))))) - +
    +4) / +(-+---((5.6 - --(3 * -1.8 * +
    (6 * +-(((-(-6) * ---+6)) / +--(+-+-
    7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0
    + ---(--+7 / (1 / +++6.37) + (1) /
    482) / +++-+0)))) * -+5 + 7.513))))
    - (+1 / ++((-84)))))))) * ++5 / +-(-
    -2 - -++-9.0)))) / 5 * --++090 + * -
    +5 + 7.513)))) - (+1 / ++((-84))))))
    )) * 8.2 - 27 - -9 / +((+9 * --2 + -
    -+-+-((-1 * +(8 - 5 - 6)) * (-(a-+((
    (+(4))))) - ++4) / +(-+---((5.6 - --
    (3 * -1.8 * +(6 * +-(((-(-6) * ---+6
    )) / +--(+-+-7 * (-0 * (+(((((2)) +
    8 - 3 - ++9.0 + ---(--+7 / (1 / +++6
    .37) + (1) / 482) / +++-+0)))) * -+5
    + 7.513)))) - (+1 / ++((-84))))))))
    * ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090 ++5 / +-(--2 - -++-9.0)))) /
    5 * --++090

  86. Where to Get the Grammar From?

  91. Reference Specification?
    The standard spec
    Buggy Implementation
    "Extra" Features
    "Accepted" Bugs
    Postel's Law (the cause of trouble): "Be liberal in what you accept, and conservative in what you send"

  93. https://www.json.org
    object
        { }
        { members }
    members
        pair
        pair , members
    pair
        string : value
    array
        [ ]
        [ elements ]
    elements
        value
        value , elements
    value
        string
        number
        object
        array
        true
        false
        null
    string
        " "
        " chars "
    chars
        char
        char chars
    char
        UNICODE \ [",\,CTRL]
        \" \\ \/ \b \f \n \r \t
        \u hex hex hex hex
    number
        int
        int frac
        int exp
        int frac exp
    int
        digit
        onenine digits
        - digit
        - onenine digits
    frac
        . digits
    exp
        e digits
    hex
        digit
        A - F
        a - f
    digits
        digit
        digit digits
    e
        e e+ e-
        E E+ E-

  99. Parsing JSON is a Minefield
    http://seriot.ch/
    [Figure: matrix of JSON parser test results. Legend: Expected; Parse Fail (Expect Success); Parse Success (Expect Fail); Parse Success (Undefined); Parse Fail (Undefined); Parser Crash; Timeout]

  101. Where to Get the Grammar From?
    Hand-written parsers already encode the grammar

  104. How to Extract This Grammar?
    • Inputs -> Dynamic Control Dependence Trees
    • DCD Trees -> Context Free Grammar

  107. Control Dependence Graph
    Statement B is control dependent on A if A determines whether B executes.
    def parse_csv(s,i):
        while s[i:]:
            if is_digit(s[i]):
                n,j = num(s[i:])
                i = i+j
            else:
                comma(s[i])
                i += 1
    CDG for parse_csv
    Example: the while determines whether the if executes
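
For structured code like `parse_csv`, control dependence can be approximated from syntactic nesting alone. The sketch below uses Python's `ast` module under that assumption (no `break`, `continue`, early `return`, or exceptions). `control_dependences` is a hypothetical helper, and the indentation of `parse_csv` is reconstructed with `i += 1` placed in the `else` branch.

```python
import ast

SOURCE = '''
def parse_csv(s, i):
    while s[i:]:
        if is_digit(s[i]):
            n, j = num(s[i:])
            i = i + j
        else:
            comma(s[i])
            i += 1
'''

def control_dependences(source):
    """Map each statement line to the line of the construct that directly controls it.

    Assumes structured code where control dependence coincides with nesting.
    """
    deps = {}

    def visit(statements, controller):
        for stmt in statements:
            deps[stmt.lineno] = controller
            if isinstance(stmt, (ast.If, ast.While)):
                visit(stmt.body, stmt.lineno)
                visit(stmt.orelse, stmt.lineno)

    func = ast.parse(source).body[0]
    visit(func.body, func.lineno)
    return deps

lines = SOURCE.splitlines()
for line, controller in sorted(control_dependences(SOURCE).items()):
    print(f"{lines[line - 1].strip()!r} is controlled by {lines[controller - 1].strip()!r}")
```

The source is only parsed, never executed, so the undefined helpers `is_digit`, `num`, and `comma` do not matter here.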

  109. Dynamic Control Dependence Tree
    Each statement execution is represented as a separate node
    def parse_csv(s,i):
        while s[i:]:
            if is_digit(s[i]):
                n,j = num(s[i:])
                i = i+j
            else:
                comma(s[i])
                i += 1
    CDG for parse_csv
    DCD Tree for call parse_csv()

  111. DCD Tree ~ Parse Tree
    • No tracking beyond input buffer
    • Characters are attached to nodes where they are accessed last
    def parse_csv(s,i):
        while s[i:]:
            if is_digit(s[i]):
                n,j = num(s[i:])
                i = i+j
            else:
                comma(s[i])
                i += 1
    Input "12," with characters '1' '2' ','
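
The "accessed last" rule can be illustrated with a wrapper around the input buffer that remembers which function read each character most recently. This is a simplification: it labels characters with function names rather than nodes of a dynamic control dependence tree. `TrackedInput`, `parse_num`, and `parse_comma` are hypothetical names, and `sys._getframe` is CPython-specific.

```python
import sys

class TrackedInput:
    """Wraps the input buffer and records who last read each character."""
    def __init__(self, text):
        self.text = text
        self.last_access = {}   # index -> name of the function that read it last

    def __getitem__(self, i):
        # Frame 1 is the function in which the expression s[i] was evaluated.
        self.last_access[i] = sys._getframe(1).f_code.co_name
        return self.text[i]

    def __len__(self):
        return len(self.text)

def is_digit(c):
    return c in "0123456789"

def parse_num(s, i):
    n = ""
    while i < len(s) and is_digit(s[i]):
        n += s[i]
        i += 1
    return i, n

def parse_comma(s, i):
    assert s[i] == ","
    return i + 1

def parse_csv(s, i=0):
    while i < len(s):
        if is_digit(s[i]):
            i, _ = parse_num(s, i)
        else:
            i = parse_comma(s, i)
    return i

tracked = TrackedInput("12,")
parse_csv(tracked)
for index in sorted(tracked.last_access):
    print(repr(tracked.text[index]), "->", tracked.last_access[index])
```

Although `parse_csv` peeks at every character first, the final attribution goes to the more specific function that consumed it, which is what makes the resulting tree resemble a parse tree.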

  113. Parse tree for parse_expr('9+3/4')
    class Ex(Exception): pass
    def is_digit(i): return i in '0123456789'
    def parse_num(s,i):
        n = ''
        while s[i:] and is_digit(s[i]):
            n += s[i]
            i = i +1
        return i,n
    def parse_paren(s, i):
        assert s[i] == '('
        i, v = parse_expr(s, i+1)
        if s[i:] == '': raise Ex(s, i)
        assert s[i] == ')'
        return i+1, v
    def parse_expr(s, i = 0):
        expr, is_op = [], True
        while s[i:]:
            c = s[i]
            if is_digit(c):
                if not is_op: raise Ex(s,i)
                i,num = parse_num(s,i)
                expr.append(num)
                is_op = False
            elif c in ['+', '-', '*', '/']:
                if is_op: raise Ex(s,i)
                expr.append(c)
                is_op, i = True, i + 1
            elif c == '(':
                if not is_op: raise Ex(s,i)
                i, cexpr = parse_paren(s, i)
                expr.append(cexpr)
                is_op = False
            elif c == ')': break
            else: raise Ex(s,i)
        if is_op: raise Ex(s,i)
        return i, expr
    Input: 9+3/4

  114. 51
    9+3/4
    Identifying Compatible Nodes
    Which nodes correspond to the same nonterminal

    View full-size slide

  118. 52
    (9 + 1) * 3
    3 * (9 + 1)

    View full-size slide

  123. 53
    9 + 1
    3 * (9 + 1)

    View full-size slide

  127. 54
    3 (9 + 1) *
    3 * (9 + 1)
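
    The swaps on these slides can be read as a compatibility test: two nodes are treated as interchangeable if exchanging their substrings still yields an input the parser accepts. The following is a minimal sketch of that check, assuming the calc.py parser from the earlier slide. The `Ex` class and the helpers `accepts` and `swap` are hypothetical names added for illustration, and inputs are written without spaces because this parser rejects whitespace.

```python
class Ex(Exception):
    """Parse error (assumed; the slides use Ex without defining it)."""


def is_digit(c):
    return c in '0123456789'


def parse_num(s, i):
    n = ''
    while s[i:] and is_digit(s[i]):
        n += s[i]
        i += 1
    return i, n


def parse_paren(s, i):
    assert s[i] == '('
    i, v = parse_expr(s, i + 1)
    if s[i:] == '':
        raise Ex(s, i)
    assert s[i] == ')'
    return i + 1, v


def parse_expr(s, i=0):
    expr, is_op = [], True
    while s[i:]:
        c = s[i]
        if is_digit(c):
            if not is_op:
                raise Ex(s, i)
            i, num = parse_num(s, i)
            expr.append(num)
            is_op = False
        elif c in ['+', '-', '*', '/']:
            if is_op:
                raise Ex(s, i)
            expr.append(c)
            is_op, i = True, i + 1
        elif c == '(':
            if not is_op:
                raise Ex(s, i)
            i, cexpr = parse_paren(s, i)
            expr.append(cexpr)
            is_op = False
        elif c == ')':
            break
        else:
            raise Ex(s, i)
    if is_op:
        raise Ex(s, i)
    return i, expr


def accepts(s):
    """True if the parser consumes the whole string without an error."""
    try:
        end, _ = parse_expr(s)
    except (Ex, AssertionError):
        return False
    return end == len(s)


def swap(s, a, b):
    """Exchange the substrings at spans a and b (a before b, no overlap)."""
    (a0, a1), (b0, b1) = a, b
    return s[:a0] + s[b0:b1] + s[a1:b0] + s[a0:a1] + s[b1:]


s = '3*(9+1)'
print(swap(s, (0, 1), (2, 7)), accepts(swap(s, (0, 1), (2, 7))))  # operand <-> operand
print(swap(s, (1, 2), (2, 7)), accepts(swap(s, (1, 2), (2, 7))))  # operator <-> operand
```

    Swapping the two operands keeps the input valid, so those nodes are candidates for the same nonterminal; swapping the operator with an operand is rejected, so they are not.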

    View full-size slide

  130. 56
    3*(1)
    1
    :=

    :=

    View full-size slide

  131. :=
    |
    |
    |
    :=
    :=
    |
    :=
    := '3' | '1'
    := '(' ')'
    :=
    := '*'
    57

    View full-size slide

  132. :=
    :=
    |
    :=
    |
    |
    |
    :=
    :=
    |
    :=
    := '3' | '1'
    := '(' ')'
    :=
    := '*'
    57

    View full-size slide

  133. 58
    :=
    :=
    |
    :=
    |
    := '(' ')'
    |
    := '*' | '+' | '-' | '/'
    :=
    |
    : [0-9]
    calc.py Recovered Arithmetic Grammar

    View full-size slide

  135. 59
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-
    ((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4)
    )))) - ++4) / +(-+---((5.6 - --(3 *
    -1.8 * +(6 * +-(((-(-6) * ---+6)) /
    +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3
    - ++9.0 + ---(--+7 / (1 / +++6.37)
    + (1) / 482) / +++-+0)))) + 8.2 - 27
    - -9 / +((+9 * --2 + --+-+-((-1 * +
    (8 - 5 - 6)) * (-(a-+(((+(4))))) - +
    +4) / +(-+---((5.6 - --(3 * -1.8 * +
    (6 * +-(((-(-6) * ---+6)) / +--(+-+-
    7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0
    + ---(--+7 / (1 / +++6.37) + (1) /
    482) / +++-+0)))) * -+5 + 7.513))))
    - (+1 / ++((-84)))))))) * ++5 / +-(-
    -2 - -++-9.0)))) / 5 * --++090 + * -
    +5 + 7.513)))) - (+1 / ++((-84))))))
    )) * 8.2 - 27 - -9 / +((+9 * --2 + -
    -+-+-((-1 * +(8 - 5 - 6)) * (-(a-+((
    (+(4))))) - ++4) / +(-+---((5.6 - --
    (3 * -1.8 * +(6 * +-(((-(-6) * ---+6
    )) / +--(+-+-7 * (-0 * (+(((((2)) +
    8 - 3 - ++9.0 + ---(--+7 / (1 / +++6
    .37) + (1) / 482) / +++-+0)))) * -+5
    + 7.513)))) - (+1 / ++((-84))))))))
    * ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090 ++5 / +-(--2 - -++-9.0)))) /
    5 * --++090
    :=
    :=
    |
    :=
    |
    := '(' ')'
    |
    := '*' | '+' | '-' | '/'
    :=
    |
    : [0-9]

    View full-size slide

  136. 60
    ::=
    ::= '"'
    | '['
    | '{'
    |
    | 'true'
    | 'false'
    | 'null'
    ::= +
    | + 'e' +
    ::= '+' | '-' | '.' | [0-9] | 'E' | 'e'
    ::= * '"'
    ::= ']'
    | (',')* ']'
    | ( ',' )+ (',' )* ']'
    ::= '}'
    | ( '"' ':' ',' )*
    '"' ':' '}'
    ::= ' ' | '!' | '#' | '$' | '%' | '&' | '''
    | '*' | '+' | '-' | ',' | '.' | '/' | ':' | ';'
    | '<' | '=' | '>' | '?' | '@' | '[' | ']' | '^'
    | '_', ''',| '{' | '|' | '}' | '~'
    | '[A-Za-z0-9]'
    | '\'
    ::= '"' | '/' | 'b' | 'f' | 'n' | 'r' | 't'
                stm.next()
                if expect_key:
                    raise JSONError(E_DKEY, stm, stm.pos)
                if c == '}':
                    return result
                expect_key = 1
                continue
            # parse out a key/value pair
            elif c == '"':
                key = _from_json_string(stm)
                stm.skipspaces()
                c = stm.next()
                if c != ':':
                    raise JSONError(E_COLON, stm, stm.pos)
                stm.skipspaces()
                val = _from_json_raw(stm)
                result[key] = val
                expect_key = 0
                continue
            raise JSONError(E_MALF, stm, stm.pos)

    def _from_json_raw(stm):
        while True:
            stm.skipspaces()
            c = stm.peek()
            if c == '"':
                return _from_json_string(stm)
            elif c == '{':
                return _from_json_dict(stm)
            elif c == '[':
                return _from_json_list(stm)
            elif c == 't':
                return _from_json_fixed(stm, 'true', True, E_BOOL)
            elif c == 'f':
                return _from_json_fixed(stm, 'false', False, E_BOOL)
            elif c == 'n':
                return _from_json_fixed(stm, 'null', None, E_NULL)
            elif c in NUMSTART:
                return _from_json_number(stm)
            raise JSONError(E_MALF, stm, stm.pos)

    def from_json(data):
        stm = JSONStream(data)
        return _from_json_raw(stm)
    microjson.py Recovered JSON grammar

    View full-size slide

  137. 61
    Mimid
    Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. ESEC/FSE 2020.
    •Javascript
    •C
    •Lisp
    •JSON
    •URL
    •CGI

    View full-size slide

  141. Dynamic Approximation
    (Mimid)
    Static Approximation
    (Static Mimid)
    Actual Specification
    IOT
    &
    Embedded

    View full-size slide

  142. Generating Unbiased Samples

    View full-size slide

  144. Finding Good Samples
    Seed corpus?
    (Blind spots)

    View full-size slide

  145. • Differentiate incomplete and incorrect inputs
    • Solve one input symbol at a time systematically
    Key Idea

    View full-size slide

  155. 67
    Sample Free Generators
    A
    ( 2
    -
    B
    9
    )
    4 )
    A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0
    B ∉ +,-,1,2,3,4,5,6,7,8,9,0,)
    ) ∉ +,-,1,2,3,4,5,6,7,8,9,0
    (2-94)
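
    The key idea above (tell incomplete inputs apart from incorrect ones, then settle one symbol at a time) can be sketched without instrumenting the parser. This is a simplified black-box illustration, not pFuzzer itself, which observes the comparisons the parser makes. The checker `status`, the alphabet, and the length limits are assumptions made for this example.

```python
import random

DIGITS = '0123456789'
OPS = '+-*/'
ALPHABET = DIGITS + OPS + '()' + 'AB'   # 'A' and 'B' stand in for junk symbols


def status(s):
    """Classify s as 'complete', 'incomplete' (a valid prefix) or 'incorrect'."""
    depth = 0
    state = 'operand'   # 'operand' expected, inside a 'number', or after a 'close'
    for c in s:
        if c in DIGITS:
            if state == 'close':
                return 'incorrect'
            state = 'number'
        elif c in OPS:
            if state == 'operand':
                return 'incorrect'
            state = 'operand'
        elif c == '(':
            if state != 'operand':
                return 'incorrect'
            depth += 1
        elif c == ')':
            if state == 'operand' or depth == 0:
                return 'incorrect'
            depth, state = depth - 1, 'close'
        else:
            return 'incorrect'
    if state == 'operand' or depth > 0:
        return 'incomplete'
    return 'complete'


def generate(rng, min_len=5, max_len=12):
    """Grow a valid input one symbol at a time, discarding rejected symbols."""
    prefix = ''
    while True:
        if len(prefix) < max_len:
            candidates = list(ALPHABET)
            rng.shuffle(candidates)
        else:
            # Over budget: steer towards closing open parentheses.
            candidates = [')'] + list(DIGITS)
        for c in candidates:
            verdict = status(prefix + c)
            if verdict != 'incorrect':   # keep the first symbol that is not rejected
                prefix += c
                break
        if verdict == 'complete' and len(prefix) >= min_len:
            return prefix


print(generate(random.Random(1)))
```

    An 'incorrect' verdict means the last symbol must be replaced, while 'incomplete' means the prefix is fine and generation should continue, so no seed corpus is needed.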

    View full-size slide

    Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser Directed Fuzzing. PLDI 2019.
    Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
    Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSRE 2021 (submitted).
    Sample Free Generators
    A
    ( 2
    -
    B
    9
    )
    4 )

    View full-size slide

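  157. Fast Fuzzing with Grammars
    <start> := <expr>
    <expr> := <expr> '+' <expr>
    | <expr> '-' <expr>
    | <expr> '/' <expr>
    | <expr> '*' <expr>
    | '(' <expr> ')'
    | <number>
    <number> := <integer>
    | <integer> '.' <integer>
    <integer> := <digit> <integer>
    | <digit>
    <digit> := [0-9]
    fuzz(expr_grammar, '<start>')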

    View full-size slide

  164. Fast Fuzzing with Grammars
    fuzz(expr_grammar, '<start>')
    Derivation tree (figure): leaves 1, +, 8
    Collapsed string: "1 + 8"

    import random

    def fuzz(grammar, key):
        return collapse(gen_key(grammar, key))

    # is_terminal_symbol is a helper that is not shown on the slide
    def gen_key(grammar, key):
        if is_terminal_symbol(key):
            return (key, [])
        else:
            rule = random.choice(grammar[key])
            return (key, gen_rule(grammar, rule))

    def gen_rule(grammar, rule):
        return [gen_key(grammar, token) for token in rule]

    def collapse(tree):
        key, children = tree
        if not children: return key
        return ''.join([collapse(c) for c in children])
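
    A self-contained version of the fuzzer above may help, since a purely random choice of rules tends not to terminate for this grammar (most `<expr>` rules produce two further `<expr>`). The sketch below assumes the grammar is stored as a Python dict and adds a depth budget; `min_depths`, the `budget` parameter, and the explicit `rng` argument are additions for illustration and do not appear on the slides.

```python
import random

# The expression grammar as a dict: nonterminal -> list of rules (token lists).
EXPR_GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<expr>', '+', '<expr>'], ['<expr>', '-', '<expr>'],
               ['<expr>', '/', '<expr>'], ['<expr>', '*', '<expr>'],
               ['(', '<expr>', ')'], ['<number>']],
    '<number>': [['<integer>'], ['<integer>', '.', '<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[d] for d in '0123456789'],
}


def min_depths(grammar):
    """Smallest derivation depth of each nonterminal, by fixpoint iteration."""
    depths = {key: float('inf') for key in grammar}
    changed = True
    while changed:
        changed = False
        for key, rules in grammar.items():
            for rule in rules:
                d = 1 + max((depths[t] for t in rule if t in grammar), default=0)
                if d < depths[key]:
                    depths[key], changed = d, True
    return depths


def gen_key(grammar, key, budget, rng, depths):
    if key not in grammar:               # terminal symbol: a leaf node
        return (key, [])

    def rule_depth(rule):
        return 1 + max((depths[t] for t in rule if t in grammar), default=0)

    rules = grammar[key]
    if budget <= 0:                      # budget used up: shallowest rules only
        shallowest = min(rule_depth(r) for r in rules)
        rules = [r for r in rules if rule_depth(r) == shallowest]
    rule = rng.choice(rules)
    return (key, [gen_key(grammar, t, budget - 1, rng, depths) for t in rule])


def collapse(tree):
    key, children = tree
    if not children:
        return key
    return ''.join(collapse(c) for c in children)


def fuzz(grammar, key, rng, budget=8):
    return collapse(gen_key(grammar, key, budget, rng, min_depths(grammar)))


print(fuzz(EXPR_GRAMMAR, '<start>', random.Random(0)))
```

    Once the budget is exhausted, only rules with the smallest derivation depth are chosen, which guarantees termination while leaving the upper levels of the tree fully random.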

    View full-size slide

  168. Compiling the Grammar
    <start> := <expr>
    <expr> := <expr> '+' <expr>
    | <expr> '-' <expr>
    | <expr> '/' <expr>
    | <expr> '*' <expr>
    | '(' <expr> ')'
    | <number>
    <number> := <integer>
    | <integer> '.' <integer>
    <integer> := <digit> <integer>
    | <digit>
    <digit> := [0-9]

    def start():
        expr()
    def expr():
        match (random() % 6):
            case 0: expr(); print('+'); expr()
            case 1: expr(); print('-'); expr()
            case 2: expr(); print('/'); expr()
            case 3: expr(); print('*'); expr()
            case 4: print('('); expr(); print(')')
            case 5: number()
    def number():
        match (random() % 2):
            case 0: integer()
            case 1: integer(); print('.'); integer()
    def integer():
        match (random() % 2):
            case 0: digit(); integer()
            case 1: digit()
    def digit():
        match (random() % 10):
            case 0: print('0')
            case 1: print('1')
            case 2: print('2')
            case 3: print('3')
            case 4: print('4')
            case 5: print('5')
            case 6: print('6')
            case 7: print('7')

    def start(rops):
        expr(rops)
    def expr(rops):
        match (rops.next % 6):
            case 0: expr(rops); print('+'); e
            case 1: expr(rops); print('-'); e
            case 2: expr(rops); print('/'); e
            case 3: expr(rops); print('*'); e
            case 4: print('('); expr(rops); p
            case 5: number(rops)
    def number(rops):
        match (rops.next % 2):
            case 0: integer(rops)
            case 1: integer(rops); print('.')
    def integer(rops):
        match (rops.next % 2):
            case 0: digit(rops); integer(rops
            case 1: digit(rops)
    def digit(rops):
        match (rops.next % 10):
            case 0: print('0')
            case 1: print('1')
            case 2: print('2')
            case 3: print('3')
            case 4: print('4')
            case 5: print('5')
            case 6: print('6')
            case 7: print('7')
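
    The compilation step can be sketched as a small code generator that emits one Python function per nonterminal and then loads the result with `exec`. This is an illustration under stated assumptions and not the F1 implementation: the names `fn_name`, `compile_grammar`, `make_fuzzer`, the `out` buffer, and the `budget` cut-off are all hypothetical, and `rops` is modelled as an iterator of random integers.

```python
import random

GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<expr>', '+', '<expr>'], ['<expr>', '-', '<expr>'],
               ['<expr>', '/', '<expr>'], ['<expr>', '*', '<expr>'],
               ['(', '<expr>', ')'], ['<number>']],
    '<number>': [['<integer>'], ['<integer>', '.', '<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[d] for d in '0123456789'],
}


def fn_name(key):
    return 'gen_' + key.strip('<>')      # '<expr>' -> 'gen_expr'


def min_depths(grammar):
    """Smallest derivation depth of each nonterminal, by fixpoint iteration."""
    depths = {key: float('inf') for key in grammar}
    changed = True
    while changed:
        changed = False
        for key, rules in grammar.items():
            for rule in rules:
                d = 1 + max((depths[t] for t in rule if t in grammar), default=0)
                if d < depths[key]:
                    depths[key], changed = d, True
    return depths


def compile_grammar(grammar):
    """Return Python source text with one generator function per nonterminal."""
    depths = min_depths(grammar)

    def rule_depth(rule):
        return 1 + max((depths[t] for t in rule if t in grammar), default=0)

    def body(rule):
        return '; '.join(
            f'{fn_name(t)}(rops, out, budget - 1)' if t in grammar
            else f'out.append({t!r})'
            for t in rule)

    src = []
    for key, rules in grammar.items():
        cheapest = min(rules, key=rule_depth)
        src.append(f'def {fn_name(key)}(rops, out, budget):')
        # Out of budget: always take the shallowest rule so generation ends.
        src.append(f'    if budget <= 0: {body(cheapest)}; return')
        src.append(f'    r = next(rops) % {len(rules)}')
        for i, rule in enumerate(rules):
            src.append(f'    if r == {i}: {body(rule)}; return')
        src.append('')
    return '\n'.join(src)


def make_fuzzer(grammar, start='<start>'):
    namespace = {}
    exec(compile_grammar(grammar), namespace)   # load the generated functions
    entry = namespace[fn_name(start)]

    def fuzz(rng, budget=8):
        rops = iter(lambda: rng.randrange(256), None)   # endless random stream
        out = []
        entry(rops, out, budget)
        return ''.join(out)
    return fuzz


fuzz = make_fuzzer(GRAMMAR)
print(fuzz(random.Random(0)))
```

    Because the grammar lookup happens once at compile time, each generated function only reads a random number and branches, which is the source of the speed-up the slides attribute to compilation.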

    View full-size slide

  170. Grammar Fuzzer
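    grammar = """
    <start> := <expr>
    <expr> := <expr> '+' <expr>
    | <expr> '-' <expr>
    | <expr> '/' <expr>
    | <expr> '*' <expr>
    | '(' <expr> ')'
    | <number>
    <number> := <integer>
    | <integer> '.' <integer>
    <integer> := <digit> <integer>
    | <digit>
    <digit> := [0-9]
    """
    generate(grammar)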
    Fast Fuzzers
    Compiled Grammar (F1)
    def start():
        expr()
    def expr():
        match (random() % 6):
            case 0: expr(); print('+'); expr()
            case 1: expr(); print('-'); expr()
            case 2: expr(); print('/'); expr()
            case 3: expr(); print('*'); expr()
            case 4: print('('); expr(); print(')')
            case 5: number()
    def number():
        match (random() % 2):
            case 0: integer()
            case 1: integer(); print('.'); integer()
    def integer():
        match (random() % 2):
            case 0: digit(); integer()
            case 1: digit()
    def digit():
        match (random() % 10):
            case 0: print('0')
            case 1: print('1')
            case 2: print('2')
            case 3: print('3')
            case 4: print('4')
            case 5: print('5')
            case 6: print('6')
            case 7: print('7')

    Grammar VM (F1)
    def start_0(rops):
        r = next(rops)
        if 0 <= r < 43: expr_0(rops)
        elif 43 <= r < 85: expr_1(rops)
        elif 85 <= r < 128: expr_2(rops)
        elif 128 <= r < 171: expr_3(rops)
        elif 171 <= r < 213: expr_4(rops)
        else: expr_5(rops)
    def expr_0(rops):
        r = next(rops)
        if 0 <= r < 43: expr_0(rops)
        elif 43 <= r < 85: expr_1(rops)
        elif 85 <= r < 128: expr_2(rops)
        elif 128 <= r < 171: expr_3(rops)
        elif 171 <= r < 213: expr_4(rops)
        else: expr_5(rops)
        print('+')
        r = next(rops)
        if 0 <= r < 43: expr_0(rops)
        elif 43 <= r < 85: expr_1(rops)
        elif 85 <= r < 128: expr_2(rops)
        elif 128 <= r < 171: expr_3(rops)
        elif 171 <= r < 213: expr_4(rops)
        else: expr_5(rops)

    Gopinath and Zeller. Building Fast Fuzzers. 2019.

    View full-size slide

  171. 73
    The Fuzzing Pipeline
    Program Under Test

    View full-size slide

  172. 73
    The Fuzzing Pipeline
    Program Under Test
    pFuzzer
    Active
    Guidance
    F1 Grammar Fuzzer
    Grammar
    Inputs
    Grammar Miner
    Samples
    Active
    Learning

    View full-size slide

  173. 74
    The Fuzzing Synergy
    Mimid Grammar Miner
    Parser Directed pFuzzer
    F1 Fuzzer VM
    Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser Directed Fuzzing. PLDI 2019.
    Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
    Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSTA 2021 (submitted).
    Gopinath and Zeller. Building Fast Fuzzers. 2019 (unpublished).
    Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. FSE 2020.

    View full-size slide

  180. 76
    Challenge: Multilevel Envelopes
    POST /InStock HTTP/1.1
    Host: www.stock.org
    Content-Type: application/soap+xml; charset=utf-8
    Content-Length: 312

    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://w3.org/2001/12/soap-encoding">


    IBM



    HTTP POST
    XML PAYLOAD
    SOAP
    RPC Call
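
    To make the layering concrete, the sketch below peels the envelopes of such a request one at a time: HTTP framing first, then the XML payload, then the SOAP body, then the call it carries. The XML tags did not survive in this transcript, so the payload here is a hypothetical stand-in (`GetStockPrice` and `StockName` are assumed names), as are the helpers `build_request` and `peel`.

```python
import xml.etree.ElementTree as ET

SOAP_NS = 'http://www.w3.org/2001/12/soap-envelope'

# Hypothetical payload: element names are assumptions for this example.
PAYLOAD = (
    f'<soap:Envelope xmlns:soap="{SOAP_NS}">'
    '<soap:Body><GetStockPrice><StockName>IBM</StockName></GetStockPrice>'
    '</soap:Body></soap:Envelope>'
)


def build_request(body):
    """Wrap an XML body in the outermost envelope, an HTTP POST."""
    length = len(body.encode('utf-8'))
    return ('POST /InStock HTTP/1.1\r\n'
            'Host: www.stock.org\r\n'
            'Content-Type: application/soap+xml; charset=utf-8\r\n'
            f'Content-Length: {length}\r\n'
            '\r\n' + body)


def peel(request):
    """Unwrap HTTP, then XML, then SOAP, and return the RPC call inside."""
    # Layer 1: HTTP request line, headers, and a length-checked body.
    head, _, body = request.partition('\r\n\r\n')
    request_line, *header_lines = head.split('\r\n')
    method, path, _version = request_line.split(' ')
    headers = dict(line.split(': ', 1) for line in header_lines)
    if int(headers['Content-Length']) != len(body.encode('utf-8')):
        raise ValueError('Content-Length does not match the payload')
    # Layer 2: the payload must be well-formed XML.
    root = ET.fromstring(body)
    # Layer 3: the XML must be a SOAP envelope containing a Body element.
    soap_body = root.find('soap:Body', {'soap': SOAP_NS})
    if soap_body is None or len(soap_body) == 0:
        raise ValueError('no SOAP body')
    # Layer 4: the first child of the Body is the RPC call and its arguments.
    call = soap_body[0]
    return method, path, call.tag, {arg.tag: arg.text for arg in call}


print(peel(build_request(PAYLOAD)))
```

    A mutation that breaks any outer layer (for instance a body edit that leaves Content-Length stale) is rejected before the inner layers are ever exercised, which is why each envelope needs its own model.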

    View full-size slide

  183. 77
    Future Challenge: Multilevel Envelopes
    Mimid

    View full-size slide

  184. 78
    Challenge: Semantic Envelopes
    example.c
    #include
    int main() {
        int number1, number2, number3;
        number1 = 10;
        number2 = 20;
        number3 = sum(number1, number2);
        if (number3 > 100) return 0;
        return 1;
    }
    $ cc example.c -o example
    1. Syntactically correct
    2. Variables declared before use
    3. Use correct types
    4. Statically conforming
    5. Dynamically conforming
    6. Model conforming
    (McKeeman 1998)
    (parse) (compile) (link) (run) (synthesis)

    View full-size slide

  187. 81
    Fuzzing
    Inputs
    Program
    Behavior

    View full-size slide

  190. 83
    We Found A Crash

    View full-size slide

  191. 83
    We Found A Crash

    View full-size slide

  192. Why Did My Program Crash?

    View full-size slide

  193. Why Did My Program Crash?
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.
    6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-
    -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(
    --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) +
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6
    - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--
    (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(-
    -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -
    +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-
    (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513)
    ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +((
    +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+
    (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 *
    +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (
    +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.
    37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) -
    (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))
    ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090

    View full-size slide

  194. Why Did My Program Crash?
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.
    6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-
    -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(
    --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) +
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6
    - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--
    (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(-
    -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -
    +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-
    (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513)
    ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +((
    +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+
    (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 *
    +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (
    +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.
    37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) -
    (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))
    ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090
    DD Minimized Input
    ((4))

    View full-size slide

  195. Why Did My Program Crash?
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.
    6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-
    -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(
    --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) +
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6
    - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--
    (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(-
    -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -
    +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-
    (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513)
    ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +((
    +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+
    (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 *
    +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (
    +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.
    37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) -
    (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))
    ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090
    DD Minimized Input
    ((4))
    00000 ?


  196. Why Did My Program Crash?
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.
    6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-
    -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(
    --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) +
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6
    - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--
    (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(-
    -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -
    +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-
    (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513)
    ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +((
    +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+
    (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 *
    +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (
    +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.
    37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) -
    (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))
    ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090
    DD Minimized Input
    ((4))
    00000 ?
    ((5)) ?


  197. Why Did My Program Crash?
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.
    6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-
    -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(
    --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) +
    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 -
    5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6
    - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--
    (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(-
    -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -
    +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-
    (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513)
    ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +((
    +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+
    (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 *
    +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (
    +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.
    37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) -
    (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))
    ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 *
    --++090
    DD Minimized Input
    ((4))
    00000 ?
    ((5)) ?
    (++5) ?
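
The "DD Minimized Input" above is the kind of result delta debugging produces. A minimal sketch, assuming a simplified ddmin variant that only tests complements, and a hypothetical stand-in predicate `fails` (not the program from the talk) that reports a failure whenever a number is wrapped in two pairs of parentheses:

```python
import re

def fails(text):
    # Hypothetical stand-in for the crashing program: it "fails" whenever
    # a number is wrapped in two pairs of parentheses, e.g. ((4)).
    return re.search(r"\(\(\d+\)\)", text) is not None

def ddmin(text, fails):
    """Shrink `text` while `fails(text)` keeps returning True."""
    assert fails(text)
    n = 2  # number of chunks the input is split into
    while len(text) >= 2:
        chunk = max(len(text) // n, 1)
        reduced = False
        for start in range(0, len(text), chunk):
            # Remove one chunk and keep the rest (the complement).
            candidate = text[:start] + text[start + chunk:]
            if fails(candidate):
                text = candidate
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(text):
                break  # no single character can be removed any more
            n = min(n * 2, len(text))
    return text

minimized = ddmin("8.2 - 27 * ((4)) / 5", fails)
print(minimized)
```

The result is minimal but still concrete: it does not say whether the digit, the nesting, or both matter, which is the question the following slides raise.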


  198. 85
    var A = class extends (class {}){};  (Issue 2937 from Closure)
    const [y,y] = [];  (Issue 386 from Rhino)
    var {baz:{} = baz => {}} = baz => {};  (Issue 385 from Rhino)
    {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}  (Issue 2842 from Closure)

  202. 85
    var A = class extends (class {}){};  (Issue 2937 from Closure)
    const [y,y] = [];  (Issue 386 from Rhino)
    var {baz:{} = baz => {}} = baz => {};  (Issue 385 from Rhino)
    {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}  (Issue 2842 from Closure)
    Delta Minimization is useful but not sufficient

  203. ( ( 4 ) )
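    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]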

  206. ( ( 4 ) )
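    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    1 * (2 - 3)
    ✓ Did not reproduce the failure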

  208. ( ( 4 ) )
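    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: one node marked c]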

  210. ( ( 4 ) )
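    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: one node marked c]
    1 + 3 + 4
    ✓ Did not reproduce the failure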

  211. ( ( 4 ) )
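    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: two nodes marked c]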

  212. 3 * 4
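    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: two nodes marked c]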

  213. 3 * 4
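    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: two nodes marked c]
    ✓ Did not reproduce the failure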

  214. ( ( 4 ) )
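    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: seven nodes marked c]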

  215. ( ( 1 - 2 ) )
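    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: seven nodes marked c]
    ( ( 1 - 2 ) )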

  216. ( ( 1 - 2 ) )
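    <start> := <expr>
    <expr> := <term> ' + ' <expr>
        | <term> ' - ' <expr>
        | <term>
    <term> := <factor> ' * ' <term>
        | <factor> ' / ' <term>
        | <factor>
    <factor> := '+' <factor>
        | '-' <factor>
        | '(' <expr> ')'
        | <integer> '.' <integer>
        | <integer>
    <integer> := <digit> <integer>
        | <digit>
    <digit> := [0-9]
    [derivation tree: seven nodes marked c]
    ( ( 1 - 2 ) )
    ✘ Reproduced the failure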

  217. ( ( 1 - 2 ) )
    [derivation tree: seven nodes marked c]
    ( ( 1 - 2 ) )

  219. ( ( 1 - 2 ) )
    [derivation tree: seven nodes marked c]
    ( ( 1 - 2 ) )
    ( ( 2 * 3 + 4 ) )

  221. ( ( 1 - 2 ) )
    [derivation tree: seven nodes marked c]
    ( ( 1 - 2 ) )
    ( ( 2 * 3 + 4 ) )
    ( ( - 2 / 1 ) )

  223. ( ( 1 - 2 ) )
    [derivation tree: seven nodes marked c]
    ( ( 1 - 2 ) )
    ( ( 2 * 3 + 4 ) )
    ( ( - 2 / 1 ) )
    ( ( 98 - 0 ) )

  225. ( ( 4 ) )
    [derivation tree of ( ( 4 ) ): seven nodes marked c, one marked A]

  227. ( ( 4 ) )
    [derivation tree of ( ( 4 ) ): seven nodes marked c, one marked A]
    Minimized Input: ( ( 4 ) )
    Abstract Failure Inducing Input: ( ( <expr> ) )
    def check(parsed):
        if parsed.is_nested() and parsed.child.is_nested():
            raise Exception()
        return parsed
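
The abstraction step walked through on the preceding slides can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the DDSET implementation: `GRAMMAR` is the arithmetic grammar from the earlier slides, `fails` is a hypothetical string-level stand-in for `check`, each node is tested on its own against the otherwise concrete tree, and the helper names (`generate`, `replace`, `abstract`, `wrap`) are made up for this example.

```python
import random

GRAMMAR = {
    "<start>": [["<expr>"]],
    "<expr>": [["<term>", " + ", "<expr>"], ["<term>", " - ", "<expr>"], ["<term>"]],
    "<term>": [["<factor>", " * ", "<term>"], ["<factor>", " / ", "<term>"], ["<factor>"]],
    "<factor>": [["+", "<factor>"], ["-", "<factor>"], ["(", "<expr>", ")"],
                 ["<integer>", ".", "<integer>"], ["<integer>"]],
    "<integer>": [["<digit>", "<integer>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}

def generate(symbol, depth):
    """Random string for `symbol`; past the depth budget, take the last rule."""
    if symbol not in GRAMMAR:
        return symbol
    rules = GRAMMAR[symbol]
    rule = random.choice(rules) if depth > 0 else rules[-1]
    return "".join(generate(s, depth - 1) for s in rule)

def fails(text):
    """Hypothetical failure: some parenthesis pair directly wraps another pair."""
    stack, close = [], {}
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            close[stack.pop()] = i
    return any(close.get(i + 1) == j - 1 for i, j in close.items())

def to_string(node):
    symbol, children = node
    # Leaves are terminals, or nonterminals that were marked abstract.
    return symbol if not children else "".join(to_string(c) for c in children)

def replace(node, path, new):
    """Copy of the tree with the subtree at `path` swapped for `new`."""
    if not path:
        return new
    symbol, children = node
    i = path[0]
    return (symbol, children[:i] + [replace(children[i], path[1:], new)] + children[i + 1:])

def abstract(tree, fails, trials=20):
    def visit(node, path):
        symbol, children = node
        if symbol not in GRAMMAR:
            return node  # terminals stay concrete
        # Abstract the node if every random replacement still fails.
        if all(fails(to_string(replace(tree, path, (generate(symbol, 3), []))))
               for _ in range(trials)):
            return (symbol, [])
        return (symbol, [visit(c, path + [i]) for i, c in enumerate(children)])
    return visit(tree, [])

def wrap(inner):
    # <expr> -> <term> -> <factor> -> '(' inner ')'
    return ("<expr>", [("<term>", [("<factor>", [("(", []), inner, (")", [])])])])

FOUR = ("<expr>", [("<term>", [("<factor>", [("<integer>", [("<digit>", [("4", [])])])])])])
TREE = ("<start>", [wrap(wrap(FOUR))])  # derivation tree of ((4))

random.seed(0)
pattern = to_string(abstract(TREE, fails))
print(pattern)
```

Only the innermost `<expr>` survives every replacement, so it is the one node that becomes abstract; the surrounding parentheses stay concrete.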


  228. var A = class extends (class {}){};
    Issue 2937 from Closure


  229. var A = class extends (class {}){};
    Issue 2937 from Closure
    = class extends (class {}){}


  230. var {baz:{} = baz => {}} = baz => {};
    Issue 385 from Rhino


  231. var {baz:{} = baz => {}} = baz => {};
    Issue 385 from Rhino
    var {<$Id1>:{} = <$Id1> => {}} ;


  232. const [y,y] = [];
    Issue 386 from Rhino


  233. const [y,y] = [];
    Issue 386 from Rhino
    const [<$Id1>,<$Id1>] = []


  234. {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
    Issue 2842 from Closure


  235. {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
    Issue 2842 from Closure
    {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}


  236. ( ( 4 ) )
    [derivation tree of ( ( 4 ) ): seven nodes marked c, one marked A]
    Minimized Input: ( ( 4 ) )
    Abstract Failure Inducing Input: ( ( <expr> ) )
    DDSET
    • Effectively abstracts a minimized input
    • The abstraction identifies where the problem lies
    • Decomposes complex program behaviors
    def check(parsed):
        if parsed.is_nested() and parsed.child.is_nested():
            raise Exception()
        return parsed
    Gopinath, Kampmann, Havrikov, Soremekun, and Zeller. Abstracting Failure Inducing Inputs. ISSTA 2020.
    ISSTA 2020 Distinguished Award

  237. 108
    :=
    := ' + '
    | ' + '
    | ' - '
    | ' - '
    |
    := ' * '
    | ' * '
    | ' / '
    | ' / '
    |
    := '+'
    | '-'
    | '(' ')'
    | '(' ')'
    :=
    :=
    := '(' ')'
    Specialized Grammar
    is (())


  238. 108
    :=
    := ' + '
    | ' + '
    | ' - '
    | ' - '
    |
    := ' * '
    | ' * '
    | ' / '
    | ' / '
    |
    := '+'
    | '-'
    | '(' ')'
    | '(' ')'
    :=
    :=
    := '(' ')'
    ((1)) + 2
    (23 * ((3)) - 34)
    (344- 4 + ((223)))
    (1) - 3 * 773 + (-22 + 1)
    1798 - 889 / ((333-1)) * 2 / 3 + 1
    34 + ((4)) -334 + (334 - (22) + 919 * 0 + 1
    98435747+ 88 + (((0))) + (1) - 1 * 7 / 4 * 889 - 2
    8 + ((8)) + --1 + 11223 / 344 - 39 + (1) - 456 + 134 / 45
    437 + 8 - 1 * ((9 + ((1))) - 1 + 99111948 + 3 --1 + (112) - 2 + 445) + 0
    74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * 223 - 1233 + 334672
    2 * ((9)) - (1798 - 889 / (333-1) * 2 / 3 + 100012 + 3434392 + 234 ----6 * 1798 - 889 / (33
    778 - (((1) - 3 * 773 + (-22 + 1) * (4545) - 23 - ((2)) * 773 + (-22 + 1) / 3434 + ---1 + 1 / 34343 + 112
    349 + (((1) - 3 * 3 + (-22 + 1) ((+ (-22 + 1) * (4545) - 23 - (2) * 773 + ((-22 + 1)) / 3434 + ---1 + 1 / 34343 + 1123
    8 + ((8)) + --1 + / 1 - 39 + (1) - 456 + 134 / 45 ))(((1) - 2334 + ((((1)) - 3 * 773 + (-22 + 1) * (2) - 23 - (2) * 773 + (-22 + 1) / 3
    74 + 3 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 45 ))(((1) - 3 * 773
    1+ 33+ 24343433 +23343 - ((74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 4

    Specialized Grammar
    is (())


  239. Algebra of Grammar Specializations


  240. is = class extends (class {}){} Closure 2937
    is var {<$Id2>:{} = <$Id2> => {}} ; Rhino 385
    is const [<$Id3>,<$Id3>] = [] Rhino 386
    is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}} Closure 2842
    where
    Algebra of Grammar Specializations



  244. Gopinath, Nemati, Zeller. Input Algebras. ICSE 2021.
    Mechanized proofs are available
    Algebra of Grammar Specializations


  245. Gopinath, Nemati, Zeller. Input Algebras. ICSE 2021.

    where
    is = class extends (class {}){}
    is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}
    is var {<$Id2>:{} = <$Id2> => {}} ;
    is const [<$Id3>,<$Id3>] = []
    Mechanized proofs are available
    Algebra of Grammar Specializations


  246. Science of Focused Fuzzing

    where is (())
    is / 0



  247. where is (())
    is / 0



  248. where is (())
    is / 0
    Isolating & Decomposing Program Behaviors



  249. where is (())
    is / 0
    Isolating & Decomposing Program Behaviors
    Algebra of Program Behaviors



  250. where is (())
    is / 0
    Isolating & Decomposing Program Behaviors
    Algebra of Program Behaviors
    Science of Program Behaviors


  251. 115
    insert into tbl values (1,2,3)
    select b from tbl
    drop table tbl
    Input Behavior
    Program
    Challenge: Identify Behavior Divergence


  252. 115
    insert into tbl values (1,2,3)
    select b from tbl
    drop table tbl
    update($file)
    read($file)
    rm($file)
    Input Behavior
    action='read'
    $action('tbl')
    Program
    assert invoked read: 'tbl.data' ✔
    Challenge: Identify Behavior Divergence


  253. 115
    insert into tbl values (1,2,3)
    select b from tbl
    drop table tbl
    update($file)
    read($file)
    rm($file)
    Input Behavior
    action='read'
    $action('tbl')
    Program
    assert invoked read: 'tbl.data' ✔
    action='rm'
    Challenge: Identify Behavior Divergence
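
One way to read this slide is as a behavior oracle: each input statement implies an expected file operation, and a divergence is any observed call that differs from it. A minimal sketch, assuming the statement-to-operation mapping listed on the slide and the `.data` file suffix from the assertion; the function names are hypothetical:

```python
# Expected file operation for each statement keyword, as listed on the slide.
EXPECTED_ACTION = {"insert": "update", "select": "read", "drop": "rm"}

def expected_call(statement):
    """Map a statement to the (operation, file) the program should invoke."""
    words = statement.split()
    # In the three example inputs the table name follows "into", "from" or "table".
    table = next(words[i + 1] for i, w in enumerate(words)
                 if w in ("into", "from", "table"))
    return (EXPECTED_ACTION[words[0]], table + ".data")

def diverges(statement, observed_call):
    """True when the observed call differs from the expected behavior."""
    return observed_call != expected_call(statement)

# The program ran with action='read' for the select statement: no divergence.
print(diverges("select b from tbl", ("read", "tbl.data")))
# The program ran with action='rm' for the same statement: divergence.
print(diverges("select b from tbl", ("rm", "tbl.data")))
```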


  255. def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene
    Challenge: Identify Behavior Divergence

  257. Challenge: Identify Behavior Divergence
    LockB()
    LockA()
    DoAB()
    UnlockB()
    UnlockA()


  259. Challenge: Identify Behavior Divergence
    LockB(); LockA(); DoAB(); UnlockB(); UnlockA()
    LockB(); LockA(); DoAB(); UnlockA(); UnlockB()

  260. Challenge: Identify Behavior Divergence
    LockB(); LockA(); DoAB(); UnlockB(); UnlockA()
    LockB(); LockA(); DoAB(); UnlockA(); UnlockB()
    UnlockA(); LockA(); DoAB(); LockB(); UnlockB()
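
Spotting where two such call sequences part ways can be sketched as a simple trace comparison. This is an illustrative assumption, not a method from the talk: the first sequence is treated as the reference, and `first_divergence` is a hypothetical helper.

```python
def first_divergence(reference, observed):
    """Return (index, reference_event, observed_event) at the first mismatch, or None."""
    for i in range(max(len(reference), len(observed))):
        ref = reference[i] if i < len(reference) else None
        obs = observed[i] if i < len(observed) else None
        if ref != obs:
            return (i, ref, obs)
    return None

# The three call sequences from the slide; the first is taken as the reference.
TRACE_1 = ["LockB", "LockA", "DoAB", "UnlockB", "UnlockA"]
TRACE_2 = ["LockB", "LockA", "DoAB", "UnlockA", "UnlockB"]
TRACE_3 = ["UnlockA", "LockA", "DoAB", "LockB", "UnlockB"]

print(first_divergence(TRACE_1, TRACE_2))
print(first_divergence(TRACE_1, TRACE_3))
```

A position-by-position comparison only says where traces differ, not whether the difference matters; deciding that needs a specification of the allowed orderings.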


  261. 118
    def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene

  263. 120
    :=
    := (a==b)
    | (a!=b)


  264. 121
    :=
    := (a==b)
    | (a!=b)
    := (b==c)
    | (b!=c)


  265. 122
    :=
    := (a==b)
    | (a!=b)
    := (b==c)
    | (b!=c)
    := "return Equilateral"
    := "return Isosceles"


  266. 123
    :=
    := (a==b)
    | (a!=b)
    := (b==c)
    | (b!=c)
    := "return Equilateral"
    := "return Isosceles"
    := (b==c)
    | (b!=c)
    := "return Isosceles"
    := (a==c)
    | (a!=c)
    := "return Isosceles"
    := "return Scalene"
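
The grammar above describes the paths through `triangle` as strings of branch decisions ending in a return statement. A minimal sketch of how one such string could be recorded for a concrete input, assuming hand-written instrumentation; `triangle_trace` and `decide` are hypothetical names, and the labels copy the grammar's terminals:

```python
def triangle_trace(a, b, c):
    """Run the triangle logic and return the branch decisions it took."""
    trace = []

    def decide(if_true, if_false, condition):
        # Record which side of the branch was taken, then pass the result on.
        trace.append(if_true if condition else if_false)
        return condition

    if decide("(a==b)", "(a!=b)", a == b):
        if decide("(b==c)", "(b!=c)", b == c):
            trace.append("return Equilateral")
        else:
            trace.append("return Isosceles")
    else:
        if decide("(b==c)", "(b!=c)", b == c):
            trace.append("return Isosceles")
        else:
            if decide("(a==c)", "(a!=c)", a == c):
                trace.append("return Isosceles")
            else:
                trace.append("return Scalene")
    return trace

print(" ".join(triangle_trace(1, 1, 1)))
print(" ".join(triangle_trace(2, 1, 2)))
```

Each recorded trace is one sentence of the behavior grammar, so two runs can be compared by comparing their traces.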


  267. LockB()
    LockA()
    DoAB()
    UnlockB()
    UnlockA()
    Challenge: Identify Behavior Divergence
    "lockA" "lockB" "UnlockB" "UnlockA"


  268. 126
    The Science of Inputs
    Program
    The Science of Behaviors


  271. 126
    The Science of Fuzzing
    The Science of Inputs
    Program
    The Science of Behaviors


  272. 128
    Oracles for Fuzzing
    Focused Fuzzing
    Automatic Repair
    Fault Localization
    Beyond Syntax
    Grammar Mining


  273. 128
    Oracles for Fuzzing
    Focused Fuzzing
    Automatic Repair
    Fault Localization
    Beyond Syntax
    Grammar Mining
    Onward


  274. Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    https://rahul.gopinath.org


  278. Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    Problem: How to Fuzz Parsers
    https://rahul.gopinath.org


  279. Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    Problem: How to Fuzz Parsers
    https://rahul.gopinath.org
    Generalize


  280. Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    Problem: How to Fuzz Parsers
    https://rahul.gopinath.org
    Generalize
    Combine


  281. Future Work:
    Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    Problem: How to Fuzz Parsers
    https://rahul.gopinath.org
    Generalize
    Combine
    The Science of Fuzzing


  282. Future Work:
    Program Under Test
    pFuzzer
    F1 Fuzzer
    Inputs
    Grammar Miner
    Problem: How to Fuzz Parsers
    https://rahul.gopinath.org
    Debugging
    Combine
    The Science of Fuzzing
