
Fuzzing -- From Alchemy to a Science

Talk at Passau University

Rahul Gopinath

June 30, 2021

Transcript

  2. 5 500 BC: Aṣṭādhyāyī, Dakṣiputra Pāṇini

    Ad hoc rules → Formal specification; Vedic Sanskrit → Classical Sanskrit
  4. 6 2500 years later... 2021 CE

    We have a crisis: the world is governed by software
  6. 7 Growth in software complexity

    Debian 5: ~70 million lines. Smart cars: ~100 million lines. Google: ~2 billion lines.
    [Figure: Linux kernel size in millions of lines of code (y-axis, 5 M to 35 M) by Linux release (x-axis, 1.0.0 through 5.7). Source: Wikipedia]
  7. 8 Growth in vulnerabilities

    [Figure: number of vulnerabilities (y-axis, 2k to 16k) per year (x-axis, 2001 to 2019). Source: NIST]
  8. 10 Fuzzing

    Random Inputs → Program → Automatic Checks
    • Memory Bounds Violation • Privilege Escalation • Safety Violations • Metamorphic Relations • Differential Execution
  9. 11 Random Fuzzing $ ./fuzz [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu
 2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z
 h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!
 AxB"YXRS@!Kd6;wtAMefFWM(`|J_<1~o}


    z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?
 lR=bF3+;y$3lodQ<B89!5"W2fK*vE7v{'
 )KC-i,c{<[~m!]o;{.'}Gj\(X}EtYetrp
 bY@aGZ1{P!AZU7x#4(Rtn!q4nCwqol^y6
 }0|Ko=*JK~;zMKV=9Nai:wxu{J&UV#HaU
 )*BiC<),`+t*gka<W=Z.%T5WGHZpI30D<
 Pq>&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM
 PG-FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2
 D|vBy!^zkhdf3C5PAkR?V((-%><hn|3='
 i2Qx]D$qs4O`1@fevnG'2\11Vf3piU37@
 5:dfd45*(7^%5ap\zIyl"'f,$ee,J4Gw:
 cgNKLie3nx9(`efSlg6#[K"@WjhZ}r[Sc
 un&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/6N
 -wyzj/MTd#A;r*(ds./df3r8Odaf?/<#r Program ✘
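A minimal sketch of a random fuzzer like the one on this slide. The slide does not say how `./fuzz` is implemented, so the helper names `fuzz` and `run`, the printable-ASCII character range, and the "any exception counts as a failure" check are all assumptions made for illustration.

```python
import random

def fuzz(max_length=100, char_start=32, char_range=95, rng=random):
    """Return a random string of up to max_length printable ASCII characters."""
    length = rng.randrange(0, max_length + 1)
    return ''.join(chr(rng.randrange(char_start, char_start + char_range))
                   for _ in range(length))

def run(program, trials=100, rng=random):
    """Feed random inputs to program; collect the inputs that raise an exception."""
    failures = []
    for _ in range(trials):
        data = fuzz(rng=rng)
        try:
            program(data)
        except Exception as exc:   # the only automatic check in this sketch
            failures.append((data, exc))
    return failures
```

Running `run(int)` typically reports almost every input as a failure, because almost no random string is a valid integer literal. Programs that expect structured input behave the same way, which is the weakness the following slides build on.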
  11. 12 Random Fuzzing

    $ ./fuzz -int | program
    634111569742810193727424069509
    741355925061499451162464719526
    615957331924826555590537407605
    181400079803446874252046374716
    740973770255348279425601333144
    152724057932073828569041216191
    099859446496509919024810271242
    622974988671421938012464630138
    735355134599327240920259675263
    574528613057084231370741920902
    794677842164654990353575580453
    777282305855352378119038096476
    699871306655084953377039862387
    924957554389878352934547664240
    082431556093837288597262675598
    630851919061829885048834738832
    677022429414980917053939970795
    722006987916088650168665471731
    yes

        from sys import stdin

        def is_prime(n: int) -> bool:
            """Primality test using 6k+-1 optimization."""
            if n <= 3:
                return n > 1
            if n % 2 == 0 or n % 3 == 0:
                return False
            i = 5
            while i ** 2 <= n:
                if n % i == 0 or n % (i + 2) == 0:
                    return False
                i += 6
            return True

        def main():
            # read one integer per input line (the slide passed the raw string)
            for line in stdin:
                num = int(line)
                print(num, is_prime(num))

    Weak point: unequal divisions in input space
    Input space: n > 3 versus n <= 3
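The weak point can be made concrete with a small experiment. This sketch assumes inputs are drawn uniformly from all integers below 10**30 (matching the 30-digit numbers on the slide). The helper name `branch_hits` is hypothetical.

```python
import random

def branch_hits(trials=10_000, digits=30, rng=random):
    """Count how often a uniformly random integer below 10**digits
    takes the `n <= 3` branch versus the `n > 3` branch of is_prime."""
    small = large = 0
    for _ in range(trials):
        n = rng.randrange(10 ** digits)
        if n <= 3:
            small += 1
        else:
            large += 1
    return small, large

# Chance that a single sample satisfies n <= 3 (only 0, 1, 2 and 3 qualify).
p_small = 4 / 10 ** 30
```

Under these assumptions the `n <= 3` partition is practically never sampled, so the `return n > 1` line stays untested however long the fuzzer runs.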
  13. 14 Advanced Fuzzing: Instrumentation

    Original:

        def triangle(a, b, c):
            if a == b:
                if b == c:
                    return 'Equilateral'
                else:
                    return 'Isosceles'
            else:
                if b == c:
                    return 'Isosceles'
                else:
                    if a == c:
                        return 'Isosceles'
                    else:
                        return 'Scalene'

    Instrumented with probes:

        def triangle(a, b, c):
            __probe_enter()
            if a == b:
                __probe_1()
                if b == c:
                    __probe_2()
                    return 'Equilateral'
                else:
                    __probe_3()
                    return 'Isosceles'
            else:
                __probe_4()
                if b == c:
                    __probe_5()
                    return 'Isosceles'
                else:
                    __probe_6()
                    if a == c:
                        __probe_7()
                        return 'Isosceles'
                    else:
                        __probe_8()
                        return 'Scalene'
  15. 15 Instrumentation Based Fuzzing: triangle(1,1,1)

        def triangle(a, b, c):
            if a == b:
                if b == c:
                    return 'Equilateral'
                else:
                    return 'Isosceles'
            else:
                if b == c:
                    return 'Isosceles'
                else:
                    if a == c:
                        return 'Isosceles'
                    else:
                        return 'Scalene'

    triangle(1,1,1) takes the a == b and b == c branches and returns Equilateral.
  17. 16 Instrumentation Based Fuzzing: triangle(1,1,2)

    Same function; triangle(1,1,2) takes the a == b branch, fails b == c, and returns Isosceles.
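A small sketch of how the probes could record coverage. The slides show only the probe calls, so the `covered` set and the `probe` and `coverage_of` helpers are assumptions. The slide's `__probe_N()` calls are written here as `probe(N)`.

```python
covered = set()

def probe(name):
    """Record that the branch labelled `name` was executed."""
    covered.add(name)

def triangle(a, b, c):
    probe('enter')
    if a == b:
        probe(1)
        if b == c:
            probe(2)
            return 'Equilateral'
        else:
            probe(3)
            return 'Isosceles'
    else:
        probe(4)
        if b == c:
            probe(5)
            return 'Isosceles'
        else:
            probe(6)
            if a == c:
                probe(7)
                return 'Isosceles'
            else:
                probe(8)
                return 'Scalene'

def coverage_of(*args):
    """Run triangle on args and return the set of probes that fired."""
    covered.clear()
    triangle(*args)
    return set(covered)
```

With this, `coverage_of(1, 1, 1)` yields `{'enter', 1, 2}` and `coverage_of(1, 1, 2)` yields `{'enter', 1, 3}`, so the second input is recognisably new to the fuzzer even though both return without error.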
  18. 19 Coverage Guided Fuzzing

    • Randomly generate inputs • Choose seeds with new coverage
    AFL: first release 2013. AFL Trophy Case.
  19. 20 Coverage Guided Fuzzing

    • Randomly generate inputs • Choose seeds with new coverage
    AFL: Pulling JPEGs out of thin air, https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
    Valid JPEG in 6 hours on an 8-core machine!
  21. 21 Coverage Guided Fuzzing

    • Randomly generate inputs • Choose seeds with new coverage
    AFL

        static int is_reserved_word_token(const char *s, int len) {
          const char *reserved[] = {
              "break", "case", "catch", "continue", "debugger", "default",
              "delete", "do", "else", "false", "finally", "for", "function",
              "if", "in", "instanceof", "new", "null", "return", "switch",
              "this", "throw", "true", "try", "typeof", "var", "void",
              "while", "with", "let", "undefined", ((void *)0)};
          int i;
          if (!mjs_is_alpha(s[0])) return 0;
          for (i = 0; reserved[i] != ((void *)0); i++) {
            if (len == (int)strlen(reserved[i]) &&
                strncmp(s, reserved[i], len) == 0)
              return i + 1;
          }
          return 0;
        }

    Weak point: Magic bytes
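The two bullets (randomly generate inputs, keep seeds with new coverage) can be sketched as a short loop. This is not AFL's implementation: line coverage via `sys.settrace`, the single-character `mutate` operator, and all helper names are assumptions chosen to keep the example small.

```python
import random
import sys

def coverage(fn, arg):
    """Run fn(arg) and return the set of (function name, line number) pairs executed."""
    seen = set()
    def tracer(frame, event, _arg):
        if event == 'line':
            seen.add((frame.f_code.co_name, frame.f_lineno))
        return tracer
    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        fn(arg)
    except Exception:
        pass  # a real fuzzer would report the crash; this sketch ignores it
    finally:
        sys.settrace(old)
    return frozenset(seen)

def mutate(s, rng):
    """Apply one random edit: insert, delete, or replace a character."""
    chars = list(s)
    pos = rng.randrange(len(chars) + 1)
    op = rng.choice(('insert', 'delete', 'replace'))
    new = chr(rng.randrange(32, 127))
    if op == 'insert' or not chars:
        chars.insert(pos, new)
    elif op == 'delete':
        del chars[min(pos, len(chars) - 1)]
    else:
        chars[min(pos, len(chars) - 1)] = new
    return ''.join(chars)

def coverage_guided_fuzz(fn, seeds, trials=1000, rng=random):
    """Keep a mutated input only if it executes lines no earlier input reached.
    `seeds` must be a non-empty list of strings."""
    population = list(seeds)
    seen = set()
    for s in population:
        seen |= coverage(fn, s)
    for _ in range(trials):
        candidate = mutate(rng.choice(population), rng)
        cov = coverage(fn, candidate)
        if not cov <= seen:
            seen |= cov
            population.append(candidate)
    return population
```

The magic-bytes weakness is visible in this sketch too: a check like `strncmp(s, "continue", len) == 0` is a single branch, so a partially matching string such as `conti` produces no new coverage and is not kept as a seed.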
  24. 23 Solver Directed Fuzzing • Collect path constraints • Solve

    negated constraints for new inputs (a == b) (b == c) (b != c) triangle(1,2,1)
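A sketch of the two steps (collect path constraints, solve negated constraints) for the triangle example. A brute-force search over a tiny domain stands in for a real constraint solver here. `CONDITIONS`, `triangle_path`, `solve` and `explore` are hypothetical names introduced for illustration.

```python
from itertools import product

# The branch conditions of triangle(a, b, c), keyed by their source text.
CONDITIONS = {
    'a == b': lambda a, b, c: a == b,
    'b == c': lambda a, b, c: b == c,
    'a == c': lambda a, b, c: a == c,
}

def triangle_path(a, b, c):
    """Return the (condition, outcome) pairs evaluated along the executed path."""
    path = [('a == b', a == b), ('b == c', b == c)]
    if a != b and b != c:
        path.append(('a == c', a == c))   # only reached on the innermost branch
    return path

def solve(constraints, domain=range(1, 4)):
    """Brute-force stand-in for a solver: first (a, b, c) meeting all constraints."""
    for a, b, c in product(domain, repeat=3):
        if all(CONDITIONS[text](a, b, c) == wanted for text, wanted in constraints):
            return (a, b, c)
    return None

def explore(seed):
    """Negate each branch outcome along the seed's path and solve for a new input."""
    path = triangle_path(*seed)
    new_inputs = []
    for i, (text, outcome) in enumerate(path):
        flipped = path[:i] + [(text, not outcome)]
        solution = solve(flipped)
        if solution is not None:
            new_inputs.append(solution)
    return new_inputs
```

Starting from `(1, 1, 1)`, whose path is `(a == b)`, `(b == c)`, this sketch yields `(1, 2, 1)` by negating the first condition and `(1, 1, 2)` by keeping the first and negating the second.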
  27. 24 Solver Directed Fuzzing

    • Collect path constraints • Solve constraints for new inputs

        void next_sym() {
          while (1) {
            switch (ch) {
              case '{': next_ch(); sym = LBRA; return;
              case '}': next_ch(); sym = RBRA; return;
              case '(': next_ch(); sym = LPAR; return;
              case ')': next_ch(); sym = RPAR; return;
              case '+': next_ch(); sym = PLUS; return;
              case '-': next_ch(); sym = MINUS; return;
              case '<': next_ch(); sym = LESS; return;
              case ';': next_ch(); sym = SEMI; return;
              case '=': next_ch(); sym = EQUAL; return;
              default:
                if (ch >= '0' && ch <= '9') {
                  int_val = 0; /* missing overflow check */
                  while (ch >= '0' && ch <= '9') {
                    int_val = int_val*10 + (ch - '0');
                    next_ch();
                  }
                  sym = INT;
                } else if (ch >= 'a' && ch <= 'z') {
                  int i = 0; /* missing overflow check */
                  while ((ch >= 'a' && ch <= 'z') || ch == '_') {
                    id_name[i++] = ch;
                    next_ch();
                  }
                  id_name[i] = '\0';
                  sym = 0;
                  while (words[sym] != NULL && strcmp(words[sym], id_name) != 0)
                    sym++;
                  if (words[sym] == NULL)
                    if (id_name[1] == '\0') sym = ID;
                    else syntax_error();
                } else syntax_error();
                return;
            }
          }
        }

    Weak point: Path explosion
  29. 26 Fuzzing Parsers $ ./fuzz [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu
 2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z
 h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!
 AxB"YXRS@!Kd6;wtAMefFWM(`|J_<1~o}


    z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?
 lR=bF3+;y$3lodQ<B89!5"W2fK*vE7v{'
 )KC-i,c{<[~m!]o;{.'}Gj\(X}EtYetrp
 bY@aGZ1{P!AZU7x#4(Rtn!q4nCwqol^y6
 }0|Ko=*JK~;zMKV=9Nai:wxu{J&UV#HaU
 )*BiC<),`+t*gka<W=Z.%T5WGHZpI30D<
 Pq>&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM
 PG-FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2
 D|vBy!^zkhdf3C5PAkR?V((-%><hn|3='
 i2Qx]D$qs4O`1@fevnG'2\11Vf3piU37@
 5:dfd45*(7^%5ap\zIyl"'f,$ee,J4Gw:
 cgNKLie3nx9(`efSlg6#[K"@WjhZ}r[Sc
 un&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/6N
 -wyzj/MTd#A;r*(ds./df3r8Odaf?/<#r Parser Syntax Error Interpreter #
  32. 34 Formal Languages Formal Language Descriptions 3. Regular Context Free

    Recursively Enumerable (Chomsky,1956) Easy to produce and parse Argument Stack Return Stack
  36. 35 Grammar

    Arithmetic expression grammar:

        <start>   := <expr>
        <expr>    := <expr> '+' <expr>
                   | <expr> '-' <expr>
                   | <expr> '/' <expr>
                   | <expr> '*' <expr>
                   | '(' <expr> ')'
                   | <number>
        <number>  := <integer>
                   | <integer> '.' <integer>
        <integer> := <digit> <integer>
                   | <digit>
        <digit>   := [0-9]

    Annotations: the <expr> to the left of := is the <expr> key; the alternatives to its right are the definition for <expr>.
  40. 36 Grammar

    Same arithmetic expression grammar, annotated: each alternative (for example <expr> '+' <expr>) is an expansion rule; quoted strings such as '+' are terminal symbols; names in angle brackets such as <expr> are nonterminal symbols.
  43. 37 Grammars as recognizers

    Input: (8 / 3) * 49
    Recognized against the arithmetic expression grammar (<start> := <expr>, ...) shown on the preceding Grammar slides.
  46. 38 Grammars as producers (Hanford 1970) (Purdom 1972)

    Arithmetic expression grammar (<start> := <expr>, ...) as on the preceding Grammar slides.
    Sample generated input (tiled across the slide):

    8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))) / 5 * --++090
  48. 39 Grammars as effective producers

    The same sample generated input, fed to Parser → Interpreter (marks on slide: ✘ ✔)
  49. 41 Reference Specification?

    • The standard spec • Buggy Implementation • "Extra" Features • "Accepted" Bugs
    Postel's Law (the cause of trouble): "Be liberal in what you accept, and conservative in what you send"
  50. 42 JSON grammar (https://www.json.org)

        object
            { }
            { members }
        members
            pair
            pair , members
        pair
            string : value
        array
            [ ]
            [ elements ]
        elements
            value
            value , elements
        value
            string
            number
            object
            array
            true
            false
            null
        string
            " "
            " chars "
        chars
            char
            char chars
        char
            UNICODE \ [",\,CTRL]
            \" \\ \/ \b \f \n \r \t
            \u hex hex hex hex
        number
            int
            int frac
            int exp
            int frac exp
        int
            digit
            onenine digits
            - digit
            - onenine digits
        frac
            . digits
        exp
            e digits
        hex
            digit
            A - F
            a - f
        digits
            digit
            digit digits
        e
            e e+ e- E E+ E-
  51. 43 Parsing JSON is a Minefield (http://seriot.ch/)

    [Figure: matrix of JSON parser test results. Legend: Expected; Parse Fail (Expect Success); Parse Success (Expect Fail); Parse Success (Undefined); Parse Fail (Undefined); Parser Crash; Timeout]
  53. 46 How to Extract This Grammar? • Inputs -> Dynamic

    Control Dependence Trees • DCD Trees -> Context Free Grammar
  56. 47 Control Dependence Graph

    Statement B is control dependent on A if A determines whether B executes.

        def parse_csv(s,i):
            while s[i:]:
                if is_digit(s[i]):
                    n,j = num(s[i:])
                    i = i+j
                else:
                    comma(s[i])
                    i += 1

    CDG for parse_csv: while: determines whether if: executes
  58. 48 Dynamic Control Dependence Tree

    Each statement execution is represented as a separate node.
    [Figure: CDG for parse_csv (code as above) next to the DCD tree for a call to parse_csv()]
  60. 49 DCD Tree ~ Parse Tree

    • No tracking beyond input buffer
    • Characters are attached to nodes where they are accessed last
    [Figure: DCD tree of parse_csv (code as above) for the input "12,", with leaves '1', '2', ',']
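The rule "characters are attached to nodes where they are accessed last" can be sketched with explicit instrumentation. The slides do not show how the tracking is done, so `TrackedInput`, its `node` context manager, the bodies of `num` and `comma`, and the node labels are assumptions. The parser is a simplified, index-based variant of `parse_csv`.

```python
from contextlib import contextmanager

class TrackedInput:
    """Input buffer that remembers, per character index, the control-flow
    path (stack of node names) that accessed it last."""
    def __init__(self, text):
        self.text = text
        self.stack = []
        self.last_access = {}

    @contextmanager
    def node(self, name):
        self.stack.append(name)
        try:
            yield
        finally:
            self.stack.pop()

    def at(self, i):
        self.last_access[i] = tuple(self.stack)
        return self.text[i]

def is_digit(t, i):
    return t.at(i) in '0123456789'

def num(t, i):
    with t.node('num'):
        while i < len(t.text) and is_digit(t, i):
            i += 1
        return i

def comma(t, i):
    with t.node('comma'):
        assert t.at(i) == ','

def parse_csv(t, i=0):
    with t.node('parse_csv'):
        iteration = 0
        while i < len(t.text):
            iteration += 1
            # each loop iteration becomes its own node, as in a DCD tree
            with t.node(f'while:{iteration}'):
                if is_digit(t, i):
                    with t.node('if'):
                        i = num(t, i)
                else:
                    with t.node('else'):
                        comma(t, i)
                        i += 1

def attach_characters(text):
    """Parse text and return (character, path of last-accessing node) pairs."""
    t = TrackedInput(text)
    parse_csv(t)
    return [(t.text[i], '/'.join(t.last_access[i])) for i in sorted(t.last_access)]
```

For the input `"12,"`, both digits end up under `parse_csv/while:1/if/num`, and the comma under `parse_csv/while:2/else/comma`. The comma is also inspected inside `num`, but only its last access counts.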
  61. 50 Parse tree for parse_expr('9+3/4')

        class Ex(Exception):
            """Parse error (not defined on the slide)."""

        def is_digit(i):
            return i in '0123456789'

        def parse_num(s,i):
            n = ''
            while s[i:] and is_digit(s[i]):
                n += s[i]
                i = i +1
            return i,n

        def parse_paren(s, i):
            assert s[i] == '('
            i, v = parse_expr(s, i+1)
            if s[i:] == '':
                raise Ex(s, i)
            assert s[i] == ')'
            return i+1, v

        def parse_expr(s, i = 0):
            # is_op is True while an operand is expected next
            expr, is_op = [], True
            while s[i:]:
                c = s[i]
                if is_digit(c):
                    if not is_op: raise Ex(s,i)
                    i,num = parse_num(s,i)
                    expr.append(num)
                    is_op = False
                elif c in ['+', '-', '*', '/']:
                    if is_op: raise Ex(s,i)
                    expr.append(c)
                    is_op, i = True, i + 1
                elif c == '(':
                    if not is_op: raise Ex(s,i)
                    i, cexpr = parse_paren(s, i)
                    expr.append(cexpr)
                    is_op = False
                elif c == ')':
                    break
                else:
                    raise Ex(s,i)
            if is_op:
                raise Ex(s,i)
            return i, expr

    [Figure: parse tree for the input 9+3/4]
  63. 51 Identifying Compatible Nodes

    Same parser and input (9+3/4): which nodes correspond to the same nonterminal?
  65. 57

        <parse_expr>  := <while 1:1> <while 1:0> <while 1:1>
                       | <while 1:1> <while 1:0> <while 1:1> <while 1:0> <while 1:1> <while 1:0> <while 1:1>
                       | <while 1:1> <while 1:0> <while 1:1> <while 1:0> <while 1:1>
                       | <while 1:1>
        <while 1:1>   := <if 1:1>
        <if 1:1>      := <parse_num> | <parse_paren>
        <parse_num>   := <is_digit>
        <is_digit>    := '3' | '1'
        <parse_paren> := '(' <parse_expr> ')'
        <while 1:0>   := <if 1:0>
        <if 1:0>      := '*'
  66. 57

        <parse_expr> := <while_s>
        <while_s>    := <while_1:1> <while_1:0> <while_s>
                      | <while_1:1>

    (shown above the grammar of the previous slide)
  67. 58 calc.py: Recovered Arithmetic Grammar

    Source: the calc.py parser (is_digit, parse_num, parse_paren, parse_expr) shown on slide 50.

        <START>            := <parse_expr.0-0-c>
        <parse_expr.0-0-c> := <parse_expr.0-1-s><parse_expr.0>
                            | <parse_expr.0>
        <parse_expr.0-1-s> := <parse_expr.0><parse_expr.0-2>
                            | <parse_expr.0><parse_expr.0-2><parse_expr.0-1-s>
        <parse_expr.0>     := '(' <parse_expr.0-0-c> ')'
                            | <parse_num.0-1-s>
        <parse_expr.0-2>   := '*' | '+' | '-' | '/'
        <parse_num.0-1-s>  := <is_digit.0-0-c>
                            | <is_digit.0-0-c><parse_num.0-1-s>
        <is_digit.0-0-c>   := [0-9]
  69. 59 The recovered grammar (as above), shown over the sample generated input from slide 38 (8.2 - 27 - -9 / +((+9 * --2 ...)
  71. 61 microjson.py: Recovered JSON grammar

        <START>         ::= <json_raw>
        <json_raw>      ::= '"' <json_string'>
                          | '[' <json_list'>
                          | '{' <json_dict'>
                          | <json_number'>
                          | 'true' | 'false' | 'null'
        <json_number'>  ::= <json_number>+
                          | <json_number>+ 'e' <json_number>+
        <json_number>   ::= '+' | '-' | '.' | [0-9] | 'E' | 'e'
        <json_string'>  ::= <json_string>* '"'
        <json_list'>    ::= ']'
                          | <json_raw> (','<json_raw>)* ']'
                          | ( ',' <json_raw>)+ (',' <json_raw>)* ']'
        <json_dict'>    ::= '}'
                          | ( '"' <json_string'> ':' <json_raw> ',' )* '"'<json_string'> ':' <json_raw> '}'
        <json_string>   ::= ' ' | '!' | '#' | '$' | '%' | '&' | ''' | '*' | '+' | '-' | ',' | '.' | '/' | ':' | ';' | '<' | '=' | '>' | '?' | '@' | '[' | ']' | '^' | '_', ''',| '{' | '|' | '}' | '~' | '[A-Za-z0-9]' | '\' <decode_escape>
        <decode_escape> ::= '"' | '/' | 'b' | 'f' | 'n' | 'r' | 't'

    microjson.py (excerpt, begins mid-function):

                    stm.next()
                    if expect_key:
                        raise JSONError(E_DKEY, stm, stm.pos)
                    if c == '}':
                        return result
                    expect_key = 1
                    continue
                # parse out a key/value pair
                elif c == '"':
                    key = _from_json_string(stm)
                    stm.skipspaces()
                    c = stm.next()
                    if c != ':':
                        raise JSONError(E_COLON, stm, stm.pos)
                    stm.skipspaces()
                    val = _from_json_raw(stm)
                    result[key] = val
                    expect_key = 0
                    continue
                raise JSONError(E_MALF, stm, stm.pos)

        def _from_json_raw(stm):
            while True:
                stm.skipspaces()
                c = stm.peek()
                if c == '"':
                    return _from_json_string(stm)
                elif c == '{':
                    return _from_json_dict(stm)
                elif c == '[':
                    return _from_json_list(stm)
                elif c == 't':
                    return _from_json_fixed(stm, 'true', True, E_BOOL)
                elif c == 'f':
                    return _from_json_fixed(stm, 'false', False, E_BOOL)
                elif c == 'n':
                    return _from_json_fixed(stm, 'null', None, E_NULL)
                elif c in NUMSTART:
                    return _from_json_number(stm)
                raise JSONError(E_MALF, stm, stm.pos)

        def from_json(data):
            stm = JSONStream(data)
            return _from_json_raw(stm)

    Mimid: Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. ESEC/FSE 2020.
    • Javascript • C • Lisp • JSON • URL • CGI
  72. 63

  73. 67 Sample Free Generators A ( 2 - B 9

    ) 4 ) A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0 B ∉ +,-,1,2,3,4,5,6,7,8,9,0,) ) ∉ +,-,1,2,3,4,5,6,7,8,9,0
  75. 67 Sample Free Generators A ( 2 - B 9

    ) 4 ) A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0 B ∉ +,-,1,2,3,4,5,6,7,8,9,0,) ) ∉ +,-,1,2,3,4,5,6,7,8,9,0 (2-94)
  76. Sample Free Generators A ( 2 - B 9 ) 4 )

    • Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser Directed Fuzzing. PLDI 2019.
    • Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
    • Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSRE 2021 (submitted).
  80. Fast Fuzzing with Grammars

    <start> := <expr>
    <expr> := <expr> '+' <expr> | <expr> '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number>
    <number> := <integer> | <integer> '.' <integer>
    <integer> := <digit> <integer> | <digit>
    <digit> := [0-9]

    fuzz(expr_grammar, '<start>')

    def fuzz(grammar, key):
        return collapse(gen_key(grammar, key))

    def gen_key(grammar, key):
        if is_terminal_symbol(key):
            return (key, [])
        rule = random.choice(grammar[key])
        return (key, gen_rule(grammar, rule))

    def gen_rule(grammar, rule):
        return [gen_key(grammar, token) for token in rule]
  84. Fast Fuzzing with Grammars

    fuzz(expr_grammar, '<start>')

    Derivation tree: <start> expands to <expr>, which expands to <expr> + <expr>. Each inner <expr> expands through <number>, <integer>, and <digit> to a single digit: 1 on the left, 8 on the right.

    Collapsed output: "1 + 8"

    def fuzz(grammar, key):
        return collapse(gen_key(grammar, key))

    def gen_key(grammar, key):
        if is_terminal_symbol(key):
            return (key, [])
        rule = random.choice(grammar[key])
        return (key, gen_rule(grammar, rule))

    def gen_rule(grammar, rule):
        return [gen_key(grammar, token) for token in rule]

    def collapse(tree):
        key, children = tree
        if not children:
            return key
        return ''.join([collapse(c) for c in children])
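The fuzzer on these slides can be sketched as a runnable program. The version below is a minimal sketch, not the code from the talk: the dict encoding of the grammar, the `max_depth` budget, and the helpers `rule_depth` and `min_depths` are assumptions added here. The budget is needed because four of the six `<expr>` rules create two new `<expr>` nodes, so unbounded random expansion may never terminate.

```python
import random

# The expression grammar from the slides as a dict (encoding is an assumption):
# nonterminal -> list of rules; each rule is a list of tokens.
EXPR_GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<expr>', '+', '<expr>'], ['<expr>', '-', '<expr>'],
               ['<expr>', '/', '<expr>'], ['<expr>', '*', '<expr>'],
               ['(', '<expr>', ')'], ['<number>']],
    '<number>': [['<integer>'], ['<integer>', '.', '<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[str(d)] for d in range(10)],
}

def rule_depth(rule, depth):
    """Depth of the shallowest tree this rule can produce."""
    return 1 + max((depth[t] for t in rule if t in depth), default=0)

def min_depths(grammar):
    """Fixpoint computation of the minimal derivation depth per nonterminal."""
    depth = {key: float('inf') for key in grammar}
    changed = True
    while changed:
        changed = False
        for key, rules in grammar.items():
            best = min(rule_depth(rule, depth) for rule in rules)
            if best < depth[key]:
                depth[key], changed = best, True
    return depth

def gen_key(grammar, key, budget, depth):
    if key not in grammar:              # terminal symbol: a leaf node
        return (key, [])
    rules = grammar[key]
    if budget <= 0:                     # budget spent: keep only the shallowest rules
        cheapest = min(rule_depth(r, depth) for r in rules)
        rules = [r for r in rules if rule_depth(r, depth) == cheapest]
    rule = random.choice(rules)
    return (key, [gen_key(grammar, t, budget - 1, depth) for t in rule])

def collapse(tree):
    """Concatenate the leaves of a derivation tree into a string."""
    key, children = tree
    if not children:
        return key
    return ''.join(collapse(c) for c in children)

def fuzz(grammar, key='<start>', max_depth=8):
    return collapse(gen_key(grammar, key, max_depth, min_depths(grammar)))

if __name__ == '__main__':
    for _ in range(5):
        print(fuzz(EXPR_GRAMMAR))
```

Once the budget is used up, every chosen rule strictly reduces the remaining minimal depth, which is what guarantees termination.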
  85. <start> := <expr> <expr> := <expr> '+' <expr> | <expr>

    '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number> <number> := <integer> | <integer> '.' <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] Compiling the Grammar
  86. <start> := <expr> <expr> := <expr> '+' <expr> | <expr>

    '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number> <number> := <integer> | <integer> '.' <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] Compiling the Grammar def start(): expr() def expr(): match (random() % 6): case 0: expr(); print('+'); expr() case 1: expr(); print('-'); expr() case 2: expr(); print('/'); expr() case 3: expr(); print('*'); expr() case 4: print('('); expr(); print(')') case 5: number() def number(): match (random() % 2): case 0: integer() case 1: integer(); print('.'); integer() def integer(): match (random() % 2): case 0: digit(); integer() case 1: digit() def digit(): match (random() % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7')
  88. <start> := <expr> <expr> := <expr> '+' <expr> | <expr>

    '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number> <number> := <integer> | <integer> '.' <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] Compiling the Grammar def start(): expr() def expr(): match (random() % 6): case 0: expr(); print('+'); expr() case 1: expr(); print('-'); expr() case 2: expr(); print('/'); expr() case 3: expr(); print('*'); expr() case 4: print('('); expr(); print(')') case 5: number() def number(): match (random() % 2): case 0: integer() case 1: integer(); print('.'); integer() def integer(): match (random() % 2): case 0: digit(); integer() case 1: digit() def digit(): match (random() % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7') def start(rops): expr(rops) def expr(rops): match (rops.next % 6): case 0: expr(rops); print('+'); e case 1: expr(rops); print('-'); e case 2: expr(rops); print('/'); e case 3: expr(rops); print('*'); e case 4: print('('); expr(rops); p case 5: number(rops) def number(rops): match (rops.next % 2): case 0: integer(rops) case 1: integer(rops); print('.') def integer(rops): match (rops.next % 2): case 0: digit(rops); integer(rops case 1: digit(rops) def digit(rops): match (rops.next % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7')
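Compiling the grammar can itself be automated. The following is a minimal sketch of one way to do it, assuming a dict-encoded grammar: it emits Python source with one function per nonterminal and loads it with `exec`. The names `compile_grammar`, `fn_name`, the `gen_` prefix, and the `max_depth` guard are hypothetical and do not come from the slides, which show no depth limit. The generated functions append to a list instead of printing, so the output can be checked.

```python
import random

GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<expr>', '+', '<expr>'], ['<expr>', '-', '<expr>'],
               ['<expr>', '/', '<expr>'], ['<expr>', '*', '<expr>'],
               ['(', '<expr>', ')'], ['<number>']],
    '<number>': [['<integer>'], ['<integer>', '.', '<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[str(d)] for d in range(10)],
}

def rule_cost(rule, depth):
    """Depth of the shallowest tree this rule can produce."""
    return 1 + max((depth[t] for t in rule if t in depth), default=0)

def min_depths(grammar):
    """Minimal derivation depth per nonterminal, computed as a fixpoint."""
    depth = {key: float('inf') for key in grammar}
    changed = True
    while changed:
        changed = False
        for key, rules in grammar.items():
            best = min(rule_cost(rule, depth) for rule in rules)
            if best < depth[key]:
                depth[key], changed = best, True
    return depth

def fn_name(key):
    return 'gen_' + key.strip('<>')

def compile_grammar(grammar, max_depth=10):
    """Return Python source with one generator function per nonterminal."""
    depth = min_depths(grammar)
    lines = ['import random', '']
    for key, rules in grammar.items():
        costs = [rule_cost(rule, depth) for rule in rules]
        cheapest = costs.index(min(costs))   # fallback rule past max_depth
        lines.append(f'def {fn_name(key)}(out, depth=0):')
        lines.append(f'    choice = random.randrange({len(rules)}) '
                     f'if depth < {max_depth} else {cheapest}')
        for i, rule in enumerate(rules):
            lines.append(f'    {"if" if i == 0 else "elif"} choice == {i}:')
            for token in rule:
                if token in grammar:         # nonterminal: call its function
                    lines.append(f'        {fn_name(token)}(out, depth + 1)')
                else:                        # terminal: emit it
                    lines.append(f'        out.append({token!r})')
            if not rule:
                lines.append('        pass')
        lines.append('')
    return '\n'.join(lines)

namespace = {}
exec(compile_grammar(GRAMMAR), namespace)    # load the generated functions
out = []
namespace['gen_start'](out)
print(''.join(out))
```

Generating source once and then calling plain functions removes the grammar lookups from the hot path, which is the point the slides make with the compiled version.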
  90. Grammar Fuzzer grammar = """
 <start> := <expr> <expr> :=

    <expr> '+' <expr> | <expr> '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number> <number> := <integer> | <integer> '.' <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9]
 """ generate(grammar) Fast Fuzzers def start(): expr() def expr(): match (random() % 6): case 0: expr(); print('+'); expr() case 1: expr(); print('-'); expr() case 2: expr(); print('/'); expr() case 3: expr(); print('*'); expr() case 4: print('('); expr(); print(')') case 5: number() def number(): match (random() % 2): case 0: integer() case 1: integer(); print('.'); integer() def integer(): match (random() % 2): case 0: digit(); integer() case 1: digit() def digit(): match (random() % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7') Compiled Grammar (F1) Building Fast Fuzzers Gopinath and Zeller 2019 def start_0(rops): r = next(rops) if 0 <= r < 43: expr_0() elif 43 <= r < 85: expr_1() elif 85 <= r < 128: expr_2() elif 128 <= r < 171: expr_3() elif 171 <= r < 213: expr_4() else: expr_5() def expr_0(rops): r = next(rops) if 0 <= r < 43: expr_0() elif 43 <= r < 85: expr_1() elif 85 <= r < 128: expr_2() elif 128 <= r < 171: expr_3() elif 171 <= r < 213: expr_4() else: expr_5() print('+') r = next(rops) if 0 <= r < 43: expr_0() elif 43 <= r < 85: expr_1() elif 85 <= r < 128: expr_2() elif 128 <= r < 171: expr_3() elif 171 <= r < 213: expr_4() else: expr_5() Grammar VM (F1)
  91. 73 The Fuzzing Pipeline Program Under Test pFuzzer Active Guidance

    F1 Grammar Fuzzer Grammar Inputs Grammar Miner Samples Active Learning
  92. 74 The Fuzzing Synergy

    Mimid Grammar Miner, Parser Directed pFuzzer, F1 Fuzzer VM

    • Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser Directed Fuzzing. PLDI 2019.
    • Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
    • Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSTA 2021 (submitted).
    • Gopinath and Zeller. Building Fast Fuzzers. 2019 (unpublished).
    • Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. ESEC/FSE 2020.
  99. 76 Challenge: Multilevel Envelopes POST /InStock HTTP/1.1 Host: www.stock.org Content-Type:

    application/soap+xml; charset=utf-8 Content-Length: 312 <?xml version="1.0"?> <soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope" soap:encodingStyle="http://w3.org/2001/12/soap-encoding"> <soap:Body xmlns:m="http://www.stock.org/stock"> <m:GetStockPrice> <m:StockName>IBM</m:StockName> </m:GetStockPrice> </soap:Body> </soap:Envelope> HTTP POST XML PAYLOAD SOAP RPC Call
  100. 78 Challenge: Semantic Envelopes (McKeeman 1998)

    example.c

    #include <stdio.h>
    int main() {
        int number1, number2, number3;
        number1 = 10;
        number2 = 20;
        number3 = sum(number1, number2);
        if (number3 > 100) return 0;
        return 1;
    }

    $ cc example.c -o example

    1. Syntactically correct
    2. Variables declared before use
    3. Use correct types
    4. Statically conforming
    5. Dynamically conforming
    6. Model conforming

    (parse) (compile) (link) (run) (synthesis)
  103. 80

  104. 82

  109. Why Did My Program Crash? 8.2 - 27 - -9

    / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5. 6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +- -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---( --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) + 8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-- (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(- -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * - +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +- (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513) ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +(( +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+ (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * ( +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6. 37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0))) ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 * --++090 DD Minimized Input ((4)) 00000 ? ((5)) ? (++5) ?
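The "DD Minimized Input" step can be illustrated with a small delta debugging loop. This is a minimal sketch of a simplified, complement-only variant, not the full ddmin algorithm and not code from the talk. The predicate `fails` is a hypothetical stand-in for the crashing program: it reports a failure whenever a run of digits is wrapped in two pairs of parentheses.

```python
import re

def fails(inp):
    """Hypothetical failure condition: digits inside doubled parentheses."""
    return re.search(r'\(\(\d+\)\)', inp) is not None

def ddmin(inp, fails):
    """Shrink inp while fails(inp) stays true, by deleting chunks of characters."""
    assert fails(inp)
    n = 2                                    # current number of chunks
    while len(inp) >= 2:
        size = len(inp) // n                 # n never exceeds len(inp), so size >= 1
        chunks = [inp[i:i + size] for i in range(0, len(inp), size)]
        for i in range(len(chunks)):
            candidate = ''.join(chunks[:i] + chunks[i + 1:])
            if fails(candidate):             # failure persists without chunk i
                inp, n = candidate, max(n - 1, 2)
                break
        else:
            if n >= len(inp):                # single characters tried: 1-minimal
                break
            n = min(n * 2, len(inp))         # refine the granularity
    return inp

print(ddmin('8.2 - 27 - -9 / +((4)) * --2', fails))   # prints ((4))
```

The result is 1-minimal: removing any single character makes the failure disappear. It still says nothing about which parts of `((4))` matter, which is the question the following slides raise with `00000 ?`, `((5)) ?`, and `(++5) ?`.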
  114. 85 Delta Minimization is useful but not sufficient

    • Issue 2937 from Closure: var A = class extends (class {}){};
    • Issue 386 from Rhino: const [y,y] = [];
    • Issue 385 from Rhino: var {baz:{} = baz => {}} = baz => {};
    • Issue 2842 from Closure: {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
  115. ( ( 4 ) ) <start> := <expr> <expr> :=

    <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9]
  118. ( ( 4 ) ) <start> := <expr> <expr> :=

    <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] ✓ Did not reproduce the failure 1 * (2 - 3)
  120. ( ( 4 ) ) <start> := <expr> <expr> :=

    <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c
  122. ( ( 4 ) ) <start> := <expr> <expr> :=

    <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c ✓ Did not reproduce the failure 1 + 3 + 4
  123. ( ( 4 ) ) <start> := <expr> <expr> :=

    <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c c
  124. 3 * 4 <start> := <expr> <expr> := <term> '

    + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c c
  125. 3 * 4 <start> := <expr> <expr> := <term> '

    + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c c ✓ Did not reproduce the failure
  126. ( ( 4 ) ) <start> := <expr> <expr> :=

    <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c c c c c c c
  127. ( ( 1 - 2 ) ) <start> := <expr>

    <expr> := <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c c c c c c c ( ( 1 - 2 ) )
  128. ( ( 1 - 2 ) ) <start> := <expr>

    <expr> := <term> ' + ' <expr> | <term> ' - ' <expr> | <term> <term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor> <factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] c c c c c c c ✘ reproduced the failure ( ( 1 - 2 ) )
  129. ( ( 1 - 2 ) ) c c c

    c c c c ( ( 1 - 2 ) )
  130. ( ( 1 - 2 ) ) c c c

    c c c c ✘ ( ( 1 - 2 ) )
  131. ( ( 1 - 2 ) ) c c c

    c c c c ✘ ( ( 1 - 2 ) ) ( ( 2 * 3 + 4 ) )
  132. ( ( 1 - 2 ) ) c c c

    c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) )
  133. ( ( 1 - 2 ) ) c c c

    c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ( ( - 2 / 1 ) )
  134. ( ( 1 - 2 ) ) c c c

    c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ✘ ( ( - 2 / 1 ) )
  135. ( ( 1 - 2 ) ) c c c

    c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ✘ ( ( - 2 / 1 ) ) ( ( 98 - 0 ) )
  136. ( ( 1 - 2 ) ) c c c

    c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ✘ ( ( - 2 / 1 ) ) ✘ ( ( 98 - 0 ) )
  137. <expr> ) ( ( ) ( ( ) 4 )

    ( ( 4 ) ) c c c c c c c A
  139. ( ( 4 ) ) c c c c c

    c c A ( ( ) ) <expr> ( ( ) ) 4 Minimized Input Abstract Failure Inducing Input def check(parsed): if parsed.is_nested() and parsed.child.is_nested(): raise Exception() return input
  140. var A = class extends (class {}){}; Issue 2937 from

    Closure <varModifier> <Identifier> = class extends (class {}){}
  141. var {baz:{} = baz => {}} = baz => {};

    Issue 385 from Rhino var {<$Id1>:{} = <$Id1> => {}} <variableDeclaration>;
  142. {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}} Issue 2842

    from Closure {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}
  143. ( ( 4 ) ) c c c c c

    c c A ( ( ) ) <expr> ( ( ) ) 4 Minimized Input Abstract Failure Inducing Input • Effectively abstracts a minimized input • The abstraction identifies where the problem lies • Decompose complex program behaviors DDSET Gopinath, Kampmann, Havrikov, Soremekun, and Zeller. Abstracting Failure Inducing Inputs. ISSTA 2020. def check(parsed): if parsed.is_nested() and parsed.child.is_nested(): raise Exception() return input ISSTA 2020 Distinguished Award
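The core test behind the abstraction, replacing one part of the minimized input with freshly generated alternatives and checking that the failure persists, can be sketched on plain strings. This is a simplified illustration under stated assumptions, not the DDSET implementation, which works on derivation-tree nodes of a parsed input. `fails` is a hypothetical stand-in for the `check` function on the slide, and `random_expr`, `is_abstractable`, and the `{}` hole notation are names invented for this sketch.

```python
import random

def fails(s):
    """Stand-in for check(): true if a parenthesized group directly wraps another."""
    s = s.replace(' ', '')
    stack, pairs = [], set()
    for j, ch in enumerate(s):
        if ch == '(':
            stack.append(j)
        elif ch == ')':
            if not stack:
                return False                 # unbalanced input: no failure
            pairs.add((stack.pop(), j))
    return any((i + 1, j - 1) in pairs for i, j in pairs)

def random_expr(depth=0):
    """Generate a small random arithmetic expression."""
    if depth >= 3 or random.random() < 0.4:
        return str(random.randrange(100))
    left, right = random_expr(depth + 1), random_expr(depth + 1)
    return random.choice([f'{left}+{right}', f'{left}*{right}',
                          f'({left})', f'-{left}'])

def is_abstractable(template, trials=100):
    """True if every random filling of the {} hole still triggers the failure."""
    return all(fails(template.format(random_expr())) for _ in range(trials))

random.seed(1)
print(is_abstractable('(({}))'))   # the 4 in ((4)) can become any <expr>
print(is_abstractable('({})'))     # the inner (4) cannot be replaced freely
```

A hole that passes this test is reported as a nonterminal such as `<expr>`, giving the abstract input `((<expr>))`. Because the check is based on random sampling, a positive answer is evidence rather than proof.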
  144. 108 <start F> := <expr F> <expr F> := <term

    F> ' + ' <expr> | <term> ' + ' <expr F> | <term F> ' - ' <expr> | <term> ' - ' <expr F> | <term F> <term F> := <factor F> ' * ' <term> | <factor> ' * ' <term F> | <factor F> ' / ' <term> | <factor> ' / ' <term F> | <factor F> <factor F> := '+' <factor F> | '-' <factor F> | '(' <expr F> ')' | '(' <expr F1> ')' <expr F1> := <term F2> <term F2> := <factor F3> <factor F3>:= '(' <expr> ')' Specialized Grammar <factor F> is ((<expr>))
  145. 108 <start F> := <expr F> <expr F> := <term

    F> ' + ' <expr> | <term> ' + ' <expr F> | <term F> ' - ' <expr> | <term> ' - ' <expr F> | <term F> <term F> := <factor F> ' * ' <term> | <factor> ' * ' <term F> | <factor F> ' / ' <term> | <factor> ' / ' <term F> | <factor F> <factor F> := '+' <factor F> | '-' <factor F> | '(' <expr F> ')' | '(' <expr F1> ')' <expr F1> := <term F2> <term F2> := <factor F3> <factor F3>:= '(' <expr> ')' ((1)) + 2 (23 * ((3)) - 34) (344- 4 + ((223))) (1) - 3 * 773 + (-22 + 1) 1798 - 889 / ((333-1)) * 2 / 3 + 1 34 + ((4)) -334 + (334 - (22) + 919 * 0 + 1 98435747+ 88 + (((0))) + (1) - 1 * 7 / 4 * 889 - 2 8 + ((8)) + --1 + 11223 / 344 - 39 + (1) - 456 + 134 / 45 437 + 8 - 1 * ((9 + ((1))) - 1 + 99111948 + 3 --1 + (112) - 2 + 445) + 0 74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * 223 - 1233 + 334672 2 * ((9)) - (1798 - 889 / (333-1) * 2 / 3 + 100012 + 3434392 + 234 ----6 * 1798 - 889 / (33 778 - (((1) - 3 * 773 + (-22 + 1) * (4545) - 23 - ((2)) * 773 + (-22 + 1) / 3434 + ---1 + 1 / 34343 + 112 349 + (((1) - 3 * 3 + (-22 + 1) ((+ (-22 + 1) * (4545) - 23 - (2) * 773 + ((-22 + 1)) / 3434 + ---1 + 1 / 34343 + 1123 8 + ((8)) + --1 + / 1 - 39 + (1) - 456 + 134 / 45 ))(((1) - 2334 + ((((1)) - 3 * 773 + (-22 + 1) * (2) - 23 - (2) * 773 + (-22 + 1) / 3 74 + 3 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 45 ))(((1) - 3 * 773 1+ 33+ 24343433 +23343 - ((74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 4 ✘ Specialized Grammar <factor F> is ((<expr>)) ✘
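To see that the specialized grammar produces only failure-inducing inputs, it can be encoded and sampled. The sketch below is an illustration under assumptions: the dict encoding, the depth budget, and the helper names are not from the slides, and `has_double_parens` is a hypothetical stand-in for the failure check. The `F` rules are transcribed from the slide. The base rules are taken from the earlier term/factor grammar.

```python
import random

BASE = {
    '<expr>': [['<term>', ' + ', '<expr>'], ['<term>', ' - ', '<expr>'], ['<term>']],
    '<term>': [['<factor>', ' * ', '<term>'], ['<factor>', ' / ', '<term>'], ['<factor>']],
    '<factor>': [['+', '<factor>'], ['-', '<factor>'], ['(', '<expr>', ')'],
                 ['<integer>', '.', '<integer>'], ['<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[str(d)] for d in range(10)],
}
SPECIALIZED = {
    '<start F>': [['<expr F>']],
    '<expr F>': [['<term F>', ' + ', '<expr>'], ['<term>', ' + ', '<expr F>'],
                 ['<term F>', ' - ', '<expr>'], ['<term>', ' - ', '<expr F>'],
                 ['<term F>']],
    '<term F>': [['<factor F>', ' * ', '<term>'], ['<factor>', ' * ', '<term F>'],
                 ['<factor F>', ' / ', '<term>'], ['<factor>', ' / ', '<term F>'],
                 ['<factor F>']],
    '<factor F>': [['+', '<factor F>'], ['-', '<factor F>'],
                   ['(', '<expr F>', ')'], ['(', '<expr F1>', ')']],
    '<expr F1>': [['<term F2>']],
    '<term F2>': [['<factor F3>']],
    '<factor F3>': [['(', '<expr>', ')']],
}
GRAMMAR = {**BASE, **SPECIALIZED}

def rule_depth(rule, depth):
    return 1 + max((depth[t] for t in rule if t in depth), default=0)

def min_depths(grammar):
    """Minimal derivation depth per nonterminal, computed as a fixpoint."""
    depth = {key: float('inf') for key in grammar}
    changed = True
    while changed:
        changed = False
        for key, rules in grammar.items():
            best = min(rule_depth(rule, depth) for rule in rules)
            if best < depth[key]:
                depth[key], changed = best, True
    return depth

def generate(grammar, key, budget, depth):
    """Expand key into a string, using only the shallowest rules once budget is spent."""
    if key not in grammar:
        return key
    rules = grammar[key]
    if budget <= 0:
        cheapest = min(rule_depth(r, depth) for r in rules)
        rules = [r for r in rules if rule_depth(r, depth) == cheapest]
    rule = random.choice(rules)
    return ''.join(generate(grammar, t, budget - 1, depth) for t in rule)

def has_double_parens(s):
    """Stand-in failure check: some parenthesized group directly wraps another."""
    s = s.replace(' ', '')
    stack, pairs = [], set()
    for j, ch in enumerate(s):
        if ch == '(':
            stack.append(j)
        elif ch == ')':
            if not stack:
                return False
            pairs.add((stack.pop(), j))
    return any((i + 1, j - 1) in pairs for i, j in pairs)

DEPTH = min_depths(GRAMMAR)
for _ in range(5):
    s = generate(GRAMMAR, '<start F>', 6, DEPTH)
    print(has_double_parens(s), s)
```

Every derivation from `<start F>` has to pass through `'(' <expr F1> ')'`, so each sample contains a `((<expr>))` fragment no matter what surrounds it.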
  146. Algebra of Grammar Specializations

    where
    <variableDeclarationList C2937> is <varModifier> <Identifier> = class extends (class {}){}  (Closure 2937)
    <variableStatement R385> is var {<$Id2>:{} = <$Id2> => {}} <variableDeclaration>;  (Rhino 385)
    <variableDeclarationList R386> is const [<$Id3>,<$Id3>] = []  (Rhino 386)
    <iterationStatement C2842> is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}  (Closure 2842)
  147. Algebra of Grammar Specializations

    <JavaScript C2937 and C2842>

    where
    <variableDeclarationList C2937> is <varModifier> <Identifier> = class extends (class {}){}  (Closure 2937)
    <variableStatement R385> is var {<$Id2>:{} = <$Id2> => {}} <variableDeclaration>;  (Rhino 385)
    <variableDeclarationList R386> is const [<$Id3>,<$Id3>] = []  (Rhino 386)
    <iterationStatement C2842> is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}  (Closure 2842)
  148. Algebra of Grammar Specializations

    <JavaScript C2937 and C2842>
    <JavaScript not(C2937 and C2842)>

    where
    <variableDeclarationList C2937> is <varModifier> <Identifier> = class extends (class {}){}  (Closure 2937)
    <variableStatement R385> is var {<$Id2>:{} = <$Id2> => {}} <variableDeclaration>;  (Rhino 385)
    <variableDeclarationList R386> is const [<$Id3>,<$Id3>] = []  (Rhino 386)
    <iterationStatement C2842> is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}  (Closure 2842)
  149. Algebra of Grammar Specializations

    <JavaScript C2937 and C2842>
    <JavaScript not(C2937 and C2842)>
    <JavaScript (C2937 or C2842) and (R385 or R386)>

    where
    <variableDeclarationList C2937> is <varModifier> <Identifier> = class extends (class {}){}  (Closure 2937)
    <variableStatement R385> is var {<$Id2>:{} = <$Id2> => {}} <variableDeclaration>;  (Rhino 385)
    <variableDeclarationList R386> is const [<$Id3>,<$Id3>] = []  (Rhino 386)
    <iterationStatement C2842> is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}  (Closure 2842)
  150. Algebra of Grammar Specializations

    <JavaScript not(C2937 or C2842 or R385 or R386)>

    where
    <variableDeclarationList C2937> is <varModifier> <Identifier> = class extends (class {}){}
    <iterationStatement C2842> is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}
    <variableStatement R385> is var {<$Id2>:{} = <$Id2> => {}} <variableDeclaration>;
    <variableDeclarationList R386> is const [<$Id3>,<$Id3>] = []

    Mechanized proofs are available.
    Gopinath, Nemati, Zeller. Input Algebras. ICSE 2021.
  151. Science of Focused Fuzzing

    <expr E & F> where
    <factor E> is ((<expr>))
    <factor F> is <term> / 0
  152. Isolating & Decomposing Program Behaviors

    <expr E & F> where
    <factor E> is ((<expr>))
    <factor F> is <term> / 0
  153. Isolating & Decomposing Program Behaviors

    <expr E & F> where
    <factor E> is ((<expr>))
    <factor F> is <term> / 0

    Algebra of Program Behaviors
  154. Isolating & Decomposing Program Behaviors

    <expr E & F> where
    <factor E> is ((<expr>))
    <factor F> is <term> / 0

    Algebra of Program Behaviors
    Science of Program Behaviors
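    What an expression such as <expr E & F> is meant to accept can be illustrated with plain predicates over strings. This is a rough sketch of the semantics only; it does not perform the grammar construction from the Input Algebras paper. The class `Spec` and the two pattern checks are hypothetical, and the check for F (a division whose divisor is the literal 0) is a simplification assumed for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable

def has_doubled_parens(text: str) -> bool:
    """Pattern E: some ((...)) where the inner pair fills the outer pair exactly."""
    stack, closed_at = [], {}
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")" and stack:
            closed_at[stack.pop()] = i
    return any(closed_at.get(start + 1) == end - 1 for start, end in closed_at.items())

def divides_by_zero_literal(text: str) -> bool:
    """Pattern F (simplified): a '/' followed by the literal 0 and no further digits."""
    return re.search(r"/\s*0(?![0-9.])", text) is not None

@dataclass(frozen=True)
class Spec:
    """A named predicate that can be combined with &, | and ~."""
    name: str
    matches: Callable[[str], bool]

    def __and__(self, other: "Spec") -> "Spec":
        return Spec(f"({self.name} & {other.name})",
                    lambda s: self.matches(s) and other.matches(s))

    def __or__(self, other: "Spec") -> "Spec":
        return Spec(f"({self.name} | {other.name})",
                    lambda s: self.matches(s) or other.matches(s))

    def __invert__(self) -> "Spec":
        return Spec(f"~{self.name}", lambda s: not self.matches(s))

E = Spec("E", has_doubled_parens)
F = Spec("F", divides_by_zero_literal)

if __name__ == "__main__":
    for text in ["((1)) + 2 / 0", "((1)) + 2", "3 / 0", "1 + 2"]:
        print(text, (E & F).matches(text), (~(E | F)).matches(text))
```

    A predicate can only filter inputs after the fact; the point of the grammar algebra on the slides is that the combined specification is itself a grammar that can produce such inputs directly.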
  155. Challenge: Identify Behavior Divergence

    Input
        insert into tbl values (1,2,3)
        select b from tbl
        drop table tbl
    Behavior
    Program
  156. Challenge: Identify Behavior Divergence

    Input                            Behavior
    insert into tbl values (1,2,3)   update($file)
    select b from tbl                read($file)
    drop table tbl                   rm($file)

    Program
        action='read'
        $action('tbl')
    assert invoked read: 'tbl.data' ✔
  157. Challenge: Identify Behavior Divergence

    Input                            Behavior
    insert into tbl values (1,2,3)   update($file)
    select b from tbl                read($file)
    drop table tbl                   rm($file)

    Program
        action='read'
        $action('tbl')
    assert invoked read: 'tbl.data' ✔
    action='rm'
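    The check on this slide (a `select` should invoke `read` on 'tbl.data', not `rm`) can be sketched as a tiny oracle. Everything here is a toy under stated assumptions: `RULES`, `expected_behavior`, `run_program` and the two dispatch tables are hypothetical names, and the "program" is a stand-in that only reports which file action it would invoke.

```python
# Hypothetical mapping: SQL verb -> (expected file action, keyword that precedes the table name).
RULES = {
    "insert": ("update", "into"),
    "select": ("read", "from"),
    "drop": ("rm", "table"),
}

def table_of(statement):
    """Extract the table name that follows the verb-specific keyword."""
    words = statement.split()
    _, keyword = RULES[words[0].lower()]
    return words[words.index(keyword) + 1]

def expected_behavior(statement):
    """Behavior the input should trigger, e.g. ('read', 'tbl.data')."""
    action, _ = RULES[statement.split()[0].lower()]
    return (action, f"{table_of(statement)}.data")

def run_program(statement, dispatch):
    """Toy program under test: looks up an action for the verb and reports what it invokes."""
    action = dispatch[statement.split()[0].lower()]
    return (action, f"{table_of(statement)}.data")

CORRECT = {"insert": "update", "select": "read", "drop": "rm"}
FAULTY = dict(CORRECT, select="rm")  # the action='rm' case: a select that deletes the file

def divergences(statements, dispatch):
    """Statements whose observed behavior differs from the expected one."""
    return [s for s in statements if run_program(s, dispatch) != expected_behavior(s)]

STATEMENTS = [
    "insert into tbl values (1,2,3)",
    "select b from tbl",
    "drop table tbl",
]

if __name__ == "__main__":
    print(divergences(STATEMENTS, CORRECT))
    print(divergences(STATEMENTS, FAULTY))
```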
  159. Challenge: Identify Behavior Divergence

    def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene
  161. Challenge: Identify Behavior Divergence

    LockB() LockA() DoAB() UnlockB() UnlockA()
    LockB() LockA() DoAB() UnlockA() UnlockB()
    UnLockA() LockA() DoAB() LockB() UnlockB()
  162. def triangle(a, b, c):

        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene
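    To see which branches a concrete input takes, the function can be made runnable and instrumented. This is a minimal sketch: the slide returns bare names, so string results are assumed here, and the `branch` helper and the returned trace are additions for illustration only.

```python
def triangle(a, b, c):
    """Classify a triangle and record every branch decision taken along the way."""
    trace = []

    def branch(label, outcome):
        # Record the condition and its outcome, then hand the outcome back to the `if`.
        trace.append((label, outcome))
        return outcome

    if branch("a == b", a == b):
        if branch("b == c", b == c):
            result = "Equilateral"
        else:
            result = "Isosceles"
    else:
        if branch("b == c", b == c):
            result = "Isosceles"
        else:
            if branch("a == c", a == c):
                result = "Isosceles"
            else:
                result = "Scalene"
    return result, trace

if __name__ == "__main__":
    for sides in [(1, 1, 1), (1, 1, 2), (2, 1, 1), (1, 2, 1), (3, 4, 5)]:
        print(sides, *triangle(*sides))
```

    Each distinct trace corresponds to one path through the function, which is the unit that the control-flow grammar on the following slides describes.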
  165. <enter> := <triangle(a,b,c)>

    <triangle(a,b,c)> := (a==b) <if_2 T> | (a!=b) <if_2 F>
    <if_2 T> := (b==c) <if_3 T> | (b!=c) <if_3 F>
  166. <enter> := <triangle(a,b,c)>

    <triangle(a,b,c)> := (a==b) <if_2 T> | (a!=b) <if_2 F>
    <if_2 T> := (b==c) <if_3 T> | (b!=c) <if_3 F>
    <if_3 T> := "return Equilateral"
    <if_3 F> := "return Isosceles"
  167. <enter> := <triangle(a,b,c)>

    <triangle(a,b,c)> := (a==b) <if_2 T> | (a!=b) <if_2 F>
    <if_2 T> := (b==c) <if_3 T> | (b!=c) <if_3 F>
    <if_3 T> := "return Equilateral"
    <if_3 F> := "return Isosceles"
    <if_2 F> := (b==c) <if_8 T> | (b!=c) <if_8 F>
    <if_8 T> := "return Isosceles"
    <if_8 F> := (a==c) <if_11 T> | (a!=c) <if_11 F>
    <if_11 T> := "return Isosceles"
    <if_11 F> := "return Scalene"
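    Because this control-flow grammar has no recursion, all of its derivations can be listed, and each one is a path through `triangle`. The sketch below encodes the rules from the slide as a Python dictionary; the dictionary layout and the `paths` function are assumptions made for illustration.

```python
# Rules copied from the slide: each alternative is a sequence of conditions,
# nonterminals, or a final "return ..." terminal.
GRAMMAR = {
    "<enter>": [["<triangle(a,b,c)>"]],
    "<triangle(a,b,c)>": [["(a==b)", "<if_2 T>"], ["(a!=b)", "<if_2 F>"]],
    "<if_2 T>": [["(b==c)", "<if_3 T>"], ["(b!=c)", "<if_3 F>"]],
    "<if_3 T>": [["return Equilateral"]],
    "<if_3 F>": [["return Isosceles"]],
    "<if_2 F>": [["(b==c)", "<if_8 T>"], ["(b!=c)", "<if_8 F>"]],
    "<if_8 T>": [["return Isosceles"]],
    "<if_8 F>": [["(a==c)", "<if_11 T>"], ["(a!=c)", "<if_11 F>"]],
    "<if_11 T>": [["return Isosceles"]],
    "<if_11 F>": [["return Scalene"]],
}

def paths(symbol="<enter>"):
    """Enumerate every derivation of `symbol` as a list of terminals (finite: no recursion)."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a condition or a return statement
    result = []
    for alternative in GRAMMAR[symbol]:
        partial = [[]]
        for token in alternative:
            partial = [prefix + suffix for prefix in partial for suffix in paths(token)]
        result.extend(partial)
    return result

if __name__ == "__main__":
    for path in paths():
        print(" ".join(path))
```

    The grammar yields five paths, three of which end in "return Isosceles".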
  168. Problem: How to Fuzz Parsers

    Program Under Test, pFuzzer, F1 Fuzzer, Inputs, Grammar Miner
    https://rahul.gopinath.org
  169. Problem: How to Fuzz Parsers

    Program Under Test, pFuzzer, F1 Fuzzer, Inputs, Grammar Miner
    Generalize
  170. Problem: How to Fuzz Parsers

    Program Under Test, pFuzzer, F1 Fuzzer, Inputs, Grammar Miner
    Generalize, Combine
  171. Problem: How to Fuzz Parsers

    Program Under Test, pFuzzer, F1 Fuzzer, Inputs, Grammar Miner
    Generalize, Combine
    Future Work: The Science of Fuzzing
  172. Problem: How to Fuzz Parsers

    Program Under Test, pFuzzer, F1 Fuzzer, Inputs, Grammar Miner
    Debugging, Combine
    Future Work: The Science of Fuzzing