Slide 1

Slide 1 text

Fuzzing Rahul Gopinath

Slide 2

Slide 2 text

Fuzzing from Alchemy Rahul Gopinath

Slide 4

Slide 4 text

Fuzzing from Alchemy to Science Rahul Gopinath

Slide 6

Slide 6 text

4

Slide 7

Slide 7 text

4 The story begins 2500 years ago. 500 BC

Slide 8

Slide 8 text

4 Vedic Sanskrit 500 BC

Slide 9

Slide 9 text

4 Vedic Sanskrit Classical Sanskrit 500 BC

Slide 10

Slide 10 text

5 500 BC Vedic Sanskrit Classical Sanskrit

Slide 11

Slide 11 text

5 500 BC Aṣṭādhyāyī Dakṣiputra Pāṇini Vedic Sanskrit Classical Sanskrit

Slide 12

Slide 12 text

5 500 BC Aṣṭādhyāyī Dakṣiputra Pāṇini Ad hoc rules Formal specification Vedic Sanskrit Classical Sanskrit

Slide 13

Slide 13 text

6

Slide 14

Slide 14 text

6 2500 years later... 2021 CE The world is governed by software

Slide 15

Slide 15 text

6 2500 years later... 2021 CE The world is governed by software. We have a crisis

Slide 16

Slide 16 text

7 Growth in software complexity: Debian 5 ~ 70 million lines; smart cars ~ 100 million lines; Google ~ 2 billion lines
[Chart: Linux kernel size in millions of lines of code (5 M to 35 M), by Linux release. Source: Wikipedia]

Slide 17

Slide 17 text

7 Growth in software complexity: Debian 5 ~ 70 million lines; smart cars ~ 100 million lines; Google ~ 2 billion lines
[Chart: Linux kernel size in millions of lines of code (5 M to 35 M), by Linux release: 1.0.0, 2.0.0, 2.1.0, 2.2.0, 2.4.0, 2.6.0, 3.0, 4.0, 5.0, 5.7, spanning 1994 to 2019. Source: Wikipedia]

Slide 18

Slide 18 text

8 Growth in vulnerabilities
[Chart: number of vulnerabilities per year, 2k to 16k. Source: NIST]

Slide 19

Slide 19 text

8 Growth in vulnerabilities
[Chart: number of vulnerabilities per year, 2001 to 2019, 2k to 16k. Source: NIST]

Slide 20

Slide 20 text

Fuzzing Program Trash deck technique: 1950s - Gerald Weinberg

Slide 22

Slide 22 text

Fuzzing Crash? Program Trash deck technique: 1950s - Gerald Weinberg

Slide 23

Slide 23 text

10 Fuzzing Random Inputs Program Automatic Checks

Slide 24

Slide 24 text

10 Fuzzing • Memory Bounds Violation • Privilege Escalation • Safety Violations • Metamorphic Relations • Differential Execution Random Inputs Program Automatic Checks

Slide 25

Slide 25 text

11 Random Fuzzing Program

Slide 26

Slide 26 text

11 Random Fuzzing $ ./fuzz [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu
 2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z
 h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!
 AxB"YXRS@!Kd6;wtAMefFWM(`|J_<1~o}
 z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?
 lR=bF3+;y$3lodQ&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM
 PG-FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2
 D|vBy!^zkhdf3C5PAkR?V((-%>
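
A minimal sketch of how such random input could be produced. The function name `fuzzer` and its parameters are illustrative assumptions; this is not the `./fuzz` tool shown on the slide.

```python
import random

def fuzzer(max_length=100, char_start=32, char_range=95, rng=random):
    """Return a string of up to max_length random printable ASCII characters."""
    length = rng.randrange(0, max_length + 1)
    return ''.join(chr(rng.randrange(char_start, char_start + char_range))
                   for _ in range(length))

if __name__ == '__main__':
    # Print a few random inputs, one per line, ready to be piped into a program.
    for _ in range(3):
        print(fuzzer())
```

Every character is drawn independently and uniformly, which is why the output looks like line noise.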

Slide 27

Slide 27 text

12 Random Fuzzing

$ ./fuzz -int | program
634111569742810193727424069509
741355925061499451162464719526
615957331924826555590537407605
181400079803446874252046374716
740973770255348279425601333144
152724057932073828569041216191
099859446496509919024810271242
622974988671421938012464630138
735355134599327240920259675263
574528613057084231370741920902
794677842164654990353575580453
777282305855352378119038096476
699871306655084953377039862387
924957554389878352934547664240
082431556093837288597262675598
630851919061829885048834738832
677022429414980917053939970795
722006987916088650168665471731
yes

from sys import stdin

def is_prime(n: int) -> bool:
    """Primality test using 6k+-1 optimization."""
    if n <= 3:
        return n > 1
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i ** 2 <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

def main():
    # stdin yields text: convert each input line to an int before testing it
    for line in stdin:
        num = int(line)
        print(num, is_prime(num))

Slide 28

Slide 28 text

12 Random Fuzzing
Weak point: unequal divisions in input space. Input space: n > 3 versus n <= 3
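
To make the weak point concrete, here is a small sketch that counts how often uniformly random integers of up to 30 digits land in each partition of the first branch of is_prime. The helper name `branch_hits` and the choice of 30 digits (matching the length of the numbers shown) are assumptions for illustration.

```python
import random

def branch_hits(trials=10_000, digits=30, seed=0):
    """Count random inputs that take the `n <= 3` branch versus the `n > 3` branch."""
    rng = random.Random(seed)
    small = 0
    for _ in range(trials):
        n = rng.randrange(10 ** digits)   # uniform over [0, 10**digits)
        if n <= 3:
            small += 1
    return small, trials - small

if __name__ == '__main__':
    small, large = branch_hits()
    print('n <= 3:', small, ' n > 3:', large)
```

Only 4 of the 10^30 possible values satisfy n <= 3, so a purely random fuzzer practically never exercises that branch.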

Slide 29

Slide 29 text

13 Advanced Fuzzing: Instrumentation

Slide 30

Slide 30 text

14 Advanced Fuzzing: Instrumentation

def triangle(a, b, c):
    if a == b:
        if b == c:
            return 'Equilateral'
        else:
            return 'Isosceles'
    else:
        if b == c:
            return 'Isosceles'
        else:
            if a == c:
                return 'Isosceles'
            else:
                return 'Scalene'

Slide 31

Slide 31 text

14 Advanced Fuzzing: Instrumentation

def triangle(a, b, c):
    __probe_enter()
    if a == b:
        __probe_1()
        if b == c:
            __probe_2()
            return 'Equilateral'
        else:
            __probe_3()
            return 'Isosceles'
    else:
        __probe_4()
        if b == c:
            __probe_5()
            return 'Isosceles'
        else:
            __probe_6()
            if a == c:
                __probe_7()
                return 'Isosceles'
            else:
                __probe_8()
                return 'Scalene'
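
A runnable sketch of the same idea: each probe records that its branch was reached, so the fuzzer can observe which parts of the program an input exercised. The single `probe(label)` function and the `coverage_of` helper are assumptions standing in for the numbered `__probe_N()` calls.

```python
covered = set()

def probe(label):
    """Record that the branch identified by label was executed."""
    covered.add(label)

def triangle(a, b, c):
    probe('enter')
    if a == b:
        probe(1)
        if b == c:
            probe(2)
            return 'Equilateral'
        else:
            probe(3)
            return 'Isosceles'
    else:
        probe(4)
        if b == c:
            probe(5)
            return 'Isosceles'
        else:
            probe(6)
            if a == c:
                probe(7)
                return 'Isosceles'
            else:
                probe(8)
                return 'Scalene'

def coverage_of(*args):
    """Run triangle on args and return (result, set of probes hit)."""
    covered.clear()
    result = triangle(*args)
    return result, set(covered)
```

For example, `coverage_of(1, 1, 1)` and `coverage_of(1, 1, 2)` share the probes `'enter'` and `1` but differ in the third probe, which is the signal a guided fuzzer uses.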

Slide 33

Slide 33 text

15 Instrumentation Based Fuzzing: triangle(1,1,1)

Slide 35

Slide 35 text

16 Instrumentation Based Fuzzing: triangle(1,1,2)

Slide 36

Slide 36 text

17 Instrumentation Guided Fuzzing • Coverage Guided • Solver Directed

Slide 37

Slide 37 text

18 Coverage Guided Fuzzing • Randomly generate inputs • Choose seeds with new coverage

Slide 38

Slide 38 text

18 Coverage Guided Fuzzing • Randomly generate inputs • Choose seeds with new coverage AFL

Slide 39

Slide 39 text

19 Coverage Guided Fuzzing • Randomly generate inputs • Choose seeds with new coverage AFL First Release: 2013 AFL Trophy Case

Slide 40

Slide 40 text

20 Coverage Guided Fuzzing • Randomly generate inputs • Choose seeds with new coverage AFL Pulling JPEGs out of thin air https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html Valid JPEG in 6 hours in an 8 core machine!

Slide 41

Slide 41 text

21 Coverage Guided Fuzzing • Randomly generate inputs • Choose seeds with new coverage AFL

static int is_reserved_word_token(const char *s, int len) {
  const char *reserved[] = {
      "break", "case", "catch", "continue", "debugger", "default",
      "delete", "do", "else", "false", "finally", "for", "function",
      "if", "in", "instanceof", "new", "null", "return", "switch",
      "this", "throw", "true", "try", "typeof", "var", "void",
      "while", "with", "let", "undefined", ((void *)0)};
  int i;
  if (!mjs_is_alpha(s[0])) return 0;
  for (i = 0; reserved[i] != ((void *)0); i++) {
    if (len == (int)strlen(reserved[i]) && strncmp(s, reserved[i], len) == 0)
      return i + 1;
  }
  return 0;
}

Slide 42

Slide 42 text

21 Coverage Guided Fuzzing • Randomly generate inputs • Choose seeds with new coverage AFL
Weak point: Magic bytes

Slide 43

Slide 43 text

22 Coverage Guided Fuzzing • Randomly generate inputs • Choose seeds with new coverage

Slide 45

Slide 45 text

23 Solver Directed Fuzzing • Collect path constraints • Solve negated constraints for new inputs

Slide 46

Slide 46 text

23 Solver Directed Fuzzing • Collect path constraints • Solve negated constraints for new inputs (a == b) (b == c)

Slide 47

Slide 47 text

23 Solver Directed Fuzzing • Collect path constraints • Solve negated constraints for new inputs (a == b) (b == c) (b != c)

Slide 48

Slide 48 text

23 Solver Directed Fuzzing • Collect path constraints • Solve negated constraints for new inputs (a == b) (b == c) (b != c) triangle(1,2,1)
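
A sketch of the two steps on the triangle example. A brute-force search over small integers stands in for a real constraint solver here; the helper names `triangle_trace`, `solve`, and `explore` are assumptions for illustration.

```python
from itertools import product

def triangle_trace(a, b, c):
    """Classify a triangle and record the path as (condition, outcome) pairs."""
    path = []
    def branch(name, outcome):
        path.append((name, outcome))
        return outcome
    if branch('a == b', a == b):
        kind = 'Equilateral' if branch('b == c', b == c) else 'Isosceles'
    elif branch('b == c', b == c):
        kind = 'Isosceles'
    elif branch('a == c', a == c):
        kind = 'Isosceles'
    else:
        kind = 'Scalene'
    return kind, path

CONDITIONS = {
    'a == b': lambda a, b, c: a == b,
    'b == c': lambda a, b, c: b == c,
    'a == c': lambda a, b, c: a == c,
}

def solve(constraints, domain=range(1, 4)):
    """Brute-force stand-in for a solver: first input satisfying all constraints."""
    for a, b, c in product(domain, repeat=3):
        if all(CONDITIONS[name](a, b, c) == want for name, want in constraints):
            return (a, b, c)
    return None

def explore(seed=(1, 1, 1)):
    """Negate each branch decision on every seen path to discover new paths."""
    worklist, seen_paths, inputs = [seed], set(), []
    while worklist:
        args = worklist.pop()
        _, path = triangle_trace(*args)
        if tuple(path) in seen_paths:
            continue
        seen_paths.add(tuple(path))
        inputs.append(args)
        for i, (name, outcome) in enumerate(path):
            # Keep the prefix of the path, flip the i-th decision, and solve.
            solution = solve(path[:i] + [(name, not outcome)])
            if solution is not None:
                worklist.append(solution)
    return inputs
```

Starting from triangle(1,1,1), negating `(a == b)` yields triangle(1,2,1) and negating `(b == c)` yields triangle(1,1,2); repeating the process covers all five paths.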

Slide 50

Slide 50 text

24 Solver Directed Fuzzing • Collect path constraints • Solve constraints for new inputs

void next_sym() {
  while (1) {
    switch (ch) {
      case '{': next_ch(); sym = LBRA; return;
      case '}': next_ch(); sym = RBRA; return;
      case '(': next_ch(); sym = LPAR; return;
      case ')': next_ch(); sym = RPAR; return;
      case '+': next_ch(); sym = PLUS; return;
      case '-': next_ch(); sym = MINUS; return;
      case '<': next_ch(); sym = LESS; return;
      case ';': next_ch(); sym = SEMI; return;
      case '=': next_ch(); sym = EQUAL; return;
      default:
        if (ch >= '0' && ch <= '9') {
          int_val = 0; /* missing overflow check */
          while (ch >= '0' && ch <= '9') {
            int_val = int_val*10 + (ch - '0');
            next_ch();
          }
          sym = INT;
        } else if (ch >= 'a' && ch <= 'z') {
          int i = 0; /* missing overflow check */
          while ((ch >= 'a' && ch <= 'z') || ch == '_') {
            id_name[i++] = ch;
            next_ch();
          }
          id_name[i] = '\0';
          sym = 0;
          while (words[sym] != NULL && strcmp(words[sym], id_name) != 0)
            sym++;
          if (words[sym] == NULL)
            if (id_name[1] == '\0') sym = ID;
            else syntax_error();
        } else syntax_error();
        return;
    }
  }
}

Slide 51

Slide 51 text

24 Solver Directed Fuzzing • Collect path constraints • Solve constraints for new inputs
Weak point: Path explosion

Slide 52

Slide 52 text

25 Solver Directed Fuzzing • Collect path constraints • Solve negated constraints for new inputs

Slide 54

Slide 54 text

26 Fuzzing Parsers $ ./fuzz [;x1-GPZ+wcckc];,N9J+?#6^6\e?]9lu
 2_%'4GX"0VUB[E/r ~fApu6b8<{%siq8Z
 h.6{V,hr?;{Ti.r3PIxMMMv6{xS^+'Hq!
 AxB"YXRS@!Kd6;wtAMefFWM(`|J_<1~o}
 z3K(CCzRH JIIvHz>_*.\>JrlU32~eGP?
 lR=bF3+;y$3lodQ&]BS6R&j?#tP7iaV}-}`\?[_[Z^LBM
 PG-FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2
 D|vBy!^zkhdf3C5PAkR?V((-%>

Slide 56

Slide 56 text

27 Advanced Fuzzing: Specialized Generators • Specialize generation for a domain

Slide 58

Slide 58 text

27 Advanced Fuzzing: Specialized Generators • Specialize generation for a domain 80,000 lines of code

Slide 59

Slide 59 text

28 The Holy Grail of Fuzzing

Slide 60

Slide 60 text

28 The Holy Grail of Fuzzing Parsers

Slide 61

Slide 61 text

Overcoming Parsers

Slide 63

Slide 63 text

30 Overcoming Parsers

Slide 65

Slide 65 text

30 Monolithic Overcoming Parsers

Slide 66

Slide 66 text

31

Slide 67

Slide 67 text

32 The Missing Piece: Formal Specification

Slide 68

Slide 68 text

33 Grammar

Slide 69

Slide 69 text

34 Formal Languages Formal Language Descriptions

Slide 70

Slide 70 text

34 Formal Languages Formal Language Descriptions 3. Regular (Chomsky,1956)

Slide 71

Slide 71 text

34 Formal Languages Formal Language Descriptions 3. Regular Context Free (Chomsky,1956) Argument Stack

Slide 72

Slide 72 text

34 Formal Languages Formal Language Descriptions 3. Regular Context Free Recursively Enumerable (Chomsky,1956) Argument Stack Return Stack

Slide 73

Slide 73 text

34 Formal Languages Formal Language Descriptions 3. Regular Context Free Recursively Enumerable (Chomsky,1956) Easy to produce and parse Argument Stack Return Stack

Slide 74

Slide 74 text

35 Grammar := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] Arithmetic expression grammar

Slide 76

Slide 76 text

35 Grammar := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] Arithmetic expression grammar key

Slide 77

Slide 77 text

35 Grammar := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] Arithmetic expression grammar Definition for key

Slide 78

Slide 78 text

36 := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] Grammar Arithmetic expression grammar

Slide 79

Slide 79 text

36 := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] Grammar Arithmetic expression grammar Expansion Rule

Slide 80

Slide 80 text

36 := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] Grammar Arithmetic expression grammar Expansion Rule Terminal Symbol

Slide 81

Slide 81 text

36 := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] Grammar Arithmetic expression grammar Expansion Rule Terminal Symbol Nonterminal Symbol

Slide 83

Slide 83 text

37 Grammars As recognizers := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9]

Slide 84

Slide 84 text

37 Grammars As recognizers (8 / 3) * 49 := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9]

Slide 86

Slide 86 text

38 Grammars := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9] As producers (Hanford 1970) (Purdom 1972)
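
A sketch of using a grammar as a producer. The nonterminal names (`<start>`, `<expr>`, `<number>`, `<integer>`, `<digit>`) are hypothetical; the structure follows an arithmetic expression grammar of the shape shown above. The depth limit and the cost function are design assumptions, used only to guarantee termination.

```python
import random

# Hypothetical nonterminal names; each key maps to its alternative expansions.
GRAMMAR = {
    '<start>': [['<expr>']],
    '<expr>': [['<expr>', '+', '<expr>'], ['<expr>', '-', '<expr>'],
               ['<expr>', '/', '<expr>'], ['<expr>', '*', '<expr>'],
               ['(', '<expr>', ')'], ['<number>']],
    '<number>': [['<integer>'], ['<integer>', '.', '<integer>']],
    '<integer>': [['<digit>', '<integer>'], ['<digit>']],
    '<digit>': [[d] for d in '0123456789'],
}

def min_cost(grammar, symbol, seen=frozenset()):
    """Smallest number of expansion steps needed to finish symbol (terminals: 0)."""
    if symbol not in grammar:
        return 0
    if symbol in seen:
        return float('inf')             # recursive expansion never finishes
    return 1 + min(max(min_cost(grammar, s, seen | {symbol}) for s in rule)
                   for rule in grammar[symbol])

def generate(grammar, symbol='<start>', max_depth=8, rng=random):
    """Expand symbol by picking random rules; past max_depth pick cheapest rules."""
    if symbol not in grammar:
        return symbol                   # terminal symbol: emit as is
    rules = grammar[symbol]
    if max_depth <= 0:
        def cost(rule):
            return max(min_cost(grammar, s, frozenset({symbol})) for s in rule)
        best = min(cost(r) for r in rules)
        rules = [r for r in rules if cost(r) == best]
    rule = rng.choice(rules)
    return ''.join(generate(grammar, s, max_depth - 1, rng) for s in rule)

if __name__ == '__main__':
    for _ in range(3):
        print(generate(GRAMMAR))
```

Every string produced this way is accepted by a parser for the same grammar, which is what lets generated inputs get past the parsing stage.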

Slide 87

Slide 87 text

38 Grammars As producers (Hanford 1970) (Purdom 1972) := := '+' | '-' | '/' | '*' | '(' ')' | := | '.' := | := [0-9]
Sample output:
8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))) / 5 * --++090

Slide 88

Slide 88 text

39 Grammars As effective producers
8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))) / 5 * --++090

Slide 89

Slide 89 text

39 Grammars As effective producers Interpreter Parser ✘ ✔
8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))) / 5 * --++090

Slide 90

Slide 90 text

40 Where to Get the Grammar From?

Slide 91

Slide 91 text

41 •Reference Specification?

Slide 92

Slide 92 text

41 The standard spec •Reference Specification?

Slide 93

Slide 93 text

41 The standard spec Buggy Implementation •Reference Specification?

Slide 94

Slide 94 text

41 The standard spec Buggy Implementation "Extra" Features •Reference Specification?

Slide 95

Slide 95 text

41 •Reference Specification? The standard spec • Buggy Implementation • "Extra" Features • "Accepted" Bugs
Postel's Law: "Be liberal in what you accept, and conservative in what you send" (the cause of trouble)

Slide 96

Slide 96 text

42 https://www.json.org

Slide 97

Slide 97 text

42 https://www.json.org

object
    { }
    { members }
members
    pair
    pair , members
pair
    string : value
array
    [ ]
    [ elements ]
elements
    value
    value , elements
value
    string
    number
    object
    array
    true
    false
    null
string
    " "
    " chars "
chars
    char
    char chars
char
    UNICODE \ [",\,CTRL]
    \"
    \\
    \/
    \b
    \f
    \n
    \r
    \t
    \u hex hex hex hex
number
    int
    int frac
    int exp
    int frac exp
int
    digit
    onenine digits
    - digit
    - onenine digits
frac
    . digits
exp
    e digits
hex
    digit
    A - F
    a - f
digits
    digit
    digit digits
e
    e
    e+
    e-
    E
    E+
    E-

Slide 98

Slide 98 text

43 Parsing JSON is a Minefield http://seriot.ch/

Slide 102

Slide 102 text

43 Parsing JSON is a Minefield http://seriot.ch/
Legend: Expected; Parse Fail (Expect Success); Parse Success (Expect Fail); Parse Success (Undefined); Parse Fail (Undefined); Parser Crash; Timeout

Slide 105

Slide 105 text

45 Where to Get the Grammar From?

Slide 106

Slide 106 text

45 Where to Get the Grammar From? Hand-written parsers already encode the grammar

Slide 107

Slide 107 text

46 How to Extract This Grammar?

Slide 108

Slide 108 text

46 How to Extract This Grammar? • Inputs -> Dynamic Control Dependence Trees

Slide 109

Slide 109 text

46 How to Extract This Grammar? • Inputs -> Dynamic Control Dependence Trees • DCD Trees -> Context Free Grammar

Slide 110

Slide 110 text

47 Control Dependence Graph
Statement B is control dependent on A if A determines whether B executes.

def parse_csv(s,i):
    while s[i:]:
        if is_digit(s[i]):
            n,j = num(s[i:])
            i = i+j
        else:
            comma(s[i])
            i += 1

Slide 111

Slide 111 text

47 Control Dependence Graph
Statement B is control dependent on A if A determines whether B executes.
CDG for parse_csv

Slide 112

Slide 112 text

47 Control Dependence Graph
Statement B is control dependent on A if A determines whether B executes.
CDG for parse_csv: the while statement determines whether the if statement executes

Slide 113

Slide 113 text

48 Dynamic Control Dependence Tree
Each statement execution is represented as a separate node
CDG for parse_csv

Slide 114

Slide 114 text

48 Dynamic Control Dependence Tree
Each statement execution is represented as a separate node
CDG for parse_csv
DCD Tree for call parse_csv()

Slide 115

Slide 115 text

49 DCD Tree ~ Parse Tree •No tracking beyond input buffer •Characters are attached to nodes where they are accessed last
Input: "12,"

Slide 116

Slide 116 text

49 DCD Tree ~ Parse Tree •No tracking beyond input buffer •Characters are attached to nodes where they are accessed last
Input: "12," Characters: '1' '2' ','
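
A sketch of the "accessed last" rule. The classes `TrackedInput` and `Tracker`, and the simplified `num` and `comma` helpers, are assumptions for illustration: the input is wrapped so that every character read is attributed to the parser function that is active at that moment, and later reads overwrite earlier ones.

```python
class Tracker:
    """Keeps the stack of active parser calls and the last reader of each index."""
    def __init__(self):
        self.stack = [('<start>', 0)]   # (function name, call id)
        self.last_access = {}
        self.calls = 0

    def traced(self, fn):
        def wrapper(*args):
            self.calls += 1
            self.stack.append((fn.__name__, self.calls))
            try:
                return fn(*args)
            finally:
                self.stack.pop()
        wrapper.__name__ = fn.__name__
        return wrapper

class TrackedInput:
    """String wrapper that reports every character access to the tracker."""
    def __init__(self, text, tracker):
        self.text, self.tracker = text, tracker
    def __len__(self):
        return len(self.text)
    def __getitem__(self, i):
        self.tracker.last_access[i] = self.tracker.stack[-1]
        return self.text[i]

tracker = Tracker()

@tracker.traced
def num(s, i):
    while i < len(s) and s[i] in '0123456789':
        i += 1
    return i

@tracker.traced
def comma(s, i):
    assert s[i] == ','
    return i + 1

@tracker.traced
def parse_csv(s, i=0):
    while i < len(s):
        i = num(s, i) if s[i] in '0123456789' else comma(s, i)
    return i

def attribute(text):
    """Return (character, name of the function that accessed it last) pairs."""
    tracker.last_access.clear()
    parse_csv(TrackedInput(text, tracker))
    return [(text[i], tracker.last_access[i][0]) for i in sorted(tracker.last_access)]
```

For the input "12,", `num` peeks at the comma to decide where the number ends, but `comma` reads it afterwards, so the comma is attached to `comma` rather than to `num`.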

Slide 117

Slide 117 text

50

class Ex(Exception):
    """Parse error at position i of input s (definition assumed; not shown on the slide)."""

def is_digit(i):
    return i in '0123456789'

def parse_num(s,i):
    n = ''
    while s[i:] and is_digit(s[i]):
        n += s[i]
        i = i +1
    return i,n

def parse_paren(s, i):
    assert s[i] == '('
    i, v = parse_expr(s, i+1)
    if s[i:] == '':
        raise Ex(s, i)
    assert s[i] == ')'
    return i+1, v

def parse_expr(s, i = 0):
    expr, is_op = [], True
    while s[i:]:
        c = s[i]
        if is_digit(c):
            if not is_op: raise Ex(s,i)
            i,num = parse_num(s,i)
            expr.append(num)
            is_op = False
        elif c in ['+', '-', '*', '/']:
            if is_op: raise Ex(s,i)
            expr.append(c)
            is_op, i = True, i + 1
        elif c == '(':
            if not is_op: raise Ex(s,i)
            i, cexpr = parse_paren(s, i)
            expr.append(cexpr)
            is_op = False
        elif c == ')':
            break
        else:
            raise Ex(s,i)
    if is_op:
        raise Ex(s,i)
    return i, expr

9+3/4
Parse tree for parse_expr('9+3/4')

Slide 119

Slide 119 text

51 Identifying Compatible Nodes: which nodes correspond to the same nonterminal?
9+3/4

Slide 121

Slide 121 text

52 3 * (9 + 1)

Slide 123

Slide 123 text

52 (9 + 1) * 3 3 * (9 + 1)

Slide 126

Slide 126 text

53 3 * (9 + 1)

Slide 128

Slide 128 text

53 9 + 1 3 * (9 + 1)

Slide 131

Slide 131 text

54 3 * (9 + 1)

Slide 132

Slide 132 text

54 3 (9 + 1) * 3 * (9 + 1)

Slide 135

Slide 135 text

55 3*(1) 1

Slide 136

Slide 136 text

56 3*(1) 1

Slide 137

Slide 137 text

56 3*(1) 1 :=

Slide 138

Slide 138 text

56 3*(1) 1 := :=

Slide 139

Slide 139 text

:= | | | := := | := := '3' | '1' := '(' ')' := := '*' 57

Slide 140

Slide 140 text

:= := | := | | | := := | := := '3' | '1' := '(' ')' := := '*' 57
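
A sketch of how expansion rules can be read off a tree: every internal node contributes one rule, its name on the left and the names of its children on the right. The tree encoding as `(name, children)` tuples and the nonterminal names are hypothetical.

```python
def tree_to_rules(tree, grammar=None):
    """Collect one expansion rule per internal node of a (name, children) tree."""
    if grammar is None:
        grammar = {}
    name, children = tree
    if not children:                    # leaf: a terminal, contributes no rule
        return grammar
    rule = tuple(child[0] for child in children)
    grammar.setdefault(name, set()).add(rule)
    for child in children:
        tree_to_rules(child, grammar)
    return grammar

# Hypothetical tree for the input 3*(1); node names are illustrative only.
TREE = ('<start>', [('<expr>', [
    ('<num>', [('3', [])]),
    ('<op>', [('*', [])]),
    ('<paren>', [('(', []),
                 ('<expr>', [('<num>', [('1', [])])]),
                 (')', [])]),
])])

if __name__ == '__main__':
    for key, rules in tree_to_rules(TREE).items():
        print(key, ':=', ' | '.join(' '.join(repr(s) if s[0] != '<' else s
                                             for s in rule) for rule in rules))
```

Rules from several trees can be merged into the same dictionary by passing `grammar` along, which is how alternatives such as '3' | '1' accumulate under one key.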

Slide 141

Slide 141 text

58 calc.py Recovered Arithmetic Grammar
:= := | := | := '(' ')' | := '*' | '+' | '-' | '/' := | : [0-9]

Slide 142

Slide 142 text

59 := := | := | := '(' ')' | := '*' | '+' | '-' | '/' := | : [0-9]

Slide 143

Slide 143 text

59 := := | := | := '(' ')' | := '*' | '+' | '-' | '/' := | : [0-9]
8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0)))) / 5 * --++090

Slide 144

Slide 144 text

60 microjson.py Recovered JSON grammar
::= ::= '"' | '[' | '{' | | 'true' | 'false' | 'null' ::= + | + 'e' + ::= '+' | '-' | '.' | [0-9] | 'E' | 'e' ::= * '"' ::= ']' | (',')* ']' | ( ',' )+ (',' )* ']' ::= '}' | ( '"' ':' ',' )* '"' ':' '}' ::= ' ' | '!' | '#' | '$' | '%' | '&' | ''' | '*' | '+' | '-' | ',' | '.' | '/' | ':' | ';' | '<' | '=' | '>' | '?' | '@' | '[' | ']' | '^' | '_', ''',| '{' | '|' | '}' | '~' | '[A-Za-z0-9]' | '\' ::= '"' | '/' | 'b' | 'f' | 'n' | 'r' | 't'

            stm.next()
            if expect_key:
                raise JSONError(E_DKEY, stm, stm.pos)
            if c == '}':
                return result
            expect_key = 1
            continue
        # parse out a key/value pair
        elif c == '"':
            key = _from_json_string(stm)
            stm.skipspaces()
            c = stm.next()
            if c != ':':
                raise JSONError(E_COLON, stm, stm.pos)
            stm.skipspaces()
            val = _from_json_raw(stm)
            result[key] = val
            expect_key = 0
            continue
        raise JSONError(E_MALF, stm, stm.pos)

def _from_json_raw(stm):
    while True:
        stm.skipspaces()
        c = stm.peek()
        if c == '"':
            return _from_json_string(stm)
        elif c == '{':
            return _from_json_dict(stm)
        elif c == '[':
            return _from_json_list(stm)
        elif c == 't':
            return _from_json_fixed(stm, 'true', True, E_BOOL)
        elif c == 'f':
            return _from_json_fixed(stm, 'false', False, E_BOOL)
        elif c == 'n':
            return _from_json_fixed(stm, 'null', None, E_NULL)
        elif c in NUMSTART:
            return _from_json_number(stm)
        raise JSONError(E_MALF, stm, stm.pos)

def from_json(data):
    stm = JSONStream(data)
    return _from_json_raw(stm)

Slide 145

Slide 145 text

61 microjson.py Recovered JSON grammar
::= ::= '"' | '[' | '{' | | 'true' | 'false' | 'null' ::= + | + 'e' + ::= '+' | '-' | '.' | [0-9] | 'E' | 'e' ::= * '"' ::= ']' | (',')* ']' | ( ',' )+ (',' )* ']' ::= '}' | ( '"' ':' ',' )* '"' ':' '}' ::= ' ' | '!' | '#' | '$' | '%' | '&' | ''' | '*' | '+' | '-' | ',' | '.' | '/' | ':' | ';' | '<' | '=' | '>' | '?' | '@' | '[' | ']' | '^' | '_', ''',| '{' | '|' | '}' | '~' | '[A-Za-z0-9]' | '\' ::= '"' | '/' | 'b' | 'f' | 'n' | 'r' | 't'
Mimid: Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. ESEC/FSE 2020.
•JavaScript •C •Lisp •JSON •URL •CGI

Slide 146

Slide 146 text

Actual Specification

Slide 147

Slide 147 text

Dynamic Approximation (Mimid) Actual Specification

Slide 148

Slide 148 text

Dynamic Approximation (Mimid) Static Approximation (Static Mimid) Actual Specification

Slide 149

Slide 149 text

Dynamic Approximation (Mimid) Static Approximation (Static Mimid) Actual Specification IOT & Embedded

Slide 150

Slide 150 text

No content

Slide 151

Slide 151 text

Generating Unbiased Samples

Slide 152

Slide 152 text

Finding Good Samples Seed corpus?

Slide 153

Slide 153 text

Finding Good Samples Seed corpus? (Blind spots)

Slide 154

Slide 154 text

Key Idea

• Differentiate incomplete and incorrect inputs
• Solve one input symbol at a time systematically

Slide 155

Slide 155 text

Sample Free Generators

Slide 158

Slide 158 text

Sample Free Generators

A

Slide 159

Slide 159 text

Sample Free Generators

A

A ∉ { (, +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }

Slide 160

Slide 160 text

Sample Free Generators

A (

A ∉ { (, +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }

Slide 161

Slide 161 text

Sample Free Generators

A ( 2

A ∉ { (, +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }

Slide 162

Slide 162 text

Sample Free Generators

A ( 2 - B 9 ) 4 )

A ∉ { (, +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }
B ∉ { +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, ) }
) ∉ { +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }

Slide 164

Slide 164 text

Sample Free Generators

A ( 2 - B 9 ) 4 )

A ∉ { (, +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }
B ∉ { +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, ) }
) ∉ { +, -, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }

Generated input: (2-94)

Slide 165

Slide 165 text

Sample Free Generators

A ( 2 - B 9 ) 4 )

• Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser Directed Fuzzing. PLDI 2019.
• Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
• Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSRE 2021 (submitted).
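The key idea of separating incomplete from incorrect inputs and solving one symbol at a time can be sketched as follows. This is a minimal illustration, not the pFuzzer implementation: the hand-written validator, the names `validate` and `generate`, the alphabet, and the `soft_limit` parameter are all assumptions made for this example, and the toy language has no unary operators.

```python
import random

DIGITS = "0123456789"

class Incomplete(Exception):
    """Input ended while the parser still expected more characters."""

class Incorrect(Exception):
    """The parser met a character it cannot accept at this position."""

def parse_term(s, i):
    if i >= len(s):
        raise Incomplete()
    if s[i] == "(":
        i = parse_expr(s, i + 1)
        if i >= len(s):
            raise Incomplete()
        if s[i] != ")":
            raise Incorrect()
        return i + 1
    if s[i] in DIGITS:
        while i < len(s) and s[i] in DIGITS:
            i += 1
        return i
    raise Incorrect()

def parse_expr(s, i):
    i = parse_term(s, i)
    while i < len(s) and s[i] in "+-":
        i = parse_term(s, i + 1)
    return i

def validate(s):
    """Classify s as 'complete', 'incomplete' (valid prefix) or 'incorrect'."""
    try:
        end = parse_expr(s, 0)
    except Incomplete:
        return "incomplete"
    except Incorrect:
        return "incorrect"
    return "complete" if end == len(s) else "incorrect"

def generate(rng, alphabet="()+-0123456789AB", soft_limit=10):
    """Grow a valid input one character at a time, without any seed corpus."""
    prefix = ""
    while True:
        status = validate(prefix)
        if status == "complete" and (len(prefix) >= soft_limit or rng.random() < 0.2):
            return prefix
        if len(prefix) >= soft_limit:
            # Past the soft limit, prefer characters that close the input.
            candidates = [")"] + list(DIGITS)
        else:
            candidates = rng.sample(alphabet, len(alphabet))
        for ch in candidates:
            # Keep the first character that does not make the prefix incorrect.
            if validate(prefix + ch) != "incorrect":
                prefix += ch
                break

print(generate(random.Random(0)))
```

Characters such as `A` and `B` are rejected immediately because they turn the prefix incorrect, while `(` or `2-` are kept because they only leave it incomplete.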

Slide 166

Slide 166 text

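Fast Fuzzing with Grammars

<start> := <expr>
<expr> := <expr> '+' <expr> | <expr> '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number>
<number> := <integer> | <integer> '.' <integer>
<integer> := <digit> <integer> | <digit>
<digit> := [0-9]

fuzz(expr_grammar, '<start>')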

Slide 167

Slide 167 text

Fast Fuzzing with Grammars

fuzz(expr_grammar, '<start>')

def fuzz(grammar, key):
    return collapse(gen_key(grammar, key))

Slide 168

Slide 168 text

Fast Fuzzing with Grammars

def gen_key(grammar, key):
    if is_terminal_symbol(key):
        return (key, [])
    else:
        next_rule = random.choice(grammar[key])
        return (key, gen_rule(grammar, next_rule))

Slide 169

Slide 169 text

Fast Fuzzing with Grammars

def gen_rule(grammar, rule):
    return [gen_key(grammar, token) for token in rule]

Slide 171

Slide 171 text

Fast Fuzzing with Grammars

fuzz(expr_grammar, '<start>')

(Figure: generated derivation tree with nodes +, 1, 8)

Slide 172

Slide 172 text

Fast Fuzzing with Grammars

def collapse(tree):
    key, children = tree
    if not children:
        return key
    return ''.join([collapse(c) for c in children])

Slide 173

Slide 173 text

Fast Fuzzing with Grammars

fuzz(expr_grammar, '<start>')

(Figure: the derivation tree with nodes +, 1, 8 collapses to the string "1 + 8")
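Put together, the pieces from these slides can be run as a small self-contained fuzzer. This is a sketch under stated assumptions: the dictionary encoding of the grammar, the `rng` parameter, and the `max_depth` cut-off are additions for this example. The cut-off assumes that the last alternative of every rule is the cheapest one, which holds for this grammar but not for grammars in general.

```python
import random

# The expression grammar from the slides, encoded as a dict (assumed encoding).
EXPR_GRAMMAR = {
    "<start>": [["<expr>"]],
    "<expr>": [
        ["<expr>", "+", "<expr>"],
        ["<expr>", "-", "<expr>"],
        ["<expr>", "/", "<expr>"],
        ["<expr>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<number>"],
    ],
    "<number>": [["<integer>"], ["<integer>", ".", "<integer>"]],
    "<integer>": [["<digit>", "<integer>"], ["<digit>"]],
    "<digit>": [[str(d)] for d in range(10)],
}

def is_terminal_symbol(key):
    return not (key.startswith("<") and key.endswith(">"))

def gen_key(grammar, key, rng, depth=0, max_depth=8):
    """Return a derivation tree (key, children) for the given symbol."""
    if is_terminal_symbol(key):
        return (key, [])
    rules = grammar[key]
    # Assumption: the last alternative is the cheapest; use it to stop recursion.
    rule = rules[-1] if depth >= max_depth else rng.choice(rules)
    return (key, gen_rule(grammar, rule, rng, depth + 1, max_depth))

def gen_rule(grammar, rule, rng, depth, max_depth):
    return [gen_key(grammar, token, rng, depth, max_depth) for token in rule]

def collapse(tree):
    """Flatten a derivation tree into the string it derives."""
    key, children = tree
    if not children:
        return key
    return "".join(collapse(c) for c in children)

def fuzz(grammar, key, rng):
    return collapse(gen_key(grammar, key, rng))

rng = random.Random(0)
for _ in range(3):
    print(fuzz(EXPR_GRAMMAR, "<start>", rng))
```

Without the depth cut-off, four of the six `<expr>` alternatives recurse twice, so purely random expansion is not guaranteed to terminate.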

Slide 174

Slide 174 text

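Compiling the Grammar

<start> := <expr>
<expr> := <expr> '+' <expr> | <expr> '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number>
<number> := <integer> | <integer> '.' <integer>
<integer> := <digit> <integer> | <digit>
<digit> := [0-9]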

Slide 175

Slide 175 text

Compiling the Grammar

import random

def start():
    expr()

def expr():
    match random.randrange(6):
        case 0: expr(); print('+', end=''); expr()
        case 1: expr(); print('-', end=''); expr()
        case 2: expr(); print('/', end=''); expr()
        case 3: expr(); print('*', end=''); expr()
        case 4: print('(', end=''); expr(); print(')', end='')
        case 5: number()

def number():
    match random.randrange(2):
        case 0: integer()
        case 1: integer(); print('.', end=''); integer()

def integer():
    match random.randrange(2):
        case 0: digit(); integer()
        case 1: digit()

def digit():
    match random.randrange(10):
        case 0: print('0', end='')
        case 1: print('1', end='')
        case 2: print('2', end='')
        case 3: print('3', end='')
        case 4: print('4', end='')
        case 5: print('5', end='')
        case 6: print('6', end='')
        case 7: print('7', end='')
        case 8: print('8', end='')
        case 9: print('9', end='')

Slide 177

Slide 177 text

Compiling the Grammar

# Variant that draws its choices from a pre-generated stream of random numbers (rops)

def start(rops):
    expr(rops)

def expr(rops):
    match next(rops) % 6:
        case 0: expr(rops); print('+', end=''); expr(rops)
        case 1: expr(rops); print('-', end=''); expr(rops)
        case 2: expr(rops); print('/', end=''); expr(rops)
        case 3: expr(rops); print('*', end=''); expr(rops)
        case 4: print('(', end=''); expr(rops); print(')', end='')
        case 5: number(rops)

def number(rops):
    match next(rops) % 2:
        case 0: integer(rops)
        case 1: integer(rops); print('.', end=''); integer(rops)

def integer(rops):
    match next(rops) % 2:
        case 0: digit(rops); integer(rops)
        case 1: digit(rops)

def digit(rops):
    match next(rops) % 10:
        case 0: print('0', end='')
        case 1: print('1', end='')
        case 2: print('2', end='')
        case 3: print('3', end='')
        case 4: print('4', end='')
        case 5: print('5', end='')
        case 6: print('6', end='')
        case 7: print('7', end='')
        case 8: print('8', end='')
        case 9: print('9', end='')
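The translation from grammar to generator functions can itself be automated. The sketch below emits one Python function per nonterminal and loads the result with `exec`. It is only an illustration of the idea and not how F1 is implemented; the names `compile_grammar` and `fn_name`, the `out` list, and the depth cut-off (which assumes the last alternative of each rule is the cheapest) are assumptions made for this example.

```python
import random

EXPR_GRAMMAR = {
    "<start>": [["<expr>"]],
    "<expr>": [
        ["<expr>", "+", "<expr>"],
        ["<expr>", "-", "<expr>"],
        ["<expr>", "/", "<expr>"],
        ["<expr>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<number>"],
    ],
    "<number>": [["<integer>"], ["<integer>", ".", "<integer>"]],
    "<integer>": [["<digit>", "<integer>"], ["<digit>"]],
    "<digit>": [[str(d)] for d in range(10)],
}

def fn_name(key):
    """Map a nonterminal such as '<expr>' to a function name such as 'gen_expr'."""
    return "gen_" + key.strip("<>")

def compile_grammar(grammar, max_depth=8):
    """Emit Python source with one generator function per nonterminal."""
    lines = []
    for key, rules in grammar.items():
        last = len(rules) - 1
        lines.append(f"def {fn_name(key)}(rng, out, depth=0):")
        # Past max_depth, always take the last (assumed cheapest) alternative.
        lines.append(f"    choice = {last} if depth >= {max_depth} "
                     f"else rng.randrange({len(rules)})")
        for i, rule in enumerate(rules):
            lines.append(f"    {'if' if i == 0 else 'elif'} choice == {i}:")
            if not rule:
                lines.append("        pass")
            for token in rule:
                if token in grammar:
                    lines.append(f"        {fn_name(token)}(rng, out, depth + 1)")
                else:
                    lines.append(f"        out.append({token!r})")
        lines.append("")
    return "\n".join(lines)

source = compile_grammar(EXPR_GRAMMAR)
namespace = {}
exec(source, namespace)  # load the generated functions

out = []
namespace["gen_start"](random.Random(0), out)
print("".join(out))
```

Because the grammar is interpreted only once, at compile time, the generated functions contain no dictionary lookups or symbol tests on the hot path.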

Slide 179

Slide 179 text

Fast Fuzzers

Gopinath and Zeller. Building Fast Fuzzers. 2019.

Grammar Fuzzer

grammar = """
<start> := <expr>
<expr> := <expr> '+' <expr> | <expr> '-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number>
<number> := <integer> | <integer> '.' <integer>
<integer> := <digit> <integer> | <digit>
<digit> := [0-9]
"""
generate(grammar)

Compiled Grammar (F1)

def start():
    expr()

def expr():
    match random.randrange(6):
        case 0: expr(); print('+', end=''); expr()
        case 1: expr(); print('-', end=''); expr()
        case 2: expr(); print('/', end=''); expr()
        case 3: expr(); print('*', end=''); expr()
        case 4: print('(', end=''); expr(); print(')', end='')
        case 5: number()

def number():
    match random.randrange(2):
        case 0: integer()
        case 1: integer(); print('.', end=''); integer()

def integer():
    match random.randrange(2):
        case 0: digit(); integer()
        case 1: digit()

def digit():
    print(random.randrange(10), end='')

Grammar VM (F1)

def start_0(rops):
    r = next(rops)
    if 0 <= r < 43: expr_0(rops)
    elif 43 <= r < 85: expr_1(rops)
    elif 85 <= r < 128: expr_2(rops)
    elif 128 <= r < 171: expr_3(rops)
    elif 171 <= r < 213: expr_4(rops)
    else: expr_5(rops)

def expr_0(rops):
    r = next(rops)
    if 0 <= r < 43: expr_0(rops)
    elif 43 <= r < 85: expr_1(rops)
    elif 85 <= r < 128: expr_2(rops)
    elif 128 <= r < 171: expr_3(rops)
    elif 171 <= r < 213: expr_4(rops)
    else: expr_5(rops)
    print('+')
    r = next(rops)
    if 0 <= r < 43: expr_0(rops)
    elif 43 <= r < 85: expr_1(rops)
    elif 85 <= r < 128: expr_2(rops)
    elif 128 <= r < 171: expr_3(rops)
    elif 171 <= r < 213: expr_4(rops)
    else: expr_5(rops)

Slide 180

Slide 180 text

The Fuzzing Pipeline

Program Under Test

Slide 181

Slide 181 text

The Fuzzing Pipeline

(Figure: pipeline diagram with the components Program Under Test, pFuzzer, Grammar Miner, and F1 Grammar Fuzzer, and the edge labels Active Guidance, Samples, Grammar, Inputs, and Active Learning)

Slide 182

Slide 182 text

The Fuzzing Synergy

Mimid Grammar Miner, Parser Directed pFuzzer, F1 Fuzzer VM

• Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser Directed Fuzzing. PLDI 2019.
• Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.
• Gopinath, Bendrissou, Mathis, and Zeller. Black-box Testing with Monotonic Prefixes. ISSTA 2021 (submitted).
• Gopinath and Zeller. Building Fast Fuzzers. 2019 (unpublished).
• Gopinath, Mathis, and Zeller. Mining Input Grammars from Dynamic Control Flow. FSE 2020.

Slide 185

Slide 185 text

No content

Slide 186

Slide 186 text

Challenge: Multilevel Envelopes

POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 312

IBM

Slide 187

Slide 187 text

Challenge: Multilevel Envelopes

POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 312

IBM

Envelope levels: HTTP POST

Slide 188

Slide 188 text

Challenge: Multilevel Envelopes

POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 312

IBM

Envelope levels: HTTP POST, XML PAYLOAD

Slide 189

Slide 189 text

Challenge: Multilevel Envelopes

POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 312

IBM

Envelope levels: HTTP POST, XML PAYLOAD, SOAP

Slide 190

Slide 190 text

Challenge: Multilevel Envelopes

POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 312

IBM

Envelope levels: HTTP POST, XML PAYLOAD, SOAP, RPC Call
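To see why nested envelopes are hard for a single grammar, it helps to build one by hand. The sketch below wraps an RPC call in a SOAP-like XML envelope and then in an HTTP POST whose `Content-Length` depends on the inner payload. The element names (`Envelope`, `Body`, `GetStockPrice`, `StockName`) and the helper names are assumptions for illustration, and XML namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

def rpc_call(stock_name):
    """Innermost level: the RPC call with its argument."""
    call = ET.Element("GetStockPrice")
    ET.SubElement(call, "StockName").text = stock_name
    return call

def soap_envelope(call):
    """Middle levels: wrap the call in a simplified SOAP envelope, serialized as XML."""
    envelope = ET.Element("Envelope")
    ET.SubElement(envelope, "Body").append(call)
    return ET.tostring(envelope, encoding="utf-8")

def http_post(path, host, payload):
    """Outermost level: the HTTP header must agree with the payload length."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/soap+xml; charset=utf-8\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload

request = http_post("/InStock", "www.stock.org", soap_envelope(rpc_call("IBM")))
print(request.decode("utf-8"))
```

Each level has its own syntax, and the outer `Content-Length` is a constraint over the inner levels, so mutating the payload without updating the header yields an input the server may reject before the inner parsers run.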

Slide 191

Slide 191 text

Future Challenge: Multilevel Envelopes

Slide 193

Slide 193 text

Future Challenge: Multilevel Envelopes

Mimid

Slide 194

Slide 194 text

Challenge: Semantic Envelopes (McKeeman 1998)

example.c

#include
int main() {
    int number1, number2, number3;
    number1 = 10;
    number2 = 20;
    number3 = sum(number1, number2);
    if (number3 > 100)
        return 0;
    return 1;
}

$ cc example.c -o example

1. Syntactically correct
2. Variables declared before use
3. Use correct types
4. Statically conforming
5. Dynamically conforming
6. Model conforming

(parse) (compile) (link) (run) (synthesis)

Slide 197

Slide 197 text

No content

Slide 198

Slide 198 text

Fuzzing Inputs Program Behavior

Slide 200

Slide 200 text

Fuzzing Inputs Program Behavior ✔

Slide 201

Slide 201 text

No content

Slide 202

Slide 202 text

We Found A Crash

Slide 204

Slide 204 text

Why Did My Program Crash?

Slide 205

Slide 205 text

Why Did My Program Crash? 8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5. 6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +- -(+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---( --+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) + 8.2 - 27 - -9 / +((+9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+(((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +-- (+-+-7 * (-0 * (+(((((2)) + 8 - 3 - ++9.0 + ---(- -+7 / (1 / +++6.37) + (1) / 482) / +++-+0)))) * - +5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +- (--2 - -++-9.0)))) / 5 * --++090 + * -+5 + 7.513) ))) - (+1 / ++((-84)))))))) * 8.2 - 27 - -9 / +(( +9 * --2 + --+-+-((-1 * +(8 - 5 - 6)) * (-(a-+ (((+(4))))) - ++4) / +(-+---((5.6 - --(3 * -1.8 * +(6 * +-(((-(-6) * ---+6)) / +--(+-+-7 * (-0 * ( +(((((2)) + 8 - 3 - ++9.0 + ---(--+7 / (1 / +++6. 37) + (1) / 482) / +++-+0)))) * -+5 + 7.513)))) - (+1 / ++((-84)))))))) * ++5 / +-(--2 - -++-9.0))) ) / 5 * --++090 ++5 / +-(--2 - -++-9.0)))) / 5 * --++090

Slide 206

Slide 206 text

Why Did My Program Crash?

DD Minimized Input: ((4))

Slide 207

Slide 207 text

Why Did My Program Crash?

DD Minimized Input: ((4))

00000 ?

Slide 208

Slide 208 text

Why Did My Program Crash?

DD Minimized Input: ((4))

00000 ?
((5)) ?

Slide 209

Slide 209 text

Why Did My Program Crash?

DD Minimized Input: ((4))

00000 ?
((5)) ?
(++5) ?

Slide 210

Slide 210 text

var A = class extends (class {}){};
Issue 2937 from Closure

const [y,y] = [];
Issue 386 from Rhino

var {baz:{} = baz => {}} = baz => {};
Issue 385 from Rhino

{while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
Issue 2842 from Closure

Slide 214

Slide 214 text

var A = class extends (class {}){};
Issue 2937 from Closure

const [y,y] = [];
Issue 386 from Rhino

var {baz:{} = baz => {}} = baz => {};
Issue 385 from Rhino

{while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
Issue 2842 from Closure

Delta Minimization is useful but not sufficient
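For reference, delta debugging itself fits in a few lines. The sketch below is a complement-only variant of ddmin over strings. The failing input and the regular-expression failure predicate are made up for this example and stand in for a real crash check.

```python
import re

def ddmin(inp, fails):
    """Shrink inp while fails(inp) keeps holding (complement-only ddmin variant)."""
    assert fails(inp)
    n = 2  # number of chunks the input is split into
    while len(inp) >= 2:
        start = 0.0
        chunk = len(inp) / n
        reduced = False
        while start < len(inp):
            # Remove one chunk and test what is left.
            complement = inp[:int(start)] + inp[int(start + chunk):]
            if fails(complement):
                inp = complement
                n = max(n - 1, 2)
                reduced = True
                break
            start += chunk
        if not reduced:
            if n == len(inp):
                break  # no single character can be removed any more
            n = min(n * 2, len(inp))
    return inp

def fails(s):
    # Hypothetical failure: a number wrapped in doubled parentheses.
    return re.search(r"\(\(\d+\)\)", s) is not None

print(ddmin("8-(3+((4))*2)/5", fails))  # prints ((4))
```

The result is minimal, but it does not say which parts matter: whether the `4` is essential or any expression would do is exactly what the minimized input leaves open.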

Slide 215

Slide 215 text

( ( 4 ) )

Slide 216

Slide 216 text

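( ( 4 ) )

<start> := <expr>
<expr> := <term> ' + ' <expr> | <term> ' - ' <expr> | <term>
<term> := <factor> ' * ' <term> | <factor> ' / ' <term> | <factor>
<factor> := '+' <factor> | '-' <factor> | '(' <expr> ')' | <integer> '.' <integer> | <integer>
<integer> := <digit> <integer> | <digit>
<digit> := [0-9]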

Slide 217

Slide 217 text

( ( 4 ) )

Slide 220

Slide 220 text

( ( 4 ) )

1 * (2 - 3)
✓ Did not reproduce the failure

Slide 222

Slide 222 text

( ( 4 ) )

c

Slide 224

Slide 224 text

( ( 4 ) )

c

1 + 3 + 4
✓ Did not reproduce the failure

Slide 225

Slide 225 text

( ( 4 ) )

c c

Slide 226

Slide 226 text

3 * 4

c c

Slide 227

Slide 227 text

3 * 4

c c

✓ Did not reproduce the failure

Slide 228

Slide 228 text

( ( 4 ) )

c c c c c c c

Slide 229

Slide 229 text

( ( 1 - 2 ) )

c c c c c c c

Slide 230

Slide 230 text

( ( 1 - 2 ) )

c c c c c c c

✘ Reproduced the failure

Slide 231

Slide 231 text

( ( 1 - 2 ) ) c c c c c c c ( ( 1 - 2 ) )

Slide 232

Slide 232 text

( ( 1 - 2 ) ) c c c c c c c ✘ ( ( 1 - 2 ) )

Slide 233

Slide 233 text

( ( 1 - 2 ) ) c c c c c c c ✘ ( ( 1 - 2 ) ) ( ( 2 * 3 + 4 ) )

Slide 234

Slide 234 text

( ( 1 - 2 ) ) c c c c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) )

Slide 235

Slide 235 text

( ( 1 - 2 ) ) c c c c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ( ( - 2 / 1 ) )

Slide 236

Slide 236 text

( ( 1 - 2 ) ) c c c c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ✘ ( ( - 2 / 1 ) )

Slide 237

Slide 237 text

( ( 1 - 2 ) ) c c c c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ✘ ( ( - 2 / 1 ) ) ( ( 98 - 0 ) )

Slide 238

Slide 238 text

( ( 1 - 2 ) ) c c c c c c c ✘ ( ( 1 - 2 ) ) ✘ ( ( 2 * 3 + 4 ) ) ✘ ( ( - 2 / 1 ) ) ✘ ( ( 98 - 0 ) )

Slide 239

Slide 239 text

( ( 4 ) )

c c c c c c c A

( ( <expr> ) )

Slide 241

Slide 241 text

Minimized Input: ( ( 4 ) )

c c c c c c c A

Abstract Failure Inducing Input: ( ( <expr> ) )

def check(parsed):
    if parsed.is_nested() and parsed.child.is_nested():
        raise Exception()
    return parsed

Slide 242

Slide 242 text

var A = class extends (class {}){}; Issue 2937 from Closure

Slide 243

Slide 243 text

var A = class extends (class {}){}; Issue 2937 from Closure = class extends (class {}){}

Slide 244

Slide 244 text

var {baz:{} = baz => {}} = baz => {}; Issue 385 from Rhino

Slide 245

Slide 245 text

var {baz:{} = baz => {}} = baz => {}; Issue 385 from Rhino var {<$Id1>:{} = <$Id1> => {}} ;

Slide 246

Slide 246 text

const [y,y] = []; Issue 386 from Rhino

Slide 247

Slide 247 text

const [y,y] = []; Issue 386 from Rhino const [<$Id1>,<$Id1>] = []

Slide 248

Slide 248 text

{while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}} Issue 2842 from Closure

Slide 249

Slide 249 text

{while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}} Issue 2842 from Closure {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}

Slide 250

Slide 250 text

DDSET

Minimized Input: ( ( 4 ) )
Abstract Failure Inducing Input: ( ( <expr> ) )

• Effectively abstracts a minimized input
• The abstraction identifies where the problem lies
• Decomposes complex program behaviors

Gopinath, Kampmann, Havrikov, Soremekun, and Zeller. Abstracting Failure Inducing Inputs. ISSTA 2020. (ISSTA 2020 Distinguished Award)
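The check behind an abstraction can be sketched as follows: replace a part of the minimized input with other expressions and see whether the failure persists in every case. This is a simplified illustration rather than the DDSET algorithm. The string-level predicate `fails`, the helper names, and the fixed list `SAMPLES` (standing in for expressions generated from the grammar) are assumptions made for this example.

```python
def matching(s):
    """Map each '(' index to its matching ')' index, or None if unbalanced."""
    stack, pairs = [], {}
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                return None
            pairs[stack.pop()] = i
    return pairs if not stack else None

def fails(s):
    """Hypothetical failure: some parenthesized expression is wrapped in parentheses again."""
    pairs = matching(s)
    if pairs is None:
        return False
    return any(pairs.get(i + 1) == j - 1 for i, j in pairs.items())

# Hand-picked replacements; a real implementation would generate these from the grammar.
SAMPLES = ["4", "1-2", "2*3+4", "-2/1", "98-0", "(7)"]

def holds(template, hole="<expr>"):
    """An abstraction holds if every instantiation still reproduces the failure."""
    return all(fails(template.replace(hole, e)) for e in SAMPLES)

print(holds("((<expr>))"))  # prints True: the inner 4 can be any expression
print(holds("(<expr>)"))    # prints False: the doubled parentheses are essential
```

The first template generalizes the `4`, while the second shows that abstracting away one pair of parentheses loses the failure, which is the distinction the `c` and `A` marks on the derivation tree record.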

Slide 251

Slide 251 text

108 := := ' + ' | ' + ' | ' - ' | ' - ' | := ' * ' | ' * ' | ' / ' | ' / ' | := '+' | '-' | '(' ')' | '(' ')' := := := '(' ')' Specialized Grammar is (())

Slide 252

Slide 252 text

108 := := ' + ' | ' + ' | ' - ' | ' - ' | := ' * ' | ' * ' | ' / ' | ' / ' | := '+' | '-' | '(' ')' | '(' ')' := := := '(' ')' ((1)) + 2 (23 * ((3)) - 34) (344- 4 + ((223))) (1) - 3 * 773 + (-22 + 1) 1798 - 889 / ((333-1)) * 2 / 3 + 1 34 + ((4)) -334 + (334 - (22) + 919 * 0 + 1 98435747+ 88 + (((0))) + (1) - 1 * 7 / 4 * 889 - 2 8 + ((8)) + --1 + 11223 / 344 - 39 + (1) - 456 + 134 / 45 437 + 8 - 1 * ((9 + ((1))) - 1 + 99111948 + 3 --1 + (112) - 2 + 445) + 0 74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * 223 - 1233 + 334672 2 * ((9)) - (1798 - 889 / (333-1) * 2 / 3 + 100012 + 3434392 + 234 ----6 * 1798 - 889 / (33 778 - (((1) - 3 * 773 + (-22 + 1) * (4545) - 23 - ((2)) * 773 + (-22 + 1) / 3434 + ---1 + 1 / 34343 + 112 349 + (((1) - 3 * 3 + (-22 + 1) ((+ (-22 + 1) * (4545) - 23 - (2) * 773 + ((-22 + 1)) / 3434 + ---1 + 1 / 34343 + 1123 8 + ((8)) + --1 + / 1 - 39 + (1) - 456 + 134 / 45 ))(((1) - 2334 + ((((1)) - 3 * 773 + (-22 + 1) * (2) - 23 - (2) * 773 + (-22 + 1) / 3 74 + 3 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 45 ))(((1) - 3 * 773 1+ 33+ 24343433 +23343 - ((74 + 334 + ((178 - 88 / (3393-1) * 1002 / 3 + 1+ 3439)) * - 1233 + 334672)) ((8 + ((8)) + --1 + / 344 - 39 + (1) - 456 + 134 / 4 ✘ Specialized Grammar is (()) ✘

Slide 253

Slide 253 text

Algebra of Grammar Specializations

Slide 254

Slide 254 text

Algebra of Grammar Specializations

where
  is = class extends (class {}){}   (Closure 2937)
  is var {<$Id2>:{} = <$Id2> => {}} ;   (Rhino 385)
  is const [<$Id3>,<$Id3>] = []   (Rhino 386)
  is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}   (Closure 2842)

Slide 258

Slide 258 text

Algebra of Grammar Specializations

Gopinath, Nemati, and Zeller. Input Algebras. ICSE 2021.

Mechanized proofs are available

Slide 259

Slide 259 text

Algebra of Grammar Specializations

Gopinath, Nemati, and Zeller. Input Algebras. ICSE 2021.

where
  is = class extends (class {}){}
  is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}
  is var {<$Id2>:{} = <$Id2> => {}} ;
  is const [<$Id3>,<$Id3>] = []

Mechanized proofs are available

Slide 260

Slide 260 text

No content

Slide 261

Slide 261 text

No content

Slide 262

Slide 262 text

No content

Slide 263

Slide 263 text

No content

Slide 264

Slide 264 text

No content

Slide 265

Slide 265 text

Science of Focused Fuzzing where is (()) is / 0

Slide 266

Slide 266 text

where is (()) is / 0

Slide 267

Slide 267 text

where is (()) is / 0 Isolating & Decomposing Program Behaviors

Slide 268

Slide 268 text

where is (()) is / 0 Isolating & Decomposing Program Behaviors Algebra of Program Behaviors

Slide 269

Slide 269 text

where is (()) is / 0 Isolating & Decomposing Program Behaviors Algebra of Program Behaviors Science of Program Behaviors

Slide 270

Slide 270 text

No content

Slide 271

Slide 271 text

Challenge: Identify Behavior Divergence

Input                            Behavior         Program
insert into tbl values (1,2,3)
select b from tbl
drop table tbl

Slide 272

Slide 272 text

Challenge: Identify Behavior Divergence

Input                            Behavior
insert into tbl values (1,2,3)   update($file)
select b from tbl                read($file)
drop table tbl                   rm($file)

Program
action='read'
$action('tbl')

assert invoked read: 'tbl.data' ✔

Slide 273

Slide 273 text

Challenge: Identify Behavior Divergence

Input                            Behavior
insert into tbl values (1,2,3)   update($file)
select b from tbl                read($file)
drop table tbl                   rm($file)

Program
action='read'
$action('tbl')

assert invoked read: 'tbl.data' ✔

action='rm'

Slide 275

Slide 275 text

def triangle(a, b, c):
    if a == b:
        if b == c:
            return Equilateral
        else:
            return Isosceles
    else:
        if b == c:
            return Isosceles
        else:
            if a == c:
                return Isosceles
            else:
                return Scalene
Challenge: Identify Behavior Divergence

Slide 276

Slide 276 text

def triangle(a, b, c):
    if a == b:
        if b == c:
            return Equilateral
        else:
            return Isosceles
    else:
        if b == c:
            return Isosceles
        else:
            if a == c:
                return Isosceles
            else:
                return Scalene
Challenge: Identify Behavior Divergence
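Three different paths through `triangle` return Isosceles, so the return value alone cannot tell those behaviors apart. A minimal sketch of one way to make the difference observable is to record the branch decisions alongside the result. The `trace` parameter, the string labels, and the `behavior` helper are assumptions added for illustration; the return values are strings here so that the example runs on its own.

```python
def triangle(a, b, c, trace):
    """Classify a triangle and append each branch decision to `trace`."""
    if a == b:
        trace.append("a==b")
        if b == c:
            trace.append("b==c")
            return "Equilateral"
        trace.append("b!=c")
        return "Isosceles"
    trace.append("a!=b")
    if b == c:
        trace.append("b==c")
        return "Isosceles"
    trace.append("b!=c")
    if a == c:
        trace.append("a==c")
        return "Isosceles"
    trace.append("a!=c")
    return "Scalene"


def behavior(a, b, c):
    """Return (result, path) so that equal results can still be told apart."""
    trace = []
    result = triangle(a, b, c, trace)
    return result, tuple(trace)
```

For example, `behavior(2, 2, 3)` and `behavior(3, 2, 2)` both report Isosceles, but their recorded paths differ.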

Slide 277

Slide 277 text

Challenge: Identify Behavior Divergence LockB() LockA() DoAB() UnlockB() UnlockA()

Slide 278

Slide 278 text

Challenge: Identify Behavior Divergence LockB() LockA() DoAB() UnlockB() UnlockA()

Slide 279

Slide 279 text

Challenge: Identify Behavior Divergence
LockB() LockA() DoAB() UnlockB() UnlockA()
LockB() LockA() DoAB() UnlockA() UnlockB()

Slide 280

Slide 280 text

Challenge: Identify Behavior Divergence
LockB() LockA() DoAB() UnlockB() UnlockA()
LockB() LockA() DoAB() UnlockA() UnlockB()
UnlockA() LockA() DoAB() LockB() UnlockB()
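One way to tell such call sequences apart is to replay each one against a small set of rules. The sketch below is an assumption made for illustration, not the criterion used in the talk: a lock may only be released while held, `DoAB` needs both locks, and nothing may remain held at the end. `check_lock_trace` is a hypothetical helper, and call names are written without parentheses.

```python
def check_lock_trace(calls):
    """Replay a call sequence; return the first violation found, or None."""
    held = set()
    for call in calls:
        if call.startswith("Lock"):
            name = call[len("Lock"):]
            if name in held:
                return f"{call}: lock {name} is already held"
            held.add(name)
        elif call.startswith("Unlock"):
            name = call[len("Unlock"):]
            if name not in held:
                return f"{call}: lock {name} is not held"
            held.remove(name)
        elif call == "DoAB":
            # Assumed rule: the critical section needs both locks.
            if not {"A", "B"} <= held:
                return "DoAB: both locks must be held"
    if held:
        return "locks still held at end: " + ", ".join(sorted(held))
    return None
```

Under these assumed rules the first two sequences pass regardless of release order, while a sequence that starts with an unlock is flagged at its first call.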

Slide 281

Slide 281 text

118
def triangle(a, b, c):
    if a == b:
        if b == c:
            return Equilateral
        else:
            return Isosceles
    else:
        if b == c:
            return Isosceles
        else:
            if a == c:
                return Isosceles
            else:
                return Scalene

Slide 282

Slide 282 text

118
def triangle(a, b, c):
    if a == b:
        if b == c:
            return Equilateral
        else:
            return Isosceles
    else:
        if b == c:
            return Isosceles
        else:
            if a == c:
                return Isosceles
            else:
                return Scalene

Slide 283

Slide 283 text

119

Slide 284

Slide 284 text

119 :=

Slide 285

Slide 285 text

120
:=
:= (a==b) | (a!=b)

Slide 286

Slide 286 text

121
:=
:= (a==b) | (a!=b)
:= (b==c) | (b!=c)

Slide 287

Slide 287 text

122
:=
:= (a==b) | (a!=b)
:= (b==c) | (b!=c)
:= "return Equilateral"
:= "return Isosceles"

Slide 288

Slide 288 text

123
:=
:= (a==b) | (a!=b)
:= (b==c) | (b!=c)
:= "return Equilateral"
:= "return Isosceles"
:= (b==c) | (b!=c)
:= "return Isosceles"
:= (a==c) | (a!=c)
:= "return Isosceles"
:= "return Scalene"
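The rules on this slide describe the control flow of `triangle` as a grammar whose strings are execution paths. A minimal sketch of that idea follows. The nonterminal names (`<start>`, `<ab>`, `<bc_eq>`, and so on) are assumptions, since the names are not legible in the slide text, and the three "return Isosceles" rules are merged into one nonterminal for brevity.

```python
# Hypothetical nonterminal names; terminals are branch conditions and returns.
BEHAVIOR_GRAMMAR = {
    "<start>": [["<ab>"]],
    "<ab>": [["(a==b)", "<bc_eq>"], ["(a!=b)", "<bc_ne>"]],
    "<bc_eq>": [["(b==c)", "<equilateral>"], ["(b!=c)", "<isosceles>"]],
    "<bc_ne>": [["(b==c)", "<isosceles>"], ["(b!=c)", "<ac>"]],
    "<ac>": [["(a==c)", "<isosceles>"], ["(a!=c)", "<scalene>"]],
    "<equilateral>": [["return Equilateral"]],
    "<isosceles>": [["return Isosceles"]],
    "<scalene>": [["return Scalene"]],
}


def expand(symbol, grammar):
    """Return every terminal sequence derivable from `symbol`."""
    if symbol not in grammar:
        return [[symbol]]  # terminal: a single one-token sequence
    results = []
    for alternative in grammar[symbol]:
        partials = [[]]
        for token in alternative:
            # Extend each partial path with each expansion of the next token.
            partials = [p + s for p in partials for s in expand(token, grammar)]
        results.extend(partials)
    return results


def all_behaviors(grammar=BEHAVIOR_GRAMMAR):
    """List each execution path of the program as a readable string."""
    return [" ".join(path) for path in expand("<start>", grammar)]
```

Enumerating the grammar yields five paths, three of which end in "return Isosceles". Because this grammar is finite and non-recursive, exhaustive expansion terminates; a recursive behavior grammar would need a depth bound.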

Slide 289

Slide 289 text

LockB() LockA() DoAB() UnlockB() UnlockA() Challenge: Identify Behavior Divergence "lockA" "lockB" "UnlockB" "UnlockA"

Slide 290

Slide 290 text

No content

Slide 291

Slide 291 text

126 The Science of Inputs Program The Science of Behaviors ✔

Slide 292

Slide 292 text

126 The Science of Inputs Program The Science of Behaviors ✔

Slide 293

Slide 293 text

126 The Science of Inputs Program The Science of Behaviors ✔

Slide 294

Slide 294 text

126 The Science of Fuzzing The Science of Inputs Program The Science of Behaviors ✔

Slide 295

Slide 295 text

No content

Slide 296

Slide 296 text

128 Oracles for Fuzzing Focused Fuzzing Automatic Repair Fault Localization Beyond Syntax Grammar Mining

Slide 297

Slide 297 text

128 Oracles for Fuzzing Focused Fuzzing Automatic Repair Fault Localization Beyond Syntax Grammar Mining Onward

Slide 298

Slide 298 text

Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner https://rahul.gopinath.org

Slide 299

Slide 299 text

Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner https://rahul.gopinath.org

Slide 300

Slide 300 text

Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner https://rahul.gopinath.org

Slide 301

Slide 301 text

Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner https://rahul.gopinath.org

Slide 302

Slide 302 text

Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner Problem: How to Fuzz Parsers https://rahul.gopinath.org

Slide 303

Slide 303 text

Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner Problem: How to Fuzz Parsers https://rahul.gopinath.org Generalize

Slide 304

Slide 304 text

Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner Problem: How to Fuzz Parsers https://rahul.gopinath.org Generalize Combine

Slide 305

Slide 305 text

Future Work: Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner Problem: How to Fuzz Parsers https://rahul.gopinath.org Generalize Combine The Science of Fuzzing

Slide 306

Slide 306 text

Future Work: Program Under Test pFuzzer F1 Fuzzer Inputs Grammar Miner Problem: How to Fuzz Parsers https://rahul.gopinath.org Debugging Combine The Science of Fuzzing