Slide 1

Slide 1 text

Building Blocks for Fuzzing
Rahul Gopinath
https://rahul.gopinath.org @_rahulgopinath

Slide 2

Slide 2 text

Rahul Gopinath
Building Blocks for Fuzzing
https://rahul.gopinath.org @_rahulgopinath

Slide 3

Slide 3 text

The story begins 2500 years ago.
Vedic Era 500 BC

Slide 4

Slide 4 text

Classical Era 500 BC

Slide 5

Slide 5 text

500 BC
Aṣṭādhyāyī
Dakṣiputra Pāṇini
Ad hoc rules, Formal specification
Vedic Sanskrit, Classical Sanskrit

Slide 6

Slide 6 text

2500 years later...
2022 CE
Dawn of a New Era

Slide 7

Slide 7 text

New Challenges

Slide 8

Slide 8 text

Bugs

Slide 9

Slide 9 text


Slide 10

Slide 10 text

Testing
Input ✓ ✘

    @app.route('/admin')
    def admin():
        username = request.cookies.get("username")
        if not username:
            return {"Error": "Specify username in Cookie"}
        username = urllib.quote(os.path.basename(username))
        url = "http://permissions:5000/permissions/{}".format(username)
        resp = requests.request(method="GET", url=url)
        # "superadmin\ud888" will be simplified to "superadmin"
        ret = ujson.loads(resp.text)
        if resp.status_code == 200:
            if "superadmin" in ret["roles"]:
                return {"OK": "Superadmin Access granted"}
            else:
                e = u"Access denied. User has following roles: {}".format(ret["roles"])
                return {"Error": e}, 401
        else:
            return {"Error": ret["Error"]}, 500

Slide 11

Slide 11 text

Testing
Input ✓ ✘

    POST /user/update HTTP/1.1
    { "user": "exampleUser", "roles": [ "superadmin" ] }

    HTTP/1.1 401 Not Authorized
    Content-Type: application/json
    { "Error": "Assignment of internal role 'superadmin' is forbidden" }

    HTTP/1.1 200 OK
    Content-type: application/json
    { "result": "OK: Updated user 'exampleUser' with role 'superadmin'" }

    HTTP/1.1 500 Internal Server Error ✘

Slide 12

Slide 12 text

Automatic Testing
Input ✓ ✘
Verify Behavior

Slide 13

Slide 13 text

Automatic Testing
Input ✓ ✘
Verify Behavior (Oracle)
Oracles require domain expertise

Slide 14

Slide 14 text

Fuzzing
Trash deck technique: 1950s - Gerald Weinberg
Crash?

Slide 15

Slide 15 text

Fuzzing
Crash?
• Memory Bounds Violation
• Privilege Escalation
• Safety Violations
• Metamorphic Relations
• Differential Execution
ASAN, MSAN, TSAN, NSan, FuZZan, UBSan
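To make the idea of an oracle concrete, here is a minimal sketch of two of the checks listed above: a crash oracle and a metamorphic (round-trip) relation. The helper names `crash_oracle` and `roundtrip_oracle` are illustrative assumptions, and Python's standard `json` module stands in for the program under test.

```python
import json

def crash_oracle(program, data, expected=(ValueError,)):
    """Pass if the program accepts the input or rejects it with an expected error."""
    try:
        program(data)
    except expected:
        return True      # clean rejection, not a bug
    except Exception:
        return False     # any other exception counts as a crash
    return True

def roundtrip_oracle(value):
    """Metamorphic relation: serializing and re-parsing must give the value back."""
    return json.loads(json.dumps(value)) == value

# A malformed document is rejected cleanly by json.loads.
print(crash_oracle(json.loads, "{"))
# A program that raises an unexpected exception fails the oracle.
print(crash_oracle(lambda s: 1 / 0, "x"))
print(roundtrip_oracle({"a": [1, 2, "k"]}))
```

Neither oracle needs to know the expected output for a given input, which is why such checks are a common substitute for hand-written expectations when fuzzing.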

Slide 16

Slide 16 text

https://www.fuzzingbook.org

Slide 17

Slide 17 text

Traditional Fuzzing

[ ; x 1 - G P Z + w c c k c ] ; , N 9 J + ? # 6 ^ 6 \ e ? ] 9 l u 2 _ % ' 4 G X " 0 V U B [ E / r ~ f A p u 6 b 8 < { % s i q 8 Z h . 6 { V , h r ? ; {Ti.r3PIxMMMv6{xS^+'Hq!AxB"YXRS@! Kd6;wtAMefFWM(`|J_<1~o}z3K(CCzRH J I I v H z > _ * . \ > J r l U 3 2 ~ e G P ? lR=bF3+;y$3lodQ & ] B S 6 R & j ? # t P 7 i a V } - } ` \ ? [ _ [ Z ^ L B M P G - FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2D|vBy! ^ z k h d f 3 C 5 P A k R ? V ( ( - % > < h n | 3='i2Qx]D$qs4O`1@fevnG'2\11Vf3piU37@ 5 : d f d 4 5 * ( 7 ^ % 5 a p \ z I y l " ' f , $ee,J4Gw:cgNKLie3nx9(`efSlg6#[K"@Wjh Z}r[Scun&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/ 6N-wyzj/MTd#A;r

Program
https://www.fuzzingbook.org/html/Fuzzer.html
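A minimal sketch of a traditional random fuzzer, in the spirit of the Fuzzing Book chapter linked above, might look as follows. The function names, default parameters, and the `find_crashes` driver are assumptions made for illustration.

```python
import random

def random_fuzzer(max_length=100, char_start=32, char_range=95, rng=random):
    """Return a string of up to max_length random printable ASCII characters."""
    length = rng.randrange(0, max_length + 1)
    return "".join(chr(rng.randrange(char_start, char_start + char_range))
                   for _ in range(length))

def find_crashes(program, trials=1000, seed=0):
    """Feed random strings to program and collect every input that raises."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        data = random_fuzzer(rng=rng)
        try:
            program(data)
        except Exception as exc:
            failures.append((data, exc))
    return failures

print(repr(random_fuzzer(rng=random.Random(1))))
```

The generator knows nothing about the input format, so for a program that expects structured input most of these strings are rejected before any interesting code runs.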

Slide 18

Slide 18 text

Structured Inputs
SYNTAX ERROR ✘

Slide 19

Slide 19 text

Parser

    def process_input(input):
        try:
            val = parse(input)    # ✘ SYNTAX ERROR
            res = process(val)
            return res
        except SyntaxError:
            return Error

Slide 20

Slide 20 text

The Core
SYNTAX ERROR ✘

Slide 21

Slide 21 text

SYNTAX ERROR

Slide 22

Slide 22 text

Feedback Driven Fuzzing
• Insert Instrumentation
• Generate inputs
• Collect execution feedback
  • Branches covered during execution
• Slightly Mutate Input and try again
Collect inputs obtaining new coverage
https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 23

Slide 23 text

Feedback Driven Fuzzing

Original:

    def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene

Instrumented:

    def triangle(a, b, c):
        __probe_enter()
        if a == b:
            __probe_1()
            if b == c:
                __probe_2()
                return Equilateral
            else:
                __probe_3()
                return Isosceles
        else:
            __probe_4()
            if b == c:
                __probe_5()
                return Isosceles
            else:
                __probe_6()
                if a == c:
                    __probe_7()
                    return Isosceles
                else:
                    __probe_8()
                    return Scalene

https://www.fuzzingbook.org/html/MutationFuzzer.html
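In Python, comparable feedback can be collected without editing the source by installing a trace function. The sketch below is one possible way to do this with `sys.settrace`; it records executed lines rather than branches, returns strings where the slide uses bare names, and `run_with_coverage` is a hypothetical helper.

```python
import sys

def triangle(a, b, c):
    if a == b:
        if b == c:
            return "Equilateral"
        else:
            return "Isosceles"
    else:
        if b == c:
            return "Isosceles"
        else:
            if a == c:
                return "Isosceles"
            else:
                return "Scalene"

def run_with_coverage(func, *args):
    """Run func(*args); return its result and the set of executed line offsets."""
    covered = set()
    code = func.__code__

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if frame.f_code is code and event == "line":
            covered.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, covered

result_eq, cov_eq = run_with_coverage(triangle, 1, 1, 1)
result_sc, cov_sc = run_with_coverage(triangle, 1, 2, 3)
print(result_eq, sorted(cov_eq))
print(result_sc, sorted(cov_sc))
```

Two inputs that take different branches yield different line sets, which is the signal a feedback-driven fuzzer uses to decide whether an input is worth keeping.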

Slide 24

Slide 24 text

Feedback Driven Fuzzing
triangle(1,1,1)
https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 26

Slide 26 text

Feedback Driven Fuzzing
triangle(1,1,1)
Mutated: triangle(1,1,2)
https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 27

Slide 27 text

Feedback Driven Fuzzing
Mutated: triangle(1,1,3)
https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 28

Slide 28 text

Feedback Driven Fuzzing
triangle(1,1,1)
triangle(1,1,2)
triangle(1,1,3)
https://www.fuzzingbook.org/html/MutationFuzzer.html
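The loop walked through on these slides can be sketched as follows. This is a simplified illustration, not the Fuzzing Book implementation: the probes are modelled as branch ids added to a set, and the mutation operator (change one side by one) is an assumption.

```python
import random

def triangle(a, b, c, hit):
    """Instrumented triangle classifier; `hit` collects the branch ids taken."""
    if a == b:
        hit.add(1)
        if b == c:
            hit.add(2)
            return "Equilateral"
        hit.add(3)
        return "Isosceles"
    hit.add(4)
    if b == c:
        hit.add(5)
        return "Isosceles"
    hit.add(6)
    if a == c:
        hit.add(7)
        return "Isosceles"
    hit.add(8)
    return "Scalene"

def mutate(inp, rng):
    """Slightly mutate the input: change one side by -1 or +1."""
    sides = list(inp)
    sides[rng.randrange(3)] += rng.choice([-1, 1])
    return tuple(sides)

def feedback_fuzz(seed=(1, 1, 1), trials=2000, rng_seed=0):
    rng = random.Random(rng_seed)
    population = [seed]
    seen = set()
    triangle(*seed, seen)                  # coverage of the seed input
    for _ in range(trials):
        candidate = mutate(rng.choice(population), rng)
        hit = set()
        triangle(*candidate, hit)
        if not hit <= seen:                # new coverage: keep the input
            seen |= hit
            population.append(candidate)
    return population, seen

population, seen = feedback_fuzz()
print(population)
print(sorted(seen))
```

Only inputs that reach a branch not seen before join the population, so the population stays small while coverage grows.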

Slide 29

Slide 29 text

• Insert Instrumentation

Slide 30

Slide 30 text

Feedback Driven Fuzzing
Weakness: Tokens

    static int is_reserved_word_token(const char *s, int len) {
      const char *reserved[] = {
          "break", "case", "catch", "continue", "debugger", "default",
          "delete", "do", "else", "false", "finally", "for", "function",
          "if", "in", "instanceof", "new", "null", "return", "switch",
          "this", "throw", "true", "try", "typeof", "var", "void",
          "while", "with", "let", "undefined", ((void *)0)};
      int i;
      if (!mjs_is_alpha(s[0])) return 0;
      for (i = 0; reserved[i] != ((void *)0); i++) {
        if (len == (int)strlen(reserved[i]) && strncmp(s, reserved[i], len) == 0)
          return i + 1;
      }
      return 0;
    }

    if (x > 100) { }         coverage: 20%
    if (x > 100) { } e       coverage: 5%
    if (x > 100) { } el      coverage: 5%
    if (x > 100) { } els     coverage: 5%
    if (x > 100) { } else    coverage: 25%

No smooth coverage gradient in parsers

Slide 31

Slide 31 text

Feedback Driven Fuzzing

    def json_raw(stm):
        while True:
            stm.skipspaces()
            c = stm.peek()
            if c == 't':
                return json_fixed(stm, 'true')
            elif c == 'f':
                return json_fixed(stm, 'false')
            elif c == 'n':
                return json_fixed(stm, 'null')
            elif c == '"':
                return json_string(stm)
            elif c == '{':
                return json_dict(stm)
            elif c == '[':
                return json_list(stm)
            elif c in NUMSTART:
                return json_number(stm)
            raise JSONError(E_MALF, stm, stm.pos)

Weak points:
• Need for smooth coverage gradient
• Coverage only provides first level guidance
  1. {"abc":[]}
  2. [{"a":[]}, {"b":[]}, {"c":["ab","c"]}]

Slide 34

Slide 34 text

Feedback Driven Fuzzing
• Insert Instrumentation
• Generate inputs
• Collect execution feedback
  • Branches covered during execution
• Slightly Mutate Input and try again
Collect inputs obtaining new coverage

Slide 35

Slide 35 text

Solver Directed Fuzzing
• Collect path constraints
• Solve negated constraints for new inputs
(a == b) (b == c) (b != c)
triangle(1,2,1)
https://www.fuzzingbook.org/html/ConcolicFuzzer.html
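The two steps above can be sketched on the triangle example. Note the substitution: a real concolic fuzzer hands the constraints to an SMT solver, whereas this sketch uses an exhaustive search over a tiny domain as a stand-in. All names (`triangle_path`, `solve`, `explore`) are hypothetical.

```python
from itertools import product

# Branch conditions of triangle(a, b, c) as (label, predicate) pairs.
EQ_AB = ("a == b", lambda a, b, c: a == b)
EQ_BC = ("b == c", lambda a, b, c: b == c)
EQ_AC = ("a == c", lambda a, b, c: a == c)

def triangle_path(a, b, c):
    """Follow triangle's branch structure; record (label, predicate, outcome)."""
    path = []

    def take(cond):
        label, pred = cond
        outcome = pred(a, b, c)
        path.append((label, pred, outcome))
        return outcome

    if take(EQ_AB):
        take(EQ_BC)
    elif not take(EQ_BC):
        take(EQ_AC)
    return path

def solve(constraints, domain=range(4)):
    """Brute-force stand-in for a solver: first input meeting all constraints."""
    for candidate in product(domain, repeat=3):
        if all(pred(*candidate) == wanted for pred, wanted in constraints):
            return candidate
    return None

def explore(seed=(1, 1, 1)):
    worklist, seen_paths, inputs = [seed], set(), []
    while worklist:
        inp = worklist.pop()
        path = triangle_path(*inp)
        key = tuple((label, outcome) for label, _, outcome in path)
        if key in seen_paths:
            continue
        seen_paths.add(key)
        inputs.append(inp)
        # Keep a prefix of the path and negate the next constraint.
        for i, (_, pred, outcome) in enumerate(path):
            prefix = [(p, o) for _, p, o in path[:i]]
            new_input = solve(prefix + [(pred, not outcome)])
            if new_input is not None:
                worklist.append(new_input)
    return inputs, seen_paths

inputs, paths = explore()
print(inputs)
```

Each negated constraint yields an input that follows a different path, so every path through `triangle` is reached without random search.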

Slide 36

Slide 36 text

Solver Directed Fuzzing
• Collect path constraints
• Solve constraints for new inputs

    void next_sym() {
      while (!eof) {
        switch (ch) {
          case '{': next_ch(); sym = LBRA; return;
          case '}': next_ch(); sym = RBRA; return;
          case '(': next_ch(); sym = LPAR; return;
          case ')': next_ch(); sym = RPAR; return;
          case '+': next_ch(); sym = PLUS; return;
          case '-': next_ch(); sym = MINUS; return;
          case '<': next_ch(); sym = LESS; return;
          case ';': next_ch(); sym = SEMI; return;
          case '=': next_ch(); sym = EQUAL; return;
          default:
            if (ch >= '0' && ch <= '9') {
              int_val = 0; /* missing overflow check */
              while (ch >= '0' && ch <= '9') {
                int_val = int_val*10 + (ch - '0');
                next_ch();
              }
              sym = INT;
            } else if (ch >= 'a' && ch <= 'z') {
              int i = 0; /* missing overflow check */
              while ((ch >= 'a' && ch <= 'z') || ch == '_') {
                id_name[i++] = ch;
                next_ch();
              }
              id_name[i] = '\0';
              sym = 0;
              while (words[sym] != NULL && strcmp(words[sym], id_name) != 0)
                sym++;
              if (words[sym] == NULL)
                if (id_name[1] == '\0') sym = ID;
                else syntax_error();
            } else syntax_error();
            return;
        }
      }
    }

Weak point: Path explosion
https://www.fuzzingbook.org/html/ConcolicFuzzer.html

Slide 37

Slide 37 text

Solver Directed Fuzzing
• Collect path constraints
• Solve negated constraints for new inputs

Slide 38

Slide 38 text

Specialized Generators
• Specialize generation for a domain

Slide 39

Slide 39 text

Overcoming Parsers

Slide 40

Slide 40 text

Needed: Input Language

Slide 41

Slide 41 text

Formal Languages
Language Descriptions: Grammars
3. Regular
Context Free
Recursively Enumerable
(Chomsky, 1956)
Easy to produce and parse
Argument Stack
Return Stack

Slide 42

Slide 42 text

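Grammar
JSON grammar
Definition for key

    <json>   ::= <elt>
    <elt>    ::= <object>
               | <array>
               | <string>
               | <number>
               | `true` | `false` | `null`
    <object> ::= `{` <items> `}` | `{}`
    <items>  ::= <item> `,` <items> | <item>
    <item>   ::= <string> `:` <elt>
    <array>  ::= `[` <elts> `]` | `[]`
    <elts>   ::= <elt> `,` <elts> | <elt>
    <string> ::= `"` <chars> `"` | `""`
    <chars>  ::= <char> <chars> | <char>
    <char>   ::= [A-Za-z0-9]
    <number> ::= <digits>
    <digits> ::= <digit> <digits> | <digit>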

Slide 43

Slide 43 text

Grammar
JSON grammar
Expansion Rule
Terminal Symbol
Nonterminal Symbol

Slide 44

Slide 44 text

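Tree

    <json>   ::= <elt>
    <elt>    ::= <object>
               | <array>
               | <string>
               | <number>
               | `true` | `false` | `null`
    <object> ::= `{` <items> `}` | `{}`
    <items>  ::= <item> `,` <items> | <item>
    <item>   ::= <string> `:` <elt>
    <array>  ::= `[` <elts> `]` | `[]`
    <elts>   ::= <elt> `,` <elts> | <elt>
    <string> ::= `"` <chars> `"` | `""`
    <chars>  ::= <char> <chars> | <char>
    <char>   ::= [A-Za-z0-9]
    <number> ::= <digits>
    <digits> ::= <digit> <digits> | <digit>
    <digit>  ::= [0-9]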

Slide 47

Slide 47 text

Tree
{ <items> }

Slide 48

Slide 48 text

Tree
{ <items> }
{ <item> }

Slide 49

Slide 49 text

Tree
{ <item> }
{ <string> : <elt> }

Slide 50

Slide 50 text

Tree
{ <string> : <elt> }
{ "" : <elt> }

Slide 51

Slide 51 text

Tree
{ "" : <elt> }

Slide 52

Slide 52 text

Tree
{ "" : <elt> }
{ "" : true }

Slide 53

Slide 53 text

Derivation Tree
{"":true}

Slide 54

Slide 54 text

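Grammar Fuzzer

    {
      '<json>'   : [['<elt>']],
      '<elt>'    : [['<object>'], ['<array>'], ['<string>'], ['<number>'],
                    ['true'], ['false'], ['null']],
      '<object>' : [['{', '<items>', '}'], ['{}']],
      '<items>'  : [['<item>', ',', '<items>'], ['<item>']],
      '<item>'   : [['<string>', ':', '<elt>']],
      '<array>'  : [['[', '<elts>', ']'], ['[]']],
      '<elts>'   : [['<elt>', ',', '<elts>'], ['<elt>']],
      '<string>' : [['"', '<chars>', '"'], ['""']],
      '<chars>'  : [['<char>', '<chars>'], ['<char>']],
      '<number>' : [['<digits>']],
      '<digits>' : [['<digit>', '<digits>'], ['<digit>']],
      '<char>'   : [[c] for c in string.ascii_letters + string.digits],
      '<digit>'  : [[c] for c in string.digits]
    }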
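A minimal sketch of a grammar fuzzer over a grammar in this dictionary form is given below. The nonterminal names and the depth-limiting strategy (switch to the shortest expansion once a depth budget is spent) are assumptions chosen for illustration, not necessarily what the talk's fuzzer does.

```python
import random
import string

GRAMMAR = {
    "<json>": [["<elt>"]],
    "<elt>": [["<object>"], ["<array>"], ["<string>"], ["<number>"],
              ["true"], ["false"], ["null"]],
    "<object>": [["{", "<items>", "}"], ["{}"]],
    "<items>": [["<item>", ",", "<items>"], ["<item>"]],
    "<item>": [["<string>", ":", "<elt>"]],
    "<array>": [["[", "<elts>", "]"], ["[]"]],
    "<elts>": [["<elt>", ",", "<elts>"], ["<elt>"]],
    "<string>": [['"', "<chars>", '"'], ['""']],
    "<chars>": [["<char>", "<chars>"], ["<char>"]],
    "<char>": [[c] for c in string.ascii_letters + string.digits],
    "<number>": [["<digits>"]],
    "<digits>": [["<digit>", "<digits>"], ["<digit>"]],
    "<digit>": [[c] for c in string.digits],
}

def generate(grammar, symbol="<json>", depth=0, max_depth=8, rng=random):
    """Expand symbol into a string by choosing expansion rules at random."""
    if symbol not in grammar:
        return symbol                       # terminal symbol: emit as is
    rules = grammar[symbol]
    if depth >= max_depth:
        # Depth budget spent: use the shortest rule so expansion terminates.
        rules = [min(rules, key=len)]
    rule = rng.choice(rules)
    return "".join(generate(grammar, s, depth + 1, max_depth, rng)
                   for s in rule)

rng = random.Random(0)
for _ in range(5):
    print(generate(GRAMMAR, rng=rng))
```

Every output is a sentence of the grammar, so it is syntactically well-formed with respect to the grammar by construction. The grammar is a simplification of JSON: it allows numbers with leading zeros, which a strict JSON parser rejects.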

Slide 55

Slide 55 text

General Parser

Slide 56

Slide 56 text

Lang Fuzzer
https://www.fuzzingbook.org/html/LangFuzzer.html

Slide 57

Slide 57 text

    def process_input(input):
        try:
            val = parse(input)    # ✔
            res = process(val)
            return res
        except SyntaxError:
            return Error

Slide 58

Slide 58 text

Where to Get the Grammar From?

Slide 59

Slide 59 text

Almost Everyone Uses Handwritten Parsers
https://notes.eatonphil.com/parser-generators-vs-handwritten-parsers-survey-2021.html

Slide 60

Slide 60 text

Where to Get the Grammar From?

Slide 61

Slide 61 text

"Be liberal in what you accept, and conservative in what you send"
Postel's Law

Slide 62

Slide 62 text

JSON common quirks from https://github.com/google/wuffs

QUIRK_ALLOW_ASCII_CONTROL_CODES
QUIRK_ALLOW_BACKSLASH_A
QUIRK_ALLOW_BACKSLASH_CAPITAL_U
QUIRK_ALLOW_BACKSLASH_E
QUIRK_ALLOW_BACKSLASH_NEW_LINE
QUIRK_ALLOW_BACKSLASH_QUESTION_MARK
QUIRK_ALLOW_BACKSLASH_SINGLE_QUOTE
QUIRK_ALLOW_BACKSLASH_V
QUIRK_ALLOW_BACKSLASH_X_AS_BYTES
QUIRK_ALLOW_BACKSLASH_X_AS_CODE_POINTS
QUIRK_ALLOW_BACKSLASH_ZERO
QUIRK_ALLOW_COMMENT_BLOCK
QUIRK_ALLOW_COMMENT_LINE
QUIRK_ALLOW_EXTRA_COMMA
QUIRK_ALLOW_INF_NAN_NUMBERS
QUIRK_ALLOW_LEADING_ASCII_RECORD_SEPARATOR
QUIRK_ALLOW_LEADING_UNICODE_BYTE_ORDER_MARK
QUIRK_ALLOW_TRAILING_FILLER
QUIRK_EXPECT_TRAILING_NEW_LINE_OR_EOF
QUIRK_JSON_POINTER_ALLOW_TILDE_N_TILDE_R_TILDE_T
QUIRK_REPLACE_INVALID_UNICODE

Slide 63

Slide 63 text

"Be liberal in what you accept, and conservative in what you send"
Postel's Law
The Specification
The Implementation
Extra "Features"
Where to Get the Grammar From?

Slide 64

Slide 64 text

Where to Get the Grammar From?
Hand-written parsers already encode the grammar

Slide 65

Slide 65 text

def json_raw(stm) : while True : stm.skipspaces( ) c = stm.peek( ) if c == 't' : return json_fixed(stm, 'true' ) elif c == 'f' : return json_fixed(stm, 'false' ) elif c == 'n': return json_fixed(stm, 'null' ) elif c == '"': return json_string(stm ) elif c == '{': return json_dict(stm ) elif c == '[': return json_list(stm ) elif c in NUMSTART : return json_number(stm ) raise JSONError(E_MALF, stm, stm.pos) ::= 
 | 
 | 
 | 
 | | | ::= `"` `" ` | `"" ` ::= | ::= `{``} ` | `{} ` ::= `,` | ::= `:` ::= `[``] ` | `[] ` ::= `,` | ::= ::= | https://github.com/phensley/microjson MicroJSON 65 65

Slide 68

Slide 68 text

def json_raw(stm) : while True : stm.skipspaces( ) c = stm.peek( ) if c == 't' : return json_fixed(stm, 'true' ) elif c == 'f' : return json_fixed(stm, 'false' ) elif c == 'n': return json_fixed(stm, 'null' ) elif c == '"': return json_string(stm ) elif c == '{': return json_object(stm ) elif c == '[': return json_array(stm ) elif c in NUMSTART : return json_number(stm ) raise JSONError(E_MALF, stm, stm.pos) ::= ::= | ::= | | 
 | `"` | `{` 
 | `[` 
 | [[1-9] 
 68

Slide 70

Slide 70 text

    def json_string(stm):
        # skip over '"'
        stm.next()
        r = []
        while True:
            c = stm.next()
            if c == '':
                raise JSONError(E_TRUNC)
            elif c == '\\':
                c = stm.next()
                r.append(decode_escape(c, stm))
            elif c == '"':
                return ''.join(r)
            else:
                r.append(c)

Input: "ab"
[Grammar fragments mined from json_string]

Slide 71

Slide 71 text

71 ::= ::= '"' | '[' | '{' | | 'true ' | 'false ' | 'null ' ::= + | + 'e' + ::= '+' | '-' | '.' | [0-9] | 'E' | 'e ' ::= * '" ' ::= '] ' | (',')* '] ' | ( ',' )+ (',' )* '] ' ::= '} ' | ( '"' ':' ',' )* '"' ':' '} ' ::= ' ' | '!' | '#' | '$' | '%' | '&' | '' ' | '*' | '+' | '-' | ',' | '.' | '/' | ':' | '; ' | '<' | '=' | '>' | '?' | '@' | '[' | ']' | '^ ' | '_', ''',| '{' | '|' | '}' | '~ ' | '[A-Za-z0-9] ' | '\' ::= '"' | '/' | 'b' | 'f' | 'n' | 'r' | 't' stm.next()

Slide 72

Slide 72 text

Evaluation: Accuracy

    Subject          Precision (Mimid)   Recall (Mimid)
    calc.py          100.0 %             100.0 %
    mathexpr.py       87.5 %              92.7 %
    cgidecode.py     100.0 %             100.0 %
    urlparse.py      100.0 %              96.4 %
    microjson.py      98.7 %              93.0 %
    parseclisp.py     99.3 %              80.6 %
    jsonparser.c     100.0 %              83.8 %
    tiny.c           100.0 %              92.8 %
    mjs.c             95.4 %              95.9 %

Precision: Inputs generated by inferred grammar that were accepted by the program
Recall: Inputs generated by golden grammar that were accepted by the inferred grammar parser
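Both measures reduce to an acceptance rate over a sample of generated inputs. The following sketch shows the computation under the definitions above; `acceptance_rate` and `accepts_json` are hypothetical helpers, and Python's `json.loads` stands in for the program under test.

```python
import json

def acceptance_rate(inputs, accepts):
    """Fraction of inputs for which accepts(input) is true."""
    inputs = list(inputs)
    return sum(1 for i in inputs if accepts(i)) / len(inputs)

def accepts_json(text):
    """Stand-in for 'the program accepts the input'."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

# Precision: sample from the inferred grammar, check against the program.
sample_from_inferred_grammar = ["1", "[", "{}"]
print(acceptance_rate(sample_from_inferred_grammar, accepts_json))
```

Recall is computed the same way with the roles swapped: the sample comes from the golden grammar and `accepts` is a parser built from the inferred grammar.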

Slide 73

Slide 73 text


Slide 74

Slide 74 text


Slide 75

Slide 75 text

Challenges
Servers: HTTP Parser, XML Parser, SOAP Parser, RPC Parser
Compilers: C Parser, Check Declarations, Check Types, Static Checks
Semantics
Application

Slide 76

Slide 76 text


Slide 77

Slide 77 text

What Can We Do With Grammars?

Slide 78

Slide 78 text

Fuzz Programs?

Slide 79

Slide 79 text

Grammar Fuzzer
Fast Fuzzers
Building Fast Fuzzers, Gopinath and Zeller 2019

Compiled Grammar (F1)

    def json():
        elt()

    def elt():
        match (random() % 7):
            case 0: object()
            case 1: array()
            case 2: string()
            case 3: number()
            case 4: print('true')
            case 5: print('false')
            case 6: print('null')

    def object():
        match (random() % 2):
            case 0: print('{'); items(); print('}')
            case 1: print('{'); print('}')

    def array():
        match (random() % 2):
            case 0: print('['); elts(); print(']')
            case 1: print('['); print(']')

    def items():
        match (random() % 2):
            case 0: item(); print(','); items()
            case 1: item()

Grammar VM (F1)

    def json_0(rops):
        elt_0(rops)

    def elt_0(rops):
        r = next(rops)
        if 0 <= r < 36: elt_0()
        elif 36 <= r < 73: elt_1()
        elif 73 <= r < 109: elt_2()
        elif 109 <= r < 146: elt_3()
        elif 146 <= r < 182: elt_4()
        elif 182 <= r < 219: elt_5()
        else: elt_6()

    def elt_0(rops):
        r = next(rops)
        if 0 <= r < 43: expr_0()
        elif 43 <= r < 85: expr_1()
        elif 85 <= r < 128: expr_2()
        elif 128 <= r < 171: expr_3()
        elif 171 <= r < 213: expr_4()
        else: expr_5()
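The idea of compiling a grammar into code can be sketched in a few lines of Python. This is an illustrative toy and not the F1 implementation: `compile_grammar`, the `gen_` prefix, the depth budget, and the small example grammar are all assumptions.

```python
import json
import random

GRAMMAR = {
    "<json>": [["<elt>"]],
    "<elt>": [["<array>"], ["<digit>"], ["true"]],
    "<array>": [["[", "<elts>", "]"], ["[]"]],
    "<elts>": [["<elt>", ",", "<elts>"], ["<elt>"]],
    "<digit>": [[c] for c in "0123456789"],
}

def compile_grammar(grammar, max_depth=8):
    """Emit one Python function per nonterminal and return the namespace."""
    def fname(nonterminal):
        return "gen_" + nonterminal.strip("<>")

    src = []
    for nonterminal, rules in grammar.items():
        src.append(f"def {fname(nonterminal)}(out, rng, depth):")
        # Past the depth budget always take the shortest rule, to terminate.
        shortest = min(range(len(rules)), key=lambda i: len(rules[i]))
        src.append(f"    r = {shortest} if depth > {max_depth} "
                   f"else rng.randrange({len(rules)})")
        for i, rule in enumerate(rules):
            src.append(f"    {'if' if i == 0 else 'elif'} r == {i}:")
            for sym in rule:
                if sym in grammar:
                    src.append(f"        {fname(sym)}(out, rng, depth + 1)")
                else:
                    src.append(f"        out.append({sym!r})")
    namespace = {}
    exec("\n".join(src), namespace)
    return namespace

fuzzer = compile_grammar(GRAMMAR)
rng = random.Random(0)
out = []
fuzzer["gen_json"](out, rng, 0)
text = "".join(out)
print(text, json.loads(text))
```

After compilation no grammar lookups happen at generation time: each nonterminal is a plain function and each rule is a branch, which is the property the compiled fuzzers on this slide exploit.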

Slide 80

Slide 80 text

Focused Fuzzing
Generate inputs that focus on specific system parts
• Covering specific portions of code
• Specific paths through the code
• Performing specific functions
Without damaging the nondeterminism.

Slide 81

Slide 81 text

    def process_input(input):
        try:
            val = parse(input)
            res = process(val)
            return res
        except SyntaxError:
            return Error

{"a": ["key"]} ✓

Slide 82

Slide 82 text

{"a": ["key"]}
{"": [1,2,"k"]} ✘

Slide 83

Slide 83 text

{"a": ["key"]}
{"": [1,2,"k"]}
["A", "B", "C"] ✓

Slide 84

Slide 84 text

{"a": ["key"]}
{"": [1,2,"k"]}
["A", "B", "C"]
[{"": [1,2,3,4]}] ✘

Slide 85

Slide 85 text

{"a": ["key"]}
{"": [1,2,"k"]}
["A", "B", "C"]
[{"": [1,2,3,4]}]

    if json.has_key(""):
        raise Exception()

Slide 86

Slide 86 text

def process_input(input):
    try:
        val = parse(input)
        res = process(val)
        return res
    except SyntaxError:
        return Error
{"type":"PathNode","matrix": {"m11":-0.6630394213564543,"m12":0,"m21":0,"m22":0.5236476835782672,"dx":565.5201948628471,"dy":371.5686591257294},"children": [],"strokeStyle":"#000000","fillStyle":"#e1e1e1","lineWidth":4,"smoothness":0.3,"sloppiness":0.5, "startX":50,"startY":0,"closed":true,"segments":[{"type":3,"x":100,"y":50,"x1":100,"y1":0,"r": [-0.3779207859188318,0.07996635790914297,-0.47163885831832886,-0.07100312784314156]},{"type":3,"x":50,"y":100,"x1":100,"y1":100,"r": [0.24857700895518064,0.030472169630229473,0.49844827968627214,0.13260168116539717]},{"type":3,"x":0,"y":50,"x1":0,"y1":100,"r": [0.1751830680295825,-0.18606301862746477,-0.4092112798243761,-0.4790717279538512]},{"type":3,"x":50,"y":0,"x1":0,"y1":0,"r": [0.37117584701627493,0.3612578883767128,0.0462839687243104,-0.1564063960686326]}], "shadow":false},{"type":"PathNode","matrix": {"m11":-1.475090930376591,"m12":0,"m21":0,"m22":1.2306765694828008,"dx":700.1381032855618,"dy":133.20628077515605},"children": [],"strokeStyle":"#000000","fillStyle":"#ffffff","lineWidth":2,"smoothness":0.3,"sloppiness":0.5,"startX":126.25,"startY":127.50445838342671,"closed":true,"segments": [{"type":3,"x":146.01190476190476,"y":147.5936260519611,"x1":146.01190476190476,"y1":127.50445838342671,"r": [-0.1750196823850274,-0.05804965365678072,-0.3536788672208786,0.053223272785544395]}, {"type":3,"x":126.25,"y":167.6827937204955,"x1":146.01190476190476,"y1":167.6827937204955,"r": [-0.32906053867191076,-0.11536165233701468,0.35579121299088,0.38731588050723076]}, {"type":3,"x":108,"y":147,"x1":106.48809523809524,"y1":167.6827937204955,"r": [0.08825046103447676,0.011088204570114613,0.43411328736692667,-0.1330692209303379]}, {"type":3,"x":126.25,"y":127.50445838342671,"x1":106.48809523809524,"y1":127.50445838342671,"r": [0.42778260353952646,0.24726040940731764,0.3631806019693613,0.05325550492852926]}],"shadow":false},{"type":"TextNode","matrix": {"m11":1,"m12":0,"m21":0,"m22":1,"dx":543,"dy":225},"children": [],"fillStyle":"#000000","text":"Y","fontName":"FG Virgil","fontSize":20}, {"type":"TextNode","matrix":{"m11":1,"m12":0,"m21":0,"m22":1,"dx":559,"dy":144},"children": [],"fillStyle":"#000000","text":"x","fontName":"FG Virgil","fontSize":20}, {"type":"ArrowNode","matrix":{"m11":1,"m12":0,"m21":0,"m22":1,"dx":0,"dy":0},"children": [],"arrowSize":10,"path":{"type":"PathNode","matrix": {"m11":1,"m12":0,"m21":0,"m22":1,"dx":464,"dy":-3},"children": [],"strokeStyle":"#000000","fillStyle":"#ffffff","lineWidth":2,"smoothness":0.3," ✘

Slide 87

Slide 87 text

[failing JSON input as on Slide 86]
What is the smallest failure-inducing input?
Delta Debugging
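Delta debugging approaches this question by repeatedly deleting chunks of the input and keeping any smaller input that still fails. The following is a minimal sketch of a character-level `ddmin` loop, not the speaker's implementation. The oracle `fails` and the helper `has_empty_key` are hypothetical stand-ins for the `json.has_key("")` bug shown earlier in the deck.

```python
import json


def has_empty_key(value):
    """Hypothetical bug trigger: some object in the JSON value has the key ""."""
    if isinstance(value, dict):
        return "" in value or any(has_empty_key(v) for v in value.values())
    if isinstance(value, list):
        return any(has_empty_key(v) for v in value)
    return False


def fails(text):
    """The failure of interest: the input parses AND triggers the bug."""
    try:
        return has_empty_key(json.loads(text))
    except ValueError:
        return False  # syntax errors are rejected cleanly, so they do not count


def ddmin(text, fails):
    """Remove ever smaller chunks for as long as the failure persists."""
    n = 2  # current number of chunks
    while len(text) >= 2:
        chunk = max(len(text) // n, 1)
        reduced = False
        for start in range(0, len(text), chunk):
            complement = text[:start] + text[start + chunk:]
            if fails(complement):
                text = complement        # keep the smaller failing input
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if chunk == 1:
                break                    # no single character can be removed
            n = min(n * 2, len(text))    # retry with finer chunks
    return text


if __name__ == "__main__":
    print(ddmin('[{"a": 1}, {"": [1, 2, 3, 4]}]', fails))
```

Because the deletions ignore JSON syntax, the result is only 1-minimal: removing any single character stops the failure, yet paired brackets may survive. That limitation is what the hierarchical variant addresses.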

Slide 88

Slide 88 text

Hierarchical Delta Debugging
[JSON grammar]
[failing JSON input as on Slide 86]
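Hierarchical reduction works on the structure of the input rather than on its characters. The sketch below is a simplified greedy tree reducer in that spirit, operating on parsed JSON values. It is not the HDD algorithm itself (which applies delta debugging level by level on a parse tree), and `has_empty_key` is a hypothetical oracle for the bug shown earlier.

```python
import json


def has_empty_key(value):
    """Hypothetical oracle: some object in the value has the key ""."""
    if isinstance(value, dict):
        return "" in value or any(has_empty_key(v) for v in value.values())
    if isinstance(value, list):
        return any(has_empty_key(v) for v in value)
    return False


def minimize(value, fails):
    """Greedily shrink `value`; `fails(candidate)` judges a replacement
    for this subtree placed in its surrounding context."""
    if isinstance(value, dict):
        for key in value:                         # prune one member
            smaller = {k: v for k, v in value.items() if k != key}
            if fails(smaller):
                return minimize(smaller, fails)
        kids = list(value.values())
    elif isinstance(value, list):
        for i in range(len(value)):               # prune one element
            smaller = value[:i] + value[i + 1:]
            if fails(smaller):
                return minimize(smaller, fails)
        kids = list(value)
    else:
        return value                              # leaves are kept as they are
    for kid in kids:                              # hoist a child into this position
        if fails(kid):
            return minimize(kid, fails)
    if isinstance(value, dict):                   # recurse into remaining children
        for key in list(value):
            value = {**value, key: minimize(
                value[key], lambda c, k=key, v=value: fails({**v, k: c}))}
        return value
    result = list(value)
    for i in range(len(result)):
        result[i] = minimize(
            result[i], lambda c, i=i, r=result: fails(r[:i] + [c] + r[i + 1:]))
    return result


if __name__ == "__main__":
    doc = json.loads('[{"a": 1}, {"pqr": {"": [1, 2, 3, 4]}, "abc": []}]')
    print(json.dumps(minimize(doc, has_empty_key)))
```

Every candidate stays a well-formed JSON value, so no attempt is wasted on inputs the parser would reject.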

Slide 89

Slide 89 text

Test Minimization
[failing JSON input as on Slide 86]
Hierarchical Delta Debugging
{"":[]} ✘

Slide 90

Slide 90 text

Why?
[failing JSON input as on Slide 86]
Hierarchical Delta Debugging
[12345]    {"":[]}    {"":0}    {"x":[]}

Slide 91

Slide 91 text

DDSET: Gopinath, Kampmann, Havrikov, Soremekun, and Zeller. Abstracting Failure Inducing Inputs. ISSTA 2020. https://github.com/vrthra/ddset

Slide 92

Slide 92 text

DDSET:
{"": []}
[JSON grammar]

Slide 93

Slide 93 text

DDSET:
{"": []}
122489    {"A":{}, {"23": {"P":[]}}]    [[], [[[]],[]],{"A":{}, {"23": {"P":[]}}]
[JSON grammar]

Slide 94

Slide 94 text

DDSET:
{"": []}
"XYZR389"    {"A":{}, {"23": {"P":[]}}}    [[], [[[]],[]],{"A":{}, {"23": {"P":[]}}]
[JSON grammar]

Slide 95

Slide 95 text

DDSET:
{"": []}
[true, null]    {"A":{}, {"23": {"P":[]}}}    [[], [[[]],[]],{"A":{}, {"23": {"P":[]}}]
[JSON grammar]

Slide 96

Slide 96 text

DDSET:
{"": []}
{"__": [[]]}    {"?P":[{}], {"|": {"":[]}}}    {"X":[[],[]],{"A":{}, {"2": {"R":[]}}}
[JSON grammar]

Slide 97

Slide 97 text

DDSET:
{"": []}
{"": [[]]}    {"?P":[{}], {"|": {"P":[]}}}    {"X":[[],[]],{"A":{}, {"2": {"R":[]}}}
[JSON grammar]

Slide 98

Slide 98 text

DDSET:
{"": []}
{"7897A": []}    {"klnm,.qer;dfs?P":[]}    {"123KOUIJ!qR30578950":[]}
[JSON grammar]

Slide 99

Slide 99 text

DDSET:
{"": []}
{"": true}    {"":[1,2,445,"x"]}    {"":{"PQ":[true, false, 223,"a"]}}
[JSON grammar]

Slide 100

Slide 100 text

DDSET:
{"": []}
Abstraction
{"": } Abstract Input
[JSON grammar]
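The abstraction step shown on these slides can be sketched as follows: replace one part of the minimized input with other values, and mark that part as abstract only if every replacement still fails. This is a rough illustration under stated assumptions, not the DDSET implementation. It uses small fixed pools (`KEY_POOL`, `VALUE_POOL`) in place of grammar-based random generation, the nonterminal names `<string>` and `<json>` are assumed, and `has_empty_key` is a hypothetical oracle.

```python
import json


def has_empty_key(value):
    """Hypothetical oracle: some object in the value has the key ""."""
    if isinstance(value, dict):
        return "" in value or any(has_empty_key(v) for v in value.values())
    if isinstance(value, list):
        return any(has_empty_key(v) for v in value)
    return False


# Assumption: fixed replacement pools stand in for grammar-based generation.
KEY_POOL = ["a", "7897A", "__", "x1"]
VALUE_POOL = [0, "x", True, None, [], [1, 2], {"a": 1}, {"b": [None]}]


def is_abstract(rebuild, pool, fails):
    """A position is abstract if every replacement still triggers the failure."""
    return all(fails(rebuild(candidate)) for candidate in pool)


def abstract_member(key, value, fails):
    """Abstract the single-member object {key: value} into a pattern string."""
    key_free = is_abstract(lambda k: {k: value}, KEY_POOL, fails)
    value_free = is_abstract(lambda v: {key: v}, VALUE_POOL, fails)
    key_text = "<string>" if key_free else json.dumps(key)
    value_text = "<json>" if value_free else json.dumps(value)
    return "{%s: %s}" % (key_text, value_text)


if __name__ == "__main__":
    print(abstract_member("", [], has_empty_key))
```

With this oracle the key has to stay concrete, because changing it removes the failure, while the value can be anything.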

Slide 101

Slide 101 text

{"": []} Minimized Input
{"": } Abstract Input
[JSON grammar]

Slide 102

Slide 102 text

Issue 386 from Rhino: var A = class extends (class {}){};
Issue 2937 from Closure: const [y,y] = [];
Issue 385 from Rhino: var {baz:{} = baz => {}} = baz => {};
Issue 2842 from Closure: {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
Abstract patterns:
= class extends (class {}){}
var {<$Id1>:{} = <$Id1> => {}} ;
const [<$Id1>,<$Id1>] = []
{while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}

Slide 103

Slide 103 text

{"": []} Minimized Input
{"": } Abstract Input
where is "": Evocative Pattern
[JSON grammar]
Evocative Grammar: [rules specialized to the pattern, ending in the string rule `""`, followed by the base JSON grammar]

Slide 104

Slide 104 text

Evocative Grammar, where is "":
[rules specialized to the pattern, ending in the string rule `""`, followed by the base JSON grammar]
generate()
{"": 100} ✘
{"": [343,{},44998]} ✘
[{"": {"xxy":44998, {"b":[1,2,3]}}},[],[]] ✘
{"_": {"ket":[], {"":[],"y",[[],[1,2,3,455,6]]}}} ✘
{".":[{3243435656:"xy,zzzpqiu"},[{"":[112]},{"d":[[]]},{}]]} ✘
[{"": [1,2,3,4]}] ✘
{"pqr": {"": [1,2,3,4]}, "abc":[]} ✘
[[1132],{"xx":[{6:"dafjli;y,zzzdfaiu"},[{"__":[1{}{}]},{"":[[444456]]},{}]]} ✘
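The idea of generating only inputs that contain the failure-inducing pattern can be sketched with a grammar written as a Python dictionary. The extraction lost the nonterminal names from the slide, so every name below (`<json>`, `<members E>`, and so on) is an assumption, and the number rule is adjusted to avoid leading zeros so the output parses as JSON. Treat it as an illustration of the technique rather than the grammar on the slide.

```python
import json
import random
import string

# Base JSON grammar (nonterminal names assumed).
BASE = {
    "<json>": [["<object>"], ["<array>"], ["<string>"], ["<number>"],
               ["true"], ["false"], ["null"]],
    "<object>": [["{", "<members>", "}"], ["{}"]],
    "<members>": [["<member>"], ["<member>", ",", "<members>"]],
    "<member>": [["<string>", ":", "<json>"]],
    "<array>": [["[", "<elements>", "]"], ["[]"]],
    "<elements>": [["<json>"], ["<json>", ",", "<elements>"]],
    "<string>": [['"', "<chars>", '"'], ['""']],
    "<chars>": [["<char>"], ["<char>", "<chars>"]],
    "<char>": [[c] for c in string.ascii_letters + string.digits],
    "<number>": [["<digit>"], ["<onenine>", "<digits>"]],
    "<digits>": [["<digit>"], ["<digit>", "<digits>"]],
    "<digit>": [[d] for d in string.digits],
    "<onenine>": [[d] for d in "123456789"],
}

# Specialized rules: every derivation contains a member whose key is "".
EVOCATIVE = {
    "<start E>": [["<json E>"]],
    "<json E>": [["<object E>"], ["<array E>"]],
    "<object E>": [["{", "<members E>", "}"]],
    "<members E>": [["<member E>"], ["<member E>", ",", "<members>"],
                    ["<member>", ",", "<members E>"]],
    "<member E>": [["<string E>", ":", "<json>"], ["<string>", ":", "<json E>"]],
    "<array E>": [["[", "<elements E>", "]"]],
    "<elements E>": [["<json E>"], ["<json E>", ",", "<elements>"],
                     ["<json>", ",", "<elements E>"]],
    "<string E>": [['""']],
}

GRAMMAR = {**BASE, **EVOCATIVE}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` randomly; past max_depth pick the alternative with
    the fewest nonterminals so that the expansion terminates."""
    if symbol not in GRAMMAR:
        return symbol  # terminal token
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        alt = min(alternatives, key=lambda a: sum(t in GRAMMAR for t in a))
    else:
        alt = rng.choice(alternatives)
    return "".join(generate(t, rng, depth + 1, max_depth) for t in alt)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        text = generate("<start E>", rng)
        json.loads(text)  # raises if the output is not valid JSON
        print(text)
```

Starting from `<json>` instead of `<start E>` gives an unconstrained JSON generator, which makes the contrast with the specialized start symbol easy to see.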

Slide 105

Slide 105 text

Evocative Grammar, where is "":
1. We can produce any and all instances of the failure-inducing pattern.
2. We can recognize any input that contains the failure-inducing pattern.
3. The grammar will reject any input that doesn't contain the failure-inducing pattern.
[rules specialized to the pattern, ending in the string rule `""`, followed by the base JSON grammar]

Slide 106

Slide 106 text

Combining Evocative Patterns
if json.has_key(""): raise Exception()             where is "":
if json.has_key_value(null): raise Exception()     where is : null

Slide 107

Slide 107 text

Combining Evocative Patterns
if json.has_key("") and json.has_key_value(null): raise Exception()
where is "": is : null

Slide 108

Slide 108 text

Combining Evocative Patterns
if json.has_key("") and not json.has_key_value(null): raise Exception()
where is "": is : null

Slide 109

Slide 109 text

Combining Evocative Patterns
if json.has_key(""): raise Exception()
if json.has_key_value(null): raise Exception()
where is "": is : null
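The `and`, `and not`, and `or` combinations on these slides can be mimicked on the recognizer side with plain predicate combinators. This is only a sketch under that simplification: the talk combines grammars, so that the result can also generate inputs, whereas the code below merely checks parsed JSON values. The helper names are hypothetical.

```python
def has_key(name):
    """Recognizer: some object in the value has the key `name`."""
    def check(value):
        if isinstance(value, dict):
            return name in value or any(check(v) for v in value.values())
        if isinstance(value, list):
            return any(check(v) for v in value)
        return False
    return check


def has_null_value(value):
    """Recognizer: some object member in the value maps to null."""
    if isinstance(value, dict):
        return any(v is None or has_null_value(v) for v in value.values())
    if isinstance(value, list):
        return any(has_null_value(v) for v in value)
    return False


def both(p, q):
    return lambda value: p(value) and q(value)


def either(p, q):
    return lambda value: p(value) or q(value)


def negate(p):
    return lambda value: not p(value)


empty_key = has_key("")
conjunction = both(empty_key, has_null_value)            # `and`
difference = both(empty_key, negate(has_null_value))     # `and not`
disjunction = either(empty_key, has_null_value)          # two separate checks

if __name__ == "__main__":
    for doc in ({"": None}, {"": 1}, {"a": None}, [1]):
        print(doc, conjunction(doc), difference(doc), disjunction(doc))
```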

Slide 110

Slide 110 text

[grammar for the combined evocative patterns]
where is "": is : null
generate()
{"": 100} ✘
{"": [343,{},44998]} ✘
[{"": {"xxy":44998, {"b":[1,2,3]}}},[],[]] ✘
{"_": {"ket":[], {"":[],"y",[[],[1,2,3,455,6]]}}} ✘
{".":[{3243435656:"xy,zzzpqiu"},[{"":[112]},{"d":[[]]},{}]]} ✘
[{"": [1,2,3,4]}] ✘
{"pqr": {"": [1,2,3,4]}, "abc":[]} ✘
[[1132],{"xx":[{6:"dafjli;y,zzzdfaiu"},[{"__":[1{}{}]},{"":[[444456]]},{}]]} ✘

Slide 111

Slide 111 text

Issue 386 from Rhino: var A = class extends (class {}){};
Issue 2937 from Closure: const [y,y] = [];
Issue 385 from Rhino: var {baz:{} = baz => {}} = baz => {};
Issue 2842 from Closure: {while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
Abstract patterns:
= class extends (class {}){}
var {<$Id1>:{} = <$Id1> => {}} ;
const [<$Id1>,<$Id1>] = []
{while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}

Slide 112

Slide 112 text

Evocative Expressions
where
is = class extends (class {}){}
is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}
is var {<$Id2>:{} = <$Id2> => {}} ;
is const [<$Id3>,<$Id3>] = []

Slide 113

Slide 113 text

Evocative Expressions: Validating Inputs
where
is = class extends (class {}){}
is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}}
is var {<$Id2>:{} = <$Id2> => {}} ;
is const [<$Id3>,<$Id3>] = []

Slide 114

Slide 114 text

Evocative Expressions for Focused Fuzzing
def assign_admin_rights(self):
    self.db_rights.add(MODIFY_DB)
    self.fs_rights = RW
    self.timeout = 60
    self.deploy = True
def assign_guest_rights(self):
    self.db_rights = [QUERY_DB]
    self.fs_rights = None
    self.timeout = None
    self.deploy = False
def modify_db(self, stmt):
    if ADMIN in self.db_rights:
        process(stmt)
    else:
        raise Error()
def query_db(self, stmt):
    process(stmt)

Slide 115

Slide 115 text

Evocative Expressions for Focused Fuzzing
[code as on Slide 114]
where is "role": "admin"
{"role" : "admin"}

Slide 116

Slide 116 text

Evocative Expressions for Focused Fuzzing
[code as on Slide 114]
where is "method":"remove_table","args":
{"method":"remove_table","args":["orders", "inventory"]}

Slide 117

Slide 117 text

Evocative Expressions for Focused Fuzzing
[code as on Slide 114]
where is "role": "admin" is {"method":"remove_table","args":}
{"method":"remove_table","args":["orders", "inventory"], "role":"admin"}
{"role":"admin", "method":"remove_table","args":["orders", "inventory"]}
{"method":"remove_table","role":"admin","args":["orders", "inventory"]}
{"method":"remove_table","args":["orders","inventory",{"role":"admin"}]}

Slide 118

Slide 118 text

Evocative Expressions for Focused Fuzzing
[code as on Slide 114]
where is "role": "admin" is {"method":"remove_table","args":}
{"method":"remove_table","args":["orders", "inventory"], "role":"guest"}
{"role":"guest", "method":"remove_table","args":["orders", "inventory"]}
{"method":"remove_table","role":"guest","args":["orders", "inventory"]}
{"method":"remove_table","args":["orders","inventory",{"role":"guest"}]}
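The four request variants listed on these slides differ only in where the `"role"` member sits. A hand-rolled way to enumerate such placements is sketched below. It is an illustration of the variants, not the grammar-based combination used in the talk, and the function name `placements` is hypothetical.

```python
import copy
import json


def placements(request, key, value):
    """Yield copies of the dict `request` with `key: value` inserted at each
    top-level position, then nested inside each list-valued member."""
    items = list(request.items())
    for pos in range(len(items) + 1):
        yield dict(items[:pos] + [(key, value)] + items[pos:])
    for name, member in items:
        if isinstance(member, list):
            nested = copy.deepcopy(request)
            nested[name] = nested[name] + [{key: value}]
            yield nested


if __name__ == "__main__":
    request = {"method": "remove_table", "args": ["orders", "inventory"]}
    for role in ("admin", "guest"):
        for variant in placements(request, "role", role):
            print(json.dumps(variant))
```

Running the same placements with `"guest"` instead of `"admin"` gives the matching set of requests that should be refused.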

Slide 119

Slide 119 text

Evocative Expressions: Data Structures
Algebraic Data Types
Data Structures                                   Context Free Grammar
struct mystruct { stype m1; stype m2; };          ::=
union myunion { utype m1; utype m2; };            ::= |
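The correspondence on this slide, a struct as a sequence and a union as a choice, can be sketched in Python with a dataclass and a `typing.Union`. The rule format and the names `MyStruct`, `MyUnion`, and `rule` are assumptions made for the illustration.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Union, get_args, get_origin


@dataclass
class MyStruct:          # product type: all members, in order
    m1: int
    m2: str


MyUnion = Union[int, str]  # sum type: exactly one of the members


def nonterminal(tp):
    return "<%s>" % tp.__name__


def rule(label, tp):
    """Render a data type as one context-free grammar rule."""
    if is_dataclass(tp):
        body = " ".join(nonterminal(f.type) for f in fields(tp))       # sequence
    elif get_origin(tp) is Union:
        body = " | ".join(nonterminal(arg) for arg in get_args(tp))    # alternatives
    else:
        raise TypeError("unsupported type: %r" % (tp,))
    return "<%s> ::= %s" % (label, body)


if __name__ == "__main__":
    print(rule("mystruct", MyStruct))
    print(rule("myunion", MyUnion))
```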

Slide 120

Slide 120 text

Evocative Expressions: Semantic Search
Search Source Code For Fault Patterns
• Structure aware
• Semantics can be mined from bugs!
• No need to specify by hand
• Combine different patterns using `or`, `and` and `not`

Slide 121

Slide 121 text

Evocative Expressions: Gopinath, Nemati, Zeller. Input Algebras. ICSE 2021. https://rahul.gopinath.org/posts/

Slide 122

Slide 122 text

Future: Decomposition with DDSet
if json.has_key(""): raise Exception()
if not json.has_key_value(null): raise Exception()
if json.has_key("") and not json.has_key_value(null): raise Exception()

Slide 123

Slide 123 text


Slide 124

Slide 124 text

Generating Unbiased Samples

Slide 125

Slide 125 text

Finding Good Samples
Seed corpus? (Blind spots)

Slide 126

Slide 126 text

Key Idea (Monotonic Failure Property)
• Differentiate incomplete and incorrect inputs
• Solve one character at a time systematically

Slide 127

Slide 127 text

Sample Free Generators
A ( 2 - B 9 ) 4 )
A ∉ (,+,-,1,2,3,4,5,6,7,8,9,0
B ∉ +,-,1,2,3,4,5,6,7,8,9,0,)
) ∉ +,-,1,2,3,4,5,6,7,8,9,0
(2-94)
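The character-at-a-time idea can be sketched with a validator that reports whether an input is complete, incomplete (it ran out of characters), or incorrect (a character can never be accepted). The tiny arithmetic language, the `status` function, and the generator below are assumptions made for illustration, not the tools from the cited papers. In this validator an incorrect prefix stays incorrect under any extension, which is the monotonic failure property the generator relies on.

```python
import random

ALPHABET = "()+-0123456789"
DIGITS = "0123456789"


class Incomplete(Exception):
    """The parser ran out of input while still expecting more."""


class Incorrect(Exception):
    """The parser met a character it can never accept at this point."""


def parse_term(s, i):
    if i >= len(s):
        raise Incomplete
    if s[i] in DIGITS:
        while i < len(s) and s[i] in DIGITS:
            i += 1
        return i
    if s[i] == "(":
        i = parse_expr(s, i + 1)
        if i >= len(s):
            raise Incomplete
        if s[i] != ")":
            raise Incorrect
        return i + 1
    raise Incorrect


def parse_expr(s, i):
    i = parse_term(s, i)
    while i < len(s) and s[i] in "+-":
        i = parse_term(s, i + 1)
    return i


def status(s):
    try:
        end = parse_expr(s, 0)
    except Incomplete:
        return "incomplete"
    except Incorrect:
        return "incorrect"
    return "complete" if end == len(s) else "incorrect"


def generate(rng, max_len=12):
    """Grow a valid input one character at a time, using only status()."""
    prefix = ""
    while True:
        done = status(prefix) == "complete"
        if done and (len(prefix) >= max_len or rng.random() < 0.3):
            return prefix
        if len(prefix) >= max_len:
            candidates = [")", "1"]   # steer toward closing open parentheses
        else:
            candidates = rng.sample(ALPHABET, len(ALPHABET))
        for ch in candidates:
            if status(prefix + ch) != "incorrect":
                prefix += ch          # keep the first character that is not rejected
                break
        else:
            raise RuntimeError("no viable extension")


if __name__ == "__main__":
    rng = random.Random(1)
    print([generate(rng) for _ in range(5)])
```

No seed corpus is involved: the only feedback used is the three-way verdict from the validator.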

Slide 128

Slide 128 text

Sample Free Generators
A ( 2 - B 9 ) 4 )
Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Parser Directed Fuzzing. PLDI 2019.
Mathis, Gopinath, and Zeller. Learning Input Tokens for Effective Fuzzing. ISSTA 2020.

Slide 129

Slide 129 text

Blackbox Sample Free Generators
• Monotonic Failure Property
• Differentiate incomplete and incorrect inputs
• Solve one character at a time systematically
Bendrissou, Gopinath, Mathis, and Zeller. Failure Feedback for Fast Fuzzing. (arXiv).

Slide 130

Slide 130 text


Slide 131

Slide 131 text

Layered Programs
HTTP Parser, XML Parser, SOAP Parser, RPC Parser, C Parser, Check Declarations, Check Types, Static Checks, Compilers, Servers, Semantics, Application

Slide 132

Slide 132 text


Slide 133

Slide 133 text

QS: Computer Science 22 in the world
One in the Group of Eight (Go8) in Australia

Slide 134

Slide 134 text


Slide 135

Slide 135 text

Mining Evocative Expressions
Blackbox Generation