Mutation Analysis: Answering the Fuzzing Challenge

Slide 1

Slide 1 text

Mutation Analysis: Answering the Fuzzing Challenge
Rahul Gopinath
https://rahul.gopinath.org
@_rahulgopinath

Slide 2

Slide 2 text


Slide 3

Slide 3 text

Software is Eating the World
Source: https://www2.deloitte.com/us/en/pages/technology-media-and-telecommunications/articles/software-growth-in-tech.html

Slide 4

Slide 4 text

New Challenges

Slide 5

Slide 5 text

Bugs

Slide 6

Slide 6 text

Slide 7

Slide 7 text

Software Failures can be Catastrophic
Source: https://www3.weforum.org/docs/WEF_Global_Risk_Report_2020.pdf

Slide 8

Slide 8 text

Testing
[Diagram: Input, ✓, ✘]

    @app.route('/admin')
    def admin():
        username = request.cookies.get("username")
        if not username:
            return {"Error": "Specify username in Cookie"}
        username = urllib.quote(os.path.basename(username))
        url = "http://permissions:5000/permissions/{}".format(username)
        resp = requests.request(method="GET", url=url)
        # "superadmin\ud888" will be simplified to "superadmin"
        ret = ujson.loads(resp.text)
        if resp.status_code == 200:
            if "superadmin" in ret["roles"]:
                return {"OK": "Superadmin Access granted"}
            else:
                e = u"Access denied. User has following roles: {}".format(ret["roles"])
                return {"Error": e}, 401
        else:
            return {"Error": ret["Error"]}, 500

Slide 9

Slide 9 text

Testing
[Diagram: Input, ✓, ✘]

Request:

    POST /user/update HTTP/1.1
    { "user": "exampleUser", "roles": [ "superadmin" ] }

Responses:

    HTTP/1.1 401 Not Authorized
    Content-Type: application/json
    { "Error": "Assignment of internal role 'superadmin' is forbidden" }

    HTTP/1.1 200 OK
    Content-type: application/json
    { "result": "OK: Updated user 'exampleUser' with role 'superadmin'" }

    HTTP/1.1 500 Internal Server Error ✘

Code: the admin() handler shown on slide 8.

Slide 10

Slide 10 text

Can We Trust Our Tests?
Tests are still predominantly written manually (Patrick Lam 2014).
[Figure labels: Developer, Tester]

Slide 11

Slide 11 text

Can We Trust Our Tests?
• Modern tests:
    • Are complex and have non-deterministic control flow
    • Interact: network, file system ...
• No rigorous quality control
• 50% of test code is cut and paste (Lam 2014)
• 65% of test assertions are inadequate or wrong (Zhi 2013)
• Once a test is written, it is rarely looked at unless it fails (Coplien 2014)

Slide 12

Slide 12 text

Can We Trust Our Tests?
How to evaluate the quality of test suites?

Slide 13

Slide 13 text

Thou Shalt Cover Thy Code
But is coverage sufficient?
Code: the admin() handler shown on slide 8.

Slide 14

Slide 14 text

Code Coverage is Useful
• Statement coverage vs effectiveness
• Developer written or organic test suites
Results from 250 programs from Github: Effectiveness = 0.9 x S.Coverage, R^2 = 0.939
[Plot: Effectiveness vs Statement Coverage. Effectiveness computed with faults produced by PIT.]
ICSE 2014

Slide 15

Slide 15 text

But Maybe Misleading
• Statement coverage against effectiveness
• Machine generated test suites (using Randoop)
Results from 250 programs from Github: Effectiveness = 0.6 x S.Coverage, R^2 = 0.72
[Plot: Effectiveness vs Statement Coverage. Effectiveness computed with faults produced by PIT.]
ICSE 2014

Slide 16

Slide 16 text

Coverage Maybe Misleading
More complex coverage measures are easier to attain for automated tools using significantly weaker assertions.
[Plots: Effectiveness vs Branch Coverage; Effectiveness vs Path Coverage]
Gopinath, Jensen, and Groce, “Code Coverage for Suite Evaluation by Developers”, ICSE 2014

Slide 17

Slide 17 text

Can We Trust Our Tests?
(Request, responses, and admin() handler as on slide 9.)
Unless backed by test assertions, code coverage is pointless.
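A minimal sketch of why coverage alone says little. The function `has_role`, its faulty variant, and the two test helpers are hypothetical names made up for this illustration; they are not from the talk. Both helpers execute every statement of the function under test, but only the one that checks the result notices the seeded fault.

```python
def has_role(roles, wanted):
    """Hypothetical function under test: is `wanted` among `roles`?"""
    return wanted in roles


def faulty_has_role(roles, wanted):
    """Same function with a seeded fault: the membership check is inverted."""
    return wanted not in roles


def run_without_assertion(check):
    # Executes every statement of `check`, but never inspects the result.
    check(["user"], "superadmin")
    return True  # the "test" passes no matter what `check` returned


def run_with_assertion(check):
    # Same coverage, but the result is compared with the expected value.
    return check(["user"], "superadmin") is False


outcomes = {
    "no assertion / correct": run_without_assertion(has_role),
    "no assertion / faulty": run_without_assertion(faulty_has_role),
    "assertion / correct": run_with_assertion(has_role),
    "assertion / faulty": run_with_assertion(faulty_has_role),
}
```

Only the last entry is a failing test, although all four runs reach the same coverage.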

Slide 18

Slide 18 text

Evaluating Quality of Assertions

Slide 19

Slide 19 text

A Simple Idea: Fault Seeding
• Seed the program with N random faults (here 10)
• Run the test suite against the faults
• Say M faults were caught (here 5)
• Effectiveness = M/N (here 5/10 = 0.5)
Typically used in benchmarks

Slide 20

Slide 20 text

Problems with Fault Seeding
• Relies on manual generation/curation of faults
• Hand produced faults are different from real-world faults (Andrews 2005)

Slide 21

Slide 21 text

Mutation Testing To The Rescue!
The “gold standard” for measuring test suite effectiveness.

Slide 22

Slide 22 text

What Is Mutation Testing?

Original:
    d = b^2 - 4 * a * c

Mutants:
    d = b^3 - 4 * a * c
    d = b^2 + 4 * a * c
    d = b^2 - 4 + a * c

Test cases:
    (a = 0, b = 0, c = 0) => (d = 0)
    (a = 1, b = 1, c = 1) => (d = -3)
    (a = 0, b = 2, c = 0) => (d = 4)

Mutants killed by test cases
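The slide's example can be replayed as a small sketch. The three mutants and the three test cases are the ones listed above; the helper names (`killing_tests`, `mutation_score`) are assumptions of this sketch, and Python's `**` stands in for the slide's `^`.

```python
def original(a, b, c):
    return b ** 2 - 4 * a * c


# The three mutants from the slide, keyed by their source text.
MUTANTS = {
    "b^3 - 4 * a * c": lambda a, b, c: b ** 3 - 4 * a * c,
    "b^2 + 4 * a * c": lambda a, b, c: b ** 2 + 4 * a * c,
    "b^2 - 4 + a * c": lambda a, b, c: b ** 2 - 4 + a * c,
}

# (input, expected d) pairs from the slide.
TESTS = [((0, 0, 0), 0), ((1, 1, 1), -3), ((0, 2, 0), 4)]


def killing_tests(variant):
    """Return the test inputs for which `variant` misses the expected value."""
    return [args for args, expected in TESTS if variant(*args) != expected]


def mutation_score(mutants):
    """Fraction of mutants killed by at least one test case."""
    killed = sum(1 for variant in mutants.values() if killing_tests(variant))
    return killed / len(mutants)


kills = {name: killing_tests(variant) for name, variant in MUTANTS.items()}
score = mutation_score(MUTANTS)
```

With these tests, each of the first two mutants is killed by exactly one test case and the third by all three, so every test case contributes to the score.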

Slide 23

Slide 23 text

Some Possible Mutants

Original: d = b^2 - 4 * a * c

    d = b^0 - 4 * a * c;  d = b^1 - 4 * a * c;  d = b^-1 - 4 * a * c;
    d = b^MAX - 4 * a * c;  d = b^MIN - 4 * a * c;
    d = b - 4 * a * c;  d = b ^ 4 * a * c;
    d = b^2 - 0 * a * c;  d = b^2 - 1 * a * c;  d = b^2 - (-1) * a * c;
    d = b^2 - MAX * a * c;  d = b^2 - MIN * a * c;
    d = b^2 - 4 * a * c;  d = b^2 - 4 * a * c;
    d = b^2 + 4 * a * c;  d = b^2 * 4 * a * c;  d = b^2 / 4 * a * c;
    d = b^2 ^ 4 * a * c;  d = b^2 % 4 * a * c;
    d = b^2 << 4 * a * c;  d = b^2 >> 4 * a * c;
    d = b^2 * 4 + a * c;  d = b^2 * 4 - a * c;  d = b^2 * 4 / a * c;
    d = b^2 * 4 ^ a * c;  d = b^2 * 4 % a * c;
    d = b^2 * 4 << a * c;  d = b^2 * 4 >> a * c;
    d = b^2 * 4 * a + c;  d = b^2 * 4 * a - c;  d = b^2 * 4 * a / c;
    d = b^2 * 4 * a ^ c;  d = b^2 * 4 * a % c;
    d = b^2 * 4 * a << c;  d = b^2 * 4 * a >> c;
    d = b + 2 - 4 * a * c;  d = b - 2 - 4 * a * c;  d = b * 2 - 4 * a * c;
    d = b / 2 - 4 * a * c;  d = b % 2 - 4 * a * c;
    d = b << 2 - 4 * a * c;  d = b >> 2 - 4 * a * c;
    …

Slide 24

Slide 24 text

Why Does It Work?
• The simple mutations are not the entire set of faults
• n mutations can produce 2^n complex faults
• So why does mutation testing work?

Slide 25

Slide 25 text

Assumptions of Mutation Testing
The finite neighborhood assumption: programmers make simple mistakes.

    d = b^2 + 4 * a * c    (with a simple mistake)
    d = b^2 - 4 * a * c    (intended)

Slide 26

Slide 26 text

Assumptions of Mutation Testing
The coupling effect
• Faults rarely interact
• If they interact, they become easier to detect (kill)

    d = b^2 - 4 * a + c
    d = b^2 + 4 * a * c
    d = b^2 + 4 * a + c

    a=1,b=1,c=1 => d = 0
    a=0,b=0,c=0 => d = 0
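A small sketch of the coupling idea using the three variants above. The two test inputs are chosen for this illustration rather than taken from the slide, and the function names are hypothetical. Note that the coupling effect is an empirical hypothesis: the sketch shows one case where it holds, not a proof that it always does.

```python
def original(a, b, c):
    return b ** 2 - 4 * a * c


def mutant_plus(a, b, c):
    # First-order mutant: '-' replaced by '+'.
    return b ** 2 + 4 * a * c


def mutant_add_c(a, b, c):
    # First-order mutant: the second '*' replaced by '+'.
    return b ** 2 - 4 * a + c


def mutant_both(a, b, c):
    # Higher-order mutant: both changes combined.
    return b ** 2 + 4 * a + c


def detects(test_input, variant):
    """A test input detects a variant if its output differs from the original's."""
    return variant(*test_input) != original(*test_input)


VARIANTS = [mutant_plus, mutant_add_c, mutant_both]
weak_input = (0, 0, 0)     # every variant returns 0 here, so nothing is detected
strong_input = (1, 1, 1)   # original gives -3; the variants give 5, -2 and 6

weak_results = [detects(weak_input, v) for v in VARIANTS]
strong_results = [detects(strong_input, v) for v in VARIANTS]
```

The input that kills both simple mutants also kills their combination, which is the behaviour the coupling effect predicts.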

Slide 27

Slide 27 text

Finite Neighborhood
Bug-fix patches analyzed from open source projects on Github:
• 1850 C
• 1128 Java
• 1000 Python
• 1393 Haskell
ISSRE 2014

Slide 28

Slide 28 text

Coupling Effect
ICST 2017

Slide 29

Slide 29 text

Can Mutation Score Predict Future Bugs?
• Statements with detected mutants are half as likely to contain future bugs
• (1 - mutation score) is a good proxy for residual defects
[Boxplot: bug fixes on covered program elements with killed mutants vs elements with no killed mutants]
FSE 2016

Slide 30

Slide 30 text

Slide 31

Slide 31 text

Fuzzing

Slide 32

Slide 32 text

Software Complexity
Size of software systems has a nasty habit of doubling every few years.
• Debian 5: ~70 million lines
• Smart cars: ~100 million lines
• Google: ~2 billion lines
[Figure: a million lines zoomed (informationisbeautiful.net); timeline: '70s, '80s, '90s, 2000 onwards]

Slide 33

Slide 33 text

And So Do Bugs

Slide 34

Slide 34 text

Effective Testing?
[Diagram: Input, ✓, ✘]

Request:

    POST /user/create HTTP/1.1
    { "user": "exampleUser", "roles": [ "superadmin\ud888" ] }

Responses:

    HTTP/1.1 401 Not Authorized
    Content-Type: application/json
    { "Error": "Assignment of internal role 'superadmin' is forbidden" }

    HTTP/1.1 200 OK
    Content-type: application/json
    { "result": "OK: Created user 'exampleUser' with role 'superadmin\ud888'" }

    HTTP/1.1 500 Internal Server Error ✘

Code: the admin() handler shown on slide 8.

Slide 35

Slide 35 text

Effective Testing?
2^4 = 16
Testing cannot keep up when components (layers, libraries, services) multiply.

Slide 36

Slide 36 text

Automatic Testing
[Diagram: Input, ✓, ✘]
Verify Behavior
Code: the admin() handler shown on slide 8.

Slide 37

Slide 37 text

Automatic Testing
[Diagram: Input, ✓, ✘]
Verify Behavior (Oracle)
Oracles require domain expertise
Code: the admin() handler shown on slide 8.

Slide 38

Slide 38 text

Slide 39

Slide 39 text

Slide 40

Slide 40 text

https://www.fuzzingbook.org

Slide 41

Slide 41 text

Traditional Fuzzing
Random input is fed to the program (the admin() handler shown on slide 8).

Sample random input:

    {Ti.r3PIxMMMv6{xS^+'Hq!AxB"YXRS@!
    FKj'\xwuZ1=Q`^`5,$N$Q@[!CuRzJ2D|vBy!
    $ee,J4Gw:cgNKLie3nx9(`efSlg6#[K"@Wjh
    Z}r[Scun&sBCS,T[/3]KAeEnQ7lU)3Pn,0)G/
    …

Source: https://www.fuzzingbook.org/html/Fuzzer.html
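A minimal sketch of such a random fuzzer, loosely in the spirit of the fuzzingbook chapter linked above but not copied from it. The parameter names, the `run_fuzzer` harness, and the toy program `first_char` are assumptions of this sketch.

```python
import random


def fuzzer(max_length=100, char_start=32, char_range=32, rng=random):
    """Return a random string of up to `max_length` characters drawn from
    the code points [char_start, char_start + char_range)."""
    length = rng.randrange(0, max_length + 1)
    return "".join(
        chr(rng.randrange(char_start, char_start + char_range))
        for _ in range(length)
    )


def run_fuzzer(program, trials=1000, seed=0):
    """Feed random inputs to `program` and collect unexpected failures.

    ValueError is treated as an orderly rejection of malformed input;
    any other exception counts as a failure worth reporting.
    """
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        data = fuzzer(rng=rng)
        try:
            program(data)
        except ValueError:
            pass
        except Exception as exc:
            failures.append((data, exc))
    return failures


def first_char(text):
    # Hypothetical program under test: crashes with IndexError on empty input.
    return text[0]


failures = run_fuzzer(first_char)
```

The only oracle here is "the program must not crash", which is why purely random input mostly exercises input validation and rarely reaches deeper logic.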

Slide 42

Slide 42 text

(CACM '90)
No longer very effective

Slide 43

Slide 43 text

Slide 44

Slide 44 text

Feedback Driven Fuzzing
• Insert Instrumentation
• Generate inputs
• Collect execution feedback
    • Branches covered during execution
• Slightly Mutate Input and try again
Collect inputs obtaining new coverage

Original:

    def triangle(a, b, c):
        if a == b:
            if b == c:
                return Equilateral
            else:
                return Isosceles
        else:
            if b == c:
                return Isosceles
            else:
                if a == c:
                    return Isosceles
                else:
                    return Scalene

Instrumented:

    def triangle(a, b, c):
        __probe_enter()
        if a == b:
            __probe_1()
            if b == c:
                __probe_2()
                return Equilateral
            else:
                __probe_3()
                return Isosceles
        else:
            __probe_4()
            if b == c:
                __probe_5()
                return Isosceles
            else:
                __probe_6()
                if a == c:
                    __probe_7()
                    return Isosceles
                else:
                    __probe_8()
                    return Scalene

Source: https://www.fuzzingbook.org/html/MutationFuzzer.html
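The loop described above can be sketched in a few lines. This is an illustration under assumptions, not AFL or the fuzzingbook implementation: the probes are passed in as a callback instead of being inserted by a tool, results are plain strings, and the mutation operator (nudging one side by 1) is made up for the example.

```python
import random

ALL_PROBES = {"enter", 1, 2, 3, 4, 5, 6, 7, 8}


def triangle(a, b, c, probe):
    """Instrumented triangle classifier; `probe` records each branch reached."""
    probe("enter")
    if a == b:
        probe(1)
        if b == c:
            probe(2)
            return "Equilateral"
        else:
            probe(3)
            return "Isosceles"
    else:
        probe(4)
        if b == c:
            probe(5)
            return "Isosceles"
        else:
            probe(6)
            if a == c:
                probe(7)
                return "Isosceles"
            else:
                probe(8)
                return "Scalene"


def coverage_of(inp):
    """Run the program on `inp` and return the set of probes it hit."""
    hit = set()
    triangle(*inp, probe=hit.add)
    return hit


def mutate(inp, rng):
    """Slightly mutate the input: move one side up or down by 1 (minimum 1)."""
    sides = list(inp)
    i = rng.randrange(len(sides))
    sides[i] = max(1, sides[i] + rng.choice([-1, 1]))
    return tuple(sides)


def fuzz(seed_input, iterations=200, seed=0):
    """Keep only those mutated inputs that reach probes not seen before."""
    rng = random.Random(seed)
    population = [seed_input]
    covered = coverage_of(seed_input)
    for _ in range(iterations):
        candidate = mutate(rng.choice(population), rng)
        new = coverage_of(candidate) - covered
        if new:
            population.append(candidate)
            covered |= new
    return population, covered


population, covered = fuzz((1, 1, 1))
```

An input such as (1, 1, 3) is discarded once (1, 1, 2) is in the population, because both reach the same probes.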

Slide 45

Slide 45 text

Feedback Driven Fuzzing
Input: triangle (1,1,1)
(triangle() code and fuzzing steps as on slide 44.)
Source: https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 46

Slide 46 text

Feedback Driven Fuzzing
Input: triangle (1,1,1)
(triangle() code and fuzzing steps as on slide 44.)
Source: https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 47

Slide 47 text

Feedback Driven Fuzzing
Input: triangle (1,1,1)
Mutated: triangle (1,1,2)
(triangle() code and fuzzing steps as on slide 44.)
Source: https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 48

Slide 48 text

Feedback Driven Fuzzing
Mutated: triangle (1,1,3)
(triangle() code and fuzzing steps as on slide 44.)
Source: https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 49

Slide 49 text

Feedback Driven Fuzzing
Inputs: triangle (1,1,1), triangle (1,1,2), triangle (1,1,3)
(triangle() code and fuzzing steps as on slide 44.)
Source: https://www.fuzzingbook.org/html/MutationFuzzer.html

Slide 50

Slide 50 text

Feedback Driven Fuzzing
Inputs: triangle (1,1,1), triangle (1,1,2)
(triangle() code and fuzzing steps as on slide 44.)
AFL

Slide 51

Slide 51 text

Feedback Driven Fuzzing
Weakness: no smooth coverage gradient in parsers

Tokens:

    static int is_reserved_word_token(const char *s, int len) {
      const char *reserved[] = {
          "break", "case", "catch", "continue", "debugger", "default",
          "delete", "do", "else", "false", "finally", "for", "function",
          "if", "in", "instanceof", "new", "null", "return", "switch",
          "this", "throw", "true", "try", "typeof", "var", "void",
          "while", "with", "let", "undefined", ((void *)0)};
      int i;
      if (!mjs_is_alpha(s[0])) return 0;
      for (i = 0; reserved[i] != ((void *)0); i++) {
        if (len == (int)strlen(reserved[i]) && strncmp(s, reserved[i], len) == 0)
          return i + 1;
      }
      return 0;
    }

Coverage per input:

    if (x > 100) { }         coverage: 20%
    if (x > 100) { } e       coverage: 5%
    if (x > 100) { } el      coverage: 5%
    if (x > 100) { } els     coverage: 5%
    if (x > 100) { } else    coverage: 25%

Slide 52

Slide 52 text

Feedback Driven Fuzzing

    def json_raw(stm):
        while True:
            stm.skipspaces()
            c = stm.peek()
            if c == 't':
                return json_fixed(stm, 'true')
            elif c == 'f':
                return json_fixed(stm, 'false')
            elif c == 'n':
                return json_fixed(stm, 'null')
            elif c == '"':
                return json_string(stm)
            elif c == '{':
                return json_dict(stm)
            elif c == '[':
                return json_list(stm)
            elif c in NUMSTART:
                return json_number(stm)
            raise JSONError(E_MALF, stm, stm.pos)

Weak points:
• Need for smooth coverage gradient
• Coverage only provides first level guidance

Example inputs:
    1. {"abc":[]}
    2. [{"a":[]}, {"b":[]}, {"c":["ab","c"]}]

Slide 53

Slide 53 text

Feedback Driven Fuzzing
(json_raw() code, weak points, and example inputs as on slide 52.)

Slide 54

Slide 54 text

Feedback Driven Fuzzing
(json_raw() code, weak points, and example inputs as on slide 52.)

Slide 55

Slide 55 text

Solution: Structure Aware Fuzzing
(json_raw() code as on slide 52.)

JSON Grammar:

    {
      '' : [['']],
      '' : [[''], [''], [''], [''], ['true'], ['false'], ['null']],
      '' : [['{', '','}'], ['{}']],
      '' : [[',',','], ['']],
      '' : [['',':', '']],
      '' : [['[', '', ']'], ['[]']],
      '' : [[',',','], ['']],
      '' : [['"', '', '"'], ['""']],
      '' : [['',''], ['']],
      '' : [['']],
      '' : [['',''], ['']],
      '' : [[c] for c in string.characters]
      '' : [[c] for c in string.digits]
    }

Slide 56

Slide 56 text

Structure Aware Feedback Driven Fuzzer
(JSON grammar as on slide 55.)

Slide 57

Slide 57 text

Grammar Fuzzer
(JSON grammar as on slide 55.)
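A minimal grammar fuzzer sketch. The grammar below is written in the same dictionary shape as the slide's JSON grammar, but every nonterminal name (`<start>`, `<json>`, `<dict>`, ...) is an assumption of this sketch, as are the `<nonzero>` rule (added so that numbers have no leading zeros) and the restriction of string characters to ASCII letters. The generator is a simple depth-limited random expansion, not the fuzzingbook implementation.

```python
import json
import random
import string

# Illustrative JSON grammar; all nonterminal names are assumptions.
JSON_GRAMMAR = {
    "<start>": [["<json>"]],
    "<json>": [["<dict>"], ["<list>"], ["<string>"], ["<number>"],
               ["true"], ["false"], ["null"]],
    "<dict>": [["{", "<members>", "}"], ["{}"]],
    "<members>": [["<member>", ",", "<members>"], ["<member>"]],
    "<member>": [["<string>", ":", "<json>"]],
    "<list>": [["[", "<elements>", "]"], ["[]"]],
    "<elements>": [["<json>", ",", "<elements>"], ["<json>"]],
    "<string>": [['"', "<chars>", '"'], ['""']],
    "<chars>": [["<char>", "<chars>"], ["<char>"]],
    "<number>": [["<digit>"], ["<nonzero>", "<digits>"]],
    "<digits>": [["<digit>", "<digits>"], ["<digit>"]],
    "<char>": [[c] for c in string.ascii_letters],
    "<digit>": [[c] for c in string.digits],
    "<nonzero>": [[c] for c in "123456789"],
}


def generate(grammar, symbol="<start>", depth=0, max_depth=8, rng=random):
    """Expand `symbol` into a string by picking random alternatives."""
    if symbol not in grammar:
        return symbol  # terminal: emitted as-is
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # Beyond the depth budget, take the shortest alternative so that
        # the expansion is forced to terminate.
        alternatives = [min(alternatives, key=len)]
    chosen = rng.choice(alternatives)
    return "".join(
        generate(grammar, part, depth + 1, max_depth, rng) for part in chosen
    )


rng = random.Random(0)
samples = [generate(JSON_GRAMMAR, rng=rng) for _ in range(20)]
```

Every generated string is syntactically valid for this grammar, so it passes the parser and reaches the processing code behind it, which is the point of structure-aware fuzzing.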

Slide 58

Slide 58 text

General Parser
(JSON grammar as on slide 55.)

Slide 59

Slide 59 text

Lang Fuzzer
(JSON grammar as on slide 55.)
Source: https://www.fuzzingbook.org/html/LangFuzzer.html

Slide 60

Slide 60 text

    def process_input(input):
        try:
            val = parse(input)  # ✔
            res = process(val)
            return res
        except SyntaxError:
            return Error

(JSON grammar as on slide 55.)

Slide 61

Slide 61 text

Where to Get the Grammar From?

Slide 62

Slide 62 text

Almost Everyone Uses Handwritten Parsers
Source: https://notes.eatonphil.com/parser-generators-vs-handwritten-parsers-survey-2021.html

Slide 63

Slide 63 text

Where to Get the Grammar From?

Slide 64

Slide 64 text

"Be liberal in what you accept, and conservative in what you send" (Postel's Law)

Slide 65

Slide 65 text

JSON common quirks from https://github.com/google/wuffs

    QUIRK_ALLOW_ASCII_CONTROL_CODES
    QUIRK_ALLOW_BACKSLASH_A
    QUIRK_ALLOW_BACKSLASH_CAPITAL_U
    QUIRK_ALLOW_BACKSLASH_E
    QUIRK_ALLOW_BACKSLASH_NEW_LINE
    QUIRK_ALLOW_BACKSLASH_QUESTION_MARK
    QUIRK_ALLOW_BACKSLASH_SINGLE_QUOTE
    QUIRK_ALLOW_BACKSLASH_V
    QUIRK_ALLOW_BACKSLASH_X_AS_BYTES
    QUIRK_ALLOW_BACKSLASH_X_AS_CODE_POINTS
    QUIRK_ALLOW_BACKSLASH_ZERO
    QUIRK_ALLOW_COMMENT_BLOCK
    QUIRK_ALLOW_COMMENT_LINE
    QUIRK_ALLOW_EXTRA_COMMA
    QUIRK_ALLOW_INF_NAN_NUMBERS
    QUIRK_ALLOW_LEADING_ASCII_RECORD_SEPARATOR
    QUIRK_ALLOW_LEADING_UNICODE_BYTE_ORDER_MARK
    QUIRK_ALLOW_TRAILING_FILLER
    QUIRK_EXPECT_TRAILING_NEW_LINE_OR_EOF
    QUIRK_JSON_POINTER_ALLOW_TILDE_N_TILDE_R_TILDE_T
    QUIRK_REPLACE_INVALID_UNICODE

Slide 66

Slide 66 text

Where to Get the Grammar From?
"Be liberal in what you accept, and conservative in what you send" (Postel's Law)
[Figure labels: The Specification, The Implementation, Extra "Features"]

Slide 67

Slide 67 text

Where to Get the Grammar From?
Hand-written parsers already encode the grammar

Slide 68

Slide 68 text

MicroJSON
Source: https://github.com/phensley/microjson
(json_raw() code as on slide 52.)

Grammar:

    ::= | | | | | |
    ::= `"` `"` | `""`
    ::= |
    ::= `{``}` | `{}`
    ::= `,` |
    ::= `:`
    ::= `[``]` | `[]`
    ::= `,` |
    ::=
    ::= |

Slide 69

Slide 69 text

Slide 70

Slide 70 text

Structured Control Flow to Grammar
• Sequence: A, B, C
• Selection: cond, then A (T) or B (F)
• Iteration: cond, repeat B
[Figure: control-flow diagrams for each construct with the corresponding grammar rules]

Slide 71

Slide 71 text

    def json_raw(stm):
        while True:
            stm.skipspaces()
            c = stm.peek()
            if c == 't':
                return json_fixed(stm, 'true')
            elif c == 'f':
                return json_fixed(stm, 'false')
            elif c == 'n':
                return json_fixed(stm, 'null')
            elif c == '"':
                return json_string(stm)
            elif c == '{':
                return json_object(stm)
            elif c == '[':
                return json_array(stm)
            elif c in NUMSTART:
                return json_number(stm)
            raise JSONError(E_MALF, stm, stm.pos)

Grammar:

    ::=
    ::= |
    ::= | | | `"` | `{` | `[` | [[1-9]

Slide 72

Slide 72 text

(json_raw() code and grammar as on slide 71.)

Slide 73

Slide 73 text

    def json_string(stm):
        # skip over '"'
        stm.next()
        r = []
        while True:
            c = stm.next()
            if c == '':
                raise JSONError(E_TRUNC)
            elif c == '\\':
                c = stm.next()
                r.append(decode_escape(c, stm))
            elif c == '"':
                return ''.join(r)
            else:
                r.append(c)

Input: "ab"

Grammar:

    ::= ...
    ::= `"` ...
    ::=
    ::= |
    ::= `\\` | `"` |
    ::=
    ::= |
    ::= `\\` |
    ::=
    ::=`"`

Slide 74

Slide 74 text

Recovered JSON grammar

    ::=
    ::= '"' | '[' | '{' | | 'true' | 'false' | 'null'
    ::= + | + 'e' +
    ::= '+' | '-' | '.' | [0-9] | 'E' | 'e'
    ::= * '"'
    ::= ']' | (',')* ']' | ( ',' )+ (',' )* ']'
    ::= '}' | ( '"' ':' ',' )* '"' ':' '}'
    ::= ' ' | '!' | '#' | '$' | '%' | '&' | ''' | '*' | '+' | '-' | ',' | '.' | '/' | ':' | ';' | '<' | '=' | '>' | '?' | '@' | '[' | ']' | '^' | '_', ''',| '{' | '|' | '}' | '~' | '[A-Za-z0-9]' | '\'
    ::= '"' | '/' | 'b' | 'f' | 'n' | 'r' | 't'

microjson.py

                stm.next()
                if expect_key:
                    raise JSONError(E_DKEY, stm, stm.pos)
                if c == '}':
                    return result
                expect_key = 1
                continue
            # parse out a key/value pair
            elif c == '"':
                key = _from_json_string(stm)
                stm.skipspaces()
                c = stm.next()
                if c != ':':
                    raise JSONError(E_COLON, stm, stm.pos)
                stm.skipspaces()
                val = _from_json_raw(stm)
                result[key] = val
                expect_key = 0
                continue
            raise JSONError(E_MALF, stm, stm.pos)

    def _from_json_raw(stm):
        while True:
            stm.skipspaces()
            c = stm.peek()
            if c == '"':
                return _from_json_string(stm)
            elif c == '{':
                return _from_json_dict(stm)
            elif c == '[':
                return _from_json_list(stm)
            elif c == 't':
                return _from_json_fixed(stm, 'true', True, E_BOOL)
            elif c == 'f':
                return _from_json_fixed(stm, 'false', False, E_BOOL)
            elif c == 'n':
                return _from_json_fixed(stm, 'null', None, E_NULL)
            elif c in NUMSTART:
                return _from_json_number(stm)
            raise JSONError(E_MALF, stm, stm.pos)

    def from_json(data):
        stm = JSONStream(data)
        return _from_json_raw(stm)

Slide 75

Slide 75 text

Evaluation: Accuracy (Mimid)

    Subject          Precision    Recall
    calc.py          100.0%       100.0%
    mathexpr.py       87.5%        92.7%
    cgidecode.py     100.0%       100.0%
    urlparse.py      100.0%        96.4%
    microjson.py      98.7%        93.0%
    parseclisp.py     99.3%        80.6%
    jsonparser.c     100.0%        83.8%
    tiny.c           100.0%        92.8%
    mjs.c             95.4%        95.9%

Precision: inputs generated by inferred grammar that were accepted by the program.
Recall: inputs generated by golden grammar that were accepted by the inferred grammar parser.
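Both measures are acceptance rates over a sample of generated inputs, which the following sketch makes explicit. The helper `acceptance_rate`, the toy "program" (accepts digit strings), the toy inferred parser, and the sample inputs are all made up for illustration; they are not part of the Mimid evaluation.

```python
def acceptance_rate(inputs, accepts):
    """Fraction of `inputs` for which the predicate `accepts` returns True."""
    inputs = list(inputs)
    accepted = sum(1 for text in inputs if accepts(text))
    return accepted / len(inputs)


def program_accepts(text):
    # Toy program under test: accepts non-empty strings of digits.
    return text.isdigit()


def inferred_parser_accepts(text):
    # Toy parser for an inferred grammar that only learned up to 3 digits.
    return text.isdigit() and len(text) <= 3


# Hypothetical samples drawn from the inferred and the golden grammar.
from_inferred_grammar = ["1", "22", "3a"]
from_golden_grammar = ["1", "22", "333", "4444"]

# Precision: inferred-grammar inputs that the program accepts.
precision = acceptance_rate(from_inferred_grammar, program_accepts)
# Recall: golden-grammar inputs that the inferred-grammar parser accepts.
recall = acceptance_rate(from_golden_grammar, inferred_parser_accepts)
```

Low precision means the inferred grammar over-generalizes; low recall means it misses parts of the input language.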

Slide 76

Slide 76 text

Recovered JSON grammar and microjson.py code as on slide 74.
Mimid, ESEC/FSE 2020

Slide 77

Slide 77 text

Challenges
• Servers: HTTP Parser, XML Parser, SOAP Parser, RPC Parser, Application
• Compilers: C Parser, Check Declarations, Check Types, Static Checks
• Semantics

Slide 78

Slide 78 text

Slide 79

Slide 79 text

Which Fuzzer Should We Use?
Exponential growth in fuzzing literature
[Plots: cumulative publications; publications per year]

Slide 80

Slide 80 text

Found CVEs?

Slide 81

Slide 81 text

Structural Coverage?

Slide 82

Slide 82 text

Structural Coverage?
Insufficient
[Plots: Effectiveness vs Branch Coverage; Effectiveness vs Path Coverage]

Slide 83

Slide 83 text

Seeded Fault Benchmarks?

Slide 84

Slide 84 text

Seeded Fault Benchmarks?
• Easy to fine-tune a fuzzer to overfit
• Faults are rarely similar to real faults
• Based on bugs we know about!
• Human bias in bug curation
• Limited supply
• Bug interactions requiring deduplication

Slide 85

Slide 85 text

Seeded Fault Benchmarks?
• Easy to fine-tune a fuzzer to overfit
• Faults are rarely similar to real faults
• Based on bugs we know about!
• Human bias in bug curation
• Limited supply
Mutation Analysis?

Slide 86

Slide 86 text

Mutation Analysis
• Easy to fine-tune a fuzzer to overfit
• Faults are rarely similar to real faults
• Based on bugs we know about!
• Human bias in bug curation
• Limited supply
• Bug interactions requiring deduplication

Slide 93

Slide 93 text

BUT

Slide 94

Slide 94 text

Mutation Analysis Is Costly
Deterministically insert all simple faults, and try to find each.
Original: d = b^2 - 4 * a * c
(Mutant list as on slide 23.)
… and more …

Slide 95

Slide 95 text

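Mutation Analysis Is Costly
• M mutants: the number of mutation points grows with lines of code
• T tests: the number of tests grows with lines of code
• Effort for mutation testing = M x T test runs
Test cases are no longer static
[Plots: mutation points vs lines of code; number of tests vs lines of code; axis label: program size]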
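A sketch of where the M x T figure comes from: the naive procedure runs every test against every mutant. The mutants and tests reuse the discriminant example from slide 22; the function name `count_test_runs` is an assumption of this sketch.

```python
# Three mutants of d = b^2 - 4 * a * c (Python's ** stands in for ^).
MUTANTS = [
    lambda a, b, c: b ** 3 - 4 * a * c,
    lambda a, b, c: b ** 2 + 4 * a * c,
    lambda a, b, c: b ** 2 - 4 + a * c,
]

# (input, expected d) pairs.
TESTS = [((0, 0, 0), 0), ((1, 1, 1), -3), ((0, 2, 0), 4)]


def count_test_runs(mutants, tests):
    """Run every test against every mutant; return (killed, test runs)."""
    runs = 0
    killed = 0
    for mutant in mutants:
        passed = []
        for args, expected in tests:
            runs += 1  # one execution of the mutated program
            passed.append(mutant(*args) == expected)
        if not all(passed):
            killed += 1  # at least one test failed on this mutant
    return killed, runs


killed, runs = count_test_runs(MUTANTS, TESTS)
```

Here M = 3 and T = 3 give 9 executions. With a fuzzer in place of a fixed test suite, T is not a fixed number, since each mutant would need its own fuzzing campaign.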

Slide 96

Slide 96 text

Ongoing Research
• Non-interacting Faults with Higher Order Mutants
• Separate Coverage Analysis
• Parallel Executions