100 million lines Google is ~ 2 Billion lines 5 M 10 M 15 M 20 M 25 M 30 M 35 M (Source: Wikipedia) #Linux Kernel Size in Millions of Lines of Code Linux Releases Growth in software complexity
100 million lines Google is ~ 2 Billion lines 1994 1996 1.0.0 2.0.0 2.1.0 2.2.0 2.4.0 2.6.0 2003 2011 3.0 2015 4.0 2019 5.0 2019 5.7 2001 5 M 10 M 15 M 20 M 25 M 30 M 35 M (Source: Wikipedia) #Linux Kernel Size in Millions of Lines of Code Linux Releases Growth in software complexity
152724057932073828569041216191 099859446496509919024810271242 622974988671421938012464630138 735355134599327240920259675263 574528613057084231370741920902 794677842164654990353575580453 777282305855352378119038096476 699871306655084953377039862387 924957554389878352934547664240 082431556093837288597262675598 630851919061829885048834738832 677022429414980917053939970795 722006987916088650168665471731 yes 12 Random Fuzzing def is_prime(n: int) -> bool: """Primality test using 6k+-1 optimization.""" if n <= 3: return n > 1 if n % 2 == 0 or n % 3 == 0: return False i = 5 while i ** 2 <= n: if n % i == 0 or n % (i + 2) == 0: return False i += 6 return True def main(): num = stdin.read() print(num, is_prime(num))
152724057932073828569041216191 099859446496509919024810271242 622974988671421938012464630138 735355134599327240920259675263 574528613057084231370741920902 794677842164654990353575580453 777282305855352378119038096476 699871306655084953377039862387 924957554389878352934547664240 082431556093837288597262675598 630851919061829885048834738832 677022429414980917053939970795 722006987916088650168665471731 yes 12 Random Fuzzing def is_prime(n: int) -> bool: """Primality test using 6k+-1 optimization.""" if n <= 3: return n > 1 if n % 2 == 0 or n % 3 == 0: return False i = 5 while i ** 2 <= n: if n % i == 0 or n % (i + 2) == 0: return False i += 6 return True def main(): num = stdin.read() print(num, is_prime(num)) Weak point: unequal divisions in input space Input space n > 3 n <= 3
== b: if b == c: return Equilateral else: return Isosceles else: if b == c: return Isosceles else: if a == c: return Isosceles else: return Scalene def triangle(a, b, c): if a == b: if b == c: return Equilateral else: return Isosceles else: if b == c: return Isosceles else: if a == c: return Isosceles else: return Scalene
a == b: __probe_1() if b == c: __probe_2() return Equilateral else: __probe_3() return Isosceles else: __probe_4() if b == c: __probe_5() return Isosceles else: __probe_6() if a == c: __probe_7() return Isosceles else: __probe_8() return Scalene def triangle(a, b, c): if a == b: if b == c: return Equilateral else: return Isosceles else: if b == c: return Isosceles else: if a == c: return Isosceles else: return Scalene
== b: if b == c: return Equilateral else: return Isosceles else: if b == c: return Isosceles else: if a == c: return Isosceles else: return Scalene triangle (1,1,1)
== b: if b == c: return Equilateral else: return Isosceles else: if b == c: return Isosceles else: if a == c: return Isosceles else: return Scalene triangle (1,1,2)
seeds with new coverage AFL Pulling JPEGs out of thin air https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html Valid JPEG in 6 hours in an 8 core machine!
, members pair string : value array [ ] [ elements ] elements value value , elements value string number object array true false null string " " " chars " chars char char chars char UNICODE \ [",\,CTRL] \" \\ \/ \b \f \n \r \t \u hex hex hex hex number int int frac int exp int frac exp int digit onenine digits - digit - onenine digits frac . digits exp e digits hex digit A - F a - f digits digit digit digits e e e+ e- E E+ E- https://www.json.org
A if A determines whether B executes. def parse_csv(s,i): while s[i:]: if is_digit(s[i]): n,j = num(s[i:]) i = i+j else: comma(s[i]) i += 1 CDG for parse_csv
A if A determines whether B executes. def parse_csv(s,i): while s[i:]: if is_digit(s[i]): n,j = num(s[i:]) i = i+j else: comma(s[i]) i += 1 CDG for parse_csv while: determines whether if: executes
i = i+j else: comma(s[i]) i += 1 CDG for parse_csv Dynamic Control Dependence Tree Each statement execution is represented as a separate node DCD Tree for call parse_csv()
i = i+j else: comma(s[i]) i += 1 DCD Tree ~ Parse Tree •No tracking beyond input buffer •Characters are attached to nodes where they are accessed last "12," "12,"
i = i+j else: comma(s[i]) i += 1 '1' '2' ',' DCD Tree ~ Parse Tree •No tracking beyond input buffer •Characters are attached to nodes where they are accessed last "12," "12,"
= '' while s[i:] and is_digit(s[i]): n += s[i] i = i +1 return i,n def parse_paren(s, i): assert s[i] == '(' i, v = parse_expr(s, i+1) if s[i:] == '': raise Ex(s, i) assert s[i] == ')' return i+1, v def parse_expr(s, i = 0): expr, is_op = [], True while s[i:]: c = s[i] if isdigit(c): if not is_op: raise Ex(s,i) i,num = parse_num(s,i) expr.append(num) is_op = False elif c in ['+', '-', '*', '/']: if is_op: raise Ex(s,i) expr.append(c) is_op, i = True, i + 1 elif c == '(': if not is_op: raise Ex(s,i) i, cexpr = parse_paren(s, i) expr.append(cexpr) is_op = False elif c == ')': break else: raise Ex(s,i) if is_op: raise Ex(s,i) return i, expr 9+3/4 Parse tree for parse_expr('9+3/4')
= '' while s[i:] and is_digit(s[i]): n += s[i] i = i +1 return i,n def parse_paren(s, i): assert s[i] == '(' i, v = parse_expr(s, i+1) if s[i:] == '': raise Ex(s, i) assert s[i] == ')' return i+1, v def parse_expr(s, i = 0): expr, is_op = [], True while s[i:]: c = s[i] if isdigit(c): if not is_op: raise Ex(s,i) i,num = parse_num(s,i) expr.append(num) is_op = False elif c in ['+', '-', '*', '/']: if is_op: raise Ex(s,i) expr.append(c) is_op, i = True, i + 1 elif c == '(': if not is_op: raise Ex(s,i) i, cexpr = parse_paren(s, i) expr.append(cexpr) is_op = False elif c == ')': break else: raise Ex(s,i) if is_op: raise Ex(s,i) return i, expr 9+3/4 Parse tree for parse_expr('9+3/4')
= '' while s[i:] and is_digit(s[i]): n += s[i] i = i +1 return i,n def parse_paren(s, i): assert s[i] == '(' i, v = parse_expr(s, i+1) if s[i:] == '': raise Ex(s, i) assert s[i] == ')' return i+1, v def parse_expr(s, i = 0): expr, is_op = [], True while s[i:]: c = s[i] if isdigit(c): if not is_op: raise Ex(s,i) i,num = parse_num(s,i) expr.append(num) is_op = False elif c in ['+', '-', '*', '/']: if is_op: raise Ex(s,i) expr.append(c) is_op, i = True, i + 1 elif c == '(': if not is_op: raise Ex(s,i) i, cexpr = parse_paren(s, i) expr.append(cexpr) is_op = False elif c == ')': break else: raise Ex(s,i) if is_op: raise Ex(s,i) return i, expr 9+3/4 Identifying Compatible Nodes Which nodes correspond to the same nonterminal
= '' while s[i:] and is_digit(s[i]): n += s[i] i = i +1 return i,n def parse_paren(s, i): assert s[i] == '(' i, v = parse_expr(s, i+1) if s[i:] == '': raise Ex(s, i) assert s[i] == ')' return i+1, v def parse_expr(s, i = 0): expr, is_op = [], True while s[i:]: c = s[i] if isdigit(c): if not is_op: raise Ex(s,i) i,num = parse_num(s,i) expr.append(num) is_op = False elif c in ['+', '-', '*', '/']: if is_op: raise Ex(s,i) expr.append(c) is_op, i = True, i + 1 elif c == '(': if not is_op: raise Ex(s,i) i, cexpr = parse_paren(s, i) expr.append(cexpr) is_op = False elif c == ')': break else: raise Ex(s,i) if is_op: raise Ex(s,i) return i, expr 9+3/4 Identifying Compatible Nodes Which nodes correspond to the same nonterminal
'-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number> <number> := <integer> | <integer> '.' <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] Compiling the Grammar def start(): expr() def expr(): match (random() % 6): case 0: expr(); print('+'); expr() case 1: expr(); print('-'); expr() case 2: expr(); print('/'); expr() case 3: expr(); print('*'); expr() case 4: print('('); expr(); print(')') case 5: number() def number(): match (random() % 2): case 0: integer() case 1: integer(); print('.'); integer() def integer(): match (random() % 2): case 0: digit(); integer() case 1: digit() def digit(): match (random() % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7') def start(rops): expr(rops) def expr(rops): match (rops.next % 6): case 0: expr(rops); print('+'); e case 1: expr(rops); print('-'); e case 2: expr(rops); print('/'); e case 3: expr(rops); print('*'); e case 4: print('('); expr(rops); p case 5: number(rops) def number(rops): match (rops.next % 2): case 0: integer(rops) case 1: integer(rops); print('.') def integer(rops): match (rops.next % 2): case 0: digit(rops); integer(rops case 1: digit(rops) def digit(rops): match (rops.next % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7')
'-' <expr> | <expr> '/' <expr> | <expr> '*' <expr> | '(' <expr> ')' | <number> <number> := <integer> | <integer> '.' <integer> <integer>:= <digit> <integer> | <digit> <digit> := [0-9] Compiling the Grammar def start(): expr() def expr(): match (random() % 6): case 0: expr(); print('+'); expr() case 1: expr(); print('-'); expr() case 2: expr(); print('/'); expr() case 3: expr(); print('*'); expr() case 4: print('('); expr(); print(')') case 5: number() def number(): match (random() % 2): case 0: integer() case 1: integer(); print('.'); integer() def integer(): match (random() % 2): case 0: digit(); integer() case 1: digit() def digit(): match (random() % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7') def start(rops): expr(rops) def expr(rops): match (rops.next % 6): case 0: expr(rops); print('+'); e case 1: expr(rops); print('-'); e case 2: expr(rops); print('/'); e case 3: expr(rops); print('*'); e case 4: print('('); expr(rops); p case 5: number(rops) def number(rops): match (rops.next % 2): case 0: integer(rops) case 1: integer(rops); print('.') def integer(rops): match (rops.next % 2): case 0: digit(rops); integer(rops case 1: digit(rops) def digit(rops): match (rops.next % 10): case 0: print('0') case 1: print('1') case 2: print('2') case 3: print('3') case 4: print('4') case 5: print('5') case 6: print('6') case 7: print('7')
F1 Fuzzer VM Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Mining Input Grammars from Dynamic Control Flow. PLDI 2019. Mathis, Gopinath and Zeller Learning Input Tokens for Effective Fuzzing. ISSTA 2020. Gopinath, Bendrissou, Mathis, and Andreas Zeller Black-box Testing with Monotonic Prefixes. ISSTA 2021 (submitted). Gopinath and Zeller Building Fast Fuzzers 2019 (unpublished) Gopinath, Mathis, and Zeller Mining Input Grammars with Dynamic Control Flow. FSE 2020.
F1 Fuzzer VM Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Mining Input Grammars from Dynamic Control Flow. PLDI 2019. Mathis, Gopinath and Zeller Learning Input Tokens for Effective Fuzzing. ISSTA 2020. Gopinath, Bendrissou, Mathis, and Andreas Zeller Black-box Testing with Monotonic Prefixes. ISSTA 2021 (submitted). Gopinath and Zeller Building Fast Fuzzers 2019 (unpublished) Gopinath, Mathis, and Zeller Mining Input Grammars with Dynamic Control Flow. FSE 2020.
F1 Fuzzer VM Mathis, Gopinath, Mera, Kampmann, Höschele, and Zeller. Mining Input Grammars from Dynamic Control Flow. PLDI 2019. Mathis, Gopinath and Zeller Learning Input Tokens for Effective Fuzzing. ISSTA 2020. Gopinath, Bendrissou, Mathis, and Andreas Zeller Black-box Testing with Monotonic Prefixes. ISSTA 2021 (submitted). Gopinath and Zeller Building Fast Fuzzers 2019 (unpublished) Gopinath, Mathis, and Zeller Mining Input Grammars with Dynamic Control Flow. FSE 2020.
c c A ( ( ) ) <expr> ( ( ) ) 4 Minimized Input Abstract Failure Inducing Input • Effectively abstracts a minimized input • The abstraction identifies where the problem lies • Decompose complex program behaviors DDSET Gopinath, Kampmann, Havrikov, Soremekun, and Zeller. Abstracting Failure Inducing Inputs. ISSTA 2020. def check(parsed): if parsed.is_nested() and parsed.child.is_nested(): raise Exception() return input ISSTA 2020 Distinguished Award
or C2842) and (R385 or R386)> <variableDeclarationList C2937> is <varModifier> <Identifier> = class extends (class {}){} Closure 2937 <variableStatement R385> is var {<$Id2>:{} = <$Id2> => {}} <variableDeclaration>; Rhino 385 <variableDeclarationList R386> is const [<$Id3>,<$Id3>] = [] Rhino 386 <iterationStatement C2842> is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}} Closure 2842 where Algebra of Grammar Specializations
C2842 or R385 or R386)> where <variableDeclarationList C2937> is <varModifier> <Identifier> = class extends (class {}){} <iterationStatement C2842> is {while ((<$Id1>)){ if ((<$Id1>)) {break;;var <$Id1>; continue }0}} <variableStatement R385> is var {<$Id2>:{} = <$Id2> => {}} <variableDeclaration>; <variableDeclarationList R386> is const [<$Id3>,<$Id3>] = [] Mechanized proofs are available Algebra of Grammar Specializations