10
Fuzzing Research:
Instrumentation Guided Fuzzing
Industry Requirements:
Uninstrumentable Code
Slide 11
Slide 11 text
[Figure: Service topology map of Uber showing hundreds of microservices (Source: Uber Engineering)]
Instrumentation ability or source code access is not always guaranteed
16
Formal Languages
Formal Language Descriptions
3. Regular
Context Free
Recursively Enumerable
(Chomsky,1956)
Possible to infer
Argument Stack
Return Stack
18
Where to Get the Grammar From?
"Be liberal in what you accept, and conservative in what you send"
(the cause of trouble) Postel's Law
Slide 19
Slide 19 text
19
The standard spec
Buggy Implementation
"Extra" Features
"Accepted" Bugs
•Reference Specification?
Slide 21
Slide 21 text
21
Grammar Inference
Slide 22
Slide 22 text
22
Grammar Inference
1984
1993
2014
2022
2019
Slide 23
Slide 23 text
23
L* Grammar Inference
Slide 24
Slide 24 text
24
Grammar Inference (L*)
L* (Angluin '84)
The Learner asks the Teacher two kinds of queries:
membership: w ∈ L? (Teacher answers yes/no)
equivalence: G = L? (Teacher answers yes/no, with a counterexample on "no")
Slide 25
Slide 25 text
25
Grammar Inference (L*)
L* (Angluin)
Learner, membership: ab ? Teacher: yes
Learner, equivalence: ? Teacher: no, counterexample abbb
Slide 26
Slide 26 text
26
Slide 27
Slide 27 text
27
Grammar Inference (L*)
Language (L): {s ∈ Σ* : #b's = 1 }
Alphabet (Σ) : {a, b}
We begin with S = {ɛ} and E = {ɛ}
Observation Table
         E
         ɛ
S      ɛ   0
S · Σ  a   0
       b   1
T is Closed (states):
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent (transitions):
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
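The closed and consistent conditions can be sketched in a few lines of Python. This is only an illustration for the running example (exactly one b); the function names `member`, `row`, `unclosed`, and `inconsistent` are assumptions made here, not part of the original slides.

```python
from itertools import product

SIGMA = "ab"  # the alphabet of the running example

def member(w):
    """Membership oracle for the running example: exactly one 'b'."""
    return w.count("b") == 1

def row(s, E):
    """Row of prefix s: membership of s + e for every suffix e in E."""
    return tuple(member(s + e) for e in E)

def unclosed(S, E):
    """Return some w in S·Σ whose row matches no row of S, or None if closed."""
    rows_of_S = {row(s, E) for s in S}
    for s, a in product(S, SIGMA):
        if row(s + a, E) not in rows_of_S:
            return s + a
    return None

def inconsistent(S, E):
    """Return a suffix a + e that separates two equal rows, or None if consistent."""
    for s1, s2 in product(S, repeat=2):
        if row(s1, E) != row(s2, E):
            continue
        for a, e in product(SIGMA, E):
            if member(s1 + a + e) != member(s2 + a + e):
                return a + e
    return None
```

With S = {ɛ} and E = {ɛ}, `unclosed([""], [""])` reports the row for b, which is why b is moved into S next.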
Slide 29
Slide 29 text
29
Grammar Inference (L*)
Observation Table
         E
         ɛ
S      ɛ   0
       b   1
S · Σ  a   0
       ba  1
       bb  0
T is Closed:
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent:
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
Language (L): {s ∈ Σ* : #b's = 1 }
Alphabet (Σ) : {a, b}
Slide 33
Slide 33 text
33
Grammar Inference (L*)
Observation Table
         E
         ɛ
S      ɛ   0
       b   1
S · Σ  a   0
       ba  1
       bb  0
T is Closed:
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent:
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
Hypothesis grammar:
<$ɛ> := a <$ɛ>
      | b <$b>
<$b> := a <$b>
      | b <$ɛ>
      | ɛ
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
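One way to read a hypothesis grammar off a closed and consistent table is sketched below. Each distinct row of S becomes a nonterminal, each letter gives a transition, and accepting rows get an ɛ alternative. The function name `table_to_grammar` and the dictionary layout are assumptions for illustration only.

```python
SIGMA = "ab"

def member(w):
    """Membership oracle for the running example: exactly one 'b'."""
    return w.count("b") == 1

def row(s, E):
    """Row of prefix s over the suffixes in E."""
    return tuple(member(s + e) for e in E)

def table_to_grammar(S, E):
    """Build a regular grammar from a closed, consistent table.

    Assumes S is a list that contains '' and that the table is closed,
    so every row reached by one more letter already has a representative.
    """
    rep = {}  # row -> first prefix in S that has this row
    for s in S:
        rep.setdefault(row(s, E), s)
    name = lambda s: "<$%s>" % (s or "ɛ")
    grammar = {}
    for s in rep.values():
        alts = [[a, name(rep[row(s + a, E)])] for a in SIGMA]
        if member(s):        # accepting state: derivation may stop here
            alts.append([])  # the empty alternative stands for ɛ
        grammar[name(s)] = alts
    return grammar
```

Calling `table_to_grammar(["", "b"], [""])` yields the two-rule grammar shown above, with `[]` standing for the ɛ alternative.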
Slide 34
Slide 34 text
34
Grammar Inference (L*)
Observation Table
         E
         ɛ
S      ɛ   0
       b   1
S · Σ  a   0
       ba  1
       bb  0
T is Closed:
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent:
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
Hypothesis grammar:
<ɛ> := a <ɛ>
     | b <b>
<b> := a <b>
     | b <ɛ>
     | ɛ
bbb ✘ (counterexample: the hypothesis accepts bbb, but bbb ∉ L)
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 35
Slide 35 text
35
Grammar Inference (L*)
Observation Table
         E
         ɛ
S      ɛ    0
       b    1
       bb   0
       bbb  0
T is Closed:
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent:
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 37
Slide 37 text
37
Grammar Inference (L*)
Observation Table
         E
         ɛ
S      ɛ     0
       b     1
       bb    0
       bbb   0
S · Σ  a     0
       ba    1
       bba   0
       bbba  0
       bbbb  0
T is Closed:
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent:
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 38
Slide 38 text
38
Grammar Inference (L*)
Observation Table
         E
         ɛ
S      ɛ     0
       b     1
       bb    0
       bbb   0
S · Σ  a     0
       ba    1
       bba   0
       bbba  0
       bbbb  0
T is Closed:
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent:
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
Consistency violation on the letter b:
s1 = [ɛ,] 0
s2 = [bb,] 0
s1.b = [ɛb,] 1
s2.b = [bbb,] 0
The rows of ɛ and bb are equal but differ after appending b, so b is added to E
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 39
Slide 39 text
39
Grammar Inference (L*)
Observation Table
             E
             ɛ  b
S      ɛ     0  1
       b     1  0
       bb    0  0
       bbb   0  0
S · Σ  a     0  1
       ba    1  0
       bba   0  0
       bbba  0  0
       bbbb  0  0
T is Closed:
∀ w∈S⋅ Σ, ∃ s ∈ S such that [w,]=[s,]
If not closed, add the counter example to S
T is Consistent:
∀ s1,s2∈S, if row(s1) = row(s2) then ∀ a ∈ A [s1a,]=[s2a,]
If not consistent, add the counter example to E
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 40
Slide 40 text
40
Grammar Inference (L*)
Observation Table
             E
             ɛ  b
S      ɛ     0  1
       b     1  0
       bb    0  0
       bbb   0  0
S · Σ  a     0  1
       ba    1  0
       bba   0  0
       bbba  0  0
       bbbb  0  0
Hypothesis grammar:
<$ɛ>  := a <$ɛ>
       | b <$b>
<$b>  := a <$b>
       | b <$bb>
       | ɛ
<$bb> := a <$bb>
       | b <$bb>
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 41
Slide 41 text
41
Grammar Inference (L*)
Observation Table
             E
             ɛ  b
S      ɛ     0  1
       b     1  0
       bb    0  0
       bbb   0  0
S · Σ  a     0  1
       ba    1  0
       bba   0  0
       bbba  0  0
       bbbb  0  0
Hypothesis grammar ✓
<$ɛ> := a <$ɛ>
      | b <$b>
<$b> := a <$b>
      | ɛ
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 42
Slide 42 text
42
Grammar Inference (L*)
Improvements (TTT)
-Use a discrimination tree rather than an observation table
-Minimise the counterexamples through exponential backoff and binary search of the suffix
-Intuition: LG(w) != BB(w)
-So there should be a short prefix u, with w = uv, at which they first differ. Find it through binary search. v is then the distinguishing suffix. Minimise that suffix.
-You only need to add the prefixes of u to S.
-Remove suffixes that do not add any value.
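The binary search for a distinguishing suffix can be sketched as follows, in the style of Rivest and Schapire's counterexample analysis. This is a simplified illustration, not the TTT algorithm itself: it omits the exponential backoff and the discrimination tree, and it hard-codes the two-state hypothesis from the running example. The names `DELTA`, `access`, and `distinguishing_suffix` are assumptions.

```python
def member(w):
    """Black box for the running example: exactly one 'b'."""
    return w.count("b") == 1

# Two-state hypothesis from the running example; states are named by
# their access strings ('' and 'b').
DELTA = {("", "a"): "", ("", "b"): "b", ("b", "a"): "b", ("b", "b"): ""}

def access(u):
    """Access string of the hypothesis state reached after reading u."""
    state = ""
    for ch in u:
        state = DELTA[(state, ch)]
    return state

def distinguishing_suffix(w):
    """Binary search over split points of the counterexample w.

    alpha(i) replaces the prefix w[:i] by the access string of the state
    it reaches and asks the black box about the result. alpha(0) and
    alpha(len(w)) differ for a counterexample, so some adjacent pair of
    split points must differ too.
    """
    alpha = lambda i: member(access(w[:i]) + w[i:])
    lo, hi = 0, len(w)
    assert alpha(lo) != alpha(hi), "w is not a counterexample"
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if alpha(mid) == alpha(lo):
            lo = mid
        else:
            hi = mid
    return w[hi:]  # suffix v that separates two merged states
```

For the counterexample bbb this returns the suffix b, which is the same suffix the observation-table walk-through ends up adding to E.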
Slide 43
Slide 43 text
43
Grammar Inference (L*)
Learner Teacher
w
G = L?
Equivalence queries are not possible in software engineering scenarios
<$ɛ> := a <$ɛ>
| b <$b>
<$b> := a <$b>
| ɛ
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
✓
47
L* Teacher with PAC Guarantees
Probably Approximately Correct (Valiant'84)
Pr(L(A) ≢ X ≤ ϵ) ≥ 1 − δ
1 − ϵ: accuracy
1 − δ: confidence
Equivalence Query = Multiple Membership Checks
Checks come from some sampling distribution D over A*
We only get a PAC guarantee based on D
Checks made in place of the i-th equivalence query:
q_i = ⌈(1/ϵ) (ln(1/δ) + i ln 2)⌉
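As a quick worked example, the number of membership checks per equivalence query can be computed directly. The function name is an assumption, and the sketch assumes the brackets in the formula denote rounding up.

```python
import math

def checks_for_equivalence_query(i, epsilon, delta):
    """Number of sampled membership checks replacing the i-th equivalence query.

    Assumes q_i = ceil((1/epsilon) * (ln(1/delta) + i * ln 2)).
    """
    return math.ceil((1 / epsilon) * (math.log(1 / delta) + i * math.log(2)))

# With epsilon = delta = 0.1, each later equivalence query needs a few more checks.
for i in range(3):
    print(i, checks_for_equivalence_query(i, 0.1, 0.1))
```

The count grows linearly in i, so later hypotheses are tested slightly harder than earlier ones.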
Slide 48
Slide 48 text
48
Grammar Inference (L*)
Learner Oracle
Membership Checks substitute Equivalence Queries
w
<ɛ>  := a <ɛ>
      | b <b>
<b>  := a <b>
      | b <bb>
      | ɛ
<bb> := a <bb>
      | b <bb>
Alphabet (Σ) : {a, b}
Language (L): {s ∈ Σ* : #b's = 1 }
Slide 49
Slide 49 text
49
Grammar Inference (PAC-L*)
Learner: L(*)
Oracle: Random Sampler (D), w ∈ D
Substituting Equivalence Queries
w      Blackbox   Hypothesis
ab     ✓          ✓
abb    ✘          ✘
bb     ✓          ✓
aaaa   ✓          ✓
bbb    ✓          ✘
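A sampled stand-in for the equivalence query might look like the sketch below. The sampling distribution here (uniform length up to `max_len`, uniform letters) is purely an assumption standing in for D, and `pac_equivalent` is a hypothetical name.

```python
import math
import random

def pac_equivalent(blackbox, hypothesis, i, epsilon, delta,
                   alphabet="ab", max_len=10, rng=None):
    """Approximate the i-th equivalence query by random membership checks.

    Returns a counterexample string if the black box and the hypothesis
    disagree on a sample, or None if all sampled strings agree.
    """
    rng = rng or random.Random(0)
    checks = math.ceil((1 / epsilon) * (math.log(1 / delta) + i * math.log(2)))
    for _ in range(checks):
        length = rng.randint(0, max_len)
        w = "".join(rng.choice(alphabet) for _ in range(length))
        if blackbox(w) != hypothesis(w):
            return w  # disagreement: hand back to the learner
    return None  # no disagreement found under this distribution

one_b = lambda w: w.count("b") == 1      # the black box of the running example
odd_b = lambda w: w.count("b") % 2 == 1  # the first two-state hypothesis
cex = pac_equivalent(one_b, odd_b, i=1, epsilon=0.05, delta=0.05)
```

A `None` result only says that no disagreement was sampled, so the guarantee is relative to the chosen distribution.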
Slide 50
Slide 50 text
50
Grammar Inference (PAC-L*)
Learner Oracle
w
Random Sampler (D)
w ∈ D
L(*)
Substituting Equivalence Queries
Search Space
Slide 51
Slide 51 text
Grammar Inference
Problem: Exponential Search Space
2^n possibilities for a string of length n
Slide 53
Slide 53 text
Grammar Inference (with examples)
Glade Arvada
With good examples, the problem is tractable
Slide 54
Slide 54 text
Finding Good Examples
Example corpus?
(Blind spots)
54
56
Key Idea: Viable Prefixes
• Differentiate incomplete and incorrect inputs
• Solve one character at a time systematically
Slide 57
Slide 57 text
57
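Example Generator
[Figure: the input is built one character at a time. Accepted characters: [ 5 1 , 4 ]. Rejected attempts: a, b, }]
a ∉ [,],{,},",0,1,2,3,4,5,.,.
b ∉ [,],0,1,2,3,4,5,6,7,8,9,,
} ∉ [,],0,1,2,3,4,5,6,7,8,9,0,,
Result: [51,4]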
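The character-at-a-time idea can be sketched with a toy prefix oracle. The `status` function below is a hand-written stand-in for a real program under test. It accepts only bracketed lists of integers such as [51,4] and reports whether a string is complete, incomplete (a viable prefix), or incorrect. Both function names are assumptions.

```python
import random
import string

def status(s):
    """Toy prefix oracle: 'complete', 'incomplete' (viable prefix) or 'incorrect'."""
    state = "start"
    for ch in s:
        if state == "start" and ch == "[":
            state = "need_digit"
        elif state in ("need_digit", "in_number") and ch in string.digits:
            state = "in_number"
        elif state == "in_number" and ch == ",":
            state = "need_digit"
        elif state == "in_number" and ch == "]":
            state = "done"
        else:
            return "incorrect"
    return "complete" if state == "done" else "incomplete"

def generate(rng, max_len=200):
    """Extend a viable prefix one character at a time until it is complete."""
    prefix = ""
    while len(prefix) < max_len:
        candidates = list(string.printable)
        rng.shuffle(candidates)
        for ch in candidates:
            verdict = status(prefix + ch)
            if verdict == "complete":
                return prefix + ch
            if verdict == "incomplete":
                prefix += ch  # keep the character, move to the next position
                break
            # 'incorrect': discard this character and try another one
        else:
            return None  # no character extends the prefix
    return None  # gave up: length budget exhausted

example = generate(random.Random(1))
```

Each rejected character costs one run of the oracle, so the cost per position is bounded by the alphabet size rather than growing exponentially with the input length.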
Slide 58
Slide 58 text
58
Quality of Examples
Branch Coverage Obtained (C programs)
        PrefixQ  AFL(black)
INI     62.5     65
CSV     65.7     68.3
JSON    13.8     9.2
TinyC   86.8     47.9
MJS     28.0     19.0
Slide 59
Slide 59 text
59
Quality of Examples
Branch Coverage Obtained (C programs)
        PrefixQ  AFL(black)  AFL(gray)
INI     62.5     65          77.5
CSV     65.7     68.3        68.5
JSON    13.8     9.2         22.5
TinyC   86.8     47.9        81.6
MJS     28.0     19.0        29.9
Tex Crash: ]9xdy[zSf$\theta{f!;} ;i\nonfrenchspacing !$$\prec q;7O/, $\downbracefill @Pz \mathstrut{}$^: aK[X|?$47$ ,`D f$)Cg8$*
Slide 60
Slide 60 text
60
Slide 61
Slide 61 text
61
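Positive and Negative Examples with Prefix Queries
[Figure: the input is built one character at a time. Accepted characters: [ 5 1 , 4 ]. Rejected attempts: a, b, }]
a ∉ [,],{,},",0,1,2,3,4,5,.,.
b ∉ [,],0,1,2,3,4,5,6,7,8,9,,
} ∉ [,],0,1,2,3,4,5,6,7,8,9,0,,
Result: [51,4]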
Slide 62
Slide 62 text
62
Grammar Inference (PL*)
Learner: PL(*)
Prefix Oracle
Substituting Equivalence Queries
For each w:
Blackbox: w ∈ B (Yes/No)
Hypothesis: w ∈ H (Yes/No)
Slide 63
Slide 63 text
63
Grammar Inference (PL*)
Pr(L(A) ≢ X ≤ ϵ) ≥ 1 − δ
1 − ϵ: accuracy
1 − δ: confidence
[Figure: Relation between D, ϵ, δ and F1 score on Arithmetic (depth limited). Panels: L(*); PL(*) with Eq = Prefix Sampler (p=0.05); PL(*) with Eq = Prefix Sampler (p=0.5); PL(*) with Eq = Prefix Sampler (p=1.0). Red is good, Blue is bad]
Slide 64
Slide 64 text
64
Grammar Inference (PL*)
Pr(L(A) ≢ X ≤ ϵ) ≥ 1 − δ
1 − ϵ: accuracy
1 − δ: confidence
[Figure: Relation between D, ϵ, δ and F1 score on JSON (depth limited). Panels: L(*); PL(*) with Eq = Prefix Sampler (p=0.05); PL(*) with Eq = Prefix Sampler (p=0.5); PL(*) with Eq = Prefix Sampler (p=1.0). Red is good, Blue is bad]
70
Issue 386 from Rhino
var A = class extends (class {}){};
Issue 2937 from Closure
const [y,y] = [];
var {baz:{} = baz => {}} = baz => {};
Issue 385 from Rhino
{while ((l_0)){ if ((l_0)) {break;;var l_0; continue }0}}
Issue 2842 from Closure
Delta Minimization is useful but not sufficient
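For reference, a simplified character-level delta minimization loop is sketched below. It is a reduced variant of ddmin that only tries removing chunks, and the failure predicate `fails` is a hypothetical stand-in (it fires on doubly nested parentheses in the raw text), not any of the compiler issues listed above.

```python
def fails(s):
    """Hypothetical failure predicate: '((' followed later by '))'."""
    start = s.find("((")
    return start != -1 and "))" in s[start + 2:]

def ddmin(inp, fails):
    """Simplified delta debugging: remove chunks of shrinking size while
    the failure persists. The result is 1-minimal: removing any single
    character makes the failure disappear."""
    assert fails(inp)
    n = 2  # current number of chunks
    while len(inp) >= 2:
        chunk = max(1, len(inp) // n)
        reduced = False
        for start in range(0, len(inp), chunk):
            candidate = inp[:start] + inp[start + chunk:]
            if fails(candidate):
                inp = candidate          # keep the smaller failing input
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if chunk == 1:
                break                    # nothing more can be removed
            n = min(n * 2, len(inp))     # try finer-grained chunks
    return inp

print(ddmin("1+((2*3))-4", fails))
```

The minimized string is a single concrete witness; it does not say which parts of the input could vary while still triggering the failure.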
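3 * 4
<start>   := <expr>
<expr>    := <term> ' + ' <expr>
           | <term> ' - ' <expr>
           | <term>
<term>    := <factor> ' * ' <term>
           | <factor> ' / ' <term>
           | <factor>
<factor>  := '+' <factor>
           | '-' <factor>
           | '(' <expr> ')'
           | <integer> '.' <integer>
           | <integer>
<integer> := <digit> <integer>
           | <digit>
<digit>   := [0-9]
✓ Did not reproduce the failure
78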
Slide 79
Slide 79 text
( ( 4 ) )
79
Slide 80
Slide 80 text
( ( 1 - 2 ) )
✘ reproduced the failure
( ( 1 - 2 ) )
80
Slide 84
Slide 84 text
( ( 1 - 2 ) )
✘ ( ( 1 - 2 ) )
✘ ( ( 2 * 3 + 4 ) )
✘ ( ( - 2 / 1 ) )
✘ ( ( 98 - 0 ) )
84
Slide 85
Slide 85 text
)
(
( )
( ( )
4 )
( ( 4 ) )
c
c
c
c
c
c
c
A
85
Slide 87
Slide 87 text
Minimized Input: ( ( 4 ) )
Abstract Failure Inducing Input: ( ( <expr> ) )
def check(parsed):
    # Fails whenever a parenthesised expression directly wraps another one.
    if parsed.is_nested() and parsed.child.is_nested():
        raise Exception()
    return parsed
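To illustrate what an abstract failure inducing input buys over a single minimized input, the sketch below fills the abstract position with random expressions and checks that every instance still fails. Everything here is an assumption for illustration: the small grammar, the placeholder name `<expr>`, and the string-level `fails` predicate that stands in for the parse-tree check above.

```python
import random

# A small expression grammar, assumed for this sketch.
GRAMMAR = {
    "<expr>": [["<term>", " + ", "<expr>"], ["<term>", " - ", "<expr>"], ["<term>"]],
    "<term>": [["<factor>", " * ", "<term>"], ["<factor>", " / ", "<term>"], ["<factor>"]],
    "<factor>": [["(", "<expr>", ")"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}

def expand(symbol, rng, depth=0):
    """Randomly expand a symbol; past a small depth, always take the last
    (non-recursive) alternative so that expansion terminates."""
    if symbol not in GRAMMAR:
        return symbol
    alts = GRAMMAR[symbol]
    alt = alts[-1] if depth > 4 else rng.choice(alts)
    return "".join(expand(s, rng, depth + 1) for s in alt)

def outer_stripped(s):
    """Return s without its outer parentheses if they enclose all of s, else None."""
    if not (s.startswith("(") and s.endswith(")")):
        return None
    depth = 0
    for i, ch in enumerate(s):
        depth += (ch == "(") - (ch == ")")
        if depth == 0 and i < len(s) - 1:
            return None  # the first '(' closes before the end of s
    return s[1:-1]

def fails(s):
    """Stand-in oracle: the whole input is parenthesised twice."""
    inner = outer_stripped(s)
    return inner is not None and outer_stripped(inner) is not None

def pattern_holds(pattern, rng, trials=50):
    """Check that random instances of the abstract pattern all fail."""
    return all(fails(pattern.replace("<expr>", expand("<expr>", rng)))
               for _ in range(trials))
```

Here `pattern_holds("((<expr>))", random.Random(0))` is true for every sampled instance, which is the property that makes the marked position abstract.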
87
def jsoncheck(json):
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
{"abc": null}
✘
is : null
92
Slide 93
Slide 93 text
{"abc": []}
✔
no is : null
def jsoncheck(json):
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
93
Slide 94
Slide 94 text
{"abc": 124}
✘
no is "" :
def jsoncheck(json):
    if no_key_is_empty_string(json):
        fail('one key must be empty')
    process(json)
94
Slide 95
Slide 95 text
def jsoncheck(json):
    if no_key_is_empty_string(json):
        fail('one key must be empty')
    process(json)
{"": 124}
✔
is "" :
95
Slide 96
Slide 96 text
def jsoncheck(json):
    if no_key_is_empty_string(json):
        fail('one key must be empty')
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
96
Slide 97
Slide 97 text
def jsoncheck(json):
    if no_key_is_empty_string(json):
        fail('one key must be empty')
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
{"": 124}
✔
is "" :
no is : null
&
97
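A runnable version of the combined check is sketched below. The helper names come from the slide, but their bodies are assumptions: they only look at the top-level keys of a JSON object, and the function returns the failure message instead of calling `fail` or `process`.

```python
import json

def any_key_has_null_value(doc):
    """True if some top-level key maps to null (None after parsing)."""
    return any(value is None for value in doc.values())

def no_key_is_empty_string(doc):
    """True if the object has no key equal to the empty string."""
    return "" not in doc

def jsoncheck(text):
    """Return the failure message, or None if the input is accepted."""
    doc = json.loads(text)
    if no_key_is_empty_string(doc):
        return "one key must be empty"
    if any_key_has_null_value(doc):
        return "key value must not be null"
    return None

for sample in ['{"": 124}', '{"abc": 124}', '{"": null}']:
    print(sample, "->", jsoncheck(sample))
```

An input passes only when both conditions hold together: some key is the empty string and no key has a null value.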
Slide 98
Slide 98 text
def jsoncheck(json):
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
{"abc": null}
✘
is : null
Start Symbol
98
Slide 99
Slide 99 text
def jsoncheck(json):
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
{"abc": []}
no is : null
✔
Start Symbol
99
Slide 100
Slide 100 text
def jsoncheck(json):
    if no_key_is_empty_string(json):
        fail('one key must be empty')
    process(json)
{"abc": 124}
✘
is "" :
Start Symbol
100
Slide 101
Slide 101 text
is "" :
no is : null
&
def jsoncheck(json):
    if no_key_is_empty_string(json):
        fail('one key must be empty')
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
{"": 124}
✔
Start Symbol
101
Slide 102
Slide 102 text
def jsoncheck(json):
    if no_key_is_empty_string(json):
        fail('one key must be empty')
    if any_key_has_null_value(json):
        fail('key value must not be null')
    process(json)
Evogram
Automatically Derived
102
Slide 104
Slide 104 text
Supercharged Pattern Matchers
where
is "":
is :null
where
is (())
is / 0
where
is "0"
is "0x"
where
is ";;"
is "()"
is "()"
Alternative to Regular Expressions
104