Fuzzing test in Go

David Chou @ Golang Taipei
License: CC-BY-SA-3.0-TW

- @ Umbo Computer Vision
- @ Golang Taipei, co-organizer
- Software engineer, DevOps, and Gopher
- david74.chou @ gmail
- david74.chou @ facebook
- david74.chou @ medium
- david7482 @ github

What is fuzzing test?
Wikipedia: an automated software testing technique that provides invalid, unexpected, or random data as inputs to a computer program.

A brief history of fuzzing
- 1950s: "We didn't call it fuzzing back in the 1950s, but it was our standard practice to test programs by inputting decks of punch cards taken from the trash. This type of testing was so common that it had no name." - Gerald M. Weinberg
- 1988: the term "fuzzing" is coined by Barton Miller

"Fuzzing is the process of sending intentionally invalid data to a product in the hopes of triggering an error." - H.D. Moore

Fuzzing test
- Continuously manipulate inputs
- Semi-random data from various mutations
- Discover new code coverage based on instrumentation
- Run more mutations quickly, rather than fewer mutations intelligently

https://blog.code-intelligence.com/the-magic-behind-feedback-based-fuzzing

What can be fuzzed?
- deserialization (xml, json, proto, gob)
- network protocols (HTTP, SMTP)
- media codecs (audio, video, images, pdf)
- crypto (boringssl, openssl)
- compression (zip, gzip, bzip2, brotli)
- etc.

Why do we need fuzzing?
You don't know what you don't know.

Why do we need fuzzing?
- Fuzzing can reach edge cases which humans often miss
- It is particularly valuable for finding vulnerabilities
- Also a good choice for regression testing
- Lots of real-world trophies
  - found 15000+ bugs in Chrome
  - found 1500+ bugs in FFmpeg

A simple example

    func CountAverage(num []byte) int {
        sum := byte(0)
        for _, v := range num {
            sum += v
        }
        return int(sum) / len(num)
    }

    func TestCountAverage(t *testing.T) {
        tests := []struct {
            name string
            num  []byte
            want int
        }{
            {
                num:  []byte{1, 2, 3, 4, 5},
                want: 3,
            },
        }
        for _, tt := range tests {
            t.Run(tt.name, func(t *testing.T) {
                got := CountAverage(tt.num)
                assert.EqualValues(t, tt.want, got)
            })
        }
    }

    $ go test -run TestCountAverage -cover
    PASS
    coverage: 100.0% of statements

Also works for logical bugs
Sanity checks still work:
- the result must be within the [0, 1) range
- image decoder: 100 byte input -> 100 MB output?
- encryption: check that decryption fails with the wrong key
- sorting: each element still exists and the order is as expected
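The sorting check above could be turned into a fuzz target along these lines. This is only a sketch: checkSorted and FuzzSort are hypothetical names, and sort.Slice stands in for whatever sorting code is actually under test.

```go
package main

import (
	"sort"
	"testing"
)

// checkSorted is a hypothetical helper. It sorts a copy of data and reports
// whether both sanity checks hold for the result.
func checkSorted(data []byte) bool {
	got := append([]byte(nil), data...)
	// Assumption: sort.Slice plays the role of the code under test.
	sort.Slice(got, func(i, j int) bool { return got[i] < got[j] })

	// Check 1: the order is as expected (non-decreasing).
	for i := 1; i < len(got); i++ {
		if got[i-1] > got[i] {
			return false
		}
	}

	// Check 2: each element still exists (same multiset of byte values).
	var counts [256]int
	for _, b := range data {
		counts[b]++
	}
	for _, b := range got {
		counts[b]--
	}
	for _, c := range counts {
		if c != 0 {
			return false
		}
	}
	return true
}

// FuzzSort feeds arbitrary byte slices into the sanity checks.
func FuzzSort(f *testing.F) {
	f.Add([]byte{3, 1, 2})
	f.Fuzz(func(t *testing.T, data []byte) {
		if !checkSorted(data) {
			t.Errorf("sanity check failed for input %v", data)
		}
	})
}
```

The fuzz function does not need to know the expected output for a given input. It only asserts properties that must hold for every input.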

Also works for logical bugs
Round-trip test:
- deserialize -> serialize -> deserialize
- decompress/compress, decrypt/encrypt
Check that:
- serialize does not fail
- the 2nd deserialize does not fail
- the deserialize results are equal
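A minimal sketch of the round-trip idea, assuming encoding/json as the codec under test. The names roundTrip and FuzzJSONRoundTrip are made up for illustration and do not appear in the talk.

```go
package main

import (
	"encoding/json"
	"errors"
	"reflect"
	"testing"
)

// roundTrip runs deserialize -> serialize -> deserialize on data and returns
// an error if any of the three round-trip checks fails.
func roundTrip(data []byte) error {
	var first interface{}
	if err := json.Unmarshal(data, &first); err != nil {
		return nil // invalid input: nothing to round-trip
	}

	// Check 1: serialize does not fail.
	out, err := json.Marshal(first)
	if err != nil {
		return errors.New("serialize failed: " + err.Error())
	}

	// Check 2: the 2nd deserialize does not fail.
	var second interface{}
	if err := json.Unmarshal(out, &second); err != nil {
		return errors.New("2nd deserialize failed: " + err.Error())
	}

	// Check 3: the deserialize results are equal.
	if !reflect.DeepEqual(first, second) {
		return errors.New("deserialize results differ")
	}
	return nil
}

// FuzzJSONRoundTrip reports any input that violates the round-trip property.
func FuzzJSONRoundTrip(f *testing.F) {
	f.Add([]byte(`{"a":[1,2,3],"b":"x"}`))
	f.Fuzz(func(t *testing.T, data []byte) {
		if err := roundTrip(data); err != nil {
			t.Error(err)
		}
	})
}
```

Inputs that fail the first deserialize are skipped on purpose. The property only concerns inputs the decoder accepts.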

Fuzzing test in Go
go-fuzz to the rescue

go-fuzz
- Dmitry Vyukov, Google
- A successful 3rd-party Go fuzzing solution
- It found 200+ bugs in the Go stdlib, and thousands more
- Coverage-based fuzzing

    Instrument program for code coverage
    Collect initial corpus of inputs
    for {
        Randomly mutate an input from the corpus
        Execute and collect coverage
        If the input gives new coverage, add it to corpus
    }
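To make the loop concrete, here is a toy simulation of coverage-based fuzzing. It is not how go-fuzz is implemented. The target function, its four hand-written coverage flags, and the single mutation strategy are all assumptions chosen to keep the sketch short.

```go
package main

import "math/rand"

// target is a hypothetical program under test. The cov array stands in for
// compiler-inserted coverage instrumentation; here the flags are set by hand.
func target(data []byte, cov *[4]bool) {
	if len(data) > 0 && data[0] == 'F' {
		cov[0] = true
		if len(data) > 1 && data[1] == 'U' {
			cov[1] = true
			if len(data) > 2 && data[2] == 'Z' {
				cov[2] = true
				if len(data) > 3 && data[3] == 'Z' {
					cov[3] = true
				}
			}
		}
	}
}

// mutate returns a copy of in with one random byte appended or overwritten.
func mutate(r *rand.Rand, in []byte) []byte {
	out := append([]byte(nil), in...)
	if len(out) == 0 || r.Intn(4) == 0 {
		return append(out, byte(r.Intn(256)))
	}
	out[r.Intn(len(out))] = byte(r.Intn(256))
	return out
}

// fuzz runs the loop for a fixed number of iterations and returns the corpus.
func fuzz(seed int64, iterations int) [][]byte {
	r := rand.New(rand.NewSource(seed))
	corpus := [][]byte{{}} // initial corpus: a single empty input
	var seen [4]bool       // coverage observed so far

	for i := 0; i < iterations; i++ {
		// Randomly mutate an input from the corpus.
		input := mutate(r, corpus[r.Intn(len(corpus))])

		// Execute and collect coverage.
		var cov [4]bool
		target(input, &cov)

		// If the input gives new coverage, add it to the corpus.
		isNew := false
		for edge, hit := range cov {
			if hit && !seen[edge] {
				seen[edge] = true
				isNew = true
			}
		}
		if isNew {
			corpus = append(corpus, input)
		}
	}
	return corpus
}
```

Each input kept in the corpus becomes a stepping stone: once an input starting with 'F' is found, later mutations build on it instead of starting from scratch.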

1. Write fuzz function

    // +build gofuzz

    func Fuzz(data []byte) int {
        gob.NewDecoder(bytes.NewReader(data)).Decode(new(interface{}))
        return 0
    }

2. Build

    go get github.com/dvyukov/go-fuzz/...
    go-fuzz-build github.com/dvyukov/go-fuzz-corpus/gob

3. Run

    go-fuzz -bin gob-fuzz.zip -workdir ./workdir
    workers: 8, corpus: 1525 (6s ago), crashers: 6, execs: 0 (0/sec), cover: 1651, uptime: 6s
    workers: 8, corpus: 1525 (9s ago), crashers: 6, execs: 16787 (1860/sec), cover: 1651, uptime: 9s
    workers: 8, corpus: 1525 (12s ago), crashers: 6, execs: 29840 (2482/sec), cover: 1651, uptime: 12s

go-fuzz's problems
- Might break (multiple times) due to Go internal package changes
  - It tries to do coverage instrumentation without the compiler's help
- More difficult to use compared to Go's unit testing
  - custom command-line tools
  - separate test files or build tags, etc.

Go's official fuzzing proposal
go test -fuzz

Official proposal
- Write a fuzz function just like a test function: func FuzzFoo(f *testing.F)
- Integrated with the go command: go test -fuzz
- Coverage-based fuzzing
- Planned to land in 1.18

Already beta now

    func FuzzCountAverage(f *testing.F) {
        f.Add([]byte{1})
        f.Fuzz(func(t *testing.T, num []byte) {
            CountAverage(num)
        })
    }

- The fuzz target is a FuzzX function
- Each fuzz target has its own input corpus
- testing.F
  - f.Add(): add seed corpus
  - f.Fuzz(): run the fuzz function

    $ gotip test -fuzz=FuzzCountAverage -parallel=2
    fuzzing, elapsed: 3.0s, execs: 40648 (13549/sec), workers: 2, interesting: 3
    fuzzing, elapsed: 3.4s, execs: 44291 (13157/sec), workers: 2, interesting: 3
    found a crash, minimizing...
    --- FAIL: FuzzCountAverage (3.37s)
        panic: runtime error: integer divide by zero
        goroutine 21364 [running]:
        runtime/debug.Stack()
            /home/david74/sdk/gotip/src/runtime/debug/stack.go:24 +0x90
        testing.tRunner.func1.2({0x69e4c0, 0x887760})
            /home/david74/sdk/gotip/src/testing/testing.go:1281 +0x267
        testing.tRunner.func1()
            /home/david74/sdk/gotip/src/testing/testing.go:1288 +0x218
        panic({0x69e4c0, 0x887760})
            /home/david74/sdk/gotip/src/runtime/panic.go:1038 +0x215
        github.com/david7482/go-fuzzing-playground.CountAverage({0xc000246000, 0x0, 0x0})
            /home/david74/projects/go-fuzzing-playground/count_average.go:8 +0xa5
        ...
        --- FAIL: FuzzCountAverage (0.00s)

        Crash written to testdata/corpus/FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d
        To re-run:
        go test github.com/david7482/go-fuzzing-playground \
            -run=FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d
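The crash comes from an empty slice: len(num) is 0, so the division panics. One possible fix is sketched below. CountAverageFixed is a hypothetical name, and returning 0 for an empty input is an assumption about the desired behaviour, not something the talk specifies. The sketch also sums in an int, because the original byte accumulator wraps around: for []byte{200, 200} the original returns 72 rather than 200, a logical bug that this fuzz target cannot catch because it asserts nothing about the result.

```go
package main

// CountAverageFixed is a hypothetical corrected variant of CountAverage.
func CountAverageFixed(num []byte) int {
	if len(num) == 0 {
		// Assumption: the average of no values is defined as 0 here.
		return 0
	}
	sum := 0 // int accumulator, so large inputs do not wrap around at 256
	for _, v := range num {
		sum += int(v)
	}
	return sum / len(num)
}
```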

Fuzzing go-yaml/yaml

    func FuzzUnmarshal(f *testing.F) {
        f.Add([]byte{1})
        f.Fuzz(func(t *testing.T, input []byte) {
            var v interface{}
            _ = yaml.Unmarshal(input, &v)
        })
    }

    $ gotip test -fuzz=FuzzUnmarshal
    fuzzing, elapsed: 3.0s, execs: 62242 (20740/sec), workers: 4, interesting: 41
    fuzzing, elapsed: 6.0s, execs: 127025 (21168/sec), workers: 4, interesting: 48
    ...
    fuzzing, elapsed: 1794.0s, execs: 39365685 (21943/sec), workers: 4, interesting: 324
    fuzzing, elapsed: 1796.9s, execs: 39427737 (21942/sec), workers: 4, interesting: 324
    found a crash, minimizing...
    --- FAIL: FuzzUnmarshal (1796.90s)
        panic: runtime error: invalid memory address or nil pointer dereference
        goroutine 9884315 [running]:
        panic({0x72d820, 0x93abe0})
            /home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
        gopkg.in/yaml%2ev3.handleErr(0xc00007f6b0)
            /home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@.../yaml.go:29
        panic({0x72d820, 0x93abe0})
            /home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
        gopkg.in/yaml%2ev3.yaml_parser_split_stem_comment(0xc00bf34c00, 0x1)
            /home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@.../parserc.go
        gopkg.in/yaml%2ev3.yaml_parser_parse_block_sequence_entry(0xc00bf34c00, 0xc00bf34eb0, 0x0)
            /home/ubuntu/go/pkg/mod/gopkg.in/yaml.v3@.../parserc.go
        gopkg.in/yaml%2ev3.yaml_parser_state_machine(0xc00bf34c00, 0x40df54)
        ...
        --- FAIL: FuzzUnmarshal (0.00s)

        Crash written to testdata/corpus/FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092
        To re-run:
        go test gopkg.in/yaml.v2 \
            -run=FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092

Minimal program reproducing the crash:

    package main

    import (
        "fmt"

        "gopkg.in/yaml.v3"
    )

    func main() {
        in := "#\n-[["
        var n yaml.Node
        if err := yaml.Unmarshal([]byte(in), &n); err != nil {
            fmt.Println(err)
        }
    }

- It does fuzzing with multiple processes
- Seed corpus folder: ${pkg}/testdata/corpus
- Seed corpus = seeds in files + seeds in test
- A good seed corpus can save the mutation engine a lot of work
- Regression test: go test (no -fuzz) also runs Fuzz() functions with the seed corpus as input

Current limitation
- Only supports []byte and primitive types
  - No struct type, slice and array support
- Cannot run multiple fuzzers in the same pkg
- Cannot keep running after a crash is found
- Cannot convert existing files to the corpus format

Corpus file format:

    go test fuzz v1
    float(45.241)
    int(12345)
    []byte("ABC\xa8\x8c\xb3G\xfc")
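Since existing files cannot be converted automatically, a small helper could produce the text format shown in the sample. This is a sketch that only mirrors that sample: encodeCorpusEntry is a hypothetical name, it handles a single []byte value, and the exact encoding accepted by the testing package may differ between Go versions.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeCorpusEntry renders one []byte value in the corpus text format shown
// in the sample: a version header line followed by one line per value.
func encodeCorpusEntry(data []byte) string {
	var b strings.Builder
	b.WriteString("go test fuzz v1\n")
	// %q prints the bytes as a double-quoted Go string literal, escaping
	// non-printable and non-UTF-8 bytes as \x.. sequences.
	fmt.Fprintf(&b, "[]byte(%q)\n", data)
	return b.String()
}
```

The returned string could then be written to a file inside the seed corpus folder of the package, for example with os.WriteFile.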

How "go test -fuzz" works
Show me the code

    Instrument program for code coverage
    Collect initial corpus of inputs
    for {
        Randomly mutate an input from the corpus
        Execute and collect coverage
        If the input gives new coverage, add it to corpus
    }

- The architecture of "go test -fuzz"
- How it collects code coverage
- How it mutates input data

Architecture: one coordinator, multiple workers

Coordinator
- run & ping workers
- ask workers to fuzz the next input
- write to seed corpus if crash
- write to corpus cache if new edge

Coordinator <-> Worker: RPC request <-> response
- command: pipe
- input data: shm

Worker
- mutate input
- run fuzz function
- collect coverage
- return crash or new edge; otherwise continue

Compiler instrumentation
edge() inserts coverage instrumentation

    // edge inserts coverage instrumentation for libfuzzer.
    func (o *orderState) edge() {
        // Create a new uint8 counter to be allocated in section
        // __libfuzzer_extra_counters.
        counter := staticinit.StaticName(types.Types[types.TUINT8])
        counter.SetLibfuzzerExtraCounter(true)

        // counter += 1
        incr := ir.NewAssignOpStmt(base.Pos, ir.OADD, counter, ir.NewInt(1))
        o.append(incr)
    }

Compiler instrumentation
The compiler adds edge() into each edge

    func (o *orderState) stmt(n ir.Node) {
        switch n.Op() {
        ...
        case ir.OFOR:
            edge()
        case ir.OIF:
            edge()
        case ir.ORANGE:
            edge()
        case ir.OSELECT:
            edge()
        case ir.OSWITCH:
            edge()
        ...
        }
    }

Compiler instrumentation
coverage() returns the coverage counters

    // _counters and _ecounters mark the start and end, respectively, of where
    // the 8-bit coverage counters reside in memory. They're known to cmd/link,
    // which specially assigns their addresses for this purpose.
    var _counters, _ecounters [0]byte

    func coverage() []byte {
        addr := unsafe.Pointer(&_counters)
        size := uintptr(unsafe.Pointer(&_ecounters)) - uintptr(addr)

        var res []byte
        *(*unsafeheader.Slice)(unsafe.Pointer(&res)) = unsafeheader.Slice{
            Data: addr,
            Len:  int(size),
            Cap:  int(size),
        }
        return res
    }

The mutators

    var byteSliceMutators = []byteSliceMutator{
        byteSliceRemoveBytes,
        byteSliceInsertRandomBytes,
        byteSliceDuplicateBytes,
        byteSliceOverwriteBytes,
        byteSliceBitFlip,
        byteSliceXORByte,
        byteSliceSwapByte,
        byteSliceOverwriteInterestingUint8,
        byteSliceOverwriteInterestingUint16,
        byteSliceOverwriteInterestingUint32,
        byteSliceInsertConstantBytes,
        byteSliceOverwriteConstantBytes,
        byteSliceShuffleBytes,
        byteSliceSwapBytes,
        ....
    }

    func (m *mutator) mutateBytes(ptrB *[]byte)
    func (m *mutator) mutateInt(v, maxValue int64) int64
    func (m *mutator) mutateUInt(v, maxValue uint64) uint64
    func (m *mutator) mutateFloat(v, maxValue float64) float64

- fuzzing test
- the benefit of fuzzing
- go-fuzz project
- Go official fuzzing solution
- continuous fuzzing ????
