Fuzzing test in Go

David Chou

July 17, 2021
Transcript

  1. @ Umbo Computer Vision
    @ Golang Taipei, co-organizer
    Software engineer, DevOps, and Gopher
    david74.chou @ gmail
    david74.chou @ facebook
    david74.chou @ medium
    david7482 @ github
  2. What is a fuzzing test?
    Wikipedia: an automated testing technique that provides random data as inputs to a computer program.
  3. A brief history of fuzzing
    1950s: "We didn't call it fuzzing back in the 1950s, but it was our standard practice to test programs by inputting decks of punch cards taken from the trash. This type of testing was so common that it had no name." - Gerald M. Weinberg
    1988: the term "fuzzing" is coined by Barton Miller
  4. "Fuzzing is the process of sending intentionally invalid data to a product in the hopes of triggering an error." - H.D. Moore
  5. Fuzzing test
    Continuously manipulate inputs
    Semi-random data from various mutations
    Discover new code coverage based on instrumentation
    Run more mutations quickly, rather than fewer mutations intelligently
  6. What can be fuzzed?
    deserialization (xml, json, proto, gob)
    network protocols (HTTP, SMTP)
    media codecs (audio, video, images, pdf)
    crypto (boringssl, openssl)
    compression (zip, gzip, bzip2, brotli)
    etc.
  7. Why do we need fuzzing?
    Fuzzing can reach edge cases that humans often miss
    It is particularly valuable for finding vulnerabilities
    It is also a good choice for regression testing
    Lots of real-world trophies:
        found 15000+ bugs in Chrome
        found 1500+ bugs in FFMPEG
  8. A simple example
    func CountAverage(num []byte) int {
        sum := byte(0)
        for _, v := range num {
            sum += v
        }
        return int(sum) / len(num)
    }
  9. func TestCountAverage(t *testing.T) {
        tests := []struct {
            name string
            num  []byte
            want int
        }{
            {
                num:  []byte{1, 2, 3, 4, 5},
                want: 3,
            },
        }
        for _, tt := range tests {
            t.Run(tt.name, func(t *testing.T) {
                got := CountAverage(tt.num)
                assert.EqualValues(t, tt.want, got)
            })
        }
    }

    $ go test -run TestCountAverage -cover
    PASS
    coverage: 100.0% of statements
  10. Also works for logical bugs
    Sanity checks still work:
        the result must be within the [0, 1) range
        image decoder: 100-byte input -> 100 MB output?
        encrypt, then check that decryption fails with the wrong key
        sorting: every element still exists and the order is as expected
  11. Also works for logical bugs
    Round-trip test:
        deserialize -> serialize -> deserialize
        decompress/compress, decrypt/encrypt
    Check that:
        serialize does not fail
        the 2nd deserialize does not fail
        both deserialize results are equal
  12. go-fuzz
    Dmitry Vyukov, Google
    A successful third-party Go fuzzing solution
    It found 200+ bugs in the Go stdlib, and thousands more
    Coverage-based fuzzing:
        Instrument program for code coverage
        Collect initial corpus of inputs
        for {
            Randomly mutate an input from the corpus
            Execute and collect coverage
            If the input gives new coverage, add it to corpus
        }
  13. Using go-fuzz
    1. Write fuzz function
        // +build gofuzz

        func Fuzz(data []byte) int {
            gob.NewDecoder(bytes.NewReader(data)).Decode(new(interface{}))
            return 0
        }
    2. Build
        go get github.com/dvyukov/go-fuzz/...
        go-fuzz-build github.com/dvyukov/go-fuzz-corpus/gob
    3. Run
        go-fuzz -bin gob-fuzz.zip -workdir ./workdir
        workers: 8, corpus: 1525 (6s ago), crashers: 6, execs: 0 (0/sec), cover: 1651, uptime: 6s
        workers: 8, corpus: 1525 (9s ago), crashers: 6, execs: 16787 (1860/sec), cover: 1651, uptime: 9s
        workers: 8, corpus: 1525 (12s ago), crashers: 6, execs: 29840 (2482/sec), cover: 1651, uptime: 12s
  14. go-fuzz's problems
    It might break (and has, multiple times) due to changes in Go internal packages
        It tries to do coverage instrumentation without the compiler's help
    It is more difficult to use than Go's unit testing
        custom command-line tools
        separate test files or build tags, etc.
  15. Official proposal
    Write a fuzz function just like a test function
        func FuzzFoo(f *testing.F)
    Integrated with the go command
        go test -fuzz
    Coverage-based fuzzing
    Planned to land in Go 1.18
  16. func FuzzCountAverage(f *testing.F) {
        f.Add([]byte{1})
        f.Fuzz(func(t *testing.T, num []byte) {
            CountAverage(num)
        })
    }

    The fuzz target is a FuzzX function
    Each fuzz target has its own corpus input
    testing.F
        f.Add(): add seed corpus
        f.Fuzz(): run the fuzz function
  17. $ gotip test -fuzz=FuzzCountAverage -parallel=2
    fuzzing, elapsed: 3.0s, execs: 40648 (13549/sec), workers: 2, interesting: 3
    fuzzing, elapsed: 3.4s, execs: 44291 (13157/sec), workers: 2, interesting: 3
    found a crash, minimizing...
    --- FAIL: FuzzCountAverage (3.37s)
        panic: runtime error: integer divide by zero
        goroutine 21364 [running]:
        runtime/debug.Stack()
            /home/david74/sdk/gotip/src/runtime/debug/stack.go:24 +0x90
        testing.tRunner.func1.2({0x69e4c0, 0x887760})
            /home/david74/sdk/gotip/src/testing/testing.go:1281 +0x267
        testing.tRunner.func1()
            /home/david74/sdk/gotip/src/testing/testing.go:1288 +0x218
        panic({0x69e4c0, 0x887760})
            /home/david74/sdk/gotip/src/runtime/panic.go:1038 +0x215
        github.com/david7482/go-fuzzing-playground.CountAverage({0xc000246000, 0x0, 0x0})
            /home/david74/projects/go-fuzzing-playground/count_average.go:8 +0xa5
        ...
    --- FAIL: FuzzCountAverage (0.00s)
    Crash written to testdata/corpus/FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d
    To re-run:
    go test github.com/david7482/go-fuzzing-playground \
        -run=FuzzCountAverage/d40a98862ed393eb712e47a91bcef18e6f24cf368bb4bd248c7a7101ef8e178d
  18. func FuzzUnmarshal(f *testing.F) {
        f.Add([]byte{1})
        f.Fuzz(func(t *testing.T, input []byte) {
            var v interface{}
            _ = yaml.Unmarshal(input, &v)
        })
    }

    Target: go-yaml/yaml
  19. $ gotip test -fuzz=FuzzUnmarshal
    fuzzing, elapsed: 3.0s, execs: 62242 (20740/sec), workers: 4, interesting: 41
    fuzzing, elapsed: 6.0s, execs: 127025 (21168/sec), workers: 4, interesting: 48
    ...
    fuzzing, elapsed: 1794.0s, execs: 39365685 (21943/sec), workers: 4, interesting: 324
    fuzzing, elapsed: 1796.9s, execs: 39427737 (21942/sec), workers: 4, interesting: 324
    found a crash, minimizing...
    --- FAIL: FuzzUnmarshal (1796.90s)
        panic: runtime error: invalid memory address or nil pointer dereference
        goroutine 9884315 [running]:
        panic({0x72d820, 0x93abe0})
            /home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
        gopkg.in/yaml%2ev3.handleErr(0xc00007f6b0)
            /home/ubuntu/go/pkg/mod/gopkg.in/[email protected]/yaml.go:29
        panic({0x72d820, 0x93abe0})
            /home/ubuntu/sdk/gotip/src/runtime/panic.go:1038 +0x215
        gopkg.in/yaml%2ev3.yaml_parser_split_stem_comment(0xc00bf34c00, 0x1)
            /home/ubuntu/go/pkg/mod/gopkg.in/[email protected]/parserc.go
        gopkg.in/yaml%2ev3.yaml_parser_parse_block_sequence_entry(0xc00bf34c00, 0xc00bf34eb0, 0x0)
            /home/ubuntu/go/pkg/mod/gopkg.in/[email protected]/parserc.go
        gopkg.in/yaml%2ev3.yaml_parser_state_machine(0xc00bf34c00, 0x40df54)
        ...
    --- FAIL: FuzzUnmarshal (0.00s)
    Crash written to testdata/corpus/FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092
    To re-run:
    go test gopkg.in/yaml.v2 \
        -run=FuzzUnmarshal/9c9e78ca4b2c797536d2fbe662c68321c5c3ab6df680664b23c913799fc7f092
  20. package main

    import (
        "fmt"

        "gopkg.in/yaml.v3"
    )

    func main() {
        in := "#\n-[["
        var n yaml.Node
        if err := yaml.Unmarshal([]byte(in), &n); err != nil {
            fmt.Println(err)
        }
    }
  21. It does fuzzing with multiple processes
    Seed corpus folder: ${pkg}/testdata/corpus
        Seed corpus = seeds in files + seeds in the test
        A good seed corpus can save the mutation engine a lot of work
    Regression test
        go test (without -fuzz) also runs the fuzz functions with the seed corpus as input
  22. Current limitations
    Only supports []byte and primitive types
        No struct, slice, or array support
    Cannot run multiple fuzzers in the same package
    Cannot keep running after a crash is found
    Cannot convert existing files to the corpus format:
        go test fuzz v1
        float(45.241)
        int(12345)
        []byte("ABC\xa8\x8c\xb3G\xfc")
  23. Instrument program for code coverage
    Collect initial corpus of inputs
    for {
        Randomly mutate an input from the corpus
        Execute and collect coverage
        If the input gives new coverage, add it to corpus
    }

    The architecture of "go test -fuzz"
    How it collects code coverage
    How it mutates input data
  24. Coordinator
        run & ping workers
        ask workers to fuzz the next input
        write to seed corpus if crash
        write to corpus cache if new edge
    RPC request <-> response (coordinator <-> each worker)
        command: pipe
        input data: shm
    Worker (two shown)
        mutate input
        run fuzz function
        collect coverage
        return crash or new edge; otherwise continue
  25. Compiler instrumentation
    // edge inserts coverage instrumentation for libfuzzer.
    func (o *orderState) edge() {
        // Create a new uint8 counter to be allocated in section
        // __libfuzzer_extra_counters.
        counter := staticinit.StaticName(types.Types[types.TUINT8])
        counter.SetLibfuzzerExtraCounter(true)

        // counter += 1
        incr := ir.NewAssignOpStmt(base.Pos, ir.OADD, counter, ir.NewInt(1))
        o.append(incr)
    }

    edge() inserts coverage instrumentation
  26. Compiler instrumentation
    func (o *orderState) stmt(n ir.Node) {
        switch n.Op() {
        ...
        case ir.OFOR:
            edge()
        case ir.OIF:
            edge()
        case ir.ORANGE:
            edge()
        case ir.OSELECT:
            edge()
        case ir.OSWITCH:
            edge()
        ...
        }
    }

    The compiler adds edge() into each edge
  27. Compiler instrumentation
    // _counters and _ecounters mark the start and end, respectively, of where
    // the 8-bit coverage counters reside in memory. They're known to cmd/link,
    // which specially assigns their addresses for this purpose.
    var _counters, _ecounters [0]byte

    func coverage() []byte {
        addr := unsafe.Pointer(&_counters)
        size := uintptr(unsafe.Pointer(&_ecounters)) - uintptr(addr)

        var res []byte
        *(*unsafeheader.Slice)(unsafe.Pointer(&res)) = unsafeheader.Slice{
            Data: addr,
            Len:  int(size),
            Cap:  int(size),
        }
        return res
    }

    coverage() returns the coverage counters
  28. The mutators
    var byteSliceMutators = []byteSliceMutator{
        byteSliceRemoveBytes,
        byteSliceInsertRandomBytes,
        byteSliceDuplicateBytes,
        byteSliceOverwriteBytes,
        byteSliceBitFlip,
        byteSliceXORByte,
        byteSliceSwapByte,
        byteSliceOverwriteInterestingUint8,
        byteSliceOverwriteInterestingUint16,
        byteSliceOverwriteInterestingUint32,
        byteSliceInsertConstantBytes,
        byteSliceOverwriteConstantBytes,
        byteSliceShuffleBytes,
        byteSliceSwapBytes,
        ....
    }

    func (m *mutator) mutateBytes(ptrB *[]byte)
    func (m *mutator) mutateInt(v, maxValue int64) int64
    func (m *mutator) mutateUInt(v, maxValue uint64) uint64
    func (m *mutator) mutateFloat(v, maxValue float64) float64
  29. Summary
    fuzzing test
    the benefit of fuzzing
    go-fuzz project
    Go official fuzzing solution
    continuous fuzzing ????