ParaFuzz: Fuzzing Multicore OCaml Programs

ParaFuzz is a concurrency testing tool for Multicore OCaml programs that combines property-based testing with grey-box fuzzing applied to parallel programs.

KC Sivaramakrishnan

August 26, 2021

Transcript

  1. ParaFuzz: Fuzzing Multicore OCaml programs

    “KC” Sivaramakrishnan, joint work with Sumit Padhiyar and Adharsh Kamath
  7. Multicore OCaml

    • Adds native support for concurrency and parallelism to OCaml
    [Figure: timelines contrasting overlapped execution (A B A C B over time) with simultaneous execution (A, B, C over time); labels: Effect Handlers, Domains, "using", "Testing"]
  14. Testing Parallel Programs

    • The assertion can fail for a particular input and scheduling combination
    • Finding logic bugs takes more than detecting data races
      ✦ No data races here
    • Goal: an effective and pragmatic testing technique for the working OCaml programmer
    [Figure residue: code example annotated "happens before" (twice) and the value 10]
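
The kind of program this slide describes can be sketched as follows. This is a hypothetical example, not the code shown on the slide: the names `incr_racy` and `run` are made up for illustration, and it assumes OCaml 5's `Domain` and `Atomic` standard-library modules. Every shared access is atomic, so there is no data race, yet the intended property (final value 2) depends on the schedule.

```ocaml
(* Hypothetical example: each domain performs a read-modify-write in
   two separate atomic steps, so another domain can run in between. *)
let incr_racy (c : int Atomic.t) : unit =
  let v = Atomic.get c in   (* step 1: atomic read *)
  Atomic.set c (v + 1)      (* step 2: atomic write *)

(* Spawn two domains that both increment the same counter. *)
let run () : int =
  let c = Atomic.make 0 in
  let d1 = Domain.spawn (fun () -> incr_racy c) in
  let d2 = Domain.spawn (fun () -> incr_racy c) in
  Domain.join d1;
  Domain.join d2;
  Atomic.get c

let () =
  (* Intended property: the result is 2. The interleaving
     get, get, set, set loses one update and leaves 1 instead. *)
  Printf.printf "final counter = %d\n" (run ())
```

Running this repeatedly is the stress-testing approach: the losing interleaving may show up rarely or never, which is the motivation for controlling the schedule directly.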
  17. Existing solutions

    • Testing
      ✦ Stress testing: run the program over and over again and hope that the assertion is triggered
      ✦ Random testing: generate random inputs, and perturb the OS scheduler (somehow) to trigger bugs
    • Model checking: SPIN, TLC model checkers
      ✦ Strong guarantees, but not practical with a limited time budget
      ✦ Often works on a model of the program and not directly on the source code
    • Formal verification
      ✦ Requires expert knowledge and lots of time and effort
  21. Goal + Approach

    • Build upon effective pragmatic testing techniques
      ✦ Ignore concurrency for the moment: only input non-determinism
    • Property-based testing
      ✦ Use a generator to generate random inputs to test a function
      ✦ QuickCheck
    • Fuzzing
      ✦ Generate random inputs to crash a program
      ✦ AFL: extremely effective grey-box (coverage-guided) fuzzer
    • Crowbar = Fuzzing + QuickCheck
      ✦ Coverage-guided property fuzzing
      ✦ https://github.com/stedolan/crowbar
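
To make the property-based testing idea concrete, here is a minimal sketch that uses only the OCaml standard library. It is not Crowbar's or QuickCheck's API: the names `gen`, `small_int`, `list_of` and `check` are hypothetical, and the inputs come from a seeded pseudo-random generator rather than from a coverage-guided fuzzer such as AFL.

```ocaml
(* A generator draws a random value from a PRNG state. *)
type 'a gen = Random.State.t -> 'a

(* Integers in the range 0..999. *)
let small_int : int gen = fun st -> Random.State.int st 1000

(* Lists of up to 19 elements drawn from [g]. *)
let list_of (g : 'a gen) : 'a list gen = fun st ->
  let n = Random.State.int st 20 in
  List.init n (fun _ -> g st)

(* Test [prop] on [count] generated inputs.
   Returns the first counterexample found, or None. *)
let check ?(count = 1000) ?(seed = 42) (g : 'a gen) (prop : 'a -> bool)
  : 'a option =
  let st = Random.State.make [| seed |] in
  let rec loop i =
    if i >= count then None
    else
      let x = g st in
      if prop x then loop (i + 1) else Some x
  in
  loop 0

(* A property that holds: reversing twice is the identity. *)
let rev_rev (l : int list) : bool = List.rev (List.rev l) = l

(* A property that is false in general: reversing is the identity. *)
let rev_id (l : int list) : bool = List.rev l = l
```

With these definitions, `check (list_of small_int) rev_rev` finds no counterexample, while `check (list_of small_int) rev_id` returns a list that differs from its reverse. A coverage-guided tool replaces the pseudo-random source with fuzzer-supplied input so that the search is steered towards unexplored program paths.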
  23. ParaFuzz

    ParaFuzz = Crowbar (grey-box fuzzing + property-based testing) + Parallelism
    • How to control parallel thread scheduling?
      ✦ OS controls thread scheduling in parallel programs
      ✦ Need to force a buggy schedule
  24. Effect handlers

    • Mock the parallelism API using effect handlers
    • Effect handlers
      ✦ Modular and composable basis of non-local control-flow mechanisms
        ✤ Exceptions, async/await, lightweight threads, co-routines, etc.
      ✦ Structured programming with delimited continuations
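
As a small illustration of how a handler gives meaning to an operation, the sketch below assumes the `Effect` module of OCaml 5, which may differ from the syntax used in the talk. The effect `Ask` and the functions `computation` and `run_with` are hypothetical names chosen for this example.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect: ask the enclosing handler for an integer. *)
type _ Effect.t += Ask : int Effect.t

(* The computation performs the effect without knowing how it is
   implemented; the handler decides that. *)
let computation () : int =
  let x = perform Ask in
  let y = perform Ask in
  x + y

(* Handle every Ask by resuming the captured delimited continuation
   with the value [v]. *)
let run_with (v : int) : int =
  try_with computation ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Ask -> Some (fun (k : (a, _) continuation) -> continue k v)
        | _ -> None) }
```

Here `run_with 21` evaluates to 42. The same mechanism lets a test harness swap in its own implementation of an API by handling its operations differently, without changing the code under test.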
  27. Effect Handlers + Fuzzing

    • Simulate parallel thread scheduler using effect handlers
      ✦ OS thread scheduler → user-level thread scheduler
      ✦ Retain control over the scheduling decisions
    • Fuzzing the scheduler
      ✦ Yield at every synchronisation point
        ✤ Synchronisation point: a point where a context switch leads to non-determinism
      ✦ Use AFL to pick next thread to run from ready queue
    • Synchronisation points
      ✦ Domain (spawn, join), Atomic (get, set, CAS), Mutex (lock, unlock), Condition variable (wait, notify, broadcast)
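
The scheme on this slide can be sketched in a few dozen lines. This is an illustrative toy under stated assumptions, not ParaFuzz's implementation: it assumes OCaml 5's `Effect` module, models atomics as plain references on a single core, and takes scheduling decisions from a list of integers (`choices`) that stands in for fuzzer-supplied input. The names `Yield`, `get`, `set`, `run_schedule` and `final_counter` are hypothetical.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect marking a synchronisation point. *)
type _ Effect.t += Yield : unit Effect.t

(* Mocked atomic operations: each one yields to the scheduler first. *)
let get (r : int ref) : int = perform Yield; !r
let set (r : int ref) (v : int) : unit = perform Yield; r := v

(* Run [tasks] to completion on one core. Each element of [choices]
   selects which ready task runs next; once they run out, the task at
   the head of the ready queue is picked. *)
let run_schedule (choices : int list) (tasks : (unit -> unit) list) : unit =
  let ready : (unit -> unit) list ref = ref [] in
  let pending = ref choices in
  let next_choice n =
    match !pending with
    | [] -> 0
    | c :: rest -> pending := rest; ((c mod n) + n) mod n
  in
  let handler : (unit, unit) handler =
    { retc = (fun () -> ());
      exnc = (fun e -> raise e);
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Yield ->
          Some (fun (k : (a, unit) continuation) ->
            (* Park the task at the back of the ready queue. *)
            ready := !ready @ [ (fun () -> continue k ()) ])
        | _ -> None) }
  in
  ready := List.map (fun task () -> match_with task () handler) tasks;
  let rec schedule () =
    match List.length !ready with
    | 0 -> ()
    | n ->
      let i = next_choice n in
      let thunk = List.nth !ready i in
      ready := List.filteri (fun j _ -> j <> i) !ready;
      thunk ();
      schedule ()
  in
  schedule ()

(* Two tasks increment a shared counter in two steps (read, then write). *)
let final_counter (choices : int list) : int =
  let c = ref 0 in
  let incr_racy () = let v = get c in set c (v + 1) in
  run_schedule choices [ incr_racy; incr_racy ];
  !c
```

In this sketch the schedule `[]` interleaves the two reads before either write and loses an update, so `final_counter []` is 1, whereas `[0; 1; 1]` lets the first task finish before the second reads, giving 2. Because the outcome depends only on `choices`, replaying the same list reproduces the same execution, which is the property that makes record-and-replay deterministic.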
  29. ParaFuzz

    • System layout
    • Advantages
      ✦ Drop-in replacement for Multicore OCaml: separate compiler switch
      ✦ No false positives
      ✦ Deterministic record-and-replay
  32. Evaluation

    • Effectiveness: fraction of runs that found the bug
    • Efficiency: mean time to failure
  39. Future work: Data races

    • ParaFuzz currently assumes that the programs are data-race-free (DRF)
      ✦ DRF programs in OCaml have SC semantics
    • OCaml memory model (PLDI’18) also has a simple operational model for racy programs
      ✦ Racy reads may return one of a subset of writes performed to a non-atomic location
    • Extend ParaFuzz to racy programs
      ✦ Use AFL to pick the value that a read should return
      ✦ Force a yield at non-atomic reads and writes
    • Can we make it fast enough for pragmatic use?
  40. Summary

    • ParaFuzz
      ✦ Property-based fuzzing tool for parallel OCaml programs
      ✦ Easy to use: drop-in replacement for Multicore OCaml programs
      ✦ Effective and efficient at finding concurrency bugs
    • Future work: detecting bugs under data races
    https://github.com/ocaml-multicore/parafuzz