Aviso at WoDET 2012

A talk I gave at WoDET on Aviso, a new technique and system for avoiding schedule-dependent failures in concurrent programs.

Brandon Lucia

May 28, 2012

Transcript

  1. Aviso: Empirical Automatic Failure Avoidance. Brandon Lucia, Luis Ceze. University of Washington, Department of Computer Science and Engineering. Workshop on Determinism (WoDET) 2012. Monday, May 28, 2012
  2. Concurrency Bugs. Multithreaded program with threads T1, T2, T3. Caused by unforeseen thread interactions. Hard for programmers to understand and fix.
  3. Duality. Determinism (& Memoization) [dmp,coredet,grace,kendo,dpj,psets,tern]: prohibit all but observed good schedules. Aviso: permit all but observed bad schedules.
  4. A Concurrency Error. Initially, p points to a valid object. Thread 1: bool valid = false; Lock(L); if(p != NULL) valid = true; Unlock(L); if( valid ){ Lock(L); p->doStuff(); Unlock(L);} Thread 2: Lock(L); p = NULL; Unlock(L); Bug: Thread 1 dereferences NULL. Thread 2 wrote NULL to p.
  5. Preventing the Failure: Communication View. Thread 1: bool valid = false; Lock(L); if(p != NULL) valid = true; Unlock(L); if( valid ){ Lock(L); p->doStuff(); Unlock(L);} Thread 2: Lock(L); p = NULL; Unlock(L); Don’t allow p = NULL to communicate to p->doStuff().
  6. Preventing the Failure: Schedule View. Thread 1: bool valid = false; Lock(L); if(p != NULL) valid = true; Unlock(L); if( valid ){ Lock(L); p->doStuff(); Unlock(L);} Thread 2: Lock(L); p = NULL; Unlock(L); Schedule constraint: don’t allow p = NULL between if(p != NULL) and p->doStuff().
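
The schedule view on slides 4 to 6 can be checked by replaying interleavings by hand. The following is a minimal sketch, not code from the talk: `Step` and `dereferencesNull` are hypothetical names, and the pointer is modeled as a boolean so that each schedule runs deterministically in a single thread.

```cpp
#include <cassert>
#include <vector>

// The three critical sections from the slide, treated as schedulable steps.
// Check and Use belong to Thread 1; Nullify belongs to Thread 2.
enum class Step { Check, Nullify, Use };

// Replays one interleaving and reports whether Thread 1 would dereference NULL.
bool dereferencesNull(const std::vector<Step>& schedule) {
    bool pIsNull = false;  // initially, p points to a valid object
    bool valid = false;
    bool checked = false;
    for (Step s : schedule) {
        switch (s) {
        case Step::Check:  // if(p != NULL) valid = true;
            checked = true;
            if (!pIsNull) valid = true;
            break;
        case Step::Nullify:  // p = NULL;
            pIsNull = true;
            break;
        case Step::Use:  // if(valid) p->doStuff();
            assert(checked);  // program order: Thread 1 checks before it uses
            if (valid && pIsNull) return true;
            break;
        }
    }
    return false;
}
```

Under this model only the order Check, Nullify, Use fails, which is exactly the ordering the schedule constraint on slide 6 rules out.
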
  7. Hypothesis: A sequence of events that preceded a failure might be its cause. Prevent the sequence, prevent the failure.
  8. System Overview. Monitor events and failures; generate candidate event sequences; generate constraints preventing sequences; vet constraints to find failure avoiders; distribute effective constraints.
  9. Monitoring Events. Events from Thread 1, Thread 2, and Thread 3 are laid out over time in an Event History, starting from the oldest event. Dump the history when a failure occurs.
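
Slide 9's history can be pictured as a bounded buffer that keeps the most recent events and is copied out when a failure is detected. The sketch below assumes a fixed capacity and a single writer. `Event`, `EventHistory`, `record`, and `dump` are hypothetical names, and the slides do not say how the real system stores or bounds its history.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// One monitored event: which thread ran it and a label for the operation.
struct Event {
    int thread;
    std::string what;
};

// Keeps only the most recent `capacity` events, oldest first.
class EventHistory {
public:
    explicit EventHistory(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Appends an event, evicting the oldest one when the buffer is full.
    void record(int thread, const std::string& what) {
        if (events_.size() == capacity_) events_.pop_front();
        events_.push_back(Event{thread, what});
    }

    // Called when a failure is observed: returns a copy, oldest event first.
    std::vector<Event> dump() const {
        return std::vector<Event>(events_.begin(), events_.end());
    }

private:
    std::size_t capacity_;
    std::deque<Event> events_;
};
```

A real monitor would also need synchronization or per-thread buffers, which this single-threaded sketch leaves out.
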
  10. Which Events? Synchronization: Lock(L); Unlock(L). Shared-memory accesses: if(p != NULL). Find shared memory events with profiling.
  11. if(p != NULL) Pruning Sharing Events p->doStuff() CFG Domination shared_A = 10; shared_B = 20 if(p != NULL) p->doStuff(); <10μs Online Pruning
  12. Which Sequences? Event Triples: if(p != NULL); p = NULL; p->doStuff(). Event Pairs: q->use(); delete q. Pairs are general. Many bugs triggered by few pair orderings [burckhardt10].
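
Candidate pair generation from a dumped history can be sketched as follows. This is an illustration under stated assumptions, not the talk's algorithm: it pairs each event with later events from a different thread that lie at most `window` positions away, and `Event`, `candidatePairs`, and `window` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Event {
    int thread;
    std::string what;
};

using EventPair = std::pair<std::string, std::string>;

// Returns the distinct (earlier, later) pairs of events that were executed by
// different threads and are at most `window` positions apart in the history.
std::set<EventPair> candidatePairs(const std::vector<Event>& history,
                                   std::size_t window) {
    assert(window > 0);
    std::set<EventPair> pairs;
    for (std::size_t i = 0; i < history.size(); ++i) {
        for (std::size_t j = i + 1; j < history.size() && j - i <= window; ++j) {
            if (history[i].thread != history[j].thread) {
                pairs.insert(EventPair(history[i].what, history[j].what));
            }
        }
    }
    return pairs;
}
```

For the three-event failing history from slide 6, this produces the pairs (if(p != NULL), p = NULL) and (p = NULL, p->doStuff()). The check and the use are not paired because the same thread ran both.
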
  13. Schedule Constraints. States represent progress through sequence. Events trigger transitions. A schedule constraint prevents its sequence. Aviso delays the event to enforce the constraint.
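
A pair constraint of the kind slide 13 describes can be sketched as a two-state machine. The details here are assumptions made for illustration: the first event activates the constraint, the second event is delayed only when another thread attempts it, and deactivation is an explicit call because the slides do not state when the real system deactivates a constraint. `PairConstraint`, `Action`, `onEvent`, and `deactivate` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

enum class Action { Proceed, Delay };

// State machine for one (first, second) event pair.
class PairConstraint {
public:
    PairConstraint(std::string first, std::string second)
        : first_(std::move(first)), second_(std::move(second)) {
        assert(first_ != second_);
    }

    // Decides what the runtime should do with `event` on `thread`.
    Action onEvent(int thread, const std::string& event) {
        if (event == first_) {
            // Transition: the sequence has started on this thread.
            active_ = true;
            owner_ = thread;
            return Action::Proceed;
        }
        if (active_ && event == second_ && thread != owner_) {
            // Letting this event run now would complete the bad sequence.
            return Action::Delay;
        }
        return Action::Proceed;
    }

    void deactivate() { active_ = false; }
    bool active() const { return active_; }

private:
    std::string first_;
    std::string second_;
    bool active_ = false;
    int owner_ = -1;
};
```

With `first` set to `if(p != NULL)` and `second` set to `p = NULL`, Thread 2's write is delayed once Thread 1 has run its check, and proceeds again after the constraint is deactivated.
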
  14. Early-Crash Failures. Diagram labels: events A, B, C; "Should execute first"; "Should execute second". A Bummer: Failing runs don’t see B. Use A to infer B is coming. Delay until threshold, hope for the best.
  15. [Animation, slides 15 to 43] Panels on every frame: Thread 1, Thread 2, Thread 3; Available Constraints; Active Constraints; Event History. Prompts in frame order: Activate? Yes! (slides 16-17); Activate? No! (19-20); Delay? Yes! (21-23); Activate? No! (25-26); Delay? No! (27-29); Activate? No! (34-35); Deactivate! (36-37); Delay? No! (40-42). The remaining frames carry no prompt.
  44. [Slides 44 and 45, identical text] Vetting Constraints. Seeking largest increase in time between failures.
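
The vetting criterion, largest increase in time between failures, can be sketched as a comparison against a baseline run without constraints. The sketch is an assumption about how such a score could be computed, not the talk's procedure. `Trial`, `timeBetweenFailures`, and `bestConstraint` are hypothetical names, and a trial with no failures is treated as having an unbounded time between failures.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

// Observations from running the program with one candidate constraint.
struct Trial {
    std::string constraint;
    double runSeconds;
    int failures;
};

// Mean time between failures; infinite when no failure was observed.
double timeBetweenFailures(const Trial& t) {
    assert(t.runSeconds >= 0.0 && t.failures >= 0);
    if (t.failures == 0) return std::numeric_limits<double>::infinity();
    return t.runSeconds / t.failures;
}

// Returns the constraint with the largest increase over the baseline, or an
// empty string when no candidate improves on it. Ties keep the earlier trial.
std::string bestConstraint(const Trial& baseline,
                           const std::vector<Trial>& trials) {
    const double base = timeBetweenFailures(baseline);
    std::string best;
    double bestGain = 0.0;
    for (const Trial& t : trials) {
        const double gain = timeBetweenFailures(t) - base;
        if (gain > bestGain) {
            bestGain = gain;
            best = t.constraint;
        }
    }
    return best;
}
```

For example, against a baseline of 10 failures in 100 seconds, a candidate with 2 failures in 100 seconds is preferred over one with 5 failures in the same time.
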
  46. Failure Rate Reduction. [Chart: Failure Rate Reduction (x), axis 0 to 40, for memcached, Transmission, PBZip2, AGet, Apache-1, Apache-2; two values are labeled >1000.] Prevents all failures in our experiments. Constraints are composable.
  47. Performance Overhead. [Chart: Runtime Overhead (%), axis 0 to 20, for Transmission, AGet, PBZip2, Apache-1, Apache-2; series: Collection Only, Avoidance; two values are labeled 39 and 21.] Event collection has low overhead in these apps. Composed constraints’ overheads remain reasonable.
  48. Going Forward. Do Early-Crash Failures the “Right Way”. Eliminating the Runtime: Compile Constraints In. Composability: Constraint interference? Formalizing Our Correctness Guarantees.
  49. Aviso: Empirical Automatic Failure Avoidance. Brandon Lucia, Luis Ceze. University of Washington, Department of Computer Science and Engineering.