Aviso at WoDET 2012

A talk I gave at WoDET on Aviso, a new technique and system for avoiding schedule-dependent failures in concurrent programs.

Brandon Lucia

May 28, 2012

Transcript

  1. Aviso: Empirical Automatic Failure Avoidance. Brandon Lucia, Luis Ceze, University of Washington, Department of Computer Science and Engineering.
    Workshop on Determinism (WoDET) 2012, Monday, May 28, 2012
  2. Concurrency Bugs. [Figure labels: Multithreaded Program, T1, T2, T3]
    Caused by unforeseen thread interactions. Hard for programmers to understand and fix.
  3-5. Determinism to the Rescue! [Figure built up over three frames; final frame label: Input A]
  6-7. Determinism to the Rescue? [Figure built up over two frames; labels: Input B, !]
  8. Testing to the Rescue? [Figure labels: Input Z, Input A]
  9-10. Aviso: An Empirical [Figure built up over two frames; labels: Aviso, !]

  11-12. An Empirical Approach [Figure labels: !, !]

  13. Duality
    Determinism (& Memoization) [dmp, coredet, grace, kendo, dpj, psets, tern]: Prohibit all but observed good schedules.
    Aviso: Permit all but observed bad schedules.
  14. A Concurrency Error. Initially, p points to a valid object.
    Thread 1: bool valid = false; Lock(L); if(p != NULL) valid = true; Unlock(L); if( valid ){ Lock(L); p->doStuff(); Unlock(L);}
    Thread 2: Lock(L); p = NULL; Unlock(L);
    Bug: Thread 1 dereferences NULL. Thread 2 wrote NULL to p.
  15. Preventing the Failure: Communication View
    Thread 1: bool valid = false; Lock(L); if(p != NULL) valid = true; Unlock(L); if( valid ){ Lock(L); p->doStuff(); Unlock(L);}
    Thread 2: Lock(L); p = NULL; Unlock(L);
    Don’t allow p = NULL to communicate to p->doStuff().
  16. Preventing the Failure: Schedule View
    Thread 1: bool valid = false; Lock(L); if(p != NULL) valid = true; Unlock(L); if( valid ){ Lock(L); p->doStuff(); Unlock(L);}
    Thread 2: Lock(L); p = NULL; Unlock(L);
    Schedule constraint: Don’t allow p = NULL between if(p != NULL) and p->doStuff().
  17. Goal: Automatically find and generate schedule constraints to prevent failures.
  18. Hypothesis: A sequence of events that preceded a failure might be its cause. Prevent the sequence, prevent the failure.
  19. System Overview: Monitor events and failures; generate candidate event sequences; generate constraints preventing sequences; vet constraints to find failure avoiders; distribute effective constraints.
  20. Monitoring Events. [Figure labels: Time, Thread 1, Thread 2, Thread 3, Event History, oldest event]
    Dump the history when a failure occurs.
  21. Which Events?
    Synchronization: Lock(L); Unlock(L);
    Shared-memory accesses: if(p != NULL)
    Find shared memory events with profiling.
  22. Pruning Sharing Events
    CFG Domination: if(p != NULL), p->doStuff()
    Online Pruning (<10μs): shared_A = 10; shared_B = 20; if(p != NULL) p->doStuff();
  23. Candidate Failure Sequences. [Figure labels: Failing Event History, Candidate sequences]
  24. Which Sequences?
    Event Triples: if(p != NULL); p = NULL; p->doStuff();
    Event Pairs: q->use(); delete q
    Pairs are general. Many bugs triggered by few pair orderings [burckhardt10].
  25. Schedule Constraints. [State-machine figure; markers: ✔, ✖]
    A schedule constraint prevents its sequence. States represent progress through sequence. Events trigger transitions. Aviso delays the event to enforce the constraint.
  26. Early-Crash Failures. [Figure labels: A, B, C, Should execute first, Should execute second, ✖]
    A Bummer: Failing runs don’t see B. Use A to infer B is coming. Delay until threshold, hope for the best.
  27-55. [Animation over 29 frames. Fixed labels: Event History for Thread 1, Thread 2, Thread 3; Available Constraints; Active Constraints.]
    Frame callouts, in order: slides 28-29: Activate? Yes!; slides 31-35: Activate? No! Delay? Yes! Yes!; slides 37-41: Activate? No! Delay? No! No!; slides 46-49: Activate? No! Deactivate! Deactivate!; slides 52-54: Delay? No! No!
  56-57. Vetting Constraints. [Figure markers: ✔ ✖ ✖ ✔ ✔ ✖ ✖ ✔ ✔ ✖ ✖ ✔]
    Seeking largest increase in time between failures.
  58. Sharing constraints shares learned avoidance. [Figure markers: ✔, ✖]
  59. Failure Rate Reduction. [Bar chart: y-axis Failure Rate Reduction (x), ticks 0 to 40, two bars labeled >1000; applications: memcached, Transmission, PBZip2, AGet, Apache-1, Apache-2]
    Prevents all failures in our experiments. Constraints are composable.
  60. Performance Overhead. [Bar chart: y-axis Runtime Overhead (%), ticks 0 to 20, bar labels 39 and 21; series: Collection Only, Avoidance; applications: Transmission, AGet, PBZip2, Apache-1, Apache-2]
    Event collection has low overhead in these apps. Composed constraints’ overheads remain reasonable.
  61. Going Forward
    Do Early-Crash Failures the “Right Way”
    Eliminating the Runtime: Compile Constraints In
    Composability: Constraint interference?
    Formalizing Our Correctness Guarantees
  62. Aviso: Empirical Automatic Failure Avoidance. Brandon Lucia, Luis Ceze, University of Washington, Department of Computer Science and Engineering.
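
Slides 16 and 25 describe a schedule constraint as a small state machine whose transitions are triggered by events, with the runtime delaying an event to enforce the constraint. The following is a minimal single-threaded sketch of that idea, not Aviso's implementation. The names `Event`, `ScheduleConstraint`, `shouldDelay`, and `replayFailingSchedule` are hypothetical, and closing the window at `p->doStuff()` is an assumption taken from the slide 16 wording.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not Aviso's actual interface.
struct Event {
    int thread;        // id of the thread about to execute the event
    std::string site;  // label of the program point
};

// A window-style schedule constraint: once some thread executes `open`,
// other threads are held back at `forbidden` until that same thread
// executes `close`. The close event is an assumption of this sketch.
class ScheduleConstraint {
public:
    ScheduleConstraint(std::string open, std::string forbidden, std::string close)
        : open_(std::move(open)),
          forbidden_(std::move(forbidden)),
          close_(std::move(close)) {
        assert(!open_.empty() && !forbidden_.empty() && !close_.empty());
    }

    // Returns true if the runtime should delay `e`; otherwise lets the
    // event proceed and advances the state machine.
    bool shouldDelay(const Event& e) {
        if (active_ && e.thread != owner_ && e.site == forbidden_) {
            return true;  // forbidden event inside the window: delay it
        }
        if (!active_ && e.site == open_) {
            active_ = true;  // window opens
            owner_ = e.thread;
        } else if (active_ && e.thread == owner_ && e.site == close_) {
            active_ = false;  // window closes
        }
        return false;
    }

    bool active() const { return active_; }

private:
    std::string open_, forbidden_, close_;
    bool active_ = false;
    int owner_ = -1;
};

// Replays the failing interleaving from slide 14 and records, per event,
// whether the constraint asks for a delay.
std::vector<bool> replayFailingSchedule() {
    ScheduleConstraint c("if(p != NULL)", "p = NULL", "p->doStuff()");
    std::vector<bool> delayed;
    delayed.push_back(c.shouldDelay({1, "if(p != NULL)"}));  // Thread 1 checks p
    delayed.push_back(c.shouldDelay({2, "p = NULL"}));       // Thread 2 arrives early
    delayed.push_back(c.shouldDelay({1, "p->doStuff()"}));   // Thread 1 uses p
    delayed.push_back(c.shouldDelay({2, "p = NULL"}));       // Thread 2 retries
    return delayed;
}
```

In this replay only Thread 2's first attempt at `p = NULL` is held back; once Thread 1 has passed `p->doStuff()` the retry goes through. A real runtime would block or sleep the delayed thread rather than return a flag, and, as slide 26 notes for early-crash failures, would bound the delay with a threshold.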
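
Slides 23 and 24 describe turning a failing event history into candidate event pairs. One way this could look is sketched below, under stated assumptions: `HistoryEvent`, `EventPair`, `candidatePairs`, and `examplePairs` are hypothetical names, and both the bounded `window` and the restriction to pairs from different threads are choices made for this sketch rather than details given in the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not Aviso's actual interface.
struct HistoryEvent {
    int thread;        // id of the thread that executed the event
    std::string site;  // label of the program point
};

using EventPair = std::pair<std::string, std::string>;

// Enumerates ordered pairs (earlier, later) of events from different
// threads that lie at most `window` positions apart in the history.
// The window and the cross-thread filter are assumptions of this sketch.
std::set<EventPair> candidatePairs(const std::vector<HistoryEvent>& history,
                                   std::size_t window) {
    assert(window > 0);
    std::set<EventPair> pairs;
    for (std::size_t i = 0; i < history.size(); ++i) {
        for (std::size_t j = i + 1; j < history.size() && j - i <= window; ++j) {
            if (history[i].thread != history[j].thread) {
                pairs.emplace(history[i].site, history[j].site);
            }
        }
    }
    return pairs;
}

// History ordered oldest first, using the events from slide 14.
std::set<EventPair> examplePairs() {
    const std::vector<HistoryEvent> history = {
        {1, "if(p != NULL)"},
        {2, "p = NULL"},
        {1, "p->doStuff()"},
    };
    return candidatePairs(history, 2);
}
```

For the slide 14 history this yields two candidates: `if(p != NULL)` followed by `p = NULL`, and `p = NULL` followed by `p->doStuff()`. Each candidate would then become a constraint to be vetted, as on slide 56.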