A talk I gave at ISCA 2010 on ColorSafe, a new technique and computer architecture for automatically avoiding failures due to complex concurrent programming errors.
We can use architecture support for precise bug detection. Reusing the same support for avoidance makes it useful for the system’s lifetime. Monday, May 28, 2012
Program: T1: ctr = ctr+1; T2: ctr = ctr+1; Initially: ctr = 0; Result: ctr = 2; This should be atomic, but it is not. Execution: each increment runs as t = ctr; t = t + 1; ctr = t; so the steps of T1 and T2 can interleave.
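A minimal sketch of the non-atomic increment above, assuming a hand-written schedule rather than real threads so the outcome is deterministic. The helper names (`increment_steps`, `run`) are made up for illustration and are not from the talk.

```python
def increment_steps(shared, name):
    """One thread's `ctr = ctr + 1`, pausing after each primitive step."""
    t = shared["ctr"]          # t = ctr
    yield f"{name}: t = ctr"
    t = t + 1                  # t = t + 1
    yield f"{name}: t = t + 1"
    shared["ctr"] = t          # ctr = t
    yield f"{name}: ctr = t"


def run(schedule):
    """Run two increments, advancing one step of the named thread at a time."""
    shared = {"ctr": 0}
    threads = {
        "T1": increment_steps(shared, "T1"),
        "T2": increment_steps(shared, "T2"),
    }
    for name in schedule:
        next(threads[name])
    return shared["ctr"]


# Each increment runs to completion before the other starts.
serial = run(["T1", "T1", "T1", "T2", "T2", "T2"])
# Both threads read ctr before either writes it back, so one update is lost.
interleaved = run(["T1", "T2", "T1", "T2", "T1", "T2"])
print(serial, interleaved)
```

The serial schedule reaches the intended ctr = 2, while the interleaved schedule loses an update.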
Program. Initially: str = “”, len = 0. T1: str = “BUGS”; len = 4. T2: s = str; l = len. Result: s = BUGS and l = 4, or s = “” and l = 0.
Program. Initially: str = “”, len = 0. T1: str = “BUGS”; len = 4. T2: s = str; l = len. Result: s = BUGS and l = 4, or s = “” and l = 0. But also possible: s = BUGS and l = 0.
Program. Initially: str = “”, len = 0. T1: str = “BUGS”; len = 4. T2: s = str; l = len. Result: s = BUGS and l = 0. Missing atomicity constraint. But single-variable access interleavings are serializable! We need a new analysis to find these bugs.
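To see where the inconsistent result comes from, here is a small sketch that enumerates every interleaving of the two threads. It assumes T1 is the writer and T2 the reader, and that each access is an atomic step. The tuple encoding is invented for this example.

```python
from itertools import combinations

# (variable, kind, argument): writes store a value, reads load into a register.
T1 = [("str", "write", "BUGS"), ("len", "write", 4)]
T2 = [("str", "read", "s"), ("len", "read", "l")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def outcomes():
    """Collect the (s, l) pair observed by T2 under each interleaving."""
    results = set()
    for schedule in interleavings(T1, T2):
        mem = {"str": "", "len": 0}
        regs = {}
        for var, kind, arg in schedule:
            if kind == "write":
                mem[var] = arg
            else:
                regs[arg] = mem[var]
        results.add((regs["s"], regs["l"]))
    return results


results = outcomes()
print(sorted(results, key=str))
```

The set includes the consistent pairs as well as ("BUGS", 0), the outcome that only appears when T2's two reads land between T1's two writes.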
Key Idea: Assign “colors” to related data and analyze serializability of accesses to colors. str and len are related, so they both map to blue.
In the “color-space”, these accesses are unserializable. ColorSafe’s analysis finds multi-variable atomicity violations.
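A rough sketch of the difference between analyzing variables and analyzing colors. It assumes the four (local, remote, local) patterns commonly treated as unserializable in single-variable atomicity-violation detection. The trace format and function name are hypothetical, and this is a software illustration, not the hardware mechanism.

```python
# Assumed unserializable (local, remote, local) access patterns.
UNSERIALIZABLE = {
    ("R", "W", "R"), ("W", "W", "R"),
    ("W", "R", "W"), ("R", "W", "W"),
}


def find_violations(trace, key):
    """Return (thread, unit, pattern) for each unserializable interleaving.

    trace: list of (thread, op, variable), in global order.
    key:   maps a variable to the unit analyzed (itself, or its color).
    """
    violations = []
    last = {}  # (thread, unit) -> index of that thread's previous access
    for i, (thread, op, var) in enumerate(trace):
        unit = key(var)
        prev = last.get((thread, unit))
        if prev is not None:
            # Look for a remote access to the same unit in between.
            for r_thread, r_op, r_var in trace[prev + 1:i]:
                if r_thread != thread and key(r_var) == unit:
                    pattern = (trace[prev][1], r_op, op)
                    if pattern in UNSERIALIZABLE:
                        violations.append((thread, unit, pattern))
                        break
        last[(thread, unit)] = i
    return violations


COLORS = {"str": "blue", "len": "blue"}
trace = [
    ("T1", "W", "str"), ("T2", "R", "str"),
    ("T2", "R", "len"), ("T1", "W", "len"),
]

per_variable = find_violations(trace, key=lambda v: v)
per_color = find_violations(trace, key=COLORS.get)
print(per_variable, per_color)
```

Per variable, no thread touches the same location twice, so nothing is flagged. Per color, T1's two writes to blue surround T2's read of blue, which is a write/read/write pattern.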
and multi-variable bugs. Bug detection and avoidance using the same hardware architectural support. Debugging Mode vs. Deployment Mode: trading precision for proactive bug avoidance.
In-Memory Table maps memory addresses to color meta-data. Granularity of meta-data is determined by size of colored regions. ColorSafe needs general meta-data support, and can rely on mechanisms like MMP [ASPLOS ‘02] or Loki [OSDI ‘08]. Color Lookaside Buffer caches address-to-color translations.
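A software sketch of the table plus lookaside buffer idea. The region size, the number of buffer entries, and the LRU replacement policy are all assumptions made for illustration, as are the class and method names.

```python
from collections import OrderedDict

REGION_BYTES = 16  # assumed granularity of a colored region


class ColorTable:
    """In-memory map from region number to color (hypothetical layout)."""

    def __init__(self):
        self.regions = {}

    def set_color(self, addr, size, color):
        first = addr // REGION_BYTES
        last = (addr + size - 1) // REGION_BYTES
        for region in range(first, last + 1):
            self.regions[region] = color

    def lookup(self, addr):
        return self.regions.get(addr // REGION_BYTES)  # None means uncolored


class ColorLookasideBuffer:
    """Small cache in front of the table; LRU replacement is assumed."""

    def __init__(self, table, entries=4):
        self.table = table
        self.entries = entries
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def color_of(self, addr):
        region = addr // REGION_BYTES
        if region in self.cache:
            self.hits += 1
            self.cache.move_to_end(region)       # mark as most recently used
        else:
            self.misses += 1
            self.cache[region] = self.table.lookup(addr)
            if len(self.cache) > self.entries:
                self.cache.popitem(last=False)   # evict least recently used
        return self.cache[region]


table = ColorTable()
table.set_color(0x100, 32, color=1)
clb = ColorLookasideBuffer(table)
first = clb.color_of(0x104)   # miss: translation fetched from the table
second = clb.color_of(0x108)  # hit: same region, served from the buffer
print(first, second, clb.hits, clb.misses)
```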
access histories: Local (Read, Write) and Remote (Read, Write). Read and Write sets are maintained separately. History is quantized into epochs by hash encoding many accesses into a signature.
History Buffer. Signature File: FIFO of Bloom-filter-based hardware signatures. History Item: the set of all four signatures for an epoch.
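A sketch of the history buffer as a data structure, under stated assumptions: the signature width, hash count, epoch length, and buffer depth below are arbitrary, and SHA-256 stands in for whatever hash functions the hardware would use.

```python
from collections import deque
import hashlib

SIG_BITS = 64      # assumed signature width
NUM_HASHES = 2     # assumed number of hash functions
EPOCH_LEN = 4      # assumed number of accesses per epoch
HISTORY_DEPTH = 3  # assumed number of epochs retained


def signature_bits(color):
    """Bloom-filter bit mask for a single color."""
    mask = 0
    for k in range(NUM_HASHES):
        digest = hashlib.sha256(f"{k}:{color}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return mask


def may_contain(sig, color):
    """True if color may be in sig (false positives are possible)."""
    bits = signature_bits(color)
    return (sig & bits) == bits


class HistoryBuffer:
    KINDS = ("local_read", "local_write", "remote_read", "remote_write")

    def __init__(self):
        self.items = deque(maxlen=HISTORY_DEPTH)  # FIFO: oldest epoch drops out
        self._new_epoch()

    def _new_epoch(self):
        self.current = dict.fromkeys(self.KINDS, 0)
        self.count = 0

    def record(self, kind, color):
        """Hash one access into the current epoch's signature for its kind."""
        self.current[kind] |= signature_bits(color)
        self.count += 1
        if self.count == EPOCH_LEN:
            self.items.append(self.current)  # one history item = four signatures
            self._new_epoch()


hb = HistoryBuffer()
for color in ("blue", "blue", "red", "blue"):
    hb.record("local_write", color)
print(len(hb.items), may_contain(hb.items[0]["local_write"], "blue"))
```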
Processor 1’s History Buffer. Intersection is just bitwise AND of signatures. Checks performed at... Debugging mode: every instruction. Deployment mode: end of epochs. Figure: intersecting the Local Write, Remote Read, and Local Write signatures flags an unserializable access.
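One way the Local Write / Remote Read / Local Write check could look in software. This is a sketch under assumptions: signatures are plain integers with a single hash over integer color ids, and a remote read in the same epoch as the earlier local write is treated as possibly interleaved. Neither detail is taken from the talk.

```python
SIG_BITS = 64  # assumed signature width


def sig(*colors):
    """Single-hash signature over integer color ids (an assumed encoding)."""
    mask = 0
    for c in colors:
        mask |= 1 << (c % SIG_BITS)
    return mask


def check_write_read_write(history, current_local_write):
    """Return a signature of colors that may be involved (0 means none).

    history: list of epochs, oldest first; each epoch is a dict holding
             the four signatures for that epoch.
    """
    suspects = 0
    remote_reads_since = 0
    # Walk newest to oldest, accumulating the remote reads seen so far.
    for item in reversed(history):
        remote_reads_since |= item["remote_read"]
        # Intersection is a bitwise AND of the three signatures.
        suspects |= item["local_write"] & remote_reads_since & current_local_write
    return suspects


BLUE, RED = 3, 5
history = [
    {"local_read": 0, "local_write": sig(BLUE), "remote_read": 0, "remote_write": 0},
    {"local_read": 0, "local_write": 0, "remote_read": sig(BLUE), "remote_write": 0},
]
hit = check_write_read_write(history, sig(BLUE))   # blue written, read remotely, written again
miss = check_write_read_write(history, sig(RED))   # red was never in the history
print(bin(hit), miss)
```

Because signatures are lossy, a nonzero result here means "possibly unserializable", which fits the precision trade-off between the two modes.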
Hazard Color Set holds set of suspicious colors. On accesses to colors in the Hazard Color Set, a processor starts an Ephemeral Transaction. Ephemeral transactions prevent interleaving of subsequent instructions, preventing atomicity violations. Implemented as a signature.
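A toy model of the trigger logic only. It assumes an ephemeral transaction covers a fixed number of subsequent accesses (the talk does not state how transactions end here), and it uses a Python set as a stand-in for the signature. The `Processor` class and its parameters are hypothetical.

```python
class Processor:
    """Toy model: a hazard-color access opens a short ephemeral transaction."""

    def __init__(self, hazard_colors, txn_length=3):
        self.hazard_colors = set(hazard_colors)  # stand-in for a signature
        self.txn_length = txn_length             # assumed bound, in accesses
        self.remaining = 0                       # accesses left in the open txn

    def access(self, color):
        """Return True if this access runs inside an ephemeral transaction."""
        if self.remaining == 0 and color in self.hazard_colors:
            self.remaining = self.txn_length     # start a new transaction
        in_txn = self.remaining > 0
        if in_txn:
            self.remaining -= 1
        return in_txn


p = Processor(hazard_colors={"blue"})
trace = ["red", "blue", "green", "blue", "red"]
flags = [p.access(c) for c in trace]
print(list(zip(trace, flags)))
```

In this model the first access to blue opens a transaction that also covers the next two accesses, so the second blue access cannot be separated from the first by a remote access.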
on bug report + application knowledge: more precise; primarily useful for bug detection. Fully Automatic (e.g., at malloc calls): preemptively infers likely data relationships; less precise; primarily useful for bug avoidance.
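A sketch of what automatic coloring at allocation time might look like, assuming each allocation receives one fresh color that covers the whole block. The bump allocator and its names are invented for this example and do not wrap a real malloc.

```python
import itertools


class ColoringAllocator:
    """Toy bump allocator that gives every allocation a fresh color."""

    def __init__(self, base=0x1000):
        self.next_addr = base
        self.next_color = itertools.count(1)
        self.blocks = []  # (start, end, color)

    def malloc(self, size):
        start = self.next_addr
        self.next_addr += size
        self.blocks.append((start, start + size, next(self.next_color)))
        return start

    def color_of(self, addr):
        for start, end, color in self.blocks:
            if start <= addr < end:
                return color
        return None  # address was not allocated here


heap = ColoringAllocator()
record = heap.malloc(16)  # e.g. one object holding both a string and its length
other = heap.malloc(8)
print(heap.color_of(record), heap.color_of(record + 8), heap.color_of(other))
```

Fields of the same allocation share a color, which is how this scheme would guess that they are related without any programmer input.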
logic, ephemeral transactions. Experimented with manual and malloc coloring. ColorSafe simulator built using Pin. Used bug kernel benchmarks as well as full applications (AGet, Apache, MySQL).
[Figure: # of reports (w/ FPs) per benchmark (Apache, AGet, MySQL), Debugging Mode vs. Debugging Mode + Post Processing; one value is labeled 677.] Relatively few code locations reported to developers. Using simple invariant-based processing, dramatic reduction in false positives.
[Figure: percent of dynamic atomicity violations avoided. Bug kernels extracted from Mozilla: nsTextFrame, macNetDriver, jsStringLength, jsInterpreter, msgPane. Full applications: Apache, AGet, MySQL.]
Finds and avoids bugs in real software. Hardware support with precise or proactive detection. Bug avoidance justifies hardware support and enables us to cope with inevitably broken multi-threaded software.