Complex faults are coupled to simple faults in such a way that a test data set that detects all simple faults in a program will detect a high percentage of complex faults (Jia et al. 2011) (Mutation Testing)
The coupling effect was verified for combined faults with up to 3 simple faults (Offutt 1992)
• Model programs as composed of q independent functions with the same domain and range n
• A complex fault can be split into independent faults
• All faults have equal probability of occurring
(Wah 1995)
[Figure: the program composed of q independent functions; a single function with same domain and range; a fault]
• Fault masking probability = 1/n (where n is the size of the domain)
• Coupling ratio = 1 - fault masking probability = (n - 1)/n
• Fault masking becomes stronger as q approaches n (where q is the number of independent functions that compose the program)
• i.e. the coupling effect is unreliable in very large programs (Wah 1995)
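The two formulas above are easy to tabulate. The following is a minimal sketch that only evaluates 1/n and (n - 1)/n for a few domain sizes; the function names and the sample values of n are illustrative assumptions, and the dependence on q is not modelled here.

```python
from fractions import Fraction

def masking_probability(n):
    """Fault masking probability 1/n for a domain of size n."""
    return Fraction(1, n)

def coupling_ratio(n):
    """Coupling ratio 1 - 1/n = (n - 1)/n."""
    return 1 - masking_probability(n)

# Sample domain sizes chosen only for illustration.
for n in (2, 10, 256):
    print(n, masking_probability(n), coupling_ratio(n))
```

Exact fractions are used so the values match the closed form without rounding.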
Original

    def sum(lst):
        i, result = 1, 0
        while i <= len(lst):
            result += lst[i-1]
            i += 1
        return result

    assert sum([0, 1]) == 1
    assert sum([1, 0]) == 1

Faulty (first fault: the loop starts at the second element)

    def sum(lst):
        i, result = 2, 0
        while i <= len(lst):
            result += lst[i-1]
            i += 1
        return result

    assert sum([1, 0]) == 1  # fails: returns 0

Faulty (second fault: the loop stops before the last element)

    def sum(lst):
        i, result = 1, 0
        while i <= len(lst)-1:
            result += lst[i-1]
            i += 1
        return result

    assert sum([0, 1]) == 1  # fails: returns 0

Faulty (both faults combined)

    def sum(lst):
        i, result = 2, 0
        while i <= len(lst)-1:
            result += lst[i-1]
            i += 1
        return result

    assert sum([1, 0]) == 1  # fails: returns 0
    assert sum([0, 1]) == 1  # fails: returns 0
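The example can be checked mechanically. The sketch below assumes the two simple faults are the ones visible in the combined version (start index 2, and loop bound `len(lst)-1`), and records which test inputs detect each variant. The helper names `make_sum`, `detecting_tests`, and the `start`/`trim` parameters are illustrative, not part of the slides.

```python
def make_sum(start=1, trim=0):
    """Build a variant of the slide's sum; start=1, trim=0 is the original."""
    def sum_variant(lst):
        i, result = start, 0
        while i <= len(lst) - trim:
            result += lst[i - 1]
            i += 1
        return result
    return sum_variant

variants = {
    "original": make_sum(),
    "fault_1": make_sum(start=2),          # skips the first element
    "fault_2": make_sum(trim=1),           # skips the last element
    "combined": make_sum(start=2, trim=1), # both faults together
}

# (input, expected output) pairs taken from the asserts on the slide.
tests = [((1, 0), 1), ((0, 1), 1)]

def detecting_tests(func):
    """Return the inputs on which func disagrees with the expected output."""
    return {inp for inp, expected in tests if func(list(inp)) != expected}

detected = {name: detecting_tests(func) for name, func in variants.items()}

# Coupling holds here if every test that detects a simple fault
# also detects the combined fault.
simple = detected["fault_1"] | detected["fault_2"]
coupled = simple <= detected["combined"]
```

Each simple fault is detected by exactly one of the two tests, and both tests detect the combined fault, so `coupled` is true for this pair of faults.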
Original

    def sum(lst):
        i, result = 1, 0
        while i <= len(lst):
            result += lst[i-1]
            i += 1
        return result

    assert sum([1, 1]) == 2

Faulty (first fault: result starts at 1)

    def sum(lst):
        i, result = 1, 1
        while i <= len(lst):
            result += lst[i-1]
            i += 1
        return result

    assert sum([1, 1]) == 2  # fails: returns 3

Faulty (second fault: the loop starts at the second element)

    def sum(lst):
        i, result = 2, 0
        while i <= len(lst):
            result += lst[i-1]
            i += 1
        return result

    assert sum([1, 1]) == 2  # fails: returns 1

Faulty (both faults combined)

    def sum(lst):
        i, result = 2, 1
        while i <= len(lst):
            result += lst[i-1]
            i += 1
        return result

    assert sum([1, 1]) == 2  # passes: the two faults mask each other
    assert sum([0, 1]) == 1  # fails: returns 2
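To see how often the two faults mask each other, one can enumerate a small input space. This is a sketch under stated assumptions: the combined fault is the version with `i, result = 2, 1`, inputs are all two-element lists over the values 0, 1, 2, and the names `sum_original` and `sum_combined` are illustrative.

```python
from itertools import product

def sum_original(lst):
    i, result = 1, 0
    while i <= len(lst):
        result += lst[i - 1]
        i += 1
    return result

def sum_combined(lst):
    # Both faults: start at the second element, and start result at 1.
    i, result = 2, 1
    while i <= len(lst):
        result += lst[i - 1]
        i += 1
    return result

# All two-element lists over {0, 1, 2}; the value range is an arbitrary choice.
inputs = [list(p) for p in product(range(3), repeat=2)]

# Inputs on which the combined fault is invisible (masked) or visible (detecting).
masked = [lst for lst in inputs if sum_combined(lst) == sum_original(lst)]
detecting = [lst for lst in inputs if sum_combined(lst) != sum_original(lst)]
```

Masking happens only when the skipped first element equals the wrong initial value 1, so most inputs in this small space still detect the combined fault.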
Original

    def sum(lst):
        i, result = 0, 0
        while i < len(lst):
            result += lst[i-1]  # lst[-1] is the last element, so each element is still added once
            i += 1
        return result

Equivalents

    def sum(lst):
        i, result = 0, 0
        while i < len(lst):
            i += 1
            result += lst[i-1]
        return result

    def sum(lst):
        i, result = 0, 0
        while i < len(lst):
            result += lst[i]
            i += 1
        return result

Faulty

    def sum(lst):
        i, result = 0, 0
        while i < len(lst):
            i += 1
            result += lst[i]
        return result

    assert sum([0, 1]) == 1  # error: IndexError

These faults can't be separated into independent faults
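The claim that each change alone is equivalent while the two together are faulty can be checked by brute force on small inputs. A minimal sketch, assuming all four variants start from `i, result = 0, 0`; the function names and the input range (lists of length 0 to 3 over the values 0, 1, 2) are illustrative choices.

```python
from itertools import product

def sum_original(lst):
    i, result = 0, 0
    while i < len(lst):
        result += lst[i - 1]  # index -1 wraps to the last element
        i += 1
    return result

def sum_increment_first(lst):
    i, result = 0, 0
    while i < len(lst):
        i += 1
        result += lst[i - 1]
    return result

def sum_index_i(lst):
    i, result = 0, 0
    while i < len(lst):
        result += lst[i]
        i += 1
    return result

def sum_both(lst):
    # Both changes together: increment first AND index with i.
    i, result = 0, 0
    while i < len(lst):
        i += 1
        result += lst[i]
    return result

inputs = [list(p) for size in range(4) for p in product(range(3), repeat=size)]

def same_as_original(func):
    """True if func agrees with sum_original on every enumerated input."""
    return all(func(lst) == sum_original(lst) for lst in inputs)

def raises_index_error(func, lst):
    """True if calling func(lst) raises IndexError."""
    try:
        func(lst)
    except IndexError:
        return True
    return False
```

On these inputs each single change behaves like the original, while the combined version raises `IndexError` on every non-empty list.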
• Model programs as composed of q independent functions
  • Caveat: Ignores recursion and iteration
• A complex fault can be split into independent faults
  • Caveat: Ignores strongly interacting faults
• The functions have the same domain and range n
  • Caveat: Real world is often more complex
• All faults have equal probability of occurring
  • Caveat: Ignores the syntactic neighborhood
Model based on fault propagation on function pairs
• Only models composite faults (where faults can be split)
• Functions have different domain and co-domain
• All faults have equal probability of occurring
• But also considers syntactic neighborhood
[Figure: a program with a composite fault; faults in independent functions with different domain and co-domain]
m is the size of the domain and n is the size of the co-domain
• $n^m$ unique alternatives for h
• $n^{m-1}$ alternatives of h' can correct g'
• Composite coupling ratio $= 1 - \frac{n^{m-1}}{n^m} = \frac{n-1}{n}$ (i.e. fault masking probability $= 1/n$)
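The counting argument can be checked by brute force for small sizes. The sketch below assumes a simplified reading: on one input the correct g yields value `a`, the faulty g' yields `b`, and a candidate h' masks the fault when h'(b) equals h(a). The name `masking_count`, the choice of reference h, and the values of `a` and `b` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def masking_count(m, n, a=0, b=1):
    """Count all candidate h' and those that mask the fault in g.

    A function from a domain of size m to a co-domain of size n is
    represented as a tuple of m outputs, each in range(n).
    """
    # Arbitrary reference function h (an assumption of this sketch).
    h = tuple(x % n for x in range(m))
    # Every tuple of length m over range(n) is one candidate h'.
    candidates = list(product(range(n), repeat=m))
    # h' masks the fault on this input when h'(b) equals h(a).
    masking = [hp for hp in candidates if hp[b] == h[a]]
    return len(candidates), len(masking)

total, masking = masking_count(m=3, n=4)
coupling_ratio = 1 - Fraction(masking, total)
```

For m = 3 and n = 4 the enumeration gives 64 candidates, 16 of which mask the fault, which agrees with the closed form (n - 1)/n = 3/4.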
With premature exits (e.g. crashes), there is a smaller probability of masking than 1/n
Premature exits for x% of inputs:
$x + \frac{ny - 1}{ny} \ge \frac{n - 1}{n}$, where $y = x - 1$
• Probability of fault masking is 1/n, where n is the size of the co-domain for the program considered
• The probability of fault masking does not increase with increase in program size or length of execution for the common patterns we studied.
• The probability of fault masking does not increase even when the effects of syntactical neighborhood are considered.
The composite coupling ratio: Given two faults and their combined fault, what percentage of test cases detecting them in isolation will detect the combined fault?
The general coupling ratio: Given two faults and their combined fault, what is the ratio between the number of test cases detecting them in isolation vs tests detecting the combined fault?
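Both ratios can be written over sets of detecting test cases. This is a sketch of one possible reading, not a definitive definition: "detecting them in isolation" is taken as the union of the two faults' detecting tests, and the general ratio is taken in the order stated (isolation over combined). The function names and test identifiers are hypothetical.

```python
from fractions import Fraction

def composite_coupling_ratio(detect_a, detect_b, detect_combined):
    """Share of isolation-detecting tests that also detect the combined fault."""
    isolation = detect_a | detect_b  # assumption: union of both faults' tests
    return Fraction(len(isolation & detect_combined), len(isolation))

def general_coupling_ratio(detect_a, detect_b, detect_combined):
    """Number of isolation-detecting tests relative to combined-detecting tests."""
    isolation = detect_a | detect_b  # assumption: union of both faults' tests
    return Fraction(len(isolation), len(detect_combined))

# Hypothetical test identifiers, for illustration only.
detect_a = {"t1", "t2"}
detect_b = {"t2", "t3"}
detect_combined = {"t1", "t2", "t4"}

composite = composite_coupling_ratio(detect_a, detect_b, detect_combined)
general = general_coupling_ratio(detect_a, detect_b, detect_combined)
```

The composite ratio only looks at tests that already detected a constituent fault, while the general ratio also reflects tests such as `t4` that detect only the combination. Both functions raise `ZeroDivisionError` when the denominator set is empty.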
• Size: 1.4 KLOC to 84 KLOC
• Collected commits
• Generated reverse patches that can be independently applied from bug-fixes without compilation errors
• Collected test cases failing each patch
• Combined pairs of patches together
• Collected test cases failing combined patches
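The pairing and counting steps of this procedure can be sketched as follows. Everything here is a stand-in: `FAILING` is hypothetical data, and `failing_tests` replaces the real step of applying reverse patches and running the test suite, which the slides do not detail.

```python
from itertools import combinations

# Hypothetical data: tests that fail when a given set of reverse patches is applied.
FAILING = {
    frozenset({"p1"}): {"t1", "t2"},
    frozenset({"p2"}): {"t3"},
    frozenset({"p3"}): {"t2", "t4"},
    frozenset({"p1", "p2"}): {"t1", "t2", "t3"},
    frozenset({"p1", "p3"}): {"t1", "t4"},
    frozenset({"p2", "p3"}): {"t2", "t3", "t4"},
}

def failing_tests(patches):
    """Stand-in for applying the patches and running the test suite."""
    return FAILING[frozenset(patches)]

def still_detecting(patch_ids):
    """Over all patch pairs, count isolation-detecting tests and how many still fail."""
    total = kept = 0
    for a, b in combinations(sorted(patch_ids), 2):
        isolation = failing_tests([a]) | failing_tests([b])
        combined = failing_tests([a, b])
        total += len(isolation)
        kept += len(isolation & combined)
    return kept, total

kept, total = still_detecting(["p1", "p2", "p3"])
```

With this toy data, 8 of the 9 isolation-detecting test cases keep failing once the patches are paired, so `kept / total` plays the role of the measured ratio.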
Composite fault hypothesis: Tests detecting a fault in isolation will (with probability ~99%) continue to detect the fault even when it is combined with other faults.
faults
• Probability of fault masking is 1/n, where n is the size of the co-domain for the program considered
• The probability of fault masking does not increase with increase in program size or length of execution for the common patterns we studied.
• Empirical evaluation of the composite fault ratio
  • Between 98.8% and 99.1% of test cases detecting a fault in isolation can be expected to detect a composite fault including that fault. (95% confidence level, p < 0.0001)