Slide 1

Slide 1 text

GENERATING COST-AWARE COVERING ARRAYS FOR FREE
Mustafa Kemal Tas, Hanefi Mercan, Gulsen Demiroz, Kamer Kaya, and Cemal Yilmaz
Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey
{mkemaltas | hanefimercan | gulsend | kaya | cyilmaz}@sabanciuniv.edu
TMPA 2017

Slide 2

Slide 2 text

Combinatorial Testing
A Motivating Example: MySQL
• A highly configurable system
  • 100+ configuration options
  • Dozens of OS, compiler, and platform combinations
• Assuming each option takes a binary value
  • $2^{100+}$ configurations to validate
• Assuming each configuration takes 1 second to test
  • $2^{100+}$ seconds ≈ $10^{20+}$ centuries for exhaustive testing
  • The Big Bang is estimated to have occurred about $10^{7}$ centuries ago!
• Exhaustive testing is infeasible
Which configurations should be tested?
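The estimate on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes exactly 100 binary options (the slide's lower bound) and a Julian year of 365.25 days; both are assumptions made only for illustration.

```python
# Assumption: a century is 100 Julian years of 365.25 days each.
SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 3600

num_options = 100                    # lower bound taken from "100+ options"
configurations = 2 ** num_options    # every option is assumed to be binary
seconds = configurations * 1         # 1 second of testing per configuration
centuries = seconds / SECONDS_PER_CENTURY

print(f"{configurations:.3e} configurations")
print(f"{centuries:.3e} centuries of exhaustive testing")
```

The result lands between $10^{20}$ and $10^{21}$ centuries, which is consistent with the order of magnitude quoted on the slide.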

Slide 3

Slide 3 text

Combinatorial Interaction Testing (CIT)
• Many empirical studies suggest that most faults are caused by interactions among only a small number of options
  • Typically between 2 and 6
• CIT generates a sample that satisfies a given coverage criterion
  • The sample contains some combinations of the options and their values
• Typically, t-way covering arrays are used in CIT to detect faulty behaviour caused by interactions of t or fewer options

Slide 4

Slide 4 text

Covering Arrays
• A t-way covering array is a set of configurations in which each possible combination of option settings for every combination of t options appears at least once
• A combination of t option-value pairs is called a t-tuple
• t is often referred to as the strength of the covering array
• In a CA, each row is referred to as a configuration

A 2-way covering array:

OS      Browser   Protocol
XP      Firefox   IPv4
OS X    IE        IPv6
LINUX   IE        IPv4
OS X    Firefox   IPv4
XP      Firefox   IPv6
LINUX   Firefox   IPv6
XP      IE        IPv6
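The definition can be checked mechanically against the example table. The following is a minimal sketch; the helper name `uncovered_tuples` and the data layout are illustrative assumptions, not part of any tool mentioned in the talk.

```python
from itertools import combinations, product

# Option domains and rows copied from the example table on this slide.
options = {
    "OS": ["XP", "OS X", "LINUX"],
    "Browser": ["Firefox", "IE"],
    "Protocol": ["IPv4", "IPv6"],
}
rows = [
    ("XP", "Firefox", "IPv4"),
    ("OS X", "IE", "IPv6"),
    ("LINUX", "IE", "IPv4"),
    ("OS X", "Firefox", "IPv4"),
    ("XP", "Firefox", "IPv6"),
    ("LINUX", "Firefox", "IPv6"),
    ("XP", "IE", "IPv6"),
]

def uncovered_tuples(rows, options, t):
    """Return every t-tuple of option-value pairs that no row contains."""
    names = list(options)
    missing = []
    for cols in combinations(range(len(names)), t):
        seen = {tuple(row[c] for c in cols) for row in rows}
        for values in product(*(options[names[c]] for c in cols)):
            if values not in seen:
                missing.append(tuple(zip((names[c] for c in cols), values)))
    return missing

print(uncovered_tuples(rows, options, 2))       # empty list: all 2-tuples covered
print(uncovered_tuples(rows[:-1], options, 2))  # dropping the last row loses a pair
```

With all seven rows the list of missing 2-tuples is empty. Removing the last row leaves the pair (OS = XP, Browser = IE) uncovered, which shows why that row is needed.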

Slide 5

Slide 5 text

However…
• Standard covering arrays aim to minimize the number of configurations by assuming that each configuration costs the same
  • They do not take the actual testing cost into account
  • Cost may vary from one configuration to another
• Minimizing the number of configurations to be tested does not necessarily minimize the actual cost of testing
Reference: Gulsen Demiroz, "Cost-aware combinatorial interaction testing (doctoral symposium)", Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA 2015), pages 440-443, Baltimore, USA.

Slide 6

Slide 6 text

Cost-Aware Covering Array
• Demiroz et al. proposed a novel object called a t-way cost-aware covering array in 2012
• A t-way cost-aware covering array is a t-way covering array that minimizes a given cost function
• The cost function models the actual cost of testing at the level of option-value combinations
• We have extended an existing CA generation tool, Jenny, by adding cost-awareness in multiple steps
Reference: Gulsen Demiroz and Cemal Yilmaz, "Cost-Aware combinatorial interaction testing", Fourth International Conference on Advances in System Testing and Validation Lifecycle (VALID '12), Portugal, November 2012.

Slide 7

Slide 7 text

Cost Function
• The cost function cost(c) computes the expected cost of a given configuration c
• We assume that some tuples may have additional costs
• $\mathrm{cost}(c) = \mathrm{intercept} + \sigma_1 \Phi_1(\cdot) + \sigma_2 \Phi_2(\cdot) + \dots + \sigma_b \Phi_b(\cdot)$
• intercept is the base cost of the configuration in the absence of any costly tuples
• $\Phi_i$ is a costly m-tuple with an additional cost
• $\mathrm{cost}(\cdot) > 0$
• $1 \le m \le f \le k$

Slide 8

Slide 8 text

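Cost Function - Example
▪ $\mathrm{cost}(c) = 100$
▪ $\quad + (o_2 = 1 \wedge o_3 = 1) \times 50$
▪ $\quad + (o_1 = 1 \wedge o_5 = 0) \times 50$
▪ $\quad + (o_7 = 0 \wedge o_9 = 0) \times 50$
▪ …
Annotated parts: intercept (100), impact (50), costly tuple (each parenthesized conjunction), cardinality of a tuple (2), number of costly tuples

Configurations:

o1  o2  o3  o4  o5  o6  o7  o8  o9  o10
0   1   1   0   1   0   1   1   0   1
1   0   0   1   0   0   0   1   0   1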

Slide 9

Slide 9 text

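Cost Function - Example
▪ $\mathrm{cost}(c) = 100$
▪ $\quad + (o_2 = 1 \wedge o_3 = 1) \times 50$
▪ $\quad + (o_1 = 1 \wedge o_5 = 0) \times 50$
▪ $\quad + (o_7 = 0 \wedge o_9 = 0) \times 50$
▪ …

Configurations:

o1  o2  o3  o4  o5  o6  o7  o8  o9  o10
0   1   1   0   1   0   1   1   0   1
1   0   0   1   0   0   0   1   0   1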

Slide 10

Slide 10 text

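Cost Function - Example
▪ $\mathrm{cost}(c) = 100$
▪ $\quad + (o_2 = 1 \wedge o_3 = 1) \times 50$
▪ $\quad + (o_1 = 1 \wedge o_5 = 0) \times 50$
▪ $\quad + (o_7 = 0 \wedge o_9 = 0) \times 50$
▪ …

Configurations and their costs:

o1  o2  o3  o4  o5  o6  o7  o8  o9  o10  cost
0   1   1   0   1   0   1   1   0   1    150
1   0   0   1   0   0   0   1   0   1    200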
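The two costs in the example can be recomputed directly. This is a minimal sketch that assumes the options are numbered 1 to 10 from left to right and that only the three costly tuples visible on the slide apply; the slide's trailing "…" suggests there may be more, so the list below is not necessarily complete.

```python
INTERCEPT = 100   # base cost of any configuration
IMPACT = 50       # additional cost of each costly tuple that is present

# Each costly tuple maps 1-based option indices to the required values.
# Assumption: only the three tuples shown on the slide are modelled.
COSTLY_TUPLES = [
    {2: 1, 3: 1},
    {1: 1, 5: 0},
    {7: 0, 9: 0},
]

def cost(config):
    """Intercept plus the impact of every costly tuple contained in config."""
    total = INTERCEPT
    for costly in COSTLY_TUPLES:
        if all(config[index - 1] == value for index, value in costly.items()):
            total += IMPACT
    return total

c1 = [0, 1, 1, 0, 1, 0, 1, 1, 0, 1]
c2 = [1, 0, 0, 1, 0, 0, 0, 1, 0, 1]
print(cost(c1), cost(c2))  # 150 200
```

The first configuration contains only the tuple on options 2 and 3, giving 100 + 50. The second contains the other two tuples, giving 100 + 50 + 50.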

Slide 11

Slide 11 text

Cost Function Discovery
• Discovering a cost model (cost function) for a given software system is time-consuming and error-prone
• It may require expertise in the targeted software
• Demiroz et al. have worked on automatic cost model discovery using a linear regression method
Reference: Gulsen Demiroz and Cemal Yilmaz, "Towards Automatic Cost Model Discovery for Combinatorial Interaction Testing", Proceedings of the 2016 International Workshop on Combinatorial Testing (IWCT 2016), held with the IEEE International Conference on Software Testing, Verification and Validation (ICST 2016), pages 46-50, IEEE, 2016.

Slide 12

Slide 12 text

Generating Cost-Aware Covering Arrays
• Three objectives:
  • Meet the coverage criteria of standard t-way covering arrays
  • Minimize the cost function
  • Keep the CA generation process short
• Method: a parallel, iterative, greedy algorithm
  • Generate a configuration with "maximal" coverage at "minimal" cost at each iteration
  • Repeat until all t-tuples are covered
• Output: a covering array with "minimal" cost

Slide 13

Slide 13 text

CA Jenny
Input: S: configuration space, t: strength
Output: CA(S): N x k covering array
while true do
    tuple ← SelectUncoveredTuple(CA, S)
    for i = 1 to τ do
        config ← GenerateConfiguration(S, tuple)
        (config, cost, cov) ← ImproveConfiguration(S, config)
        (bConfig, bCost, bCov) ← UpdateBestResult(config, cost, cov)
    CA ← CA ∪ {bConfig}
    if CountUncoveredTuples(S) = 0 then break

Slide 14

Slide 14 text

CA Jenny: tuple ← SelectUncoveredTuple(CA, S)
▪ The tuples are enumerated dynamically instead of in a preprocessing step
▪ At each iteration, the list of uncovered tuples is extended and one tuple is selected at random
▪ At the end of each iteration, the newly covered tuples are removed from this list

Slide 15

Slide 15 text

CA Jenny: config ← GenerateConfiguration(S, tuple)
▪ The selected tuple is added to an empty configuration
▪ The rest of the configuration is filled randomly
▪ If an inter-option constraint is violated, the process is repeated
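The three bullets above can be sketched as follows. This is an illustrative reading, not Jenny's actual code: the function signature, the `is_valid` callback used to model inter-option constraints, and the `max_tries` safety limit are all assumptions introduced here.

```python
import random

def generate_configuration(domains, seed_tuple, is_valid, rng, max_tries=1000):
    """domains: one list of values per option.
    seed_tuple: dict mapping option index -> value (the selected t-tuple).
    is_valid: hypothetical callback that returns False on a constraint violation.
    max_tries: safety limit added for this sketch only."""
    for _ in range(max_tries):
        config = [None] * len(domains)              # start from an empty configuration
        for index, value in seed_tuple.items():     # place the selected tuple
            config[index] = value
        for index, values in enumerate(domains):    # fill the rest randomly
            if config[index] is None:
                config[index] = rng.choice(values)
        if is_valid(config):                        # otherwise repeat the process
            return config
    raise RuntimeError("no valid configuration found within max_tries")

domains = [[0, 1]] * 5
# Hypothetical constraint: options 1 and 2 must not both be set to 1.
not_both = lambda c: not (c[1] == 1 and c[2] == 1)
config = generate_configuration(domains, {0: 1, 3: 0}, not_both, random.Random(7))
print(config)
```

Whatever the random choices are, the returned configuration keeps the seed tuple intact and satisfies the constraint.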

Slide 16

Slide 16 text

CA Jenny: (config, cost, cov) ← ImproveConfiguration(S, config)
▪ Two goals: minimizing the cost and maximizing the coverage
▪ A random walk over the options of the configuration is carried out
▪ For each option, all values are tried
▪ If a value invalidates the configuration, it is skipped
▪ If a value increases the coverage and decreases the cost, the corresponding option is updated with that value

Slide 17

Slide 17 text

CA Jenny: (bConfig, bCost, bCov) ← UpdateBestResult(config, cost, cov)
▪ Since τ candidate configurations are generated, the one with the best result is selected
▪ The configuration with the highest coverage and the lowest cost is selected
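Putting the steps together, the greedy loop can be sketched in a few dozen lines. This is a simplified illustration and not the CA Jenny implementation: it enumerates all t-tuples up front instead of dynamically, ignores inter-option constraints, accepts a value during improvement when it makes neither coverage nor cost worse and improves at least one of them, and ranks candidates by coverage first and cost second. All function names are hypothetical.

```python
import random
from itertools import combinations, product

def tuples_of(config, t):
    """All t-tuples (frozensets of (option, value) pairs) contained in config."""
    return {frozenset((i, config[i]) for i in cols)
            for cols in combinations(range(len(config)), t)}

def all_tuples(domains, t):
    """Every t-tuple over the configuration space (enumerated up front here)."""
    return {frozenset(zip(cols, values))
            for cols in combinations(range(len(domains)), t)
            for values in product(*(domains[i] for i in cols))}

def improve(config, domains, uncovered, t, cost, rng):
    """Visit the options in random order and try every value of each option."""
    current = list(config)
    cur_cov, cur_cost = len(tuples_of(current, t) & uncovered), cost(current)
    order = list(range(len(domains)))
    rng.shuffle(order)
    for i in order:
        for value in domains[i]:
            if value == current[i]:
                continue
            cand = current[:i] + [value] + current[i + 1:]
            cov, c = len(tuples_of(cand, t) & uncovered), cost(cand)
            # Assumed acceptance rule: never worse, strictly better in one goal.
            if cov >= cur_cov and c <= cur_cost and (cov > cur_cov or c < cur_cost):
                current, cur_cov, cur_cost = cand, cov, c
    return current

def cost_aware_ca(domains, t, cost, tau=8, seed=0):
    rng = random.Random(seed)
    uncovered = all_tuples(domains, t)
    ca = []
    while uncovered:
        target = rng.choice(sorted(uncovered, key=sorted))  # one uncovered tuple
        best, best_score = None, None
        for _ in range(tau):                                # tau candidates
            config = [rng.choice(values) for values in domains]
            for i, value in target:
                config[i] = value
            config = improve(config, domains, uncovered, t, cost, rng)
            score = (len(tuples_of(config, t) & uncovered), -cost(config))
            if best_score is None or score > best_score:
                best, best_score = config, score
        ca.append(best)
        uncovered -= tuples_of(best, t)                     # drop newly covered tuples
    return ca

def example_cost(config):
    # Hypothetical cost model: options 1 and 2 both set to 1 cost 50 extra.
    return 100 + (50 if config[1] == 1 and config[2] == 1 else 0)

ca = cost_aware_ca([[0, 1]] * 6, t=2, cost=example_cost)
print(len(ca), sum(example_cost(row) for row in ca))
```

Every candidate starts with the selected uncovered tuple and the improvement step never reduces coverage, so each iteration covers at least one new tuple and the loop terminates.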

Slide 18

Slide 18 text

CA Jenny: for i = 1 to τ do
• The candidate-generation loop accounts for 84%-89% of the execution time
• Candidate configurations are generated independently of one another, so the iterations can be performed in parallel
• This allows n threads, where n ≤ τ, to be utilized effectively
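Amdahl's law is not mentioned on the slides, but it gives a rough upper bound on what parallelizing this loop can achieve. The sketch below assumes the 84%-89% share is perfectly parallelizable and the remainder is strictly sequential, and it measures speedup against a single-threaded run of the same code.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Upper bound on speedup when only parallel_fraction of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_threads)

for p in (0.84, 0.89):
    for n in (1, 2, 4, 8):
        print(f"parallel fraction {p:.2f}, {n} threads: "
              f"at most {amdahl_speedup(p, n):.2f}x")
```

Under these assumptions, 8 threads yield at most about 3.8x for a parallel fraction of 0.84 and about 4.5x for 0.89.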

Slide 19

Slide 19 text

CA Jenny (parallel version)
Input: S: configuration space, t: strength
Output: CA(S): N x k covering array
while true do
    tuple ← SelectUncoveredTuple(CA, S)
    for i = 1 to τ do in parallel
        config ← GenerateConfiguration(S, tuple)
        (configs[i], costs[i], covs[i]) ← ImproveConfiguration(S, config)
    (bConfig, bCost, bCov) ← SelectBestResult(configs, costs, covs)
    CA ← CA ∪ {bConfig}
    if CountUncoveredTuples(S) = 0 then break
• The parallel loop accounts for 84%-89% of the execution time
• Candidate configurations are generated independently of one another, so the iterations can be performed in parallel
• This allows n threads, where n ≤ τ, to be utilized effectively
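The structure of the parallel loop can be illustrated with Python's standard thread pool. This only shows the pattern (generate τ candidates concurrently, then pick the best); `make_candidate` is a hypothetical stand-in with a placeholder coverage score, and CPython threads will not speed up CPU-bound work the way native threads would.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def make_candidate(seed):
    """Hypothetical stand-in for GenerateConfiguration + ImproveConfiguration."""
    rng = random.Random(seed)            # one private generator per candidate
    config = [rng.randint(0, 1) for _ in range(10)]
    cost = 100 + (50 if config[1] == 1 and config[2] == 1 else 0)
    coverage = sum(config)               # placeholder score, NOT real tuple coverage
    return config, cost, coverage

def select_best(results):
    """Highest coverage first, lowest cost as the tie-breaker (an assumption)."""
    return max(results, key=lambda r: (r[2], -r[1]))

def best_candidate(tau=8, n_threads=4, base_seed=0):
    # n_threads is capped at tau because there are only tau independent tasks.
    with ThreadPoolExecutor(max_workers=min(n_threads, tau)) as pool:
        results = list(pool.map(make_candidate, range(base_seed, base_seed + tau)))
    return select_best(results)

print(best_candidate(tau=8, n_threads=4))
```

Because each candidate uses its own seeded generator and `pool.map` returns results in submission order, the selected configuration does not depend on the number of threads.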

Slide 20

Slide 20 text

Experimental Setup
• Independent variables:
  • Strength (t): {2, 3}
  • Number of options (k): {25, 35, 45, 55, 65, 85, 100}
  • Impact of costly tuples (i): {50%, 100%}
  • Number of costly tuples (b): {4, 5, 6}
  • Cardinality of costly tuples (f): {1, 2, 3, 4}
  • Tools: {Jenny, ACTS, CA Jenny}
  • Number of threads: {1, 2, 4, 8}
• 3 executions with different random seeds
• Over 15K experiments were carried out

Slide 21

Slide 21 text

Experimental Results – Cost Reduction
• Several experiments were carried out to generate CAs with Jenny, ACTS, and Cost-Aware Jenny
• Note that neither ACTS nor Jenny is cost-aware
• The results of Cost-Aware Jenny are compared to the better of the Jenny and ACTS results
• Even in the worst cases, Cost-Aware Jenny generates CAs with about 35% and 21% lower cost for t = 2 and t = 3, respectively, when the impact of costly tuples is set to 100%

Slide 22

Slide 22 text

Experimental Results – Cost Reduction

Cost Reduction (%) for Impact = 100%

k      t = 2    t = 3
25     39.97    20.88
35     37.36    25.17
45     38.29    26.69
55     41.12    28.91
65     35.41    29.50
85     41.30    29.26
100    41.17    29.99
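The worst-case and average reductions can be read off the table with a short script. The figures below are copied from the table; the variable names are chosen only for this illustration.

```python
# k -> (cost reduction % for t = 2, cost reduction % for t = 3), Impact = 100%
cost_reduction = {
    25: (39.97, 20.88),
    35: (37.36, 25.17),
    45: (38.29, 26.69),
    55: (41.12, 28.91),
    65: (35.41, 29.50),
    85: (41.30, 29.26),
    100: (41.17, 29.99),
}

worst_t2 = min(r[0] for r in cost_reduction.values())
worst_t3 = min(r[1] for r in cost_reduction.values())
mean_t2 = sum(r[0] for r in cost_reduction.values()) / len(cost_reduction)
mean_t3 = sum(r[1] for r in cost_reduction.values()) / len(cost_reduction)

print(f"worst case: t=2 {worst_t2:.2f}%, t=3 {worst_t3:.2f}%")
print(f"mean:       t=2 {mean_t2:.2f}%, t=3 {mean_t3:.2f}%")
```

The minima are 35.41% (k = 65) for t = 2 and 20.88% (k = 25) for t = 3, which match the rounded worst-case figures of 35% and 21%.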

Slide 23

Slide 23 text

Experimental Results – Performance
• For all experiments, τ = 8 is used to avoid load-imbalance issues
• The parallel implementation outperforms Jenny in execution time, with 2.45x and 3.53x speedups for t = 2 and t = 3, respectively
• The cost of adding the new functionality is offset by using multicore architectures effectively
• Thus, we claim that cost-awareness comes "for free"

Slide 24

Slide 24 text

Experimental Results – Performance
[Figure: execution times (in ms) for t = 2, plotted against the number of options k ∈ {25, 35, 45, 55, 65, 85, 100}. Series: Jenny and CA Jenny with N = 1, 2, 4, and 8 threads. The y-axis runs from 0 to 160 ms.]

Slide 25

Slide 25 text

Experimental Results – Performance
[Figure: execution times (in s) for t = 3, plotted against the number of options k ∈ {25, 35, 45, 55, 65, 85, 100}. Series: Jenny and CA Jenny with N = 1, 2, 4, and 8 threads. The y-axis is logarithmic (base 2), from 0.0625 s to 1024 s.]

Slide 26

Slide 26 text

Conclusion & Future Work
• We have shown that generating a cost-aware CA is not necessarily more costly than generating a standard CA
• Moreover, we have empirically demonstrated that parallelization can be a valuable asset for the CA generation process
• We believe there is still room for improvement in both cost reduction and performance
• As future work, we plan to investigate other approaches, such as SAT solvers and branch-and-bound algorithms, for computing cost-aware CAs

Slide 27

Slide 27 text

Thanks For Listening
QUESTIONS?