
TMPA-2017: Generating Cost Aware Covering Arrays For Free

TMPA-2017: Tools and Methods of Program Analysis
3-4 March, 2017, Hotel Holiday Inn Moscow Vinogradovo, Moscow

Generating Cost Aware Covering Arrays For Free
Mustafa Kemal Tas, Hanefi Mercan, Gülşen Demiröz, Kamer Kaya, Cemal Yilmaz, Sabanci University

For video follow the link: https://youtu.be/Wkdd4A0rRjE

Would you like to know more?
Visit our website:
www.tmpaconf.org
www.exactprosystems.com/events/tmpa

Follow us:
https://www.linkedin.com/company/exactpro-systems-llc?trk=biz-companies-cym
https://twitter.com/exactpro

Exactpro

March 23, 2017

Transcript

  1. Generating Cost-Aware Covering Arrays for Free (TMPA 2017)

    Mustafa Kemal Tas, Hanefi Mercan, Gulsen Demiroz, Kamer Kaya, and Cemal Yilmaz
    Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey
    {mkemaltas | hanefimercan | gulsend | kaya | cyilmaz}@sabanciuniv.edu
  2. Combinatorial Testing: A Motivating Example (MySQL)

    - A highly configurable system
    - 100+ configuration options
    - Dozens of OS, compiler, and platform combinations
    - Assuming each option takes a binary value: $2^{100+}$ configurations to validate
    - Assuming each configuration takes 1 second to test: $2^{100+}$ seconds ≈ $10^{20+}$ centuries for exhaustive testing
    - The Big Bang is estimated to have occurred about $10^{8}$ centuries ago!
    - Exhaustive testing is infeasible

    Which configurations should be tested?
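The arithmetic behind this estimate can be checked directly. The sketch below assumes exactly 100 binary options, one second per configuration, and a 365.25-day year; the constant and variable names are illustrative and do not come from the slides.

```python
# Back-of-the-envelope check of the exhaustive-testing estimate.
# Assumptions (not from the slide): exactly 100 binary options,
# 1 second per configuration, 365.25-day years.
SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 60 * 60

num_options = 100
num_configurations = 2 ** num_options      # every on/off combination
centuries = num_configurations / SECONDS_PER_CENTURY

print(f"{num_configurations:.3e} configurations")
print(f"{centuries:.3e} centuries of testing")
```

Under these assumptions the result lands in the $10^{20}$ range, which is where the "$10^{20+}$ centuries" figure comes from.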
  3. Combinatorial Interaction Testing (CIT)

    - Many empirical studies suggest that most faults are caused by interactions among only a small number of options, typically between 2 and 6
    - CIT generates a sample for a given coverage criterion; the sample contains some combinations of the options and their values
    - Typically, t-way covering arrays are used in CIT to detect faulty behaviour caused by interactions of t or fewer options
  4. Covering Arrays

    - A t-way covering array is a set of configurations in which each possible combination of option settings for every combination of t options appears at least once
    - A combination of t option-value pairs is called a t-tuple
    - t is often referred to as the strength of the covering array
    - In a CA, each row is referred to as a configuration

    A 2-way covering array:

        OS       Browser   Protocol
        XP       Firefox   IPv4
        OS X     IE        IPv6
        LINUX    IE        IPv4
        OS X     Firefox   IPv4
        XP       Firefox   IPv6
        LINUX    Firefox   IPv6
        XP       IE        IPv6
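The definition can be checked mechanically against the example array. The following is a minimal sketch; the helper name `uncovered_tuples` and the data layout are assumptions made for illustration, not part of any tool mentioned in the talk.

```python
from itertools import combinations, product

# The 7-row example array (columns: OS, Browser, Protocol).
OPTIONS = {
    "OS": ["XP", "OS X", "LINUX"],
    "Browser": ["Firefox", "IE"],
    "Protocol": ["IPv4", "IPv6"],
}
ARRAY = [
    ("XP", "Firefox", "IPv4"),
    ("OS X", "IE", "IPv6"),
    ("LINUX", "IE", "IPv4"),
    ("OS X", "Firefox", "IPv4"),
    ("XP", "Firefox", "IPv6"),
    ("LINUX", "Firefox", "IPv6"),
    ("XP", "IE", "IPv6"),
]

def uncovered_tuples(array, options, t):
    """Return every t-tuple of option-value pairs that no row contains."""
    names = list(options)
    missing = []
    for cols in combinations(range(len(names)), t):
        # Value combinations that actually occur in these columns.
        seen = {tuple(row[c] for c in cols) for row in array}
        for values in product(*(options[names[c]] for c in cols)):
            if values not in seen:
                missing.append(tuple(zip((names[c] for c in cols), values)))
    return missing

print(uncovered_tuples(ARRAY, OPTIONS, 2))       # [] means every pair is covered
print(len(uncovered_tuples(ARRAY, OPTIONS, 3)))  # 3-way coverage is not guaranteed
```

The array covers all 16 option-value pairs with 7 rows, whereas exhaustive testing of this small space would need 12 configurations.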
  5. However…

    - Standard covering arrays aim to minimize the number of configurations by assuming that each configuration costs the same
    - They do not take the actual testing cost into account
    - Cost may vary from one configuration to another
    - Minimizing the number of configurations to be tested does not necessarily minimize the actual cost of testing

    Reference: Gulsen Demiroz, "Cost-aware combinatorial interaction testing" (doctoral symposium), ISSTA 2015, Proceedings of the 2015 International Symposium on Software Testing and Analysis, pages 440-443, Baltimore, USA.
  6. Cost-Aware Covering Array

    - Demiroz et al. proposed a novel object called a t-way cost-aware covering array in 2012
    - A t-way cost-aware covering array is a t-way covering array that minimizes a given cost function
    - The cost function models the actual cost of testing at the level of option-value combinations
    - We have extended an existing CA generation tool, Jenny, by adding cost-awareness in multiple steps

    Reference: Gulsen Demiroz and Cemal Yilmaz, "Cost-Aware combinatorial interaction testing", Fourth International Conference on Advances in System Testing and Validation Lifecycle, VALID '12, Portugal, November 2012.
  7. Cost Function

    - The cost function $\mathrm{cost}(c)$ computes the expected cost of a given configuration $c$
    - We assume that some tuples may have additional costs
    - $\mathrm{cost}(c) = \mathrm{intercept} + \sigma_1 \Phi_1(c) + \sigma_2 \Phi_2(c) + \dots + \sigma_b \Phi_b(c)$
    - The intercept is the base cost of the configuration in the absence of any costly tuples
    - Each $\Phi_j$ is a costly $m$-tuple with an additional cost greater than 0; $\Phi_j(c)$ is 1 when $c$ contains that tuple and 0 otherwise, and $\sigma_j$ is its impact
    - $1 \le m \le f \le k$
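  8. Cost Function - Example

    $\mathrm{cost}(c) = 100 + (o_2 = 1 \land o_3 = 1) \cdot 50 + (o_1 = 1 \land o_5 = 0) \cdot 50 + (o_7 = 0 \land o_9 = 0) \cdot 50 + \dots$

    Annotated parts of the formula: intercept (100), impact (50), costly tuple (for example $o_2 = 1 \land o_3 = 1$), cardinality of a tuple (2), number of costly tuples.

    Example configurations:

        0 1 1 0 1 0 1 1 0 1
        1 0 0 1 0 0 0 1 0 1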
  10. Cost Function - Example

    $\mathrm{cost}(c) = 100 + (o_2 = 1 \land o_3 = 1) \cdot 50 + (o_1 = 1 \land o_5 = 0) \cdot 50 + (o_7 = 0 \land o_9 = 0) \cdot 50 + \dots$

        Configuration          cost
        0 1 1 0 1 0 1 1 0 1    150
        1 0 0 1 0 0 0 1 0 1    200
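The example costs can be reproduced with a few lines of code. This sketch includes only the three costly tuples shown in the example (the formula's trailing "…" suggests the real model may have more); the data layout and the names `INTERCEPT`, `COSTLY_TUPLES`, and `cost` are assumptions made for illustration.

```python
# Each costly tuple maps 1-based option indices to required values and
# carries an impact. Only the three tuples visible in the example are
# listed; any further terms of the model are unknown and omitted here.
INTERCEPT = 100
COSTLY_TUPLES = [
    ({2: 1, 3: 1}, 50),
    ({1: 1, 5: 0}, 50),
    ({7: 0, 9: 0}, 50),
]

def cost(config, intercept=INTERCEPT, costly_tuples=COSTLY_TUPLES):
    """Base cost plus the impact of every costly tuple present in config."""
    total = intercept
    for settings, impact in costly_tuples:
        if all(config[opt - 1] == val for opt, val in settings.items()):
            total += impact
    return total

c1 = [0, 1, 1, 0, 1, 0, 1, 1, 0, 1]
c2 = [1, 0, 0, 1, 0, 0, 0, 1, 0, 1]
print(cost(c1), cost(c2))  # 150 200
```

The first configuration triggers only the tuple on options 2 and 3; the second triggers the tuples on options 1 and 5 and on options 7 and 9, which gives the two costs in the example.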
  11. Cost Function Discovery

    - Discovering a cost model (cost function) for a given software system is time-consuming and error-prone
    - It may require expertise in the targeted software
    - Demiroz et al. have worked on automatic cost model discovery using a linear regression method

    Reference: Gulsen Demiroz and Cemal Yilmaz, "Towards Automatic Cost Model Discovery for Combinatorial Interaction Testing", Proceedings of the 2016 International Workshop on Combinatorial Testing (IWCT 2016), in IEEE International Conference on Software Testing, Verification and Validation (ICST) 2016, pages 46-50, IEEE.
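To illustrate how a linear model can recover intercept and impacts from measured costs, here is a minimal ordinary-least-squares sketch. It is not the discovery method of Demiroz et al.: it assumes the candidate costly tuples are already known, uses a noise-free stand-in `measured_cost` instead of real measurements, and all function names are hypothetical.

```python
from itertools import product

# Hypothetical candidate tuples: 1-based option index -> required value.
CANDIDATES = [{2: 1, 3: 1}, {1: 1, 5: 0}, {7: 0, 9: 0}]

def features(config):
    """Intercept column plus one 0/1 indicator per candidate tuple."""
    return [1.0] + [
        float(all(config[o - 1] == v for o, v in t.items())) for t in CANDIDATES
    ]

def measured_cost(config):
    """Stand-in for a real measurement (assumed: base 100, impact 50 each)."""
    return 100.0 + 50.0 * sum(features(config)[1:])

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit(configs):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    rows = [features(c) for c in configs]
    ys = [measured_cost(c) for c in configs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

configs = list(product([0, 1], repeat=10))  # all 1024 binary configurations
beta = fit(configs)
print([round(b, 6) for b in beta])  # intercept first, then one impact per tuple
```

With real, noisy measurements and unknown candidate tuples the problem is considerably harder, which is why the slide describes manual discovery as time-consuming and error-prone.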
  12. Generating Cost-Aware Covering Arrays

    - Three objectives:
        - Meet the coverage criteria of standard t-way covering arrays
        - Minimize the cost function
        - Keep the CA generation process short
    - Method: a parallel, iterative, greedy algorithm
        - Generate a configuration with "maximal" coverage at "minimal" cost at each iteration
        - Repeat until all t-tuples are covered
    - Output: a covering array with "minimal" cost
  13. CA Jenny

        Input: S: configuration space, t: strength
        Output: CA(S): N x k covering array
        while true do
            tuple ← SelectUncoveredTuple(CA, S)
            for i = 1 to τ do
                config ← GenerateConfiguration(S, tuple)
                (config, cost, cov) ← ImproveConfiguration(S, config)
                (bConfig, bCost, bCov) ← UpdateBestResult(config, cost, cov)
            CA ← CA ∪ {bConfig}
            if CountUncoveredTuples(S) = 0 then
                break
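The loop above can be sketched as runnable code under strong simplifications. This is not the CA Jenny implementation: it assumes binary options, no inter-option constraints, a toy cost model, a "no worse on both, better on at least one" acceptance rule, and a coverage-first, cost-second selection rule. All function names are hypothetical.

```python
import random
from itertools import combinations, product

def all_tuples(k, t):
    """Every t-tuple as a frozenset of (option, value) pairs; binary options."""
    return {
        frozenset(zip(cols, vals))
        for cols in combinations(range(k), t)
        for vals in product((0, 1), repeat=t)
    }

def covered_by(config, t):
    """The t-tuples contained in one configuration."""
    return {
        frozenset((c, config[c]) for c in cols)
        for cols in combinations(range(len(config)), t)
    }

def cost(config):
    # Toy cost model (assumption): base 100, +50 if options 0 and 1 are both 1.
    return 100 + (50 if config[0] == 1 and config[1] == 1 else 0)

def generate_configuration(k, seed_tuple, rng):
    """Place the selected tuple, fill the remaining options randomly."""
    config = [rng.randint(0, 1) for _ in range(k)]
    for opt, val in seed_tuple:
        config[opt] = val
    return config

def improve_configuration(config, seed_tuple, uncovered, t, rng):
    """Visit free options in random order and keep values that help."""
    def score(cfg):
        return len(covered_by(cfg, t) & uncovered), cost(cfg)

    fixed = {opt for opt, _ in seed_tuple}
    free = [o for o in range(len(config)) if o not in fixed]
    rng.shuffle(free)
    for opt in free:
        for val in (0, 1):
            trial = config[:opt] + [val] + config[opt + 1:]
            (cov_new, cost_new), (cov_old, cost_old) = score(trial), score(config)
            no_worse = cov_new >= cov_old and cost_new <= cost_old
            if no_worse and (cov_new, cost_new) != (cov_old, cost_old):
                config = trial
    return config

def generate_cost_aware_ca(k, t=2, tau=8, seed=0):
    rng = random.Random(seed)
    uncovered = all_tuples(k, t)
    ca = []
    while uncovered:
        seed_tuple = rng.choice(sorted(uncovered, key=sorted))
        best = None
        for _ in range(tau):
            config = generate_configuration(k, seed_tuple, rng)
            config = improve_configuration(config, seed_tuple, uncovered, t, rng)
            key = (-len(covered_by(config, t) & uncovered), cost(config))
            if best is None or key < best[0]:
                best = (key, config)
        ca.append(best[1])
        uncovered -= covered_by(best[1], t)  # seed tuple is always removed
    return ca

ca = generate_cost_aware_ca(k=6)
print(len(ca), "configurations, total cost", sum(cost(c) for c in ca))
```

Because the selected tuple is never overwritten during improvement, every iteration covers at least one new tuple, so the loop terminates.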
  14. CA Jenny: selecting an uncovered tuple

    Step: tuple ← SelectUncoveredTuple(CA, S)

    - The tuples are enumerated dynamically instead of in a preprocessing step
    - At each iteration, the list of uncovered tuples is extended and one tuple is selected randomly
    - At the end of each iteration, additionally covered tuples are removed from this list
  15. CA Jenny: generating a configuration

    Step: config ← GenerateConfiguration(S, tuple)

    - The selected tuple is added to an empty configuration
    - The rest of the configuration is filled randomly
    - If an inter-option constraint is violated, the process is repeated
  16. CA Jenny: improving a configuration

    Step: (config, cost, cov) ← ImproveConfiguration(S, config)

    - Two goals: minimizing the cost and maximizing the coverage
    - A random walk on the configuration is carried out
    - For each option, all values are tested
    - If a value invalidates the configuration, it is skipped
    - If a value increases coverage and decreases the cost, the corresponding option is updated with that value
  17. CA Jenny: updating the best result

    Step: (bConfig, bCost, bCov) ← UpdateBestResult(config, cost, cov)

    - Since we generate τ candidate configurations, we select the configuration with the best result
    - The configuration with the highest coverage and the lowest cost is selected
  18. CA Jenny: the candidate loop

    Step: for i = 1 to τ do

    - This loop accounts for 84% to 89% of the execution time
    - Generating configurations are independent events and can be performed in parallel
    - This allows effective utilization of n threads, where n ≤ τ
  19. CA Jenny: parallel version

        Input: S: configuration space, t: strength
        Output: CA(S): N x k covering array
        while true do
            tuple ← SelectUncoveredTuple(CA, S)
            for i = 1 to τ do in parallel
                config ← GenerateConfiguration(S, tuple)
                (configs, costs, covs) ← ImproveConfiguration(S, config)
            (bConfig, bCost, bCov) ← SelectBestResult(configs, costs, covs)
            CA ← CA ∪ {bConfig}
            if CountUncoveredTuples(S) = 0 then
                break
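The structure of the parallel step, producing τ independent candidates and then selecting the best one, can be sketched as follows. This is an illustration only: `make_candidate` is a hypothetical stand-in with placeholder cost and coverage values, and the coverage-first tie-breaking is an assumption. In CPython, threads do not speed up CPU-bound pure-Python work, so a real implementation would use processes or native threads; the sketch shows the shape of the computation, not its performance.

```python
import random
from concurrent.futures import ThreadPoolExecutor

TAU = 8  # candidates per iteration, the value used in the experiments

def make_candidate(seed):
    """Hypothetical stand-in for GenerateConfiguration + ImproveConfiguration.

    Returns (config, cost, coverage) for a 10-option binary configuration.
    Cost and coverage are placeholders, not the real measures.
    """
    rng = random.Random(seed)  # per-task RNG, so tasks share no mutable state
    config = tuple(rng.randint(0, 1) for _ in range(10))
    cost = 100 + 50 * (config[1] == 1 and config[2] == 1)
    coverage = sum(config)     # placeholder for "newly covered tuples"
    return config, cost, coverage

def select_best_result(results):
    """Highest coverage first, then lowest cost (tie-breaking is assumed)."""
    return max(results, key=lambda r: (r[2], -r[1]))

def best_of_tau(iteration, n_threads):
    """Build TAU candidates with n_threads workers (n_threads <= TAU)."""
    seeds = [iteration * TAU + i for i in range(TAU)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(make_candidate, seeds))  # input order is kept
    return select_best_result(results)

print(best_of_tau(iteration=0, n_threads=4))
```

Because each candidate depends only on its own seed, the selected configuration is the same for any thread count; only the wall-clock time changes.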
  20. Experimental Setup

    - Independent variables:
        - Strength (t): {2, 3}
        - Number of options (k): {25, 35, 45, 55, 65, 85, 100}
        - Impact of costly tuples (i): {50%, 100%}
        - Number of costly tuples (b): {4, 5, 6}
        - Cardinality of costly tuples (f): {1, 2, 3, 4}
        - Tools: {Jenny, ACTS, CA Jenny}
        - Number of threads: {1, 2, 4, 8}
    - 3 executions with different random seeds
    - Over 15K experiments were carried out
  21. Experimental Results – Cost Reduction

    - Several experiments were carried out to generate CAs with Jenny, ACTS, and Cost-Aware Jenny
    - Note that neither ACTS nor Jenny is cost-aware
    - The results of Cost-Aware Jenny are compared to the better result of Jenny and ACTS
    - Even in the worst cases, Cost-Aware Jenny generates CAs with 35% and 21% lower cost for t = 2 and t = 3, respectively, when the impact of costly tuples is set to 100%
  22. Experimental Results – Cost Reduction

    Cost reduction (%) for impact = 100%:

        k      t = 2    t = 3
        25     39.97    20.88
        35     37.36    25.17
        45     38.29    26.69
        55     41.12    28.91
        65     35.41    29.50
        85     41.30    29.26
        100    41.17    29.99
  23. Experimental Results – Performance

    - τ = 8 is used in all experiments to avoid load-imbalance issues
    - The parallel implementation outperforms Jenny in execution time, with 2.45x and 3.53x speedups for t = 2 and t = 3, respectively
    - The cost of adding the new functionality is offset by using multicore architectures effectively
    - Thus, we claim that cost-awareness comes "for free"
  24. Experimental Results – Performance

    [Figure: execution times (in ms) for t = 2, plotted for 25, 35, 45, 55, 65, 85, and 100 options. Series: Jenny, N = 1, N = 2, N = 4, N = 8. Vertical axis from 0 to 160 ms.]
  25. Experimental Results – Performance

    [Figure: execution times (in s) for t = 3, plotted for 25, 35, 45, 55, 65, 85, and 100 options. Series: Jenny, N = 1, N = 2, N = 4, N = 8. Logarithmic vertical axis from 0.0625 s to 1024 s.]
  26. Conclusion & Future Work

    - We have shown that generating a cost-aware CA is not necessarily more costly than generating a standard CA
    - Moreover, we have empirically demonstrated that parallelization can be a valuable asset for the CA generation process
    - We believe there is still room for improvement in both cost reduction and performance
    - As future work, we plan to investigate other approaches, such as SAT solvers and branch-and-bound algorithms, to compute cost-aware CAs
  27. Thanks for Listening

    Questions?