
Evaluating and Improving White-Box Test Generation

Presentation of Dávid Honfi's PhD dissertation about software testing. Link: http://hdl.handle.net/10890/15132

We present (1) two empirical studies that evaluate the effectiveness and understandability of tests generated from source code, (2) an approach and tool called AutoIsolator to automatically mock dependencies in unit testing, and (3) an approach and tool called SEViz to visualize symbolic execution for test generation.



Transcript

  1. Evaluating and Improving White-Box Test Generation

    Dávid Honfi
    Advisor: Zoltán Micskei, PhD
    Public PhD defense, http://hdl.handle.net/10890/15132
    Critical Systems Research Group (ftsrg)
    Department of Measurement and Information Systems
    Budapest University of Technology and Economics
  2. Scope and Motivation

    • Software testing is crucial for quality
    • Thorough testing requires time and effort
    • Automated techniques have been proposed; white-box test generation is one of them
    • Practical application of such techniques is non-trivial

    Goal: extend the empirical evidence on white-box test generation techniques, and propose novel approaches that facilitate their use in practice.
  3. White-Box Test Generation

    public int ClassifyNum(int n) {
        if (n > 0) {
            if (n % 2 == 0) return 0;
            else return 1;
        }
        else return 2;
    }

    ID | Input [n] | Observed output
    T1 | 0         | 2
    T2 | 1         | 1
    T3 | 2         | 0

    Seems to be easy on simple code snippets, but what about real code?
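A white-box generator derives one input per program path by solving the branch conditions along that path. For ClassifyNum the three paths can be enumerated by hand; the following minimal Python sketch (a port of the C# snippet above, not the actual Pex/IntelliTest algorithm) checks that the three generated inputs from the table exercise all three outcomes:

```python
def classify_num(n):
    # Python port of the C# ClassifyNum under test.
    if n > 0:
        if n % 2 == 0:
            return 0   # path: n > 0 and n is even
        return 1       # path: n > 0 and n is odd
    return 2           # path: n <= 0

# Inputs a path-based generator could produce (one per path),
# paired with the observed outputs from the slide's table.
generated_tests = [(0, 2), (1, 1), (2, 0)]

for n, expected in generated_tests:
    assert classify_num(n) == expected

# All three return statements (i.e. all paths) are exercised.
assert {classify_num(n) for n, _ in generated_tests} == {0, 1, 2}
```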
  4. Research Method and Challenges

    Empirical studies:
    • Study 1: Test generation during development
    • Study 2: Classification of generated white-box tests

    Identified challenges:
    • C1: Understanding
    • C2: Lack of trust
    • C3: Low coverage
    • C4: Complex programs
  5. Background

    • Several existing studies target white-box test generation
    • Only some of them employ human participants
    • Problem: limited conclusions for practical settings
    • Possible solutions:
      – Replications (Study 1): increase confidence in results, build a body of knowledge, manage validity concerns
      – A new setting (Study 2): provides novel insights and freedom in study design, but yields limited validity
  6. T1.1 Study 1: Test Generation in Development

    • Original study by Rojas et al. [1]
    • Goal: "Empirically evaluate the effects of using an automated test generation tool during development"
    • External differentiated replication with changes including:
      – Test generator tool: EvoSuite → Microsoft Pex/IntelliTest
      – Programming language: Java → C#
      – Participants' knowledge: students only → professionals and students

    [1] José Miguel Rojas, Gordon Fraser, and Andrea Arcuri. Automated unit test generation during software development: a controlled experiment and think-aloud observations. In: Proceedings of ISSTA 2015, pp. 338–349. ACM, 2015. doi: 10.1145/2771783.2771801.
  7. T1.1 Study 1: Test Generation in Development

    Summary of results (30 participants):
    • Coverage can be higher with test generation
    • Test generation reduces the amount of user activity required
    • Spending more time with test generation improves quality
  8. T1.2 Study 2: Classification of White-Box Tests

    Motivation
    • Question: Are these tests correct?
      ◦ OK: correct w.r.t. the specification
      ◦ WRONG: contradicts the specification
    • Not considered in empirical studies
    • Issues caused in practice:
      ◦ Real efficiency can be worse
      ◦ Classification can be an effort-intensive task

    [TestMethod]
    public void CalculateSumTest284() {
        int[] ints = new int[5] { 4, 5, 6, 7, 8 };
        int i = CalculateSum(0, 0, ints);
        Assert.AreEqual<int>(0, i);
    }

    [TestMethod]
    public void CalculateSumTest647() {
        int[] ints = new int[5] { 4, 5, 6, 7, 8 };
        int i = CalculateSum(0, 4, ints);
        Assert.AreEqual<int>(15, i);
    }

    Goal: How do developers who use test generator tools perform in deciding whether the generated tests encode expected or unexpected behavior?
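The core difficulty is that a generated assertion encodes the *observed* output of the code under test, so a test can pass even when it captures faulty behavior. The specification of the CalculateSum used in the study is not shown on the slide, so the Python sketch below uses a hypothetical inclusive-range sum with a seeded off-by-one fault (both the function and its spec are illustrative, not the study's artifacts):

```python
def calculate_sum(start, end, values):
    # Hypothetical implementation with a seeded off-by-one fault:
    # the element at index `end` is never added.
    total = 0
    for i in range(start, end):   # BUG: should be range(start, end + 1)
        total += values[i]
    return total

def calculate_sum_spec(start, end, values):
    # Reference behavior: inclusive sum of values[start..end].
    return sum(values[start:end + 1])

ints = [4, 5, 6, 7, 8]

# A generator asserts whatever the code under test actually returns...
observed = calculate_sum(0, 4, ints)
assert observed == 22          # "generated" test: passes on the faulty code

# ...but only classification against the spec reveals the test is WRONG:
assert calculate_sum_spec(0, 4, ints) == 30
assert observed != calculate_sum_spec(0, 4, ints)
```

A participant must make exactly this spec-versus-observation comparison for every generated test, which is why classification is effort-intensive.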
  9. T1.2 Study 2: Classification of White-Box Tests

    RQ1: How do developers perform in the classification of generated tests?
    RQ2: How much time do developers spend on the classification?

    Subjects: students only; recruited from a V&V course; basic experience; applied voluntarily
    Objects: sourced from GitHub; 5 methods in 4 repositories; artificial faults (ODC); 3 tests per method
    Context: 15-minute tutorial; experiment portal; Visual Studio; test runs and debugging
    Task: classify all 15 tests; at most 60 minutes; activities recorded; data logged
  10. T1.2 Study 2: Classification of White-Box Tests

    Summary of results (106 participants):
    • RQ1: "... yield only a moderate classification performance ..."
    • RQ2: "... more than one minute is usually required ..."
  11. Thesis 1 – Summary

    Design and execution of empirical studies to investigate the challenges of white-box test generation in practical settings.

    1.1 I designed and conducted a replication of an existing empirical study, with 30 human participants, on using white-box test generation during development to gain further insights into the topic.
    1.2 I designed and conducted an empirical study and its replication with 106 individuals altogether. The studies addressed the white-box test classification performance of the participants.

    Σ I analyzed the use of white-box test generation in two separate empirical studies. Based on the results, I identified and quantified new challenges that may hinder practical white-box test generation, and strengthened the evidence for already known challenges.

    Publication: SQJ'19
  12. Thesis 2: Automated Isolation of Dependencies in White-Box Test Generation
  13. Background

    Motivation: challenges of white-box test generators
    • C3: Low code coverage
    • C4: Large, complex programs
    • A common root cause: environment interaction (DB, service, network)

    bool TransferMoney(Token userToken, long amount, Account destination) {
        if (amount <= 0)
            throw new Exception("Invalid amount to transfer");
        int balance = DB.RunQuery<int>("GetBalance", userToken);
        if (balance < amount)
            throw new Exception("Not enough balance");
        TransferProcessor tp = new TransferProcessor(userToken);
        ProcessedTransfer pt = tp.Process(amount, destination);
        return pt.IsSuccess;
    }
  14. T2.1 Design of the Approach

    Users can also define behavior with input-effect pairs.
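The isolation idea can be sketched language-agnostically: the call to the dependency (here the DB.RunQuery from the TransferMoney example above) is rerouted to a parameterized fake whose return value becomes an extra input of the unit, so the generator can steer both branches; user-supplied input-effect pairs pin the fake's behavior for specific inputs. A minimal Python sketch (all names are illustrative, not AutoIsolator's actual API):

```python
class FakeDb:
    # Parameterized stand-in for the real DB dependency.
    # `default` acts as an extra test input the generator controls;
    # `effects` holds user-defined input-effect pairs.
    def __init__(self, default, effects=None):
        self.default = default
        self.effects = effects or {}

    def run_query(self, name, token):
        return self.effects.get((name, token), self.default)

def transfer_money(db, token, amount):
    # Simplified port of the TransferMoney logic above.
    if amount <= 0:
        raise ValueError("Invalid amount to transfer")
    if db.run_query("GetBalance", token) < amount:
        raise ValueError("Not enough balance")
    return True

# The generator can now cover both branches by choosing the fake's output:
assert transfer_money(FakeDb(default=100), "u1", 50) is True
try:
    transfer_money(FakeDb(default=10), "u1", 50)
except ValueError as e:
    assert "Not enough balance" in str(e)

# A user-defined input-effect pair overrides the default for one input:
db = FakeDb(default=0, effects={("GetBalance", "vip"): 1_000})
assert transfer_money(db, "vip", 500) is True
```

Without the fake, exploring the `balance < amount` branch would require the generator to reason about a real database, which is exactly the environment interaction blocking coverage.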
  15. T2.2 Adaptation and Evaluation

    Experimental evaluation on 44 kLoC of open-source projects
    • RQ1: How does the approach improve statement and branch coverage?
    • RQ2: How much does it increase the time spent on test generation?
  16. Thesis 2 – Summary

    Design and evaluation of an approach that automatically transforms the code under test to decouple it from its dependencies for test generation.

    2.1 I designed an approach to automatically isolate the unit under test from its dependencies for test generation purposes. The approach also generates a parameterized sandbox around the isolated unit that can be utilized by the test generator. I mapped the automated isolation approach to the domain of a concrete white-box test generator (Microsoft Pex/IntelliTest) to demonstrate its feasibility.
    2.2 I conducted a large-scale quantitative evaluation to assess the performance of the approach via the implemented tool. The tool improved the coverage reached by generated tests by around 50-60% in problematic cases.

    Σ I designed a novel, automated, code-transformation-based approach that alleviates the external-dependency problem in white-box test generation.

    Publications: IST'20, DiB'20, PP'17, BME'16, BME'17
  17. Thesis 3: Visually Aiding the Use and Problem Resolution of Symbolic Execution
  18. Background

    Motivation: challenges of white-box test generators
    • C1: Difficulty of understanding generated white-box tests
    • C2: Low trust in white-box test generation techniques and tools

    Idea: visualize symbolic-execution-based test generation
    • The visualization should be clear, traceable, and detailed
    • It should represent all necessary information
      – to help grasp an overview and understanding of the process
      – to support precise problem identification
  19. T3 Design of the Approach

    • Main concept: the symbolic execution tree
    • Elements of the visualization
      – Nodes: shape, color, border, label
      – Edges: color
    • Additional data attached to nodes
      – Source code mapping
      – Path conditions
    • Use cases: engineering, education
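The underlying data model can be sketched as a tree whose nodes carry the attached data (path condition, source mapping) alongside visual attributes, and whose leaves correspond to fully explored paths, i.e. generated tests. A minimal Python sketch using the ClassifyNum example (the structures are illustrative, not SEViz's actual model):

```python
from dataclasses import dataclass, field

@dataclass
class SENode:
    # One node of the symbolic execution tree, with the data the
    # visualization attaches to it (field names are illustrative).
    node_id: int
    path_condition: str      # conjunction of branch constraints so far
    source_line: int         # source code mapping
    shape: str = "ellipse"   # visual attributes: shape, color, ...
    color: str = "white"
    children: list = field(default_factory=list)

def leaves(node):
    # Leaves correspond to explored paths, i.e. generated test cases.
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

# Tree for ClassifyNum: root branches on n > 0, then on n % 2 == 0.
root = SENode(0, "true", 1, children=[
    SENode(1, "n > 0", 2, children=[
        SENode(3, "n > 0 and n % 2 == 0", 3, color="green"),
        SENode(4, "n > 0 and n % 2 != 0", 4, color="green"),
    ]),
    SENode(2, "n <= 0", 6, color="green"),
])

# Each leaf's path condition characterizes one generated test input.
assert [n.node_id for n in leaves(root)] == [3, 4, 2]
```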
  20. T3 Adaptation

    • Mapping of the technique to the domain of Microsoft Pex/IntelliTest
    • Implementation as an IDE extension
  21. T3 Multi-Domain Metrics

    • Identification of additional metrics from related domains using survey papers and intuition
      – Static code metrics (SC)
      – Symbolic-execution-related metrics (SE)
      – Test code metrics (GT)
      – Generic graph metrics (GG)
    • Attachment of metrics to the tree: node (N), path (P), execution (E)
    • Mapping of metrics to problems in symbolic execution, based on experience and intuition
      – Constraint solver (CS)
      – State-space exploration (SSE)
      – Object creation (OC)
      – Environment interaction (EI)
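How a metric attaches at the three levels can be illustrated on a small tree: a node-level (N) metric such as the number of constraints in a node's path condition, a path-level (P) metric such as the depth of each leaf's path, and an execution-level (E) metric aggregated over the whole tree, e.g. maximum depth as a generic graph metric. A Python sketch under those assumptions (the concrete metrics chosen here are examples, not the dissertation's full catalog):

```python
# Tree as parent -> children adjacency; each node stores the number of
# constraints in its path condition (a node-level, SE-related metric).
children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
constraints = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2}

def leaf_depths(node, tree, d=0):
    # Path-level metric: depth of each leaf's path from the root.
    if not tree[node]:
        return {node: d}
    out = {}
    for c in tree[node]:
        out.update(leaf_depths(c, tree, d + 1))
    return out

depths = leaf_depths(0, children)
assert depths == {3: 2, 4: 2, 2: 1}

# Execution-level metrics aggregate over the whole tree (generic graph
# metrics): e.g. maximum path depth and total node count.
assert max(depths.values()) == 2
assert len(constraints) == 5
```

An unusually deep path with many constraints, for instance, would hint at a constraint-solver or state-space-exploration problem in the mapping above.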
  22. Thesis 3 – Summary

    Design of a novel visualization technique for symbolic execution to support understanding and problem identification.

    I designed a generic visual representation of symbolic execution trees that handles additional data related to path conditions, constraints, and generated test cases, for easier understanding and issue identification. I adapted the generic representation to a concrete white-box test generator (Microsoft Pex/IntelliTest) by precisely mapping each generic concept to the concrete domain. Based on the analysis of multiple related domains, I identified and organized several metrics that ease problem identification during test generation based on symbolic execution.

    Σ I proposed a visual representation of symbolic execution trees that can handle additional metadata as well. I implemented the technique for Microsoft Pex/IntelliTest, an advanced symbolic-execution-based test generator.

    Publications: ICST'15, BME'18
  23. Publications

    Related to theses:
               | Thesis 1 | Thesis 2         | Thesis 3
    Journal    | SQJ      | IST, DIB, PerPol |
    Conference |          |                  | ICST
    Workshop   |          | 2x BME           | BME

    Highlights:
    • 8 publications: 4 journal (incl. IST, SQJ), 1 conference (ICST), 3 workshop (BME)
    • 8 independent citations (peer-reviewed)
  24. Applications

    • Tools implemented
      – T2 (AutoIsolator): automated isolation for Microsoft Pex/IntelliTest; IDE extension for immediate dynamic analysis; website: https://ftsrg.github.io/autoisolator
      – T3 (SEViz): visualizes the test generation of Microsoft Pex/IntelliTest; website: https://ftsrg.github.io/seviz; Cao et al. [2] built a tool on the basis of SEViz
    • Education
      – T2: served as the basis of a B.Sc. thesis
      – T3: the implemented SEViz tool is used in a lab of the SWSV course at BME, and served as the basis of two B.Sc. theses
  25. Conclusions

    Thesis 1: Evaluated white-box test generators in practical settings (Study 1, Study 2, new challenges).
      Future work: further replications; new studies for the challenges.
    Thesis 2: Designed and evaluated an approach for automatically isolating dependencies (automated isolation, tool, evaluation).
      Future work: user-oriented empirical study; object state tracking and use.
    Thesis 3: Designed an approach for visualizing symbolic execution and proposed use cases (tool, visualization of the SE tree).
      Future work: user-oriented empirical study; mapping the CFG to the SE tree.