Evaluating and Improving White-Box Test Generation

Dávid Honfi
Advisor: Zoltán Micskei, PhD
Public PhD defense, http://hdl.handle.net/10890/15132
Critical Systems Research Group (ftsrg)
Department of Measurement and Information Systems
Budapest University of Technology and Economics

Scope and Motivation

• Software testing is crucial for quality
• Thorough testing requires time and effort
• Automated techniques have been proposed to reduce this effort
• White-box test generation is one of them
• Applying such techniques in practice is non-trivial

Extend the empirical evidence on white-box test generation techniques, and propose novel approaches that facilitate their use in practice.

White-Box Test Generation

    public int ClassifyNum(int n)
    {
        if (n > 0)
        {
            if (n % 2 == 0)
                return 0;
            else
                return 1;
        }
        else
            return 2;
    }

ID   Input [n]   Observed output
T1   0           2
T2   1           1
T3   2           0

Seems to be easy on simple code snippets, but what about real code?
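The three rows of the table correspond to the three feasible paths through ClassifyNum: a white-box generator picks one input per path and records the output it observes. A minimal sketch of that idea as a runnable program follows; the `Check` helper and the class name are illustrative assumptions, not output of any test generator.

```csharp
using System;

public static class ClassifyNumDemo
{
    // Method under test, taken from the slide (made static for the demo).
    public static int ClassifyNum(int n)
    {
        if (n > 0)
        {
            if (n % 2 == 0)
                return 0;
            else
                return 1;
        }
        else
            return 2;
    }

    // Hypothetical helper: throws if the observed output differs
    // from the value recorded in the table.
    private static void Check(string id, int input, int expected)
    {
        int actual = ClassifyNum(input);
        if (actual != expected)
            throw new Exception($"{id}: expected {expected}, got {actual}");
        Console.WriteLine($"{id}: ClassifyNum({input}) == {actual}");
    }

    public static void Main()
    {
        Check("T1", 0, 2); // path: n > 0 is false
        Check("T2", 1, 1); // path: n > 0 and n is odd
        Check("T3", 2, 0); // path: n > 0 and n is even
    }
}
```

Note that each check only pins down what the code currently does; whether 2, 1, and 0 are the intended results is a separate question.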

Research Method and Challenges

Empirical studies
• Study 1: Test generation during development
• Study 2: Classification of generated white-box tests

Identified challenges
• C1: Understanding
• C2: Lack of trust
• C3: Low coverage
• C4: Complex programs

Thesis 1: Empirical Investigation of Practical White-Box Test Generation

Background

• Several existing studies target white-box test generation
• Only some of them employ human participants
• Problem: limited conclusions for practical settings
• Possible solutions:

Replications (Study 1)
• Increase confidence in results
• Build a body of knowledge
• Manage validity concerns

New setting (Study 2)
• Provides novel insights
• Gives freedom in study design
• Yields limited validity

T1.1 Study 1: Test Generation in Development

• Original study by Rojas et al. [1]
• Goal: "Empirically evaluate the effects of using an automated test generation tool during development"
• External differentiated replication with changes including:
  – Test generator tool: EvoSuite → Microsoft Pex/IntelliTest
  – Programming language: Java → C#
  – Participants' knowledge: Students only → Professionals and students

[1] José Miguel Rojas, Gordon Fraser, and Andrea Arcuri. Automated unit test generation during software development: a controlled experiment and think-aloud observations. In: Proceedings of ISSTA 2015, pp. 338–349. ACM, 2015. doi: 10.1145/2771783.2771801.

T1.1 Study 1: Test Generation in Development

Summary of results (30 participants)
• Coverage can be higher with test generation
• Test generation reduces the amount of user activity required
• Spending more time with test generation improves quality

T1.2 Study 2: Classification of White-Box Tests

Motivation
• Question: Are these tests correct?
  ◦ OK: correct w.r.t. specification
  ◦ WRONG: contradicts specification
• Not considered in empirical studies
• Issues caused in practice:
  ◦ Real efficiency can be worse
  ◦ Can be an effort-intensive task

    [TestMethod]
    public void CalculateSumTest284()
    {
        int[] ints = new int[5] { 4, 5, 6, 7, 8 };
        int i = CalculateSum(0, 0, ints);
        Assert.AreEqual<int>(0, i);
    }

    [TestMethod]
    public void CalculateSumTest647()
    {
        int[] ints = new int[5] { 4, 5, 6, 7, 8 };
        int i = CalculateSum(0, 4, ints);
        Assert.AreEqual<int>(15, i);
    }

Goal: How do developers who use test generator tools perform in deciding whether the generated tests encode expected or unexpected behavior?
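To make the OK/WRONG distinction concrete, the sketch below compares what a generated test asserts (the observed output of the implementation) with what a specification demands. Everything about CalculateSum here is an assumption made for illustration: the slides do not give its specification or implementation, so the parameter meaning (start index, element count) and the off-by-one fault are hypothetical, chosen only because they reproduce the asserted values 0 and 15.

```csharp
using System;

public static class CalculateSumDemo
{
    // ASSUMED specification (not from the slides): the sum of `count`
    // elements of `values`, starting at index `start`.
    public static int SpecifiedSum(int start, int count, int[] values)
    {
        int sum = 0;
        for (int i = start; i < start + count; i++)
            sum += values[i];
        return sum;
    }

    // ASSUMED faulty implementation: the loop bound is off by one.
    public static int CalculateSum(int start, int count, int[] values)
    {
        int sum = 0;
        for (int i = start; i < start + count - 1; i++)
            sum += values[i];
        return sum;
    }

    // A generated test asserts whatever the implementation returned.
    // It is OK only if that value also satisfies the specification.
    public static string Classify(int start, int count, int[] values)
    {
        int asserted = CalculateSum(start, count, values);
        int specified = SpecifiedSum(start, count, values);
        return asserted == specified ? "OK" : "WRONG";
    }

    public static void Main()
    {
        int[] ints = { 4, 5, 6, 7, 8 };
        Console.WriteLine(Classify(0, 0, ints)); // asserted 0, specified 0
        Console.WriteLine(Classify(0, 4, ints)); // asserted 15, specified 22
    }
}
```

Under these assumptions both tests pass when run, yet only the first encodes expected behavior. A developer has to notice this by reading the test against the specification, which is the task the study measures.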

T1.2 Study 2: Classification of White-Box Tests

RQ1: How do developers perform in the classification of generated tests?
RQ2: How much time do developers spend with the classification?

Subjects
• Students only
• Source: V&V course
• Basic experience
• Apply voluntarily

Objects
• Source: GitHub
• 5 methods in 4 repos
• Artificial faults (ODC)
• 3 tests per method

Context
• 15-minute tutorial
• Experiment portal
• Visual Studio
• Test runs and debug

Task
• Classify all 15 tests
• At most 60 minutes
• Activities recorded
• Data is logged

T1.2 Study 2: Classification of White-Box Tests

Summary of results (106 participants)
• RQ1: "… yield only a moderate classification performance …"
• RQ2: "… more than one minute is usually required …"

Thesis 1 – Summary

Design and execution of empirical studies to investigate the challenges of white-box test generation in practical settings.

1.1 I designed and conducted a replication of an existing empirical study with 30 human participants on using white-box test generation during development to gain further insights into the topic.

1.2 I designed and conducted an empirical study and its replication with 106 individuals altogether. The studies addressed the white-box test classification performance of the participants.

Σ I have analyzed the use of white-box test generation in two separate empirical studies. Based on the results, I have identified and quantified new challenges that may hinder practical white-box test generation, and also strengthened the evidence for already known challenges.

Publication: SQJ'19

Thesis 2: Automated Isolation of Dependencies in White-Box Test Generation

Background

Motivation: challenges of white-box test generators
• C3: Low code coverage
• C4: Large, complex programs
• One of the common root causes: environment interaction

    bool TransferMoney(Token userToken, long amount, Account destination)
    {
        if (amount <= 0)
            throw new Exception("Invalid amount to transfer");

        // Environment interaction: database access
        int balance = DB.RunQuery<int>("GetBalance", userToken);
        if (balance < amount)
            throw new Exception("Not enough balance");

        // Environment interaction: external dependency
        TransferProcessor tp = new TransferProcessor(userToken);
        ProcessedTransfer pt = tp.Process(amount, destination);
        return pt.IsSuccess;
    }

Environment elements shown in the figure: DB, service, network

T2.1 Design of the Approach

Users can also define behavior with input-effect pairs.

T2.2 Adaptation and Evaluation

Experimental evaluation with 44 kLoC of open source projects
• RQ1: How does it improve statement and branch coverage?
• RQ2: How does it increase the time spent with test generation?

Thesis 2 – Summary

Design and evaluation of an approach that automatically transforms the code under test to isolate it from its dependencies for test generation.

2.1 I designed an approach to automatically isolate the unit under test from its dependencies for test generation purposes. This approach also generates a parameterized sandbox around the isolated unit that can be utilized by the test generator. I mapped the automated isolation approach to the domain of a concrete white-box test generator (Microsoft Pex/IntelliTest) to demonstrate the feasibility of the approach.

2.2 I conducted a large-scale quantitative evaluation to assess the performance of the approach via the implemented tool. The tool was able to improve the coverage reached by generated tests by around 50-60% in problematic cases.

Σ I designed a novel, automated, code transformation based approach for alleviating the external dependency problem in white-box test generation.

Publications: IST'20, DiB'20, PP'17, BME'16, BME'17
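The idea of a parameterized sandbox can be illustrated by hand on the TransferMoney example from the Background slide. The sketch below is a manual illustration under stated assumptions, not the output of the described tool: the `Sandbox` class and its fields are hypothetical, and the `Token` and `Account` parameters are dropped to keep the example self-contained. Values that originally came from the database or the transfer service become plain inputs that a test generator could choose freely.

```csharp
using System;

// Hypothetical sandbox: environment-provided values become fields
// that a test generator can set to steer execution.
public sealed class Sandbox
{
    public int Balance;           // stands in for DB.RunQuery<int>("GetBalance", ...)
    public bool TransferSucceeds; // stands in for tp.Process(...).IsSuccess
}

public static class IsolatedBank
{
    // Hand-transformed variant of TransferMoney: the branching logic is
    // unchanged, but there is no database or network access.
    public static bool TransferMoney(long amount, Sandbox sandbox)
    {
        if (amount <= 0)
            throw new Exception("Invalid amount to transfer");

        int balance = sandbox.Balance;
        if (balance < amount)
            throw new Exception("Not enough balance");

        return sandbox.TransferSucceeds;
    }
}

public static class IsolationDemo
{
    public static void Main()
    {
        var sandbox = new Sandbox { Balance = 100, TransferSucceeds = true };
        Console.WriteLine(IsolatedBank.TransferMoney(50, sandbox));

        try
        {
            IsolatedBank.TransferMoney(500, sandbox);
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

With the environment replaced by inputs, every branch of the method becomes reachable by choosing `amount`, `Balance`, and `TransferSucceeds`, which is why isolation can raise the coverage achieved by generated tests.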

Thesis 3: Visually Aiding the Use and Problem Resolution of Symbolic Execution

Background

Motivation: challenges of white-box test generators
• C1: Difficulty of understanding generated white-box tests
• C2: Low trust in white-box test generation techniques and tools

Idea: visualizing symbolic execution based test generation
• The visualization should be clear, traceable, and detailed
• It should represent all necessary information
  – To help in grasping an overview and understanding of the process
  – To support precise problem identification

T3 Design of the Approach

• Main concept: symbolic execution tree
• Elements of visualization
  – Nodes: shape, color, border, label
  – Edges: color
• Additional data attached to nodes
  – Source code mapping
  – Path conditions
• Use cases
  – Engineering
  – Education
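A symbolic execution tree with path conditions and source mapping attached to its nodes can be sketched as a small data structure. The example below hand-builds such a tree for the ClassifyNum method from the introductory slide. The `SeNode` type, its field names, and the line numbers are illustrative assumptions and do not reflect the data model of any particular tool.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type for a symbolic execution tree.
public sealed class SeNode
{
    public string PathCondition; // constraint accumulated on the path to this node
    public int SourceLine;       // illustrative mapping back to the source code
    public List<SeNode> Children = new List<SeNode>();

    public SeNode(string pathCondition, int sourceLine)
    {
        PathCondition = pathCondition;
        SourceLine = sourceLine;
    }
}

public static class SeTreeDemo
{
    // Hand-built tree for ClassifyNum: one leaf per feasible path.
    public static SeNode Build()
    {
        var root = new SeNode("true", 1);

        var positive = new SeNode("n > 0", 2);
        positive.Children.Add(new SeNode("n > 0 && n % 2 == 0", 3));
        positive.Children.Add(new SeNode("n > 0 && n % 2 != 0", 4));

        root.Children.Add(positive);
        root.Children.Add(new SeNode("n <= 0", 5));
        return root;
    }

    // Collects the path conditions of all leaves in depth-first order.
    public static List<string> LeafConditions(SeNode node)
    {
        var result = new List<string>();
        if (node.Children.Count == 0)
            result.Add(node.PathCondition);
        foreach (var child in node.Children)
            result.AddRange(LeafConditions(child));
        return result;
    }

    public static void Main()
    {
        foreach (var condition in LeafConditions(Build()))
            Console.WriteLine(condition);
    }
}
```

Each leaf condition describes the inputs that drive execution down one path, so solving the three leaf conditions yields one test input per path. A visualization can then render the nodes and reveal the attached data on demand.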

T3 Adaptation

• Mapping of the technique to the domain of Microsoft Pex/IntelliTest
• Implementation as an IDE extension

T3 Multi-Domain Metrics

• Identification of additional metrics from related domains using survey papers and intuition
  – Static code metrics (SC)
  – Symbolic execution related metrics (SE)
  – Test code metrics (GT)
  – Generic graph metrics (GG)
• Attachment of metrics to the tree: node (N), path (P), execution (E)
• Mapping of metrics to problems in SE based on experience and intuition
  – Constraint solver (CS)
  – State space exploration (SSE)
  – Object creation (OC)
  – Environment interaction (EI)

Thesis 3 – Summary

Design of a novel visualization technique for symbolic execution to support understanding and problem identification.

• I designed a generic visual representation of symbolic execution trees that handles additional data related to the path conditions, constraints, and generated test cases, for easier understanding and issue identification.
• I adapted the generic representation to a concrete white-box test generator (Microsoft Pex/IntelliTest) by precisely mapping each generic concept to the concrete domain.
• Based on the analysis of multiple related domains, I identified and organized several metrics that ease problem identification during test generation based on symbolic execution.

Σ I proposed a possible visual representation of symbolic execution trees that can handle additional metadata as well. I implemented the technique for Microsoft Pex/IntelliTest, an advanced symbolic execution-based test generator.

Publications: ICST'15, BME'18

Summary

Publications

Related to theses

             Thesis 1   Thesis 2           Thesis 3
Journal      SQJ        IST, DIB, PerPol   -
Conference   -          -                  ICST
Workshop     -          2x BME             BME

Highlights
• 8 publications
  – 4 journal (incl. IST, SQJ)
  – 1 conference (ICST)
  – 3 workshop (BME)
• 8 independent citations (peer-reviewed)

Applications

• Tools implemented
  – T2 (AutoIsolator): Automated isolation for Microsoft Pex/IntelliTest
    – IDE extension for immediate dynamic analysis
    – Website: https://ftsrg.github.io/autoisolator
  – T3 (SEViz): Visualizing the test generation of Microsoft Pex/IntelliTest
    – Website: https://ftsrg.github.io/seviz
    – Cao et al. [2] built a tool on the basis of SEViz
• Education
  – T2: Served as a basis of a B.Sc. thesis
  – T3: Using the implemented SEViz tool
    – Lab in the SWSV course at BME
    – Basis of two B.Sc. theses

Conclusions

Thesis 1
• Evaluate white-box test generators in practical settings
• Future work
  – Further replications
  – New studies for challenges

Thesis 2
• Design and evaluate an approach for automatically isolating dependencies
• Future work
  – User-oriented empirical study
  – Object state tracking and use

Thesis 3
• Design an approach for visualization, and propose use cases
• Future work
  – User-oriented empirical study
  – Mapping CFG to SE tree
