Pizza vs Pinsa - ICSME 2020

Pizza vs Pinsa: On the Perception and Measurability of Unit Test Code Quality
Giovanni Grano, Christian De Iaco, Fabio Palomba, Harald C. Gall
36th IEEE International Conference on Software Maintenance and Evolution

first and foremost: what is a pinsa?

motivation
tests of good quality: an essential asset to foster software quality
how do we measure test quality?
do existing metrics match developers’ perception of test quality?

goals and research questions
2 research questions:
RQ1: what are the features that influence unit test quality?
RQ2: do existing metrics match developers’ perception?
mixed-method research approach

RQ1: the practitioner’s perspective
interviews: 5 experts, ~60 minutes
general discussion
own definition of test quality
summarise measurable factors
outcome: initial taxonomy

RQ1: the practitioner’s perspective
survey: 70 developers, March 2020
confirm initial taxonomy
additional factors
rate quality of test snippets

RQ1: results
taxonomy
structural: size, test design, reusability, readability, maintenance, independence
executional: time, reliability, infrastructure
behavioral: (self-)validation, scope, effectiveness, diagnosability

RQ1: results
[figure: two bar charts. First chart (axis 0 to 160): Structural, Behavioral, Executional, Unclassified. Second chart (axis 0 to 80): Scope, Test Design, Readability, Independence, Reliability, Others.]

RQ1: findings
multi-faceted concept
non-functional aspects
test code design

RQ2: the research perspective
survey: 70 developers, 10 test code snippets, 1-5 Likert scale
response variable: score on the Likert scale
independent variables: 11 metrics
proportional odds model
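A proportional odds model treats the five Likert levels as ordered categories separated by four cut points (one per boundary, such as v.poor | poor), and assumes that the metrics shift all cumulative odds by the same amount: P(rating <= j) = logistic(theta_j - x·beta). The sketch below only illustrates that mechanism. The cut points, coefficients, and metric values are invented for illustration and are not the study's estimates.

```python
import math

LEVELS = ["v.poor", "poor", "fair", "good", "v.good"]

# Hypothetical cut points theta_1 < ... < theta_4 (NOT taken from the study).
THRESHOLDS = [-2.0, -0.5, 0.5, 2.0]

# Hypothetical coefficients for two of the metrics (NOT the study's estimates).
ILLUSTRATIVE_BETA = {"mutation_score": 1.5, "assertion_roulette": -0.8}


def logistic(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(beta: dict, metrics: dict) -> float:
    """Compute x·beta for the metrics that have a coefficient."""
    return sum(beta[name] * value for name, value in metrics.items())


def category_probabilities(thresholds: list, eta: float) -> list:
    """Return P(rating == level) for each level under proportional odds.

    Cumulative probabilities are logistic(theta_j - eta); the probability of
    a single level is the difference between consecutive cumulative values.
    """
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


if __name__ == "__main__":
    # Hypothetical metric values for one test snippet.
    snippet = {"mutation_score": 0.9, "assertion_roulette": 1.0}
    eta = linear_predictor(ILLUSTRATIVE_BETA, snippet)
    for level, p in zip(LEVELS, category_probabilities(THRESHOLDS, eta)):
        print(f"{level:7s} {p:.3f}")
```

A positive coefficient moves probability mass towards the higher ratings, a negative one towards the lower ratings. In practice such a model would be fitted with a statistics package rather than by hand.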

RQ2: results
[figure: metrics (mutation score, coupling between objects, assertion roulette, number of static invocations) against the rating thresholds v.poor | poor, poor | fair, fair | good, good | v.good; legend: p<0.001, p<0.01, p<0.05, p<0.1]
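The deck does not define mutation score. It is commonly defined as the fraction of generated mutants that the test suite detects ("kills"). A minimal sketch of that common definition follows; the function name and the numbers in the usage example are illustrative, and some tools additionally exclude equivalent mutants from the denominator.

```python
def mutation_score(killed: int, total: int) -> float:
    """Fraction of mutants killed by the tests (common definition).

    Assumes `total` counts all generated mutants; tools that exclude
    equivalent mutants would use a smaller denominator.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= killed <= total:
        raise ValueError("killed must be between 0 and total")
    return killed / total


if __name__ == "__main__":
    # Hypothetical numbers: 45 of 60 mutants are detected by the tests.
    print(mutation_score(45, 60))
```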

RQ2: findings
metrics can discern low-quality tests from fair ones
poor ability of metrics representing size, complexity, and readability
metrics fail at providing a comprehensive model of perceived test quality

implications
existing metrics are not enough
metrics for test explainability
design for test code quality

@giograno90
