Pizza vs Pinsa - ICSME 2020

Slides of the research paper "Pizza vs Pinsa: On the Perception and Measurability of Unit Test Code Quality", presented at ICSME 2020.

Giovanni Grano

October 01, 2020

Transcript

  1. Pizza vs Pinsa: On the Perception and Measurability of Unit Test Code Quality
     Giovanni Grano, Christian De Iaco, Fabio Palomba, Harald C. Gall
     36th IEEE International Conference on Software Maintenance and Evolution
  2. Motivation: tests of good quality are an essential asset to foster software quality.
     How do we measure test quality?
     Do existing metrics match developers' perception of test quality?
  3. Goals and research questions: 2 research questions.
     What are the features that influence unit test quality?
     Do existing metrics match developers' perception?
     Mixed-method research approach.
  4. RQ1: the practitioner's perspective.
     Interviews: 5 experts, ~60 minutes.
     General discussion; own definition of test quality; summarise measurable factors; initial taxonomy.
  5. RQ1: the practitioner's perspective.
     Survey: 70 developers, March 2020.
     Confirm initial taxonomy; additional factors; rate quality of test snippets.
  6. RQ1: results. Taxonomy:
     structural: size, test design, reusability, readability, maintenance, independence
     executional: time, reliability, infrastructure
     behavioral: (self-)validation, scope, effectiveness, diagnosability
  7. RQ1: results.
     [Figure: two bar charts, one over the categories Structural, Behavioral, Executional, Unclassified, and one over the factors Scope, Test Design, Readability, Independence, Reliability, Others.]
  8. RQ2: the research perspective.
     Survey: 70 developers, 10 test code snippets, 1-5 Likert scale.
     Response variable: score on the Likert scale.
     Independent variables: 11 metrics.
     Proportional odds model.
  9. RQ2: results.
     [Figure: metrics shown are mutation score, coupling between objects, assertion roulette, number of static invocations; thresholds are v.poor | poor, poor | fair, fair | good, good | v.good; significance levels are p<0.001, p<0.01, p<0.05, p<0.1.]
  10. RQ2: findings.
      Metrics can discern low-quality tests from fair ones.
      Poor ability of metrics representing size, complexity, and readability.
      Metrics fail at providing a comprehensive model of perceived test quality.
  11. @giograno90
      First and foremost: what is a pinsa?
      (Recap of the RQ1 results and RQ2 findings slides.)
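Slide 8 names a proportional odds model with the 1-5 Likert score as response, and slide 9 lists the four thresholds between adjacent ratings. The following is a minimal sketch of how such a model turns metric values into rating probabilities, assuming the common formulation P(Y <= j | x) = logistic(theta_j - beta . x). The threshold and coefficient values, the choice of two metrics, and the function names are hypothetical and are not taken from the paper.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(x, beta, thresholds):
    """Return P(Y = j | x) for each ordinal category under a proportional odds model.

    The same coefficient vector beta is shared by all thresholds; only the
    cut points theta_j differ. thresholds must be strictly increasing.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Cumulative probabilities P(Y <= j); the last category always reaches 1.
    cumulative = [logistic(theta - eta) for theta in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def expected_rating(probs):
    """Mean rating on a 1..K scale given the category probabilities."""
    return sum((j + 1) * p for j, p in enumerate(probs))


# Hypothetical cut points: v.poor|poor, poor|fair, fair|good, good|v.good.
thresholds = [-2.0, -0.5, 1.0, 2.5]
# Hypothetical coefficients for two metrics: [mutation score, assertion roulette].
beta = [1.2, -0.8]

probs = category_probabilities([0.9, 0.0], beta, thresholds)
```

With these made-up values, `probs` holds five probabilities, one per Likert rating, that sum to 1. Fitting the thresholds and coefficients to real survey data would normally be left to a statistics package rather than done by hand.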