Empirical Results on Cloning and Clone Detection

Slide 1

Slide 1 text

Empirical Results on Cloning and Clone Detection
Stefan Wagner (@prof_wagnerst)
www.uni-stuttgart.de
Universität Bremen, 27 January 2016

Slide 2

Slide 2 text

You can copy, share and change, film and photograph, blog, live-blog and tweet this presentation, provided that you attribute it to its author and respect the rights and licences of its parts. Based on templates by @SMEasterbrook and @ethanwhite.

Slide 3

Slide 3 text

Technische Universität München

Slide 4

Slide 4 text

Class A Class B

Slide 5

Slide 5 text

Class A Class B

Slide 6

Slide 6 text

Class A Class B

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

Often 20%–30% redundancy

Slide 9

Slide 9 text

We need to detect clones reliably and automatically.

Slide 10

Slide 10 text

Types of Clones
Type 1: an exact copy without modifications (except for whitespace and comments)
Type 2: a syntactically identical copy; only variable, type, or function identifiers have been changed
Type 3: a copy with further modifications; statements have been changed, added, or removed
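To make the three clone types concrete, here is a small illustrative sketch in Python. The functions are hypothetical and are not taken from any of the studied systems. The function bodies are the cloned fragments; the function names differ only so that all copies can live in one file.

```python
def total_price(items, tax_rate):
    # Original fragment
    total = 0
    for item in items:
        total += item
    return total * (1 + tax_rate)


def total_price_type1(items, tax_rate):
    total = 0
    for item in items:  # Type 1: only comments and whitespace differ
        total += item

    return total * (1 + tax_rate)


def total_price_type2(values, rate):
    # Type 2: same syntax, identifiers renamed
    acc = 0
    for v in values:
        acc += v
    return acc * (1 + rate)


def total_price_type3(values, rate):
    # Type 3: a statement was added (negative values are skipped)
    acc = 0
    for v in values:
        if v < 0:
            continue
        acc += v
    return acc * (1 + rate)
```

The type-3 copy behaves differently for negative inputs, which is the kind of inconsistency that can stay unnoticed if the original is not changed in the same way.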

Slide 11

Slide 11 text

Clone detection: Processing steps
Storage → load → tokenise & normalise → find duplicates → extract clones → visualise
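The tokenise, normalise and find-duplicates steps can be sketched roughly as follows. This is a minimal sketch, not the detector used in the studies: it assumes Python source as input, uses Python's standard `tokenize` module, and the function names and the fixed `window` size are choices made for this example only. Merging overlapping windows into maximal clones (the "extract clones" step) and visualisation are left out.

```python
import io
import keyword
import tokenize
from collections import defaultdict


def normalised_tokens(source):
    """Tokenise Python source, drop comments and layout tokens,
    and replace identifiers and literals by placeholders."""
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            result.append(("ID", tok.start[0]))    # identifier
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            result.append(("LIT", tok.start[0]))   # literal
        else:
            result.append((tok.string, tok.start[0]))
    return result


def find_clone_candidates(sources, window=8):
    """Return every window of normalised tokens that occurs more than once,
    mapped to its (file name, start line) locations."""
    index = defaultdict(list)
    for name, source in sources.items():
        tokens = normalised_tokens(source)
        for i in range(len(tokens) - window + 1):
            key = tuple(text for text, _ in tokens[i:i + window])
            index[key].append((name, tokens[i][1]))
    return {key: locs for key, locs in index.items() if len(locs) > 1}
```

Because identifiers and literals are replaced by placeholders, this sketch finds type-1 and type-2 clones; type-3 clones would need a matching step that tolerates gaps.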

Slide 12

Slide 12 text

Measures for cloning
• Number of clone groups/clone instances
• Size of largest clone/cardinality of most frequent clone
• Cloned Statements: number of statements in the system that are part of at least one clone
• Clone Coverage
  – #Cloned Statements / #Statements
  – Probability that a randomly chosen statement is part of a clone
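As a worked example of the last two measures, the following sketch computes cloned statements and clone coverage. It assumes that statements are numbered consecutively across the whole system and that each clone instance is given as an inclusive range of statement indices; this input format is an assumption for illustration only.

```python
def clone_coverage(total_statements, clone_instances):
    """clone_instances: iterable of (first_stmt, last_stmt) index ranges, inclusive.

    Returns (#cloned statements, clone coverage). A statement covered by
    several clone instances is counted only once.
    """
    cloned = set()
    for first, last in clone_instances:
        cloned.update(range(first, last + 1))
    return len(cloned), len(cloned) / total_statements
```

For a system with 100 statements and clone instances at statements 0–9, 5–14 and 50–59, this yields 25 cloned statements and a clone coverage of 0.25.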

Slide 13

Slide 13 text

Visualisation of clone detection results
• Compare View (~20 LOC)
• Seesoft View (~400 LOC)
• Tree Maps (>1,000,000 LOC)
• Trends over Time

Slide 14

Slide 14 text

SME Study Technology Transfer of Quality Assurance Techniques

Slide 15

Slide 15 text

Cloning results
• 3 study objects: clone coverage 13.7–25.5%, blow-up 110–123%
• 1 study object: clone coverage 68–79.4%, blow-up 239–336%
• 1 study object: clone coverage 36.7–45.4%, blow-up 137–150%
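The slide does not define blow-up. The sketch below assumes the common reading that blow-up is the size of the system relative to a hypothetical redundancy-free version in which only one instance of every clone group is kept. It further assumes that clone groups do not overlap; the input format is made up for this example.

```python
def blow_up(total_statements, clone_groups):
    """clone_groups: iterable of (statements_per_instance, number_of_instances).

    Assumption: blow-up = system size / redundancy-free size, where all but
    one instance of each (non-overlapping) clone group count as redundant.
    """
    redundant = sum(length * (instances - 1) for length, instances in clone_groups)
    redundancy_free = total_statements - redundant
    return total_statements / redundancy_free
```

Under these assumptions, a system with 1000 statements, one clone group of three 10-statement instances and one group of two 20-statement instances has 40 redundant statements and a blow-up of about 104%.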

Slide 16

Slide 16 text

Perceived Usefulness
Questions (answers for clone detection):
• Experience in ASA techniques: never
• Relevance of study results: high
• Priority in future QA plans: very high
Gleirscher, Irlbeck, Wagner, Software Quality Journal, 2013

Slide 17

Slide 17 text

1 Effects of Code Clones

Slide 18

Slide 18 text

Inconsistencies Can you spot the difference?

Slide 19

Slide 19 text

How problematic are these inconsistencies (and clones)?
Indicating harmfulness
• [Lague97]: inconsistent evolution of clones in industrial telecom software.
• [Monden02]: higher revision number for files with clones in legacy software.
• [Kim05]: substantial amount of coupled changes to code clones.
• [Li06], [SuChiu07], [Aversano07], [Bakota07]: discovery of bugs through search for inconsistent clones or clone evolution analysis.
Doubting harmfulness
• [Krinke07]: inconsistent clones hardly ever become consistent later.
• [Geiger06]: failure to statistically verify impact of clones on change couplings.
• [Lozano08]: failure to statistically verify impact of clones on changeability.
• [Göde11]: most changes intentionally inconsistent.
• [Rahman12]: no statistically significant impact on faults.

Slide 20

Slide 20 text

Our First Study at ICSE 2009
• Manual inspection of inconsistent clones by system developers
  – No indirect measures of consequences of cloning
• Both industrial and open source software analysed
• Quantitative data
Deissenboeck, Juergens, Hummel, Wagner, ICSE, 2009

Slide 21

Slide 21 text

Research Questions
RQ1: Are clones changed inconsistently? |IC| / |C|
RQ2: Are inconsistent clones created unintentionally? |UIC| / |IC|
RQ3: Can inconsistent clones be indicators for faults in real systems? |F| / |IC|, |F| / |UIC|
Sets:
• C: clone groups (exact and inconsistent)
• IC: inconsistent clone groups
• UIC: unintentionally inconsistent clone groups
• F: faulty clone groups

Slide 22

Slide 22 text

Study Design
1. Clone group candidate detection (yields tool-detected clone group candidates CC)
  • Novel algorithm
  • Tailored to target program
2. False positive removal (CC → C, IC)
  • Manual inspection of all inconsistent and ¼ of the exact CCs
  • Performed by researchers
3. Assessment of inconsistencies (IC → UIC, F)
  • All inconsistent clone groups inspected
  • Performed by developers
Sets: clone groups C (exact and inconsistent), inconsistent clone groups IC, unintentionally inconsistent clone groups UIC, faulty clone groups F

Slide 23

Slide 23 text

Study Objects
System     Organization  Language  Age (years)  Size (kLoC)
A          Munich Re     C#        6            317
B          Munich Re     C#        4            454
C          Munich Re     C#        2            495
D          LV 1871       Cobol     17           197
Sysiphus   TUM           Java      8            281
Munich Re: international reinsurance company, 37,000 employees
LV 1871: Munich-based life-insurance company, 400 employees
Sysiphus: open source collaboration environment for distributed software development, developed at TUM

Slide 24

Slide 24 text

Results
Project                 A    B    C    D    Sys.  Sum
Clone groups |C|        286  160  326  352  303   1427
Inconsistent CGs |IC|   159  89   179  151  146   724
Unint. incons. |UIC|    51   29   66   15   42    203
Faulty CGs |F|          19   18   42   5    23    107
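The ratios named in the research questions follow directly from the counts on this slide. The short script below only redoes that arithmetic; the variable names are chosen for this example.

```python
# Counts per project (A, B, C, D, Sysiphus) as reported on the results slide.
counts = {
    "C":   [286, 160, 326, 352, 303],
    "IC":  [159, 89, 179, 151, 146],
    "UIC": [51, 29, 66, 15, 42],
    "F":   [19, 18, 42, 5, 23],
}
totals = {name: sum(values) for name, values in counts.items()}

ratios = {
    "RQ1 |IC| / |C|": totals["IC"] / totals["C"],
    "RQ2 |UIC| / |IC|": totals["UIC"] / totals["IC"],
    "RQ3 |F| / |IC|": totals["F"] / totals["IC"],
    "RQ3 |F| / |UIC|": totals["F"] / totals["UIC"],
}

for label, value in ratios.items():
    print(f"{label}: {value:.2f}")
```

Over all five systems, about half of the clone groups are inconsistent (0.51), 0.28 of the inconsistent groups are unintentionally inconsistent, 0.15 of the inconsistent groups are faulty, and 0.53 of the unintentionally inconsistent groups are faulty.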

Slide 25

Slide 25 text

Our Second Study
• Investigating evolution of type-3 clones
• Relationship with documented faults from issue tracker
• Industrial systems
Accepted at SANER 2016

Slide 26

Slide 26 text

Research Questions
RQ1: Do software systems contain type-3 clones? |CT3| / |C|
RQ2: Do type-3 clones contain documented faults? |CT3F| / |CT3|
RQ3: Are developers aware of type-3 clones? |IMS| / |IM|, interviews with key developers
Sets:
• C: clone groups (exact and inconsistent)
• CT3: inconsistent clone groups
• CT3F: faulty clone groups

Slide 27

Slide 27 text

Data Collection and Analysis
[Diagram; recoverable labels: versions v1, v2, v3; Extract; Analyse; Query for relationships and evolution; HTML dashboard]

Slide 28

Slide 28 text

Study Objects
System  Size (KLOC)  Age (years)  Developers
A       253          4            10
B       332          5            5
C       454          4            10
All systems: Java, automotive domain

Slide 29

Slide 29 text

Quantitative Results
System                                          A     B     C     Overall
Share of type-3 clones                          0.56  0.23  0.79  0.52
Share of faulty type-3 clone classes            0.33  0.05  0.03  0.17
Share of simultaneously modified type-3 clones  0.58  0.89  0.92  0.85

Slide 30

Slide 30 text

Qualitative Results
System: A B C
• General clone awareness: x
• No general clone awareness: x x
• No specific clone awareness: x x
• No clone check while bug fixing: x x
• Clone warning while developing: x
• Common code ownership: x
• Discussion about co-changes: x

Slide 31

Slide 31 text

Conclusions
• About half of all clone classes are type-3 clones.
• The rate of faulty type-3 clones is about 17%.
• The systems differ in how aware developers are of clones and inconsistencies.
• This awareness seems to affect how many faults are related to type-3 clones.
• Further studies should take this into account.
• Making developers aware of clones still seems worthwhile.

Slide 32

Slide 32 text

2 Functional Similarity

Slide 33

Slide 33 text

Functional Similarities
• Not necessarily syntactically similar
• Type-4 clone: functionally similar code fragment regarding I/O behaviour

Slide 34

Slide 34 text

First Idea
• Execute candidate code fragments on random input
• Compare output
[Diagram; labels: Source Code, Libraries, Detection Pipeline, Type-4 Clones]
Deissenboeck, Heinemann, Hummel, Wagner, CSMR 2012
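The idea of running fragments on random input and comparing their output can be sketched as follows. This is a strongly simplified illustration, not the detection pipeline from the cited paper: it assumes that every candidate fragment is a Python function taking two integers, and all names (`io_signature`, `find_type4_candidates`, the example functions) are made up for this sketch.

```python
import random


def io_signature(func, inputs):
    """Run func on every input and record its output (or the exception type)."""
    outputs = []
    for args in inputs:
        try:
            outputs.append(("ok", func(*args)))
        except Exception as exc:
            outputs.append(("error", type(exc).__name__))
    return tuple(outputs)


def find_type4_candidates(fragments, runs=50, seed=0):
    """Group fragments that produce identical outputs on the same random inputs."""
    rng = random.Random(seed)
    inputs = [(rng.randint(-100, 100), rng.randint(-100, 100)) for _ in range(runs)]
    groups = {}
    for name, func in fragments.items():
        groups.setdefault(io_signature(func, inputs), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical candidate fragments: two syntactically different maximum
# functions and one unrelated function.
def max_branch(a, b):
    if a > b:
        return a
    return b


def max_sorted(a, b):
    return sorted([a, b])[-1]


def add(a, b):
    return a + b
```

Calling `find_type4_candidates({"max_branch": max_branch, "max_sorted": max_sorted, "add": add})` groups the two maximum functions together. Equal output on random input is only evidence of functional similarity, not proof, and the approach depends on generating inputs that actually exercise the code.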

Slide 35

Slide 35 text

Study Objects
System             SLOC
Commons Lang       17,504
Freemind           51,762
Jabref             74,586
Jetty              29,8
JHotDraw           78,902
Info1 submissions  8 – 55

Slide 36

Slide 36 text

Percentage of fragments that are type-4 clones
[Bar chart. Labels: Freemind, JHotDraw, Jetty, JabRef, Comm. Lang, Info1. Values (%): 43.75, 3.51, 2.64, 1.03, 0.64, 0.55]

Slide 37

Slide 37 text

Discussion
• Low results: no type-4 clones or flaws in detection?
• Main limitation: random testing approach
  – No input can be generated, or the generated input does not achieve sufficient code coverage
• Notion of I/O similarity may not be suitable
  – e.g. different data types or signatures
• Further research required to quantify these problems

Slide 38

Slide 38 text

So how are functionally similar clones (FSC) different?
• RQ 1: What share of independently developed similar programs are type-1–3 clones?
• RQ 2: What are the differences between FSC that go beyond type-1–3 clones?
• RQ 3: What share of FSC can be detected by a type-4 clone detector?
• RQ 4: What should a benchmark contain that represents the differences between FSC?
Wagner et al., PeerJ Preprints, https://dx.doi.org/10.7287/peerj.preprints.1516v1

Slide 39

Slide 39 text

Data Collection and Analysis
[Diagram; recoverable labels: Sol. 1, Sol. 2, Sol. 3; Extract; Analyse; Share of syntactic similarity; Manual qualitative analysis; CCCD; Deckard; Characteristics of differences; Benchmark; HTML dashboard]

Slide 40

Slide 40 text

How syntactically similar are FSC? (in %)
[Bar chart. Languages: Java, C. Series per language: ConQAT (partial), Deckard (partial), ConQAT (full), Deckard (full). Values: 0.01, 0, 1.73, 0, 1.44, 0.87, 11.48, 11.53]

Slide 41

Slide 41 text

What are other differences in FSC?
• Algorithms
• Input/output
• Libraries
• Object-oriented design
• Data structures
Degree of difference: low, medium, high

Slide 42

Slide 42 text

What can CCCD detect?
Full and partial clone recall […] for CCCD (in %)
         Mean   SD
Partial  16.03  0.07
Full     0.10   0.00

Slide 43

Slide 43 text

A Benchmark to represent these differences
[Figure: Structure of the benchmark set. Language (Java, C); Category (Data, OO-Design, ...); Degree of Diff. (low, medium, high); Clone Kind (partial, full); Solution (file): left, right]
Available at: https://github.com/SE-Stuttgart/clone-study

Slide 44

Slide 44 text

Conclusions
Lessons Learned
• Independently developed FSCs have very little syntactic similarity.
• Type-1–3 detectors will not reliably detect them.
• Newer approaches, such as CCCD, improve on that, but not by much.
Future Work
• Future proposals for type-4 detectors can use the categorisation and the benchmark as a "to-do" list.
• Probably a combination of static and dynamic analyses is needed.

Slide 45

Slide 45 text

Outlook
• Factors influencing the effects of cloning
• Detector for functionally similar code

Slide 46

Slide 46 text

We need to detect clones reliably and automatically.

Slide 47

Slide 47 text

Pictures Used in this Slide Deck
"Mercurial Logo" by Mackall (http://www.selenic.com/hg-logo/)