Slide 1

Slide 1 text

Riaan Cornelius Using forensic techniques for targeted refactoring Crafting Code

Slide 2

Slide 2 text

Who am I > More than a decade of software dev experience > Mobile app developer by day > Purveyor of strange topics by night > I’ve dabbled in AI, computer vision, robotics and even cooking > Please remember to rate my talk: http://www.devconf.co.za/rate

Slide 3

Slide 3 text

Why do we refactor? > As a developer, what is your job?

Slide 4

Slide 4 text

Why do we refactor?

Slide 5

Slide 5 text

Why do we refactor?

Slide 6

Slide 6 text

Why do we refactor?

Slide 7

Slide 7 text

Why do we refactor?

Slide 8

Slide 8 text

Why do we refactor? > Maintenance is expensive

Slide 9

Slide 9 text

The enemy of change > Complexity > If our job is to understand code, how do we make that job easier?

Slide 10

Slide 10 text

Some (potentially) useful tools > Static analysis > Complexity metrics > Code reviews > Tests

Slide 11

Slide 11 text

Tools I used > Git (specifically git log) > Code Maat > Python > D3.js (JavaScript library)

Slide 12

Slide 12 text

Forget the tools > It’s not about the tools, but rather the techniques > These tools simplify some parsing, processing or visualisation > You can write your own scripts for any of these functions

Slide 13

Slide 13 text

Problems of scale > In large systems, how do you prioritise improvements?

Slide 14

Slide 14 text

The problem with complexity metrics > Complexity is only a problem if you need to deal with it

Slide 15

Slide 15 text

Offender profiling > You probably know something about offender profiling. > Hollywood loves it: • The Silence of the Lambs • Numb3rs • Criminal Minds • NCIS • Many more…

Slide 16

Slide 16 text

Offender profiling > There is one serious limitation: it only works in Hollywood

Slide 17

Slide 17 text

Geographic profiling > Based on statistics and psychology. > Same principle as a police officer sticking pins in a map

Slide 18

Slide 18 text

Geographic profiling

Slide 19

Slide 19 text

Applying geographical profiling to code > What if a hotspot analysis could narrow down areas of bad code?

Slide 20

Slide 20 text

Exploring the geography of code

Slide 21

Slide 21 text

Add a spatial component > Hopefully you all use a VCS. > We need to focus on areas with high developer activity

Slide 22

Slide 22 text

Add a spatial component
> git log --pretty=format:'[%h] %an %ad %s' --date=short --numstat > git.log
> maat.bat -l git.log -c git -a revisions > metric_data.csv
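To show what a revisions analysis boils down to, here is a minimal Python sketch that counts how many commits touched each file in a log produced with the format above. It is not Code Maat's implementation; the function name and the sample log are made up for illustration, and renames (which git reports as "old => new" paths) are not handled.

```python
import re
from collections import Counter

# A --numstat line looks like "<added>\t<deleted>\t<path>";
# binary files show "-" instead of numbers.
NUMSTAT_LINE = re.compile(r"^(\d+|-)\t(\d+|-)\t(.+)$")


def count_revisions(log_text):
    """Return (path, revisions) pairs, most frequently changed first."""
    revisions = Counter()
    for line in log_text.splitlines():
        match = NUMSTAT_LINE.match(line)
        if match:
            # Each file appears once per commit, so one line = one revision.
            revisions[match.group(3)] += 1
    return revisions.most_common()


# Hypothetical log excerpt, only to exercise the parser.
SAMPLE_LOG = (
    "[a1b2c3d] Alice 2013-01-10 Fix session handling\n"
    "12\t3\tsrc/Session.java\n"
    "1\t1\tbuild.gradle\n"
    "\n"
    "[d4e5f6a] Bob 2013-01-11 Tidy build\n"
    "4\t0\tbuild.gradle\n"
)

for path, revs in count_revisions(SAMPLE_LOG):
    print(f"{path},{revs}")
```

Files at the top of this list are where developers spend their effort, which is the "spatial component" the hotspot analysis needs.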

Slide 23

Slide 23 text

Add a spatial component

Slide 24

Slide 24 text

Combine complexity and effort

Slide 25

Slide 25 text

Profiling your codebase > Choose a timespan for your analysis > Get frequency data > Add complexity data > Merge complexity and effort > Visualise this data

Slide 26

Slide 26 text

Profiling your codebase > We’ll look at the Hibernate ORM > git clone https://github.com/hibernate/hibernate-orm.git

Slide 27

Slide 27 text

Profiling your codebase > Choosing a timeframe > Don’t look at the whole life of the project > The timeframe you use depends on your development methodology • Between releases • Over iterations • Around significant events (reorganisation of code or teams)

Slide 28

Slide 28 text

Profiling your codebase
> Generate a log:
> git log --pretty=format:'[%h] %an %ad %s' --date=short --numstat --before=2013-09-05 --after=2012-01-01 > hib_evo.log
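If you prefer to drive this step from a script, the same command can be assembled in Python. This is only a sketch: the helper names build_log_command and write_log are hypothetical, and the git flags are the ones shown on the slide.

```python
import subprocess


def build_log_command(after, before):
    """Build the git log invocation for a date-bounded analysis window."""
    return [
        "git", "log",
        "--pretty=format:[%h] %an %ad %s",  # no shell here, so no quoting needed
        "--date=short",
        "--numstat",
        f"--before={before}",
        f"--after={after}",
    ]


def write_log(repo_dir, out_path, after, before):
    """Run git inside repo_dir and write its output to out_path."""
    with open(out_path, "w", encoding="utf-8") as out:
        subprocess.run(build_log_command(after, before),
                       cwd=repo_dir, stdout=out, check=True)


print(" ".join(build_log_command("2012-01-01", "2013-09-05")))
```

Calling write_log("hibernate-orm", "hib_evo.log", "2012-01-01", "2013-09-05") would reproduce the log used in the following slides, assuming the clone lives in a directory called hibernate-orm.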

Slide 29

Slide 29 text

Profiling your codebase
> A summary of the changes shows some interesting things:
prompt> maat -l hib_evo.log -c git -a summary
statistic,value
number-of-commits,1346
number-of-entities,10193
number-of-entities-changed,18258
number-of-authors,89
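The four numbers in that summary can be derived from the log with very little code. The sketch below is my reading of what they mean (distinct files versus total file changes), not Code Maat's actual implementation, and the sample log is invented.

```python
import re

# Header written by --pretty=format:'[%h] %an %ad %s' with --date=short.
HEADER = re.compile(r"^\[([0-9a-f]+)\] (.+?) (\d{4}-\d{2}-\d{2})(?: |$)")
NUMSTAT = re.compile(r"^(\d+|-)\t(\d+|-)\t(.+)$")


def summarise(log_text):
    """Count commits, distinct files, total file changes and authors."""
    commits, changes = 0, 0
    authors, entities = set(), set()
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            commits += 1
            authors.add(header.group(2))
            continue
        change = NUMSTAT.match(line)
        if change:
            entities.add(change.group(3))
            changes += 1
    return {
        "number-of-commits": commits,
        "number-of-entities": len(entities),
        "number-of-entities-changed": changes,
        "number-of-authors": len(authors),
    }


# Hypothetical two-commit log, only to exercise the parser.
SAMPLE_LOG = (
    "[a1b2c3d] Alice 2013-01-10 Fix session handling\n"
    "12\t3\tsrc/Session.java\n"
    "1\t1\tbuild.gradle\n"
    "\n"
    "[d4e5f6a] Bob 2013-01-11 Tidy build\n"
    "4\t0\tbuild.gradle\n"
)

for name, value in summarise(SAMPLE_LOG).items():
    print(f"{name},{value}")
```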

Slide 30

Slide 30 text

Profiling your codebase
> Analyzing change frequencies:
> maat -l hib_evo.log -c git -a revisions > hib_freqs.csv

Slide 31

Slide 31 text

Profiling your codebase
> Calculate complexity
> Complexity by lines of code?
> Bad metric, but no worse than others…
> cloc ./ --by-file --csv --quiet --report-file=hib_lines.csv
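If cloc is not available, a rough stand-in for "lines of code" is easy to write yourself. This sketch only skips blank lines, so unlike cloc it still counts comments; the function name and sample are hypothetical.

```python
def count_code_lines(source):
    """Count non-blank lines in a piece of source text."""
    return sum(1 for line in source.splitlines() if line.strip())


SAMPLE = "package demo;\n\nclass A {\n}\n"
print(count_code_lines(SAMPLE))
```

Because the metric is only used to rank files against each other, a crude count like this is often good enough for a first pass.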

Slide 32

Slide 32 text

Profiling your codebase
> Combine complexity and effort:
> python scripts/merge_comp_freqs.py hib_freqs.csv hib_lines.csv
module,revisions,code
build.gradle,79,402
hibernate-core/.../persister/entity/AbstractEntityPersister.java,44,3983
hibernate-core/.../cfg/Configuration.java,40,2673
hibernate-core/.../internal/SessionImpl.java,39,2097
hibernate-core/.../internal/SessionFactoryImpl.java,34,1384
…
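The merge itself is a join on file name. The following sketch is not the merge_comp_freqs.py script from the talk; it assumes the frequency CSV has the columns entity and n-revs and the cloc CSV has the columns filename and code, and the sample rows are invented.

```python
import csv
import io


def merge_complexity_and_effort(freqs_csv, lines_csv):
    """Join change frequencies with line counts on file name.

    Returns (module, revisions, code) tuples, most revised first.
    """
    code_by_file = {}
    for row in csv.DictReader(io.StringIO(lines_csv)):
        name = row["filename"]
        if name.startswith("./"):  # cloc reports paths relative to "./"
            name = name[2:]
        code_by_file[name] = int(row["code"])

    merged = []
    for row in csv.DictReader(io.StringIO(freqs_csv)):
        entity = row["entity"]
        if entity in code_by_file:  # skip files that no longer exist
            merged.append((entity, int(row["n-revs"]), code_by_file[entity]))
    merged.sort(key=lambda item: item[1], reverse=True)
    return merged


# Hypothetical inputs, only to exercise the join.
FREQS = "entity,n-revs\nsrc/A.java,7\nsrc/Gone.java,5\nbuild.gradle,9\n"
LINES = ("language,filename,blank,comment,code\n"
         "Java,./src/A.java,10,5,200\n"
         "Gradle,./build.gradle,3,1,40\n")

print("module,revisions,code")
for module, revisions, code in merge_complexity_and_effort(FREQS, LINES):
    print(f"{module},{revisions},{code}")
```

Files that appear in the history but not in the current tree are dropped, since a deleted file cannot be refactored.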

Slide 33

Slide 33 text

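Profiling your codebase
> Now we can finally get to the fun part: Visualisation
> I’m using a sample D3.js circle-packing algorithm
> Due to security restrictions in modern browsers, the page has to be served over HTTP rather than opened straight from disk:
> python -m SimpleHTTPServer 8888 (Python 2; on Python 3 use python -m http.server 8888)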
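Circle packing needs hierarchical input, so the flat module,revisions,code rows have to be turned into a tree first. Here is one possible sketch: the field names size (lines of code, for the circle area) and weight (revisions scaled to 0..1, for the colour) are assumptions and must match whatever your D3.js page reads.

```python
import json


def build_hierarchy(rows):
    """Turn (module, revisions, code) rows into a nested name/children tree."""
    root = {"name": "root", "children": []}
    max_revs = max((revs for _, revs, _ in rows), default=1) or 1
    for module, revs, code in rows:
        node = root
        parts = module.split("/")
        for part in parts[:-1]:
            # Reuse the directory node if it already exists.
            child = next((c for c in node["children"]
                          if c["name"] == part and "children" in c), None)
            if child is None:
                child = {"name": part, "children": []}
                node["children"].append(child)
            node = child
        node["children"].append(
            {"name": parts[-1], "size": code, "weight": revs / max_revs})
    return root


# Hypothetical rows, only to show the shape of the output.
ROWS = [("build.gradle", 4, 100), ("core/Session.java", 2, 300)]
print(json.dumps(build_hierarchy(ROWS), indent=2))
```

Large, strongly coloured circles are then the hotspots: big files that change often.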

Slide 34

Slide 34 text

Profiling your codebase

Slide 35

Slide 35 text

Profiling your codebase

Slide 36

Slide 36 text

Measuring complexity > Is there a simple option that is better than lines of code?

Slide 37

Slide 37 text

Measuring complexity

Slide 38

Slide 38 text

Measuring complexity
> python scripts/complexity_analysis.py hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java
n, total, mean, sd, max
3335, 8072, 2.42, 1.63, 14
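Indentation works as a cheap, language-neutral proxy for complexity: the deeper a line is nested, the more context you need to understand it. Below is a minimal sketch of such a measure, not the complexity_analysis.py script itself. It assumes one tab or four spaces per level of indentation, ignores blank lines, and uses the population standard deviation; the original script may differ on all three points.

```python
import statistics


def line_indentation(line, spaces_per_indent=4):
    """Logical indentation of one line: tabs plus groups of spaces."""
    leading = line[: len(line) - len(line.lstrip(" \t"))]
    return leading.count("\t") + leading.count(" ") // spaces_per_indent


def indentation_stats(source):
    """Return n, total, mean, sd and max over all non-blank lines."""
    depths = [line_indentation(l) for l in source.splitlines() if l.strip()]
    if not depths:
        return {"n": 0, "total": 0, "mean": 0.0, "sd": 0.0, "max": 0}
    return {
        "n": len(depths),
        "total": sum(depths),
        "mean": statistics.mean(depths),
        "sd": statistics.pstdev(depths),
        "max": max(depths),
    }


# Small made-up snippet with nesting depths 0,1,2,3,2,1,0.
SAMPLE = (
    "class A {\n"
    "    void f() {\n"
    "        if (x) {\n"
    "            g();\n"
    "        }\n"
    "    }\n"
    "}\n"
)

stats = indentation_stats(SAMPLE)
print(stats["n"], stats["total"], stats["max"])
```

The total grows with file size, while mean and max say more about how deeply the logic is nested.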

Slide 39

Slide 39 text

Measuring complexity
> You’ve already seen how to analyze a single revision. Now we want to:
1. Take a range of revisions for a specific module.
2. Calculate the indentation complexity of the module as it occurred in each revision.
3. Output the results revision by revision for further analysis.

Slide 40

Slide 40 text

Measuring complexity
> python scripts/git_complexity_trend.py --start ccc087b --end 46c962e --file hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java
rev, n, total, mean, sd
e75b8a7, 3080, 7610, 2.47, 1.76
23a6280, 3092, 7649, 2.47, 1.76
8991100, 3100, 7658, 2.47, 1.76
8373871, 3101, 7658, 2.47, 1.76
…
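A trend like this can be produced by replaying the file's history and measuring each historical version. The sketch below is not the git_complexity_trend.py script; all function names are hypothetical, it relies on git rev-list and git show, and it uses the same assumed indentation measure (one tab or four spaces per level, blank lines ignored). The file reader is passed in as a function so the core loop can be tried without a repository.

```python
import subprocess


def indentation_totals(source, spaces_per_indent=4):
    """Return (n, total) of logical indentation over non-blank lines."""
    depths = []
    for line in source.splitlines():
        if line.strip():
            lead = line[: len(line) - len(line.lstrip(" \t"))]
            depths.append(lead.count("\t") + lead.count(" ") // spaces_per_indent)
    return len(depths), sum(depths)


def complexity_trend(revisions, read_file_at):
    """Measure the file as it looked in each revision, oldest first."""
    rows = []
    for rev in revisions:
        n, total = indentation_totals(read_file_at(rev))
        mean = round(total / n, 2) if n else 0.0
        rows.append((rev, n, total, mean))
    return rows


def git_revisions(start, end, path, repo_dir="."):
    """Commits after start up to end that touched path, oldest first."""
    result = subprocess.run(
        ["git", "rev-list", "--reverse", f"{start}..{end}", "--", path],
        cwd=repo_dir, capture_output=True, text=True, check=True)
    return result.stdout.split()


def git_file_reader(path, repo_dir="."):
    """Return a function that fetches path as of a given revision."""
    def read(rev):
        return subprocess.run(
            ["git", "show", f"{rev}:{path}"],
            cwd=repo_dir, capture_output=True, text=True, check=True).stdout
    return read


# Fake two-revision history so the loop can run without git.
FAKE_HISTORY = {"r1": "a\n    b\n", "r2": "a\n    b\n        c\n"}
for row in complexity_trend(["r1", "r2"], FAKE_HISTORY.__getitem__):
    print(", ".join(str(value) for value in row))
```

Against a real repository you would call complexity_trend(git_revisions(start, end, path), git_file_reader(path)) and plot the total or mean column over time.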

Slide 41

Slide 41 text

Visualising complexity trends

Slide 42

Slide 42 text

Visualising complexity trends

Slide 43

Slide 43 text

Visualising complexity trends

Slide 44

Slide 44 text

Going further

Slide 45

Slide 45 text

Resources > http://riaan.me/dc16 > Twitter: @riaancornelius > Please remember to rate my talk: http://www.devconf.co.za/rate

Slide 46

Slide 46 text

/* THANK YOU */ Riaan Cornelius Entelect Software 084 755 1866 http://www.devconf.co.za/