Slide 1

Slide 1 text

Guide Refactorings With Behavioral Code Analysis @AdamTornhill craft-conf.com

Slide 2

Slide 2 text

The Human Potential
Scale: Hard to Impossible
Modify someone else’s C++ code, Land on the Moon, Sequence the Human Genome, Revive the Dinosaurs, The Pyramids, Fusion Power, Understand Consciousness

Slide 3

Slide 3 text

Lehman’s “Laws” of Software Evolution
Continuing Change: “a system must be continually adapted or it becomes progressively less satisfactory”
Increasing Complexity: “as a system evolves, its complexity increases unless work is done to maintain or reduce it”
M. Lehman, “Programs, life cycles, and laws of software evolution”, 1980

Slide 4

Slide 4 text

The Two Forms Of Accidental Complexity: Complex Parts and Complex Inter-Dependencies

Slide 5

Slide 5 text

Behavioral Code Analysis - What Is It?
Code: Important
Evolution and Behavior: More Important

Slide 6

Slide 6 text

Version-Control: A Behavioral Data Source
Commit: b557ca5 Date: 2016-02-12 Author: Kevin Flynn
Fix behavior of StartsWithPrefix
8 27 src/Mvc.Abstractions/ModelBinding/ModelStateDictionary.cs
1 10 src/Mvc.Core/ControllerBase.cs
1 1 src/Mvc.Core/Internal/ElementalValueProvider.cs
1 39 src/Mvc.Core/Internal/PrefixContainer.cs
Commit: fd6d28d Date: 2016-02-10 Author: Professor Falken
Make AddController not overwrite existing IControllerTypeProvider
8 1 src/Core/Internal/ControllersAsServices.cs
48 0 test/Core.Test/Internal/ControllerAsServicesTest.cs
13 0 test/Mvc.FunctionalTests/ControllerFromServicesTests.cs
Commit: 910f013 Date: 2016-02-05 Author: Lisbeth Salander
Fixes #4050: Throw an exception when media types are empty.
20 1 src/Mvc.Core/Formatters/InputFormatter.cs
What the log provides: Social Information, A Time Dimension, Progress on Tasks, Co-changing Files
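A log in this shape can be mined with very little code. The following is a minimal sketch, assuming each changed file appears on its own line as "added deleted path" (as on the slide); the function name and the sample history are hypothetical and only illustrate how change frequency per file could be counted.

```python
import re
from collections import Counter

# Matches lines such as "8 27 src/Mvc.Core/ControllerBase.cs".
# A "-" count (used by git for binary files) is accepted as well.
NUMSTAT_LINE = re.compile(r"^(\d+|-)\s+(\d+|-)\s+(\S+)$")


def change_frequencies(log_text):
    """Count how many commits touched each file in a numstat-style log."""
    revisions = Counter()
    for line in log_text.splitlines():
        match = NUMSTAT_LINE.match(line.strip())
        if match:
            # Each file is listed once per commit, so one line is one revision.
            revisions[match.group(3)] += 1
    return revisions


# Hypothetical history, not taken from any real repository.
SAMPLE_LOG = """\
Commit: aaa1111 Date: 2020-01-03 Author: Author One
Add parser
10 0 src/parser.py
2 1 src/app.py
Commit: bbb2222 Date: 2020-01-02 Author: Author Two
Fix startup
5 3 src/app.py
"""

if __name__ == "__main__":
    for path, count in change_frequencies(SAMPLE_LOG).most_common():
        print(count, path)
```

Commit message lines are skipped simply because they do not match the three-column pattern; a message that happens to look like "3 2 word" would be miscounted, so a stricter parser may be needed for real data.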

Slide 7

Slide 7 text

Prefer Simple Metrics because Metrics are a Guide, not a Replacement for Expertise

Slide 8

Slide 8 text

Extract The Signal via a Human in the Loop View Complexity Through the Lens of Behavioral Data

Slide 9

Slide 9 text

Case Study: Android The Platform Framework Base in Numbers: 3 Million Lines of Code, 2.1 Million Lines of Java, 2,000 Unique Authors

Slide 10

Slide 10 text

Case Study: Android The Platform Framework Base in Numbers: 3 Million Lines of Code, 2.1 Million Lines of Java, 2,000 Unique Authors

Slide 11

Slide 11 text

Case Study: Android
Code Complexity, Code Change Frequency, Hotspot
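The idea of a hotspot as the overlap between code complexity and change frequency can be sketched as a simple ranking. This is an illustration only: it assumes lines of code as the complexity proxy and multiplies it by the number of revisions, which is one possible scoring and not necessarily the one used in the talk's tooling. All file names and numbers are hypothetical.

```python
def rank_hotspots(lines_of_code, revisions):
    """Rank files by complexity proxy multiplied by change frequency.

    lines_of_code: dict mapping path -> lines of code (complexity proxy).
    revisions: dict mapping path -> number of commits touching the file.
    """
    scores = {
        path: loc * revisions.get(path, 0)
        for path, loc in lines_of_code.items()
    }
    # Highest score first: large files that also change often.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs for illustration.
LOC = {"Big.java": 5000, "Small.java": 100, "Stable.java": 8000}
REVS = {"Big.java": 300, "Small.java": 400, "Stable.java": 2}

if __name__ == "__main__":
    for path, score in rank_hotspots(LOC, REVS):
        print(path, score)
```

Note how the large but rarely changed file ends up last: complexity alone does not make a hotspot.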

Slide 12

Slide 12 text

What’s Code Complexity Anyway?
What we normally care about… A simpler view!
Implementation: https://github.com/adamtornhill/maat-scripts/blob/master/miner/complexity_analysis.py
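One candidate for such a simpler view is indentation: deeply nested code tends to be indented further. The sketch below assumes that reading and is not the linked script; the function name, the four-spaces-per-level default, and the sample source are all assumptions made for illustration.

```python
def indentation_complexity(source, spaces_per_level=4):
    """Return (total, mean, max) logical indentation over non-blank lines."""
    levels = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no complexity
        expanded = line.expandtabs(spaces_per_level)
        leading = len(expanded) - len(expanded.lstrip(" "))
        levels.append(leading // spaces_per_level)
    if not levels:
        return 0, 0.0, 0
    return sum(levels), sum(levels) / len(levels), max(levels)


# Hypothetical snippet with indentation levels 0, 1, 2, 3, 1.
SAMPLE = """\
def f(x):
    if x:
        for i in range(x):
            print(i)
    return x
"""

if __name__ == "__main__":
    total, mean, deepest = indentation_complexity(SAMPLE)
    print(total, mean, deepest)
```

The measure is language-neutral, which is its main appeal: it needs no parser, only consistent formatting.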

Slide 13

Slide 13 text

Case Study: Android 20,097 Lines of Code 2,009 Commits

Slide 14

Slide 14 text

Trade-Offs: Improvements vs New Features

Slide 15

Slide 15 text

Programming As If Social Factors Mattered
Author #1, Author #2, Author #N: The Relative Contributions of Each Author
Fractal Value (computed over contributors): M. D’Ambros, M. Lanza, and H. Gall. Fractal Figures: Visualizing Development Effort for CVS Entities.
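The slide's formula did not survive extraction. As far as I can tell, the fractal value in the cited paper is one minus the sum of the squared contribution shares of each author; treat that definition as an assumption. Under it, a file with a single author scores 0, and the value approaches 1 as work is spread over more authors. The function name and sample data below are hypothetical.

```python
def fractal_value(contributions):
    """Fragmentation of authorship: 0.0 for one author, approaching 1.0
    as contributions are spread evenly over many authors.

    contributions: dict mapping author -> number of commits (or lines).
    Assumed formula: 1 - sum((author_share) ** 2).
    """
    total = sum(contributions.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in contributions.values())


if __name__ == "__main__":
    # Hypothetical authorship data for illustration.
    print(fractal_value({"author_1": 10}))
    print(fractal_value({"author_1": 5, "author_2": 5}))
```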

Slide 16

Slide 16 text

The Splinter Pattern The splinter pattern provides a structured way to break up hotspots into manageable pieces that can be divided among several developers to work on, rather than having a group of developers work on one large piece of code. https://pragprog.com/book/atevol/software-design-x-rays

Slide 17

Slide 17 text

The Splinter Pattern
Files: ActivityStack.java, Translucence.java, DeviceOwnership.java, ActivityLifecycle.java (three delegate links)
Behaviors: Stack behavior, Translucence behavior, Ownership behavior, Lifecycle behavior
Contexts: Original Context, Splinter Context
Unmodified, copy-pasted (yes, really) content

Slide 18

Slide 18 text

Splinter: Resulting Context
ActivityStack.java, Translucence.java, DeviceOwnership.java, ActivityLifecycle.java
Individual parts that can be refactored in isolation
Better alignment between problem and solution domain => facilitates parallel work

Slide 19

Slide 19 text

Cut The Middle Man: ActivityStack.java

Slide 20

Slide 20 text

Measure and Visualize Improvements: The effects of a Splinter refactoring

Slide 21

Slide 21 text

Methods and Functions: Where Do We Start?

Slide 22

Slide 22 text

Function Level Hotspots Parse Recommended functions to improve. Hotspots: X-Ray ActivityManagerService.java

Slide 23

Slide 23 text

X-Ray of ActivityManagerService.java

Slide 24

Slide 24 text

Tooling: Try it on your own Code
https://codescene.io/
Source Code: https://github.com/adamtornhill/code-maat
Track functions with git log -L :<funcname>:<file>

Slide 25

Slide 25 text

Code Duplication and DRY Violations: 5-20% of all Code is Duplicated to Some Extent

Slide 26

Slide 26 text

DRY Violations in handleMessage (Android)

Slide 27

Slide 27 text

DRY Violations in handleMessage (Android)
Next refactoring step: Design Pattern COMMAND?

Slide 28

Slide 28 text

The Dirty Secret of Copy Paste
Image from https://en.wikipedia.org/wiki/Rorschach_test

Slide 29

Slide 29 text

Change Coupling: Patterns That Emerge Over Time
Diagram: Commit #1, Commit #2, and Commit #3 over functions A(), B(), and E(), with the changed code marked
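Change coupling of this kind can be computed directly from commit history. The sketch below is one possible formulation, assuming the degree of coupling is the number of shared commits divided by the average number of revisions of the two entities; that ratio, the function name, and the sample history are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits):
    """Return {(entity_a, entity_b): degree} for entities that co-change.

    commits: iterable of collections, each holding the entities
    (files or functions) changed in one commit.
    Degree = shared commits / average revisions of the pair (assumed).
    """
    revisions = Counter()
    shared = Counter()
    for changed in commits:
        entities = sorted(set(changed))
        revisions.update(entities)
        # Every unordered pair in the same commit counts as one co-change.
        shared.update(combinations(entities, 2))
    return {
        pair: count / ((revisions[pair[0]] + revisions[pair[1]]) / 2)
        for pair, count in shared.items()
    }


# Hypothetical history: A() and B() change together in two of three commits.
HISTORY = [{"A()", "B()"}, {"E()"}, {"A()", "B()"}]

if __name__ == "__main__":
    for pair, degree in change_coupling(HISTORY).items():
        print(pair, f"{degree:.0%}")
```

In practice a minimum number of revisions is usually required before reporting a pair, since two entities that changed once, together, would otherwise show full coupling.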

Slide 30

Slide 30 text

Case Study: Code Clones in Linux
The Linux Kernel in Numbers: 16 Million Lines of Code, 15 Million Lines of C, 15,000 Unique Authors

Slide 31

Slide 31 text

Inside the Main Hotspot: intel_display.c 11,383 Lines of Code 3,040 Commits

Slide 32

Slide 32 text

Inside The Main Hotspot: intel_display.c 11,383 Lines of Code 3,040 Commits

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Inside the Main Hotspot: intel_display.c 11,383 Lines of Code 3,040 Commits

Slide 35

Slide 35 text

A bug due to omission? A context-specific check?

Slide 36

Slide 36 text

Combine copy-paste detection techniques with change coupling to identify the code clones that really need refactoring.
Clone Detector Applications
Clone Digger (Java and Python): http://clonedigger.sourceforge.net/
Simian (.NET): http://www.harukizaemon.com/simian/

Slide 37

Slide 37 text

Duplication goes Beyond Code Similarity

Slide 38

Slide 38 text

Case Study: Refactoring Entity Framework Core
Entity Framework Core in Numbers: 574,000 Lines of Code, 365,000 Lines of C#, 102 Unique Authors

Slide 39

Slide 39 text

Code Clones in Entity Framework - Refactor?

Slide 40

Slide 40 text

The Principle of Proximity

Slide 41

Slide 41 text

The Principle of Proximity in Code

Slide 42

Slide 42 text

The Proximity Refactoring

Slide 43

Slide 43 text

Reducing Complex Inter-Dependencies
Image from https://thedailywtf.com/articles/comments/Enterprise-Dependency-Big-Ball-of-Yarn

Slide 44

Slide 44 text

The Heuristic of Surprise
Image from https://thedailywtf.com/articles/comments/Enterprise-Dependency-Big-Ball-of-Yarn

Slide 45

Slide 45 text

Case Study: jUnit5
jUnit5 in Numbers: 59,000 Lines of Code, 51,000 Lines of Java, 74 Unique Authors

Slide 46

Slide 46 text

Unit tests that change together in ~70% of all commits

Slide 47

Slide 47 text

X-Ray of the Coevolving jUnit Tests

Slide 48

Slide 48 text

Let’s Look at the Code…
ExceptionHandlingTests.java, ReportingTests.java, TestCaseWithInheritanceTests.java

Slide 49

Slide 49 text

Refactoring jUnit: Express Our Domain

Slide 50

Slide 50 text

Refactoring jUnit: Express Our Domain

Slide 51

Slide 51 text

Express Tests in the Language of your Domain; Generic Assertions are at Odds with that Goal.

Slide 52

Slide 52 text

The Costs of Implicit Dependencies Increase with Architectural Distance

Slide 53

Slide 53 text

UI code implemented in JavaScript… …change coupled to backend code implemented in Clojure (different repository)

Slide 54

Slide 54 text

Diagram: Analysis Engine, Backend, and UI, each in its own Git Repository. Data Mining reveals Change Coupling!

Slide 55

Slide 55 text

How Can You Track Changes Across Repositories?
A specific commit: e9e57e48 2017-05-26 D.Cooper [email protected] Bugfix: the owls are not what they seem. Task: CLJ-2141
Use a Task ID or Ticket Reference
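Grouping commits by a task ID is straightforward to prototype. The sketch below assumes ticket references of the form PROJECT-123 somewhere in the commit message (as in CLJ-2141 above); the repository names, file paths, and function name are hypothetical.

```python
import re
from collections import defaultdict

# Assumed ticket format: upper-case project key, dash, number (e.g. CLJ-2141).
TASK_ID = re.compile(r"\b([A-Z]+-\d+)\b")


def group_by_task(commits):
    """Merge commits from several repositories into logical change sets.

    commits: iterable of (repository, message, changed_files) tuples.
    Returns {task_id: set of "repository/path"} for commits with a task ID.
    """
    change_sets = defaultdict(set)
    for repo, message, files in commits:
        match = TASK_ID.search(message)
        if match is None:
            continue  # no ticket reference, so the commit cannot be linked
        for path in files:
            change_sets[match.group(1)].add(f"{repo}/{path}")
    return dict(change_sets)


# Hypothetical commits from two repositories.
COMMITS = [
    ("ui", "Show owl status. Task: CLJ-2141", ["src/owls.js"]),
    ("backend", "Bugfix: the owls are not what they seem. Task: CLJ-2141",
     ["src/owls.clj"]),
    ("backend", "Tidy up", ["src/util.clj"]),
]

if __name__ == "__main__":
    for task, files in group_by_task(COMMITS).items():
        print(task, sorted(files))
```

Each resulting change set can then be treated like a single commit when computing change coupling, which is what lets the analysis cross repository boundaries.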

Slide 56

Slide 56 text

UI code implemented in JavaScript… …change coupled to backend code implemented in Clojure (different repository)

Slide 57

Slide 57 text

X-Ray Across Repository Boundaries: JavaScript function… …that changes with Clojure business logic.

Slide 58

Slide 58 text

Let Features Drive Architectural Building Blocks, not Technology.

Slide 59

Slide 59 text

Blog on Behavioral Code Analysis: http://www.empear.com/blog/
Your Code As A Crime Scene: https://pragprog.com/book/atcrime/your-code-as-a-crime-scene
Software Design X-Rays: https://pragprog.com/book/atevol/software-design-x-rays
Test the Analyses in CodeScene: https://codescene.io/
@AdamTornhill [email protected]