Guide Refactorings With Behavioral Code Analysis

Many codebases contain code that is overly complicated, hard to understand, and hence expensive to change. The pressure of new features and user needs makes it hard to stop and backtrack, and the longer we wait, the worse it gets. Mix in the people side, with frequent organizational change, and the siren song of a system rewrite becomes more and more attractive. It doesn't have to be that way, and in this presentation you'll see how easily obtained version-control data lets us uncover the behavior and patterns of the development organization. These behavioral code analysis techniques provide a sweet spot to prioritize and guide refactorings. We cover refactoring techniques that reduce excess complexity and address hidden implicit dependencies, and we discuss architectural restructuring that reduces inter-team coordination needs. Since behavioral code analysis also lets us consider the social side of code, such as refactoring modules that are under development by our peers, we explore novel patterns that help us limit risks and code conflicts. The specific examples are from real-world codebases like Android, the Linux Kernel, ASP.NET Core MVC, and more.

Adam Tornhill

May 10, 2018


  1. Modify someone else’s C++ code The Human Potential Hard Impossible

    Land on the Moon Sequence the Human Genome Revive the Dinosaurs The Pyramids Fusion Power Understand Consciousness @AdamTornhill
  2. Lehman’s “Laws” of Software Evolution

    Continuing Change: “a system must be continually adapted or it becomes progressively less satisfactory”
    Increasing Complexity: “as a system evolves, its complexity increases unless work is done to maintain or reduce it”
    M. Lehman, “Programs, life cycles, and laws of software evolution”, 1980 @AdamTornhill
  3. Behavioral Code Analysis - What Is It?

    Code: Important. Evolution and Behavior: More Important. @AdamTornhill
  4. Version-Control: A Behavioral Data Source

    Commit: b557ca5 Date: 2016-02-12 Author: Kevin Flynn
    Fix behavior of StartsWithPrefix
    8 27 src/Mvc.Abstractions/ModelBinding/ModelStateDictionary.cs
    1 10 src/Mvc.Core/ControllerBase.cs
    1 1 src/Mvc.Core/Internal/ElementalValueProvider.cs
    1 39 src/Mvc.Core/Internal/PrefixContainer.cs

    Commit: fd6d28d Date: 2016-02-10 Author: Professor Falken
    Make AddController not overwrite existing IControllerTypeProvider
    8 1 src/Core/Internal/ControllersAsServices.cs
    48 0 test/Core.Test/Internal/ControllerAsServicesTest.cs
    13 0 test/Mvc.FunctionalTests/ControllerFromServicesTests.cs

    Commit: 910f013 Date: 2016-02-05 Author: Lisbeth Salander
    Fixes #4050: Throw an exception when media types are empty.
    20 1 src/Mvc.Core/Formatters/InputFormatter.cs

    Social Information; A Time Dimension; Progress on Tasks; Co-changing Files @AdamTornhill
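
The commit log on slide 4 can be mined with a few lines of code. The following is a minimal sketch, not the tooling used in the talk: it assumes the log was produced with `git log --numstat --date=short --pretty=format:'--%h--%ad--%aN'`, and the function names `parse_log`, `change_frequency`, and `churn` are hypothetical. It extracts the author, the date, and the changed files of each commit, then counts how often each file changes.

```python
from collections import Counter

# Sample input mirroring the commits on the slide, in the assumed
# "--<hash>--<date>--<author>" header format followed by numstat rows.
SAMPLE_LOG = """\
--b557ca5--2016-02-12--Kevin Flynn
8\t27\tsrc/Mvc.Abstractions/ModelBinding/ModelStateDictionary.cs
1\t10\tsrc/Mvc.Core/ControllerBase.cs
1\t1\tsrc/Mvc.Core/Internal/ElementalValueProvider.cs
1\t39\tsrc/Mvc.Core/Internal/PrefixContainer.cs

--fd6d28d--2016-02-10--Professor Falken
8\t1\tsrc/Core/Internal/ControllersAsServices.cs
48\t0\ttest/Core.Test/Internal/ControllerAsServicesTest.cs
13\t0\ttest/Mvc.FunctionalTests/ControllerFromServicesTests.cs

--910f013--2016-02-05--Lisbeth Salander
20\t1\tsrc/Mvc.Core/Formatters/InputFormatter.cs
"""


def parse_log(text):
    """Turn the log text into a list of commit dictionaries."""
    commits = []
    current = None
    for line in text.splitlines():
        if line.startswith("--"):
            rev, date, author = line[2:].split("--", 2)
            current = {"rev": rev, "date": date, "author": author, "files": []}
            commits.append(current)
        elif line.strip() and current is not None:
            added, deleted, path = line.split("\t", 2)
            # git reports "-" instead of a number for binary files.
            added = 0 if added == "-" else int(added)
            deleted = 0 if deleted == "-" else int(deleted)
            current["files"].append((path, added, deleted))
    return commits


def change_frequency(commits):
    """Count the number of commits that touch each file."""
    freq = Counter()
    for commit in commits:
        freq.update({path for path, _, _ in commit["files"]})
    return freq


def churn(commits):
    """Sum added plus deleted lines per file."""
    total = Counter()
    for commit in commits:
        for path, added, deleted in commit["files"]:
            total[path] += added + deleted
    return total


commits = parse_log(SAMPLE_LOG)
freq = change_frequency(commits)
lines_changed = churn(commits)
```

Files with a high change frequency are candidates for closer inspection; the author and date fields carry the social and time dimensions the slide refers to.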
  5. Extract The Signal via a Human in the Loop View

    Complexity Through the Lens of Behavioral Data
  6. Case Study: Android

    The Platform Framework Base in Numbers: 3 Million Lines of Code, 2.1 Million Lines of Java, 2,000 Unique Authors @AdamTornhill
  7. Case Study: Android

    The Platform Framework Base in Numbers: 3 Million Lines of Code, 2.1 Million Lines of Java, 2,000 Unique Authors @AdamTornhill
  8. What we normally care about… A simpler view! Ref to

    Python script… What’s Code Complexity Anyway? Implementation: https://github.com/adamtornhill/maat-scripts/blob/master/miner/complexity_analysis.py @AdamTornhill
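
One "simpler view" of complexity is to treat indentation as a proxy for nesting depth. The sketch below is an independent illustration under that assumption and is not a copy of the linked script; the function names and the `spaces_per_level` parameter are hypothetical. It measures the logical indentation of every non-blank line and summarizes the result.

```python
def indentation_levels(source, spaces_per_level=4):
    """Return the logical indentation level of each non-blank line."""
    levels = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no complexity information
        expanded = line.expandtabs(spaces_per_level)
        leading = len(expanded) - len(expanded.lstrip(" "))
        levels.append(leading // spaces_per_level)
    return levels


def complexity_summary(source, spaces_per_level=4):
    """Summarize indentation-based complexity for one file's content."""
    levels = indentation_levels(source, spaces_per_level)
    if not levels:
        return {"n": 0, "total": 0, "mean": 0.0, "max": 0}
    return {
        "n": len(levels),
        "total": sum(levels),
        "mean": sum(levels) / len(levels),
        "max": max(levels),
    }


SAMPLE = """\
def f(xs):
    for x in xs:
        if x:
            print(x)
"""

summary = complexity_summary(SAMPLE)
```

The measure is language neutral, which makes it cheap to run across the whole history of a file and to plot as a trend.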
  9. Programming As If Social Factors Mattered

    Author #1, Author #2, …, Author #N: The Relative Contributions of Each Author
    Fractal Value: M. D’Ambros, M. Lanza, and H. Gall. Fractal Figures: Visualizing Development Effort for CVS Entities.
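
The fractal value from slide 9 can be computed from per-author contribution counts. The sketch below assumes the commonly cited definition, one minus the sum of squared contribution shares; the function name and the sample authors are hypothetical. A value of 0 means a single author wrote everything, and the value approaches 1 as the work fragments across many authors.

```python
def fractal_value(contributions):
    """Compute 1 - sum((a_i / total)^2) over all authors.

    `contributions` maps an author name to a contribution count,
    for example the number of commits to one module.
    """
    total = sum(contributions.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in contributions.values())


single_owner = fractal_value({"author_1": 10})
two_equal = fractal_value({"author_1": 5, "author_2": 5})
four_equal = fractal_value({f"author_{i}": 3 for i in range(4)})
```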
  10. The Splinter Pattern

    The splinter pattern provides a structured way to break up hotspots into manageable pieces that can be divided among several developers to work on, rather than having a group of developers work on one large piece of code. https://pragprog.com/book/atevol/software-design-x-rays
  11. The Splinter Pattern

    Original Context: ActivityStack.java (Stack behavior, Translucence behavior, Ownership behavior, Lifecycle behavior)
    Splinter Context: ActivityStack.java delegates to Translucence.java, DeviceOwnership.java, and ActivityLifecycle.java
    Unmodified, copy-pasted (yes, really) content @AdamTornhill
  12. Splinter: Resulting Context

    ActivityStack.java, Translucence.java, DeviceOwnership.java, ActivityLifecycle.java
    Better alignment between problem and solution domain => facilitates parallel work
    Individual parts that can be refactored in isolation @AdamTornhill
  13. Tooling: Try it on your own Code

    https://codescene.io/
    Source Code: https://github.com/adamtornhill/code-maat
    Track functions with git log -L :<funcname>:<file> @AdamTornhill
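
The `git log -L :<funcname>:<file>` command from slide 13 can be wrapped for use in scripts. This is a minimal sketch with hypothetical helper names; it assumes `git` is on the path and that git can locate the function by name in the given file.

```python
import subprocess


def function_history_command(funcname, path):
    """Build the argument list for tracing one function's history."""
    return ["git", "log", "-L", f":{funcname}:{path}"]


def function_history(funcname, path, repo="."):
    """Run the command inside `repo` and return git's textual output."""
    result = subprocess.run(
        function_history_command(funcname, path),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical function and file names, for illustration only.
command = function_history_command("some_function", "src/some_file.c")
```

Passing the arguments as a list avoids shell quoting problems with the colons in the `-L` argument.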
  14. Code Duplication and DRY Violations

    5-20% of all Code is Duplicated to Some Extent @AdamTornhill
  15. Change Coupling: Patterns That Emerge Over Time

    [Diagram: functions A(), B(), and E() with changed code highlighted across Commit #1, Commit #2, and Commit #3]
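
Change coupling can be estimated by counting how often two entities appear in the same commit. The sketch below uses one common definition, shared commits divided by the average number of revisions of the pair; the function name and the sample commit data are hypothetical and do not reproduce the exact diagram on the slide.

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits):
    """Return the coupling degree (0..1) for each co-changing pair.

    `commits` is a list of collections, each holding the entities
    (files or functions) changed together in one commit.
    """
    revisions = Counter()
    shared = Counter()
    for changed in commits:
        entities = sorted(set(changed))
        revisions.update(entities)
        shared.update(combinations(entities, 2))

    coupling = {}
    for (a, b), together in shared.items():
        average_revisions = (revisions[a] + revisions[b]) / 2
        coupling[(a, b)] = together / average_revisions
    return coupling


# Illustrative history: A() and B() always change together.
history = [
    ["A()", "B()"],
    ["A()", "B()", "E()"],
    ["A()", "B()"],
]

coupling = change_coupling(history)
```

A pair that changes together in most of its revisions hints at an implicit dependency, even when no explicit dependency is visible in the code.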
  16. Case Study: Code Clones in Linux

    The Linux Kernel in Numbers: 16 Million Lines of Code, 15 Million Lines of C, 15,000 Unique Authors @AdamTornhill
  17. Combine copy-paste detection techniques with change coupling to identify the code clones that really need refactoring.

    Clone Detector Applications:
    Clone Digger (Java and Python): http://clonedigger.sourceforge.net/
    Simian (.NET): http://www.harukizaemon.com/simian/ @AdamTornhill
  18. Case Study: Refactoring Entity Framework Core

    Entity Framework Core in Numbers: 574,000 Lines of Code, 365,000 Lines of C#, 102 Unique Authors @AdamTornhill
  19. Case Study: jUnit5

    jUnit5 in Numbers: 59,000 Lines of Code, 51,000 Lines of Java, 74 Unique Authors @AdamTornhill
  20. Express Tests in the Language of your Domain; Generic Assertions are at Odds with that Goal. @AdamTornhill
  21. UI code implemented in JavaScript… …change coupled to backend code implemented in Clojure (different repository) @AdamTornhill
  22. How Can You Track Changes Across Repositories?

    Use a Task ID or Ticket Reference
    A specific commit: e9e57e48 2017-05-26 D.Cooper [email protected]
    Bugfix: the owls are not what they seem. Task: CLJ-2141 @AdamTornhill
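
Grouping commits by ticket reference makes it possible to spot changes that span repositories. The sketch below assumes task IDs follow the `CLJ-2141` style shown on the slide (uppercase project key, hyphen, number); the repository names, file names, and the second and third commit messages are hypothetical.

```python
import re
from collections import defaultdict

# Assumed ticket format: uppercase project key, hyphen, number.
TASK_ID = re.compile(r"\b([A-Z][A-Z0-9]*-\d+)\b")


def group_by_task(commits):
    """Map task ID -> repository -> set of files changed for that task.

    `commits` is an iterable of (repository, message, files) tuples.
    """
    by_task = defaultdict(lambda: defaultdict(set))
    for repo, message, files in commits:
        for task in set(TASK_ID.findall(message)):
            by_task[task][repo].update(files)
    return by_task


def cross_repo_tasks(by_task):
    """Keep only the tasks whose commits touch more than one repository."""
    return {task: repos for task, repos in by_task.items() if len(repos) > 1}


COMMITS = [
    ("backend", "Bugfix: the owls are not what they seem. Task: CLJ-2141",
     ["src/owls.clj"]),
    ("ui", "Adjust owl rendering. Task: CLJ-2141", ["src/app.js"]),
    ("ui", "Tidy styles. Task: CLJ-2200", ["src/style.css"]),
]

cross = cross_repo_tasks(group_by_task(COMMITS))
```

Files that repeatedly show up under the same task in different repositories are change coupled, even though no single commit contains both.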
  23. UI code implemented in JavaScript… …change coupled to backend code implemented in Clojure (different repository) @AdamTornhill
  24. @AdamTornhill

    Blog on Behavioral Code Analysis: http://www.empear.com/blog/
    Your Code As A Crime Scene: https://pragprog.com/book/atcrime/your-code-as-a-crime-scene
    Software Design X-Rays: https://pragprog.com/book/atevol/software-design-x-rays
    Test the Analyses in CodeScene: https://codescene.io/
    [email protected]