Code Complexity

Jorge Silva
September 12, 2013

Light overview of how we can tackle complexity using existing tools and frameworks. Talk given at Blip University in September 2013 for all interested devs.

Transcript

  1. Topics
    - Define: complexity
    - Why tackle complexity
    - Big O notation
    - Code metrics
      - McCabe’s Cyclomatic/Conditional
      - Martin’s Software Package Metrics
      - Other metrics
    - Semantic complexity / Clean Code
    - The myth of the genius programmer
    - Code Reviews
  2. Define: complexity
    "The state or quality of being intricate or complicated" (Apple Dictionary)
    SYNONYMS: complication, problem, difficulty, twist, turn, convolution, entanglement; intricacy, complicatedness, involvement, convolutedness.
    ANTONYMS: simplicity.
  3. Complexity
    It is threefold:
    1. "Essential" or unavoidable complexity, which is at the essence of the problem.
    2. "Accidental" complexity, which covers the work that has little to do with the problem at hand but needs doing anyway.
    3. "Unnecessary" complexity, which is just noise.
  4. Why tackle complexity
    Corollary 1: For every 10-percent increase in problem complexity, there is a 100-percent increase in the software solution’s complexity.
  5. Why tackle complexity
    Corollary 2: The most important factor in attacking complexity is not the tools and techniques that programmers use but rather the quality of the programmers themselves.
  6. Why tackle complexity
    Corollary 3
    [Chart: Software Cost] Maintenance 60%, Others 40%
    [Chart: Maintenance Cost] Enhancement 60%, Others 23%, Error correction 17%
  7. Why tackle complexity
    Corollary 4
    "…software’s "60/60" rule, that is that maintenance typically consumes 40 to 80% (60% average) of software costs, and then that enhancement is responsible for roughly 60% of software maintenance costs, while error correction is about 17%..."
    Robert Glass, Frequently Forgotten Fundamental Facts about Software Engineering, 2001
  8. Why tackle complexity
    Because the hard part of building a software system does not lie in the code itself. The hard part is maintaining it. Anyone can give orders to a computer. Not everyone is able to do so while being explicit about his or her intentions.
  9. Why tackle complexity
    "The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility"
    Edsger Dijkstra, 1972
  10. Code metrics: McCabe’s Cyclomatic
    - Linearly independent path count
    - It is a function of code branching complexity. It boils down to a number: the higher that number, the worse.
    - It is not additive, i.e., inner function calls do not affect the outcome.
    - It has been correlated with low reliability and frequent errors.
  11. Code metrics: McCabe’s Cyclomatic
    - M = McCabe’s complexity
    - Directed graph approach: M = E - N + 2P, where E = edges, N = nodes, P = number of connected components
    - Imperative code approach: M = 2 + π - s = 2 + #If + #Loop + #Case - #Return, where π = decision points and s = exit points
  12. Code metrics: McCabe’s Cyclomatic
    M = 2 + 1 (if) + 1 (loop) - 1 (return) = 3
        // a is a condition assumed to be defined elsewhere
        void f1(int n) {
            for (int i = 0; i < n; i += 1) {
                if (a) {
                    f2();
                }
                f3();
            }
        }
  13. Code metrics: McCabe’s Cyclomatic
        // a is a condition assumed to be defined elsewhere
        void f1(int n) {
            for (int i = 0; i < n; i += 1) {
                if (a) {
                    f2();
                }
                f3();
            }
        }
    M = E - N + 2P = 9 - 8 + 2*1 = 3
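Both ways of computing M for f1 can be checked with a few lines of code. The sketch below is only an illustration: the slide's control-flow graph is not part of the transcript, so the 8 nodes and 9 edges in `F1_EDGES` are one plausible reconstruction (an assumption), and the function names are hypothetical.

```python
def cyclomatic_from_graph(edges, components=1):
    """Directed graph approach: M = E - N + 2P."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


def cyclomatic_from_counts(ifs=0, loops=0, cases=0, returns=1):
    """Imperative code approach: M = 2 + #If + #Loop + #Case - #Return."""
    return 2 + ifs + loops + cases - returns


# Assumed control-flow graph for f1: 8 nodes, 9 edges, 1 connected component.
F1_EDGES = [
    ("entry", "init"),   # int i = 0
    ("init", "cond"),    # i < n ?
    ("cond", "if"),      # loop body entered
    ("cond", "exit"),    # loop finished
    ("if", "f2"),        # a is true
    ("if", "f3"),        # a is false
    ("f2", "f3"),
    ("f3", "incr"),      # i += 1
    ("incr", "cond"),    # back edge
]

print(cyclomatic_from_graph(F1_EDGES))                      # graph approach
print(cyclomatic_from_counts(ifs=1, loops=1, returns=1))    # counting approach
```

Both calls agree with the value of 3 derived on the slides.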
  14. Code metrics: McCabe’s Cyclomatic
    - The higher the M, the worse.
    - McCabe himself defined:
        M <= 10: OK
        10 < M <= 15: ~OK; please justify
        M > 15: NOK
    - Other standards say:
        1-10: OK
        11-20: ~OK; justify please
        21-50: NOK
        51+: NOK; untestable
    - Define your rule and follow it.
  15. Code metrics: McCabe’s Cyclomatic
    - Awareness during development.
    “programmers should keep track of the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10.”
    Tom McCabe
  16. Code metrics: McCabe’s Cyclomatic
    - Applications:
      - Indicates the minimum number of white-box tests that need to be run to obtain sufficient coverage of the module.
      - More tests may be necessary because of path coverage.
      - tests for branch coverage <= cyclomatic complexity <= number of paths
      - May help code conciseness by limiting the size of an imperative module.
  17. Code metrics: McCabe’s Cyclomatic
    - Is it worth it?
    - "McCabe’s metric was used on one 77,000-line program to identify problem areas. The program had a post-release defect rate of 0.31 defects per thousand lines of code. A 125,000-line program had a post-release defect rate of 0.02 defects per thousand lines of code" (William T. Ward, Hewlett Packard)
    - Similar results were observed in other companies, such as Steve McConnell’s Construx Software.
  18. Code metrics: Software package metrics
    - Robert Martin’s software package metrics
      - Number of classes and interfaces
        - Classes, pure abstract classes, abstract classes
      - Afferent Couplings / Ca (package responsibility)
        - Number of packages that depend on classes within a package (inwards)
      - Efferent Couplings / Ce (package independence)
        - Number of packages the classes within a package depend upon (outwards)
      - Abstractness (A)
        - Ratio of abstract classes and interfaces to the total number of classes in the package.
        - 0 <= A <= 1
      - Instability (I, resilience to change)
        - I = Ce / (Ce + Ca)
        - 0 <= I <= 1
  19. Code metrics: Software package metrics
    - Robert Martin’s software package metrics
      - Distance from the Main Sequence (balance between abstractness and stability)
        - D = |A + I - 1|, 0 <= D <= 1
        - D = 0 indicates a package that is coincident with the main sequence
        - D = 1 indicates a package that is as far from the main sequence as possible
        - Ideal packages are either completely abstract and stable (x=0, y=1) or completely concrete and unstable (x=1, y=0), with x = I and y = A
    [Figure: main sequence plot; both axes run from 0 to 1]
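Martin's package metrics are simple ratios, so a short sketch may help make them concrete. It uses the normalized distance D = |A + I - 1| from Martin's definition; the function names are hypothetical, and treating a package with no couplings at all as I = 0 is an assumption made only to avoid dividing by zero.

```python
def instability(ca: int, ce: int) -> float:
    """I = Ce / (Ce + Ca); 0 means maximally stable, 1 maximally unstable."""
    total = ca + ce
    # Assumption: a package with no couplings at all is reported as stable.
    return ce / total if total else 0.0


def abstractness(abstract_types: int, total_types: int) -> float:
    """A = abstract classes and interfaces / all classes in the package."""
    return abstract_types / total_types if total_types else 0.0


def main_sequence_distance(a: float, i: float) -> float:
    """D = |A + I - 1|; 0 is on the main sequence, 1 is as far away as possible."""
    return abs(a + i - 1)


# A concrete, heavily depended-upon package: 3 incoming and 1 outgoing coupling,
# 1 abstract type out of 4.
i = instability(ca=3, ce=1)
a = abstractness(abstract_types=1, total_types=4)
print(i, a, main_sequence_distance(a, i))
```

With these numbers the package is stable but concrete (I = 0.25, A = 0.25), which puts it half-way off the main sequence (D = 0.5).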
  20. Code metrics: Other metrics
    - Bugs per line of code
    - Code coverage
    - Cohesion
      - Degree to which the elements of a module belong together
    - Coupling
      - Cohesion's counterpart; high cohesion may mean low coupling and vice versa.
    - Program execution time
    - Nesting levels in control constructs
    - Variable span (number of lines between successive references to variables)
    - Variable lifetime (number of lines a variable is in use)
    - Others (use tools such as Sonar, Fortify, etc.)
  21. Big O notation
    - Also known as asymptotic notation.
    - Describes the limiting behavior of a function when the argument tends to a particular value or infinity.
    - Used to classify algorithms, in time and space, by how they respond to input size.
  22. Big O notation
    - Also known as asymptotic notation; expressed as O(x).
    - Describes the limiting behavior of a function when the argument tends to a particular value or infinity.
    - Used to classify algorithms, in time and space, by how they respond to input size.
    - It gives an upper bound on growth and is most often quoted for the worst-case scenario of an algorithm.
    - Ω(x) (Big Omega) and Θ(x) (Big Theta) also exist and give a lower bound and a tight bound, respectively.
  23. Big O notation: O(1) Constant Complexity
    - Input size does not affect the complexity.
    - Example: given a binary representation of a number, say whether it is even or odd.
      - If the rightmost bit is one, the number is odd; otherwise it is even.
      - No matter how many bits the representation has, we only need to look at the first bit (reading right to left).
        0111101011010100100101 (odd)
        0111101011010100100100 (even)
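The parity check can be written so that it visibly touches a single character regardless of input length. This is a minimal sketch; the function name is hypothetical and the input is assumed to be a non-empty string of 0s and 1s.

```python
def is_odd_binary(bits: str) -> bool:
    """Return True if the binary string represents an odd number.

    Only the rightmost bit is inspected, so the work is constant
    no matter how long the string is.
    """
    return bits[-1] == "1"


print(is_odd_binary("0111101011010100100101"))  # rightmost bit is 1
print(is_odd_binary("0111101011010100100100"))  # rightmost bit is 0
```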
  24. Big O notation: O(n) Linear Complexity
    - Example: add two numbers.
      - Line the numbers up (to the right).
      - Add the digits in a column, writing the last digit of that sum in the result.
      - The "tens" part of that sum is carried over to the next column.
      - If we add two 100-digit numbers together, we have to do 100 additions. If we add two 10,000-digit numbers, we have to do 10,000 additions. See the pattern?
          1234
        + 5678
        ------
          6912
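The column-by-column procedure can be sketched directly, counting one step per column to show the linear growth. The function name and the choice to pass numbers as decimal strings are assumptions made for illustration.

```python
def add_digit_strings(x: str, y: str):
    """Add two non-negative decimal strings column by column.

    Returns (sum_as_string, number_of_column_additions).
    """
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)   # line the numbers up to the right
    carry = 0
    digits = []
    steps = 0
    for dx, dy in zip(reversed(x), reversed(y)):
        total = int(dx) + int(dy) + carry
        digits.append(str(total % 10))      # last digit goes into the result
        carry = total // 10                 # "tens" part moves to the next column
        steps += 1
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits)), steps


print(add_digit_strings("1234", "5678"))  # ('6912', 4)
```

Doubling the number of digits doubles the number of column additions, which is what O(n) expresses.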
  25. Big O notation: O(log n) Logarithmic Complexity
    - Example: phone book.
      - Given a person's name, find the phone number by opening the book about halfway through the part you haven't searched yet.
      - Check whether the person's name is at that point.
      - If not, repeat the process about halfway through the part of the book where the person's name lies.
      - You can simply divide and conquer, and you only need to explore a tiny fraction of the entire space before you eventually find someone's phone number.
      - A bigger phone book will still take you longer, but the time won't grow as quickly as the book does.
      - See the pattern? (You could also just use the internet.)
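The phone-book lookup is a binary search. The sketch below counts how many pages are opened; the function name and the representation of the book as a list of (name, number) pairs sorted by name are assumptions for illustration.

```python
def find_number(phone_book, name):
    """Binary search over a list of (name, number) pairs sorted by name.

    Returns (number or None, probes), where probes is how many entries
    were inspected.
    """
    lo, hi = 0, len(phone_book) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2            # open the book halfway through what is left
        probes += 1
        entry_name, number = phone_book[mid]
        if entry_name == name:
            return number, probes
        if entry_name < name:
            lo = mid + 1                # the name lies in the second half
        else:
            hi = mid - 1                # the name lies in the first half
    return None, probes


# A made-up book with 1,000 entries; names are zero-padded so they sort correctly.
book = [(f"person{i:04d}", 5550000 + i) for i in range(1000)]
print(find_number(book, "person0765"))
```

For 1,000 entries at most 10 probes are needed, and a book a thousand times larger needs only about 10 more.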
  26. Big O notation: O(n²) Quadratic Complexity
    - Example: multiply two numbers.
      - Line the numbers up (to the right).
      - Take the first digit in the bottom number and multiply it in turn against each digit in the top number,
      - and so on through each digit.
      - For 4-digit numbers we need to do 16 multiplications (and 7 adds). For 100-digit numbers we need to do 10,000 multiplications and 200 adds. See the pattern?
      - Should it be O(n² + 2n)? No: Big O keeps only the dominant term, so it stays O(n²).
             1234
           x 5678
          -------
             9872
            86380
           740400
        + 6170000
          -------
          7006652
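Long multiplication can be sketched the same way, counting every single-digit multiplication to show the quadratic growth. The function name and the use of decimal strings are assumptions for illustration.

```python
def multiply_digit_strings(x: str, y: str):
    """Schoolbook multiplication of two non-negative decimal strings.

    Returns (product_as_string, number_of_single_digit_multiplications).
    """
    columns = [0] * (len(x) + len(y))
    multiplications = 0
    for i, dy in enumerate(reversed(y)):        # each digit of the bottom number
        for j, dx in enumerate(reversed(x)):    # against each digit of the top number
            columns[i + j] += int(dx) * int(dy)
            multiplications += 1
    carry = 0
    for k in range(len(columns)):               # propagate carries, right to left
        total = columns[k] + carry
        columns[k] = total % 10
        carry = total // 10
    text = "".join(str(d) for d in reversed(columns)).lstrip("0") or "0"
    return text, multiplications


print(multiply_digit_strings("1234", "5678"))  # ('7006652', 16)
```

Two nested loops over the digits give n × n multiplications, which is why the 2n additions do not change the O(n²) classification.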
  27. Big O notation: O(n!) Factorial Complexity
    - Example: the travelling salesman.
      - You have N towns.
      - Each of those towns is linked to one or more other towns by a road of a certain distance.
      - Find the shortest tour that visits every town.
  28. Big O notation: O(n!) Factorial Complexity
    - Imagine 3 towns: A, B and C.
      - A → B → C
      - A → C → B
      - B → C → A
      - B → A → C
      - C → A → B
      - C → B → A
    - There are 3 pairs of equivalent routes: A-B-C and C-B-A, A-C-B and B-C-A, B-A-C and C-A-B. So there are 3 possibilities.
    - Take this to 4 towns and you have 12 possibilities.
    - With 5 it's 60.
    - 6 becomes 360. See the pattern?
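The counts above can be reproduced by brute force: enumerate every ordering of the towns and treat a route and its reverse as the same. This is only a sketch. The function names are hypothetical, every pair of towns is assumed to be directly connected, and the distances are made up for the example.

```python
from itertools import permutations


def distinct_routes(towns):
    """All orderings of the towns, counting a route and its reverse once."""
    seen = set()
    for route in permutations(towns):
        if route[::-1] not in seen:
            seen.add(route)
    return sorted(seen)


def route_length(route, distance):
    """Sum of the road lengths between consecutive towns on the route."""
    return sum(distance[frozenset(pair)] for pair in zip(route, route[1:]))


def shortest_route(towns, distance):
    """Brute force: try every distinct route and keep the shortest."""
    return min(distinct_routes(towns), key=lambda r: route_length(r, distance))


for n in range(3, 7):
    print(n, len(distinct_routes("ABCDEF"[:n])))   # 3, 12, 60, 360

# Made-up distances between three towns.
roads = {frozenset("AB"): 1, frozenset("BC"): 2, frozenset("AC"): 5}
print(shortest_route("ABC", roads))
```

The number of distinct routes is n!/2, so each extra town multiplies the work by roughly the number of towns.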
  29. Types of complexity: Semantic
    "Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do."
    Donald Knuth, Literate Programming, Centre for the Study of Language & Information, 1992
  30. Types of complexity: Semantic
    - Good naming
      - Aha moment: Steve Yegge’s Execution in the Kingdom of Nouns
      - Names should fully and accurately describe the member they represent (see slide 13)
  31. Types of complexity: Semantic
    - High quality routines & classes
      - Is the reason for creating a new routine/class sufficient?
      - Have all the parts of the routine/class that would benefit from being put into routines/classes of their own been put into routines/classes of their own?
      - Does the name describe everything the routine/class does?
      - Does the routine/class have strong, functional cohesion, doing one and only one thing and doing it well?
      - Does the routine/class have loose coupling? Are its connections to other routines/classes small, intimate, visible and flexible?
      - Is the length of the routine determined naturally by its function and logic rather than by an artificial coding standard?
  32. Types of complexity: Semantic
    - High quality routines & classes
      - Does the routine have 5 or fewer parameters?
      - Is each input parameter used?
      - Is each output parameter used?
      - Does the routine avoid using input parameters as working variables?
      - If the routine is a function, does it return a valid value under all possible circumstances?
      - Does the routine's parameter list, taken as a whole, present a consistent interface abstraction?
  33. Types of complexity: Semantic
    - High quality routines & classes
      - Does the routine protect itself from bad input data?
      - Have you used assertions to document assumptions, including pre-conditions and post-conditions?
        - Check out Eiffel’s invariants; Java also has some support.
      - Have assertions been used only to document conditions that should never occur?
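One way to read these questions together: bad input from callers is rejected with ordinary error handling, while assertions are reserved for conditions that should never occur if the routine itself is correct. The sketch below illustrates that split; `safe_sqrt` is a hypothetical routine, not something from the talk.

```python
import math


def safe_sqrt(value: float) -> float:
    """Square root that protects itself from bad input data."""
    # Bad input is expected to happen, so it gets an exception, not an assertion.
    if not isinstance(value, (int, float)):
        raise TypeError("value must be a number")
    if math.isnan(value) or value < 0:
        raise ValueError("value must be a non-negative number")

    result = math.sqrt(value)

    # Post-condition: documents what must hold if the code above is correct.
    assert result >= 0, "square root must be non-negative"
    assert math.isclose(result * result, value, rel_tol=1e-9, abs_tol=1e-12), \
        "result squared must give back the input"
    return result


print(safe_sqrt(9))
```

Note that Python removes `assert` statements when run with the `-O` flag, which is one more reason to keep input validation out of assertions.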
  34. Types of complexity: Semantic
    - High quality routines & classes
      - Have debugging aids been installed in such a way that they can be activated/deactivated without a great deal of fuss?
      - Is the amount of defensive programming code appropriate, neither too much nor too little?
      - Have you used offensive programming techniques to make errors difficult to overlook during development?
        - Make sure asserts abort the program.
        - Be sure the code in each case statement’s default or else clause fails hard or is otherwise impossible to overlook.
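A default branch that fails hard might look like the sketch below. The `Shape` enum and `corner_count` function are hypothetical; the point is only that a value nobody handled stops the program instead of silently falling through.

```python
from enum import Enum


class Shape(Enum):
    CIRCLE = "circle"
    SQUARE = "square"
    TRIANGLE = "triangle"   # added later; corner_count was not updated


def corner_count(shape: Shape) -> int:
    if shape is Shape.CIRCLE:
        return 0
    elif shape is Shape.SQUARE:
        return 4
    else:
        # Offensive programming: an unhandled case is impossible to overlook.
        # Raising explicitly (rather than using `assert`) keeps the check
        # active even when Python runs with -O.
        raise AssertionError(f"unhandled shape: {shape!r}")


print(corner_count(Shape.SQUARE))
```

Calling `corner_count(Shape.TRIANGLE)` raises immediately, so the missing case shows up the first time it is exercised during development.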
  35. The myth of the genius programmer
    - Also known as “Don't be afraid to show your code™”
    - You are not one in a million.
    - And if you were, there would be 10 like you in Portugal alone. Think about this.
    "A pervasive elitism hovers in the background of collaborative software development: everyone secretly wants to be seen as a genius. How to avoid this trap and gracefully exchange personal ego for personal growth and super-charged collaboration."
    http://www.youtube.com/watch?v=0SARbwvhupQ
  36. The myth of the genius programmer: Code reviews
    - Do them.
    - Knowledge sharing, functional and business.
    - Engage with your peers.
    - Learn from others; be open-minded.
    - Criticize and be criticized.
      - But be polite and respectful when doing both.
    - "Be like water, my friend" (Bruce Lee)