Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Is Program Analysis The Silver Bullet Against Software Bugs? by Karim Ali

Is Program Analysis The Silver Bullet Against Software Bugs? by Karim Ali

Program analysis is the art of reasoning about the run-time behavior of a program without necessarily executing it. This information is useful for various real-life applications such as supporting software developers (e.g., bug-finding tools, code refactoring tools, and code recommenders) and compiler optimizations. Program analysis is also used to ensure complex software adheres to standards and regulations (e.g., medical devices, car industry, and aviation industry).

In this talk, I will discuss the three main properties that enable program analyses to be useful in practice: scalability, precision, and usability. I will relate that to various papers that have been published in the field of program analysis, as well as some of the work that my group has done. I will conclude with where I see program analysis research going and the challenges that we aim to solve in the field.

Papers_We_Love

September 12, 2019
Tweet

More Decks by Papers_We_Love

Other Decks in Programming

Transcript

  1. Karim Ali University of Alberta @karimhamdanali Is Program Analysis The

    Silver Bullet Against Software Bugs? Papers We Love Conference — 2019
  2. @karimhamdanali Software Bugs Invalid SSL/TLS connections earned Apple Most Epic

    Fail [Pwnie ’14] !3 goto fail; goto fail; Source: CVE-2014-1266
  3. @karimhamdanali © Copyright 2014, Philip Koopman. CC Attribution 4.0 International

    license. 5 http://www.cbsnews.com/news/toyota-unintended-acceleration-has-killed-89/ May 25, 2010 Software Bugs Errors in ABS software led to fatal accidents and cost Toyota $3 Billion !4 Source: Philip Koopman, CMU
  4. @karimhamdanali Program Analysis !6 goto fail; goto fail; © Copyright

    2014, Philip Koopman. CC Attribution 4.0 International license. 5 http://www.cbsnews.com/news/toyota-unintended-acceleration-has-killed-89/ May 25, 2010
  5. @karimhamdanali Program Analysis !8 A way of reasoning about the

    runtime behaviour of a program without necessarily executing it
  6. @karimhamdanali Rice’s Theorem “For any interesting property Pr of the

    behaviour of a program, it is impossible to write an analysis that can decide for every program p whether Pr holds for p.” !9 Image: CooperToons
  7. @karimhamdanali Program Analysis •Settle for an approximation of Pr •Make

    it as “good” as possible p analysis yes p analysis no !12 few Image: Jenna Mullins / ENews
  8. @karimhamdanali Program Analysis !13 Code Navigation Code Recommenders Code Refactoring

    Constant Propagation Dead Code Elimination Static Inlining Parallelization
  9. @karimhamdanali Collaborators •Erick Ochoa (UAlberta) •Spencer Killen (UAlberta) •Kristen Newbury

    (UAlberta) •Revan MacQueen (UAlberta) •Daniil Tiganov (UAlberta) •Jeff Cho (UAlberta) •Johannes Späth (Paderborn) •Lisa Nguyen (Paderborn) •Stefan Krüger (Paderborn) •Ondřej Lhoták (Waterloo) •Frank Tip (Northeastern) •Eric Bodden (Paderborn & Fraunhofer IEM) •Mira Mezini (TU Darmstadt) •Julian Dolby (IBM Research) •Andrew Craik (IBM) •Mark Stoodley (IBM) •Vijay Sundaresan (IBM) •Ben Livshits (Imperial College London & Brave) •Emerson Murphy-Hill (Google) •Justin Smith (Lafayette College) •José Nelson Amaral (UAlberta) •James Wright (UAlberta) •Kirsten Thommes (Paderborn) •René Fahr (Paderborn) !16
  10. @karimhamdanali Collaborators •Erick Ochoa (UAlberta) •Spencer Killen (UAlberta) •Kristen Newbury

    (UAlberta) •Revan MacQueen (UAlberta) •Daniil Tiganov (UAlberta) •Jeff Cho (UAlberta) •Johannes Späth (Paderborn) •Lisa Nguyen (Paderborn) •Stefan Krüger (Paderborn) •Ondřej Lhoták (Waterloo) •Frank Tip (Northeastern) •Eric Bodden (Paderborn & Fraunhofer IEM) •Mira Mezini (TU Darmstadt) •Julian Dolby (IBM Research) •Andrew Craik (IBM) •Mark Stoodley (IBM) •Vijay Sundaresan (IBM) •Ben Livshits (Imperial College London & Brave) •Emerson Murphy-Hill (Google) •Justin Smith (Lafayette College) •José Nelson Amaral (UAlberta) •James Wright (UAlberta) •Kirsten Thommes (Paderborn) •René Fahr (Paderborn) !17
  11. @karimhamdanali Call Graph !26 class Circle extends Shape { void

    draw() { ... } } class Square extends Shape { void draw() { ... } } Shape s; if(*) s = new Circle(); else s = new Square(); s.draw();
  12. @karimhamdanali Call Graph !27 class Circle extends Shape { void

    draw() { ... } } class Square extends Shape { void draw() { ... } } Shape s; if(*) s = new Circle(); else s = new Square(); s.draw(); required by every inter-procedural analysis
  13. @karimhamdanali Let’s build a Call Graph !28 public class Main

    { public static void main(String[] args) { Shape s; if (args.length > 2) s = new Circle(); else s = new Square(); s.draw(); } } abstract class Shape { abstract void draw(); } class Circle extends Shape { void draw() { ... } } class Square extends Shape { void draw() { ... } }
  14. @karimhamdanali Let’s build a Call Graph !29 public class Main

    { public static void main(String[] args) { Shape s; if (args.length > 2) s = new Circle(); else s = new Square(); s.draw(); } } abstract class Shape { abstract void draw(); } class Circle extends Shape { void draw() { ... } } class Square extends Shape { void draw() { ... } } Main.main()
  15. @karimhamdanali Let’s build a Call Graph !30 public class Main

    { public static void main(String[] args) { Shape s; if (args.length > 2) s = new Circle(); else s = new Square(); s.draw(); } } abstract class Shape { abstract void draw(); } class Circle extends Shape { void draw() { ... } } class Square extends Shape { void draw() { ... } } Main.main() Circle.<init>()
  16. @karimhamdanali Let’s build a Call Graph !31 public class Main

    { public static void main(String[] args) { Shape s; if (args.length > 2) s = new Circle(); else s = new Square(); s.draw(); } } abstract class Shape { abstract void draw(); } class Circle extends Shape { void draw() { ... } } class Square extends Shape { void draw() { ... } } Main.main() Shape.<init>() Circle.<init>() Object.<init>()
  17. @karimhamdanali Let’s build a Call Graph !32 public class Main

    { public static void main(String[] args) { Shape s; if (args.length > 2) s = new Circle(); else s = new Square(); s.draw(); } } abstract class Shape { abstract void draw(); } class Circle extends Shape { void draw() { ... } } class Square extends Shape { void draw() { ... } } Main.main() Shape.<init>() Square.<init>() Circle.<init>() Object.<init>()
  18. @karimhamdanali Let’s build a Call Graph !33 public class Main

    { public static void main(String[] args) { Shape s; if (args.length > 2) s = new Circle(); else s = new Square(); s.draw(); } } abstract class Shape { abstract void draw(); } class Circle extends Shape { void draw() { ... } } class Square extends Shape { void draw() { ... } } Main.main() Shape.<init>() Square.<init>() Circle.<init>() Square.draw() Circle.draw() Object.<init>()
  19. @karimhamdanali Let’s build a Call Graph for javac !36 •

    Java 1.4 • 0.5 MB of class files • 8 GB of RAM • HOURS! IRIS Reasoner
  20. @karimhamdanali Let’s build a Call Graph for javac !37 •

    Java 1.4 • 0.5 MB of class files • 8 GB of RAM • HOURS! IRIS Reasoner Exception in thread “main" java.lang.OutOfMemoryError: Java heap space
  21. @karimhamdanali !40 public class HelloWorld { public static void main(String[]

    args) { System.out.println("Hello, World!"); } } • > 30 seconds • > 5,000 reachable methods • > 23,000 call edges
  22. @karimhamdanali Not Alone! !45 I'd like to ignore library code

    what about callbacks? this would be unsound but better than nothing ignore non-application program elements (e.g., system libraries)? whole-program analysis always pulls in the world for completeness. The problem is that the world is fairly large I am NOT interested in those
  23. @karimhamdanali !52 Ideal Call Graph Whole-Program Call Graph Incomplete Call

    Graph (unsound) Conservative Call Graph (highly imprecise)
  24. @karimhamdanali !53 Ideal Call Graph Whole-Program Call Graph Incomplete Call

    Graph (unsound) Conservative Call Graph (highly imprecise) Partial-Program Call Graph
  25. @karimhamdanali The Separate Compilation Assumption !54 Source: Ali and Lhoták.

    Application-Only Call Graph Construction. [ECOOP '12]
  26. @karimhamdanali The Separate Compilation Assumption All of the library classes

    can be compiled in the absence of the application classes. !55
  27. @karimhamdanali Constraints 1. Class Hierarchy 2. Class Instantiation 3. Local

    Variables 4. Method Calls !56 5. Field Access 6. Array Access 7. Static Initialization 8. Exception Handling
  28. @karimhamdanali Constraints 1. Class Hierarchy 2. Class Instantiation 3. Local

    Variables 4. Method Calls !57 5. Field Access 6. Array Access 7. Static Initialization 8. Exception Handling
  29. @karimhamdanali Library Points-to Set (LPT) !58 Application Library pt(v1) =

    o1 o3 pt(v2) = o2 o3 pt(v3) = o1 o4 LPT = o1 o2 o3 o5
  30. @karimhamdanali Library Callbacks !59 Application Library class C { m();

    } class B extends L { m(); } class A extends L { m(); } calls class L { m(); } 1 LPT = A C 2
  31. @karimhamdanali Security-Related Static Analyses !68 public void main(String[] args) {

    Object x = null; Object y = x; y.toString(); } Null-Pointer Analysis
  32. @karimhamdanali Security-Related Static Analyses !69 public void main(String[] args) {

    String x = args[0]; String y = x; SQL.execute(''SELECT * FROM User where userId='' + y ); } Taint Analysis
  33. @karimhamdanali Security-Related Static Analyses !70 public void main(String[] args) {

    File x = new File(); File y = x; y.close(); } Typestate Analysis
  34. @karimhamdanali Precise Static Data-Flow Analysis !73 public void main(String[] args)

    { File x = new File(); this.z = x; foo(x); x.close(); foo(x); } public void foo(File y){ y.write(...); } public void foo(){ this.a.write(...); }
  35. @karimhamdanali Precise Static Data-Flow Analysis !74 public void main(String[] args)

    { File x = new File(); this.z = x; foo(x); x.close(); foo(x); } public void foo(File y){ y.write(...); } public void foo(){ this.a.write(...); } Context-Sensitive
  36. @karimhamdanali Precise Static Data-Flow Analysis !75 public void main(String[] args)

    { File x = new File(); this.z = x; foo(x); x.close(); foo(x); } public void foo(File y){ y.write(...); } public void foo(){ this.a.write(...); } Field-Sensitive
  37. @karimhamdanali Precise Static Data-Flow Analysis !76 x z y Pushdown

    Automaton main() foo(x) bar(y) foo(z) Stack of Calls f h g f Stack of Fields Context-Sensitive ∧ Field-Sensitive
  38. @karimhamdanali Precise Static Data-Flow Analysis !77 x z y Pushdown

    Automaton main() foo(x) bar(y) foo(z) Stack of Calls f h g f Stack of Fields Undecidable Reps [TOPLAS 2000] Source: Thomas W. Reps. Undecidability of Context-Sensitive Data-Dependence Analysis. [TOPLAS '00] Context-Sensitive ∧ Field-Sensitive
  39. @karimhamdanali Precise Static Data-Flow Analysis !78 x z y Pushdown

    Automaton main() foo(x) bar(y) foo(z) Stack of Calls f h g f Stack of Fields Context-Sensitive ∧ Field-Sensitive
  40. @karimhamdanali Precise Static Data-Flow Analysis !79 x z Pushdown Automaton

    main() foo(x) bar(y) foo(z) Stack of Calls k-limitting Access Paths/Graphs y.f y.g y.f.h y.f.g Context-Sensitive ∧ Field-Sensitive
  41. @karimhamdanali Precise Static Data-Flow Analysis !80 x z Pushdown Automaton

    main() foo(x) bar(y) foo(z) Stack of Calls k-limitting Access Paths/Graphs y.f y.g y.f.h y.f.g Context-Sensitive ∧ Field-Sensitive What’s a good value for k? k-limitting yields too many false positives
  42. @karimhamdanali Synchronized Pushdown Systems (SPDS) !81 Source: Späeth et al.

    Context-, Flow-, and Field-Sensitive Data-Flow Analysis using Synchronized Pushdown Systems. [POPL '19]
  43. @karimhamdanali Synchronized Pushdown Systems !84 Context-Sensitive Field-Sensitive ∧ Pushdown System

    of Calls x z y main() foo(x) bar(y) foo(z) Stack of Calls Variables f h g f Stack of Fields Pushdown System of Fields x z y Variables
  44. @karimhamdanali Synchronized Pushdown Systems !85 Context-Sensitive Field-Sensitive ∧ Pushdown System

    of Calls x z y main() foo(x) bar(y) foo(z) Stack of Calls Variables f h g f Stack of Fields Pushdown System of Fields x z y Variables Decidable
  45. @karimhamdanali Synchronized Pushdown Systems !86 Context-Sensitive Field-Sensitive ∧ Pushdown System

    of Calls x z y main() foo(x) bar(y) foo(z) Stack of Calls Variables f h g f Stack of Fields Pushdown System of Fields x z y Variables Decidable No k-limitting
  46. @karimhamdanali SPDS Evaluation !88 Analysis Time (seconds) 0 5 10

    15 20 25 30 35 40 45 50 Number of Field Accesses 2 4 6 8 10 12 14 16 18 Access Path (k=4) Access Path (k=3) Access Path (k=2) Access Path (k=1) SPDS Eclipse
  47. @karimhamdanali 99 precise responsive seamless tailored Sources: Johnson et al.

    Why Don’t Software Developers Use Static Analysis Tools to Find Bugs? [ICSE '13] Sources: Xiao et al. Social Influences on Secure Development Tool Adoption: Why Security Tools Spread. [CSCW '14] Sources: Smith et al. Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis. [FSE '15]
  48. @karimhamdanali @karimhamdanali Analysis-Driven Inliner Discriminants Budget Algorithm Search Space Call

    Frequency Method Size Method Size Nested Knapsack All IDT Methods Post-Inlining Transformations !59 Estimate Post-Inlining Transformations !106 themaplelab/openj9
  49. @karimhamdanali • Understanding the internals of neural networks is limited

    due to their complexity • Fixing errors in neural networks without retraining is hard and currently not supported • We use Rosette to solve for changes in weights to a neural network • Rosette is able to represent neural networks and their results as symbolic values, which can then be solved for, under the assertion that a given data point is correct OVERVIEW Rosette Objective Adjust n weights We use rosette to solve for changes in network weights subject to the following objective To maximize Weight Selection EVALUATION WEIGHT SELECTION METHOD TRAINING SOLVING Training Solving Evaluation #lang rosette (define-symbolic x integer?) (assert (> x3)) (define solution (solve x)) > (evaluate x solution) 4 Fixing Neural Networks using Solver-Aided Languages !107 coming soon...
  50. @karimhamdanali Future of Program Analysis !111 • Understanding the internals

    of neural networks is limited due to their complexity • Fixing errors in neural networks without retraining is hard and currently not supported • We use Rosette to solve for changes in weights to a neural network • Rosette is able to represent neural networks and their results as symbolic values, which can then be solved for, under the assertion that a given data point is correct OVERVIEW Rosette Objective Adjust n weights We use rosette to solve for changes in network weights subject to the following objective To maximize Weight Selection • We evaluate network performance before and after solving • Network with 784 input nodes, 300 hidden, and 10 output nodes • On average, after making changes, 99.85% of testing points remain correctly classified EVALUATION WEIGHT SELECTION METHOD TRAINING SOLVING Training Effect of Number of Symbolic Weights on Runtime Solving Evaluation #lang rosette (define-symbolic x integer?) (assert (> x3)) (define solution (solve x)) > (evaluate x solution) 4 @karimhamdanali Discriminants Budget Algorithm Search Space Call Frequency Method Size Method Size Nested Knapsack All IDT Methods Post-Inlining Transformations !59 Extra Images: SIGPLAN Blog
  51. @karimhamdanali Program Analysis !6 goto fail; goto fail; © Copyright

    2014, Philip Koopman. CC Attribution 4.0 International license. 5 http://www.cbsnews.com/news/toyota-unintended-acceleration-has-killed-89/ May 25, 2010
  52. @karimhamdanali Program Analysis !6 goto fail; goto fail; © Copyright

    2014, Philip Koopman. CC Attribution 4.0 International license. 5 http://www.cbsnews.com/news/toyota-unintended-acceleration-has-killed-89/ May 25, 2010 @karimhamdanali Program Analysis !13 Code Navigation Code Recommenders Code Refactoring Constant Propagation Dead Code Elimination Static Inlining Parallelization
  53. @karimhamdanali Program Analysis !6 goto fail; goto fail; © Copyright

    2014, Philip Koopman. CC Attribution 4.0 International license. 5 http://www.cbsnews.com/news/toyota-unintended-acceleration-has-killed-89/ May 25, 2010 @karimhamdanali Program Analysis !13 Code Navigation Code Recommenders Code Refactoring Constant Propagation Dead Code Elimination Static Inlining Parallelization @karimhamdanali !103 Scalability Usability Precision
  54. @karimhamdanali Program Analysis !6 goto fail; goto fail; © Copyright

    2014, Philip Koopman. CC Attribution 4.0 International license. 5 http://www.cbsnews.com/news/toyota-unintended-acceleration-has-killed-89/ May 25, 2010 @karimhamdanali Program Analysis !13 Code Navigation Code Recommenders Code Refactoring Constant Propagation Dead Code Elimination Static Inlining Parallelization @karimhamdanali !103 Scalability Usability Precision @karimhamdanali Future of Program Analysis !111 Fixing Neural Networks with Solver-Aided Languages Revan MacQueen1, Julian Dolby2, Karim Ali1 1UNIVERSITY OF ALBERTA, 2IBM RESEARCH https://github.com/themaplelab/ML-SE • Understanding the internals of neural networks is limited due to their complexity • Fixing errors in neural networks without retraining is hard and currently not supported • We use Rosette to solve for changes in weights to a neural network • Rosette is able to represent neural networks and their results as symbolic values, which can then be solved for, under the assertion that a given data point is correct OVERVIEW Rosette Objective Adjust n weights We use rosette to solve for changes in network weights subject to the following objective To maximize Weight Selection • We evaluate network performance before and after solving • Network with 784 input nodes, 300 hidden, and 10 output nodes • On average, after making changes, 99.85% of testing points remain correctly classified EVALUATION WEIGHT SELECTION METHOD TRAINING SOLVING Training Effect of Number of Symbolic Weights on Runtime Solving Evaluation #lang rosette (define-symbolic x integer?) (assert (> x3)) (define solution (solve x)) > (evaluate x solution) 4 @karimhamdanali Analysis-Driven Inliner Discriminants Budget Algorithm Search Space Call Frequency Method Size Method Size Nested Knapsack All IDT Methods Post-Inlining Transformations !59 Extra Images: SIGPLAN Blog
  55. Karim Ali University of Alberta @karimhamdanali Is Program Analysis The

    Silver Bullet Against Software Bugs? @karimhamdanali Program Analysis !6 goto fail; goto fail; © Copyright 2014, Philip Koopman. CC Attribution 4.0 International license. 5 http://www.cbsnews.com/news/toyota-unintended-acceleration-has-killed-89/ May 25, 2010 @karimhamdanali Program Analysis !13 Code Navigation Code Recommenders Code Refactoring Constant Propagation Dead Code Elimination Static Inlining Parallelization @karimhamdanali !103 Scalability Usability Precision @karimhamdanali Future of Program Analysis !111 Fixing Neural Networks with Solver-Aided Languages Revan MacQueen1, Julian Dolby2, Karim Ali1 1UNIVERSITY OF ALBERTA, 2IBM RESEARCH https://github.com/themaplelab/ML-SE • Understanding the internals of neural networks is limited due to their complexity • Fixing errors in neural networks without retraining is hard and currently not supported • We use Rosette to solve for changes in weights to a neural network • Rosette is able to represent neural networks and their results as symbolic values, which can then be solved for, under the assertion that a given data point is correct OVERVIEW Rosette Objective Adjust n weights We use rosette to solve for changes in network weights subject to the following objective To maximize Weight Selection • We evaluate network performance before and after solving • Network with 784 input nodes, 300 hidden, and 10 output nodes • On average, after making changes, 99.85% of testing points remain correctly classified EVALUATION WEIGHT SELECTION METHOD TRAINING SOLVING Training Effect of Number of Symbolic Weights on Runtime Solving Evaluation #lang rosette (define-symbolic x integer?) (assert (> x3)) (define solution (solve x)) > (evaluate x solution) 4 @karimhamdanali Analysis-Driven Inliner Discriminants Budget Algorithm Search Space Call Frequency Method Size Method Size Nested Knapsack All IDT Methods Post-Inlining Transformations !59 Extra Images: SIGPLAN Blog