
Software Analytics for Pragmatists (DevOps Camp 2017)

Markus Harrer
May 13, 2017


Each step in the development or use of software leaves valuable digital traces. Analyzing this "software data" (such as runtime measurements, log files, or commits) turns gut feelings into facts backed by sound evidence.

I'll show how questions that arise in software development can be answered in an automated, data-driven, and reproducible way. I demonstrate how open source analysis tools (such as jQAssistant, Neo4j, Pandas, and Jupyter) work together to analyze data from different sources (such as JProfiler, Jenkins, and Git). Together, we look at how to develop solutions that optimize performance, identify build breakers, or make knowledge gaps in our source code visible.
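As a rough illustration of the "knowledge gaps" question, here is a minimal sketch that is not taken from the talk. It assumes the Git history has been exported with `git log --numstat --pretty=format:"--%an"`, and it treats a file as a potential knowledge gap when a single author contributed at least 80% of its added lines. The sample log, the function names, and the 80% threshold are all hypothetical choices made for this example. Only the Python standard library is used so the sketch runs without extra packages.

```python
from collections import Counter, defaultdict

# Hypothetical sample in the shape printed by
#   git log --numstat --pretty=format:"--%an"
# (an author marker line, then "added<TAB>deleted<TAB>path" per changed file).
SAMPLE_LOG = """\
--Alice
10\t2\tsrc/Main.java
3\t0\tsrc/Util.java

--Bob
1\t1\tsrc/Main.java

--Alice
7\t4\tsrc/Main.java
-\t-\tdocs/logo.png
"""


def added_lines_per_author(log_text):
    """Return {path: Counter({author: added_lines})} parsed from the log text."""
    ownership = defaultdict(Counter)
    author = None
    for line in log_text.splitlines():
        if line.startswith("--"):
            # Marker line written by the --pretty format: start of a new commit.
            author = line[2:].strip()
        elif line.strip() and author is not None:
            added, _deleted, path = line.split("\t", 2)
            if added.isdigit():  # binary files are reported as "-" and skipped
                ownership[path][author] += int(added)
    return ownership


def knowledge_gaps(ownership, threshold=0.8):
    """List (path, author, share) where one author has >= threshold of the added lines.

    The threshold is an assumed heuristic, not a value from the talk.
    """
    gaps = []
    for path, counts in sorted(ownership.items()):
        author, lines = counts.most_common(1)[0]
        share = lines / sum(counts.values())
        if share >= threshold:
            gaps.append((path, author, round(share, 2)))
    return gaps


if __name__ == "__main__":
    for path, author, share in knowledge_gaps(added_lines_per_author(SAMPLE_LOG)):
        print(f"{path}: {author} wrote {share:.0%} of the added lines")
```

Counting added lines is only one possible proxy for knowledge. Counting commits per author or weighting recent changes more heavily would be equally reasonable starting points, and the same table could just as well be loaded into a Pandas DataFrame inside a Jupyter notebook.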


Transcript

  1. Software Analytics for Pragmatists. Solving problems: automated, data-centric and reproducible. Markus Harrer: software analytics, clean code
  2. Motivation: Software Analytics. "In software engineering there is much we are seeing, but little we are learning." (Tim Menzies)
  3. Motivation: Why?
     • Make problems visible
       + Improve clarity and understanding
     • Drive decisions
       + Raise money
       + Raise more money for further analysis
     • Support continuous learning
       + Master challenges
       + Strive for steady improvement
  4. Pipeline
     Data Mining: • NumPy • scikit-learn • SciPy
     Visualization: • matplotlib • plot.ly • Bokeh • python-pptx ...
  5. Pipeline (diagram). Stages: Input, Preprocessing, Analysis, Output. Diagram labels: XML/Graph, Tables, Text; jQAssistant, Neo4j, Pandas, ...; matplotlib, D3; xlsx, pptx; Python, Jupyter
  6. Pipeline: Inputs (diagram). Tools: jQAssistant, Pandas. Diagram labels: ZIP, GZ, *.class, JAR, WAR, EAR, MANIFEST.MF, *.properties, XSD, YAML, XML, application.xml, web.xml, beans.xml, JaCoCo, FindBugs, CheckStyle, pom.xml, surefire-reports.xml, RDBMS Schema, M2 Repository, DB, CSV, Excel, BigQuery, HDFStore, Web, JSON, Git
  7. Always ask the question: "What's the value of the information relative to the effort needed to analyze it?"
  8. References
     Leek, Jeff: The Elements of Data Analytic Style. LeanPub, 2015.
     McKinney, Wes: Python for Data Analysis. O'Reilly, 2012.
     Mens, Tom; Serebrenik, Alexander; Cleve, Anthony: Evolving Software Systems. Springer, 2014.
     Mens, Tom; Demeyer, Serge: Software Evolution. Springer, 2008.
     Shull, Forrest; Singer, Janice; Sjøberg, Dag I.K.: Guide to Advanced Empirical Software Engineering. Springer, 2008.
     Tornhill, Adam: Your Code As a Crime Scene. Pragmatic Programmers, 2015.