On the Prevalence, Evolution, and Impact of Code Smells in Simulation Modelling Software

Simulation modelling systems are routinely used to test or understand real-world scenarios in a controlled setting. They have found numerous applications in scientific research, engineering, and industrial operations. Due to their complex nature, simulation systems can suffer from various code quality issues and technical debt. However, to date, there has been no investigation into their code quality issues (e.g. code smells). In this paper, we conduct an empirical study investigating the prevalence, evolution, and impact of code smells in simulation software systems. First, we employ static analysis tools (e.g. Designite) to detect and quantify the prevalence of various code smells in 155 simulation and 327 traditional projects from GitHub. Our findings reveal that certain code smells (e.g. Long Statement, Magic Number) are more prevalent in simulation software systems than in traditional software systems. Second, we analyze the evolution of these code smells across multiple project versions and investigate their chances of survival. Our experiments show that some code smells, such as Magic Number and Long Parameter List, can survive a long time in simulation software systems. Finally, we examine any association between software bugs and code smells. Our experiments show that although Design and Architecture code smells are introduced simultaneously with bugs, there is no significant association between code smells and bugs in simulation systems.

Masud Rahman

December 12, 2024

Transcript

  1. Simulation Modelling

    Real-world processes in a controlled, virtual environment. Highly complex models abstracting real-world physical systems. Complexity ranges from simple equations to interactions among thousands of entities.
  2. Simulation Modelling: Applications

    Medical risk factors and drug development (Katsaliaki & Mustafee, 2011). Flight paths and military mission rehearsals (Allerton, 2010). Travel routes and human activity (Balmer et al., 2004). Simulation software market: $20.96 billion (2023).
  3. Code Smells in Simulation Systems?

    • Code smells differ across domains: Android (Hecht et al., 2016), Deep Learning (Van Oort et al., 2021; Jebnoun et al., 2020), and Data-Intensive Systems (Muse et al., 2020).
    • Code smells lead to problems with maintainability (Yamashita & Moonen, 2013) and performance (Hecht et al., 2016).
    • No prior study has examined code smells in simulation modelling systems.
    • Such a study can offer important insights into their code quality and maintenance issues.
  4. Research Questions

    • RQ1: Do simulation software systems smell like traditional software systems?
    • RQ2: How long do code smells last in simulation software systems?
    • RQ3: Do code smells co-occur with bugs in simulation software systems?
  5. Study Methodology

    [Flowchart] Traditional systems (772) and simulation systems (422) → sampling and filtration → final dataset (327 traditional + 155 simulation).
    • Answer RQ1: detect code smells using Designite; analyze detected code smells; Mann-Whitney U & Cliff's delta tests.
    • Answer RQ2: detect code smells for all commits; Kaplan-Meier survival analysis.
    • Answer RQ3: mine bug-fixing and bug-inducing commits; Chi-squared & Cramér's V tests.
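The methodology names the Mann-Whitney U and Cliff's delta tests for comparing the two groups of projects. The sketch below is only an illustration of those two statistics in plain Python; the function names and the smell-density values are made up for the example and are not taken from the study's data or tooling.

```python
from itertools import product


def mann_whitney_u(xs, ys):
    """U statistic for xs: pairs with x > y count 1, ties count 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(xs, ys)
    )


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical smell densities (smells per 1,000 lines), illustration only.
simulation = [4.0, 6.5, 9.0, 12.0]
traditional = [3.0, 5.0, 6.5]

u_stat = mann_whitney_u(simulation, traditional)
delta = cliffs_delta(simulation, traditional)
print(f"U = {u_stat}, Cliff's delta = {delta:.3f}")
```

The two statistics are linked by delta = 2U / (m · n) − 1, where m and n are the sample sizes, so the effect size can be read directly from U. A p-value for U would normally come from a statistics library rather than from hand-written code.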
  6. RQ1: Prevalence of Code Smells

    • Code smells occur more frequently in simulation systems.
    • Simulation systems contain more code smells per line of code than traditional systems.
    • The prevalence of code smells varies widely across simulation systems.
  7. RQ1: Difference by Level of Abstraction

    • Implementation smells (e.g. Magic Number, Long Statement) are the most prevalent in simulation systems.
    • Design smells (e.g. Deep Hierarchy, Broken Modularization) are slightly less prevalent in simulation systems.
    • Architectural smells (e.g. Cyclic Dependency, Dense Structure) are equally prevalent in both types of systems.
  8. RQ1: Difference in Individual Smells

    • Simulation systems contain significantly more Magic Number smells.
    • This shows limited use of named constants.
    • Simulation systems also contain more Long Statement and Long Parameter List smells.
    • This signals overly specialized code and a lack of reusable classes and methods.
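To make the Magic Number smell concrete, here is a small made-up example of a simulation time step written with and without named constants. The functions, constant names, and values are hypothetical; they are not drawn from any project in the study, and the exact rule Designite uses to flag magic numbers is not described in these slides.

```python
# Smelly version: the literals 0.01 and 9.81 carry no meaning on their own.
def step_height_smelly(height, velocity):
    return height + velocity * 0.01 - 0.5 * 9.81 * 0.01 ** 2


# Refactored version: each physical quantity has a name and a unit.
TIME_STEP_S = 0.01          # assumed integration step, in seconds
GRAVITY_M_PER_S2 = 9.81     # standard gravitational acceleration


def step_height(height, velocity, dt=TIME_STEP_S):
    """Advance the height of a falling body by one time step."""
    return height + velocity * dt - 0.5 * GRAVITY_M_PER_S2 * dt ** 2


print(step_height(10.0, 2.0))
```

Both functions compute the same value; the refactored one lets a reader change the time step in one place and see which literal means what.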
  9. RQ2: Survival of Code Smells

    • Code smells survive longer in simulation software systems.
    • Code smells (e.g. Broken Hierarchy, Long Method) are refactored late in simulation software systems.
  10. RQ2: Difference by Level of Abstraction

    • Implementation code smells survive the longest, with a median survival time of 3,499 days.
    • Several implementation code smells (e.g. Long Method, Magic Number) are never refactored.
    • Most design and architectural smells are eventually refactored, but only at a later date.
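Survival times like these come from Kaplan-Meier analysis, which can account for smells that were never refactored by treating them as censored observations. The following is a minimal plain-Python sketch of the estimator under that assumption; the function names and the lifetimes are invented for illustration and do not reproduce the study's pipeline or data.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time where at least one event occurred.

    durations: lifetime of each smell instance in days.
    observed:  True if the smell was removed (event), False if it was
               still present at the last analyzed commit (censored).
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have equal length")
    event_times = sorted({d for d, e in zip(durations, observed) if e})
    curve, survival = [], 1.0
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


def median_survival_time(curve):
    """Smallest time at which S(t) falls to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical smell lifetimes in days, illustration only.
durations = [120, 400, 400, 900, 1500, 2000]
observed = [True, True, False, True, False, True]

curve = kaplan_meier(durations, observed)
print(curve)
print("median survival time:", median_survival_time(curve))
```

If the curve never drops to 0.5, the median is undefined, which is one way a group of smells that is mostly never refactored can show up in such an analysis.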
  11. RQ2: Survival of Smell Categories

    • Code smells are categorized according to Fowler (2018).
    • Bloater and Dispensable smells survive the longest.
    • Change Preventers are refactored through large changes in the codebase.
    • OOP Abusers and Couplers are refactored periodically.
  12. RQ2: Survival of Individual Smells

    • The Broken Hierarchy code smell has the highest survival time of 3,661 days.
    • Implementation code smells have a median survival time of 3,499 days.
    • The Abstract Function Call From Constructor smell has the lowest survival time of 1,733 days.
    MST = Mean Survival Time
  13. RQ3: Prevalence of Smells in Commits

    • Design and Architectural code smells are more prevalent in bug-inducing commits.
    • Implementation smells are prevalent in both bug-inducing and bug-fixing commits.
  14. RQ3: Association Tests

    • No code smell has a p-value below 0.05.
    • This suggests that code smells and bugs have no statistically significant association.
    • Cramér's V is below 0.3 for every code smell.
    • This indicates at most a weak association between code smells and software bugs.
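As a rough illustration of how a Chi-squared test and Cramér's V fit together, the sketch below computes both for a single contingency table in plain Python. The table layout (smell present or absent versus bug-inducing or other commits), the counts, and the function names are assumptions made for the example; the slides do not specify how the study built its tables. No continuity correction is applied, and the p-value helper is valid only for one degree of freedom (a 2×2 table).

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic; assumes all row/column totals > 0."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_squared(table) / (n * k))


def p_value_df1(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts, illustration only.
# Rows: smell present / absent. Columns: bug-inducing / other commits.
table = [[30, 70],
         [25, 75]]

chi2 = chi_squared(table)
v = cramers_v(table)
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}, Cramér's V = {v:.3f}")
```

With these made-up counts the p-value is above 0.05 and V is below 0.3, the same pattern the slide reports for every code smell.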
  15. Implications of Findings

    • Simulation systems have excessive instances of Magic Number smells, suggesting improper use of constants. (RQ1)
    • The significant presence of Long Statement and Long Method smells indicates excessive complexity in simulation systems. (RQ1)
    • Code smells at the implementation level are rarely refactored over the development cycle of simulation systems. (RQ2)
    • Simulation systems amass Dispensable code smells during development, leaving unused classes and methods that are rarely refactored away. (RQ2)
    • Simulation systems also accumulate Bloater code smells over their development cycle. (RQ2)
    • No significant association between bugs and code smells was found in simulation systems. (RQ3)
  16. Controversial Statements

    • We understand bugs in traditional software systems, deep learning systems, and data-intensive systems. What is a bug in a simulation modelling system?
    • Simulation models are transparent but complex, unlike data-driven DL models. Do we need a different SE practice for them?
  17. References

    • Simulation Software Market Size & Share Report, 2030 (no date). Available at: https://www.grandviewresearch.com/industry-analysis/simulation-software-market# (Accessed: 01 October 2024).
    • Jebnoun, H., Ben Braiek, H., Rahman, M., & Khomh, F. (2020). The Scent of Deep Learning Code: An Empirical Study. In Proceedings of the 17th International Conference on Mining Software Repositories (pp. 420–430). Association for Computing Machinery.
    • Yamashita, A., & Moonen, L. (2013). Exploring the impact of inter-smell relations on software maintainability: An empirical study. In 2013 35th International Conference on Software Engineering (ICSE) (pp. 682–691).
    • Hecht, G., Moha, N., & Rouvoy, R. (2016). An empirical study of the performance impacts of Android code smells. In Proceedings of the International Conference on Mobile Software Engineering and Systems (pp. 59–69). Association for Computing Machinery.
    • Van Oort, B., Cruz, L., Aniche, M., & Van Deursen, A. (2021). The prevalence of code smells in machine learning projects. In 2021 IEEE/ACM 1st Workshop on AI Engineering-Software Engineering for AI (WAIN) (pp. 1–8).
    • Muse, B., Rahman, M., Nagy, C., Cleve, A., Khomh, F., & Antoniol, G. (2020). On the Prevalence, Impact, and Evolution of SQL Code Smells in Data-Intensive Systems. In Proceedings of the 17th International Conference on Mining Software Repositories (pp. 327–338). Association for Computing Machinery.
    • Sharma, T., Mishra, P., & Tiwari, R. (2016). Designite: A software design quality assessment tool. In Proceedings of the 1st International Workshop on Bringing Architectural Design Thinking into Developers' Daily Activities (BRIDGE '16) (pp. 1–4). Association for Computing Machinery. https://doi.org/10.1145/2896935.2896938
    • Katsaliaki, K., & Mustafee, N. (2011). Applications of simulation within the healthcare context. Journal of the Operational Research Society, 62(8), 1431–1451.
    • Balmer, M., Nagel, K., & Raney, B. (2004, October). Large-scale multi-agent simulations for transportation applications. In Intelligent Transportation Systems (Vol. 8, No. 4, pp. 205–221). Taylor & Francis Group.
    • Allerton, D. J. (2010). The impact of flight simulation in aerospace. The Aeronautical Journal, 114(1162), 747–756.
    • Fowler, M. (2018). Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional.