2024_uzh_collo.pdf

Philipp Leitner, Associate Professor, http://icet-lab.eu
Measuring, Predicting, and Improving the Performance of Software Systems

Chalmers
Associate Professor, Unit Leader: 2017 - ongoing
Postdoc: 2014 - 2017
PhD Student, Postdoc: 2007 - 2014

Let’s Talk About Software Performance

“Premature optimization is the root of all evil.”

“Healthcare.gov was officially launched on 1 October 2013 (…) High website demand (…) caused the website to go down within 2 hours of launch.”
“A total of 6 users completed and submitted their applications and selected a health insurance plan on the first day.”
https://rctom.hbs.org/submission/the-failed-launch-of-www-healthcare-gov/

Performance testing as the “last step” before release
https://www.geeksforgeeks.org/performance-testing-software-testing/

“Premature optimization is the root of all evil.”
“Performance issues should always be considered from the beginning.”

“Shift Left” https://devopedia.org/shift-left

How to “Shift Left” Performance Testing
- Microbenchmarking
- Performance Prediction
- Debloating
- Generating Efficient Code

Benchmarking and Predicting Performance

Microbenchmarking with JMH
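JMH itself is a Java harness, and its API is not reproduced here. As a rough, language-neutral illustration of what a microbenchmark harness does (discard warm-up iterations, then record repeated measurements and summarize them), a minimal Python sketch might look as follows. The names `run_benchmark`, `summarize`, and `workload`, as well as the iteration counts, are assumptions made for this example.

```python
import statistics
import time


def run_benchmark(fn, warmup_iterations=5, measurement_iterations=20):
    """Time fn() repeatedly and return per-iteration durations in nanoseconds."""
    # Warm-up: execute without recording so that caches (and, on JIT-compiled
    # runtimes such as the JVM, compiled code) can settle before measuring.
    for _ in range(warmup_iterations):
        fn()

    durations_ns = []
    for _ in range(measurement_iterations):
        start = time.perf_counter_ns()
        fn()
        durations_ns.append(time.perf_counter_ns() - start)
    return durations_ns


def summarize(durations_ns):
    """Reduce raw samples to a few descriptive statistics."""
    return {
        "mean_ns": statistics.mean(durations_ns),
        "stdev_ns": statistics.stdev(durations_ns),
        "min_ns": min(durations_ns),
    }


def workload():
    # Hypothetical code under test.
    return sum(i * i for i in range(1_000))


if __name__ == "__main__":
    print(summarize(run_benchmark(workload)))
```

Even this small harness hints at the cost problem: trustworthy numbers need many iterations, which is part of why microbenchmark suites take long to run.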

Some issues and solutions: Benchmarks are difficult to write
- Benchmark bug finders using static analysis
- Benchmark generators
Costa, Bezemer, Leitner, Andrzejak (2021). What's Wrong with My Benchmark Results? Studying Bad Practices in JMH Benchmarks. IEEE Transactions on Software Engineering
Jangali, Tang, Alexandersson, Leitner, Yang, Shang (2022). Automated Generation and Evaluation of JMH Microbenchmark Suites from Unit Tests. IEEE Transactions on Software Engineering
Rodriguez-Cancio, Combemale, Baudry (2016). Automatic microbenchmark generation to prevent dead code elimination and constant folding. In ASE '16

Some issues and solutions: Benchmarks take a long time to execute
- Smart reconfiguration
- Benchmark selection
Laaber, Würsten, Gall, Leitner (2020). Dynamically reconfiguring software microbenchmarks: reducing execution time without sacrificing result quality. In ESEC/FSE 2020
Traini, Cortellessa, Di Pompeo, Tucci (2023). Towards effective assessment of steady state performance in Java software: are we there yet? In Empirical Software Engineering
Laaber, Gall, Leitner (2021). Applying test case prioritization to software microbenchmarks. In Empirical Software Engineering
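The idea behind smart reconfiguration, stopping a benchmark as soon as its results look stable instead of always running a fixed number of iterations, can be sketched with a simple stability check. The sketch below uses the coefficient of variation over a sliding window. The window size, the threshold, and the function names are assumptions for illustration and are not taken from the cited papers.

```python
import statistics


def coefficient_of_variation(samples):
    """Standard deviation relative to the mean (assumes a positive mean)."""
    return statistics.stdev(samples) / statistics.mean(samples)


def iterations_until_stable(measure, window=5, threshold=0.02, max_iterations=100):
    """Call measure() until the last `window` samples vary less than `threshold`.

    Returns all collected samples; gives up after `max_iterations`.
    """
    samples = []
    while len(samples) < max_iterations:
        samples.append(measure())
        if len(samples) >= window:
            if coefficient_of_variation(samples[-window:]) < threshold:
                break  # measurements look stable, stop early
    return samples


# Synthetic measurements (hypothetical values in ns): a warm-up phase
# that settles at 100 ns.
fake_measurements = iter([200, 150, 120, 110, 101] + [100] * 20)
samples = iterations_until_stable(lambda: next(fake_measurements))
print(len(samples), "iterations executed instead of 100")
```

The trade-off sits in the threshold: a looser value stops earlier but risks reporting results from a benchmark that has not yet reached a steady state.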

Fundamental problems remain …
- Microbenchmarks cannot provide immediate feedback
- No “as you type” performance assessment

Performance Prediction
- Predicting the performance impact of code changes
- Identifying bottlenecks
- Suggesting improvements before execution!
Cito, Leitner, Rinard, Gall (2019). Interactive Production Performance Feedback in the IDE. In International Conference on Software Engineering (ICSE’19)

[Diagram, built up over three slides: Code, Workload Info, Performance Prediction, 14ns]

How to represent source code for machine learning?
- Tokens
- Trees
- Graphs
Samoaa, Bayram, Salza, Leitner (2022). A systematic mapping study of source code representation for deep learning in software engineering. In IET Software.
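To make the first two representations concrete, the sketch below derives a token sequence and a tree from the same small function using Python's standard `tokenize` and `ast` modules. The example function `area` and the helper names are assumptions for illustration. A graph representation would add further edges on top of such a tree.

```python
import ast
import io
import tokenize

# Hypothetical code snippet to be represented.
SOURCE = "def area(w, h):\n    return w * h\n"


def as_tokens(source):
    """Sequence view: lexical tokens in reading order, layout tokens dropped."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.string.strip()]


def as_tree(source):
    """Tree view: nested (node type, children) tuples taken from the AST."""
    def convert(node):
        children = [convert(child) for child in ast.iter_child_nodes(node)]
        return (type(node).__name__, children)
    return convert(ast.parse(source))


if __name__ == "__main__":
    print(as_tokens(SOURCE))
    print(as_tree(SOURCE))
```

The token view keeps only order, while the tree view also records nesting, for example that the multiplication belongs to the `return` statement.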

FA-AST: “Flow Augmented” AST
Samoaa, Longo, Mohamad, Leitner (2022). TEP-GNN: Accurate Execution Time Prediction of Functional Tests Using Graph Neural Networks. In Product-Focused Software Process Improvement.
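The general idea of augmenting an AST with flow information can be sketched as a graph whose edges carry a type. The sketch below builds such a graph for a Python snippet with the standard `ast` module. The edge kinds (`child`, `next_sibling`, `next_stmt`, `loop_back`) and the function name are illustrative assumptions, not the exact edge set used in the cited paper.

```python
import ast

# Hypothetical code snippet to be turned into a graph.
SOURCE = (
    "def f(n):\n"
    "    total = 0\n"
    "    for i in range(n):\n"
    "        total += i\n"
    "    return total\n"
)


def build_flow_augmented_graph(source):
    """Return (labels, edges); each edge is a (source, target, kind) triple."""
    labels, edges, ids = [], [], {}

    def visit(node):
        node_id = len(labels)
        labels.append(type(node).__name__)
        ids[id(node)] = node_id
        child_ids = [visit(child) for child in ast.iter_child_nodes(node)]
        # Plain AST structure.
        edges.extend((node_id, c, "child") for c in child_ids)
        # Left-to-right order among siblings.
        edges.extend((a, b, "next_sibling") for a, b in zip(child_ids, child_ids[1:]))
        # Flow between consecutive statements of a block.
        body = getattr(node, "body", None)
        if isinstance(body, list) and body:
            stmt_ids = [ids[id(stmt)] for stmt in body]
            edges.extend((a, b, "next_stmt") for a, b in zip(stmt_ids, stmt_ids[1:]))
            if isinstance(node, (ast.For, ast.While)):
                # The end of a loop body flows back to the loop header.
                edges.append((stmt_ids[-1], node_id, "loop_back"))
        return node_id

    visit(ast.parse(source))
    return labels, edges


if __name__ == "__main__":
    labels, edges = build_flow_augmented_graph(SOURCE)
    for src, dst, kind in edges:
        if kind in ("next_stmt", "loop_back"):
            print(labels[src], "->", labels[dst], f"({kind})")
```

Typed edges like these are what a graph neural network can exploit beyond the plain tree, at the price of a larger graph.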

ML Architecture
Samoaa, Longo, Mohamad, Leitner (2022). TEP-GNN: Accurate Execution Time Prediction of Functional Tests Using Graph Neural Networks. In Product-Focused Software Process Improvement.

Experiments
Predicting execution time of unit tests
Samoaa, Longo, Mohamad, Leitner (2022). TEP-GNN: Accurate Execution Time Prediction of Functional Tests Using Graph Neural Networks. In Product-Focused Software Process Improvement.

Results
Samoaa, Longo, Mohamad, Leitner (2022). TEP-GNN: Accurate Execution Time Prediction of Functional Tests Using Graph Neural Networks. In Product-Focused Software Process Improvement.

Challenges
- Predicting execution time of even very simple code is hard
- Graphs are enormous
  - Find ways to (smartly) simplify
- Data is expensive to collect

Ongoing Work - How to Predict With Little Data
Active Learning
Samoaa, Aronsson, Longa, Leitner, Chehreghani (2024). A Unified Active Learning Framework for Annotating Graph Data for Regression Tasks. In Journal of Engineering Applications of Artificial Intelligence.
Samoaa, Aronsson, Leitner, Chehreghani (2023). Batch Mode Deep Active Learning for Regression on Graph Data. In IEEE International Conference on Big Data (BigData).
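In pool-based active learning, only a small batch of unlabeled samples is selected for expensive labeling (here: actually measuring execution time) in each round, after which the model is retrained. The sketch below shows one simple, model-free way to pick such a batch: greedy farthest-first (diversity) sampling. It is a generic baseline strategy, not the method of the cited papers. The plain feature vectors stand in for learned graph embeddings, and all names are assumptions for illustration.

```python
import math


def select_batch(labeled, pool, batch_size):
    """Greedily pick pool indices that are far from everything chosen so far.

    `labeled` must be non-empty; `labeled` and `pool` hold feature vectors
    (tuples of floats) of equal dimension.
    """
    reference = list(labeled)
    candidates = list(range(len(pool)))
    chosen = []
    for _ in range(min(batch_size, len(candidates))):
        # Score each candidate by its distance to the nearest reference point.
        best = max(
            candidates,
            key=lambda i: min(math.dist(pool[i], ref) for ref in reference),
        )
        chosen.append(best)
        reference.append(pool[best])  # avoid picking near-duplicates next
        candidates.remove(best)
    return chosen


# Hypothetical 2-D embeddings: one sample already measured, four not yet.
labeled = [(0.0, 0.0)]
pool = [(0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 4.0)]
print(select_batch(labeled, pool, batch_size=2))
```

Uncertainty-based strategies would replace the distance score with a measure of how unsure the current model is about each candidate. The surrounding loop of select, measure, and retrain stays the same.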

Future Work
- Using domain knowledge to simplify graphs
- Integrating some runtime information
- Scenario: Predicting the performance of a (small) code change
  - Rather than predicting the performance of entirely unseen code

- Microbenchmarking
- Performance Prediction
- Debloating
- Generating Efficient Code

Maybe LLMs will generate optimal code for us?

- Currently: definitely not
- In the future: unlikely (at least purely with LLMs)
  - Custom algorithm design is not particularly amenable to pattern learning
Liu et al.: Evaluating Language Models for Efficient Code Generation
Qiu et al.: How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark
Niu et al.: On Evaluating the Efficiency of Source Code Generated by LLMs
(all recent work available on arXiv)

Summary

References
Costa, Bezemer, Leitner, Andrzejak (2021). What's Wrong with My Benchmark Results? Studying Bad Practices in JMH Benchmarks. IEEE Transactions on Software Engineering, 47(7), pp. 1452-1467
Jangali, Tang, Alexandersson, Leitner, Yang, Shang (2022). Automated Generation and Evaluation of JMH Microbenchmark Suites from Unit Tests. IEEE Transactions on Software Engineering
Cito, Leitner, Rinard, Gall (2019). Interactive Production Performance Feedback in the IDE. In Proceedings of the 41st International Conference on Software Engineering (ICSE)
Samoaa, Bayram, Salza, Leitner (2022). A systematic mapping study of source code representation for deep learning in software engineering. In IET Software.
Samoaa, Longo, Mohamad, Leitner (2022). TEP-GNN: Accurate Execution Time Prediction of Functional Tests Using Graph Neural Networks. In Product-Focused Software Process Improvement.
Samoaa, Aronsson, Longa, Leitner, Chehreghani (2024). A Unified Active Learning Framework for Annotating Graph Data for Regression Tasks. In Journal of Engineering Applications of Artificial Intelligence.
