sizes, data sets - VM parameters, hardware platform • Evaluation Methodology - Which numbers to report (avg, geomean, best, worst) - Non-determinism in or out (compilation, GC times, etc.) - Statistical rigorous methodologies [1] [1] A. George, D. Buytaert, L. Eeckhout. Statistically Rigorous Java Performance Evaluation, In OOPSLA 2007.
“five-why” rule • Diversify data sets and input sizes • Diversify hardware platform - Unless you optimize for a specific one • Pin down VM version - Change of VM (version, vendor, etc.)? - Redo all experiments • Diversify VM parameters - Can dramatically change performance - Top influencers: GC, heap sizes, heap layout
that most relate to your app - Mobile app | Startup times - Enterprise app | Peak performance - Real time app | Non-deterministic factors, outliers, noise • Best vs Avg vs Worst numbers - Closely related to what we measure - Always report stdev - Be clear and precise about the reported numbers • Always perform “apples-to-apples” comparison