field of “programming languages and systems”
• Semi-structured interviews
• Ad hoc result analysis
• tiny sample
• not representative
• not the same for all interviews
• Interpretation biases
groups
>70% do some preparation
<30% do no preparation
Preparation may include
• disabling daemons, disk usage, Address Space Layout Randomization
• disabling turbo boost, frequency scaling
• NUMA-node pinning, thread pinning
👍 for awareness
But it requires expertise and is not trivial
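The preparation steps above can be checked before a run. A minimal sketch, assuming a Linux machine: it only reads the kernel's settings files and reports whether ASLR and turbo boost look disabled and which frequency governor is active. The function names are hypothetical, the `no_turbo` file exists only with the `intel_pstate` driver, and a missing file is reported as unknown rather than guessed.

```python
from pathlib import Path

# Linux-specific locations (assumption: adapt or override them for your machines).
ASLR = Path("/proc/sys/kernel/randomize_va_space")
NO_TURBO = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")  # intel_pstate only
GOVERNOR = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def read_setting(path):
    """Return the stripped file content, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def preparation_report(aslr=ASLR, no_turbo=NO_TURBO, governor=GOVERNOR):
    """Report which settings look benchmark-ready; None means 'unknown'."""
    aslr_value = read_setting(aslr)
    turbo_value = read_setting(no_turbo)
    governor_value = read_setting(governor)
    return {
        # randomize_va_space: 0 means ASLR is off.
        "aslr_disabled": None if aslr_value is None else aslr_value == "0",
        # no_turbo: 1 means turbo boost is off.
        "turbo_disabled": None if turbo_value is None else turbo_value == "1",
        # Only checks cpu0; other cores may be configured differently.
        "performance_governor": (
            None if governor_value is None else governor_value == "performance"
        ),
    }


if __name__ == "__main__":
    for name, ok in preparation_report().items():
        print(f"{name}: {'unknown' if ok is None else ok}")
```

Running such a check at the start of every benchmark job, and storing its output next to the results, makes a forgotten setting visible instead of silent.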
track it systematically
>60% do not track it
Common issues named:
• Comparing wrong data, only noticed by inconsistencies
• Losing track of what’s what
• Parameters/setup details not recorded
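One way to avoid the issues named above is to store every result together with its context. A minimal sketch, not a prescribed format: each run is appended as one JSON line holding the parameters, the git commit, the machine name, and a timestamp. The field names and helper functions are hypothetical.

```python
import json
import platform
import subprocess
from datetime import datetime, timezone


def git_commit():
    """Return the current commit hash, or None outside a git checkout."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def make_record(benchmark, parameters, measurements_ms):
    """Bundle measurements with the details needed to tell runs apart later."""
    return {
        "benchmark": benchmark,
        "parameters": parameters,
        "measurements_ms": list(measurements_ms),
        "commit": git_commit(),
        "machine": platform.node(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def append_record(path, record):
    """Append one JSON object per line so earlier results are never overwritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
```

For example, `append_record("results.jsonl", make_record("fib", {"n": 20}, [1.5, 1.7]))` adds one self-describing line per run. Because the file is append-only, older data stays available for comparison.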
machines, minimizing measurement error
• Tracking data provenance
• Historic data available/useful
• Standard analyses, data processing, and statistics
least, check that benchmarks produce correct results
• Use same setup for day-to-day engineering as for producing data for papers
  – The setup is already debugged!
• Most CI systems can store artifacts
  – Basic provenance tracking for results!
• Automate data handling
  – Spreadsheets can import data from external data sources
  – Avoid manually copying data around
• Define workflow that works for your group
  – And teach it!
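Two of the points above, checking results and automating data handling, can be combined in a small harness. A minimal sketch under stated assumptions: `fib` stands in for a real benchmark, the helper names are hypothetical, and the CSV layout is only one option that a spreadsheet or analysis script could import directly.

```python
import csv
import time


def fib(n):
    """Hypothetical benchmark workload: the n-th Fibonacci number."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def run_benchmark(workload, argument, expected, iterations=5):
    """Time the workload, but refuse to report timings for wrong results."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        result = workload(argument)
        elapsed = time.perf_counter() - start
        if result != expected:
            raise ValueError(f"wrong result: got {result!r}, expected {expected!r}")
        timings.append(elapsed)
    return timings


def write_csv(path, name, timings):
    """Write one row per iteration, so no numbers are copied by hand."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["benchmark", "iteration", "seconds"])
        for i, seconds in enumerate(timings, start=1):
            writer.writerow([name, i, f"{seconds:.9f}"])


if __name__ == "__main__":
    write_csv("fib.csv", "fib", run_benchmark(fib, 20, expected=6765))
```

Run as a CI job, a script like this leaves `fib.csv` behind as a stored artifact, and a wrong result fails the job instead of producing a misleading timing.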
Set up and maintain machines, minimizing measurement error
• Tracking data provenance
• Historic data available/useful
• Standard analyses, data processing, and statistics
Best Practices
• Use CI/Automated Testing
• Use same setup for day-to-day engineering as for producing data for papers
• Most CI systems can store artifacts
• Automate data handling
• Define workflow that works for your group