for our Solution Discovery process
• Data could be generated targeting the situation we wanted to achieve (not the implementation details to achieve it)
• High confidence for the Solution score (RICA)
• Ideation pre-benchmarking could uncover implementation paths
Evaluation
Paired T-test, comparing the p-value against alpha:
• p-value is BIGGER than alpha: evidence that the means between samples are different is WEAK
• p-value is SMALLER than alpha: evidence that the means between samples are different is STRONG
Benchmark #02 (modifications)
Gradle task
executions were not isolated
• Consolidating data from outcomes into Google Sheets was a manual process
• Reliable interpretation of results could be non-trivial, especially when disambiguating inconclusive scenarios
• Statistics is powerful but hard
Summary of challenges
prepares a vanilla self-hosted Linux machine with the required tooling
Set of scripts that wraps gradle-profiler and git in an opinionated way and drives the benchmark execution
Small Python SDK that parses CSV files, pairs the data points and runs a Paired T-test on top of SciPy and NumPy.
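A minimal sketch of what the parse, pair and test steps could look like, not the SDK itself. The two-column CSV layout (`iteration`, `duration_ms`), the function names and the `ALPHA = 0.05` threshold are assumptions made for illustration; only `scipy.stats.ttest_rel` and NumPy are taken from the description above.

```python
import csv
import io

import numpy as np
from scipy import stats

ALPHA = 0.05  # assumed significance level; the slides do not state the value used


def read_durations(csv_text):
    """Map each iteration label to its duration in ms (hypothetical CSV layout)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["iteration"]: float(row["duration_ms"]) for row in reader}


def paired_t_test(baseline_csv, modified_csv, alpha=ALPHA):
    """Pair data points by iteration and run a Paired T-test on them."""
    baseline = read_durations(baseline_csv)
    modified = read_durations(modified_csv)
    # Keep only iterations measured in both benchmarks, so every point has a pair.
    labels = sorted(set(baseline) & set(modified))
    a = np.array([baseline[k] for k in labels])
    b = np.array([modified[k] for k in labels])
    result = stats.ttest_rel(a, b)
    return {
        "pairs": len(labels),
        "mean_diff_ms": float(np.mean(b - a)),
        "p_value": float(result.pvalue),
        # p-value smaller than alpha: strong evidence that the means differ.
        "strong_evidence": bool(result.pvalue < alpha),
    }


# Made-up sample data, for illustration only.
BASELINE = "iteration,duration_ms\n1,120.0\n2,118.0\n3,125.0\n4,121.0\n5,119.0\n"
MODIFIED = "iteration,duration_ms\n1,101.0\n2,99.0\n3,103.0\n4,104.0\n5,98.0\n"

if __name__ == "__main__":
    print(paired_t_test(BASELINE, MODIFIED))
```

Pairing by iteration label rather than by row position means a missing or extra measurement in one run drops that pair instead of silently misaligning the samples.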
experiments since March 2022
• Thousands of Engineer-minutes saved, async-await style
• Assertive solutions to improve our build setup and a clearer implementation path when delivering them
• We can validate whether any input from the Android ecosystem actually works for us, avoiding ad verecundiam arguments
Our journey so far