
Android Benchmarking and other stories

AndroidMakers 2022.

Iury Souza
Enrique López-Mañas

May 15, 2022


Transcript

  1. • Mobile stuff @ Klarna
     • Currently building a shopping browser
     • Loves building tools
     @iurysza
  2. Introduction Benchmarking is the practice of comparing business processes and

    performance metrics to industry bests and best practices from other companies. Dimensions typically measured are quality, time and cost.
  3. Introduction Benchmarking is a way to test the performance of

    your application. You can regularly run benchmarks to help analyze and debug performance problems and ensure that you don't introduce regressions in recent changes.
  4. Introduction In software engineering, profiling ("program profiling", "software profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls.
  5. Android Profiling
     • Android Profiler (since Android Studio 3.0)
     • Replaces the Android Monitor tools
     • CPU, Memory, Network and Energy profilers
     • Profileable apps
     • Useful for identifying performance bottlenecks
  6. Microbenchmark
     • Quickly benchmark your Android native code (Kotlin or Java) from within Android Studio.
     • Recommendation: profile your code before writing a benchmark.
     • Useful for CPU work that is run many times in your app.
     • Examples: RecyclerView scrolling with one item shown at a time, data conversions/processing.
  7. Microbenchmark • Add a benchmark:

     @RunWith(AndroidJUnit4::class)
     class SampleBenchmark {
         @get:Rule
         val benchmarkRule = BenchmarkRule()

         @Test
         fun benchmarkSomeWork() {
             benchmarkRule.measureRepeated {
                 doSomeWork()
             }
         }
     }
  8. Microbenchmark

     // using random with the same seed, so that it generates the same data every run
     private val random = Random(0)

     // create the array once and just copy it in benchmarks
     private val unsorted = IntArray(10_000) { random.nextInt() }

     @Test
     fun benchmark_quickSort() {
         // creating the variable outside of measureRepeated to be able to assert after done
         var listToSort = intArrayOf()

         benchmarkRule.measureRepeated {
             // copy the array with timing disabled to measure only the algorithm itself
             listToSort = runWithTimingDisabled { unsorted.copyOf() }

             // sort the array in place and measure how long it takes
             SortingAlgorithms.quickSort(listToSort)
         }

         // assert only once not to add overhead to the benchmarks
         assertTrue(listToSort.isSorted)
     }
  12. Macrobenchmark
      • Testing larger use cases of the app
      • Application startup, complex UI manipulations, running animations
  13. Macrobenchmark • Mark the app as profileable:

      <!-- enable profiling by macrobenchmark -->
      <!-- goes inside <application> in the target app's AndroidManifest.xml -->
      <profileable
          android:shell="true"
          tools:targetApi="q" />
  14. Macrobenchmark • Configure the benchmark build type:

      buildTypes {
          release {
              minifyEnabled true
              shrinkResources true
              proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
          }
          benchmark {
              initWith buildTypes.release
              signingConfig signingConfigs.debug
          }
      }
  15. Macrobenchmark

      @LargeTest
      @RunWith(AndroidJUnit4::class)
      class SampleStartupBenchmark {
          @get:Rule
          val benchmarkRule = MacrobenchmarkRule()

          @Test
          fun startup() = benchmarkRule.measureRepeated(
              packageName = TARGET_PACKAGE,
              metrics = listOf(StartupTimingMetric()),
              iterations = 5,
              setupBlock = {
                  // Press home button before each run to ensure the starting activity isn't visible.
                  pressHome()
              }
          ) {
              // starts the default launch activity
              startActivityAndWait()
          }
      }
  17. Macrobenchmark

      {
        "context": {
          "build": {
            "brand": "google",
            "device": "blueline",
            "fingerprint": "google/blueline/blueline:12/SP1A.210812.015/7679548:user/release-keys",
            "model": "Pixel 3",
            "version": { "sdk": 31 }
          },
          "cpuCoreCount": 8,
          "cpuLocked": false,
          "cpuMaxFreqHz": 2803200000,
          "memTotalBytes": 3753299968,
          "sustainedPerformanceModeEnabled": false
        },
        "benchmarks": [
          {
            "name": "startup",
            "params": {},
            "className": "com.example.macrobenchmark.startup.SampleStartupBenchmark",
            "totalRunTimeNs": 4975598256,
            "metrics": {
              "timeToInitialDisplayMs": {
                "minimum": 347.881076,
                "maximum": 347.881076,
                "median": 347.881076,
                "runs": [ 347.881076 ]
              }
            },
            "sampledMetrics": {},
            "warmupIterations": 0,
            "repeatIterations": 3,
            "thermalThrottleSleepSeconds": 0
          }
        ]
      }
  18. JankStats

      class JankLoggingActivity : AppCompatActivity() {
          private lateinit var jankStats: JankStats

          override fun onCreate(savedInstanceState: Bundle?) {
              super.onCreate(savedInstanceState)
              // metrics state holder can be retrieved regardless of JankStats initialization
              val metricsStateHolder = PerformanceMetricsState.getForHierarchy(binding.root)
              // initialize JankStats for current window
              jankStats = JankStats.createAndTrack(
                  window,
                  Dispatchers.Default.asExecutor(),
                  jankFrameListener,
              )
              // add activity name as state
              metricsStateHolder.state?.addState("Activity", javaClass.simpleName)
              // ...
          }
      }
  20. JankStats Reporting

      private val jankFrameListener = JankStats.OnFrameListener { frameData ->
          // A real app could do something more interesting, like writing the info
          // to local storage and reporting it later.
          Log.v("JankStatsSample", frameData.toString())
      }
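      As a sketch of that "something more interesting", the listener above could aggregate a jank rate instead of just logging; the counter names here are illustrative, not part of the JankStats API, and they would live in the same Activity as the listener:

      // Illustrative aggregation: count total and janky frames so a jank rate can
      // be reported later (e.g. from onPause). Note: the listener runs on the
      // executor passed to createAndTrack, so real code must make these counters
      // thread-safe.
      private var totalFrames = 0L
      private var jankyFrames = 0L

      private val jankFrameListener = JankStats.OnFrameListener { frameData ->
          totalFrames++
          if (frameData.isJank) {
              jankyFrames++
              Log.v("JankStatsSample", "janky frame: $frameData")
          }
      }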
  21. JankStats Aggregating

      override fun onResume() {
          super.onResume()
          jankStatsAggregator.jankStats.isTrackingEnabled = true
      }

      override fun onPause() {
          super.onPause()
          // Before disabling tracking, issue the report with an (optionally) specified reason.
          jankStatsAggregator.issueJankReport("Activity paused")
          jankStatsAggregator.jankStats.isTrackingEnabled = false
      }
  22. JankStats Aggregating

      class FrameData(
          /** The time at which this frame began (in nanoseconds) */
          val frameStartNanos: Long,
          /** The duration of this frame (in nanoseconds) */
          val frameDurationNanos: Long,
          /**
           * Whether this frame was determined to be janky, meaning that its duration
           * exceeds the duration determined by the system to indicate jank
           * (@see [JankStats.jankHeuristicMultiplier])
           */
          val isJank: Boolean,
          /**
           * The UI/app state during this frame. This is the information set by the app,
           * or by other library code, that can be used later, during analysis, to
           * determine what UI state was current when jank occurred.
           *
           * @see PerformanceMetricsState.addState
           */
          val states: List<StateInfo>
      )
  25. Detecting Regressions in CI
      - CI (Continuous Integration): a software-engineering practice of merging developer code into a main code base frequently.
      - Regression: a return to a former or less developed state; in this context, a performance degradation.
  26. Detecting Regressions in CI
      A typical regression scenario usually goes like this:
      - You're working on something
      - Another team (usually QA) warns you about a critical performance issue
      - You switch context and start digging into the codebase, not sure where to look
      - Pain
      - Manual profiling, benchmarking, etc.
  27. Detecting Regressions in CI
      - Monitoring performance is much easier than profiling.
      - Catch problems before they hit users.
      - Running benchmarks manually is repetitive and error-prone.
      - The output is just a number.
      - Ideally, we should automate this process.
  28. Detecting Regressions in CI
      Example: identifying degradation in app start-up time.
      Solution: use Macrobenchmark's StartupTimingMetric.
  31. Detecting Regressions in CI
      When to run?
      - Every build (beware of resource cost)
      - Or maybe every release
      Where to run?
      - Real devices yield more reliable results
      - Firebase Test Lab (FTL)
      What to store? (see the sketch below)
      - The performance metric (time in ms)
      - The corresponding build number or commit hash
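      A minimal sketch of what storing one result could look like, assuming the Macrobenchmark JSON output shown on slide 17 is available to the CI job; BenchmarkRecord, the CSV file, and the parsing helper are illustrative names, not part of any library:

      import org.json.JSONObject
      import java.io.File

      // Hypothetical record: one benchmark result tied to the build that produced it.
      data class BenchmarkRecord(
          val commitHash: String,
          val buildNumber: Int,
          val medianStartupMs: Double,
      )

      // Pull the median timeToInitialDisplayMs out of the Macrobenchmark JSON
      // output (structure as in the sample output above).
      fun parseMedianStartupMs(json: String): Double =
          JSONObject(json)
              .getJSONArray("benchmarks")
              .getJSONObject(0)
              .getJSONObject("metrics")
              .getJSONObject("timeToInitialDisplayMs")
              .getDouble("median")

      // Append the record to a simple CSV history; a real setup might use a
      // database or a metrics dashboard instead.
      fun store(record: BenchmarkRecord, history: File) {
          history.appendText("${record.commitHash},${record.buildNumber},${record.medianStartupMs}\n")
      }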
  32. Detecting Regressions in CI
      - Now comes the detection part.
      - There are multiple possible approaches.
  33. Detecting Regressions in CI
      Why won't a naive approach work?
      - Benchmarking values can vary a lot.
      - Lots of things can change between runs.
  34. Detecting Regressions in CI
      Use a threshold value?
      - Compare against a manually defined percentage threshold (a sketch follows below).
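      A minimal sketch of that naive check; the function name and the 5% default are illustrative and arbitrary:

      // Naive check: flag a regression when the new value exceeds the previous
      // one by more than a fixed percentage. The 5% default is arbitrary.
      fun isRegression(previousMs: Double, currentMs: Double, thresholdPercent: Double = 5.0): Boolean {
          val changePercent = (currentMs - previousMs) / previousMs * 100
          return changePercent > thresholdPercent
      }

      Because a single run is noisy, a fixed threshold like this fires false alerts on unstable benchmarks and misses slow regressions that creep in below the threshold, which is exactly the problem the next slides describe.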
  38. Detecting Regressions in CI
      Problems with naive approaches:
      - Values are inconsistent between benchmarks
      - It may trigger false alerts
      - It may miss real regressions
  39. Detecting Regressions in CI
      Constraints:
      - Handle temporary instability
      - Avoid manual per-benchmark tuning
      - We want accuracy!
  40. Detecting Regressions in CI Now comes the detection math part.

    We need more context to make a decision.
  42. Detecting Regressions in CI
      Step fitting algorithm: a statistical approach for detecting jumps or steps in a time series.
  43. Detecting Regressions in CI
      Step fitting algorithm:
      - Main objective: increase confidence in detecting regressions
      - The sliding window helps you make context-aware decisions
      - Use the window width and threshold to fine-tune the confidence of regression detection (see the sketch below)
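      A minimal sketch of the sliding-window idea, assuming one stored value per build, oldest first; stepDelta and the alerting logic are illustrative, not the exact algorithm behind the talk's charts:

      // Mean of the last `width` results minus the mean of the `width` results
      // before them; null until there is enough history.
      fun stepDelta(values: List<Double>, width: Int): Double? {
          if (values.size < 2 * width) return null
          val recent = values.takeLast(width).average()
          val baseline = values.dropLast(width).takeLast(width).average()
          return recent - baseline
      }

      // Usage sketch: a delta above the threshold suggests a regression, one
      // below -threshold an improvement; either way, the last `width` builds
      // deserve a closer look.
      fun checkLatestBuild(history: List<Double>, width: Int, thresholdMs: Double) {
          val delta = stepDelta(history, width) ?: return
          when {
              delta > thresholdMs -> println("Possible regression: +%.1f ms".format(delta))
              delta < -thresholdMs -> println("Possible improvement: %.1f ms".format(delta))
          }
      }

      Comparing window means instead of single runs absorbs per-run noise; widening the window raises confidence but delays detection.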
  44. Recap Detecting Regressions in CI:
      - Automate regression detection on key points of your app
      - Use step fitting instead of naive approaches
      - Helps you catch issues before they hit users
      - When a new build result is ready, check its benchmark values within a window of 2 × width builds
      - If there's a regression or improvement, fire an alert to investigate the performance over the last width builds
  45. Recap Profiling:
      • Memory, Energy, Network, CPU.
      • Profile to identify bottlenecks and implement your benchmarks.
  46. Resources
      • Great article on fighting regressions by Chris Craik: https://bit.ly/3kaidug
      • Benchmarking official docs: https://bit.ly/3rQRaZ6
      • JankStats: https://developer.android.com/topic/performance/jankstats