Android Benchmarking and other stories

AndroidMakers 2022

Iury Souza
Enrique López-Mañas

May 15, 2022

Transcript

  1. Android Benchmarking and other stories Iury Souza Enrique López-Mañas

  2. • Mobile stuff @ Klarna • Currently building a shopping

    browser • Loves building tools @iurysza
  3. @eenriquelopez • Android Freelancer • Kotlin Weekly maintainer (kotlinweekly.net) •

    Kotlin, Android • Running, finances.
  4. Introduction Benchmarking is the practice of comparing business processes and

    performance metrics to industry bests and best practices from other companies. Dimensions typically measured are quality, time and cost.
  5. Introduction Benchmarking is a way to test the performance of

    your application. You can regularly run benchmarks to help analyze and debug performance problems and ensure that you don't introduce regressions in recent changes.
  6. Introduction In software engineering, profiling ("program profiling", "software

    profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls.
  7. Android Benchmarking • Microbenchmark • Macrobenchmark • Jetpack Benchmark •

    JankStats
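
    For reference, the dependencies for these libraries look roughly like the sketch below. Only the microbenchmark coordinate and version appear later in this deck; the macrobenchmark and JankStats coordinates and versions are assumptions from around the time of the talk, so check the official docs for current releases.

    // build.gradle.kts sketch; versions are assumptions, see the docs.
    dependencies {
        // Microbenchmark, run as instrumented tests (version from slide 14)
        androidTestImplementation("androidx.benchmark:benchmark-junit4:1.1.0-beta03")
        // Macrobenchmark, typically in its own test module
        implementation("androidx.benchmark:benchmark-macro-junit4:1.1.0-beta03")
        // JankStats
        implementation("androidx.metrics:metrics-performance:1.0.0-alpha01")
    }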
  8. Android Profiling • Android Profiler (since Android Studio

    3.0) • Replaces the Android Monitor tools • CPU, Memory, Network and Energy profilers • Profileable apps • Useful for identifying performance bottlenecks
  9. Android Profiling

  10. Android Profiling - Memory

  11. Android Profiling - Memory allocation

  12. Android Profiling - Energy

  13. Microbenchmark • Quickly benchmark your Android native code (Kotlin or

    Java) from within Android Studio. • Recommendation: profile your code before writing a benchmark • Useful for CPU work that is run many times in your app • Examples: RecyclerView scrolling with one item shown at a time, data conversions/processing.
  14. Microbenchmark • Add dependency:

    dependencies {
        androidTestImplementation 'androidx.benchmark:benchmark-junit4:1.1.0-beta03'
    }
  15. Microbenchmark • Add benchmark:

    @RunWith(AndroidJUnit4::class)
    class SampleBenchmark {
        @get:Rule
        val benchmarkRule = BenchmarkRule()

        @Test
        fun benchmarkSomeWork() {
            benchmarkRule.measureRepeated {
                doSomeWork()
            }
        }
    }
  18. Microbenchmark

  19. Microbenchmark

    // Use Random with a fixed seed, so it generates the same data on every run.
    private val random = Random(0)

    // Create the array once and just copy it in benchmarks.
    private val unsorted = IntArray(10_000) { random.nextInt() }

    @Test
    fun benchmark_quickSort() {
        // Declared outside measureRepeated so we can assert on it when done.
        var listToSort = intArrayOf()

        benchmarkRule.measureRepeated {
            // Copy the array with timing disabled to measure only the algorithm itself.
            listToSort = runWithTimingDisabled { unsorted.copyOf() }
            // Sort the array in place and measure how long it takes.
            SortingAlgorithms.quickSort(listToSort)
        }

        // Assert only once so we don't add overhead to the benchmark.
        assertTrue(listToSort.isSorted)
    }
  23. Microbenchmark • Run benchmark:

    ./gradlew benchmark:connectedCheck
    ./gradlew benchmark:connectedCheck -P android.testInstrumentationRunnerArguments.class=com.example.benchmark.SampleBenchmark#benchmarkSomeWork
  24. Microbenchmark • Results

  25. Macrobenchmark • Testing larger use cases of the app •

    Application startup, complex UI manipulations, running animations
  26. Macrobenchmark • Mark the app as "profileable":

    <!-- enable profiling by macrobenchmark -->
    <profileable
        android:shell="true"
        tools:targetApi="q" />
  27. Macrobenchmark • Configure benchmark build type:

    buildTypes {
        release {
            minifyEnabled true
            shrinkResources true
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
        }
        benchmark {
            initWith buildTypes.release
            signingConfig signingConfigs.debug
        }
    }
  28. Macrobenchmark

  29. Macrobenchmark

    @LargeTest
    @RunWith(AndroidJUnit4::class)
    class SampleStartupBenchmark {
        @get:Rule
        val benchmarkRule = MacrobenchmarkRule()

        @Test
        fun startup() = benchmarkRule.measureRepeated(
            packageName = TARGET_PACKAGE,
            metrics = listOf(StartupTimingMetric()),
            iterations = 5,
            setupBlock = {
                // Press the home button before each run to ensure the starting activity isn't visible.
                pressHome()
            }
        ) {
            // Starts the default launch activity.
            startActivityAndWait()
        }
    }

  34. Macrobenchmark • StartupTimingMetric • FrameTimingMetric • TraceSectionMetric (experimental)
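
    FrameTimingMetric drops into the same measureRepeated call as the startup sample above; the sketch below measures frame timing while flinging a list. The "recycler" resource id and the fling interaction are hypothetical, and the By/Direction helpers come from androidx.test.uiautomator.

    @Test
    fun scrollFrameTiming() = benchmarkRule.measureRepeated(
        packageName = TARGET_PACKAGE,
        metrics = listOf(FrameTimingMetric()),  // frame durations instead of startup time
        iterations = 5,
        setupBlock = {
            pressHome()
            startActivityAndWait()
        }
    ) {
        // Hypothetical list id: fling it so there are frames to measure.
        val list = device.findObject(By.res(packageName, "recycler"))
        list.fling(Direction.DOWN)
        device.waitForIdle()
    }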

  35. Macrobenchmark • Show results: 


  36. Macrobenchmark

    {
      "context": {
        "build": {
          "brand": "google",
          "device": "blueline",
          "fingerprint": "google/blueline/blueline:12/SP1A.210812.015/7679548:user/release-keys",
          "model": "Pixel 3",
          "version": { "sdk": 31 }
        },
        "cpuCoreCount": 8,
        "cpuLocked": false,
        "cpuMaxFreqHz": 2803200000,
        "memTotalBytes": 3753299968,
        "sustainedPerformanceModeEnabled": false
      },
      "benchmarks": [
        {
          "name": "startup",
          "params": {},
          "className": "com.example.macrobenchmark.startup.SampleStartupBenchmark",
          "totalRunTimeNs": 4975598256,
          "metrics": {
            "timeToInitialDisplayMs": {
              "minimum": 347.881076,
              "maximum": 347.881076,
              "median": 347.881076,
              "runs": [ 347.881076 ]
            }
          },
          "sampledMetrics": {},
          "warmupIterations": 0,
          "repeatIterations": 3,
          "thermalThrottleSleepSeconds": 0
        }
      ]
    }
  37. JankStats • New framework (9 February 2022) • Built on top of Android • In-app benchmarking
  38. JankStats

    class JankLoggingActivity : AppCompatActivity() {
        private lateinit var jankStats: JankStats

        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            // The metrics state holder can be retrieved regardless of JankStats initialization.
            val metricsStateHolder = PerformanceMetricsState.getForHierarchy(binding.root)
            // Initialize JankStats for the current window.
            jankStats = JankStats.createAndTrack(
                window,
                Dispatchers.Default.asExecutor(),
                jankFrameListener,
            )
            // Add the activity name as state.
            metricsStateHolder.state?.addState("Activity", javaClass.simpleName)
            // ...
        }
    }
  40. JankStats Reporting

    private val jankFrameListener = JankStats.OnFrameListener { frameData ->
        // A real app could do something more interesting, like writing the info
        // to local storage and reporting it later.
        Log.v("JankStatsSample", frameData.toString())
    }
  41. JankStats Aggregating

    override fun onResume() {
        super.onResume()
        jankStatsAggregator.jankStats.isTrackingEnabled = true
    }

    override fun onPause() {
        super.onPause()
        // Before disabling tracking, issue the report with an (optionally) specified reason.
        jankStatsAggregator.issueJankReport("Activity paused")
        jankStatsAggregator.jankStats.isTrackingEnabled = false
    }
  42. JankStats Aggregating

    class FrameData(
        /** The time at which this frame began (in nanoseconds). */
        val frameStartNanos: Long,
        /** The duration of this frame (in nanoseconds). */
        val frameDurationNanos: Long,
        /**
         * Whether this frame was determined to be janky, meaning that its duration
         * exceeds the duration determined by the system to indicate jank
         * (@see [JankStats.jankHeuristicMultiplier]).
         */
        val isJank: Boolean,
        /**
         * The UI/app state during this frame. This is the information set by the app,
         * or by other library code, that can be used later, during analysis, to
         * determine what UI state was current when jank occurred.
         *
         * @see PerformanceMetricsState.addState
         */
        val states: List<StateInfo>
    )
  43. Detecting Regressions in CI

  46. Detecting Regressions in CI - CI (Continuous Integration): a software-engineering

    practice of merging developer code into a main code base frequently. - Regression: noun, a return to a former or less developed state; here, performance degradation.
  47. Detecting Regressions in CI Why would we need this?

  48. Detecting Regressions in CI A typical regression scenario usually goes

    like this: - You're working on something - Another team (usually QA) warns you about a critical performance issue - You switch context and start digging into the codebase, not sure where to look - Pain - Manual profiling, benchmarking, etc.
  49. Detecting Regressions in CI - Monitoring performance is much easier

    than profiling. - Catch problems before they hit users. - Running benchmarks manually is repetitive and error-prone. - The output is just a number. - Ideally, we should automate this process.
  50. Detecting Regressions in CI Let machines do what they're best

    at! (image source: Kurzgesagt)
  52. Detecting Regressions in CI Example: Identifying degradation in app start-up

    time. Solution: use Macrobenchmark's StartupTimingMetric.
  53. Detecting Regressions in CI

  58. Detecting Regressions in CI

    When to run?
    - Every build (beware of resource cost)
    - Or maybe every release

    Where to run?
    - Real devices yield more reliable results
    - Firebase Test Lab (FTL)

    What to store?
    - The performance metric (time in ms)
    - The corresponding build number or commit hash
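
    One way to persist those two values between CI runs is a simple append-only history file keyed by commit. This is a minimal sketch; the file name and CSV layout are illustrative, not part of the talk.

    import java.io.File

    data class BenchmarkRecord(
        val commitHash: String,              // or a build number
        val timeToInitialDisplayMs: Double,  // median from the benchmark JSON
    )

    // Append one row per CI run so later builds can be compared against history.
    fun appendRecord(record: BenchmarkRecord, history: File = File("benchmark-history.csv")) {
        history.appendText("${record.commitHash},${record.timeToInitialDisplayMs}\n")
    }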
  59. Detecting Regressions in CI Ok, setup finished. Now what? 🧐

  61. Detecting Regressions in CI - Now comes the detection part.

    - There are multiple possible approaches
  63. Detecting Regressions in CI Compare with the previous result?

    Don't do this.
  64. Detecting Regressions in CI Why won't a naive approach work?

    - Benchmarking values can vary a lot. - Lots of things can change between runs.
  65. Detecting Regressions in CI Use a threshold value? - Compare

    against a manually defined percentage threshold
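
    A minimal sketch of that threshold check; the 5% default is an arbitrary placeholder, not a value recommended in the talk.

    // Naive threshold check: flag a regression when the new value exceeds the
    // previous one by more than a fixed percentage.
    fun isRegression(previousMs: Double, currentMs: Double, threshold: Double = 0.05): Boolean =
        (currentMs - previousMs) / previousMs > threshold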
  69. Detecting Regressions in CI Problems with naive approaches - Values

    are inconsistent between benchmarks - It may trigger false alerts - It may miss real regressions
  70. Detecting Regressions in CI We can do better

  71. Detecting Regressions in CI Constraints: - Handle temporary instability -

    Avoid manual tuning per benchmark - We want accuracy!
  74. Detecting Regressions in CI Now comes the detection math part.

    We need more context to make a decision.
  76. Detecting Regressions in CI Step fitting algorithm: a statistical approach

    for detecting jumps or steps in a time series.
  77. Detecting Regressions in CI Step fitting algorithm

    - Main objective: increase confidence in detecting regressions
    - The sliding window helps you make context-aware decisions
    - Use the window width and threshold to fine-tune the confidence of a regression (see the sketch below)
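
    A sketch of the step-fitting idea, assuming an ordered list of benchmark results, one per build. For each candidate build it compares the mean of the width results before it against the mean of the width results after it, normalized by the spread of the window. The parameter names and the normalization are illustrative; Chris Craik's article in the resources covers the real math.

    // Returns the indices where the series appears to step up or down.
    fun findSteps(results: List<Double>, width: Int, threshold: Double): List<Int> {
        val steps = mutableListOf<Int>()
        for (i in width..results.size - width) {
            val before = results.subList(i - width, i)
            val after = results.subList(i, i + width)
            val jump = kotlin.math.abs(after.average() - before.average())
            // Normalize by the standard deviation of the whole window so noisy
            // benchmarks need a bigger jump before they count as a step.
            val window = before + after
            val mean = window.average()
            val sd = kotlin.math.sqrt(window.sumOf { (it - mean) * (it - mean) } / window.size)
            if (jump / maxOf(sd, 1e-9) > threshold) steps += i
        }
        return steps
    }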
  78. Detecting Regressions in CI

  84. Recap Detecting Regressions in CI: - Automate regression detection at

    key points of your app - Use step-fitting instead of naive approaches - It helps you catch issues before they hit users - When a new build's result is ready, check its benchmark values inside the 2×width window - If there's a regression or improvement, fire an alert to investigate the performance of the last width builds
  85. Recap Jetpack Benchmark: • Micro and macro benchmarks • Instrumentation tests as benchmarks

  86. Recap Profiling: • Memory, Energy, Network, CPU. • Profile to identify bottlenecks and implement your benchmarks.
  87. Resources

    Great article on fighting regressions by Chris Craik:
    https://bit.ly/3kaidug

    Benchmarking official docs:
    https://bit.ly/3rQRaZ6

    JankStats:
    https://developer.android.com/topic/performance/jankstats
  88. Your feedback! bit.ly/benchmarkFeedback

  89. Thank you! @iurysza @eenriquelopez