
Android Benchmarking and other stories

AndroidMakers 2022.

Iury Souza
Enrique López-Mañas

May 15, 2022

  1. Android Benchmarking and
    other stories
    Iury Souza
    Enrique López-Mañas

  2. @iurysza
    • Mobile stuff @ Klarna

    • Currently building a shopping browser

    • Loves building tools

  3. @eenriquelopez
    • Android Freelancer

    • Kotlin Weekly maintainer (kotlinweekly.net)

    • Kotlin, Android

    • Running, finances.

  4. Introduction
    Benchmarking is the practice of comparing business processes and
    performance metrics to industry bests and best practices from other companies.
    Dimensions typically measured are quality, time and cost.


  5. Introduction
    Benchmarking is a way to test the performance of your application. You can
    regularly run benchmarks to help analyze and debug performance problems and
    ensure that you don't introduce regressions in recent changes.


  6. Introduction
    In software engineering, profiling ("program profiling", "software profiling") is a
    form of dynamic program analysis that measures, for example, the space
    (memory) or time complexity of a program, the usage of particular instructions,
    or the frequency and duration of function calls.

  7. Android Benchmarking
    • Microbenchmark


    • Macrobenchmark


    • Jetpack Benchmark


    • JankStats


  8. Android Profiling
    • Android Profiler (since Android Studio 3.0)

    • Replaces Android Monitor tools

    • CPU, Memory, Network and Energy profilers

    • Profileable apps

    • Useful for identifying performance bottlenecks

  9. Android Profiling


  10. Android Profiling - Memory


  11. Android Profiling - Memory allocation


  12. Android Profiling - Energy


  13. Microbenchmark
    • Quickly benchmark your Android native code (Kotlin or Java) from within
    Android Studio.

    • Recommendation: profile your code before writing a benchmark

    • Useful for CPU work that is run many times in your app

    • Examples: RecyclerView scrolling with one item shown at a time, data
    conversions/processing.

  14. Microbenchmark
    • Add dependency:

    dependencies {
        androidTestImplementation 'androidx.benchmark:benchmark-junit4:1.1.0-beta03'
    }
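The benchmark module also needs the AndroidX benchmark runner configured; a minimal sketch of that part of the module's build.gradle, assuming the standard Jetpack Benchmark module setup:

```groovy
android {
    defaultConfig {
        // runs benchmarks with clock stabilization and warns about misconfiguration
        testInstrumentationRunner 'androidx.benchmark.junit4.AndroidBenchmarkRunner'
    }
}
```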

  15. Microbenchmark
    • Add benchmark:

    @RunWith(AndroidJUnit4::class)
    class SampleBenchmark {
        @get:Rule
        val benchmarkRule = BenchmarkRule()

        @Test
        fun benchmarkSomeWork() {
            benchmarkRule.measureRepeated {
                doSomeWork()
            }
        }
    }


  18. Microbenchmark


  19. Microbenchmark
    // use Random with a fixed seed, so it generates the same data every run
    private val random = Random(0)

    // create the array once and just copy it in benchmarks
    private val unsorted = IntArray(10_000) { random.nextInt() }

    @Test
    fun benchmark_quickSort() {
        // create the variable outside measureRepeated to be able to assert afterwards
        var listToSort = intArrayOf()

        benchmarkRule.measureRepeated {
            // copy the array with timing disabled to measure only the algorithm itself
            listToSort = runWithTimingDisabled { unsorted.copyOf() }
            // sort the array in place and measure how long it takes
            SortingAlgorithms.quickSort(listToSort)
        }

        // assert only once, not to add overhead to the benchmarks
        assertTrue(listToSort.isSorted)
    }


  23. Microbenchmark
    • Run benchmark:

    ./gradlew benchmark:connectedCheck

    ./gradlew benchmark:connectedCheck -P android.testInstrumentationRunnerArguments.class=com.example.benchmark.SampleBenchmark#benchmarkSomeWork

  24. Microbenchmark
    • Results



  25. Macrobenchmark
    • Testing larger use cases of the app


    • Application startup, complex UI manipulations, running animations


  26. Macrobenchmark
    • Mark the app as profileable:

    <application>
        <profileable
            android:shell="true"
            tools:targetApi="q" />
    </application>

  27. Macrobenchmark
    • Configure benchmark build type:

    buildTypes {
        release {
            minifyEnabled true
            shrinkResources true
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'),
                'proguard-rules.pro'
        }
        benchmark {
            initWith buildTypes.release
            signingConfig signingConfigs.debug
        }
    }

  28. Macrobenchmark


  29. Macrobenchmark
    @LargeTest
    @RunWith(AndroidJUnit4::class)
    class SampleStartupBenchmark {
        @get:Rule
        val benchmarkRule = MacrobenchmarkRule()

        @Test
        fun startup() = benchmarkRule.measureRepeated(
            packageName = TARGET_PACKAGE,
            metrics = listOf(StartupTimingMetric()),
            iterations = 5,
            setupBlock = {
                // press home before each run to ensure the starting activity isn't visible
                pressHome()
            }
        ) {
            // starts the default launch activity
            startActivityAndWait()
        }
    }


  34. Macrobenchmark
    • StartupTimingMetric


    • FrameTimingMetric


    • TraceSectionMetric (experimental)

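Swapping the metric is all it takes to measure frame timings instead of startup; a sketch (test name, scroll action and iteration count are ours, not from the deck), reusing the rule and TARGET_PACKAGE from the startup benchmark:

```kotlin
@Test
fun scrollFrames() = benchmarkRule.measureRepeated(
    packageName = TARGET_PACKAGE,
    metrics = listOf(FrameTimingMetric()),   // frame durations instead of startup time
    iterations = 5,
    setupBlock = { startActivityAndWait() }  // measure steady-state UI, not launch
) {
    // drive some UI work here, e.g. a scroll gesture via UiAutomator's `device`
    device.waitForIdle()
}
```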

  35. Macrobenchmark
    • Show results:


  36. Macrobenchmark
    {
      "context": {
        "build": {
          "brand": "google",
          "device": "blueline",
          "fingerprint": "google/blueline/blueline:12/SP1A.210812.015/7679548:user/release-keys",
          "model": "Pixel 3",
          "version": { "sdk": 31 }
        },
        "cpuCoreCount": 8,
        "cpuLocked": false,
        "cpuMaxFreqHz": 2803200000,
        "memTotalBytes": 3753299968,
        "sustainedPerformanceModeEnabled": false
      },
      "benchmarks": [
        {
          "name": "startup",
          "params": {},
          "className": "com.example.macrobenchmark.startup.SampleStartupBenchmark",
          "totalRunTimeNs": 4975598256,
          "metrics": {
            "timeToInitialDisplayMs": {
              "minimum": 347.881076,
              "maximum": 347.881076,
              "median": 347.881076,
              "runs": [ 347.881076 ]
            }
          },
          "sampledMetrics": {},
          "warmupIterations": 0,
          "repeatIterations": 3,
          "thermalThrottleSleepSeconds": 0
        }
      ]
    }

  37. JankStats
    • New framework (9th February 2022)

    • Built on top of the Android framework

    • In-app benchmarking

  38. JankStats
    class JankLoggingActivity : AppCompatActivity() {
        private lateinit var jankStats: JankStats

        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            // the metrics state holder can be retrieved regardless of JankStats initialization
            val metricsStateHolder = PerformanceMetricsState.getForHierarchy(binding.root)
            // initialize JankStats for the current window
            jankStats = JankStats.createAndTrack(
                window,
                Dispatchers.Default.asExecutor(),
                jankFrameListener,
            )
            // add the activity name as state
            metricsStateHolder.state?.addState("Activity", javaClass.simpleName)
            // ...
        }
    }


  40. JankStats Reporting
    private val jankFrameListener = JankStats.OnFrameListener { frameData ->
        // a real app could do something more interesting, like writing the info
        // to local storage and reporting it later
        Log.v("JankStatsSample", frameData.toString())
    }

  41. JankStats Aggregating
    override fun onResume() {
        super.onResume()
        jankStatsAggregator.jankStats.isTrackingEnabled = true
    }

    override fun onPause() {
        super.onPause()
        // before disabling tracking, issue the report with an (optionally) specified reason
        jankStatsAggregator.issueJankReport("Activity paused")
        jankStatsAggregator.jankStats.isTrackingEnabled = false
    }

  42. JankStats Aggregating
    class FrameData(
        /**
         * The time at which this frame began (in nanoseconds)
         */
        val frameStartNanos: Long,
        /**
         * The duration of this frame (in nanoseconds)
         */
        val frameDurationNanos: Long,
        /**
         * Whether this frame was determined to be janky, meaning that its
         * duration exceeds the duration determined by the system to indicate jank
         * (@see [JankStats.jankHeuristicMultiplier])
         */
        val isJank: Boolean,
        /**
         * The UI/app state during this frame. This is the information set by the app, or by
         * other library code, that can be used later, during analysis, to determine what
         * UI state was current when jank occurred.
         *
         * @see PerformanceMetricsState.addState
         */
        val states: List<StateInfo>
    )

  43. Detecting Regressions in CI



  46. Detecting Regressions in CI
    - CI (Continuous Integration): a software-engineering practice of merging
    developer code into a main code base frequently.

    - Regression: noun, a return to a former or less developed state.
    Here: a performance degradation.

  47. Detecting Regressions in CI
    Why would we need this?


  48. Detecting Regressions in CI
    A typical regression scenario usually goes like this:

    - You're working on something

    - Another team (usually QA) warns you about a critical performance issue

    - You switch context and start digging into the codebase, not sure where to look

    - Pain

    - Manual profiling, benchmarking, etc.

  49. Detecting Regressions in CI
    - Monitoring performance is much easier than profiling.

    - Catch problems before they hit users.

    - Running benchmarks manually is repetitive and error-prone.

    - The output is just a number.

    - Ideally, we should automate this process.

  50. Detecting Regressions in CI
    Let machines do what they're best at!
    (image source: Kurzgesagt)


  52. Detecting Regressions in CI
    Example: identifying degradation in app start-up time

    Solution: use Macrobenchmark's StartupTimingMetric

  53. Detecting Regressions in CI


  54. Detecting Regressions in CI


  55. Detecting Regressions in CI



  58. Detecting Regressions in CI
    When to run?
    - Every build (beware of resource cost)
    - Or maybe every release

    Where to run?
    - Real devices yield more reliable results
    - Firebase Test Lab (FTL)

    What to store?
    - The performance metric (time in ms)
    - The corresponding build number or commit hash
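A sketch of the "what to store" bullet (type and field names are ours, not from the deck): one small record per CI run, serialized to a line in whatever store the pipeline uses:

```kotlin
// One benchmark result per CI run: the metric plus the commit that produced it.
data class BenchmarkRecord(
    val commitHash: String,
    val startupMedianMs: Double,
) {
    // serialize to a single CSV line for a simple append-only store
    fun toCsvLine(): String = "$commitHash,$startupMedianMs"
}

fun main() {
    val record = BenchmarkRecord("7679548", 347.881076)
    println(record.toCsvLine())  // prints "7679548,347.881076"
}
```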

  59. Detecting Regressions in CI
    Ok, setup finished.


    Now what? 🧐



  61. Detecting Regressions in CI
    - Now comes the detection part.


    - There are multiple possible approaches



  63. Detecting Regressions in CI
    Compare with the previous result:

    Don't do this

  64. Detecting Regressions in CI
    Why won't a naive approach work?

    - Benchmarking values can vary a lot.

    - Lots of things can change between runs.

  65. Detecting Regressions in CI
    Use a threshold value?


    - Compare against a manually defined percentage threshold

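A minimal sketch of the threshold idea (function name and default value are ours): flag a regression when the new result is more than, say, 5% slower than the previous one:

```kotlin
// Naive threshold check: regression if the new value exceeds the previous
// one by more than `threshold` (a fraction, e.g. 0.05 for 5%).
fun isRegression(previousMs: Double, currentMs: Double, threshold: Double = 0.05): Boolean =
    (currentMs - previousMs) / previousMs > threshold

fun main() {
    println(isRegression(300.0, 330.0))  // 10% slower -> true
    println(isRegression(300.0, 305.0))  // within noise -> false
}
```

The slides that follow explain why this single-point comparison is too naive in practice.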


  69. Detecting Regressions in CI
    Problems with naive approaches


    - Values are inconsistent between benchmarks


    - It may trigger false alerts


    - It may miss real regressions



  70. Detecting Regressions in CI
    We can do better


  71. Constraints
    - Handle temporary instability

    - Avoid manual tuning, per benchmark

    - We want accuracy!


  74. Detecting Regressions in CI
    Now comes the detection math part.


    We need more context to make a decision.



  76. Step-fitting algorithm
    A statistical approach for detecting jumps, or steps, in a time series.

  77. Step-fitting algorithm
    - Main objective: increase confidence in detecting regressions
    - The sliding window helps you make context-aware decisions
    - Use the window width and threshold to fine-tune the confidence of a regression
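The step-fitting idea can be sketched in a few lines. This is our own simplified version (even window width, absolute threshold); real implementations, such as the one described in Chris Craik's article linked in the resources, are more sophisticated:

```kotlin
import kotlin.math.abs

// Slide a window of `width` builds over the series; report an index as a step
// when the mean of the right half differs from the mean of the left half by
// more than `threshold`. More context (a wider window) means fewer false alerts.
fun detectSteps(series: List<Double>, width: Int, threshold: Double): List<Int> {
    require(width >= 2 && width % 2 == 0) { "width must be even and >= 2" }
    val steps = mutableListOf<Int>()
    for (i in 0..series.size - width) {
        val left = series.subList(i, i + width / 2).average()
        val right = series.subList(i + width / 2, i + width).average()
        if (abs(right - left) > threshold) steps += i + width / 2
    }
    return steps
}

fun main() {
    // startup medians: stable around 300 ms, then a jump to ~340 ms at index 4
    val medians = listOf(301.0, 299.0, 302.0, 300.0, 341.0, 339.0, 340.0, 342.0)
    println(detectSteps(medians, width = 4, threshold = 20.0))  // prints [4]
}
```

Note how averaging each half of the window absorbs single noisy runs that would trip a naive previous-vs-current comparison.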

  78. Detecting Regressions in CI

  84. Recap
    Detecting Regressions in CI:

    - Automate regression detection on key points of your app

    - Use step-fitting instead of naive approaches

    - It helps you catch issues before they hit users

    - When a new build result is ready, check its benchmark values inside the
    2 * width window

    - If there's a regression or improvement, fire an alert to investigate the
    performance in the last width builds

  85. Recap
    Jetpack Benchmark:

    • Micro and macro benchmarks

    • Instrumentation tests as benchmarks

  86. Recap
    Profiling:

    • Memory, Energy, Network, CPU.

    • Profile to identify bottlenecks and implement your benchmarks.

  87. Resources
    Great article on fighting regressions by Chris Craik:
    https://bit.ly/3kaidug

    Benchmarking official docs:
    https://bit.ly/3rQRaZ6

    JankStats:
    https://developer.android.com/topic/performance/jankstats

  88. Your feedback!
    bit.ly/benchmarkFeedback


  89. Thank you!
    @iurysza


    @eenriquelopez
