Experiments for your Android Builds driven by Gradle Profiler

Companion slides for my talk about applying statistics to improve Gradle builds

Presented at

- N26 Barcelona | Android Meetup (February / 2020)
- GDG-SP Android Meetup #78 (March / 2020)
- Droidcon EMEA Online (October / 2020)
- Android Summit Online (October / 2020)

Ubiratan Soares

October 08, 2020

Transcript

  1. EXPERIMENTS FOR
    ANDROID BUILDS
    driven by Gradle Profiler
    Ubiratan Soares
    October / 2020

  2. https://n26.com/en/careers

  3. A problem like a big
    Android project and a
    really slow build …

  4. “You can’t improve
    what you can’t measure”
    Someone, somewhen

  5. https://www.youtube.com/watch?v=hBkIKfzd7Ms

  6. View Slide

  7. Measuring builds

  8. View Slide

  9. View Slide

  10. Gradle Profiler Features
    Tooling API
    Cold/warm builds
    Daemon control
    Benchmarking
    Profiling
    Multiple build systems
    Multiple profilers
    Scenarios definition
    Incremental builds evaluation
    Etc.

  11. Installing with SDKMAN!
    https://github.com/gradle/gradle-profiler/releases
    sdk install gradleprofiler

  12. Installing with HomeBrew
    brew install gradle-profiler

  13. Benchmarking
    gradle-profiler

  14. View Slide

  15. Running Scenarios
    gradle-profiler

  16. Demo

  17. Evaluating
    Measurements

  18. Build  Benchmark #01  Benchmark #02
    1        4              4.4
    2        5              5.1
    3        5.1            5.2
    4        4.4            6.2
    5        3.9            3.4
    6        4.2            6.2
    7        4.6            4.6
    8        4.5            4.5
    9        4.4            3.3
    10       4              4.4

  19. Benchmark  Mean  Standard Deviation
    #01          4.46  0.41
    #02          4.77  1.04
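
The mean and standard deviation on this slide can be recomputed from the ten measurements listed on slide 18. The following is a minimal sketch using Python's standard library; the variable and function names are illustrative and not part of the talk, and it assumes the sample (n - 1) standard deviation is the one intended. The recomputed means come out close to, but not identical to, the figures printed on the slide, which may have been derived from unrounded measurements.

```python
import statistics

# Measurements as listed on slide 18 (units are not stated in the deck).
benchmark_01 = [4, 5, 5.1, 4.4, 3.9, 4.2, 4.6, 4.5, 4.4, 4]
benchmark_02 = [4.4, 5.1, 5.2, 6.2, 3.4, 6.2, 4.6, 4.5, 3.3, 4.4]


def summarize(samples):
    """Return (mean, sample standard deviation) for a list of measurements."""
    return statistics.mean(samples), statistics.stdev(samples)


for name, samples in (("#01", benchmark_01), ("#02", benchmark_02)):
    mean, stdev = summarize(samples)
    print(f"{name}: mean={mean:.2f} stdev={stdev:.2f}")
```

Two samples with similar means can still differ a lot in spread, which is why the mean alone is a weak basis for comparing builds.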

  20. https://towardsdatascience.com/why-averages-are-often-wrong-1ff08e409a5b

  21. View Slide

  22. Benchmark  Mean  Standard Deviation
    #01          4.46  0.41
    #02          4.77  1.04

  23. When Build
    Engineering meets
    Data Science

  24. Statistical Inference
    Population (exploratory data): mean (µ)
    Sampling
    Sample (refined data): mean (X̄)
    Probability analysis
    Inferred parameter



  25. Alice: Android Tech Lead
    Bob: Android Engineer

  26. “I killed so many annotations on my PR
    that now I’m sure app:assembleDebug is
    running faster!”

    “Did you run benchmarks for it
    with Gradle Profiler?”


  27. “Yes, I did, before and after my changes.”
    “But I’m not sure how to demonstrate
    the improvements …”

    “I can help with that!”

  28. Statistical Hypothesis
    • Null hypothesis (H0)
    • Alternative hypothesis (HA)

    “I missed that class, I guess …”

    Population
    Mean (µ)


  29. The null hypothesis (H0) is the status
    quo: what we have right now in
    our trunk branch, if you prefer.
    The alternative hypothesis (HA) is what
    you want to demonstrate.

  30. Statistical significance
    Sample 1: P(sample 1) = 99.99%
    Sample 2: P(sample 2) = 88.88%
    Probability analysis
    alpha = significance level = 0.05 = 1 - 0.95

    “Probably I missed that class too …”


  31. “Your modifications will mean real
    improvements if I execute 100 runs of
    app:assembleDebug and I see an execution
    faster than 5000 ms for at least 95 of them.”

  32. Law of Large Numbers
    Sample size?
    ≥ 30: Normal distribution
    < 30: Student's t-distribution

    “Super easy!”


  33. “Given that app:assembleDebug is
    quite slow, we will consider
    benchmarks running something
    between 15 and 25 measured builds;
    therefore, Student's t-distribution
    will model our probability curve.”

  34. p-value
    Sample
    Probability analysis
    Critical value (e.g., Z or t)
    P-value (area)
    The probability of a Type I error

  35. View Slide


  36. “With the data you provided, the
    probability model tells me that, for every 100
    runs of app:assembleDebug, 3 of them
    will actually be false positives.”

  37. Critical value and tail analysis
    Left-tailed: µA < µ0
    Two-tailed: µA ≠ µ0
    Right-tailed: µA > µ0

    “I do need some coffee. Do you?”

  38. µA ≤ µ0

    “Faster builds imply that the
    observed mean after your
    modifications is smaller than
    the value we had before. So
    we want a left-tailed analysis.”


  39. “But we don’t know the actual
    average value of app:assembleDebug
    for all possible builds …”
    “We will compare the samples for it
    captured at two different moments
    under slightly different conditions.”

    “This is called a paired test.”


  40. “Please tell me that I won’t have to
    do all the calculations by myself …”
    “Yes, you will.”


  41. “Just joking. There are tools
    we can use.”

  42. https://www.statskingdom.com/160MeanT2pair.html
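
For readers who prefer to script the calculation, here is a minimal sketch of a left-tailed paired t-test using only Python's standard library. It is not the method used by the calculator linked above: the function names are hypothetical, the Student's t CDF is approximated by numerical integration (Simpson's rule), and the measurements from slide 18 are treated as before/after pairs by build number purely for illustration.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, t]; steps must be even."""
    h = t / steps  # signed step, so a negative t yields a negative integral
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def left_tailed_paired_t_test(before, after):
    """Return (t statistic, p-value) for HA: mean(after) < mean(before)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Raises ZeroDivisionError if all differences are identical.
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    t_stat = statistics.mean(diffs) / std_error
    return t_stat, t_cdf(t_stat, n - 1)


# Slide 18 values, used here only as stand-in before/after pairs.
before = [4, 5, 5.1, 4.4, 3.9, 4.2, 4.6, 4.5, 4.4, 4]
after = [4.4, 5.1, 5.2, 6.2, 3.4, 6.2, 4.6, 4.5, 3.3, 4.4]
t_stat, p_value = left_tailed_paired_t_test(before, after)
print(f"t = {t_stat:.3f}, p-value = {p_value:.3f}")
```

With these stand-in values the p-value lands well above 0.05, so the data would not support a claim of faster builds.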

  43. View Slide

  44. A framework to
    drive experiments for
    your Gradle builds

  45. Gradle task
    Benchmark #01 (status quo)
    Benchmark #02 (modifications)
    Left-tailed paired t-test, yielding a p-value
    Compare p-value and alpha (0.05)
    p-value is BIGGER: evidence that the build has improved is
    statistically WEAK
    p-value is SMALLER: evidence that the build has improved is
    statistically STRONG
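
The final comparison step of this framework can be sketched as a small Python function. The function name and the example p-values are hypothetical; only the default alpha of 0.05 and the WEAK/STRONG wording come from the slide.

```python
def evidence_of_improvement(p_value, alpha=0.05):
    """Classify the evidence that the build improved, given a left-tailed p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be a probability between 0 and 1")
    # A p-value smaller than alpha means H0 (the status quo) is rejected.
    return "STRONG" if p_value < alpha else "WEAK"


# Hypothetical p-values, for illustration only.
print(evidence_of_improvement(0.03))  # STRONG
print(evidence_of_improvement(0.40))  # WEAK
```

Fixing alpha before running the benchmarks keeps the decision rule independent of the results.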

  46. Examples

  47. View Slide

  48. Running Environment
    • 2018 MacBook Pro
    • Intel Core i7 (6 Cores)
    • 16GB RAM
    .gradle/gradle.properties:
    kotlin.parallel.tasks.in.project=true
    kapt.use.worker.api=true
    kapt.include.compile.classpath=true
    kapt.incremental.apt=false
    org.gradle.workers.max=6

  49. Example #01
    Target
    https://github.com/JakeWharton/SdkSearch
    Scenario
    build {
        title = "Assemble Debug APK"
        tasks = ["mobile:assembleDebug"]
        daemon = warm
        cleanup-tasks = ["clean"]
    }
    Execution
    • 4 warmed-up builds
    • 15 measured builds (samples)
    Hypothesis
    H0 : No meaningful build improvements building with newer JDKs
    Ha : JDK11 delivers faster Gradle builds than JDK8
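
To feed the measured builds into a statistical test, the samples first have to be extracted from the benchmark report. The sketch below assumes a CSV layout in which each row starts with a label such as `measured build #1` followed by one value per scenario. That layout is an assumption about the report format and should be checked against the file your Gradle Profiler version produces. The function name and the sample numbers are made up for illustration.

```python
import csv
import io


def measured_builds(csv_text, column=1):
    """Collect the values of rows labelled 'measured build #N' from one column.

    Assumes the first cell of each row is a label and that warm-up rows use a
    different label, so they are skipped.
    """
    samples = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) > column and row[0].startswith("measured build") and row[column]:
            samples.append(float(row[column]))
    return samples


# Made-up report content, only to exercise the parser.
sample_report = """scenario,build
tasks,mobile:assembleDebug
warm-up build #1,61250
measured build #1,48120
measured build #2,47010
measured build #3,49300
"""

print(measured_builds(sample_report))
```

Running the same extraction on the status quo report and on the report for the modified build yields the two paired samples to compare.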

  50. View Slide

  51. View Slide

  52. View Slide

  53. #1 Building with different JDKs
    •P-value > alpha

  54. #1 Building with different JDKs
    •P-value > alpha
    •Ha has not been accepted

  55. #1 Building with different JDKs
    •P-value > alpha
    •Ha has not been accepted
    •No STRONG statistical evidence of
    faster builds with JDK11

  56. View Slide

  57. View Slide

  58. View Slide

  59. Example #02
    Target
    https://github.com/google/iosched
    Scenario
    build {
        title = "Assemble Debug APK"
        tasks = ["mobile:assembleDebug"]
        daemon = warm
        cleanup-tasks = ["clean"]
    }
    Execution
    • 4 warmed-up builds
    • 15 measured builds (samples)
    Hypothesis
    Current = AGP 3.4.1 and Gradle 5.1.1
    H0 : Bumps deliver no meaningful build improvements
    Ha : Bumps to AGP 3.5.3 and Gradle 6.2 deliver faster builds

  60. View Slide

  61. View Slide

  62. View Slide

  63. View Slide

  64. View Slide

  65. Final Remarks

  66. Call to action!
    • Play around with Gradle Profiler
    • Design your experiment
    • Run it!
    • Make your decisions based on
    data generated and analysed in
    your particular context, not on
    Tweets or Subreddits

  67. UBIRATAN
    SOARES
    Brazilian Computer Scientist
    Senior Software Engineer @ N26
    GDE for Android and Kotlin
    @ubiratanfsoares
    ubiratansoares.dev

  68. THANKS
