Experiments for your Android Builds driven by Gradle Profiler

Companion slides for my talk about applying statistics to improve Gradle builds

Presented at

- N26 Barcelona | Android Meetup (February / 2020)
- GDG-SP Android Meetup #78 (March / 2020)
- Droidcon EMEA Online (October / 2020)
- Android Summit Online (October / 2020)

Ubiratan Soares

October 08, 2020

Transcript

  1. Gradle Profiler Features

    - Tooling API
    - Cold/warm builds
    - Daemon control
    - Benchmarking
    - Profiling
    - Multiple build systems
    - Multiple profilers
    - Scenarios definition
    - Incremental builds evaluation
    - Etc.
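  2. Benchmark #01 and Benchmark #02 (table of 10 builds, plus a chart with an axis from 3 to 7)

    | Build | Benchmark #01 | Benchmark #02 |
    | --- | --- | --- |
    | 1 | 4 | 4.4 |
    | 2 | 5 | 5.1 |
    | 3 | 5.1 | 5.2 |
    | 4 | 4.4 | 6.2 |
    | 5 | 3.9 | 3.4 |
    | 6 | 4.2 | 6.2 |
    | 7 | 4.6 | 4.6 |
    | 8 | 4.5 | 4.5 |
    | 9 | 4.4 | 3.3 |
    | 10 | 4 | 4.4 |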
  3. Benchmark #01 and Benchmark #02 (chart with an axis from 3 to 7)

    | Benchmark | Mean | Standard Deviation |
    | --- | --- | --- |
    | #01 | 4.46 | 0.41 |
    | #02 | 4.77 | 1.04 |
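
The summary statistics on this slide can be recomputed from the per-build values listed on the previous slide. The sketch below is a minimal illustration that uses only Python's standard library and assumes the slide reports the sample standard deviation (n - 1 in the denominator). The variable and function names are illustrative. Because the transcribed table may be rounded, the recomputed figures can differ slightly from the ones printed on the slide.

```python
from statistics import mean, stdev

# Per-build values transcribed from the "Benchmark #01 / #02" table.
benchmark_01 = [4.0, 5.0, 5.1, 4.4, 3.9, 4.2, 4.6, 4.5, 4.4, 4.0]
benchmark_02 = [4.4, 5.1, 5.2, 6.2, 3.4, 6.2, 4.6, 4.5, 3.3, 4.4]


def summarize(samples):
    """Return (mean, sample standard deviation) for a list of measurements."""
    return mean(samples), stdev(samples)


mean_01, sd_01 = summarize(benchmark_01)
mean_02, sd_02 = summarize(benchmark_02)

for label, m, s in (("#01", mean_01, sd_01), ("#02", mean_02, sd_02)):
    print(f"Benchmark {label}: mean={m:.2f} stdev={s:.2f}")
```

The two means are close while the spreads are not, which is why comparing averages alone is not enough to claim an improvement.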
  4. “I killed so many annotations on my PR that now I’m sure app:assembleDebug is running faster!”

    “Did you run benchmarks for it with Gradle Profiler?”
  5. “Yes, I did, before and after my changes.”

    “But I’m not sure how to demonstrate the improvements…”

    “I can help with that!”
  6. Statistical Hypothesis

    - Null hypothesis (H0)
    - Alternative hypothesis (HA)
    - Population mean (µ)

    “I missed that class, I guess…”
  7. The null hypothesis (H0) is the status quo: what we have right now in our trunk branch, if you prefer.

    The alternative hypothesis (HA) is what you want to demonstrate.
  8. Statistical significance

    - Probability analysis of Sample 1 and Sample 2: P(sample 1) = 99.99%, P(sample 2) = 88.88%
    - alpha = significance level = 0.05 = 1 - 0.95

    “I probably missed that class too…”
  9. “Your modifications will mean real improvements if I execute 100 runs of app:assembleDebug and see an execution faster than 5000 ms for at least 95 of them.”
  10. Law of Large Numbers

    Sample size?

    - ≥ 30: Normal distribution
    - < 30: Student's t-distribution

    “Super easy!”
  11. “Given that app:assembleDebug is quite slow, we will consider benchmarks running between 15 and 25 measured builds; therefore, Student's t-distribution will model our probability curve.”
  12. p-value

    - Sample
    - Probability analysis
    - Critical value (e.g., Z or t)
    - P-value (area): the probability of a Type I error
  13. “With the data you provided, the probability model tells me that out of every 100 runs of app:assembleDebug, 3 will actually be false positives.”
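
For the left-tailed t-test that the later slides settle on, the "area" that defines the p-value can be written as below. This is a standard textbook formulation, not something taken from the slides. The symbols are assumptions introduced here: t_obs is the observed test statistic, n is the number of measured builds, and f_{n-1} is the density of Student's t-distribution with n - 1 degrees of freedom.

```latex
p = P\left(T_{n-1} \le t_{\mathrm{obs}} \mid H_0\right)
  = \int_{-\infty}^{t_{\mathrm{obs}}} f_{n-1}(x)\,dx,
\qquad
\text{reject } H_0 \text{ when } p < \alpha
```

With alpha = 0.05, a p-value of 0.03 falls below the threshold, which matches the "3 out of every 100" reading in the quote above.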
  14. Critical value and tail analysis

    - Left-tailed: µA < µ0
    - Double-tailed: µA ≠ µ0
    - Right-tailed: µA > µ0

    “I do need some coffee. Do you?”
  15. µA ≤ µ0

    “Faster builds imply that the observed mean after your modifications is smaller than the value we had before. So we want a left-tailed analysis.”
  16. “But we don’t know the actual average value of app:assembleDebug for all possible builds…”

    “We will compare the samples for it captured at two different moments under slightly different conditions.”

    “This is called a paired test.”
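
A paired test works on the per-build differences between the two benchmark runs instead of on the two means separately. The formulation below is the usual textbook paired t statistic, written under assumed notation that does not appear on the slides: x_i^{(0)} is the i-th measured build of the status quo, x_i^{(A)} is the i-th measured build with the modifications, and n is the number of measured builds in each run.

```latex
d_i = x_i^{(A)} - x_i^{(0)}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}, \qquad
t = \frac{\bar{d}}{s_d / \sqrt{n}}
\quad (n - 1 \text{ degrees of freedom})
```

Faster builds after the change produce negative differences, so a clearly negative t is what a left-tailed analysis looks for.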
  17. “Please tell me that I won’t have to do all the calculations by myself…”

    “Yes, you will.”
  18. Gradle task

    - Benchmark #01 (status quo) and Benchmark #02 (modifications)
    - Left-tailed paired t-test, which produces a p-value
    - Compare p-value and alpha (0.05)
        - p-value is bigger: evidence that the build has improved is statistically weak
        - p-value is smaller: evidence that the build has improved is statistically strong
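
The flow on this slide can be sketched end to end with Python's standard library alone. This is an illustrative sketch, not the tooling used in the talk: the function names are hypothetical, the t-distribution CDF is approximated with Simpson's rule, and the input reuses the ten per-build values from the "Benchmark #01 / #02" table on slide 2, treating #01 as the status quo and #02 as the modified build (an assumption made only to have numbers to run).

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) using symmetry and Simpson's rule on [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the curve between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def paired_left_tailed_t_test(before, after, alpha=0.05):
    """Return (t statistic, p-value, True if the evidence of improvement is strong)."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("both samples need the same size (at least 2)")
    diffs = [a - b for a, b in zip(after, before)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (spread / math.sqrt(len(diffs)))
    p = t_cdf(t, len(diffs) - 1)  # left tail: area to the left of t
    return t, p, p < alpha


# Values transcribed from the table on slide 2 (units are not stated there).
status_quo = [4.0, 5.0, 5.1, 4.4, 3.9, 4.2, 4.6, 4.5, 4.4, 4.0]
modified = [4.4, 5.1, 5.2, 6.2, 3.4, 6.2, 4.6, 4.5, 3.3, 4.4]

t_stat, p_value, improved = paired_left_tailed_t_test(status_quo, modified)
print(f"t={t_stat:.3f} p={p_value:.3f} strong evidence of improvement: {improved}")
```

For this data the p-value lands well above alpha, so the flow ends in the "statistically weak" branch. In practice a statistics library is a safer choice than a hand-rolled CDF; SciPy, for instance, offers `scipy.stats.ttest_rel` with an `alternative="less"` option in recent versions.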
  19. Running Environment

    - 2018 MacBook Pro
    - Intel Core i7 (6 cores)
    - 16GB RAM

    .gradle/gradle.properties

        kotlin.parallel.tasks.in.project=true
        kapt.use.worker.api=true
        kapt.include.compile.classpath=true
        kapt.incremental.apt=false
        org.gradle.workers.max=6
  20. Example #01

    Target: https://github.com/JakeWharton/SdkSearch

    Scenario

        build {
            title = "Assemble Debug APK"
            tasks = "mobile:assembleDebug"
            daemon = warm
            cleanup-tasks = ["clean"]
        }

    Execution

    - 4 warmed-up builds
    - 15 measured builds (samples)

    Hypothesis

    - H0: No meaningful build improvements building with newer JDKs
    - Ha: JDK11 delivers faster Gradle builds than JDK8
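
The execution settings on this slide (4 warm-up builds, 15 measured builds) map onto Gradle Profiler's `--warmups` and `--iterations` command-line options when running with `--benchmark`. The sketch below only assembles such a command line and does not run it. The project directory `SdkSearch` and the scenario file name `build.scenarios` are hypothetical placeholders; only the scenario name `build` comes from the slide.

```python
from typing import List


def gradle_profiler_command(project_dir: str, scenario_file: str, scenario: str,
                            warmups: int, iterations: int) -> List[str]:
    """Build the argument list for a Gradle Profiler benchmark run."""
    return [
        "gradle-profiler",
        "--benchmark",
        "--project-dir", project_dir,      # root directory of the target build
        "--scenario-file", scenario_file,  # file that contains the scenario block
        "--warmups", str(warmups),         # builds that run before measuring starts
        "--iterations", str(iterations),   # measured builds (the samples)
        scenario,                          # name of the scenario block to run
    ]


# "SdkSearch" and "build.scenarios" are placeholder names for illustration.
command = gradle_profiler_command("SdkSearch", "build.scenarios", "build", 4, 15)
print(" ".join(command))
```

Running the same command once per configuration (for example, once per JDK) yields the two samples that the hypothesis test compares.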
  21. #1 Building with different JDKs

    - P-value > alpha
    - Ha has not been accepted
    - No strong statistical evidence of faster builds with JDK11
  22. Example #02

    Target: https://github.com/google/iosched

    Scenario

        build {
            title = "Assemble Debug APK"
            tasks = "mobile:assembleDebug"
            daemon = warm
            cleanup-tasks = ["clean"]
        }

    Execution

    - 4 warmed-up builds
    - 15 measured builds (samples)

    Hypothesis

    - H0: Bumps deliver no meaningful build improvements (current = AGP 3.4.1 and Gradle 5.1.1)
    - Ha: Bumps to AGP 3.5.3 and Gradle 6.2 deliver faster builds
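
Both examples depend on pulling the measured builds out of each benchmark run before any test can be applied. The sketch below shows one way to do that, assuming the benchmark CSV labels its rows `warm-up build #N` and `measured build #N` with one column per scenario. That layout is an assumption here and should be checked against the file your Gradle Profiler version writes. The sample text and its numbers are made up purely for illustration and are not results from the talk.

```python
import csv
import io


def measured_builds(csv_text, column=1):
    """Collect the values of rows whose label starts with 'measured build'."""
    values = []
    for row in csv.reader(io.StringIO(csv_text)):
        if row and row[0].startswith("measured build"):
            values.append(float(row[column]))
    return values


# Hypothetical file content; real files contain more header rows and builds.
SAMPLE_CSV = """scenario,build
warm-up build #1,5100
measured build #1,4000
measured build #2,5000
"""

samples = measured_builds(SAMPLE_CSV)
print(samples)
```

Warm-up rows are skipped on purpose, since only measured builds count as samples in the experiments above.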
  23. Call to action!

    - Play around with Gradle Profiler
    - Design your experiment
    - Run it!
    - Make your decisions based on data generated and analysed in your particular context, not on Tweets or Subreddits
  24. Ubiratan Soares

    - Brazilian Computer Scientist
    - Senior Software Engineer @ N26
    - GDE for Android and Kotlin
    - @ubiratanfsoares
    - ubiratansoares.dev