Developer Productivity Engineering: What's in it for me?

_It may surprise you to learn that we developers are a patient, tolerant species. People pay us to do what we enjoy - write code and create working applications. In return, we will put up with all sorts of blockages and toil that get in the way of this - long build times, flaky tests, hard-to-debug toolchain failures and so on._

_Is this truly the price we need to pay? Could there be a better world, where the build is as fast as it could possibly be? A world where problems that affect many developers are quickly identified, and fixed?_

Welcome to the world of Developer Productivity Engineering, where we can get computers to do what they’re good at (automation) to make developers’ lives easier, and make us more effective at our jobs. And while developer joy may be a difficult thing to sell to decision makers, effective developers who are making the best use of their time, and their hardware, have a direct impact on an organization’s ROI. What’s not to love?

In this talk, Trisha will explore what DPE is, give you some practical ways to get started, and discuss ways to help the leaders in your organisation to understand the enormous value DPE could unlock.

Trisha Gee

April 17, 2023

Transcript

  1. Enterprise
    Developer Productivity Engineering


    What’s in it for me?

  2. Trisha Gee
    ⬢ Lead Developer Advocate
    ⬢ Java Champion
    ⬢ 20+ years Java experience
    ⬢ …and author

  3. https://trishagee.com/books/

  4. But Bottlenecks to Productivity are Everywhere
    Code
    Code
    Wait Time for Local Build
    Debug Build Failure
    Lunch
    Code
    Wait Time for Local Build
    Investigate/Fix Flaky Tests
    Sprint
    Waiting time for CI Build

  5. “Bottlenecks in the toolchain are holding back the
    rockstar 10x developers”
    Pete Smoot, Software Architect, Dell Technologies

  6. The “best” programmers outperformed
    the worst by roughly a 10:1 ratio

  7. What Mattered?

  8. What Mattered?
    ⬢ Paired programmers performed at roughly the same level

  9. What Mattered?
    ⬢ Paired programmers performed at roughly the same level
    ⬢ The average difference was only 21% between paired participants

  10. What Mattered?
    ⬢ Paired programmers performed at roughly the same level
    ⬢ The average difference was only 21% between paired participants
    ⬢ They didn’t work together on the task, but they came from the same
    organization

  11. What Mattered?
    ⬢ Paired programmers performed at roughly the same level
    ⬢ The average difference was only 21% between paired participants
    ⬢ They didn’t work together on the task, but they came from the same
    organization
    ⬢ The best organization performed 11.1x better than the worst

  12. “While this productivity differential among
    programmers is understandable, there is also a 10 to 1
    difference in productivity among software
    organizations.”
    Software Productivity in the Enterprise
    Harlan (HD) Mills
    https://trace.tennessee.edu/cgi/viewcontent.cgi?article=1010&context=utk_harlan

  13. “The bald fact is that many companies provide
    developers with a workplace that is so crowded, noisy,
    and interruptive as to fill their days with frustration.
    That alone could explain reduced efficiency as well as a
    tendency for good people to migrate elsewhere.”
    Peopleware: Productive Projects and Teams, Third Edition
    Tom DeMarco, Tim Lister

  14. Though the phrase had not yet been coined, increased
    productivity came down to developer experience.

  15. Gradle is Pioneering DPE
    DPE is a new software development
    practice used by leading software
    development organizations to
    maximize developer productivity
    and happiness.

  16. What Problems Does DPE Solve?

  17. DevOps, 12-Factor, Agile, etc., have still not
    captured all bottlenecks, friction, and obstacles
    to throughput
    Many are hiding in plain sight, in the developer
    experience itself

  18. A 10x organization should be reducing
    build and test feedback times and
    improving the consistency and
    reliability of builds

  19. Pain Point:
    Waiting for Builds &
    Tests to Complete

  20. Are you tracking local build and test
    times?

  21. The only initiatives that will positively
    impact performance are ones which
    increase throughput while
    simultaneously decreasing cost

  22. Faster Builds Improve Creative Flow
                           Team 1    Team 2
    No. of Devs            11        6
    Build Time             4 mins    1 min
    No. of local builds    850       1010
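
The figures in this table can be turned into per-developer numbers with a little arithmetic. The sketch below assumes both teams were measured over the same period (the slide does not state the period), and the variable and function names are illustrative only.

```python
# Figures from the slide. The measurement period is not stated, so this
# sketch assumes it is the same for both teams.
teams = {
    "Team 1": {"devs": 11, "build_minutes": 4, "local_builds": 850},
    "Team 2": {"devs": 6, "build_minutes": 1, "local_builds": 1010},
}


def summarize(team):
    """Return (builds per developer, total minutes spent waiting on builds)."""
    builds_per_dev = team["local_builds"] / team["devs"]
    total_wait_minutes = team["local_builds"] * team["build_minutes"]
    return builds_per_dev, total_wait_minutes


for name, team in teams.items():
    per_dev, wait = summarize(team)
    print(f"{name}: {per_dev:.1f} builds per developer, {wait} minutes waiting")
```

On these figures, the team with the faster build ran roughly twice as many local builds per developer while spending less total time waiting.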

  23. Very Fast Feedback Is Important

  24. Solution: Acceleration Technologies

  25. Build Caching Speeds up Builds and Tests

  26. Build Caching
    ⬢ Introduced to the Java world by Gradle in 2017
    ⬢ Used by leading technology companies like Google and Facebook
    ⬢ Can support both user-local and remote caching for distributed teams

  27. Build Caching
    When the inputs have not changed, the outputs can be reused from a previous run.

  28. Demo: Build Cache for Maven and Gradle

  29. Remote Build Cache
    ⬢ Shared among different machines
    ⬢ Speeds up development for the whole team
    ⬢ Reuses build results among CI agents/jobs and individual developers

  30. Test Distribution Parallelizes Test Execution

  31. Existing solutions: Single machine parallelism
    Parallelism in Gradle is controlled by these flags:
    --parallel / org.gradle.parallel
        Controls project parallelism, defaults to false
    --max-workers / org.gradle.workers.max
        Controls the maximum number of workers, defaults to the number of processors/cores
    test.maxParallelForks
        Controls how many VMs are forked by an individual test task, defaults to 1
    See https://guides.gradle.org/performance/#parallel_execution for more information

  32. Existing solutions: CI fanout
    See https://builds.gradle.org/project/Gradle for an example of this strategy
    Test execution is distributed by manually partitioning the test set and then running partitions in
    parallel on several CI nodes.
    pipeline {
        stage('compile') { ... }
        parallelStage('test') {
            step {
                sh './gradlew :testGroup1'
            }
            step {
                sh './gradlew :testGroup2'
            }
            step {
                sh './gradlew :testGroup3'
            }
        }
    }
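
The manual partitioning step could be automated in many ways. As one possibility, the sketch below balances test groups by historical duration using a greedy longest-first assignment. The test names and durations are hypothetical, and this is not a description of how the Gradle build linked above partitions its tests.

```python
import heapq


def partition_tests(durations, groups):
    """Assign tests to `groups` partitions, longest test first, always
    adding to the partition with the smallest total duration so far."""
    # Heap entries are (total_seconds, partition_index).
    heap = [(0, index) for index in range(groups)]
    heapq.heapify(heap)
    partitions = [[] for _ in range(groups)]
    # Sort by duration descending, then by name for a deterministic result.
    for test, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        partitions[index].append(test)
        heapq.heappush(heap, (total + seconds, index))
    return partitions


# Hypothetical test classes with their last recorded duration in seconds.
durations = {
    "OrderTest": 120, "PaymentTest": 90, "CartTest": 60,
    "SearchTest": 45, "LoginTest": 30, "ProfileTest": 15,
}

for number, tests in enumerate(partition_tests(durations, 3), start=1):
    print(f"testGroup{number}: {tests}")
```

Even with a helper like this, the partitions still have to be wired into the CI configuration and the results collected afterwards, which is the limitation the deck goes on to discuss.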

  33. Assessment of existing solutions
    ⬢ Build Caching is great in many cases but
    doesn’t help when test inputs have changed.
    ⬢ Single machine parallelism is limited by that
    machine’s resources.
    ⬢ CI fanout does not help during local
    development, and requires manual setup, test
    partitioning, and result collection/aggregation.

  34. Test Distribution in Gradle Enterprise

  35. Test Distribution Results
    [Chart: three reductions in build time, each labelled ~50%]
    Measurements from the demo project
    Doubling the number of executors cuts build time in half

  36. Netflix reduced a 62-minute test cycle time down to just under 5 minutes!

  37. Machine learning leads to greater efficiencies

  38. Predictive Test Selection
    01 Instead of trying to analyze which tests could possibly be impacted by
    developer changes, Predictive Test Selection looks at the history of changes
    and what has happened to tests in the past
    02 When tests complete, they can either FAIL, SUCCEED, or be FLAKY.
    Predictive Test Selection will predict the outcome of the test based on the
    history it is analyzing
    03 PTS will recommend skipping tests that are predicted to succeed, and will only run tests
    that are likely to provide valuable feedback
    https://arxiv.org/pdf/1810.05286.pdf

  39. Force multiplier when used in combination
    1. Build Cache. Avoid unnecessarily running
    components of builds and tests whose inputs
    have not changed.
    2. Predictive Test Selection. Run only the
    relevant subset of test tasks likely to provide
    useful feedback.
    3. Test Distribution. Speed up the execution
    of the necessary and relevant remaining
    tests by running them in parallel.
    4. Performance Continuity. Sustain Test
    Distribution and other performance
    improvements over time with data analytics
    and performance profiling capabilities.

  40. Is the build and test cycle fast enough?

  41. Is the build and test cycle fast enough?

  42. Is the build and test cycle as fast as it
    can possibly be?

  43. Pain Point:
    Inefficient
    troubleshooting of
    broken builds

  44. “You can observe a lot by just watching.”
    Yogi Berra, Catcher and Philosopher

  45. Build Scan: scans.gradle.com

  46. Learn more
    https://bit.ly/grdl-scan

  47. DPE Organizations Track Failure Rates

  48. Pain Point:
    Flaky Tests & Other
    Avoidable Failures

  49. Flaky builds and tests are maddening

  50. The test is flaky. What do you do now?
    ⬢ Try it again
    ⬢ Re-run it
    ⬢ Re-run it again
    ⬢ Ignore it and approve PR
    ⬢ All of the above

  51. Identify and Track Flaky Tests
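
One simple way to identify flaky tests from build history is to look for tests that both passed and failed against the same commit. The sketch below assumes run records are available as (commit, test, passed) tuples. That record shape, the commit IDs, and the test names are hypothetical, and this is not a description of how any particular tool detects flakiness.

```python
from collections import defaultdict


def classify(runs):
    """Classify each test as FLAKY, FAIL, or SUCCEED.

    `runs` is an iterable of (commit, test, passed) records. A test is
    FLAKY if it both passed and failed on the same commit, FAIL if it
    failed at least once without ever showing mixed results on a single
    commit, and SUCCEED otherwise.
    """
    outcomes = defaultdict(lambda: defaultdict(set))
    for commit, test, passed in runs:
        outcomes[test][commit].add(passed)

    result = {}
    for test, by_commit in outcomes.items():
        if any(len(seen) == 2 for seen in by_commit.values()):
            result[test] = "FLAKY"
        elif any(False in seen for seen in by_commit.values()):
            result[test] = "FAIL"
        else:
            result[test] = "SUCCEED"
    return result


# Hypothetical history: LoginTest fails, then passes on a re-run of the same commit.
runs = [
    ("abc123", "LoginTest", False),
    ("abc123", "LoginTest", True),
    ("abc123", "CartTest", True),
    ("abc123", "OrderTest", False),
    ("abc123", "OrderTest", False),
]
print(classify(runs))
```

Tracking the result over time, rather than re-running until green, is what turns a flaky test from an annoyance into something that can be prioritised and fixed.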

  52. https://youtu.be/vHBzZHE4tJ0

  53. Pain Point:
    No Metric/KPI
    Observability

  54. Without focus, problems can sneak
    back in

  55. Continuous Improvement: It doesn’t really matter what you
    improve as long as you are constantly improving something,
    because…
    …entropy means that if you aren’t doing
    anything, you’re always getting worse.

  56. “The tools, services, and environments that developers
    need to do their jobs should be treated with
    production-level SLAs. The development platform is
    the production environment for the job of creating
    software”
    Release It! Second Edition
    Michael Nygard

  57. Pain Point:
    Inefficient use of CI
    Resources

  58. All Of This Will Improve CI

  59. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real

  60. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience

  61. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse

  62. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse
    ⬢ Fast feedback, efficient troubleshooting, and reliable cycles are key

  63. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse
    ⬢ Fast feedback, efficient troubleshooting, and reliable cycles are key
    ⬢ Start with observation, and then take action on data

  64. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse
    ⬢ Fast feedback, efficient troubleshooting, and reliable cycles are key
    ⬢ Start with observation, and then take action on data
    ⬢ Proactively solve problems for the whole team

  65. Source: TechValidate. TVID: 066-EEE-DB1

  66. DPE Transforms Every Business Layer

  67. https://bit.ly/speed-build
    Build speed challenge

  68. There’s a Book for This

  69. https://bit.ly/dpe-4me

  70. How it works…
    1. When a test run starts, the build tool
    submits a test input snapshot and test
    set to a machine learning model.
    2. PTS automatically develops a test
    selection strategy by learning from
    historical code changes and test
    outcomes from your Build Scan data to
    predict a subset of relevant tests, which
    are then executed by your build.
    3. Code change and test results data are
    processed immediately after a Build
    Scan is uploaded to PTS, and the test
    selection strategy is updated based on
    the new results.
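
The general idea of selecting tests from history can be illustrated with a toy frequency heuristic. The sketch below is not the machine learning model described on this slide. It simply records how often each test failed when a given file changed, and skips tests whose historical failure rate for the changed files is below a threshold. The function names, file names, and the 0.1 threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def failure_rates(history):
    """Return {(changed_file, test): failure rate} from past builds.

    `history` is a list of (changed_files, results) pairs, where
    `results` maps a test name to True if the test passed.
    """
    runs = defaultdict(int)
    failures = defaultdict(int)
    for changed_files, results in history:
        for path in changed_files:
            for test, passed in results.items():
                runs[(path, test)] += 1
                if not passed:
                    failures[(path, test)] += 1
    return {key: failures[key] / runs[key] for key in runs}


def select_tests(changed_files, all_tests, rates, threshold=0.1):
    """Pick tests likely to give useful feedback for this change.

    A test is selected if its failure rate for any changed file meets the
    threshold. Combinations never seen before are selected to be safe.
    """
    selected = []
    for test in all_tests:
        scores = [rates.get((path, test)) for path in changed_files]
        if any(score is None or score >= threshold for score in scores):
            selected.append(test)
    return selected


# Hypothetical build history.
history = [
    ({"Cart.java"}, {"CartTest": False, "LoginTest": True}),
    ({"Cart.java"}, {"CartTest": True, "LoginTest": True}),
    ({"Login.java"}, {"CartTest": True, "LoginTest": False}),
]
rates = failure_rates(history)
print(select_tests({"Cart.java"}, ["CartTest", "LoginTest"], rates))
```

Each new build result would be appended to the history and the rates recomputed, which loosely mirrors step 3 above.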

  71. Cache Key/Value Calculation
    The cacheKey for Gradle Tasks/Maven Goals is based on the Inputs:
    cacheKey(javaCompile) = hash(sourceFiles, jdk version, classpath, compiler args)
    The cacheEntry contains the output:
    cacheEntry[cacheKey(javaCompile)] = fileTree(classFiles)
    For more information, see:
    https://docs.gradle.org/current/userguide/build_cache.html
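
The key/value idea above can be sketched in a few lines. This is a minimal illustration of input-hash caching in general, not Gradle's or Maven's implementation: the `compile_java` stand-in, the in-memory dictionary, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def cache_key(source_files, jdk_version, classpath, compiler_args):
    """Hash every input that can affect the task's output."""
    digest = hashlib.sha256()

    def add(text):
        digest.update(text.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent inputs cannot run together

    for path in sorted(source_files):  # sorted so the key is stable
        add(path)
        add(source_files[path])
    add(jdk_version)
    for entry in classpath:
        add(entry)
    for arg in compiler_args:
        add(arg)
    return digest.hexdigest()


cache = {}               # cacheKey -> cacheEntry (in memory for this sketch)
stats = {"compiles": 0}  # counts how often the real work actually ran


def compile_java(source_files, jdk_version="17", classpath=(), compiler_args=()):
    """Stand-in for a compile task; returns the names of the class files."""
    key = cache_key(source_files, jdk_version, classpath, compiler_args)
    if key not in cache:
        stats["compiles"] += 1  # cache miss: do the expensive work
        cache[key] = sorted(p.replace(".java", ".class") for p in source_files)
    return cache[key]


sources = {"Hello.java": "class Hello {}"}
compile_java(sources)  # first run: cache miss, work is done
compile_java(sources)  # same inputs: output is reused
print(stats["compiles"])
```

Changing any input, such as the source text or the JDK version, produces a different key and therefore a fresh run. A remote cache applies the same lookup, with the dictionary replaced by a store shared between machines.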

