
Developer Productivity Engineering: What's in it for me? (LJC)


It may surprise you to learn that we developers are a patient, tolerant species. People pay us to do what we enjoy - write code and create working applications. In return, we will put up with all sorts of blockages and toil that get in the way of this - long build times, flaky tests, hard-to-debug toolchain failures and so on.

Is this truly the price we need to pay? Could there be a better world, where the build is as fast as it could possibly be? A world where problems that affect many developers are quickly identified and fixed?

Welcome to the world of Developer Productivity Engineering, where we can get computers to do what they’re good at (automation) to make developers’ lives easier, and make us more effective at our jobs. And while developer joy may be a difficult thing to sell to decision-makers, effective developers who are making the best use of their time, and their hardware, have a direct impact on an organization’s ROI.

In this talk, Trisha will explore what DPE is, give you some practical ways to get started, and discuss ways to help the leaders in your organisation to understand the enormous value DPE could unlock.

Trisha Gee

August 23, 2023


Transcript

  1. Enterprise
    Developer Productivity Engineering


    What’s in it for me?


  2. Trisha Gee
    ⬢ Lead Developer Advocate
    ⬢ Java Champion
    ⬢ 20+ years Java experience
    ⬢ …and author


  3. https://trishagee.com/books/


  4. But Bottlenecks to Productivity are Everywhere
    Code
    Code
    Wait Time for Local Build
    Debug Build Failure
    Lunch
    Code
    Wait Time for Local Build
    Investigate/Fix Flaky Tests
    Sprint
    Waiting time for CI Build


  5. How developers spend their time
    Source: The 2019 Tidelift managed open source survey results https://bit.ly/3MOEpK3


  6. “Bottlenecks in the toolchain are holding back the
    rockstar 10x developers”
    Pete Smoot, Software Architect, Dell Technologies


  7. The “best” programmers outperformed
    the worst by roughly a 10:1 ratio


  8. What Mattered?


  9. What Mattered?
    ⬢ Paired programmers performed at roughly the same level


  10. What Mattered?
    ⬢ Paired programmers performed at roughly the same level
    ⬢ The average difference was only 21% between paired participants


  11. What Mattered?
    ⬢ Paired programmers performed at roughly the same level
    ⬢ The average difference was only 21% between paired participants
    ⬢ They didn’t work together on the task, but they came from the same organization


  12. What Mattered?
    ⬢ Paired programmers performed at roughly the same level
    ⬢ The average difference was only 21% between paired participants
    ⬢ They didn’t work together on the task, but they came from the same organization
    ⬢ The best organization performed 11.1x better than the worst


  13. “While this productivity differential among
    programmers is understandable, there is also a 10 to 1
    difference in productivity among software
    organizations.”
    Software Productivity in the Enterprise


    Harlan (HD) Mills


    https://trace.tennessee.edu/cgi/viewcontent.cgi?article=1010&context=utk_harlan


  14. “The bald fact is that many companies provide
    developers with a workplace that is so crowded, noisy,
    and interruptive as to fill their days with frustration.
    That alone could explain reduced efficiency as well as a
    tendency for good people to migrate elsewhere.”
    Peopleware: Productive Projects and Teams, Third Edition


    Tom DeMarco, Tim Lister


  15. Though the phrase had not yet been coined, increased
    productivity came down to developer experience.


  16. Gradle is Pioneering DPE
    DPE is a new software development
    practice used by leading software
    development organizations to
    maximize developer productivity
    and happiness.


  17. What Problems Does DPE Solve?


  18. DevOps, 12-Factor, Agile, etc., have still not
    captured all bottlenecks, friction, and obstacles
    to throughput
    Many are hiding in plain sight, in the developer
    experience itself


  19. A 10x organization should be reducing
    build and test feedback times and
    improving the consistency and
    reliability of builds


  20. Pain Point:
    Waiting for Builds &
    Tests to Complete


  21. Are you tracking local build and test
    times?


  22. The only initiatives that will positively
    impact performance are ones which
    increase throughput while
    simultaneously decreasing cost


  23. Faster Builds Improve Creative Flow
                            Team 1    Team 2
    No. of Devs             11        6
    Build Time              4 mins    1 min
    No. of local builds     850       1010
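
The totals in this table hide the per-developer picture. As a quick worked calculation using only the figures on the slide (the slide does not say over what period the builds were counted, and the names below are illustrative):

```python
# Figures copied from the slide; dictionary keys and function names are illustrative.
teams = {
    "Team 1": {"devs": 11, "build_minutes": 4, "local_builds": 850},
    "Team 2": {"devs": 6, "build_minutes": 1, "local_builds": 1010},
}


def builds_per_dev(team):
    """Average number of local builds run by each developer."""
    return team["local_builds"] / team["devs"]


def total_wait_minutes(team):
    """Total time the team spent waiting for local builds."""
    return team["local_builds"] * team["build_minutes"]


for name, team in teams.items():
    print(f"{name}: {builds_per_dev(team):.1f} builds per developer, "
          f"{total_wait_minutes(team)} minutes waiting in total")
```

On these numbers the smaller team with the faster build ran roughly twice as many builds per developer (about 168 versus 77) while spending far less time waiting overall (1010 minutes versus 3400).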


  24. Very Fast Feedback Is Important


  25. Solution: Acceleration Technologies


  26. Build Caching Speeds up Builds and Tests


  27. Demo: No Build Cache for Maven


  28. Build Caching
    ⬢ Introduced to the Java world by Gradle in 2017
    ⬢ Used by leading technology companies like Google and Facebook
    ⬢ Supports both local (per-user) and remote caching for distributed teams


  29. Build Caching
    When the inputs have not changed, the outputs can be reused from a previous run.


  30. Demo: Build Cache for Maven


  31. Remote Build Cache
    ⬢ Shared among different machines


    ⬢ Speeds up development for the whole team


    ⬢ Reuses build results among CI agents/jobs and individual developers


  32. Test Distribution Parallelizes Test Execution


  33. Existing solutions: Single machine parallelism
    Parallelism in Gradle is controlled by these flags:
    --parallel / org.gradle.parallel
        Controls project parallelism, defaults to false
    --max-workers / org.gradle.workers.max
        Controls the maximum number of workers, defaults to the number of processors/cores
    test.maxParallelForks
        Controls how many VMs are forked by an individual test task, defaults to 1
    See https://guides.gradle.org/performance/#parallel_execution for more information


  34. Existing solutions: CI fanout
    Test execution is distributed by manually partitioning the test set and then running partitions in
    parallel on several CI nodes.
    See https://builds.gradle.org/project/Gradle for an example of this strategy

    pipeline {
        stage('compile') { ... }
        parallelStage('test') {
            step {
                sh './gradlew :testGroup1'
            }
            step {
                sh './gradlew :testGroup2'
            }
            step {
                sh './gradlew :testGroup3'
            }
        }
    }
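
The manual partitioning this slide describes can be sketched roughly as follows. This is a minimal illustration, not how any particular CI system does it: the test names and durations are invented, and `partition_tests` is a hypothetical helper that balances groups greedily using historical run times.

```python
import heapq


def partition_tests(durations, groups):
    """Greedy longest-first split of tests into `groups` buckets.

    `durations` maps a test name to its historical run time in seconds.
    Returns a list of (total_seconds, [test names]) tuples, one per bucket.
    """
    # Min-heap of (total time so far, bucket index): the lightest bucket pops first.
    heap = [(0.0, i) for i in range(groups)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(groups)]
    # Place the slowest tests first; ties are broken by name for determinism.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        buckets[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    totals = {index: total for total, index in heap}
    return [(totals[i], buckets[i]) for i in range(groups)]


# Hypothetical durations, for illustration only.
example = {
    "OrderTest": 120, "PaymentTest": 90, "UserTest": 60,
    "CartTest": 45, "SearchTest": 30, "AuthTest": 15,
}
for total, tests in partition_tests(example, 3):
    print(total, tests)
```

Each resulting group would map to one `testGroupN` task in a pipeline like the one above. The groups drift out of balance as tests are added or slow down, which is part of the maintenance cost of this approach.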


  35. Assessment of existing solutions
    ⬢ Build Caching is great in many cases but
    doesn’t help when test inputs have changed.
    ⬢ Single machine parallelism is limited by that
    machine’s resources.
    ⬢ CI fanout does not help during local
    development, and requires manual setup, test
    partitioning, and result collection/aggregation.


  36. Test Distribution in Gradle Enterprise


  37. Test Distribution Results
    [Chart: three successive reductions of ~50%]
    Measurements from the demo project
    Doubling the number of executors cuts build time in half


  38. Netflix reduced a 62-minute test cycle time down to just under 5 minutes!


  39. Machine learning leads to greater efficiencies


  40. Predictive Test Selection
    01 Instead of trying to analyze which tests could possibly be impacted by
    developer changes, Predictive Test Selection looks at the history of changes
    and what has happened to tests in the past
    02 When tests complete, they can either FAIL, SUCCEED, or be FLAKY.
    Predictive Test Selection will predict the outcome of the test based on the
    history it is analyzing
    03 PTS will recommend skipping tests that are predicted to succeed, and will only run tests
    that are likely to provide valuable feedback
    https://arxiv.org/pdf/1810.05286.pdf


  41. Force multiplier when used in combination
    1. Build Cache. Avoid unnecessarily running
    components of builds and tests whose inputs
    have not changed.
    2. Predictive Test Selection. Run only the
    relevant subset of test tasks likely to provide
    useful feedback.
    3. Test Distribution. Speed up the execution
    of the necessary and relevant remaining
    tests by running them in parallel.
    4. Performance Continuity. Sustain Test
    Distribution and other performance
    improvements over time with data analytics
    and performance profiling capabilities.


  42. Is the build and test cycle fast enough?



  44. Is the build and test cycle as fast as it
    can possibly be?


  45. Pain Point:
    Inefficient
    troubleshooting of
    broken builds


  46. “You can observe a lot by just watching.”
    Yogi Berra, Catcher and Philosopher


  47. Build Scan: scans.gradle.com


  48. DPE Organizations Track Failure Rates


  49. Pain Point:
    Flaky Tests & Other
    Avoidable Failures


  50. Flaky builds and tests are maddening


  51. The test is flaky. What do you do now?
    ⬢ Try it again
    ⬢ Re-run it
    ⬢ Re-run it again
    ⬢ Ignore it and approve PR
    ⬢ All of the above


  52. “…our analysis revealed that re-running the failing
    build and attempting to repair the flaky test were the
    most common actions.”
    Surveying the Developer Experience of Flaky Tests


    https://mcminn.info/publications/c72.pdf



  53. “…our analysis revealed that re-running the failing
    build and attempting to repair the flaky test were the
    most common actions. Our findings also suggested that
    developers who experience flaky tests more often are
    more likely to take no action in response to them.”
    Surveying the Developer Experience of Flaky Tests


    https://mcminn.info/publications/c72.pdf



  54. Identify and Track Flaky Tests
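
One common working definition, assumed here rather than taken from the slides, is that a test is flaky when it both passes and fails against the same code revision. A minimal sketch of identifying such tests from run history follows; the record layout, commit IDs and test names are all hypothetical.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return tests that both passed and failed on the same commit.

    `runs` is an iterable of (commit, test_name, passed) tuples.
    """
    outcomes = defaultdict(set)
    for commit, test_name, passed in runs:
        outcomes[(commit, test_name)].add(passed)
    # A (commit, test) pair that saw both True and False is evidence of flakiness.
    return sorted({test for (_, test), seen in outcomes.items() if seen == {True, False}})


# Hypothetical history, for illustration only.
history = [
    ("abc123", "CheckoutTest", True),
    ("abc123", "CheckoutTest", False),  # same commit, different outcome
    ("abc123", "LoginTest", True),
    ("def456", "LoginTest", False),     # different commit, so not evidence of flakiness
]
print(find_flaky_tests(history))
```

Tracking the result over time, rather than re-running and moving on, is what turns a recurring annoyance into something a team can prioritise and fix.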


  55. https://youtu.be/vHBzZHE4tJ0


  56. Blogs: Finding and Fixing Flaky Tests
    ⬢ https://bit.ly/why-fix-flaky
    ⬢ https://bit.ly/find-flaky
    ⬢ https://bit.ly/fix-flaky


  57. Pain Point:
    No Metric/KPI
    Observability


  58. Without focus, problems can sneak
    back in


  59. Continuous Improvement: It doesn’t really matter what you
    improve as long as you are constantly improving something,
    because…
    …entropy means that if you aren’t doing
    anything, you’re always getting worse.


  60. “The tools, services, and environments that developers
    need to do their jobs should be treated with
    production-level SLAs. The development platform is
    the production environment for the job of creating
    software”
    Release It! Second Edition


    Michael Nygard



  61. Pain Point:
    Inefficient use of CI
    Resources


  62. All Of This Will Improve CI



  63. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real


  64. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience


  65. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse


  66. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse
    ⬢ Fast feedback, efficient troubleshooting, and reliable cycles are key


  67. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse
    ⬢ Fast feedback, efficient troubleshooting, and reliable cycles are key
    ⬢ Start with observation, and then take action on data


  68. In Summary
    ⬢ 10x Developers might be a myth, but 10x Organisations are real
    ⬢ Developer Productivity is deeply linked to Developer Experience
    ⬢ If you do nothing about productivity, life will get worse
    ⬢ Fast feedback, efficient troubleshooting, and reliable cycles are key
    ⬢ Start with observation, and then take action on data
    ⬢ Proactively solve problems for the whole team


  69. Source: TechValidate. TVID: 066-EEE-DB1


  70. DPE Transforms Every Business Layer


  71. https://bit.ly/dpe-4me


  72. Join the DPE Community and learn more!


  73. How it works…
    1. When a test run starts, the build tool
    submits a test input snapshot and test
    set to a machine learning model.


    2. PTS automatically develops a test
    selection strategy by learning from
    historical code changes and test
    outcomes from your Build Scan data to
    predict a subset of relevant tests, which
    are then executed by your build.


    3. Code change and test results data are
    processed immediately after a Build
    Scan is uploaded to PTS, and the test
    selection strategy is updated based on
    the new results.
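
To make the idea of selecting tests from history concrete, here is a deliberately simple sketch. It is not the machine learning model the slides refer to: it swaps in a plain frequency heuristic, and `select_tests`, the `threshold` parameter, the record layout and the file and test names are all assumptions made for illustration.

```python
from collections import defaultdict


def select_tests(history, changed_files, all_tests, threshold=0.1):
    """Toy stand-in for a learned model.

    Selects tests that failed often enough when the same files changed in the
    past, plus any test with no relevant history (too risky to skip).
    `history` is an iterable of (files_changed, test_name, failed) records.
    """
    runs = defaultdict(int)
    failures = defaultdict(int)
    changed = set(changed_files)
    for files, test, failed in history:
        if changed & set(files):  # only learn from changes touching the same files
            runs[test] += 1
            failures[test] += int(failed)
    selected = []
    for test in all_tests:
        if runs[test] == 0 or failures[test] / runs[test] >= threshold:
            selected.append(test)
    return selected


# Hypothetical history, for illustration only.
past = [
    (["Cart.java"], "CartTest", True),
    (["Cart.java"], "CartTest", False),
    (["Cart.java"], "UserTest", False),
    (["Cart.java"], "UserTest", False),
    (["User.java"], "UserTest", True),
]
print(select_tests(past, ["Cart.java"], ["CartTest", "UserTest", "NewTest"]))
```

In this example `UserTest` is skipped because it never failed when `Cart.java` changed, while `NewTest` still runs because nothing is known about it yet. Feeding each new result back into `past` corresponds loosely to step 3 above.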


  74. Cache Key/Value Calculation


    The cacheKey for Gradle Tasks/Maven Goals is based on the Inputs:


    cacheKey(javaCompile) = hash(sourceFiles, jdk version, classpath, compiler args)


    The cacheEntry contains the output:


    cacheEntry[cacheKey(javaCompile)] = fileTree(classFiles)


    For more information, see:


    https://docs.gradle.org/current/userguide/build_cache.html
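
The key/entry relationship above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Gradle's or Maven's implementation: the cache is a plain in-memory dictionary, the classpath is represented by entry names only (a real build tool would take the contents into account), and `cache_key`, `compile_with_cache` and `fake_compile` are hypothetical names.

```python
import hashlib


def cache_key(source_files, jdk_version, classpath, compiler_args):
    """Hash every input that can affect the output of a compile step.

    `source_files` maps a path to the file's contents as bytes.
    """
    digest = hashlib.sha256()
    for path in sorted(source_files):  # sorted so ordering does not change the key
        digest.update(path.encode() + b"\0" + source_files[path] + b"\0")
    for part in (jdk_version, *classpath, *compiler_args):
        digest.update(part.encode() + b"\0")
    return digest.hexdigest()


cache = {}  # in-memory stand-in for a local or remote build cache


def compile_with_cache(source_files, jdk_version, classpath, compiler_args, compile_fn):
    """Reuse the stored output when the inputs hash to a known key."""
    key = cache_key(source_files, jdk_version, classpath, compiler_args)
    if key not in cache:
        cache[key] = compile_fn(source_files)  # cache miss: do the real work
    return cache[key]


calls = []


def fake_compile(sources):
    """Pretend compiler that records how often it is actually invoked."""
    calls.append(1)
    return {name.replace(".java", ".class"): b"bytecode" for name in sources}


sources = {"Hello.java": b"class Hello {}"}
compile_with_cache(sources, "17", ["lib.jar"], ["-g"], fake_compile)
compile_with_cache(sources, "17", ["lib.jar"], ["-g"], fake_compile)  # inputs unchanged: reused
compile_with_cache(sources, "21", ["lib.jar"], ["-g"], fake_compile)  # JDK version changed: recompiled
print(len(calls))
```

Only two of the three requests reach the compiler. A remote cache applies the same idea across machines, so a result produced on one CI agent can be reused by another agent or a developer's local build.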

