Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Making your patch more committable

Melanie
December 21, 2023

Making your patch more committable

What to do and not to do to make your Postgres patch more committable.

Melanie

December 21, 2023
Tweet

More Decks by Melanie

Other Decks in Programming

Transcript

  1. MAKING YOUR
    PATCH MORE
    COMMITTABLE
    Melanie Plageman
    Microsoft

    View full-size slide

  2. Communication and collaboration
    Exercising your code
    Splitting up commits

    View full-size slide

  3. Terminology
    ■ patch ó commit
    ■ patch set ó commits
    (branch)
    ■ mailing list (pgsql-
    hackers)

    View full-size slide

  4. THREE YEARS AGO…
    August 2020
    https://www.postgresql.org/message-id/CAAKRu_aLMRHX6_y%3DK5i5wBMTMQvoPMO8DT3eyCziTHjsY11cVA%40mail.gmail.com

    View full-size slide

  5. https://www.postgresql.org/message-id/1fc1f116-b235-f016-a1f8-f124a9840a5c%40enterprisedb.com

    View full-size slide

  6. COMMUNICATION &
    COLLABORATION

    View full-size slide

  7. Structure of an
    initial email
    ■ High level summary
    ■ Why do we want this?
    ■ Why this way?
    ■ Call to action

    View full-size slide

  8. HIGH LEVEL SUMMARY
    https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com

    View full-size slide

  9. Why do we
    want this?
    ■ Problem fixed
    ■ Benefit added
    ■ Users impacted
    ■ Why now
    https://www.postgresql.org/message-id/flat/cf85f46f-b02f-05b2-5248-5000b894ebab%40enterprisedb.com

    View full-size slide

  10. Why this
    way?
    ■ How is this different from
    previous attempts?
    ■ Explain the design
    ■ Why not any obvious
    alternative designs?
    ■ Why isn’t a part
    implemented?
    https://www.postgresql.org/message-id/flat/cf85f46f-b02f-05b2-5248-5000b894ebab%40enterprisedb.com

    View full-size slide

  11. Call to action
    ■ TODOs awaiting validation
    – No documentation yet
    – Haven’t implemented it for this scenario
    yet
    ■ Open questions
    – Is this part of the design good?
    ■ What else needs to be done?
    – Do I need to implement it for partition
    tables?
    ■ Issues
    – Why is this test failing?
    https://www.postgresql.org/message-id/flat/cf85f46f-b02f-05b2-5248-5000b894ebab%40enterprisedb.com
    https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com

    View full-size slide

  12. Other tips for initial email
    ■ Don't make everything equally important in commit message and email
    ■ Try to loop in invested users if user facing or devs if API
    – CC people who have worked on this problem in the past
    ■ Chime in on/review threads in the same subject area

    View full-size slide

  13. How to respond
    ■ Answer questions that reviewers ask you
    ■ Address review inline point-by-point when sending a new version
    ■ Be flexible with stylistic changes
    ■ Address all instances of a given type of feedback
    ■ Don’t be afraid to fork new mailing list discussions

    View full-size slide

  14. EXERCISING YOUR
    CODE

    View full-size slide

  15. Opening Arguments
    With my patch applied, vacuum is 10% faster on average. A trivial example:
    CREATE TABLE foo(a INT) WITH (autovacuum_enabled=false);
    INSERT INTO foo SELECT i FROM generate_series(1, 1000) AS i;
    \timing on
    VACUUM (VERBOSE) foo;
    [Paste results here]

    View full-size slide

  16. SIMPLE USAGE EXAMPLE
    https://www.postgresql.org/message-id/flat/CA%2BTgmoYOYZfMCyOXFyC-P%2B-mdrZqm5pP2N7S-r0z3_402h9rsA%40mail.gmail.com

    View full-size slide

  17. Why Benchmark?
    ■ Performance improvement
    ■ Lack of performance regression

    View full-size slide

  18. Less is More in Benchmarks
    ■ \timing on is okay!
    ■ Use pgbench
    – Other supporting benchmarks are okay
    ■ Avoid bash scripts and attachments
    ■ Minimize distractions
    ■ Eliminate extraneous conditions
    – Use development GUCs, table options when helpful
    ■ Minimize runtime

    View full-size slide

  19. Benchmark the right thing
    ■ Use pgbench “custom scripts”
    pgbench –c 1 –M prepared –P 1 –t 50 \
    –f <(echo "COPY table(…) FROM 'copysource'")
    ■ Ensure reproducibility

    View full-size slide

  20. Explain How to Run Your Benchmark
    ■ Postgres prep steps: prewarm, restart, initdb
    ■ Postgres settings (GUCs, table options, etc)
    ■ Schema and data
    ■ Machine specs or settings (e.g. readahead)

    View full-size slide

  21. - DDL and data generation
    - steps required
    - command to run
    https://www.postgresql.org/message-id/flat/CA%2BTgmoa8nddwpMA4Emn7sVoNrQ883mn8ZJBiqXa8dm3puKn%3DqQ%40mail.gmail.com#4251b35792be16116af1c6cf87c2eb5d

    View full-size slide

  22. - machine under test
    - brief results summary
    https://www.postgresql.org/message-id/flat/0e9c1643-c620-47e7-b30e-6840b0abbe2e%40enterprisedb.com#a9ec204b960afffe4bbf405a93b603aa

    View full-size slide

  23. Provide the realistic use case
    ■ If example is contrived, explain realistic scenario
    – e.g. not tpcb-like pgbench with random data access distribution
    ■ Describe expected impact and how much of user base it would affect
    ■ Outline worst and best cases
    ■ Run additional benchmarks with longer runtimes that prove lack of negative
    interaction with subsystems like checkpointer

    View full-size slide

  24. Presenting Performance Test Results
    ■ Don't provide more benchmark results than is useful
    ■ TPS (Transactions per second) isn’t the only metric
    – Plan diffs
    – perf output summary
    – P99 latency
    – Memory utilization
    – Per process IO consumption
    – …

    View full-size slide

  25. https://www.postgresql.org/message-id/flat/cf85f46f-b02f-05b2-5248-5000b894ebab%40enterprisedb.com

    View full-size slide

  26. https://www.postgresql.org/message-id/flat/20230630002952.34zj7conccw42yiv%40awork3.anarazel.de#2016c72ed21d7d8b09d830dd84876ae9

    View full-size slide

  27. Regression Tests
    ■ Minimal
    ■ Fast
    ■ Reliable
    ■ Core suite (SQL)
    ■ “Isolation, recovery, subscription” suite (Perl)
    ■ …

    View full-size slide

  28. Reproduction Scripts
    https://www.postgresql.org/message-id/flat/8d526f29-c269-dc3d-38ac-10fa3a0a7fb3%40gmail.com#481e217bc7896fd4542f928d656dc46b

    View full-size slide

  29. ■ When adding a test is
    difficult
    ■ Instills confidence in
    correctness of work
    ■ Protects future developers
    Debug-only Validation

    View full-size slide

  30. SPLITTING COMMITS UP

    View full-size slide

  31. in July…
    ■ reduces vacuum WAL volume by > 50%!
    ■ passes tests!
    ■ and the code looks good too!

    View full-size slide

  32. in July…
    ■ 21 files changed, 140 insertions, 1348 deletions L
    ■ unreviewable L
    ■ no commit message L

    View full-size slide

  33. the journey to committability
    1. Every commit works
    2. No omnibus commits
    3. Each line of a commit is purposeful and clearly outlined in commit message
    4. Refactoring should be motivated
    5. Functional changes should be obvious
    6. Comments and commit messages must clearly communicate risk and action
    required

    View full-size slide

  34. initial splitting
    attempt

    View full-size slide

  35. minor improvement may not merit churn

    View full-size slide

  36. - turn the refactor into a feature
    - independently valuable commit justifying churn
    add error detail for messages originating in on-access pruning

    View full-size slide

  37. Commit
    message
    doesn’t
    communicate
    lack of risk

    View full-size slide

  38. Better commit
    message

    View full-size slide

  39. - separate form and function
    - single issue commits
    - don’t distract reviewer

    View full-size slide

  40. separated into two commits

    View full-size slide

  41. rearrange
    then relocate

    View full-size slide

  42. single issue commits
    - separate commit for new APIs

    View full-size slide

  43. SEPARATE
    COMMITS OR
    SEPARATE
    SUB-
    PROJECTS?

    View full-size slide

  44. 2O24.PGCONF.DEV
    May 28-31
    Vancouver, Canada
    CFP through Jan 15, 2024

    View full-size slide