Structure of an
initial email
■ High level summary
■ Why do we want this?
■ Why this way?
■ Call to action
Slide 8
Slide 8 text
HIGH LEVEL SUMMARY
https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
Slide 9
Slide 9 text
Why do we
want this?
■ Problem fixed
■ Benefit added
■ Users impacted
■ Why now
https://www.postgresql.org/message-id/flat/cf85f46f-b02f-05b2-5248-5000b894ebab%40enterprisedb.com
Slide 10
Slide 10 text
Why this
way?
■ How is this different from
previous attempts?
■ Explain the design
■ Why not any obvious
alternative designs?
■ Why isn’t a part
implemented?
https://www.postgresql.org/message-id/flat/cf85f46f-b02f-05b2-5248-5000b894ebab%40enterprisedb.com
Slide 11
Slide 11 text
Call to action
■ TODOs awaiting validation
– No documentation yet
– Haven’t implemented it for this scenario
yet
■ Open questions
– Is this part of the design good?
■ What else needs to be done?
– Do I need to implement it for partition
tables?
■ Issues
– Why is this test failing?
https://www.postgresql.org/message-id/flat/cf85f46f-b02f-05b2-5248-5000b894ebab%40enterprisedb.com
https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
Slide 12
Slide 12 text
Other tips for initial email
■ Don't make everything equally important in commit message and email
■ Try to loop in invested users if user facing or devs if API
– CC people who have worked on this problem in the past
■ Chime in on/review threads in the same subject area
Slide 13
Slide 13 text
How to respond
■ Answer questions that reviewers ask you
■ Address review inline point-by-point when sending a new version
■ Be flexible with stylistic changes
■ Address all instances of a given type of feedback
■ Don’t be afraid to fork new mailing list discussions
Slide 14
Slide 14 text
EXERCISING YOUR
CODE
Slide 15
Slide 15 text
Opening Arguments
With my patch applied, vacuum is 10% faster on average. A trivial example:
CREATE TABLE foo(a INT) WITH (autovacuum_enabled=false);
INSERT INTO foo SELECT i FROM generate_series(1, 1000) AS i;
\timing on
VACUUM (VERBOSE) foo;
[Paste results here]
Slide 16
Slide 16 text
SIMPLE USAGE EXAMPLE
https://www.postgresql.org/message-id/flat/CA%2BTgmoYOYZfMCyOXFyC-P%2B-mdrZqm5pP2N7S-r0z3_402h9rsA%40mail.gmail.com
Slide 17
Slide 17 text
BENCHMARKS
Slide 18
Slide 18 text
Why Benchmark?
■ Performance improvement
■ Lack of performance regression
Slide 19
Slide 19 text
Less is More in Benchmarks
■ \timing on is okay!
■ Use pgbench
– Other supporting benchmarks are okay
■ Avoid bash scripts and attachments
■ Minimize distractions
■ Eliminate extraneous conditions
– Use development GUCs, table options when helpful
■ Minimize runtime
Slide 20
Slide 20 text
Benchmark the right thing
■ Use pgbench “custom scripts”
pgbench –c 1 –M prepared –P 1 –t 50 \
–f <(echo "COPY table(…) FROM 'copysource'")
■ Ensure reproducibility
Slide 21
Slide 21 text
Explain How to Run Your Benchmark
■ Postgres prep steps: prewarm, restart, initdb
■ Postgres settings (GUCs, table options, etc)
■ Schema and data
■ Machine specs or settings (e.g. readahead)
Slide 22
Slide 22 text
- DDL and data generation
- steps required
- command to run
https://www.postgresql.org/message-id/flat/CA%2BTgmoa8nddwpMA4Emn7sVoNrQ883mn8ZJBiqXa8dm3puKn%3DqQ%40mail.gmail.com#4251b35792be16116af1c6cf87c2eb5d
Slide 23
Slide 23 text
- machine under test
- brief results summary
https://www.postgresql.org/message-id/flat/0e9c1643-c620-47e7-b30e-6840b0abbe2e%40enterprisedb.com#a9ec204b960afffe4bbf405a93b603aa
Slide 24
Slide 24 text
Provide the realistic use case
■ If example is contrived, explain realistic scenario
– e.g. not tpcb-like pgbench with random data access distribution
■ Describe expected impact and how much of user base it would affect
■ Outline worst and best cases
■ Run additional benchmarks with longer runtimes that prove lack of negative
interaction with subsystems like checkpointer
Slide 25
Slide 25 text
Presenting Performance Test Results
■ Don't provide more benchmark results than is useful
■ TPS (Transactions per second) isn’t the only metric
– Plan diffs
– perf output summary
– P99 latency
– Memory utilization
– Per process IO consumption
– …
■ When adding a test is
difficult
■ Instills confidence in
correctness of work
■ Protects future developers
Debug-only Validation
Slide 32
Slide 32 text
SPLITTING COMMITS UP
Slide 33
Slide 33 text
in July…
■ reduces vacuum WAL volume by > 50%!
■ passes tests!
■ and the code looks good too!
Slide 34
Slide 34 text
in July…
■ 21 files changed, 140 insertions, 1348 deletions L
■ unreviewable L
■ no commit message L
Slide 35
Slide 35 text
the journey to committability
1. Every commit works
2. No omnibus commits
3. Each line of a commit is purposeful and clearly outlined in commit message
4. Refactoring should be motivated
5. Functional changes should be obvious
6. Comments and commit messages must clearly communicate risk and action
required
Slide 36
Slide 36 text
initial splitting
attempt
Slide 37
Slide 37 text
minor improvement may not merit churn
Slide 38
Slide 38 text
- turn the refactor into a feature
- independently valuable commit justifying churn
add error detail for messages originating in on-access pruning
Slide 39
Slide 39 text
Commit
message
doesn’t
communicate
lack of risk
Slide 40
Slide 40 text
Better commit
message
Slide 41
Slide 41 text
- separate form and function
- single issue commits
- don’t distract reviewer
Slide 42
Slide 42 text
separated into two commits
Slide 43
Slide 43 text
rearrange
then relocate
Slide 44
Slide 44 text
single issue commits
- separate commit for new APIs
Slide 45
Slide 45 text
SEPARATE
COMMITS OR
SEPARATE
SUB-
PROJECTS?
Slide 46
Slide 46 text
2O24.PGCONF.DEV
May 28-31
Vancouver, Canada
CFP through Jan 15, 2024