Upgrade to Pro — share decks privately, control downloads, hide ads and more …

work mem warriors | PGCon 2020 | Jeff Davis, Me...

work mem warriors | PGCon 2020 | Jeff Davis, Melanie Plageman

Recent work on the Postgres executor has made improvements to memory management -- from accounting for the memory used to responding to memory pressure. It is important to bind the memory usage of the database with the appropriate execution mechanisms and to choose those during planning based on cost in order to meet users' expectations and ensure predictable performance.

This talk will cover three such improvements: the addition of memory accounting information to MemoryContexts, the memory-bounding of HashAgg (spilling HashAgg to disk), and the adaptive hashjoin fallback to nested hashloop join.

The talk will also include an interactive session to solicit feedback from users on their expectations and experiences with work_mem and the memory behavior of Postgres.

More Decks by Azure Database for PostgreSQL

Other Decks in Technology

Transcript

  1. Group 0 Group 1 Group 2 Group 3 Group 4

    Group 5 Group 6 Group 7
  2. Group 0 Group 1 Group 2 Group 3 Group 4

    Group 5 Group 6 Group 7
  3. Group 0 Group 1 Group 2 Group 3 Group 4

    Group 5 Group 6 Group 7 Partitions
  4. Group 0 Group 1 Group 2 Group 3 Group 4

    Group 5 Group 6 Group 7
  5. Group 0 Group 1 Group 2 Group 3 Group 4

    Group 5 Group 6 Group 7
  6. Group 0 Group 1 Group 2 Group 3 Group 4

    Group 5 Group 6 Group 7
  7. Group 0 Group 1 Group 2 Group 3 Group 4

    Group 5 Group 6 Group 7 free free free Group 0 Group 1 Group 2 Group 3 Group 4 Group 5 Group 6 Group 7
  8. By providing a knob for users to set memory consumption

    limits and then respecting those limits, resource consumption will be stable. Queries that might be faster were memory to be unlimited might be slower, but the memory consumption will be predictable. Users can decide to change their workload based on this in order to get the performance they are looking for [ ]
  9. bigger_t smaller_t bigger_t smaller_t i INT bucketno = hash(i) &

    (*nbuckets - 1) hash(i) -> 5 5 & (4 - 1) -> 1 35
  10. bigger_t smaller_t bigger_t smaller_t i INT hash(i) -> 10 10

    & (4 - 1) -> 2 bucketno = hash(i) & (nbuckets - 1) 36
  11. bigger_t smaller_t bigger_t smaller_t i INT hash(i) -> 13 13

    & (4 - 1) -> 1 37 bucketno = hash(i) & (nbuckets - 1)
  12. bigger_t smaller_t bigger_t smaller_t batchno = (hash(i) >> (nbuckets -

    1)) & (*nbatches - 1) i INT hash(i) -> 2 (2 >> 3) & 2 -> 0 2 & 3 -> 2 39 hash(i) = 00000010 1 2 3 smaller_t batch 0
  13. bigger_t smaller_t bigger_t smaller_t batchno = (hash(i) >> (nbuckets -

    1)) & (*nbatches - 1) i INT hash(i) -> 10 (10 >> 3) & 3 -> 1 10 & 3 -> 2 40 smaller_t batch 0 2 3 1
  14. bigger_t smaller_t batchno = (hash(i) >> (nbuckets - 1)) &

    (nbatches - 1) i INT 41 hash(i) -> 10 (10 >> 3) & 3 -> 1 10 & 3 -> 2 smaller_t batch 1 bigger_t 2 3 1 2 3 1
  15. bigger_t smaller_t batchno = (hash(i) >> (nbuckets - 1)) &

    (nbatches - 1) i INT 42 hash(i) -> 56 (56 >> 3) & 3 -> 3 56 & 3 -> 0 smaller_t batch 3
  16. batch 0 smaller_t batch 1 1 48 batchno = (hash(i)

    >> (nbuckets - 1)) & (nbatches - 1)
  17. batch 3 11 batch 7 111 53 1011 batch 11

    01 batch 1 batch 0 smaller_t
  18. bigger_t bigger_t smaller_t smaller_t batch 0 smaller_t batch 1 smaller_t

    batch 0 smaller_t batch 1 smaller_t batch 2 smaller_t batch 3 PHJ_BUILD_HASHING_INNER 55
  19. bigger_t smaller_t smaller_t batch 0 smaller_t batch 1 smaller_t batch

    2 smaller_t batch 3 PHJ_BUILD_HASHING_INNER PHJ_BUILD_HASHING_OUTER bigger_t batch 0 bigger_t batch 1 bigger_t batch 2 bigger_t batch 3 56
  20. bigger_t bigger_t smaller_t smaller_t batch 0 smaller_t batch 1 smaller_t

    batch 0 smaller_t batch 1 smaller_t batch 2 smaller_t batch 3 PHJ_BUILD_HASHING_INNER 57
  21. • • work_mem batch 3 11 batch 7 111 1011

    batch 3 batch 3 batch 11 011 0011 58
  22. • • • BufFile batch 3 11 batch 7 111

    1011 batch 11 011 0011 batch 3 batch 3 00011 10011 batch 19 batch 3 59
  23. batch 3 11 batch 7 111 1011 batch 11 011

    0011 batch 3 00011 10011 batch 19 batch 3 batch 3 11 batch 7 111 1011 batch 11 011 0011 batch 3 batch 3 • • • 64 batch 3
  24. bigger_t batch 5 bigger_t smaller_t bigger_t smaller_t stripe 0 stripe

    1 smaller_t 2 3 2 3 bigger_t batch 1 smaller_t batch 1 65
  25. bigger_t smaller_t bigger_t smaller_t smaller_t 2 3 2 3 bigger_t

    batch 1 smaller_t batch 1 stripe 0 stripe 0 stripe 1 stripe 0 66 bigger_t batch 5
  26. bigger_t smaller_t bigger_t smaller_t smaller_t 2 3 2 3 bigger_t

    batch 1 smaller_t batch 1 stripe 1 stripe 0 stripe 1 67
  27. bigger_t smaller_t PHJ_BUILD_HASHING_INNER bigger_t smaller_t batch 1 smaller_t batch 0

    smaller_t batch 1 smaller_t batch 2 smaller_t batch 3 smaller_t batch 0 68
  28. bigger_t smaller_t bigger_t batch 1 smaller_t batch 1 stripe 0

    stripe 0 stripe 1 stripe 0 smaller_t batch 2 smaller_t batch 3 bigger_t batch 0 bigger_t batch 2 bigger_t batch 3 smaller_t batch 0 70
  29. bigger_t smaller_t bigger_t batch 1 smaller_t batch 1 stripe 1

    smaller_t batch 2 smaller_t batch 3 bigger_t batch 0 bigger_t batch 2 bigger_t batch 3 smaller_t batch 0 stripe 0 stripe 1 71
  30. • • bigger_t smaller_t bigger_t batch 1 smaller_t batch 1

    stripe 0 72 stripe 1 smaller_t batch 1 stripe 0
  31. bigger_t smaller_t bigger_t batch 1 smaller_t batch 1 stripe 0

    outer match statuses 0001 bigger_t.batch_1.tuple_1 matches a tuple in smaller_t.batch_1.stripe_0 bigger_t tuples 2 - 4 have no matches in smaller_t.batch_1.stripe_0 73 stripe 1 smaller_t batch 1 stripe 0
  32. bigger_t smaller_t bigger_t batch 1 smaller_t batch 1 stripe 1

    outer match statuses 1001 bigger_t.batch_1.tuple_4 matches a tuple in smaller_t.batch_1.stripe_1 74 stripe 1 smaller_t batch 1 stripe 0
  33. write_file ParallelHashJoinState ParallelHashJoinBatch 0 ParallelHashJoinBatch 1 Outer SharedTuplestore batch 1

    Inner SharedTuplestore batch 1 STS Participant 1 STS Participant n STS Participant 2 STS Participant 1 STS Participant n STS Participant 2 P1 Inner STS accessor P2 Inner STS accessor write_file hashtable 77
  34. SharedTupleStore P2 Write file P1 Write file STS Chunk STS

    Chunk Stripe 1 Stripe 1 Stripe 2 Stripe 2 79 Inner STS
  35. • ◦ ▪ ▪ • ◦ work_mem** ▪ ▪ 84

    *Tomas Vondra’s patch https://www.postgresql.org/message-id/flat/20190504003414.bulcbnge3rhwhcsh%40development ** mentioned in the same thread