work mem warriors | PGCon 2020 | Jeff Davis, Melanie Plageman

work_mem

• • •

• • ◦ • •

• ◦ ◦ ◦ •

• ◦ ◦ • ◦ •

work_mem • • •

• • ◦ • ◦ • ◦ ◦ ◦

• • • • •

• • • ◦ • • •

Group 0 Group 1 Group 2 Group 3 Group 4
Group 5 Group 6 Group 7

• • • • • • • •

Group 5 Group 6 Group 7 Partitions

Group 5 Group 6 Group 7

• ◦ •

Group 5 Group 6 Group 7 free free free Group 0 Group 1 Group 2 Group 3 Group 4 Group 5 Group 6 Group 7

• ◦ •

By providing a knob for users to set memory consumption
limits and then respecting those limits, resource consumption will be stable. Queries that might be faster were memory to be unlimited might be slower, but the memory consumption will be predictable. Users can decide to change their workload based on this in order to get the performance they are looking for [ ]

bigger_t smaller_t bigger_t smaller_t SELECT * FROM bigger_t JOIN smaller_t
USING (i); 34

bigger_t smaller_t bigger_t smaller_t i INT bucketno = hash(i) &
(*nbuckets - 1) hash(i) -> 5 5 & (4 - 1) -> 1 35

bigger_t smaller_t bigger_t smaller_t i INT hash(i) -> 10 10
& (4 - 1) -> 2 bucketno = hash(i) & (nbuckets - 1) 36

bigger_t smaller_t bigger_t smaller_t i INT hash(i) -> 13 13
& (4 - 1) -> 1 37 bucketno = hash(i) & (nbuckets - 1)

bigger_t smaller_t bigger_t smaller_t 38

bigger_t smaller_t bigger_t smaller_t batchno = (hash(i) >> (nbuckets -
1)) & (*nbatches - 1) i INT hash(i) -> 2 (2 >> 3) & 2 -> 0 2 & 3 -> 2 39 hash(i) = 00000010 1 2 3 smaller_t batch 0

bigger_t smaller_t bigger_t smaller_t batchno = (hash(i) >> (nbuckets -
1)) & (*nbatches - 1) i INT hash(i) -> 10 (10 >> 3) & 3 -> 1 10 & 3 -> 2 40 smaller_t batch 0 2 3 1

bigger_t smaller_t batchno = (hash(i) >> (nbuckets - 1)) &
(nbatches - 1) i INT 41 hash(i) -> 10 (10 >> 3) & 3 -> 1 10 & 3 -> 2 smaller_t batch 1 bigger_t 2 3 1 2 3 1

bigger_t smaller_t batchno = (hash(i) >> (nbuckets - 1)) &
(nbatches - 1) i INT 42 hash(i) -> 56 (56 >> 3) & 3 -> 3 56 & 3 -> 0 smaller_t batch 3

PHJ_BUILD_HASHING_OUTER

bigger_t bigger_t smaller_t smaller_t batch 1 smaller_t batch 0 smaller_t
46

bigger_t smaller_t smaller_t batch 1 smaller_t smaller_t batch 5 smaller_t
batch 1 2 3 2 3 bigger_t batch 1 47

batch 0 smaller_t batch 1 1 48 batchno = (hash(i)
>> (nbuckets - 1)) & (nbatches - 1)

batch 0 smaller_t batch 1 batch 3 1 11 49

batch 0 smaller_t batch 1 batch 3 01 11 batch
7 111 50

batch 3 01 11 51 batch 7 111 batch 1
batch 0 smaller_t

batch 3 01 11 batch 7 111 52 batch 1
batch 0 smaller_t

batch 3 11 batch 7 111 53 1011 batch 11
01 batch 1 batch 0 smaller_t

bigger_t smaller_t bigger_t smaller_t smaller_t batch 0 smaller_t batch 1
PHJ_BUILD_HASHING_INNER 54

batch 0 smaller_t batch 1 smaller_t batch 2 smaller_t batch 3 PHJ_BUILD_HASHING_INNER 55

bigger_t smaller_t smaller_t batch 0 smaller_t batch 1 smaller_t batch
2 smaller_t batch 3 PHJ_BUILD_HASHING_INNER PHJ_BUILD_HASHING_OUTER bigger_t batch 0 bigger_t batch 1 bigger_t batch 2 bigger_t batch 3 56

batch 0 smaller_t batch 1 smaller_t batch 2 smaller_t batch 3 PHJ_BUILD_HASHING_INNER 57

• • work_mem batch 3 11 batch 7 111 1011
batch 3 batch 3 batch 11 011 0011 58

• • • BufFile batch 3 11 batch 7 111
1011 batch 11 011 0011 batch 3 batch 3 00011 10011 batch 19 batch 3 59

• ◦ ◦ ◦ 60

• ◦ ◦ ◦ 61

• ◦ ◦ 62

• ◦ work_mem ◦ ◦ 63

batch 3 11 batch 7 111 1011 batch 11 011
0011 batch 3 00011 10011 batch 19 batch 3 batch 3 11 batch 7 111 1011 batch 11 011 0011 batch 3 batch 3 • • • 64 batch 3

bigger_t batch 5 bigger_t smaller_t bigger_t smaller_t stripe 0 stripe
1 smaller_t 2 3 2 3 bigger_t batch 1 smaller_t batch 1 65

bigger_t smaller_t bigger_t smaller_t smaller_t 2 3 2 3 bigger_t
batch 1 smaller_t batch 1 stripe 0 stripe 0 stripe 1 stripe 0 66 bigger_t batch 5

bigger_t smaller_t bigger_t smaller_t smaller_t 2 3 2 3 bigger_t
batch 1 smaller_t batch 1 stripe 1 stripe 0 stripe 1 67

bigger_t smaller_t PHJ_BUILD_HASHING_INNER bigger_t smaller_t batch 1 smaller_t batch 0
smaller_t batch 1 smaller_t batch 2 smaller_t batch 3 smaller_t batch 0 68

bigger_t smaller_t smaller_t batch 1 bigger_t batch 1 stripe 0
stripe 1 69

bigger_t smaller_t bigger_t batch 1 smaller_t batch 1 stripe 0
stripe 0 stripe 1 stripe 0 smaller_t batch 2 smaller_t batch 3 bigger_t batch 0 bigger_t batch 2 bigger_t batch 3 smaller_t batch 0 70

smaller_t batch 2 smaller_t batch 3 bigger_t batch 0 bigger_t batch 2 bigger_t batch 3 smaller_t batch 0 stripe 0 stripe 1 71

• • bigger_t smaller_t bigger_t batch 1 smaller_t batch 1
stripe 0 72 stripe 1 smaller_t batch 1 stripe 0

outer match statuses 0001 bigger_t.batch_1.tuple_1 matches a tuple in smaller_t.batch_1.stripe_0 bigger_t tuples 2 - 4 have no matches in smaller_t.batch_1.stripe_0 73 stripe 1 smaller_t batch 1 stripe 0

outer match statuses 1001 bigger_t.batch_1.tuple_4 matches a tuple in smaller_t.batch_1.stripe_1 74 stripe 1 smaller_t batch 1 stripe 0

bigger_t smaller_t bigger_t batch 1 outer match statuses 1001 75

SharedTupleStore MinimalTuple TupleIds MinimalTuple - SharedBitStore EXPLAIN 76

write_file ParallelHashJoinState ParallelHashJoinBatch 0 ParallelHashJoinBatch 1 Outer SharedTuplestore batch 1
Inner SharedTuplestore batch 1 STS Participant 1 STS Participant n STS Participant 2 STS Participant 1 STS Participant n STS Participant 2 P1 Inner STS accessor P2 Inner STS accessor write_file hashtable 77

SharedTupleStore Inner STS P2 Write file P1 Write file STS
Chunk STS Chunk 78

SharedTupleStore P2 Write file P1 Write file STS Chunk STS
Chunk Stripe 1 Stripe 1 Stripe 2 Stripe 2 79 Inner STS

• ◦ ◦ • • 83 *https://www.postgresql.org/message-id/flat/CA%2BhUKGKWWmf%3DWELLG%3DaUGbcugRaSQbtm0tKYiBut-B2rVKX63g%40mail.gmail.com **https://github.com/melanieplageman/postgres/commits/alternative_AHJ

• ◦ ▪ ▪ • ◦ work_mem** ▪ ▪ 84
*Tomas Vondra’s patch https://www.postgresql.org/message-id/flat/20190504003414.bulcbnge3rhwhcsh%40development ** mentioned in the same thread

work mem warriors | PGCon 2020 | Jeff Davis, Me...

work mem warriors | PGCon 2020 | Jeff Davis, Melanie Plageman

More Decks by Azure Database for PostgreSQL

Other Decks in Technology

Featured

Transcript