Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Sorting improvements in PostgreSQL 9.5 and 9.6

Sorting improvements in PostgreSQL 9.5 and 9.6

Peter Geoghegan

September 18, 2015
Tweet

More Decks by Peter Geoghegan

Other Decks in Programming

Transcript

  1. PostgreSQL 9.5 • Sorting is important — CREATE INDEX, ORDER

    BY, DISTINCT(), CLUSTER, etc. • “Abbreviated keys” optimization added. • For text. • For numeric. • We sort an array of SortTuple structs. • Stores first field, and pointer to “tuple proper”. • e.g. For CREATE INDEX, tuple proper is IndexTuple.
  2. PostgreSQL 9.5 • First field is stored inline is SortTuple,

    and may be pass-by- value (the value itself, for types like integer), or pass-by- reference, for arbitrary sized types like jsonb. • Before, text was always simply treated as pass-by- reference. Pointer chasing required for every comparison. • However, with new abbreviated keys, a special pass-by- value prefix is stored rather than a pointer. Usually sufficient to resolve comparison without pointer chasing. • Basically, abbreviated value is first 8 bytes of each string being sorted.
  3. PostgreSQL 9.5 • Text also benefits from cheap memcmp() pre-

    check. Abbreviated key tie-breaker can use this for low-cardinality sets, and we still win big. • Tie-breaker must consult tuple proper. • For numeric, special encoding scheme usually makes comparisons simple integer comparisons.
  4. PostgreSQL 9.6 • Recently, started looking at external sorting. •

    Revising tuplesort to use simple hybrid sort-merge strategy. • Runs are quicksorted. Limited remaining use of heap data structure. • Various other optimizations that target memory bandwidth/ latency bottleneck. • Further big improvements in performance. 2x - 3.25x, even with a small fraction of the memory required to do everything in memory.