Sorting improvements in PostgreSQL 9.5 and 9.6

Sorting improvements in PostgreSQL 9.5 and 9.6

F9a2dba12b94d0c204d846a29da56bf5?s=128

Peter Geoghegan

September 18, 2015
Tweet

Transcript

  1. 2.

    PostgreSQL 9.5 • Sorting is important — CREATE INDEX, ORDER

    BY, DISTINCT(), CLUSTER, etc. • “Abbreviated keys” optimization added. • For text. • For numeric. • We sort an array of SortTuple structs. • Stores first field, and pointer to “tuple proper”. • e.g. For CREATE INDEX, tuple proper is IndexTuple.
  2. 3.

    PostgreSQL 9.5 • First field is stored inline is SortTuple,

    and may be pass-by- value (the value itself, for types like integer), or pass-by- reference, for arbitrary sized types like jsonb. • Before, text was always simply treated as pass-by- reference. Pointer chasing required for every comparison. • However, with new abbreviated keys, a special pass-by- value prefix is stored rather than a pointer. Usually sufficient to resolve comparison without pointer chasing. • Basically, abbreviated value is first 8 bytes of each string being sorted.
  3. 4.

    PostgreSQL 9.5 • Text also benefits from cheap memcmp() pre-

    check. Abbreviated key tie-breaker can use this for low-cardinality sets, and we still win big. • Tie-breaker must consult tuple proper. • For numeric, special encoding scheme usually makes comparisons simple integer comparisons.
  4. 5.

    PostgreSQL 9.6 • Recently, started looking at external sorting. •

    Revising tuplesort to use simple hybrid sort-merge strategy. • Runs are quicksorted. Limited remaining use of heap data structure. • Various other optimizations that target memory bandwidth/ latency bottleneck. • Further big improvements in performance. 2x - 3.25x, even with a small fraction of the memory required to do everything in memory.