Upgrade to Pro — share decks privately, control downloads, hide ads and more …

The hitchiker's guide to PostgreSQL

The hitchiker's guide to PostgreSQL

Federico Campoli

May 04, 2024
Tweet

More Decks by Federico Campoli

Other Decks in Technology

Transcript

  1. The hitchiker’s guide to PostgreSQL NaLUG, Napoli Federico Campoli PostgreSQL

    Europe 4 May 2024 Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL 1 / 53
  2. Few words about the speaker Born in 1972 Passionate about

    IT since 1985 In love with PostgreSQL since 2006 PostgreSQL and FreeBSD tattoo on the right shoulder Freelance DBA Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL 2 / 53
  3. Table of contents 1 Don’t panic! 2 The pan galactic

    gargle blaster 3 The Ravenous Bugblatter Beast of Traal 4 Time is an illusion. Lunchtime doubly so 5 Mostly harmless 6 Wrap up Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL 4 / 53
  4. Don’t panic! Copyright by Kreg Steppe - https://www.flickr.com/photos/spyndle/ Federico Campoli

    (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Don’t panic! 5 / 53
  5. PostgreSQL, history PostgreSQL is an open source RDBMS with a

    BSD-like or MIT-like license. Started by Professor M. Stonebraker at Berkeley Follow up of Ingres hence the chosen name was POSTinGRES Managed by the PostgreSQL Global Development Group (PGDG) A new release each year, latest version is 16 Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Don’t panic! 6 / 53
  6. Versioning and support policy Version Supported First release date End

    of life date 16 Yes 14 September 2023 9 November 2028 15 Yes 13 October 2022 11 November 2027 14 Yes 30 September 2021 12 November 2026 13 Yes 24 September 2020 13 November 2025 12 EOL 3 October 2019 14 November 2024 11 No 18 October 2018 9 November 2023 10 No 5 October 2017 10 November 2022 Table: End Of Life (EOL) dates The PostgreSQL project aims to fully support a major release for five years. The policy is applied on a best-effort basis. The complete list of versions is available at https://www.postgresql.org/support/versioning/ Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Don’t panic! 7 / 53
  7. Minor version releases Every year there are at least four

    minor releases for each supported version. The target date for these releases are, unless otherwise stated, the second Thursday of February, May, August, and November. The current schedule for upcoming releases is: 9th May 2024 8th August 2024 14th November 2024 13th February 2025 The up to date release schedule is available here: https://www.postgresql.org/developer/roadmap/ Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Don’t panic! 8 / 53
  8. The pan galactic gargle blaster Copyright by Federico Campoli Federico

    Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The pan galactic gargle blaster 9 / 53
  9. PostgreSQL, features ACID compliant Highly scalable with MVCC Tablespaces Runs

    on almost any unix... And on *cough* MS Windows *cough* Foreign tables Procedural languages (pl/pgsql, pl/python, pl/perl...) Supports for NOSQL data type (HSTORE, JSON) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The pan galactic gargle blaster 10 / 53
  10. PostgreSQL, features Development Licence PostgreSQL, BSD-like Developed in C Programmable

    interface with external libraries Extension system Partitioning Parallel execution Logical decoding Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The pan galactic gargle blaster 11 / 53
  11. PostgreSQL, features Partitioning improvements, thousands of partitions managed. Btree improvements,

    size, deduplication, vacuum efficiency Statistics, most-common-value statistics for multiple columns CTE inline, no materialisation slowness when using WITH Prepared plan control, it’s possible to determine whether to use the generic or custom execution plan Just-in-time Compilation, rewrites the execution plan for complex queries. Designed to improve analytics queries. Checksum Control, allow to enable/disable the checksums while the cluster is offline REINDEX CONCURRENTLY, allows REINDEX with minimal locking Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The pan galactic gargle blaster 12 / 53
  12. Community There are several ways to interact and get help

    from the PostgreSQL community. Mailing lists organised by topics: https://www.postgresql.org/list/ IRC on Libera Chat: channel #postgresql Slack: https://postgres-slack.herokuapp.com/ Telegram: https://t.me/pg sql The community is informal, friendly and respectful. There is a Code of Conduct to follow https://www.postgresql.org/about/policies/coc/ Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The pan galactic gargle blaster 13 / 53
  13. Some cool PostgreSQL features As there’s no much time we’ll

    have a quick look to some of the cool features included in PostgreSQL JSON and JSONB Transaction snapshot exports Table inheritance and partitioning Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The pan galactic gargle blaster 14 / 53
  14. The Ravenous Bugblatter Beast of Traal Copyright by Federico Campoli

    Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 15 / 53
  15. A multi flavour RDBMS Alongside with the general purpose data

    types PostgreSQL have some exotic types. Range types Geometric types Network address types XML type JSON type HSTORE type (extension) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 16 / 53
  16. JSON JSON JSON is the JavaScript Object Notation. Supported as

    datatype in PostgreSQL with two flavours JSON The data is stored as text, parsed and validated when the data is accessed JSONB parsed and converted in binary representation during the insert and update Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 17 / 53
  17. JSON JSON - Examples From record to JSON postgres =#

    SELECT row_to_json (ROW(1,’foo ’)); row_to_json -- ------------------- {"f1":1,"f2":"foo"} (1 row) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 18 / 53
  18. JSON Expanding JSON into key to value elements postgres =#

    SELECT * from json_each(’{"a":" foo", "b":" bar "}’); key | value -- ---+------- a | "foo" b | "bar" (2 rows) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 19 / 53
  19. JSONB Things to consider when using JSON JSON is parsed

    and validated on the fly JSONB is parsed, validated and transformed during INSERT/UPDATE JSONB allows a single index on the field. All the keys are indexed automatically. If you don’t use the json functions in PostgreSQL then just use text Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 20 / 53
  20. Some numbers Let’s create three tables with text,json and jsonb

    type fields. Each record contains the same json element generated on https://www.json-generator.com/ [ { "_id": "5da82999b1c11e2a5a3cd35e", "index": 0, "guid": "4320e8ee-e28f-435e-8bce-975e75b23f79", "isActive": true, "balance": "$2,211.00", "picture": "http://placehold.it/32x32", "age": 34, "eyeColor": "blue", "name": "Wall Shaw", ... Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 21 / 53
  21. Some numbers DROP TABLE IF EXISTS t_json ; DROP TABLE

    IF EXISTS t_jsonb ; DROP TABLE IF EXISTS t_text ; \timing CREATE TABLE t_json as SELECT ’<JSON ELEMENT >’:: json as js_value FROM generate_series (1 ,100000); SELECT 100000 Time: 27350.758 ms (00:27.351) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 22 / 53
  22. Some numbers CREATE TABLE t_text as SELECT ’<JSON ELEMENT >’::

    text as t_value FROM generate_series (1 ,100000); SELECT 100000 Time: 29713.060 ms (00:29.713) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 23 / 53
  23. Some numbers CREATE TABLE t_jsonb as SELECT ’<JSON ELEMENT >’::

    jsonb as jsb_value FROM generate_series (1 ,100000); SELECT 100000 Time: 27109.003 ms (00:27.109) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 24 / 53
  24. Table size SELECT pg_size_pretty ( pg_total_relation_size (oid)), relname FROM pg_class

    WHERE relname LIKE ’t\_%’ ; pg_size_pretty | relname -- --------------+--------- 270 MB | t_json 322 MB | t_jsonb 270 MB | t_text (3 rows) JSONB use more space than the json and text1 1Sizing may vary depending on the json size Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 25 / 53
  25. Sequential scans TEXT EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM

    t_text; Seq Scan on t_text (cost =0.00..1637.00 rows =100000 width =18) (actual time =0.013..11.533 rows =100000 loops =1) Buffers: shared hit =637 Planning Time: 0.041 ms Execution Time: 18.359 ms (4 rows) Time: 18.796 ms Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 26 / 53
  26. Sequential scans JSON EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM

    t_json; Seq Scan on t_json (cost =0.00..1637.00 rows =100000 width =18) (actual time =0.043..14.880 rows =100000 loops =1) Buffers: shared read =637 Planning Time: 0.100 ms Execution Time: 21.310 ms (4 rows) Time: 21.747 ms Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 27 / 53
  27. Sequential scans JSONB EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM

    t_jsonb; Seq Scan on t_jsonb (cost =0.00..1637.00 rows =100000 width =18) (actual time =0.060..12.409 rows =100000 loops =1) Buffers: shared hit =91 read =546 Planning Time: 0.090 ms Execution Time: 18.471 ms (4 rows) Time: 18.852 ms Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 28 / 53
  28. Sequential scan with cast to json TEXT EXPLAIN (BUFFERS ,

    ANALYZE) SELECT t_value ::json ->0->’index ’ FROM t_text; Seq Scan on t_text (cost =0.00..2387.00 rows =100000 width =32) (actual time =0.260..5301.080 rows =100000 loops =1) Buffers: shared hit =366212 read =34426 Planning Time: 0.311 ms Execution Time: 5309.246 ms Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 29 / 53
  29. Sequential scan with native json JSON EXPLAIN (BUFFERS , ANALYZE)

    SELECT js_value ->0->’index ’ FROM t_json; Seq Scan on t_json (cost =0.00..1887.00 rows =100000 width =32) (actual time =0.234..4145.148 rows =100000 loops =1) Buffers: shared hit =366116 read =34522 Planning Time: 0.043 ms Execution Time: 4152.678 ms (4 rows) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 30 / 53
  30. Sequential scan with jsonb JSONB EXPLAIN (BUFFERS , ANALYZE) SELECT

    jsb_value ->0->’index ’ FROM t_jsonb; Seq Scan on t_jsonb (cost =0.00..1887.00 rows =100000 width =32) (actual time =0.154..1236.189 rows =100000 loops =1) Buffers: shared hit =379450 read =41188 Planning Time: 0.048 ms Execution Time: 1241.749 ms (4 rows) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL The Ravenous Bugblatter Beast of Traal 31 / 53
  31. Time is an illusion. Lunchtime doubly so Copyright by Federico

    Campoli Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 32 / 53
  32. Transaction’s snapshots Depending from the isolation level, when a query

    or a transaction access the database for the first time acquires a snapshot which includes the transactions that have been committed at that moment. Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 33 / 53
  33. Transaction’s snapshots In PostgreSQL a transaction’s snapshot can be exported

    to other sessions A session which import an exported snapshot sees the same data of the exporting session Useful when multiple sessions need to see a consistent data set frozen in time (e.g. parallel execution) pg dump use this feature to backup the database in parallel Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 34 / 53
  34. Example Let’s create a table CREATE TABLE t_data ( i_id

    serial , t_content text ); ALTER TABLE t_data ADD CONSTRAINT pk_t_data PRIMARY KEY (i_id); Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 35 / 53
  35. Example INSERT INTO t_data ( t_content ) SELECT md5(i_counter ::

    text) FROM ( SELECT i_counter FROM generate_series (1 ,200) as i_counter ) t_series; Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 36 / 53
  36. Exporting a snapshot postgres =# BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE

    READ; BEGIN postgres =# SELECT pg_export_snapshot (); pg_export_snapshot -- ------------------ 00000004 -0000005A-1 (1 row) postgres =# SELECT count (*) FROM t_data; count ------- 200 (1 row) Isolation levels explained here: https://pgdba.org/post/2019/10/transactions/ Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 37 / 53
  37. Importing a snapshot In a different session we remove all

    the table’s rows postgres =# DELETE FROM t_data; DELETE 200 postgres =# SELECT count (*) FROM t_data; count ------- 0 (1 row) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 38 / 53
  38. Example If we import the snapshot we’ll see the rows

    back again. postgres =# BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; BEGIN postgres =# SET TRANSACTION SNAPSHOT ’00000004 -0000005A-1’; SET postgres =# SELECT count (*) FROM t_data; count ------- 200 (1 row) Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Time is an illusion. Lunchtime doubly so 39 / 53
  39. Mostly harmless Copyright by Federico Campoli Federico Campoli (PostgreSQL Europe)

    The hitchiker’s guide to PostgreSQL Mostly harmless 40 / 53
  40. Table inheritance Before PostgreSQL 10 the partitioning was possible through

    the table inheritance. The table inheritance is a logical relationship between a parent table and its child tables. A child table inherits parent’s attributes. However the physical storage is not shared. Multiple level of inheritance are allowed. Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 41 / 53
  41. Table inheritance As physical storage is not shared between inherited

    tables, the unique constraints can’t be globally enforced on the inheritance tree preventing the creation of any global foreign key. There is no built in mechanism for distributing the data across the child tables. Query optimisation is made via CHECK constraint exclusion which need to be created and maintained. Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 42 / 53
  42. Table partitioning PostgreSQL has the native support for table partitioning

    with full support for the global primary key. Therefore the foreign keys are also fully supported on the partitioned tables. Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 43 / 53
  43. Table partitioning PostgreSQL supports range, list and hash partitioning. Range

    Partitioning: The table is partitioned into “ranges” defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. List Partitioning: The table is partitioned by explicitly listing which key values appear in each partition. Hash Partitioning: The table is partitioned by specifying a modulus and a remainder for each partition. Each partition will hold the rows for which the hash value of the partition key divided by the specified modulus will produce the specified remainder. Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 44 / 53
  44. Partitioning Define a table as partitioned CREATE TABLE t_rng_part (

    id integer NOT NULL , value text , CONSTRAINT pk_t_rng_part primary key (id) ) PARTITION BY RANGE (id) ; Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 45 / 53
  45. Partitioning Add the range partitions. The range’s upper bound is

    not inclusive. CREATE TABLE t_rng_part_0001_1000 PARTITION OF t_rng_part FOR VALUES FROM (’0’) TO (’1000 ’); CREATE TABLE t_rng_part_1000_2000 PARTITION OF t_rng_part FOR VALUES FROM (’1000 ’) TO (’2000 ’); CREATE TABLE t_rng_part_2000_3000 PARTITION OF t_rng_part FOR VALUES FROM (’2000 ’) TO (’3000 ’); CREATE TABLE t_rng_part_default PARTITION OF t_rng_part DEFAULT; Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 46 / 53
  46. Partitioning Insert data INSERT INTO t_rng_part (id) SELECT generate_series (1

    ,50000) ; Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 47 / 53
  47. Partitioning Table size \dt+ t_rng_part* List of relations Schema |

    Name | Type | Owner | Size | Description -- ------+----------------------+-------+----------+---------+------------- public | t_rng_part | table | postgres | 0 bytes | public | t_rng_part_0001_1000 | table | postgres | 72 kB | public | t_rng_part_1000_2000 | table | postgres | 72 kB | public | t_rng_part_2000_3000 | table | postgres | 72 kB | public | t_rng_part_default | table | postgres | 1696 kB | Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 48 / 53
  48. Partitioning PostgreSQL’s declarative partitioning relies on table inheritance system which

    rules still apply. However there are some differences. CHECK and NOT NULL constraints on a partitioned table are always inherited by all the partitions. CHECK constraints marked as NO INHERIT cannot be created on the partitioned tables. A partitioned table does not have any data directly therefore using TRUNCATE ONLY on a partitioned table will always generate an error. Differently from the table inheritance partitions cannot have columns that are not present in the parent. Is not possible to drop the NOT NULL constraint on a partition’s column if the constraint is present in the parent table. Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Mostly harmless 49 / 53
  49. Wrap up Copyright by Federico Campoli Federico Campoli (PostgreSQL Europe)

    The hitchiker’s guide to PostgreSQL Wrap up 50 / 53
  50. License This document is distributed under the terms of the

    Creative Commons Attribution, Not Commercial, Share Alike Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Wrap up 51 / 53
  51. That’s all folks! Thank you for listening! Any questions? Copyright

    by dan232323 http://dan232323.deviantart.com/art/Pinkie-Pie-Thats-All-Folks-454693000 Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Wrap up 52 / 53
  52. The hitchiker’s guide to PostgreSQL NaLUG, Napoli Federico Campoli PostgreSQL

    Europe 4 May 2024 Federico Campoli (PostgreSQL Europe) The hitchiker’s guide to PostgreSQL Wrap up 53 / 53