The hitchiker's guide to PostgreSQL

The hitchiker's guide to PostgreSQL

PostgreSQL's is one of the finest database systems available.
The talk will cover the history, the basic concepts of the PostgreSQL's architecture and the how the community behind the "the most advanced open source database" works.

F48fa173c6ddf4342e2c7b74ddec3bbe?s=128

Federico Campoli

October 24, 2019
Tweet

Transcript

  1. The hitchiker’s guide to PostgreSQL Federico Campoli Kamedata Plone Conf,

    Ferrara, 24 October 2019 Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 1 / 55
  2. Few words about the speaker Born in 1972 Passionate about

    IT since 1982 Joined the Oracle DBA secret society in 2004 In love with PostgreSQL since 2006 PostgreSQL tattoo on the right shoulder Freelance devops and data engineer Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 2 / 55
  3. Kamedata Whether you need a simple audit a tailored training

    or support for your infrastructure, we can help you to improve. Devops PostgreSQL Support Training Audit Migrations https://kamedata.com Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 3 / 55
  4. Table of contents 1 Don’t panic! 2 The pan galactic

    gargle blaster 3 The Ravenous Bugblatter Beast of Traal 4 Time is an illusion. Lunchtime doubly so 5 Mostly harmless 6 Wrap up Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 4 / 55
  5. Don’t panic! Copyright by Kreg Steppe - https://www.flickr.com/photos/spyndle/ Federico Campoli

    (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 5 / 55
  6. PostgreSQL, history PostgreSQL is an open source RDBMS with a

    BSD-like or MIT-like license. Started by Professor M. Stonebraker at Berkeley Follow up of Ingres hence the chosen name was POSTinGRES Managed by the PostgreSQL Global Development Group (PGDG) A new release each year, latest version is 12 Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 6 / 55
  7. Versioning and support policy Version Supported First release date End

    of life date 12 Yes 3 October 2019 14 November 2024 11 Yes 18 October 2018 9 November 2023 10 Yes 5 October 2017 10 November 2022 9.6 Yes 29 September 2016 11 November 2021 9.5 Yes 7 January 2016 11 February 2021 9.4 EOL 18 December 2014 13 February 2020 9.3 No 9 September 2013 8 November 2018 Table: End Of Life (EOL) dates The PostgreSQL project aims to fully support a major release for five years. The policy is applied on a best-effort basis. The complete list of versions is available at https://www.postgresql.org/support/versioning/ Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 7 / 55
  8. Minor version releases Every year there are at least four

    minor releases for each supported version. The target date for these releases are, unless otherwise stated, the second Thursday of February, May, August, and November. The current schedule for upcoming releases is: 7th November 2019 13th February 2020 14th May 2020 13th August 2020 The up to date release schedule is available here: https://www.postgresql.org/developer/roadmap/ Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 8 / 55
  9. The pan galactic gargle blaster Copyright by Federico Campoli Federico

    Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 9 / 55
  10. PostgreSQL, features ACID compliant Highly scalable with MVCC Tablespaces Runs

    on almost any unix... And on *cough* MS Windows *cough* Foreign tables Procedural languages (pl/pgsql, pl/python, pl/perl...) Supports for NOSQL data type (HSTORE, JSON) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 10 / 55
  11. PostgreSQL, features Development Licence PostgreSQL, BSD-like Developed in C Programmable

    interface with external libraries Extension system Partitioning Parallel execution Logical decoding Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 11 / 55
  12. PostgreSQL 12 new features Partitioning improvements, thousands of partitions managed.

    Btree improvements, size, deduplication, vacuum efficiency Statistics, most-common-value statistics for multiple columns CTE inline, no materialisation slowness when using WITH Prepared plan control, it’s possible to determine whether to use the generic or custom execution plan Just-in-time Compilation, rewrites the execution plan for complex queries. Designed to improve analytics queries. Checksum Control, allow to enable/disable the checksums while the cluster is offline REINDEX CONCURRENTLY, allows REINDEX with minimal locking Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 12 / 55
  13. Community There are several ways to interact and get help

    from the PostgreSQL community. Mailing lists organised by topics: https://www.postgresql.org/list/ IRC on freenode: channel #postgresql Slack: https://postgres-slack.herokuapp.com/ Telegram: https://t.me/pg sql The community is informal, friendly and respectful. There is a Code of Conduct to follow https://www.postgresql.org/about/policies/coc/ Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 13 / 55
  14. Some cool PostgreSQL features As there’s no much time we’ll

    have a quick look to some of the cool features included in PostgreSQL JSON and JSONB Transaction snapshot exports Table inheritance and partitioning Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 14 / 55
  15. The Ravenous Bugblatter Beast of Traal Copyright by Federico Campoli

    Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 15 / 55
  16. A multi flavour RDBMS Alongside with the general purpose data

    types PostgreSQL have some exotic types. Range types Geometric types Network address types XML type JSON type HSTORE type (extension) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 16 / 55
  17. JSON JSON JSON is the JavaScript Object Notation. Supported as

    datatype in PostgreSQL with two flavours JSON The data is stored as text, parsed and validated when the data is accessed JSONB parsed and converted in binary representation during the insert and update Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 17 / 55
  18. JSON JSON - Examples From record to JSON postgres =#

    SELECT row_to_json (ROW(1,’foo ’)); row_to_json -- ------------------- {"f1":1,"f2":"foo"} (1 row) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 18 / 55
  19. JSON Expanding JSON into key to value elements postgres =#

    SELECT * from json_each(’{"a":" foo", "b":" bar "}’); key | value -- ---+------- a | "foo" b | "bar" (2 rows) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 19 / 55
  20. JSONB Things to consider when using JSON JSON is parsed

    and validated on the fly JSONB is parsed, validated and transformed during INSERT/UPDATE JSONB allows a single index on the field. All the keys are indexed automatically. If you don’t use the json functions in PostgreSQL then just use text Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 20 / 55
  21. Some numbers Let’s create three tables with text,json and jsonb

    type fields. Each record contains the same json element generated on https://www.json-generator.com/ [ { "_id": "5da82999b1c11e2a5a3cd35e", "index": 0, "guid": "4320e8ee-e28f-435e-8bce-975e75b23f79", "isActive": true, "balance": "$2,211.00", "picture": "http://placehold.it/32x32", "age": 34, "eyeColor": "blue", "name": "Wall Shaw", ... Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 21 / 55
  22. Some numbers DROP TABLE IF EXISTS t_json ; DROP TABLE

    IF EXISTS t_jsonb ; DROP TABLE IF EXISTS t_text ; \timing CREATE TABLE t_json as SELECT ’<JSON ELEMENT >’:: json as js_value FROM generate_series (1 ,100000); SELECT 100000 Time: 27350.758 ms (00:27.351) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 22 / 55
  23. Some numbers CREATE TABLE t_text as SELECT ’<JSON ELEMENT >’::

    text as t_value FROM generate_series (1 ,100000); SELECT 100000 Time: 29713.060 ms (00:29.713) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 23 / 55
  24. Some numbers CREATE TABLE t_jsonb as SELECT ’<JSON ELEMENT >’::

    jsonb as jsb_value FROM generate_series (1 ,100000); SELECT 100000 Time: 27109.003 ms (00:27.109) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 24 / 55
  25. Table size SELECT pg_size_pretty ( pg_total_relation_size (oid)), relname FROM pg_class

    WHERE relname LIKE ’t\_%’ ; pg_size_pretty | relname -- --------------+--------- 270 MB | t_json 322 MB | t_jsonb 270 MB | t_text (3 rows) JSONB use more space than the json and text1 1Sizing may vary depending on the json size Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 25 / 55
  26. Sequential scans TEXT EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM

    t_text; Seq Scan on t_text (cost =0.00..1637.00 rows =100000 width =18) ( actual time =0.013..11.533 rows =100000 loops =1) Buffers: shared hit =637 Planning Time: 0.041 ms Execution Time: 18.359 ms (4 rows) Time: 18.796 ms Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 26 / 55
  27. Sequential scans JSON EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM

    t_json; Seq Scan on t_json (cost =0.00..1637.00 rows =100000 width =18) ( actual time =0.043..14.880 rows =100000 loops =1) Buffers: shared read =637 Planning Time: 0.100 ms Execution Time: 21.310 ms (4 rows) Time: 21.747 ms Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 27 / 55
  28. Sequential scans JSONB EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM

    t_jsonb; Seq Scan on t_jsonb (cost =0.00..1637.00 rows =100000 width =18) ( actual time =0.060..12.409 rows =100000 loops =1) Buffers: shared hit =91 read =546 Planning Time: 0.090 ms Execution Time: 18.471 ms (4 rows) Time: 18.852 ms Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 28 / 55
  29. Sequential scan with cast to json TEXT EXPLAIN (BUFFERS ,

    ANALYZE) SELECT t_value ::json ->0->’index ’ FROM t_text; Seq Scan on t_text (cost =0.00..2387.00 rows =100000 width =32) ( actual time =0.260..5301.080 rows =100000 loops =1) Buffers: shared hit =366212 read =34426 Planning Time: 0.311 ms Execution Time: 5309.246 ms Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 29 / 55
  30. Sequential scan with native json JSON EXPLAIN (BUFFERS , ANALYZE)

    SELECT js_value ->0->’index ’ FROM t_json ; Seq Scan on t_json (cost =0.00..1887.00 rows =100000 width =32) ( actual time =0.234..4145.148 rows =100000 loops =1) Buffers: shared hit =366116 read =34522 Planning Time: 0.043 ms Execution Time: 4152.678 ms (4 rows) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 30 / 55
  31. Sequential scan with jsonb JSONB EXPLAIN (BUFFERS , ANALYZE) SELECT

    jsb_value ->0->’index ’ FROM t_jsonb; Seq Scan on t_jsonb (cost =0.00..1887.00 rows =100000 width =32) ( actual time =0.154..1236.189 rows =100000 loops =1) Buffers: shared hit =379450 read =41188 Planning Time: 0.048 ms Execution Time: 1241.749 ms (4 rows) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 31 / 55
  32. Jsonpath PostgreSQL 12 adds the SQL specifications by introducing SQL/JSON

    path language. EXPLAIN (BUFFERS , ANALYZE) SELECT jsonb_path_query (jsb_value ,’$[0]. index ’) FROM t_jsonb; ProjectSet (cost =0.00..502387.00 rows =100000000 width =32) (actual time =0.116..1264.445 rows =100000 loops =1) Buffers: shared hit =420637 -> Seq Scan on t_jsonb (cost =0.00..1637.00 rows =100000 width =18) (actual time =0.013..8.009 rows =100000 loops =1) Buffers: shared hit =637 Planning Time: 0.058 ms Execution Time: 1269.603 ms (6 rows) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 32 / 55
  33. Time is an illusion. Lunchtime doubly so Copyright by Federico

    Campoli Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 33 / 55
  34. Transaction’s snapshots Depending from the isolation level, when a query

    or a transaction access the database for the first time acquires a snapshot which includes the transactions that have been committed at that moment. Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 34 / 55
  35. Transaction’s snapshots In PostgreSQL a transaction’s snapshot can be exported

    to other sessions A session which import an exported snapshot sees the same data of the exporting session Useful when multiple sessions need to see a consistent data set frozen in time (e.g. parallel execution) pg dump use this feature to backup the database in parallel Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 35 / 55
  36. Example Let’s create a table CREATE TABLE t_data ( i_id

    serial , t_content text ); ALTER TABLE t_data ADD CONSTRAINT pk_t_data PRIMARY KEY (i_id); Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 36 / 55
  37. Example INSERT INTO t_data ( t_content ) SELECT md5(i_counter ::

    text) FROM ( SELECT i_counter FROM generate_series (1 ,200) as i_counter ) t_series; Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 37 / 55
  38. Exporting a snapshot postgres =# BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE

    READ; BEGIN postgres =# SELECT pg_export_snapshot (); pg_export_snapshot -- ------------------ 00000004 -0000005A-1 (1 row) postgres =# SELECT count (*) FROM t_data; count ------- 200 (1 row) Isolation levels explained here: https://pgdba.org/post/2019/10/transactions/ Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 38 / 55
  39. Importing a snapshot In a different session we remove all

    the table’s rows postgres =# DELETE FROM t_data; DELETE 200 postgres =# SELECT count (*) FROM t_data; count ------- 0 (1 row) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 39 / 55
  40. Example If we import the snapshot we’ll see the rows

    back again. postgres =# BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; BEGIN postgres =# SET TRANSACTION SNAPSHOT ’00000004 -0000005A-1’; SET postgres =# SELECT count (*) FROM t_data; count ------- 200 (1 row) Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 40 / 55
  41. Mostly harmless Copyright by Federico Campoli Federico Campoli (Kamedata) The

    hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 41 / 55
  42. Table inheritance Before PostgreSQL 10 the partitioning were possible through

    the table inheritance. The table inheritance is a logical relationship between a parent table and its child tables. A child table inherits parent’s attributes. However the physical storage is not shared. Multiple level of inheritance are allowed. Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 42 / 55
  43. Table inheritance As physical storage is not shared between inherited

    tables, the unique constraints can’t be globally enforced on the inheritance tree preventing the creation of any global foreign key. There is no built in mechanism for distributing the data across the child tables. Query optimisation is made via CHECK constraint exclusion which need to be created and maintained. Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 43 / 55
  44. Table partitioning PostgreSQL 12 have the native support for table

    partitioning. Global foreign keys are allowed PostgreSQL to and from the partitioned table. Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 44 / 55
  45. Table partitioning PostgreSQL supports range, list and hash partitioning. Range

    Partitioning: The table is partitioned into “ranges” defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. List Partitioning: The table is partitioned by explicitly listing which key values appear in each partition. Hash Partitioning: The table is partitioned by specifying a modulus and a remainder for each partition. Each partition will hold the rows for which the hash value of the partition key divided by the specified modulus will produce the specified remainder. Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 45 / 55
  46. Partitioning Define a table as partitioned CREATE TABLE t_rng_part (

    id integer NOT NULL , value text , CONSTRAINT pk_t_rng_part primary key (id) ) PARTITION BY RANGE (id) ; Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 46 / 55
  47. Partitioning Add the range partitions. The range’s upper bound is

    not inclusive. CREATE TABLE t_rng_part_0001_1000 PARTITION OF t_rng_part FOR VALUES FROM (’0’) TO (’1000 ’); CREATE TABLE t_rng_part_1000_2000 PARTITION OF t_rng_part FOR VALUES FROM (’1000 ’) TO (’2000 ’); CREATE TABLE t_rng_part_2000_3000 PARTITION OF t_rng_part FOR VALUES FROM (’2000 ’) TO (’3000 ’); CREATE TABLE t_rng_part_default PARTITION OF t_rng_part DEFAULT; Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 47 / 55
  48. Partitioning Insert data INSERT INTO t_rng_part (id) SELECT generate_series (1

    ,50000) ; Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 48 / 55
  49. Partitioning Table size \dt+ t_rng_part* List of relations Schema |

    Name | Type | Owner | Size | Description -- ------+----------------------+-------+----------+---------+------- public | t_rng_part | table | postgres | 0 bytes | public | t_rng_part_0001_1000 | table | postgres | 72 kB | public | t_rng_part_1000_2000 | table | postgres | 72 kB | public | t_rng_part_2000_3000 | table | postgres | 72 kB | public | t_rng_part_default | table | postgres | 1696 kB | Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 49 / 55
  50. Partitioning PostgreSQL’s declarative partitioning relies on table inheritance system which

    rules still apply. However there are some differences. CHECK and NOT NULL constraints on a partitioned table are always inherited by all the partitions. CHECK constraints marked as NO INHERIT cannot be created on the partitioned tables. A partitioned table does not have any data directly therefore using TRUNCATE ONLY on a partitioned table will always generate an error. Differently from the table inheritance partitions cannot have columns that are not present in the parent. Is not possible to drop the NOT NULL constraint on a partition’s column if the constraint is present in the parent table. Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 50 / 55
  51. Wrap up Copyright by Federico Campoli Federico Campoli (Kamedata) The

    hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 51 / 55
  52. License This document is distributed under the terms of the

    Creative Commons Attribution, Not Commercial, Share Alike Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 52 / 55
  53. Contacts Blog: https://pgdba.org Twitter: @4thdoctor scarf Github: https://github.com/the4thdoctor Linkedin: https://www.linkedin.com/in/federicocampoli/

    Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 53 / 55
  54. That’s all folks! Thank you for listening! Any questions? Copyright

    by dan232323 http://dan232323.deviantart.com/art/Pinkie-Pie-Thats-All-Folks-454693000 Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 54 / 55
  55. The hitchiker’s guide to PostgreSQL Federico Campoli Kamedata Plone Conf,

    Ferrara, 24 October 2019 Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQL Plone Conf, Ferrara, 24 October 2019 55 / 55