Slide 1

Slide 1 text

@louisemeta Postgres index types (And where to find them) Louise Grandjonc Montreal Postgres Meetup

Slide 2

Slide 2 text

@louisemeta About me Software engineer at Citus/Microsoft Previously lead python developer Postgres enthusiast PostgresWomen co-founder @louisemeta on twitter www.louisemeta.com louise.grandjonc@microsoft.com @louisemeta !2

Slide 3

Slide 3 text

@louisemeta What we’re going to talk about 1. What are indexes for? 2. Creating indexes 3. B-Tree 4. GIN 5. GiST 6. Brin @louisemeta !3

Slide 4

Slide 4 text

@louisemeta First things first: the crocodiles • 250k crocodiles • 100k birds • 400K appointments @louisemeta !4

Slide 5

Slide 5 text

@louisemeta What are indexes for?

Slide 6

Slide 6 text

@louisemeta Constraints Some constraints transform into indexes. - PRIMARY KEY - UNIQUE - EXCLUDE USING "crocodile_pkey" PRIMARY KEY, btree (id) "crocodile_email_uq" UNIQUE CONSTRAINT, btree (email) Indexes: "appointment_pkey" PRIMARY KEY, btree (id) "appointment_crocodile_id_schedule_excl" EXCLUDE USING gist (crocodile_id WITH =, schedule WITH &&) In the crocodile table In the appointment table @louisemeta !6

Slide 7

Slide 7 text

@louisemeta Query optimization Often the main reason why we create indexes Why do indexes make queries faster In an index, tuples (value, pointer) are stored. Instead of reading the entire table for a value, you just go to the index (kind of like in an encyclopedia) @louisemeta !7

Slide 8

Slide 8 text

@louisemeta Creating indexes @louisemeta

Slide 9

Slide 9 text

@louisemeta Creating an index @louisemeta !9 Let’s say we would like to do queries like: SELECT * FROM crocodile WHERE number_of_teeth = 10; Time: 31ms Here is how to create a simple index: CREATE INDEX (optional index name) ON crocodile (number_of_teeth); SELECT * FROM crocodile WHERE number_of_teeth = 10; Time: 6ms New timing

Slide 10

Slide 10 text

@louisemeta Creating a unique index @louisemeta !10 We want to make sure that you can’t create two account with the same email: CREATE UNIQUE INDEX ON crocodile (email); INSERT INTO crocodile (email, first_name, last_name, birthday, number_of_teeth) VALUES ('louise@croco.com', 'Louise', 'Grandjonc', '1991-12-21', 32); INSERT 0 1 If I run the same insert again I get this error: DETAIL: Key (email)=(louise@croco.com) already exists.

Slide 11

Slide 11 text

@louisemeta Creating a partial index @louisemeta !11 The table appointments has a boolean done indicating when an appointment has already been handled by a bird. 95% of the appointments in our database have done=True. SELECT * FROM appointment WHERE emergency_level > 8 AND done = False; CREATE INDEX ON crocodile (emergency_level) WHERE done is False; Size on the index: 352 kB Time of the query: 3.639 ms Size on the index:13MB Time of the query: 29.106 ms Index without condition New index with condition

Slide 12

Slide 12 text

@louisemeta Creating a partial unique index @louisemeta !12 Want to add a UNIQUE index but have duplicates due to history or soft delete? CREATE UNIQUE INDEX ON crocodile (email) WHERE created_at > ‘2019-01-01’; Size of previous index: 6648 kB Size of the new index: 112 kB Why use a partial unique index? - Save disk space with smaller index - Faster inserts because the index tree is shorter to explore (especially when you have a huge volume of old data)

Slide 13

Slide 13 text

@louisemeta Creating a multi-column index @louisemeta !13 We have a job running regularly to list a bird’s emergencies, it runs the following query: SELECT * FROM appointment WHERE emergency_level >= 9 and plover_bird_id = 22551; Time: 41.560 ms CREATE INDEX ON appointment (plover_bird_id, emergency_level); Time after: 0.606 ms

Slide 14

Slide 14 text

@louisemeta Ordering the columns in a multi-column index @louisemeta !14 Two things to consider: - Re-using the index: the rightmost columns can be re-used for other queries. The first column will be ordered, so the index can be used by this query: SELECT * FROM appointment WHERE plover_bird_id = 22551; 30 8 55 10 96 7 31 5 31 10 55 10 56 10 57 3 31 6 RowID 31 10 RowID 31 10 RowID 32 1 RowID 32 6 RowID … 55 10 RowID plover_bird_id em ergency_level

Slide 15

Slide 15 text

@louisemeta Ordering the columns in a multi-column index @louisemeta !15 - The most filtering columns should come first croco_talk=# SELECT COUNT(*) FROM appointment WHERE emergency_level >= 9; count ------- 75982 (1 row) croco_talk=# SELECT COUNT(*) FROM appointment WHERE plover_bird_id = 22551; count ------- 5 (1 row) In this case, with plover_bird_id as the first column, it will first filter out and the second filter will be applied on only 5 rows.

Slide 16

Slide 16 text

@louisemeta Indexes using INCLUDE @louisemeta !16 New in Postgres 11 For some cases, if you want to be able to have index only scans, and don’t have filter on a key: You could have an index SELECT SUM(fee) FROM appointment WHERE crocodile_id = 1; CREATE INDEX ON appointment (crocodile_id) INCLUDE (fee); Advantage over a multi-column index: - The values in INCLUDE are stored in the index, but not in the tree. - Less columns to index - Faster updates

Slide 17

Slide 17 text

@louisemeta B-Trees @louisemeta

Slide 18

Slide 18 text

@louisemeta B-Trees internal data structure @louisemeta !18 Root High Key: None begin 16 31 Parent High Key: 16 begin 12 14 Parent High Key: 31 16 20 Parent High Key: None 31 33 Leave High Key: 12 Value: 1 Pointer: croco 10 Value: 1 Pointer: croco 12 Value : 2 Pointer: croco 23 … Value 10 Pointer: croco 1 Value: 11 Pointer: croco 2 Leave High Key: 14 Value: 12 Pointer: croco 17 Value: 13 Pointer: croco 3 Value : 13 Pointer: croco 4 … Value 13 Pointer: croco 27 … Leave High Key: None Value: 33 Pointer: croco 5 Value: 33 Pointer: croco 6 Value : 33 Pointer: croco 123 … Value: 38 Pointer: croco 26 - A BTree in a balanced tree - All the leaves are at equal distance from the root. - A parent node can have multiple children minimizing the tree’s depth

Slide 19

Slide 19 text

@louisemeta B-Trees internal data structure - 2 Pages The root, the parents, and the leaves are all pages with the same structure. Pages have: - A block number (pointer) - A high key (defines the highest value found in a page) - Items @louisemeta !19

Slide 20

Slide 20 text

@louisemeta B-Trees internal data structure - 4 Pages high key - Any item in the page will have a value lower or equal to the high key And in page 575, there is no high key as it’s the rightmost page. In page 3, I will find crocodiles with 16 or less teeth In page 289, with 31 and less @louisemeta !20

Slide 21

Slide 21 text

@louisemeta B-Trees internal data structure - 5 Items An item contains: - A value (of the indexed row in the leaves, of the first row in the parents) - Pointer (to the row in the leaves, to the child page in the parents) @louisemeta !21

Slide 22

Slide 22 text

@louisemeta To sum it up @louisemeta !22 Root High Key: None begin 16 31 Parent High Key: 16 begin 12 14 Parent High Key: 31 16 20 Parent High Key: None 31 33 Leave High Key: 12 Value: 1 Pointer: croco 10 Value: 1 Pointer: croco 12 Value : 2 Pointer: croco 23 … Value 10 Pointer: croco 1 Value: 11 Pointer: croco 2 Leave High Key: 14 Value: 12 Pointer: croco 17 Value: 13 Pointer: croco 3 Value : 13 Pointer: croco 4 … Value 13 Pointer: croco 27 … Leave High Key: None Value: 33 Pointer: croco 5 Value: 33 Pointer: croco 6 Value : 33 Pointer: croco 123 … Value: 38 Pointer: croco 26 - A Btree is a balanced tree - The values indexed are the values of the rows - Data is stored in pages - Pages have a high key defining the biggest value in the page - Pages have items pointing to an other page or the row.


Slide 23

Slide 23 text

@louisemeta What are BTree good for? @louisemeta !23 Root High Key: None begin 16 31 Parent High Key: 16 begin 12 14 Parent High Key: 31 16 20 Parent High Key: None 31 33 Leave High Key: 12 Value: 1 Pointer: croco 10 Value: 1 Pointer: croco 12 Value : 2 Pointer: croco 23 … Value 10 Pointer: croco 1 Value: 11 Pointer: croco 2 Leave High Key: 14 Value: 12 Pointer: croco 17 Value: 13 Pointer: croco 3 Value : 13 Pointer: croco 4 … Value 13 Pointer: croco 27 … Leave High Key: None Value: 33 Pointer: croco 5 Value: 33 Pointer: croco 6 Value : 33 Pointer: croco 123 … Value: 38 Pointer: croco 26 BTrees are good for the following operations: =, >, <, >=, <= Why? Because the value indexed is the value of the column(s) so we can easily perform binary search in the BTree

Slide 24

Slide 24 text

@louisemeta GIN

Slide 25

Slide 25 text

@louisemeta GIN - Used to index arrays, jsonb, and tsvector (for fulltext search) columns. - Efficient for <@, &&, @@@ operators New column healed_teeth: croco=# SELECT email, number_of_teeth, healed_teeth FROM crocodile WHERE id =1; -[ RECORD 1 ]---+-------------------------------------------------------- email | louise.grandjonc1@croco.com number_of_teeth | 58 healed_teeth | {16,11,55,27,22,41,38,2,5,40,52,57,28,50,10,15,1,12,46} !25

Slide 26

Slide 26 text

@louisemeta Creating a GIN index Here is how to create the GIN index for this column CREATE INDEX ON crocodile USING GIN(healed_teeth); !26 from django.contrib.postgres.indexes import GinIndex class Crocodile(models.Model): ... class Meta: indexes = [GinIndex(fields=['healed_teeth'])] Raw SQL

Slide 27

Slide 27 text

@louisemeta GIN How is it different from a BTree? - In a GIN index, the array is split and each value is an entry - The values are unique - As the value is unique, in the leaves, we keep a list of pointers to the rows !27 Root Value: Value: 10 Value: 20 … Parent Value: 1 Value: 4 Value: 6 … Parent Value: 10 Value: 15 Value: 17 … Parent Value: 20 Value: 24 Value: 26 … Leaf Pointers: {(269, 49), (296, 51), (296, 54), (296, 57), …} Pointers: { (306, 33), (306, 35), (306,36), …} … Leaf Pointer to posting tree … Page Page Page Root Posting tree

Slide 28

Slide 28 text

@louisemeta GIN How is it different from a BTree? Bitmap Heap Scan on crocodile (cost=516.59..6613.42 rows=54786 width=29) (actual time=15.960..38.197 rows=73275 loops=1) Recheck Cond: ('{1,2}'::integer[] <@ healed_teeth) Heap Blocks: exact=4218 -> Bitmap Index Scan on crocodile_healed_teeth_idx (cost=0.00..502.90 rows=54786 width=0) (actual time=15.302..15.302 rows=73275 loops=1) Index Cond: ('{1,2}'::integer[] <@ healed_teeth) Planning time: 0.124 ms Execution time: 41.018 ms (7 rows) Seq Scan on crocodile (cost=…) Filter: ('{1,2}'::integer[] <@ healed_teeth) Rows Removed by Filter: 250728 Planning time: 0.157 ms Execution time: 161.716 ms (5 rows) !28

Slide 29

Slide 29 text

@louisemeta To sum it up @louisemeta !29 - A GIN index is a balanced tree - Each value in the tree is unique - The row value is split and each value is an entry - Efficient for <@, &&, @@@ operators

Slide 30

Slide 30 text

@louisemeta GIST

Slide 31

Slide 31 text

@louisemeta GiST - keys Differences with a BTree index - Data isn’t ordered - The key ranges can overlap Which means that a same value can be inserted in different pages !31 To be more readable, the following example here is for a Integer Range type :) Root Page block number: 0 Page level: 0 Value: [3, 5] Value: [0, 2] Value: [4, 8] Value: [7, 9] Parent Page block number: 4699 Page level: 1 Parent Page block number: 1610 Page level: 1 Parent Page block number: 813 Page level: 1 Parent Page block number: 6249 Page level: 1

Slide 32

Slide 32 text

@louisemeta Creating a GiST index !32 Here is how to create the GiST index for this column CREATE INDEX ON appointment USING GIST(schedule); from django.contrib.postgres.indexes import GistIndex class Appointment(models.Model): ... class Meta: indexes = [GistIndex(fields=[‘schedule'])] Raw SQL

Slide 33

Slide 33 text

@louisemeta Why use GiST - Useful for overlapping (geometries, array, range etc.) - Especially useful when using postgis - Nearest neighbor - Can be used for full text search (tsvector, tsquery) !33

Slide 34

Slide 34 text

@louisemeta GiST or GIN for fulltext search movies=# CREATE INDEX ON film USING GIN(fulltext) with (fastupdate=off); CREATE INDEX Time: 8.083 ms movies=# INSERT INTO film (title, description, language_id) VALUES ('Nightmare at the dentist', 'A crocodile calls his dentist on halloween and ends up toothless and very sad, warning: not for kids, or teeth-sensitive crocodiles', 1); INSERT 0 1 Time: 3.057 ms movies=# INSERT INTO film (title, description, language_id) VALUES ('Nightmare at the dentist', 'The terrible adventure of a crocodile who never goes to the dentist', 1); INSERT 0 1 Time: 1.323 ms - Maintaining a GIN index is slower than GiST !34

Slide 35

Slide 35 text

@louisemeta GiST or GIN for fulltext search - Lookups are faster with GIN movies=# SELECT COUNT(*) FROM film WHERE fulltext @@ to_tsquery('crocodile'); count ------- 106 (1 row) Time: 1.275 ms movies=# SELECT COUNT(*) FROM film WHERE fulltext @@ to_tsquery('crocodile'); count ------- 106 (1 row) Time: 0.467 ms !35

Slide 36

Slide 36 text

@louisemeta GiST or GIN for fulltext search - GIN indexes are larger than GiST movies=# \di+ film_fulltext_idx List of relations Schema | Name | Type | Owner | Table | Size | Description --------+-------------------+-------+----------+-------+-------+------------- public | film_fulltext_idx | index | postgres | film | 88 kB | (1 row) movies=# \di+ film_fulltext_gin_idx List of relations Schema | Name | Type | Owner | Table | Size | Description --------+-----------------------+-------+----------+-------+--------+------------- public | film_fulltext_gin_idx | index | postgres | film | 112 kB | (1 row) !36

Slide 37

Slide 37 text

@louisemeta BRIN

Slide 38

Slide 38 text

@louisemeta BRIN Internal data structure - Block Range Index - Not a balanced tree - Not even a tree - Block range: group of pages physically adjacent - For each block range: the range of values is stored - BRIN indexes are very small - Fast scanning on large tables !38

Slide 39

Slide 39 text

@louisemeta BRIN Internal data structure SELECT * FROM brin_page_items(get_raw_page('appointment_created_at_idx', 2), 'appointment_created_at_idx'); itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | value ------------+--------+--------+----------+----------+-------------+--------------------------------------------------- 1 | 0 | 1 | f | f | f | {2008-03-01 00:00:00-08 .. 2009-07-07 07:30:00-07} 2 | 128 | 1 | f | f | f | {2009-07-07 08:00:00-07 .. 2010-11-12 15:30:00-08} 3 | 256 | 1 | f | f | f | {2010-11-12 16:00:00-08 .. 2012-03-19 23:30:00-07} 4 | 384 | 1 | f | f | f | {2012-03-20 00:00:00-07 .. 2013-07-26 07:30:00-07} 5 | 512 | 1 | f | f | f | {2013-07-26 08:00:00-07 .. 2014-12-01 15:30:00-08} SELECT id, created_at FROM appointment WHERE ctid='(0, 1)'::tid; id | created_at --------+------------------------ 101375 | 2008-03-01 00:00:00-08 (1 row) !39

Slide 40

Slide 40 text

@louisemeta BRIN Internal data structure SELECT * FROM brin_page_items(get_raw_page('crocodile_birthday_idx', 2), 'crocodile_birthday_idx'); itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | value ------------+--------+--------+----------+----------+-------------+---------------------------- 1 | 0 | 1 | f | f | f | {1948-09-05 .. 2018-09-04} 2 | 128 | 1 | f | f | f | {1948-09-07 .. 2018-09-03} 3 | 256 | 1 | f | f | f | {1948-09-05 .. 2018-09-03} 4 | 384 | 1 | f | f | f | {1948-09-05 .. 2018-09-04} 5 | 512 | 1 | f | f | f | {1948-09-05 .. 2018-09-02} 6 | 640 | 1 | f | f | f | {1948-09-09 .. 2018-09-04} … (14 rows) In this case, the values in birthday has no correlation with the physical location, the index would not speed up the search as all pages would have to be visited. BRIN is interesting for data where the value is correlated with the physical location. !40

Slide 41

Slide 41 text

@louisemeta BRIN Warning on DELETE and INSERT SELECT * FROM brin_page_items(get_raw_page('appointment_created_at_idx', 2), 'appointment_created_at_idx'); itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | value ------------+--------+--------+----------+----------+-------------+--------------------------------------------------- 1 | 0 | 1 | f | f | f | {2008-03-01 00:00:00-08 .. 2018-07-01 07:30:00-07} 2 | 128 | 1 | f | f | f | {2009-07-07 08:00:00-07 .. 2018-07-01 23:30:00-07} 3 | 256 | 1 | f | f | f | {2010-11-12 16:00:00-08 .. 2012-03-19 23:30:00-07} 4 | 384 | 1 | f | f | f | {2012-03-20 00:00:00-07 .. 2018-07-06 23:30:00-07} DELETE FROM appointment WHERE created_at >= '2009-07-07' AND created_at < ‘2009-07-08'; DELETE FROM appointment WHERE created_at >= '2012-03-20' AND created_at < ‘2012-03-25'; Deleted and then vacuum on the appointment table New rows are inserted in the free space after VACUUM BRIN index has some ranges with big data ranges. Search will visit a lot of pages. !41

Slide 42

Slide 42 text

@louisemeta Creating a BRIN index !42 Here is how to create the BRIN index for this column CREATE INDEX ON crocodile USING BRIN(created_at); from django.contrib.postgres.indexes import BrinIndeex class Crocodile(models.Model): ... class Meta: indexes = [BrinIndex(fields=['created_at'])] Raw SQL

Slide 43

Slide 43 text

@louisemeta Conclusion - B-Tree - Great for <, >, =, >=, <= - GIN - Fulltext search, jsonb, arrays - Inserts can be slow because of unicity of the keys - GiST - Great for overlapping - Using key class functions - Can be implemented for any data type - BRIN - Great for huge table with correlation between value and physical location - <, >, =, >=, <= !43

Slide 44

Slide 44 text

@louisemeta Questions Thanks for your attention Go read the articles www.louisemeta.com Now only the ones on BTrees and GIN are published, but I’ll announce the rest on twitter @louisemeta Crocodiles by https://www.instagram.com/zimmoriarty/?hl=en !44