Database Clustered vs Non-Clustered Index

Slide 1

Slide 1 text

Index Heap vs Organized Tables Clustered Non-Clustered vs Index Database

Slide 2

Slide 2 text

Let’s dive into

Slide 3

Slide 3 text

Clustered I n d e x vs Non-Clustered I n d e x

Slide 4

Slide 4 text

vs Index o r g a n i z e d t a b l e Heap o r g a n i z e d t a b l e

Slide 5

Slide 5 text

RowID T a b l e s

Slide 6

Slide 6 text

and More…

Slide 7

Slide 7 text

Page 0  Database Page Employee ID Country City Surname Page 0 Tables are Organized in Pages

Slide 8

Slide 8 text

Page 0  Database Page Employee ID Country City Surname Page 0 Tables are Organized in Pages Small size Blocks Easy to read, cache and evict

Slide 9

Slide 9 text

Page 0  Database Page Employee ID Country City Surname Page 0 Block 8-16KiB Tables are Organized in Pages Small size Blocks Easy to read, cache and evict

Slide 10

Slide 10 text

Page 0  (format, stats, …) 0 483 29 Database Page Employee ID Country City Surname 344 US San Francisco Miller 483 Row 0 344 Miller San Francisco US Page 0

Slide 11

Slide 11 text

Page 0  (format, stats, …) 0 483 29 1 455 28 Database Page Employee ID Country City Surname 459 344 EU US Stockholm San Francisco Andersson Miller 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU Page 0

Slide 12

Slide 12 text

Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 459 344 EU US Stockholm San Francisco Andersson Miller 826 EU Berlin Meyer 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU Page 0

Slide 13

Slide 13 text

Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 3 403 31 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 459 344 405 EU US US Stockholm San Francisco San Francisco Andersson Martinez Miller 826 EU Berlin Meyer 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU 403 Row 3 405 Martinez San Francisco US Page 0

Slide 14

Slide 14 text

Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 3 403 31 4 380 23 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 459 Berlin 310 344 405 EU US US EU Stockholm San Francisco San Francisco Andersson Martinez Miller Schmidt 826 EU Berlin Meyer 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU 403 Row 3 405 Martinez San Francisco US 380 Row 4 310 Schmidt Berlin EU Page 0

Slide 15

Slide 15 text

Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 3 403 31 4 380 23 5 358 22 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 459 Berlin 310 344 405 517 UK EU US US EU London Stockholm San Francisco San Francisco Davies Andersson Martinez Miller Schmidt 826 EU Berlin Meyer 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU 403 Row 3 405 Martinez San Francisco US 380 Row 4 310 Schmidt Berlin EU 358 Row 5 517 Davies London EU Page 0

Slide 16

Slide 16 text

6 332 26 Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 3 403 31 4 380 23 5 358 22 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 235 459 Berlin Asplund EU 310 344 405 517 UK EU US US EU London Stockholm San Francisco San Francisco Stockholm Davies Andersson Martinez Miller Schmidt 826 EU Berlin Meyer 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU 403 Row 3 405 Martinez San Francisco US 380 Row 4 310 Schmidt Berlin EU 358 Row 5 517 Davies London EU 332 Row 6 235 Asplund Stockholm EU Page 0

Slide 17

Slide 17 text

6 332 26 7 302 30 Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 3 403 31 4 380 23 5 358 22 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 235 459 111 Berlin Asplund EU 310 344 405 517 UK EU US US EU US London Stockholm San Francisco San Francisco Stockholm San Francisco Davies Andersson Martinez Miller Schmidt Jackson 826 EU Berlin Meyer 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU 403 Row 3 405 Martinez San Francisco US 380 Row 4 310 Schmidt Berlin EU 358 Row 5 517 Davies London EU 332 Row 6 235 Asplund Stockholm EU 302 Row 7 111 Jackson San Francisco US Page 0

Slide 18

Slide 18 text

6 332 26 7 302 30 8 281 21 Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 3 403 31 4 380 23 5 358 22 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 235 459 111 Berlin Asplund EU 310 344 405 517 UK EU US US EU US London Stockholm San Francisco San Francisco Stockholm San Francisco Davies Andersson Martinez Miller Schmidt Jackson 826 878 EU UK Berlin London Meyer Scott 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU 403 Row 3 405 Martinez San Francisco US 380 Row 4 310 Schmidt Berlin EU 358 Row 5 517 Davies London EU 332 Row 6 235 Asplund Stockholm EU 302 Row 7 111 Jackson San Francisco US 281 Row 8 878 Scott London EU Page 0

Slide 19

Slide 19 text

6 332 26 7 302 30 8 281 21 9 254 27 Page 0  (format, stats, …) 0 483 29 1 455 28 2 434 21 3 403 31 4 380 23 5 358 22 434 Row 2 826 Meyer Berlin EU Database Page Employee ID Country City Surname 235 459 111 Berlin Asplund EU 310 344 405 504 517 UK EU EU US US EU US London Stockholm Stockholm San Francisco San Francisco Stockholm San Francisco Davies Karlsson Andersson Martinez Miller Schmidt Jackson 826 878 EU UK Berlin London Meyer Scott 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU 403 Row 3 405 Martinez San Francisco US 380 Row 4 310 Schmidt Berlin EU 358 Row 5 517 Davies London EU 332 Row 6 235 Asplund Stockholm EU 302 Row 7 111 Jackson San Francisco US 281 Row 8 878 Scott London EU 254 Row 9 504 Karlsson Stockholm EU Page 0

Slide 20

Slide 20 text

Slide 21

Slide 21 text

Slide 22

Slide 22 text

Slide 23

Slide 23 text

Slide 24

Slide 24 text

Page 1  (format, stats, …) Database Page Employee ID Country City Surname 235 459 111 Berlin Asplund EU 310 344 405 504 517 UK EU EU US US EU US London Stockholm Stockholm San Francisco San Francisco Stockholm San Francisco Davies Karlsson Andersson Martinez Miller Schmidt Jackson 826 878 EU UK Berlin London Meyer Scott Page 0 Page 1

Slide 25

Slide 25 text

Page 1  (format, stats, …) 0 483 29 Database Page Employee ID Country City Surname 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson 826 878 EU UK Berlin London Meyer Scott 483 Row 0 344 Miller San Francisco US Page 0 Page 1

Slide 26

Slide 26 text

Page 1  (format, stats, …) 0 483 29 1 455 28 Database Page Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU Page 0 Page 1

Slide 27

Slide 27 text

… Page 1  (format, stats, …) 0 483 29 1 455 28 Database Page Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott 483 Row 0 344 Miller San Francisco US 455 Row 1 459 Andersson Stockholm EU Page 0 Page 1

Slide 28

Slide 28 text

Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Heap-Organized Tables SELECT * FROM employee WHERE country = ‘US’; SELECT * FROM employee WHERE employee_id = ‘235’; Data is stored in no particular order

Slide 29

Slide 29 text

Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Non-Clustered Index Heap-Organized Tables Heap-Organized Table Non-Clustered Index

Slide 30

Slide 30 text

Non-Clustered Index Heap-Organized Tables CREATE INDEX index_name ON table_name (column_name_a, column_name_b); Heap-Organized Table Non-Clustered Index

Slide 31

Slide 31 text

Non-Clustered Index Heap-Organized Tables CREATE INDEX index_name ON table_name (column_name_a, column_name_b); Heap-Organized Table Non-Clustered Index It contains a copy of the indexed columns Index Fields

Slide 32

Slide 32 text

Slide 33

Slide 33 text

Non-Clustered Index Heap-Organized Tables Index Fields Block Offset CREATE INDEX index_name ON table_name USING BTREE (column_name_a, column_name_b); Heap-Organized Table Non-Clustered Index Fast Range & Lookup Queries SELECT * FROM table_name WHERE foo > ‘xyz’; SELECT * FROM table_name WHERE foo < ‘xyz’; SELECT * FROM table_name WHERE foo LIKE ‘xy%’; SELECT * FROM table_name WHERE foo > ‘bbb’ AND foo < ‘eee’; SELECT * FROM table_name WHERE foo = ‘xyz’; A BTREE index can be used for column comparisons I n expressions that use the =, >, >=, <, <=, or LIKE with a prefix 'foo%'

Slide 34

Slide 34 text

Non-Clustered Index Heap-Organized Tables Index Fields Block Offset CREATE INDEX index_name ON table_name USING HASH (column_name_a, column_name_b); Hash Indexes are used only for equality comparisons SELECT * FROM table_name WHERE foo = ‘xyz’;

Slide 35

Slide 35 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Employee ID Page Offset 344 0 0 459 0 1 826 0 2 405 0 3 310 0 4 517 0 5 235 0 6 111 0 7 878 0 8 504 0 9 166 1 0 105 1 1 CREATE INDEX eid_index ON employee (employee_id);

Slide 36

Slide 36 text

Slide 37

Slide 37 text

Slide 38

Slide 38 text

Slide 39

Slide 39 text

Slide 40

Slide 40 text

Slide 41

Slide 41 text

Slide 42

Slide 42 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 CREATE INDEX cities_index ON employee (country, city); San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US

Slide 43

Slide 43 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 SELECT * FROM employee WHERE country = ‘UK’; CREATE INDEX cities_index ON employee (country, city); San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US

Slide 44

Slide 44 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 SELECT * FROM employee WHERE country = ‘UK’; CREATE INDEX cities_index ON employee (country, city); San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US

Slide 45

Slide 45 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 SELECT * FROM employee WHERE country = ‘UK’; CREATE INDEX cities_index ON employee (country, city); San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US

Slide 46

Slide 46 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 SELECT * FROM employee WHERE country = ‘UK’; CREATE INDEX cities_index ON employee (country, city); San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US

Slide 47

Slide 47 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US SELECT * FROM employee WHERE country = ‘EU’ AND city = ‘Stockholm’; CREATE INDEX cities_index ON employee (country, city);

Slide 48

Slide 48 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US SELECT * FROM employee WHERE country = ‘EU’ AND city = ‘Stockholm’; CREATE INDEX cities_index ON employee (country, city);

Slide 49

Slide 49 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US SELECT * FROM employee WHERE country = ‘EU’ AND city = ‘Stockholm’; CREATE INDEX cities_index ON employee (country, city);

Slide 50

Slide 50 text

Non-Clustered Index Heap-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 344 405 504 517 UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott Page Offset Stockholm 0 1 Berlin 0 2 Berlin 0 4 London 0 5 Stockholm 0 6 San Francisco 0 7 Stockholm 0 9 London 1 0 San Francisco 0 0 Berlin 1 1 San Francisco 0 3 London 0 8 US UK EU EU City Country EU EU EU EU UK UK US US SELECT * FROM employee WHERE country = ‘EU’ AND city = ‘Stockholm’; CREATE INDEX cities_index ON employee (country, city);

Slide 51

Slide 51 text

Clustered Index PostgreSQL MySQL SQL Server Oracle CREATE TABLE ( … PRIMARY KEY CLUSTERED, ); CREATE TABLE ( …PRIMARY KEY ) ORGANIZATION INDEX; InnoDB CREATE TABLE ( PRIMARY KEY, ); CREATE INDEX idx_name ON my_table (col_a, col_b); CLUSTER my_table USING idx_name; When a table is clustered, it is physically reordered based on the index information. Clustering is a one-time operation when the table is updated, the changes are not clustered. Data is physical stored order based on the clustered key (Primary Key) Microsoft Index-Organized Tables (IOT) have primary key data and non-key columns data stored within the same B+Tree structure Data can be retrieved quickly based on the clustered key, if the query uses the indexed columns.

Slide 52

Slide 52 text

Clustered Index Index-Organized Tables Data is physical stored order based on the clustered key (Primary Key) Data can be retrieved quickly based on the clustered key, if the query uses the indexed columns. Index-Organized Tables (IOT) have primary key data and non-key columns data stored within the same B+Tree structure

Slide 53

Slide 53 text

Clustered Index Index-Organized Tables Row Data Data is physical stored order based on the clustered key (Primary Key) Data can be retrieved quickly based on the clustered key, if the query uses the indexed columns. Index-Organized Tables (IOT) have primary key data and non-key columns data stored within the same B+Tree structure

Slide 54

Slide 54 text

Clustered Index Index-Organized Tables Employee ID Country City Surname 105 235 459 111 Berlin Asplund EU 310 166 405 504 517 UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez 344 US San Francisco Miller Schmidt Stewart Jackson Schneider 826 878 EU UK Berlin London Meyer Scott SELECT * FROM employee WHERE country = ‘UK’ AND city = ‘London’; CREATE TABLE employee ( PRIMARY KEY (country, city, surname, employee_id) ); SELECT * FROM employee WHERE country = ‘UK’ AND city = ‘London’ AND surname LIKE ’S%’; SELECT * FROM employee WHERE country = ‘EU’;

Slide 55

Slide 55 text

Non-Clustered Index Index-Organized Tables Row Data CREATE INDEX index_name ON table_name USING BTREE (column_name_a, column_name_b); Non-Clustered Index Clustered Index

Slide 56

Slide 56 text

Index Fields Non-Clustered Index Index-Organized Tables Row Data CREATE INDEX index_name ON table_name USING BTREE (column_name_a, column_name_b); Non-Clustered Index Clustered Index

Slide 57

Slide 57 text

PK Fields Index Fields Non-Clustered Index Index-Organized Tables Row Data CREATE INDEX index_name ON table_name USING BTREE (column_name_a, column_name_b); Non-Clustered Index Clustered Index

Slide 58

Slide 58 text

PK Fields Index Fields Non-Clustered Index Index-Organized Tables Row Data CREATE INDEX index_name ON table_name USING HASH (column_name_a, column_name_b); Non-Clustered Index Clustered Index

Slide 59

Slide 59 text

Employee ID Country City 826 459 235 111 Berlin EU 310 878 344 504 166 UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin 405 US San Francisco 105 517 EU UK Berlin London Surname Asplund Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider Meyer Scott Role HR Financial Operation Support Marketing IT Operation Support Operation HR Support IT Primary Key Non-Clustered Index Index-Organized Tables Employee ID Country City 826 459 235 111 Berlin EU 310 344 166 UK EU US EU US EU London Stockholm San Francisco Stockholm San Francisco Berlin 405 US San Francisco 105 517 EU UK Berlin London Role HR Financial Support Marketing IT Operation Operation HR Support IT 504 EU Stockholm Operation 878 UK London Support CREATE INDEX role_index ON employee (role);

Slide 60

Slide 60 text

Slide 61

Slide 61 text

Slide 62

Slide 62 text

Slide 63

Slide 63 text

Slide 64

Slide 64 text

Employee ID Country City 826 459 235 111 Berlin EU 310 878 344 504 166 UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin 405 US San Francisco 105 517 EU UK Berlin London Surname Asplund Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider Meyer Scott Role HR Financial Operation Support Marketing IT Operation Support Operation HR Support IT Primary Key Non-Clustered Index Index-Organized Tables Employee ID 105 235 459 111 310 166 405 504 517 344 826 878 Country City Berlin EU UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin US San Francisco EU UK Berlin London CREATE INDEX id_index ON employee (employee_id);

Slide 65

Slide 65 text

Employee ID Country City 826 459 235 111 Berlin EU 310 878 344 504 166 UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin 405 US San Francisco 105 517 EU UK Berlin London Surname Asplund Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider Meyer Scott Role HR Financial Operation Support Marketing IT Operation Support Operation HR Support IT Primary Key Non-Clustered Index Index-Organized Tables Employee ID 105 235 459 111 310 166 405 504 517 344 826 878 Country City Berlin EU UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin US San Francisco EU UK Berlin London CREATE INDEX id_index ON employee (employee_id); SELECT * FROM employee WHERE employee_id = 310;

Slide 66

Slide 66 text

Employee ID Country City 826 459 235 111 Berlin EU 310 878 344 504 166 UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin 405 US San Francisco 105 517 EU UK Berlin London Surname Asplund Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider Meyer Scott Role HR Financial Operation Support Marketing IT Operation Support Operation HR Support IT Primary Key Non-Clustered Index Index-Organized Tables Employee ID 105 235 459 111 310 166 405 504 517 344 826 878 Country City Berlin EU UK EU EU US EU UK US EU London Stockholm Stockholm San Francisco Stockholm London San Francisco Berlin US San Francisco EU UK Berlin London CREATE INDEX id_index ON employee (employee_id); SELECT * FROM employee WHERE employee_id = 310;

Slide 67

Slide 67 text

Slide 68

Slide 68 text

Slide 69

Slide 69 text

RowId Tables Role Country City Surname Support Support HR Operation Berlin Asplund EU Operation Financial Marketing IT Operation IT UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider HR Support EU UK Berlin London Meyer Scott RowId 1 2 3 4 5 6 7 8 9 10 11 12 CREATE TABLE employee (country, city, surname, role); SELECT rowid, * FROM employee; Index-Organized Tables order based on RowId SQLite (Default) AUTOINCREMENT

Slide 70

Slide 70 text

Slide 71

Slide 71 text

Slide 72

Slide 72 text

RowId 2 11 7 1 5 8 3 4 12 6 Role HR Financial Support Marketing IT Operation Operation HR Support IT 10 Operation 9 Support Role Country City Surname Support Support HR Operation Berlin Asplund EU Operation Financial Marketing IT Operation IT UK EU EU US US EU UK US EU London Stockholm Stockholm San Francisco San Francisco Stockholm London San Francisco Berlin Davies Karlsson Andersson Martinez Miller Schmidt Stewart Jackson Schneider HR Support EU UK Berlin London Meyer Scott RowId 1 2 3 4 5 6 7 8 9 10 11 12 CREATE INDEX role_index ON employee (role); SELECT * FROM employee WHERE role = ‘Operation’; Non-Clustered Index RowId Tables Reduced Index size (RowId vs Primary Key) Less Fragmentation (B+Tree Append Only) Same pitfalls of Heap-Organized Tables Too many Disk I/O from a Non-Clustered Index for range queries or sorting.

Slide 73

Slide 73 text

Fragmentation What is Page Fragmentation?

Slide 74

Slide 74 text

Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 2 Row 3 Row 4 Row 5 Row 6 New rows are added without any ordering constraint

Slide 75

Slide 75 text

Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 2 Row 3 Row 4 Row 5 Row 6 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not.

Slide 76

Slide 76 text

Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not.

Slide 77

Slide 77 text

Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 Row 2 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor)

Slide 78

Slide 78 text

Slide 79

Slide 79 text

Slide 80

Slide 80 text

Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 Row 2 Page 1 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor) But if the row is too large, a new Page is created

Slide 81

Slide 81 text

Page 1 Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 Row 2 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor) But if the row is too large, a new Page is created

Slide 82

Slide 82 text

Slide 83

Slide 83 text

Page 1 Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 Row 2 Row 7 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor) But if the row is too large, a new Page is created

Slide 84

Slide 84 text

Page 1 Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 Row 2 Row 7 Row 8 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor) But if the row is too large, a new Page is created

Slide 85

Slide 85 text

Page 1 Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 Row 2 Row 7 Row 8 Row 9 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor) But if the row is too large, a new Page is created

Slide 86

Slide 86 text

Page 1 Fragmentation Heap-Organized Tables Page 0 Row 0 Row 1 Row 3 Row 4 Row 5 Row 6 Row 2 Row 7 Row 8 Row 9 Row 10 Row 11 New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor) But if the row is too large, a new Page is created

Slide 87

Slide 87 text

Page 1 Fragmentation Heap-Organized Tables Page 0 Row 0 Row 3 Row 4 Row 5 Row 2 Row 7 Row 10 Row 11 unused spaces leads to more I/O, compared to a “more compact” table More I/O generally means: Slower query performance More memory usage (worse cache usage) More costs (Cloud providers bill based on I/O usage) New rows are added without any ordering constraint If Rows are variable in size (e.g. VARCHAR) An update may find space in the same table, or not. Some space is always kept free for this case (Page Fill Factor) But if the row is too large, a new Page is created Fragmentation

Slide 88

Slide 88 text

Slide 89

Slide 89 text

Fragmentation Index-Organized Tables Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent)

Slide 90

Slide 90 text

Fragmentation Index-Organized Tables Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent)

Slide 91

Slide 91 text

Fragmentation Index-Organized Tables Row C Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent)

Slide 92

Slide 92 text

Fragmentation Index-Organized Tables Row C Row F Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent)

Slide 93

Slide 93 text

Fragmentation Index-Organized Tables Row A Row C Row F Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted

Slide 94

Slide 94 text

Fragmentation Index-Organized Tables Row A Row C Row F Row H Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted

Slide 95

Slide 95 text

Slide 96

Slide 96 text

Fragmentation Index-Organized Tables Row A Row C Row F Row F Row H Row H Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Leaf Pages Contains the row with columns data

Slide 97

Slide 97 text

Slide 98

Slide 98 text

Fragmentation Index-Organized Tables Row A Row C Row D Row D Row F Row H Row H Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 99

Slide 99 text

Fragmentation Index-Organized Tables Row A Row C Row D Row D Row F Row I Row H Row I Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 100

Slide 100 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row H Row I Row F Row I Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 101

Slide 101 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 102

Slide 102 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 103

Slide 103 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row J Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 104

Slide 104 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row J Row I Row J Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 105

Slide 105 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row L Row I Row L Row K Row L Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted

Slide 106

Slide 106 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row L Row O Row I Row O Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around.

Slide 107

Slide 107 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row P Row Q Row R Row L Row O Row R Row I Row R Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around.

Slide 108

Slide 108 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row P Row Q Row R Row S Row T Row U Row L Row O Row R Row U Row I Row R Row U Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around.

Slide 109

Slide 109 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row P Row Q Row R Row S Row T Row U Row V Row W Row X Row L Row O Row R Row U Row X Row I Row R Row X Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around.

Slide 110

Slide 110 text

Fragmentation Index-Organized Tables Row A Row B Row C Row C Row D Row F Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row P Row Q Row R Row S Row T Row U Row V Row W Row X Row Y Row Z Row L Row O Row R Row U Row X Row Z Row I Row R Row Z Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around.

Slide 111

Slide 111 text

Slide 112

Slide 112 text

Slide 113

Slide 113 text

Fragmentation Index-Organized Tables Row A Row B Row B Row D Row F Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row P Row Q Row R Row S Row T Row U Row V Row W Row X Row Y Row Z Row L Row O Row R Row U Row X Row Z Row I Row R Row Z Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around.

Slide 114

Slide 114 text

Slide 115

Slide 115 text

Fragmentation Index-Organized Tables Row A Row B Row B Row F Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row P Row Q Row R Row S Row T Row U Row V Row W Row X Row Y Row Z Row L Row O Row R Row U Row X Row Z Row I Row R Row Z Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around. Removes may merge and remove “half-empty” nodes

Slide 116

Slide 116 text

Fragmentation Index-Organized Tables Row A Row B Row B Row G Row H Row F Row I Row I Row J Row K Row L Row M Row N Row O Row P Row Q Row R Row S Row T Row U Row V Row W Row X Row Y Row Z Row L Row O Row R Row U Row X Row Z Row I Row R Row Z Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around. Removes may merge and remove “half-empty” nodes

Slide 117

Slide 117 text

Slide 118

Slide 118 text

Slide 119

Slide 119 text

Fragmentation Index-Organized Tables Row A Row B Row B Row H Row F Row I Row I Row J Row L Row N Row O Row P Row R Row T Row U Row V Row X Row Z Row L Row O Row R Row U Row X Row Z Row I Row R Row Z Index-Organized Tables are implemented using a B+Tree or a variant. (The balancing behaviors are implementation dependent) Data is always sorted Random Order insertions causes a lot of “row movement” to keep the data sorted Append insertions are fast, data is already sorted No need to move things around. Removes may merge and remove “half-empty” nodes Some sort of fragmentation is inevitable

Slide 120

Slide 120 text

Slide 121

Slide 121 text

Slide 122

Slide 122 text

Large Rows BLOB, TEXT, … If a row does not fit in a page… Page X

Slide 123

Slide 123 text

Large Rows BLOB, TEXT, … If a row does not fit in a page… Page X TABLE posts ( post_date TIMESTAMP, post_title VARCHAR(128), post_content LONGTEXT, )

Slide 124

Slide 124 text

Slide 125

Slide 125 text

Large Rows BLOB, TEXT, … If a row does not fit in a page… Page X TABLE posts ( post_date TIMESTAMP, post_title VARCHAR(128), post_content LONGTEXT, ) TEXT & BLOB Used for storing “large” values Row 0 Hello World 1693116445 This is a Test If the value is small, it may be inlined.

Slide 126

Slide 126 text

Slide 127

Slide 127 text

Slide 128

Slide 128 text

Slide 129

Slide 129 text

Slide 130

Slide 130 text

Slide 131

Slide 131 text

Slide 132

Slide 132 text

Slide 133

Slide 133 text

Slide 134

Slide 134 text

Slide 135

Slide 135 text

Understanding what the storage does under the hood is crucial for optimizing performance and reduce costs

Slide 136

Slide 136 text

Heap-Organized Tables Index Fields Page Offset Heap-Organized Tables Non-Clustered Index Row Data Data is stored in no particular order. Without a matching non-clustered index, the entire table must be read (full table scan) to find any row. A Non-Clustered Index is a separate structure from the Data Table. It contains a copy of the indexed columns. Individual rows are identified by reference (page block, row offset) takes less space in non-clustered index than most primary keys. If the selected columns are part of the index, no extra lookup the the heap table has to be performed. Heap Tables excel at bulk insert operations since they don't require maintaining index order during insertion. Good for Archival or Staging Tables. Do not use when data is frequently sorted or grouped. Too many Disk I/O, Slow Performance, and the Cloud Provider bills everything

Slide 137

Slide 137 text

Index-Organized Tables Non-Clustered Index Index Fields Primary Key Clustered Index Index-Organized Tables Row Data Data is physical stored order based on the clustered key (Primary Key) Data can be retrieved quickly based on the clustered key, if the query uses the indexed columns. (Range & Lookup queries) Choose the Clustered Index based on the most frequent access pattern. A Non-Clustered Index is a separate structure from the Data Table. It contains a copy of the indexed columns. Individual rows are identified by the Primary Key takes more space (depending on the PK) compared to the block pointer of heap-organized tables. For Insert-heavy workloads, consider a Heap-organized table or use a Sequential RowId or a Time-based Id to reduce fragmentation.

Slide 138

Slide 138 text

Index Fields Page Offset Heap-Organized Tables Non-Clustered Index Row Data Non-Clustered Index Index Fields Primary Key Clustered Index Index-Organized Tables Row Data Heap-Organized Tables Index-Organized Tables Clustered Index Non-Clustered Index Data Pages Page Fragmentation BLOBs