Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Database Indexing - Organize your Data for Fast...

Database Indexing - Organize your Data for Faster Queries

Matteo Bertozzi

May 02, 2023
Tweet

More Decks by Matteo Bertozzi

Other Decks in Technology

Transcript

  1. Database Indexing The Book Th30z Book Title Author Book Title

    Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15 Organize your Data For Faster Queries
  2. Database Indexing The Book Th30z Book Title Author Book Title

    Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15
  3. Database Indexing The Book Th30z Book Title Author Book Title

    Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15 Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3
  4. Give me all the books by this “author” Database Indexing

    The Book Th30z Book Title Author Book Title Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15 SELECT * FROM books WHERE author = ‘abc’ Sorted by: (Author, Title) Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3
  5. Give me all the books by this “author” with the

    title starting with “z%” The Book Th30z Book Title Author Book Title Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15 SELECT * FROM books WHERE author = ‘abc’ AND title LIKE ‘z%’ Sorted by: (Author, Title) Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  6. Give me the book by this “author” with the title

    “x2” The Book Th30z Book Title Author Book Title Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15 SELECT * FROM books WHERE author = ‘abc’ AND title = ‘x2’ Sorted by: (Author, Title) Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  7. Give me the books Published in this year The Book

    Th30z Book Title Author Book Title Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15 SELECT * FROM books WHERE published LIKE ‘%-1984’ Sorted by: (Author, Title) Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  8. Give me the books Published in this year The Book

    Th30z Book Title Author Book Title Au!or 1 2 3 4 A B x1 x2 x3 z05 z07 z11 z15 SELECT * FROM books WHERE published LIKE ‘%-1984’ Sorted by: (Author, Title) Full Table Scan Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  9. The Book Th30z Book Title Author 4 x3 z15 Book

    Title Au!or 1 2 3 A B x1 x2 z05 z07 z11 Sorted by: (Published) Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  10. The Book Th30z Book Title Author 4 x3 z15 Book

    Title Au!or 1 2 3 A B x1 x2 z05 z07 z11 Sorted by: (Published) SELECT * FROM books WHERE published LIKE ‘1984-%’ Give me the books Published in this year Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  11. The Book Th30z Book Title Author 4 x3 z15 Book

    Title Au!or 1 2 3 A B x1 x2 z05 z07 z11 Sorted by: (Published) SELECT * FROM books WHERE published > ’1984-Mar-01’ AND published < ’1985-Sep-01’ Give me the books Published in this period Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  12. The Book Th30z Book Title Author 4 x3 z15 Book

    Title Au!or 1 2 3 A B x1 x2 z05 z07 z11 Sorted by: (Published) Give me all the books by this “author” SELECT * FROM books WHERE author = ‘abc’ Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  13. The Book Th30z Book Title Author 4 x3 z15 Book

    Title Au!or 1 2 3 A B x1 x2 z05 z07 z11 Sorted by: (Published) Give me all the books by this “author” SELECT * FROM books WHERE author = ‘abc’ Full Table Scan Books Table Author: Sir write-a-lot Title: Made up stuff Category: Science fiction Publisher: Cool Stuff Press Published: 1984-Jan-09 ISBN: 123-4-56-789012-3 Database Indexing
  14. A1 117 A2 43 A3 178 B1 79 B2 48

    C1 176 C2 200 C3 190 C4 89 D1 233 D2 77 a b c aaaa bbbb cccc dddd eeee ffff gggg hhhh iiii rrrr zzzz Single key If the data is sorted, the database can perform “range queries” really fast WHERE a LIKE ‘%50’ The key is used to identify the “start” Then each row is checked on the condition (b = 200) WHERE b = 2 The prefix of key is used. Then a scan in that range is performed WHERE c = ‘dddd’ WHERE a LIKE ‘C%’ WHERE a > ‘B1’ AND b = 200 Full Table Scan: The index cannot be used each row in the table will be read. Full Table Scan: The index cannot be used each row in the table will be read. Full Table Scan: The column is not indexed each row in the table will be read. WHERE a > ‘B1’ AND a < ‘C3’ The key is used to identify the “start” and the “end” Then a scan is performed on that range WHERE a > ‘B1’ The key is used to identify the “start” Then a scan is performed from that row onwards The key is used WHERE a = ‘B2’ Database Indexing
  15. M 1 117 M 2 43 M 3 178 Q

    1 79 Q 2 48 R 1 176 R 2 200 R 3 190 R 4 89 Z 1 233 Z 2 77 a b c d aaaa bbbb cccc dddd eeee ffff gggg hhhh iiii rrrr zzzz Compound key The columns order is important. (a, b) is different from (b, a) WHERE a LIKE ‘%xyz’ The prefix of key is used. Then a scan in that range is performed WHERE b = 2 The prefix of key is used. Then a scan in that range is performed WHERE c = 200 WHERE a LIKE ‘xyz%’ WHERE a = ‘M’ Full Table Scan: The index cannot be used each row in the table will be read. Full Table Scan: The index cannot be used each row in the table will be read. Full Table Scan: The column is not indexed each row in the table will be read. WHERE a = ‘R’ AND b > 10 The full key is used. Then a scan is performed from that row onwards WHERE (a > ‘O’ AND a < ’S') AND (b > 10) The prefix of key is used, to identify the a range Then each row is checked on the condition (b > 10) The full key is used WHERE a = ‘M’ AND b = 3 Database Indexing
  16. (M, 2) 43 48 77 79 89 (R, 4) 117

    176 178 190 (Z, 2) 200 233 c (Q, 2) (Q, 1) PK ref (R, 2) (R, 1) (R, 3) (Z, 1) (M, 3) (M, 1) index_on_c M 1 117 M 2 43 M 3 178 Q 1 79 Q 2 48 R 1 176 R 2 200 R 3 190 R 4 89 Z 1 233 Z 2 77 a b c d aaaa bbbb cccc dddd eeee ffff gggg hhhh iiii rrrr zzzz table_pk_a_b SELECT a, b, c, d FROM table WHERE c > 70 AND c < 150 Secondary Index for each Row maching “there will be a lookup“ on the primary key Database Indexing Clustered Index Non-Clustered Index
  17. SELECT c, d FROM table WHERE c > 70 AND

    c < 150 M 1 117 M 2 43 M 3 178 Q 1 79 Q 2 48 R 1 176 R 2 200 R 3 190 R 4 89 Z 1 233 Z 2 77 a b c d aaaa bbbb cccc dddd eeee ffff gggg hhhh iiii rrrr zzzz table_pk_a_b (M, 2) 43 48 77 79 89 (R, 4) 117 176 178 190 (Z, 2) 200 233 c (Q, 2) (Q, 1) PK ref (R, 2) (R, 1) (R, 3) (Z, 1) (M, 3) (M, 1) index_on_c_d aaaa bbbb eeee cccc dddd ffff gggg hhhh iiii rrrr zzzz d Index with Included Columns If our query uses an index but it always selects a couple of fields we can “embed” them into the index (Only the index is used) Database Indexing
  18. Database Indexing There may be better indexes but the “Range

    Index” B+Tree (LSM Tree) that you find in every Relational and NoSQL is a good default for every kind of query Databases tends to have more reads than writes M 1 117 M 2 43 M 3 79 Q 1 79 Q 2 48 R 1 176 R 2 200 R 3 190 R 4 89 Z 1 233 Z 2 77 Know what queries you need to support! The best way to design a database table/index is to