Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Index Management in Depth

Index Management in Depth

Every time you choose how to store data in your database, a lot of things happen under the hood.
Making the best choice is even more important in those applications that aim to high performance.
The purpose of the talk is to show how indexes work and how slightly changing their combinations can impact on the performance of your application.

Andrea Giuliano

February 20, 2015
Tweet

More Decks by Andrea Giuliano

Other Decks in Technology

Transcript

  1. I N D E X M A N A G E M E N T
    I N D E P T H
    A n d re a G i u l i a n o
    @ b i t _ s h a r k

    View Slide

  2. D ATA B A S E M A N A G E M E N T S Y S T E M S

    View Slide

  3. T H E A R C H I T E C T U R E
    D A TA B A S E M A N A G E M E N T S Y S T E M S
    web forms Application Front Ends SQL Interface
    SQL COMMANDS
    Plan Executor Parser
    Optimizer
    Operator Evaluator
    Access File Manager
    SQL Engine
    Buffer Manager
    Disk Manager
    Transaction Manager
    Lock Manager
    Recovery
    Manager
    Concurrency Control
    DBMS

    View Slide

  4. S Q L E N G I N E
    A R C H I T E C T U R E O F A D B M S
    web forms Application Front Ends SQL Interface
    SQL COMMANDS
    Plan Executor Parser
    Optimizer
    Operator Evaluator
    Access File Manager
    SQL Engine
    Buffer Manager
    Disk Manager
    Transaction Manager
    Lock Manager
    Recovery
    Manager
    Concurrency Control
    DBMS

    View Slide

  5. A C C E S S F I L E M A N A G E R
    A R C H I T E C T U R E O F A D B M S
    web forms Application Front Ends SQL Interface
    SQL COMMANDS
    Plan Executor Parser
    Optimizer
    Operator Evaluator
    Access File Manager
    SQL Engine
    Buffer Manager
    Disk Manager
    Transaction Manager
    Lock Manager
    Recovery
    Manager
    Concurrency Control
    DBMS

    View Slide

  6. B U F F E R M A N A G E R
    A R C H I T E C T U R E O F A D B M S
    web forms Application Front Ends SQL Interface
    SQL COMMANDS
    Plan Executor Parser
    Optimizer
    Operator Evaluator
    Access File Manager
    SQL Engine
    Buffer Manager
    Disk Manager
    Transaction Manager
    Lock Manager
    Recovery
    Manager
    Concurrency Control
    DBMS

    View Slide

  7. D I S K M A N A G E R
    A R C H I T E C T U R E O F A D B M S
    web forms Application Front Ends SQL Interface
    SQL COMMANDS
    Plan Executor Parser
    Optimizer
    Operator Evaluator
    Access File Manager
    SQL Engine
    Buffer Manager
    Disk Manager
    Transaction Manager
    Lock Manager
    Recovery
    Manager
    Concurrency Control
    DBMS

    View Slide

  8. U N I T O F I N F O R M AT I O N
    Page
    size ranges from 2Kb to 64Kb
    I/O of pages dominates the cost of the operations

    View Slide

  9. F I L E O R G A N I Z AT I O N
    method of arranging the records in a file

    View Slide

  10. –Sars, Roebuck, and Co, Consumers’ Guide, 1897
    “If you don’t find it in the index, look very carefully
    through the entire catalog.”

    View Slide

  11. I N D E X E S
    organize data records on disk to optimize retrieval operations

    View Slide

  12. I N D E X E S
    data entry
    record stored in an index file
    data record
    record stored in a database file

    View Slide

  13. P R O P E R T I E S O F A N I N D E X
    1. Structure of data entries
    2. Clustered/unclustered
    3. Primary/secondary
    4. Dense/sparse
    5. Organization of the index

    View Slide

  14. S T R U C T U R E O F D ATA E N T R I E S
    k*: data entry whose search key value is k
    1. k* is a data record
    -extreme case
    2. k* is a pair (k, rid)
    -the index file is independent from the data file
    3. k* is a pair
    -the index file is independent from the data file
    -better use of space but variable-length data entries
    P R O P E R T I E S O F A N I N D E X

    View Slide

  15. C L U S T E R E D / U N C L U S T E R E D I N D E X
    unclustered clustered
    data record
    file
    index file
    data records
    data
    entries
    data
    entries
    index entries
    data records
    P R O P E R T I E S O F A N I N D E X

    View Slide

  16. P E R F O R M A N C E
    data record
    file
    index file
    data records
    data
    entries
    Clustered
    only few pages
    have to be retrieved
    Unclustered
    as many data page I/Os as
    the number of data entries
    C L U S T E R E D / U N C L U S T E R E D I N D E X

    View Slide

  17. P R I M A RY A N D S E C O N D A RY I N D E X
    P R O P E R T I E S O F A N I N D E X

    View Slide

  18. 10
    20
    30
    40
    10
    20
    30
    40
    page pointers
    page in index file
    page in data file
    P R O P E R T I E S O F A N I N D E X
    D E N S E I N D E X

    View Slide

  19. 10
    30
    50
    70
    10
    20
    30
    40
    50
    60
    70
    80
    page pointers
    page in index file
    page in data file
    P R O P E R T I E S O F A N I N D E X
    S PA R S E I N D E X

    View Slide

  20. P R O P E R T I E S O F A N I N D E X
    O R G A N I Z AT I O N O F T H E I N D E X
    sorted index
    the index is a sorted file
    tree-based index
    the index is a tree
    hash-based index
    the index is a function from search key values
    to record addresses

    View Slide

  21. I S A M A N D B + - T R E E
    T R E E - B A S E D I N D E X
    ISAM
    used when the relation is static: no insertion
    and deletion on the tree
    b+-tree
    effective in dynamic situations with insertion
    and deletion

    View Slide

  22. Data Pages
    Index Pages
    Overflow Pages
    Page Allocation in ISAM


    Non-leaf
    pages
    Leaf
    pages
    Primary pages
    Overflow page
    I S A M
    T R E E - B A S E D I N D E X

    View Slide

  23. I S A M : O P E R AT I O N S
    T R E E - B A S E D I N D E X
    search
    identical to b+-tree (more on this soon).
    insertion
    find the right position on the tree and write the
    key (possible overflow pages).
    deletion
    remove the entry and the empty overflow page if
    needed. Leave the empty primary page as it is.

    View Slide

  24. B + - T R E E
    T R E E - B A S E D I N D E X
    Index entries
    Data entries
    Index file

    View Slide

  25. 12 78
    3 9 19 56 86 94
    33 44
    Daniel, 22, 6003
    Ashby, 25, 3000
    Basu, 33, 4003 Rossi, 44, 3000
    Bianchi, 50, 5004
    B + - T R E E
    T R E E - B A S E D I N D E X
    … … … … … … …

    View Slide

  26. 12 78
    3 9 19 56 86 94
    33 44
    Daniel, 22, 6003
    Ashby, 25, 3000
    Basu, 33, 4003 Rossi, 44, 3000
    Bianchi, 50, 5004
    … … … … … … …

    B + - T R E E : S E A R C H
    T R E E - B A S E D I N D E X
    Start search

    View Slide

  27. B + - T R E E : S E A R C H
    T R E E - B A S E D I N D E X
    12 78
    3 9 19 56 86 94
    33 44
    Daniel, 22, 6003
    Ashby, 25, 3000
    Basu, 33, 4003 Rossi, 44, 3000
    Bianchi, 50, 5004
    … … … … … … …

    Start search
    A
    B
    L1 L2 L3
    find all data entries with
    24 < key < 50

    View Slide

  28. 12 78
    3 9 19 56
    33 44
    Daniel, 22, 6003
    Ashby, 25, 3000
    Basu, 33, 4003 Rossi, 44, 3000
    Bianchi, 50, 5004
    … … … …


    Start search
    A
    B
    L1 L2 L3
    S E A R C H : C O S T
    T R E E - B A S E D I N D E X
    f: fanout
    h: height
    m: leaves (f h)
    Cost of a search
    [logF
    m]

    View Slide

  29. T R E E - B A S E D I N D E X
    12 78
    3 9 19 56
    33 44
    Daniel, 22, 6003
    Ashby, 25, 3000
    Basu, 33, 4003 Rossi, 44, 3000
    Bianchi, 50, 5004
    … … … …


    Start search
    A
    B
    L1 L2 L3
    S E A R C H : C O S T
    f = 3
    h = 3
    m = 27
    I/Os
    [log3
    27] = 3

    View Slide

  30. B + - T R E E : I N S E R T
    T R E E - B A S E D I N D E X
    13 17 24 30
    2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*
    Insertion of a data record with search key value 8

    View Slide

  31. 2* 3* 5* 7* 8*
    B + - T R E E : I N S E R T
    T R E E - B A S E D I N D E X

    5
    data entry is copied up

    View Slide

  32. B + - T R E E : I N S E R T
    T R E E - B A S E D I N D E X
    17
    data entry is pushed up
    5 13 24 30

    View Slide

  33. B + - T R E E : I N S E R T
    T R E E - B A S E D I N D E X
    17
    5* 7* 8* 14
    *
    16
    *
    19
    *
    20
    *
    22
    *
    24
    *
    27
    *
    29
    *
    33
    *
    34
    *
    38
    *
    39
    *
    5 13 24 30
    2* 3*
    The resulting tree after the insertion of a data record with search key value 8.

    View Slide

  34. B + - T R E E : D E L E T E
    T R E E - B A S E D I N D E X
    17
    5* 7* 8* 14
    *
    16
    *
    22 24 27 29 33
    *
    34
    *
    38
    *
    39
    *
    5 13 27 30
    2* 3*
    The resulting tree after deleting entries 19* and 20*

    View Slide

  35. View Slide

  36. - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - dense unclustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X

    View Slide

  37. SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%
    - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - dense unclustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X

    View Slide

  38. SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%
    - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - dense unclustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    how many page accesses
    do we need to answer to the query?
    E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X

    View Slide

  39. E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X
    - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - dense unclustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    Let’s build the index structure
    each record has 4 field so in each page there are 40 fields
    20 data entries fit in one leaf page of the index
    the tree has a fan-out of 20

    View Slide

  40. E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X
    fanout: 20
    occupancy factor of 67%
    leads to 13 data entries in the leaves
    How many leaves are there in the tree?
    2.000.000/13 = 153.846

    View Slide

  41. log20 (153.846) = 4 I/O page accesses
    E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X
    How many I/Os are needed to go to the leaves?
    leaves: 153.846
    fanout: 20

    View Slide

  42. E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X
    - 200 records with the same value of the attribute cost
    (on average)
    - 13 data entries in the leaves
    - dense unclustered b+-tree index with search key cost
    Book
    code
    author
    cost
    publisher
    SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%
    - 15 pages (200/13) to visit for reaching data
    records with the same cost value (on average)

    View Slide

  43. ~ 3 sec
    E X A M P L E
    D E N S E U N C L U S T E R E D I N D E X
    Costs
    path to the leaves: 4 I/Os
    leaves access: 15 I/Os
    data records: 200 I/Os
    Total cost
    4 + 15 + 200 = 219 I/Os

    View Slide

  44. - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - sparse clustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    E X A M P L E
    S PA R S E C L U S T E R E D I N D E X

    View Slide

  45. - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - sparse clustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%

    View Slide

  46. - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - sparse clustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%
    how many page accesses
    do we need to answer to the query?

    View Slide

  47. - 2.000.000 records
    - 200.000 pages
    - 10 data record in a page
    - 200 records with the same value of the attribute cost
    (on average)
    - sparse clustered b+-tree index with search key cost
    - alternative 2 (k*, rid)
    Book
    code
    author
    cost
    publisher
    E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    Let’s build the index structure
    each record has 4 field so in each page there are 40 fields
    20 data entries fit in one leaf page of the index
    the tree has a fan-out of 20

    View Slide

  48. E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    fanout: 20
    How many pages store 2.000.000 data records?
    2.000.000/10 = 200.000
    each data entry points to a
    data record page

    View Slide

  49. E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    fanout: 20
    occupancy factor of 67%
    leads to 13 data entries in the leaves
    How many leaves are there in the tree?
    200.000/13 = 15.384

    View Slide

  50. log20 (15.384) = 3 I/O page accesses
    E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    How many I/Os are needed to go to the leaves?
    leaves: 15.384
    fanout: 20
    fanout: 20

    View Slide

  51. E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    - 200 records with the same value of the attribute cost
    (on average)
    - 10 data records in a page
    - sparse clustered b+-tree index with search key cost
    Book
    code
    author
    cost
    publisher
    SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%
    - 20 pages (200/10) of data records to visit 

    (on average)

    View Slide

  52. ~ 0.3 sec
    E X A M P L E
    S PA R S E C L U S T E R E D I N D E X
    Costs
    path to the leaves: 3 I/Os
    data records: 20 I/Os
    Total cost
    3 + 20 = 23 I/Os

    View Slide

  53. E X A M P L E
    A VA R I A N T
    What if the attributes were part of the search key?
    SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%

    View Slide

  54. E X A M P L E
    W I T H O U T I N D E X
    In the worst case we have to visit 2.000.000 records
    SELECT code, author, publisher
    FROM Book
    WHERE cost = %cost%
    ~ 50 min

    View Slide

  55. View Slide

  56. $ W H O A M I
    Andrea Giuliano
    @bit_shark
    www.andreagiuliano.it

    View Slide

  57. joind.in/13333
    Please rate the talk!

    View Slide

  58. Ramakrishan, Gehrke “Database Management Systems”
    Assets:
    https://farm4.staticflickr.com/3577/3492185538_a39dbb4511_b_d.jpg
    https://farm3.staticflickr.com/2852/10740309163_12f6a671cc_k_d.jpg
    https://farm3.staticflickr.com/2456/3835365695_5e515a3492_b_d.jpg
    https://farm5.staticflickr.com/4048/4332381194_3cfbed7f8e_b_d.jpg
    https://farm5.staticflickr.com/4117/4800819674_3cf963deaa_b_d.jpg
    https://farm5.staticflickr.com/4141/4772464179_0672159bbd_b.jpg
    https://farm9.staticflickr.com/8530/8574154090_bd14f9ccbf_o_d.jpg
    https://farm8.staticflickr.com/7370/10847923014_2b3fc30cea_k_d.jpg
    R E F E R E N C E S

    View Slide