$30 off During Our Annual Pro Sale. View Details »

RivieraJUG - MySQL Indexes and Histograms

lefred
September 13, 2022

RivieraJUG - MySQL Indexes and Histograms

This session was presented at RivieraJUG, Nice, France on September 2022. Nobody complains that the database is too fast. But when things slow down, the complaints come quickly. The two most popular approaches to speeding up queries are indexes and histograms. But there are so many options and types on indexes that it can get confusing. Histograms are fairly new to MySQL but they do not work for all types of data. This talk covers how indexes and histograms work and shows you how to test just how effective they are so you can measure the performance of your queries.

lefred

September 13, 2022
Tweet

More Decks by lefred

Other Decks in Technology

Transcript

  1. Frédéric Descamps
    Community Manager
    Oracle MySQL
    RivieraJUG - September 2022
    MySQL Indexes and Histograms
    How They Really Speed Up Your Queries

    View Slide

  2. Frédéric Descamps
    Community Manager
    Oracle MySQL
    RivieraJUG - September 2022
    MySQL Indexes and Histograms
    How They Really Speed Up Your Queries

    View Slide

  3. Who am I ?
    about.me/lefred
    Copyright @ 2022 Oracle and/or its affiliates.
    2

    View Slide

  4. @lefred
    MySQL Evangelist
    using MySQL since version 3.20
    devops believer
    living in
    h ps://lefred.be
    Frédéric Descamps
    Copyright @ 2022 Oracle and/or its affiliates.
    3

    View Slide

  5. What Is This Session About ?
    Copyright @ 2022 Oracle and/or its affiliates.
    4

    View Slide

  6. Speed and Performance !
    Nobody ever complains that the database is too fast !
    Copyright @ 2022 Oracle and/or its affiliates.
    5

    View Slide

  7. Speed and Performance !
    Nobody ever complains that the database is too fast !
    It's often the reverse... blame the database !
    Copyright @ 2022 Oracle and/or its affiliates.
    5

    View Slide

  8. Speed and Performance !
    Nobody ever complains that the database is too fast !
    It's often the reverse... blame the database !
    Speeding up queries is not a 'dark art'... but understanding how to speed up queries is often
    treated as magic.
    Copyright @ 2022 Oracle and/or its affiliates.
    5

    View Slide

  9. Speed and Performance !
    Nobody ever complains that the database is too fast !
    It's often the reverse... blame the database !
    Speeding up queries is not a 'dark art'... but understanding how to speed up queries is often
    treated as magic.
    So we will be looking at the proper use of indexes, histograms and see how to keep the rigth
    balance for your workload.
    Copyright @ 2022 Oracle and/or its affiliates.
    5

    View Slide

  10. No coverage today of:
    System Con guration
    OS
    MySQL
    Hardware
    Networking/Cloud
    This is a dry subject !
    Do not try to absorb all the content at once, get the slides
    (h ps://slideshare.net/lefred.descamps)
    Copyright @ 2022 Oracle and/or its affiliates.
    6

    View Slide

  11. The North Star
    Query Respone Time
    Copyright @ 2022 Oracle and/or its affiliates.
    7

    View Slide

  12. Daniel Nichter, E cient MySQL
    Performance - Best Practices and
    Techniques, O'Reilly, 2021
    Query Response time
    Query response time is the only metric anyone truly cares about [...] because query
    response time is the only metric we experience. When a query takes 7.5 seconds to execute,
    we experience 7.5 seconds of impatience. That same query might examine a million rows,
    but we don't experience a million rows examined. Our time is precious.(*)
    Copyright @ 2022 Oracle and/or its affiliates.
    8

    View Slide

  13. Today's goal is to...
    reduce the Query Response Time
    Copyright @ 2022 Oracle and/or its affiliates.
    9

    View Slide

  14. Bad Queries
    Finding the Ugly Duckling
    Copyright @ 2022 Oracle and/or its affiliates.
    10

    View Slide

  15. What are bad queries ?
    We can de ne bad queries in two di erent categories:
    Copyright @ 2022 Oracle and/or its affiliates.
    11

    View Slide

  16. queries called way to often
    queries that are way too slow
    full table scan
    using lesort
    using temporary tables
    What are bad queries ?
    We can de ne bad queries in two di erent categories:
    Copyright @ 2022 Oracle and/or its affiliates.
    11

    View Slide

  17. What are bad queries ?
    Copyright @ 2022 Oracle and/or its affiliates.
    12

    View Slide

  18. If there could be only one ?
    If you should optimize only one query, the best candidate should be the query that
    consumes the most of the execution time (seen as latency in PFS, but usually called
    "response time").
    Copyright @ 2022 Oracle and/or its affiliates.
    13

    View Slide

  19. If there could be only one ?
    If you should optimize only one query, the best candidate should be the query that
    consumes the most of the execution time (seen as latency in PFS, but usually called
    "response time").
    sys Schema contains all the necessary info to nd that ugly Duckling:
    Copyright @ 2022 Oracle and/or its affiliates.
    13

    View Slide

  20. If there could be only one ?
    If you should optimize only one query, the best candidate should be the query that
    consumes the most of the execution time (seen as latency in PFS, but usually called
    "response time").
    sys Schema contains all the necessary info to nd that ugly Duckling:
    SELECT
    SELECT schema_name
    schema_name,
    , format_pico_time
    format_pico_time(
    (total_latency
    total_latency)
    ) tot_lat
    tot_lat,
    ,
    exec_count
    exec_count,
    , format_pico_time
    format_pico_time(
    (total_latency
    total_latency/
    /exec_count
    exec_count)
    ) latency_per_call
    latency_per_call,
    ,
    query_sample_text
    query_sample_text
    FROM
    FROM sys
    sys.
    .x$statements_with_runtimes_in_95th_percentile
    x$statements_with_runtimes_in_95th_percentile AS
    AS t1
    t1
    JOIN
    JOIN performance_schema
    performance_schema.
    .events_statements_summary_by_digest
    events_statements_summary_by_digest AS
    AS t2
    t2
    ON
    ON t2
    t2.
    .digest
    digest=
    =t1
    t1.
    .digest
    digest
    WHERE
    WHERE schema_name
    schema_name NOT
    NOT in
    in (
    ('performance_schema'
    'performance_schema',
    , 'sys'
    'sys')
    )
    ORDER
    ORDER BY
    BY (
    (total_latency
    total_latency/
    /exec_count
    exec_count)
    ) desc
    desc LIMIT
    LIMIT 1
    1\G
    \G
    Copyright @ 2022 Oracle and/or its affiliates.
    13

    View Slide

  21. If there could be only one ?
    If you should optimize only one query, the best candidate should be the query that
    consumes the most of the execution time (seen as latency in PFS, but usually called
    "response time").
    sys Schema contains all the necessary info to nd that ugly Duckling:
    SELECT
    SELECT schema_name
    schema_name,
    , format_pico_time
    format_pico_time(
    (total_latency
    total_latency)
    ) tot_lat
    tot_lat,
    ,
    exec_count
    exec_count,
    , format_pico_time
    format_pico_time(
    (total_latency
    total_latency/
    /exec_count
    exec_count)
    ) latency_per_call
    latency_per_call,
    ,
    query_sample_text
    query_sample_text
    FROM
    FROM sys
    sys.
    .x$statements_with_runtimes_in_95th_percentile
    x$statements_with_runtimes_in_95th_percentile AS
    AS t1
    t1
    JOIN
    JOIN performance_schema
    performance_schema.
    .events_statements_summary_by_digest
    events_statements_summary_by_digest AS
    AS t2
    t2
    ON
    ON t2
    t2.
    .digest
    digest=
    =t1
    t1.
    .digest
    digest
    WHERE
    WHERE schema_name
    schema_name NOT
    NOT in
    in (
    ('performance_schema'
    'performance_schema',
    , 'sys'
    'sys')
    )
    ORDER
    ORDER BY
    BY (
    (total_latency
    total_latency/
    /exec_count
    exec_count)
    ) desc
    desc LIMIT
    LIMIT 1
    1\G
    \G
    Copyright @ 2022 Oracle and/or its affiliates.
    *************************** 1. row ***************************
    schema_name: piday
    tot_lat: 4.29 h
    exec_count: 5
    latency_per_call: 51.51 min
    query_sample_text: select a.device_id, max(a.value) as `max temp`,
    min(a.value) as `min temp`, avg(a.value) as `avg temp`,
    max(b.value) as `max humidity`, min(b.value) as `min humidity`,
    avg(b.value) as `avg humidity`
    from temperature_history a
    join humidity_history b on b.device_id=a.device_id
    where date(a.time_stamp) = date(now())
    and date(b.time_stamp)=date(now()) group by device_id
    13

    View Slide

  22. More info about Queries
    Sys Schema contains all the required information in these tables :
    statements_with_full_table_scans
    statements_with_runtimes_in_95th_percentile
    statements_with_sorting
    statements_with_temp_tables
    Copyright @ 2022 Oracle and/or its affiliates.
    14

    View Slide

  23. More info about Queries
    Sys Schema contains all the required information in these tables :
    statements_with_full_table_scans
    statements_with_runtimes_in_95th_percentile
    statements_with_sorting
    statements_with_temp_tables
    And since MySQL 8.0 you can join the table
    performance_schema.events_statements_summary_by_digest to have a sample
    you can use.
    Copyright @ 2022 Oracle and/or its affiliates.
    14

    View Slide

  24. More info about Queries
    Sys Schema contains all the required information in these tables :
    statements_with_full_table_scans
    statements_with_runtimes_in_95th_percentile
    statements_with_sorting
    statements_with_temp_tables
    And since MySQL 8.0 you can join the table
    performance_schema.events_statements_summary_by_digest to have a sample
    you can use.
    We will check the meaning of this tables in some slides... be patient ;)
    Copyright @ 2022 Oracle and/or its affiliates.
    14

    View Slide

  25. Copyright @ 2022 Oracle and/or its affiliates.
    15

    View Slide

  26. The Optimizer
    some theory
    Copyright @ 2022 Oracle and/or its affiliates.
    16

    View Slide

  27. The MySQL Optimizer
    Consider the Optimizer the brain and nervous system of MySQL
    Query optimization is a feature of many Relational Database Management Systems
    The query optimizer a empts to determine the most e ective way to execute a given
    query by considering the possible query plans
    (h ps://en.wikipedia.org/wiki/Query_optimization)
    Copyright @ 2022 Oracle and/or its affiliates.
    17

    View Slide

  28. The MySQL Optimizer - estimation
    One of the hardest problems in query optimization is to accurately estimate the costs of
    alternative query plans.
    These costs are the result of a mathematical model of query execution costs that relies
    heavily on estimates of the cardinality, or number of tuple, owing through each edge in
    a query plan.
    Copyright @ 2022 Oracle and/or its affiliates.
    18

    View Slide

  29. The MySQL Optimizer - evaluation of the options
    The Optimizer wants to get your data the cheapest way possible.
    Like a route planner, the cost is built on historical statistics. And these statistics can
    change while the optimizer is working. So like a tra c jam, washed out road, or other
    tra c problem, the optimizer may be making poor decisions for the present situation...
    but this is very rare !
    The nal determination from the optimizer is called the Query Execution Plan (or QEP, or
    Query Plan).
    MySQL wants to optimize each query every time it sees it (there is no locking down the
    query plan like Oracle).
    Copyright @ 2022 Oracle and/or its affiliates.
    19

    View Slide

  30. 120
    if your query has ve joins the optimizer may
    have to evaluate 120 di erent options
    5!
    (5 * 4 * 3 * 2 * 1)
    Copyright @ 2022 Oracle and/or its affiliates.
    20

    View Slide

  31. The Query Execution Plan
    EXPLAIN: DBAs companion
    Copyright @ 2022 Oracle and/or its affiliates.
    21

    View Slide

  32. EXPLAIN is the command used to obtain
    the Query Execution Plan for a query
    including information about how tables are
    joined and in which order, which indexes are
    used and estimation of rows, ...
    EXPLAIN Syntax
    Copyright @ 2022 Oracle and/or its affiliates.
    22

    View Slide

  33. EXPLAIN Example
    Copyright @ 2022 Oracle and/or its affiliates.
    23

    View Slide

  34. EXPLAIN Example
    this is an ESTIMATION on how MySQL would run the query
    as it is not executed !
    Copyright @ 2022 Oracle and/or its affiliates.
    23

    View Slide

  35. system: the table contains excatly 1 row
    const: at most 1 row is matched for the
    table
    eq_ref: the table is the right-hand table
    in a join where the condition is on a PK
    or not null unique key.
    ref: the table is ltered by a nonunique
    secondary index.
    ref_or_null: the same as ref but the
    ltered column may also be NULL.
    index_merge: the Optimizer chooses a
    combination of two or more indexes to
    resolve a lter that includes an OR or
    AND between columns in di erent
    indexes.
    fulltext: use of a full text index to lter
    the table.
    range: this is used when an index is used
    to look up several values either in
    sequence or in groups.
    EXPLAIN - Access type
    Copyright @ 2022 Oracle and/or its affiliates.
    24

    View Slide

  36. index: the Optimizer chosen to perform
    a full index scan.
    ALL: full table scan !!
    EXPLAIN - Access type (2)
    Get much more info an examples in Chapter 20, Analyzing Queries from Jesper Wisborg
    Krogh's book: MySQL 8 Query Performance Tuning, Apress, 2020.
    Copyright @ 2022 Oracle and/or its affiliates.
    25

    View Slide

  37. EXPLAIN Example (with a JOIN)
    Copyright @ 2022 Oracle and/or its affiliates.
    26

    View Slide

  38. Visual EXPLAIN Example from MySQL Workbench
    Copyright @ 2022 Oracle and/or its affiliates.
    27

    View Slide

  39. EXPLAIN FORMAT=TREE Example
    Copyright @ 2022 Oracle and/or its affiliates.
    28

    View Slide

  40. EXPLAIN FORMAT=TREE Example
    Copyright @ 2022 Oracle and/or its affiliates.
    28

    View Slide

  41. EXPLAIN FORMAT=JSON Example
    Copyright @ 2022 Oracle and/or its affiliates.
    29

    View Slide

  42. EXPLAIN FORMAT=JSON Example
    this is the most detailed estimation !
    Copyright @ 2022 Oracle and/or its affiliates.
    29

    View Slide

  43. can we know the
    real numbers ?
    Copyright @ 2022 Oracle and/or its affiliates.
    30

    View Slide

  44. Estimated cost
    Actual execution statistics
    Time to return rst row
    Time to return all rows
    Number of rows returned
    Number of loops
    EXPLAIN ANALYZE
    Copyright @ 2022 Oracle and/or its affiliates.
    31

    View Slide

  45. Estimated cost
    Actual execution statistics
    Time to return rst row
    Time to return all rows
    Number of rows returned
    Number of loops
    Instruments and executes
    the query
    EXPLAIN ANALYZE
    Copyright @ 2022 Oracle and/or its affiliates.
    31

    View Slide

  46. EXPLAIN ANALYZE Example
    Copyright @ 2022 Oracle and/or its affiliates.
    32

    View Slide

  47. EXPLAIN ANALYZE Example
    Copyright @ 2022 Oracle and/or its affiliates.
    33

    View Slide

  48. EXPLAIN ANALYZE Example
    time is expressed in milliseconds
    Copyright @ 2022 Oracle and/or its affiliates.
    33

    View Slide

  49. EXPLAIN ANALYZE Example
    Copyright @ 2022 Oracle and/or its affiliates.
    34

    View Slide

  50. EXPLAIN ANALYZE Example
    Copyright @ 2022 Oracle and/or its affiliates.
    35

    View Slide

  51. EXPLAIN ANALYZE Example
    Copyright @ 2022 Oracle and/or its affiliates.
    36

    View Slide

  52. More on EXPLAIN
    h ps://dev.mysql.com/doc/refman/8.0/en/explain.html
    h ps://dev.mysql.com/doc/refman/8.0/en/explain-output.html
    h ps://dev.mysql.com/doc/refman/8.0/en/explain-extended.html
    h ps://dev.mysql.com/doc/workbench/en/wb-performance-explain.html
    Copyright @ 2022 Oracle and/or its affiliates.
    37

    View Slide

  53. Indexes
    nd rows with speci c column values quickly
    Copyright @ 2022 Oracle and/or its affiliates.
    38

    View Slide

  54. Indexes
    A database index is a data structure that improves the speed of data retrieval operations on
    a database table at the cost of additional writes and storage space to maintain the index
    data structure.
    Indexes are used to quickly locate data without having to search every row in a database
    table every time a database table is accessed.
    Indexes can be created using one or more columns of a database table, providing the basis
    for both rapid random lookups and e cient access of ordered records.
    (h ps://en.wikipedia.org/wiki/Database_index)
    Copyright @ 2022 Oracle and/or its affiliates.
    39

    View Slide

  55. MySQL supports multiple kind of indexes:
    primary key / clustered index
    secondary index
    full-text index
    spatial index
    Indexes in MySQL
    Copyright @ 2022 Oracle and/or its affiliates.
    40

    View Slide

  56. pre x of a column
    mutli-column
    unique
    covering
    functional
    multi-value
    Indexes in MySQL (2)
    MySQL supports 2 types of indexes:
    BTREE
    HASH
    and Indexes can have some "properties":
    Copyright @ 2022 Oracle and/or its affiliates.
    41

    View Slide

  57. Clustered Indexes
    Each InnoDB table has a special index called the clustered index that stores row data.
    Typically, the clustered index is synonymous with the primary key.
    Copyright @ 2022 Oracle and/or its affiliates.
    42

    View Slide

  58. Clustered Indexes
    Each InnoDB table has a special index called the clustered index that stores row data.
    Typically, the clustered index is synonymous with the primary key.
    Let's check now an example on how we usually mentally represent a table and an index.
    Copyright @ 2022 Oracle and/or its affiliates.
    42

    View Slide

  59. Mental representation of table and an index
    Table 1
    Indexed Column column 1 column 2 column x column n
    Index
    Copyright @ 2022 Oracle and/or its affiliates.
    43

    View Slide

  60. Mental representation of table and an index
    Table 1
    Index Indexed Column column 1 column 2 column x column n
    5001
    5001 a b 2022-03-14 NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    44

    View Slide

  61. Mental representation of a table and an index
    Table 1
    Index Indexed Column column 1 column 2 column x column n
    5001
    3
    3
    5001
    a
    a
    c
    b 2022-03-14
    2022-03-15
    NULL
    NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    45

    View Slide

  62. Mental representation of a table and an index
    Table 1
    Index Indexed Column column 1 column 2 column x column n
    5001
    3
    3
    6
    a
    a
    c
    b 2022-03-14
    2022-03-15
    NULL
    NULL
    6
    5001 be
    f 2022-03-15 1
    Copyright @ 2022 Oracle and/or its affiliates.
    46

    View Slide

  63. Mental representation of a table and an index
    Table 1
    Index Indexed Column column 1 column 2 column x column n
    5001
    3
    3
    6
    a
    a
    c
    b 2022-03-14
    2022-03-15
    NULL
    NULL
    6
    27 be
    f 2022-03-15 1
    27
    5001 it
    d 2022-03-16 101
    Copyright @ 2022 Oracle and/or its affiliates.
    47

    View Slide

  64. Mental representation of a table and an index
    Table 1
    Index Indexed Column column 1 column 2 column x column n
    5001
    3
    3
    6
    a
    a
    c
    b 2022-03-14
    2022-03-15
    NULL
    NULL
    6
    27
    be
    f 2022-03-15 1
    27
    12
    it
    d 2022-03-16 101
    12
    5001 uk
    e 2022-03-22 NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    48

    View Slide

  65. Mental representation of a table and an index
    Table 1
    Index Indexed Column column 1 column 2 column x column n
    5001
    3
    3
    6
    a
    a
    c
    b 2022-03-14
    2022-03-15
    NULL
    NULL
    6
    27
    be
    f 2022-03-15 1
    27
    12
    it
    d 2022-03-16 101
    12
    5001 uk
    e 2022-03-22 NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    49

    View Slide

  66. InnoDB representation of a table and PK
    Table 1
    PRIMARY KEY column 1 column 2 column x column n
    5001 a b 2022-03-14 NULL
    insert into table1 values
    (5001,'a','b','2022-03-14', NULL);
    Copyright @ 2022 Oracle and/or its affiliates.
    50

    View Slide

  67. InnoDB representation of a table and PK
    Table 1
    PRIMARY KEY column 1 column 2 column x column n
    5001 a b 2022-03-14 NULL
    insert into table1 values
    (3,'c','a','2022-03-15', NULL);
    3 a
    c 2022-03-15 NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    51

    View Slide

  68. InnoDB representation of a table and PK
    6 be
    f 2022-03-15 1
    Table 1
    PRIMARY KEY column 1 column 2 column x column n
    5001 a b 2022-03-14 NULL
    insert into table1 values
    (6,'f','be','2022-03-15', 1);
    3 a
    c 2022-03-15 NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    52

    View Slide

  69. InnoDB representation of a table and PK
    6 be
    f 2022-03-15 1
    Table 1
    PRIMARY KEY column 1 column 2 column x column n
    5001 a b 2022-03-14 NULL
    insert into table1 values
    (27,'d','it','2022-03-16', 101);
    3 a
    c 2022-03-15 NULL
    27 it
    d 2022-03-16 101
    Copyright @ 2022 Oracle and/or its affiliates.
    53

    View Slide

  70. InnoDB representation of a table and PK
    6 be
    f 2022-03-15 1
    Table 1
    PRIMARY KEY column 1 column 2 column x column n
    5001 a b 2022-03-14 NULL
    insert into table1 values
    (12,'e','uk','2022-03-22', NULL);
    3 a
    c 2022-03-15 NULL
    27 it
    d 2022-03-16 101
    12 uk
    e 2022-03-22 NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    54

    View Slide

  71. InnoDB representation of a table and PK
    Table 1
    PRIMARY KEY column 1 column 2 column x column n
    5001 a b 2022-03-14 NULL
    3 a
    c 2022-03-15 NULL
    6 be
    f 2022-03-15 1
    27 it
    d 2022-03-16 101
    12 uk
    e 2022-03-22 NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    55

    View Slide

  72. InnoDB representation of a table and PK
    Table 1
    PRIMARY KEY column 1 column 2 column x column n
    5001 a b 2022-03-14 NULL
    3 a
    c 2022-03-15 NULL
    6 be
    f 2022-03-15 1
    27 it
    d 2022-03-16 101
    12 uk
    e 2022-03-22 NULL
    This is the clustered index representation: stored by order of Primary Key
    Copyright @ 2022 Oracle and/or its affiliates.
    55

    View Slide

  73. InnoDB Primary Key
    InnoDB stores data in table spaces.
    And so far, we know that records are stored and sorted using the clustered index.
    The Primary Key is a key for the index that uniquely de ned for a row, should be
    immutable.
    InnoDB needs a PRIMARY KEY
    No NULL values are allowed
    Monotonically increasing
    use UID_To_BIN() if you must use UUIDs, otherwise avoid them
    Copyright @ 2022 Oracle and/or its affiliates.
    56

    View Slide

  74. InnoDB Primary Key (2)
    What we don't know is that all secondary indexes also contain the primary key as the right-
    most column in the index (even if this is not exposed). That means when a secondary index
    is used to retrieve a record, two indexes are used: rst the secondary one pointing to the
    primary key that will be used to nally retrieve the record.
    Copyright @ 2022 Oracle and/or its affiliates.
    57

    View Slide

  75. InnoDB Primary Key (2)
    What we don't know is that all secondary indexes also contain the primary key as the right-
    most column in the index (even if this is not exposed). That means when a secondary index
    is used to retrieve a record, two indexes are used: rst the secondary one pointing to the
    primary key that will be used to nally retrieve the record.
    When no primary key is de ned, the rst unique not null key is used. And if none is
    available, InnoDB will create an hidden primary key (6 bytes).
    Copyright @ 2022 Oracle and/or its affiliates.
    57

    View Slide

  76. InnoDB Primary Key (2)
    What we don't know is that all secondary indexes also contain the primary key as the right-
    most column in the index (even if this is not exposed). That means when a secondary index
    is used to retrieve a record, two indexes are used: rst the secondary one pointing to the
    primary key that will be used to nally retrieve the record.
    When no primary key is de ned, the rst unique not null key is used. And if none is
    available, InnoDB will create an hidden primary key (6 bytes).
    The problem with such key is that you don’t have any control on it and worse, this value is
    global to all tables without primary keys and can be a contention problem if you perform
    multiple simultaneous writes on such tables (dict_sys->mutex).
    Copyright @ 2022 Oracle and/or its affiliates.
    57

    View Slide

  77. InnoDB Primary Key (2)
    What we don't know is that all secondary indexes also contain the primary key as the right-
    most column in the index (even if this is not exposed). That means when a secondary index
    is used to retrieve a record, two indexes are used: rst the secondary one pointing to the
    primary key that will be used to nally retrieve the record.
    When no primary key is de ned, the rst unique not null key is used. And if none is
    available, InnoDB will create an hidden primary key (6 bytes).
    The problem with such key is that you don’t have any control on it and worse, this value is
    global to all tables without primary keys and can be a contention problem if you perform
    multiple simultaneous writes on such tables (dict_sys->mutex).
    If you use HA solutions, Primary Keys are mandatory !
    Copyright @ 2022 Oracle and/or its affiliates.
    57

    View Slide

  78. InnoDB Primary Key (3)
    Primary Keys impact how the values are inserted and the size of the secondary indexes. A
    non sequential PK can lead to many random IOPS.
    Copyright @ 2022 Oracle and/or its affiliates.
    58

    View Slide

  79. Also, it's more and more common to use
    applications that generate complete
    random primary keys...that means if the
    Primary Key is not sequential, InnoDB will
    have to heavily re-balance all the pages on
    inserts.
    InnoDB Primary Key (3)
    Primary Keys impact how the values are inserted and the size of the secondary indexes. A
    non sequential PK can lead to many random IOPS.
    Copyright @ 2022 Oracle and/or its affiliates.
    58

    View Slide

  80. InnoDB Primary Key (4)
    If we compare the same load (inserts) when using an auto_increment integer as Primary
    Key, we can see that only the latest pages are recently touched:
    Generated with h ps://github.com/jeremycole/innodb_ruby from @jeremycole
    Copyright @ 2022 Oracle and/or its affiliates.
    59

    View Slide

  81. InnoDB Primary Key (5)
    < >
    Copyright @ 2022 Oracle and/or its affiliates.
    My legacy application didn't define any Primary Key, adding one
    (auto_increment) breaks the application ! What can I do ?
    60

    View Slide

  82. InnoDB Primary Key (5)
    < >
    Copyright @ 2022 Oracle and/or its affiliates.
    My legacy application didn't define any Primary Key, adding one
    (auto_increment) breaks the application ! What can I do ?
    Easy, just create a new invisible column and define it as Primary Key !
    60

    View Slide

  83. select
    select *
    * from
    from actors
    actors;
    ;
    +
    +----------------+-----+
    ----------------+-----+
    |
    | name
    name |
    | age
    age |
    |
    +
    +----------------+-----+
    ----------------+-----+
    |
    | Al Pacino
    Al Pacino |
    | 80
    80 |
    |
    |
    | Robert De Niro
    Robert De Niro |
    | 77
    77 |
    |
    |
    | Joe Pesci
    Joe Pesci |
    | 78
    78 |
    |
    |
    | Sharon Stone
    Sharon Stone |
    | 63
    63 |
    |
    |
    | Diane Keaton
    Diane Keaton |
    | 75
    75 |
    |
    |
    | Talia Shire
    Talia Shire |
    | 74
    74 |
    |
    +
    +----------------+-----+
    ----------------+-----+
    Invisible column as Primary Key
    Copyright @ 2022 Oracle and/or its affiliates.
    61

    View Slide

  84. select
    select *
    * from
    from actors
    actors;
    ;
    +
    +----------------+-----+
    ----------------+-----+
    |
    | name
    name |
    | age
    age |
    |
    +
    +----------------+-----+
    ----------------+-----+
    |
    | Al Pacino
    Al Pacino |
    | 80
    80 |
    |
    |
    | Robert De Niro
    Robert De Niro |
    | 77
    77 |
    |
    |
    | Joe Pesci
    Joe Pesci |
    | 78
    78 |
    |
    |
    | Sharon Stone
    Sharon Stone |
    | 63
    63 |
    |
    |
    | Diane Keaton
    Diane Keaton |
    | 75
    75 |
    |
    |
    | Talia Shire
    Talia Shire |
    | 74
    74 |
    |
    +
    +----------------+-----+
    ----------------+-----+
    Do we have a Primary Key ?
    Invisible column as Primary Key
    Copyright @ 2022 Oracle and/or its affiliates.
    61

    View Slide

  85. Invisible column as Primary Key (2)
    Let's nd out by listing all tables where the clustered index was generated (internal hidden
    key):
    select
    select i
    i.
    .table_id
    table_id,
    , t
    t.
    .name
    name
    from
    from information_schema
    information_schema.
    .innodb_indexes i
    innodb_indexes i
    join
    join information_schema
    information_schema.
    .innodb_tables t
    innodb_tables t on
    on (
    (i
    i.
    .table_id
    table_id =
    = t
    t.
    .table_id
    table_id)
    )
    where
    where i
    i.
    .name
    name=
    ='GEN_CLUST_INDEX'
    'GEN_CLUST_INDEX';
    ;
    +
    +----------+------------------+
    ----------+------------------+
    |
    | table_id
    table_id |
    | name
    name |
    |
    +
    +----------+------------------+
    ----------+------------------+
    |
    | 1293
    1293 |
    | hollywood
    hollywood/
    /actors
    actors |
    |
    +
    +----------+------------------+
    ----------+------------------+
    1
    1 row
    row in
    in set
    set (
    (0.0211
    0.0211 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    62

    View Slide

  86. Invisible column as Primary Key (3)
    We can verify with the table's de nition:
    show
    show create
    create table
    table actors\G
    actors\G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    Table
    Table: actors
    : actors
    Create
    Create Table
    Table:
    : CREATE
    CREATE TABLE
    TABLE `
    `actors
    actors`
    ` (
    (
    `
    `name
    name`
    ` varchar
    varchar(
    (20
    20)
    ) DEFAULT
    DEFAULT NULL
    NULL,
    ,
    `
    `age
    age`
    ` int
    int unsigned
    unsigned DEFAULT
    DEFAULT NULL
    NULL
    )
    ) ENGINE
    ENGINE=
    =InnoDB
    InnoDB DEFAULT
    DEFAULT CHARSET
    CHARSET=
    =utf8mb4
    utf8mb4 COLLATE
    COLLATE=
    =utf8mb4_0900_ai_ci
    utf8mb4_0900_ai_ci
    Copyright @ 2022 Oracle and/or its affiliates.
    63

    View Slide

  87. Invisible column as Primary Key (3)
    We can verify with the table's de nition:
    show
    show create
    create table
    table actors\G
    actors\G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    Table
    Table: actors
    : actors
    Create
    Create Table
    Table:
    : CREATE
    CREATE TABLE
    TABLE `
    `actors
    actors`
    ` (
    (
    `
    `name
    name`
    ` varchar
    varchar(
    (20
    20)
    ) DEFAULT
    DEFAULT NULL
    NULL,
    ,
    `
    `age
    age`
    ` int
    int unsigned
    unsigned DEFAULT
    DEFAULT NULL
    NULL
    )
    ) ENGINE
    ENGINE=
    =InnoDB
    InnoDB DEFAULT
    DEFAULT CHARSET
    CHARSET=
    =utf8mb4
    utf8mb4 COLLATE
    COLLATE=
    =utf8mb4_0900_ai_ci
    utf8mb4_0900_ai_ci
    Now let's add a hidden column as primary key:
    alter
    alter table
    table actors
    actors add
    add id
    id int
    int unsigned
    unsigned auto_increment
    auto_increment primary
    primary key
    key invisible
    invisible first
    first;
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    63

    View Slide

  88. select
    select *
    * from
    from actors
    actors;
    ;
    +
    +----------------+-----+
    ----------------+-----+
    |
    | name
    name |
    | age
    age |
    |
    +
    +----------------+-----+
    ----------------+-----+
    |
    | Al Pacino
    Al Pacino |
    | 80
    80 |
    |
    |
    | Robert De Niro
    Robert De Niro |
    | 77
    77 |
    |
    |
    | Joe Pesci
    Joe Pesci |
    | 78
    78 |
    |
    |
    | Sharon Stone
    Sharon Stone |
    | 63
    63 |
    |
    |
    | Diane Keaton
    Diane Keaton |
    | 75
    75 |
    |
    |
    | Talia Shire
    Talia Shire |
    | 74
    74 |
    |
    +
    +----------------+-----+
    ----------------+-----+
    Invisible column as Primary Key (4)
    We can now test again our application's queries:
    Copyright @ 2022 Oracle and/or its affiliates.
    64

    View Slide

  89. select
    select *
    * from
    from actors
    actors;
    ;
    +
    +----------------+-----+
    ----------------+-----+
    |
    | name
    name |
    | age
    age |
    |
    +
    +----------------+-----+
    ----------------+-----+
    |
    | Al Pacino
    Al Pacino |
    | 80
    80 |
    |
    |
    | Robert De Niro
    Robert De Niro |
    | 77
    77 |
    |
    |
    | Joe Pesci
    Joe Pesci |
    | 78
    78 |
    |
    |
    | Sharon Stone
    Sharon Stone |
    | 63
    63 |
    |
    |
    | Diane Keaton
    Diane Keaton |
    | 75
    75 |
    |
    |
    | Talia Shire
    Talia Shire |
    | 74
    74 |
    |
    +
    +----------------+-----+
    ----------------+-----+
    show
    show create
    create table
    table actors\G
    actors\G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    Table
    Table: actors
    : actors
    Create
    Create Table
    Table:
    : CREATE
    CREATE TABLE
    TABLE `
    `actors
    actors`
    ` (
    (
    `
    `id
    id`
    ` int
    int unsigned
    unsigned NOT
    NOT NULL
    NULL
    AUTO_INCREMENT
    AUTO_INCREMENT /*!80023 INVISIBLE */
    /*!80023 INVISIBLE */,
    ,
    `
    `name
    name`
    ` varchar
    varchar(
    (20
    20)
    ) DEFAULT
    DEFAULT NULL
    NULL,
    ,
    `
    `age
    age`
    ` int
    int unsigned
    unsigned DEFAULT
    DEFAULT NULL
    NULL,
    ,
    PRIMARY
    PRIMARY KEY
    KEY (
    (`
    `id
    id`
    `)
    )
    )
    ) ENGINE
    ENGINE=
    =InnoDB
    InnoDB AUTO_INCREMENT
    AUTO_INCREMENT=
    =7
    7 DEFAULT
    DEFAULT
    CHARSET
    CHARSET=
    =utf8mb4
    utf8mb4
    COLLATE
    COLLATE=
    =utf8mb4_0900_ai_ci
    utf8mb4_0900_ai_ci
    Invisible column as Primary Key (4)
    We can now test again our application's queries:
    Copyright @ 2022 Oracle and/or its affiliates.
    64

    View Slide

  90. Invisible column as Primary Key (5)
    < >
    Copyright @ 2022 Oracle and/or its affiliates.
    Great ! But what about inserts without specifying the columns ?
    65

    View Slide

  91. Invisible column as Primary Key (5)
    < >
    insert
    insert into
    into actors
    actors values
    values (
    ('James Caan'
    'James Caan',
    , 81
    81)
    );
    ;
    Query OK
    Query OK,
    , 1
    1 row
    row affected
    affected (
    (0.0248
    0.0248 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    Great ! But what about inserts without specifying the columns ?
    65

    View Slide

  92. select
    select id
    id,
    , a
    a.
    .*
    * from
    from actors a
    actors a;
    ;
    +
    +----+----------------+-----+
    ----+----------------+-----+
    |
    | id
    id |
    | name
    name |
    | age
    age |
    |
    +
    +----+----------------+-----+
    ----+----------------+-----+
    |
    | 1
    1 |
    | Al Pacino
    Al Pacino |
    | 80
    80 |
    |
    |
    | 2
    2 |
    | Robert De Niro
    Robert De Niro |
    | 77
    77 |
    |
    |
    | 3
    3 |
    | Joe Pesci
    Joe Pesci |
    | 78
    78 |
    |
    |
    | 4
    4 |
    | Sharon Stone
    Sharon Stone |
    | 63
    63 |
    |
    |
    | 5
    5 |
    | Diane Keaton
    Diane Keaton |
    | 75
    75 |
    |
    |
    | 6
    6 |
    | Talia Shire
    Talia Shire |
    | 74
    74 |
    |
    |
    | 7
    7 |
    | James Caan
    James Caan |
    | 81
    81 |
    |
    +
    +----+----------------+-----+
    ----+----------------+-----+
    Invisible column as Primary Key (6)
    But if needed we have access to that PK id:
    Copyright @ 2022 Oracle and/or its affiliates.
    66

    View Slide

  93. select
    select id
    id,
    , a
    a.
    .*
    * from
    from actors a
    actors a;
    ;
    +
    +----+----------------+-----+
    ----+----------------+-----+
    |
    | id
    id |
    | name
    name |
    | age
    age |
    |
    +
    +----+----------------+-----+
    ----+----------------+-----+
    |
    | 1
    1 |
    | Al Pacino
    Al Pacino |
    | 80
    80 |
    |
    |
    | 2
    2 |
    | Robert De Niro
    Robert De Niro |
    | 77
    77 |
    |
    |
    | 3
    3 |
    | Joe Pesci
    Joe Pesci |
    | 78
    78 |
    |
    |
    | 4
    4 |
    | Sharon Stone
    Sharon Stone |
    | 63
    63 |
    |
    |
    | 5
    5 |
    | Diane Keaton
    Diane Keaton |
    | 75
    75 |
    |
    |
    | 6
    6 |
    | Talia Shire
    Talia Shire |
    | 74
    74 |
    |
    |
    | 7
    7 |
    | James Caan
    James Caan |
    | 81
    81 |
    |
    +
    +----+----------------+-----+
    ----+----------------+-----+
    And this id is sequential, used as clustered
    index to store the data and externalized for
    replication !
    Invisible column as Primary Key (6)
    But if needed we have access to that PK id:
    Copyright @ 2022 Oracle and/or its affiliates.
    66

    View Slide

  94. Invisible column as Primary Key (7)
    Since MySQL 8.0.30 you can also enable GIPK mode !
    GIPK mode is controlled by the sql_generate_invisible_primary_key server
    system variable.
    When MySQL is running in GIPK mode, a primary key is added to a table by the server, the
    column and key name is always my_row_id.
    Copyright @ 2022 Oracle and/or its affiliates.
    67

    View Slide

  95. InnoDB Secondary Key
    Indexes other than the clustered index are known as secondary indexes.
    Remember that in InnoDB, each record in a secondary index contains the primary key
    columns for the row (right most), as well as the columns speci ed for the secondary
    index.
    InnoDB uses this primary key value to search for the row in the clustered index.
    If the Primary Key is long, the secondary indexes use more space. It's advantageous to
    have a short Primary Key.
    Copyright @ 2022 Oracle and/or its affiliates.
    68

    View Slide

  96. Indexing on a pre x of a column
    create
    create index
    index part_of_name
    part_of_name on
    on city
    city (
    (name
    name(
    (10
    10)
    ))
    );
    ;
    Only the rst 10 characters are indexed in this example and this can save space/speed.
    Copyright @ 2022 Oracle and/or its affiliates.
    69

    View Slide

  97. Indexing on a pre x of a column (2)
    Let's compare the between this pre x index and an index using the full column:
    select
    select database_name
    database_name,
    , table_name
    table_name,
    , index_name
    index_name,
    ,
    stat_value
    stat_value *
    * @
    @@innodb_page_size
    @innodb_page_size as
    as size_in_bytes
    size_in_bytes
    from
    from mysql
    mysql.
    .innodb_index_stats
    innodb_index_stats
    where
    where stat_name
    stat_name =
    = 'size'
    'size' and
    and database_name
    database_name=
    ='world'
    'world' and
    and table_name
    table_name=
    ='city'
    'city'
    and
    and index_name
    index_name like
    like '%name%'
    '%name%'
    order
    order by
    by size_in_bytes
    size_in_bytes desc
    desc;
    ;
    +
    +---------------+------------+--------------+---------------+
    ---------------+------------+--------------+---------------+
    |
    | database_name
    database_name |
    | table_name
    table_name |
    | index_name
    index_name |
    | size_in_bytes
    size_in_bytes |
    |
    +
    +---------------+------------+--------------+---------------+
    ---------------+------------+--------------+---------------+
    |
    | world
    world |
    | city
    city |
    | name_idx
    name_idx |
    | 212992
    212992 |
    |
    |
    | world
    world |
    | city
    city |
    | part_of_name
    part_of_name |
    | 114688
    114688 |
    |
    +
    +---------------+------------+--------------+---------------+
    ---------------+------------+--------------+---------------+
    Copyright @ 2022 Oracle and/or its affiliates.
    70

    View Slide

  98. Indexing on a pre x of a column (3)
    Copyright @ 2022 Oracle and/or its affiliates.
    71

    View Slide

  99. Indexing on a pre x of a column (4)
    Copyright @ 2022 Oracle and/or its affiliates.
    72

    View Slide

  100. Indexing on a pre x of a column (4)
    We see that both indexes on name are candidates and the partial one got the preference.
    Copyright @ 2022 Oracle and/or its affiliates.
    72

    View Slide

  101. Index key_len
    What does that 40 mean ?
    Copyright @ 2022 Oracle and/or its affiliates.
    73

    View Slide

  102. Index key_len
    What does that 40 mean ?
    The key_len column indicates the length of the key that MySQL decided to use.
    Copyright @ 2022 Oracle and/or its affiliates.
    73

    View Slide

  103. Index key_len (2)
    < >
    Copyright @ 2022 Oracle and/or its affiliates.
    Oh... Okay... but why 40 ? It doesn't make any sense, does it ?
    74

    View Slide

  104. Index key_len (2)
    < >
    Copyright @ 2022 Oracle and/or its affiliates.
    Oh... Okay... but why 40 ? It doesn't make any sense, does it ?
    In fact, we indexed the first 10 characters of the 'name' column... but this uses
    utf8mb4 charset: 1 character is encoded in up to 4 bytes
    10 x 4 bytes = 40 bytes per record in the index
    74

    View Slide

  105. Multi-column Index
    It's also possible to index multiple columns in one single index:
    create
    create index
    index first_last_idx
    first_last_idx on
    on employees
    employees (
    (first_name
    first_name,
    , last_name
    last_name)
    );
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    75

    View Slide

  106. Multi-column Index
    It's also possible to index multiple columns in one single index:
    create
    create index
    index first_last_idx
    first_last_idx on
    on employees
    employees (
    (first_name
    first_name,
    , last_name
    last_name)
    );
    ;
    This index will be work on ( rst_name, lastname) and ( rst_name) but not on
    (last_name).
    Put highest cardinality eld rst !
    Copyright @ 2022 Oracle and/or its affiliates.
    75

    View Slide

  107. Multi-column Index
    It's also possible to index multiple columns in one single index:
    create
    create index
    index first_last_idx
    first_last_idx on
    on employees
    employees (
    (first_name
    first_name,
    , last_name
    last_name)
    );
    ;
    This index will be work on ( rst_name, lastname) and ( rst_name) but not on
    (last_name).
    Put highest cardinality eld rst !
    Indexes are parsed from left to right
    Copyright @ 2022 Oracle and/or its affiliates.
    75

    View Slide

  108. Multi-column Index Example
    The value of key_len enables you to determine how many parts of a multiple-part key
    MySQL actually uses.
    Copyright @ 2022 Oracle and/or its affiliates.
    76

    View Slide

  109. Multi-column Index Example
    The value of key_len enables you to determine how many parts of a multiple-part key
    MySQL actually uses.
    Copyright @ 2022 Oracle and/or its affiliates.
    show create table employees\G
    *************************** 1. row ***************************
    Table: employees
    Create Table: CREATE TABLE `employees` (
    `emp_no` int NOT NULL,
    `birth_date` date NOT NULL,
    `first_name` varchar(14) NOT NULL,
    `last_name` varchar(16) NOT NULL,
    `gender` enum('M','F') NOT NULL,
    `hire_date` date NOT NULL,
    PRIMARY KEY (`emp_no`),
    KEY `first_last_idx` (`first_name`,`last_name`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
    select (14*4)+2+(16*4)+2;
    +-------------------+
    | (14*4)+2+(16*4)+2 | (+2 --> VARCHAR's length is coded on 2 bytes)
    +-------------------+
    | 124 |
    +-------------------+
    76

    View Slide

  110. Multi-column Index Example (2)
    Copyright @ 2022 Oracle and/or its affiliates.
    77

    View Slide

  111. Multi-column Index Example (2)
    14 x 4 + 2 = 58
    Copyright @ 2022 Oracle and/or its affiliates.
    77

    View Slide

  112. Multi-column Index Example (3)
    Copyright @ 2022 Oracle and/or its affiliates.
    78

    View Slide

  113. Multi-column Index Example (3)
    the left-most part of the index cannot be used
    --> the index is not used
    Copyright @ 2022 Oracle and/or its affiliates.
    78

    View Slide

  114. select
    select emp_no
    emp_no,
    , first_name
    first_name,
    , last_name
    last_name,
    ,
    hire_date
    hire_date
    from
    from employees
    employees
    where
    where last_name
    last_name like
    like 'de%'
    'de%' limit
    limit 10
    10;
    ;
    select
    select emp_no
    emp_no,
    , first_name
    first_name,
    , last_name
    last_name,
    ,
    hire_date
    hire_date
    from
    from employees
    employees
    where
    where last_name
    last_name like
    like 'de%'
    'de%'
    order
    order by
    by first_name
    first_name limit
    limit 10
    10;
    ;
    Multi-column Index: challenge
    What do you think about these two statements:
    [A] none uses the index
    [B] the left-one uses the index
    [C] the right-one uses the index
    Copyright @ 2022 Oracle and/or its affiliates.
    79

    View Slide

  115. select
    select emp_no
    emp_no,
    , first_name
    first_name,
    , last_name
    last_name,
    ,
    hire_date
    hire_date
    from
    from employees
    employees
    where
    where last_name
    last_name like
    like 'de%'
    'de%' limit
    limit 10
    10;
    ;
    select
    select emp_no
    emp_no,
    , first_name
    first_name,
    , last_name
    last_name,
    ,
    hire_date
    hire_date
    from
    from employees
    employees
    where
    where last_name
    last_name like
    like 'de%'
    'de%'
    order
    order by
    by first_name
    first_name limit
    limit 10
    10;
    ;
    Multi-column Index: challenge
    What do you think about these two statements:
    [A] none uses the index
    [B] the left-one uses the index
    [C] the right-one uses the index
    Copyright @ 2022 Oracle and/or its affiliates.
    79

    View Slide

  116. Multi-column Index: hashing values
    If you need to perform search of the exact value (not using like or range) of multiple large
    columns, some times it could be more e cient to use a hash function and index its result:
    alter
    alter table
    table employees
    employees add
    add column
    column hash_bin_names
    hash_bin_names binary
    binary(
    (16
    16)
    )
    generated always
    generated always as
    as (
    (unhex
    unhex(
    (md5
    md5(
    (concat
    concat(
    (first_name
    first_name,
    , last_name
    last_name)
    ))
    ))
    ))
    ) virtual
    virtual,
    ,
    add
    add key
    key hash_bin_idx
    hash_bin_idx(
    (hash_bin_names
    hash_bin_names)
    );
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    80

    View Slide

  117. Multi-column Index: hashing values
    And now let's rewrite the query and exame the QEP:
    explain
    explain select
    select emp_no
    emp_no,
    , first_name
    first_name,
    , last_name
    last_name,
    , hire_date
    hire_date
    from
    from employees
    employees
    where
    where hash_bin_names
    hash_bin_names=
    =unhex
    unhex(
    (md5
    md5(
    ('AamodDeville'
    'AamodDeville')
    ))
    )
    and
    and first_name
    first_name=
    ='Aamod'
    'Aamod' and
    and last_name
    last_name like
    like 'Deville'
    'Deville' order
    order by
    by first_name
    first_name limit
    limit 10
    10\G
    \G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: employees
    : employees
    partitions:
    partitions: NULL
    NULL
    type
    type: ref
    : ref
    possible_keys: first_last_idx
    possible_keys: first_last_idx,
    ,hash_bin_idx
    hash_bin_idx
    key
    key: hash_bin_idx
    : hash_bin_idx
    key_len:
    key_len: 17
    17
    ref: const
    ref: const
    rows
    rows:
    : 1
    1
    filtered:
    filtered: 5
    5
    Extra:
    Extra: Using
    Using where
    where
    Copyright @ 2022 Oracle and/or its affiliates.
    81

    View Slide

  118. Functional Indexes
    MySQL supports functional key parts that index expression values rather than column or
    column pre x values.
    Use of functional key parts enables indexing of values not stored directly in the table.
    Copyright @ 2022 Oracle and/or its affiliates.
    82

    View Slide

  119. Functional Indexes
    MySQL supports functional key parts that index expression values rather than column or
    column pre x values.
    Use of functional key parts enables indexing of values not stored directly in the table.
    Let's suppose we want to retriev all employees that were hired in March:
    select
    select first_name
    first_name,
    , hire_date
    hire_date from
    from employees
    employees where
    where month
    month(
    (hire_date
    hire_date)
    )=
    =3
    3;
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    82

    View Slide

  120. Functional Indexes (2)
    Get the Query Execution Plan:
    explain
    explain select
    select first_name
    first_name,
    , hire_date
    hire_date from
    from employees
    employees where
    where month
    month(
    (hire_date
    hire_date)
    )=
    =3
    3\G
    \G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: employees
    : employees
    partitions:
    partitions: NULL
    NULL
    type
    type:
    : ALL
    ALL
    possible_keys:
    possible_keys: NULL
    NULL
    key
    key:
    : NULL
    NULL
    key_len:
    key_len: NULL
    NULL
    ref:
    ref: NULL
    NULL
    rows
    rows:
    : 299379
    299379
    filtered:
    filtered: 100
    100
    Extra:
    Extra: Using
    Using where
    where
    Copyright @ 2022 Oracle and/or its affiliates.
    83

    View Slide

  121. Functional Indexes (2)
    Get the Query Execution Plan:
    explain
    explain select
    select first_name
    first_name,
    , hire_date
    hire_date from
    from employees
    employees where
    where month
    month(
    (hire_date
    hire_date)
    )=
    =3
    3\G
    \G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: employees
    : employees
    partitions:
    partitions: NULL
    NULL
    type
    type:
    : ALL
    ALL
    possible_keys:
    possible_keys: NULL
    NULL
    key
    key:
    : NULL
    NULL
    key_len:
    key_len: NULL
    NULL
    ref:
    ref: NULL
    NULL
    rows
    rows:
    : 299379
    299379
    filtered:
    filtered: 100
    100
    Extra:
    Extra: Using
    Using where
    where
    FULL TABLE SCAN !
    Copyright @ 2022 Oracle and/or its affiliates.
    83

    View Slide

  122. Functional Indexes (3)
    create
    create index
    index month_hire_idx
    month_hire_idx
    on
    on employees
    employees (
    ((
    (month
    month(
    (hire_date
    hire_date)
    ))
    ))
    );
    ; please mind the
    please mind the (
    ((
    (.
    ..
    ..
    .)
    ))
    ) notation
    notation
    Copyright @ 2022 Oracle and/or its affiliates.
    84

    View Slide

  123. Functional Indexes (3)
    create
    create index
    index month_hire_idx
    month_hire_idx
    on
    on employees
    employees (
    ((
    (month
    month(
    (hire_date
    hire_date)
    ))
    ))
    );
    ; please mind the
    please mind the (
    ((
    (.
    ..
    ..
    .)
    ))
    ) notation
    notation
    explain
    explain select
    select first_name
    first_name,
    , hire_date
    hire_date from
    from employees
    employees where
    where month
    month(
    (hire_date
    hire_date)
    )=
    =3
    3\G
    \G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: employees
    : employees
    partitions:
    partitions: NULL
    NULL
    type
    type: ref
    : ref
    possible_keys: month_hire_idx
    possible_keys: month_hire_idx
    key
    key: month_hire_idx
    : month_hire_idx
    key_len:
    key_len: 5
    5
    ref: const
    ref: const
    rows
    rows:
    : 51684
    51684
    filtered:
    filtered: 100
    100
    Extra:
    Extra: NULL
    NULL
    Copyright @ 2022 Oracle and/or its affiliates.
    84

    View Slide

  124. Please Keep in Mind...
    If there is a choice between multiple indexes, MySQL normally uses the index that nds the
    smallest number of rows (the most selective index).
    MySQL can use indexes on columns more e ciently if they are declared as the same type
    and size.
    Copyright @ 2022 Oracle and/or its affiliates.
    85

    View Slide

  125. NULL is used to designate a LACK of data: False 0
    True 1
    Don't know NULL
    NULL
    Indexing NULL values really drives down the performances of indexes.
    Copyright @ 2022 Oracle and/or its affiliates.
    86

    View Slide

  126. Before Invisible Indexes
    . doubt usefulness of index
    . remove that index
    . get phone/text/screams from power user about slow
    performance
    . the rest of the planet seems to need that dang index !
    . recreate the index... and it can take a looooooong time
    After Invisible Indexes
    . doubt usefulness of index
    . make index invisible - optimizer can not see it!
    . get phone/text/screams from power user about slow
    performance
    . make index visible again
    . blame a problem on { network | hardware | cloud | a colleague}
    Invisible Indexes
    MySQL o ers the possibility to hide indexes from the optimizer.
    This feature is very useful for testing the relevance of indexes before deleting them.
    And very useful for the operations team.
    Copyright @ 2022 Oracle and/or its affiliates.
    87

    View Slide

  127. How to use INVISIBLE INDEX
    alter
    alter table
    table employees
    employees alter
    alter index
    index first_last_idx invisible
    first_last_idx invisible;
    ;
    alter
    alter table
    table employees
    employees alter
    alter index
    index first_last_idx visible
    first_last_idx visible;
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    88

    View Slide

  128. How to use INVISIBLE INDEX
    alter
    alter table
    table employees
    employees alter
    alter index
    index first_last_idx invisible
    first_last_idx invisible;
    ;
    alter
    alter table
    table employees
    employees alter
    alter index
    index first_last_idx visible
    first_last_idx visible;
    ;
    List all invisible indexes:
    select
    select table_name
    table_name,
    , index_name
    index_name,
    , is_visible
    is_visible
    from
    from information_schema
    information_schema.
    .statistics
    statistics
    where
    where is_visible
    is_visible=
    ='no'
    'no' group
    group by
    by table_name
    table_name,
    , index_name
    index_name;
    ;
    +
    +------------+----------------+------------+
    ------------+----------------+------------+
    |
    | TABLE_NAME
    TABLE_NAME |
    | INDEX_NAME
    INDEX_NAME |
    | IS_VISIBLE
    IS_VISIBLE |
    |
    +
    +------------+----------------+------------+
    ------------+----------------+------------+
    |
    | employees
    employees |
    | first_last_idx
    first_last_idx |
    | NO
    NO |
    |
    +
    +------------+----------------+------------+
    ------------+----------------+------------+
    Copyright @ 2022 Oracle and/or its affiliates.
    88

    View Slide

  129. Unused Indexes
    We learned that updating indexes that are not used can be expensive and increase the iops.
    Using sys Schema and innodb_index_stats it's possible to identify those unused
    indexes:
    select
    select database_name
    database_name,
    , table_name
    table_name,
    , t1
    t1.
    .index_name
    index_name,
    ,
    format_bytes
    format_bytes(
    (stat_value
    stat_value *
    * @
    @@innodb_page_size
    @innodb_page_size)
    ) size
    size
    from
    from mysql
    mysql.
    .innodb_index_stats t1
    innodb_index_stats t1
    join
    join sys
    sys.
    .schema_unused_indexes t2
    schema_unused_indexes t2 on
    on object_schema
    object_schema=
    =database_name
    database_name
    and
    and object_name
    object_name=
    =table_name
    table_name and
    and t2
    t2.
    .index_name
    index_name=
    =t1
    t1.
    .index_name
    index_name
    where
    where stat_name
    stat_name=
    ='size'
    'size' order
    order by
    by stat_value
    stat_value desc
    desc;
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    89

    View Slide

  130. Unused Indexes
    We learned that updating indexes that are not used can be expensive and increase the iops.
    Using sys Schema and innodb_index_stats it's possible to identify those unused
    indexes:
    select
    select database_name
    database_name,
    , table_name
    table_name,
    , t1
    t1.
    .index_name
    index_name,
    ,
    format_bytes
    format_bytes(
    (stat_value
    stat_value *
    * @
    @@innodb_page_size
    @innodb_page_size)
    ) size
    size
    from
    from mysql
    mysql.
    .innodb_index_stats t1
    innodb_index_stats t1
    join
    join sys
    sys.
    .schema_unused_indexes t2
    schema_unused_indexes t2 on
    on object_schema
    object_schema=
    =database_name
    database_name
    and
    and object_name
    object_name=
    =table_name
    table_name and
    and t2
    t2.
    .index_name
    index_name=
    =t1
    t1.
    .index_name
    index_name
    where
    where stat_name
    stat_name=
    ='size'
    'size' order
    order by
    by stat_value
    stat_value desc
    desc;
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    select
    select database_name
    database_name,
    , table_name
    table_name,
    , t1
    t1.
    .index_name
    index_name,
    ,
    format_bytes
    format_bytes(
    (stat_value
    stat_value *
    * @
    @@innodb_page_size
    @innodb_page_size)
    ) size
    size
    from
    from mysql
    mysql.
    .innodb_index_stats t1
    innodb_index_stats t1
    join
    join sys
    sys.
    .schema_unused_indexes t2
    schema_unused_indexes t2 on
    on object_schema
    object_schema=
    =database_name
    database_name
    and
    and object_name
    object_name=
    =table_name
    table_name and
    and t2
    t2.
    .index_name
    index_name=
    =t1
    t1.
    .index_name
    index_name
    where
    where stat_name
    stat_name=
    ='size'
    'size' and
    and database_name
    database_name=
    ="employees"
    "employees" order
    order by
    by stat_value
    stat_value desc
    desc;
    ;
    +
    +---------------+--------------+---------------------+-----------+
    ---------------+--------------+---------------------+-----------+
    |
    | database_name
    database_name |
    | table_name
    table_name |
    | index_name
    index_name |
    | size
    size |
    |
    +
    +---------------+--------------+---------------------+-----------+
    ---------------+--------------+---------------------+-----------+
    |
    | employees
    employees |
    | employees
    employees |
    | hash_bin_names2
    hash_bin_names2 |
    | 9.52
    9.52 MiB
    MiB |
    |
    |
    | employees
    employees |
    | employees
    employees |
    | month_year_hire_idx
    month_year_hire_idx |
    | 6.52
    6.52 MiB
    MiB |
    |
    |
    | employees
    employees |
    | dept_emp
    dept_emp |
    | dept_no
    dept_no |
    | 5.52
    5.52 MiB
    MiB |
    |
    |
    | employees
    employees |
    | dept_manager
    dept_manager |
    | dept_no
    dept_no |
    | 16.00
    16.00 KiB
    KiB |
    |
    +
    +---------------+--------------+---------------------+-----------+
    ---------------+--------------+---------------------+-----------+
    4
    4 rows
    rows in
    in set
    set (
    (0.0252
    0.0252 sec
    sec)
    )
    89

    View Slide

  131. Duplicate Indexes
    And this is the same behaviour for duplicate indexes. There is no reason to keep
    maintaining them:
    select
    select t2
    t2.
    .*
    *,
    , format_bytes
    format_bytes(
    (stat_value
    stat_value *
    * @
    @@innodb_page_size
    @innodb_page_size)
    ) size
    size
    from
    from mysql
    mysql.
    .innodb_index_stats t1
    innodb_index_stats t1
    join
    join sys
    sys.
    .schema_redundant_indexes t2
    schema_redundant_indexes t2
    on
    on table_schema
    table_schema=
    =database_name
    database_name and
    and t2
    t2.
    .table_name
    table_name=
    =t1
    t1.
    .table_name
    table_name
    and
    and t2
    t2.
    .redundant_index_name
    redundant_index_name=
    =t1
    t1.
    .index_name
    index_name
    where
    where stat_name
    stat_name=
    ='size'
    'size' order
    order by
    by stat_value
    stat_value desc
    desc\G
    \G
    Copyright @ 2022 Oracle and/or its affiliates.
    90

    View Slide

  132. Duplicate Indexes
    And this is the same behaviour for duplicate indexes. There is no reason to keep
    maintaining them:
    select
    select t2
    t2.
    .*
    *,
    , format_bytes
    format_bytes(
    (stat_value
    stat_value *
    * @
    @@innodb_page_size
    @innodb_page_size)
    ) size
    size
    from
    from mysql
    mysql.
    .innodb_index_stats t1
    innodb_index_stats t1
    join
    join sys
    sys.
    .schema_redundant_indexes t2
    schema_redundant_indexes t2
    on
    on table_schema
    table_schema=
    =database_name
    database_name and
    and t2
    t2.
    .table_name
    table_name=
    =t1
    t1.
    .table_name
    table_name
    and
    and t2
    t2.
    .redundant_index_name
    redundant_index_name=
    =t1
    t1.
    .index_name
    index_name
    where
    where stat_name
    stat_name=
    ='size'
    'size' order
    order by
    by stat_value
    stat_value desc
    desc\G
    \G
    Copyright @ 2022 Oracle and/or its affiliates.
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    table_schema: world
    table_schema: world
    table_name: city
    table_name: city
    redundant_index_name: part_of_name
    redundant_index_name: part_of_name
    redundant_index_columns: Name
    redundant_index_columns: Name
    redundant_index_non_unique:
    redundant_index_non_unique: 1
    1
    dominant_index_name: name_idx
    dominant_index_name: name_idx
    dominant_index_columns: Name
    dominant_index_columns: Name
    dominant_index_non_unique:
    dominant_index_non_unique: 1
    1
    subpart_exists:
    subpart_exists: 1
    1
    sql_drop_index:
    sql_drop_index: ALTER
    ALTER TABLE
    TABLE `
    `world
    world`
    `.
    .`
    `city
    city`
    ` DROP
    DROP INDEX
    INDEX `
    `part_of_name
    part_of_name`
    `
    size:
    size: 112.00
    112.00 KiB
    KiB
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 2.
    2. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    table_schema: world
    table_schema: world
    table_name: countrylanguage
    table_name: countrylanguage
    redundant_index_name: CountryCode
    redundant_index_name: CountryCode
    redundant_index_columns: CountryCode
    redundant_index_columns: CountryCode
    redundant_index_non_unique:
    redundant_index_non_unique: 1
    1
    dominant_index_name:
    dominant_index_name: PRIMARY
    PRIMARY
    dominant_index_columns: CountryCode
    dominant_index_columns: CountryCode,
    ,Language
    Language
    dominant_index_non_unique:
    dominant_index_non_unique: 0
    0
    subpart_exists:
    subpart_exists: 0
    0
    sql_drop_index:
    sql_drop_index: ALTER
    ALTER TABLE
    TABLE `
    `world
    world`
    `.
    .`
    `countrylanguage
    countrylanguage`
    ` DROP
    DROP INDEX
    INDEX `
    `CountryCode
    CountryCode`
    `
    size:
    size: 64.00
    64.00 KiB
    KiB
    2
    2 rows
    rows in
    in set
    set (
    (0.0330
    0.0330 sec
    sec)
    )
    90

    View Slide

  133. Don't forget !
    Do not take recommendations at face value, check before deleting an index.
    Do not delete an index immediately, but rst set it as INVISIBLE for some time. Once in a
    while this index might be used, like for a monthly report.
    Copyright @ 2022 Oracle and/or its affiliates.
    91

    View Slide

  134. Don't forget !
    Do not take recommendations at face value, check before deleting an index.
    Do not delete an index immediately, but rst set it as INVISIBLE for some time. Once in a
    while this index might be used, like for a monthly report.
    Copyright @ 2022 Oracle and/or its affiliates.
    But when I add or remove an Index, can I estimate the time left ?
    91

    View Slide

  135. ALTER Progression
    select
    select stmt
    stmt.
    .thread_id
    thread_id,
    , stmt
    stmt.
    .sql_text
    sql_text,
    , stage
    stage.
    .event_name
    event_name as
    as state
    state,
    ,
    stage
    stage.
    .work_completed
    work_completed,
    , stage
    stage.
    .work_estimated
    work_estimated,
    ,
    lpad
    lpad(
    (concat
    concat(
    (round
    round(
    (100
    100*
    *stage
    stage.
    .work_completed
    work_completed/
    /stage
    stage.
    .work_estimated
    work_estimated,
    , 2
    2)
    ),
    ,"%"
    "%")
    ),
    ,10
    10,
    ," "
    " ")
    )
    as
    as completed_at
    completed_at,
    ,
    lpad
    lpad(
    (format_pico_time
    format_pico_time(
    (stmt
    stmt.
    .timer_wait
    timer_wait)
    ),
    , 10
    10,
    , " "
    " ")
    ) as
    as started_ago
    started_ago,
    ,
    lpad
    lpad(
    (format_pico_time
    format_pico_time(
    (stmt
    stmt.
    .timer_wait
    timer_wait/
    /round
    round(
    (100
    100*
    *stage
    stage.
    .work_completed
    work_completed/
    /stage
    stage.
    .work_estimated
    work_estimated,
    ,2
    2)
    )*
    *100
    100)
    ),
    ,
    10
    10,
    , " "
    " ")
    ) as
    as estimated_full_time
    estimated_full_time,
    ,
    lpad
    lpad(
    (format_pico_time
    format_pico_time(
    ((
    (stmt
    stmt.
    .timer_wait
    timer_wait/
    /round
    round(
    (100
    100*
    *stage
    stage.
    .work_completed
    work_completed/
    /stage
    stage.
    .work_estimated
    work_estimated,
    ,2
    2)
    )*
    *100
    100)
    )
    -
    -stmt
    stmt.
    .timer_wait
    timer_wait)
    ),
    , 10
    10,
    , " "
    " ")
    ) as
    as estimated_remaining_time
    estimated_remaining_time,
    ,
    current_allocated memory
    current_allocated memory
    from
    from performance_schema
    performance_schema.
    .events_statements_current stmt
    events_statements_current stmt
    inner
    inner join
    join sys
    sys.
    .memory_by_thread_by_current_bytes mt
    memory_by_thread_by_current_bytes mt
    on
    on mt
    mt.
    .thread_id
    thread_id =
    = stmt
    stmt.
    .thread_id
    thread_id
    inner
    inner join
    join performance_schema
    performance_schema.
    .events_stages_current stage
    events_stages_current stage
    on
    on stage
    stage.
    .thread_id
    thread_id =
    = stmt
    stmt.
    .thread_id\G
    thread_id\G
    Copyright @ 2022 Oracle and/or its affiliates.
    92

    View Slide

  136. ALTER Progression - example
    Copyright @ 2022 Oracle and/or its affiliates.
    93

    View Slide

  137. Index Creation is slow
    < >
    Copyright @ 2022 Oracle and/or its affiliates.
    Creating indexes is a very slow operation even on my powerfull server with
    multiple cores ! Anything I can do ?
    94

    View Slide

  138. Index Creation is slow
    < >
    Copyright @ 2022 Oracle and/or its affiliates.
    Creating indexes is a very slow operation even on my powerfull server with
    multiple cores ! Anything I can do ?
    Since MySQL 8.0.27, you have the possibility to control the maximum of
    parallel threads InnoDB can use to create seconday indexes !
    94

    View Slide

  139. Parallel Index Creation
    The amount of parallel threads used by InnoDB is controlled by innodb_ddl_threads.
    This new variable is coupled with another new variable: innodb_ddl_buffer_size.
    If you have fast storage and multiple CPU cores, tuning these variables can speed up
    secondary index creation.
    Copyright @ 2022 Oracle and/or its affiliates.
    95

    View Slide

  140. Parallel Index Creation - example
    SQL
    SQL>
    > alter
    alter table
    table booking
    booking
    add
    add index
    index idx_2
    idx_2(
    (flight_id
    flight_id,
    , seat
    seat,
    , passenger_id
    passenger_id)
    );
    ;
    Query OK
    Query OK,
    , 0
    0 rows
    rows affected
    affected (
    (9
    9 min
    min 0.6838
    0.6838 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    96

    View Slide

  141. Parallel Index Creation - example
    SQL
    SQL>
    > alter
    alter table
    table booking
    booking
    add
    add index
    index idx_2
    idx_2(
    (flight_id
    flight_id,
    , seat
    seat,
    , passenger_id
    passenger_id)
    );
    ;
    Query OK
    Query OK,
    , 0
    0 rows
    rows affected
    affected (
    (9
    9 min
    min 0.6838
    0.6838 sec
    sec)
    )
    The default se ings are:
    innodb_ddl_threads = 4
    innodb_ddl_buffer_size = 1048576
    innodb_parallel_read_threads = 4
    Copyright @ 2022 Oracle and/or its affiliates.
    96

    View Slide

  142. Parallel Index Creation - example
    SQL
    SQL>
    > alter
    alter table
    table booking
    booking
    add
    add index
    index idx_2
    idx_2(
    (flight_id
    flight_id,
    , seat
    seat,
    , passenger_id
    passenger_id)
    );
    ;
    Query OK
    Query OK,
    , 0
    0 rows
    rows affected
    affected (
    (9
    9 min
    min 0.6838
    0.6838 sec
    sec)
    )
    The default se ings are:
    innodb_ddl_threads = 4
    innodb_ddl_buffer_size = 1048576
    innodb_parallel_read_threads = 4
    The innodb_ddl_buffer_size is shared between all innodb_ddl_threads de ned. If
    you increase the amount of threads, I recommend that you also increase the bu er size.
    Copyright @ 2022 Oracle and/or its affiliates.
    96

    View Slide

  143. Parallel Index Creation - example (2)
    To nd the best values for these variables, let's have a look at the amount of CPU cores:
    SQL
    SQL>
    > select
    select count
    count from
    from information_schema
    information_schema.
    .INNODB_METRICS
    INNODB_METRICS
    where
    where name
    name =
    = 'cpu_n'
    'cpu_n';
    ;
    +
    +-------+
    -------+
    |
    | count
    count |
    |
    +
    +-------+
    -------+
    |
    | 16
    16 |
    |
    +
    +-------+
    -------+
    We have then 16 cores to share. As my machine as plenty of memory, I will allocate 1GB for
    the InnoDB DDL bu er.
    Copyright @ 2022 Oracle and/or its affiliates.
    97

    View Slide

  144. Parallel Index Creation - example (3)
    SQL
    SQL>
    > SET
    SET innodb_ddl_threads
    innodb_ddl_threads =
    = 8
    8;
    ;
    SQL
    SQL>
    > SET
    SET innodb_parallel_read_threads
    innodb_parallel_read_threads =
    = 8
    8;
    ;
    SQL
    SQL>
    > SET
    SET innodb_ddl_buffer_size
    innodb_ddl_buffer_size =
    = 1048576000
    1048576000;
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    98

    View Slide

  145. Parallel Index Creation - example (3)
    SQL
    SQL>
    > SET
    SET innodb_ddl_threads
    innodb_ddl_threads =
    = 8
    8;
    ;
    SQL
    SQL>
    > SET
    SET innodb_parallel_read_threads
    innodb_parallel_read_threads =
    = 8
    8;
    ;
    SQL
    SQL>
    > SET
    SET innodb_ddl_buffer_size
    innodb_ddl_buffer_size =
    = 1048576000
    1048576000;
    ;
    We can now retry the same index creation as previously:
    SQL
    SQL>
    > alter
    alter table
    table booking
    booking add
    add index
    index idx_2
    idx_2(
    (flight_id
    flight_id,
    , seat
    seat,
    , passenger_id
    passenger_id)
    );
    ;
    Query OK
    Query OK,
    , 0
    0 rows
    rows affected
    affected (
    (3
    3 min
    min 9.1862
    9.1862 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    98

    View Slide

  146. Parallel Index Creation - example (4)
    I recommend to make tests to de ne the optimal se ings for your database, your hardware
    and data.
    For example, I got the best result se ing the bu er size to 2GB and both ddl threads and
    parallel read threads to 4.
    It took 2 min 43 sec, much be er than the initial 9 minutes !
    For more information, go to h ps://lefred.be/content/mysql-8-0-innodb-parallel-threads-
    for-online-ddl-operations/
    Copyright @ 2022 Oracle and/or its affiliates.
    99

    View Slide

  147. Histograms
    help the Optimizer to make the right decision
    Copyright @ 2022 Oracle and/or its affiliates.
    100

    View Slide

  148. Histograms
    What is a histogram?
    Wikipedia declares a histogram is an accurate representation of the distribution of
    numerical data. For RDBMS, a histogram is an approximation of the data distribution within
    a speci c column.
    So in MySQL, histograms help the optimizer to nd the most e cient Query Plan to fetch
    that data.
    Copyright @ 2022 Oracle and/or its affiliates.
    101

    View Slide

  149. Histograms in MySQL
    MySQL provides:
    statement histograms
    optimizer histograms
    Copyright @ 2022 Oracle and/or its affiliates.
    102

    View Slide

  150. Histograms in MySQL
    MySQL provides:
    statement histograms
    optimizer histograms
    the second category is what we need to focus on today !
    Copyright @ 2022 Oracle and/or its affiliates.
    102

    View Slide

  151. Statements Histograms
    This is an example of query response time distribution for a statement:
    Copyright @ 2022 Oracle and/or its affiliates.
    103

    View Slide

  152. Global Statements Histograms
    If you want a global overview of all statements:
    SELECT
    SELECT CONCAT
    CONCAT(
    ('<'
    '<',
    ,ROUND
    ROUND(
    (BUCKET_TIMER_HIGH
    BUCKET_TIMER_HIGH/
    /1000000
    1000000,
    ,2
    2)
    ),
    ,
    ' microsec (<'
    ' microsec (<',
    ,ROUND
    ROUND(
    (BUCKET_TIMER_HIGH
    BUCKET_TIMER_HIGH/
    /1000000000
    1000000000,
    ,2
    2)
    ) ,
    ,'ms)'
    'ms)')
    ) QRT
    QRT,
    ,
    CONCAT
    CONCAT(
    (RPAD
    RPAD(
    (''
    '',
    ,ROUND
    ROUND(
    (BUCKET_QUANTILE
    BUCKET_QUANTILE*
    *100
    100)
    ),
    ,'*'
    '*')
    ),
    ,
    ROUND
    ROUND(
    (BUCKET_QUANTILE
    BUCKET_QUANTILE*
    *100
    100,
    ,2
    2)
    ),
    ,"%"
    "%")
    ) bar
    bar
    FROM
    FROM events_statements_histogram_global
    events_statements_histogram_global WHERE
    WHERE count_bucket
    count_bucket>
    >0
    0;
    ;
    Copyright @ 2022 Oracle and/or its affiliates.
    104

    View Slide

  153. Optimizer Histograms in MySQL
    A histogram is a distribution of data into logical buckets
    There are two types of histograms:
    singleton
    equi-height
    The maximum number of buckets is 1024.
    Copyright @ 2022 Oracle and/or its affiliates.
    105

    View Slide

  154. Optimizer Histograms in MySQL (2)
    So to help the optimizer to nd the most e cient Query Plan, histograms can be created. As
    we know, a histogram is an approximation of the data distribution within a speci c column.
    Histograms are useful for columns NOT being candidate to have indexes.
    A histogram is created or updated only on demand, so it adds no overhead when table data
    is modi ed. On the other hand, the statistics become progressively more out of date when
    table modi cations occur, until the next time they are updated.
    Copyright @ 2022 Oracle and/or its affiliates.
    106

    View Slide

  155. Optimizer Histograms in MySQL (2)
    So to help the optimizer to nd the most e cient Query Plan, histograms can be created. As
    we know, a histogram is an approximation of the data distribution within a speci c column.
    Histograms are useful for columns NOT being candidate to have indexes.
    A histogram is created or updated only on demand, so it adds no overhead when table data
    is modi ed. On the other hand, the statistics become progressively more out of date when
    table modi cations occur, until the next time they are updated.
    As an example is be er than 1000 words, let's see how we can create and bene t from
    histograms.
    (*)Example from MySQL 8 Query Performance Tuning, Jesper Wisborg Krogh, Apress, 2020
    Copyright @ 2022 Oracle and/or its affiliates.
    106

    View Slide

  156. SELECT
    SELECT film_id
    film_id,
    , title
    title,
    , length
    length,
    ,
    GROUP_CONCAT
    GROUP_CONCAT(
    (
    CONCAT_WS
    CONCAT_WS(
    (' '
    ' ',
    , first_name
    first_name,
    , last_name
    last_name)
    )
    )
    ) AS
    AS actors
    actors
    FROM
    FROM sakila
    sakila.
    .film
    film
    INNER
    INNER JOIN
    JOIN sakila
    sakila.
    .film_actor
    film_actor USING
    USING (
    (film_id
    film_id)
    )
    INNER
    INNER JOIN
    JOIN sakila
    sakila.
    .actor
    actor USING
    USING (
    (actor_id
    actor_id)
    )
    WHERE
    WHERE length
    length <
    < 55
    55 AND
    AND first_name
    first_name =
    = 'Groucho'
    'Groucho'
    GROUP
    GROUP BY
    BY film_id
    film_id;
    ;
    Optimizer Histograms in MySQL (3)
    Consider this query:
    Copyright @ 2022 Oracle and/or its affiliates.
    107

    View Slide

  157. SELECT
    SELECT film_id
    film_id,
    , title
    title,
    , length
    length,
    ,
    GROUP_CONCAT
    GROUP_CONCAT(
    (
    CONCAT_WS
    CONCAT_WS(
    (' '
    ' ',
    , first_name
    first_name,
    , last_name
    last_name)
    )
    )
    ) AS
    AS actors
    actors
    FROM
    FROM sakila
    sakila.
    .film
    film
    INNER
    INNER JOIN
    JOIN sakila
    sakila.
    .film_actor
    film_actor USING
    USING (
    (film_id
    film_id)
    )
    INNER
    INNER JOIN
    JOIN sakila
    sakila.
    .actor
    actor USING
    USING (
    (actor_id
    actor_id)
    )
    WHERE
    WHERE length
    length <
    < 55
    55 AND
    AND first_name
    first_name =
    = 'Groucho'
    'Groucho'
    GROUP
    GROUP BY
    BY film_id
    film_id;
    ;
    It returns 6 rows:
    6 rows in set (0.0143 sec)
    Optimizer Histograms in MySQL (3)
    Consider this query:
    Copyright @ 2022 Oracle and/or its affiliates.
    107

    View Slide

  158. Optimizer Histograms in MySQL (4)
    Let's have a look at the Query Execution Plan:
    Copyright @ 2022 Oracle and/or its affiliates.
    108

    View Slide

  159. Optimizer Histograms in MySQL (5)
    Now we will create an histogram on the length column of the lm table:
    ANALYZE
    ANALYZE TABLE
    TABLE sakila
    sakila.
    .film
    film
    UPDATE
    UPDATE HISTOGRAM
    HISTOGRAM ON
    ON length
    length
    WITH
    WITH 256
    256 BUCKETS\G
    BUCKETS\G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    Table
    Table: sakila
    : sakila.
    .film
    film
    Op: histogram
    Op: histogram
    Msg_type:
    Msg_type: status
    status
    Msg_text: Histogram
    Msg_text: Histogram statistics
    statistics created
    created for
    for column
    column 'length'
    'length'.
    .
    Copyright @ 2022 Oracle and/or its affiliates.
    109

    View Slide

  160. Optimizer Histograms in MySQL (6)
    Let's have a new look at the Query Execution Plan:
    Copyright @ 2022 Oracle and/or its affiliates.
    110

    View Slide

  161. Optimizer Histograms in MySQL (7)
    For info, the query is now faster:
    6 rows in set (0.0035 sec)
    Copyright @ 2022 Oracle and/or its affiliates.
    111

    View Slide

  162. Optimizer Histograms in MySQL (8)
    You can also retrieve some information about the histograms:
    SELECT
    SELECT SCHEMA_NAME
    SCHEMA_NAME,
    , TABLE_NAME
    TABLE_NAME,
    , COLUMN_NAME
    COLUMN_NAME,
    ,
    HISTOGRAM
    HISTOGRAM-
    ->
    >'$."data-type"'
    '$."data-type"' 'data-type'
    'data-type',
    ,
    HISTOGRAM
    HISTOGRAM-
    ->
    >'$."last-updated"'
    '$."last-updated"' 'last-updated'
    'last-updated',
    ,
    HISTOGRAM
    HISTOGRAM-
    ->
    >'$."histogram-type"'
    '$."histogram-type"' 'histogram-type'
    'histogram-type',
    ,
    HISTOGRAM
    HISTOGRAM-
    ->
    >'$."number-of-buckets-specified"'
    '$."number-of-buckets-specified"' 'specified-buckets'
    'specified-buckets'
    FROM
    FROM information_schema
    information_schema.
    .column_statistics
    column_statistics
    WHERE
    WHERE COLUMN_NAME
    COLUMN_NAME =
    = 'length'
    'length';
    ;
    +-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+
    | SCHEMA_NAME | TABLE_NAME | COLUMN_NAME | data-type | last-updated | histogram-type | specified-buckets |
    +-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+
    | sakila | film | length | "int" | "2021-02-24 09:59:13.046631" | "singleton" | 256 |
    +-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+
    Copyright @ 2022 Oracle and/or its affiliates.
    112

    View Slide

  163. select
    select v
    v value
    value,
    , concat
    concat(
    (round
    round(
    (c
    c*
    *100
    100,
    ,1
    1)
    ),
    ,'%'
    '%')
    ) cumulfreq
    cumulfreq,
    ,
    concat
    concat(
    (round
    round(
    ((
    (c
    c -
    - LAG
    LAG(
    (c
    c,
    , 1
    1,
    , 0
    0)
    )
    over
    over(
    ()
    ))
    ) *
    * 100
    100,
    ,1
    1)
    ),
    , '%'
    '%')
    ) freq
    freq
    from
    from information_schema
    information_schema.
    .column_statistics
    column_statistics,
    ,
    JSON_TABLE
    JSON_TABLE(
    (histogram
    histogram-
    ->
    >'$.buckets'
    '$.buckets',
    ,
    '$[*]'
    '$[*]' COLUMNS
    COLUMNS(
    (v
    v VARCHAR
    VARCHAR(
    (60
    60)
    )
    PATH
    PATH '$[0]'
    '$[0]',
    , c
    c double
    double PATH
    PATH '$[1]'
    '$[1]')
    ))
    ) hist
    hist
    where
    where schema_name
    schema_name =
    = 'sakila'
    'sakila'
    and
    and table_name
    table_name =
    = 'film'
    'film' and
    and column_name
    column_name =
    = 'length'
    'length';
    ;
    +-------+-----------+------+
    | value | cumulfreq | freq |
    +-------+-----------+------+
    | 46 | 0.5% | 0.5% |
    | 47 | 1.2% | 0.7% |
    | 48 | 2.3% | 1.1% |
    | 49 | 2.8% | 0.5% |
    | 50 | 3.7% | 0.9% |
    | 51 | 4.4% | 0.7% |
    | 52 | 5.1% | 0.7% |
    | 53 | 6% | 0.9% |
    | 54 | 6.6% | 0.6% |
    | 55 | 6.8% | 0.2% |
    | 56 | 7.3% | 0.5% |
    ...
    | 185 | 100% | 1% |
    +-------+-----------+------+
    140 rows in set (0.0040 sec)
    Optimizer Histograms (7)
    Copyright @ 2022 Oracle and/or its affiliates.
    113

    View Slide

  164. Two types of Histograms
    Equi-height: One bucket represents a range of values. This type of histograms will be
    created when distinct values in the column are greater than the number of buckets
    speci ed in the ANALYZE TABLE syntax. Think A-G H-L M-T U-Z.
    Singleton: One bucket represents one single value in the column, it is the most accurate and
    will be created when the number of distinct values in the column is less than or equal to the
    number of buckets speci ed in the ANALYZE TABLE statement.
    Copyright @ 2022 Oracle and/or its affiliates.
    114

    View Slide

  165. Optimizer Histograms Syntax
    ANALYZE
    ANALYZE TABLE
    TABLE t
    t UPDATE
    UPDATE HISTOGRAM
    HISTOGRAM ON
    ON c1
    c1,
    , c2
    c2,
    , c3
    c3 WITH
    WITH 10
    10 BUCKETS
    BUCKETS;
    ;
    ANALYZE
    ANALYZE TABLE
    TABLE t
    t UPDATE
    UPDATE HISTOGRAM
    HISTOGRAM ON
    ON c1
    c1,
    , c3
    c3 WITH
    WITH 10
    10 BUCKETS
    BUCKETS;
    ;
    ANALYZE
    ANALYZE TABLE
    TABLE t
    t DROP
    DROP HISTOGRAM
    HISTOGRAM ON
    ON c2
    c2;
    ;
    Note that the rst statement creates three di erent histograms on c1, c2 and c3 as an
    histogram is created per columns
    Copyright @ 2022 Oracle and/or its affiliates.
    115

    View Slide

  166. Optimizer Histograms Syntax
    ANALYZE
    ANALYZE TABLE
    TABLE t
    t UPDATE
    UPDATE HISTOGRAM
    HISTOGRAM ON
    ON c1
    c1,
    , c2
    c2,
    , c3
    c3 WITH
    WITH 10
    10 BUCKETS
    BUCKETS;
    ;
    ANALYZE
    ANALYZE TABLE
    TABLE t
    t UPDATE
    UPDATE HISTOGRAM
    HISTOGRAM ON
    ON c1
    c1,
    , c3
    c3 WITH
    WITH 10
    10 BUCKETS
    BUCKETS;
    ;
    ANALYZE
    ANALYZE TABLE
    TABLE t
    t DROP
    DROP HISTOGRAM
    HISTOGRAM ON
    ON c2
    c2;
    ;
    Note that the rst statement creates three di erent histograms on c1, c2 and c3 as an
    histogram is created per columns
    Histograms can be created for almost any data type.
    If a type is not supported you will get:
    The
    The column
    column 'doc'
    'doc' has an unsupported
    has an unsupported data
    data type
    type.
    .
    Copyright @ 2022 Oracle and/or its affiliates.
    115

    View Slide

  167. Optimizer Histograms Syntax (2)
    I've created histograms for some columns in di erent tables in the world database.
    Information_Schema can be used to retrieve the info related to these histograms:
    select
    select table_name
    table_name,
    , column_name
    column_name,
    , histogram
    histogram-
    ->>
    >>'$."data-type"'
    '$."data-type"' AS
    AS 'data-type'
    'data-type',
    ,
    json_length
    json_length(
    (histogram
    histogram-
    ->>
    >>'$."buckets"'
    '$."buckets"')
    ) AS
    AS 'bucket-count'
    'bucket-count'
    from
    from information_schema
    information_schema.
    .column_statistics
    column_statistics;
    ;
    +
    +------------+-------------+-----------+--------------+
    ------------+-------------+-----------+--------------+
    |
    | TABLE_NAME
    TABLE_NAME |
    | COLUMN_NAME
    COLUMN_NAME |
    | data
    data-
    -type
    type |
    | bucket
    bucket-
    -count
    count |
    |
    +
    +------------+-------------+-----------+--------------+
    ------------+-------------+-----------+--------------+
    |
    | country
    country |
    | Population
    Population |
    | int
    int |
    | 226
    226 |
    |
    |
    | city
    city |
    | Population
    Population |
    | int
    int |
    | 1024
    1024 |
    |
    |
    | countrylan
    countrylan |
    | Language
    Language |
    | string
    string |
    | 457
    457 |
    |
    +
    +------------+-------------+-----------+--------------+
    ------------+-------------+-----------+--------------+
    Copyright @ 2022 Oracle and/or its affiliates.
    116

    View Slide

  168. Where Histograms Shine
    create
    create table
    table h1
    h1 (
    (id
    id int
    int unsigned
    unsigned auto_increment
    auto_increment,
    ,
    x
    x int
    int unsigned
    unsigned,
    , primary
    primary key
    key(
    (id
    id)
    ))
    );
    ;
    insert
    insert into
    into h1
    h1 (
    (x
    x)
    ) values
    values (
    (1
    1)
    ),
    ,(
    (1
    1)
    ),
    ,(
    (2
    2)
    ),
    ,(
    (2
    2)
    ),
    ,(
    (2
    2)
    ),
    ,(
    (3
    3)
    ),
    ,(
    (3
    3)
    ),
    ,(
    (3
    3)
    ),
    ,(
    (3
    3)
    );
    ;
    select
    select x
    x,
    , count
    count(
    (x
    x)
    ) from
    from h1
    h1 group
    group by
    by x
    x;
    ;
    +
    +---+----------+
    ---+----------+
    |
    | x
    x |
    | count
    count(
    (x
    x)
    ) |
    |
    +
    +---+----------+
    ---+----------+
    |
    | 1
    1 |
    | 2
    2 |
    |
    |
    | 2
    2 |
    | 3
    3 |
    |
    |
    | 3
    3 |
    | 4
    4 |
    |
    +
    +---+----------+
    ---+----------+
    3
    3 rows
    rows in
    in set
    set (
    (0.0234
    0.0234 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    117

    View Slide

  169. explain
    explain select
    select *
    * from
    from h1
    h1 where
    where x
    x >
    > 0
    0\G
    \G
    *
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: h1
    : h1
    partitions:
    partitions: NULL
    NULL
    type
    type:
    : ALL
    ALL
    possible_keys:
    possible_keys: NULL
    NULL
    key
    key:
    : NULL
    NULL
    key_len:
    key_len: NULL
    NULL
    ref:
    ref: NULL
    NULL
    rows
    rows:
    : 9
    9
    filtered:
    filtered: 33.33
    33.33
    Extra:
    Extra: Using
    Using where
    where
    The ltered column indicates an estimated percentage of table rows
    that will be ltered by the table condition. The maximum value is 100,
    which means no ltering of rows occurred.
    Values decreasing from 100 indicate increasing amounts of ltering.
    rows shows the estimated number of rows examined and rows ×
    ltered shows the number of rows the opimizer plans to retrieve.
    In this example 9 rows x 33% = 3 rows
    The Optimizer estimates about 1/3 of the
    data will go match the 'X > 0' condition
    Where Histograms Shine (2)
    Copyright @ 2022 Oracle and/or its affiliates.
    118

    View Slide

  170. Where Histograms Shine (3)
    analyze
    analyze table
    table h1
    h1 update
    update histogram
    histogram on
    on x
    x with
    with 3
    3 buckets\G
    buckets\G
    *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    Table
    Table: test
    : test.
    .h1
    h1
    Op: histogram
    Op: histogram
    Msg_type:
    Msg_type: status
    status
    Msg_text: Histogram
    Msg_text: Histogram statistics
    statistics created
    created for
    for column
    column 'x'
    'x'.
    .
    1
    1 row
    row in
    in set
    set (
    (0.0287
    0.0287 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    119

    View Slide

  171. explain
    explain select
    select *
    * from
    from h1
    h1 where
    where x
    x >
    > 0
    0\G
    \G
    *
    **
    **
    **
    **
    **
    **
    **
    * 1.
    1. row
    row *
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    **
    *
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: h1
    : h1
    partitions:
    partitions: NULL
    NULL
    type
    type:
    : ALL
    ALL
    possible_keys:
    possible_keys: NULL
    NULL
    key
    key:
    : NULL
    NULL
    key_len:
    key_len: NULL
    NULL
    ref:
    ref: NULL
    NULL
    rows
    rows:
    : 9
    9
    filtered:
    filtered: 100
    100
    Extra:
    Extra: Using
    Using where
    where
    all rows !
    Where Histograms Shine (4)
    Copyright @ 2022 Oracle and/or its affiliates.
    120

    View Slide

  172. Without Histogram:
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: h1
    : h1
    partitions:
    partitions: NULL
    NULL
    type
    type:
    : ALL
    ALL
    possible_keys:
    possible_keys: NULL
    NULL
    key
    key:
    : NULL
    NULL
    key_len:
    key_len: NULL
    NULL
    ref:
    ref: NULL
    NULL
    rows
    rows:
    : 9
    9
    filtered:
    filtered: 33.33
    33.33
    Extra:
    Extra: Using
    Using where
    where
    With Histogram:
    id:
    id: 1
    1
    select_type:
    select_type: SIMPLE
    SIMPLE
    table
    table: h1
    : h1
    partitions:
    partitions: NULL
    NULL
    type
    type:
    : ALL
    ALL
    possible_keys:
    possible_keys: NULL
    NULL
    key
    key:
    : NULL
    NULL
    key_len:
    key_len: NULL
    NULL
    ref:
    ref: NULL
    NULL
    rows
    rows:
    : 9
    9
    filtered:
    filtered: 44.44
    44.44
    Extra:
    Extra: Using
    Using where
    where
    Where Histograms Shine (5)
    Let's change the condition to x > 2:
    Copyright @ 2022 Oracle and/or its affiliates.
    121

    View Slide

  173. Where Histograms Shine (6)
    select
    select round
    round(
    (9
    9*
    *0.3333
    0.3333)
    ) 'without histogram'
    'without histogram',
    , round
    round(
    (9
    9*
    *0.4444
    0.4444)
    ) 'with histogram'
    'with histogram';
    ;
    +
    +-------------------+----------------+
    -------------------+----------------+
    |
    | without histogram
    without histogram |
    | with
    with histogram
    histogram |
    |
    +
    +-------------------+----------------+
    -------------------+----------------+
    |
    | 3
    3 |
    | 4
    4 |
    |
    +
    +-------------------+----------------+
    -------------------+----------------+
    select
    select count
    count(
    (*
    *)
    ) from
    from h1
    h1 where
    where x
    x >
    > 2
    2;
    ;
    +
    +----------+
    ----------+
    |
    | count
    count(
    (*
    *)
    ) |
    |
    +
    +----------+
    ----------+
    |
    | 4
    4 |
    |
    +
    +----------+
    ----------+
    Copyright @ 2022 Oracle and/or its affiliates.
    122

    View Slide

  174. Histogram vs Index
    Why you might consider a histogram instead of an index:
    Maintaining an index has a cost. If you have an index, every INSERT/UPDATE/DELETE
    causes the index to be updated. This is not free, and will have an impact on your
    performance. A histogram on the other hand is created once and never updated unless
    you explicitly ask for it. It will thus not hurt your write performance.
    If you have an index, the optimizer will need to use it to do what we call "index dives" to
    estimate the number of records in a given range. This might become too costly if you
    have for instance very long IN-lists in your query. Histogram statistics are much cheaper
    in this case, and might thus be more suitable.
    Copyright @ 2022 Oracle and/or its affiliates.
    123

    View Slide

  175. MySQL HeatWave
    I need more performance, much more !
    Copyright @ 2022 Oracle and/or its affiliates.
    124

    View Slide

  176. MySQL HeatWave
    MySQL is also available in Oracle Cloud Infrastructure (OCI) as a managed service.
    HeatWave is a massively, high performance, in-memory query accelerator for OCI MySQL
    Database Service that accelerates MySQL performance by orders of magnitude for
    analytics and mixed workload.
    HeatWave can be enabled on demand.
    Copyright @ 2022 Oracle and/or its affiliates.
    125

    View Slide

  177. When queries are still too slow
    Some times, the data is to heavy and the indexes are not manageable or don't t in
    memory... in that case it's very complicate to perform query optimization.
    This is especially true for Analytics queries.
    Copyright @ 2022 Oracle and/or its affiliates.
    126

    View Slide

  178. HeatWave Example - data from PiDay
    SQL
    SQL >
    > select
    select *
    * from
    from (
    (
    select
    select date
    date(
    (time_stamp
    time_stamp)
    ) as
    as `
    `day
    day`
    `,
    , device_id
    device_id,
    , count
    count(
    (*
    *)
    ) as
    as `
    `tot
    tot`
    `,
    ,
    max
    max(
    (value
    value)
    ) as
    as `
    `max hum
    max hum`
    `,
    , min
    min(
    (value
    value)
    ) as
    as `
    `min hum
    min hum`
    `,
    , avg
    avg(
    (value
    value)
    ) as
    as `
    `avg hum
    avg hum`
    `
    from
    from humidity_history
    humidity_history group
    group by
    by device_id
    device_id,
    , day
    day)
    ) a
    a
    natural
    natural join
    join (
    (
    select
    select date
    date(
    (time_stamp
    time_stamp)
    ) as
    as `
    `day
    day`
    `,
    , device_id
    device_id,
    , count
    count(
    (*
    *)
    ) as
    as `
    `tot
    tot`
    `,
    ,
    max
    max(
    (value
    value)
    ) as
    as `
    `max temp
    max temp`
    `,
    , min
    min(
    (value
    value)
    ) as
    as `
    `min temp
    min temp`
    `,
    , avg
    avg(
    (value
    value)
    ) as
    as `
    `avg temp
    avg temp`
    `
    from
    from temperature_history
    temperature_history group
    group by
    by device_id
    device_id,
    , day
    day)
    ) b
    b order
    order by
    by day
    day,
    , device_id
    device_id;
    ;
    +
    +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    |
    | day
    day |
    | device_id
    device_id |
    | tot
    tot |
    | max hum
    max hum |
    | min hum
    min hum |
    | avg hum
    avg hum |
    | max
    max temp
    temp |
    | min
    min temp
    temp |
    | avg
    avg temp
    temp |
    |
    +
    +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    |
    | 2022
    2022-
    -03
    03-
    -09
    09 |
    | 00006227543
    00006227543c0000000000000002
    c0000000000000002 |
    | 14534
    14534 |
    | 65.00
    65.00 |
    | 55.00
    55.00 |
    | 60.009273
    60.009273 |
    | 29.99
    29.99 |
    | 20.00
    20.00 |
    | 22.597118
    22.597118 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -09
    09 |
    | 00006227543
    00006227543c0000000000000003
    c0000000000000003 |
    | 31605
    31605 |
    | 800.21
    800.21 |
    | 1.00
    1.00 |
    | 8.570861
    8.570861 |
    | 814.36
    814.36 |
    | 0.00
    0.00 |
    | 5.079733
    5.079733 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -09
    09 |
    | 00006227543
    00006227543c0000000000000004
    c0000000000000004 |
    | 31284
    31284 |
    | 279.32
    279.32 |
    | 30.00
    30.00 |
    | 35.294440
    35.294440 |
    | 288.44
    288.44 |
    | 10.00
    10.00 |
    | 12.797445
    12.797445 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000001
    c0000000000000001 |
    | 114906
    114906 |
    | 50.00
    50.00 |
    | 40.00
    40.00 |
    | 45.001613
    45.001613 |
    | 14.00
    14.00 |
    | 9.00
    9.00 |
    | 11.499796
    11.499796 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000002
    c0000000000000002 |
    | 100913
    100913 |
    | 65.00
    65.00 |
    | 55.00
    55.00 |
    | 59.999105
    59.999105 |
    | 25.00
    25.00 |
    | 20.00
    20.00 |
    | 22.501319
    22.501319 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000003
    c0000000000000003 |
    | 101465
    101465 |
    | 11.00
    11.00 |
    | 1.00
    1.00 |
    | 5.998472
    5.998472 |
    | 5.00
    5.00 |
    | 0.00
    0.00 |
    | 2.501763
    2.501763 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000004
    c0000000000000004 |
    | 101044
    101044 |
    | 40.00
    40.00 |
    | 30.00
    30.00 |
    | 34.991012
    34.991012 |
    | 15.00
    15.00 |
    | 10.00
    10.00 |
    | 12.496505
    12.496505 |
    |
    +
    +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    7
    7 rows
    rows in
    in set
    set (
    (1.2717
    1.2717 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    127

    View Slide

  179. HeatWave Example - data from PiDay
    Same Query after having enabled and loaded data to HeatWave
    SQL
    SQL >
    > select
    select *
    * from
    from (
    (
    select
    select date
    date(
    (time_stamp
    time_stamp)
    ) as
    as `
    `day
    day`
    `,
    , device_id
    device_id,
    , count
    count(
    (*
    *)
    ) as
    as `
    `tot
    tot`
    `,
    ,
    max
    max(
    (value
    value)
    ) as
    as `
    `max hum
    max hum`
    `,
    , min
    min(
    (value
    value)
    ) as
    as `
    `min hum
    min hum`
    `,
    , avg
    avg(
    (value
    value)
    ) as
    as `
    `avg hum
    avg hum`
    `
    from
    from humidity_history
    humidity_history group
    group by
    by device_id
    device_id,
    , day
    day)
    ) a
    a
    natural
    natural join
    join (
    (
    select
    select date
    date(
    (time_stamp
    time_stamp)
    ) as
    as `
    `day
    day`
    `,
    , device_id
    device_id,
    , count
    count(
    (*
    *)
    ) as
    as `
    `tot
    tot`
    `,
    ,
    max
    max(
    (value
    value)
    ) as
    as `
    `max temp
    max temp`
    `,
    , min
    min(
    (value
    value)
    ) as
    as `
    `min temp
    min temp`
    `,
    , avg
    avg(
    (value
    value)
    ) as
    as `
    `avg temp
    avg temp`
    `
    from
    from temperature_history
    temperature_history group
    group by
    by device_id
    device_id,
    , day
    day)
    ) b
    b order
    order by
    by day
    day,
    , device_id
    device_id;
    ;
    +
    +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    |
    | day
    day |
    | device_id
    device_id |
    | tot
    tot |
    | max hum
    max hum |
    | min hum
    min hum |
    | avg hum
    avg hum |
    | max
    max temp
    temp |
    | min
    min temp
    temp |
    | avg
    avg temp
    temp |
    |
    +
    +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    |
    | 2022
    2022-
    -03
    03-
    -09
    09 |
    | 00006227543
    00006227543c0000000000000002
    c0000000000000002 |
    | 14534
    14534 |
    | 65.00
    65.00 |
    | 55.00
    55.00 |
    | 60.009272
    60.009272 |
    | 29.99
    29.99 |
    | 20.00
    20.00 |
    | 22.597117
    22.597117 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -09
    09 |
    | 00006227543
    00006227543c0000000000000003
    c0000000000000003 |
    | 31605
    31605 |
    | 800.21
    800.21 |
    | 1.00
    1.00 |
    | 8.570860
    8.570860 |
    | 814.36
    814.36 |
    | 0.00
    0.00 |
    | 5.079732
    5.079732 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -09
    09 |
    | 00006227543
    00006227543c0000000000000004
    c0000000000000004 |
    | 31284
    31284 |
    | 279.32
    279.32 |
    | 30.00
    30.00 |
    | 35.294440
    35.294440 |
    | 288.44
    288.44 |
    | 10.00
    10.00 |
    | 12.797445
    12.797445 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000001
    c0000000000000001 |
    | 115609
    115609 |
    | 50.00
    50.00 |
    | 40.00
    40.00 |
    | 45.001736
    45.001736 |
    | 14.00
    14.00 |
    | 9.00
    9.00 |
    | 11.499157
    11.499157 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000002
    c0000000000000002 |
    | 100913
    100913 |
    | 65.00
    65.00 |
    | 55.00
    55.00 |
    | 59.999104
    59.999104 |
    | 25.00
    25.00 |
    | 20.00
    20.00 |
    | 22.501318
    22.501318 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000003
    c0000000000000003 |
    | 101465
    101465 |
    | 11.00
    11.00 |
    | 1.00
    1.00 |
    | 5.998472
    5.998472 |
    | 5.00
    5.00 |
    | 0.00
    0.00 |
    | 2.501762
    2.501762 |
    |
    |
    | 2022
    2022-
    -03
    03-
    -10
    10 |
    | 00006227543
    00006227543c0000000000000004
    c0000000000000004 |
    | 101044
    101044 |
    | 40.00
    40.00 |
    | 30.00
    30.00 |
    | 34.991011
    34.991011 |
    | 15.00
    15.00 |
    | 10.00
    10.00 |
    | 12.496504
    12.496504 |
    |
    +
    +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
    7
    7 rows
    rows in
    in set
    set (
    (0.1267
    0.1267 sec
    sec)
    )
    Copyright @ 2022 Oracle and/or its affiliates.
    128

    View Slide

  180. HeatWave Example - data from PiDay
    0.1267 sec VS 1.2717 sec
    10x faster... but that is only 1 day of data...
    Now lets increase the data to 14 days: +11G of data:
    +
    +----------+----------+------------+
    ----------+----------+------------+
    |
    | DATA
    DATA |
    | INDEXES
    INDEXES |
    | TOTAL SIZE
    TOTAL SIZE |
    |
    +
    +----------+----------+------------+
    ----------+----------+------------+
    |
    | 8.66
    8.66 GiB
    GiB |
    | 2.93
    2.93 GiB
    GiB |
    | 11.59
    11.59 GiB
    GiB |
    |
    +
    +----------+----------+------------+
    ----------+----------+------------+
    Copyright @ 2022 Oracle and/or its affiliates.
    129

    View Slide

  181. Without HeatWave:
    44
    44 rows
    rows in
    in set
    set (
    (10
    10 min
    min 14.1022
    14.1022 sec
    sec)
    )
    With HeatWave:
    45
    45 rows
    rows in
    in set
    set (
    (1.6051
    1.6051 sec
    sec)
    )
    HeatWave Example - data from PiDay
    14 days of data (11GB)
    69M ROWS
    383x faster !
    Copyright @ 2022 Oracle and/or its affiliates.
    130

    View Slide

  182. HeatWave - loading data
    alter
    alter table
    table temperature_history secondary_engine
    temperature_history secondary_engine=
    =rapid
    rapid;
    ;
    Query OK
    Query OK,
    , 0
    0 rows
    rows affected
    affected (
    (0.0257
    0.0257 sec
    sec)
    )
    alter
    alter table
    table temperature_history secondary_load
    temperature_history secondary_load;
    ;
    Query OK
    Query OK,
    , 0
    0 rows
    rows affected
    affected (
    (17.3070
    17.3070 sec
    sec)
    )
    4.6GB of data for this table (whitout indexes) loaded to HeatWave
    Copyright @ 2022 Oracle and/or its affiliates.
    131

    View Slide

  183. HeatWave
    Copyright @ 2022 Oracle and/or its affiliates.
    132

    View Slide

  184. MySQL Workload
    knowing your system
    Copyright @ 2022 Oracle and/or its affiliates.
    133

    View Slide

  185. MySQL Workload
    Read or Write Intensive ?
    Copyright @ 2022 Oracle and/or its affiliates.
    134

    View Slide

  186. MySQL Workload
    Read or Write Intensive ?
    You should know ! And many people just have an idea (and usually wrong...).
    Copyright @ 2022 Oracle and/or its affiliates.
    134

    View Slide

  187. MySQL Shell with
    plugins can help !
    MySQL Workload
    Read or Write Intensive ?
    You should know ! And many people just have an idea (and usually wrong...).
    Copyright @ 2022 Oracle and/or its affiliates.
    134

    View Slide

  188. MySQL Workload (2)
    Details are also available:
    Copyright @ 2022 Oracle and/or its affiliates.
    135

    View Slide

  189. MySQL Workload Changes
    What can cause a (one-time) performance drop ?
    Copyright @ 2022 Oracle and/or its affiliates.
    136

    View Slide

  190. MySQL Workload Changes
    What can cause a (one-time) performance drop ?
    Backup (physical, logical), Clone
    Filesystem snapshot
    External process consuming resources (IOPS, CPU...)
    Maintenance operation (ALTER, Changes in partitions, Archiving data..)
    Hardware problem or maintenance
    Di erent statistics, di erent Query Execution Plan
    Copyright @ 2022 Oracle and/or its affiliates.
    136

    View Slide

  191. Save the QEP for your queries
    I also recommend to save the Query Execution Plan of some of your queries in a table to
    compare them:
    Copyright @ 2022 Oracle and/or its affiliates.
    137

    View Slide

  192. Save the QEP for your queries (2)
    Copyright @ 2022 Oracle and/or its affiliates.
    138

    View Slide

  193. Recommended Lectures
    books
    Copyright @ 2022 Oracle and/or its affiliates.
    139

    View Slide

  194. Copyright @ 2022 Oracle and/or its affiliates.
    140

    View Slide

  195. Try MySQL HeatWave for Free:
    h ps://www.oracle.com/mysql/heatwave/
    Copyright @ 2022 Oracle and/or its affiliates.
    141

    View Slide

  196. MySQL Shell is now included in Visual Studio Code:
    Copyright @ 2022 Oracle and/or its affiliates.
    142

    View Slide

  197. Copyright @ 2022 Oracle and/or its affiliates.
    143

    View Slide

  198. Share your

    to MySQL
    #mysql
    Join our slack channel!
    bit.ly/mysql-slack
    Copyright @ 2022 Oracle and/or its affiliates.
    144

    View Slide

  199. Questions ?
    Copyright @ 2022 Oracle and/or its affiliates.
    145

    View Slide

  200. Resources & Credits
    h ps://dev.mysql.com/doc/refman/8.0/en/execution-plan-information.html
    h ps://github.com/lefred/mysqlshell-plugins
    h ps://dev.mysql.com/doc/heatwave/en/
    h ps://www.mysql.com/cloud/
    Copyright @ 2022 Oracle and/or its affiliates.
    146

    View Slide