Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Oracle-DB: Efficient use of function-based indexes

Peter Ramm
November 23, 2023

Oracle-DB: Efficient use of function-based indexes

- What are function-based indexes
- What generally happens at index maintenance
- Why deterministic function
- Usage example with access on small part of large tables (like queue processing)

Peter Ramm

November 23, 2023
Tweet

More Decks by Peter Ramm

Other Decks in Programming

Transcript

  1. Oracle-DB
    Efficient use of
    function-based indexes
    Peter Ramm November 2023

    View full-size slide

  2. Otto Group Solution Provider GmbH
    (OSP)
    Founded:
    March 1991
    Holding company:
    Otto Group
    Locations:
    Dresden, Hamburg, Altenkunstadt, Madrid, Valencia, Malaga, Taipei
    Number of employees:
    > 500
    Managing director:
    Dr. Stefan Borsutzky, Norbert Gödicke

    View full-size slide

  3. Zur Person
    Mail: [email protected]
    Peter Ramm
    Software architect / team lead at OSP in Dresden
    > 30 years of history in IT projects
    Main focus:
    • Development of OLTP systems based on Oracle databases
    • From architecture consulting up to trouble shooting
    • Performance optimization of existing systems

    View full-size slide

  4. Function-based Indexes in Oracle-Databases
    § What are function-based indexes
    § What generally happens at index maintenance
    § Why deterministic function
    § Usage example with access on small part of large tables (like queue processing)

    View full-size slide

  5. What are function-based indexes
    § Indexing based on function results instead of native table columns.
    § The functions transform column values of indexed table.
    That means, they use columns of indexed table as function parameter.
    § Allows the precise indexing of the conditions relevant for a query (WHERE-clause)
    - The footprint of indices can be drastically reduced by indexing only the relevant records (NULL values are
    not physically contained in the index)
    § The functions used must be deterministic.
    § Both internal and user-defined functions can be used.
    § Multi-column indexes can combine functions and columns of the indexed table.
    § Virtual columns can also be indexed instead of concrete function expressions.
    § In addition to function results, the results of CASE expressions can also be indexed

    View full-size slide

  6. What happens during index maintenance
    § INSERT:
    - Search for the leaf block matching the sort sequence according to the indexed values of the record by
    traversing the B-tree
    - Insert a new record at the corresponding position in the leaf block with the indexed values and reference to
    record in table (ROWID)
    § DELETE:
    - Search for the record in the index according to the indexed values and the ROWID of the record
    - Remove the found record from the leaf block of the index
    § UPDATE:
    - Find and remove the index record corresponding to the old values (analog DELETE)
    - Insert a new index record according to the new values (analog INSERT)

    View full-size slide

  7. Why deterministic functions
    § By declaring the function with the keyword DETERMINISTIC, the creator certifies that the function always
    returns identical results for identical call parameters
    § DB does not check whether the declared function is really deterministic
    § Functions are not deterministic, for example :
    - With dependency on the result of SELECT statements executed in the function
    - With dependency on the system time
    § Problem of not really deterministic functions :
    - INSERT works
    - With UPDATE or DELETE the existing index record is not found ->ORA_08102: index key not found
    - SELECT delivers different results with FULL SCAN or INDEX SCAN, as INDEX SCAN is based on values at
    the time of the INSERT, while FULL SCAN is based on the current function result

    View full-size slide

  8. Problem example of non-deterministic functions
    CREATE TABLE Log(Creation DATE);
    CREATE OR REPLACE FUNCTION AgeSecs(creation DATE) RETURN NUMBER DETERMINISTIC IS
    BEGIN
    RETURN (SYSDATE - creation)*86400;
    END;
    /
    CREATE INDEX Log_Age ON Log(AgeSecs(Creation));
    INSERT INTO Log VALUES(SYSDATE);
    -- Fails with ORA-08102
    DELETE FROM LOG;
    INSERT INTO Log VALUES(SYSDATE);
    INSERT INTO Log VALUES(SYSDATE);
    -- Uses FULL TABLE SCAN, age based on current time
    SELECT AgeSecs(creation), Creation FROM LOG;
    -- Uses INDEX RANGE SCAN on function-based index, age based on time at creation as stored in index
    SELECT AgeSecs(creation), Creation FROM LOG WHERE AgeSecs(creation) < 1;

    View full-size slide

  9. Application example: access to small parts of large tables
    Insights using the free analysis tool „Panorama“ (https://github.com/rammpeter/panorama)
    Starting point :
    • SQL with 6000 executions/day und 3 seconds runtime per execution
    • Selects on average 5 records per execution from table with 27 million records
    • Searches for a few newly added records that have not yet been sent (Sent=N)
    • Two further filters on "Event_Context" and list of values for "Event_Type“
    • Current indexing is suboptimal (column "Sent" in last position in index), size of index approx. 3 GB
    Alternative 1: Moving the "Sent" column as the leading column of the index
    • Reduction of the runtime to < 100µs (factor 20000++), size of the index remains comparable (1.5 GB new)
    • 4 Buffer gets in index / execution

    View full-size slide

  10. Alternative 2: Use of a function-based index that only indexes
    the records with Sent=N
    § All records with Sent != N are not indexed (indexed expression = NULL)
    § The index value contains other attributes apart from the "Sent" column. This "Sent" column does not have to be part
    of the index, as the index value != NULL already implicitly contains "Sent"=N.
    § Assuming that the two conditions for "Event-Type" are always fixed and the filter on "Event_Context" varies, then an
    optimal function-based index would look like this
    § CREATE INDEX IX_Min ON Test(CASE WHEN Sent=‚N‘
    AND Event_Type IN ('DeliveryCreated', 'DeliveryUpdated') THEN Event_Context END);
    § If the filter condition in the SQL is now formulated exactly like the indexed expression, the minimum function-based
    index can be used for the query
    § Reduction of runtime to < 100µs (factor 20000++), index size = 64K (initial extent, 4% of previous size)
    1 Buffer get in index / execution
    § For multi-column indices, all columns must have the NULL behavior in order to implement the size reduction

    View full-size slide

  11. Thank you for your interest
    Otto Group Solution Provider (OSP) Dresden GmbH
    Freiberger Str. 35 | 01067 Dresden
    T +49 (0)351 49723 0 | F +49 (0)351 49723 119
    osp.de

    View full-size slide