statement heap is not enough: the compiler will reduce the level of optimization. Mainly, it will fall back to a different join enumeration method. In the explain plan (db2exfmt output), the following message is shown:

SQLCA: (Warning SQLCA from compile)
SQLCODE 437; Function SQLNO***; Message token '1'; Warning 'None'

In this situation, the access plan may be sub-optimal. Increasing the statement heap should help.
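As a sketch of how the statement heap can be raised, the STMTHEAP database configuration parameter can be updated; the value 20000 (4 KB pages) below is purely illustrative, not a recommendation, and must be sized for your system:

```sql
-- Raise STMTHEAP (in 4 KB pages); 20000 is an illustrative value only.
-- CLP equivalent: db2 update db cfg for <db> using STMTHEAP 20000
CALL SYSPROC.ADMIN_CMD('UPDATE DB CFG USING STMTHEAP 20000');
```

After the change, recompile the statement and verify that the SQLCODE 437 warning no longer appears in the db2exfmt output.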
query execution, DB2 may run out of sort heap. This can be verified by monitoring the query execution using:
• db2 snapshots
• db2 SQL monitoring functions

The symptoms are:
• sort overflows
• rows written for a SELECT query
• temporary data logical/physical reads
• hash loops
• hash join overflows

Before increasing the sort heap, the following should be investigated:
• In the explain plan, is the input cardinality of SORT/HSJOIN underestimated?
– Provide more statistics to help the optimizer
• Can we create an index?
– For SORT, an index can be considered on the sort columns
– For HSJOIN, an index can be considered on the join columns
• Are there multiple concurrent queries using a lot of sort heap?
– Consider tuning shared sort-heap memory
– Consider WLM to control concurrency
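As one way to check these symptoms, the snapshot administrative view SYSIBMADM.SNAPDB exposes database-wide sort and hash-join counters; this is a sketch, and the exact columns available depend on the DB2 version:

```sql
-- A high SORT_OVERFLOWS / TOTAL_SORTS ratio suggests sorts are spilling
-- to temporary tables; HASH_JOIN_OVERFLOWS indicates hash joins spilling
SELECT TOTAL_SORTS, SORT_OVERFLOWS, HASH_JOIN_OVERFLOWS
FROM SYSIBMADM.SNAPDB;
```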
may be unnecessary logical as well as physical IOs. This can be verified by monitoring the query execution using:
• db2 snapshots
• db2 SQL monitoring functions

The symptoms are:
• buffer pool data logical/physical reads
• buffer pool index logical/physical reads
• asynchronous data/index reads

The following should be investigated:
• In the explain plan, is the input cardinality of NLJOIN underestimated?
– Provide more statistics to help the optimizer
• There may be a huge opportunity for indexing/partitioning
– Find expensive operations from the explain plan
– Consider the relevant predicates to find columns on which indexing/partitioning can be done
• Are there multiple concurrent queries using a lot of buffer pool?
– Consider WLM to control concurrency
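One way to see the logical/physical read counters is the MON_GET_BUFFERPOOL table function (available in DB2 9.7 and later); a sketch:

```sql
-- Logical vs. physical data and index reads per buffer pool;
-- the argument -2 means "all database members"
SELECT BP_NAME,
       POOL_DATA_L_READS, POOL_DATA_P_READS,
       POOL_INDEX_L_READS, POOL_INDEX_P_READS
FROM TABLE(MON_GET_BUFFERPOOL('', -2)) AS T;
```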
may be applied sub-optimally. That can consume a lot of CPU. This can be verified by monitoring the query execution using:
• db2 snapshots
• db2 SQL monitoring functions
• OS monitoring tools like vmstat

The symptoms are:
• all (or most of) the IOs are logical
• high user-CPU utilization

The following should be investigated:
• In the explain plan, is there any sargable/residual predicate?
– Consider an index on the relevant columns
– Consider rewriting the query
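As an illustration of the "rewrite the query" alternative, an expression wrapped around an indexed column prevents the predicate from being applied as an index start-stop key; the table and column names here are hypothetical, not from the original case:

```sql
-- Before: the expression on the column forces row-by-row evaluation
SELECT * FROM orders WHERE YEAR(order_date) = 2001;

-- After: a range predicate on the bare column can use an index on order_date
SELECT * FROM orders
WHERE order_date >= '2001-01-01' AND order_date < '2002-01-01';
```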
with various monitoring data, the explain plan can provide deep insight into query execution. The recommended way to collect an explain plan is to use db2exfmt:

db2 connect to <db>
db2 set current explain mode explain
db2 "<query>"
db2 set current explain mode no
db2exfmt -d <db> -1 -o <output file>
Step 1: Take a quick overview of the query and environment
• Understand the original statement, and optionally the optimized statement, from the db2exfmt output
• Look at global settings such as buffer pool, sort heap, optimization level, etc.
• Look at the objects used in the query and check for signs of poor maintenance/design, e.g. the existence of overflow records
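Overflow record counts (one of the maintenance signs mentioned above) are recorded by RUNSTATS in the catalog and can be checked directly; a sketch, assuming statistics are reasonably current and using a hypothetical schema name:

```sql
-- OVERFLOW is the number of overflow records; a large value relative to CARD
-- suggests the table would benefit from a REORG
SELECT TABNAME, CARD, OVERFLOW
FROM SYSCAT.TABLES
WHERE TABSCHEMA = 'MYSCHEMA'   -- hypothetical schema name
ORDER BY OVERFLOW DESC;
```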
Step 2: Check the optimizer's cardinality estimates
Once the query and environment are found to be OK, check the cardinality estimates. This is the most important step, and it requires knowledge of the underlying data.
• Look for clues of cardinality underestimation: small numbers less than 1, such as 3.00579e-05
• Can that small number have a big impact? For example:
– Is it on the outer side (LHS child) of an NLJOIN operation while the inner side (RHS child) is a big table?
– Is it on the build side (RHS child) of an HSJOIN operation?
Step 2: Check the optimizer's cardinality estimates (contd.)
If the cardinality underestimation can have a big impact, confirm the actual number of records by using section actuals. Once it is confirmed that the optimizer underestimated the cardinality, analyze the relevant predicates and the corresponding data. In most cases, collecting additional statistics resolves the issue. These additional statistics are statistical views and column group statistics; both are described via examples later.
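Capturing section actuals takes a few steps; the outline below is a sketch of one common approach (an activity event monitor plus the EXPLAIN_FROM_ACTIVITY procedure), and the exact procedure varies by DB2 version:

```sql
-- Enable capture of runtime (actual) cardinalities for executed sections
CALL SYSPROC.ADMIN_CMD('UPDATE DB CFG USING SECTION_ACTUALS BASE');
```

The statement is then run under an activity event monitor that collects activity data with section information, and the captured section is formatted with EXPLAIN_FROM_ACTIVITY and db2exfmt; in the formatted plan, each operator shows the estimated and actual cardinalities side by side.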
Step 3: Check the access method and/or join method of the most expensive operation
Once the cardinality estimates are found to be OK, look for opportunities to reduce the cost of query execution. First, locate the most expensive operation in the explain plan. In most cases, indexing and/or partitioning strategies resolve the issue; both are described further via examples later.
query performance is poor. CPU consumption is high and there are many logical IOs. There was no major change in either the system or the application.

Query:

select i_item_id, i_item_desc, s_state,
       count(ss_quantity) as store_sales_quantitycount,
       avg(ss_quantity) as store_sales_quantityave,
       stddev(ss_quantity) as store_sales_quantitystdev,
       stddev(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov,
       count(sr_return_quantity) as store_returns_quantitycount,
       avg(sr_return_quantity) as store_returns_quantityave,
       stddev(sr_return_quantity) as store_returns_quantitystdev,
       stddev(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov,
       count(cs_quantity) as catalog_sales_quantitycount,
       avg(cs_quantity) as catalog_sales_quantityave,
       stddev(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitystdev,
       stddev(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov
from store_sales, store_returns, catalog_sales,
     date_dim d1, date_dim d2, date_dim d3, store, item
where d1.d_quarter_name = '2001Q1'
  and d1.d_date_sk = ss_sold_date_sk
  and i_item_sk = ss_item_sk
  and s_store_sk = ss_store_sk
  and ss_customer_sk = sr_customer_sk
  and ss_item_sk = sr_item_sk
  and ss_ticket_number = sr_ticket_number
  and sr_returned_date_sk = d2.d_date_sk
  and d2.d_quarter_name in ('2001Q1', '2001Q2', '2001Q3')
  and sr_customer_sk = cs_bill_customer_sk
  and sr_item_sk = cs_item_sk
  and cs_sold_date_sk = d3.d_date_sk
  and d3.d_quarter_name in ('2001Q1', '2001Q2', '2001Q3')
group by i_item_id, i_item_desc, s_state
order by i_item_id, i_item_desc, s_state
fetch first 100 rows only
involved:
TBSCAN(13) is applying the local predicate: Q8.D_QUARTER_NAME IN ('2001Q1', '2001Q2', '2001Q3')
HSJOIN(11) is applying the join predicate: (Q11.SR_RETURNED_DATE_SK = Q8.D_DATE_SK)

It seems that the majority of the qualified fact-table rows refer to a very small fraction of the dimension table. A statistical view should help in this case.

Solution: create a statistical view on tables STORE_RETURNS and DATE_DIM.

create view sv_store_returns_date_dim as
  (select Q8.*
   from HARMISTR.DATE_DIM as Q8, HARMISTR.STORE_RETURNS as Q11
   where Q11.SR_RETURNED_DATE_SK = Q8.D_DATE_SK);
alter view harmistr.sv_store_returns_date_dim enable query optimization;
runstats on view harmistr.sv_store_returns_date_dim tablesample bernoulli(10);

In the same way, we should create statistical views for STORE_SALES and DATE_DIM, as well as for CATALOG_SALES and DATE_DIM.
only a few columns needed?
– Index-only access possible?
• Predicates filter significantly?
– ISCAN-FETCH may be cheaper
– Consider indexes for IN or OR predicates
• Is this a fact table with aggregation?
– Could we define a Materialized Query Table?
• Partition, partition, partition
– Could it be partitioned by range?
– Could this be a multi-dimensional clustered (MDC) table?
– Both MDC + range partitioning?
Problem
– C1 IN (10, 100, 500, 10000)
– Large table scan to fetch a few rows
• Solution
– Consider an index with the column
– For example: index on (C1, C3)
– Index (C2, C1) is also good if the query has an equality predicate on C2

Before:

          8
       TBSCAN
       (   2)
       3312.15
         702
         |
       298854
 TABLE: MYSCHEMA
       BIGTABLE
         Q1

After:

             8
          NLJOIN
          (   9)
          122.22
            16
       /-----+-----\
      4             2
   TBSCAN         FETCH
   (  10)         (  13)
  0.014061       30.3206
      0             4
      |         /----+----\
      4        2         298854
 TABFNC:    IXSCAN   TABLE: MYSCHEMA
  SYSIBM    (   3)       BIGTABLE
  GENROW   30.2756          Q1
    Q3        3
              |
           298854
      INDEX: SAPP29
          INDX~0
            Q1
spilling?
– Could you increase SORTHEAP or BUFFERPOOL?
– … but do not increase SORTHEAP too high if there are concurrent hash joins or sorts in this query, and especially in a multi-user environment
• Could you create an index to avoid the sort?
– Note that an ISCAN-FETCH may be more expensive
– Perhaps an index-only access?
details for a SORT will indicate if the SORT spilled.
• The I/Os indicate that there was spilling associated with the SORT.
• Minimize spills by considering indexes and, as discussed earlier, by balancing SORTHEAP, SHEAPTHRES and BUFFERPOOL.

        SORT
       (  16)
   6.14826e+06
   1.30119e+06
         |
    3.65665e+07
       TBSCAN
       (  17)
   2.00653e+06
   1.14286e+06
         |
    3.74999e+07
 TABLE: TPCD.ORDERS
of symptoms in monitoring data, examples of symptoms in the explain plan, and possible solution alternatives, per resource:

Statement Heap
– Symptoms in monitoring data: (none)
– Symptoms in explain plan: SQLCODE 437, message token "1"
– Possible solutions: consider increasing the statement heap

Sort Heap
– Symptoms in monitoring data: sort overflow, hash join overflow, temp IO
– Symptoms in explain plan: cardinality underestimation for SORT/HSJOIN; table scans
– Possible solutions: consider collecting additional statistics; creating indexes may help; for concurrent queries, consider WLM

Buffer Pool
– Symptoms in monitoring data: logical and physical IOs; few records SELECTed from many ROWS READ
– Symptoms in explain plan: cardinality underestimation for NLJOIN; table scans
– Possible solutions: consider collecting additional statistics; creating indexes may help; for concurrent queries, consider WLM

CPU
– Symptoms in monitoring data: no major IO bottlenecks
– Symptoms in explain plan: sargable/residual predicates
– Possible solutions: consider rewriting the query; creating indexes may help; for concurrent queries, consider WLM
is performing poorly on the production system as compared to the development system.

Original statement (note: VW is a view):

select * from db2inst1.vw v, db2inst1.t1 t1 where v.x = t1.x

Optimized statement:

SELECT ...
FROM DB2INST1.T4 AS Q1, DB2INST1.T3 AS Q2, DB2INST1.T2 AS Q3, DB2INST1.T1 AS Q4
WHERE (Q1.T2ID = Q2.T2ID) AND (Q3.T4ID = Q1.T4ID)
  AND (Q1.Y = 'Y') AND (Q1.Z = 'N')
  AND (Q3.X = Q4.X) AND (Q3.Y IS NULL)
to confirm the actual counts:

SELECT COUNT(*) FROM "DB2INST1"."T4";
RESULT: 13 ROWS

SELECT COUNT(*) FROM "DB2INST1"."T4" AS Q1 WHERE (Q1.Y = 'Y') AND (Q1.Z = 'N');
RESULT: 13 ROWS

Extended diagnostic information:
Diagnostic Identifier: 1
Diagnostic Details: EXP0045W. The table named "DB2INST1.T4" has fabricated statistics. This can lead to poor cardinality and predicate filtering estimates. The size of the table changed significantly since the last time the RUNSTATS command was run.

Solution: collect statistics by running RUNSTATS on the table DB2INST1.T4.
take a very simple example to better understand statistical views.

Table T1        Table T2
 X   Y            Y
 1   A            A
 2   B            A
 2   C            A
 4   D            A
 7   E            A
 7   F            B
 7   G            D
 7   H            E
 9   I            F
 9   J            I

Distribution of values in column T1.X:
TYPE  SEQNO  COLVALUE  VALCOUNT
F     1      7         4
F     2      9         2
F     3      2         2

Consider the following query:
SELECT * FROM T1, T2 WHERE T1.Y = T2.Y

Distribution of values in column T1.X, after the join:
TYPE  SEQNO  COLVALUE  VALCOUNT
F     1      1         5
F     2      7         2
A query's performance is poor.

Query (original statement):

SELECT ...
FROM T1, T2, T3, T4
WHERE ...
  AND T1.ACCT_NUM = T2.ACCT_NUM
  AND T2.ACCT_NUM = T3.ACCT_NUM
  AND T2.ID_NUM = T3.ID_NUM
  AND T1.ACCT_NUM = T4.ACCT_NUM
  AND T1.REP_NUM = T4.REP_NUM
  AND T4.ACCT_NUM = T3.ACCT_NUM
  AND T4.ID_NUM = T3.ID_NUM
  AND T4.ACCT_IND = T3.ACCT_IND
  AND ('1800010100.00.00.000000' < T1.DATE_MOD)
  AND T1.COMM IN ('A', 'D')
  ...
Predicates applied by HSJOIN(6):
(Q4.ACCT_NUM = Q1.ACCT_NUM)
(Q1.REP_NUM = Q4.REP_NUM)

Note that there are multiple equality join predicates between these two tables. The optimizer assumes that the two predicates are independent of each other. However, the columns could be statistically correlated, and that would result in underestimation of the cardinality. This estimate is important because it feeds the outer side of NLJOIN(3), and that NLJOIN has a big table on its inner side. So, let us check the actual cardinality.
Actual cardinality:

SELECT COUNT(*)
FROM T4 AS Q1, T1 AS Q4
WHERE (Q1.REP_NUM = Q4.REP_NUM)
  AND (Q4.ACCT_NUM = Q1.ACCT_NUM)
  AND ('1800010100.00.00.000000' < Q4.DATE_MOD)
  AND Q4.COMM IN ('A', 'D');
RESULT: 1,155,273 rows

Solution: column group statistics will help. In this case, column group statistics should be collected on the parent table (generally the one with the primary key).
To find the parent-child relationship between two tables, consider the statistics of the join columns:

           ---------- Table T1 ----------   ---------- Table T4 ----------
Column     Colcard   High2key   Low2key     Colcard   High2key   Low2key
ACCT_NUM   3357997   'JERE2'    '00003'     3662064   'V222S'    '00003'
REP_NUM    32341     '95517'    '63135'     36864     '99999'    '63135'

From these statistics, T4 is the parent table because:
• the COLCARD of ACCT_NUM and REP_NUM is greater than T1's respective COLCARD statistics
• the HIGH2KEY for each column is greater, and the LOW2KEY values are the same

So, column group statistics should be collected on table T4:

RUNSTATS ON TABLE DB2INST1.T4 ON ALL COLUMNS
  AND COLUMNS ((ACCT_NUM, REP_NUM))
  WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL
us take a very simple example to better understand column group statistics. Here, the country and city columns are correlated.

Country   City        Hotel Name
Germany   Bremen      Hilton
Germany   Bremen      Best Western
Germany   Frankfurt   InterCity
Canada    Toronto     Four Seasons
Canada    Toronto     Intercontinental

WHERE Country = ? AND City = ?

With independence assumed: selectivity = 1/2 * 1/3 = 0.1667
With column group statistics: selectivity = 1/3 = 0.3333 (there are only 3 distinct (country, city) pairs)

The following column group statistics will help:

RUNSTATS ON TABLE db2inst1.hotel ON ALL COLUMNS
  AND COLUMNS ((country, city))
  WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL
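The two selectivities above come straight from distinct-value counts, which can be checked directly; a sketch against the example table:

```sql
-- Independence assumes selectivity = 1/COUNT(DISTINCT country) * 1/COUNT(DISTINCT city)
-- = 1/2 * 1/3; the column group instead uses 1/(distinct (country, city) pairs) = 1/3
SELECT COUNT(DISTINCT country) AS countries,
       COUNT(DISTINCT city)    AS cities,
       (SELECT COUNT(*)
        FROM (SELECT DISTINCT country, city
              FROM db2inst1.hotel) AS pairs) AS country_city_pairs
FROM db2inst1.hotel;
```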
FETCH cost is very high?
• List-prefetch plans on the inner of nested-loop joins?
– Look at the CLUSTERRATIO (or CLUSTERFACTOR)
– If it is not close to 100 (or 1), consider a REORG against this index if this is the key index used in most queries
• Index-only access possible?
• Good-filtering, commonly used predicates applied at the FETCH?
– Consider adding the column or columns to the index
A TBSCAN or SORT on the inner should be avoided as far as possible
– Consider an index
– Consider REORG
• Outer is large and you have an ISCAN-FETCH on the inner?
– Consider index-only access
• Expression on a join predicate?
– Could you avoid the expression on the inner table column, so that you could use an index with start-stop keys?
– Could you use generated columns with that expression?
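The generated-column idea can be sketched as follows; the table, column and expression here are hypothetical, not from the original case:

```sql
-- Materialize the join expression as a generated column, then index it, so the
-- join can use index start-stop keys instead of evaluating UPPER() per probe row.
-- Note: existing rows may need SET INTEGRITY ... FORCE GENERATED to be populated.
ALTER TABLE myschema.customers
  ADD COLUMN name_upper VARCHAR(100)
  GENERATED ALWAYS AS (UPPER(cust_name));

CREATE INDEX myschema.ix_cust_name_upper ON myschema.customers (name_upper);
```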
out of DB2 optimizer:
http://www.ibm.com/developerworks/data/library/techarticle/dm-1025db2accessplan/
2) Understand column group statistics in DB2:
http://www.ibm.com/developerworks/data/library/techarticle/dm-0612kapoor/index.html