First of all, the way you change a DB2 Catalog statistic is the way you would change any other piece of data in a DB2 table - using SQL UPDATE, INSERT, and DELETE statements. But only certain DB2 Catalog statistics can be modified using SQL instead of the normal method using RUNSTATS. Furthermore, SQL modification of the DB2 Catalog can be performed only by a SYSADM or SECADM.
Table 1 details the DB2 Catalog statistics that can be modified. You can use this table to determine which DB2 Catalog columns can be modified (updated or inserted using SQL) and which are used by the optimizer during sequential and parallel access path determination. Keep in mind, though, that certain DB2 Catalog tables that can be updated, for example SYSIBM.IPLIST, are not shown in this table because the data in those tables are not relevant to statistics and SQL performance tuning. Additionally, historical DB2 Catalog statistics (those tables ending in HIST) and data in the Real Time Stats tables can also be modified using SQL. But neither are used by the DB2 Optimizer.
Table 1. The Updateable DB2 Catalog Statistics
| 
Catalog Table | 
Column | 
How Used? | 
Description | 
| 
SYSCOLDIST | 
FREQUENCYF | 
Y | 
Percentage
  that COLVALUE in the column named in NAME occurs | 
| 
COLVALUE | 
Y | 
Column
  value for this statistic | |
| 
CARDF | 
Y | 
Number
  of distinct values | |
| 
COLGROUPCOLNO | 
Y | 
The
  set of columns for the statistics | |
| 
NUMCOLUMNS | 
Y | 
Number
  of columns for the statistics | |
| 
TYPE | 
Y | 
Type
  of stats: 
 
  C for cardinality 
 
  F for frequent value 
 
  H for histogram 
 
  N for non-padded frequent value | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSCOLDISTSTATS | 
PARTITION | 
N | 
The
  partition to which this statistic applies | 
| 
FREQUENCYF | 
N | 
Percentage
  that COLVALUE in the column named in NAME occurs | |
| 
COLVALUE | 
N | 
Column
  value for this statistic | |
| 
TYPE | 
N | 
Type
  of statistics (cardinality, frequent value, histogram, or non-padded frequent
  value) | |
| 
CARDF | 
N | 
Number
  of distinct values | |
| 
COLGROUPCOLNO | 
N | 
The
  set of columns for the statistics | |
| 
KEYCARDDATA | 
N | 
Representation
  of the estimate of distinct values in this partition | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSCOLSTATS | 
LOWKEY | 
Y | 
Lowest
  value for the column | 
| 
LOW2KEY | 
Y | 
Second
  lowest value for the column | |
| 
HIGHKEY | 
Y | 
Highest
  value for the column | |
| 
HIGH2KEY | 
Y | 
Second
  highest value for the column | |
| 
COLCARD | 
Y | 
Number
  of distinct values for the column | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSCOLUMNS | 
LOW2KEY | 
Y | 
Second
  lowest value for the column | 
| 
HIGH2KEY | 
Y | 
Second
  highest value for the column | |
| 
COLCARDF | 
Y | 
Number
  of distinct values for the column | |
| 
FOREIGNKEY | 
N | 
Indicates
  the subtype of CLOB data: B for bit data, M for mixed data, S for SBCS data. | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSINDEXES | 
CLUSTERRATIOF | 
Y | 
Percentage
  of rows in clustered order | 
| 
CLUSTERED | 
N | 
Indicates
  whether the table space is actually clustered | |
| 
FIRSTKEYCARDF | 
Y | 
Number
  of distinct values for the first column of the index key | |
| 
FULLKEYCARDF | 
Y | 
Number
  of distinct values for the full index key | |
| 
NLEAF | 
Y | 
Number
  of active leaf pages | |
| 
NLEVELS | 
Y | 
Number
  of index b-tree levels | |
| 
DATAREPEATFACTOR | 
Y | 
The
  anticipated number of data pages to be touched following an index key order. | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSINDEXPART | 
DSNUM | 
N | 
Number
  of data sets | 
| 
LEAFFAR | 
N | 
Number
  of leaf pages far from previous leaf page | |
| 
LEAFNEAR | 
N | 
Number
  of leaf pages near previous leaf page | |
| 
PSEUDO_DEL_ENTRIES | 
N | 
Number
  of pseudo deleted index keys | |
| 
SPACEF | 
N | 
Disk
  storage space | |
| 
SYSINDEXSTATS | 
CLUSTERRATIOF | 
N | 
Percentage
  of rows in clustered order | 
| 
FIRSTKEYCARDF | 
N | 
Number
  of distinct values for the first column of the index key | |
| 
FULLKEYCARDF | 
N | 
Number
  of distinct values for the full index key | |
| 
FULLKEYCARDDATA | 
N | 
Representation
  of number of distinct values of the full key | |
| 
NLEAF | 
N | 
Number
  of active leaf pages | |
| 
NLEVELS | 
N | 
Number
  of index b-tree levels | |
| 
KEYCOUNTF | 
N | 
Number
  of rows in the partition | |
| 
DATAREPEATFACTOR | 
N | 
The
  anticipated number of data pages to be touched following an index key order | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSROUTINES | 
IOS_PER_INVOC | 
Y | 
Estimated
  number of I/Os per invocation of the routine | 
| 
INSTS_PER_INVOC | 
Y | 
Estimated
  number of instructions per invocation of the routine | |
| 
INITIAL_IOS | 
Y | 
Estimated
  number of I/Os for the first invocation of the routine | |
| 
INITIAL_INSTS | 
Y | 
Estimated
  number of instructions for the first invocation of the routine | |
| 
CARDINALITY | 
Y | 
Predicted
  cardinality of a table function | |
| 
SYSTABLEPART | 
DSNUM | 
N | 
Number
  of data sets | 
| 
EXTENTS | 
N | 
Number
  of data set extents | |
| 
SPACEF | 
N | 
Disk
  storage space | |
| 
SYSTABLES | 
CARDF | 
Y | 
Number
  of rows for a table | 
| 
NPAGES | 
Y | 
Number
  of pages on which rows of the table appear | |
| 
NPAGESF | 
Y | 
Number
  of pages used by the table | |
| 
PCTPAGES | 
N | 
Percentage
  of tablespace pages that contain rows for this table | |
| 
PCTROWCOMP | 
Y | 
Percentage
  of rows compressed | |
| 
AVGROWLEN | 
N | 
Average
  row length | |
| 
SPACEF | 
N | 
Disk
  storage space | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSTABLESPACE | 
NACTIVEF | 
Y | 
Number
  of allocated tablespace pages | 
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | |
| 
SYSTABSTATS | 
CARDF | 
Y | 
Number
  of rows for the partition | 
| 
NPAGES | 
Y | 
Number
  of pages used by the partition | |
| 
NACTIVE | 
Y | 
Number
  of active pages in the partition | |
| 
PCTPAGES | 
Y | 
Percentage
  of table space pages that contain rows for this partition | |
| 
PCTROWCOMP | 
Y | 
Percentage
  (100) of rows compressed | |
| 
STATSTIME | 
N | 
Indicates
  the time RUNSTATS was run to generate these statistics | 
Legend:
N =
Not used by the optimizer
Y =
Used by the optimizer 
 
 
2 comments:
Hello,
While it is true that most of the time DB2 does not use information in RTS for optimization. However, there are occasions where DB2 will use index probing along with RTS information. An example would be when a query looks for matching indexed values where the value appears to be outside the domain of what statistics indicates exists. This would result in DB2 believing the predicate returned no rows, so in DB2 10 - DB2 will validate that assumption by probing the index. This can also be used when statistics indicate a partition is empty.
So the DB2 optimizer does - on rare occasions - use RTS information.
Thank you, Patrick. You are, of course, correct. And I'm guessing that RTS, over time, will be used more frequently in the optimization process...
Post a Comment