Last week, in Part 1 of this series, we introduced the concept and the 5 high-level methods of influencing the DB2 optimizer's access path choices. In today's post, we will tackle the first approach, which is using standard, DB2-based methods.
Of all the methods for influencing the DB2 optimizer,
standard DB2 methods are the only mandatory ones. Try all the standard methods
covered in this section before attempting any of the other methods (that we will cover in later installments). There are
several reasons for this precaution.
The standard methods place the burden for generating optimal
access paths on the shoulders of DB2, which is where it usually belongs. They
also use IBM-supported techniques available for every version and release of
DB2. Finally, these methods generally provide the greatest gain for the
smallest effort.
There are four (4) standard methods for tuning DB2 access paths. The
first method is ensuring that accurate statistics are available using the RUNSTATS
utility and the BIND or REBIND
command. RUNSTATS populates the DB2 Catalog with
statistics that indicate the state of your DB2 objects, including the
following:
- Their organization
- Clustering
information
- The cardinality of
table spaces, tables, columns, and indexes
- The range of values
for columns
- Skew and data
distribution details
All of these factors are considered by the optimizer when it
chooses what it deems to be the optimal access path for a given SQL statement. We will not delve deeply into all of the options at your disposal when running RUNSTATS in this series.
A good DB2 tuning strategy is to execute RUNSTATS
at least once for every table space, table, column, and index known to your DB2
subsystem. Schedule regular RUNSTATS executions for all DB2 objects that
are not read-only. This keeps the DB2 Catalog information current, enabling
proper access path selection. If you do not run RUNSTATS whenever your data changes significantly, the DB2 optimizer will not have accurate information upon which to rely when determining optimal access paths. And that means your performance will probably suffer.
The second standard method for tuning DB2 access paths is
ensuring that the DB2 objects are properly organized. Disorganized objects, if
properly reorganized, might be chosen for an access path. An object is
disorganized when data modification statements executed against the object
cause data to be stored in a non-optimal fashion, such as non-clustered data or
data that exists on a different page than its RID, thereby spanning more than
one physical page. To organize these objects more efficiently, run the REORG
utility, followed by RUNSTATS and REBIND.
Use the DB2 Catalog and Real Time Statistics (RTS) to determine when your DB2 table
spaces and indexes need to be reorganized. A few guidelines you might follow include:
- Reorganize a table space when the cluster ratio of its clustering
index falls below 95%. (Schedule this so that it does not affect system
performance and availability.)
- Reorganize indexes when there have been a lot of INSERTs and
DELETEs since the last time it was reorganized. As data is added to and removed
from an index, it can cause disorganization that impacts the performance of
queries using the index. The RTS columns
REORGINSERTS and REORGDELETES can be examined to ascertain
the number of index entries inserted or deleted since the index was
reorganized. A good rule of thumb is to consider reorganizing an index when 25%
or more entries have been inserted or deleted.
Another index-related statistic to pay particular attention to is R
REORGAPPENDINSERT, which contains the number of inserts into an
index since the last reorganization for which the index key was higher than any
existing key value. If this column consistently grows, you have identified an
object where data is inserted using an ascending key sequence. You might
consider lowering the free space for such objects, because the free space is
wasted space if inserts are always done in ascending key sequence. You should
also consider reorganizing the index when 20% or more entries have been appended.
In addition, the number of index leaf page splits should also be
examined when considering index reorganization. The RTS column that indicates
this metric is REORGLEAFAR. Think about reorganizing the index at 10% or more.
Consider reorganizing any DB2 table space or index when its data
set is in multiple physical extents. Before reorganizing, ensure that space
allocations have been modified to cause all data to be stored in a single
extent.
You may want to reorganize more frequently than indicated here by
creating scheduled REORG
jobs for heavily accessed or critical DB2 table spaces and indexes. This limits
performance problems due to disorganized DB2 objects and reduces the number of
reorganizations that must be manually scheduled or submitted by a DBA or
performance analyst.
Also, keep in mind that online REORGs can be run thereby mitigating the impact ot availability that reorganization can cause. For your largest and most important table spaces you might want to consider advanced REORG utilities such as those offered by third party vendors.
The third standard method for tuning DB2 access paths is to
encourage parallelism. Consider changing simple and segmented table spaces to
partitioned table spaces to encourage I/O, CPU, and Sysplex parallelism.
Furthermore, it may be advantageous to repartition already partitioned table
spaces to better align ranges of values, thereby promoting better parallel
access.
Note: Partitioning is not mandatory for parallel access, but it does aid in parallel access path selection and execution.
The fourth standard method for tuning DB2 access paths is
ensuring that there are proper indexes by creating new indexes or dropping
unnecessary and unused indexes. DB2 relies on indexes to achieve optimum
performance.
Analyze the predicates in your SQL statements to determine
whether there is an index that DB2 can use. Indexes can be used efficiently by
DB2 if the first column of the index key is specified in an indexable predicate
in the SQL statement. If no index meets these
requirements, consider creating one. As you index more columns referenced in
predicates, performance generally increases.
Dropping unused indexes is another critical part of application
tuning. Every table INSERT and DELETE
incurs I/O to every index defined for that table. Every UPDATE
of indexed columns incurs I/O to every index defined for that column. If an
index is not being used, drop it. This reduces the I/O incurred for data
modification SQL statements, reduces RUNSTATS resource
requirements, and speeds REORG and RECOVER
processing.
If table
data is accessed by random reads of small amounts of data, consider building a
hash to improve the performance of those reads.
You should also know which predicates are
Stage 1 and which are Stage 2 and try to favor Stage 1 to improve the performance of your queries.
That's all for today... but tune in again next week for more ways to influence DB2 access paths...