So far we have discussed several standard methods of influencing DB2’s access path selection -- in the first 4 parts of this series (1, 2, 3, 4) -- as well as updating DB2 Catalog statistics in part 5. But there is another tool in your DBA arsenal that you can use to force/influence access path selection - optimization hints, which will be the focus of today's post.
Using Optimization Hints to Force an Access Path
Optimization hints can be coded to influence the DB2 Optimizer's choice of access path. Actually, though,
this method does not “influence” the access path; instead it directs DB2 to use
a specific access path instead of determining a new access path using
statistics.
The same basic cautions that apply to modifying DB2 Catalog
statistics also apply to optimization hints. Only experienced analysts and DBAs
should attempt to use optimization hints. However, optimization hints are much
easier to apply than updating DB2 Catalog statistics.
There are several methods of implementing optimization hints. First, you can code optimization hints using the PLAN_TABLE.
However, before you can use optimization hints, the DB2 DSNZPARM parameter for
optimization hints (OPTHINTS) must be set to YES.
If it is set to NO, you cannot use optimization hints.
There are two ways to specify an optimization hint to the PLAN_TABLE:
- Modify PLAN_TABLE data to use an access path that was
previously created by the DB2 optimizer, or;
- INSERT
rows to the PLAN_TABLE to create a new access path
independently.
In general, favor the first method over the second . It is
a difficult task to create from scratch an accurate access path in the PLAN_TABLE.
If you do not get every nuance of the access path correct, it is possible that
DB2 will ignore the optimization hint and calculate an access path at bind
time. However, if you use an access path that was originally created by DB2,
you can be reasonably sure that the access path will be valid. Of course, sometimes an access path created for an older version of DB2 will
not be valid in a newer version of DB2.
You should consider using optimization hints for the same reasons
you would choose to modify DB2 Catalog statistics or tweak SQL. The general
reason is to bypass the access path chosen by DB2 and use a different, hopefully
more efficient, access path.
In addition to this reason, optimization hints are very useful as
you migrate from release to release of DB2. Sometimes, a new release or version
of DB2 can cause different access paths to be chosen for queries that were running
fine. Or perhaps new statistics were accumulated between binds causing access
paths to change. By saving old access paths in a PLAN_TABLE,
you can use optimization hints to direct DB2 to use the old access paths
instead of the new, and perhaps undesirable, access paths due to the new
release or statistics.Of course, ever since DB2 9 for z/OS, you can use the plan management feature
to save old access paths across rebinds. This is a more effective approach than
saving them yourself.
Always test and analyze the results of any query that uses
optimization hints to be sure that the desired performance is being achieved.
Defining an Optimization Hint
To specify that an optimization hint is to be used, you
will have to ensure that the PLAN_TABLE has the appropriate columns in it:
OPTHINT VARCHAR(128) NOT NULL WITH DEFAULT
HINT_USED VARCHAR(128) NOT NULL WITH DEFAULT
PRIMARY_ACCESSTYPE CHAR(1) NOT NULL WITH DEFAULT
To set an optimization hint, you need to first identify (or create) the PLAN_TABLE rows that refer to the desired access path. You will then need to update those rows in the PLAN_TABLE, specifying an identifier for the hint in the OPTHINT column. For example,
UPDATE PLAN_TABLE
SET OPTHINT = ‘SQLHINT’
WHERE PLANNO = 50
AND APPLNAME = ‘PLANNAME’;
Of course, this is just an example. You may need to use other predicates to specifically identify the PLAN_TABLE rows to include in the optimization hint. Some columns that might be useful, depending on your usage of dynamic SQL and packages, include QUERYNO, PROGNAME, VERSION, and COLLID.
Keep in mind, though, that when you change a program that uses static SQL statements, the
statement number might change, causing rows in the PLAN_TABLE to be out
of sync with the modified application. For this reason, you should probably choose the newer method of specifying optimization hints, which I will discuss in a moment.
You can use the QUERYNO clause in SQL
statements to ease correlation of SQL statements in your program with your
optimization hints. Statements that use the QUERYNO
clause are not dependent on the statement number. To use QUERYNO,
you will need to modify the SQL in your application to specify a QUERYNO,
as shown in the following:
SELECT MGRNO
FROM DSN81010.DEPT
WHERE DEPTNO = ‘A00’
QUERYNO 200;
You can then UPDATE the PLAN_TABLE more easily using QUERYNO and be sure that the optimization hint will take effect, as shown in the following:
UPDATE PLAN_TABLE
SET OPTHINT = ‘SQLHINT’
WHERE QUERYNO = 200
AND APPLNAME = ‘PLANNAME’;
When the PLAN_TABLE is correctly updated (as well as possibly the application), you must REBIND the plan or package to determine if the hint is being used by DB2. When rebinding you must specify the OPTHINT parameter:
REBIND PLAN PLANNAME . . . OPTHINT(SQLHINT)
Be aware that the optimization hints may not actually be used by
DB2. For optimization hints to be used, the hint must be correctly specified,
the REBIND must
be accurately performed, and the environment must not have changed. For
example, DB2 will not use an access path specified using an optimization hint
if it relies on an index that has since been dropped.
Use EXPLAIN(YES) to verify whether the hint was
actually used. If the hint was used, the HINT_USED
column for the new access path will contain the name of the optimization hint
(such as SQLHINT in the previous example).
Optimization Hints and DB2 10 for z/OS
As of DB2 Version 10, applying optimization hints is much easier
than it was in prior versions of DB2. You can build the access path repository
using the BIND QUERY command. The access path repository
contains system-level access path hints and optimization options.
These statement-level optimization hints, also known as
instance-based statement hints, are enforced across the entire DB2 subsystem
based upon the actual text of the SQL statement.
The access path repository contains information about queries
including the text of the text, access paths, and optimization options. Using
the access path repository, you can save multiple copies of access paths and
switch back and forth between different copies of access paths for the same
query. The access path repository resides in the DB2 Catalog and is composed of
the following tables:
- The primary table
in the access path repository is SYSIBM.SYSQUERY. It
contains one row for each static or dynamic SQL query.
- SYSIBM.SYSQUERYPLAN
holds access path details for queries in SYSQUERY.
A query can have more than one access path for it.
- SYSIBM.SYSQUERYOPTS
stores miscellaneous information about each query.
The access path repository tables are populated when you issue
the BIND
QUERY command. Before running BIND QUERY,
though, you must first populate the DSN_USERQUERY_TABLE
with the query text you want to bind. Table 1 below depicts the columns of the DSN_USERQUERY_TABLE.
Table 1. DSN_USERQUERY_TABLE Columns
Column
|
Description
|
QUERYNO
|
An integer value
that uniquely identifies the query. It can be used to correlate data in this
table with the PLAN_TABLE.
|
SCHEMA
|
The default schema name to be used for unqualified database
objects used in the query (or blank).
|
HINT_SCOPE
|
The scope of the access plan hint:
System-level access plan hint
Package-level access plan hint
|
QUERY_TEXT
|
The actual text of the SQL statement.
|
USERFILTER
|
A filter name that can be used to group a set of queries together
(or blank).
|
OTHER_OPTIONS
|
IBM use only (or blank).
|
COLLECTION
|
The collection name of the package (option for package-level
access plan hints).
|
PACKAGE
|
The name of the package (option for package-level access plan
hints).
|
VERSION
|
The version of the package (option for package-level access plan
hints); if '*' is specified, DB2 uses only COLLECTION
and PACKAGE
values to look up rows in the SYSIBM.SYSPACKAGE and SYSIBM.SYSQUERY catalog tables.
|
REOPT
|
The value of the REOPT BIND parameter:
A: REOPT(AUTO)
1: REOPT(ONCE)
N: REOPT(NONE)
Y: REOPT(ALWAYS)
blank:
Not specified
|
STARJOIN
|
Contains an indicator specifying whether star join processing was
enabled for the query:
Y: Yes, star join enabled
N: No, star join disabled
blank:
Not specified.
|
MAX_PAR__DEGREE
|
The maximum degree of parallelism (or -1 if not specified).
|
DEF_PAR_DEGREE
|
Indicates whether parallelism was enabled:
ONE: Parallelism disabled
ANY: Parallelism enabled
Blank:
Not specified
|
SJTABLES
|
Contains the minimum number of tables to qualify for star join
(or [nd]1 if not specified).
|
QUERYID
|
Identifies access plan hint information in the SYSIBM.SYSQUERY and SYSIBM.SYSQUERYPLAN
tables.
|
OTHER_PARMS
|
IBM use only (or blank).
|
You can populate the DSN_USERQUERY_TABLE using
simple INSERT
statements. Consider using a subselect from the DB2 Catalog to assist, such as
in the following example:
INSERT INTO DSN_USERQUERY_TABLE
(QUERYNO, SCHEMA, HINT_SCOPE, QUERY_TEXT, USERFILTER,
OTHER_OPTIONS, COLLECTION, PACKAGE, VERSION, REOPT,
STARJOIN, MAX_PAR_DEGREE, DEF_CURR_DEGREE, SJTABLES,
OTHER_PARMS)
SELECT SPS.STMTNO, 'SCHEMANAME', 1, SPS.STATEMENT, '',
'', SPS.COLLID, SPS.NAME, SPS.VERSION, '',
'', -1, '', -1, ''
FROM SYSIBM.SYSPACKSTMT SPS
WHERE SPS.COLLID = 'COLLID1'
AND SPS.NAME = 'PKG1'
AND SPS.VERSION = 'VERSION1'
AND SPS.STMTNO = 177;
This particular INSERT statement retrieves data from SYSIBM.SYSPACKSTMT for statement number 177 of VERSION1 of the PKG1 package in collection COLLID1.
You still must populate the PLAN_TABLE with hints. Then you need to run the BIND QUERY command to build the data in the access path repository. Running BIND QUERY LOOKUP(NO) reads the statement text, default schema, and bind options from DSN_USERQUERY_TABLE, as well as the system-level access path hint details from correlated PLAN_TABLE rows, and inserts the data into the access path repository tables.
After you have populated the access path repository, it is a good idea to delete the statement from the DSN_USERQUERY_TABLE. Doing so ensures that hints are not replaced when you issue subsequent BIND QUERY commands.
When a statement-level hint is established, DB2 tries to enforce it for that statement. Hints for static SQL statements are validated and applied when you REBIND the package that contains the statements. Hints for dynamic SQL statements are validated and enforced when the statements are prepared.
To remove hints from the access path repository, use the FREE QUERY command.
Runtime Options Hints
You can also use BIND QUERY to set
run-time options for a given SQL statement text. This allows you to control how
a particular statement should behave without trying to force its actual access
path.
To specify a run-time options hint, INSERT a row into
DSN_USERQUERY_TABLE indicating the QUERYNO as you would for an optimization
hint. Before running BIND QUERY, however, make sure that the PLAN_TABLE does
not contain a corresponding access path for the query.
If the PLAN_TABLE
contains an access path when you run BIND QUERY LOOKUP(NO), the SYSIBM.SYSQUERYOPTS
table won't be populated. If you err and issue the BIND QUERY when
data exists in the PLAN_TABLE,
simply DELETE
the pertinent PLAN_TABLE
data and rerun the BIND
QUERY command.
The REOPT, STARJOIN, MAX_PAR_DEGREE,
DEF_CURR_DEGREE,
SJTABLES,
and GROUP_MEMBER
columns can be used to set the run-time options using DSN_USERQUERY_TABLE.
Summary
Optimization hints provide a strong option for particularly vexing DB2 tuning problems. Consider investigating their usage for troublesome queries that once ran efficiently (and you still have the PLAN_TABLE data) but for whatever reasons, are no longer being optimized effectively.