In the first installments of this series of posts on ways to influence the DB2 optimizer we looked at standard methods like RUNSTATS and SQL tweaking techniques. Today, we will look at yet another
method of influencing access path selection: specifying the OPTIMIZE
FOR n ROWS for a SELECT
statement.
This clause enables programmers to specify the estimated maximum
number of rows that will be retrieved. By indicating that a different number of rows will be returned
than DB2 anticipates, you can influence access path selection.
For example,
consider the following statement:
EXEC SQL
DECLARE OPT_CUR FOR
SELECT WORKDEPT, EMPNO, SALARY
FROM DSN81010.EMP
WHERE WORKDEPT IN (‘A00’, ‘D11’)
OPTIMIZE FOR 5 ROWS
END-EXEC.
In this example, the number of rows to be optimized for has been set to 5. Keep in mind, though, that this this query could return more than 5 rows. DB2 formulates an access path
optimized for 5 rows. More rows can be retrieved, but performance could suffer
if you greatly exceed the estimated maximum.
Note: The FETCH FIRST n ROWS ONLY clause can be used to limit the number of rows returned regardless of the number that actually qualify.
Using OPTIMIZE FOR generally is preferable to both manually updating the DB2
Catalog statistics (our next topic, so say tuned) and tweaking the SQL statement (our previous topic). It provides more information
to DB2’s optimization process, thereby giving DB2 the opportunity to establish
a better access path. The crucial point, though, is that DB2 is doing the
optimization; no manual updates or artificial SQL constructs are required.
When using the OPTIMIZE FOR n ROWS clause, it is a good idea to make n
as accurate as possible. An accurate estimate gives DB2 the best opportunity to
achieve optimum performance for the statement and also helps document the
purpose of the SQL statement. For example, if you are coding an online transactions in which 25 rows (for example) are
displayed on the screen, code OPTIMIZE FOR 25 ROWS.
But this is not really a hard and fast rule. You can tinker with different values to see what impact it may have on the optimization process.
There is another "trick" that you can deploy using OPTIMIZE
FOR n ROWS. By setting n=1 you should be able to disable list prefetch for the query.
One caution to keep in mind: DB2 uses the value of n for the block size of a
distributed network request. The smaller the value of n,
the fewer rows sent across the network for each block. The only exception is
that when n=1, DB2 will set the block size to 16.
Synopsis
So if you are having trouble with a specific access path, consider using the OPTIMIZE FOR n ROWS clause with various values for n. It just might give you some reasonable results.
No comments:
Post a Comment