Many shops battle with locking problems, and frequently the cause of poor performance can be traced back to locking -- more specifically, to lock timeout issues. When you experience a timeout, it means that another process holds a lock on the data that you are trying to modify. You may experience timeout troubles as the dreaded -911 SQLCODE. Timeouts are different from most performance issues because the job or user receiving the -911 SQLCODE is usually not the source of the problem. Instead, the jobs causing the problem typically continue to run just fine, bringing about problems for everybody else!
The primary factors to examine when looking to
resolve timeout issues are:
- The lock size parameter that was specified at table space creation time.
- The duration of the locks established by your BIND strategy. Lock duration is determined by a combination of the ACQUIRE and RELEASE parameters for the table space locks and the ISOLATION parameter for the page (or row) locks (see the sketch after this list).
- The use of LOCK TABLE statements in any programs (See DB2 Locking Part 14).
- The scope of the commit in the application code.
- The manner in which the table space was started (RW or RO).
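To make the first two factors concrete, here is a minimal sketch of where these parameters live. The database, table space, plan, and collection names are hypothetical, and the values shown are only examples, not recommendations for every workload:

    -- Lock size is specified (or altered) at the table space level
    CREATE TABLESPACE TS_ORDER IN DB_SALES
           SEGSIZE 64
           LOCKSIZE PAGE;     -- usually PAGE; ROW only where truly needed

    -- Lock duration comes from the BIND parameters (a DSN subcommand,
    -- not SQL); ISOLATION can also be specified at the package level
    BIND PLAN(ORDPLAN) PKLIST(ORDCOLL.*) ACQUIRE(USE) RELEASE(COMMIT) ISOLATION(CS)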
Most of the time, resource contention and
timeouts are due to long-running programs that are not issuing COMMITs…or
perhaps are not committing their work frequently enough. Issuing a COMMIT ends
a unit of recovery and hardens all database modifications made during that unit
of recovery.
So be sure to verify that all of your batch processes -- especially any that run concurrently with other workload (but really all batch processes) -- have a COMMIT strategy. This means that your programs should issue a COMMIT after processing a set number of inserts, updates, and deletes. A COMMIT tells DB2 to make the changes permanent and to release the locks. A good rule of thumb is to strive for having no lock held for more than five seconds.
A reasonable approach is to keep a counter that is incremented after every modification. Check it after each modification and, when it exceeds a predefined threshold -- say 25, 50, or 100 modifications -- issue a COMMIT. You should make the threshold an input parameter so that you can change
it as the workload in your system changes. For example, make it 25 when
concurrent activity is high, but ramp it up to 100 or higher when it is low.
Failure to issue COMMITs will result in timeouts, as well as possibly deadlocks
and lock escalation.
For a more in-depth discussion on the importance of issuing COMMITs in your application programs, review DB2 Locking Part 9.
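As a rough illustration of the counter technique, here is a minimal sketch written as a DB2 native SQL procedure (the same logic applies in a COBOL or Java batch program). The procedure, table, and column names and the purge predicate are all hypothetical, and it assumes the procedure is not invoked from a context (such as a trigger) that disallows COMMIT:

    CREATE PROCEDURE PURGE_CLOSED_ORDERS (IN P_COMMIT_LIMIT INTEGER)
      LANGUAGE SQL
    BEGIN
      DECLARE V_ORDER_ID     INTEGER;
      DECLARE V_SINCE_COMMIT INTEGER DEFAULT 0;
      DECLARE V_DONE         INTEGER DEFAULT 0;
      -- WITH HOLD keeps the cursor open across each COMMIT
      DECLARE C1 CURSOR WITH HOLD FOR
        SELECT ORDER_ID
          FROM ORDER_HIST
         WHERE STATUS = 'CLOSED'
           FOR UPDATE;
      DECLARE CONTINUE HANDLER FOR NOT FOUND SET V_DONE = 1;

      OPEN C1;
      FETCH C1 INTO V_ORDER_ID;
      WHILE V_DONE = 0 DO
        DELETE FROM ORDER_HIST WHERE CURRENT OF C1;
        SET V_SINCE_COMMIT = V_SINCE_COMMIT + 1;
        IF V_SINCE_COMMIT >= P_COMMIT_LIMIT THEN
          COMMIT;                   -- harden the changes and release the locks
          SET V_SINCE_COMMIT = 0;
        END IF;
        FETCH C1 INTO V_ORDER_ID;   -- repositions the held cursor after a COMMIT
      END WHILE;
      CLOSE C1;
      COMMIT;                       -- pick up the final partial batch
    END

Making P_COMMIT_LIMIT a parameter is what lets you commit every 25 modifications during peak hours and every 100 or more when concurrent activity is low.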
Another practical approach for reducing lock duration is to make data modifications as close to the end of the unit of work as possible. Look at all of your programs, both batch and online, and try to move the data modification statements as close to the COMMIT as you can. By saving the data modifications until right before you issue a COMMIT, you reduce the overall average lock duration, which should reduce contention and, therefore, the number of timeouts.
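As a simple sketch of the idea (the table and column names are hypothetical), structure the unit of work so that the read-only work comes first and the statements that take exclusive locks come last:

    -- Early in the unit of work: reads only; with ISOLATION(CS) these
    -- locks are short-lived and no X-locks have been taken yet
    SELECT CURR_BAL
      FROM ACCT
     WHERE ACCT_ID = 1001;

    -- ... edits, validation, and other non-database work here ...

    -- Just before the COMMIT: the modifications that take X-locks
    UPDATE ACCT
       SET CURR_BAL = CURR_BAL - 100.00
     WHERE ACCT_ID = 1001;

    INSERT INTO ACCT_HIST (ACCT_ID, TRAN_AMT)
    VALUES (1001, -100.00);

    COMMIT;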
Some additional suggestions to maximize
concurrency and reduce timeouts include:
- Use WHERE CURRENT OF CURSOR to perform UPDATEs and DELETEs in your programs.
- Ensure that you have created your table spaces with the appropriate LOCKSIZE (usually PAGE, but sometimes perhaps ROW); see the examples after this list.
- If you have tables that are static, consider starting their table space as RO instead of RW. Doing so can enable DB2 to utilize table locking instead of page or row locking for those tables.
- Limit the scope of SQL statements so that a statement locks 150 or fewer pages on a table for each unit of work that is performed.
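Two of these suggestions translate directly into DDL and a DB2 command. The database and table space names below are hypothetical:

    -- Row-level locking for a table space that suffers from hot pages
    -- (PAGE remains the usual default choice)
    ALTER TABLESPACE DB_SALES.TS_ORDER LOCKSIZE ROW;

    -- Start a static, read-only table space as RO so DB2 can use
    -- table space-level locking instead of page or row locks
    -START DATABASE(DB_SALES) SPACENAM(TS_LOOKUP) ACCESS(RO)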
Finally, if you want to investigate timeout
details, be sure to examine the statistics trace class 3 and IFCID 0196 for
timeouts (IFCID 0172 is for deadlocks).
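For instance, a statistics trace that externalizes those records might be started with a command along these lines (the destination and any additional filtering depend on your site standards):

    -START TRACE(STAT) CLASS(3) DEST(SMF)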