Monday, August 05, 2013

COPYTOCOPY: The Forgotten Little DB2 Utility?

The COPYTOCOPY utility was introduced some time ago now, way back in DB2 for z/OS Version 7. The purpose of the COPYTOCOPY utility is to make additional image copies of currently existing image copy data sets. But many DBAs "out there" keep plugging away, almost unaware that COPYTOCOPY even exists. So let's spend a little time talking about it today.

The primary benefit of COPYTOCOPY is to reduce the amount of time required to run the COPY utility. Remember that the COPY utility can be used to take up to four image copies with a single execution of the utility. But with COPYTOCOPY available, instead of using COPY to make four image copy backups, the DBA can use COPY to make a single image copy, and then run COPYTOCOPY to make additional image copies. The COPY utility will take less time to create a single image copy backup than it will to take multiple image copy backups. And the combination of COPY plus COPYTOCOPY can, at times, be used to increase availability.

Individual data and index partitions are treated as distinct target objects by the COPYTOCOPY utility. Any other utilities operating on different partitions of the same table space or index space can be run concurrently with COPYTOCOPY.

The following utilities cannot be run concurrently on the same database object as the COPYTOCOPY utility:
  •        COPY
  •        LOAD
  •        MERGECOPY
  •        MODIFY
  •        RECOVER
  •        REORG INDEX
  •        REORG TABLESPACE

Furthermore, COPYTOCOPY is flexible enough to run against any DB2 image copy data set. This includes inline copies made during the execution of the REORG and LOAD utilities. COPYTOCOPY must start with a primary image copy backup – either the local primary or recovery site primary copy. From that image copy, the COPYTOCOPY utility can make up to three copies of one or more of the following types:
  •        local primary
  •        local backup
  •        recovery site primary
  •        recovery site backup

Copies created by COPYTOCOPY can be used by the RECOVER utility just like regular image copies created using the COPY utility. Both table space and index space copies can be made using the COPYTOCOPY utility. Any DB2 utility process that uses image copy data sets can use the image copy data sets created by COPYTOCOPY. This includes MERGECOPY, UNLOAD, and subsequent runs of COPYTOCOPY. However, keep in mind that image copies created with the CONCURRENT option of the COPY utility are not supported by the COPYTOCOPY utility.

Just like the COPY utility, the COPYTOCOPY utility records information about the image copies that it creates in the SYSIBM.SYSCOPY system catalog table. The COPYTOCOPY utility will insert the values in the DSNAME, GROUP_MEMBER, JOBNAME, AUTHID, DSVOLSER and DEVTYPE columns as appropriate depending on the copies that are being created.
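
If you want to see what COPYTOCOPY (and COPY) have recorded for a given object, a simple catalog query does the trick. The following is just an illustrative sketch -- the object names are the sample database and table space used later in this post, and you may want different columns for your own purposes:

    SELECT  DBNAME, TSNAME, DSNUM, ICTYPE, ICBACKUP,
            DSNAME, JOBNAME, TIMESTAMP
    FROM    SYSIBM.SYSCOPY
    WHERE   DBNAME = 'DSN8D71A'
    AND     TSNAME = 'DSN8S71E'
    ORDER BY TIMESTAMP DESC;

The ICBACKUP column is what distinguishes the local primary, local backup, recovery site primary, and recovery site backup copies.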

You cannot run COPYTOCOPY to create additional image copies for certain DB2 Catalog and DB2 Directory objects; specifically, SYSCOPY in DSNDB06, and DBD01 and SYSUTILX in DSNDB01.

The COPYTOCOPY utility operates in these distinct phases:
  1.        UTILINIT – Initialization and setup
  2.        CPY2CPY – Copying the image copy
  3.        UTILTERM – Cleanup

TERM and Restart Issues

The use of the TERM command to terminate a COPYTOCOPY step that has abended is not recommended. A current restart should be done instead to allow COPYTOCOPY to pick up where it left off. Terminating COPYTOCOPY in such a situation might cause inconsistencies between the ICF catalog and the DB2 catalog when GDGs are used.

You cannot use RESTART(PHASE) for a COPYTOCOPY job. It is fine to use RESTART(CURRENT) as long as you avoid using the -TERM UTILITY command to terminate a COPYTOCOPY step. When you use RESTART(CURRENT), COPYTOCOPY will restart from the last commit point with the same image copy data set, so be sure to code a data set disposition of DISP=(MOD,CATLG,CATLG) on your JCL DD statements.

Inline Copy Exception

When using COPYTOCOPY to copy an inline image copy that was made by the REORG utility with the part range option, you will need to specify an individual DSNUM for each partition to be copied. The COPYTOCOPY utility does not support part ranges. COPYTOCOPY will copy only the specified partition data from the input inline image copy data set into the output image copy data set.
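
For example, to copy just the data for partition 3 from an input inline copy, the control statement might look something like the following sketch (the table space is the sample object used later in this post, and COPY2 is a DD name you would define in the JCL):

   COPYTOCOPY TABLESPACE DSN8D71A.DSN8S71E DSNUM 3 COPYDDN(,COPY2)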

COPYTOCOPY Execution

To run the COPYTOCOPY utility it is not necessary to provide the explicit data set name of the image copy being copied. Instead, the input to the COPYTOCOPY utility is the name of the table space, index space, or index for which the original copy was made, and an indication of which image copy in the catalog should be copied. To specify this information COPYTOCOPY provides three options:

  1.         FROMLASTCOPY – indicates that the most recent image copy taken for the table space or index space is to be used as input to the COPYTOCOPY utility. The input could be either a full image copy or incremental copy. The utility will retrieve the information from the SYSIBM.SYSCOPY system catalog table.
  2.         FROMLASTFULLCOPY – indicates that the most recent full image copy taken for the object is to be used as the input to the COPYTOCOPY job. Once again, this information is obtained by querying the DB2 Catalog.
  3.         FROMLASTINCRCOPY – indicates that the most recent incremental image copy taken for the object is to be used as the input to the COPYTOCOPY job. FROMLASTINCRCOPY is not valid for index spaces or indexes. If FROMLASTINCRCOPY is specified for an index space or index, COPYTOCOPY will use the last full image copy that was taken for the index, if one is available. And once again, this information is obtained by querying the DB2 Catalog.

Of course, you may choose instead to specify the data set name for the image copy that is to be copied by the COPYTOCOPY utility. This can be accomplished by using the FROMCOPY clause. But keep in mind that when you are using COPYTOCOPY with a list of objects defined using the LISTDEF statement, the FROMCOPY clause is not valid.
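
When FROMCOPY is appropriate, the control statement simply names the image copy data set to be used as input. This is only a sketch -- the data set name below is invented, so substitute one of your own:

   COPYTOCOPY TABLESPACE DSN8D71A.DSN8S71E
              FROMCOPY DB2.FULLCOPY.DSN8S71E.D2013001
              COPYDDN(,COPY2)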

If the FROMCOPY keyword is not used the COPYTOCOPY utility must determine which specific image copy is to be copied. Before COPYTOCOPY can execute it may have to choose between the local site primary copy, local site backup copy, recovery site primary copy, and recovery site backup copy data sets. COPYTOCOPY will search image copies in the following order to determine the input data set to be used:
  •      If you are running COPYTOCOPY at your local site, the search order will be (1) local site primary copy, (2) local site backup copy, (3) recovery site primary copy, (4) recovery site backup copy.
  •       If you are running the utility at your recovery site, the search order will be (1) recovery site primary copy, (2) recovery site backup copy, (3) local site primary copy, then finally (4) local site backup copy.

If the input data set cannot be allocated or opened, the COPYTOCOPY utility will try to use the next image copy data set (the one with the same START_RBA value in SYSIBM.SYSCOPY), in the search order indicated previously. When the FROMCOPY keyword is used, though, only the explicitly specified data set can be used as the input to COPYTOCOPY.

An Example of the COPYTOCOPY Utility

Let’s take a quick look at a sample JCL job step to run the COPYTOCOPY utility. The following code can be run to make a backup local image copy of the table space DSN8S71E in the sample DB2 database DSN8D71A. This will be either a full or incremental image copy, whichever was last run for this object:

//STEP1    EXEC DSNUPROC,UID='DBAPCSM.CPY2CPYT',
//         UTPROC='',
//         SYSTEM='DB2T',DB2LEV=DB2A
//COPY2    DD DSN=COPY002F.IFDY01,UNIT=SYSDA,VOL=SER=CPY02I,
//         SPACE=(CYL,(15,1)),DISP=(NEW,CATLG,CATLG)
//SYSIN    DD *
   COPYTOCOPY TABLESPACE DSN8D71A.DSN8S71E COPYDDN(,COPY2)

/*


Bottom Line


The COPYTOCOPY utility provides a useful addition to the toolkit of DB2 DBAs. Using COPYTOCOPY to create additional image copies from existing image copies can enhance availability and assist DBAs in creating an optimal backup and recovery plan for their DB2 applications and databases.

Wednesday, July 31, 2013

Compression Becoming More Important in the Age of Big Data

DBAs and database professionals have been aware of the pros and cons of compressing data for years. The traditional argument goes something like this: with compression you can store more data in less space, but at the cost of incurring CPU to compress the data upon insertion (and modification) and decompress the data upon reading it. Over time, the benefits of compression became greater as compression algorithms became more robust, hardware assist chips became available to augment compression speed, and the distributed model of computing made transmitting data across networks a critical piece of the business transaction (and transmitting compressed data is more efficient than transmitting uncompressed data).
IBM has significantly improved compression in DB2 for z/OS over the years. In the early days of mainframe DB2, no compression capability came with DB2 out-of-the-box -- the only mechanism for compressing data was via an exit routine (EDITPROC). Many software vendors developed and sold compression routines for DB2. Eventually, IBM began shipping a sample compression routine with DB2. And then in DB2 Version 3 (1993) hardware-assisted compression was introduced. With the hardware assist, the CPU used by DB2 compression is minimal and the cons list gets a little shorter.
Indeed, one piece of advice that I give to most shops when I consult for them is that they probably need to look at compressing more data than they already are. Compressed data can improve performance these days because, in many cases, you can fit more rows per page. And therefore scans and sequential processes can process more data with the same number of I/Os, thereby improving performance. Of course, you should use the DSN1COMP utility to estimate the amount of savings that can accrue via compression before compressing any existing data.
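
Turning compression on is simple DDL; the real work is deciding where it pays off, which is where DSN1COMP helps. As a sketch (the database and table space names here are made up), enabling compression looks like this, with a subsequent REORG building the compression dictionary and compressing the existing rows:

   ALTER TABLESPACE DBEXMPL.TSEXMPL COMPRESS YES;
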
Eventually, in DB2 9 we even get index compression capability (of course, using different technology than data compression). At any rate, compressing data on DB2 for z/OS is no longer the “only-if-I-have-to” task that it once was.
Then along comes the Big Data phenomenon where increasingly large data sets need to be stored and analyzed. Big Data is typified by data sets that are so large and complex that traditional tools and database systems are ill-suited to process them. Clearly, compressing such data could be advantageous… but is it possible to process and compress such large volumes of data?
New alternatives to traditional systems are being made available that offer efficient resource usage based on principles of compressed sensing and other techniques. One example of this new technology is IBM’s BLU Acceleration, which is included in DB2 10.5 for Linux, Unix, and Windows. One feature of BLU Acceleration is extended compression, which eliminates the need for indexes and aggregation; it operates directly on compressed data, thereby eliminating the CPU time that would be required to decompress it. Advanced encoding maximizes compression while preserving the order of encoding, so compressed data can be quickly analyzed without decompressing it. It is an impressive technology, and no changes are required to your existing SQL statements.
IBM reports that some clients using DB2 10.5 for LUW with BLU Acceleration have achieved compression rates 10 times greater than uncompressed tables.
Of course, BLU Acceleration is much more than compression (it combines in-memory, columnar, and compression technologies), but for the purposes of today’s blog entry we won’t delve deeper into the technology. If you are interested in a little bit more on BLU, read my high-level overview in my coverage of this year’s IDUG DB2 Technical Conference.

So compression is becoming cool… who’d have thought that back in the 1980s when compression was something we only did when we absolutely had to?

Friday, July 26, 2013

Top Ten Most Common DB2 Performance Problems


  1. PEBCAK 

    The number one cause of DB2 performance problems today, as always, is Problem Exists Between Chair And Keyboard!
     
  2. Poorly coded SQL

    Many performance problems can be traced back to inappropriately coded SQL. Code it correctly from the beginning and tune what is already out there.
         
  3. Improper indexing

    Optimize performance via indexing by workload, not by object.  
     
  4. Bad program design

    Coding DB2 SQL for Performance in your application programs from the outset can eliminate many future problems.


  5. Bachelor programming syndrome

    Yes, the dreaded "Fear of Committing" can cause performance problems due to concurrency issues.
         
  6. Improperly defined buffer pools

    Defining effective buffer pools for your DB2 workload is important. There are many things you can do to identify the proper settings and sizing of your DB2 buffer pools.
         
  7. Index / table space needs to be reorganized

    Reorganization of indexes and table spaces can improve performance. Follow the Five R's to assure optimal DB2 application performance.
     
  8. Improperly designed database structures

    Designing database structures correctly -- from the beginning -- is the way to go for efficiency and efficacy.
     
  9. Copied code syndrome

    Code copied from another program may not be appropriate -- or optimal -- for its new purpose; review and tune reused SQL rather than assuming it will perform well in a new context.
  10. RUNSTATS not up-to-date (or not even run)

    How can you expect the DB2 Optimizer to do its thing on your SQL without statistics about your data and environment? Again, The Five R's!

And if a Top Ten list does not provide enough detail for you, then splurge for a copy of my book: DB2 Developer's Guide, 6th edition. Recently updated for DB2 10 for z/OS, the book delivers over 1600 pages full of DB2 tips, tricks, guidelines, and details...  It rates 5 stars on Amazon!

Friday, July 19, 2013

DB2 Locking, Part 17: In Conclusion

Today's blog post concludes our multi-part series on DB2 for z/OS locking and concurrency. We have touched on a great many aspects of locking in this series. Such an in-depth, technical topic as DB2 locking can be difficult to master, but doing so can deliver a clearer understanding of how DB2 operates and how your programming and design decisions impact not only your application, but also the entire DB2 subsystem.

A Couple Additional Locking Ideas and Thoughts

Before concluding this series, here are a few more guidelines and thoughts on DB2 locking and concurrency:

  • It is a good idea to use clustering to encourage DB2 to maintain data that is accessed together on the same page. If you use page locking, fewer locks are required to access multiple rows if the rows are clustered on the same page or pages. You can also use larger page sizes to control the amount of data locked for page locking.

  • Consider using the free space parameters to influence locking. If you increase free space, fewer rows are stored on a single page. Therefore, fewer rows are locked by a single page lock. This approach can decrease contention. However, it consumes additional disk storage, and it can also decrease the performance of table space scans (because more pages with fewer rows must be read). Additionally, keep in mind that as data is added to the table the free space will decrease (because the new data is using it). As such, locking issues may become more prevalent.
  • You can also decrease the number of rows per page using the MAXROWS option of the CREATE TABLESPACE statement. The fewer rows per page, the less intrusive page locking will be because fewer rows will be impacted by a page lock. This approach is probably better than the free space approach (previous bullet) because new data will not impact the number of rows per page. (A sketch of both approaches follows this list.)

  • Design your application programs with locking considerations in mind. This is the Number One thing to remember to increase concurrency and minimize the impact of locking on DB2 application and system performance. You can minimize the effect of locking through proper application program design. This means:
  • Limiting the number of rows that are accessed by coding predicates to filter unwanted rows
  • Requesting only the data (rows and columns) that you actually need to perform your business processes
  • Performing modifications as close to the end of the unit of work as possible
  • And remembering to avoid bachelor programming syndrome (see Part 9).
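
As a sketch of the free space and MAXROWS approaches mentioned above (all object names and values here are hypothetical -- test against your own workload before adopting anything like this):

   -- Spread rows out by leaving more free space on each page
   ALTER TABLESPACE DBEXMPL.TSEXMPL1 PCTFREE 25;

   -- Or cap the number of rows stored per page directly
   ALTER TABLESPACE DBEXMPL.TSEXMPL2 MAXROWS 20;

Either way, keep in mind that existing pages are not repacked until the table space is reorganized.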

Summary

Of course, there are probably many more hints, tips, and guidelines for developing DB2 databases and applications with concurrency in mind, but I think a 17 part series is sufficient for my blog. If you want more details on concurrency (or any other aspect of DB2 for z/OS) might I recommend the latest edition of my book -- DB2 Developer's Guide, 6th edition.

And be sure to come back and review this series on locking if you get confused as you work to maximize the concurrency of your DB2 queries, transactions, and programs. 

Finally, as a service to my readers, this post includes a directory/index to the 16 separate posts that make up this series.

Index of Blog Posts on DB2 Locking


Monday, July 15, 2013

DB2 Locking, Part 16: Skipping Uncommitted Inserts

DB2 10 for z/OS introduces a new technique for concurrency. As we have discussed previously in this series on DB2 locking, one of the most troublesome problems for DB2 performance analysts is reducing timeouts and lock suspensions.

The CONCURRENTACCESSRESOLUTION parameter can be used to specify a concurrent access resolution option to use for statements in a package when binding your application program.
There are two options. The first, USECURRENTLYCOMMITTED, indicates that DB2 can use the currently committed version of the data for applicable scans when data is in the process of being updated or deleted. Rows that are in the process of being inserted can be skipped. This clause applies only when the isolation level in effect is cursor stability (CS) or read stability (RS); it is ignored for other isolation levels.

The second option is WAITFOROUTCOME, which indicates that applicable scans must wait for a COMMIT or ROLLBACK operation to complete when data is in the process of being updated or deleted. Rows that are in the process of being inserted are not skipped.

The default behavior is WAITFOROUTCOME. Instead, if you choose to specify USECURRENTLYCOMMITTED, DB2 will ignore rows that are in the process of being inserted and use only currently committed rows. This might be desirable for highly concurrent web transactions or to mimic the application concurrency settings or behavior of another DBMS.
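
As a sketch, the bind might look something like this (the collection and package names are placeholders; verify the option syntax against your DB2 10 documentation before relying on it):

   BIND PACKAGE(COLL1) MEMBER(PROGA) -
        ISOLATION(CS) -
        CONCURRENTACCESSRESOLUTION(USECURRENTLYCOMMITTED)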

Another option at your disposal is to specify that uncommitted inserts are to be skipped at the subsystem level (using option 19 on panel DSNTIPB during DB2 installation or by assembling a new DSNZPARM). The SKIPUNCI subsystem parameter specifies whether statements ignore a row that was inserted by another transaction if the row has not yet been detected as committed. A newly inserted row can be detected as committed only after the lock held on the row has been released.

There are two options for this parameter:
  • Specifying YES for SKIPUNCI causes DB2 to behave as though a newly inserted row has not yet arrived; the row is skipped until the lock held on it is released.
  • SKIPUNCI(NO) indicates that DB2 will wait for the inserted row to be committed or rolled back. It then processes the row if the insert commits, or it moves on to find another row if the insert is rolled back. If a transaction performs one or more inserts, and then spawns a second transaction, specify NO for SKIP UNCOMM INSERTS if the first transaction needs the second transaction to wait for the outcome of the inserts. This is the default value.
By using either of these two methods to skip uncommitted inserts you can improve concurrent access to data as you tune your application’s locking and concurrency requirements.

Tuesday, July 09, 2013

DB2 Locking, Part 15: Tackling Timeout Troubles

Many shops battle with locking problems and, frequently, the cause of a performance issue can be traced back to locking; more specifically, to lock timeouts. When you experience a timeout, it means that another process holds a lock on the data that you are trying to modify.
You may experience timeout troubles as the dreaded -911 SQLCODE. Timeouts are different from most performance issues because the job or user receiving the -911 SQLCODE is usually not the source of the problem. Instead, the jobs causing the problem typically continue to run just fine, bringing about problems for everybody else!
The primary factors to examine when looking to resolve timeout issues are:
  • The lock size parameter that was specified at table space creation time.
  • The duration of the locks established by the BIND strategies. Lock duration is determined by a combination of the ACQUIRE and RELEASE strategies for the table space locks and the ISOLATION parameter for the page locks.

  • The use of LOCK TABLE statements in any programs (See DB2 Locking Part 14).
  • The scope of the commit in the application code.
  • The manner in which the table space was started (RW or RO).
Most of the time, resource contention and timeouts are due to long-running programs that are not issuing COMMITs…or perhaps are not committing their work frequently enough. Issuing a COMMIT ends a unit of recovery and hardens all database modifications made during that unit of recovery.
So be sure to verify that all of your batch processes -- especially any that run concurrently with other workload (but really all batch processes) -- have a COMMIT strategy. This means that your programs should issue a COMMIT after processing "a set number of" inserts, updates, and deletes. A COMMIT tells DB2 to make the changes permanent and releases the locks. A good rule of thumb is to strive for having no lock held for more than five seconds.
A reasonable approach is to set a counter that is incremented after every modification. Then check the counter and, when it exceeds a predefined threshold -- say 25, 50, or 100 modifications -- issue a COMMIT. You should make the threshold an input parameter so that you can change it as the workload in your system changes. For example, make it 25 when concurrent activity is high, but ramp it up to 100 or higher when it is low. Failure to issue COMMITs will result in timeouts, as well as possibly deadlocks and lock escalation.
For a more in-depth discussion on the importance of issuing COMMITs in your application programs review DB2 Locking Part 9.
Another practical approach for reducing lock duration is to make data modifications as close to the end of the unit of work as possible. Look at all of your programs, both batch and online, and try to save the data modification statements to as close to the COMMIT as you can. By saving the data modification until right before you issue a COMMIT, you reduce the overall average lock duration. This should result in reducing contention and therefore, the number of timeouts.
Some additional suggestions to maximize concurrency and reduce timeouts include:
  • Use WHERE CURRENT OF CURSOR to perform UPDATEs and DELETEs in your programs.
  • Ensure that you have created your table spaces with the appropriate LOCKSIZE (usually PAGE, but sometimes perhaps ROW).
  • If you have tables that are static, consider starting their table space as RO instead of RW. Doing so can enable DB2 to utilize table locking instead of page or row locking for those tables.
  • Limit the scope of SQL statements so that a statement locks 150 or fewer pages on a table for each unit of work that is performed.

Finally, if you want to investigate timeout details, be sure to examine the statistics trace class 3 and IFCID 0196 for timeouts (IFCID 0172 is for deadlocks).

Saturday, July 06, 2013

DB2 Locking, Part 14: Using the LOCK TABLE Statement

We continue the series on DB2 locking today with a look at the LOCK TABLE SQL statement.

You can issue the LOCK TABLE statement in your application programs to raise the lock granularity to the table (actually, table space) level. Doing so means that you will not need to take lower level locks (at the page or row level, whichever is in effect for the table space in question). Issuing a LOCK TABLE in a program can make sense in certain, specific circumstances. Consider using the LOCK TABLE statement to control the efficiency of locking in programs that will issue many page/row lock requests when there are no concurrent requests for the same data. Issuing a LOCK TABLE, at times, can be a reasonable alternative to using an ISOLATION level of RR or RS when a large percentage of a table's rows or pages will be modified.

If your environment can withstand the concurrency hit caused by LOCK TABLE, you can gain performance by reducing locking activity. Taking a larger granularity lock at the table(space) level instead of multiple page or row locks will improve the performance of your program, albeit at the expense of concurrent activity to the data. If you wish to avoid modifying the program, you can START the table space in read-only mode (RO) and achieve similar results (no locks are taken because the data cannot be modified). This can be accomplished using a command like so:

-START DATABASE(DBNAME) SPACENAM(SPNAME) ACCESS(RO)

Of course, if you go that route you will have to make sure that you restart the table space for read write activity (RW) after the program finishes. This can be unwieldy to implement.

If, instead, you wish to use the LOCK TABLE approach, there are two types of LOCK TABLE requests. The LOCK TABLE...IN SHARE MODE statement acquires an S-lock on the table specified in the statement. This locking strategy effectively eliminates the possibility of concurrent modification programs running while the LOCK TABLE is in effect. The S-lock is obtained on the table space for tables contained in non-segmented table spaces. This is important to understand, especially if you have multi-table table spaces.

The LOCK TABLE...IN EXCLUSIVE MODE statement acquires an X-lock on the table specified in the statement. All concurrent processing is suspended until the X-lock is released. Again, for non-segmented table spaces, the X-lock is obtained on the table space, not the table.
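
Using the sample employee table referenced elsewhere in this blog, the two forms look like this (remembering that for non-segmented table spaces the lock actually lands on the table space):

   LOCK TABLE DSN8710.EMP IN SHARE MODE;

   LOCK TABLE DSN8710.EMP IN EXCLUSIVE MODE;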

In both cases, you can specify the PART parameter to indicate that only a specific partition is to be locked. For example, to lock only the third partition of a partitioned table space, you can issue: LOCK TABLE...PART 3 IN EXCLUSIVE MODE.
The table locks acquired as a result of the LOCK TABLE statement are held until the next COMMIT point unless RELEASE(DEALLOCATE) was specified for the plan issuing the LOCK TABLE statement. In that situation, the lock is held until the program terminates. That means, for RELEASE(COMMIT) programs, you will need to issue the LOCK TABLE again after each COMMIT or processing will revert to page/row locking.

Also, keep in mind that the lock will not take effect until the statement executes, even if ACQUIRE(ALLOCATE) was coded at BIND time. 

Monday, July 01, 2013

DB2 Locking, Part 13: Optimistic Locking

Continuing our series on DB2 locking, let's look into a relatively recent development -- optimistic locking...

IBM added improvements for optimistic locking techniques in DB2 9 for z/OS.  What is optimistic locking? Sometimes referred to as optimistic concurrency control, optimistic locking is basically just what it sounds like. We are optimists and think that usually we will be the only ones with interest in the data. In other words, when optimistic locking is implemented you are assuming that most of the time there will be no other programs that are interested in the page of data that you are planning to modify.

Of course, even in the most optimistic world there will be exceptions, so optimistic locking does not assume that there will never be any concurrent processes that need to access your page(s). Basically, with optimistic locking you can improve performance by minimizing locking. So how do we do that?

When an application uses optimistic locking, locks are obtained immediately before a read operation and then released immediately. Update locks are obtained immediately before an update operation and held until the end of the transaction. Optimistic locking uses the RID (Record IDentifier) and a row change timestamp to test whether data has been changed by another transaction since the last read operation.

DB2 knows when a row was changed and so therefore he (I always tend to make DB2 masculine, sorry ladies) can ensure data integrity even as he minimizes the duration of locks. With optimistic locking, DB2 releases the page (or row) locks immediately after a read operation. And if you are using row locks, DB2 releases the row lock after each FETCH, taking a new lock on a row only for a positioned update or a positioned delete.

Careful readers will have noticed that I talked about a “row change timestamp” but you may not have heard that expression before. DB2 V9 added support for automatically generated timestamp columns and if you wish to implement optimistic locking you will need to create (or alter) your tables to have a row change timestamp column, defined as follows:

NOT NULL GENERATED ALWAYS
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

or

NOT NULL GENERATED BY DEFAULT
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

For tables having a row change timestamp column, DB2 automatically populates and maintains the timestamp values for each row. Notice how the syntax is similar to the syntax used for other automatically generated DB2 values, such as sequences. DB2 will automatically generate the timestamp value for each row when the row is inserted, and modify the timestamp for each row when any column in that row is updated.

When you add a ROW CHANGE TIMESTAMP column to an existing table, the initial value for existing rows will not be immediately populated. Instead, DB2 places the table space in an advisory-REORG pending state. When you reorganize the table space, DB2 generates the values for the ROW CHANGE TIMESTAMP column for all rows (and, of course, removes the advisory-REORG pending status).
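
As a sketch, adding such a column to an existing table (using the CUSTOMER table and column name from the example below) would look something like this, with the follow-up REORG populating the values and clearing the advisory status:

   ALTER TABLE CUSTOMER
     ADD COLUMN CUST_INFOCHANGE
         NOT NULL GENERATED ALWAYS
         FOR EACH ROW ON UPDATE
         AS ROW CHANGE TIMESTAMP;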

OK, but how does this implement optimistic locking? Well, you can use this new column as a condition for making an UPDATE, by specifying it in your WHERE clause. Let’s walk through a couple of examples.

First of all, when a table contains a ROW CHANGE TIMESTAMP you can use it to find out when its rows were modified. Let’s use the following table as an example:

CREATE TABLE CUSTOMER
 (CUSTNO           CHAR(8)   NOT NULL,
  CUST_INFOCHANGE  NOT NULL GENERATED ALWAYS
                   FOR EACH ROW ON UPDATE
                   AS ROW CHANGE TIMESTAMP,
  CUST_NAME        VARCHAR(50),
  CUST_ADDRESS     VARCHAR(100),
  CUST_CITY        CHAR(20),
  CUST_STATE       CHAR(2),
  CUST_ZIP         CHAR(9),
  CUST_PHONE       CHAR(10),

  PRIMARY KEY (CUSTNO))

Now that the table is defined with the ROW CHANGE TIMESTAMP we can use it in our programs and queries to determine change information about the data. For example, if we want to find all of the customer rows that were changed in the past week (that is, the last 7 days) we could run the following query:

SELECT CUSTNO, CUST_NAME
FROM   CUSTOMER
WHERE  ROW CHANGE TIMESTAMP FOR CUSTOMER <=
       CURRENT TIMESTAMP
AND    ROW CHANGE TIMESTAMP FOR CUSTOMER >=
       CURRENT TIMESTAMP - 7 DAYS;

But what would happen if you issued a statement like this against a table that was altered to include a ROW CHANGE TIMESTAMP? For example, what if we created the CUSTOMER table as shown but without the CUST_INFOCHANGE column, populated the table with data, and then altered the table to include the CUST_INFOCHANGE column? In this case, DB2 will use the time the page was last modified. So the results will not be exactly correct because the query would return all the rows on each qualifying page (because at least one row on the page changed). This is why it is important to clear up the advisory REORG pending state as soon as possible after adding the ROW CHANGE TIMESTAMP.

This is all well and good, and you can probably see the value of having this automagically changing timestamp in some of your tables, but where is the optimistic locking part? Well, for programs that use updateable static scrollable cursors DB2 can use optimistic locking as long as the program is bound specifying ISOLATION(CS). If you have this situation, DB2 will deploy optimistic locking to reduce the duration of locks between consecutive FETCH operations and between fetch operations and subsequent positioned UPDATE or DELETE operations.

Without optimistic locking, the lock taken at the first FETCH is held until the next FETCH. The lock taken at the last FETCH is held until COMMIT, ROLLBACK, or the end of transaction.

With optimistic locking, the scenario changes significantly. When the application requests a FETCH to position the cursor on a row, DB2 locks that row, executes the FETCH and releases the lock. When the application requests a positioned UPDATE or DELETE on the row, DB2 locks the row and then re-evaluates the predicate to ensure that the row still qualifies for the result table.

Optimistic locking itself will not happen without some effort on your part. Your application must have a ROW CHANGE TIMESTAMP and it must be selected first. Then, during a modification, a predicate needs to be added as a condition to tell whether the row has been modified or not. The static scrollable cursor uses the optimistic locking technique automatically. DB2 cannot use optimistic concurrency control for dynamic scrollable cursors. With dynamic scrollable cursors, the most recently fetched row or page from the base table remains locked to maintain position for a positioned UPDATE or DELETE.
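
To make that concrete, here is a hedged sketch of the read-then-update pattern using the CUSTOMER table from the earlier example (the host variable names are invented):

   -- Step 1: read the row and remember its row change timestamp
   SELECT CUST_NAME, ROW CHANGE TIMESTAMP FOR CUSTOMER
   INTO   :CUST-NAME, :LAST-CHANGE
   FROM   CUSTOMER
   WHERE  CUSTNO = :CUST-NO;

   -- Step 2: update only if no one has changed the row since it was read
   UPDATE CUSTOMER
   SET    CUST_PHONE = :NEW-PHONE
   WHERE  CUSTNO = :CUST-NO
   AND    ROW CHANGE TIMESTAMP FOR CUSTOMER = :LAST-CHANGE;

If the UPDATE hits no rows (SQLCODE +100), another process changed the row after it was read; the program can re-read the row and retry, or report the conflict.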


So, if you have not taken a look at which applications might benefit from optimistic locking techniques since your shop migrated to DB2 9 for z/OS, it is time to examine which applications could take advantage of optimistic locking – and then add the ROW CHANGE TIMESTAMP to the appropriate tables.

Wednesday, June 19, 2013

DB2 Locking, Part 12: Lock Promotion and Escalation

It can be potentially troublesome when DB2 lock promotion and escalation occur. These situations can greatly reduce the availability of concurrent access to data.

First let's look at lock promotion. When binding a program with an ISOLATION level of RR, the optimizer sometimes decides that table space locks will perform better than page locks. As such, the optimizer promotes the locking level to table space locking, regardless of the LOCKSIZE specified in the DDL. This process is called lock promotion.

When you specify LOCKSIZE ANY in the table space DDL, DB2 processing begins with page-level locking. As processing continues and locks are acquired, however, DB2 might decide that too many page (or row) locks have been acquired, causing inefficient processing.

In this scenario, DB2 escalates the level of locking from page (or row) locks to table or table space locks—a procedure called lock escalation. The threshold governing when lock escalation occurs is set in one of two ways:

  • The DSNZPARM start-up parameters for DB2 
  • The LOCKMAX parameter of the CREATE or ALTER TABLESPACE statement (which is stored in the LOCKMAX column of SYSIBM.SYSTABLESPACE)

Lock escalation applies to objects defined with LOCKSIZE ANY in the DDL.

There are some limitations on lock escalation, though. A table lock can never be escalated to a table space lock. Table space locks are the highest level of locking and, therefore, cannot be escalated. Furthermore, a row lock will not be escalated to a page lock. It will be escalated to a table space lock.

User lock escalation occurs if a single user accumulates more page locks than are allowed by the DB2 subsystem (as set in DSNZPARMs); when this happens, the program is informed via a -904 SQLCODE. The program can either issue a ROLLBACK and produce a message indicating that the program should be modified to COMMIT more frequently or, alternatively, escalate the locking strategy itself by explicitly issuing a LOCK TABLE statement within the code.

Prior to implementing the second approach, be sure to understand the ramifications of issuing the  LOCK TABLE statement and how it works.

Locking DSNZPARMs

There are two  DSNZPARM parameters that are used to govern DB2 locking and lock escalation: NUMLKTS and NUMLKUS.

NUMLKTS defines the threshold for the number of page locks that can be concurrently held for any one table space by any single DB2 application (thread). When the threshold is reached, DB2 escalates all page locks for objects defined as LOCKSIZE ANY according to the following rules:

  • All page locks held for data in segmented table spaces are escalated to table locks.
  • All page locks held for data in partitioned table spaces are escalated to table space locks.

NUMLKUS defines the threshold for the total number of page locks across all table spaces that can be concurrently held by a single DB2 application. When any given application attempts to acquire a lock that would cause the application to surpass the NUMLKUS threshold, the application receives a resource unavailable message (SQLCODE of -904).

TABLESPACE DDL Locking Parameters

In general, letting DB2 handle the level of locking required can be a fine strategy. Turning over the determination for lock size to DB2 requires setting LOCKSIZE ANY. Of course, you might have a compelling reason to use a specific LOCKSIZE. For example, you might choose to specify LOCKSIZE PAGE to explicitly direct DB2 to lock at the page level. Or, under certain conditions you might choose LOCKSIZE ROW to implement row-level locking.

The LOCKMAX parameter specifies the maximum number of page or row locks that any one process can hold at any one time for the table space. When the threshold is reached, the page or row locks are escalated to a table or table space lock. The LOCKMAX parameter is similar to the NUMLKTS parameter, but for a single table space only.
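
To tie the DSNZPARM and DDL discussion together, here is a hypothetical fragment (the names and numbers are illustrative only) that lets DB2 choose the lock size but caps any single process at 5,000 page or row locks on this table space before escalation kicks in:

   CREATE TABLESPACE TSEXMPL IN DBEXMPL
       LOCKSIZE ANY
       LOCKMAX 5000;

Specifying LOCKMAX SYSTEM instead tells DB2 to use the NUMLKTS value as the escalation threshold for the table space.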

Monday, June 17, 2013

DB2 Locking, Part 11: Data Sharing Global Lock Management

Data sharing adds an additional level of complexity to the DB2 locking strategies and techniques we have been discussing in this series. Because data sharing group members can access any object from any member in the group, a global locking mechanism is required. It is handled by the lock structure defined in the coupling facility. The lock structure is charged with managing inter-member locking. Without a global lock management process, data integrity problems could occur when one member attempts to read (or change) data that is in the process of being changed by another member.

Data sharing groups utilize a global locking mechanism to preserve the integrity of the shared data. The global locking mechanism allows locks to be recognized between members.

Global Locking

All members of a data sharing group must be aware of locks that are held or requested by the other members. The DB2 data sharing group utilizes the coupling facility (CF) to establish and administer global locks.

The IRLM performs locking within each member DB2 subsystem. Additionally, the IRLM communicates with the coupling facility to establish global locks. Each member of the data sharing group communicates lock requests to the coupling facility’s lock structure. The manner in which a transaction takes locks during execution does not change. The only difference is that, instead of being local locks, the locks being taken are global in nature.

DB2 data sharing does not use message passing to perform global locking. The members' IRLMs use the coupling facility to do global locking. Contention can be identified quickly without having to suspend the tasks and send messages around to the other DB2 members contained in the data sharing group. The following list outlines the events that occur when transactions from different DB2 members try to access the same piece of data:


  1. TXN1 requests a lock that is handled by the local IRLM.
  2. The local IRLM passes the request to the coupling facility global lock structures to ensure that no other members have incompatible locks. No incompatible locks are found, so the lock is taken.
  3. TXN2 requests a lock that is handled by its local IRLM. The lock is for the same data held by TXN1 executing in a different DB2 subsystem.
  4. Once again, the local IRLM passes the request to the coupling facility global lock structures to check for lock compatibility. In this case, an incompatible lock is found, so the lock request cannot be granted. The task is suspended.
  5. Eventually, TXN1 executes a COMMIT, which releases all local and global locks.
  6. TXN2 now can successfully execute the lock and continue processing.


Data Sharing Locking Considerations 

Consider specifying TRACKMOD NO for objects used in a data sharing environment to avoid locking problems. In addition, consider the MEMBER CLUSTER option. The MEMBER CLUSTER option indicates that DB2 should locate table space data based on available space rather than clustering the data by the clustering index. This option can benefit applications when there are many inserts to the same table from multiple members.

You might also improve performance by randomizing index key columns to reduce hot spots. This can be accomplished using the RANDOM keyword of CREATE INDEX.

The coupling facility level (CFLEVEL) can also be a consideration. DB2 prefetch processing for GBP-dependent page sets and partitions varies depending on the CFLEVEL in which the group buffer pool is allocated.

If the group buffer pool is allocated in a coupling facility with CFLEVEL=0 or 1, DB2 reads and registers one page at a time in the group buffer pool. If the group buffer pool is allocated in a coupling facility with CFLEVEL=2 or higher, DB2 can register the entire list of pages prefetched with one request to the coupling facility.

You can determine the CFLEVEL of your coupling facility using the DISPLAY GROUP command.

Lock Structures

The coupling facility contains several lock structures that are used for global locking purposes. The lock lists contain names of modified resources. This information is used to notify members of the data sharing group that the various resources have been changed.

Additionally, a hash table is used to identify compatible and incompatible lock modes. If the same hash value is used for the same resource name from different systems (with incompatible lock modes), lock contention will occur. If the same hash value is used for different resource names (called a hashing collision), false contention will occur. Any contention requires additional asynchronous processing to occur.

Hierarchical Locking

DB2 data sharing introduces the concept of explicit hierarchical locking to reduce global locking overhead (which increases global locking performance). Explicit hierarchical locking allows data sharing to differentiate between global and local locks. When no inter-DB2 interest occurs in a resource, the local IRLM can grant locks locally on the resources that are lower in the hierarchy. This feature allows the local DB2 to obtain local locks on pages or rows for that table space without notifying the coupling facility. In a data sharing environment, locks on the top parents are always propagated to the coupling facility lock structures (the structures described in the previous section). In addition, the local DB2 propagates locks on children, depending on the compatibility of the maximum lock held on a table space that also has other members of the DB2 data sharing group requesting locks on it.

P-Locks Versus L-Locks

DB2 data sharing introduces two new lock identifiers: P-locks and L-locks.

P-locks preserve inter-DB2 coherency of buffered pages. P-locks are owned by the member DB2 subsystem and are used for physical resources such as page sets. These physical resources can be either data objects or index objects. P-locks are held for the length of time the pages are locally cached in the local buffer pool. As such, data can be cached beyond a transaction commit point.

P-locks are negotiable. If multiple DB2 members hold incompatible P-locks, the IRLMs try to downgrade lock compatibility. P-locks are never timed out. Because P-locks are not owned by transactions, they cannot be deadlocked. The sole job of a P-lock is to ensure inter-DB2 coherency. P-locks notify the data sharing group that a member of that group is performing work on that resource. This way, the coupling facility can become involved and begin treating the resources globally.

L-locks are used for both intra- and inter-DB2 concurrency between transactions. L-locks can either be local or global in scope. L-locks are owned by transactions and are held for COMMIT or allocation duration. L-locks are not negotiable and, as such, must wait for incompatible L-locks held by other DB2 members to be released before they can be taken. Suspended L-locks can be timed out by the IRLM.

Monday, June 10, 2013

DB2 Locking, Part 10: Know Your ISOLATION Levels

DB2 provides a way to change the way that a program or SQL statement acquires locks. That way is known as the isolation level and it can be set to specify the locking behavior for a transaction or statement. Standard SQL defines four isolation levels that can be set using the SET TRANSACTION ISOLATION LEVEL statement:
  • Serializable
  • Repeatable read
  • Read committed
  • Read uncommitted

The isolation level determines the mode of page or row locking implemented by the program as it runs.
DB2 supports a variation of the standard isolation levels. DB2 implements page and row locking at the program execution level, which means that all page or row locks are acquired as needed during the program run. Page and row locks are released as the program runs, depending on the isolation level.

In DB2 you can specify the following four isolation levels: cursor stability (CS), repeatable read (RR), read stability (RS), and uncommitted read (UR).

Using the ISOLATION parameter of the BIND command you can set the isolation level of a package or plan. You also can use the WITH parameter on a SELECT statement to set the isolation level of a single SQL statement.
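
For example (the collection and package names are just placeholders; the table is the sample employee table used later in this post), you might bind a package with CS as its default and then relax a single statement to UR:

   BIND PACKAGE(COLL1) MEMBER(PROGA) ISOLATION(CS)

   SELECT COUNT(*)
   FROM   DSN8710.EMP
   WITH   UR;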

Cursor stability is the DB2 implementation of the SQL standard read committed isolation level. CS is perhaps the most common DB2 isolation level in use in production applications because it offers a good tradeoff between data integrity and concurrency. When CS is specified the transaction will never read data that is not yet committed; only committed data can be read.

A higher level of integrity is provided with repeatable read. Under an RR isolation level all page locks are held until they are released by a COMMIT (or ROLLBACK), whereas with CS read-only page locks are released as soon as another page is accessed.
Repeatable read is the default isolation level if none is specified at BIND time.

An RR page locking strategy is useful when an application program requires consistency in rows that may be accessed twice in one execution of the program, or when an application program requires data integrity that cannot be achieved with CS.

For an example of a good reason to use RR page locking, consider a reporting program that scans a table to produce a detail report, and then scans it again to produce a summarized managerial report. If the program is bound using CS, the results of the first report might not match the results of the second.

Suppose that you are reporting the estimated completion dates for project activities. The first report lists every project and the estimated completion date. The second, managerial report lists only the projects with a completion date greater than one year.

The first report indicates that two activities are scheduled for more than one year. After the first report but before the second, however, an update occurs. A manager realizes that she underestimated the resources required for a project. She invokes a transaction to change the estimated completion date of one of her project's activities from 8 months to 14 months. The second report is produced by the same program, but reports 3 activities.

If the program used an RR isolation level rather than CS, an UPDATE that occurs after the production of the first report but before the second would not have been allowed. The program would have maintained the locks it held from the generation of the first report and the updater would be locked out until the locks were released.

How about another example? Consider a program that is looking for pertinent information about employees in the information center and software support departments who make more than $30,000 in base salary. In the DB2 sample tables department 'C01' is the information center and department 'E21' is software support.
The program opens a cursor based on the following SELECT statement:

    SELECT  EMPNO, FIRSTNME, LASTNAME,
            WORKDEPT, SALARY
    FROM    DSN8710.EMP
    WHERE   WORKDEPT IN ('C01', 'E21')
    AND     SALARY > 30000;

The program then begins to FETCH employee rows. Assume further, as would probably be the case, that the statement uses the XEMP2 index on the WORKDEPT column. An update program that implements employee modifications is running concurrently. The program handles transfers by moving employees from one department to another, and implements raises by increasing the salary.

Assume that Sally Kwan, one of your employees, has just been transferred from the information center to software support. Assume further that another information center employee, Heather Nicholls, received a 10 percent raise. The update program running concurrently with the report program implements both of these modifications.

If the report program were bound with an isolation level of CS, the second program could move Sally from 'C01' to 'E21' after she was reported to be in department 'C01' but before the entire report was finished. Thus, she could be reported twice: once as an information center employee and again as a software support employee. Although this circumstance is rare, it can happen with programs that use cursor stability. If the program were bound instead with RR, this problem could not happen. The update program probably would not be allowed to run concurrently with a reporting program, however, because it would experience too many locking problems.
Now consider Heather's dilemma. The raise increases her salary 10 percent, from $28,420 to $31,262. Her salary now fits the parameters specified in the WHERE condition of the SQL statement. Will she be reported? It depends on whether the update occurs before or after the row has been retrieved by the index scan, which is clearly a tenuous situation. Once again, RR avoids this problem.

You might be wondering, "If CS has the potential to cause so many problems, why is it used so ubiquitously? Why not trade the performance and concurrency gain of CS for the integrity of RR?"

The answer is simple: the types of problems outlined are rare. The expense of using RR, however, can be substantial in terms of lost concurrency. So trading the efficiency and concurrency of CS for the integrity of RR usually is not a sound decision.

The third isolation level provided by DB2 is read stability (RS), which is akin to the SQL standard repeatable read level. Read stability is similar in functionality to the RR isolation level. A retrieved row or page is locked until the end of the unit of work; no other program can modify the data until the unit of work is complete, but other processes can insert values that might be read by your application if it accesses the row a second time.

Use read stability only when your program can handle retrieving a different set of rows each time a cursor or singleton SELECT is issued. If using read stability, be sure your application is not dependent on having the same number of rows returned each time.

Finally, we come to the last, and most maligned, isolation level: uncommitted read (UR). The UR isolation level provides read-through locks, also known as dirty read or read uncommitted. Using UR can help to overcome concurrency problems. When you're using an uncommitted read, an application program can read data that has been changed but is not yet committed.

UR can be a performance booster, too, because application programs bound using the UR isolation level will read data without taking locks. This way, the application program can read data contained in the table as it is being manipulated. Consider the following sequence of events:

(1) To change a specific value, at 9:00 a.m. a transaction containing the following SQL is executed:

    UPDATE    EMP
       SET FIRSTNME = 'MICHELLE'
    WHERE     EMPNO = 10020;

     The transaction is a long-running one and continues to execute without issuing a COMMIT.

(2) At 9:01 a.m., a second transaction attempts to SELECT the data that was changed, but not committed.

If the UR isolation level were used for the second transaction, it would read the changed data even though it had yet to be committed. Obviously, if the program doesn't need to wait to take a lock and merely reads the data in whatever state it happens to be at that moment, the program will execute faster than if it had to wait for locks to be taken and resources to be freed before processing.

The implications of reading uncommitted data, however, must be carefully examined before being implemented. Several types of problems can occur. Using the previous example, if the long-running transaction rolled back the UPDATE to EMPNO 10020, the program using dirty reads may have picked up the wrong name ('MICHELLE') because it was never committed to the database.

Inaccurate data values are not the only problems that can be caused by using UR. A dirty read can cause duplicate rows to be returned where none exist. Alternatively, a dirty read can cause no rows to be returned when one (or more) actually exists. Additionally, an ORDER BY clause does not guarantee that rows will be returned in order if the UR isolation level is used. Obviously, these problems must be taken into consideration before using the UR isolation level.

Keep in mind, too, that the UR isolation level applies to read-only operations: SELECT, SELECT INTO, and FETCH from a read-only result table. Any application plan or package bound with an isolation level of UR will use uncommitted read functionality for any read-only SQL. Operations contained in the same plan or package that are not read-only will use an isolation level of CS.

When is it appropriate to use UR isolation? The general rule of thumb is to avoid UR whenever the results must be 100 percent accurate. Following are examples of when this would be true:
  • Calculations that must balance are being performed on the selected data
  • Data is being retrieved from one source to insert to or update another
  • Production, mission-critical work is being performed that cannot contain or cause data integrity problems

In general, most DB2 applications are not serious candidates for dirty reads. In a few specific situations, however, the dirty read capability will be of major benefit. Consider the following cases in which the UR isolation level could prove to be useful:
  • Access is required to a reference, code, or look-up table that basically is static in nature. Due to the non-volatile nature of the data, a dirty read would be no different than a normal read the majority of the time. In those cases when the code data is being modified, any application reading the data would incur minimal, if any, problems.
  • Statistical processing must be performed on a large amount of data. Your company, for example, might want to determine the average age of female employees within a certain pay range. The impact of an uncommitted read on an average of multiple rows will be minimal because a single value changed will not greatly impact the result.
  • Dirty reads can prove invaluable in a data warehousing environment that uses DB2 as the DBMS. A data warehouse is a time-sensitive, subject-oriented, store of business data that is used for online analytical processing. Other than periodic data propagation and/or replication, access to the data warehouse is read-only. Because the data is generally not changing, an uncommitted read is perfect in a read-only environment due to the fact that it can cause little damage. More data warehouse projects are being implemented in corporations worldwide and DB2 with dirty read capability is a very wise choice for data warehouse implementation.
  • In those rare cases when a table, or set of tables, is used by a single user only, UR can make a lot of sense. If only one individual can be modifying the data, the application programs can be coded such that all (or most) reads are done using UR isolation level, and the data will still be accurate.
  • Finally, if the data being accessed already is inconsistent, little harm can be done using a dirty read to access the information.
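
For example, the statistical query mentioned above might look something like this (the table and column names are hypothetical, patterned after the DB2 sample tables):

   SELECT AVG(YEAR(CURRENT DATE - BIRTHDATE)) AS AVG_AGE
   FROM   EMP
   WHERE  SEX = 'F'
   AND    SALARY BETWEEN 50000.00 AND 75000.00
   WITH   UR;

Even if another transaction has an uncommitted update in flight against one of these rows, a single in-flight value will barely nudge the average.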

Although the dirty read capability can provide relief to concurrency problems and deliver faster performance in specific situations, it also can cause data integrity problems and inaccurate results. Be sure to understand the implications of the UR isolation level and the problems it can cause before diving headlong into implementing it in your production applications.

Summary


It is important for DB2 DBAs and application programmers to know the four isolation levels and their impact on SQL. Using the isolation levels is an effective way to control concurrency and locking for your DB2 applications.

Monday, June 03, 2013

DB2 Locking, Part 9: Avoid Bachelor Programming Syndrome

Unless you plan for, and issue, regular COMMITs in your DB2 application programs, you will cause locking problems. It is important for every programmer to issue COMMIT statements in all application programs that modify data (INSERT, UPDATE, and DELETE). A COMMIT externalizes the modifications that occurred in the program since the beginning of the program or the last COMMIT. A COMMIT ensures that all modifications have been physically applied to the database, thereby ensuring data integrity and recoverability.
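
As a simple sketch (the table and column names are hypothetical), each COMMIT below ends a unit of work, hardening the changes that precede it and releasing the locks those changes acquired:

   UPDATE EMP
   SET    SALARY = SALARY * 1.04
   WHERE  WORKDEPT = 'D11';
   COMMIT;   -- the D11 updates are now recoverable and their locks are released

   UPDATE EMP
   SET    SALARY = SALARY * 1.04
   WHERE  WORKDEPT = 'D21';
   COMMIT;   -- the D21 updates are now recoverable and their locks are released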

Failing to code COMMITs in a data modification program is what I like to call Bachelor Programming Syndrome — or in other words, fear of committing.

One important factor affecting the need for a COMMIT strategy is the amount of elapsed time required for the program to complete. The greater the amount of time needed, the more important it becomes to issue periodic COMMITs. A sound COMMIT strategy will reduce rollback time and reprocessing time in the event of program failure. It is a safe assumption that the elapsed time increases as the number of modifications increases.

Issuing COMMITs in an application program is important for three reasons:
  1. If the program fails, all the modifications are backed out to the last COMMIT point. This process could take twice the time it took to perform the modifications in the first place—especially if you are near the end of a program with no COMMITs that performed hundreds or thousands of modification operations.
  2. If you resubmit a failing program that issues no COMMITs, the program redoes work unnecessarily.
  3. Programs using the repeatable read page locking strategy or the RELEASE(COMMIT) table space locking strategy hold their respective page and table space locks until a COMMIT is issued. If no COMMITs are issued during the program, locks are not released until the auto-COMMIT when the program completes, thereby negatively affecting concurrent access. This can cause lock timeouts and lock escalation.

If concurrent batch or online access uses uncommitted read (UR) isolation, sometimes referred to as dirty reads, COMMIT processing is irrelevant. This is so because uncommitted read isolation does not take any locks. However, most processing requires accurate data and, as such, will not use uncommitted read isolation.

If your program is running for a long time and has issued no COMMITs, your program will probably not suffer. However, others attempting to access the data that is locked by your program will suffer. This will manifest itself with lock timeout errors (such as -911 and -913 SQL return codes). If you see a lot of lock timeouts occurring it is probably a good bet that you have at least one long-running program (maybe more) that needs to have COMMIT logic added.

It is also good programming practice to issue periodic COMMITs in long-running read-only programs. Even though data is not being changed, some database management systems, such as DB2 for z/OS, acquire read locks whenever data is accessed. A read lock can block other programs (such as utility operations) from running successfully. Therefore, failure to COMMIT, even in a read-only program, can result in locking problems.

In some rare circumstances, you might be able to avoid issuing COMMITs in your database programs. But you should only consider doing so for programs that run very quickly, and therefore will not hold locks for a long duration.

Given the tremendous benefits to your DBMS environment that can be gained by ensuring a proper COMMIT strategy, why would you ever consider avoiding COMMITs? Perhaps the simplest argument is that the code is easier to write without COMMITs. Additionally, recovery is simpler. When modification programs without COMMITs fail, you can generally restart them from the beginning because no database changes have been committed. Of course, this might not be true if you have modified data not controlled by the DBMS (in which case dual modifications may cause problems).

If you choose to forgo COMMITs in any application program, it is wise to clearly document why no COMMITs are being issued. That way, if the program is changed later and the original assumptions no longer hold, adding COMMITs can be reconsidered using the documented reasoning along with the new processing requirements.

I recommend that you plan to issue COMMITs in every batch program... even read-only programs. The COMMIT will release claim requests, which can free up DB2 utilities to execute, thereby improving overall availability.

You can structure the logic so that the COMMIT processing is contingent on a parameter passed to the program. This approach enables an analyst to turn off COMMIT processing but ensures that all batch programs are prepared if COMMIT processing is required in the future.

Thursday, May 30, 2013

DB2 Locking, Part 8: LOBs and Locking

When a row is read or modified in a table containing LOB columns, the application will obtain a normal transaction lock on the base table. The actual values for (most) LOBs are stored in a separate table space from the rest of the table data. The locks on the base table also control concurrency for the LOB table space. But DB2 uses locking strategies for large objects, too. A lock that is held on a LOB value is referred to as a LOB lock. LOB locks are deployed to manage the space used by LOBs and to ensure that LOB readers do not read partially updated LOBs.

Note: For applications reading rows using ISOLATION(UR) or lock avoidance, page or row locks are not taken on the base table. However, DB2 takes S-locks on the LOB to ensure that a partial or inconsistent LOB is not accessed.

Note: Inline LOBs, available as of DB2 V10, are treated like normal data and do not require LOB locking.


One reason LOB locks are used is to determine whether space from a deleted LOB can be reused by an inserted or updated LOB. DB2 will not reuse the storage for a deleted LOB until the DELETE has been committed and there are no more readers on the LOB. Another purpose for locking LOBs is to prevent deallocating space for a LOB that is currently being read. All readers, including “dirty readers,” acquire S-locks on LOBs to prevent the storage for the LOB they are reading from being deallocated.

Types of LOB Locks

There are only two types of LOB locks:

  • S-locks, or SHARE: The lock owner and any concurrent processes can SELECT, DELETE, or UPDATE the locked LOB. Concurrent processes can acquire an S-lock on the LOB.
  • X-locks, or EXCLUSIVE: The lock owner can read or change the locked LOB, but concurrent processes cannot access the LOB.

Just like regular transaction locking, though, DB2 also takes LOB table space locks. If the LOB table space has a gross lock, DB2 does not acquire individual LOB locks. The following lock modes can be taken for the LOB table space:

  • S-lock, or SHARE: The lock owner and any concurrent processes can read and delete LOBs in the LOB table space. The lock owner does not need to take individual LOB locks.
  • IS-lock, or INTENT SHARE: The lock owner can UPDATE LOBs to null or zero-length, or SELECT or DELETE LOBs in the LOB table space. Concurrent processes can both read and modify LOBs in the same table space. The lock owner acquires a LOB lock on any data that it reads or deletes.
  • X-lock, or EXCLUSIVE: The lock owner can read or change LOBs in the LOB table space. The lock owner does not need to take individual LOB locks.
  • IX-lock, or INTENT EXCLUSIVE: The lock owner and any concurrent process can read and change data in the LOB table space. The lock owner acquires an individual LOB lock for any LOB it accesses.
  • SIX-lock, or SHARE WITH INTENT EXCLUSIVE: The lock owner can read and change data in the LOB table space. The lock owner obtains a LOB lock when inserting or updating. Concurrent processes can SELECT or DELETE data in the LOB table space (or UPDATE the LOB to a null or zero-length value).

As with transaction locking, there is a hierarchical relationship between LOB locks and LOB table space locks (see Figure 1). If the LOB table space is locked with a gross lock, LOB locks are not acquired. 


Figure 1. The DB2 LOB locking hierarchy.

The type of locking used is controlled using the LOCKSIZE clause for the LOB table space. LOCKSIZE TABLESPACE indicates that no LOB locks are to be acquired by processes that access the LOBs in the table space. Specifying LOCKSIZE LOB indicates that LOB locks and the associated LOB table space locks (IS or IX) are taken. The LOCKSIZE ANY specification allows DB2 to choose the size of the lock, which usually results in LOB locking.
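
For example, LOB locking can be requested explicitly when the auxiliary table space is created, along these lines (the database and table space names are invented for illustration):

   CREATE LOB TABLESPACE XEMPRES
     IN MYLOBDB
     LOCKSIZE LOB;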

Duration of LOB Locks

The ACQUIRE option of BIND has no impact on LOB table space locking. DB2 will take locks on LOB table spaces as needed. However, the RELEASE option of BIND does control when LOB table space locks are released. For RELEASE(COMMIT), the LOB table space lock is released at COMMIT (unless WITH HOLD is specified or a LOB locator is held).

LOB locks are taken as needed and are usually released at COMMIT. If a LOB value is assigned to a LOB locator, however, the S-lock on the LOB remains until the application commits. If the application uses HOLD LOCATOR, the locator (and the LOB lock) is not freed until the first commit operation after a FREE LOCATOR statement is issued, or until the thread is deallocated. If a cursor is defined WITH HOLD, LOB locks are held through COMMIT operations.
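
A rough embedded SQL sketch of locator processing follows (the table, column, and host variable names are invented; in a real program each statement would be coded within EXEC SQL):

   SELECT RESUME
   INTO   :RESUME-LOC          -- :RESUME-LOC is declared as a CLOB locator
   FROM   EMP_RESUME
   WHERE  EMPNO = '000130';

   HOLD LOCATOR :RESUME-LOC;   -- keep the locator (and its S-lock on the LOB) across COMMITs

   COMMIT;                     -- the locator and its LOB lock survive this COMMIT

   FREE LOCATOR :RESUME-LOC;   -- the LOB lock is freed at the next COMMIT after this
   COMMIT;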

LOB Table Space Locking Considerations

Under some circumstances, DB2 can avoid acquiring a lock on a LOB table space. For example, when deleting a row where the LOB column is null, DB2 need not lock the LOB table space. 

DB2 does not access the LOB table space in the following instances:

  • A SELECT of a LOB that is null or zero-length
  • An INSERT of a LOB that is null or zero-length
  • An UPDATE of a null or zero-length LOB to null or zero-length
  • A DELETE for a row where the LOB is null or zero-length
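
For instance, an INSERT such as the following (again using hypothetical names) supplies a null LOB value, so DB2 can satisfy it without touching the LOB table space at all:

   INSERT INTO EMP_RESUME (EMPNO, RESUME)
   VALUES ('000310', NULL);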