
Monday, April 22, 2013

DB2 Locking, Part 3: Locks Versus Latches


So far in this series on DB2 locking we have offered up a broad overview of what locking is and then delved into the world of table and table space locks. In this short entry, before we tackle page and row locks, we are going to look at the difference between a lock and a latch.

A true lock is handled by DB2 using the IRLM. The IRLM, or internal resource lock manager, is both a separate subsystem and an integral component of DB2. As its name implies, it manages locks for DB2.

However, whenever doing so is practical, DB2 can lock resources without going to the IRLM. This type of lock is called a latch. True locks are always set in the IRLM. Latches, by contrast, are set internally by DB2, without going to the IRLM.

When a latch is taken instead of a lock, it is handled in the Buffer Manager by internal DB2 code, so the cross-memory service calls to the IRLM are eliminated. Latches are usually held only briefly—for a shorter duration than locks. Also, a latch requires about one-third the number of instructions as a lock. Therefore, latches are more efficient than locks because they avoid the overhead associated with calling an external address space. Latches are used when a resource needs to be serialized for only a short time. Both latches and locks guarantee data integrity.

In subsequent blog entries, any generic usage of the term lock refers to both locks and latches.

Monday, November 23, 2009

Reading Things That Aren't There... and Missing Things That Are!

You can shoot yourself in the foot using DB2 if you are not careful. There are options that you can specify that may cause you to read data that is not really in the database. And, alternately, you can set things up so that you miss reading data that is actually in the database.

How, you might be asking? Well, dirty reads will take care of the first one. Specifying ISOLATION(UR) implements read-through locks, sometimes referred to as dirty reads. It applies to read operations only. With this isolation level, data may be read that never actually exists in the database, because the transaction can read data that has been changed by another process but is not yet committed.

Read uncommitted isolation provides the highest level of availability and concurrency of the isolation levels, but the worst degree of data integrity. It should be used only when data integrity problems can be tolerated. Certain types of applications, such as those using analytical queries, estimates, and averages, are likely candidates for read uncommitted locking. A dirty read can cause duplicate rows to be returned where none exist, or no rows to be returned when one (or more) actually exists. When choosing read uncommitted isolation the programmer and DBA must ensure that these types of problems are acceptable for the application.
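For instance, a rough statistical query such as the following sketch would be a reasonable candidate for uncommitted read. (The EMP table and its columns mirror the DB2 sample table referenced elsewhere in this blog; substitute your own table and column names.)

SELECT WORKDEPT, AVG(SALARY)
FROM   EMP
GROUP BY WORKDEPT
WITH UR;

Because the result is an average over many rows, one uncommitted (or later rolled back) change has a negligible effect on the answer.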

OK, so what about not reading data that is in the database? DB2 V9 provides us an option to do just that. In DB2 9 it is possible for a transaction to skip over rows that are locked. This can be accomplished by means of the SKIP LOCKED DATA option within your SQL statement(s). SKIP LOCKED DATA can be specified in SELECT, SELECT INTO, and PREPARE, as well as searched UPDATE and DELETE statements. You can also use the SKIP LOCKED DATA option with the UNLOAD utility.

When you tell DB2 to skip locked data, that data is not accessed and your program will not have it available. DB2 just skips over any locked data instead of waiting for it to be unlocked. The benefit, of course, is improved performance because you will not incur any lock wait time. But it comes at the cost of not accessing the locked data at all. This means that you should only utilize this clause when your program can tolerate skipping over some data.

The SKIP LOCKED DATA option is compatible with cursor stability (CS) isolation and read stability (RS) isolation. But it cannot be used with uncommitted read (UR) or repeatable read (RR) isolation levels. DB2 will simply ignore the SKIP LOCKED DATA clause under UR and RR isolation levels.

Additionally, SKIP LOCKED DATA works only with row locks and page locks. That means that SKIP LOCKED DATA does not apply to table, partition, LOB, XML, or table space locks. And the bigger the lock size, the more data that will be skipped when a lock is encountered. With row locking you will be skipping over locked rows, but with page locking you will be skipping over all the rows on the locked page.
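As a simple sketch (the ACCT table and its columns are made up purely for illustration), a query that only needs rows nobody else is currently updating could be coded like this:

SELECT ACCT_NO, ACCT_BALANCE
FROM   ACCT
WHERE  ACCT_STATUS = 'A'
SKIP LOCKED DATA;

Any row (or page, depending on the lock size) that happens to be locked by another transaction at the moment it is accessed simply will not appear in the result.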

Use both of these features with extreme care and make sure that you know exactly what you are telling DB2 to do. Otherwise, you might be reading more... or less than you want!

Friday, September 04, 2009

Dynamic SQL Causing Lock Escalation?

Lock escalation is the promotion of a lock from a row, page, or LOB lock to a table space lock because the number of lower-level locks that are concurrently held on a given resource exceeds a preset limit.

But lock escalation can cause problems. Yes, when fewer locks are taken, CPU cost and memory usage can be reduced. On the other hand, escalating the size of a lock causes more resources to be locked... and that impacts concurrency with the most likely result being applications experiencing lock timeouts or deadlocks.
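The preset limit mentioned above is governed by the LOCKMAX setting of the table space (LOCKMAX SYSTEM defers to the NUMLKTS subsystem parameter). If escalation is hurting a particular application, one tuning option is to adjust LOCKMAX for the affected table space. The database, table space, and threshold values below are just placeholders:

ALTER TABLESPACE MYDB.MYTS LOCKMAX 10000;

ALTER TABLESPACE MYDB.MYTS LOCKMAX 0;

The first statement raises the escalation threshold to 10,000 child locks; the second turns escalation off entirely for that table space (in which case your programs had better be committing frequently).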

When lock escalation occurs, DB2 writes messages to the system log. So when complaints inevitably arrive about failing transactions you can look for the appropriate lock escalation message indicating that it could be the culprit.

But such simple steps are typically not sufficient to track down the root cause and take corrective actions. You need to be able to determine what SQL statement in what program and under what circumstances is causing the escalation. Doing so involves tracing and using performance monitoring tools.

For those interested in this topic (indeed, the primary purpose of this blog post), you should seek out the IBM TechDoc titled Identifying the dynamic SQL statement which is causing a lock escalation in DB2 for z/OS. For those who don't know what a TechDoc is, don't worry, it isn't that complicated. It is basically a published piece of technical documentation that isn't part of a larger manual or IBM RedBook. They can be short papers, product flashes, presentations, etc.

This particular TechDoc details a methodology for identifying dynamic SQL statements involved in a lock escalation. It describes which traces to start, and which jobs to run to analyze the collected traces. It also explains how to analyze the trace report to find the problematic SQL statement.

You can search all of IBM's TechDocs by following this link.

Happy reading!

Monday, July 06, 2009

The "Dirty" Read (AKA Uncommitted Read)

Anyone accustomed to application programming when access to a database is required understands the potential for concurrency problems. When one application program attempts to read data that is in the process of being changed by another, the DBMS must forbid access until the modification is complete in order to ensure data integrity. Most DBMS products, DB2 included, use a locking mechanism for all data items being changed. Therefore, when one task is updating data on a page, another task cannot access data (read or update) on that same page until the data modification is complete and committed.

Programs that read DB2 data typically access numerous rows during their execution and are thus quite susceptible to concurrency problems. Since V4, DB2 has provided read-through locks, also known as “dirty read” or “uncommitted read,” to help overcome concurrency problems. When using uncommitted reads, an application program can read data that has been changed, but is not yet committed.

Dirty read capability is implemented using a new isolation level, UR, for uncommitted read. This adds to the current isolation levels of RR (repeatable read) and CS (cursor stability). If the application program is using the UR isolation level, it will read data without taking locks. This enables the application program to read data contained in the table as it is being manipulated.

Consider the following sequence of events:

(1) At 9:00 AM, a transaction is executed containing the following SQL to change a specific value.

UPDATE EMP
SET FIRST_NAME = 'MICHELLE'
WHERE EMPNO = 10020;

The transaction is a long-running one and continues to execute without issuing a COMMIT.

(2) At 9:01 AM, a second transaction attempts to SELECT the data that was changed, but not committed.


If the UR isolation level were specified for the second transaction, it would read the changed data even though it had yet to be committed. Obviously, if the program need not wait to take a lock and merely reads the data in whatever state it happens to be at that moment, the program will execute faster than if it had to wait for locks to be taken and resources to be freed before processing.

However, the implications of reading uncommitted data must be carefully examined before being implemented. Several types of problems can occur. A dirty read can cause duplicate rows to be returned where none exist. Alternately, a dirty read can cause no rows to be returned when one (or more) actually exists. Additionally, an ORDER BY clause does not guarantee that rows will be returned in order if the UR isolation level is used. Obviously, these problems must be taken into consideration before using the UR isolation level.

UR Isolation Requirements

  • The UR isolation level applies to read-only operations: SELECT, SELECT INTO, and FETCH from a read-only result table.
  • Any application plan or package bound with an isolation level of UR will use uncommitted read functionality for any read-only SQL. Operations that are contained in the same plan or package and are not read-only will use an isolation level of CS.
  • The isolation level that is defined at the plan or package level during BIND or REBIND can be overridden as desired for each SQL statement in the program. The WITH clause can be used to specify the isolation level for any individual SQL statement. For example:


SELECT EMPNO, LAST_NAME
FROM EMP
WITH UR;

The WITH clause is used to allow an isolation level to be used on a statement-by-statement basis. The UR isolation level can be used only with read-only SQL statements. This includes read-only cursors and SELECT INTO statements.

Benefits and Drawbacks

So when is it a good idea to implement dirty reads using the UR isolation level? The general rule of thumb is to avoid dirty reads whenever the results must be 100 percent accurate. Examples of this would be when:
  • calculations that must balance are being performed on the selected data
  • data is being retrieved from one source to insert to or update another
  • production, mission-critical work is being performed that can not contain or cause data integrity problems
Truthfully, most DB2 applications are not usually candidates for dirty reads. However, there are a few specific situations in which the dirty read capability will be of major benefit. Consider the following cases in which the UR isolation level could prove to be useful:
  • Access is required to a reference, code, or look-up table that is basically static in nature. Due to the non-volatile nature of the data, a dirty read would be no different than a normal read the majority of the time. In those cases when the code data is being modified, any application reading the data would incur minimum, if any, problems.
  • Statistical processing must be performed on a large amount of data. For example, your company may wish to determine the average age of female employees within a certain pay range. The impact of an uncommitted read on an average of multiple rows will be minimal because a single value changed will not greatly impact the result.
  • Dirty read can prove invaluable in a data warehousing environment that uses DB2 as the DBMS. A data warehouse is a time-sensitive, subject-oriented store of business data that is used for on-line analytical processing. Other than periodic data propagation and/or replication, access to the data warehouse is read-only. An uncommitted read can be perfect in a read-only environment since it can cause little damage because the data is generally not changing. More and more data warehouse projects are being implemented in corporations worldwide and DB2 with dirty read capability may be a very wise choice for data warehouse implementation.
  • In those rare cases when a table, or set of tables, is used by a single user only, UR can make a lot of sense. If only one individual can be modifying the data, the application programs can be coded such that all (or most) reads are done using UR isolation level, and the data will still be accurate.
  • Finally, if the data being accessed is already inconsistent little harm can be done using a dirty read to access the information.
Administration

Because of the data integrity issues associated with dirty reads, it is a good idea for DBAs to keep track of the plans and packages that specify an isolation level of UR. This information can be found in the DB2 catalog. The following two queries can be used to find the applications using uncommitted reads.

Issue the following SQL for a listing of plans that were bound with ISOLATION(UR) or contain at least one statement specifying the WITH UR clause:

SELECT DISTINCT S.PLNAME
FROM   SYSIBM.SYSPLAN P,
       SYSIBM.SYSSTMT S
WHERE  (P.NAME = S.PLNAME AND
        P.ISOLATION = 'U')
   OR  S.ISOLATION = 'U'
ORDER BY S.PLNAME;

Issue the following SQL for a listing of packages that were bound with ISOLATION(UR) or contain at least one statement specifying the WITH UR clause:

SELECT DISTINCT P.COLLID, P.NAME, P.VERSION
FROM   SYSIBM.SYSPACKAGE P,
       SYSIBM.SYSPACKSTMT S
WHERE  (P.LOCATION = S.LOCATION AND
        P.LOCATION = ' ' AND
        P.COLLID = S.COLLID AND
        P.NAME = S.NAME AND
        P.VERSION = S.VERSION AND
        P.ISOLATION = 'U')
   OR  S.ISOLATION = 'U'
ORDER BY S.COLLID, S.NAME, S.VERSION;

Synopsis

The dirty read capability can provide relief to concurrency problems and deliver faster performance in very specific situations. Be certain to understand the implications of the UR isolation level and the “problems” it can cause before diving headlong into implementing it in your production applications.

Monday, June 22, 2009

Know Your ISOLATION Levels

Did you know that DB2 provides a way to change the way that a program or SQL statement acquires locks? That way is known as the isolation level and it can be set to specify the locking behavior for a transaction or statement. Standard SQL defines four isolation levels that can be set using the SET TRANSACTION ISOLATION LEVEL statement:

  • Serializable
  • Repeatable read
  • Read committed
  • Read uncommitted

The isolation level determines the mode of page or row locking implemented by the program as it runs.

DB2 supports a variation of the standard isolation levels. DB2 implements page and row locking at the program execution level, which means that all page or row locks are acquired as needed during the program run. Page and row locks are released as the program runs, depending on the isolation level.

In DB2 you can specify the following four isolation levels:

  • cursor stability (CS),
  • repeatable read (RR),
  • read stability (RS), and
  • uncommitted read (UR).

Using the ISOLATION parameter of the BIND command you can set the isolation level of a package or plan. You also can use the WITH parameter on a SELECT statement to set the isolation level of a single SQL statement.
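For example, a package could be bound with cursor stability while a single statement that needs repeatability overrides it with the WITH clause. The collection and member names here are illustrative, and the query uses the DB2 sample PROJ table:

BIND PACKAGE(MYCOLL) MEMBER(MYPROG) ISOLATION(CS)

SELECT PROJNO, PRSTDATE, PRENDATE
FROM   DSN8710.PROJ
WITH RR;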

Cursor stability is the DB2 implementation of the SQL standard read committed isolation level. CS is perhaps the most common DB2 isolation level in use in production applications because it offers a good tradeoff between data integrity and concurrency. When CS is specified the transaction will never read data that is not yet committed; only committed data can be read.

A higher level of integrity is provided with repeatable read. Under an RR isolation level all page locks are held until they are released by a COMMIT (or ROLLBACK), whereas with CS read-only page locks are released as soon as another page is accessed. Repeatable read is the default isolation level if none is specified at BIND time.

An RR page locking strategy is useful when an application program requires consistency in rows that may be accessed twice in one execution of the program, or when an application program requires data integrity that cannot be achieved with CS.

For an example of a good reason to use RR page locking, consider a reporting program that scans a table to produce a detail report, and then scans it again to produce a summarized managerial report. If the program is bound using CS, the results of the first report might not match the results of the second.

Suppose that you are reporting the estimated completion dates for project activities. The first report lists every project and the estimated completion date. The second, managerial report lists only the projects with a completion date greater than one year.

The first report indicates that two activities are scheduled for more than one year. After the first report but before the second, however, an update occurs. A manager realizes that she underestimated the resources required for a project. She invokes a transaction to change the estimated completion date of one of her project's activities from 8 months to 14 months. The second report is produced by the same program, but reports 3 activities.

If the program used an RR isolation level rather than CS, an UPDATE that occurs after the production of the first report but before the second would not have been allowed. The program would have maintained the locks it held from the generation of the first report and the updater would be locked out until the locks were released.

How about another example? Consider a program that is looking for pertinent information about employees in the information center and software support departments who make more than $30,000 in base salary. In the DB2 sample tables department 'C01' is the information center and department 'E21' is software support.

The program opens a cursor based on the following SELECT statement:


SELECT EMPNO, FIRSTNME, LASTNAME,
WORKDEPT, SALARY
FROM DSN8710.EMP
WHERE WORKDEPT IN ('C01', 'E21')
AND SALARY > 30000;

The program then begins to FETCH employee rows. Assume further, as would probably be the case, that the statement uses the XEMP2 index on the WORKDEPT column. An update program that implements employee modifications is running concurrently. The program handles transfers by moving employees from one department to another, and implements raises by increasing the salary.

Assume that Sally Kwan, one of your employees, has just been transferred from the information center to software support. Assume further that another information center employee, Heather Nicholls, received a 10 percent raise. The update program running concurrently with the report program implements both of these modifications.

If the report program were bound with an isolation level of CS, the second program could move Sally from 'C01' to 'E21' after she was reported to be in department 'C01' but before the entire report was finished. Thus, she could be reported twice: once as an information center employee and again as a software support employee. Although this circumstance is rare, it can happen with programs that use cursor stability. If the program were bound instead with RR, this problem could not happen. The update program probably would not be allowed to run concurrently with a reporting program, however, because it would experience too many locking problems.

Now consider Heather's dilemma. The raise increases her salary 10 percent, from $28,420 to $31,262. Her salary now fits the parameters specified in the WHERE condition of the SQL statement. Will she be reported? It depends on whether the update occurs before or after the row has been retrieved by the index scan, which is clearly a tenuous situation. Once again, RR avoids this problem.

You might be wondering, "If CS has the potential to cause so many problems, why is it used so ubiquitously? Why not trade the performance and concurrency gain of CS for the integrity of RR?"

The answer is simple: the types of problems outlined are rare. The expense of using RR, however, can be substantial in terms of concurrency. So the tradeoff between the concurrency expense of RR and the efficiency of CS usually is not a sound one.

The third isolation level provided by DB2 is read stability (RS). Read stability is similar in functionality to the RR isolation level, but a little less restrictive. A retrieved row or page is locked until the end of the unit of work; no other program can modify the data until the unit of work is complete, but other processes can insert values that might be read by your application if it accesses the row a second time.

Consider using read stability over repeatable read only when your program can handle retrieving a different set of rows each time a cursor or singleton SELECT is issued. If using read stability, be sure your application is not dependent on having the same number of rows returned each time.

Finally, we come to the last, and most maligned isolation level, uncommitted read (UR). The UR isolation level provides read-through locks, also known as dirty read or read uncommitted. Using UR can help to overcome concurrency problems. When you're using an uncommitted read, an application program can read data that has been changed but is not yet committed.

UR can be a performance booster, too, because application programs bound using the UR isolation level will read data without taking locks. This way, the application program can read data contained in the table as it is being manipulated. Consider the following sequence of events:

1. To change a specific value, at 9:00 a.m. a transaction containing the following SQL is executed:


UPDATE EMP
SET FIRSTNME = 'MICHELLE'
WHERE EMPNO = 10020;

The transaction is a long-running one and continues to execute without issuing a COMMIT.


2. At 9:01 a.m., a second transaction attempts to SELECT the data that was changed, but not committed.


If the UR isolation level were used for the second transaction, it would read the changed data even though it had yet to be committed. Obviously, if the program doesn't need to wait to take a lock and merely reads the data in whatever state it happens to be at that moment, the program will execute faster than if it had to wait for locks to be taken and resources to be freed before processing.

The implications of reading uncommitted data, however, must be carefully examined before being implemented. Several types of problems can occur. Using the previous example, if the long-running transaction rolled back the UPDATE to EMPNO 10020, the program using dirty reads may have picked up the wrong name ("MICHELLE") because it was never committed to the database.

Inaccurate data values are not the only problems that can be caused by using UR. A dirty read can cause duplicate rows to be returned where none exist. Alternatively, a dirty read can cause no rows to be returned when one (or more) actually exists. Additionally, an ORDER BY clause does not guarantee that rows will be returned in order if the UR isolation level is used. Obviously, these problems must be taken into consideration before using the UR isolation level.

Keep in mind, too, that the UR isolation level applies to read-only operations: SELECT, SELECT INTO, and FETCH from a read-only result table. Any application plan or package bound with an isolation level of UR will use uncommitted read functionality for any read-only SQL. Operations contained in the same plan or package that are not read-only will use an isolation level of CS.

When is it appropriate to use UR isolation? The general rule of thumb is to avoid UR whenever the results must be 100 percent accurate. Following are examples of when this would be true:

  • Calculations that must balance are being performed on the selected data

  • Data is being retrieved from one source to insert to or update another

  • Production, mission-critical work is being performed that cannot contain
    or cause data integrity problems

In general, most DB2 applications are not serious candidates for dirty reads. In a few specific situations, however, the dirty read capability will be of major benefit. Consider the following cases in which the UR isolation level could prove to be useful:

  • Access is required to a reference, code, or look-up table that basically is static in nature. Due to the non-volatile nature of the data, a dirty read would be no different than a normal read the majority of the time. In those cases when the code data is being modified, any application reading the data would incur minimum, if any, problems.

  • Statistical processing must be performed on a large amount of data. Your company, for example, might want to determine the average age of female employees within a certain pay range. The impact of an uncommitted read on an average of multiple rows will be minimal because a single value changed will not greatly impact the result.

  • Dirty reads can prove invaluable in a data warehousing environment that uses DB2 as the DBMS. A data warehouse is a time-sensitive, subject-oriented, store of business data that is used for online analytical processing. Other than periodic data propagation and/or replication, access to the data warehouse is read-only. Because the data is generally not changing, an uncommitted read is perfect in a read-only environment due to the fact that it can cause little damage. More data warehouse projects are being implemented in corporations worldwide and DB2 with dirty read capability is a very wise choice for data warehouse implementation.

  • In those rare cases when a table, or set of tables, is used by a single user only, UR can make a lot of sense. If only one individual can be modifying the data, the application programs can be coded such that all (or most) reads are done using UR isolation level, and the data will still be accurate.

  • Finally, if the data being accessed already is inconsistent, little harm can be done using a dirty read to access the information.

Although the dirty read capability can provide relief to concurrency problems and deliver faster performance in specific situations, it also can cause data integrity problems and inaccurate results. Be sure to understand the implications of the UR isolation level and the problems it can cause before diving headlong into implementing it in your production applications.

Summary

It is important for DB2 DBAs and application programmers to know the four isolation levels and their impact on SQL. Using the isolation levels is an effective way to control concurrency and locking for your DB2 applications.

Wednesday, March 18, 2009

A Short Introduction to Lock Avoidance

Lock avoidance is a mechanism employed by DB2 for z/OS to access data without locking while still maintaining data integrity. It prohibits access to uncommitted data and serializes access to pages. Lock avoidance improves performance by reducing the overall volume of lock requests. After all, let’s face it, the most efficient lock is the one never taken.


Of course, even if it is not taking a lock, DB2 must still maintain the integrity of its data. Instead of taking a lock, DB2 uses a latch. To take advantage of lock avoidance, the SQL statement must be read-only (or ambiguous) and the plan must be bound with ISOLATION(CS) and CURRENTDATA(NO).


In general, DB2 avoids locking data pages if it can determine that the data to be accessed is committed and that no semantics are violated by not acquiring the lock. DB2 avoids locks by examining the log to verify the committed state of the data.


When determining if lock avoidance techniques will be practical, DB2 first scans the page to be accessed to determine whether any rows qualify. If none qualify, a lock is not required.


For each data page to be accessed, the RBA of the last page update (stored in the data page header) is compared with the log RBA for the oldest active unit of recovery. This RBA is called the Commit Log Sequence Number, or CLSN. If the CLSN is greater than the last page update RBA, the data on the page has been committed and the page lock can be avoided.


Additionally, a bit is stored in the record header for each row on the page. The bit is called the Possibly UNCommitted, or PUNC, bit. The PUNC bit indicates whether update activity has been performed on the row. For each qualifying row on the page, the PUNC bit is checked to see whether it is off. This indicates that the row has not been updated since the last time the bit was turned off. Therefore, locking can be avoided. (Note that there is no external method for DBAs to use to determine whether a row’s PUNC bit is on or off.)


If neither CLSN nor PUNC bit testing indicates that a lock can be avoided, DB2 acquires the requisite lock.


In addition to enhancing performance, lock avoidance improves data availability. Data that without lock avoidance would have been considered locked, and therefore unavailable, can now be accessible.


Lock avoidance is used only for data pages. Further, DB2 Catalog and DB2 Directory access does not use lock avoidance techniques. You can avoid locks under the following circumstances:


  • For any pages accessed by read-only or ambiguous queries bound with ISOLATION(CS) and CURRENTDATA NO
  • For any unqualified rows accessed by queries bound with ISOLATION(CS) or ISOLATION(RS)
  • When DB2 system-managed referential integrity checks for dependent rows caused by either the primary key being modified, or the parent row being deleted and the DELETE RESTRICT rule is in effect
  • For both COPY and RUNSTATS when SHRLEVEL(CHANGE) is specified


To determine the impact of lock avoidance on your system, you can review DB2 trace records. IFCIDs 218 and 223 provide CLSN information, and IFCIDs 226 and 227 provide 'wait for page latch' information.


Avoiding locks can improve the performance of your queries and programs that satisfy the preceding requirements. To encourage DB2 to avoid locks, BIND your plans and packages specifying ISOLATION(CS) and CURRENTDATA NO. Furthermore, avoid ambiguous cursors by specifying FOR READ ONLY for all cursors that are not used for updating.
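Putting that advice together, a minimal sketch might look like the following; the collection, program, and cursor names are placeholders, and the query uses the DB2 sample EMP table:

BIND PACKAGE(MYCOLL) MEMBER(RPTPROG) ISOLATION(CS) CURRENTDATA(NO)

DECLARE CSR1 CURSOR FOR
  SELECT EMPNO, LASTNAME
  FROM   DSN8710.EMP
  WHERE  WORKDEPT = 'C01'
  FOR READ ONLY;

Declaring the cursor FOR READ ONLY removes any ambiguity about whether it might be used for a positioned UPDATE or DELETE, which makes it a candidate for lock avoidance.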

Wednesday, January 16, 2008

Q+A: Locking

I get questions e-mailed to me all the time. Although I try to read and answer them all, sometimes I don't. I hope those whom I've ignored over the years will forgive me, but I can't always answer everything (not enough time/energy) and sometimes things get lost or drop through the cracks.

Anyway, at times I will take a question I get and blog about it in Q+A format. Today is one of those days!

The question was: I want to perform a retry on an INSERT under DB2 for z/OS when I get a deadlock/timeout. -911 causes a rollback automatically. Is there a ZPARM or other method of turning this off? I am inserting millions of rows and do not want a rollback to the last commit point.

Here is my answer:

Well, first of all, let me recommend that you minimize the size of your unit of work. If you are inserting millions of rows without a COMMIT you are likely causing locking issues in your environment. The pages you have locked while you are waiting for your millions of inserts to finish are all unavailable to any other user of the table (assuming page locking). That means any data on any page that you have locked cannot be read by anyone else until your unit of work is committed. Any other user, running at the same time as you are, trying to get to any page you have modified, would be getting -911 too.

That being said, you can control whether or not the work is rolled back automatically in CICS (on a thread basis) using an RDO parameter (or RCT if on an ancient CICS). The parameter is called ROLBE (RCT) or DROLLBACK (RDO). If it is set to YES a CICS SYNCPOINT ROLLBACK is issued and a -911 SQLCODE is returned to the program. If NO is coded, a CICS SYNCPOINT ROLLBACK is not issued and the SQLCODE is set to -913. You will have to programmatically either specify COMMIT or ROLLBACK for the unit of work.

In a batch environment you will need to code your programs to periodically issue COMMITs after so many modifications (or using some other method like a timer or loop counter). There is no method I am aware of to automatically control this behavior outside of looking into a third party product (for example, Softbase Checkpoint Restart, and others).

Wednesday, August 22, 2007

Optimistic Locking [DB2 9 for z/OS]

DB2 Version 9 improves support for coding optimistic locking techniques. What is optimistic locking? Sometimes referred to as optimistic concurrency control, optimistic locking is basically just what it sounds like. We are optimists and think that usually we will be the only ones with interest in the data. In other words, when optimistic locking is implemented you are assuming that most of the time there will be no other programs that are interested in the page of data that you are planning to modify.


Of course, even in the most optimistic world there will be exceptions, so optimistic locking does not assume that there will never be any concurrent processes that need to access your page(s). Basically, with optimistic locking you can improve performance by minimizing locking. So how do we do that?


When an application uses optimistic locking, locks are obtained immediately before a read operation and then released immediately. Update locks are obtained immediately before an update operation and held until the end of the transaction. Optimistic locking uses the RID (Record IDentifier) and a row change timestamp to test whether data has been changed by another transaction since the last read operation.


DB2 knows when a row was changed and so therefore he (I make DB2 masculine) can ensure data integrity even as he minimizes the duration of locks. With optimistic locking, DB2 releases the page (or row) locks immediately after a read operation. And if you are using row locks, DB2 releases the row lock after each FETCH, taking a new lock on a row only for a positioned update or a positioned delete.


Careful readers will have noticed that I talked about a “row change timestamp” but you may not have heard that expression before. DB2 V9 adds support for automatically generated timestamp columns and if you wish to implement optimistic locking you will need to create (or alter) your tables to have a row change timestamp column, defined as follows:


NOT NULL GENERATED ALWAYS
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

or

NOT NULL GENERATED BY DEFAULT
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

For tables having a row change timestamp column, DB2 will automatically populate and maintain the timestamp values for each row. Notice how the syntax is similar to the syntax used for other automatically generated DB2 values, such as sequences. DB2 will automatically generate the timestamp value for each row when the row is inserted, and modify the timestamp for each row when any column in that row is updated.

When you add a ROW CHANGE TIMESTAMP column to an existing table, the initial value for existing rows will not be immediately populated. Instead, DB2 places the table space in an advisory-REORG pending state. When you reorganize the table space, DB2 will generate the values for the ROW CHANGE TIMESTAMP column for all rows (and, of course, remove the advisory-REORG pending status).
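For instance, adding such a column to an existing table might look something like the following sketch, which reuses the CUSTOMER table and column name from the example that follows:

ALTER TABLE CUSTOMER
  ADD CUST_INFOCHANGE TIMESTAMP NOT NULL
      GENERATED ALWAYS FOR EACH ROW ON UPDATE
      AS ROW CHANGE TIMESTAMP;

After the ALTER, the table space sits in advisory-REORG pending status until it is reorganized, as just described.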

Now that you have this new column you can use it as a condition for making an UPDATE and specify it in your WHERE clause. Let’s walk thru a couple of examples.

First of all, when a table contains a ROW CHANGE TIMESTAMP you can use it to find out when its rows were modified. Let’s use the following table as an example:

CREATE TABLE CUSTOMER
  (CUSTNO           CHAR(8) NOT NULL,
   CUST_INFOCHANGE  TIMESTAMP NOT NULL GENERATED ALWAYS
                      FOR EACH ROW ON UPDATE
                      AS ROW CHANGE TIMESTAMP,
   CUST_NAME        VARCHAR(50),
   CUST_ADDRESS     VARCHAR(100),
   CUST_CITY        CHAR(20),
   CUST_STATE       CHAR(2),
   CUST_ZIP         CHAR(9),
   CUST_PHONE       CHAR(10),
   PRIMARY KEY (CUSTNO)
  );

Now that the table is defined with the ROW CHANGE TIMESTAMP we can use it in our programs and queries to determine change information about the data. For example, if we want to find all of the customer rows that were changed in the past week (i.e., the last 7 days) we could run the following query:


SELECT CUSTNO, CUST_NAME
FROM   CUSTOMER
WHERE  ROW CHANGE TIMESTAMP FOR CUSTOMER <= CURRENT TIMESTAMP
  AND  ROW CHANGE TIMESTAMP FOR CUSTOMER >= CURRENT TIMESTAMP - 7 DAYS;

But what would happen if you issued a statement like this against a table that was altered to include a ROW CHANGE TIMESTAMP? For example, what if we created the CUSTOMER table as shown but without the CUST_INFOCHANGE column, populated the table with data, and then altered the table to include the CUST_INFOCHANGE column? In this case, DB2 will use the time the page was last modified. So the results will not be exactly correct, because the query would return all the rows on each qualifying page (because at least one row on the page changed), not just the rows that actually changed. This is why it is important to clear up the advisory-REORG pending status as soon as possible after adding the ROW CHANGE TIMESTAMP.
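The second example is the optimistic UPDATE itself. After reading a row (and saving its row change timestamp in a host variable), the program can verify at update time that nobody else has touched the row in the interim. The host variable names here are, of course, made up:

UPDATE CUSTOMER
   SET CUST_PHONE = :NEW_PHONE
 WHERE CUSTNO = :SAVED_CUSTNO
   AND ROW CHANGE TIMESTAMP FOR CUSTOMER = :SAVED_RCT;

If the UPDATE affects zero rows (SQLCODE +100), another transaction changed that customer since it was read, and the program can re-read the row and decide whether to retry.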

OK, this is all well and good, and you can probably see the value of having this automagically changing timestamp in some of your tables, but where is the optimistic locking part? Well, for programs that use updateable static scrollable cursors DB2 can use optimistic locking as long as the plan/package is bound specifying ISOLATION(CS). If you have this situation, DB2 will deploy optimistic locking to reduce the duration of locks between consecutive FETCH operations and between fetch operations and subsequent positioned UPDATE or DELETE operations.

Remember, though, DB2 cannot use optimistic concurrency control for dynamic scrollable cursors. With dynamic scrollable cursors, the most recently fetched row or page from the base table remains locked to maintain position for a positioned UPDATE or DELETE.

Without optimistic locking, the lock taken at the first FETCH is held until the next FETCH. The lock taken at the last FETCH is held until COMMIT, ROLLBACK, or the end of transaction.

With optimistic locking, the scenario changes significantly. When the application requests a FETCH to position the cursor on a row, DB2 locks that row, executes the FETCH and releases the lock. When the application requests a positioned UPDATE or DELETE on the row, DB2 locks the row and then re-evaluates the predicate to ensure that the row still qualifies for the result table.
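Sketched in SQL terms (the cursor and host variable names are illustrative, and CUSTOMER is the table from the earlier example), the flow looks like this:

DECLARE CSR1 SENSITIVE STATIC SCROLL CURSOR FOR
  SELECT CUSTNO, CUST_NAME
  FROM   CUSTOMER
  FOR UPDATE OF CUST_NAME;

OPEN CSR1;

FETCH NEXT FROM CSR1 INTO :HV_CUSTNO, :HV_NAME;

UPDATE CUSTOMER
   SET CUST_NAME = :HV_NEWNAME
 WHERE CURRENT OF CSR1;

With optimistic locking in effect, the row lock taken for the FETCH is released right away; the positioned UPDATE then re-locks the row and DB2 re-evaluates the predicate before applying the change.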

So, when you move to DB2 9 for z/OS you should evaluate your applications looking for programs that could take advantage of optimistic locking – and then add the ROW CHANGE TIMESTAMP to the appropriate tables.

Monday, June 18, 2007

Skipping Locked Rows [DB2 9 for z/OS]

In DB2 9 it is possible for a transaction to skip over rows that are locked. This can be accomplished by means of the SKIP LOCKED DATA option within your SQL statement(s). SKIP LOCKED DATA can be specified in SELECT, SELECT INTO, and PREPARE, as well as searched UPDATE and DELETE statements. You can also use the SKIP LOCKED DATA option with the UNLOAD utility.

Of course, if a program skips over locked data then that data is not accessed and the program will not have it available. When this option is used DB2 will just skip over any locked data instead of waiting for it to be unlocked. The benefit, of course, is improved performance because you will not incur any lock wait time. But it comes at the cost of not accessing the locked data at all. This means that you should only utilize this clause when your program can tolerate skipping over some data.

The SKIP LOCKED DATA option is compatible with cursor stability (CS) isolation and read stability (RS) isolation. But it cannot be used with uncommitted read (UR) or repeatable read (RR) isolation levels. DB2 will simply ignore the SKIP LOCKED DATA clause under UR and RR isolation levels.

Additionally, SKIP LOCKED DATA works only with row locks and page locks. That means that SKIP LOCKED DATA does not apply to table, partition, LOB, XML, or table space locks.
Let's look at an example. Suppose we have a table with 5 rows in it that looks like this:

KEY  FNAME  LNAME
---  -----  -------
1    JOE    MAMA
2    DON    KNOTTS
3    KIM    PORTANT
4    BOB    NOBBLE
5    KIM    BIMBO


Assume row level locking. Next assume that an UPDATE statement is run against the table changing FNAME to JIM WHERE FNAME = 'KIM'. And it is hanging out there without a COMMIT. Next, we run:


SELECT COUNT(*)
FROM   TABLE
WHERE  FNAME >= 'AAA'
SKIP LOCKED DATA;

The count returned would be 3 because DB2 skips the two locked rows (rows 3 and 5). And, of course, if the locks are released the count would be 5 again.

Monday, December 19, 2005

Minimizing Lock Contention Issues

I frequently get e-mails with DB2 questions and I plan to start posting answers to some of the more common ones up here.

One issue that comes up a lot is dealing with locking issues. Usually it is posed by someone who is experiencing timeouts in an online environment and they want to know how to minimize them. Here is some guidance.

When you experience a timeout, it means that another process holds a lock on the data that you are trying to modify. So, it stands to reason that you should try to minimize the duration of locks that are being held in your system. There are some approaches you can take to achieve this, but for the most part, they require programming changes.

First of all, make sure that all of your batch processes -- especially any that run during the same timeframe, but really all batch processes -- have a COMMIT strategy. This means that your programs should issue a COMMIT after processing "a set number of" inserts, updates, and deletes. A COMMIT tells DB2 to make the changes permanent and releases the locks.

A good approach is to set a counter that is incremented after every modification. Then, check it and when it exceeds a predefined threshold -- say 25 or 50 or 100 modifications -- issue a COMMIT. You should make the threshold an input parameter so that you can change it as the workload in your system changes. For example, make it 25 when concurrent activity is high, but ramp it up to 100 or higher when it is low. Failure to issue COMMITs will result in timeouts, as well as possibly deadlocks and lock escalation.

(Also, please note that these are just sample numbers and not necessarily the correct numbers to start with at your shop.)
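To make the idea concrete, here is a minimal sketch of the counter-and-threshold pattern expressed as a DB2 native SQL procedure. In most shops this logic lives in the batch program's COBOL (or other host language) code; the OLD_TRANS table, its columns, and the procedure name are all made up for the example:

CREATE PROCEDURE PURGE_OLD_TRANS (IN COMMIT_LIMIT INTEGER)
  LANGUAGE SQL
BEGIN
  DECLARE AT_END    INTEGER DEFAULT 0;
  DECLARE DEL_COUNT INTEGER DEFAULT 0;
  DECLARE V_KEY     INTEGER;
  DECLARE C1 CURSOR WITH HOLD FOR
    SELECT TRANS_KEY
    FROM   OLD_TRANS
    WHERE  TRANS_DATE < CURRENT DATE - 1 YEAR
    FOR UPDATE;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET AT_END = 1;

  OPEN C1;
  FETCH C1 INTO V_KEY;
  WHILE AT_END = 0 DO
    DELETE FROM OLD_TRANS WHERE CURRENT OF C1;
    SET DEL_COUNT = DEL_COUNT + 1;
    IF DEL_COUNT >= COMMIT_LIMIT THEN
      COMMIT;
      SET DEL_COUNT = 0;
    END IF;
    FETCH C1 INTO V_KEY;
  END WHILE;
  CLOSE C1;
  COMMIT;
END

The WITH HOLD clause keeps the cursor open and positioned across the intermediate COMMITs, and passing COMMIT_LIMIT in as a parameter lets you tune the threshold as concurrent activity changes.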

A good article to read regarding COMMIT strategies is "The Woes of Commitment" by Bonnie Baker.

Next, look at all of your programs, batch and online, and try to move the data modification statements as close to the COMMIT as possible. By saving the data modifications until right before you issue a COMMIT, you reduce the overall average lock duration. This should result in reduced contention and, therefore, fewer timeouts.

If you want to investigate the timeout details, be sure to examine the statistics trace class 3 and IFCID 0196 for timeouts (IFCID 0172 is for deadlocks).
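If the statistics trace is not already active, it can be started with a DB2 command along these lines (the destination is illustrative; many shops route statistics records to SMF by default):

-START TRACE(STAT) CLASS(3) DEST(SMF)

Statistics trace class 3 is the class that externalizes the deadlock and timeout detail records mentioned above.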