Monday, July 06, 2009

The "Dirty" Read (AKA Uncommitted Read)

Anyone accustomed to application programming when access to a database is required understands the potential for concurrency problems. When one application program attempts to read data that is in the process of being changed by another, the DBMS must forbid access until the modification is complete in order to ensure data integrity. Most DBMS products, DB2 included, use a locking mechanism for all data items being changed. Therefore, when one task is updating data on a page, another task can not access data (read or update) on that same page until the data modification is complete and committed.

Programs that read DB2 data typically access numerous rows during their execution and are thus quite susceptible to concurrency problems. Since V4, DB2 has provided read-through locks, also know as “dirty read” or “uncommitted read,” to help overcome concurrency problems. When using an uncommitted reads an application program can read data that has been changed, but is not yet committed.

Dirty read capability is implemented using a new isolation level, UR, for uncommitted read. This adds to the current isolation levels of RR (repeatable read) and CS (cursor stability). If the application program is using the UR isolation level, it will read data without taking locks. This enables the application program to read data contained in the table as it is being manipulated.

Consider the following sequence of events:

(1) At 9:00 AM, the a transaction is executed containing the following SQL to change a specific value.

UPDATE EMP
SET FIRST_NAME = “MICHELLE”
WHERE EMPNO = 10020;

The transaction is a long-running one and continues to execute without issuing a COMMIT.

(2) At 9:01 AM, a second transaction attempts to SELECT the data that was changed, but not committed.


If the UR isolation level were specified for the second transaction, it would read the changed data even though it had yet to be committed. Obviously, if the program need not wait to take a lock and merely reads the data in whatever state it happens to be at that moment, the program will execute faster than if it had to wait for locks to be taken and resources to be freed before processing.

However, the implications of reading uncommitted data must be carefully examined before being implemented. Several types of problems can occur. A dirty read can cause duplicate rows to be returned where none exist. Alternately, a dirty read can cause no rows to be returned when one (or more) actually exists. Additionally, an ORDER BY clause does not guarantee that rows will be returned in order if the UR isolation level is used. Obviously, these problems must be taken into consideration before using the UR isolation level.

UR Isolation Requirements

  • The UR isolation level applies to read-only operations: SELECT, SELECT INTO, and FETCH from a read-only result table.
  • Any application plan or package bound with an isolation level of UR will use uncommitted read functionality for any read-only SQL. Operations that are contained in the same plan or package and are not read-only will use an isolation level of CS.
  • The isolation level that is defined at the plan or package level during BIND or REBIND can be overridden as desired for each SQL statement in the program. The WITH clause can be used to specify the isolation level for any individual SQL statement. For example:


SELECT EMPNO, LAST_NAME
FROM EMP
WITH UR;

The WITH clause is used to allow an isolation level to be used on a statement-by-statement basis. The UR isolation level can be used only with read-only SQL statements. This includes read-only cursors and SELECT INTO statements.

Benefits and Drawbacks

So when is it a good idea to implement dirty reads using the UR isolation level? The general rule of thumb is to avoid dirty reads whenever the results must be 100 percent accurate. Examples of this would be when:
  • calculations that must balance are being performed on the selected data
  • data is being retrieved from one source to insert to or update another
  • production, mission-critical work is being performed that can not contain or cause data integrity problems
Truthfully, most DB2 applications are not usually candidates for dirty reads. However, there are a few specific situations in which the dirty read capability will be of major benefit. Consider the following cases in which the UR isolation level could prove to be useful:
  • Access is required to a reference, code, or look-up table that is basically static in nature. Due to the non-volatile nature of the data, a dirty read would be no different than a normal read the majority of the time. In those cases when the code data is being modified, any application reading the data would incur minimum, if any, problems.
  • Statistical processing must be performed on a large amount of data. For example, your company may wish to determine the average age of female employees within a certain pay range. The impact of an uncommitted read on an average of multiple rows will be minimal because a single value changed will not greatly impact the result.
  • Dirty read can prove invaluable in a data warehousing environment that uses DB2 as the DBMS. A data warehouse is a time sensitive, subject-oriented, store of business data that is used for on-line analytical processing. Other than periodic data propagation and/or replication, access to the data warehouse is read only. An uncommitted read can be perfect in a read only environment since it can cause little damage because the data is generally not changing. More and more data warehouse projects are being implemented in corporations worldwide and DB2 with dirty read capability may be a very wise choice for data warehouse implementation.
  • In those rare cases when a table, or set of tables, is used by a single user only, UR can make a lot of sense. If only one individual can be modifying the data, the application programs can be coded such that all (or most) reads are done using UR isolation level, and the data will still be accurate.
  • Finally, if the data being accessed is already inconsistent little harm can be done using a dirty read to access the information.
Administration

Because of the data integrity issues associated with dirty reads, it is a good idea for DBAs to keep track of the plans and packages that specify an isolation level of UR. This information can be found in the DB2 catalog. The following two queries can be used to find the applications using uncommitted reads.

Issue the following SQL for a listing of plans that were bound with ISOLATION(UR) or contain at least one statement specifying the WITH UR clause:

SELECT DISTINCT S.PLNAME
FROM SYSIBM.SYSPLAN P,
SYSIBM.SYSSTMT S
WHERE (P.NAME = S.PLNAME AND
P.ISOLATION = ‘U’
)
OR S.ISOLATION = ‘U’
ORDER BY S.PLNAME;

Issue the following SQL for a listing of packages that were bound with ISOLATION(UR) or contain at least one statement specifying the WITH UR clause:

SELECT DISTINCT P.COLLID, P.NAME, P.VERSION
FROM SYSIBM.SYSPACKAGE P,
SYSIBM.SYSPACKSTMT S
WHERE (P.LOCATION = S.LOCATION AND
P.LOCATION = ‘ ‘ AND
P.COLLID = S.COLLID AND
P.NAME = S.NAME AND
P.VERSION = S.VERSION AND
P.ISOLATION = ‘U’
)
OR S.ISOLATION = ‘U’
ORDER BY S.COLLID, S.NAME, S.VERSION;

Synopsis

The dirty read capability can provide relief to concurrency problems and deliver faster performance in very specific situations. Be certain to understand the implications of the UR isolation level and the “problems” it can cause before diving headlong into implementing it in your production applications.

Labels: , ,

5 Comments:

Blogger Arun S said...

I got an error msg when running the sql to find the ur statements in packages or statements. The sysibm.sysstmt dont have the isolation column. Here is it's description
db2instp@DALPADM1:~/arunz> db2 "describe table sysibm.sysstmt"

Column Type Type
name schema name Length Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
PLNAME SYSIBM VARCHAR 128 0 No
PLCREATOR SYSIBM VARCHAR 128 0 No
STMTNO SYSIBM INTEGER 4 0 No
SECTNO SYSIBM SMALLINT 2 0 No
TEXT SYSIBM CLOB 2097152 0 No
UNIQUE_ID SYSIBM CHARACTER 8 0 No
HOST_VARS SYSIBM BLOB 4190000 0 Yes
I am using udb version 9.1

7:42 AM  
Blogger rivanor said...

Very good post, indeed. Helped me to clearly understand UR. Thanks.

9:33 AM  
Anonymous Andy said...

Thanks for the nice article. Do you have one for CS and RR?

12:48 AM  
Blogger Neeraj said...

Thanks a lot Craig. Very useful article indeed. I was wondering what would happen if the program is coded with "WITH UR" and bound to a plan that is a "CS" isolation level and vice versa.

2:27 PM  
Blogger Craig S Mullins said...

The individual SQL statements that are coded using the WITH clause will use the isolation level you specify there; all other SQL statements will use the isolation level that was specified at BIND time.

12:05 PM  

Post a Comment

Links to this post:

Create a Link

<< Home