DB2 automatically guarantees the integrity of data by enforcing several locking strategies. These strategies permit multiple users from multiple environments to access and modify data concurrently. Basically, DB2 combines the following strategies to implement an overall locking strategy:
• Table and table space locking
• IRLM page and row locking
• Internal page and row latching
• Claims and drains to achieve partition independence
• Checking commit log sequence numbers (CLSN) and PUNC bits to achieve lock avoidance
• Global locking through the coupling facility in a data sharing environment
But what exactly is locking? How does DB2 utilize these strategies to lock pages and guarantee data integrity? Why does DB2 have to lock data before it can process it? What is the difference between a lock and a latch? How can DB2 provide data integrity while operating on separate partitions concurrently? Finally, how can DB2 avoid locks and still guarantee data integrity?
Today's blog post will offer a high level introduction to DB2 locking, why it is needed, as well as the benefits and drawbacks that it introduces to using DB2 databases.
Anyone accustomed to application programming when access to a database is required understands the potential for concurrency problems. When one application program tries to read data that is in the process of being changed by another, the DBMS must forbid access until the modification is complete to ensure data integrity. Most DBMS products, DB2 included, use a locking mechanism for all data items being changed. Therefore, when one task is updating data on a page, another task cannot access data (read or update) on that same page until the data modification is complete and committed.
When multiple users can access and update the same data at the same time, a locking mechanism is required. This mechanism must be capable of differentiating between stable data and uncertain data. Stable data has been successfully committed and is not involved in an update in a current unit of work. Uncertain data is currently involved in an operation that could modify its contents.
Consider two DB2 application programs. If program #1 updates a piece of data on page 1, you must ensure that program #2 cannot access the data until program #1 commits the unit of work. Otherwise, a loss of integrity could result. Without a locking mechanism, the following sequence of events would be possible:
- Program #1 retrieves a row from DSN81010.EMP for EMPNO ‘000010’.
- Program #1 issues an update statement to change that employee’s salary to 55000.
- Program #2 retrieves the DSN81010.EMP row for EMPNO ‘000010’. Because the change was not committed, the old value for the salary, 52750, is retrieved.
- Program #1 commits the change, causing the salary to be 55000.
- Program #2 changes a value in a different column and commits the change.
- The value for salary is now back to 52750, negating the change made by program #1.
The DBMS avoids this situation by using a locking mechanism. DB2 supports locking at four levels, or granularities: table space-, table-, page-, and row-level locking. DB2 also provides LOB locking for large objects (BLOBs, CLOBs, and DBCLOBs).
Locks can be taken at any level in the locking hierarchy without taking a lock at the lower level. However, locks cannot be taken at the lower levels without a compatible higher-level lock also being taken. For example, you can take a table space lock without taking any other lock, but you cannot take a page lock without first securing a table space-level lock (and a table lock as well if the page is part of a table in a segmented table space).
Additionally, a page lock does not have to be taken before a row lock is taken. Your locking strategy requires an “either/or” type of choice by table space: either row locking or page locking. An in-depth discussion on the merits of both are beyond the scope of today's blog post, but will be covered in future posts.
Both page locks and row locks can escalate to a table level and then to a table space level for segmented tables or straight to a table space level for partitioned table spaces. A table or table space cannot have both page locks and row locks held against it at the same time.
Many modes of locking are supported by DB2, but they can be divided into two types:
• Locks to enable the reading of data
• Locks to enable the updating of data
But remember, we are still talking at a very high, and somewhat simplistic level. DB2 uses varieties of these two types of locks to indicate the type of locking required.
Locks Versus Latches
A true lock is handled by DB2 using the IRLM. However, whenever doing so is practical, DB2 tries to lock resources without going to the IRLM. This type of lock is called a latch. Whereas true locks are always set in the IRLM, latches are set internally by DB2, without going to the IRLM.
When a latch is taken instead of a lock, it is handled in the Buffer Manager by internal DB2 code; so the cross-memory service calls to the IRLM are eliminated. Latches are usually held only briefly—for a shorter duration than locks. Also, a latch requires about one-third the number of instructions as a lock. Therefore, latches are more efficient than locks because they avoid the overhead associated with calling an external address space. Latches are used when a resource serialization situation is required for a short time. Both latches and locks guarantee data integrity. In subsequent sections, any usage of the term lock generically, refers to both locks and latches.
Before you learn about the various types of locks that can be acquired by DB2, you should understand lock duration, which refers to the length of time that a lock is maintained. The duration of a lock is based on the BIND options chosen for the program requesting locks. Locks can be acquired either immediately when the program is requested to be run or iteratively as needed during the execution of the program. Locks can be released when the plan is terminated or when they are no longer required for a unit of work.
The BIND parameters affecting DB2 locking are summarized below:
The BIND parameters that impact table space locks are the ACQUIRE and RELEASE parameters.
- ACQUIRE(ALLOCATE | USE):
option specifies that locks will be acquired when the plan is allocated, which
normally occurs when the first SQL statement is issued. The USE option indicates that locks
will be acquired only as they are required, SQL statement by SQL statement. As of DB2 10, ACQUIRE(ALLOCATE) is no longer supported when binding or rebinding plans.
- RELEASE(DEALLOCATE | COMMIT): When you specify DEALLOCATE for a program, locks are not released until the plan is terminated. When you specify COMMIT, table space locks are released when a COMMIT is issued.
The BIND parameter affecting page and row locks is the ISOLATION parameter. There are four choices for isolation level:
- ISOLATION(CS), or Cursor Stability, acquires and releases page locks as pages are read and processed. CS provides the greatest level of concurrency at the expense of potentially different data being returned by the same cursor if it is processed twice during the same unit of work.
- ISOLATION(RR), or Repeatable Read, holds page and row locks until a COMMIT point; no other program can modify the data. If data is accessed twice during the unit of work, the same exact data will be returned.
- ISOLATION(RS), or Read Stability, holds page and row locks until a COMMIT point, but other programs can INSERT new data. If data is accessed twice during the unit of work, new rows may be returned, but old rows will not have changed.
- ISOLATION(UR), or Uncommitted Read, is also known as dirty read processing. UR avoids locking altogether, so data can be read that never actually exists in the database.
Regardless of the ISOLATION level chosen, all page locks are released when a COMMIT is encountered.
I think that is enough for this first installment on DB2 locking... come back soon and we will expand on table space locks in the next installment of this blog series on DB2 locking...