The Db2 Portal Blog

Wednesday, May 22, 2013

DB2 Locking Part 7: Lock Avoidance, Related Issues, and Stuff

In today's blog entry, part 7 in our ongoing series on DB2 locking, we will take a look at lock avoidance... as well as some other related things.

Lock Avoidance

Lock avoidance is a mechanism employed by DB2 for z/OS to access data without locking but also while maintaining data integrity. It prohibits access to uncommitted data and serializes access to pages. Lock avoidance improves performance by reducing the overall volume of lock requests. Let’s face it, the most efficient lock is the one never taken.

In general, DB2 avoids locking data pages if it can determine that the data to be accessed is committed and that no semantics are violated by not acquiring the lock. DB2 avoids locks by examining the log to verify the committed state of the data.

When determining if lock avoidance techniques will be practical, DB2 first scans the page to be accessed to determine whether any rows qualify. If none qualify, a lock is not required.

For each data page to be accessed, the RBA of the last page update (stored in the data page header) is compared with the log RBA for the oldest active unit of recovery. This RBA is called the Commit Log Sequence Number, or CLSN. If the CLSN is greater than the last page update RBA (or LRSN in a data sharing environment), the data on the page has been committed and the page lock can be avoided.

Additionally, a bit is stored in the record header for each row on the page. The bit is called the Possibly UNCommitted, or PUNC, bit. The PUNC bit indicates whether update activity has been performed on the row. The PUNC bit is checked if the CLSN test fails. For each qualifying row on the page, the PUNC bit is checked to see whether it is off. This indicates that the row has not been updated since the last time the bit was turned off. Therefore, locking can be avoided.

Note

IBM provides no external method for you to determine whether the PUNC bit is on or off for each row. Therefore, you should ensure that any table that can be modified is reorganized on a regularly scheduled basis.

If neither CLSN nor PUNC bit testing indicates that a lock can be avoided, DB2 acquires the requisite lock.
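The order of the tests described above can be summarized in a short Python sketch. This is only a model of the decision sequence, not DB2 internals; the names Row, Page, and lock_needed are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Row:
    qualifies: bool  # does the row satisfy the query predicates?
    punc_bit: bool   # True means "possibly uncommitted"


@dataclass
class Page:
    last_update_rba: int  # RBA (or LRSN) of the last page update, from the page header
    rows: List[Row]


def lock_needed(page: Page, clsn: int) -> bool:
    """Return True if a lock must be acquired, False if it can be avoided."""
    qualifying = [r for r in page.rows if r.qualifies]
    if not qualifying:
        return False  # no qualifying rows, so no lock is required
    if clsn > page.last_update_rba:
        return False  # CLSN test: everything on the page is committed
    if all(not r.punc_bit for r in qualifying):
        return False  # PUNC test: no qualifying row is possibly uncommitted
    return True       # both tests failed, so the lock is acquired


old_page = Page(last_update_rba=100, rows=[Row(True, True)])
clean_rows = Page(last_update_rba=500, rows=[Row(True, False), Row(False, True)])
dirty_rows = Page(last_update_rba=500, rows=[Row(True, True)])

print(lock_needed(old_page, clsn=400))    # False
print(lock_needed(clean_rows, clsn=400))  # False
print(lock_needed(dirty_rows, clsn=400))  # True
```

Note that the PUNC test is reached only when the CLSN test fails, which mirrors the sequence in the text.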

In addition to enhancing performance, lock avoidance increases data availability. Data that in previous releases would have been considered locked, and therefore unavailable, is now considered accessible.

When Lock Avoidance Can Occur

Lock avoidance can be used only for data pages. Further, DB2 Catalog and DB2 Directory access does not use lock avoidance techniques. You can avoid locks under the following circumstances:

For any pages accessed by read-only or ambiguous queries bound with ISOLATION(CS) and CURRENTDATA NO
For any unqualified rows accessed by queries bound with ISOLATION(CS) or ISOLATION(RS)
When DB2 system-managed referential integrity checks for dependent rows caused by either the primary key being updated, or the parent row being deleted and the DELETE RESTRICT rule is in effect
For both COPY and RUNSTATS when SHRLEVEL(CHANGE) is specified

An ambiguous cursor is one where DB2 cannot determine whether there is intent to modify data retrieved using that cursor. The cursor is ambiguous if it is in a plan or package containing either PREPARE or EXECUTE IMMEDIATE SQL statements, along with the following conditions: the cursor is not defined with the FOR READ ONLY clause or the FOR UPDATE OF clause; the cursor is not defined on a read-only result table; the cursor is not the target of a WHERE CURRENT OF clause on an UPDATE or DELETE statement.

Skipping Locked Rows

Although not really a part of lock avoidance, the SKIP LOCKED DATA option, added with DB2 V9, allows a transaction to skip over rows that are locked by other processes instead of waiting for them. SKIP LOCKED DATA can be coded on the following SQL statements:

SELECT
SELECT INTO
PREPARE
searched UPDATE
searched DELETE

You can also use the SKIP LOCKED DATA option with the UNLOAD utility.

When you tell DB2 to skip locked data, then that data is not accessed and your program will not have it available. DB2 just skips over any locked data instead of waiting for it to be unlocked. The benefit, of course, is improved performance because you will not incur any lock wait time; however, it comes at the cost of not accessing the locked data. This means that you should utilize this clause only when your program can tolerate skipping over some data.

The SKIP LOCKED DATA option is compatible with cursor stability (CS) isolation and read stability (RS) isolation; however, it cannot be used with uncommitted read (UR) or repeatable read (RR) isolation levels. DB2 simply ignores the SKIP LOCKED DATA clause under UR and RR isolation levels.

In addition, SKIP LOCKED DATA works only with row locks and page locks. That means that SKIP LOCKED DATA does not apply to table, partition, LOB, XML, or table space locks. And the bigger the lock size, the more data will be skipped when a lock is encountered. With row locking, you will be skipping over locked rows; however, with page locking, you will be skipping over all the rows on the locked page.
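To see how lock size changes what gets skipped, consider this small Python model. It is a sketch under simplifying assumptions: rows are numbered consecutively, every page holds ROWS_PER_PAGE rows, and the function name visible_rows is made up for the example.

```python
ROWS_PER_PAGE = 4  # hypothetical page capacity for the example


def visible_rows(row_ids, locked_rows, locksize):
    """Return the rows a reader using SKIP LOCKED DATA would still see."""
    if locksize == "ROW":
        # Only the individually locked rows are skipped.
        return [r for r in row_ids if r not in locked_rows]
    if locksize == "PAGE":
        # One locked row means its whole page is locked and skipped.
        locked_pages = {r // ROWS_PER_PAGE for r in locked_rows}
        return [r for r in row_ids if r // ROWS_PER_PAGE not in locked_pages]
    raise ValueError("locksize must be 'ROW' or 'PAGE'")


rows = list(range(8))  # rows 0-3 live on page 0, rows 4-7 on page 1
locked = {5}           # another transaction is changing row 5

print(visible_rows(rows, locked, "ROW"))   # [0, 1, 2, 3, 4, 6, 7]
print(visible_rows(rows, locked, "PAGE"))  # [0, 1, 2, 3]
```

With row locking, a single locked row hides one row; with page locking, the same lock hides every row that shares its page.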

Use this feature with extreme care, and make sure that you know exactly what you are telling DB2 to do; otherwise, you might be reading less than you want. Consider using this option in certain test environments or possibly even in production under the proper conditions. For example, perhaps you have a program that needs to read from a table like a queue to get a next number. If it is not imperative that the numbers be sequential, using SKIP LOCKED DATA can eliminate bottlenecks by skipping any locked rows/pages to get data off of the queue.

Using Currently Committed Data

DB2 10 for z/OS adds another interesting nuance to locking with the option to use currently committed data.

A common problem encountered by DB2 subsystems with poorly designed applications is frequent lock timeouts (and deadlocks). Under normal locking semantics, if data is locked, an application requesting the same data just waits until it becomes available, or times out after waiting. As of DB2 V10, though, there is an alternative option for applications that cannot tolerate waiting on locks.

Using the CONCURRENTACCESSRESOLUTION parameter of the BIND command, you can direct your application program to use currently committed semantics by specifying USECURRENTLYCOMMITTED. This clause applies only when the isolation level in effect is cursor stability (CS) or read stability (RS); otherwise it is simply ignored.

The impact of this clause on the program is that DB2 uses the currently committed version of the data for applicable scans when data is in the process of being updated or deleted. Rows that are in the process of being inserted can be skipped. So, only committed data is returned, as was the case previously, but now readers do not wait for writers to release locks. Instead, readers return data based on the currently committed version; that is, data prior to the start of the write operation.
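As a rough illustration of these semantics, the following Python sketch models only the behavior described in the paragraph above; it is not how DB2 implements the feature, and actual behavior can vary by release. The Row class and the read_currently_committed function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Row:
    committed: Optional[int]  # last committed value; None if never committed
    pending: Optional[str]    # uncommitted activity: "INSERT", "UPDATE", "DELETE", or None


def read_currently_committed(rows: List[Row]) -> List[Optional[int]]:
    """Return what a reader sees without waiting on any writer."""
    result = []
    for row in rows:
        if row.pending == "INSERT":
            continue  # a row still being inserted is skipped
        # For rows being updated or deleted, return the committed version.
        result.append(row.committed)
    return result


rows = [
    Row(committed=100, pending=None),
    Row(committed=200, pending="UPDATE"),
    Row(committed=300, pending="DELETE"),
    Row(committed=None, pending="INSERT"),
]

print(read_currently_committed(rows))  # [100, 200, 300]
```

The reader never blocks: it sees the data as it stood before each in-flight write began.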

Until next time... stay tuned for the next edition in the DB2 Locking series here at the DB2 Portal.

Friday, May 10, 2013

DB2 Locking, Part 6: Claims, Drains, and Partition Independence

DB2 augments resource serialization using claims and drains in addition to transaction locking. The claim and drain process enables DB2 to perform concurrent operations on multiple partitions of the same table space.

Claims and drains provide another “locking” mechanism to control concurrency for resources between SQL statements, utilities, and commands. But do not confuse the issue: DB2 continues to use transaction locking, as well as claims and drains.

As with transaction locks, claims and drains can time out while waiting for a resource.

Claims

DB2 uses a claim to register that a resource is being accessed. The following resources can be claimed:

Segmented table spaces
A single data partition of a partitioned table space (classic or universal range-partitioned)
A non-partitioned index space
A single index partition of a partitioned index (of either the partitioning index or a DPSI)

Think of claims as usage indicators. A process stakes a claim on a resource, telling DB2, in effect, “Hey, I’m using this!” A claim is a notification to DB2 that an object is being accessed. Claims prevent drains from occurring until the claim is released, which usually occurs at a commit point.

Claims prevent drains from acquiring a resource. A claim is acquired when a resource is first accessed. Claims are released at commit time, except for cursors declared using the WITH HOLD clause or when the claimer is a utility.

Multiple agents can claim a single resource. Claims on objects are acquired by the following:

SQL statements (SELECT, INSERT, UPDATE, MERGE, DELETE)
DB2 restart on INDOUBT objects
Some utilities (for example, COPY SHRLEVEL CHANGE, RUNSTATS SHRLEVEL CHANGE, and REPORT)

Every claim has a claim class associated with it. The claim class is based on the type of access being requested, as follows:

A CS claim is acquired when data is read from a package or plan bound specifying ISOLATION(CS).
An RR claim is acquired when data is read from a package or plan bound specifying ISOLATION(RR).
A write claim is acquired when data is deleted, inserted, or updated.

Drains

Like claims, drains also are acquired when a resource is first accessed. A drain acquires a resource by quiescing claims against that resource. Drains can be requested by commands and utilities. A drain is the act of acquiring a locked resource by quiescing access to that object.

Multiple drainers can access a single resource. However, a process that drains all claim classes cannot drain an object concurrently with any other process.

To more fully understand the concept of draining, think back to the last time that you went to a movie theater. Before anyone is permitted into the movie, the prior attendees must first be cleared out. In essence, this example illustrates the concept of draining. DB2 drains make sure that all other users of a resource are cleared out before allowing any subsequent access.

The following resources can be drained:

Segmented table spaces
A single data partition of a partitioned table space (classic or universal range-partitioned)
A non-partitioned index space
A single index partition of a partitioned index (of either the partitioning index or a DPSI)

A drain places drain locks on a resource. A drain lock is acquired for each claim class that must be released. Drain locks prohibit processes from attempting to drain the same object at the same time.

The process of quiescing a claim class and prohibiting new claims from being acquired for the resource is called draining. Draining allows DB2 utilities and commands to acquire partial or full control of a specific object with a minimal impact on concurrent access. Three types of drain locks can be acquired:

A cursor stability drain lock
A repeatable read drain lock
A write drain lock

A drain requires either partial control of a resource, in which case a write drain lock is taken, or complete control of a resource, accomplished by placing a CS drain lock, an RR drain lock, and a write drain lock on an object.

You can think of drains as the mechanism for telling new claimers, “Hey, you can’t use this in that way!” The specific action being prevented by the drain is based on the claim class being drained. Draining write claims enables concurrent access to the resource, but the resource cannot be modified. Draining read (CS and/or RR) and write claims prevents any and all concurrent access.

Drain locks are released when the utility or command completes. When the resource has been drained of all appropriate claim classes, the drainer acquires sole access to the resource.
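The interplay between claims and drains can be modeled with a few lines of Python. This is a simplified sketch, assuming a single resource and ignoring timeouts and drain lock compatibility; the Resource class and its method names are invented for the example.

```python
from collections import Counter


class Resource:
    """A table space or partition with claim counts per claim class."""

    def __init__(self):
        self.claims = Counter()  # active claims per class: "CS", "RR", "WRITE"
        self.drained = set()     # claim classes currently being drained

    def claim(self, claim_class):
        """Register a claim; refuse it if that class is being drained."""
        if claim_class in self.drained:
            return False
        self.claims[claim_class] += 1
        return True

    def release(self, claim_class):
        """Release a claim, as normally happens at COMMIT."""
        self.claims[claim_class] -= 1

    def start_drain(self, classes):
        """Stop new claims of the given classes from being acquired."""
        self.drained |= set(classes)

    def drain_complete(self):
        """The drainer has control once every drained class is quiesced."""
        return all(self.claims[c] == 0 for c in self.drained)


ts = Resource()
ts.claim("WRITE")           # an INSERT is in progress
ts.start_drain({"WRITE"})   # a utility drains only the write claim class

print(ts.claim("CS"))       # True: readers are still allowed
print(ts.claim("WRITE"))    # False: new writers are turned away
print(ts.drain_complete())  # False: the earlier writer has not committed
ts.release("WRITE")         # the INSERT commits
print(ts.drain_complete())  # True
```

Draining only the write class leaves readers running, which matches the partial-control case described above. Draining "CS", "RR", and "WRITE" together would model complete control.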

Claim and Drain Lock Compatibility

As with transaction locks, concurrent claims and drains can be taken, but only if they are compatible with one another. Table 1 shows which drains are compatible with existing claims... Table 2 shows which drains are compatible with existing drains:

Table 1. Claim/Drain Compatibility Matrix

Table 2. Drain/Drain Compatibility Matrix

Transaction Locking Versus Claims and Drains

DB2 uses transaction locks to serialize access to a resource between multiple claimers, such as two SQL statements or an SQL statement and a utility that takes claims, such as RUNSTATS SHRLEVEL(CHANGE).

Claims and drains serialize access between a claimer and a drainer. For example, an INSERT statement is a claimer that must be dealt with by the LOAD utility, which is a drainer.

Drain locks are used to control concurrency when both a command and a utility try to access the same resource.

The Impact of Commit

It is vitally important that you design all your long-running application programs with a COMMIT strategy in mind. This means that after a period of execution, a COMMIT is issued. This guidance applies to read-only programs as well as to modification programs. Remember that claims are released at COMMIT, so a long-running read-only program that never commits can hold claims for extended periods, thereby causing concurrency issues, particularly for utilities (including online utilities such as REORG SHRLEVEL CHANGE).

The LRDRTHLD DSNZPARM, introduced with DB2 V10, can be used to identify processes that hold read claims for extended periods of time. The parameter sets a threshold that, when met, causes DB2 to write a trace record. The default is 10 minutes.

You also can use the -DISPLAY DATABASE command with the CLAIMERS keyword to display claim types and durations for specified database objects. For example, to show the claim information for the TSCUST01 table space in the DBCUST database, you could issue the following command:

-DISPLAY DATABASE(DBCUST) SPACENAM(TSCUST01) CLAIMERS

Monday, May 06, 2013

DB2 Locking, Part 5: Lock Suspensions, Timeouts, and Deadlocks

The longer a lock is held, the greater the potential impact to other applications. When an application requests a lock that is already held by another process, and the lock cannot be shared, that application is suspended. A suspended process temporarily stops running until the lock can be acquired. Lock suspensions can be a significant barrier to acceptable performance and application availability.

When an application has been suspended for a pre-determined period of time, it will be terminated. When a process is terminated because it exceeds this period of time, it is said to time out. In other words, a timeout is caused by the unavailability of a given resource. For example, consider the scenario depicted in Figure 1.

Figure 1. A Timeout Occurs

If Program 2, holding no other competitive locks, requests a lock currently held by Program 1, DB2 tries to obtain the lock for a period of time. Then it quits trying. This example illustrates a timeout. This timeout scenario is also applicable to row locks, not just page locks.

The length of time a user waits for an unavailable resource before being timed out is determined by the IRLMRWT DSNZPARM parameter. You also can set this period of time by using the RESOURCE TIMEOUT field on the DB2 installation panel DSNTIPI.
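The general idea of waiting for a lock and then giving up can be shown with Python's standard threading module. This is only an analogy: RESOURCE_TIMEOUT_SECONDS is a hypothetical stand-in for the IRLMRWT value, and a Python lock is not an IRLM lock.

```python
import threading

RESOURCE_TIMEOUT_SECONDS = 0.1  # hypothetical stand-in for IRLMRWT

page_lock = threading.Lock()
page_lock.acquire()  # Program 1 holds the lock and does not release it


def program_2():
    """Try to get the lock; give up once the timeout period has passed."""
    acquired = page_lock.acquire(timeout=RESOURCE_TIMEOUT_SECONDS)
    if not acquired:
        return "timeout"
    page_lock.release()
    return "acquired"


print(program_2())  # timeout
```

Program 2 is suspended for the timeout period and then fails, much like the scenario in Figure 1.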

When a lock is requested, a series of operations is performed to ensure that the requested lock can be acquired (see Figure 2). Two conditions can cause the lock acquisition request to fail: a deadlock or a timeout.

Figure 2. Processing a Lock Request

A deadlock occurs when two separate processes compete for resources held by one another. DB2 performs deadlock detection for both locks and latches. For example, consider the following processing sequence for two concurrently executing application programs:

Figure 3. A Deadlock Occurs

A deadlock occurs when Program 1 requests a lock for a data page held by Program 2, and Program 2 requests a lock for a data page held by Program 1. A deadlock must be resolved before either program can perform subsequent processing. DB2’s solution is to target one of the two programs as the victim of the deadlock and deny that program’s lock request by setting the SQLCODE to -911. This deadlocking scenario is also applicable to row locks, not just page locks. A graphic depiction of a deadlock is shown in Figure 4.

Figure 4. The Deadlock
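One common way to reason about deadlocks is a wait-for graph, where each suspended process points at the process holding the lock it needs; a cycle in the graph is a deadlock. The Python sketch below illustrates that general technique and is not a description of how the IRLM detects deadlocks. It assumes each process waits on at most one other process, and find_deadlock is a hypothetical name.

```python
def find_deadlock(waits_for):
    """Return the processes that form a cycle, or None if there is no deadlock.

    waits_for maps each suspended process to the process holding the lock it wants.
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        # Follow the chain of waiters until it ends or loops back on itself.
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            return path[path.index(current):]
    return None


# Each program holds a page the other one wants.
deadlock = {"Program 1": "Program 2", "Program 2": "Program 1"}
# Program 2 waits on Program 1, which is not waiting on anything.
plain_wait = {"Program 2": "Program 1"}

print(find_deadlock(deadlock))    # ['Program 1', 'Program 2']
print(find_deadlock(plain_wait))  # None
```

The second case is a simple lock wait: it can end in a timeout, but never in a deadlock.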

The length of time DB2 waits before choosing a victim of a deadlock is determined by the DEADLOK IRLM parameter. You also can set this parameter using the DEADLOCK TIME field on the DB2 installation panel DSNTIPJ.

Friday, April 26, 2013

IDUG NA 2013 in Orlando, Florida

Just a short blog post to let everybody know that I will be in Orlando next week for the IDUG DB2 Tech Conference. I'll be delivering an education seminar (DB2 Developer's Guide Comes Alive!) on Monday and giving two presentations on Thursday...

On Tuesday and Wednesday I'll be roaming around the conference, attending sessions, and looking to say hello to old friends and colleagues.

Hope to see you there!

IDUG NA 2013

Thursday, April 25, 2013

DB2 Locking, Part 4: Page and Row Locks

In the first three installments of this series on DB2 locking we have looked at a broad overview of locking (part 1), table and table space locks (part 2), and the difference between locks and latches (part 3). Today we will move ahead and discuss page and row locking.

Page Locking

The types of page locks that DB2 can take are outlined in Table 1. S-locks allow data to be read concurrently but not modified. With an X-lock, data on a page can be modified (with INSERT, UPDATE, or DELETE), but concurrent access is not allowed. U-locks enable X-locks to be queued, whereas S-locks exist on data that must be modified.

Table 1. Page Locks

As with table space locks, concurrent page locks can be acquired but only with compatible page locks. The compatibility matrix for page locks is shown in Table 2.

Table 2. Page Lock Compatibility Matrix
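A compatibility matrix like this is easy to express in code. The Python sketch below uses the commonly documented rules for S-, U-, and X-locks (S is compatible with S and U, U only with S, and X with nothing); treat those entries as an assumption to verify against Table 2. The names COMPATIBLE and can_grant are hypothetical.

```python
# For each held lock mode, the modes another process may acquire concurrently.
COMPATIBLE = {
    "S": {"S", "U"},
    "U": {"S"},
    "X": set(),
}


def can_grant(requested, held_by_others):
    """A request is granted only if it is compatible with every lock already held."""
    return all(requested in COMPATIBLE[held] for held in held_by_others)


print(can_grant("S", ["S", "U"]))  # True
print(can_grant("U", ["U"]))       # False
print(can_grant("X", ["S"]))       # False
print(can_grant("X", []))          # True
```

The second result shows why only one process at a time can hold a U-lock on a page: that lock reserves the right to update the data.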

When are these page locks taken? Page locks can be acquired only under the following conditions:

The DDL for the object requesting a lock specifies LOCKSIZE PAGE or LOCKSIZE ANY.
If LOCKSIZE ANY was specified, the NUMLKTS threshold or the table space LOCKMAX specification must not have been exceeded. You learn more about these topics later in this section.

Keep in mind, though, that if ISOLATION(RR) was used when the program was bound, the optimizer might decide not to use page locking even if the above criteria are met.

If all these factors are met, page locking progresses as outlined in Table 3. The type of processing in the left column causes the indicated page lock to be acquired for the scope of pages identified in the right column. DB2 holds each page lock until it is released as specified in the ISOLATION level of the plan requesting the particular lock. Page locks can be promoted from one type of lock to another based on the type of processing that is occurring. A program can FETCH a row using a cursor with the FOR UPDATE OF clause, causing a U-lock to be acquired on that row’s page. Later, the program can modify that row, causing the U-lock to be promoted to an X-lock.

Table 3. How Page Locks Are Acquired

Row Locks

The smallest piece of DB2 data that you can lock is the individual row. The types of row locks that DB2 can take are similar to the types of page locks that it can take. Refer back to Table 1 and simply replace Page with Row. So row locks act like page locks, only on a smaller granularity (that is, on rows instead of pages).

S-locks allow data to be read concurrently but not modified. With an X-lock, you can modify data in that row (using INSERT, UPDATE, MERGE, or DELETE), but concurrent access is not allowed. U-locks enable X-locks to be queued, whereas S-locks exist on data that must be modified.

Once again, concurrent row locks can be acquired but only with compatible row locks. Table 2 works the same way for row locks as it does for page locks.

When are these row locks taken? Row locks can be acquired when the DDL for the object requesting a lock specifies LOCKSIZE ROW. (Although it is theoretically possible for LOCKSIZE ANY to choose row locks, in practice I have yet to see this happen.) Again, we can use an earlier table (Table 3), replacing the word page with the word row, to see how row locking progresses. The type of processing in the left column causes the indicated row lock to be acquired for the scope of rows identified in the right column. A row lock is held until it is released as specified by the ISOLATION level of the plan requesting the particular lock.

Row locks can be promoted from one type of lock to another based on the type of processing that is occurring. A program can FETCH a row using a cursor with the FOR UPDATE OF clause, causing a U-lock to be acquired on that row. Later, the program can modify that row, causing the U-lock to be promoted to an X-lock.

Page Locks Versus Row Locks

The answer to the question of whether to use page locks or row locks is, of course, “It depends!” The nature of your specific data and applications determines whether page or row locks are most applicable.

The resources required to acquire, maintain, and release a row lock are just about the same as the resources required for a page lock. Therefore, the number of rows per page must be factored into the row-versus-page locking decision. The more rows per page, the more resources row locking will consume. For example, a table space with a single table that houses 25 rows per page can consume as much as 25 times the locking resources if row locks are chosen over page locks. Of course, this estimate is very rough, and other factors (such as lock avoidance) can reduce the number of locks acquired, and thereby reduce the overhead associated with row locking. Even so, row locking almost always consumes more resources than page locking. In exchange, locking a row at a time instead of a page at a time can reduce contention. That benefit is not guaranteed, however: if two applications running concurrently access the same data in different orders, row locking might actually decrease concurrent data access.
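The 25-to-1 estimate is simple arithmetic, sketched here in Python. The function locks_required is a hypothetical helper, and it assumes the rows touched are packed densely on consecutive pages with no lock avoidance or lock escalation, which is the worst case for row locking.

```python
import math


def locks_required(rows_touched, rows_per_page, locksize):
    """Estimate how many locks a scan acquires under the stated assumptions."""
    if locksize == "ROW":
        return rows_touched
    if locksize == "PAGE":
        return math.ceil(rows_touched / rows_per_page)
    raise ValueError("locksize must be 'ROW' or 'PAGE'")


row_locks = locks_required(10_000, 25, "ROW")
page_locks = locks_required(10_000, 25, "PAGE")

print(row_locks)                # 10000
print(page_locks)               # 400
print(row_locks // page_locks)  # 25
```

If the rows touched are scattered one per page, both lock sizes need the same number of locks, which is why the estimate is only an upper bound.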

You must therefore ask these questions:

What is the nature of the applications that access the objects in question? Of course, the answer to this question differs not only from organization to organization, but also from application to application within the same organization.
Which is more important, reducing the resources required to execute an application or increasing data availability? The answer to this question will depend upon the priorities set by your organization and any application teams accessing the data.

As a general rule of thumb, favor specifying LOCKSIZE PAGE, as page locking is generally the most practical locking strategy for most applications. If you’re experiencing severe contention problems on a table space that is currently using LOCKSIZE PAGE, consider changing to LOCKSIZE ROW and gauging the impact on performance, resource consumption, and concurrent data access. Alternatively, you also might choose to specify LOCKSIZE ANY and let DB2 choose the type of locking to be performed.

Note

A possible alternative to row locking is to specify MAXROWS 1 for the table space and use LOCKSIZE PAGE (or LOCKSIZE ANY), instead of LOCKSIZE ROW.

Monday, April 22, 2013

DB2 Locking, Part 3: Locks Versus Latches

So far in this series on DB2 locking we have offered up a broad overview of what locking is and then delved into the world of table and table space locks. In this short entry, before we tackle page and row locks, we are going to look at the difference between a lock and a latch.

A true lock is handled by DB2 using the IRLM. The IRLM, or internal resource lock manager, is both a separate subsystem and an integral component of DB2. As its name implies, it manages locks for DB2.

However, whenever doing so is practical, DB2 can lock resources without going to the IRLM. This type of lock is called a latch. True locks are always set in the IRLM. Latches, by contrast, are set internally by DB2, without going to the IRLM.

When a latch is taken instead of a lock, it is handled in the Buffer Manager by internal DB2 code, so the cross-memory service calls to the IRLM are eliminated. Latches are usually held only briefly, for a shorter duration than locks. Also, a latch requires about one-third as many instructions as a lock. Therefore, latches are more efficient than locks because they avoid the overhead associated with calling an external address space. Latches are used when a resource serialization situation is required for a short time. Both latches and locks guarantee data integrity.

In subsequent blog entries, any generic use of the term lock refers to both locks and latches.

Thursday, April 18, 2013

DB2 Locking, Part 2: Table Space and Table Locks

Today's post is the second in our DB2 locking series and it covers the topic of table space and table locks.

Table Space Locks

A table space lock is acquired when a DB2 table or index is accessed. Note that I said accessed, not updated. The table space is locked even when simple read-only access is occurring.

Refer to Table 1 below for a listing of the types of table space locks that can be acquired during the execution of an SQL statement. Every table space lock implies two types of access: the access acquired by the lock requester and the access allowed to other subsequent, concurrent processes.

Table 1. Table Space Locks

When an SQL statement is issued and first accesses data, it takes an intent lock on the table space. Later in the process, actual S-, U-, or X-locks are taken. The intent locks (IS, IX, and SIX) enable programs to wait for the required S-, U-, or X-lock that needs to be taken until other processes have released competing locks.

The type of table space lock used by DB2 during processing is contingent on several factors, including the table space LOCKSIZE specified in the DDL, the bind parameters chosen for the plan being run, and the type of processing requested. Table 2 provides a synopsis of the initial table space locks acquired under certain conditions.

Table 2. How Table Space Locks Are Acquired

A table space U-lock indicates intent to update, but an update has not occurred. This is caused by using a cursor with the FOR UPDATE OF clause. A U-lock is non-exclusive because it can be taken while tasks have S-locks on the same table space. More information on table space lock compatibility follows in Table 3.

An additional consideration is that table space locks are usually taken in combination with table and page locks, but they can be used on their own. When you specify the LOCKSIZE TABLESPACE DDL parameter, table space locks alone are used as the locking mechanism for the data in that table space. This way, concurrent access is limited and concurrent update processing is eliminated.

Similar in function to the LOCKSIZE DDL parameter is the LOCK TABLE statement. The LOCK TABLE statement requests an immediate lock on the specified table. The LOCK TABLE statement has two forms—one to request a share lock and one to request an exclusive lock.

LOCK TABLE table-name IN SHARE MODE;
LOCK TABLE table-name IN EXCLUSIVE MODE;

You also can issue LOCK TABLE against a specific partition, for example:

LOCK TABLE table-name
PARTITION integer IN SHARE MODE;

A locking scheme is not effective unless multiple processes can secure different types of locks on the same resource concurrently. With DB2 locking, some types of table space locks can be acquired concurrently by discrete processes. Two locks that can be acquired concurrently on the same resource are said to be compatible with one another.

Refer to Table 3 for a breakdown of DB2 table space lock compatibility. A Yes in the matrix indicates that the two locks are compatible and can be acquired by distinct processes on the same table space concurrently. A No indicates that the two locks are incompatible. In general, two locks cannot be taken concurrently if they allow concurrent processes to negatively affect the integrity of data in the table space.

Table 3. Table Space Lock Compatibility

Table Locks

When segmented table spaces are involved, DB2 can use table locks. Table locks are always associated with a corresponding table space lock.

The same types of locks are used for table locks as are used for table space locks. S, U, X, IS, IX, and SIX table locks can be acquired by DB2 processes when data in segmented table spaces is accessed. Table 1 describes the options available to DB2 for table locking. The compatibility chart in Table 3 applies to table locks as well as table space locks.

For a table lock to be acquired, an IS-lock must first be acquired on the segmented table space in which the table exists. The type of table lock to be taken depends on the LOCKSIZE specified in the DDL, the bind parameters chosen for the plan being run, and the type of processing requested. Table 4 (below) is a modified version of Table 2 (shown earlier), depicting the initial types of table space and table locks acquired given a certain set of conditions. Table locks are not acquired when the LOCKSIZE TABLESPACE parameter is used.

Table 4. How Table Locks Are Acquired

Summary

And so concludes Part 2 of our multi-part series on DB2 locking. Be sure to keep watching this blog for additional postings in this series!

Monday, April 15, 2013

DB2 Locking, Part 1: An Overview

DB2 automatically guarantees the integrity of data by enforcing several locking strategies. These strategies permit multiple users from multiple environments to access and modify data concurrently. Basically, DB2 combines the following strategies to implement an overall locking strategy:

• Table and table space locking

• IRLM page and row locking

• Internal page and row latching

• Claims and drains to achieve partition independence

• Checking commit log sequence numbers (CLSN) and PUNC bits to achieve lock avoidance

• Global locking through the coupling facility in a data sharing environment

But what exactly is locking? How does DB2 utilize these strategies to lock pages and guarantee data integrity? Why does DB2 have to lock data before it can process it? What is the difference between a lock and a latch? How can DB2 provide data integrity while operating on separate partitions concurrently? Finally, how can DB2 avoid locks and still guarantee data integrity?

Today's blog post will offer a high-level introduction to DB2 locking: why it is needed, and the benefits and drawbacks that it introduces to using DB2 databases.

Anyone who has written application programs that access a database understands the potential for concurrency problems. When one application program tries to read data that is in the process of being changed by another, the DBMS must forbid access until the modification is complete to ensure data integrity. Most DBMS products, DB2 included, use a locking mechanism for all data items being changed. Therefore, when one task is updating data on a page, another task cannot access data (read or update) on that same page until the data modification is complete and committed.

When multiple users can access and update the same data at the same time, a locking mechanism is required. This mechanism must be capable of differentiating between stable data and uncertain data. Stable data has been successfully committed and is not involved in an update in a current unit of work. Uncertain data is currently involved in an operation that could modify its contents.

Consider two DB2 application programs. If program #1 updates a piece of data on page 1, you must ensure that program #2 cannot access the data until program #1 commits the unit of work. Otherwise, a loss of integrity could result. Without a locking mechanism, the following sequence of events would be possible:

Program #1 retrieves a row from DSN81010.EMP for EMPNO ‘000010’.
Program #1 issues an update statement to change that employee’s salary to 55000.
Program #2 retrieves the DSN81010.EMP row for EMPNO ‘000010’. Because the change was not committed, the old value for the salary, 52750, is retrieved.
Program #1 commits the change, causing the salary to be 55000.
Program #2 changes a value in a different column and commits the change.
The value for salary is now back to 52750, negating the change made by program #1.
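To make the sequence above concrete, here is a minimal sketch in Python. It is only a toy model: a plain dictionary stands in for the EMP row, the two "programs" are just variables, and the model assumes program #2 writes back its entire stale copy of the row. The WORKDEPT change and its values are illustrative assumptions and do not come from the scenario above.

```python
# Toy model of the lost-update sequence; a dict stands in for the EMP row.
committed = {"EMPNO": "000010", "SALARY": 52750, "WORKDEPT": "A00"}

# Program #1 retrieves the row and changes the salary in its own working copy.
p1_copy = dict(committed)
p1_copy["SALARY"] = 55000

# Program #2 retrieves the row before program #1 commits,
# so it still sees the old salary of 52750.
p2_copy = dict(committed)

# Program #1 commits its change.
committed = dict(p1_copy)
salary_after_p1 = committed["SALARY"]

# Program #2 changes a different column (illustrative) and commits,
# writing back its whole stale copy of the row.
p2_copy["WORKDEPT"] = "B01"
committed = dict(p2_copy)

print(salary_after_p1, committed["SALARY"])
```

With no lock preventing program #2 from reading the row while program #1's change is in flight, the salary update is silently overwritten.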

The DBMS avoids this situation by using a locking mechanism. DB2 supports locking at four levels, or granularities: table space-, table-, page-, and row-level locking. DB2 also provides LOB locking for large objects (BLOBs, CLOBs, and DBCLOBs).

Locks can be taken at any level in the locking hierarchy without taking a lock at the lower level. However, locks cannot be taken at the lower levels without a compatible higher-level lock also being taken. For example, you can take a table space lock without taking any other lock, but you cannot take a page lock without first securing a table space-level lock (and a table lock as well if the page is part of a table in a segmented table space).
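The hierarchy rule can be sketched as a small Python model. This is only an illustration of the rule as described above for a segmented table space; the LockHolder class and its acquire method are hypothetical names, lock modes and compatibility are ignored, and nothing here reflects how DB2 or the IRLM is actually implemented.

```python
# Levels from highest to lowest for a table in a segmented table space.
HIERARCHY = ["table space", "table", "page"]


class LockHolder:
    """Hypothetical model of one task and the lock levels it holds."""

    def __init__(self):
        self.held = set()

    def acquire(self, level):
        # A lock at this level requires every higher level to be held already.
        idx = HIERARCHY.index(level)
        missing = [h for h in HIERARCHY[:idx] if h not in self.held]
        if missing:
            raise RuntimeError(
                f"cannot take {level} lock without: {', '.join(missing)}"
            )
        self.held.add(level)


task = LockHolder()
task.acquire("table space")   # allowed on its own
task.acquire("table")
task.acquire("page")          # allowed only because the higher levels are held
print(sorted(task.held))
```

In this model, a task that tries to take a page lock first is rejected, while a table space lock can always be taken by itself.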

Additionally, a page lock does not have to be taken before a row lock is taken. Your locking strategy requires an “either/or” type of choice by table space: either row locking or page locking. An in-depth discussion of the merits of each is beyond the scope of today's blog post, but will be covered in future posts.

Both page locks and row locks can escalate to a table level and then to a table space level for segmented tables or straight to a table space level for partitioned table spaces. A table or table space cannot have both page locks and row locks held against it at the same time.

Many modes of locking are supported by DB2, but they can be divided into two types:

• Locks to enable the reading of data

• Locks to enable the updating of data

But remember, we are still talking at a very high, and somewhat simplistic, level. DB2 uses varieties of these two types of locks to indicate the type of locking required.

Locks Versus Latches

A true lock is handled by DB2 using the IRLM. However, whenever doing so is practical, DB2 tries to lock resources without going to the IRLM. This type of lock is called a latch. Whereas true locks are always set in the IRLM, latches are set internally by DB2, without going to the IRLM.

Lock Duration

Before you learn about the various types of locks that can be acquired by DB2, you should understand lock duration, which refers to the length of time that a lock is maintained. The duration of a lock is based on the BIND options chosen for the program requesting locks. Locks can be acquired either immediately when the program is requested to be run or iteratively as needed during the execution of the program. Locks can be released when the plan is terminated or when they are no longer required for a unit of work.

The BIND parameters affecting DB2 locking are summarized below:

The BIND parameters that impact table space locks are the ACQUIRE and RELEASE parameters.

ACQUIRE(ALLOCATE | USE): The ALLOCATE option specifies that locks will be acquired when the plan is allocated, which normally occurs when the first SQL statement is issued. The USE option indicates that locks will be acquired only as they are required, SQL statement by SQL statement. As of DB2 10, ACQUIRE(ALLOCATE) is no longer supported when binding or rebinding plans.
RELEASE(DEALLOCATE | COMMIT): When you specify DEALLOCATE for a program, locks are not released until the plan is terminated. When you specify COMMIT, table space locks are released when a COMMIT is issued.

The BIND parameter affecting page and row locks is the ISOLATION parameter. There are four choices for isolation level:

ISOLATION(CS), or Cursor Stability, acquires and releases page locks as pages are read and processed. CS provides the greatest level of concurrency at the expense of potentially different data being returned by the same cursor if it is processed twice during the same unit of work.
ISOLATION(RR), or Repeatable Read, holds page and row locks until a COMMIT point; no other program can modify the data. If data is accessed twice during the unit of work, the same exact data will be returned.
ISOLATION(RS), or Read Stability, holds page and row locks until a COMMIT point, but other programs can INSERT new data. If data is accessed twice during the unit of work, new rows may be returned, but old rows will not have changed.
ISOLATION(UR), or Uncommitted Read, is also known as dirty read processing. UR avoids locking altogether, so data can be read that never actually exists in the database.

Regardless of the ISOLATION level chosen, all page locks are released when a COMMIT is encountered.
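As a quick way to compare the four isolation levels, the descriptions above can be captured in a small Python lookup table. This is a simplified summary and not DB2 code: the Isolation class, its field names, and the same_data_on_reread helper are assumptions made for illustration, and the others_may_insert values for CS and UR are inferred from the descriptions rather than stated outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isolation:
    name: str
    takes_read_locks: bool     # does a reader acquire page or row locks?
    holds_until_commit: bool   # are those locks kept until COMMIT?
    others_may_insert: bool    # can other programs add rows meanwhile?


LEVELS = {
    "CS": Isolation("Cursor Stability", True, False, True),
    "RS": Isolation("Read Stability", True, True, True),
    "RR": Isolation("Repeatable Read", True, True, False),
    "UR": Isolation("Uncommitted Read", False, False, True),
}


def same_data_on_reread(level):
    """True if reading twice in one unit of work returns exactly the same data."""
    iso = LEVELS[level]
    return iso.holds_until_commit and not iso.others_may_insert


for code in ("CS", "RS", "RR", "UR"):
    print(code, LEVELS[code].name, same_data_on_reread(code))
```

Only RR satisfies the reread test in this model, which matches the description above: RS keeps existing rows stable but still allows new rows to appear.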

I think that is enough for this first installment on DB2 locking... come back soon and we will expand on table space locks in the next installment of this blog series on DB2 locking...

Monday, March 18, 2013

DB2 Table Editors

In today's blog post I want to briefly discuss one of the more useful, yet often ignored, DB2 tools available on the market -- Table Editors...

Typically, the only method of updating DB2 data (indeed, any data stored in a relational database) is with SQL data manipulation language statements DELETE, INSERT, and UPDATE (or with a database load). Because these SQL statements operate on data a set at a time, multiple rows -- or even all of the rows -- can be affected by a single SQL statement. Coding SQL statements for every data modification required during the application development and testing phase can be time-consuming and error-prone.

A table editing tool can reduce the time needed to make simple data modifications by providing full-screen edit capability for database tables. The user specifies the table to edit and is placed into an edit session. The data is presented to the user as a series of rows, with the columns separated by spaces. A header line indicates the column names. The data can be scrolled up and down as well as left and right. To change data, the user simply types over the current data.

This type of tool is ideal for supporting the application development process. A programmer can make quick changes without coding SQL. Also, if properly implemented, a table editor can reduce the number of erroneous data modifications made by beginning SQL users.

When a table editor is used, all columns are available for update. Thus, if a table editor is used to change production data, a simple mis-keying can cause unwanted updates. Native SQL should be used if you must ensure that only certain columns are updated.

One final note: Tested SQL statements and application programs are characterized by their planned nature. These modification requests are well thought out and tested. This is not true for changes implemented through a table editor, so always exercise caution when using a table editor.

Examples of DB2 Table Editors include:

BMC Catalog Manager (which includes an integrated table editor)
CA RC/Update (which includes an integrated table editor)
IBM DB2 Table Editor
RazorSQL DB2 Table Editor

Thursday, February 14, 2013

The Importance of Database Design

It is impossible to over-estimate the importance of database design on the effectiveness and efficiency of application systems. The first step, of course, is to create a logical data model of the business information that must be stored in, and accessed through, the database. This is a non-trivial task, but it is not the focus of today’s blog post, which is implementing optimal and well-designed database schemata.

Oftentimes the database design does not get the attention it deserves. This can occur for numerous reasons such as:

Insufficient specifications and/or poor logical data modeling
Not enough time in the development schedule
Too many changes occurring throughout the development cycle
Database design assigned to, or performed by, novices

And even when the database design is being performed by experienced professionals with sufficient time and a solid logical model, it is very easy for design flaws to creep into the database. This is especially the case for larger and more complex databases required to support mission critical applications.

Of course, the first step in constructing a physical database should be transforming the logical design using best practices. The transformation consists of the following:

Transforming entities into tables
Transforming attributes into columns
Transforming domains into data types and constraints
Transforming relationships into primary and foreign keys

But a simple transformation will not result in a complete and correct physical database design – it is merely the first step. And design flaws can be introduced even during such a transformation.

The process of normalizing your data should be conducted during the logical design phase, but sometimes mistakes are made during the logical modeling process. In simple terms, normalization is the process of identifying the one best place where each fact belongs. A normalized design minimizes data redundancy and optimizes data structures by systematically and properly placing data elements into the appropriate groupings.

First normal form (1NF) eliminates repeating groups and non-atomic data from an entity.
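Second normal form (2NF) ensures that all the non-key attributes of each entity are dependent on the entire primary key, not just part of it.
Third normal form (3NF) ensures that no relationships between non-key attributes exist within an entity; every non-key attribute depends only on the primary key.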

Although normalization is a logical process and does not necessarily dictate physical database design, it is good practice to implement normalized physical databases – especially with today’s powerful hardware and database systems. A normalized data model will ensure that each entity is well formed and that each attribute is assigned to the proper entity. And denormalized data structures should always be documented.

A database that is not normalized can be difficult to modify (because facts may be strewn about within the database) and prone to data quality issues (again because one fact may be in more than one place). Failing to achieve appropriate normalization is a significant design flaw.

Design flaws can make it difficult to access a database in an efficient manner. Something as simple as creating columns with inconsistent data type and length across tables can introduce inefficiency. For example, perhaps JOB_CODE is defined as CHAR(10) in one table and CHAR(15) in another. If these columns need to be compared or joined, additional work is required by the DBMS to make the columns comparable. It is easy to see how a database design issue can make applications inefficient no matter how adept the developers are. And this is just one type of design flaw.
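A check for this kind of inconsistency is easy to sketch in Python. The column list below is made-up sample data (the table names and the EMPNO definition are hypothetical; only the JOB_CODE lengths come from the example above), and in practice the definitions would come from your DDL or the database catalog.

```python
from collections import defaultdict

# Hypothetical column definitions: (table, column, data type).
columns = [
    ("EMP", "JOB_CODE", "CHAR(10)"),
    ("JOB_HISTORY", "JOB_CODE", "CHAR(15)"),
    ("EMP", "EMPNO", "CHAR(6)"),
    ("PAYROLL", "EMPNO", "CHAR(6)"),
]


def find_inconsistent(cols):
    """Return column names that are defined with more than one data type."""
    types_by_name = defaultdict(set)
    for _table, column, data_type in cols:
        types_by_name[column].add(data_type)
    return {
        name: sorted(types)
        for name, types in types_by_name.items()
        if len(types) > 1
    }


print(find_inconsistent(columns))
```

Here only JOB_CODE is reported, because EMPNO is defined identically in both tables.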

There are numerous other types of design flaws that can negatively impact the usability and efficiency of a database implementation (and the applications that use it). Improperly defined constraints (referential, check, uniqueness) can cause data quality problems. Improper indexing (to support constraints and other physical structures) can cause a database to operate inefficiently or even cease operating altogether. In DB2 for z/OS, failing to explicitly name a clustering index causes DB2 to default cluster sequence to the oldest index. Changing index structures, which sometimes requires dropping and recreating the index, can cause the data to be ordered in a completely different physical sequence, thereby impacting performance.

Some database design flaws are more subtle. What happens if you create two incompatible check constraints? For example, consider the following:

CHECK (empno < 100)

CHECK (empno >= 101)

No data can be added to the table because no number is less than 100 and greater than or equal to 101. Of course, this is an extreme example to highlight the problem. Another situation can occur if the default value is not one of the values permitted by the check constraint, for example:

emp_type CHAR(8) DEFAULT 'new'

CHECK (emp_type IN ('temp',
'fulltime',
'contract')), ...
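Both of these flaws are easy to reproduce. The sketch below uses Python's built-in sqlite3 module with an in-memory SQLite database, not DB2, so it only demonstrates the general behavior of check constraints; the table names emp_a and emp_b and the try_insert helper are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two check constraints that no single value can satisfy.
conn.execute(
    "CREATE TABLE emp_a (empno INTEGER, "
    "CHECK (empno < 100), CHECK (empno >= 101))"
)

# A default value that the check constraint does not permit.
conn.execute(
    """CREATE TABLE emp_b (
           empno    INTEGER,
           emp_type CHAR(8) DEFAULT 'new',
           CHECK (emp_type IN ('temp', 'fulltime', 'contract')))"""
)


def try_insert(sql, params=()):
    """Run an INSERT and report whether the constraints accepted it."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert("INSERT INTO emp_a (empno) VALUES (?)", (n,))
    for n in (50, 100, 150)
]
default_result = try_insert("INSERT INTO emp_b (empno) VALUES (1)")

print(results, default_result)
```

Every value tried against emp_a fails one constraint or the other, and the insert into emp_b fails as soon as the default is applied. Note that both tables were created without complaint; the problem only surfaces when data is inserted.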

Cycles are another potential problem that can be created in a physical database schema. A cycle is a referential path that connects a table to itself. This can occur when multiple tables are related back to each other and it looks like a loop when diagrammed. DB2 forbids a table from being delete-connected to itself in a cycle. A table is delete-connected to another table if it is a dependent of a table specified with a CASCADE delete rule.
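Spotting a cycle by eye gets harder as the schema grows, but it is a straightforward graph search. Here is a rough Python sketch that walks the referential relationships with a depth-first search; the find_cycle function and the sample table names are hypothetical, and the sketch looks for any referential cycle rather than applying DB2's specific delete-connected rules.

```python
def find_cycle(references):
    """Return one referential cycle as a list of table names, or None.

    `references` maps each table to the tables it references (its parents).
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, in progress, finished
    state = {table: WHITE for table in references}
    path = []

    def visit(table):
        state[table] = GREY
        path.append(table)
        for parent in references.get(table, ()):
            if state.get(parent, WHITE) == GREY:
                # Reached a table already on the current path: a cycle.
                return path[path.index(parent):] + [parent]
            if state.get(parent, WHITE) == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[table] = BLACK
        return None

    for table in list(references):
        if state[table] == WHITE:
            found = visit(table)
            if found:
                return found
    return None


# Hypothetical schema: DEPT references EMP, EMP references PROJ,
# and PROJ references DEPT again.
schema = {"DEPT": ["EMP"], "EMP": ["PROJ"], "PROJ": ["DEPT"], "ACT": []}
print(find_cycle(schema))
```

For the sample schema the search reports the loop DEPT, EMP, PROJ, and back to DEPT; a schema with no loop returns None.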

Nested triggers can also create problems. A trigger is fired by an INSERT, UPDATE, or DELETE operation. However, a trigger can also contain INSERT, UPDATE, or DELETE logic within itself. Therefore, a trigger is fired by a data modification, but can also cause another data modification, thereby firing yet another trigger. When a trigger contains INSERT, UPDATE, and/or DELETE logic (including MERGE statements), the trigger is said to be a nested trigger. If referential integrity is combined with triggers, additional cascading updates and/or deletes can occur.

DB2 restricts the number of nested triggers that can be executed within a single firing event. If this were not done, it could be quite possible to have triggers firing triggers ad infinitum until all of the data was removed from an entire database. DB2 limits this cascading effect to 16 levels to prevent endless looping. If more than 16 levels of nesting occur, the transaction is aborted. Such nesting may restrict certain types of data modification from happening at all because the number of nested calls will always be exceeded. So nesting levels need to be controlled and managed in the database schema to avoid problems.
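The effect of the nesting limit can be illustrated with a toy Python model. This is not how DB2 implements triggers; the fire function, the NestingLimitExceeded exception, and the trigger names are all made up, and the only detail taken from the discussion above is the limit of 16 levels.

```python
MAX_NESTING = 16  # the nesting limit described above


class NestingLimitExceeded(Exception):
    """Raised in this model when trigger nesting goes past the limit."""


def fire(trigger, fires, depth=1):
    """Fire a trigger and everything it fires; return the deepest level reached.

    `fires` maps a trigger name to the triggers its own SQL would fire.
    """
    if depth > MAX_NESTING:
        raise NestingLimitExceeded(
            f"{trigger} would run at level {depth}; the transaction is aborted"
        )
    deepest = depth
    for next_trigger in fires.get(trigger, ()):
        deepest = max(deepest, fire(next_trigger, fires, depth + 1))
    return deepest


# A short chain is fine: T1 fires T2, which fires T3.
chain = {"T1": ["T2"], "T2": ["T3"]}
print(fire("T1", chain))

# Two triggers that fire each other never stop on their own.
loop = {"A": ["B"], "B": ["A"]}
try:
    fire("A", loop)
except NestingLimitExceeded as exc:
    print("aborted:", exc)
```

The three-trigger chain completes at level 3, while the pair of triggers that fire each other is cut off once the 17th level would be reached.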

And this is by no means a comprehensive list of database design flaws that can cause real problems for application developers and end users. Getting the database design correct is imperative if you wish to have any hope of assuring application performance.

So how can you go about examining the correctness of your database structures? The best approach is an automated one such as that provided by Database Examiner, a product offered by DBE Software that performs a comprehensive validation of a database schema. The in-depth technology incorporated into Database Examiner allows you to review your database schema, identify problems, and remediate the issues to ensure a quality database implementation offering peak performance.

Simply provide Database Examiner with the data model, DDL, or a link to the active database and it will perform a series of comprehensive diagnostics. Database Examiner applies the rules of relational technology to detect flaws, inconsistencies and lack of integrity. The product applies more than 50 diagnostics that can be organized by category or severity levels. And you can customize the diagnostics by selecting those to be executed and assigning each one a severity level.

But Database Examiner doesn’t just highlight the flaws; it also helps to fix them by providing recommendations and corrections for each issue it detects, including the generation of SQL DDL scripts to remedy the flaws. Even better, Database Examiner provides multi-platform DBMS support, including DB2 for z/OS, DB2 LUW, Oracle, Microsoft SQL Server and Sybase.

And DBE Software is currently offering folks the ability to download and use a scaled-down “Performance Edition” of Database Examiner, free of charge for a limited time. Be sure to enter the promo code MULL on the download request to get the best offer available.

Hopefully this blog entry has convinced you that database design is important… important enough to take some time to evaluate the quality of your existing database structures, and to take a look at automating the process using Database Examiner (using the promo code MULL).