Thursday, March 26, 2009

Cost vs. Advantage of Moving From IMS DB to DB2

As my regular readers know, every now and then I like to share Q+A exchanges I've had with folks. Today, the question I was asked is as follows:

My customer is wondering about the possible advantages of converting his IMS DB/DC system to IMS DC/DB2. The application currently performs well with an internal response time of less than .5 seconds on average.

Even with an arrival rate of 425 full-function transactions per second, the queue count rarely goes above 10. This system typically peaks at 12.5 million transactions per twelve-hour day against HDAM and HIDAM databases totaling close to 1 terabyte. The application itself is currently a bit over seven million lines of code.

Can you comment on the relative cost vs. advantage of moving an existing application from IMS DB to DB2 along with relative CPU capacity requirements?

And here is my short response:

Well, the main advantages of converting from IMS/DB to DB2 would be to gain better support for ad hoc queries, standard SQL (instead of non-standard DL/1) for writing queries and a deeper pool of talent to support the DB2 environment (there are more DB2 folks out there than IMS folks now-a-days).

The benefit of sticking with IMS is the good performance you currently enjoy as well as no need to convert the database structures or the 7 million lines of application code. Converting database structures is not horribly difficult, but there are some gotchas that can arise. The bigger problem is converting all of those DL/1 calls to appropriate SQL. This will not be a simple 1 to 1 conversion and it will very likely be quite time-consuming.

I guess it boils down to this: How happy are you with the current application, are you able to support it properly and how many other IMS/DB databases do you support? If this is the last IMS/DB database and you are looking to rid yourself of the IMS license, then it might make sense to convert. But you should do a project plan and cost/benefit analysis before making your final decision (conversion can be very costly). If you have a lot of other IMS/DB databases, then it probably doesn't make a lot of sense to convert to DB2 unless you cannot support the needs of your end users (management, ad hoc support, etc.) using IMS.

In terms of CPU requirements, DB2 will consume more CPU than IMS. DB2 optimizes queries internally whereas IMS programmers construct access paths to data. This additional requirement will cause DB2 to consume more CPU. But, of course, that additional CPU brings with it the enormous benefit of database optimization and better ad hoc query support.

You might also want to take a look at a product like DL/2. I have never used it so I cannot recommend for or against its functionality, but it looks like it might save you some work.

Wednesday, March 18, 2009

A Short Introduction to Lock Avoidance

Lock avoidance is a mechanism employed by DB2 for z/OS to access data without locking while still maintaining data integrity. It prohibits access to uncommitted data and serializes access to pages. Lock avoidance improves performance by reducing the overall volume of lock requests. After all, let’s face it, the most efficient lock is the one never taken.


Of course, even if it is not taking a lock, DB2 must still maintain the integrity of its data. Instead of taking a lock, DB2 uses a latch. To take advantage of Lock Avoidance, the SQL statement must be Read Only and the plan must be bound with Isolation Level Cursor Stability (CS) and CURRENTDATA(NO).


In general, DB2 avoids locking data pages if it can determine that the data to be accessed is committed and that no semantics are violated by not acquiring the lock. DB2 avoids locks by examining the log to verify the committed state of the data.


When determining if lock avoidance techniques will be practical, DB2 first scans the page to be accessed to determine whether any rows qualify. If none qualify, a lock is not required.


For each data page to be accessed, the RBA of the last page update (stored in the data page header) is compared with the log RBA for the oldest active unit of recovery. This RBA is called the Commit Log Sequence Number, or CLSN. If the CLSN is greater than the last page update RBA, the data on the page has been committed and the page lock can be avoided.


Additionally, a bit is stored in the record header for each row on the page. The bit is called the Possibly UNCommitted, or PUNC, bit. The PUNC bit indicates whether update activity has been performed on the row. For each qualifying row on the page, the PUNC bit is checked to see whether it is off. This indicates that the row has not been updated since the last time the bit was turned off. Therefore, locking can be avoided. (Note that there is no external method for DBAs to use to determine whether a row’s PUNC bit is on or off.)


If neither CLSN nor PUNC bit testing indicates that a lock can be avoided, DB2 acquires the requisite lock.


In addition to enhancing performance, lock avoidance improves data availability. Data that without lock avoidance would have been considered locked, and therefore unavailable, can now be accessible.


Lock avoidance is used only for data pages. Further, DB2 Catalog and DB2 Directory access does not use lock avoidance techniques. You can avoid locks under the following circumstances:


  • For any pages accessed by read-only or ambiguous queries bound with ISOLATION(CS) and CURRENTDATA NO
  • For any unqualified rows accessed by queries bound with ISOLATION(CS) or ISOLATION(RS)
  • When DB2 system-managed referential integrity checks for dependent rows caused by either the primary key being modified, or the parent row being deleted and the DELETE RESTRICT rule is in effect
  • For both COPY and RUNSTATS when SHRLEVEL(CHANGE) is specified


To determine the impact of lock avoidance on your system, you can review DB2 trace records. IFCIDs 218 and 223 provide CLSN information, and IFCIDs 226 and 227 provide 'wait for page latch' information.


Avoiding locks can improve the performance of your queries and programs that satisfy the preceding requirements. To encourage DB2 to avoid locks, BIND your plans and packages specifying ISOLATION(CS) and CURRENTDATA NO. Furthermore, avoid ambiguous cursors by specifying FOR READ ONLY for all cursors that are not used for updating.

Friday, March 06, 2009

Attend the 2009 IDUG North American Conference (at a Discount)

Today's blog post is just a friendly reminder to the DB2 community that the North American IDUG conference is fast approaching. This year's event will be held in Denver, CO from Tuesday May 12, 2009 through Friday, May 15th. And if you act quickly you can attend at a discounted rate using the early bird registration discount coupon valid through March 27th).

And don't forget those day long seminars that IDUG holds before the regular conference. Instead of being on a Sunday, the seminars will be on the Monday before the conference this year! The
Monday-Friday schedule is a departure from previous IDUG conferences, and was done to reduce or even eliminate the need for weekend travel.

IDUG is one of the best places to advance your DB2 knowledge. This year's conference boasts over 120 hours of technical material to be presented by a mix of real-world DB2 users, third-party vendors, DB2 Gold Consultants, IBM Fellows, IBM Distinguished Engineers, IBM Vice Presidents, and dozens of the most sought-after DB2 speakers in the world.

I will be delivering two presentations at this year's IDUG:
  • DB2 9: For Developers Only - Wed, 5/13/09 at 1:30 PM
  • Counting Down the DB2 Performance Top 40 - Fri, 5/15/09 at 9:00 AM
I will also be participating in the Data Privacy, Security and Audit Compliance special Interest Group (SIG), one of the many SIGs that will be conducted at IDUG.

Attendees will have ample time to meet informally between sessions, or as part of SIGs, discussion panels, or the Thursday night "dine-around" with some of IDUG's most popular presenters. And if you are thinking about getting certified, IDUG is the place to do that! Throughout the conference, IBM will waive the $200 certification test fee for all attendees, with no limit on the number of tests each attendee can take. IBM will offer 40 different certification tests that cover DB2, InfoSphere, U2, Content Management, DataStage, and other IBM Information Management products.

Seriously, you don't want to miss out on all of the wonderful learning and networking opportunities that IDUG offers DB2 professionals. Take the time to check out the IDUG conference details on the web and work on getting your management's approval for this great educational event.

Thursday, February 12, 2009

A Twittering You Will Go?

This week, a thread was started on the DB2-L list server about Twitter, the micro-messaging Web 2.0 social networking tool. Basically, someone wanted to know why more DB2 people did not use Twitter. The consensus seems to be that many organizations block it as a "non-business" web site.

(Surprisingly, LinkedIn seems not to be blocked as often as Twitter, even though LinkedIn is a prime vehicle for job search networking.)

This is disputable. If you've tried Twittering you know that it can be addictive, but it is also growing in popularity as a business tool for communication. This might seem hard to believe when you first dive into Twittering.

The basic idea of Twitter is simple: provide a platform for users to publish messages of no more than 140 characters at a time. And that can seem limiting... until you've used Twitter for awhile. If you subscribe to my Twitter feed you'll find that I send out regular Tweets (that is what a Twitter message is called) for many things, such as:
  • when I post a new blog entry (maybe you got here that way),
  • to share the highlights of interesting sessions when I attend a conference or user group,
  • to notify folks when I've published a new article or column, and
  • just to share some of the "things" going on in my life.
OK, so what are the business uses of Twitter? Well, sharing information (like I do) is absolutely a
business usage. Sharing practical web links is another. Keeping abreast of technology topics, yet
another. Micro-messaging can help you reduce email and eliminate unproductive meetings.

Other DB2 professionals use Twitter to communicate and solve problems. Willie Favero, Troy Coleman, and even some in-the-trenches folks use Twitter. So you know you'll get some good DB2 information if you participate.

So what? you may say: "my company already blocked Twitter so I can't participate." Well, there might be a way around that (I don't know if this will work or not). From your home PC, or some other non-company PC, go to twitter.com, register and see what it is all about. Then download a Twitter client, like TweetDeck (which my personal favorite) or Twhirl. Take the download and install it at work... now see if things are still blocked when you use a different client. They might be, but then again, maybe not...

Now (wink-wink) I do not really advocate people trying to get around their company's policies. But if you try this out and it works (or even if it does not) post a comment here to let us all know.

Wednesday, February 11, 2009

Don't Forget DISPLAY as a Part of Your DB2 Tuning Efforts

Although a DB2 performance monitor is probably the best solution for gathering information about your DB2 subsystems and databases, you can gain significant insight into “what is going on out there” using the simple DISPLAY command. The DISPLAY command can be used to return information about the status of DB2 data sharing groups, databases and table spaces, threads, stored procedures, user-defined functions, utilities, and traces; it can also monitor the Resource Limit Facility (RLF) and distributed data locations. Let’s take a quick tour of the useful information provided by the DISPLAY command.

Database Information

There are eight variations of the DISPLAY command that you can utilize, depending on the type of information you are looking for. Probably the most often-used variation of the DISPLAY command is the DATABASE option. By running the DISPLAY DATABASE command, you can gather information on DB2 databases and tablespaces. The output of the basic command will show the status of the objects specified along with any exception states that apply. For example:
-DISPLAY DATABASE(DBNAME)

Issuing this command will display details on the DBNAME database including information about the tablespaces and indexes in that database. So, with a simple command you can easily find all of the tablespaces and indexes within any database — pretty powerful stuff. But the status information for each space is useful, too. When a status other than RO or RW is encountered, the object is in an indeterminate state or is being processed by a DB2 utility. The possible statuses that DB2 can assign to a page set are detailed in the following table.

ARBDP

Index is in Advisory Rebuild Pending status; the index should be rebuilt to improve performance and allow the index to be used for index-only access again.

AREO*

The table space, index, or partition is in Advisory Reorg Pending status; the object should be reorganized to improve performance. This status is new as of DB2 V8.

ACHKP

The Auxiliary Check Pending status has been set for the base table space. An error exists in the LOB column of the base table space.

AREST

The table space, index space, or partition is in Advisory Restart Pending status. If back-out activity against the object is not already underway, either issue the RECOVER POSTPONED command or recycle the specifying LBACKOUT=AUTO.

AUXW

Either the base table space or the LOB table space is in the Auxiliary Warning status. This warning status indicates an error in the LOB column of the base table space or an invalid LOB in the LOB table space.

CHKP

The Check Pending status has been set for this table space or partition.

COPY

The Copy Pending flag has been set for this table space or partition.

DEFER

Deferred restart is required for the object.

GRECP

The table space, table space partition, index, index partition, or logical index partition is in the group buffer pool Recover Pending state.

ICOPY

The index is in Informational Copy Pending status.

INDBT

In-doubt processing is required for the object.

LPL

The table space, table space partition, index, index partition, or logical index partition has logical page errors.

LSTOP

The logical partition of a non-partitioning index is stopped.

PSRBD

The entire non-partitioning index space is in Page Set Rebuild Pending status.

OPENF

The table space, table space partition, index, index partition, or logical index partition had an open data set failure.

PSRCP

Indicates Page Set Recover Pending state for an index (non-partitioning indexes).

PSRBD

The non-partitioning index space is in a Page Set Rebuild Pending status.

RBDP

The physical or logical index partition is in the Rebuild Pending status.

RBDP*

The logical partition of a non-partitioning index is in the Rebuild Pending status, and the entire index is inaccessible to SQL applications. However, only the logical partition needs to be rebuilt.

RECP

The Recover Pending flag has been set for this table space, table space partition, index, index partition, or logical index partition.

REFP

The table space, index space, or index is in Refresh Pending status.

RELDP

The object has a release dependency.

REORP

The data partition is in a REORG Pending state.

REST

Restart processing has been initiated for the table space, table space partition, index, index partition, or logical index partition.

RESTP

The table space or index is in the Restart Pending status.

RO

The table space, tables pace partition, index, index partition, or logical index partition has been started for read-only processing.

RW

The table space, table space partition, index, index partition, or logical index partition has been started for read and write processing.

STOP

The table space, table space partition, index, index partition, or logical index partition has been stopped.

STOPE

The table space or index is stopped because of an invalid log RBA or LRSN in one of its pages.

STOPP

A stop is pending for the table space, table space partition, index, index partition, or logical index partition.

UT

The table space, table space partition, index, index partition, or logical index partition has been started for the execution of utilities only.

UTRO

The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, but only RO processing is enabled because a utility is in progress for that object.

UTRW

The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, and a utility is in progress for that object.

UTUT

The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, but only UT processing is enabled because a utility is in progress for that object.

WEPR

Write error page range information.


Of course, there are many additional options that can be used in conjunction with the DISPLAY DATABASE command. The following options can be used to narrow down the amount of information displayed:

  • USE displays what processes are using resources for the page sets in the database
  • CLAIMERS displays the claims on the page sets in the database
  • LOCKS displays the locks held on the page sets in the database
  • LPL displays the logical page list entries
  • WEPR displays the write error page range information.

Additionally, for partitioned page sets, you can specify which partition, or range of partitions, that you wish to display.

The OVERVIEW option can be specified to display each object in the database on its own line. This condenses the output of the command and makes it easier to view. The OVERVIEW keyword cannot be specified with any other keywords except SPACENAM, LIMIT, and AFTER.

Another tactic that can be used to control the amount of output generated by DISPLAY DATABASE is to use the LIMIT parameter. The default number of lines returned by the DISPLAY command is 50, but the LIMIT parameter can be used to set the maximum number of lines returned to any numeric value. For example:



-DISPLAY DATABASE(DBNAME) LIMIT(300)

Using the LIMIT parameter in this manner would increase the limit to 200 lines of output. To indicate no limit, you can replace the numeric limit with an asterisk (*).

Finally, you can choose to display only objects in restricted or advisory status using either the ADVISORY or RESTRICT key word.

Buffer Pool Information

The DISPLAY BUFFERPOOL command can be issued to display the current status and allocation information for each buffer pool. For example, consider the following:




-DISPLAY BUFFERPOOL (BP0)

DSNB401I ALLOCATED = 2000 TO BE DELETED = 0
IN USE/UPDATED = 12

DSNB403I ALLOCATED = 100000 TO BE DELETED = 0
BACKED BY ES = 91402

DSNB404I VPSEQUENTIAL = 80 HPSEQUENTIAL = 80
DEFERRED WRITE = 50 VERTICAL DEFERRED WRT = 10
IOP SEQUENTIAL = 50

DSNB405I HIPERSPACE NAMES - @001SSOP

DSN9022I DSNB1CMD '-DISPLAY BUFFERPOOL' NORMAL COMPLETION






We can see by reviewing these results that BP0 has been assigned 2,000 pages, all of which have been allocated. Furthermore, we see that it is backed by a hiperpool of 100,000 pages (so this is not a V8 subsystem, because hiperpools are no longer supported as of V8). The output also shows us the current settings for each of the sequential steal and deferred write thresholds.

For additional information on buffer pools you can specify the DETAIL parameter. Using DETAIL(INTERVAL) produces buffer pool usage information since the last execution of DISPLAY BUFFERPOOL. To report on buffer pool usage since the pool was activated, specify DETAIL(*). In each case, DB2 will return detailed information on buffer-pool usage such as the number of GETPAGEs, prefetch usage, and synchronous reads. The detailed data returned after executing this command can be used for rudimentary buffer pool tuning. For example, you can monitor the read efficiency of each buffer pool using the following formula:


(Total GETPAGEs) / [ (SEQUENTIAL PREFETCH) +
(DYNAMIC PREFETCH) +
(SYNCHRONOUS READ)
]

A higher read efficiency value is better than a lower one because it indicates that pages, once read into the buffer pool, are used more frequently. Additionally, if buffer pool I/O is consistently high, you might consider adding pages to the buffer pool to handle more data.

Finally, you can gather even more information about your buffer pools using the LIST and LSTATS parameters. The LIST parameter lists the open table spaces and indexes within the specified buffer pools; the LSTATS parameter lists statistics for the table spaces and indexes reported by LIST. Statistical information is reset each time DISPLAY with LSTATS is issued, so the statistics are as of the last time LSTATS was issued.

Utility Execution Information

If you are charged with running (IBM) DB2 utilities, another useful command is DISPLAY UTILITY. Issuing a DISPLAY UTILITY command will cause DB2 to display the status of all active, stopped, or terminating utilities.

So, if you are in over the weekend running REORGs, issuing an occasional DISPLAY UTILITY allows you to keep up-to-date on the status of the job. By monitoring the current phase of the utility and matching this information with the utility phase information, you can determine the relative progress of the utility as it processes.

For the IBM COPY, REORG, and RUNSTATS utilities, the DISPLAY UTILITY command also can be used to monitor the progress of particular phases. The COUNT specified for each phase lists the number of pages that have been loaded, unloaded, copied, or read.

You also can check the progress of the CHECK, LOAD, RECOVER, and MERGE utilities using DISPLAY UTILITY. The number of rows, index entries, or pages, that have been processed are displayed by this command.

Log Information

You can use the DISPLAY LOG command to display information about the number of logs, their current capacity, and the setting of the LOGLOAD parameter. This information pertains to the active logs. DISPLAY ARCHIVE will show information about your archive logs.

Stored Procedure and UDF Information

If your organization uses stored procedures and/or user-defined functions (UDFs), the DISPLAY command once again comes in handy. You can use the DISPLAY PROCEDURE command to monitor stored procedure statistics. This command will return the following information:
  • Whether the named procedure is currently started or stopped
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • The WLM environment in which the stored procedure executes

For UDFs, you can use the DISPLAY FUNCTION SPECIFIC command to monitor UDF statistics. This command displays one output line for each function that a DB2 application has accessed. It shows:

  • Whether the named function is currently started or stopped, and why
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • The WLM environment in which the function executes

When displaying information about stored procedures and UDFs using the DISPLAY PROCEDURE and DISPLAY FUNCTION SPECIFIC commands, a status is returned indicating the state of the procedure or UDF. A procedure or UDF can be in one of four potential states:

  1. STARTED - requests for the function can be processed
  2. STOPQUE - requests are queued
  3. STOPREJ - requests are rejected
  4. STOPABN - requests are rejected because of abnormal termination
Log Information

There is a wealth of additional information that the DISPLAY command can uncover.
  • For distributed environments, use DISPLAY DDF to show DDF configuration and status information, as well as statistical details on distributed connections and threads; use DISPLAY LOCATION to show information about distributed threads.
  • For data sharing, you can use the DISPLAY GROUP command to display information about the data-sharing group (including the version of DB2 for each member); and DISPLAY GROUPBUFFERPOOL can be used to show information about the status of DB2 group buffer pools.
  • If you use the Resource Limit Facility, the DISPLAY RLIMIT command can be used to show the status of the RLF, including the ID of the active Resource Limit Specification Table (RLST).
  • To display active and in-doubt connections to DB2 for a specified connection or all connections, use the DISPLAY THREAD command.
  • And finally, the DISPLAY TRACE command can be used to list your active trace types and classes along with the specified destinations for each.

Summary

The DB2 DISPLAY command is indeed a powerful, and simple tool that can be used to gather a wide variety of details about your DB2 subsystems and databases. Every DBA should know how to use DISPLAY and its many options to simplify their day-to-day duties and job tasks.