Wednesday, February 11, 2009

Don't Forget DISPLAY as a Part of Your DB2 Tuning Efforts

Although a DB2 performance monitor is probably the best solution for gathering information about your DB2 subsystems and databases, you can gain significant insight into “what is going on out there” using the simple DISPLAY command. The DISPLAY command can be used to return information about the status of DB2 data sharing groups, databases and table spaces, threads, stored procedures, user-defined functions, utilities, and traces; it can also monitor the Resource Limit Facility (RLF) and distributed data locations. Let’s take a quick tour of the useful information provided by the DISPLAY command.

Database Information

There are eight variations of the DISPLAY command that you can utilize, depending on the type of information you are looking for. Probably the most often-used variation of the DISPLAY command is the DATABASE option. By running the DISPLAY DATABASE command, you can gather information on DB2 databases and tablespaces. The output of the basic command will show the status of the objects specified along with any exception states that apply. For example:
-DISPLAY DATABASE(DBNAME)

Issuing this command will display details on the DBNAME database including information about the tablespaces and indexes in that database. So, with a simple command you can easily find all of the tablespaces and indexes within any database — pretty powerful stuff. But the status information for each space is useful, too. When a status other than RO or RW is encountered, the object is in an indeterminate state or is being processed by a DB2 utility. The possible statuses that DB2 can assign to a page set are detailed in the following table.

ARBDP

Index is in Advisory Rebuild Pending status; the index should be rebuilt to improve performance and allow the index to be used for index-only access again.

AREO*

The table space, index, or partition is in Advisory Reorg Pending status; the object should be reorganized to improve performance. This status is new as of DB2 V8.

ACHKP

The Auxiliary Check Pending status has been set for the base table space. An error exists in the LOB column of the base table space.

AREST

The table space, index space, or partition is in Advisory Restart Pending status. If back-out activity against the object is not already underway, either issue the RECOVER POSTPONED command or recycle the DB2 subsystem specifying LBACKOUT=AUTO.

AUXW

Either the base table space or the LOB table space is in the Auxiliary Warning status. This warning status indicates an error in the LOB column of the base table space or an invalid LOB in the LOB table space.

CHKP

The Check Pending status has been set for this table space or partition.

COPY

The Copy Pending flag has been set for this table space or partition.

DEFER

Deferred restart is required for the object.

GRECP

The table space, table space partition, index, index partition, or logical index partition is in the group buffer pool Recover Pending state.

ICOPY

The index is in Informational Copy Pending status.

INDBT

In-doubt processing is required for the object.

LPL

The table space, table space partition, index, index partition, or logical index partition has logical page errors.

LSTOP

The logical partition of a non-partitioning index is stopped.

PSRBD

The entire non-partitioning index space is in Page Set Rebuild Pending status.

OPENF

The table space, table space partition, index, index partition, or logical index partition had an open data set failure.

PSRCP

Indicates Page Set Recover Pending state for an index (non-partitioning indexes).

RBDP

The physical or logical index partition is in the Rebuild Pending status.

RBDP*

The logical partition of a non-partitioning index is in the Rebuild Pending status, and the entire index is inaccessible to SQL applications. However, only the logical partition needs to be rebuilt.

RECP

The Recover Pending flag has been set for this table space, table space partition, index, index partition, or logical index partition.

REFP

The table space, index space, or index is in Refresh Pending status.

RELDP

The object has a release dependency.

REORP

The data partition is in a REORG Pending state.

REST

Restart processing has been initiated for the table space, table space partition, index, index partition, or logical index partition.

RESTP

The table space or index is in the Restart Pending status.

RO

The table space, table space partition, index, index partition, or logical index partition has been started for read-only processing.

RW

The table space, table space partition, index, index partition, or logical index partition has been started for read and write processing.

STOP

The table space, table space partition, index, index partition, or logical index partition has been stopped.

STOPE

The table space or index is stopped because of an invalid log RBA or LRSN in one of its pages.

STOPP

A stop is pending for the table space, table space partition, index, index partition, or logical index partition.

UT

The table space, table space partition, index, index partition, or logical index partition has been started for the execution of utilities only.

UTRO

The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, but only RO processing is enabled because a utility is in progress for that object.

UTRW

The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, and a utility is in progress for that object.

UTUT

The table space, table space partition, index, index partition, or logical index partition has been started for RW processing, but only UT processing is enabled because a utility is in progress for that object.

WEPR

Write error page range information.
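
As a rough illustration of the "anything other than RO or RW" rule described before the table, the following Python sketch picks the exception states out of a status field. It assumes the status arrives as a comma-separated string such as RW,COPY; the function name and the sample values are hypothetical and are not part of DB2.

```python
# Sketch only: flag status codes other than RO or RW in a status field
# taken from DISPLAY DATABASE output. The comma-separated input format
# and the function name are assumptions made for this example.
NORMAL_STATUSES = {"RO", "RW"}

def exception_statuses(status_field):
    """Return the codes in a comma-separated status field that are not RO or RW."""
    codes = [code.strip() for code in status_field.split(",") if code.strip()]
    return [code for code in codes if code not in NORMAL_STATUSES]

print(exception_statuses("RW,COPY"))  # ['COPY']
print(exception_statuses("RO"))       # []
```

Any code the function returns can then be looked up in the table above to decide what action, if any, is needed.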


Of course, there are many additional options that can be used in conjunction with the DISPLAY DATABASE command. The following options can be used to narrow down the amount of information displayed:

  • USE displays what processes are using resources for the page sets in the database
  • CLAIMERS displays the claims on the page sets in the database
  • LOCKS displays the locks held on the page sets in the database
  • LPL displays the logical page list entries
  • WEPR displays the write error page range information.

Additionally, for partitioned page sets, you can specify which partition, or range of partitions, that you wish to display.

The OVERVIEW option can be specified to display each object in the database on its own line. This condenses the output of the command and makes it easier to view. The OVERVIEW keyword cannot be specified with any other keywords except SPACENAM, LIMIT, and AFTER.

Another tactic that can be used to control the amount of output generated by DISPLAY DATABASE is to use the LIMIT parameter. The default number of lines returned by the DISPLAY command is 50, but the LIMIT parameter can be used to set the maximum number of lines returned to any numeric value. For example:



-DISPLAY DATABASE(DBNAME) LIMIT(300)

Using the LIMIT parameter in this manner would increase the limit to 300 lines of output. To indicate no limit, you can replace the numeric limit with an asterisk (*).

Finally, you can choose to display only objects in restricted or advisory status using either the ADVISORY or RESTRICT keyword.
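
If you generate these commands from a script, a small helper can keep the option rules straight. The Python sketch below only builds the command text from the keywords discussed above and enforces the OVERVIEW restriction; it does not issue anything to DB2. The function and parameter names are hypothetical.

```python
# Sketch only: compose a -DISPLAY DATABASE command string.
# Function and parameter names are hypothetical; nothing is sent to DB2.
def display_database_command(database, spacenam=None, limit=None,
                             overview=False, status_filter=None):
    """Build the command text from the options described in this section."""
    if status_filter not in (None, "RESTRICT", "ADVISORY"):
        raise ValueError("status_filter must be RESTRICT, ADVISORY, or None")
    if overview and status_filter is not None:
        # OVERVIEW may be combined only with SPACENAM, LIMIT, and AFTER.
        raise ValueError("OVERVIEW cannot be combined with RESTRICT or ADVISORY")
    parts = [f"-DISPLAY DATABASE({database})"]
    if spacenam is not None:
        parts.append(f"SPACENAM({spacenam})")
    if overview:
        parts.append("OVERVIEW")
    if status_filter is not None:
        parts.append(status_filter)
    if limit is not None:
        parts.append(f"LIMIT({limit})")
    return " ".join(parts)

print(display_database_command("DBNAME", limit=300))
print(display_database_command("DBNAME", status_filter="RESTRICT", limit="*"))
```

The second call produces a command that lists only restricted objects with no line limit, which is a handy first check when an application reports a resource-unavailable error.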

Buffer Pool Information

The DISPLAY BUFFERPOOL command can be issued to display the current status and allocation information for each buffer pool. For example, consider the following:




-DISPLAY BUFFERPOOL (BP0)

DSNB401I ALLOCATED = 2000 TO BE DELETED = 0
IN USE/UPDATED = 12

DSNB403I ALLOCATED = 100000 TO BE DELETED = 0
BACKED BY ES = 91402

DSNB404I VPSEQUENTIAL = 80 HPSEQUENTIAL = 80
DEFERRED WRITE = 50 VERTICAL DEFERRED WRT = 10
IOP SEQUENTIAL = 50

DSNB405I HIPERSPACE NAMES - @001SSOP

DSN9022I DSNB1CMD '-DISPLAY BUFFERPOOL' NORMAL COMPLETION






We can see by reviewing these results that BP0 has been assigned 2,000 pages, all of which have been allocated. Furthermore, we see that it is backed by a hiperpool of 100,000 pages (so this is not a V8 subsystem, because hiperpools are no longer supported as of V8). The output also shows us the current settings for each of the sequential steal and deferred write thresholds.

For additional information on buffer pools you can specify the DETAIL parameter. Using DETAIL(INTERVAL) produces buffer pool usage information since the last execution of DISPLAY BUFFERPOOL. To report on buffer pool usage since the pool was activated, specify DETAIL(*). In each case, DB2 will return detailed information on buffer-pool usage such as the number of GETPAGEs, prefetch usage, and synchronous reads. The detailed data returned after executing this command can be used for rudimentary buffer pool tuning. For example, you can monitor the read efficiency of each buffer pool using the following formula:


Read efficiency = (Total GETPAGEs) /
    [ (SEQUENTIAL PREFETCH) + (DYNAMIC PREFETCH) + (SYNCHRONOUS READ) ]

A higher read efficiency value is better than a lower one because it indicates that pages, once read into the buffer pool, are used more frequently. Additionally, if buffer pool I/O is consistently high, you might consider adding pages to the buffer pool to handle more data.
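
To make the arithmetic concrete, here is a minimal Python sketch of the read efficiency calculation. The counter values are invented for illustration; in practice you would take them from the DETAIL output for the pool you are examining.

```python
# Sketch only: read efficiency = GETPAGEs / (sequential prefetch +
# dynamic prefetch + synchronous reads). All input numbers are made up.
def read_efficiency(getpages, sequential_prefetch, dynamic_prefetch, synchronous_reads):
    """Return GETPAGEs per read request, or None when no reads were recorded."""
    total_reads = sequential_prefetch + dynamic_prefetch + synchronous_reads
    if total_reads == 0:
        return None  # nothing was read, so the ratio is undefined
    return getpages / total_reads

# 120,000 GETPAGEs satisfied by 4,000 read requests
print(read_efficiency(120000, 1500, 500, 2000))  # 30.0
```

Tracking this number per buffer pool over time is more useful than any single reading, because what counts as a good value depends on the workload assigned to the pool.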

Finally, you can gather even more information about your buffer pools using the LIST and LSTATS parameters. The LIST parameter lists the open table spaces and indexes within the specified buffer pools; the LSTATS parameter lists statistics for the table spaces and indexes reported by LIST. Statistical information is reset each time DISPLAY with LSTATS is issued, so the statistics are as of the last time LSTATS was issued.

Utility Execution Information

If you are charged with running (IBM) DB2 utilities, another useful command is DISPLAY UTILITY. Issuing a DISPLAY UTILITY command will cause DB2 to display the status of all active, stopped, or terminating utilities.

So, if you are in over the weekend running REORGs, issuing an occasional DISPLAY UTILITY allows you to keep up-to-date on the status of the job. By monitoring the current phase of the utility and matching this information with the utility phase information, you can determine the relative progress of the utility as it processes.

For the IBM COPY, REORG, and RUNSTATS utilities, the DISPLAY UTILITY command also can be used to monitor the progress of particular phases. The COUNT specified for each phase lists the number of pages that have been loaded, unloaded, copied, or read.

You also can check the progress of the CHECK, LOAD, RECOVER, and MERGE utilities using DISPLAY UTILITY. The command displays the number of rows, index entries, or pages that have been processed.

Log Information

You can use the DISPLAY LOG command to display information about the number of logs, their current capacity, and the setting of the LOGLOAD parameter. This information pertains to the active logs. DISPLAY ARCHIVE will show information about your archive logs.

Stored Procedure and UDF Information

If your organization uses stored procedures and/or user-defined functions (UDFs), the DISPLAY command once again comes in handy. You can use the DISPLAY PROCEDURE command to monitor stored procedure statistics. This command will return the following information:
  • Whether the named procedure is currently started or stopped
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • The WLM environment in which the stored procedure executes

For UDFs, you can use the DISPLAY FUNCTION SPECIFIC command to monitor UDF statistics. This command displays one output line for each function that a DB2 application has accessed. It shows:

  • Whether the named function is currently started or stopped, and why
  • How many requests are currently executing
  • The high-water mark for concurrently running requests
  • How many requests are currently queued
  • How many times a request has timed out
  • The WLM environment in which the function executes

When displaying information about stored procedures and UDFs using the DISPLAY PROCEDURE and DISPLAY FUNCTION SPECIFIC commands, a status is returned indicating the state of the procedure or UDF. A procedure or UDF can be in one of four potential states:

  1. STARTED - requests for the function can be processed
  2. STOPQUE - requests are queued
  3. STOPREJ - requests are rejected
  4. STOPABN - requests are rejected because of abnormal termination
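
If you post-process this output in a script, the four states reduce to a simple lookup. The Python sketch below maps each state listed above to what happens to an incoming request; the dictionary and function names are hypothetical.

```python
# Sketch only: what happens to a new request in each of the four states
# listed above. Names are hypothetical, chosen for this example.
REQUEST_HANDLING = {
    "STARTED": "processed",
    "STOPQUE": "queued",
    "STOPREJ": "rejected",
    "STOPABN": "rejected",  # rejected because of abnormal termination
}

def request_outcome(state):
    """Return 'processed', 'queued', or 'rejected' for a displayed state."""
    key = state.strip().upper()
    if key not in REQUEST_HANDLING:
        raise ValueError(f"unknown state: {state!r}")
    return REQUEST_HANDLING[key]

print(request_outcome("STOPQUE"))  # queued
```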

Additional Information

There is a wealth of additional information that the DISPLAY command can uncover.
  • For distributed environments, use DISPLAY DDF to show DDF configuration and status information, as well as statistical details on distributed connections and threads; use DISPLAY LOCATION to show information about distributed threads.
  • For data sharing, you can use the DISPLAY GROUP command to display information about the data-sharing group (including the version of DB2 for each member); and DISPLAY GROUPBUFFERPOOL can be used to show information about the status of DB2 group buffer pools.
  • If you use the Resource Limit Facility, the DISPLAY RLIMIT command can be used to show the status of the RLF, including the ID of the active Resource Limit Specification Table (RLST).
  • To display active and in-doubt connections to DB2 for a specified connection or all connections, use the DISPLAY THREAD command.
  • And finally, the DISPLAY TRACE command can be used to list your active trace types and classes along with the specified destinations for each.

Summary

The DB2 DISPLAY command is a powerful yet simple tool that can be used to gather a wide variety of details about your DB2 subsystems and databases. Every DBA should know how to use DISPLAY and its many options to simplify their day-to-day duties.

Friday, February 06, 2009

A New DB2 Manual

I'm just now getting around to downloading the recently refreshed IBM DB2 9 for z/OS manuals. IBM updated almost all of the DB2 manuals in December 2008. Indeed, 19 of the 24 manuals listed have a publication date of December 2008.

But wait, I haven't seen one of these manuals before: IRLM Messages and Codes for IMS and DB2 for z/OS. If you take a look at the manual, yes, it is a first edition.

This "new" manual describes messages and codes that are issued by the IRLM (internal resource lock manager) which is used by both IMS and DB2 for z/OS. The information is not necessarily new, though, as it was previously contained in the messages and codes publications for both IMS and DB2. But now, we have a single manual.

Another thing I noticed, but I'm not sure exactly when it happened, is that Directory of Subsystem Parameters has been removed as an Appendix of the DB2 Installation Guide (dsnigk15). Now I know this Appendix was there in this manual when DB2 9 first came out (I still have the PDF)... but it was not in the previous edition (dsnigk14) of the Installation Guide either. Anyone know if it was moved somewhere else (wouldn't make much sense since it refers back to pages in the Installation Guide)? Or if there are plans afoot to make a DSNZPARM manual (I've been requesting and wishing for that for years).

Thursday, February 05, 2009

DB2 Performance Monitoring Overview

In today's blog entry we will discuss the basics of online monitoring and DB2 performance monitors.


The most common way to provide online DB2 performance monitoring capabilities is by online access to DB2 trace information in the MONITOR trace class. You generally specify OPX or OPn for the destination of the MONITOR trace. This way, you can place the trace records into a buffer that can be read using the IFI.


Some online DB2 performance monitors also provide direct access to DB2 performance data by reading the control blocks of the DB2 and application address spaces. This type of monitoring provides a "window" to up-to-the-minute performance statistics while DB2 is running. Such products can deliver in-depth performance monitoring without the excessive overhead of traces. Of course, they typically use a non-standard API into DB2, which could conceivably cause trouble.


Most online DB2 performance monitors provide a menu-driven interface accessible from TSO or VTAM. This interface lets the monitor start and stop traces as needed, based on the menu options chosen by the user. Consequently, you can reduce overhead and diminish the learning curve involved in understanding DB2 traces and their correspondence to performance reports.


Following are some typical uses of online performance monitors. Many online performance monitors can establish effective exception-based monitoring. When specified performance thresholds are reached, triggers can offer notification and take action. For example, you could set a ‘trigger’ to fire when a specified number of lock suspensions for TXN2 is reached; when the ‘trigger’ is activated, a message is sent to the console and a batch report is generated to provide accounting detail information for the plan. You can set any number of ‘triggers’ for many thresholds.


Following are suggestions for setting thresholds:


  • When a buffer pool threshold is reached (PREFETCH DISABLED, DEFERRED WRITE THRESHOLD, or DM CRITICAL THRESHOLD).

  • For critical transactions, when predefined performance objectives are not met. For example, if TXN1 requires subsecond response time, set a trigger to notify a DBA when the transaction receives a class 1 accounting elapsed time exceeding 1 second by some percentage (10 percent; or even 25 percent, for example).

  • Many types of thresholds can be established. Most online monitors support this capability. As such, you can customize the thresholds for the needs of your DB2 environment.
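
The TXN1 suggestion above boils down to a simple comparison. The following Python sketch shows one way to express it; the function name, the default objective of 1 second, and the sample elapsed times are assumptions for illustration and are not taken from any particular monitor.

```python
# Sketch only: decide whether a transaction's elapsed time has exceeded
# its objective by more than an allowed percentage. Names and numbers
# are hypothetical.
def exceeds_objective(elapsed_seconds, objective_seconds=1.0, tolerance_pct=10.0):
    """True when elapsed time is above the objective plus the tolerance."""
    limit = objective_seconds * (1 + tolerance_pct / 100.0)
    return elapsed_seconds > limit

print(exceeds_objective(1.05))                      # False: within 10 percent
print(exceeds_objective(1.20))                      # True: more than 10 percent over
print(exceeds_objective(1.20, tolerance_pct=25.0))  # False: within 25 percent
```

A wider tolerance cuts down on nuisance alerts for transactions whose response time naturally fluctuates, at the cost of reacting later to a real slowdown.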


Online performance monitors can produce real-time EXPLAINs for long-running SQL statements. If an SQL statement is taking a significant amount of time to process, an analyst can display the SQL statement as it executes and dynamically issue an EXPLAIN for the statement. Even as the statement executes, an understanding of why it is taking so long to run can be achieved. This can be particularly useful for dynamic SQL because it is not pre-bound and therefore you won’t have any access path information for it.


Online performance monitors can also reduce the burden of monitoring more than one DB2 subsystem. Multiple DB2 subsystems can be tied to a single online performance monitor to enable monitoring of distributed capabilities, multiple production DB2s, or test and production DB2 subsystems, all from a single session.


Most online performance monitors provide historical trending. These monitors track performance statistics and store them in DB2 tables or in VSAM files with a timestamp. They also provide the capability to query these stores of performance data to assist in the following:


  • Analyzing recent history. Most SQL statements execute quickly, which makes it difficult to capture and display information about a statement as it executes. However, you might not want to wait until the SMF data is available to run a batch report. Quick access to recent past-performance data in these external data stores provides a type of online monitoring that is as close to real time as is usually needed.

  • Determining performance trends, such as a transaction steadily increasing in its CPU consumption or elapsed time.

  • Performing capacity planning based on a snapshot of the recent performance of DB2 applications.
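
As a rough sketch of the trend analysis mentioned in the list above, the Python below fits a least-squares line through a series of equally spaced measurements and reports its slope. The daily CPU figures are invented, and a real monitor may well use a different method; this only illustrates the idea of detecting a steady increase.

```python
# Sketch only: least-squares slope of a series of equally spaced
# measurements. A positive slope suggests a rising trend. The sample
# CPU values are made up for this example.
def trend_slope(values):
    """Slope of the best-fit line through (0, v0), (1, v1), (2, v2), ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator

daily_cpu_seconds = [1.0, 2.0, 3.0, 4.0]  # hypothetical per-day averages
print(trend_slope(daily_cpu_seconds))     # 1.0
```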


Some monitors also run when DB2 is down to provide access to the historical data accumulated by the monitor.


A final benefit of online DB2 performance monitors is their capability to interface with other z/OS monitors, for example IMS, CICS, MVS, and WebSphere monitors. This way, you can obtain a view of the entire spectrum of system performance.

Monday, February 02, 2009

Congratulations Pittsburgh Steelers!

Today my blog entry will veer away from technology briefly to congratulate the Pittsburgh Steelers on winning a record sixth Super Bowl title. I was born and raised in Pittsburgh and even though I live in Texas now, I'm still a die-hard Steelers fan.

Kudos to the Arizona Cardinals on putting up a great fight... and making the game too close for comfort there at the end!

I'll get back to our regularly scheduled DB2 programming in my next post... promise!

Friday, January 30, 2009

Hey DBAs! Recoverability Trumps Performance

Many DBAs reading this blog will probably think I'm wrong, at least initially. They'll claim that managing performance is the most important thing they do, but they are confusing frequency with importance. Yes, DBAs confront performance issues more often than they build backup plans – and they better be managing performance more frequently than they are actually recovering their databases or their company has big problems!

So why do I say that recoverability is at the pinnacle of the DBA task list? Well, if you cannot recover your databases after a problem then it won’t matter how fast you can access them, will it? Anybody can deliver fast access to the wrong information (or worse yet, no information at all). It is the job of the DBA to keep the information in their company’s databases accurate, secure, and accessible.

So what do we need to do to assure the integrity of our database data? First we need to understand the availability needs of our data in terms of the business. In the event of a failure how rapidly must we be able to recover from that failure? Keep in mind that the failure could be either physical, such as a failed disk drive, or logical, such as applying the wrong input to a process which corrupts the database.

Only after we know the impact to the business can we develop an appropriate backup and recovery plan. We need service level agreements (SLAs) for recovery just like we have SLAs for performance. The recovery SLA needs to be phrased as a recovery time objective (RTO) from an application perspective; for example, “The amount of time to restore application availability after a failure of the order entry system cannot exceed 2 hours” (or 10 minutes, or whatever is appropriate for your business).

To create effective RTOs you will need to be able to answer the question “What is the cost of not having this data available?” When we know the expectations of the business we can work to create a backup and recovery plan that matches the requirements. There are multiple techniques and methods for backing up and recovering databases. Some techniques, while more costly, can enhance availability by recovering data more rapidly.

It is imperative that the DBA team creates an appropriate recovery strategy for each database object. This requires mapping database objects to applications so we can adopt the proper strategy in accordance with the application recovery SLA. Some database objects will participate in multiple applications, and their recovery strategy will therefore be more complex.

Not all data is created equal. Some of your databases and tables contain data that is necessary for the core of your business. Other database objects contain data that is less critical or easily derived from other sources. Armed with this information, DBAs can develop RTOs such that the recovery plan matches the needs of the business.

Establishing a reasonable backup schedule requires you to balance two competing demands: the need to take image copy backups frequently to assure reasonable recovery time, and the need to take image copies infrequently so as not to interrupt daily business. Keep in mind that if you make fewer image copies, you will need to apply more log records during the recovery, and the recovery will take longer. The DBA must balance these competing objectives based on RTOs, usage criteria, and the capabilities of the DBMS.
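
A back-of-the-envelope model can make this trade-off easier to discuss with the business. The Python sketch below assumes, purely for illustration, that recovery time is the time to restore the image copy plus a log-apply time that grows linearly with the hours elapsed since that copy. The linear model and every number in it are hypothetical; measure your own restore and log-apply rates before relying on anything like it.

```python
# Sketch only: a deliberately simple model of recovery time.
# recovery = restore time + (hours since last copy * log-apply minutes per hour)
# The linear assumption and all figures are hypothetical.
def estimated_recovery_minutes(restore_minutes, hours_since_copy,
                               log_apply_minutes_per_hour):
    """Estimate worst-case recovery time for a given image copy interval."""
    return restore_minutes + hours_since_copy * log_apply_minutes_per_hour

RTO_MINUTES = 120  # hypothetical recovery time objective
for copy_interval_hours in (24, 12, 6):
    worst_case = estimated_recovery_minutes(20, copy_interval_hours, 5)
    verdict = "meets" if worst_case <= RTO_MINUTES else "misses"
    print(f"copy every {copy_interval_hours}h: about {worst_case} min, "
          f"{verdict} the {RTO_MINUTES}-minute RTO")
```

With these made-up figures, a daily image copy misses a two-hour RTO while a copy every 12 hours meets it, which is exactly the kind of comparison that should drive the backup schedule.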

When was the last time you re-evaluated and tested your backup and recovery plans? Oh, you may have looked at disaster plans, but have you examined your ability to recover locally? Do you know how long it would take to recover your most important primary customer tables, for example, if you took a hit in the middle of the day?

Regular recoverability health checking should be a standard, documented responsibility for the DBA staff; and if you can acquire software to automate the health-check process, all the better.