Wednesday, October 24, 2007
Miscellaneous Utility Enhancements [DB2 9 for z/OS]
One of these small changes comes to us with the DSN1LOGP utility. The utility will now detect possible erroneous recovery information. For example, consider a situation where you specify a range of log records to print but the entire range is no longer recorded in the BSDS. This can happen if your archive logs have rolled off. Prior to DB2 V9 DSN1LOGP would have just merrily printed the records for you. But because this can be confusing and c ould lead you to recoverying improperly, V9 will detect this situation, produce a RC 4 and the DSN1224I error message Indicating logs could not be found.
Another nice new utility-related enhancement made to DB2 9 for z/OS is that the DB2 Utilities Panel within DB2I adds an option for you to specify the SDSNLOAD library. This is useful when there are more than one DB2 subsystems on an LPAR.
DSN1COPY has been enhanced, too. As of DB2 9 for z/OS, running DSN1COPY RESET to copy a compressed table space from a non-data sharing DB2 subsystem to another non-data sharing subsystem will copy the specified data and reset the PGLOGRBA of each page and the dictionary version. In V7 and V8 the dictionary version was not reset (instead, you had to REORG the copied table space with KEEPDICTIONARY=NO in the target subsystem). This was not an issue for data sharing because the LRSN is used instead of an RBA.
Keep an eye out for future editions of this blog where we will go over the V9 changes for REORG and the BACKUP and RESTORE SYSTEM utilities.
Wednesday, October 10, 2007
RECOVER Improvements [DB2 9 for z/OS]
Point-In-Time Consistency
The most significant enhancement centers on recovering to a point-in-time (PIT) with consistency. As any DBA who has ever been charged with recovering an operational database to a prior PIT knows, it can be challenging to accomplish. Of course, if you have a usable QUIESCE point then point in time recovery is easy. But that means you’d have to have planned ahead of time and taken a QUIESCE, which is not always possible with heavy 24x7 workloads.
Well, DB2 V9 offers some relief because the IBM RECOVER utility is enhanced to automatically identify uncommitted work at the PIT to which you are recovering. After detecting the uncommitted work DB2 will rollback any changes made to the database objects that are being recovered. This means that anything that is being recovered will be in a transactionally consistent state. Of course, you still have to make sure that all of the appropriate objects are participating in the recovery.
This works with for PIT recoveries executed with either the TOLOGPOINT or TORBA keywords only. To accomplish this two new phases are added after the LOGAPPLY phase: LOGSCR and LOGUNDO.
The LOGSCR phase runs for each DB2 member with an active unit of recovery (UR). If no UR was active then this phase will be skipped. During the LOGCSR phase, all active and incomplete URs are identified. This includes URs that were:
- INFLIGHT
- INABORT
- INDOUBT
- POSTPONED ABORT
If any URs were found, the LOGUNDO phase runs. This is where RECOVER backs out
the changes made on recovered objects by active URs. In a data sharing environment, this
will be done one member at a time.
Additional Information on the Progress of Your Recovery
Another improvement made to the IBM RECOVER utility in V9 is the ability to gather more information about the progress of the recovery. Any DBA who has been tasked with sitting around and monitoring recoveries what the DISPLAY command to show accurate, up-to-date, and appropriate information about the status of the recovery.
Well, DB2 V9 enhances the -DISPLAY UTILITY command to provide additional information during the LOGAPPLY phase. DISPLAY now will show the range of the log that needs to be applied, for the database objects that are being recovered. It will also show the progress up to the last commit, as well as the elapsed time since the start of the log apply phase of the recovery.
This information will assist the DBA in determining how long it will take for the recovery job to successfully complete.
Using an Earlier Image Copy
One final enhancement made to the RECOVER utility in V9 gives the DBA control to get DB2 to recover using an earlier image copy. This is accomplished by specifying a point in the log before which DB2 must select an image copy for the recovery.
This technique can be particularly helpful when the most recent image copy is damaged or somehow unusable. You can use an earlier copy by providing an RBA/LRSN value to the new RESTOREBEFORE parameter that is at a point on the log prior to when the unusable image copy was taken.
Monday, October 08, 2007
LOAD and UNLOAD Enhancements [DB2 9 for z/OS]
Handling DECFLOAT Data
In terms of functionality, the first change we’ll discuss is how to deal with DECFLOAT data type in LOAD (and UNLOAD). If you do not recall what the DECFLOAT data type is, refresh your memory here.
The DECFLOAT data type is compatible with SMALLINT, INTEGER, BIGINT, DECIMAL, FLOAT, DECFLOAT. Of course, you will have to concern yourself with how the data is handled if it is loaded to or from DECFLOAT. You specify how to handle this manipulation using the DECFLOAT_ROUNDMODE parameter, which can be used in both the LOAD and UNLOAD utilities.
The following DECFLOAT_ROUNDMODE specifications are allowed:
- ROUND_CEILING - Round towards positive infinity.
- ROUND_DOWN - Round towards zero.
- ROUND_FLOOR - Round towards negative infinity.
- ROUND_HALF_DOWN - Round to nearest; if equidistant, round down.
- ROUND_HALF_EVEN - Round to nearest; if equidistant, round so that the final digit is even.
- ROUND_HALF_UP - Round to nearest; if equidistant, round up.
- ROUND_UP - Round away from zero.
If you do not specify DECFLOAT_ROUNDMODE, the LOAD statement uses the DFPDEFDM value in DSNHDECP as the default value.
Loading LOBs
Loading and unloading LOBs has been improved. For LOAD, an input field value can contain the name of the file that contains a LOB column value. The LOB column value will then be loaded from that file. For UNLOAD, you can store the value of a LOB column in a file and record the name of the file in the unloaded record in the base table.
LOAD, LOBs, and SORTKEYS NO
When a LOAD SHRLEVEL NONE of a table with a LOB column, fails, it can only be restarted with RESTART(CURRENT). This has been the case since LOBs were introduced to DB2.
However, when the SORTKEYS keyword was introduced in V5, a restriction with its use was that a LOAD that abended in the RELOAD, SORT, or BUILD phases could only be restarted with RESTART(PHASE). Restart will convert any RESTART option to RESTART(PHASE) when SORTKEYS is specified. This situation limited our options when LOBs were involved.
In V8 SORTKEYS became the default, so SORTKEYS and LOB restart processing were incompatible. In V9 this is fixed:
- You can turn SORTKEYS off by specifying SORTKEYS NO.
- And you can now use RESTART(PHASE) for LOAD REPLACE of a table with a LOB.
If you run LOAD REPLACE SHRLEVEL NONE on a table with a LOB, specifying SORTKEYS NO means that you can restart with either RESTART(CURRENT) or RESTART(PHASE). Without the SORTKEYS NO you can only use RESTART(PHASE).
If you run LOAD RESUME YES SHRLEVEL NONE on a table with a LOB, you must specify SORTKEYS NO if you want the job to be restartable; only RESTART(CURRENT) will be available.
Skipping Locked Rows
Recall from an earlier post here that you can skip locked rows in your transactions using the SKIP LOCKED DATA option within your SQL statements. Well, guess what, you can skip locked rows when running an UNLOAD, too.
If you are running an UNLOAD with SHRLEVEL CHANGE ISOLATION CS, you can also specify SKIP LOCKED DATA. This will cause the UNLOAD utility to skip rows on which incompatible locks are held by other transactions. This option only applies to table spaces with row level or page level locking.
Tuesday, October 02, 2007
More Online Utilities [DB2 9 for z/OS]
First up we have CHECK DATA. This utility is executed when you want to make sure that the data in your tables matches your integrity constraints. Prior to V9, access to tables (or table spaces) is read only when CHECK DATA is being run. This means that as long as CHECK DATA is executing you cannot modify data. If you have complex referential sets and/or very large databases you likely will have issues with running CHECK DATA.
You can run CHECK DATA online – that means with SHRLEVEL CHANGE – as of DB2 V9. As with other online utilities, an online CHECK DATA will work with on a shadow copy of the data and indexes. DB2 makes the shadow copy using DFSMS ADRDSSU. This copy can be super fast if you have data set level FlashCopy V2 enabled. Even if you do not have FlashCopy availability will improve because the outage is reduced when you use SHRLEVEL CHANGE (just to make the copy) instead of SHRLEVEL REFERENCE (for the entire utility execution).
DB2 will check data integrity using the shadow copies of the data and indexes. When violations are found DB2 will not set the CHECK pending state. And remember, we are checking shadows, so DB2 will not delete rows in violation. Instead, CHECK DATA will create a PUNCHDDN containing REPAIR LOCATE DELETE input that can be run against the active data.
The CHECK LOB utility has also been enhanced to enable online execution. DB2 V9 allows SHRLEVEL CHANGE for checking LOBs using a shadow copy of the LOB table space. And just like CHECK DATA, DB2 uses the DFSMS ADRDSSU utility to create the shadow. And similarly, the check is executed against the shadow and REPAIR statements are generated that you subsequently can run against the actual LOB table space.
Finally, you can also run the REPAIR LOCATE utility specifying SHRLEVEL CHANGE. This enhancements allows us to execute the REPAIR utility online with LOCATE against indexes, index spaces and table spaces.
Monday, September 24, 2007
Histogram Statistics [DB2 9 for z/OS]
Another utility upgrade that has found its way into DB2 9 for z/OS is the ability to gather histogram statistics. This feature is already available in DB2 for Linux, Unix, and Windows… and after you migrate to DB2 V9 it will be available to you on z/OS.
What is it? Well, let’s first define what a histogram is for those of you who are not statistics experts. A histogram is a way of summarizing data that is measured on an interval scale. A histogram is particularly helpful to quickly highlight how data is distributed; to determine if data is symmetrical or skewed; and to indicate whether or not outliers exists.
The histogram is only appropriate for variables whose values are numerical and measured on an interval scale. It is generally used when dealing with large data sets. Histogram statistics can be quite useful to the optimizer for certain types of queries.
Instead of the frequency statistics, which are collected for only a subset of the data, sometimes DB2 can improve access path selection by estimating predicate selectivity from histogram statistics, which are collected over all values in a table space.
Consider collecting histogram statistics to improve access paths for troublesome queries with RANGE, LIKE, and BETWEEN predicates. They can also help in some cases for =, IS NULL, IN LIST and COL op COL predicates.
How to Collect Histogram Statistics
IBM RUNSTATS in DB2 V9 can collect statistics by quantiles. DB2 allows up to 100 quantiles. The user can specify how many quantiles DB2 is to use from 1 to 100. Of course, avoid 1 because it will not help.
You can tell RUNSTATS to collect histogram statistics by coding the HISTOGRAM keyword in conjunction with the COLGROUP option. In this way you can collect histogram statistics for a group of columns. You must also tell DB2 the number of quantiles to collect by specifying the NUMQUANTILES parameter. NUMQUANTILES can also be specified with the INDEX parameter, in which can it indicates that histogram statistics are to be collected for the columns of the index.
A single value can never broken into more than one interval. This means that the maximum number of intervals is equal to the number of distinct column values. Therefore, be sure that you do not specify a value for NUMQUANTILES that is greater than the total number of distinct values for the column (or column group) specified. Also, keep in mind that any NULLs will occupy a single interval.
So then, how do you decide on the number of quantiles to collect? If you do not specify NUMQUANTILES, the default value of 100 will be used, and then based on the number of records in the table, the number will be readjusted to an optimal number. Therefore, unless you have a good understanding of the application or a viable reason to deviate, a good rule of thumb is to simply let the NUMQUANTILES default and let DB2 work it out.
RUNSTATS will produce an equal-depth histogram. This means that each interval will have about the same number of rows. Please note that this does not mean the same number of values – it is the same number of rows. This means that in some cases a highly frequent single value could potentially occupy an interval all by itself.
The histogram statistics are collected in three new columns: QUANTILENO, LOWVALUE, and HIGHVALUE. These columns can be found in the following six DB2 Catalog tables:
- SYSIBM.SYSCOLDIST
- SYSIBM.SYSKEYTGTDIST
- SYSIBM.SYSCOLDIST_HIST
- SYSIBM.SYSCOLDISTSTATS
- SYSIBM.SYSKEYTGTDIST_HIST
- SYSIBM.SYSKEYTGTDISTSTATS.
Here is an example of a RUNSTATS to gather histogram statistics for the key columns of the indexes.:
RUNSTATS TABLESPACE DB07.CSMTS02
INDEX ALL
HISTOGRAM NUMCOLS 2 NUMQUANTILES 10
SHRLEVEL(CHANGE)
UPDATE ALL
REPORT YES
Summary
Histogram statistics is a very powerful new capability of the RUNSTATS utility that can be used to gather distribution statistics across all data values. These statistics can be helpful when you need additional distribution data to enable the optimizer to arrive at a better access path for certain queries/predicates.