Sunday, September 24, 2006

Sequence Objects and Identity Columns

When designing DB2 databases, a frequently heard request is for a column that contains sequentially generated numbers. For example, each row might carry a counter: when a new row is inserted, the counter is incremented by one for the new row. This way, each new DB2 row has a unique “row number” associated with it. Until recently, such a design was difficult to deliver.

Without sequence objects or identity columns an application program can implement similar functionality, but usually not in a manner that performs adequately as database usage scales. A common technique is to maintain a one-row table that contains the sequence number. Each transaction locks that table, increments the number, and then commits the change to unlock the table. In this scenario only one transaction at a time can increment the sequence number. A variation uses something like this (assuming the counter column is named SEQ_NBR):

SELECT MAX(SEQ_NBR) + 1
FROM ONEROW_TABLE
WITH RR;

The result is the next number to be used. The application uses this value and then must update ONEROW_TABLE with the incremented value. Performance bottlenecks will occur with either method when concurrent usage is heavy.
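For clarity, here is a minimal sketch of the basic lock-and-increment technique described above (again assuming the counter column is named SEQ_NBR):

UPDATE ONEROW_TABLE
SET SEQ_NBR = SEQ_NBR + 1;    -- the UPDATE takes an exclusive lock on the row

SELECT SEQ_NBR
INTO :NEXTNBR                 -- the application uses this value
FROM ONEROW_TABLE;

COMMIT;                       -- releases the lock for other transactions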

But now DB2 offers two methods of automatically generating sequential numbers for a column:
  • Identity columns, and
  • SEQUENCE objects.

Identity Columns

Identity columns were formally added to DB2 in Version 7, but were actually available as of the DB2 Version 6 refresh. The identity property is applied to a DB2 column using the IDENTITY parameter. A column defined this way causes DB2 to automatically generate a sequential value for that column when a row is added to the table. For example, identity columns might be used to generate primary key values or a value that somewhat mimics Oracle’s row number capability. Using identity columns helps to avoid some of the concurrency and performance problems that can occur when application programs are used to populate sequential values for a “counter” column.

When inserting data into a table that uses an identity column, the program or user will not provide a value for the identity column. Instead, DB2 automatically generates the appropriate value to be inserted.

Only one identity column can be defined per DB2 table. Additionally, the data type of the column must be SMALLINT, INTEGER, or DECIMAL with a zero scale, that is DECIMAL(n,0). The data type also can be a user-defined DISTINCT type based on one of these numeric data types. The designer has control over the starting point for the generated sequential values, and the number by which the count is incremented.

An example creating a table with an identity column follows:

CREATE TABLE EXAMPLE
(ID_COL INTEGER NOT NULL
GENERATED ALWAYS AS IDENTITY
(START WITH 100,
INCREMENT BY 10) ...);

In this example, the identity column is named ID_COL. The first value stored in the column will be 100 and subsequent INSERTs will add 10 to the last value. So the identity column values generated will be 100, 110, 120, 130, and so on.
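When inserting into this table you simply omit the identity column, for example (a sketch assuming the ... in the DDL above includes a character column named OTHER_COL):

INSERT INTO EXAMPLE (OTHER_COL)    -- ID_COL is omitted; DB2 generates it
VALUES ('SOME DATA');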

Note, too, that each identity column has a property associated with it assigned using the GENERATED parameter. This parameter indicates how DB2 generates values for the column. You must specify GENERATED if the column is to be considered an identity column or the data type of the column is a ROWID. This means that DB2 must be permitted to generate values for all identity columns. There are two options for the GENERATED parameter: ALWAYS and BY DEFAULT.

GENERATED ALWAYS indicates that DB2 will always generate a value for the column when a row is inserted into the table. You will usually specify ALWAYS for your identity columns unless you are using data propagation.

GENERATED BY DEFAULT indicates that DB2 will generate a value for the column when a row is inserted into the table unless a value is specified. So, if you want to be able to insert an explicit value into an identity column you must specify GENERATED BY DEFAULT.
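For example, assuming a hypothetical table EXAMPLE2 whose identity column ID_COL is defined as GENERATED BY DEFAULT, both of the following INSERTs would work:

INSERT INTO EXAMPLE2 (OTHER_COL)         -- DB2 generates the identity value
VALUES ('GENERATED');

INSERT INTO EXAMPLE2 (ID_COL, OTHER_COL) -- the explicit value is accepted
VALUES (500, 'EXPLICIT');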

Additionally, you can specify what to do when the maximum value is hit. Specifying the CYCLE keyword will cause DB2 to begin generating values from the minimum value all over again. Of course, this can cause duplicate values to be generated and should only be used when uniqueness is not a requirement.

Actually, the only way to ensure uniqueness of your identity columns is to create a unique index on the column. The IDENTITY property alone will not guarantee uniqueness.
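For example, to guarantee uniqueness for the EXAMPLE table created earlier (the index name is illustrative):

CREATE UNIQUE INDEX XEXAMPLE1
ON EXAMPLE (ID_COL);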

Sometimes it is necessary to retrieve the value of an identity column immediately after it is inserted. For example, if you are using identity columns for primary key generation you may need to retrieve the value to provide the foreign key of a child table row that is to be inserted after the primary key is generated. DB2 provides the IDENTITY_VAL_LOCAL() function that can be used to retrieve the value of an identity column after insertion. For example, you can run the following statement immediately after the INSERT statement that sets the identity value:

VALUES IDENTITY_VAL_LOCAL() INTO :IVAR;

The host variable IVAR will contain the value of the identity column.
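Putting it all together for the primary key/foreign key scenario, a sketch using hypothetical PARENT and CHILD tables:

INSERT INTO PARENT (PARENT_NAME)        -- DB2 generates the identity PK
VALUES ('NEW PARENT');

VALUES IDENTITY_VAL_LOCAL() INTO :IVAR; -- capture the generated key

INSERT INTO CHILD (PARENT_ID, CHILD_DATA)
VALUES (:IVAR, 'NEW CHILD');            -- use it as the FK of the child row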

Problems with Identity Columns

Identity columns can be useful, depending on your specific needs, but the problems that accompany identity columns are numerous. Some of these problems include:

  • Handling the loading of data into a table with an identity column defined as GENERATED BY DEFAULT. The next identity value to be assigned, as stored by DB2, may not be the correct value that should be generated. This can be especially troublesome in a testing environment.
  • LOAD INTO PART x is not allowed if an identity column is part of the partitioning index. And what about environments that require regular loading and reloading (REPLACE) for testing? The identity column will not necessarily hold the same values for the same rows from test to test.
  • Prior to V8, it was not possible to change the GENERATED parameter (such as from GENERATED BY DEFAULT to GENERATED ALWAYS).
  • The IDENTITY_VAL_LOCAL() function returns the value used for the last insert to the identity column, but it only works after a singleton INSERT. This means you cannot rely on this function if you use INSERT INTO ... SELECT FROM or LOAD.
  • When the maximum value is reached for the identity column, DB2 will cycle back to the beginning and reassign values, which might not be the desired approach.
If you can live with these caveats, then identity columns might be useful to your applications. However, in general, these "problems" make identity columns a very niche solution. IBM intends to rectify some of these problems in upcoming versions of DB2.

SEQUENCE Objects

But remember, DB2 has two methods of automatically generating sequential numbers. The first method is to define an identity column for the table; the second is to create a SEQUENCE object. A SEQUENCE object is a separate structure that generates sequential numbers.

New to DB2 V8, a SEQUENCE is a database object specifically created to generate sequential values. So, using a SEQUENCE object requires the creation of a separate database object; using an identity column does not.

A SEQUENCE object is created using the CREATE SEQUENCE statement.

When the SEQUENCE object is created it can be used by applications to “grab” the next sequential value for use in a table. SEQUENCE objects are ideal for generating sequential, unique numeric key values. A sequence can be accessed and incremented by many applications concurrently without the hot spots and performance degradation associated with other methods of generating sequential values.

Sequences are designed for efficiency and to be used by many users at the same time without causing performance problems. Multiple users can concurrently and efficiently access SEQUENCE objects because DB2 does not wait for a transaction to COMMIT before allowing the sequence to be incremented again by another transaction.

An example creating a SEQUENCE object follows:

CREATE SEQUENCE ACTNO_SEQ
AS SMALLINT
START WITH 1
INCREMENT BY 1
NO MAXVALUE
NO CYCLE
CACHE 10;

This creates the SEQUENCE object named ACTNO_SEQ. Now it can be used to generate a new sequential value, for example:

INSERT INTO DSN8810.ACT (ACTNO, ACTKWD, ACTDESC)
VALUES
(NEXT VALUE FOR ACTNO_SEQ, 'TEST', 'Test activity');


The NEXT VALUE FOR clause is known as a sequence expression. Coding the sequence expression causes DB2 to use the named SEQUENCE object to automatically generate the next value. You can also use a sequence expression to request the previous value that was generated. For example (SYSIBM.SYSDUMMY1 is used here because a singleton SELECT INTO requires a one-row result):

SELECT PREVIOUS VALUE FOR ACTNO_SEQ
INTO :IVAR
FROM SYSIBM.SYSDUMMY1;


As you can see, sequence expressions are not limited to INSERT statements, but can be used in UPDATE and SELECT statements, too.
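For example, an UPDATE that assigns a new value from the sequence might look like this:

UPDATE DSN8810.ACT
SET ACTNO = NEXT VALUE FOR ACTNO_SEQ
WHERE ACTKWD = 'TEST';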

Caution: If you specify the NEXT VALUE FOR clause more than once in the same SQL statement DB2 will return the same value for each NEXT VALUE FOR specification.

SEQUENCE Object Parameters

Similar to identity columns, a SEQUENCE object has parameters to control the starting point for the generated sequential values and the number by which the count is incremented. You can also specify the data type to be generated (the default is INTEGER), as well as a minimum value (MINVALUE) and a maximum value (MAXVALUE) if you wish to have further control over the values than is provided by the data type chosen.

Again, as with identity columns, you can specify how the SEQUENCE should handle running out of values when the maximum value is hit. Specifying the CYCLE keyword will cause the SEQUENCE object to wrap around and begin generating values from the minimum value all over again.
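For example, a sequence that cycles through a fixed range might be defined as follows (the name and range are illustrative):

CREATE SEQUENCE DAYNO_SEQ
AS SMALLINT
START WITH 1
INCREMENT BY 1
MINVALUE 1
MAXVALUE 366
CYCLE;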

A final consideration for SEQUENCE objects is caching. Sequence values can be cached in memory to facilitate better performance. The size of the cache specifies the number of sequence values that DB2 will pre-allocate in memory. In the previous example CACHE 10 indicates that ten sequence values will be generated and stored in memory for subsequent use. Of course, you can turn off caching by specifying NO CACHE. With caching turned off each new request for a sequence number will cause I/O to the DB2 Catalog (SYSIBM.SYSSEQUENCES) to generate the next sequential value.

SEQUENCE Object Guidelines

DB2 does not wait for an application that has incremented a sequence to commit before allowing the sequence to be incremented again by another application. Applications can use one sequence for many tables, or create multiple sequences, one for each table requiring generated key values. In either case, the applications control the relationship between the sequences and the tables.

The name of the SEQUENCE object indicates that we are going to use it to generate activity numbers (ACTNO), but its usage is not limited to that. Of course, failure to control the use of a SEQUENCE object can result in gaps in the sequential values. For example, if we use the ACTNO_SEQ object to generate a number for a different column, the next time we use it for ACTNO there will be a gap where we generated that number.


Other scenarios can cause gaps in a SEQUENCE, too. For example, issuing a ROLLBACK after acquiring a sequence number will not roll back the value of the sequence generator - so that value is lost. A DB2 failure can also cause gaps because cached sequence values will be lost.

Please note, too, that when sequences were introduced in non-mainframe DB2, syntax was supported that did not conform to the SQL standard. This non-standard syntax is supported on the mainframe as well:

  • NEXTVAL can be used in place of NEXT VALUE; and
  • PREVVAL can be used in place of PREVIOUS VALUE.

Choosing Between IDENTITY and SEQUENCE

Although both identity columns and SEQUENCE objects are useful for generating incremental numeric values, you will be confronted with situations where you will have to choose between the two. Consider the following criteria for when to use one instead of the other. Identity columns are useful when:

  • Only one column in a table requires automatically generated values
  • Each row requires a separate value
  • An automatic generator is desired for a primary key of a table
  • The LOAD utility is not used to load data into the table
  • The process of generating a new value is tied closely to inserting into a table, regardless of how the insert happens

SEQUENCE objects are useful when:

  • Values generated from one sequence are to be stored in more than one table
  • More than one column per table requires automatically generated values (multiple values may be generated for each row using the same sequence or more than one sequence)
  • The process of generating a new value is independent of any reference to a table

Unlike SEQUENCE objects, which are more flexible, identity columns must adhere to several rigid requirements. For example, an IDENTITY column is always defined on a single table, and each table can have at most one IDENTITY column. Furthermore, when you create an IDENTITY column, the data type for that column must be numeric; not so for sequences. If you used a SEQUENCE object to generate a value, you could put that generated value into a CHAR column, for example. Finally, when defining an IDENTITY column you cannot specify the DEFAULT clause, and the column is implicitly defined as NOT NULL. Remember, DB2 automatically generates the IDENTITY column’s value, so default values and nulls are not useful concepts.

Summary

Both identity columns and SEQUENCE objects can be used to automatically generate sequential values for DB2 columns. Prior to Version 8, identity columns were your only option. However, after you move to V8, SEQUENCE objects will provide more flexibility and be easier to use than the identity column option.

Happy sequential value generation with DB2!

Friday, September 15, 2006

The Path to an Executable DB2 Program

Here is another Q+A exchange that I thought might be useful to share with everyone here on the blog:

QUESTION: Could you please explain the difference between a package and a plan, the process of precompilation, compilation and running a COBOL DB2 program as well as a COBOL CICS DB2 program?

ANSWER: Well, I'll hit the highlights in response to your question but if you really want an in-depth answer then you should read the DB2 manuals (or a book like DB2 Developer's Guide).

What is a plan?

A plan is an executable module containing the access path logic produced by the DB2 optimizer. It can be composed of one or more DBRMs and packages. Before a DB2 for z/OS program (with static SQL) can be run, it must have a plan associated with it.

Plans are created by the BIND command. The plan is stored in the DB2 directory and accessed when its program is run. Information about the plan is stored in the DB2 catalog.

What is a package?

A package is a single, bound DBRM with optimized access paths. By using packages, the table access logic is "packaged" at a lower level of granularity than a plan -- at the package (or program) level.

To execute a package, you first must include it in the package list of a plan (usually, that is; there are some exceptions, such as triggers and user-defined functions). In general, packages are not directly executed; they are only indirectly executed when the plan in which they are contained executes -- as discussed previously, UDFs and triggers are exceptions to this rule. A plan can consist of one or more DBRMs, one or more packages, or a combination of packages and DBRMs.

What is program preparation?

A DB2 application program must go through a process known as program preparation before it can run successfully. Program preparation requires a series of code preprocessors that, when executed in the proper sequence, create an executable load module and a DB2 application plan. The combination of the executable load module and the application plan is required before any DB2 program can be run, whether batch or online. CICS programs require an additional preprocessing step.

The first step is precompilation. DB2 programs must be parsed and modified before normal compilation. The DB2 precompiler performs this task. At a high level, the precompiler basically searches for SQL, strips it out of the program and puts it into a DBRM. The output of precompilation is a DBRM with SQL and a modified source program.

The DBRM is bound using the BIND command, which is a type of compiler for SQL statements. In general, BIND reads SQL statements from DBRMs and produces a mechanism to access data as directed by the SQL statements being bound.
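A sketch of typical BIND subcommands, as issued through the DSN command processor (all of the names are illustrative):

DSN SYSTEM(DSN)
BIND PACKAGE(MYCOLL) MEMBER(MYPROG) ACTION(REPLACE) ISOLATION(CS)
BIND PLAN(MYPLAN) PKLIST(MYCOLL.*) ACTION(REPLACE)
END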

The modified source is compiled using the language compiler of your choice. The compiled source then is link-edited to an executable load module. The appropriate DB2 host language interface module also must be included by the link edit step. This interface module is based on the environment (TSO, CICS, or IMS/TM) in which the program will execute.

At the end of this series of steps, if all goes well, you will have an executable DB2 program.

Wednesday, September 06, 2006

Help for SAP Shops Using DB2 for z/OS

Just a quick blog post today to alert folks to a relatively new blog focusing entirely on SAP usage with DB2 for z/OS. According to the blogger, Omer Brandis:

Whether you have already implemented SAP on DB2 for z/OS, or are just seriously considering it, this is the blog for you. This blog will deal with real issues from the point of view of a true real-world professional, discussing the good, bad, and ugly of SAP on DB2 for z/OS.

Recent posts have covered offsite recovery, reorganizing SAP OFFICE, and hidden REORG jobs. If you use SAP and DB2 on the mainframe, be sure to check back in on this blog on a regular basis (no pun intended).

Friday, September 01, 2006

The Two Biggest DB2 Performance Things

DB2 performance is one of those perennial topics that people just can't seem to get enough of. I guess that is because the performance of applications is one of the bigger issues that end users complain about. And DBAs can be heroes if they can resolve performance problems quickly. It also could be that performance problems are so ubiquitous because people keep on making the same design and coding mistakes...

With all of that in mind, let's take a look at what I think are the two biggest things you need to get control of to keep DB2 and SQL performance in check.

(1) Keep DB2 statistics up-to-date

Without the statistics stored in the DB2 system catalog, the optimizer will have a difficult time optimizing anything. These statistics provide the optimizer with information pertinent to the state of the tables that the SQL statement being optimized will access. The types of statistical information stored in the system catalog include:
  • Information about tables, including the total number of rows, information about compression, and total number of pages;
  • Information about columns, including number of discrete values for the column and the distribution range of values stored in the column;
  • Information about table spaces, including the number of active pages;
  • Current status of the index, including whether an index exists, the organization of the index (number of leaf pages and number of levels), the number of discrete values for the index key, and whether the index is clustered;
  • Information about the table space and index node groups or partitions.
Statistics populate the DB2 system catalog when you execute the RUNSTATS utility. Generally, this utility is invoked in batch jobs. Be sure to work with your DBA to ensure you accumulate statistics at the appropriate time, especially in a production environment.
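A minimal RUNSTATS sketch for a single table space (the database and table space names are illustrative):

RUNSTATS TABLESPACE MYDB.MYTS
TABLE(ALL)
INDEX(ALL)
SHRLEVEL REFERENCE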

(2) Build appropriate indexes

Perhaps the most important thing you can do to assure optimal DB2 application performance is create correct indexes for your tables based on the queries your applications use. Of course, this is easier said than done. But we can start with some basics. For example, consider this SQL statement:

SELECT LASTNAME, SALARY
FROM EMP
WHERE EMPNO = '000010'
AND DEPTNO = 'D01' ;


What index or indexes would make sense for this simple query? First, think about all the possible indexes that you could create. Your first short list probably looks something like this:

  • Index1 on EMPNO
  • Index2 on DEPTNO
  • Index3 on EMPNO and DEPTNO
This is a good start, and Index3 is probably the best of the lot. It lets DB2 use the index to immediately look up the row or rows that satisfy the two simple predicates in the WHERE clause. Of course, if you already have a lot of indexes on the EMP table, you might want to examine the impact of creating yet another index on the table. Factors to consider include:

Modification impact: DB2 will automatically maintain every index you create. This means every INSERT and every DELETE to this table will insert and delete not just from the table, but also from its indexes. And if you UPDATE the value of a column that is in an index, you also update the index. So, indexes speed the process of retrieval but slow down modification.

Columns in the existing indexes: If an index already exists on EMPNO or DEPTNO, it might not be wise to create another index on the combination. However, it might make sense to change the other index to add the missing column. But not always, because the order of the columns in the index can make a big difference depending on the query. For example, consider this query:

SELECT LASTNAME, SALARY
FROM EMP
WHERE EMPNO = '000010'
AND DEPTNO > 'D01';

In this case, EMPNO should be listed first in the index and DEPTNO second, allowing DB2 to do a direct index lookup on the first column (EMPNO) and then a scan on the second (DEPTNO) for the greater-than predicate.
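A sketch of that index (the name is illustrative):

CREATE INDEX XEMP1
ON EMP (EMPNO, DEPTNO);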

Furthermore, if indexes already exist for both columns (one for EMPNO and one for DEPTNO), DB2 can use them both to satisfy this query so creating another index might not be necessary.

Importance of this particular query: The more important the query, the more you might want to tune by index creation. If you are coding a query that the CIO will run every day, you want to make sure it delivers optimal performance. So building indexes for that particular query is important. On the other hand, a query for a clerk might not necessarily be weighted as high, so that query might have to make do with the indexes that already exist. Of course, the decision depends on the application's importance to the business, not just on the user's importance.

Index design involves much more than I have covered so far. For example, you might consider index overloading to achieve index-only access. If all the data that a SQL query asks for is contained in the index, DB2 might be able to satisfy the request using only the index. Consider our previous SQL statement. We asked for LASTNAME and SALARY, given information about EMPNO and DEPTNO. And we also started by creating an index on the EMPNO and DEPTNO columns. If we include LASTNAME and SALARY in the index as well, we never need to access the EMP table because all the data we need exists in the index. This technique can significantly improve performance because it cuts down on the number of I/O requests.
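An overloaded version of our index might be sketched like this (the name is illustrative):

CREATE INDEX XEMP2
ON EMP (EMPNO, DEPTNO, LASTNAME, SALARY);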

Keep in mind that making every query an index-only access is not prudent or even possible. You should save this technique for particularly troublesome or important SQL statements.

Summary

If you are just embarking on your journey into the wonderful world of DB2 performance management, please, start with the two items covered in this short blog entry. I've just scratched the surface of both areas and you can benefit by additional research and education in both statistics gathering and index design. And if you are a long-time DB2 professional, it can't hurt to bone up on these topics either. You might learn about some newer DB2 feature or function that you haven't used yet, or maybe just strengthen what you already know.

So what do you think? Are these the two most important DB2 performance things, or do you think something else is more important? Post your comments below...

Thursday, August 31, 2006

Accessing Partitioned Data

One area that tends to confuse some DB2 developers until they gain experience is how DB2 partitioning works. A common question I get goes something like this: “If a table is in a partitioned table space with, say, four partitions, then to process the table in batch can I run four instances of the batch program in parallel, one against each partition? What do I code to do this?”

Well, the short and sweet answer to this question is “Yes, you can run four instances of a batch program in parallel if you so desire.” But there is a nuance to this misconception that might be missed here. The question lurking beneath the question is this: “How can I make sure I am accessing only data in one partition in each of the batch programs?”

To do this requires some programming work. The program will need to have a means of identifying which partition it should run against. So, you might code the program to accept an input parameter of 1, 2, 3, or 4. The program would read the parameter and translate it into the key range of values that should be read by the program. This is the LIMITKEY value for the particular partition as found in the partitioning index. You can retrieve this value from the DB2 catalog using the following query:

SELECT PARTITION, LIMITKEY
FROM SYSIBM.SYSINDEXPART
WHERE IXNAME = ?
AND IXCREATOR = ?
ORDER BY PARTITION;

Supply the index name and creator and this query will return the partition number and LIMITKEY for that partition. (If you include this query in the program you probably will want to include the PARTITION column in the WHERE clause and return only a single row.) The LIMITKEY is the high key value for that partition. Using this information, you will need to write the queries in the program such that only values from the partition being processed will be retrieved. As long as the program adheres to that key range, you should only process data from the one partition that holds that data.
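For example, assuming host variables :LOWKEY (holding the LIMITKEY of the previous partition, or the lowest possible key for partition 1) and :HIGHKEY (holding the LIMITKEY of the partition being processed), the program's cursor might be declared along these lines (the table and column names are illustrative):

DECLARE PARTCSR CURSOR FOR
SELECT COL1, COL2
FROM MY_TABLE
WHERE KEYCOL > :LOWKEY
AND KEYCOL <= :HIGHKEY;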

Of course, none of this is necessary to get DB2 to operate in parallel. The best approach uses DB2 query parallelism because it minimizes the amount of programming work and has the potential to maximize performance. To signal DB2 to turn on parallelism you will need to BIND your program specifying the DEGREE(ANY) parameter. Then DB2 will choose the degree of parallelism for the batch program. You will only need to run the program once (instead of 4 times as in our example); DB2 will figure out how many tasks it can run in parallel for each query in the program. And there is no need to modify the program at all! This is far simpler than any alternate approach because DB2 handles chunking up the work into parallel tasks for you.

Tuesday, August 29, 2006

How are Indexes Being Used?

In keeping with my promise to periodically post blog entries based on questions I have received, here we have another question I have been asked:

If I have five indexes on a table, what is the best way to determine if all, some or none of the indexes are being used?

Here is the answer I sent:

The best approach would be to make sure you have run EXPLAIN on all of your production plans and packages. Then examine the PLAN_TABLE output for those indexes. For example:

SELECT *
FROM my.PLAN_TABLE
WHERE ACCESSNAME IN (put your list of indexes here);
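To summarize how often each index is chosen, a variation such as the following might help (a sketch; ACCESSTYPE values beginning with 'I' indicate index access, and other values, such as 'N' and 'MX', also involve indexes):

SELECT ACCESSNAME, COUNT(*)
FROM my.PLAN_TABLE
WHERE ACCESSTYPE LIKE 'I%'
GROUP BY ACCESSNAME
ORDER BY 2 DESC;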

Of course, this will not show the dynamic SQL that uses any of these indexes. There are tools on the market that can help you to gather SQL usage statistics and access path information for both static and dynamic SQL statements.

You might consider acquiring one of these tools if you use a lot of dynamic SQL. One such offering is NEON Enterprise Software’s SQL Performance Expert solution.

Thursday, August 24, 2006

VARCHAR versus Compression

A couple of days ago I posted a blurb giving advice on using variable character columns in DB2. After thinking about the topic a little bit more, I decided to post a follow-on topic: namely, comparing the use of VARCHAR to DB2 compression.

Even though these are two entirely different "things," they are each probably done for similar reasons - to save disk storage. VARCHAR does this by adjusting the size of the column to fit the actual length of text being stored; compression does this by sending rows of data through an algorithm to minimize their length. For those interested in the details of compression, I refer you to Willie Favero's excellent blog, where he has written a three-part series on compression -- here are the links to it: part one, part two, and part three.

So, what advice can I give on comparing the two? Well, you might want to consider forgoing the use of variable columns and instead turn on compression. With variable columns you always add overhead: there is a two-byte prefix for every VARCHAR column to store the length of the VARCHAR. If instead you use CHAR and turn on compression you no longer need the extra two bytes per row per variable column.

Also, keep in mind that compression reduces the size of the entire row. So not only will you be compressing the CHAR column (that used to be VARCHAR), but you will also give DB2 the opportunity to compress every other column in that row.

All in all, that means that compression can return better disk storage savings than variable columns, and all without the programmatic overhead of having to calculate and store the two-byte prefix for each previously variable column.
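Compression is specified at the table space level. A minimal sketch (the names and storage group are illustrative):

CREATE TABLESPACE MYTS IN MYDB
USING STOGROUP MYSTOGRP
COMPRESS YES;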

Of course, I don't want to give the impression that this should always be done... (remember the DBA's maxim: Almost never say "always or never.") And there are additional things to consider, such as:
  • Compression adds a compression dictionary to the table space so a compressed table space can actually be larger than a non-compressed table space (if it is very small to begin with).
  • Compression requires additional CPU cycles to compress and de-compress the data as it is inserted, modified, and read (of course, I/O can decrease because smaller rows will fit more on each page, so degraded CPU performance can be offset by improved I/O)

This is just an additional "thing to consider" when you are building your DB2 databases and trying to decide whether you should use VARCHAR or CHAR...

Monday, August 21, 2006

IBM Mainframes - Not Just for Big Shops Any More

Just a quick blog today to point you to an interesting article in the latest issue of IBM Systems Magazine - Mainframe Edition. The article, titled A New System for a New Market, points out that the System z9 Business Class (z9 BC) platform, the latest mainframe in IBM's product line announced in April 2006, is suitable for the small and medium business (SMB) space.

This offering brings high performance and scalability to the SMB market at a very reasonable cost (around $100k). With specialty engines that can be added (IFL, zIIP and zAAP), again at a reasonable cost, it looks like the IBM mainframe will not only remain viable for large shops, but it could expand out into smaller ones, too.

So, as most mainframe aficionados know, the mainframe is not dead. But it may actually be able to grow with the new features and affordability being built into IBM's new mainframes.

Sunday, August 20, 2006

Advice on Using Variable Character Columns in DB2

One of the long-standing, troubling questions in DB2-land is when to use VARCHAR versus CHAR. The high-level advice for when to use VARCHAR instead of CHAR is for larger columns whose length varies considerably from row-to-row. Basically, VARCHAR should be used to save space in the database when your values are truly variable.

In other words, if you have a 10-byte column, it is probably not a good idea to make it variable... unless, of course, 90% of the values are only one or two bytes, then it might make some sense. Have you gotten the idea here that I'm not going to give any hard and fast rules? Hope so, cause I won't - just high-level guidance.

Another situation: say you have an 80 byte column where values range from 10 bytes to the full 80 bytes... and more than 50% of them are less than 60 bytes. Well, that sounds like a possible candidate for VARCHAR to me.

Of course, there are other considerations. Java programmers tend to prefer variable character columns because Java does not have a native fixed length character data type.

For traditional programming languages though, CHAR is preferred because VARCHAR requires additional programmatic handling (to set the length of each column when inserting or modifying the data).

OK, so what if you are trying to determine whether the appropriate decision was made when VARCHAR was chosen instead of CHAR? You can use information from the DB2 Catalog to get a handle on the actual sizes of each VARCHAR column.

Using views and SQL it is possible to develop a report showing the lengths of the variable column values. First, determine which VARCHAR column you need information about. For the purposes of this example, let's examine the NAME column of SYSIBM.SYSTABLES. This column is defined as VARCHAR(18). Create a view that returns the length of the NAME column for every row, for example:

CREATE VIEW LENGTH_INFO
(COL_LGTH)
AS
SELECT LENGTH(NAME)
FROM SYSIBM.SYSTABLES;

Then, issue the following query using SPUFI to produce a report detailing the LENGTH and number of occurrences for that length:

SELECT COL_LGTH, COUNT(*)
FROM LENGTH_INFO
GROUP BY COL_LGTH
ORDER BY COL_LGTH;

This query will produce a report listing the lengths (in this case, from 1 to 18, excluding those lengths which do not occur) and the number of times that each length occurs in the table. These results can be analyzed to determine the range of lengths stored within the variable column. If you are not concerned about this level of detail, the following query can be used instead to summarize the space characteristics of the variable column in question:

SELECT 18*COUNT(*),                       -- space used as CHAR(18)
SUM(2+LENGTH(NAME)),                      -- space used as VARCHAR(18)
18*COUNT(*)-SUM(2+LENGTH(NAME)),          -- total space saved
18,                                       -- average space as CHAR(18)
AVG(2+LENGTH(NAME)),                      -- average space as VARCHAR(18)
18-AVG(2+LENGTH(NAME))                    -- average space saved
FROM SYSIBM.SYSTABLES;

The constant 18 will need to be changed in the query to indicate the maximum length of the variable column as defined in the DDL. This query will produce a report such as the one shown below:

SPACE SPACE TOTAL AVERAGE AVERAGE AVERAGE
USED AS USED AS SPACE SPACE AS SPACE AS SPACE
CHAR(18) VARCHAR(18) SAVED CHAR(18) VARCHAR(18) SAVED
--------- ----------- ------ -------- ----------- -------
158058 96515 61543 18 10 8



This information can then be analyzed to determine if the appropriate decision was made when VARCHAR was chosen. (Of course, the values returned will differ based on your environment and the column(s) that you choose to analyze.) Also, keep in mind that the length distribution report does not include the 2-byte prefix stored by DB2 for variable length columns; the summary query above accounts for it with its 2+LENGTH(NAME) expressions.

I hope this high-level overview with advice on when to use VARCHAR versus CHAR has been helpful. If you have your own guidelines or queries that you use please feel free to post a comment to this blog and share them with everyone.



NOTE: You could skip the creation of the VIEW in the above query and just use a nested table expression (aka in-line view) instead.
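For example, the detail report could be produced in a single statement like this:

SELECT COL_LGTH, COUNT(*)
FROM (SELECT LENGTH(NAME) AS COL_LGTH
      FROM SYSIBM.SYSTABLES) AS T
GROUP BY COL_LGTH
ORDER BY COL_LGTH;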

Thursday, August 17, 2006

Greatest Software Ever?

I just stumbled across a very interesting article this afternoon and thought I'd share it with everybody through my blog. The article, published in Information Week is titled What's The Greatest Software Ever Written? And isn't that an intriguing question?

Well, I read through the article and other than a few quibbles here and there I'd have to say that the author did a good job of assembling his list. He spends quite a bit of time talking about the IBM 360 project - and well he should. This was one of the first truly huge software projects and it set the bar for what is expected of an operating system. It also was the catalyst for causing one of the best ever books on software development to be written - The Mythical Man Month. Written by Fred Brooks, the manager in charge of the IBM 360 project, this book outlines many of the truisms about software development that we acknowledge even today - more than 40 years later. If you work in IT and you haven't read The Mythical Man Month you really should buy a copy and read it immediately. Anyway, this blog isn't about that book, so let's move on.

I won't spoil it here and publish the list of greatest software - you will have to click on the link for the article and read it yourself (the actual list doesn't start until page 3 of the article, but don't just click right over to that page, read the whole thing).

Suffice it to say, several IBM projects make the list (I'm kinda partial to what came in at #2 -- it would've been my #1 actually). And I think perhaps that VisiCalc belongs on the list instead of the spreadsheet software that is listed - I mean, Dan Bricklin invented the entire spreadsheet category of software when Software Arts published VisiCalc back in the late 1970s.

But the article is good anyway and I'm sure it is almost impossible to publish a list like this without causing disagreement - and perhaps that is its intent any way. So take a moment and click over to the article and give it a read. And feel free to share your thoughts on it here by posting a comment or two.

Thursday, August 10, 2006

SHARE Travelers Take Heed

With the upcoming SHARE conference in Baltimore next week, there are sure to be many of you out there who will be traveling to the nation's capital region over the weekend. As you prepare to travel, be sure to factor in additional time at the airport due to the latest TSA warning.

Basically, in response to a recently thwarted terrorist plot in the UK, the threat level has been raised to High (or Orange) for all commercial aviation operating in or destined for the United States. That means the lines will be longer and the searches more thorough going through security at the airport.

Additionally, please read the TSA announcement and heed what it is saying. I am referring specifically to this: "Due to the nature of the threat revealed by this investigation, we are prohibiting any liquids, including beverages, hair gels, and lotions from being carried on the airplane." Please, for everyone's sake, leave your liquids at home:
  • You can get a drink after you pass through security.
  • Every hotel provides shampoo, conditioner, and lotion for free, so you don't need to bring them.
  • If you absolutely have to have your favorite brand, or some gel or spray, pack it in your checked bags.
And yes, please check your dang luggage! Although I am sometimes amused by idiots trying to jam a huge bag into the overhead bin, it becomes less amusing after a two-hour amble through security. If you have a large bag, check it!

And I'll see you all in Baltimore.

Wednesday, August 09, 2006

Where exactly is a DB2 plan stored?

The title of this posting is a question I received a while ago. As I promised earlier on this blog, I will periodically post the answers I have given to e-mailed questions. So, here goes:

A DB2 "plan" is stored in the DB2 Directory and information about the plan is stored in the DB2 Catalog.

The DB2 Directory table that contains actual plans is SYSIBM.SCT02 (and SYSIBM.SPT01 contains actual packages). The plan is stored as an internal structure called a cursor table; packages are stored as package tables. As DB2 runs application programs, it loads the cursor tables for plans and package tables for packages from the DB2 Directory tables into the EDM Pool. This is where the "access path" code that determines how to get the actual DB2 data resides.

There is also metadata about plans and packages that you might find useful. This information includes data about the state, privileges, isolation level, release specification, and so on. The DB2 Catalog contains information about plans in the following tables:

  • SYSIBM.SYSDBRM
  • SYSIBM.SYSPLAN
  • SYSIBM.SYSPLANAUTH
  • SYSIBM.SYSPLANDEP
  • SYSIBM.SYSSTMT

And, the DB2 Catalog contains information about packages in the following tables:

  • SYSIBM.SYSPACKAGE
  • SYSIBM.SYSPACKAUTH
  • SYSIBM.SYSPACKDEP
  • SYSIBM.SYSPACKLIST
  • SYSIBM.SYSPACKSTMT
  • SYSIBM.SYSPKSYSTEM
  • SYSIBM.SYSPLSYSTEM

Tuesday, August 08, 2006

Mainframe Weekly: A new mainframe-focused blog

Mainframe Weekly is a new blog featuring the insights of Trevor Eddolls. Trevor is an editor who has worked for Xephon for some time. Xephon publishes those "Update" journals - you know the ones, DB2 Update, CICS Update, etc. The ones that are full of content and don't accept any ads.

I've had the pleasure of writing for DB2 Update and working with Trevor for a number of years now, and I look forward to regularly reading his new blog. Recent entries there have covered CICS, DB2 and Viper, and storage technology.

Do yourself a favor and be sure to check in on Trevor's blog on a regular basis.

Monday, August 07, 2006

Upcoming SHARE Conference

Just a quick posting to remind everyone that the twice-yearly SHARE conference is coming up next week in Baltimore, MD. For those of you who don't know about SHARE: the conference has offered education to IT professionals, improving the value of IT to businesses, and advancing careers since 1955 - the longest history of any IT user group.

SHARE in Baltimore runs from August 13-18, 2006 and, as usual, there will be a plethora of useful and educational sessions on multiple IT topics. According to the folks at SHARE there will be over 800 technical sessions to choose from. Session topics range from implementation of new software releases to in-depth looks at what the latest new technologies can do for your business.

I'll be there, and will be delivering two presentations on DB2. On Monday, August 14, (3:00 - 4:00 PM) I'll give an hour of DB2 for z/OS tips and techniques... and then on Tuesday, August 15, (8:00 - 9:00 AM) I'll be giving an overview of DB2 for z/OS performance tuning.

Hope to see you there!

Wednesday, August 02, 2006

New Mainframe Redbook

IBM published a new mainframe redbook this week that everyone who works with mainframe technology should download and have handy. It is titled Introduction to the New Mainframe: z/OS Basics and it offers a very nice high-level overview of mainframe technology circa 2006.

Of course, if you are a long-time mainframer much of this redbook may be too high level for you. But it will be just the right thing to have at your disposal when someone drops by to ask a basic question. Just e-mail them a copy of the book.

And even long-time experts still might want to at least leaf through it. It is divided into four broad sections, as follows:
  • Part 1. Introduction to z/OS and the mainframe
  • Part 2. Application programming on z/OS
  • Part 3. Online workloads for z/OS
  • Part 4. System programming on z/OS
So maybe you are a crackerjack programmer, but would like to know more about system programming. Maybe your online knowledge is skimpy. Or maybe you just want a nice overview of everything. Then this book is just the ticket!

Here is a brief synopsis out of the Preface of the redbook:

This IBM Redbook provides students of information systems technology with the background knowledge and skills necessary to begin using the basic facilities of a mainframe computer. It is the first in a planned series of textbooks designed to introduce students to mainframe concepts and help prepare them for a career in large systems computing.

This textbook can also be used as a prerequisite for courses in advanced topics or for internships and special studies. It is not intended to be a complete text covering all aspects of mainframe operation, nor is it a reference book that discusses every feature and option of the mainframe facilities.

Others who will benefit from this course include experienced data processing professionals who have worked with non-mainframe platforms, or who are familiar with some aspects of the mainframe but want to become knowledgeable with other facilities and benefits of the mainframe environment.

So download Introduction to the New Mainframe: z/OS Basics today...

Monday, July 31, 2006

Network World: Mainframes Still Relevant

Just finished reading a great new article at NetworkWorld.com called Working on mainframes not just for old fogies. Well, that is sure good to know. I may be getting older, but I sure don't think of myself as one of those "old fogies" yet... and I still work on mainframes.

The article talks about the continuing relevance of mainframe computing in a novel way -- by talking to several young mainframe newbies. Yes, there are twenty-somethings out there who are working on mainframes, they are just hard to find. And the article makes the point that all of us in the mainframe world know -- we need more young 'uns to learn the mainframe.

The article goes on to point out some interesting mainframe statistics from the industry analysts. According to Gartner, "large mainframe users have been increasing their mainframe environments for years." They say that installed MIPS will continue to grow at a CAGR of 15 to 20 percent through 2009. And the analysts at IDC seem to agree, with 2006 MIPS shipments up 14.2 percent.

If you work on mainframes be sure to click over and give the article a read. To me, anyway, it is invigorating to hear about young people embracing the mainframe. And the more younger people who learn about mainframe computing, the stronger the platform becomes... and that is good, too.

Friday, July 28, 2006

IBM Software Support Toolbar

It seems like everyone under the sun is offering a toolbar to add in to the Internet Explorer browser. Google has one, Yahoo has one, and so do countless others. Now, IBM has one, too.

The new IBM Software Support Toolbar plugs into your browser to allow you to search IBM's Support knowledge base using keywords, specific error codes or exact phrases. You can also use it to search or browse for product downloads including Fix Packs, Updates, Patches, etc. So if you have trouble remembering how to get to the IBM site for support, or just want a quicker way to get there, the IBM Software Support Toolbar could prove itself to be quite helpful.

The toolbar even allows you to seek specific results for a specific brand. So you can scroll to the "Information Management" brand and select from that sub-menu. That way you'd only be looking for the "database stuff"...

Wednesday, July 26, 2006

How and when to use DB2 scrollable cursors

As I've mentioned here before, sometimes I will use this blog to post answers to questions that have been sent to me. The question being answered today is this: I have a select query that returns 1000 rows and I want to display the rows from 200 to 500. Could you please let me know how to code after the where clause? I know how to get the first 100 rows using the FETCH FIRST 100 ROWS ONLY clause.

If you are using DB2 V7 or higher, consider using scrollable cursors. With scrollable cursors, you can move directly to the rows you want without having to FETCH every other row returned by the cursor.

In DB2 V7, scrollable cursors require you to use declared temporary tables, another new feature of DB2 Version 7. If you are using DB2 V8, dynamic scrollable cursors are available and thus temporary tables are not required.

In V7, DB2 uses a declared temporary table to hold and maintain the data returned by a scrollable cursor. Scrollable cursors allow developers to move through the results of a query in multiple ways. The following keywords are supported when fetching data from a scrollable cursor:
  • NEXT - will FETCH the next row, the same way that the pre-V7 FETCH statement functioned
  • PRIOR - will FETCH the previous row
  • FIRST - will FETCH the first row in the results set
  • LAST - will FETCH the last row in the results set
  • CURRENT - will re-FETCH the current row from the result set
  • BEFORE - positions the cursor before the first row of the results set
  • AFTER - positions the cursor after the last row of the results set
  • ABSOLUTE n - will FETCH the row that is n rows away from the first row in the results set
  • RELATIVE n - will FETCH the row that is n rows away from the last row fetched

For both ABSOLUTE and RELATIVE, the number n must be an integer. It can be either a positive or a negative number and it can be represented as a numeric constant or as a host variable. All of the FETCH options for scrollable cursors also reposition the cursor before fetching the data. For example, let's consider your problem of a cursor that returns 1000 rows, but you only want rows 200 through 500.

Consider the following cursor logic:

DECLARE csr1 SENSITIVE STATIC SCROLL CURSOR
FOR SELECT FIRSTNME, LASTNAME
FROM DSN8710.EMP
ORDER BY LASTNAME
FETCH FIRST 1000 ROWS ONLY;

OPEN csr1;

FETCH ABSOLUTE 200 csr1 INTO :FN, :LN;

I used the FETCH FIRST 1000 ROWS ONLY clause to ensure that no more than 1,000 rows were returned. This clause is, of course, optional (and if not specified, DB2 will not limit the result set returned by the cursor). Then I open the cursor and FETCH row 200. This positions the cursor just after the 200th result row. After that, all you would need is a loop that issues FETCH NEXT 300 times, as sketched below; combined with the row already fetched, that retrieves exactly rows 200 through 500.
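The loop body would simply be the following FETCH (a sketch; the host language controls iterating 300 times):

FETCH NEXT csr1 INTO :FN, :LN;    -- each execution returns the next row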

Basically, scrollable cursors reduce the amount of time and effort required to move backward and forward through the results of SQL queries. But as helpful as scrollable cursors are, do not make every cursor a scrollable cursor. Scrollable cursors require substantially more overhead than a traditional, non-scrollable cursor. Analyze the requirements of your applications and deploy scrollable cursors only where it makes sense to do so. Also, be sure to discuss this with your DBAs before implementing as there will probably be some setup work required of the DBA group to facilitate this solution.

Good luck...

Tuesday, July 25, 2006

Free DB2 Statistics Health Check Software

SEGUS, Inc. is offering free software for checking the health of the statistics in your DB2 Catalog.

Statistics HealthCheck for DB2 z/OS enables you to quickly and effectively analyze and judge the quality of your DB2 Catalog statistics. Checking the health of your DB2 subsystem is especially important considering the heightened sensitivity of DB2 V8 and V9 to bad statistics.

The extensive rule system used by Statistics HealthCheck is based on IBM’s own recommendations for maintaining good statistics. Using a violation system, Statistics HealthCheck pinpoints precisely those objects that could benefit from a RUNSTATS utility health check—or that otherwise require statistics housekeeping.

Statistics HealthCheck is particularly useful as a prerequisite to Bind ImpactExpert during a V8 migration. First, Statistics HealthCheck identifies the RUNSTATS you need. Then, Bind ImpactExpert provides REBIND insurance to further guarantee consistent and improved access paths.

With Statistics HealthCheck, you know exactly what is “wrong” with your statistics so that you can proactively correct any problems that might negatively affect DB2 subsystem performance.

And come on, it is free after all. What do you have to lose by downloading it and trying it on your DB2 subsystems today?

--------------

Just a quick note to explain that since the links in this blog posting were no longer valid, I modified them. This offering is now being provided by SEGUS, and it is still a free download. The statistics health check software can be downloaded at the following link:

http://www.segus.com/index.php/productdetails/index/en_product_014_Statistics_HealthCheck

Friday, July 07, 2006

New Red Paper on DB2 and Storage

Disk storage has changed rapidly over the past few years with the delivery of new functions and improved performance. DB2 has made many changes to keep pace and make use of the disk improvements. In fact, there is a new "red paper" that outlines these changes titled Disk storage access with DB2 for z/OS.

A red paper is similar to a redbook, but it is shorter (in length) and focuses on a more limited subject matter. They are kind of in between a white paper and a redbook. But they are still free to download - and they are easier to digest quickly than a redbook.

If you don't know what a redbook is, check out my earlier blog posting on the subject: Can You Write a Redbook?

Thursday, July 06, 2006

"Messages & Codes" Now "Messages" and "Codes"

Has anyone noticed? What used to be a single IBM DB2 manual is now two separate manuals. IBM has split the Messages & Codes manual into one manual for Messages and a separate manual for Codes. The links in the previous sentence take you directly to the web versions of the manuals. If you prefer to download PDF documents, use this link for DB2 V8.

So, if you are looking for the meaning of a DSN message you would use the Messages manual; if you are looking for a SQLCODE or SQLSTATE, use the Codes manual. I kinda liked them both in one manual, but I guess the combined manual was getting a bit too big to manage...

Also, if you haven't used it already you might want to become familiar with LookAt. LookAt is an online facility for displaying explanations for most IBM messages, as well as for some system abends and codes.

You can use LookAt on the web at: www.ibm.com/eserver/zseries/zos/bkserv/lookat/

Or, you can use it from anywhere you can access a TSO/E command line. Of course, to use LookAt as a TSO/E command, LookAt must first be installed on your system. You can get the LookAt code via ftp at ftp.software.ibm.com/ps/products/ibmreader/tools/lookat/ZOS/

Wednesday, July 05, 2006

Free DB2 Webinars

Well, everyone in the US should be recovered from the 4th of July holiday by now and be back to work... unless you took a vacation, in which case, you'll be reading this later. It rained here in Texas on the 4th, but that didn't stop the barbecues. Hope your 4th was relaxing (and for my international readers, I hope you'll forgive the brief discussion of a US holiday)...

Anyway, I wanted to take this opportunity during this holiday week to point you to a couple of pre-recorded webinars that I conducted earlier this year. Both are available to be streamed free-of-charge over the Internet.

The first one is titled Managing Common DB2 Performance Problems . In it I discuss some of the more common performance issues with DB2 -- and I also offer some guidance on how to manage DB2 performance at your site. Every DBA and data management professional knows that database performance is an on-going mission, and if you pick up tip or two in this session your time will be well spent.

The second webinar is titled Using Real Time Statistics to Improve DB2 Administration. Even though the Real Time Statistics (RTS) feature was delivered in DB2 Version 7, the number of organizations using them is still slim. This is unfortunate because RTS can be extremely beneficial in terms of analyzing and automating your DB2 maintenance tasks. This webcast offers an overview of RTS, including discussions on implementing them, integrating them with traditional statistics, and using them to help automate utility processing.

These webinars were sponsored by NEON Enterprise Software - and they offer some good tools for managing DB2 performance, automating DB2 administration, and managing changing DB2 access paths and BINDs.

Finally, you might want to hear what Roger Miller has to say about the next version of DB2 for the mainframe -- DB2 for z/OS Version 9. You can listen to a free replay of his webinar outlining this new version here. This webcast talks about many of the new features of the upcoming V9 release of DB2 for z/OS and it offers a great opportunity to learn from Roger - that jack of all trades and master of several...


NOTE
As of late 2011, the webinars referenced in this blog post are no longer available for replay.

Tuesday, June 20, 2006

The Scoop on z/OS v1.8

Earlier this year IBM announced details of the next version of z/OS and z/OS.e - Version 1.8. It seems to me that the hallmark of this new version is availability. There will be new features that enable more granular options for fast replication of data and improved recoverability options in a sysplex. And logical support for up to 4 terabytes of real storage that can be exploited by a single LPAR or z/OS image is a big availability improvement.

What else? Well, there are a lot of improvements being offered in z/OS V1.8. For example, it provides improved XML support with XML System Services, offering the ability to parse and process XML documents. RACF will support pass phrases - that is, passwords over 8 characters. And the Unicode 4.0 standard is supported. Of course, these are just a few of the improvements.

But why am I writing about this now? Well, v1.8 is planned for general availability in September 2006 - so it can't hurt to do some up-front planning now in order to be prepared to move to the new version of the operating system in a couple of months.

For those looking for some additional information, there is an interesting article in IBM Systems Magazine called z/OS v.1.8: More of the Same, and That's a Good Thing by Jim Schesvold that offers a quick synopsis of the z/OS and z/OS.e V1.8 announcement.

Saturday, June 03, 2006

A New Online DB2 Community

Just a quick posting today to inform my readers of a new, online DB2 community that I stumbled across while surfing the web. It is called DB2 Noise and it is an independent community -- meaning it is not affiliated with IBM or any other IT vendor.

The site, at http://www.db2noise.com/main/, is designed to provide DB2 professionals a meeting place to help others, learn and encourage the effective use and development of the DB2 product range.

Check it out when you get a chance.

Thursday, May 25, 2006

IBM Announces Their CMDB

IBM announced that their Change and Configuration Management Database (CCMDB) will be shipped on June 30, 2006 -- along with additional process management tools. The IBM CCMDB is billed as a command center for the automatic discovery of IT information on servers, applications, storage and network devices and software across an enterprise. To hear more about IBM's news read this Q+A with Al Zollar, general manager for Tivoli Software at IBM, conducted by Computerworld.

Why is this interesting? Well, for one, it seems like IBM has a good handle on the actual problem, linking it to master data management. And IBM's system software is typically top notch. But to take it up a few levels, why should anyone truly adopt a CMDB? Well, according to a recent article in Network World (CMDB adoption: What some numbers tell us and why), the CMDB is at the heart of change and configuration management, service assurance, and problem and incident management initiatives.

These are important initiatives because many organizations do not know what assets they have deployed, so they are over- and under-buying these assets. This, of course, results in inefficiencies -- either too much IT spend or ineffective IT. The bugaboo of missed SLAs should always be top of mind, too. Implementing a CCMDB promises to help alleviate these problems. And this is goodness...

Wednesday, May 17, 2006

Mainframe Availability

Sometimes those of us who work with mainframes can start to take them for granted. But today's mainframes offer many great features and capabilities for delivering extremely high availability. To that end, I just wanted to point my readers to a nicely written article that details the availability features of z/OS.

Thursday, May 11, 2006

I Lied! One More IDUG Blog

OK, so yesterday I said that I wasn't going to post another blog entry from IDUG, but I guess that was a lie because here I am posting another entry. But not to fear, this will be a short one.

I just finished my final presentation (the one on NULLs) and I'm sitting in an empty hallway as I type this. And, I have to say, this has been a very productive IDUG.

Anyway, I know that my blogs on IDUG this week have been brief, but that is because IDUG is a very busy week for me (as it is for most of the attendees, I suppose). If you are looking for more in-depth coverage of IDUG, I suggest you check out IDUG's Official Conference Blog. There is a LOT of great stuff (as well as pictures) posted there.

Also, if you are interested in another perspective on IDUG, click on over to Willie Favero's DB2 Blog for Willie's insights on the week.

Cheers!

Wednesday, May 10, 2006

IDUG Day Three

It has been a busy couple of days here at the IDUG conference in Tampa this week. Let me catch you up if you aren't lucky enough to be here.

First of all, the vendor exhibit hall was rocking Monday, Tuesday, and Wednesday - but as of lunch today, the exhibit hall is over. Let's see, Compuware was giving away yo-yos, Golden Gate had light-up balls, NEON Enterprise Software was giving away cool black t-shirts, the Database Brothers were zooming around the hall on a Segway that they were giving away, and SoftBase had a slew of stuff including sunglasses, baseball caps, and stop watches. And IBM has finally put out a DB2 Catalog poster of their own. So now you can choose whether you want the CA, BMC, or IBM version. Or better yet, get all three and use them to wallpaper your cubicle!

And let's not forget that CA held their annual party which is always a highlight of the show - at least for the drinkers in the crowd. But wait-a-minute, I don't think I've met the DBA who won't quaff a brew when it is offered for free. So I guess fun was had by one and all.

What about the serious stuff, you might ask? OK, I spent several sessions over the past two days concentrating on backup and recovery stuff. I mentioned Thomas Bauman's presentation on Day One, but I also saw Dave Schwartz of IBM outline the capabilities of IBM's Recovery Tools and Ulf Heinrich of Software Engineering talk about SEG's new Recovery Health Check for DB2 z/OS offering. Ulf's presentation discussed how automated daily checks can assure that each DB2 database object can always be recovered reliably and as fast as possible. The product can tell you how long it will take to recover - and isn't that what your boss is always leaning over your shoulder to ask when things go awry? "When will the recovery be done? When will the application be back up?" Wouldn't it be nice to tell him, "36.3 minutes, now make yourself useful and get me a refill on this coffee!"

The conference is winding down, and tomorrow will be the last day. I still have my presentation on Nulls - if you want a preview of that, read my blog from a week or so ago titled Using Nulls in DB2.

So this will be my last post from the IDUG show floor - after my presentation tomorrow, my wife is flying over here to join me for a few days of vacation on the Florida coast. After this week, I think I've earned... heck, we've all earned it, so maybe I'll see some fellow attendees on the beach this weekend. Just don't ask me to take a look at that outer join that is vexing you - I'm on vacation!

Tuesday, May 09, 2006

My First Presentation at IDUG

Oh, yes, I almost forgot to post about my presentation on Monday. I spoke at a vendor-sponsored presentation for NEON Enterprise Software. The topic was on Change Management for DB2 Access Paths - and it is an important topic to consider. Although my presentation was followed by a product pitch for NEON's BindImpact Expert, the topic warrants consideration, product or not.

Basically, the thrust of the presentation is that more focus needs to be applied to managing DB2 access paths. We implement strict change management procedures for every other change we make in the mainframe environment: program changes, system software changes, PTFs, etc. But with access paths, the procedure is simply to move the DBRM to production and BIND. We don't move access paths; we create new ones on the fly in the production world. Is this any way to work?
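
For reference, here is a sketch of what that typical production step looks like as TSO batch input (the plan and DBRM names are hypothetical):

DSN SYSTEM(DSN)
BIND PLAN(MYPLAN) -
     MEMBER(MYDBRM) -
     ACTION(REPLACE) -
     VALIDATE(BIND)
END

Each such BIND can produce brand new access paths in production, which is exactly the point: no change management discipline is applied to the result.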

The result of this situation is that many sites often BIND once, and then never REBIND for fear of introducing degraded access paths. This is an unfortunate situation. Basically, what this does is penalize EVERY statement because we are worried about 2 or 3 statements that might degrade.

I know these two things seem at odds with one another: that is, we need change management but we also need to keep REBINDing. And they are. But the proper response is NOT to stop all REBINDs, the proper response is to introduce methods and procedures for implementing a change management discipline. And that is where the NEON solution comes in.

'Nuff said, for now...

Monday, May 08, 2006

IDUG Day One

Just a quick post to report on the first day goings-on at the IDUG North American conference in Tampa. As usual the keynote was entertaining and highly attended. Bob Picciano of IBM talked about "Information as a Service... A New Era in Databases." The basic thrust of the pitch was that SOA and services will provide organizations with the ability to better integrate information and transform and modernize their existing systems.

I also attended a great overview of DB2 Recovery by Robert Goodman of Florida Hospital. He outlined a series of more than 30 tips on how to best implement DB2 recovery practices and procedures.

I'm just now waiting outside the exhibit hall which will open in a few minutes.

As always, IDUG is living up to its reputation for being informative, exciting, and a great place to network. If you're not here this year it is not too early to start pleading your case to attend next year....

Tuesday, May 02, 2006

DB2 for z/OS Version 9 Beta Announcement

On May 2, 2006 IBM announced the beta for the next version of mainframe DB2: namely, DB2 V9.1 for z/OS. You can view the announcement here.

According to the announcement the key areas of improvement in V9 will be in business insight innovations, cost savings through optimized innovations, and business resiliency innovations.

Selected features that deliver these valuable benefits to your business include:

  • Rich new hybrid data server support for both relational and pure XML storage, with the necessary services to support both data structures
  • New data types (decimal float, BIGINT, and varbinary; see the sketch following this list)
  • Native SQL procedural language
  • Improved security with roles, trusted context, and new encryption functions
  • Extensions of the V8 capabilities to make changes to the data definitions without taking your data offline
  • Enhancements to large object support and performance
  • Volume-based copy and recovery
  • Refinements to DB2's industry-leading optimization
  • QMF interface redesigned to provide on-demand access to data, reports, and interactive visual solutions with an optional Web browser

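To get a feel for the new data types, here is a speculative sketch of DDL using the announced type names (the table and column names are made up, and the final V9 syntax could differ from the beta announcement):

CREATE TABLE V9_SKETCH
 (ID        BIGINT NOT NULL,    -- 64-bit integer
  MEASURE   DECFLOAT(34),       -- decimal floating point
  RAW_BYTES VARBINARY(255));    -- variable-length binary
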
So, if you haven't started planning your move to V8 yet, now is definitely the time to get moving. V8 has been out for a couple of years now and there is a new version of DB2 on the horizon...

Friday, April 28, 2006

System z9 Business Class Mainframe Getting Positive Press

It looks like IBM's new low-cost z9 mainframe is getting some positive press. It always does my heart good to see the mainframe in the IT press.

Anyone interested in reading all about IBM's z9 mainframe offerings can check 'em out at http://www-03.ibm.com/systems/z/.

Sunday, April 23, 2006

Using Nulls in DB2

A null represents missing or unknown information at the column level. If a column “value” can be null, it can mean one of two things: the attribute is not applicable for certain occurrences of the entity, or the attribute applies to all entity occurrences, but the information may not always be known. Of course, it could be a combination of these two situations, too.

A null is not the same as 0 or blank. Null means no entry has been made for the column and it implies that the value is either unknown or not applicable.

Because DB2 supports null you can distinguish between a deliberate entry of 0 (for numerical columns) or a blank (for character columns) and an unknown or inapplicable entry (NULL for both numerical and character columns). Null indicates that the user did not explicitly make an entry or has explicitly entered NULL for the column. For example, a null value in the Price column of the ITEM table in some database does not mean that the item is being given away for free; instead it means that the price is not known or has not yet been set.

Nulls sometimes are inappropriately referred to as “null values.” Using the term value to describe a null is inaccurate because a null implies the lack of a value. Therefore, simply use the term null or nulls (without appending the term “value” or “values” to it).

A Few Examples

When are nulls useful? Well, defining a column as nullable provides a placeholder for data you might not yet know. For example, when a new employee is hired and is inserted into the EMP table, what should the employee termination date column be set to? I don’t know about you, but I wouldn’t want any valid date to be set in that column for my employee record. Instead, null can be used to specify that the termination date is currently unknown.

Let’s consider another example. Suppose that we also capture each employee’s hair color when they are hired. Consider three potential entity occurrences: a man with black hair, a woman with unknown hair color, and a bald man. The woman with the unknown hair color and the bald man both could be set to null, but for different reasons. The woman’s hair color would be null, meaning presently unknown; the bald man’s hair color could be null too, in this case meaning not applicable.

How could you handle this without using nulls? You would need to create special values for the HairColor column that mean “bald” and “unknown.” This is possible for a CHAR column like HairColor. But what about a DB2 DATE column? All occurrences of a column assigned as a DATE data type are valid dates. It might not be possible to use a special date value to mean “unknown.” This is where using nulls is most practical.
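
To illustrate both meanings of null, consider this small sketch (the table and column names are hypothetical):

CREATE TABLE EMP_INFO
 (EMPNO      INTEGER NOT NULL,
  HAIR_COLOR CHAR(8),   -- null: unknown (the woman) or inapplicable (the bald man)
  TERM_DATE  DATE);     -- null: the employee has not been terminated

SELECT EMPNO
FROM EMP_INFO
WHERE HAIR_COLOR IS NULL;

The SELECT returns both the woman and the bald man; nothing in the data distinguishes why each hair color is null.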

DB2 does not differentiate between nulls that signify unknown data and those that signify inapplicable data. This distinction must be made by the program logic of each application.

Keep in mind, though, that using null to indicate “not applicable” can be an indication of improper database design. By properly modeling and normalizing your data structures you can usually eliminate the need to use nulls to indicate that a column is inapplicable for a specific row. For example, consider the following table:

CREATE TABLE EMP
(EMPNO INTEGER NOT NULL,
LAST_NAME CHAR(20) NOT NULL,
FIRST_NAME CHAR(15) NOT NULL,
STREET_ADDR CHAR(30) NOT NULL WITH DEFAULT,
CITY CHAR(12) NOT NULL WITH DEFAULT,
STATE CHAR(2) NOT NULL WITH DEFAULT,
POSTAL_CODE CHAR(10) NOT NULL WITH DEFAULT,
EMP_TYPE CHAR(1) NOT NULL
CHECK(EMP_TYPE IN ('F', 'C', 'P')),
HIRE_DATE DATE,
SALARY DECIMAL(9,2),
BILLING_RATE DECIMAL(5,2));

In this case, we have a code in the EMP_TYPE column that can contain F (full-time), C (contractor), or P (part-time). We also have a SALARY column that is populated for full-time and part-time employees, but is set to null for contractors; and a BILLING_RATE column that is populated for contractors but set to null for full-time and part-time employees. Additionally, the HIRE_DATE column is set to null for contractors.

Well, here we have three columns that are set to null (or not) based on other values in the table. We can design our way out of this problem by creating a separate table for employees and another for contractors. If additional columns were needed for full-time employees that did not apply to part-time employees, we might even split the employee table into two: one for full-time and another for part-time. After doing so, there is no more need to use null for inapplicable data.
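
A rough sketch of one such redesign follows (the table names are hypothetical and some columns are trimmed for brevity):

CREATE TABLE EMPLOYEE
 (EMPNO      INTEGER      NOT NULL,
  LAST_NAME  CHAR(20)     NOT NULL,
  FIRST_NAME CHAR(15)     NOT NULL,
  EMP_TYPE   CHAR(1)      NOT NULL
    CHECK(EMP_TYPE IN ('F', 'P')),
  HIRE_DATE  DATE         NOT NULL,
  SALARY     DECIMAL(9,2) NOT NULL);

CREATE TABLE CONTRACTOR
 (EMPNO        INTEGER      NOT NULL,
  LAST_NAME    CHAR(20)     NOT NULL,
  FIRST_NAME   CHAR(15)     NOT NULL,
  BILLING_RATE DECIMAL(5,2) NOT NULL);

Every column in each table now applies to every row, so HIRE_DATE, SALARY, and BILLING_RATE no longer need to be nullable.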

Indicator Variables

DB2 represents null in a special “hidden” column known as an indicator variable. An indicator variable is defined to DB2 for each column that can accept nulls. The indicator variable is transparent to an end user, but must be provided for when programming in a host language (such as COBOL or PL/I).

The null indicator is used by DB2 to track whether its associated column is null or not. A positive value or a value of 0 means the column is not null and any actual value stored in the column is valid. If a CHAR column is truncated on retrieval because the host variable is not large enough, the indicator value will contain the original length of the truncated column. A negative value indicates that the column is set to null. If the value is -2 then the column was set to null as the result of a data conversion error.
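
For example, here is a sketch of how a null indicator might be coded in embedded SQL (COBOL-style, with hypothetical host variable names):

EXEC SQL
  SELECT SALARY
  INTO :HV-SALARY :HV-SALARY-IND
  FROM DSN8810.EMP
  WHERE EMPNO = :HV-EMPNO
END-EXEC.

After the SELECT, if HV-SALARY-IND contains a negative value then SALARY was null and the contents of HV-SALARY should not be used.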

Let’s take a moment to clear up a common misunderstanding right here: nulls NEVER save storage space in DB2 for OS/390 and z/OS. Every nullable column requires one additional byte of storage for the null indicator. So, a CHAR(10) column that is nullable will require 11 bytes of storage per row – 10 for the data and 1 for the null indicator. This is the case regardless of whether the column is set to null or not.

DB2 for Linux, Unix, and Windows has a compression option that allows columns set to null to save space. Using this option causes DB2 to eliminate the unused space from a row where columns are set to null. This option is not available on the mainframe, though.

Syntax

Every column defined to a DB2 table must be designated as either allowing or disallowing nulls. A column is defined as nullable – meaning it can be set to NULL – in the table creation DDL. Null is the default if nothing is specified after the column name. To prohibit the column from being set to NULL you must explicitly specify NOT NULL after the column name. In the following sample table, COL1 and COL3 can be set to null, but not COL2, COL4, or COL5:

CREATE TABLE SAMPLE1
(COL1 INTEGER,
COL2 CHAR(10) NOT NULL,
COL3 CHAR(5),
COL4 DATE NOT NULL WITH DEFAULT,
COL5 TIME NOT NULL);

In SELECT statements, testing for null is accomplished differently than testing for other “values.” You cannot specify WHERE COL = NULL, because this does not make any sense. Remember, null is a lack of a value, so the column does not equal anything. Instead, you would have to code WHERE COL IS [NOT] NULL.

In INSERT statements, NULL can be specified in the VALUES clause to indicate that a column is to be set to NULL; in UPDATE statements, you use the = assignment operator in the SET clause to assign NULL to a column.

When inserting data, if the user fails to make an entry in a column that allows nulls, DB2 supplies the NULL as a default (unless another default value exists). If an attempt to insert NULL is made against a column defined as NOT NULL, the statement will fail.
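
For example, using the sample DEPT table, in which MGRNO is nullable (the DEPTNO value below is made up):

INSERT INTO DSN8810.DEPT (DEPTNO, DEPTNAME, MGRNO, ADMRDEPT)
VALUES ('Z99', 'NEW DEPARTMENT', NULL, 'A00');

SELECT DEPTNO, DEPTNAME
FROM DSN8810.DEPT
WHERE MGRNO IS NULL;

The INSERT explicitly sets the manager number to null; the SELECT then finds every department with no manager assigned.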

Guidance

Now that you have a good understanding of the basics of nulls, let’s review some guidelines for their usage.

Whenever possible, avoid nulls in columns that must participate in arithmetic logic (for example, DECIMAL money values), and especially when functions will be used. The AVG, COUNT(DISTINCT), SUM, MAX, and MIN functions omit column occurrences set to null. The COUNT(*) function, however, does not omit rows containing nulls because it operates on rows, not column values. Thus, AVG is not equal to SUM/COUNT(*) when the average is being computed for a column that can contain nulls. To clarify with an example, if the COMM column is nullable, the result of the following query:

SELECT AVG(COMM)
FROM DSN8810.EMP;

is not the same as for this query:

SELECT SUM(COMM)/COUNT(*)
FROM DSN8810.EMP;

So to avoid confusion, avoid nulls in columns involved in math functions whenever possible. For example, if three rows contain COMM values of 100, 200, and null, AVG(COMM) returns 150, but SUM(COMM)/COUNT(*) returns 100.

When DATE, TIME, and TIMESTAMP columns can be unknown, consider creating them as nullable. DB2 checks to ensure that only valid dates, times, and timestamps are placed in columns defined as such. If the column can be unknown, it must be defined to be nullable because the default for these columns is the current date, current time, and current timestamp (unless explicitly defined otherwise using the DEFAULT clause). Null, therefore, is the only viable option for the recording of missing dates, times, and timestamps (unless you pick a specific valid date that is not used by your applications to indicate unknown).

For every other column, determine whether nullability can be of benefit before allowing nulls. Consider these rules of operation:
  • When a nullable column participates in an ORDER BY or GROUP BY clause, the returned nulls are grouped at the high end of the sort order.
  • Nulls are considered to be equal when duplicates are eliminated by SELECT DISTINCT or COUNT (DISTINCT column).
  • A unique index considers nulls to be equivalent, so it will disallow more than one null entry, unless the WHERE NOT NULL clause is specified for the index.
  • For comparison in a SELECT statement, two null columns are not considered equal. When a nullable column participates in a predicate in the WHERE or HAVING clause, the nulls that are encountered cause the comparison to evaluate to UNKNOWN.
  • When a column set to null participates in a calculation, the result is null.
  • Columns that participate in a primary key cannot be null.
  • To test for the existence of nulls, use the special predicate IS NULL in the WHERE clause of the SELECT statement. You cannot simply state WHERE column = NULL. You must state WHERE column IS NULL.
  • It is invalid to test if a column is <> NULL, or >= NULL. These are all meaningless because null is the absence of a value.

Examine these rules closely. ORDER BY, GROUP BY, DISTINCT, and unique indexes consider nulls to be equal and handle them accordingly. The SELECT statement, however, deems that the comparison of null columns is not equivalence, but unknown. This inconsistent handling of nulls is an anomaly that you must remember when using nulls. The following are several sample SQL queries and the effect nulls have on them.

SELECT JOB, SUM(SALARY)
FROM DSN8810.EMP
GROUP BY JOB;

This query returns the total salary for each type of job. All instances in which JOB is null will group at the bottom of the output.

SELECT EMPNO, PROJNO, ACTNO, EMPTIME,
EMSTDATE, EMENDATE
FROM DSN8810.EMPPROJACT
WHERE EMSTDATE = EMENDATE;

This query retrieves all occurrences in which the project start date is equal to the project end date. This information is clearly erroneous, as anyone who has ever worked on a software development project can attest. The query does not return any rows in which either or both dates are null, for two reasons: (1) two null columns are never equal for purposes of comparison, and (2) when either operand of a comparison operator is null, the result is unknown.

UPDATE DSN8810.DEPT
SET MGRNO = NULL
WHERE MGRNO = '000010';

This query sets the MGRNO column to null wherever MGRNO is currently equal to '000010' in the DEPT table.

When creating tables, treat nullable columns the same as you would any other column. Some DBAs advise you to place nullable columns of the same data type after non-nullable columns. This is supposed to assist in administering the null columns, but sequencing nullable columns in this manner provides no clear benefit -- and it might hurt -- so it should be avoided.

Summary

Nulls are clearly one of the most misunderstood features of DB2 – indeed, of most SQL database systems. Although nulls can be confusing, you cannot bury your head in the sand and ignore nulls if you choose to use DB2 as your DBMS. Understanding what nulls are, and how best to use them, can help you to create usable DB2 databases and design useful and correct queries in your DB2 applications.

For those of you interested in hearing about nulls in more depth, please consider attending my presentation at IDUG 2006 in Tampa, FL. It is titled "Null and Void? Dealing with Nulls in DB2" and it will be held on Thursday, May 11th, at 8:30 AM. For more details on IDUG, including how to register check out IDUG on the web at http://conferences.idug.org/Default.aspx?alias=conferences.idug.org/na

Sunday, April 16, 2006

Mainframe Alive at Merrill Lynch

A recent article in Baseline Magazine highlights how Merrill Lynch & Company reinvigorated 420 financial programs stored on the mainframe by building Web services that can handle millions of interactions a day.

Merrill was looking to leverage its continuing heavy investment in mainframe applications and hardware by making functions in legacy applications available as Web services. Indeed, many companies with robust mainframe systems should be looking to do this - it can breathe new life into applications without requiring a complete re-write of still-working programs, all while maintaining the high availability, security, and performance of the mainframe.

According to the article, "Merrill has a huge IBM mainframe installation—one of the largest in the world—with 1,200 programmers supporting some 23,000 mainframe programs that process 80 million Customer Information Control System (CICS) transactions per day."

Yes, there is still life in mainframe systems. Wise organizations will continue to utilize their existing and vibrant mainframe systems by building web services on top of them.

Friday, March 31, 2006

DSNTIAD - The Dynamic SQL Update Program

In my last blog entry I posted about DSNTEP2, the batch SQL program. But there is another batch SQL program named DSNTIAD that is less widely-known and used.

DSNTIAD is an assembler application program that can issue the same DB2 dynamic SQL statements as DSNTEP2, with the exception of the SELECT statement. For this reason, application programmers almost always prefer to use DSNTEP2 rather than DSNTIAD.

DSNTIAD is written in Assembler language. Because DSNTIAD is a sample program, its source code can be modified to accept SELECT statements if you so desire. But this task is complex and should not be undertaken by a beginning programmer.

Here is sample JCL for running DSNTIAD:

//DB2JOBU JOB (UTILITY),'DB2 SAMPLE UPD',MSGCLASS=X,
// CLASS=X,NOTIFY=USER
//*
//**************************************************
//*
//* DSNTIAD - SAMPLE DB2 UPDATE PROGRAM
//*
//**************************************************
//*
//JOBLIB DD DSN=DSN810.DSNLOAD,DISP=SHR
//BATUPSQL EXEC PGM=IKJEFT01,DYNAMNBR=20
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//SYSTSIN DD *
DSN SYSTEM(DSN)
RUN PROGRAM(DSNTIAD) PLAN(DSNTIA81) -
LIB('DSN810.RUNLIB.LOAD')
END
/*
//SYSIN DD *
UPDATE DSN8810.DEPT
SET DEPTNAME = 'CHANGED NAME'
WHERE DEPTNO = 'D01' ;

INSERT INTO DSN8810.ACT
VALUES (129, 'XXXXXX', 'SAMPLE ACCT') ;

DELETE FROM DSN8810.EMP
WHERE SALARY < 1000 ;

CREATE DATABASE TESTNAME
BUFFERPOOL BP12
STOGROUP DSN8G810 ;

GRANT DBADM ON TESTNAME TO USERA ;
/*
//


So why would anyone consider using DSNTIAD over DSNTEP2? Well, DSNTIAD supports the LABEL ON statement, whereas DSNTEP2 does not. But unlike DSNTEP2, DSNTIAD does not accept comments embedded in SQL statements.
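
For example, DSNTIAD will accept a statement such as the following, which DSNTEP2 will reject:

LABEL ON COLUMN DSN8810.DEPT.DEPTNAME
IS 'NAME OF THE DEPARTMENT';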

Also note that DSNTIAD can be a little more efficient than DSNTEP2 because it is written in Assembler.

Friday, March 24, 2006

DSNTEP2 aka Batch SPUFI

DSNTEP2 is an application program that can be used to issue DB2 dynamic SQL statements. It is sometimes referred to as “Batch SPUFI” because it allows you to submit SQL in batch similar to how SPUFI allows online SQL execution.

The following sample JCL demonstrates the capability of DSNTEP2 to issue DCL, DDL, and DML dynamically.

//DB2JOBU JOB (BATCHSQL),'DB2 SAMPLE SQL',MSGCLASS=X,
// CLASS=X,NOTIFY=USER
//*
//***********************************
//*
//* DB2 SAMPLE SQL PROGRAM
//*
//***********************************
//*
//JOBLIB DD DSN=DSN810.DSNLOAD,DISP=SHR
//BATCHSQL EXEC PGM=IKJEFT01,DYNAMNBR=20
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//SYSTSIN DD *
DSN SYSTEM(DSN)
RUN PROGRAM(DSNTEP2) PLAN(DSNTEP81) -
LIB('DSN810.RUNLIB.LOAD')
END
/*
//SYSIN DD *
SELECT * FROM SYSIBM.SYSTABLES;

UPDATE DSN8810.DEPT
SET DEPTNAME = 'CHANGED NAME'
WHERE DEPTNO = 'D01' ;

INSERT INTO DSN8810.ACT
VALUES (129, 'XXXXXX', 'SAMPLE ACCT');

DELETE FROM DSN8810.EMP
WHERE SALARY < 1000 ;

/*

The DSNTEP2 program is written in the PL/I programming language. Prior to DB2 V6, you needed to have a PL/I compiler to use DSNTEP2. However, as of V6 IBM now provides both the source code and an object code version of DSNTEP2 with DB2. So, you no longer need a PL/I compiler to use DSNTEP2.

Because DSNTEP2 is an application program, and the PL/I source code is provided with DB2, a knowledgeable PL/I programmer can easily modify the code. After doing so, of course, it must be compiled, linked, and bound before it can be used.

DSNTEP2 can process almost every SQL statement that can be executed dynamically. DSNTEP2 accepts:
· The GRANT and REVOKE DCL statements,
· The ALTER, COMMENT ON, CREATE, and DROP DDL statements,
· The DELETE, INSERT, SELECT, and UPDATE DML statements, and
· The COMMIT, ROLLBACK, EXEC SQL, EXPLAIN, and LOCK statements.

The only important statement that DSNTEP2 does not support is the LABEL ON DDL statement. Of course, DSNTEP2 can be modified to support this statement if you have PL/I knowledge and a PL/I compiler.

When Does DSNTEP2 Commit?

Well, the simple answer to that question is that the results of the SQL are committed upon completion of all the SQL. A helpful answer is a little longer.

First off, you need to know that DSNTEP2 has an internal parameter named MAXERRORS that controls the number of failing statements that can occur before it stops. A failing statement is one which returns a negative SQLCODE. The value of MAXERRORS is set to 10 inside the program, so DSNTEP2 will allow 9 failing SQL statements but when it hits the 10th failing statement, it will exit, COMMITting all other work.

This is ugly because it can wreak havoc on the integrity of your data. I mean, who wants to figure out what was run, what was impacted, and then try to rebuild a job to fix data and/or restart at the right place? When rerunning DSNTEP2, remember that all SQL statements that completed with a 0 SQL code were committed. These statements should not be rerun. All SQL statements that completed with a negative SQL code must be corrected and reprocessed.

Certain severe errors cause DSNTEP2 to exit immediately. One severe error is a -101, “SQL statement too long or too complex.”

If any SQL errors occurred during the execution of DSNTEP2, a return code of 8 is returned by the job step.

At any rate, DSNTEP2 never issues an explicit COMMIT or ROLLBACK by itself. A COMMIT occurs at the end unless the program abends.

Specify Your SQL Statements Properly

The SQL to be run by DSNTEP2 is specified in SYSIN. Be sure to code the DSNTEP2 input properly.

DSNTEP2 reads SQL statements from an input data set with 80-byte records. The SQL statements must be coded in the first 72 bytes of each input record. SQL statements can span multiple input records and are terminated by a semicolon (;). Semicolons are not permitted in the text of the SQL statement.

Liberally Comment DSNTEP2 Input

Comments can be passed to DSNTEP2 in the SQL statements using two hyphens in columns 1 and 2 or a single asterisk in column 1. Doing so is good form and helps others to understand what your DSNTEP2 job is attempting to accomplish.
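
For example, a commented SYSIN input stream might look something like this sketch:

-- CHANGE THE NAME OF DEPARTMENT D01
* AN ASTERISK IN COLUMN 1 WORKS AS A COMMENT, TOO
UPDATE DSN8810.DEPT
SET DEPTNAME = 'CHANGED NAME'
WHERE DEPTNO = 'D01' ;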

Bottom Line

DSNTEP2 is especially useful for running one-off SQL statements. Use DSNTEP2 when you have ad hoc SQL to run in a batch environment. DSNTEP2 is easier than writing your own quick and dirty programs to run ad hoc SQL in batch. It is simple to set up and saves time. But be careful if you have multiple SQL modification statements (INSERT, UPDATE, DELETE) because of the above-mentioned possibility of causing data integrity problems if some of the statements fail and others succeed.

Tuesday, March 21, 2006

IBM Announces z/OS V1.8

On February 28, 2006 IBM announced the latest planned upgrade of its stalwart mainframe operating system, z/OS. IBM has announced some interesting new functionality for the new version, V1.8. Planned are improvements in scale, availability, security, and resource optimization. Also, there is an increased focus on simplification to make z/OS easier to use.

The actual IBM announcement can be read here.

Further details can be found in this handy summary of z/OS futures planning put together by Jim Schesvold, a technical editor for IBM Systems Magazine.

IBM plans for z/OS V1.8 to be available in September 2006.

Tuesday, March 14, 2006

DB2 Versions, Service, and Such

Keeping up-to-date with the latest and greatest DB2 versions and functionality can be a time-consuming task. Every 18 to 36 months IBM announces a new version of DB2 with even more features and functionality than ever before.

DB2 will soon celebrate its 25th anniversary. The basis for DB2 began with IBM’s System R research project. In 1982, IBM delivered SQL/DS on VM and VSE, and then a year later in 1983, IBM released DB2 for MVS Version 1. Through subsequent versions and releases IBM has improved the functionality and performance of DB2.

Functionality aside, though, it can be difficult to keep track of new DB2 versions. Questions like “What version has which feature?”, “How much longer will IBM support the DB2 version we are running?”, and “When should we begin to migrate to a new version… and which one?” are very common in this day and age of rapid technology change. Let’s examine some of these questions.

First of all, we need to understand some basic terminology: “withdrawal from marketing” and “withdrawal from service.” When IBM withdraws a product from marketing the product will no longer be advertised or sold by IBM; but IBM will continue to service and support customers. When IBM withdraws a product from service you will no longer be able to get technical support for that product… unless you negotiate extended service (at an extended price) with IBM.

So, the current version of DB2 for z/OS is Version 8 and it has been available for just over 2 years now. And it seems that the announcement of the next version of DB2 is imminent, what with some of the Vnext presentations IBMers are delivering.

But the big question these days, for most shops, is not the next version, but the current version. Most DB2 users are still running Version 7, but they will need to migrate to Version 8 soon. If you are running an earlier version of DB2 (than V7) you really should waste no time in getting to V7 – not only is it a solid release, but you’ll need to start worrying about V8 soon because the projected end of service date for DB2 Version 7 is September 2007.

You can keep an eye on the current versions of DB2 that are available by bookmarking this link http://www.ibm.com/software/data/db2/zos/support/plc/. IBM keeps this information up-to-date regarding the end of marketing and end of service dates for DB2 versions.

The bottom line, of course, is that more functionality is available to you by keeping up-to-date with the latest DB2 version. But issues such as rapid versioning, complexity, difficulty of migration, and managing new versions can make keeping up-to-date difficult. Diligence is required to keep everything straight. At least there is a lot of information available to help us do just that.

Good luck with DB2…