Wednesday, March 19, 2008

DB2 Developer's Guide Tops Book Poll

I learned of some good news today. Evidently my DB2 book, DB2 Developer’s Guide, recently won the favorite book poll conducted by IBM Database Magazine (formerly DB2 Magazine) a couple of months ago… and it won by a substantial margin.



Thanks to everyone who voted for my book… I appreciate your support.

Tuesday, March 18, 2008

Free Downloadable eBook from IBM Press

Pearson and IBM Press have just published a new eBook on IBM database technology that is available for free download. Just follow this link, provide your e-mail, and they’ll direct you to a PDF containing sample chapters from six recently published books.

The information in the eBook comes from the following books:

IBM Press is also offering a 35% discount on these, and other selected titles. But in some cases, the amazon discount is greater than 35%, so if you'd rather buy from amazon.com, simply follow the links in the bulleted list above.

But make sure you download that free eBook... after all, what is better than free?

Monday, March 03, 2008

Q+A: Nulls and DASD

Recently, I received the following question, which I will answer today on this blog posting.

Question: Lets say I have a table A which has 500 columns. Out of those 500 columns only 5 columns have been defined as not nullable and the rest have been defined as NULLS allowed. And out of those 500 columns I have found that 300 columns are unused(empty) totally. My business allows me to remove those 300 columns. My doubt is if I remove those 300 empty columns will I save on DASD space occupied by DB2? Will empty columns occupy DASD space?Would be really helpful if you can guide me on this.

Answer: I'm happy to try to help out. First of all, the short answer to your question is "Yes!" Those 500 columns are all consuming valuable disk space. To determine how much space is being consumed, you will need to examine the data type and length assigned to each column and add them up. And to make matters worse, you must add an additional 1 byte to each of them because the columns are nullable.

In DB2, a NULL is stored using a special one-byte null indicator that is "attached" to every nullable column. If the column is set to NULL, then the indicator field is used to record this. Using NULL will never save space in a DB2 database design - in fact, it will always add an extra byte for every column that can be NULL. The byte is used whether or not the column is actually set to NULL.

So, a column defined as CHAR(5) NOT NULL will required five bytes of storage space - but if it is defined as nullable, then it requires six bytes of storage space - five bytes for the data, and one byte for the null indicator.

Given all of this, it would seem that there is a very viable case to be made for you to remove those columns that are not being used. Of course, this means that you will likely have to make changes to any programs accessing that table. Because the table definition will change (fewer columns) you will need new DCLGENs and those will have to be included and bound into your programs. Be sure to factor this additional workload into your planning before moving forward with this change.

The better question to ask is "How the heck did all of the empty columns get put into the table to begin with and how did that design get past the DBAs?"

If you have an answer for that one, please share it by posting your answer in a comment here!

Wednesday, February 20, 2008

Let's Hear It for COBOL!

I know that this blog is supposed to be primarily about DB2, but I like to sneak in mainframe-related topics from time to time. And I read a great article today in Computerworld that I want to share with you. The article, titled Confession of a COBOL Programmer, talks up the need for COBOL coders. Just like with other mainframe technologies, as the baby boomers retire there is an insufficient supply of newbies available to step in and continue the care and feeding of the COBOL legacy.

I've written briefly about COBOL before, in my Data Management Today blog. COBOL is still all over the place and in no danger of dying off. According to the Computerworld article, 75% of the world's businesses data is still processed in COBOL, and about 90% of all financial transactions are in COBOL.

Yet there is a lingering perception "out there" that COBOL is dead (or at least dying). And as far as graduating seniors and new programmers are concerned, COBOL ain't cool! New programmers don't want to learn it and most universities don't teach it in their computer science or information science curricula. Just like the mainframe (which is alive and well, too), COBOL is ignored and a big problem is developing.

Analysts at Gartner estimate that there are 180 billion lines of COBOL code in existence and about 90,000 COBOL programmers. To convert all of that to something else "each programmer will require 100,000 hours to complete the conversion of 2 million lines. That works out to 12,500 eight-hour workdays. If we figure 250 workdays per year (though it’s unlikely any Cobol programmers are settling for just two weeks of vacation per year), these guys should be done in 50 years."

Who knows, when I retire (sometime in the far-off future) maybe I'll hang up a shingle and offer my services as a COBOL coder... after all, that is what I started out doing right out of college (all those years ago)...

Tuesday, February 05, 2008

Intelligent Automatic Reoptimization? [DB2 9 for z/OS]

Most seasoned SQL programmers know that when you use host variables, DB2 may not always come up with the absolutely most optimal access path at bind time. Without knowing the actual values of the host variables the optimizer has to make some best guesses as to how best to satisfy the SQL request.

Furthermore, we know that we can guide DB2 on how best to approach this situation using the REOPT parameter of the BIND command. Prior to DB2 V9, there were three options for REOPT:

REOPT(NONE) – DB2 will not reoptimize SQL at run time.

REOPT(ALWAYS) – DB2 will prepare SQL statements again at run time when the host variable values are known. This enables the DB2 optimizer to formulate the query execution plan using the actual host variable values, which can result in better performing access paths.

REOPT(ONCE) – DB2 will prepare SQL statements only once, using the first set of host variable values, no matter how many times the statement is executed by the program. The access path is stored in the Dynamic Statement Cache (DSC) and will be used for all subsequent executions of the same SQL statement. REOPT(ONCE) only applies to dynamic SQL statements and is ignored if you use it with static SQL statements. This option was introduced in DB2 V8.

What is New in V9?

DB2 9 for z/OS introduces a new REOPT option: REOPT(AUTO). The ideas behind REOPT(AUTO) is to come up with the optimal access path in the minimum number of prepares.

The basic premise of REOPT(AUTO) is to re-optimize only when host variable values change. Using this option, DB2 will examine the host variable values and will generate new access paths only when host variable values change and DB2 has not already generated an access path for those values.

REOPT(AUTO) only applies to dynamic statements that can be cached.

After migrating to DB2 9, consider re-evaluating programs bound specifying REOPT(ALWAYS) and REOPT(NONE). In many cases, switching to REOPT(AUTO) from REOPT(ALWAYS) can produce performance improvement; and in some cases you can use re-optimization with REOPT(AUTO) for programs bound REOPT(NONE) because of the fear of too frequent re-optimization causing a performance hit.

In particular, consider specifying REOPT(AUTO) for SQL statements that at times can take a relatively long time to execute, depending on the values of parameter markers. In particular, you should especially consider doing this when parameter markers refer to non-uniform data that is joined to other tables.