Friday, December 11, 2009

A Short History of DB2 for z/OS – Part 2

Today’s blog entry is a continuation of yesterday’s post in which we began a brief review of the history of DB2 for z/OS. That post covered Versions 1 through 3; so today we pick up our historical review with Version 4.

Version 4 was a very significant milestone in the history of DB2. It was highlighted by the introduction of Type 2 indexes, which removed the need to lock index pages (or subpages, now obsolete). Prior to V4, index locking was a particularly thorny performance problem that vexed many shops. And, of course, I’d be remiss if I did not discuss data sharing, which made its debut in V4. With data sharing, DB2 achieved new heights of scalability and availability unmatched within the realm of DBMS; it afforded users the ability to upgrade without an outage and to add new subsystems to a group “on the fly.” The new capabilities did not stop with there; V4 also introduced stored procedures, CP parallelism, performance improvements, and more. DB2 V4 was, indeed, a major milestone in the history of mainframe DB2.

In June 1997 DB2 Version 5 became generally available. It was the first DB2 version to be referred to as DB2 for OS/390 (previously it was DB2 for MVS). Not as significant as V4, we see the trend of even numbered releases being bigger and more significant than odd numbered releases (of course, this is just my opinion). V5 was touted by IBM as the e-business and BI version. It included Sysplex parallelism, prepared statement caching, reoptimization, online REORG, and conformance to the SQL-92 standard.

Version 6 brings us to 1999 and the introduction of the Universal Database term to the DB2 moniker. The “official” name of the product is now DB2 Universal Database for OS/390. And the Release Guide swelled to over 600 pages! Six categories of improvements were introduced with V6 spanning:
  • Object-relational extensions and active data
  • Network computing
  • Performance and availability
  • Capacity improvements
  • Data sharing enhancements
  • User productivity
The biggest of the new features were SQLJ, inline statistics, triggers, large objects (LOBs), user-defined functions, and distinct types.

Version 6 is also somewhat unique in that there was this “thing” typically referred to as the V6 refresh. It added functionality to DB2 without there being a new release or version. The new functionality in the refresh included SAVEPOINTs, identity columns, declared temporary tables, and performance enhancements (including star join).

March 2001 brings us to DB2 Version 7, another “smaller” version of DB2. Developed and released around the time of the Year 2000 hubbub, it offered much improved utilities and some nice new SQL functionality including scrollable cursors, limited FETCH, and row expressions. Unicode support was also introduced in Db2 V7. For a more detailed overview of V7 (and the V6 refresh) consult An Introduction to DB2 for OS/390 Version 7.

DB2 Version 8 followed, but not immediately. IBM took advantage of Y2K and the general desire of shop’s to avoid change during this period to take its time and deliver the most significant and feature-laden version of DB2 ever. V8 had more new lines of code than DB2 V1R1 had total lines of code!

I don’t want to get bogged down in recent history here outlining the features and functionality of DB2 releases that should be fresh in our memory (V8 and V9). If you really want some details on those refer to these links for them:

An Overview of DB2 for z/OS Version 8


DB2 9 for z/OS Features



Which brings us to today. Most shops should be either running Version 9 in production or planning their migration from V8 to V9. And we are all waiting with baited breath for DB2 X… which hopefully should be announced sometime next year.

Thursday, December 10, 2009

A Short History of DB2 for z/OS – Part 1

Let's go back in time... almost three decades ago... back to the wild and woolly 1980s! And watch our favorite DBMS, DB2, grow up over time.

Version 1 Release 1 was announced on June 7, 1983. And it became generally available on Tuesday, April 2, 1985. I wonder if it was ready on April 1st but not released because of April Fool’s Day? Any old-time IBMer out there care to comment?

Initial DB2 development focused on the basics of making a relational DBMS work. Early releases of DB2 were viewed by many as an “information center” DBMS, not for production work like IMS.

Version 1 Release 2 was announced on February 4, 1986 and was released for general availability a month later on March 7, 1986. Wow! Can you imagine waiting only a month for a new release of DB2 these days? But that is how it happened back then. Same thing for Version 1 Release 3, which was announced on May 19, 1987 and became GA on June 26, 1987. DB2 V1R3 saw the introduction of date data types.

You might notice that IBM delivered “releases” of DB2 back in the 1980s, whereas today (and ever since V3) there have only been versions. Versions are major, whereas releases are not quite as significant as a version.

Version 2 of DB2 became a reality in 1988. Version 2 Release 1 was announced in April 1988 and delivered in September 1988. Here we start to see the gap widening again between announcement and delivery. V2R1 was a very significant release in the history of DB2. Some mark it as the bellwether for when DB2 began to be viewed as a DBMS capable of supporting mission critical, transaction processing workloads. Not only did V2R1 provide many performance enhancements but it also signaled the introduction of declarative Referential Integrity (RI) constraints. RI was important for the acceptance of DB2 because it helps to assure data integrity within the DBMS.

No sooner than V2R1 became GA than IBM announced Version 2 Release 2 on October 4, 1988. But it was not until a year later that it became generally available on September 23, 1988. DB2 V2R2 again bolstered performance in many ways. It also saw the introduction of distributed database support (private protocol) across MVS systems.

Version 2 Release 3 was announced on September 5, 1990 and became generally available on October 25, 1991. Two very significant features were added in V2R3: segmented table spaces and packages. Segmented table spaces have become the de facto standard for most DB2 data and packages made DB2 application programs easier to support. DB2 V2R3 is also the version that beefed up distributed support with Distributed Relational Database Architecture (DRDA). Remote unit of work distribution was not available in the initial GA version, but IBM came out with RUOW support for DB2 V2R3 in March 1992.

And along comes DB2 Version 3 announced in November 1993 and GA in December 1993. Now it may look like things sped up again here, but not really. This is when the QPP program for early support of DB2 started. QPP was announced in March 1993 and delivered to customers in June 1993. Still though, fairly rapid turnaround by today’s standards, right?

V3 greatly expanded the number of bufferpool options available (from 5 pools to 80). There were many advances made in DB2 V3 to take better advantage of the System 390 environment: V3 introduced support for hardware assisted compression and hiperpools. It was also V3 that introduced I/O parallelism for the first time.

We’ll stop here for today and continue our short history of DB2 in my next DB2Portal blog posting. See you soon...

Wednesday, December 02, 2009

Wordle of my DB2 Portal Blog

Wordle: DB2 Portal Wordle The "jumble" of words shown here is a Wordle, which is a "word cloud" of text. I fed my blog location into the Wordle generator and it created this pretty picture based on the words I most commonly use here in this blog.

The cloud gives greater prominence to words that appear more frequently in the source text. No surprise that "DB2" and "data" dominate the other words!

Monday, November 23, 2009

Reading Things That Aren't There... and Missing Things That Are!

You can shoot yourself in the foot using DB2 if you are not careful. There are options that you can specify that may cause you to read data that is not really in the database. And, alternately, you can set things up so that you miss reading data that is actually in the database.

How, you might be asking? Well, dirty reads will take care of the first one. Specifying ISOLATION(UR) implements read-through locks, which is sometimes referred to as a dirty read. It applies to read operations only. With this isolation level data may be read that never actually exists in the database, because the transaction can read data that has been changed by another process but is not yet committed.

Read uncommitted isolation provides the highest level availability and concurrency of the isolation levels, but the worst degree of data integrity. It should be used only when data integrity problems can be tolerated. Certain types of applications, such as those using analytical queries, estimates, and averages are likely candidates for read uncommitted locking. A dirty read can cause duplicate rows to be returned where none exist or no rows may be returned when one (or more) actually exists. When choosing read uncommitted isolation the programmer and DBA must ensure that these types of problems are acceptable for the application.

OK, so what about not reading data that is in the database? DB2 V9 provides us an option to do just that. In DB2 9 it is possible for a transaction to skip over rows that are locked. This can be accomplished by means of the SKIP LOCKED DATA option within your SQL statement(s). SKIP LOCKED DATA can be specified in SELECT, SELECT INTO, and PREPARE, as well as searched UPDATE and DELETE statements. You can also use the SKIP LOCKED DATA option with the UNLOAD utility.

When you tell DB2 to skip locked data then that data is not accessed and your program will not have it available. DB2 just skip over any locked data instead of waiting for it to be unlocked. The benefit, of course, is improved performance because you will not incur any lock wait time. But it comes at the cost of not accessing the locked data at all. This means that you should only utilize this clause when your program can tolerate skipping over some data.

The SKIP LOCKED DATA option is compatible with cursor stability (CS) isolation and read stability (RS) isolation. But it cannot be used with uncommitted read (UR) or repeatable read (RR) isolation levels. DB2 will simply ignore the SKIP LOCKED DATA clause under UR and RR isolation levels.

Additionally, SKIP LOCKED DATA works only with row locks and page locks. That means that SKIP LOCKED DATA does not apply to table, partition, LOB, XML, or table space locks. And the bigger the lock size, the more data that will be skipped when a lock is encountered. With row locking you will be skipping over locked rows, but with page locking you will be skipping over all the rows on the locked page.

Use both of these features with extreme care and make sure that you know exactly what you are telling DB2 to do. Otherwise, you might be reading more... or less than you want!

Tuesday, November 17, 2009

Replacing UNION with CASE

When a UNION is required to put together data from multiple queries, you might be able to use a CASE statement instead. This is very useful, particularly when the data for each of the queries in the UNION come from the same table. The CASE statement can potentially enhance performance by minimizing the number of times the data is read.

Let’s look at an example to clarify why:

SELECT CREATOR, NAME, 'TABLE'
FROM SYSIBM.SYSTABLES
WHERE TYPE = 'T'
UNION
SELECT CREATOR, NAME, 'VIEW '
FROM SYSIBM.SYSTABLES
WHERE TYPE = 'V'
UNION
SELECT CREATOR, NAME, 'ALIAS'
FROM SYSIBM.SYSTABLES
WHERE TYPE = 'A'
ORDER BY NAME;

This simple SQL statement uses UNION to put together the results of three queries against the SYSTABLES table. The report shows all of the DB2 table-like objects that exist in the DB2 subsystem: tables, views, and synonyms.

To do this, DB2 must scan through the table three times – once for each query (as there is no index on the TYPE column). But, you can use CASE and code an equivalent, but more efficient query, as follows:

SELECT CREATOR, NAME,
CASE TYPE
WHEN 'T' THEN 'TABLE'
WHEN 'V' THEN 'VIEW '
WHEN 'A' THEN 'ALIAS'
END
FROM SYSIBM.SYSTABLES
ORDER BY NAME;

This new query will need to scan SYSTABLES only once. The CASE statement will translate the code in the TYPE column into the text that we desire.

CASE statements are very powerful and you should use them when you can to create elegant and optimal SQL in your DB2 applications.