
Thursday, April 02, 2020

A Condensed 35-Year History of DB2 for z/OS (...and Db2 for z/OS)


Let's go back in time... over three decades ago... back to the wild and woolly 1980s! And watch our favorite DBMS, DB2, grow up over time.

DB2 Version 1 Release 1 was announced on June 7, 1983, and it became generally available on Tuesday, April 2, 1985. I wonder if it was ready on April 1st but not released because of April Fool’s Day? Initial DB2 development focused on the basics of making a relational DBMS work. Early releases of DB2 were viewed by many as an “information center” DBMS, not yet ready for the production workloads that IMS handled.

Version 1 Release 2 was announced on February 4, 1986 and was released for general availability a month later on March 7, 1986. Can you imagine waiting only a month for a new release of DB2 these days? But that is how it happened back then. Same thing for Version 1 Release 3, which was announced on May 19, 1987 and became GA on June 26, 1987. DB2 V1R3 saw the introduction of DATE data types.

You might notice that IBM delivered “releases” of DB2 back in the 1980s, whereas today (and ever since V3) there have only been versions. A version represents a major change, whereas a release delivers more modest enhancements.

Version 2 Release 1 was announced in April 1988 and delivered in September 1988. Here we start to see the gap widening again between announcement and delivery. V2R1 was a significant release in the history of DB2, a bellwether of sorts for when DB2 began to be viewed as capable of supporting mission-critical, transaction processing workloads. Not only did V2R1 provide significant performance enhancements but it also signaled the introduction of declarative Referential Integrity (RI) constraints.

No sooner had V2R1 become GA than IBM announced Version 2 Release 2 on October 4, 1988. But it was not until nearly a year later, on September 23, 1989, that it became generally available. DB2 V2R2 again bolstered performance in many ways. It also saw the introduction of distributed database support (private protocol) across MVS systems.

Version 2 Release 3 was announced on September 5, 1990, and became generally available on October 25, 1991. Two very significant features were added in V2R3: segmented table spaces and packages. Segmented table spaces quickly became a de facto standard and packages made DB2 application programs easier to support. DB2 V2R3 is also the version that beefed up distributed support with Distributed Relational Database Architecture (DRDA).

Along comes DB2 Version 3, announced in November 1993 and GA in December 1993. Now it may look like things sped up again here, but not really. This is when the early support program for DB2 started. Early support was announced in March 1993 and delivered to customers in June 1993. V3 greatly expanded the number of buffer pool options available (from 5 pools to 80), and many advances were made for DB2 to take better advantage of the System/390 environment, including support for hardware-assisted compression and hiperpools. It was also V3 that introduced I/O parallelism for the first time.

Version 4 signaled another significant milestone in the history of DB2. It was highlighted by the introduction of Type 2 indexes, which removed the need to lock index pages (or subpages, now obsolete). Prior to V4, index locking was a particularly thorny performance problem that vexed many shops. Data Sharing made its debut in V4, too, and with it, DB2 achieved new heights of scalability and availability allowing users to upgrade without an outage and to add new subsystems to a group “on the fly.” DB2 V4 also introduced stored procedures, as well as CP parallelism.

In June 1997 DB2 Version 5 became generally available. It was the first DB2 version to be referred to as DB2 for OS/390 (previously it was DB2 for MVS). Though not as significant as V4, it continued the trend of even-numbered versions being bigger and more significant than odd-numbered ones (of course, that is just my opinion). V5 was touted by IBM as the e-business and BI version. It included Sysplex parallelism, prepared statement caching, reoptimization, online REORG, and conformance to the SQL-92 standard.

Version 6 brings us to 1999 and the introduction of the Universal Database term to the DB2 moniker. The “official” name of the product became DB2 Universal Database for OS/390. And the Release Guide swelled to over 600 pages! Six categories of improvements were introduced with V6 spanning object-relational extensions, network computing, performance and availability, capacity improvements, data sharing enhancements, and user productivity. The biggest of the new features were SQLJ, inline statistics, triggers, large objects (LOBs), user-defined functions, and distinct types.

Version 6 is also noteworthy in that there was this “thing” typically referred to as the V6 refresh. It added functionality to DB2 without there being a new release or version. The new functionality in the refresh included SAVEPOINTs, identity columns, declared temporary tables, and performance enhancements (including star join).

March 2001 brings us to DB2 Version 7, another “smaller” version of DB2. Developed and released around the time of the Year 2000 hubbub, it offered much-improved utilities and some nice new SQL functionality including scrollable cursors, limited FETCH, and row expressions. Unicode support was also introduced in DB2 V7.

DB2 Version 8 followed, but not immediately. IBM took advantage of Y2K and the general desire of shops to avoid change during this period to take its time and deliver the most significant and feature-laden version of DB2 ever. V8 had more new lines of code than DB2 V1R1 had total lines of code!

With DB2 9 for z/OS, we drop the “V” from the name. Is that in response to Oracle’s naming conventions? Well, we do add a space between the DB2 and the version number because we don’t want to talk about DB-twenty-nine! A lot of great new functionality comes with DB2 9 including additional database definition on demand capabilities, binary data types, and a lot of new SQL capabilities including OLAP functions and EXCEPT/INTERSECT. But probably the biggest new feature is pureXML, which allows you to store DB2 data as native XML. The XML is stored natively as a new data type that can be searched and analyzed without the need to reformat it. The approach was novel in that it supported XML natively, essentially enabling dual storage engines.

And that brings us to DB2 10 for z/OS. This version of DB2 was built to take advantage of many zEnterprise (the latest new mainframe at the time) features to deliver scalability. Examples include improved compression, cache optimization, blades for running the Smart Analytics Optimizer, etc. 

Additional capabilities included many performance improvements (BIND, IN-list, utilities, etc.), hash organized table spaces, high-performance DBATs (DDF threads that can take advantage of RELEASE(DEALLOCATE)), parallel index updating, efficient caching of dynamic SQL with literals, temporal data support, safe query optimization, improved access path hints, access to currently committed data, new TIMESTAMP precision and time zones, and buffer pool options for pinning objects in memory.

In October 2013 we got another new version, DB2 11 for z/OS. Click on that link if you want all the details, but some highlights included transparent archiving, global variables, improved SQL PL, APREUSE(WARN), significant utility improvements, DROP COLUMN support, and JSON support with IBM BigInsights.

And that brings us to the present day, with DB2 12 for z/OS as the current (and soon to be only) supported version of Db2. Released for general availability in October 2016, DB2 12 for z/OS abandons the traditional new release cycle that IBM has followed for decades, adopting a new continuous delivery model. New functionality is now delivered in Function Levels (FLs) that are easily applied and delivered much more rapidly than in the past. Indeed, the current Db2 function level is FL506, which means there have been 6 new function levels added since 2016.

Version 12 brought with it a plethora of new capabilities including virtual storage enhancements, optimization improvements, and improved control over the introduction of new SQL capabilities. DB2 12 for z/OS delivered many improvements for both application development and database administration. Examples of new application capabilities include:
  • Additional support for triggers, arrays, global variables, pureXML, and JSON
  • MERGE statement enhancements
  • SQL pagination support
  • Support for Unicode columns in an EBCDIC table
  • Piece-wise deletion of data
  • Support for temporal referential constraint
  • More flexibility in defining application periods for temporal tables
  • PERCENTILE function support
  • Resource limits for static SQL statements
  • Db2 REST services improve efficiency and security
  • DevOps with Db2: Automated deployment of applications with IBM UrbanCode Deploy
Examples of new DBA and SYSADM capabilities include:

  • Installation or migration without requiring SYSADM
  • Improved availability when altering index compression
  • Online schema enhancements
  • Improved catalog availability
  • Object ownership transfer
  • Improved data validation after running DSN1COPY
  • Automatic start of profiles at Db2 start
  • Increased partition sizes and simplified partition management for partition-by-range table spaces with relative page numbering
  • Ability to add partitions between existing logical partitions
  • UNLOAD privilege for the UNLOAD utility
  • Temporal versioning for Db2 catalog tables
  • Statistics collection enhancements for SQL performance    
Of course, these are just some of the V12 improvements; there are many more (as well as all of the Function Level improvements)!

Then sometime in the middle of 2017, IBM decided to change the name of DB2 by making the uppercase B a lowercase b. So now the name of our beloved DBMS is Db2. Nobody has been able to explain to me what the benefit of this was, so don’t ask me!

The Bottom Line

I worked with DB2 way back in its Version 1 days, and I’ve enjoyed watching DB2 grow over its first 35 years. Of course, we did not cover every new feature and capability of each version and release, only the highlights. Perhaps this journey back through time will help you to remember when you jumped on board with Db2 and relational database technology. I am happy to have been associated with Db2 (and DB2) for its first 35 years and I look forward to many more years of working with Db2… 

Monday, February 16, 2015

Choosing Between DECIMAL and FLOAT Data Types

DB2 can use both DECIMAL and FLOAT data types to store non-integer numeric data. But the two are not equivalent. In general, use DECIMAL instead of FLOAT whenever you can. The main problem with floating point numbers is that they are not precise. DECIMAL values are precise. In other words, a FLOAT value will be an approximate value whereas a DECIMAL value will be an exact value.

At times, if 100 percent precision is not required, you might want to use floating point numbers to save on storage. DB2 provides a parameter to size the floating point column: n, where the data type specification is FLOAT(n). If n is between 1 and 21, this is a single precision floating point number and the column will require 4 bytes of storage; if n is between 22 and 53, this is a double precision floating point number and it will require 8 bytes to store it. A single precision floating-point number is a short (32-bit) floating-point number. A double precision floating-point number is a long (64-bit) floating-point number.

For DECIMAL columns, the byte count is calculated as INTEGER(p/2)+1; where p is the precision of the DECIMAL column. So, a DECIMAL(10,2) column will require (10/2)+1 bytes = 6 bytes. An approximation of the same number could be stored in a FLOAT(21) column that would require only 4 bytes of storage.
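To put the storage trade-off in DDL terms, here is a minimal sketch (the table and column names are invented for illustration):

CREATE TABLE PRODUCT_PRICE
 (PRODUCT_ID  INTEGER       NOT NULL,
  LIST_PRICE  DECIMAL(10,2) NOT NULL,
  EST_PRICE   FLOAT(21));

Here LIST_PRICE is stored exactly in 6 bytes, while EST_PRICE is stored approximately in 4 bytes.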

For very large or very small numbers, though, you will have to use FLOAT columns. This is so because there is a limit of 31 on decimal precision. The following outlines the largest and smallest values that can be supported using DECIMAL and FLOAT data types:

  •   Smallest FLOAT value is about -7.2 × 10**75
  •   Largest FLOAT value is about 7.2 × 10**75
  •   Smallest positive FLOAT value is about 5.4 × 10**-79
  •   Largest negative FLOAT value is about -5.4 × 10**-79
  •   Smallest DECIMAL value is 1 - 10**31
  •   Largest DECIMAL value is 10**31 - 1

Note that the values for floating point numbers are approximations. The ** is used to indicate “raised to the power of”.


If you are moving data between platforms, there is an additional concern when using FLOAT. Mainframes traditionally use IBM's hexadecimal floating-point format, whereas most other platforms use the IEEE 754 standard. Since floating point numbers are imprecise to begin with, this may not be a problem. However, if you want to make sure that a particular column will hold exactly the same value regardless of platform, then floating point is not the way to go.

There is a third, newer option called DECFLOAT. Introduced in DB2 9 for z/OS, DECFLOAT is a combination of the two data types discussed here, or a decimal floating-point data type. It is specified as DECFLOAT(n), where n can be either 16 or 34, representing the number of significant digits that can be stored. If n is not specified, the DECFLOAT column can represent 34 significant digits.

A decimal floating-point value is an IEEE 754r number with a decimal point. The maximum precision is 34 digits and the range of a DECFLOAT number is as follows:

A DECFLOAT(16) value can range from a low of -9.999999999999999 × 10**384 to a high of 9.999999999999999 × 10**384.

And a DECFLOAT(34) value can range from a low of -9.999999999999999999999999999999999 × 10**6144 to a high of 9.999999999999999999999999999999999 × 10**6144.


In addition, the DECFLOAT data type can be used to represent several special values that represent "non-number numbers," as follows:

  • Infinity - a value that represents a number whose magnitude is infinitely large.
  • Quiet NaN - a value that represents undefined results and does not cause an invalid number condition. (NaN stands for “Not a Number.”)
  • Signaling NaN - a value that represents undefined results which will cause an invalid number condition if used in any numerical operation.
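To see DECFLOAT in action, consider this sketch (the table is hypothetical, and note that the special values can be supplied as character strings cast to DECFLOAT, if I am remembering the string forms correctly):

CREATE TABLE MEASUREMENT
 (SENSOR_ID  INTEGER NOT NULL,
  READING    DECFLOAT(34));

SELECT CAST('NAN' AS DECFLOAT(34)),
       CAST('INFINITY' AS DECFLOAT(34))
FROM   SYSIBM.SYSDUMMY1;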


So decimal floating-point values can be more flexible and precise, with the ability to range lower and higher than floating point values (or decimal values). However, before using DECFLOAT, be careful and plan ahead, particularly if you use COBOL programs to operate on your DB2 for z/OS data, because there is no way to specify the SQL DECFLOAT data type in COBOL.


Monday, July 01, 2013

DB2 Locking, Part 13: Optimistic Locking

Continuing our series on DB2 locking, let's look into a relatively recent development -- optimistic locking...

IBM added improvements for optimistic locking techniques in DB2 9 for z/OS.  What is optimistic locking? Sometimes referred to as optimistic concurrency control, optimistic locking is basically just what it sounds like. We are optimists and think that usually we will be the only ones with interest in the data. In other words, when optimistic locking is implemented you are assuming that most of the time there will be no other programs that are interested in the page of data that you are planning to modify.

Of course, even in the most optimistic world there will be exceptions, so optimistic locking does not assume that there will never be any concurrent processes that need to access your page(s). Basically, with optimistic locking you can improve performance by minimizing locking. So how do we do that?

When an application uses optimistic locking, locks are obtained immediately before a read operation and then released immediately. Update locks are obtained immediately before an update operation and held until the end of the transaction. Optimistic locking uses the RID (Record IDentifier) and a row change timestamp to test whether data has been changed by another transaction since the last read operation.

DB2 knows when a row was changed and so therefore he (I always tend to make DB2 masculine, sorry ladies) can ensure data integrity even as he minimizes the duration of locks. With optimistic locking, DB2 releases the page (or row) locks immediately after a read operation. And if you are using row locks, DB2 releases the row lock after each FETCH, taking a new lock on a row only for a positioned update or a positioned delete.

Careful readers will have noticed that I talked about a “row change timestamp” but you may not have heard that expression before. DB2 V9 added support for automatically generated timestamp columns and if you wish to implement optimistic locking you will need to create (or alter) your tables to have a row change timestamp column, defined as follows:

NOT NULL GENERATED ALWAYS
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

or

NOT NULL GENERATED BY DEFAULT
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

For tables having a row change timestamp column, DB2 automatically populates and maintains the timestamp values for each row. Notice how the syntax is similar to the syntax used for other automatically generated DB2 values, such as sequences. DB2 will automatically generate the timestamp value for each row when the row is inserted, and modify the timestamp for each row when any column in that row is updated.

When you add a ROW CHANGE TIMESTAMP column to an existing table, the initial value for existing rows will not be immediately populated. Instead, DB2 places the table space in an advisory-REORG pending state. When you reorganize the table space, DB2 generates the values for the ROW CHANGE TIMESTAMP column for all rows (and, of course, removes the advisory-REORG pending status).

OK, but how does this implement optimistic locking? Well, you can use this new column as a condition for making an UPDATE, by specifying it in your WHERE clause. Let’s walk thru a couple of examples.

First of all, when a table contains a ROW CHANGE TIMESTAMP you can use it to find out when its rows were modified. Let’s use the following table as an example:

CREATE TABLE CUSTOMER
 (CUSTNO           CHAR(8)   NOT NULL,
  CUST_INFOCHANGE  NOT NULL GENERATED ALWAYS
                   FOR EACH ROW ON UPDATE
                   AS ROW CHANGE TIMESTAMP,
  CUST_NAME        VARCHAR(50),
  CUST_ADDRESS     VARCHAR(100),
  CUST_CITY        CHAR(20),
  CUST_STATE       CHAR(2),
  CUST_ZIP         CHAR(9),
  CUST_PHONE       CHAR(10),

  PRIMARY KEY (CUSTNO))

Now that the table is defined with the ROW CHANGE TIMESTAMP we can use it in our programs and queries to determine change information about the data. For example, if we want to find all of the customer rows that were changed in the past week (i.e., the last 7 days) we could run the following query:

SELECT CUSTNO, CUST_NAME
FROM   CUSTOMER
WHERE  ROW CHANGE TIMESTAMP FOR CUSTOMER <=
       CURRENT TIMESTAMP
AND    ROW CHANGE TIMESTAMP FOR CUSTOMER >=
       CURRENT TIMESTAMP - 7 DAYS;

But what would happen if you issued a statement like this against a table that was altered to include a ROW CHANGE TIMESTAMP? For example, if we created the CUSTOMER table as shown but without the CUST_INFOCHANGE column, populated the table with data, and then altered the table to include the CUST_INFOCHANGE column? In this case, DB2 will use the time the page was last modified. So the results will not be exactly correct because it would return all the rows on each page that qualifies (because at least one row on the page changed). This is why it is important to clear up the advisory REORG pending as soon as possible after adding the ROW CHANGE TIMESTAMP.

This is all well and good, and you can probably see the value of having this automagically changing timestamp in some of your tables, but where is the optimistic locking part? Well, for programs that use updateable static scrollable cursors DB2 can use optimistic locking as long as the program is bound specifying ISOLATION(CS). If you have this situation, DB2 will deploy optimistic locking to reduce the duration of locks between consecutive FETCH operations and between fetch operations and subsequent positioned UPDATE or DELETE operations.

Without optimistic locking, the lock taken at the first FETCH is held until the next FETCH. The lock taken at the last FETCH is held until COMMIT, ROLLBACK, or the end of transaction.

With optimistic locking, the scenario changes significantly. When the application requests a FETCH to position the cursor on a row, DB2 locks that row, executes the FETCH and releases the lock. When the application requests a positioned UPDATE or DELETE on the row, DB2 locks the row and then re-evaluates the predicate to ensure that the row still qualifies for the result table.

Optimistic locking itself will not happen without some effort on your part. Your application must have a ROW CHANGE TIMESTAMP and it must be selected first. Then, during a modification, a predicate needs to be added as a condition to tell whether the row has been modified or not. The static scrollable cursor uses the optimistic locking technique automatically. DB2 cannot use optimistic concurrency control for dynamic scrollable cursors. With dynamic scrollable cursors, the most recently fetched row or page from the base table remains locked to maintain position for a positioned UPDATE or DELETE.
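To illustrate the technique, here is a sketch using the CUSTOMER table from the earlier example (the host variable names are hypothetical). First, read the row and its change timestamp; later, make the update conditional on that timestamp:

SELECT CUST_NAME, ROW CHANGE TIMESTAMP FOR CUSTOMER
INTO   :HV-NAME, :HV-RCT
FROM   CUSTOMER
WHERE  CUSTNO = '00000010';

UPDATE CUSTOMER
SET    CUST_PHONE = :HV-PHONE
WHERE  CUSTNO = '00000010'
AND    ROW CHANGE TIMESTAMP FOR CUSTOMER = :HV-RCT;

If the UPDATE hits no rows (SQLCODE +100), some other transaction modified the row after our read, and the application can re-read and retry instead of holding locks across the whole think time.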


So, if you have not taken a look at which applications might benefit from optimistic locking techniques since your shop migrated to DB2 9 for z/OS, now is the time to examine which applications could take advantage of optimistic locking – and then add the ROW CHANGE TIMESTAMP to the appropriate tables.

Tuesday, February 28, 2012

Identifying Unused Indexes

Did you know that DB2 V9 added a new column to the Real Time Statistics to help identify unused indexes?

The LASTUSED column in the SYSINDEXSPACESTATS table contains a date indicating the last time the index was used. Any time the index is used to satisfy a SELECT, FETCH, searched UPDATE, searched DELETE, or to enforce a referential constraint, the date is updated.

This helps to solve the problem of determining whether or not an index is being used. Standard operating advice is to DROP or delete anything that is not going to be used. But trying to determine whether something is actually used or not can be difficult at times.

You could always query your PLAN_TABLEs or the plan and package dependency tables in the DB2 Catalog for static SQL. But what about dynamic SQL? That is more difficult. Now, as of DB2 V9, you can simply query the LASTUSED column to see when the index was last used. The LASTUSED date is by partition. So, for a partitioned index, the last used date of each partition in the index should be checked.
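For example, a query along these lines can surface candidate indexes (a sketch assuming the standard RTS columns; the 180-day threshold is an arbitrary choice):

SELECT CREATOR, NAME, PARTITION, LASTUSED
FROM   SYSIBM.SYSINDEXSPACESTATS
WHERE  LASTUSED < CURRENT DATE - 180 DAYS
ORDER BY LASTUSED;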

Of course, you will have to give it some time because you might have an index supporting a rarely used query. Most shops have queries and programs that run quarterly, or even annually, but nevertheless are very important... and you wouldn't want to drop the indexes that support those queries just because they run infrequently, because when they do run, they are important...

Examine the LASTUSED column over time to determine which indexes are truly not being used, and then DROP the unused indexes.

Tuesday, July 19, 2011

Updating DB2 Developer's Guide (6th edition)

Just a brief blog post today to inform regular readers who might be concerned that I have not been blogging as frequently as in the past.

I am currently working on updating my book, DB2 Developer's Guide, for DB2 versions 9 and 10. This will be the sixth edition of the book and if all goes as planned, it should be available on IBM Press before the end of the year (2011).

Be sure to keep an eye on this blog and my web site (http://www.craigsmullins.com) for further details about the next edition of the book.

Friday, September 24, 2010

A Recommended New DB2 Book

Judy Nall has performed a much-needed service for the DB2 for z/OS community by writing her new book, DB2 9 System Administration for z/OS: Certification Study Guide. There are many DB2 for z/OS books (heck, I wrote one myself) that cover programming, performance, and database administration details. But never before has there been one that focused on system administration and system programming.

Of course, the book is targeted at those looking to become an IBM Certified System Administrator for DB2 for z/OS. I have never taken the exams required for that certification, but the material in this book will go a long way toward making you a better system programmer for a mainframe DB2 environment.

Whereas some of the material can be found in greater detail in other books on the market, we must keep in mind that target market for the book. And the coverage of DB2 fundamentals and performance is well-written and hits the mark for systems folks. And the chapters on installation and migration, system backup and recovery, and systems operation and troubleshooting offer great systems-level knowledge not found in other DB2 for z/OS books.

So while DB2 9 System Administration for z/OS: Certification Study Guide is not for everyone, the people that it is for (systems programmers and systems DBAs) should enjoy it and benefit from the nice job Judy has done organizing and explaining the details of system administration for DB2 for z/OS.

Thursday, September 24, 2009

Limiting the Number of Rows Fetched

Application developers frequently need to retrieve a limited number of qualifying rows from a table. For example, maybe you need to list the top ten best selling items from inventory, or a list of the top five most expensive products (i.e., highest price tag). There are several ways to accomplish this prior to DB2 V7 using SQL, but they are not necessarily efficient.

The first reaction is to simply use the WHERE clause to eliminate non-qualifying rows. But this is simplistic, and often is not sufficient to produce the results desired in an optimal manner. What if the program only requires that the top ten results be returned? This can be a somewhat difficult request to formulate using SQL alone.

Consider, for example, an application that needs to retrieve only the top ten most highly paid employees from the EMP sample table. You could simply issue a SQL request that retrieves all of the employees in order by salary, but only use the first ten retrieved. That is easy; for example:

SELECT EMPNO, FIRSTNME, LASTNAME, SALARY
FROM DSN8710.EMP
ORDER BY SALARY DESC;

You must specify the ORDER BY clause with the DESC key word. This sorts the results into descending order, instead of the default, which is ascending. Without the DESC key word, the "top ten" would be at the very end of the results set, not at the beginning.

But that does not really satisfy the requirement - retrieving only the top ten. It merely sorts the results into descending sequence. So the results would still be all employees in the table, but in the correct order so you can view the "top ten" salaries very easily. The ideal solution should return only the top ten employees with the highest salary and not merely a sorted list of all employees.

You can code some "tricky" SQL to support this request for all versions of DB2, such as the following:

SELECT EMPNO, FIRSTNME, LASTNAME, SALARY
FROM   DSN8710.EMP A
WHERE  10 > (SELECT COUNT(*)
             FROM   DSN8710.EMP B
             WHERE  A.SALARY < B.SALARY)
AND    SALARY IS NOT NULL
ORDER BY SALARY DESC;

This SQL is portable from DB2 to other DBMSs, such as Oracle or SQL Server. And, of course, you can change the constant 10 to any number you wish, thereby retrieving the top 20, or top 5, as deemed necessary by the needs of your application.

Since the SALARY column is nullable in the EMP table, you must remove the nulls from the results set. And the ORDER BY is required to sort the results in the right order. If it is removed from the query, the results will still contain the top ten, but they will be in no particular order.

But DB2, as of V7, provides an easier and less complicated way to limit the results of a SELECT statement: the FETCH FIRST clause. You can code FETCH FIRST n ROWS ONLY, which will limit the number of rows that are fetched and returned by a SELECT statement.
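So the top ten most highly paid employees can be retrieved with something as simple as:

SELECT EMPNO, FIRSTNME, LASTNAME, SALARY
FROM   DSN8710.EMP
WHERE  SALARY IS NOT NULL
ORDER BY SALARY DESC
FETCH FIRST 10 ROWS ONLY;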

Additionally, you can specify a new clause -- FETCH FIRST ROW ONLY -- on SELECT INTO statements when the query can return more than one row in the answer set. Doing so informs DB2 to ignore the other rows.

There is one difference between the new V7 formulation and the other SELECT statement we reviewed, and that is the way "ties" are handled. A tie occurs when more than one row contains the same value. The previous query we examined may return more than 10 rows if there are multiple rows with the same value for salary within the top ten.

Using the FETCH FIRST clause, DB2 will limit the number of rows returned to ten, even if there are other rows with the same value for salary as the number ten row in the results set. The needs of your application will dictate whether ties are to be ignored or included in the result set. If all "ties" need to be included in the results set, which would mean that more than 10 rows would be needed, the new V7 feature may not prove to be helpful.

And it is also important to note that as of DB2 9, you can include the FETCH FIRST clause in a subselect. ORDER BY is allowed in a subselect, too. The subselect MUST be enclosed in parentheses and the FETCH FIRST (or ORDER BY) cannot be in the outermost fullselect of a view, or in a materialized query table.
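For example, a query along these lines should work under DB2 9, computing the average of the ten highest salaries (a sketch using the same sample table):

SELECT AVG(SALARY)
FROM  (SELECT SALARY
       FROM   DSN8710.EMP
       ORDER BY SALARY DESC
       FETCH FIRST 10 ROWS ONLY) AS TOP_TEN;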

Thursday, May 28, 2009

Hasn't DB2 9 for z/OS Been GA for 2 Years Now?

Just a quick note to question the entire DB2 community about embracing the latest (I hesitate to call a 2 year old product "new") version of DB2. Isn't it about time that most of the DB2 base moved to DB2 V9?

Why would people not be moving? Well, there is always the cost and time involved in migrating to a new version. Perhaps organizations are not willing to expend the resources needed to migrate as frequently as they did in the past.

Perhaps some organizations are waiting for others to shake the bugs out of a new release. But this is troublesome. The more organizations that follow that approach, the longer it takes for those that do adopt the latest version to fully test the software. Which means that waiting might not help.

There is also the possibility that some organizations do not find a compelling need for the new functionality offered by V9. Even if that is the case, most organizations still should want the performance gains to be had by upgrading. But to fully take advantage of that requires rebinding, which also takes time and resources to do properly.

And then there is the issue of support for new versions. Organizations need their DB2 tool providers (ISVs and IBM) to not only support, but exploit the new features of the "new" version. Remember a few years ago when IBM entered the DB2 tools market in a big way? One of their stated objectives was to provide day one support for new versions. I wonder what happened to that objective? It is years after V9 GA and many of the IBM tools (according to IBM's own web site) only offer a "very limited number of new functions and features of DB2 9."

So is your shop at DB2 V9 yet? If not, why? If so, howzitgoing?

Tuesday, April 22, 2008

My Most Recent DB2 Articles

Today I'm posting a quick blog entry to let my readers know about a couple of recent DB2-related articles/columns that I've had published. You can always keep up-to-date on my writings by visiting my web site at http://www.craigsmullins.com/articles.htm.

Any way, the following three articles might be of interest to DB2 for z/OS folks:

Use Real Time Statistics to Automate Your Database Maintenance was published in the April/May 2008 issue of zJournal. This article examines Real Time Statistics (RTS) and the benefits that can be accrued by using RTS. If you aren't using RTS yet, be sure to read this article to learn why you should!

Collecting Histogram Statistics With RUNSTATS was published in the March 2008 issue of DB2 Update. This article discusses one of the many new enhancements that have found their way into DB2 9 for z/OS -- the ability to gather histogram statistics with the IBM RUNSTATS utility.

And finally, the February/March 2008 issue of zJournal contains Much Ado About DB2 Locking. This installation of the z/Data Perspectives column takes a look at the most recent, new locking-related features of DB2 for z/OS.

Happy reading!

Tuesday, February 05, 2008

Intelligent Automatic Reoptimization? [DB2 9 for z/OS]

Most seasoned SQL programmers know that when you use host variables, DB2 may not always come up with the absolutely most optimal access path at bind time. Without knowing the actual values of the host variables the optimizer has to make some best guesses as to how best to satisfy the SQL request.

Furthermore, we know that we can guide DB2 on how best to approach this situation using the REOPT parameter of the BIND command. Prior to DB2 V9, there were three options for REOPT:

REOPT(NONE) – DB2 will not reoptimize SQL at run time.

REOPT(ALWAYS) – DB2 will prepare SQL statements again at run time when the host variable values are known. This enables the DB2 optimizer to formulate the query execution plan using the actual host variable values, which can result in better performing access paths.

REOPT(ONCE) – DB2 will prepare SQL statements only once, using the first set of host variable values, no matter how many times the statement is executed by the program. The access path is stored in the Dynamic Statement Cache (DSC) and will be used for all subsequent executions of the same SQL statement. REOPT(ONCE) only applies to dynamic SQL statements and is ignored if you use it with static SQL statements. This option was introduced in DB2 V8.

What is New in V9?

DB2 9 for z/OS introduces a new REOPT option: REOPT(AUTO). The idea behind REOPT(AUTO) is to come up with the optimal access path in the minimum number of prepares.

The basic premise of REOPT(AUTO) is to re-optimize only when host variable values change. Using this option, DB2 will examine the host variable values and will generate new access paths only when host variable values change and DB2 has not already generated an access path for those values.

REOPT(AUTO) only applies to dynamic statements that can be cached.

After migrating to DB2 9, consider re-evaluating programs bound specifying REOPT(ALWAYS) and REOPT(NONE). In many cases, switching from REOPT(ALWAYS) to REOPT(AUTO) can improve performance; and REOPT(AUTO) can bring re-optimization to programs that were bound with REOPT(NONE) out of fear that too-frequent re-optimization would hurt performance.
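For example, switching a package over is just a matter of rebinding it (the collection and package names here are placeholders):

REBIND PACKAGE(MYCOLL.MYPKG) REOPT(AUTO)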

In particular, consider specifying REOPT(AUTO) for SQL statements that at times can take a relatively long time to execute, depending on the values of parameter markers. This is especially worthwhile when the parameter markers refer to non-uniformly distributed data that is joined to other tables.

Tuesday, December 18, 2007

Indexing the DB2 Catalog [DB2 9 for z/OS]

First of all, I want to apologize for the lack of new posts here the last couple of weeks. I've been busy traveling and preparing for the holidays -- as have most of you I guess, at least the holiday preparation part! Anyway, this will likely be my last post this year and it will be a short one at that. But be sure to check back again in the new year (2008) as I will begin posting a bit more regularly again (hopefully).

In today's post I want to tout a small but helpful improvement in DB2 9 for z/OS that makes it easier to build all of the indexes you want on your DB2 Catalog tables. As you know, even though the DB2 Catalog ships with a set of pre-defined indexes from IBM, you can add your own for improving the performance of your catalog queries. Some DB2 tools vendors even ship recommendations of indexes for you to build to make their solutions work more efficiently.

Well, prior to V9 the limit for user-defined indexes on the DB2 Catalog was 100. Over time, some shops have bumped up against this limit and were unable to define any additional indexes.

The good news is that DB2 V9 ups the limit to 500 user-defined catalog indexes.

That's all for today... and I wish all of my readers a very happy holiday season!

Monday, November 26, 2007

UPDATE SCHEMA and CATMAINT [DB2 9 for z/OS]

Welcome back to my blog as I continue our examination of the new features of DB2 9 for z/OS. Today we will look at the new UPDATE SCHEMA capability of the CATMAINT utility.

Have you ever wanted to make a global change to a schema, owner, creator, or VCAT name for your DB2 objects? Well, you can do that with CATMAINT in DB2 9 for z/OS using new UPDATE SCHEMA options.

There are three (3) new options added to CATMAINT, namely:

  • SCHEMA: Owner, creator and schema names can be changed using this option.
  • VCAT: Indexes, table spaces and storage groups can be altered to use a different ICF or VCATNAME using this option.
  • OWNER: Ownership of objects can be changed to a role using this option.

To use any of these options you must be in DB2 9 NFM and have Install SYSADM authority.

How does it work? Well, let’s take a look at a few examples, starting with the SCHEMA option. To rename the owner, creator, and schema of database objects, plans, and packages, we will run CATMAINT specifying the SCHEMA SWITCH option. This process updates every owner, creator, or schema name in the catalog and directory that matches the specified schema_name value. Importantly, all GRANTs that were made by or received by the original owner are changed to the new owner. Ownership of objects is not changed if the owner is a role.

So if we want to change OLDNAME to NEWNAME we can code the following CATMAINT job:

CATMAINT UPDATE
SCHEMA SWITCH(OLDNAME, NEWNAME)

You can change multiple names by repeating the SWITCH keyword, but you are not allowed to code the same name more than once.

Be aware though, when the schema name of an object is changed, any plans or packages that are dependent on the object are invalidated. If you do not REBIND those plans and packages an automatic REBIND will occur the next time you execute any of those programs.

Here is another example, this time for the VCAT option. To change the VCAT name that is used by storage groups or by index spaces and table spaces, we can run CATMAINT specifying the VCAT SWITCH option. This option is similar to using the ALTER TABLESPACE USING VCAT statement for changing the VCAT name. You need to move the data for the affected indexes or table spaces to the data set on the new catalog in a separate step.

So if we want to change OLDVCAT to NEWVCAT we can code the following CATMAINT job:

CATMAINT UPDATE
VCAT SWITCH(OLDVCAT, NEWVCAT)

You can change multiple VCAT names by repeating the SWITCH keyword, but you cannot specify the same name more than once. There are several restrictions to this option that you should research in the IBM manuals before attempting to switch VCAT names.

The final option is the OWNER option. It is used for changing the ownership of objects from a user to a role. Roles are new in DB2 9 and are associated with a TRUSTED CONTEXT. This will be the subject of a future blog posting here on the DB2portal blog – so keep an eye out for that one soon.

For example, if we want to switch ownership of objects for OWNER1, OWNER2 and OWNER3 to a role, we can run CATMAINT as follows:

CATMAINT UPDATE
OWNER FROM(OWNER1, OWNER2, OWNER3) TO ROLE

You must be running under a trusted context with a role to run this utility. The current role will become the owner. Privileges held on the object will be transferred from the original owner to the role.

A final caveat: be sure to create backups of your DB2 Catalog and DB2 Directory before running this CATMAINT to switch SCHEMA, VCAT, or OWNER.

Wednesday, November 07, 2007

BACKUP and RESTORE SYSTEM [DB2 9 for z/OS]

I am posting today’s blog entry from Athens, Greece as I participate in the European IDUG conference. Good thing I know how to use the Blogger site because when I log in over here in Greece the text on their site is all converted into Greek - and as I'm sure comes as no surprise to anyone, I don't understand Greek!

Anyway, today's post will be about the improvements IBM has made to the BACKUP SYSTEM and RESTORE SYSTEM utilities in DB2 9 for z/OS. And this will be the final entry in this series on Version 9 features discussing utility improvements… it will not be the last in the series on V9 improvements though, just the last one on the utilities.

Also, please keep in mind that these blog posts are meant to deliver a flavor of the new functionality in DB2 9 for z/OS. They will not cover every nuance and detail of what V9 has to offer. With that said, let’s dive into the enhancements to the BACKUP and RESTORE SYSTEM utilities.

Overview

As most of you surely know, BACKUP SYSTEM and RESTORE SYSTEM are relatively new utilities, added to DB2 as of Version 8. They use disk volume FlashCopy backups and copy pool constructs (introduced in z/OS DFSMShsm V1R5) to copy and restore large volumes of DB2 data. In DB2 V9 these utilities are enhanced to use new functions available with z/OS V1R8 DFSMShsm.

Recovery of Individual Database Objects

In V9, backups produced by BACKUP SYSTEM (aka system level backups) can be used to recover individual table spaces or index spaces. This is helpful because previously you had to recover the entire system, which is not always necessary.

When you wish to recover a subset of a system level backup you will use the RECOVER utility instead of RESTORE SYSTEM. Before your RECOVER jobs can use system level backups you must first set the SYSTEM_LEVEL_BACKUPS DSNZPARM option to YES. This can be set from the DSNTIP6 install panel. If you specify YES then your system-level backups will be considered in object level recoveries (along with your other image copy backups).

If you wish to use your system level backups for individual database object recoveries then you need to make sure that you are copying your indexes (specifying COPY YES).

Why would you want to use your system level backups in this way? Well, doing so should enable you to reduce the frequency with which you are taking conventional image copies. If you take a daily system level backup, then the database objects that you were also backing up on a daily basis may not be required. Of course, you cannot completely forgo all individual image copies because the system level backup timing may not conform to the timing needed for each object based on application requirements, and of course, image copies will still be needed after running utilities like LOAD REPLACE and REORG LOG NO to resolve copy pending situations.

Tape Support for BACKUP SYSTEM

DB2 V9 also delivers the ability for the BACKUP SYSTEM utility to copy the data directly to tape. The new parameters allowing this capability are the DUMP and DUMPONLY options.

The output of the DUMP or DUMPONLY is directed to a DFSMShsm dump class, which specifies the unit type the data will be directed to. Although IBM implemented this change to enable tape support, an SMS dump class is not restricted to tape.

Keep in mind that directing data to tape will have an impact on the speed of your restore. Restoring from tape will not be as fast as restoring from a FlashCopy made to disk. Of course, having your data on tape can help in terms of storage management, disaster recovery and off-site data storage, and long-term data retention. So be aware of these trade-offs before creating system level backups on tape.

Additionally, recognizing that copying data to tape can be time-consuming, IBM has added a new keyword, FORCE, to enable a new backup to be started even if a previous DUMP has not yet completed. Of course, FORCE should not be used all the time; use it only when it is very critical that a new backup be started.
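A sketch of what such a backup might look like (the dump class name is hypothetical):

BACKUP SYSTEM FULL
  DUMP DUMPCLASS(ONSITE)

Adding the FORCE keyword to a statement like this would allow it to proceed even if a previous dump had not yet completed.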

Incremental FlashCopy

And finally, support for incremental copying has been added to FlashCopy. So now you can take a system level backup and then subsequent incremental system level backups. An incremental FlashCopy will copy only the tracks that have changed on the source volume since the last copy was taken. But unlike a typical incremental image copy, the previous content on the volume(s) will be replaced by the new content. That means there is no merging of incrementals required; essentially, the merge is part of the incremental FlashCopy.

I won’t go into all of the gory details here but this new functionality can greatly minimize I/O activity for system level backups.

Summary

So, to sum things up, the ability to work with system level backups becomes easier in V9 because you can recover individual table spaces and indexes from a system level backup without having to restore the entire backup, you can make system level backups directly to tape, and we get the ability to do incremental system level backups. All in all, some nice new features for BACKUP SYSTEM and RESTORE SYSTEM, wouldn't you say?

Monday, October 29, 2007

Improvements to the REORG Utility [DB2 9 for z/OS]

As we continue to wind our way through the multiple improvements that DB2 V9 brings to the IBM utilities, it is time to address the enhancements made to the REORG utility.

Performance

First up, let’s talk about performance improvements. In DB2 9 for z/OS, the REORG utility can unload and reload partitions in parallel. This should result in a nice reduction in elapsed time when you are reorganizing partitioned table spaces. To enable this improvement you will need to code the NOSYSREC keyword or the UNLDDN keyword with a template. Note that NOSYSREC is always used for SHRLEVEL CHANGE.
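For example, a REORG coded along these lines should qualify for parallel unload and reload (the database and table space names are made up):

REORG TABLESPACE MYDB.MYPARTTS
  SHRLEVEL REFERENCE
  NOSYSREC
  SORTDEVT SYSDA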

Parallelism will not be enabled if any of the following conditions apply:

  • DATAWKnn ddnames are specified in the JCL
  • SORTDEVT keyword is not specified
  • UTPRINT is allocated to anything other than SYSOUT
  • The REBALANCE keyword is used.

Also consider that REORG SHRLEVEL CHANGE can use subtasks during the LOG phase to speed up the processing of log records. Subtasks will be used when unload and reload have been done by partition and there are log records to apply for the partition.

BUILD2 Phase Eliminated in Online REORG

As you probably know, an Online REORG uses shadow data sets. The reorganization is done to the shadows while the data remains available in the primary data sets. When the REORG is done, the shadow data sets are switched to be the primary data sets.

As most DBAs know, reorganization by partition can create a long outage when non-partitioned indexes (NPIs) exist on the table. Why is this so? Well, prior to V9, during Online REORG, NPIs had to be updated with the new RIDs for the data of the partition being reorganized during the BUILD2 phase. And that is what typically caused the sometimes significant outage.

With DB2 V9, the BUILD2 phase is eliminated. Instead, the entire NPI will be reorganized and then switched in the SWITCH phase along with all of the other data sets. The only outage is during the SWITCH phase, which will be much, much shorter than the BUILD2 phase outage.

As a result of the elimination of the BUILD2 phase, DB2 will need additional temporary storage for the shadow data sets for each NPI. Additionally, the cost of the REORG will increase because building an entire NPI will consume more CPU than the previous method of updating RIDs in the BUILD2 phase. Finally, you will need to modify any REORG jobs against different partitions that ran in parallel because the entire NPI will be built and then switched.

What about running REORG SHRLEVEL REFERENCE by partition? Well, REORG SHRLEVEL REFERENCE by partition will also rebuild any associated NPIs using shadows. So the same type of concerns apply (more disk storage, additional CPU, no separate jobs in parallel).

Reorganization and LOBs

Finally, as I blogged about before in the posting on LOBS, prior to V9 you could not access LOB data during a REORG. And a REORG did not reclaim physical space from the LOB data set because LOBs were moved within the existing LOB table space. V9 fixes these problems. During a REORG (in V9), the original LOB table space is drained of writers. All LOBs are then extracted from the original data set and inserted into a shadow data set. When this operation is complete, all access to the LOB table space is stopped (the readers are drained) while the original data set is switched with the shadow data set. At this point, full access to the new data set is enabled, and an inline copy is taken to ensure recoverability of data.

Tuesday, October 02, 2007

More Online Utilities [DB2 9 for z/OS]

Today we continue our investigation into the new functionality in DB2 9 for z/OS... and specifically into the improvements made to IBM utilities for version 9. We will turn our attention to improved availability with IBM DB2 utilities. Under DB2 V9, three more utilities can run online – that is, with SHRLEVEL CHANGE.

First up we have CHECK DATA. This utility is executed when you want to make sure that the data in your tables matches your integrity constraints. Prior to V9, access to tables (or table spaces) was read-only while CHECK DATA ran. This means that as long as CHECK DATA is executing you cannot modify data. If you have complex referential sets and/or very large databases you likely will have issues with running CHECK DATA.

You can run CHECK DATA online – that means with SHRLEVEL CHANGE – as of DB2 V9. As with other online utilities, an online CHECK DATA will work on a shadow copy of the data and indexes. DB2 makes the shadow copy using DFSMS ADRDSSU. This copy can be super fast if you have data set level FlashCopy V2 enabled. Even if you do not have FlashCopy, availability will improve because the outage is reduced when you use SHRLEVEL CHANGE (just to make the copy) instead of SHRLEVEL REFERENCE (for the entire utility execution).

DB2 will check data integrity using the shadow copies of the data and indexes. When violations are found DB2 will not set the CHECK pending state. And remember, we are checking shadows, so DB2 will not delete rows in violation. Instead, CHECK DATA will create a PUNCHDDN containing REPAIR LOCATE DELETE input that can be run against the active data.
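An online check might be coded something like this sketch (the names are placeholders), with the generated REPAIR statements written to the data set allocated to the punch DD:

CHECK DATA TABLESPACE MYDB.MYTS
  SHRLEVEL CHANGE
  PUNCHDDN SYSPUNCH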

The CHECK LOB utility has also been enhanced to enable online execution. DB2 V9 allows SHRLEVEL CHANGE for checking LOBs using a shadow copy of the LOB table space. And just like CHECK DATA, DB2 uses the DFSMS ADRDSSU utility to create the shadow. And similarly, the check is executed against the shadow and REPAIR statements are generated that you subsequently can run against the actual LOB table space.

Finally, you can also run the REPAIR LOCATE utility specifying SHRLEVEL CHANGE. This enhancement allows us to execute the REPAIR utility online with LOCATE against indexes, index spaces, and table spaces.

Friday, September 07, 2007

COPY Improvements [DB2 9 for z/OS]

Next up in this sub-series on IBM utility improvements in the larger series on DB2 V9 features and functions we will take a look at the new “stuff” in the IBM COPY utility.

Copying Just the Pending Database Objects

The first nice new feature is a new parameter that can be used to make backup copies of only those database objects in a copy pending state. The new parameter is called SCOPE PENDING, and when you specify it the IBM COPY utility will only copy objects in copy pending or informational copy pending state.

The default is SCOPE ALL, which makes COPY act like it always acted; that is, without regard to the pending state.

This option can be used in conjunction with LISTDEF to build very powerful, selective backup jobs.
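For example, a selective backup job might look something like this sketch (the list, template, and database names are invented):

LISTDEF COPYLIST INCLUDE TABLESPACES DATABASE MYDB
TEMPLATE COPYTPL DSN &DB..&TS..IC.D&DA..T&TI.
COPY LIST COPYLIST COPYDDN(COPYTPL) SCOPE PENDING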

Copying Clone Tables

The IBM COPY utility also has been extended to support copying clone tables. If you are not sure what they are, follow the link to the blog entry in this series that covers that topic.

Anyway, the new CLONE parameter must be coded on the COPY specification in order for clone table and index data to be copied. It is important to understand that the CLONE parameter is basically an on/off toggle. If CLONE is coded, then COPY will copy only clone table (and/or index) data. You can specify both clone and regular database objects in the same COPY specification, but the IBM COPY utility will ignore non-clone objects when the CLONE parameter is specified; and it will ignore clone objects if the CLONE parameter is not specified.
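For example, something like the following (with hypothetical names) would copy only the clone data for the specified table space, writing to the default SYSCOPY output:

COPY TABLESPACE MYDB.MYTS CLONE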

CHECKPAGE

The CHECKPAGE parameter is not new to V9, but it is improved. Specifying CHECKPAGE indicates that the COPY utility should check each page in the table space or index for validity. If it finds an error, COPY issues a message that describes the type of error. Using CHECKPAGE is a good idea because it will identify problems in your page sets.

In previous releases of DB2, using CHECKPAGE with COPY was problematic because of high overhead and availability issues. In the DB2 9 for z/OS Technical Overview redbook, IBM reports that the CPU overhead is about 5 percent for COPY INDEX and 14 percent for COPY TABLESPACE if you are running V8. As for the availability issue, as soon as a validity problem is found on a page, the COPY utility will place the page set into COPY PENDING so it cannot be accessed while COPY chugs merrily along looking for additional invalid pages.

Both problems are addressed in DB2 9 for z/OS: IBM has significantly reduced the CPU overhead and fixed the availability issue. Instead of putting the page set into COPY PENDING immediately when it finds a problem, COPY will produce the message that it found a problem, but it will not set the pending state. Furthermore, it will continue checking pages but not copying anything.

Upon completion, COPY will issue a return code of 8 and it will update the SYSIBM.SYSCOPY catalog table indicating it found a broken page. This information is recorded in a new column, TTYPE, with a value of “B”. This is important to understand because the page set will not be in a COPY PENDING state, so you will have to check for the return code and fix the problem.

Tuesday, August 28, 2007

TEMPLATE Switching [DB2 9 for z/OS]

Next up in this on-going series covering the new features and functionality of DB2 9 for z/OS we will tackle the improvements that have been made to the IBM DB2 utilities.

A nice new feature that impacts multiple utilities is TEMPLATE switching. For those not familiar with templating, TEMPLATE is a utility control statement that enables the allocation of data sets for a LISTDEF (LISTDEF is the way you control the list of database objects for a utility execution). With the TEMPLATE statement you essentially specify data set naming conventions and allocation information without using JCL DD statements.

OK, so what is TEMPLATE switching? What it enables you to do is to specify different characteristics for image copies of varying sizes. So, for example, you can create a different template for low and high volume image copies. Here is an example:



//SYSIN DD *
TEMPLATE low DSN &DB..&TS..IC.D&DA..T&TI.
  UNIT=DASD LIMIT(50 CYL,high)
TEMPLATE high DSN &DB..&TS..IC.D&DA..T&TI.
  UNIT=TAPE
COPY TABLESPACE MY.SMALLTS COPYDDN(low)
COPY TABLESPACE MY.LARGETS COPYDDN(low)



Note the new parameter named LIMIT. This is what controls when the TEMPLATE is switched. In this example we set the LIMIT for low volume at 50 cylinders. When this is reached the TEMPLATE will be switched to the high volume template. Of course, you have the flexibility to control the limit by specifying whatever number is appropriate for your site as well as specifying it in CYL, GB, or MB.

It is important to understand that DB2 can only switch the TEMPLATE once, so you cannot have more than two templates.

Template switching is available for image copies produced by COPY, COPYTOCOPY, MERGECOPY, LOAD, and REORG. And you can set up template switching for both the COPYDDN and RECOVERYDDN.

Wednesday, August 22, 2007

Optimistic Locking [DB2 9 for z/OS]

DB2 Version 9 improves support for coding optimistic locking techniques. What is optimistic locking? Sometimes referred to as optimistic concurrency control, optimistic locking is basically just what it sounds like. We are optimists and think that usually we will be the only ones with interest in the data. In other words, when optimistic locking is implemented you are assuming that most of the time there will be no other programs that are interested in the page of data that you are planning to modify.

Of course, even in the most optimistic world there will be exceptions, so optimistic locking does not assume that there will never be any concurrent processes that need to access your page(s). Basically, with optimistic locking you can improve performance by minimizing locking. So how do we do that?

When an application uses optimistic locking, locks are obtained immediately before a read operation and then released immediately. Update locks are obtained immediately before an update operation and held until the end of the transaction. Optimistic locking uses the RID (Record IDentifier) and a row change timestamp to test whether data has been changed by another transaction since the last read operation.

DB2 knows when a row was changed and therefore he (I make DB2 masculine) can ensure data integrity even as he minimizes the duration of locks. With optimistic locking, DB2 releases the page (or row) locks immediately after a read operation. And if you are using row locks, DB2 releases the row lock after each FETCH, taking a new lock on a row only for a positioned update or a positioned delete.

Careful readers will have noticed that I talked about a “row change timestamp” but you may not have heard that expression before. DB2 V9 adds support for automatically generated timestamp columns and if you wish to implement optimistic locking you will need to create (or alter) your tables to have a row change timestamp column, defined as follows:

TIMESTAMP NOT NULL GENERATED ALWAYS
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

or

TIMESTAMP NOT NULL GENERATED BY DEFAULT
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

For tables having a row change timestamp column, DB2 will automatically populate and maintain the timestamp values for each row. Notice how the syntax is similar to the syntax used for other automatically generated DB2 values, such as sequences. DB2 will automatically generate the timestamp value for each row when the row is inserted, and modify the timestamp for each row when any column in that row is updated.

When you add a ROW CHANGE TIMESTAMP column to an existing table, the initial values for existing rows are not immediately populated. Instead, DB2 places the table space in an advisory-REORG pending state. When you reorganize the table space, DB2 generates the values for the ROW CHANGE TIMESTAMP column for all rows (and, of course, removes the advisory-REORG pending status).
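
For example, adding such a column to an existing table might look like this (just a sketch – the table and column names are hypothetical). It is this ALTER that leaves the table space in the advisory-REORG pending state described above:

ALTER TABLE MYTABLE
  ADD ROW_CHANGE_TS TIMESTAMP NOT NULL
      GENERATED ALWAYS FOR EACH ROW ON UPDATE
      AS ROW CHANGE TIMESTAMP;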

Now that you have this new column you can use it as a condition for making an UPDATE by specifying it in your WHERE clause. Let’s walk through a couple of examples.

First of all, when a table contains a ROW CHANGE TIMESTAMP you can use it to find out when its rows were modified. Let’s use the following table as an example:

CREATE TABLE CUSTOMER
  (CUSTNO           CHAR(8) NOT NULL,
   CUST_INFOCHANGE  TIMESTAMP NOT NULL GENERATED ALWAYS
                    FOR EACH ROW ON UPDATE
                    AS ROW CHANGE TIMESTAMP,
   CUST_NAME        VARCHAR(50),
   CUST_ADDRESS     VARCHAR(100),
   CUST_CITY        CHAR(20),
   CUST_STATE       CHAR(2),
   CUST_ZIP         CHAR(9),
   CUST_PHONE       CHAR(10),
   PRIMARY KEY (CUSTNO)
  );

Now that the table is defined with the ROW CHANGE TIMESTAMP we can use it in our programs and queries to determine change information about the data. For example, if we want to find all of the customer rows that were changed in the past week (i.e., the last 7 days) we could run the following query:

SELECT CUSTNO, CUST_NAME
FROM   CUSTOMER
WHERE  ROW CHANGE TIMESTAMP FOR CUSTOMER <=
         CURRENT TIMESTAMP
AND    ROW CHANGE TIMESTAMP FOR CUSTOMER >=
         CURRENT TIMESTAMP - 7 DAYS;

But what would happen if you issued a statement like this against a table that was altered to include a ROW CHANGE TIMESTAMP? For example, what if we created the CUSTOMER table as shown but without the CUST_INFOCHANGE column, populated the table with data, and then altered the table to add the CUST_INFOCHANGE column? In this case, DB2 uses the time the page was last modified. So the results will not be exactly correct: the query would return every row on each page that qualifies, because at least one row on the page changed. This is why it is important to clear up the advisory-REORG pending state as soon as possible after adding the ROW CHANGE TIMESTAMP.
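
Here is what the application-level optimistic check promised earlier might look like (a sketch only – the host variables and the CUSTNO value are hypothetical). First read the row and remember its row change timestamp, then make the UPDATE conditional on that timestamp:

SELECT CUST_NAME,
       ROW CHANGE TIMESTAMP FOR CUSTOMER
INTO   :CUSTNAME, :LASTCHANGE
FROM   CUSTOMER
WHERE  CUSTNO = '00000001';

UPDATE CUSTOMER
SET    CUST_NAME = :NEWNAME
WHERE  CUSTNO = '00000001'
AND    ROW CHANGE TIMESTAMP FOR CUSTOMER = :LASTCHANGE;

If the UPDATE affects zero rows (SQLCODE +100), another transaction changed the row after we read it, and the application can re-read the row and retry.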

OK, this is all well and good, and you can probably see the value of having this automagically changing timestamp in some of your tables, but where is the optimistic locking part? Well, for programs that use updateable static scrollable cursors DB2 can use optimistic locking as long as the plan/package is bound specifying ISOLATION(CS). If you have this situation, DB2 will deploy optimistic locking to reduce the duration of locks between consecutive FETCH operations and between fetch operations and subsequent positioned UPDATE or DELETE operations.

Remember, though, DB2 cannot use optimistic concurrency control for dynamic scrollable cursors. With dynamic scrollable cursors, the most recently fetched row or page from the base table remains locked to maintain position for a positioned UPDATE or DELETE.

Without optimistic locking, the lock taken at the first FETCH is held until the next FETCH. The lock taken at the last FETCH is held until COMMIT, ROLLBACK, or the end of transaction.

With optimistic locking, the scenario changes significantly. When the application requests a FETCH to position the cursor on a row, DB2 locks that row, executes the FETCH and releases the lock. When the application requests a positioned UPDATE or DELETE on the row, DB2 locks the row and then re-evaluates the predicate to ensure that the row still qualifies for the result table.
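
For example, a program might use an updateable static scrollable cursor along these lines (a sketch; the cursor and host variable names are hypothetical). Bound with ISOLATION(CS), DB2 can apply optimistic locking between the FETCH and the positioned UPDATE:

DECLARE CSR1 SENSITIVE STATIC SCROLL CURSOR FOR
  SELECT CUSTNO, CUST_NAME
  FROM   CUSTOMER
  FOR UPDATE OF CUST_NAME;

OPEN CSR1;
FETCH CSR1 INTO :CUSTNO, :CUSTNAME;
UPDATE CUSTOMER
SET    CUST_NAME = :NEWNAME
WHERE  CURRENT OF CSR1;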

So, when you move to DB2 9 for z/OS you should evaluate your applications looking for programs that could take advantage of optimistic locking – and then add the ROW CHANGE TIMESTAMP to the appropriate tables.

Wednesday, August 08, 2007

Altering LONG VARCHAR [DB2 9 for z/OS]

Just a short entry today to discuss a feature that will be useful for organizations having tables containing long text strings which are stored as VARCHARs. It involves altering the data type of a column from LONG VARCHAR to a compatible data type.

In versions of DB2 prior to V9 it was not possible to change the data type of a LONG VARCHAR or LONG VARGRAPHIC column using ALTER. Instead, you had to unload the data, drop the table, re-create the table using the new column definitions, and reload the data into the new table. Of course, when you drop the table you lose any authorizations, indexes, and so on defined on the table, so you had to save and re-create those as well. Basically, it wasn’t easy unless you had a change management tool that automated the process.

But before going any further, what is a LONG VARCHAR? Well, VARCHAR(n) defines a varying-length column for character string data, where n specifies the maximum length of the string. If n was greater than 254, the column was set up as a long string column – that is, a LONG VARCHAR; the same applied to VARGRAPHIC columns.

As of DB2 V9, you will not create any more LONG VARCHAR columns – they are all treated as VARCHARs. But the old LONG VARCHAR columns, migrated from earlier releases, still exist. The COLTYPE in the DB2 Catalog (SYSIBM.SYSCOLUMNS) will be LONGVAR for LONG VARCHAR columns and LONGVARG for LONG VARGRAPHIC columns.

The good news is that V9 allows you to ALTER the data type of LONG VARCHAR to VARCHAR and LONG VARGRAPHIC to VARGRAPHIC using ALTER TABLE with the ALTER COLUMN parameter. First, you must look up the LENGTH of the column in SYSIBM.SYSCOLUMNS and use that value for the length of the new data type specification. For example, if LONGCOL in table EXAMPLE is a LONG VARCHAR with a LENGTH of 500, you can issue the following ALTER to change the column from LONG VARCHAR to VARCHAR:

ALTER TABLE EXAMPLE
ALTER COLUMN LONGCOL
SET DATA TYPE VARCHAR(500);

You must specify the exact value stored in the DB2 Catalog for LENGTH in this ALTER TABLE statement. Failure to do so will result in an error message.
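
To find the migrated long string columns in your catalog – along with the LENGTH value to plug into the ALTER – you could run a query along these lines (a sketch, using the COLTYPE values discussed above):

SELECT TBCREATOR, TBNAME, NAME, COLTYPE, LENGTH
FROM   SYSIBM.SYSCOLUMNS
WHERE  COLTYPE IN ('LONGVAR', 'LONGVARG')
ORDER BY TBCREATOR, TBNAME, NAME;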

So, you can convert all of those LONG VARCHAR columns from past versions of DB2 to VARCHAR using the ALTER statement in V9.

Monday, August 06, 2007

Encryption [DB2 9 for z/OS]

DB2 V9 offers some encryption news, but we need to go back a version to start the story. You see, DB2 supports encryption in Version 8 through encryption functions that need to be explicitly coded in order to encrypt and decrypt data.

These functions (ENCRYPT and DECRYPT) allow you to encrypt and decrypt data at the column level. Because you can specify a different password for every row that you insert, you are encrypting data at the “cell” level in your tables. If you use these functions to encrypt your data, be sure to put some mechanism in place to manage the passwords that are used to encrypt the data. Without the password, there is absolutely no way to decrypt the data.

To assist you in remembering the password, you have an option to specify a hint (for the password) at the time you encrypt the data. The following SQL example shows an INSERT that encrypts a social security number (SSN) using a password and a hint:

INSERT INTO EMP (SSN)
VALUES(ENCRYPT('289-46-8832','TARZAN','? AND JANE'));

The password is “TARZAN” and the hint we’ve chosen to provide is “? AND JANE”… so the hint will prompt us to think of Tarzan as the companion of Jane.

In order to retrieve the encrypted data you will need to use the DECRYPT function supplying the correct password. This is shown in the following SELECT statement:

SELECT DECRYPT_BIT(SSN,'TARZAN') AS SSN
FROM EMP;

If we fail to supply a password, or the wrong password, the data is returned in an encrypted format that is unreadable.
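
If all you have forgotten is the password, the GETHINT function will at least retrieve the hint that was stored with the encrypted value:

SELECT GETHINT(SSN) AS PASSWORD_HINT
FROM   EMP;

Given our INSERT above, this returns “? AND JANE”.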

The result of encrypting data using the ENCRYPT function is VARCHAR FOR BIT DATA. The encryption algorithm is an internal algorithm. For those who care to know, it uses Triple DES cipher block chaining (CBC) with padding and the 128-bit secret key is derived from the password using an MD5 hash.

When defining columns to contain encrypted data the DBA must be involved because the data storage required is significantly different. The length of the column has to accommodate the length of the non-encrypted data, plus 24 bytes, rounded up to the next 8-byte boundary, plus 32 bytes for the hint.
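
To make that concrete, here is the sizing arithmetic for the 11-byte SSN used above, assuming a hint is stored (a sketch only – your lengths will differ):

-- 11 bytes of data + 24 bytes of overhead = 35 bytes
-- rounded up to the next 8-byte boundary  = 40 bytes
-- plus 32 bytes for the hint              = 72 bytes
CREATE TABLE EMP
  (SSN VARCHAR(72) FOR BIT DATA);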

OK, that is all V8 stuff and this series of blog postings is supposed to be about V9 functionality, right? So what about version 9? Well, DB2 9 for z/OS offers some nice improvements to encryption support. Firstly, DB2 can take advantage of encryption hardware advances.

CP Assist for Cryptographic Function, aka CPACF, delivers cryptographic assistance on z990 hardware. CPACF can run on all the CPUs, but remember, this feature is available only on z990 and later machines, not the older z900. The z990 also introduced the PCIXCC card, which is needed for the IBM Data Encryption Tool but not for the DB2 encryption functions.

Note: The IBM Data Encryption Tool (available from IBM at an additional price) offers encryption for DB2 tables at the table level, whereas the encryption functions (free with DB2) offer encryption at the column level.

The CP Assist for Cryptographic Function delivers cryptographic support on every CP with Data Encryption Standard (DES), Triple DES (TDES), and Advanced Encryption Standard (AES)-128 bit data encryption/decryption, as well as Secure Hash Algorithm (SHA-1) and SHA-256 hashing. For a more detailed discussion of CPACF, associated technology and functionality, check out the following IBM redbook: IBM eServer zSeries 990 (z990) Cryptography Implementation (SG24-7070).

Basically, the net result is that the cost of encrypting DB2 data under V9 is reduced on the z990 hardware.

Additionally, IBM has added encryption support in the controllers of its storage devices.

Both the IBM TS1120 tape drive and IBM Ultrium 4 tape drives include data encryption capabilities within the drives. This support can allow you to avoid the need for host-based encryption of data or the use of specialized encryption appliances. In addition, IBM claims that the encryption does not significantly impact the performance of the drives so there should be minimal to no impact on the batch processing window when encrypting in this manner.

So far, we’ve been talking about encryption for data at rest. But DB2 9 for z/OS also improves support for encryption of data in transit. DB2 9 supports the Secure Socket Layer (SSL) protocol by implementing the z/OS Communications Server IP Application Transparent Transport Layer Security (AT-TLS) function. The z/OS V1R7 Communications Server for TCP/IP introduces the AT-TLS function in the TCP/IP stack for applications that require secure TCP/IP connections. AT-TLS performs transport layer security on behalf of the application, in this case DB2 for z/OS, by invoking the z/OS system SSL in the TCP layer of the TCP/IP stack. The z/OS system SSL provides support for TLS V1.0, SSL V3.0, and SSL V2.0 protocols.

So encryption of data over the wire is improved in z/OS 1.7. The Communications Server supports AT-TLS, which uses SSL data encryption. SSL encryption has been available on z/OS for a long time, but DB2 9 for z/OS now makes use of this facility and offers SSL encryption using a new secure port.
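
The secure port itself is defined to DB2 through the DDF statement of the change log inventory utility (DSNJU003). The following is only a sketch – the data set names and the port number are invented for illustration:

//CHGLOG   EXEC PGM=DSNJU003
//SYSUT1   DD DISP=OLD,DSN=DB2A.BSDS01
//SYSUT2   DD DISP=OLD,DSN=DB2A.BSDS02
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DDF SECPORT=448
/*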

When acting as a requester, DB2 for z/OS can request a connection using the secure port of another DB2 subsystem. When acting as a server, and from within a trusted context (I’ll discuss trusted context in a later DB2portal blog entry), SSL encryption can be required for the connection.

So, little by little, better encryption support is being made available within the world of DB2 for z/OS.