The Db2 Portal Blog: April 2007

Monday, April 30, 2007

Database Definition on Demand [DB2 9 for z/OS]

As you probably know, online schema evolution (sometimes referred to as “online schema change”) was one of the key new features of DB2 V8. But, as its name implies, its capabilities continue to evolve. With V9, online schema evolution expands to simplify more types of database definition changes. The new term IBM is using for this in V9 is Database Definition On Demand (DDOD).

One of the nice new components provided by DDOD in V9 is that online table space reorganization is significantly improved. Today, when reorganizing just a couple of partitions in a partitioned table space the BUILD2 phase takes a long time to complete. V8 removed the outage for DPSIs, and now V9 removes the BUILD2 phase for all types of secondary indexes.

Another new DDOD capability supports replacing one table quickly with another. This is accomplished via cloning and the technique can even avoid the need to REBIND packages. Cloning allows you to generate, in the same table space, a new table with the same structure as the original table. After creating the clone you can do with it what you want – LOAD, DELETE, INSERT, UPDATE, etc. data – and then exchange the clone table name with the current table name. In this way you can keep the existing table operational while you work on the next “generation” of that table in its clone. When the clone is ready, you EXCHANGE it with the existing table. Nice, huh?
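A rough sketch of that clone workflow might look like the following. The table names ORDERS and ORDERS_CLONE are hypothetical, and this is only an outline of the sequence described above, not a tested procedure:

```sql
-- Create a clone with the same structure as the base table
-- (ORDERS and ORDERS_CLONE are made-up names for illustration).
ALTER TABLE ORDERS
  ADD CLONE ORDERS_CLONE;

-- Populate the clone however you like (LOAD, INSERT, and so on)
-- while ORDERS stays available to applications.
INSERT INTO ORDERS_CLONE
  SELECT * FROM ORDERS;

-- When the clone is ready, swap the data behind the two names.
EXCHANGE DATA BETWEEN TABLE ORDERS AND ORDERS_CLONE;
```

After the exchange, programs that reference ORDERS see what was the clone's data, and the old data sits behind the clone name.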

How about being able to rename a column within your table or rename an index? V9 provides the ability to do both! No longer do you have to DROP and re-CREATE in order to rename columns and indexes.
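As a quick illustration, the two renames could be written like this. The table, column, and index names are hypothetical:

```sql
-- Rename a column in place (EMP, PHONENO, and PHONE_NUM are assumed names).
ALTER TABLE EMP
  RENAME COLUMN PHONENO TO PHONE_NUM;

-- Rename an index in place (XEMP1 and XEMP_EMPNO are assumed names).
RENAME INDEX XEMP1 TO XEMP_EMPNO;
```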

DB2 V9 also introduces a new type of table space that combines the attributes of segmented and partitioned table spaces. It is called a universal table space. Universal table spaces offer improved space management for variable length rows because, like segmented table spaces, they use space map pages that carry more free space information. Also like segmented table spaces, universal table spaces deliver improved mass delete performance, and you can immediately reuse the table segments after the mass delete.

There are two types of universal table spaces:

Partition-by-growth: this type of universal table space will partition the data as it grows without the need to specify key ranges. This type of universal table space is beneficial for tables that will grow over time and/or need the additional limits afforded by partitioning, but can benefit from the performance of segmented. Like the range-partitioned type, it can contain only a single table.
Range-partitioned: this type of universal table space requires a key range for partitioning – and it can contain only a single table. This is basically adding segmentation to the existing partitioned table space.
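To make the distinction concrete, here is a minimal sketch of the DDL for each type. The database and table space names (DB1, TSPBG, TSPBR) and the numeric values are assumptions chosen only for illustration:

```sql
-- Partition-by-growth: MAXPARTITIONS caps how many partitions
-- DB2 may add as the data grows; no key ranges are involved.
CREATE TABLESPACE TSPBG IN DB1
  MAXPARTITIONS 8
  SEGSIZE 32;

-- Range-partitioned: specifying both NUMPARTS and SEGSIZE asks for
-- a partitioned table space that is also segmented.
CREATE TABLESPACE TSPBR IN DB1
  NUMPARTS 4
  SEGSIZE 32;
```

For the range-partitioned case, the key ranges themselves would be supplied on the CREATE TABLE statement for the table placed in TSPBR.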

Additionally, you can define SMS constructs (MGMTCLAS, DATACLAS, and STORCLAS) on a STOGROUP and you can ALTER those constructs as well. Table space and index logging parameters can also be altered.
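A hedged sketch of what those changes could look like follows. The storage group, SMS class, database, and table space names are all hypothetical, and the exact options you need will depend on your SMS setup:

```sql
-- Point an existing storage group at a different SMS management class
-- (SG1 and MCNEW are assumed names).
ALTER STOGROUP SG1
  MGMTCLAS MCNEW;

-- Turn logging off for a table space, then back on again
-- (DB1.TS1 is an assumed name).
ALTER TABLESPACE DB1.TS1 NOT LOGGED;
ALTER TABLESPACE DB1.TS1 LOGGED;
```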

DB2 V9 even adds a new capability to change the DB2 early code without requiring an IPL.

So, with DB2 9 for z/OS we get more flexibility in modifying our database schemas. And that is a good thing, right?

Thursday, April 19, 2007

SELECT from DELETE, UPDATE, and MERGE [DB2 9 for z/OS]

Another nice new SQL feature provides the ability to SELECT from DELETE, UPDATE, and MERGE statements.

This capability is similar to the SELECT from INSERT feature that was introduced with DB2 V8. So, before looking at the new V9 feature, let’s review the V8 feature.

The ability to SELECT from an INSERT statement is an intriguing feature. To understand why this is so, we need to review some background details. In some cases, it is possible to perform actions on an inserted row before it gets saved to disk. For example, a BEFORE TRIGGER might change data before it is even recorded to disk. But the application program will not have any knowledge of this change that is made in the trigger. Identity columns and columns with defaults (especially user-defined defaults) have similar effects. What if the program needs to know the final column values? Prior to V8 this was difficult and inefficient to implement.

Well, the SELECT FROM INSERT feature introduced in DB2 V8 solves this problem. It allows you to both insert the row and retrieve the values of the columns with a single SQL statement. It performs very well because it performs both the INSERT and the SELECT as a single operation. Consider the following example:


  SELECT COL5 INTO :C5-HV
  FROM FINAL TABLE
   (INSERT INTO SAMPLE_TABLE (COL1, COL2, COL5, COL7)
    VALUES('JONES', 'CHARLES', CURRENT DATE, 'HOURLY')
   );

The data is inserted as specified in the VALUES clause, and retrieved as specified in the SELECT. Without the ability to select COL5, the program would have no knowledge of the value supplied to COL5, because it was assigned using CURRENT DATE. With this new syntax the program can retrieve the CURRENT DATE value that was just inserted into COL5 without incurring additional overhead.

OK, on to V9. In this new version you can retrieve columns from rows that are modified via DELETE, UPDATE, and MERGE statements, thereby replacing multiple SQL calls with one. DB2 V9 allows a searched UPDATE or a searched DELETE statement in the FROM clause of a SELECT statement that is a subselect, or in the SELECT INTO statement. This capability allows a user or program to know which values were updated or deleted.

And, if you recall from a few blog entries ago, DB2 9 for z/OS also adds the MERGE statement. Well, it also adds the ability to SELECT from the MERGE. This allows you to return all the rows that were either inserted or updated as a result of the MERGE.

Here is an example of a SELECT from an UPDATE:


 SELECT SUM(SALARY) INTO :SAL-HV
 FROM FINAL TABLE
  (UPDATE EMP
   SET SALARY = SALARY * 1.02
   WHERE WORKDEPT = 'A01');

Prior to this capability you would have had to run the UPDATE statement and then, only after it finished, run the SELECT to add up the new salary values. Now, instead of multiple statements requiring multiple passes through the data, you can consolidate the work into one statement.
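The DELETE case follows the same pattern, except that you read from OLD TABLE, since the rows no longer exist once the statement completes. A small sketch, assuming the same EMP table with EMPNO and SALARY columns:

```sql
-- Delete the department's employees and return the rows that were removed.
-- OLD TABLE exposes the values as they were before the DELETE.
SELECT EMPNO, SALARY
FROM OLD TABLE
  (DELETE FROM EMP
   WHERE WORKDEPT = 'A01');
```

In an application program a multi-row result like this would normally be processed through a cursor.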

Nice, huh?

Tuesday, April 17, 2007

Free DB2 Access Paths Webinar - April 18, 2007

Just a quick note to inform my readers that I will be conducting a webinar tomorrow - Wednesday, April 18, 2007 - on the topic of Change Control for DB2 Access Path Selection. The session will start at 10:30 Central time and will last about an hour. To register, click on the link above.

Here is a short description of what I'll discuss: Most changes are strictly controlled in the mainframe environment. But that is not the case for DB2 access paths. When we BIND or REBIND a program, DB2 formulates access paths “on the fly” and we do not have any control over what DB2 creates for us. This lack of control over changes can cause unpredictable performance.

NOTE
This webinar is in the past and can no longer be viewed.

Tuesday, April 10, 2007

MERGE and TRUNCATE [DB2 9 for z/OS]

Two additional new SQL statements available in DB2 Version 9 are the MERGE and TRUNCATE statements.

MERGE

The MERGE statement basically takes two “tables” and merges the data into one table. The table that will contain the merged results is referred to as the target; the other participating table is called the source. Rows in the target that match the source are updated and rows that do not exist in the target are inserted from the source to the target.

If you use other DBMSs you may be somewhat familiar with MERGE functionality. It is sometimes called an UPSERT (taking the UP from update and the SERT from insert). A simplified version of the MERGE syntax follows:


MERGE INTO table_name
  USING table_name
  ON (condition)
WHEN MATCHED THEN
     UPDATE SET column1 = value1 [, column2 = value2 ...]
WHEN NOT MATCHED THEN
     INSERT (column1 [, column2 ...])
          VALUES (value1 [, value2 ...]) ;

The DB2 implementation is a tad different, though. Instead of the USING clause specifying an actual table, it instead specifies a VALUES clause of data for a single row or an array of rows. So the source, if it consists of multiple rows, must be populated into a host variable array.

So, say we have a customer table, CUST, and we want to accept several customers from a file. If the customer already exists, we want to populate it with the new, updated information; if the customer does not exist, we want to insert the new customer. To accomplish this in DB2 V9, we can write a MERGE statement such as this:


MERGE INTO CUST C
 USING (VALUES (:CUSTNO, :CUSTNAME, :CUSTDESC)
        FOR :HV_NROWS ROWS) AS N (CUSTNO, NAME, DESC)
 ON (C.CUSTNO = N.CUSTNO)
 WHEN MATCHED THEN UPDATE
   SET (C.NAME, C.DESC) = (N.NAME, N.DESC)
 WHEN NOT MATCHED THEN INSERT (CUSTNO, NAME, DESC)
   VALUES (N.CUSTNO, N.NAME, N.DESC)
 NOT ATOMIC CONTINUE ON SQLEXCEPTION;

Of course, this is a simple example as there will likely be many other columns in the customer table that would be of interest. But you can easily extrapolate from this example using it as a template of sorts to build a more complex example.

The rows of input data are processed separately. When errors are encountered and NOT ATOMIC CONTINUE ON SQLEXCEPTION is in effect, processing continues, and the rows that failed will not be processed. Regardless of the failure of any particular source row, the MERGE statement will not undo any changes that the statement has already made to the database, and the merge will still be attempted for rows that follow the failed row. However, the minimum level of atomicity is a single source row (in other words, a source row is never partially merged).

If you are using triggers be sure to consult the SQL Reference manual (PDF) to understand how MERGE impacts trigger processing.

At any rate, you need to know that the MERGE statement lets you combine UPDATE and INSERT across many rows into a table based upon a matching key. You can embed the MERGE statement in an application program or issue it interactively. The statement is executable and can be dynamically prepared. In addition, you can use the SELECT FROM MERGE statement to return all the updated rows and inserted rows, including column values that are generated by DB2.
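To illustrate that last point, here is a sketch of a SELECT wrapped around the CUST example from earlier in this entry. It reuses the same host variables and column names, so treat it as an outline under those assumptions and not as tested code:

```sql
-- Return every row that the MERGE inserted or updated,
-- showing the values as they are stored after the change.
SELECT CUSTNO, NAME
FROM FINAL TABLE
  (MERGE INTO CUST C
     USING (VALUES (:CUSTNO, :CUSTNAME, :CUSTDESC)
            FOR :HV_NROWS ROWS) AS N (CUSTNO, NAME, DESC)
     ON (C.CUSTNO = N.CUSTNO)
     WHEN MATCHED THEN UPDATE
       SET (C.NAME, C.DESC) = (N.NAME, N.DESC)
     WHEN NOT MATCHED THEN INSERT (CUSTNO, NAME, DESC)
       VALUES (N.CUSTNO, N.NAME, N.DESC)
     NOT ATOMIC CONTINUE ON SQLEXCEPTION);
```

In a program, a statement like this would typically be the SELECT of a cursor so that the returned rows can be fetched one at a time.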

TRUNCATE

OK, so that is MERGE, but the title of this blog entry is MERGE and TRUNCATE, so what is TRUNCATE? Well, that is an easier story to tell. The TRUNCATE statement is simply a quick way to DELETE all of the data from a table. The table can be in any type of table space and it can be either a base table or a declared global temporary table. If the table contains LOB or XML columns, the corresponding table spaces and indexes are also truncated.

For clarification, consider the following example:


TRUNCATE TABLE EXAMPLE_TABLE
  REUSE STORAGE
  IGNORE DELETE TRIGGERS
  IMMEDIATE;

Seems easy enough, doesn’t it? But what are those additional parameters? Well, REUSE STORAGE tells DB2 to empty the storage that is allocated but keep it allocated. The alternative, which is the default, is DROP STORAGE. This option tells DB2 to release the storage that is allocated for the table and to make it available for use by the same table or any other table in the table space. REUSE STORAGE is ignored for a table in a simple table space, and the statement is processed as if DROP STORAGE were specified.

The next parameter, which is the default if nothing is specified, is IGNORE DELETE TRIGGERS. This tells DB2 to not fire any DELETE triggers. Alternately, you could specify RESTRICT WHEN DELETE TRIGGERS, which will return an error if there are any delete triggers defined on the table.

Finally, we have the IMMEDIATE option. This causes the TRUNCATE to be immediately executed and it cannot be undone. If IMMEDIATE is not specified you can issue a ROLLBACK to undo the TRUNCATE.
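For contrast, here is a sketch of the same statement using the other options and leaving IMMEDIATE off, so that the truncation can still be backed out. EXAMPLE_TABLE is the same placeholder name used above:

```sql
-- Release the storage, and fail with an error if any delete triggers
-- are defined on the table.
TRUNCATE TABLE EXAMPLE_TABLE
  DROP STORAGE
  RESTRICT WHEN DELETE TRIGGERS;

-- Because IMMEDIATE was not specified, the TRUNCATE can be undone.
ROLLBACK;
```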

Synopsis

So with DB2 9 for z/OS we have two new helpful SQL statements that can simplify our coding efforts: MERGE to combine INSERT and UPDATE processing, and TRUNCATE to quickly DELETE all data from a table. Keep them in mind and use them to aid your DB2 application development efforts.

Monday, April 09, 2007

New DB2 9 Security Redbook

Just a quick entry today to alert everyone that there is a new DB2 for z/OS redbook available on the topics of regulatory compliance, security, and audit. The redbook is titled Securing DB2 and Implementing MLS on z/OS and you can download it for free today over the web.

The redbook is 360 pages (including index) and covers the plethora of new security features in DB2 for z/OS. If you haven't looked at DB2's authorization and security functionality in a while there is much to learn... and this redbook will be very illuminating.

That's all for today.

Thursday, April 05, 2007

Native XML Support in DB2 Databases [DB2 9 for z/OS]

One of the biggest technological advances in DB2 V9 is the ability to combine the management of structured and unstructured data. Basically, V9 will allow you to store data as native XML. This capability has already been introduced into V9 of DB2 for Linux, Unix, and Windows.

Many of you may well ask “Hey, what’s the big deal here? Can’t we already use the XML Extender and store XML data in DB2 prior to V9?” Yes, but V9 changes the game. You will be able to search and analyze structured data in a relational data repository and unstructured data in an XML repository without the need to reformat it. So your regular “relational” data gets stored as always, and XML data gets stored in its native format without the need to shove it into a CLOB or shred it into “relational” columns.

The approach is novel in that DB2 will now support native XML via dual storage engines: the traditional SQL/relational engine and a new XML engine. DB2 9 for z/OS handles XML as a new data type that is stored in a natural hierarchy, different from relational data.

For those of you not familiar with XML, you need to know that there are big differences between XML data and typical DB2 data. Foremost among these differences is that XML data is hierarchical, whereas “relational” DB2 data is basically “flat.”

Additionally, XML data is self-describing. XML tags identify and name the data elements in the XML document. This capability concentrates both the data and its structure into a single document. This is important to keep in mind because a single XML document can have many different types of data, whereas “relational” DB2 data is defined in the system catalog by its column definition. And all data in the same column must have the same data type (e.g. you cannot store a name in an integer column).

Finally, XML data is ordered, whereas “relational” DB2 data is not. The order in which data items are specified in the XML document is relevant. There is often no other way to specify order within an XML document. For relational data, the order of the rows is not guaranteed unless you specify an ORDER BY clause on one or more columns.

OK, now, just how would you support XML data in DB2 V9 then? Think of XML as just another data type. You would use the XML data type in a CREATE TABLE statement to define a column to be of type XML. Each column of type XML can hold one XML document for every row of the table. Even though the XML documents are logically associated with a row, XML and “relational” columns are stored differently. The “relational” columns are stored in the traditional structures we all know and love. The XML data is stored in hierarchical structures. Don’t let that scare you. IBM has seamlessly integrated XML with relational data to simplify application development while optimizing search performance with highly optimized XML indexes.

Here is a quick example walkthrough from the DB2 XML Guide manual that creates a simple table with an XML column. First, as with triggers, when you create tables with XML in SPUFI be sure to set the SQL terminator to a character other than a semicolon, for example, the pound sign (#). This is done so that your SQL can have embedded semicolons. Also, you’ll probably want to set CAPS OFF in SPUFI to preserve lower case. Then, create a table like this:

CREATE TABLE MYCUSTOMER (CID BIGINT, INFO XML)
#

This creates a two-column table: the first column is a big integer and the second holds the XML data.

Next, we’ll build an index over XML data. We will assume that the XML documents to be stored in the INFO column will have a root element named customerinfo with an attribute named Cid. So, here is the DDL for the unique index on the Cid attribute:

CREATE UNIQUE INDEX MYCUT_CID_XMLIDX ON MYCUSTOMER(INFO)
GENERATE KEY USING XMLPATTERN
'declare default element namespace
   "http://posample.org"; /customerinfo/@Cid'
AS SQL DECFLOAT
#

The XML pattern defining the index is case-sensitive. The element and attribute names in the XML pattern must match the element and attribute names in the XML documents exactly. Now we can insert a couple of XML documents into the INFO column, such as:

INSERT INTO MYCUSTOMER (CID, INFO) VALUES (1000,
'<customerinfo xmlns="http://posample.org" Cid="1000">
  <name>Kathy Smith</name>
  <addr country="Canada">
    <street>5 Rosewood</street>
    <city>Toronto</city>
    <prov-state>Ontario</prov-state>
    <pcode-zip>M6W 1E6</pcode-zip>
  </addr>
  <phone type="work">416-555-1358</phone>
 </customerinfo>')
#

INSERT INTO MYCUSTOMER (CID, INFO) VALUES (1002,
'<customerinfo xmlns="http://posample.org" Cid="1002">
  <name>Jim Noodle</name>
  <addr country="Canada">
    <street>25 EastCreek</street>
    <city>Markham</city>
    <prov-state>Ontario</prov-state>
    <pcode-zip>N9C 3T6</pcode-zip>
  </addr>
  <phone type="work">905-555-7258</phone>
  <phone type="cell">905-555-7254</phone>
 </customerinfo>')
#

Then you can issue a SELECT statement against this table and thereby verify that the XML documents were successfully inserted. For example:

SELECT CID, INFO FROM MYCUSTOMER
#

V9 also supports XPath to query elements within an XML document, as well as catalog extensions to support definitions of XML schemas. Furthermore, the IBM DB2 utilities have been extended such that they can be used to administer XML data, too.

To my mind, though, one of the problems with XML in DB2 9 for z/OS is the lack of support for XQuery. XQuery is an XML query language capable of traversing XML documents. Just like SQL is the query language for native DB2 data, XQuery is the query language for native XML data. DB2 9 for Linux, Unix, and Windows supports XQuery, but DB2 9 for z/OS does not. For an independent tutorial on XQuery, click on this link or for an IBM tutorial on using XQuery in DB2 LUW click on this link instead.

So, how do you retrieve XML data using DB2 9 for z/OS? You can use SQL to retrieve entire XML documents from XML columns just like you would retrieve any other column. But if you need to retrieve portions of that XML document you will need to specify XPath expressions, through SQL with XML extensions. For an independent tutorial on XPath, click on this link. Here is an example of using XPath to identify data within our XML data:

DELETE FROM MYCUSTOMER
WHERE XMLEXISTS (
'declare default element namespace "http://posample.org"; /customerinfo/phone[@type="cell"]'
PASSING INFO)
#

This should DELETE any XML document that has cell phone information, and for the purposes of this example, that would be CID 1002. I do not wish to go into any detailed description of XPath in this blog, but you can use XML functions with XPath expressions to traverse the XML document for data.

One final note: some of the IBM documentation could be clearer. For example, I take exception with this paragraph lifted directly out of the “What’s New” manual (GC18-9856-00):

“Support for XML capabilities and functions span the entire DB2 family. Version 8 of DB2 for z/OS and Version 8 of DB2 for Linux, Unix, and Windows provide basic support for storing, retrieving, and querying XML documents. DB2 9 for Linux, UNIX and Windows continues the work by delivering rich support of XML, including an XQuery interface to the data. Now, DB2 V9.1 for z/OS expands on similar support by delivering seamless integration of XML data and relational data in the DB2 database.”

Anyone reading that paragraph would be completely justified in expecting DB2 V9 for z/OS to include XQuery support. It seems to have been written using intentionally misleading wording in order to avoid admitting that XQuery is not supported on z/OS. At least, that is what it seems like to me; I could be wrong.

I’m also interested in how many folks out there in DB2-mainframe-land expect to use the XML capabilities of DB2 for z/OS. Please sign in and leave a comment expressing whether or not you plan to use DB2’s XML support.

Thanks, and that is all for today. Keep an eye out for future DB2 9 for z/OS related posts as I plan to continue adding to this series on new V9 features over the course of the next month or so (at least). Cheers!
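P.S. Returning to the XPath discussion above, here is a small sketch of pulling just one element out of each stored document with the XMLQUERY function. It assumes the MYCUSTOMER table and sample documents from this entry, and the SPUFI terminator is still set to the pound sign:

```sql
-- Return the customer id plus only the <name> element of each document.
-- The XPath expression is evaluated against the INFO column.
SELECT CID,
       XMLQUERY('declare default element namespace "http://posample.org";
                 /customerinfo/name'
                PASSING INFO) AS CUSTNAME
FROM MYCUSTOMER
#
```

The result column CUSTNAME is itself XML (the name element), not a character string.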

Wednesday, April 04, 2007

ORDER BY and FETCH FIRST in Subselects [DB2 9 for z/OS]

Here is another quick post in my series on new features in DB2 9 for z/OS.

Today, we will look at the additional flexibility gained in how the ORDER BY and FETCH FIRST n ROWS ONLY clauses can be specified in V9. Prior to V9, the only place you could specify these clauses was at the statement level. Indeed, this has been a source of confusion for many DB2 SQL programmers.

A discussion of DB2 SELECT should be broken down into three topics:

fullselect,
subselect, and
select-statement.

The select-statement is the form of a query that can be directly specified in a DECLARE CURSOR statement, or prepared and then referenced in a DECLARE CURSOR statement. It is the thing most people think of when they think of SELECT in all its glory. If so desired, it can be issued interactively using SPUFI. The select-statement consists of a fullselect, and any of the following optional clauses: order-by, fetch-first, update, read-only, optimize-for, isolation and queryno. Well, that is, until V9 which still allows the fetch-first and order-by at this level, but also at the fullselect and subselect level!

A fullselect can be part of a select-statement, a CREATE VIEW statement, or an INSERT statement. Basically, a fullselect specifies a result table. Prior to V9, this sometimes confused folks as they tried to put a FETCH FIRST n ROWS clause or an ORDER BY in a view or as part of an INSERT. That was not allowed!

Finally, a subselect is a component of the fullselect. A subselect specifies a result table derived from the result of its first FROM clause. The derivation can be described as a sequence of operations in which the result of each operation is input for the next.

This is all a bit confusing. Think of it this way: in a subselect you specify the FROM to get the tables, the WHERE to get the conditions, GROUP BY to get aggregation, HAVING to get the conditions on the aggregated data, and the SELECT clause to get the actual columns. In a fullselect you add in the UNION to combine subselects and other fullselects. Finally, you add on any optional order-by, fetch-first, update, read-only, optimize-for, isolation and queryno clauses to get the select-statement.

But, of course, as of V9 you can use the order-by and/or the fetch-first at the subselect or fullselect level. This can be useful if you want to limit the results within a subquery or part of a UNION statement. Also, if you specify ORDER BY with FETCH FIRST n ROWS ONLY, the result is ordered before the fetch-first is applied. (That's a good thing.)

So, that means all of the following are now legal SQL formulations in V9:

(SELECT COL1 FROM T1
UNION
SELECT COL1 FROM T2
ORDER BY 1)
UNION
SELECT COL1 FROM T3
ORDER BY 1;

This example shows how ORDER BY can be applied in fullselects with UNION.

SELECT EMP_ACT.EMPNO, PROJNO
FROM EMP_ACT
WHERE EMP_ACT.EMPNO IN (SELECT EMP.EMPNO
                        FROM EMP
                        ORDER BY SALARY DESC
                        FETCH FIRST 10 ROWS ONLY);

And this example will return the employee number and project number for any projects assigned to employees with one of the top ten salaries.
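The same clauses can also go inside a nested table expression. As a sketch, assuming the EMP table has EMPNO and SALARY columns as in the example above, this computes the average of only the ten highest salaries:

```sql
-- The inner subselect is ordered first and then cut off at ten rows;
-- the outer query aggregates over just those rows.
SELECT AVG(T.SALARY) AS AVG_TOP10_SALARY
FROM (SELECT EMPNO, SALARY
      FROM EMP
      ORDER BY SALARY DESC
      FETCH FIRST 10 ROWS ONLY) AS T;
```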

So, once you move to V9 you will have much more latitude in how you write your SELECTs. If you are interested in more details, here is a link to the section of the DB2 9 for z/OS SQL Reference manual on building SQL queries.

Tuesday, April 03, 2007

INTERSECT and EXCEPT [DB2 9 for z/OS]

With this blog entry I am introducing a new series in which I will briefly blog about the new features of DB2 9 for z/OS. Today's entry will cover the new INTERSECT and EXCEPT keywords.

DB2 for Linux, Unix, and Windows has supported INTERSECT and EXCEPT in SQL SELECT statements for quite some time now, and with V9 the z/OS platform catches up. These two set operations can be used to simplify some SQL statements. Think of them as being similar to the UNION operation.

INTERSECT is used to match the result sets of two queries. If a row appears in both result sets it passes through. When INTERSECT ALL is specified, the result consists of all rows that are in both result sets, with duplicates retained. If INTERSECT is specified without the ALL option, the duplicates will be removed from the results. For example, the following SQL will show all customers in the USA who are also employees (with no duplicates):

SELECT last_name, first_name, cust_num
FROM CUST
WHERE country = 'USA'
INTERSECT
SELECT last_name, first_name, emp_num
FROM EMP
WHERE country = 'USA';

EXCEPT, on the other hand, returns the rows from the first result table that have no matching row in the second. Some other DBMS implementations refer to this as the MINUS operation. When EXCEPT ALL is specified, the result consists of all rows from the first result table that do not have a corresponding row in the second, and any duplicate rows are kept. If EXCEPT is specified without the ALL option, duplicates are eliminated. As an example, the following SQL will return only those items from TABLE1 that are not also in TABLE2:

SELECT item FROM TABLE1
EXCEPT
SELECT item FROM TABLE2;
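To see what EXCEPT saves you from writing, here is roughly the same request formulated the pre-V9 way with a correlated subquery. It assumes the same TABLE1 and TABLE2 with an item column:

```sql
-- DISTINCT mimics the duplicate elimination that EXCEPT performs.
-- Note: unlike EXCEPT, this form does not treat two NULL items as equal,
-- so the results can differ when item is nullable.
SELECT DISTINCT T1.item
FROM TABLE1 T1
WHERE NOT EXISTS
      (SELECT 1
       FROM TABLE2 T2
       WHERE T2.item = T1.item);
```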

Both INTERSECT and EXCEPT make it easier to formulate SQL requests...