Monday, July 16, 2007
CLONE Tables [DB2 9 for z/OS]
OK, so how does it work? Basically, you will create a table with the exact same attributes as a table that already exists at the current server, except that it is empty of data. It is created using the ALTER TABLE SQL statement with the ADD CLONE parameter. This clone table is created in the same table space as the existing table. After creating the clone table you can do whatever you want to do with it. LOAD it, INSERT rows to it, etc.
The clone table will be structurally identical to the base table in every way: same number of columns, column names, data types, as well as have the same indexes, before triggers, LOB objects and so on. All of this is done auto-magically by DB2 when it the clone table is created. Authorizations and views on the base table, however, are not duplicated, or cloned.
So, the clone is created and it can be manipulated without impacting the base table. When the clone is ready to become the base table it is exchanged with the base table. This is accomplished using the new EXCHANGE SQL statement. After running an EXCHANGE the clone becomes the real table and the previously “real” table becomes the clone - - and you can repeat the process.
The clone and the base table are kept in different underlying VSAM linear data sets. We all should know that the data sets used by DB2 are named using the following format:
cat.DSNDBD.DBNAME.TSNAME.I0001.A001. Well, the clone uses the same name except for the next to last component. It will be I0002. When the exchange is made, DB2 flips the I0001 with the I0002.
Of course, there are rules and details you will need to know to properly deploy cloning in your organizations, but that is all documented in the manuals. The point of these blog entries is more to make you aware of new functionality and give you the basic flavor for how it works and what it is intended for…
With that in mind, this is intended to allow you to quickly replace one version of DB2 table data with another version, without impacting availability while the new version is built.
Thursday, July 12, 2007
New Data Types [DB2 9 for z/OS]
As we continue our blog series on the new features and functionality of DB2 9 for z/OS, today we examine the four (OK, five) new data types introduced in this version of DB2.
BIGINT
First up, we have the BIGINT data type. A BIGINT is an exact numeric data type capable of representing 63-bit integers. This is the third integer data type now available to DB2 and it offers the ability to store the largest range of values:
- SMALLINT values can range from -32768 to 32767
- INTEGER values can range from -2147483648 to 2147483647
- BIGINT values can range from -9223372036854775808 to 9223372036854775807
So when you have the need to store very large integers you don’t have to use DECIMAL with a zero scale any longer.
BINARY and VARBINARY
Next up, and perhaps more exciting, V9 delivers a true binary data type for the first time in DB2, so we no longer have to use a BLOB or CHAR with FOR BIT DATA. BINARY is a fixed-length binary string up to 255 bytes. DB2 9 also delivers a VARBINARY data type, which is a variable-length binary string up to 32704 bytes.
BINARY and VARBINARY data types extend current support of binary strings (BLOB), and are compatible with the BLOB data type. They are not compatible with character string data types. IBM does make it somewhat easy to migrate existing columns defined as CHAR FOR BIT DATA or VARCHAR FOR BIT DATA over to BINARY or VARBINARY. If there is an index defined on the column, the index is placed in RBDP. You cannot alter BINARY or VARBINARY data types to CHAR FOR BIT DATA or VARCHAR FOR BIT DATA.
Also, there are some usage considerations to keep in mind. Two binary strings are equal only if the lengths are identical. If two strings are equal up to the length of the shorter string length, the shorter string is considered less than the longer string, even when the remaining bytes in the longer string are hex zeros.
DECFLOAT
The next new data type supported by DB2 9 for z/OS is DECFLOAT. V9 takes advantage of new System z9 hardware support delivering the DECFLOAT data type that lets you use decimal floating-point numbers with greater precision. A decimal floating-point value (DECFLOAT) is an IEEE 754r number with a decimal point.
The maximum precision is 34 digits and the range of a decimal floating point number is either 16 or 34 digits of precision.
DECFLOAT(16) values can range from a low of:
-9.999999999999999×10**384
to a high of:
9.999999999999999×10**384
DECFLOAT(34) values can range from a low of:
-9.999999999999999999999999999999999 ×10**6144
to a high of:
9.999999999999999999999999999999999 ×10**6144
Anyone think they’ll need bigger numbers?
In addition, the DECFLOAT data type is able to represent special values representing “non-number numbers”:
- Infinity - a value that represents a number whose magnitude is infinitely large.
- Quiet NaN - a value that represents undefined results which does not cause an invalid number condition. NaN is not a number.
- Signaling NaN - a value that represents undefined results which will cause an invalid number condition if used in any numerical operation.
XML
And finally, we get pureXML support to store XML as a native data type in DB2. That means you can specify XML as a data type for columns in your DB2 tables in DB2 9 for z/OS. But I’m not really going to elaborate any further on XML here. A few more details can be found in an earlier post here.
Monday, July 09, 2007
New Built-in Functions [DB2 9 for z/OS]
So, what new function functionality do we get with DB2 9 for z/OS? First of all, we get some new ASCII and EBCDIC conversion functions. The ASCII_CHR function returns the character that has the ASCII code value that is specified by the argument.; and the ASCII_STR function returns a string, in the system ASCII CCSID that is an ASCII version of the string. Knowing that, I bet you can guess what EBCDIC_CHR and EBCDIC_STR do.
We also get UNICODE and UNICODE_STR functions in DB2 9 for z/OS. The UNICODE function returns the Unicode UTF-16 code value of the leftmost character of the argument as an integer. And the UNICODE_STR function? It returns a string in Unicode UTF-8 or UTF-16 (depending on the specified parameter) representing a Unicode encoding of the input string.
Perhaps more interesting is soundex support for testing whether two strings sound the same as each other. DB2 offers two functions here: SOUNDEX and DIFFERENCE. The SOUNDEX function returns a 4 character code that represents the sound of the words in the argument. The result can be used to compare with the sound of other strings. The data type of the result is CHAR(4). So what does this mean? Consider the following example:
SELECT LASTNAME
FROM DSN910.EMP
WHERE SOUNDEX(LASTNAME) = SOUNDEX(’Smith’);
This query would return not only employees with a last name of “Smith,” but also anything that sounds like Smith, such as Smythe.
The DIFFERENCE function is related to SOUNDEX. It returns a value from 0 to 4 where the number represents the difference between the sounds of two strings based on applying the SOUNDEX function to the strings. The higher the value, the closer the two strings are to sounding alike. Consider:
SELECT DIFFERENCE(’CONSTRAINT’,’CONSTANT’),
SOUNDEX(’CONSTRAINT’),
SOUNDEX(’CONSTANT’)
FROM SYSIBM.SYSDUMMY1;
This example returns the values 4, C523, and C523. Since the two strings return the same SOUNDEX value, the difference is 4 (the highest value possible). The more different-sounding the two strings are, the smaller the number would be.
Another interesting series of BIFs are focused on date and time data. These functions include EXTRACT, MONTHS_BETWEEN, and various new timestamp-related functions.
EXTRACT returns a portion of a date or timestamp. You can use EXTRACT to slice up a date/time value into its component pieces. Consider:
SELECT BIRTHDATE,
EXTRACT(DAY FROM BIRTHDATE) AS DAY,
EXTRACT(MONTH FROM BIRTHDATE) AS MONTH,
EXTRACT(YEAR FROM BIRTHDATE) AS YEAR
FROM DSN8910.EMP;
This query would return the entire date, along with each component (year, month, day) of the date as a separate column. You can use one function (EXTRACT) to do the job of the already-existing YEAR, MONTH, and DAY functions. Of course, YEAR, MONTH, and DAY are still available for your use. Similar functionality for EXTRACT exists for time components using HOUR, MINUTE, and SECOND.
The next temporal BIF introduced with DB2 9 for z/OS is the MONTHS_BETWEEN function. It returns an estimate of the number of months between two expressions. If the first expression represents a date that is later than the second, the result will be positive; if the opposite is true the result will be negative. The result is calculated based on a 31 day month.
For example, consider this statement:
SELECT MONTHS_BETWEEN ('2007-02-20','2007-01-17')
AS MONTHS_BETWEEN
FROM SYSIBM.SYSDUMMY1;
The result of this query would be 1.096774193548387.
We also get four new timestamp-related BIFs: TIMESTAMPADD, TIMESTAMPDIFF, TIMESTAMP_FORMAT, and TIMESTAMP_ISO. The first two are rather straightforward: TIMESTAMPADD adds an interval to a timestamp and TIMESTAMPDIFF subtracts two timestamps and returns an interval. I will not get into a discussion of date/time arithmetic here, but if you are interested check out my article Q+As on Dates and DB2.
What about the other two timestamp-related BIFs? TIMESTAMP_FORMAT offers the much-needed ability to choose different display formats for a timestamp value. The valid formats that can be specified are:
• ‘YYYY-MM-DD’
• ‘YYYY-MM-DD-HH24-MI-SS’
• ‘YYYY-MM-DD-HH24-MI-SS-NNNNNN’
And instead of using the dash ( - ) as the separator, you can also use . / , : ; and blank. These separators can be used in any combination. And the VARCHAR_FORMAT function returns a character representation of a timestamp in a format specified as above. So, consider the following query:
SELECT SUBSTR(NAME,1,8) AS TSNAME,
VARCHAR_FORMAT(CREATEDTS,'YYYY-MM-DD-HH24:MI:SS') AS TSCR
FROM SYSIBM.SYSTABLESPACE;
WHERE
CREATEDTS >=
TIMESTAMP_FORMAT('2007-01-01 00:00:00','YYYY-MM-DD HH24:MI:SS');
This query will return the name of all table spaces created since the first of the year, along with its creation timestamp using the format specified.
We also get some new string manipulation functions. The LOCATE_IN_STRING function returns the starting position of the first occurrence of one string within another string. This is basically the same as the existing LOCATE function.
We also get LPAD and RPAD functions. The LPAD function returns a string that is padded on the left, with blanks (or a specific character). The LPAD function treats leading or trailing blanks as significant. RPAD, of course, does the same but on the right. So, consider the following example:
SELECT LPAD(LASTNAME, 30, ’.’ ) AS LAST,
RPAD(FIRSTNME, 30) AS FIRST
FROM DSN910.EMP;
This query will left pad the last name with periods and right pad the first name with blanks.
OVERLAY is yet another new string manipulation function. It allows you to return a string with portions of it overlaid by a specified value. You provide the string, a substring to be overlaid, and its starting point and length, and DB2 does the rest. Learning by example is simpler than trying to explain how it works, so here goes:
SELECT CHAR(OVERLAY('PLATELET','CEMEN',4,4,OCTETS),9),
CHAR(OVERLAY('INSERTING','IS',4,2,OCTETS),10),
FROM SYSIBM.SYSDUMMY1;
The results returned by this query would be:
• 'PLACEMENT ' (starting at position 4 overlay 4 bytes with 5 bytes 'CEMEN')
• 'INSISTING' (starting at position 4 overlay 2 bytes with 'IS')
Additional BIFs include RID, which returns the RID of a row, COLLATION_KEY which returns a varying-length binary string that represents the collation key of the expression in a named collation, DECRYPT_BINARY which adds the ability to decrypt the new BINARY and VARBINARY data types, and NORMALIZE_STRING which takes a Unicode string argument and returns a normalized string. And, of course, we also get scalar functions to support the new data types in DB2 9 (BIGINT, BINARY, VARBINARY and DECFLOAT).
Finally, DB2 9 for z/OS adds three new aggregate functions: CORRELATION, COVARIANCE, and COVARIANCE_SAMP.
The CORRELATION function returns the coefficient of correlation of a set of number pairs. And the COVARIANCE and COVARIANCE_SAMP functions return the (population) covariance of a set of number pairs.
So, as you can see, DB2 9 for z/OS continues adding new and useful functions to simplify our development efforts with DB2.
Friday, June 29, 2007
Implicitly Created Database Objects [DB2 9 for z/OS]
Not being content with that, DB2 9 extends this capability with the ability to implicitly create additional types of database objects. By coding your CREATE TABLE statement with the proper options you can implicitly create any or all of the following:
OK, so how does this happen? Let’s go down the list. If you fail to specify the IN clause on a CREATE TABLE, DB2 works a bit differently. In the past, DB2 would simply create an implicit table space in DSNDB04. As of DB2 9, the database is involved as well. DB2 will either create an implicit database or use a previously implicitly created database. The names of these implicitly created databases will range from DSN00001 to DSN60000. The first time, DB2 will create DSN00001, the second DSN00002, and so on until we reach DSN60000. The next time, DB2 will wrap around and start again from the beginning, using existing implicitly created databases. For the implicitly created databases, the STOGROUP will be set to SYSDEFLT; buffer pool values are determined via DSNZPARMs.
Next up is the table space. Although DB2 has supported implicitly created table spaces forever, there are some twists in DB2 9. First of all, you cannot create simple table spaces any longer, so all implicitly created table spaces will be segmented. In compatibility mode (CM), a implicitly created table spaces will be defined as SEGSIZE 4 and LOCKSIZE ROW. After migrating to new function mode (NFM) your implicitly created table spaces will be created as partition by growth table spaces. The options uses will be SEGSIZE 4, DSSIZE 4G, MAXPARTITIONS 256, LOCKSIZE ROW, and LOCKMAX SYSTEM.
As for the rest of the objects in the list, these system-required objects will always be implicitly created if the table space is created implicitly. For indexes that support the primary key or unique constraints, the names will be generated using a combination of the table name and randomly generated characters.
OK, so now that you know about the ability of DB2 9 to implicitly create objects, let me give you some advice. Whenever possible, don’t rely on it. It is much better, if at all feasible, for your DBAs to explicitly create and name all database objects as needed. Yes, it takes more time, but it gives you more control. You can explicitly decide which objects go into which database; you can explicitly set paramters; etc.
So, this is a nice new feature and it can enable DB2 to do some definitional things for you automatically. But most DBAs will want to continue to do things the traditional way, that is, building their DDL themselves without relying on implicitly
Monday, June 18, 2007
Skipping Locked Rows [DB2 9 for z/OS]
Of course, if a program skips over locked data then that data is not accessed and the program will not have it available. When this option is used DB2 will just skip over any locked data instead of waiting for it to be unlocked. The benefit, of course, is improved performance because you will not incur any lock wait time. But it comes at the cost of not accessing the locked data at all. This means that you should only utilize this clause when your program can tolerate skipping over some data.
The SKIP LOCKED DATA option is compatible with cursor stability (CS) isolation and read stability (RS) isolation. But it cannot be used with uncommitted read (UR) or repeatable read (RR) isolation levels. DB2 will simply ignore the SKIP LOCKED DATA clause under UR and RR isolation levels.
Additionally, SKIP LOCKED DATA works only with row locks and page locks. That means that SKIP LOCKED DATA does not apply to table, partition, LOB, XML, or table space locks.
Let's look at an example. Suppose we have a table with 5 rows in it that looks like this:
KEY FNAME LNAME
--- ------ -------
1 JOE MAMA
2 DON KNOTTS
3 KIM PORTANT
4 BOB NOBBLE
5 KIM BIMBO
Assume row level locking. Next assume that an UPDATE statement is run against the table changing FNAME to JIM WHERE LNAME = 'KIM'. And it is hanging out there without a COMMIT. Next, we run:
SELECT COUNT (*)
FROM TABLE
WHERE FNAME >= ’AAA’
SKIP LOCKED DATA;
The count returned would be 3 because DB2 skips the two locked rows (rows 3 and 5). And, of course, if the locks are released the count would be 5 again.