Wednesday, August 08, 2007

Altering LONG VARCHAR [DB2 9 for z/OS]

Just a short entry today to discuss a feature that will be useful for organizations having tables containing long text strings which are stored as VARCHARs. It involves altering the data type of a column from LONG VARCHAR to a compatible data type.

In previous versions of DB2 before to V9 it is not possible to change the data type of a LONG VARCHAR or LONG VARGRAPHIC column using ALTER. Instead, you would have had to unload the data, drop the table, create the table using the new column definitions and reload the data back into the new table. Of course, when you drop the table you’d lose any authorizations, indexes, etc. defined on the table so you’d have to save and re-create those as well. Basically, it wasn’t easy unless you have a change management tool that automates the process.

But before going any further, what is a LONG VARCHAR? Well, a VARCHAR is a varying-length column for character string data where n specifies the maximum length of the string. If n was greater than 254, the column was setup as a long string column – that is, a LONG VARCHAR; same story for VARGRAPHIC columns.

As of DB2 V9, you will not create any more LONG VARCHAR columns – they are all treated as VARCHARs. But the old LONG VARCHAR columns, migrated from earlier releases, still exist. The COLTYPE in the DB2 Catalog (SYSIBM.SYSCOLUMNS) will be LONGVAR for LONG VARCHAR columns and LOGVARG for LONG VARGRAPHIC columns.

The good news is that V9 allows you to ALTER the data type of LONG VARCHAR to VARCHAR and LONG VARGRAPHIC to VARGRAPHIC using ALTER TABLE with the ALTER COLUMN parameter. First, you must lookup the LENGTH of the column in the SYSCOLUMNS and use that for the length of the new data type specification. For example, if LONGCOL in table EXAMPLE is a LONG VARCHAR you can ALTER it in DB2 V9 to VARCHAR. Assuming the length of the column was 500, you can issue the following ALTER to change the column to VARCHAR from LONG VARCHAR:

ALTER TABLE EXAMPLE
ALTER COLUMN LONGCOL
SET DATA TYPE VARCHAR(500);

You must specify the exact value stored in the DB2 Catalog for LENGTH in this ALTER TABLE statement. Failure to do so will result in an error message.

So, you can convert all of those LONG VARCHAR columns from past versions of DB2 to VARCHAR using the ALTER statement in V9.

Monday, August 06, 2007

Encryption [DB2 9 for z/OS]

DB2 V9 offers some encryption news, but we need to go back a version to start the story. You see, DB2 supports encryption in Version 8 through encryption functions that need to be explicitly coded in order to encrypt and decrypt data.

These functions (ENCRYPT and DECRYPT) allow you to encrypt and decrypt data at the column level. Because you can specify a different password for every row that you insert, you are encrypting data at the “cell” level in your tables. If you use these functions to encrypt your data, be sure to put some mechanism in place to manage the passwords that are used to encrypt the data. Without the password, there is absolutely no way to decrypt the data.

To assist you in remembering the password, you have an option to specify a hint (for the password) at the time you encrypt the data. The following SQL example shows an INSERT that encrypts the SSN ( social security number ) using a password and a hint:

INSERT INTO EMP (SSN)
VALUES(ENCRYPT('289-46-8832','TARZAN','? AND JANE'));

The password is “TARZAN” and the hint we’ve chosen to provide is “? AND JANE”… so the hint will prompt us to think of Tarzan as the companion of Jane.

In order to retrieve the encrypted data you will need to use the DECRYPT function supplying the correct password. This is shown in the following SELECT statement:

SELECT DECRYPT_BIT(SSN,'TARZAN') AS SSN
FROM EMP;


If we fail to supply a password, or the wrong password, the data is returned in an encrypted format that is unreadable.

The result of encrypting data using the ENCRYPT function is VARCHAR FOR BIT DATA. The encryption algorithm is an internal algorithm. For those who care to know, it uses Triple DES cipher block chaining (CBC) with padding and the 128-bit secret key is derived from the password using an MD5 hash.

When defining columns to contain encrypted data the DBA must be involved because the data storage required is significantly different. The length of the column has to include the length of the non-encrypted data + 24 bytes + the number of bytes to the next 8 byte boundary + 32 bytes for the hint.

OK, that is all V8 stuff and this series of blog postings is supposed to be about V9 functionality, right? So what about version 9? Well, DB2 9 for z/OS offers some nice improvements to encryption support. Firstly, DB2 can take advantage of encryption hardware advances.

CP Assist for Cryptographic Function, aka CPACF, is available on z990 hardware. CPACF can run on all the CPUs, but remember, this feature is available only on z990 and later machines, not the older z900. The z990 also introduces a PCIXCC card which is needed for the IBM Data Encryption Tool, but not for the DB2 encryption functions.

Note: The IBM Data Encryption Tool (available from IBM at an additional price) offers encryption for DB2 tables at the table level, whereas the encryption functions (free with DB2) offer encryption at the column level.

The CP Assist for Cryptographic Function delivers cryptographic support on every CP with Data Encryption Standard (DES), Triple DES (TDES), and Advanced Encryption Standard (AES)-128 bit data encryption/decryption, as well as Secure Hash Algorithm (SHA-1) and SHA-256 hashing. For a more detailed discussion of CPACF, associated technology and functionality, check out the following IBM redbook: IBM eServer zSeries 990 (z990) Cryptography Implementation (SG24-7070).

Basically, the net result is that the cost of encrypting DB2 data under V9 is reduced on the z990 hardware.

Additionally, IBM has added encryption support in the controllers of its storage devices.

Both the IBM TS1120 tape drive and IBM Ultrium 4 tape drives include data encryption capabilities within the drives. This support can allow you to avoid the need for host-based encryption of data or the use of specialized encryption appliances. In addition, IBM claims that the encryption does not significantly impact the performance of the drives so there should be minimal to no impact on the batch processing window when encrypting in this manner.

So far, we’ve been talking about encryption for data at rest. But DB2 9 for z/OS also improves support for encryption of data in transit. DB2 9 supports the Secure Socket Layer (SSL) protocol by implementing the z/OS Communications Server IP Application Transparent Transport Layer Security (AT-TLS) function. The z/OS V1R7 Communications Server for TCP/IP introduces the AT-TLS function in the TCP/IP stack for applications that require secure TCP/IP connections. AT-TLS performs transport layer security on behalf of the application, in this case DB2 for z/OS, by invoking the z/OS system SSL in the TCP layer of the TCP/IP stack. The z/OS system SSL provides support for TLS V1.0, SSL V3.0, and SSL V2.0 protocols.

So encryption of data over the wire is improved in z/OS 1.7. The Communications Server supports AT-TLS, which uses SSL data encryption. Now SSL encryption has been available on z/OS for a long time, but now DB2 9 for z/OS makes use of this facility and offers SSL encryption using a new secure port.

When acting as a requester, DB2 for z/OS can request a connection using the secure port of another DB2 subsystem. When acting as a server, and from within a trusted context (I’ll discuss trusted context in a later DB2portal blog entry), SSL encryption can be required for the connection.

So, little by little, better encryption support is being made available within the world of DB2 for z/OS.

Tuesday, July 24, 2007

Index Compression [DB2 9 for z/OS]

Another useful new feature debuting in V9 is the ability to compress indexes. We’ve been able to compress DB2 data in table spaces for a long time now, either through an exit routine or with the COMPRESS table space parameter (added in DB2 V3). But before V9 we’ve never been able to compress index data.

Why would you want to compress index data? Well, some types of applications require very large indexes on very large tables - - data warehousing applications are one good example. Sometimes, the storage required for indexes to support your data warehouse applications can exceed the storage required for the base table. So it makes sense that you might want to reduce the storage consumed by such indexes.

DB2 V9 introduces the COMPRESS parameter for indexes. You can specify COMPRESS YES (or NO) on your CREATE INDEX and ALTER INDEX statements. Index partitioning is done at the index level and cannot be performed on a partition by partition basis.

Additionally, DB2 will only compress the data in leaf pages, not in the root page and any non-leaf pages in between. This makes sense because you don’t want to incur the expense of decompressing all of these type of pages in your indexes in order to find the right leaf page range.

Index compression does not require a compression dictionary. As such, DB2 can begin to immediately compress data in the leaf pages of your newly created indexes.

Now think about what we’ve already learned for a minute. The data on the leaf page is compressed, but we will want to access it uncompressed, right? So index pages are stored on disk in a compressed format but will be expanded when read. So those 4K index pages on disk will require more than 4K when expanded. This means that compressed indexes must be defined in a larger buffer pool (8K, 16K, or 32K). Nevertheless, when you compress an index DB2 will always compress the data down into a 4K page size on disk no matter what page size you choose.

Another consideration to keep in mind is that index data is decompressed for index image copies, so a copy of an index will require more storage space than the actual index requires.

So, when you move to DB2 9 in NFM you have an additional compression decision to make: which indexes should be compressed and which should not? But it is a good thing to have more options at our disposal, especially for applications with huge indexing requirements.

Monday, July 16, 2007

CLONE Tables [DB2 9 for z/OS]

This new feature in DB2 V9 might sound like an old monster movie (Invasion of the Clone Tables!!!), but it is actually a nifty new capability for managing DB2 data availability. Cloning is basically a method of entirely refreshing all of the rows of a table while maintaining availability to the table.

OK, so how does it work? Basically, you will create a table with the exact same attributes as a table that already exists at the current server, except that it is empty of data. It is created using the ALTER TABLE SQL statement with the ADD CLONE parameter. This clone table is created in the same table space as the existing table. After creating the clone table you can do whatever you want to do with it. LOAD it, INSERT rows to it, etc.

The clone table will be structurally identical to the base table in every way: same number of columns, column names, data types, as well as have the same indexes, before triggers, LOB objects and so on. All of this is done auto-magically by DB2 when it the clone table is created. Authorizations and views on the base table, however, are not duplicated, or cloned.

So, the clone is created and it can be manipulated without impacting the base table. When the clone is ready to become the base table it is exchanged with the base table. This is accomplished using the new EXCHANGE SQL statement. After running an EXCHANGE the clone becomes the real table and the previously “real” table becomes the clone - - and you can repeat the process.

The clone and the base table are kept in different underlying VSAM linear data sets. We all should know that the data sets used by DB2 are named using the following format:
cat.DSNDBD.DBNAME.TSNAME.I0001.A001. Well, the clone uses the same name except for the next to last component. It will be I0002. When the exchange is made, DB2 flips the I0001 with the I0002.

Of course, there are rules and details you will need to know to properly deploy cloning in your organizations, but that is all documented in the manuals. The point of these blog entries is more to make you aware of new functionality and give you the basic flavor for how it works and what it is intended for…

With that in mind, this is intended to allow you to quickly replace one version of DB2 table data with another version, without impacting availability while the new version is built.

Thursday, July 12, 2007

New Data Types [DB2 9 for z/OS]

As we continue our blog series on the new features and functionality of DB2 9 for z/OS, today we examine the four (OK, five) new data types introduced in this version of DB2.

BIGINT

First up, we have the BIGINT data type. A BIGINT is an exact numeric data type capable of representing 63-bit integers. This is the third integer data type now available to DB2 and it offers the ability to store the largest range of values:

  • SMALLINT values can range from -32768 to 32767
  • INTEGER values can range from -2147483648 to 2147483647
  • BIGINT values can range from -9223372036854775808 to 9223372036854775807

So when you have the need to store very large integers you don’t have to use DECIMAL with a zero scale any longer.

BINARY and VARBINARY

Next up, and perhaps more exciting, V9 delivers a true binary data type for the first time in DB2, so we no longer have to use a BLOB or CHAR with FOR BIT DATA. BINARY is a fixed-length binary string up to 255 bytes. DB2 9 also delivers a VARBINARY data type, which is a variable-length binary string up to 32704 bytes.

BINARY and VARBINARY data types extend current support of binary strings (BLOB), and are compatible with the BLOB data type. They are not compatible with character string data types. IBM does make it somewhat easy to migrate existing columns defined as CHAR FOR BIT DATA or VARCHAR FOR BIT DATA over to BINARY or VARBINARY. If there is an index defined on the column, the index is placed in RBDP. You cannot alter BINARY or VARBINARY data types to CHAR FOR BIT DATA or VARCHAR FOR BIT DATA.

Also, there are some usage considerations to keep in mind. Two binary strings are equal only if the lengths are identical. If two strings are equal up to the length of the shorter string length, the shorter string is considered less than the longer string, even when the remaining bytes in the longer string are hex zeros.

DECFLOAT

The next new data type supported by DB2 9 for z/OS is DECFLOAT. V9 takes advantage of new System z9 hardware support delivering the DECFLOAT data type that lets you use decimal floating-point numbers with greater precision. A decimal floating-point value (DECFLOAT) is an IEEE 754r number with a decimal point.

The maximum precision is 34 digits and the range of a decimal floating point number is either 16 or 34 digits of precision.


DECFLOAT(16) values can range from a low of:

-9.999999999999999×10**384


to a high of:

9.999999999999999×10**384



DECFLOAT(34) values can range from a low of:

-9.999999999999999999999999999999999 ×10**6144


to a high of:

9.999999999999999999999999999999999 ×10**6144



Anyone think they’ll need bigger numbers?

In addition, the DECFLOAT data type is able to represent special values representing “non-number numbers”:

  • Infinity - a value that represents a number whose magnitude is infinitely large.
  • Quiet NaN - a value that represents undefined results which does not cause an invalid number condition. NaN is not a number.
  • Signaling NaN - a value that represents undefined results which will cause an invalid number condition if used in any numerical operation.

XML

And finally, we get pureXML support to store XML as a native data type in DB2. That means you can specify XML as a data type for columns in your DB2 tables in DB2 9 for z/OS. But I’m not really going to elaborate any further on XML here. A few more details can be found in an earlier post here.