Friday, August 25, 2017

The Db2 12 for z/OS Blog Series - Part 15: DSN1COPY and Data Validation Improvements

If you’ve worked with Db2 for z/OS for awhile (note to IBM: I still have a problem with that lower case "b" but I'm trying), particularly as a DBA, you’ve almost certainly had the opportunity to use the DSN1COPY offline utility, sometimes called the Offline Copy utility.

DSN1COPY can be used in many helpful ways. For example, it can be used to copy data sets or check the validity of table space and index pages. Another use is to translate Db2 object identifiers for the migration of objects between Db2 subsystems or to recover data from accidentally dropped objects. DSN1COPY also can print hexadecimal dumps of Db2 table space and index data sets.

Its primary function, however, is to copy data sets. DSN1COPY can be used to copy VSAM data sets to sequential data sets, and vice versa. It also can copy VSAM data sets to other VSAM data sets and can copy sequential data sets to other sequential data sets. As such, DSN1COPY can be used to

  • Create a sequential data set copy of a Db2 table space or index data set.
  • Create a sequential data set copy of another sequential data set copy produced by DSN1COPY.
  • Create a sequential data set copy of an image copy data set produced using the Db2 COPY utility, except for segmented table spaces. (The DB2 COPY utility skips empty pages, thereby rendering the image copy data set incompatible with DSN1COPY.)
  • Restore a Db2 table space or index using a sequential data set produced by DSN1COPY.
  • Restore a Db2 table space using a full image copy data set produced using the Db2 COPY utility.
  • Move Db2 data sets from one disk to another.
  • Move a Db2 table space or index space from a smaller data set to a larger data set to eliminate extents. Or move a Db2 table space or index space from a larger data set to a smaller data set to eliminate wasted space.

Given such a wide array of useful purposes you can see how DSN1COPY is an important arrow in a DBA’s quiver… But remember, it is an offline utility, so Db2 is not aware of – or in control of the data that is moving. So if you use it to change data in a production page set data integrity issues can arise. For example, you may get mismatches between the data page format and the description of the format in the Db2 Catalog.

Other types of errors that can ensue when using DSN1COPY include:
  • incorrect DBID/PSID/OBID values,
  • improper table space layout (for example, using DSN1COPY to copy data from a segmented table space to a partition-by-growth universal table 
  • version number and table definition errors


In scenarios where DSN1COPY was not used properly you can encounter invalid data, abends, and storage overlays. Not good!

Thankfully, we get some help in Db2 12 for z/OS though. Improvements to the REPAIR utility make it easier to detect and correct data mismatches. You can use the REPAIR CATALOG utility to fix situations where the column data type or length in the table space differs from the catalog definition for the column. If Db2 can convert from the data type and length in the table space to the data type and length in the column then the REPAIR CATALOG utility enables conversion. The data type or length of the data in the table space will be changed to match the definition in the Db2 Catalog the next time that the data is accessed.

Additionally, we can use the REPAIR CATALOG TEST utility to detect multiple types of data mismatches. All of the following can be detected:
  • If a range-partitioned table space indicates absolute page numbering, but the catalog indicates relative page numbering; and vice versa.
  • The number of columns in the table space is greater than the number of columns in the catalog definition of the table.
  • The column data type or length in the table space differs from the catalog definition for the column.

So Db2 12 makes life a bit easier for those of us who use DSN1COPY and sometimes do not specify the parameters or the data sets exactly perfectly.

Tuesday, August 01, 2017

The DB2 12 for z/OS Blog Series - Part 14: Improved MERGE SQL Statement

A very common requirement for application developers is to be able to read through new data – from a table, a file or as entered by an end user – and either INSERT the data if it does not already exist or UPDATE data that does exist with new values.

The ANSI SQL standard defines the MERGE statement for this purpose. The purpose of the MERGE statement is to take two “tables” and merge the data into one table. DB2 for z/OS has supported the MERGE statement since Version 9, but it is more functional now as of Version 12.

Prior to DB2 12, the MERGE statement could not accept a table reference as a way of supplying source data. Input to the MERGE can only be a host variable array or a list of values. This limitation caused MERGE to be somewhat lightly implemented.

Well, Version 12 does away with this limitation – and adds even more features. So you can now write a MERGE statement where data from one table is merged with data from another table. Remember, merge takes the data and compares it and when the comparison is matched does one thing… and when the comparison is not matched it does another. So you can UPDATE when matched and INSERT when not matched.

Consider the following SQL:

MERGE INTO EMP Tgt
USING (SELECT EMPNO, FNAME, LNAME, ADDRESS, SALARY FROM NEW_EMP) Src
ON (Tgt.EMPNO = Src.EMPNO)
WHEN MATCHED THEN
  UPDATE SET (Tgt.FNAME, Tgt.LNAME, Tgt.ADDRESS, Tgt.SALARY) =
  (Src.FNAME, Src.LNAME, Src.ADDRESS, Src.SALARY)
WHEN NOT MATCHED THEN
  INSERT (EMPNO, FNAME, LNAME, ADDRESS, SALARY)
  VALUES (Src.EMPNO, Src.FNAME, Src.LNAME, Src.ADDRESS, Src.SALARY)
ELSE IGNORE;

This MERGE statement takes a table containing new/revised employee data and inserts the data when a match is not found and updates the data if it is found. Note that this is a simple MERGE that assumes that all the columns (in this case) are provided if the data is to be updated.

More complex MERGE statements are possible as of DB2 12 because you can now provide additional matching condition options and additional predicates on the matching conditions (instead of just matched/not matched). It is also possible to issue a SIGNAL statement to return an error when a matching condition evaluates to True.

When you use the new functionality of the MERGE statement in DB2 12+, the operations is atomic; this means that the source rows are processed as a set of rows by each WHEN clause. If an error occurs for any source row, processing stops and no target rows are modified.


But the bottom line here is that the MERGE statement has been significantly improved and is a powerful way of processing data using only SQL as of DB2 12 for z/OS. 

Monday, July 17, 2017

The IBM z14: There's a New Mainframe Coming!

Today, July 17, 2017, IBM announced the next iteration of its stalwart mainframe hardware, the IBM z14. The news comes a little more than two years since the previous version, the IBM z13. The hardware will be generally available later this year.



The big news is that IBM delivering pervasive encryption as part of the IBM z14. With pervasive encryption you can encrypt and protect all of your data all the time, at any scale, without having to change any application code or do any development work. That means you can protect all of your data without requiring an interruption to your business.

The IBM z14 features the industry's fastest microprocessor, running at 5.2GHz, and a new scalable system structure that delivers up to a 35 percent capacity increase compared to the previous generation z13. The system can support over 12 billion encrypted transactions per day on a single system.
Other new capabilities of the IBM z14 include up to 32 TB of memory (3x the memory of the z13), three times faster I/O, and a 10x latency reduction with SAN response time using zHyperLink.
IBM also has introduced three new pricing models to help reduce the cost of mainframe computing, specifically for development and testing, new application workloads, and payment processing.
All in all, with the announcement of the IBM z14, IBM is doing what it takes to make the best computing platform even better. I look forward to using the platform and seeing what it can do for high availability, high performance, secure computing!

Wednesday, July 12, 2017

The DB2 12 for z/OS Blog Series - Part 13: DRDA Fast Load

Have you ever had a situation where you needed to load data into a DB2 table, but the file with the data was not on the mainframe? So you had to PTF that data to the mainframe and then load it.

Well, with DB2 12 for z/OS you get a new capability to load the data to the mainframe without moving the file. The DRDA fast load feature provides you with an efficient way to load data to DB2 for z/OS tables from files that are stored on distributed clients.

The DNSUTILU stored procedure can be invoked by a DB2 application
program to run DB2 online utilities. This means that you can run an online LOAD utility using DSNUTILU. Before loading remote data, you must bind the DSNUT121 package at each location where you will be loading data. A local package for DSNUT121 is bound by installation job DSNTIJSG when you install or migrate to a new version of DB2 for z/OS.

The DB2 Call Level Interface APIs and Command Line Processor have been enhanced to support remote loading of data to DB2 for z/OS. They have been modified to stream data in continuous blocks for loading. This feature is supported in all DB2 client packages. The extraction task for data blocks that passes them to the LOAD utility is 100 percent offloadable to the zIIP, so the process can result in reduced elapsed time.


This capability is available before activating new function.

Thursday, June 29, 2017

The DB2 12 for z/OS Blog Series - Part 12: New Built-in Functions

As with most new releases of DB2 for z/OS, at least lately, there are several new built-in functions (or BIFs) that have been added. DB2's BIFs are used to translate data from one form or state to another. They can be used to overcome data format, integrity and transformation issues when you are reading data from DB2 tables. 

So what new things can we do with functions in DB2 12 for z/OS?


The ARRAY_AGG function can be used to build an array from table data. It returns an array in which each value of the input set is assigned to an element of the array. So basically speaking, you can use ARRAY_AGG to read values from rows of a table and convert those values into an array. For example, if I wanted to create an array of name from the EMP table for all females employees I could write it like this:


SET ARRAYNAME = (SELECT LASTNAME FROM DSN8C10.EMP WHERE SEX = 'F'); 

The new part is the ability to use an associative array aggregation. That means that the ARRAY_AGG function is invoked where there is a target user-defined array data type in the same statement, or the result of the ARRAY_AGG function is explicitly cast to a user-defined array data type.


More details can be found here.


Another new capability comes with the LISTAGG function, which is only available as of function level 501. The LISTAGG function aggregates a set of string values for a group into

one string by appending the string-expression values based on the order that is specified in the WITHIN GROUP clause.

So if I needed to create a list of comma-separated names, in alphabetical order grouped by department I could write:


SELECT WORKDEPT,
       LISTAGG(LASTNAME, ’, ’) WITHIN GROUP(ORDER BY LASTNAME)
       AS EMPLOYEES
FROM   DSN8C10.EMP

GROUP BY WORKDEPT;

You can find additional details here.


DB2 12 for z/OS also adds functions for calculating the percentile of a set of values. There are two options:

  • PERCENTILE_CONT
  • PERCENTILE_DISC
The PERCENTILE_CONT function returns a percentile of a set of values treated as a continuous distribution. The calculated percentile is an interpolated value that might not have appeared in the input set.

On the other hand, the PERCENTILE_DISC function returns a percentile of a set of values treated as discrete values. The calculated percentile is always a value that appeared in the input set.


Consider the following two statements:


SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY COMM),
FROM   DSN8C10.EMP
WHERE  WORKDEPT = 'E21';


The result here, using the sample data, would be 1968.50. There are an even number of rows, so the percentile using the PERCENTILE_CONT function would be determined by interpolation. The average of the value of the two middle rows (1907.00 and 2030.00) is used.

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY COMM),
FROM EMPLOYEE
WHERE WORKDEPT = 'E21';

The same SQL statement but substituting PERCENTILE_DISC for PERCENTILE_CONT would return 1907.00. Again, the example would return 6 rows (an even number) but instead of an average a discrete value is returned; the value of the first of the two middle rows, which is 1907.00.


Another set of new functions give the ability to generate unique values that can be used for keys:
  • GENERATE_UNIQUE
  • GENERATE_UNIQUE_BINARY
In both cases, the function will return a unique value that includes the internal form of the Universal Time, Coordinated (UTC), and the Sysplex member (for Data Sharing environments). 

For GENERATE_UNIQUE a bit data character string 13 bytes long is returned. That means CHAR(13) FOR BIT DATA.

For GENERATE_UNIQUE_BINARY a BINARY(16) value is returned. Both functions require parentheses without any arguments.


You can use the new WRAP function to obfuscate your database code objects. The function works only on procedural objects (stored procedures, triggers and user-defined functions).

The general idea behind wrapping procedural database objects is to encode a readable data definition statement such that its contents are not easily identified. The procedural logic and embedded SQL statements in an obfuscated data definition statement are scrambled in such a way that any intellectual property in the logic cannot be easily extracted.


A related system stored procedure, CREATE_WRAPPED, is also provided that can be used to obfuscate a readable data definition statement and deploy it in the database. 
Read syntax diagram

More details can be found here and here and here.


Finally, there are a series of new functions for returning hashes. Given an expression, a hash algorithm is applied and the hash value is returned. There are four options:
  • HASH_CRC32
  • HASH_MD5
  • HASH_SHA1
  • HASH_SHA256
The name of the function determines the hashing algorithm that is used and the data type of the result, as shown in the table below:


BIF Algorithm Data Type
HASH_CRC32 CRC32 BINARY(4)
HASH_MD5 MD5 BINARY(16)
HASH_SHA1 SHA1 BINARY(20)
HASH_SHA156 SHA256 BINARY(32)

Summary


The general advice for every release of DB2 holds for DB2 12: always read through the manuals to find the new functions that can be used to minimize the amount of programming and work that needs to be done. It is important for both DBAs (in order to give good advice and be able to review SQL) and programmers (in order to write efficient and effective SQL) to know what functions are available. Be sure to review the new BIFs in DB2 12 and test them out to see how they work and where they can best be used at your shop!