Tuesday, January 17, 2017

The DB2 12 for z/OS Blog Series - Part 1: SQL Pagination

Today’s blog post kicks off a new series of posts that examine new features and functionality delivered with the latest and greatest new version of DB2 foir z/OS – Version 12.

We’ll begin with a nice new feature that should make it easier for programmers writing queries where the program needs to be able to deal with a subset of results. For example, to show the first 25 rows, then the next, etc. This is a frequent requirement for mobile and web applications that are common these days.

This separating of rows into piece can now be accomplished quite easily in DB2 12 with the new OFFSET syntax. OFFSET is specified as a clause of the SELECT statement and it is used to specify the number of rows in the result table that will be skipped before
any rows are retrieved.

The offset clause is simple to code. You just need to determine the number of rows that you want to skip and code that in the clause. For example, to skip 5 rows you would code OFFSET 5 ROWS. Simple, right? Well, conceptually, yes, but in practice you have to be careful.

First of all, you must know where/when you can code an OFFSET clause. OFFSET can be specified on a subselect or fullselect that is outermost fullselect in a prepared SQL statement or a DECLARE CURSOR statement. OFFSET is also permissible in a SELECT INTO statement. However, you cannot code an OFFSET clause in a view definition, an MQT, the RETURN statement of SQL table functions, in row permissions or column masks, or in the outermost fullselect for a sensitive dynamic cursor. A further restriction is that the statements that includes the OFFSET clause cannot contain an expression that is not deterministic or that has external action.

The second, and I think more interesting aspect to consider when using the OFFSET clause is that your result set must be predictable to assure that you are retrieving useful data. As we all should know, the access path chosen by the optimizer can alter the order in which rows are returned. If you want to access rows in sets of 25, for example, then you want to make sure that each successive set does not contain any rows from previous sets, and does not omit any rows in between sets.

So how do we do this? The best approach is to code up an ORDER BY clause that specifies columns that uniquely identify each row in the result table. If there are duplicates, then there is no way to verify that you are getting the data that you want. IN other words, the order of the rows is not deterministic. The data being skipped will not be predictable and it is highly likely that you will not be accessing all of the data that you want (or perhaps even accessing the same data twice).

It is also important to understand that if the OFFSET number of rows is greater than the number of rows in the intermediate result table, you will get an empty result.

So let’s look at a quick example. Suppose we are looking for employees earning more than $50000. Further, we want to skip the first 10 rows of the EMP table, and grab the next 10. This SQL should do the trick:

SELECT *
FROM EMP
WHERE SALARY > 50000.00
ORDER BY EMPNO
OFFSET 10 ROWS
FETCH FIRST 10 ROWS ONLY;

The OFFSET 10 ROWS will cause the first 10 qualifying rows to be skipped. The FETCH clause gets the next 10 qualifying rows (after skipping the first 10).

But OFFSET is not the only new feature to help developers with SQL pagination. Also available with DB2 12 is data-dependent pagination, which uses row value
expressions in a basic predicate. That sounds more complicated than it really is, so let’s look at an example:

We can code a WHERE clause like this to search more rows with a name greater than mine:

WHERE (LASTNAME, FIRSTNAME) > (′MULLINS′, ′CRAIG′)

This is easier than what we previously had to do (before DB2 12), which was breaking the WHERE up as follows:

WHERE (LASTNAME = ′MULLINS′ AND FIRSTNAME > ′CRAIG′)
OR (LASTNAME > ′MULLINS′)


The bottom line is that application coding for pagination becomes a lot easier in DB2 12 for z/OS…