Sorting is a very costly operation that you should strive to avoid if at all possible. Indexes are very useful for this purpose. DB2 can choose to use an available index to satisfy the sorting requirements of an ORDER BY, GROUP BY, or DISTINCT clause. Therefore, it can be beneficial to create indexes that correspond to these clauses for frequently executed SQL statements.
For example, consider this SQL statement:
SELECT LAST_NAME, TITLE, SALARY FROM EMPLOYEE ORDER BY LAST_NAME, TITLE;
If an index on LAST_NAME and TITLE exists, DB2 can use it to avoid sorting. By retrieving the data in order from the index, sorting becomes unnecessary.
You can use this information to tune your SQL further to avoid sorting. When ordering is required only by one column, sometimes adding another column to the ORDER BY can enable indexed access to avoid the sort.
Consider this SQL statement:
SELECT DEPT_CODE, MANAGER, DEPT_NAME, FROM DEPARTMENT ORDER BY MANAGER, DEPT_CODE;
Say there is a unique index on MANAGER. DB2 will probably use this index to satisfy the request. However, a sort will be done after retrieving the data to put the results into DEPT_CODE order within MANAGER. But, since we know our data, we are able to change this situation. Because MANAGER is unique we know that the following SQL statement is equivalent to the prior one:
SELECT DEPT_CODE, MANAGER, DEPT_NAME, FROM DEPARTMENT ORDER BY MANAGER;
In this case, DB2 can use the index on MANAGER to avoid the sort. The extra column, DEPT_CODE, is removed from the ORDER BY clause. But, since MANAGER is unique, there can be at most one DEPT_CODE per MANAGER. Therefore, the sort order will be equivalent. Because we knew our data we removed a sort from the process and made our query more efficient!
One final note on sorting: although most of you probably know this, it cannot be stated too strongly (or frequently) - always code an ORDER BY if the order of the results of your query is important. The ORDER BY clause is the only foolproof method of ensuring appropriate sort order of query results. Simply verifying that access is via an index is not sufficient to ensure the order of the results because:
- The access path may change in the future to non-indexed access.
- Even indexed access does not always return ordered results (e.g. list prefetch).
Another simple, yet effective, means of enhancing SQL performance is to understand how UNION, EXCEPT and INTERSECT work. Let's start with UNION because it has been around in DB2 the longest. UNION takes the results of multiple SELECT statements and combines them together. It does this, as part of the UNION operation, by sorting the results and removing any duplicates. UNION ALL, on the other hand, will not sort and will not remove duplicates.
If you know that the results of the queries being unioned together are distinct
(that is, no duplicates will be encountered), then you can use UNION ALL instead of UNION, and thereby enhance performance by avoiding the sort. Additionally, if you do not care whether duplicates are returned, always use UNION ALL.
The same applies to the EXCEPT and INTERSECT operations, which were added
as of DB2 9 for z/OS.