The last post I made to this blog used functions to translate data from one form to another. Functions frequently can be used to overcome issues when you are reading data from your DB2 tables. With that in mind, I thought it might make sense to go over DB2 SQL functions at a basic level - as a review for those who might not have used them before and to refresh the memory of those who have.
Functions can be specified in SQL statements to transform data from one state to another. Two types of functions can be applied to data in a DB2 table using SQL: column functions and scalar functions. Column functions analyze a group of rows and compute a single value for a designated column or expression. For example, the SUM function can be used to add, returning the sum of the values instead of each individual value:
SELECT SUM(SALARY)
FROM EMP
WHERE WORKDEPT = 'A01';
This query will return the total salary for all employees in the A01 department. This is a better solution than writing a program to read all employee rows in department A01 and then adding the salaries. With the SUM function less data is returned to the program, so it will operate much more efficiently.
The second type of function is a scalar function. Scalar functions are applied to a column or expression and operate on a single value. For example, the DAYOFWEEK function reads a single date value and returns an integer in the range of 1 to 7; where 1 is Sunday and 7 is Saturday:
SELECT DAYOFWEEK(HIREDATE)
FROM EMP
WHERE EMPNO = '000100';
This query will return an integer between 1 and 7 that indicates the day of the week when employee number 000100 was hired. Of course, this is just one example of many, many scalar functions supported by DB2. For a complete list of the available functions, refer to Chapter 3 of the IBM DB2 SQL Reference manual. Using scalar functions is generally more efficient than writing your own code to do the same thing.
Speaking of writing your own code, though, you can write your own user-defined functions (UDFs) in addition to the base, system-defined functions. UDFs give you the ability to write your own customized DB2 functions and standardize on that code. Consider creating UDFs for specific algorithms and processing that need to be executed in multiple applications in your organization. By writing the code once, putting it into a DB2 UDF, and then calling it whenever you need it, you will minimize errors and improve reusability.
A user-defined function can be specified anywhere a system-defined function can be specified.
Wednesday, January 31, 2007
Sunday, January 28, 2007
Sorting Out a Solution in DB2
Sometimes an application can require different, perhaps even somewhat "odd" data sorting. These needs may cause developers to sit and scratch their head for hours trying to make DB2 do something that seems "unnatural." But often you can conjure up an answer by properly understanding the problem and applying some creative SQL.
Some of you might be asking "What the heck is he talking about?" That's a fair question, so let’s look at an example to clarify.
Assume that you have a table containing transactions, or some other type of interesting facts. The table has a CHAR(3) column containing the name of the day on which the transaction happened; let’s call this column DAY_NAME.
Now, let’s further assume that we want to write queries against this table that orders the results by DAY_NAME. We’d want Sunday first, followed by Monday, Tuesday, Wednesday, and so on. How can this be done?
Well, if we write the first query that comes to mind the results will obviously be sorted improperly:
SELECT DAY_NAME, COL1, COL2 . . .
FROM TXN_TABLE
ORDER BY DAY_NAME;
The results from this query would be ordered alphabetically; in other words
FRI
MON
SAT
SUN
THU
TUE
WED
This is what I mean be an irregular sorting requirement. The example may not be an everyday need, but it is not unrealistic for a business to have this, or a similar requirement that needs a different sort order than strictly alphabetical or numeric. So what is the solution here?
Of course, one solution would be to design the table with an additional numeric or alphabetic column that would sort properly. By this I mean that we could add a DAY_NUM column that would be 1 for Sunday, 2 for Monday, and so on. But this requires a database design change, and it becomes possible for the DAY_NUM and DAY_NAME to get out of sync.
There is another solution that is both elegant and does not require any change to the database. To implement this solution all you need is an understanding of SQL and SQL functions – in this case, the LOCATION function. Here is the SQL:
SELECT DAY_NAME, COL1, COL2 . . .
FROM TXN_TABLE
ORDER BY LOCATE(DAY_NAME, 'SUNMONTUEWEDTHUFRISAT');
The trick here is to understand how the LOCATE function works: it returns the starting position of the first occurrence of one string within another string.
So, in our example SQL, LOCATE finds the position of the DAY_NAME value within the string 'SUNMONTUEWEDTHUFRISAT', and returns the integer value of that position. So, if DAY_NAME is WED, the LOCATE function in the above SQL statement returns 10. Sunday would return 1, Monday 4, Tuesday 7, Wednesday 10, Thursday 13, Friday 16, and Saturday 19. This means that our results would be in the order we require.
(Note: Other database systems have a similar function called INSTR.)
Of course, you can go one step further if you’d like. Some queries may need to actually return the day of week. You can use the same technique with a twist to return the day of week value given only the day’s name. To turn this into the appropriate day of the week number (that is, a value of 1 through 7), we divide by three, use the INT function on the result to return only the integer portion of the result, and then add one:
INT(LOCATE(DAY_NAME, 'SUNMONTUEWEDTHUFRISAT')/3) + 1;
Let’s use our previous example of Wednesday again. The LOCATE function returns the value 10. So, INT(10/3) = 3 and add 1 to get 4. And sure enough, Wednesday is the fourth day of the week.
With a sound understanding of the features of DB2 SQL, many requirements that seem "odd" at first glance, are achievable using nothing more than SQL and your imagaination.
Some of you might be asking "What the heck is he talking about?" That's a fair question, so let’s look at an example to clarify.
Assume that you have a table containing transactions, or some other type of interesting facts. The table has a CHAR(3) column containing the name of the day on which the transaction happened; let’s call this column DAY_NAME.
Now, let’s further assume that we want to write queries against this table that orders the results by DAY_NAME. We’d want Sunday first, followed by Monday, Tuesday, Wednesday, and so on. How can this be done?
Well, if we write the first query that comes to mind the results will obviously be sorted improperly:
SELECT DAY_NAME, COL1, COL2 . . .
FROM TXN_TABLE
ORDER BY DAY_NAME;
The results from this query would be ordered alphabetically; in other words
FRI
MON
SAT
SUN
THU
TUE
WED
This is what I mean be an irregular sorting requirement. The example may not be an everyday need, but it is not unrealistic for a business to have this, or a similar requirement that needs a different sort order than strictly alphabetical or numeric. So what is the solution here?
Of course, one solution would be to design the table with an additional numeric or alphabetic column that would sort properly. By this I mean that we could add a DAY_NUM column that would be 1 for Sunday, 2 for Monday, and so on. But this requires a database design change, and it becomes possible for the DAY_NUM and DAY_NAME to get out of sync.
There is another solution that is both elegant and does not require any change to the database. To implement this solution all you need is an understanding of SQL and SQL functions – in this case, the LOCATION function. Here is the SQL:
SELECT DAY_NAME, COL1, COL2 . . .
FROM TXN_TABLE
ORDER BY LOCATE(DAY_NAME, 'SUNMONTUEWEDTHUFRISAT');
The trick here is to understand how the LOCATE function works: it returns the starting position of the first occurrence of one string within another string.
So, in our example SQL, LOCATE finds the position of the DAY_NAME value within the string 'SUNMONTUEWEDTHUFRISAT', and returns the integer value of that position. So, if DAY_NAME is WED, the LOCATE function in the above SQL statement returns 10. Sunday would return 1, Monday 4, Tuesday 7, Wednesday 10, Thursday 13, Friday 16, and Saturday 19. This means that our results would be in the order we require.
(Note: Other database systems have a similar function called INSTR.)
Of course, you can go one step further if you’d like. Some queries may need to actually return the day of week. You can use the same technique with a twist to return the day of week value given only the day’s name. To turn this into the appropriate day of the week number (that is, a value of 1 through 7), we divide by three, use the INT function on the result to return only the integer portion of the result, and then add one:
INT(LOCATE(DAY_NAME, 'SUNMONTUEWEDTHUFRISAT')/3) + 1;
Let’s use our previous example of Wednesday again. The LOCATE function returns the value 10. So, INT(10/3) = 3 and add 1 to get 4. And sure enough, Wednesday is the fourth day of the week.
With a sound understanding of the features of DB2 SQL, many requirements that seem "odd" at first glance, are achievable using nothing more than SQL and your imagaination.