Sunday, December 07, 2008
Hectic
In the meantime, be sure to check in regularly on my other blog, Data Management Today.
And if you live in Richmond or Detroit, you can see me present at your local user group next week... Tuesday in Richmond (12/9/08) and Thursday in Detroit (12/11/08).
Friday, November 14, 2008
There is Still Time to Attend IDUG Regional Forums
Each Forum offers 2 days of education with 2 tracks: one covering DB2 for z/OS and another covering DB2 for LUW. IDUG is offering full two day registrations for $425 and single day registrations for only $225.
Here are the scheduled dates:
* Camp Hill, PA - November 17 and 18
* Kansas City, MO - November 19 and 20
Check out the links above for the full list of sessions in your area.
I'll be delivering my presentation titled "DB2 9: For Developer's Only" at both of these forums. And there will be many other great speakers there, too!
Friday, November 07, 2008
More on DB2 Date and Time Data: Arithmetic Expressions
Keep the following rules in mind.
When you issue date arithmetic statements using durations, do not try to establish a common conversion factor between durations of different types. For example, the following two date arithmetic statements are not equivalent:
1997/04/03 - 1 MONTH
1997/04/03 - 30 DAYS
April has 30 days, so the normal response would be to subtract 30 days to subtract one month. The result of the first statement is 1997/03/03, but the result of the second statement is 1997/03/04. In general, use like durations (for example, use months or use days, but not both) when you issue date arithmetic.
Another consideration: if one operand is a date, the other operand must be a date or a date duration. If one operand is a time, the other operand must be a time or a time duration. You cannot mix durations and data types with date and time arithmetic.
If one operand is a timestamp, the other operand can be a time, a date, a time duration, or a date duration. The second operand cannot be a timestamp. You can mix date and time durations with timestamp data types.
Now, what exactly is in that field returned as the result of a date or time calculation? Simply stated, it is a duration. There are three types of durations: date durations, time durations, and labeled durations.
Date durations are expressed as a DECIMAL(8,0) number. The result of subtracting one DATE value from another is a date duration. To be properly interpreted, the number must have the format yyyymmdd, where yyyy represents the number of years, mm the number of months, and dd the number of days.
Time durations are expressed as a DECIMAL(6,0) number. To be properly interpreted, the number must have the format hhmmss, where hh represents the number of hours, mm the number of minutes, and ss the number of seconds. The result of subtracting one TIME value from another is a time duration.
Labeled durations represent a specific unit of time as expressed by a number followed by one of the seven duration keywords: YEARS, MONTHS, DAYS, HOURS, MINUTES, SECONDS, or MICROSECONDS. A labeled duration can only be used as an operand of an arithmetic operator, and the other operand must have a data type of DATE, TIME, or TIMESTAMP. For example:
CURRENT DATE + 3 YEARS + 6 MONTHS
This will add three and a half years to the current date.
Thursday, November 06, 2008
Data Driven: A Great New Book on Data Quality Issues
Redman offers the basic thesis of the book right there on page one, where he states “…bad data lie at the root of issues of international importance, including the current subprime mortgage meltdown, lost and stolen identities, hospital errors and contested elections.” After laying down the problem, the rest of the book tells us what we need to do to correct the problems.
Data Driven will help you to improve the methods you deploy for the care and feeding of your data and information; in other words, helping you to control and manage data using similar processes and controls that you deploy on your other assets (finances, people, structures, etc.) – a noble goal, indeed!
The writing is concise and snappy – you won’t get bored reading this book. The style is engaging and it is easy to read. For example, instead of just saying what to do and how to do it, which can be boring, Redman discusses many of the arguments people use to say that data quality is impossible, and then debunks them showing that data quality is possible, if approached properly and thoroughly.
There are many good ideas, charts, and graphs in Data Driven, too. One of my favorites is on page 54, where you can find a chart of the ten habits followed by those with the best data. If you buy this book, make a poster-sized photocopy of that page and hang it up on the wall of the break room and in the data folks’ cubicles. Maybe the habits will rub off on everyone as they gaze upon them everywhere.
But the best little gem in this wonderful book is the entirety of the last chapter, which is titled “The Next One Hundred Days.” In this chapter Dr. Redman offers what he calls a hundred-day panorama. It is not a grand plan because most will not have the depth of understanding required to create such a plan and have it succeed. Instead, the panorama strives for breadth, not depth, with a focus on quality. Diligent readers can follow the guidance in this chapter and thereby begin the long-term process of appreciating the importance of data quality on their business practices.
And that alone is worth the price of the book… but, of course, Data Driven offers much more and I recommend it to every IT and business professional whose job relies on accurate data.
Monday, November 03, 2008
On Date Formats, Part 2
Q: My format does not fit into any of the formats listed in the DB2 manuals. What if I have a DATE stored like YYYYMMDD (with no dashes or slashes) and I want to compare it to a DB2 date?
A: Okay, let's look at one potential solution to your problem (and then I want to briefly talk about the use of proper data types). First of all you indicate that your date column contains dates in the following format: yyyymmdd with no dashes or slashes. You do not indicate whether this field is a numeric or character field - I will assume that it is character. If it is not, you can use the CHAR function to convert it to a character string.
Then, you can use the SUBSTR function to break the character column apart into the separate components, for example SUBSTR(column,1,4) returns the year component, SUBSTR(column,5,2) returns the month, and SUBSTR(column,7,2) returns the day.
Then you can concatenate all of these together into a format that DB2 recognizes, for example, the USA format which is mm/DD/yyyy. This can be done as follows:
SUBSTR(column,5,2) || "/" || SUBSTR(column,7,2) ||
"/" || SUBSTR(column,1,4)
Then you can use the DATE function to convert this character string into a DATE that DB2 will recognize. This is done as follows:
DATE(SUBSTR(column,5,2) || "/" || SUBSTR(column,7,2) ||
"/" || SUBSTR(column,1,4))
The result of this can be used in date arithmetic with other dates or date durations. Of course, it may not perform extremely well, but it should return the results you desire.
Now, a quick word about using proper data types. I say this all of the time, but there are many applications and implementations "out there" that do not heed the advice: it is wise to use the DATE data type when you store dates in DB2 tables. It simplifies life later on when you want to do things like formatting dates and performing date arithmetic.
Using the appropriate data type also ensures that DB2 will perform the proper integrity checks on the columns when data is entered, instead of requiring application logic to ensure that valid dates are entered.