Friday, November 14, 2008

There is Still Time to Attend IDUG Regional Forums

Just a quick entry today to remind everyone that IDUG has started conducting regional events. The first one was held last week in San Ramon, California, and next week two events will be held: one in Camp Hill, Pennsylvania, and one in Kansas City, Missouri (actually Lenexa, KS). The forums offer you an opportunity to obtain DB2 education quickly and inexpensively.

Each forum offers 2 days of education with 2 tracks: one covering DB2 for z/OS and another covering DB2 for LUW. IDUG is offering full two-day registrations for $425 and single-day registrations for only $225.

Here are the scheduled dates:

* Camp Hill, PA - November 17 and 18
* Kansas City, MO - November 19 and 20

Check out the links above for the full list of sessions in your area.

I'll be delivering my presentation titled "DB2 9: For Developer's Only" at both of these forums. And there will be many other great speakers there, too!

Friday, November 07, 2008

More on DB2 Date and Time Data: Arithmetic Expressions

DB2 allows you to add and subtract DATE, TIME, and TIMESTAMP columns. In addition, you can add date and time durations to, or subtract them from, date and time columns. But use date and time arithmetic with care. If you do not understand the capabilities and features of date and time arithmetic, you will likely encounter some problems implementing it.

Keep the following rules in mind.

When you issue date arithmetic statements using durations, do not try to establish a common conversion factor between durations of different types. For example, the following two date arithmetic statements are not equivalent:

    1997/04/03 - 1 MONTH
    1997/04/03 - 30 DAYS

It is tempting to treat one month as 30 days and expect both statements to return the same date. They do not: the result of the first statement is 1997/03/03, but the result of the second statement is 1997/03/04, because the subtraction crosses March, which has 31 days. In general, use like durations (for example, use months or use days, but not both) when you issue date arithmetic.
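As a rough sketch, the comparison above could be run as a single query. It assumes the SYSIBM.SYSDUMMY1 one-row table is available, and the column aliases are made up for illustration. The way the dates are displayed depends on your installation's default date format.

```sql
-- Both expressions start from the same date; only the duration unit differs.
SELECT DATE('1997-04-03') - 1 MONTH  AS MINUS_ONE_MONTH,    -- March 3, 1997
       DATE('1997-04-03') - 30 DAYS  AS MINUS_THIRTY_DAYS   -- March 4, 1997
FROM   SYSIBM.SYSDUMMY1;
```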

Another consideration: if one operand is a date, the other operand must be a date or a date duration. If one operand is a time, the other operand must be a time or a time duration. You cannot mix durations and data types with date and time arithmetic.

If one operand is a timestamp, the other operand can be a time, a date, a time duration, or a date duration. The second operand cannot be a timestamp. You can mix date and time durations with timestamp data types.

Now, what exactly is in that field returned as the result of a date or time calculation? Simply stated, it is a duration. There are three types of durations: date durations, time durations, and labeled durations.

Date durations are expressed as a DECIMAL(8,0) number. The result of subtracting one DATE value from another is a date duration. To be properly interpreted, the number must have the format yyyymmdd, where yyyy represents the number of years, mm the number of months, and dd the number of days.

Time durations are expressed as a DECIMAL(6,0) number. To be properly interpreted, the number must have the format hhmmss, where hh represents the number of hours, mm the number of minutes, and ss the number of seconds. The result of subtracting one TIME value from another is a time duration.
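To make the two duration formats a little more concrete, here is a small sketch that subtracts one date from another and one time from another. The literal values and column aliases are arbitrary choices for illustration, and SYSIBM.SYSDUMMY1 is assumed to be available.

```sql
SELECT DATE('2008-11-14') - DATE('2007-10-01') AS DATE_DURATION,
       -- 10113, read as yyyymmdd = 0001 01 13: 1 year, 1 month, 13 days
       TIME('14:30:00') - TIME('09:15:30')     AS TIME_DURATION
       -- 51430, read as hhmmss = 05 14 30: 5 hours, 14 minutes, 30 seconds
FROM   SYSIBM.SYSDUMMY1;
```

Note that leading zeros are not shown because the results are decimal numbers, not character strings.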

Labeled durations represent a specific unit of time as expressed by a number followed by one of the seven duration keywords: YEARS, MONTHS, DAYS, HOURS, MINUTES, SECONDS, or MICROSECONDS. A labeled duration can only be used as an operand of an arithmetic operator, and the other operand must have a data type of DATE, TIME, or TIMESTAMP. For example:

CURRENT DATE + 3 YEARS + 6 MONTHS 

This will add three and a half years to the current date.

Thursday, November 06, 2008

Data Driven: A Great New Book on Data Quality Issues

If you are at all involved in assuring the quality of your company’s data, you need to know the work of Thomas C. Redman. Dr. Redman has been working on improving data quality for years and he has written numerous articles and books on the subject. His latest book, Data Driven: Profiting From Your Most Important Business Asset, is another winner.

Redman offers the basic thesis of the book right there on page one, where he states “…bad data lie at the root of issues of international importance, including the current subprime mortgage meltdown, lost and stolen identities, hospital errors and contested elections.” After laying down the problem, the rest of the book tells us what we need to do to correct the problems.

Data Driven will help you improve the methods you deploy for the care and feeding of your data and information; in other words, it helps you control and manage data using processes and controls similar to those you deploy on your other assets (finances, people, structures, etc.) – a noble goal, indeed!

The writing is concise and snappy – you won’t get bored reading this book. The style is engaging and it is easy to read. For example, instead of just saying what to do and how to do it, which can be boring, Redman discusses many of the arguments people use to say that data quality is impossible, and then debunks them showing that data quality is possible, if approached properly and thoroughly.

There are many good ideas, charts, and graphs in Data Driven, too. One of my favorites is on page 54, where you can find a chart of the ten habits followed by those with the best data. If you buy this book, make a poster-sized photocopy of that page and hang it up on the wall of the break room and in the data folks’ cubicles. Maybe the habits will rub off on everyone as they gaze upon them everywhere.

But the best little gem in this wonderful book is the entirety of the last chapter, which is titled “The Next One Hundred Days.” In this chapter Dr. Redman offers what he calls a hundred-day panorama. It is not a grand plan because most will not have the depth of understanding required to create such a plan and have it succeed. Instead, the panorama strives for breadth, not depth, with a focus on quality. Diligent readers can follow the guidance in this chapter and thereby begin the long-term process of appreciating the importance of data quality to their business practices.

And that alone is worth the price of the book… but, of course, Data Driven offers much more and I recommend it to every IT and business professional whose job relies on accurate data.

Monday, November 03, 2008

On Date Formats, Part 2

Here is a follow-up question and answer based on my previous blog post:


Q: My format does not fit into any of the formats listed in the DB2 manuals. What if I have a DATE stored like YYYYMMDD (with no dashes or slashes) and I want to compare it to a DB2 date?


A: Okay, let's look at one potential solution to your problem (and then I want to briefly talk about the use of proper data types). First of all you indicate that your date column contains dates in the following format: yyyymmdd with no dashes or slashes. You do not indicate whether this field is a numeric or character field - I will assume that it is character. If it is not, you can use the CHAR function to convert it to a character string.


Then, you can use the SUBSTR function to break the character column apart into the separate components, for example SUBSTR(column,1,4) returns the year component, SUBSTR(column,5,2) returns the month, and SUBSTR(column,7,2) returns the day.


Then you can concatenate all of these together into a format that DB2 recognizes, for example, the USA format, which is mm/dd/yyyy. This can be done as follows:


SUBSTR(column,5,2) || '/' || SUBSTR(column,7,2) || 
'/' || SUBSTR(column,1,4)


Then you can use the DATE function to convert this character string into a DATE that DB2 will recognize. This is done as follows:


DATE(SUBSTR(column,5,2) || '/' || SUBSTR(column,7,2) || 
'/' || SUBSTR(column,1,4))


The result of this can be used in date arithmetic with other dates or date durations. Of course, it may not perform extremely well, but it should return the results you desire.
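Putting the pieces together, a query might look something like the following sketch. The table ORDERS and the columns ORDER_NO and ORDER_DT_CHAR are hypothetical names invented for this example; ORDER_DT_CHAR is assumed to be a CHAR(8) column holding yyyymmdd values.

```sql
-- Hypothetical table ORDERS; ORDER_DT_CHAR is assumed to be CHAR(8) in yyyymmdd form.
SELECT ORDER_NO,
       DATE(SUBSTR(ORDER_DT_CHAR,5,2) || '/' ||
            SUBSTR(ORDER_DT_CHAR,7,2) || '/' ||
            SUBSTR(ORDER_DT_CHAR,1,4)) AS ORDER_DATE
FROM   ORDERS
-- The converted value can be compared to a real DB2 date expression.
WHERE  DATE(SUBSTR(ORDER_DT_CHAR,5,2) || '/' ||
            SUBSTR(ORDER_DT_CHAR,7,2) || '/' ||
            SUBSTR(ORDER_DT_CHAR,1,4)) >= CURRENT DATE - 30 DAYS;
```

Keep in mind that the conversion runs for every row examined, and that a row containing a value that is not a valid date will likely cause the statement to fail.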


Now, a quick word about using proper data types. I say this all of the time, but there are many applications and implementations "out there" that do not heed the advice: it is wise to use the DATE data type when you store dates in DB2 tables. It simplifies life later on when you want to do things like formatting dates and performing date arithmetic.


Using the appropriate data type also ensures that DB2 will perform the proper integrity checks on the columns when data is entered, instead of requiring application logic to ensure that valid dates are entered.
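Here is a minimal sketch of that integrity checking at work. The table EVENT_LOG and its columns are hypothetical, made up purely for illustration.

```sql
-- Hypothetical table with a true DATE column.
CREATE TABLE EVENT_LOG
  (EVENT_ID    INTEGER NOT NULL,
   EVENT_DATE  DATE    NOT NULL);

-- Accepted: a valid date.
INSERT INTO EVENT_LOG (EVENT_ID, EVENT_DATE) VALUES (1, '2008-11-03');

-- Rejected by DB2: February 30 is not a valid date.
INSERT INTO EVENT_LOG (EVENT_ID, EVENT_DATE) VALUES (2, '2008-02-30');
```

Had EVENT_DATE been defined as CHAR(8) or as a numeric column, the second row would have been stored without complaint unless the application checked it first.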

Wednesday, October 29, 2008

On Date Formats

Regular readers of my blog know that from time to time I use the blog as a forum to answer questions I get via e-mail. Today, we address a popular theme - dealing with DB2 date data...


Q: I have a DATE column in a DB2 table, but I do not want it to display the way DB2 displays it by default. How can I retrieve a date column from a DB2 table in the format MM/DD/YYYY?


A: The simplest way to return a date in the format you desire is to use the built-in scalar function CHAR. Using this function you can convert a date column into any of several formats. The specific format you request, MM/DD/YYYY, is the USA date format. So, for example, to return the date in the format you requested for a column named START_DATE you would code the function as follows:


CHAR(START_DATE,USA)

The first argument is the column name and the second argument is the format. Consult the following table for a list of the date formats that are supported by DB2.

Name     Layout                    Example
ISO      yyyy-mm-dd                2002-10-22
USA      mm/dd/yyyy                10/22/2002
EUR      dd.mm.yyyy                22.10.2002
JIS      yyyy-mm-dd                2002-10-22
LOCAL    Locally defined layout    N/A


You may also have an installation-defined date format that would be named LOCAL. For LOCAL, the date exit for ASCII data is DSNXVDTA, the date exit for EBCDIC is DSNXVDTX, and the date exit for Unicode is DSNXVDTU.
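To compare the named formats side by side, something along these lines could be used. The table name PROJECT is only an assumption for the example; START_DATE is the column from the question above.

```sql
-- PROJECT is a hypothetical table containing the START_DATE column.
SELECT CHAR(START_DATE, ISO) AS ISO_FMT,   -- yyyy-mm-dd
       CHAR(START_DATE, USA) AS USA_FMT,   -- mm/dd/yyyy
       CHAR(START_DATE, EUR) AS EUR_FMT,   -- dd.mm.yyyy
       CHAR(START_DATE, JIS) AS JIS_FMT    -- yyyy-mm-dd
FROM   PROJECT;
```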

Of course, this is a simple date question... I will follow up with some additional date-related questions and answers in my next couple of blog posts.

Wednesday, October 22, 2008

Bad Standards

Just started a new series on bad standards over on my Data Management Today blog.

Check it out when you get a chance and share your favorite "bad standards" either here or there... or by e-mailing me.

Monday, October 20, 2008

DBA Rules of Thumb

Database administration is a very technical discipline, but it is also a discipline in which the practitioner is very visible politically within the organization. As such, DBAs should be armed with the proper attitude and knowledge before attempting to practice the discipline of database administration.

Just as important as technical acumen, though, is the ability to carry oneself properly and to embrace the job appropriately. With this in mind, I wrote a series of blog entries on DBA Rules of Thumb over at my Data Management Today blog... and I thought the information I wrote there may be helpful to my DB2 and mainframe readership here, so I'm sharing the eight rules of thumb (with links) here on my DB2 Portal blog:
  1. Document Everything!
  2. Automate Intelligently
  3. Share
  4. Don't Panic!
  5. Focus Your Efforts
  6. Invest In Yourself
  7. Diversify
  8. Develop Business Acumen
What do you think? Did I miss anything important?

P.S. Just a reminder that I will be presenting a webinar on assuring DB2 recoverability with my colleague, Michael Figaro, this Thursday, October 23, 2008 at 10:30 Central time. If you are at all interested in the topic, be sure to register today - and attend this Thursday!

Thursday, October 09, 2008

Assuring the Recoverability of Your DB2 Databases

Availability requires much more than just having a reliable hardware and database platform. Most companies cannot afford significant downtime, and some cannot afford any! As such, it is crucial for unplanned outages to be as short as possible. But it is not just a business requirement; in many cases, assuring a speedy recovery is also a legal mandate. Regulations such as SOX and Basel II dictate that any outage be resolved within a predefined period of time.

But how many of us can answer, with any degree of certainty, the question “How long will this outage last?” There are many variables that need to be considered when estimating a DB2 recovery time: backups available, quality, point-in-time requirements, amount of log processing, disk speed, tape mounts, and on and on and on...

With these thoughts in mind, Michael Figaro and I will be delivering a webinar titled Assuring the Recoverability of Your DB2 Databases, on Thursday, October 23, 2008 at 10:30 am CDT.

We will tackle issues ranging from regulations, IT complexity, and business continuity, to DSNZPARMs and backup/recovery planning. We’ll also make the case that planning for database recoverability is the most important task among the many tasks of the DBA.

As part of the webinar we will introduce and demonstrate Recovery AssuranceExpert, a new technology to help you ensure that all of your critical DB2 objects are recoverable within your recovery time objectives. Recovery AssuranceExpert is an automated solution to perform daily health checks of data availability and recoverability, as well as provide actual recovery times required for a DB2 object, a complete application, or even a whole DB2 subsystem. Join us on October 23rd to find out how you can ensure that your actual recoverability times fit into your SLAs.

Tuesday, September 30, 2008

A Perfect Storm?

There is something of a perfect storm brewing in the world of data today. The world is becoming more automated, more connected, more wireless, and more complex. The next wave of database administration is intelligent automation. I refer to this as implementing software scrubbing bubbles that “work hard, so you don’t have to.” (Remember that commercial!)

As more of the tasks required of DBAs become more automated, the DBA will be freed to expand into other areas. So one front on this storm is the autonomic computing initiatives that automate DBA tasks. At the same time, IT professionals are being asked to know more about the business instead of just knowing the technology. So DBAs need to understand the business purpose and definition of the data they manage, as well as the technological underpinnings of the DBMS. The driving force here is predominantly regulatory compliance. This second front of the perfect storm will cause DBAs to work more closely with metadata to drive database archiving, data auditing, and security to ensure their organization complies with regulations like Sarbanes-Oxley, HIPAA, and others.

Regarding the wireless aspect of things, pervasive devices (PDA, handhelds, cell phones, etc.) will increasingly interact with database systems. DBAs will need to get involved there to ensure successful data synchronization. And database systems will work with disconnected data seamlessly, such as data generated by RFID tags.

Yet another big database trend is technology "suck." By that I mean that the DBMS sucks up technologies and functions that previously required you to purchase separate software. Remember when the DBMS had no ETL or OLAP functionality? Those days are gone. This will continue as the DBMS adds capabilities to tackle more and more IT tasks.

Another trend impacting DBAs will be a change in some of their roles as more and more of the recent DBMS features actually start being used in more production systems.

The net result of this perfect storm of changes is that data professionals are absolutely being required to do more... sometimes with less (less time, less money, less staff, etc.)

If you know the technology but are then required to know the business, this is doing more – much more. But the technology, in many cases, is also expanding. For example, DB2 9 incorporates native XML. Most DBAs are not XML savvy, but increasingly they will have to learn more about XML as the DBMS technology expands. And this is just one example.

Additionally, data is growing at an ever-increasing rate. Every year the amount of data under management increases (some analysts peg the compound annual rate of data growth at 125%) and in many cases the number of DBAs to manage that growing data is not increasing, and indeed, could be decreasing.

And budgetary limitations can cause DBAs to have to do more work, on more data, with fewer resources. When a company reduces budget but demands more work, automation is an absolute necessity. Turning work over to the computer can help (although it is unlikely to solve every administrative issue). Sometimes IT professionals fight against the very thing they excel in – that is, automating work. If you think about it, every computer program is written to automate someone’s work – the writer (word processing), the accountant (financials, payroll, spreadsheets), and so on. This automation did not put the people whose work was automated out of a job; instead it made them more efficient. Yet, for some reason, there is a notion in the IT industry that automating IT tasks will eliminate jobs. You cannot automate a DBA out of existence – but you can make that DBA’s job more effective and efficient with DBA tools and autonomic computing.

And the sad truth of the matter is that there is still a LOT more that can, and should, be done in most companies. We can start with better automation of DBA tasks, but we shouldn't stop there!

Corporate governance is hot – that is, technologies to help companies comply with governmental regulations. Software to enable archiving for long-term data retention, auditing to determine who did what to which piece of data, and security to better protect data are all hot data technologies right now. But database security needs to be improved, and technologies for securing and auditing data need to be more pervasively implemented.

Metadata is increasing in importance. As data professionals really begin to meld together technology and business, they find that metadata is imperative. But most organizations do not have a fully populated, up-to-date metadata repository that acts as a lexicon for business data.

And finally, something that isn’t nearly hot enough is data quality and integrity. Tools, processes, and database options that can be used to make data more accurate and reliable are not implemented appropriately with any regularity. So the data stored in our corporate databases is suspect. According to Thomas C. Redman, data quality guru, poor data quality costs the typical company at least ten percent (10%) of revenue. That is a significant cost! Data quality is generally bad in most organizations – and more needs to be done to address that problem.

With all of these thoughts in mind, are you prepared to weather this perfect storm?

Tuesday, September 23, 2008

Who Did What to Which Data When... and How?

As the list of government regulations impacting IT grows, organizations must adapt to understand and comply with new rules. This increasing compliance pressure is particularly intense on data stored in corporate databases. As such, organizations need to be ever more vigilant in the techniques used to protect their data and monitor access.

Database auditing, sometimes called data access auditing, is one technique growing in popularity as a response to the demands of regulatory compliance. At a high level, database auditing is basically a facility to track the use of database resources and authority. It can be used to help answer questions like “Who accessed or changed data?” and “What was actually changed?” and “When did it change?”

But how you implement your database auditing, especially in a mainframe environment, will have a significant impact on not just "the completeness" of what you capture in the audit trail, but on the performance and availability of your entire environment.

Join me on Wednesday, September 24, 2008 at 10:30 am, Central Daylight Time, for a free webinar where I will discuss the issues and requirements driving database auditing. This presentation can help to serve as a roadmap of sorts for your data access auditing needs.

Monday, September 22, 2008