Wednesday, March 17, 2010

What is Production Data?

I received an interesting e-mail recently that made me stop and think a bit... so I thought I'd blog about it. Basically, the e-mail posed the question in the title of this blog entry – “What is production data?”

The e-mail read as follows:


I'm looking for a one paragraph definition of "production data". What do you think of this: "Production data is data recorded for the purpose of controlling/managing/reporting/researching events, processes or states."

I'm trying to get around the belief that data recorded by a development team to manage its projects and resources is somehow less than production data. To me it should be regarded as the development team's "production data" and so I'm looking for a definition that satisfactorily encompasses that belief, as well as encompassing regular business production data.


You know, I do not recall ever seeing an actual definition of the term “production data.” The above definition is a good starting point, but I do not think it is complete. The author of the e-mail makes a good point about different types of production data. The data used by an application development team to conduct their business (writing computer programs to support business processes) is definitely production data… to the application development team.

Here is my take on a definition:

  • Production data is information that is persistently stored and used by professionals to conduct business processes. It must be accurate, documented, and managed on an on-going basis to ensure its value to the organization.

I say information instead of data because the data must be defined and in context in order to be useful for production work. And I say persistent because even though there may be many forms of transitory data used by production processes, it is the data that is stored over periods of time that needs to be managed.

I think this definition should serve the needs of the e-mailer... and more. What do you think?

Did I miss anything?

Friday, March 05, 2010

Mainframes: The Safe IT Career Choice

A recent Computerworld article (Bank of America touts mainframe work as a safe career) positions the mainframe as a safe haven for those considering a career in IT. This is an interesting article because the usual spiel you hear in industry trade rags is that the mainframe is dying and only a fool would work on such a platform. It is good to hear an alternate opinion on the matter in a journal as respected as Computerworld. (Of course, the fact that I agree with this opinion might have a little something to do with my cheer upon reading the article.)

One of the highlights of this particular article is the discussion of available mainframe jobs at sites such as Monster (764 jobs over 30 days) and Dice.com (1,200 ads over 30 days). These are significant numbers of jobs, especially in a down economy.

Another interesting tidbit from this piece is that "IBM says its mainframe revenue has grown in eight of the last 13 quarters." This is impressive considering the difficult server market, coupled with the lingering impression that the platform is dying.

Speaking of the death of the mainframe, don't you believe it for a minute. People have been predicting the death of the mainframe since the advent of client/server in the late 1980s. That is more than 20 years! Think of all the things that have died in that timespan while the mainframe keeps on chugging away: IBM's PC business, Circuit City, Koogle peanut butter, public pay phones, Johnny Cash... the list is endless.

Some may counter that they recall reading about companies that were going to eliminate their mainframe. Well, yes, I'm sure you do remember those; I do, too. But do you recall reading many articles about companies that SUCCESSFULLY eliminated their mainframes? Many tried, few succeeded. Indeed, the re-Boot Hill web site provides examples of companies that tried to eliminate the mainframe but could not (hence, they had to re-boot). If you follow the link to the re-Boot Hill site, click on the little tombstones to read the stories of failure.

So, the mainframe is a rock-solid platform, continues to grow, and is producing a significant number of job opportunities... what is not to like?

Tuesday, February 09, 2010

IBM Announces DB2 10 for z/OS Beta Program

IBM today announced the beta program for the next version of DB2, now "officially" known as DB2 10 (no more DB2 X). It is a closed beta program, beginning March 12, 2010, which means you have to be selected by IBM to participate.

The announcement highlighted some of the areas of improvement to be delivered by DB2 10 for z/OS, and at the top of that list, to no one's surprise, is performance. DB2 10 promises to deliver out-of-the-box CPU savings through improved operational efficiency: 5% to 10% for traditional workloads and up to 20% for nontraditional workloads.

Other areas called out by IBM in the announcement include:
  • Improved business resiliency through scalability improvements and fewer outages (planned or unplanned).
  • Schema evolution or data definition on demand as well as query performance manageability enhancements support improved availability.
  • New features such as hash access, index include columns, inline large objects, parallel index updates, faster single row retrievals, work file in-memory, index list prefetch, 64-bit memory enhancements, use of the 1 MB page size of the System z10, buffer pools in memory, access path enhancements, member clustering for universal table spaces, efficient caching of dynamic SQL statements with literals, improved large object streaming, and SQL procedure language performance (a few of these are sketched in SQL after this list).
  • Rapid application and warehouse deployment for business growth including improved concurrency for data access, data management, and data definition.
  • The ability to avoid an outage by adding active log data sets to a subsystem.
  • Improved application and data warehousing support including temporal data, a 64-bit ODBC driver, currently committed locking, implicit casting or loose typing, timestamp with time zone, variable timestamp precision, moving sum, and moving average.
  • Improvements to DB2's XML support including expanded pureXML, customer-driven performance and usability requirements, schema validation in the engine, binary XML exchange format, multiversioning, easy update of subparts of an XML document, stored procedures, user-defined functions and triggers, XML index matching with date/timestamp, and a CHECK XML utility.
  • Enhanced query and reporting facilities, including QMF V10 with over 140 new analytical functions, support for HTML, PDF, and Flash reports, and more.
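To make a couple of these features more concrete, here is a minimal SQL sketch based on the capabilities IBM has announced. The table, column, and index names are hypothetical, and the syntax is a sketch of the announced features rather than DDL verified against the shipping product:

    -- Hypothetical names throughout; a sketch of announced DB2 10
    -- features, not verified against the GA product.

    -- Hash access: organize the table by hash for fast single-row
    -- retrieval by the hash key.
    CREATE TABLE EMP
      (EMPNO    CHAR(6)     NOT NULL,
       LASTNAME VARCHAR(30) NOT NULL,
       SALARY   DECIMAL(9,2))
      ORGANIZE BY HASH UNIQUE (EMPNO) HASH SPACE 64 M;

    -- Index include columns: carry a non-key column in a unique
    -- index so more queries qualify for index-only access.
    CREATE UNIQUE INDEX XEMP1
      ON EMP (EMPNO)
      INCLUDE (LASTNAME);

    -- Moving average: one of the new OLAP-style moving aggregates.
    SELECT EMPNO, SALARY,
           AVG(SALARY) OVER (ORDER BY EMPNO
                             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
             AS MOVING_AVG
    FROM EMP;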
So it would seem that there is a lot of new functionality for us to become acquainted with. As IBM rolls out more details and customers begin to use the new version of DB2, we will examine some of these new features in more depth here on the DB2 Portal blog.

If you are interested in the beta program, the prerequisite for DB2 10 is z/OS V1.10 (5694-A01) or later running in 64-bit mode. More information about the DB2 10 beta program is available on IBM's web site.

No GA date for DB2 10 has been announced.

Wednesday, February 03, 2010

IBM Manages the Data Lifecycle

Data lifecycle is a relatively new term, at least in terms of what I plan to talk about in this blog posting. The data lifecycle – and data lifecycle management – deals with tracking, managing, and understanding data and metadata as it flows through organizations: from its inception, whether entered by a clerk, read from a feed, or loaded from an external source; through its various usages, whether to conduct business, analyze trends and patterns, and so on; tracked from system to system, application to application, and user to user; and finally through its end of life.

Not many companies today can track all of their important data and what happens to it throughout its entire lifecycle. But doing so is important. Having such a capability enables organizations to adapt and react, gaining a competitive advantage. Much can go awry as data moves throughout an organization. Schemas change, policies change, regulations adapt, programs change, formats change, and so on. Any of these things can cause data quality issues, which should be brought to the attention of the business analysts using the data. But how often is that done? Knowing the history of data and its related metadata can improve business processes. But it is a major task – both for businesses and for IT vendors hoping to offer solutions.

Which brings me to today’s (February 3, 2010) announcements from IBM. Big Blue announced new data protection software and a line of consulting services and resources, and previewed information monitoring software to help organizations expand their use of trusted information to improve decision making. These moves further bolster IBM’s already formidable arsenal of data lifecycle management solutions.

The data protection announcement was for Optim Data Redaction. This solution, engineered for unstructured data like Word documents and PDF files, automatically recognizes and removes sensitive content from documents and forms. For example, a customer’s credit scores in a loan document could be hidden from an office clerk, while still being visible to a loan officer. In today’s atmosphere of increasingly stringent regulations, a data redaction solution is becoming a requirement. For example, PCI DSS industry standards dictate specific rules regarding the display of debit and credit card information on receipts and reports.
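Optim Data Redaction itself targets unstructured documents, but the masking concept behind those PCI DSS display rules is easy to illustrate on structured data. Here is a toy SQL sketch, with hypothetical table and column names (and assuming a 16-character card number), that displays only the last four digits on a report:

    -- Hypothetical names; a toy illustration of masking a card
    -- number for display, not how Optim Data Redaction works.
    -- Assumes card_number is CHAR(16).
    SELECT customer_name,
           'XXXX-XXXX-XXXX-' CONCAT SUBSTR(card_number, 13, 4)
             AS masked_card
    FROM   customer_payments;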

Optim Data Redaction is planned for general availability in March 2010.

The information monitoring announcement was for InfoSphere Business Monitor. This technology is based on a combination of work from IBM’s research group and technology gained when IBM acquired Guardium. Guardium is a database activity monitoring (or auditing) solution. InfoSphere Business Monitor tracks the quality and flow of an organization’s information and provides real-time alerts of potential flaws. For example, if a health insurance company was analyzing profit margins across different product lines (individual, group, HMO, Medicare, etc.), decision makers would immediately be alerted when a data feed from a specific geography was not successfully integrated.
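The product's internals are IBM's own, but the kind of check being automated is easy to picture. Here is a toy SQL sketch, with hypothetical table and column names, that flags any region whose feed loaded no rows today:

    -- Hypothetical names; a toy feed-completeness check, not how
    -- InfoSphere Business Monitor actually works.
    SELECT r.region_code
    FROM   regions r
    LEFT OUTER JOIN claims_feed f
           ON  f.region_code = r.region_code
           AND f.load_date   = CURRENT DATE
    GROUP BY r.region_code
    HAVING COUNT(f.region_code) = 0;  -- no rows loaded today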

InfoSphere Business Monitor is available as a technology preview; it is not generally available and no GA date was announced.

At the same time, IBM announced its intention to acquire Initiate Systems, a provider of data integrity software for information sharing among healthcare and government organizations. Initiate's software helps healthcare clients work more intelligently and efficiently with timely access to patient and clinical data. It also enables governments to share information across multiple agencies to better serve citizens. IBM plans to continue to support and enhance Initiate's technologies while helping clients take advantage of the broader IBM portfolio, specifically Cognos and InfoSphere solutions for BI and analytics. This acquisition bolsters IBM’s data lifecycle management offering along these verticals.

And all of today’s announcements underscore IBM’s claim to the throne in the realm of information and data lifecycle management.