Monday, January 11, 2016

The Impact of Data Volume on Operational Databases

Operational databases are growing in size for many reasons, not the least of which is the growing importance of big data and analytics projects. There is the overarching trend of more and more data being generated every year. There is also a growing need to store more data for longer periods of time due to regulatory and compliance issues. Some organizations and businesses have encountered the need to store certain types of data for 100 years or more (as this video and this storage project point out).

But I doubt that I really need to convince you that your databases are growing in size. Most DBAs experience the reality of data growth every day.

As data volumes expand, they impact operational databases in two ways:
  1. additional data stresses transaction processing and can cause performance slowdowns; and
  2. database administration tasks are negatively impacted.

In terms of performance, the more data in the operational database, the less efficient transactions running against that database tend to be. Table scans must reference more pages of data to return a result. Indexes grow in size to support larger data volumes, causing access by the index to degrade because there are more levels to traverse to return an answer. Such performance impacts are causing many companies to seek solutions that offload older data to either reference databases or to archive data stores.
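
To see why more data tends to mean more index levels, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified B-tree in which every index page holds the same number of entries; the figure of 200 entries per page is purely an illustrative assumption, not a DB2 constant, and real indexes vary with key length, page size, and free space.

```python
def index_levels(num_keys: int, entries_per_page: int) -> int:
    """Estimate the levels in a simplified B-tree index.

    Assumes every page (leaf and non-leaf) holds entries_per_page
    entries, which is a simplification made for illustration only.
    """
    levels = 1
    capacity = entries_per_page  # keys reachable with a single level
    while capacity < num_keys:
        capacity *= entries_per_page  # each added level multiplies the reach
        levels += 1
    return levels


# Assumed fan-out of 200 entries per page (hypothetical value).
print(index_levels(1_000_000, 200))    # 3
print(index_levels(100_000_000, 200))  # 4
```

Under these assumptions, growing a table from one million to one hundred million rows adds a level to the index, so each index probe has one more page to visit.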

The other impact, database administration complexity, lengthens the processing time and outages required to perform functions such as backups, unloads, reorganizations, recoveries, and disaster recoveries. The larger the underlying data sets for your tables and table spaces, the longer it takes to run administrative utilities for them. In many cases the lengthened outages can become unacceptable, causing companies to again seek ways to lighten up the operational databases... or perhaps acquire next-generation utility technology that understands the reality of large DB2 database objects.

But even though we want to keep all of that additional data, there is no reason that it necessarily has to be stored in operational databases that run the business. For many reasons, you probably want to separate active data from historical data. 

Some companies create purge jobs for all (or many) of their tables to remove data from the production databases as it ages. This can be an acceptable approach to reduce the size of your operational databases. But it also means that the data, which you might want to keep for analytical purposes, is lost. Another approach is to archive the data. Archiving data and purging data are two different processes. When data is purged, it is removed from the operational database and discarded. But archived data is removed from the operational database and maintained in an archive data store. The archive might be a flat file, another relational table, or files stored in HDFS using Hadoop.
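
The difference between purging and archiving can be sketched in a few lines of Python. This is only an illustration: it uses an in-memory SQLite database as a stand-in for the operational database, and the table names (orders, orders_archive), the columns, and the sample rows are all hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT)")
    conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, order_date TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2008-03-15"), (2, "2012-07-01"), (3, "2015-11-30")],
    )
    conn.commit()
    return conn


def purge(conn: sqlite3.Connection, cutoff: str) -> int:
    """Purge: delete aged rows; the data is discarded."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
    return cur.rowcount


def archive(conn: sqlite3.Connection, cutoff: str) -> int:
    """Archive: copy aged rows to the archive table, then delete them."""
    with conn:  # both statements succeed or fail together
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE order_date < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
    return cur.rowcount


conn = make_demo_db()
moved = archive(conn, "2013-01-01")
kept = conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0]
print(f"rows removed from orders: {moved}, rows retained in archive: {kept}")
```

Either way the operational table shrinks by the same number of rows; only the archive path leaves the aged data available for later analysis.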

The bottom line is that it makes sense for us, as DBAs, to keep an eye on the size of our operational databases and take action when production workload is impacted.

Monday, January 04, 2016

A Lot of Extraneous Data Sets?

In a recent blog post here I talked about a quick and dirty method of converting your partitioned table spaces from index-controlled to table-controlled. If you haven't read that post, take a moment to click over and read it here: Easily Convert to Table-Controlled Partitioning.

The reason I bring this up today is that I received an interesting e-mail from a long-time friend and DB2 DBA who read the post and had some information to share with me. He told me about how his organization used one of my tips to drop unused indexes as part of this process.

He told me that during the conversion process they dropped a lot of the clustering indexes because they weren't being used for access paths or for uniqueness. And they were able to release an "astonishing 4,100 data sets" by doing so!

Now I am not suggesting that every shop will be able to experience similar savings, but if you have indexes that have no purpose other than enforcing index-controlled partitioning, it is time to bite the bullet and drop those indexes as you convert to table-controlled partitioning (and then on to Universal table spaces).

And when you convert, please drop a note here on the blog to let us know how your conversion efforts went!

Tuesday, December 15, 2015

Happy Holidays!

Well, it is that time of year again. The days are shorter and the weather is colder... even if it isn't as cold as normal, it is colder than it was in July! And most people are taking the time to celebrate the holiday season.

Here's wishing each and every one of my readers a happy holiday... regardless of your chosen season to celebrate! Whether you celebrate Chanukah, Christmas, Kwanzaa, the Winter Solstice, Saturnalia, or just the end of another year on Planet Earth, I'm with you, and celebrating my good fortune, great family and friends, and you, my regular blog readers. I appreciate and thank you all...

This will be the final post of the year (2015) for this blog, but be sure to join me again next year - 2016 - as we continue to examine all aspects of everybody's favorite DBMS... IBM's DB2...

Tuesday, December 08, 2015

Easily Convert to Table-Controlled Partitioning

Prior to DB2 V8 for z/OS, the only way to control partitioning of DB2 table spaces was by using a clustering index that specified the range of key values for each partition. With V8, though, DB2 added the ability to specify the partitioning criteria in the CREATE TABLE specification. This is known as table-controlled partitioning and it is the preferred method for creating (non-Universal) partitioned table spaces. With table-controlled partitioning you can cluster on a different column (or set of columns) than you are partitioning on. Furthermore, you can make changes such as dropping a partitioning index or creating a table in a partitioned table space without defining any indexes at all.


But given the long history of DB2, many existing partitioned table spaces are index-controlled. 

Fortunately, there is a quick-and-dirty technique that you can use to easily convert from index-controlled to table-controlled partitioning. Simply follow these steps:

  • Identify the index-controlled partitioned table space you wish to convert
  • Convert the clustering index on the table to NOT CLUSTER using ALTER INDEX. (Alternatively, you could drop the clustering index, but I wouldn’t recommend that unless you no longer need that index at all.)
  • Convert the index back to CLUSTER, again using ALTER INDEX
Voila! DB2 will have converted your table space to table-controlled partitioning.
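
As a minimal sketch of the steps above, the following Python snippet simply generates the two ALTER INDEX statements in the order they would be run. The index name DBA001.XCUST01 is hypothetical, and the helper function is only an illustration; you would substitute your own clustering index and execute the statements with whatever SQL tool you normally use, after checking the ALTER INDEX syntax for your DB2 version.

```python
def conversion_statements(index_name: str) -> list[str]:
    """Return the two ALTER INDEX statements, in the order they should run."""
    return [
        f"ALTER INDEX {index_name} NOT CLUSTER;",  # step 2: turn clustering off
        f"ALTER INDEX {index_name} CLUSTER;",      # step 3: turn clustering back on
    ]


# Hypothetical qualified index name, used only for illustration.
for stmt in conversion_statements("DBA001.XCUST01"):
    print(stmt)
```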



Note: DB2 will also convert from index-controlled to table-controlled partitioning if you use ALTER TABLE to add a new partition, change a partition boundary, or rotate a partition to last on an index-controlled partitioned table space. But these are more intrusive methods than simply altering the index from clustering to non-clustering and back again.

Wednesday, November 25, 2015

Happy Thanksgiving 2015

Every year this week those of us in the USA take time out to give thanks for all that we have. We do this by taking time off work, gathering with our families, eating turkey (and a lot of other stuff), and watching football.

It is one of my favorite holidays as it offers most of the joys of Christmas without many of the trappings.

So with this in mind, I'd like to wish all of my readers -- whether you reside here in the USA or anywhere in the world -- a very Happy Thanksgiving. Take some time to reflect on your good fortune... consider what you might be able to do to help others achieve success... and relax a bit and enjoy yourself...



We can talk about DB2 and databases again in December!