Thursday, August 05, 2010

DB2 Best Practices

With today's blog entry I'm hoping to encourage some open-ended dialogue on best practices for DB2 database administration. Give the following questions some thought and if you've got something to share, post a comment!

What are the things that you do, or want to do, on a daily basis to manage your database infrastructure?

What things have you found most helpful to automate in administering your databases? Yes, I know that all the DBMS vendors are saying they've created the "on demand," "lights-out," "24/7" database environment, but we all know that ain't so! So what have you automated (using DBMS features, tools, or homegrown scripts) to keep an eye on things?

How have you ensured that your databases can be recovered when problems occur? Application problems? Improper data entry or bad transactions? Disaster situations? And have you tested your disaster recovery plans? If so, how? And were they successful?
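For reference, the classic z/OS approach is to pair regular image copies with QUIESCE points so that RECOVER can take the data back to a known point of consistency if a bad transaction slips through. A minimal sketch (the database and table space names are made up, and the RBA is a placeholder you'd pull from the SYSCOPY row that QUIESCE writes):

    -- Establish a point of consistency before the nightly batch cycle
    QUIESCE TABLESPACE DBPAY01.TSEMP

    -- Take a regular full image copy
    COPY TABLESPACE DBPAY01.TSEMP
         FULL YES
         SHRLEVEL REFERENCE
         COPYDDN(SYSCOPY)

    -- If a bad run corrupts the data, recover to the quiesce point;
    -- the log RBA comes from the START_RBA column of the
    -- SYSIBM.SYSCOPY row recorded by QUIESCE
    RECOVER TABLESPACE DBPAY01.TSEMP TORBA X'...'

Of course, none of that proves anything until you have actually practiced the recovery, which is the point of the question.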

What type of auditing is done on your databases to track who has done what to which data? Do you audit all changes? To all applications, or just certain ones? Do you audit reads as well as modifications? If so, how?
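For what it's worth, DB2 for z/OS gives you two levers here: the AUDIT attribute on the table, and the audit trace that actually collects the records. A minimal sketch, assuming a hypothetical PAYROLL.EMP table (audit classes 4 and 5 cover changes to and reads of audited tables, respectively):

    -- Flag the table as auditable
    ALTER TABLE PAYROLL.EMP AUDIT ALL;

    -- Then start an audit trace to collect the records
    -START TRACE(AUDIT) CLASS(4,5) DEST(SMF)

One caveat: only the first read and first change per unit of recovery are recorded, so the audit trace alone is not a complete forensic trail.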

How do you manage change? Do you use a change management tool or do it all by hand? Are database schema changes integrated with application changes? If so, how? If not, how do you coordinate things to keep the application synchronized with the databases?

What about DB2 storage management? Do you actively monitor disk usage of your DB2 table spaces and index spaces? Do you have alerts set up so that you are notified if any object is nearing its maximum size? How about your VSAM data sets? Do you monitor extents and periodically consolidate them? How do you do it... ALTER/REORG? Defrag utilities? Proactive defrag?
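If you're on DB2 9, the real-time statistics in the catalog make extent monitoring a simple query (on V8 the RTS tables live in the DSNRTSDB database instead, and the 50-extent threshold below is arbitrary):

    -- Table spaces that have grown past 50 extents
    SELECT DBNAME, NAME, PARTITION, SPACE, EXTENTS
      FROM SYSIBM.SYSTABLESPACESTATS
     WHERE EXTENTS > 50
     ORDER BY EXTENTS DESC;

A similar query against SYSIBM.SYSINDEXSPACESTATS covers the index side.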

Is your performance management set up with triggers and farmed out to DBAs by exception, or is it all reactive, with tuning tasks being done based on who complains the loudest?

Do you EXPLAIN every SQL statement before it goes into production? Does someone review the access plans, or are they just there to be reviewed in case of production performance problems? Do you rebind your programs periodically (for static SQL programs) as your data volumes and statistics change, or do you just leave things alone until (or unless) someone complains?
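The mechanics here are simple; it's the discipline that's hard. A minimal sketch of the EXPLAIN side (the QUERYNO, table, and predicate are placeholders):

    -- Capture the access path before the statement reaches production
    EXPLAIN PLAN SET QUERYNO = 100 FOR
      SELECT LASTNAME, SALARY
        FROM PAYROLL.EMP
       WHERE EMPNO = '000010';

    -- Review what the optimizer chose
    SELECT QBLOCKNO, PLANNO, METHOD, TNAME, ACCESSTYPE,
           ACCESSNAME, MATCHCOLS, INDEXONLY
      FROM PLAN_TABLE
     WHERE QUERYNO = 100
     ORDER BY QBLOCKNO, PLANNO;

The periodic rebind for static SQL is then just RUNSTATS followed by REBIND PACKAGE, ideally with EXPLAIN(YES) so the new access paths land in the PLAN_TABLE for review.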

When do you reorganize your data structures? On a pre-scheduled regular basis, or based on database statistics? Or a combination of both? And how do you determine which objects are handled which way? What about your other DB2 utilities? Have you automated their scheduling, or do you still build JCL manually?
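If statistics are going to drive the decision, the real-time statistics are the natural source. A rough sketch, assuming SYSIBM.SYSTABLESPACESTATS on DB2 9 and an arbitrary 20% churn threshold:

    -- Table spaces where inserts, updates, and deletes since the
    -- last REORG exceed 20% of the rows: REORG candidates
    SELECT DBNAME, NAME, PARTITION
      FROM SYSIBM.SYSTABLESPACESTATS
     WHERE TOTALROWS > 0
       AND (REORGINSERTS + REORGDELETES + REORGUPDATES) * 100
            > TOTALROWS * 20;

Feed the result into whatever builds your utility JCL and the scheduling question largely answers itself.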

How do you handle PTFs? Do you know which have been applied and which have not, and what impact that may be having on your database environment and applications? Do you have a standard for how often PTFs are applied?

How is security managed? Do the DBAs do all of the GRANTs and REVOKEs, or is that job shared with security administrators? Are database logons coordinated across different DBMSs? Or could I have an operating system userid that is different from my SQL Server logon, which is different from my Oracle logon, with no way of identifying that they all belong to the same user across platforms?

How has regulatory compliance (PCI DSS, SOX, etc.) impacted your database administration activities? Have you had to purchase additional software to ensure compliance? How is compliance policed at your organization?

Just curious... Hope I get some responses!

2 comments:

Anonymous said...

Hi Craig,
Lots of questions for us there. For the time being I'll answer two of them.

1. Reorgs - the how, why, and when. We have some tables on a weekly reorg schedule due to the high frequency of insert/update/delete activity; a couple of those are reorgs with DISCARD. Most of the remaining tables are reorged as required according to real-time statistics. The vast majority of those are SHRLEVEL CHANGE and can be run during the business day, but not during the evening hours, because of some batch processing that does not issue periodic commits as it should. A few of the tables are too large and have too many indexes to reorg in their entirety, so we have to reorg by partition ranges. Not being on V9 yet, we still have to deal with the BUILD2 phase. The way we handle this is to start the reorgs early on Sunday evenings with MAXRO(DEFER), so that we do not go into the BUILD2 phase until we are ready. At 11pm, when we have our allotted window, we alter the MAXRO and let the BUILD2 and SWITCH phases go (see the sketch after this comment).
2. Pre-production EXPLAINs - we do EXPLAIN packages as they are moved into the unit test environment. On a periodic basis we copy the production statistics into the test region. A nightly process unloads the current production PLAN_TABLE and loads it into a table on the test machine; then we run a comparison of any packages explained that day against their production versions. If there are any differences between the two versions of a package, an email is sent to the DBA team and the development leads so that those SQL statements can be reviewed prior to moving the new package to production.
Dave Nance
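For readers who have not used the deferred-switch technique Dave describes in point 1: the online REORG is started with MAXRO DEFER, which lets it keep applying log records indefinitely without entering the final log iteration, and then MAXRO is altered once the window opens. A rough sketch, with a made-up utility ID, object names, mapping table, and partition range:

    -- Sunday evening: start the online reorg, but defer the last
    -- log iteration (and the BUILD2/SWITCH phases that follow it)
    REORG TABLESPACE DBSALES.TSORDER PART 2:4
          SHRLEVEL CHANGE
          MAPPINGTABLE DBA001.MAP_ORDER
          MAXRO DEFER
          LONGLOG CONTINUE

    -- 11pm, when the window opens: allow the final iteration once
    -- fewer than 300 seconds of log remain to be applied
    -ALTER UTILITY(REORGTS1) REORG MAXRO(300)

LONGLOG CONTINUE is what keeps the deferred utility iterating against the log instead of terminating itself while it waits.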

Doug Platz said...

Great series of questions to keep us open-minded and forward-thinking. Where I work, we use automated reorgs based on statistics (RTS and catalog), along with a mix of static reorgs based on past experience or need (DISCARD). I have personally automated as much as possible: reorgs, gathering statistics, image copies, and so on.

We have weekly and daily automated email reports to monitor space, the last image copy taken, statistics older than a certain date, and so on (one such query is sketched below).

We also have automated pages if an object runs into critical space problems overnight or on weekends.
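The queries behind reports like Doug's are easy to roll yourself. As one example, a sketch of a "last full image copy" check against SYSIBM.SYSCOPY (the 7-day threshold is arbitrary, and note that this version will not catch objects that have never been copied at all):

    -- Table spaces whose most recent full image copy is over a week old
    SELECT DBNAME, TSNAME, MAX(TIMESTAMP) AS LAST_FULL_COPY
      FROM SYSIBM.SYSCOPY
     WHERE ICTYPE = 'F'
     GROUP BY DBNAME, TSNAME
    HAVING MAX(TIMESTAMP) < CURRENT TIMESTAMP - 7 DAYS;

Pipe the result into an email job and you have the bones of the weekly report Doug mentions.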