Thursday, August 05, 2010

DB2 Best Practices

With today's blog entry I'm hoping to encourage some open-ended dialogue on best practices for DB2 database administration. Give the following questions some thought and if you've got something to share, post a comment!

What are the things that you do, or want to do, on a daily basis to manage your database infrastructure?

What things have you found to be most helpful to automate in administering your databases? Yes, I know that all the DBMS vendors are saying that they've created the "on demand" "lights-out" "24/7" database environment, but we all know that ain't so! So what have you automated (using DBMS features, tools, or homegrown scripts) to keep an eye on things?

How have you ensured the recovery of your databases in the case of problems? Application problems? Against improper data entry or bad transactions? Disaster situations? And have you tested your disaster recovery plans? If so, how? And were they successful?

What type of auditing is done on your databases to track who has done what to what data? Do you audit all changes? To all applications, or just certain ones? Do you audit access as well as modification? If so, how?

How do you manage change? Do you use a change management tool or do it all by hand? Are database schema changes integrated with application changes? If so, how? If not, how do you coordinate things to keep the application synchronized with the databases?

What about DB2 storage management? Do you actively monitor disk usage of your DB2 table spaces and index spaces? Do you have alerts set so that you are notified if any object is nearing its maximum size? How about your VSAM data sets? Do you monitor extents and periodically consolidate? How do you do it... ALTER/REORG? Defrag utilities? Proactive defrag?
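
As a for-instance on the extent-monitoring front, here is a minimal sketch of the kind of catalog query I have in mind, assuming the real-time statistics are available in the catalog (SYSIBM.SYSTABLESPACESTATS as of DB2 9; in V8 the RTS tables live in a separate database) and picking an arbitrary alert threshold of 50 extents:

   SELECT DBNAME, NAME, PARTITION, EXTENTS, SPACE
     FROM SYSIBM.SYSTABLESPACESTATS
    WHERE EXTENTS > 50          -- 50 is an arbitrary threshold; use your own standard
    ORDER BY EXTENTS DESC;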

Is your performance management set up with triggers and farmed out to DBAs by exception or is it all reactive, with tuning tasks being done based on who complains the loudest?

Do you EXPLAIN every SQL statement before it goes into production? Does someone review the access plans, or are they just there to be reviewed in case of production performance problems? Do you rebind your programs periodically (for static SQL programs) as your data volumes and statistics change, or do you just leave things alone until (or unless) someone complains?
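
For anyone who has not made EXPLAIN part of the promotion process, a minimal sketch of the idea looks something like this (the EMP table and the QUERYNO value are hypothetical; adjust to your own conventions):

   EXPLAIN PLAN SET QUERYNO = 100 FOR    -- 100 is just an arbitrary tag
     SELECT EMPNO, LASTNAME
       FROM EMP                          -- hypothetical table
      WHERE WORKDEPT = 'A01';

   SELECT QUERYNO, QBLOCKNO, PLANNO, TNAME,
          ACCESSTYPE, ACCESSNAME, MATCHCOLS, INDEXONLY
     FROM PLAN_TABLE
    WHERE QUERYNO = 100
    ORDER BY QBLOCKNO, PLANNO;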

When do you reorganize your data structures? On a pre-scheduled regular basis or based on database statistics? Or a combination of both? And how do you determine which are done using which method? What about your other DB2 utilities? Have you automated their scheduling or do you still manually build JCL?
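
For the statistics-based approach, a starting-point query against the catalog might look like the following sketch (the thresholds are arbitrary placeholders; set them to your own standards):

   SELECT DBNAME, TSNAME, PARTITION, PERCDROP,
          NEARINDREF, FARINDREF
     FROM SYSIBM.SYSTABLEPART
    WHERE PERCDROP > 10        -- arbitrary: more than 10 percent dropped space
       OR FARINDREF > 500      -- arbitrary: rows relocated far from home page
    ORDER BY DBNAME, TSNAME, PARTITION;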

How do you handle PTFs? Do you know which have been applied and which have not? And what impact that may be having on your database environment and applications? Do you have a standard for how often PTFs are applied?

How is security managed? Do the DBAs do all of the GRANTs and REVOKEs or is that job shared by security administrators? Are database logons coordinated across different DBMSs? Or could I have an operating system userid that is different from my SQL Server logon, which in turn is different from my Oracle logon -- with no capability of identifying that the user is the same user across the platforms?
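
The GRANTs and REVOKEs themselves are the easy part, of course (the table and authid below are hypothetical); the hard part is deciding who issues them and keeping track of why:

   GRANT SELECT, UPDATE ON TABLE PAYROLL.EMP TO USER7;   -- PAYROLL.EMP and USER7 are hypothetical
   REVOKE UPDATE ON TABLE PAYROLL.EMP FROM USER7;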

How has regulatory compliance (e.g. PCI DSS, SOX, etc.) impacted your database administration activities? Have you had to purchase additional software to ensure compliance? How is compliance policed at your organization?

Just curious... Hope I get some responses!

Friday, July 30, 2010

Happy SYSADMIN Day

To all of the system administrators out there (and I include DBAs and network admins in that group), HAPPY SYSADMIN DAY.

For those who are unaware of this very important holiday, every year on the last Friday of July, responsible people everywhere celebrate Sysadmin Day. The idea is to show some appreciation for the folks who keep your systems up and running every day of the week. There is even a page with some gift ideas if you are so inclined to get something for your favorite sysadmin. (Personally, I prefer cash...)

At the very least you can wish him/her a Happy Sysadmin Day today... and hoist a beer or two in his/her honor at the pub this evening...

Tuesday, July 13, 2010

Classics of Computer Literature

Although the main focus of this blog is DB2 and mainframe software, I thought it would be worthwhile to take some time to recommend a few classic books for computer professionals. I am an avid reader of all kinds of books, not only on technology but on a wide variety of topics. Periodically I will use my blog to extol the virtues of some of my favorite books.


I'm starting with computer books as everyone reading this is probably in the field of IT. (...except maybe my Mom, hi Mom!) These books are not DBMS- or data-focused: I will recommend data and database books later, in some future blog posting.

So, here goes, my coverage of a nice starter set of 4 computer books that everyone should read...

Every computer professional should own a copy of Frederick P. Brooks Jr.’s seminal work, The Mythical Man-Month (Addison-Wesley Pub Co; ISBN: 0201835959). Brooks is best known as the father of the IBM System/360, the quintessential mainframe computer. He managed the projects that created the S/360 and its operating system.

This book contains a wealth of knowledge about software project management including the now common-sense notion that adding manpower to a late software project just makes it later. The 20th anniversary edition of The Mythical Man-Month, published in 1995, contains a reprint of Brooks’ famous article “No Silver Bullet” as well as Brooks’ reflections on the twenty years since the book’s publication. If creating software is your discipline, you absolutely need to read and understand the tenets in this book.



Another essential book for technologists is Peopleware (Dorset House; ISBN: 0932633439) by Tom DeMarco and Timothy Lister. This book concentrates on the human aspect of project management and teams. If you believe that success is driven more by technology than by people, this book will change your misconceptions. Even though this book was written in the late 1980s, it is still very pertinent to today’s software development projects.

DeMarco is the author of several other revolutionary texts, such as Structured Analysis and System Specification (Yourdon Press; ISBN: 0138543801). This book almost single-handedly introduced the concept of structured design into the computer programming lexicon. Today, structured analysis and design is almost completely taken for granted as the best way to approach the development of application programs.



If you are a systems analyst, application programmer, or software engineer, then you will surely want Donald Knuth’s three-volume series The Art of Computer Programming (Addison-Wesley Pub Co; ISBN: 0201485419). This multi-volume reference is certainly the definitive work on programming techniques.

Knuth covers the algorithmic gamut in this three-volume set, with the first volume devoted to fundamental algorithms (like trees and linked lists), the second to seminumerical algorithms (e.g., dealing with polynomials and primes), and the final volume to sorting and searching. Even though a comprehensive reading and understanding of the entire set can be daunting, all good programmers should have these techniques at their disposal.

OK, I know, this is sort of cheating because it is a 3 book set, but so what... my blog... my rules!



Finally, I’d like to recommend a good book on the history of computing. The old maxim still stands: "Those who do not know history are doomed to repeat it." But most computer specialists are only dimly aware of the rich history of their chosen field.

There are quite a few books available on computing history and most provide coverage of the basics. A current favorite, though, is The Universal History of Computing: From the Abacus to the Quantum Computer by Georges Ifrah. The book offers a comprehensive journey through the history of computing. Particularly interesting is the chronological summary offered up in Chapter 1. It starts out in 35000 BCE, the era from which we have discovered the first notched bones that were probably used for counting, and progresses into the modern era of computing.

This book spans the complete history of information processing, providing useful insight into the rise of the computer.


Now I don’t pretend to believe that these are the only classic books in IT literature, but I do know that they will provide a good, solid core foundation for your IT library. Books promote knowledge better than any other method at our disposal. And knowledge helps us do our jobs better. So close down that web connection and pick up a book. You’ll be glad you did.

Tuesday, June 22, 2010

Access Your DB2 Catalog "Poster" Online

If you're anything like me, you're constantly looking for DB2 Catalog table and column names. For writing catalog queries, for examining statistics, for looking at your table and tablespace parameters, for many, many things. But it is not very easy to keep reaching for the DB2 manuals (Which manual is it in? Which appendix was that? Why did they put it there? Did they move it?)...
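
The do-it-yourself alternative is to query the catalog about itself. Here is a sketch of the sort of query I find myself rewriting over and over, in this case listing the columns of SYSTABLESPACE:

   SELECT NAME, COLNO, COLTYPE, LENGTH, NULLS
     FROM SYSIBM.SYSCOLUMNS
    WHERE TBCREATOR = 'SYSIBM'
      AND TBNAME = 'SYSTABLESPACE'    -- substitute whichever catalog table you need
    ORDER BY COLNO;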

So, many of us gladly tacked up those posters from Platinum Technology that graphically depicted the DB2 Catalog... and then later similar posters from CA, Inc. and BMC Software... but those posters have grown in size (as has the DB2 Catalog)... and they have become less useful over the years because they contain less information in ever-smaller type.

Well, here comes a solution from zSystems and SoftwareOnZ: a free and very easy to use online DB2 Catalog reference application.

If you can use a web browser and a mouse then you can find the DB2 Catalog information you desire. Simply point and click on the appropriate table and you'll get its definition along with a listing of its columns and their data type and length. And you'll also get information about which columns participate in any indexes and what type of index it is (primary, unique, or duplicate).

Not only that, there are both V8 and V9 editions of the DB2 Catalog reference, so you can easily toggle back and forth between the two versions. That is very handy for sites that have some V8 subsystems and some V9 subsystems!

So if you are looking for a better way to view your DB2 Catalog information, be sure to check out the online DB2 Catalog reference from zSystems.

Thursday, May 13, 2010

IDUG NA 2010, Days Two and Three

I’ve been running around kinda busy the past couple of days here at IDUG in Tampa, so I got a bit behind in blogging about the conference. So, today I’m combining two days of thoughts into one blog post.

(For a summary of IDUG Day One, click here.)

I started off day two by attending Brent Gross’ presentation on extracting the most value from .NET and ODBC applications. Brent discussed some of the things to be aware of when developing with .NET, an important “thing” being awareness that .NET is designed to work in a disconnected data architecture. So instead of the application walking through the data a row at a time via an open cursor, the result set is sent to the application, which processes the data locally. To an old mainframe DBA like me, that caused alarm bells to ring.

I also got the opportunity to hear Dave Beulke discuss Java DB2 developer performance best practices. Dave delivered a lot of quality information, including the importance of developing quality code because Java developers reuse code – and you don’t want bad code being reused everywhere, right?

Dave started out mentioning how Java programmers are usually very young and do not have a lot of database experience. So DBAs need to get some Java knowledge and work closely with Java developers to ensure proper development. He also emphasized the importance of understanding the object-to-relational mapping method being used.

From a performance perspective, Dave noted the importance of understanding the distributed calls (how many, where located, and bandwidth issues), controlling commit scope, and making sure your servers have sufficient memory. He also indicated that it is important to be able to track how many times Java programs connect to the database. He suggested using a server connection pool and making sure that threads are always timed out after a certain period of time.

And I’d be remiss if I didn’t note that Dave promoted the use of pureQuery, which can be used to turn dynamic JDBC into static requests. Using pureQuery can improve performance (perhaps as much as 25 percent), as well as simplify debugging and maintenance.

Dave also discussed how Hibernate can cause performance problems. Which brings me to the first session I attended on day three, John Mallonee’s session titled Wake Up to Hibernate. Hibernate is a persistence layer that maps Java objects to relational tables. It provides an abstraction layer between DB2 and your program, and it can also be thought of as a code generator. Hibernate plugs into popular IDEs, such as Eclipse and Rational tools. It is open source, and part of JBoss Enterprise Middleware (JBoss is a division of Red Hat).

John walked attendees through Hibernate, discussing the Java API for persistence, its query capabilities (including HQL, or Hibernate Query Language), and configuration issues. Examples of things that are configurable include JDBC driver, connection URL, user name, DataSource, connection pool settings, SQL controls (logging, log formatting), and the mapping file location.

HQL abstracts SQL. It is supposed to simplify query coding, but from what I saw of it in the session, I am dubious. John warned, too, that when HQL is turned into SQL, the SQL won’t necessarily look the way you are used to seeing it. He recommended setting up the configuration file so that it formats the generated SQL; otherwise it won’t be very readable. John noted that one good thing about HQL is that you cannot easily write queries with literals in them; it forces you to use parameter markers.
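
To illustrate that last point in plain SQL (using a hypothetical EMP table), the difference is between a statement that changes with every literal value and one that can be prepared once and reused:

   SELECT EMPNO, LASTNAME FROM EMP WHERE WORKDEPT = 'A01';  -- literal: a new statement for every department
   SELECT EMPNO, LASTNAME FROM EMP WHERE WORKDEPT = ?;      -- parameter marker: prepared once, reused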

OK, so why can Hibernate be problematic? John talked about four primary concerns:

  1. SQL is obscured
  2. Performance can be bad with generated code
  3. Hibernate does not immediately support new DB2 features
  4. The learning curve can be high

But he also noted that as you learn more about these problems -- and how Hibernate works -- things tend to improve. Finally (at least with regard to Hibernate), John recommends using HQL for simple queries, native SQL for advanced queries, JDBC for special situations, and native DB2 SQL (e.g., a stored procedure) to achieve the highest performance.
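
For the highest-performance end of that spectrum, a minimal sketch of a native SQL procedure (DB2 9 syntax; the schema, procedure, and table names are all hypothetical) might look like:

   CREATE PROCEDURE DEVSCHEMA.GET_DEPT_EMPS (IN P_DEPT CHAR(3))
     DYNAMIC RESULT SETS 1
     LANGUAGE SQL
   BEGIN
     -- DEVSCHEMA, GET_DEPT_EMPS, and EMP are hypothetical names
     DECLARE C1 CURSOR WITH RETURN FOR
       SELECT EMPNO, LASTNAME
         FROM EMP
        WHERE WORKDEPT = P_DEPT;
     OPEN C1;
   END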

I also attended two presentations on the DB2 for z/OS optimizer. Terry Purcell gave his usual standout performance on optimization techniques. I particularly enjoyed his advice on what to say when someone asks why the optimizer chose a particular path: “Because it thinks that is the lowest cost access path.” After all, the DB2 optimizer is a cost-based optimizer. So if it didn’t choose the “best” path then chances are you need to provide the optimizer with better statistics.
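
One quick way to check whether the optimizer might be working from stale statistics is to look at STATSTIME in the catalog; here is a sketch (the 90-day cutoff is an arbitrary placeholder):

   SELECT CREATOR, NAME, CARDF, NPAGES, STATSTIME
     FROM SYSIBM.SYSTABLES
    WHERE TYPE = 'T'
      AND STATSTIME < CURRENT TIMESTAMP - 90 DAYS   -- 90 days is arbitrary; pick your own staleness standard
    ORDER BY STATSTIME;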

And Suresh Sane did a nice job with his presentation, discussing the optimization process and walking through several case studies.

All in all, it has been a very productive IDUG conference… but then again, I didn’t expect it to be anything else! Tomorrow morning I deliver my presentation titled “The Return of the DB2 Top Ten Lists.” Many of you have seen my original DB2 top ten lists presentation, but this one is a brand new selection of top ten lists… and I’m looking forward to delivering it for the first time at IDUG…