Thursday, August 04, 2016

The Tao of DB2 - Part 3: The New Big Project

When last we left our protagonists, the intern had been sent away to learn more about tools and documentation, while his mentor planned for the new database and application that was in the works.

The DBA summoned his intern to his cubicle and told him, “Today I want you to work on implementing the new database for Project C that everybody here has been talking about for some time now. I’m sure you’ve heard about it, right”

“Uh, sure, the developers are all excited about working on it, but I don’t know anything about it. “Where do I begin,” asked the intern.

His mentor replied, “Begin at the beginning… and continue working until you reach the end.”
The lesson here is that every database must begin its life as a well-designed data model. Only then can it be physically designed and implemented with an understanding of the capabilities of DB2. An improperly designed DB2 database can cause horrible  performance problems.

From logical data model to physical implementation, all steps must be planned and well thought-out for success to follow. Errors in planning at the initial stages can result in disaster at later stages, requiring re-design and/or re-writing code.

And the intern went off to work with the data analysts and subject matter experts to get the data model. The intern had previously learned about the data modeling tool and understood what it could do. He used it to produce a quick set of DDL for the database.

“Does this DDL look right,” asked the intern of his mentor.

“Can one know what one does not know? The vessel is not empty, nor is it full. We may even know not where the vessel is…” the Taoist DBA rather cryptically muttered.

The lesson here is to define each column carefully. The logical data model does not specifically force any component of the physical database design, and there are some things the logical model will not provide, such as access patterns, locking details, etc. Furthermore, the DBA must verify the details of the logical model. Was care and precision taken when defining whether a value must be known? Without this knowledge the nullability of the columns cannot be accurately coded. What about column defaults? Do the domains match the data and the constraints of the database?

No, one cannot simply press a button and create accurate database DDL without thinking and understanding. So the intern went off to communicate further with the subject matter experts to get a better understanding of the requirements.

Better-informed, the intern begins to make some appropriate changes but then gets stumped. “I do not know which of these options to choose. Should I just use the defaults,” moaned the intern.

“I’m for whatever is right,” exclaimed his mentor. “Choose wisely and we will not have this conversation again. The Tao teaches us to ‘Use what is naturally useful!’”

The lesson here, of course, is to understand every option at your disposal before choosing any parameter. Use the one that best matches the needs of the database, the application, or the situation. Avoid DB2’s defaults. When you don’t know the options off the top of your head, consult the documentation!

Confused, the intern asked “But when should I partition the data?” His mentor slapped him and asked if it would have been beneficial for a partition to have existed between them. 

The lesson here is to partition when the amount of data would benefit from managing it in “pieces” or to access it in parallel. There is no “mythical” cutoff of pages/rows after which you must partition, other than size limitations (the largest segmented TS is 64 GB). And today, most table spaces should be Universal anyway.

The database DDL created and implemented the intern worked with the developers to get test data for the database. When the intern asked his mentor “Why will DB2 not allow SPACES in this DATE column,” the DBA smiled and tried to answer the intern in writing. But he wrote his answer with his finger instead of a pen.

The lesson here: No kludging! Use the features of DB2 as they were designed and meant to be used.  For example, use DATE, TIME, and TIMSTAMP for chronological data; use numeric data types for numbers, character data types for alphanumeric; use referential constraints (declarative RI) and CHECK constraints instead of using programs to enforce constraints; and so on…


Progress is being made… but tune in next time to continue following our intern as he works on his first new database project…

No comments: