When last we left our protagonists, the intern had been sent away to learn more about tools and documentation, while his mentor planned for the new database and application that were in the works.
The DBA summoned his intern to his cubicle and told him, “Today I want you to work on implementing the new database for Project C that everybody here has been talking about for some time now. I’m sure you’ve heard about it, right?”
“Uh, sure, the developers are all excited about working on it, but I don’t know anything about it. Where do I begin?” asked the intern.
His mentor replied, “Begin at the beginning… and continue
working until you reach the end.”
The lesson here is that every database must begin its life
as a well-designed data model. Only then can it be physically designed and implemented
with an understanding of the capabilities of DB2. An improperly designed DB2
database can cause horrible performance
problems.
From logical data model to physical implementation, all steps must be planned and well thought out for success to follow. Errors in planning at the initial stages can result in disaster at later stages, requiring redesign and/or rewriting of code.
And the intern went off to work with the data analysts and
subject matter experts to get the data model. The intern had previously learned
about the data modeling tool and understood what it could do. He used it to
produce a quick set of DDL for the database.
“Does this DDL look right?” the intern asked his mentor.
“Can one know what one does not know? The vessel is not
empty, nor is it full. We may even know not where the vessel is…” the Taoist
DBA rather cryptically muttered.
The lesson here is to define each column carefully. The logical data model does not dictate any particular component of the physical database design, and there are some things the logical model will not provide, such as access patterns, locking details, and so on. Furthermore, the DBA must verify the details of the logical model. Were care and precision taken when defining whether a value must be known? Without this knowledge the nullability of the columns cannot be accurately coded. What about column defaults? Do the domains match the data and the constraints of the database?
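To make the point concrete, here is a minimal sketch of what “defining each column carefully” might look like in DDL. The table, column names, defaults, and domain values below are assumptions invented for illustration; they are not part of any actual Project C model.

-- Hypothetical CUSTOMER table: every nullability, default, and domain
-- decision below is deliberate (and invented for this example)
CREATE TABLE CUSTOMER
  (CUST_ID       INTEGER       NOT NULL,                  -- always known: surrogate key
   CUST_NAME     VARCHAR(100)  NOT NULL,                   -- the business says a name is required
   CUST_TYPE     CHAR(1)       NOT NULL WITH DEFAULT 'R',  -- default agreed upon with the SMEs
   CREDIT_LIMIT  DECIMAL(9,2),                             -- nullable: may legitimately be unknown
   CONSTRAINT CK_CUST_TYPE CHECK (CUST_TYPE IN ('R','W','G')),
   PRIMARY KEY (CUST_ID))
  IN DBPROJC.TSCUST;

Each of those decisions should be traceable back to a conversation with the data analysts and subject matter experts, not to whatever the modeling tool spat out.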
No, one cannot simply press a button and create accurate database
DDL without thinking and understanding. So the intern went off to communicate
further with the subject matter experts to get a better understanding of the
requirements.
Better informed, the intern began to make some appropriate changes but then got stumped. “I do not know which of these options to choose. Should I just use the defaults?” moaned the intern.
“I’m for whatever is right,” exclaimed his mentor. “Choose
wisely and we will not have this conversation again. The Tao teaches us to ‘Use
what is naturally useful!’”
The lesson here, of course, is to understand every option at
your disposal before choosing any parameter. Use the one that best matches the
needs of the database, the application, or the situation. Avoid DB2’s defaults.
When you don’t know the options off the top of your head, consult the
documentation!
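For instance, rather than letting DB2 default the table space options, a DBA might spell each one out. The database, buffer pool, sizing, and locking values below are illustrative assumptions only; your needs will differ.

-- Hypothetical table space with every option chosen deliberately
-- (the values are examples, not recommendations)
CREATE TABLESPACE TSCUST
  IN DBPROJC
  MAXPARTITIONS 16     -- partition-by-growth universal table space
  SEGSIZE 64
  BUFFERPOOL BP2       -- explicitly chosen, not whatever the default happens to be
  LOCKSIZE PAGE        -- explicit, rather than accepting LOCKSIZE ANY
  COMPRESS YES
  CLOSE NO;

If any of these clauses looks unfamiliar, that is the cue to open the CREATE TABLESPACE documentation before choosing.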
Confused, the intern asked “But when should I partition the
data?” His mentor slapped him and asked if it would have been beneficial for a
partition to have existed between them.
The lesson here is to partition when the amount of data would benefit from being managed in “pieces” or accessed in parallel. There is no “mythical” cutoff of pages or rows after which you must partition, other than size limitations (the largest segmented table space is 64 GB). And today, most table spaces should be Universal anyway.
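As a sketch of what partitioning might look like when it is warranted, here is a hypothetical partition-by-range universal table space and table, split by quarter on an invented ORDER_DT column; the names, sizes, and boundary dates are assumptions for the example.

-- Hypothetical partition-by-range universal table space (NUMPARTS plus SEGSIZE)
CREATE TABLESPACE TSORDER
  IN DBPROJC
  NUMPARTS 4
  SEGSIZE 64
  LOCKSIZE PAGE;

-- Table-controlled partitioning on an illustrative date column
CREATE TABLE ORDERS
  (ORDER_ID   INTEGER        NOT NULL,
   ORDER_DT   DATE           NOT NULL,
   ORDER_AMT  DECIMAL(11,2)  NOT NULL,
   PRIMARY KEY (ORDER_ID))
  IN DBPROJC.TSORDER
  PARTITION BY RANGE (ORDER_DT)
   (PARTITION 1 ENDING AT ('2024-03-31'),
    PARTITION 2 ENDING AT ('2024-06-30'),
    PARTITION 3 ENDING AT ('2024-09-30'),
    PARTITION 4 ENDING AT ('2024-12-31'));

Each partition can then be reorganized, recovered, or accessed in parallel independently of the others.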
The database DDL created and implemented, the intern worked with the developers to get test data for the database. When the intern asked his mentor, “Why will DB2 not allow SPACES in this DATE column?” the DBA smiled and tried to answer the intern in writing. But he wrote his answer with his finger instead of a pen.
The lesson here: No kludging! Use the features of DB2 as they were designed and meant to be used. For example, use DATE, TIME, and TIMESTAMP for chronological data; use numeric data types for numbers and character data types for alphanumeric data; use referential constraints (declarative RI) and CHECK constraints instead of using programs to enforce constraints; and so on…
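A brief, hypothetical illustration of letting DB2 do the enforcing declaratively (the names, rules, and ON DELETE choice are assumptions for the example, not prescriptions):

-- Hypothetical ORDER_ITEM table: proper data types plus declarative constraints
CREATE TABLE ORDER_ITEM
  (ORDER_ID    INTEGER       NOT NULL,
   ITEM_NO     SMALLINT      NOT NULL,
   SHIP_DT     DATE,                           -- DATE, not CHAR(10) padded with spaces
   QTY         INTEGER       NOT NULL,
   UNIT_PRICE  DECIMAL(9,2)  NOT NULL,         -- a numeric type for a number
   PRIMARY KEY (ORDER_ID, ITEM_NO),
   FOREIGN KEY (ORDER_ID) REFERENCES ORDERS    -- declarative RI, not program logic
     ON DELETE CASCADE,
   CONSTRAINT CK_QTY CHECK (QTY > 0))          -- CHECK constraint, not application edits
  IN DBPROJC.TSITEM;

Defined this way, DB2 itself rejects a spaces-filled “date,” an orphaned order item, or a zero quantity, with no application code required.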
Progress is being made… but tune in next time to continue
following our intern as he works on his first new database project…