Wednesday, October 12, 2016

The Tao of DB2 - Part 9: The Five R's

The last time we checked in on our experienced DBA and his young intern the lesson was focused on when to REORG. The intern was sent away to learn all about Real Time Stats and RUNSTATS. Let's see how he is progressing...

"So, the Real Time Stats can be used to help me figure out when it is time to run a utility, like REORG, on my DB2 objects. There area a lot of different statistics and hte manuals give some good guidance on which ones to look at and when they should trigger reorganization," offered the intern.

"Well done!" exclaimed the Taoist DBA. "You have grasped this concept quickly. But now, tell me, can you think about how this might tie in with anything else you know?"

"Hmmm..." thought the intern. "If a REORG changes the composition of the table space, then a RUNSTATS should be done to get the stats that the optimizer uses to be accurate. And maybe it could make sense to REBIND any programs that access tables in that table space."

"Yes!" said the surprised DBA. "You are getting it. It is not enough to just learn each new thing that you happen upon, but to think about how it impacts everything else you already know. You are coming along nicely! So tell me about how these things should work together."

"Well, I guess there are a couple of difference approaches we could take, but I think the best approach is would be to REBIND our applications over time as the data changes. We'd monitor the RTS tables using the thresholds recommended in the manual..."

The expert interrupted his apprentice "...and you can tweak them over time as we learn more about how they change.

"Yes, that makes sense. And when we REORG we can follow that with a RUNSTATS..." said the apprentice.

"Or perhaps  use in-lie stats to reduce the number of runs through the data. That can reduce CPU time," chimed in his teacher.

"So we are using RTS to keep track of changes to data volume and patterns. And then we'd follow that up with a REBIND process?" questioned the intern.

"Sure, but we don't want to REBIND everything, do we?" he queried.

"No, probably not. I can use the dependency information in the DB2 Catalog to find out which packages access or change data in the objects that were reorganized. Then we can just REBIND those." said the intern.

"Good thinking, but keep in mind that the DB2 Optimizer is not perfect, so we have to be ready in case we get degraded performance after rebinding. What do you think we can do there?" asked the mentor.

"Well, if we get any problems I can always try to code up a hint to get back to the way things used to work," said the intern. 

"Sounds like you're reading old documentation to me! What about access path stability and using the PLANMGMT options of the BIND command? That way we can have backup versions of the access paths to go back to if performance suffers." 

"I vaguely recall hearing about that. I better read up on that," said the intern.

"Yes, you should," said the mentor, "And while you're at it, noodle on this. I call what we have been discussing here The Five R's of DB2 application performance. Here is a note card that steps through The Five R's:"

"OK, I see. We start with RTS, use them to REORG, make sure we RUNSTATS after or during REORG, and then REBIND any impacted packages. But what is Recheck? Is that a DB2 utility I don't know about?" asked the intern.

"No, no, no," cautioned his mentor. "That is referring to the manual process of checking the access paths to see what changed to make sure we don't get any surprises after rebinding. And then using PLANMGMT to control things while we investigate, tweak and try again."

"Well I have a lot of work ahead of me getting this all set up," said the intern.

"Yes, you'll have some work because we've been short-staffed. But I think you'll find a lot of this already setup and running. Please look into it all and tell me how things are going," replied the mentor, confident that his young apprentice was learning the way...

Tuesday, October 04, 2016

The Tao of DB2 - Part 8: REORG and Statistics

When last we looked in on our Taoist DBA and his young mentor they were dealing with performance issues. They were dealing with performance issues as they arose... reactively. The last question the intern had asked his mentor was a good one: "So is there more that I need to know about DB2 performance management?" 

Well, today the mentor is instructing his intern on how to be more proactive, instead of reactive.

"Today, let's talk about keeping our DB2 objects well ordered," said the mentor. "What if you were doing a home repair project and needed a specific size bolt? Would you rather look for it in the pile on the left or the tray on the right?"

"Obviously, the tray," said the intern.

"Then we must work on making our DB2 table spaces and indexes look more like the tray than the pile!" nodded the mentor.

"Oh, yes, I have read all about the REORG utility and I was thinking about creating a REORG job for every table space and index and just scheduling them to run every month," said the intern.

Shaking his head solemnly, his mentor said "Well, yes, that may be better than doing nothing. But again, maybe not. Do you really know that every object needs to be reorganized once a month? What if some get disorganized very quickly, like daily or weekly? Wouldn't it benefit your applications if those objects could be reorganized more frequently?"

"I suppose so," blinked the intern.

"And what about those that are almost never modified? Why do you want to reorganize them so frequently? Then there are those that change slowly, maybe you only need to reorganize them once or twice a year. If you just blindly reorganize everything at the same time you are wasting valuable mainframe CPU cycles... and that can be costly!" countered his mentor.

"Hmmm... So what should I do?" asked the intern.

"As with most complicated things, start by researching and making a plan. Understand each DB2 object and how it is accessed and modified. Is the amount of data growing? How fast? Basically, get an idea of the activity that is being performed on each object."

Things to consider include:

  • How frequently does the data change in the object? 
    • Changes? Rare; Variable? No = infrequent REORG
    • Variable? Yes = more frequent REORG (compression?)
  • How radical is the change?
    • Is the data simply updated in place?
    • Does it move?
    • Are there gaps in the data?
  • Is downtime possible? 
    • Online vs. offline REORG
    • How busy is the object in terms of access and modification during the time you want to reorganize it?

"Fortunately, DB2 gives us some help here!" replied his teacher.

"Oh, yes, I saw some of these things out in the DB2 Catalog... you know, like cluster ratio and near and far off distribution statistics. Right?" asked the intern.

"Ten years ago, you would have been right. And I give you credit for at least reading and learning about the statistics in the DB2 Catalog. But these days we should be using Real Time Statistics, or RTS. The DB2 Catalog stats are only modified when you run the RUNSTATS utility. The RTS are modified as their name implies, regularly in real-time as changes are made to the data," said the mentor. "Here is some documentation on RTS, take some time to understand the statistics and how you might use them to decide when you should reorganize stuff!"

"OK, thanks. I'll learn this stuff and be back to talk about what I should do after than," said the intern.

"Good plan!" said his mentor.

Tuesday, September 13, 2016

The Tao of DB2 - Part 7: Dealing with Performance Issues

The last time we checked in on our DBAs (the soon-to-retire mentor and his intern) the mentor was schooling his young intern on storage and data retention issues. But there is still much to learn, as our intern will soon find out!

"Soon you will get the chance to learn about performance tuning," said the mentor, as he nodded solemnly in his chair. As if on cue, one of the programmers came stumbling into the DBA's cubicle muttering "There's a problem with DB2!"

"Guilty until proven innocent again," mumbled the senior DBA as he rocked back in his chair and smiled. His apprentice though was running back and forth from his terminal to the developer’s area back to his phone and over to his monitor. Frustrated, the intern yelled over at his mentor “How can you be so calm? Things are falling apart!”

“Things are always falling apart and yet life goes on,” muttered the senior DBA under his breath. Not only did he have years of experience as a DBA but he was also a practitioner of Tao. As the apprentice continued running around like a chicken with its head cut off the Taoist DBA calmly glided his mouse across the top of his desk, clicked, and typed something on his keyboard. In a minute or two the apprentice walked into the senior DBA’s office and said, “Boy am I glad that is over. I don’t know what exactly I did to solve that problem, but I am exhausted.”

The senior DBA shook his head and pointed to a printout he had thumb tacked to his wall that read “Don’t Panic!” He showed the apprentice the script he had run and told him why it solved the problem. A good DBA is prepared to resolve issues he has seen before because he has must be a packrat with a good memory. “That means you must learn from every action you take,” said the senior DBA. “Running around like a crazy man when things are going wrong will never help. Remaining calm and taking things one step at a time will help. And always remember the situation and the steps you took to remedy it. Things have a way of repeating themselves.”

The senior DBA continued, “Skill and knowledge are required to be a successful DBA, but sometimes you need to master the art of getting out of your own way. You must learn how to act without forcing conclusions.”

Although learning from past experience is important, most performance problems are not solved on the first try; many performance problems require multiple things to be changed for results to be achieved. It is important, too, to remember the DB2 performance tuning hierarchy. Start with the SQL and application code, because that is where most of the problems reside. Move to the database structures and organization after that. And finally, if the problem has not been isolated yet, move on to the system and DB2 subsystem. Furthermore, tune one thing at a time and test between every change you make!

Just then, another visitor comes by stating that things are looking grim in production for the new system. "One of the programs is just taking forever to run! It was very efficient in the test system..." 

"Oh, I am sure it was," grinned the experienced DBA as he sent his intern off to check the production access paths. 

"Hey look, here is a table space scan!" shouted the intern. "Oh, that was there in test, so it can't be the problem, right" questioned the developer. The experienced DBA directed the intern to build a new index and rebind the program and everybody was happy again. 

The lesson here is that the DB2 optimizer formulates access paths at bind time based on the statistics that measure the amount of data and its make-up. The test system obviously had less data so a scan was not a problem... but more data in production caused performance issues! It is important to build effective SQL that encompasses as many of the “needs” of the application as possible; this eases performance tuning. And as data patterns change and DB2 evolves new features, rebinding the SQL can quickly adapt the program to the data and updates. 

The intern requested of his teacher, “Teach me all the mysteries of the DB2 Optimizer.” The Taoist DBA replied “A mystery is named such because it cannot be fully understood. Sometimes that which we cannot understand 100%  is still well-designed and required. As Heraclitus teaches us 'The unseen design of things is more harmonious than the seen.'"

The lesson here is that we can (and should) take the time to understand how the Optimizer works, but we must also understand that nobody (outside of IBM) fully understands every nuance of the optimizer. We can learn general rules, one of which is to entrust the optimizer to create more efficient access than we can write ourselves. Additionally, perform as much work as possible within the SQL and always remember: the DB2 code is more efficient than my/your code!

"There is sure a lot to learn here. Even though I've tried to take your lessons to the programmers they continue to blame DB2 for their errors,” lamented the intern. 

“A monkey chatters until you give it a banana. Then the monkey chatters again tomorrow,” said the experienced DBA. "But that is a bit unfair. Those programmers have a lot on their plate and cannot become as experienced as you will become with DB2. You must exercise patience and wisdom."

"So is there more that I need to know about DB2 performance management?" questioned the intern.

"Patience," replied his mentor.

Until next time...

Monday, August 22, 2016

The Tao of DB2 - Part 6: That Storage Stuff

When we last looked in on our DBAs (the soon-to-retire mentor and his intern) the intern was learning about backup and recovery and had put together a comprehensive backup plan for the new system when it would move to production. 

As the intern reclined in his chair viewing the magnificent backup plan with pride, his mentor glanced at it and admitted that the intern had done well. Just then, the team lead for the new project came into their cubicle to talk about the production turn over plans for later in the month.

The intern was a bit surprised that things were moving so quickly; hadn't they originally planned to move to production later in the year? "Well, yes," said the team leader, "but the team decided to implement in stages and we are moving to production sooner."

"Nice of you to let us know," chuckled the expert DBA as he looked at his terrified intern. "I guess we have some work to do, don't we?"

"Oh, it shouldn't be too bad," said the team leader, "just move the stuff we've been using in test over to production and we can take it from there."

The intern, knowing his mentor, ushered the team leader out of the cubicle before his mentor could react. 

As the intern came back into the cubicle he asked "How much storage space will be needed for this database?" 

"What amount of space is required to store sand?" replied his mentor. 

"How much sand... a grain or the whole beach?" asked the intern. 

"Exactly," replied his mentor. "You will need to schedule meetings with the development team and the SMEs because they will need to give us guidance on such things."

The lesson here is to plan for production sizing and database growth early in the development cycle. Understand how the business use of the database will cause data to expand or contract. Use DB2 features to your advantage. Specify sufficient free space to allow for growth. Utilize compression to accommodate larger objects. And be sure that you have sufficient disk in place before it is needed. You don't want to try to create a production database without the right amount of disk the night before things go live in production, do you?

If our DBA team had been working through a database design review schedule they would have encountered fewer surprises like this one.

As the intern went off to schedule some meetings he heard his mentor muttering "Remember always the wise words of the Tao. As Lao Tzu says 'Reduce the size and population of the state'." After scheduling the meetings, the intern asked what his mentor meant by what he had said.

"Always remember the lesson of today. Because we were not prepared early we have had to scramble to get the information that we need," said the mentor. "Think about what else we might need."

"Maybe I should have worked with those design reviews you told me about earlier?" mused the intern.

"Yes. And as you conduct your meetings, be sure to learn about the data retention requirements for your database. How long must data remain for each of the tables before it can be purged or archived. And know well the difference between the two!"

The lesson here is to prepare a data retention plan for each and every DB2 table in your DB2 subsystem. Less data in a table space can result in more efficient access; as old data gets archived there may be more relevant rows per page, improving read efficiency; table space scans will run faster; backup/recovery and REORG will run faster; and less disk space may be needed. Purging data that is no longer needed at all, or archiving stale, or outdated, data to cheaper storage can enhance the accuracy of database reporting and queries.  Of course, the plan for a table may be "No Archiving" but you need to understand the business and regulatory requirements for data retention for all of the data in the databases you manage.

Monday, August 15, 2016

The Tao of DB2 - Part 5: Build a Backup and Recovery Plan

We join our intern DBA and his mentor where we left them last week...

The intern is feeling good about himself. He has worked with the application team on the big new project designing the database structures, protecting the data using constraints and implementing proper security protocols, and assisting the development team on a daily basis. The intern was beginning to think, that perhaps, just maybe, he might be able to handle this job.

His mentor was proud of him, but he knew that the intern had much more to learn. "Keep in mind the word of Lao Tzu, who said "Lay plans for the accomplishment of the difficult before it becomes difficult; make something big by starting with something small"

"And be ready," said the Taoist DBA “for everything, because eventually everything will happen.” The intern wept as his mentor laid out all that was still required to do.

The lesson here is to be sure that you schedule appropriate backups for each and every table space... or a means of backing things up at the system level when that is appropriate. Understand the volatility of the data and the type of recovery that may been needed for each object. Learn about incremental backups and how they differ from full backups. Know how often backups may be needed: more than once-a-day, daily, weekly, monthly, etc. And be sure to test the recoverability of your databases... it is not enough to simply make the image copy backups without ever testing that they can be used for recovery purposes!

The intern learned and worked hard to develop an appropriate backup strategy for the new system. After a few weeks of diligent effort the intern nodded his head and told his mentor "This backup stuff is pretty easy, isn't it?"

"I’m outta here," was the sole reply of his mentor, and all that remained was his spinning chair and a cooling mug of coffee.

The lesson here is that nothing is easy and that there is always more to learn and do. A local recovery plan is vitally important, but you also need to develop a disaster recovery plan, keep it up-to-date and be sure to test it at least annually. Take nothing for granted!

Always remember that contingency planning and recovery from a disaster is a complex and time-consuming endeavor.

As the intern was working on how to integrate the databases for the new system into the disaster recovery plan, a developer he was working with on the project stopped by his cubicle and told him that he couldn't access one of the test tables.

The intern logged onto DB2I and displayed the test databases noticing that one of the table spaces was stopped. "Hmmm, how did that happen," muttered the intern DBA under his breath. "I think I can just restart this," he thought, "now what is that command?" As he rifled through the DB2 Command Guide his mentor stepped back into his cubicle and looked over his shoulder as he typed:


Fortunately his mentor had arrived just in time to smack him in the head and stop him executing the command.

The lesson here is that force fitting a resolution to a problem is not a wise approach for the DB2 professional. Why is the object stopped? Uncover the reason behind the symptom and resolve it appropriately. Find out the state of the object and the reason code that caused it to be stopped. Only when you are sure that it can be safely started should you run a START command... better yet, resolve the issue by running a RECOVER or a CHECK on the object based on the situation.

Crisis averted, the intern went back to his recovery planning and his mentor went to get a fresh cup of coffee.