Wednesday, October 12, 2016

The Tao of DB2 - Part 9: The Five R's

The last time we checked in on our experienced DBA and his young intern the lesson was focused on when to REORG. The intern was sent away to learn all about Real Time Stats and RUNSTATS. Let's see how he is progressing...

"So, the Real Time Stats can be used to help me figure out when it is time to run a utility, like REORG, on my DB2 objects. There are a lot of different statistics and the manuals give some good guidance on which ones to look at and when they should trigger reorganization," offered the intern.

"Well done!" exclaimed the Taoist DBA. "You have grasped this concept quickly. But now, tell me, can you think about how this might tie in with anything else you know?"

"Hmmm..." thought the intern. "If a REORG changes the composition of the table space, then a RUNSTATS should be done to get the statistics that the optimizer uses to be accurate. And maybe it could make sense to REBIND any programs that access tables in that table space."

"Yes!" said the surprised DBA. "You are getting it. It is not enough to just learn each new thing that you happen upon, but to think about how it impacts everything else you already know. You are coming along nicely! So tell me about how these things should work together."

"Well, I guess there are a couple of different approaches we could take, but I think the best approach would be to REBIND our applications over time as the data changes. We'd monitor the RTS tables using the thresholds recommended in the manual..."

The expert interrupted his apprentice "...and you can tweak them over time as we learn more about how they change.

"Yes, that makes sense. And when we REORG we can follow that with a RUNSTATS..." said the apprentice.

"Or perhaps use in-lie stats to reduce the number of runs through the data. That can reduce CPU time," chimed in his teacher.

"So we are using RTS to keep track of changes to data volume and patterns. And then we'd follow that up with a REBIND process?" questioned the intern.

"Sure, but we don't want to REBIND everything, do we?" he queried.

"No, probably not. I can use the dependency information in the DB2 Catalog to find out which packages access or change data in the objects that were reorganized. Then we can just REBIND those" said the intern.

"Good thinking, but keep in mind that the DB2 Optimizer is not perfect, so we have to be ready in case we get degraded performance after rebinding. What do you think we can do there?" asked the mentor.

"Well, if we get any problems I can always try to code up a hint to get back to the way things used to work," said the intern. 

"Sounds like you're reading old documentation to me! Sure, sometimes a hint may be needed, but what about access path stability and using the PLANMGMT options of the BIND command? That way we can have backup versions of the access paths to go back to if performance suffers." 

"I vaguely recall hearing about that. I better read up on that," said the intern.

"Yes, you should," said the mentor, "And while you're at it, noodle on this. I call what we have been discussing here The Five R's of DB2 application performance. Here is a note card that steps through The Five R's:"

"OK, I see. We start with RTS, use them to REORG, make sure we run RUNSTATS after (or during) the REORG, and then REBIND any impacted packages. But what is Recheck? Is that a DB2 utility I don't know about?" asked the intern.

"No, no, no," cautioned his mentor. "That is referring to the manual process of checking the access paths to see what changed to make sure we don't get any surprises after rebinding. And then using PLANMGMT to control things while we investigate, tweak and try again."

"Well I have a lot of work ahead of me getting this all set up," said the intern.

"Yes, you'll have some work because we've been short-staffed. But I think you'll find a lot of this already set up and running. Please look into it all and tell me how things are going," replied the mentor, confident that his young apprentice was learning the way...

Tuesday, October 04, 2016

The Tao of DB2 - Part 8: REORG and Statistics

When last we looked in on our Taoist DBA and his young mentor they were dealing with performance issues. They were dealing with performance issues as they arose... reactively. The last question the intern had asked his mentor was a good one: "So is there more that I need to know about DB2 performance management?" 

Well, today the mentor is instructing his intern on how to be more proactive, instead of reactive.

"Today, let's talk about keeping our DB2 objects well ordered," said the mentor. "What if you were doing a home repair project and needed a specific size bolt? Would you rather look for it in the pile on the left or the tray on the right?"

"Obviously, the tray," said the intern.

"Then we must work on making our DB2 table spaces and indexes look more like the tray than the pile!" nodded the mentor.

"Oh, yes, I have read all about the REORG utility and I was thinking about creating a REORG job for every table space and index and just scheduling them to run every month," said the intern.

Shaking his head solemnly, his mentor said "Well, yes, that may be better than doing nothing. But again, maybe not. Do you really know that every object needs to be reorganized once a month? What if some get disorganized very quickly, like daily or weekly? Wouldn't it benefit your applications if those objects could be reorganized more frequently?"

"I suppose so," blinked the intern.

"And what about those that are almost never modified? Why do you want to reorganize them so frequently? Then there are those that change slowly, maybe you only need to reorganize them once or twice a year. If you just blindly reorganize everything at the same time you are wasting valuable mainframe CPU cycles... and that can be costly!" countered his mentor.

"Hmmm... So what should I do?" asked the intern.

"As with most complicated things, start by researching and making a plan. Understand each DB2 object and how it is accessed and modified. Is the amount of data growing? How fast? Basically, get an idea of the activity that is being performed on each object."

Things to consider include:

  • How frequently does the data change in the object? 
    • Changes? Rare; Variable? No = infrequent REORG
    • Variable? Yes = more frequent REORG (compression?)
  • How radical is the change?
    • Is the data simply updated in place?
    • Does it move?
    • Are there gaps in the data?
  • Is downtime possible? 
    • Online vs. offline REORG
    • How busy is the object in terms of access and modification during the time you want to reorganize it?

"Fortunately, DB2 gives us some help here!" replied his teacher.

"Oh, yes, I saw some of these things out in the DB2 Catalog... you know, like cluster ratio and near and far off distribution statistics. Right?" asked the intern.

"Ten years ago, you would have been right. And I give you credit for at least reading and learning about the statistics in the DB2 Catalog. But these days we should be using Real Time Statistics, or RTS. The DB2 Catalog stats are only modified when you run the RUNSTATS utility. The RTS are modified as their name implies, regularly in real-time as changes are made to the data," said the mentor. "Here is some documentation on RTS, take some time to understand the statistics and how you might use them to decide when you should reorganize stuff!"

"OK, thanks. I'll learn this stuff and be back to talk about what I should do after than," said the intern.

"Good plan!" said his mentor.