The Db2 Portal Blog

Thursday, December 17, 2020

Db2 Utilities and Modern Data Management

Db2 utilities are the unappreciated, and often over-looked, workhorses of your mainframe Db2 environment. They perform the dirty work that has to be done to populate, organize, backup, and recover your vital mainframe data. Without them, building effective Db2 databases, managing data, optimizing performance, and even accessing mainframe data would be a lot more difficult than it currently is.

The Situation

Think about the Db2 utility situation at your shop. If you are like most organizations you will have Db2 utilities regularly running all the time. There are load and unload tasks running to refresh data for development and testing, for moving data between environments for analysis and processing, and for various other purposes. The LOAD and UNLOAD utilities bear a lot of the hard work for data movement.

You are also most likely reorganizing data using a REORG utility for most of your Db2 table spaces and probably indexes, too. In many cases reorganization jobs are scheduled to run on a regular basis: weekly, monthly, quarterly, etc. Frequently you just set these jobs up when the object is created. The job gets scheduled and is just run without anybody taking a look at them unless, or until there are performance problems.

Then there are COPY and RECOVER utilities for backing up and recovering data when there are problems. The image copy backup jobs are running all the time, taking either full or incremental copies to ensure that you can recover data in case problems are encountered. The copies are running all the time, but the recover jobs (hopefully) are not running all the time!

You are also going to be running the RUNSTATS utility to gather statistics for Db2 to use for query optimization. Depending on how often your data changes, you may be running RUNSTATS frequently or infrequently. Many times the same fate as REORG befalls RUNSTATS… that is, it is scheduled and forgotten about unless problems arise.

There are other utilities, like CHECK which is used to verify the integrity of data. You are probably not running this one very often but when you need it you want it to run fast, right?

So, all of these utilities are “out there” running and consuming CPU to move, copy, and manage your Db2 data. But are they being run effectively as possible?

Moving to the Modern Db2 Utility Way

I think by this point everybody will agree that utility type processing is not just critical, but mandatory for a Db2 environment. But just running with the bare basics is not the best approach.

If we think about data movement with unload and load processing there are several things that you might want to consider for improvement. First of all, consider the speed and performance of the unload and load tasks. You probably want these jobs to run as fast as possible – that is, to consume as little elapsed time as possible to complete. After all, you are probably using these utilities to build environments or even refresh portions of an environment… and there will be developers and testers waiting to use that data as soon as it is available. Using the fastest utility programs available will minimize the wait time and make your developers and testers more productive. Furthermore, you want these tasks to consume as little CPU as possible to reduce your monthly mainframe bills!

In some cases you might want to re-consider unloading and loading altogether, using alternate utilities and offerings that can clone an entire subsystem or move data outside the control of Db2 at the data set level.

If we think about reorganization, it is likely that you are running REORG tasks that don’t need to be run, at least not as regularly as they are being run. At the same time, it is also likely that you are not running other REORG tasks as frequently as you should, thereby causing every other task that accessing the data to degrade. Fortunately, you can use RTS (real time statistics) to help guide when you should (and should not) reorganize your data. In the best case the utility itself relies on RTS to figure out if it needs to run and runs when it makes sense only. Failing this, you are again likely consuming more CPU than is necessary (either running unneeded REORGs or accessing poorly organized data, as the case may be).

If you think about your backup and recovery situation, the issue is likely complexity. Sure you want COPY and RECOVER utilities that run fast and consume minimal CPU, but the big issue is analysis. By that I mean, when you need to recover you want to make sure that you can use the image copies (and, of course, the log) to recover and meet your RTOs (recovery time objectives). But creating recover jobs on-the-fly, in a probably complicated environment with inter-related tables and data, can be difficult. And doing so when there is an outage, which is usually the case, exacerbates the situation. Using intelligent utilities to create the right image copies and to automatically build an appropriate recovery strategy when needed should be the modern approach.

And not to neglect RUNSTATS and CHECK, you want both of those utilities to run as fast as possible, consuming minimal CPU, too. And you want guidance on when and how to run them using available RTS, statistics, and any system information available.

What Can You Do?

One approach is to use modern utilities, not only built for speed but that incorporate AI and machine learning to automate and improve the Db2 utility experience. BMC Software is once again on the vanguard with its BMC AMI utilities for Db2.

The first question you probably have is "What the heck is AMI?" Well, AMI, which stands for Automated Mainframe Intelligence, is technology that is being infused into BMC’s product line to leverage AI, machine learning, and predictive analytics to achieve a self-managing mainframe.

BMC AMI Utilities for Db2 are designed for modern complex Db2 environments. They use a centralized, intelligent architecture (see diagram below) designed specifically to handle the complexity facing IT today. Through intelligent policy-driven automation, you can use the AMI Utilities for Db2 to manage growing amounts of data with ease and, at the same time, deliver full application availability.

Figure 1. BMC AMI Utilities for Db2

If you are looking to reduce CPU and elapsed time by as much as 75%, eliminate downtime while delivering full application availability, lower disk usage, eliminate sort in your REORGs, and simplify complex utility operations, then it makes sense to take a look at the BMC AMI Utilities for Db2.

----------

You might also want to take a look at this blog post from BMC that discusses how to Save Time and Money with Updated Unload Times

And this analysis of the BMC next generation REORG technology from Ptak Associates

Wednesday, November 18, 2020

Deleting "n" Rows From a Db2 Table

I regularly receive database- and Db2-related questions via e-mail. And that is great, but I don't always get a chance to respond to everything. If you've sent me a question and I haven't replied, I apologize. But every now and then, I will use one of the e-mail questions in my in-box and write about it in the blog. Today's question is this:

How do you delete N rows from a Db2 table?

Also, how do you retrieve bottom N rows from a Db2 table without sorting the table on key?

And here is my response:

First things first, you need to refresh your knowledge of "relational" database systems and Db2. There really is no such thing as the "top" or "bottom" N rows in a table. Tables are sets of data that have no inherent logical order.

With regard to the result set though, there is a top and a bottom. You can use the FETCH FIRST N ROWS ONLY clause to retrieve only the first N rows, but to retrieve only the bottom N rows is a bit more difficult. For that, you would have to use scrollable cursors.

A scrollable cursor allows you to move back and forth through the results set without first having to read/retrieve all of the rows before. I suggest that you read up on scrollable cursors in the Db2 SQL Reference manual and the Db2 Application Programming manual. All Db2 manuals can be downloaded in Adobe PDF format for free over the IBM web site.

Basically, you would want to FETCH LAST from the scrollable cursor and then loop through with a FETCH PRIOR statement executing the loop N-1 times. That would give you the "bottom" N of any results set -- sorted or not.

As for your other question, I am confused as to why you would want to delete N rows from a table. Doesn't it matter what the data in the rows is? My guess is that you are asking how you would limit a DELETE to a subset of the rows that would apply to the WHERE condition of the DELETE. The answer is, you cannot, at least not without writing some code.

You would have to open a cursor with the same WHERE conditions specifying FOR UPDATE OF. Then you would FETCH and DELETE WHERE CURRENT OF cursor for that row in a loop that occurs N times. Of course, that means you have to write a program to embed that SQL in.

Hope this answer helps...

Wednesday, October 21, 2020

Automation and the Future of Modern Db2 Data Management

Recently I was invited by BMC Software to participate in their AMI Z Talk podcast series to talk about modern data management for Db2... and I was happy to accept.

Anne Hoelscher, Director of R+D for BMC's Db2 solutions, and I spent about 30 minutes discussing modern data management, the need for intelligent automation, DevOps, the cloud, and how organizations can achieve greater availability, resiliency, and agility managing their mainframe Db2 environment.

Here's a link to the podcast that you can play right here in the blog!

BMC AMI Z Talk · Episode 7: Automation and the Future of Modern Db2 Data Management

Modern data management, to me, means flexibility, adaptability, and working in an integrated way with a team. Today’s data professionals have to move faster and more nimbly than ever before. This has given rise to agile development and DevOps - and, as such, modern DBAs participate in development teams. And DBA tasks and procedures are integrated into the DevOps pipeline.

And as all of this DevOps adoption is happening, the amount of data we store, and have to manage, continues to grow faster than ever before.

These are just some of the challenges that Anne and I discuss in this podcast... and at the end, Anne even asks me to predict the future...

I hope you'll take the time to listen to our discussion and sharing your thoughts and issues regarding the resiliency and agility required to succeed with modern data management and Db2 for z/OS.

----------

I’d also like to extend an offer to all the listeners of this BMC podcast (and readers of this blog post) to get a discount on my latest book, A Guide to Db2 Performance for Application Developers. The link is https://tinyurl.com/craigdb2

There’s also a link to the book publisher on home page of my website. Once you are there, click on the link/banner for the book and when you order from the publisher you can use the discount code 10percent to get 10% off your order of the print or ebook.

Monday, October 19, 2020

Improving Mainframe Performance with In-Memory Techniques

A recent, recurring theme of my blog posts has been the advancement of in-memory processing to improve the performance of database access and application execution. I wrote an in-depth blog post, The Benefits of In-Memory Processing, back in September 2020, and I definitely recommend you take a moment or two to read through that to understand the various ways that processing data in-memory can provide significant optimization.

There are multiple different ways to incorporate in-memory techniques into your systems ranging from system caching to in-memory tables to in-memory database systems and beyond. These techniques are gaining traction and being adopted at increasingly higher rates because they deliver better performance and better transaction throughput.

Processing in-memory instead of on disk can have a measurable impact on not just the performance of you mainframe applications and systems, but also on your monthly software bill. If you reduce the time it takes to process your mainframe workload by more effectively using memory, you can reduce the number of MSUs you consume to process your mission-critical applications. And depending upon the type of mainframe pricing model you deploy you can either be saving now or be planning to save in the future as you move to Tailored-Fit Pricing.

So it makes sense for organizations to look for ways to adopt in-memory techniques. With that in mind, I recommend that you plan to attend this upcoming IBM Systems webinar titled The benefits and growth of in-memory database and data processing to be held Tuesday, October 27, 2020 at 12:00 PM CDT.

This presentation features two great speakers: Nathan Brice, Program Director at IBM for IBM Z AIOps, and Larry Strickland, Chief Product Officer at DataKinetics.

In this webinar Nathan and Larry will take a look at the industry trends moving to in-memory, help to explain why in-memory is gaining traction, and review some examples of in-memory databases and alternate in-memory techniques that can deliver rapid transaction throughput. And they’ll also look at the latest Db2 for z/OS features like FTBs, contiguous buffer pools, fast insert and more that have caused analysts to call Db2 an in-memory database system.

Don’t miss this great session if you are at all interested in better performance, Db2’s in-memory capabilities, and a discussion of other tools that can aid you in adopting an in-memory approach to data processing.

Wednesday, October 14, 2020

Db2 12 for z/OS Function Level 508

This month, October 2020, IBM introduced the latest new function level, FL508, for Db2 12 for z/OS. This is the second new function level this year (the first came out in June and you can learn more about it here).

For those who don't know, the Function Level process was designed by IBM for releasing new Db2 functionality using Continuous Delivery (CD) in short, quick bursts, instead of waiting for new versions (or releases).

With FL508, IBM adds support for moving tables from multi-table table spaces, both simple and segmented, to partition-by-growth (PBG) universal table spaces (UTS). For an overview of UTS capabilities and types, check out this blog post I made earlier this year: Know Your Db2 Universal Table Spaces.

Multi-table table spaces are deprecated functionality, which means that even though they are still supported, they are on their way out. So it makes sense for IBM to give us a better way to convert them to PBG UTS without having to experience an outage. And that is just what FL508 delivers.

This is accomplished in FL508 by enhancements to the ALTER TABLESPACE statement. A new option, MOVE TABLE, is delivered which, as you might expect from its name, can be used to move a table from its current table space to a target table space.

If, as you would expect in most cases, the source table space data sets are already created, the changes made by MOVE TABLE are pending changes and a REORG must be run on the source table space (the current one you are moving from) to materialize the change. Of course, this is an online REORG, so no outage is required.

The target table space must already exist as a PBG UTS in the same database as the current, source multi-table table space. Furthermore, the PBG UTS must be defined with MAXPARTITIONS 1, DEFINE NO, and [NOT] LOGGED and CCSID values that are the same as the current, existing table space. You can move only one table per ALTER TABLESPACE statement, meaning that each table in a multi-table table space must be moved with a separate ALTER TABLESPACE execution. However, because the changes are pending, you can issue multiple ALTER TABLESPACE statements, one for each table in the multi-table table space, and wait until they have all completed successfully before materializing all of the changes with a single REORG run.

It seems simple, and the functionality is nice, but don't just go willy-nilly into things moving tables all over the place once you get this capability in FL508. IBM has documented the things to take care of before you begin to move tables using ALTER TABLESPACE. Check out the IBM recommendations here.

It is also worth mentioning that you still need to keep in mind the impact that moving all tables from multi-table table spaces into their own table space will have on the system. By that I mean, you have to consider the potential impact on things like the number of open data sets (DSMAX ZPARM), DBD size, EDM pool size, and management issues (number of utility jobs, for example).

But it is nice that we now have a reasonable approach for moving tables out of deprecated multi-table table spaces so we can begin the process of moving them before they are no longer supported. A lot of shops "out there" have been waiting for something like this and it is likely to cause FL508 to be adopted quickly.

Let me know what you think by commenting below...