
Monday, August 30, 2021

What Type of Changes Cause Db2 Packages to Get Invalidated?

Db2 DBAs are constantly working with database objects such as Databases, Tablespaces, Tables, and Indexes. And many requirements cause DBAs to have to modify these objects. Some modifications may be simple, such as just issuing an ALTER statement. Others may be more in-depth, even to the point of having to DROP and re-CREATE the object.

Now I've blogged here before about the many different types of Db2 database changes and how to manage them. For reference, here is a blog post that summarizes the posts I've made on this topic.

My purpose today is not to rehash all of that information again, but to discuss one aspect of change management that probably causes DBAs the most grief: package invalidation. 

Packages can be invalidated by Db2 for many reasons. For example, when a privilege that is required by a package is revoked, the package is invalidated by Db2. 

When a package is invalidated it cannot be executed until it has been rebound. The rebind can happen automatically (an autobind) the next time the package is run, but it is usually better to be proactive and Rebind packages yourself when you take an action that invalidates them. 

And as we all know, rebinding can cause access paths to change. Hopefully for the better... but not always. If access paths always got better then there would be no DBA grief, right? Whenever DBAs perform Rebinds they are always dreading that call from the developer or end-user that says "Hey, this transaction (or job) that used to run quickly is now taking forever."

So it makes sense that DBAs need to be aware of what changes cause packages to be invalidated. Of course, if you have to DROP an object that the package accessed it is obvious that a Rebind is required. But there are many other types of changes that will invalidate packages.

Fortunately, the IBM Db2 documentation is good and easy to find. Here is a link to the Db2 12 for z/OS documentation for Changes that invalidate packages. If you are a DBA, I recommend that you click on that link and bookmark that page!

I'm not going to copy and paste all of the information from the manual here (there is no reason to, and over time it could change). But here are some of the changes that you may not at first think will affect packages, but can (a brief example follows the list):

  • Altering, dropping, or renaming a column
  • Adding date/time columns with defaults
  • Adding a constraint
  • Adding, changing, or rotating partitions in a partitioned table space or a partition-by-range universal table space (PBR UTS)
  • Temporal and transparent archiving changes
  • Adding, altering, or dropping a materialized query table (MQT) 
  • Activating or deactivating row-level or column-level access control
  • Enabling or disabling masks if column access control is in effect
  • Increasing a table space's MAXPARTITIONS attribute
  • Changing a table space's SEGSIZE or DSSIZE
  • Assigning a table space to a different buffer pool that has a different page size
  • Altering an index to add a column, to change the PADDED attribute, or to change the limit key value of a partitioning index
  • Regenerating an index
  • Running the REORG utility with the REBALANCE keyword
  • Running the REPAIR utility on a database with the DBD REBUILD option

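To make this more concrete, here is a hedged sketch (the database, table space, and package names are hypothetical) of just how simple an invalidating change can be. Increasing MAXPARTITIONS is a single ALTER statement, but it invalidates the packages that depend on the table space:

ALTER TABLESPACE MYDB.MYTS
      MAXPARTITIONS 512;

After an ALTER like this, the proactive step is to Rebind the dependent packages, for example by issuing REBIND PACKAGE(MYCOLL.MYPROG) via the DSN command processor.
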
Again, these are just some of the admin changes that can invalidate packages. There are others and you should always refer to the current Db2 documentation for the list of things that will invalidate packages before you make any changes. Failing to do so might mean that you will have to run a mass Rebind... maybe at a time when you'd rather not!

Finally, I'll leave you with a couple of helpful queries you can run to help as you manage changes.

To identify all packages that will be invalidated by a change to a specific object, run the following query:

SELECT   DISTINCT DCOLLID, DNAME, DTYPE 
FROM     SYSIBM.SYSPACKDEP
WHERE    BQUALIFIER = ?
AND      BNAME = ?
AND      BTYPE = ?
ORDER BY DCOLLID, DNAME;

Simply plug in the qualifier and name of the object, along with type of the object (appropriate values can be found in the Catalog table documentation in the appendix of the IBM Db2 SQL Reference manual).
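
For example, to find all packages that depend on a (hypothetical) table named MYQUAL.MYTABLE, you might plug in the values like this (BTYPE 'T' indicates a table):

SELECT   DISTINCT DCOLLID, DNAME, DTYPE
FROM     SYSIBM.SYSPACKDEP
WHERE    BQUALIFIER = 'MYQUAL'
AND      BNAME = 'MYTABLE'
AND      BTYPE = 'T'
ORDER BY DCOLLID, DNAME;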

And if you want to identify all invalid packages, try running this query:

SELECT   COLLID, NAME, VALID
FROM     SYSIBM.SYSPACKAGE
WHERE    VALID <> 'Y'
ORDER BY COLLID, NAME;

Tuesday, June 15, 2021

Db2 12 for z/OS Function Level 510

I'm a little late with this Db2 function level update, but better late than never, right?

In April 2021, IBM introduced a new function level, FL510, for Db2 12 for z/OS. If you want to take a look at the announcement for it, you can read it here, but there really isn't a lot to it.

Unlike all the other function levels, FL510 does not add any new features or capabilities, nor does it introduce any new changes to the Db2 Catalog. So what does it do?

This function level is basically there to prepare for the next new release of Db2, which will obviously be coming soon, or IBM would not have created this function level for it!  So it is time to start thinking about Db2 Next and getting ready for a new release/version of our favorite DBMS!

But we really haven't answered what FL510 does, have we? It is a housekeeping type of function level. When you activate FL510 it verifies and enforces several pre-migration conditions that have to be met before you can migrate to the next Db2 release. It will make sure that all Db2 12 function levels are activated and that all catalog updates for Db2 12 have been applied. This means that the Db2 catalog level is at the last catalog level for Version 12 and any future migration can therefore proceed.

Additionally, FL510 will check to make sure that your application packages were rebound recently enough to ensure that they are supported by the next Db2 release.

If any of the previous conditions are not met, then the activation of FL510 will fail. You will have to remediate your system and try to activate FL510 again before you can move forward to the new release.
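
For reference, the verification and activation are performed with standard Db2 commands. A minimal sketch (the command prefix varies by subsystem):

-DISPLAY GROUP DETAIL
-ACTIVATE FUNCTION LEVEL (V12R1M510)

The DISPLAY GROUP output shows the current and highest activated function levels along with the catalog level, so it is a good way to verify where you stand before (and after) attempting the activation.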

Also, please be aware that FL510 has nothing to do with the fallback SPE that will have to be applied before moving forward with the eventual, new Db2 release. IBM will deliver the fallback SPE in a subsequent APAR at a later date.

So I guess that this is a boring function level in that it delivers no new functionality... but it is exciting as it is a pre-req for a new Db2 release that is on the horizon!

Monday, October 12, 2015

OK, So How Does the CURRENTDATA Parm Work?

There seems to be a LOT of confusion out there in DB2-land about the CURRENTDATA parameter, so I thought it might be helpful to blog about the subject.

So first things first: CURRENTDATA is a BIND parameter that can be specified when you bind or rebind your plans and packages. There are two options that can be specified: YES and NO. The default is NO.
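
For reference, here is a hedged sketch of how the parameter might be specified at bind time (the collection and program names are hypothetical):

BIND PACKAGE(COLL1) MEMBER(PROG1) -
     ISOLATION(CS) CURRENTDATA(NO) ACTION(REPLACE)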

But what does it do?

Well, the manuals tell us that CURRENTDATA indicates whether data currency is required for read-only and ambiguous cursors when the ISOLATION(CS) option is used. But that is not very clear... except that CURRENTDATA really only applies with an isolation level of cursor stability.

The general idea here is to enable the developer to exert a level of control by choosing between high performance or up-to-date data. Sure, in a perfect world, everybody would choose both. But in the real world we deal with tradeoffs and the CURRENTDATA parameter allows us to tell DB2 which of these options to prefer.

CURRENTDATA (NO)

The default, CURRENTDATA(NO), is the performance choice, and probably the best choice for most applications. When NO is chosen, you are telling DB2 that completely up-to-date data currency is not required for read-only and ambiguous cursors. That means that DB2 can perform block fetching for distributed, ambiguous cursors; locally, it also allows DB2 to employ lock avoidance techniques for those cursors.

And that is probably what you want. However, if you are not using pristine coding practices, and your program attempts to DELETE WHERE CURRENT OF against an ambiguous cursor, DB2 will return a negative SQLCODE.

What is an Ambiguous Cursor?

At this point you might be asking “What is an ambiguous cursor?” Generally speaking, an ambiguous cursor is one where DB2 cannot tell if your program needs to use the cursor for updating or deleting. You can avoid ambiguous cursors by coding the FOR READ ONLY or FOR UPDATE OF clause on your cursors to inform DB2 of the program’s intent for the data being accessed.
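
For example, cursors declared like these two (the table and column names are hypothetical) are unambiguous, so DB2 does not have to guess:

DECLARE CSR1 CURSOR FOR
  SELECT EMPNO, LASTNAME
  FROM   EMP
  FOR READ ONLY;

DECLARE CSR2 CURSOR FOR
  SELECT EMPNO, SALARY
  FROM   EMP
  FOR UPDATE OF SALARY;

The first can never be used for positioned updates or deletes (and is eligible for block fetch when distributed); the second clearly declares its intent to update the SALARY column via WHERE CURRENT OF.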

CURRENTDATA (YES)

What about CURRENTDATA(YES)? So far, we have discussed only the NO option, but you can also specify YES. Doing so indicates that the data should be current for read-only and ambiguous cursors. That means DB2 will acquire page (or row) locks to ensure the currency of the data and block fetching for distributed, ambiguous cursors will be curtailed.

Summary

Hopefully this short blog post helps to clear up a muddy topic. Of course, there is a lot more to DB2 locking and performance, and I do not want to make it seem like this is the only thing you need to know about the tradeoffs between data currency and performance.


But at least one parameter might be a little bit clearer to you today than it was yesterday…

Tuesday, January 20, 2015

Approaches to DB2 Access Path Management


BIND and REBIND are crucially important processes for assuring efficient DB2 applications. Because the BIND/REBIND process determines exactly how your DB2 data is accessed, it is important that you develop an appropriate strategy for when and how to REBIND your programs.

There are several common REBIND approaches taken by DB2 users. By far, the best approach is to REBIND your applications over time as the data changes. This approach involves some form of regular maintenance that keeps DB2 statistics up to date and formulates new access paths as data volumes and patterns change.

Other approaches include REBINDing only when a new version of DB2 is installed, or perhaps more ambitious, whenever new PTFs are applied to DB2. Another approach is to REBIND automatically after a regular period of time (days, weeks, months, etc.). This approach can work if the period of time is wisely chosen based on the application data – but it still can pose administrative issues.

The final approach can be summarized as “if it ain’t broke don’t fix it!” This is the worst of the several approaches discussed here. The biggest problem with this approach is that you are penalizing every program in your subsystem for fear that a program or two may have a degraded access path. This results in potentially many programs having sub-optimal performance because the optimizer never gets a chance to create better access paths as the data changes.

Of course, the possibility of degraded performance is real – and that is why this approach has been adopted at some sites. The problem is being able to find which statements have degraded. In an ideal world we would be able to review the access path changes beforehand to determine if they are better or worse. But DB2 itself does not provide any systematic method of administering access paths that way. There are third-party tools that can help you achieve this though.

Anyway, let’s go back to the best approach again, and that is to REBIND on a regular basis as your data changes. This approach has become known as the Three Rs (a sketch follows the list). To implement this approach you:
  1. Regularly reorganize the data to ensure that it is optimally structured.
  2. Follow that with RUNSTATS to be sure that the reorganized state of the data is reflected in the DB2 Catalog.
  3. And follow that with a REBIND for all the application programs that access the data structures impacted by the REORG and RUNSTATS.
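
A minimal sketch of the Three Rs for a single (hypothetical) table space and a dependent package might look like this; the first two statements are utility control statements and the last is a DSN subcommand:

REORG TABLESPACE DB1.TS1 SHRLEVEL CHANGE

RUNSTATS TABLESPACE DB1.TS1 TABLE(ALL) INDEX(ALL)

REBIND PACKAGE(COLL1.PROG1)
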
At any rate, your goal should be to keep your access paths up-to-date with the current state of your data. Failing to do this means that DB2 is accessing data based upon false assumptions. DB2 is unlikely to make the same access path choice as your data grows – and as patterns within the data change.

By REBINDing you can generally improve the overall performance of your applications because the access paths will be better designed based on an accurate view of the data. Additionally, as DB2 changes are introduced (PTFs, new version/release) optimizer improvements and new access techniques can be incorporated into the access paths. That is, if you never REBIND, not only are you forgoing better access paths due to data changes but you are also forgoing better access paths due to changes to DB2 itself.

Adopting the Three R’s approach can pose additional questions. For example, when should you reorganize? In order to properly determine when a REORG is needed you’ll have to look at statistics. This means looking at either RUNSTATS or Real-Time Statistics (RTS). So, perhaps it should be at least 4 R’s – in other words:
  1. RTS (or RUNSTATS)
  2. REORG
  3. RUNSTATS
  4. REBIND
Now it is true that some folks don’t rely on statistics to schedule a REORG. Instead, they just build the JCL to REORG their database objects when they create the object. So they create a table space then build the REORG job and schedule it to run monthly, or quarterly, or on some regular basis. This is better than no REORG at all, but it is probably not the best approach because you are most likely either reorganizing too soon (in which case you waste the CPU cycles to do the REORG) or you are reorganizing too late (in which case performance is suffering for a period of time before the REORG runs). Better to base your REORGs off of statistics and thresholds using either RUNSTATS or RTS.
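
For example, a hedged RTS query along these lines (the 10 percent threshold is purely illustrative; choose thresholds appropriate to your data) can flag table spaces that may be REORG candidates based on inserts that are out of clustering sequence:

SELECT   DBNAME, NAME, PARTITION, TOTALROWS, REORGUNCLUSTINS
FROM     SYSIBM.SYSTABLESPACESTATS
WHERE    TOTALROWS > 0
AND      (REORGUNCLUSTINS * 100) / TOTALROWS > 10
ORDER BY DBNAME, NAME, PARTITION;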

Without accurate statistics there is little hope that the optimizer will formulate the best access path to retrieve your data. If the optimizer does not have accurate information on the size, organization, and particulars of your data then it will be creating access paths based on either default or inaccurate statistics. Incorrect statistics will cause bad choices to be made – such as choosing a merge-scan join when a nested loop join would be better, or failure to invoke sequential prefetch, or using the wrong index – or no index at all. And the problem of inaccurate statistics is pervasive. There are shops out there that never, or rarely, run RUNSTATS to gather up-to-date statistics. Make sure yours is not one of those shops!

When should you run RUNSTATS? One answer is “As frequently as possible based on how often your data changes.” To do this you will need to know a thing or two about your data growth patterns: what is its make-up, how is it used, how fast does it grow, and how often does it change? These patterns will differ for every table space in your system.

Next, we need to decide when to REBIND. The best answer for this is when statistics have changed significantly enough to change access paths. When we know that data has significantly changed it makes sense to REBIND after the RUNSTATS completes. But the trick is determining exactly when we have a “significant” change in our data. Without an automated method of comparing and contrasting statistics (or, better yet, access paths), coming up with an answer in a manual way can be time-consuming and error-prone – especially if we have thousands of DB2 programs to manage.

As we REBIND, we always must be on alert for rogue access paths. A rogue access path is created when the optimizer formulates a new access path that performs worse than the previous access path. This can happen for a variety of reasons. Of course, number one is that the optimizer, though good, is not perfect. So mistakes can happen. Other factors can cause degraded access paths, too. The access paths for volatile tables depend on when you run the RUNSTATS. Volatile tables are those that start out empty, get rows added to them during processing, and are emptied out at the end of the day. And, of course, if the catalog or statistics are not accurate we can get problems, too.

So adopting the Four Rs approach implies that you will have to develop a methodology for reviewing your access paths and taking care of any “potential” problem access paths. Indeed, the Four Rs becomes The Five Rs as we add a step to review the access paths after REBINDing to make sure that there are no rogue access paths:
  1. Start with Real Time Stats (or RUNSTATS) to determine when to REORG.
  2. Then we Reorganize the needy table spaces (and indexes).
  3. After reorganizing, run RUNSTATS to pick up current statistics.
  4. Follow that with the REBINDs.
  5. Then we need that fifth R – which is to Review the access paths generated by the REBIND.
The review is of utmost importance because the optimizer can make mistakes. And, of course, so can you. But your users will not call you when performance is better (or the same). They only dial your numbers when performance gets worse. As such, proactive shops will put best practices in place to test REBIND results comparing the before and after impact of the optimizer’s choices. 
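
One hedged way to systematize part of that review is to save your existing PLAN_TABLE rows before you REBIND with EXPLAIN(YES), and then compare the two sets of access paths (PLAN_TABLE_OLD below is a hypothetical saved copy):

SELECT   N.PROGNAME, N.QUERYNO, N.QBLOCKNO, N.PLANNO,
         O.ACCESSTYPE AS OLD_ACCESS, N.ACCESSTYPE AS NEW_ACCESS,
         O.ACCESSNAME AS OLD_IXNAME, N.ACCESSNAME AS NEW_IXNAME
FROM     PLAN_TABLE N
JOIN     PLAN_TABLE_OLD O
ON       N.PROGNAME = O.PROGNAME
AND      N.QUERYNO  = O.QUERYNO
AND      N.QBLOCKNO = O.QBLOCKNO
AND      N.PLANNO   = O.PLANNO
WHERE    N.ACCESSTYPE <> O.ACCESSTYPE
OR       N.ACCESSNAME <> O.ACCESSNAME
ORDER BY N.PROGNAME, N.QUERYNO;

Any rows returned are candidates for a closer look before your users find them for you.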

Friday, August 12, 2011

Do You Still Use That DB2 Program?

A recurring problem for programmers and IT folks in general is determining whether or not a particular program is still required. As your organization grows and the number of programs increases, keeping track of them all can become quite difficult.

As administration and management burdens increase, a common desire is to get rid of programs that are no longer being used. But it can be difficult to determine which programs are no longer used.

You can always “ask around,” but few IT professionals would be willing to delete anything based on such a general tactic. Another common method is to review performance reports or extracts from a performance warehouse. But perhaps your performance traces are not turned on all the time.

The question is probably more common in DB2 environments because of the plans and packages that consume storage and "sit around" taking up space if their associated program is no longer being used.

Well, for DB2 professionals this type of question becomes easier to answer once you migrate to DB2 10 for z/OS. DB2 maintains a new column, LASTUSED, in the DB2 Catalog. The column exists in both SYSIBM.SYSPACKAGE and SYSIBM.SYSPLAN and is defined as a DATE data type. The date is changed when the package header is requested from EDM. The column is also maintained for triggers and stored procedures. After BIND (REPLACE) this column is reset to the default value (CURRENT DATE).

This is similar to the LASTUSED column added to SYSIBM.SYSINDEXSPACESTATS in DB2 9, which is used to show the last time an index was used.

Of course, you will have to give it some time – because you might have a program that is used only rarely, yet still used. Most shops have queries and programs that run quarterly, or even annually, but nevertheless are very important. So don’t just start freeing packages a month after you’ve migrated to DB2 10!
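
With that caveat in mind, here is a hedged sketch of a query to identify packages that have not been executed in over a year:

SELECT   COLLID, NAME, VERSION, LASTUSED
FROM     SYSIBM.SYSPACKAGE
WHERE    LASTUSED < CURRENT DATE - 1 YEAR
ORDER BY LASTUSED, COLLID, NAME;

Treat the result as a list of candidates to investigate, not a list of packages to free.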

But it is good to know that we now have additional usage information at our fingertips in the DB2 Catalog, isn’t it?

Thursday, December 10, 2009

A Short History of DB2 for z/OS – Part 1

Let's go back in time... almost three decades ago... back to the wild and woolly 1980s! And watch our favorite DBMS, DB2, grow up over time.

Version 1 Release 1 was announced on June 7, 1983. And it became generally available on Tuesday, April 2, 1985. I wonder if it was ready on April 1st but not released because of April Fool’s Day? Any old-time IBMer out there care to comment?

Initial DB2 development focused on the basics of making a relational DBMS work. Early releases of DB2 were viewed by many as an “information center” DBMS, not for production work like IMS.

Version 1 Release 2 was announced on February 4, 1986 and was released for general availability a month later on March 7, 1986. Wow! Can you imagine waiting only a month for a new release of DB2 these days? But that is how it happened back then. Same thing for Version 1 Release 3, which was announced on May 19, 1987 and became GA on June 26, 1987. DB2 V1R3 saw the introduction of date data types.

You might notice that IBM delivered “releases” of DB2 back in the 1980s, whereas today (and ever since V3) there have only been versions. Versions represent major changes, whereas releases are more modest updates.

Version 2 of DB2 became a reality in 1988. Version 2 Release 1 was announced in April 1988 and delivered in September 1988. Here we start to see the gap widening again between announcement and delivery. V2R1 was a very significant release in the history of DB2. Some mark it as the bellwether for when DB2 began to be viewed as a DBMS capable of supporting mission critical, transaction processing workloads. Not only did V2R1 provide many performance enhancements but it also signaled the introduction of declarative Referential Integrity (RI) constraints. RI was important for the acceptance of DB2 because it helps to assure data integrity within the DBMS.

No sooner had V2R1 become GA than IBM announced Version 2 Release 2 on October 4, 1988. But it was not until nearly a year later that it became generally available, on September 23, 1989. DB2 V2R2 again bolstered performance in many ways. It also saw the introduction of distributed database support (private protocol) across MVS systems.

Version 2 Release 3 was announced on September 5, 1990 and became generally available on October 25, 1991. Two very significant features were added in V2R3: segmented table spaces and packages. Segmented table spaces have become the de facto standard for most DB2 data, and packages made DB2 application programs easier to support. DB2 V2R3 is also the version that beefed up distributed support with Distributed Relational Database Architecture (DRDA). Remote unit of work (RUOW) distribution was not available in the initial GA version, but IBM came out with RUOW support for DB2 V2R3 in March 1992.

And along comes DB2 Version 3, announced in November 1993 and GA in December 1993. Now it may look like things sped up again here, but not really. This is when the QPP program for early support of DB2 started. QPP was announced in March 1993 and delivered to customers in June 1993. Still though, fairly rapid turnaround by today’s standards, right?

V3 greatly expanded the number of bufferpool options available (from 5 pools to 80). There were many advances made in DB2 V3 to take better advantage of the System/390 environment: V3 introduced support for hardware-assisted compression and hiperpools. It was also V3 that introduced I/O parallelism for the first time.

We’ll stop here for today and continue our short history of DB2 in my next DB2Portal blog posting. See you soon...