The Db2 Portal Blog: June 2020

Tuesday, June 30, 2020

Consider Cross-Compiling COBOL to Java to Reduce Costs

Most organizations that rely on the mainframe for their mission-critical workload have a considerable amount of COBOL programs. COBOL was one of the first business-oriented programming languages having been first introduced in 1959. Designed for business and available when the IBM 360 became popular, COBOL is ubiquitous in most mainframe shops.

Organizations that rely on COBOL need to make sure that they continue to support and manage these applications or risk interruptions to their business, such as those experienced by the COBOL applications that run the state unemployment systems when the COVID-19 pandemic caused a spike in unemployment applications.

Although COBOL continues to work -- and work well -- for many application needs, there are on-going challenges that will arise for organizations using COBOL. One issue is the lack of skilled COBOL programmers. The average age of a COBOL programmer is in the mid-50’s, and that means many are close to retirement. What happens when all these old programmers retire?

Another issue is cost containment. As business improves and workloads increase, your monthly mainframe software bill is likely increasing. IBM continues to release new pricing models that can help, such as tailored-fit pricing, but it is not easy to understand all of the different pricing models, nor is it quick or simple to switch, at least if you want to understand what you are switching to.

And you can’t really consider reducing cost without also managing to maintain your existing performance requirements. Sure, we all want to pay less, but we need to maintain our existing service level agreements and meet our daily batch window deadline.

Cross-Compiling COBOL to Java

Which brings me to the main point of today’s blog post. Have you considered cross-compiling your COBOL applications to Java? Doing so can help to address some of the issues we just discussed, as well as being a starting point toward your application modernization efforts.

What do I mean by cross-compiling COBOL to Java? Well, the general idea is to refactor the COBOL into high-quality Java using Cloudframe^TM. CloudFrame is the company and the product, which is used to migrate business logic in COBOL into modular Java. This refactoring of the code changes the program structure from COBOL to object-oriented Java without changing its external behavior.

After refactoring, there are no platform dependencies, which allows the converted Java workloads to run on any platform while not requiring changes to legacy data, batch schedulers, CICS triggers or Db2 stored procedures.

I can already hear some of you out there saying “wait-a-minute… do you really want me to convert all of my COBOL to Java?” You can, but I’m not really suggesting that you convert it all and leave COBOL behind… at least not immediately.

But first, let’s think about the benefits you can get when you refactor your COBOL into Java. Code that runs on a Java Virtual Machine (JVM) can run on zIIP processors. When programs run on the zIIP, the workload is not charged against the rolling four-hour average or the monthly capacity for your mainframe software bill. So, refactoring some of your COBOL to Java can help to lower your software bill.

Additionally, moving workload to zIIPs frees up your general-purpose processors to accommodate additional capacity. Many mainframe organizations are growing their workloads year after year, requiring them to upgrade their capacity. But if you can offload some of that work to the zIIP, not only can you use the general purpose capacity that is freed, but if you need to expand capacity you may be able to do it on zIIPs, which are less expensive to acquire than general purpose processors.

It's like Cloudframe is brining cloud economics to the mainframe.

COBOL and Java

CloudFrame refactors Batch COBOL workloads to Java without changing data, schedulers, and other infrastructure (e.g. MQ). CloudFrame is fully automated and seamlessly integrated with the change management systems you use on the mainframe. This means that your existing COBOL programmers can maintain the programs in COBOL while running the actual workloads in Java.

Yes, it is possible to use Cloudframe to refactor the COBOL to Java and then maintain and run Java only. But it is also possible to continue using your existing programmers to maintain the code in COBOL, and then use Cloudframe to refactor to Java and run the Java. This enables you to keep your existing developers while you embrace modernization in a manageable, progressive way that increases the frequency of tangible business deliverables at a lower risk.

An important consideration for such an approach is the backward compatibility that you can maintain. Cloudframe provides binary compatible integration with your existing data sources (QSAM, flat files, VSAM, Db2), subsystems, and job schedulers. By maintaining COBOL and cross-compiling to Java, you keep your COBOL until you are ready to shift to Java. At any time, you can quickly fall back to your COBOL load module with no data changes. The Java data is identical to the COBOL data except for date and timestamp.

With this progressive transformation approach, your migration team is in complete control of the granularity and velocity of the migration. It reduces the business risk of an all-or-nothing, shift-and-lift approach because you convert at your pace without completely eliminating the COBOL code.

Performance is always a consideration with conversions like this, but you can achieve similar performance, and sometimes even better performance as long as you understand your code and refactor wisely. Of course, you are not going to convert all of your COBOL code to Java, but only those applications that make sense. By considering the cost savings that can be achieved and the type of programs involved, cross-compiling to Java using Cloudframe can be an effective, reasonable, and cost-saving approach to application modernization.

Check out their website at www.cloudframe.com or request more information.

Thursday, June 25, 2020

Db2 12 for z/OS Function Level 507

This month, June 2020, IBM introduced a new function level, FL507, for Db2 12 for z/OS. This is the first new function level this year, and the first since October 2019. The Function Level process was designed to release Db2 functionality using Continuous Delivery (CD) in short, quick bursts. However, it seems that the global COVID-19 pandemic slowed things a bit… and that, of course, is understandable. But now we have some new Db2 for z/OS capabilities to talk about for this first time in a little bit!

There are four significant impacts of this new function level:

Application granularity for locking limits
Deletion of old statistics from the Db2 Catalog when using profiles
CREATE OR REPLACE capability for stored procedures
Passthrough-only expressions with IBM Db2 Analytics Accelerator (IDAA)

Let’s take a quick look at each of these new things.

The first new capability is the addition of application granularity for locking limits. Up until now, the only way to control locking limits was with NUMLKUS and NUMLKTS subsystem parameters, and they applied to the entire subsystem.

NUMLKTS defines the threshold for the number of page locks that can be concurrently held for any single table space by any single DB2 application (thread). When the threshold is reached, DB2 escalates all page locks for objects defined as LOCKSIZE ANY according to the following rules:

All page locks held for data in segmented table spaces are escalated to table locks.
All page locks held for data in partitioned table spaces are escalated to table space locks.

NUMLKUS defines the threshold for the total number of page locks across all table spaces that can be concurrently held by a single DB2 application. When any given application attempts to acquire a lock that would cause the application to surpass the NUMLKUS threshold, the application receives a resource unavailable message (SQLCODE of -904).

Well, now we have two new built-in global variables to support application granularity for locking limits.

The first is SYSIBMADM.MAX_LOCKS_PER_TABLESPACE and it is similar to the NUMLKTS parameter. It can be set to an integer value for the maximum number of page, row, or LOB locks that the application can hold simultaneously in a table space. If the application exceeds the maximum number of locks in a single table space, lock escalation occurs.

The second is SYSIBMADM.MAX_LOCKS_PER_USER and it is similar to the NUMLKUS parameter. You can set it to an integer value that specifies the maximum number of page, row, or LOB locks that a single application can concurrently hold for all table spaces. The limit applies to all table spaces that are defined with the LOCKSIZE PAGE, LOCKSIZE ROW, or LOCKSIZE ANY options.

The next new capability is the deletion of old statistics when using profiles. When you specify the USE PROFILE option with RUNSTATS, Db2 collects only those statistics that are included in the specified profile. Once function level 507 is activated, Db2 will delete any existing statistics for the object(s) that are not part of the profile. This means that all frequency, key cardinality, and histogram statistics that are not included in the profile are deleted from the Db2 Catalog for the target object.

This is a welcome new behavior because it makes it easier to remove old and stale distribution statistics. Keep in mind that this new behavior also applies when you use profiles to gather inline statistics with the REORG TABLESPACE and LOAD utilities.

Another great new capability that stored procedure users have been waiting for for some time now is the ability to specify CREATE OR REPLACE for procedures. This means that you do not have to first DROP a procedure if you want to modify it. You can simply specify CREATE OR REPLACE PROCEDURE and if it already exists, the procedure will be replaced, and if not, it will be created. This capability has been available in other DBMS products that support stored procedures for a while and it is good to see it come to Db2 for z/OS!

Additionally, for native SQL procedures, you can use the OR REPLACE clause on a CREATE PROCEDURE statement in combination with a VERSION clause to replace an existing version of the procedure, or to add a new version of the procedure. When you reuse a CREATE statement with the OR REPLACE clause to replace an existing version or to add a new version of a native SQL procedure, the result is similar to using an ALTER PROCEDURE statement with the REPLACE VERSION or ADD VERSION clause. If the OR REPLACE clause is specified on a CREATE statement and a procedure with the specified name does not yet exist, the clause is ignored and a new procedure is still created.

And finally, we have support for passthrough-only expressions to IDAA. This is needed because you may want to use an expression that exists on IDAA, but not on Db2 12 for z/OS. With a passthrough-only expression, Db2 for z/OS simply verifies that the data types of the parameters are valid for the functions. The expressions get passed over to IDAA, and the accelerator engine does all other function resolution processing and validation.

What new expressions does FL507 support you may ask? Well all of the following built-in functions are now supported as passthrough-only expressions to IDAA:

ADD_DAYS
BTRIM
DAYS_BETWEEN
NEXT_MONTH
Regression functions

REGR_AVGX
REGR_AVGY
REGR_COUNT
REGR_INTERCEPT
REGR_ICPT
REGR_R2
REGR_SLOPE
REGR_SXX
REGR_SXY
REGR_SYY

ROUND_TIMESTAMP (when invoked with a DATE expression)

You can find more details on the regression functions from IBM here.

Summary

These new capabilities are all nice, new features that you should take a look at, especially if you have applications and use cases where they can help.

The enabling APAR for FL507 is PH24371. There are no incompatible changes with FL 507. But be sure to read the instructions for activation details and Db2 Catalog impacts for DL 507.

Tuesday, June 09, 2020

Optimizing Mainframe Data Access

Nobody can deny that the amount of data that we store and manage continues to expand at a rapid pace. A recent study by analysts at IDC Corporation concludes that the size of what they call The Global Datasphere will reach 175 zettabytes by 2025.

This data growth is being driven by many different industry trends and patterns. We see mobile usage continuing to grow unabated, while at the same time we are hooking up more devices that generate more data to the Internet of Things (IoT), we are storing more data for analysis and machine learning, creating data lakes, and copying data all over the place.

As such, organizations are looking to process, analyze, and exploit this data accurately and quickly. This is especially the case for mainframe sites, where optimizing data usage and access can result in big returns on decision-making and also big savings…

So how can organizations leverage the best data storage for each type of usage required? Well, it helps to think of the different types and usages of data at a high level. If we consider data along two axes -- volatility and usage -- we can map out where it makes sense to store the data… and which IBM technologies we can bring to bear to optimize that data.

From a volatility perspective, there is a continuum of possibilities from never-to-rarely changing to frequently changing. And from a usage perspective there is a continuum of possibilities ranging from mostly analytical and decision-making to transactional that conducts day-to-day business operations.

These continuums are outlined in the chart below. As you review the chart, keep in mind that transaction processing is typified by short queries that get in, do their business, and get out. Analytics processing, on the other hand, is typically going to require longer-running queries. Furthermore, the chart calls out two other types of data: reference data and temporary data.

Reference data is that which defines permissible values to be used by other data elements (columns or fields). Reference data is typically widely-used and referenced by many applications. Additionally, reference data does not change very often (hence its inclusion near the bottom of the volatility continuum on the chart). Temporary data, as the name suggests, exists for a period of time during processing, but is not stored persistently (which is why it is depicted near the top of the volatility continuum on the chart).

With this framework as our perspective, let’s dig in and look at the options shown in the center of the chart. For the most part, we assume that mainframe data will be stored in IBM Db2 for z/OS, but of course, not all mainframe data will be. For analytical processes, IBM provides the IBM Db2 Analytics Accelerator (aka IDAA).

IBM Db2 Analytics Accelerator

IDAA is a high-performance appliance for analytical processing that is tightly-integrated with Db2 for z/OS. The general idea is to enable HTAP (Hybrid Transaction Analytical Processing) from the same database, on Db2 for z/OS. IDAA stores data in a columnar format that is ideal for speeding up complex queries – sometimes by orders of magnitude.

Data that is loaded into the IDAA goes through a hashing process to map data to multiple discs and different blades. The primary purpose of spreading the data over multiple discs is to enable parallelism, where searching is performed across multiple discs (portions of the data) at the same time, resulting in efficiency gains.

Db2 for z/OS automatically decides which queries are appropriate for execution on IDAA. And when a query is run on IDAA, it is distributed across multiple blades. Each blade delivers a partial answer to the query, based on the portion of the data on its discs. The combination of each of the blades results provides the final query results.

When you think of the types of queries, that is, longer-running ones, that IDAA can boost, think of queries like the following:

SELECT something
FROM big table
WHERE suitable filter clause

SELECT something with aggregation
FROM big table
WHERE suitable filter clause

IBM Z Table Accelerator

But IDAA is not designed to help your short-running transactional queries; at least not that much. So we can turn to another IBM offering to help out here: IBM Z Table Accelerator? At a high level, IBM Z Table Accelerator is an in-memory table accelerator for Db2 and or VSAM tables that can dramatically improve overall Z application performance and reduce operational cost.

The most efficient way to access data is, of course, in-memory access. Disk access is orders-of-magnitude less efficient than access data from memory. Memory access is usually measured in microseconds, whereas disk access is measured in milliseconds. (Note that 1 millisecond equals 1000 microseconds.)

This is the case not only because disk access is mechanical and memory access is not, but because there are a lot of actions going on behind the scenes when you request an I/O. Take a look at this diagram, which comes from a Marist University white paper on mainframe I/O.

The idea here is to show the complexity of operations that are required in order to request and move data from disk to memory for access, not to explicitly walk through each of these activities. If you are interested in doing that, I refer you to the link shown for the white paper.

So an in-memory table processor, like IBM Z Table Accelerator, can be used to keep data in memory for program access to eliminate the processing and complexity of disk-based I/O operations. The benefits of IBM Z Table Accelerator are many: it can allow you to reduce resource consumption, it can help to reduce elapsed time experienced in batch windows, and it can reduce operational cost and improve system capacity.

The concept is simple enough, as shown in this overview graphic (above). The IBM Z Table Accelerator is used to host reference and/or temporary data in memory, instead of on disk, to significantly improve application performance.

So let’s take a look at the difference between an I/O operation (or any fetch of data from Db2) versus accessing the data using IBM Z Table Accelerator:

The top of the diagram shows the code path required by the data request (or fetch) as it makes it way from disk through Db2 and back to the application. The bottom portion of the diagram shows the code path accessing data using IBM Z Table Accelerator. This is a significant simplification of the process and it should help to clarify how much more efficient in-memory table access can be.

The Bottom Line

If you take the time to analyze the type of data you are using, and how you are using it, you can use complementary acceleration software from IBM to optimize your application accesses. For analytical, long-running queries consider using IBM Db2 Analytics Accelerator. And for transactional processing of reference data and temporary data, consider using IBM Z Table Accelerator.

Both technologies are useful for different types of data and processing.

(Note: You can click on any graphic in the post to see it enlarged.)