Thursday, September 17, 2020

Convert Your COBOL Db2 Programs to Java Without Rebinding

As most Db2 developers and DBAs know, when you modify a Db2 program you have to prepare the program before it can be executed. This program preparation process requires running a series of code preprocessors that, when run in the proper sequence, create an executable load module and a Db2 application package. Both the executable load module and the application package are required before any Db2 program can be run, whether batch or online.

But it is not our intent here to walk through and explain all of the steps and nuances involved in Db2 program preparation. Instead, we are taking a look at the impact of converting COBOL programs to Java programs, particularly when it comes to the need to bind as a part of the process.

We all know that issuing the BIND command causes Db2 to formulate access paths for SQL. If enough things (statistics, memory, buffers, etc.) have changed, then access paths can change whenever you BIND or REBIND. And this can be troublesome to manage.
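
For reference, a package bind is typically issued as a DSN subcommand and looks something like the minimal sketch below; the collection ID, member name, and options shown are hypothetical placeholders rather than recommendations. MEMBER names the DBRM produced by the precompiler, and EXPLAIN(YES) externalizes the chosen access paths to PLAN_TABLE so they can be reviewed.

    BIND PACKAGE(COLL1) MEMBER(PROGA) -
         ACTION(REPLACE) VALIDATE(BIND) EXPLAIN(YES)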

But if the SQL does not change, then it is not technically necessary to bind to create a new package. You can prevent unnecessary BIND operations by comparing the new DBRM produced by the precompiler with the previous version. Of course, there is no native capability in Db2 or the BIND command to compare DBRMs. That is why there are third-party tools on the market that can be used for this purpose.

But again, it is not the purpose of today’s post to discuss such tools. Instead, the topic is converting COBOL to Java. I have discussed this previously in the blog in the post Consider Cross-Compiling COBOL to Java to Reduce Costs, so you might want to take a moment to read through that post to acquaint yourself with the general topic.

Converting COBOL to Java and BIND

So, let’s consider a COBOL program with Db2 SQL statements in it. Most COBOL uses static SQL, meaning that the access paths are determined at bind time, not at execution time. If we convert that COBOL program to Java, we are not changing the SQL, just the code around it. And if the SQL does not change, a bind should not be required, at least in theory, right?

Well, we first need a quick discussion about the types of Java programs. You can use either JDBC or SQLJ to access Db2 data from a Java program. With JDBC the program uses dynamic SQL, whereas SQLJ delivers static SQL. The Db2 BIND command can be issued using either a DBRM (precompiler output) or an SQLJ customized profile.
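
To make the distinction concrete, here is a simple singleton SELECT; the EMP table and the host variable names are just illustrative. Embedded statically in COBOL or SQLJ, the statement is bound into a package ahead of time; with JDBC, essentially the same statement text is shipped as a string and prepared dynamically at execution time.

    -- Static SQL as embedded in COBOL (EXEC SQL ... END-EXEC) or in an
    -- SQLJ clause; the access path is chosen at BIND time
    SELECT LASTNAME, SALARY
      INTO :lastname, :salary
      FROM EMP
     WHERE EMPNO = :empno

    -- With JDBC, roughly the same statement arrives as a string with
    -- parameter markers and is PREPAREd dynamically at run time:
    --   SELECT LASTNAME, SALARY FROM EMP WHERE EMPNO = ?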

So, part of the equation to avoid binding is to utilize SQLJ for converted COBOL programs.

CloudFrame, the company and product discussed in the blog post referenced above, can be used to convert COBOL programs into modular Java. And it uses SQLJ for the Db2 access. As such, with embedded SQLJ, static SQL will be used and the access paths will be determined at bind time instead of execution time.

But remember, we converted business logic, not SQL. The same SQL statements that were used in the COBOL program can be used in the converted Java. CloudFrame takes advantage of this and re-purposes the existing package from the previous COBOL program for the new Java SQLJ. CloudFrame automates the entire process as part of the conversion from COBOL to Java: the static SQL from the COBOL program is converted and customized into SQLJ in Java. This built-in capability allows you to simply reuse the same package information that was already generated and bound earlier.

This means no bind is required when you use CloudFrame to convert your Db2 COBOL applications to Java… and no access paths will change. And that is a good thing, right? Conversion and migration are already time-consuming processes; eliminating performance problems due to changing access paths means one less issue to worry about during a COBOL to Java conversion when you use CloudFrame.

Wednesday, September 16, 2020

Planet Db2 is Back!

For those of you who were fans of the Planet Db2 blog aggregator, you’ll be happy to know that it is back up and operational, under new management.


For those who do not know what I am talking about, for years Leo Petrazickis curated and managed the Planet Db2 blog aggregator. Leo provided a great service to the Db2 community, but unfortunately, about a year ago he had to discontinue his participation in the site. So Planet Db2 has been gone for a while. But it is back now!

Before I continue, for those who don’t know what a blog aggregator is, it is a service that monitors and posts new blog content on a particular topic as it is published. This means that whenever any blog being tracked by the aggregator posts new content, it is highlighted with a link to the blog post on the aggregator site. The benefit is that you can watch the blog aggregator page (in this case Planet Db2) for new content instead of trying to monitor multiple blogs.

So if you are a Db2 DBA, programmer, user, vendor, or just an interested party, be sure to bookmark Planet Db2 and visit it on a regular basis to monitor what’s new in the Db2 blogosphere. And if you write a Db2 blog, be sure to register your blog at the Planet Db2 site so your content is tracked and aggregated to Planet Db2… you’ll surely get more readers of your stuff if you do!

 

Tuesday, September 08, 2020

Know Your Db2 Universal Table Spaces

One of the biggest changes to Db2 for z/OS in the last decade or so has been the introduction of new types of table spaces, known as Universal table spaces, or UTS. Not only are UTS relatively new to Db2, but they are also quickly becoming the de facto standard type of table space for Db2 applications, new and old.

At some point, Universal table spaces will displace your existing segmented and classic partitioned table spaces. We’ll examine why this is so later in the post, but first let’s briefly describe what Universal table spaces are.

Two Types of Universal Table Spaces

Introduced in Db2 9 for z/OS, Universal table spaces combine the best attributes of partitioned and segmented table spaces. If you do not know what partitioned and segmented table spaces are, I refer you to this older article I wrote on DB2 Table Space Options to bring you up to speed (note that this article is almost 20 years old at this point).

Universal table spaces offer improved space management for variable-length rows because they use space map pages (like segmented table spaces). Also, like segmented table spaces, UTS deliver improved mass delete performance, and you can immediately reuse the table segments after the mass delete. And like partitioned table spaces, Universal table spaces can grow large (up to 128 TB of data) and consist of multiple partitions.

At a high-level, there are two types of Universal table spaces:

1.     Partition-by-growth (PBG): A PBG UTS creates new partitions as the amount of data grows, without the need to specify key ranges. This type of UTS is beneficial for tables that grow over time and need the additional size limits afforded by partitioning, but that can also benefit from the space management characteristics of a segmented table space.

2.     Partition-by-range (PBR): The range-partitioned, or PBR, UTS requires a key range for partitioning, like classic partitioned table spaces. A PBR UTS basically adds segmentation to the existing partitioned table space structure.

 

Both types of UTS can contain only a single table, but IBM presentations have indicated that this is likely to change at some point in the future (although nothing has been announced or confirmed for certain).

A partition-by-range UTS is basically a segmented, partitioned table space. The limit key ranges must be specified in the table DDL. Index-controlled partitioning, which was used for the earliest classic partitioned table spaces, is not supported for a PBR UTS. So before converting your classic partitioned table spaces to PBR UTS, you must first convert from index-controlled partitioning to table-controlled partitioning. Check out this blog post for a trick to quickly convert to table-controlled partitioning.
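
For what it is worth, the quick conversion is commonly accomplished by toggling the clustering attribute of the partitioning index, which causes Db2 to switch the table space to table-controlled partitioning. A minimal sketch with a hypothetical index name (test it on a non-production object first):

    -- Altering the partitioning index to NOT CLUSTER converts the
    -- table space from index-controlled to table-controlled partitioning
    ALTER INDEX DBAADM.XEMP01 NOT CLUSTER;

    -- The clustering attribute can then be restored right away;
    -- the partitioning remains table-controlled
    ALTER INDEX DBAADM.XEMP01 CLUSTER;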

The second type of UTS is the partition-by-growth Universal table space. As its name implies, a PBG UTS can automatically add a new partition as the data in the table space grows. Over time, as the UTS is used by applications, data gets added to the table. When the existing partitions fill up (that is, when an insert cannot find space), a new partition is automatically added to the table space, up to the MAXPARTITIONS limit. The new partition uses the same characteristics as the existing partitions, including compression details, free space, and so on.

You control the type of UTS using three DDL parameters: NUMPARTS, MAXPARTITIONS, and SEGSIZE. To create a PBR UTS you specify both NUMPARTS and SEGSIZE. To get a PBG UTS you must code the MAXPARTITIONS and SEGSIZE parameters; MAXPARTITIONS indicates the limit on the total number of partitions to which a PBG UTS can grow. Be careful, because if you code only the NUMPARTS parameter without SEGSIZE, you can end up with a traditional partitioned table space (depending on the DPSEGSZ subsystem parameter). And if you code only the SEGSIZE parameter (without either NUMPARTS or MAXPARTITIONS) you will create a traditional segmented table space.
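
Here is a minimal DDL sketch of the difference; the database, table space, and buffer pool names are hypothetical, and most options are omitted for brevity.

    -- Partition-by-range UTS: NUMPARTS plus SEGSIZE
    CREATE TABLESPACE TSPBR01 IN MYDB001
      NUMPARTS 12
      SEGSIZE 32
      BUFFERPOOL BP1;

    -- Partition-by-growth UTS: MAXPARTITIONS plus SEGSIZE
    CREATE TABLESPACE TSPBG01 IN MYDB001
      MAXPARTITIONS 64
      SEGSIZE 32
      BUFFERPOOL BP1;

    -- As noted above: NUMPARTS without SEGSIZE can yield a classic
    -- partitioned table space, and SEGSIZE alone (without NUMPARTS or
    -- MAXPARTITIONS) yields a classic segmented table space.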

Db2 12 for z/OS

A significant new feature for supporting big data was introduced in Db2 12: relative page numbering (RPN) for range-partitioned table spaces. An RPN range-partitioned table space can be created as such, or an existing range-partitioned table space can be changed to RPN via an ALTER TABLESPACE with PAGENUM RELATIVE, followed by an online REORG of the entire table space.

An RPN table space delivers many benefits for availability and storing large amounts of data. This requires an expanded RID, which increases from 5 bytes to 7 bytes.

From an availability perspective, you can specify DSSIZE at the partition level for RPN table spaces. Furthermore, the allowable DSSIZE value is no longer dependent on the page size and number of table space partitions. The DSSIZE change can be an immediate change (no online REORG required for it to take effect) as long as the change does not decrease the DSSIZE value. You can still decrease DSSIZE, but only at the table space level.
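
A quick sketch of what these alterations look like; the table space name, partition number, and sizes are placeholders.

    -- Convert an existing PBR table space to relative page numbering;
    -- an online REORG of the entire table space materializes the change
    ALTER TABLESPACE MYDB001.TSPBR01 PAGENUM RELATIVE;

    -- With RPN, DSSIZE can be specified at the partition level, and an
    -- increase takes effect immediately (no REORG required)
    ALTER TABLESPACE MYDB001.TSPBR01 ALTER PARTITION 3 DSSIZE 256 G;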

From a big data perspective, the DSSIZE can grow up to 1 TB for a partition. And the maximum table size increases to 4 PB with approximately 280 trillion rows per table. That is a lot of data that can be stored. Think about it this way: if you were to insert 1000 rows per second it would take more than 8000 years to fill the table to capacity!
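
The rough math, if you want to check it (using 31,536,000 seconds per year):

    280,000,000,000,000 rows ÷ 1,000 rows/second = 280,000,000,000 seconds
    280,000,000,000 seconds ÷ 31,536,000 seconds/year ≈ 8,880 years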

Why Are Universal Table Spaces the Future of Db2?

As of today (September 2020, Db2 12 for z/OS), there are basically five types of table spaces from which to choose:

1.     Segmented table spaces

2.     Universal Partition-By-Growth (PBG) table spaces

3.     Universal Partition-By-Range (PBR) table spaces

4.     Universal Partition-By-Range Relative Page Number table spaces

5.     Classic partitioned table spaces

Of course, for new databases, it is best to remove the classic partitioned table space from consideration because the PBR UTS is more efficient (and classic partitioning will likely be deprecated at some point). Technically speaking, there are actually two other types of table spaces (LOB and XML table spaces), but they are not general-purpose table spaces and can be used only in specific situations (with LOB and XML data).

So why do I advise that you favor Universal table spaces over segmented whenever you can? Well, for a number of reasons. First of all, Universal table spaces are newer and are all you really need for almost every Db2 implementation. Secondly, many new features of Db2 can only be used with Universal table spaces. Newer features that only work with UTS include:

·     Clone tables

·     Hash-organized tables

·     Currently committed locking

·     Pending DDL

·     Inline LOBs

·     XML multi-versioning

·     ALTER TABLE with DROP COLUMN

And this trend is likely to continue. As IBM introduces new function levels and versions of Db2 with features that work only with UTS, it will become increasingly difficult for DBAs to keep track of which table spaces are not UTS and to make sure that no new features are being used with those older types of table spaces.

What this means is that, other than Universal table spaces, the only type you should be using is the segmented table space, and then only when you absolutely must have a multi-table table space. Therefore, the best practice I recommend is to make all new table spaces Universal (except for multi-table table spaces, which can be segmented).

So, what is the future of the segmented table space? For the immediate future, segmented table spaces will continue to be supported. My guess is that IBM will, at some point, deliver a multi-table UTS capability and then eventually deprecate segmented table spaces. But this is only my guess. As of the date I am writing this, IBM has not committed to a multi-table UTS, and the segmented table space is still the recommended (and only) method for assigning multiple tables to a single table space.

My general recommendation, though, is that you avoid multi-table table spaces unless you have many very small tables and are close to reaching the open data set limit (200,000). Of course, your limit may be lower depending on the setting of the DSMAX subsystem parameter, which specifies the maximum number of data sets that can be open at one time. Acceptable values range from 1 to 200,000; a lower setting may be specified due to operating system constraints or storage/memory limitations.

My general recommendation for table spaces is to slowly work on a conversion project to migrate your classic partitioned table spaces to PBR UTS and your segmented table spaces to PBG UTS. Doing so will bring you to the latest and greatest Db2 table space technology and position you to be able to use all new functionality in current and future versions of Db2 whenever – and wherever – you see fit.
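
Both conversions can be accomplished with pending DDL that is materialized by an online REORG. A hedged sketch with hypothetical names follows; note that a classic partitioned table space must already use table-controlled partitioning, and only single-table segmented table spaces qualify for conversion to PBG.

    -- Classic partitioned (table-controlled) table space to PBR UTS:
    -- adding a SEGSIZE is a pending change, materialized by online REORG
    ALTER TABLESPACE MYDB001.TSCLAS01 SEGSIZE 32;

    -- Single-table segmented table space to PBG UTS:
    -- adding MAXPARTITIONS is likewise materialized by online REORG
    ALTER TABLESPACE MYDB001.TSSEG01 MAXPARTITIONS 8;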

Summary

To make sure that your systems are up-to-date and ready for new functionality it makes sense to adopt Universal table spaces for all of your Db2 tables. The only exception is for multi-table segmented table spaces, and you shouldn’t have too many of them.

Good luck Universalizing your Db2 databases!

Wednesday, September 02, 2020

The Benefits of In-Memory Processing

One area where most organizations can benefit is using system memory more effectively. This is because accessing and manipulating data in memory is more efficient than doing so from disk.

Think about it… There are three aspects of computing that impact the performance and cost of applications: CPU usage, I/O, and concurrency. When the computer performs the same amount of work using fewer I/O operations, CPU savings occur and less hardware is needed to do the same work. A typical I/O operation (read/write) involves accessing or modifying data on disk systems; disks are mechanical and have latency, meaning it takes time to first locate the data and then read or write it.

There are many other factors involved in I/O processing that add overhead and can increase costs, all depending upon the system and type of storage you are using. For example, each I/O consists of a multitude of background system processes, all of which contribute to the cost of an I/O operation (as highlighted in Figure 1 below). It is not my intent to define each of these processes, but to highlight the depth of the processing that goes on behind the scenes and contributes to the costly nature of an I/O operation.


Figure 1. The Cost of an I/O

So, you can reduce the time it takes to process your mainframe workload by more effectively using memory. You can take advantage of things like increased parallelism for sorts and improve single-threaded performance of complex queries when you have more memory available to use. And for OLTP workloads, large memory provides substantial latency reduction, which leads to significant response time reductions and increased transaction rates.

The most efficient way to access data is, of course, in-memory access. Disk access is orders of magnitude less efficient than accessing data from memory. Memory access is usually measured in microseconds, whereas disk access is measured in milliseconds. (Note that 1 millisecond equals 1,000 microseconds.)

The IBM z15 has layers of on-chip and on-board cache that can improve the performance of your application workloads. We can view memory usage on the mainframe as a pyramid, from the slowest to the fastest, as shown in Figure 2. As we go up the pyramid, performance improves; from the slowest techniques (like tape) to the fastest (core cache). But this diagram drives home our core point even further: that system memory is faster than disk and buffering techniques.


Figure 2. The Mainframe Memory Pyramid

So how can we make better use of memory to avoid disk processing and improve performance? Although there are several different ways to adopt in-memory processing for your applications, one of the best methods can be to utilize a product. One such product is the IBM Z Table Accelerator.

IBM Z Table Accelerator is an in-memory table accelerator that can improve application performance and reduce operational cost by utilizing system memory. Using it can help your organization focus development efforts more on revenue-generating business activity, and less on other, less efficient methods of optimizing applications. It is ideal for organizations that need to squeeze every ounce of power from their mainframe systems to maximize performance and transaction throughput while minimizing system resource usage at the application level. You can use it to optimize the performance of all types of data, whether from flat files, VSAM, Db2, or even IMS.

So how does it work? Well, typically a small percentage of your data is accessed and used a large percentage of the time. Think about it in terms of the 80/20 Rule (or the Pareto Principle): about 80% of your data is accessed only 20% of the time, and 20% of your data is accessed 80% of the time.

The data that you are accessing most frequently is usually reference data that is used by multiple business transactions. By focusing on this data and optimizing it you can gain significant benefits. This is where the IBM Z Table Accelerator comes into play. By copying some of the most often accessed data into the accelerator, which uses high-performance in-memory tables, significant performance gains can be achieved. That said, it is only a small portion of the data that gets copied from the system of record (e.g. Db2, VSAM, etc.) into the accelerator.

High-performance in-memory technology products -- such as IBM Z Table Accelerator -- use system memory. Sometimes, if the data is small enough, it can make it into the L3-L4 cache. This can be hard to predict, but when it occurs things get even faster.

Every customer deployment is different, but using IBM Z Table Accelerator to optimize in-memory data access can provide a tremendous performance boost.

A Use Case: Tailored Fit Pricing

Let’s pause for a moment and consider a possible use case for IBM Z Table Accelerator.

In 2019, IBM announced Tailored Fit Pricing (TFP), with the goal of simplifying mainframe software pricing and billing. IBM designed TFP as a more predictable, cloud-like pricing model than its traditional pricing based on a rolling four-hour average of usage. Without getting into all of the details, TFP eliminates tracking and charging based on monthly usage and instead charges a consistent monthly bill based on the previous year’s usage (plus growth).

It is important to note that last point: TFP is based on last year’s usage. So you can reduce your bill next year by reducing your usage this year, before you convert to TFP. Therefore, it makes a lot of sense to reduce your software bills to the lowest point possible the year before the move to TFP.

 

So what does this have to do with IBM Z Table Accelerator? Well, adopting techniques to access data in-memory can lower MSU usage – and therefore your monthly software bill. Using IBM Z Table Accelerator to optimize your reference data in-memory before moving to TFP can help you to lower your software bills and better prepare you for the transition to Tailored Fit Pricing.

 

 --------------------

If you’d like to learn more about IBM Z Table Accelerator, there is an upcoming SHARE webinar on September 15, 2020, that goes into more detail about the offering. It is titled Digital Transformation Includes Getting The Most Out of Your Mainframe; click the link for details and to register to attend.


Friday, August 28, 2020

IBM Db2 on Cloud News

If you are thinking about adopting, or have already adopted, Db2 in the cloud, there is some recent news you should know about. But before we explore that news, let’s take a look at the quick highlights of using Db2 in the cloud.


IBM’s Db2 on Cloud offering is a fully managed operational data store running the IBM Db2 11.5 engine, offering 24x7x365 availability. So if you know Db2 on Linux, Unix, and Windows platforms, you know Db2 on Cloud… but there’s more.

Db2 on Cloud runs containerized Db2 with a dedicated DevOps team managing the maintenance and updates required to run your mission-critical workloads. This includes features like seamless data federation, point-in-time recovery, and HADR with multizone region support and independent scaling. So many of the administrative burdens of managing Db2 on-premises are handled by IBM in the cloud.

Now if you know me, and have been reading my “stuff” on cloud and DBA, you know that this does not mean that you can entirely offload your DBA work. But it is cool and it does help, especially with DBA teams being stressed to their limits these days.

So yes, you can run Db2 on Cloud! And there are many good reasons to consider doing so, such as scalability, pay-as-you-use pricing, and to take advantage of managed services.

OK, So What is New?

I promised some news in the title of this blog post and so far we have just set the stage by examining IBM’s cloud offering of Db2 (albeit at a high level). So, what’s new?

Well, IBM is revamping its pricing plans. Before digging into the news, you need to know that IBM offers two high-level pricing plan options.

  • The Lite plan uses a shared multi-tenant system designed for application development and evaluation of IBM Db2 on Cloud. It is offered free of charge, without any time limitations.
  • Enterprise plans are for usage and deployment of business applications and systems. Each includes one database per service instance with 4 vCPU x 16 GB RAM x 20 GB storage on dedicated compute slices, with the option to use a three-HA-node configuration with multizone region support. Pricing starts at $989/month.

What is new is that on August 19 IBM introduced two new plans: the Enterprise non-HA plan and the Standard non-HA plan. This means that there are now four options, other than the free Lite plan: Enterprise HA, Enterprise non-HA, Standard HA, and Standard non-HA.

As is typical with IBM pricing, it is not really all that simple, and it is getting more complex. But options are always good (I think).

So what is this Standard plan that does not appear on the IBM Db2 on Cloud: Pricing page? Well, we can find it on the IBM Db2 on Cloud catalog page. There we see that (as one might expect) it is a lower-cost option between Lite and Enterprise, starting at 8 GB RAM with 20 GB storage.

IBM also noted that IBM Db2 on Cloud is now available in the following six data centers: Dallas, Frankfurt, Tokyo, London, Sydney, and Washington. And your instances can be provisioned either with or without the Oracle compatibility feature.

It is also important to note that customers on older, legacy plans (how about that, cloud legacy already) will be required to upgrade to one of the newer plans.

Summary

So, there are more options to choose from for your Db2 on Cloud implementations. And if you have an older plan, take some time to familiarize yourself with the new pricing plan options and be ready to choose accordingly for your workload requirements.