Figure 1. BMC AMI Utilities for Db2
You might also want to take a look at this blog post from BMC that discusses how to Save Time and Money with Updated Unload Times
And this analysis of the BMC next generation REORG technology from Ptak Associates
How do you delete N rows from a Db2 table?
Also, how do you retrieve bottom N rows from a Db2 table without sorting the table on key?
And here is my response:
First things first, you need to refresh your knowledge of "relational" database systems and Db2. There really is no such thing as the "top" or "bottom" N rows in a table. Tables are sets of data that have no inherent logical order.
With regard to the result set though, there is a top and a bottom. You can use the FETCH FIRST N ROWS ONLY clause to retrieve only the first N rows, but to retrieve only the bottom N rows is a bit more difficult. For that, you would have to use scrollable cursors.
A scrollable cursor allows you to move back and forth through the result set without first having to read/retrieve all of the preceding rows. I suggest that you read up on scrollable cursors in the Db2 SQL Reference manual and the Db2 Application Programming manual. All Db2 manuals can be downloaded in Adobe PDF format for free from the IBM website.
Basically, you would want to FETCH LAST from the scrollable cursor and then loop through with a FETCH PRIOR statement, executing the loop N-1 times. That would give you the "bottom" N of any result set -- sorted or not.
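To illustrate, here is a minimal sketch of the SQL involved; the table, column, and host variable names are made up for illustration. The first statement handles the "top" N, and the scrollable cursor handles the "bottom" N:

    -- "Top" 10 rows of the result set
    SELECT EMPNO, LASTNAME
      FROM EMP
     ORDER BY LASTNAME
     FETCH FIRST 10 ROWS ONLY;

    -- "Bottom" N rows using a scrollable cursor
    DECLARE CSR1 INSENSITIVE SCROLL CURSOR FOR
      SELECT EMPNO, LASTNAME
        FROM EMP
       ORDER BY LASTNAME;

    OPEN CSR1;

    FETCH LAST FROM CSR1 INTO :EMPNO, :LASTNAME;
    -- then, in a program loop executed N-1 times:
    FETCH PRIOR FROM CSR1 INTO :EMPNO, :LASTNAME;

    CLOSE CSR1;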
As for your other question, I am confused as to why you would want to delete N rows from a table. Doesn't it matter what the data in the rows is? My guess is that you are asking how you would limit a DELETE to a subset of the rows that would apply to the WHERE condition of the DELETE. The answer is, you cannot, at least not without writing some code.
You would have to open a cursor with the same WHERE conditions specifying FOR UPDATE OF. Then you would FETCH and DELETE WHERE CURRENT OF cursor for that row in a loop that occurs N times. Of course, that means you have to write a program to embed that SQL in.
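A minimal sketch of that approach follows; the table, column, and host variable names are hypothetical, and the FETCH/DELETE pair would be executed in a program loop N times:

    DECLARE DELCSR CURSOR FOR
      SELECT ORDER_ID
        FROM ORDERS
       WHERE ORDER_STATUS = 'CANCELLED'
      FOR UPDATE;

    OPEN DELCSR;

    -- in a program loop executed N times:
    FETCH DELCSR INTO :ORDER_ID;
    DELETE FROM ORDERS WHERE CURRENT OF DELCSR;

    CLOSE DELCSR;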
Hope this answer helps...
Recently I was invited by BMC Software to participate in their AMI Z Talk podcast series to talk about modern data management for Db2... and I was happy to accept.
Anne Hoelscher, Director of R+D for BMC's Db2 solutions, and I spent about 30 minutes discussing modern data management, the need for intelligent automation, DevOps, the cloud, and how organizations can achieve greater availability, resiliency, and agility managing their mainframe Db2 environment.
Here's a link to the podcast that you can play right here in the blog!
Modern data management, to me, means flexibility, adaptability, and working in an integrated way with a team. Today’s data professionals have to move faster and more nimbly than ever before. This has given rise to agile development and DevOps - and, as such, modern DBAs participate in development teams. And DBA tasks and procedures are integrated into the DevOps pipeline.
I’d also like to extend an offer to all the listeners of this BMC podcast (and readers of this blog post) to get a discount on my latest book, A Guide to Db2 Performance for Application Developers. The link is https://tinyurl.com/craigdb2.
There’s also a link to the book publisher on the home page of my website. Once you are there, click on the link/banner for the book, and when you order from the publisher you can use the discount code 10percent to get 10% off your order of the print or ebook.
A recent, recurring theme of my blog posts has been the advancement of in-memory processing to improve the performance of database access and application execution. I wrote an in-depth blog post, The Benefits of In-Memory Processing, back in September 2020, and I definitely recommend you take a moment or two to read through that to understand the various ways that processing data in-memory can provide significant optimization.
There are multiple different ways to incorporate in-memory techniques into your systems ranging from system caching to in-memory tables to in-memory database systems and beyond. These techniques are gaining traction and being adopted at increasingly higher rates because they deliver better performance and better transaction throughput.
Processing in-memory instead of on disk can have a measurable impact on not just the performance of your mainframe applications and systems, but also on your monthly software bill. If you reduce the time it takes to process your mainframe workload by more effectively using memory, you can reduce the number of MSUs you consume to process your mission-critical applications. And depending upon the type of mainframe pricing model you deploy, you can either be saving now or be planning to save in the future as you move to Tailored Fit Pricing.
So it makes sense for organizations to look for ways to adopt in-memory techniques. With that in mind, I recommend that you plan to attend this upcoming IBM Systems webinar titled The benefits and growth of in-memory database and data processing, to be held Tuesday, October 27, 2020 at 12:00 PM CDT.
This presentation features two great speakers: Nathan Brice, Program Director at IBM for IBM Z AIOps, and Larry Strickland, Chief Product Officer at DataKinetics.
In this webinar Nathan and Larry will take a look at the industry trends moving to in-memory, help to explain why in-memory is gaining traction, and review some examples of in-memory databases and alternate in-memory techniques that can deliver rapid transaction throughput. And they’ll also look at the latest Db2 for z/OS features like FTBs, contiguous buffer pools, fast insert, and more that have caused analysts to call Db2 an in-memory database system.
Don’t miss this great session if you are at all interested in better performance, Db2’s in-memory capabilities, and a discussion of other tools that can aid you in adopting an in-memory approach to data processing.
Register today by clicking here!
This month, October 2020, IBM introduced the latest new function level, FL508, for Db2 12 for z/OS. This is the second new function level this year (the first came out in June and you can learn more about it here).
With FL508, IBM adds support for moving tables from multi-table table spaces, both simple and segmented, to partition-by-growth (PBG) universal table spaces (UTS). For an overview of UTS capabilities and types, check out this blog post I made earlier this year: Know Your Db2 Universal Table Spaces.
Multi-table table spaces are deprecated functionality, which means that even though they are still supported, they are on their way out. So it makes sense for IBM to give us a better way to convert them to PBG UTS without having to experience an outage. And that is just what FL508 delivers.
This is accomplished in FL508 by enhancements to the ALTER TABLESPACE statement. A new option, MOVE TABLE, is delivered which, as you might expect from its name, can be used to move a table from its current table space to a target table space.
If, as you would expect in most cases, the source table space data sets are already created, the changes made by MOVE TABLE are pending changes and a REORG must be run on the source table space (the current one you are moving from) to materialize the change. Of course, this is an online REORG, so no outage is required.
The target table space must already exist as a PBG UTS in the same database as the current, source multi-table table space. Furthermore, the PBG UTS must be defined with MAXPARTITIONS 1, DEFINE NO, and [NOT] LOGGED and CCSID values that are the same as the current, existing table space. You can move only one table per ALTER TABLESPACE statement, meaning that each table in a multi-table table space must be moved with a separate ALTER TABLESPACE execution. However, because the changes are pending, you can issue multiple ALTER TABLESPACE statements, one for each table in the multi-table table space, and wait until they have all completed successfully before materializing all of the changes with a single REORG run.
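As a rough sketch of how this looks (the database, table space, and table names here are made up for illustration), assuming the target PBG UTS has already been created with the required attributes:

    -- Move one table from the multi-table table space to the target PBG UTS
    ALTER TABLESPACE MYDB.MULTITS
      MOVE TABLE MYSCHEMA.MYTABLE
      TO TABLESPACE MYDB.NEWPBG;

    -- Repeat the ALTER for each table in MYDB.MULTITS, then materialize all of
    -- the pending changes with a single online REORG of the source table space:
    -- REORG TABLESPACE MYDB.MULTITS SHRLEVEL CHANGE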
It seems simple, and the functionality is nice, but don't just go willy-nilly into things moving tables all over the place once you get this capability in FL508. IBM has documented the things to take care of before you begin to move tables using ALTER TABLESPACE. Check out the IBM recommendations here.
It is also worth mentioning that you still need to keep in mind the impact that moving all tables from multi-table table spaces into their own table space will have on the system. By that I mean, you have to consider the potential impact on things like the number of open data sets (DSMAX ZPARM), DBD size, EDM pool size, and management issues (number of utility jobs, for example).
But it is nice that we now have a reasonable approach for moving tables out of deprecated multi-table table spaces so we can begin the process of moving them before they are no longer supported. A lot of shops "out there" have been waiting for something like this and it is likely to cause FL508 to be adopted quickly.
Let me know what you think by commenting below...
For those of you who have attended an IDUG conference before, you know why I am always excited when IDUG-time rolls around again. And the EMEA event is right around the corner!
Participating at an IDUG conference always delivers a boatload of useful information on how to better use, tune, and develop applications for Db2 – both for z/OS and LUW. IDUG offers phenomenal educational opportunities delivered by IBM developers, vendor experts, users, and consultants from all over the world.
Unfortunately, due to the COVID-19 pandemic, in-person events are not happening this year... and maybe not for some time to come, either. But IDUG has gone virtual, and it is the next best thing to being there! The IDUG EMEA 2020 virtual event will take place November 16–19, 2020. So you have ample time to plan for, register, and attend this year.
If you attended the IDUG North American virtual conference earlier this year, you know that you can still get great Db2 information online at an IDUG event. And there are a ton of great presentations at the EMEA virtual IDUG conference – just check out the great agenda for the event.
Of course, a virtual event does not offer the face-to-face camaraderie of an in-person event, but it still boasts a bevy of educational opportunities. And the cost is significantly less than a traditional IDUG conference, both in terms of the up-front registration fee and because there are no travel costs...
For just $199, you get full access to the virtual conference, as well as a year-long premium IDUG membership and a complimentary certification or proctored badging voucher. If you're already a premium member, you can add the EMEA 2020 Conference access to your membership for just $99.
You can register here: https://www.idug.org/p/cm/ld/fid=2149
So whether you are a DBA, a developer, a programmer, an analyst, a data scientist, or anybody else who relies on and uses Db2, the IDUG EMEA Db2 Tech Conference will be the place to be this November 2020.
With all of this great stuff available online from this IDUG virtual event, why wouldn’t you want to participate?
As most Db2 developers and DBAs know, when you modify a Db2 program you have to prepare the program to enable it to be executed. This program preparation process requires running a series of code preprocessors that—when enacted in the proper sequence—creates an executable load module and a Db2 application package. The combination of the executable load module and the application package is required before any Db2 program can be run, whether batch or online.
But it is not our intent here to walk through and explain all of the steps and nuances involved in Db2 program preparation. Instead, we are taking a look at the impact of converting COBOL programs to Java programs, particularly when it comes to the need to bind as a part of the process.
We all know that issuing the BIND command causes Db2 to formulate access paths for SQL. If enough things (statistics, memory, buffers, etc.) have changed, then access paths can change whenever you BIND or REBIND. And this can be troublesome to manage.
But if the SQL does not change, then it is not technically necessary to bind to create a new package. You can prevent unnecessary BIND operations by comparing the new DBRM from the pre-compile with the previous version. Of course, there is no native capability in Db2 or the BIND command to compare the DBRM. That is why there are third-party tools on the market that can be used for this purpose.
But again, it is not the purpose of today’s post to discuss such tools. Instead, the topic is converting COBOL to Java. I have discussed this previously in the blog in the post Consider Cross-Compiling COBOL to Java to Reduce Costs, so you might want to take a moment to read through that post to acquaint yourself with the general topic.
Converting COBOL to Java and BIND
So, let’s consider a COBOL program with Db2 SQL statements in it. Most COBOL uses static SQL, meaning that the access paths are determined at bind time, not at execution time. If we convert that COBOL program to Java then we are not changing the SQL, just the code around it. Since the SQL does not change, a bind should not be required, at least in theory, right?
Well, we first need to get into a quick discussion about types of Java programs. You can use either JDBC or SQLJ for accessing Db2 data from a Java program. With JDBC the program will use dynamic SQL whereas SQLJ will deliver static SQL. The Db2 BIND command can be issued using a DBRM (precompiler output) or a SQLJ customized profile.
So, part of the equation to avoid binding is to utilize SQLJ for converted COBOL programs.
CloudFrame, the company and product discussed in the referenced blog post above, can be used to convert COBOL programs into modular Java. And it uses SQLJ for the Db2 access. As such, with embedded SQLJ, static SQL will be used and the access paths will be determined at bind time instead of execution time.
But remember, we converted business logic, not SQL. The same SQL statements that were used in the COBOL program can be used in the converted Java. CloudFrame takes advantage of this and re-purposes the existing package from the previous COBOL program for the new Java SQLJ. CloudFrame automates the entire process as part of the conversion from COBOL to Java. This means that the static SQL from the COBOL program is converted and customized into SQLJ in Java. This is a built-in capability of CloudFrame that allows you to simply reuse the same package information that was already generated and bound earlier.
This means no bind is required when you use CloudFrame to convert your Db2 COBOL applications to Java… and no access paths will change. And that is a good thing, right? Conversion and migration are already time-consuming processes; eliminating performance problems due to changing access paths means one less issue to worry about during a COBOL to Java conversion when you use CloudFrame.
For those of you who were fans of the Planet Db2 blog aggregator, you’ll be happy to know that it is back up and operational, under new management.
For those who do not know what I am talking about, for years Leo Petrazickis curated and managed the Planet Db2 blog aggregator. Leo provided a great service to the Db2 community, but unfortunately, about a year ago he had to discontinue his participation in the site. So Planet Db2 has been gone for a while. But it is back now!
Before I continue, for those who don’t know what a blog aggregator is, it is a service that monitors and posts new blog content on a particular topic as it is published. This means that whenever any blog that is being tracked by the aggregator posts new content, it is highlighted with a link to the blog post on the aggregator site. The benefit is that you can watch the blog aggregator page (in this case Planet Db2) for new content instead of trying to monitor multiple blogs.
So if you are a Db2 DBA, programmer, user, vendor, or just an interested party, be sure to highlight and visit Planet Db2 on a regular basis to monitor what’s new in the Db2 blogosphere. And if you write a Db2 blog, be sure to register your blog at the Planet Db2 site so your content is tracked and aggregated to Planet Db2… you’ll surely get more readers of your stuff if you do!
One of the biggest changes in the last decade or so has been the introduction of new types of table spaces – known as Universal table spaces, or UTS. Not only are UTS new to Db2, they are quickly becoming the de facto standard type of table space for Db2 applications, new and old.
At some point, Universal table spaces will displace your existing segmented and classic partitioned table spaces. We’ll examine why this is so later in the post, but first let’s briefly describe what Universal table spaces are.
Two Types of Universal Table Spaces
Introduced in Db2 9 for z/OS, Universal table spaces combine the best attributes of partitioned and segmented table spaces. If you do not know what partitioned and segmented table spaces are, I refer you to this older article I wrote on DB2 Table Space Options to bring you up to speed (note that this article is almost 20 years old at this point).
Universal table spaces offer improved space management for variable length rows because they use space map pages (like segmented table spaces). Also, like segmented table spaces, UTS deliver improved mass delete performance, and you can immediately reuse the table segments after the mass delete. And like partitioned table spaces, Universal table spaces can grow large (up to 128 TB of data) and consist of multiple partitions.
At a high level, there are two types of Universal table spaces:
1. Partition-by-growth (PBG): The PBG UTS creates new partitions as the amount of data grows, without the need to specify key ranges. This type of UTS is beneficial for tables that grow over time and need the additional limits afforded by partitioning but can benefit from the performance of segmented.
2. Partition-by-range (PBR): The range-partitioned, or PBR, UTS requires a key range for partitioning like classic partitioned table spaces. A PBR UTS basically adds segmentation to the existing partitioned table space.
Both types of UTS can contain only a single table, but IBM presentations have indicated that this is likely to change at some point in the future (although nothing has been announced or confirmed for certain).
A partition-by-range UTS is basically a segmented, partitioned table space. The limit key ranges must be specified in the table DDL. Index partitioning, which was supported for the earliest classic partitioned table spaces, is not supported for a PBR UTS. So before converting your classic partitioned table spaces to PBR UTS, you must first convert from index-controlled partitioning to table-controlled partitioning. Check out this blog post for a trick to quickly convert to table-controlled partitioning.
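The trick commonly used for this (and, if I recall correctly, the one described in that post) is to alter the partitioning index, which causes Db2 to switch the table space to table-controlled partitioning. A hedged sketch, using a hypothetical index name:

    -- Altering the index-controlled partitioning index to NOT CLUSTER
    -- converts the table space to table-controlled partitioning
    ALTER INDEX MYSCHEMA.XPARTIX NOT CLUSTER;

    -- Optionally alter it back so it remains the clustering index
    ALTER INDEX MYSCHEMA.XPARTIX CLUSTER;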
The second type of UTS is the partition-by-growth Universal table space. As its name implies, a PBG UTS can automatically add a new partition as the data in the table space grows. Over time, as the UTS is used by applications, data gets added to the table. When the existing partitions reach their maximum size, a new partition is automatically added to the table space. The new partition uses the same characteristics as the existing partitions, including compression details, free space, and so on.
You control the type of UTS using the DDL keywords NUMPARTS, MAXPARTITIONS, and SEGSIZE. To create a PBR UTS you specify both NUMPARTS and SEGSIZE. To get a PBG UTS you must code the MAXPARTITIONS and SEGSIZE parameters. MAXPARTITIONS indicates the limit on the total number of partitions that a PBG UTS can grow to. Be careful, because if you only code the NUMPARTS parameter without SEGSIZE, then you will create a traditional partitioned table space. If you only code the SEGSIZE parameter (without either NUMPARTS or MAXPARTITIONS) you will create a traditional segmented table space.
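To illustrate these combinations, here is a minimal DDL sketch (the database and table space names and the sizes are hypothetical):

    -- PBR UTS: NUMPARTS and SEGSIZE together
    CREATE TABLESPACE PBRTS IN MYDB
      NUMPARTS 12 SEGSIZE 64;

    -- PBG UTS: MAXPARTITIONS and SEGSIZE together
    CREATE TABLESPACE PBGTS IN MYDB
      MAXPARTITIONS 32 SEGSIZE 64;

    -- As noted above, NUMPARTS without SEGSIZE yields a classic partitioned
    -- table space, and SEGSIZE alone yields a traditional segmented table space.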
Db2 12 for z/OS
A significant new feature for supporting big data was introduced in Db2 12: relative page numbering (or RPN) for range-partitioned table spaces. An RPN range-partitioned table space can be created, or an existing range-partitioned table space can be changed to RPN via an ALTER TABLESPACE with PAGENUM RELATIVE, followed by an online REORG of the entire table space.
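A hedged sketch of both paths (the names are hypothetical); remember that the ALTER is materialized by an online REORG of the entire table space:

    -- Create a new range-partitioned table space with relative page numbering
    CREATE TABLESPACE RPNTS IN MYDB
      NUMPARTS 16 SEGSIZE 64
      PAGENUM RELATIVE;

    -- Or convert an existing range-partitioned table space to RPN
    ALTER TABLESPACE MYDB.PBRTS PAGENUM RELATIVE;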
An RPN table space delivers many benefits for availability and storing large amounts of data. This requires an expanded RID, which increases from 5 bytes to 7 bytes.
From an availability perspective, you can specify DSSIZE at the partition level for RPN table spaces. Furthermore, the allowable DSSIZE value is no longer dependent on the page size and number of table space partitions. The DSSIZE change can be an immediate change (no online REORG required to take effect) as long as the change does not decrease the DSSIZE value. You still can decrease DSSIZE, but only at the table space level.
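For example, a sketch of increasing DSSIZE for a single partition of an RPN table space (the names and size value are hypothetical); because the value is growing, the change takes effect immediately:

    ALTER TABLESPACE MYDB.RPNTS
      ALTER PARTITION 3 DSSIZE 256 G;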
From a big data perspective, the DSSIZE can grow up to 1 TB for a partition. And the maximum table size increases to 4 PB with approximately 280 trillion rows per table. That is a lot of data that can be stored. Think about it this way: if you were to insert 1,000 rows per second it would take more than 8,000 years to fill the table to capacity!
Why Are Universal Table Spaces the Future of Db2?
As of today (September 2020, Db2 12 for z/OS), there are basically five types of table spaces from which to choose:
1. Segmented table spaces
2. Universal Partition-By-Growth (PBG) table spaces
3. Universal Partition-By-Range (PBR) table spaces
4. Universal Partition-By-Range Relative Page Number table spaces
5. Classic partitioned table spaces
Of course, for new databases, it is best to remove the classic partitioned table space from consideration because the PBR UTS is more efficient (and classic partitioning will likely be deprecated at some point). Technically speaking, there are actually two other types of table spaces (LOB and XML table spaces), but they are not general-purpose table spaces and can be used only in specific situations (with LOB and XML data).
So why do I advise that you favor Universal table spaces over segmented whenever you can? Well, for a number of reasons. First of all, because Universal table spaces are newer and are all you really need for almost every Db2 implementation. Secondly, because many new features of Db2 can only be used with Universal table spaces. Newer features that only work with UTS include:
· Clone tables
· Hash-organized tables
· Currently committed locking
· Pending DDL
· Inline LOBs
· XML multi-versioning
· ALTER TABLE with DROP COLUMN
And this trend is likely to continue. As IBM introduces new function levels and versions of Db2 with new features that only work with UTS, it will become increasingly difficult for DBAs to keep track of which table spaces are not UTS so that they can make sure they are not using any new features that will not work with their old types of table spaces.
What this means is that other than Universal table spaces, the only other type you should be using are segmented table spaces and then only when you absolutely must have a multi-table table space. Therefore, the best practice I recommend is to make all new table spaces Universal (except for multi-table table spaces which can be segmented).
So, what is the future of the segmented table space? For the immediate future, segmented table spaces will continue to be supported. My guess is that, at some point, IBM will deliver a multi-table UTS capability, and then at some point deprecate segmented table spaces. But this is only my guess. As of the date I am writing this, IBM has not committed to a multi-table UTS and the segmented table space is still the recommended (and only) method for assigning multiple tables into a single table space.
My general recommendation, though, is that you avoid multi-table table spaces unless you have many very small tables and are close to reaching the open data set limit (200,000). Of course, your limit may be lower depending on the setting of the DSMAX subsystem parameter, which specifies the maximum number of data sets that can be open at one time. Acceptable values range from 1 to 200,000; a lower setting may be specified due to operating system constraints or storage/memory limitations.
My general recommendation for table spaces is to slowly work on a conversion project to migrate your classic partitioned table spaces to PBR UTS and your segmented table spaces to PBG UTS. Doing so will bring you to the latest and greatest Db2 table space technology and position you to be able to use all new functionality in current and future versions of Db2 whenever – and wherever – you see fit.
Summary
To make sure that your systems are up-to-date and ready for new functionality, it makes sense to adopt Universal table spaces for all of your Db2 tables. The only exception is for multi-table segmented table spaces, and you shouldn’t have too many of them.
One area where most organizations can benefit is using system memory more effectively. This is because accessing and manipulating data in memory is more efficient than doing so from disk.
Think about it… There are three aspects of computing that impact the performance and cost of applications: CPU usage, I/O, and concurrency. When the same amount of work is performed by the computer using fewer I/O operations, CPU savings occur and less hardware is needed to do the same work. A typical I/O operation (read/write) involves accessing or modifying data on disk systems; disks are mechanical and have latency – that is, it takes time to first locate the data and then read or write it.
There are many other factors involved in I/O processing that involve overhead and can increase costs, all depending upon the system and type of storage you are using. For example, each I/O consists of a multitude of background system processes, all of which contribute to the cost of an I/O operation (as highlighted in Figure 1 below). It is not my intent to define each of these processes but to highlight the in-depth nature of the processing that goes on behind-the-scenes that contributes to the costly nature of an I/O operation.
So, you can reduce the time it takes to process your mainframe workload by more effectively using memory. You can take advantage of things like increased parallelism for sorts and improve single-threaded performance of complex queries when you have more memory available to use. And for OLTP workloads, large memory provides substantial latency reduction, which leads to significant response time reductions and increased transaction rates.
The most efficient way to access data is, of course, in-memory access. Disk access is orders of magnitude less efficient than accessing data from memory. Memory access is usually measured in microseconds, whereas disk access is measured in milliseconds. (Note that 1 millisecond equals 1,000 microseconds.)
The IBM z15 has layers of on-chip and on-board cache that can improve the performance of your application workloads. We can view memory usage on the mainframe as a pyramid, from the slowest to the fastest, as shown in Figure 2. As we go up the pyramid, performance improves; from the slowest techniques (like tape) to the fastest (core cache). But this diagram drives home our core point even further: that system memory is faster than disk and buffering techniques.
Figure 2. The Mainframe Memory Pyramid
So how can we make better use of memory to avoid disk processing and improve performance? Although there are several different ways to adopt in-memory processing for your applications, one of the best methods can be to utilize a product. One such product is the IBM Z Table Accelerator.
IBM Z Table Accelerator is an in-memory table accelerator that can improve application performance and reduce operational cost by utilizing system memory. Using it can help your organization to focus development efforts more on revenue-generating business activity, and less on other less efficient methods of optimizing applications. It is ideal for organizations that need to squeeze every ounce of power from their mainframe systems to maximize performance and transaction throughput while minimizing system resource usage at the application level. You can use it to optimize the performance of all types of data, whether from flat files, VSAM, Db2, or even IMS.
So how does it work? Well, typically a small percentage of your data is accessed and used a large percentage of the time. Think about it in terms of the 80/20 Rule (or the Pareto Principle). About 80% of your data is accessed only 20% of the time, and 20% of your data is accessed 80% of the time.
The data that you are accessing most frequently is usually reference data that is used by multiple business transactions. By focusing on this data and optimizing it you can gain significant benefits. This is where the IBM Z Table Accelerator comes into play. By copying some of the most often accessed data into the accelerator, which uses high-performance in-memory tables, significant performance gains can be achieved. That said, it is only a small portion of the data that gets copied from the system of record (e.g. Db2, VSAM, etc.) into the accelerator.
High-performance in-memory technology products -- such as IBM Z Table Accelerator -- use system memory. Sometimes, if the data is small enough, it can make it into the L3-L4 cache. This can be hard to predict, but when it occurs things get even faster.
Every customer deployment is different, but using IBM Z Table Accelerator to optimize in-memory data access can provide a tremendous performance boost.
A Use Case: Tailored Fit Pricing
Let’s pause for a moment and consider a possible use case for IBM Z Table Accelerator.
In 2019, IBM announced Tailored Fit Pricing (TFP), with the goal of simplifying mainframe software pricing and billing. IBM designed TFP as a more predictable, cloud-like pricing model than its traditional pricing based on a rolling-four-hour-average of usage. Without getting into all of the details, TFP eliminates tracking and charging based on monthly usage and instead charges a consistent monthly bill based on the previous year’s usage (plus growth).
It is important to note that last point: TFP is based on last year’s usage. So you can reduce your bill next year by reducing your usage this year, before you convert to TFP. Therefore, it makes a lot of sense to reduce your software bills to the lowest point possible the year before the move to TFP.
So what does this have to do with IBM Z Table Accelerator? Well, adopting techniques to access data in-memory can lower MSU usage – and therefore your monthly software bill. Using IBM Z Table Accelerator to optimize your reference data in-memory before moving to TFP can help you to lower your software bills and better prepare you for the transition to Tailored Fit Pricing.
--------------------
If you’d like to learn more about IBM Z Table Accelerator, there is an upcoming SHARE webinar on September 15, 2020, that goes into some more details about the offering. It is titled Digital Transformation Includes Getting The Most Out of Your Mainframe: click the link for details and to register to attend.
If you are thinking about, or have already adopted Db2 in the cloud, there is some recent news you should know about. But before we explore that news, let’s take a look at the quick highlights of using Db2 in the cloud.
Db2 on Cloud runs containerized Db2 with a dedicated DevOps team managing the maintenance and updates required to run your mission-critical workloads. This includes features like seamless data federation, point-in-time recovery, HADR with multizone region support, and independent scaling. So many of the administrative burdens of managing Db2 on-premises are handled by IBM in the cloud.
Now if you know me, and have been reading my “stuff” on cloud and DBA, you know that this does not mean that you can entirely offload your DBA work. But it is cool and it does help, especially with DBA teams being stressed to their limits these days.
So yes, you can run Db2 on Cloud! And there are many good reasons to consider doing so, such as scalability, pay-as-you-use pricing, and to take advantage of managed services.
OK, So What is New?
I promised some news in the title of this blog post and so far we have just set the stage by examining IBM’s cloud offering of Db2 (albeit at a high level). So, what’s new?
Well, IBM is revamping its pricing plans. Before digging into the news, you need to know that IBM offers two high-level pricing plan options.
What is new is that on August 19 IBM introduced two new plans, the Enterprise non-HA plan, and the Standard non-HA plan. This means that there are now four options, other than the free Lite plan: Enterprise HA, Enterprise non-HA, Standard HA, and Standard non-HA.
As is typical with IBM pricing, it is not really all that simple and it is getting more complex. But options are always good (I think).
So what is this Standard plan that does not appear on the IBM Db2 on Cloud: Pricing page? Well, we can find it on the IBM Db2 on Cloud catalog page. Here we see that (as one might expect) it is a lower-cost option between Lite and Enterprise, starting at 8 GB of RAM with 20 GB of storage.
IBM also noted that IBM Db2 on Cloud is now available in the following six data centers: Dallas, Frankfurt, Tokyo, London, Sydney, and Washington. And your instances can be provisioned either with or without the Oracle compatibility feature.
It is also important to note that customers on older, legacy plans (how about that, cloud legacy already) will be required to upgrade to one of the newer plans.
So, there are more options to choose from with your Db2 on Cloud implementations. And if you have an older plan, take some time to familiarize yourself with the new pricing plan options and be ready to choose accordingly for your workload requirements.
Surprisingly, COBOL has been in the news a lot recently due to its significant usage in many federal government and state systems, most recently unemployment systems. With the global COVID-19 pandemic, those unemployment systems were stressed like never before, with a 1600% increase in traffic (Government Computer News, May 12, 2020) as those impacted by the pandemic filed claims.
Nevertheless, there is another impending event that will likely pull COBOL back into the news as IBM withdraws older versions of the COBOL compiler from service. All IBM product versions go through a lifecycle that starts with GA (general availability), after some time moves to EOM (end of marketing) where IBM no longer sells that version, and ends with EOS (end of support) where IBM no longer supports that product or version. It is at this point that most customers will need to decide to stop using that product or upgrade to a newer version because IBM will no longer fix or support EOS products or versions.
Of course, code that was compiled using an unsupported COBOL compiler will continue to run, but it is not wise to rely on unsupported software for important, mission-critical applications, such as those usually written using COBOL. And you need to be aware of interoperability issues if you rely on more than one version of the COBOL compiler.
So what is going on in the world of COBOL that will require your attention? First of all, earlier this year on April 30, 2020, IBM withdrew support for Enterprise COBOL 5.1 and 5.2. And Enterprise COBOL 4.2 will be withdrawn from service on April 30, 2022 – just about two years from now.
So now is the time for your organization to think about its migration strategy.
Why is COBOL still being used?
Sometimes people who do not work in a mainframe environment are surprised that COBOL is still being used. But it is, and it is not just a fringe language. COBOL is a language that was designed for business data processing, and it is extremely well-suited for that purpose. It provides features for manipulating data and printing reports that are common business requirements. COBOL was purposely designed for applications that perform transaction processing like payroll, banking, airline booking, etc. You put data in, process that data, and send results out.
COBOL was invented in 1959, so its history stretches back over 60 years; a lot of time for organizations to build complex applications to support their business. And IBM has delivered new capabilities and features over the years that enable organizations to keep up to date as they maintain their application portfolio.
So, COBOL is in wide use across many industries. A majority of global financial transactions are processed using COBOL, including processing 85 percent of the world’s ATM swipes. According to Reuters, almost 3 trillion dollars in DAILY commerce flows through COBOL systems!
The reality is that more than 30 billion COBOL transactions run every day. And there are more than 220 billion lines of COBOL in use today.
COBOL is not dead…
With COBOL 5.1 and 5.2 already out of support, and COBOL 4.2 soon to follow, one migration path is to Enterprise COBOL 6, and IBM has already delivered three releases of it: 6.1, 6.2, and 6.3. There are some nice new features that are in the latest version(s) of IBM Enterprise COBOL, including:
At the same time, there are concerns that need to be considered if and when you migrate to version 6. One example is that the new compiler will take longer to compile programs than earlier versions – from 5 to 12 times longer depending on the optimization level. There are also additional work data sets required and additional memory considerations that need to be addressed to ensure the compiler works properly. As much as 20 times more memory may be needed to compile than with earlier versions of the compiler.
Some additional compatibility issues to keep in mind are that your executables are required to be stored in PDSE data sets and that COBOL 6 programs cannot call or be called by OS/VS COBOL programs.
And of course, one of the biggest issues when migrating from COBOL 4.2 to a new version of COBOL is the possibility of invalid data – even if you have not changed your data or your program (other than re-compiling in COBOL 6). This happens because the new code generator may optimize the code differently. That is to say, you can get different generated code sequences for the same COBOL source with COBOL 6 than with 4.2 and earlier versions of COBOL. While this can help minimize CPU usage (a good thing), it can cause invalid data to be processed differently, causing different behavior at runtime (a bad thing).
Whether you will experience invalid data processing issues depends on your specific data and how your programmers coded to access it. Some examples of processing that may cause invalid data issues include invalid data in numeric USAGE DISPLAY data items; parameter/argument size mismatches; using TRUNC() with binary data values having more digits than they are defined for in working storage; and data items that are used before they have been assigned a value.
Migration considerations
Keep in mind that migration will be a lengthy process for any medium-to-large organization, mostly due to testing application behavior after compilation, and comparing it to pre-compilation behavior. You need to develop a plan that best suits your organization’s requirements and work to implement it in the roughly 2-year timeframe before IBM Enterprise COBOL 4.2 goes out of support.
Things to consider:
Migration Challenges and Options to Consider
As you put your plan together, you might consider converting some of your COBOL applications to Java. An impending event such as the end of support for a compiler is a prime opportunity for doing so. But why might you want to convert your COBOL programs to Java?
Well, it can be difficult to obtain and keep skilled COBOL programmers. As COBOL coders age and retire, there are fewer and fewer programmers with the needed skills to manage and maintain all of the COBOL programs out there. At the same time, there are many skilled Java programmers available on the market, and universities are churning out more every year.
Additionally, Java code is portable, so if you ever want to move it to another platform it is much easier to do that with Java than with COBOL. Furthermore, it is easier to adopt cloud technologies and gain the benefits of elastic compute with Java programs.
Cost reduction can be another valid reason to consider converting from COBOL to Java. Java programs can be run on zIIP processors, which can reduce the cost of running your applications. A workload that runs on zIIPs is not subject to IBM (and most ISV) licensing charges... and, as every mainframe shop knows, the cost of software rises as capacity on the mainframe rises. But if capacity can be redirected to a zIIP processor, then software license charges do not accrue - at least for that workload.
Additional benefits of zIIPs include:
So, there are many reasons to consider converting at least some of your COBOL programs to Java. Some may be worried about Java performance, but Java performance is similar to COBOL these days; in other words, most of the performance issues of the past have been resolved. Furthermore, there are many tools to help you develop, manage, and test your Java code, both on the mainframe and other platforms.
Keeping in mind the concerns about “all-or-nothing” conversions, most organizations will be working toward a mix of COBOL migrations and Java conversions, with a mix of COBOL and Java being the end result. As you plan for this, be sure to analyze and select appropriate candidate programs and applications for conversion to Java. There are tools that can analyze program functionality to assist you in choosing the best candidates. For example, you may want to avoid converting programs that frequently call other COBOL programs and programs that use pre-relational DBMS technologies (such as IDMS and IMS).
How to convert COBOL to Java
At this point, you may be thinking, “Sure, I can see the merit in converting some of my programs to Java, but how can I do that? I don’t have the time for my developers to re-create COBOL programs in Java going line-by-line!” Of course, you don’t!
This is where an automated tool comes in handy. The CloudFrame Migration Suite provides code conversion tools, automation, and DevOps integration to deliver very maintainable, object-oriented Java that can integrate with modern technology available within your open architecture. It can be used to refactor COBOL source code to Java without changing data, schedulers, and other infrastructure components. It is fully automated and seamlessly integrates with the change management systems you already use on the mainframe.
The Java code generated by CloudFrame will operate the same as your COBOL and produce the same output. There are even options you can use to maintain the COBOL 4.2 treatment of data, thereby avoiding the invalid data issues that can occur when you migrate to COBOL 6. This can help to reduce project testing and remediation time.
It is also possible to use CloudFrame to refactor your COBOL programs to Java but keep maintaining the code in COBOL. Such an approach, as described in this blog post (Consider Cross-Compiling COBOL to Java to Reduce Costs), can allow you to keep using your COBOL programmers for maintenance but to gain the zIIP eligibility of Java when you run the code.
Upcoming Webinar
To learn more about COBOL migration, modernization considerations, and how CloudFrame can help you to achieve your modernization goals, be sure to attend CloudFrame’s upcoming webinar, where I will be participating on a panel along with Venkat Pillay (CEO and founder of CloudFrame) and Dale Vecchio (industry analyst and former Gartner research VP). The webinar, titled Navigating the COBOL 4.2 End of Support (EOS) Waters: An expert panel discusses the best course of action to benefit your business, will be held on September 23, 2020 at 11:00 AM Eastern time. Be sure to register and attend!
Summary
Users of IBM Enterprise COBOL 4.2 need to be aware of the imminent end of service date in April 2022 and make appropriate plans for migrating off of the older compiler.
This can be a great opportunity to consider what should remain COBOL and where the opportunities to modernize to Java are. Learn how CloudFrame can help you navigate that journey.