Friday, July 16, 2021

Keeping Track of Data Movement in Db2 for z/OS

Creating and managing test data for Db2 application development and testing can be a significant challenge. To support not only the development of new programs, but also the maintenance of existing ones, organizations must ensure that an adequate amount of accurate test data is always available. Without relevant, useful data, there is no way to test applications to make sure they are operating correctly.

Although this duty must be a shared one between the application developers and the DBAs, managing and controlling all of the data movement tasks typically falls on the DBAs. And keeping track of what data moved where, when it moved, and why can at times be as much of a challenge as moving the data itself.

Fortunately, there are test data management tools available to not only move the data, but to keep track of it. I’ve written about one of the better Db2 for z/OS data movement tools here in the blog before: Fast and Effective Db2 for z/OS Test Data Management with BCV5. I hope you'll take a moment to click and read that post.

Now BCV5 has been improved with a new reporting feature that enables users to track the movement of data across their Db2 subsystems. This is a significant new feature that can provide useful information for DBAs, storage administrators, and even data stewards charged with data governance.

There are six tables of metadata that BCV5 populates to track the data movement and the copy tasks it performs. These tables are:
  • BCV531.TASK_EXECUTIONS
  • BCV531.TASK_EXECUTIONS_OBJECTS
  • BCV531.TASK_EXECUTIONS_PARAMETERS
  • BCV531.TASK_EXECUTIONS_JOBS
  • BCV531.TASK_EXECUTIONS_RULES
  • BCV531.TASK_EXECUTIONS_MASKING
The information in these tables is updated whenever BCV5 runs a task to copy Db2 data. Users can query these tables just like any other Db2 tables to monitor the details of the BCV5 tasks that have been run. This information may be useful for many different IT and business professionals, but let’s take a look at three specific use cases: 
  1. database administration (DBA), 
  2. storage administration, and 
  3. data governance.
DBA Tracking
DBAs tasked with moving and refreshing data from one Db2 environment to another are the typical users of BCV5, and therefore they will be one of the primary users of the new reporting tables. Most sites that use BCV5 use it to refresh test data, for example, copying production data to test, or copying unit test data to an integration test set of tables.  

Regardless of the type of data movement that is being undertaken, it is usually being done for multiple tables, table spaces, and databases. Usually, there will be regularly scheduled processes that copy some of the data, but this is rarely sufficient as there will be one-off requests, special situations, and emergency data refreshes happening all the time. Keeping track of such a hectic morass of copying and refreshing data can be difficult. 

Fortunately, if you are using BCV5 the new Usage Tracker tables can simplify keeping track of data refreshes for DBAs. For example, a DBA looking to find out which BCV5 copy tasks were run during the month of May could code a query like:

 SELECT T.ROWDATE, T.USER, T.TASKNAME 
 FROM BCV531.TASK_EXECUTIONS T 
 WHERE T.ROWDATE BETWEEN '2021-05-01' AND '2021-05-31' 
 ORDER BY T.ROWDATE ;

This will show all the BCV5 copy tasks that ran during that timeframe, and will look something like this:

ROWDATE      USER     TASKNAME 
---------+---------+---------+---------+--------- 
2021-05-12   USERID1  TSK0001 
2021-05-12   USERID1  TSK0002 
2021-05-20   USERID5  TSKPROD1 
2021-05-21   USERID9  TSKPROD4 
 

The results shown here are just a sample and will likely be a subset of the actual results of running such a query. 

Of course, this is rudimentary information and it is likely that the DBA will want to know more, such as which objects were impacted by these tasks. A query such as the following will come in handy:

SELECT SUBSTR(SRCSCHEMA,1,8) AS SS, 
       SUBSTR(SRCNAME,1,12) AS SN, 
       SUBSTR(TGTSCHEMA,1,8) AS TS, 
       SUBSTR(TGTNAME,1,12) AS TN, 
       SUBSTR(OBJTYPE,1,1), 
       ROWDATE AS DATE_COPIED, 
       SIZEKB 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
ORDER BY DATE_COPIED ;

The results show all the Db2 objects copied by BCV5, including the source and target names as well as the object type, date copied, and amount of data (in KB) copied:

SS      SN       TS     TN         DATE_COPIED   SIZEKB 
---------+---------+---------+---------+---------+-----
DB500XA TS500X01 DBA001 TS500XA1 S  2021-05-17  1462480 
QUALID  XCL59011 TESTID XCL59011 X  2021-05-17   325040 
QUALID  XCL59012 TESTID XCL59012 X  2021-05-17   125200 
QUALID  XCL59013 TESTID XCL59013 X  2021-05-17   301460 
QUALID  XCL5901C TESTID XCL5901C X  2021-05-17    98400 
QUALID  TEST_TBL TESTID TEST_TBL T  2021-05-17       20 

Again, the results have been truncated as this is intended as an example.

A DBA looking to track down the results of a specific copy task that ran on a specific date might want to run a query like this, to verify which objects were copied. Simply plug in the name of your task and the date it ran:

SELECT T.TASKNAME, O.OBJTYPE, O.SRCSCHEMA, O.SRCNAME, 
       O.TGTSCHEMA, O.TGTNAME, O.PARTITIONS 
FROM   BCV531.TASK_EXECUTIONS         T, 
       BCV531.TASK_EXECUTIONS_OBJECTS O 
WHERE  T.ID = O.EXECID 
AND    T.TASKNAME = ? 
AND    T.ROWDATE = ? ;

Storage Administration Tracking 
Another type of user who might find the Usage Tracker capabilities of BCV5 useful is the storage administrator. Storage administrators are responsible for managing an organization’s disk and tape systems. They are also responsible for monitoring storage usage and capacity to ensure that sufficient storage is available for the organization’s IT requirements. 

As such, the storage administrator will likely want to keep an eye on the data movement activities of BCV5. For example, a query such as this one can be used to report on the total amount of data (TS and IX) copied by date:

SELECT ROWDATE AS DATE_COPIED, 
       SUM(SIZEKB) AS TOTAL_KB_COPIED 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
WHERE  OBJTYPE IN ('S', 'X') 
GROUP BY ROWDATE ;

Which will return data similar to this:

DATE_COPIED  TOTAL_KB_COPIED 
---------+---------+---------+---------+------ 
2021-06-12         106231270 
2021-06-19         106231270 
2021-06-21        4451457810 
2021-06-26         106231270 

Another potentially useful query, not only for storage administrators and DBAs, but also for application managers, tracks the amount of actual (table space) data copied by date and application. Finding the application name or identifier can be tricky, but if we assume that an application identifier is embedded in the second and third characters of the database name, then a query like this can be run:

WITH SIZEBYAPP AS ( 
  SELECT ROWDATE AS DATE_COPIED, 
         SUBSTR(TGTSCHEMA,2,2) AS APPL_NAME, 
         SIZEKB AS SIZE_IN_KB 
  FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
  WHERE OBJTYPE = 'S' 
                  ) 
SELECT DATE_COPIED, APPL_NAME, 
       SUM(SIZE_IN_KB) AS TOTAL_KB_COPIED 
FROM   SIZEBYAPP 
GROUP BY DATE_COPIED, APPL_NAME ;

Which might return a report looking something like this:

DATE_COPIED APPL_NAME TOTAL_KB_COPIED 
---------+------------------+---------+------- 
2021-05-11  EN               46805760 
2021-05-22  BA              242791056 
2021-05-22  BX                4094640 
2021-05-22  CM                1008720 
2021-05-22  DA              270390816 
2021-05-22  OR                  90528 
2021-05-26  PR               55737376 
2021-05-26  XX              537647328 

You can adjust this query if you want to know the amount of index data copied by date and application, like so (under the same assumption as above for the application identifier):

WITH SIZEBYAPP AS ( 
  SELECT B.ROWDATE AS DATE_COPIED, 
         SUBSTR(T.DBNAME,2,2) AS APPL_NAME, 
         B.SIZEKB AS SIZE_IN_KB 
  FROM   BCV531.TASK_EXECUTIONS_OBJECTS B, 
         SYSIBM.SYSINDEXES              X, 
         SYSIBM.SYSTABLES               T 
 WHERE   B.OBJTYPE = 'X' 
 AND     B.TGTNAME = X.NAME 
 AND     B.TGTSCHEMA = X.CREATOR 
 AND     X.TBNAME = T.NAME 
 AND     X.TBCREATOR = T.CREATOR 
                 ) 
SELECT DATE_COPIED, APPL_NAME, 
       SUM(SIZE_IN_KB) AS TOTAL_KB_COPIED 
FROM   SIZEBYAPP 
GROUP BY DATE_COPIED, APPL_NAME ;

Data Governance Tracking 
Although tracking data movement activities is useful for DBAs, it is also important for data governance reporting. Data governance refers to the processes and standards for ensuring access to high-quality data throughout an organization. It encompasses all aspects of data quality, including accuracy, availability, consistency, integrity, security, and usability. The role of data governance has grown as data privacy rules and regulations have expanded in response to an increasing number of data breaches and hacker attacks. For example, in early July 2021, Colorado passed the Colorado Privacy Act, meaning that Colorado, Virginia, and California have now all passed data privacy legislation that impacts how personal data must be governed. More states are certain to follow their lead… and let’s not forget the European Union’s GDPR!

These types of regulations provide rights for access, deletion, correction, portability, and protection for personally identifiable information, or PII. Note that “portability” is one aspect of data protection covered under the auspices of such regulations… and BCV5 is a mover of data, so you need to be able to track what data moved where, especially when the data that moved contains any PII. 

So, what types of queries can be run using the new BCV5 reporting tables to help satisfy the needs of data governance? 

Well, if you have identified specific tables that contain personally identifiable information, and that therefore require specific policies to ensure their privacy and protection, a data steward might want to run a query that shows all of the times that a specific protected table was copied:

SELECT SUBSTR(SRCSCHEMA,1,8) AS SS, 
       SUBSTR(SRCNAME,1,12) AS SN, 
       SUBSTR(TGTSCHEMA,1,8) AS TS, 
       SUBSTR(TGTNAME,1,12) AS TN, 
       ROWDATE AS DATE_COPIED, 
       SIZEKB 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
WHERE  OBJTYPE = 'T' 
AND    SRCSCHEMA = ? 
AND    SRCNAME = ? 
ORDER BY DATE_COPIED ;

Simply code the appropriate schema (SRCSCHEMA) and table name (SRCNAME) and this query will show all the times that the particular table (say PROD.CUSTOMERS) was copied. A data governance professional with a list of tables that contain PII could alter this query to accept that list as an IN clause instead of the simple equality clause shown here. 
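
For example, here is a minimal sketch of that variation, assuming a hypothetical list of protected tables under the PROD schema (substitute your own schema and table names):

SELECT SUBSTR(SRCSCHEMA,1,8) AS SS,
       SUBSTR(SRCNAME,1,12) AS SN,
       SUBSTR(TGTSCHEMA,1,8) AS TS,
       SUBSTR(TGTNAME,1,12) AS TN,
       ROWDATE AS DATE_COPIED,
       SIZEKB
FROM   BCV531.TASK_EXECUTIONS_OBJECTS
WHERE  OBJTYPE = 'T'
AND    SRCSCHEMA = 'PROD'
AND    SRCNAME IN ('CUSTOMERS', 'ACCOUNTS', 'PAYMENTS')
ORDER BY DATE_COPIED ;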

Additionally, BCV5 can de-identify sensitive data using masking. Whenever a task that requires data to be masked is run, information is captured in the TASK_EXECUTIONS_MASKING table. So, a data governance professional might want to run a query like this one to report on all the masking of sensitive data.

SELECT SUBSTR(TBCREATOR,1,8) AS TBCREATOR,
       SUBSTR(TBNAME,1,12) AS TBNAME, 
       SUBSTR(COLNAME,1,18) AS COLNAME, 
       METHOD AS MASKING_METHOD, 
       SQLEXPR 
FROM   BCV531.TASK_EXECUTIONS_MASKING 
ORDER BY ROWDATE ;

This can always be modified to join it to the TASK_EXECUTIONS table to obtain the task name if that is important. And with a little manipulation of the query it is possible to look for tables that contain sensitive data that have been copied using BCV5, but have not had masking applied. 
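
For example, a sketch of such a join might look like the following; note that it assumes the masking table carries the same EXECID linkage to TASK_EXECUTIONS that the objects table uses, so verify the join column in your installation before relying on it:

SELECT T.TASKNAME,
       SUBSTR(M.TBCREATOR,1,8) AS TBCREATOR,
       SUBSTR(M.TBNAME,1,12) AS TBNAME,
       SUBSTR(M.COLNAME,1,18) AS COLNAME,
       M.METHOD AS MASKING_METHOD
FROM   BCV531.TASK_EXECUTIONS         T,
       BCV531.TASK_EXECUTIONS_MASKING M
WHERE  T.ID = M.EXECID          -- assumed join column, as used by the objects table
ORDER BY M.ROWDATE ;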

Summary 
BCV5 has offered powerful data movement and masking capabilities for Db2 data for a long time, but now it also offers the ability to track and report on your organization’s Db2 data movement and copy tasks. This new functionality opens up a plethora of useful information for BCV5 users.

Tuesday, July 06, 2021

How Many Temporal Tables Does Your Site Have?

Sometimes it can be difficult to remember where information is stored in the Db2 Catalog. Usually, with a little rumination and a little review of Appendix A of the IBM Db2 SQL Reference manual (SC27-8859), you can come up with a solution.

For example, I was talking to some DBAs who were trying to remember if they had ever created any business-time temporal tables. A comment was made that we could surely find that in the Db2 Catalog and the conversation moved along... but then I thought, hmmm, let me see what I can do about coming up with a catalog query.

The first step was to think about where this information might be found, which took me to SYSTABLES. A good first thought, but no, it isn't there. So I thought, how about SYSCOLUMNS? And lo and behold, there was the answer.

The columns identified as the start and end date/time for the temporal range are documented in SYSCOLUMNS in the PERIOD column. PERIOD is defined as a CHAR(1) column and it contains one of the following values for every column defined for each table:

Value   Meaning
B       Column is the start of period BUSINESS_TIME
C       Column is the end of period BUSINESS_TIME with an exclusive endpoint
I       Column is the end of period BUSINESS_TIME with an inclusive endpoint
S       Column is the start of period SYSTEM_TIME
T       Column is the end of period SYSTEM_TIME
blank   Column is not used as either the start or the end of a period


So using this information, here is a query that will show information about all of the business-time temporal tables you have created:

SELECT SUBSTR(TBCREATOR,1,8) || '.' || SUBSTR(TBNAME,1,30) 
       AS TABLENAME,
       SUBSTR(NAME,1,40) AS COLUMNNAME, 
       COLNO, 
       PERIOD
FROM   SYSIBM.SYSCOLUMNS
WHERE  PERIOD IN ('B', 'C', 'I')
ORDER BY TABLENAME, PERIOD, COLUMNNAME;

If you want to find the system-time temporal tables, just swap out the WHERE clause with this one:

  WHERE PERIOD IN ('S', 'T')  
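
And since the title of this post asks how many temporal tables you have, here is a simple sketch that counts the distinct tables with a business-time period; swap the IN list for ('S', 'T') to count the system-time tables instead:

SELECT COUNT(*) AS NUM_BUSINESS_TIME_TABLES
FROM  (SELECT DISTINCT TBCREATOR, TBNAME
       FROM   SYSIBM.SYSCOLUMNS
       WHERE  PERIOD IN ('B', 'C', 'I')) AS TEMPORAL_TABS ;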


By becoming adept at querying the Db2 Catalog tables you can find out just about everything you want to know about the databases and objects defined in your Db2 subsystems!

Tuesday, June 15, 2021

Db2 12 for z/OS Function Level 510

I'm a little late with this Db2 function level update, but better late than never, right?

In April 2021, IBM introduced a new function level, FL510, for Db2 12 for z/OS. If you want to take a look at the announcement for it, you can read it here, but there really isn't a lot to it.

Unlike all the other function levels, FL510 does not add any new features or capabilities, nor does it introduce any new changes to the Db2 Catalog. So what does it do?

This function level is basically there to prepare for the next new release of Db2, which will obviously be coming soon, or IBM would not have created this function level for it!  So it is time to start thinking about Db2 Next and getting ready for a new release/version of our favorite DBMS!

But we really haven't answered what FL510 does, have we? It is a housekeeping type of function level. When you activate FL510 it verifies and enforces several pre-migration conditions that have to be met before you can migrate to the next Db2 release. It will make sure that all Db2 12 function levels are activated and that all catalog updates for Db2 12 have been applied. This means that the Db2 catalog level is at the last catalog level for Version 12 and any future migration can therefore proceed.

Additionally, FL510 will check to make sure that your application packages were rebound recently enough to ensure that they are supported by the next Db2 release.

If any of the previous conditions are not met, then the activation of FL510 will fail. You will have to remediate your system and try to activate FL510 again before you can move forward to the new release.

Also, please be aware that FL510 has nothing to do with the fallback SPE that will have to be applied before moving forward with the eventual, new Db2 release. IBM will deliver the fallback SPE in a subsequent APAR at a later point in time.

So I guess that this is a boring function level in that it delivers no new functionality... but it is exciting as it is a prerequisite for a new Db2 release that is on the horizon!

Sunday, May 09, 2021

Thinking About the Mainframe, the Cloud, and IBM Think 2021

A Bit about Think

I am looking forward to attending the IBM Think 2021 conference, IBM's annual flagship technology event. I have attended several in-person Think events, as well as last year’s virtual conference, and I always come away with new knowledge and additional insight into technology and IBM’s vast portfolio of hardware, software, and solutions. The Think conference is always one of the tech highlights of the year for me!

This year’s event, IBM Think 2021, is again being held as a virtual conference, May 11 and 12, 2021. And it is free of charge, which means that you can experience all the great education, presentations, and networking opportunities without having to leave your desk.

My favorite aspect of the Think conference is the breadth and scope of pertinent technical content that it covers. Whether you are a developer, a DBA, a data scientist, a manager, an executive, or any flavor of IT or business specialist, there will be a wealth of useful information presented to educate you and make you “think.”  Be sure to register here.

My Think 2021 Agenda

There are multiple sessions to be delivered at this year’s IBM Think conference that intrigue me because they focus on areas where I specialize.  For example, Dr. Dario Gil, SVP and Director of IBM Research will be delivering a keynote session on IT infrastructure which is sure to be educational. This session, 2081, offers a deep dive into the IBM innovations powering the next generation of hardware, including IBM Z.

Another session I am looking forward to is session 2303 focusing on security “everywhere.” It features IBM luminaries like Tom Rosamilia, Senior Vice President, IBM Systems, and Mary O’Brien, General Manager IBM Security. And Forrester Research Director, Lauren Nelson, will also be lending her industry expertise to the session.

But I think the Think 2021 session I am most looking forward to is The IBM Z roadmap for hybrid cloud and AI (session 1605) featuring Ross Mauri General Manager for IBM Z. Mauri promises to offer a timely discussion on the business value of integrating the IBM Z platform as a full participant into your hybrid cloud. And he’ll speak with Russell Plew, Technology Senior Manager at M&T Bank who will discuss their real-life experiences in doing so!

Why is this session so interesting to me? Well, I’ve worked with the mainframe my entire career, and as anybody who works on the mainframe knows, the IBM Z platform is used to drive mission-critical workloads across all major industry sectors, worldwide. If your organization needs to perform large-scale transaction processing (thousands of transactions per second), support thousands of users and programs concurrently, manage terabytes of information, and handle large-bandwidth communication, chances are you rely on the mainframe to do that because the platform excels at all of those things.

If you’ve ever deposited a check into your bank account, booked a flight on an airline, or used a credit card to purchase something, it is probable that a mainframe was involved in completing that activity!

Ever since Stewart Alsop of InfoWorld predicted the last mainframe would be unplugged on March 15, 1996 there has been a lingering perception that the mainframe would go away at some point. But here we are, 25 years later, and the mainframe is still going strong! At last year’s IBM Think conference IBM presented the following statistics on the mainframe’s ubiquity and power:

      70% of the Fortune 500 use mainframes and 72% of customer-facing applications are dependent on the mainframe for some or all data processing.

      Mainframes are designed to be able to process a trillion web transactions a day with the capability to process 1.1 million transactions per second.

      95% of transactions in the banking, insurance, airline and retail industries are handled by mainframes.


Indeed, the mainframe continues to offer a strong, unparalleled platform for performance, security, and reliability. Of course, the mainframe has changed and grown over its 50+ year lifespan. Today’s IBM z15 is light-years beyond the original IBM System/360 introduced in 1964. Some of the great newer capabilities of the IBM Z include encryption everywhere with pervasive encryption and Data Privacy Passports, rack-mountable mainframes, Instant Recovery, and cloud-native development. I’m looking forward to hearing how IBM’s customers have taken advantage of these, and other, capabilities to integrate the IBM Z into their hybrid cloud architecture.

It only makes sense that businesses relying on the mainframe will continue to do so, even as they embrace cloud computing. This is what the “hybrid” in the term hybrid cloud implies: an IT infrastructure that uses a mix of on-premises and private/public cloud from multiple providers. This approach makes the most sense because everything can’t shift to the cloud immediately (perhaps ever): most existing applications were not built with the public cloud in mind, and it would take a lot of investment to re-engineer them to properly take advantage of a public cloud architecture. And even if you wanted to move everything, cloud service providers (CSPs) can’t build out their infrastructure fast enough to immediately absorb all the existing data center capacity “out there.”

So, it will be exciting to watch IBM continue to innovate on the IBM Z platform as enterprise customers work to integrate Z as a vital component of their hybrid cloud infrastructure. With the large investment enterprises have in their working mainframe applications, large data sets and databases containing crucial data, and high-volume processing requirements, they will continue to rely on the mainframe well into the future… and that makes it important to understand how IBM is enabling the IBM Z to participate in your hybrid cloud architecture.

So, join me at Think 2021 for session 1605 to learn how to use your investments in IBM Z and build and modernize applications into container-based workloads using a common DevOps experience. And stick around for other sessions to gain insights on harnessing the full value of IBM hardware, software and services in your organization as you continue to support, manage, and transform traditional business and IT operations.


Wednesday, April 07, 2021

Happy Birthday to the IBM Mainframe

I am older than the mainframe... I turned 58 on April 3rd, and the IBM mainframe officially celebrates its 57th birthday today, April 7th.

The IBM 360 was launched on April 7, 1964 and the world of enterprise computing has never been the same.

Here are a few links and articles to check out as we celebrate the ongoing vitality of mainframe computing:

So, all of you mainframe users out there, today is indeed a day to celebrate... another year has gone by, and mainframes are still here... running the world!

Sunday, March 14, 2021

Db2 12 for z/OS Function Level 509

Late last month, February 2021, IBM introduced a new function level, FL509, for Db2 12 for z/OS. You can find in-depth details here.

But if you are looking for a high-level synopsis, read on! 

There are several interesting new capabilities introduced with this function level, but perhaps the most important thing for organizations to know is that FL509 introduces no new incompatible changes or deprecations.

Okay, so what’s new here? The first thing to report is an improvement to data security with tamper-proof audit policies. This means that an audit policy cannot be changed, or even stopped, unless requested by an authorized user. And the authorization must be via a z/OS security product (such as IBM’s RACF), not Db2.

This capability provides another step in the separation of duties required for proper auditing. In other words, the audited must not be the controller of the audit policy or auditing capabilities. It also protects administrative users from mistakenly modifying audit policies.

The next new capability delivered by FL509 is high-availability accelerator-only tables. Accelerator-only tables (AOTs) are those defined to the IBM Db2 Analytics Accelerator only, and not in the base Db2 for z/OS. Queries and DML statements issued against AOTs are always routed to an accelerator (because the data does not exist anywhere else).

So, what are high availability AOTs? Well, FL509 delivers the capability to define an accelerator-only table in more than one accelerator. This can improve availability and with workload balancing a query can be rerouted to another available accelerator if the target accelerator is not available.

Also as of FL509, you can specify a compression algorithm at the table, table space, or partition level. This means you can explicitly choose either the fixed-length or Huffman compression algorithm using CREATE and ALTER statements. The Db2 catalog is updated to indicate the compression algorithm used for each object.
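
To give you an idea of what this looks like, here is a sketch of the DDL using hypothetical object names; check the Db2 SQL Reference for the exact clauses supported at your level, and remember that the new syntax requires the appropriate application compatibility setting:

-- Request Huffman compression for an existing table space
ALTER TABLESPACE DBNAME01.TSNAME01
  COMPRESS YES HUFFMAN ;

-- Or explicitly request the traditional fixed-length algorithm
ALTER TABLESPACE DBNAME01.TSNAME02
  COMPRESS YES FIXEDLENGTH ;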

Finally, FL509 delivers enhanced temporal RI. What this means is that certain restrictions on UPDATE and DELETE statements, relating to the temporal RI originally introduced in Db2 12, are removed.

To elaborate, once FL509 is active, when an UPDATE statement with a FOR PORTION OF clause attempts to update the parent table in a temporal RI relationship, the update is allowed as long as the rules of temporal RI are not violated. Likewise, when a DELETE statement with a FOR PORTION OF clause attempts to delete from the parent table in a temporal RI relationship, the deletion is allowed, as long as the rules of temporal RI are not violated.
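
For example, an update like the following sketch (hypothetical table and column names) is now permitted against a temporal RI parent table, provided the temporal RI rules are not violated:

UPDATE POLICY
   FOR PORTION OF BUSINESS_TIME FROM '2021-01-01' TO '2021-07-01'
   SET COVERAGE_AMT = 150000
 WHERE POLICY_ID = 'A1234' ;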

At any lower application compatibility level, such UPDATE or DELETE statements for a parent table in an RI relationship will fail (with SQLCODE -4736).

Summary

Now that IBM is using function levels to deliver significant new capabilities for Db2 12 for z/OS, it is imperative that your organization keeps up-to-date on this new functionality and determines where and when it makes sense to introduce it into your Db2 databases and applications.

Also, be aware that if you are not currently running at FL508, moving to FL509 activates all earlier function levels. You can find a list of all the current function levels here.

 

Thursday, January 07, 2021

BMC AMI Ops: The Next Generation of Mainframe Systems Management

Assuring the performance of your mainframe systems and applications is an imposing task that keeps getting more complex all the time. It makes sense to arm your IT performance analysts, DBAs, and systems programmers with modern tools so you can optimize performance and thereby deliver superior service to your customers.

Of course, BMC MainView has helped IT professionals manage the performance of their mainframe systems and applications for years. But there are new challenges facing modern organizations that require adaptation and transformation.

Organizations are transforming to become autonomous digital enterprises (ADE). This means that things are getting more complex because availability requirements are expanding (many times requiring 24/7 availability), but IT pros are expected to resolve problems rapidly even as workloads become more unpredictable and IT staff has less experience. These challenges are real and require attention.

And that is why BMC is transforming its MainView product line into BMC AMI Ops!

With BMC AMI Ops you can experience next-level mainframe operational resiliency, AI-powered observability, an intuitive user interface with embedded expertise, actionable insights, and enterprise platform interoperability.

How is BMC AMI Ops engineered to help? Well, it is built for digital business with the understanding that being reactive is not sufficient these days. BMC AMI Ops provides a complete, modular solution with central administration and management.

Artificial intelligence and machine learning techniques are being embraced by an increasing number of organizations for improving their business, so it only stands to reason that your IT operations and support functions should be looking to improve their capabilities using AI and ML, too. And BMC AMI Ops helps you to do that because it is infused with AI/ML-powered analytics to find and fix problems before business services are impacted. With BMC AMI Ops you can improve performance and availability by taking advantage of its built-in intelligent automation and remediation features.

And the user interface is brand new, engineered to support ease of use, to present information instead of raw data, and to guide the user experience. BMC AMI Ops delivers a custom dashboard approach where you can group widgets together for related logical systems or business areas. And you get “out of the box” health indicators for each of the widgets you deploy, meaning it takes less time to be productive right away. Furthermore, a guided path is provided so the user can drill down into additional details as needed. If you are interested in seeing more details on the new user experience for BMC AMI Ops, check out this blog post from Shay Alsberg (BMC AMI Ops: Evolving the MainView User Experience).

And not to fear: for those of you experienced mainframe pros who not only know how to drive ISPF panels but prefer them, BMC AMI Ops can still be accessed using character-based panels.

The bottom line is that BMC AMI Ops is designed for modern businesses and IT organizations embracing digital transformation to become autonomous digital enterprises, delivering a simplified yet customizable systems management experience for optimizing your system and application performance. That’s BMC AMI Ops in a nutshell… and it is worth looking into how BMC AMI Ops can help you to improve the performance of your systems and applications.

Friday, January 01, 2021

Happy New Year 2021!

Well, here it is, the day we've all waited for since about March of last year... the dawning of a new year. 

Happy New Year 2021!

Good riddance to 2020 and all of the problems we faced and hello to a brand new year that, of course, will bring new problems and issues, but hopefully not on the scale we dealt with last year!

Here's hoping that the COVID vaccination process works well and that we can all get back to something resembling normal this year. I, for one, am looking forward to attending some tech conferences in person later this year. For example, I'd sure like to attend an IDUG event, the IBM Think conference, and Teradata Analytics Universe in person this year. Hopefully, one or more of those events will happen! 

If not in person, then I'll happily attend a virtual event until things are safe.

And I hope that everybody out there has been able to relax and enjoy this holiday season... and will soon be ready to dive back in and tackle the new year. 

Cheers!

Thursday, December 17, 2020

Db2 Utilities and Modern Data Management

Db2 utilities are the unappreciated, and often overlooked, workhorses of your mainframe Db2 environment. They perform the dirty work that has to be done to populate, organize, back up, and recover your vital mainframe data. Without them, building effective Db2 databases, managing data, optimizing performance, and even accessing mainframe data would be a lot more difficult than it currently is.

The Situation 
Think about the Db2 utility situation at your shop. If you are like most organizations you will have Db2 utilities regularly running all the time. There are load and unload tasks running to refresh data for development and testing, for moving data between environments for analysis and processing, and for various other purposes. The LOAD and UNLOAD utilities bear a lot of the hard work for data movement. 

You are also most likely reorganizing data using a REORG utility for most of your Db2 table spaces, and probably indexes, too. In many cases reorganization jobs are scheduled to run on a regular basis: weekly, monthly, quarterly, etc. Frequently you just set these jobs up when the object is created. The jobs get scheduled and are run without anybody taking a look at them unless, or until, there are performance problems. 

Then there are COPY and RECOVER utilities for backing up and recovering data when there are problems. The image copy backup jobs are running all the time, taking either full or incremental copies to ensure that you can recover data in case problems are encountered. The copies are running all the time, but the recover jobs (hopefully) are not running all the time! 

You are also going to be running the RUNSTATS utility to gather statistics for Db2 to use for query optimization. Depending on how often your data changes, you may be running RUNSTATS frequently or infrequently. Many times the same fate as REORG befalls RUNSTATS… that is, it is scheduled and forgotten about unless problems arise. 

There are other utilities, like CHECK, which is used to verify the integrity of data. You are probably not running this one very often, but when you need it you want it to run fast, right? 

So, all of these utilities are “out there” running and consuming CPU to move, copy, and manage your Db2 data. But are they being run as effectively as possible? 

Moving to the Modern Db2 Utility Way 
I think by this point everybody will agree that utility type processing is not just critical, but mandatory for a Db2 environment. But just running with the bare basics is not the best approach. 

If we think about data movement with unload and load processing there are several things that you might want to consider for improvement. First of all, consider the speed and performance of the unload and load tasks. You probably want these jobs to run as fast as possible – that is, to consume as little elapsed time as possible to complete. After all, you are probably using these utilities to build environments or even refresh portions of an environment… and there will be developers and testers waiting to use that data as soon as it is available. Using the fastest utility programs available will minimize the wait time and make your developers and testers more productive. Furthermore, you want these tasks to consume as little CPU as possible to reduce your monthly mainframe bills! 

In some cases you might want to re-consider unloading and loading altogether, using alternate utilities and offerings that can clone an entire subsystem or move data outside the control of Db2 at the data set level. 

If we think about reorganization, it is likely that you are running REORG tasks that don’t need to be run, at least not as regularly as they are being run. At the same time, it is also likely that you are not running other REORG tasks as frequently as you should, thereby causing every other task that accesses the data to degrade. Fortunately, you can use RTS (real-time statistics) to help guide when you should (and should not) reorganize your data. In the best case the utility itself relies on RTS to figure out if it needs to run, and runs only when it makes sense. Failing this, you are again likely consuming more CPU than is necessary (either running unneeded REORGs or accessing poorly organized data, as the case may be). 
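
To give you an idea, here is a minimal sketch of an RTS query that flags table spaces that may be REORG candidates; the 10 percent thresholds are purely illustrative, so adjust them to your own standards:

SELECT DBNAME, NAME, PARTITION, REORGLASTTIME,
       REORGINSERTS, REORGDELETES, REORGUNCLUSTINS, TOTALROWS
FROM   SYSIBM.SYSTABLESPACESTATS
WHERE  TOTALROWS > 0
AND   (REORGUNCLUSTINS * 100 > TOTALROWS * 10
  OR  (REORGINSERTS + REORGDELETES) * 100 > TOTALROWS * 10)
ORDER BY DBNAME, NAME, PARTITION ;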

If you think about your backup and recovery situation, the issue is likely complexity. Sure you want COPY and RECOVER utilities that run fast and consume minimal CPU, but the big issue is analysis. By that I mean, when you need to recover you want to make sure that you can use the image copies (and, of course, the log) to recover and meet your RTOs (recovery time objectives). But creating recover jobs on-the-fly, in a probably complicated environment with inter-related tables and data, can be difficult. And doing so when there is an outage, which is usually the case, exacerbates the situation. Using intelligent utilities to create the right image copies and to automatically build an appropriate recovery strategy when needed should be the modern approach.

And not to neglect RUNSTATS and CHECK, you want both of those utilities to run as fast as possible, consuming minimal CPU, too. And you want guidance on when and how to run them using available RTS, statistics, and any system information available. 

What Can You Do? 
One approach is to use modern utilities, not only built for speed but that incorporate AI and machine learning to automate and improve the Db2 utility experience. BMC Software is once again in the vanguard with its BMC AMI Utilities for Db2.

The first question you probably have is "What the heck is AMI?" Well, AMI, which stands for Automated Mainframe Intelligence, is technology that is being infused into BMC’s product line to leverage AI, machine learning, and predictive analytics to achieve a self-managing mainframe. 

BMC AMI Utilities for Db2 are designed for modern complex Db2 environments. They use a centralized, intelligent architecture (see diagram below) designed specifically to handle the complexity facing IT today. Through intelligent policy-driven automation, you can use the AMI Utilities for Db2 to manage growing amounts of data with ease and, at the same time, deliver full application availability. 

Figure 1. BMC AMI Utilities for Db2



If you are looking to reduce CPU and elapsed time by as much as 75%, eliminate downtime while delivering full application availability, lower disk usage, eliminate sort in your REORGs, and simplify complex utility operations, then it makes sense to take a look at the BMC AMI Utilities for Db2. 


----------

You might also want to take a look at this blog post from BMC that discusses how to Save Time and Money with Updated Unload Times. 

And this analysis of the BMC next-generation REORG technology from Ptak Associates.

Wednesday, November 18, 2020

Deleting "n" Rows From a Db2 Table

I regularly receive database- and Db2-related questions via e-mail. And that is great, but I don't always get a chance to respond to everything. If you've sent me a question and I haven't replied, I apologize. But every now and then, I will use one of the e-mail questions in my in-box and write about it in the blog. Today's question is this:

How do you delete N rows from a Db2 table?

Also, how do you retrieve bottom N rows from a Db2 table without sorting the table on key?

And here is my response:

First things first, you need to refresh your knowledge of "relational" database systems and Db2. There really is no such thing as the "top" or "bottom" N rows in a table. Tables are sets of data that have no inherent logical order.

With regard to the result set though, there is a top and a bottom. You can use the FETCH FIRST N ROWS ONLY clause to retrieve only the first N rows, but to retrieve only the bottom N rows is a bit more difficult. For that, you would have to use scrollable cursors.
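
For example, retrieving the "top" N is straightforward; this sketch assumes the Db2 sample employee table (the schema for the sample tables may differ on your system) and lets the ORDER BY define what "top" means:

SELECT EMPNO, LASTNAME, SALARY
FROM   DSN8C10.EMP
ORDER BY SALARY DESC
FETCH FIRST 10 ROWS ONLY ;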

A scrollable cursor allows you to move back and forth through the result set without first having to read/retrieve all of the rows before it. I suggest that you read up on scrollable cursors in the Db2 SQL Reference manual and the Db2 Application Programming manual. All Db2 manuals can be downloaded in Adobe PDF format for free from the IBM website.

Basically, you would want to FETCH LAST from the scrollable cursor and then loop through with a FETCH PRIOR statement, executing the loop N-1 times. That would give you the "bottom" N of any result set -- sorted or not.
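
Here is a rough sketch of the embedded SQL involved, again using the sample employee table and hypothetical host variables; the FETCH PRIOR would be repeated in your program logic N-1 times:

DECLARE CSR1 INSENSITIVE SCROLL CURSOR FOR
  SELECT EMPNO, LASTNAME
  FROM   DSN8C10.EMP
  ORDER BY LASTNAME ;

OPEN CSR1 ;

FETCH LAST  FROM CSR1 INTO :HV-EMPNO, :HV-LASTNAME ;
FETCH PRIOR FROM CSR1 INTO :HV-EMPNO, :HV-LASTNAME ;   -- repeat N-1 times
CLOSE CSR1 ;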

As for your other question, I am confused as to why you would want to delete N rows from a table. Doesn't it matter what the data in the rows is? My guess is that you are asking how you would limit a DELETE to a subset of the rows that would apply to the WHERE condition of the DELETE. The answer is, you cannot, at least not without writing some code.

You would have to open a cursor with the same WHERE conditions specifying FOR UPDATE OF. Then you would FETCH and DELETE WHERE CURRENT OF cursor for that row in a loop that occurs N times. Of course, that means you have to write a program to embed that SQL in.
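
A rough sketch of that approach, using a hypothetical table and predicate, might look like this; the FETCH/DELETE pair is executed inside a program loop N times:

DECLARE CSR2 CURSOR FOR
  SELECT ORDER_ID
  FROM   PROD.ORDERS
  WHERE  ORDER_STATUS = 'X'
  FOR UPDATE OF ORDER_STATUS ;

OPEN CSR2 ;

-- program loop, executed N times:
FETCH CSR2 INTO :HV-ORDER-ID ;
DELETE FROM PROD.ORDERS WHERE CURRENT OF CSR2 ;

CLOSE CSR2 ;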

Hope this answer helps...

Wednesday, October 21, 2020

Automation and the Future of Modern Db2 Data Management

Recently I was invited by BMC Software to participate in their AMI Z Talk podcast series to talk about modern data management for Db2... and I was happy to accept.

Anne Hoelscher, Director of R+D for BMC's Db2 solutions, and I spent about 30 minutes discussing modern data management, the need for intelligent automation, DevOps, the cloud, and how organizations can achieve greater availability, resiliency, and agility managing their mainframe Db2 environment.

Here's a link to the podcast that you can play right here in the blog!


Modern data management, to me, means flexibility, adaptability, and working in an integrated way with a team. Today’s data professionals have to move faster and more nimbly than ever before. This has given rise to agile development and DevOps - and, as such, modern DBAs participate in development teams. And DBA tasks and procedures are integrated into the DevOps pipeline. 

And as all of this DevOps adoption is happening, the amount of data we store, and have to manage, continues to grow faster than ever before.

These are just some of the challenges that Anne and I discuss in this podcast... and at the end, Anne even asks me to predict the future... 

I hope you'll take the time to listen to our discussion and share your thoughts and issues regarding the resiliency and agility required to succeed with modern data management and Db2 for z/OS.

----------

I’d also like to extend an offer to all the listeners of this BMC podcast (and readers of this blog post) to get a discount on my latest book, A Guide to Db2 Performance for Application Developers. The link is https://tinyurl.com/craigdb2

There’s also a link to the book publisher on the home page of my website. Once you are there, click on the link/banner for the book, and when you order from the publisher you can use the discount code 10percent to get 10% off your order of the print or ebook.

 


Monday, October 19, 2020

Improving Mainframe Performance with In-Memory Techniques

 A recent, recurring theme of my blog posts has been the advancement of in-memory processing to improve the performance of database access and application execution. I wrote an in-depth blog post, The Benefits of In-Memory Processing, back in September 2020, and I definitely recommend you take a moment or two to read through that to understand the various ways that processing data in-memory can provide significant optimization.

There are multiple different ways to incorporate in-memory techniques into your systems ranging from system caching to in-memory tables to in-memory database systems and beyond. These techniques are gaining traction and being adopted at increasingly higher rates because they deliver better performance and better transaction throughput.

Processing in-memory instead of on disk can have a measurable impact on not just the performance of your mainframe applications and systems, but also on your monthly software bill. If you reduce the time it takes to process your mainframe workload by more effectively using memory, you can reduce the number of MSUs you consume to process your mission-critical applications. And depending upon the type of mainframe pricing model you deploy, you can either be saving now or be planning to save in the future as you move to Tailored Fit Pricing.

So it makes sense for organizations to look for ways to adopt in-memory techniques. With that in mind, I recommend that you plan to attend this upcoming IBM Systems webinar titled The benefits and growth of in-memory database and data processing to be held Tuesday, October 27, 2020 at 12:00 PM CDT.

This presentation features two great speakers: Nathan Brice, Program Director at IBM for IBM Z AIOps, and Larry Strickland, Chief Product Officer at DataKinetics.

In this webinar Nathan and Larry will take a look at the industry trends moving to in-memory, help to explain why in-memory is gaining traction, and review some examples of in-memory databases and alternate in-memory techniques that can deliver rapid transaction throughput. And they’ll also look at the latest Db2 for z/OS features like FTBs, contiguous buffer pools, fast insert and more that have caused analysts to call Db2 an in-memory database system.

Don’t miss this great session if you are at all interested in better performance, Db2’s in-memory capabilities, and a discussion of other tools that can aid you in adopting an in-memory approach to data processing.

Register today by clicking here!

Wednesday, October 14, 2020

Db2 12 for z/OS Function Level 508

This month, October 2020, IBM introduced the latest new function level, FL508, for Db2 12 for z/OS. This is the second new function level this year (the first came out in June and you can learn more about it here).


For those who don't know, the Function Level process was designed by IBM for releasing new Db2 functionality using Continuous Delivery (CD) in short, quick bursts, instead of waiting for new versions (or releases). 

With FL508, IBM adds support for moving tables from multi-table table spaces, both simple and segmented, to partition-by-growth (PBG) universal table spaces (UTS). For an overview of UTS capabilities and types, check out this blog post I made earlier this year: Know Your Db2 Universal Table Spaces.

Multi-table table spaces are deprecated functionality, which means that even though they are still supported, they are on their way out. So it makes sense for IBM to give us a better way to convert them to PBG UTS without having to experience an outage. And that is just what FL508 delivers.

This is accomplished in FL508 by enhancements to the ALTER TABLESPACE statement. A new option, MOVE TABLE, is delivered which, as you might expect from its name, can be used to move a table from its current table space to a target table space. 

If, as you would expect in most cases, the source table space data sets are already created, the changes made by MOVE TABLE are pending changes and a REORG must be run on the source table space (the current one you are moving from) to materialize the change. Of course, this is an online REORG, so no outage is required.

The target table space must already exist as a PBG UTS in the same database as the current, source multi-table table space. Furthermore, the PBG UTS must be defined with MAXPARTITIONS 1 and DEFINE NO, and with LOGGED (or NOT LOGGED) and CCSID values that match the current, existing table space. You can move only one table per ALTER TABLESPACE statement, meaning that each table in a multi-table table space must be moved with a separate ALTER TABLESPACE execution. However, because the changes are pending, you can issue multiple ALTER TABLESPACE statements, one for each table in the multi-table table space, and wait until they have all completed successfully before materializing all of the changes with a single REORG run. 
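
Pulling it all together, a sketch of the process using hypothetical names might look like the following; remember that the target must live in the same database and match the source's LOGGED and CCSID attributes:

-- Create the target PBG UTS (data sets not allocated until needed)
CREATE TABLESPACE TSNEW01 IN DBPROD01
  MAXPARTITIONS 1
  DEFINE NO ;

-- One pending ALTER per table being moved out of the multi-table table space
ALTER TABLESPACE DBPROD01.TSMULTI1
  MOVE TABLE MYSCHEMA.CUSTOMER
  TO TABLESPACE TSNEW01 ;

-- Materialize the pending change(s) with an online REORG of the source, e.g.:
-- REORG TABLESPACE DBPROD01.TSMULTI1 SHRLEVEL CHANGE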

It seems simple, and the functionality is nice, but don't just go moving tables all over the place willy-nilly once you get this capability in FL508. IBM has documented the things to take care of before you begin to move tables using ALTER TABLESPACE. Check out the IBM recommendations here.

It is also worth mentioning that you still need to keep in mind the impact that moving all tables from multi-table table spaces into their own table space will have on the system. By that I mean, you have to consider the potential impact on things like the number of open data sets (DSMAX ZPARM), DBD size, EDM pool size, and management issues (number of utility jobs, for example).

But it is nice that we now have a reasonable approach for moving tables out of deprecated multi-table table spaces so we can begin the process of moving them before they are no longer supported. A lot of shops "out there" have been waiting for something like this and it is likely to cause FL508 to be adopted quickly.

Let me know what you think by commenting below...




Wednesday, October 07, 2020

IDUG 2020 EMEA Db2 Tech Conference Goes Virtual

For those of you who have attended an IDUG conference before you know why I am always excited when IDUG-time rolls around again. And the EMEA event is right around the corner!

Participating in an IDUG conference always delivers a boatload of useful information on how to better use, tune, and develop applications for Db2 – both for z/OS and LUW. IDUG offers phenomenal educational opportunities delivered by IBM developers, vendor experts, users, and consultants from all over the world.

Unfortunately, due to the COVID-19 pandemic, in-person events are not happening this year... and maybe not for some time to come, either. But IDUG has gone virtual, and it is the next best thing to being there! The IDUG EMEA 2020 virtual event will take place November 16–19, 2020. So you have ample time to plan for, register, and attend this year.

If you attended the IDUG North American virtual conference earlier this year, you know that you can still get great Db2 information online at an IDUG event. And there are a ton of great presentations at the EMEA virtual IDUG conference – just check out the great agenda for the event.

Of course, a virtual event does not offer the face-to-face camaraderie of an in-person event, but it still boasts a bevy of educational opportunities. And the cost is significantly less than a traditional IDUG conference: the up-front registration fee is lower, and there are no travel costs... 

For just $199, you get full access to the virtual conference, as well as a year-long premium IDUG membership and a complimentary certification or proctored badging voucher. If you're already a premium member, you can add the EMEA 2020 Conference access to your membership for just $99.

You can register here https://www.idug.org/p/cm/ld/fid=2149

The Bottom Line

So whether you are a DBA, a developer, a programmer, an analyst, a data scientist, or anybody else who relies on and uses Db2, the IDUG EMEA Db2 Tech Conference will be the place to be this November 2020. 

With all of this great stuff available online from this IDUG virtual event, why wouldn’t you want to participate?


Thursday, September 17, 2020

Convert Your COBOL Db2 Programs to Java Without Rebinding

 As most Db2 developers and DBAs know, when you modify a Db2 program you have to prepare the program to enable it to be executed. This program preparation process requires running a series of code preprocessors that—when enacted in the proper sequence—creates an executable load module and a Db2 application package. The combination of the executable load module and the application package is required before any Db2 program can be run, whether batch or online.

But it is not our intent here to walk through and explain all of the steps and nuances involved in Db2 program preparation. Instead, we are taking a look at the impact of converting COBOL programs to Java programs, particularly when it comes to the need to bind as a part of the process.

We all know that issuing the BIND command causes Db2 to formulate access paths for SQL. If enough things (statistics, memory, buffers, etc.) have changed, then access paths can change whenever you BIND or REBIND. And this can be troublesome to manage.

But if the SQL does not change, then it is not technically necessary to bind to create a new package. You can prevent unnecessary BIND operations by comparing the new DBRM from the pre-compile with the previous version. Of course, there is no native capability in Db2 or the BIND command to compare the DBRM. That is why there are third-party tools on the market that can be used for this purpose.

But again, it is not the purpose of today’s post to discuss such tools. Instead, the topic is converting COBOL to Java. I have discussed this previously in the blog in the post Consider Cross-Compiling COBOL to Java to Reduce Costs, so you might want to take a moment to read through that post to acquaint yourself with the general topic.

Converting COBOL to Java and BIND

So, let’s consider a COBOL program with Db2 SQL statements in it. Most COBOL uses static SQL, meaning that the access paths are determined at bind time, not at execution time. If we convert that COBOL program to Java then we are not changing the SQL, just the code around it. Since the SQL does not change, then a bind should not be required, at least in theory, right?

Well, we first need to get into a quick discussion about types of Java programs. You can use either JDBC or SQLJ for accessing Db2 data from a Java program. With JDBC the program will use dynamic SQL whereas SQLJ will deliver static SQL. The Db2 BIND command can be issued using a DBRM (precompiler output) or a SQLJ customized profile. 

So, part of the equation to avoid binding is to utilize SQLJ for converted COBOL programs.

CloudFrame, the company and product discussed in the referenced blog post above, can be used to convert COBOL programs into modular Java. And it uses SQLJ for the Db2 access. As such, with embedded SQLJ, static SQL will be used and the access paths will be determined at bind time instead of execution time.

But remember, we converted business logic, not SQL. The same SQL statements that were used in the COBOL program can be used in the converted Java. CloudFrame takes advantage of this and re-purposes the existing package from the previous COBOL program for the new Java SQLJ program. CloudFrame automates the entire process as part of the conversion from COBOL to Java. This means that the static SQL from the COBOL program is converted and customized into SQLJ in Java. This is a built-in capability of CloudFrame that allows you to simply reuse the same package information that was already generated and bound earlier.

This means no bind is required when you use CloudFrame to convert your Db2 COBOL applications to Java… and no access paths will change. And that is a good thing, right? Conversion and migration are already time-consuming processes; eliminating performance problems due to changing access paths means one less issue to worry about during a COBOL to Java conversion when you use CloudFrame.

Wednesday, September 16, 2020

Planet Db2 is Back!

For those of you who were fans of the Planet Db2 blog aggregator, you’ll be happy to know that it is back up and operational, under new management.


For those who do not know what I am talking about, for years Leo Petrazickis curated and managed the Planet Db2 blog aggregator. Leo provided a great service to the Db2 community, but unfortunately, about a year ago he had to discontinue his participation in the site. So Planet Db2 has been gone for a while. But it is back now!

Before I continue, for those who don’t know what a blog aggregator is, it is a service that monitors and posts new blog content on a particular topic as it is published. This means that whenever any blog that is being tracked by the aggregator posts new content, it is highlighted with a link to the blog post on the aggregator site. The benefit is that you can watch the blog aggregator page (in this case Planet Db2) for new content instead of trying to monitor multiple blogs.

So if you are a Db2 DBA, programmer, user, vendor, or just an interested party, be sure to bookmark and visit Planet Db2 on a regular basis to monitor what’s new in the Db2 blogosphere. And if you write a Db2 blog, be sure to register your blog at the Planet Db2 site so your content is tracked and aggregated to Planet Db2… you’ll surely get more readers of your stuff if you do!