Monday, August 13, 2018

A Guide to Db2 Application Performance for Developers - New Book on the Way!

Those of you who are regular readers of my blog know that I have written one of the most enduring books on Db2, DB2 Developer's Guide, which has been in print for more than twenty years across six different editions.

Well, the time has come for me to write another Db2 book. The focus of this book is on the things that application programmers and developers can do to write programs that perform well from the beginning. 

You see, in my current role as an independent consultant focusing on Db2, I get to visit a lot of different organizations... and I get to see a lot of poorly performing programs and applications. So I thought: "Wouldn't it be great if there was a book I could recommend that would advise coders on how to ensure optimal performance in their code as they write their Db2 programs?"

I had a similar thought way back before I wrote my first book. At that time, when the only manuals available were printed and housed in binders, I thought "Wouldn't it be great if there was a single book that captured the essentials of what you need to know to administer and use DB2?" There really wasn't, so I wrote that book.

Well, again, there really isn't a book that focuses on just what programmers should know to write efficient programs. So I figured it was time to write another book. This one is called A Guide to Db2 Application Performance for Developers.





This book is written for all Db2 professionals, covering both Db2 for LUW and Db2 for z/OS. When there are pertinent differences between the two, they are pointed out in the text. The book's focus is on developing applications, not database and system administration, so it does not cover the things you don't do on a daily basis as an application coder (like reorgs, backups, monitoring, etc.). Instead, the book offers guidance on application development procedures, techniques, and philosophies for producing optimal code. The goal is to educate developers on how to write good application code that lends itself to optimal performance.

By following the principles in this book, you should be able to write code that does not require significant remedial, after-the-fact modifications by performance analysts. Follow these guidelines and your DBAs and performance analysts will love you!

The book does not rehash material that is freely available in Db2 manuals that can be downloaded or read online. It is assumed that the reader has access to the Db2 manuals for their environment (Linux, Unix, Windows, z/OS).

The book is not a tutorial on SQL; it assumes that you have knowledge of how to code SQL statements and embed them in your applications. Instead, it offers advice on how to code your programs and SQL statements for performance.
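
To give a flavor of the kind of advice I am talking about, consider this small hypothetical example using the familiar EMP sample employee table. Both statements "work," but the second asks Db2 for only the columns and rows the program actually needs, which is one of the simplest performance habits a developer can adopt:

-- Retrieves every column of every row; the program must filter and discard
SELECT *
FROM   EMP;

-- Retrieves only what is needed, letting Db2 do the filtering
SELECT EMPNO, LASTNAME, SALARY
FROM   EMP
WHERE  WORKDEPT = 'D11';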

What you will get from reading this book is a well-grounded basis for designing and developing efficient Db2 applications that perform well. 

Planned publication for this book is late September 2018. News of its publication and how to order will be on my web site when the book is available. 


NOTE
This new book is NOW AVAILABLE in both print and ebook formats. You can order it here: https://store.bookbaby.com//bookshop/book/index.aspx?bookURL=A-Guide-to-Db2-Performance-for-Application-Developers&b=p_bu-ba-or

Monday, August 06, 2018

Security, Compliance and Data Privacy – GDPR and More!

Practices and procedures for securing and protecting data are under increasing scrutiny from industry, government and your customers. Simple authorization and security practices are no longer sufficient to ensure that you are doing what is necessary to protect your Db2 for z/OS data. 

The title of this blog post uses three terms that are sometimes used interchangeably, but they differ in what they mean and imply. Data security refers to the protective measures we apply to prevent unauthorized access to computers, databases and websites. Then there is compliance, which describes the ability to act according to an order, set of rules or request; in this context we mean compliance with industry and governmental regulations. Finally, there is data privacy (or data protection): the relationship between the collection and dissemination of data, technology, the public expectation of privacy, and the legal and political issues surrounding them.

Data privacy and data security are sometimes used as synonyms, but they are not! Of course, they are related. A data security policy is put in place to protect data privacy. When an organization is trusted with the personal and private information of its customers, it must enact an effective data security policy to protect the data.  So you can have security without data privacy, but you can’t really have data privacy without security controls.
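
To make the distinction concrete, here is a small, hedged Db2 for z/OS sketch in which all of the table, column, and authorization names are hypothetical. The GRANT is a security control; the row permission goes a step further and enforces a privacy rule about which rows of personal data each user may see:

-- Security: authorize the service group to read customer data
GRANT SELECT ON CUST.CUSTOMER TO CSRGROUP;

-- Privacy: row access control (Db2 10 for z/OS and later) limits each
-- representative to the rows for his or her own region
CREATE PERMISSION CUST.REGION_ROWS
  ON CUST.CUSTOMER
  FOR ROWS WHERE REGION = (SELECT HOME_REGION
                           FROM   CUST.REP_PROFILE
                           WHERE  REP_ID = SESSION_USER)
  ENFORCED FOR ALL ACCESS
  ENABLE;

ALTER TABLE CUST.CUSTOMER
  ACTIVATE ROW ACCESS CONTROL;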

Security is a top-of-mind concern for most IT professionals, showing up in the top spot of many industry surveys that ask about the most important organizational initiatives. Indeed, the 2018 State of Resilience Report shows that security is the number one initiative for IT shops this year. That is a good thing… but you need to look a little deeper to find the reality…

Register for and attend my webinar with the same title as this blog post, Security, Compliance, and Data Privacy - GDPR and More! (August 9, 2018), to hear more about this. I will also talk about data breaches, regulatory compliance (with a special concentration on GDPR), the importance of metadata, and things you can do to address security issues at your shop, as well as take a closer look at Db2 for z/OS security issues, features, and functionality.

I hope to see you there on August 9th! Register and attend at this link.

Monday, July 16, 2018

Broadcom Set to Purchase CA Technologies for Close to $19 Billion

If you've been paying attention the past couple of days you no doubt will have heard that Broadcom, heretofore known for their semiconductors, has made a bid to acquire CA Technologies. I've been expecting something to happen ever since the rumors of a merger with BMC were rampant about a year ago.

Broadcom is offering an all-cash deal for CA, but many analysts and customers of both companies are questioning the synergy of the deal.

The general thinking, at least what I have seen in the news, is that Broadcom acquiring CA is "illogical." And I can see that point-of-view. Although Broadcom and CA are both ostensibly in the technology market, CA is in the enterprise software space, a completely different part of the technology industry than the semiconductor and components space occupied by Broadcom. 


The other aspect of this acquisition focuses on CA, which has been in a bit of a slump. Its stock price has bounced between the mid-20s and the mid-30s for the past 5 years (until this acquisition was announced). And CA's product portfolio is what it is. If you have ever dealt with CA you kind of know that there is not a lot of new and innovative functionality being added to its products. (To my CA friends: yes, this is a broad generalization, and I know that there have been some new things you've been adding, but CA has a reputation of being an acquirer, not an innovator.)


So, yes, this is a difficult acquisition to understand. That said, Broadcom probably has the cash for it since its attempted acquisition of Qualcomm, a deal worth over $100 billion, fell through back in March 2018. If Broadcom has a plan for taking advantage of CA's customer base – high-end enterprise accounts – and building out a core of hardware and software, the acquisition could work. The company bought Brocade last year to extend into the mobile and networking connectivity market. If Broadcom uses CA's assets and expertise to include the mainframe as part of its connectivity business -- and moves to further embrace the cloud and IoT -- the acquisition could make sense in the long term.


CA's mainframe products make up the bulk of its revenue, accounting for $2.2bn in the 2017-2018 financial year. The remainder of its enterprise software garnered $1.75bn, with $311m in services revenue. So the big nut in this acquisition is the mainframe solutions. What will Broadcom do with them? How will they fit into Broadcom's overall company and strategy? Are there plans to spin off just the mainframe business so it can operate more nimbly? Note to Broadcom: if you plan to do this, call me! You should call the spinoff Platinum Technology, Inc.


But who knows? My initial reaction was “that’s strange,” but after investigating it a bit I guess I can see some rationale for this acquisition.


With all of this on the table, keep in mind that most large acquisitions fail. And the business models of the two companies are wildly different. So there is a lot for Broadcom/CA to overcome.

As an outsider, I'll enjoy watching this unfold.

If you are a CA customer, let us know what you think about this. Will it be good or bad for the products? And how are you and your company planning to react?

Wednesday, June 20, 2018

Fast and Effective Db2 for z/OS Test Data Management with BCV5


Perhaps the most significant requirement for coding successful Db2 application programs is having a reasonable set of test data to use during the development process. Without data, there is no way to test your code to make sure it runs. But there are many issues that must be overcome in order to have a useful test data management process. Today we will talk about this along with a key enabling component of such a process, BCV5 from UBS Hainer.

One of the first things that organizations try is to make a copy of the production databases for testing. But this is easier said than done. You cannot just stop your production databases to make a copy of them for testing. But you still want a fast, consistent copy of the data; consistent in terms of units of work and referential integrity. And maybe you just want some of the data, not all of it. And we haven't even talked about the potential regulatory concerns if you are copying personally identifiable information.

When you initially go to build your test data environment, the tools at your disposal are likely the utilities that came with Db2. This means that you will start with solutions like unloading and loading the data. But the LOAD and UNLOAD utilities are not known for their speed, so this can take a long time to accomplish – both for the initial creation and for any subsequent refreshing of the test data. This is important because test data must be refreshed on a regular basis as application testing is performed. Without the capability to refresh, it is impossible to compare test runs and develop your programs consistently.

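For illustration, here is a minimal sketch of that traditional utility-based approach; the database, table space, and table names are all hypothetical, and in practice you would need a pair of statements like this for every table being copied:

UNLOAD TABLESPACE PRODDB.CUSTTS
  FROM TABLE PROD.CUSTOMER

LOAD DATA INDDN SYSREC LOG NO REPLACE
  INTO TABLE TEST.CUSTOMER

Multiply that by hundreds or thousands of tables, plus the jobs to run them all, and the time and effort involved becomes clear.
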
So, what should you do? Well, the first step is to create a consistent test bed, either from scratch or, more likely, from production. And you want to do this efficiently and without interrupting production processing. This core bed of test data can be manipulated to reduce its size and even to satisfy regulatory requirements. With a core set of data, you can then develop procedures to copy this data out to the various development and QA environments. To succeed, you need a fast method of populating multiple environments, on demand, from the approved test bed.

A key to achieving such an environment is an efficient Db2 data copying tool like BCV5, which can be used to copy and refresh Db2 data very rapidly. BCV5 copies Db2 table spaces and indexes within the same Db2 subsystem, or even between different Db2 subsystems, much faster than unloading and reloading the data. BCV5 can deliver speedy copies because it works directly at the VSAM level. And because it copies at the VSAM level, it can replace the Db2-internal OBIDs with the correct target values. This is significantly more efficient than unloading and loading one row at a time, and it eliminates the need for the complicated, user-managed OBIDXLAT capability of DSN1COPY.

If you have used DSN1COPY in the past you know that it can be difficult to use; this is not the case with BCV5. With DSN1COPY you must specify a series of parameters that describe the input, such as the PAGESIZE, NUMPARTS, DSSIZE, whether it is a LOB table space or not, and more. BCV5 determines all required values automatically, making things a lot easier and less prone to failure.

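To see why, here is a hedged sketch of what a DSN1COPY job with OBID translation looks like; the data set names are placeholders, and every pair of numbers in SYSXLAT (source and target DBIDs, PSIDs, and OBIDs, shown here with made-up values) must be looked up manually in the Db2 catalog and kept in sync:

//COPY     EXEC PGM=DSN1COPY,PARM='OBIDXLAT,RESET'
//STEPLIB  DD DISP=SHR,DSN=...        Db2 load library (SDSNLOAD)
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD DISP=SHR,DSN=...        source table space VSAM data set
//SYSUT2   DD DISP=OLD,DSN=...        target table space VSAM data set
//SYSXLAT  DD *
260,280
2,2
3,7
/*

Get any of those translation pairs wrong and you can corrupt the target object, which is exactly the kind of error-prone busywork a tool should handle for you.
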
And if you use LOB and XML data, and these days who doesn’t, BCV5 handles this data like any other, copying it at the same rate as regular table spaces.

BCV5 copies everything, not just the physical Db2 data, but also all of the associated structures including databases, table spaces, tables, indexes, and even views, triggers, aliases, synonyms, constraints, and so on! And you don’t need to worry if objects already exist; BCV5 will check for compatibility and keep the environment accurate. And all of the functionality you’d expect is there, such as the ability to rename objects between environments and to run the copy job either manually or via a job scheduler. Furthermore, you can interact with BCV5 using either an ISPF or a GUI interface.

Using BCV5, you can even use image copies as the source for your test data. BCV5 can use the most recent image copy, or an older image copy chosen by generation number, timestamp, or data set name pattern. BCV5 can automatically identify the correct image copy data sets and use them as the source for the data to be copied. You can even use BCV5 to refresh indexes using image copies of indexes if they exist.

Keeping Db2 statistics accurate can be another vexing test data issue. Generally speaking, you want to keep statistics up-to-date, but in test you probably want test statistics to mirror production. BCV5 can copy both RUNSTATS and RTS (Real Time Stats) directly from the source environment into the target. There is no need for a separate RUNSTATS job or to do a REORG in order to collect an RTS baseline.
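
For example, after a copy completes you could sanity-check the copied real-time statistics in the target subsystem with a simple catalog query (the database name here is hypothetical):

SELECT DBNAME, NAME, PARTITION, TOTALROWS
FROM   SYSIBM.SYSTABLESPACESTATS
WHERE  DBNAME = 'TESTDB'
ORDER BY NAME, PARTITION;
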
And let's not forget the most impressive aspect of BCV5: its speed and efficiency. BCV5 runs tasks in parallel, with automatic workload balancing, to further improve the performance of copying Db2 data. This efficiency comes in three forms: less CPU consumption, less elapsed run time, and a reduction in management steps, which can be automated instead of being done manually.

A case in point: a large automobile manufacturer uses BCV5 to manage its large Db2 test data environment consisting of over 11,000 table space partitions, another 11,000+ index partitions, and 20 LOBs. Before deploying BCV5, the company required hundreds of jobs that took almost 2 weeks to create, configure, and execute. After automating the process with BCV5, the entire process requires only 6 jobs that can refresh the test environments in 91 minutes. Impressive, no?

UBS Hainer markets other tools that augment and assist BCV5. For example, its In-Flight Copy add-on enables BCV5 to get up-to-the-moment accurate data by gathering information from the Db2 log to make consistent copies of table spaces and indexes. It also offers a Reduction and Masking Data add-on to assist with enforcing privacy regulations in your test data. And BCV4 can be used to duplicate an entire Db2 subsystem.

The bottom line is that setting up test data can be difficult and time-consuming. Without a well-thought-out approach to gathering and refreshing test beds, application developers and quality assurance personnel will run into issues as they try to test Db2 code with corrupted or improper data. If your organization has issues with effectively managing test data for your Db2 for z/OS developers, take a look at UBS Hainer's BCV5 solution for quickly copying and refreshing Db2 data.

Friday, June 15, 2018

Db2 for z/OS Performance Traces Part 3 - Trace Destinations and IFCIDs


In parts 1 and 2 of this series we examined each of the six types of Db2 traces that you can start to get Db2 to record information about its activities. Be sure to check out those earlier blog posts to learn about what is available. 

Today we are going to look at a couple of additional details with regard to Db2 traces, starting with where the trace records can be written.

Trace Destinations

When a trace is started, Db2 formats records containing the requested information. After the information is prepared, it must be externalized somewhere. Db2 traces can be written to five destinations:

  • GTF, or Generalized Trace Facility, is a component of z/OS used for storing large volumes of trace data.
  • SMF, or System Management Facility, is a source of data collection used by MVS to accumulate information and measurements. This destination is the most common for Db2 traces.
  • SRV is a routine used primarily by IBM support personnel for servicing Db2.
  • OPn, where n is a value from 1 to 8, is an output buffer area used by the Db2 Instrumentation Facility Interface (IFI).
  • OPX is a generic output buffer. When used as a destination, OPX signals Db2 to assign the next available OPn buffer (OP1 to OP8).

The Instrumentation Facility Interface (IFI), the Db2 trace interface, enables programs to read, write, and create Db2 trace records and to issue Db2 commands. Many online Db2 performance monitors are based on the IFI.

But which trace destination should you use? Well, it depends! Consult Table 1 below for a synopsis of the available and recommended destinations for each Db2 trace type. A "Y" indicates that the specified trace destination is valid for the given type of trace; Default indicates that it is the default destination for that trace type.



[Table 1. Valid and default trace destinations for each Db2 trace type]

If you use SMF as your Db2 trace destination, consider using the SMFCOMP system parameter (DSNZPARM), which was introduced in Db2 10. When SMFCOMP is set ON, the z/OS compression service CSRCESRV is used to compress any Db2 trace records written to SMF.

Use this parameter if your shop is concerned about the volume of SMF records written by Db2.

Using IFCIDs

Sometimes a trace class can be too broad, by which I mean that it generates more information than you need for the problem at hand. This brings us to the IFCID.

Each trace class is associated with specific trace events known as Instrumentation Facility Component Identifiers (IFCIDs), pronounced “if-kid.” An IFCID defines a record that represents a trace event. IFCIDs are the single smallest unit of tracing that can be invoked by Db2.

In some cases, it can make sense to avoid activating trace classes altogether and start traces specifying only the IFCIDs needed. This way, you can reduce the overhead associated with tracing by recording only the trace events needed.

You can use trace classes 30 through 32 for starting IFCID-only traces. These classes have no predefined IFCIDs and are available for a location to use. Consider the following -START TRACE command, which starts only IFCIDs 1, 2, 42, 43, 107, and 153:

-START TRACE(PERFM) DEST(GTF) CLASS(30) IFCID(1,2,42,43,107,153)

If you do not specify the IFCID option, only those IFCIDs contained in the activated trace classes are started. The maximum number of IFCIDs per trace is 156.
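
One more practical note: trace classes and IFCIDs consume resources for as long as they are active, so when you have collected the information you need, stop the trace with a corresponding -STOP TRACE command; for example:

-STOP TRACE(PERFM) DEST(GTF) CLASS(30)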

Because this task can be tedious, if you decide to trace only at the IFCID level, it is probably a good idea to use a performance monitor that starts these IFCID-level traces based on menu choices. For example, if you choose to trace the elapsed time of Db2 utility jobs, the monitor or tool would have a menu option for this, initiating the correct IFCID traces (for example, IFCIDs 023 through 025).
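
Under the covers, a menu choice like that might simply issue a command along these lines (a sketch, not any particular vendor's implementation); IFCIDs 023, 024, and 025 record utility start, utility phase change, and utility end, respectively:

-START TRACE(PERFM) DEST(SMF) CLASS(30) IFCID(23,24,25)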

There are several hundred different IFCIDs. Most IFCIDs contain data fields that describe the event being traced. Some IFCIDs have no data; instead they merely mark a specific point in time. For example, IFCID 074, which marks the beginning of Terminate Thread, has no data field.

Certain trace events of extended durations require a pair of IFCIDs: one for the beginning of the event and another for the end. These pairs enable the computation of elapsed times. Other trace events that are not as lengthy require only a single IFCID. 

You can find the IFCIDs associated with each trace class in the IBM Db2 Command Reference manual, in the section describing the -START TRACE command. But that manual does not describe the purpose of each IFCID. A list describing each IFCID can be found in the SDSNIVPD(DSNWMSGS) data set, which also contains DDL and LOAD utility statements that can be used to create Db2 tables containing the IFCID information so that it can be easily queried with SQL.

For example, the following query retrieves field descriptions for performance trace records for particular classes:

SELECT DISTINCT NAME, CLASS, DESCRIPTION, A.IFCID, SEQ
FROM   ownerid.TRACE_DESCRIPTIONS  A,
       ownerid.TRACE_TYPES         B
WHERE  A.IFCID = B.IFCID
AND    TYPE = 'PERFORMANCE'
AND    CLASS IN (1,2,3,4,5,6,7)
ORDER BY SEQ;

You can modify the query as needed to retrieve only the specific IFCID information you need.
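
For example, to see what a single trace event captures, such as IFCID 3 (the Db2 accounting record), a pared-down version of the query against the same tables might look like this:

SELECT NAME, DESCRIPTION, SEQ
FROM   ownerid.TRACE_DESCRIPTIONS
WHERE  IFCID = 3
ORDER BY SEQ;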

Synopsis


This series of blog posts has offered up a high-level introduction to Db2 tracing and what type of information is available using traces. A comprehensive strategy for monitoring the performance of your Db2 subsystems and applications is an important part of establishing a best practices approach to database performance management. Use this information to help you on your journey to an efficiently monitored Db2 for z/OS environment.


Of course, this information is just the beginning. Keep checking back on this blog for additional useful information, and/or make sure you have a copy of my book, Db2 Developer's Guide, which contains over 1,600 pages of Db2 guidance and advice.