Monday, April 14, 2014

Aggregating Aggregates Using Nested Table Expressions


Sometimes when you are writing SQL to access data you come across the need to work with aggregates. Fortunately, SQL offers many simple ways of aggregating data. But what happens when you uncover the need to perform aggregations of aggregates?

What does that mean? Well, consider an example. Let's assume that you want to compute the average of a sum. This is a common requirement in applications built around sales amounts. Let's say that we have a table containing sales information, where each sales amount has additional information indicating the salesman, region, district, product, date, etc.

A common requirement is to produce a report of the average sales by region for a particular period, say the first quarter of 2014. But the data in the table is at a detail level, meaning we have a row for each specific sale.

A novice SQL coder might try to write a query with a function inside a function, like AVG(SUM(SALE_AMT)). Of course, this is invalid SQL syntax. DB2 will not permit the nesting of aggregate functions. But we can use nested table expressions and our knowledge of SQL functions to build the correct query.

Let’s start by creating a query to return the sum of all sales by region for the time period in question. That query should look something like this:

SELECT REGION, SUM(SALE_AMT) AS TOTAL_SALES
FROM   SALES
WHERE  SALE_DATE BETWEEN DATE('2014-01-01')
                 AND     DATE('2014-03-31')
GROUP BY REGION;


Now that we have the total sales by region for the period in question, we can embed this query into a nested table expression in another query like so:

SELECT NTE.REGION, AVG(NTE.TOTAL_SALES)
FROM (SELECT REGION, SUM(SALE_AMT) AS TOTAL_SALES
      FROM   SALES
      WHERE  SALE_DATE BETWEEN DATE('2014-01-01')
                       AND     DATE('2014-03-31')
      GROUP BY REGION) AS NTE
GROUP BY NTE.REGION;
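
Note that the outer query references NTE.TOTAL_SALES, so the nested table expression must expose a column with that name; that is why the inner SELECT assigns the AS TOTAL_SALES alias. Alternatively, you can name the columns in the correlation clause instead. A quick sketch, which should return the same result:

SELECT NTE.REGION, AVG(NTE.TOTAL_SALES)
FROM (SELECT REGION, SUM(SALE_AMT)
      FROM   SALES
      WHERE  SALE_DATE BETWEEN DATE('2014-01-01')
                       AND     DATE('2014-03-31')
      GROUP BY REGION) AS NTE (REGION, TOTAL_SALES)
GROUP BY NTE.REGION;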


And voila! We have aggregated an aggregate (averaged a sum)...

Sunday, April 06, 2014

DB2 Buffer Pool Monitoring

After setting up your buffer pools, you will want to regularly monitor your configuration for performance. The most rudimentary way of doing this is using the -DISPLAY BUFFERPOOL command. There are many options of the DISPLAY command that can be used to show different characteristics of your buffer pool environment; the simplest is the summary report, requested as follows:

-DISPLAY BUFFERPOOL(BP0) LIST(*) DBNAME(DSN8*)

And a truncated version of the results will look something like this:

DSNB401I - BUFFERPOOL NAME BP0, BUFFERPOOL ID 0, USE COUNT 20
DSNB402I - VIRTUAL BUFFERPOOL SIZE = 500 BUFFERS 736
             ALLOCATED = 500 TO BE DELETED = 0
             IN-USE/UPDATED = 0
DSNB404I - THRESHOLDS - 739
             VP SEQUENTIAL        = 80   HP SEQUENTIAL = 75
             DEFERRED WRITE       = 85   VERTICAL DEFERRED WRT = 80,0
             PARALLEL SEQUENTIAL  = 50   ASSISTING PARALLEL SEQT = 0

Of course, you can request much more information to be displayed using the DISPLAY BUFFERPOOL command by using the DETAIL parameter. Additionally, you can request that DB2 return either incremental statistics (since the last DISPLAY) or cumulative statistics (since DB2 was started). The statistics in a detail report are grouped in the following categories: GETPAGE information, Sequential Prefetch information, List Prefetch information, Dynamic Prefetch information, Page Update statistics, Page Write statistics, and Parallel Processing Activity details.
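
For example, the following command requests a detail report with incremental statistics (the buffer pool named here is just illustrative):

-DISPLAY BUFFERPOOL(BP0) DETAIL(INTERVAL)

Specifying DETAIL(*) instead returns the cumulative statistics accumulated since the buffer pool was allocated.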

A lot of interesting and useful details can be gathered simply using the DISPLAY BUFFERPOOL command. For example, you can review GETPAGE requests for random and sequential activity, number of prefetch requests (whether static or dynamic, or for sequential or list prefetch), number of times each of the thresholds were tripped, and much more. Refer to the DB2 Command Reference manual (SC19-4054-02 for DB2 11) for a definition of each of the actual statistics returned by DISPLAY BUFFERPOOL.

Many organizations also have a performance monitor (such as IBM’s Omegamon) that simplifies the gathering and display of buffer pool statistics. Such tools are highly recommended for in-depth buffer pool monitoring and tuning. More sophisticated tools also exist that offer guidance on how to tune your buffer pools — or that automatically adjust your buffer pool parameters according to your workload. Most monitors also provide more in-depth statistics, such as buffer pool hit ratio calculations.

The buffer pool hit ratio is an important tool for monitoring and tuning your DB2 buffer pools. It is calculated as follows:

Hit ratio = (GETPAGES - pages_read_from_disk) / GETPAGES

“Pages read from disk” is a calculated field that is the sum of all random and sequential reads.
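
To make the calculation concrete, consider a purely hypothetical interval in which a buffer pool fields 10,000 GETPAGEs while 1,500 pages are read in from disk:

            Hit ratio = (10,000 - 1,500) / 10,000 = 0.85

In other words, 85 percent of the page requests were satisfied from the buffer pool without requiring I/O.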

The highest possible buffer pool hit ratio is 1.0. This value is achieved when each requested page is always in the buffer pool. When requested pages are not in the buffer pool, the hit ratio will be lower. You can have a negative hit ratio — this just means that prefetch was requested to bring pages into the buffer pool that were never actually referenced.


In general, the higher the hit ratio the better because it indicates that pages are being referenced from memory in the buffer pool more often. Of course, a low hit ratio is not always bad. The larger the amount of data that must be accessed by the application, the lower the hit ratio will tend to be. Hit ratios should be monitored in the context of the applications assigned to the buffer pool and should be compared against hit ratios from prior processing periods. Fluctuation can indicate problems.

Sunday, March 30, 2014

DB2 Buffer Pool Sizing

Sizing DB2 buffer pools can be a time-consuming and arduous process. With that in mind, here are a few thoughts on the matter...
  
DB2 manages data in large buffer pools very efficiently. Searching for data in a large buffer pool does not consume any more resources than searching in a smaller one. With this in mind, do not skimp on allocating memory to your DB2 buffer pools.

So, just how big should you make each buffer pool?  One rule of thumb is to make the buffer pool large enough to hold five minutes worth of randomly read pages during peak processing time. In other words, once a page has been read from disk into the buffer pool, the goal is to maintain it in memory for at least five minutes. The idea is that a page, once read, is likely to be needed again by another process during peak periods. Of course, your particular environment may differ.
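
To sketch the arithmetic with purely hypothetical numbers: if your peak workload randomly reads 150 pages per second from disk, then five minutes of residency works out to

            150 pages/second x 300 seconds = 45,000 buffers

which could then be allocated using the ALTER BUFFERPOOL command, for example:

-ALTER BUFFERPOOL(BP1) VPSIZE(45000)

Of course, substitute your own measurements (and your own buffer pool) before issuing anything like this.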

If you have metrics on how frequently a piece of data, once read, will need to be read again, as well as how soon, you can use that information to tinker with your buffer pool size and parameters to optimize buffer pool data residency.

But it can be difficult to know how much data is read at peak processing time, or even when peak processing time will occur, before an application is coded. You can gather estimates from the subject matter experts, end users, and application designers, but these will be just that: estimates.  Over time you will have to examine buffer pool usage and perhaps re-size your buffer pools, or move objects from one buffer pool to another, to improve performance.

Of course, you will want to make sure that you have not allocated so much memory to DB2 buffer pools that the system starts paging. When there is not enough real storage to back the buffer pool storage, paging activity will cause performance to degrade.

What is paging? Paging occurs when the virtual storage requirements for a buffer pool exceed the real storage capacity of the z/OS image. When this happens, the least recently used pages in the buffer pool are migrated to auxiliary storage. If the migrated data is subsequently accessed, those pages must be brought back into real storage from auxiliary storage. When you detect that DB2 is paging, you should either increase the amount of real storage or decrease the size of your buffer pools.


Only DBAs or systems programmers should be allowed to add or modify the size of DB2 buffer pools.  And these qualified professionals should be able to analyze the system to know the amount of memory available (real and virtual), as well as the amount being used by other system software and applications. 

For a good overview of mainframe virtual storage, consult the following link from IBM.

Friday, March 21, 2014

DB2 Tool Requirements

The last blog post here at the DB2 Portal offered up a brief overview of the types of tools that you might want to consider to help you use, manage, and administer your DB2 applications and databases. But it did not really look into the capabilities and requirements for modern DB2 tools and solutions.
Today’s DB2 management and administration tools should provide intelligent automation to reduce the problems inherent in the tedious day-to-day tasks of database administration. Simple automation is no longer sufficient. Modern data management software must be able to intelligently monitor, analyze, and optimize applications using past, present, and future analysis of collected data. Simply stated, the software should work the way a consultant works, fulfilling the role of a trusted advisor. Such software frees your precious human resources to spend time on research, strategy, planning, and implementing new and advanced features and technologies, instead of rote day-to-day tasks.

Furthermore, modern database tools should provide cross-platform, heterogeneous management. For most medium-to-large IT organizations it is not enough to manage just DB2 for z/OS systems, for example. The ability to offer administrative and development assistance across multiple DBMS platforms (DB2 for LUW, Oracle, SQL Server, MySQL, and so on) is increasingly important. Most companies have multiple DBMSs that need to be managed -- not just one... and DBAs and developers are precious resources that increasingly are being asked to work on more than a single DBMS. When the tools can manage cross-platform, the learning curve is reduced and productivity is enhanced.

And while it is true that today’s DBMS products are becoming more self-managing, they do not yet provide out-of-the-box, lights-out operation, nor do they offer all of the speed, usability, and ease-of-use features of ISV administration, management, and development tools. An organization looking to provide 24/7 data availability coupled with efficient performance will have to augment the capabilities of their DBMS software with data management and DBA tools to get the job done.

As data management tasks get more complex and DBAs become harder to find and retain, more and more database maintenance duties should be automated using intelligent management software. Using intelligent, automated DB2 tools will help to reduce the amount of time, effort, and human error associated with implementing and managing efficient database applications.

Monday, March 17, 2014

Types of DB2 Tools

As a user of DB2, which I'm guessing you are since you are reading this blog, you should always be on the lookout for useful tools that will help you achieve business value from your investment in DB2. There are several categories of tools that can help you to achieve this value.

Database Administration and Change Management tools simplify and automate tasks such as creating database objects, examining existing structures, loading and unloading data, and making changes to databases. Without an administration tool these tasks require intricate, complex scripts to be developed and run. One of the most important administration tools is the database change manager. Without a robust, time-tested product designed to effect database changes, implementing them can be quite time-consuming and error prone. A database change manager automates the creation and execution of scripts designed to implement required changes – and will ensure that data integrity is not lost.

One of the more important categories of DB2 tools offers Performance Management capabilities. Performance tools help to gauge the responsiveness and efficiency of SQL queries, database structures, and system parameters. Performance management tools should be able to examine and improve each of the three components of a database application: the DB2 subsystem, the database structures, and the application programs. Advanced performance tools can take proactive measures to correct problems as they happen.

Backup and Recovery tools simplify the process of creating backups and recovering from those backup copies. By automating complex processes, simulating recovery, and implementing disaster recovery procedures these tools can be used to assure business resiliency, with no data being lost when the inevitable problems arise.

Another important category of DB2 tool is Utilities and Utility Management. A utility is a single-purpose tool for moving and/or verifying database pages; examples include LOAD, UNLOAD, REORG, CHECK, COPY, and RECOVER. Tools that implement and optimize utility processing, as well as those that automate and standardize the execution of DB2 utilities, can greatly improve the availability of your DB2 applications. You might also want to consider augmenting your utilities with a database archiving solution that moves data back and forth between your database and offline storage.

Governance and Compliance tools deliver the ability to protect your data and to assure compliance with industry and governmental regulations, such as HIPAA, Sarbanes-Oxley, and PCI DSS. In many cases business executives have to vouch for the accuracy of their company’s data and that the proper controls are in place to comply with required regulations. Governance and compliance tools can answer questions like “who did what to which data when?” that are nearly impossible to otherwise answer.

And finally, Application Management tools help developers improve application performance and speed time-to-market. Such tools can improve database and program design, facilitate application testing including the creation and management of test data, and streamline application data management efforts.

Tools from each of these categories can go a long way toward helping your organization excel at managing and accessing data in your DB2 databases and applications...

Thursday, March 06, 2014

What Makes DB2 Tick?

Conceptually, DB2 is a relational database management system. Actually, some might object to this term, instead calling DB2 a SQL DBMS, because it does not conform exactly to Codd’s relational model. Physically, DB2 is an amalgamation of address spaces and intersystem communication links that, when adequately tied together, provide the services of a database management system.

"What does all this information have to do with me?" you might wonder. Well, understanding the components of a piece of software helps you use that software more effectively. By understanding the physical layout of DB2, you can arrive at system solutions more quickly and develop SQL that performs better.

This blog entry will not get very technical and won't delve into the bits and bytes of DB2. Instead, it presents the basic architecture of a DB2 subsystem and information about each subcomponent of that architecture.

Each DB2 subcomponent is composed of smaller units called CSECTs. A CSECT performs a single logical function. Working together, a bunch of CSECTs provide general, high-level functionality for a subcomponent of DB2. DB2 CSECT names begin with the characters DSN.

There are three major subcomponents of DB2: 
  1. System services (SSAS)
  2. Database services (DBAS)
  3. Distributed Data Facility services (DDF)


The SSAS, or System Services Address Space, coordinates the attachment of DB2 to other subsystems (CICS, IMS/TM, or TSO). SSAS is also responsible for all logging activities (physical logging, log archival, and the BSDS). DSNMSTR is the default name for this address space. (The address spaces may have been renamed at your shop.) DSNMSTR is the started task that contains the DB2 log. The log should be monitored regularly for messages indicating errors or problems with DB2. Products are available that monitor the log for problems and trigger an event to contact the DBA or systems programmer when one is found.

The DBAS, or Database Services Address Space, provides the facility for the manipulation of DB2 data structures. The default name for this address space is DSNDBM1. This component of DB2 is responsible for the execution of SQL and the management of buffers, and it contains the core logic of the DBMS. Database services use system services and z/OS to handle the actual databases (tables, indexes, etc.) under the control of DB2. Although DBAS and SSAS operate in different address spaces, they are interdependent and work together as a formal subsystem of z/OS.

The DBAS can be further broken down into three components, each of which performs specific data-related tasks:
  1. Relational Data System (RDS)
  2. Data Manager (DM)
  3. Buffer Manager (BM)


The Buffer Manager handles the movement of data from disk to memory; the Data Manager handles the application of Stage 1 predicates and row-level operations on DB2 data; and the Relational Data System, or Relational Data Services, handles the application of Stage 2 predicates and set-level operations on DB2 data.

Figure 1. The components of the Database Services Address Space.

The next DB2 address space, DDF, or Distributed Data Facility services, is optional. DDF is required only when you want distributed database functionality. If your shop must enable remote DB2 subsystems to query data between one another, the DDF address space must be activated. DDF services use VTAM or TCP/IP to establish connections and communicate with other DB2 subsystems using either DRDA or private protocols.

DB2 also requires an additional address space to handle locking. The IRLM, or Intersystem Resource Lock Manager, is responsible for the management of all DB2 locks (including deadlock detection). The default name of this address space is IRLMPROC.

Finally, DB2 uses additional address spaces to manage the execution of stored procedures and user-defined functions. In older releases of DB2 (the V4 and V5 era) these address spaces were known as Stored Procedure Address Spaces, or SPAS. For current DB2 releases (V8 and later), however, the z/OS Workload Manager (WLM) is used and can define multiple address spaces for stored procedures.

So, at a high level, DB2 uses five address spaces to handle all DB2 functionality. DB2 also communicates with allied agents, like CICS, IMS/TM, and TSO. And database services uses the VSAM Media Manager to actually read data. A summary of the DB2 address spaces and the functionality they perform is provided in Figure 2.

Figure 2. The DB2 address spaces.

Monday, March 03, 2014

Time to Start Planning for This Year's IDUG DB2 Tech Conference

Well, here we are in March... the cold and snow will hopefully soon be behind us... and thoughts of Spring and warm weather start to fill our minds. And that can only mean one thing - the annual North American IDUG DB2 Tech Conference will soon be upon us! So if you haven't started planning how to get funding to attend, it is time to start!
The 2014 North American IDUG DB2 Tech Conference will be held in Phoenix, Arizona the week of May 12 through May 16… and if you are a DB2 professional (and since you’re reading this I will assume that you are) then you should be making plans to attend. As it does every year, IDUG features all of the latest in DB2 technologies, networking opportunities, and the technical content you need to be successful. There are over 100 technical sessions to choose from at this year’s conference!
The conference also hosts Special Interest Groups (SIGs) on nine different DB2 and data-related topics, so I’m sure you can find something interesting and helpful to attend. And you can use the conference as a vehicle to get your DB2 certification! All DB2 Tech Conference attendees have the opportunity to take a free IBM Certification Exam while in Phoenix! Each attendee may take one exam for free and, if they pass, they are eligible for a second free exam.
 
You can also arrive early and attend a full day session on a DB2 topic of your choice (at an additional cost). Every year IDUG provides several in-depth Educational Seminars delivered by some of the top DB2 consultants out there. This year you can sign up to see Bonnie Baker deliver one of her last DB2 classes before she retires in June 2014 (we’ll miss you Bonnie)!
And don't forget the Vendor Exhibit Hall, which boasts all of the DB2 tools and services providers that you’d ever want to see – all under one roof in one convenient place. And the vendors always offer a lot of goodies and giveaways, ranging from t-shirts and pens to tablets and other nice gadgets.

I'll be presenting this again at IDUG, this year on the topic of Big Data for the DB2 Professional. So be sure to stop in to say "Hi" and chat about DB2, big data, or your favorite topic du jour!

The IDUG DB2 Tech Conference is the place to be to learn all about DB2 from IBMers, gold consultants, IBM champions, end users, and more. With all of this great stuff going on in Phoenix this May, why wouldn't you want to be there!?!?

Tuesday, February 25, 2014

Dynamic SQL - Let's Do The Math

We've come a long way in the world of DB2 in the past decade or so. I remember way back when it was common for DBAs to say "If performance is an issue, dynamic SQL should not be used... always use static SQL!"  But today, in 2014, it is usually the case that dynamic SQL is the predominant form of new development.

Now a lot of things have changed to make this the case, particularly that most new applications are being developed for distributed and web environments instead of as traditional mainframe COBOL applications. And dynamic SQL is the default way to access DB2 from these types of apps.

But you know what? Even if you are developing traditional mainframe COBOL programs, dynamic SQL can be a better solution for you.

The Basics

Before we go on, let's tackle a few of the basics. What makes dynamic SQL different from static SQL?  Well, static SQL is optimized prior to program execution.  Each and every static SQL statement in a program is analyzed and optimized during the DB2 Bind process.  During this process the best access path is determined and coded into your DB2 package.  When the program is executed, the pre-formulated access path is used.

Dynamic SQL, on the other hand, is optimized at run time.  Prior to the dynamic SQL statement being executed, it must be processed by the DB2 Optimizer so that an optimal access path can be determined.  This is called the Prepare process.  Prepare can be thought of as a dynamic Bind. 

We will not go into the details of dynamic statement caching and its ability to improve dynamic SQL performance here. Suffice it to say, dynamic SQL these days can be tuned using caching. For additional details on dynamic statement caching (and REOPT parms) check out my article, Dynamic SQL Performance, on TOAD World.

Now let's turn our attention to traditional dynamic SQL development. There are four types of dynamic SQL:
  • EXECUTE IMMEDIATE
  • Non-SELECT
  • Fixed-List SELECT
  • Varying-List SELECT

EXECUTE IMMEDIATE dynamic SQL will (implicitly) prepare and execute complete SQL statements embedded in host variables.  Its drawbacks are that it cannot be used to retrieve data using the SELECT statement, and that the PREPARE is implicit within the EXECUTE IMMEDIATE, so optimization must occur every time the statement is executed.
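
A minimal sketch of EXECUTE IMMEDIATE as it might appear in a COBOL program (the host variable, table, and statement text here are all hypothetical):

    MOVE 'DELETE FROM SALES WHERE REGION = ''NORTH''' TO WS-SQL-TEXT.
    EXEC SQL
        EXECUTE IMMEDIATE :WS-SQL-TEXT
    END-EXEC.

Every execution of this code incurs the PREPARE overhead, even when the statement text has not changed.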

Non-SELECT dynamic SQL can be used to explicitly prepare and execute SQL statements in an application program.  The PREPARE and EXECUTE are separated, so that once a statement is prepared it can be executed multiple times without re-optimization.  However, as its name implies, Non-SELECT dynamic SQL cannot issue the SELECT statement.
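
A sketch of the Non-SELECT flavor, again with hypothetical names; notice that the PREPARE is separated from the (repeatable) EXECUTE:

    EXEC SQL
        PREPARE STMT FROM :WS-SQL-TEXT
    END-EXEC.
    EXEC SQL
        EXECUTE STMT
    END-EXEC.

Once prepared, STMT can be executed over and over (optionally supplying values for parameter markers with a USING clause) without being re-optimized.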

Fixed-List SELECT can be used to explicitly prepare and execute SQL SELECT statements where the exact columns to be retrieved are always known in advance.  The columns to be retrieved must be known at the time the program is being coded, and they cannot change during execution.  This is necessary in order to create the proper working-storage declarations for host variables in your program.
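
The Fixed-List SELECT pattern pairs a PREPARE with a cursor declared over the statement name. A sketch, with hypothetical statement and host-variable names:

    EXEC SQL DECLARE CSR1 CURSOR FOR FLSQL END-EXEC.
    EXEC SQL PREPARE FLSQL FROM :WS-SQL-TEXT END-EXEC.
    EXEC SQL OPEN CSR1 END-EXEC.
    EXEC SQL FETCH CSR1 INTO :HV-COL1, :HV-COL2 END-EXEC.
    EXEC SQL CLOSE CSR1 END-EXEC.

The FETCH ... INTO list is precisely what forces the columns to be fixed in advance.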

If you do not know in advance the exact columns that are to be accessed, you can use Varying-List SELECT dynamic SQL.  In this case, pointer variables are used to maintain the list of selected columns.  Although Varying-List SELECT is the most complicated type of dynamic SQL, it also provides the most flexibility for dynamic SELECT statements.  Changes can be made "on the fly" to tables, columns, and predicates.  Because everything about the query can change during one invocation of the program, the number and type of host variables needed to store the retrieved rows cannot be known beforehand.  This adds considerable sophistication and complexity to your application programs.
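
In skeletal form (omitting the nontrivial SQLDA storage management), Varying-List SELECT looks something like this:

    EXEC SQL DECLARE CSR2 CURSOR FOR VLSQL END-EXEC.
    EXEC SQL PREPARE VLSQL INTO :SQLDA FROM :WS-SQL-TEXT END-EXEC.
    EXEC SQL OPEN CSR2 END-EXEC.
    EXEC SQL FETCH CSR2 USING DESCRIPTOR :SQLDA END-EXEC.
    EXEC SQL CLOSE CSR2 END-EXEC.

The PREPARE ... INTO populates the SQLDA with a description of the result columns; the program must then allocate storage and set pointers for each column before fetching.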

Mathematical Reason to Reconsider Dynamic SQL

Even if the decreasing cost of dynamic SQL and the newer performance improvements like dynamic statement caching do not compel you to use dynamic SQL, there is at least one situation where dynamic SQL should almost always be chosen over static SQL:  when numerous combinations of predicates can be chosen by a user at run-time.

Consider the following: for a certain query there are twenty possible predicates, and the user of the program is permitted to choose up to six of them for any given request.  How many different static SQL statements need to be coded to satisfy these specifications?

First, let's determine the number of different ways that you can choose six predicates out of twenty.  To do so we need to use combinatorial coefficients.  So, if n is the number of different ways then:

            n = (20 x 19 x 18 x 17 x 16 x 15) / (6 x 5 x 4 x 3 x 2 x 1)

            n = (27,907,200) / (720)

            n = 38,760
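
(This is simply the binomial coefficient "20 choose 6," that is, 20! / (6! x 14!) = 38,760.)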

38,760 separate static SELECTs is quite a large number, but this is still not enough!  This number shows the total number of different ways we can choose six predicates out of twenty if the ordering of the predicates does not matter (which for all intents and purposes it does not)[1].  However, since the specifications clearly state that the user can choose up to six, we have to modify our number.  This means that we have to add in:
  • the number of different ways of choosing five predicates out of twenty
  • the number of different ways of choosing four predicates out of twenty
  • the number of different ways of choosing three predicates out of twenty
  • the number of different ways of choosing two predicates out of twenty
  • the number of different ways of choosing one predicate out of twenty


Figure 1.  Combinatorial Coefficients Calculations


Ways to Choose Six Predicates Out of Twenty

            (20 x 19 x 18 x 17 x 16 x 15) / (6 x 5 x 4 x 3 x 2 x 1) = 38,760

Ways to Choose Five Predicates Out of Twenty

            (20 x 19 x 18 x 17 x 16) / (5 x 4 x 3 x 2 x 1) = 15,504

Ways to Choose Four Predicates Out of Twenty

            (20 x 19 x 18 x 17) / (4 x 3 x 2 x 1) = 4,845

Ways to Choose Three Predicates Out of Twenty

            (20 x 19 x 18) / (3 x 2 x 1) = 1,140

Ways to Choose Two Predicates Out of Twenty

            (20 x 19) / (2 x 1) = 190

Ways to Choose One Predicate Out of Twenty

            20 / 1 = 20

Total Ways to Choose Up To Six Predicates Out of Twenty

            38,760 + 15,504 + 4,845 + 1,140 + 190 + 20 = 60,459


This brings the grand total number of static SQL statements that must be coded to 60,459.  The calculations are shown in Figure 1.  In a situation like this, if static SQL is being forced upon us, we have one of two options:

1.   code for forty days and forty nights hoping to successfully write 60,459 SQL statements
2.   compromise on the design and limit the user's flexibility

I can guarantee that 99.99% of the time the second option will be chosen.  My solution would be to abandon static SQL and use dynamic SQL in this situation.  How would this ease the development situation?  Consider the following:
  • With dynamic SQL, the twenty predicates need be coded only once (in working storage)
  • As the program runs, the application logic can build the complete SQL statement based upon user input
  • The size of the DBRM will decrease dramatically.  The DBRM for the static SQL program will be huge if it contains all 60,459 SQL statements.  Even if a compromise number is reached, chances are that the DBRM will still be large.  And it is guaranteed to be larger than the DBRM for the dynamic SQL program.
  • Although there will be additional run-time overhead to build the SQL and perform the dynamic Prepare, performance may not suffer.  Queries issued against non-uniform data may actually experience improved access paths and perform better.

So When Should You Seriously Consider Dynamic SQL?
  • When the nature of the program is truly changeable, such as the example given in the text above.
  • When the columns to be retrieved can vary from execution to execution.  This is similar to the example given where multiple combinations of predicates can be chosen, but in this case, multiple combinations of columns can be chosen.
  • When benefit can be accrued from interacting with other dynamic SQL applications.  
  • When the SQL must access non-uniform data.

You can find some additional guidance for helping you to evaluate when to use dynamic versus static SQL in my Data Perspectives column Dynamic vs. Static SQL.

Synopsis

Dynamic SQL is not always bad... and it is already pervasive in distributed and web applications.  In this day and age, dynamic SQL should be considered as a viable option even for traditional mainframe applications that are not distributed or web-based.





[1] It is true that for performance reasons you may want to put the predicate with the highest cardinality within each type of operator first, but we will not concern ourselves with this in this blog post.

Monday, February 17, 2014

Rebinding for Optimal DB2 Access Paths

The access paths formulated by the DB2 optimizer during the BIND and REBIND processes are critical to your application performance. It is these access paths that determine not only how DB2 data is accessed by your program, but how efficiently it is accessed. Whether you are preparing a new program, implementing changes into your existing DB2 applications, upgrading to a new version of DB2, or simply trying to achieve optimum performance for existing applications, an exhaustive and thorough REBIND management policy should be of paramount importance.

However, many organizations are not doing everything possible to keep access paths up-to-date with the current state of their data. So what is the best practice approach for rebinding your DB2 programs? The answer is “The Five R’s.” This methodology of regular rebinding followed by a review of your access paths requires the following steps:

  1. Start with an inspection of the RTS (Real Time Statistics) to determine which objects need to be reorganized.
  2. Follow that up by running a REORG on table spaces and indexes as appropriate based on the statistics.
  3. After reorganizing, run RUNSTATS (to ensure the DB2 Catalog is up-to-date).
  4. Follow that with REBINDs of your programs.
  5. Then we need that fifth R, which is to review the access paths generated by the REBIND.

For shops that have avoided rebinding for years this approach represents a significant change. So what new DB2 features are available to help? Well, several releases ago, back in DB2 9 for z/OS, plan stability was added. This feature enables you to save a backup version of your access paths as a precautionary measure. If any of the new access paths after rebinding are less efficient, the DBA can switch back to the backed-up access paths. In order to implement this level of stability you can use the PLANMGMT parameter of the REBIND command. There are three options: NONE, BASIC, and EXTENDED. BASIC saves the previous access paths, and EXTENDED saves the previous and an original. You can use REBIND with the SWITCH parameter to revert to the saved access paths when the new access paths cause degraded performance.
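
For example, the following commands (with a hypothetical collection and package) save the prior access paths at rebind time and, if the new ones disappoint, fall back to them:

REBIND PACKAGE(MYCOLL.MYPKG) PLANMGMT(EXTENDED)

REBIND PACKAGE(MYCOLL.MYPKG) SWITCH(PREVIOUS)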

As of DB2 10 for z/OS you can tell DB2 to try to reuse previous access paths for SQL statements whenever possible. This is called access path reuse and is implemented using the APREUSE bind option. When invoked, DB2 uses information about the previous access paths to create a hint.

When BIND PACKAGE or REBIND PACKAGE specifies APREUSE(ERROR), DB2 tries to locate the access path information from a package that has a matching identity. If no such package exists, DB2 tries to locate another recent version of the package that has the matching location, collection ID, and name. The APREUSE option only applies to statements that have identical statement text in both packages. Newly added statements and statements with text changes never reuse previous access paths.
Reusing access paths can help to minimize administrative and performance issues involved in rebinding.
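
A sketch of the command, again with hypothetical package names:

REBIND PACKAGE(MYCOLL.MYPKG) APREUSE(ERROR)

With APREUSE(ERROR), the rebind of a package fails if its access paths cannot be reused, rather than quietly generating new ones.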

Of course, there are products on the market which can be used to implement a proactive approach to rebinding. These products preview the new access paths and then run them through a rules system to determine if the new access will be improved, unchanged, or degraded. With this information we can rebind everything that would improve and avoid rebinding anything else until we can analyze the cause of the degradation. Using such an approach you should not have degraded access paths sneaking into your production environment.

Summary

At any rate, a systematic approach to DB2 binding and rebinding is necessary to assure optimal performance within your DB2 applications. This short blog entry covers some of the basics and recent changes to DB2 in this area. 

Be sure to take your time and to plan your statistics-gathering and bind/rebind approach... or be ready to be in constant firefighter mode as you address poorly-performing SQL in your applications!


Sunday, February 09, 2014

Those Good Old IBM Mainframe Utility Programs

Most mainframe programmers are aware that IBM supplies many utility programs that are useful for system maintenance chores such as copying and deleting files, manipulating partitioned data sets, and the like.  

These utilities typically begin with an IEB, IEF, or IEH prefix.  One of the most common is IEFBR14, a do-nothing program that is commonly used (with the appropriate JCL) to catalog or delete data sets.  Few people are aware, however, that IBM also supplies many other utilities that can make the life of a programmer much easier.  Below is a select list of these:

IEBMAGIC        fixes any problem without having to use your brain; perfect for consultants and contract programmers

IEBIQUIT         automatically updates your resume, writes a letter of resignation, forwards it to your boss, and prints reams of paper to make it look like you actually did something when you worked here

IEBIBALL         compares any source file to the last one that actually worked displaying all changes and who made them; perfect tool for technical support personnel overwhelmed by programmers chanting the phrase "I didn't change anything"

IEBPANIC         if all else fails, run IEBPANIC; sometimes it fixes your problem and sometimes it doesn't, but it never tells you how it did it; companion program to IEBMAGIC

IEBNOTME      alters all trace of your userid from system change logs, SMF, etc.; useful to eliminate finger-pointing; this should always be run before running IEBIQUIT

IEFINGER        when designing on-line systems sometimes intermediate screens are sent that serve no purpose other than to tie together a series of related transactions; these intermediate screens generally direct the user to "Press ENTER to Continue"; IEFINGER simulates the end user pressing the ENTER key thereby eliminating unnecessary screens

IEHAMMER      forces a square peg into a round hole; for example, if you try to compile a COBOL program using the FORTRAN compiler, attaching IEHAMMER to the compile will make it work

IEBPIG               finds all unused resources of any type and assigns them to any specified job

IEBHAHA         randomly changes source code in test libraries; it is often speculated that IEBHAHA is the cause of most program problems that IEBIBALL is used to correct

IEBEIEIO          run this utility when you have too many problems to handle for one person;  it corrects the old "with an abend here, and a meeting there, e-i-e-i-o" syndrome by causing a system problem so large (in someone else's application) that all direction is diverted away from you to them

So did I forget your favorite?  Drop a comment below to share it!

Saturday, February 01, 2014

The Twelve DBA Rules of Thumb... a summary

Over the past couple of months this blog has offered up some rules of thumb for DBAs to follow that can help you to build a successful and satisfying career as a DBA. These twelve rules of thumb worked well for me as I worked my way through my career and I have shared them with you, my faithful readers, so that you can benefit from my experiences. I hope you find them useful... and if I have missed anything, please post a comment with your thoughts and experiences on being a good DBA.


As a reminder of what we have discussed, I am posting a short synopsis of the Twelve DBA Rules of Thumb here, along with links to each blog post.

 1. Write Down Everything
 2. Automate Routine Tasks
 3. Share Your Knowledge
 4. Analyze, Simplify and Focus
 5. Don't Panic!
 6. Be Prepared
 7. Don't Be a Hermit
 8. Understand the Business, Not Just the Technology
 9. Ask for Help When You Need It
10. Keep Up-to-Date
11. Invest in Yourself
12. Be a Packrat


Good luck with your career as a DBA...

Saturday, January 25, 2014

DBA Rules of Thumb - Part 12 (Be a Packrat)

Today's post in the DBA Rules of Thumb series is short and sweet. It can be simply stated as "Keep Everything!"

Database administration is the perfect job for you if you are a pack rat.


It is a good practice to keep everything you come across during the course of performing your job. When you slip up and throw something away, it always seems like the very next day you come across a task where that stuff would have come in handy... but you threw it out!

I still own some printed manuals for DB2 Version 2. They are packed up in a plastic tub in my garage, but I have them in case I need them.

Tuesday, January 21, 2014

DBA Rules of Thumb - Part 11 (Invest in Yourself)

Most IT professionals continually look for their company to invest money in their ongoing education. Who among us does not want to learn something new — on company time and with the company’s money? Unless you are self-employed, that is!

Yes, your company should invest some funds to train you on new technology and new capabilities, especially if it is asking you to do new things. And since technology changes so fast, most everyone has to learn something new at some point every year. But the entire burden of learning should not be placed on your company.

Budget some of your own money to invest in your career. After all, you probably won’t be working for the same company your entire career. Why should your company be forced to bankroll your entire ongoing education? Now, I know, a lot depends on your particular circumstances. Sometimes we accept a lower salary than we think we are worth because of the “perks” that are offered. And one of those perks can be training. But perks have a way of disappearing once you are "on the job."

Some folks simply abhor spending any of their hard-earned dollars to help advance their careers. This is not a reasonable approach to your career! Shelling out a couple of bucks to buy some new books, subscribe to a publication, or join a professional organization should not be out of the reach of most DBAs.

A willingness to spend some money to stay abreast of technology is a trait that DBAs need to embrace. 




Most DBAs are insatiably curious, and many are willing to invest some of their money to learn something new. Maybe they bought that book on NoSQL before anyone at their company started using it. Perhaps it is just that enviable bookshelf full of useful database books in their cubicle. Or maybe they paid that nominal fee to subscribe to the members-only content of that SQL Server portal. They could even have forked over the $25 fee to attend the local user group.

Don’t get me wrong. I’m not saying that companies should not reimburse for such expenses. They should, because it provides for better-rounded, more educated, and more useful employees. But if your employer won’t pay for something that you think will help your career, why not just buy it yourself?

And be sure to keep a record of such purchases because unreimbursed business expenses can be tax deductible. 

Sunday, January 12, 2014

DBA Rules of Thumb - Part 10 (Keep Up-to-Date)

If you wish to be a successful DBA for a long period of time, you will have to keep up-to-date on all kinds of technology — both database-related and other.

Of course, as a DBA, your first course of action should be to be aware of all of the features and functions available in the DBMSs in use at your site — at least at a high level, but preferably in depth. Read the vendor literature on future releases as it becomes available to prepare for new functionality before you install and migrate to new DBMS releases. The sooner you know about new bells and whistles, the better equipped you will be to prepare new procedures and adopt new policies to support the new features.

Keep up-to-date on technology in general, too. For example, DBAs should understand new data-related technologies such as NoSQL, Hadoop, and predictive analytics, but also other newer technologies that interact with database systems. Don’t ignore industry and technology trends simply because you cannot immediately think of a database-related impact. Many non-database-related “things” (for example, XML) eventually find their way into DBMS software and database applications.

Keep up-to-date on industry standards — particularly those that impact database technology such as the SQL standard. Understanding these standards before the new features they engender have been incorporated into your DBMS will give you an edge in their management. DBMS vendors try to support industry standards, and many features find their way into the DBMS because of their adoption of an industry standard.

As we've already discussed in this series, one way of keeping up-to-date is by attending local and national user groups. The presentations delivered at these forums provide useful education. Even more important, though, is the chance to network with other DBAs to share experiences and learn from each other’s projects.

Through judicious use of the Internet and the Web, it is easier than ever before for DBAs to keep up-to-date. Dozens of useful and informative Web sites provide discussion forums, script libraries, articles, manuals, and how-to documents. Consult my web site at http://www.craigsmullins.com/rellinks.html for a regularly updated list of DBMS, data, and database-related Web resources.

Remember, though, this is just a starting point. There are countless ways that you can keep-up-to-date on technology. Use every avenue at your disposal to do so, or risk becoming obsolete.


Sunday, January 05, 2014

DBA Rules of Thumb - Part 9 (Call on Others for Help When Needed)

Use All of the Resources at Your Disposal

Remember that you do not have to do everything yourself. Use the resources at your disposal. We have talked about some of those resources, such as articles and books, Web sites and scripts, user groups and conferences. But there are others.

Do not continue to struggle with problems when you are completely stumped. Some DBAs harbor the notion that they have to resolve every issue themselves in order to be successful. Sometimes you just need to know where to go to get help to solve the problem. Use the DBMS vendor’s technical support, as well as the technical support line of your DBA tool vendors. Consult internal resources for areas where you have limited experience, such as network specialists for network and connectivity problems, system administrators for operating system and system software problems, and security administrators for authorization and protection issues.

As a DBA you are sometimes thought of as "knowing everything" (or, worse, a know-it-all), but it is far more important to know where to go to get help to solve problems than it is to try to know everything there is to know. Let's face it... it is just not possible to know everything about database systems and making them work with all types of applications and users these days.

When you go to user groups, build a network of DBA colleagues whom you can contact for assistance. Many times others have already encountered and solved the problem that vexes you. A network of DBAs to call on can be an invaluable resource (and no one at your company even needs to know that you called for outside help).

Finally, be sure to understand the resources available from your DBMS vendors. DBMS vendors offer their customers access to a tremendous amount of useful information. All of the DBMS vendors offer software support on their Web sites. Many of them provide a database that users can search to find answers to database problems. IBM customers can use IBMLink,[1] and both Oracle and Microsoft offer a searchable database in the support section of their Web sites. Some DBAs claim to be able to solve 95 percent or more of their problems by researching online databases. These resources can shrink the amount of time required to fix problems, especially if your DBMS vendor has a reputation of “taking forever” to respond to issues.

Of course, every DBA should also be equipped with the DBMS vendor’s technical support phone number for those tough-to-solve problems. Some support is offered on a pay-per-call basis, whereas other times there is a prepaid support contract. Be sure you know how your company pays for support before calling the DBMS vendor. Failure to know this can result in your incurring significant support charges.




[1]. IBMLink is a proprietary network that IBM opens up only to its customers.

Thursday, January 02, 2014

DBA Rules of Thumb - Part 8 (Being Business Savvy)

Understand the Business, Not Just the Technology

Remember that being technologically adept is just a part of being a good DBA. Although technology is important, understanding your business needs is more important. If you do not understand the impact on the business of the databases you manage, you will simply be throwing technology around with no clear purpose.

Business needs must dictate what technology is applied to what database—and to which applications. Using the latest and greatest (and most expensive) technology and software might be fun and technologically challenging, but it most likely will not be required for every database you implement. The DBA’s tools and utilities need to be tied to business strategies and initiatives. In this way, the DBA’s work becomes integrated with the goals and operations of the organization.

The first step in achieving this needed synergy is the integration of DBA services with the other core components of the IT infrastructure. Of course, DBAs should be able to monitor and control the databases under their purview, but they should also be able to monitor them within the context of the broader spectrum of the IT infrastructure—including systems, applications, storage, and networks. Only then can companies begin to tie service-level agreements to business needs, rather than technology metrics.

DBAs should be able to gain insight into the natural cycles of the business just by performing their job. Developers and administrators of other parts of the IT infrastructure will not have the vision into the busiest times of the day, week, quarter, or year because they are not privy to the actual flow of data from corporate business functions. But the DBA has access to that information as a component of performing the job. It is empowering to be able to understand business cycle information and apply it on the job.

DBAs need to expand further to take advantage of their special position in the infrastructure. Talk to the end users — not just the application developers. Get a sound understanding of how the databases will be used before implementing any database design. Gain an understanding of the database’s impact on the company’s bottom line, so that when the inevitable problems occur in production you will remember the actual business impact of not having that data available. This also allows you to create procedures that minimize the potential for such problems.

To fulfill the promise of business/IT integration, it will be necessary to link business services to the underlying technology. For example, a technician should be able to immediately comprehend that a service outage to transaction X7R2 in the PRD2 environment means that regional demand deposit customers cannot access their accounts. See the difference?

Focusing on transactions, TP monitors, and databases is the core of the DBA’s job. But servicing customers is the reason the DBA builds those databases and manages those transactions. Technicians with an understanding of the business impact of technology decisions will do a better job of servicing the business strategy. This is doubly true for the DBA’s manager. Technology managers who speak in business terms are more valuable to their company.

Of course, the devil is in the details. A key component of realizing effective business/IT integration for DBAs is the ability to link specific pieces of technology to specific business services. This requires a service impact management capability—that is, analyzing the technology required to power each critical business service and documenting the link. Technologies exist to automate some of this through event automation and service modeling. Such capabilities help to transform availability and performance data into detailed knowledge about the status of business services and service-level agreements.


Today’s modern corporations need technicians who are cognizant of the business impact of their management decisions. As such, DBAs need to get busy transforming themselves to become more business savvy — that is, to keep an eye on the business impact of the technology under their span of control.