Thursday, September 04, 2014

The Importance of SLAs and RTOs

Assuring optimal performance is one of the most frequent tasks facing DB2 DBAs. Being able to assess the effectiveness and performance of the many aspects of your DB2 systems and applications is one of the most important things that a DBA must be able to do. This can include evaluating online transaction response time, sizing the batch window and determining whether it is sufficient for the workload, managing end-to-end response time for distributed workloads, and so on.

But in order to accurately gauge the effectiveness of your current environment and setup, Service Level Agreements, or SLAs, are needed. SLAs are derived out of the practice of Service-level management (SLM), which is the “disciplined, proactive methodology and procedures used to ensure that adequate levels of service are delivered to all IT users in accordance with business priorities and at acceptable cost.”

In order to effectively manage service levels, a business must prioritize its applications and identify the amount of time, effort, and capital that can be expended to deliver service for those applications.

A service level is a measure of operational behavior. SLM ensures that applications behave accordingly by applying resources to those applications based on their importance to the organization. Depending on the needs of the organization, SLM can focus on availability, performance, or both. In terms of availability, the service level might be defined as “99.95 percent uptime from 9:00 a.m. to 10:00 p.m. on weekdays.” Of course, a service level can be more specific, stating that “average response time for transactions will be 2 seconds or less for workloads of 500 or fewer users.”

For an SLA to be successful, all parties involved must agree on stated objectives for availability and performance. The end users must be satisfied with the performance of their applications, and the DBAs and technicians must be content with their ability to manage the system to the objectives. Compromise is essential to reach a useful SLA.
In practice, though, many organizations do not institutionalize SLM. When new applications are delivered, there may be vague requirements and promises of subsecond response time, but the prioritization and budgeting required to assure such service levels are rarely tackled (unless, perhaps, the IT function is outsourced). It never ceases to amaze me how often SLAs simply do not exist. I always ask for them whenever I am asked to help track down performance issues or to assess the performance of a DB2 environment.

Let's face it, if you do not have an established agreement for how something should perform, and what the organization is willing to pay to achieve that performance, then how can you know whether or not things are operating efficiently enough? The simple answer is: you cannot.

It may be possible for a system assessment to offer up general advice on areas where performance gains can be achieved. But in such cases -- where SLAs are non-existent -- you cannot really deliver guidance on whether the effort to remediate the "problem areas" is worthwhile. Without SLAs in place you simply do not know if current levels of performance are meeting agreed-upon service levels, because there are no agreed-upon service levels (and, no, "subsecond response time" is NOT a service level!). Additionally, you cannot know what level of spend is appropriate for any additional effort needed to achieve the potential performance, because no budget has been agreed upon.

Another potential problem is the context of the service being discussed. Most IT professionals view service levels on an element-by-element basis. In other words, the DBA views performance based on the DBMS, the SysAdmin views performance based on the operating system or the transaction processing system, and so on. SLM properly views service for an entire application. However, it can be difficult to assign responsibility within the typical IT structure. IT usually operates as a group of silos that do not work together very well. Frequently, the application teams operate independently from the DBAs, who operate independently from the SAs, and so on.

To achieve end-to-end SLM, these silos need to be broken down. The various departments within the IT infrastructure need to communicate effectively and cooperate with one another. Failing this, end-to-end SLM will be difficult to implement.

The bottom line is that the development of SLAs for your batch windows, transactions, and business processes is a best practice that should be implemented at every DB2 shop (indeed, you can remove DB2 from that last sentence and it is still true).

Without SLAs, how will the DBA and the end users know whether an application is performing adequately? Not every application can, or needs to, deliver subsecond response time. Without an SLA, business users and DBAs may have different expectations, resulting in unsatisfied business executives and frustrated DBAs—not a good situation.
With SLAs in place, DBAs can adjust resources by applying them to the most mission-critical applications as defined in the SLA. Costs will be controlled and capital will be expended on the portions of the business that are most important to the organization. Without SLAs in place, an acceptable performance environment will be ever elusive. Think about it: without an SLA in place, if an end user calls up and complains to the DBA about poor performance, there is no way to measure the veracity of the claim or to gauge the possibility of improvement within the allotted budget.

Recovery Time Objectives (RTOs)

Additionally, the effectiveness of backup and recovery should be a concern to all DB2 DBAs. This requires that RTOs (Recovery Time Objectives) be established. An RTO is basically an SLA for the recovery of your database objects. Without RTOs, it is difficult (if not impossible) to gauge the state of recoverability and the efficacy of image copies being taken. 

Each database object should have an RTO assigned to it. The RTO needs to take into account the same type of things that an SLA considers. In other words, the business must prioritize its applications, DBAs must map database objects to the applications, and together they must identify the amount of time, effort, and capital that can be expended to assure the minimization of downtime for those applications.

Again, we are measuring operational behavior. The RTO ensures that, when problems occur requiring database recovery, the application outage is limited to what has been defined as tolerable for the business (in terms of uptime and cost to provide that uptime).
Again, as with an SLA, for the RTO to be successful, all parties involved must agree on stated objectives for downtime and time to recovery. The end users must be satisfied with the potential duration of their application’s downtime, and the DBAs and technicians must be content with their ability to recover the system to the objectives. And again, cost is a contributing factor. The RTO cannot simply be “I need my application up in 5 minutes and I can’t spend any more money to do that,” because that is not reasonable (or possible).

Without written RTOs, DBAs can provide due diligence to make sure that database objects are backed up and recoverable, but cannot really provide any guarantee in terms of how quickly the data can be recovered (or perhaps, to what point in time) when an outage occurs. Of course, the DBA can create and review backup policies and procedures to encourage a recoverable environment. But there won't be any way to ensure with any consistency that the backup plan can deliver the time-to-recovery needed by the business.

So why don't organizations create SLAs and RTOs as a regular course of business? 

And if your organization does create SLAs and RTOs, please share with us how doing so became a standard at your shop...


Saturday, August 23, 2014

DB2 Health Checks - Part 3

In parts one and two of this series on DB2 health checks, we discussed the importance of regularly checking the health of your DB2 subsystems and applications. We also looked at some of the issues involved in a health check including figuring out the scope of what is to be involved and some of the considerations to ponder as you approach assessing the health of your DB2 environment.

Of course, it is not really feasible to cover all of the components that you might need to address in your health checks in a series of posts in a blog. My true intent here is to get you to understand the importance of regularly checking DB2's health, instead of just plodding along and only making changes when someone complains!


But even though DB2 health checks are important and crucial to the on-going stability of your systems, they can be costly, time-consuming, and valid only for the point in time(s) that you review. So maybe there is something else you can do to attack this problem?

DB2 Offline Analysis
Instead of relying on outside experts to conduct your DB2 health checks you can instead rely on expert system software to provide a reliable, impartial analysis of your DB2 databases and applications. Such a solution is offered by Data Kinetics’ InnovizeIT Offline Analyzer for DB2 for z/OS.

How does InnovizeIT work? Well, similar to a DB2 health check, the product deploys a two-step process to check the health of your DB2 databases and applications:
  1. Collect data about your DB2 environment and ship it to your personal computer
  2. Analyze the data and identify issues and potential problems
InnovizeIT is a planning and analysis tool that identifies mainframe DB2 bottlenecks and performance degradation problems. DB2 performance and availability metadata is collected on the mainframe and downloaded to a Windows workstation. All of the analysis is performed offline, on the workstation, so there is no use of mainframe resources and no effect on mainframe performance.

Running the analysis on a PC workstation instead of the mainframe is an important feature in today’s world of cost-cutting and resource management. Most organizations are looking for ways to reduce their mainframe MSU consumption and would not really look too kindly on a big analysis job consuming a lot of mainframe CPU to analyze your DB2 environment. PC resources are frequently idle during off hours, so it makes a lot of sense to run the analysis on those under-utilized resources.

The offline analysis process uses weighted analysis results with targeted and prioritized recommendations for fixing performance problems. The guided assistance InnovizeIT provides enables you to plan corrective actions and protect your budget regardless of static or dynamic SQL use, or variable workload processing.

The results of the analysis are categorized and reported using an easy-to-navigate GUI. You can scan and review the problems identified by the analysis all on your PC workstation. There is no need to go back and forth between the mainframe and the PC because all of the relevant information is captured to allow the DBA to review the results of the analysis. 

The information displayed is context-sensitive depending upon the issue you are investigating and the report you are viewing. You can combine performance metrics from your DB2 performance monitor to add more detail to the analysis and reports. And you can send all of the reports to a spreadsheet for posterity and distribution to all of the DB2 DBAs, developers and, indeed, anyone interested.

Summary
DB2 health checking should be a standard component of your DB2 database management procedures. Regularly examining your DB2 environment for problematic issues makes good business sense because it can improve performance and reduce costs. And InnovizeIT for DB2 for z/OS is a useful and cost-effective mechanism for conducting regular health checks.

Consider taking a look at it today at http://dkl.com/innovizeit.html

Friday, August 15, 2014

Join the Transaction TweetChat

Today's blog post is an invitation to join me -- and several of my esteemed colleagues -- on Twitter on August 20, 2014 for a TweetChat on transactions.

Now that sentence may have caused some of you to have a couple questions. First of all, what is a TweetChat? Well, a TweetChat is a pre-arranged conversation that happens on Twitter. It is arranged by an organizer (for this one, that would be IBM) and features several invited "experts" to discuss the topic at hand.

The featured guests for this TweetChat are:
  • Scott Hayes – @srhayes
  • Craig Mullins  @craigmullins
  • Kelly Schlamb  @KSchlamb
But everybody can participate. All that you need is a Twitter account and the hashtag, which for this event is #Transactions. You can search for the #Transactions hashtag, and all of the tweets using that hashtag will show up. You can participate in the TweetChat simply by including the hashtag #Transactions in your tweets.

So if you are interested in the conversation topic -- transactions -- be sure to join us and participate in the discussion... or at least just listen in to hear what folks think...


Monday, August 04, 2014

A Short Report from SHARE in Pittsburgh

Today’s blog post will be a short review of SHARE posted directly from the conference floor in Pittsburgh!

What is SHARE
For those of you who are not aware of SHARE, it is an independent, volunteer-run association providing enterprise technology professionals with continuous education and training, valuable professional networking and effective industry influence. SHARE has existed for almost 60 years. It was established in 1955 and is the oldest organization of computing professionals.
The group conducts two conferences every year. Earlier in 2014 the first event was held in Anaheim, and this week (the week of August 3rd) the second event of the year is being held in my original hometown, Pittsburgh, PA. Now I’ve been attending SHARE, more regularly in the past than lately, since the 1990s. But with the event being held in Pittsburgh I just had to participate!
The keynote (or general) session today started up at 8:00 AM. It was titled “Beyond Silicon: Cognition and Much, Much More” and it was delivered by Dr. Bernard S. Meyerson, IBM Fellow and VP, Innovation. Meyerson delighted the crowd with his entertaining and educational session.

Next up was “Enterprise Computing: The Present and the Future”, an entertaining session that focused on what IBM believes are the four biggest driving trends in IT/computing: cloud, analytics, mobile, and social media. And, indeed, these trends are pervasive and interact with one another to create the infrastructure of most modern development efforts. Bryan Foley, Program Director of System z Strategy at IBM, delivered the presentation and unloaded a number of interesting stats on the audience, including:
  • Mainframe is experiencing 31 percent growth
  • Mainframes process 30 billion business transactions daily
  • The mainframe is the ultimate virtualized system
  • System z is the most heavily instrumented platform in the world
  • The mainframe is an excellent platform for analytics because that’s where the data is

Clearly, if you are a mainframer, there is a lot to digest… and a lot to celebrate. Perhaps the most interesting tidbit shared by Foley is that “PC is the new legacy!” He backed this up with a stat claiming that mobile Internet users are projected to surpass PC Internet users in 2015. Interesting, no?

Now those of you that know me know that I am a DB2 guy, but I have not yet attended much DB2 stuff. I sat in on an intro to MQ and I’m currently prepping for my presentation this afternoon – “Ten Breakthroughs That Changed DB2 Forever.”


The presentation is based on a series of articles I wrote a couple years ago, but I am continually tweaking it to keep it up to date and relevant. So even if you’ve read the article, if you are at SHARE and a DB2 person, stop by Room 402 at 3:00PM… and if you’re not here, the articles will have to do!

That's all for now... gotta get back to reviewing my presentation... hope to see you at SHARE this week... or, if not, somewhere else out there in DB2-land!

Friday, August 01, 2014

DB2 Health Checks - Part Two

In the first part of this series on DB2 health checks, DB2 Health Checks - Part One, I discussed the general concept of a health check and its basic importance in terms of maintaining a smooth-running DB2 environment.

Today, I want to briefly look at how DB2 health checks are usually done... if they are done at all.

The Scope of a DB2 Health Check

Some people mistakenly view a DB2 Health Check as being performance-focused only. Yes, performance is an important aspect of a health check -- and I admit that performance is generally the area that causes an organization to undergo the health check process. But the overall health of the DB2 environment needs to be addressed by the health check. In addition to performance-related issues (system, database and application), this can include:


  • availability
  • fault tolerance
  • recoverability
  • use of automation
  • process review
  • documentation
  • people skills (DBA, sysprog, development, etc.)
Considerations Before Undergoing a DB2 Health Check

DB2 health checks are important and crucial to the on-going stability of your systems, but there are issues:
  • Health checks can be costly (consulting engagements)
  • When a consulting company conducts a health check the analysis usually is done off-site, so your DBAs do not learn the techniques used by the consultants as they massage and analyze the data
  • Health checks generally are valid for a specific point-in-time and can become obsolete quickly

Conducting DB2 Health Checks

DB2 health checks typically are conducted by IBM personnel, a DB2 consultant, or a larger services firm. The engagement begins with experts/consultants interviewing the DBAs, submitting questionnaires as needed and collecting data from DB2. After collecting the data the consulting team goes off site and analyzes the reams of collected data. There may be intermittent communication between the consulting team and the on-site DBAs to clear up any lingering questions or to clarify things during the analysis phase. After some time (usually a week or more), a report on the health of your DB2 environment, perhaps with some recommendations to implement, is delivered.

What happens next is all up to you. After reading the report you can ignore it, implement some or all of the recommendations, conduct further in-house investigation for the feasibility of implementing the recommendations, or send it along to management for their perusal. But there is a deadline involved. After all, your systems are not static. So the health check report is only as good as the point-in-time for which it was delivered. Time, as it always does, will creep up on you. If you wait too long, the recommendations become stale and you might not be doing the proper thing for your environment by implementing changes based on old information.

Of course, when too much time has gone by after the health check, you could always engage with the services company and consultants again, requiring additional spending.

Is there another way?

Stay tuned, as we'll look at some other options in upcoming installments of this blog series on DB2 health checking...

Friday, July 25, 2014

Happy DBA Day!

Hey everybody, time to celebrate... today, July 25, 2014, is SysAdmin Day! For the past 15 years, the last Friday in July has been set aside to recognize the hard work done by System Administrators. This is known as System Administrator Appreciation Day.

As a DBA, I have regularly co-opted the day to include DBAs because, after all, we are a special type of system administrator -- the system we administer is the DBMS!

So if you are a SysAdmin, DBA, Network Admin, etc. have an extra cup of Joe and a donut or two. Hang up a sign on your cubicle telling people it is SysAdmin Day. And hopefully get a little respect and appreciation for all you do every day of the year!

Thursday, July 17, 2014

DB2 Health Checks - Part 1

Left to their own devices, DB2 databases and applications will accumulate problems over time. Things that used to work, stop working. This can happen for various reasons including the addition of more data, a reduction in some aspect of business data, different types of data, more users, changes in busy periods, business shifts, software changes, hardware changes… you get the idea.

And there is always the possibility of remnants from the past causing issues with your DB2 environment. Some things may have been implemented sub-optimally from the start, perhaps many years ago… or perhaps more recently. Furthermore, DB2 is not a static piece of software; it changes over time with new versions, features and functionality. As new capabilities are introduced, older means of performing similar functionality become suboptimal, and in some cases, even obsolete. Identifying these artifacts can be troublesome and is not likely to be something that a DBA will do on a daily basis.

Nonetheless, the performance and availability of your DB2 environment – and therefore the business systems that rely on DB2 – can suffer if you do not pay attention to the health and welfare of your DB2 databases and applications.

Health Checking Your DB2
The general notion of a health check is well known in the IT world, especially within the realm of DB2 for z/OS. The purpose of a DB2 health check is to assess the stability, performance, and availability of your DB2 environment. Health checks are conducted by gathering together all of the pertinent details about your DB2-based systems and reviewing them to ascertain their appropriateness and effectiveness. You may narrow down a health check to focus on specific aspects of your infrastructure, for example, concentrating on just availability and performance, or on other aspects such as recoverability, security, and so on.

At any rate, scheduling regular independent reviews of your DB2 environment is an important aspect of assuring the viability and robustness of your implementation. Simply migrating DB2 applications to production and then neglecting to review them until or unless there are complaints from the end users is not a best practice for delivering good service to your business. Just like a car requires regular maintenance, so too does your DB2 environment. Regular analysis and health checks, with the overall goal of identifying weaknesses and targeting inefficiencies, can save your organization time and money, as well as reduce the daily effort involved in implementing and maintaining your DB2 applications.

Think about the health of your DB2 system the same way you think about your health. A regular health check helps to identify and eliminate problems. And it helps you to perform the daily operational tasks on your DB2 databases and applications with the peace of mind that only regular, in-depth, knowledgeable analysis can deliver.

Check Back Soon
Later in this series we'll uncover more aspects of health checking and look at some software that might be able to assist. So stay tuned...

Tuesday, July 08, 2014

DB2 Application Performance Management

Assuring optimal performance for database applications can be a tricky thing. In today's blog I ruminate on the high-level issues involved in optimizing your DB2 for z/OS applications.

Applications that access databases are only as good as the performance they achieve. And every user wants their software to run as fast as possible. As such, performance tuning and management is one of the biggest demands on the DBA’s time. When asked what is the single most important or stressful aspect of their job, DBAs typically respond "assuring optimal performance."  Indeed, a Forrester Research survey indicates that performance and troubleshooting tops the list of most challenging DBA tasks.

But when you are dealing with data in a database management system there are multiple interacting components that must be managed and tuned to achieve optimal performance. That is, every database application, at its core, requires three components to operate:
  • the system (that is, the DBMS itself, the network, and the O/S),
  • the database (that is, the DDL and database schema), and
  • the application (that is, the SQL and program logic).

Each of these components requires care and attention, but today I want to focus on the high-level aspects of performance management from the perspective of the application. Furthermore, I will discuss this in terms of DB2 for z/OS.

So where do we begin? For DB2, a logical starting point is with BIND Parameters. There are many parameters and values that must be chosen and specified when you bind a DB2 application program. The vast array of options at our disposal can render the whole process extremely confusing -- especially if you don’t bind on a daily basis. And even if you do, some of the options still might be confusing if you rarely have to change them. You know what I’m talking about, parameters like ACQUIRE, RELEASE, VALIDATE, and DEGREE.

I will not delve into the myriad bind options and give you advice on which to use when. There are many articles and books, as well as the IBM DB2 manuals, that you can use to guide you along that path. Suffice it to say that there are some standard parameters and values that should be chosen most of the time in most situations. As such, a wise DBA group will set up canned routines for the programmers to use for compiling and binding their applications. Choices such as CICS transaction, DB2 batch, or BI/analytical query can be presented to the developer, and then, based on the type of program and environment, the canned script can choose the proper bind options. Doing so can greatly diminish the problems that can be encountered when the "wrong" parameters or values are chosen at Bind time.

Before concluding this short section on Bind parameters I want to give one important piece of advice: In production, always Bind your plans and packages specifying EXPLAIN YES. Failing to do so means that access paths will be generated, but you will not know what they are. This is akin to blinding yourself to what DB2 is doing and it makes application performance tuning much more difficult.
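
To put that advice in context, here is a rough sketch of what a package bind with EXPLAIN turned on might look like as a DSN subcommand. The collection and member names are just placeholders, and the other parameters shown are common choices rather than a recommendation for your shop:

BIND PACKAGE(COLL1) MEMBER(PROG1) -
     ACTION(REPLACE)              -
     VALIDATE(BIND)               -
     ISOLATION(CS)                -
     RELEASE(COMMIT)              -
     EXPLAIN(YES)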

Access Path Management

Bind and Rebind are important components to achieve optimal DB2 application performance. This is so because these commands are what determine the access paths to the data requested by your program. So it is vitally important that you create a strategy for when and how to Rebind your programs. There are several common approaches. The best approach is to Rebind your applications over time as the data changes. This approach involves some form of regular maintenance that keeps DB2 statistics up to date and formulates new access paths as data volumes and patterns change. More on this in a moment.

Other approaches include Rebinding only when a new version of DB2 is installed, or perhaps more ambitious, whenever new PTFs are applied to DB2. Another approach is to Rebind automatically after a regular period of time, whether it is days, weeks, months, or whatever period of time you deem significant. This approach can work if the period of time is wisely chosen based on the application data – but it still can pose significant administrative issues.

The final approach -- the worst of the bunch -- comes from the if it ain’t broke don’t fix it school of thought. Basically, it boils down to (almost) never rebinding your programs. This approach penalizes every program for fear that a single program (or two) might experience a degraded access path. Oh, the possibility of degraded performance is real and that is why this approach has been adopted by some. And it can be difficult to find which statements may have degraded after a Rebind. The ideal situation would allow us to review the access path changes beforehand to determine if they are better or worse. But DB2 itself does not provide any systematic method of administering access paths that way. There are third party tools that can help you achieve this though.

Anyway, let’s go back to the best approach again, and that is to Rebind regularly as your data changes. This involves what is known as the three Rs: REORG, RUNSTATS, and Rebind. At any rate, your goal should be to keep your access paths up-to-date with the current state of your data. Failing to do this means that DB2 is accessing data based upon false assumptions.
By Rebinding you will generally improve the overall performance of your applications because the access paths will be better designed based on an accurate view of the data. And as you apply changes to DB2 (new releases/PTFs) optimizer improvements and new access techniques can be used. If you never Rebind, not only are you forgoing better access paths due to data changes but you are also forgoing better access paths due to changes to DB2 itself.
To adopt the Three R’s you need to determine when to REORG. This means looking at either RUNSTATS or Real-Time Statistics (RTS). So, perhaps we need 4 R’s:

  1. RUNSTATS or preferably, RTS
  2. REORG
  3. RUNSTATS
  4. REBIND
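
For example, one way to use RTS to find REORG candidates is to query the real-time statistics tables in the DB2 Catalog. The query below is only a sketch: the 10 percent change threshold is an arbitrary example, and the columns checked should be tailored to your own REORG criteria:

SELECT DBNAME, NAME, PARTITION, TOTALROWS,
       REORGINSERTS + REORGUPDATES + REORGDELETES AS CHANGES,
       REORGLASTTIME
FROM   SYSIBM.SYSTABLESPACESTATS
WHERE  TOTALROWS > 0
AND    (REORGINSERTS + REORGUPDATES + REORGDELETES) * 100
       >= TOTALROWS * 10
ORDER BY CHANGES DESC;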

But is this enough? Probably not because we need to review the access paths after rebinding to make sure that there are no rogue access paths. So, let’s add another R to Review the access paths generated by the REBIND. As we mentioned, the optimizer can make mistakes. And, of course, so can you. Users don't call you when performance is better (or the same). But if performance gets worse, you can bet on getting a call from irate users.

So we need to put in place best practices whereby we test Rebind results to compare the before and after impact of the optimizer’s choices. Only then can we assure that we are achieving optimal DB2 application performance.

Tuning the Code

Of course, everything we’ve discussed so far assumes that the code is written efficiently to begin with -- and that is a big assumption. We also need to make sure that we are implementing efficient application code. The application code consists of two parts: the SQL code and the host language code in which the SQL is embedded.

SQL is simple to learn and easy to start using. But SQL tuning and optimization is an art that takes years to master. Some general rules of thumb for creating efficient SQL statements include:
  • Let SQL do the work instead of the program. For example, code an SQL join instead of two cursors using program logic to join (see the example following this list).
  • Simpler is generally better, but complex SQL can be very efficient.
  • Retrieve only the columns required, never more.
  • Retrieve the absolute minimum number of rows by specifying every WHERE clause that is appropriate.
  • When joining tables, always provide join predicates. In other words, avoid Cartesian products.
  • Favor using Stage 1 and Indexable predicates.
  • But favor Stage 2 predicates over application logic.
  • Avoid sorting (if possible) by creating indexes for ORDER BY and GROUP BY operations.
  • Avoid black boxes -- that is, avoid I/O routines that are called by programs instead of using embedded SQL.
  • Minimize deadlocks by updating tables in the same sequence in every program.
  • Issue data modification statements (INSERT, UPDATE, DELETE) as close to the COMMIT statement as possible.
  • Be sure to build a COMMIT strategy into every batch program that changes data. Failing to COMMIT can cause locking problems.
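
To illustrate the first bullet, here is a simple sketch of a join coded in SQL rather than with two cursors and program logic. It assumes the IBM sample EMP and DEPT tables, so the column names (WORKDEPT, DEPTNO, DEPTNAME) may differ in your environment:

SELECT E.LASTNAME,
       E.FIRSTNAME,
       D.DEPTNAME
FROM   EMP  E
       INNER JOIN DEPT D
               ON E.WORKDEPT = D.DEPTNO
WHERE  E.SALARY > 50000;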

Even if you follow the guidelines in this bulleted list, there will still be numerous opportunities for you to tune SQL for performance. To tune SQL you must be able to interpret the output of the access paths produced by EXPLAIN. This information is encoded in the plan tables. IBM offers Data Studio (as a free download) with a visual explain capability that can simplify this process. But you will also have to accumulate experience as to which SQL formulations work more efficiently than others. This skill will come with time and on-the-job learning.
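
As a starting point, a query along the following lines can be used to review the access paths recorded for a given program. It is only a sketch: the qualifier and program name are placeholders, and the columns available in your PLAN_TABLE depend on your DB2 version:

SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD,
       TNAME, ACCESSTYPE, MATCHCOLS, ACCESSNAME,
       INDEXONLY, PREFETCH
FROM   USERID.PLAN_TABLE
WHERE  PROGNAME = 'PROG1'
ORDER BY QUERYNO, QBLOCKNO, PLANNO;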

Finally, some attention must be paid to the host language code. Host language code refers to the application programs written in C, COBOL, Java, Visual Basic or the programming language du jour. SQL statements are usually embedded into host language code and it is quite possible to have finely tuned SQL inside of inefficient host language code. And, of course, that would cause a performance problem.

Bottom Line

Although DBAs must understand all three aspects of database performance management, concentrating on the application aspects of performance will most likely provide the most bang-for-the-buck. Of course, we have only touched the tip of the DB2 application performance iceberg today. But even this high-level view into application performance can serve as a nice starting place for tuning your DB2 applications.


Good luck with DB2 for z/OS and happy performance tuning! 

Thursday, July 03, 2014

Database Versus DBMS

What is a database? I bet most people reading this blog post think that they know the answer to that question. But many of them would be wrong. DB2 is not a database, it is a DBMS, or Database Management System. You can use DB2 to create a database, but DB2, in and of itself, is not a database. Same goes for Oracle (which is a DBMS and a company) and SQL Server (just a DBMS).
So what is a database? A database is an organized store of data wherein the data is accessible by named data elements (for example, fields, records, and files). It does not even have to be computerized to be a database. The phone book is a database (Why do they still send out phone books? Does anyone even use them any more? Now I’m way off topic, so let’s get back on track.)
A DBMS is software that enables end users or application programmers to share data. It provides a systematic method of creating, updating, retrieving and storing information in a database. DBMSs also are generally responsible for data integrity, data access control, and automated rollback, restart and recovery.
In layman’s terms, you can think of a database as a filing system. You can think of the filing cabinet itself along with the file folders and labels as the DBMS. A DBMS manages databases. You implement and access database instances using the capabilities of the DBMS.
So, DB2 and Oracle and SQL Server and MySQL are database management systems. Your payroll application uses the payroll database, which may be implemented using DB2 or Oracle or…
Why is that important? If we do not use precise terms when we write, speak, and work, confusion can result. And confusion leads to over-budget projects, improperly developed systems, and lost productivity. So precision must be important to us.

Tuesday, July 01, 2014

Blog Recommendation: Essential SQL

Hello, regular readers... just a short post today with a blog recommendation for anybody who uses SQL or wants to learn how to use SQL.

The name of the blog is Essential SQL by Kris Wenzel.

I happened upon the blog a couple of weeks ago and it offers up some nice, educational content on SQL. It is not specific to DB2, but the material is high-level and easily convertible to a DB2 environment.

The material on the blog starts out very basic with no assumption of any prior SQL knowledge... and builds up over time adding on details. Learn as much, or as little as you'd like.

Hope you find the blog to be useful (either for yourself, or to pass along to others)...

Happy SQL coding!

Monday, June 16, 2014

Don't Forget the Humble DB2 DISPLAY Command

Although robust performance and administration tools are probably the best solution for gathering information about your DB2 subsystems and databases, you can gain significant insight into your DB2 environment simply using the DISPLAY command.  There are multiple variations of the DISPLAY command depending on the type of information you are looking for.

DISPLAY DATABASE is probably the most often-used variation of the DISPLAY command. The output of the basic command shows the status of the database objects specified, along with any exception states. For example, issuing -DISPLAY DATABASE(DBNAME) shows details on the DBNAME database, including information about its tablespaces and indexes. With one simple command you can easily find all of the tablespaces and indexes within any database — pretty powerful stuff. But you also get status information for each space, too. When a status other than RO or RW is encountered, the object is in an indeterminate state or is being processed by a DB2 utility.

There are additional options that can be used with DISPLAY DATABASE. For partitioned page sets, you can specify which partition, or range of partitions, to show. And you can choose to display only objects in restricted or advisory status using either the ADVISORY or RESTRICT key word.

You can control the amount of output generated by DISPLAY DATABASE using the LIMIT parameter. The default number of lines returned by the DISPLAY command is 50, but the LIMIT parameter can be used to set the maximum number of lines returned to any numeric value; or you can use an asterisk (*) to indicate no limit.
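
For example, the following commands show some of these variations. The database name is a placeholder, and the exact operands supported can vary a bit by DB2 version:

-DISPLAY DATABASE(DBNAME)
-DISPLAY DATABASE(DBNAME) SPACENAM(*) LIMIT(*)
-DISPLAY DATABASE(DBNAME) SPACENAM(*) PART(3)
-DISPLAY DATABASE(DBNAME) SPACENAM(*) RESTRICT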

Moving on, the DISPLAY BUFFERPOOL command provides the current status and allocation information for each buffer pool. The output includes the number of pages assigned to each pool, whether the pages have been allocated, and the current settings for the sequential steal and deferred write thresholds. For additional information on buffer pools you can specify the DETAIL parameter to return usage information such as number of GETPAGEs, prefetch usage, and synchronous reads. You can use this data for rudimentary buffer pool tuning.

You can gather even more information about your buffer pools using the LIST and LSTATS parameters. The LIST parameter shows open table spaces and indexes within the specified buffer pools; the LSTATS parameter shows statistics for the table spaces and indexes. Statistical information is reset each time DISPLAY with LSTATS is issued, so the statistics are as of the last time LSTATS was issued. 
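
For example (with BP0 standing in as a placeholder for whichever buffer pool you want to examine):

-DISPLAY BUFFERPOOL(BP0)
-DISPLAY BUFFERPOOL(BP0) DETAIL
-DISPLAY BUFFERPOOL(BP0) LIST(*) LSTATS(*)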

If you are charged with running (IBM) DB2 utilities, another useful command is DISPLAY UTILITY. Issuing this command causes DB2 to display the status of all active, stopped, or terminating utilities. So, if you are in over the weekend running REORGs, RUNSTATS, or image copies, you can issue occasional DISPLAY UTILITY commands to keep up-to-date on the status of your jobs. By monitoring the current phase of the utility, you can determine the relative progress of the utility as it processes. The COUNT specified for each phase lists the number of pages that have been loaded, unloaded, copied, or read.
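
For example, to check on all utilities known to DB2, or on a single utility by its utility-id (shown here with a placeholder id):

-DISPLAY UTILITY(*)
-DISPLAY UTILITY(REORGJOB1)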

You can use the DISPLAY LOG command to display information about the number of active logs, their current capacity, and the setting of the LOGLOAD parameter. For archive logs, use the DISPLAY ARCHIVE command.

DISPLAY is helpful, too, if your organization uses stored procedures or user-defined functions (UDFs). DISPLAY PROCEDURE monitors whether procedures are currently started or stopped, how many requests are currently executing, the high-water mark for requests, how many requests are queued, how many times a request has timed out, and the WLM environment in which the stored procedure executes. And you can use the DISPLAY FUNCTION SPECIFIC command to monitor UDF statistics.

DISPLAY also returns a status indicating the state of each procedure or UDF. A procedure or UDF can be in one of four potential states: STARTED, STOPQUE (requests are queued), STOPREJ (requests are rejected), or STOPABN (requests are rejected because of abnormal termination).
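
For example, issued without operands these commands report on all of the stored procedures and UDFs that have been accessed (you can also narrow the output to a specific schema or routine):

-DISPLAY PROCEDURE
-DISPLAY FUNCTION SPECIFIC
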
And there remains a wealth of additional information that the DISPLAY command can uncover. For distributed environments, DISPLAY DDF shows configuration and status information, as well as statistical details on distributed connections and threads; DISPLAY LOCATION shows distributed thread details; DISPLAY PROFILE shows whether profiling is active or inactive; DISPLAY GROUP provides details of data-sharing groups (including the version of DB2 for each member) and DISPLAY GROUPBUFFERPOOL shows information about the status of DB2 group buffer pools; DISPLAY RLIMIT provides the status of the resource limit facility; DISPLAY THREAD displays active and in-doubt connections to DB2; and DISPLAY TRACE lists your active trace types and classes along with the specified destinations for each.
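
For quick reference, here are the basic forms of those commands (operands vary by DB2 version, and some commands require specific authority to issue):

-DISPLAY DDF DETAIL
-DISPLAY LOCATION(*)
-DISPLAY PROFILE
-DISPLAY GROUP
-DISPLAY GROUPBUFFERPOOL(*)
-DISPLAY RLIMIT
-DISPLAY THREAD(*)
-DISPLAY TRACE(*)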

If you are looking for some additional, more in-depth details on the DISPLAY command, take a look at this series of blog posts I wrote last year:
  • Part 1 of the series focused on using DISPLAY to monitor details about your database objects; 
  • Part 2 focused on using DISPLAY to monitor your DB2 buffer pools;
  • Part 3 covered utility execution and log information;
  • And Part 4 examined using the DISPLAY command to monitor DB2 stored procedures and user-defined functions.

Summary

The DB2 DISPLAY command is indeed a powerful and simple tool that can be used to gather a wide variety of details about your DB2 subsystems and databases. Every DBA should know how to use DISPLAY and its many options to simplify their day-to-day duties and job tasks.

Tuesday, June 10, 2014

ORDER BY an Expression

Sometimes a program requires that the results of a query be returned in a specific sequence. We all know that the ORDER BY clause can be used to sort SQL results into a specific order. For example, to return a sorted list of employee compensation sorted by last name we could write:

SELECT LASTNAME, 
       FIRSTNAME, 
       SALARY+COMM+BONUS
FROM   EMP
ORDER BY LASTNAME;

But what if we need to sort it by total compensation? There are two approaches that work here: position number and column renaming. Using position number the ORDER BY clause becomes:

SELECT LASTNAME, 
       FIRSTNAME, 
       SALARY+COMM+BONUS
FROM   EMP
ORDER BY 3;

This will cause DB2 to sort by the third element in the SELECT-list, in this case the total compensation expression. But what if we add another column at the beginning of the SELECT-list? Or what if we need to port the SQL to a different database with different standards? Well, in that case we can use column renaming:

SELECT LASTNAME, 
       FIRSTNAME, 
       SALARY+COMM+BONUS AS TOTAL_COMP
FROM   EMP
ORDER BY TOTAL_COMP;

This method is preferred for a number of reasons:

  • it will continue to work even if the SQL statement is changed
  • it gives the expression a name making it more self-documenting
  • it should be more portable

Monday, June 02, 2014

Don't Neglect Your DB2 Rebind Strategy

We’re all busy. Frequently it can seem like you just got into the office and already it is past quitting time! There is so much to do and so little time to do it all. And we all work more than 40 hours a week… these are some of the common complaints of the busy DBA.

And those are valid concerns, but it does not diminish the need to properly address DB2 database administration and performance management... with a special focus on proactive management. 

So please take a little bit of time to read about, and consider your organization's strategy for rebinding DB2 applications.

REBIND Strategy

One of the most important contributors to the on-going efficiency and health of your DB2 environment is proper management of DB2 access path changes. A thorough REBIND management process is a requirement for healthy DB2 applications.

But many shops do not do everything possible to keep access paths up-to-date with the current state of their data. Approaches vary, such as rebinding only when a new version of DB2 is installed, whenever PTFs are applied to DB2, or automatically after a regular period of time. Although these methods are workable, they are less than optimal.

The worst approach though is the “if it ain’t broke don’t fix it” mentality. In other words, many DBA groups adopt “never REBIND unless you absolutely have to” as a firm policy. The biggest problem this creates is that it penalizes every program in your subsystem for fear of a few degraded access paths. This results in potentially many programs having sub-optimal performance because the optimizer never gets a chance to create better access paths as the data and environment change. Of course, the possibility of degraded performance after a REBIND is real – and that is why many sites avoid regularly rebinding their programs.

Even so, the best approach is to perform regular REBINDs as your data changes. To do so, you should follow the Three R’s: regularly reorganizing to ensure optimal structure; followed by RUNSTATS to ensure that the reorganized state of the data is reflected in the DB2 Catalog; and finally, rebinding all of the programs that access the reorganized structures. This technique can improve application performance because access paths will be better designed based on an accurate view of your data.

Of course, adopting the Three R’s approach raises questions, such as “When should you reorganize?” To properly determine when to reorganize you’ll have to examine statistics. This means looking at either RUNSTATS in the catalog or Real Time Statistics (RTS). So, the Three R’s become the Four R’s – examine the Real Time Stats, REORG database objects as indicated by RTS, RUNSTATS to get the new statistics, then REBIND any impacted application programs.
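
As a rough sketch, one iteration of that cycle for a single table space might look like the utility statements and command below. The database, table space, and collection names are placeholders, and options such as SHRLEVEL should follow your own shop standards:

REORG TABLESPACE DB001.TS001 SHRLEVEL CHANGE

RUNSTATS TABLESPACE DB001.TS001 TABLE(ALL) INDEX(ALL)

REBIND PACKAGE(COLL1.*) EXPLAIN(YES)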

Some organizations do not rely on statistics to schedule REORGs. Instead, they build REORG JCL as they create each object – that is, create a table space, build and schedule a REORG job, and run it monthly or quarterly. This is better than no REORG at all, but it is not ideal because you are likely to be reorganizing too soon (wasting CPU cycles) or too late (causing performance degradation until the REORG is run).

It is better to base your REORGs off of thresholds on catalog or real-time statistics. Statistics are the fuel that makes the optimizer function properly. Without accurate statistics the optimizer cannot formulate the best access path to retrieve your data because it does not know how your data is currently structured. So when should you run RUNSTATS? One answer is “as frequently as possible based on how often your data changes.” To succeed you need an understanding of data growth patterns – and these patterns will differ for every table space and index.

The looming question is this: why are we running all of these RUNSTATS and REORGs? To improve performance, right? But only with regular REBINDs will your programs take advantage of the new statistics to build more efficient access paths, at least for static SQL applications.

Without an automated method of comparing and contrasting access paths, DB2 program change management can be time-consuming and error-prone – especially when we deal with thousands of programs. And we always have to be alert for a rogue access path – that is, when the optimizer formulates a new access path that performs worse than the previous access path.

Regular rebinding means that you must regularly review access paths and correct any “potential” problems. Indeed, the Four R’s become the Five R’s because we need to review the access paths after rebinding to make sure that there are no problems. So, we should begin with RTS (or RUNSTATS) to determine when to REORG. After reorganizing we should run RUNSTATS again, followed by a REBIND. Then we need that fifth R – which is to Review the access paths generated by the REBIND.

The review process involves finding which statements might perform worse than before. Ideally, the DBAs would review all access path changes to determine if they are better or worse. But DB2 does not provide any systematic means of doing that. There are tools that can help you achieve this though.

The bottom line is that DB2 shops should implement best practices whereby access paths are tested to compare the before and after impact of the optimizer’s choices. By adopting best practices to periodically REBIND your DB2 programs you can achieve better overall application performance because programs will be using access paths generated from statistics that more accurately represent the data. And by implementing a quality review step there should be less need to reactively tune application performance because there will be fewer access path and SQL-related performance problems.

Wednesday, May 21, 2014

IBM DB2 11 Tools and Utilities: Delivering timely value to your business

Migrating to any new software release can be a lot easier when you are familiar with new features prior to deployment. So it stands to reason that you should familiarize yourself with new DB2 functionality before you try to migrate your environment to a new version. This training can take the form of:
  • Reading the DB2 manuals, especially the What’s New manual and the Technical Overview redbook that typically comes out with each new version of DB2
  • Attending presentations on the new release, whether online, at user groups, or even at IDUG or IBM Insight (which used to be the IOD conference)
  • Formal training from IBM or other sources

But the bottom line is that you need to educate yourself in advance of migrating to any new version of DB2... Otherwise, you may not be ready to move to the new version on a schedule that fits your business needs.

With IBM DB2 11 for z/OS, you can have a smoother migration process that also enables you to deploy key applications faster. New features and capabilities, both within DB2 11 and within the tools and utilities that support DB2, can make migration easier.

And, of course, DB2 11 for z/OS comes with out-of-the-box cost savings and features that allow you to do more with business-critical analytics and applications. But are your DB2 Tools and DB2 Utilities ready to provide you with complete exploitation or support? Do you know the difference?

Join me on June 10, 2014 as I deliver a webcast on DB2 11 for z/OS Tools and Utilities on behalf of IBM. During this informative webinar I will take you through some of the key features in DB2 11 and the importance of timely support for these features by your DB2 tools and utilities. I’ll expose some of the new capabilities of IBM’s tools and utilities for DB2 and I’ll also share ways to make your DB2 11 migration simpler, safer and faster.


And I’ll see you in June!