Friday, October 17, 2014

Performance Tools That Operate on Databases and Database Objects

In our last blog post here, we covered DB2 system performance management tools - that is, tools that look at the performance at a  system or subsystem level. Today, we turn our attention to the database objects...

Most DBMSs do not provide an intelligent database analysis capability. Instead, the DBA or performance analyst must use system catalog views and queries, or a system catalog tool, to keep watch over each database and its objects. This is not an optimal solution because it relies on human intervention for efficient database organization, opening up the possibility for human error.

DB2 for z/OS, however, does provide Real Time Statistics that can be used to drive database optimization and maintenance. What are Real Time Statistics (or RTS)?
Well, RTS are similar to traditional database statistics that are accumulated using a utility programs (RUNSTATS), but the RTS are accumulated by DB2 “on the fly” as the database management system and its applications are running. That is to say, without having to run a utility program.

RTS are stored in two tables in the DB2 Catalog:
  • SYSIBM.SYSTABLESPACESTATS: Contains statistics on table spaces and table space partitions
  • SYSIBM.SYSINDEXSPACESTATS: Contains statistics on index spaces and index space partitions
But since this post is supposed to be talking about database-performance tools, I don’t want to get into a full blown discussion of RTS… after all, RTS are a built-in component of DB2. That said, the ability of DB2 to generate and store RTS enables database performance tools to make decisions based on actual, up-to-date performance metrics. Of course, DB2 is not the only DBMS with such metrics, but since this is a blog about DB2, I won’t get into any details of the other database systems.

Database Analysis Tools

At any rate, database analysis tools are available that can proactively and automatically monitor your database environment. These database analysis tools typically can: 
  • Collect statistics for tables and indexes: standard statistical information from the DBMS, extended statistics capturing more information (for example, data set extents), or a combination of both.
  • Read the underlying data sets for the database objects to capture current statistics, read the database statistics from the system catalog, read tables unique to the tool that captured the enhanced statistics, or any combination thereof.
  • Set thresholds based on database statistics whereby the automatic scheduling of database reorganization and other maintenance tasks can be invoked.
  • Provide a series of canned reports detailing the potential problems for specific database objects.
Database Utilities

Another category of performance tool that operates at the database (or database object) level are database utilities. Usually there are some number of rudimentary utilities that ship for free with the DBMS. These are usually simple, no-frills programs that are notorious for poor performance, especially on very large tables. However, these utilities are required to populate, administer, and organize your databases. The typical utilities that are provided are LOAD, UNLOAD, REORG, RUNSTATS, BACKUP, and RECOVER, as well as utilities for integrity checking.

Although I suppose it is possible to make an argument, at some level, for any and all of these utilities to have a performance aspect to them, REORG and RUNSTATS are the ones that definitely impact database performance.

RUNSTATS is used to gather statistics on the composition of the database and REORG is used to organize table space data optimally.

There are third-party vendors that provide support tools that replace the database utilities and provide the same or more functionality in a more efficient manner. For example, it is not unheard of for third-party vendors to claim that its utilities execute anywhere from four to ten times faster than the native DBMS utilities. These claims must be substantiated for the data and applications at your organization (but such claims are believable). Before committing to any third-party utility, the DBA should be sure that the product provides all of the basic functionality required.

When testing utility tools from different vendors, be sure to conduct fair tests. For example, always reload or recover prior to testing REORG utilities, or you may skew your results due to different levels of table organization. Additionally, always run the tests for each tool on the same object with the same amount of data, and make sure that the data cache is flushed between each test run. Finally, make sure that the workload on the system is the same (or as close as possible) when testing each product because concurrent workload can skew benchmark test results.

Yet another category of database-focused tool is the Utility management tool. This type of tool provides administrative support for the creation and execution of database utility jobstreams. These utility generation and management tools:
  • Automatically generate utility parameters, JCL, or command scripts.
  • Monitor the database utilities as they execute.
  • Automatically schedule utilities when exceptions are triggered.
  • Restart utilities with a minimum of intervention. For example, if a utility cannot be restarted, the utility manager should automatically terminate the utility before resubmitting it.
Space Management Tools

Most DBMSs provide basic statistics for space utilization, but the in-depth statistics required for both space management and performance tuning are usually inadequate for heavy duty administration. For example, most DBMSs lack the ability to monitor the requirements of the underlying files used by the DBMS. When these files go into extents or become defragmented, performance can suffer. Without a space management tool, the only way to monitor this information is with arcane and difficult-to-use operating system commands. This can be a tedious exercise.

Additionally, each DBMS allocates space differently. The manner in which the DBMS allocates this space can result in inefficient disk usage. Sometimes space is allocated, but the database will not use it. A space management tool is the only answer for ferreting out the amount of used space versus the amount of allocated space.

Space management tools often interface with other database and systems management tools such as operating system space management tools, database analysis tools, system catalog query and management tools, and database utility generators.

Compression Tools

A standard tool for reducing storage costs is the compression utility. This type of tool operates by applying an algorithm to the data in a table such that the data is encoded in a more compact area. By reducing the amount of area needed to store data, overall storage costs are decreased. Compression tools must compress the data when it is added to the table and subsequently modified, then expand the data when it is later retrieved.

In the earlier days of DB2, compression tools that used an exit routine were common. But ever since DB2 Version 3, which introduced the built-in, hardware-assisted compression capability of DB2, compression duties are handled quite efficiently with out-of-the-box DB2 functionality.

Additionally, some tools are available that compress database logs, enabling more log information to be retained on disk before it is offloaded to another medium.

Synopsis

So, there are a number of different categories of performance tools that function at the database or database object level that are worth considering. These differ from system performance tools (covered in the last blog post) and application performance tools (which will be covered in the next blog post).

Thursday, October 09, 2014

Database System Performance Tools

System performance tools examine the database server, its configuration, and usage. The most commonly used system performance tool is the performance monitor. Database performance monitoring and analysis tools support many types of performance-oriented requests in many ways. For example, system performance tools can operate:
  • In the background mode as a batch job that reports on performance statistics written by the DBMS trace facility
  • In the foreground mode as an online monitor that either traps trace information or captures information from the DBMS control blocks as applications execute
  • By sampling the database kernel and user address spaces as the program runs and by capturing information about the performance of the job, independent of database traces
  • By capturing database trace information and maintaining it in a history file (or table) for producing historical performance reports and for predicting performance trends
  • As a capacity planning device that gives statistical information about an application and the environment in which it will operate
  • As an after-the-fact analysis tool on a workstation, that analyzes and graphs all aspects of application performance and system-wide performance


Each database performance monitor supports one or more of these features. The evaluation of database performance monitors is a complex task. Sometimes more than one performance monitor is used at a single site—perhaps one for batch reporting and another for online event monitoring. Maybe an enterprise-wide monitoring solution has been implemented and one component of that solution is a database module that monitors your DBMS, but it lacks the details of a more sophisticated DBMS monitor. So, another performance monitor is purchased for daily DBA usage, while the module of the enterprise-wide monitoring solution is used for integrated monitoring by system administrators.

Modern database performance tools can set performance thresholds that, once reached, will alert the DBA, perform another task to report on, or actually fix the problem. These tools are typically agent-based. An agent is a piece of independent code that runs on the database server looking for problems. It interacts with, but does not rely on, a console running on another machine that is viewed by the DBA. This agent architecture enables efficient database monitoring because the agent is not tied to a workstation and can act independently. The agent sends information to the DBA only when required.

Additionally, some system performance tools are available that focus on a specific component of the DBMS such as the buffer pools (data cache). Such a tool can be used to model the memory requirements for database caching, to capture data cache utilization statistics, and perhaps even to make recommendations for improving the performance of the buffers.

Another type of performance optimization tool enables database configuration parameters to be changed without recycling the DBMS instance, subsystem, or server. These tools are useful when the changes require the DBMS to be stopped and restarted. Such tools can dramatically improve availability, especially if configuration parameters need to be changed frequently and the DBMS does not support dynamic parameter modification.

A few ISVs provide invasive system performance tools that enhance the performance of databases by adding functionality directly to the DBMS and interacting with the database kernel. Typically, these products take advantage of known DBMS shortcomings.

For example, products are available that enhance the performance of reading a database page or block or that optimize data caching by providing additional storage and control over buffers and their processing. Care must be taken when evaluating invasive performance tools. New releases of the DBMS may negate the need for these tools because functionality has been added or known shortcomings have been corrected. However, this does not mean that you should not consider invasive database performance tools. They can pay for themselves after only a short period of time. Discarding the tool when the DBMS supports its functionality is not a problem if the tool has already paid for itself in terms of better performance.

One final caution: Because invasive performance tools can interact very closely with the database kernel, be careful when migrating to a new DBMS release or a new release of the tool. Extra testing should be performed with these tools because of their intrusive nature.

Saturday, October 04, 2014

DB2 Performance Tuning Tools

Well, as I promised a post or two ago, in this and the next couple of posts we will take a look at database performance tools...

Database tools are helpful to enable organizations to effectively manage the performance of applications that access database data... and to help manage the DBMS itself. Some DBMS vendors provide embedded options and bundled tools to address database performance management. However, these tools are frequently insufficient for large-scale or heavily used database applications. Fortunately, many third-party tools will effectively manage the performance of mission-critical database applications. Tools that enable DBAs to tune databases fall into two major categories: performance management and performance optimization.

Many different types of performance management tools are available.

  • Performance monitors enable DBAs and performance analysts to gauge the performance of applications accessing databases in one (or more) of three ways: real time, near real time (intervals), or based on historical trends. The more advanced performance monitors are agent-based.
  • Performance estimation tools provide predictive performance estimation for entire programs and SQL statements based on access paths, operating environment, and a rules or inference engine.
  • Capacity planning tools enable DBAs to analyze the current environment and database design and perform “what-if” scenarios on both.
  • SQL analysis and tuning tools provide graphical and/or textual descriptions of query access paths as determined by the relational optimizer. These tools can execute against single SQL statements or entire programs.
  • Advisory tools augment SQL analysis and tuning tools by providing a knowledge base that provides tips on how to reformulate SQL for optimal performance. Advanced tools may automatically change the SQL (on request) based on the coding tips in the knowledge base.
  • System analysis and tuning tools enable the DBA to view and change database and system parameters using a graphical interface (e.g., cache and/or bufferpool tuning, log sizing).

In the performance optimization category, several tools can be used to tune databases.

  • Reorganization tools automate the process of rebuilding optimally organized databases. Databases can cause performance problems due to their internal organization (e.g., fragmentation, row ordering, storage allocation).
  • Caching tools work to buffer frequently used data in memory which can be accessed faster than secondary disk storage. These tools can augment the performance of the DBMS cache or, more commonly, integrate with the disk storage subsystem.
  • Compression tools enable DBAs to minimize the amount of disk storage used by databases, thereby reducing overall disk utilization and, possibly, elapsed query/program execution time, because fewer I/Os may be required. (Caution: Compression tools can also increase CPU consumption due to the overhead of their compress/decompress algorithms.)
  • Sorting tools can be used to sort data prior to loading databases to ensure that rows will be in a predetermined sequence. Additionally, sorting tools can be used in place of ORDER BY or GROUP BY SQL. Retrieving rows from a relational database is sometimes more efficient using SQL and ORDER BY rather than SQL alone followed by a standalone sort of the SQL results set.

The DBA will often need to use these tools in conjunction with one another—integrated and accessible from a central management console. This enables the DBA to perform core performance-oriented and database administration tasks from a single platform.

Many DBMS vendors provide solutions to manage their databases only; for example, Oracle provides Oracle Enterprise Manager, IBM offers Data Studio for DB2, and Microsoft provides SQL Server Management Studio for this purpose. Third-party vendors provide more robust options that act across heterogeneous environments such as multiple different database servers or operating systems. One example is Dell's Toad product family (there are others).

In general, it is only a good idea to use the DBMS vendor solution as your only management tool if your shop has just a single DBMS. Organizations with multiple DBMS engines running across multiple operating systems should investigate the third-party tool vendors with heterogeneous support (perhaps in addition to the single solution tools).

We will take a closer look at some of these types of tools, with a focus on DB2 for z/OS, in upcoming blog posts.

Monday, September 22, 2014

Rules for an Effective DB2 Monitoring Strategy

DB2, and relational databases in general, have a reputation of being (relatively) easy for users to understand; users specify what data to retrieve, not how to retrieve it. The layer of complexity removed for the users, however, had to be relegated elsewhere: to the code of DB2. And that means you sometimes have to dig into technical details of the DB2 optimizer or other arcane details to uncover performance issues.

DB2 also has a reputation as a large resource consumer. This reputation is largely because of DB2’s complexity. Because DB2 performance analysts must understand and monitor this complexity, they require an array of performance monitoring tools and techniques.

But I do not want to get into all of the potential tools and techniques in today’s short post. I plan to talk about the various types of DB2 performance and monitoring solutions that are available in upcoming posts.

Instead, today’s post just covers the high-level components of what is needed for an effective DB2 performance management strategy... An effective monitoring strategy includes the following:
  • Scheduled batch performance reports on the recent performance of DB2 applications and the DB2 subsystem; a history of these reports would be useful, too.
  • An online monitor that executes when DB2 executes to enable quick monitoring of performance problems as they occur.
  • Online monitors for all teleprocessing environments in which DB2 transactions execute (for example, CICS, IMS/TM, or TSO).
  • A monitoring solution that can track and report on dynamic distributed traffic.
  • End-to-end transaction monitoring capability, sometimes called Application Performance Management.
  • SQL query monitoring and explain analysis.
  • Regular monitoring of z/OS for memory use and VTAM for network use.
  • Scheduled reports from the DB2 Catalog and queries run against the RTS tables
  • Access to the DB2 DSNMSTR address space to review console messages.
  • Use of the DB2 -DISPLAY command to view databases, threads, and utility execution.


As I mentioned, I will cover the various types of performance tools and product offerings in upcoming posts. But for now, if you are interested in uncovering more information about third-party performance tools take a look at this link on my web site

Wednesday, September 17, 2014

Plan to Attend IBM Insight 2014

The IBM Insight conference is just around the corner and if you care about DB2, Big Data, Analytics, data warehousing, or really, anything at all about enterprise data, then Las Vegas is the place to be the week of October 26 thru 30, 2014.

But what is IBM Insight? Well, you may remember it as the IBM Information on Demand conference, or IOD for short, Yes, IBM has renamed the conference yet again. I'm sure a lot of you can remember when there were a bunch of different "Technical Conferences" like the IMS Tech Conference or the DB2 Tech Conference. Those all (and several others) got rolled up into IOD... which is now IBM Insight.


IBM claims to have changed the name because "It's no longer just about information; it's what you can do with the information."  And that kinda makes some sense... or at least it does to me!

With that bit of confusion out of the way, why should you attend the IBM Insight 2014 conference? Well, there is something for everybody in a data-related profession. Over 1500 presentations that run the gamut from DB2 to IMS to Cognos to BI to Big Data to analytics to... well, you get the idea. And there are five fast track groups that highlight several of IBM's important initiatives covering:

  • Watson and Cognitive Computing
  • Cloud
  • Security
  • Infrastructure
  • Mobile and Social Engagement



And with over 13,000 attendees the opportunity to network with your peers is unmatched! Add to that the over 350 exhibitors at the Expo hall and you will be able to view, review, and examine all kinds of interesting software to help you manage your enterprise data.


Not to mention the fact that I'll be presenting there on the Top 10 DB2 Things You Need to Know (Wednesday, October 29, at 3:00 pm). Well, OK, I guess I mentioned it.

So what are you waiting for? Register today... and there is still time (expires September 19, 2014) to get the early bird discount of $300 off!