I/O operations play a critical role in the performance of an IBM DB2 database and the applications that access it. Efficient disk I/O can significantly impact query response times and overall system performance. Let’s explore the importance of I/O optimization and discuss techniques for maximizing disk I/O performance.
The performance of I/O directly affects the speed at which
data is read from or written to disk. Slow or inefficient I/O can become a
performance bottleneck, leading to increased query execution times and degraded
system responsiveness.
Data Set Placement
One technique for optimizing disk I/O is strategic data
placement. Placing frequently accessed data on faster storage devices, such as
solid-state drives (SSDs) or high-performance disk arrays, can significantly
improve I/O performance. Identifying hot spots, such as frequently accessed
tables or indexes, and ensuring they are stored on optimal storage
media can help minimize I/O bottlenecks and accelerate query processing.
When it comes to the placement of Db2 data sets, the most critical
consideration is certainly the log data sets and the BSDS. Be sure to place the
active logs and BSDS data sets on the fastest disk you have at your disposal.
The log can quickly become a bottleneck because all changes (with few
exceptions) are written to the logs. Db2 writes first to log buffers and
then from the buffers to an active log data set. The log is there to ensure
recoverability. So, using the highest-speed disk for your logs can help to
minimize the potential for pain here.
You also need to take care to make sure that your logs are efficient
and effective by running with dual logs. By doing so you ensure that if one
copy of the log is damaged, Db2 can continue operating with the other copy. If
there is no log available Db2 operations screech to a halt. To run with dual
logging, one set of log data sets should be on separate volumes than the other
set. This should remove a disk failure as the cause of an outage.
As the active logs are filled, a process called log
offloading is invoked by Db2 to move the log information to archive log data
sets. This process prohibits the active logs from filling up during Db2
processing which would stifle processing. The archive logs are needed for
recovery situations, such as rolling back transactions and recovering table
spaces and indexes.
It is a wise course of action to keep your archive logs on
disk, and over time migrate them to tape. Think about the trade-off in saving
tape versus recovery speed, and your decision should be obvious. Before backing
up any logs to tape, it’s a good idea to always have at least 24 hours covered
by the active log and the least 48 hours by archive logs on disk.
And then there is the BSDS (bootstrap data set), which is a
VSAM key-sequenced data set (KSDS). The BSDS contains important information for Db2
operations. One such piece of information is the names of the logs. Db2 uses
information in the BSDS for system restarts and for any activity that requires
reading the log. So again, keeping the BSDS on high-speed disk further
boosts the operational performance of Db2.
Of course, there are other important Db2 data sets that
you will want to carefully consider placing on optimal I/O devices. For example, the Db2 Directory (DSNBD01) and Db2
Catalog (DSNBD06) data sets are frequently accessed and should be cared for to ensure
the performance of the entire Db2 subsystem. And you may have some user database
objects that are critically important or frequently accessed that you can place
on faster disk than your other user database objects. This, of course, requires
an understanding of your business and Db2 applications, as well as the
appropriate budget to accomplish.
The overall goal of data set placement should be to balance
I/O activity between different volumes, control units, and channels. This can minimize
I/O elapsed time and I/O queuing.
RAID
RAID (Redundant Array of Independent Disks) configurations
also play a significant role in I/O optimization. RAID levels, such as RAID 0,
RAID 1, RAID 5, or RAID 10, offer different levels of performance, fault
tolerance, and data protection. Configuring RAID appropriately based on
workload requirements and system resources can improve I/O performance and
provide the desired level of data redundancy. RAID striping, in particular, can
distribute data across multiple disks, enabling parallel I/O operations and
enhancing throughput.
What About Extents?
Some folks think “With RAID/modern storage devices and the latest
Db2 and z/OS features, extents are no longer anything to worry about.” And this
is a largely true statement, but there are some things you still should think about with regard to extents.
First, keep in mind that the latest extent management features
only work with SMS-managed data sets, so if you are still using user-managed
data sets then all of the old rules apply! For SMS-managed data set you can
have up to 123 extents on each of 59 volumes. So as of z/OS 1.7, the limit is
7,257 extents for a data set instead of the 255 we’ve been used to for some
time. Again though, to enable this requires DFSMS.
Extent consolidation also requires SMS-managed STOGROUPs. If
a new extent is adjacent to an old extent, they will be merged together
automatically. This can result in some extents being larger than the PRIQTY or
SECQTY specification(s). Note that this feature was introduced in z/OS 1.5.
So, what if everything is SMS-controlled? Even then it is
possible for extents to impact performance, albeit probably only slightly. Each
extent on a disk file has different control blocks controlling access. This
means that elapsed time can increase if there is heavy insert activity. For
other types of processing (read and update) the number of extents really does
not impact on performance.
Another thing to consider is that Db2 can allocate sliding
scale secondary extents. This is enabled by setting MGEXTSZ DSNZPARM to YES.
Note that the default is YES as of DB2 9 for z/OS. With sliding scale extents
the extent sizes allocated gradually increase. Db2 uses a sliding scale for
secondary extent allocations of table spaces and indexes when:
- You do not specify a value for the SECQTY option of a CREATE TABLESPACE or CREATE INDEX statement
- You specify a value of -1 for the SECQTY option of an ALTER TABLESPACE or ALTER INDEX statement.
Otherwise, Db2 uses the SECQTY value for secondary extent
allocations, if one is explicitly specified (and the SECQTY value is larger
than the value that is derived from the sliding scale algorithm). If the table
space or index space has a SECQTY greater than 0, the primary space allocation
of each subsequent data set is the larger of the SECQTY setting and the value
that is derived from a sliding scale algorithm. Without going into all of the
gory details, sliding scale extent allocation can help to reduce the number of
extents for your Db2 objects as they grow in size over time. And it can help
when you do not have a firm understanding of how your data will grow over time.
At any rate, things are not like the olden days where you
had to regularly monitor extents and clean them up all the time by reorganizing
your table spaces and index spaces.
Additional I/O Considerations
Optimizing I/O parallelism settings can help to improve disk
I/O performance. Tuning I/O parallelism settings, such as the number of
concurrent I/O operations or I/O thread configuration, can also enhance I/O
performance by allowing simultaneous access to multiple disks or storage
devices.
Regular monitoring of I/O performance metrics, such as disk
read/write rates, queue lengths, and average response times, is essential for
identifying potential I/O bottlenecks and fine-tuning the I/O subsystem.
Analyzing performance data and workload patterns can help identify areas for
improvement and guide decision-making regarding data placement, file system
selection, RAID configuration, and other I/O optimization techniques.
Conclusion
In conclusion, optimizing disk I/O is vital for improving
the performance of Db2 databases and the applications that access them. By
strategically placing data, properly configured your logs and BSDS, configuring
RAID appropriately, tuning I/O parallelism settings, and regular monitoring,
you can enhance I/O performance, reduce latency, and accelerate query
processing.