The Db2 Portal Blog: performance

Showing posts with label performance. Show all posts

Tuesday, November 11, 2025

Don't Forget About Histogram Statistics in Db2 for z/OS

There are a lot of options and parameters to consider when putting together your RUNSTATS jobs to collect statistics on your Db2 data. For those who do not know, RUNSTATS is a Db2 for z/OS utility program that is used to collect information about characteristics of tables and indexes, and storing that data in the Db2 Catalog where it can be used by the Db2 Optimizer to formulate SQL access paths.

Without reasonable statistics, as collected by RUNSTATS, you will likely experience SQL performance problems because Db2 will be accessing the data based on incorrect or outdated information about your data. As I have discussed before, RUNSTATS is one of The 5 R's of Db2 Performance.

OK, with that introductory bit out of the way, one capability of RUNSTATS, first introduced in Db2 9 for z/OS is somewhat underutilized. As you might guess from the title of this blog post, I am talking about histogram statistics.

But what is a histogram? Well, in mathematical terms, a histogram is a representation of data that shows the distribution of data in a data set. Often represented using a bar graph, each bar represents the frequency of data points within a specific range. However, unlike a typical bar chart (displaying categorical data), a histogram groups numbers into ranges, making it useful for identifying patterns, such as underlying frequency distribution, outliers, and skew.

So, a histogram is a way of summarizing data that’s measured on an interval scale. A histogram is particularly helpful when you want:

to highlight how data is distributed,
to determine if data is symmetrical or skewed, or
to indicate whether or not outliers exist.

The histogram is appropriate only for variables whose values are numerical and measured on an interval scale. To be complete, let’s define interval, which is a set of real numbers between two numbers either including or excluding one or both of them.

Histograms are generally most useful when dealing with large data sets.

Why Use Histogram Statistics?

Db2 historgram statistics can be quite useful for certain types of data. Instead of the frequency statistics, which are collected for only a subset of the data, sometimes Db2 can improve access path selection by estimating predicate selectivity from histogram statistics, which are collected over all values in a table space.

Histogram statistics can help the Db2 Optimizer estimate filtering when certain data ranges are heavily populated and others are sparsely populated. Think about data that represents customers of a diner. Typically, there will be periods of high activity (breakfast, lunch, dinner) and low activity (other periods). Or perhaps on a grander scale, consider data that represents customer activity at a big box store. During big sales (such a Black Friday) and over holidays (Christmas, for example), there will be more sales than during other periods.

Depending on the type and makeup of the data, it can make sense to collect histogram statistics to improve access paths for troublesome queries used with the following predicates:

RANGE,
LIKE, and
BETWEEN.

Histogram statistics also can help in some cases for equality (=), IS NULL, IN list, and COL op COL predicates.

How to Collect Histogram Statistics

OK, so how can you go about collecting histogram statistics? Well, the IBM RUNSTATS utility can collect statistics by quantiles. A quantile is the name for a bucket representing a range of data. For instance, from our diner example discussed earlier, you might want to have quantiles representing pre-breakfast, breakfast, pre-lunch, lunch, pre-dinner, dinner, and late night.

Db2 allows up to 100 quantiles. You can specify how many quantiles Db2 is to use: from one up to 100. Of course, specifying one isn’t wise because it’s the same as collecting for everything and so it will not be very helpful.

You do not explicitly define the make-up of quantiles, though. Db2 does that. Db2 creates equal-depth histogram statistics, dividing the whole range of values into quantiles that each contain about the same percentage of the total number of rows.

You indicate that RUNSTATS should collect histogram statistics by coding the HISTOGRAM keyword in conjunction with the COLGROUP option. This way you can collect histogram statistics for a group of columns. You also must tell Db2 the number of quantiles to collect by specifying the NUMQUANTILES parameter. NUMQUANTILES also can be specified with the INDEX parameter, in which case it indicates that histogram statistics are to be collected for the columns of the index.

A single value can never be broken into more than one interval. This means the maximum number of intervals is equal to the number of distinct column values. For example, if you have 40 values, you can have no more than 40 quantiles, each consisting of a single value. In other words, be sure that you don’t specify a value for NUMQUANTILES that is greater than the total number of distinct values for the column (or column group) specified. Also, keep in mind that any NULLs will occupy a single interval.

So, then, how do you decide on the number of quantiles to collect? If you don’t specify NUMQUANTILES, the default value of 100 will be used; then based on the number of records in the table, Db2 will adjust the number of quantiles to an optimal number. Therefore, unless you have a good understanding of the application or a viable reason to deviate, a reasonable rule of thumb is to simply let NUMQUANTILES default and let Db2 work it out.

RUNSTATS will attempt to produce an equal-depth histogram. This means each interval will have about the same number of rows. Please note that this doesn’t mean the same number of values—it’s the same number of rows. In some cases, a highly frequent single value could potentially occupy an interval all by itself.

Where Are the Histogram Statistics Stored?

The following columns in a histogram statistics table define the interval:

QUANTILENO - a sequence number that identifies the interval
HIGHVALUE - a value specifying the upper bound for the interval.
LOWVALUE - a value specifying the lower bound for the interval

These columns can be found in the following six Db2 Catalog tables:

SYSCOLDIST,
SYSKEYTGTDIST,
SYSCOLDIST_HIST,
SYSCOLDISTSTATS,
SYSKEYTGTDIST_HIST, and
SYSKEYTGTDISTSTATS.

Here’s a sample RUNSTATS specification that collects histogram statistics for the key columns of the indexes on tables in the CSMTS02 table space:

RUNSTATS TABLESPACE DB.CSMTS02

INDEX ALL

HISTOGRAM NUMCOLS 2 NUMQUANTILES 10

SHRLEVEL(CHANGE)

UPDATE ALL

REPORT YES

Summary

If you have troublesome queries on data that is skewed or disproportionately distributed, you should consider gathering histogram statistics. Keep in mind that running RUNSTATS specifying HISTOGRAM will probably consume more CPU than just gathering frequency distribution statistics, the performance gains from improved access paths for your queries could more than offset that cost.

Monday, October 06, 2025

The Many Types of Mainframe Pricing

Whenever somebody starts talking about software pricing on the mainframe, or tuning to save on their monthly mainframe bill, I always wonder just exactly "what" they are talking about. Are they aware of all their company’s licensing agreements and all the intricacies they contain?

Before we move on, let me just state that the mainframe still makes good economic sense. The total cost of ownership (TCO) for the mainframe, when you add up all of the cost components (such as hardware, software licenses, storage, network, labor, power and cooling), is similar to or lower than comparable platforms.

So, don’t get me wrong, the mainframe is a great place to run your enterprise applications. And I am all in favor of tuning your code, infrastructure, and environment. But there are many reasons for doing so. Of course, the first reason is to improve the experience of your end users. The more efficiently tuned your software and the system it runs on, the more optimal and pleasant the experience of your end users will be. So, improving your customer experience is always a valid reason for tuning and optimizing your software.

That said, sometimes the intention of tuning efforts gets stated as to reduce costs. And that is a noble goal. But quite frequently, it seems, people undertaking the assignment to tune for cost optimization do not have enough information on achieving that goal!

What must be understood if you are looking to reduce mainframe cost by tuning? Well, I think the first thing you need to understand is what comprises your mainframe cost.

At a high level there is the cost of the IBM Z hardware that has to be taken into account. Of course, there is little that you can do to reduce hardware cost once you have purchased your mainframe. That said, by tuning your workloads to avoid reaching the peak utilization capacity of your mainframe you can avoid the need to upgrade to a new machine (a costly endeavor, indeed). This can be achieved by tuning activity to reduce CPU requirements or by moving workload around to smooth out the peaks. This is something that capacity planners should always be looking at and taking measures to achieve.

The next thing you need to understand is the mainframe software licenses that you have in place. This is not as easy as it might sound. Let’s start with the pricing metrics offered by IBM for Monthly License Charge (MLC) products. The IBM MLC products include operating systems, middleware, compilers, and other system software offerings. Examples of MLC products include: z/OS, z/TPF, CICS, Db2, IMS, COBOL, and so on. These products have a recurring charge that is applied each month for the right to use the product and also access IBM product support. Fair enough, and so far, not overly confusing.

But you also must know the specific MLC pricing metric used by your organization. And there are multiple, including:

Advanced Workload License Charges (AWLC)

Country Multiples License Charges (CMLC)

Variable Workload License Charges (VWLC)

Flat Workload License Charges (FWLC)

Advanced Entry Workload License Charges (AEWLC)

Entry Workload License Charges (EWLC)

Tiered Workload License Charges (TWLC)

System z New Application License Charges (zNALC)

Parallel Sysplex License Charges (PSLC)

Midrange Workload License Charges (MWLC)

zSeries Entry License Charges (zELC)

Growth Opportunity License Charges (GOLC)

Each of these metrics has specific requirements and specific conditions for how MSU usage is charged. Some of these metrics, such as AWLC and VWLC, offer sub-capacity licensing. That means that the software charges are based on the utilization capacity of the logical partitions (LPARs) on which the product runs. There are many nuances to how this actually works, but the bottom line is that if you can tune workloads that run during a peak rolling four-hour average (R4HA) period for the month, you can likely reduce your monthly bill.

Of course, there are also full-capacity metrics, where all software charges are determined by the full IBM-rated capacity (MSUs) of the CPC in which a product runs. Examples of full capacity based pricing metrics are PSLC and zELC.

With the sub-capacity pricing metrics above your monthly bill can vary, sometimes substantially, from month-to-month. IBM introduced Tailored-Fit Pricing (TFP) for organizations looking for a predictable monthly bill. The general approach of TFP is to provide a predictable cloud-like pricing option for IBM z software. At a high-level, your overall usage for the last year is reviewed and an estimated increase is factored into the usage. Your monthly bill for the upcoming year is then 1/12 of last year’s total usage (plus the increase).

Then we have to consider International Program License Agreement (IPLA) products, which have an up-front license fee and an optional annual maintenance charge. IPLA products include tools for managing Db2, CICS, IMS, and others. To further complicate matters, at least when trying to tune for cost reduction, some mainframe IPLA products can be licensed at a sub-capacity level.

And let’s not forget Container Pricing for IBM Z, where certain defined workloads can scale without impacting other workloads, either within an existing LPAR, separate LPARs, or multiple LPARs.

The Bottom Line

Given this general information, you can see how the choice of pricing metric will impact the efficacy of your tuning to reduce costs. Before undertaking any mainframe cost optimization effort be sure that you understand the different licenses for IBM Z software that are in effect at your organization. Furthermore, it is essential that you have effective tools for monitoring your CPU utilization, including identify monthly peaks. Armed with sufficient information and proper tools (including tuning and modernization tools), your chances of achieving cost reduction will increase greatly!

---

For additional information on mainframe pricing metrics and how zIIPs can help, check out my book on the topic: IBM Mainframe Specialty Processors: Understanding zIIPs, Licensing, and Cost Savings on the IBM Sytem z.

Wednesday, April 16, 2025

The Role of Db2 DBAs in Promoting zIIP Usage

As most z/OS practitioners know, zIIP processors can provide significant benefits in terms of cost savings and improved performance. And one of the most significant areas for taking advantage of the benefits that zIIPs can provide is within Db2 for z/OS. As such, the role of Db2 DBAs in promoting zIIP usage can be critical.

It is important to note that not all features of Db2 for z/OS can run on zIIP processors, and that the extent to which a particular workload can be offloaded to zIIP processors depends on several factors, including the workload characteristics, system configuration, and the Db2 for z/OS version and licensing.

Of course, the first thing you need to be sure of is that the system is configured to utilize zIIPs. This means you must procure a license from IBM to use zIIPs. But you also need to ensure that you have configured your system appropriately for zIIPs, which is usually done by the system programming team. Configuration issues include:

Ensuring that sufficient zIIP weight is defined for LPARs where Db2 for z/OS workloads run, and
Using simultaneous multi-threading in the z/OS LPAR on z13 or later processors to increase zIIP capacity (when only one zIIP engine can be dedicated to the LPAR)

Once the system is properly configured, Db2 DBAs need to immerse themselves in understanding what type of workloads are zIIP-eligible. IBM documents the authorized zIIP uses for Db2 processing for each version of Db2 for z/OS. DBAs should bookmark this page (for Db2 13 for z/OS) and return to it for clarification as needed.

As a high-level guide, the following Db2 processing can run on zIIPs:

Up to 100% of Db2 system agents processing running under enclave SRBs that execute in the MSTR, DBM1, and DDF address spaces are zIIP-eligible (except for P-lock negotiation). This includes things like buffer pool processing, log reading and writing, index pseudo-deletes and so on. In other words, things that Db2 will be doing as part of its general operation. These are not generally things that DBAs can influence or encourage much one way or the other, but can deliver benefits by offloading work from the general purpose CP to zIIPs.
Up to 60% of distributed SQL that uses DRDA to access Db2 over TCP/IP and native REST calls over HTTP are zIIP-eligible. This can be a significant source for offloading work to zIIPs. DBAs can work with development teams to encourage the use of distributed SQL to further their organization’s usage of zIIPs.
Up to 100% of parallel query child processes can be run on zIIPs, after you have reached a preset CPU threshold (which is defined by IBM for each specific model of IBM Z). DBAs can help to encourage parallelism, where appropriate, to further zIIP usage. This can be done by binding packages using DEGREE(ANY) or by setting CURRENT DEGREE to ANY. Furthermore, since parallelism can only be used by read-only queries, encourage developers to identify appropriate cursors as FOR READ ONLY.
Up to 100% of XML processing for XML schema validation and non-validation parsing, as well as for the deletion of unneeded versions of XML documents. So, if you are using XML in your Db2 databases and applications, certain processing-intensive XML operations can be run on zIIPs.
Many IBM Db2 utility processes are also zIIP-eligible. Up to 100% of the index maintenance tasks for LOAD, REORG, and REBUILD INDEX are zIIP-eligible. And up to 100% of the statistics-gathering portion of RUNSTATS is also zIIP-eligible. Planning and executing IBM Db2 utilities is something else that DBAs can do to encourage zIIPs usage. For example, encourage developers to use the LOAD utility instead of writing programs to load or bulk insert a lot of data whenever possible.
And up to 100% of the SQL AI functions (SQL Data Insights) in Db2 13 for z/OS that is eligible to be run as a parallel query child process are zIIP-eligible. It is a bit more complex than that, because a portion of SQL statements that reference AI functions but are ineligible to be run as a parallel query child process may still be eligible if the SQL request is made through DRDA. Therefore, utilizing, where appropriate, the SQL Data Insights AI functions (AI_ANALOGY, AI_COMMONALITY, AI_SEMANTIC_CLUSTER, and AI_SIMILARITY) can help bolster the usage of zIIP processors.

I would be remiss if I did not mention that other ISVs offer Db2 utilities with varying degrees of zIIP eligibility; for example, BMC Software, Broadcom, and InfoTel. So, if you have Db2 utilities from vendors other than IBM, be sure to consult their documentation for details on their zIIP exploitation and proceed accordingly.

Furthermore, components of other types of system software may be zIIP-eligible, so be sure to investigate and document which products that you regularly use may be able to utilize zIIPs. For example, if you have heavy sorting requirements Precisely’s Syncsort MFSort can offload a good percentage of sort workload to zIIPs.

And do not forget about Java! Applications written in Java can be redirected to run on zIIPs. So, instead of writing a new application in COBOL (or another language that is not zIIP-eligible) consider using Java to create more zIIP eligible workloads. You might even take a look at where and when it makes sense to convert some existing workloads to run on a JVM to expand you zIIP usage.

On-going DBA zIIP Responsibilities

The DBA’s involvement with zIIPs does not end after workload has been made eligible. On-going activity is required to ensure effective zIIP usage. Db2 DBA must:

Monitor performance: Db2 DBAs should monitor system performance to ensure that zIIP processors are being used effectively. This involves tracking zIIP processor utilization, general-purpose CPU utilization, and overall system performance.
Optimize performance: Furthermore, Db2 DBAs can help optimize performance by making changes to the system configuration or adjusting workload placement to improve zIIP processor utilization and overall system performance.

And finally, Db2 DBAs should take it upon themselves to educate other team members about zIIPs, their benefits, and how Db2 workloads can take advantage of zIIPs to reduce cost.

The Bottom Line

In summary, the role of a Db2 DBA in promoting zIIP usage involves identifying eligible workloads, ensuring the system is configured appropriately, monitoring performance, optimizing performance, and educating other team members on the benefits of zIIP processors. It can also include encouraging zIIP usage by educating and training developers on what types of processes are zIIP eligible. By promoting zIIP usage, a Db2 DBA can help improve system performance, reduce costs, and improve overall efficiency on IBM Z mainframes.

Tuesday, December 17, 2024

How About a Db2 Book for the Holidays?

If you are still on the look-out for a gift for the Db2 DBA or developer in your life, have you considered getting them a Db2 book? Technical books can be gifts that keep on giving throughout the year! And you'll be remembered as the kind gift-giver as the reader digests the information in the book, and then comes back to the book for reference as they work!

Technical books serve as vital resources for professionals and students, providing in-depth knowledge, practical guidance, and up-to-date information on Db2 subject areas. They can be used for knowledge expansion, skill development, problem solving, exam preparation, and staying up-to-date in your career.

Here are a few books you might want to consider for the Db2 person in your life.

"DB2 Developer's Guide" offers numerous benefits for database professionals working with Db2 for z/OS. The book provides a comprehensive and practical approach to Db2 development and administration, covering essential topics from basic SQL and database design to advanced performance tuning and application development techniques. Its clear explanations, real-world examples, and best practices make it an invaluable resource for both novice and experienced Db2 developers. By mastering the concepts presented in this guide, developers can design efficient, robust, and high-performing Db2 applications, ultimately improving data management and business processes. The book's focus on practical application and problem-solving makes it a highly effective tool for enhancing Db2 development skills and optimizing database performance.
Another useful book, particularly for application programmers and developers, is "A Guide to Db2 Performance for Application Developers" which offers invaluable insights for developers seeking to optimize Db2 database performance. This book focuses on how coding practices impact Db2's efficiency. It offers practical guidance on writing efficient SQL and designing effective data access strategies. By understanding these principles, developers can avoid common performance pitfalls, reduce resource consumption, and improve application responsiveness. The purpose of this book is to give advice and direction to Db2 application developers and programmers on writing efficient, well-performing programs. The material is written for all Db2 professionals, whether you are coding on z/OS (the mainframe) or on Linux, Unix or Windows (distributed systems). When there are pertinent differences between the platforms it is explained in the text. This guide empowers developers to proactively contribute to database performance optimization, leading to faster applications, reduced costs, and improved user experiences.
If you are looking for a book for the mainframe professional in your life, consider "IBM Mainframe Specialty Processors: Understanding zIIPs, Licensing, and Cost Savings on the IBM System z." This book will clarify the purpose of specialty processors and how you can best utilize them for cost optimization. The book provides a high-level overview of pertinent mainframe internals such as control blocks, SRBs, and TCBs, and why they are important for understanding how zIIPs work. Additionally, because reducing mainframe software cost is essential to the purpose of specialty processors, the book proivdes a high-level introduction to understanding mainframe licensing and pricing. The book describes the types of workloads that can take advantage of specialty processors, including advice on how to promote zIIP usage in your applications and systems. Read a review of the book here.
And finally, consider gifting "The Tao of Db2" to the Db2 DBA in your life. This short, low-cost but insightful book offers guidance on how to manage Db2 properly to achieve harmonious systems and applications that deliver quality and performance. It follow the exploits of a seasoned DBA and his intern as they learn "the way" of Db2 database management and administration. Learn along with them and improve your Db2 administration chops!

I want to wrap up this post by wishing all of my readers a very happy holiday season... and I hope you will consider grabbing at least one of these Db2-related books for the techie in your life... or even as a gift for yourself.

Monday, November 11, 2024

5 Big Concerns of Modern IT When Using Db2 for z/OS

Db2 for z/OS is an entrenched solution for managing data at the world's largest organizations. It is a strong, reliable DBMS and I wrote about its strength recently on the blog (here). You really cannot go wrong using Db2 for z/OS for mission-critical workloads.

That said, there are concerns and issues facing organizations using Db2 for z/OS. One of the biggest concerns with Db2 for z/OS today is managing the cost and complexity of maintaining mainframe environments while still delivering high availability and performance.

As such, here are 5 specific concerns facing large organizations using Db2 for z/OS today:

Skill Shortages: Many mainframe experts, especially those with deep Db2 for z/OS knowledge, are approaching retirement, creating a significant skills gap. The lack of trained professionals has made it challenging to manage and maintain Db2 for z/OS systems effectively.
Cost of Licensing and Maintenance: Mainframe systems come with substantial licensing costs. Many organizations are looking for ways to optimize usage or even repatriate workloads to more cost-effective platforms, where feasible, to reduce operational expenses. Whether or not such changes result in "actual" cost reductions is unfortunately irrelevant as many executives believe it will regardless of reality and studies to the contrary.
Integration with Modern Architectures: As companies adopt cloud, big data, and other modern architectures, integrating Db2 for z/OS with these systems can be complex and costly. Many seek seamless data integration between Db2 on mainframes and newer platforms like data lakehouses, which involves architectural and technological challenges.
Automation and DevOps Compatibility: Modern IT environments emphasize agility, continuous integration, and deployment, but the mainframe environment traditionally doesn’t integrate well with DevOps practices. Nevertheless, many companies are pushing for Db2 automation tools and integration with DevOps workflows to streamline operations and reduce manual workloads... and DevOps is being successfully deployed by mainframe organizations today using Zowe and other traditional DevOps tooling.
Performance and Availability: High performance and availability are always top concerns, especially as organizations process more data and need to meet stringent SLAs. Handling lock contention, optimizing query performance, and scaling resources efficiently continue to be challenges. But, to be fair, these are challenges with many DBMS implementations, not just Db2 for z/OS.

Organizations are adopting several strategies to address the challenges with Db2 for z/OS and ensure their mainframe environments remain relevant and efficient:

Workforce Development and Knowledge Transfer: To counter skill shortages, organizations are investing in training and upskilling initiatives for new IT staff, partnering with universities, or using mentoring programs to transfer knowledge from retiring mainframe experts to newer employees. Additionally, some companies are leveraging consulting firms or managed services providers with mainframe expertise to fill gaps temporarily.
Cost Optimization with Usage Analytics: Companies are using detailed workload and resource monitoring tools to optimize Db2 for z/OS usage, identify inefficient processes, and reduce costs. This includes tuning queries, scheduling batch jobs during off-peak hours, and leveraging IBM’s Workload Manager (WLM) to prioritize workloads based on business needs.
Hybrid Cloud and Data Lakehouse Integrations: To manage integration with modern architectures, organizations are implementing hybrid cloud strategies and data lakehouses that can interface with Db2 for z/OS. Tools such as IBM Db2 Analytics Accelerator allow data stored on Db2 for z/OS to be offloaded to faster, scalable platforms, enabling integration with big data and analytics environments without entirely migrating off the mainframe.
Automation and DevOps Integrations: Organizations are investing in DevOps and automation tools compatible with Db2 for z/OS, such as IBM UrbanCode and mainframe DevOps solutions from other ISVs such as Broadcom and BMC Software. By automating routine tasks like provisioning, patching, and deploying schema changes, organizations can adopt more agile, efficient processes. Integrating Db2 for z/OS with CI/CD pipelines helps streamline development workflows, bridging mainframe operations with modern DevOps practices. For more details on integrating Db2 for z/OS into DevOps, consult this blog post that highlights several posts I wrote on the topic!
Mainframe Modernization with AI and Machine Learning: Using AI and machine learning to optimize Db2 for z/OS operations is becoming common. AI-based monitoring tools, such as IBM’s Watson AIOps, can predict system issues and detect anomalies to prevent downtime. Machine learning algorithms can also be used for capacity planning, workload optimization, and tuning Db2 performance parameters, helping reduce manual intervention.
Resilience and High Availability Improvements: For performance and availability, companies are implementing high-availability solutions like IBM Geographically Dispersed Parallel Sysplex (GDPS) to ensure continuous uptime. They’re also using backup automation and disaster recovery solutions tailored for Db2 to meet stringent SLAs and minimize downtime in case of failures.

By combining these strategies, organizations are better equipped to manage the costs, complexity, and skills required to maintain and modernize Db2 for z/OS environments in today’s rapidly evolving IT landscape.

Thursday, September 19, 2024

Db2 for z/OS: The Performance and Management Champion!

Usually, posts I write for this blog focus on technical details, tips, and techniques for better using and optimizing your experience with Db2. Today, I want to do something a little different. You see, I am a big fan of Db2 for z/OS, and I do not see it getting the press, or the accolades that I think it is due. So I am going to use my platform to shout out the performance benefits of Db2 for z/OS.

When it comes to performance, nothing beats Db2 for z/OS. This mainframe database has been setting the standard for decades, delivering unmatched speed and efficiency for mission-critical applications. Let's explore some of the reasons why Db2 for z/OS is the performance champion.

Hardware Acceleration

z/Architecture: Db2 for z/OS takes full advantage of the powerful z/Architecture, which includes specialized hardware for database operations. This hardware acceleration provides a significant performance boost for tasks like query processing and data loading.
Storage Subsystem: The mainframe's storage subsystem is designed for high performance and reliability. With features like z/Hyperlink, data compression, and flash storage, Db2 for z/OS can access data quickly and efficiently.
IDAA: IBM Db2 Analytics Accelerator is a high-performance, in-memory database appliance designed to accelerate analytic workloads. It's optimized for large-scale data analysis tasks, providing significant speedups compared to traditional disk-based databases. By leveraging solid-state drives (SSDs) and advanced hardware architecture, IDAA can handle complex queries and data manipulations with exceptional efficiency. This makes it ideal for applications requiring real-time analytics, data warehousing, and big data processing.

Database Optimization

Query Optimization: Db2 for z/OS has a sophisticated query optimizer that can automatically select the most efficient execution plan for your queries. This ensures that your applications run as fast as possible.
Data Compression: Db2 for z/OS supports data compression, which can reduce storage requirements and improve performance. By compressing data, Db2 can reduce the amount of data that needs to be read and processed.
Parallel Processing: Db2 for z/OS can take advantage of multiple processors to perform tasks in parallel. This can significantly improve performance for large workloads.
AI: IBM Db2 AI for z/OS integrates autonomics to simplify database management efforts. Using machine learning and AI, it can help improve operational performance and maintain Db2 for z/OS efficiency and health while enhancing Db2 for z/OS performance, reliability and cost effectiveness–even under the most demanding circumstances.

Workload Management

Resource Allocation: Db2 for z/OS provides powerful tools for managing resources and ensuring that your database applications get the resources they need to perform optimally.
Workload Balancing: Db2 can automatically balance workloads across multiple systems to ensure that resources are used efficiently.
WLM: Workload Manager is an integrated, critical component of z/OS that is used for optimizing the performance and resource utilization of Db2 for z/OS. It provides a comprehensive framework for managing workloads across the mainframe environment, ensuring that Db2 applications receive the resources they need to perform optimally.

Data Sharing and Parallel Sysplex

Finally, Data Sharing using IBM Z Parallel Sysplex confers a significant advantage onto Db2 for z/OS in that it can enhanced availability by providing inherent redundancy, as multiple subsystems can access the same data. This helps to mitigate the impact of hardware failures or system outages. And in case of a disaster, data sharing can facilitate rapid recovery by allowing applications to access data from a different subsystem.

Furthermore, Data Sharing enhances scalability by enabling workloads to be distributed across multiple subsystems, improving scalability and preventing bottlenecks. It facilitates simpler growth: as data volumes and application demands increase, data sharing can help to accommodate growth without requiring significant hardware investments.

And Data Sharing can improve performance. By allowing multiple Db2 subsystems to access the same data without requiring individual copies, data sharing significantly reduces I/O operations, leading to improved performance. And with data readily available to multiple subsystems, queries can be executed more quickly, resulting in faster response times for applications.

So, IBM Z data sharing on Db2 offers a range of benefits, including improved performance, enhanced availability, increased scalability, reduced costs, and simplified management. These benefits make it a valuable feature for organizations that require high-performance, reliable, and scalable database solutions.

Real-World Results

Organizations around the world rely on Db2 for z/OS to power their most critical applications. From financial services to healthcare, Db2 has proven its ability to deliver the performance and reliability that businesses need to succeed.

So, if you're looking for a database that can handle your most demanding workloads and deliver exceptional performance, Db2 for z/OS is the way to go.

Thursday, July 25, 2024

Coding Db2 Applications for Performance - Expert Videos Series

Today's blog post is to share with my readers that I have partnered with Interskill Learning and produced a series of videos in the Expert Video Series on how to code Db2 applications for performance.

My regular readers know that application performance is a passion of mine. You may also have read my recent book on the topic, A Guide to Db2 Performance for Application Developers. But if you are looking for videos to guide you through the process optimizing your application development for Db2, look no further than the six-part series I recorded for Interskill Learning, Coding Db2 Applications for Performance.

You do not need in-depth pre-existing knowledge of Db2 to gain insight from these video lessons. The outline of the six courses are as follows:

Db2 Coding – Defining Database Performance

Providing a Definition
The Four Components
Diving a Little Deeper

Db2 Coding – Coding Relationally

What is Relational?
Relational vs. Traditional Thinking
What Does It Mean to Code Relationally?
Unlearning Past Coding Practices

Db2 Coding – General SQL and Indexing Guidelines

Types of SQL
SQL Coding Best Practices
Indexes and Performance
Stages and Clustering

Db2 Coding – Coding for Concurrent Access

Introduction to Concurrency
Locking
Locking Duration and Binding
Locking Issues and Strategies
Query Parallelism

Db2 Coding – Understanding and Reviewing Db2 Access Paths

Single Table Access Paths
Multi-table Access Paths
Filter Factors
Access Paths and EXPLAIN

Db2 Coding – SQL Coding Tips and Techniques

Avoid Writing Code
Reusable Db2 Code
Dynamic and Static SQL
SQL Guidelines
Set Operations

So if you are looking for an introduction to Db2 performance or want to brush up on the fundamentals of coding for performance, look no further. Check out this series of videos on Coding Db2 Applications for Performance from Interskill Learning (featuring yours truly)!

Note that Interskill Learning also offers other categories of training in their Expert Video series including systems programming, quantum computing, and pervasive encryption.

Thursday, June 20, 2024

The Basics of Coding Db2 SQL for Performance

When it comes to assuring optimal performance of Db2 applications, coding properly formulated SQL is an imperative. Most experts agree that poorly coded SQL and application code is the cause of most performance problems – perhaps as high as 80% of poor relational performance is caused by “bad” SQL and application code.

But writing efficient SQL statements can be a tricky proposition. This is especially so for programmers and developers new to a relational database environment. So, before we delve into the specifics of coding SQL for performance, it is best that we take a few moments to review SQL basics.

SQL, an acronym for Structured Query Language, is a powerful tool for manipulating data. It is the de facto standard query language for relational database management systems and is used not just by Db2, but also by the other leading RDBMS products such as Oracle, Sybase, and Microsoft SQL Server.

SQL is a high-level language that provides a greater degree of abstraction than do procedural languages. Most programming languages require that the programmer navigate data structures. This means that program logic needs to be coded to proceed record-by-record through data elements in an order determined by the application programmer or systems analyst. This information is encoded in the program logic and is difficult to change after it has been programmed.

SQL, on the other hand, is fashioned so that the programmer can specify what data is needed, and not how to retrieve it. SQL is coded without embedded data-navigational instructions. Db2 analyzes the SQL and formulates data-navigational instructions "behind the scenes." These data-navigational instructions are called access paths. By having the DBMS determine the optimal access path to the data, a heavy burden is removed from the programmer. In addition, the database can have a better understanding of the state of the data it stores, and thereby can produce a more efficient and dynamic access path to the data. The result is that SQL, used properly, can provide for quicker application development.

Another feature of SQL is that it is not merely a query language. The same language used to query data is used also to define data structures, control access to the data, and insert, modify, and delete occurrences of the data. This consolidation of functions into a single language eases communication between different types of users. DBAs, systems programmers, application programmers, systems analysts, and end users all speak a common language: SQL. When all the participants in a project are speaking the same language, a synergy is created that can reduce overall system-development time.

Arguably, though, the single most important feature of SQL that has solidified its success is its capability to retrieve data easily using English-like syntax. It is much easier to understand the following than it is to understand pages and pages of program source code.

SELECT LASTNAME

FROM EMP

WHERE EMPNO = '000010';

Think about it; when accessing data from a file the programmer would have to code instructions to open the file, start a loop, read a record, check to see if the EMPNO field equals the proper value, check for end of file, go back to the beginning of the loop, and so on.

SQL is, by nature, quite flexible. It uses a free-form structure that gives the user the ability to develop SQL statements in a way best suited to the given user. Each SQL request is parsed by the DBMS before execution to check for proper syntax and to optimize the request. Therefore, SQL statements do not need to start in any given column and can be strung together on one line or broken apart on several lines. For example, the following SQL statement is equivalent to the previously listed SQL statement:

SELECT LASTNAME FROM EMP WHERE EMPNO = '000010';

Another flexible feature of SQL is that a single request can be formulated in a number of different and functionally equivalent ways. One example of this SQL capability is that it can join tables or nest queries. A nested query always can be converted to an equivalent join. Other examples of this flexibility can be seen in the vast array of functions and predicates. Examples of features with equivalent functionality are:

· BETWEEN versus <= / >=

· IN versus a series of predicates tied together with AND

· INNER JOIN versus tables strung together in the FROM clause separated by commas

· OUTER JOIN versus a simple SELECT, with a UNION, and a correlated subselect

· CASE expressions versus UNION ALL statements

This flexibility exhibited by SQL is not always desirable as different but equivalent SQL formulations can result in extremely differing performance. The ramifications of this flexibility are discussed later in this paper with guidelines for developing efficient SQL.

As mentioned, SQL specifies what data to retrieve or manipulate, but does not specify how you accomplish these tasks. This keeps SQL intrinsically simple. If you can remember the set-at-a-time orientation of a relational database, you will begin to grasp the essence and nature of SQL. A single SQL statement can act upon multiple rows. The capability to act on a set of data coupled with the lack of need for establishing how to retrieve and manipulate data defines SQL as a non-procedural language.

Because SQL is a non-procedural language a single statement can take the place of a series of procedures. Again, this is possible because SQL uses set-level processing and DB2 optimizes the query to determine the data-navigation logic. Sometimes one or two SQL statements can accomplish tasks that otherwise would require entire procedural programs to do.

High-Level SQL Coding Guidelines

When you are writing your SQL statements to access Db2 data be sure to follow the subsequent guidelines for coding SQL for performance. These are certain very simple, yet important rules to follow when writing your SQL statements. Of course, SQL performance is a complex topic and to understand every nuance of how SQL performs can take a lifetime. That said, adhering to the following simple rules puts you on the right track to achieving high-performing Db2 applications.

1) The first rule is to always provide only the exact columns that you need to retrieve in the SELECT-list of each SQL SELECT statement. Another way of stating this is “do not use SELECT *”. The shorthand SELECT * means retrieve all columns from the table(s) being accessed. This is fine for quick and dirty queries but is bad practice for inclusion in application programs because:

· Db2 tables may need to be changed in the future to include additional columns. SELECT * will retrieve those new columns, too, and your program may not be capable of handling the additional data without requiring time-consuming changes.

· Db2 will consume additional resources for every column that requested to be returned. If the program does not need the data, it should not ask for it. Even if the program needs every column, it is better to explicitly ask for each column by name in the SQL statement for clarity and to avoid the previous pitfall.

2) Do not ask for what you already know. This may sound simplistic, but most programmers violate this rule at one time or another. For a typical example, consider what is wrong with the following SQL statement:

SELECT EMPNO, LASTNAME, SALARY

FROM EMP

WHERE EMPNO = '000010';

Give up? The problem is that EMPNO is included in the SELECT-list. You already know that EMPNO will be equal to the value '000010' because that is what the WHERE clause tells DB2 to do. But with EMPNO listed in the WHERE clause Db2 will dutifully retrieve that column too. This causes additional overhead to be incurred thereby degrading performance.

3) Use the WHERE clause to filter data in the SQL instead of bringing it all into your program to filter. This too is a common rookie mistake. It is much better for Db2 to filter the data before returning it to your program. This is so because Db2 uses additional I/O and CPU resources to obtain each row of data. The fewer rows passed to your program, the more efficient your SQL will be. So, the following SQL

SELECT EMPNO, LASTNAME, SALARY

FROM EMP

WHERE SALARY > 50000.00;

Is better than simply reading all of the data without the WHERE clause and then checking each row to see if the SALARY is greater than 50000.00 in your program.

These rules, though, are not the be-all, end-all of SQL performance tuning – not by a long shot. Additional, in-depth tuning may be required. But following the above rules will ensure that you are not making “rookie” mistakes that can kill application performance.

In Closing

This short blog post is just the very beginning of SQL performance for Db2 programmers. Indeed, I wrote a book on the topic called A Guide to Db2 Performance for Application Developers, so check that out if this post has whetted your appetite for more Db2 performance tips... and if you are a more visual learner, I have also partnered with Interskill Learning for a series of videos in their Expert Video series on the topic of Coding Db2 Applications for Performance. So, why wait, dig in to a book, some videos, or both, to help improve the performance of your Db2 applications!