
Thursday, July 06, 2023

Top 10 Db2 Performance Tips - No. 2: Optimize Database Design

A well-designed database schema forms the foundation of a high-performing and efficient IBM Db2 database... and therefore, also serves as the basic starting point for efficient Db2 applications. The importance of optimizing the database design cannot be overstated, as it directly impacts query performance, data integrity, and overall system efficiency.

The Logical Data Model

The first step toward a proper database design is the creation of a logical data model. Before implementing databases of any sort, it is imperative to first develop a sound model of the data to be used. Novice database developers frequently begin with the quick-and-dirty approach to database implementation. They approach database design from a programming perspective. Because novices often lack experience with databases and data requirements gathering, they attempt to design databases like the flat files they are accustomed to using. This is a major mistake. Indeed, most developers using this approach quickly discover problems after the databases and applications become operational in a production environment. At a minimum, performance will suffer and data may not be as readily available as required. At worst, data integrity problems and/or performance problems may arise, rendering the entire application unusable.

The goal of a data model is to record the data requirements of a business process. The scope of the data model for each line of business must be comprehensive. A data model serves as a lexicon for the data needs of the business... and as a blueprint for the physical implementation of the structures of the database.

A key component of building a proper data model is to ensure proper normalization. 

Normalization

Normalization reduces data redundancy and inconsistencies by ensuring that the data elements are designed appropriately. A series of normalization rules are applied to the entities and data elements, each of which is called a “normal form.” If the data conforms to the first rule, the data model is said to be in “first normal form,” and so on.

A database design in First Normal Form (1NF) will have no repeating groups and each instance of an entity can be identified by a primary key. For Second Normal Form (2NF), instances of an entity must not depend on anything other than the primary key for that entity. Third Normal Form (3NF) removes data elements that do not depend on the primary key. If the contents of a group of data elements can apply to more than a single entity instance, those data elements belong in a separate entity.

This is a quick-and-dirty introduction to normalization; there are further normal forms not discussed here in order to keep the discussion moving along. For an introductory discussion of normalization, visit http://wdvl.com/Authoring/DB/Normalization.

The bottom line is that normalization reduces data redundancy and improves data integrity by organizing data into logical entities and minimizing data duplication. By carefully analyzing the business requirements and applying normalization principles, database designers can create tables that are lean, efficient, and accurately represent the data model.
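
To make this concrete, here is a minimal sketch, using a hypothetical order-entry example, of how moving a repeating group into its own entity satisfies first normal form. The table and column names are illustrative only, not part of any real schema.

    -- Unnormalized design: the repeating group of item columns
    -- violates first normal form (1NF)
    CREATE TABLE ORDERS_FLAT
      (ORDERNO    INTEGER      NOT NULL PRIMARY KEY,
       CUSTNAME   VARCHAR(40)  NOT NULL,
       ITEM1_DESC VARCHAR(40),
       ITEM1_QTY  SMALLINT,
       ITEM2_DESC VARCHAR(40),
       ITEM2_QTY  SMALLINT);

    -- Normalized design: the repeating group becomes its own entity,
    -- identified by the order number plus a line number
    CREATE TABLE ORDERS
      (ORDERNO    INTEGER      NOT NULL PRIMARY KEY,
       CUSTNAME   VARCHAR(40)  NOT NULL);

    CREATE TABLE ORDER_ITEMS
      (ORDERNO    INTEGER      NOT NULL,
       LINENO     SMALLINT     NOT NULL,
       ITEM_DESC  VARCHAR(40)  NOT NULL,
       ITEM_QTY   SMALLINT     NOT NULL,
       PRIMARY KEY (ORDERNO, LINENO));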

Relationships

Optimizing relationships between tables is another critical aspect of database design. Relationships, such as primary key-foreign key associations, define the logical connections between tables. This, too, should be evident in the logical data model, which is frequently depicted as an entity/relationship diagram.

Choosing appropriate indexing strategies, enforcing referential integrity, and carefully considering the cardinality and selectivity of relationships are crucial steps to ensure efficient query processing and join operations.

From Logical to Physical

Assuming you have a well-designed logical data model, the next task is to transform that logical data model into an actual physical database. The first step is to create an initial physical data model by translating the logical data model into a physical implementation based on an understanding of the DBMS being used for deployment. To successfully create a physical database design you will need a good working knowledge of the features of the DBMS, including:

  • In-depth knowledge of the database objects supported by the DBMS and the physical structures and files required to support those objects.

  • Details regarding the manner in which the DBMS supports indexing, referential integrity, constraints, data types, and other features that augment the functionality of database objects.

  • Detailed knowledge of new and obsolete features for particular versions or releases of the DBMS to be used.

  • Knowledge of the DBMS configuration parameters that are in place.

  • Data definition language (DDL) skills to translate the physical design into actual database objects.

Armed with the correct information, you can create an effective and efficient database from a logical data model. The first step in transforming a logical data model into a physical model is to perform a simple translation from logical terms to physical objects. Of course, this simple transformation will not result in a complete and correct physical database design – it is simply the first step. The transformation consists of the following:

  • Transforming entities into tables

  • Transforming attributes into columns

  • Transforming domains into data types and constraints

Data Types

To support the mapping of attributes to table columns you will need to map each logical domain of the attribute to a physical data type and perhaps additional constraints. In a physical database, each column must be assigned a data type. 

Selecting appropriate data types is vital for optimizing database design. Choosing the right data types can have a significant impact on storage requirements, query performance, and overall system efficiency. Selecting data types that accurately represent the data and minimize storage overhead, such as using integer types instead of character types for numeric values, assuring that date and time data use appropriate date/time data types, and choosing wisely among the various text and character data types for each column, helps to improve data integrity, optimize storage utilization, and improve query execution speed.
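
As a minimal sketch (the table and column definitions here are hypothetical), compare a design that maps every domain to a character type with one that uses precise data types:

    -- Loose typing: numeric and date domains stored as characters
    -- waste space, allow invalid values, and hide information from
    -- the optimizer
    CREATE TABLE EMP_LOOSE
      (EMPNO     CHAR(6)      NOT NULL,
       SALARY    CHAR(9),
       HIREDATE  CHAR(10));

    -- Precise typing: each domain maps to an appropriate data type
    CREATE TABLE EMP_TYPED
      (EMPNO     CHAR(6)      NOT NULL,
       SALARY    DECIMAL(9,2),
       HIREDATE  DATE);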

Constraints

You will also need to implement appropriate constraints, such as primary keys, unique constraints, and foreign keys. This enhances data integrity and query performance. Additional constraints, such as check constraints and nullability, enable the DBMS to enforce data integrity itself, instead of leaving it to application code written at a later time. Constraints enforce data consistency rules, ensure referential integrity, and provide the optimizer with valuable information for query optimization.
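
As a sketch of what this looks like in DDL (the column definitions are illustrative, and the foreign key assumes a DEPT table keyed on DEPTNO):

    -- Primary key, foreign key, check constraint, and nullability
    -- rules declared in the DDL so the DBMS enforces them
    CREATE TABLE EMP
      (EMPNO     CHAR(6)       NOT NULL,
       LASTNAME  VARCHAR(20)   NOT NULL,
       DEPTNO    CHAR(3),
       SALARY    DECIMAL(9,2),
       PRIMARY KEY (EMPNO),
       FOREIGN KEY (DEPTNO) REFERENCES DEPT (DEPTNO),
       CONSTRAINT CHK_SALARY CHECK (SALARY >= 0));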

An Iterative Process

It is worth mentioning that database design optimization is an iterative process that should consider not only the current requirements but also the future growth and scalability of the system. Regularly reviewing and revisiting the database design as the application evolves can help identify areas for improvement and ensure that the database remains optimized over time.

Finally...

In conclusion, a well-designed database schema is fundamental to achieving optimal performance for your database applications. By focusing on strategies such as normalization, relationship optimization, appropriate data types, and constraints, database designers can create a robust and efficient database environment. Optimizing the database design not only enhances query performance and data integrity but also lays the groundwork for scalability and adaptability as the system evolves.

Monday, June 26, 2023

Top 10 Db2 Performance Tips - No. 1: Analyze Query Execution Plans

Query execution plans play a crucial role in identifying performance bottlenecks within an IBM Db2 database. When a query is executed, the database optimizer determines the most efficient way to retrieve the requested data by generating a query execution plan. It does this by parsing SQL requests and turning them into actionable commands for accessing and modifying data. 

So the query execution plan outlines the steps and operations involved in executing the query, including accessing tables, applying filters, joining data, and performing sorting or aggregations.

Understanding query execution plans is important because they reveal the inner workings of how Db2 will process the query. And they also highlight areas where performance can be improved if you understand the various ways that data can be processed by Db2. By analyzing query execution plans, you can identify potential inefficiencies and take proactive steps to optimize queries for better performance.

For example, one high-level performance tuning step that you can identify using a query execution plan is to find table (space) scans that would be better off using an index instead.

Plan Tables

But how can you review and analyze query execution plans? Perhaps the first method used by most folks is to examine the plan tables, chief among them the PLAN_TABLE. The PLAN_TABLE is simply a specific table populated by EXPLAIN with information about the choices made by the Db2 optimizer when it formulates a query execution plan. The data in the PLAN_TABLE provides detailed information about the access paths chosen, join strategies employed, and other relevant details that impact query performance.

As long as EXPLAIN YES is specified when you bind your program, or you explicitly issue an EXPLAIN PLAN statement, information about the query execution plan will be placed in the PLAN_TABLE.
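
For example, for dynamic SQL you can populate the PLAN_TABLE directly with an EXPLAIN statement. A minimal sketch (the query number and the SQL being explained are arbitrary choices for illustration):

    -- Capture the access path for one statement under QUERYNO 100
    EXPLAIN PLAN SET QUERYNO = 100 FOR
      SELECT   LASTNAME, SALARY
      FROM     EMP
      WHERE    DEPTNO = 'D01';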

So, after the optimizer creates the access paths and populates the PLAN_TABLE with data representing those access paths, you will need to examine the results to determine if everything is satisfactory. Many questions can be answered by analyzing the results of EXPLAIN – questions like:

  • if we are joining, what type of join is used (NLJ, MS, Hybrid)?
  • was an index used, and if so, how many columns matched?
  • are we doing a scan, and if so, what type of scan (full or page range)?
  • is prefetch being used, and if so, what type (sequential, list)?
  • was a hint used?
  • was parallelism used, and if so, what degree and type?
  • was a sort required, and if so, why (Join, Unique, GROUP BY, ORDER BY)?
  • what type of locking is required?

And that just covers the main PLAN_TABLE. There are actually many other explain tables available that, if created, will be populated by EXPLAIN. But that is beyond the scope of this high-level tip, which is already becoming rather lengthy.
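
A simple query against the PLAN_TABLE can surface many of the answers to the questions above. The sketch below selects a handful of commonly examined columns; it is not exhaustive, and the QUERYNO predicate assumes the EXPLAIN example shown earlier:

    SELECT   QUERYNO, QBLOCKNO, PLANNO,
             METHOD,                 -- join method (1=nested loop, 2=merge scan, 4=hybrid)
             ACCESSTYPE, MATCHCOLS,  -- access type and matching index columns
             ACCESSNAME, INDEXONLY,  -- index name and index-only access
             PREFETCH,               -- S=sequential, L=list, D=dynamic
             SORTN_JOIN, SORTC_ORDERBY, SORTC_GROUPBY, SORTC_UNIQ,
             TSLOCKMODE
    FROM     PLAN_TABLE
    WHERE    QUERYNO = 100
    ORDER BY QBLOCKNO, PLANNO;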

Other Tools

There are also several additional tools and techniques that can be used to analyze Db2 query execution plans. One commonly used tool is Visual Explain, which provides a graphical representation of the query execution plan. Many tuners prefer a visual approach to reviewing and tuning access paths. Visual Explain allows users to visualize the flow of data, understand the order in which operations are performed, and identify potential performance bottlenecks.

Visual Explain is a feature of the free IBM Data Studio tool, as well as the new Db2 Administration Foundation offering. There are other tools, from other vendors, that also provide a visual Explain capability.

Speaking of which, another technique to analyze query execution plans is to use database monitoring tools. Such tools capture real-time performance data, including query execution plans, and provide useful performance metrics that can be reviewed and acted upon as needed. 

By monitoring the execution plans of frequently executed queries, it is possible to detect patterns, identify resource-intensive operations, and make informed decisions about query optimization.

A Bit of Guidance

It is wise to always Bind your production programs (that use static SQL) specifying EXPLAIN YES so that you can see the access paths chosen by Db2. This way, you can either check to make sure that no access paths have changed before proceeding to implement changes in production, or at least you have the information available should problems arise after you move code into production.

It can be difficult to determine if any access paths have changed, because you will need the old execution plans to compare to the new ones. For this reason, it is a good idea to keep several versions of access path information for each program. 
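
One low-tech way to do this is to keep timestamped copies of PLAN_TABLE rows in a history table of your own. The sketch below is one possible approach; the history table name and the extra timestamp column are hypothetical, not Db2-supplied objects:

    -- Hypothetical history table: same layout as PLAN_TABLE plus a
    -- timestamp identifying when each snapshot was taken
    CREATE TABLE PLAN_TABLE_HIST LIKE PLAN_TABLE;
    ALTER TABLE PLAN_TABLE_HIST ADD COLUMN SNAPSHOT_TS TIMESTAMP;

    -- Save the current access paths (for example, before a REBIND)
    INSERT INTO PLAN_TABLE_HIST
      SELECT P.*, CURRENT TIMESTAMP
      FROM   PLAN_TABLE P;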

Additionally, sometimes the program has changed between EXPLAIN executions, which can make it challenging to find the exact SQL statements to compare. But you are still better off with older versions than without any historical data.

To optimize query execution plans for better performance, it is essential to focus on areas such as index usage, join strategies, and predicate selection. By strategically creating and maintaining appropriate indexes, rewriting complex queries, and refining predicate conditions, it is possible to influence the optimizer's decisions and improve query performance.

In Conclusion 

Analyzing query execution plans is a vital step in optimizing the performance of IBM Db2 applications. By using tools like EXPLAIN and Visual Explain, PLAN_TABLE data, and monitoring tools, it is possible to gain valuable insights into query execution, identify potential bottlenecks, and enhance overall performance. 

By proactively analyzing and optimizing query execution plans, organizations can achieve faster response times, improved scalability, and a more efficient database environment.

Be sure to check back here next week when we will address database design as a performance issue.

Wednesday, June 21, 2023

Top 10 Performance Tuning Tips for IBM Db2

Performance is a critical aspect of any database management system, and IBM Db2 is no exception. As organizations deal with ever-increasing data volumes and complex workloads, it is crucial that Db2 environments are fine-tuned to ensure optimal performance, responsiveness, and scalability. In this multi-part blog post, I will examine the top ten performance tuning tips for assuring optimal IBM Db2 applications and systems.

Whether you are an experienced database administrator or just starting your journey with Db2, these performance tuning tips will provide some practical strategies and techniques to enhance the efficiency and speed of your Db2 deployments. This series of posts will cover a range of topics, including query optimization, database design, indexing strategies, memory configuration, I/O optimization, and more. 

By following these Db2 performance tips, you can significantly improve query response times, optimize resource utilization, and ensure the smooth operation of your Db2 environment. Whether you are dealing with transactional workloads, analytical queries, or a combination of both, the tips presented here will equip you with the knowledge and techniques to tackle performance bottlenecks and fine-tune Db2 to meet your specific needs.

Whether you are looking to optimize existing Db2 deployments or planning for a new implementation, this blog post will serve as a valuable resource to guide your performance tuning efforts. So, let's dive into the top ten performance tuning tips for IBM Db2 and discover how you can unlock the full potential of your database system.

Here's an outline for the upcoming posts in this series on "Top 10 Performance Tuning Tips for IBM Db2":

  Tip 1: Analyze Query Execution Plans

  Tip 2: Optimize Database Design

  Tip 3: Efficient Indexing Strategies

  Tip 4: Effective Memory Configuration

  Tip 5: Workload Management and Query Optimization

  Tip 6: Monitoring and Performance Metrics

  Tip 7: Configuration Parameters and Tuning

  Tip 8: I/O Optimization

  Tip 9: Regular Maintenance and Housekeeping

  Tip 10: Stay Up to Date 

So be sure to keep tuning in to the Db2 Portal blog as we publish these Top 10 Performance Tips over the course of the next few weeks.

And be sure to share your own performance and tuning thoughts here... or on any of the subsequent posts in this series.


Monday, March 20, 2023

Harnessing the Power of zIIP Processors for Improved Db2 Performance and Lower Cost

As a Db2 DBA, you're constantly looking for ways to improve performance and efficiency while minimizing costs. One technology that can help achieve these goals is the zIIP (IBM System z Integrated Information Processor) processor. By offloading eligible Db2 workloads to zIIP processors, you can free up capacity on general-purpose processors and reduce costs, while improving performance.

So, what workloads are eligible for offloading to zIIP processors? XML processing, and portions of the Db2 LOAD, REORG, RUNSTATS and REBUILD utilities are among the most common. If you have third-party utilities (BMC, Broadcom, InfoTel) it is likely that they, too, will be zIIP-eligible, at least for some of their functionality.

Shifting workload to distributed/DDF is another good way to exploit zIIPs because SQL statements executed through DDF are zIIP-eligible. But most of the time DBAs have little influence on moving workload to distributed processing. This choice is typically driven by application development plans instead of DBA tuning tactics. 

Nevertheless, by understanding what type of workload is zIIP-eligible and encouraging such usage, you can offload workload to zIIP processors. Moving workload from general-purpose processors to zIIPs can possibly improve system performance and reduce costs.

You might also want to take a look at converting some of your COBOL workload to Java, if at all possible because Java programs are zIIP-eligible. Of course, this requires application developers to get involved, as well as (possibly) a conversion tool.

To fully harness the power of zIIP processors, it's important to identify eligible workloads and configure the system accordingly. Here are some tips to help you get started:

  • Configure Db2 for zIIP offload: Configure Db2 to take advantage of zIIP processors by setting the appropriate parameters and options. Consult the IBM Db2 documentation for specific guidance on configuring zIIP offload.

  • Monitor and analyze performance: Use Db2 performance monitoring tools to track the performance of zIIP-offloaded workloads and identify areas for further optimization. This can help you continually improve performance and efficiency over time.

By effectively utilizing zIIP processors for Db2 workloads, you can achieve significant cost savings and performance improvements on IBM Z mainframe systems. Don't let this powerful technology go to waste – start exploring the benefits of zIIP processors today!

Monday, December 13, 2021

Top 10 Steps to Building the Right Indexes

One of the constant struggles with Db2 development, and indeed developing for any relational DBMS, is designing and creating appropriate indexes. Perhaps the most important thing you can do to assure optimal application performance when accessing data in a relational/SQL database is to create correct indexes for your tables based on the queries your applications use. Of course, this is easier said than done.

But we can start with some basics. For example, consider this SQL statement:

    SELECT   LASTNAME, SALARY
    FROM     EMP
    WHERE    EMPNO = '000010'
    AND      DEPTNO =  'D01';

What index (or indexes) would it make sense to build for this simple query? First, think about all the possible indexes that you could create. Your first short-list probably looks something like this:

  • Index1 on EMPNO
  • Index2 on DEPTNO
  • Index3 on EMPNO and DEPTNO

This is a good start, and Index3 is likely to be the best of the lot. It enables the optimizer to use the index to immediately look up the row or rows that satisfy the two simple predicates in the WHERE clause. Of course, if you already have a lot of indexes on the EMP table, you might want to examine the impact of creating yet another index on the table.
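
As DDL, Index3 might be created something like this (the index name is simply an illustrative choice):

    CREATE INDEX XEMP_EMPNO_DEPTNO
        ON EMP ( EMPNO, DEPTNO );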

With the caveat that appropriate index creation can be complicated, let’s look at the Top 10 things you can do to build the right indexes on your tables:

1. Index by workload, not by object

Many people make the mistake of just guessing at some indexes to create when they are creating other database objects (like tables and tablespaces). But without an idea of how the tables are going to be accessed these guesses are usually wrong – at least some of them.

Indexes should be built to optimize the access of your SQL queries. To properly create an optimal set of indexes requires a list of the SQL to be used, an estimate of the frequency with which each SQL statement will be executed, and the importance of each query. Only then can you undertake the delicate balancing act of creating the right indexes to optimize the right queries most of the time.

If you are doing it any other way, you are doing it wrong.

2. Build indexes based on predicates

3. Index most-heavily used queries

Numbers 2 and 3 can be thought of as corollaries to Number 1… that is, these are the aspects of application workload that need to be examined to produce appropriate and effective indexes. 

Look at the predicates of the queries in the workload and work to create a set of indexes that match up to most (if not all) of them. When it is not practical to build all of the indexes you have identified, then it makes sense to look at the queries that will be used most often and optimize them first.

4. Index important queries

The more important the query, the more you might want to tune by index creation. If you are coding a query that the CIO will run every day, you want to make sure it delivers optimal performance. So building indexes for that particular query is important. On the other hand, a query for a clerk might not necessarily be weighted as high, so that query might have to make do with the indexes that already exist. 

Of course, the decision should depend on the application’s importance to the business - not just on the user’s importance. For example, if that clerk runs a query hundreds of times a day and the CIO runs his query only once, then perhaps the clerk's query is more important.

5. Index to avoid sorting (GROUP BY, ORDER BY)

In addition to building indexes to optimize data access, indexes can be used to avoid sorting. The GROUP BY and ORDER BY clauses tend to invoke sorts, which can cause performance slowdowns. By indexing on the columns specified in these clauses, the optimizer can use an index to avoid a sort, and thereby potentially improve performance.
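
For example, consider this hypothetical query and a supporting index (both made up for illustration). With the equality predicate column first and the ORDER BY column next, the optimizer may be able to use the index to return rows in order and avoid the sort:

    SELECT   LASTNAME, SALARY
    FROM     EMP
    WHERE    DEPTNO = 'D01'
    ORDER BY LASTNAME;

    -- Predicate column first, ORDER BY column second gives the
    -- optimizer a chance to skip the sort
    CREATE INDEX XEMP_DEPT_LASTNAME
        ON EMP ( DEPTNO, LASTNAME );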

6. Create indexes for uniqueness (PK, U)

Some indexes are required in order to make the database schema valid. Most database systems require that unique indexes be created when unique and primary key constraints exist.

7. Create indexes for foreign keys

Even if not required, creating indexes for each foreign key can optimize the performance when accessing and enforcing referential constraints (RI – referential integrity).
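
A quick sketch covering items 6 and 7 (the index names are illustrative): a unique index to back the primary key of DEPT, and a non-unique index on the foreign key column in EMP.

    -- Required to enforce the primary key (item 6)
    CREATE UNIQUE INDEX XDEPT_PK
        ON DEPT ( DEPTNO );

    -- Optional, but speeds joins and RI checking (item 7)
    CREATE INDEX XEMP_FK_DEPT
        ON EMP ( DEPTNO );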

8. Consider adding columns for index-only access

Sometimes it can be advantageous to include additional columns in an index to increase the chances of index-only access. With index-only access, all of the data needed to satisfy the query can be found in the index alone — without having to read data from the tablespace.

For example, suppose that there is an index on the DEPTNO column of the DEPT table. The following query may use this index:

    SELECT   DEPTNAME
    FROM     DEPT
    WHERE    DEPTNO =  'D01';

The index could be used to access only those rows with a DEPTNO of 'D01', but then the DBMS would need to access the data in the table space to return the DEPTNAME. If you added DEPTNAME to the index, that is, created the index on (DEPTNO, DEPTNAME), then all of the data needed for this query exists in the index and additional I/O to the table space would not be needed. This technique is sometimes referred to as index overloading.

Of course, this is not always a good idea. You have to take into account whether other queries use the index and how it might negatively impact their performance.

It is also worth mentioning index INCLUDE columns, which allow you to define additional (included) columns that are not part of the actual key but are available in the index. So we can create an index like this:


    CREATE UNIQUE INDEX IXNAME
        ON DEPT ( DEPTNO )
        INCLUDE ( DEPTNAME );

The index key is only on the DEPTNO column, but the DEPTNAME is also included in the index, so the earlier query can be satisfied using index-only access with this index.

9. Don’t arbitrarily limit the number of indexes

An example of a bad standard would be to place an artificial limit on indexing. There should be no arbitrary limit on the number of indexes that you can create for any database table. Indexes are undoubtedly one of the most important factors in creating efficient queries. Relational optimizers rely on indexes to build fast access paths to data. Without indexes data must be scanned – and that can be a long, inefficient means by which to retrieve your data. When a rule such as this exists, it usually is stated in the standards manual using verbiage something like “Each table can have at most five indexes created for it” — or — “Do not create more than three indexes for any single table in the database.” These are bad standards. 

If you already have three indexes, or five indexes, or even 57 indexes, and another index will improve performance, why would you arbitrarily want to avoid creating that index?

Anyway, a good indexing standard, if you choose to have one, should read something like this: “Create indexes as necessary to support your database queries. Limitations on creating new indexes should only be entertained when they begin significantly to impede the efficiency of data modification.” 

Which brings us to…

10. Be aware of I/U/D implications

Db2 must automatically maintain every index you create. This means every INSERT and every DELETE to an indexed table will insert and delete not just from the table, but also from its indexes.

Additionally, when you UPDATE the value of a column that has been defined in an index, the DBMS must also update the index. So, indexes speed the process of retrieval but slow down modification.

So the general rule of thumb should be "Index until it hurts... and then back off the least effective index to make it no longer hurt." Sure, that is easier said than done, but it is a valid philosophy to follow.

Summary

Following these Top 10 index design techniques can go a long way toward improving not just your index usage, but also the performance of your database applications.

Thursday, January 07, 2021

BMC AMI Ops: The Next Generation of Mainframe Systems Management

Assuring the performance of your mainframe systems and applications is an imposing task that keeps getting more complex all the time. It makes sense to arm your IT performance analysts, DBAs, and systems programmers with modern tools so you can optimize performance and thereby deliver superior service to your customers.

Of course, BMC MainView has helped IT professionals manage the performance of their mainframe systems and applications for years. But there are new challenges facing modern organizations that require adaptation and transformation.

Organizations are transforming to become autonomous digital enterprises (ADE). This means that things are getting more complex because availability requirements are expanding (many times requiring 24/7 availability), but IT pros are expected to resolve problems rapidly even as workloads become more unpredictable and IT staff has less experience. These challenges are real and require attention.

And that is why BMC is transforming its MainView product line into BMC AMI Ops!

With BMC AMI Ops you can experience next-level mainframe operational resiliency, AI-powered observability, an intuitive user interface with embedded expertise, actionable insights, and enterprise platform interoperability.

How is BMC AMI Ops engineered to help? Well, it is built for digital business with the understanding that being reactive is not sufficient these days. BMC AMI Ops provides a complete, modular solution with central administration and management.

Artificial intelligence and machine learning techniques are being embraced by an increasing number of organizations for improving their business, so it only stands to reason that your IT operations and support functions should be looking to improve their capabilities using AI and ML, too. And BMC AMI Ops helps you to do that because it is infused with AI/ML-powered analytics to find and fix problems before business services are impacted. With BMC AMI Ops you can improve performance and availability by taking advantage of its built-in intelligent automation and remediation features.

And the user interface is brand new, engineered to support ease of use, to deliver information instead of raw data, and to guide the user experience. BMC AMI Ops delivers a custom dashboard approach where you can group widgets together for related logical systems or business areas. And you get “out of the box” health indicators for each of the widgets you deploy, meaning it takes less time to be productive right away. Furthermore, a guided path is provided so the user can drill down into additional details as needed. If you are interested in seeing more details on the new user experience for BMC AMI Ops, check out this blog post from Shay Alsberg (BMC AMI Ops: Evolving the MainView User Experience).

And not to fear, for those of you experienced mainframe pros who not only know how to drive ISPF panels but prefer them, BMC AMI Ops can still be accessed using character-based panels.

The bottom line is that BMC AMI Ops is designed for modern businesses and IT as they embrace digital transformation to become autonomous digital enterprises, delivering a simplified yet customizable systems management experience for optimizing your system and application performance. That’s BMC AMI Ops in a nutshell… and it is worth looking into how BMC AMI Ops can help you to improve the performance of your systems and applications.

Thursday, December 17, 2020

Db2 Utilities and Modern Data Management

Db2 utilities are the unappreciated, and often overlooked, workhorses of your mainframe Db2 environment. They perform the dirty work that has to be done to populate, organize, back up, and recover your vital mainframe data. Without them, building effective Db2 databases, managing data, optimizing performance, and even accessing mainframe data would be a lot more difficult than it currently is.

The Situation 
Think about the Db2 utility situation at your shop. If you are like most organizations you will have Db2 utilities regularly running all the time. There are load and unload tasks running to refresh data for development and testing, for moving data between environments for analysis and processing, and for various other purposes. The LOAD and UNLOAD utilities bear a lot of the hard work for data movement. 

You are also most likely reorganizing data using a REORG utility for most of your Db2 table spaces, and probably indexes, too. In many cases reorganization jobs are scheduled to run on a regular basis: weekly, monthly, quarterly, etc. Frequently you just set these jobs up when the object is created. The jobs get scheduled and just run without anybody taking a look at them unless, or until, there are performance problems.

Then there are COPY and RECOVER utilities for backing up and recovering data when there are problems. The image copy backup jobs are running all the time, taking either full or incremental copies to ensure that you can recover data in case problems are encountered. The copies are running all the time, but the recover jobs (hopefully) are not running all the time! 

You are also going to be running the RUNSTATS utility to gather statistics for Db2 to use for query optimization. Depending on how often your data changes, you may be running RUNSTATS frequently or infrequently. Many times the same fate as REORG befalls RUNSTATS… that is, it is scheduled and forgotten about unless problems arise. 

There are other utilities, like CHECK, which is used to verify the integrity of data. You are probably not running this one very often, but when you need it you want it to run fast, right?

So, all of these utilities are “out there” running and consuming CPU to move, copy, and manage your Db2 data. But are they being run as effectively as possible?

Moving to the Modern Db2 Utility Way 
I think by this point everybody will agree that utility type processing is not just critical, but mandatory for a Db2 environment. But just running with the bare basics is not the best approach. 

If we think about data movement with unload and load processing there are several things that you might want to consider for improvement. First of all, consider the speed and performance of the unload and load tasks. You probably want these jobs to run as fast as possible – that is, to consume as little elapsed time as possible to complete. After all, you are probably using these utilities to build environments or even refresh portions of an environment… and there will be developers and testers waiting to use that data as soon as it is available. Using the fastest utility programs available will minimize the wait time and make your developers and testers more productive. Furthermore, you want these tasks to consume as little CPU as possible to reduce your monthly mainframe bills! 

In some cases you might want to re-consider unloading and loading altogether, using alternate utilities and offerings that can clone an entire subsystem or move data outside the control of Db2 at the data set level. 

If we think about reorganization, it is likely that you are running REORG tasks that don’t need to be run, at least not as regularly as they are being run. At the same time, it is also likely that you are not running other REORG tasks as frequently as you should, thereby causing every other task that accesses the data to degrade. Fortunately, you can use RTS (real-time statistics) to help guide when you should (and should not) reorganize your data. In the best case the utility itself relies on RTS to figure out whether it needs to run, and runs only when it makes sense. Failing this, you are again likely consuming more CPU than is necessary (either running unneeded REORGs or accessing poorly organized data, as the case may be).
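
For example, a query against the real-time statistics catalog table SYSIBM.SYSTABLESPACESTATS can highlight table spaces whose activity since the last REORG suggests reorganizing them. This is only a sketch: the 10% unclustered-insert threshold below is an illustrative starting point, not an official rule.

    SELECT   DBNAME, NAME, PARTITION, TOTALROWS,
             REORGINSERTS, REORGDELETES, REORGUNCLUSTINS
    FROM     SYSIBM.SYSTABLESPACESTATS
    WHERE    TOTALROWS > 0
    AND      REORGUNCLUSTINS * 100 > TOTALROWS * 10
    ORDER BY DBNAME, NAME, PARTITION;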

If you think about your backup and recovery situation, the issue is likely complexity. Sure you want COPY and RECOVER utilities that run fast and consume minimal CPU, but the big issue is analysis. By that I mean, when you need to recover you want to make sure that you can use the image copies (and, of course, the log) to recover and meet your RTOs (recovery time objectives). But creating recover jobs on-the-fly, in a probably complicated environment with inter-related tables and data, can be difficult. And doing so when there is an outage, which is usually the case, exacerbates the situation. Using intelligent utilities to create the right image copies and to automatically build an appropriate recovery strategy when needed should be the modern approach.

And not to neglect RUNSTATS and CHECK, you want both of those utilities to run as fast as possible, consuming minimal CPU, too. And you want guidance on when and how to run them using available RTS, statistics, and any system information available. 

What Can You Do? 
One approach is to use modern utilities, not only built for speed but incorporating AI and machine learning to automate and improve the Db2 utility experience. BMC Software is once again in the vanguard with its BMC AMI Utilities for Db2.

The first question you probably have is "What the heck is AMI?" Well, AMI, which stands for Automated Mainframe Intelligence, is technology that is being infused into BMC’s product line to leverage AI, machine learning, and predictive analytics to achieve a self-managing mainframe. 

BMC AMI Utilities for Db2 are designed for modern complex Db2 environments. They use a centralized, intelligent architecture (see diagram below) designed specifically to handle the complexity facing IT today. Through intelligent policy-driven automation, you can use the AMI Utilities for Db2 to manage growing amounts of data with ease and, at the same time, deliver full application availability. 

Figure 1. BMC AMI Utilities for Db2



If you are looking to reduce CPU and elapsed time by as much as 75%, eliminate downtime while delivering full application availability, lower disk usage, eliminate sort in your REORGs, and simplify complex utility operations, then it makes sense to take a look at the BMC AMI Utilities for Db2. 


----------

You might also want to take a look at this blog post from BMC that discusses how to Save Time and Money with Updated Unload Times 

And this analysis of the BMC next generation REORG technology from Ptak Associates

Monday, October 19, 2020

Improving Mainframe Performance with In-Memory Techniques

 A recent, recurring theme of my blog posts has been the advancement of in-memory processing to improve the performance of database access and application execution. I wrote an in-depth blog post, The Benefits of In-Memory Processing, back in September 2020, and I definitely recommend you take a moment or two to read through that to understand the various ways that processing data in-memory can provide significant optimization.

There are multiple different ways to incorporate in-memory techniques into your systems ranging from system caching to in-memory tables to in-memory database systems and beyond. These techniques are gaining traction and being adopted at increasingly higher rates because they deliver better performance and better transaction throughput.

Processing in-memory instead of on disk can have a measurable impact on not just the performance of your mainframe applications and systems, but also on your monthly software bill. If you reduce the time it takes to process your mainframe workload by more effectively using memory, you can reduce the number of MSUs you consume to process your mission-critical applications. And depending upon the type of mainframe pricing model you deploy, you can either be saving now or be planning to save in the future as you move to Tailored Fit Pricing.

So it makes sense for organizations to look for ways to adopt in-memory techniques. With that in mind, I recommend that you plan to attend this upcoming IBM Systems webinar titled The benefits and growth of in-memory database and data processing to be held Tuesday, October 27, 2020 at 12:00 PM CDT.

This presentation features two great speakers: Nathan Brice, Program Director at IBM for IBM Z AIOps, and Larry Strickland, Chief Product Officer at DataKinetics.

In this webinar Nathan and Larry will take a look at the industry trends moving to in-memory, help to explain why in-memory is gaining traction, and review some examples of in-memory databases and alternate in-memory techniques that can deliver rapid transaction throughput. And they’ll also look at the latest Db2 for z/OS features like FTBs, contiguous buffer pools, fast insert and more that have caused analysts to call Db2 an in-memory database system.

Don’t miss this great session if you are at all interested in better performance, Db2’s in-memory capabilities, and a discussion of other tools that can aid you in adopting an in-memory approach to data processing.

Register today by clicking here!

Wednesday, September 02, 2020

The Benefits of In-Memory Processing

One area where most organizations can benefit is using system memory more effectively. This is because accessing and manipulating data in memory is more efficient than doing so from disk.

Think about it… There are three aspects of computing that impact the performance and cost of applications: CPU usage, I/O, and concurrency. When the same amount of work is performed by the computer using fewer I/O operations, CPU savings occur and less hardware is needed to do the same work. A typical I/O operation (read/write) involves accessing or modifying data on disk systems; disks are mechanical and have latency – that is, it takes time to first locate the data and then read or write it.

There are many other factors involved in I/O processing that involve overhead and can increase costs, all depending upon the system and type of storage you are using. For example, each I/O consists of a multitude of background system processes, all of which contribute to the cost of an I/O operation (as highlighted in Figure 1 below). It is not my intent to define each of these processes but to highlight the in-depth nature of the processing that goes on behind-the-scenes that contributes to the costly nature of an I/O operation.


Figure 1. The Cost of an I/O

So, you can reduce the time it takes to process your mainframe workload by more effectively using memory. You can take advantage of things like increased parallelism for sorts and improve single-threaded performance of complex queries when you have more memory available to use. And for OLTP workloads, large memory provides substantial latency reduction, which leads to significant response time reductions and increased transaction rates.

The most efficient way to access data is, of course, in-memory access. Disk access is orders of magnitude less efficient than accessing data from memory. Memory access is usually measured in microseconds, whereas disk access is measured in milliseconds. (Note that 1 millisecond equals 1000 microseconds.)

The IBM z15 has layers of on-chip and on-board cache that can improve the performance of your application workloads. We can view memory usage on the mainframe as a pyramid, from the slowest to the fastest, as shown in Figure 2. As we go up the pyramid, performance improves; from the slowest techniques (like tape) to the fastest (core cache). But this diagram drives home our core point even further: that system memory is faster than disk and buffering techniques.


Figure 2. The Mainframe Memory Pyramid

So how can we make better use of memory to avoid disk processing and improve performance? Although there are several different ways to adopt in-memory processing for your applications, one of the best methods can be to utilize a product. One such product is the IBM Z Table Accelerator.

IBM Z Table Accelerator is an in-memory table accelerator that can improve application performance and reduce operational cost by utilizing system memory. Using it can help your organization focus development efforts more on revenue-generating business activity, and less on other, less efficient methods of optimizing applications. It is ideal for organizations that need to squeeze every ounce of power from their mainframe systems to maximize performance and transaction throughput while minimizing system resource usage at the application level. You can use it to optimize the performance of all types of data, whether from flat files, VSAM, Db2, or even IMS.

 So how does it work? Well, typically a small percentage of your data is accessed and used a large percentage of the time. Think about it in terms of the 80/20 Rule (or the Pareto Principle).  About 80% of your data is accessed only 20% of the time, and 20% of your data is accessed 80% of the time.

The data that you are accessing most frequently is usually reference data that is used by multiple business transactions. By focusing on this data and optimizing it you can gain significant benefits. This is where the IBM Z Table Accelerator comes into play. By copying some of the most often accessed data into the accelerator, which uses high-performance in-memory tables, significant performance gains can be achieved. That said, it is only a small portion of the data that gets copied from the system of record (e.g. Db2, VSAM, etc.) into the accelerator.

High-performance in-memory technology products -- such as IBM Z Table Accelerator -- use system memory. Sometimes, if the data is small enough, it can make it into the L3-L4 cache. This can be hard to predict, but when it occurs things get even faster.

Every customer deployment is different, but using IBM Z Table Accelerator to optimize in-memory data access can provide a tremendous performance boost.

A Use Case: Tailored Fit Pricing

Let’s pause for a moment and consider a possible use case for IBM Z Table Accelerator.

In 2019, IBM announced Tailored Fit Pricing (TFP), with the goal of simplifying mainframe software pricing and billing. IBM designed TFP as a more predictable, cloud-like pricing model than its traditional pricing based on a rolling-four-hour-average of usage. Without getting into all of the details, TFP eliminates tracking and charging based on monthly usage and instead charges a consistent monthly bill based on the previous year’s usage (plus growth).

It is important to note that last point: TFP is based on last year’s usage. So you can reduce your bill next year by reducing your usage this year, before you convert to TFP. Therefore, it makes a lot of sense to reduce your software bills to the lowest point possible the year before the move to TFP.

 

So what does this have to do with IBM Z Table Accelerator? Well, adopting techniques to access data in-memory can lower MSU usage – and therefore your monthly software bill. Using IBM Z Table Accelerator to optimize your reference data in-memory before moving to TFP can help you to lower your software bills and better prepare you for the transition to Tailored Fit Pricing.

 

 --------------------

If you’d like to learn more about IBM Z Table Accelerator, there is an upcoming SHARE webinar on September 15, 2020, that goes into more details about the offering. It is titled Digital Transformation Includes Getting The Most Out of Your Mainframe: click the link for details and to register to attend.