
Monday, April 29, 2024

Intelligent Automation of Db2 Administration and Management

It is vitally important to ensure that your Db2 databases and systems are running effectively and efficiently. And this requires the diligent application of administration and management tasks on a regular basis. Failure to keep up with the status of your Db2 databases can result in poorly performing applications, unavailable data and systems, and ultimately, lost revenue.

Automation can help. But what do we mean by automation? Sure, most of you inherently know what automation is, at least at a high level. But it is worthwhile to consider a brief definition. Automation involves reducing human involvement in your activities, turning things over to intelligent software. The goal is to reduce the amount of time, effort, and human error involved in maintaining efficient systems. That sounds good, right?

Nevertheless, many IT professionals have an aversion to automation. We are happy to automate everyone else’s job but not our own. As experts on technology, you’d think we’d be able to see the fallacy of this belief. By developing computerized applications to support business processes, we automate just about every job in our organizations. But try to tell a DBA to automate their utilities or to use advanced autonomics to direct their actions and you’d think you just insulted their mother. Many technology folks resist automation for fear of losing control or perhaps, losing their job. These fears are understandable, but not really justifiable.

There is an IT skills shortage and companies want to hire more IT professionals than are available. And we are overworked – who among us really wants to work 12-hour days all the time? And in the mainframe world, the workforce is aging, so we need to do something as the experienced folks retire and move on.

The truth is, most IT tasks and procedures can be streamlined and made more efficient using automation: automated systems management, database administration and tuning, and yes, even application development. Automation will not be able to completely replace IT professionals any time soon, but it is important as organizations struggle to cope with a shortage of skilled IT professionals. By turning some of the work over to the computer, IT can become more efficient, more effective, and provide a higher ROI to the business.

One of the biggest challenges IT professionals in general, and DBAs in particular, face is the growing complexity of technology and software. Contributing to this complexity is the growing number of devices, the increasing number of systems that interoperate with one another, and the growing number of parameters and options available in systems software and utilities.

 

A Day in the Life of a DBA

To help us understand the growing complexity and need for automation, let’s examine a typical day in the life of a Db2 DBA. It can be quite hectic. The DBA is required to maintain production and test environments while at the same time keeping an eye on active application development projects, attending strategy and design meetings, helping to select and evaluate new products, and connecting legacy systems to the web. And Joe in Accounting? He just submitted that “query from hell” again and it is bringing the system to a halt; can you do something about that? All these things can occur within a single DBA workday.

Databases are at the center of modern applications. If Db2 fails, applications fail, and if applications fail, the entire business can come to a halt. If databases and applications fail often enough, the entire business can fail. Database administration is therefore critical to the ongoing success of modern business.

So, growing complexity and overwhelmed DBAs contribute to the need to automate. Automation can optimize management and reduce complexity by instrumenting and running tasks based on tried-and-true best practices.


Automating Db2 Utilities

Db2 utilities are a prime candidate for automation because they need to be run regularly and there are industry best practices and statistics that can be used to automate them. Not to mention the fact that the complexity of Db2 utilities is increasing all the time. Just compare the number of parameters and options available today for Db2 13 for z/OS versus what was available in the past.

For example, do you understand how to run your utilities online, while data is accessible to your applications? Do you use parallel index build with the LOAD, REORG, and REBUILD INDEX utilities? Have you ever loaded partitions in parallel? How about using the LISTDEF, TEMPLATE, and OPTIONS statements for controlling utility operations? And these are only a few of the innovations made to IBM’s Db2 utilities over the years. Indeed, keeping up with the list of improvements made to the IBM utilities through Db2 13 for z/OS is challenging, and some DBAs have not implemented any new utility functionality in years.
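To illustrate, here is a minimal sketch of a modern, list-driven online image copy using LISTDEF and TEMPLATE; the database name and data set name pattern are hypothetical, so adjust them for your shop:

-- Build a list of every tablespace in the (hypothetical) PAYDB database
LISTDEF PAYLIST INCLUDE TABLESPACES DATABASE PAYDB
-- Dynamically allocate image copy data sets using substitution variables
TEMPLATE COPYTMP DSN('HLQ.IC.&DB..&TS..D&DATE.')
-- Copy the entire list online, leaving the data available to applications
COPY LIST PAYLIST COPYDDN(COPYTMP) SHRLEVEL CHANGE

A few statements like these can replace dozens of hard-coded, object-by-object copy jobs, with SHRLEVEL CHANGE keeping the data accessible while the copies run.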

Truly, automating utilities is a best practice, but many shops do not adhere to this best practice. Instead, they just create utility jobs for every object and schedule them to run regularly. This set-it-and-forget-it mentality means that most utility jobs get built when the object is created and then rarely examined ever again. The decision on how frequently the jobs will be run is made up-front and never re-examined, unless there are performance, availability, or recovery problems.

If you fall into this category of utility scheduler, how frequently do you schedule your REORG, RUNSTATS, and COPY jobs to run? Weekly, monthly, quarterly? There is not usually a lot of thought put into the frequency of execution. And even if you meticulously analyzed your scheduling decision when the object was created, are you sure that decision is still sound now, perhaps many years later? Few DBAs re-assess the situation over time to see if their initial scheduling decisions were accurate or still stand.

At any rate, automating utilities based on thresholds is a better practice. Automation can ensure that you are running your REORG, RUNSTATS, and COPY utilities when it makes sense, instead of on a rigid schedule. If you run these utilities too late, you risk recoverability and availability issues, as well as increased cost due to poor performance and increased CPU usage. And if you run the utilities too soon, you are consuming CPU and I/O that you do not need to consume, and that, too, increases costs.

The goal is to run your Db2 utilities at just the right time. And that can be done using Db2 statistics. Of course, different real-time statistics need to be examined for each utility to run it at the proper time. You should consider a tool, such as InfoTel Corp’s iDBA, which can help you to implement intelligent DBA automation. Such a solution can consider all the pertinent parameters and statistics and determine which maintenance tasks and utilities need to be run based on your environment and situation… and not on a schedule set long ago.
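For example, the real-time statistics in the Db2 catalog can surface likely REORG candidates. Here is a minimal sketch using a hypothetical threshold of 10 percent; the right thresholds depend on your environment:

-- Flag tablespaces where unclustered inserts since the last REORG
-- exceed 10% of total rows
SELECT DBNAME, NAME, PARTITION,
       REORGUNCLUSTINS, TOTALROWS
FROM   SYSIBM.SYSTABLESPACESTATS
WHERE  TOTALROWS > 0
AND    (REORGUNCLUSTINS * 100.0) / TOTALROWS > 10
ORDER BY DBNAME, NAME, PARTITION;

Similar queries against SYSIBM.SYSTABLESPACESTATS and SYSIBM.SYSINDEXSPACESTATS can drive RUNSTATS and COPY decisions, which is exactly the type of analysis an intelligent automation tool performs continuously on your behalf.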

Tuesday, April 16, 2024

Mixing Db2 Database Administration with DevOps - Summary

Recently, I posted a series of blog entries on DevOps and Db2 DBA here on the Db2 Portal blog site. The posts dealt with the many issues and considerations that need to be addressed as mature Db2 shops embrace DevOps practices and procedures.

This post is a short one that collects together links to all 7 of the posts in this series, so you can use it to review the entire series on DevOps and Db2 DBA...


Part 1: Intro to DevOps


Part 2: The DevOps Toolchain


Part 3: Automating the DevOps Toolchain


Part 4: Database Schema Change and DevOps


Part 5: SQL Performance Testing


Part 6: Treat Dev and Ops as Equals


Part 7: Culture and The Bottom Line



I hope that you find the information in these posts to be informative and useful. And please, feel free to add your observations, thoughts, and experiences to the comments in this post (or any of the 7 DevOps posts linked above).

Wednesday, April 03, 2024

Mixing Db2 Database Administration with DevOps - Part 7: Culture and The Bottom Line

When DevOps is embraced by an organization, it should mean that DBAs get aligned more closely with development and applications than in the past. By deploying agile development, with DBAs participating in teams along with the developers, you get increased cooperation and communication between the folks coding the application (that’s Dev) and the folks developing and managing the database (that’s Ops, or the DBAs).

So an overarching change required to succeed with DevOps is that DBAs should be working in teams with developers, instead of in teams of other DBAs... at least for periods of time when development projects are very active. For some applications, a permanent DBA, or team of DBAs, may be assigned. For others, the DBA may rotate back and forth between the development team and a centralized DBA team.

Regardless of the pattern, DBAs are becoming more application-savvy. That’s a good thing because with improved application knowledge the DBA will be better able to administer the database for an application’s needs as the app moves from development to test to production.

The Bottom Line

Db2 for z/OS administration and management techniques need to adapt to the modern practices of agile application development and DevOps. But this is easier said than done.

It requires adapting the behavior of both development and operations (DBA) personnel to be more collaborative. It also requires automating as much of the software development lifecycle as possible into a DevOps toolchain to reduce development time and deliver a better return on investment for application development and support.


Friday, March 29, 2024

Mixing Db2 Database Administration with DevOps - Part 6: Treat Dev and Ops as Equals

Although DevOps is widely accepted in the industry, it is not without some problems. The biggest problem, as I see it, is that many DevOps shops emphasize the “Dev” over the “Ops,” causing important operational aspects to be overshadowed by development concerns. And the “Ops” part is where you will find database management and administration.

Far too often, control of the application delivery process is driven by development, sometimes without the traditional control and oversight of the DBA group. Without the expertise of the DBA, the delivery and integration process can fall apart because the oversight and administration required for database performance and change management cannot be accomplished in exactly the same way as for application code.

A particularly problematic perception is that the DBA acts as a barrier to progress. The developers work on their code, test it, and are ready to move forward, only to be barred by the DBA from doing so. So, developers view the DBA as a bottleneck.

But it is important to understand that DBAs are not just stopping things for the fun of it. There is important work that the DBA must perform to ensure that the application can be turned over to production without causing problems. This includes:

  • Reviewing SQL code and access paths for performance under production volume and workload.
  • Ensuring that everything is production-ready: this includes statistics collection, index analysis and design, join method analysis, system parameter analysis, and more.
  • Reviewing database structure changes required to support the application.
  • Building scripts to ensure a successful migration.
  • Coordinating database changes with application changes.
  • Ensuring that all support jobs (backup, recovery, reorg, etc.) are in place for every database object (see the sketch after this list).
  • Determining and mitigating any impact of the new application (or changes) on any other existing applications and databases.
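As a concrete example of the support-jobs check above, a catalog query can flag tablespaces that have no recent full image copy. A minimal sketch, assuming a hypothetical 7-day recovery window; you would likely also exclude the Db2 catalog and work databases:

-- Tablespaces with no full image copy ('F') in the last 7 days
SELECT TS.DBNAME, TS.NAME
FROM   SYSIBM.SYSTABLESPACE TS
WHERE  NOT EXISTS
       (SELECT 1
        FROM   SYSIBM.SYSCOPY C
        WHERE  C.DBNAME = TS.DBNAME
        AND    C.TSNAME = TS.NAME
        AND    C.ICTYPE = 'F'
        AND    C.TIMESTAMP > CURRENT TIMESTAMP - 7 DAYS)
ORDER BY TS.DBNAME, TS.NAME;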

When there is little, or no, communication between development and database administration until the code is ready to be delivered to production, it will take time to allow the DBA to perform their portion of the application delivery.

Of course, if the DBA and application teams communicate and coordinate their workload the perceived bottleneck can be eliminated. This is at the core of DevOps and is sometimes referred to as “shifting left.”

The shift-left ideology arose out of the application testing discipline. Software development typically progresses from Requirements to Design to Coding to Testing to Deployment and then to Support, which can be viewed as a chain running from left to right:

          Requirements – Design – Code – Testing – Deployment – Support

With a shift-left development mentality, processes on the right are moved to the left, so they are performed earlier in the development lifecycle. So, the goal with database administration should be to shift tasks to the left instead of performing them all right before Deployment. And to enable developers and DBAs to work together and communicate as team members with the same goal: servicing the customers and end-users.

But it should not, indeed cannot, be that DBA practices and procedures are shifted away. DBA functionality still needs to be performed to ensure the accuracy and effectiveness of corporate data!

Tuesday, March 26, 2024

Mixing Db2 Database Administration with DevOps - Part 5: SQL Performance Testing

Although implementing database schema changes is the most important component when incorporating database structures into your DevOps pipeline, it is not the only thing to consider. It is also important to be able to analyze and optimize SQL performance within your application code.

As anyone who has written SQL knows, it is a very flexible language. There are multiple ways to write SQL queries to achieve the same results. For example, you can combine multiple tables using a join or a subselect and achieve the same results. But each SQL formulation is likely to perform differently, one better than the other. And this is but one example of the various ways you can build SQL statements to perform the same function.
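For example, both of the following queries, written against tables patterned on the IBM sample database, return the same employees, but the optimizer may process each formulation quite differently:

-- Formulation 1: a join
SELECT E.EMPNO, E.LASTNAME
FROM   EMP E
       INNER JOIN DEPT D
          ON E.WORKDEPT = D.DEPTNO
WHERE  D.DEPTNAME = 'OPERATIONS';

-- Formulation 2: a subselect
SELECT E.EMPNO, E.LASTNAME
FROM   EMP E
WHERE  E.WORKDEPT IN
       (SELECT D.DEPTNO
        FROM   DEPT D
        WHERE  D.DEPTNAME = 'OPERATIONS');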

The development mindset is usually to write code that matches the requirements and delivers the expected results, not necessarily to assure the best performance. Therefore, SQL performance testing should be carried out on all programs before they are migrated to a production environment. Failure to do so will likely result in poorly performing applications.

In a DevOps environment, the best approach is to measure, analyze and improve SQL statements at all stages as your code progresses from development to testing to production. The more SQL performance testing that can be accomplished by developers the earlier performance problems will be found and corrected. And that means the cost of delivering high-quality Db2 applications will decline.

However, things are not as simple as just running your program and evaluating its performance metrics. The data that you use in your test environment will not be the same as your production data. Typically, you will have less test data than you do in production. So, if you run the RUNSTATS utility on your test data you will get different statistics than in production, which means you will also get different access paths and performance results.

Setting up the test environment with production statistics and modeling the environment to mimic production is an important aspect of performance testing during development.
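One common (if blunt) way to model production in test is to update the optimizer-related statistics in the test catalog with production values. A minimal sketch, with hypothetical object names and statistics; note that appropriate authority is required, a REBIND (or re-prepare) is needed to pick up the new values, and a later RUNSTATS will overwrite them:

-- Make the test CUSTOMERS table look production-sized to the optimizer
UPDATE SYSIBM.SYSTABLES
SET    CARDF = 10000000, NPAGESF = 250000
WHERE  CREATOR = 'TESTID' AND NAME = 'CUSTOMERS';

-- And do the same for its primary index
UPDATE SYSIBM.SYSINDEXES
SET    FIRSTKEYCARDF = 50000, FULLKEYCARDF = 10000000,
       NLEAF = 40000, NLEVELS = 4
WHERE  CREATOR = 'TESTID' AND NAME = 'XCUST01';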

With the proper setup and tooling, developers can examine the access paths of their SQL statements to judge their efficiency. Of course, tools that can simplify this process are needed to speed up SQL performance testing. Such tooling should be able to capture Explain information, display it graphically and combine it with pertinent catalog statistics, store a repository of access paths by statement, compare access paths, identify changes, and make recommendations. Ideally, the tool should be integrated into the DevOps toolchain so that information is automatically captured and analyzed each time the program is compiled and bound. 
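Even without specialized tooling, developers can capture the raw access path information themselves. A minimal sketch, using an arbitrary query number and the sample-style EMP table from earlier:

-- Capture the access path for a statement
EXPLAIN PLAN SET QUERYNO = 101 FOR
SELECT E.EMPNO, E.LASTNAME
FROM   EMP E
WHERE  E.WORKDEPT = 'D11';

-- Review the chosen access path
SELECT QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, ACCESSNAME, MATCHCOLS, INDEXONLY
FROM   PLAN_TABLE
WHERE  QUERYNO = 101
ORDER BY QBLOCKNO, PLANNO;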

Considerations should also be made for testing specific use cases for performance. For example, consider skewed data. Db2 assumes that data values are mostly uniformly distributed throughout the data. However, not all data is uniformly distributed. RUNSTATS can be used to capture information about non-uniformly distributed and skewed data.
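For example, here is a sketch of a RUNSTATS control statement that collects distribution statistics for a hypothetical skewed column, capturing the 10 most frequently occurring values so the optimizer can account for the skew:

-- Collect the 10 most frequent values for the skewed column
RUNSTATS TABLESPACE PAYDB.CUSTTS
  TABLE(TESTID.CUSTOMERS)
  COLGROUP(CUST_STATE) FREQVAL COUNT 10 MOST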

Another performance testing consideration is to always try multiple SQL variations, especially for queries that access a lot of data or have complex access paths. Do not just find one SQL formulation that works and stick with it. Remember that you can code multiple variations of SQL statements that return the same data, but that perform quite differently.

Tools that can help set up testing for various use cases and SQL variations will be needed for integrating SQL performance testing into the DevOps toolchain. There are a wide variety of vendors and solutions for managing Db2 for z/OS SQL performance, but I am not aware of any that have been fully integrated into the DevOps toolchain.



Thursday, March 21, 2024

Mixing Db2 Database Administration with DevOps - Part 4: Database Schema Change and DevOps

Traditionally, the DBA is the custodian of database changes. But the DBA typically is not the one requesting the change. Usually, a programmer does that. There are times when the DBA requests changes, such as to improve performance or to utilize new features, but this is not as common as development changes. Regardless of who requests the change, the DBA must be involved to ensure that each change is performed successfully and with no impact on the rest of the database. 

After moving to a DevOps approach, there is a shift that places more of the responsibility for database change on the developer. But to succeed, the DBA still must be involved to oversee, analyze, and approve any changes. As is common with DevOps practices, it is desirable to automate as much of the process as possible in order to remove manual, error-prone tasks and increase the speed of delivery. 

This requires a tool that automates complex database changes and integrates into the DevOps toolchain. Without such a tool, incorporating database changes into application delivery and deployment remains a slow, mostly manual process.

To effectively make database changes, the DBA needs to consider multiple issues, the most important of which are the appropriateness of the change in terms of the database design and the impact of the change on all other database objects and applications. Additionally, the DBA must determine if the change conforms to standards, how best to implement the change, and the timing of the change. 

The ideal arrangement is for database schema changes to be incorporated into the DevOps toolchain using a tool that allows developers to request changes. Those changes should be analyzed and compared against standards and rules for conformance. Non-compliant changes should automatically be referred back to the developer for modification and resubmission. Compliant changes should be accepted and cause a script to be generated using the most appropriate mechanisms to implement the change. This is a non-trivial activity which, if done properly, can eliminate a lot of manual effort and downtime. The generated script should be presented to the DBA for review and, upon acceptance, be implemented.

Db2 for z/OS, like all of today’s major DBMS products, does not support fast and efficient database structure changes for all types of change. A quick example: try to add a column to the middle of an existing row. This requires a complex series of metadata capture, data unloading, DROP, and CREATE statements. And don’t forget about all of the dependent objects and structures like indexes, referential integrity, authorizations, and so on.
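A rough sketch of what that seemingly simple change entails, with hypothetical names, and with details such as views, triggers, referential integrity, and authorizations omitted:

-- 1. Capture the current DDL and unload the existing data (UNLOAD utility)
-- 2. Drop the table, which also drops its indexes and dependent objects
DROP TABLE PROD.CUSTOMERS;
-- 3. Re-create the table with the new column in the desired position
CREATE TABLE PROD.CUSTOMERS
  (CUST_ID   INTEGER      NOT NULL,
   CUST_TYPE CHAR(1)      NOT NULL WITH DEFAULT,  -- the new "middle" column
   CUST_NAME VARCHAR(100) NOT NULL,
   PRIMARY KEY (CUST_ID));
-- 4. Re-create the indexes and re-grant authorizations
CREATE UNIQUE INDEX PROD.XCUST01 ON PROD.CUSTOMERS (CUST_ID);
-- 5. Reload the data mapped to the new layout, take an image copy,
--    and REBIND any dependent packages

Multiply those steps by every environment in which the table exists and the need for automated, pre-analyzed change scripts becomes obvious.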

Adding to the difficulty of making schema changes is the fact that most organizations have at least two, and sometimes more, copies of each database. There may be copies of the database at different locations or for different divisions of the company. And at the very least, a test and a production version will exist. But there may be multiple testing environments—for example, to support simultaneous development, quality assurance, unit testing, and integration testing. Each database change will need to be made to each of these copies, as well as, eventually, to the production copy. So, you can see how database change can quickly monopolize a DBA’s time.

You can see how a robust, time-tested process that is designed to effect database changes is required. BMC, Liquibase, and IBM all offer DevOps-integrated database change management solutions for Db2 for z/OS.

As you review their capabilities, be sure that the tooling supports the type of changes your organization requires. For example, be sure that the tool is aware of all the different requirements for making any change you may need.

In my experience, vendors tend to focus on the development side of the DevOps experience and minimize the complexity of database change. All too often the tool demo shows a request to add a column... how boring is that? How about changing a data type from numeric to text? That would be a bit more challenging... or requiring that a tablespace be converted from segmented to Universal as part of the change (perhaps to support larger sizes)?

Monday, March 11, 2024

Mixing Db2 Database Administration with DevOps - Part 3: Automating the DevOps Toolchain

Understanding what is meant by DevOps and the many requisite pieces of the DevOps toolchain is just the first step in adopting DevOps.

The key to success with automating DevOps is a well-constructed toolchain for automating all of the aforementioned processes in an integrated and cooperative manner. If you have been an IT professional for any length of time, most of the categories we just discussed will not be new to you. What is new, however, is likely the integration of these tools, hooked together to work in concert for the purpose of delivering application software quickly and accurately.

Additionally, the list of activities we just reviewed is not comprehensive. A glaring omission is the lack of integration with your database management systems. The orchestration of application changes in the DevOps toolchain is fairly well established, but this is not so much the case for database changes. And this has especially been the case for the mainframe world in terms of incorporating Db2 for z/OS into the DevOps toolchain.

As any good DBA or performance analyst knows, there are multiple management and operational activities required to assure functionality and optimal performance when applications rely on Db2 databases for persistent storage. Perhaps the single most difficult task is managing database schema changes. 

Thursday, March 07, 2024

Mixing Db2 Database Administration with DevOps - Part 2: The DevOps Toolchain

Adopting a DevOps approach to application development is all about moving faster. Automating the multitude of processes required during software development and management is a core part of increasing speed and enabling faster development.

As mentioned in the first part of this series, a DevOps toolchain is used to automate the development process. A toolchain is a set of software tools used to perform a complex development task... or to create a software product. The software tools that comprise a toolchain typically are executed sequentially with the output or state of one tool becoming the input for the next, but this is not a hard-and-fast requirement.

A DevOps toolchain therefore is a set of tools that interoperate with one another to aid in the delivery, development, and management of software applications throughout the software development lifecycle (SDLC). The following components typically are involved in putting together a useful DevOps toolchain.

Orchestration enables the automation of coordinating and managing the SDLC. Whereas automation refers to a single task or capability, orchestration automates a complex, multi-step process. The orchestration tool drives the entire DevOps process. Jenkins is the most popular DevOps orchestration tool.

Continuous Integration (CI) and Continuous Delivery (CD) are important components of the DevOps toolchain. Typically, CI and CD are implemented using collaboration tools that provide a dashboard for integration and delivery of software. Bamboo and Jenkins offer CI and CD capabilities.

Tools that deliver Configuration and Resource Management provide an automated method for maintaining your systems and software infrastructure in a consistent state. This can include servers, storage, networking, and software, with the goal of managing the environment in a known, desired state. Examples of configuration management tools include Chef, Ansible, and Puppet.

Source Control software is used to manage who can change code, who is changing code, and track those changes. It enables developers to work collectively on a complex software project without impacting code changes other developers are making. GitHub is the most popular modern source control tool for DevOps projects.

Collaboration tools aid in the scheduling and tracking of code sprints by delivering transparency to the process for all stakeholders. Many different types of tools can fall into this category, including communication, project management, and service management software. Tools that help developers to catalog and track issues speed delivery by improving the response time for taking corrective action.

Whenever code is being written, it must also be tested. Automated Software Testing tools can improve the speed and quality of software by quickly identifying defects and validating the accuracy of code. 

Deployment tools automate the migration and implementation of application code throughout your environments. They facilitate rapid feedback and continuous delivery in agile development while providing the audit trails, versioning, and approvals needed in production. IBM UrbanCode Deploy is an example of a popular deployment tool for DevOps.

A Container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. Using a container, developers can incorporate everything required to run an application, making it portable and unaffected by differences across multiple environments. Docker is the most popular container software. Kubernetes is frequently used with Docker as a container orchestration system that automates application deployment, scaling, and management.

Monitoring software that can oversee the performance and operation of applications and services is a crucial component of a DevOps pipeline. There are many types of monitoring software, including monitors for IT infrastructure (operating system, network, and database), for end-user response and experience, and for the performance and functionality of the applications themselves.

Thursday, February 29, 2024

Mixing Db2 Database Administration with DevOps - Part 1: Intro to DevOps

Organizations of all types are being impacted by the transformation to the modern digital economy. This digital transformation, driven by the 24/7 expectations of users and customers to access data and apps at any time, from any device, requires the ability to rapidly change technology and software. This means that today’s businesses have to deliver and improve the services and software used by their customers faster than ever before.

To survive and thrive in the new digital economy requires that organizations adopt new and faster methods of developing, testing and delivering application software. The clear trend for modern application development is the use of agile development and DevOps techniques. Why have these techniques been adopted for software development over long-standing techniques like the waterfall approach? It all comes down to improved collaboration, resulting in faster development and therefore a quicker return on investment.

An agile development methodology involves the organization of the team and goals such that software development is iterative, incremental, and evolutionary. Using an agile approach, development and testing activities are undertaken concurrently, unlike the traditional Waterfall development model. Large software projects are broken into pieces such that immediate value can be delivered quickly, in smaller pieces, instead of waiting for a monolithic product to be completed. With continuous testing and integration, the smaller software pieces can be integrated into a larger, final deliverable.

Another key methodology used to speed up the delivery of software is the DevOps approach, which results in small and frequent code changes. Its name is an amalgamation of Development and Operations. As such, DevOps relies on agile development, coupled with a collaborative approach between development and operations personnel during all stages of the application development lifecycle. Such an approach can significantly reduce the lead time for changes, lower the rate of failure, and reduce the mean-time-to-recovery when errors are encountered. With such benefits that can accrue, it is no wonder that DevOps and continuous delivery are gaining in popularity.

Instead of long software development projects that may not deliver value for months, or perhaps even years (such as are common using the Waterfall development methodology), an agile DevOps approach delivers value quickly, and then incrementally over time. DevOps enables the continuous delivery of new functionality demanded by customers in the digital economy.

Succeeding with DevOps, however, requires a cultural shift in which all groups within IT work in collaboration with one another, and where management endorses and cultivates this cultural change. Because DevOps relies upon incremental development and rapid software delivery, your IT department can only thrive if there is a culture of accountability, collaboration, and team responsibility for desired business outcomes. Creating such a culture is not easy.

The goal of DevOps is a constantly repeating cycle of continuous development, continuous integration and continuous deployment. To achieve such a high level of continuous delivery also requires in-depth automation of the entire development lifecycle. This is often referred to as the DevOps toolchain.



Tuesday, March 07, 2023

A Db2 for z/OS DBA Poem

I'm a hip-hopping monster of the CRT,

All the programmers here, they be envying me,

I'm a DBA, a master of data,

Work with Db2 and I diss all the haters!


I strive every day to give smooth operation,

And to keep the data in perfect formation.

My eyes are trained to spot any trouble,

To fix it before the data turns to rubble.


My hands deftly skip over the keyboard, 

Tuning and tweaking all the inefficient code.

I know all the ins and outs of Db2,

Every command, every option, every ZPARM, I do.


My skills are honed by years of practice,

Through all the challenges, I've gained my status.

My expertise and skills are sought by many,

But few can match all my skills, or any.


I'm a Db2 DBA, indeed I am...

A guardian of data, and a z/OS fan!

Monday, November 21, 2022

Db2 Education is a Constant Requirement

Ensuring access to proper Db2 educational materials should be one of the first issues to be addressed after your organization decides to implement Db2. But education sometimes falls through the cracks... 



Does your organization understand what Db2 is? How it works? For what purposes it is needed at your shop? How it will be used? Without a sound understanding of Db2, its components, and features, it is unlikely that you will be able to use Db2 to its best advantage. A basic level of Db2 knowledge can be acquired through a short Db2 fundamentals class for the IT personnel charged with making Db2 a success at your organization. But long-term success with Db2 requires ongoing education.

After addressing the basics of Db2 education, you must support a curriculum of ongoing Db2 education for your organization. This support falls into four categories:

The first category of training is a standard regimen of SQL and Db2 programming training to be used by application developers. Every programmer should receive basic training on SQL, education on how to embed SQL in the programming languages they will be using, and possibly additional courses on using Db2 with specific infrastructure software like MQ, WebSphere, CICS, and IMS. Also, with the importance of distributed access to Db2 these days, a course should be made available on that topic, with specific portions that address the technology used by your shop to provide distributed Db2 data access. If this basic level of Db2 education is not required for every Db2 programmer, then Db2 application performance will surely suffer as untrained coders write inefficient and incorrect SQL. This type of training can be delivered as instructor-led training by in-house SQL coding experts, as computer-based training, or as external classes. It should also be augmented with reference material such as books, articles, and blog posts.

The second category of education support is external training for special needs. This support includes education for database administrators, technical support personnel, and performance analysts. Additionally, your organization needs to plan for ongoing education to keep appropriate personnel up-to-date on new versions and releases of Db2. Although IBM offers great courses for new Db2 releases, several third-party vendors such as ProTech, Interskill Learning, and KBCE regularly offer in-depth training and release-specific Db2 courses and lectures.

The third category of education is in-house, interactive training in the form of videos, computer-based training, and instructor-led courses. These courses should be used to augment and refresh the formal training given to your Db2 professional staff. 

The fourth, and final, category of support is reference material—for example, IBM’s Db2 manuals, Db2 books (such as DB2 Developer's Guide and A Guide to Db2 Performance for Application Developers), vendor-supplied white papers, and industry publications and periodicals. The current IBM manuals for Db2 are always available on the web, as well.

Of course, you should consider augmenting the standard IBM Db2 manuals with IBM Redbooks. IBM Redbooks provide in-depth, detailed coverage of a specific technology topic. IBM publishes Redbooks on multiple subjects, including Db2, IMS, CICS, z/OS, and many other topics. IBM Redbooks can greatly assist Db2 technicians working to understand a feature or nuance of Db2. You can download IBM Redbooks for free in Adobe Acrobat format over the web at http://www.redbooks.ibm.com/

IBM also offers many other useful sources of Db2 information on the web, including white papers, articles, and book excerpts from IBM Press publications. So does the International Db2 User Group (IDUG).

Independent Software Vendors (ISVs) are another rich source of Db2 information. The major vendors provide in-depth technical papers on features of Db2 that would be difficult for most shops to research in the same detail. Be sure to watch for and request the great Db2 Catalog posters, reference guides, white papers, and eBooks available from BMC Software and Broadcom. Other great Db2 ISVs offer educational information, too, such as newsletters, white papers, and webinars; these vendors include InfoTel, SEG, and UBS-Hainer, among others.

All of these educational components—in-house education, external education, and industry publications—are useful for learning how you can use Db2 effectively. You would be wise to have a mix of material that supports more than one of the categories outlined previously. In this way, you provide a varied learning environment that meets the needs of all students, allowing each student to learn in the way that works best for him or her. Plan to provide an on-site library of educational material addressing the following subjects:
  • Introduction to relational databases
  • Introduction to Db2 and SQL
  • Advanced SQL
  • Programming Db2 using (your languages of choice)
  • Programming Db2 in batch
  • Programming Db2 using TSO, CICS, and IMS
  • Programming Db2 and the Web
  • Creating Db2 stored procedures, triggers, and UDFs
  • Programming Db2 in a Distributed Environment
  • Debugging and Problem Analysis

You also might want to have an introductory Db2 database administration course to train new DBAs. In addition to this basic education library, plan to provide advanced education for technical Db2 users, such as DBAs, technical support personnel, and technical programmers and analysts. Advanced DBA topics (such as Sysplex Data Sharing, performance management, and backup/recovery) should be left to instructor-led training courses because of the complex nature of Db2 database administration. Additional advanced topics to consider include system administration (for systems programmers) and disaster recovery. Many vendors, including the ones mentioned above, offer these classes. Searching for smaller consulting firms and local resources is also prudent; these firms sometimes can provide courses tailored to your installation-specific needs.

The advanced education program should include allocating time to attend area user group meetings, as well as the International Db2 User Group (IDUG) conferences held every year in North America, EMEA, and Canada.

The bottom line is simple, though. Plan for ongoing Db2 education for your DBAs, programmers, and analysts... or plan on failing.

Friday, June 17, 2022

My Speaking Schedule at IDUG Db2 Tech Conference in Boston (2022)

Just a quick note to let everybody who is coming to Boston in July for IDUG know what I will be speaking about and when my presentations are scheduled!

First of all, my regular IDUG session this year is titled "Things Your DBAs Hear... and how to stop making them crazy!" This session is based on my decades of experience as a DBA and as a consultant. It walks you through interactions between developers and DBAs, in a light-hearted way. All of them are real-life examples of actual conversations I've been in (or observed). Attend this session to learn what frustrates DBAs and how improving your communication can improve your relationship with your DBAs... and therefore improve your development efforts! This is session E11, and it will be delivered on Wednesday, July 13 at 11:30 AM.

I will also be presenting at two different VSP sessions, one on Tuesday and another on Wednesday (this is the 10:15 AM time slot on both days).

On Tuesday, I will be presenting with InfoTel on the topic "To Protect and Preserve: Treat Your Data Properly or Pay the Consequences." This session will discuss vital data management issues such as data archiving and data protection (my portion), as well as some products that can help you manage your data better (the InfoTel portion).

On Wednesday, I will be presenting "How to Accelerate Db2 SQL Workloads... Without Db2!" for Log-On Software. This session takes a look at in-memory trends and issues, and shines a light on how QuickSelect can improve the performance of SQL queries.

I hope to see you at this year's IDUG North American conference the week of July 11, 2022. If you are there, come see one (or all) of my sessions... and be sure to say "Howdy!"

Friday, July 16, 2021

Keeping Track of Data Movement in Db2 for z/OS

Creating and managing test data for Db2 application development and testing requirements can be a significant challenge. To enable not only the development of new programs, but also the maintenance of existing ones, organizations must ensure that an adequate amount of accurate test data is always available. Without relevant, useful data, there is no way to test applications to make sure they are operating correctly.

Although this duty must be a shared one between the application developers and the DBAs, managing and controlling all of the data movement tasks typically falls on the DBAs. And keeping track of what data moved where, when it moved, and why can at times be as much of a challenge as moving the data itself.

Fortunately, there are test data management tools available to not only move the data, but to keep track of it. I’ve written about one of the better Db2 for z/OS data movement tools here in the blog before: Fast and Effective Db2 for z/OS Test Data Management with BCV5. I hope you'll take a moment to click and read that post.

Now BCV5 has been improved with a new reporting feature to enable users to track the movement of data across their Db2 subsystems. This is a significant new feature that can be used to glean useful information for DBAs, storage administrators, and even data stewards for data governance.

There are six tables of metadata that BCV5 populates to track the data movement and the copy tasks it performs. These tables are:
  • BCV531.TASK_EXECUTIONS
  • BCV531.TASK_EXECUTIONS_OBJECTS
  • BCV531.TASK_EXECUTIONS_PARAMETERS
  • BCV531.TASK_EXECUTIONS_JOBS
  • BCV531.TASK_EXECUTIONS_RULES
  • BCV531.TASK_EXECUTIONS_MASKING
The information in these tables is updated whenever BCV5 runs a task to copy Db2 data. Users can query these tables just like any other Db2 tables to monitor the details of the BCV5 tasks that have been run. This information may be useful for many different IT and business professionals, but let’s take a look at three specific use cases: 
  1. database administration (DBA), 
  2. storage administration, and 
  3. data governance.
DBA Tracking
DBAs tasked with moving and refreshing data from one Db2 environment to another are the typical users of BCV5, and therefore they will be one of the primary users of the new reporting tables. Most sites that use BCV5 use it to refresh test data, for example, copying production data to test, or copying unit test data to an integration test set of tables.  

Regardless of the type of data movement that is being undertaken, it is usually being done for multiple tables, tablespaces, and databases. Usually, there will be regularly scheduled processes that copy some of the data, but this is rarely sufficient as there will be one-off requests, special situations, and emergency data refreshes happening all the time. Keeping track of such a hectic morass of copying and refreshing data can be difficult. 

Fortunately, if you are using BCV5 the new Usage Tracker tables can simplify keeping track of data refreshes for DBAs. For example, a DBA looking to find out which BCV5 copy tasks were run during the month of May could code a query like:

 SELECT T.ROWDATE, T.USER, T.TASKNAME 
 FROM BCV531.TASK_EXECUTIONS T 
 WHERE T.ROWDATE BETWEEN '2021-05-01' AND '2021-05-31' 
 ORDER BY T.ROWDATE ;

This will show all the BCV5 copy tasks that ran during that timeframe, and will look something like this:

ROWDATE      USER     TASKNAME 
---------+---------+---------+---------+--------- 
2021-05-12   USERID1  TSK0001 
2021-05-12   USERID1  TSK0002 
2021-05-20   USERID5  TSKPROD1 
2021-05-21   USERID9  TSKPROD4 
 

The results shown here are just a sample and will likely be a subset of the actual results of running such a query. 

Of course, this is rudimentary information and it is likely that the DBA will want to know more, such as which objects were impacted by these tasks. A query such as the following will come in handy:

SELECT SUBSTR(SRCSCHEMA,1,8) AS SS, 
       SUBSTR(SRCNAME,1,12) AS SN, 
       SUBSTR(TGTSCHEMA,1,8) AS TS, 
       SUBSTR(TGTNAME,1,12) AS TN, 
       SUBSTR(OBJTYPE,1,1), 
       ROWDATE AS DATE_COPIED, 
       SIZEKB 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
ORDER BY DATE_COPIED ;

The results here show all the Db2 objects copied by BCV5, with the source and target names as well as the object type, date copied, and amount of data (in KB) copied:

SS      SN       TS     TN         DATE_COPIED   SIZEKB 
---------+---------+---------+---------+---------+-----
DB500XA TS500X01 DBA001 TS500XA1 S  2021-05-17  1462480 
QUALID  XCL59011 TESTID XCL59011 X  2021-05-17   325040 
QUALID  XCL59012 TESTID XCL59012 X  2021-05-17   125200 
QUALID  XCL59013 TESTID XCL59013 X  2021-05-17   301460 
QUALID  XCL5901C TESTID XCL5901C X  2021-05-17    98400 
QUALID  TEST_TBL TESTID TEST_TBL T  2021-05-17       20 

Again, the results have been truncated as this is intended as an example.

A DBA looking to track down the results of a specific copy task that ran on a specific date might want to run a query like this, to verify which objects were copied. Simply plug in the name of your task and the date it ran:

SELECT T.TASKNAME, O.OBJTYPE, O.SRCSCHEMA, O.SRCNAME, 
       O.TGTSCHEMA, O.TGTNAME, O.PARTITIONS 
FROM   BCV531.TASK_EXECUTIONS         T, 
       BCV531.TASK_EXECUTIONS_OBJECTS O 
WHERE  T.ID = O.EXECID 
AND    T.TASKNAME = ? 
AND    T.ROWDATE = ? ;

Storage Administration Tracking 
Another type of user who might find the Usage Tracker capabilities of BCV5 useful is the storage administrator. Storage administrators are responsible for managing an organization’s disk and tape systems. They are also responsible for monitoring storage usage and capacity to ensure that sufficient storage is available for the organization’s IT requirements. 

As such, the storage administrator will likely want to keep an eye on the data movement activities of BCV5. For example, a query such as this one can be used to report on the total amount of tablespace (S) and index (X) data copied by date:

SELECT ROWDATE AS DATE_COPIED, 
       SUM(SIZEKB) AS TOTAL_KB_COPIED 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
WHERE  OBJTYPE IN ('S', 'X') 
GROUP BY ROWDATE ;

Which will return data similar to this:

DATE_COPIED  TOTAL_KB_COPIED 
---------+---------+---------+---------+------ 
2021-06-12         106231270 
2021-06-19         106231270 
2021-06-21        4451457810 
2021-06-26         106231270 

Another potentially useful query, not only for storage administrators and DBAs, but also for application managers, is tracking the amount of actual (tablespace) data copied by date and application. Finding the application name or identifier can be tricky, but if we assume that an application identifier is embedded in the second and third characters of the database name (for tablespace rows, the schema columns in these tables hold the database name), then a query like this can be run:

WITH SIZEBYAPP AS ( 
  SELECT ROWDATE AS DATE_COPIED, 
         SUBSTR(TGTSCHEMA,2,2) AS APPL_NAME, 
         SIZEKB AS SIZE_IN_KB 
  FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
  WHERE OBJTYPE = 'S' 
                  ) 
SELECT DATE_COPIED, APPL_NAME, 
       SUM(SIZE_IN_KB) AS TOTAL_KB_COPIED 
FROM   SIZEBYAPP 
GROUP BY DATE_COPIED, APPL_NAME ;

Which might return a report looking something like this:

DATE_COPIED APPL_NAME TOTAL_KB_COPIED 
---------+------------------+---------+------- 
2021-05-11  EN               46805760 
2021-05-22  BA              242791056 
2021-05-22  BX                4094640 
2021-05-22  CM                1008720 
2021-05-22  DA              270390816 
2021-05-22  OR                  90528 
2021-05-26  PR               55737376 
2021-05-26  XX              537647328 

You can adjust this query if you want to know the amount of index data copied by date and application, like so (under the same assumption as above for the application identifier):

WITH SIZEBYAPP AS ( 
  SELECT B.ROWDATE AS DATE_COPIED, 
         SUBSTR(T.DBNAME,2,2) AS APPL_NAME, 
         B.SIZEKB AS SIZE_IN_KB 
  FROM   BCV531.TASK_EXECUTIONS_OBJECTS B, 
         SYSIBM.SYSINDEXES              X, 
         SYSIBM.SYSTABLES               T 
 WHERE   B.OBJTYPE = 'X' 
 AND     B.TGTNAME = X.NAME 
 AND     B.TGTSCHEMA = X.CREATOR 
 AND     X.TBNAME = T.NAME 
 AND     X.TBCREATOR = T.CREATOR 
                 ) 
SELECT DATE_COPIED, APPL_NAME, 
       SUM(SIZE_IN_KB) AS TOTAL_KB_COPIED 
FROM   SIZEBYAPP 
GROUP BY DATE_COPIED, APPL_NAME ;

Data Governance Tracking 
Although tracking data movement activities is useful for DBAs, it is also important for data governance reporting. Data governance refers to the processes and standards of ensuring access to high-quality data throughout an organization. Data governance encompasses all aspects of data quality including its accuracy, availability, consistency, integrity, security, and usability. The role of data governance has grown as data privacy rules and regulations have expanded in response to an increasing number of data breaches and hacker attacks. For example, in early July 2021, Colorado passed the Colorado Privacy Act, meaning that Colorado, Virginia, and California have now passed data privacy legislation that impacts how personal data must be governed. More states are certain to follow their lead… and let’s not forget the European GDPR!

These types of regulations provide rights for access, deletion, correction, portability, and protection for personally identifiable information, or PII. Note that “portability” is one aspect of data protection covered under the auspices of such regulations… and BCV5 is a mover of data, so you need to be able to track what data moved where, especially when the data that moved contains any PII. 

So, what types of queries can be run using the new BCV5 reporting tables to help satisfy the needs of data governance? 

Well, if you have identified specific tables that contain personally identifiable information, and that therefore require specific policies to ensure privacy and protection, a data steward might want to run a query that shows all of the times that a specific protected table was copied:

SELECT SUBSTR(SRCSCHEMA,1,8) AS SS, 
       SUBSTR(SRCNAME,1,12) AS SN, 
       SUBSTR(TGTSCHEMA,1,8) AS TS, 
       SUBSTR(TGTNAME,1,12) AS TN, 
       ROWDATE AS DATE_COPIED, 
       SIZEKB 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
WHERE  OBJTYPE = 'T' 
AND    SRCSCHEMA = ? 
AND    SRCNAME = ? 
ORDER BY DATE_COPIED ;

Simply code the appropriate schema (SRCSCHEMA) and table name (SRCNAME) and this query will show all the times that the particular table (say PROD.CUSTOMERS) was copied. A data governance professional with a list of tables that contain PII could alter this query to accept that list as an IN clause instead of the simple equality clause shown here. 
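For instance, here is one possible variation using a hypothetical list of PII tables:

SELECT SUBSTR(SRCSCHEMA,1,8) AS SS, 
       SUBSTR(SRCNAME,1,12) AS SN, 
       SUBSTR(TGTSCHEMA,1,8) AS TS, 
       SUBSTR(TGTNAME,1,12) AS TN, 
       ROWDATE AS DATE_COPIED 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS 
WHERE  OBJTYPE = 'T' 
AND    SRCSCHEMA = 'PROD' 
AND    SRCNAME IN ('CUSTOMERS', 'EMPLOYEES', 'PATIENTS') 
ORDER BY DATE_COPIED ;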

Additionally, BCV5 can de-identify sensitive data using masking. Whenever a task that requires data to be masked is run, information is captured in the TASK_EXECUTIONS_MASKING table. So, a data governance professional might want to run a query like this one to report on all the masking of sensitive data:

SELECT SUBSTR(TBCREATOR,1,8) AS TBCREATOR,
       SUBSTR(TBNAME,1,12) AS TBNAME, 
       SUBSTR(COLNAME,1,18) AS COLNAME, 
       METHOD AS MASKING_METHOD, 
       SQLEXPR 
FROM   BCV531.TASK_EXECUTIONS_MASKING 
ORDER BY ROWDATE ;

This can always be modified to join it to the TASK_EXECUTIONS table to obtain the task name if that is important. And with a little manipulation of the query it is possible to look for tables that contain sensitive data that have been copied using BCV5, but have not had masking applied. 
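One possible shape for such a query is sketched below; it assumes (and you should verify against the BCV5 documentation) that masked objects can be matched to copied objects by table creator, table name, and execution date, and it uses a hypothetical PII table list:

-- Copies of PII tables with no corresponding masking activity
SELECT DISTINCT O.SRCSCHEMA, O.SRCNAME, O.ROWDATE AS DATE_COPIED 
FROM   BCV531.TASK_EXECUTIONS_OBJECTS O 
WHERE  O.OBJTYPE = 'T' 
AND    O.SRCNAME IN ('CUSTOMERS', 'EMPLOYEES') 
AND    NOT EXISTS 
       (SELECT 1 
        FROM   BCV531.TASK_EXECUTIONS_MASKING M 
        WHERE  M.TBCREATOR = O.SRCSCHEMA 
        AND    M.TBNAME    = O.SRCNAME 
        AND    M.ROWDATE   = O.ROWDATE) 
ORDER BY DATE_COPIED ;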

Summary 
BCV5 has offered powerful data movement and masking capabilities for Db2 data for a long time, but now it also offers the ability to track and report on your organization’s Db2 data movement and copy tasks. This new functionality opens up a wealth of useful information for BCV5 users.

Wednesday, October 21, 2020

Automation and the Future of Modern Db2 Data Management

Recently I was invited by BMC Software to participate in their AMI Z Talk podcast series to talk about modern data management for Db2... and I was happy to accept.

Anne Hoelscher, Director of R+D for BMC's Db2 solutions, and I spent about 30 minutes discussing modern data management, the need for intelligent automation, DevOps, the cloud, and how organizations can achieve greater availability, resiliency, and agility managing their mainframe Db2 environment.

Here's a link to the podcast that you can play right here in the blog!


Modern data management, to me, means flexibility, adaptability, and working in an integrated way with a team. Today’s data professionals have to move faster and more nimbly than ever before. This has given rise to agile development and DevOps - and, as such, modern DBAs participate in development teams. And DBA tasks and procedures are integrated into the DevOps pipeline. 

And as all of this DevOps adoption is happening, the amount of data we store, and have to manage, continues to grow faster than ever before.

These are just some of the challenges that Anne and I discuss in this podcast... and at the end, Anne even asks me to predict the future... 

I hope you'll take the time to listen to our discussion and share your thoughts and issues regarding the resiliency and agility required to succeed with modern data management and Db2 for z/OS.

----------

I’d also like to extend an offer to all the listeners of this BMC podcast (and readers of this blog post) to get a discount on my latest book, A Guide to Db2 Performance for Application Developers. The link is https://tinyurl.com/craigdb2

There’s also a link to the book publisher on the home page of my website. Once you are there, click on the link/banner for the book, and when you order from the publisher you can use the discount code 10percent to get 10% off your order of the print or ebook.

 


Wednesday, October 07, 2020

IDUG 2020 EMEA Db2 Tech Conference Goes Virtual

For those of you who have attended an IDUG conference before, you know why I am always excited when IDUG-time rolls around again. And the EMEA event is right around the corner!

Participating in an IDUG conference always delivers a boatload of useful information on how to better use, tune, and develop applications for Db2 – both for z/OS and LUW. IDUG offers phenomenal educational opportunities delivered by IBM developers, vendor experts, users, and consultants from all over the world.

Unfortunately, due to the COVID-19 pandemic, in-person events are not happening this year... and maybe not for some time to come, either. But IDUG has gone virtual, and it is the next best thing to being there! The IDUG EMEA 2020 virtual event will take place November 16–19, 2020. So you have ample time to plan for, register for, and attend this year's event.

If you attended the IDUG North American virtual conference earlier this year, you know that you can still get great Db2 information online at an IDUG event. And there are a ton of great presentations at the EMEA virtual IDUG conference – just check out the great agenda for the event.

Of course, a virtual event does not offer the face-to-face camaraderie of an in-person event, but it still boasts a bevy of educational opportunities. And the cost is significantly less than a traditional IDUG conference, both in terms of the up-front registration fee and because there are no travel costs... 

For just $199, you get full access to the virtual conference, as well as a year-long premium IDUG membership and a complimentary certification or proctored badging voucher. If you're already a premium member, you can add the EMEA 2020 Conference access to your membership for just $99.

You can register here: https://www.idug.org/p/cm/ld/fid=2149

The Bottom Line

So whether you are a DBA, a developer, a programmer, an analyst, a data scientist, or anybody else who relies on and uses Db2, the IDUG EMEA Db2 Tech Conference will be the place to be this November. 

With all of this great stuff available online from this IDUG virtual event, why wouldn’t you want to participate?