Monday, November 26, 2012

SQL Coding Guidelines: The Basics


When you are writing SQL statements to access DB2 data, be sure to follow the guidelines below for coding SQL for performance. These are simple, yet important rules to follow when writing your SQL statements. Of course, SQL performance is a complex topic, and understanding every nuance of how SQL performs can take a lifetime. That said, adhering to the following simple rules puts you on the right track to achieving high-performing DB2 applications.


1)  The first rule is to specify only the columns that you actually need to retrieve in the SELECT-list of each SQL SELECT statement. Another way of stating this is “do not use SELECT *”. The shorthand SELECT * means retrieve all columns from the table(s) being accessed. This is fine for quick-and-dirty queries, but it is bad practice for inclusion in application programs because:
  • DB2 tables may need to be changed in the future to include additional columns. SELECT * will retrieve those new columns, too, and your program may not be capable of handling the additional data without requiring time-consuming changes.
  • DB2 will consume additional resources for every column that is requested to be returned. If the program does not need the data, it should not ask for it. Even if the program needs every column, it is better to ask for each column explicitly by name in the SQL statement, both for clarity and to avoid the previous pitfall, as the example below shows.
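
For example, using the same EMP table as the queries that follow, the first statement asks only for the columns the program uses, while the second is the form to avoid:

    -- Preferred: name each column the program actually needs
    SELECT EMPNO, LASTNAME, SALARY
    FROM   EMP;

    -- Avoid: retrieves every column, including any added later
    SELECT *
    FROM   EMP;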

2)  Do not ask for what you already know. This may sound simplistic, but most programmers violate this rule at one time or another. For a typical example, consider what is wrong with the following SQL statement:

    SELECT EMPNO, LASTNAME, SALARY
    FROM   EMP
    WHERE  EMPNO = '000010';

Give up? The problem is that EMPNO is included in the SELECT-list. You already know that EMPNO will be equal to the value '000010' because that is what the WHERE clause tells DB2 to do. But with EMPNO listed in the SELECT-list, DB2 will dutifully retrieve that column too. This causes additional overhead to be incurred, thereby degrading performance.
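
The fix is simply to drop EMPNO from the SELECT-list:

    SELECT LASTNAME, SALARY
    FROM   EMP
    WHERE  EMPNO = '000010';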

3)  Use the WHERE clause to filter data in your SQL instead of bringing it all into your program to filter there. Retrieving everything and filtering in the program is a common rookie mistake. It is much better for DB2 to filter the data before returning it to your program, because DB2 consumes additional I/O and CPU resources to obtain each row of data. The fewer rows passed to your program, the more efficient your SQL will be. So, the following SQL:

    SELECT EMPNO, LASTNAME, SALARY
    FROM   EMP
    WHERE  SALARY > 50000.00;

is better than simply reading all of the data without the WHERE clause and then checking each row in your program to see whether the SALARY is greater than 50000.00.
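
For contrast, here is a sketch of the approach to avoid, with the filtering left to the application:

    -- Inefficient: every row is passed from DB2 to the program
    SELECT EMPNO, LASTNAME, SALARY
    FROM   EMP;
    -- ...the program must then check each fetched row for
    -- SALARY > 50000.00 and discard the rest.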

These rules, though, are not the be-all, end-all of SQL performance tuning – not by a long shot. Additional, in-depth tuning may be required. But following the above rules will ensure that you are not making “rookie” mistakes that can kill application performance. 

Tuesday, November 20, 2012

Happy Thanksgiving

I know it is only Tuesday (right now), but here's wishing all of my readers and all of their families and friends a very happy and restful Thanksgiving holiday. Visit with your relatives... Eat up... Watch football... Relax...


And try not to get trampled on Friday if you're going out shopping for bargains!

Wednesday, October 31, 2012

Eliminate Performance Management Black Holes


Today’s modern applications are not restricted to a single platform. Yet, in many ways, mainframe management and administration professionals are still tackling performance issues and problem resolution using the same methods they did when mainframe applications were all COBOL and never strayed off the centralized mainframe computing platform. And this causes many problems.

The IT infrastructure of most organizations is multi-tiered, and business transactions flow through multiple systems and services to deliver business value. This means that today’s applications can utilize the most appropriate and useful technology for each component, thereby delivering more functional and easier-to-use transactions. Development time can be faster, too.

So far, so good… but performance management and problem resolution become difficult challenges in this brave new world. A business application that relies upon multiple differing computing platforms, technologies, and components to deliver service is characterized by its complexity. Consider an application where users deploy Windows workstations running .NET applications that connect to Linux servers running Oracle, with stored procedures that connect to the mainframe to access DB2 data. Such an infrastructure consists of a lot of connecting points that stitch together significantly differing hardware and software platforms. And what happens if there is a problem? Tracking down its root cause can be difficult.

DBAs and system administrators for each platform may have tooling to monitor and diagnose their particular portion of the infrastructure, but a siloed approach is not pragmatic and usually results in a lot of finger pointing instead of problem solving. You know what I’m talking about – “there’s a problem with DB2!”

The mainframe DBAs usually have a performance monitor for DB2, and the systems programmers have monitors for CICS and z/OS. And the distributed DBAs and system administrators have their monitors, too, for Oracle, SQL Server, Linux, Unix, and so on.

So let’s assume that this multi-tier application begins exhibiting poor performance. Where do you start? You can’t fix what you can’t see, so unless the problem is easily monitored within an individual silo, discovering what is wrong and fixing the issue becomes a lengthy, troublesome, and expensive task. The problem could be anywhere, so it’s a major undertaking to pinpoint the root cause.

To the mainframe professionals, the application goes into a “black hole” when it leaves the mainframe; to the distributed computing professionals, it is just the other way around, with the mainframe being the “black hole.” Even with visibility into some components of the application, at some point the transactions vanish into one of the black holes. A siloed approach to managing performance just does not work when the application spans multiple silos. What is needed is an application performance management solution.

Organizations must be able to track business transactions from end to end; that is, from the time the user makes a request, through the entire infrastructure required to provide an answer, and right back to the end user. Such a tracking solution must be able to follow the transaction in real time and report on the resources consumed at each point along the way. By providing real-time monitoring of each transaction flowing end-to-end across distributed applications, it becomes possible to see the problem, identify its root cause, determine performance issues, and solve problems more rapidly and less expensively.

Think about it. What are the applications and transactions like at your organization these days? How many can satisfy business needs completely on a single platform? Have you ever tried to resolve an issue or identify the root cause of a problem for an application that spans multiple platforms? When did the trail go cold because the transaction disappeared? And how did you move past all the finger-pointing?

Compuware’s latest offering, APM for Mainframe, delivers end-to-end transaction management so that the mainframe is no longer a black hole. Using this solution you can track your distributed applications across every platform, find the root causes of problems and performance issues, and resolve them on the spot.

Without such a solution you just have to keep living with the pain. And that means unresponsive distributed applications, slower problem resolution, lost revenue, and higher administrative costs.

Friday, October 12, 2012

Book Signing and Discount at IOD Conference

Just a short blog post today to let everybody know that I will be doing a book signing for the new edition of both of my books at the 2012 Information on Demand conference on Tuesday, October 23rd.


You can choose to have me sign the new 6th edition of my DB2 book, DB2 Developer's Guide... or the new 2nd edition of my DBA book, Database Administration: The Complete Guide to DBA Practices and Procedures.

Or you can be my favorite type of person and choose to get them both signed!
See you in Vegas!

Wednesday, October 03, 2012

DB2 11 for z/OS: Coming Soon!

Today, October 3, 2012, IBM announced the Early Support Program (ESP) for the next version of DB2 for z/OS, heretofore code-named Sequoia. The DB2 11 ESP is planned for March 8, 2013 availability. So if you are still lagging behind on DB2 9 (or, heaven forbid, DB2 V8), now is the time to get yourself to DB2 10 so you'll be ready for the imminent onslaught of DB2 11 for z/OS.


The announcement (212-364) offers up a bit of information on some of the features that are planned to be available in DB2 11, including:

  • Performance improvements for some types of online transaction processing (OLTP), heavy insert, select query workloads, and when running queries against compressed tables;
  • Improved data sharing performance and efficiency;
  • Improved utility performance and additional zIIP eligible workload (particularly with LOAD and RUNSTATS);
  • Cost-effective archiving of warm and cold data with easy access to both within a single query;
  • Intelligent statistics gathering and advanced optimization technology for efficient query execution in dynamic workloads;
  • Additional online schema changes that simplify management, reduce the need for planned outages, and minimize the need for REORG;
  • Productivity improvements for DBAs, application developers, and system administrators;
  • Efficient real-time scoring within your existing transaction environment;
  • Enhanced analysis, forecasting, reporting, and presentation capabilities, as well as improved storage management, in QMF;
  • Expanded SQL, SQL PL, temporal, and XML functionality for better application performance;
  • Faster migration with application protection from incompatible SQL and XML changes and simpler catalog migration.

One of the more intriguing tidbits is the new SQL PL ARRAY support. IBM is claiming this feature may be able to offer up to 10 percent CPU savings for OLTP workloads with high read-to-write ratios. 
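
The announcement doesn’t spell out the syntax, but based on the array support already available in DB2 for Linux, Unix, and Windows, a minimal sketch of SQL PL array usage might look like the following; the type and procedure names (INTARRAY, SUM_VALUES) are invented for illustration, and the actual DB2 11 for z/OS syntax may differ:

    -- Hypothetical sketch: an ordinary array type and an SQL PL
    -- procedure that sums its elements.
    CREATE TYPE INTARRAY AS INTEGER ARRAY[100];

    CREATE PROCEDURE SUM_VALUES (IN  IN_VALUES INTARRAY,
                                 OUT TOTAL     INTEGER)
    LANGUAGE SQL
    BEGIN
      DECLARE I INTEGER DEFAULT 1;
      SET TOTAL = 0;
      WHILE I <= CARDINALITY(IN_VALUES) DO     -- number of elements
        SET TOTAL = TOTAL + IN_VALUES[I];      -- 1-based element access
        SET I = I + 1;
      END WHILE;
    END

If it works this way, arrays should be especially handy for passing lists of values into stored procedures without resorting to delimited strings or temporary tables.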

So get ready for DB2 11 - I'm sure we'll hear more about it at the IOD conference, coming up at the end of the month.