Wednesday, April 25, 2012
IDUG DB2 Technical Conference 2012
Thursday, March 22, 2012
Working With Variable Data
- Avoid variable columns if sufficient disk space is available to store the data using fixed columns. The general rule is: disk is cheap! Trading the cost of disk for more efficient development and subsequent performance is often an easy decision to make.
- Though this may be obvious, do not define a variable column when the data does not vary from row to row.
- Do not define a variable column if its maximum length is less than 30 bytes. Furthermore, do not define a variable column if its maximum length is within 10 bytes of the average length of the column. Adhering to these two rules will cause you to choose VARCHAR data types only when they can potentially provide enough DASD savings to offset other costs.
- Consider redefining variable columns by placing multiple rows of fixed length columns in another table or by shortening the columns and placing the overflow in another table.
If, after following these guidelines, VARCHAR columns need to be implemented, go ahead and do so. However, it is wise to continuously re-assess the decision to use variable character data.
For example, the following query reports the distribution of PROJNAME lengths:

SELECT COL_LGTH, COUNT(*)
FROM  (SELECT LENGTH(PROJNAME) AS COL_LGTH
       FROM   DSN81010.PROJ) AS T
GROUP BY COL_LGTH
ORDER BY COL_LGTH;
SELECT 24*COUNT(*),
       SUM(2+LENGTH(PROJNAME)),
       24*COUNT(*)-SUM(2+LENGTH(PROJNAME)),
       24,
       AVG(2+LENGTH(PROJNAME)),
       24-AVG(2+LENGTH(PROJNAME))
FROM   DSN81010.PROJ;
SPACE     SPACE        TOTAL   AVG.      AVG.         AVG.
USED AS   USED AS      SPACE   SPACE AS  SPACE AS     SPACE
CHAR(24)  VARCHAR(24)  SAVED   CHAR(24)  VARCHAR(24)  SAVED
--------  -----------  ------  --------  -----------  -----
  158058        96515   61543        24           16      8
Definition                       Calculation
-------------------------------  -----------------------------------
Space Used As CHAR(24)           24*COUNT(*)
Space Used As VARCHAR(24)        SUM(2+LENGTH(PROJNAME))
Total Space Saved Using VARCHAR  24*COUNT(*)-SUM(2+LENGTH(PROJNAME))
Avg. Space Used As CHAR(24)      24
Avg. Space Used As VARCHAR(24)   AVG(2+LENGTH(PROJNAME))
Avg. Space Saved Using VARCHAR   24-AVG(2+LENGTH(PROJNAME))
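The arithmetic behind these calculations is easy to sanity-check outside of DB2. The following sketch applies the same formulas, assuming the 2-byte length prefix per VARCHAR value used in the queries above; the sample PROJNAME lengths are invented for illustration.

```python
# Space-savings arithmetic for CHAR(24) vs. VARCHAR(24),
# assuming a 2-byte length indicator per VARCHAR value.
MAX_LEN = 24          # CHAR(24) always occupies 24 bytes
PREFIX = 2            # VARCHAR carries a 2-byte length indicator

# Hypothetical LENGTH(PROJNAME) values (not from the post)
name_lengths = [10, 14, 16, 18, 22]

space_as_char = MAX_LEN * len(name_lengths)
space_as_varchar = sum(PREFIX + n for n in name_lengths)
saved = space_as_char - space_as_varchar

print(space_as_char, space_as_varchar, saved)  # 120 90 30
```

With these sample lengths the average stored VARCHAR size is 18 bytes, a 6-byte saving per row; run the same arithmetic on your real length distribution before deciding.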
Sunday, March 04, 2012
Fetching Multiple Rows
A multi-row FETCH retrieves multiple rows at one time into column arrays in your application program. By fetching multiple rows at once, your request can become more efficient, especially for distributed requests. The performance improvement using multi-row FETCH in general depends on several factors, such as whether the request is distributed, the number of rows to be fetched, the complexity of the SELECT statement, and the number of columns being fetched.
Nevertheless, using multi-row FETCH (even in local environments) can improve performance, with a significant reduction in CPU time possible. Tests conducted by IBM have shown performance gains of between 25% and 40% when processing 10 rows per SQL statement in programs that process a considerable number of rows. With such significant gains possible, why hasn't everyone moved to multi-row FETCH? Well, perhaps because it requires programming changes. A multi-row FETCH requires a cursor defined with rowset positioning. A rowset is a group of rows that are operated on as a set; such a cursor enables your program to retrieve more than one row using a single FETCH statement.
To use this feature, you must DECLARE your cursor using the WITH ROWSET POSITIONING parameter. For example:

EXEC SQL
  DECLARE SAMPCURS CURSOR
  WITH ROWSET POSITIONING FOR
  SELECT DEPTNO
  FROM   DSN81010.DEPT
END-EXEC.
Furthermore, to use a multi-row FETCH you must define the appropriate structures to receive the multi-row data. This means defining an array of host variables into which the fetched rows can be placed; each column fetched requires its own host variable array into which its values will be placed. Be sure to match the array size to the rowset size. With the appropriate setup coded, FETCH statements can be written to retrieve more than a single row from the result set. For example:

FETCH NEXT ROWSET FROM SAMPCURS
  FOR 5 ROWS
  INTO :HOSTVAR-ARRAY;
As you can see, a multi-row FETCH is identical to a single-row FETCH, except for two additional clauses: ROWSET and FOR n ROWS. The ROWSET clause specifies that the orientation of this cursor is rowset positioning (instead of single row). The FOR n ROWS clause specifies the size of the rowset to be returned. The maximum rowset size is 32,767.
Rowset cursors are very useful when you need to retrieve many rows or large amounts of data in distributed systems. By retrieving multiple rows with a single FETCH, multiple trips between the application and the database can be eliminated, thereby improving network performance.
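Rowset positioning is specific to DB2, but the batching idea it embodies appears in other database APIs as well. As a rough analogy only (not DB2 syntax), here is a minimal sketch using Python's DB-API fetchmany() against an in-memory SQLite table; the table and department numbers are invented stand-ins for DSN81010.DEPT.

```python
import sqlite3

# Hypothetical stand-in for DSN81010.DEPT with 12 departments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno TEXT)")
conn.executemany("INSERT INTO dept VALUES (?)",
                 [(f"D{i:03d}",) for i in range(12)])

cur = conn.execute("SELECT deptno FROM dept ORDER BY deptno")
batches = []
while True:
    rowset = cur.fetchmany(5)   # one "rowset" per fetch call
    if not rowset:
        break
    batches.append([r[0] for r in rowset])

print([len(b) for b in batches])  # [5, 5, 2]
```

As with a rowset cursor, the final fetch returns a short batch, and each call retrieves a group of rows rather than one row at a time.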
To learn more about multi-row FETCH consider attending my upcoming webinar on the topic. This presentation will introduce and define multi-row FETCH, how to use it, and the performance implications of doing so. The presentation will also touch upon multi-row UPDATE. And it will introduce the new SoftBase Attach Facility MRF Feature, which allows you to implement multi-row FETCH without coding changes. To attend, sign up at this link: https://www1.gotomeeting.com/register/754473201
Tuesday, February 28, 2012
Identifying Unused Indexes
The LASTUSED column in the SYSINDEXSPACESTATS table contains a date indicating the last time the index was used. Any time the index is used to satisfy a SELECT, FETCH, searched UPDATE, searched DELETE, or to enforce a referential constraint, the date is updated.
This helps solve the problem of determining whether or not an index is being used. Standard operating advice is to drop anything that is not going to be used, but determining whether something is actually used can be difficult.
You could always query your PLAN_TABLEs, or the plan and package dependency tables in the DB2 Catalog, to find the indexes used by static SQL. But what about dynamic SQL? That is more difficult. Now, as of DB2 V9, you can simply query the LASTUSED column to see when an index was last used. LASTUSED is maintained by partition, so for a partitioned index the last used date of each partition should be checked.
Of course, you will have to give it some time, because an index might support a rarely used query. Most shops have queries and programs that run only quarterly, or even annually, but are nevertheless very important. You would not want to drop the indexes supporting those queries just because they run infrequently, because when they do run, they matter.
Examine the LASTUSED column over time to determine which indexes are truly not being used, and then DROP the unused indexes.
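The triage this suggests can be sketched simply. The following illustration assumes you have exported index names and their most recent LASTUSED dates (most recent across partitions, for partitioned indexes); the index names and dates are invented.

```python
from datetime import date, timedelta

# Hypothetical (index, last_used) pairs exported from
# SYSINDEXSPACESTATS; names and dates are illustrative only.
index_last_used = {
    "XPROJ1": date(2012, 2, 20),
    "XPROJ2": date(2010, 11, 5),
    "XEMP1":  date(2012, 1, 15),
}

# Flag indexes untouched for a full year; a long window protects
# indexes that support only quarterly or annual jobs.
today = date(2012, 2, 28)
cutoff = today - timedelta(days=365)
drop_candidates = sorted(ix for ix, used in index_last_used.items()
                         if used < cutoff)
print(drop_candidates)  # ['XPROJ2']
```

The point of the wide cutoff is exactly the caveat above: an index unused for a quarter may still be vital, so only indexes idle well beyond your longest batch cycle should become DROP candidates.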
Thursday, February 16, 2012
Update on DB2 Developer's Guide, 6th edition
The book has been completely updated and is now up-to-date with DB2 10 for z/OS. Just think of the things that have been added to DB2 since the last time the book was updated: Universal table spaces, pureXML, SECADM, hashes, new data types, INSTEAD OF triggers, temporal support, and much, much more.
Consider pre-ordering a copy today so you'll get it as soon as it comes off the presses!