Sunday, March 04, 2012

Fetching Multiple Rows

When you need to retrieve multiple rows, consider deploying a multi-row fetch to transfer more than one row using a single FETCH statement. This capability was added as of DB2 Version 8.

A multi-row FETCH retrieves multiple rows at one time into column arrays in your application program. By fetching multiple rows at once, your request can become more efficient, especially for distributed requests. The performance improvement using multi-row FETCH in general depends on several factors, such as whether the request is distributed, the number of rows to be fetched, the complexity of the SELECT statement, and the number of columns being fetched.

Nevertheless, using multi-row FETCH (in local environments) can improve performance with a significant reduction of CPU time possible. Tests conducted by IBM have shown performance gains of between 25% and 40% processing 10 rows per SQL statement for programs processing a considerable number of rows. With such significant gains possible, why hasn’t everyone moved to multi-row FETCH? Well, perhaps because it requires programming changes. A multi-row FETCH requires a cursor defined with rowset positioning. A rowset is a group of rows that are operated on as a set. Such a cursor enables your program to retrieve more than one row using a single FETCH statement. By fetching multiple rows at once, your request might become more efficient, especially for distributed requests.

 To use this feature, you must DECLARE your cursor as using the WITH ROWSET POSITIONING parameter. For example

EXEC SQL
  DECLARE CURSOR SAMPCURS
  WITH ROWSET POSITIONING
  FOR
  SELECT DEPTNO
  FROM   DSN81010.DEPT
END-EXEC.

Furthermore, to use a multi-row fetch you must have defined the appropriate structures to receive multi-row data. This means you must have defined an array of host variables into which the fetched rows can be placed. Each column fetched requires its own host variable array into which its values will be placed. Be sure to match the array size to the rowset size. With the appropriate setup coded, FETCH statements can be written to retrieve more than a single row from the result set. For example

FETCH ROWSET FROM SAMPCURS
  FOR 5 ROWS
  INTO HOSTVAR-ARRAY;

As you can see, the multiple-row fetch block is identical to a single-row-fetch block, except that there are two additional clauses—ROWSET and FOR n ROWS. The ROWSET clause specifies that the orientation of this cursor is rowset positioning (instead of single row). The FOR n ROWS clause specifies the size of the rowset to be returned. The maximum rowset size is 32,767.

Rowset cursors are very useful when you need to retrieve many rows or large amounts of data in distributed systems. By retrieving multiple rows with a single FETCH, multiple trips between the application and the database can be eliminated, thereby improving network performance.

To learn more about multi-row FETCH consider attending my upcoming webinar on the topic. This presentation will introduce and define multi-row FETCH, how to use it, and the performance implications of doing so. The presentation will also touch upon multi-row UPDATE. And it will introduce the new SoftBase Attach Facility MRF Feature, which allows you to implement multi-row FETCH without coding changes. To attend, sign up at this link: https://www1.gotomeeting.com/register/754473201

Tuesday, February 28, 2012

Identifying Unused Indexes

Did you know that DB2 V9 added a new column to the Real Time Statistics to help identify unused indexes?

The LASTUSED column in the SYSINDEXSPACESTATS table contains a date indicating the last time the index was used. Any time the index is used to satisfy a SELECT, FETCH, searched UPDATE, searched DELETE, or to enforce a referential constraint, the date is updated.

This helps to solve the problem of determining whether or not an index is being used. Standard operating advice is to DROP or delete anything that is not going to be used. But trying to determine whether something is actually used or not can be difficult at times.

You could always query your PLAN_TABLEs or the plan and package dependency tables in the DB2 Catalog for static SQL. But what about dynamic SQL? That is more difficult. Now, as of DB2 V9, you can simply query the LASTUSED column to see when the index was last used. The LASTUSED date is by partition. So, for a partitioned index, the last used date of each partition in the index should be checked.

Of course, you will have to give it some time because you might have an index supporting a rarely used query. Most shops have queries and programs that run quarterly, or even annually, but nevertheless are very important... and you wouldn't want to drop indexes on those queries even though they do not run frequently because when they do run, they are important...

Examine the LASTUSED column over time to determine which indexes are truly not being used, and then DROP the unused indexes.

Thursday, February 16, 2012

Update on DB2 Developer's Guide, 6th edition

I know a lot of my readers are waiting on the updated edition of my book, DB2 Developer's Guide, so I thought I'd post a short update on the progress. The technical edits are over and production will be starting soon. The book is scheduled now for publication in early May 2012 and is available to be pre-ordered now on amazon com.



The book has been completely updated and is now up-to-date with DB2 10 for z/OS. Just think of the things that have been added to DB2 since the last time the book was updated: Universal table spaces, pureXML, SECADM, hashes, new data types, INSTEAD OF triggers, temporal support, and much, much more.

Consider pre-ordering a copy today so you'll get it as soon as it comes off the presses!

Thursday, January 26, 2012

A Forced Tour of Duty


Mainframe developers are well aware of the security, scalability, and reliability of mainframe computer systems and applications. Unfortunately, though, the bulk of new programmers and IT personnel are not mainframe literate. This should change. But maybe not for the reasons you are thinking.

Yes, I am a mainframe bigot. I readily admit that. In my humble opinion there is no finer platform for mission critical software development than the good ol’ mainframe. And that is why every new programmer should have to work a tour of duty on mainframe systems and applications after graduating from college.

Why would I recommend such a thing? Well, it is because of the robust system management processes and procedures which are in place and working extremely well within every mainframe shop in the world. This is simply not the case for Windows, Unix, and other platforms. By working on mainframe systems newbies will learn the correct IT discipline for managing mission critical software.

What do I mean by that? How about a couple of examples: It should not be an acceptable practice to just insert a CD and indiscriminately install software onto a production machine. Mainframe systems have well-documented and enforced change management procedures that need to be followed before any software is installed into a production environment.

Nor should it be acceptable to just flip the switch and reboot the server. Mainframe systems have safeguards against such practices. And mainframes rarely, if ever, need to be restarted because the system is hung or because of a software glitch. Or put in words PC dudes can understand: there is no mainframe “blue screen of death.” Indeed, months, sometimes years, can go by without having to power down and re-IPL the mainframe.

And don’t even think about trying to get around security protocols. In mainframe shops there is an entire group of people in the operations department responsible for protecting and securing mainframe systems, applications, and data. Security should not be the afterthought that it is in the Windows world.

Ever wonder why there are no mainframe viruses? A properly secured operating system and environment make such a beast extremely unlikely. And with much of the world’s most important and sensitive data residing on mainframes, don’t you think the hackers out there would just love to crack into those mainframes more frequently?

Project planning, configuration management, capacity planning, job scheduling and automation, storage management, database administration, operations management, and so on – all are managed and required in every mainframe site I’ve ever been involved with. When no mainframe is involved many of these things are afterthoughts, if they’re even thought of at all.

Growing up in a PC world is a big part of the problem. Although there may be many things to snark about with regard to personal computers, one of the biggest is that they were never designed to be used the way that mainframes are used. Yet we call a sufficiently “pumped-up” PC a server – and then try to treat it like we treat mainframes. Oh, we may turn it on its side and tape a piece of paper on it bearing a phrase like “Do Not Shut Off – This is the Production Server”… but that is a far cry from the glass house that we’ve built to nourish and feed the mainframe environment.

Now to be fair, strides are being made to improve the infrastructure and best practices for managing distributed systems. Some organizations have built an infrastructure around their distributed applications that rivals the mainframe glass house. But this is more the exception than the rule. With time, of course, the policies, practices, and procedures for managing distributed systems will improve to mainframe levels.

But the bottom line is that today’s distributed systems – that is, Linux, Unix, and Windows-based systems – typically do not deliver the stability, availability, security, or performance of mainframe systems. As such, a forced tour of duty supporting or developing applications for a mainframe would do every IT professional a whole world of good.

Tuesday, January 17, 2012

Row and Column Access Control in DB2 Version 10


Row and column access control enables you to manage access to a table at the level of a row, a column, or both. It enables you to build policies for the particulars of which data can be accessed by specific users, groups, or roles. Row access can be controlled using row permissions and column access control can be accomplished using column masks.

Row and column access control differs from multilevel security in that it is integrated into the database system. All applications and tools that use SQL to access the database are automatically subject to the same control. Sensitive data need not be filtered at the application level when row and column access control is in place.

Prior to row permissions and column masks, row and column level security was implemented in DB2 using views or stored procedures. Using views and stored procedures is a viable approach for simple requirements, but it breaks down as a solution for more complex requirements. When a large number of views are built to support your security needs, it can be difficult to administer as the views need to be updated and maintained.

Let’s see how row permissions and column masks can be used to improve upon row- and column-level security.

Row Permissions: Row Access Control

A row permission must be created and activated to be enforced. The structure of a permission will be familiar to anyone who is used to coding SQL statements. The CREATE PERMISSION statement is used to create a row permission.

Let’s consider an example using a banking system. Assume that bank tellers should only be able to access customers from their local branch. But customer service representatives (CSRs) should be allowed to access all customer data. Assume further, that secondary authids are setup such that tellers have a secondary authid of TELLER, and CSRs have a secondary authid of CSR. Given this scenario, the following row permissions can be created to institute these policies:

CREATE PERMISSION TELLER_ROW_ACCESS
ON     CUST
FOR ROWS WHERE VERIFY_GROUP_FOR_USER(SESSION_USER, ′TELLER′) = 1
AND
BRANCH = (SELECT HOME_BRANCH
          FROM   INTERNAL_INFO
          WHERE  EMP_ID = SESSION_USER)
ENFORCED FOR ALL ACCESS
ENABLE;

COMMIT;

CREATE PERMISSION CSR_ROW_ACCESS
ON     CUST
FOR ROWS WHERE VERIFY_GROUP_FOR_USER(SESSION_USER, ′CSR′) = 1
ENFORCED FOR ALL ACCESS
ENABLE;

COMMIT;



These row permissions will not be enforced, however, until they are activated by alteringthe table, for example:
ALTER TABLE CUST
 ACTIVATE ROW ACCESS CONTROL;

COMMIT;




With the row permissions in force, when tellers SELECT from the CUST table they will only be able to “see” customer data for their branch, whereas customer service representatives can see all customer data.

These row permission definitions use the VERIFY_GROUP_FOR_USER built-in function. This function returns a value indicating whether the primary authid and the secondary authids that are associated with the first argument are in the authorization names specified in the list of the second argument.


Data Masking: Column Access Control

Column access control allows you to manage access to a table with filtering and data masking. As with a row permission, a column mask must be created and activated before it can be enforced. The column mask defines the rules to be used for masking values returned for a specified column.

You use the CREATE MASK statement to create a column mask. Multiple column masks can be created for a table, but each column can have only one mask. The table and column must exist before the mask can be created.

For example, you can create a mask for employee social security numbers (assuming the table name is EMP and the column name is SSN) as follows:


CREATE MASK SSNMASK
ON     EMP
FOR COLUMN SSN RETURN
  CASE
    WHEN (VERIFY_GROUP_FOR_USER(SESSION_USER, ′PAYROLL′) = 1)
    THEN SSN
    WHEN (VERIFY_GROUP_FOR_USER(SESSION_USER, ′HR′) = 1)
    THEN ′XXX-XX-′ || SUBSTR(SSN,8,4)
    ELSE NULL
  END
ENABLE;

COMMIT;

This mask will return the actual data when accessed by a user in accounting, a version with the first 5 digits masked when access by human resources, and null for anyone else. Of course, column access control must be activated for the table before any mask will be enforced:


ALTER TABLE EMP
 ACTIVATE COLUMN ACCESS CONTROL;
 
COMMIT;


Summary

Using row and column access control a security administrator can enforce detailed security policies for the databases under their control.