Thursday, October 07, 2010

Null Follow-up: IS [NOT] DISTINCT FROM

After publishing the last blog post here on the topic of pesky problems that crop up when dealing with nulls, I received a comment lamenting that I did not address the IS [NOT] DISTINCT FROM clause. So today’s blog post will redress that failure.

First of all, IS [NOT] DISTINCT FROM is a relatively new predicate operator, introduced to DB2 for z/OS in Version 8. It is quite convenient to use in situations where you are looking to compare to columns that could contain NULL.

Before diving into the operator, let’s first discuss the problem it helps to solve. Two columns are not equal if both are NULL, that is because NULL is unknown and a NULL never equals anything else, not even another NULL. But sometimes you might want to treat NULLs as equivalent. In order to do that, you would have to code something like this in your WHERE clause:

WHERE COL1 = COL2
OR (COL1 IS NULL AND COL2 IS NULL)

This coding would cause DB2 to return all the rows where COL1 and COL2 are the same value, as well as all the rows where both COL1 and COL2 are NULL, effectively treating NULLs as equivalent. But this coding although relatively simply, can be unwieldy and perhaps, at least not at first blush, unintuitive.

Here comes the IS NOT DISTINCT FROM clause to the rescue. As of DB2 V8, the following clause is logically equivalent to the one above, but perhaps simpler to code and understand:

WHERE COL1 IS NOT DISTINCT FROM COL2

The same goes for checking a column against a host variable. You might try to code a clause specifying WHERE COL = :HV :hvind (host variable and indicator variable). But such a search condition would never be true when the value in that host variable is null, even if the host variable contains a null indicator. This is because one null does not equal another null - ever. Instead we’d have to code additional predicates: one to handle the non-null values and two others to ensure both COL1 and the :HV are both null. With the introduction of the IS NOT DISTINCT FROM predicate, the search condition could be simplified to just:

WHERE COL1 IS NOT DISTINCT FROM :HV :hvind

3 comments:

GuyC said...

thank you very much for the follow up.
Not only is is [not] distinct from simpler an more intuitive, it is also stage 1 (opposed to the or solution).

Craig S. Mullins said...

Yes, in most cases IS [NOT] DISTINCT FROM will be Stage 1. In the example, we should have Stage 1 and indexable.

However, the following would be Stage 2:

T1.COL1 IS [NOT] DISTINCT FROM T2.COL2

I'm a bit puzzled by your "opposed to..." statement though? All of the predicates in the other solution are Stage 1 and Indexable: COL1 = COL2, COL1 IS NULL, and COL2 IS NULL should all be Stage 1 and Indexable.

COL IS NULL (and COL IS NOT NULL) are stage 2 predicates when they query a column that is defined as NOT NULL. But we wouldn't code that, right? Even so, if we code COL IS NULL and the column is defined as NOT NULL, the table is not accessed because C1 cannot be NULL. So Stage 2 predicate, but no table access.

GuyC said...

Sorry, my mistake: they are indeed all stage1 and each one on its own indexable.

I confused it with an OR solution when comparing to Host variables, where you would have "COL1 = :hv OR (COL1 is null and :hv-ind = -1)"