Monday, May 14, 2007

Index on Expressions [DB2 9 for z/OS]

DB2 9 for z/OS offers, for the first time, the ability to create an index on data that is not technically in the table. At this point, you may well be asking “What does that mean?” and “Why would I do that?” I’ll attempt to answer both of those questions.

Basically, the neat new feature is an extension to the CREATE INDEX statement that lets you create an index on an expression. That begs the question of just what is an expression… Well, an expression can be as simple as a column reference, or it can be a built-in function invocation or even a more general expression, with certain restrictions.

The expression that is being indexed is referred to as the key-expression. The maximum length of the text string of each key-expression is 4000 bytes after conversion to UTF-8. Each key-expression must contain as least one reference to a column of the table named in the ON clause of the index. Referenced columns cannot be LOB, XML, or DECFLOAT data types nor can they be a distinct type that is based on one of those data types. Referenced columns cannot include any FIELDPROCs nor can they include a SECURITY LABEL. Furthermore, the key-expression cannot include any of the following:
  • subquery
  • aggregate function
  • not deterministic function
  • function that has an external action
  • user-defined function
  • sequence reference
  • host variable
  • parameter marker
  • special register
  • CASE expression
  • OLAP specification
Unlike a simple index, the index key of an index on expression is composed by concatenating the result (also known as a key-target) of the expression that is specified in the ON clause. An index that is created on an expression lets a query take advantage of index access (if the index is chosen by the optimizer) and avoid a table space scan.

Perhaps an example would help to clarify this topic. Consider this sample DDL:

CREATE INDEX XUPLN
ON EMP
(UPPER(LAST_NAME, 'En_US'))
USING STOGROUP DSN8G910
PRIQTY 360 SECQTY 36
ERASE NO
COPY YES;

This example would create an index on the LAST_NAME column of the EMP table, but would index on the data after applying the UPPER function. This is useful if you store the data in mixed case but submit queries with parameter markers (or host variables) in upper case only. By indexing the data as upper case the index better matches your queries.

Query performance can be enhanced if the optimizer chooses that index. When you use an index on an expression, the results of the expressions are evaluated during insertion time or during an index rebuild and are kept in the index. If the optimizer chooses to use that index, the predicate is evaluated against the values that are stored in the index. As a result, run-time performance overhead is eliminated.

This is a nice new feature of DB2 9 for z/OS that you should consider if you have queries that search on expressions.

8 comments:

  1. I'm using DB2 9.5 in an AIX system.
    Are these expression indexes supported in LUW systems?

    Thanks in advance.

    ReplyDelete
  2. DB2 for LUW does not yet support index on expressions.

    ReplyDelete
  3. Anonymous2:09 PM

    Hi Craig,

    I did implemented the Epression index as below:
    CREATE TYPE 2 UNIQUE INDEX S.SRHSMRS1
    ON S.SRHSMRS
    (UCASE(I_RO_NUM,'En_US')
    ,I_VIN_LAST_8 ASC
    ,I_SERV_COUPN ASC
    )
    USING VCAT DB2L
    FREEPAGE 0
    PCTFREE 10
    BUFFERPOOL BP0
    CLOSE NO ;
    commit;
    My table resides in mainframe only. I included locale name becoz I was getting below error:
    DSNT408I SQLCODE = -356, ERROR: KEY EXPRESSION 1 IS NOT VALID, REASON CODE = 9
    DSNT418I SQLSTATE = 429BX SQLSTATE RETURN CODE
    DSNT415I SQLERRP = DSNXOOS1 SQL PROCEDURE DETECTING ERROR .

    After creation when I am explaining below query using BMC tool:
    SELECT rs.I_SERV_COUPN
    ,rs.A_SERV_RVAL
    FROM S.SRHSMRS rs
    WHERE upper(rs.I_RO_NUM) = ?
    and rs.I_VIN_LAST_8 = ?
    and rs.I_CAMPAIGN = ?
    and rs.C_DLR = ?
    WITH UR
    It is still showing ACC_TYPE as "R", which is table space scan.
    Can you advice how can i reduce the SU rate or why it is not picking the index for above query.

    Thanks in advance.

    Ajay

    ReplyDelete
  4. Ajay:

    I believe that the expression in the SQL must match the expression in the index exactly. In this case, that means the locale is probably causing the problem. Have you tried adding the same locale to the query as you use in the index?

    Regards,
    Craig

    ReplyDelete
  5. Actually, I just found a blog post out there in the wild that specifies this problem exactly. Read this --> http://ibm-mainframe-tips.blogspot.com/2012/01/db2-for-zos-index-on-expression.html

    ReplyDelete
  6. Anonymous3:15 PM

    Craig,

    Thanks for quick response on my query!!.

    Is there a way we can avoid locale name in db2v10 so that application team need not to change their existing query.

    ReplyDelete
  7. Well, the indexed expression and the expression in the SQL statement must completely match. And the SQL Reference manual explicitly states the following: If key-expression references the LOWER or UPPER functions, the input string-expression cannot be FOR BIT DATA, and the function
    invocation must contain the locale-name argument.

    So it looks like your only option is to continue with non-index access because of the UCASE, or change the SQL WHERE clause to specify the locale.

    Sorry for the bad news.

    ReplyDelete
  8. Anonymous5:09 AM

    I have tried expression index but seems there is no improvement in performance.

    ReplyDelete