Monday, September 24, 2007

Histogram Statistics [DB2 9 for z/OS]

Another utility upgrade that has found its way into DB2 9 for z/OS is the ability to gather histogram statistics. This feature is already available in DB2 for Linux, Unix, and Windows… and after you migrate to DB2 V9 it will be available to you on z/OS.

What is it? Well, let’s first define what a histogram is for those of you who are not statistics experts. A histogram is a way of summarizing data that is measured on an interval scale. A histogram is particularly helpful to quickly highlight how data is distributed; to determine if data is symmetrical or skewed; and to indicate whether or not outliers exists.

The histogram is only appropriate for variables whose values are numerical and measured on an interval scale. It is generally used when dealing with large data sets. Histogram statistics can be quite useful to the optimizer for certain types of queries.

Instead of the frequency statistics, which are collected for only a subset of the data, sometimes DB2 can improve access path selection by estimating predicate selectivity from histogram statistics, which are collected over all values in a table space.

Consider collecting histogram statistics to improve access paths for troublesome queries with RANGE, LIKE, and BETWEEN predicates. They can also help in some cases for =, IS NULL, IN LIST and COL op COL predicates.

How to Collect Histogram Statistics

IBM RUNSTATS in DB2 V9 can collect statistics by quantiles. DB2 allows up to 100 quantiles. The user can specify how many quantiles DB2 is to use from 1 to 100. Of course, avoid 1 because it will not help.

You can tell RUNSTATS to collect histogram statistics by coding the HISTOGRAM keyword in conjunction with the COLGROUP option. In this way you can collect histogram statistics for a group of columns. You must also tell DB2 the number of quantiles to collect by specifying the NUMQUANTILES parameter. NUMQUANTILES can also be specified with the INDEX parameter, in which can it indicates that histogram statistics are to be collected for the columns of the index.

A single value can never broken into more than one interval. This means that the maximum number of intervals is equal to the number of distinct column values. Therefore, be sure that you do not specify a value for NUMQUANTILES that is greater than the total number of distinct values for the column (or column group) specified. Also, keep in mind that any NULLs will occupy a single interval.

So then, how do you decide on the number of quantiles to collect? If you do not specify NUMQUANTILES, the default value of 100 will be used, and then based on the number of records in the table, the number will be readjusted to an optimal number. Therefore, unless you have a good understanding of the application or a viable reason to deviate, a good rule of thumb is to simply let the NUMQUANTILES default and let DB2 work it out.

RUNSTATS will produce an equal-depth histogram. This means that each interval will have about the same number of rows. Please note that this does not mean the same number of values – it is the same number of rows. This means that in some cases a highly frequent single value could potentially occupy an interval all by itself.

The histogram statistics are collected in three new columns: QUANTILENO, LOWVALUE, and HIGHVALUE. These columns can be found in the following six DB2 Catalog tables:


Here is an example of a RUNSTATS to gather histogram statistics for the key columns of the indexes.:



Histogram statistics is a very powerful new capability of the RUNSTATS utility that can be used to gather distribution statistics across all data values. These statistics can be helpful when you need additional distribution data to enable the optimizer to arrive at a better access path for certain queries/predicates.


Term papers said...

Good Article About "Histogram Statistics [DB2 9 for z/OS]"

Nandhini said...

I have read your blog its very attractive and impressive. I like it your blog....R Programming Training in Bangalore

Afreen said...
This comment has been removed by a blog administrator.
Unknown said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...

Nice blog..! I really loved reading through this article. Thanks for sharing such a
amazing post with us and keep blogging... best angularjs training institute in chennai | angularjs training in omr

service care said...

Thanks for one marvelous posting! I enjoyed reading it; you are a great author. I will make sure to bookmark your blog and may come back someday. I want to encourage that you continue your great posts.
mi service center in chennai

Softlogicseo said...

And indeed, I’m just always astounded concerning the remarkable things served by you. Some four facts on this page are undeniably the most effective I’ve had.
C C++ Training in Chennai |Best C C++ Training course in Chennai
linux Training in Chennai | Best linux Training in Chennai
Unix Training in Chennai | Best Unix Training in Chennai
uipath training in chennai | Best uipath training in chennai
Rprogramming Training in Chennai | Best Rprogramming Training in Chennai

Ajish said...

Good Post! Thank you so much for sharing this pretty post, it was so good to read and useful to improve my knowledge as updated one, keep blogging.
Data Science with R Programming training in Electronic City

DedicatedHosting4u said...

Just seen your Article, it amazed me and surpised me with god thoughts that eveyone will benefit from it. It is really a very informative post for all those budding entreprenuers planning to take advantage of post for business expansions. You always share such a wonderful articlewhich helps us to gain knowledge .Thanks for sharing such a wonderful article, It will be deinitely helpful and fruitful article.

MindtechAffiliates said...

Great. It is good to constantly coming up with creative ideas. Provides much needed knowledge. goal oriented blog posts and always tried to find creative ways to meet goals.

Online affiliates

Jayalakshmi said...

Great post i must say and thanks for the information. Education is definitely a sticky subject. However, is still among the leading topics of our time. I appreciate your post and look forward to more.
web designing training in chennai

web designing training in tambaram

digital marketing training in chennai

digital marketing training in tambaram

rpa training in chennai

rpa training in tambaram

tally training in chennai

tally training in tambaram

jeni said...

This is the information that ive been looking for. Great insights & you have explained it really well. Thank you & looking forward for more of such valuable updates.
data science training in chennai

data science training in velachery

android training in chennai

android training in velachery

devops training in chennai

devops training in velachery

artificial intelligence training in chennai

artificial intelligence training in velachery