Sometimes when you are writing your SQL to access data you come across the need to work with aggregates. Fortunately, SQL offers many simple ways of aggregating data. But what happens when you uncover then need to perform aggregations of aggregates?
What does that mean? Well, consider an example. Let's assume that you want to compute
the average of a sum. This is a reasonably common requirement that comes up frequently in applications that are built
around sales amounts. Let's say that we have a table containing sales
information, where each sales amount has additional information indicating the
salesman, region, district, product, date, etc.
A common requirement is to produce a report of the average
sales by region for a particular period, say the first quarter of 2014. But the
data in the table is at a detail level, meaning we have a row for each specific
sale.
A novice SQL coder might try to write a query with a
function inside a function, like AVG(SUM(SALE_AMT)). Of course, this is invalid
SQL syntax. DB2 will not permit the nesting of aggregate functions. But we can
use nested table expressions and our knowledge of SQL functions to build the
correct query.
Let’s start by creating a query to return the sum of all
sales by region for the time period in question. That query should look
something like this:
SELECT REGION,
SUM(SALE_AMT)
FROM SALES
WHERE SALE_DATE BETWEEN
DATE(‘2014-01-01’)
AND DATE(‘2014-03-31’)
GROUP BY REGION;
Now that we have the total sales by region for the period in
question, we can embed this query into a nested table expression in another
query like so:
SELECT NTE.REGION,
AVG(NTE.TOTAL_SALES)
FROM (SELECT REGION,
SUM(SALE_AMT)
FROM
SALES
WHERE SALE_DATE BETWEEN DATE(‘2014-01-01’)
AND DATE(‘2014-03-31’)
GROUP BY REGION) AS NTE
AND DATE(‘2014-03-31’)
GROUP BY REGION) AS NTE
GROUP BY NTE.REGION;
And voila! We have aggregated an aggregate, (averaged a sum)...
No comments:
Post a Comment