Paradox Community
Search:

 Welcome |  What is Paradox |  Paradox Folk |  Paradox Solutions |
 Interactive Paradox |  Paradox Programming |  Internet/Intranet Development |
 Support Options |  Classified Ads |  Wish List |  Submissions 


Interactive Paradox Articles  |  Beyond Help Articles  |  Tips & Tricks Articles  


Local SQL Part 7
Grouping and Summarizing Data
© 2001 Lawrence G. DiGiovanni
Digico, Inc.
www.digicoinc.com


Preface

Local SQL is a very powerful yet commonly underutilized tool that is provided with Paradox (as part of the Borland Database Engine). It exposes the power and flexibility of the SQL sublanguage to both Paradox developers as well as interactive end-users.

This is the seventh in a planned series of articles on using Local SQL with Paradox/BDE applications. This article discusses how to summarize data by groups, using the SQL GROUP BY clause and aggregate functions.


Introduction

One of the most powerful capabilities SQL provides is the ability to summarize data. In earlier articles, we briefly discussed the use of aggregate functions to summarize columns of an entire table. The GROUP BY clause allows us to define groups of records to which we can apply aggregate functions to summarize across these groups.

As with the last article, our examples will use the DBDEMOS database, which contains the Biolife example database.


GROUP BY Example

In an earlier article, we used the aggregate function COUNT to show the total number of customers in our database.
SELECT COUNT(Company)
  FROM customer
Suppose we wanted to find out how many customers we had per state? In QBE, we could do it like this:
customer.db | CustNo         | State  |
            | CALC COUNT ALL | Check  |
The checkmark both defines our group and indicates what columns to list for each group, and the CALC COUNT ALL is the expression we want evaluated for each group. Note that, in QBE, the columns to be listed for a grouping must also be the columns that define the grouping.

In SQL, we would do it like this:
SELECT State, COUNT(*)
  FROM customer
GROUP BY State
The GROUP BY clause defines our group, and the SELECT list indicates what columns to list for each group. The COUNT(*) is the expression we want evaluated for each group. It is not obvious from the syntax, but in SQL, as with QBE, the columns to be listed for a grouping must also be the columns that define the grouping. Because of the freeform, text-based nature of SQL, it is easy to omit. Such a query will not execute, producing an error message along the lines of "When GROUP BY exists, every single field in projectors must be in GROUP BY." Projectors is another name for the SELECT list.

The practical implication here is that if you want a query which shows each company name, the state the company is in, and the total number of companies in that same state, you cannot do the following:
SELECT State, Company, Count(*)
  FROM Customer
GROUP BY State
The above query will fail with the earlier message. If you need that info, you will have to, in one way or another, perform two queries.


Aggregate Functions

Following is a list of the SQL aggregate functions and their purpose:

SUM    Sum of a column
MAX    Max value in a column
MIN    Min value in a column
COUNT    Count of values in a column (each occurrence, not unique values)
AVG    Average of all values of a column
The specific usage of each of the above functions can be found in LocalSQL.HLP.


GROUP BY and WHERE

You can also have a WHERE clause when using a GROUP BY clause. There WHERE clause will extract into the grouping only those rows which meet the WHERE clause criteria. For instance, suppose we wanted to know the above information, but only for Customers with recent business - say - with a Last Invoice Date since January 1, 1995. We could include that in our query thusly:
SELECT State, COUNT(*)
  FROM customer
 WHERE LastInvoiceDate > '12/31/1994'
GROUP BY State
Remember, order of the SQL SELECT clauses is fixed, so the WHERE clause must come before the GROUP BY clause.

Similarly, we can order this output by State:
SELECT State, COUNT(*)
  FROM customer
 WHERE LastInvoiceDate > '12/31/1994'
GROUP BY State
ORDER BY State
Or by Count:
SELECT State, COUNT(*)
  FROM customer
 WHERE LastInvoiceDate > '12/31/1994'
GROUP BY State
ORDER BY 2
Remember, the 2 in the ORDER BY clause means to order the output by the second column of the select list, which in this case is COUNT(*).


HAVING

Suppose we want to restrict output by the result of an aggregate function. The instinct (by now) would be to include that criteria in the WHERE clause, but that would be wrong. SQL provides another clause for this type of criteria; the HAVING clause. The HAVING clause must always follow the GROUP BY clause, and there can be no HAVING clause without a GROUP BY clause.

So let's say we wanted to see companies per state, but only in those states with 3 or more companies:
SELECT State, COUNT(*)
  FROM customer
GROUP BY State
HAVING COUNT(*) >= 3
ORDER BY 2
The HAVING clause has restricted output only to those groups which meet the given criteria; in this case, being comprised of more than three members. If you tried to do it with a WHERE clause:
SELECT State, COUNT(*)
  FROM customer
 WHERE COUNT(*) >= 3
GROUP BY State
ORDER BY 2
Local SQL would give you the (in)famous 'Capability not supported' error.

That's not to say you can't use the WHERE clause with GROUP BY/HAVING. For instance, we could combine the logic of the above query with the earlier LastInvoiceDate criteria to get a list of states with a count of Companies per state who have done business since 1/1/1995, having at least 3 companies in that state:
SELECT State, COUNT(*)
  FROM customer
 WHERE LastInvoiceDate > '12/31/1994'
GROUP BY State
HAVING COUNT(*) >= 3
ORDER BY 2

Summary

SQL provides the GROUP BY clause and aggregate functions to allow for the summarization of data. When using the GROUP BY clause, all columns referenced in the SELECT list as part of non-aggregate expressions must be included in the GROUP BY clause. Use of the GROUP BY clause does not preclude the use of the WHERE clause to filter out rows from aggregation; however, aggregate functions should not be used in WHERE criteria. To filter groups, use the HAVING clause.


Local SQL Part 8


Discussion of this article


 Feedback |  Paradox Day |  Who Uses Paradox |  I Use Paradox |  Downloads 


 The information provided on this Web site is not in any way sponsored or endorsed by Corel Corporation.
 Paradox is a registered trademark of Corel Corporation.


 Modified: 15 May 2003
 Terms of Use / Legal Disclaimer


 Copyright © 2001- 2003 Paradox Community. All rights reserved. 
 Company and product names are trademarks or registered trademarks of their respective companies. 
 Authors hold the copyrights to their own works. Please contact the author of any article for details.